Article Open Access

AI-Assisted Animation Storyboard Design and Automated Storyboard Generation

Han Ou

Abstract


This paper develops an artificial-intelligence-assisted animation storyboard design framework that combines Stable Diffusion 1.5 (SD-1.5) with a Visual Geometry Group 1-Convolutional Neural Network (VGG-1CNN) and Generative Pre-trained Transformer 3.5 (GPT-3.5) to produce automated game character images and narrative-focused storyboards. The proposed system uses combined text and sketch prompts to generate storyboard frames that preserve visual coherence and stylistic continuity. Image generation is improved through advanced diffusion control built on three main components: Contrastive Language-Image Pretraining (CLIP) neural networks, the VGG-1CNN, and a Variational Autoencoder (VAE). The pipeline first translates textual descriptions into latent-space codes using a neural network, and these codes then guide image generation. The input sketch is processed with Canny edge detection, and the resulting edge maps improve image refinement. Applying the VGGNet architecture to vector representations of the generated images improves visual precision and prompt compliance. Image quality is further enhanced by iterative, scheduler-based noise removal that refines the latent representations over multiple successive stages. GPT-3.5 enables the system to create written narratives suited to each story frame while preserving narrative continuity. Finally, decoder-based upscaling is applied to the output to produce high-resolution, visually appealing storyboard frames that highlight the visual elements alongside the textual content. The resulting model delivers efficient automation of the pre-production animation pipeline, minimizing manual effort while preserving artistic and narrative quality.
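
To make the described pipeline concrete, the following is a minimal illustrative sketch of the text-and-sketch-conditioned generation step. It assumes a Hugging Face diffusers-style stack with a Canny-conditioned ControlNet on top of SD-1.5, which is one plausible realization of the Canny edge guidance, CLIP text encoding, scheduler-based denoising, and VAE decoding described in the abstract; the paper's actual implementation is not specified here, and the model identifiers, file names, and prompt text below are assumptions for illustration only.

import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline, UniPCMultistepScheduler

# Load a Canny-conditioned ControlNet and the SD-1.5 base pipeline (example model IDs).
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")
# Scheduler choice controls the iterative, multi-step denoising of the latents.
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)

# Turn the rough character sketch into a Canny edge map that conditions generation.
sketch = np.array(Image.open("character_sketch.png").convert("L"))
edges = cv2.Canny(sketch, 100, 200)
edge_map = Image.fromarray(np.stack([edges] * 3, axis=-1))

# Text prompt and edge map jointly guide the diffusion process; the CLIP text
# encoder, the scheduler loop, and the VAE decoder are all invoked inside the call.
prompt = "game character, heroic pose, consistent art style, storyboard frame"
frame = pipe(prompt, image=edge_map, num_inference_steps=30).images[0]
frame.save("storyboard_frame.png")

In such a stack, per-frame narration would be produced by a separate text-generation call (e.g., to GPT-3.5) and paired with the saved frame; that step is omitted from the sketch above.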


Keywords


VGG-1CNN Neural Network, Generative Pre-Trained Transformer, Stable Diffusion, Automated Image Generation


DOI: https://doi.org/10.52088/ijesty.v5i2.1534



Copyright (c) 2025 Han Ou

International Journal of Engineering, Science, and Information Technology (IJESTY) eISSN 2775-2674