ToonComposer: From Sketch to Cartoon

Transform your cartoon production workflow with generative AI that combines inbetweening and colorization into a single, efficient process. Create professional cartoon animations from simple sketches with minimal manual effort.

What is ToonComposer?

ToonComposer is a generative AI model for cartoon and anime production. Created by researchers at Tencent ARC, it unifies inbetweening and colorization, two of the most time-consuming steps in the pipeline, into a single post-keyframing stage.

Figure: ToonComposer demonstration (source: https://github.com/TencentARC/ToonComposer)

Traditional cartoon production involves three labor-intensive stages: keyframing, inbetweening, and colorization. Each stage requires skilled artists and extensive manual work. ToonComposer automates the inbetweening and colorization processes, allowing artists to focus on the creative aspects of keyframing while significantly reducing production time.

The model employs a sparse sketch injection mechanism that provides precise control using keyframe sketches. With as few as a single sketch and a colored reference frame, ToonComposer can generate complete cartoon sequences while maintaining artistic consistency and motion fluidity.

Overview of ToonComposer

Feature              Description
AI Tool              ToonComposer
Category             Generative Animation Framework
Function             Cartoon Inbetweening and Colorization
Input Requirements   Single sketch + colored reference frame
Research Paper       arxiv.org/abs/2508.10881
GitHub Repository    github.com/TencentARC/ToonComposer

Technical Innovation

Existing methods handle keyframing, inbetweening, and colorization as separate processes, so errors made in one stage accumulate in the next and appear as visual artifacts. ToonComposer addresses this by producing inbetween frames and color in a single generative process.

The system uses a cartoon adaptation method with spatial low-rank adapters to tailor modern video foundation models to the cartoon domain while preserving temporal consistency. This approach ensures that the generated animations maintain smooth motion and coherent visual style across frames.
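To make the low-rank adapter idea concrete, here is a minimal sketch of a generic low-rank update on a frozen linear layer, written in plain Python. It is not ToonComposer's actual adapter: the function names, the toy dimensions, and the zero-initialized up-projection are illustrative assumptions, and the spatial-only routing described above is not shown.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def adapted_forward(w: Matrix, down: Matrix, up: Matrix, x: Vector,
                    scale: float = 1.0) -> Vector:
    """Frozen base layer plus a low-rank correction: W x + scale * up (down x)."""
    base = matvec(w, x)
    low_rank = matvec(up, matvec(down, x))  # d_in -> rank -> d_out
    return [b + scale * c for b, c in zip(base, low_rank)]


def trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Only the two thin adapter matrices are trained, not the d_out x d_in base."""
    return rank * d_in + d_out * rank


# Hypothetical toy sizes: a 4 -> 4 identity layer with a rank-1 adapter.
w = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]
down = [[0.5, 0.5, 0.0, 0.0]]            # rank x d_in
up_zero = [[0.0], [0.0], [0.0], [0.0]]   # d_out x rank, zero-initialized
x = [1.0, 2.0, 3.0, 4.0]

# With a zero up-projection the adapted layer reproduces the base layer exactly.
unchanged = adapted_forward(w, down, up_zero, x)
```

Because the correction is added to a frozen base, the pretrained behavior is the starting point and only the thin `down` and `up` matrices need training. For a real layer with thousands of input and output channels, `trainable_params` grows linearly with the layer width rather than quadratically.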

One of the key innovations is the sparse sketch injection mechanism, which allows precise control over character movement and expression using minimal input. Artists can provide sketches at any temporal location for more precise motion control, making the tool flexible for various animation styles and requirements.
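One way to picture sketches placed "at any temporal location" is as a sparse timeline: most frame slots are empty, and a few carry a sketch. The snippet below is only an illustration of that data layout. The function names and file names are hypothetical, and the model's internal conditioning format is not described in this article.

```python
from typing import Dict, List, Optional


def build_sketch_timeline(num_frames: int,
                          sketches: Dict[int, str]) -> List[Optional[str]]:
    """Place each keyframe sketch at its frame index; every other slot stays empty."""
    if num_frames < 1:
        raise ValueError("num_frames must be positive")
    timeline: List[Optional[str]] = [None] * num_frames
    for index, sketch in sketches.items():
        if not 0 <= index < num_frames:
            raise ValueError(f"sketch index {index} is outside 0..{num_frames - 1}")
        timeline[index] = sketch
    return timeline


def conditioning_mask(timeline: List[Optional[str]]) -> List[int]:
    """1 where a sketch constrains the frame, 0 where the frame is unconstrained."""
    return [0 if slot is None else 1 for slot in timeline]


# Hypothetical file names; three sketches spread over a 9-frame clip.
timeline = build_sketch_timeline(9, {0: "start.png", 5: "jump.png", 8: "land.png"})
mask = conditioning_mask(timeline)
```

Adding a sketch at another index tightens control over that moment without requiring a sketch for every frame.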

Key Features of ToonComposer

  • Unified Post-Keyframing Process

    Combines inbetweening and colorization into a single generative process, eliminating the need for separate workflows and reducing error accumulation between stages.

  • Sparse Sketch Injection

    Provides precise control using keyframe sketches with minimal input requirements, allowing artists to guide the animation process effectively.

  • Cartoon Domain Adaptation

    Uses spatial low-rank adapters to adapt video foundation models specifically for cartoon production while maintaining temporal consistency.

  • Flexible Input Requirements

    Requires as few as a single sketch and colored reference frame, while supporting multiple sketches for enhanced control over complex animations.

  • Motion Control Precision

    Supports sketches at any temporal location for precise motion control, enabling artists to define key moments in the animation sequence.

  • Real-World Application Focus

    Designed with real-world cartoon production workflows in mind, reducing manual workload while improving creative flexibility for professional artists.

How ToonComposer Works

ToonComposer starts from keyframe inputs and generates the intermediate frames together with their colorization. The process can be described in four parts.

Input Processing

The system accepts keyframe sketches and a colored reference frame. These inputs serve as the foundation for the generation process, providing both structural and stylistic guidance for the output animation.

Temporal Understanding

ToonComposer analyzes the temporal relationships between keyframes, understanding motion patterns and character dynamics to generate smooth transitions between frames.

Generation Process

Using the adapted video foundation model, the system generates intermediate frames that maintain consistency with the input sketches while applying appropriate colors and details based on the reference frame.

Quality Refinement

The output undergoes refinement processes to ensure visual quality, temporal consistency, and adherence to the cartoon art style defined by the input materials.

Production Benefits

ToonComposer addresses fundamental challenges in cartoon production that have historically required extensive manual labor and specialized expertise. By automating the most time-consuming aspects of the workflow, it enables studios to focus resources on creative development and artistic vision.

Time Efficiency

Reduces production time by automating inbetweening and colorization processes that traditionally require hours of manual work per sequence.

Cost Reduction

Minimizes the need for large teams of inbetween artists and colorists, allowing studios to allocate resources more efficiently.

Creative Focus

Allows artists to concentrate on keyframing and creative direction rather than repetitive technical tasks.

Quality Consistency

Maintains consistent visual quality across frames, reducing the variability that can occur with manual production methods.

Getting Started with ToonComposer

ToonComposer is available as an open-source project, with setup instructions and usage guidelines in its GitHub repository. It depends on specific package versions and runs best on GPU hardware.

System Requirements

Python 3.10 and PyTorch 2.6.0 are recommended for optimal compatibility. The system also requires specific versions of flash-attn (2.8.0.post2) and gradio (5.25.2) for proper functionality.
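A quick way to compare an environment against these versions is to query the installed package metadata. The following is a small sketch using only the standard library. The distribution names (`torch`, `flash-attn`, `gradio`) are assumptions about how the packages are published, and the helper names are made up for this example.

```python
from importlib import metadata
from typing import Callable, Dict

# Versions quoted in the text above; distribution names are assumptions.
PINNED: Dict[str, str] = {
    "torch": "2.6.0",
    "flash-attn": "2.8.0.post2",
    "gradio": "5.25.2",
}


def installed_version(name: str) -> str:
    """Return the installed version, or an empty string if the package is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return ""


def find_mismatches(pinned: Dict[str, str],
                    lookup: Callable[[str], str] = installed_version) -> Dict[str, str]:
    """Map each package whose version differs from the pin to what was found."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        # Ignore local build tags such as "+cu124" when comparing.
        if found.split("+")[0] != wanted:
            mismatches[name] = found or "not installed"
    return mismatches


if __name__ == "__main__":
    for name, found in find_mismatches(PINNED).items():
        print(f"{name}: expected {PINNED[name]}, found {found}")
```

Running the script prints one line per mismatch and nothing when the environment matches the pins.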

GPU acceleration is highly recommended for reasonable processing times, though CPU processing is possible for smaller projects.

Installation Process

The installation involves cloning the repository, setting up the Python environment, and installing dependencies. The system automatically downloads required model weights from Hugging Face if they are not available locally.

Users can optionally provide local directories for model weights to avoid repeated downloads, which is particularly useful given the large size of the foundation models.
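The local-directory option boils down to a simple decision: reuse a folder if it already holds weights, otherwise fall back to downloading. The sketch below illustrates that decision only. Both helper functions are hypothetical and are not part of the ToonComposer code base, and the check (any file present under the folder) is a deliberately loose assumption.

```python
from pathlib import Path
from typing import Optional


def usable_weights_dir(candidate: Optional[str]) -> Optional[Path]:
    """Return the directory if it exists and holds at least one file, else None."""
    if not candidate:
        return None
    path = Path(candidate).expanduser()
    if not path.is_dir():
        return None
    has_files = any(entry.is_file() for entry in path.rglob("*"))
    return path if has_files else None


def describe_weight_source(candidate: Optional[str]) -> str:
    """Say whether a run would reuse local weights or fall back to downloading."""
    local = usable_weights_dir(candidate)
    if local is None:
        return "download from Hugging Face"
    return f"reuse local weights in {local}"
```

A check like this can be run before launch to confirm that a shared weights folder is mounted and populated, which avoids an unexpected multi-gigabyte download.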

Interface Access

ToonComposer provides a Gradio interface that launches on port 7860 by default. The interface allows users to upload keyframes, set parameters, and generate animations through a user-friendly web interface.

The system can be accessed locally or deployed on remote servers for team collaboration and production workflows.
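When the interface runs on a remote server, it helps to confirm that the port is reachable before sharing the address with a team. This is a minimal standard-library sketch. The default host and the helper names are assumptions, and only port 7860 comes from the text above.

```python
import socket


def interface_is_up(host: str = "127.0.0.1", port: int = 7860,
                    timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections at host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def interface_url(host: str = "127.0.0.1", port: int = 7860) -> str:
    """Build the address to open in a browser."""
    return f"http://{host}:{port}"


if __name__ == "__main__":
    state = "reachable" if interface_is_up() else "not reachable"
    print(f"{interface_url()} is {state}")
```

The check only confirms that a TCP listener exists on the port. It does not verify that the listener is the ToonComposer interface.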

Figure: ToonComposer process workflow (source: https://lg-li.github.io/project/tooncomposer/)

Research and Development

ToonComposer was developed by a team of researchers: Lingen Li, Guangzhi Wang, Zhaoyang Zhang, Yaowei Li, Xiaoyu Li, Qi Dou, Jinwei Gu, Tianfan Xue, and Ying Shan.

The research addresses fundamental limitations in existing animation tools that handle production stages separately. Previous methods often struggled with large motions in inbetweening and required dense per-frame sketches for colorization, making them impractical for production-scale projects.

To evaluate the model's performance, the researchers created PKBench, a specialized benchmark featuring human-drawn sketches that simulate real-world use cases. This benchmark provides objective metrics for comparing ToonComposer against existing methods and demonstrates its superior performance in practical applications.

The research findings show that ToonComposer consistently outperforms existing methods in both quantitative metrics and qualitative assessments, particularly in scenarios that closely mirror professional cartoon production workflows.

Advantages and Limitations

Advantages

  • Unified workflow reduces error accumulation
  • Works from as few as one sketch and one colored reference frame
  • Supports diverse animation styles and requirements
  • Maintains temporal consistency across frames
  • Reduces manual labor in production pipelines
  • Open-source availability for research and development
  • Professional-grade quality suitable for commercial use

Limitations

  • Requires significant computational resources
  • Limited to cartoon and anime art styles
  • Learning curve for optimal parameter settings
  • Dependency on specific software versions
  • May require fine-tuning for specialized styles
  • Processing time varies with sequence complexity

Production Workflow

Figure: ToonComposer production workflow demonstration (source: https://lg-li.github.io/project/tooncomposer/)

1. Prepare Keyframe Materials

Create keyframe sketches and select a colored reference frame that defines the visual style for the animation sequence.

2. Configure Generation Parameters

Set text prompts, number of output frames, resolution settings, and adjust CFG scale and position-aware residual scale as needed.

3. Upload and Position Assets

Upload keyframe sketches at selected frame positions and optionally add motion masks to define areas of free movement.

4. Generate Animation Sequence

Execute the generation process and monitor progress through the interface status panel while the AI creates intermediate frames.

5. Review and Export Results

Examine the generated animation in the output panel and export the final video for use in production workflows or further editing.
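The settings gathered in steps 1 to 3 can be written down as a single record and sanity-checked before a run. The sketch below is a hypothetical container, not the interface's real data model: every field name is invented for illustration, and the example values are placeholders rather than recommended settings.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class GenerationJob:
    """Hypothetical container for the settings named in the steps above."""
    reference_frame: str            # colored reference image (step 1)
    prompt: str                     # text prompt (step 2)
    num_frames: int                 # number of output frames (step 2)
    resolution: Tuple[int, int]     # (width, height) in pixels (step 2)
    cfg_scale: float                # CFG scale (step 2)
    residual_scale: float           # position-aware residual scale (step 2)
    sketches: Dict[int, str] = field(default_factory=dict)  # frame index -> sketch (step 3)

    def problems(self) -> List[str]:
        """Return human-readable issues; an empty list means the job looks sane."""
        issues = []
        if self.num_frames < 1:
            issues.append("num_frames must be at least 1")
        if min(self.resolution) < 1:
            issues.append("resolution must be positive")
        if not self.sketches:
            issues.append("at least one keyframe sketch is required")
        for index in sorted(self.sketches):
            if not 0 <= index < self.num_frames:
                issues.append(f"sketch at frame {index} is outside the clip")
        return issues


# Placeholder values for illustration only.
job = GenerationJob(
    reference_frame="reference.png",
    prompt="a character waves at the camera",
    num_frames=33,
    resolution=(832, 480),
    cfg_scale=7.5,
    residual_scale=1.0,
    sketches={0: "start.png", 40: "end.png"},
)
issues = job.problems()
```

Here the second sketch sits at frame 40 of a 33-frame clip, so `problems()` flags it before any generation time is spent.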

Frequently Asked Questions