🤖 Animation Inbetweening: Machine Learning and Spline Curves for 2D Animation Interpolation

Project Summary

Goal: Address two problems that arise when traditional video interpolation algorithms are applied to low-frame-rate 2D animation: blurred outlines and large motion spans between frames.
Core Innovation: We propose a method using B-spline curves to fit and match the outlines of two input frames, ensuring the clarity of the intermediate frame’s outline during interpolation. The algorithm supports the generation of any number of mid-frames.
Key Techniques: B-spline fitting, contour matching, biclustering, network flow, Gale–Shapley algorithm, and regression for movement prediction.
Authors: Haina Wang, Houxian Su, Zhizhi Wang.
Institution: Zhejiang University.

Motivation and Unique Challenges

Traditional video interpolation is designed for high-frame-rate videos ($\geq 24$ fps). Animation inbetweening, however, requires adding intermediate frames between keyframes (often $\leq 12$ fps), presenting several difficulties:

Large Motion Spans: Because of the low frame rate, objects move a large distance between adjacent frames.
Outline Preservation: Most algorithms cause blurring of the generated intermediate frame outlines, which is unacceptable for animation production.
Data Scarcity: Large annotated datasets for 2D animation are hard to obtain because of trade secrets; we used the ATD-12K dataset.

Technical Approach: The Spline-Based Framework

Our algorithm utilizes a multi-stage, machine-learning-assisted process to maintain outline integrity throughout interpolation.

1. Pre-processing and Vectorization

Binarization: We use local thresholding, stylization, and global thresholding in OpenCV to binarize adjacent input frames.
Contour Extraction & Fitting: We use cv2.findContours() to extract outlines and then fit B-spline curves to the contours, limiting the number of control points to unify the vectorized length for easier matching.
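
To illustrate what "unifying the vectorized length" can look like, here is a minimal sketch that resamples a closed contour to a fixed number of points evenly spaced by arc length. This is a simpler stand-in for the B-spline fitting step, not the project's actual code; the function name `resample_closed_contour` and the sample square are hypothetical.

```python
import math

def resample_closed_contour(points, n_samples):
    """Resample a closed polyline to n_samples points evenly spaced by arc length."""
    if len(points) < 2 or n_samples < 1:
        raise ValueError("need at least 2 points and 1 sample")
    # Close the loop by appending the first point at the end.
    loop = list(points) + [points[0]]
    seg_lengths = [math.dist(loop[i], loop[i + 1]) for i in range(len(loop) - 1)]
    total = sum(seg_lengths)
    if total == 0:
        raise ValueError("contour has zero length")
    step = total / n_samples
    samples = []
    seg = 0           # index of the current segment
    seg_start = 0.0   # arc length at the start of the current segment
    for k in range(n_samples):
        target = k * step
        # Advance to the segment that contains the target arc length.
        while seg < len(seg_lengths) - 1 and seg_start + seg_lengths[seg] <= target:
            seg_start += seg_lengths[seg]
            seg += 1
        length = seg_lengths[seg]
        frac = 0.0 if length == 0 else (target - seg_start) / length
        (x0, y0), (x1, y1) = loop[seg], loop[seg + 1]
        samples.append((x0 + frac * (x1 - x0), y0 + frac * (y1 - y0)))
    return samples

# Hypothetical contour: a 4x4 square given by its corners only.
square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
fixed = resample_closed_contour(square, 8)
```

Every contour processed this way ends up with the same number of points regardless of its pixel length, which is the property that makes later pairwise comparison straightforward.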

2. Motion Analysis and Matching

Movement Detection: We apply criteria based on five learned coefficients ($k_{1}$ to $k_{5}$) to determine which long curves are truly moving between frames.
Clustering: We use biclustering (DBSCAN and agglomerative clustering) for long moving curves and K-means for short curves to aid matching.
Structural Matching:
- Long Curves: Matched using a structural method that scores curve pairs based on shape metrics (e.g., cross product vectors) and control point polygon shape. We apply network flow for piecewise matching.
- Short Curves: Treated as separate points and matched using the Gale–Shapley algorithm.
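
The short-curve step can be sketched as follows, assuming each short curve has already been reduced to a single 2D point and that both frames contain the same number of such points. Preferences are ranked by Euclidean distance, which is an assumption for illustration; the function name `gale_shapley_match` and the sample coordinates are hypothetical.

```python
import math

def gale_shapley_match(points_a, points_b):
    """Stable matching between two equal-sized point sets.

    Each point ranks the points of the other set by Euclidean distance
    (closest first). Points in A propose; points in B keep the closest
    proposer seen so far. Returns a dict: index in A -> index in B.
    """
    n = len(points_a)
    if n != len(points_b):
        raise ValueError("this sketch assumes equal-sized sets")
    # Preference list of every point in A: indices of B, closest first.
    prefs_a = [
        sorted(range(n), key=lambda j: math.dist(points_a[i], points_b[j]))
        for i in range(n)
    ]
    # dist_b[j][i]: how far point i of A is from point j of B (smaller is preferred).
    dist_b = [[math.dist(points_b[j], points_a[i]) for i in range(n)] for j in range(n)]
    next_choice = [0] * n        # next entry of prefs_a[i] to propose to
    partner_of_b = [None] * n    # current partner (index in A) of each point in B
    free = list(range(n))        # points in A that are still unmatched
    while free:
        i = free.pop()
        j = prefs_a[i][next_choice[i]]
        next_choice[i] += 1
        current = partner_of_b[j]
        if current is None:
            partner_of_b[j] = i
        elif dist_b[j][i] < dist_b[j][current]:
            partner_of_b[j] = i   # j trades up to the closer proposer
            free.append(current)
        else:
            free.append(i)        # rejected; i proposes to its next choice later
    return {i: j for j, i in enumerate(partner_of_b)}

# Hypothetical short-curve positions in two adjacent frames.
frame0_points = [(0.0, 0.0), (10.0, 0.0), (5.0, 8.0)]
frame1_points = [(11.0, 1.0), (1.0, 1.0), (5.0, 9.0)]
matches = gale_shapley_match(frame0_points, frame1_points)
```

The result is stable in the Gale–Shapley sense: no pair of points from the two frames would both prefer each other over their assigned partners.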

3. Movement Prediction and Frame Generation

Trajectory Prediction: For matched control point pairs, we use trained quadratic motion equations (in time $t$) to predict the position of the control points in the intermediate frame $t_{1/2}$.
Frame Restoration: The intermediate frame is generated by restoring the B-spline curves from the predicted control points (moved curves) and superimposing them onto the unmoved curves from the adjacent frames.
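
The two steps above can be sketched together under stated assumptions: each matched control point follows a quadratic path $p(t) = p_0 + (p_1 - p_0 - a)\,t + a\,t^2$ that passes through both keyframe positions, and curves are closed uniform cubic B-splines. The per-axis term `accel` stands in for whatever the trained model supplies and is hypothetical, as are the function names and sample coordinates.

```python
def predict_point(p0, p1, t, accel=(0.0, 0.0)):
    """Quadratic path with p(0) = p0 and p(1) = p1; accel = 0 gives linear motion."""
    return tuple(
        a0 + (a1 - a0 - acc) * t + acc * t * t
        for a0, a1, acc in zip(p0, p1, accel)
    )

def closed_cubic_bspline(control_points, samples_per_segment=8):
    """Sample a closed uniform cubic B-spline defined by its control points."""
    n = len(control_points)
    if n < 4:
        raise ValueError("need at least 4 control points")
    curve = []
    for i in range(n):
        # Four consecutive control points (wrapping around) define one segment.
        quad = [control_points[(i + k) % n] for k in range(4)]
        for s in range(samples_per_segment):
            u = s / samples_per_segment
            # Uniform cubic B-spline basis functions; they sum to 1 for any u.
            basis = (
                (1 - u) ** 3 / 6,
                (3 * u**3 - 6 * u**2 + 4) / 6,
                (-3 * u**3 + 3 * u**2 + 3 * u + 1) / 6,
                u**3 / 6,
            )
            x = sum(b * p[0] for b, p in zip(basis, quad))
            y = sum(b * p[1] for b, p in zip(basis, quad))
            curve.append((x, y))
    return curve

# Hypothetical matched control points of one moving curve in two keyframes.
frame0 = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
frame1 = [(10.0, 2.0), (14.0, 2.0), (14.0, 6.0), (10.0, 6.0)]

# Predict control points at the midpoint, then restore the curve from them.
mid_controls = [predict_point(a, b, 0.5) for a, b in zip(frame0, frame1)]
mid_curve = closed_cubic_bspline(mid_controls)
```

Because `t` can be any value between 0 and 1, calling `predict_point` with several values of `t` yields as many intermediate frames as needed, and each restored curve stays a clean vector outline rather than a blend of pixels.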

Technical Challenges and Reflections

This project highlighted core challenges in combining traditional computer vision and vector graphics with machine learning:

Contour Robustness: Relying on cv2.findContours() significantly reduced the program’s robustness, as a few pixels of noise could lead to poor matching results. We also observed contours that intersected or split apart between frames.
B-spline Limitations: We encountered problems with B-spline fitting, including the risk of self-intersection if the fitting epoch was not carefully set. Additionally, because contour lengths vary widely, using the same number of control points for every curve proved to be a poor choice.
Learning Process: The project involved multiple dead ends, such as failed attempts with motion-detection models and with treating the image as a simple point cloud for transformers. Progress came only once we realized that simplifying contours into B-splines was the right path.

Citation: Haina Wang, Houxian Su, Zhizhi Wang. “Animation Inbetweening Based on Machine Learning and Spline Curves.” Preprint (2023).
