Poster in Workshop: Generative AI and Creativity: A dialogue between machine learning researchers and creative professionals
Iterative Optimization of SDS Loss for Video Generation from Multi-Object Sketches
Donghun Kim · Changho Choi · Junmo Kim
In this paper, we present a novel framework that generates videos with realistic movements and natural interactions among multiple objects from a single sketch image, using text-to-video diffusion priors. We identify limitations of existing sketch-to-video methods, in particular their difficulty in handling multiple objects and in accurately representing the independent movements and interactions of separate objects. Our approach aims to enable each object to move independently while maintaining the consistency of their interactions. To this end, we manually separate a vector image into distinct components and train a neural network for each object using a text-to-video diffusion prior with score distillation sampling (SDS) loss. The optimization proceeds iteratively: the separate objects are first trained individually, followed by joint training on the combined image. As a result, our method produces realistic movements for the separate objects while preserving the coherence of their interactions.
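To make the two-stage optimization concrete, the sketch below outlines per-object SDS updates followed by joint updates on the combined image. Everything in it is a hypothetical reconstruction, not the authors' implementation: `DisplacementNet`, `render`, `sds_loss`, the prompts, and all hyperparameters are illustrative placeholders. A real pipeline would use a differentiable vector-graphics rasterizer (such as diffvg) and an actual frozen text-to-video diffusion model to compute the SDS gradient.

```python
import torch
import torch.nn as nn

class DisplacementNet(nn.Module):
    """Predicts per-frame displacements for one object's control points."""
    def __init__(self, n_points: int, n_frames: int, hidden: int = 256):
        super().__init__()
        self.n_points, self.n_frames = n_points, n_frames
        self.mlp = nn.Sequential(
            nn.Linear(n_points * 2, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_frames * n_points * 2),
        )

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        # points: (n_points, 2) -> displacements: (n_frames, n_points, 2)
        return self.mlp(points.flatten()).view(self.n_frames, self.n_points, 2)

def render(points: torch.Tensor, displacements: torch.Tensor) -> torch.Tensor:
    # Stand-in for a differentiable rasterizer (e.g. diffvg): here we simply
    # displace the control points per frame to form a "video" tensor.
    return points.unsqueeze(0) + displacements  # (n_frames, n_points, 2)

def sds_loss(video: torch.Tensor, prompt: str) -> torch.Tensor:
    # Placeholder for the SDS loss under a frozen text-to-video diffusion
    # prior. A real implementation would noise the rendered video, denoise
    # it with the prior conditioned on `prompt`, and backpropagate the
    # score difference; this stand-in only keeps the sketch runnable.
    return video.pow(2).mean()

# Each object: (control points of its sketch strokes, per-object prompt).
objects = [
    (torch.rand(16, 2), "a dog running"),
    (torch.rand(16, 2), "a ball bouncing"),
]
joint_prompt = "a dog chasing a bouncing ball"

nets = [DisplacementNet(n_points=16, n_frames=24) for _ in objects]
opts = [torch.optim.Adam(net.parameters(), lr=1e-3) for net in nets]

# Stage 1: optimize each object's motion independently on its own prompt.
for (points, prompt), net, opt in zip(objects, nets, opts):
    for _ in range(500):
        loss = sds_loss(render(points, net(points)), prompt)
        opt.zero_grad()
        loss.backward()
        opt.step()

# Stage 2: joint training on the combined image so the objects' motions
# remain consistent with each other under the joint prompt.
for _ in range(500):
    videos = [render(pts, net(pts)) for (pts, _), net in zip(objects, nets)]
    combined = torch.cat(videos, dim=1)  # concatenate objects' points per frame
    loss = sds_loss(combined, joint_prompt)
    for opt in opts:
        opt.zero_grad()
    loss.backward()
    for opt in opts:
        opt.step()
```

Keeping a separate network and optimizer per object is what allows Stage 1 to shape each object's motion in isolation, while Stage 2 backpropagates a single joint SDS loss through all networks at once to reconcile their interactions.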