Animate3D: Animating Any 3D Model with Multi-view Video Diffusion
Authors: Yanqin Jiang, Chaohui Yu, Chenjie Cao, Fan Wang, Weiming Hu, Jin Gao (CASIA, DAMO Academy, Alibaba Group, Hupan Lab)
Despite extensive research on dynamic 3D content generation (4D generation), there is still no single foundation model for the task. Learning spatial structure from 3D models and temporal motion from video models separately leads to quality degradation (e.g., SVD + Zero-123), and animating 3D objects this way usually fails to preserve their multi-view attributes.
Animate3D proposes to animate any static 3D model with unified, spatiotemporally consistent supervision. The pipeline starts with MV-VDM, a foundational multi-view video (4D) diffusion model built on MVDream with a spatiotemporal motion module that focuses on learning natural dynamic motion. An MV2V-Adapter, adapted from I2V-Adapter, handles the multi-view image conditioning. To animate the 3D representation itself, 4D Gaussian Splatting (4DGS) is jointly optimized with a reconstruction loss and 4D Score Distillation Sampling (4D-SDS). For training, the authors also build MV-Video, a large-scale multi-view video dataset consisting of about 1.8M multi-view videos.
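To make the animation stage more concrete, below is a minimal PyTorch sketch (my own illustration, not the authors' code) of jointly optimizing a 4D representation with (a) a reconstruction loss against MV-VDM-generated multi-view frames and (b) a 4D-SDS term that distills the diffusion prior. The renderer and diffusion model are tiny stand-ins so the loop runs end to end; all class and function names here are hypothetical.

```python
"""Sketch of Animate3D-style joint optimization: reconstruction + 4D-SDS.
The 4DGS renderer and MV-VDM are replaced by toy modules for illustration."""

import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyDeformableRenderer(nn.Module):
    """Stand-in for a 4DGS renderer: maps (view id, time) to an RGB image."""
    def __init__(self, res=32):
        super().__init__()
        self.canonical = nn.Parameter(torch.rand(3, res, res))  # static appearance
        self.motion = nn.Parameter(torch.zeros(3, res, res))    # learned dynamics

    def forward(self, view, t):
        # A real renderer would splat deformed Gaussians; here we only
        # modulate the canonical image by view and time for illustration.
        return torch.sigmoid(self.canonical + t * self.motion + 0.01 * view)


class ToyMVDiffusion(nn.Module):
    """Stand-in for MV-VDM: predicts the noise added to multi-view frames."""
    num_timesteps = 1000

    def __init__(self):
        super().__init__()
        self.net = nn.Conv2d(3, 3, 3, padding=1)

    def add_noise(self, x, noise, t):
        alpha = 1.0 - t.float() / self.num_timesteps
        return alpha.sqrt() * x + (1 - alpha).sqrt() * noise

    def predict_noise(self, x_noisy, t):
        return self.net(x_noisy)


def joint_step(renderer, mv_vdm, gt_frames, views, times, w_sds=0.1):
    """One optimization step combining reconstruction and 4D-SDS losses."""
    rendered = torch.stack([renderer(v, t) for v, t in zip(views, times)])  # (B,3,H,W)

    # (a) Reconstruction against frames sampled from the multi-view video model.
    loss_rec = F.l1_loss(rendered, gt_frames)

    # (b) 4D-SDS: score the rendering with the frozen diffusion prior and push
    # the 4D representation along the stop-gradient noise residual.
    noise = torch.randn_like(rendered)
    t_diff = torch.randint(1, mv_vdm.num_timesteps, (1,))
    eps_pred = mv_vdm.predict_noise(mv_vdm.add_noise(rendered, noise, t_diff), t_diff)
    loss_sds = ((eps_pred - noise).detach() * rendered).mean()

    return loss_rec + w_sds * loss_sds


if __name__ == "__main__":
    renderer, mv_vdm = ToyDeformableRenderer(), ToyMVDiffusion()
    for p in mv_vdm.parameters():
        p.requires_grad_(False)              # the diffusion prior stays frozen
    opt = torch.optim.Adam(renderer.parameters(), lr=1e-2)

    views = torch.arange(4).float()          # 4 camera views
    times = torch.linspace(0, 1, 4)          # 4 timestamps
    gt_frames = torch.rand(4, 3, 32, 32)     # placeholder for MV-VDM frames

    for _ in range(10):
        opt.zero_grad()
        loss = joint_step(renderer, mv_vdm, gt_frames, views, times)
        loss.backward()
        opt.step()
```

The detached noise residual multiplied by the rendering is the usual surrogate form of the SDS gradient; combining it with the reconstruction term is what lets the diffusion prior refine motion while the generated multi-view frames anchor appearance.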
Citation
@article{jiang2024animate3d,
  title   = {Animate3D: Animating Any 3D Model with Multi-view Video Diffusion},
  author  = {Yanqin Jiang and Chaohui Yu and Chenjie Cao and Fan Wang and Weiming Hu and Jin Gao},
  journal = {arXiv},
  year    = {2024}
}