Animate Anyone: Image-to-video synthesis for character animation

Original link: https://humanaigc.github.io/animate-anyone/

Developed by researchers at Alibaba's Institute for Intelligent Computing, this new technique provides consistent and controllable image-to-video synthesis for character animation. Called "Animate Anyone", it leverages powerful diffusion models while using its ReferenceNet to preserve the consistency of intricate appearance features from a reference image. Thanks to an effective temporal modeling method and an efficient pose guider that maintains control throughout the process, the technique also keeps motion between video frames as smooth as possible. Evaluated on the UBC fashion video dataset and the TikTok dataset, it achieves state-of-the-art results in both fashion video and human dance synthesis. For more information, visit https://humanaigc.github.io/animate-anyone/. The full research paper (including code) can be found at https://arxiv.org/abs/2311.17117.
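To make the role of the pose guider more concrete, here is a minimal PyTorch sketch, not the authors' code: it assumes the guider is a small convolutional encoder that downsamples a rendered skeleton image to the latent resolution and adds it to the noisy latent. The class name `PoseGuider`, the channel counts, and the strides are illustrative assumptions.

```python
# Minimal sketch of a pose guider: a lightweight conv encoder that maps a
# rendered pose/skeleton image to the latent resolution and adds it to the
# noisy latent. All sizes below are illustrative, not the paper's values.
import torch
import torch.nn as nn


class PoseGuider(nn.Module):
    def __init__(self, in_channels: int = 3, latent_channels: int = 4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 16, 3, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.SiLU(),
            # zero-initialized projection so pose guidance starts as a no-op
            nn.Conv2d(64, latent_channels, 3, padding=1),
        )
        nn.init.zeros_(self.encoder[-1].weight)
        nn.init.zeros_(self.encoder[-1].bias)

    def forward(self, pose: torch.Tensor, noisy_latent: torch.Tensor) -> torch.Tensor:
        # pose:         (B, 3, H, W)     rendered skeleton image
        # noisy_latent: (B, 4, H/8, W/8) latent of the denoising model
        return noisy_latent + self.encoder(pose)


if __name__ == "__main__":
    pose = torch.randn(1, 3, 512, 512)
    latent = torch.randn(1, 4, 64, 64)
    print(PoseGuider()(pose, latent).shape)  # torch.Size([1, 4, 64, 64])
```

Zero-initializing the final projection is a common choice for conditioning branches, since it lets training start from the unconditioned model and gradually learn to use the pose signal.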

The author's commentary highlights concerns about using deep learning to create synthetic video content, particularly issues of privacy, ethics, and potential misuse of the technology. While acknowledging the exciting progress in the field, the author raises questions about how the technology will affect society and whether it will undermine traditional standards of evidence gathering. The spread of such videos is seen as a potential threat to established evidentiary practices, raising the question of how to verify the integrity of recorded footage once video content can be convincingly simulated by deep learning algorithms. In addition, the author argues that the presentation of the initial results, which feature mostly female subjects, reinforces existing gender disparities in STEM fields. Overall, the author encourages reflection and careful consideration of the consequences of this emerging technology.

Original text

Character Animation aims to generate character videos from still images through driving signals. Currently, diffusion models have become the mainstream in visual generation research, owing to their robust generative capabilities. However, challenges persist in the realm of image-to-video, especially in character animation, where temporally maintaining consistency with the detailed information of the character remains a formidable problem. In this paper, we leverage the power of diffusion models and propose a novel framework tailored for character animation. To preserve the consistency of intricate appearance features from the reference image, we design ReferenceNet to merge detail features via spatial attention. To ensure controllability and continuity, we introduce an efficient pose guider to direct the character's movements and employ an effective temporal modeling approach to ensure smooth transitions between video frames. By expanding the training data, our approach can animate arbitrary characters, yielding superior results in character animation compared to other image-to-video methods. Furthermore, we evaluate our method on benchmarks for fashion video and human dance synthesis, achieving state-of-the-art results.
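The abstract's description of merging reference detail features via spatial attention can be read as self-attention whose key/value tokens are extended with tokens from a reference-image encoder. The following is a minimal PyTorch sketch under that assumption; `ReferenceSpatialAttention`, the tensor shapes, and the residual wiring are illustrative and not the paper's implementation.

```python
# Sketch of ReferenceNet-style feature merging: the denoising features attend
# over a token sequence that concatenates themselves with reference-image
# features, so appearance detail from the reference can be copied over.
import torch
import torch.nn as nn


class ReferenceSpatialAttention(nn.Module):
    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor, ref: torch.Tensor) -> torch.Tensor:
        # x:   (B, N, C) flattened spatial tokens of the denoising network
        # ref: (B, M, C) tokens from a ReferenceNet-like encoder
        q = self.norm(x)
        kv = torch.cat([q, self.norm(ref)], dim=1)   # concatenate along the token axis
        out, _ = self.attn(q, kv, kv, need_weights=False)
        return x + out                               # residual connection


if __name__ == "__main__":
    B, H, W, C = 2, 16, 16, 320
    x = torch.randn(B, H * W, C)     # denoising features, flattened H*W
    ref = torch.randn(B, H * W, C)   # reference-image features at the same scale
    y = ReferenceSpatialAttention(C)(x, ref)
    print(y.shape)                   # torch.Size([2, 256, 320])
```

Concatenating along the spatial token axis (rather than cross-attending to a pooled embedding) is what would let the model transfer fine-grained, spatially aligned appearance detail; the temporal modeling for smooth inter-frame transitions would be handled by separate attention across the frame axis, which is omitted here.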
