
LivePortrait
LivePortrait is an innovative AI portrait animation framework open-sourced by Kuaishou (Kwai) that generates highly realistic animated video portraits from a single static portrait image. The model can precisely control eye gaze direction and lip opening and closing, and it also handles seamless stitching of multiple portraits, so that when different subjects are composited into one video the transitions are natural and smooth, with no abrupt boundary artifacts. LivePortrait scales its training data to roughly 69 million high-quality frames and adopts a mixed image-video training strategy to generalize better across a wider range of inputs. In addition, the model uses compact implicit keypoints to represent a kind of blendshapes, and designs stitching and retargeting modules as small MLPs with negligible computational overhead, enhancing controllability over the generated animation.
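To make the "small MLP" idea concrete, the sketch below shows how a lightweight MLP can map source keypoints plus a driving condition to per-keypoint offsets. This is an illustrative toy, not LivePortrait's actual architecture: the keypoint count, layer sizes, and the 2-D condition vector are all assumptions.

```python
import numpy as np

def mlp(x, W1, b1, W2, b2):
    # One hidden layer with ReLU; the real retargeting modules differ in detail.
    h = np.maximum(0.0, x @ W1 + b1)
    return h @ W2 + b2

rng = np.random.default_rng(0)
K = 21             # number of implicit keypoints (assumed)
d_in = K * 3 + 2   # flattened 3-D keypoints + a 2-D driving condition (assumed)
d_hid = 128        # hidden width (assumed)
d_out = K * 3      # per-keypoint 3-D offsets applied before warping

# Random untrained weights, just to exercise the forward pass.
W1 = rng.standard_normal((d_in, d_hid)) * 0.02
b1 = np.zeros(d_hid)
W2 = rng.standard_normal((d_hid, d_out)) * 0.02
b2 = np.zeros(d_out)

src_kp = rng.standard_normal(K * 3)   # source implicit keypoints (flattened)
cond = np.array([0.3, 0.1])           # e.g. eye/lip retargeting ratios (assumed)
offsets = mlp(np.concatenate([src_kp, cond]), W1, b1, W2, b2)
retargeted = src_kp + offsets         # offsets are added to the source keypoints
print(retargeted.shape)               # (63,)
```

The point of the design is cost: a two-layer MLP of this size adds only a few thousand multiply-adds per frame, which is negligible next to the warping and decoding networks.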
Deploying LivePortrait is relatively simple: clone the repository, create a virtual environment, and install the required dependencies along with FFmpeg. After downloading the model weights, running the inference script generates the animated video. LivePortrait can also be used online, which offers a convenient option for users who lack sufficient compute or find local deployment cumbersome.
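The steps above might look like the following on a Linux machine. The repository URL and script name come from the public KwaiVGI/LivePortrait repo; the Python version, example asset paths, and flags are assumptions that may have changed, so treat this as a sketch rather than authoritative instructions.

```shell
# clone the repository and enter it
git clone https://github.com/KwaiVGI/LivePortrait
cd LivePortrait

# create and activate an isolated environment (conda assumed)
conda create -n LivePortrait python=3.9 -y
conda activate LivePortrait

# install Python dependencies and FFmpeg
pip install -r requirements.txt
sudo apt-get install -y ffmpeg

# after downloading the pretrained weights per the repo's instructions,
# animate a source portrait (-s) with a driving video (-d)
python inference.py -s assets/examples/source/s6.jpg -d assets/examples/driving/d0.mp4
```

The output video is written to the repo's results directory by default; run the script with `--help` to see the current options.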
In terms of performance, LivePortrait generates a single frame in 12.8 ms on an RTX 4090 GPU, and with further optimization (e.g., TensorRT) this is expected to drop below 10 ms. Training proceeds in two stages: base model training, followed by training of the stitching and retargeting modules. The base model is trained on high-quality data including the public video datasets VoxCeleb, MEAD, and RAVDESS, and the stylized image dataset AAHQ. Thanks to the video-image hybrid training strategy, LivePortrait performs well not only on real portraits but also generalizes well to stylized portraits (such as anime).
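As a quick sanity check on the quoted latency, 12.8 ms per frame corresponds to roughly 78 frames per second, comfortably above real-time video rates:

```python
frame_time_ms = 12.8           # reported single-frame generation time on an RTX 4090
fps = 1000.0 / frame_time_ms   # throughput implied by that latency
print(round(fps))              # 78
```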