Generative AI
LiveAvatar
Real-time streaming avatar generation at 20 FPS. 14B-parameter diffusion model deployed on 5x H800 GPUs.
About the Project
LiveAvatar is a real-time avatar generation system developed in collaboration with Alibaba Group. It produces streaming, arbitrarily long interactive avatar videos driven by audio input, using a 14-billion-parameter diffusion model (WanS2V-14B) to animate characters that respond naturally to speech. Deployed on a 5x H800 GPU cluster, it sustains 20 FPS output for production-grade avatar experiences.
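Conceptually, unbounded streaming generation of this kind works by producing video in short chunks, each conditioned on the incoming audio segment and the tail frames of the previous chunk so motion stays continuous. The sketch below illustrates that loop in plain PyTorch; `denoise_chunk`, the tensor shapes, and the context length are illustrative stand-ins, not LiveAvatar's actual API.

```python
import torch

# Illustrative stand-in for the S2V diffusion model: maps an audio
# segment plus a few context frames to the next short video clip.
# The function and shapes are assumptions for this sketch only.
def denoise_chunk(audio_seg: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
    # (frames, channels, height, width) -- dummy output
    return torch.zeros(16, 3, 512, 512)

def stream_avatar(audio_stream, context_len: int = 4):
    """Yield video chunks indefinitely, one per incoming audio segment."""
    # Start from a neutral context (e.g., frames rendered from a
    # reference image of the avatar).
    context = torch.zeros(context_len, 3, 512, 512)
    for audio_seg in audio_stream:
        clip = denoise_chunk(audio_seg, context)
        # Carry the last few frames forward so the next chunk continues
        # the motion seamlessly -- this is what lets generation run for
        # thousands of seconds without restarting.
        context = clip[-context_len:]
        yield clip
```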
Key Features
- 14B parameter WanS2V diffusion model for photorealistic avatar generation
- Real-time streaming at 20 FPS on a 5x H800 GPU cluster
- Supports 10,000+ seconds of continuous video generation
- 4-step sampling optimization with LoRA fine-tuning (see the sketch after this list)
- Apache 2.0 licensed with open model weights on HuggingFace
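The 4-step sampling feature typically works by loading a distillation LoRA on top of the base model so the sampler converges in a handful of denoising steps instead of dozens. Below is a minimal Diffusers sketch of that general pattern; the repo id and LoRA path are placeholders, the exact speech-to-video pipeline class may differ, and the conditioning inputs (reference image, audio features) are omitted because they depend on that API.

```python
import torch
from diffusers import DiffusionPipeline

# Placeholder identifiers -- swap in the actual published weights.
BASE_MODEL = "org/WanS2V-14B"        # hypothetical HF repo id
STEP_LORA = "org/WanS2V-4step-lora"  # hypothetical distillation LoRA

pipe = DiffusionPipeline.from_pretrained(BASE_MODEL, torch_dtype=torch.bfloat16)
pipe.load_lora_weights(STEP_LORA)  # attach the few-step adapter
pipe.to("cuda")

# With the distillation LoRA active, 4 denoising steps stand in for
# the usual 40-50, which is what makes 20 FPS streaming feasible.
# Conditioning inputs are omitted here; see the pipeline's docs.
frames = pipe(num_inference_steps=4).frames
```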
Impact
Ranked #1 on HuggingFace Papers with 1000+ GitHub stars. Enables next-generation human-AI interaction for enterprise clients.
Tech Stack
PyTorch, WanS2V-14B, Diffusers, DeepSpeed, Gradio, CUDA 12.4
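With Gradio in the stack, a live demo can be served as a generator-based interface that pushes frames to the browser as they are produced. A self-contained sketch with a dummy frame source standing in for the model:

```python
import gradio as gr
import numpy as np

def stream_avatar(audio):
    # Dummy frame source; a real demo would yield frames from the
    # streaming generation loop as they come off the GPU.
    for i in range(60):
        frame = np.full((512, 512, 3), i * 4, dtype=np.uint8)
        yield frame

# Generator functions stream in Gradio: each yield updates the output.
demo = gr.Interface(fn=stream_avatar, inputs=gr.Audio(), outputs=gr.Image())
demo.launch()
```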
Metrics
1000+ GitHub stars
#1 HuggingFace Paper
10,000+ sec video generation
Interested in this project?
Let's discuss how I can build something similar for you.