What shipped
Meituan's LongCat-Video-Avatar turns static photos into speaking avatars, synchronizing lip movements across multiple people simultaneously. The longcat-multi-avatar extension, highlighted in early February 2026, builds on the core model's December 2025 release by handling group conversations rather than single speakers.
The system uses 13.6 billion parameters and runs entirely on local hardware, requiring roughly 10-30 seconds of processing per second of output video. It's MIT licensed, so enterprise deployments carry no subscription costs.
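The throughput figure translates directly into wall-clock render times. A minimal sketch of that arithmetic, using only the 10-30x ratio reported above (the one-minute clip is an illustrative input, not a benchmark from Meituan):

```python
def render_time_range(video_seconds: float) -> tuple[float, float]:
    """Best- and worst-case processing time, at 10-30s of compute
    per second of output video (ratio reported for LongCat)."""
    return video_seconds * 10, video_seconds * 30

best, worst = render_time_range(60)  # a one-minute clip
print(f"~{best / 60:.0f}-{worst / 60:.0f} minutes of processing")
```

At this rate a one-minute clip ties up a GPU for roughly 10-30 minutes, which is why queue-based scheduling (discussed below) matters for shared infrastructure.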
Why this matters
The open-source positioning challenges established players. HeyGen, Hedra, and similar services charge per seat or usage. LongCat eliminates that model for organizations willing to run infrastructure.
Benchmark performance backs the technical claims: LongCat topped EvalTalker's anthropomorphism tests (492 participants) and led the HDTF, CelebV-HQ, and EMTD datasets for lip-sync accuracy and identity consistency. The multi-speaker capability specifically outperforms rivals like InfiniteTalk and OmniHuman at maintaining character coherence across longer sequences.
Meituan's Cross-Chunk Latent Stitching technique enables unlimited video length without quality degradation, addressing a common weakness in diffusion models. The system handles 140+ languages, relevant for APAC organizations managing multilingual content.
Implementation trade-offs
Local deployment means infrastructure overhead. Organizations need GPU capacity, though a distilled variant cuts VRAM requirements by 50% for smaller teams. Output quality at 480p-720p sits below some commercial alternatives but meets training, internal comms, and prototype use cases.
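For capacity planning, the parameter count gives a back-of-the-envelope VRAM floor. A rough sketch, assuming bf16 weights (2 bytes per parameter — my assumption, not a figure from Meituan) and applying the 50% reduction cited for the distilled variant:

```python
def weight_vram_gb(params_billions: float, bytes_per_param: int = 2) -> float:
    """Approximate VRAM for model weights alone, in GiB.
    Excludes activations, KV caches, and framework overhead."""
    return params_billions * 1e9 * bytes_per_param / 1024**3

full = weight_vram_gb(13.6)   # ~25 GiB for weights in bf16 (assumed precision)
distilled = full * 0.5        # article cites a 50% VRAM reduction
```

Real usage will run higher once activations and decoding buffers are counted; the point is that the full model sits in datacenter-GPU territory, while the distilled variant comes within reach of high-end workstation cards.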
Queue-based processing introduces latency variability. The distilled version trades some quality for resource efficiency.
WaveSpeedAI began hosting the model in early 2026, and ComfyUI wrappers now enable workflow integration. No indication yet of enterprise support contracts or SLAs from Meituan.
What to watch
Whether enterprises adopt this over managed services depends on volume economics and acceptable quality thresholds. The MIT license removes licensing friction, but operational complexity remains. Organizations already running local AI infrastructure have the clearest path to deployment.
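The volume-economics question reduces to a break-even comparison. A hypothetical sketch; every figure below is an illustrative assumption, not a quoted price from Meituan or any SaaS vendor:

```python
def breakeven_seats(gpu_monthly_cost: float, saas_per_seat: float) -> float:
    """Number of seats at which self-hosting matches a per-seat SaaS bill."""
    return gpu_monthly_cost / saas_per_seat

# Assumed: $1,500/mo for a dedicated GPU node vs. $60/seat/mo SaaS pricing.
seats = breakeven_seats(gpu_monthly_cost=1500.0, saas_per_seat=60.0)
print(f"break-even at {seats:.0f} seats")
```

Under those assumed numbers the crossover sits at 25 seats on raw infrastructure cost alone; operational staffing, which the article flags as the remaining friction, shifts the real break-even higher.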