A production-ready deployment solution for MeiGen-AI's InfiniteTalk and MultiTalk models, featuring automatic model management, Docker containerization, and a user-friendly Gradio interface.
## Features

- 🚀 One-Click Deployment - Automated model download and Docker containerization
- 📦 Complete Model Support - All 16 official models (InfiniteTalk + MultiTalk)
- 🎯 Smart Model Management - Auto-download missing models, auto-unload after 5min idle
- 🖥️ Modern Web UI - Gradio-based interface with real-time progress tracking
- 🔄 Multi-Mode Support - Image-to-video and video-to-video generation
- 💾 Optimized Storage - Supports INT8/FP8 quantized models (228GB total)
- 🌐 Production Ready - Nginx reverse proxy with SSL and authentication
## Table of Contents

- Features
- Quick Start
- Installation
- Model Guide
- Configuration
- Usage
- Tech Stack
- Project Structure
- Contributing
- License
## Quick Start

```bash
# Clone repository
git clone https://github.com/neosun100/infinitetalk-deployment.git
cd infinitetalk-deployment

# Start with Docker
docker-compose up -d

# Access UI at http://localhost:8418
```

## Installation

### Prerequisites

- Docker >= 20.10
- Docker Compose >= 2.0
- NVIDIA GPU with CUDA support
- nvidia-docker2 installed
### Docker Deployment

```bash
docker pull infinitetalk:latest

docker run -d \
  --name infinitetalk \
  --gpus all \
  -p 8418:7860 \
  -v /storage/infinitetalk/models:/app/models \
  infinitetalk:latest
```

```bash
# Check container status
docker ps | grep infinitetalk

# View logs
docker logs -f infinitetalk

# Access UI
curl http://localhost:8418
```

#### Environment Variables

| Variable | Description | Default |
|---|---|---|
| `GRADIO_SERVER_PORT` | Web UI port | 7860 |
| `IDLE_TIMEOUT` | Model auto-unload timeout (seconds) | 300 |
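
For reference, a minimal sketch of how these variables might be read at startup (standard `os.environ` lookups; not necessarily the exact code in `app.py`):

```python
import os

# Defaults match the table above.
port = int(os.environ.get("GRADIO_SERVER_PORT", "7860"))   # Web UI port
idle_timeout = int(os.environ.get("IDLE_TIMEOUT", "300"))  # seconds of inactivity before auto-unload
```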
#### Docker Compose

Alternatively, deploy with Docker Compose:

```yaml
version: '3.8'

services:
  infinitetalk:
    image: infinitetalk:latest
    container_name: infinitetalk
    restart: unless-stopped
    ports:
      - "8418:7860"
    volumes:
      - /storage/infinitetalk/models:/app/models
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    environment:
      - GRADIO_SERVER_PORT=7860
      - IDLE_TIMEOUT=300
```

### Manual Installation

Requirements:

- Python 3.10+
- CUDA 11.8+ / CUDA 12.1+
- 32GB+ RAM
- 500GB+ free disk space
Install dependencies:

```bash
pip install -r requirements.txt
```

Models are downloaded automatically on first run. You can also download them manually:

```bash
# Download all models (228GB)
bash download_models.sh

# Or download specific models
bash download_multitalk.sh
```

Then start the app:

```bash
python app.py
```

The application will be available at http://localhost:7860.
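
The auto-download on first run follows a check-then-fetch pattern. A minimal sketch using `huggingface_hub` (the repo ID and filename below are placeholders, not the exact values used by the download scripts):

```python
from pathlib import Path

from huggingface_hub import hf_hub_download

def ensure_model(repo_id: str, filename: str, models_dir: str = "/app/models") -> Path:
    """Download a model file from the Hugging Face Hub only if it is missing locally."""
    target = Path(models_dir) / filename
    if not target.exists():
        hf_hub_download(repo_id=repo_id, filename=filename, local_dir=models_dir)
    return target

# Placeholder repo/filename for illustration only.
ensure_model("MeiGen-AI/InfiniteTalk", "single/infinitetalk.safetensors")
```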
## Model Guide

### InfiniteTalk Models

| Model | Size | Type | Use Case |
|---|---|---|---|
| ⭐ Single (Original) | 11GB | Standard | Single person talking, recommended for beginners |
| ⭐ Multi (Original) | 9.95GB | Standard | Multi-person conversation, recommended |
| Single INT8 | 19.5GB | Quantized | Higher quality, single person |
| Single INT8 LoRA | 19.5GB | Quantized+Style | Style control support |
| Multi INT8 | 19.5GB | Quantized | Higher quality, multi-person |
| Multi INT8 LoRA | 19.5GB | Quantized+Style | Multi-person with style control |
| Single FP8 | 19.5GB | Quantized | Balanced quality/speed |
| Multi FP8 | 19.5GB | Quantized | Balanced quality/speed |
| Multi FP8 LoRA | 19.5GB | Quantized+Style | Multi-person with style |
| T5 FP8 | 6.73GB | Auxiliary | Text encoder (optional) |
### MultiTalk Models

| Model | Size | Type | Use Case |
|---|---|---|---|
| 🎭 MultiTalk (Original) | 9.95GB | Standard | Multi-person conversation |
| MultiTalk INT8 | 19.1GB | Quantized | Higher quality |
| MultiTalk INT8 FusionX | 19.1GB | Fast | 2-3x faster (4-8 steps) |
| MultiTalk FP8 FusionX | 19.1GB | Fast | Balanced speed/quality |
| MultiTalk T5 INT8 | 6.73GB | Auxiliary | Text encoder |
| MultiTalk T5 FP8 | 6.73GB | Auxiliary | Text encoder |
**Total: 228GB (all 16 models)**

For a detailed model selection guide, see [MODEL_GUIDE.md](MODEL_GUIDE.md).
## Configuration

### Nginx Reverse Proxy

```nginx
server {
    listen 443 ssl;
    server_name infinitetalk.yourdomain.com;

    ssl_certificate     /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;

    auth_basic "Restricted Access";
    auth_basic_user_file /etc/nginx/.htpasswd;

    location / {
        proxy_pass http://localhost:8418;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
```

### Model Storage

Models are stored in `/app/models` inside the container, mapped to `/storage/infinitetalk/models` on the host:
```text
models/
├── single/
│   └── infinitetalk.safetensors (11GB)
├── multi/
│   └── infinitetalk.safetensors (9.95GB)
├── quant_models/
│   ├── infinitetalk_single_int8.safetensors (19.5GB)
│   ├── infinitetalk_multi_fp8.safetensors (19.5GB)
│   └── ... (7 more models)
└── multitalk/
    ├── multitalk.safetensors (9.95GB)
    └── quant_models/ (5 models)
```
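
A minimal sketch of resolving files under this layout (`model_path` is illustrative, not a function in `app.py`):

```python
from pathlib import Path

MODELS_ROOT = Path("/app/models")  # mapped to /storage/infinitetalk/models on the host

def model_path(*parts: str) -> Path:
    """Build a path under the mounted models directory."""
    return MODELS_ROOT.joinpath(*parts)

path = model_path("single", "infinitetalk.safetensors")
print(path, "exists" if path.exists() else "missing (will be auto-downloaded)")
```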
## Usage

1. **Select Model Type**: Choose InfiniteTalk or MultiTalk
2. **Select Model**: Pick from the available models
3. **Load Model**: Click the "🔄 Load Model" button
4. **Choose Mode**: Image-to-video or video-to-video
5. **Upload Files**: Upload an image/video and audio
6. **Generate**: Click "🎬 Generate Video"
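
A condensed sketch of how these steps map onto Gradio components (the handlers and model choices are placeholders; see `app.py` for the real interface):

```python
import gradio as gr

def load_model(name):        # placeholder for the app's real loader
    return f"✅ Loaded {name}"

def generate(image, audio):  # placeholder for the app's real generator
    return "output.mp4"

with gr.Blocks() as demo:
    model = gr.Dropdown(["Single (Original)", "Multi (Original)"], label="Model")
    load_btn = gr.Button("🔄 Load Model")
    status = gr.Textbox(label="Status")
    image = gr.Image(type="filepath", label="Image")
    audio = gr.Audio(type="filepath", label="Audio")
    gen_btn = gr.Button("🎬 Generate Video")
    video = gr.Video(label="Result")

    load_btn.click(load_model, inputs=model, outputs=status)
    gen_btn.click(generate, inputs=[image, audio], outputs=video)

demo.launch(server_port=7860)
```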
Additional behaviors:

- Auto Model Management: Models auto-download if missing
- Smart Memory: Auto-unload after 5 minutes of inactivity (see the sketch below)
- Real-time Progress: Download and generation progress tracking
- Model Details: View model info, size, and recommendations
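
The 5-minute auto-unload can be implemented as a background watchdog thread. A minimal sketch, assuming a global model handle and PyTorch (not the exact mechanism in `app.py`):

```python
import threading
import time

import torch

model = None            # global model handle, set by the loader
last_used = time.time()
IDLE_TIMEOUT = 300      # seconds, matching the IDLE_TIMEOUT env variable

def touch():
    """Reset the idle clock; call this on every generation request."""
    global last_used
    last_used = time.time()

def watchdog():
    global model
    while True:
        time.sleep(10)
        if model is not None and time.time() - last_used > IDLE_TIMEOUT:
            model = None              # drop the reference so Python can free it
            torch.cuda.empty_cache()  # release cached GPU memory back to the driver

threading.Thread(target=watchdog, daemon=True).start()
```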
## Tech Stack

- Backend: Python 3.10, Gradio 6.0
- Deep Learning: PyTorch, Diffusers
- Containerization: Docker, Docker Compose
- Web Server: Nginx (reverse proxy)
- Models: InfiniteTalk, MultiTalk (MeiGen-AI)
## Project Structure

```text
infinitetalk-deployment/
├── app.py                    # Main Gradio application
├── Dockerfile                # Docker image definition
├── download_in_container.sh  # Auto-download script
├── download_models.sh        # Manual download script
├── MODEL_GUIDE.md            # Detailed model guide
├── README.md                 # English documentation
├── README_CN.md              # Chinese documentation
├── README_TW.md              # Traditional Chinese documentation
├── README_JP.md              # Japanese documentation
└── models/                   # Model storage (gitignored)
```
## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
## Changelog

### Initial Release
- ✅ Complete deployment solution for InfiniteTalk & MultiTalk
- ✅ Docker containerization with auto-download
- ✅ All 16 official models support (228GB)
- ✅ Gradio web interface with real-time progress
- ✅ Auto model management (download, load, unload)
- ✅ Fixed file size calculation (GB vs GiB)
- ✅ Nginx reverse proxy configuration
- ✅ Multi-language documentation (EN/CN/TW/JP)
#### Features
- Auto-download missing models on startup
- Smart memory management (5min idle timeout)
- Real-time download progress tracking
- Model selection with detailed descriptions
- Support for both InfiniteTalk and MultiTalk
- INT8/FP8 quantized models support
- Image-to-video and video-to-video modes
#### Technical Details
- Fixed GB/GiB calculation inconsistency (see the example below)
- Optimized Docker CMD for proper startup
- Implemented model auto-unload mechanism
- Added comprehensive model metadata
- Created detailed model selection guide
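
On the GB/GiB fix referenced above: decimal gigabytes (10⁹ bytes) and binary gibibytes (2³⁰ bytes) differ by roughly 7%, so mixing them makes a 19.5GB file appear as about 18.2GiB:

```python
size_bytes = 19_500_000_000  # a "19.5GB" model file

gb = size_bytes / 10**9      # decimal gigabytes: 19.50
gib = size_bytes / 2**30     # binary gibibytes: ~18.16

print(f"{gb:.2f} GB == {gib:.2f} GiB")  # 19.50 GB == 18.16 GiB
```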
## License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
The InfiniteTalk and MultiTalk models are licensed by MeiGen-AI under Apache 2.0.
## Acknowledgments

- MeiGen-AI for the amazing InfiniteTalk and MultiTalk models
- Gradio for the web interface framework
- All contributors and users of this project
Note: This is a deployment wrapper. For the original InfiniteTalk/MultiTalk code, visit:
- InfiniteTalk: https://github.com/MeiGen-AI/InfiniteTalk
- MultiTalk: https://huggingface.co/MeiGen-AI/MeiGen-MultiTalk
