115 changes: 115 additions & 0 deletions README_SETUP.md
@@ -0,0 +1,115 @@
# Open-Sora 2.0 Complete Setup Guide

## 🎯 Current Status
✅ **Environment Setup Complete**
- Virtual environment: `sora-env`
- Dependencies installed with CPU compatibility
- OpenSora package installed
- Models directory created

## 🚨 Python 3.12 Compatibility Issue
The current setup has a known issue with Python 3.12: the pinned PyTorch release's Dynamo compiler does not fully support it. For full functionality, use Python 3.10 or 3.11.
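Before installing, it is worth checking the interpreter version up front. A minimal sketch (the `python_version_ok` helper is this guide's own, not part of Open-Sora):

```python
import sys

def python_version_ok(major=sys.version_info.major, minor=sys.version_info.minor):
    """Return True if the interpreter is in the range this setup is
    known to work with (Python 3.10 or 3.11)."""
    return (3, 10) <= (major, minor) <= (3, 11)

if __name__ == "__main__":
    if not python_version_ok():
        # TorchDynamo issues are likely outside the supported range
        print(f"Warning: Python {sys.version_info.major}.{sys.version_info.minor} "
              "may hit PyTorch Dynamo issues; prefer 3.10/3.11.")
```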

## 📁 Directory Structure
```
~/Open-Sora-All/
├── sora-env/ # Virtual environment
├── models/ # Model files (download required)
├── opensora/ # Source code
├── scripts/ # Inference scripts
├── configs/ # Configuration files
└── requirements_fixed.txt # Fixed dependencies
```
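A quick way to sanity-check that the layout above is in place is to scan for missing entries. A small sketch (the helper name and `EXPECTED` list are this guide's, derived from the tree above):

```python
from pathlib import Path

# Top-level entries expected under ~/Open-Sora-All per the tree above
EXPECTED = ["sora-env", "models", "opensora", "scripts", "configs"]

def missing_entries(root, expected=EXPECTED):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [name for name in expected if not (root / name).exists()]
```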

## 🔽 Model Download Requirements

### Core Models (~50GB total)
1. **Open-Sora 2.0 Main Model** (~23.8GB)
- File: `Open_Sora_v2.safetensors`
- Location: `models/opensora/`

2. **Flux Text-to-Image Model** (~23.8GB)
- File: `flux1-dev.safetensors`
- Location: `models/flux/`

3. **Video Autoencoder Models** (~2.3GB)
- HunyuanVideo VAE: `hunyuan_vae.safetensors`
- Location: `models/vae/`

4. **Text Encoders**
- T5-XXL model files
- CLIP model files
- Location: `models/text_encoders/`

### Audio/Voice Models (for full video+audio generation)
5. **Audio Synthesis Models**
- Text-to-speech models
- Audio synchronization models
- Location: `models/audio/`
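Since the downloads are large, it helps to check which model files are already in place before re-fetching anything. A minimal sketch using the core-model paths listed above (the `MODEL_FILES` mapping follows this guide's layout, not an official manifest):

```python
from pathlib import Path

# Expected file locations for the core models listed above
MODEL_FILES = {
    "Open-Sora 2.0": Path("models/opensora/Open_Sora_v2.safetensors"),
    "Flux text-to-image": Path("models/flux/flux1-dev.safetensors"),
    "HunyuanVideo VAE": Path("models/vae/hunyuan_vae.safetensors"),
}

def downloaded_models(root="."):
    """Return the names of models whose files already exist under `root`."""
    root = Path(root)
    return [name for name, rel in MODEL_FILES.items() if (root / rel).exists()]
```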

## 🚀 Usage Instructions

### 1. Activate Environment
```bash
cd ~/Open-Sora-All
source sora-env/bin/activate
```

### 2. Download Models
```bash
python setup_models.py # Shows structure and requirements
```

### 3. Basic Video Generation (once models are downloaded)
```bash
# Alternative method due to Python 3.12 compatibility
python -c "
import opensora
from opensora.models import create_model
# Custom inference code here
"
```

### 4. Video + Audio Generation
For full text-to-video with synchronized audio:
```bash
# Example command (adjust based on actual API)
python generate_video_with_audio.py \
--prompt "A sunrise over calm ocean with gentle waves" \
--voice "calm narrator voice" \
--duration 5 \
--resolution 1024x576 \
--output sunrise_with_voice.mp4
```

## 🔧 Configuration Options

### Video Settings
- **Resolution**: 256px, 768px, 1024px options
- **Duration**: 2s to 15s
- **Aspect Ratios**: Any ratio supported
- **Frame Rate**: 24fps default

### Audio Settings
- **Voice Types**: Multiple TTS voices available
- **Audio Quality**: High-fidelity synthesis
- **Synchronization**: Automatic lip-sync for characters
- **Background Music**: Optional ambient audio

## 📖 Next Steps

1. **Download Models**: Follow Open-Sora documentation for model downloads
2. **Python Version**: Consider using Python 3.10/3.11 for full compatibility
3. **GPU Setup**: For faster generation, configure CUDA if available
4. **Custom Configs**: Modify config files in `configs/` for specific needs

## 🔗 Resources
- [Open-Sora GitHub](https://github.com/hpcaitech/Open-Sora)
- [Model Downloads](https://huggingface.co/hpcai-tech)
- [Documentation](https://hpcaitech.github.io/Open-Sora/)

## 🆘 Troubleshooting
- **Import Errors**: Ensure virtual environment is activated
- **Model Not Found**: Check model file paths and downloads
- **Memory Issues**: Use smaller resolutions or shorter durations
- **Python 3.12 Issues**: Consider downgrading to Python 3.10/3.11
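The checklist above can be gathered in one pass. A hypothetical diagnostic sketch (the `diagnose` helper is this guide's own, not part of Open-Sora):

```python
import importlib.util
import os
import sys

def diagnose():
    """Run quick environment checks matching the troubleshooting list above."""
    return {
        # An active venv exports VIRTUAL_ENV
        "venv_active": bool(os.environ.get("VIRTUAL_ENV")),
        # Python 3.10/3.11 avoids the Dynamo issue
        "python_3_10_or_3_11": (sys.version_info.major, sys.version_info.minor) in [(3, 10), (3, 11)],
        # Importability without actually importing (cheap and side-effect free)
        "opensora_importable": importlib.util.find_spec("opensora") is not None,
        "torch_importable": importlib.util.find_spec("torch") is not None,
    }

if __name__ == "__main__":
    for check, ok in diagnose().items():
        print(f"{'✅' if ok else '❌'} {check}")
```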
6 changes: 3 additions & 3 deletions requirements.txt
@@ -1,11 +1,11 @@
torch==2.4.0
torchvision==0.19.0
torch==2.2.2
torchvision==0.17.2
colossalai>=0.4.4
mmengine>=0.10.3
ftfy>=6.2.0 # for t5
accelerate>=0.29.2 # for t5
av==13.1.0 # for video loading
liger-kernel==0.5.2
#liger-kernel==0.5.2
pandas>=2.0.3
pandarallel>=1.6.5
openai>=1.52.2
16 changes: 16 additions & 0 deletions requirements_fixed.txt
@@ -0,0 +1,16 @@
numpy==1.26.4
torch==2.2.2
torchvision==0.17.2
colossalai>=0.4.4
mmengine>=0.10.3
ftfy>=6.2.0
accelerate>=0.29.2
av==13.1.0
pandas>=2.0.3
pandarallel>=1.6.5
openai>=1.52.2
wandb>=0.17.0
tensorboard>=2.14.0
pre-commit>=3.5.0
omegaconf>=2.3.0
pyarrow
83 changes: 83 additions & 0 deletions setup_models.py
@@ -0,0 +1,83 @@
#!/usr/bin/env python3
"""
Open-Sora 2.0 Model Download Script
Downloads all required models for video+audio generation
"""

import requests
from pathlib import Path
from tqdm import tqdm


def download_file(url, filepath):
    """Download a file with a progress bar."""
    response = requests.get(url, stream=True, timeout=30)
    response.raise_for_status()
    total_size = int(response.headers.get('content-length', 0))

    with open(filepath, 'wb') as file, tqdm(
        desc=filepath.name,
        total=total_size,
        unit='B',
        unit_scale=True,
        unit_divisor=1024,
    ) as pbar:
        for chunk in response.iter_content(chunk_size=8192):
            if chunk:
                file.write(chunk)
                pbar.update(len(chunk))


def setup_model_directories():
    """Create the model directory structure."""
    models_dir = Path("models")
    subdirs = ["opensora", "flux", "vae", "text_encoders", "audio"]
    for subdir in subdirs:
        (models_dir / subdir).mkdir(parents=True, exist_ok=True)
    return models_dir


def main():
    print("🚀 Setting up Open-Sora 2.0 Models")
    print("=" * 50)

    models_dir = setup_model_directories()

    # Model URLs (these would be the actual URLs from Open-Sora documentation)
    models = {
        "Open-Sora 2.0 Main Model": {
            "url": "https://huggingface.co/hpcai-tech/Open-Sora-2.0/resolve/main/Open_Sora_v2.safetensors",
            "path": models_dir / "opensora" / "Open_Sora_v2.safetensors",
            "size": "~23.8GB",
        },
        "Flux Text-to-Image": {
            "url": "https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/flux1-dev.safetensors",
            "path": models_dir / "flux" / "flux1-dev.safetensors",
            "size": "~23.8GB",
        },
        "HunyuanVideo VAE": {
            "url": "https://huggingface.co/tencent/HunyuanVideo/resolve/main/hunyuan_video_vae_bf16.safetensors",
            "path": models_dir / "vae" / "hunyuan_vae.safetensors",
            "size": "~2.3GB",
        },
    }

    print("📋 Models to download:")
    for name, info in models.items():
        print(f"  • {name}: {info['size']}")

    print("\n💾 Total download size: ~50GB")
    print(f"📁 Models will be saved to: {models_dir.absolute()}")

    # Note: Actual downloading would require valid URLs and proper authentication
    print("\n⚠️  Note: This script shows the structure.")
    print("📖 Please refer to Open-Sora documentation for actual model URLs and download instructions.")
    print("🔗 Visit: https://github.com/hpcaitech/Open-Sora")


if __name__ == "__main__":
    main()
40 changes: 40 additions & 0 deletions setup_open_sora.sh
@@ -0,0 +1,40 @@
#!/bin/bash
set -e

# Automated setup script for Open-Sora in a single folder

# Create the main folder and navigate into it
mkdir -p ~/Open-Sora-All
cd ~/Open-Sora-All

# Clone the Open-Sora repository directly into the current folder
# (git refuses to clone into a non-empty directory)
git clone https://github.com/hpcaitech/Open-Sora .

# Create a virtual environment inside the folder and activate it
python3 -m venv sora-env
source sora-env/bin/activate

# Edit requirements.txt for CPU-only compatibility.
# Note: `sed -i ''` is the macOS/BSD form; on GNU sed use `sed -i` with no ''.
# The `==` anchors matter: a bare `^torch` would also match the torchvision line.

# Pin torch to torch==2.4.0
sed -i '' 's/^torch==.*/torch==2.4.0/' requirements.txt

# Pin torchvision to torchvision==0.19.0
sed -i '' 's/^torchvision==.*/torchvision==0.19.0/' requirements.txt

# Comment out GPU-only packages: triton and liger-kernel
sed -i '' '/^triton/s/^/#/' requirements.txt
sed -i '' '/^liger-kernel/s/^/#/' requirements.txt

# Upgrade pip and install dependencies
pip install --upgrade pip
pip install -r requirements.txt

# Create models folder
mkdir -p models

# Note: Download model files as per Open-Sora README instructions
echo "Setup complete. Please download required model files into ~/Open-Sora-All/models/ as per the Open-Sora README instructions."
echo "To run video generation, navigate to ~/Open-Sora-All, activate the venv with 'source sora-env/bin/activate', and run your commands."