# Ovi FusionModel - FP8 Quantized

This is the Ovi FusionModel quantized with FP8 (e4m3_e4m3_dynamic_per_tensor) for faster inference.
## Quantization Details

- **Video Model Blocks**: 30 blocks quantized
- **Audio Model Blocks**: 30 blocks quantized
- **Attention/FFN layers**: e4m3_e4m3_dynamic_per_tensor
- **Other layers**: e4m3_weightonly

## Usage

```python
import sys
import os
import torch
from omegaconf import OmegaConf
from huggingface_hub import hf_hub_download

# Make the Ovi repo importable and run from its root so relative paths resolve
OVI_PATH = "./workspace/Ovi"
sys.path.insert(0, OVI_PATH)
os.chdir(OVI_PATH)

from ovi.ovi_fusion_engine import OviFusionEngine

# Download quantized weights
model_path = hf_hub_download(
    repo_id="wavespeed/Ovi-e4m3_e4m3_dynamic_per_tensor",
    filename="model.pth"
)

config = OmegaConf.load("config.yaml")
engine = OviFusionEngine(config=config, device="cuda", target_dtype=torch.bfloat16)

# Load quantized weights (load to CPU first; load_state_dict copies onto the
# engine's CUDA parameters)
engine.model.load_state_dict(torch.load(model_path, map_location="cpu"))

# Model is already quantized, ready for inference
```

## Model Card

- **Developed by**: Alibaba/Character.AI
- **Model type**: Video + Audio generation (FusionModel)
- **Quantization**: FP8 (e4m3_e4m3_dynamic_per_tensor)
- **License**: See the original Ovi repository

## Original Model

Based on [Ovi](https://github.com/character-ai/Ovi)