🐎 Happy Horse 1.0
Alibaba's 15B-parameter open model. It debuted anonymously on the Artificial Analysis Video Arena in April 2026 and took #1 in both Text-to-Video and Image-to-Video on blind human preference votes. Its signature trick is generating video and synchronized audio jointly in a single pass, with near-perfect lip-sync across seven languages. It is also fast.
Why it stands out
- #1 ranked, blind-tested: 1333 Elo (T2V) and 1392 Elo (I2V), top of public leaderboards by real human preference.
- Joint audio-video: a unified single-stream Transformer generates picture and sound together, so audio fits the scene.
- 7-language lip-sync with industry-low word error rate: English, Mandarin, Cantonese, Japanese, Korean, German, French.
- Fast & cheap: ~10s average generation; a 720p tier runs ~half the price of 1080p.
Specs at a glance
How to access
Available via API on fal (T2V, I2V, R2V, and video-edit endpoints), Alibaba Cloud Bailian, and assorted wrappers (MuAPI, etc.). The open-source release includes the base model, a distilled model, a super-resolution module, and inference code, with commercial-use rights.
curl -X POST 'https://happyhorse.app/api/generate' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "happyhorse-1.0/video",
    "prompt": "A cinematic shot of mountains at sunrise",
    "mode": "pro",
    "duration": 5,
    "aspect_ratio": "16:9"
  }'
Modes: pro vs. std. Turning audio on or off changes the credit cost. multi_shots supports multi-prompt sequences, where the total duration is the sum of the per-shot durations.
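The duration rule can be sketched in Python. This is a minimal sketch, not the documented schema: the shape of multi_shots (a list of objects with prompt and duration fields) and the helper name build_multi_shot_payload are assumptions made for illustration, so check your provider's API reference for the real field layout.

```python
import json


def build_multi_shot_payload(shots, mode="std"):
    """Build a request body whose total duration is the sum of the shots.

    `shots` is a list of (prompt, seconds) pairs. The "multi_shots" layout
    used here is an ASSUMED shape, not a confirmed API schema.
    """
    if mode not in ("std", "pro"):
        raise ValueError("mode must be 'std' or 'pro'")
    if not shots:
        raise ValueError("at least one shot is required")
    for _, seconds in shots:
        if seconds <= 0:
            raise ValueError("each shot needs a positive duration")

    return {
        "model": "happyhorse-1.0/video",
        "mode": mode,
        # Total duration is derived, never typed by hand, so it cannot drift
        # from the per-shot values.
        "duration": sum(seconds for _, seconds in shots),
        "aspect_ratio": "16:9",
        "multi_shots": [
            {"prompt": prompt, "duration": seconds} for prompt, seconds in shots
        ],
    }


payload = build_multi_shot_payload(
    [
        ("Wide shot of mountains at sunrise", 3),
        ("Close-up of a hiker smiling at the view", 2),
    ]
)
print(json.dumps(payload, indent=2))
```

Deriving the total from the shot list keeps the two values consistent when you add, remove, or retime a shot.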
What Happy Horse is best for
Optimal prompt pattern
Same director formula. Because audio is native and prompts can run long (2,500 chars), describe the soundscape and (for talking heads) the exact dialogue line and language for the lip-sync engine.
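One way to keep prompts within the limit while covering picture, sound, and dialogue is a small builder like the Python sketch below. The helper name build_prompt and the "Sound:" and "Dialogue (language):" labels are hypothetical conventions chosen for illustration; the model does not require this exact wording. Only the 2,500-character limit comes from the text above.

```python
MAX_PROMPT_CHARS = 2500  # prompt length limit quoted above


def build_prompt(visuals, soundscape, dialogue=None, language=None):
    """Join visuals, soundscape, and an optional dialogue line into one prompt.

    The labels added here are an assumed convention, not a required syntax.
    """
    parts = [visuals.strip(), f"Sound: {soundscape.strip()}"]
    if dialogue:
        if not language:
            # Dialogue without a named language leaves the lip-sync target ambiguous.
            raise ValueError("name the language when a dialogue line is given")
        parts.append(f'Dialogue ({language}): "{dialogue.strip()}"')

    prompt = " ".join(parts)
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(
            f"prompt is {len(prompt)} chars; the limit is {MAX_PROMPT_CHARS}"
        )
    return prompt


prompt = build_prompt(
    visuals="Medium close-up of a barista behind a sunlit counter, slow push-in.",
    soundscape="espresso machine hiss, quiet cafe chatter, soft jazz.",
    dialogue="Your usual, right?",
    language="English",
)
print(prompt)
```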
Pro tips
- Draft on the 720p tier (~half cost), finalize at 1080p.
- Use std for iteration, pro for the keeper.
- For dialogue, name the language and keep lines short for the cleanest lip-sync.
- Multi-shot: when using multi_shots, make per-shot durations sum to your total.
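The saving from drafting at 720p can be estimated with simple arithmetic. The Python sketch below works in relative units, with one 1080p clip costing 1.0, and assumes the 720p tier costs exactly half. The text only says roughly half, and real credit costs also depend on mode and audio, so treat the result as a rough guide rather than a price quote.

```python
def relative_cost(drafts_720p, finals_1080p, draft_ratio=0.5):
    """Cost in units of one 1080p clip; draft_ratio is an assumed 720p price ratio."""
    return drafts_720p * draft_ratio + finals_1080p * 1.0


# Six generations in total: five drafts and one keeper.
all_1080p = relative_cost(0, 6)
tiered = relative_cost(5, 1)
savings = 1 - tiered / all_1080p

# prints: tiered: 3.5, all 1080p: 6.0, saved: 42%
print(f"tiered: {tiered}, all 1080p: {all_1080p}, saved: {savings:.0%}")
```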
Content policy
Happy Horse is geared toward mainstream/commercial use; hosted APIs apply standard filters and permissiveness varies by provider. Open-source availability makes self-hosting possible, but mature-content tooling lags far behind the Wan ecosystem. For adult work, prefer Seedance 2.0 or Wan 2.7; reach for Happy Horse when you want the best-looking SFW clip fast, especially talking characters.