The Gemini Omni API turns a prompt, a single image, or three reference images into a 4 to 10 second clip at 720p, 1080p, or 4K. One endpoint covers text-to-video, image-to-video, and three-image fusion — Google's newest video model, billed per generation.
Prompt: ≤ 2000 chars · required
Resolution: default 720p
Aspect ratio: 16:9 or 9:16 · default 16:9
Duration (seconds): default 6 · ignored in reference-to-video mode
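The limits and defaults listed above can be checked client-side before a request is sent. The following is a minimal sketch, not the service's own validation: the `ClipSettings` class and its field names (`prompt`, `resolution`, `aspect_ratio`, `duration`) are hypothetical names chosen for illustration, and the 4 to 10 second range is taken from the clip lengths described on this page.

```python
from dataclasses import dataclass

MAX_PROMPT_CHARS = 2000
RESOLUTIONS = ("720p", "1080p", "4K")
ASPECT_RATIOS = ("16:9", "9:16")


@dataclass(frozen=True)
class ClipSettings:
    """Hypothetical container for the playground fields; names are assumptions."""

    prompt: str
    resolution: str = "720p"    # default stated on this page
    aspect_ratio: str = "16:9"  # default stated on this page
    duration: int = 6           # default stated on this page; ignored in reference-to-video mode

    def __post_init__(self):
        # The prompt is required and capped at 2000 characters.
        if not self.prompt:
            raise ValueError("prompt is required")
        if len(self.prompt) > MAX_PROMPT_CHARS:
            raise ValueError(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
        if self.resolution not in RESOLUTIONS:
            raise ValueError(f"resolution must be one of {RESOLUTIONS}")
        if self.aspect_ratio not in ASPECT_RATIOS:
            raise ValueError(f"aspect ratio must be one of {ASPECT_RATIOS}")
        # Clips are described as 4 to 10 seconds long.
        if not 4 <= self.duration <= 10:
            raise ValueError("duration must be between 4 and 10 seconds")
```

Calling `ClipSettings(prompt="a fox running through snow")` yields the documented defaults, while an over-long prompt or an unknown resolution raises `ValueError` before any credits are spent.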
Real-world workflows and production use cases you can build and ship with this model.
Pass one reference image and a motion prompt. The Gemini Omni API returns a 4 to 10 second clip from the same endpoint as your text-to-video calls — no model swap, no extra integration. Send a 1080p or 4K request when you want the result production-ready.
Send three reference images alongside a prompt and the Gemini Omni API combines scene, character, and product into a single motion shot. Skip the storyboard, the masking, and the multi-pass compositing: three-image fusion is the most differentiated mode on the Gemini Omni API and ships from the same /api/v1/videos/generations endpoint as text-to-video.
Describe the scene, pick 4K, and the Gemini Omni API returns a clip at the highest fidelity tier — useful for hero shots, social ads, and landing-page video. Audio is omitted in the reapi surface, so the result drops cleanly into any downstream editor.
Credit-based — 1 credit = $0.001 USD. Pay only for completed generations.
| Category | Duration | Unit | Price (USD) | Credits |
|---|---|---|---|---|
| 720p | 4 seconds | 1 generation | $0.495 | 495 |
| 720p | 6 seconds | 1 generation | $0.66 | 660 |
| 720p | 8 seconds | 1 generation | $0.825 | 825 |
| 720p | 10 seconds | 1 generation | $0.99 | 990 |
| 1080p | 4 seconds | 1 generation | $0.495 | 495 |
| 1080p | 6 seconds | 1 generation | $0.66 | 660 |
| 1080p | 8 seconds | 1 generation | $0.825 | 825 |
| 1080p | 10 seconds | 1 generation | $0.99 | 990 |
| 4K | 4 seconds | 1 generation | $1.155 | 1155 |
| 4K | 6 seconds | 1 generation | $1.32 | 1320 |
| 4K | 8 seconds | 1 generation | $1.485 | 1485 |
| 4K | 10 seconds | 1 generation | $1.65 | 1650 |
| Reference 720p | Per generation (flat) | 1 generation | $1.32 | 1320 |
| Reference 1080p | Per generation (flat) | 1 generation | $1.32 | 1320 |
| Reference 4K | Per generation (flat) | 1 generation | $1.98 | 1980 |
The Gemini Omni API picks its mode from the number of image_urls you send. Zero gives you text-to-video, one gives image-to-video, and three gives three-image fusion, all on the same /api/v1/videos/generations call, with the same authentication and the same task polling pattern. Two images are not supported; the Gemini Omni API rejects that combination at the gateway with an HTTP 400 error.
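The count-based rule can be mirrored on the client so an unsupported request fails before it reaches the gateway. This is a sketch under assumptions: `select_mode` is a hypothetical helper, the returned strings are labels for illustration rather than values the API returns, and rejecting every count other than 0, 1, or 3 is a conservative choice (the page only states that two images are rejected).

```python
def select_mode(image_urls):
    """Return the generation mode implied by how many image URLs are supplied."""
    modes = {
        0: "text-to-video",
        1: "image-to-video",
        3: "three-image fusion",
    }
    count = len(image_urls)
    if count not in modes:
        # The page documents a 400 for two images; other counts are
        # treated as unsupported here as an assumption.
        raise ValueError(f"unsupported number of image_urls: {count}")
    return modes[count]
```

For example, `select_mode([])` resolves to text-to-video, and a two-element list raises `ValueError` locally instead of costing a round trip.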
The Gemini Omni API charges per generation rather than metering per second: each request has a fixed price set by its resolution and clip length. 720p and 1080p share the same rate; only 4K is uplifted. See current per-tier rates in the pricing table on this page. Failed Gemini Omni API jobs refund automatically, so your worker never pays for a result you didn't get.
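To budget a batch of jobs, the pricing table can be turned into a small lookup. The sketch below copies the credit values from the table on this page and converts them at 1 credit = $0.001; the function names are hypothetical, and it assumes the "Reference" rows apply to reference-to-video requests, where duration is ignored. Rates may change, so treat the numbers as a snapshot.

```python
CREDITS_PER_USD = 1000  # 1 credit = $0.001 USD

# Credits per generation for prompt-driven clips, keyed by resolution then seconds.
STANDARD_CREDITS = {
    "720p": {4: 495, 6: 660, 8: 825, 10: 990},
    "1080p": {4: 495, 6: 660, 8: 825, 10: 990},
    "4K": {4: 1155, 6: 1320, 8: 1485, 10: 1650},
}

# Flat credits per generation for the "Reference" rows (assumed: reference-to-video).
REFERENCE_CREDITS = {"720p": 1320, "1080p": 1320, "4K": 1980}


def credits_for(resolution, duration=None, reference=False):
    """Look up the credit cost of one generation from the table snapshot."""
    if reference:
        return REFERENCE_CREDITS[resolution]
    return STANDARD_CREDITS[resolution][duration]


def usd_for(credits):
    """Convert credits to US dollars."""
    return credits / CREDITS_PER_USD
```

A batch estimate is then a sum over planned jobs, for instance `sum(credits_for("1080p", 8) for _ in range(20))` for twenty 8-second 1080p clips.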
Skip the Google Cloud onboarding, billing setup, and service-account dance. Sign up for reapi, grab a key, and you can call the Gemini Omni API in under a minute. Same model, same outputs — fewer hoops to ship.
Sign up and grab a key from the dashboard. Free credits cover your first Gemini Omni API calls — no card required.
POST to /api/v1/videos/generations with model = gemini-omni. The Gemini Omni API returns a task ID immediately so your worker can move on.
GET /api/v1/tasks/:id until status is completed. Download the Gemini Omni API output and ship it.
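The submit-then-poll flow might look like the following in Python, using only the standard library. Several details are assumptions not stated on this page: the `BASE_URL` placeholder, the Bearer authorization header, the `prompt` request field, the `id` field in the submit response, and the `failed` status string. Only the two paths, `model = gemini-omni`, `image_urls`, and the `completed` status come from the text above.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical placeholder, not the real host


def http_json(method, path, api_key, body=None):
    """Send a JSON request and decode the JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def generate_clip(api_key, prompt, image_urls=(), send=http_json,
                  poll_seconds=5.0, max_polls=120):
    """Submit a generation, then poll the task until it completes."""
    task = send("POST", "/api/v1/videos/generations", api_key, {
        "model": "gemini-omni",
        "prompt": prompt,                # assumed field name
        "image_urls": list(image_urls),
    })
    task_id = task["id"]                 # assumed response field
    for _ in range(max_polls):
        result = send("GET", f"/api/v1/tasks/{task_id}", api_key)
        if result.get("status") == "completed":
            return result
        if result.get("status") == "failed":  # assumed failure status
            raise RuntimeError(f"task {task_id} failed")
        time.sleep(poll_seconds)
    raise TimeoutError(f"task {task_id} did not complete after {max_polls} polls")
```

The `send` parameter exists so the transport can be swapped for a stub in tests or for a different HTTP client in production; the polling interval and poll cap are arbitrary starting points to tune against real job durations.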
Common questions about this model.
Explore more models in the same category.
Veo 3.1 in five channels — audio, 4K, and 15-second remix in one API.
ByteDance
Text/image/audio-to-video — 4 variants, per-second pricing.
Alibaba Cloud Bailian
Text, image, reference video, and video edit — one Happy Horse 1.0 API call.
Try it in the playground or grab an API key to integrate now.