reAPI Docs

gemini-omni

Gemini Omni is Google's any-to-any video model, exposed through reAPI as a single async endpoint. Mode is implicit: the counts of image_urls / video_urls pick text-to-video, image-to-video, three-image fusion, or reference-to-video. Outputs run 4 to 10 seconds at 720p, 1080p, or 4K. Pricing is per generation in every mode, never per second; see the model page.

Quick example

curl

curl https://reapi.ai/api/v1/videos/generations \
  -H "Authorization: Bearer rk_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-omni",
    "prompt": "A kitten playing piano, slow camera push-in",
    "duration": 6,
    "resolution": "1080p",
    "aspect_ratio": "16:9"
  }'

Python

import requests

resp = requests.post(
    "https://reapi.ai/api/v1/videos/generations",
    headers={
        "Authorization": "Bearer rk_live_xxx",
        "Content-Type": "application/json",
    },
    json={
        "model": "gemini-omni",
        "prompt": "A kitten playing piano, slow camera push-in",
        "duration": 6,
        "resolution": "1080p",
        "aspect_ratio": "16:9",
    },
    timeout=30,
)
print(resp.json())

JavaScript

const r = await fetch("https://reapi.ai/api/v1/videos/generations", {
  method: "POST",
  headers: {
    Authorization: "Bearer rk_live_xxx",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "gemini-omni",
    prompt: "A kitten playing piano, slow camera push-in",
    duration: 6,
    resolution: "1080p",
    aspect_ratio: "16:9",
  }),
});
console.log(await r.json());

Go

package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
)

func main() {
    body, _ := json.Marshal(map[string]any{
        "model":        "gemini-omni",
        "prompt":       "A kitten playing piano, slow camera push-in",
        "duration":     6,
        "resolution":   "1080p",
        "aspect_ratio": "16:9",
    })
    req, _ := http.NewRequest("POST",
        "https://reapi.ai/api/v1/videos/generations", bytes.NewReader(body))
    req.Header.Set("Authorization", "Bearer rk_live_xxx")
    req.Header.Set("Content-Type", "application/json")

    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        panic(err) // resp is nil on a transport error, so stop before touching it
    }
    defer resp.Body.Close()
    out, _ := io.ReadAll(resp.Body)
    fmt.Println(string(out))
}

Submit response

{
  "id": "task_018f5a3a1b6e7d9f8c2b4d6e8f0a2c4e",
  "model": "gemini-omni",
  "status": "processing",
  "created_at": 1735000000
}

Poll GET /api/v1/tasks/{id} (see the Tasks reference) until status is "completed" or "failed". On completion, output.video_urls holds the generated MP4 URL, valid for 7 days.


Authentication

Every call needs a Bearer token. Generate keys at reapi.ai/settings/apikeys.

Authorization: Bearer YOUR_API_KEY

Keys carry the active workspace's billing scope — there is no separate project header.


Endpoint

POST /api/v1/videos/generations
GET  /api/v1/tasks/{id}

Submission is async. The POST returns immediately with the task's id; GET /api/v1/tasks/{id} returns the same envelope, with status and output filled in once the task finishes. Polling does not consume credits.


Mode routing

gemini-omni picks its mode from the counts of image_urls and video_urls you send — there is no mode parameter:

image_urls | video_urls | Mode | What it does
--- | --- | --- | ---
0 | 0 | Text-to-video | Generate from a prompt.
1 | 0 | Image-to-video | Animate from a single starting frame.
3 | 0 | Three-image fusion | Combine three references into one motion shot.
0 / 1 | 1 | Reference-to-video | Reference a source clip (≤ 30s) for motion / style. Flat per generation.

Unsupported counts.

  • image_urls = 2 is rejected with 400 "image_urls cardinality 2 is not supported". Submit 0, 1, or 3; there is no first/last-frame mode on gemini-omni.
  • image_urls = 4 or more is also rejected.
  • video_urls = 2 or more is rejected with 400 "video_urls accepts at most 1 entry".
  • duration and video_urls cannot both be set. Reference-to-video mode reads the source clip's length; passing duration alongside is rejected with 400 "duration and video_urls cannot both be set".
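
The routing table and the unsupported-count rules above can be mirrored client-side before a request is sent. The following is a minimal sketch, not reAPI code: the function name `infer_mode` and its return strings are hypothetical, and the refusal of three images combined with a reference video is an assumption, since the table only lists 0 or 1 images for that mode.

```python
def infer_mode(image_urls=None, video_urls=None, duration=None):
    """Return the mode the counts would select, or raise ValueError.

    Illustrative helper only; the server remains the source of truth.
    """
    n_img = len(image_urls or [])
    n_vid = len(video_urls or [])

    if n_vid > 1:
        raise ValueError("video_urls accepts at most 1 entry")
    if n_img == 2 or n_img > 3:
        raise ValueError(f"image_urls count {n_img} is not supported; send 0, 1, or 3")

    if n_vid == 1:
        if duration is not None:
            raise ValueError("duration and video_urls cannot both be set")
        if n_img == 3:
            # Assumption: the routing table lists only 0 or 1 images for
            # reference-to-video, so this sketch refuses rather than guesses.
            raise ValueError("3 images plus a reference video is not documented")
        return "reference-to-video"

    # Only 0, 1, or 3 images can reach this point.
    return {0: "text-to-video", 1: "image-to-video", 3: "three-image fusion"}[n_img]
```

Calling `infer_mode(image_urls=["https://your-cdn.com/first_frame.jpg"])` returns `"image-to-video"`, while two images raise before any request leaves the client.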

Request body

model — required

Type: string. Must be "gemini-omni".

prompt — string, required

Up to 2,000 characters. Required in every mode (text-to-video, image-to-video, three-image fusion, reference-to-video). Empty / whitespace-only prompts are treated as missing.

Failure modes.

  • Empty / missing → 400 "prompt is required" (code 20002).
  • Longer than 2,000 chars → 400 "prompt exceeds 2000 characters (got N)" (code 20007).

duration — integer, default 6

One of 4, 6, 8, 10 seconds. Other values are rejected with 400 "duration must be 4, 6, 8, or 10 seconds, got N".

Reference-to-video mode does not accept duration. When video_urls is set, the vendor reads the source clip's length to drive the output, so duration must be omitted. Passing both is rejected with 400 "duration and video_urls cannot both be set".

resolution — string, default "720p"

720p / 1080p / 4k. Lowercase is canonical; uppercase forms ("4K") are accepted and normalized. Drives the per-generation rate — 720p and 1080p share the same price; only 4K is uplifted.
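
The prompt, duration, and resolution rules described so far can be checked locally to save a round trip. This is a sketch under stated assumptions: `check_basic_fields` is a hypothetical helper, and it assumes the 2,000-character limit counts Python string characters (code points), which the server may count differently.

```python
ALLOWED_DURATIONS = (4, 6, 8, 10)
ALLOWED_RESOLUTIONS = ("720p", "1080p", "4k")
MAX_PROMPT_CHARS = 2000


def check_basic_fields(prompt, duration=None, resolution="720p"):
    """Validate prompt, duration, and resolution; return the normalized resolution.

    duration=None means "omitted", which is required in reference-to-video mode.
    """
    if prompt is None or not prompt.strip():
        raise ValueError("prompt is required")
    # Assumption: length is measured in code points, as len() does for str.
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(
            f"prompt exceeds {MAX_PROMPT_CHARS} characters (got {len(prompt)})"
        )
    if duration is not None and duration not in ALLOWED_DURATIONS:
        raise ValueError(f"duration must be 4, 6, 8, or 10 seconds, got {duration}")
    normalized = resolution.lower()  # "4K" becomes the canonical "4k"
    if normalized not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"invalid resolution {resolution!r}")
    return normalized
```

Sending the returned lowercase value keeps request logs consistent even though the server also accepts uppercase forms.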

aspect_ratio — string, default "16:9"

Output framing. One of:

Value | Shape
--- | ---
16:9 | Landscape
9:16 | Portrait

Unknown ratios are rejected with 400 "invalid aspect_ratio".

size — string, alias for aspect_ratio

size carries the same value as aspect_ratio; the supplier doc lists it as a separate field. If both are sent, they must match; otherwise the request is rejected with 400 "aspect_ratio and size disagree". The reAPI playground does not surface size; the JSON body still accepts it for parity.
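
The aspect_ratio / size relationship can be resolved to a single value before building the request body. A small sketch, assuming a hypothetical helper named `resolve_aspect_ratio`; the error strings only imitate the illustrative server messages.

```python
ALLOWED_RATIOS = ("16:9", "9:16")


def resolve_aspect_ratio(aspect_ratio=None, size=None):
    """Return the one framing value to send, defaulting to "16:9"."""
    if aspect_ratio is not None and size is not None and aspect_ratio != size:
        raise ValueError(f'aspect_ratio "{aspect_ratio}" and size "{size}" disagree')
    if aspect_ratio is not None:
        value = aspect_ratio
    elif size is not None:
        value = size          # size is an alias, so it stands in when used alone
    else:
        value = "16:9"        # documented default
    if value not in ALLOWED_RATIOS:
        raise ValueError(f'invalid aspect_ratio "{value}" (allowed: 16:9 / 9:16)')
    return value
```

Sending only the resolved value as aspect_ratio sidesteps the mismatch error entirely.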

image_urls — string[]

Array of public HTTP(S) URLs. Allowed counts: 0, 1, or 3.

  • 0 entries — text-to-video.
  • 1 entry — image-to-video; the image is treated as the starting frame.
  • 3 entries — three-image fusion. The model combines all three references into one motion shot.

video_urls — string[]

Array of public HTTP(S) URLs. Allowed counts: 0 or 1.

  • 0 entries: non-reference modes (text / image / fusion).
  • 1 entry: reference-to-video. The source clip drives the output. It must be ≤ 30 seconds (longer clips are rejected with 400), and at most its first 10 seconds are used as the reference. Billed at a flat per-generation rate (not per second). duration must be omitted in this mode.

No data: URIs. reAPI rejects base64 inputs platform-wide — every URL field on this endpoint must be a public HTTP(S) URL. Upload to your own object storage (S3, R2, OSS, …) and pass the URL.
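
A quick scheme check catches data: URIs and local paths before submission. This is a sketch with a hypothetical helper, `check_media_urls`; it only inspects the URL's shape and cannot confirm that the URL is publicly reachable.

```python
from urllib.parse import urlparse


def check_media_urls(urls, field="image_urls"):
    """Raise ValueError unless every entry is an http(s) URL with a host."""
    for url in urls:
        parsed = urlparse(url)
        # urlparse lowercases the scheme, so "HTTPS://..." also passes.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            # Truncate the echo so a large base64 payload is not dumped into logs.
            raise ValueError(
                f"{field} entries must be public http(s) URLs, got {url[:40]!r}"
            )
    return list(urls)
```

The same function works for video_urls by passing `field="video_urls"`.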


Response envelope

Submit and poll share the same shape — only status and output fill in over time.

{
  "id": "task_018f5a3a1b6e7d9f8c2b4d6e8f0a2c4e",
  "model": "gemini-omni",
  "status": "completed",
  "created_at": 1735000000,
  "output": {
    "video_urls": ["https://cdn.reapi.ai/media/tasks/018f5a3a1b6e7d9f8c2b4d6e8f0a2c4e/0.mp4"]
  },
  "error": null
}

Field | Type | Notes
--- | --- | ---
id | string | Task identifier; keep it for polling and audit
model | string | Always "gemini-omni" (echo of the submitted model)
status | string | processing / completed / failed
created_at | integer | Submission unix timestamp
output | object or null | null until completion. output.video_urls holds MP4s
error | object or null | Populated on failed: { code, message }

output.video_urls URLs are valid for 7 days. Re-host to your own storage if you need them longer.


Validation errors

All cases below return HTTP 400 with code 20003 unless noted. Pattern-match on code, not message — message strings carry request-specific context (field names, observed values, etc.) and are not a stable contract.

Trigger | Code | Message (illustrative)
--- | --- | ---
prompt missing or blank | 20002 | gemini-omni: prompt is required
prompt longer than 2,000 chars | 20007 | gemini-omni: prompt exceeds 2000 characters (got N)
image_urls length 2 | 20003 | gemini-omni: image_urls cardinality 2 is not supported
image_urls length > 3 | 20003 | gemini-omni: image_urls accepts at most 3 entries, got N
duration not one of 4/6/8/10 | 20003 | gemini-omni: duration must be 4, 6, 8, or 10 seconds, got N
Unknown resolution | 20003 | gemini-omni: invalid resolution "X" (allowed: 720p / 1080p / 4k)
Unknown aspect_ratio | 20003 | gemini-omni: invalid aspect_ratio "X" (allowed: 16:9 / 9:16)
aspect_ratio and size disagree | 20003 | gemini-omni: aspect_ratio "X" and size "Y" disagree
image_urls carrying a data: URI or non-http(s) | 20003 | gemini-omni: image_urls entries must be public http(s) URLs
video_urls source clip longer than 30 seconds | 20003 | Reference video must be at most 30s (got Ns); trim the clip before submitting.

The full envelope is { "error": { "code", "message", "request_id" } } — see Errors catalog for the wire format and request_id correlation tips.
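
Matching on code rather than message might look like the sketch below. The helper name `describe_error` and the category labels are hypothetical. Because this page does not say whether code arrives as a JSON number or a string, the sketch compares on the string form as a precaution.

```python
CODE_KINDS = {
    "20002": "missing_prompt",
    "20007": "prompt_too_long",
    "20003": "invalid_parameter",
}


def describe_error(body):
    """Reduce an error envelope to a category plus the request_id for support."""
    err = (body or {}).get("error") or {}
    code = err.get("code")
    # Assumption: code may be an int or a str on the wire, so compare as str.
    kind = CODE_KINDS.get(str(code), "other")
    return {"kind": kind, "code": code, "request_id": err.get("request_id")}
```

Logging the returned request_id alongside the category keeps the correlation handle without depending on message text.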


Recipes

Text-to-video — minimum request

{
  "model": "gemini-omni",
  "prompt": "A little girl walking down a sunset coastal road"
}

Text-to-video — full parameters

{
  "model": "gemini-omni",
  "prompt": "A kitten playing piano, slow camera push-in, cinematic warm tones",
  "duration": 8,
  "resolution": "1080p",
  "aspect_ratio": "16:9"
}

Image-to-video — animate a single frame

{
  "model": "gemini-omni",
  "prompt": "Bring the scene to life with a gentle camera dolly forward",
  "image_urls": ["https://your-cdn.com/first_frame.jpg"],
  "duration": 6,
  "resolution": "1080p"
}

Three-image fusion

{
  "model": "gemini-omni",
  "prompt": "Compose a 10-second product spot mixing scene, character, and product",
  "image_urls": [
    "https://your-cdn.com/scene.jpg",
    "https://your-cdn.com/character.jpg",
    "https://your-cdn.com/product.jpg"
  ],
  "duration": 10,
  "resolution": "1080p",
  "aspect_ratio": "9:16"
}

4K hero shot

{
  "model": "gemini-omni",
  "prompt": "A neon city street in the rain, slow camera pan, reflections on the asphalt",
  "duration": 4,
  "resolution": "4k",
  "aspect_ratio": "16:9"
}

Reference-to-video — match a source clip

{
  "model": "gemini-omni",
  "prompt": "the same scene but at night with neon lights",
  "resolution": "720p",
  "aspect_ratio": "16:9",
  "video_urls": ["https://your-cdn.com/source.mp4"]
}

The output tracks the source clip, which must be ≤ 30s; at most its first 10s are used as the reference. Omit duration. Billing is a flat per-generation rate (see Pricing below).


Choosing a mode

Need | Send
--- | ---
Generate from text | prompt only
Animate a still | prompt + image_urls (1 entry)
Compose scene + character + product | prompt + image_urls (3 entries)
Match an existing clip's motion / length | prompt + video_urls (1 entry, no duration)
Cut spend | Stay at 720p or 1080p (same rate; only 4k costs more) and drop duration to 4
Hero shot | Pick 4k at 4-10s

Polling pattern

The task endpoint behaves the same as for other reAPI tasks; compared with image tasks, the only difference is the completed output shape (video_urls instead of image_urls). A pragmatic schedule:

0–5 minutes:    poll every 5s
5 min – 1 h:    back off gradually toward 1 min
≥ 1 h:          cap at 3 min between polls

A typical task completes in a few minutes. The worker's wall-clock cap is 48 hours, comfortably above any realistic queue.
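
The schedule above could be implemented roughly as follows. This is one reading of it, not reAPI client code: the linear ramp between 5 minutes and 1 hour is an assumption (the text only says "back off gradually"), and `next_delay`, `fetch_task`, and `wait_for_task` are hypothetical names. Only the Python standard library is used.

```python
import json
import time
import urllib.request

BASE_URL = "https://reapi.ai/api/v1"


def next_delay(elapsed_s):
    """Seconds to wait before the next poll, given seconds since submission."""
    if elapsed_s < 300:                       # 0-5 minutes: every 5 s
        return 5.0
    if elapsed_s < 3600:                      # 5 min - 1 h: ramp from 5 s to 60 s
        fraction = (elapsed_s - 300) / (3600 - 300)
        return 5.0 + fraction * 55.0
    return 180.0                              # past 1 h: 3 min between polls


def fetch_task(task_id, api_key):
    """GET /api/v1/tasks/{id} and return the decoded envelope."""
    req = urllib.request.Request(
        f"{BASE_URL}/tasks/{task_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_task(task_id, api_key, fetch=fetch_task, sleep=time.sleep,
                  max_wait_s=48 * 3600):
    """Poll until status is completed or failed; give up after max_wait_s."""
    start = time.monotonic()
    while True:
        task = fetch(task_id, api_key)
        if task["status"] in ("completed", "failed"):
            return task
        elapsed = time.monotonic() - start
        if elapsed > max_wait_s:
            raise TimeoutError(f"task {task_id} still {task['status']} after {elapsed:.0f}s")
        sleep(next_delay(elapsed))
```

Passing `fetch` and `sleep` as parameters keeps the loop testable without network access; in normal use the defaults apply and the caller inspects `task["output"]` or `task["error"]` on return.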


Pricing

Two billing modes — picked by request shape.

Per generation — text / image / fusion

Charged once per submitted job. 720p and 1080p share the same rate at every duration; only 4K is uplifted. Duration tiers are 4s / 6s / 8s / 10s. See current per-tier rates on the Gemini Omni model page.

Flat per generation — reference-to-video

Triggered when video_urls is set. Charged once per job at a flat rate by resolution, independent of clip length. 720p and 1080p share the same rate; 4K is uplifted. The source clip must be ≤ 30 seconds: the gateway probes its decoded length server-side and rejects longer clips with 400. At most the first 10 seconds are used as the reference. See current rates on the Gemini Omni model page.

Bill formula

1 credit = $0.001. The charge in integer credits is ceil(usd × 1000). Failed jobs are refunded automatically.
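
The credit formula can be reproduced as below. The helper name `usd_to_credits` is hypothetical, and the dollar amounts in the usage note are made-up illustrations, not actual rates. Decimal arithmetic is used so the ceiling is not thrown off by binary floating-point rounding.

```python
from decimal import Decimal, ROUND_CEILING


def usd_to_credits(usd):
    """credits = ceil(usd * 1000), with 1 credit = $0.001."""
    # str() first so a float like 0.25 is read as the decimal the caller typed.
    amount = Decimal(str(usd)) * 1000
    return int(amount.to_integral_value(rounding=ROUND_CEILING))
```

For example, an illustrative price of $0.25 maps to 250 credits, and $0.1234 rounds up to 124 credits.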


Tips

  • Prompt motion, not just scene. "Slow push-in, warm tones, shallow depth of field" outperforms a pure noun-list of what's on screen.
  • Pick 720p first if you're iterating. It's the same per-generation price as 1080p, but renders faster and lets you change your mind on the final tier without re-doing the bill math.
  • Three-image fusion needs cohesive references. Pick three images that share lighting and composition cues — the model fuses them more cleanly than three random shots.
  • Pick 4K only when shipping. A 4K render is roughly 2× the cost of 720p / 1080p; reserve it for the final keeper.
  • Keep reference clips ≤ 30s. Reference-to-video requires a source clip of ≤ 30 seconds (the server probes its length via ffmpeg and rejects anything longer with 400); at most the first 10 seconds are used as the reference. Billing is a flat per-generation rate regardless of length. Omit duration: the server rejects requests that pass both duration and video_urls.

Related

  • Errors catalog
  • Authentication
  • Quickstart
