doubao-seedance-2.0

ByteDance Seedance 2.0 — async video generation. One endpoint, four variants, and implicit mode routing across text, image, first/last-frame, and reference-video / audio inputs.

ByteDance's async video model on reAPI. Four variants share one endpoint and one parameter shape — pick the variant via model. Mode is implicit: which media fields you set (prompt, image_urls, image_with_roles, video_urls, audio_urls) decides whether the request runs as text-to-video, image-to-video, first/last-frame transition, or reference-driven generation. 4–15 second outputs at 480p / 720p / 1080p / 4k. See current pricing on the model page.

Quick example

curl https://reapi.ai/api/v1/videos/generations \
  -H "Authorization: Bearer rk_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "doubao-seedance-2.0-face",
    "prompt": "A kitten yawning at the camera, cinematic warm tones",
    "resolution": "720p",
    "size": "16:9",
    "duration": 5
  }'

import requests

resp = requests.post(
    "https://reapi.ai/api/v1/videos/generations",
    headers={
        "Authorization": "Bearer rk_live_xxx",
        "Content-Type": "application/json",
    },
    json={
        "model": "doubao-seedance-2.0-face",
        "prompt": "A kitten yawning at the camera, cinematic warm tones",
        "resolution": "720p",
        "size": "16:9",
        "duration": 5,
    },
    timeout=30,
)
print(resp.json())

const r = await fetch("https://reapi.ai/api/v1/videos/generations", {
  method: "POST",
  headers: {
    Authorization: "Bearer rk_live_xxx",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "doubao-seedance-2.0-face",
    prompt: "A kitten yawning at the camera, cinematic warm tones",
    resolution: "720p",
    size: "16:9",
    duration: 5,
  }),
});
console.log(await r.json());

package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
)

func main() {
    body, _ := json.Marshal(map[string]any{
        "model":      "doubao-seedance-2.0-face",
        "prompt":     "A kitten yawning at the camera, cinematic warm tones",
        "resolution": "720p",
        "size":       "16:9",
        "duration":   5,
    })
    req, _ := http.NewRequest("POST",
        "https://reapi.ai/api/v1/videos/generations", bytes.NewReader(body))
    req.Header.Set("Authorization", "Bearer rk_live_xxx")
    req.Header.Set("Content-Type", "application/json")

    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        panic(err) // resp is nil on error, so stop before touching resp.Body
    }
    defer resp.Body.Close()
    out, _ := io.ReadAll(resp.Body)
    fmt.Println(string(out))
}

Submit response

{
  "id": "task_018f5a3a1b6e7d9f8c2b4d6e8f0a2c4e",
  "model": "doubao-seedance-2.0-face",
  "status": "processing",
  "created_at": 1735000000
}

Poll GET /api/v1/tasks/{id} (see the Tasks reference) until status is "completed" or "failed". On completion, output.video_urls holds the generated MP4 URL(s), valid for 7 days. output.last_frame_url is present only when the request set return_last_frame: true.

Authentication

Every call needs a Bearer token. Generate keys at reapi.ai/settings/apikeys.

Authorization: Bearer YOUR_API_KEY

Keys carry the active workspace's billing scope — there is no separate project header.

Endpoint

POST /api/v1/videos/generations
GET  /api/v1/tasks/{id}

Submission is async. The POST returns immediately with a task id; GET /api/v1/tasks/{id} returns the same envelope, with status and output filling in as the task progresses. Polling does not consume credits.

Variants

doubao-seedance-2.0 is a family of two variants sharing one parameter shape. Pick via model:

Variant	Speed	1080p / 4k	Real-person uploads
`doubao-seedance-2.0-face`	standard	✅	✅
`doubao-seedance-2.0-fast-face`	faster	❌ (480p / 720p only)	✅

reAPI never silently substitutes one variant for another. Sending resolution: "1080p" to the Fast variant returns 400, never an auto-downgraded clip.

Real-person uploads — both variants accept real-person source images / videos.

Channels

The variants above ship across two channels — same async endpoint, selected by the model id you send:

Channel	Model ids	Notes
Standard	`doubao-seedance-2.0-face`, `doubao-seedance-2.0-fast-face`	Face variants accept real-person inputs.
Official	`doubao-seedance-2.0-official`, `doubao-seedance-2.0-fast-official`	Official direct channel — lower price. Real-person inputs are not accepted.

Standard and Official share the parameter shape documented below.

Mode routing

doubao-seedance-2.0 picks its mode from which media fields you set — there is no mode parameter:

Fields you send	Mode	What it does
`prompt` only	T2V	Generate from text
`prompt` + `image_urls` (1–9)	I2V	Animate / extend from reference images
`prompt` + `image_with_roles` (1–2 frames)	FRAMES	First / last frame transition
`prompt` + `video_urls` and / or `audio_urls` (+ optional `image_urls`)	REF	Reference-driven, optionally multi-modal

Mutex rules. Most combinations of media fields are illegal. Only image_urls, video_urls, and audio_urls may be combined with one another (REF mode), subject to the rules below; any other combination is rejected with 400 (code 20003).

prompt is required on every request (all four modes carry it)
image_urls ⊕ image_with_roles — never together
image_with_roles cannot be combined with video_urls or audio_urls
audio_urls requires image_urls or video_urls
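
The routing table and mutex rules above can be sketched as a client-side pre-check. infer_mode is a hypothetical helper, not part of any reAPI SDK; it encodes only the rules stated on this page, and the server remains the source of truth.

```python
def infer_mode(body: dict) -> str:
    """Return "T2V", "I2V", "FRAMES", or "REF" for a request body.

    Hypothetical helper: raises ValueError in the cases where this page
    says the API answers with HTTP 400.
    """
    if not body.get("prompt"):
        raise ValueError("prompt is required on every request")

    has_images = bool(body.get("image_urls"))
    has_roles = bool(body.get("image_with_roles"))
    has_video = bool(body.get("video_urls"))
    has_audio = bool(body.get("audio_urls"))

    # Mutex rules from the list above.
    if has_images and has_roles:
        raise ValueError("image_urls and image_with_roles cannot be used together")
    if has_roles and (has_video or has_audio):
        raise ValueError("image_with_roles cannot be combined with video_urls or audio_urls")
    if has_audio and not (has_images or has_video):
        raise ValueError("audio_urls requires image_urls or video_urls")

    # Routing table: any video or audio reference means REF.
    if has_video or has_audio:
        return "REF"
    if has_roles:
        return "FRAMES"
    if has_images:
        return "I2V"
    return "T2V"
```

Running this before submitting catches the illegal combinations locally, without a round trip.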

Request body

`model` — required

string. One of the four model ids listed in the Channels table above.

`prompt` — string, required

Required on every request, min 3 characters, up to 4,000 (≤ 500 recommended — quality drops past ~500 chars on the upstream model). Applies to all modes — T2V, I2V, FRAMES, REF.

Best results come from naming, in order, the subject, the action, the camera move, and the style. e.g. "A kitten, yawning into the camera, slow push-in, cinematic warm tones".

Failure modes.

Missing / empty → 400 (code 20002).
Shorter than 3 chars → 400 (code 20003).
Longer than 4,000 chars → 400 (code 20003).

`duration` — integer, default `5`

Output length in seconds. Any integer in [4, 15]. Out-of-range → 400.

Billable seconds = sum(video_urls clip lengths) + duration. The input reference clips and the generated output both contribute to cost. Image and audio references don't carry a billable time component — only video_urls adds. reAPI probes video_urls server-side via ffmpeg metadata; the value reported by your client is never trusted for billing. See Pricing for the full formula.

`size` — string, default `"adaptive"`

Output ratio. One of:

Value	Shape
`16:9`	Landscape
`9:16`	Portrait
`1:1`	Square
`4:3`	Traditional landscape
`3:4`	Traditional portrait
`21:9`	Cinematic ultrawide
`adaptive`	Match the input image / video's ratio

Invalid values → 400 (no silent fallback).

`resolution` — string, default `"720p"`

480p / 720p / 1080p / 4k — lowercase only. Drives pricing. Uppercase forms like 1080P are rejected with 400.

1080p and 4k are variant-gated. Only doubao-seedance-2.0 and doubao-seedance-2.0-face accept 1080p / 4k. The Fast variants (-fast, -fast-face) cap at 720p — sending a higher resolution returns 400 resolution=<value> is not supported by <variant> (code 20003), no auto-downgrade.
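
A minimal sketch of the resolution gate as a local check. It assumes that a Fast variant can be recognised by "-fast" appearing in the model id, which is an inference from the ids on this page rather than a documented rule; resolution_allowed is a hypothetical name.

```python
ALL_RESOLUTIONS = {"480p", "720p", "1080p", "4k"}  # lowercase only
FAST_RESOLUTIONS = {"480p", "720p"}                # Fast variants cap at 720p


def resolution_allowed(model: str, resolution: str) -> bool:
    """Return True when this page says the model accepts the resolution."""
    if resolution not in ALL_RESOLUTIONS:
        return False  # covers unknown values and uppercase forms like "1080P"
    is_fast = "-fast" in model  # assumption: naming convention marks Fast variants
    return resolution in FAST_RESOLUTIONS if is_fast else True
```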

`generate_audio` — boolean, default `true`

When true (the default), the model synthesizes an audio track that plays alongside the generated video; pass false to get a silent clip. Independent of audio_urls (which is a reference for the model to align with — not a synthesis toggle).

`return_last_frame` — boolean, default `false`

When true, the completed task carries an extra output.last_frame_url holding the final frame as a still image. Pass it as image_urls of the next request to chain continuous video without prompt drift.

`tools` — object[]

Per-tool capability list. Today only one type is recognized:

"tools": [{ "type": "web_search" }]

web_search lets the model query the web during generation — useful for current events or named brands. Unknown type values are rejected with 400 tools[i].type must be "web_search".

`nsfw_checker` — boolean, default `true`

Safety checking is enabled by default. Direct API callers can pass "nsfw_checker": false on the Standard model ids in this page:

{
  "model": "doubao-seedance-2.0-face",
  "prompt": "Your prompt",
  "resolution": "720p",
  "size": "16:9",
  "duration": 5,
  "nsfw_checker": false
}

When set to false, reAPI routes the task through the Flexible channel if that channel is available and compatible with the request. If no compatible Flexible channel is available, reAPI silently uses the selected Standard channel instead, and Standard channel generation remains safety-checked. In both cases, no fallback is attached to the request.

`image_urls` — string[]

Array of public HTTP(S) URLs. Up to 9 entries. Triggers I2V (when sent without other media fields) or augments REF (when combined with video_urls / audio_urls).

Mutually exclusive with image_with_roles. Sending more than 9 is rejected with 400 at most 9 image_urls allowed.

No data: URIs. reAPI rejects base64 inputs platform-wide — every URL field on this endpoint must be a public HTTP(S) URL. Upload to your own object storage (S3, R2, OSS, …) and pass the URL.

`image_with_roles` — object[]

First / last frame interpolation. Each entry is a {url, role} object:

"image_with_roles": [
  { "url": "https://your-cdn.com/day.jpg",   "role": "first_frame" },
  { "url": "https://your-cdn.com/night.jpg", "role": "last_frame"  }
]

role is one of first_frame / last_frame. Up to 9 entries (typical use: 1 or 2 — one first frame and one last frame).

Cannot be combined with image_urls, video_urls, or audio_urls — these modes are exclusive.

`video_urls` — string[]

Reference video clips for REF mode. Up to 3 entries; each clip 2–15 s long, combined ≤ 15 s. Each clip's frame must be 300–6000 px on each side, 0.41–8.3 MP total, aspect ratio 0.4–2.5. Public HTTP(S) URLs only.

No real people on standard / fast variants. Use the Face variants (-face, -fast-face) when the reference clip features identifiable real people — the non-Face variants reject them upstream.

reAPI probes each clip's resolution and duration server-side via ffmpeg metadata. Out-of-spec assets surface as a 400:

Frame outside 300–6000 px/side, 0.41–8.3 MP, or aspect 0.4–2.5 → 400 video_urls[i] resolution WxH is out of range (code 20003)
Each clip outside 2–15s, or combined > 15s → 400 video_urls total duration X.XXs exceeds the 15s limit (code 20003)
Probe failure (network / format) → 400 Could not determine source video duration for billing (code 30002) — no charge

Mutually exclusive with image_with_roles.
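
If you already know a clip's dimensions and length, the limits above can be checked before upload. This is a sketch under assumptions: megapixels are taken as width × height / 1,000,000, aspect ratio as width / height, and the bounds are treated as inclusive. The server's own probe may round differently, so treat a pass here as a hint, not a guarantee. Both function names are hypothetical.

```python
def clip_frame_in_range(width: int, height: int) -> bool:
    """Check the 300-6000 px/side, 0.41-8.3 MP, aspect 0.4-2.5 limits."""
    if width <= 0 or height <= 0:
        return False
    megapixels = width * height / 1_000_000
    aspect = width / height
    return (
        300 <= width <= 6000
        and 300 <= height <= 6000
        and 0.41 <= megapixels <= 8.3
        and 0.4 <= aspect <= 2.5
    )


def clip_durations_ok(durations: list) -> bool:
    """Check: at most 3 clips, each 2-15 s, combined length at most 15 s."""
    return (
        1 <= len(durations) <= 3
        and all(2 <= d <= 15 for d in durations)
        and sum(durations) <= 15
    )
```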

`audio_urls` — string[]

Reference audio for REF mode. Up to 3 entries; combined duration ≤ 15 seconds. Public HTTP(S) URLs only.

Must accompany image_urls OR video_urls — a request with audio_urls and no visual reference is rejected with 400 audio_urls must be used together with image_urls or video_urls.

Mutually exclusive with image_with_roles.

Response envelope

Submit and poll share the same shape — only status and output fill in over time.

{
  "id": "task_018f5a3a1b6e7d9f8c2b4d6e8f0a2c4e",
  "model": "doubao-seedance-2.0-face",
  "status": "completed",
  "created_at": 1735000000,
  "output": {
    "video_urls": ["https://cdn.reapi.ai/media/tasks/018f5a3a1b6e7d9f8c2b4d6e8f0a2c4e/0.mp4"],
    "last_frame_url": "https://cdn.reapi.ai/media/tasks/018f5a3a1b6e7d9f8c2b4d6e8f0a2c4e/0.png"
  },
  "error": null
}

Field	Type	Notes
`id`	string	Task identifier — keep it for polling and audit
`model`	string	Echo of the submitted `model` (the variant you picked)
`status`	string	`processing` / `completed` / `failed`
`created_at`	integer	Submission unix timestamp
`output`	object \| null	`null` until completion
`output.video_urls`	string[]	Generated MP4 URL(s) — valid for 7 days
`output.last_frame_url`	string \| null	Present only when the request set `return_last_frame: true`
`error`	object \| null	Populated on `failed` — `{ code, message }`

Validation errors

All cases below return HTTP 400 with the noted code. Pattern-match on code, not message — message strings carry request-specific context (field names, observed values) and are not a stable contract.

Trigger	Code	Message
Missing `prompt`	`20002`	`prompt: Invalid input: expected string, received undefined`
`prompt` shorter than 3 chars	`20003`	`prompt: Too small: expected string to have >=3 characters`
`prompt` longer than 4,000 chars	`20003`	`prompt: Too big: expected string to have <=4000 characters`
`image_urls` and `image_with_roles` together	`20003`	`image_urls and image_with_roles cannot be used simultaneously`
`image_with_roles` + `video_urls` or `audio_urls`	`20003`	`image_with_roles cannot be combined with video_urls or audio_urls`
`audio_urls` without visual reference	`20003`	`audio_urls must be used together with image_urls or video_urls`
`image_urls` > 9	`20003`	`at most 9 image_urls allowed, got N`
`image_with_roles` > 9	`20003`	`at most 9 image_with_roles allowed, got N`
`image_with_roles[i].role` invalid	`20003`	`image_with_roles[i].role must be first_frame or last_frame, got "..."`
`video_urls` > 3	`20003`	`at most 3 video_urls allowed, got N`
`video_urls` clip frame out of range (300–6000px / 0.41–8.3MP / aspect 0.4–2.5)	`20003`	`video_urls[i] resolution WxH is out of range`
`video_urls` combined > 15s	`20003`	`video_urls total duration X.XXs exceeds the 15s limit`
`audio_urls` > 3	`20003`	`at most 3 audio_urls allowed, got N`
`audio_urls` combined > 15s	`20003`	`audio_urls total duration X.XXs exceeds the 15s limit`
`duration` outside 4–15	`20003`	`duration must be 4-15 seconds, got N`
Invalid `size` value	`20005`	`invalid size "..." (allowed: 16:9 / 9:16 / 1:1 / 4:3 / 3:4 / 21:9 / adaptive)`
Invalid `resolution` value	`20003`	`invalid resolution "..." (allowed: 480p / 720p / 1080p / 4k)`
`1080p` on Fast variant	`20003`	`resolution=1080p is not supported by <variant> (use doubao-seedance-2.0 or doubao-seedance-2.0-face)`
`tools[i].type` not `web_search`	`20003`	`tools[i].type must be "web_search", got "..."`
Any URL field carrying a `data:` URI	`20003`	`<field> entries must be public URLs; base64 data URIs are not supported`
Reference video probe fails	`30002`	`Could not determine source video duration for billing: ...`

The full envelope is { "error": { "code", "message", "request_id" } } — see Errors catalog for wire format and request_id correlation.
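
Matching on code rather than message might look like the sketch below. The label strings are illustrative names chosen for this example; only the numeric codes come from the table above. Whether code arrives as a number or a string is not stated here, so the helper normalises it to a string.

```python
# Labels are illustrative; the codes are the ones listed in the table above.
CODE_LABELS = {
    "20002": "missing_field",
    "20003": "invalid_parameter",
    "20005": "invalid_size",
    "30002": "probe_failed",
}


def describe_error(payload: dict) -> tuple:
    """Return (label, request_id) for an error envelope.

    Hypothetical helper: unknown or missing codes map to "unknown".
    """
    err = payload.get("error") or {}
    code = str(err.get("code"))  # normalise int or str codes
    return CODE_LABELS.get(code, "unknown"), err.get("request_id")
```

Keeping request_id alongside the label makes it easy to quote when contacting support.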

Recipes

T2V — text-to-video

{
  "model": "doubao-seedance-2.0-face",
  "prompt": "A kitten yawning at the camera, slow push-in, warm tones",
  "resolution": "720p",
  "size": "16:9",
  "duration": 5,
  "fallback": { "enabled": false }
}

I2V — single reference image

{
  "model": "doubao-seedance-2.0-face",
  "prompt": "The kitten stands up and walks toward the camera",
  "image_urls": ["https://your-cdn.com/cat.jpg"],
  "duration": 5
}

FRAMES — first / last frame transition

{
  "model": "doubao-seedance-2.0-face",
  "prompt": "Smooth transition from day to night",
  "image_with_roles": [
    { "url": "https://your-cdn.com/day.jpg",   "role": "first_frame" },
    { "url": "https://your-cdn.com/night.jpg", "role": "last_frame"  }
  ],
  "duration": 5
}

REF — reference video (style transfer)

{
  "model": "doubao-seedance-2.0-face",
  "prompt": "Restylize the reference clip into anime aesthetics",
  "video_urls": ["https://your-cdn.com/reference.mp4"]
}

REF — reference video + reference audio

{
  "model": "doubao-seedance-2.0-face",
  "prompt": "A scene of a person speaking",
  "video_urls": ["https://your-cdn.com/reference.mp4"],
  "audio_urls": ["https://your-cdn.com/speech.wav"],
  "size": "16:9",
  "duration": 11
}

Voiced video (synthesized audio)

{
  "model": "doubao-seedance-2.0-face",
  "prompt": "A man calls out to a woman: \"Remember — never point at the moon with your finger.\"",
  "generate_audio": true
}

Continuous video chain

Step 1 — produce a 5s clip and ask for the last-frame URL:

{
  "model": "doubao-seedance-2.0-face",
  "prompt": "The kitten approaches the camera",
  "image_urls": ["https://your-cdn.com/kitten-start.png"],
  "return_last_frame": true
}

Step 2 — feed output.last_frame_url as image_urls of the next call:

{
  "model": "doubao-seedance-2.0-face",
  "prompt": "The kitten turns and walks away",
  "image_urls": ["<paste output.last_frame_url from step 1>"]
}
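
The two steps above can be glued together with a small helper that turns a completed task into the next request body. next_segment_body is a hypothetical name; the field names are the ones documented on this page, and reusing the previous task's model is a choice made for this sketch.

```python
def next_segment_body(completed_task: dict, prompt: str) -> dict:
    """Build the request body for the next clip in a chain."""
    output = completed_task.get("output") or {}
    last_frame = output.get("last_frame_url")
    if completed_task.get("status") != "completed" or not last_frame:
        raise ValueError("no last_frame_url; was return_last_frame set to true?")
    return {
        "model": completed_task["model"],  # keep the same variant across segments
        "prompt": prompt,
        "image_urls": [last_frame],        # the last frame seeds the next clip
        "return_last_frame": True,         # so the chain can keep going
    }
```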

Fast variant — quick timelapse

{
  "model": "doubao-seedance-2.0-fast-face",
  "prompt": "City nightscape timelapse",
  "size": "21:9",
  "duration": 8
}

The full REF surface — combine all three reference types for tightly directed product / brand spots.

{
  "model": "doubao-seedance-2.0-face",
  "prompt": "First-person POV product ad with dynamic camera moves",
  "image_urls": [
    "https://your-cdn.com/product-1.jpg",
    "https://your-cdn.com/product-2.jpg"
  ],
  "video_urls": ["https://your-cdn.com/style-ref.mp4"],
  "audio_urls": ["https://your-cdn.com/bgm.mp3"],
  "generate_audio": true,
  "size": "16:9",
  "duration": 11
}

Choosing a variant

Need	Pick
Highest quality, full resolution range	`doubao-seedance-2.0-face`
Cheaper / faster, 720p ceiling	`doubao-seedance-2.0-fast-face`

Variants are independent products: reAPI never rewrites your selected model on the primary attempt. Standard variants can still use the fallback policy after an eligible generation-side failure.

Polling pattern

The task endpoint behaves identically to image tasks — only the completed output shape differs (video_urls / last_frame_url instead of image_urls). A pragmatic schedule:

0–5 minutes:    poll every 5s
5 min – 1 h:    back off gradually toward 1 min
≥ 1 h:          cap at 3 min between polls

A typical task completes in a few minutes. A single generation attempt can run for up to 48 hours; when fallback is enabled, the overall task window can cover two generation attempts.
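
One way to implement the schedule above, as a sketch. The linear ramp from 5 s to 60 s between minute 5 and hour 1, and the flat 3-minute interval after that, are one reading of "back off gradually" and "cap at 3 min". get_task is a hypothetical callable you supply (for example a wrapper around GET /api/v1/tasks/{id}) that returns the task envelope as a dict.

```python
import time


def next_delay(elapsed_s: float) -> float:
    """Seconds to wait before the next poll, given time since submission."""
    if elapsed_s < 300:            # first 5 minutes: every 5 s
        return 5.0
    if elapsed_s < 3600:           # 5 min to 1 h: ramp linearly from 5 s to 60 s
        frac = (elapsed_s - 300) / (3600 - 300)
        return 5.0 + frac * (60.0 - 5.0)
    return 180.0                   # after 1 h: 3 min between polls


def wait_for_task(get_task, task_id, sleep=time.sleep, clock=time.monotonic):
    """Poll until the task reaches a terminal status, then return it."""
    start = clock()
    while True:
        task = get_task(task_id)
        if task["status"] in ("completed", "failed"):
            return task
        sleep(next_delay(clock() - start))
```

sleep and clock are injectable so the loop can be exercised in tests without real waiting.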

Pricing

Per-second × billable seconds, where:

billable_seconds = sum(video_urls clip lengths, server-probed)
                 + duration

video_urls clip lengths are measured server-side via ffmpeg metadata — client-stated values are never trusted for billing. Image and audio references don't add to billable time. T2V / I2V / FRAMES requests (no video_urls) bill on duration alone.

The per-second rate depends on three axes:

Variant (2 options)
Resolution (480p / 720p / 1080p / 4k)
Mode — text (no media references) vs. ref (any of image_urls, image_with_roles, video_urls, audio_urls is set)

REF rates are lower than text rates at every cell. See live numbers on the model page — that table is dynamic and always reflects the current rate.

Bill formula (1 credit = $0.001):

credits = ceil(per_second_usd × billable_seconds × 1000)

Charge on submit; refund automatically on failed. Probe failures (unreachable / unreadable video_urls) return 400 PRICING_UNAVAILABLE with no charge.

When fallback is enabled, reAPI reserves the larger of the primary and fallback attempt prices. The final successful task is settled to the winning attempt's price and the difference is refunded automatically. If both attempts fail, the full reserve is refunded.

Worked example. doubao-seedance-2.0 at 720p, REF mode, with a 5-second reference video and duration: 6:

billable_seconds = 5 + 6 = 11
credits = ceil(per_second_usd × 11 × 1000)

The same duration: 6 request without video_urls would bill 6 seconds at the (higher) text rate.
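
The formula can be sketched in code as follows. The rate of $0.05 per second in the usage note is a made-up number for illustration only; real rates live on the model page. Decimal is used so that the ceiling is not thrown off by binary floating-point error, and credits_for is a hypothetical name. How the server rounds fractional probed clip lengths is not stated here, so treat the result as an estimate.

```python
from decimal import Decimal, ROUND_CEILING


def credits_for(per_second_usd: str, duration: int, ref_clip_seconds=()) -> int:
    """credits = ceil(per_second_usd * billable_seconds * 1000)."""
    # Billable seconds: reference video lengths plus the requested output length.
    clips = sum((Decimal(str(s)) for s in ref_clip_seconds), Decimal(0))
    billable_seconds = clips + Decimal(duration)
    credits = Decimal(per_second_usd) * billable_seconds * 1000
    return int(credits.to_integral_value(rounding=ROUND_CEILING))
```

With the hypothetical rate, credits_for("0.05", 6, [5]) bills 11 seconds and yields 550 credits.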

Tips

Prompt motion, not just scene. "Slow push-in, warm tones, shallow depth of field" outperforms a noun-list of what's on screen.
Sweet-spot duration: 5–10 seconds. Below 5s motion looks choppy; above 10s generation time grows fast.
Trim reference clips before upload. Both their actual length AND your duration count toward the bill. A 2-second style snippet is usually enough to convey style — there's no quality bonus for uploading a 15s reference.
Pick a Fast variant (for example doubao-seedance-2.0-fast-face) for iteration. Fast variants cost noticeably less and give up only the 1080p / 4k tiers, which suits prompt-tuning loops where final quality comes later.
Real people → Face variants. The non-Face variants reject identifiable real-person assets during generation; switching only requires changing the model id.
Chain continuous video with return_last_frame. Pass the returned URL as image_urls of the next request. No prompt drift between segments.

Quick example

curl https://reapi.ai/api/v1/videos/generations \
  -H "Authorization: Bearer rk_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "doubao-seedance-2.0-face",
    "prompt": "A kitten yawning at the camera, cinematic warm tones",
    "resolution": "720p",
    "size": "16:9",
    "duration": 5
  }'

import requests

resp = requests.post(
    "https://reapi.ai/api/v1/videos/generations",
    headers={
        "Authorization": "Bearer rk_live_xxx",
        "Content-Type": "application/json",
    },
    json={
        "model": "doubao-seedance-2.0-face",
        "prompt": "A kitten yawning at the camera, cinematic warm tones",
        "resolution": "720p",
        "size": "16:9",
        "duration": 5,
    },
    timeout=30,
)
print(resp.json())

const r = await fetch("https://reapi.ai/api/v1/videos/generations", {
  method: "POST",
  headers: {
    Authorization: "Bearer rk_live_xxx",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "doubao-seedance-2.0",
    prompt: "A kitten yawning at the camera, cinematic warm tones",
    resolution: "720p",
    size: "16:9",
    duration: 5,
  }),
});
console.log(await r.json());

package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
)

func main() {
    body, _ := json.Marshal(map[string]any{
        "model":      "doubao-seedance-2.0-face",
        "prompt":     "A kitten yawning at the camera, cinematic warm tones",
        "resolution": "720p",
        "size":       "16:9",
        "duration":   5,
    })
    req, _ := http.NewRequest("POST",
        "https://reapi.ai/api/v1/videos/generations", bytes.NewReader(body))
    req.Header.Set("Authorization", "Bearer rk_live_xxx")
    req.Header.Set("Content-Type", "application/json")

    resp, _ := http.DefaultClient.Do(req)
    defer resp.Body.Close()
    out, _ := io.ReadAll(resp.Body)
    fmt.Println(string(out))
}

Submit response

{
  "id": "task_018f5a3a1b6e7d9f8c2b4d6e8f0a2c4e",
  "model": "doubao-seedance-2.0-face",
  "status": "processing",
  "created_at": 1735000000
}

Authentication

Every call needs a Bearer token. Generate keys at reapi.ai/settings/apikeys.

Authorization: Bearer YOUR_API_KEY

Keys carry the active workspace's billing scope — there is no separate project header.

Endpoint

POST /api/v1/videos/generations
GET  /api/v1/tasks/{id}

Submission is async. The POST returns immediately with a task_id; the task endpoint returns the same envelope until completion. Polling does not consume credits.

Variants

doubao-seedance-2.0 is a family of two variants sharing one parameter shape. Pick via model:

Variant	Speed	1080p / 4k	Real-person uploads
`doubao-seedance-2.0-face`	standard	✅	✅
`doubao-seedance-2.0-fast-face`	faster	❌ (480p / 720p only)	✅

reAPI never silently substitutes one variant for another. Sending resolution: "1080p" to the Fast variant returns 400, never an auto-downgraded clip.

Real-person uploads — both variants accept real-person source images / videos.

Channels

The variants above ship across two channels — same async endpoint, selected by the model id you send:

Channel	Model ids	Notes
Standard	`doubao-seedance-2.0-face`, `doubao-seedance-2.0-fast-face`	Face variants accept real-person inputs.
Official	`doubao-seedance-2.0-official`, `doubao-seedance-2.0-fast-official`	Official direct channel — lower price. Real-person inputs are not accepted.

Standard and Official share the parameter shape documented below.

Mode routing

doubao-seedance-2.0 picks its mode from which media fields you set — there is no mode parameter:

Fields you send	Mode	What it does
`prompt` only	T2V	Generate from text
`prompt` + `image_urls` (1–9)	I2V	Animate / extend from reference images
`prompt` + `image_with_roles` (1–2 frames)	FRAMES	First / last frame transition
`prompt` + `video_urls` and / or `audio_urls` (+ optional `image_urls`)	REF	Reference-driven, optionally multi-modal

prompt is required on every request (all four modes carry it)
image_urls ⊕ image_with_roles — never together
image_with_roles cannot be combined with video_urls or audio_urls
audio_urls requires image_urls or video_urls

Request body

`model` — required

string. One of the four variants in the table above.

`prompt` — string, required

Required on every request, min 3 characters, up to 4,000 (≤ 500 recommended — quality drops past ~500 chars on the upstream model). Applies to all modes — T2V, I2V, FRAMES, REF.

Best results come from naming, in order, the subject, the action, the camera move, and the style. e.g. "A kitten, yawning into the camera, slow push-in, cinematic warm tones".

Failure modes.

Missing / empty → 400 (code 20002).
Shorter than 3 chars → 400 (code 20003).
Longer than 4,000 chars → 400 (code 20003).

`duration` — integer, default `5`

Output length in seconds. Any integer in [4, 15]. Out-of-range → 400.

`size` — string, default `"adaptive"`

Output ratio. One of:

Value	Shape
`16:9`	Landscape
`9:16`	Portrait
`1:1`	Square
`4:3`	Traditional landscape
`3:4`	Traditional portrait
`21:9`	Cinematic ultrawide
`adaptive`	Match the input image / video's ratio

Invalid values → 400 (no silent fallback).

`resolution` — string, default `"720p"`

480p / 720p / 1080p / 4k — lowercase only. Drives pricing. Uppercase forms like 1080P are rejected with 400.

`generate_audio` — boolean, default `true`

`return_last_frame` — boolean, default `false`

`tools` — object[]

Per-tool capability list. Today only one type is recognized:

"tools": [{ "type": "web_search" }]

web_search lets the model query the web during generation — useful for current events or named brands. Unknown type values are rejected with 400 tools[i].type must be "web_search".

`nsfw_checker` — boolean, default `true`

Safety checking is enabled by default. Direct API callers can pass "nsfw_checker": false on the Standard model ids in this page:

{
  "model": "doubao-seedance-2.0-face",
  "prompt": "Your prompt",
  "resolution": "720p",
  "size": "16:9",
  "duration": 5,
  "nsfw_checker": false
}

`image_urls` — string[]

Array of public HTTP(S) URLs. Up to 9 entries. Triggers I2V (when sent without other media fields) or augments REF (when combined with video_urls / audio_urls).

Mutually exclusive with image_with_roles. Sending more than 9 is rejected with 400 at most 9 image_urls allowed.

`image_with_roles` — object[]

First / last frame interpolation. Each entry is a {url, role} object:

"image_with_roles": [
  { "url": "https://your-cdn.com/day.jpg",   "role": "first_frame" },
  { "url": "https://your-cdn.com/night.jpg", "role": "last_frame"  }
]

role is one of first_frame / last_frame. Up to 9 entries (typical use: 1 or 2 — one first frame and one last frame).

Cannot be combined with image_urls, video_urls, or audio_urls — these modes are exclusive.

`video_urls` — string[]

No real people on standard / fast variants. Use the Face variants (-face, -fast-face) when the reference clip features identifiable real people — the non-Face variants reject them upstream.

reAPI probes each clip's resolution and duration server-side via ffmpeg metadata. Out-of-spec assets surface as a 400:

Frame outside 300–6000 px/side, 0.41–8.3 MP, or aspect 0.4–2.5 → 400 video_urls[i] resolution WxH is out of range (code 20003)
Each clip outside 2–15s, or combined > 15s → 400 video_urls total duration X.XXs exceeds the 15s limit (code 20003)
Probe failure (network / format) → 400 Could not determine source video duration for billing (code 30002) — no charge

Mutually exclusive with image_with_roles.

`audio_urls` — string[]

Reference audio for REF mode. Up to 3 entries; combined duration ≤ 15 seconds. Public HTTP(S) URLs only.

Must accompany image_urls OR video_urls — a request with audio_urls and no visual reference is rejected with 400 audio_urls must be used together with image_urls or video_urls.

Mutually exclusive with image_with_roles.

Response envelope

Submit and poll share the same shape — only status and output fill in over time.

{
  "id": "task_018f5a3a1b6e7d9f8c2b4d6e8f0a2c4e",
  "model": "doubao-seedance-2.0-face",
  "status": "completed",
  "created_at": 1735000000,
  "output": {
    "video_urls": ["https://cdn.reapi.ai/media/tasks/018f5a3a1b6e7d9f8c2b4d6e8f0a2c4e/0.mp4"],
    "last_frame_url": "https://cdn.reapi.ai/media/tasks/018f5a3a1b6e7d9f8c2b4d6e8f0a2c4e/0.png"
  },
  "error": null
}

Field	Type	Notes
`id`	string	Task identifier — keep it for polling and audit
`model`	string	Echo of the submitted `model` (the variant you picked)
`status`	string	`processing` / `completed` / `failed`
`created_at`	integer	Submission unix timestamp
`output`	object \| null	`null` until completion
`output.video_urls`	string[]	Generated MP4 URL(s) — valid for 7 days
`output.last_frame_url`	string \| null	Present only when the request set `return_last_frame: true`
`error`	object \| null	Populated on `failed` — `{ code, message }`

Validation errors

Trigger	Code	Message
Missing `prompt`	`20002`	`prompt: Invalid input: expected string, received undefined`
`prompt` shorter than 3 chars	`20003`	`prompt: Too small: expected string to have >=3 characters`
`prompt` longer than 4,000 chars	`20003`	`prompt: Too big: expected string to have <=4000 characters`
`image_urls` and `image_with_roles` together	`20003`	`image_urls and image_with_roles cannot be used simultaneously`
`image_with_roles` + `video_urls` or `audio_urls`	`20003`	`image_with_roles cannot be combined with video_urls or audio_urls`
`audio_urls` without visual reference	`20003`	`audio_urls must be used together with image_urls or video_urls`
`image_urls` > 9	`20003`	`at most 9 image_urls allowed, got N`
`image_with_roles` > 9	`20003`	`at most 9 image_with_roles allowed, got N`
`image_with_roles[i].role` invalid	`20003`	`image_with_roles[i].role must be first_frame or last_frame, got "..."`
`video_urls` > 3	`20003`	`at most 3 video_urls allowed, got N`
`video_urls` clip frame out of range (300–6000px / 0.41–8.3MP / aspect 0.4–2.5)	`20003`	`video_urls[i] resolution WxH is out of range`
`video_urls` combined > 15s	`20003`	`video_urls total duration X.XXs exceeds the 15s limit`
`audio_urls` > 3	`20003`	`at most 3 audio_urls allowed, got N`
`audio_urls` combined > 15s	`20003`	`audio_urls total duration X.XXs exceeds the 15s limit`
`duration` outside 4–15	`20003`	`duration must be 4-15 seconds, got N`
Invalid `size` value	`20005`	`invalid size "..." (allowed: 16:9 / 9:16 / 1:1 / 4:3 / 3:4 / 21:9 / adaptive)`
Invalid `resolution` value	`20003`	`invalid resolution "..." (allowed: 480p / 720p / 1080p / 4k)`
`1080p` on Fast variant	`20003`	`resolution=1080p is not supported by <variant> (use doubao-seedance-2.0 or doubao-seedance-2.0-face)`
`tools[i].type` not `web_search`	`20003`	`tools[i].type must be "web_search", got "..."`
Any URL field carrying a `data:` URI	`20003`	`<field> entries must be public URLs; base64 data URIs are not supported`
Reference video probe fails	`30002`	`Could not determine source video duration for billing: ...`

The full envelope is { "error": { "code", "message", "request_id" } } — see Errors catalog for wire format and request_id correlation.

Recipes

T2V — text-to-video

{
  "model": "doubao-seedance-2.0-face",
  "prompt": "A kitten yawning at the camera, slow push-in, warm tones",
  "resolution": "720p",
  "size": "16:9",
  "duration": 5,
  "fallback": { "enabled": false }
}

I2V — single reference image

{
  "model": "doubao-seedance-2.0-face",
  "prompt": "The kitten stands up and walks toward the camera",
  "image_urls": ["https://your-cdn.com/cat.jpg"],
  "duration": 5
}

FRAMES — first / last frame transition

{
  "model": "doubao-seedance-2.0-face",
  "prompt": "Smooth transition from day to night",
  "image_with_roles": [
    { "url": "https://your-cdn.com/day.jpg",   "role": "first_frame" },
    { "url": "https://your-cdn.com/night.jpg", "role": "last_frame"  }
  ],
  "duration": 5
}

REF — reference video (style transfer)

{
  "model": "doubao-seedance-2.0-face",
  "prompt": "Restylize the reference clip into anime aesthetics",
  "video_urls": ["https://your-cdn.com/reference.mp4"]
}

REF — reference video + reference audio

{
  "model": "doubao-seedance-2.0-face",
  "prompt": "A scene of a person speaking",
  "video_urls": ["https://your-cdn.com/reference.mp4"],
  "audio_urls": ["https://your-cdn.com/speech.wav"],
  "size": "16:9",
  "duration": 11
}

Voiced video (synthesized audio)

{
  "model": "doubao-seedance-2.0-face",
  "prompt": "A man calls out to a woman: \"Remember — never point at the moon with your finger.\"",
  "generate_audio": true
}

Continuous video chain

Step 1 — produce a 5s clip and ask for the last-frame URL:

{
  "model": "doubao-seedance-2.0-face",
  "prompt": "The kitten approaches the camera",
  "image_urls": ["https://your-cdn.com/kitten-start.png"],
  "return_last_frame": true
}

Step 2 — feed output.last_frame_url as image_urls of the next call:

{
  "model": "doubao-seedance-2.0-face",
  "prompt": "The kitten turns and walks away",
  "image_urls": ["<paste output.last_frame_url from step 1>"]
}
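
The two steps can be glued together with a small helper that builds the step-2 body from the step-1 result. This is a sketch under assumptions: `next_segment_request` is a hypothetical name, and the completed task is assumed to be a dict exposing the last frame at `output.last_frame_url`, as referenced above.

```python
def next_segment_request(prev_task: dict, prompt: str, model: str) -> dict:
    """Build the request body for the next segment of a chained video."""
    last_frame = prev_task.get("output", {}).get("last_frame_url")
    if not last_frame:
        raise ValueError(
            "previous task has no output.last_frame_url; was return_last_frame set?"
        )
    return {
        "model": model,
        "prompt": prompt,
        "image_urls": [last_frame],
        # Request the last frame again so the chain can continue further.
        "return_last_frame": True,
    }
```

Setting `return_last_frame` on every segment keeps the chain extendable; drop it on the final segment if the frame is not needed.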

Fast variant — quick timelapse

{
  "model": "doubao-seedance-2.0-fast-face",
  "prompt": "City nightscape timelapse",
  "size": "21:9",
  "duration": 8
}

The full REF surface — combine all three reference types for tightly directed product / brand spots.

{
  "model": "doubao-seedance-2.0-face",
  "prompt": "First-person POV product ad with dynamic camera moves",
  "image_urls": [
    "https://your-cdn.com/product-1.jpg",
    "https://your-cdn.com/product-2.jpg"
  ],
  "video_urls": ["https://your-cdn.com/style-ref.mp4"],
  "audio_urls": ["https://your-cdn.com/bgm.mp3"],
  "generate_audio": true,
  "size": "16:9",
  "duration": 11
}

Choosing a variant

Need	Pick
Highest quality, full resolution range	`doubao-seedance-2.0-face`
Cheaper / faster, 720p ceiling	`doubao-seedance-2.0-fast-face`

Polling pattern

The task endpoint behaves identically to image tasks — only the completed output shape differs (video_urls / last_frame_url instead of image_urls). A pragmatic schedule:

0–5 minutes:    poll every 5s
5 min – 1 h:    back off gradually toward 1 min
≥ 1 h:          cap at 3 min between polls

A typical task completes in a few minutes. A single generation attempt can run for up to 48 hours; when fallback is enabled, the overall task window can cover two generation attempts.
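
The schedule above can be expressed as a function from elapsed time to the next sleep interval. This is one possible reading, not a prescribed algorithm: `poll_interval` is a hypothetical helper, the "back off gradually" tier is implemented as linear interpolation from 5s to 60s, and the tier at or beyond one hour is treated as a fixed 3-minute interval.

```python
def poll_interval(elapsed_s: float) -> float:
    """Seconds to wait before the next poll, given seconds since submit."""
    if elapsed_s < 300:          # 0-5 minutes: poll every 5s
        return 5.0
    if elapsed_s < 3600:         # 5 min - 1 h: ramp linearly from 5s to 60s
        frac = (elapsed_s - 300) / (3600 - 300)
        return 5.0 + frac * (60.0 - 5.0)
    return 180.0                 # >= 1 h: 3 minutes between polls
```

A polling loop would call `time.sleep(poll_interval(time.monotonic() - started))` between status checks and stop once the task reports a terminal state.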

Pricing

The charge is the per-second rate × billable seconds, where:

billable_seconds = sum(video_urls clip lengths, server-probed)
                 + duration

The per-second rate depends on three axes:

Variant (4 options)
Resolution (480p / 720p / 1080p / 4k)
Mode — text (no media references) vs. ref (any of image_urls, image_with_roles, video_urls, audio_urls is set)
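
The mode axis follows directly from which fields a request sets, so it can be derived before looking up a rate. A minimal sketch, assuming an empty list counts as "not set"; `pricing_mode` is a hypothetical helper name.

```python
REFERENCE_FIELDS = ("image_urls", "image_with_roles", "video_urls", "audio_urls")


def pricing_mode(req: dict) -> str:
    """Return "ref" if any media reference field is populated, else "text"."""
    return "ref" if any(req.get(field) for field in REFERENCE_FIELDS) else "text"
```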

REF rates are lower than text rates at every cell. See live numbers on the model page — that table is dynamic and always reflects the current rate.

Bill formula (1 credit = $0.001):

credits = ceil(per_second_usd × billable_seconds × 1000)

Credits are charged on submit and refunded automatically if the task ends as failed. Probe failures (unreachable or unreadable video_urls) return 400 PRICING_UNAVAILABLE with no charge.

Worked example. doubao-seedance-2.0 at 720p, REF mode, with a 5-second reference video and duration: 6:

billable_seconds = 5 + 6 = 11
credits = ceil(per_second_usd × 11 × 1000)

The same duration: 6 request without video_urls would bill 6 seconds at the (higher) text rate.
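
The two formulas can be combined into a small estimator. This is a sketch, not the billing implementation: `billable_seconds` and `credits` are hypothetical helpers, the rate `"0.10"` in the usage lines is a placeholder rather than a real price, and `Decimal` is used so that binary floating-point error cannot push `ceil` up by one credit.

```python
from decimal import Decimal, ROUND_CEILING


def billable_seconds(ref_clip_seconds, duration) -> Decimal:
    """Sum of reference video clip lengths plus the requested output duration."""
    clips = sum((Decimal(str(s)) for s in ref_clip_seconds), Decimal(0))
    return clips + Decimal(str(duration))


def credits(per_second_usd: str, seconds: Decimal) -> int:
    """credits = ceil(per_second_usd x billable_seconds x 1000); 1 credit = $0.001."""
    raw = Decimal(per_second_usd) * seconds * 1000
    return int(raw.to_integral_value(rounding=ROUND_CEILING))


# Worked example from above: one 5-second reference clip, duration 6.
seconds = billable_seconds([5], 6)      # Decimal("11")
estimate = credits("0.10", seconds)     # placeholder rate, gives 1100 credits
```

Pass the rate as a string copied from the model page; the server-probed clip length may differ slightly from a locally measured one, so treat the result as an estimate.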

Tips

Prompt motion, not just scene. "Slow push-in, warm tones, shallow depth of field" outperforms a noun-list of what's on screen.
Sweet-spot duration: 5–10 seconds. Below 5s motion looks choppy; above 10s generation time grows fast.
Trim reference clips before upload. Both their actual length AND your duration count toward the bill. A 2-second style snippet is usually enough to convey style — there's no quality bonus for uploading a 15s reference.
Pick doubao-seedance-2.0-fast for iteration. Fast variants cost noticeably less and top out at 720p, which suits prompt-tuning loops where final quality comes later.
Real people → Face variants. The non-Face variants reject identifiable real-person assets during generation; switching only requires adding the -face suffix to model.
Chain continuous video with return_last_frame. Pass the returned URL as image_urls of the next request, so each segment starts from the exact frame where the previous one ended.
