
Kling 3.0 — Multi-Shot AI Video with Native Audio

Kling 3.0 is Kuaishou's flagship video model: write a prompt for text-to-video, animate a still with image-to-video, or direct a multi-shot sequence in which each shot has its own prompt, duration, and camera work. It generates native multilingual audio with lip sync, renders at up to 4K, and is available through reAPI's OpenAI-compatible API.

Input

Prompt: up to 2500 characters · required for single-shot

Image URLs: public HTTPS URLs only (no base64) · up to 2 (first / last frame)

Duration: 6 (value shown in the form) · 3–15 seconds · billed per second

Aspect ratio: 16:9 / 9:16 / 1:1 · default 16:9

Mode: std 720p · pro 1080p · 4K · price scales with the tier

Sound: off by default · audio tiers cost more per second

Multi-shot: pair with Multi-shot prompts below

Multi-shot prompts: JSON, up to 5 shots of prompt + duration · 1–12s each

Elements: JSON, up to 3 elements, each with 2–4 image URLs · reference with @name in the prompt
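
The limits listed above can be checked on the client before a request is sent. The following is a minimal sketch, not reAPI's own validation: the function name `build_single_shot_payload` is hypothetical, the mode spellings `std` / `pro` / `4k` and the boolean type of `sound` are assumptions, and only the field names `model`, `prompt`, `duration`, `aspect_ratio`, `mode`, `sound`, and `image_urls` come from this page.

```python
from urllib.parse import urlparse

ASPECT_RATIOS = {"16:9", "9:16", "1:1"}
MODES = {"std", "pro", "4k"}  # assumed spellings of the three tiers


def build_single_shot_payload(prompt, duration, mode,
                              aspect_ratio="16:9", sound=False, image_urls=()):
    """Check the documented limits and return a request body as a dict."""
    if not prompt or len(prompt) > 2500:
        raise ValueError("prompt is required and limited to 2500 characters")
    if not 3 <= duration <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError("aspect_ratio must be 16:9, 9:16, or 1:1")
    if mode not in MODES:
        raise ValueError("mode must be one of: " + ", ".join(sorted(MODES)))
    if len(image_urls) > 2:
        raise ValueError("at most 2 image URLs (first / last frame)")
    for url in image_urls:
        parsed = urlparse(url)
        # Rejects data: URIs (base64) and plain http links alike.
        if parsed.scheme != "https" or not parsed.netloc:
            raise ValueError("image URLs must be public HTTPS URLs: " + url)

    payload = {
        "model": "kling-3-0",
        "prompt": prompt,
        "duration": duration,
        "aspect_ratio": aspect_ratio,
        "mode": mode,
        "sound": sound,  # assumed to be a boolean flag
    }
    if image_urls:
        payload["image_urls"] = list(image_urls)
    return payload


body = build_single_shot_payload(
    "A paper boat drifting down a rainy street", duration=6, mode="pro"
)
```

Failing early on the client avoids spending a round trip on a request the API would likely reject; the server remains the authority on what is actually accepted.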

What you can build with this model

Real-world workflows and production use cases you can build and ship with this model.

Direct a multi-shot sequence

The headline upgrade in Kling 3.0 is multi-shot. Hand it up to five shots — each with its own prompt, duration, and camera direction — and Kling 3.0 returns one connected sequence with consistent characters and scenes across cuts. It is built for storyboards, ads, and short narratives where a single static clip is not enough, all from one request and one task ID.
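
A shot list of this kind could be assembled and checked as in the sketch below. It is an illustration under stated assumptions: the page documents the limits (up to five shots, 1–12 seconds each, each with a prompt and a duration) but not the exact JSON keys or the request field that carries the list, so the `prompt` and `duration` keys and the helper name `build_shots` are hypothetical.

```python
import json

MAX_SHOTS = 5
MIN_SHOT_SECONDS = 1
MAX_SHOT_SECONDS = 12


def build_shots(shots):
    """Turn (prompt, seconds) pairs into a validated list of shot dicts."""
    if not 1 <= len(shots) <= MAX_SHOTS:
        raise ValueError(f"expected 1 to {MAX_SHOTS} shots, got {len(shots)}")
    result = []
    for index, (prompt, seconds) in enumerate(shots, start=1):
        if not prompt.strip():
            raise ValueError(f"shot {index} has an empty prompt")
        if not MIN_SHOT_SECONDS <= seconds <= MAX_SHOT_SECONDS:
            raise ValueError(
                f"shot {index} must last {MIN_SHOT_SECONDS} to {MAX_SHOT_SECONDS} seconds"
            )
        # Key names are assumptions; check the API docs for the real schema.
        result.append({"prompt": prompt, "duration": seconds})
    return result


storyboard = build_shots([
    ("Wide shot: a lighthouse at dusk, slow push-in", 4),
    ("Close-up: the keeper lights the lamp, handheld", 3),
    ("Aerial pull-back as the beam sweeps the sea", 3),
])
total_seconds = sum(shot["duration"] for shot in storyboard)
storyboard_json = json.dumps(storyboard)
```

Camera direction is written into each shot's prompt text here, since the page describes it as part of the shot but does not name a separate field for it.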


Animate a still with image-to-video

Pass a start frame — and optionally an end frame — and Kling 3.0 animates the still into smooth motion, interpolating between the two when both are supplied. Aspect ratio auto-adapts to your images. It is the fastest way to turn a product shot, a character render, or a key frame into a moving clip without re-describing the whole scene in text.

Video with native audio

Turn on sound and Kling 3.0 generates synchronized audio with the video — multilingual dialogue with lip sync, dialects and accents, plus ambient sound — instead of a separate audio pass. Pair it with multi-shot prompts and you get talking characters across cuts, which makes Kling 3.0 a one-call tool for narrated shorts and dialogue scenes.

Pricing

Credit-based — 1 credit = $0.001 USD. Pay only for completed generations.

| Category | Unit | Price |
| --- | --- | --- |
| std 720p · no audio | Per second | $0.077 (77 credits) |
| std 720p · audio | Per second | $0.11 (110 credits) |
| pro 1080p · no audio | Per second | $0.099 (99 credits) |
| pro 1080p · audio | Per second | $0.149 (149 credits) |
| 4K | Per second | $0.369 (369 credits) |
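
Because billing is per second at a fixed rate per tier, the cost of a clip can be estimated up front. The sketch below hard-codes the rates from the pricing list above; the function names and the mode keys `std` / `pro` / `4k` are illustrative, and treating the single 4K rate as independent of audio is an assumption, since the page lists only one 4K price.

```python
# Credits per second, copied from the pricing list. Keys are (mode, sound).
CREDITS_PER_SECOND = {
    ("std", False): 77,
    ("std", True): 110,
    ("pro", False): 99,
    ("pro", True): 149,
}
CREDITS_PER_SECOND_4K = 369  # one rate listed; audio handling at 4K is assumed
CREDITS_PER_USD = 1000       # 1 credit = $0.001 USD


def estimate_credits(seconds, mode, sound=False):
    """Return the estimated credit cost of a clip of the given length."""
    if mode == "4k":
        rate = CREDITS_PER_SECOND_4K
    else:
        rate = CREDITS_PER_SECOND[(mode, sound)]
    return seconds * rate


def credits_to_usd(credits):
    """Convert credits to US dollars."""
    return credits / CREDITS_PER_USD


credits = estimate_credits(6, "pro")  # 6 seconds at 99 credits per second
print(credits, credits_to_usd(credits))  # 594 0.594
```

Keeping the arithmetic in whole credits and converting to dollars only at the end avoids accumulating floating-point error across many estimates.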

Why reAPI

Multi-shot storytelling in one call

Most video models give you a single clip. Kling 3.0 directs up to five shots — distinct prompts, durations, and camera moves — into one coherent sequence. You skip the stitch-and-sync busywork, and characters stay consistent across cuts through the same request to /api/v1/videos/generations.

Native audio, not a separate pass

Kling 3.0 generates synchronized audio with the picture — multilingual dialogue, lip sync, dialects, and ambient sound. Turn `sound` on and the clip comes back voiced. It is billed per second, with audio tiers priced above silent ones, so you pay for sound only when you ask for it.

Async-first, OpenAI-compatible

Submit a Kling 3.0 task, get a task_id back, poll until it completes. 1 credit equals $0.001 USD and you pay per second × tier, so cost is predictable. The JSON matches the OpenAI generations contract, so adding Kling 3.0 is a model-id change, not a new integration.
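
The submit-then-poll flow described here could be wrapped in a small helper such as the one below. This is a sketch, not reAPI's client: `wait_for_task` and its parameters are hypothetical, the only status value confirmed on this page is `completed`, and the failure status names are assumptions. The status lookup is passed in as a function so the loop can be exercised without network access.

```python
import time

# Assumed names for terminal failure states; only "completed" is documented here.
FAILED_STATES = {"failed", "error", "cancelled"}


def wait_for_task(fetch_status, task_id, interval=5.0, timeout=600.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status(task_id) until the task completes, fails, or times out."""
    deadline = clock() + timeout
    while True:
        task = fetch_status(task_id)
        status = task.get("status")
        if status == "completed":
            return task
        if status in FAILED_STATES:
            raise RuntimeError(f"task {task_id} ended with status {status!r}")
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} not completed after {timeout} seconds")
        sleep(interval)


# Usage with a stand-in for the real status call:
_fake_responses = iter([{"status": "processing"}, {"status": "completed"}])
result = wait_for_task(lambda task_id: next(_fake_responses), "task-123",
                       sleep=lambda seconds: None)
```

A fixed interval with an overall timeout keeps the loop simple; a production client might add backoff or switch to webhooks if the API offers them.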

Kling 3.0 vs Veo 3

Both generate premium video with audio from one API. Kling 3.0 leans into multi-shot direction, multilingual audio, and flexible duration with per-second pricing. Here is how the two compare on publicly documented behavior.

| Capability | Kling 3.0 on reAPI | Veo 3 |
| --- | --- | --- |
| Multi-shot sequences | Up to five shots in one request, each with its own prompt, duration, and camera direction. | Single-clip generation; sequences are stitched from multiple requests. |
| Native audio | Multilingual dialogue with lip sync, dialects, and ambient sound, generated with the video. | Native audio including dialogue and sound effects. |
| Image-to-video | Start frame plus optional end frame, with aspect ratio auto-adapted. | Image-to-video from a single reference image. |
| Resolution | std 720p, pro 1080p, or 4K via mode. | Up to 1080p / 4K depending on tier. |
| Duration | 3–15s single-shot; multi-shot up to five 1–12s shots. | Fixed short clip lengths. |
| Pricing model | Per second × tier × audio; pay-as-you-go credits, no subscription. | Per-second pricing that scales with resolution and audio. |

Comparison reflects publicly documented behavior at the time of writing. Model behavior and pricing can change; check the pricing card above and the API docs for current values.

Integrate Kling 3.0 in three steps

  1. Create an API key

     Sign up at reAPI and grab an API key. Free signup credits cover your first Kling 3.0 clips, enough to test text-to-video and image-to-video before you top up.

  2. Submit a generation

     POST to /api/v1/videos/generations with model: kling-3-0, a prompt, and optional duration / aspect_ratio / mode / sound / image_urls. Kling 3.0 returns a task_id immediately.

  3. Poll the result

     GET /api/v1/tasks/:id until status is completed. The Kling 3.0 response carries the video URL; mirror it to your own storage if you need it long term, since generated links expire.
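
The three steps above can be wired together with the Python standard library roughly as follows. Treat this as a sketch under assumptions: the base URL is a placeholder (this page does not state the host), Bearer-token authentication is assumed from the OpenAI-compatible description, the environment variable names are hypothetical, and the response is only assumed to be JSON containing `task_id` and `status`. The endpoint paths and the model id come from the steps.

```python
import json
import os
import time
import urllib.request

# Placeholder host and hypothetical variable names; substitute real values.
BASE_URL = os.environ.get("REAPI_BASE_URL", "https://reapi.example")
API_KEY = os.environ.get("REAPI_API_KEY", "")


def build_request(method, path, body=None):
    """Build an authenticated request; the Bearer scheme is an assumption."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Authorization": f"Bearer {API_KEY}"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data,
                                  headers=headers, method=method)


def call(method, path, body=None):
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(build_request(method, path, body), timeout=30) as response:
        return json.load(response)


def submit_generation(prompt, **options):
    """Step 2: submit a task and return its task_id."""
    payload = {"model": "kling-3-0", "prompt": prompt, **options}
    return call("POST", "/api/v1/videos/generations", payload)["task_id"]


def get_task(task_id):
    """Step 3: fetch the current state of a task."""
    return call("GET", f"/api/v1/tasks/{task_id}")


def generate(prompt, poll_seconds=5, max_polls=120, **options):
    """Submit, then poll a bounded number of times until status is completed."""
    task_id = submit_generation(prompt, **options)
    for _ in range(max_polls):
        task = get_task(task_id)
        if task.get("status") == "completed":
            return task
        time.sleep(poll_seconds)
    raise TimeoutError(f"task {task_id} did not complete in time")
```

Calling `generate("A paper boat drifting down a rainy street", duration=6, aspect_ratio="16:9")` would return the completed task, whose video URL should be copied to your own storage promptly because generated links expire. The name of the field holding that URL is not given on this page, so inspect the response before relying on a specific key.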

Frequently asked questions

Common questions about this model.

Kling 3.0 is Kuaishou's flagship video model. It does text-to-video and image-to-video, generates native multilingual audio with lip sync, and, new in 3.0, directs multi-shot cinematic sequences. On reAPI, Kling 3.0 ships on the OpenAI-compatible /api/v1/videos/generations endpoint under the model id kling-3-0.

© 2026 reAPI. All Rights Reserved.