reAPI Docs

gpt-5.5

GPT-5.5 is OpenAI's frontier reasoning model, exposed through reAPI as a drop-in OpenAI-compatible Chat Completions endpoint (/v1/chat/completions). It offers a 1M-token context window, up to 128K output tokens per call, advanced reasoning with adjustable effort, and Tool Search for large agent workflows. Current rates are listed on the model page and at api.reapi.ai/pricing.

Quick example

cURL

curl https://api.reapi.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5",
    "group": "default",
    "messages": [
      { "role": "user", "content": "Hello" }
    ],
    "stream": true,
    "temperature": 0.7,
    "top_p": 1,
    "frequency_penalty": 0,
    "presence_penalty": 0
  }'

Python

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.reapi.ai/v1",
)

stream = client.chat.completions.create(
    model="gpt-5.5",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
    temperature=0.7,
    top_p=1,
    frequency_penalty=0,
    presence_penalty=0,
    extra_body={"group": "default"},
)

for chunk in stream:
    delta = chunk.choices[0].delta.content or ""
    print(delta, end="", flush=True)

JavaScript

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://api.reapi.ai/v1",
});

const stream = await client.chat.completions.create({
  model: "gpt-5.5",
  messages: [{ role: "user", content: "Hello" }],
  stream: true,
  temperature: 0.7,
  top_p: 1,
  frequency_penalty: 0,
  presence_penalty: 0,
  // `group` is a reAPI-specific extension. The Node SDK has no extra_body
  // option; extra top-level fields are sent in the request body as-is.
  // @ts-expect-error: `group` is not part of the OpenAI types
  group: "default",
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}

Go

package main

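import (
    "bytes"
    "encoding/json"
    "io"
    "log"
    "net/http"
    "os"
)

func main() {
    // Build the JSON request body; "group" is the reAPI-specific routing field.
    body, err := json.Marshal(map[string]any{
        "model": "gpt-5.5",
        "group": "default",
        "messages": []map[string]string{
            {"role": "user", "content": "Hello"},
        },
        "stream":            true,
        "temperature":       0.7,
        "top_p":             1,
        "frequency_penalty": 0,
        "presence_penalty":  0,
    })
    if err != nil {
        log.Fatalf("encode request: %v", err)
    }

    req, err := http.NewRequest("POST",
        "https://api.reapi.ai/v1/chat/completions", bytes.NewReader(body))
    if err != nil {
        log.Fatalf("build request: %v", err)
    }
    req.Header.Set("Authorization", "Bearer YOUR_API_KEY")
    req.Header.Set("Content-Type", "application/json")

    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        log.Fatalf("send request: %v", err)
    }
    defer resp.Body.Close()

    // With "stream": true the body is a server-sent event stream, so copy the
    // raw data: lines to stdout as they arrive instead of buffering them all.
    if _, err := io.Copy(os.Stdout, resp.Body); err != nil {
        log.Fatalf("read response: %v", err)
    }
}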

Authentication

Every request needs a Bearer token. The GPT-5.5 chat workspace lives on the api.reapi.ai platform — sign in there to create a key and top up tokens.

  1. Open api.reapi.ai and sign in (or create an account).
  2. Generate an API key under API Keys.
  3. Top up tokens under Top Up (pay-as-you-go, billed in USD per 1M tokens; see api.reapi.ai/pricing).

Then send the key in the Authorization header of every request:

Authorization: Bearer YOUR_API_KEY

The chat surface (api.reapi.ai) is a separate workspace from the image/video/audio task gateway at reapi.ai/api/v1/*. Keys and balances do not cross over — a key issued on reapi.ai/settings/apikeys will not authenticate against api.reapi.ai/v1/chat/completions, and vice versa.


Endpoint

POST https://api.reapi.ai/v1/chat/completions

OpenAI-compatible. The same SDKs (openai-python, openai-node, openai-go, …) work once the base URL is set to https://api.reapi.ai/v1.


Request body

model — string, required

Must be "gpt-5.5". The value is echoed back in the response envelope.

messages — array, required

Conversation history as an array of message objects. Same shape as the OpenAI Chat Completions spec:

{
  "role": "system" | "user" | "assistant" | "tool",
  "content": "string or content-parts array"
}

Multi-turn history is sent in chronological order — the last message is the one the model responds to.

stream — boolean, default false

When true, the response is streamed as server-sent events (SSE) with Content-Type: text/event-stream. Each event is a JSON delta in the OpenAI format, terminated by a data: [DONE] line. When false, the full response body is returned in one HTTP response.

temperature — number, default 1

Range 0.0 – 2.0. Sampling temperature. Lower values make output more deterministic; higher values increase randomness. OpenAI recommends tuning either temperature or top_p, not both.
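
As a rough illustration of what the temperature knob does, the sketch below divides a toy logit vector by the temperature before a softmax. The function name and the logits are invented for the example; this is the textbook formulation, not a description of how GPT-5.5 is actually served.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities after dividing them by the temperature."""
    if temperature <= 0:
        # A temperature of 0 is usually treated as greedy decoding; this
        # sketch does not model that case.
        raise ValueError("temperature must be > 0 in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


toy_logits = [2.0, 1.0, 0.0]  # hypothetical scores for three candidate tokens
cold = softmax_with_temperature(toy_logits, 0.5)  # sharper distribution
hot = softmax_with_temperature(toy_logits, 2.0)   # flatter distribution
```

With these toy numbers the top token gets a larger share of the probability mass at 0.5 than at 2.0, which is the "more deterministic" versus "more random" behaviour described above.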

top_p — number, default 1

Range 0.0 – 1.0. Nucleus sampling cutoff — restricts sampling to the smallest set of tokens whose cumulative probability mass exceeds top_p.
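
The cutoff can be pictured with a small sketch. The helper name and the probabilities below are hypothetical, and the stopping rule uses "reaches or exceeds", which is one common way to implement the idea; the upstream implementation is not documented here.

```python
def nucleus_filter(probs, top_p):
    """Return the indices of the tokens kept by a nucleus (top-p) cutoff."""
    # Visit tokens from most to least likely.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:  # stop once the kept set covers top_p of the mass
            break
    return kept


toy_probs = [0.5, 0.3, 0.15, 0.05]  # hypothetical next-token probabilities
narrow = nucleus_filter(toy_probs, 0.7)  # keeps only the two likeliest tokens
full = nucleus_filter(toy_probs, 1.0)    # keeps every token
```

Sampling then happens only among the kept tokens, after their probabilities are renormalised.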

frequency_penalty — number, default 0

Range -2.0 – 2.0. Penalises tokens by how often they've already appeared in the response so far. Positive values discourage literal repetition.

presence_penalty — number, default 0

Range -2.0 – 2.0. Penalises tokens that have appeared at all, regardless of frequency. Positive values encourage the model to talk about new topics.
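
The difference between the two penalties is easier to see in code. The sketch below follows the adjustment described in OpenAI's public documentation (each logit is reduced by count × frequency_penalty, plus presence_penalty once if the token has appeared at all). The function name, tokens, and logits are made up, and whether the gateway's upstream applies exactly this formula is an assumption.

```python
from collections import Counter


def apply_penalties(logits, generated_tokens,
                    frequency_penalty=0.0, presence_penalty=0.0):
    """Return a copy of `logits` with both penalties applied."""
    counts = Counter(generated_tokens)
    adjusted = dict(logits)
    for token, count in counts.items():
        if token in adjusted:
            # The frequency penalty scales with the count; the presence
            # penalty is a flat, one-time deduction.
            adjusted[token] -= count * frequency_penalty + presence_penalty
    return adjusted


toy_logits = {"cat": 2.0, "dog": 2.0, "fish": 2.0}  # hypothetical scores
history = ["cat", "cat", "dog"]                      # tokens generated so far
result = apply_penalties(toy_logits, history,
                         frequency_penalty=0.5, presence_penalty=0.25)
```

Here "cat" (seen twice) is penalised more than "dog" (seen once), and "fish" (never seen) is untouched.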

group — string, default "default"

reAPI-specific extension. Selects a token group on the gateway, which routes the request to a specific upstream channel pool. "default" is the standard pool and covers nearly every workload — you can omit the field if you don't need custom routing.

Other OpenAI parameters

Every other field on the OpenAI Chat Completions spec — max_tokens, stop, n, seed, tools, tool_choice, response_format, logprobs, top_logprobs, user, parallel_tool_calls, reasoning_effort (none / low / medium / high / xhigh) — passes through unchanged. The OpenAI SDKs do not need a reAPI-specific shim.


Response shape

Non-streaming (stream: false)

{
  "id": "chatcmpl-018f5a3a1b6e7d9f8c2b4d6e8f0a2c4e",
  "object": "chat.completion",
  "created": 1735000000,
  "model": "gpt-5.5",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 9,
    "total_tokens": 21
  }
}

usage.prompt_tokens and usage.completion_tokens are the inputs to the bill — see api.reapi.ai/pricing for the live rate card.

Streaming (stream: true)

Content-Type: text/event-stream. Each data: line is a JSON delta:

data: {"id":"chatcmpl-…","object":"chat.completion.chunk","created":1735000000,"model":"gpt-5.5","choices":[{"index":0,"delta":{"role":"assistant","content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-…","object":"chat.completion.chunk","created":1735000000,"model":"gpt-5.5","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}

data: {"id":"chatcmpl-…","object":"chat.completion.chunk","created":1735000000,"model":"gpt-5.5","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]

The final event before [DONE] carries the finish_reason (stop / length / tool_calls / content_filter). Usage stats are omitted from the stream — call again with stream: false if you need exact token counts per turn.
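
If you read the stream without an SDK, the events above can be reassembled with a few lines of parsing. This is a minimal sketch that assumes the lines have already been read from the HTTP response; the function name is hypothetical and the sample events are shortened copies of the ones shown above.

```python
import json


def collect_stream_text(lines):
    """Join content deltas from SSE lines; return (text, finish_reason)."""
    parts, finish_reason = [], None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        event = json.loads(payload)
        for choice in event.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
            finish_reason = choice.get("finish_reason") or finish_reason
    return "".join(parts), finish_reason


sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"Hello"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    "",
    "data: [DONE]",
]
text, reason = collect_stream_text(sample)
```

The same function works on any iterable of decoded lines, for example the line iterator of an HTTP client's streaming response.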


Pricing

GPT-5.5 is billed pay-as-you-go in USD against your api.reapi.ai token balance. The live per-1M-token rate card lives on api.reapi.ai/pricing; top up tokens at api.reapi.ai.

The bill for a single call is:

input_cost  = prompt_tokens     × input_rate  / 1,000,000
output_cost = completion_tokens × output_rate / 1,000,000

Failed requests are not charged.

Long-context tier (>272K input tokens)

When the input portion of a single request exceeds 272K tokens, the entire request is billed at 2× the input rate and 1.5× the output rate. A request with 270K input tokens stays at the standard rate; a request with 280K input tokens shifts the whole call (input and output) to the long-context rate. See api.reapi.ai/pricing for the resolved per-1M-token numbers in both tiers.
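
The billing rule can be written as a small estimator. Everything numeric below is an assumption made for illustration: the rates are placeholders rather than real prices, and "272K" is taken to mean 272,000 tokens. Check api.reapi.ai/pricing for the actual figures.

```python
LONG_CONTEXT_THRESHOLD = 272_000  # assumption: "272K" means 272,000 tokens


def call_cost(prompt_tokens, completion_tokens, input_rate, output_rate):
    """Estimate the USD cost of one call; rates are USD per 1M tokens."""
    long_context = prompt_tokens > LONG_CONTEXT_THRESHOLD
    # Above the threshold the whole call moves to the long-context tier.
    input_multiplier = 2.0 if long_context else 1.0
    output_multiplier = 1.5 if long_context else 1.0
    input_cost = prompt_tokens * input_rate * input_multiplier / 1_000_000
    output_cost = completion_tokens * output_rate * output_multiplier / 1_000_000
    return input_cost + output_cost


PLACEHOLDER_INPUT_RATE = 1.0   # hypothetical USD per 1M input tokens
PLACEHOLDER_OUTPUT_RATE = 4.0  # hypothetical USD per 1M output tokens

standard = call_cost(270_000, 1_000,
                     PLACEHOLDER_INPUT_RATE, PLACEHOLDER_OUTPUT_RATE)
long_tier = call_cost(280_000, 1_000,
                      PLACEHOLDER_INPUT_RATE, PLACEHOLDER_OUTPUT_RATE)
```

With these placeholder rates, adding 10K input tokens roughly doubles the cost of the call, because crossing the threshold re-prices every token in the request.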


Limits

| Limit | Value |
| --- | --- |
| Context window | 1M tokens |
| Max output per call | 128K tokens |
| Standard-rate input | ≤ 272K tokens |

Streams that hit the output cap finish with finish_reason: "length"; call again with a continuation message if you need more text.
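
One way to build that continuation request is sketched below. The helper names and the wording of the follow-up prompt are hypothetical; the only behaviour taken from this page is that a truncated response reports finish_reason "length".

```python
def needs_continuation(finish_reason):
    """True when the previous response stopped at the output cap."""
    return finish_reason == "length"


def continuation_messages(history, partial_text,
                          prompt="Continue exactly where you left off."):
    """Return a new message list that asks the model to keep going."""
    # Append the truncated answer as an assistant turn, then a user turn
    # requesting the rest. The original history list is left unmodified.
    return history + [
        {"role": "assistant", "content": partial_text},
        {"role": "user", "content": prompt},
    ]


history = [{"role": "user", "content": "Write a very long report."}]
next_messages = continuation_messages(history, "Part one of the report...")
```

Send next_messages as the messages field of the follow-up request and concatenate the new text onto the partial output.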


Errors

The error envelope follows the OpenAI shape — HTTP status, plus a JSON body:

{
  "error": {
    "message": "...",
    "type": "invalid_request_error",
    "code": "..."
  }
}

Common cases:

| Status | When | Notes |
| --- | --- | --- |
| 400 | Bad request shape, unknown field, etc. | Same shape OpenAI returns |
| 401 | Missing / invalid API key | Re-issue a key at api.reapi.ai |
| 402 | Insufficient balance | Top up at api.reapi.ai |
| 429 | Per-group rate limit hit | Back off, or move to a different group |
| 500 | Upstream / gateway error | Safe to retry; failed calls are not charged |

api.reapi.ai does not internally retry chat requests. Every customer call maps to exactly one upstream POST. If a network error reaches you, that's a one-for-one wire failure and a retry from your side is safe; the upstream provider may have already produced output, but the gateway will not double-bill.
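
Because the gateway does not retry for you, a client-side wrapper is a natural fit. The sketch below retries the two statuses marked retryable in the table above (429 and 500) with exponential backoff. The send callable, the attempt count, and the delays are all assumptions for illustration, not recommended values from reAPI.

```python
import time

RETRYABLE_STATUSES = {429, 500}


def call_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status.

    `send` is a hypothetical zero-argument callable that performs one HTTP
    request and returns a (status_code, body) tuple.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status not in RETRYABLE_STATUSES or attempt == max_attempts:
            return status, body
        # Exponential backoff: base_delay, then 2x, 4x, ...
        sleep(base_delay * 2 ** (attempt - 1))
```

Statuses such as 400, 401, and 402 are returned immediately, since retrying them without changing the request, key, or balance would fail again. The sleep parameter exists so the backoff can be replaced in tests.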


Recipes

Minimum request

{
  "model": "gpt-5.5",
  "messages": [
    { "role": "user", "content": "Summarise the OpenAI Chat Completions spec in three sentences." }
  ]
}

Full parameter set

{
  "model": "gpt-5.5",
  "group": "default",
  "messages": [
    { "role": "system", "content": "You are a senior staff engineer." },
    { "role": "user",   "content": "Walk me through a 1M-token codebase review strategy." }
  ],
  "stream": true,
  "temperature": 0.7,
  "top_p": 1,
  "frequency_penalty": 0,
  "presence_penalty": 0
}

Tool use (function calling)

{
  "model": "gpt-5.5",
  "messages": [
    { "role": "user", "content": "What's the weather in Tokyo today?" }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
          "type": "object",
          "properties": { "city": { "type": "string" } },
          "required": ["city"]
        }
      }
    }
  ],
  "tool_choice": "auto"
}
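
When the model decides to call the tool, the OpenAI Chat Completions format returns an assistant message with a tool_calls array, and each result goes back as a role "tool" message carrying the matching tool_call_id. A minimal sketch of that round trip follows; the get_weather implementation, the registry, and the example message are hypothetical stand-ins, not output captured from the API.

```python
import json


def get_weather(city):
    """Hypothetical local tool; returns canned data instead of a real lookup."""
    return {"city": city, "forecast": "sunny"}


TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_calls(assistant_message):
    """Run each requested tool and build the role="tool" reply messages."""
    replies = []
    for call in assistant_message.get("tool_calls") or []:
        function = call["function"]
        arguments = json.loads(function["arguments"])  # sent as a JSON string
        result = TOOL_REGISTRY[function["name"]](**arguments)
        replies.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(result),
        })
    return replies


example_message = {  # shaped like an assistant turn with finish_reason "tool_calls"
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_example_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": "{\"city\": \"Tokyo\"}"},
    }],
}
replies = run_tool_calls(example_message)
```

Append the assistant message and the replies to messages, then call the endpoint again so the model can produce the final answer.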

Reasoning effort

{
  "model": "gpt-5.5",
  "reasoning_effort": "high",
  "messages": [
    { "role": "user", "content": "Prove that the sum of the first n odd numbers is n^2." }
  ]
}

reasoning_effort accepts none / low / medium / high / xhigh — pick the lowest level that still produces correct output for the workload to keep latency and token spend down.


Tips

  • Stream by default for chat UX. Streaming responses cut perceived latency dramatically and let your UI render tokens as they're produced.
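  • Watch the long-context boundary. Splitting a 300K-token prompt into a 270K-token request and a separate follow-up keeps both calls on the standard rate rather than paying the 2× / 1.5× long-context premium, provided the follow-up does not resend the first chunk as history (resent history counts toward that request's input tokens).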
  • Tune temperature or top_p, not both. Mixing them tends to produce results that are hard to reason about.
  • reasoning_effort: high is the right default for agents. Reserve xhigh for the genuinely hard turns — it adds latency and token spend.
  • Drop frequency_penalty and presence_penalty first when debugging weird output. Non-zero values can introduce artefacts that look like model bugs.

Related

  • Authentication
  • Quickstart
  • Errors catalog
