What Is Claude Opus 4.8? Anthropic's New Model Explained
2026/05/30

Claude Opus 4.8 is Anthropic's most capable model for reasoning and agentic coding. Here is what's new, its benchmarks, pricing, and how to access it.

Anthropic shipped Claude Opus 4.8 on May 28, 2026, an upgrade to Opus 4.7 that it describes as a modest but tangible improvement[1]. It is Anthropic's most capable model for complex reasoning and long-horizon agentic coding, and it launched at the same price as its predecessor. The headline change is not a benchmark leap but reliability: Claude Opus 4.8 is roughly four times less likely than Opus 4.7 to let a flaw in code it wrote pass without flagging it[1].

This guide explains what Claude Opus 4.8 is, what changed from 4.7, its published benchmarks, what it costs, and how to call it through an OpenAI-compatible API. Every number below comes from Anthropic's own announcement, model overview, and pricing pages, retrieved on May 30, 2026.

What is Claude Opus 4.8?

Claude Opus 4.8 is the latest model in Anthropic's Opus line, the tier aimed at the hardest reasoning and agentic-coding work. The API identifier is claude-opus-4-8, with no date suffix[2].

The core specs:

  • Context window: 1M tokens, billed at the standard per-token rate across the whole window[2].
  • Max output: 128K tokens on the synchronous API, up to 300K on the Batch API with a beta header[2].
  • Input modalities: text and image. Output is text. Vision is supported; Anthropic does not list audio or video input[2].
  • Knowledge cutoff: January 2026[2].

What's new compared to Opus 4.7

Anthropic frames Opus 4.8 as building on 4.7 rather than fully replacing it, and ships it at the same input and output price[1]. Three changes stand out.

  • Honesty. Opus 4.8 is about four times less likely than its predecessor to let a flaw in its own code pass without flagging it, and it is more willing to mark uncertainty instead of asserting unsupported claims[1].
  • A cheaper, faster fast mode. Fast mode now runs at $10 per million input tokens and $50 per million output, which Anthropic says is three times cheaper and 2.5 times faster than the previous generation's fast tier[1].
  • Effort control. The model defaults to high effort and can be pushed to extra or max, spending more tokens for better results on hard problems[2].

Core capabilities of Claude Opus 4.8

A 1M-token context window

Opus 4.8 reads up to a million tokens in one request, enough for a large codebase or a long document set, and a 900K-token request is billed at the same per-token rate as a 9K-token one[2]. On Microsoft Foundry the window is capped at 200K[2].
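
To make the flat-rate claim concrete, here is a minimal sketch that checks whether a prompt fits the window and prices its input at the standard $5 per million tokens. The platform labels and function names are hypothetical, chosen only for illustration; the window sizes and the rate are the figures quoted in this article.

```python
# Context window sizes in tokens, as quoted above.
# The dictionary keys are illustrative labels, not official platform IDs.
CONTEXT_WINDOW = {
    "anthropic": 1_000_000,         # standard 1M-token window
    "microsoft_foundry": 200_000,   # capped window on Microsoft Foundry
}

INPUT_PRICE_PER_MTOK = 5.00  # USD, standard input rate


def fits_window(input_tokens: int, platform: str = "anthropic") -> bool:
    """Return True if the prompt fits the platform's context window."""
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    return input_tokens <= CONTEXT_WINDOW[platform]


def input_cost_usd(input_tokens: int) -> float:
    """Price input tokens at one flat per-token rate, whatever the prompt size."""
    return input_tokens / 1_000_000 * INPUT_PRICE_PER_MTOK


if __name__ == "__main__":
    for tokens in (9_000, 900_000):
        print(tokens, fits_window(tokens), f"${input_cost_usd(tokens):.3f}")
    # The same 900K prompt does not fit the capped Foundry window.
    print(fits_window(900_000, "microsoft_foundry"))
```

The 900K-token prompt costs one hundred times as much as the 9K-token one only because it is one hundred times larger; the per-token rate is identical.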

Agentic coding and reasoning

The model leads Anthropic's published comparison on agentic coding and multidisciplinary reasoning. It scores 69.2% on SWE-Bench Pro and 57.9% on Humanity's Last Exam with tools, both ahead of Opus 4.7[1]. Full numbers are in the benchmark table below.

Effort control and adaptive thinking

Rather than a single extended-thinking toggle, Opus 4.8 uses adaptive thinking plus an effort setting. Leaving effort at the default high suits most work; raising it to extra or max trades more tokens for deeper reasoning on the hardest tasks[2].

Vision, tool use, and prompt caching

Opus 4.8 accepts image input alongside text, supports tool use and function calling, and works with prompt caching to cut the cost of repeated context[2]. These are the building blocks for agents that read screenshots, call tools, and carry long context across a session.
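
As a rough sketch of how these pieces combine, the snippet below builds a request body in the shape of Anthropic's Messages API: a large shared context marked for caching, one tool definition, and a user question. The `run_tests` tool and the helper name are hypothetical, and whether a particular gateway forwards `cache_control` unchanged is an assumption you should verify against its documentation. The code only constructs the payload; it does not send it.

```python
import json


def build_messages_payload(shared_context: str, question: str) -> dict:
    """Build a Messages-style request body with a cached system block and one tool."""
    return {
        "model": "claude-opus-4-8",
        "max_tokens": 1024,
        # The system prompt is a list of content blocks so that the large,
        # repeated context can carry a cache marker.
        "system": [
            {
                "type": "text",
                "text": shared_context,
                "cache_control": {"type": "ephemeral"},
            }
        ],
        # Hypothetical tool, defined with a JSON Schema for its input.
        "tools": [
            {
                "name": "run_tests",
                "description": "Run the test suite at the given path.",
                "input_schema": {
                    "type": "object",
                    "properties": {"path": {"type": "string"}},
                    "required": ["path"],
                },
            }
        ],
        "messages": [{"role": "user", "content": question}],
    }


if __name__ == "__main__":
    payload = build_messages_payload("<large repository dump>", "Why does test_login fail?")
    print(json.dumps(payload, indent=2))
```

Only the question changes between calls in a session, so the cached system block is the part that benefits from the lower cache-read rate.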

Claude Opus 4.8 benchmarks

These are Anthropic's own published results, comparing Opus 4.8 with Opus 4.7, GPT-5.5, and Gemini 3.1 Pro[1]. As with any vendor-run benchmark, treat them as a directional signal rather than a neutral audit.

| Benchmark | Opus 4.8 | Opus 4.7 | GPT-5.5 | Gemini 3.1 Pro |
|---|---|---|---|---|
| Agentic coding (SWE-Bench Pro) | 69.2% | 64.3% | 58.6% | 54.2% |
| Terminal coding (Terminal-Bench 2.1) | 74.6% | 66.1% | 78.2% | 70.3% |
| Reasoning (Humanity's Last Exam, no tools) | 49.8% | 46.9% | 41.4% | 44.4% |
| Reasoning (Humanity's Last Exam, with tools) | 57.9% | 54.7% | 52.2% | 51.4% |
| Computer use (OSWorld-Verified) | 83.4% | 82.8% | 78.7% | 76.2% |
| Knowledge work (GDPval-AA, score) | 1890 | 1753 | 1769 | 1314 |
| Financial analysis (Finance Agent v2) | 53.9% | 51.5% | 51.8% | 43.0% |

Opus 4.8 tops the table on every row except Terminal-Bench 2.1, where GPT-5.5 leads at 78.2%[1].
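
The summary above can be checked mechanically. This small sketch copies the scores from the table and reports, for each benchmark, which model leads and how far Opus 4.8 moved relative to Opus 4.7. The short row labels and function names are illustrative only.

```python
# Scores copied from the table above, in this column order.
MODELS = ["Opus 4.8", "Opus 4.7", "GPT-5.5", "Gemini 3.1 Pro"]
SCORES = {
    "SWE-Bench Pro": [69.2, 64.3, 58.6, 54.2],
    "Terminal-Bench 2.1": [74.6, 66.1, 78.2, 70.3],
    "HLE (no tools)": [49.8, 46.9, 41.4, 44.4],
    "HLE (with tools)": [57.9, 54.7, 52.2, 51.4],
    "OSWorld-Verified": [83.4, 82.8, 78.7, 76.2],
    "GDPval-AA": [1890, 1753, 1769, 1314],
    "Finance Agent v2": [53.9, 51.5, 51.8, 43.0],
}


def leader(row):
    """Return the name of the model with the highest score in a row."""
    best_index = max(range(len(row)), key=lambda i: row[i])
    return MODELS[best_index]


def summarize():
    """Map each benchmark to (leading model, Opus 4.8 gain over Opus 4.7)."""
    return {
        name: (leader(row), round(row[0] - row[1], 1))
        for name, row in SCORES.items()
    }


if __name__ == "__main__":
    for name, (top, gain) in summarize().items():
        print(f"{name}: leader = {top}, gain over Opus 4.7 = {gain}")
```

The gains are in percentage points for every row except GDPval-AA, which is reported as a raw score.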

Claude Opus 4.8 pricing

Pricing is per million tokens (MTok), unchanged from Opus 4.7[1][3]:

| Item | Price / MTok |
|---|---|
| Input | $5 |
| Output | $25 |
| Cache read | $0.50 |
| Cache write (5-minute) | $6.25 |
| Cache write (1-hour) | $10 |
| Batch input | $2.50 |
| Batch output | $12.50 |
| Fast mode input | $10 |
| Fast mode output | $50 |

The Batch API runs at half the standard input and output price, and prompt caching makes repeated context cheap to reread at $0.50 per million tokens[3].
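
For a sense of scale, the following sketch turns the price table into a small cost estimator and compares three ways of running the same workload: ten calls that each reuse a 200K-token context and produce 1K tokens of output. The workload, the usage keys, and the function name are illustrative assumptions, and the cached scenario assumes all ten calls land within the 5-minute cache lifetime.

```python
from typing import Mapping

# USD per million tokens, copied from the table above.
PRICES = {
    "input": 5.00,
    "output": 25.00,
    "cache_read": 0.50,
    "cache_write_5m": 6.25,
    "cache_write_1h": 10.00,
    "batch_input": 2.50,
    "batch_output": 12.50,
    "fast_input": 10.00,
    "fast_output": 50.00,
}


def cost_usd(usage: Mapping[str, int]) -> float:
    """Sum the cost of a usage record mapping price keys to token counts."""
    return sum(PRICES[item] * tokens for item, tokens in usage.items()) / 1_000_000


CALLS = 10
CONTEXT = 200_000   # tokens of shared context per call
OUTPUT = 1_000      # tokens of output per call

# Every call pays full price for the shared context.
uncached = cost_usd({"input": CALLS * CONTEXT, "output": CALLS * OUTPUT})

# First call writes the cache, the remaining nine read from it.
cached = cost_usd({
    "cache_write_5m": CONTEXT,
    "cache_read": (CALLS - 1) * CONTEXT,
    "output": CALLS * OUTPUT,
})

# Same uncached workload submitted through the Batch API.
batched = cost_usd({"batch_input": CALLS * CONTEXT, "batch_output": CALLS * OUTPUT})

if __name__ == "__main__":
    print(f"uncached: ${uncached:.2f}")  # $10.25
    print(f"cached:   ${cached:.2f}")    # $2.40
    print(f"batched:  ${batched:.3f}")   # $5.125
```

In this example caching cuts the bill by roughly three quarters, more than the Batch API's flat 50% discount, because the repeated context dominates the cost.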

How to access Claude Opus 4.8

You can call Claude Opus 4.8 directly through Anthropic, or reach it through reAPI's OpenAI-compatible gateway at Anthropic's standard rates. See the reapi.ai/models/claude-opus-4-8 page for live rates and a playground. The advantage of the gateway is one key and one client across providers: the same integration that calls GPT-5 and Gemini, plus reAPI's image and video models, also calls Claude Opus 4.8.

Point the OpenAI SDK at reAPI and set the model:

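from openai import OpenAI

# Point the standard OpenAI client at reAPI's OpenAI-compatible endpoint.
client = OpenAI(
    api_key="YOUR_REAPI_KEY",
    base_url="https://api.reapi.ai/v1",
)

resp = client.chat.completions.create(
    model="claude-opus-4-8",
    messages=[
        {"role": "user", "content": "Refactor this function and explain the change."},
    ],
    # extra_body adds fields to the JSON request body; "group" is specific to reAPI.
    extra_body={"group": "default"},
)
# Print the text of the first (and here only) completion choice.
print(resp.choices[0].message.content)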

The native Anthropic /v1/messages surface is available too, so SDKs written for either format work without a rewrite. Input runs $5 per million tokens and output $25, the same rates Anthropic charges directly.

Who should use Claude Opus 4.8

Opus 4.8 is built for work where a mistake is expensive and context is large:

  • Agentic coding. Long-horizon tasks where the model plans, edits, and verifies its own work, and where the honesty gains reduce silent bugs.
  • Large-context analysis. Reading an entire codebase or document set in one request, up to a million tokens.
  • Tool-using agents. Workflows that call tools, read screenshots, and carry state across many steps.
  • High-stakes reasoning. Research, financial analysis, and other work where flagged uncertainty beats a confident wrong answer.

For high-volume, low-complexity calls, a smaller and cheaper model is usually the better fit; Opus is the tier you reach for when capability matters more than cost per token.

Why Claude Opus 4.8 matters in 2026

The interesting part of Opus 4.8 is not a single benchmark. It is that Anthropic chose to spend a release on reliability, cutting how often the model lets its own mistakes slide, rather than chasing a leaderboard. For agentic coding, where a model edits real code across many steps, that kind of honesty compounds: fewer silent bugs to catch downstream. Paired with a million-token window and a cheaper fast mode, Opus 4.8 is aimed squarely at teams building agents they have to trust.

FAQ

When was Claude Opus 4.8 released?

May 28, 2026, as an upgrade to Claude Opus 4.7, at the same price[1].

What is the Claude Opus 4.8 context window?

1M tokens, billed at the standard per-token rate across the full window. Maximum output is 128K tokens, or up to 300K on the Batch API[2].

How much does Claude Opus 4.8 cost?

$5 per million input tokens and $25 per million output, with cache reads at $0.50 and the Batch API at half price[3]. Fast mode is $10 input and $50 output[1].

Is Claude Opus 4.8 multimodal?

It accepts text and image input and returns text, with vision supported. Anthropic does not list audio or video input for the model[2].

How is Claude Opus 4.8 different from Opus 4.7?

The price is the same. Opus 4.8 is more honest (about four times less likely to let flaws in its own code pass unflagged), has a cheaper and faster fast mode, and posts small gains across Anthropic's benchmark suite[1].

How do I call Claude Opus 4.8 with the OpenAI SDK?

Point the OpenAI client at https://api.reapi.ai/v1, set model to claude-opus-4-8, and send a normal chat request. reAPI exposes Claude through an OpenAI-compatible endpoint at Anthropic's standard rates.

The short version

Claude Opus 4.8 is an incremental but real upgrade: the same price as Opus 4.7, a million-token context, and a deliberate focus on honesty that matters most for agentic coding. If you are building agents that edit code or reason over large context, Claude Opus 4.8 is the current top of Anthropic's line, and you can reach it through an OpenAI-compatible API alongside every other model you call.

Further reading

  • reapi.ai/docs/claude-opus-4-8 — endpoint reference and SDK examples.
  • What is reAPI? — the unified API behind one key for chat, image, and video.
  • reapi.ai/models — browse the full model catalog.

References

  1. Anthropic. Introducing Claude Opus 4.8. Retrieved May 2026 from anthropic.com/news/claude-opus-4-8
  2. Anthropic. Models overview — context, output, modality, and capabilities. Retrieved May 2026 from platform.claude.com/docs/en/about-claude/models/overview
  3. Anthropic. Pricing — token, cache, and batch rates. Retrieved May 2026 from platform.claude.com/docs/en/about-claude/pricing

Author: reAPI Team
Category: Guides