Anthropic released Claude Opus 4.8 on May 28, just 41 days after Opus 4.7. The model ships across the Claude API, Claude Code, claude.ai, and all major cloud platforms at the same price as its predecessor. The benchmark gains: agentic coding climbs from 64.3% to 69.2% (up 4.9 points), multidisciplinary reasoning with tools rises from 54.7% to 57.9% (up 3.2 points), and the knowledge-work score moves from 1753 to 1890 (up 137). Anthropic also exposed a per-message Effort Control, including an 'xhigh' setting for tasks that need extra computation, and a 'Fast Mode' that runs at roughly 2.5x the speed of the standard configuration.
The headline capability is Dynamic Workflows, now in research preview. A user can ask Opus 4.8 to author a workflow and it will orchestrate work across tens to hundreds of background subagents in parallel. Paired with Claude Code, Anthropic is positioning this for end-to-end codebase migrations, large-scale refactors, and research sweeps that previously required either custom agent scaffolding or a human keeping dozens of windows open. Early testers note the model flags its own uncertainty more readily and makes fewer unsupported claims — the kind of behavior change that matters more in production than a benchmark delta.
The 41-day cadence is the story behind the story. Anthropic shipped Opus 4.5 in November 2025, 4.6 in February, 4.7 in mid-April, and 4.8 this week. Each cycle has been shorter than the last, and each has held price flat while pushing the frontier on coding and agent tasks. The competitive frame is clear: Google shipped Gemini Omni and Gemini 3.5 Flash at I/O ten days ago, OpenAI is iterating on GPT-5.5, and Anthropic is racing the same calendar. Rolling Opus releases also keep Claude Code's defaults current without forcing every enterprise customer through a migration.
A takeaway for learners: when frontier vendors ship new flagships every six weeks, the right skill is not memorizing today's model card — it's writing evals you own. Pick the three tasks that actually matter for your work, save the inputs and the rubric, and re-run them on each new release. You will learn faster from a single repeated test than from a hundred Twitter takes about whether the latest model is 'better.'
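To make that concrete, here is a minimal sketch in Python of what "evals you own" could look like. Everything in it is an illustrative assumption rather than any vendor's tooling: the names `EvalCase`, `run_suite`, and `compare` are hypothetical, the `model` argument is a placeholder for whatever function you write to call a given release, and the rubric is reduced to a list of phrases a passing answer must contain. A real rubric will usually need human or model grading.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Callable


@dataclass
class EvalCase:
    name: str
    prompt: str              # the saved input
    must_include: list[str]  # the saved rubric (simplified to required phrases)


def save_cases(cases: list[EvalCase], path: Path) -> None:
    """Persist inputs and rubric so every release sees the same test."""
    path.write_text(json.dumps([asdict(c) for c in cases], indent=2))


def load_cases(path: Path) -> list[EvalCase]:
    return [EvalCase(**raw) for raw in json.loads(path.read_text())]


def score(case: EvalCase, answer: str) -> float:
    """Fraction of rubric phrases found in the answer (case-insensitive)."""
    if not case.must_include:
        return 0.0
    text = answer.lower()
    hits = sum(1 for phrase in case.must_include if phrase.lower() in text)
    return hits / len(case.must_include)


def run_suite(
    cases: list[EvalCase],
    model: Callable[[str], str],  # placeholder: your own wrapper around a release
    label: str,
    out_dir: Path,
) -> dict[str, float]:
    """Run every case against one release and save the scores under its label."""
    results = {c.name: score(c, model(c.prompt)) for c in cases}
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / f"{label}.json").write_text(json.dumps(results, indent=2))
    return results


def compare(old: dict[str, float], new: dict[str, float]) -> dict[str, float]:
    """Per-task score change between two releases (positive means improvement)."""
    return {name: new[name] - old.get(name, 0.0) for name in new}


if __name__ == "__main__":
    # Stub "models" stand in for real API calls so the sketch runs offline.
    cases = [EvalCase("greeting", "Say hello to the world.", ["hello", "world"])]
    previous = run_suite(cases, lambda p: "Hello there.", "previous", Path("eval_runs"))
    latest = run_suite(cases, lambda p: "Hello, world!", "latest", Path("eval_runs"))
    print(compare(previous, latest))
```

The design choice worth keeping, whatever the scoring method, is that the cases and the rubric live in files you control and each release's results are saved under its own label, so the comparison is between like and like.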