Follow up on the latest improvements and updates.

July 24, 2026

new

Now Available: Claude Opus 5 from Anthropic

Claude Opus 5, Anthropic's latest flagship model for coding, reasoning, and knowledge work, is now available through DigitalOcean Inference Engine. Designed for demanding agentic workflows, Opus 5 delivers frontier-level performance with improved cost efficiency, offering intelligence close to Claude Fable 5 at half the price. It excels at complex software engineering, scientific research, and business automation tasks, while supporting long-context reasoning, tool use, and structured outputs.

Access the model ->

July 23, 2026

new

Now Available: model synthesis, a new server-side tool for DigitalOcean Inference Engine (public preview)

Model synthesis is a new opt-in server-side tool in DigitalOcean Inference Engine. It runs multiple models on the same inference request and returns a single synthesized response, with no custom orchestration required. Built for complex, high-reasoning tasks, it works with the existing Chat Completions, Responses, and Messages APIs. Choose an optimized preset or configure the panel and synthesizer directly.
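
To make the panel-and-synthesizer idea concrete, here is a minimal sketch of the client-side orchestration this tool is meant to replace. It is not the Inference Engine API: the function names (synthesize, model_a, model_b, join_drafts) are hypothetical, and the "models" are local stub functions rather than real inference calls.

```python
from typing import Callable, Sequence

# Hypothetical type aliases: a "model" maps a prompt to a draft answer,
# and a "synthesizer" merges the drafts into one response.
Model = Callable[[str], str]
Synthesizer = Callable[[str, Sequence[str]], str]


def synthesize(prompt: str, panel: Sequence[Model], synthesizer: Synthesizer) -> str:
    """Send the same prompt to every panel model, then merge the drafts."""
    if not panel:
        raise ValueError("panel must contain at least one model")
    drafts = [model(prompt) for model in panel]
    return synthesizer(prompt, drafts)


# Stub models standing in for real inference calls (assumptions for illustration).
def model_a(prompt: str) -> str:
    return f"A: {prompt.upper()}"


def model_b(prompt: str) -> str:
    return f"B: {prompt[::-1]}"


# Stub synthesizer: a real one would be another model call that reconciles
# the drafts; this one only joins them so the data flow is visible.
def join_drafts(prompt: str, drafts: Sequence[str]) -> str:
    return " | ".join(drafts)


if __name__ == "__main__":
    print(synthesize("hi", [model_a, model_b], join_drafts))
```

With the server-side tool, this fan-out and merge step happens inside a single inference request instead of in application code.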

Get started with the model synthesis tool ->

July 10, 2026

new

Now Available: GPT-5.6 Sol, Luna, and Terra

OpenAI's GPT-5.6 family, Sol (flagship), Terra (balanced), and Luna (fast, cost-efficient), is now available through DigitalOcean Serverless Inference. Sol delivers state-of-the-art performance across coding, knowledge work, cybersecurity, and scientific reasoning, while Terra provides a balanced option for everyday production workloads and Luna is the fastest, most affordable model in the family. New max reasoning and ultra mode help tackle complex, multi-step tasks.

Access the new models now ->

July 1, 2026

new

Now Available: Prompt Caching in Public Preview

With Prompt Caching, DigitalOcean automatically recognizes repeated token prefixes and currently offers an 80% discount on cached input tokens compared to standard input pricing across supported models. See the pricing page for current rates, applicable terms, and any pricing updates. The result is lower costs and faster time-to-first-token, with no changes to your application code. Caching applies automatically, and every API response includes a cached_tokens count so you can verify your savings in real time. It is available today across a broad set of leading open models, with more models coming soon.
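
As a rough illustration of how the cached_tokens count translates into savings, the sketch below applies the announced 80% discount to the cached portion of a request's input tokens. The function name input_cost, the InputCost type, and the $1.00-per-million price are hypothetical, and the sketch assumes cached tokens are counted as a subset of the total input tokens. Check the pricing page for actual rates.

```python
from dataclasses import dataclass

# Discount on cached input tokens as stated in the announcement;
# treat as subject to change.
CACHE_DISCOUNT = 0.80


@dataclass
class InputCost:
    uncached_tokens: int
    cached_tokens: int
    cost: float      # what the input tokens cost with caching
    savings: float   # difference versus paying full price for all tokens


def input_cost(
    prompt_tokens: int,
    cached_tokens: int,
    price_per_million: float,
    discount: float = CACHE_DISCOUNT,
) -> InputCost:
    """Estimate input cost, assuming cached_tokens is a subset of prompt_tokens."""
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    rate = price_per_million / 1_000_000
    uncached = prompt_tokens - cached_tokens
    full_price = prompt_tokens * rate
    cost = uncached * rate + cached_tokens * rate * (1 - discount)
    return InputCost(uncached, cached_tokens, cost, full_price - cost)


if __name__ == "__main__":
    # Hypothetical request: 10,000 input tokens, 8,000 of them cached,
    # at an illustrative $1.00 per million input tokens.
    result = input_cost(10_000, 8_000, price_per_million=1.00)
    print(f"cost=${result.cost:.4f} savings=${result.savings:.4f}")
```

In this example the 2,000 uncached tokens are billed at the full rate and the 8,000 cached tokens at one fifth of it, so the input cost falls from $0.0100 to $0.0036.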

Learn more ->

July 1, 2026

new

Now Generally Available: DigitalOcean Evaluations

Teams can now validate any model or inference router configuration on their own data before production. Run structured LLM-as-a-Judge evaluations across catalog models, fine-tuned models, BYOM imports, and router setups without stitching together a separate evaluation stack.

Access evaluations now ->

July 1, 2026

new

Now Available: Claude Sonnet 5

Claude Sonnet 5, Anthropic’s latest model for coding, autonomous agents, and professional work at scale, is now available through DigitalOcean Serverless Inference. Designed for production AI applications, Sonnet 5 can plan, use tools, and complete complex tasks while delivering near-Opus 4.8 performance on leading agentic benchmarks, including SWE-bench, BrowseComp, and OSWorld-Verified, at Sonnet pricing.

Access the new model now ->

June 30, 2026

new

Now Available: OIDC Authentication for Managed Kubernetes

DOKS clusters can now authenticate users through an external OpenID Connect provider. Each cluster has its own independent configuration managed via doctl, so dev, staging, and production environments can each enforce distinct access policies. Token issuance and revocation are handled directly by the identity provider (IdP), so deactivating a user there removes their cluster access without manual credential rotation.

Configure SSO for your cluster ->

June 25, 2026

new

Now Available: GLM-5.1 and GLM-5.2 from Z.ai

GLM-5.1 and GLM-5.2, the latest open-source models from Z.ai for autonomous software engineering and repository-scale reasoning, are now available through DigitalOcean Inference Engine. GLM-5.1 is optimized for long-running agentic coding workflows and can sustain complex autonomous tasks for up to 8 hours, while GLM-5.2 introduces a 1M-token context window and dual reasoning modes for large-scale code analysis and refactoring. Both models support tool calling, structured output, and multilingual applications.

Access the models now ->

June 17, 2026

new

Now in Public Preview: DigitalOcean Server-Side Tools

Server-side tools simplify how AI applications and agents are built by embedding tool use directly into the Inference Engine, removing the need to stitch together separate APIs, credentials, and orchestration layers. You can enable models to use web search, web fetch, web mode, knowledge bases, MCP servers, and existing Anthropic/OpenAI tools directly within inference using your existing DigitalOcean access key.

Enable the new tools now ->

June 9, 2026

new

Now Available: Anthropic Claude Fable 5

Claude Fable 5 is now available through DigitalOcean Serverless Inference, bringing Anthropic's most advanced generally available model for autonomous knowledge work, coding, and complex agentic workflows. It delivers frontier reasoning, stronger first-shot accuracy, enhanced debugging and self-correction capabilities, advanced vision understanding, and enterprise-ready safety features.

Access the model ->