new

Now Available: Prompt Caching in Public Preview

With Prompt Caching, DigitalOcean automatically recognizes repeated token prefixes and currently offers an 80% discount on cached input tokens compared to standard input pricing across supported models. See the pricing page for current rates, applicable terms, and any pricing updates. The result is lower costs and faster time-to-first-token, with no changes to your application code. Caching applies automatically, and every API response includes a cached_tokens count so you can verify your savings in real time. Prompt Caching is available today across a broad set of leading open models, with more models coming soon.
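To make the savings check concrete, here is a minimal Python sketch of reading the cached_tokens count from a parsed response body and estimating input cost at the 80% discount. Several things here are assumptions rather than documented behavior: the field location (usage.prompt_tokens_details.cached_tokens, the layout used by OpenAI-compatible APIs, with a fallback to usage.cached_tokens), the helper names, the sample payload, and the price of 1.00 per million input tokens. Check the API reference and pricing page for the actual response shape and rates.

```python
from typing import Any, Mapping

# From the announcement: cached input tokens are currently billed at an 80% discount.
CACHED_DISCOUNT = 0.80


def cached_token_count(response: Mapping[str, Any]) -> int:
    """Return the cached_tokens count from a parsed response body, or 0 if absent.

    ASSUMPTION: the count lives at usage.prompt_tokens_details.cached_tokens
    (OpenAI-compatible layout), falling back to usage.cached_tokens.
    """
    usage = response.get("usage") or {}
    details = usage.get("prompt_tokens_details") or {}
    value = details.get("cached_tokens", usage.get("cached_tokens", 0))
    return int(value or 0)


def input_cost(prompt_tokens: int, cached_tokens: int, price_per_million: float) -> float:
    """Estimate input cost when cached tokens are billed at the discounted rate."""
    uncached = prompt_tokens - cached_tokens
    billable = uncached + cached_tokens * (1 - CACHED_DISCOUNT)
    return billable * price_per_million / 1_000_000


# Hypothetical response body, for illustration only.
sample_response = {
    "usage": {
        "prompt_tokens": 10_000,
        "prompt_tokens_details": {"cached_tokens": 8_000},
    }
}

HYPOTHETICAL_PRICE = 1.00  # assumed price per million input tokens, not a real rate

cached = cached_token_count(sample_response)
prompt = sample_response["usage"]["prompt_tokens"]
with_cache = input_cost(prompt, cached, HYPOTHETICAL_PRICE)
without_cache = input_cost(prompt, 0, HYPOTHETICAL_PRICE)
savings_fraction = 1 - with_cache / without_cache

print(f"cached tokens: {cached} of {prompt}")
print(f"estimated input savings: {savings_fraction:.0%}")
```

In this made-up example, 8,000 of 10,000 prompt tokens are served from cache, so the input bill drops by roughly 64% (80% of the tokens at an 80% discount). Output tokens are not covered by this estimate.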

Learn more ->