How to Manage Claude Tokens
$ API Cost Control

Use prompt caching

Pay ~10% for the parts that don't change

Prompt caching lets you mark a portion of your prompt so Anthropic caches it server-side. Subsequent calls that reuse the same cached prefix pay roughly 90% less for those tokens (writing to the cache costs slightly more than uncached input, so caching pays off once a prefix is reused).

When to use it

  • Large system prompts that don't change between requests
  • Reference documents you pass on every call
  • Tool definitions that are static

Add a cache_control field to the last content block of the prefix you want cached; everything up to and including that block becomes the cached prefix. Caches expire after ~5 minutes of inactivity, and the TTL refreshes each time the prefix is reused. Check the Anthropic docs for the latest cache behavior and minimum cacheable prompt lengths.
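To make the placement concrete, here is a minimal sketch of a request payload in the shape used by the Anthropic messages API. The payload is built as a plain dict so the cache_control placement is easy to see; the model name and prompt text are placeholders, not recommendations.

```python
def build_request(system_prompt: str, user_message: str) -> dict:
    """Build a messages.create-style payload with the system prompt cached.

    The cache_control field on the system block marks the end of the
    stable prefix: everything up to and including that block is cached
    server-side and billed at the reduced rate on subsequent calls.
    """
    return {
        "model": "claude-sonnet-4-5",  # placeholder model name
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": system_prompt,
                # Mark this block (and everything before it) as cacheable:
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [{"role": "user", "content": user_message}],
    }


req = build_request("You are a support assistant for Acme Corp...", "Hello")
print(req["system"][0]["cache_control"])  # {'type': 'ephemeral'}
```

The key design point: put static content (system prompt, reference docs, tool definitions) first, dynamic content (the user's message) last, so the cached prefix stays byte-identical across calls.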

operator note

If your system prompt is 2,000 tokens and you make 1,000 calls/day, those calls consume 2M input tokens/day on the prompt alone; caching bills them at ~10% of the base rate, saving the equivalent of ~1.8M input tokens per day.

Changelog · 1
  • Initial release — 5 sections, 11 lessons.