How to Manage Claude Tokens
$ API Cost Control
Use prompt caching
Pay 10% on the parts that don't change
Prompt caching lets you mark a prefix of your prompt so Anthropic caches it server-side. Subsequent calls that reuse the exact same cached prefix pay roughly 90% less for those input tokens.
When to use it
- Large system prompts that don't change between requests
- Reference documents you pass on every call
- Tool definitions that are static
Add `cache_control` to the last content block you want cached; everything up to and including that block becomes the cached prefix. Caches last ~5 minutes by default. Check the Anthropic docs for the latest cache behavior and pricing.
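A minimal sketch of where `cache_control` goes in a Messages API request. The model name and prompt text are placeholders (assumptions); the payload shape mirrors what the official `anthropic` SDK's `messages.create(...)` accepts.

```python
# Sketch: place cache_control on the static system prompt block so the
# server caches everything up to and including that block.
STATIC_SYSTEM_PROMPT = "You are a support agent for Acme Co. ..."  # placeholder; imagine ~2,000 tokens

def build_request(user_message: str) -> dict:
    """Build a Messages API payload with a cacheable system prefix."""
    return {
        "model": "claude-sonnet-4-20250514",  # placeholder model name
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": STATIC_SYSTEM_PROMPT,
                # Marks the prefix boundary for server-side caching.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [{"role": "user", "content": user_message}],
    }

# With the official SDK this maps onto:
#   anthropic.Anthropic().messages.create(**build_request("Hello"))
payload = build_request("What's my order status?")
```

Only the static prefix carries `cache_control`; the per-request user message goes in `messages` as usual, so every call reuses the cached prefix.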
Operator note
If your system prompt is 2,000 tokens and you're making 1,000 calls/day, caching it saves the cost of ~1.8M input tokens per day (cached reads are billed at roughly 10% of the base input rate).
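The arithmetic behind that estimate, as a back-of-envelope sketch. The 90% read discount matches the figure quoted above; note that cache writes are billed slightly above the base rate, which this rough calculation ignores.

```python
# Back-of-envelope savings from the operator note above.
system_tokens = 2_000        # cached system prompt size
calls_per_day = 1_000
cache_read_discount = 0.90   # cached reads cost ~10% of base input price

tokens_billed_without_cache = system_tokens * calls_per_day  # 2M tokens/day
tokens_saved = int(tokens_billed_without_cache * cache_read_discount)
print(tokens_saved)  # 1,800,000 token-equivalents of input cost saved per day
```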
Changelog · 1
- Initial release — 5 sections, 11 lessons.