How to Manage Claude Tokens
$ API Cost Control

Use prompt caching

Pay ~10% for the parts that don't change

Prompt caching lets you mark a portion of your prompt so Anthropic caches it server-side. Subsequent calls that reuse the same cached prefix pay roughly 90% less for those tokens (writing to the cache costs slightly more than uncached input, so caching pays off once a prefix is reused).

When to use it

  • Large system prompts that don't change between requests
  • Reference documents you pass on every call
  • Tool definitions that are static

Add a cache_control field to the last content block of the prefix you want cached; everything up to and including that block becomes the cached prefix. Caches expire after ~5 minutes of inactivity, and the TTL refreshes each time the prefix is reused. Check the Anthropic docs for the latest cache behavior and minimum cacheable prompt lengths.
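To make the placement concrete, here is a minimal sketch of a request payload in the shape used by the Anthropic messages API. The payload is built as a plain dict so the cache_control placement is easy to see; the model name and prompt text are placeholders, not recommendations.

```python
def build_request(system_prompt: str, user_message: str) -> dict:
    """Build a messages.create-style payload with the system prompt cached.

    The cache_control field on the system block marks the end of the
    stable prefix: everything up to and including that block is cached
    server-side and billed at the reduced rate on subsequent calls.
    """
    return {
        "model": "claude-sonnet-4-5",  # placeholder model name
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": system_prompt,
                # Mark this block (and everything before it) as cacheable:
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [{"role": "user", "content": user_message}],
    }


req = build_request("You are a support assistant for Acme Corp...", "Hello")
print(req["system"][0]["cache_control"])  # {'type': 'ephemeral'}
```

The key design point: put static content (system prompt, reference docs, tool definitions) first, dynamic content (the user's message) last, so the cached prefix stays byte-identical across calls.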

operator note

If your system prompt is 2,000 tokens and you make 1,000 calls/day, those calls consume 2M input tokens/day on the prompt alone; caching bills them at ~10% of the base rate, saving the equivalent of ~1.8M input tokens per day.

Changelog · 1
  • Initial release — 5 sections, 11 lessons.