New: Prompt caching with Claude. Prompt caching lets you instantly steer model responses with longer and more instructive prompts, all while reducing costs by up to 90%. Available in beta on the Anthropic API today: https://lnkd.in/gmj22PYH.
With prompt caching, you can reuse a book’s worth of context across multiple API requests. This can also reduce latency by up to 85% on long prompts.
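Here is a minimal sketch of what this looks like with the Python SDK (pip install anthropic). The model name, file path, and the beta opt-in header are illustrative assumptions; check the docs linked above for current values:

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Placeholder: any long document you want to reuse across requests.
book_text = open("book.txt").read()

system_blocks = [
    {"type": "text", "text": "You answer questions about the book below."},
    {
        "type": "text",
        "text": book_text,
        # Everything up to and including this block becomes a cacheable prefix.
        "cache_control": {"type": "ephemeral"},
    },
]

response = client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    system=system_blocks,
    messages=[{"role": "user", "content": "Who is the narrator?"}],
    # Opt-in header while prompt caching is in beta.
    extra_headers={"anthropic-beta": "prompt-caching-2024-07-31"},
)
print(response.content[0].text)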
Use cases include conversational assistants, document processing, agentic tool use, and talking to books, papers, documentation, podcast transcripts, and other long-form content.
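Continuing the sketch above: a second request that sends the byte-identical cached prefix reads from the cache instead of reprocessing it, which is where the cost and latency savings come from. The usage field names below are assumptions based on the beta documentation:

follow_up = client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    system=system_blocks,  # identical prefix -> cache hit
    messages=[{"role": "user", "content": "Summarize the final chapter."}],
    extra_headers={"anthropic-beta": "prompt-caching-2024-07-31"},
)
print(follow_up.usage.cache_read_input_tokens)      # tokens served from cache
print(follow_up.usage.cache_creation_input_tokens)  # tokens newly written to cache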