What it does
/cost displays your live token spend across the current Claude Code session, broken down by model, input vs output tokens, and cache metrics. It shows both the raw token count and USD cost for the current window, plus cumulative totals if you've been running for a while. The breakdown includes cache read tokens (charged at 10% of input token rate) and cache creation tokens (charged at full input rate), so you can see exactly where your bill is going.
Run it anytime to check spend before it accumulates, or watch it grow as you run agents and workflows in the background.
When to use it
- Before spinning up expensive work — multi-agent workflows, web searches, or orchestrated tasks can consume hundreds of thousands of tokens. A quick
/costcheck beforehand tells you what the session has already burned and gives you a baseline. - Troubleshooting runaway costs — if a loop or workflow is consuming more than expected,
/costshows the token profile (is it input-heavy? output-heavy? lots of cache creation?) so you can adjust your prompt or agent count. - Understanding cache behaviour — when you resume a workflow or repeat a command with the same context, you should see cache read tokens instead of fresh input tokens. If the numbers look wrong,
/costconfirms whether caching is working. - Tuning model choice — comparing a Haiku task vs a Sonnet task? Run each, check
/cost, and decide if the quality uplift is worth the 4–5x cost difference.
Try it yourself
Open a session and run an agent or workflow — maybe /code-review --high on a pending PR, or a quick Agent call with a complex prompt. While it's running or after it completes, type /cost to see the breakdown. The output shows input tokens, output tokens, cache reads (if any), and the total USD spent — easy to spot if a single command went unexpectedly high.
Gotchas
Cache reads are real but small. You only see significant cache read tokens if you're resuming a workflow with the resumeFromRunId flag or repeating identical prompts with a large prior context. A fresh /code-review call won't show cache reads because the codebase context is new each time.
Output tokens are pricier than input. Most Claude models charge 3–5x more per output token than input token. A 10k-token response costs more than a 50k-token input prompt. If your bill is high, look at output volume first — long-form agent responses, verbose logs, or streaming many tokens fast will drive costs up faster than large inputs.
Cache creation is not free. When you first upload a large context (like a full codebase in Explore mode or a multi-file workflow), the cache creation tokens are charged at full input rate, not discounted. You only recover that cost if you re-use that context multiple times in a 5-minute window.
The cost display is USD but your billing may differ. /cost always shows USD based on the official Claude API pricing. If you're on a different billing region, subscription tier, or volume contract, your actual cost may vary — check your billing dashboard for your specific rate.
Try it yourself
Type the command in the fake terminal. Nothing leaves your browser.