Agent Context Planner

Add your fixed context blocks (system prompt, tool schemas, retrieved documents) and see how many tokens they consume. The planner calculates the remaining budget and the maximum number of conversation turns, and warns when you're running low.

Model & Turn Settings

Tokens to reserve for the model's final response. Reduces available turns.

Context Blocks

Click a block type above to add context blocks.

Context usage

Healthy
0.8% used, 127,000 remaining

Total used: 1,000 tokens

Context window: 128,000 tokens

Max turns: 211 (at avg turn size)

Per turn cost: 600 tokens/turn

Token breakdown

Reserved output: 1,000 (0.8%)

💡 Recommendations

· Context budget looks healthy. You have room for long multi-turn conversations.

Context Budget Guidelines

Under 50%: ✅ Healthy

Plenty of room for long conversations. Good for interactive agents.

50–75%: ⚠ Watch

Monitor context usage. Consider compressing old turns in long sessions.

Over 75%: 🔴 Act

Few turns available. Trim fixed blocks or switch to a larger context model.
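
The three bands above can be expressed as a small classifier. This is only a sketch of the thresholds as stated in the guidelines; the function name `budget_status` and the handling of the exact 50% and 75% boundaries are assumptions, not the planner's actual code.

```python
def budget_status(used_tokens: int, context_window: int) -> str:
    """Map context usage to the guideline bands: Healthy, Watch, or Act."""
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    usage = used_tokens / context_window
    if usage < 0.50:
        return "Healthy"  # under 50%
    if usage <= 0.75:
        return "Watch"    # 50-75% (boundary handling is an assumption)
    return "Act"          # over 75%


# The example figures shown on this page: 1,000 of 128,000 tokens used.
print(budget_status(1_000, 128_000))
```

With the page's example figures (about 0.8% used), the call falls in the first band.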

Tips for reducing context usage

  • Trim your system prompt: every word costs tokens. Aim for under 500 tokens for simple agents.
  • Compress tool schemas: remove verbose descriptions and use short field names where possible.
  • Use retrieval-augmented generation (RAG) instead of stuffing entire documents into context.
  • Summarise conversation history after N turns instead of keeping the full transcript.
  • Choose a model with a larger context window if your use case requires more fixed context.
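
As an illustration of the history-summarisation tip, the sketch below keeps the most recent turns verbatim and replaces everything older with a single summary entry. The function `compact_history` and its `summarize` callback are hypothetical names introduced here; in practice the callback would typically call a model, while this example uses a trivial stand-in.

```python
from typing import Callable, List


def compact_history(
    turns: List[str],
    keep_last: int,
    summarize: Callable[[List[str]], str],
) -> List[str]:
    """Replace all but the last `keep_last` turns with one summary entry."""
    if keep_last < 0:
        raise ValueError("keep_last must be non-negative")
    if len(turns) <= keep_last:
        return list(turns)  # nothing old enough to summarise
    split = len(turns) - keep_last
    older, recent = turns[:split], turns[split:]
    return [summarize(older)] + recent


# Stand-in summariser (assumption): a real one would produce an actual summary.
def stub_summary(old_turns: List[str]) -> str:
    return f"[summary of {len(old_turns)} earlier turns]"


history = ["turn 1", "turn 2", "turn 3", "turn 4"]
print(compact_history(history, keep_last=2, summarize=stub_summary))
```

The summary entry still costs tokens, so the saving depends on how short the summary is relative to the turns it replaces.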

Privacy: This tool runs entirely in your browser. Your content is never uploaded or stored.

Frequently Asked Questions

What is context budget planning for AI agents?
Every AI model has a context window: the maximum number of tokens it can process at once. In an agent session, this budget must be shared between fixed blocks (system prompt, tool schemas, retrieved documents) and dynamic conversation turns. Context budget planning calculates how many conversation turns can fit before the window is exhausted.
What are context blocks?
Context blocks are the fixed content that takes up space in every agent invocation regardless of the conversation. Typical blocks: system prompt, tool definitions/schemas, few-shot examples, retrieved documents from RAG systems, and conversation history summaries.
How is token count estimated?
Tokens are estimated at approximately 1 token per 4 characters of English text (the standard rule of thumb from OpenAI documentation). This gives an estimate accurate to roughly ±10-20%, which is enough for planning purposes.
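
A minimal sketch of the 4-characters-per-token rule of thumb follows. The function name `estimate_tokens` and the choice to round up are assumptions made for illustration; the planner's own rounding may differ, and a real tokenizer will give different counts for code or non-English text.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate: about one token per 4 characters of English text."""
    if chars_per_token <= 0:
        raise ValueError("chars_per_token must be positive")
    # Rounding up (an assumption) keeps the estimate slightly conservative.
    return math.ceil(len(text) / chars_per_token)


system_prompt = "You are a helpful assistant. " * 10
print(estimate_tokens(system_prompt))
```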
What does "reserved for final output" mean?
Most model APIs have a separate max_tokens or max_output_tokens parameter. If you set it to 2000, up to 2000 tokens of the context window are set aside for the response. Reserving this amount prevents the calculation from being overly optimistic.
What does "max turns" mean?
Max turns is how many complete user+assistant exchange pairs fit in the remaining context budget after subtracting fixed blocks and reserved output. For example: if 40,000 tokens remain and each turn uses 600 tokens on average, approximately 66 turns fit.
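
The calculation described above can be sketched as follows, assuming whole turns only (integer division) and a floor of zero when the fixed blocks already exceed the window. The function and parameter names are illustrative, not taken from the planner's source.

```python
def max_turns(
    context_window: int,
    fixed_tokens: int,
    reserved_output: int,
    avg_turn_tokens: int,
) -> int:
    """Number of complete user+assistant turns that fit in the remaining budget."""
    if avg_turn_tokens <= 0:
        raise ValueError("avg_turn_tokens must be positive")
    remaining = context_window - fixed_tokens - reserved_output
    # Only complete turns count, and an overdrawn budget yields zero turns.
    return max(0, remaining // avg_turn_tokens)


# FAQ example: 40,000 tokens remaining at 600 tokens per turn.
print(max_turns(128_000, 87_000, 1_000, 600))  # 66
# Page defaults: no fixed blocks, 1,000 reserved, 600 tokens per turn.
print(max_turns(128_000, 0, 1_000, 600))  # 211
```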
Is my content uploaded anywhere?
No. All token counting and planning runs in your browser. Nothing is uploaded or stored.