Context Window Calculator

Estimate how much usable context remains after system prompts, tool schemas, memory, retrieved chunks, and output reserve — before you build your RAG, MCP, or agent system.

Inputs

Model Preset

Model Context Window (tokens)

Total token limit for the selected model

System Prompt (tokens)

Tokens consumed by your system/instructions prompt

Tool Schema Tokens

Tokens used by tool/function definitions sent to the model

Conversation History (tokens)

Tokens from prior turns kept in memory

Retrieved Chunk Count

Number of RAG/MCP chunks injected into context

Avg Tokens per Chunk

Average size of each retrieved chunk in tokens

Output Token Reserve

Tokens reserved for the model's response

Fill in your inputs and click Calculate to see how your context window is allocated.
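
The allocation can also be worked out by hand. The sketch below is one plausible way to combine the inputs above; the page does not state its exact formula, so the class name `ContextBudget`, its field names, the sample values, and the choice to count the output reserve as used space are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ContextBudget:
    """Hypothetical container mirroring the calculator's inputs (all in tokens)."""
    context_window: int
    system_prompt: int
    tool_schema: int
    history: int
    chunk_count: int
    avg_tokens_per_chunk: int
    output_reserve: int

    @property
    def retrieved(self) -> int:
        # Retrieved context is estimated as chunk count times average chunk size.
        return self.chunk_count * self.avg_tokens_per_chunk

    @property
    def used(self) -> int:
        # Assumption: the output reserve counts against the window like any input.
        return (self.system_prompt + self.tool_schema + self.history
                + self.retrieved + self.output_reserve)

    @property
    def remaining(self) -> int:
        # A negative value means the request would not fit in the window.
        return self.context_window - self.used

    @property
    def utilization(self) -> float:
        return self.used / self.context_window


# Sample values chosen only for illustration.
budget = ContextBudget(
    context_window=128_000,
    system_prompt=1_500,
    tool_schema=3_000,
    history=8_000,
    chunk_count=10,
    avg_tokens_per_chunk=500,
    output_reserve=4_000,
)
print(f"used={budget.used} remaining={budget.remaining} "
      f"utilization={budget.utilization:.1%}")
```

With these sample values, 21,500 tokens are allocated and 106,500 remain, which is roughly 17% utilization.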

Architecture Tips

• Keep tool schemas compact — verbose schemas silently consume thousands of tokens.
• Use sliding window or summarized memory for long conversations instead of full history.
• Target ≤60% context utilization to leave room for unexpected response length.
• For RAG systems, prioritize fewer high-quality chunks over many low-quality ones.
• With MCP, each tool definition adds to your tool schema token count.
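
As an illustration of the sliding-window tip, here is a minimal sketch that keeps only the most recent turns that fit within a history token budget. The function name `trim_history` is hypothetical, and the per-turn token counts are assumed to be supplied by the caller; a real system would measure them with the target model's tokenizer.

```python
def trim_history(turn_tokens: list[int], budget: int) -> list[int]:
    """Return indices of the most recent turns whose total fits within budget.

    turn_tokens holds one token count per turn, oldest first.
    """
    kept: list[int] = []
    total = 0
    # Walk from the newest turn backwards and stop at the first turn
    # that would push the running total over the budget.
    for i in range(len(turn_tokens) - 1, -1, -1):
        if total + turn_tokens[i] > budget:
            break
        total += turn_tokens[i]
        kept.append(i)
    kept.reverse()  # restore chronological order
    return kept


# Illustrative token counts for five turns, oldest first.
turns = [1200, 800, 2500, 600, 900]
kept = trim_history(turns, budget=4000)
print(kept)
```

Here the three newest turns total exactly 4,000 tokens, so they are kept and the two oldest are dropped. A summarized-memory approach would replace the dropped turns with a short summary instead of discarding them outright.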
