When is compaction triggered
Kombai monitors your token usage and triggers compaction under the following conditions:
- Context window saturation: When the thread consumes the majority (typically 90-95%) of the available context window.
- Cache expiry: If you return to a thread after 5 minutes of inactivity and 50-60% of the context is already in use, Kombai auto-compacts the thread to reduce costs, because the cache has expired. The cache also expires when you toggle Agent Variant or Extended Thinking.
If you edit or restore a message, the previous compaction is reset, which may temporarily increase context usage. As you continue the conversation, Kombai will re-evaluate and trigger compaction again if necessary.
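The two trigger conditions above can be sketched as a simple decision function. This is only an illustration, not Kombai's actual implementation: the names `ThreadState` and `should_compact` are hypothetical, the thresholds are taken from the lower ends of the ranges quoted above, and treating a toggle of Agent Variant or Extended Thinking the same way as an idle timeout is an assumption.

```python
from dataclasses import dataclass

# Assumed thresholds, taken from the lower ends of the ranges quoted in the text.
SATURATION_THRESHOLD = 0.90  # "typically 90-95%" of the context window
CACHE_EXPIRY_USAGE = 0.50    # "50-60% of the context is already used"
CACHE_IDLE_SECONDS = 5 * 60  # "5 minutes of inactivity"


@dataclass
class ThreadState:
    """Hypothetical snapshot of a thread; not a real Kombai type."""
    used_tokens: int
    window_tokens: int
    idle_seconds: float
    # True if Agent Variant or Extended Thinking was toggled (assumed to expire the cache).
    cache_invalidated: bool = False


def should_compact(state: ThreadState) -> bool:
    """Return True if either documented trigger condition is met."""
    usage = state.used_tokens / state.window_tokens

    # Condition 1: context window saturation.
    if usage >= SATURATION_THRESHOLD:
        return True

    # Condition 2: the cache has expired and the thread is already about half full.
    cache_expired = state.idle_seconds >= CACHE_IDLE_SECONDS or state.cache_invalidated
    return cache_expired and usage >= CACHE_EXPIRY_USAGE
```

For example, under these assumptions a thread at 55% usage is left alone while you are actively working in it, but is compacted once you come back after a six-minute break.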
How context is managed
The context window is shared across several elements beyond your message history:
- Infrastructure overhead: Tools, system prompts, skills, and RAG attachments typically occupy 15–35% of the window, depending on your tech stack.
- Persistent rules: Project and User Rules are never compacted. They remain in their full form to ensure the Agent strictly adheres to your constraints at all times.
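To get a feel for how much room is left for the conversation itself, the breakdown above can be turned into rough arithmetic. The sketch below is illustrative only: `history_budget` is a hypothetical helper, the 200,000-token window and 2,000-token rules size are made-up example values, and it assumes that rules are counted separately from the 15–35% infrastructure overhead.

```python
def history_budget(window_tokens: int, overhead_fraction: float, rules_tokens: int) -> int:
    """Estimate the tokens left for message history (hypothetical helper).

    overhead_fraction: share of the window used by tools, system prompts,
        skills, and RAG attachments (0.15 to 0.35 per the text above).
    rules_tokens: size of Project and User Rules, which are never compacted.
    """
    if not 0.0 <= overhead_fraction <= 1.0:
        raise ValueError("overhead_fraction must be between 0 and 1")
    overhead_tokens = round(window_tokens * overhead_fraction)
    # Clamp at zero in case overhead and rules exceed the window.
    return max(0, window_tokens - overhead_tokens - rules_tokens)


# Example values are assumptions, not Kombai defaults.
print(history_budget(200_000, 0.15, 2_000))  # 168000
print(history_budget(200_000, 0.35, 2_000))  # 128000
```

With these example numbers, the tech stack alone shifts the space available for history by 40,000 tokens, which is why two projects can reach compaction at quite different points in a conversation.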
Toggle auto-compaction
Compaction is enabled by default. If you prefer to manage the context window manually, you can disable the Auto compact threads toggle under the icon.
Manually compact threads
You can trigger compaction at any time by clicking the icon at the end of your thread and selecting Compact thread.
Why compact threads
- Extended conversations: Restore available tokens to maintain a single, continuous thread for long-term projects.
- Optimized performance: Get faster responses, because compaction reduces the overhead of processing uncompressed history.
- Preserved context: Retain critical information and decision points through intelligent summarization while discarding redundant data.
- Cost efficiency: Reduce token consumption in API calls by compressing the accumulated context.