Choose from Kombai’s model routers and a curated set of frontier models, and control the trade-off between quality, speed, and credit consumption.
Kombai gives you two ways to power a task:
Model Routers: Kombai’s internal routers that automatically pick the ideal model for your task within a chosen cost tier.
Models: Direct access to individual frontier models from providers like OpenAI, Anthropic, Google, xAI, Moonshot, and Qwen.
Each model and router has a credit multiplier that is applied to the base credit usage of a task. You can switch between routers and models at any point within the same chat.
The credit multiplier is relative to the baseline (1x). For example, a task that consumes 10 credits on a 1x model would consume approximately 5 credits on a 0.5x model and 25 credits on a 2.5x model.
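The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the scaling described here, not Kombai's billing code; the function name `estimate_credits` is hypothetical, and actual usage is approximate, as noted above.

```python
def estimate_credits(base_credits: float, multiplier: float) -> float:
    """Scale a task's baseline (1x) credit usage by a credit multiplier.

    Hypothetical helper for illustration; it is not part of any Kombai API.
    """
    if base_credits < 0 or multiplier < 0:
        raise ValueError("base_credits and multiplier must be non-negative")
    return base_credits * multiplier


# The example from the text: a task that costs 10 credits at 1x.
print(estimate_credits(10, 0.5))  # 5.0
print(estimate_credits(10, 2.5))  # 25.0
```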
Model Routers are powered by a continuously optimized and benchmarked model stack. Kombai benchmarks the latest LLMs from top providers across a wide range of frontend tasks, from interpreting complex UI logic to multi-file refactors, and the internal router automatically selects the best model for the task within the cost tier you choose.
This gives you granular control over cost while ensuring you consistently get the best performance per dollar as newer and better models emerge, with no need to test and benchmark them yourself.
| Router | Credits | Best for |
| --- | --- | --- |
| Kombai-Auto | Auto | Automatically picks the best model for the current task |
| Kombai-Ultra | 2x | Best for hard tasks and deep reasoning. Uses Opus 4.8 |
| Kombai-High | 1x | Excels in complex problem-solving and reasoning |
| Kombai-Medium | 0.5x | Balances cost-efficiency with quality output |
| Kombai-Mini | 0.33x | Good for high volume, low-complexity tasks |
Kombai-Auto is the recommended default. It analyzes each task and routes it to the optimal model automatically, balancing quality and cost for you.
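To compare the fixed tiers for a given task, the multipliers from the router table can be applied to an estimated baseline cost. The sketch below is an illustration only: the `ROUTER_MULTIPLIERS` dictionary and `routers_within_budget` function are hypothetical names, the baseline cost is an assumed input, and Kombai-Auto is left out because its multiplier depends on the model it selects.

```python
from typing import List

# Multipliers copied from the router table. Kombai-Auto is omitted because
# its cost depends on whichever model it routes the task to.
ROUTER_MULTIPLIERS = {
    "Kombai-Ultra": 2.0,
    "Kombai-High": 1.0,
    "Kombai-Medium": 0.5,
    "Kombai-Mini": 0.33,
}


def routers_within_budget(base_credits: float, budget: float) -> List[str]:
    """Return routers whose estimated cost fits the budget, costliest first."""
    affordable = [
        (name, base_credits * multiplier)
        for name, multiplier in ROUTER_MULTIPLIERS.items()
        if base_credits * multiplier <= budget
    ]
    # Sort by estimated cost, highest first, so the most capable tier
    # that still fits the budget comes first.
    affordable.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in affordable]


# A task with a 10-credit baseline and a budget of 6 credits.
print(routers_within_budget(10, 6))  # ['Kombai-Medium', 'Kombai-Mini']
```

Listing the costliest affordable tier first reflects the idea that a higher tier is generally preferable when it fits the budget; that ordering is a design choice of this sketch, not documented Kombai behavior.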
Thinking effort controls how deeply a model reasons before it responds. Higher effort improves planning, edge-case handling, and self-correction for complex tasks, while lower effort returns faster, cheaper responses for simpler ones.
| Effort | When to use |
| --- | --- |
| None | No additional reasoning. Fastest responses for trivial changes. |
| Minimal | Very light reasoning for simple, well-defined tasks. |
| Low | Light reasoning that keeps responses quick. |
| Medium | Balanced reasoning suitable for most everyday tasks. |
| High | Deep reasoning for complex logic, refactors, and architectural work. |
| Extended | Maximum reasoning depth available on supported OpenAI models. |
| Max | Maximum reasoning depth available on supported Anthropic models. |
Not every model supports every level. When you open the thinking effort selector for a model, only the levels it supports are shown. Some models (such as Grok Build 0.1, Kimi K2.5, Kimi K2.6, and Qwen 3.6 27B) don’t expose a configurable thinking effort.
| Models | Supported thinking effort levels |
| --- | --- |
| Gemini 3.5 Flash / 3.1 Pro / 3 Flash / 3.1 Flash Lite | Minimal, Low, Medium, High |
| Grok Build 0.1, Kimi K2.5, Kimi K2.6, Qwen 3.6 27B | Not configurable |
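The per-model support listed above amounts to a lookup from model to allowed effort levels. The following is a minimal sketch of that lookup, assuming only the rows shown here; `SUPPORTED_EFFORT` and `resolve_effort` are hypothetical names, not a Kombai API, and models that are not listed above are deliberately left out rather than guessed.

```python
from typing import Optional

GEMINI_LEVELS = ("Minimal", "Low", "Medium", "High")

# Only the models listed above. An empty tuple means the model does not
# expose a configurable thinking effort.
SUPPORTED_EFFORT = {
    "Gemini 3.5 Flash": GEMINI_LEVELS,
    "Gemini 3.1 Pro": GEMINI_LEVELS,
    "Gemini 3 Flash": GEMINI_LEVELS,
    "Gemini 3.1 Flash Lite": GEMINI_LEVELS,
    "Grok Build 0.1": (),
    "Kimi K2.5": (),
    "Kimi K2.6": (),
    "Qwen 3.6 27B": (),
}


def resolve_effort(model: str, requested: str) -> Optional[str]:
    """Return the requested effort if the model supports it.

    Returns None when the model has no configurable thinking effort, and
    raises when the model is unknown or the level is unsupported.
    """
    if model not in SUPPORTED_EFFORT:
        raise KeyError(f"no effort data for model: {model}")
    levels = SUPPORTED_EFFORT[model]
    if not levels:
        return None  # thinking effort is not configurable for this model
    if requested not in levels:
        raise ValueError(f"{model} supports {levels}, not {requested!r}")
    return requested


print(resolve_effort("Gemini 3.1 Pro", "High"))  # High
print(resolve_effort("Kimi K2.6", "High"))       # None
```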
Higher thinking effort increases generation time and cost, but it does not consume your context window, allowing the agent to think deeply without losing track of large codebases.