What You Need to Know
Before adaptive thinking, you had to manually set a thinking level (off, low, medium, high, or xhigh), and it applied uniformly to every message. A simple "what's the weather" query burned the same reasoning tokens as a complex debugging session. This was wasteful, and it forced users to constantly toggle between levels.
Adaptive mode solves this by letting the model decide how much reasoning to apply per request. Simple factual questions get minimal thinking overhead. Multi-step analysis, code generation, and decision-making tasks automatically receive deeper reasoning. The result is better answers on hard problems and lower cost on easy ones, with no manual intervention.
For Claude 4.6 specifically, adaptive is now the default. Other reasoning models (GPT-4o, Gemini Pro, DeepSeek) default to low, which provides a baseline level of chain-of-thought without the cost of full reasoning on every turn. You can still override per-session with /think high or per-config with agent.model.params.thinking.
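As a sketch of what pinning a level might look like, assuming the config file is JSON and that the agent.model.params.thinking key accepts the same level names as /think (both assumptions, not confirmed by the source):

```json
{
  "agent": {
    "model": {
      "params": {
        "thinking": "high"
      }
    }
  }
}
```

A session-level /think high would then take precedence over this pinned value only for that session, if overrides work the way the text describes.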
The practical impact is significant for users running heartbeat and cron jobs. A heartbeat check that returns HEARTBEAT_OK now uses minimal reasoning tokens, while a heartbeat that detects an anomaly and needs to formulate an alert automatically engages deeper thinking. Over a month of heartbeats at 30-minute intervals, this can reduce costs by 30-50% compared to a fixed medium setting.
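To see where a savings figure in that range could come from, here is a back-of-the-envelope sketch. All per-check token counts and the anomaly rate below are illustrative assumptions, not measured values; only the 30-minute cadence comes from the text.

```python
# Back-of-the-envelope comparison: fixed "medium" thinking vs. adaptive
# thinking for a heartbeat job. Token figures are hypothetical.
INTERVALS_PER_DAY = 48        # one heartbeat every 30 minutes
DAYS = 30

FIXED_MEDIUM_TOKENS = 800     # assumed reasoning tokens per check at medium
ADAPTIVE_OK_TOKENS = 400      # assumed minimal thinking for HEARTBEAT_OK
ADAPTIVE_ALERT_TOKENS = 1600  # assumed deeper thinking when alerting
ALERT_RATE = 0.05             # assume 5% of checks detect an anomaly

checks = INTERVALS_PER_DAY * DAYS
fixed_cost = checks * FIXED_MEDIUM_TOKENS
adaptive_cost = checks * (
    (1 - ALERT_RATE) * ADAPTIVE_OK_TOKENS + ALERT_RATE * ADAPTIVE_ALERT_TOKENS
)
savings = 1 - adaptive_cost / fixed_cost
print(f"{checks} checks/month: fixed={fixed_cost:,.0f} tokens, "
      f"adaptive={adaptive_cost:,.0f} tokens, savings={savings:.0%}")
```

Under these assumptions the adaptive policy lands around a 40% reduction; your actual savings depend on how often your heartbeats escalate and how your model prices reasoning tokens.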
If you want full control, nothing changes: you can still pin a specific thinking level in config or via /think. But for most users, the adaptive default is the best balance of quality and cost. Monitor your token usage for a week after upgrading to see the difference.