Ollama timeout and usage fixes

v2026.4.14 · Release notes

Correct timeout wiring and usage accounting for slower local Ollama streaming runs.

Two key reliability fixes for local Ollama runs:

  • The configured embedded-run timeout is now forwarded into undici's stream timeout tuning, so slower local models respect operator-set limits instead of hitting the default cutoff.
  • OpenAI-compatible streaming requests now send stream_options.include_usage, so usage accounting reports real token counts instead of falling back to inaccurate estimates.
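The two fixes above can be sketched as follows. This is a minimal illustration, not the project's actual code: the config shape, function names, and the default timeout value are assumptions; the undici option names (headersTimeout, bodyTimeout) and the OpenAI-compatible stream_options.include_usage field are real.

```typescript
// Hypothetical config shape; field names are illustrative.
interface RunConfig {
  timeoutMs?: number; // operator-set embedded-run timeout
}

const DEFAULT_TIMEOUT_MS = 600_000; // assumed fallback, not the project's real default

// Fix 1: forward the operator-set run timeout into undici's per-request
// timeouts, so a slow local Ollama model streams until the configured limit
// rather than undici's defaults.
function undiciTimeouts(cfg: RunConfig) {
  const t = cfg.timeoutMs ?? DEFAULT_TIMEOUT_MS;
  return { headersTimeout: t, bodyTimeout: t };
}

// Fix 2: build an OpenAI-compatible streaming body with
// stream_options.include_usage, which asks the server to append a final
// chunk carrying real token counts so accounting need not estimate them.
function streamingBody(model: string, messages: object[]) {
  return {
    model,
    messages,
    stream: true,
    stream_options: { include_usage: true },
  };
}
```

In practice the object returned by undiciTimeouts would be spread into the options of an undici request or Agent, and streamingBody would be serialized as the POST payload to the /v1/chat/completions endpoint.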