## Where does the money go?
| Category | Typical cost | Culprit |
|---|---|---|
| Chat (primary model) | $5-15/mo | Sonnet/GPT for daily conversations |
| Cron jobs | $2-8/mo | Scheduled tasks running expensive models |
| Heartbeat | $0.50-3/mo | Running every 15-30 min on an expensive model |
| Sub-agents | $1-5/mo | Spawned workers using the primary model instead of a cheaper one |
| TTS/STT | $5-20/mo | ElevenLabs for voice (if chatty) |
| Context overflow | $0-10/mo | Long conversations with no compaction or session reset |
> **Rule of thumb:** Use `/status` regularly. It shows the model, tokens used, and cost for the current session. If a single session runs over $1, something is off.
## Model tiering strategy
The biggest cost lever: use expensive models only when needed.
```json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "anthropic/claude-sonnet-4-5",
        "fallbacks": ["openai/gpt-4.1-mini"]
      },
      "heartbeat": {
        "model": "google/gemini-2.5-flash-lite"
      }
    }
  }
}
```
| Task | Best model | Why |
|---|---|---|
| Daily chat | Sonnet 4.5 | Best quality for interactive use |
| Heartbeat checks | Flash-Lite / Nano | Simple "check calendar, check email" tasks → cheap |
| Morning briefing cron | GPT-4.1-mini / Flash | Structured summary → doesn't need a top-tier model |
| Code review | Sonnet 4.5 | Quality matters for code |
| Dependency audit cron | Flash-Lite | Parsing npm outdated output โ trivial task |
| Sub-agent research | Flash / Mini | Good enough for web search + summary |
## Cheap heartbeat model
Heartbeat runs every 15-30 minutes. Using Sonnet here wastes money on "check if there's a new email" tasks:
```json
{
  "agents": {
    "defaults": {
      "heartbeat": {
        "every": "30m",
        "model": "google/gemini-2.5-flash-lite",
        "activeHours": {
          "start": "08:00",
          "end": "22:00",
          "timezone": "Europe/Bucharest"
        }
      }
    }
  }
}
```
- Flash-Lite for heartbeat: ~$0.50/mo vs ~$3/mo with Sonnet
- Active hours: No heartbeats while you sleep = 33% savings
- Increase interval: set `"every": "1h"` if 30 min is too frequent
## Cron job optimization
Cron jobs are the second biggest cost driver. Key strategies:
### Use cheaper models per cron job
```bash
openclaw cron add \
  --name "Dependency audit" \
  --cron "0 9 * * 1" \
  --model "google/gemini-2.5-flash-lite" \
  --message "Run npm audit and npm outdated"
```
### Use `--session isolated`
Isolated sessions prevent cron jobs from inflating your main session's context (and cost):
```bash
openclaw cron add \
  --name "Morning briefing" \
  --cron "0 7 * * *" \
  --session isolated \
  --message "Send my morning briefing"
```
### Set session retention
```json
{
  "cron": {
    "sessionRetention": "24h",
    "runLog": {
      "maxBytes": "2mb",
      "keepLines": 2000
    }
  }
}
```
Sessions from completed cron runs are pruned after 24h, preventing bloat.
## Context & token management
- Daily session reset: prevents conversations from growing indefinitely. Config example:
  ```json
  {
    "session": {
      "reset": {
        "mode": "daily",
        "atHour": 4,
        "idleMinutes": 120
      }
    }
  }
  ```
- Auto-compaction: OpenClaw automatically compacts context when it overflows, but it's better to reset before that happens
- Idle timeout: `idleMinutes: 120` resets the session after 2 hours of inactivity
- Shorter SOUL.md: every word in SOUL.md is sent with every message. Trim unnecessary instructions.
## Skills optimization
Every enabled skill gets injected into the system prompt on every API call, even when irrelevant. 20 custom skills add roughly 32,000 extra tokens per message, which can cost $30-100/month.
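To see how per-message overhead compounds into that monthly figure, here is a quick sketch; the message volume and the per-million-token price are assumptions for illustration:

```python
# How per-message prompt overhead compounds over a month.
# messages_per_day and price_per_mtok below are illustrative assumptions.

def skill_overhead_cost(extra_tokens, messages_per_day, price_per_mtok, days=30):
    """Monthly cost of extra prompt tokens injected on every API call."""
    return extra_tokens * messages_per_day * days / 1_000_000 * price_per_mtok

# 20 skills at ~1,600 tokens each = ~32,000 extra input tokens per message
cost = skill_overhead_cost(32_000, 30, 3.00)  # 30 messages/day, Sonnet-class input price
print(f"~${cost:.0f}/mo just for skill definitions")
```

At a modest 30 messages a day, the skill definitions alone land in the upper end of the $30-100/month range; heavier usage scales linearly from there.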
### Quick wins
- Trim `skills.allowBundled`: the default loads all 53 bundled skills. Keep only the 3-5 essentials you use in every session.
- Archive custom skills: move specialist skills to `~/.openclaw/skills-archive/` and load them on demand via a Skill-Router or IDENTITY.md Relay.
```json
{
  "skills": {
    "allowBundled": ["github", "healthcheck", "session-logs"]
  }
}
```
Result: production setups have cut skill token costs by up to 89% (e.g. ~125k → ~19k tokens per call). For the full pattern (Skill-Router vs IDENTITY.md Relay, implementation steps, and real-world benchmarks) see How to Load Skills On Demand.
## Local models (zero cost)
Run models locally with Ollama for zero API cost:
```json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "anthropic/claude-sonnet-4-5",
        "fallbacks": ["ollama/qwen2.5:7b"]
      },
      "heartbeat": {
        "model": "ollama/qwen2.5:7b"
      }
    }
  }
}
```
Local fallback serves two purposes: saves money on simple tasks, and keeps working when cloud APIs go down.
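The fallback behavior can be sketched as a simple try-next-model loop. The function and the two stand-in models below are hypothetical illustrations, not OpenClaw internals:

```python
# Illustrative sketch of cloud-primary / local-fallback routing.
# `cloud` and `local` are hypothetical stand-ins, not OpenClaw APIs.

def complete_with_fallback(prompt, primary, fallbacks):
    """Try the primary model first; fall back down the list on failure."""
    for model in [primary, *fallbacks]:
        try:
            return model(prompt)
        except (TimeoutError, ConnectionError) as err:
            last_err = err  # remember why this model failed, try the next one
    raise RuntimeError("all models failed") from last_err

# Usage: the cloud model raises on timeout, the local Ollama model answers instead
def cloud(prompt):
    raise TimeoutError("cloud API timed out")

def local(prompt):
    return f"[ollama/qwen2.5:7b] {prompt}"

print(complete_with_fallback("summarize my inbox", cloud, [local]))
```

The key design point is that the caller never sees the cloud outage: the fallback chain absorbs it, which is exactly the resilience the config above buys you.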
> From the community: "When cloud APIs time out (happens often), agents automatically fall back to local Ollama. Zero human intervention required." Run both for resilience and savings.
## Budget limits & alerts
### OpenRouter spending limits
If using OpenRouter as your provider, set a monthly cap on your API key:
```text
# In the OpenRouter dashboard:
# Settings → API Keys → Set monthly limit: $30
```
This is critical for Telegram bots that are always online: a runaway conversation loop can burn through credits fast.
### Per-model rate limiting
```json
{
  "agents": {
    "defaults": {
      "model": {
        "rateLimit": {
          "maxRequestsPerMinute": 10,
          "maxTokensPerDay": 500000
        }
      }
    }
  }
}
```
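A limiter like the one configured above can be modeled as a sliding one-minute request window plus a daily token budget. This class is a minimal sketch of the idea, not OpenClaw's actual implementation:

```python
import time

# Minimal client-side rate limiter: sliding one-minute request window
# plus a daily token budget. Illustrative only -- not OpenClaw internals.

class RateLimiter:
    def __init__(self, max_requests_per_minute, max_tokens_per_day):
        self.max_rpm = max_requests_per_minute
        self.max_tpd = max_tokens_per_day
        self.request_times = []   # timestamps of requests in the last minute
        self.tokens_today = 0

    def allow(self, tokens, now=None):
        """Return True if a request costing `tokens` may proceed."""
        now = time.monotonic() if now is None else now
        # drop requests older than 60 seconds from the window
        self.request_times = [t for t in self.request_times if now - t < 60]
        if len(self.request_times) >= self.max_rpm:
            return False  # over the per-minute request cap
        if self.tokens_today + tokens > self.max_tpd:
            return False  # over the daily token budget
        self.request_times.append(now)
        self.tokens_today += tokens
        return True

limiter = RateLimiter(max_requests_per_minute=10, max_tokens_per_day=500_000)
print(limiter.allow(2_000))  # under both limits, so the request proceeds
```

Denied requests simply wait and retry later; the daily token counter would be reset once per day in a real deployment.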
## Cost profiles
| Profile | Monthly cost | Setup |
|---|---|---|
| Budget | $5-10 | Flash-Lite primary, Ollama heartbeat, 2 cron jobs, no TTS |
| Standard | $15-25 | Sonnet primary, Flash-Lite heartbeat/cron, 5 cron jobs, basic TTS |
| Power | $30-50 | Sonnet primary, Flash cron, sub-agents, full TTS, 10+ cron jobs |
| Local-first | $0-5 | Ollama primary, cloud Sonnet fallback for complex tasks only |
## Usage tracking
```bash
# Check current session cost
/status

# Check model usage across all sessions
openclaw status --deep

# View cron job costs
openclaw cron runs --id <job-id>

# Check OpenRouter spend
# Visit openrouter.ai/activity
```
> **Weekly habit:** run `/status` at the end of the week to see total token usage. If it's higher than expected, check cron job frequency and the heartbeat model.