Every skill you install gets injected into the system prompt on every single API call. With 20 custom skills, that means sending ~32,000 extra tokens to the model every time you say "hello", even when none of those skills are relevant. At $0.80–$3.00 per million input tokens, that overhead alone costs roughly $0.03–$0.10 per call, which adds up fast.
This guide explains the problem and presents two solutions: a Skill-Router pattern (single skill that indexes all others) and an IDENTITY.md Relay pattern (lightweight references per agent). Both achieve the same goal — loading skills on demand instead of all at once — with different tradeoffs.
The problem: skills tax every API call
OpenClaw loads all enabled skills into the system prompt so the model "knows" they exist. This is by design — the model needs to see a skill's instructions to use it. But it means:
- 10 custom skills ≈ 16,000 extra tokens per call
- 20 custom skills ≈ 32,000 extra tokens per call
- 30+ custom skills ≈ 48,000+ extra tokens per call
This happens on every call, whether you're asking about Stripe billing or just saying "good morning."
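To put that overhead in dollar terms, here is a minimal sketch of the arithmetic. It assumes an average of about 1,600 tokens per skill (32,000 tokens divided by 20 skills) and uses the price range quoted in the introduction; the function name is illustrative, not part of OpenClaw.

```python
# Illustrative arithmetic only; the prices are the range quoted in the introduction.
TOKENS_PER_SKILL = 1_600  # assumption: 32,000 tokens / 20 skills


def overhead_cost_per_call(num_skills: int, usd_per_million_tokens: float) -> float:
    """Dollar cost of the skill text alone for a single API call."""
    extra_tokens = num_skills * TOKENS_PER_SKILL
    return extra_tokens * usd_per_million_tokens / 1_000_000


for skills in (10, 20, 30):
    low = overhead_cost_per_call(skills, 0.80)
    high = overhead_cost_per_call(skills, 3.00)
    print(f"{skills} skills: ${low:.4f} to ${high:.4f} per call")
```

At 20 skills this works out to roughly 2.6 to 9.6 cents per call before the model has done any useful work.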
| Skill example | Size | ~Tokens |
|---|---|---|
| stripe-billing | 8,855 bytes | ~2,200 |
| supabase-manage | 8,864 bytes | ~2,200 |
| resend-email | 14,645 bytes | ~3,600 |
| deploy-pipeline | 6,707 bytes | ~1,700 |
| compliance-audit | 9,047 bytes | ~2,300 |
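The token column in the table is consistent with a rough rule of about 4 bytes per token. The sketch below applies that heuristic to the sizes listed; the ratio is an assumption inferred from the table, and real tokenizers vary by model and content.

```python
BYTES_PER_TOKEN = 4  # heuristic assumption; not an exact tokenizer count


def estimate_tokens(size_bytes: int) -> int:
    """Rough token estimate for a skill file of the given size."""
    return round(size_bytes / BYTES_PER_TOKEN)


# Sizes copied from the table above.
skill_sizes = {
    "stripe-billing": 8_855,
    "supabase-manage": 8_864,
    "resend-email": 14_645,
    "deploy-pipeline": 6_707,
    "compliance-audit": 9_047,
}

for name, size in skill_sizes.items():
    print(f"{name}: ~{estimate_tokens(size):,} tokens")
```

Each estimate lands within about 100 tokens of the table's figure. For files on disk, `Path(...).stat().st_size` gives the byte count to feed in.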
Real-world impact: In one production OpenClaw setup with 20 custom skills + 7 bundled skills, the system prompt was sending ~125,000 tokens per API call. After implementing on-demand loading, it dropped to ~19,000 tokens, an 85% reduction in prompt size. Cost per call fell from $0.088 to $0.0097, about 89% lower.
The solution: archive skills, load on demand
Both approaches share the same foundation:
- Move custom skills out of the auto-load path into an archive directory
- Keep a lightweight reference so the agent knows skills exist
- Read the full skill file only when the task requires it
Step 1: Archive your custom skills
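mkdir -p ~/.openclaw/skills-archive
# Move all custom skills to archive
for skill_dir in ~/.openclaw/skills/*/; do
  # Skip the unexpanded glob pattern when the skills directory is empty
  [ -d "$skill_dir" ] || continue
  mv "$skill_dir" ~/.openclaw/skills-archive/
done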
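If you later re-run the archive step after creating a router skill, you may want to leave some directories in place. The following is a hedged Python sketch of the same move with a keep-list; `archive_skills` and its `keep` parameter are hypothetical helpers for illustration, not OpenClaw commands.

```python
import shutil
from pathlib import Path


def archive_skills(skills_dir: Path, archive_dir: Path, keep: tuple[str, ...] = ()) -> list[str]:
    """Move each skill directory into archive_dir, except names listed in keep."""
    if not skills_dir.is_dir():
        return []
    archive_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() materializes the listing before anything is moved.
    for entry in sorted(skills_dir.iterdir()):
        if not entry.is_dir() or entry.name in keep:
            continue
        shutil.move(str(entry), str(archive_dir / entry.name))
        moved.append(entry.name)
    return moved
```

A call such as `archive_skills(Path.home() / ".openclaw/skills", Path.home() / ".openclaw/skills-archive", keep=("skill-router",))` would archive everything except the router and return the names it moved.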
Step 2: Trim bundled skills
Only keep the bundled skills you actually use on every session:
{
  "skills": {
    "allowBundled": [
      "github",
      "healthcheck",
      "session-logs"
    ]
  }
}
The default loads all 53 bundled skills. Trimming to 3–5 essentials saves thousands of tokens per call.
Approach 1: The Skill-Router
A single custom skill that acts as a lightweight index of all your archived skills. It uses keyword matching to decide which skill to load for each task.
How it works
User message arrives
↓
Skill-Router scans keywords against its index (~800 tokens)
↓
Match found? → Agent reads the specific SKILL.md from archive (~2,000 tokens)
No match? → Agent proceeds without loading any skill (0 extra tokens)
The SKILL.md
Create ~/.openclaw/skills/skill-router/SKILL.md:
---
name: skill-router
description: >
  Smart skill loader. Instead of loading all skills into context,
  this maintains a lightweight index and loads only relevant skills on demand.
  Use on EVERY task to check if specialist knowledge is needed.
---
# Skill Router — On-Demand Skill Loading
You have a library of specialist skills in ~/.openclaw/skills-archive/.
**NEVER** load all skills. Use the index below to find and load ONLY what's needed.
## Step 1: Match the Task
| Skill | Keywords | Path |
|-------|----------|------|
| **stripe-billing** | stripe, payment, subscription, billing, checkout, invoice, webhook | stripe-billing/SKILL.md |
| **supabase-manage** | supabase, database, postgres, rls, edge function, migration | supabase-manage/SKILL.md |
| **resend-email** | resend, email, transactional, dunning email, template | resend-email/SKILL.md |
| **deploy-pipeline** | deploy, vercel, github actions, ci/cd, pipeline, rollback | deploy-pipeline/SKILL.md |
| **compliance-audit** | gdpr, compliance, dpa, privacy policy, terms of service | compliance-audit/SKILL.md |
## Step 2: Load or Skip
- **No match** → Proceed without loading any skill
- **1-2 matches** → Read the matched SKILL.md file(s)
- **3+ matches** → Load only the 2 most relevant
## Step 3: Execute
Follow the loaded skill's instructions. Do NOT mention the routing process to the user.
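In practice the model performs this matching itself by reading the index. Purely to make the load-or-skip rule concrete, here is a rough Python sketch of the same logic. It assumes plain substring matching and ranks "most relevant" by the number of keyword hits; both are simplifications chosen for illustration, and `match_skills` is a hypothetical name.

```python
# Keyword index copied from the SKILL.md table above.
SKILL_INDEX: dict[str, list[str]] = {
    "stripe-billing": ["stripe", "payment", "subscription", "billing", "checkout", "invoice", "webhook"],
    "supabase-manage": ["supabase", "database", "postgres", "rls", "edge function", "migration"],
    "resend-email": ["resend", "email", "transactional", "dunning email", "template"],
    "deploy-pipeline": ["deploy", "vercel", "github actions", "ci/cd", "pipeline", "rollback"],
    "compliance-audit": ["gdpr", "compliance", "dpa", "privacy policy", "terms of service"],
}
MAX_SKILLS = 2  # "3+ matches: load only the 2 most relevant"


def match_skills(message: str, limit: int = MAX_SKILLS) -> list[str]:
    """Return up to `limit` skill names whose keywords appear in the message."""
    text = message.lower()
    hits: dict[str, int] = {}
    for skill, keywords in SKILL_INDEX.items():
        count = sum(1 for kw in keywords if kw in text)  # crude substring match
        if count:
            hits[skill] = count
    # Most keyword hits first; ties broken alphabetically for determinism.
    ranked = sorted(hits, key=lambda s: (-hits[s], s))
    return ranked[:limit]


print(match_skills("good morning"))
print(match_skills("Set up a Stripe checkout webhook"))
```

The first call matches nothing, so no skill file would be read; the second matches only `stripe-billing`. Substring matching is deliberately naive (a short keyword like `rls` can match inside unrelated words), which is one reason to let the model judge relevance rather than rely on code like this.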
Token cost
| Scenario | Tokens added |
|---|---|
| Every call (index always loaded) | ~800 |
| When a skill is needed | ~800 + ~2,000 (skill file) |
| When no skill is needed | ~800 |
Pros: Keyword matching helps the model find the right skill. One file to maintain. Works for single-agent setups.
Cons: 800 tokens loaded on every call. Keyword table grows as you add skills.
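The average overhead depends on how often a skill is actually needed. A small sketch of that calculation, using the ~800 and ~2,000 token figures from the table; the fraction of calls that need a skill is an assumption you would measure for your own workload.

```python
INDEX_TOKENS = 800    # router index, loaded on every call
SKILL_TOKENS = 2_000  # one archived SKILL.md, loaded only on a match


def expected_router_tokens(fraction_needing_skill: float) -> float:
    """Average extra tokens per call when a given fraction of calls loads one skill."""
    return INDEX_TOKENS + fraction_needing_skill * SKILL_TOKENS


for fraction in (0.0, 0.25, 1.0):
    print(f"{fraction:.0%} of calls need a skill: ~{expected_router_tokens(fraction):,.0f} tokens")
```

Even in the worst case, where every call loads a skill, the average stays at ~2,800 tokens, compared with ~32,000 when all 20 skills are always loaded.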
Approach 2: The IDENTITY.md Relay
Add skill references directly to each agent's IDENTITY.md file. In multi-agent setups, only the orchestrator agent needs the full skill list — it reads the skill file and passes relevant content to sub-agents in the delegation message.
Orchestrator IDENTITY.md
## Specialist Skills Library
When a task matches the keywords below, read the skill from
~/.openclaw/skills-archive/<name>/SKILL.md and include relevant
sections when delegating to a sub-agent. Sub-agents cannot access
host filesystem paths — YOU must read and relay skill content.
Do NOT load skills preemptively — only when keywords match.
- **stripe-billing** [stripe, payment, subscription, billing, checkout, invoice, webhook] — Stripe integration
- **supabase-manage** [supabase, database, postgres, rls, edge function, migration, schema] — Supabase operations
- **resend-email** [resend, email, transactional, dunning email, template] — Email sending
- **deploy-pipeline** [deploy, vercel, github actions, ci/cd, pipeline, rollback] — Deployment
- **compliance-audit** [gdpr, compliance, dpa, privacy policy, terms of service, legal] — Legal compliance
Pros: Cheapest option — only ~280 tokens for the orchestrator, 0 for sub-agents. Each agent only sees skills relevant to its role.
Cons: Requires a multi-agent setup with an orchestrator. Orchestrator must relay skill content.
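The relay step amounts to reading a skill file on the host and embedding its text in the message sent to the sub-agent. Here is a minimal sketch of that idea, assuming a plain-text delegation message. The message layout and the `build_delegation` helper are hypothetical, and for brevity it relays the whole file rather than only the relevant sections.

```python
from pathlib import Path


def build_delegation(task: str, skill_names: list[str], archive_dir: Path) -> str:
    """Compose a delegation message that carries the skill text inline."""
    parts = [f"Task: {task}"]
    for name in skill_names:
        skill_file = archive_dir / name / "SKILL.md"
        if not skill_file.is_file():
            continue  # nothing to relay for an unknown or missing skill
        content = skill_file.read_text(encoding="utf-8")
        parts.append(f"--- Skill: {name} ---\n{content}")
    return "\n\n".join(parts)
```

Because the content travels inside the message, the sub-agent never needs access to `~/.openclaw/skills-archive/` itself.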
Head-to-head comparison
| | All skills loaded | Skill-Router | IDENTITY.md Relay |
|---|---|---|---|
| Tokens per call (no skill needed) | ~32,000 | ~800 | ~280 |
| Tokens per call (skill needed) | ~32,000 | ~2,800 | ~2,280 |
| Monthly cost (40 calls/day, GLM-5) | ~$106 | ~$15 | ~$14 |
| Best for | < 3 skills | Single agent | Multi-agent factory |
Implementation checklist
- Archive all custom skills to ~/.openclaw/skills-archive/
- Trim skills.allowBundled to only essential bundled skills (3–5 max)
- Enable compaction: "compaction": { "mode": "safeguard" }
- Test that agents can still access archived skills when needed
- Monitor token usage for the first few sessions
- Set an OpenRouter spending cap as a safety net
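For the "test that agents can still access archived skills" item, one quick sanity check is to confirm that every skill named in your index has a SKILL.md in the archive. A small sketch, assuming the `<name>/SKILL.md` layout used throughout this guide; `missing_skill_files` is a hypothetical helper. It only checks the filesystem, so you should still ask an agent to load a skill end to end.

```python
from pathlib import Path


def missing_skill_files(skill_names: list[str], archive_dir: Path) -> list[str]:
    """Return the indexed skill names that have no SKILL.md in the archive."""
    return [
        name
        for name in skill_names
        if not (archive_dir / name / "SKILL.md").is_file()
    ]
```

An empty list means every indexed skill can be found; any names returned point to a typo in the index or a skill that was never archived.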
Real-world results
From a production 12-agent OpenClaw factory running GLM-5:
| Metric | Before | After |
|---|---|---|
| Avg prompt tokens per call | 125,665 | 19,492 |
| Avg cost per call | $0.088 | $0.0097 |
| Cache hit rate | 35.8% | 65.3% |
| Monthly projection (40 calls/day) | $105.96 | $11.65 |
| Savings | | 89% |
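The percentages can be rechecked from the table's own numbers. This sketch assumes a 30-day month for the projection, which the table does not state.

```python
CALLS_PER_DAY = 40
DAYS_PER_MONTH = 30  # assumption: the table does not state the month length


def percent_reduction(before: float, after: float) -> float:
    """Relative drop from before to after, as a percentage."""
    return (before - after) / before * 100


def monthly_cost(cost_per_call: float) -> float:
    """Projected monthly spend at a fixed number of calls per day."""
    return cost_per_call * CALLS_PER_DAY * DAYS_PER_MONTH


token_cut = percent_reduction(125_665, 19_492)
cost_cut = percent_reduction(0.088, 0.0097)
print(f"prompt tokens: {token_cut:.1f}% lower")
print(f"cost per call: {cost_cut:.1f}% lower")
print(f"monthly: ${monthly_cost(0.088):.2f} before, ${monthly_cost(0.0097):.2f} after")
```

Prompt tokens fall by about 84.5% and cost per call by about 89%. The monthly figures come out near $105.60 and $11.64, which agrees with the table's $105.96 and $11.65 to within the rounding of the per-call averages.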
Which should you choose?
Use the Skill-Router if:
- You run a single agent (no sub-agents)
- You want one file to maintain
- You don't mind 800 tokens on every call
Use the IDENTITY.md Relay if:
- You run multiple agents with an orchestrator
- You want the absolute lowest token cost
- Your sub-agents run in sandboxed containers
Use neither (keep skills loaded) if:
- You have fewer than 3 custom skills
- Token cost isn't a concern
- You need guaranteed skill availability on every call