📦 On-Demand Skill Loading

Stop loading all skills on every call. Two approaches, the Skill-Router and the IDENTITY.md Relay, load skills on demand and can cut API costs by up to 89%.

Cost savings | Skills | Advanced

Every skill you install gets injected into the system prompt on every single API call. That means if you have 20 custom skills, you're sending ~32,000 extra tokens to the model every time you say "hello" — even when none of those skills are relevant. At $0.80–$3.00 per million input tokens, that adds up fast.
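To put a number on "adds up fast", the per-call overhead is simply tokens × price ÷ 1,000,000. The sketch below applies that to the figures above; `skill_overhead_usd` is a made-up helper name, not something OpenClaw ships.

```shell
#!/bin/sh
# Per-call cost (USD) of the extra skill tokens.
# Usage: skill_overhead_usd <tokens> <usd_per_million_input_tokens>
# (hypothetical helper for illustration only)
skill_overhead_usd() {
    awk -v t="$1" -v p="$2" 'BEGIN { printf "%.4f\n", t * p / 1000000 }'
}

skill_overhead_usd 32000 0.80   # low end of the price range
skill_overhead_usd 32000 3.00   # high end of the price range
```

With 20 skills that works out to roughly $0.03 to $0.10 per call spent on skill text alone, before the model has read the actual request.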

This guide explains the problem and presents two solutions: a Skill-Router pattern (single skill that indexes all others) and an IDENTITY.md Relay pattern (lightweight references per agent). Both achieve the same goal — loading skills on demand instead of all at once — with different tradeoffs.

The problem: skills tax every API call

OpenClaw loads all enabled skills into the system prompt so the model "knows" they exist. This is by design — the model needs to see a skill's instructions to use it. But it means:

10 custom skills ≈ 16,000 extra tokens per call
20 custom skills ≈ 32,000 extra tokens per call
30+ custom skills ≈ 48,000+ extra tokens per call

This happens on every call, whether you're asking about Stripe billing or just saying "good morning."

Skill example	Size	~Tokens
stripe-billing	8,855 bytes	~2,200
supabase-manage	8,864 bytes	~2,200
resend-email	14,645 bytes	~3,600
deploy-pipeline	6,707 bytes	~1,700
compliance-audit	9,047 bytes	~2,300
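The table implies a ratio of about 4 bytes per token (8,855 bytes ≈ 2,200 tokens). Assuming that ratio holds for your own skills, a rough audit could look like the sketch below; `estimate_skill_tokens` is a hypothetical helper, and real tokenizer counts will differ somewhat.

```shell
#!/bin/sh
# Rough token estimate for every SKILL.md under a directory,
# assuming ~4 bytes per token (the ratio implied by the table above).
# Usage: estimate_skill_tokens <skills_dir>   (hypothetical helper)
estimate_skill_tokens() {
    for f in "$1"/*/SKILL.md; do
        [ -f "$f" ] || continue                 # glob matched nothing: skip
        bytes=$(( $(wc -c < "$f") + 0 ))        # "+ 0" strips wc's padding
        name=$(basename "$(dirname "$f")")
        printf '%s\t%d bytes\t~%d tokens\n' "$name" "$bytes" $((bytes / 4))
    done
}

estimate_skill_tokens "$HOME/.openclaw/skills"
```

Running it before and after archiving gives a quick check on how much prompt weight the remaining skills still carry.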

Real-world impact: In one production OpenClaw setup with 20 custom skills + 7 bundled skills, each API call carried ~125,000 prompt tokens. After implementing on-demand loading, that dropped to ~19,000 tokens, an 85% reduction in tokens that cut the cost per call from $0.088 to $0.0097 (about 89%).

The solution: archive skills, load on demand

Both approaches share the same foundation:

Move custom skills out of the auto-load path into an archive directory
Keep a lightweight reference so the agent knows skills exist
Read the full skill file only when the task requires it

Step 1: Archive your custom skills

mkdir -p ~/.openclaw/skills-archive

# Move every custom skill directory into the archive
for skill_dir in ~/.openclaw/skills/*/; do
    [ -d "$skill_dir" ] || continue   # nothing matched the glob: skip
    mv "$skill_dir" ~/.openclaw/skills-archive/
done

Step 2: Trim bundled skills

Only keep the bundled skills you actually use on every session:

{
  "skills": {
    "allowBundled": [
      "github",
      "healthcheck",
      "session-logs"
    ]
  }
}

The default loads all 53 bundled skills. Trimming to 3–5 essentials saves thousands of tokens per call.

Approach 1: The Skill-Router

A single custom skill acts as a lightweight index of all your archived skills. The model matches keywords in the task against that index to decide which skill, if any, to load.

How it works

User message arrives
    ↓
Skill-Router scans keywords against its index (~800 tokens)
    ↓
Match found? → Agent reads the specific SKILL.md from archive (~2,000 tokens)
No match? → Agent proceeds without loading any skill (0 extra tokens)

The SKILL.md

Create ~/.openclaw/skills/skill-router/SKILL.md:

---
name: skill-router
description: >
  Smart skill loader. Instead of loading all skills into context,
  this maintains a lightweight index and loads only relevant skills on demand.
  Use on EVERY task to check if specialist knowledge is needed.
---

# Skill Router — On-Demand Skill Loading

You have a library of specialist skills in ~/.openclaw/skills-archive/.
**NEVER** load all skills. Use the index below to find and load ONLY what's needed.

## Step 1: Match the Task

| Skill | Keywords | Path |
|-------|----------|------|
| **stripe-billing** | stripe, payment, subscription, billing, checkout, invoice, webhook | stripe-billing/SKILL.md |
| **supabase-manage** | supabase, database, postgres, rls, edge function, migration | supabase-manage/SKILL.md |
| **resend-email** | resend, email, transactional, dunning email, template | resend-email/SKILL.md |
| **deploy-pipeline** | deploy, vercel, github actions, ci/cd, pipeline, rollback | deploy-pipeline/SKILL.md |
| **compliance-audit** | gdpr, compliance, dpa, privacy policy, terms of service | compliance-audit/SKILL.md |

## Step 2: Load or Skip

- **No match** → Proceed without loading any skill
- **1-2 matches** → Read the matched SKILL.md file(s)
- **3+ matches** → Load only the 2 most relevant

## Step 3: Execute

Follow the loaded skill's instructions. Do NOT mention the routing process to the user.
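In OpenClaw the model itself performs this matching by reading the table. To sanity-check a keyword list offline, the rules above can be mimicked with a small script. This is only a sketch: the `name|keywords` index format, the `route` function, and ranking by number of keyword hits are all assumptions made for illustration.

```shell
#!/bin/sh
# Mimics the router's rules: case-insensitive keyword match, at most 2 skills.
# INDEX is a hypothetical plain-text stand-in for the Markdown table.
INDEX='stripe-billing|stripe,payment,subscription,billing,checkout,invoice,webhook
supabase-manage|supabase,database,postgres,rls,edge function,migration
resend-email|resend,email,transactional,dunning email,template
deploy-pipeline|deploy,vercel,github actions,ci/cd,pipeline,rollback
compliance-audit|gdpr,compliance,dpa,privacy policy,terms of service'

# Usage: route "<user message>"  -> prints up to 2 skill names, best match first
route() {
    printf '%s\n' "$INDEX" | awk -F'|' -v msg="$1" '
        BEGIN { msg = tolower(msg) }
        {
            n = split($2, kw, ","); hits = 0
            # crude substring match: "rls" would also hit inside "girls"
            for (i = 1; i <= n; i++) if (index(msg, kw[i]) > 0) hits++
            if (hits > 0) print hits "\t" $1
        }' | sort -rn | head -n 2 | cut -f2
}

route "Set up a Stripe webhook and send the invoice email"
route "good morning"
```

The first call prints `stripe-billing` and `resend-email`; the second prints nothing, which corresponds to the "No match" rule. Short keywords are the usual source of false positives, so it is worth trying a few unrelated messages against any new entry.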

Token cost

Scenario	Tokens added
Every call (index always loaded)	~800
When a skill is needed	~800 + ~2,000 (skill file)
When no skill is needed	~800

Pros: Keyword matching helps the model find the right skill. One file to maintain. Works for single-agent setups.

Cons: 800 tokens loaded on every call. Keyword table grows as you add skills.
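Because the table has to be kept in sync with the archive by hand, it can help to regenerate its skeleton from the directory listing. A minimal sketch, assuming each archived skill lives in its own directory with a SKILL.md; `print_index_skeleton` is a hypothetical helper and the keywords still have to be written manually.

```shell
#!/bin/sh
# Prints a Markdown table skeleton with one row per archived skill.
# Keywords are left as TODO: they still have to be filled in by hand.
# Usage: print_index_skeleton <archive_dir>   (hypothetical helper)
print_index_skeleton() {
    printf '| Skill | Keywords | Path |\n'
    printf '|-------|----------|------|\n'
    for dir in "$1"/*/; do
        [ -f "${dir}SKILL.md" ] || continue   # ignore dirs without a SKILL.md
        name=$(basename "$dir")
        printf '| **%s** | TODO | %s/SKILL.md |\n' "$name" "$name"
    done
}

print_index_skeleton "$HOME/.openclaw/skills-archive"
```

Diffing this output against the table in the router's SKILL.md is a quick way to spot skills that were archived but never indexed.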

Approach 2: The IDENTITY.md Relay

Add skill references directly to each agent's IDENTITY.md file. In multi-agent setups, only the orchestrator agent needs the full skill list — it reads the skill file and passes relevant content to sub-agents in the delegation message.

Orchestrator IDENTITY.md

## Specialist Skills Library
When a task matches the keywords below, read the skill from
~/.openclaw/skills-archive/<name>/SKILL.md and include relevant
sections when delegating to a sub-agent. Sub-agents cannot access
host filesystem paths — YOU must read and relay skill content.

Do NOT load skills preemptively — only when keywords match.

- **stripe-billing** [stripe, payment, subscription, billing, checkout, invoice, webhook] — Stripe integration
- **supabase-manage** [supabase, database, postgres, rls, edge function, migration, schema] — Supabase operations
- **resend-email** [resend, email, transactional, dunning email, template] — Email sending
- **deploy-pipeline** [deploy, vercel, github actions, ci/cd, pipeline, rollback] — Deployment
- **compliance-audit** [gdpr, compliance, dpa, privacy policy, terms of service, legal] — Legal compliance

Pros: Cheapest option — only ~280 tokens for the orchestrator, 0 for sub-agents. Each agent only sees skills relevant to its role.

Cons: Requires a multi-agent setup with an orchestrator. Orchestrator must relay skill content.
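In practice the orchestrator agent does the relaying itself, but the shape of the result can be sketched as a script: read the skill from the archive and embed it in the delegation message, since the sub-agent cannot open the host path. The function name `build_delegation` and the message layout are assumptions for illustration, not an OpenClaw format.

```shell
#!/bin/sh
# Builds a delegation message that embeds the skill text for a sub-agent.
# Usage: build_delegation <archive_dir> <skill_name> <task>
# (hypothetical helper; the TASK/BEGIN/END layout is an assumption)
build_delegation() {
    skill_file="$1/$2/SKILL.md"
    if [ ! -r "$skill_file" ]; then
        echo "skill not found: $2" >&2
        return 1
    fi
    printf 'TASK: %s\n\n' "$3"
    printf '%s\n' "--- BEGIN SKILL: $2 ---"
    cat "$skill_file"
    printf '%s\n' "--- END SKILL: $2 ---"
}

build_delegation "$HOME/.openclaw/skills-archive" stripe-billing \
    "Add a webhook handler for failed payments" || true
```

Relaying only the relevant sections rather than the whole file, as the IDENTITY.md text suggests, keeps the delegation message smaller still.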

Head-to-head comparison

	All skills loaded	Skill-Router	IDENTITY.md Relay
Tokens per call (no skill needed)	~32,000	~800	~280
Tokens per call (skill needed)	~32,000	~2,800	~2,280
Monthly cost (40 calls/day, GLM-5)	~$106	~$15	~$14
Best for	< 3 skills	Single agent	Multi-agent factory
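The two token rows can be blended into an average once you know how often a skill is actually needed: average = base + skill size × fraction of calls that load a skill. The sketch below assumes 25% of calls need a skill, which is an illustrative workload rather than a measured one; `avg_tokens` is a hypothetical helper.

```shell
#!/bin/sh
# Average skill tokens per call when only some calls need a skill.
# Usage: avg_tokens <base_tokens> <skill_tokens> <fraction_needing_skill>
# (hypothetical helper; 0.25 below is an assumed workload)
avg_tokens() {
    awk -v b="$1" -v s="$2" -v p="$3" 'BEGIN { printf "%d\n", b + s * p }'
}

echo "All loaded:   $(avg_tokens 32000 0 0.25)"
echo "Skill-Router: $(avg_tokens 800 2000 0.25)"
echo "Relay:        $(avg_tokens 280 2000 0.25)"
```

Under that assumption the averages come to 32,000, 1,300 and 780 tokens per call, so the gap between the two on-demand approaches is small next to the gap between either of them and loading everything.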

Implementation checklist

Archive all custom skills to ~/.openclaw/skills-archive/
Trim skills.allowBundled to only essential bundled skills (3–5 max)
Enable compaction: "compaction": { "mode": "safeguard" }
Test that agents can still access archived skills when needed
Monitor token usage for the first few sessions
Set an OpenRouter spending cap as a safety net
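For the "can agents still access archived skills" item, one cheap check is to confirm that every skill named in the index actually resolves to a readable file. A minimal sketch; `check_archive` is a hypothetical helper and the skill names are the examples used in this guide.

```shell
#!/bin/sh
# Checks that each named skill has a readable SKILL.md in the archive.
# Usage: check_archive <archive_dir> <skill_name>...
# (hypothetical helper; returns non-zero if any file is missing)
check_archive() {
    archive="$1"; shift
    missing=0
    for name in "$@"; do
        if [ -r "$archive/$name/SKILL.md" ]; then
            echo "ok       $name"
        else
            echo "MISSING  $name"
            missing=1
        fi
    done
    return "$missing"
}

check_archive "$HOME/.openclaw/skills-archive" \
    stripe-billing supabase-manage resend-email deploy-pipeline compliance-audit \
    || echo "Fix the entries marked MISSING before relying on on-demand loading."
```

This only verifies the files on disk; it is still worth asking the agent to perform a task that needs each skill to confirm it reads the file as intended.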

Real-world results

From a production 12-agent OpenClaw factory running GLM-5:

Metric	Before	After
Avg prompt tokens per call	125,665	19,492
Avg cost per call	$0.088	$0.0097
Cache hit rate	35.8%	65.3%
Monthly projection (40 calls/day)	$105.96	$11.65
Savings		89%
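The percentages and monthly projections in the table can be reproduced from the raw figures. The sketch below assumes a 30-day month; `pct_saved` and `monthly_usd` are hypothetical helper names.

```shell
#!/bin/sh
# Percentage reduction between a "before" and an "after" figure.
# Usage: pct_saved <before> <after>   (hypothetical helper)
pct_saved() {
    awk -v b="$1" -v a="$2" 'BEGIN { printf "%.1f\n", (b - a) / b * 100 }'
}

# Monthly cost from a per-call cost, assuming a 30-day month.
# Usage: monthly_usd <usd_per_call> <calls_per_day>   (hypothetical helper)
monthly_usd() {
    awk -v c="$1" -v n="$2" 'BEGIN { printf "%.2f\n", c * n * 30 }'
}

pct_saved 125665 19492     # prompt tokens per call
pct_saved 105.96 11.65     # monthly cost in USD
monthly_usd 0.088 40       # before
monthly_usd 0.0097 40      # after
```

This gives an 84.5% token reduction and 89.0% cost saving, with monthly figures of $105.60 and $11.64. The few cents of difference from the table are most likely because the per-call costs shown there are rounded.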

Which should you choose?

Use the Skill-Router if:

You run a single agent (no sub-agents)
You want one file to maintain
You don't mind 800 tokens on every call

Use the IDENTITY.md Relay if:

You run multiple agents with an orchestrator
You want the absolute lowest token cost
Your sub-agents run in sandboxed containers

Use neither (keep skills loaded) if:

You have fewer than 3 custom skills
Token cost isn't a concern
You need guaranteed skill availability on every call