Agent Skills Series (1): From “it runs” to “it matches how we work”
About this series
Plenty of people have heard of Agent Skills, or have installed a few in OpenClaw, yet still can’t pin down what a Skill is, which problems it fits, what should become a Skill, how to write one, or how to roll one out on a team. Only after those basics do the harder questions arrive: fragmentation across many tools and packages, plus versioning and collaboration.
This series follows that thread. This post uses one slice of day-to-day coding friction as a hook. Next come SKILL.md, skill-base (GitHub · website) and skb, then choosing topics and triggers; the Fragmentation piece covers trigger clashes and keeping a single source of truth. Treat 1 → 2 → 3 as the spine; part 4 and the fragmentation piece can be read in between as needed.
Series map (sibling posts)
| Part | Post | What it covers |
|---|---|---|
| 2 | Concepts in practice | What SKILL.md looks like and how to write it |
| 3 | Tooling | skill-base / skb distribution and install |
| 4 | What deserves a Skill | What content is worth a Skill; design-side notes |
| 5 | Admin query page case study | Turning implicit project norms into an executable Skill |
| 6 | Operating playbook | After launch: governing triggers, conflicts, and versions |
| Extra | Fragmentation | Triggers and governance with many tools and many Skills |
💔 A scene you’ve seen a hundred times
Wednesday afternoon. You’re wrapping up the customer list filters and ask the AI to spit out a query.
A minute later you get code that is syntactically fine and common in real projects:
```ts
// AI output that “looks professional”
import { db } from './db'; // the project’s Prisma client (path varies by repo)

export async function listCustomers(params: {
  page: number;
  pageSize: number;
  keyword?: string;
}) {
  return db.customer.findMany({
    where: {
      name: params.keyword ? { contains: params.keyword } : undefined,
    },
    orderBy: { updatedAt: 'desc' },
    skip: (params.page - 1) * params.pageSize,
    take: params.pageSize,
  });
}
```
Hold it against your team’s rules — the checklist blows up:
- You’re multi-tenant: every query must be scoped with `tenantId`; this snippet skips it
- Admin users can cross tenants, but only via an explicit `withTenantScope()`, never a silent full-table scan
- Export and list endpoints must share the same filter object so you don’t get “50 rows on screen, 50k in the export”
- Audit wants `actorId` plus a digest of query conditions; this path never hooks auditing
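What would a version that bakes those rules in look like? A minimal sketch, showing only the shared WHERE-builder; `CustomerFilter` and `buildCustomerWhere` are illustrative names, not this team’s real API:

```typescript
// Hypothetical shared filter object: list and export endpoints both derive
// their WHERE clause from the same function, so the two can never drift apart
// (“50 rows on screen, 50k in the export” becomes impossible by construction).
type CustomerFilter = { keyword?: string };

function buildCustomerWhere(filter: CustomerFilter, tenantId: string) {
  return {
    tenantId, // tenant scoping is applied by construction, not by memory
    ...(filter.keyword ? { name: { contains: filter.keyword } } : {}),
  };
}

// Both call sites pass the same filter object:
//   db.customer.findMany({ where: buildCustomerWhere(filter, tenantId), ... })
//   exportCustomers(buildCustomerWhere(filter, tenantId))
```

The point is not this particular helper; it is that the rules live in one place the AI never gets to skip.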
You won’t end up in tears, but you know the feeling all too well: the demo is fine, the real system isn’t safe, and the next half hour goes into scoping, audit, and export alignment.
It isn’t trying to spite you — it simply doesn’t know how we wire things by default.
😤 Worse: when the whole team uses AI
Think solo pain is bad?
Try ten engineers all leaning on AI:
Three months later, the codebase is often several “locally reasonable” stacks stitched together:
| Author | Lists / tables | Data fetching | Forms & validation |
|---|---|---|---|
| A | Hand-rolled <table> + paging | Raw fetch | Local useState soup |
| B | Enterprise DataGrid | react-query + custom keys | react-hook-form + zod |
| C | Admin template ProTable | Wrapped useRequest | Form.Item rules scattered |
Nobody can answer in three minutes: which pattern should a new page copy? Review comments drift from “bug here” to “this doesn’t look like us.”
You still pay for tools — then pay the time back on “align to spec” and “restyle.”
🤔 So what’s actually wrong?
The model isn’t dumb.
If anything, it’s too clever — clever in a generic way, and generic means averaged.
It doesn’t know your team’s:
- 🏢 Business jargon: what does “settlement transfer bill” mean? What is “green-channel drive”? What is “anti-association”?
- 🎨 Product templates: dialogs always use `DialogV2`, lists always use `ProTable`
- 📐 Code conventions: must use TypeScript, must use `async/await`, `var` is forbidden
- 🔧 Tech choices and boundaries: queries must go through `withTenantScope()`, amounts must use `Decimal`, high-risk operations must attach audit hooks
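To make the last kind of boundary concrete, here is a minimal sketch; the real `withTenantScope()` is project-specific, and the `Scope` union and audit record shape are assumptions made up for illustration:

```typescript
// Illustrative only: forces every query to declare its scope up front.
type Scope =
  | { tenantId: string }                    // normal, tenant-scoped read
  | { crossTenant: true; actorId: string }; // explicit admin escalation

const auditLog: { actorId: string; action: string }[] = [];

function withTenantScope<T>(scope: Scope, query: (where: object) => T): T {
  if ('crossTenant' in scope) {
    // Cross-tenant reads are allowed, but only because the caller asked
    // explicitly, and the actor is recorded for the audit trail.
    auditLog.push({ actorId: scope.actorId, action: 'cross-tenant-read' });
    return query({});
  }
  return query({ tenantId: scope.tenantId });
}
```

A generic model cannot invent a boundary like this: it is a team decision, not a pattern it can average its way into.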
The model has no memory of your team.
Every thread is a stranger landing on a greenfield README.
💡 What do people try?
You might say: “I’ll paste all the rules every time.”
Sure — until you burn out.
Others push norms into Cursor Rules, AGENTS.md / CLAUDE.md at the repo root, or a README section aimed at AI: at least you’re not dictating from zero each chat. The catch: many projects, thick rules — those files turn into “collections of short essays.” If you dump the whole pack into context every time, the window fills with policy first and the code or logs you actually need get squeezed out. Saving tokens means hand-picking snippets to paste — the carrying cost moved from the chat box to the doc box.
```text
You: Remember, every query must carry tenantId
You: And admin cross-tenant must explicitly use withTenantScope, no shortcuts
You: And export and list must share the same filters
You: And sensitive reads must write audit logs
You: And remember…
You: Forget it, I’ll type this myself
```
Is there a way for the AI to already know our defaults?
That’s what an Agent Skill is for.
🎯 What is an Agent Skill?
In short: structured team norms that tell the model:
In our company / team, when scenario X shows up, handle it with Y.
A common shape (e.g. Claude Skills) is a skill directory: the root must have SKILL.md (YAML front matter with name / description), plus optional references/, scripts/, and other materials loaded on demand — easier to maintain and less context-hungry than pasting a giant prompt every session.
Think employee handbook for the model: what triggers it, what’s off-limits, examples you can copy.
With it, round one lands much closer to your conventions — fewer “no, not like that” loops.
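A minimal sketch of that shape; the directory name, skill name, and body text are made up for illustration, and only `name` / `description` in the front matter come from the conventions described above:

```text
tenant-scoped-queries/
├── SKILL.md          # required: YAML front matter + instructions
├── references/       # optional: longer docs, loaded on demand
└── scripts/          # optional: helper scripts

# SKILL.md
---
name: tenant-scoped-queries
description: "How we write list/export queries: tenant scoping, shared filters, audit hooks."
---
Every query goes through withTenantScope(); list and export must share one
filter object; sensitive reads attach an audit record with actorId...
```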
📣 What’s next?
Pick up with Concepts in practice; other sibling posts are in the table above, and official field and folder conventions are linked at the end.
skill-base: website · GitHub — private team Agent Skill knowledge base.
Further reading
- Anthropic: Agent Skills overview: `SKILL.md` fields and directory layout per official docs; IDE and tool behavior per vendor.