Agent Skills Series (1): From “it runs” to “it matches how we work”

About this series

Plenty of people have heard of Agent Skills, or installed a few in OpenClaw, yet still can’t pin down what they are, what problems they fit, what should become a Skill, how to write one, or how to roll Skills out on a team. Only after that do the harder questions arrive: fragmentation across many tools and packages, plus versioning and collaboration.

This series follows that thread. This post uses one slice of day-to-day coding friction as a hook. Next come SKILL.md, the skill-base tooling (GitHub · website) and skb, and how to choose topics and triggers; the Fragmentation extra covers trigger clashes and keeping a single source of truth. Treat 1 → 2 → 3 as the spine; part 4 and the fragmentation piece can be read in between as needed.


Series map (sibling posts)

Part | Post | What it covers
--- | --- | ---
2 | Concepts in practice | What SKILL.md looks like and how to write it
3 | Tooling | skill-base / skb distribution and install
4 | What deserves a Skill | What content is worth a Skill; design-side notes
5 | Admin query page case study | Turning implicit project norms into an executable Skill
6 | Operating playbook | After launch: governing triggers, conflicts, and versions
Extra | Fragmentation | Triggers and governance with many tools and many Skills

💔 A scene you’ve seen a hundred times

Wednesday afternoon. You’re wrapping up the customer list filters and ask the AI to spit out a query.

A minute later you get code that is syntactically fine and common in real projects:

// AI output that “looks professional”
export async function listCustomers(params: {
  page: number;
  pageSize: number;
  keyword?: string;
}) {
  return db.customer.findMany({
    where: {
      name: params.keyword ? { contains: params.keyword } : undefined,
    },
    orderBy: { updatedAt: 'desc' },
    skip: (params.page - 1) * params.pageSize,
    take: params.pageSize,
  });
}

Hold it against your team’s rules — the checklist blows up:

  • You’re multi-tenant: every query must be scoped with tenantId; this snippet skips it
  • Admin users can cross tenants, but only via explicit withTenantScope() — never a silent full-table scan
  • Export and list endpoints must share the same filter object so you don’t get “50 rows on screen, 50k in the export”
  • Audit wants actorId plus a digest of query conditions; this path never hooks auditing

You won’t end up in tears, but you know the feeling cold: the demo is fine; the real system isn’t safe — the next half hour goes into scoping, audit, and export alignment.

It isn’t trying to spite you — it simply doesn’t know how we wire things by default.
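For contrast, here is a minimal sketch of what the team-compliant shape might look like. Everything in it is illustrative: withTenantScope and writeAudit are hypothetical stand-ins for whatever helpers your codebase actually provides, and the builder name is made up. The point is the shape, not the API.

```typescript
// Hypothetical sketch: withTenantScope() and writeAudit() stand in for
// whatever your codebase actually has; only the shape is the point.

type ListParams = { page: number; pageSize: number; keyword?: string };

// One filter object, built in one place and shared by the list endpoint
// and the export endpoint, so "50 rows on screen, 50k in the export"
// cannot happen.
export function buildCustomerWhere(tenantId: string, params: ListParams) {
  return {
    tenantId, // every query carries the tenant scope
    ...(params.keyword ? { name: { contains: params.keyword } } : {}),
  };
}

// The query path then wires scoping and audit around that shared filter:
//
//   const where = buildCustomerWhere(ctx.tenantId, params);
//   await writeAudit({ actorId: ctx.actorId, digest: hash(where) });
//   return withTenantScope(ctx, () =>
//     db.customer.findMany({
//       where,
//       orderBy: { updatedAt: 'desc' },
//       skip: (params.page - 1) * params.pageSize,
//       take: params.pageSize,
//     }));
```

The pure builder is the piece both list and export import; the scoping and audit wiring lives once in the query path instead of being re-remembered per endpoint.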


😤 Worse: when the whole team uses AI

Think solo pain is bad?

Try ten engineers all leaning on AI:

Three months later, the codebase is often several “locally reasonable” stacks stitched together:

Author | Lists / tables | Data fetching | Forms & validation
--- | --- | --- | ---
A | Hand-rolled <table> + paging | Raw fetch | Local useState soup
B | Enterprise DataGrid | react-query + custom keys | react-hook-form + zod
C | Admin template ProTable | Wrapped useRequest | Form.Item rules scattered

Nobody can answer in three minutes: which pattern should a new page copy? Review comments drift from “bug here” to “this doesn’t look like us.”

You still pay for tools — then pay the time back on “align to spec” and “restyle.”


🤔 So what’s actually wrong?

The model isn’t dumb.

If anything, it’s too clever — clever in a generic way, and generic means averaged.

It doesn’t know your team’s:

  • 🏢 Business jargon: What does “settlement transfer bill” mean? What is “green-channel drive”? What is “anti-association”?
  • 🎨 Product templates: dialogs always use DialogV2, lists always use ProTable
  • 📐 Code conventions: must use TypeScript, must use async/await, var is forbidden
  • 🔧 Tech choices and boundaries: queries must go through withTenantScope(), amounts must use Decimal, high-risk operations must attach audit hooks

The model has no memory of your team.

Every thread is a stranger landing on a greenfield README.


💡 What do people try?

You might say: “I’ll paste all the rules every time.”

Sure — until you burn out.

Others push norms into Cursor Rules, AGENTS.md / CLAUDE.md at the repo root, or a README section aimed at AI: at least you’re not dictating from zero each chat. The catch: many projects, thick rules — those files turn into “collections of short essays.” If you dump the whole pack into context every time, the window fills with policy first and the code or logs you actually need get squeezed out. Saving tokens means hand-picking snippets to paste — the carrying cost moved from the chat box to the doc box.

You: Remember, every query must carry tenantId
You: And admin cross-tenant must explicitly use withTenantScope, no shortcuts
You: And export and list must share the same filters
You: And sensitive reads must write audit logs
You: And remember…
You: Forget it, I’ll type this myself

Is there a way for the AI to already know our defaults?

That’s what an Agent Skill is for.


🎯 What is an Agent Skill?

In short: structured team norms that tell the model:

In our company / team, when scenario X shows up, handle it with Y.

A common shape (e.g. Claude Skills) is a skill directory: the root must have SKILL.md (YAML front matter with name / description), plus optional references/, scripts/, and other materials loaded on demand — easier to maintain and less context-hungry than pasting a giant prompt every session.
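As a concrete skeleton of that shape, a skill encoding the query rules from earlier might start like this. The name, description wording, and rule lines are entirely illustrative, invented for this post; only the structure (YAML front matter with name / description, then the body, with heavier material pushed into references/) follows the convention described above.

```markdown
---
name: tenant-safe-queries
description: Team conventions for list queries. Use when writing or reviewing any code that reads from the database.
---

# Tenant-safe queries

- Every query is scoped with tenantId; admin cross-tenant access only via withTenantScope().
- List and export endpoints share one filter object.
- Sensitive reads attach the audit hook: actorId plus a digest of the query conditions.

See references/ for copyable examples.
```

Because the description says when the skill applies, the model can load it only for database-touching work instead of dragging every norm into every session.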

Think employee handbook for the model: what triggers it, what’s off-limits, examples you can copy.

With it, round one lands much closer to your conventions — fewer “no, not like that” loops.


📣 What’s next?

Pick up with Concepts in practice; other sibling posts are in the table above, and official field and folder conventions are linked at the end.


skill-base: website · GitHub — private team Agent Skill knowledge base.


Further reading