Overview
skills is a Claude Code plugin that delivers 13 methodology skills —
structured workflows covering Architecture Decision Records, TDD, contract registries, tech-debt management,
codebase mapping, architecture auditing, multi-layer QA, and UI/UX design. Each skill ships with a
step-by-step workflow, shell helper scripts, and reference guides.
Skills are distributed as a marketplace plugin: declare the marketplace once in a repo's
committed .claude/settings.json and every session — local terminal, web app, or mobile —
installs the same skills automatically.
marketplace.json (dotts-h/claude-skills) → skills (plugin · v0.1.0) → 13 skills (SKILL.md + scripts + refs) → any session (terminal · web · mobile)
Architecture
Repository layout
dotts-h/claude-skills/
├── .claude-plugin/
│   └── marketplace.json          # catalog: registers the plugin
├── plugins/skills/
│   ├── .claude-plugin/
│   │   └── plugin.json           # manifest (name, version, author)
│   └── skills/
│       ├── recording-decisions/
│       │   ├── SKILL.md          # frontmatter + workflow + rules
│       │   ├── references/       # guides & templates (1–5 files)
│       │   └── scripts/          # bash helpers (1–4 scripts)
│       └── … (13 skills total)
├── docs/
│   ├── index.html                # GitHub Pages site
│   └── .nojekyll
└── README.md
Skill anatomy
Every skill follows the same three-part structure:
SKILL.md
YAML frontmatter (name, description, allowed-tools) followed by the full workflow: ordered steps, trigger conditions, rules, and inline pointers to reference material.
references/
Markdown guides, templates, and pattern libraries the skill cites — typically 1–5 files. These are the knowledge library Claude reads before acting.
scripts/
Bash helpers the skill runs: allocating numbers, mining signals, running quality gates, computing metrics. Deterministic and idempotent where possible.
SKILL.md frontmatter
---
name: recording-decisions
description: >
  Propose, number, and manage Architecture Decision Records (ADRs)
  using the MADR-lite format under docs/adr/.
allowed-tools: Read, Write, Edit, Bash, Grep, Glob
---
## Workflow
…
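As a rough illustration of how tooling might read this frontmatter, the sketch below pulls a single-line value out of the leading `---` block. The function name `frontmatter_value` is hypothetical and not part of the plugin; it assumes the frontmatter starts on the first line and does not handle folded values such as the `description: >` block.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not shipped with the plugin.

# Print the value of a top-level, single-line frontmatter key from a SKILL.md file.
frontmatter_value() {
  local file="$1" key="$2"
  awk -v key="$key" '
    NR == 1 && $0 == "---" { inside = 1; next }   # opening delimiter
    inside && $0 == "---"  { exit }               # closing delimiter
    inside && index($0, key ":") == 1 {
      sub(/^[^:]*:[ \t]*/, "")                    # drop "key:" and leading blanks
      print
      exit
    }
  ' "$file"
}
```

For example, `frontmatter_value plugins/skills/skills/recording-decisions/SKILL.md name` would print the skill name declared in that file.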
Marketplace distribution
The .claude-plugin/marketplace.json file registers the repo as a named marketplace
(ori). The plugin.json manifest inside the plugin declares
what skills it provides. Cloud sessions read a repo's committed .claude/settings.json
and install declared plugins automatically at session start — no local setup needed for web or mobile.
Skills in ~/.claude/skills live only on one machine. A marketplace declared in
.claude/settings.json travels with the repo: commit it once and every collaborator's
cloud session gets the same skills.
Installation
Via settings.json — persists across all sessions
Add to the repo's committed .claude/settings.json:
{
  "extraKnownMarketplaces": {
    "ori": {
      "source": { "source": "github", "repo": "dotts-h/claude-skills" }
    }
  },
  "enabledPlugins": { "skills@ori": true }
}
Commit and push. Any cloud session for that repo installs the plugin automatically at session start.
Interactively in a local session
/plugin marketplace add dotts-h/claude-skills
/plugin install skills@ori
Invoking a skill
Once installed, invoke any skill by typing its name as a slash command:
/recording-decisions
/practicing-tdd
/exploring-quality
Each skill reads docs/CONVENTIONS.md first (if it exists in your repo), then runs its workflow.
Skills
13 skills across 4 groups. Each entry lists the skill's description, workflow, scripts, and reference files.
Docs & Knowledge
recording-decisions
MADR-lite Architecture Decision Records under docs/adr/
Propose, number, and manage Architecture Decision Records (ADRs) using the MADR-lite format. ADRs capture the why behind significant decisions: framework choices, patterns, naming conventions, persistence strategies. Once accepted, an ADR is immutable; a changed decision is recorded as a new ADR that supersedes the old one, never as an edit.
Invoke when
you make a significant architectural or design decision worth preserving
Workflow
- Read docs/CONVENTIONS.md for any project-specific ADR rules
- Run new-adr.sh to allocate the next sequential number and create a stub file
- Write the ADR body: context, decision, consequences, and alternatives considered
- Mark status as Accepted
- Run relink.sh if any supersedes/superseded-by links need updating
Scripts
| Script | Purpose |
| new-adr.sh | Allocate the next ADR number and create a stub at docs/adr/NNNN-kebab-title.md |
| reindex.sh | Renumber all ADR files sequentially after deletions or re-ordering |
| relink.sh | Fix supersedes / superseded-by cross-links across all ADR files |
Reference files
| File | Content |
| madr-template.md | MADR-lite frontmatter + body template (status, deciders, date, context, decision, consequences) |
| styles.md | Comparison of MADR vs Nygard ADR styles — when to use each |
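The allocation step in the workflow above can be sketched as follows. This is a minimal illustration, not the actual new-adr.sh: the function names `next_adr_number` and `new_adr`, the four-digit padding, and the stub contents (including the initial `Status: Proposed` line) are assumptions based on the `NNNN-kebab-title.md` naming shown in the scripts table.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of sequential ADR allocation; not the real new-adr.sh.

# Print the next zero-padded ADR number for the given directory.
next_adr_number() {
  local dir="$1" max=0 file base num
  for file in "$dir"/[0-9][0-9][0-9][0-9]-*.md; do
    [ -e "$file" ] || continue                    # glob matched nothing
    base="${file##*/}"                            # strip the directory
    num="$(printf '%s' "${base%%-*}" | sed 's/^0*//')"
    num="${num:-0}"                               # "0000" becomes 0
    if [ "$num" -gt "$max" ]; then
      max="$num"
    fi
  done
  printf '%04d\n' $((max + 1))
}

# Create a stub named NNNN-kebab-title.md and print its path.
new_adr() {
  local dir="$1" title="$2" slug number path
  mkdir -p "$dir"
  slug="$(printf '%s' "$title" | tr '[:upper:]' '[:lower:]' \
    | tr -cs 'a-z0-9' '-' | sed 's/^-*//; s/-*$//')"
  number="$(next_adr_number "$dir")"
  path="$dir/$number-$slug.md"
  # Assumed stub layout; the real template lives in madr-template.md.
  printf '# %s. %s\n\nStatus: Proposed\n' "$number" "$title" > "$path"
  printf '%s\n' "$path"
}
```

Computing the number from the files on disk, rather than from memory, keeps allocation deterministic: running `new_adr docs/adr "Use PostgreSQL for storage"` always picks one past the highest existing number.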
registering-contracts
Interface, route, event & schema registry in docs/CONTRACTS.md
Maintain docs/CONTRACTS.md — the single registry of all stable promises between components. Covers Go interfaces, HTTP routes, SSE event types, config schemas, and invariants. Every entry carries a stability marker: stable, internal, or experimental. Breaking a stable contract requires an ADR before the change is made.
Invoke when
adding or changing a public interface, route, event type, or schema
Workflow
- Classify the contract type (interface / route / event / schema / invariant)
- Run extract-interfaces.sh to cross-check all exported Go interfaces against the registry
- Add or update the row in CONTRACTS.md with the correct stability marker
- For stable → breaking changes: write an ADR first, then update the contract entry
Scripts
| Script | Purpose |
| extract-interfaces.sh | Grep exported interfaces from Go source to verify registry coverage |
| ensure-doc.sh | Create CONTRACTS.md with the standard stub if it doesn't exist |
Reference files
| File | Content |
| contract-entry-template.md | Column definitions and example rows for each contract type |
| detecting-drift.md | How to spot contract drift at code-review time; checklist of drift signals |
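A registry cross-check of this kind could look roughly like the sketch below. It is an assumption about the approach, not the contents of extract-interfaces.sh: the function names are hypothetical, and the grep pattern only catches single-line `type Name interface` declarations, so grouped `type ( ... )` blocks and generic interfaces are missed.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; the real extract-interfaces.sh may work differently.

# List exported Go interface names declared under a source root.
list_exported_interfaces() {
  local root="$1"
  grep -rhE --include='*.go' \
    '^type[[:space:]]+[A-Z][A-Za-z0-9_]*[[:space:]]+interface' "$root" \
    | awk '{ print $2 }' | sort -u
}

# Print interfaces found in code that the registry file never mentions.
missing_from_registry() {
  local root="$1" registry="$2" name
  list_exported_interfaces "$root" | while IFS= read -r name; do
    grep -qw -- "$name" "$registry" || printf '%s\n' "$name"
  done
}
```

Running `missing_from_registry . docs/CONTRACTS.md` would then list candidates for new registry rows; an empty result suggests the registry covers every interface the pattern can see.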
maintaining-conventions
The project constitution — docs/CONVENTIONS.md
Curate docs/CONVENTIONS.md — the central rulebook that every other methodology skill reads before acting. Contains copy-pasteable commands, exact file paths, coverage floors, merge strategy, naming rules, and environment facts. Rules are not paraphrased; they are copy-pasteable and machine-checkable. Every enforceable rule links back to its ADR.
Invoke when
adding a workflow rule, updating a quality gate, or codifying a repeated pattern
Workflow
- Run harvest.sh to mine existing TODOs, patterns, and config for convention candidates
- Propose the convention in the correct section (see convention-categories.md for the required section set)
- Add an ADR backlink if the convention is enforceable
- Run check.sh to validate all required sections are present
Scripts
| Script | Purpose |
| harvest.sh | Mine TODOs, FIXMEs, config keys, and naming patterns to surface convention candidates |
| check.sh | Validate CONVENTIONS.md has all required sections and no broken ADR links |
Reference files
| File | Content |
| convention-categories.md | Required sections and their content contracts: workflow, architecture, testing, quality gates, persistence, environment, naming/style |
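The section validation could be approximated as below. This is a sketch under assumptions: the section names come from the convention-categories.md row above, but the function name `check_sections` and the idea that each section appears as a Markdown heading containing that name are guesses, and the ADR link check is left out.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; heading format and matching rules are assumptions.

# Print each required section with no matching heading.
# Returns non-zero if any section is missing.
check_sections() {
  local file="$1" missing=0 section
  for section in "Workflow" "Architecture" "Testing" "Quality gates" \
                 "Persistence" "Environment" "Naming"; do
    if ! grep -qiE "^#+[[:space:]]+.*${section}" "$file"; then
      printf 'missing section: %s\n' "$section"
      missing=1
    fi
  done
  return "$missing"
}
```

Because the function returns a non-zero status on any gap, `check_sections docs/CONVENTIONS.md` can be used directly as a gate in a script or CI step.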
logging-learnings
Fixed bugs, dead-ends & gotchas in docs/REGRESSIONS.md
Log fixed bugs and dead-ends to docs/REGRESSIONS.md (or LEARNINGS.md) so the same mistake can't be repeated silently. Three entry types: Fixed (symptom / root-cause / fix / guard-test), Dead-end (tried / why-failed / instead), and Gotcha (unexpected behavior or constraint). Every fixed bug must name the guarding test that prevents recurrence.
Invoke when
after fixing a bug, hitting a dead-end, or discovering an unexpected gotcha
Workflow
- Run ensure-doc.sh to create the stub if REGRESSIONS.md doesn't exist
- Classify the entry type (Fixed / Dead-end / Gotcha)
- For bug fixes: name the guarding test — this is mandatory
- Write the entry from the appropriate template and append to the document
Scripts
| Script | Purpose |
| ensure-doc.sh | Create REGRESSIONS.md with standard stub and section headers if the file is missing |
Reference files
| File | Content |
| entry-templates.md | Complete templates for Fixed, Dead-end, and Gotcha entry types |
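The create-if-missing behavior can be sketched like this. The function name `ensure_doc` and the stub headings are assumptions (the headings simply mirror the three entry types named above); the point of the sketch is the idempotence, since an existing file is never overwritten.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; the real ensure-doc.sh stub may differ.

# Create the document with a stub only when it does not exist yet.
ensure_doc() {
  local path="$1"
  if [ -e "$path" ]; then
    return 0                      # idempotent: keep existing entries untouched
  fi
  mkdir -p "$(dirname "$path")"
  cat > "$path" <<'EOF'
# Regressions

## Fixed

## Dead-ends

## Gotchas
EOF
}
```

Calling `ensure_doc docs/REGRESSIONS.md` twice has the same effect as calling it once, which makes it safe to run at the start of every workflow.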
mapping-codebases
Module layout & data-flow map in docs/CODEBASE_MAP.md
Generate docs/CODEBASE_MAP.md — a module table, primary data path (as an arrow chain), and seam identification. The map is an index, not an encyclopedia: it answers "where does X live and how does data move?" then points to ARCHITECTURE.md and CONTRACTS.md for depth. Distinguishes pure-core modules from thin-edge adapters.
Invoke when
onboarding, before a cross-cutting refactor, or when "where does X live?" keeps coming up
Workflow
- Run module-inventory.sh to get LOC and exported symbol count per package
- Write the module table: path, purpose, LOC, exports, pure-core or thin-edge
- Trace the primary data path from entry point to output as an arrow chain
- Identify the main seams — interfaces where mocks exist or should exist
Scripts
| Script | Purpose |
| module-inventory.sh | Compute LOC and exported symbol count per Go package / module |
Reference files
| File | Content |
| map-template.md | Full CODEBASE_MAP.md template with all required sections and example rows |
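A per-package inventory might be computed along the lines below. This is only a sketch: `module_inventory` is a hypothetical name, test files are excluded by assumption, and the export count is a rough grep for top-level declarations with an uppercase name, so exported methods and grouped declarations are not counted.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; the real module-inventory.sh may count differently.

# Print "<dir> <loc> <exported>" for each directory holding non-test Go files.
module_inventory() {
  local root="$1" dir loc exported
  find "$root" -type f -name '*.go' ! -name '*_test.go' -exec dirname {} \; |
  sort -u |
  while IFS= read -r dir; do
    # Total lines across the package's non-test files.
    loc="$(find "$dir" -maxdepth 1 -type f -name '*.go' ! -name '*_test.go' \
      -exec cat {} + | wc -l | tr -d ' ')"
    # Rough export count: top-level declarations with an uppercase name.
    exported="$(find "$dir" -maxdepth 1 -type f -name '*.go' ! -name '*_test.go' \
      -exec cat {} + | grep -cE '^(func|type|var|const)[[:space:]]+[A-Z]')"
    printf '%s %s %s\n' "$dir" "$loc" "$exported"
  done
}
```

The output rows map directly onto the LOC and exports columns of the module table; the purpose and pure-core or thin-edge columns still need human judgment.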
Process & Architecture
practicing-tdd
Test-first red → green → refactor cycle
Drive development test-first: write a failing test that defines the requirement, write the minimum code to make it pass, refactor under green, then run all quality gates. Tests are written through the seam (the copilot.Client mock interface) — never against internals. Table-driven tests are the default.
Invoke when
starting any feature or bug fix
Workflow
- Write the failing test first — assert on events/state, not on internal function calls
- Confirm it's red before writing any product code (never skip this)
- Write the minimum code to make it green
- Confirm green
- Refactor with confidence — tests stay green throughout
- Run gate.sh to execute all project lint, test, and coverage gates
Scripts
| Script | Purpose |
| gate.sh | Auto-detect and run all project quality gates (lint, test, coverage floor) in one command |
Reference files
| File | Content |
| testing-through-the-seam.md | Seam-based unit testing: MockClient pattern, what to assert on, what to avoid |
| red-green-refactor.md | Table-driven test structure, discipline rules, and common TDD mistakes |
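Gate auto-detection could work roughly as sketched below. Everything here is an assumption about how such a script might be built: the function names are hypothetical, detection keys off `go.mod` and `package.json` marker files, and the coverage-floor check is omitted because the floor lives in each project's CONVENTIONS.md.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; the real gate.sh detection rules are not shown in this page.

# Print the gate commands that apply to a project, based on marker files.
detect_gates() {
  local root="$1"
  if [ -f "$root/go.mod" ]; then
    printf '%s\n' 'go vet ./...' 'go test -race -cover ./...'
  fi
  if [ -f "$root/package.json" ]; then
    printf '%s\n' 'npm test'
  fi
}

# Run each detected gate in order and stop at the first failure.
run_gates() {
  local root="$1" gate
  detect_gates "$root" | while IFS= read -r gate; do
    printf '==> %s\n' "$gate"
    # Redirect stdin so a gate cannot swallow the remaining gate list.
    (cd "$root" && sh -c "$gate" </dev/null) || exit 1
  done
}
```

Separating detection from execution keeps the list of gates inspectable: `detect_gates .` shows what would run without running anything.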
managing-tech-debt
Prioritized debt register in docs/TECH_DEBT.md
Record shortcuts, legacy knots, and architecture drift in docs/TECH_DEBT.md. Each row captures description, location, severity, effort, and — most critically — interest (ongoing drag × probability of triggering). Rank by interest, not severity: a medium-severity item that slows every PR has higher priority than a severe item you'll never touch.
Invoke when
taking a shortcut, discovering untracked debt, or reviewing before a release
Workflow
- Run debt-scan.sh to sweep for TODO/FIXME/HACK with file and line context
- Classify each item (architecture / testing / docs / tooling / performance)
- Score severity, effort, and interest (ongoing drag × likelihood)
- Add rows to TECH_DEBT.md; set a paydown trigger condition for each item
Scripts
| Script | Purpose |
| debt-scan.sh | Grep for TODO/FIXME/HACK across the codebase with file and line context |
| ensure-doc.sh | Create TECH_DEBT.md with the standard stub if missing |
Reference files
| File | Content |
| debt-register-template.md | Table column definitions, scoring rubric, and example rows |
| prioritization.md | Interest = probability × ongoing drag; how to score and rank the register |
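To make the interest formula concrete, the sketch below ranks items by ongoing drag × probability. The input format, the 1 to 5 drag scale, the 0 to 1 probability scale, and the name `rank_by_interest` are all illustrative assumptions; the actual rubric is defined in debt-register-template.md and prioritization.md.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; scales and input format are assumptions.

# Rank debt items by interest = ongoing drag x probability of triggering.
# Reads lines of "<item>|<drag>|<probability>" on stdin and prints
# "<interest><TAB><item>", highest interest first.
rank_by_interest() {
  local tab
  tab="$(printf '\t')"
  LC_ALL=C awk -F'|' 'NF == 3 { printf "%.2f\t%s\n", $2 * $3, $1 }' |
    LC_ALL=C sort -t "$tab" -k1,1nr
}
```

With these assumed scales, a moderate item that triggers on nearly every PR (drag 3, probability 0.9, interest 2.70) ranks above a severe item that is almost never touched (drag 5, probability 0.1, interest 0.50):

```shell
printf '%s\n' 'legacy importer rewrite|5|0.1' 'slow CI setup|3|0.9' | rank_by_interest
```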
improving-architecture
Structural assessment & refactoring — always via ADRs
Assess module boundaries, dependency direction, and coupling. All structural changes are proposed as ADRs — never refactored silently. Checks structure: layer violations, forbidden imports, hidden coupling. Distinct from bug-finding (/code-review) and line-level cleanup (/simplify).
Invoke when
coupling concerns surface, before a large refactor, or after mapping the codebase
Workflow
- Run deps-check.sh for a per-package import summary and forbidden cross-layer imports
- Identify coupling smells using the coupling-smells.md symptom table
- Check dependency direction against the allowed rules in dependency-rules.md
- Propose the structural change as an ADR using the refactor-as-adr.md template
- Implement under the ADR — the decision record is the paper trail
Scripts
| Script | Purpose |
| deps-check.sh | Per-package import summary; flags imports that violate layer dependency rules |
Reference files
| File | Content |
| coupling-smells.md | Symptom → root cause table for common coupling problems |
| dependency-rules.md | Allowed import directions by architectural layer |
| refactor-as-adr.md | Template for recording structural changes as Architecture Decision Records |
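A forbidden-import check can be approximated with grep, as in the sketch below. The function name `forbidden_imports` and the example pattern are hypothetical; the real layer rules live in dependency-rules.md. The pattern is a line-based heuristic, so it only recognizes import lines that contain nothing but an optional `import` keyword, an optional alias, and the quoted path.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a layer check; the rules passed in are assumptions.

# Print "file:line:text" for each import in pkg_dir whose path matches
# the forbidden pattern (an extended regex such as 'net/http|database/sql').
# Exit status follows grep: 0 when a violation is found, 1 when none is.
forbidden_imports() {
  local pkg_dir="$1" pattern="$2" regex
  # Matches `import "x"`, `alias "x"`, and bare `"x"` lines in an import block.
  regex='^[[:space:]]*(import[[:space:]]+)?([A-Za-z_.][A-Za-z0-9_.]*[[:space:]]+)?"('"$pattern"')"[[:space:]]*$'
  grep -rnE --include='*.go' "$regex" "$pkg_dir"
}
```

For instance, `forbidden_imports internal/config 'net/http|database/sql'` (with a hypothetical package path) would surface I/O imports inside a package that is meant to stay pure.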
auditing-code-quality
Patterns & antipatterns review against project idioms
Review code against Go idioms, the project's pure-core/thin-edges architecture, seam discipline, error handling patterns, and naming conventions. Produces a ranked findings report with file and line references. Distinct from bug-finding (/code-review) and mechanical formatting cleanup (/simplify).
Invoke when
pre-merge quality gate, periodic code health check, or onboarding audit
Key antipatterns (ranked by cost)
- Seam leak — production type used directly in tests instead of the interface
- Punted error — error swallowed, logged-and-continued, or returned as nil
- Impure core — telemetry/config/ctxforge package importing business logic
- Voodoo constants — magic numbers or strings with no named constant
- Inconsistent terminology — same concept named differently across packages
- Nondeterminism — map iteration, goroutine ordering, or time.Now() in pure functions
- Premature abstraction — interface with one implementation and no test doubles
Scripts
| Script | Purpose |
| smells.sh | Detect seam leaks, punted errors, and naming drift via grep patterns |
Reference files
| File | Content |
| antipatterns-catalog.md | Ranked catalog with symptoms, root causes, and recommended fixes |
| go-patterns.md | Idiomatic Go patterns the project follows |
| project-idioms.md | Project-specific naming conventions and style rules |
Quality Engineering
hardening-tests
SDET audit — coverage gaps, weak assertions, flakes
Audit the test suite as an SDET: identify coverage gaps, assess assertion strength, hunt flaky tests, and detect missing edge/property/fuzz cases. This skill doesn't write product code; it only strengthens tests. Key insight: a line that ran is not a line that was checked; mutation testing is the real coverage bar.
Invoke when
pre-release hardening, after a regression, or when mutation scores are low
Workflow
- Scan for coverage gaps: uncovered code paths and missing error branches
- Assess assertion strength — are tests checking behavior or just "no panic"?
- Run flake-hunt.sh to detect non-deterministic failures under the -race flag
- Run mutation-run.sh to identify surviving mutants (undertested logic)
- Write guard tests for any fixed bugs that are missing coverage
Scripts
| Script | Purpose |
| flake-hunt.sh | Run tests N times with -race; collect and report non-deterministic failures |
| mutation-run.sh | Run mutation testing; report surviving mutants as undertested logic |
Reference files
| File | Content |
| assertion-strength.md | Weak vs. strong assertion patterns; how to upgrade from "no panic" to behavioral checks |
| flake-hunting.md | Isolation strategies, common root causes, and fixes for flaky tests |
| mutation-testing.md | Reading mutation results, typical survivors, and how to kill them |
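The repeat-and-count idea behind flake hunting can be sketched as follows. This is not the actual flake-hunt.sh: `flake_hunt` is a hypothetical name, and the sketch only counts failures rather than collecting their output.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; the real flake-hunt.sh also reports failure details.

# Run a command N times and report how many runs failed.
# A count strictly between 0 and N suggests a non-deterministic test.
flake_hunt() {
  local runs="$1" failures=0 i=1
  shift
  while [ "$i" -le "$runs" ]; do
    if ! "$@" >/dev/null 2>&1; then
      failures=$((failures + 1))
    fi
    i=$((i + 1))
  done
  printf '%s/%s runs failed\n' "$failures" "$runs"
  [ "$failures" -eq 0 ]           # non-zero status if any run failed
}
```

A typical invocation for a Go project would be `flake_hunt 20 go test -race -count=1 ./...`, where `-count=1` keeps Go from serving cached test results between runs.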
authoring-tests
e2e · API · performance · accessibility test layers
Write higher-level tests across four layers: Playwright end-to-end (real browser, htmx, SSE), API/contract (escaping, cookies, isolation), performance benchmarks (hot paths, latency), and accessibility (WCAG 2.1/2.2). Locator discipline is enforced: role/test-id selectors and web-first assertions only. The demo server is a shared in-memory session — see demo-gotchas.md for isolation pitfalls.
Invoke when
a feature is complete and needs e2e, API contract, performance, or a11y coverage
Test layers
| Layer | Scope |
| e2e | Real browser, full user flows, htmx DOM updates, SSE message delivery |
| api / contract | HTTP responses, escaping, cookie handling, cross-endpoint isolation |
| perf | Benchmark hot paths, latency under load, memory allocation |
| a11y | WCAG 2.1/2.2: role attributes, keyboard navigation, focus order, contrast ratios |
Scripts
| Script | Purpose |
| init-agents.sh | Scaffold Playwright agent test files for the target layer |
| run-layer.sh | Run a specific test layer: e2e, api, perf, or a11y |
Reference files
| File | Content |
| test-layers.md | Full taxonomy of test layers: what each covers and when to use each |
| playwright-agents.md | Locator discipline, web-first assertions, and the Playwright agents workflow |
| demo-gotchas.md | Shared in-memory demo session pitfalls and test isolation strategies |
exploring-quality
Two-phase exploratory QA → ranked findings report
Two-phase exploratory QA: Phase 1 is breadth — derive the full surface from code, probe all routes headlessly, cheaply. Phase 2 is depth — real browser, curiosity-led investigation using accessibility-tree snapshots to chase anomalies found in Phase 1. Output is a ranked findings report saved to docs/qa/exploratory-<date>.md.
Invoke when
pre-release QA pass, after a major feature, or when automated tests can't answer "is this actually good?"
Workflow
- Phase 1 (breadth): run surface-inventory.sh to extract all routes → run breadth-sweep.sh to probe them headlessly → collect anomalies
- Phase 2 (depth): run launch-demo.sh → open a real browser → investigate top anomalies from Phase 1 with Playwright accessibility-tree snapshots
- Compile ranked findings into docs/qa/exploratory-<date>.md using the findings-report template
Scripts
| Script | Purpose |
| surface-inventory.sh | Extract all routes and endpoints from source code |
| launch-demo.sh | Start the demo server for manual or Playwright-driven testing |
| breadth-sweep.sh | Scripted headless probe of all surfaces found by surface-inventory.sh |
Reference files
| File | Content |
| phase1-breadth.md | Headless sweep methodology: what to check and how to record anomalies |
| phase2-deepdive.md | Real-browser curiosity-led investigation: accessibility-tree technique and chase heuristics |
| findings-report.md | Ranked findings report template (severity / impact / reproduction / recommendation) |
Design
designing-ui-ux
UX heuristics · a11y audit · design tokens · htmx patterns
Full UX/UI design loop: audit UX flows against Nielsen's 10 heuristics and Krug's principles, audit accessibility (WCAG POUR), review visual hierarchy (Refactoring-UI), update design tokens in docs/DESIGN.md, implement, and verify with automated scans. Design tokens as CSS custom properties prevent visual drift. Accessibility wins are locked with automated tests.
Invoke when
designing new UI, auditing existing UX, addressing a11y issues, or updating design tokens
Workflow
- Audit UX flows against Nielsen/Krug heuristics
- Audit accessibility (WCAG POUR: Perceivable, Operable, Understandable, Robust)
- Audit visual hierarchy against Refactoring-UI spacing and typography principles
- Update design token definitions in docs/DESIGN.md (CSS custom properties)
- Implement changes
- Run axe-scan.sh to verify a11y compliance; run screenshot-states.sh to capture visual state
Scripts
| Script | Purpose |
| axe-scan.sh | Run the axe accessibility scanner against the running demo and report violations |
| screenshot-states.sh | Capture key UI states for visual comparison and regression detection |
Reference files
| File | Content |
| ux-heuristics.md | Nielsen's 10 usability heuristics + Krug's "Don't Make Me Think" principles applied to web UX |
| wcag-pour.md | WCAG 2.1/2.2 Perceivable/Operable/Understandable/Robust checklist |
| refactoring-ui.md | Visual hierarchy, spacing, typography, and color contrast principles |
| design-tokens.md | CSS custom property token schema — how to define, document, and use tokens |
| htmx-implementation.md | htmx-specific UX patterns: partial updates, loading states, error feedback |
Core Principles
Shared disciplines enforced across all 13 skills:
Documentation as code
Every enforceable rule links to an ADR (the why). Docs are the source of truth, not copies of code comments.
Immutable decisions
ADRs are never edited once accepted. A changed decision becomes a new ADR that supersedes the old one.
Deterministic tooling
Scripts compute numbers (ADR allocation, module inventory, debt scores) rather than relying on memory or approximation.
Seam discipline
A defined interface separates testable business logic from network adapters. Tests mock the seam, never the implementation.
Pure core, thin edges
Core packages (telemetry, config, ctxforge) stay dependency-free. All I/O lives in thin adapter packages at the edges.
Contracts as registry
All stable promises (interfaces, routes, events, schemas) live in one table. Breaking a stable contract requires an ADR first.
Interest over severity
Tech debt is ranked by interest (ongoing drag × probability), not by severity alone. High-friction items that slow every PR outrank rare catastrophic ones.
Two-phase QA
Breadth first (cheap, headless, full surface), then depth (real browser, curiosity-led, chasing anomalies). Don't go deep before going wide.
Constitution pattern
docs/CONVENTIONS.md is the single rulebook. Every skill reads it before acting. Rules are copy-pasteable, not paraphrased.