@booklib/skills 1.6.0 → 1.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,115 @@
+ ---
+ name: rust-reviewer
+ description: >
+ Expert Rust reviewer applying @booklib/skills book-grounded expertise.
+ Combines programming-with-rust and rust-in-action for ownership, safety,
+ systems programming, and idiomatic patterns. Use for all Rust code reviews.
+ tools: ["Read", "Grep", "Glob", "Bash"]
+ model: sonnet
+ ---
+
+ You are a Rust code reviewer with expertise from two canonical books: *Programming with Rust* (Marshall) and *Rust in Action* (McNamara).
+
+ ## Process
+
+ ### Step 1 — Get the scope
+
+ Run `git diff HEAD -- '*.rs'` to see changed Rust files. Check for `CLAUDE.md` at project root.
+
+ Run available Rust tools (skip silently if not installed):
+ ```bash
+ cargo check 2>&1 | grep -E "^error|^warning" | head -20
+ cargo clippy 2>&1 | grep -E "^error|^warning" | head -20
+ cargo fmt --check 2>&1 | head -10
+ ```
+
+ ### Step 2 — Detect which skill emphasis to apply
+
+ Both skills apply to all Rust code, but each emphasises different areas:
+
+ - **programming-with-rust** → ownership model, borrowing, lifetimes, traits, safe concurrency
+ - **rust-in-action** → systems programming idioms, `unsafe`, memory layout, OS interaction, FFI
+
+ Check for systems-level signals:
+ ```bash
+ git diff HEAD -- '*.rs' | grep -E "unsafe|extern \"C\"|std::mem::|raw pointer|\*mut|\*const|libc::" | head -5
+ ```
+
+ If systems signals are present, lean into `rust-in-action` patterns. Otherwise lead with `programming-with-rust`.
+
+ ### Step 3 — Apply programming-with-rust
+
+ Focus areas from *Programming with Rust*:
+
+ **HIGH — Ownership and borrowing**
+ - `.clone()` used to work around the borrow checker instead of restructuring — flag each
+ - `Rc<RefCell<T>>` in code that could use ownership or references — smell of a design issue
+ - `unwrap()` / `expect()` in library code — return `Result` instead
+ - Shared mutable state via `Arc<Mutex<T>>` where ownership transfer would suffice
+
+ **HIGH — Error handling**
+ - `unwrap()` in code paths that can fail at runtime — use the `?` operator
+ - `Box<dyn Error>` in library return types — define a concrete error enum
+ - Missing error context — use `.map_err(MyError::from)` or `anyhow::Context`
+ - `panic!` for recoverable errors — return `Result`
+
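The error-handling bullets above can be sketched as follows — a minimal illustration (the `ParseError` type and `parse_positive` function are hypothetical, not part of the package):

```rust
use std::fmt;

// A concrete error enum instead of Box<dyn Error> in a library API.
#[derive(Debug, PartialEq)]
enum ParseError {
    Empty,
    NotANumber(String),
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ParseError::Empty => write!(f, "input was empty"),
            ParseError::NotANumber(s) => write!(f, "not a number: {s}"),
        }
    }
}

impl std::error::Error for ParseError {}

// Returns Result instead of panicking; maps the underlying error
// into the concrete enum so callers get context, not a bare unwrap.
fn parse_positive(input: &str) -> Result<u32, ParseError> {
    if input.is_empty() {
        return Err(ParseError::Empty);
    }
    input
        .trim()
        .parse::<u32>()
        .map_err(|_| ParseError::NotANumber(input.to_string()))
}

fn main() {
    assert_eq!(parse_positive("42").unwrap(), 42);
    assert!(parse_positive("").is_err());
    assert!(parse_positive("abc").is_err());
}
```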
+ **MEDIUM — Traits and generics**
+ - Concrete types where trait bounds would make the function more reusable
+ - Missing `Send + Sync` bounds on types used across threads
+ - Lifetime annotations more complex than necessary — simplify or restructure
+ - `impl Trait` in return position hiding type info that callers need
+
+ **MEDIUM — Idiomatic patterns**
+ - `&String` parameter where `&str` would accept both `String` and `&str`
+ - `&Vec<T>` parameter where `&[T]` is more general
+ - Iterator chains that could replace explicit loops (`map`, `filter`, `fold`)
+ - `match` with a `_ =>` arm hiding exhaustiveness — be explicit
+
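A minimal sketch of the borrowed-view guidelines above (illustrative only; `shout` and `total` are hypothetical helpers):

```rust
// Accepting &str and &[T] is more general than &String / &Vec<T>:
// deref coercion lets callers pass owned or borrowed values alike.
fn shout(name: &str) -> String {
    name.to_uppercase()
}

fn total(values: &[i64]) -> i64 {
    values.iter().sum() // iterator chain instead of an explicit loop
}

fn main() {
    let owned = String::from("rust");
    assert_eq!(shout(&owned), "RUST"); // &String coerces to &str
    assert_eq!(shout("rust"), "RUST"); // a literal &str works too
    let v = vec![1, 2, 3];
    assert_eq!(total(&v), 6); // &Vec<i64> coerces to &[i64]
}
```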
+ **LOW — Style**
+ - `#[allow(dead_code)]` or `#[allow(unused)]` without a comment explaining why
+ - Missing `#[must_use]` on functions whose return value should not be ignored
+ - Derive order not following Rust convention (`Debug, Clone, PartialEq, Eq, Hash`)
+
+ ### Step 4 — Apply rust-in-action (for systems code)
+
+ Focus areas from *Rust in Action*:
+
+ **HIGH — Unsafe code**
+ - `unsafe` block without a `// SAFETY:` comment explaining the invariants upheld
+ - Dereferencing raw pointers without a null/alignment check
+ - FFI functions that assume C types without `#[repr(C)]` on structs
+ - Use-after-free risk: raw pointer kept after the owning value is dropped
+
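The `// SAFETY:` convention above, sketched minimally (the `first_byte` helper is hypothetical, not from the package):

```rust
// A `// SAFETY:` comment states the invariants the unsafe block relies on.
fn first_byte(bytes: &[u8]) -> Option<u8> {
    if bytes.is_empty() {
        return None;
    }
    let ptr = bytes.as_ptr();
    // SAFETY: `ptr` comes from a non-empty slice, so it is non-null,
    // properly aligned for u8, and valid for a one-byte read. The slice
    // borrow keeps the owning value alive for the duration of the read.
    Some(unsafe { *ptr })
}

fn main() {
    assert_eq!(first_byte(b"hi"), Some(b'h'));
    assert_eq!(first_byte(b""), None);
}
```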
+ **HIGH — Memory and layout**
+ - `std::mem::transmute` without proof the types are layout-compatible
+ - Uninitialized memory via `MaybeUninit` without completing initialization
+ - Stack allocation of large values that should be heap-allocated (e.g., prefer `Box<[u8; 1_000_000]>` to a bare `[u8; 1_000_000]` local)
+
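A sketch of the heap-allocation guideline (illustrative; `make_buffer` is a hypothetical helper, and it allocates via `vec!` to avoid constructing the large array on the stack first):

```rust
// Large buffers belong on the heap: a 1 MB array as a local can overflow
// the stack, while Box keeps only a pointer on the stack.
fn make_buffer() -> Box<[u8; 1_000_000]> {
    // vec! allocates directly on the heap; TryFrom<Box<[u8]>> then
    // converts the boxed slice into a fixed-size boxed array.
    vec![0u8; 1_000_000]
        .into_boxed_slice()
        .try_into()
        .expect("length matches the array size")
}

fn main() {
    let buf = make_buffer();
    assert_eq!(buf.len(), 1_000_000);
    assert_eq!(buf[0], 0);
}
```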
+ **MEDIUM — Systems patterns**
+ - Busy-wait loop where `std::thread::yield_now()` or a channel would work
+ - `std::process::exit()` called without flushing buffers — use `Drop` impls
+ - Signal handling with non-async-signal-safe operations inside the handler
+
+ **LOW — FFI**
+ - Missing `#[no_mangle]` on functions exported to C
+ - C string handling without `CString`/`CStr` — risk of a missing null terminator
+
+ ### Step 5 — Output format
+
+ ```
+ **Skills applied:** `programming-with-rust` + `rust-in-action`
+ **Scope:** [files reviewed]
+
+ ### HIGH
+ - `file:line` — finding
+
+ ### MEDIUM
+ - `file:line` — finding
+
+ ### LOW
+ - `file:line` — finding
+
+ **Summary:** X HIGH, Y MEDIUM, Z LOW findings.
+ ```
+
+ Consolidate similar findings. Only report issues you are >80% confident are real problems.
@@ -0,0 +1,110 @@
+ ---
+ name: ts-reviewer
+ description: >
+ Expert TypeScript reviewer applying @booklib/skills book-grounded expertise.
+ Combines effective-typescript for type system issues and clean-code-reviewer
+ for readability and structure. Use for all TypeScript and TSX code reviews.
+ tools: ["Read", "Grep", "Glob", "Bash"]
+ model: sonnet
+ ---
+
+ You are a TypeScript code reviewer with expertise from two canonical books: *Effective TypeScript* (Vanderkam) and *Clean Code* (Martin).
+
+ ## Process
+
+ ### Step 1 — Get the scope
+
+ Run `git diff HEAD -- '*.ts' '*.tsx'` to see changed TypeScript files. Check for `CLAUDE.md` at project root.
+
+ Run available tools (skip silently if not installed):
+ ```bash
+ npx tsc --noEmit 2>&1 | head -20
+ npx eslint . --ext .ts,.tsx 2>&1 | head -20
+ ```
+
+ ### Step 2 — Triage the code
+
+ Check what kind of TypeScript is in scope:
+ ```bash
+ git diff HEAD -- '*.ts' '*.tsx' | grep -E "any|as unknown|@ts-ignore|@ts-expect-error" | head -5
+ git diff HEAD -- '*.tsx' | wc -l # React components present?
+ ```
+
+ Apply both skills to all TypeScript. `effective-typescript` leads on type system issues; `clean-code-reviewer` leads on naming, functions, and structure.
+
+ ### Step 3 — Apply effective-typescript
+
+ Focus areas from *Effective TypeScript*:
+
+ **HIGH — Type safety**
+ - `any` used without justification — narrow to a specific type or use `unknown` (Item 5)
+ - `as` type assertion without a guard or comment — unsafe cast (Item 9)
+ - `@ts-ignore` suppressing a real error — fix the underlying type (Item 19)
+ - `object` or `{}` type where a specific interface would be safer (Item 18)
+ - Mutating a parameter typed as `readonly` — violates the contract (Item 17)
+
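The `unknown`-over-`any` guidance, as a minimal sketch (the `getPort` helper and its config shape are hypothetical, not from the package):

```typescript
// `unknown` forces a structural check before use, where `any`
// would silently accept whatever shape arrives at runtime.
function getPort(raw: unknown): number {
  if (
    typeof raw === "object" &&
    raw !== null &&
    "port" in raw &&
    typeof (raw as { port: unknown }).port === "number"
  ) {
    return (raw as { port: number }).port;
  }
  return 8080; // fall back to a default rather than trusting the shape
}

console.log(getPort({ port: 3000 })); // 3000
console.log(getPort("nonsense")); // 8080
```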
+ **HIGH — Type design**
+ - `null | undefined` mixed in a union without clear intent — pick one (Item 31)
+ - Boolean blindness: a `(boolean, boolean)` tuple where a typed object with named fields would be clearer (Item 34)
+ - Invalid states representable in the type — redesign so invalid states are unrepresentable (Item 28)
+ - `string` used for IDs/statuses where a branded type or a union of literals would prevent mixing (Item 35)
+
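A minimal sketch of making invalid states unrepresentable with a discriminated union (the `FetchState` type is hypothetical, not from the package):

```typescript
// A discriminated union makes invalid states unrepresentable:
// you cannot be "loading" and hold an error message at the same time.
type FetchState =
  | { status: "idle" }
  | { status: "loading" }
  | { status: "success"; data: string }
  | { status: "error"; message: string };

function describe(state: FetchState): string {
  // The `status` discriminant narrows each branch; the switch is exhaustive.
  switch (state.status) {
    case "idle":
      return "not started";
    case "loading":
      return "in flight";
    case "success":
      return `got ${state.data}`;
    case "error":
      return `failed: ${state.message}`;
  }
}

console.log(describe({ status: "success", data: "42" })); // "got 42"
```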
+ **MEDIUM — Type inference**
+ - Unnecessary explicit type annotation where inference is clear (Item 19)
+ - Return type annotation missing on exported functions — aids documentation and catches errors (Item 19)
+ - Type widened to `string[]` where `readonly string[]` would express intent (Item 17)
+ - `typeof` guard where `instanceof` or a discriminated union would be more reliable (Item 22)
+
+ **MEDIUM — Generics**
+ - Generic constraint `<T extends object>` where `<T extends Record<string, unknown>>` is safer
+ - Generic type parameter used only once — probably not needed (Item 50)
+ - Missing `infer` in conditional types that extract sub-types (Item 50)
+
+ **LOW — Structural typing**
+ - Excess property checks bypassed by an intermediate assignment — use a direct object literal (Item 11)
+ - Iterating `Object.keys()` with an `as` cast — use `Object.entries()` with a typed tuple (Item 54)
+
+ ### Step 4 — Apply clean-code-reviewer
+
+ Focus areas from *Clean Code* applied to TypeScript:
+
+ **HIGH — Naming**
+ - Single-letter variable names outside of trivial loop counters or math
+ - Boolean variables not phrased as predicates (`isLoading`, `hasError`, `canSubmit`)
+ - Functions named with nouns instead of verbs (`dataProcessor` → `processData`)
+ - Misleading names that don't match what the function does
+
+ **MEDIUM — Functions**
+ - Function over 20 lines — extract cohesive sub-functions
+ - More than 3 parameters — group related params into an options object
+ - Function does more than one thing — the name reveals it (e.g., `fetchAndSave`)
+ - Deep nesting over 3 levels — invert conditions / extract early returns
+
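The options-object guideline, sketched minimally (the `describeRetry` helper and `RetryOptions` interface are hypothetical, not from the package):

```typescript
// Grouping related parameters into an options object keeps call sites
// readable and makes optional settings explicit with defaults.
interface RetryOptions {
  attempts?: number;
  delayMs?: number;
  label?: string;
}

function describeRetry({
  attempts = 3,
  delayMs = 100,
  label = "task",
}: RetryOptions = {}): string {
  return `${label}: ${attempts} attempts, ${delayMs}ms apart`;
}

console.log(describeRetry()); // "task: 3 attempts, 100ms apart"
console.log(describeRetry({ attempts: 5, label: "upload" })); // "upload: 5 attempts, 100ms apart"
```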
+ **MEDIUM — Structure**
+ - Comment explaining *what* the code does instead of *why* — rewrite as self-documenting code
+ - Dead code: commented-out blocks, unused imports, unreachable branches
+ - Magic numbers/strings — extract to named constants
+
+ **LOW — Readability**
+ - Negative conditionals (`if (!isNotReady)`) — invert
+ - Inconsistent naming convention within a file (camelCase vs snake_case)
+
+ ### Step 5 — Output format
+
+ ```
+ **Skills applied:** `effective-typescript` + `clean-code-reviewer`
+ **Scope:** [files reviewed]
+
+ ### HIGH
+ - `file:line` — finding
+
+ ### MEDIUM
+ - `file:line` — finding
+
+ ### LOW
+ - `file:line` — finding
+
+ **Summary:** X HIGH, Y MEDIUM, Z LOW findings.
+ ```
+
+ Consolidate similar findings. Only report issues you are >80% confident are real problems.
@@ -0,0 +1,117 @@
+ ---
+ name: ui-reviewer
+ description: >
+ Expert UI and visual design reviewer applying @booklib/skills book-grounded
+ expertise. Combines refactoring-ui, storytelling-with-data, and animation-at-work.
+ Use when reviewing UI components, dashboards, data visualizations, or animations.
+ tools: ["Read", "Grep", "Glob", "Bash"]
+ model: sonnet
+ ---
+
+ You are a UI design reviewer with expertise from three canonical books: *Refactoring UI* (Wathan & Schoger), *Storytelling with Data* (Knaflic), and *Animation at Work* (Nabors).
+
+ ## Process
+
+ ### Step 1 — Get the scope
+
+ Run `git diff HEAD -- '*.tsx' '*.jsx' '*.css' '*.scss' '*.svg'` to see changed UI files. Read the full component files for changed components — don't review diffs in isolation.
+
+ Check for `CLAUDE.md` at project root.
+
+ ### Step 2 — Detect which skill(s) apply
+
+ | Signal | Apply |
+ |--------|-------|
+ | Components, layout, spacing, typography, color | `refactoring-ui` |
+ | Charts, graphs, tables, data dashboards | `storytelling-with-data` |
+ | `transition`, `animation`, `@keyframes`, `motion` | `animation-at-work` |
+
+ ### Step 3 — Apply refactoring-ui
+
+ Focus areas from *Refactoring UI*:
+
+ **HIGH — Hierarchy and clarity**
+ - All text the same size and weight — no visual hierarchy guiding the eye
+ - Too many competing accent colors — use one primary, one semantic (error/success), one neutral
+ - Backgrounds creating contrast problems — text failing WCAG AA (4.5:1 ratio for body text)
+ - Spacing inconsistent — mixing arbitrary pixel values instead of a consistent scale (4/8/12/16/24/32/48...)
+
+ **MEDIUM — Typography**
+ - Line length over 75 characters for body text — add `max-width` to prose containers
+ - Line height too tight for body text (needs 1.5–1.6 for readability)
+ - All-caps used for long text — reserve it for short labels and badges only
+ - Font weight below 400 on body copy — hard to read at small sizes
+
+ **MEDIUM — Component design**
+ - Borders used to separate sections that spacing alone would separate — reduces visual noise
+ - Empty state missing — component shows broken UI with no data instead of a placeholder
+ - Loading state missing — component snaps in or shows a raw skeleton without intent
+ - Button using a border-only style for the primary action — primary should use a filled background
+
+ **LOW — Spacing and layout**
+ - Icon and label not aligned on the same baseline
+ - Inconsistent border-radius across similar components (some pill, some sharp)
+ - Hover state color identical to pressed state — can't distinguish interaction phases
+
+ ### Step 4 — Apply storytelling-with-data (for charts/visualizations)
+
+ Focus areas from *Storytelling with Data*:
+
+ **HIGH — Chart type**
+ - Pie/donut chart with more than 3 segments — use a bar chart (humans can't compare angles)
+ - 3D chart used — depth distorts data and adds no information
+ - Dual-axis chart without clear explanation — readers misinterpret the relationship
+ - Area chart comparing multiple series where lines would be cleaner
+
+ **HIGH — Data integrity**
+ - Y-axis not starting at zero for a bar chart — exaggerates differences
+ - Missing data points interpolated without disclosure
+ - Aggregated metric (average) presented without variance or distribution context
+
+ **MEDIUM — Clutter**
+ - Gridlines darker than necessary — they should fade into the background
+ - Data labels on every point when a clear trend is the message — remove most, highlight one
+ - Legend placed far from the data it labels — embed labels directly on the series
+ - Chart title restates the axis labels instead of stating the insight
+
+ **LOW — Focus**
+ - No visual emphasis on the key data point or trend — everything gets equal weight
+ - Color used for decoration, not for encoding meaning — pick one signal color
+
+ ### Step 5 — Apply animation-at-work (for motion)
+
+ Focus areas from *Animation at Work*:
+
+ **HIGH — Accessibility**
+ - Animation missing a `prefers-reduced-motion` media query — will trigger symptoms for users with vestibular disorders
+ - `animation-duration` over 500ms for UI feedback (button press, toggle) — feels sluggish
+ - Infinite animation with no pause mechanism — distracting and inaccessible
+
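A minimal, framework-agnostic sketch of honoring reduced motion (the `effectiveDuration` helper is hypothetical; in a browser the flag would come from `window.matchMedia("(prefers-reduced-motion: reduce)").matches`, passed in here so the logic stays testable outside a browser):

```typescript
// Pick an animation duration that respects the user's reduced-motion
// setting and keeps UI feedback under the 500ms budget.
function effectiveDuration(baseMs: number, prefersReduced: boolean): number {
  if (prefersReduced) return 0; // skip motion entirely
  return Math.min(baseMs, 500); // cap feedback animations at 500ms
}

console.log(effectiveDuration(300, false)); // 300
console.log(effectiveDuration(800, false)); // 500
console.log(effectiveDuration(300, true)); // 0
```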
+ **MEDIUM — Purpose**
+ - Animation present but serves no functional purpose (doesn't aid comprehension or wayfinding)
+ - Easing is linear — use `ease-out` for elements entering, `ease-in` for elements leaving
+ - Multiple simultaneous animations competing for attention — sequence or simplify
+
+ **LOW — Performance**
+ - Animating `width`, `height`, `top`, `left` — triggers layout; use `transform` and `opacity` instead
+ - `transition` on `all` — will animate unintended properties on state change; be explicit
+
+ ### Step 6 — Output format
+
+ ```
+ **Skills applied:** [skills used]
+ **Scope:** [files reviewed]
+
+ ### HIGH
+ - `file:line` — finding
+
+ ### MEDIUM
+ - `file:line` — finding
+
+ ### LOW
+ - `file:line` — finding
+
+ **Summary:** X HIGH, Y MEDIUM, Z LOW findings.
+ ```
+
+ For UI findings without a clear line number, reference the component name and prop/class. Consolidate similar findings. Only report issues you are >80% confident are real problems.
package/bin/skills.js CHANGED
@@ -11,6 +11,7 @@ const args = process.argv.slice(2);
  const command = args[0];
  const skillsRoot = path.join(__dirname, '..', 'skills');
  const commandsRoot = path.join(__dirname, '..', 'commands');
+ const agentsRoot = path.join(__dirname, '..', 'agents');
 
  // ─── ANSI helpers ─────────────────────────────────────────────────────────────
  const c = {
@@ -96,6 +97,9 @@ const targetDir = isGlobal
  const commandsTargetDir = isGlobal
  ? path.join(os.homedir(), '.claude', 'commands')
  : path.join(process.cwd(), '.claude', 'commands');
+ const agentsTargetDir = isGlobal
+ ? path.join(os.homedir(), '.claude', 'agents')
+ : path.join(process.cwd(), '.claude', 'agents');
 
  function copyCommand(skillName) {
  const src = path.join(commandsRoot, `${skillName}.md`);
@@ -106,6 +110,23 @@ function copyCommand(skillName) {
  console.log(c.green('✓') + ` /${skillName} command → ${c.dim(dest)}`);
  }
 
+ function getAvailableAgents() {
+ if (!fs.existsSync(agentsRoot)) return [];
+ return fs.readdirSync(agentsRoot)
+ .filter(f => f.endsWith('.md'))
+ .map(f => f.replace(/\.md$/, ''))
+ .sort();
+ }
+
+ function copyAgent(agentName) {
+ const src = path.join(agentsRoot, `${agentName}.md`);
+ if (!fs.existsSync(src)) return;
+ fs.mkdirSync(agentsTargetDir, { recursive: true });
+ const dest = path.join(agentsTargetDir, `${agentName}.md`);
+ fs.copyFileSync(src, dest);
+ console.log(c.green('✓') + ` @${agentName} agent → ${c.dim(dest)}`);
+ }
+
  // ─── CHECK command ────────────────────────────────────────────────────────────
  function checkSkill(skillName) {
  const skillDir = path.join(skillsRoot, skillName);
@@ -701,20 +722,34 @@ async function main() {
  }
 
  case 'add': {
- const addAll = args.includes('--all');
+ const addAll = args.includes('--all');
  const noCommands = args.includes('--no-commands');
- const skillName = args.find(a => !a.startsWith('--') && a !== 'add');
- if (addAll) {
+ const noAgents = args.includes('--no-agents');
+ const agentArg = args.find(a => a.startsWith('--agent='))?.split('=')[1];
+ const skillName = args.find(a => !a.startsWith('--') && a !== 'add');
+
+ if (agentArg) {
+ // explicit: skills add --agent=booklib-reviewer
+ const agents = getAvailableAgents();
+ if (!agents.includes(agentArg)) {
+ console.error(c.red(`✗ Agent "${agentArg}" not found.`) + ' Available: ' + c.dim(agents.join(', ')));
+ process.exit(1);
+ }
+ copyAgent(agentArg);
+ console.log(c.dim(`\nInstalled to ${agentsTargetDir}`));
+ } else if (addAll) {
  const skills = getAvailableSkills();
  skills.forEach(s => copySkill(s, targetDir));
  if (!noCommands) skills.forEach(s => copyCommand(s));
- console.log(c.dim(`\nInstalled ${skills.length} skills to ${targetDir}`));
+ if (!noAgents) getAvailableAgents().forEach(a => copyAgent(a));
+ const agentCount = noAgents ? 0 : getAvailableAgents().length;
+ console.log(c.dim(`\nInstalled ${skills.length} skills, ${agentCount} agents to .claude/`));
  } else if (skillName) {
  copySkill(skillName, targetDir);
  if (!noCommands) copyCommand(skillName);
  console.log(c.dim(`\nInstalled to ${targetDir}`));
  } else {
- console.error(c.red('Usage: skills add <skill-name> | skills add --all'));
+ console.error(c.red('Usage: skills add <skill-name> | skills add --all | skills add --agent=<name>'));
  process.exit(1);
  }
  break;
@@ -827,9 +862,11 @@ ${c.bold(' Usage:')}
  ${c.cyan('skills info')} ${c.dim('<name>')} full description of a skill
  ${c.cyan('skills demo')} ${c.dim('<name>')} before/after example
  ${c.cyan('skills add')} ${c.dim('<name>')} install skill + /command to .claude/
- ${c.cyan('skills add --all')} install all skills + commands
+ ${c.cyan('skills add --all')} install all skills + commands + agents
  ${c.cyan('skills add')} ${c.dim('<name> --global')} install globally (~/.claude/)
  ${c.cyan('skills add')} ${c.dim('<name> --no-commands')} install skill only, skip command
+ ${c.cyan('skills add')} ${c.dim('--agent=<name>')} install a single agent to .claude/agents/
+ ${c.cyan('skills add --all --no-agents')} install skills + commands, skip agents
  ${c.cyan('skills check')} ${c.dim('<name>')} quality check (Bronze/Silver/Gold/Platinum)
  ${c.cyan('skills check --all')} quality summary for all skills
  ${c.cyan('skills update-readme')} refresh README quality table from results.json files
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@booklib/skills",
- "version": "1.6.0",
+ "version": "1.7.0",
  "description": "Book knowledge distilled into structured AI skills for Claude Code and other AI assistants",
  "bin": {
  "skills": "bin/skills.js"
@@ -1,178 +0,0 @@
- # How I Route AI Agents to the Right Code Review Context
-
- You gave Claude Code a Clean Code checklist. It reviewed your order processing service and told you to rename `proc` to `processOrder` and split a 22-line function into three.
-
- Meanwhile, the actual problem — your aggregate boundary is wrong and you're leaking domain logic into the API layer — went completely unnoticed.
-
- This isn't an AI failure. It's a routing failure. The agent applied the wrong lens.
-
- ## The Problem: Context Collapse
-
- If you give an AI agent a broad set of review instructions, two things happen:
-
- **Token waste** — the agent reads through hundreds of lines of principles that don't apply to the file at hand.
-
- **Wrong focus** — a Clean Code reviewer will nitpick naming on a file where the real issue is a broken domain model. A DDD reviewer will talk about bounded contexts on a utility function that just needs cleaner variable names.
-
- This is what one Hacker News commenter called context collapse: "Clean Code was written for Java in 2008. DDIA is about distributed systems at scale. If you apply the Clean Code reviewer to a 50-line Python script, you'll get pedantic nonsense about function length when the actual problem might be that the data model is wrong."
-
- The criticism is valid. The fix isn't to abandon structured review — it's to pick the right structure for the file in front of you.
-
- ## The Approach: A Router That Picks the Reviewer
-
- I've been building a collection of "skills" — structured instruction sets distilled from classic software engineering books (Clean Code, DDIA, Effective Java, DDD, etc.). Each one is a focused lens that an AI agent uses during code review or code generation.
-
- The key piece is a `skill-router`: a meta-skill that runs before any review happens. It inspects:
-
- - File type and language — Kotlin? Python? Infrastructure config?
- - Domain signals — is this a service layer? A repository? A controller?
- - Work type — code review, refactoring, greenfield design, or bug fix?
-
- Based on that, it selects the 1–2 most relevant skills and explicitly skips the rest.
-
- ## Example in Practice
-
- User: "Review my order processing service"
-
- ```
- Router decision:
- ✅ Primary: domain-driven-design — domain model design (Aggregates, Value Objects)
- ✅ Secondary: microservices-patterns — service boundaries and inter-service communication
- ⛔ Skip: clean-code-reviewer — premature at design stage; apply later on implementation code
- ```
-
- The router doesn't just pick — it explains why it skipped alternatives. That rationale is important: it makes the selection auditable, and you can override it if you disagree.
-
- ## Why Not Just Use One Giant Prompt?
-
- You could stuff everything into one system prompt. I tried. Here's what happens:
-
- **Attention dilution** — the model tries to apply everything at once and produces shallow, generic feedback.
-
- **Conflicting advice** — Clean Code says "extract small functions." Some microservices patterns say "prefer cohesive, slightly larger functions over deep call stacks." The model hedges between both.
-
- **Token budget** — if you're working in Claude Code or Cursor, every token of instructions competes with your actual code context.
-
- Routing means the agent reads ~200 focused lines of instructions instead of ~2000 unfocused ones.
-
- ## The Alternative Criticism: "LLMs Already Know These Books"
-
- This is the most common pushback I get. And it's partially true — LLMs have read Clean Code. But they apply that knowledge inconsistently and at low confidence.
-
- Giving the model an explicit lens — "review this against Clean Code heuristics C1–C36" — concentrates attention and dramatically reduces hallucinated or off-topic feedback. It's the difference between asking someone "what do you think?" vs. "evaluate this against these specific criteria."
-
- Think of it like unit tests: the runtime can execute your code correctly without them. But tests make correctness explicit, repeatable, and auditable. Skills do the same for AI review.
-
- ## How the Routing Actually Works
-
- The router skill is a structured prompt with a decision tree:
-
- 1. Parse the request — what file(s), what task
- 2. Match against skill metadata — each skill declares its applicable languages, domains, and work types
- 3. Rank by relevance — primary (strongest match) and secondary (complementary perspective)
- 4. Conflict resolution — if two skills would give contradictory advice, prefer the one matching the higher abstraction level of the task
- 5. Return selection with rationale
-
- There's no ML model or embedding search involved. It's structured prompting — the LLM acts as the routing engine using routing rules baked into the router's own instructions. Language signals, domain signals, and conflict resolution are all declared explicitly inside the router skill, not inferred at runtime. The trade-off: it's fast and predictable, but adding a new skill requires updating the router manually.
-
- ## Levels of Review (a Pattern Worth Stealing)
-
- One of the most useful ideas that came from community feedback: separate your review into levels of critique:
-
- 1. A fast "lint" pass — formatting, obvious bugs, missing tests
- 2. A domain pass — does the code correctly model the business logic?
- 3. A "counterexample" pass — propose at least one concrete failing scenario and how to reproduce it
-
- The skill library maps roughly to these levels — Clean Code for level 1, DDD for level 2 — but you have to invoke them separately with the right framing. The router picks based on what the code *is*, not which level you're at. Explicit level-based routing isn't built yet. The counterexample pass is harder and something I'm still figuring out.
-
- ## Try It Yourself
-
- The skills and the router are open source: [github.com/booklib-ai/skills](https://github.com/booklib-ai/skills)
-
- You can use them with Claude Code, Cursor, or any agent that supports SKILL.md files. The quickest way to try it — install everything and let the router decide:
-
- ```bash
- npx @booklib/skills add --all
- ```
-
- Or globally, so it's available in every project:
-
- ```bash
- npx @booklib/skills add --all --global
- ```
-
- Then just ask your agent to review a file — the router picks the right skill automatically. You don't need to know the library upfront.
-
- ## Benchmark: Routed Skills vs. Native Review
-
- Theory is nice. Does it actually find more issues?
-
- I took a deliberately terrible 157-line Node.js order processing module — god function, SQL injection on every query, global mutable state, `eval()` for no reason — and ran it through two pipelines in parallel:
-
- - **Native:** Claude's built-in `pr-review-toolkit:code-reviewer`
- - **skill-router:** `skill-router` → `clean-code-reviewer` + `design-patterns`
-
- ### What the router chose
-
- ```
- Primary: clean-code-reviewer — god function, cryptic names, magic numbers
- Secondary: design-patterns — duplicated payment blocks → Strategy pattern
- Skipped: domain-driven-design — implementation level, not model design stage
- ```
-
- ### Issue detection
-
- | | Native | skill-router |
- |---|---|---|
- | Critical/High issues | 7 | 8 |
- | Important/Improvement | 10 | 14 |
- | Suggestions | 0 | 5 |
- | **Total unique issues** | **19** | **~28** |
-
- ~89% of what Claude's native reviewer found, skill-router also found. But skill-router found ~9 additional issues that the native reviewer missed entirely.
-
- A few that stood out:
-
- > **`formatMoney` has a floating-point rounding bug** — `0.1 + 0.2` arithmetic, not `Math.round`. Native didn't flag it; clean-code-reviewer caught it via the G-series heuristics.
-
- > **The stubs always return `true`** — they're lying to callers. Native missed it; clean-code-reviewer flagged it as a lying comment / false contract.
-
- > **skill-router surfaced 7 pattern opportunities** — places where a known pattern could reduce complexity (Strategy for payments, State for order lifecycle, Singleton for the broken global state). It explains the problem each one solves and suggests a fix sequence, but leaves the decision to you. Native produced no architectural guidance at all.
-
- ### Where each approach wins
-
- | Situation | Use |
- |---|---|
- | Pre-merge PR review, security audit | **Native** — pre-merge gate: fast, confidence-filtered, adapts to your CLAUDE.md project conventions |
- | Larger refactor, architecture planning | **skill-router** — patterns, principles, refactor roadmap |
- | Both together | ~95% total issue coverage vs. ~80% for either alone |
-
- **One honest loss for skill-router:** Card data was being logged to stdout — a clear PCI violation. Claude's built-in reviewer flagged it at 92% confidence. skill-router didn't. Security compliance isn't in any book-based skill's scope, and the router has no way to know it should care. If compliance is the priority, the native reviewer is the right tool.
-
- After looking closely at how both tools are built, the difference in purpose becomes clear.
-
- The native reviewer runs **6 parallel sub-agents**, each focused on one category: code quality, silent failures, type design, test coverage, comment accuracy, and security. It defaults to reviewing only the current `git diff` — not the whole file. Before starting, it reads your `CLAUDE.md` to pick up project conventions. And it discards any finding below 80% confidence, so output arrives pre-filtered. That's a purpose-built pre-merge gate: narrow scope, parallel specialists, high signal-to-noise.
-
- skill-router does the opposite: one agent, one deeply focused skill, applied to the whole module. It trades breadth and speed for depth and principle grounding.
-
- They target different moments in the development lifecycle, which is why using both gives ~95% coverage.
-
- One gap this benchmark exposed was the noise filtering: Claude's native reviewer discards anything below 80% confidence; skill-router had no equivalent. Since writing this, the router has been updated to instruct selected skills to classify every finding as HIGH / MEDIUM / LOW and skip LOW-tier findings on standard reviews — same idea, book-grounded framing instead of a confidence score.
-
- The full before/after code and comparison report are in the repo under [`/benchmark/`](https://github.com/booklib-ai/skills/tree/main/benchmark).
-
- ## Open Questions
-
- I don't have everything figured out. A few things I'm still exploring:
-
- **Sub-agent architecture** — the native pr-review-toolkit runs 6 parallel sub-agents (tests, types, silent failures, comments, etc.), each a focused specialist. skill-router takes the opposite approach: one agent, one focused skill, narrow scope. Both work, but for different reasons. The open question is whether a *generate-then-evaluate* loop — one agent produces code using a skill's patterns, a second agent checks it against the same skill's rubric — would catch more issues than a single-pass review. My current answer is no for code review, maybe for code generation. If you've tried this pattern, I'd like to know what you found.
-
- **Feedback loops** — the benchmark above is one data point. How do you systematically measure whether routing improves review quality across different codebases and languages?
-
- **Domain-specific routing** — healthcare code, fintech code, and game code each have very different "what matters most" priorities. Should routing consider the project domain, not just the file?
-
- If you've been working on similar problems — structured AI review, skill selection, multi-agent evaluation — I'd love to hear what's working for you.
-
- ---
-
- *Currently covering: Clean Code, Domain-Driven Design, Effective Java, Effective Kotlin, Microservices Patterns, System Design Interview, Storytelling with Data, and more. Skills are community-contributed and new books are welcome.*