ancoder-skill-cli 0.6.4 → 0.7.1

package/README.md CHANGED
@@ -84,6 +84,121 @@ my-skill/
 
  Users who `npm install -g ancoder-skill-cli` get a fully bundled package. No extra binary download is required during install.
 
+ ## Test-Driven Skill Development (100:10:1 Architecture)
+
+ skill-cli adopts a test-driven approach to skill development, inspired by [oh-my-claudecode](https://github.com/Yeachan-Heo/oh-my-claudecode)'s multi-agent orchestration patterns. The core principle: **invest the majority of compute in building robust test skills, not the skill itself.**
+
+ ### Time Allocation: 100:10:1
+
+ When creating a skill for a task, the system simultaneously creates a **main skill** and a **test skill**:
+
+ | Phase | Time Share | Purpose |
+ |-------|-----------|---------|
+ | Test skill development | 90% (100 units) | Build an automated evaluator that compares expected vs actual output, locating specific differences |
+ | Main skill development | 9% (10 units) | Implement the actual skill, guided by test skill feedback |
+ | Execution & verification | 1% (1 unit) | Final end-to-end smoke test |
+
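The percentages in the table are rounded shares of a 100:10:1 ratio (111 units in total). A minimal sketch of that arithmetic follows; the names `RATIO` and `split_budget` are illustrative assumptions and are not part of skill-cli.

```python
# Hypothetical helper: split a time or compute budget by the 100:10:1 ratio.
RATIO = {"test_skill": 100, "main_skill": 10, "verification": 1}


def split_budget(total: float) -> dict[str, float]:
    """Divide `total` across the three phases in proportion to RATIO."""
    total_units = sum(RATIO.values())  # 100 + 10 + 1 = 111
    return {phase: total * units / total_units for phase, units in RATIO.items()}


# Exact percentage shares, rounded to one decimal place.
shares = {
    phase: round(100 * units / sum(RATIO.values()), 1)
    for phase, units in RATIO.items()
}
```

With a budget of 111 minutes, `split_budget(111)` yields 100, 10, and 1 minutes, which matches the PDF-to-Markdown example later in the README.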
+ ### Architecture
+
+ ```text
+ Phase 1: Test Skill Development (90% compute)
+   generate structured acceptance criteria
+   -> N planners generate test strategies in parallel
+   -> critic reviews + eliminates weak strategies
+   -> N executors implement test skills in parallel
+   -> golden test evaluation (tournament selection)
+   -> repeat until precision threshold met
+   -> best test skill selected
+
+ Phase 2: Main Skill Development (9% compute)
+   generate main skill
+   -> test skill verifies (independent executor)
+   -> structured diff feedback injected into next prompt
+   -> repeat until test skill passes
+   -> main skill complete
+
+ Phase 3: Final Verification (1% compute)
+   end-to-end smoke test
+ ```
+
+ ### Key Design Principles
+
+ **1. Separation of Author and Reviewer**
+
+ The agent that generates the main skill and the agent that runs the test skill operate in separate contexts. This prevents self-approval bias. The verify phase spawns an independent executor to run the test skill, ensuring honest evaluation (borrowed from OMC's verifier lane pattern).
+
+ **2. Structured Diff Feedback**
+
+ Test skills output structured diff reports instead of simple pass/fail:
+
+ ```yaml
+ diffs:
+   - location: "page 3, paragraph 2"
+     type: "content_loss"
+     severity: "critical"
+     expected: "table with 3 columns and 5 rows"
+     actual: "table missing entirely"
+   - location: "page 5, heading"
+     type: "format_drift"
+     severity: "warning"
+     expected: "## Second-level heading"
+     actual: "### Third-level heading"
+ ```
+
+ This structured feedback is injected back into the main skill's improvement loop, enabling targeted fixes rather than blind retries.
+
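One way to picture how such a report could be turned into prompt feedback is sketched below. The `Diff` dataclass mirrors the YAML fields shown in the README, but the class, the `render_feedback` function, and the severity ordering are assumptions made for illustration, not skill-cli's actual code.

```python
from dataclasses import dataclass


@dataclass
class Diff:
    """One entry of a structured diff report (fields taken from the YAML example)."""
    location: str
    type: str
    severity: str
    expected: str
    actual: str


# Assumed ordering: critical findings are listed before warnings.
SEVERITY_ORDER = {"critical": 0, "warning": 1}


def render_feedback(diffs: list[Diff]) -> str:
    """Format diffs as plain-text lines, most severe first, for the next prompt."""
    ranked = sorted(diffs, key=lambda d: SEVERITY_ORDER.get(d.severity, len(SEVERITY_ORDER)))
    return "\n".join(
        f"[{d.severity}] {d.type} at {d.location}: expected {d.expected!r}, got {d.actual!r}"
        for d in ranked
    )
```

Listing the most severe diffs first is a design choice in this sketch: it keeps critical content loss at the top of the feedback even when many warnings are present.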
+ **3. QA Cycling with Early Exit**
+
+ Borrowed from OMC's UltraQA pattern:
+ - Test skill finds issues -> structured diagnosis -> main skill fixes -> retest -> loop
+ - Same error appearing 3 times triggers early exit (avoids infinite compute burn)
+ - Maximum 5 QA cycles per iteration
+
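The cycling rules above can be sketched as a small loop. This is an assumption-laden illustration: `qa_cycle`, `run_test`, and `fix` are hypothetical names, and errors are compared as plain strings, which may differ from how skill-cli identifies a repeated error.

```python
from collections import Counter
from typing import Callable, Optional

MAX_CYCLES = 5   # "Maximum 5 QA cycles per iteration"
MAX_REPEATS = 3  # "Same error appearing 3 times triggers early exit"


def qa_cycle(run_test: Callable[[], Optional[str]], fix: Callable[[str], None]) -> str:
    """Run test -> fix -> retest until the test passes or a limit is hit.

    `run_test` returns None on success or an error description on failure.
    """
    seen: Counter = Counter()
    for _ in range(MAX_CYCLES):
        error = run_test()
        if error is None:
            return "passed"
        seen[error] += 1
        if seen[error] >= MAX_REPEATS:
            return "early_exit"  # the same error keeps recurring, so stop
        fix(error)
    return "max_cycles"
```

The function returns a status string rather than raising, so a caller can decide whether an early exit should end the whole iteration or trigger a different strategy.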
+ **4. Tournament Selection for Test Skills**
+
+ During the 90% test skill development phase, multiple test strategies are generated in parallel and evaluated against golden tests (known-correct input/output pairs). The strategy with the highest detection precision wins, similar to OMC's self-improve tournament selection.
+
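A minimal sketch of the selection step follows. The README does not define "detection precision", so this sketch assumes the usual meaning: of the golden cases a strategy flags as defective, the fraction that really are defective. The names `GoldenCase`, `precision`, and `select_winner` are hypothetical.

```python
from typing import Callable

# Assumed shape of a golden case: (case id, True if it contains a real defect).
GoldenCase = tuple[str, bool]
Strategy = Callable[[str], bool]  # returns True when the strategy flags a case


def precision(flag: Strategy, cases: list[GoldenCase]) -> float:
    """True positives divided by all flagged cases (0.0 if nothing is flagged)."""
    flagged = [has_defect for case_id, has_defect in cases if flag(case_id)]
    return sum(flagged) / len(flagged) if flagged else 0.0


def select_winner(strategies: dict[str, Strategy], cases: list[GoldenCase]) -> str:
    """Return the name of the strategy with the highest precision."""
    return max(strategies, key=lambda name: precision(strategies[name], cases))
```

Precision alone rewards a strategy that flags very little, so a real tournament would likely also weigh recall; that trade-off is outside what the README specifies.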
+ **5. PRD-Driven Acceptance Criteria**
+
+ Test skills define concrete, testable acceptance criteria (not vague "implementation is complete"):
+
+ ```text
+ Bad:  "PDF conversion works correctly"
+ Good: "All tables with merged cells are preserved as HTML <table> blocks
+       with correct colspan/rowspan attributes"
+ ```
+
+ ### Example: PDF-to-Markdown Skill
+
+ For a PDF-to-Markdown conversion skill:
+
+ - **Test skill** (100 min): Compares original PDF content with generated Markdown, detecting content loss (missing paragraphs, tables, images), format drift (heading levels, list styles), and encoding issues. Outputs structured diffs with page/paragraph-level location info.
+ - **Main skill** (10 min): Implements PDF parsing and Markdown generation, iteratively improved by test skill feedback.
+ - **Verification** (1 min): End-to-end smoke test on fixture PDFs.
+
+ ### `skill_eval` Check Type
+
+ The verify system supports a `skill_eval` check type that invokes a test skill as a verification oracle:
+
+ ```yaml
+ checks:
+   - id: quality-check
+     type: skill_eval
+     skill: pdf-to-md-test
+     config:
+       threshold: 0.95
+       output_format: structured_diff
+ ```
+
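The README does not say how `threshold` is applied. The sketch below assumes it is a minimum score in the range 0 to 1 that the test skill's result must reach; `check_passes` and the `score` argument are hypothetical and only illustrate that reading.

```python
def check_passes(check: dict, score: float) -> bool:
    """Return True when `score` meets the check's threshold.

    Assumption: `threshold` is a minimum acceptable score between 0 and 1.
    """
    if check.get("type") != "skill_eval":
        raise ValueError(f"unsupported check type: {check.get('type')!r}")
    return score >= check["config"]["threshold"]


# The same check as the YAML example, written as a Python dict.
quality_check = {
    "id": "quality-check",
    "type": "skill_eval",
    "skill": "pdf-to-md-test",
    "config": {"threshold": 0.95, "output_format": "structured_diff"},
}
```

Under this reading, a test skill score of 0.97 would pass the check and 0.90 would fail it.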
+ ### Verify Phase: Independent Executor
+
+ During the loop's verify phase, a separate Claude executor is spawned to run the test skill. This executor:
+ - Has no shared context with the main skill's executor
+ - Produces an objective evaluation report
+ - Returns structured diff feedback that feeds into the next iteration
+
+ This mirrors OMC's principle: "Keep authoring and review as separate passes."
+
  ## License
 
  MIT
Binary file
Binary file
Binary file
Binary file
Binary file
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "ancoder-skill-cli",
-   "version": "0.6.4",
+   "version": "0.7.1",
    "description": "CLI for managing everything-claude-code (ECC) components — agents, skills, commands, rules, hooks, MCP configs. Single binary, all assets embedded.",
    "bin": {
      "skill-cli": "bin/skill-cli.js"