developing-agent-forge 2.4.0 → 2.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (60)
  1. package/README.md +17 -48
  2. package/README.zh-CN.md +17 -48
  3. package/developing-forge.yaml +0 -10
  4. package/dist/agents/developer.d.ts.map +1 -1
  5. package/dist/agents/developer.js +7 -15
  6. package/dist/agents/developer.js.map +1 -1
  7. package/dist/agents/factory.d.ts.map +1 -1
  8. package/dist/agents/factory.js +0 -2
  9. package/dist/agents/factory.js.map +1 -1
  10. package/dist/agents/index.d.ts +0 -1
  11. package/dist/agents/index.d.ts.map +1 -1
  12. package/dist/agents/index.js +0 -1
  13. package/dist/agents/index.js.map +1 -1
  14. package/dist/agents/manager.d.ts.map +1 -1
  15. package/dist/agents/manager.js +7 -15
  16. package/dist/agents/manager.js.map +1 -1
  17. package/dist/agents/prompts.d.ts +0 -1
  18. package/dist/agents/prompts.d.ts.map +1 -1
  19. package/dist/agents/prompts.js +0 -3
  20. package/dist/agents/prompts.js.map +1 -1
  21. package/dist/agents/reviewer.d.ts.map +1 -1
  22. package/dist/agents/reviewer.js +1 -5
  23. package/dist/agents/reviewer.js.map +1 -1
  24. package/dist/agents/types.d.ts +0 -1
  25. package/dist/agents/types.d.ts.map +1 -1
  26. package/dist/agents/types.js.map +1 -1
  27. package/dist/cli.js +2 -2
  28. package/dist/cli.js.map +1 -1
  29. package/dist/index.d.ts +4 -4
  30. package/dist/index.d.ts.map +1 -1
  31. package/dist/index.js +2 -2
  32. package/dist/index.js.map +1 -1
  33. package/dist/pipeline/index.d.ts +0 -1
  34. package/dist/pipeline/index.d.ts.map +1 -1
  35. package/dist/pipeline/index.js +0 -1
  36. package/dist/pipeline/index.js.map +1 -1
  37. package/dist/pipeline/pipeline.d.ts +0 -9
  38. package/dist/pipeline/pipeline.d.ts.map +1 -1
  39. package/dist/pipeline/pipeline.js +2 -9
  40. package/dist/pipeline/pipeline.js.map +1 -1
  41. package/dist/pipeline/project-devloop.d.ts +1 -1
  42. package/dist/pipeline/project-devloop.d.ts.map +1 -1
  43. package/dist/pipeline/project-devloop.js +2 -3
  44. package/dist/pipeline/project-devloop.js.map +1 -1
  45. package/dist/pipeline/task-devloop.d.ts +1 -1
  46. package/dist/pipeline/task-devloop.d.ts.map +1 -1
  47. package/dist/pipeline/task-devloop.js +1 -2
  48. package/dist/pipeline/task-devloop.js.map +1 -1
  49. package/package.json +2 -5
  50. package/dist/agents/trajectory-optimizer.d.ts +0 -11
  51. package/dist/agents/trajectory-optimizer.d.ts.map +0 -1
  52. package/dist/agents/trajectory-optimizer.js +0 -33
  53. package/dist/agents/trajectory-optimizer.js.map +0 -1
  54. package/dist/pipeline/pipelineskill.d.ts +0 -109
  55. package/dist/pipeline/pipelineskill.d.ts.map +0 -1
  56. package/dist/pipeline/pipelineskill.js +0 -45
  57. package/dist/pipeline/pipelineskill.js.map +0 -1
  58. package/metaskills/coding-style/METASKILL.md +0 -214
  59. package/skills/coding-style/SKILL.md +0 -802
  60. package/skills/coding-style/agents/openai.yaml +0 -4
@@ -1,802 +0,0 @@
1
- ---
2
- name: coding-style
3
- description: >-
4
- Maintain clean, local, low-coupling code changes in existing repositories. Use
5
- when an agent writes or edits code, refactors modules, implements features,
6
- harnesses, tests, exports, or framework docs. This skill does not initialize
7
- template repositories or generate full project scaffolds from empty
8
- directories.
9
- ---
10
-
11
- # Coding Style
12
-
13
- ## Mission
14
-
15
- Use this skill as a code-quality and framework-consistency layer for an
16
- existing repository. The upstream task decides what to build; this skill decides
17
- how to keep the implementation readable, local, low-coupling, testable, and
18
- consistent with the current framework.
19
-
20
- Do not use this skill to initialize a repository template or recreate a project
21
- scaffold from an empty directory. Template initialization belongs to a separate
22
- skill. This skill may add files, modules, tests, harness support, or docs only
23
- when the current task and current repository need them.
24
-
25
- ## Operating Boundary
26
-
27
- Use the user-specified repository root as the project boundary. Do not create,
28
- modify, or reference project files outside that root unless the user explicitly
29
- asks.
30
-
31
- Respect the existing source layout, naming style, language ecosystem, tests,
32
- harnesses, docs, and project configuration. Improve local structure when it
33
- makes the current change clearer or safer, but do not redesign the whole
34
- repository because a draft describes future systems.
35
-
36
- Ignore unrelated drafts, logs, historical outputs, old runs, and nearby files
37
- unless the user makes them part of the task.
38
-
39
- Keep these artifact directories when they already exist:
40
-
41
- - `data/`: input data, pointers, traces, manifests, fixtures, or samples.
42
- - `output/`: program-run outputs and intermediate artifacts.
43
- - `results/`: curated result artifacts.
44
- - `harness/`: harness code, contracts, configs, schemas, samples, and support.
45
-
46
- Do not force a fixed test directory. Tests follow the repository's existing
47
- layout, project configuration, initialization docs, or adjacent test style.
48
-
49
- ## Runtime Binding
50
-
51
- Keep the skill project-agnostic. Bind names, paths, classes, functions,
52
- datasets, methods, metrics, harnesses, artifact fields, and validation commands
53
- from the current user request, current goal or task context, current
54
- repository, and existing code.
55
-
56
- Do not carry project facts from one run into the skill. If a rule contains a
57
- real path, symbol, dataset, method, harness, test name, artifact field, or
58
- project-specific claim, generalize it into a principle or remove it.
59
-
60
- Use placeholders only for examples, such as `<method_name>`, `<metric_name>`,
61
- `<harness_name>`, `<module_name>`, and `<artifact_type>`. Examples are
62
- illustrative, not fixed templates.
63
-
64
- ## Pre-Edit Inventory
65
-
66
- Before editing, establish a small task-relevant inventory:
67
-
68
- - repository root and version-control root;
69
- - files and directories relevant to the requested change;
70
- - expected source, test, harness, export, docs, and dependency surfaces;
71
- - files that must be left untouched by scope;
72
- - existing test and harness layout when relevant;
73
- - current dirty or untracked files, without reverting user work;
74
- - accepted constructor fields, identity fields, validation owner, provenance
75
- fields, and export surfaces for record-backed helpers.
76
-
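The dirty and untracked part of the inventory above can be collected read-only. The following is a minimal Python sketch, assuming the repository uses Git; `split_porcelain` and `inventory` are hypothetical helper names and are not part of the skill. It relies only on `git rev-parse --show-toplevel` and `git status --porcelain`, and it never reverts or stages anything.

```python
import subprocess
from pathlib import Path


def split_porcelain(lines):
    """Split `git status --porcelain` (v1) lines into dirty and untracked paths."""
    dirty, untracked = [], []
    for line in lines:
        if len(line) < 4:
            continue  # skip blank or malformed lines
        # Format is two status characters, one space, then the path.
        status, path = line[:2], line[3:]
        if status == "??":
            untracked.append(path)
        else:
            # Renames appear as "old -> new"; this sketch keeps that text as-is.
            dirty.append(path)
    return dirty, untracked


def inventory(repo_dir="."):
    """Read-only inventory: version-control root plus dirty and untracked files."""
    def git(*args):
        result = subprocess.run(
            ["git", *args], cwd=repo_dir, check=True, capture_output=True, text=True
        )
        return result.stdout

    root = Path(git("rev-parse", "--show-toplevel").strip())
    # Do not strip the status output: a leading space is part of the status code.
    dirty, untracked = split_porcelain(git("status", "--porcelain").splitlines())
    return {"vcs_root": root, "dirty": dirty, "untracked": untracked}
```

Keeping the parsing in a pure function means it can be checked with a few literal lines, without needing a real repository.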
77
- Treat a suddenly empty or partially missing tree as an integrity blocker. Do not
78
- reconstruct missing code from memory, notes, reports, or old outputs unless the
79
- user asks for restoration from a trusted source.
80
-
81
- ## Task Classification
82
-
83
- Classify the task before editing:
84
-
85
- - **Feature or implementation**: add the smallest clear code path that satisfies
86
- the requested behavior.
87
- - **Refactor or cleanup**: move, split, merge, rename, or delete code only to
88
- improve locality, readability, or testability for the current change.
89
- - **Harness work**: keep harness code under the relevant `harness/` area; make
90
- objective, inputs, metrics, raw artifacts, and run loop explicit.
91
- - **Test work**: place tests in the existing test system's natural location and
92
- keep each test focused on one behavior with small fixtures or toy inputs.
93
- - **Method, baseline, metric, or export work**: keep the change near the owning
94
- extension point and update registration, docs, exports, and tests only when
95
- those surfaces are in scope.
96
- - **Validation-only pass**: run the exact requested command from the repository
97
- root. If it passes, make no source, test, docs, dependency, export, or TODO
98
- changes except removing artifacts created by the run. If it fails, inspect
99
- the failure and make only the smallest local fix to the accepted contract.
100
- - **Framework or docs sync**: update framework docs when module boundaries,
101
- extension points, harness/test organization, artifact schemas, or repository
102
- responsibilities change and docs are in scope or are part of the accepted
103
- framework surface.
104
- - **Trajectory or TODO maintenance**: record accepted, verified work. Select a
105
- next task only when the user, active workflow, or existing trajectory
106
- explicitly asks for one.
107
-
108
- If a task is broad, choose a bounded slice that can be reviewed. If meaningful
109
- progress now requires datasets, extended runs, external evidence, harness
110
- runs, or generated results outside the request, stop at the accepted boundary and
111
- report the blocker.
112
-
113
- ## Implementation Style
114
-
115
- Prefer code that is short, direct, and easy to read in execution order. The data
116
- flow should be visible: inputs, validation, transformation, calls, outputs, and
117
- side effects should appear in a natural order.
118
-
119
- Use names from the current domain contract and existing code semantics. Keep one
120
- concept's spelling consistent across code, config, tests, harnesses, artifacts,
121
- prompts, and docs.
122
-
123
- When parsing an external schema into an internal record or value, keep source
124
- column names separate from internal field names. If the task says to map
125
- external `<source_field>` to internal `<target_field>`, expose and test the
126
- internal name unless the user explicitly asks to preserve the source field as a
127
- public output. Keep raw source names local to parsing, validation errors, or
128
- provenance only when that is the clearest contract.
129
-
130
- For parsers and loaders, normalize raw text at the boundary and construct
131
- public records from already-typed values. Do not relax an existing record's
132
- field validators because a new file format arrives as strings; parse the new
133
- format before record construction or add a narrow local parser helper. If a
134
- private validator is shared across old and new loaders, its contract must stay
135
- true for every caller and must not broaden legacy public behavior unless the
136
- task explicitly scopes that behavior change.
137
-
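As an illustration of parsing at the boundary, here is a small Python sketch. The record `Sample`, the source columns `ID` and `Duration(ms)`, and the internal field `duration_s` are all hypothetical names chosen for the example. The record keeps its strict validator, and the loader converts strings and units before constructing it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """Public record: expects already-typed values and keeps its strict validator."""
    sample_id: str
    duration_s: float  # internal field name and unit

    def __post_init__(self):
        if not isinstance(self.duration_s, float):
            raise TypeError("duration_s must be a float")


def parse_row(row, row_number):
    """Map one raw row (all strings) to a Sample at the parsing boundary."""
    raw = row["Duration(ms)"]  # the raw source name stays local to parsing
    try:
        duration_ms = float(raw.strip())
    except ValueError:
        # The source column name appears only in the validation error.
        raise ValueError(f"row {row_number}: bad 'Duration(ms)' value {raw!r}") from None
    return Sample(sample_id=row["ID"].strip(), duration_s=duration_ms / 1000.0)
```

A test for this loader would assert on `duration_s` and its converted value, not on the fact that `Duration(ms)` was read.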
138
- Keep responsibilities single:
139
-
140
- - one file should mainly carry one interface, adapter family, metric family,
141
- data-processing step, harness entry/support area, export shape, or test group;
142
- - split files that mix unrelated change reasons or abstraction levels;
143
- - merge or simplify files that only add thin wrappers, pure forwarding, or extra
144
- jumps;
145
- - avoid `utils`, `misc`, mega-runners, and all-in-one modules unless they are
146
- already narrow and stable.
147
-
148
- Prefer inline or local helpers when logic is used once and remains readable.
149
- Extract helpers, adapters, registries, factories, contexts, or interfaces only
150
- when they provide real reuse, isolate a stable boundary, preserve an invariant,
151
- reduce caller code, or make tests simpler.
152
-
153
- When reusing an existing private helper for a broader case, first check whether
154
- the helper name, parameters, and doc-adjacent wording still describe every
155
- caller. Rename the private helper to the smallest neutral name when its original
156
- name encodes a narrower case, tail, direction, artifact type, or caller-specific
157
- behavior that is no longer true. Do not change the public contract merely to fix
158
- a private naming drift.
159
-
160
- Do not add abstractions for imagined future cases. If a simple implementation
161
- clearly satisfies the current task, keep it simple.
162
-
163
- Reduce global state, hidden path assumptions, implicit side effects, long call
164
- chains, repeated registration points, and heavy configuration for simple tasks.
165
-
166
- When an interface forces every caller to pass excessive parameters, consider a
167
- small explicit context or config object. Do not turn that into a framework when
168
- plain values remain clearer.
169
-
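A small explicit config object can be as plain as a frozen dataclass. The sketch below is illustrative only; `RunConfig`, its fields, and `describe_run` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunConfig:
    """Small explicit config: plain typed fields, no registry or loader logic."""
    seed: int
    split: str
    max_items: int


def describe_run(name, config):
    """Callers pass one config value instead of repeating three parameters."""
    return f"{name}: seed={config.seed} split={config.split} max_items={config.max_items}"
```

If only one or two values are ever passed, plain arguments are likely still clearer than this.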
170
- When a field or config contract says "finite number", validate finiteness
171
- explicitly. Reject `NaN`, positive infinity, negative infinity, booleans when
172
- the language treats booleans as numbers, negative values when the contract says
173
- non-negative, and non-numeric values. Do not treat "not NaN" as equivalent to
174
- finite. If an existing shared validator has intentionally weaker legacy
175
- behavior, leave it unchanged unless the task scopes that contract change, and
176
- add a local validator for the stricter new config.
177
-
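A local validator for a "finite, non-negative number" contract might look like the Python sketch below. The function name `require_finite_non_negative` is hypothetical. Python is a language where booleans count as numbers (`bool` is a subclass of `int`), so the boolean check is explicit.

```python
import math


def require_finite_non_negative(name, value):
    """Return value as a float, or raise if it is not a finite, non-negative number."""
    # bool is a subclass of int in Python, so it must be rejected explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(f"{name} must be a number, got {type(value).__name__}")
    try:
        number = float(value)
    except OverflowError:
        raise ValueError(f"{name} is too large to represent as a float") from None
    # math.isfinite is False for NaN and for both infinities.
    if not math.isfinite(number):
        raise ValueError(f"{name} must be finite, got {value!r}")
    if number < 0:
        raise ValueError(f"{name} must be non-negative, got {value!r}")
    return number
```

A check written as `value == value` (or "not NaN") would accept both infinities, which is why finiteness is tested directly.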
178
- ## Change Locality
179
-
180
- Before writing code, identify the natural owner of the change:
181
-
182
- - a method change should mainly touch method code and necessary comparison or
183
- registration surfaces;
184
- - a baseline change should mainly touch baseline code and focused tests;
185
- - a metric change should mainly touch metric definition, computation, export
186
- normalization if needed, and tests;
187
- - a public package export change should mainly touch the package entrypoint or
188
- existing export module plus a focused export-surface test;
189
- - a harness change should mainly touch the relevant harness area plus necessary
190
- shared interfaces;
191
- - a result-artifact change should mainly touch artifact schema, export logic,
192
- and tests;
193
- - a loader or manifest change should mainly touch the input layer and tests.
194
-
195
- If one feature requires unrelated edits across many areas, treat that as a
196
- framework-boundary risk. Do the smallest local refactor that brings related code
197
- together, or report the coupling if a safe local refactor is outside scope.
198
-
199
- Keep code that changes together close. Keep unrelated reasons to change in
200
- separate modules. Public/shared layers should contain only stable capabilities
201
- needed by multiple users; special cases should stay near their use sites.
202
-
203
- ## Harness And Test Discipline
204
-
205
- Harnesses serve evaluation goals, performance comparison, workflow screening,
206
- module optimization, and result analysis. Tests serve functional correctness,
207
- interfaces, data formats, config parsing, metrics, export behavior, and basic
208
- module interaction.
209
-
210
- Keep harness and test responsibilities separate:
211
-
212
- - harnesses should expose stable entry semantics, input protocols, metric names,
213
- raw artifacts, seeds, splits, config snapshots, and parseable outputs;
214
- - tests should use small fixtures, toy inputs, and clear pass/fail assertions;
215
- - each test should have one named behavioral responsibility;
216
- - formula, threshold, ordering, percentile, or ranking tests should use
217
- discriminating fixtures where a neighboring formula, adjacent threshold,
218
- reversed ordering, or copied existing helper would fail;
219
- - numeric config validation tests should match the stated contract for each
220
- key: missing/default behavior, accepted boundary values, negative values when
221
- non-negative is required, non-numeric values, `NaN`, and infinities when the
222
- contract says finite. If multiple fields or parameters say "finite" or "not
223
- bool", cover each owner, not only one representative owner;
224
- - parser and loader tests should assert internal field names, units, converted
225
- values, row/sample provenance, slicing semantics, immutable return shape when
226
- promised, and boundary normalization after any external-schema mapping. A test
227
- that only proves the raw source column was read does not prove the internal
228
- contract was respected;
229
- - fixed-shape parser tests should make malformed structure failures explicit:
230
- wrong component counts for each owned tuple/vector, malformed delimiters,
231
- non-finite values for each finite owner, and invalid window arguments for each
232
- public slicing parameter. Keep these as small contract fixtures, not large
233
- real-data reproductions;
234
- - budget-enforcement tests are separate from budget-configuration validation.
235
- When the scope asks for over-budget rejection or reason capture, use a valid
236
- constrained budget that lets at least one eligible candidate reach the
237
- selection loop and then exceed the remaining budget. Assert the rejected ID
238
- and exact budget-exceeded reason named by the task; an invalid budget test,
239
- missing-budget test, filtered candidate, cadence skip, or type rejection does
240
- not cover selection-time budget exhaustion;
241
- - for ordered selectors, missing-budget or unbounded-budget behavior still
242
- follows the selector's ordering contract after filtering and staging. The
243
- expected selected IDs should be the full eligible set in sorted order, not the
244
- input order, unless the contract explicitly says input order is preserved;
245
- - tie-breaker tests should make the primary sort keys equal and deliberately
246
- set other sort-like fields to favor the opposite order, so only the requested
247
- tie-break field can explain the expected result;
248
- - for multi-key ordering, test each tie-break level separately: hold all higher
249
- priority keys equal, set the key under test to determine the expected order,
250
- and set lower-priority sort-like keys to favor the opposite order. A fixture
251
- does not prove a middle tie-break if the final ID/name/order key would choose
252
- the same winner;
253
- - treat compound ordering phrases such as "deadline/object-id",
254
- "density/deadline/id", or "score/frame/deadline/id" as a checklist, not as a
255
- single fixture. Prove the first key with adversarial lower-priority fields,
256
- then add a same-higher-key fixture for each fallback key, including the final
257
- lexical identifier fallback when it is named;
258
- - tie-break fixture names and object IDs are labels, not evidence. Before
259
- accepting an ordering test, inspect the actual tuple fields used by the sort
260
- and confirm the expected winner is not also favored by a lower-priority
261
- fallback field;
262
- - for filtered, staged, or multi-phase selection, repeat the ordering audit for
263
- each accepted subset or phase, including catch-all groups such as regular,
264
- default, or non-special candidates. A fixture that proves the final
265
- identifier tie-break inside one phase does not prove that an earlier ordering
266
- key, subset ordering key, or phase priority is enforced;
267
- - for partitioned budgets, lanes, quotas, queues, or resource pools, isolation
268
- tests should leave spare capacity in one partition while a candidate in
269
- another partition exceeds its own limit. A fixture where every partition is
270
- fully consumed does not prove that borrowing, sharing, or leakage is absent;
271
- - no-mutation tests should inspect the same objects or mutable containers passed
272
- into the implementation;
273
- - export-surface assertions belong in export tests, invalid-state assertions in
274
- invalid-state tests, and identity/schema assertions in clearly named identity
275
- or schema tests;
276
- - harness code should not become functional test code;
277
- - test code should not become benchmark or performance evaluation.
278
-
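Several of the bullets above (discriminating tie-break fixtures, selection-time budget rejection, and no-mutation checks) can be sketched against a toy selector. Everything in this Python example is hypothetical: the `select` function, its deadline-then-id ordering, the `cost` field, and the `"budget_exceeded"` reason are assumptions made for illustration, not a contract from any real project.

```python
import copy


def select(candidates, budget=None):
    """Toy ordered selector: earliest deadline first, then id.

    Returns (selected_ids, rejected) where rejected maps id -> reason.
    """
    ordered = sorted(candidates, key=lambda c: (c["deadline"], c["id"]))
    selected, rejected = [], {}
    remaining = budget
    for c in ordered:
        if remaining is not None and c["cost"] > remaining:
            rejected[c["id"]] = "budget_exceeded"
            continue
        selected.append(c["id"])
        if remaining is not None:
            remaining -= c["cost"]
    return selected, rejected


def test_deadline_beats_id():
    # The id fallback and input order favor "a", so only deadline explains "b" first.
    items = [
        {"id": "a", "deadline": 9, "cost": 1},
        {"id": "b", "deadline": 1, "cost": 1},
    ]
    assert select(items)[0] == ["b", "a"]


def test_id_breaks_deadline_tie():
    # Deadlines are equal; cost and input order both favor "z" first.
    items = [
        {"id": "z", "deadline": 5, "cost": 1},
        {"id": "m", "deadline": 5, "cost": 9},
    ]
    assert select(items)[0] == ["m", "z"]


def test_over_budget_rejected_at_selection_time():
    # Valid budget: "a" is selected first, then "b" exceeds what remains.
    items = [
        {"id": "a", "deadline": 1, "cost": 3},
        {"id": "b", "deadline": 2, "cost": 3},
    ]
    selected, rejected = select(items, budget=4)
    assert selected == ["a"]
    assert rejected == {"b": "budget_exceeded"}


def test_select_does_not_mutate_input():
    items = [{"id": "b", "deadline": 2, "cost": 1}, {"id": "a", "deadline": 1, "cost": 1}]
    snapshot = copy.deepcopy(items)
    select(items, budget=1)
    assert items == snapshot  # inspects the same list object that was passed in
```

Each fixture sets the lower-priority or unrelated fields against the expected winner, so a neighboring ordering rule would fail the test.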
279
- When a harness grows, split support modules inside that harness's own folder
280
- before pushing special logic into shared layers. When tests grow, split them in
281
- the existing test system's style.
282
-
283
- ## Framework Docs
284
-
285
- Maintain framework docs only when docs are in scope, the active workflow
286
- requires docs, or the accepted change would leave a current documented surface
287
- materially misleading. Keep docs about current reality, not template
288
- initialization, aspirational status, or skill mechanics.
289
-
290
- Framework docs should explain where future local changes should happen:
291
-
292
- - stable boundaries and extension points;
293
- - change map from feature type to module, harness, test, or export area;
294
- - harness purposes, metrics, and raw artifacts;
295
- - test organization actually used by the repository;
296
- - raw-first export approach and downstream analysis boundary;
297
- - framework risks where future changes cannot yet stay local.
298
-
299
- For README-style or package docs, read the requested files first and classify
300
- surfaces as current, stale, or historical. Edit only stale current surfaces
301
- needed for the accepted change. If all requested docs are current, report a
302
- no-op docs sync and the readback/search checks that proved it.
303
-
304
- When one accepted symbol, artifact, metric, method, or helper is documented in
305
- multiple parallel surfaces, build a small surface map before editing: helper or
306
- API lists, emitted names, package/module summaries, layout rows, test summaries,
307
- and absence clauses. Update each stale parallel surface consistently, but do
308
- not add new public exports, runtime behavior, or future-intent claims just because
309
- the docs mention the accepted bounded surface.
310
- Bind the surface map to the currently selected subject. Neighboring helpers,
311
- methods, tests, metrics, or earlier accepted features in the same document are
312
- context, not part of the sync, unless the user explicitly scopes them or the
313
- same sentence/list must change to stay truthful. Do not carry predecessor tokens
314
- or coverage details from a previous task into the current docs pass when the
315
- current request names a different stale predecessor or surface boundary.
316
-
317
- For each subject-specific surface in that map, carry the full scoped contract
318
- when the user names it: accepted inputs or candidate classes, rejection reasons,
319
- configuration or budget keys, ordering or priority rules, emitted metadata, and
320
- validation behavior. Do not rely on a neighboring surface, a shared-helper
321
- phrase, or an absence clause to imply a detail that the current subject's
322
- surface must state explicitly.
323
- If the scope gives an exact callable signature, return annotation, record
324
- shape, output container, or immutability promise, reproduce that contract on
325
- every requested API, package-summary, module-summary, and translated surface
326
- that names the callable or record. Generic phrases such as "sample values",
327
- "helper output", or "bounded parser" are not substitutes for a named return
328
- shape when the user supplied one.
329
-
330
- When a docs-sync request says a surface's list currently ends at a previously
331
- accepted subject, search for that predecessor token in every requested document
332
- before editing. Treat each occurrence as a current surface or historical note.
333
- Update stale current rosters and scoped surfaces; leave historical notes alone
334
- after confirming they are not current-surface lists.
335
-
336
- Write docs at the stable contract level by default. Summarize behavior,
337
- metadata, configuration, rejection reasons, ordering, artifact shape, and test
338
- coverage clearly enough for future maintainers to find the right code. Do not
339
- copy full fixture ID lists, exhaustive invalid-value matrices, or lengthy test
340
- expectations into README-style docs unless the user explicitly requests that
341
- level of detail, the existing docs already use that convention for the same
342
- surface, or review feedback depends on an exact fixture detail.
343
-
344
- Keep priority, visibility, partition, and ordering terms separate from filters.
345
- If an item outside a priority group remains eligible, document it as lower
346
- priority or off-priority, not as rejected or as an invalid type. Rejection
347
- wording should describe the actual contract owner: type filters reject types,
348
- validation rejects invalid inputs, and budget handling rejects otherwise
349
- eligible over-limit items.
350
-
351
- When the user scopes specific test-coverage details for README-style docs,
352
- turn those details into a per-document checklist for every requested test
353
- summary surface. Preserve the exact behavioral distinction that made the test
354
- valuable: discriminating fixture setup, tie-break owner, invalid-input owner,
355
- metadata value, provenance field, non-mutation target, or same mutable object
356
- when those are named. A generic sentence such as "covers tie-breaking" or
357
- "covers non-mutation" is not enough when the scope names the fixture condition
358
- or the object whose mutation must be rejected.
359
-
360
- For staged, filtered, or multi-path behavior in README-style docs, document the
361
- behavior by responsibility: accepted subset definitions, phase priority, primary
362
- ordering for each accepted subset, tie-break fixtures, metadata, and each
363
- rejection reason's owner. Do not let a tie-break fixture stand in for the
364
- primary ordering case, and do not describe one rejection reason as applying to
365
- all rejected items when another rejection path, such as budget or validation,
366
- uses a different reason.
367
-
368
- When a README section groups multiple symbols, helpers, schedulers, metrics, or
369
- tests under one sentence or bullet list, the group label must be true for every
370
- item in that list. If a metadata value, rejection reason, config key, fixture,
371
- or coverage case belongs to only one grouped subject, split it into a
372
- subject-specific bullet or paragraph instead of relying on a shared block.
373
-
374
- Write absence clauses narrowly. Before saying a broad category is absent, check
375
- the current code and docs for accepted bounded surfaces in that category. If a
376
- small in-memory conversion, helper, adapter, or test surface exists, qualify the
377
- missing surface precisely, such as "file-based", "result", "additional",
378
- "runtime", "full", "full-data", or "generated-output" capability. Do not let a
379
- negative sentence contradict an implemented helper documented elsewhere.
380
- When the accepted feature is a bounded or partial member of a broader algorithm,
381
- model, runtime, or framework family, absence wording should name only the
382
- unimplemented larger surface, such as "full", "additional", "beyond the
383
- accepted bounded formula", or "runtime integration". Do not use the broad
384
- family name alone as absent when the current docs also document an accepted
385
- bounded implementation in that family.
386
- Prefer narrow positive absence sentences such as "This bounded surface does not
387
- add <capability>" or "<capability> remains unimplemented." Avoid double
388
- negatives and "No <capability> is not ..." constructions, especially after
389
- rewriting a long absence clause.
390
-
391
- Do not automatically queue a docs-only task after every source/test change.
392
- Queue or perform docs sync only when docs are explicitly requested, are part of
393
- the active workflow, or the accepted change would leave a current documented
394
- surface materially misleading. If docs are excluded from the source/test task,
395
- do not promote stale documentation found during validation or TODO maintenance
396
- into the next developing task unless the user, active workflow, or existing
397
- trajectory explicitly selects docs sync. Record possible docs staleness as a
398
- caveat or candidate, not as a selected handoff.
399
-
400
- ## Trajectory And TODO Maintenance
401
-
402
- Trajectory files should record accepted facts, exact validation commands and
403
- results, cache cleanup or no-cache findings, and explicit exclusions that
404
- preserve scope.
405
-
406
- For docs-only or TODO-only accepted work, record the readback and targeted
407
- search checks that replaced test execution, and state that tests were skipped
408
- because no executable code or test files changed.
409
- For a no-op docs sync, record the targeted searches and requested-surface
410
- readback that proved the docs were already current, plus a changed-file check
411
- showing no requested docs were modified.
412
-
413
- For scoped docs-only or TODO-only work, also record a changed-file check or
414
- equivalent scope check showing that edits stayed inside the allowed file set.
415
- If an executable, test, dependency, export, harness, generated artifact, or
416
- result-artifact file changed accidentally, treat the run as no longer docs-only
417
- and validate or repair according to the user's scope.
418
-
419
- Do not use TODO or handoff files to invent the next source, harness, docs, or
420
- implementation task. Select a next task only when the user has explicitly selected
421
- it, the current workflow instruction names that handoff, or an existing active
422
- trajectory already contains that selected task. Otherwise leave a neutral
423
- waiting state such as "no next developing task is selected."
424
-
425
- When the accepted source/test task explicitly excluded docs, TODO, exports,
426
- harnesses, or generated outputs, preserve that exclusion in the
427
- trajectory. A later TODO-only pass may record accepted work and verified stale
428
- surfaces, but it must not turn excluded surfaces into selected follow-up work
429
- without explicit task selection.
430
- Verified stale docs, exports, harnesses, or artifacts are evidence for a future
431
- task-selection pass, not a selected next task by themselves. A repository habit,
432
- recent sequence, or reasonable maintenance preference is not explicit selection
433
- when the just-finished task excluded that surface. Require selection language
434
- from the user, a workflow instruction, or an already-active backlog item before
435
- writing "next developing task: sync docs" or any equivalent handoff after a
436
- source/test task that excluded docs.
437
- Explicit exclusions in the current task are not backlog seeds. If the user says
438
- not to add a capability, parser family, registry, export, adapter, harness,
439
- CLI, artifact, generated output, or adjacent capability, a TODO-only pass may record that
440
- the exclusion was preserved, but must not select that excluded capability as
441
- the next task unless a later explicit task-selection input asks for it.
442
-
443
- After an accepted docs-only or TODO-only update, treat any next implementation
444
- task as a separate task-selection decision, not as a consequence of making docs
445
- current. If a next source/test task is recorded, tie it to an explicit upstream
446
- selector, accepted backlog item, or already-scanned stale implementation gap;
447
- otherwise leave the trajectory neutral.
448
- Do not use a docs-only sync that merely documented an accepted helper as the
449
- reason to select an adjacent implementation task. Newly visible omissions in
450
- the docs may be recorded as candidates for later selection, but the next
451
- developing task stays neutral unless the user or active workflow explicitly
452
- selects that implementation work.
453
-
454
- If a docs-only sync is explicitly selected, name the exact stale current
455
- surfaces found in a read-only scan and make clear that it is a separate future
456
- pass, not part of a source/test task that excluded docs. If no live stale
457
- surface was verified, do not create a generic documentation task.
458
-
459
- When a handoff selects a docs-only follow-up, include a short stale-surface map:
460
- the document files and surface types to update, such as helper/API lists,
461
- emitted names, package or module summaries, layout rows, test summaries, or
462
- absence clauses. A generic "sync docs for <accepted change>" task is not enough
463
- unless those concrete stale surfaces are also named. Keep the handoff small:
464
- name stable contracts and stale surface types, not every fixture ID,
465
- selected-object list, or assertion from the tests.
466
-
467
- If accepted review tightened a discriminator, rejection reason, metadata value,
468
- or mutation target that docs must preserve, carry that exact detail into the
469
- handoff scope. Do not copy unrelated fixture lists merely because they were
470
- accepted in tests.
471
-
472
- When review corrections changed wording, hierarchy, or absence scope, record
473
- the final accepted correction as the current contract. Do not preserve rejected
474
- draft wording as a new TODO item unless the reviewer or user explicitly asks
475
- for a follow-up.
476
-
477
- After validation-only work, record only the command, result, no-fix status, and
478
- cache cleanup/no-cache finding. A green validation run confirms current
479
- contracts; it does not create new feature, docs, export, harness, or
480
- generated-output work.
481
-
482
- ## Naming, State, And References
483
-
484
- Names must reflect real meaning and data shape. Do not keep historical,
485
- placeholder, or overgeneral names after the concept changes.
486
-
487
- Use content names for content and reference names for paths, handles, IDs, URLs,
488
- or external resources. Do not let a variable named like a reference carry loaded
489
- content, or a content name carry a location.
490
-
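A minimal sketch of the content-versus-reference naming rule above. The function and variable names (`load_prompt`, `prompt_path`, `prompt_text`) are hypothetical and chosen only for illustration.

```python
from pathlib import Path


def load_prompt(prompt_path: Path) -> str:
    """Take a reference (a location) and return content (the loaded text)."""
    # `prompt_path` names a location, so it never holds the loaded text.
    prompt_text = prompt_path.read_text(encoding="utf-8")
    # `prompt_text` names content, so it never holds a path.
    return prompt_text
```

A name such as `prompt = Path(...)` followed by `prompt = prompt.read_text()` would blur the two roles that the rule asks to keep apart.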
491
- Place each variable, state object, config, and data structure at the layer that
492
- actually owns it. Local intermediate content should stay local. Only stable
493
- cross-boundary data should enter shared structures.
494
-
495
- When outer orchestration owns saving, archiving, or exporting, inner business
496
- logic should return values rather than also writing files. Write, save, export,
497
- and return responsibilities should be single-owner.
498
-
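One way to read the single-owner rule above, as a sketch: the inner function only returns a value and the outer function is the only place that writes a file. `summarize_scores` and `run_and_save` are hypothetical names, not part of any package in this diff.

```python
import json
from pathlib import Path


def summarize_scores(scores: list[float]) -> dict[str, float]:
    """Inner business logic: compute and return, with no file writes."""
    if not scores:
        raise ValueError("scores must not be empty")
    return {"count": float(len(scores)), "mean": sum(scores) / len(scores)}


def run_and_save(scores: list[float], report_path: Path) -> dict[str, float]:
    """Outer orchestration: the single owner of saving the report."""
    summary = summarize_scores(scores)
    report_path.write_text(json.dumps(summary, indent=2), encoding="utf-8")
    return summary
```

Because `summarize_scores` never touches the filesystem, it can be tested or reused without cleaning up files afterwards.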
499
- ## Prompts And Comments
500
-
501
- If repository code includes prompts, task instructions, or embedded agent text,
502
- write them as direct task instructions. Clearly distinguish external references
503
- from direct content and state who returns, saves, or exports each output.
504
-
505
- Use code comments sparingly. Comments should explain non-obvious decisions,
506
- constraints, provenance, or special cases. If clearer names or structure make a
507
- comment unnecessary, simplify the code instead.
508
-
509
- Do not write skill rules, debugging process, generation process, or style
510
- analysis into code comments.
511
-
512
- ## Open-Source Reuse
513
-
514
- When the task needs mature existing functionality, first decide whether legal,
515
- appropriate, low-maintenance reuse is better than custom implementation.
516
-
517
- Reuse preference:
518
-
519
- 1. direct dependency with stable packaging and compatible license;
520
- 2. adapter around a stable API;
521
- 3. small copied or ported snippet when license permits;
522
- 4. custom implementation when reuse would add more cost than value.
523
-
524
- Before copying or porting external code, check license compatibility. Preserve
525
- required notices and add a short source/provenance comment near copied or
526
- ported code. Maintain a third-party notice file or equivalent when the
527
- repository accumulates copied external code.
528
-
529
- Do not vendor large unrelated projects or import heavy dependencies to satisfy a
530
- small local feature.
531
-
532
- ## Validation
533
-
534
- Use the user's requested validation command when provided. Before running, check
535
- that every explicitly requested target exists; a missing target is a blocker to
536
- report, not permission to silently narrow the command or create the target.
537
-
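The pre-run target check above could look roughly like this. The sketch assumes pytest-style targets where an optional `::` suffix names a test inside a file; `missing_targets` is a hypothetical helper.

```python
from pathlib import Path


def missing_targets(requested: list[str], root: Path) -> list[str]:
    """Return requested targets whose file part does not exist under root."""
    missing = []
    for target in requested:
        # Assumption: anything after "::" is a node id, not part of the path.
        file_part = target.split("::", 1)[0]
        if not (root / file_part).exists():
            missing.append(target)
    return missing
```

A non-empty result is something to report as a blocker; the sketch deliberately does not drop the missing targets from the command or create them.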
538
- For source or test changes, prefer the smallest relevant test target that proves
539
- the accepted contract, unless the user asked for a broader suite. Use command
540
- forms that avoid repository cache or bytecode artifacts when the project allows.
541
-
542
- After validation, check for generated cache/build/test artifacts created by the
543
- run and remove only those generated artifacts. Do not clean unrelated dirty or
544
- untracked user work.
545
-
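A sketch of the cleanup rule above: compare path listings taken before and after the run, and select only newly created paths that sit inside known generated directories. The directory names in `GENERATED_DIR_NAMES` are an assumption for a Python project and would differ elsewhere.

```python
from pathlib import PurePosixPath

# Assumed generated-directory names; adjust per project.
GENERATED_DIR_NAMES = {"__pycache__", ".pytest_cache"}


def artifacts_to_remove(before: set[str], after: set[str]) -> list[str]:
    """Return paths created by the run that live under a generated directory."""
    created = after - before
    return sorted(
        path
        for path in created
        if GENERATED_DIR_NAMES & set(PurePosixPath(path).parts)
    )
```

A new untracked file outside those directories, or a cache directory that already existed before the run, is left alone.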
546
- For docs-only or TODO-only work, do not run tests unless executable code or
547
- test files changed accidentally. Re-read edited docs/TODO files and run targeted
548
- text searches for the accepted names, stale predecessor names, and broad absence
549
- phrases that were in scope.
550
-
551
- When README-style docs must describe focused test coverage, include targeted
552
- readback checks for the scoped coverage nouns and discriminators, not only the
553
- new public symbol. Search for the tie-break field, opposite-order fixture clue,
554
- metadata key or value, provenance field, invalid-input category, and exact
555
- non-mutation target when the user named them.
556
-
557
- For multi-surface docs, do not rely on whole-file search alone. Check the
558
- specific edited section or paragraph type that was in scope, especially test
559
- summary paragraphs in translated docs, so a term present elsewhere in the same
560
- file does not mask a stale summary.
561
-
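To illustrate why a whole-file search can mask a stale summary, here is a rough section-scoped readback. `section_body` is a hypothetical helper; it assumes ATX-style `#` headings and does not account for `#` lines inside fenced code blocks.

```python
def section_body(markdown: str, heading: str) -> str:
    """Return the text under `heading` up to the next same-or-higher heading."""
    level = len(heading) - len(heading.lstrip("#"))
    body, inside = [], False
    for line in markdown.splitlines():
        if line.strip() == heading:
            inside = True
            continue
        if inside and line.startswith("#"):
            this_level = len(line) - len(line.lstrip("#"))
            if this_level <= level:
                break
        if inside:
            body.append(line)
    return "\n".join(body)


doc = (
    "# Pkg\n\n## API\n\n- `pick_latest` breaks ties by `id`\n\n"
    "## Tests\n\n- covers ordering\n"
)
whole_file_hit = "pick_latest" in doc
test_summary_hit = "pick_latest" in section_body(doc, "## Tests")
```

Here the term is found in the file as a whole, but not in the test summary section that was actually in scope.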
562
- For Markdown docs with nested bullets or long copied list blocks, audit the
563
- local hierarchy after editing. Read the lines around every edited heading and
564
- the next sibling heading or bullet. Confirm top-level file, module, test, or
565
- artifact bullets remain siblings rather than becoming children of the previous
566
- coverage block, and confirm nested bullets are nested only where intended.
567
- For each edited bullet, compare its literal indentation prefix with the nearest
568
- same-level sibling and nearest child bullet in the same list. A child entry
569
- under a module/test/feature parent should keep the same prefix as neighboring
570
- children; a new sibling module/test/file entry should keep the same prefix as
571
- neighboring siblings. Do this line-level check before reporting docs-only
572
- validation complete.
573
-
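The line-level prefix comparison described above can be sketched as follows. `bullet_prefixes` is a hypothetical helper that only recognizes simple `-`, `*`, `+`, and numbered bullets.

```python
import re

BULLET = re.compile(r"^(\s*)(?:[-*+]|\d+\.)\s+")


def bullet_prefixes(markdown: str) -> list[tuple[int, str]]:
    """Return (line_number, indentation_prefix) for every bullet line."""
    found = []
    for number, line in enumerate(markdown.splitlines(), start=1):
        match = BULLET.match(line)
        if match:
            found.append((number, match.group(1)))
    return found


# The third bullet was meant to be a sibling of the first, but it is indented
# like the child on line 2, so it has become nested.
doc = "- tests/test_a.py\n  - covers ordering\n  - tests/test_b.py\n"
prefixes = bullet_prefixes(doc)
```

Comparing the prefix of an edited bullet against its intended sibling (line 1 here) shows the drift even though the words are correct.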
574
- For README-style docs that extend a roster from a previous accepted subject,
575
- search both the predecessor token and the new token after editing. Read every
576
- remaining predecessor occurrence in local context and confirm either that the
577
- new token appears in the same current roster, layout row, summary, or coverage
578
- block, or that the predecessor occurrence is intentionally historical and not a
579
- current-surface list.
580
-
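A rough first pass for the predecessor/new-token check above. It works line by line, which is narrower than the roster, row, or block context the rule describes, so each hit still needs to be read in context. `stale_roster_lines` and the token names are hypothetical.

```python
def stale_roster_lines(
    text: str, predecessor: str, successor: str
) -> list[tuple[int, str]]:
    """Return lines that mention the predecessor token but not the new token."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if predecessor in line and successor not in line
    ]


readme = (
    "Helpers: parse_a, parse_b, parse_c\n"
    "Layout: parsers.py (parse_a, parse_b)\n"
    "History: parse_b was added after parse_a.\n"
)
candidates = stale_roster_lines(readme, "parse_b", "parse_c")
```

Line 2 is a current roster that stops at the predecessor, while line 3 may be intentionally historical; the search only surfaces both for review.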
581
- When scoped details reuse strings already present for other subjects, validate
582
- with local context: the current subject name and the required fixture, metadata,
583
- reason, or config term should appear in the same bullet, paragraph, or clearly
584
- bounded coverage block.
585
-
586
- When the scope names an exact callable signature, return type, output container,
587
- or record immutability contract, validate that exact contract in every requested
588
- surface that names the callable. Read local context around the callable in each
589
- document; a whole-file hit for the record name or helper name does not prove the
590
- API surface carries the return contract.
591
-
592
- When a scoped invalid-value matrix names multiple keys or inputs, validate each
593
- owner separately in local context. Search/read back for every named key or input
594
- together with the invalid-value class or explicit invalid values in the same
595
- test-summary bullet, paragraph, or bounded coverage block.
596
-
597
- When a docs-sync scope includes both default and non-default configured
598
- fixtures, read back those bullets separately. Confirm the config value or
599
- "default" label matches the expected selected/rejected IDs and reasons in that
600
- same local context.
601
-
602
- When the user names specific excluded capability categories, include those
603
- terms or close equivalents in docs/TODO readback searches. The check should
604
- confirm that absence wording stayed narrow for every explicitly scoped
605
- exclusion, not only that the new accepted name appears.
606
-
607
- For README-style docs, include a quick local prose cleanup pass on edited
608
- paragraphs: remove duplicated adjacent words or lines, stale sentence tails left
609
- after rewriting a clause, and grammar artifacts that can make a scoped absence
610
- claim ambiguous.
611
-
612
- ## Review Guidance
613
-
614
- When reviewing, lead with defects that harm readability, locality, naming,
615
- state ownership, interface clarity, harness/test separation, artifact shape, or
616
- framework consistency.
617
-
618
- Prefer review suggestions that delete, inline, move to the use site, rename,
619
- align ordering, split responsibilities, clarify ownership, or reduce caller
620
- burden. Do not default to adding wrappers, registries, config layers, factories,
621
- or defensive branches unless they solve a concrete defect.
622
-
623
- For bounded helpers, verify that the implementation:
624
-
625
- - reads only the accepted inputs and fields;
626
- - maps external source fields to the requested internal output names without
627
- leaking raw source names into public records unless explicitly scoped;
628
- - rejects invalid inputs at the intended validation owner;
629
- - implements numeric contracts literally, including rejecting infinities when a
630
- value must be finite and preserving weaker legacy validators unless changing
631
- them is explicitly in scope;
632
- - returns the accepted record or value shape;
633
- - preserves provenance when requested;
634
- - does not mutate source records or inputs unless mutation is the contract;
635
- - keeps identity behavior delegated to the accepted record or schema type;
636
- - keeps any reused private helper name semantically true for all current
637
- callers;
638
- - avoids adjacent runtime surfaces such as loaders, registries, exporters,
639
- harnesses, CLI, or generated outputs unless explicitly in scope.
640
-
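Several items in the checklist above can be seen in one small bounded helper. This is an illustrative sketch only: `Reading`, `to_reading`, and the raw field names `id` and `val` are assumptions, not names from the package.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """Accepted record shape; frozen so identity and equality stay with the type."""

    sensor_id: str
    value: float


def to_reading(raw: dict) -> Reading:
    """Map raw source fields to internal names without mutating `raw`."""
    value = raw["val"]  # reads only the accepted fields
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError("val must be a real number, not bool")
    # "finite" is taken literally: NaN and both infinities are rejected.
    if not math.isfinite(value):
        raise ValueError("val must be finite")
    return Reading(sensor_id=str(raw["id"]), value=float(value))
```

The helper does not load, register, export, or write anything, which keeps it away from the adjacent runtime surfaces named in the last checklist item.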
641
- For documentation reviews, compare every newly edited absence clause against
642
- the implemented-surface list, package/module summaries, layout rows, and test
643
- summaries. Treat broad "no <category>" wording as a defect when a narrower
644
- bounded surface in that category is already accepted; ask for the smallest
645
- wording fix instead of reopening source or tests.
646
- Also treat leftover duplicated words, duplicated sentence tails, or malformed
647
- negative clauses as docs defects when they change or obscure the intended
648
- scope.
649
-
650
- Also compare every requested test-coverage detail against each edited test
651
- summary surface, including translated README surfaces. If one document keeps a
652
- generic coverage phrase while another contains the precise discriminating
653
- fixture or non-mutation target, request the smallest wording fix in the stale
654
- document only.
655
-
656
- Treat cross-surface leakage as a docs defect: a required rejection reason,
657
- metadata value, fixture ID, provenance field, or mutation target is still
658
- missing if it appears only in a helper/API list while the scoped test-summary
659
- paragraph omits it.
660
-
661
- Treat exact-contract leakage as a docs defect: if the user supplied a callable
662
- signature, return annotation, output container, or frozen/immutable record
663
- promise, every requested API or module surface that names the callable must
664
- state that exact contract or a direct equivalent in the same local context.
665
-
666
- Treat roster leakage as a docs defect: when a request extends an implemented
667
- surface that was previously listed through an older subject, any current
668
- helper list, module summary, package summary, layout row, parenthetical roster,
669
- or test summary that still stops at the predecessor is stale even if another
670
- surface in the same file already includes the new subject.
671
-
672
- Treat subject leakage as a docs defect too: a required fixture, metadata value,
673
- rejection reason, or config key is still missing if it appears only under a
674
- neighboring helper, scheduler, method, metric, or test block.
675
-
676
- Treat unrelated-subject drift as a docs defect. If a docs-sync task is scoped
677
- to one accepted symbol, helper, parser, method, metric, artifact, or test, do
678
- not accept rewrites to neighboring subjects merely because they are nearby in
679
- the same README. Request the smallest revert or wording trim unless the
680
- neighboring edit is necessary to keep a shared sentence, roster, or absence
681
- clause truthful.
682
-
683
- Treat validation-owner leakage as a docs defect: if a scoped invalid-value set
684
- applies to multiple named keys, fields, modes, or inputs, the test-summary
685
- surface is stale when it documents the invalid set for only one owner or hides
686
- the owner list behind a vague "invalid config" phrase.
687
-
688
- Treat validation-owner leakage as a test defect too: when the implementation
689
- contract names multiple finite values, integer fields, or parameters that must
690
- reject booleans, require at least one focused test per owner or a compact
691
- parametrized test that names each owner. Do not accept a single neighboring
692
- owner's `NaN`, infinity, or boolean test as coverage for the whole helper.
693
-
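A compact way to name each owner in one check, sketched without a test framework. `validate_window`, its three parameters, and `rejection_owner` are hypothetical; with pytest the same matrix would typically be written as a parametrized test.

```python
import math


def validate_window(start: float, stop: float, step: float) -> None:
    """Reject booleans and non-finite values for every named parameter."""
    for owner, value in (("start", start), ("stop", stop), ("step", step)):
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"{owner} must be a real number")
        if not math.isfinite(value):
            raise ValueError(f"{owner} must be finite")


def rejection_owner(**overrides):
    """Return the owner named in the error when one parameter is overridden."""
    args = {"start": 0.0, "stop": 1.0, "step": 0.1, **overrides}
    try:
        validate_window(**args)
    except (TypeError, ValueError) as error:
        return str(error).split()[0]
    return None


OWNERS = ("start", "stop", "step")
INVALID_VALUES = (float("nan"), float("inf"), True)

# One entry per (owner, invalid value), so no owner borrows a neighbor's coverage.
coverage = {
    (owner, repr(bad)): rejection_owner(**{owner: bad})
    for owner in OWNERS
    for bad in INVALID_VALUES
}
```

If validation for `step` were dropped, only the `step` entries would change, which is what a single neighboring owner's test cannot show.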
694
- Treat default/config leakage as a docs defect: a test-summary surface is stale
695
- when it labels a fixture with an explicit non-default configuration as default,
696
- or when it documents the configured fixture but omits the separate default
697
- fixture expectation named by the scope.
698
-
699
- For documentation reviews, also check the scope sentence or heading that
700
- introduces grouped bullets. A fact is misdocumented if it appears under a group
701
- where one or more named subjects do not own that metadata value, rejection
702
- reason, config key, fixture, or coverage case, even if the fact is present
703
- somewhere in the requested file.
704
-
705
- Treat Markdown hierarchy drift as a docs defect. If an edited file, module,
706
- test, artifact, or capability bullet becomes nested under a neighboring
707
- coverage block or subject, request the smallest possible indentation fix even
708
- when the words themselves are correct.
709
-
710
- For docs that describe staged, filtered, or multi-path behavior, verify that the
711
- primary ordering coverage, phase priority, subset definition, and each
712
- rejection reason are all present in the correct test-summary surface. A broad
713
- "rejected with <reason>" phrase is a defect when only one filtered subset uses
714
- that reason and another path uses budget, validation, or a different rejection
715
- contract.
716
-
717
- Review tests against their fixture values and names. If a test name says
718
- "all-zero", "empty", "single", "all", or "none", the fixture should actually
719
- match that case. Passing tests are not enough when naming, boundary, or
720
- provenance contracts are misleading.
721
-
722
- For variants added next to an existing formula or helper, check that at least
723
- one focused test distinguishes the new variant from the nearest existing one.
724
- Do not accept a mixed fixture that would still pass if the implementation used
725
- the previous threshold, percentile, sort direction, condition, or field.
726
-
727
- For sort-chain tests, trace the expected order through the exact sort tuple.
728
- Reject a fixture if the expected winner is also favored by a lower-priority
729
- fallback key or by an unrelated aligned field. Each named tie-break level should
730
- have at least one fixture that would fail if that level were omitted.
731
- Do not trust helper names, object IDs, or comments that say "earlier", "later",
732
- "best", or "tie" unless the underlying field values prove that relationship and
733
- the fallback fields are adversarial where needed.
734
- If review scope asks for a combined fallback chain, such as `<primary>/<id>` or
735
- `<primary>/<secondary>/<id>`, verify there is both a dominance fixture for each
736
- non-final key and a same-higher-key fixture for the final identifier fallback.
737
-
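The fixture requirements above can be made concrete with a three-level chain. The `Job` record and the `priority`/`created`/`job_id` tuple are assumptions chosen for illustration; each fixture sets the lower-priority keys against the expected winner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    job_id: str
    priority: int
    created: int


def order(jobs: list[Job]) -> list[str]:
    """Sort by higher priority, then earlier creation, then job_id."""
    ranked = sorted(jobs, key=lambda j: (-j.priority, j.created, j.job_id))
    return [job.job_id for job in ranked]


# Dominance fixture for `priority`: the winner is created later and has the
# larger id, so only the priority key can put it first.
priority_fixture = [Job("a", 1, 1), Job("z", 2, 9)]

# Dominance fixture for `created`: same priority, the winner has the larger id.
created_fixture = [Job("a", 1, 2), Job("z", 1, 1)]

# Same-higher-key fixture for the final `job_id` fallback.
id_fixture = [Job("b", 1, 1), Job("a", 1, 1)]
```

Dropping `priority` from the tuple would flip the first fixture, and dropping `created` would flip the second, so each named level has a fixture that fails when that level is omitted.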
738
- For staged, filtered, or partitioned-lane schedulers, review the sort tuple for
739
- every stage, lane, or accepted subset separately, including default or regular
740
- subsets. If a phase says it orders by one key and then a fallback key, require
741
- one discriminator for the first key and a separate same-key fixture for the
742
- fallback; do not accept a same-key fixture as evidence that the first key is
743
- implemented.
744
-
745
- For resource-isolation claims, check the fixture has unused capacity in at least
746
- one non-borrowing partition and an over-limit item in another partition. A test
747
- where each lane, quota, queue, or resource pool exactly consumes its own budget
748
- does not prove that unused capacity cannot leak across boundaries.
749
-
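A sketch of a fixture that can actually detect cross-lane borrowing. `select_per_lane` is a hypothetical greedy selector over `(item_id, lane, cost)` tuples, not the package's scheduler.

```python
def select_per_lane(
    items: list[tuple[str, str, int]], lane_budgets: dict[str, int]
) -> tuple[list[str], list[str]]:
    """Select items greedily; unused budget in one lane is never lent out."""
    remaining = dict(lane_budgets)  # copy so the caller's budgets are not mutated
    selected, rejected = [], []
    for item_id, lane, cost in items:
        if cost <= remaining[lane]:
            remaining[lane] -= cost
            selected.append(item_id)
        else:
            rejected.append(item_id)
    return selected, rejected


# Lane "a" ends with 7 units unused; "b2" is over lane "b"'s remaining budget.
budgets = {"a": 10, "b": 5}
items = [("a1", "a", 3), ("b1", "b", 4), ("b2", "b", 3)]
selected, rejected = select_per_lane(items, budgets)
```

An implementation that pooled the budgets would accept `b2`, so this fixture distinguishes isolation from borrowing, unlike one where every lane exactly consumes its own budget.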
750
- For budgeted selectors, review invalid-budget tests separately from
751
- over-budget selection tests. If the task asked for budget rejection and reason
752
- capture, require a valid-budget fixture where an otherwise eligible candidate
753
- is rejected only because remaining budget is insufficient, and assert that
754
- candidate's exact rejection reason. Do not count missing-budget behavior,
755
- invalid-budget exceptions, filtered objects, cadence skips, or type rejections
756
- as over-budget coverage.
757
-
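To show the separation between invalid-budget and over-budget coverage, here is a minimal sketch. `select_within_budget` and the reason string `"over_budget"` are hypothetical.

```python
def select_within_budget(
    candidates: list[tuple[str, int]], budget: int
) -> tuple[list[str], dict[str, str]]:
    """Select candidates in order while budget remains; record rejection reasons."""
    if isinstance(budget, bool) or not isinstance(budget, int) or budget < 0:
        raise ValueError("budget must be a non-negative integer")
    remaining = budget
    selected: list[str] = []
    rejected: dict[str, str] = {}
    for candidate_id, cost in candidates:
        if cost <= remaining:
            selected.append(candidate_id)
            remaining -= cost
        else:
            rejected[candidate_id] = "over_budget"
    return selected, rejected


# Valid budget: "b" is otherwise eligible and is rejected only for budget.
selected, rejected = select_within_budget([("a", 3), ("b", 4), ("c", 2)], 5)
```

The invalid-budget path raises before any selection happens, so a test for it says nothing about the `"over_budget"` reason and needs to be reviewed as a separate case.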
758
- For ordered selectors, review missing-budget or unbounded-budget assertions
759
- against the same filtering, staging, and sort tuple used by constrained-budget
760
- selection. Selecting every eligible candidate should still prove the accepted
761
- ordering contract unless the requested behavior explicitly preserves input
762
- order.
763
-
764
- When addressing review feedback in a file with repeated tests or similar helper
765
- fixtures, verify the exact named test, helper, or caller cited by the review was
766
- changed. Do not treat a similar edit in a neighboring existing test as
767
- satisfying feedback for the new surface.
768
-
769
- ## Readability Audit
770
-
771
- After edits, audit:
772
-
773
- - names match real meaning and data shape;
774
- - data flow is direct and naturally ordered;
775
- - functions, files, and modules have clear responsibilities;
776
- - abstractions reduce real complexity rather than add jumps;
777
- - no avoidable global state, hidden paths, repeated registration points, or
778
- heavy config burden were added;
779
- - the change stayed local to the natural owner;
780
- - harness and test responsibilities remain separate;
781
- - artifact schemas, exporters, docs, and tests agree when any changed;
782
- - framework docs were updated or confirmed current when in scope;
783
- - external reused code has compatible license and attribution;
784
- - no generated cache/build/test/output/result artifacts were left behind unless
785
- explicitly requested.
786
-
787
- For skill edits, also perform a project leakage audit. Remove or generalize any
788
- real project path, symbol, dataset, method, metric, harness, test, artifact
789
- field, historical output, or one-off debug lesson that does not hold across
790
- repositories.
791
-
792
- ## Final Response
793
-
794
- Keep the final response concise:
795
-
796
- - changed paths;
797
- - behavior or contract covered;
798
- - validation performed, using readback/search checks for docs-only work;
799
- - caveats that affect the user's next action.
800
-
801
- Do not explain skill internals, tool mechanics, or style theory unless the user
802
- asked for a skill optimizer report.