academic-army 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (68)
  1. package/.editorconfig +9 -0
  2. package/.github/workflows/publish.yml +44 -0
  3. package/.prettierrc.json +3 -0
  4. package/LICENSE +21 -0
  5. package/README.md +172 -0
  6. package/README.zh-CN.md +172 -0
  7. package/agent-forge.yaml +83 -0
  8. package/eslint.config.js +28 -0
  9. package/install_mcp.py +85 -0
  10. package/mcp-server/__main__.py +33 -0
  11. package/mcp-server/deepresearch/__init__.py +3 -0
  12. package/mcp-server/deepresearch/tools.py +33 -0
  13. package/mcp-server/requirements.txt +4 -0
  14. package/metaskills/README.md +131 -0
  15. package/metaskills/README.zh-CN.md +131 -0
  16. package/metaskills/academic-army-architect/METASKILL.md +91 -0
  17. package/metaskills/academic-army-architect/envolve.sh +9 -0
  18. package/metaskills/academic-army-coding-plan/ENVOLVETASK.md +1 -0
  19. package/metaskills/academic-army-coding-plan/METASKILL.md +118 -0
  20. package/metaskills/academic-army-coding-plan/envolve.sh +9 -0
  21. package/metaskills/academic-army-coding-style/METASKILL.md +292 -0
  22. package/metaskills/academic-army-experiment-plan/ENVOLVETASK.md +1 -0
  23. package/metaskills/academic-army-experiment-plan/METASKILL.md +82 -0
  24. package/metaskills/academic-army-experiment-plan/envolve.sh +9 -0
  25. package/metaskills/academic-army-repo-scaffold/ENVOLVETASK.md +1 -0
  26. package/metaskills/academic-army-repo-scaffold/METASKILL.md +223 -0
  27. package/metaskills/academic-army-repo-scaffold/envolve.sh +9 -0
  28. package/package.json +35 -0
  29. package/runs/develop-skill.sh +17 -0
  30. package/runs/develop.sh +16 -0
  31. package/skills/academic-army-architect/SKILL.md +336 -0
  32. package/skills/academic-army-architect/agents/openai.yaml +11 -0
  33. package/skills/academic-army-architect/references/blueprint-schema.md +345 -0
  34. package/skills/academic-army-coding-plan/SKILL.md +491 -0
  35. package/skills/academic-army-coding-plan/agents/openai.yaml +11 -0
  36. package/skills/academic-army-coding-style/SKILL.md +915 -0
  37. package/skills/academic-army-coding-style/agents/openai.yaml +11 -0
  38. package/skills/academic-army-experiment-plan/SKILL.md +517 -0
  39. package/skills/academic-army-experiment-plan/agents/openai.yaml +11 -0
  40. package/skills/academic-army-repo-scaffold/SKILL.md +756 -0
  41. package/skills/academic-army-repo-scaffold/agents/openai.yaml +10 -0
  42. package/src/README.md +79 -0
  43. package/src/README.zh-CN.md +79 -0
  44. package/src/cli.ts +55 -0
  45. package/src/developing/README.md +146 -0
  46. package/src/developing/README.zh-CN.md +146 -0
  47. package/src/developing/agents/developer.ts +40 -0
  48. package/src/developing/agents/factory.ts +11 -0
  49. package/src/developing/agents/index.ts +8 -0
  50. package/src/developing/agents/manager.ts +74 -0
  51. package/src/developing/agents/prompts.ts +12 -0
  52. package/src/developing/agents/reviewer.ts +44 -0
  53. package/src/developing/agents/trajectory-optimizer.ts +70 -0
  54. package/src/developing/agents/types.ts +41 -0
  55. package/src/developing/index.ts +2 -0
  56. package/src/developing/pipeline.ts +306 -0
  57. package/src/developing/pipelineskill.ts +169 -0
  58. package/src/evolve-skill/README.md +116 -0
  59. package/src/evolve-skill/README.zh-CN.md +116 -0
  60. package/src/evolve-skill/agents/evaluator.ts +28 -0
  61. package/src/evolve-skill/agents/factory.ts +11 -0
  62. package/src/evolve-skill/agents/index.ts +4 -0
  63. package/src/evolve-skill/agents/modifier.ts +27 -0
  64. package/src/evolve-skill/agents/runner.ts +19 -0
  65. package/src/evolve-skill/index.ts +1 -0
  66. package/src/evolve-skill/pipeline.ts +140 -0
  67. package/src/pipeline.ts +65 -0
  68. package/tsconfig.json +22 -0
@@ -0,0 +1,756 @@
1
+ ---
2
+ name: academic-army-repo-scaffold
3
+ description: >-
4
+ Initialize template-first research code repositories for the Academic Army
5
+ autoresearch workflow from a paper_blueprint, experiment plan, coding plan,
6
+ and user-specified repository path. Use when Codex needs to create or adapt a
7
+ real starter repository scaffold generated by an initializer, template tool,
8
+ official starter, or high-quality template repository, then overlay fixed
9
+ experiment directories, bilingual README/REFERENCES files, semantic harness
10
+ folders, and template-informed test layout notes without implementing paper
11
+ methods, experiments, metrics, runners, loaders, exporters, tests, or business
12
+ logic.
13
+ ---
14
+
15
+ # Academic Army Repo Scaffold
16
+
17
+ ## Purpose
18
+
19
+ Create a real starter repository first, then add the Academic Army experiment
20
+ overlay. The primary artifact is the generated starter repo plus the experiment
21
+ directory structure. `README.md`, `README.zh-CN.md`, `REFERENCES.md`, and
22
+ `REFERENCES.zh-CN.md` are supporting documentation.
23
+
24
+ This skill is scaffold-only:
25
+
26
+ - Generate or preserve starter files, boilerplate source, project metadata,
27
+ dependency declarations, build/test configuration roles, minimal entry files,
28
+ and ecosystem conventions from the selected template or initializer.
29
+ - Do not implement paper methods, data loaders, metric formulas, result
30
+ exporters, experiment runners, harness logic, test logic, configuration
31
+ parsing, or domain business behavior.
32
+ - Do not install project dependencies, resolve dependencies, generate a new
33
+ lock state, run generated code, run tests, run harnesses, or execute
34
+ experiments.
35
+
36
+ If the final repository contains only documentation and empty directories,
37
+ report scaffold generation as failed. A completed scaffold needs substantive
38
+ starter-repo artifacts discovered from the selected template, such as project
39
+ metadata, dependency declaration, build/run configuration, selected source
40
+ layout, selected test layout, entry points, examples, or an initializer-specific
41
+ artifact.
42
+ Hand-authored metadata plus empty package directories is not enough unless a
43
+ real generator or template produced that structure and the artifact registry can
44
+ identify the generated project roles.
45
+
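The failure condition above (documentation plus empty directories only) could be checked statically along these lines. This is a minimal sketch, not part of the package: the function name `is_documentation_only` and the `DOC_NAMES` set are hypothetical, and a real check would also need to skip version-control metadata.

```python
from pathlib import Path

# Hypothetical set of overlay document names, taken from the files named above.
DOC_NAMES = {"README.md", "README.zh-CN.md", "REFERENCES.md", "REFERENCES.zh-CN.md"}


def is_documentation_only(repo_root: Path) -> bool:
    """Return True when every file under repo_root is an overlay document.

    A tree that holds only empty directories also returns True, because
    it contains no starter-repo artifact at all.
    """
    files = [p for p in repo_root.rglob("*") if p.is_file()]
    return all(p.name in DOC_NAMES for p in files)
```

A result of `True` would correspond to reporting scaffold generation as failed.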
46
+ ## Inputs And Scope
47
+
48
+ Require a target repository path. If the user does not provide one, ask before
49
+ creating files.
50
+
51
+ Read only the inputs needed to initialize the repository:
52
+
53
+ - paper blueprint
54
+ - experiment plan
55
+ - coding plan
56
+ - explicit user constraints about target path, existing-repo adaptation,
57
+ language, framework, runtime, template, dependency policy, or repository
58
+ preservation
59
+
60
+ If planning paths are provided, read those files. Otherwise locate the closest
61
+ conventional planning artifacts by name, then stop. Do not inspect unrelated
62
+ nearby source trees, logs, notebooks, old outputs, package manifests, or other
63
+ workspace noise.
64
+
65
+ Read files inside the target repository only when:
66
+
67
+ - the user explicitly asks to adapt an existing repository,
68
+ - an initializer has generated files there that must be inspected,
69
+ - existing scaffold files must be preserved or refreshed safely, or
70
+ - target contents must be classified before writing.
71
+
72
+ Keep runtime mechanics out of generated repository documents. Do not write about
73
+ sandbox limits, shell failures, MCP failures, dependency-install failures, or
74
+ local workaround details unless the user explicitly asks for operational notes.
75
+
76
+ ## Path And Preservation
77
+
78
+ Resolve the target repository path before writing. Every created or modified
79
+ repository file must stay inside that path. Repository documents should use
80
+ repository-relative paths for internal files.
81
+
82
+ Treat planning inputs outside the repository as invocation context only. Do not
83
+ reference them from reusable repository documents unless the user explicitly
84
+ asks to copy them into the repository. Repository documentation must not contain
85
+ broken repo-relative references, parent-relative workspace paths, machine
86
+ absolute paths, or workspace-specific commands.
87
+
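The containment rule above (every written file stays inside the resolved target path) can be sketched as a small guard. The helper name `stays_inside` is an assumption for illustration; the skill does not prescribe how the check is performed.

```python
from pathlib import Path


def stays_inside(target_root: Path, candidate: Path) -> bool:
    """Return True when candidate resolves to a location inside target_root.

    Relative candidates are interpreted against target_root; absolute
    candidates are checked as given.
    """
    root = target_root.resolve()
    resolved = (root / candidate).resolve()
    return resolved == root or root in resolved.parents
```

Resolving before comparing means that parent-relative segments such as `..` cannot escape the target path unnoticed.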
88
+ If the target path already contains files, classify them before changing
89
+ anything:
90
+
91
+ - existing starter-repo files to preserve
92
+ - generated scaffold docs that can be refreshed safely
93
+ - generated scaffold residue that can be pruned safely
94
+ - user-authored or ambiguous content that must be preserved
95
+
96
+ Do not overwrite non-generated or ambiguous content. Ask before destructive
97
+ replacement when authorship is unclear. If an existing target is
98
+ documentation-only, upgrade it with a real starter-repository layer while
99
+ preserving user-authored material.
100
+
101
+ ## Required DeepResearch
102
+
103
+ Run `academic_army_mcp_tools.deepresearch` before choosing the initializer or
104
+ template, unless the task includes a fresh directly relevant lookup artifact
105
+ that already compares scaffold-generation options for this project.
106
+
107
+ DeepResearch must identify the target ecosystem from the planning inputs and
108
+ compare ways to generate a real starter repository:
109
+
110
+ - official initialization commands and framework CLIs
111
+ - template tools and ecosystem package generators
112
+ - high-quality template repositories and GitHub template repositories
113
+ - research-code, benchmark, and harness templates
114
+ - high-quality public repositories in the paper domain, used for structural
115
+ lessons or implementation references
116
+ - package-management, dependency-declaration, and environment-isolation
117
+ practices
118
+ - test-layout conventions when the selected generator does not create tests
119
+ - installable dependencies and reference-only sources, including license,
120
+ packaging quality, interface stability, maintenance, and risk notes
121
+
122
+ Do not hardcode a language, runtime, package manager, dependency file, build
123
+ file, source path, test path, test framework, or configuration file in this
124
+ skill. Select them at invocation time from user constraints, planning inputs,
125
+ DeepResearch evidence, template quality, license clarity, generated output
126
+ shape, and downstream implementation cost.
127
+
128
+ When no language/runtime is specified, infer a practical starter ecosystem from
129
+ the plans and DeepResearch evidence. Do not leave language/runtime undecided
130
+ for a new scaffold.
131
+
132
+ Use a prompt shaped like:
133
+
134
+ ```text
135
+ Research template-first repository initialization options for an Academic Army
136
+ research-code project.
137
+
138
+ Project context:
139
+ [paper goal, domain, candidate methods, experiment harness needs, coding-plan
140
+ logical modules, explicit user constraints, target repository constraints]
141
+
142
+ Return:
143
+ - official initializers, template tools, template repositories, research-code
144
+ templates, benchmark templates, and related high-quality repositories
145
+ - expected generated output shape for each candidate by role: project metadata,
146
+ dependency declaration, build/run configuration, source layout, test layout,
147
+ entry points, docs, examples
148
+ - generated-structure comparison across plausible candidates: directory shape,
149
+ dependency mechanism, entry files, test structure, configuration complexity,
150
+ documentation quality, license, maintenance, and fit for the paper workflow
151
+ - dependency declaration mechanism, repo-local installation workflow, and
152
+ environment isolation approach
153
+ - runtime dependency candidates, development/test dependency candidates, and
154
+ reference-only sources with license, packaging, stability, maintenance, and
155
+ risk notes
156
+ - exact release tag, package version, commit SHA, license, or dated unpinned
157
+ access snapshot where practical
158
+ - which candidates are safe to generate from, adapt, cite only, or reject
159
+ - recommended initializer/template and why
160
+ ```
161
+
162
+ If multiple candidates remain plausible, compare actual generated structures.
163
+ Inspect docs, run a dry generator in a disposable location, or statically inspect
164
+ a template repo when candidate output shape is unclear. Do not choose from
165
+ descriptions alone.
166
+
167
+ ## Template-First Design
168
+
169
+ Use a hybrid layout:
170
+
171
+ - **Template layer:** generated by the selected initializer, template tool,
172
+ official starter, or template repository. Preserve its starter source,
173
+ metadata, dependency declarations, build/test configuration roles, entry
174
+ points, examples, and ecosystem conventions unless they are unrelated sample
175
+ residue, ambiguous third-party code, or license-risky implementation content.
176
+ - **Experiment overlay:** fixed Academic Army top-level directories and
177
+ bilingual documentation.
178
+
179
+ Always create or preserve these overlay entries:
180
+
181
+ - `data/`: input data and dataset assets
182
+ - `output/`: program run outputs and intermediate artifacts
183
+ - `results/`: experiment result records and paper-facing summaries
184
+ - `harness/`: all research/evaluation harnesses
185
+ - `README.md`: English repository overview
186
+ - `README.zh-CN.md`: Chinese repository overview
187
+ - `REFERENCES.md`: English provenance and external references
188
+ - `REFERENCES.zh-CN.md`: Chinese provenance and external references
189
+
190
+ The fixed directories define experiment workflow semantics only. They do not
191
+ prescribe the target ecosystem's internal source or test layout.
192
+
193
+ After generation, build an internal project artifact registry by inspecting the
194
+ actual repository. Record artifact roles, not hardcoded names:
195
+
196
+ - selected initializer/template and generation method
197
+ - generation evidence, such as command/source used and the starter artifacts it
198
+ produced before overlay edits
199
+ - a pre-overlay artifact snapshot, captured immediately after the initializer
200
+ or template finishes and before README/REFERENCES/harness edits
201
+ - template-origin evidence for each retained metadata, dependency, build,
202
+ source, test, entry, example, and tool-configuration role
203
+ - project metadata artifact
204
+ - dependency declaration artifact
205
+ - build/run configuration artifacts
206
+ - starter source layout and entry points
207
+ - test layout generated by the template, or the researched minimal test-layout
208
+ note when the template generated none
209
+ - fixed experiment directories
210
+ - harness subfolders and explanation files
211
+ - README/REFERENCES files
212
+ - dependency registry and reference-only source registry
213
+
214
+ Use this registry to drive dependency edits, installation instructions,
215
+ REFERENCES provenance, and static validation. Do not write the registry as a
216
+ separate manifest unless the user asks.
217
+
218
+ Normalize generated metadata after template creation. Remove, blank, or replace
219
+ personal names, personal emails, organization secrets, machine-specific paths,
220
+ and local initializer defaults unless the user supplied them.
221
+
222
+ Run a template-origin static check before accepting the starter layer. Confirm
223
+ that the selected initializer or template actually created the non-documentation
224
+ roles recorded in the registry. Do not accept a hand-built approximation of
225
+ metadata, empty source directories, and README files as equivalent to
226
+ initializer output.
227
+
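One way to picture the internal registry and the template-origin check described above is a pair of small data classes. This is a hedged sketch only: the class names, field names, and the `roles_missing_template_origin` method are hypothetical, and the skill keeps the registry internal rather than writing it as a manifest.

```python
from dataclasses import dataclass, field


@dataclass
class ArtifactRecord:
    role: str            # artifact role, e.g. "project metadata"
    path: str            # repository-relative path found by inspection
    from_template: bool  # True when present in the pre-overlay snapshot


@dataclass
class ArtifactRegistry:
    generator: str  # selected initializer or template, as recorded at generation
    records: list[ArtifactRecord] = field(default_factory=list)

    def roles_missing_template_origin(self, required_roles: set[str]) -> set[str]:
        """Return required roles that no template-origin artifact covers."""
        covered = {r.role for r in self.records if r.from_template}
        return required_roles - covered
```

Under this sketch, a non-empty result for the non-documentation roles would mean the starter layer is a hand-built approximation and should not be accepted.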
228
+ If the selected official initializer creates only a very thin package shell,
229
+ prefer a richer official mode or a better-maintained template when that choice
230
+ matches the project and does not add irrelevant sample application code. When a
231
+ thin official starter remains the best fit, record it internally as a weak pass:
232
+ the registry must show the exact initializer-origin metadata, dependency
233
+ mechanism, source layout, and test layout or absence, and the final response
234
+ must state that the starter layer is intentionally minimal. Do not compensate by
235
+ inventing empty business modules or executable paper logic.
236
+
237
+ Create additional package/source directories only when the selected template
238
+ generated them, or when the selected ecosystem's normal starter shape requires
239
+ them. The coding plan's logical modules do not automatically justify physical
240
+ package subdirectories. Prefer a single package namespace plus documentation of
241
+ module ownership over a forest of empty source subpackages. Treat repetitive
242
+ namespace-only subpackages that mirror planning nouns as residue unless the
243
+ chosen generator produced them or a concrete starter file gives each directory
244
+ a real ecosystem role. If a source directory is created, describe what it owns
245
+ in present-state responsibility language, not as an absence list.
246
+ Run an empty-package audit after overlay edits: a directory containing only
247
+ namespace files, marker files, or README-only notes must either be generated by
248
+ the selected starter or have a current project responsibility that is more
249
+ specific than a coding-plan module name.
250
+
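The empty-package audit above could start from a listing of leaf directories that hold nothing but marker files. The sketch below is illustrative: `MARKER_NAMES` is a hypothetical, ecosystem-specific set (the skill deliberately hardcodes none), and deciding whether a flagged directory has a real project responsibility remains a manual judgment.

```python
from pathlib import Path

# Hypothetical marker names; the real set depends on the selected ecosystem.
MARKER_NAMES = {"__init__.py", ".gitkeep", "README.md"}


def marker_only_dirs(source_root: Path) -> list[Path]:
    """List leaf directories whose files are all marker files.

    Completely empty leaf directories are flagged as well.
    """
    flagged = []
    for directory in sorted(p for p in source_root.rglob("*") if p.is_dir()):
        entries = list(directory.iterdir())
        has_subdirs = any(e.is_dir() for e in entries)
        files = [e for e in entries if e.is_file()]
        if not has_subdirs and all(f.name in MARKER_NAMES for f in files):
            flagged.append(directory)
    return flagged
```

Each flagged directory would then be checked against the pre-overlay snapshot: kept if the starter generated it, otherwise justified or removed.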
251
+ ## Dependencies And Installation
252
+
253
+ Classify sources from DeepResearch:
254
+
255
+ - **Installable dependencies:** packaging is clear, license is acceptable,
256
+ interfaces are stable enough, and the source is suitable for direct use via
257
+ the selected ecosystem's dependency mechanism.
258
+ - **Development/test installable dependencies:** tools needed because the
259
+ scaffold selected that development, testing, formatting, documentation, or
260
+ packaging workflow.
261
+ - **Reference-only sources:** useful for implementation ideas, benchmarks,
262
+ harness structure, or domain understanding, but unsuitable to install directly
263
+ because of license uncertainty, heavy dependencies, unstable interfaces,
264
+ maintenance risk, incompatible scope, or limited reuse need.
265
+
266
+ Maintain an internal runtime-dependency decision record. If the runtime
267
+ dependency set is empty, record the candidate runtime libraries considered,
268
+ their classification, why each was rejected, deferred, or made reference-only,
269
+ and the shared technical rationale that keeps runtime dependencies empty. Keep
270
+ README, REFERENCES, and the dependency declaration consistent with that record.
271
+
272
+ Write installable dependencies into the selected template's native dependency
273
+ declaration mechanism. If the template already generated a dependency
274
+ declaration artifact, update that mechanism instead of creating a parallel one.
275
+ If an installable-dependency category is intentionally empty, record a concrete
276
+ technical rationale in README and REFERENCES: no suitable directly installable
277
+ runtime library was selected, or runtime choices depend on unresolved substrate,
278
+ hardware, simulator, or framework compatibility. Do not frame empty categories
279
+ as an omission or as stage language such as "before code exists", "later", or
280
+ "not yet implemented".
281
+
282
+ Do not install, resolve, lock, download, or import project dependencies. If the
283
+ ecosystem normally uses a lock state that requires dependency resolution, keep a
284
+ template-provided lock state if present, but do not generate a new one by
285
+ running installation or resolution commands.
286
+
287
+ Do not add ignore rules for dependency lock artifacts by default. Track or
288
+ ignore lock artifacts according to the selected ecosystem, project type, and
289
+ DeepResearch evidence. If the project is a research application or experiment
290
+ repository, prefer allowing a later resolved lock artifact to be versioned for
291
+ reproducibility. If a library-style policy intentionally excludes a lock
292
+ artifact, record that rationale in the registry and explain the package-policy
293
+ choice in README or REFERENCES without mentioning the skill process.
294
+
295
+ README installation sections must:
296
+
297
+ - include `## Installation` in English and `## 安装` in Chinese,
298
+ - distinguish system prerequisites from project dependencies,
299
+ - use repo-local, workspace-local, project-local, sandboxed, containerized, or
300
+ environment-isolated setup when the ecosystem supports it,
301
+ - explain commands from the repository root,
302
+ - install project dependencies through the same dependency declaration mechanism
303
+ used by the template, or clearly label any manual fallback with identical
304
+ package/version constraints from the dependency registry,
305
+ - prefer adding a template-native development/test extra or equivalent
306
+ installable group over duplicating dependency names in fallback commands when
307
+ the ecosystem supports such a mechanism,
308
+ - omit a fallback path rather than writing commands that diverge from the
309
+ dependency declaration or REFERENCES registry,
310
+ - not install project dependencies globally,
311
+ - not include commands that run experiments, harnesses, tests, or paper methods,
312
+ - use neutral setup language that makes installation commands actionable from
313
+ the repository root without implying that dependency installation or
314
+ dependency resolution has already been run.
315
+
316
+ Static validation must compare the dependency declaration artifact, README
317
+ installation commands, and REFERENCES installable-dependency registry. A package
318
+ declared for runtime or development/test use must appear in the unified
319
+ Installable Dependencies registry with the same bucket and purpose. Reference-only
320
+ sources must not appear in dependency declarations or installation commands.
321
+
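The three-way comparison above can be reduced to set and dictionary checks once the package names and buckets have been extracted. The following is a minimal sketch under that assumption; the function name and the `package -> bucket` mapping shape are hypothetical, and extraction from the actual declaration file is ecosystem-specific and not shown.

```python
def dependency_mismatches(
    declared: dict[str, str],
    registry: dict[str, str],
    reference_only: set[str],
) -> list[str]:
    """Compare package -> bucket maps and report inconsistencies.

    declared:       packages found in the dependency declaration artifact
    registry:       packages listed in the REFERENCES installable registry
    reference_only: names of reference-only sources
    """
    problems = []
    for name, bucket in sorted(declared.items()):
        if name not in registry:
            problems.append(f"{name}: declared but missing from registry")
        elif registry[name] != bucket:
            problems.append(f"{name}: bucket {bucket!r} != registry {registry[name]!r}")
        if name in reference_only:
            problems.append(f"{name}: reference-only source is declared")
    return problems
```

An empty list would indicate that the declaration, the registry, and the reference-only list agree for the declared packages; README commands would need the same comparison.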
322
+ REFERENCES installable-dependency rows must include, for each selected package
323
+ or toolchain dependency:
324
+
325
+ - project/package name,
326
+ - source URL,
327
+ - license,
328
+ - selected version or version range,
329
+ - project role,
330
+ - direct-install rationale,
331
+ - repository consumer by role, such as package core, development tests,
332
+ build backend, harness family, exporter role, or documentation role.
333
+
334
+ Reference-only rows must include:
335
+
336
+ - project or source name,
337
+ - source URL,
338
+ - license status,
339
+ - reference value for this paper project,
340
+ - why it is not declared as a dependency,
341
+ - the concrete implementation, harness, benchmark, or comparison role it may
342
+ inform.
343
+
344
+ Do not replace per-source reasoning with a combined warning list. A shared
345
+ warning may appear after the table, but every reference-only source still needs
346
+ its own license status, why-not-dependency rationale, and concrete project role.
347
+
348
+ ## Test Layout
349
+
350
+ Testing is template-informed, not a fixed overlay.
351
+
352
+ 1. Inspect the selected initializer's generated test structure.
353
+ 2. If generated tests, test directories, or test configuration exist, preserve
354
+ that native structure and document its responsibility in current project
355
+ terms.
356
+ 3. If the initializer generated no test structure, use DeepResearch evidence for
357
+ the target ecosystem and add only the smallest compatible test entry or a
358
+ lightweight test-layout note.
359
+ 4. Do not create guessed subdivisions such as unit, functional, integration,
360
+ feature-specific, or coding-plan-derived test folders before concrete code
361
+ defines reliable boundaries.
362
+ 5. Do not add executable test logic during scaffold initialization.
363
+
364
+ README and any test-layout note must explain the selected test-layout basis
365
+ and the boundary between software correctness checks and research harnesses.
366
+ Use responsibility language: the test area is responsible for software
367
+ correctness validation of package behavior. Do not write that tests check,
368
+ verify, validate, exercise, or cover configuration parsing, record validation,
369
+ lifecycle transitions, metric arithmetic, export schemas, CLI wiring, or any
370
+ other implementation surface unless actual implementation objects and matching
371
+ test files exist. Keep test notes lighter than harness notes: describe where
372
+ package-level correctness checks belong and how they stay separate from
373
+ research evidence. Do not mention fixtures, interface contracts, smoke checks,
374
+ or other concrete test categories unless the matching files exist.
375
+
376
+ ## Harness Overlay
377
+
378
+ Create semantic subfolders under `harness/` for experiment objectives that are
379
+ already determined by the paper blueprint, experiment plan, and coding plan.
380
+ Names should describe the task; do not use abstract numbering.
381
+
382
+ Each harness folder must include an explanation file with these sections:
383
+
384
+ - `Purpose`
385
+ - `Experiment Objective`
386
+ - `Entrypoint Semantics`
387
+ - `Inputs`
388
+ - `Metrics`
389
+ - `Output Artifacts`
390
+ - `Results Relationship`
391
+
392
+ Write harness explanations as positive responsibility specs. Do not use
393
+ absence-list wording such as "no runner is included" as the main content, and
394
+ do not use stage/placeholder language. The specs describe what the harness
395
+ directory owns and how harness code relates to experiment evidence; they do not
396
+ implement harness logic.
397
+
398
+ Each section must contain task-specific content from the planning inputs. A
399
+ harness explanation with only section headers, generic prose, or repeated
400
+ absence statements is invalid. Inputs, metrics, and output artifacts should name
401
+ the relevant workload families, metric identifiers, and artifact families from
402
+ the experiment/coding plans.
403
+
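The section schema above lends itself to a simple structural check. The sketch below assumes the sections are written as markdown headings, which the skill does not state explicitly; it only detects absent or empty sections and cannot judge whether the content is task-specific.

```python
REQUIRED_SECTIONS = [
    "Purpose",
    "Experiment Objective",
    "Entrypoint Semantics",
    "Inputs",
    "Metrics",
    "Output Artifacts",
    "Results Relationship",
]


def missing_or_empty_sections(markdown: str) -> list[str]:
    """Return required sections that are absent or have no body text."""
    bodies: dict[str, list[str]] = {}
    current = None
    for line in markdown.splitlines():
        if line.startswith("#"):
            # Any heading level starts a new section (an assumption).
            current = line.lstrip("#").strip()
            bodies[current] = []
        elif current is not None and line.strip():
            bodies[current].append(line)
    return [name for name in REQUIRED_SECTIONS if not bodies.get(name)]
```

A header-only explanation file would return all seven names, matching the rule that such a file is invalid.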
404
+ ## Repository Documentation
405
+
406
+ README files should be concise and user-facing. Include:
407
+
408
+ - project purpose,
409
+ - installation,
410
+ - fixed experiment directory overview,
411
+ - harness map,
412
+ - test layout responsibilities and harness/test boundary,
413
+ - pointer to REFERENCES for external dependencies and source attributions.
414
+
415
+ README files must present the repository as the project itself. They must not
416
+ describe template selection, generator choice, DeepResearch, scaffold process,
417
+ candidate comparison, rejected candidates, or internal decision flow.
418
+ They must not list planning input files that live outside the repository. If
419
+ the user asks to keep planning inputs in the repository, copy or create them
420
+ inside the repository first and reference only repository-local paths.
421
+
422
+ Open README with project purpose and repository responsibilities, not with
423
+ generation-stage status or omissions. Use present-state role language. Avoid
424
+ documentation that reads as a stage note, placeholder note, or absence report.
425
+ Do not use whole-word/whole-phrase terms such as `future`, `placeholder`,
426
+ `scaffold stage`, `will be implemented later`, `to be filled`,
427
+ `template`, `scaffold`, `generated from`, `starter`, `boilerplate`,
428
+ `deepresearch`, `skill`, `Codex`, `initialization stage`, `current boundary`,
429
+ `reserved`, `reserved for`, `reserved package area`, `later`, `once`,
430
+ `does not include`, `not included`, `no runnable`, or similar process,
431
+ negative, or temporal phrasing in README, REFERENCES, harness explanations, or
432
+ test-layout notes. Match banned single words as whole words so terms such as
433
+ `preserved` are not false positives. Do not claim unimplemented paper methods,
434
+ experiments, metrics, runners, loaders, or exporters work.
435
+ Also avoid project-management or invocation terms such as `external task
436
+ inputs`, `external inputs`, `bucket`, `consulted`, `operator`, and `task` when
437
+ plain project documentation can say dependencies, references, setup, maintainers,
438
+ or users instead.
439
+
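The whole-word matching rule above can be illustrated with regular-expression word boundaries. This is a sketch with a small, illustrative subset of the banned terms; the names `BANNED_TERMS` and `find_banned_terms` are hypothetical.

```python
import re

# A small illustrative subset of the banned terms listed above.
BANNED_TERMS = ["placeholder", "reserved", "later", "will be implemented later"]

# Longer phrases go first so they win over the single words they contain.
_PATTERN = re.compile(
    r"\b(?:"
    + "|".join(re.escape(t) for t in sorted(BANNED_TERMS, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def find_banned_terms(text: str) -> list[str]:
    """Return banned terms found in text, matched as whole words or phrases."""
    return [m.group(0).lower() for m in _PATTERN.finditer(text)]
```

Because `\b` requires a word boundary on both sides, `preserved` does not trigger the `reserved` rule, which is the false positive the text warns about.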
440
+ Chinese README and REFERENCES should be polished Chinese technical
441
+ documentation. Translate ordinary workflow terms; keep identifiers, package
442
+ names, commands, filenames, artifact names, URLs, licenses, and precision terms
443
+ in English when translation would reduce clarity. Chinese REFERENCES should use
444
+ Chinese field labels rather than English key dumps.
445
+
446
+ REFERENCES must be category-driven and include:
447
+
448
+ - unified Installable Dependencies registry with runtime and development/test
449
+ buckets, plus other project dependency buckets if the generated dependency
450
+ mechanism requires them,
451
+ - package-management and installation strategy sources,
452
+ - source attributions only when external or inherited files, notices, or
453
+ licenses require attribution,
454
+ - harness/benchmark references,
455
+ - reference-only repositories,
456
+ - implementation references.
457
+
458
+ REFERENCES should record sources that the current project actually uses, depends
459
+ on, needs to attribute, or cites as implementation/harness/benchmark references.
460
+ Do not record template search history, rejected templates, DeepResearch
461
+ scratchwork, generator rationale, or sources that did not affect project files,
462
+ dependencies, attribution, or useful implementation references.
463
+ Do not list broad sources merely because they appeared in research. Each
464
+ REFERENCES entry must state its concrete relationship to the repository. Remove
465
+ benchmark or harness sources that only served as general background and did not
466
+ shape retained files, dependency choices, harness organization, benchmark
467
+ semantics, or implementation handoff.
468
+ If a benchmark or tool source is kept, explain the specific retained convention,
469
+ comparison role, artifact relationship, or implementation handoff it supports.
470
+
471
+ If a template source or generated file must be attributed for license reasons,
472
+ record it briefly under source attributions. If the template only provided
473
+ structure and the retained files have been rewritten into project-specific
474
+ content without attribution requirements, do not document the template process.
475
+ Do not add a source-attribution row for repository-authored files owned by the
476
+ current project; the repository license covers those files.
477
+
478
+ For reference-only sources with unknown, unresolved, custom, restrictive,
479
+ research-only, or incompatible licenses, include an explicit warning forbidding
480
+ copying, porting, or directly reusing code until license and compatibility are
481
+ verified. Include an equivalent natural Chinese warning in
482
+ `REFERENCES.zh-CN.md`.
483
+
484
+ Avoid vague `consulted` wording. Use precise relationship labels such as
485
+ installable dependency, build dependency, source attribution, harness reference,
486
+ benchmark reference, comparison reference, or implementation reference. Use
487
+ attribution or retained-file wording only when a source actually produced files
488
+ in the repository or license terms require attribution.
489
+
490
+ ## Workflow
491
+
492
+ 1. Resolve the target repository path and confirm the write scope.
493
+ 2. Read only the planning inputs and explicit user constraints needed for
494
+ scaffold generation.
495
+ 3. Extract scaffold requirements: target ecosystem signals, experiment forms,
496
+ harness objectives, dependency needs, input data families, output/result
497
+ artifact families, and downstream implementation direction.
498
+ 4. Run DeepResearch for initializers, templates, generated structures, package
499
+ management, environment isolation, test layout, installable dependencies,
500
+ reference-only sources, and domain repositories.
501
+ 5. Infer language/runtime/ecosystem and record the basis.
502
+ 6. Compare candidate generated structures and choose an initializer/template.
503
+ 7. Generate the starter repository inside the target path. If an existing
504
+ documentation-only scaffold is present, generate in a disposable location and
505
+ merge starter artifacts into the target while preserving user content.
506
+ 8. Capture a pre-overlay artifact snapshot, inspect generated artifacts, and
507
+ build the internal project artifact registry.
508
+ 9. Preserve template starter structure; prune only unrelated sample residue,
509
+ unsuitable third-party implementation code, or files that conflict with the
510
+ scaffold boundary.
511
+ 10. Normalize local initializer metadata not supplied by the user.
512
+ 11. Check source/package directories against the selected template output.
513
+ Remove or avoid empty subpackage forests that only mirror planning modules.
514
+ 12. Classify installable dependencies and reference-only sources. Update the
515
+ template-native dependency declaration without installing or resolving.
516
+ 13. Record the runtime-dependency decision, including intentionally empty
517
+ runtime sets and the candidates assigned to reference-only or deferred
518
+ categories.
519
+ 14. Determine repo-local installation strategy from the template and
520
+ DeepResearch evidence.
521
+ 15. Overlay `data/`, `output/`, `results/`, and `harness/`.
522
+ 16. Add semantic harness folders with schema-complete explanation files.
523
+ 17. Preserve generated test structure, or add only the minimal researched
524
+ test-layout note when no test structure was generated.
525
+ 18. Create concise README files and detailed category-driven REFERENCES files.
526
+ 19. Run static validation from the project artifact registry. Revise until all
527
+ gates pass.
528
+
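Steps 8 and 9 depend on knowing which files existed before the overlay. A minimal sketch of one way to capture and compare such a snapshot follows. The function names `snapshot` and `overwritten` are hypothetical, and hashing file bytes with SHA-256 is an assumption rather than something this skill prescribes. The sketch only reads files, so it stays within the static-only boundary.

```python
import hashlib
from pathlib import Path


def snapshot(root: Path) -> dict[str, str]:
    """Map each file's repo-relative POSIX path to the SHA-256 of its bytes.

    Hypothetical helper: walks every file under root, including hidden ones.
    """
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def overwritten(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """Return starter files whose content changed or disappeared after the overlay."""
    return sorted(p for p, digest in before.items() if after.get(p) != digest)
```

Paths reported by `overwritten` are candidates for review rather than automatic failures, since step 9 permits pruning unrelated sample residue and step 10 permits normalizing initializer metadata.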
529
+ ## Static Validation
530
+
531
+ Perform static checks only. Do not install dependencies, lint, format, type
532
+ check, run generated project code, run generated project scripts, run tests, run
533
+ harnesses, or execute experiments.
534
+
535
+ Validation must confirm by role, using the project artifact registry:
536
+
537
+ - all created or modified repository files are inside the target path,
538
+ - a starter repository was generated or a suitable existing starter layer was
539
+ preserved,
540
+ - the pre-overlay artifact snapshot proves which project roles came from the
541
+ selected initializer/template rather than hand-authored mimicry,
542
+ - the starter layer includes substantive non-documentation artifacts such as
543
+ project metadata, dependency declaration, build/run configuration, source
544
+ layout, test layout, entry points, examples, or initializer-specific files,
545
+ - thin official starters are flagged as weak passes unless the registry shows a
546
+ richer generated starter layer; weak passes must still have initializer-origin
547
+ metadata, dependency mechanism, source layout, and test-layout evidence,
548
+ - the final repository is not documentation-only,
549
+ - template-generated source, package, build, test, entry, and tool-config roles
550
+ were not overwritten by the fixed experiment overlay,
551
+ - fixed overlay directories and bilingual docs exist,
552
+ - each harness has a semantic subfolder and schema-complete explanation file,
553
+ - test structure follows the selected template or DeepResearch-supported
554
+ ecosystem layout,
555
+ - no guessed test subdivisions were added before implementation code exists,
556
+ - README describes the test-layout basis and harness/test boundary,
557
+ - README and test notes describe test responsibilities without claiming
558
+ implementation-specific checks exist before corresponding code and tests
559
+ exist,
560
+ - concrete test ecosystem wording has matching development/test dependency
561
+ declarations,
562
+ - test notes stay lightweight when the selected test area has no executable
563
+ test files, avoiding fixture, contract, smoke, CLI, schema, or parser wording
564
+ unless matching files exist,
565
+ - dependency declaration, README installation commands, and REFERENCES
566
+ Installable Dependencies registry agree exactly,
567
+ - the runtime-dependency decision record agrees with README, REFERENCES, and
568
+ the dependency declaration, including empty runtime dependency sets,
569
+ - fallback installation commands either consume the selected dependency
570
+ declaration mechanism or use the same package names and version constraints
571
+ recorded in REFERENCES,
572
+ - installable dependencies in configuration appear in the registry with matching
573
+ bucket and purpose,
574
+ - registry entries marked installable appear in the dependency declaration,
575
+ - candidates classified as rejected, deferred, or reference-only are absent from
576
+ dependency declarations and installation commands,
577
+ - each REFERENCES installable-dependency entry includes source URL, license,
578
+ selected version or version range, project role, direct-install rationale, and
579
+ repository consumer role,
580
+ - reference-only sources appear in REFERENCES and are absent from dependency
581
+ declarations and installation commands,
582
+ - each reference-only entry includes source URL, license status, project
583
+ reference value, why it is not declared as a dependency, and the concrete
584
+ implementation, harness, benchmark, or comparison role it may inform,
585
+ - risky reference-only sources have explicit English and Chinese no-copy/no-port
586
+ warnings,
587
+ - lock artifact policy follows the selected ecosystem and project type; lock
588
+ artifacts are not ignored by default, and any intentional library-style ignore
589
+ rule has a registry rationale and matching README or REFERENCES explanation,
590
+ - README installation sections exist, use isolated/repo-local project dependency
591
+ setup where available, and do not claim installation was run,
592
+ - README and REFERENCES agree with project dependencies, source attributions,
593
+ installation strategy, test layout responsibilities, fixed directories,
594
+ harness structure, and actual tree,
595
+ - README and repository docs do not reference planning input files outside the
596
+ target repository; any referenced planning files exist inside the repository
597
+ or are omitted from project docs,
598
+ - repo-relative reference audit passes: repository docs contain no relative
599
+ links, inline file references, or path-like mentions that point outside the
600
+ target repository or to missing files, unless the text clearly identifies a
601
+ public URL rather than a repository path,
602
+ - generated metadata has no personal author/email/local-path values unless
603
+ supplied by the user,
604
+ - README, REFERENCES, harness explanations, and test-layout notes use
605
+ present-state objective language and avoid banned process, template, stage,
606
+ placeholder, reservation, and absence phrases,
607
+ - README begins with purpose and project structure rather than a list
608
+ of missing implementation pieces,
609
+ - harness explanation sections contain task-specific inputs, metrics, output
610
+ artifacts, and result-record relationships rather than generic section text,
611
+ - source package additions are template-generated, backed by starter files, or
612
+ consolidated into a single package-layout note,
613
+ - empty source subpackage forests that mirror coding-plan modules are absent
614
+ unless the selected template generated them or each directory has a concrete
615
+ starter-repo role,
616
+ - empty-package audit passes for all source/package directories added after the
617
+ starter was generated,
618
+ - generated docs do not describe unimplemented functionality as working,
619
+ - project-only documentation audit passes: repository docs describe the current
620
+ project, not template generation, scaffold process, DeepResearch, skill
621
+ execution, or internal decision flow,
622
+ - project-only documentation audit rejects internal artifact-management wording
623
+ such as external inputs, bucket, consulted, operator-executed setup, broad
624
+ research-log language, or other invocation-process phrasing when ordinary
625
+ project documentation can be used,
626
+ - REFERENCES contains only sources that are project dependencies, source
627
+ attributions, license notices, harness/benchmark references, or
628
+ implementation references with actual project value,
629
+ - REFERENCES source-attribution sections omit repository-authored files unless
630
+ an external or inherited source actually requires attribution,
631
+ - template default project names, template welcome text, sample-app
632
+ descriptions, tutorial links, and generator instructions are removed or
633
+ rewritten in project terms,
634
+ - each retained documentation or source file can be explained by a current
635
+ project responsibility,
636
+ - no active paper business logic, experiment workflow, harness execution, metric
637
+ computation, loader/exporter code, or real test logic was implemented,
638
+ - no dependency installation, resolution, package download, generated-code run,
639
+ test run, harness run, or experiment execution occurred.
640
+
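The dependency-related gates above reduce to a two-way comparison between the dependency declaration and the Installable Dependencies registry. The following is a minimal sketch of that comparison, assuming both have already been parsed into plain Python values. `RegistryEntry`, its field names, and `check_dependency_consistency` are hypothetical and not part of this skill.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegistryEntry:
    """Hypothetical in-memory form of one REFERENCES registry row."""
    name: str
    status: str  # "installable", "reference-only", "deferred", or "rejected"
    constraint: str = ""  # version or version range recorded in REFERENCES


def check_dependency_consistency(
    declared: dict[str, str], registry: list[RegistryEntry]
) -> list[str]:
    """Compare declared dependencies (name -> constraint) against the registry."""
    failures = []
    by_name = {entry.name: entry for entry in registry}
    for name, constraint in sorted(declared.items()):
        entry = by_name.get(name)
        if entry is None:
            failures.append(f"{name}: declared but missing from registry")
        elif entry.status != "installable":
            failures.append(f"{name}: declared but classified as {entry.status}")
        elif entry.constraint != constraint:
            failures.append(f"{name}: constraint differs from registry")
    for entry in registry:
        if entry.status == "installable" and entry.name not in declared:
            failures.append(f"{entry.name}: marked installable but not declared")
    return failures
```

An empty `declared` mapping together with a registry that has no installable entries returns no failures, which matches the case of an intentionally empty runtime dependency set. The rationale for that empty set still has to be checked separately in README and REFERENCES.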
641
+ Treat any of these as validation failures:
642
+
643
+ - no real starter generation occurred for a new scaffold,
644
+ - template-origin evidence is missing for the generated project roles,
645
+ - a documentation-only scaffold was accepted without adding a starter layer,
646
+ - a thin official starter is presented as a strong generated repository without
647
+ a richer initializer/template comparison or weak-pass note,
648
+ - language/runtime/ecosystem is undecided for a new scaffold,
649
+ - README lacks `Installation` or README.zh-CN lacks `安装`,
650
+ - README or any repository document references planning input files that are
651
+ outside the target repository,
652
+ - README or any repository document contains a repo-relative file reference that
653
+ resolves outside the target repository or points to a file missing from that
653
+ repository,
655
+ - installation sections are not actionable for the current dependency
656
+ declaration, only defer setup, or install project dependencies globally,
657
+ - manual fallback commands install package names or version ranges that do not
658
+ match the dependency registry,
659
+ - REFERENCES lacks category-driven structure focused on project dependencies,
660
+ source attributions, license notices, and actual references,
661
+ - REFERENCES installable-dependency rows lack source URL, license, selected
662
+ version/range, direct-install rationale, project role, or repository consumer
663
+ role,
664
+ - REFERENCES reference-only rows lack license status, why-not-dependency
665
+ reasoning, or a concrete current-project reference role,
666
+ - REFERENCES relies on a single combined warning for reference-only sources
667
+ without per-source license status and why-not-dependency reasoning,
668
+ - REFERENCES includes broad research-log sources that did not affect retained
669
+ structure, dependencies, attribution, harness semantics, benchmark semantics,
670
+ comparison scope, or implementation handoff,
671
+ - REFERENCES includes repository-authored files as if they were external source
672
+ attributions,
673
+ - dependencies declared in configuration are missing from the REFERENCES
673
+ Installable Dependencies registry, or registry entries marked installable are
674
+ missing from configuration,
676
+ - an empty runtime dependency set lacks a candidate classification record and a
677
+ matching technical rationale in README and REFERENCES,
678
+ - reference-only sources appear in dependency configuration or installation
679
+ commands,
680
+ - ignore rules exclude dependency lock artifacts without project-type rationale
681
+ from DeepResearch and matching documentation,
682
+ - risky reference-only warnings fail to forbid copying, porting, or direct reuse
683
+ until license and compatibility are verified,
684
+ - the scaffold creates test subdivisions from guessed coding-plan categories,
685
+ - validation depends on a predeclared ecosystem file or directory name instead
686
+ of discovered artifact roles,
687
+ - README, REFERENCES, harness explanations, or test notes use banned stage
688
+ terms or negative absence phrases such as `does not include`, `not included`,
689
+ `current boundary`, whole-word `reserved`, `reserved for`,
690
+ `reserved package area`, `later`, `once`, or `no runnable`,
691
+ - README, REFERENCES, harness explanations, test notes, or retained template
692
+ documentation contain process words such as `template`, `scaffold`,
693
+ `generated from`, `starter`, `boilerplate`, `placeholder`,
694
+ `future implementation`, `deepresearch`, `skill`, `Codex`, or
695
+ `initialization stage`, unless the word is a project-domain term or legally
696
+ required attribution,
697
+ - README or REFERENCES describes template selection, generator rationale,
698
+ DeepResearch process, rejected templates, or internal comparison workflow,
699
+ - README, REFERENCES, harness explanations, test notes, or retained docs use
700
+ internal artifact-management wording such as `External Task Inputs`,
701
+ `external inputs`, whole-word `bucket`, vague `consulted`, `operators
702
+ execute`, or similar process language,
703
+ - REFERENCES includes sources that were only searched or rejected and have no
704
+ dependency, attribution, license, benchmark, harness, or implementation value,
705
+ - retained template files still contain default template project names, template
706
+ tutorials, template welcome prose, generator instructions, or unrelated sample
707
+ app descriptions,
708
+ - harness explanations omit purpose, experiment objective, entrypoint semantics,
709
+ inputs, metrics, output artifacts, or results relationship,
710
+ - harness explanations include the required headings but not meaningful
711
+ task-specific content from the planning inputs,
712
+ - README-only source subdirectories mirror coding-plan nouns without template
713
+ support or starter files,
714
+ - empty source subpackage forests mirror coding-plan modules without being
715
+ produced by the selected generator or required by the selected ecosystem,
716
+ - the final repository looks hand-built from metadata files plus empty package
717
+ directories rather than produced by an initializer, template tool, official
718
+ starter, or high-quality template repository,
719
+ - README or test-layout notes claim concrete checks for configuration parsing,
720
+ records, lifecycle transitions, metrics, exports, CLI wiring, or similar
721
+ implementation surfaces when those implementation objects and test files are
722
+ absent,
723
+ - README or test-layout notes mention fixtures, interface contracts, smoke
724
+ checks, or other concrete test categories when the corresponding test files
725
+ are absent,
726
+ - generated metadata preserves local personal names, emails, or paths not
727
+ supplied by the user.
728
+
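The banned-wording failures above can be approximated with a whole-word text scan. The following is a minimal sketch, not a complete implementation. `BANNED_PHRASES` holds only an illustrative subset of the phrases quoted in the list, `find_banned` is a hypothetical name, and applying whole-word, case-insensitive matching to every phrase is an assumption (the list requires whole-word matching only for `reserved` and `bucket`).

```python
import re

# Illustrative subset of the phrases quoted in the failure list; extend as needed.
BANNED_PHRASES = [
    "does not include",
    "not included",
    "current boundary",
    "reserved",
    "no runnable",
    "placeholder",
    "boilerplate",
    "bucket",
]

# \b on both sides enforces whole-word matching, so "unreserved" and
# "buckets" are not reported.
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(p) for p in BANNED_PHRASES) + r")\b",
    re.IGNORECASE,
)


def find_banned(text: str) -> list[tuple[int, str]]:
    """Return (line number, lowercased phrase) for each banned phrase found."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in PATTERN.finditer(line):
            hits.append((lineno, match.group(1).lower()))
    return hits
```

Hits from a scan like this still need manual review, because the rules exempt project-domain terms and legally required attribution, and a text match cannot tell those cases apart.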
729
+ ## Final Response
730
+
731
+ Summarize:
732
+
733
+ - target repository path,
734
+ - selected language/runtime,
735
+ - substantive project artifacts retained or adjusted,
736
+ - whether the selected initializer/template produced a strong starter layer or
737
+ a weak-pass thin starter, based on the pre-overlay artifact snapshot,
738
+ - dependency declaration mechanism and selected installable dependencies, or
739
+ explicit empty-bucket decisions,
740
+ - runtime-dependency candidate classification when runtime dependencies are
741
+ empty,
742
+ - reference-only sources recorded,
743
+ - repo-local/environment-isolated installation strategy documented,
744
+ - fixed experiment directories overlaid,
745
+ - semantic harness folders created,
746
+ - test-layout basis and any minimal note added,
747
+ - static validation performed, including project-only documentation,
748
+ non-documentation, installation, dependency/reference consistency, test-layout
749
+ provenance, harness schema, and scaffold-only checks,
750
+ - preservation decisions or skipped overwrites,
751
+ - next implementation handoff point.
752
+
753
+ Keep the response focused on the starter scaffold and experiment overlay. Do
754
+ not present dependency installation, runtime execution, test results, harness
755
+ results, or experiment results unless the user explicitly requested separate
756
+ operational work outside this skill.