npm - academic-army - Versions diffs - 0.1.0 - Mend

academic-army 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (68)

package/.editorconfig +9 -0
package/.github/workflows/publish.yml +44 -0
package/.prettierrc.json +3 -0
package/LICENSE +21 -0
package/README.md +172 -0
package/README.zh-CN.md +172 -0
package/agent-forge.yaml +83 -0
package/eslint.config.js +28 -0
package/install_mcp.py +85 -0
package/mcp-server/__main__.py +33 -0
package/mcp-server/deepresearch/__init__.py +3 -0
package/mcp-server/deepresearch/tools.py +33 -0
package/mcp-server/requirements.txt +4 -0
package/metaskills/README.md +131 -0
package/metaskills/README.zh-CN.md +131 -0
package/metaskills/academic-army-architect/METASKILL.md +91 -0
package/metaskills/academic-army-architect/envolve.sh +9 -0
package/metaskills/academic-army-coding-plan/ENVOLVETASK.md +1 -0
package/metaskills/academic-army-coding-plan/METASKILL.md +118 -0
package/metaskills/academic-army-coding-plan/envolve.sh +9 -0
package/metaskills/academic-army-coding-style/METASKILL.md +292 -0
package/metaskills/academic-army-experiment-plan/ENVOLVETASK.md +1 -0
package/metaskills/academic-army-experiment-plan/METASKILL.md +82 -0
package/metaskills/academic-army-experiment-plan/envolve.sh +9 -0
package/metaskills/academic-army-repo-scaffold/ENVOLVETASK.md +1 -0
package/metaskills/academic-army-repo-scaffold/METASKILL.md +223 -0
package/metaskills/academic-army-repo-scaffold/envolve.sh +9 -0
package/package.json +35 -0
package/runs/develop-skill.sh +17 -0
package/runs/develop.sh +16 -0
package/skills/academic-army-architect/SKILL.md +336 -0
package/skills/academic-army-architect/agents/openai.yaml +11 -0
package/skills/academic-army-architect/references/blueprint-schema.md +345 -0
package/skills/academic-army-coding-plan/SKILL.md +491 -0
package/skills/academic-army-coding-plan/agents/openai.yaml +11 -0
package/skills/academic-army-coding-style/SKILL.md +915 -0
package/skills/academic-army-coding-style/agents/openai.yaml +11 -0
package/skills/academic-army-experiment-plan/SKILL.md +517 -0
package/skills/academic-army-experiment-plan/agents/openai.yaml +11 -0
package/skills/academic-army-repo-scaffold/SKILL.md +756 -0
package/skills/academic-army-repo-scaffold/agents/openai.yaml +10 -0
package/src/README.md +79 -0
package/src/README.zh-CN.md +79 -0
package/src/cli.ts +55 -0
package/src/developing/README.md +146 -0
package/src/developing/README.zh-CN.md +146 -0
package/src/developing/agents/developer.ts +40 -0
package/src/developing/agents/factory.ts +11 -0
package/src/developing/agents/index.ts +8 -0
package/src/developing/agents/manager.ts +74 -0
package/src/developing/agents/prompts.ts +12 -0
package/src/developing/agents/reviewer.ts +44 -0
package/src/developing/agents/trajectory-optimizer.ts +70 -0
package/src/developing/agents/types.ts +41 -0
package/src/developing/index.ts +2 -0
package/src/developing/pipeline.ts +306 -0
package/src/developing/pipelineskill.ts +169 -0
package/src/evolve-skill/README.md +116 -0
package/src/evolve-skill/README.zh-CN.md +116 -0
package/src/evolve-skill/agents/evaluator.ts +28 -0
package/src/evolve-skill/agents/factory.ts +11 -0
package/src/evolve-skill/agents/index.ts +4 -0
package/src/evolve-skill/agents/modifier.ts +27 -0
package/src/evolve-skill/agents/runner.ts +19 -0
package/src/evolve-skill/index.ts +1 -0
package/src/evolve-skill/pipeline.ts +140 -0
package/src/pipeline.ts +65 -0
package/tsconfig.json +22 -0

package/skills/academic-army-repo-scaffold/SKILL.md ADDED

@@ -0,0 +1,756 @@
+---
+name: academic-army-repo-scaffold
+description: >-
+  Initialize template-first research code repositories for the Academic Army
+  autoresearch workflow from a paper_blueprint, experiment plan, coding plan,
+  and user-specified repository path. Use when Codex needs to create or adapt a
+  real starter repository scaffold generated by an initializer, template tool,
+  official starter, or high-quality template repository, then overlay fixed
+  experiment directories, bilingual README/REFERENCES files, semantic harness
+  folders, and template-informed test layout notes without implementing paper
+  methods, experiments, metrics, runners, loaders, exporters, tests, or business
+  logic.
+---
+# Academic Army Repo Scaffold
+## Purpose
+Create a real starter repository first, then add the Academic Army experiment
+overlay. The primary artifact is the generated starter repo plus the experiment
+directory structure. `README.md`, `README.zh-CN.md`, `REFERENCES.md`, and
+`REFERENCES.zh-CN.md` are supporting documentation.
+This skill is scaffold-only:
+- Generate or preserve starter files, boilerplate source, project metadata,
+  dependency declarations, build/test configuration roles, minimal entry files,
+  and ecosystem conventions from the selected template or initializer.
+- Do not implement paper methods, data loaders, metric formulas, result
+  exporters, experiment runners, harness logic, test logic, configuration
+  parsing, or domain business behavior.
+- Do not install project dependencies, resolve dependencies, generate a new
+  lock state, run generated code, run tests, run harnesses, or execute
+  experiments.
+If the final repository contains only documentation and empty directories,
+report scaffold generation as failed. A completed scaffold needs substantive
+starter-repo artifacts discovered from the selected template, such as project
+metadata, dependency declaration, build/run configuration, selected source
+layout, selected test layout, entry points, examples, or an initializer-specific
+artifact.
+Hand-authored metadata plus empty package directories is not enough unless a
+real generator or template produced that structure and the artifact registry can
+identify the generated project roles.
+## Inputs And Scope
+Require a target repository path. If the user does not provide one, ask before
+creating files.
+Read only the inputs needed to initialize the repository:
+- paper blueprint
+- experiment plan
+- coding plan
+- explicit user constraints about target path, existing-repo adaptation,
+  language, framework, runtime, template, dependency policy, or repository
+  preservation
+If planning paths are provided, read those files. Otherwise locate the closest
+conventional planning artifacts by name, then stop. Do not inspect unrelated
+nearby source trees, logs, notebooks, old outputs, package manifests, or other
+workspace noise.
+Read files inside the target repository only when:
+- the user explicitly asks to adapt an existing repository,
+- an initializer has generated files there that must be inspected,
+- existing scaffold files must be preserved or refreshed safely, or
+- target contents must be classified before writing.
+Keep runtime mechanics out of generated repository documents. Do not write about
+sandbox limits, shell failures, MCP failures, dependency-install failures, or
+local workaround details unless the user explicitly asks for operational notes.
+## Path And Preservation
+Resolve the target repository path before writing. Every created or modified
+repository file must stay inside that path. Repository documents should use
+repository-relative paths for internal files.
+Treat planning inputs outside the repository as invocation context only. Do not
+reference them from reusable repository documents unless the user explicitly
+asks to copy them into the repository. Repository documentation must not contain
+broken repo-relative references, parent-relative workspace paths, machine
+absolute paths, or workspace-specific commands.
+If the target path already contains files, classify them before changing
+anything:
+- existing starter-repo files to preserve
+- generated scaffold docs that can be refreshed safely
+- generated scaffold residue that can be pruned safely
+- user-authored or ambiguous content that must be preserved
+Do not overwrite non-generated or ambiguous content. Ask before destructive
+replacement when authorship is unclear. If an existing target is
+documentation-only, upgrade it with a real starter-repository layer while
+preserving user-authored material.
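The write-scope rule above (every created or modified file stays inside the resolved target path) can be illustrated with a small containment check. This is a minimal sketch, not part of the package: the function names `is_inside_target` and `outside_writes` are hypothetical, and it assumes planned write paths are known as strings before anything is written.

```python
from __future__ import annotations

from pathlib import Path


def is_inside_target(target_root: str, candidate: str) -> bool:
    """Return True when candidate resolves to a location inside target_root."""
    root = Path(target_root).resolve()
    # Relative candidates are interpreted against the target root;
    # an absolute candidate replaces the root when joined.
    resolved = (root / candidate).resolve()
    return resolved == root or root in resolved.parents


def outside_writes(target_root: str, planned_paths: list[str]) -> list[str]:
    """List planned write paths that would escape the target repository."""
    return [p for p in planned_paths if not is_inside_target(target_root, p)]
```

Resolving before comparing means parent-relative segments such as `../` are normalized first, so a path that only looks repository-local is still caught.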
+## Required DeepResearch
+Run `academic_army_mcp_tools.deepresearch` before choosing the initializer or
+template, unless the task includes a fresh directly relevant lookup artifact
+that already compares scaffold-generation options for this project.
+DeepResearch must identify the target ecosystem from the planning inputs and
+compare ways to generate a real starter repository:
+- official initialization commands and framework CLIs
+- template tools and ecosystem package generators
+- high-quality template repositories and GitHub template repositories
+- research-code, benchmark, and harness templates
+- high-quality public repositories in the paper domain, used for structural
+  lessons or implementation references
+- package-management, dependency-declaration, and environment-isolation
+  practices
+- test-layout conventions when the selected generator does not create tests
+- installable dependencies and reference-only sources, including license,
+  packaging quality, interface stability, maintenance, and risk notes
+Do not hardcode a language, runtime, package manager, dependency file, build
+file, source path, test path, test framework, or configuration file in this
+skill. Select them at invocation time from user constraints, planning inputs,
+DeepResearch evidence, template quality, license clarity, generated output
+shape, and downstream implementation cost.
+When no language/runtime is specified, infer a practical starter ecosystem from
+the plans and DeepResearch evidence. Do not leave language/runtime undecided
+for a new scaffold.
+Use a prompt shaped like:
+```text
+Research template-first repository initialization options for an Academic Army
+research-code project.
+Project context:
+[paper goal, domain, candidate methods, experiment harness needs, coding-plan
+logical modules, explicit user constraints, target repository constraints]
+Return:
+- official initializers, template tools, template repositories, research-code
+  templates, benchmark templates, and related high-quality repositories
+- expected generated output shape for each candidate by role: project metadata,
+  dependency declaration, build/run configuration, source layout, test layout,
+  entry points, docs, examples
+- generated-structure comparison across plausible candidates: directory shape,
+  dependency mechanism, entry files, test structure, configuration complexity,
+  documentation quality, license, maintenance, and fit for the paper workflow
+- dependency declaration mechanism, repo-local installation workflow, and
+  environment isolation approach
+- runtime dependency candidates, development/test dependency candidates, and
+  reference-only sources with license, packaging, stability, maintenance, and
+  risk notes
+- exact release tag, package version, commit SHA, license, or dated unpinned
+  access snapshot where practical
+- which candidates are safe to generate from, adapt, cite only, or reject
+- recommended initializer/template and why
+```
+If multiple candidates remain plausible, compare actual generated structures.
+Inspect docs, run a dry generator in a disposable location, or statically inspect
+a template repo when candidate output shape is unclear. Do not choose from
+descriptions alone.
+## Template-First Design
+Use a hybrid layout:
+- **Template layer:** generated by the selected initializer, template tool,
+  official starter, or template repository. Preserve its starter source,
+  metadata, dependency declarations, build/test configuration roles, entry
+  points, examples, and ecosystem conventions unless they are unrelated sample
+  residue, ambiguous third-party code, or license-risky implementation content.
+- **Experiment overlay:** fixed Academic Army top-level directories and
+  bilingual documentation.
+Always create or preserve these overlay entries:
+- `data/`: input data and dataset assets
+- `output/`: program run outputs and intermediate artifacts
+- `results/`: experiment result records and paper-facing summaries
+- `harness/`: all research/evaluation harnesses
+- `README.md`: English repository overview
+- `README.zh-CN.md`: Chinese repository overview
+- `REFERENCES.md`: English provenance and external references
+- `REFERENCES.zh-CN.md`: Chinese provenance and external references
+The fixed directories define experiment workflow semantics only. They do not
+prescribe the target ecosystem's internal source or test layout.
+After generation, build an internal project artifact registry by inspecting the
+actual repository. Record artifact roles, not hardcoded names:
+- selected initializer/template and generation method
+- generation evidence, such as command/source used and the starter artifacts it
+  produced before overlay edits
+- a pre-overlay artifact snapshot, captured immediately after the initializer
+  or template finishes and before README/REFERENCES/harness edits
+- template-origin evidence for each retained metadata, dependency, build,
+  source, test, entry, example, and tool-configuration role
+- project metadata artifact
+- dependency declaration artifact
+- build/run configuration artifacts
+- starter source layout and entry points
+- test layout generated by the template, or the researched minimal test-layout
+  note when the template generated none
+- fixed experiment directories
+- harness subfolders and explanation files
+- README/REFERENCES files
+- dependency registry and reference-only source registry
+Use this registry to drive dependency edits, installation instructions,
+REFERENCES provenance, and static validation. Do not write the registry as a
+separate manifest unless the user asks.
+Normalize generated metadata after template creation. Remove, blank, or replace
+personal names, personal emails, organization secrets, machine-specific paths,
+and local initializer defaults unless the user supplied them.
+Run a template-origin static check before accepting the starter layer. Confirm
+that the selected initializer or template actually created the non-documentation
+roles recorded in the registry. Do not accept a hand-built approximation of
+metadata, empty source directories, and README files as equivalent to
+initializer output.
+If the selected official initializer creates only a very thin package shell,
+prefer a richer official mode or a better-maintained template when that choice
+matches the project and does not add irrelevant sample application code. When a
+thin official starter remains the best fit, record it internally as a weak pass:
+the registry must show the exact initializer-origin metadata, dependency
+mechanism, source layout, and test layout or absence, and the final response
+must state that the starter layer is intentionally minimal. Do not compensate by
+inventing empty business modules or executable paper logic.
+Create additional package/source directories only when the selected template
+generated them, or when the selected ecosystem's normal starter shape requires
+them. The coding plan's logical modules do not automatically justify physical
+package subdirectories. Prefer a single package namespace plus documentation of
+module ownership over a forest of empty source subpackages. Treat repetitive
+namespace-only subpackages that mirror planning nouns as residue unless the
+chosen generator produced them or a concrete starter file gives each directory
+a real ecosystem role. If a source directory is created, describe what it owns
+in present-state responsibility language, not as an absence list.
+Run an empty-package audit after overlay edits: a directory containing only
+namespace files, marker files, or README-only notes must either be generated by
+the selected starter or have a current project responsibility that is more
+specific than a coding-plan module name.
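The empty-package audit described above could be approximated by a directory walk that flags leaf directories holding nothing but marker files. The sketch below is illustrative only: `marker_only_dirs` and the `MARKER_FILES` set are assumptions, and the real marker names depend on the ecosystem selected at invocation time. It is meant to be pointed at the source root, not at the fixed overlay directories.

```python
from __future__ import annotations

import os

# Hypothetical marker names; the actual set depends on the chosen ecosystem.
MARKER_FILES = {"__init__.py", ".gitkeep", "README.md"}


def marker_only_dirs(root: str, markers: set[str] = MARKER_FILES) -> list[str]:
    """Return leaf directories under root whose files are all marker files."""
    flagged = []
    for dirpath, dirnames, filenames in os.walk(root):
        if dirpath == root:
            continue  # the root itself is not a candidate
        # A leaf directory with no files, or with marker files only, is flagged.
        if not dirnames and all(name in markers for name in filenames):
            flagged.append(os.path.relpath(dirpath, root))
    return sorted(flagged)
```

A flagged directory is only a candidate for review. Per the rule above, it may stay if the selected starter generated it or if it has a current, specific project responsibility.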
+## Dependencies And Installation
+Classify sources from DeepResearch:
+- **Installable dependencies:** packaging is clear, license is acceptable,
+  interfaces are stable enough, and the source is suitable for direct use via
+  the selected ecosystem's dependency mechanism.
+- **Development/test installable dependencies:** tools needed because the
+  scaffold selected that development, testing, formatting, documentation, or
+  packaging workflow.
+- **Reference-only sources:** useful for implementation ideas, benchmarks,
+  harness structure, or domain understanding, but unsuitable to install directly
+  because of license uncertainty, heavy dependencies, unstable interfaces,
+  maintenance risk, incompatible scope, or limited reuse need.
+Maintain an internal runtime-dependency decision record. If the runtime
+dependency set is empty, record the candidate runtime libraries considered,
+their classification, why each was rejected, deferred, or made reference-only,
+and the shared technical rationale that keeps runtime dependencies empty. Keep
+README, REFERENCES, and the dependency declaration consistent with that record.
+Write installable dependencies into the selected template's native dependency
+declaration mechanism. If the template already generated a dependency
+declaration artifact, update that mechanism instead of creating a parallel one.
+If an installable-dependency category is intentionally empty, record a concrete
+technical rationale in README and REFERENCES: no suitable directly installable
+runtime library was selected, or runtime choices depend on unresolved substrate,
+hardware, simulator, or framework compatibility. Do not frame empty categories
+as an omission or as stage language such as "before code exists", "later", or
+"not yet implemented".
+Do not install, resolve, lock, download, or import project dependencies. If the
+ecosystem normally uses a lock state that requires dependency resolution, keep a
+template-provided lock state if present, but do not generate a new one by
+running installation or resolution commands.
+Do not add ignore rules for dependency lock artifacts by default. Track or
+ignore lock artifacts according to the selected ecosystem, project type, and
+DeepResearch evidence. If the project is a research application or experiment
+repository, prefer allowing a later resolved lock artifact to be versioned for
+reproducibility. If a library-style policy intentionally excludes a lock
+artifact, record that rationale in the registry and explain the package-policy
+choice in README or REFERENCES without mentioning the skill process.
+README installation sections must:
+- include `## Installation` in English and `## 安装` in Chinese,
+- distinguish system prerequisites from project dependencies,
+- use repo-local, workspace-local, project-local, sandboxed, containerized, or
+  environment-isolated setup when the ecosystem supports it,
+- explain commands from the repository root,
+- install project dependencies through the same dependency declaration mechanism
+  used by the template, or clearly label any manual fallback with identical
+  package/version constraints from the dependency registry,
+- prefer adding a template-native development/test extra or equivalent
+  installable group over duplicating dependency names in fallback commands when
+  the ecosystem supports such a mechanism,
+- omit a fallback path rather than writing commands that diverge from the
+  dependency declaration or REFERENCES registry,
+- not install project dependencies globally,
+- not include commands that run experiments, harnesses, tests, or paper methods,
+- use neutral setup language that makes installation commands actionable from
+  the repository root without implying that dependency installation or
+  dependency resolution has already been run.
+Static validation must compare the dependency declaration artifact, README
+installation commands, and REFERENCES installable-dependency registry. A package
+declared for runtime or development/test use must appear in the unified
+Installable Dependencies registry with the same bucket and purpose. Reference-only
+sources must not appear in dependency declarations or installation commands.
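The comparison between the dependency declaration and the Installable Dependencies registry can be sketched as a set comparison. This is a hedged illustration: the `Dep` record, the bucket labels, and `check_dependency_consistency` are hypothetical names, and the sketch assumes both views have already been parsed into comparable records. It covers only two of the three views named above; checking README installation commands is left out.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Dep:
    name: str
    bucket: str      # assumed labels, e.g. "runtime" or "dev-test"
    constraint: str  # version or version range as written


def check_dependency_consistency(
    declared: set[Dep], registry: set[Dep], reference_only: set[str]
) -> list[str]:
    """Report mismatches between declared dependencies and the registry."""
    problems: list[str] = []
    # Records must match on name, bucket, and constraint to count as equal.
    for dep in sorted(declared - registry, key=lambda d: d.name):
        problems.append(f"declared without matching registry row: {dep.name}")
    for dep in sorted(registry - declared, key=lambda d: d.name):
        problems.append(f"registry row without matching declaration: {dep.name}")
    for dep in sorted(declared, key=lambda d: d.name):
        if dep.name in reference_only:
            problems.append(f"reference-only source declared: {dep.name}")
    return problems
```

Because records compare on all three fields, a package that appears in both views with different buckets or constraints is reported from both directions.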
+REFERENCES installable-dependency rows must include, for each selected package
+or toolchain dependency:
+- project/package name,
+- source URL,
+- license,
+- selected version or version range,
+- project role,
+- direct-install rationale,
+- repository consumer by role, such as package core, development tests,
+  build backend, harness family, exporter role, or documentation role.
+Reference-only rows must include:
+- project or source name,
+- source URL,
+- license status,
+- reference value for this paper project,
+- why it is not declared as a dependency,
+- the concrete implementation, harness, benchmark, or comparison role it may
+  inform.
+Do not replace per-source reasoning with a combined warning list. A shared
+warning may appear after the table, but every reference-only source still needs
+its own license status, why-not-dependency rationale, and concrete project role.
+## Test Layout
+Testing is template-informed, not a fixed overlay.
+1. Inspect the selected initializer's generated test structure.
+2. If generated tests, test directories, or test configuration exist, preserve
+   that native structure and document its responsibility in current project
+   terms.
+3. If the initializer generated no test structure, use DeepResearch evidence for
+   the target ecosystem and add only the smallest compatible test entry or a
+   lightweight test-layout note.
+4. Do not create guessed subdivisions such as unit, functional, integration,
+   feature-specific, or coding-plan-derived test folders before concrete code
+   defines reliable boundaries.
+5. Do not add executable test logic during scaffold initialization.
+README and any test-layout note must explain the selected test-layout basis
+and the boundary between software correctness checks and research harnesses.
+Use responsibility language: the test area is responsible for software
+correctness validation of package behavior. Do not write that tests check,
+verify, validate, exercise, or cover configuration parsing, record validation,
+lifecycle transitions, metric arithmetic, export schemas, CLI wiring, or any
+other implementation surface unless actual implementation objects and matching
+test files exist. Keep test notes lighter than harness notes: describe where
+package-level correctness checks belong and how they stay separate from
+research evidence. Do not mention fixtures, interface contracts, smoke checks,
+or other concrete test categories unless the matching files exist.
+## Harness Overlay
+Create semantic subfolders under `harness/` for experiment objectives that are
+already determined by the paper blueprint, experiment plan, and coding plan.
+Names should describe the task; do not use abstract numbering.
+Each harness folder must include an explanation file with these sections:
+- `Purpose`
+- `Experiment Objective`
+- `Entrypoint Semantics`
+- `Inputs`
+- `Metrics`
+- `Output Artifacts`
+- `Results Relationship`
+Write harness explanations as positive responsibility specs. Do not use
+absence-list wording such as "no runner is included" as the main content, and
+do not use stage/placeholder language. The specs describe what the harness
+directory owns and how harness code relates to experiment evidence; they do not
+implement harness logic.
+Each section must contain task-specific content from the planning inputs. A
+harness explanation with only section headers, generic prose, or repeated
+absence statements is invalid. Inputs, metrics, and output artifacts should name
+the relevant workload families, metric identifiers, and artifact families from
+the experiment/coding plans.
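A schema-completeness check for the seven required harness sections might look like the sketch below. It assumes, without confirmation from the skill text, that explanation files are Markdown and that each section is a heading; `missing_or_empty_sections` is a hypothetical helper name. It detects only absent or empty sections and cannot judge whether the content is task-specific.

```python
from __future__ import annotations

import re

REQUIRED_SECTIONS = [
    "Purpose",
    "Experiment Objective",
    "Entrypoint Semantics",
    "Inputs",
    "Metrics",
    "Output Artifacts",
    "Results Relationship",
]


def missing_or_empty_sections(markdown_text: str) -> list[str]:
    """Return required sections that are absent or have no body text."""
    bodies: dict[str, list[str]] = {}
    current = None
    for line in markdown_text.splitlines():
        heading = re.match(r"^#{1,6}\s+(.*?)\s*$", line)
        if heading:
            current = heading.group(1)
            bodies[current] = []
        elif current is not None:
            bodies[current].append(line)
    # A section counts as present only if some non-whitespace body follows it.
    return [s for s in REQUIRED_SECTIONS if not "".join(bodies.get(s, [])).strip()]
```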
+## Repository Documentation
+README files should be concise and user-facing. Include:
+- project purpose,
+- installation,
+- fixed experiment directory overview,
+- harness map,
+- test layout responsibilities and harness/test boundary,
+- pointer to REFERENCES for external dependencies and source attributions.
+README files must present the repository as the project itself. They must not
+describe template selection, generator choice, DeepResearch, scaffold process,
+candidate comparison, rejected candidates, or internal decision flow.
+They must not list planning input files that live outside the repository. If
+the user asks to keep planning inputs in the repository, copy or create them
+inside the repository first and reference only repository-local paths.
+Open README with project purpose and repository responsibilities, not with
+generation-stage status or omissions. Use present-state role language. Avoid
+documentation that reads as a stage note, placeholder note, or absence report.
+Do not use whole-word/whole-phrase terms such as `future`, `placeholder`,
+`scaffold stage`, `will be implemented later`, `to be filled`,
+`template`, `scaffold`, `generated from`, `starter`, `boilerplate`,
+`deepresearch`, `skill`, `Codex`, `initialization stage`, `current boundary`,
+`reserved`, `reserved for`, `reserved package area`, `later`, `once`,
+`does not include`, `not included`, `no runnable`, or similar process,
+negative, or temporal phrasing in README, REFERENCES, harness explanations, or
+test-layout notes. Match banned single words as whole words so terms such as
+`preserved` are not false positives. Do not claim unimplemented paper methods,
+experiments, metrics, runners, loaders, or exporters work.
+Also avoid project-management or invocation terms such as `external task
+inputs`, `external inputs`, `bucket`, `consulted`, `operator`, and `task` when
+plain project documentation can say dependencies, references, setup, maintainers,
+or users instead.
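The whole-word matching rule for banned terms can be sketched with regular-expression word boundaries, which is what keeps `preserved` from matching `reserved`. The sketch is illustrative: `banned_term_hits` is a hypothetical name, the list is a subset of the terms named above, and case-insensitive matching is an assumption the skill text does not state.

```python
from __future__ import annotations

import re

# Subset of the banned terms listed in the skill text.
BANNED_TERMS = [
    "future",
    "placeholder",
    "reserved",
    "later",
    "once",
    "will be implemented later",
    "does not include",
]


def banned_term_hits(text: str, terms: list[str] = BANNED_TERMS) -> list[str]:
    """Return banned terms that occur in text as whole words or phrases."""
    hits = []
    for term in terms:
        # \b anchors the match at word boundaries on both sides.
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            hits.append(term)
    return hits
```

A phrase that contains a shorter banned word reports both, so a sentence ending in "will be implemented later" yields the single word `later` and the full phrase.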
+Chinese README and REFERENCES should be polished Chinese technical
+documentation. Translate ordinary workflow terms; keep identifiers, package
+names, commands, filenames, artifact names, URLs, licenses, and precision terms
+in English when translation would reduce clarity. Chinese REFERENCES should use
+Chinese field labels rather than English key dumps.
+REFERENCES must be category-driven and include:
+- unified Installable Dependencies registry with runtime and development/test
+  buckets, plus other project dependency buckets if the generated dependency
+  mechanism requires them,
+- package-management and installation strategy sources,
+- source attributions only when external or inherited files, notices, or
+  licenses require attribution,
+- harness/benchmark references,
+- reference-only repositories,
+- implementation references.
+REFERENCES should record sources that the current project actually uses, depends
+on, needs to attribute, or cites as implementation/harness/benchmark references.
+Do not record template search history, rejected templates, DeepResearch
+scratchwork, generator rationale, or sources that did not affect project files,
+dependencies, attribution, or useful implementation references.
+Do not list broad sources merely because they appeared in research. Each
+REFERENCES entry must state its concrete relationship to the repository. Remove
+benchmark or harness sources that only served as general background and did not
+shape retained files, dependency choices, harness organization, benchmark
+semantics, or implementation handoff.
+If a benchmark or tool source is kept, explain the specific retained convention,
+comparison role, artifact relationship, or implementation handoff it supports.
+If a template source or generated file must be attributed for license reasons,
+record it briefly under source attributions. If the template only provided
+structure and the retained files have been rewritten into project-specific
+content without attribution requirements, do not document the template process.
+Do not add a source-attribution row for repository-authored files owned by the
+current project; the repository license covers those files.
+For reference-only sources with unknown, unresolved, custom, restrictive,
+research-only, or incompatible licenses, include an explicit warning forbidding
+copying, porting, or directly reusing code until license and compatibility are
+verified. Include an equivalent natural Chinese warning in
+`REFERENCES.zh-CN.md`.
+Avoid vague `consulted` wording. Use precise relationship labels such as
+installable dependency, build dependency, source attribution, harness reference,
+benchmark reference, comparison reference, or implementation reference. Use
+attribution or retained-file wording only when a source actually produced files
+in the repository or license terms require attribution.
+## Workflow
+1. Resolve the target repository path and confirm the write scope.
+2. Read only the planning inputs and explicit user constraints needed for
+   scaffold generation.
+3. Extract scaffold requirements: target ecosystem signals, experiment forms,
+   harness objectives, dependency needs, input data families, output/result
+   artifact families, and downstream implementation direction.
+4. Run DeepResearch for initializers, templates, generated structures, package
+   management, environment isolation, test layout, installable dependencies,
+   reference-only sources, and domain repositories.
+5. Infer language/runtime/ecosystem and record the basis.
+6. Compare candidate generated structures and choose an initializer/template.
+7. Generate the starter repository inside the target path. If an existing
+   documentation-only scaffold is present, generate in a disposable location and
+   merge starter artifacts into the target while preserving user content.
+8. Capture a pre-overlay artifact snapshot, inspect generated artifacts, and
+   build the internal project artifact registry.
+9. Preserve template starter structure; prune only unrelated sample residue,
+   unsuitable third-party implementation code, or files that conflict with the
+   scaffold boundary.
+10. Normalize local initializer metadata not supplied by the user.
+11. Check source/package directories against the selected template output.
+    Remove or avoid empty subpackage forests that only mirror planning modules.
+12. Classify installable dependencies and reference-only sources. Update the
+    template-native dependency declaration without installing or resolving.
+13. Record the runtime-dependency decision, including intentionally empty
+    runtime sets and the candidates assigned to reference-only or deferred
+    categories.
+14. Determine repo-local installation strategy from the template and
+    DeepResearch evidence.
+15. Overlay `data/`, `output/`, `results/`, and `harness/`.
+16. Add semantic harness folders with schema-complete explanation files.
+17. Preserve generated test structure, or add only the minimal researched
+    test-layout note when no test structure was generated.
+18. Create concise README files and detailed category-driven REFERENCES files.
+19. Run static validation from the project artifact registry. Revise until all
+    gates pass.
+## Static Validation
+Perform static checks only. Do not install dependencies, lint, format, type
+check, run generated project code, run generated project scripts, run tests, run
+harnesses, or execute experiments.
+Validation must confirm by role, using the project artifact registry:
+- all created or modified repository files are inside the target path,
+- a starter repository was generated or a suitable existing starter layer was
+  preserved,
+- the pre-overlay artifact snapshot proves which project roles came from the
+  selected initializer/template rather than hand-authored mimicry,
+- the starter layer includes substantive non-documentation artifacts such as
+  project metadata, dependency declaration, build/run configuration, source
+  layout, test layout, entry points, examples, or initializer-specific files,
+- thin official starters are flagged as weak passes unless the registry shows a
+  richer generated starter layer; weak passes must still have initializer-origin
+  metadata, dependency mechanism, source layout, and test-layout evidence,
+- the final repository is not documentation-only,
+- template-generated source, package, build, test, entry, and tool-config roles
+  were not overwritten by the fixed experiment overlay,
+- fixed overlay directories and bilingual docs exist,
+- each harness has a semantic subfolder and schema-complete explanation file,
+- test structure follows the selected template or DeepResearch-supported
+  ecosystem layout,
+- no guessed test subdivisions were added before implementation code exists,
+- README describes the test-layout basis and harness/test boundary,
+- README and test notes describe test responsibilities without claiming
+  implementation-specific checks exist before corresponding code and tests
+  exist,
+- concrete test ecosystem wording has matching development/test dependency
+  declarations,
+- test notes stay lightweight when the selected test area has no executable
+  test files, avoiding fixture, contract, smoke, CLI, schema, or parser wording
+  unless matching files exist,
+- dependency declaration, README installation commands, and REFERENCES
+  Installable Dependencies registry agree exactly,
+- the runtime-dependency decision record agrees with README, REFERENCES, and
+  the dependency declaration, including empty runtime dependency sets,
+- fallback installation commands either consume the selected dependency
+  declaration mechanism or use the same package names and version constraints
+  recorded in REFERENCES,
+- installable dependencies in configuration appear in the registry with matching
+  bucket and purpose,
+- registry entries marked installable appear in the dependency declaration,
+- candidates classified as rejected, deferred, or reference-only are absent from
+  dependency declarations and installation commands,
+- each REFERENCES installable-dependency entry includes source URL, license,
+  selected version or version range, project role, direct-install rationale, and
+  repository consumer role,
+- reference-only sources appear in REFERENCES and are absent from dependency
+  declarations and installation commands,
+- each reference-only entry includes source URL, license status, project
+  reference value, why it is not declared as a dependency, and the concrete
+  implementation, harness, benchmark, or comparison role it may inform,
+- risky reference-only sources have explicit English and Chinese no-copy/no-port
+  warnings,
+- lock artifact policy follows the selected ecosystem and project type; lock
+  artifacts are not ignored by default, and any intentional library-style ignore
+  rule has a registry rationale and matching README or REFERENCES explanation,
+- README installation sections exist, use isolated/repo-local project dependency
+  setup where available, and do not claim installation was run,
+- README and REFERENCES agree with project dependencies, source attributions,
+  installation strategy, test layout responsibilities, fixed directories,
+  harness structure, and actual tree,
+- README and repository docs do not reference planning input files outside the
+  target repository; any referenced planning files exist inside the repository
+  or are omitted from project docs,
+- repo-relative reference audit passes: repository docs contain no relative
+  links, inline file references, or path-like mentions that point outside the
+  target repository or to missing files, unless the text clearly identifies a
+  public URL rather than a repository path,
+- generated metadata has no personal author/email/local-path values unless
+  supplied by the user,
+- README, REFERENCES, harness explanations, and test-layout notes use
+  present-state objective language and avoid banned process, template, stage,
+  placeholder, reservation, and absence phrases,
+- README begins with purpose and project structure rather than a list of
+  missing implementation pieces,
+- harness explanation sections contain task-specific inputs, metrics, output
+  artifacts, and result-record relationships rather than generic section text,
+- source package additions are template-generated, backed by starter files, or
+  consolidated into a single package-layout note,
+- empty source subpackage forests that mirror coding-plan modules are absent
+  unless the selected template generated them or each directory has a concrete
+  starter-repo role,
+- empty-package audit passes for all source/package directories added after the
+  starter was generated,
+- generated docs do not describe unimplemented functionality as working,
+- project-only documentation audit passes: repository docs describe the current
+  project, not template generation, scaffold process, DeepResearch, skill
+  execution, or internal decision flow,
+- project-only documentation audit rejects internal artifact-management wording
+  such as external inputs, bucket, consulted, operator-executed setup, broad
+  research-log language, or other invocation-process phrasing wherever ordinary
+  project-documentation wording can be used instead,
+- REFERENCES contains only sources that are project dependencies, source
+  attributions, license notices, harness/benchmark references, or
+  implementation references with actual project value,
+- REFERENCES source-attribution sections omit repository-authored files unless
+  an external or inherited source actually requires attribution,
+- template default project names, template welcome text, sample-app
+  descriptions, tutorial links, and generator instructions are removed or
+  rewritten in project terms,
+- each retained documentation or source file can be explained by a current
+  project responsibility,
+- no active paper business logic, experiment workflow, harness execution, metric
+  computation, loader/exporter code, or real test logic was implemented,
+- no dependency installation, resolution, package download, generated-code run,
+  test run, harness run, or experiment execution occurred.
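
The dependency-agreement items in the list above reduce to set comparisons between the dependency declaration, the REFERENCES registry, and the package names used in README installation commands. The following is a minimal sketch under stated assumptions: the dictionary shapes, the `status` and `constraint` keys, and the function name are hypothetical, and the inputs are assumed to have been parsed from the repository files beforehand without running any installer or resolver.

```python
def dependency_failures(
    declared: dict[str, str],
    registry: dict[str, dict[str, str]],
    install_command_names: set[str],
) -> list[str]:
    """Compare declaration, registry, and install commands; return failure messages.

    declared: package name -> version constraint from the dependency declaration.
    registry: package name -> {"status": ..., "constraint": ...} from REFERENCES.
    install_command_names: package names that appear in README install commands.
    """
    failures = []
    installable = {n for n, e in registry.items() if e["status"] == "installable"}
    excluded = set(registry) - installable  # reference-only, deferred, rejected

    for name in sorted(set(declared) - set(registry)):
        failures.append(f"{name}: declared but absent from the registry")
    for name in sorted(installable - set(declared)):
        failures.append(f"{name}: marked installable but missing from the declaration")
    for name in sorted(installable & set(declared)):
        if declared[name] != registry[name]["constraint"]:
            failures.append(f"{name}: version constraint differs between declaration and registry")
    for name in sorted(excluded & (set(declared) | install_command_names)):
        status = registry[name]["status"]
        failures.append(f"{name}: {status} source appears in declaration or install commands")
    return failures
```

An intentionally empty runtime set passes this comparison when both the declaration and the installable part of the registry are empty; the separate decision record and rationale still have to be checked in README and REFERENCES.
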
+Treat any of these as validation failures:
+- no real starter generation occurred for a new scaffold,
+- template-origin evidence is missing for the generated project roles,
+- a documentation-only scaffold was accepted without adding a starter layer,
+- a thin official starter is presented as a strong generated repository without
+  a richer initializer/template comparison or weak-pass note,
+- language/runtime/ecosystem is undecided for a new scaffold,
+- README lacks `Installation` or README.zh-CN lacks `安装`,
+- README or any repository document references planning input files that are
+  outside the target repository,
+- README or any repository document contains a repo-relative file reference that
+  resolves outside the target repository or does not resolve inside the target
+  repository,
+- installation sections are not actionable for the current dependency
+  declaration, only defer setup, or install project dependencies globally,
+- manual fallback commands install package names or version ranges that do not
+  match the dependency registry,
+- REFERENCES lacks category-driven structure focused on project dependencies,
+  source attributions, license notices, and actual references,
+- REFERENCES installable-dependency rows lack source URL, license, selected
+  version/range, direct-install rationale, project role, or repository consumer
+  role,
+- REFERENCES reference-only rows lack license status, why-not-dependency
+  reasoning, or a concrete current-project reference role,
+- REFERENCES relies on a single combined warning for reference-only sources
+  without per-source license status and why-not-dependency reasoning,
+- REFERENCES includes broad research-log sources that did not affect retained
+  structure, dependencies, attribution, harness semantics, benchmark semantics,
+  comparison scope, or implementation handoff,
+- REFERENCES includes repository-authored files as if they were external source
+  attributions,
+- dependencies declared in configuration are missing from Installable
+  Dependencies, or registry entries marked installable are missing from
+  configuration,
+- an empty runtime dependency set lacks a candidate classification record and a
+  matching technical rationale in README and REFERENCES,
+- reference-only sources appear in dependency configuration or installation
+  commands,
+- ignore rules exclude dependency lock artifacts without project-type rationale
+  from DeepResearch and matching documentation,
+- risky reference-only warnings fail to forbid copying, porting, or direct reuse
+  until license and compatibility are verified,
+- the scaffold creates test subdivisions from guessed coding-plan categories,
+- validation depends on a predeclared ecosystem file or directory name instead
+  of discovered artifact roles,
+- README, REFERENCES, harness explanations, or test notes use banned stage
+  terms or negative absence phrases such as `does not include`, `not included`,
+  `current boundary`, whole-word `reserved`, `reserved for`,
+  `reserved package area`, `later`, `once`, or `no runnable`,
+- README, REFERENCES, harness explanations, test notes, or retained template
+  documentation contain process words such as `template`, `scaffold`,
+  `generated from`, `starter`, `boilerplate`, `placeholder`,
+  `future implementation`, `deepresearch`, `skill`, `Codex`, or
+  `initialization stage`, unless the word is a project-domain term or legally
+  required attribution,
+- README or REFERENCES describes template selection, generator rationale,
+  DeepResearch process, rejected templates, or internal comparison workflow,
+- README, REFERENCES, harness explanations, test notes, or retained docs use
+  internal artifact-management wording such as `External Task Inputs`,
+  `external inputs`, whole-word `bucket`, vague `consulted`,
+  `operators execute`, or similar process language,
+- REFERENCES includes sources that were only searched or rejected and have no
+  dependency, attribution, license, benchmark, harness, or implementation value,
+- retained template files still contain default template project names, template
+  tutorials, template welcome prose, generator instructions, or unrelated sample
+  app descriptions,
+- harness explanations omit purpose, experiment objective, entrypoint semantics,
+  inputs, metrics, output artifacts, or results relationship,
+- harness explanations include the required headings but not meaningful
+  task-specific content from the planning inputs,
+- README-only source subdirectories mirror coding-plan nouns without template
+  support or starter files,
+- empty source subpackage forests mirror coding-plan modules without being
+  produced by the selected generator or required by the selected ecosystem,
+- the final repository looks hand-built from metadata files plus empty package
+  directories rather than produced by an initializer, template tool, official
+  starter, or high-quality template repository,
+- README or test-layout notes claim concrete checks for configuration parsing,
+  records, lifecycle transitions, metrics, exports, CLI wiring, or similar
+  implementation surfaces when those implementation objects and test files are
+  absent,
+- README or test-layout notes mention fixtures, interface contracts, smoke
+  checks, or other concrete test categories when the corresponding test files
+  are absent,
+- generated metadata preserves local personal names, emails, or paths not
+  supplied by the user.
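
The banned-wording failures above lend themselves to a simple text scan. The sketch below is one possible approach, not the skill's own implementation: the `BANNED` tuple copies a subset of the phrases listed above, the function name and the `allowed` parameter are hypothetical, and whole-word, case-insensitive matching is applied to every phrase as a simplifying assumption (the text requires whole-word matching only for `reserved` and `bucket`). The `allowed` set stands in for the project-domain and legal-attribution exceptions, which need human judgment.

```python
import re

# Subset of the banned phrases listed above; "reserved" also covers
# "reserved for" and "reserved package area".
BANNED = (
    "does not include", "not included", "current boundary", "reserved",
    "later", "once", "no runnable", "template", "scaffold", "generated from",
    "starter", "boilerplate", "placeholder", "future implementation",
    "deepresearch", "skill", "Codex", "initialization stage",
    "external inputs", "bucket",
)


def banned_hits(text: str, allowed: frozenset[str] = frozenset()) -> list[tuple[int, str]]:
    """Return (line number, phrase) pairs for banned phrases found as whole words."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for phrase in BANNED:
            if phrase.lower() in allowed:
                continue  # project-domain term or required attribution
            pattern = rf"\b{re.escape(phrase)}\b"
            if re.search(pattern, line, flags=re.IGNORECASE):
                hits.append((lineno, phrase))
    return hits
```

Word boundaries keep ordinary words such as "Concepts" or "Unreserved" from matching `once` or `reserved`. A hit is only a candidate failure; whether a term is a legitimate project-domain word remains a review decision.
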
+## Final Response
+Summarize:
+- target repository path,
+- selected language/runtime,
+- substantive project artifacts retained or adjusted,
+- whether the selected initializer/template produced a strong starter layer or
+  a weak-pass thin starter, based on the pre-overlay artifact snapshot,
+- dependency declaration mechanism and selected installable dependencies, or
+  explicit empty-bucket decisions,
+- runtime-dependency candidate classification when runtime dependencies are
+  empty,
+- reference-only sources recorded,
+- repo-local/environment-isolated installation strategy documented,
+- fixed experiment directories overlaid,
+- semantic harness folders created,
+- test-layout basis and any minimal note added,
+- static validation performed, including project-only documentation,
+  non-documentation, installation, dependency/reference consistency, test-layout
+  provenance, harness schema, and scaffold-only checks,
+- preservation decisions or skipped overwrites,
+- next implementation handoff point.
+Keep the response focused on the starter scaffold and experiment overlay. Do
+not present dependency installation, runtime execution, test results, harness
+results, or experiment results unless the user explicitly requested separate
+operational work outside this skill.