codex-genesis-harness 0.1.7 → 0.1.9

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (115)
  1. package/.codebase/COMPRESSED_CONTEXT.md +80 -0
  2. package/.codebase/CURRENT_STATE.md +10 -10
  3. package/.codebase/DEPENDENCY_GRAPH.md +14 -1
  4. package/.codebase/IMPLEMENTATION_HANDOFF.md +34 -336
  5. package/.codebase/KNOWN_PROBLEMS.md +73 -3
  6. package/.codebase/MODULE_INDEX.md +23 -2
  7. package/.codebase/PIPELINE_FLOW.md +16 -6
  8. package/.codebase/RECOVERY_POINTS.md +80 -78
  9. package/.codebase/TECH_DEBT.md +6 -0
  10. package/.codebase/TEST_MATRIX.md +8 -3
  11. package/.codebase/VISUAL_GRAPH.md +127 -0
  12. package/.codebase/context-policy.json +68 -0
  13. package/.codebase/memories/lessons_learned.md +63 -0
  14. package/.codebase/memories/preferences.md +17 -0
  15. package/.codebase/state.json +156 -17
  16. package/.codex/skills/genesis-architecture/SKILL.md +5 -0
  17. package/.codex/skills/genesis-debug-guide/SKILL.md +10 -4
  18. package/.codex/skills/genesis-docs-automation/SKILL.md +52 -973
  19. package/.codex/skills/genesis-executing-plans/SKILL.md +54 -0
  20. package/.codex/skills/genesis-executing-plans/agents/openai.yaml +6 -0
  21. package/.codex/skills/genesis-executing-plans/checklists/.gitkeep +0 -0
  22. package/.codex/skills/genesis-executing-plans/examples/.gitkeep +0 -0
  23. package/.codex/skills/genesis-executing-plans/templates/.gitkeep +0 -0
  24. package/.codex/skills/genesis-harness/SKILL.md +73 -1385
  25. package/.codex/skills/genesis-harness/agents/openai.yaml +1 -2
  26. package/.codex/skills/genesis-harness/references/state-machine.md +4 -1
  27. package/.codex/skills/genesis-harness/references/workflows.md +7 -1
  28. package/.codex/skills/genesis-harness/scripts/check-docs-sync.sh +3 -3
  29. package/.codex/skills/genesis-harness/scripts/init-planning.sh +246 -14
  30. package/.codex/skills/genesis-new-design/SKILL.md +4 -1
  31. package/.codex/skills/genesis-new-design/agents/openai.yaml +2 -0
  32. package/.codex/skills/genesis-observability-automation/SKILL.md +69 -303
  33. package/.codex/skills/genesis-observability-automation/references/common-mistakes-and-recovery.md +84 -0
  34. package/.codex/skills/genesis-observability-automation/references/workflow-phases.md +78 -0
  35. package/.codex/skills/genesis-performance-profiling/SKILL.md +1 -22
  36. package/.codex/skills/genesis-performance-profiling/agents/openai.yaml +1 -1
  37. package/.codex/skills/genesis-pipeline-orchestration/SKILL.md +15 -3
  38. package/.codex/skills/genesis-planning/SKILL.md +6 -1
  39. package/.codex/skills/genesis-release/SKILL.md +5 -0
  40. package/.codex/skills/genesis-research-first/SKILL.md +6 -0
  41. package/.codex/skills/genesis-spec-propagation/SKILL.md +52 -504
  42. package/.codex/skills/genesis-test-driven-development/SKILL.md +55 -0
  43. package/.codex/skills/genesis-test-driven-development/agents/openai.yaml +6 -0
  44. package/.codex/skills/genesis-test-driven-development/checklists/.gitkeep +0 -0
  45. package/.codex/skills/genesis-test-driven-development/examples/.gitkeep +0 -0
  46. package/.codex/skills/genesis-test-driven-development/templates/.gitkeep +0 -0
  47. package/.codex/skills/genesis-upgrade-design/SKILL.md +4 -2
  48. package/.codex/skills/genesis-upgrade-design/agents/openai.yaml +2 -0
  49. package/.codex/skills/genesis-using-git-worktrees/SKILL.md +54 -0
  50. package/.codex/skills/genesis-using-git-worktrees/agents/openai.yaml +6 -0
  51. package/.codex/skills/genesis-using-git-worktrees/checklists/.gitkeep +0 -0
  52. package/.codex/skills/genesis-using-git-worktrees/examples/.gitkeep +0 -0
  53. package/.codex/skills/genesis-using-git-worktrees/templates/.gitkeep +0 -0
  54. package/.codex/skills/genesis-verification-before-completion/SKILL.md +53 -0
  55. package/.codex/skills/genesis-verification-before-completion/agents/openai.yaml +6 -0
  56. package/.codex/skills/genesis-verification-before-completion/checklists/.gitkeep +0 -0
  57. package/.codex/skills/genesis-verification-before-completion/examples/.gitkeep +0 -0
  58. package/.codex/skills/genesis-verification-before-completion/templates/.gitkeep +0 -0
  59. package/.codex/skills/spec-impact-engine/SKILL.md +77 -500
  60. package/.codex/skills/spec-impact-engine/checklists/checklist.md +10 -0
  61. package/.codex-plugin/plugin.json +6 -5
  62. package/CHANGELOG.md +25 -1
  63. package/README.EN.md +74 -17
  64. package/README.VI.md +77 -19
  65. package/README.md +126 -10
  66. package/VERSION +1 -2
  67. package/bin/genesis-harness.js +2979 -149
  68. package/contracts/features/project-registry-schema.json +37 -0
  69. package/contracts/features/registry-schema.json +15 -0
  70. package/contracts/observability/agent-run-schema.json +39 -0
  71. package/contracts/observability/failure-schema.json +35 -0
  72. package/contracts/ui/auth/login-screen-contract.json +43 -0
  73. package/features/REGISTRY.md +65 -0
  74. package/features/SCOPE-template.md +65 -0
  75. package/fixtures/pipeline/end-to-end-project-lifecycle-fixture.md +39 -0
  76. package/fixtures/pipeline/feature-completion-fixture.md +26 -0
  77. package/fixtures/pipeline/run-to-feature-execution-fixture.md +20 -0
  78. package/fixtures/planning/MOCKUP_PROMPT_TEMPLATE.md +16 -0
  79. package/observability/agent-runs/sample-run.json +13 -0
  80. package/observability/decision-logs/sample-decision.md +43 -0
  81. package/observability/failures/sample-failure.json +12 -0
  82. package/package.json +15 -4
  83. package/playwright/e2e/app-template.spec.js +37 -0
  84. package/playwright/e2e/auth/login-screen.spec.js +65 -0
  85. package/playwright/e2e/web-template.spec.js +28 -0
  86. package/scripts/check-repository-hygiene.js +48 -0
  87. package/scripts/check-scope.sh +100 -0
  88. package/scripts/cold-start-check.js +133 -0
  89. package/scripts/install.sh +4 -0
  90. package/scripts/prompt_sentinel.js +35 -4
  91. package/scripts/run-evals.sh +152 -3
  92. package/scripts/schema/001-init.sql +129 -0
  93. package/scripts/schema/002-story-verify.sql +9 -0
  94. package/scripts/schema/003-tool-registry.sql +15 -0
  95. package/scripts/schema/004-intervention.sql +15 -0
  96. package/scripts/scratch_parser.js +49 -0
  97. package/scripts/spec_visual_sync.js +1 -1
  98. package/scripts/test_generator.js +2 -2
  99. package/scripts/transition_state.sh +32 -8
  100. package/scripts/uninstall.sh +4 -0
  101. package/scripts/validation_gates.sh +2 -80
  102. package/scripts/verify.sh +19 -2
  103. package/tests/fixtures/fixture-index.md +5 -0
  104. package/tests/integration/cli-smoke.test.js +506 -0
  105. package/tests/unit/feature_registry.test.js +152 -0
  106. package/tests/unit/prompt_sentinel.test.js +1 -1
  107. package/tests/unit/repository_hygiene.test.js +17 -0
  108. package/tests/unit/spec_visual_sync.test.js +1 -1
  109. package/tests/unit/state_metadata.test.js +76 -0
  110. package/tests/unit/test_generator.test.js +1 -1
  111. package/tests/unit/verify_gate.test.js +25 -0
  112. package/tests/unit/workflow_contracts.test.js +90 -0
  113. package/fixtures/tts/tts-fixture-template.md +0 -14
  114. package/fixtures/videos/video-fixture-template.md +0 -14
  115. package/playwright/e2e/e2e-template.md +0 -4
@@ -1,83 +1,85 @@
  # Recovery Points
 
- **Purpose**: Document where harness architecture implementation can be paused and resumed without losing context or creating inconsistencies.
-
- **Use When**: Evolution of the Codex harness (verification loops, CLI tools, scripts) needs to be paused, or when a rollback is necessary due to environment breakage.
-
- ---
-
- ## Quick Reference: Current Recovery Points
-
- | Phase | Status | Resumption File | Last Updated |
- |-------|--------|-----------------|--------------|
- | TUI Mockup Viewer Integration | ✓ Complete | `.codebase/CURRENT_STATE.md` | 2026-06-01 |
- | Harness Verification Streamlining | ✓ Complete | `.codebase/CURRENT_STATE.md` | 2026-06-01 |
- | Bead Memory Regression Tests | ✓ Complete | `scripts/run-evals.sh` | 2026-06-01 |
- | Harness Engineering Overhaul | ⏸️ Idle (Stable) | `scripts/verify.sh` | 2026-06-01 |
-
- ---
-
- ## Phase: Harness Verification Streamlining & Memory Evals
-
- **Status**: ✓ Complete
- **Last Updated**: 2026-06-01
-
- ### What Happened
-
- - Cleaned up legacy/deprecated skills (e.g., `genesis-mvp-planning`, `genesis-release-orchestration`) from `scripts/verify.sh`, `scripts/uninstall.sh`, and `scripts/run-evals.sh`.
- - Removed hard-coded skill name mappings (`expected_name` switch statements), enabling dynamic mapping directly based on directory names.
- - Added test coverage in `run-evals.sh` for the local bead memory commands (`remember`, `recall`, `prime`, `forget`).
- - Enforced `state-machine.md` presence in `verify_harness_skill()`.
-
- ### Safe State Confirmation
-
- The harness currently passes all structural tests cleanly.
- ```bash
- # Verify structure
- ./scripts/verify.sh
-
- # Verify regression
- ./scripts/run-evals.sh
-
- # Dry-run package integrity
- npm run pack:check
- ```
-
- ---
-
- ## Rollback Points
-
- ### If A Future Harness Evolution Breaks the CLI/Environment
-
- **Rollback Level 1: Last Stable Run (Current State)**
- If a new change to `bin/genesis-harness.js` or `scripts/verify.sh` creates infinite loops or immediate failures:
- ```bash
- git checkout -- bin/genesis-harness.js scripts/verify.sh scripts/run-evals.sh
- npm install
- ./scripts/verify.sh
- ```
-
- **Rollback Level 2: Full Repository Reset**
- If tests are failing in a manner that contaminates local fixtures or memory:
- ```bash
- git reset --hard HEAD
- git clean -fd
- npm install
- ./scripts/verify.sh
- ```
-
- ---
-
- ## Checklist: Before Pausing Work on Harness Evolutions
-
- - [ ] `scripts/verify.sh` passing cleanly (Exit Code 0)
- - [ ] `scripts/run-evals.sh` passing cleanly (Exit Code 0)
- - [ ] Script files verified for POSIX/LF line endings
- - [ ] No uncommitted changes in core scripts that break existing workflows
- - [ ] `.codebase/CURRENT_STATE.md` updated with exact phase details
+ A reverse-chronological log of stable states to return to if the current task corrupts the project.
 
  ---
 
- ## Contact For Questions
- **Owner**: Codex Harness Engineering Team
- **Last Validated**: 2026-06-01
+ ## 2026-06-12: End-to-End Project Lifecycle
+ - **Status**: Stable
+ - **Git State**: Multi-feature orchestration, project verification, release-ready handoff, append-only events, and lifecycle audit are implemented and covered by integration tests.
+ - **Why it's stable**: `cli-smoke.test.js` exercises idea bootstrap through two feature completions, project proof, final completion, idempotency, event history, and audit; the canonical `verify-gate` passes and is now the single completion gate.
+ - **How to recover**: Use `.runs/<session-id>/STATE.json`, `RESUME.md`, and `EVENTS.jsonl`; run `pipeline-audit` before resuming the command reported by `next`.
+ - **Files changed**: CLI, lifecycle state machine, project registry contract, pipeline fixtures/tests, orchestration skill, plugin prompt, README files, and repository memory.
+
+ ## 2026-06-12T16:50:17+07:00: Lifecycle Pipeline + Repository Hygiene
+ - **Status**: Stable
+ - **Git State**: Lifecycle and hygiene changes verified locally; tracked `node_modules/` entries are staged for removal while local installed dependencies remain available.
+ - **Why it's stable**: `run --idea` creates a project feature registry, `next` resolves executable work, and `complete-feature` requires a passing command plus explicit evidence before closing state. Repository verification blocks tracked dependencies and generated package artifacts, while tarball smoke tests reject generated `scripts/bin/` binaries.
+ - **How to recover**: Reapply this checkpoint if `.planning/FEATURE_REGISTRY.json` stops being generated, completion bypasses verification, evidence or metrics disappear, or `node_modules/` becomes tracked again.
+ - **Files changed**: `bin/genesis-harness.js`, `scripts/check-repository-hygiene.js`, `scripts/verify.sh`, `scripts/run-evals.sh`, lifecycle contracts/fixtures/tests, README files, and `.codebase/*`.
+
+ ## 2026-06-10T10:05:00Z: Feature Execution Bootstrap
+ - **Status**: Stable
+ - **Git State**: Working tree verified after `run`/`resume` orchestration started auto-scaffolding the first execution-ready feature.
+ - **Why it's stable**: `tests/integration/cli-smoke.test.js`, `scripts/verify.sh`, `scripts/run-evals.sh`, and `genesis-harness verify-gate` now cover the handoff from discovery into `IMPLEMENTATION` with `active_feature` persisted in `.runs/`.
+ - **How to recover**: Reapply from this point if `run --idea` falls back to `PLANNING`, if `.planning/features/<NNN>-...` stops being created automatically, or if `resume` loses the active feature checkpoint.
+ - **Files changed**: `bin/genesis-harness.js`, `tests/integration/cli-smoke.test.js`, `fixtures/pipeline/run-to-feature-execution-fixture.md`, and `.codebase/*.md`.
+
+ ## 2026-06-10T10:25:00Z: Typed First-Slice Contract Bootstrap
+ - **Status**: Stable
+ - **Git State**: Working tree verified after the first feature scaffold started emitting API/UI-specific contracts and fixtures.
+ - **Why it's stable**: `run --idea` now creates `contracts/ui`, `contracts/api`, `playwright/fixtures`, and `fixtures/api` artifacts when the discovery answers imply those surfaces, and `tests/integration/cli-smoke.test.js` locks the generated paths and tailored values.
+ - **How to recover**: Reapply from this point if the first feature loses typed contract scaffolding, if generated routes/endpoints regress to generic placeholders, or if `TEST_CONTRACT.md` stops referencing the generated contract paths.
+ - **Files changed**: `bin/genesis-harness.js`, `tests/integration/cli-smoke.test.js`, `fixtures/pipeline/run-to-feature-execution-fixture.md`, `tests/fixtures/fixture-index.md`, and `.codebase/*.md`.
+
+ ## 2026-06-10T08:34:56Z: Workflow Consolidation + Trusted Publish Hardening
+ - **Status**: Stable
+ - **Git State**: Working tree verified after CI workflow consolidation, registry cleanup, and release-path hardening.
+ - **Why it's stable**: GitHub Actions now reuse a single `verify-gate` path, release publishing expects OIDC trusted publishing with provenance, and workflow contract tests block drift back to placeholder CI logic.
+ - **How to recover**: Reapply from this point if CI starts bypassing `verify-gate`, if docs-sync regains custom placeholder logic, or if npm publishing falls back to long-lived tokens and mutable CI version rewrites.
+ - **Files changed**: `.github/workflows/*.yml`, `tests/unit/workflow_contracts.test.js`, `features/REGISTRY.md`, and `.codebase/*.md`.
+
+ ## 2026-06-10T16:45:00+07:00: Resume + Run Artifact Hardening
+ - **Status**: Stable
+ - **Git State**: Working tree verified after resumable run-artifact and state-invariant changes.
+ - **Why it's stable**: `run` now writes per-session `.runs/<session-id>` artifacts, `resume` can backfill and report from them, and state metadata tests block stale timestamps.
+ - **How to recover**: Reapply from this point if mid-project resume loses the next task, if `.runs/` stops being populated, or if `completed_at` drifts behind the active session.
+ - **Files changed**: `bin/genesis-harness.js`, `.codebase/state.json`, `.codebase/*.md`, `scripts/run-evals.sh`, and CLI/unit test coverage.
+
+ ## 2026-06-10T14:20:00+07:00: Auto-init + Discovery Bootstrap
+ - **Status**: Stable
+ - **Git State**: Working tree verified after init bootstrap changes.
+ - **Why it's stable**: `tests/integration/cli-smoke.test.js`, `scripts/verify.sh`, `scripts/run-evals.sh` (with temporary npm cache override), and `npm run pack:check` pass.
+ - **How to recover**: Reapply from this point if init stops creating `.planning/INIT_QA.md`, `01-discovery-and-qa`, or `.codebase/PHASE_DEPENDENCY_MAP.md`.
+ - **Files changed**: `bin/genesis-harness.js`, `init-planning.sh`, `genesis-harness` skill routing docs, and init smoke coverage.
+
+ ## 2026-06-10T14:55:00+07:00: Idea-Seeded Planner Bootstrap
+ - **Status**: Stable
+ - **Git State**: Working tree verified after brief-to-planning bootstrap changes.
+ - **Why it's stable**: `init --idea "<brief>"` now fills planning docs and planner state, and verification still passes on `cli-smoke`, `verify.sh`, `run-evals.sh`, and `pack:check`.
+ - **How to recover**: Reapply from this point if user brief content stops propagating into `PROJECT.md`, `REQUIREMENTS.md`, `STACK.md`, `SUMMARY.md`, or `.codebase/state.json`.
+ - **Files changed**: `bin/genesis-harness.js`, `init-planning.sh`, `genesis-harness` prompt/routing docs, and init smoke coverage.
+
+ ## 2026-06-10T15:05:00+07:00: Runtime Pipeline + Verification Hardening
+ - **Status**: Stable
+ - **Git State**: Working tree verified after runtime pipeline, gate hardening, and metadata drift fixes.
+ - **Why it's stable**: `genesis-harness run --idea ... --yes` now advances a blank repo into planning with persisted discovery answers, and `verify-gate` now matches the required completion contract.
+ - **How to recover**: Reapply from this point if `run` stops filling planning docs, if `verify-gate` stops executing evals/docs/pack checks, or if plugin/package/state metadata drift returns.
+ - **Files changed**: `bin/genesis-harness.js`, `init-planning.sh`, `.codex-plugin/plugin.json`, `.codebase/*.md`, `.codebase/state.json`, `scripts/run-evals.sh`, and CLI/unit/integration tests.
+
+ ## 2026-06-03T09:55:00+07:00: Full Score Harness Fix (110/110)
+ - **Status**: Stable
+ - **Git State**: Everything committed + new features added.
+ - **Why it's stable**: All tests (`tests/unit/*.test.js`), `verify.sh`, `run-evals.sh`, and `cold-start-check.js` pass with exit code 0.
+ - **How to recover**: `git reset --hard HEAD` (assuming commit happens immediately after this)
+ - **Files added**: `features/REGISTRY.md`, `scripts/cold-start-check.js`, `scripts/check-scope.sh`, observability schemas/samples.
+
+ ## 2026-06-03T09:30:00+07:00: LeanCTX + CLI Postinstall Seed
+ - **Status**: Stable
+ - **Why it's stable**: `npm run verify` and `npm run eval` pass. `context-policy.json` successfully bootstrapped.
+ - **How to recover**: Return to commit before the evaluation score fixes.
+
+ ## 2026-06-03T08:35:00+07:00: Harness Drift Gate Hardening
+ - **Status**: Stable
+ - **Why it's stable**: `npm run verify`, `npm run eval`, and `npm run pack:check` all pass.
+ - **How to recover**: Revert to branch state before LeanCTX introduction.
@@ -0,0 +1,6 @@
+ # Tech Debt Ledger
+
+ This file logs structural rule violations, skipped tests, and out-of-scope modifications that were bypassed using `VIBE_MODE`.
+ The AI and developers should periodically review this file to pay down technical debt.
+
+ ---
@@ -2,10 +2,15 @@
 
  Required checks:
 
- - `scripts/verify.sh`: repository harness structure, skill metadata, contracts, fixtures, and harness smoke test.
- - `scripts/run-evals.sh`: install/verify/uninstall regression checks.
+ - `scripts/verify.sh`: repository harness structure, skill metadata, contracts, fixtures, harness smoke test, and `SKILL.md` progressive-disclosure line limit.
+ - `scripts/run-evals.sh`: install/verify/uninstall regression checks, manifest route checks, sync-generated Mermaid relationship checks, hook docs-gate checks, LeanCTX policy checks, handoff/state freshness checks, `tests/unit/*.test.js`, and `tests/integration/*.test.js`.
+ - `tests/integration/cli-smoke.test.js`: package CLI smoke for install/postinstall LeanCTX seeding, deterministic idea bootstrap, resumable run artifacts, multi-feature queue promotion, evidence-gated feature completion, project-wide verification, release-ready handoff, idempotent project completion, append-only events, lifecycle audit, metrics, and observability output.
+ - `tests/unit/repository_hygiene.test.js`: prevents tracked dependency or generated artifacts from returning.
+ - `tests/unit/prompt_sentinel.test.js`: LeanCTX-backed prompt sentinel threshold and truncation behavior.
+ - `tests/unit/state_metadata.test.js`: keeps `.codebase/CURRENT_STATE.md` and `.codebase/state.json` aligned for session, TTFV, legal state enums, and non-stale completion timestamps.
+ - `tests/unit/verify_gate.test.js`: verifies that `verify-gate` includes evals, docs-gate, pack dry-run, and LeanCTX checks.
+ - `tests/unit/workflow_contracts.test.js`: verifies CI workflows delegate to the reusable `verify-gate` path, pin critical actions, and use trusted npm publishing with provenance.
  - `npm run pack:check`: package contents dry-run.
  - Skill validation: run `quick_validate.py` for changed skills when available.
 
  Feature rule: add or update fixtures and expected output before implementation.
-
@@ -0,0 +1,127 @@
+ # Visual Project Graph
+
+ ## Harness Relationship Map
+
+ ```mermaid
+ flowchart LR
+ manifest[".codex-plugin/plugin.json"] --> skills[".codex/skills/*"]
+ package["package.json"] --> cli["bin/genesis-harness.js"]
+ package --> verify["scripts/verify.sh"]
+ package --> evals["scripts/run-evals.sh"]
+ cli --> install["install / postinstall"]
+ cli --> hooks["setup-hooks"]
+ hooks --> docsgate["genesis-harness docs-gate"]
+ docsgate --> docsync["check-docs-sync.sh"]
+ docsgate --> specsync["check-spec-changelog.sh"]
+ skills --> contracts["contracts/"]
+ skills --> fixtures["fixtures/"]
+ skills --> tests["tests/ + playwright/"]
+ skills --> memory[".codebase/"]
+ verify --> skills
+ verify --> contracts
+ verify --> fixtures
+ verify --> memory
+ evals --> install
+ evals --> cli
+ evals --> unit["tests/unit/*.test.js"]
+ evals --> integration["tests/integration/*.test.js"]
+ evals --> pack["npm pack smoke"]
+ ```
+
+ ## Skill Workflow Relationships
+
+ ```mermaid
+ flowchart TD
+ harness["genesis-harness"] --> planning["genesis-planning"]
+ harness --> research["genesis-research-first"]
+ planning --> architecture["genesis-architecture"]
+ planning --> api["genesis-api-contract"]
+ planning --> design["genesis-design-spec"]
+ api --> apisync["genesis-api-sync"]
+ design --> ui["genesis-ui-ux-test"]
+ api --> specimpact["spec-impact-engine"]
+ specimpact --> specprop["genesis-spec-propagation"]
+ specprop --> docs["genesis-docs-automation"]
+ ui --> verifybefore["genesis-verification-before-completion"]
+ apisync --> verifybefore
+ docs --> verifybefore
+ verifybefore --> release["genesis-release"]
+ harness --> memorymap["genesis-codebase-map"]
+ harness --> observability["genesis-observability-automation"]
+ ```
+
+ ## Code Dependency Hints
+
+ ```mermaid
+ flowchart TD
+ "tests/integration/cli-smoke.test.js" --> "assert"
+ "tests/integration/cli-smoke.test.js" --> "fs"
+ "tests/integration/cli-smoke.test.js" --> "os"
+ "tests/integration/cli-smoke.test.js" --> "path"
+ "tests/integration/cli-smoke.test.js" --> "child_process"
+ "tests/unit/contract_integrity_gate.test.js" --> "assert"
+ "tests/unit/contract_integrity_gate.test.js" --> "fs"
+ "tests/unit/contract_integrity_gate.test.js" --> "path"
+ "tests/unit/contract_integrity_gate.test.js" --> "child_process"
+ "tests/unit/healing_telemetry.test.js" --> "assert"
+ "tests/unit/healing_telemetry.test.js" --> "fs"
+ "tests/unit/healing_telemetry.test.js" --> "path"
+ "tests/unit/healing_telemetry.test.js" --> "child_process"
+ "tests/unit/prompt_sentinel.test.js" --> "assert"
+ "tests/unit/prompt_sentinel.test.js" --> "fs"
+ "tests/unit/prompt_sentinel.test.js" --> "path"
+ "tests/unit/prompt_sentinel.test.js" --> "child_process"
+ "tests/unit/spec_visual_sync.test.js" --> "assert"
+ "tests/unit/spec_visual_sync.test.js" --> "fs"
+ "tests/unit/spec_visual_sync.test.js" --> "path"
+ "tests/unit/spec_visual_sync.test.js" --> "child_process"
+ "tests/unit/test_generator.test.js" --> "assert"
+ "tests/unit/test_generator.test.js" --> "fs"
+ "tests/unit/test_generator.test.js" --> "path"
+ "tests/unit/test_generator.test.js" --> "child_process"
+ "bin/genesis-harness.js" --> "fs"
+ "bin/genesis-harness.js" --> "path"
+ "bin/genesis-harness.js" --> "child_process"
+ "bin/genesis-harness.js" --> "@babel/parser"
+ "bin/genesis-harness.js" --> "@babel/traverse"
+ "bin/genesis-harness.js" --> "child_process"
+ ```
+
+ ## .planning/ROADMAP.md Derived Feature Status
+
+ ```mermaid
+ graph TD
+ classDef completed fill:#d4edda,stroke:#28a745,stroke-width:2px;
+ classDef inprogress fill:#fff3cd,stroke:#ffc107,stroke-width:2px;
+ classDef pending fill:#e2e3e5,stroke:#6c757d,stroke-width:2px;
+ subgraph Role_0 ["Role: User"]
+ Task0["Roadmap task 0"]
+ class Task0 completed;
+ Task1["Roadmap task 1"]
+ class Task1 inprogress;
+ Task2["Roadmap task 2"]
+ class Task2 pending;
+ end
+ subgraph Role_1 ["Role: Admin"]
+ Task3["Roadmap task 3"]
+ class Task3 completed;
+ Task4["Roadmap task 4"]
+ class Task4 pending;
+ Task5["Roadmap task 5"]
+ class Task5 inprogress;
+ end
+ subgraph Role_2 ["Role: Analytics"]
+ Task6["Roadmap task 6"]
+ class Task6 pending;
+ Task7["Roadmap task 7"]
+ class Task7 pending;
+ Task8["Roadmap task 8"]
+ class Task8 inprogress;
+ end
+ Task0 --> Task1
+ Task0 --> Task2
+ Task2 --> Task4
+ Task2 --> Task5
+ Task4 --> Task6
+ ```
+
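The "Code Dependency Hints" graph in this file lists the edge from `bin/genesis-harness.js` to `child_process` twice. The diff does not show how the harness derives these edges, so the following is only a minimal sketch of one way to emit such edges without repeats. The helper name `requireEdges` and the regex-based scan are assumptions made for illustration, not the package's method.

```javascript
// Hypothetical helper, not part of genesis-harness: turns require() calls
// found in a source string into Mermaid edge lines, emitting each edge once.
function requireEdges(file, source) {
  const pattern = /require\(\s*["']([^"']+)["']\s*\)/g;
  const seen = new Set();
  const edges = [];
  for (const match of source.matchAll(pattern)) {
    const dep = match[1];
    if (seen.has(dep)) continue; // a module required twice yields one edge
    seen.add(dep);
    edges.push(`"${file}" --> "${dep}"`);
  }
  return edges;
}

// Sample input that requires child_process twice.
const sample = [
  'const fs = require("fs");',
  'const { execSync } = require("child_process");',
  'const { spawnSync } = require("child_process");',
].join("\n");

console.log(["flowchart TD", ...requireEdges("bin/genesis-harness.js", sample)].join("\n"));
```

A regex scan misses dynamic or computed requires; a real parser would be more robust, at the cost of an extra dependency.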
@@ -0,0 +1,68 @@
+ {
+   "name": "leanctx-default",
+   "token_budget": 12000,
+   "_comment_token_budget": "Default conservative budget. Override in project .codebase/context-policy.json. For 128k-context models, set to 40000.",
+   "auto_scale": {
+     "enabled": false,
+     "note": "Set enabled=true and provide model_context_window to auto-calculate budgets. Formula: token_budget = model_context_window * 0.09 (9% for harness context leaves 91% for generation).",
+     "model_context_window": null,
+     "scale_factor": 0.09
+   },
+   "warn_at": 0.6,
+   "compact_at": 0.7,
+   "hard_stop_at": 0.85,
+   "layers": [
+     {
+       "name": "core",
+       "max_tokens": 2500,
+       "include": [
+         "AGENTS.md",
+         ".codex/SOUL.md",
+         ".codebase/CURRENT_STATE.md",
+         ".codebase/MODULE_INDEX.md",
+         ".codebase/TEST_MATRIX.md"
+       ]
+     },
+     {
+       "name": "active_context",
+       "max_tokens": 6500,
+       "include": [
+         ".codebase/COMPRESSED_CONTEXT.md",
+         ".codebase/VISUAL_GRAPH.md",
+         ".planning/STATE.md",
+         ".planning/ROADMAP.md",
+         "contracts/",
+         "fixtures/"
+       ]
+     },
+     {
+       "name": "deferred_reference",
+       "max_tokens": 3000,
+       "include": [
+         ".codex/skills/*/references/",
+         ".codex/skills/*/playbooks/",
+         ".codex/skills/*/checklists/",
+         "README*.md"
+       ]
+     }
+   ],
+   "defer_patterns": [
+     ".codex/skills/*/templates/**",
+     ".codex/skills/*/examples/**",
+     "playwright/**",
+     "observability/**",
+     "node_modules/**",
+     "dist/**",
+     "coverage/**"
+   ],
+   "portable_commands": [
+     "genesis-harness leanctx",
+     "genesis-harness sync",
+     "genesis-harness docs-gate",
+     "genesis-harness verify-gate",
+     "npm run verify",
+     "npm run eval",
+     "node scripts/cold-start-check.js"
+   ],
+   "wrapper_policy": "rtk optional when installed locally; public docs and CI must use portable commands."
+ }
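The policy above states a scaling formula (`token_budget = model_context_window * 0.09`) and three thresholds. The sketch below shows how those values might combine. It assumes the thresholds are fractions of the effective token budget, which the diff does not confirm, and the function names `effectiveBudget` and `budgetAction` are hypothetical.

```javascript
// Values copied from context-policy.json above; everything else is illustrative.
const policy = {
  token_budget: 12000,
  auto_scale: { enabled: false, model_context_window: null, scale_factor: 0.09 },
  warn_at: 0.6,
  compact_at: 0.7,
  hard_stop_at: 0.85,
};

// Applies the documented formula when auto_scale is enabled and a window size
// is provided; otherwise falls back to the static token_budget.
function effectiveBudget(p) {
  const { enabled, model_context_window: windowSize, scale_factor: factor } = p.auto_scale;
  if (enabled && Number.isFinite(windowSize)) {
    return Math.round(windowSize * factor);
  }
  return p.token_budget;
}

// Assumption: each threshold is compared against used tokens / effective budget.
function budgetAction(p, usedTokens) {
  const ratio = usedTokens / effectiveBudget(p);
  if (ratio >= p.hard_stop_at) return "hard_stop";
  if (ratio >= p.compact_at) return "compact";
  if (ratio >= p.warn_at) return "warn";
  return "ok";
}

console.log(effectiveBudget(policy), budgetAction(policy, 9000));
```

With the default values the three layer limits (2500 + 6500 + 3000) add up to the 12000-token budget, so a scaled budget would presumably need the layer limits adjusted as well.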
@@ -0,0 +1,63 @@
+ # Lessons Learned & Historical Bugs
+
+ This file chronicles the major failures, recursive bugs, and architectural dead-ends we have encountered. It acts as an immune system preventing the agent from repeating history.
+
+ ## 1. Duplicate Slash Commands in Registry
+ - **Symptom**: Agent registered 4 copies of the same slash command for a single skill.
+ - **Root Cause**: The CLI script recursively scanned the entire `.codex/` directory for active skills, accidentally parsing backup folders (`.codex/backup/`) generated during skill upgrades.
+ - **Resolution**: Backup directories must ALWAYS be placed completely outside the active parsed directory (e.g., moved to `~/.codex/backups` globally).
+ - **Rule**: When doing file tree walks for plugins/skills, always explicitly ignore `.git`, `node_modules`, `backup`, and `tmp` folders.
+
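The rule in lesson 1 can be sketched as a recursive walk with an explicit ignore set. The ignored directory names come from the rule itself; the helper name `findSkillFiles` and the demo layout are assumptions for illustration and do not reflect the harness's actual scanner.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Directory names taken from the rule in lesson 1.
const IGNORED_DIRS = new Set([".git", "node_modules", "backup", "tmp"]);

// Hypothetical helper: collects every SKILL.md under root, never descending
// into an ignored directory.
function findSkillFiles(root) {
  const found = [];
  for (const entry of fs.readdirSync(root, { withFileTypes: true })) {
    const full = path.join(root, entry.name);
    if (entry.isDirectory()) {
      if (!IGNORED_DIRS.has(entry.name)) found.push(...findSkillFiles(full));
    } else if (entry.name === "SKILL.md") {
      found.push(full);
    }
  }
  return found.sort();
}

// Demo in a throwaway directory: one live skill and one backup copy of it.
const root = fs.mkdtempSync(path.join(os.tmpdir(), "skills-"));
for (const dir of ["skills/alpha", "backup/skills/alpha"]) {
  fs.mkdirSync(path.join(root, dir), { recursive: true });
  fs.writeFileSync(path.join(root, dir, "SKILL.md"), "# alpha\n");
}
console.log(findSkillFiles(root).map((file) => path.relative(root, file)));
```

Only the copy under `skills/alpha` is reported, so a registry built from this list would see one command per skill even while a backup folder exists inside the scanned tree.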
11
+ ## 2. Documentation Drift & Broken Contracts
12
+ - **Symptom**: Code in `scripts/` changed logic without updating `contracts/`.
13
+ - **Root Cause**: Agent skipped the documentation step after a "quick fix" code edit.
14
+ - **Resolution**: Implemented Validation Gates (`npm run verify`).
15
+ - **Rule**: Never finalize a code edit without explicitly checking `TEST_MATRIX.md` and related schemas in `contracts/`. The validation gate will fail the build if it detects drift.
16
+
17
+ ## 3. Excessive Token Usage from `cat` and `ls`
18
+ - **Symptom**: Context window flooded with massive minified bundle files or deep directory trees.
19
+ - **Root Cause**: Using `cat` on large files or `ls -R` without filters.
20
+ - **Resolution**:
21
+ - **Rule**: Always use the native `view_file`, `list_dir`, and `grep_search` tools with precise line bounds or search terms. NEVER `cat` a file directly in bash if a native agent tool exists.
22
+
23
+ ## 4. Init Must Not Depend On Explicit Slash Commands
24
+ - **Symptom**: The harness stayed idle on a blank repo until the user typed `/init`, even when the user had already provided a product idea.
25
+ - **Root Cause**: The entry skill documented `/init`, but the actual CLI/bootstrap path only exposed an explicit interactive command and did not scaffold discovery artifacts automatically.
26
+ - **Resolution**: Treat "empty repo + user idea" as implicit init in `genesis-harness` docs, and make CLI `init` call `init-planning.sh` to create Foundation, Discovery/QA, and dependency-map artifacts.
27
+ - **Rule**: For cold starts, initialize first, then ask the discovery/QA/tech-stack questions. Do not force the user to know the harness command vocabulary.
28
+
29
+ ## 5. Auto-init Must Preserve The User Brief
30
+ - **Symptom**: Even after auto-init started running, the planner still dumped mostly `TBD` placeholders and lost the original idea unless the user repeated it.
31
+ - **Root Cause**: Initialization created structure but did not treat the first user brief as durable bootstrap input.
32
+ - **Resolution**: `genesis-harness init --idea "<brief>"` now seeds planning docs and planner state from the brief before follow-up QA begins.
33
+ - **Rule**: The first user idea is a source artifact. Persist it into planning docs and state immediately, then ask only the missing clarification questions.
34
+
35
+ ## 6. Prompt Contracts Must Match Runtime Contracts
36
+ - **Symptom**: Skill docs and plugin prompts said the harness could auto-init from an idea, but the executable runtime still depended on manual follow-up and incomplete verification gates.
37
+ - **Root Cause**: Routing docs, plugin metadata, and gate definitions evolved separately from the actual CLI control flow.
38
+ - **Resolution**: Add a deterministic `genesis-harness run --idea ... --yes` pipeline, make `verify-gate` execute the full completion bar, and add regression tests for both.
39
+ - **Rule**: Do not describe a harness behavior in prompts or memory until there is a CLI/runtime path and a regression test that enforces it.
40
+
41
+ ## 7. Resume Requires Durable Session Artifacts, Not Just State Labels
42
+ - **Symptom**: The harness could move into planning, but a later session still had to infer what to do next from scattered markdown because there was no canonical run checkpoint.
43
+ - **Root Cause**: `.codebase/state.json` carried phase labels, but there was no per-session artifact bundle tying brief, discovery answers, and next tasks together.
44
+ - **Resolution**: `run` now writes `.runs/<session-id>/INPUT.md`, `DISCOVERY.json`, `STATE.json`, and `RESUME.md`, and `resume` reads or backfills them from state.
45
+ - **Rule**: Any harness phase that claims resumability must emit a durable per-session artifact bundle and a deterministic resume entrypoint.
46
+
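One way to picture the per-session bundle from lesson 7 is the sketch below. The four file names come from the Resolution above; the helper names (`writeSessionBundle`, `readSessionBundle`) and the `phase` / `next_tasks` fields are assumptions made for illustration, and the backfill-from-state path is left out.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper: write the bundle files named in the lesson under
// <rootDir>/.runs/<sessionId>/. Field names on `state` are assumptions.
function writeSessionBundle(rootDir, sessionId, { brief, discovery, state }) {
  const dir = path.join(rootDir, '.runs', sessionId);
  fs.mkdirSync(dir, { recursive: true });
  fs.writeFileSync(path.join(dir, 'INPUT.md'), `# Input\n\n${brief}\n`);
  fs.writeFileSync(path.join(dir, 'DISCOVERY.json'), JSON.stringify(discovery, null, 2));
  fs.writeFileSync(path.join(dir, 'STATE.json'), JSON.stringify(state, null, 2));
  const tasks = (state.next_tasks || []).map((task) => `- ${task}`).join('\n');
  fs.writeFileSync(path.join(dir, 'RESUME.md'), `# Resume\n\nPhase: ${state.phase}\n\n${tasks}\n`);
  return dir;
}

// Hypothetical resume entrypoint: read the bundle back without guessing
// from scattered markdown.
function readSessionBundle(rootDir, sessionId) {
  const dir = path.join(rootDir, '.runs', sessionId);
  return {
    brief: fs.readFileSync(path.join(dir, 'INPUT.md'), 'utf8'),
    discovery: JSON.parse(fs.readFileSync(path.join(dir, 'DISCOVERY.json'), 'utf8')),
    state: JSON.parse(fs.readFileSync(path.join(dir, 'STATE.json'), 'utf8')),
  };
}
```

Keeping the brief, discovery answers, and next tasks in one directory means a later session needs only the session id to find everything it should act on.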
47
+ ## 8. Discovery-Only Pipelines Still Break End-To-End Execution
48
+ - **Symptom**: `run --idea` looked complete in docs, but it stopped at "Create the first feature plan", leaving a human or later session to bridge the execution gap manually.
49
+ - **Root Cause**: Discovery persistence existed, but there was no runtime handoff that turned approved scope into a concrete active feature scaffold.
50
+ - **Resolution**: `run` now creates the first feature scaffold automatically, seeds spec/plan/test-contract/verification files, records `active_feature`, and advances resumable state into `IMPLEMENTATION`.
51
+ - **Rule**: A harness pipeline is not end-to-end unless it leaves the next session inside an execution-ready slice with explicit tests, contracts, and verification steps already scaffolded.
52
+
53
+ ## 9. Generic Execution Scaffolds Still Leave Contract Work To Humans
54
+ - **Symptom**: Even after execution bootstrap existed, the first feature slice still began with only generic planning files, so the next agent had to invent API/UI contracts and fixtures manually.
55
+ - **Root Cause**: The runtime scaffold did not classify the first slice by surface area and did not reuse the repository's contract and fixture structure.
56
+ - **Resolution**: The bootstrap now infers `ui`, `api`, or `full-stack` from discovery answers and emits typed artifacts in `contracts/ui/<feature>/`, `contracts/api/<feature>/`, `playwright/fixtures/`, and `fixtures/api/`.
57
+ - **Rule**: If a harness claims contract-first execution, the first slice must already contain the concrete contract and fixture paths needed by the likely implementation surface.
58
+
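The classification step in lesson 9 might look like the sketch below. The directory names are taken from the Resolution above; the `hasUi` / `hasApi` discovery fields, the function names, and the fallback to `ui` when neither flag is set are all assumptions, since the diff does not show how the bootstrap reads discovery answers.

```javascript
// Hypothetical classifier. The `hasUi` and `hasApi` answer fields are assumptions.
function inferSurface(discovery) {
  const hasUi = Boolean(discovery.hasUi);
  const hasApi = Boolean(discovery.hasApi);
  if (hasUi && hasApi) return 'full-stack';
  if (hasApi) return 'api';
  return 'ui'; // assumed default when the answers are inconclusive
}

// Map a surface to the contract and fixture paths listed in the lesson.
function contractPaths(surface, feature) {
  const ui = [`contracts/ui/${feature}/`, 'playwright/fixtures/'];
  const api = [`contracts/api/${feature}/`, 'fixtures/api/'];
  if (surface === 'ui') return ui;
  if (surface === 'api') return api;
  return [...ui, ...api]; // full-stack gets both sets
}
```

For a feature named `login` classified as `api`, the sketch yields `contracts/api/login/` and `fixtures/api/`, so the next agent starts with concrete paths instead of inventing them.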
59
+ ## 10. Feature Completion Is Not Project Completion
60
+ - **Symptom**: The runtime could verify one active feature, but there was no repeatable queue promotion, project-wide proof rerun, release-ready handoff, or drift audit before marking the project done.
61
+ - **Root Cause**: Feature lifecycle state and project lifecycle state were treated as the same boundary.
62
+ - **Resolution**: Added a multi-feature registry, append-only lifecycle events, project-wide verification, a distinct `RELEASE_READY` state, evidence-gated project completion, and `pipeline-audit`.
63
+ - **Rule**: A project may reach `COMPLETED` only after every feature is verified, all proof commands pass again at project scope, the handoff exists, and release or acceptance evidence is recorded.
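The completion rule in lesson 10 can be read as an evidence gate. The sketch below is illustrative only: `RELEASE_READY` and `COMPLETED` come from the lesson, while the project record shape, the `VERIFIED` feature status, and both function names are assumptions.

```javascript
// Hypothetical project record; every field name here is an assumption.
function completionBlockers(project) {
  const blockers = [];
  const unverified = project.features.filter((f) => f.status !== 'VERIFIED');
  if (unverified.length > 0) {
    blockers.push(`unverified features: ${unverified.map((f) => f.id).join(', ')}`);
  }
  if (!project.projectProofPassed) blockers.push('project-scope proof commands have not passed');
  if (!project.handoffExists) blockers.push('handoff is missing');
  if (!project.releaseEvidence) blockers.push('no release or acceptance evidence recorded');
  return blockers;
}

// Promote to COMPLETED only when no blocker remains; otherwise keep the current state.
function nextProjectState(project) {
  return completionBlockers(project).length === 0 ? 'COMPLETED' : project.state;
}
```

Returning the list of blockers, rather than a bare boolean, keeps feature-level and project-level evidence separately visible when the gate refuses promotion.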
@@ -0,0 +1,17 @@
1
+ # Developer Preferences
2
+
3
+ This file records the specific technical choices, preferences, and stylistic guidelines of the human developer for this repository. Adhere to these implicitly during code generation and problem-solving.
4
+
5
+ ## Technology Stack
6
+ - **Primary Language**: JavaScript (Node.js for backend scripts). Keep code modern but compatible with Node >= 18.
7
+ - **Testing**: Use Bash test scripts (`verify.sh`, `run-evals.sh`) and Node's built-in `assert` module for unit tests unless specified otherwise.
8
+ - **Frontend/UI**: When dealing with UI generation, prefer Vanilla CSS for precise control and maximum performance. Emphasize "WOW" factor, modern gradients, glassmorphism, and responsive layouts.
9
+
10
+ ## Architectural Choices
11
+ - **Harness Engineering**: The system relies on finite state machines (FSMs) and validation gates. Never skip a validation gate (`contract_integrity_gate.js`, `healing_telemetry.js`).
12
+ - **File Integrity**: Keep all metadata inside `.codebase/` and `contracts/` synchronized with every change to the actual code (`scripts/spec_visual_sync.js`).
13
+
14
+ ## Communication Style
15
+ - Be concise, professional, and skip unnecessary pleasantries when delivering technical solutions.
16
+ - Use GitHub Flavored Markdown for formatting logs, alerts, and instructions.
17
+ - Provide Vietnamese localization in READMEs and user-facing artifacts where possible, as the user frequently requests Vietnamese communication.