codex-genesis-harness 0.1.7 → 0.1.9
- package/.codebase/COMPRESSED_CONTEXT.md +80 -0
- package/.codebase/CURRENT_STATE.md +10 -10
- package/.codebase/DEPENDENCY_GRAPH.md +14 -1
- package/.codebase/IMPLEMENTATION_HANDOFF.md +34 -336
- package/.codebase/KNOWN_PROBLEMS.md +73 -3
- package/.codebase/MODULE_INDEX.md +23 -2
- package/.codebase/PIPELINE_FLOW.md +16 -6
- package/.codebase/RECOVERY_POINTS.md +80 -78
- package/.codebase/TECH_DEBT.md +6 -0
- package/.codebase/TEST_MATRIX.md +8 -3
- package/.codebase/VISUAL_GRAPH.md +127 -0
- package/.codebase/context-policy.json +68 -0
- package/.codebase/memories/lessons_learned.md +63 -0
- package/.codebase/memories/preferences.md +17 -0
- package/.codebase/state.json +156 -17
- package/.codex/skills/genesis-architecture/SKILL.md +5 -0
- package/.codex/skills/genesis-debug-guide/SKILL.md +10 -4
- package/.codex/skills/genesis-docs-automation/SKILL.md +52 -973
- package/.codex/skills/genesis-executing-plans/SKILL.md +54 -0
- package/.codex/skills/genesis-executing-plans/agents/openai.yaml +6 -0
- package/.codex/skills/genesis-executing-plans/checklists/.gitkeep +0 -0
- package/.codex/skills/genesis-executing-plans/examples/.gitkeep +0 -0
- package/.codex/skills/genesis-executing-plans/templates/.gitkeep +0 -0
- package/.codex/skills/genesis-harness/SKILL.md +73 -1385
- package/.codex/skills/genesis-harness/agents/openai.yaml +1 -2
- package/.codex/skills/genesis-harness/references/state-machine.md +4 -1
- package/.codex/skills/genesis-harness/references/workflows.md +7 -1
- package/.codex/skills/genesis-harness/scripts/check-docs-sync.sh +3 -3
- package/.codex/skills/genesis-harness/scripts/init-planning.sh +246 -14
- package/.codex/skills/genesis-new-design/SKILL.md +4 -1
- package/.codex/skills/genesis-new-design/agents/openai.yaml +2 -0
- package/.codex/skills/genesis-observability-automation/SKILL.md +69 -303
- package/.codex/skills/genesis-observability-automation/references/common-mistakes-and-recovery.md +84 -0
- package/.codex/skills/genesis-observability-automation/references/workflow-phases.md +78 -0
- package/.codex/skills/genesis-performance-profiling/SKILL.md +1 -22
- package/.codex/skills/genesis-performance-profiling/agents/openai.yaml +1 -1
- package/.codex/skills/genesis-pipeline-orchestration/SKILL.md +15 -3
- package/.codex/skills/genesis-planning/SKILL.md +6 -1
- package/.codex/skills/genesis-release/SKILL.md +5 -0
- package/.codex/skills/genesis-research-first/SKILL.md +6 -0
- package/.codex/skills/genesis-spec-propagation/SKILL.md +52 -504
- package/.codex/skills/genesis-test-driven-development/SKILL.md +55 -0
- package/.codex/skills/genesis-test-driven-development/agents/openai.yaml +6 -0
- package/.codex/skills/genesis-test-driven-development/checklists/.gitkeep +0 -0
- package/.codex/skills/genesis-test-driven-development/examples/.gitkeep +0 -0
- package/.codex/skills/genesis-test-driven-development/templates/.gitkeep +0 -0
- package/.codex/skills/genesis-upgrade-design/SKILL.md +4 -2
- package/.codex/skills/genesis-upgrade-design/agents/openai.yaml +2 -0
- package/.codex/skills/genesis-using-git-worktrees/SKILL.md +54 -0
- package/.codex/skills/genesis-using-git-worktrees/agents/openai.yaml +6 -0
- package/.codex/skills/genesis-using-git-worktrees/checklists/.gitkeep +0 -0
- package/.codex/skills/genesis-using-git-worktrees/examples/.gitkeep +0 -0
- package/.codex/skills/genesis-using-git-worktrees/templates/.gitkeep +0 -0
- package/.codex/skills/genesis-verification-before-completion/SKILL.md +53 -0
- package/.codex/skills/genesis-verification-before-completion/agents/openai.yaml +6 -0
- package/.codex/skills/genesis-verification-before-completion/checklists/.gitkeep +0 -0
- package/.codex/skills/genesis-verification-before-completion/examples/.gitkeep +0 -0
- package/.codex/skills/genesis-verification-before-completion/templates/.gitkeep +0 -0
- package/.codex/skills/spec-impact-engine/SKILL.md +77 -500
- package/.codex/skills/spec-impact-engine/checklists/checklist.md +10 -0
- package/.codex-plugin/plugin.json +6 -5
- package/CHANGELOG.md +25 -1
- package/README.EN.md +74 -17
- package/README.VI.md +77 -19
- package/README.md +126 -10
- package/VERSION +1 -2
- package/bin/genesis-harness.js +2979 -149
- package/contracts/features/project-registry-schema.json +37 -0
- package/contracts/features/registry-schema.json +15 -0
- package/contracts/observability/agent-run-schema.json +39 -0
- package/contracts/observability/failure-schema.json +35 -0
- package/contracts/ui/auth/login-screen-contract.json +43 -0
- package/features/REGISTRY.md +65 -0
- package/features/SCOPE-template.md +65 -0
- package/fixtures/pipeline/end-to-end-project-lifecycle-fixture.md +39 -0
- package/fixtures/pipeline/feature-completion-fixture.md +26 -0
- package/fixtures/pipeline/run-to-feature-execution-fixture.md +20 -0
- package/fixtures/planning/MOCKUP_PROMPT_TEMPLATE.md +16 -0
- package/observability/agent-runs/sample-run.json +13 -0
- package/observability/decision-logs/sample-decision.md +43 -0
- package/observability/failures/sample-failure.json +12 -0
- package/package.json +15 -4
- package/playwright/e2e/app-template.spec.js +37 -0
- package/playwright/e2e/auth/login-screen.spec.js +65 -0
- package/playwright/e2e/web-template.spec.js +28 -0
- package/scripts/check-repository-hygiene.js +48 -0
- package/scripts/check-scope.sh +100 -0
- package/scripts/cold-start-check.js +133 -0
- package/scripts/install.sh +4 -0
- package/scripts/prompt_sentinel.js +35 -4
- package/scripts/run-evals.sh +152 -3
- package/scripts/schema/001-init.sql +129 -0
- package/scripts/schema/002-story-verify.sql +9 -0
- package/scripts/schema/003-tool-registry.sql +15 -0
- package/scripts/schema/004-intervention.sql +15 -0
- package/scripts/scratch_parser.js +49 -0
- package/scripts/spec_visual_sync.js +1 -1
- package/scripts/test_generator.js +2 -2
- package/scripts/transition_state.sh +32 -8
- package/scripts/uninstall.sh +4 -0
- package/scripts/validation_gates.sh +2 -80
- package/scripts/verify.sh +19 -2
- package/tests/fixtures/fixture-index.md +5 -0
- package/tests/integration/cli-smoke.test.js +506 -0
- package/tests/unit/feature_registry.test.js +152 -0
- package/tests/unit/prompt_sentinel.test.js +1 -1
- package/tests/unit/repository_hygiene.test.js +17 -0
- package/tests/unit/spec_visual_sync.test.js +1 -1
- package/tests/unit/state_metadata.test.js +76 -0
- package/tests/unit/test_generator.test.js +1 -1
- package/tests/unit/verify_gate.test.js +25 -0
- package/tests/unit/workflow_contracts.test.js +90 -0
- package/fixtures/tts/tts-fixture-template.md +0 -14
- package/fixtures/videos/video-fixture-template.md +0 -14
- package/playwright/e2e/e2e-template.md +0 -4
package/.codebase/RECOVERY_POINTS.md
@@ -1,83 +1,85 @@
 # Recovery Points
 
-
-
-**Use When**: Evolution of the Codex harness (verification loops, CLI tools, scripts) needs to be paused, or when a rollback is necessary due to environment breakage.
-
----
-
-## Quick Reference: Current Recovery Points
-
-| Phase | Status | Resumption File | Last Updated |
-|-------|--------|-----------------|--------------|
-| TUI Mockup Viewer Integration | ✓ Complete | `.codebase/CURRENT_STATE.md` | 2026-06-01 |
-| Harness Verification Streamlining | ✓ Complete | `.codebase/CURRENT_STATE.md` | 2026-06-01 |
-| Bead Memory Regression Tests | ✓ Complete | `scripts/run-evals.sh` | 2026-06-01 |
-| Harness Engineering Overhaul | ⏸️ Idle (Stable) | `scripts/verify.sh` | 2026-06-01 |
-
----
-
-## Phase: Harness Verification Streamlining & Memory Evals
-
-**Status**: ✓ Complete
-**Last Updated**: 2026-06-01
-
-### What Happened
-
-- Cleaned up legacy/deprecated skills (e.g., `genesis-mvp-planning`, `genesis-release-orchestration`) from `scripts/verify.sh`, `scripts/uninstall.sh`, and `scripts/run-evals.sh`.
-- Removed hard-coded skill name mappings (`expected_name` switch statements), enabling dynamic mapping directly based on directory names.
-- Added test coverage in `run-evals.sh` for the local bead memory commands (`remember`, `recall`, `prime`, `forget`).
-- Enforced `state-machine.md` presence in `verify_harness_skill()`.
-
-### Safe State Confirmation
-
-The harness currently passes all structural tests cleanly.
-```bash
-# Verify structure
-./scripts/verify.sh
-
-# Verify regression
-./scripts/run-evals.sh
-
-# Dry-run package integrity
-npm run pack:check
-```
-
----
-
-## Rollback Points
-
-### If A Future Harness Evolution Breaks the CLI/Environment
-
-**Rollback Level 1: Last Stable Run (Current State)**
-If a new change to `bin/genesis-harness.js` or `scripts/verify.sh` creates infinite loops or immediate failures:
-```bash
-git checkout -- bin/genesis-harness.js scripts/verify.sh scripts/run-evals.sh
-npm install
-./scripts/verify.sh
-```
-
-**Rollback Level 2: Full Repository Reset**
-If tests are failing in a manner that contaminates local fixtures or memory:
-```bash
-git reset --hard HEAD
-git clean -fd
-npm install
-./scripts/verify.sh
-```
-
----
-
-## Checklist: Before Pausing Work on Harness Evolutions
-
-- [ ] `scripts/verify.sh` passing cleanly (Exit Code 0)
-- [ ] `scripts/run-evals.sh` passing cleanly (Exit Code 0)
-- [ ] Script files verified for POSIX/LF line endings
-- [ ] No uncommitted changes in core scripts that break existing workflows
-- [ ] `.codebase/CURRENT_STATE.md` updated with exact phase details
+A reverse-chronological log of stable states to return to if the current task corrupts the project.
 
 ---
 
-##
-**
-**
+## 2026-06-12: End-to-End Project Lifecycle
+- **Status**: Stable
+- **Git State**: Multi-feature orchestration, project verification, release-ready handoff, append-only events, and lifecycle audit are implemented and covered by integration tests.
+- **Why it's stable**: `cli-smoke.test.js` exercises idea bootstrap through two feature completions, project proof, final completion, idempotency, event history, and audit; the canonical `verify-gate` passes and is now the single completion gate.
+- **How to recover**: Use `.runs/<session-id>/STATE.json`, `RESUME.md`, and `EVENTS.jsonl`; run `pipeline-audit` before resuming the command reported by `next`.
+- **Files changed**: CLI, lifecycle state machine, project registry contract, pipeline fixtures/tests, orchestration skill, plugin prompt, README files, and repository memory.
+
+## 2026-06-12T16:50:17+07:00: Lifecycle Pipeline + Repository Hygiene
+- **Status**: Stable
+- **Git State**: Lifecycle and hygiene changes verified locally; tracked `node_modules/` entries are staged for removal while local installed dependencies remain available.
+- **Why it's stable**: `run --idea` creates a project feature registry, `next` resolves executable work, and `complete-feature` requires a passing command plus explicit evidence before closing state. Repository verification blocks tracked dependencies and generated package artifacts, while tarball smoke tests reject generated `scripts/bin/` binaries.
+- **How to recover**: Reapply this checkpoint if `.planning/FEATURE_REGISTRY.json` stops being generated, completion bypasses verification, evidence or metrics disappear, or `node_modules/` becomes tracked again.
+- **Files changed**: `bin/genesis-harness.js`, `scripts/check-repository-hygiene.js`, `scripts/verify.sh`, `scripts/run-evals.sh`, lifecycle contracts/fixtures/tests, README files, and `.codebase/*`.
+
+## 2026-06-10T10:05:00Z: Feature Execution Bootstrap
+- **Status**: Stable
+- **Git State**: Working tree verified after `run`/`resume` orchestration started auto-scaffolding the first execution-ready feature.
+- **Why it's stable**: `tests/integration/cli-smoke.test.js`, `scripts/verify.sh`, `scripts/run-evals.sh`, and `genesis-harness verify-gate` now cover the handoff from discovery into `IMPLEMENTATION` with `active_feature` persisted in `.runs/`.
+- **How to recover**: Reapply from this point if `run --idea` falls back to `PLANNING`, if `.planning/features/<NNN>-...` stops being created automatically, or if `resume` loses the active feature checkpoint.
+- **Files changed**: `bin/genesis-harness.js`, `tests/integration/cli-smoke.test.js`, `fixtures/pipeline/run-to-feature-execution-fixture.md`, and `.codebase/*.md`.
+
+## 2026-06-10T10:25:00Z: Typed First-Slice Contract Bootstrap
+- **Status**: Stable
+- **Git State**: Working tree verified after the first feature scaffold started emitting API/UI-specific contracts and fixtures.
+- **Why it's stable**: `run --idea` now creates `contracts/ui`, `contracts/api`, `playwright/fixtures`, and `fixtures/api` artifacts when the discovery answers imply those surfaces, and `tests/integration/cli-smoke.test.js` locks the generated paths and tailored values.
+- **How to recover**: Reapply from this point if the first feature loses typed contract scaffolding, if generated routes/endpoints regress to generic placeholders, or if `TEST_CONTRACT.md` stops referencing the generated contract paths.
+- **Files changed**: `bin/genesis-harness.js`, `tests/integration/cli-smoke.test.js`, `fixtures/pipeline/run-to-feature-execution-fixture.md`, `tests/fixtures/fixture-index.md`, and `.codebase/*.md`.
+
+## 2026-06-10T08:34:56Z: Workflow Consolidation + Trusted Publish Hardening
+- **Status**: Stable
+- **Git State**: Working tree verified after CI workflow consolidation, registry cleanup, and release-path hardening.
+- **Why it's stable**: GitHub Actions now reuse a single `verify-gate` path, release publishing expects OIDC trusted publishing with provenance, and workflow contract tests block drift back to placeholder CI logic.
+- **How to recover**: Reapply from this point if CI starts bypassing `verify-gate`, if docs-sync regains custom placeholder logic, or if npm publishing falls back to long-lived tokens and mutable CI version rewrites.
+- **Files changed**: `.github/workflows/*.yml`, `tests/unit/workflow_contracts.test.js`, `features/REGISTRY.md`, and `.codebase/*.md`.
+
+## 2026-06-10T16:45:00+07:00: Resume + Run Artifact Hardening
+- **Status**: Stable
+- **Git State**: Working tree verified after resumable run-artifact and state-invariant changes.
+- **Why it's stable**: `run` now writes per-session `.runs/<session-id>` artifacts, `resume` can backfill and report from them, and state metadata tests block stale timestamps.
+- **How to recover**: Reapply from this point if mid-project resume loses the next task, if `.runs/` stops being populated, or if `completed_at` drifts behind the active session.
+- **Files changed**: `bin/genesis-harness.js`, `.codebase/state.json`, `.codebase/*.md`, `scripts/run-evals.sh`, and CLI/unit test coverage.
+
+## 2026-06-10T14:20:00+07:00: Auto-init + Discovery Bootstrap
+- **Status**: Stable
+- **Git State**: Working tree verified after init bootstrap changes.
+- **Why it's stable**: `tests/integration/cli-smoke.test.js`, `scripts/verify.sh`, `scripts/run-evals.sh` (with temporary npm cache override), and `npm run pack:check` pass.
+- **How to recover**: Reapply from this point if init stops creating `.planning/INIT_QA.md`, `01-discovery-and-qa`, or `.codebase/PHASE_DEPENDENCY_MAP.md`.
+- **Files changed**: `bin/genesis-harness.js`, `init-planning.sh`, `genesis-harness` skill routing docs, and init smoke coverage.
+
+## 2026-06-10T14:55:00+07:00: Idea-Seeded Planner Bootstrap
+- **Status**: Stable
+- **Git State**: Working tree verified after brief-to-planning bootstrap changes.
+- **Why it's stable**: `init --idea "<brief>"` now fills planning docs and planner state, and verification still passes on `cli-smoke`, `verify.sh`, `run-evals.sh`, and `pack:check`.
+- **How to recover**: Reapply from this point if user brief content stops propagating into `PROJECT.md`, `REQUIREMENTS.md`, `STACK.md`, `SUMMARY.md`, or `.codebase/state.json`.
+- **Files changed**: `bin/genesis-harness.js`, `init-planning.sh`, `genesis-harness` prompt/routing docs, and init smoke coverage.
+
+## 2026-06-10T15:05:00+07:00: Runtime Pipeline + Verification Hardening
+- **Status**: Stable
+- **Git State**: Working tree verified after runtime pipeline, gate hardening, and metadata drift fixes.
+- **Why it's stable**: `genesis-harness run --idea ... --yes` now advances a blank repo into planning with persisted discovery answers, and `verify-gate` now matches the required completion contract.
+- **How to recover**: Reapply from this point if `run` stops filling planning docs, if `verify-gate` stops executing evals/docs/pack checks, or if plugin/package/state metadata drift returns.
+- **Files changed**: `bin/genesis-harness.js`, `init-planning.sh`, `.codex-plugin/plugin.json`, `.codebase/*.md`, `.codebase/state.json`, `scripts/run-evals.sh`, and CLI/unit/integration tests.
+
+## 2026-06-03T09:55:00+07:00: Full Score Harness Fix (110/110)
+- **Status**: Stable
+- **Git State**: Everything committed + new features added.
+- **Why it's stable**: All tests (`tests/unit/*.test.js`), `verify.sh`, `run-evals.sh`, and `cold-start-check.js` pass with exit code 0.
+- **How to recover**: `git reset --hard HEAD` (assuming commit happens immediately after this)
+- **Files added**: `features/REGISTRY.md`, `scripts/cold-start-check.js`, `scripts/check-scope.sh`, observability schemas/samples.
+
+## 2026-06-03T09:30:00+07:00: LeanCTX + CLI Postinstall Seed
+- **Status**: Stable
+- **Why it's stable**: `npm run verify` and `npm run eval` pass. `context-policy.json` successfully bootstrapped.
+- **How to recover**: Return to commit before the evaluation score fixes.
+
+## 2026-06-03T08:35:00+07:00: Harness Drift Gate Hardening
+- **Status**: Stable
+- **Why it's stable**: `npm run verify`, `npm run eval`, and `npm run pack:check` all pass.
+- **How to recover**: Revert to branch state before LeanCTX introduction.
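The 2026-06-12 recovery point above names `.runs/<session-id>/EVENTS.jsonl` as an append-only event history to consult before resuming. As a minimal sketch of how such a log could be read, assuming it is plain JSON Lines (one JSON object per line), the helper below parses the text and exposes the most recent record. The function name `parseEventLog` and the event fields `type` and `at` are hypothetical; the real event schema is not shown in this diff.

```javascript
// Hypothetical sketch: not part of genesis-harness. The "type" and "at"
// fields are illustrative only; the actual EVENTS.jsonl schema is unknown.
function parseEventLog(text) {
  const events = [];
  text.split("\n").forEach((line, index) => {
    if (line.trim() === "") return; // tolerate blank lines and a trailing newline
    try {
      events.push(JSON.parse(line));
    } catch (err) {
      // Report the 1-based line so a corrupted append is easy to locate.
      throw new Error(`line ${index + 1} is not valid JSON: ${err.message}`);
    }
  });
  return events;
}

// Sample input standing in for the contents of an EVENTS.jsonl file.
const sample =
  '{"type":"feature_completed","at":"2026-06-12T09:00:00Z"}\n' +
  '{"type":"project_verified","at":"2026-06-12T09:05:00Z"}\n';

const events = parseEventLog(sample);
const lastEvent = events[events.length - 1];
console.log(lastEvent.type); // prints "project_verified"
```

Because the log is append-only, the last parsed record is the natural starting point when deciding what to resume; failing loudly on a malformed line is a design choice of this sketch, not documented harness behavior.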
package/.codebase/TEST_MATRIX.md CHANGED
@@ -2,10 +2,15 @@
 
 Required checks:
 
-- `scripts/verify.sh`: repository harness structure, skill metadata, contracts, fixtures,
-- `scripts/run-evals.sh`: install/verify/uninstall regression checks.
+- `scripts/verify.sh`: repository harness structure, skill metadata, contracts, fixtures, harness smoke test, and `SKILL.md` progressive-disclosure line limit.
+- `scripts/run-evals.sh`: install/verify/uninstall regression checks, manifest route checks, sync-generated Mermaid relationship checks, hook docs-gate checks, LeanCTX policy checks, handoff/state freshness checks, `tests/unit/*.test.js`, and `tests/integration/*.test.js`.
+- `tests/integration/cli-smoke.test.js`: package CLI smoke for install/postinstall LeanCTX seeding, deterministic idea bootstrap, resumable run artifacts, multi-feature queue promotion, evidence-gated feature completion, project-wide verification, release-ready handoff, idempotent project completion, append-only events, lifecycle audit, metrics, and observability output.
+- `tests/unit/repository_hygiene.test.js`: prevents tracked dependency or generated artifacts from returning.
+- `tests/unit/prompt_sentinel.test.js`: LeanCTX-backed prompt sentinel threshold and truncation behavior.
+- `tests/unit/state_metadata.test.js`: keeps `.codebase/CURRENT_STATE.md` and `.codebase/state.json` aligned for session, TTFV, legal state enums, and non-stale completion timestamps.
+- `tests/unit/verify_gate.test.js`: verifies that `verify-gate` includes evals, docs-gate, pack dry-run, and LeanCTX checks.
+- `tests/unit/workflow_contracts.test.js`: verifies CI workflows delegate to the reusable `verify-gate` path, pin critical actions, and use trusted npm publishing with provenance.
 - `npm run pack:check`: package contents dry-run.
 - Skill validation: run `quick_validate.py` for changed skills when available.
 
 Feature rule: add or update fixtures and expected output before implementation.
-
package/.codebase/VISUAL_GRAPH.md
@@ -0,0 +1,127 @@
+# Visual Project Graph
+
+## Harness Relationship Map
+
+```mermaid
+flowchart LR
+manifest[".codex-plugin/plugin.json"] --> skills[".codex/skills/*"]
+package["package.json"] --> cli["bin/genesis-harness.js"]
+package --> verify["scripts/verify.sh"]
+package --> evals["scripts/run-evals.sh"]
+cli --> install["install / postinstall"]
+cli --> hooks["setup-hooks"]
+hooks --> docsgate["genesis-harness docs-gate"]
+docsgate --> docsync["check-docs-sync.sh"]
+docsgate --> specsync["check-spec-changelog.sh"]
+skills --> contracts["contracts/"]
+skills --> fixtures["fixtures/"]
+skills --> tests["tests/ + playwright/"]
+skills --> memory[".codebase/"]
+verify --> skills
+verify --> contracts
+verify --> fixtures
+verify --> memory
+evals --> install
+evals --> cli
+evals --> unit["tests/unit/*.test.js"]
+evals --> integration["tests/integration/*.test.js"]
+evals --> pack["npm pack smoke"]
+```
+
+## Skill Workflow Relationships
+
+```mermaid
+flowchart TD
+harness["genesis-harness"] --> planning["genesis-planning"]
+harness --> research["genesis-research-first"]
+planning --> architecture["genesis-architecture"]
+planning --> api["genesis-api-contract"]
+planning --> design["genesis-design-spec"]
+api --> apisync["genesis-api-sync"]
+design --> ui["genesis-ui-ux-test"]
+api --> specimpact["spec-impact-engine"]
+specimpact --> specprop["genesis-spec-propagation"]
+specprop --> docs["genesis-docs-automation"]
+ui --> verifybefore["genesis-verification-before-completion"]
+apisync --> verifybefore
+docs --> verifybefore
+verifybefore --> release["genesis-release"]
+harness --> memorymap["genesis-codebase-map"]
+harness --> observability["genesis-observability-automation"]
+```
+
+## Code Dependency Hints
+
+```mermaid
+flowchart TD
+"tests/integration/cli-smoke.test.js" --> "assert"
+"tests/integration/cli-smoke.test.js" --> "fs"
+"tests/integration/cli-smoke.test.js" --> "os"
+"tests/integration/cli-smoke.test.js" --> "path"
+"tests/integration/cli-smoke.test.js" --> "child_process"
+"tests/unit/contract_integrity_gate.test.js" --> "assert"
+"tests/unit/contract_integrity_gate.test.js" --> "fs"
+"tests/unit/contract_integrity_gate.test.js" --> "path"
+"tests/unit/contract_integrity_gate.test.js" --> "child_process"
+"tests/unit/healing_telemetry.test.js" --> "assert"
+"tests/unit/healing_telemetry.test.js" --> "fs"
+"tests/unit/healing_telemetry.test.js" --> "path"
+"tests/unit/healing_telemetry.test.js" --> "child_process"
+"tests/unit/prompt_sentinel.test.js" --> "assert"
+"tests/unit/prompt_sentinel.test.js" --> "fs"
+"tests/unit/prompt_sentinel.test.js" --> "path"
+"tests/unit/prompt_sentinel.test.js" --> "child_process"
+"tests/unit/spec_visual_sync.test.js" --> "assert"
+"tests/unit/spec_visual_sync.test.js" --> "fs"
+"tests/unit/spec_visual_sync.test.js" --> "path"
+"tests/unit/spec_visual_sync.test.js" --> "child_process"
+"tests/unit/test_generator.test.js" --> "assert"
+"tests/unit/test_generator.test.js" --> "fs"
+"tests/unit/test_generator.test.js" --> "path"
+"tests/unit/test_generator.test.js" --> "child_process"
+"bin/genesis-harness.js" --> "fs"
+"bin/genesis-harness.js" --> "path"
+"bin/genesis-harness.js" --> "child_process"
+"bin/genesis-harness.js" --> "@babel/parser"
+"bin/genesis-harness.js" --> "@babel/traverse"
+"bin/genesis-harness.js" --> "child_process"
+```
+
+## .planning/ROADMAP.md Derived Feature Status
+
+```mermaid
+graph TD
+classDef completed fill:#d4edda,stroke:#28a745,stroke-width:2px;
+classDef inprogress fill:#fff3cd,stroke:#ffc107,stroke-width:2px;
+classDef pending fill:#e2e3e5,stroke:#6c757d,stroke-width:2px;
+subgraph Role_0 ["Role: User"]
+Task0["Roadmap task 0"]
+class Task0 completed;
+Task1["Roadmap task 1"]
+class Task1 inprogress;
+Task2["Roadmap task 2"]
+class Task2 pending;
+end
+subgraph Role_1 ["Role: Admin"]
+Task3["Roadmap task 3"]
+class Task3 completed;
+Task4["Roadmap task 4"]
+class Task4 pending;
+Task5["Roadmap task 5"]
+class Task5 inprogress;
+end
+subgraph Role_2 ["Role: Analytics"]
+Task6["Roadmap task 6"]
+class Task6 pending;
+Task7["Roadmap task 7"]
+class Task7 pending;
+Task8["Roadmap task 8"]
+class Task8 inprogress;
+end
+Task0 --> Task1
+Task0 --> Task2
+Task2 --> Task4
+Task2 --> Task5
+Task4 --> Task6
+```
+
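In the "Code Dependency Hints" graph above, the edge `"bin/genesis-harness.js" --> "child_process"` appears twice. The generator that produced the graph is not part of this diff, so the following is only a sketch, under the assumption that edges are emitted from a file-to-modules map, of how repeated imports could be collapsed before writing the Mermaid block. The function name `toMermaidEdges` and the input shape are hypothetical.

```javascript
// Hypothetical sketch: not the harness's actual generator.
// Input: an object mapping a source file to the module names it requires.
function toMermaidEdges(deps) {
  const seen = new Set();
  const lines = ["flowchart TD"];
  for (const [file, modules] of Object.entries(deps)) {
    for (const mod of modules) {
      // Same quoted-node edge style as the graph in the diff.
      const edge = `"${file}" --> "${mod}"`;
      if (seen.has(edge)) continue; // skip a repeated import of the same module
      seen.add(edge);
      lines.push(edge);
    }
  }
  return lines.join("\n");
}

const graph = toMermaidEdges({
  "bin/genesis-harness.js": [
    "fs",
    "path",
    "child_process",
    "@babel/parser",
    "@babel/traverse",
    "child_process", // duplicate, as in the generated file
  ],
});
console.log(graph.split("\n").length); // prints 6 (header plus five unique edges)
```

Keying the `Set` on the rendered edge string keeps the first occurrence and preserves the original ordering, which keeps regenerated files stable across runs.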
package/.codebase/context-policy.json
@@ -0,0 +1,68 @@
+{
+"name": "leanctx-default",
+"token_budget": 12000,
+"_comment_token_budget": "Default conservative budget. Override in project .codebase/context-policy.json. For 128k-context models, set to 40000.",
+"auto_scale": {
+"enabled": false,
+"note": "Set enabled=true and provide model_context_window to auto-calculate budgets. Formula: token_budget = model_context_window * 0.09 (9% for harness context leaves 91% for generation).",
+"model_context_window": null,
+"scale_factor": 0.09
+},
+"warn_at": 0.6,
+"compact_at": 0.7,
+"hard_stop_at": 0.85,
+"layers": [
+{
+"name": "core",
+"max_tokens": 2500,
+"include": [
+"AGENTS.md",
+".codex/SOUL.md",
+".codebase/CURRENT_STATE.md",
+".codebase/MODULE_INDEX.md",
+".codebase/TEST_MATRIX.md"
+]
+},
+{
+"name": "active_context",
+"max_tokens": 6500,
+"include": [
+".codebase/COMPRESSED_CONTEXT.md",
+".codebase/VISUAL_GRAPH.md",
+".planning/STATE.md",
+".planning/ROADMAP.md",
+"contracts/",
+"fixtures/"
+]
+},
+{
+"name": "deferred_reference",
+"max_tokens": 3000,
+"include": [
+".codex/skills/*/references/",
+".codex/skills/*/playbooks/",
+".codex/skills/*/checklists/",
+"README*.md"
+]
+}
+],
+"defer_patterns": [
+".codex/skills/*/templates/**",
+".codex/skills/*/examples/**",
+"playwright/**",
+"observability/**",
+"node_modules/**",
+"dist/**",
+"coverage/**"
+],
+"portable_commands": [
+"genesis-harness leanctx",
+"genesis-harness sync",
+"genesis-harness docs-gate",
+"genesis-harness verify-gate",
+"npm run verify",
+"npm run eval",
+"node scripts/cold-start-check.js"
+],
+"wrapper_policy": "rtk optional when installed locally; public docs and CI must use portable commands."
+}
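The `leanctx-default` policy above states a scaling formula (`token_budget = model_context_window * 0.09`) and three usage thresholds (`warn_at`, `compact_at`, `hard_stop_at`). The sketch below shows one way those fields could be interpreted. It is illustrative only: the helper names `resolveBudget` and `classifyUsage` are hypothetical, and treating each threshold as an inclusive ratio of the resolved budget is an assumption, since the harness's actual enforcement code is not shown here.

```javascript
// Hypothetical sketch: field names mirror the policy file, but the logic is
// an assumed reading of it, not the harness implementation.
function resolveBudget(policy) {
  const scale = policy.auto_scale || {};
  // Auto-scaling only applies when enabled and a numeric window is provided.
  if (scale.enabled && Number.isFinite(scale.model_context_window)) {
    return Math.round(scale.model_context_window * scale.scale_factor);
  }
  return policy.token_budget;
}

function classifyUsage(policy, usedTokens) {
  const ratio = usedTokens / resolveBudget(policy);
  if (ratio >= policy.hard_stop_at) return "hard_stop";
  if (ratio >= policy.compact_at) return "compact";
  if (ratio >= policy.warn_at) return "warn";
  return "ok";
}

// Values copied from the policy in the diff (layer include lists omitted).
const policy = {
  token_budget: 12000,
  auto_scale: { enabled: false, model_context_window: null, scale_factor: 0.09 },
  warn_at: 0.6,
  compact_at: 0.7,
  hard_stop_at: 0.85,
  layers: [
    { name: "core", max_tokens: 2500 },
    { name: "active_context", max_tokens: 6500 },
    { name: "deferred_reference", max_tokens: 3000 },
  ],
};

const layerTotal = policy.layers.reduce((sum, layer) => sum + layer.max_tokens, 0);
console.log(resolveBudget(policy), layerTotal); // prints 12000 12000
console.log(classifyUsage(policy, 9000)); // prints "compact" (9000 / 12000 = 0.75)
```

The three layer caps sum to exactly the default `token_budget`. With `auto_scale` enabled and a 128000-token window, the stated formula yields 11520, which differs from the 40000 suggested in `_comment_token_budget`; this sketch follows the formula in `auto_scale.note`.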
@@ -0,0 +1,63 @@
|
|
|
1
|
+
# Lessons Learned & Historical Bugs
|
|
2
|
+
|
|
3
|
+
This file chronicles the major failures, recursive bugs, and architectural dead-ends we have encountered. It acts as an immune system preventing the agent from repeating history.
|
|
4
|
+
|
|
5
|
+
## 1. Duplicate Slash Commands in Registry
|
|
6
|
+
- **Symptom**: Agent registered 4 copies of the same slash command for a single skill.
|
|
7
|
+
- **Root Cause**: The CLI script recursively scanned the entire `.codex/` directory for active skills, accidentally parsing backup folders (`.codex/backup/`) generated during skill upgrades.
|
|
8
|
+
- **Resolution**: Backup directories must ALWAYS be placed completely outside the active parsed directory (e.g., moved to `~/.codex/backups` globally).
|
|
9
|
+
- **Rule**: When doing file tree walks for plugins/skills, always explicitly ignore `.git`, `node_modules`, `backup`, and `tmp` folders.
|
|
10
|
+
|
|
11
|
+
## 2. Documentation Drift & Broken Contracts
|
|
12
|
+
- **Symptom**: Code in `scripts/` changed logic without updating `contracts/`.
|
|
13
|
+
- **Root Cause**: Agent skipped the documentation step after a "quick fix" code edit.
|
|
14
|
+
- **Resolution**: Implemented Validation Gates (`npm run verify`).
|
|
15
|
+
- **Rule**: Never finalize a code edit without explicitly checking `TEST_MATRIX.md` and related schemas in `contracts/`. The validation gate will fail the build if it detects drift.
|
|
16
|
+
|
|
17
|
+
## 3. Excessive Token Usage from `cat` and `ls`
|
|
18
|
+
- **Symptom**: Context window flooded with massive minified bundle files or deep directory trees.
|
|
19
|
+
- **Root Cause**: Using `cat` on large files or `ls -R` without filters.
|
|
20
|
+
- **Resolution**:
|
|
21
|
+
- **Rule**: Always use the native `view_file`, `list_dir`, and `grep_search` tools with precise line bounds or search terms. NEVER `cat` a file directly in bash if a native agent tool exists.

## 4. Init Must Not Depend On Explicit Slash Commands
- **Symptom**: The harness stayed idle on a blank repo until the user typed `/init`, even when the user had already provided a product idea.
- **Root Cause**: The entry skill documented `/init`, but the actual CLI/bootstrap path only exposed an explicit interactive command and did not scaffold discovery artifacts automatically.
- **Resolution**: Treat "empty repo + user idea" as implicit init in `genesis-harness` docs, and make CLI `init` call `init-planning.sh` to create Foundation, Discovery/QA, and dependency-map artifacts.
- **Rule**: For cold starts, initialize first, then ask the discovery/QA/tech-stack questions. Do not force the user to know the harness command vocabulary.
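
The "empty repo + user idea" condition can be sketched as a small predicate. Both function names are hypothetical, and the definitions of "blank" (nothing but `.git`) and "idea" (any non-empty message) are simplifying assumptions rather than the harness's actual rules.

```javascript
const fs = require('node:fs');

// Assumption: a repo counts as blank when it holds nothing but VCS metadata.
function isBlankRepo(repoDir) {
  return fs.readdirSync(repoDir).every((name) => name === '.git');
}

// Assumption: any non-empty user message is treated as a product idea.
function shouldAutoInit(repoDir, userMessage) {
  const hasIdea = typeof userMessage === 'string' && userMessage.trim().length > 0;
  return hasIdea && isBlankRepo(repoDir);
}
```

When the predicate holds, the entry point would run initialization first and only then ask the discovery questions, without waiting for `/init`.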

## 5. Auto-init Must Preserve The User Brief
- **Symptom**: Even after auto-init started running, the planner still dumped mostly `TBD` placeholders and lost the original idea unless the user repeated it.
- **Root Cause**: Initialization created structure but did not treat the first user brief as durable bootstrap input.
- **Resolution**: `genesis-harness init --idea "<brief>"` now seeds planning docs and planner state from the brief before follow-up QA begins.
- **Rule**: The first user idea is a source artifact. Persist it into planning docs and state immediately, then ask only the missing clarification questions.
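
A minimal sketch of "persist the brief, then ask only what is missing" follows. The plan shape, the `idea` field, and the `seedPlanFromBrief` name are all hypothetical; the real planner state schema is not shown in this file.

```javascript
// Hypothetical helper: stores the brief in the plan and returns the
// fields that are still `TBD`, which become the follow-up questions.
function seedPlanFromBrief(plan, brief) {
  const seeded = { ...plan, idea: brief.trim() };
  const questions = Object.keys(seeded).filter((key) => seeded[key] === 'TBD');
  return { plan: seeded, questions };
}

const { plan, questions } = seedPlanFromBrief(
  { idea: 'TBD', users: 'TBD', stack: 'Node' },
  '  Todo app for small teams  '
);
console.log(plan.idea, questions);
```

Because the brief is written before any question is asked, a later session never has to ask the user to repeat it.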

## 6. Prompt Contracts Must Match Runtime Contracts
- **Symptom**: Skill docs and plugin prompts said the harness could auto-init from an idea, but the executable runtime still depended on manual follow-up and incomplete verification gates.
- **Root Cause**: Routing docs, plugin metadata, and gate definitions evolved separately from the actual CLI control flow.
- **Resolution**: Add a deterministic `genesis-harness run --idea ... --yes` pipeline, make `verify-gate` execute the full completion bar, and add regression tests for both.
- **Rule**: Do not describe a harness behavior in prompts or memory until there is a CLI/runtime path and a regression test that enforces it.
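
One way to enforce the rule above in a regression test is to compare the commands that the docs mention against the commands the runtime exposes. This is an illustrative sketch: `findUnbackedClaims` is a hypothetical name, and how each list is collected is left open.

```javascript
// Returns documented commands that the runtime does not expose.
function findUnbackedClaims(documentedCommands, runtimeCommands) {
  const available = new Set(runtimeCommands);
  return documentedCommands.filter((cmd) => !available.has(cmd));
}

// Example: the docs mention `resume`, but the CLI only ships `init` and `run`.
console.log(findUnbackedClaims(['init', 'run', 'resume'], ['init', 'run']));
```

A non-empty result would fail the test, which keeps prompts from promising behavior that has no executable path.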

## 7. Resume Requires Durable Session Artifacts, Not Just State Labels
- **Symptom**: The harness could move into planning, but a later session still had to infer what to do next from scattered markdown because there was no canonical run checkpoint.
- **Root Cause**: `.codebase/state.json` carried phase labels, but there was no per-session artifact bundle tying brief, discovery answers, and next tasks together.
- **Resolution**: `run` now writes `.runs/<session-id>/INPUT.md`, `DISCOVERY.json`, `STATE.json`, and `RESUME.md`, and `resume` reads or backfills them from state.
- **Rule**: Any harness phase that claims resumability must emit a durable per-session artifact bundle and a deterministic resume entrypoint.
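
The bundle described above can be sketched as follows. The four file names come from the resolution; the function names, the `next_tasks` field, and the contents written to each file are assumptions for illustration only.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical writer for the per-session bundle under .runs/<session-id>/.
function writeSessionBundle(rootDir, sessionId, { brief, discovery, state }) {
  const dir = path.join(rootDir, '.runs', sessionId);
  fs.mkdirSync(dir, { recursive: true });
  fs.writeFileSync(path.join(dir, 'INPUT.md'), brief + '\n');
  fs.writeFileSync(path.join(dir, 'DISCOVERY.json'), JSON.stringify(discovery, null, 2));
  fs.writeFileSync(path.join(dir, 'STATE.json'), JSON.stringify(state, null, 2));
  // `next_tasks` is an assumed field name for the pending work list.
  const next = (state.next_tasks || []).map((task) => `- ${task}`).join('\n');
  fs.writeFileSync(path.join(dir, 'RESUME.md'), `# Resume ${sessionId}\n\n${next}\n`);
  return dir;
}

// Hypothetical resume entrypoint: reads the checkpoint instead of inferring it.
function readSessionState(rootDir, sessionId) {
  const file = path.join(rootDir, '.runs', sessionId, 'STATE.json');
  return JSON.parse(fs.readFileSync(file, 'utf8'));
}
```

Keeping the brief, the discovery answers, and the next tasks in one directory gives a later session a single place to read from.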

## 8. Discovery-Only Pipelines Still Break End-To-End Execution
- **Symptom**: `run --idea` looked complete in docs, but it stopped at "Create the first feature plan", forcing a human or later session to bridge the actual execution gap manually.
- **Root Cause**: Discovery persistence existed, but there was no runtime handoff that turned approved scope into a concrete active feature scaffold.
- **Resolution**: `run` now creates the first feature scaffold automatically, seeds spec/plan/test-contract/verification files, records `active_feature`, and advances resumable state into `IMPLEMENTATION`.
- **Rule**: A harness pipeline is not end-to-end unless it leaves the next session inside an execution-ready slice with explicit tests, contracts, and verification steps already scaffolded.
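
The handoff in the resolution can be pictured as a pure function from state to scaffold plan. `active_feature` and `IMPLEMENTATION` come from the text above; the `features/<slug>/` directory, the four file names, and the `phase` key are hypothetical.

```javascript
// Hypothetical scaffold planner: lists the files to seed and returns
// the state the next session should resume from.
function scaffoldFirstFeature(state, featureSlug) {
  const base = `features/${featureSlug}`; // assumed location
  const names = ['spec.md', 'plan.md', 'test-contract.md', 'verification.md'];
  return {
    files: names.map((name) => `${base}/${name}`),
    state: { ...state, active_feature: featureSlug, phase: 'IMPLEMENTATION' },
  };
}

console.log(scaffoldFirstFeature({ phase: 'DISCOVERY' }, 'login'));
```

Returning the file list and the new state together makes it easy to assert in a test that the pipeline never ends in a state without a matching scaffold.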

## 9. Generic Execution Scaffolds Still Leave Contract Work To Humans
- **Symptom**: Even after execution bootstrap existed, the first feature slice still began with only generic planning files, so the next agent had to invent API/UI contracts and fixtures manually.
- **Root Cause**: The runtime scaffold did not classify the first slice by surface area and did not reuse the repository's contract and fixture structure.
- **Resolution**: The bootstrap now infers `ui`, `api`, or `full-stack` from discovery answers and emits typed artifacts in `contracts/ui/<feature>/`, `contracts/api/<feature>/`, `playwright/fixtures/`, and `fixtures/api/`.
- **Rule**: If a harness claims contract-first execution, the first slice must already contain the concrete contract and fixture paths needed by the likely implementation surface.
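
A sketch of the classification step follows. The three surface labels and the four paths come from the resolution; the `hasUi`/`hasApi` answer fields, the fallback to `full-stack` when neither is known, and the pairing of each fixture directory with a surface are assumptions.

```javascript
// Hypothetical classifier over discovery answers.
function inferSurface(answers) {
  const hasUi = Boolean(answers.hasUi);
  const hasApi = Boolean(answers.hasApi);
  if (hasUi && !hasApi) return 'ui';
  if (hasApi && !hasUi) return 'api';
  // Both, or neither known: assume full-stack so no contract path is missed.
  return 'full-stack';
}

// Maps a surface to the contract and fixture paths the slice should contain.
function contractPaths(surface, feature) {
  const ui = [`contracts/ui/${feature}/`, 'playwright/fixtures/'];
  const api = [`contracts/api/${feature}/`, 'fixtures/api/'];
  if (surface === 'ui') return ui;
  if (surface === 'api') return api;
  return [...ui, ...api];
}

console.log(contractPaths(inferSurface({ hasApi: true }), 'login'));
```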

## 10. Feature Completion Is Not Project Completion
- **Symptom**: The runtime could verify one active feature, but there was no repeatable queue promotion, project-wide proof rerun, release-ready handoff, or drift audit before marking the project done.
- **Root Cause**: Feature lifecycle state and project lifecycle state were treated as the same boundary.
- **Resolution**: Added a multi-feature registry, append-only lifecycle events, project-wide verification, a distinct `RELEASE_READY` state, evidence-gated project completion, and `pipeline-audit`.
- **Rule**: A project may reach `COMPLETED` only after every feature is verified, all proof commands pass again at project scope, the handoff exists, and release or acceptance evidence is recorded.
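
The four conditions in the rule can be expressed as an evidence gate. The field names on `project` are hypothetical, and treating a project with no features as incomplete is an added assumption.

```javascript
// Hypothetical gate: returns the reasons a project may not yet be COMPLETED.
function completionBlockers(project) {
  const blockers = [];
  const features = project.features || [];
  if (features.length === 0 || !features.every((f) => f.verified)) {
    blockers.push('unverified feature');
  }
  if (!project.proofCommandsPassed) blockers.push('project-wide proof not passing');
  if (!project.handoffExists) blockers.push('handoff missing');
  if (!project.releaseEvidence) blockers.push('release or acceptance evidence missing');
  return blockers;
}

function canComplete(project) {
  return completionBlockers(project).length === 0;
}
```

Returning the list of blockers, not just a boolean, gives an audit step something concrete to report when completion is refused.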
@@ -0,0 +1,17 @@
# Developer Preferences

This file records the specific technical choices, preferences, and stylistic guidelines of the human developer for this repository. Adhere to these implicitly during code generation and problem-solving.

## Technology Stack
- **Primary Language**: JavaScript (Node.js for backend scripts). Keep code modern but compatible with Node >= 18.
- **Testing**: Use standard Unix bash testing scripts (`verify.sh`, `run-evals.sh`) and standard Node asserts for unit testing unless specified otherwise.
- **Frontend/UI**: When dealing with UI generation, prefer Vanilla CSS for precise control and maximum performance. Emphasize "WOW" factor, modern gradients, glassmorphism, and responsive layouts.

## Architectural Choices
- **Harness Engineering**: The system relies on finite-state machines (FSMs) and validation gates. Never skip a validation gate (`contract_integrity_gate.js`, `healing_telemetry.js`).
- **File Integrity**: Ensure all metadata inside `.codebase/` and `contracts/` stays perfectly synchronized with any changes to actual codebase logic (`scripts/spec_visual_sync.js`).

## Communication Style
- Be concise, professional, and skip unnecessary pleasantries when delivering technical solutions.
- Use GitHub Flavored Markdown for formatting logs, alerts, and instructions.
- Provide Vietnamese localization in READMEs and user-facing artifacts where possible, as the user frequently requests Vietnamese communication.