codex-genesis-harness 0.1.7 → 0.1.8

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (93)
  1. package/.codebase/COMPRESSED_CONTEXT.md +80 -0
  2. package/.codebase/CURRENT_STATE.md +37 -11
  3. package/.codebase/DEPENDENCY_GRAPH.md +14 -1
  4. package/.codebase/IMPLEMENTATION_HANDOFF.md +34 -336
  5. package/.codebase/KNOWN_PROBLEMS.md +54 -3
  6. package/.codebase/MODULE_INDEX.md +8 -0
  7. package/.codebase/PIPELINE_FLOW.md +7 -5
  8. package/.codebase/RECOVERY_POINTS.md +17 -78
  9. package/.codebase/TECH_DEBT.md +6 -0
  10. package/.codebase/TEST_MATRIX.md +4 -3
  11. package/.codebase/VISUAL_GRAPH.md +127 -0
  12. package/.codebase/context-policy.json +68 -0
  13. package/.codebase/memories/lessons_learned.md +21 -0
  14. package/.codebase/memories/preferences.md +17 -0
  15. package/.codebase/state.json +45 -24
  16. package/.codex/skills/genesis-architecture/SKILL.md +5 -0
  17. package/.codex/skills/genesis-debug-guide/SKILL.md +10 -4
  18. package/.codex/skills/genesis-docs-automation/SKILL.md +52 -973
  19. package/.codex/skills/genesis-executing-plans/SKILL.md +54 -0
  20. package/.codex/skills/genesis-executing-plans/agents/openai.yaml +6 -0
  21. package/.codex/skills/genesis-executing-plans/checklists/.gitkeep +0 -0
  22. package/.codex/skills/genesis-executing-plans/examples/.gitkeep +0 -0
  23. package/.codex/skills/genesis-executing-plans/templates/.gitkeep +0 -0
  24. package/.codex/skills/genesis-harness/SKILL.md +64 -1385
  25. package/.codex/skills/genesis-harness/scripts/check-docs-sync.sh +3 -3
  26. package/.codex/skills/genesis-harness/scripts/init-planning.sh +1 -1
  27. package/.codex/skills/genesis-new-design/SKILL.md +4 -1
  28. package/.codex/skills/genesis-new-design/agents/openai.yaml +2 -0
  29. package/.codex/skills/genesis-observability-automation/SKILL.md +69 -303
  30. package/.codex/skills/genesis-observability-automation/references/common-mistakes-and-recovery.md +84 -0
  31. package/.codex/skills/genesis-observability-automation/references/workflow-phases.md +78 -0
  32. package/.codex/skills/genesis-performance-profiling/SKILL.md +1 -22
  33. package/.codex/skills/genesis-performance-profiling/agents/openai.yaml +1 -1
  34. package/.codex/skills/genesis-planning/SKILL.md +6 -1
  35. package/.codex/skills/genesis-release/SKILL.md +5 -0
  36. package/.codex/skills/genesis-research-first/SKILL.md +6 -0
  37. package/.codex/skills/genesis-spec-propagation/SKILL.md +52 -504
  38. package/.codex/skills/genesis-test-driven-development/SKILL.md +55 -0
  39. package/.codex/skills/genesis-test-driven-development/agents/openai.yaml +6 -0
  40. package/.codex/skills/genesis-test-driven-development/checklists/.gitkeep +0 -0
  41. package/.codex/skills/genesis-test-driven-development/examples/.gitkeep +0 -0
  42. package/.codex/skills/genesis-test-driven-development/templates/.gitkeep +0 -0
  43. package/.codex/skills/genesis-upgrade-design/SKILL.md +4 -2
  44. package/.codex/skills/genesis-upgrade-design/agents/openai.yaml +2 -0
  45. package/.codex/skills/genesis-using-git-worktrees/SKILL.md +54 -0
  46. package/.codex/skills/genesis-using-git-worktrees/agents/openai.yaml +6 -0
  47. package/.codex/skills/genesis-using-git-worktrees/checklists/.gitkeep +0 -0
  48. package/.codex/skills/genesis-using-git-worktrees/examples/.gitkeep +0 -0
  49. package/.codex/skills/genesis-using-git-worktrees/templates/.gitkeep +0 -0
  50. package/.codex/skills/genesis-verification-before-completion/SKILL.md +53 -0
  51. package/.codex/skills/genesis-verification-before-completion/agents/openai.yaml +6 -0
  52. package/.codex/skills/genesis-verification-before-completion/checklists/.gitkeep +0 -0
  53. package/.codex/skills/genesis-verification-before-completion/examples/.gitkeep +0 -0
  54. package/.codex/skills/genesis-verification-before-completion/templates/.gitkeep +0 -0
  55. package/.codex/skills/spec-impact-engine/SKILL.md +77 -500
  56. package/.codex/skills/spec-impact-engine/checklists/checklist.md +10 -0
  57. package/.codex-plugin/plugin.json +3 -4
  58. package/CHANGELOG.md +4 -1
  59. package/README.EN.md +32 -17
  60. package/README.VI.md +35 -19
  61. package/README.md +48 -10
  62. package/VERSION +1 -1
  63. package/bin/genesis-harness.js +735 -5
  64. package/contracts/features/registry-schema.json +15 -0
  65. package/contracts/observability/agent-run-schema.json +34 -0
  66. package/contracts/observability/failure-schema.json +35 -0
  67. package/contracts/ui/auth/login-screen-contract.json +43 -0
  68. package/features/REGISTRY.md +63 -0
  69. package/features/SCOPE-template.md +65 -0
  70. package/fixtures/planning/MOCKUP_PROMPT_TEMPLATE.md +16 -0
  71. package/observability/agent-runs/sample-run.json +13 -0
  72. package/observability/decision-logs/sample-decision.md +43 -0
  73. package/observability/failures/sample-failure.json +12 -0
  74. package/package.json +9 -3
  75. package/playwright/e2e/app-template.spec.js +37 -0
  76. package/playwright/e2e/auth/login-screen.spec.js +65 -0
  77. package/playwright/e2e/web-template.spec.js +28 -0
  78. package/scripts/check-scope.sh +100 -0
  79. package/scripts/cold-start-check.js +133 -0
  80. package/scripts/install.sh +4 -0
  81. package/scripts/prompt_sentinel.js +35 -4
  82. package/scripts/run-evals.sh +119 -3
  83. package/scripts/scratch_parser.js +49 -0
  84. package/scripts/spec_visual_sync.js +1 -1
  85. package/scripts/test_generator.js +2 -2
  86. package/scripts/uninstall.sh +4 -0
  87. package/scripts/verify.sh +16 -1
  88. package/tests/integration/cli-smoke.test.js +103 -0
  89. package/tests/unit/feature_registry.test.js +152 -0
  90. package/tests/unit/prompt_sentinel.test.js +1 -1
  91. package/tests/unit/spec_visual_sync.test.js +1 -1
  92. package/tests/unit/test_generator.test.js +1 -1
  93. package/playwright/e2e/e2e-template.md +0 -4
@@ -0,0 +1,78 @@
+ # Observability — Phase-by-Phase Workflow
+
+ Detailed documentation for each phase of the observability rollout.
+ Referenced from `genesis-observability-automation/SKILL.md` → `## Workflow Detail: Phase-by-Phase Execution`.
+
+ ---
+
+ ## Phase 1: Observability Architecture Generation
+
+ **Goal**: Design and document the entire observability topology before writing any config.
+
+ ### Architecture components
+
+ | Pillar | Component | Purpose |
+ |--------|-----------|---------|
+ | Metrics | Prometheus / Datadog Agent | Scrape and store numeric time-series |
+ | Metrics | Grafana / Datadog Dashboards | Visualize and alert on metrics |
+ | Logs | Structured logging library | Produce machine-readable log events |
+ | Logs | Log aggregator (Loki/ELK/CloudWatch) | Collect and index logs |
+ | Logs | Kibana/Grafana/Datadog | Search and visualize logs |
+ | Traces | OpenTelemetry SDK | Instrument service for tracing |
+ | Traces | Jaeger/Zipkin/Datadog APM | Collect and visualize traces |
+
+ ### Service instrumentation by language
+
+ - **Node.js**: `prom-client` (metrics), `winston`/`pino` (structured logs), `@opentelemetry/sdk-node` (traces).
+ - **Python**: `prometheus_client` (metrics), `structlog`/`python-json-logger` (logs), `opentelemetry-sdk` (traces).
+ - **Go**: `prometheus/client_golang` (metrics), `zap`/`logrus` (logs), `go.opentelemetry.io/otel` (traces).
+
+ ---
+
+ ## Phase 2: Dashboard Generation
+
+ **Required panels (RED metrics)**:
+ - **Rate**: Requests per second (total and per endpoint)
+ - **Errors**: Error rate percentage (4xx and 5xx separately)
+ - **Duration**: Response time as histogram with p50, p95, p99
+
+ **SATURATION metrics**:
+ - **CPU**: Process CPU utilization %
+ - **Memory**: Heap and RSS memory
+ - **Connection pool**: Active connections vs. pool limit
+ - **Queue depth**: Job queue length (background workers)
+
+ See `templates/monitoring-dashboard-template.md` for complete Grafana JSON scaffold.
+
+ ---
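As an illustration of the RED panels listed in Phase 2, the following sketch derives the request rate, separate 4xx/5xx error percentages, and p50/p95/p99 latencies from raw request samples. The sample shape (`status`, `durationMs`), the function names, and the nearest-rank percentile method are assumptions made for this example and do not come from the package; a Prometheus-backed dashboard would normally estimate percentiles from histogram buckets rather than raw samples.

```javascript
// Nearest-rank percentile over an ascending-sorted array of numbers.
function percentile(sorted, p) {
  if (sorted.length === 0) return NaN;
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Summarize one observation window into Rate, Errors, and Duration.
// `samples` is a hypothetical array of { status, durationMs } objects.
function redSummary(samples, windowSeconds) {
  const total = samples.length;
  const count4xx = samples.filter((s) => s.status >= 400 && s.status < 500).length;
  const count5xx = samples.filter((s) => s.status >= 500).length;
  const durations = samples.map((s) => s.durationMs).sort((a, b) => a - b);
  const pct = (n) => (total === 0 ? 0 : (n * 100) / total);
  return {
    requestsPerSecond: total / windowSeconds,
    errorRate4xxPct: pct(count4xx),
    errorRate5xxPct: pct(count5xx),
    p50: percentile(durations, 50),
    p95: percentile(durations, 95),
    p99: percentile(durations, 99),
  };
}

// Ten requests observed over a 5-second window (made-up data).
const samples = [
  { status: 200, durationMs: 10 },
  { status: 200, durationMs: 20 },
  { status: 200, durationMs: 30 },
  { status: 200, durationMs: 40 },
  { status: 404, durationMs: 50 },
  { status: 200, durationMs: 60 },
  { status: 200, durationMs: 70 },
  { status: 200, durationMs: 80 },
  { status: 500, durationMs: 90 },
  { status: 200, durationMs: 100 },
];

console.log(redSummary(samples, 5));
```

Keeping 4xx and 5xx as separate fields mirrors the "separately" requirement on the Errors panel, since client errors and server errors usually call for different responses.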
+
+ ## Phase 3: Alerting Policy Generation
+
+ **SLO-based alert thresholds (99.9% availability = 43.8 min/month error budget)**:
+
+ ```
+ Fast burn (1h): error_rate > 2% → P1 page immediately
+ Medium burn (6h): error_rate > 0.5% → P2 business hours
+ Slow burn (3d): error_rate > 0.1% → Slack + ticket
+ ```
+
+ See `templates/alerting-policy-template.md` for complete Prometheus alerting rules.
+
+ ---
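The arithmetic behind the Phase 3 thresholds can be checked with a short sketch. The function names are hypothetical and the tier table simply restates the three thresholds quoted in the hunk. The 43.8 min figure is consistent with 0.1% of an average-length month (365.25 / 12 days); a 30-day month would give 43.2 min.

```javascript
// Minutes of allowed unavailability for an availability SLO over a period.
function errorBudgetMinutes(sloPercent, periodDays) {
  const periodMinutes = periodDays * 24 * 60;
  return periodMinutes * (1 - sloPercent / 100);
}

// Burn rate: how many times faster than "exactly on budget" errors occur.
function burnRate(errorRatePercent, sloPercent) {
  return errorRatePercent / (100 - sloPercent);
}

// Map an observed error rate in a window to the action quoted for that tier.
// Returns null when the threshold for the window is not exceeded.
function classifyBurn(window, errorRatePercent) {
  const tiers = {
    '1h': { threshold: 2, action: 'P1 page immediately' },
    '6h': { threshold: 0.5, action: 'P2 business hours' },
    '3d': { threshold: 0.1, action: 'Slack + ticket' },
  };
  const tier = tiers[window];
  if (!tier) throw new Error(`unknown window: ${window}`);
  return errorRatePercent > tier.threshold ? tier.action : null;
}

console.log(errorBudgetMinutes(99.9, 365.25 / 12).toFixed(1)); // average month
console.log(errorBudgetMinutes(99.9, 30).toFixed(1));          // 30-day month
console.log(classifyBurn('1h', 2.5));
```

Under a 99.9% SLO, the three thresholds correspond to burn rates of roughly 20, 5, and 1, so the slow-burn tier fires as soon as errors exceed the budgeted rate.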
+
+ ## Phase 4: Health Check Automation
+
+ **Standard health endpoint specification**:
+ - `GET /health` → 200 always (load balancer basic routing)
+ - `GET /readiness` → 200 if dependencies healthy, 503 if not
+ - `GET /liveness` → 200 if process alive + event loop not stuck
+ - `GET /metrics` → Prometheus text format
+
+ ---
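The Phase 4 endpoint contract could be sketched as a pure routing function, kept free of any HTTP framework so it is easy to test. Everything here is hypothetical: the function name, the `deps` map of dependency health flags, and the `eventLoopLagMs`/`maxLagMs` inputs are illustrative assumptions. The hunk does not say what `/liveness` returns on failure, so the 503 used below is also an assumption.

```javascript
// Decide the response for a health-related path.
// `deps` maps a dependency name to a boolean "healthy" flag (assumed shape).
// `eventLoopLagMs` is assumed to be measured elsewhere by the caller.
function healthResponse(path, options = {}) {
  const { deps = {}, eventLoopLagMs = 0, maxLagMs = 1000, metricsText = '' } = options;
  switch (path) {
    case '/health':
      // Always 200: used only for basic load balancer routing.
      return { status: 200, body: 'ok' };
    case '/readiness': {
      const failing = Object.keys(deps).filter((name) => !deps[name]);
      return failing.length === 0
        ? { status: 200, body: 'ready' }
        : { status: 503, body: `not ready: ${failing.join(', ')}` };
    }
    case '/liveness':
      // 503 on a stalled event loop is an assumption, not from the source.
      return eventLoopLagMs <= maxLagMs
        ? { status: 200, body: 'alive' }
        : { status: 503, body: 'event loop stalled' };
    case '/metrics':
      // Content type of the Prometheus text exposition format.
      return { status: 200, contentType: 'text/plain; version=0.0.4', body: metricsText };
    default:
      return { status: 404, body: 'not found' };
  }
}

console.log(healthResponse('/readiness', { deps: { db: true, cache: false } }));
```

Separating readiness from liveness lets an orchestrator stop routing traffic to an instance whose dependencies are down without restarting a process that is otherwise alive.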
+
+ ## Phase 5: Incident Response Runbook Generation
+
+ **Runbook structure requirements** (every runbook must have):
+ Severity definition → Detection signals → Triage steps (with commands) → Escalation triggers → Resolution steps → Rollback procedure → Communication templates → Post-mortem checklist.
+
+ See `playbooks/incident-triage-playbook.md` for complete P0/P1/P2/P3 runbooks.
@@ -486,25 +486,4 @@ Default thresholds:
 
  **Goal**: Produce a prioritized, actionable list of optimizations ranked by expected impact vs implementation effort.
 
- **Recommendation template:**
-
- ```markdown
- ### [BOTTLENECK-001] Slow database query on /api/users (N+1 pattern)
-
- **Evidence**: EXPLAIN ANALYZE shows sequential scan on `users` table (150,000 rows).
- DB query time = 145 ms (81% of total response time).
- Identified via: slow query log + pg_stat_statements.
-
- **Recommended fix**: Add composite index on (tenant_id, status, created_at).
- Fix N+1 ORM query pattern: use eager loading (`include: ['profile']`).
-
- **Estimated impact**: HIGH — Expected p95 improvement: 100–140 ms (55–78% reduction).
-
- **Implementation complexity**: EASY — Index creation: 1 migration file.
- ORM fix: 3 lines of code change.
-
- **Validation method**: Re-run baseline after migration. Confirm p95 ≤ 80 ms.
- Run regression-detection phase against new baseline.
-
- **Risk**: Index creation on large table requires `CREATE INDEX CONCURRENTLY` to avoid table lock.
- ```
+ Use `templates/performance-report-template.md` for recommendation shape and include evidence, fix, impact, complexity, validation, and risk for each bottleneck.
@@ -3,4 +3,4 @@ interface:
  short_description: "Automate performance baseline, profiling, and load testing"
  default_prompt: "Use $genesis-performance-profiling to establish baseline and identify bottlenecks."
  policy:
- allow_implicit_invocation: false
+ allow_implicit_invocation: true
@@ -44,6 +44,11 @@ Planning implementation before tests, omitting fixtures, and leaving verificatio
  ## Recovery workflow
  If a plan becomes invalid, update the failing test and fixture first, then revise phases.
 
+ ## MCP Automation Requirements
+
+ To ensure plans are grounded in long-term project context and not just short-term conversation history, you **MUST** use the following MCP server:
+ 1. **`@modelcontextprotocol/server-memory`**: Use this MCP tool to automatically query the Knowledge Graph for past bugs, developer preferences, and architectural rules *before* drafting the implementation phases. Any new plan must respect the persistent memory graph.
+
  ---
 
  ## 🏗️ 5-Phase MVP Roadmap Standards
@@ -58,7 +63,7 @@ When decomposing large-scale features or initializing new project modules, you m
  - Auto-generate test skeletons in `tests/integration/` (RED state).
  3. **Phase 3: Logic Implementation & Self-Healing** (Template: `phase-3-features.md`)
  - Implement minimal code handlers, controllers, or service endpoints.
- - Execute closed-loop Verify-Fix cycles autonomously until all tests turn GREEN.
+ - Execute closed-loop Verify-Fix cycles autonomously until all tests turn GREEN. **CRITICAL**: Do NOT just run `npm test`. You MUST run `npx genesis-harness heal "<test-command>"`. If the test fails, this command will output an `[AGENT_DIRECTIVE]`. You must read the directive and automatically fix the code until it passes without stopping.
  4. **Phase 4: Multi-Agent & External Integrations** (Template: `phase-4-integrations.md`)
  - Wire up external components, API sync triggers, or media providers.
  5. **Phase 5: E2E Visual QA & Release Readiness** (Template: `phase-5-readiness.md`)
@@ -44,6 +44,11 @@ Publishing cache files, missing executable scripts, and omitting npm 2FA/token r
  ## Recovery workflow
  If publish fails, capture error, update known problems, do not change version until root cause is fixed.
 
+ ## MCP Automation Requirements
+
+ To ensure zero human error in releases and tags, you **MUST** use the following MCP server:
+ 1. **`@modelcontextprotocol/server-github`**: Use this MCP tool to automatically retrieve the list of closed Pull Requests since the last release tag to draft the `CHANGELOG.md` and Release Notes. You must also use it to automatically create the Git Tag and GitHub Release via the API. Do NOT ask the user to do this manually in the browser.
+
  ---
 
  ## 🚀 Automated Release & Deployment Orchestration
@@ -84,6 +84,12 @@ Format:
  - Next Steps: What to verify?
  ```
 
+ ## MCP Automation Requirements
+
+ To prevent hallucinations and avoid manual terminal scraping, you **MUST** use the following MCP servers during research:
+ 1. **`@modelcontextprotocol/server-fetch`**: Use this MCP tool to natively fetch and read the contents of external documentation URLs or Stack Overflow threads. Do NOT guess the API structure.
+ 2. **`@modelcontextprotocol/server-github`**: Use this MCP tool to search for existing issues, pull requests, or trending repositories related to the task.
+
  ## Output
 
  Each research produces: