forge_next-0.1.1-py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (140)
  1. forge_codex/__init__.py +7 -0
  2. forge_codex/assets/__init__.py +2 -0
  3. forge_codex/assets/prompts/__init__.py +2 -0
  4. forge_codex/assets/prompts/code-review/architecture_check.md +78 -0
  5. forge_codex/assets/prompts/code-review/deep_dive.md +42 -0
  6. forge_codex/assets/prompts/code-review/diff_analysis.md +73 -0
  7. forge_codex/assets/prompts/code-review/discussion.md +48 -0
  8. forge_codex/assets/prompts/code-review/mode_selection.md +45 -0
  9. forge_codex/assets/prompts/code-review/report.md +42 -0
  10. forge_codex/assets/prompts/code-review/security_scan.md +76 -0
  11. forge_codex/assets/prompts/code-review/target_detection.md +31 -0
  12. forge_codex/assets/prompts/develop/approval.md +30 -0
  13. forge_codex/assets/prompts/develop/handoff.md +21 -0
  14. forge_codex/assets/prompts/develop/investigation.md +25 -0
  15. forge_codex/assets/prompts/develop/investigation_review.md +14 -0
  16. forge_codex/assets/prompts/develop/scope.md +38 -0
  17. forge_codex/assets/prompts/develop/solution.md +148 -0
  18. forge_codex/assets/prompts/develop/startup.md +25 -0
  19. forge_codex/assets/prompts/diagnose/analyze.md +31 -0
  20. forge_codex/assets/prompts/diagnose/decompose.md +17 -0
  21. forge_codex/assets/prompts/diagnose/define.md +34 -0
  22. forge_codex/assets/prompts/diagnose/evidence.md +26 -0
  23. forge_codex/assets/prompts/diagnose/quick_fix.md +25 -0
  24. forge_codex/assets/prompts/diagnose/report.md +26 -0
  25. forge_codex/assets/prompts/diagnose/solutions.md +22 -0
  26. forge_codex/assets/prompts/implement/branch_setup.md +42 -0
  27. forge_codex/assets/prompts/implement/documentation.md +28 -0
  28. forge_codex/assets/prompts/implement/handoff.md +33 -0
  29. forge_codex/assets/prompts/implement/integration_check.md +55 -0
  30. forge_codex/assets/prompts/implement/plan_detect.md +35 -0
  31. forge_codex/assets/prompts/implement/wave_complete.md +32 -0
  32. forge_codex/assets/prompts/implement/wave_dispatch.md +53 -0
  33. forge_codex/assets/prompts/implement/wave_review.md +58 -0
  34. forge_codex/assets/prompts/plan/approval.md +34 -0
  35. forge_codex/assets/prompts/plan/architecture.md +17 -0
  36. forge_codex/assets/prompts/plan/context.md +12 -0
  37. forge_codex/assets/prompts/plan/creation.md +29 -0
  38. forge_codex/assets/prompts/plan/handoff.md +29 -0
  39. forge_codex/assets/prompts/plan/review_loop.md +17 -0
  40. forge_codex/assets/prompts/post/code_quality.md +42 -0
  41. forge_codex/assets/prompts/post/completeness_audit.md +42 -0
  42. forge_codex/assets/prompts/post/correctness.md +53 -0
  43. forge_codex/assets/prompts/post/operational_readiness.md +74 -0
  44. forge_codex/assets/prompts/post/performance.md +72 -0
  45. forge_codex/assets/prompts/pre/codebase_alignment.md +37 -0
  46. forge_codex/assets/prompts/pre/completeness.md +41 -0
  47. forge_codex/assets/prompts/pre/feasibility.md +45 -0
  48. forge_codex/assets/prompts/pre/risk_dependencies.md +82 -0
  49. forge_codex/assets/prompts/report.md +58 -0
  50. forge_codex/assets/prompts/review/findings_aggregation.md +31 -0
  51. forge_codex/assets/prompts/review/remediation.md +35 -0
  52. forge_codex/assets/prompts/review/team_dispatch.md +30 -0
  53. forge_codex/assets/prompts/shared/discussion.md +27 -0
  54. forge_codex/assets/prompts/shared/plan_parsing.md +21 -0
  55. forge_codex/assets/prompts/test/context.md +36 -0
  56. forge_codex/assets/prompts/test/coverage_gaps.md +72 -0
  57. forge_codex/assets/prompts/test/discovery.md +60 -0
  58. forge_codex/assets/prompts/test/execution.md +67 -0
  59. forge_codex/assets/prompts/test/failure_analysis.md +58 -0
  60. forge_codex/assets/prompts/test/flow_author.md +164 -0
  61. forge_codex/assets/prompts/test/flow_context.md +9 -0
  62. forge_codex/assets/prompts/test/flow_execute.md +115 -0
  63. forge_codex/assets/prompts/test/flow_recommendation.md +140 -0
  64. forge_codex/assets/prompts/test/flow_report.md +177 -0
  65. forge_codex/assets/prompts/test/flow_scaffold.md +129 -0
  66. forge_codex/assets/prompts/test/flow_scope.md +162 -0
  67. forge_codex/assets/prompts/test/report.md +54 -0
  68. forge_codex/assets/templates/__init__.py +2 -0
  69. forge_codex/assets/templates/adr-template.md +69 -0
  70. forge_codex/assets/templates/autonomy-levels.md +99 -0
  71. forge_codex/assets/templates/beads-integration.md +80 -0
  72. forge_codex/assets/templates/brainstorming-gates.md +296 -0
  73. forge_codex/assets/templates/brainstorming.md +323 -0
  74. forge_codex/assets/templates/code-smells.md +78 -0
  75. forge_codex/assets/templates/codex-runtime.md +69 -0
  76. forge_codex/assets/templates/dashboard.md +84 -0
  77. forge_codex/assets/templates/data-analysis.md +288 -0
  78. forge_codex/assets/templates/five-why-protocol.md +97 -0
  79. forge_codex/assets/templates/handoff-protocol.md +136 -0
  80. forge_codex/assets/templates/memory-README.md +61 -0
  81. forge_codex/assets/templates/memory-protocol.md +97 -0
  82. forge_codex/assets/templates/mock-flow-types.md +529 -0
  83. forge_codex/assets/templates/parallel-dispatch.md +166 -0
  84. forge_codex/assets/templates/pre-mortem.md +78 -0
  85. forge_codex/assets/templates/review-loop.md +74 -0
  86. forge_codex/assets/templates/scoring-rubric.md +48 -0
  87. forge_codex/assets/templates/stage-approval.md +101 -0
  88. forge_codex/assets/templates/stage-document.md +109 -0
  89. forge_codex/assets/templates/stage-implement.md +90 -0
  90. forge_codex/assets/templates/stage-investigate.md +69 -0
  91. forge_codex/assets/templates/stage-plan.md +91 -0
  92. forge_codex/assets/templates/stage-review.md +115 -0
  93. forge_codex/assets/templates/stage-solution.md +79 -0
  94. forge_codex/assets/templates/systematic-debugging.md +162 -0
  95. forge_codex/assets/templates/tdd-protocol.md +213 -0
  96. forge_codex/assets/templates/user-questions.md +42 -0
  97. forge_codex/assets/templates/verification-protocol.md +219 -0
  98. forge_codex/assets/templates/writing-plans.md +166 -0
  99. forge_codex/cli.py +409 -0
  100. forge_next-0.1.1.dist-info/METADATA +297 -0
  101. forge_next-0.1.1.dist-info/RECORD +140 -0
  102. forge_next-0.1.1.dist-info/WHEEL +5 -0
  103. forge_next-0.1.1.dist-info/entry_points.txt +2 -0
  104. forge_next-0.1.1.dist-info/top_level.txt +2 -0
  105. scripts/__init__.py +0 -0
  106. scripts/code_review/__init__.py +0 -0
  107. scripts/code_review/code_review.py +415 -0
  108. scripts/develop/__init__.py +0 -0
  109. scripts/develop/develop.py +372 -0
  110. scripts/diagnose/__init__.py +0 -0
  111. scripts/diagnose/decision_matrix.py +180 -0
  112. scripts/diagnose/diagnostic_report.py +239 -0
  113. scripts/diagnose/fmea_score.py +172 -0
  114. scripts/diagnose/git_hotspots.py +229 -0
  115. scripts/diagnose/log_analyzer.py +252 -0
  116. scripts/diagnose/orchestrate.py +430 -0
  117. scripts/evaluate/__init__.py +0 -0
  118. scripts/evaluate/evaluate.py +566 -0
  119. scripts/evaluate/mode_detector.py +80 -0
  120. scripts/evaluate/plan_resolver.py +127 -0
  121. scripts/evaluate/state.py +117 -0
  122. scripts/evaluate/template_engine.py +91 -0
  123. scripts/implement/__init__.py +0 -0
  124. scripts/implement/implement.py +604 -0
  125. scripts/plan/__init__.py +0 -0
  126. scripts/plan/plan.py +512 -0
  127. scripts/shared/__init__.py +0 -0
  128. scripts/shared/findings.py +82 -0
  129. scripts/shared/orchestrator.py +1151 -0
  130. scripts/shared/report.py +81 -0
  131. scripts/shared/resume.py +482 -0
  132. scripts/shared/skill_chain.py +43 -0
  133. scripts/smoke.py +261 -0
  134. scripts/test/__init__.py +0 -0
  135. scripts/test/_cassette.py +45 -0
  136. scripts/test/_scenario_index.py +179 -0
  137. scripts/test/_sidecar.py +139 -0
  138. scripts/test/flow_types.py +264 -0
  139. scripts/test/test.py +775 -0
  140. scripts/test/test_layout.py +510 -0
@@ -0,0 +1,7 @@
+ """Forge Codex launcher package.
+
+ This package provides the `forge` CLI entrypoint and bundles prompt/template
+ assets so workflows can run against any target repo without vendoring the
+ forge-codex repository into that repo.
+ """
+
@@ -0,0 +1,2 @@
+ """Bundled runtime assets (prompts/templates)."""
+
@@ -0,0 +1,2 @@
+ """Bundled prompt templates."""
+
@@ -0,0 +1,78 @@
+ # Phase 3: Team Dispatch — Architecture Mode
+
+ Dispatch all reviewers to analyze design patterns, coupling, and SOLID principles.
+
+ ## Review Target
+
+ **Mode:** Architecture Review
+ **Target:** {{TARGET}}
+ **Quick mode:** {{QUICK_MODE}}
+
+ ## Team Assignments
+
+ {{TEAM_ASSIGNMENTS}}
+
+ ## Instructions
+
+ ### 1. Identify Scope
+
+ Read the target files/modules and build a mental model of:
+ - Module boundaries and public interfaces
+ - Dependency graph (what depends on what)
+ - Data flow patterns (how data moves through the system)
+ - Error propagation patterns
+
+ ### 2. Dispatch Reviewers in Parallel
+
+ **Architect Review — SOLID Principles:**
+ - **S** (Single Responsibility): Does each module/class have one reason to change?
+ - **O** (Open/Closed): Can behavior be extended without modifying existing code?
+ - **L** (Liskov Substitution): Are subtypes truly substitutable for their base types?
+ - **I** (Interface Segregation): Are interfaces minimal and focused?
+ - **D** (Dependency Inversion): Do modules depend on abstractions, not concretions?
+
+ **Architect Review — Coupling & Cohesion:**
+ - Afferent coupling (Ca): How many modules depend on this one?
+ - Efferent coupling (Ce): How many modules does this one depend on?
+ - Instability (I = Ce / (Ca + Ce)): Is this module stable or volatile?
+ - Cohesion: Do the elements within each module belong together?
+
+ **Security Reviewer — Architectural Security:**
+ - Are trust boundaries clearly defined?
+ - Is authentication/authorization centralized or scattered?
+ - Are there privilege escalation paths?
+ - Is sensitive data properly compartmentalized?
+
+ **QA Reviewer — Testability:**
+ - Can components be tested in isolation?
+ - Are dependencies injectable?
+ - Are there hidden dependencies (globals, singletons)?
+ - Is the test infrastructure adequate for the architecture?
+
+ **Critic — Design Smells & Code Smells:**
+ - Run code smells assessment per `templates/code-smells.md`
+ - Priority smells: God Class, Shotgun Surgery, Inappropriate Intimacy (critical); Feature Envy, Long Method, Divergent Change (warning)
+ - For each smell: cite file:line, name the smell, state the consequence, recommend the specific refactoring
+ - Check for Dependency Structure Matrix issues: cyclic dependencies between modules, layering violations, coupling clusters
+
+ **Investigator — Dependency Analysis:**
+ - Map the full dependency graph
+ - Identify circular dependencies
+ - Check for dependency inversions (concrete depends on concrete)
+ - Evaluate third-party dependency health
+
+ **Doc-writer — Architecture Documentation:**
+ - Is the architecture documented?
+ - Do module-level docs explain the "why" not just the "what"?
+ - Are architectural decisions recorded (ADRs)?
+
+ ### 3. Compile Findings
+
+ Collect all findings into a unified list with:
+ - Finding ID (F1, F2, ...)
+ - Source reviewer
+ - Severity: critical / warning / suggestion
+ - Title (one line)
+ - Detail (explanation with specific code references)
+
+ Record findings in state and proceed to deep dive.
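The coupling checklist above names the standard instability ratio, I = Ce / (Ca + Ce). A minimal Python sketch of how those numbers fall out of a dependency graph; the module names and edges below are invented for illustration and are not part of this package:

```python
from collections import defaultdict

def coupling_metrics(deps):
    """deps maps each module to the set of modules it depends on (efferent edges)."""
    afferent = defaultdict(int)  # Ca: how many modules depend on this one
    for targets in deps.values():
        for target in targets:
            afferent[target] += 1
    metrics = {}
    for mod, targets in deps.items():
        ca, ce = afferent[mod], len(targets)
        # I = Ce / (Ca + Ce): 0.0 is maximally stable, 1.0 maximally unstable
        instability = ce / (ca + ce) if (ca + ce) else 0.0
        metrics[mod] = {"Ca": ca, "Ce": ce, "I": round(instability, 2)}
    return metrics

# Hypothetical three-module graph: cli depends on everything, report on nothing.
deps = {
    "cli": {"orchestrator", "report"},
    "orchestrator": {"report"},
    "report": set(),
}
print(coupling_metrics(deps))
```

A module nothing depends on but that depends on others (here `cli`) scores I = 1.0; a leaf everything depends on (here `report`) scores 0.0, which is the stable place to put abstractions.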
@@ -0,0 +1,42 @@
+ # Phase 4: Deep Dive
+
+ The Investigator follows up on critical findings from the team dispatch phase.
+
+ ## Current Findings
+
+ {{FINDINGS}}
+
+ ## Your Task
+
+ ### 1. Identify Critical Findings
+
+ From the findings above, select all findings with severity **critical** or **warning**
+ that need deeper investigation. These typically include:
+ - Security vulnerabilities that need proof-of-concept verification
+ - Architectural issues that may have wider impact than initially noted
+ - Logic errors that need call-chain tracing to confirm
+ - Performance concerns that need measurement
+
+ ### 2. Investigator Deep Dive
+
+ For each critical finding, the Investigator should:
+
+ 1. **Read the relevant code** — not just the flagged line, but the full context
+ (the function, the caller, the callee, the error handler)
+ 2. **Trace the data flow** — where does the input come from? Where does the output go?
+ 3. **Check for similar patterns** — is this a one-off issue or a pattern repeated elsewhere?
+ 4. **Assess blast radius** — if this finding is a real problem, what is the impact?
+ 5. **Verify or refute** — does the deeper investigation confirm or dismiss the finding?
+
+ ### 3. Update Findings
+
+ For each investigated finding:
+ - If confirmed: add detail with code references and impact assessment
+ - If refuted: mark as dismissed with explanation
+ - If escalated: upgrade severity with justification
+ - If new findings discovered: add them to the list
+
+ ### 4. Summarize
+
+ Produce an updated findings list ready for the discussion phase.
+ Focus on actionable findings — things the author can and should fix.
@@ -0,0 +1,73 @@
+ # Phase 3: Team Dispatch — PR Mode (Diff Analysis)
+
+ Dispatch all reviewers to analyze the PR diff in parallel.
+
+ ## Review Target
+
+ **Mode:** PR Review
+ **Target:** {{TARGET}}
+ **Quick mode:** {{QUICK_MODE}}
+
+ ## Team Assignments
+
+ {{TEAM_ASSIGNMENTS}}
+
+ ## Instructions
+
+ ### 1. Fetch the Diff
+
+ - If target is a PR number: `gh pr diff {{TARGET}}`
+ - If target is a branch: `git diff main...{{TARGET}}`
+ - If target is file paths: `git diff -- {{TARGET}}`
+ - If from handoff: diff the files listed in the handoff
+
+ ### 2. Dispatch Reviewers in Parallel
+
+ Each reviewer analyzes the diff from their perspective. For each reviewer, produce
+ a findings list with severity (critical / warning / suggestion).
+
+ **Architect Review:**
+ - Is the change consistent with existing architecture?
+ - Does it introduce unwanted coupling or layering violations?
+ - Are interfaces clean and well-defined?
+ - Is error handling consistent with project patterns?
+
+ **Security Reviewer:**
+ - Are there injection vulnerabilities (SQL, XSS, command)?
+ - Is authentication/authorization properly handled?
+ - Are secrets or credentials exposed?
+ - Is input validation sufficient?
+ - Are data flows safe (no PII leaks, proper sanitization)?
+
+ **QA Reviewer:**
+ - Are edge cases handled?
+ - Is there sufficient test coverage for the changes?
+ - Do existing tests still pass with these changes?
+ - Are error paths tested?
+
+ **Critic:**
+ - What assumptions does this change make?
+ - What could go wrong that the author did not consider?
+ - Is there over-engineering or unnecessary complexity?
+ - Are there simpler alternatives?
+
+ **Investigator:**
+ - What is the blast radius of these changes?
+ - What other code depends on the changed interfaces?
+ - Are there transitive effects through the dependency graph?
+
+ **Doc-writer:**
+ - Do public APIs have adequate documentation?
+ - Are comments accurate and helpful (not redundant)?
+ - Should README or changelog be updated?
+
+ ### 3. Compile Findings
+
+ Collect all findings into a unified list with:
+ - Finding ID (F1, F2, ...)
+ - Source reviewer
+ - Severity: critical / warning / suggestion
+ - Title (one line)
+ - Detail (explanation with file:line references)
+
+ Record findings in state and proceed to deep dive.
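The target-to-command rules in the fetch step above are mechanical enough to sketch. A hedged Python illustration, assuming the caller already knows which kind of target it holds; the function name and the `main` base branch are assumptions, not this package's actual API:

```python
def diff_command(kind: str, target: str) -> list[str]:
    """Build the command that fetches the diff for a review target."""
    if kind == "pr":       # PR number, e.g. "123"
        return ["gh", "pr", "diff", target]
    if kind == "branch":   # branch name, diffed against main
        return ["git", "diff", f"main...{target}"]
    if kind == "paths":    # space-separated file paths
        return ["git", "diff", "--"] + target.split()
    raise ValueError(f"unknown target kind: {kind!r}")

print(diff_command("branch", "feature/login"))
```

Returning an argument list rather than a shell string keeps hypothetical targets like `feature/login; rm -rf .` from being interpreted by a shell.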
@@ -0,0 +1,48 @@
+ # Phase 5: Discussion
+
+ You have completed the analysis phases. Present your findings to the user for interactive review.
+
+ ## Review Mode: {{MODE}} ({{MODE_DISPLAY}})
+ ## Target: {{TARGET}}
+
+ ## Accumulated Findings
+
+ {{FINDINGS}}
+
+ ## Your Task
+
+ Present findings organized by severity:
+
+ 1. **Critical** — Issues that must be fixed before merging or proceeding
+ 2. **Warnings** — Issues that are concerning but may be acceptable with justification
+ 3. **Suggestions** — Improvements that would make the code better
+
+ Include code references (file:line) and explain the "why" for each finding.
+
+ ### Per-Finding Triage
+
+ For each **critical** or **warning** finding, ask the user directly how to
+ handle it (per `templates/user-questions.md`).
+
+ - Question: `How should we handle finding [ID] ([severity]): [title]?`
+ - Options:
+ - `Fix now` — block the workflow until this is resolved
+ - `Defer` — create a follow-up issue and proceed
+ - `Dismiss` — document why this is not a real issue and continue
+ - `Drill in` — investigate deeper before deciding
+
+ Record the user's decision in the finding tracker. "Drill in" re-enters the
+ discussion for that finding with deeper analysis before re-asking.
+
+ ### Overall Progress
+
+ After triaging the highest-severity findings, ask the user directly whether to
+ continue or proceed to the report.
+
+ - Question: `Done triaging findings?`
+ - Options:
+ - `Proceed` — all important findings are handled, so write the report
+ - `More triage` — keep discussing remaining findings
+ - `Re-scan` — dispatch reviewers again with updated focus
+
+ When the user chooses "Proceed", advance to the next phase.
@@ -0,0 +1,45 @@
+ # Phase 2: Mode Selection
+
+ Confirm and configure the review mode for this code review session.
+
+ ## Detected Configuration
+
+ **Mode:** {{MODE}} ({{MODE_DISPLAY}})
+ **Target:** {{TARGET}}
+ **Quick mode:** {{QUICK_MODE}}
+
+ ## Mode Details
+
+ ### PR Mode (`pr`)
+ Best for reviewing a specific PR or set of changes against a base branch.
+ - Fetch the full diff
+ - All reviewers analyze the diff from their perspective
+ - Focus: correctness, style, test coverage, security
+
+ ### Deep Mode (`deep`)
+ Best for troubleshooting reviews or investigating specific problem areas.
+ - Trace code paths related to the issue
+ - Focus on call chains, data flow, error handling
+ - Security Reviewer traces auth and data boundaries
+ - Investigator follows dependency chains
+
+ ### Architecture Mode (`architecture`)
+ Best for reviewing design decisions, structural patterns, and system health.
+ - Check SOLID principles adherence
+ - Analyze coupling and cohesion metrics
+ - Review dependency direction and layering
+ - Evaluate extensibility and maintainability
+
+ ## Your Task
+
+ 1. **Confirm the mode** with the user (or auto-confirm if the mode is obvious)
+ 2. **Prepare mode-specific instructions** for each team member:
+
+ {{TEAM_ASSIGNMENTS}}
+
+ 3. **Set the review scope:**
+ - For PR mode: identify the exact commits/diff to review
+ - For deep mode: identify the code paths to trace
+ - For architecture mode: identify the modules/packages to analyze
+
+ 4. Record the finalized mode and scope, then proceed to team dispatch.
@@ -0,0 +1,42 @@
+ # Phase 6: Report
+
+ Write the final code review report and hand off to the test skill.
+
+ ## Review Summary
+
+ **Mode:** {{MODE}} ({{MODE_DISPLAY}})
+ **Target:** {{TARGET}}
+ **Quick mode:** {{QUICK_MODE}}
+
+ ## Final Findings
+
+ {{FINDINGS}}
+
+ ## Your Task
+
+ ### 1. Write the Code Review Report
+
+ Write the report to `.codex/forge-codex/memory/code-review-report.md` with this structure:
+
+ - Summary section with mode, target, date, reviewers
+ - Findings table: ID, Severity, Title, Status
+ - Detailed findings: each with severity, source reviewer, file:line, detail, recommendation
+ - High-level recommendations
+ - Handoff notes for the test skill (areas needing test attention, edge cases found)
+
+ ### 2. Update Memory
+
+ - Update `.codex/forge-codex/memory/project.md` with code review completion status
+ - Record the finding count and severity breakdown
+
+ ### 3. Prepare Handoff
+
+ The handoff file will be written automatically. Ensure the findings are
+ recorded in the state so the test skill knows what to focus on.
+
+ ### 4. Present Dashboard
+
+ Show the user:
+ - Total findings by severity
+ - Open vs dismissed vs resolved
+ - Suggested next step: `test`
@@ -0,0 +1,76 @@
+ # Phase 3: Team Dispatch — Deep Mode (Security & Troubleshooting Scan)
+
+ Dispatch all reviewers to trace code paths and investigate specific areas.
+
+ ## Review Target
+
+ **Mode:** Deep Troubleshooting Review
+ **Target:** {{TARGET}}
+ **Quick mode:** {{QUICK_MODE}}
+
+ ## Team Assignments
+
+ {{TEAM_ASSIGNMENTS}}
+
+ ## Instructions
+
+ ### 1. Identify Investigation Areas
+
+ From the target and handoff context, identify:
+ - Specific code paths that need tracing
+ - Error conditions or failure modes to investigate
+ - Security-sensitive paths (auth, data handling, external input)
+ - Performance-critical paths
+
+ ### 2. Dispatch Reviewers in Parallel
+
+ **Security Reviewer — Code Path Tracing:**
+ - Trace all paths where external input enters the system
+ - Follow data through validation, processing, storage, and output
+ - Check for injection points at every boundary crossing
+ - Verify authentication checks on every protected resource
+ - Map authorization decision points and verify they are correct
+ - Check for TOCTOU (time-of-check-time-of-use) vulnerabilities
+ - Review cryptographic usage (algorithms, key management, random generation)
+
+ **Investigator — Deep Code Path Analysis:**
+ - Trace the specific code paths related to the target issue
+ - Follow the call chain from entry point to data store and back
+ - Map error propagation: where do errors originate and where are they caught?
+ - Identify silent failure modes (swallowed exceptions, default values hiding errors)
+ - Check for race conditions in concurrent code paths
+ - Trace resource lifecycle (open/close, acquire/release)
+
+ **Architect — Structural Analysis:**
+ - Are the code paths well-structured and easy to follow?
+ - Is error handling consistent across the traced paths?
+ - Are there unnecessary indirections or overly complex control flow?
+ - Do the code paths respect module boundaries?
+
+ **QA Reviewer — Edge Case Analysis:**
+ - What happens with empty/null/zero inputs on these paths?
+ - What happens when external services are unavailable?
+ - What happens under concurrent access?
+ - What happens at boundary values (max int, empty string, huge payload)?
+
+ **Critic — Assumption Audit:**
+ - What assumptions do these code paths make?
+ - Which assumptions are validated and which are implicit?
+ - What happens when assumptions are violated?
+ - Are there defensive checks where assumptions might fail?
+
+ **Doc-writer — Documentation Gaps:**
+ - Are complex code paths adequately commented?
+ - Are error codes and failure modes documented?
+ - Is the expected behavior documented for edge cases?
+
+ ### 3. Compile Findings
+
+ Collect all findings into a unified list with:
+ - Finding ID (F1, F2, ...)
+ - Source reviewer
+ - Severity: critical / warning / suggestion
+ - Title (one line)
+ - Detail (explanation with file:line references and code path traces)
+
+ Record findings in state and proceed to deep dive.
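All three dispatch prompts compile findings with the same five fields, so the shared record can be modeled as a small data class. A hedged sketch: the field names follow the bullet list above, but the class itself and the example findings are illustrative, not this package's actual state format:

```python
from dataclasses import dataclass

SEVERITIES = ("critical", "warning", "suggestion")

@dataclass
class Finding:
    id: str        # "F1", "F2", ...
    reviewer: str  # source reviewer, e.g. "security"
    severity: str  # one of SEVERITIES
    title: str     # one line
    detail: str    # explanation with file:line references

    def __post_init__(self):
        # reject severities outside the three-level scale
        if self.severity not in SEVERITIES:
            raise ValueError(f"invalid severity: {self.severity!r}")

# Hypothetical findings as the compile step might record them.
findings = [
    Finding("F1", "security", "critical", "Unvalidated input reaches a query", "api.py:42"),
    Finding("F2", "qa", "warning", "Empty-payload path untested", "handler.py:17"),
]
print([f.id for f in findings if f.severity == "critical"])
```

Validating severity at construction time keeps the later triage and report phases from having to handle out-of-scale values.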
@@ -0,0 +1,31 @@
+ # Phase 1: Target Detection
+
+ You are starting a code review. First, identify what will be reviewed.
+
+ ## Current Target
+
+ **Target argument:** {{TARGET}}
+ **Detected mode:** {{MODE}} ({{MODE_DISPLAY}})
+
+ {{HANDOFF_CONTENT}}
+
+ ## Your Task
+
+ 1. **Identify the review target:**
+ - If a PR number was given, verify it exists with `gh pr view <number>`
+ - If a branch was given, verify it exists and check its diff against main
+ - If file paths were given, verify they exist
+ - If a handoff from implement exists, extract the changed files list
+ - If nothing was provided, check git for uncommitted changes or recent commits
+
+ 2. **Gather context:**
+ - Read `.codex/forge-codex/memory/project.md` if it exists for project context
+ - Check for recent handoff files to understand flow position
+ - Note the scope: how many files, how many lines changed
+
+ 3. **Confirm with user:**
+ - Present the detected target and mode
+ - Ask if they want to adjust the mode or target
+ - If quick mode: note that only lead reviewers (Architect, QA) will participate
+
+ Record the confirmed target in the state and proceed to mode selection.
@@ -0,0 +1,30 @@
+ # Stage 3 — Solution Review & User Approval
+
+ ## Review Loop
+ Per `templates/review-loop.md`:
+ | Step | Agent | Focus |
+ |------|-------|-------|
+ | Self-review | Architect | Honest cons? Consistent scores? |
+ | Cross-review | Security Reviewer + QA | Security implications? Testability? |
+ | Critic challenge | Critic | Worst-case outcome? Understated risks? |
+ | PM validation | PM | Enough info for user to choose? |
+ | Pre-mortem | All agents | Imagine this solution failed — what happened? |
+
+ Before presenting to the user, run a pre-mortem per `templates/pre-mortem.md`. Each agent generates 2-3 failure scenarios. Categorize, prioritize, and add mitigations to the risk assessment. Record any findings that change the recommendation.
+
+ ## User Approval
+
+ Present scored solutions summary:
+ {{SOLUTIONS_SUMMARY}}
+
+ Then ask the user directly for approval (per `templates/user-questions.md`).
+ Use this question and these options:
+
+ - Question: `Approve the recommended solution for implementation?`
+ - Options:
+ - `Approve` — accept the recommendation and hand off to `plan`
+ - `Revise` — return to Stage 2 with feedback
+ - `Alternate` — pick a different scored alternative
+ - `Reject` — stop here because no solution is acceptable
+
+ Record the user's decision in `project.md` and branch accordingly.
@@ -0,0 +1,21 @@
+ # Develop Complete — Handoff
+
+ ## Handoff
+ Write `.codex/forge-codex/memory/handoff-develop.md` with:
+ - Approved solutions with beads IDs
+ - Team composition
+ - Scope assessment
+ - Task type
+ - Key investigation findings
+
+ ## Dashboard
+ Render skill completion dashboard per `templates/dashboard.md`.
+
+ ## Doc-writer Capture
+ Dispatch Doc-writer to capture learnings in `.codex/forge-codex/memory/doc-writer.md`.
+
+ ## Suggested Next
+ `plan`
+
+ ## Git Checkpoint
+ git add .codex/forge-codex/ && git commit -m "workflow: develop complete -- solutions approved"
@@ -0,0 +1,25 @@
+ # Stage 1 — Investigation
+
+ Dispatch agents for deep investigation.
+
+ ## Agent Dispatch
+
+ ### Investigator (evidence gathering)
+ Explore the codebase and gather evidence:
+ - Read relevant code paths end-to-end
+ - Run existing tests, collect results
+ - Check git history for recent changes
+ - Collect error messages, stack traces, reproduction steps
+ - For bugfixes: follow `templates/systematic-debugging.md`
+
+ Write evidence to `.codex/forge-codex/memory/investigator.md`
+
+ ### Architect (analysis lead)
+ Analyze the evidence using `templates/five-why-protocol.md`:
+ - For each issue/challenge, drill through up to 5 why-layers
+ - Pattern analysis and hypothesis testing at each layer
+ - Record evidence at every layer
+ - Stop at an actionable root cause
+ - For features: use `templates/brainstorming.md` for requirements exploration
+
+ Write findings to `.codex/forge-codex/memory/investigation.md`
@@ -0,0 +1,14 @@
+ # Investigation Review Loop
+
+ Per `templates/review-loop.md`:
+
+ | Step | Agent | Focus |
+ |------|-------|-------|
+ | Self-review | Architect | Did I stop too early? Did I validate every hypothesis? |
+ | Cross-review | QA Reviewer | Are findings reproducible? Evidence concrete? Coverage gaps? |
+ | Critic challenge | Critic | What assumptions are untested? What if the opposite is true? |
+ | PM validation | PM | Complete against task type checklist? Beads IDs present? |
+
+ {{REVIEW_STATE}}
+
+ Loop until all four pass cleanly in the same round.
@@ -0,0 +1,38 @@
+ # Scope Assessment & Team Composition
+
+ ## Scope
+
+ First, infer task type and layers from the user's initial description. If
+ anything is unclear, ask the user directly to confirm (per
+ `templates/user-questions.md`).
+
+ - Question 1: `What type of task is this?`
+ - `Feature` — new functionality or enhancement
+ - `Bugfix` — fix broken behavior
+ - `Refactor` — improve structure without changing behavior
+ - Question 2: `Which layers does this task touch?`
+ - `Frontend` — UI or client-side changes
+ - `Backend` — API or server-side changes
+ - `Infra` — infrastructure, CI/CD, or deploy
+ - `Something else` — let the user specify manually
+
+ ### Complexity
+
+ Estimate automatically from scope:
+ - Small (1-2 files)
+ - Medium (3-10 files)
+ - Large (10+ files)
+
+ ## Team Composition
+
+ Base roles for every task: **Architect, Investigator, QA, Critic, Doc-writer.**
+
+ **Security activation rule:** Add the Security role whenever *any* selected layer includes Backend or Infra, or when auth/data-integrity concerns are present regardless of layer.
+
+ | Task Type | Additional Roles |
+ |-----------|-----------------|
+ | Feature | +Security (if Backend or Infra selected) |
+ | Bugfix | +Security (if auth/data) |
+ | Refactor | +Security (if auth/data) |
+
+ Record team composition in project.md.
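The complexity thresholds in the scope prompt above amount to a one-function heuristic. A sketch under stated assumptions: the function name is invented, and since the source buckets overlap at exactly 10 files, this version resolves that edge in favor of medium:

```python
def estimate_complexity(files_touched: int) -> str:
    """Map a rough file count onto the small/medium/large buckets."""
    if files_touched <= 2:
        return "small"   # 1-2 files
    if files_touched <= 10:
        return "medium"  # 3-10 files
    return "large"       # 10+ files

print([estimate_complexity(n) for n in (1, 5, 20)])
```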