forge-next 0.1.1__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- forge_codex/__init__.py +7 -0
- forge_codex/assets/__init__.py +2 -0
- forge_codex/assets/prompts/__init__.py +2 -0
- forge_codex/assets/prompts/code-review/architecture_check.md +78 -0
- forge_codex/assets/prompts/code-review/deep_dive.md +42 -0
- forge_codex/assets/prompts/code-review/diff_analysis.md +73 -0
- forge_codex/assets/prompts/code-review/discussion.md +48 -0
- forge_codex/assets/prompts/code-review/mode_selection.md +45 -0
- forge_codex/assets/prompts/code-review/report.md +42 -0
- forge_codex/assets/prompts/code-review/security_scan.md +76 -0
- forge_codex/assets/prompts/code-review/target_detection.md +31 -0
- forge_codex/assets/prompts/develop/approval.md +30 -0
- forge_codex/assets/prompts/develop/handoff.md +21 -0
- forge_codex/assets/prompts/develop/investigation.md +25 -0
- forge_codex/assets/prompts/develop/investigation_review.md +14 -0
- forge_codex/assets/prompts/develop/scope.md +38 -0
- forge_codex/assets/prompts/develop/solution.md +148 -0
- forge_codex/assets/prompts/develop/startup.md +25 -0
- forge_codex/assets/prompts/diagnose/analyze.md +31 -0
- forge_codex/assets/prompts/diagnose/decompose.md +17 -0
- forge_codex/assets/prompts/diagnose/define.md +34 -0
- forge_codex/assets/prompts/diagnose/evidence.md +26 -0
- forge_codex/assets/prompts/diagnose/quick_fix.md +25 -0
- forge_codex/assets/prompts/diagnose/report.md +26 -0
- forge_codex/assets/prompts/diagnose/solutions.md +22 -0
- forge_codex/assets/prompts/implement/branch_setup.md +42 -0
- forge_codex/assets/prompts/implement/documentation.md +28 -0
- forge_codex/assets/prompts/implement/handoff.md +33 -0
- forge_codex/assets/prompts/implement/integration_check.md +55 -0
- forge_codex/assets/prompts/implement/plan_detect.md +35 -0
- forge_codex/assets/prompts/implement/wave_complete.md +32 -0
- forge_codex/assets/prompts/implement/wave_dispatch.md +53 -0
- forge_codex/assets/prompts/implement/wave_review.md +58 -0
- forge_codex/assets/prompts/plan/approval.md +34 -0
- forge_codex/assets/prompts/plan/architecture.md +17 -0
- forge_codex/assets/prompts/plan/context.md +12 -0
- forge_codex/assets/prompts/plan/creation.md +29 -0
- forge_codex/assets/prompts/plan/handoff.md +29 -0
- forge_codex/assets/prompts/plan/review_loop.md +17 -0
- forge_codex/assets/prompts/post/code_quality.md +42 -0
- forge_codex/assets/prompts/post/completeness_audit.md +42 -0
- forge_codex/assets/prompts/post/correctness.md +53 -0
- forge_codex/assets/prompts/post/operational_readiness.md +74 -0
- forge_codex/assets/prompts/post/performance.md +72 -0
- forge_codex/assets/prompts/pre/codebase_alignment.md +37 -0
- forge_codex/assets/prompts/pre/completeness.md +41 -0
- forge_codex/assets/prompts/pre/feasibility.md +45 -0
- forge_codex/assets/prompts/pre/risk_dependencies.md +82 -0
- forge_codex/assets/prompts/report.md +58 -0
- forge_codex/assets/prompts/review/findings_aggregation.md +31 -0
- forge_codex/assets/prompts/review/remediation.md +35 -0
- forge_codex/assets/prompts/review/team_dispatch.md +30 -0
- forge_codex/assets/prompts/shared/discussion.md +27 -0
- forge_codex/assets/prompts/shared/plan_parsing.md +21 -0
- forge_codex/assets/prompts/test/context.md +36 -0
- forge_codex/assets/prompts/test/coverage_gaps.md +72 -0
- forge_codex/assets/prompts/test/discovery.md +60 -0
- forge_codex/assets/prompts/test/execution.md +67 -0
- forge_codex/assets/prompts/test/failure_analysis.md +58 -0
- forge_codex/assets/prompts/test/flow_author.md +164 -0
- forge_codex/assets/prompts/test/flow_context.md +9 -0
- forge_codex/assets/prompts/test/flow_execute.md +115 -0
- forge_codex/assets/prompts/test/flow_recommendation.md +140 -0
- forge_codex/assets/prompts/test/flow_report.md +177 -0
- forge_codex/assets/prompts/test/flow_scaffold.md +129 -0
- forge_codex/assets/prompts/test/flow_scope.md +162 -0
- forge_codex/assets/prompts/test/report.md +54 -0
- forge_codex/assets/templates/__init__.py +2 -0
- forge_codex/assets/templates/adr-template.md +69 -0
- forge_codex/assets/templates/autonomy-levels.md +99 -0
- forge_codex/assets/templates/beads-integration.md +80 -0
- forge_codex/assets/templates/brainstorming-gates.md +296 -0
- forge_codex/assets/templates/brainstorming.md +323 -0
- forge_codex/assets/templates/code-smells.md +78 -0
- forge_codex/assets/templates/codex-runtime.md +69 -0
- forge_codex/assets/templates/dashboard.md +84 -0
- forge_codex/assets/templates/data-analysis.md +288 -0
- forge_codex/assets/templates/five-why-protocol.md +97 -0
- forge_codex/assets/templates/handoff-protocol.md +136 -0
- forge_codex/assets/templates/memory-README.md +61 -0
- forge_codex/assets/templates/memory-protocol.md +97 -0
- forge_codex/assets/templates/mock-flow-types.md +529 -0
- forge_codex/assets/templates/parallel-dispatch.md +166 -0
- forge_codex/assets/templates/pre-mortem.md +78 -0
- forge_codex/assets/templates/review-loop.md +74 -0
- forge_codex/assets/templates/scoring-rubric.md +48 -0
- forge_codex/assets/templates/stage-approval.md +101 -0
- forge_codex/assets/templates/stage-document.md +109 -0
- forge_codex/assets/templates/stage-implement.md +90 -0
- forge_codex/assets/templates/stage-investigate.md +69 -0
- forge_codex/assets/templates/stage-plan.md +91 -0
- forge_codex/assets/templates/stage-review.md +115 -0
- forge_codex/assets/templates/stage-solution.md +79 -0
- forge_codex/assets/templates/systematic-debugging.md +162 -0
- forge_codex/assets/templates/tdd-protocol.md +213 -0
- forge_codex/assets/templates/user-questions.md +42 -0
- forge_codex/assets/templates/verification-protocol.md +219 -0
- forge_codex/assets/templates/writing-plans.md +166 -0
- forge_codex/cli.py +409 -0
- forge_next-0.1.1.dist-info/METADATA +297 -0
- forge_next-0.1.1.dist-info/RECORD +140 -0
- forge_next-0.1.1.dist-info/WHEEL +5 -0
- forge_next-0.1.1.dist-info/entry_points.txt +2 -0
- forge_next-0.1.1.dist-info/top_level.txt +2 -0
- scripts/__init__.py +0 -0
- scripts/code_review/__init__.py +0 -0
- scripts/code_review/code_review.py +415 -0
- scripts/develop/__init__.py +0 -0
- scripts/develop/develop.py +372 -0
- scripts/diagnose/__init__.py +0 -0
- scripts/diagnose/decision_matrix.py +180 -0
- scripts/diagnose/diagnostic_report.py +239 -0
- scripts/diagnose/fmea_score.py +172 -0
- scripts/diagnose/git_hotspots.py +229 -0
- scripts/diagnose/log_analyzer.py +252 -0
- scripts/diagnose/orchestrate.py +430 -0
- scripts/evaluate/__init__.py +0 -0
- scripts/evaluate/evaluate.py +566 -0
- scripts/evaluate/mode_detector.py +80 -0
- scripts/evaluate/plan_resolver.py +127 -0
- scripts/evaluate/state.py +117 -0
- scripts/evaluate/template_engine.py +91 -0
- scripts/implement/__init__.py +0 -0
- scripts/implement/implement.py +604 -0
- scripts/plan/__init__.py +0 -0
- scripts/plan/plan.py +512 -0
- scripts/shared/__init__.py +0 -0
- scripts/shared/findings.py +82 -0
- scripts/shared/orchestrator.py +1151 -0
- scripts/shared/report.py +81 -0
- scripts/shared/resume.py +482 -0
- scripts/shared/skill_chain.py +43 -0
- scripts/smoke.py +261 -0
- scripts/test/__init__.py +0 -0
- scripts/test/_cassette.py +45 -0
- scripts/test/_scenario_index.py +179 -0
- scripts/test/_sidecar.py +139 -0
- scripts/test/flow_types.py +264 -0
- scripts/test/test.py +775 -0
- scripts/test/test_layout.py +510 -0
forge_codex/assets/prompts/code-review/architecture_check.md
ADDED
@@ -0,0 +1,78 @@
# Phase 3: Team Dispatch — Architecture Mode

Dispatch all reviewers to analyze design patterns, coupling, and SOLID principles.

## Review Target

**Mode:** Architecture Review
**Target:** {{TARGET}}
**Quick mode:** {{QUICK_MODE}}

## Team Assignments

{{TEAM_ASSIGNMENTS}}

## Instructions

### 1. Identify Scope

Read the target files/modules and build a mental model of:
- Module boundaries and public interfaces
- Dependency graph (what depends on what)
- Data flow patterns (how data moves through the system)
- Error propagation patterns

### 2. Dispatch Reviewers in Parallel

**Architect Review — SOLID Principles:**
- **S** (Single Responsibility): Does each module/class have one reason to change?
- **O** (Open/Closed): Can behavior be extended without modifying existing code?
- **L** (Liskov Substitution): Are subtypes truly substitutable for their base types?
- **I** (Interface Segregation): Are interfaces minimal and focused?
- **D** (Dependency Inversion): Do modules depend on abstractions, not concretions?

**Architect Review — Coupling & Cohesion:**
- Afferent coupling (Ca): How many modules depend on this one?
- Efferent coupling (Ce): How many modules does this one depend on?
- Instability (I = Ce / (Ca + Ce)): Is this module stable or volatile?
- Cohesion: Do the elements within each module belong together?

**Security Reviewer — Architectural Security:**
- Are trust boundaries clearly defined?
- Is authentication/authorization centralized or scattered?
- Are there privilege escalation paths?
- Is sensitive data properly compartmentalized?

**QA Reviewer — Testability:**
- Can components be tested in isolation?
- Are dependencies injectable?
- Are there hidden dependencies (globals, singletons)?
- Is the test infrastructure adequate for the architecture?

**Critic — Design Smells & Code Smells:**
- Run code smells assessment per `templates/code-smells.md`
- Priority smells: God Class, Shotgun Surgery, Inappropriate Intimacy (critical); Feature Envy, Long Method, Divergent Change (warning)
- For each smell: cite file:line, name the smell, state the consequence, recommend the specific refactoring
- Check for Dependency Structure Matrix issues: cyclic dependencies between modules, layering violations, coupling clusters

**Investigator — Dependency Analysis:**
- Map the full dependency graph
- Identify circular dependencies
- Check for dependency inversions (concrete depends on concrete)
- Evaluate third-party dependency health

**Doc-writer — Architecture Documentation:**
- Is the architecture documented?
- Do module-level docs explain the "why", not just the "what"?
- Are architectural decisions recorded (ADRs)?

### 3. Compile Findings

Collect all findings into a unified list with:
- Finding ID (F1, F2, ...)
- Source reviewer
- Severity: critical / warning / suggestion
- Title (one line)
- Detail (explanation with specific code references)

Record findings in state and proceed to deep dive.
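The instability formula quoted in the coupling checklist (I = Ce / (Ca + Ce)) can be computed directly. This is a minimal sketch with a hypothetical dependency map; it is not code from the package.

```python
def instability(module, deps):
    """Martin's instability I = Ce / (Ca + Ce) for one module.

    deps maps each module name to the set of modules it depends on.
    """
    ce = len(deps.get(module, set()))  # efferent: outgoing dependencies
    ca = sum(module in targets for targets in deps.values())  # afferent: incoming
    return ce / (ca + ce) if (ca + ce) else 0.0

# Toy dependency graph (hypothetical module names):
deps = {
    "cli": {"orchestrator", "report"},
    "orchestrator": {"report"},
    "report": set(),
}
print(instability("report", deps))  # 0.0 — maximally stable, only incoming edges
print(instability("cli", deps))     # 1.0 — maximally unstable, only outgoing edges
```

Values near 0 suggest a module others lean on (change it carefully); values near 1 suggest a volatile leaf that is cheap to change.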
forge_codex/assets/prompts/code-review/deep_dive.md
ADDED
@@ -0,0 +1,42 @@
# Phase 4: Deep Dive

The Investigator follows up on critical findings from the team dispatch phase.

## Current Findings

{{FINDINGS}}

## Your Task

### 1. Identify Critical Findings

From the findings above, select all findings with severity **critical** or **warning**
that need deeper investigation. These typically include:
- Security vulnerabilities that need proof-of-concept verification
- Architectural issues that may have wider impact than initially noted
- Logic errors that need call-chain tracing to confirm
- Performance concerns that need measurement

### 2. Investigator Deep Dive

For each critical finding, the Investigator should:

1. **Read the relevant code** — not just the flagged line, but the full context
   (the function, the caller, the callee, the error handler)
2. **Trace the data flow** — where does the input come from? Where does the output go?
3. **Check for similar patterns** — is this a one-off issue or a pattern repeated elsewhere?
4. **Assess blast radius** — if this finding is a real problem, what is the impact?
5. **Verify or refute** — does the deeper investigation confirm or dismiss the finding?

### 3. Update Findings

For each investigated finding:
- If confirmed: add detail with code references and impact assessment
- If refuted: mark as dismissed with explanation
- If escalated: upgrade severity with justification
- If new findings discovered: add them to the list

### 4. Summarize

Produce an updated findings list ready for the discussion phase.
Focus on actionable findings — things the author can and should fix.
forge_codex/assets/prompts/code-review/diff_analysis.md
ADDED
@@ -0,0 +1,73 @@
# Phase 3: Team Dispatch — PR Mode (Diff Analysis)

Dispatch all reviewers to analyze the PR diff in parallel.

## Review Target

**Mode:** PR Review
**Target:** {{TARGET}}
**Quick mode:** {{QUICK_MODE}}

## Team Assignments

{{TEAM_ASSIGNMENTS}}

## Instructions

### 1. Fetch the Diff

- If target is a PR number: `gh pr diff {{TARGET}}`
- If target is a branch: `git diff main...{{TARGET}}`
- If target is file paths: `git diff -- {{TARGET}}`
- If from handoff: diff the files listed in the handoff

### 2. Dispatch Reviewers in Parallel

Each reviewer analyzes the diff from their perspective. For each reviewer, produce
a findings list with severity (critical / warning / suggestion).

**Architect Review:**
- Is the change consistent with existing architecture?
- Does it introduce unwanted coupling or layering violations?
- Are interfaces clean and well-defined?
- Is error handling consistent with project patterns?

**Security Reviewer:**
- Are there injection vulnerabilities (SQL, XSS, command)?
- Is authentication/authorization properly handled?
- Are secrets or credentials exposed?
- Is input validation sufficient?
- Are data flows safe (no PII leaks, proper sanitization)?

**QA Reviewer:**
- Are edge cases handled?
- Is there sufficient test coverage for the changes?
- Do existing tests still pass with these changes?
- Are error paths tested?

**Critic:**
- What assumptions does this change make?
- What could go wrong that the author did not consider?
- Is there over-engineering or unnecessary complexity?
- Are there simpler alternatives?

**Investigator:**
- What is the blast radius of these changes?
- What other code depends on the changed interfaces?
- Are there transitive effects through the dependency graph?

**Doc-writer:**
- Do public APIs have adequate documentation?
- Are comments accurate and helpful (not redundant)?
- Should README or changelog be updated?

### 3. Compile Findings

Collect all findings into a unified list with:
- Finding ID (F1, F2, ...)
- Source reviewer
- Severity: critical / warning / suggestion
- Title (one line)
- Detail (explanation with file:line references)

Record findings in state and proceed to deep dive.
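The unified finding record that the compile step describes (ID, source reviewer, severity, title, detail) could be modeled as a small dataclass. The field names below are illustrative only; the package's actual schema presumably lives in `scripts/shared/findings.py` and may differ.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    """One entry in the unified findings list (illustrative schema)."""
    id: str            # F1, F2, ...
    reviewer: str      # source reviewer, e.g. "security", "qa"
    severity: str      # "critical" | "warning" | "suggestion"
    title: str         # one line
    detail: str = ""   # explanation with file:line references
    status: str = "open"  # open | fixed | deferred | dismissed

findings = [
    Finding("F1", "security", "critical", "Unparameterized SQL in user lookup"),
    Finding("F2", "qa", "warning", "No test for the empty-input path"),
]

# The deep-dive phase would start from the highest-severity subset:
critical_ids = [f.id for f in findings if f.severity == "critical"]
print(critical_ids)  # ['F1']
```

A flat record like this makes the later phases mechanical: triage updates `status`, and the report phase groups by `severity`.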
forge_codex/assets/prompts/code-review/discussion.md
ADDED
@@ -0,0 +1,48 @@
# Phase 5: Discussion

You have completed the analysis phases. Present your findings to the user for interactive review.

## Review Mode: {{MODE}} ({{MODE_DISPLAY}})
## Target: {{TARGET}}

## Accumulated Findings

{{FINDINGS}}

## Your Task

Present findings organized by severity:

1. **Critical** — Issues that must be fixed before merging or proceeding
2. **Warnings** — Issues that are concerning but may be acceptable with justification
3. **Suggestions** — Improvements that would make the code better

Include code references (file:line) and explain the "why" for each finding.

### Per-Finding Triage

For each **critical** or **warning** finding, ask the user directly how to
handle it (per `templates/user-questions.md`).

- Question: `How should we handle finding [ID] ([severity]): [title]?`
- Options:
  - `Fix now` — block the workflow until this is resolved
  - `Defer` — create a follow-up issue and proceed
  - `Dismiss` — document why this is not a real issue and continue
  - `Drill in` — investigate deeper before deciding

Record the user's decision in the finding tracker. "Drill in" re-enters the
discussion for that finding with deeper analysis before re-asking.

### Overall Progress

After triaging the highest-severity findings, ask the user directly whether to
continue or proceed to the report.

- Question: `Done triaging findings?`
- Options:
  - `Proceed` — all important findings are handled, so write the report
  - `More triage` — keep discussing remaining findings
  - `Re-scan` — dispatch reviewers again with updated focus

When the user chooses "Proceed", advance to the next phase.
forge_codex/assets/prompts/code-review/mode_selection.md
ADDED
@@ -0,0 +1,45 @@
# Phase 2: Mode Selection

Confirm and configure the review mode for this code review session.

## Detected Configuration

**Mode:** {{MODE}} ({{MODE_DISPLAY}})
**Target:** {{TARGET}}
**Quick mode:** {{QUICK_MODE}}

## Mode Details

### PR Mode (`pr`)
Best for reviewing a specific PR or set of changes against a base branch.
- Fetch the full diff
- All reviewers analyze the diff from their perspective
- Focus: correctness, style, test coverage, security

### Deep Mode (`deep`)
Best for troubleshooting reviews or investigating specific problem areas.
- Trace code paths related to the issue
- Focus on call chains, data flow, error handling
- Security Reviewer traces auth and data boundaries
- Investigator follows dependency chains

### Architecture Mode (`architecture`)
Best for reviewing design decisions, structural patterns, and system health.
- Check SOLID principles adherence
- Analyze coupling and cohesion metrics
- Review dependency direction and layering
- Evaluate extensibility and maintainability

## Your Task

1. **Confirm the mode** with the user (or auto-confirm if the mode is obvious)
2. **Prepare mode-specific instructions** for each team member:

{{TEAM_ASSIGNMENTS}}

3. **Set the review scope:**
   - For PR mode: identify the exact commits/diff to review
   - For deep mode: identify the code paths to trace
   - For architecture mode: identify the modules/packages to analyze

4. Record the finalized mode and scope, then proceed to team dispatch.
forge_codex/assets/prompts/code-review/report.md
ADDED
@@ -0,0 +1,42 @@
# Phase 6: Report

Write the final code review report and hand off to the test skill.

## Review Summary

**Mode:** {{MODE}} ({{MODE_DISPLAY}})
**Target:** {{TARGET}}
**Quick mode:** {{QUICK_MODE}}

## Final Findings

{{FINDINGS}}

## Your Task

### 1. Write the Code Review Report

Write the report to `.codex/forge-codex/memory/code-review-report.md` with this structure:

- Summary section with mode, target, date, reviewers
- Findings table: ID, Severity, Title, Status
- Detailed findings: each with severity, source reviewer, file:line, detail, recommendation
- High-level recommendations
- Handoff notes for the test skill (areas needing test attention, edge cases found)

### 2. Update Memory

- Update `.codex/forge-codex/memory/project.md` with code review completion status
- Record the finding count and severity breakdown

### 3. Prepare Handoff

The handoff file will be written automatically. Ensure the findings are
recorded in the state so the test skill knows what to focus on.

### 4. Present Dashboard

Show the user:
- Total findings by severity
- Open vs dismissed vs resolved
- Suggested next step: `test`
forge_codex/assets/prompts/code-review/security_scan.md
ADDED
@@ -0,0 +1,76 @@
# Phase 3: Team Dispatch — Deep Mode (Security & Troubleshooting Scan)

Dispatch all reviewers to trace code paths and investigate specific areas.

## Review Target

**Mode:** Deep Troubleshooting Review
**Target:** {{TARGET}}
**Quick mode:** {{QUICK_MODE}}

## Team Assignments

{{TEAM_ASSIGNMENTS}}

## Instructions

### 1. Identify Investigation Areas

From the target and handoff context, identify:
- Specific code paths that need tracing
- Error conditions or failure modes to investigate
- Security-sensitive paths (auth, data handling, external input)
- Performance-critical paths

### 2. Dispatch Reviewers in Parallel

**Security Reviewer — Code Path Tracing:**
- Trace all paths where external input enters the system
- Follow data through validation, processing, storage, and output
- Check for injection points at every boundary crossing
- Verify authentication checks on every protected resource
- Map authorization decision points and verify they are correct
- Check for TOCTOU (time-of-check to time-of-use) vulnerabilities
- Review cryptographic usage (algorithms, key management, random generation)

**Investigator — Deep Code Path Analysis:**
- Trace the specific code paths related to the target issue
- Follow the call chain from entry point to data store and back
- Map error propagation: where do errors originate and where are they caught?
- Identify silent failure modes (swallowed exceptions, default values hiding errors)
- Check for race conditions in concurrent code paths
- Trace resource lifecycle (open/close, acquire/release)

**Architect — Structural Analysis:**
- Are the code paths well-structured and easy to follow?
- Is error handling consistent across the traced paths?
- Are there unnecessary indirections or overly complex control flow?
- Do the code paths respect module boundaries?

**QA Reviewer — Edge Case Analysis:**
- What happens with empty/null/zero inputs on these paths?
- What happens when external services are unavailable?
- What happens under concurrent access?
- What happens at boundary values (max int, empty string, huge payload)?

**Critic — Assumption Audit:**
- What assumptions do these code paths make?
- Which assumptions are validated and which are implicit?
- What happens when assumptions are violated?
- Are there defensive checks where assumptions might fail?

**Doc-writer — Documentation Gaps:**
- Are complex code paths adequately commented?
- Are error codes and failure modes documented?
- Is the expected behavior documented for edge cases?

### 3. Compile Findings

Collect all findings into a unified list with:
- Finding ID (F1, F2, ...)
- Source reviewer
- Severity: critical / warning / suggestion
- Title (one line)
- Detail (explanation with file:line references and code path traces)

Record findings in state and proceed to deep dive.
forge_codex/assets/prompts/code-review/target_detection.md
ADDED
@@ -0,0 +1,31 @@
# Phase 1: Target Detection

You are starting a code review. First, identify what will be reviewed.

## Current Target

**Target argument:** {{TARGET}}
**Detected mode:** {{MODE}} ({{MODE_DISPLAY}})

{{HANDOFF_CONTENT}}

## Your Task

1. **Identify the review target:**
   - If a PR number was given, verify it exists with `gh pr view <number>`
   - If a branch was given, verify it exists and check its diff against main
   - If file paths were given, verify they exist
   - If a handoff from implement exists, extract the changed files list
   - If nothing was provided, check git for uncommitted changes or recent commits

2. **Gather context:**
   - Read `.codex/forge-codex/memory/project.md` if it exists for project context
   - Check for recent handoff files to understand flow position
   - Note the scope: how many files, how many lines changed

3. **Confirm with user:**
   - Present the detected target and mode
   - Ask if they want to adjust the mode or target
   - If quick mode: note that only lead reviewers (Architect, QA) will participate

Record the confirmed target in the state and proceed to mode selection.
forge_codex/assets/prompts/develop/approval.md
ADDED
@@ -0,0 +1,30 @@
# Stage 3 — Solution Review & User Approval

## Review Loop
Per `templates/review-loop.md`:

| Step | Agent | Focus |
|------|-------|-------|
| Self-review | Architect | Honest cons? Consistent scores? |
| Cross-review | Security Reviewer + QA | Security implications? Testability? |
| Critic challenge | Critic | Worst-case outcome? Understated risks? |
| PM validation | PM | Enough info for user to choose? |
| Pre-mortem | All agents | Imagine this solution failed — what happened? |

Before presenting to the user, run a pre-mortem per `templates/pre-mortem.md`. Each agent generates 2-3 failure scenarios. Categorize, prioritize, and add mitigations to the risk assessment. Record any findings that change the recommendation.

## User Approval

Present the scored solutions summary:
{{SOLUTIONS_SUMMARY}}

Then ask the user directly for approval (per `templates/user-questions.md`).
Use this question and these options:

- Question: `Approve the recommended solution for implementation?`
- Options:
  - `Approve` — accept the recommendation and hand off to `plan`
  - `Revise` — return to Stage 2 with feedback
  - `Alternate` — pick a different scored alternative
  - `Reject` — stop here because no solution is acceptable

Record the user's decision in `project.md` and branch accordingly.
forge_codex/assets/prompts/develop/handoff.md
ADDED
@@ -0,0 +1,21 @@
# Develop Complete — Handoff

## Handoff
Write `.codex/forge-codex/memory/handoff-develop.md` with:
- Approved solutions with beads IDs
- Team composition
- Scope assessment
- Task type
- Key investigation findings

## Dashboard
Render the skill completion dashboard per `templates/dashboard.md`.

## Doc-writer Capture
Dispatch Doc-writer to capture learnings in `.codex/forge-codex/memory/doc-writer.md`.

## Suggested Next
`plan`

## Git Checkpoint
`git add .codex/forge-codex/ && git commit -m "workflow: develop complete -- solutions approved"`
forge_codex/assets/prompts/develop/investigation.md
ADDED
@@ -0,0 +1,25 @@
# Stage 1 — Investigation

Dispatch agents for deep investigation.

## Agent Dispatch

### Investigator (evidence gathering)
Explore the codebase and gather evidence:
- Read relevant code paths end-to-end
- Run existing tests, collect results
- Check git history for recent changes
- Collect error messages, stack traces, reproduction steps
- For bugfixes: follow `templates/systematic-debugging.md`

Write evidence to `.codex/forge-codex/memory/investigator.md`

### Architect (analysis lead)
Analyze the evidence using `templates/five-why-protocol.md`:
- For each issue/challenge, drill through up to 5 why-layers
- Pattern analysis and hypothesis testing at each layer
- Record evidence at every layer
- Stop at an actionable root cause
- For features: use `templates/brainstorming.md` for requirements exploration

Write findings to `.codex/forge-codex/memory/investigation.md`
forge_codex/assets/prompts/develop/investigation_review.md
ADDED
@@ -0,0 +1,14 @@
# Investigation Review Loop

Per `templates/review-loop.md`:

| Step | Agent | Focus |
|------|-------|-------|
| Self-review | Architect | Did I stop too early? Did I validate every hypothesis? |
| Cross-review | QA Reviewer | Are findings reproducible? Evidence concrete? Coverage gaps? |
| Critic challenge | Critic | What assumptions are untested? What if the opposite is true? |
| PM validation | PM | Complete against task type checklist? Beads IDs present? |

{{REVIEW_STATE}}

Loop until all four pass cleanly in the same round.
forge_codex/assets/prompts/develop/scope.md
ADDED
@@ -0,0 +1,38 @@
# Scope Assessment & Team Composition

## Scope

First, infer task type and layers from the user's initial description. If
anything is unclear, ask the user directly to confirm (per
`templates/user-questions.md`).

- Question 1: `What type of task is this?`
  - `Feature` — new functionality or enhancement
  - `Bugfix` — fix broken behavior
  - `Refactor` — improve structure without changing behavior
- Question 2: `Which layers does this task touch?`
  - `Frontend` — UI or client-side changes
  - `Backend` — API or server-side changes
  - `Infra` — infrastructure, CI/CD, or deploy
  - `Something else` — let the user specify manually

### Complexity

Estimate automatically from scope:
- Small (1-2 files)
- Medium (3-10 files)
- Large (10+ files)

## Team Composition

Base roles for every task: **Architect, Investigator, QA, Critic, Doc-writer.**

**Security activation rule:** Add the Security role whenever *any* selected layer includes Backend or Infra, or when auth/data-integrity concerns are present regardless of layer.

| Task Type | Additional Roles |
|-----------|-----------------|
| Feature | +Security (if Backend or Infra selected) |
| Bugfix | +Security (if auth/data) |
| Refactor | +Security (if auth/data) |

Record team composition in project.md.
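The Security activation rule in the team-composition table reduces to a few lines of logic. This is an illustrative sketch only; the real dispatch behavior lives in the package's `scripts/` modules and may work differently.

```python
def compose_team(task_type, layers, auth_or_data=False):
    """Return reviewer roles for a task, per the composition table.

    Base roles always apply; Security is added when any selected layer
    is Backend or Infra, or when auth/data-integrity concerns exist.
    (Illustrative helper, not the package's actual API.)
    """
    roles = ["Architect", "Investigator", "QA", "Critic", "Doc-writer"]
    if auth_or_data or {"Backend", "Infra"} & set(layers):
        roles.append("Security")
    return roles

print(compose_team("Feature", ["Frontend"]))                    # base roles only
print(compose_team("Bugfix", ["Frontend"], auth_or_data=True))  # Security added
print(compose_team("Refactor", ["Backend"]))                    # Security added
```

Note that `task_type` does not change the outcome here: in the table, every task type activates Security under the same layer or auth/data conditions.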