takt 0.33.0 → 0.33.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (142)
  1. package/builtins/en/facets/instructions/architecture-audit-plan.md +13 -0
  2. package/builtins/en/facets/instructions/architecture-audit-review.md +15 -0
  3. package/builtins/en/facets/instructions/architecture-audit-supervise.md +14 -0
  4. package/builtins/en/facets/instructions/architecture-audit-team-leader.md +22 -0
  5. package/builtins/en/facets/instructions/e2e-audit-plan.md +13 -0
  6. package/builtins/en/facets/instructions/e2e-audit-review.md +16 -0
  7. package/builtins/en/facets/instructions/e2e-audit-supervise.md +11 -0
  8. package/builtins/en/facets/instructions/e2e-audit-team-leader.md +22 -0
  9. package/builtins/en/facets/instructions/review-arch.md +4 -0
  10. package/builtins/en/facets/instructions/review-qa.md +2 -0
  11. package/builtins/en/facets/instructions/review-security.md +21 -8
  12. package/builtins/en/facets/instructions/supervise.md +22 -3
  13. package/builtins/en/facets/instructions/unit-audit-plan.md +13 -0
  14. package/builtins/en/facets/instructions/unit-audit-review.md +16 -0
  15. package/builtins/en/facets/instructions/unit-audit-supervise.md +11 -0
  16. package/builtins/en/facets/instructions/unit-audit-team-leader.md +22 -0
  17. package/builtins/en/facets/knowledge/security.md +24 -0
  18. package/builtins/en/facets/output-contracts/architecture-audit-plan.md +26 -0
  19. package/builtins/en/facets/output-contracts/architecture-audit.md +38 -0
  20. package/builtins/en/facets/output-contracts/{security-audit.md → audit-security.md} +15 -0
  21. package/builtins/en/facets/output-contracts/e2e-audit-plan.md +26 -0
  22. package/builtins/en/facets/output-contracts/e2e-audit.md +41 -0
  23. package/builtins/en/facets/output-contracts/unit-audit-plan.md +26 -0
  24. package/builtins/en/facets/output-contracts/unit-audit.md +41 -0
  25. package/builtins/en/facets/personas/conductor.md +11 -2
  26. package/builtins/en/facets/personas/security-reviewer.md +3 -0
  27. package/builtins/en/facets/policies/review.md +8 -0
  28. package/builtins/en/piece-categories.yaml +7 -5
  29. package/builtins/en/pieces/audit-architecture-backend.yaml +83 -0
  30. package/builtins/en/pieces/audit-architecture-dual.yaml +87 -0
  31. package/builtins/en/pieces/audit-architecture-frontend.yaml +87 -0
  32. package/builtins/en/pieces/audit-architecture.yaml +75 -0
  33. package/builtins/en/pieces/audit-e2e.yaml +92 -0
  34. package/builtins/en/pieces/{security-audit.yaml → audit-security.yaml} +7 -7
  35. package/builtins/en/pieces/audit-unit.yaml +94 -0
  36. package/builtins/ja/facets/instructions/architecture-audit-plan.md +13 -0
  37. package/builtins/ja/facets/instructions/architecture-audit-review.md +15 -0
  38. package/builtins/ja/facets/instructions/architecture-audit-supervise.md +14 -0
  39. package/builtins/ja/facets/instructions/architecture-audit-team-leader.md +22 -0
  40. package/builtins/ja/facets/instructions/e2e-audit-plan.md +13 -0
  41. package/builtins/ja/facets/instructions/e2e-audit-review.md +16 -0
  42. package/builtins/ja/facets/instructions/e2e-audit-supervise.md +11 -0
  43. package/builtins/ja/facets/instructions/e2e-audit-team-leader.md +22 -0
  44. package/builtins/ja/facets/instructions/review-arch.md +4 -0
  45. package/builtins/ja/facets/instructions/review-qa.md +2 -0
  46. package/builtins/ja/facets/instructions/review-security.md +22 -9
  47. package/builtins/ja/facets/instructions/supervise.md +23 -3
  48. package/builtins/ja/facets/instructions/unit-audit-plan.md +13 -0
  49. package/builtins/ja/facets/instructions/unit-audit-review.md +16 -0
  50. package/builtins/ja/facets/instructions/unit-audit-supervise.md +11 -0
  51. package/builtins/ja/facets/instructions/unit-audit-team-leader.md +22 -0
  52. package/builtins/ja/facets/knowledge/security.md +24 -0
  53. package/builtins/ja/facets/output-contracts/architecture-audit-plan.md +26 -0
  54. package/builtins/ja/facets/output-contracts/architecture-audit.md +38 -0
  55. package/builtins/ja/facets/output-contracts/{security-audit.md → audit-security.md} +15 -0
  56. package/builtins/ja/facets/output-contracts/e2e-audit-plan.md +26 -0
  57. package/builtins/ja/facets/output-contracts/e2e-audit.md +41 -0
  58. package/builtins/ja/facets/output-contracts/unit-audit-plan.md +26 -0
  59. package/builtins/ja/facets/output-contracts/unit-audit.md +41 -0
  60. package/builtins/ja/facets/personas/conductor.md +9 -0
  61. package/builtins/ja/facets/personas/security-reviewer.md +2 -0
  62. package/builtins/ja/facets/policies/review.md +8 -0
  63. package/builtins/ja/piece-categories.yaml +7 -5
  64. package/builtins/ja/pieces/audit-architecture-backend.yaml +83 -0
  65. package/builtins/ja/pieces/audit-architecture-dual.yaml +87 -0
  66. package/builtins/ja/pieces/audit-architecture-frontend.yaml +87 -0
  67. package/builtins/ja/pieces/audit-architecture.yaml +75 -0
  68. package/builtins/ja/pieces/audit-e2e.yaml +92 -0
  69. package/builtins/ja/pieces/{security-audit.yaml → audit-security.yaml} +7 -7
  70. package/builtins/ja/pieces/audit-unit.yaml +94 -0
  71. package/dist/app/cli/routing-inputs.d.ts +2 -2
  72. package/dist/app/cli/routing-inputs.d.ts.map +1 -1
  73. package/dist/app/cli/routing-inputs.js +11 -8
  74. package/dist/app/cli/routing-inputs.js.map +1 -1
  75. package/dist/features/pipeline/steps.d.ts.map +1 -1
  76. package/dist/features/pipeline/steps.js +7 -6
  77. package/dist/features/pipeline/steps.js.map +1 -1
  78. package/dist/features/tasks/add/index.js +4 -4
  79. package/dist/features/tasks/add/index.js.map +1 -1
  80. package/dist/features/tasks/add/issueTask.d.ts +1 -0
  81. package/dist/features/tasks/add/issueTask.d.ts.map +1 -1
  82. package/dist/features/tasks/add/issueTask.js +1 -1
  83. package/dist/features/tasks/add/issueTask.js.map +1 -1
  84. package/dist/features/tasks/execute/postExecution.js +4 -4
  85. package/dist/features/tasks/execute/postExecution.js.map +1 -1
  86. package/dist/features/tasks/execute/resolveTask.d.ts +1 -1
  87. package/dist/features/tasks/execute/resolveTask.d.ts.map +1 -1
  88. package/dist/features/tasks/execute/resolveTask.js +4 -4
  89. package/dist/features/tasks/execute/resolveTask.js.map +1 -1
  90. package/dist/features/tasks/execute/taskExecution.js +1 -1
  91. package/dist/features/tasks/execute/taskExecution.js.map +1 -1
  92. package/dist/infra/git/detect.d.ts +8 -1
  93. package/dist/infra/git/detect.d.ts.map +1 -1
  94. package/dist/infra/git/detect.js +14 -4
  95. package/dist/infra/git/detect.js.map +1 -1
  96. package/dist/infra/git/index.d.ts +2 -2
  97. package/dist/infra/git/index.d.ts.map +1 -1
  98. package/dist/infra/git/index.js +8 -8
  99. package/dist/infra/git/index.js.map +1 -1
  100. package/dist/infra/git/types.d.ts +7 -7
  101. package/dist/infra/git/types.d.ts.map +1 -1
  102. package/dist/infra/github/GitHubProvider.d.ts +7 -7
  103. package/dist/infra/github/GitHubProvider.d.ts.map +1 -1
  104. package/dist/infra/github/GitHubProvider.js +14 -14
  105. package/dist/infra/github/GitHubProvider.js.map +1 -1
  106. package/dist/infra/github/issue.d.ts +3 -3
  107. package/dist/infra/github/issue.d.ts.map +1 -1
  108. package/dist/infra/github/issue.js +11 -9
  109. package/dist/infra/github/issue.js.map +1 -1
  110. package/dist/infra/github/pr.d.ts +4 -4
  111. package/dist/infra/github/pr.d.ts.map +1 -1
  112. package/dist/infra/github/pr.js +11 -11
  113. package/dist/infra/github/pr.js.map +1 -1
  114. package/dist/infra/gitlab/GitLabProvider.d.ts +7 -7
  115. package/dist/infra/gitlab/GitLabProvider.d.ts.map +1 -1
  116. package/dist/infra/gitlab/GitLabProvider.js +14 -14
  117. package/dist/infra/gitlab/GitLabProvider.js.map +1 -1
  118. package/dist/infra/gitlab/issue.d.ts +2 -2
  119. package/dist/infra/gitlab/issue.d.ts.map +1 -1
  120. package/dist/infra/gitlab/issue.js +6 -5
  121. package/dist/infra/gitlab/issue.js.map +1 -1
  122. package/dist/infra/gitlab/pr.d.ts +4 -4
  123. package/dist/infra/gitlab/pr.d.ts.map +1 -1
  124. package/dist/infra/gitlab/pr.js +11 -11
  125. package/dist/infra/gitlab/pr.js.map +1 -1
  126. package/dist/infra/gitlab/utils.d.ts +6 -2
  127. package/dist/infra/gitlab/utils.d.ts.map +1 -1
  128. package/dist/infra/gitlab/utils.js +14 -5
  129. package/dist/infra/gitlab/utils.js.map +1 -1
  130. package/package.json +1 -1
  131. package/builtins/en/pieces/fill-e2e.yaml +0 -239
  132. package/builtins/en/pieces/fill-unit.yaml +0 -269
  133. package/builtins/ja/pieces/fill-e2e.yaml +0 -239
  134. package/builtins/ja/pieces/fill-unit.yaml +0 -269
  135. package/builtins/en/facets/instructions/{security-audit-plan.md → audit-security-plan.md} +0 -0
  136. package/builtins/en/facets/instructions/{security-audit-review.md → audit-security-review.md} +0 -0
  137. package/builtins/en/facets/instructions/{security-audit-supervise.md → audit-security-supervise.md} +0 -0
  138. package/builtins/en/facets/instructions/{security-audit-team-leader.md → audit-security-team-leader.md} +0 -0
  139. package/builtins/ja/facets/instructions/{security-audit-plan.md → audit-security-plan.md} +0 -0
  140. package/builtins/ja/facets/instructions/{security-audit-review.md → audit-security-review.md} +0 -0
  141. package/builtins/ja/facets/instructions/{security-audit-supervise.md → audit-security-supervise.md} +0 -0
  142. package/builtins/ja/facets/instructions/{security-audit-team-leader.md → audit-security-team-leader.md} +0 -0
@@ -0,0 +1,13 @@
+ Audit the project architecture before making changes.
+
+ **What to do:**
+ 1. Enumerate the main modules, layers, boundaries, and public entry points using Read, Glob, and Grep
+ 2. Identify the dependency directions, shared abstractions, and major call chains
+ 3. Build an audit scope that covers all modules relevant to structure, ownership, and wiring
+ 4. Highlight modules with higher architectural risk (boundary leaks, giant files, scattered logic, coupling hotspots)
+ 5. Prepare an audit order that reviews the highest-risk modules first
+
+ **Important:**
+ - Start from full module and boundary enumeration, not from a few suspicious files
+ - Focus on structure and wiring, not style-only comments
+ - If the architecture cannot be inferred from code alone, state the missing evidence explicitly
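The enumeration and dependency-tracing steps in the plan above can be sketched with plain shell tools. This is a hypothetical sketch run against a throwaway tree; the directory layout, file names, and `detectRemote` identifier are all illustrative, not taken from the package:

```shell
# Build a tiny illustrative source tree to enumerate.
demo=$(mktemp -d)
mkdir -p "$demo/src/infra/git" "$demo/src/features/tasks"
printf 'export function detectRemote() {}\n' > "$demo/src/infra/git/detect.ts"
printf 'import { detectRemote } from "../../infra/git/detect";\n' > "$demo/src/features/tasks/add.ts"

# Step 1: enumerate modules and public entry points.
find "$demo/src" -name '*.ts' | sort

# Step 2: trace dependency directions (which layer imports from which).
grep -rn '^import' "$demo/src"
```

The same two passes with `rg --files` and `rg '^import'` would serve as the "enumeration commands used" evidence that the later verification steps ask for.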
@@ -0,0 +1,15 @@
+ Re-audit the modules or boundaries that were judged insufficient in the previous architecture audit.
+
+ **Important:** Refer to these reports:
+ - Plan report: {report:01-architecture-audit-plan.md}
+ - Audit report: {report:02-architecture-audit.md}
+
+ **What to do:**
+ 1. Read the flagged modules, boundaries, and call chains in full
+ 2. Re-check the structural claims and identify what was previously skipped or weakly evidenced
+ 3. Update the audit result with concrete file evidence, explicit scope coverage, and missing-item reasons where applicable
+
+ **Strictly prohibited:**
+ - Modifying production code
+ - Claiming a boundary or dependency direction is valid without file evidence
+ - Skipping a flagged module because it "looks standard"
@@ -0,0 +1,14 @@
+ Verify the completeness and quality of the architecture audit itself.
+
+ **Important:** Refer to these reports:
+ - Plan report: {report:01-architecture-audit-plan.md}
+ - Audit report: {report:02-architecture-audit.md}
+
+ **Verification procedure:**
+ 1. Cross-check the module inventory from the plan against the audited modules in the audit report
+ 2. Reject if important modules or boundaries remain unaudited
+ 3. Reject if key dependency directions, wiring paths, ownership boundaries, or call chains from the plan are missing from the audit result without an explicit reason
+ 4. Verify the audit report includes concrete structural evidence, not just design opinions
+ 5. Verify the report includes the enumeration commands used and that they are sufficient to support the claimed scope
+ 6. Sample-read a few high-risk modules yourself to confirm the structural claims are credible
+ 7. Require re-audit if findings or suggested issue titles are too vague to file directly
@@ -0,0 +1,22 @@
+ Decompose the architecture audit, assign modules to each part, and execute in parallel.
+
+ **Important:** Refer to the plan report: {report:01-architecture-audit-plan.md}
+
+ **What to do:**
+ 1. Review the module inventory and architectural risk areas from the plan report
+ 2. Split the audit into 3 groups by module or boundary
+ 3. Assign exclusive ownership to each part so every relevant module is audited once
+
+ **Each part's instruction MUST include:**
+ - Assigned module and file list
+ - The boundaries and call chains to verify
+ - Required audit procedure:
+ 1. Read the assigned files in full
+ 2. Trace dependency direction, entry points, and shared abstractions
+ 3. Record structural findings with concrete file evidence
+ - Completion criteria: every assigned module has been audited and all findings are reported with evidence
+
+ **Constraints:**
+ - Each part is read-only
+ - Do not audit files outside the assignment
+ - Prefer evidence from code structure and call chains over style-only comments
@@ -0,0 +1,13 @@
+ Audit the target for E2E coverage before making changes.
+
+ **What to do:**
+ 1. Enumerate all user entry points, major routes, task flows, and failure paths from the codebase
+ 2. Read the existing E2E tests and map which flows and scenarios are already covered
+ 3. Build a complete list of auditable user flows and scenario variants
+ 4. Identify missing E2E scenarios and prioritize them by user impact and regression risk
+ 5. Prepare an implementation order that covers the highest-risk missing scenarios first
+
+ **Important:**
+ - Start from complete route and flow enumeration, not from a few obvious pages
+ - Include unhappy paths, permission differences, and recovery paths when relevant
+ - If a flow cannot be audited from local code and tests alone, state the missing evidence explicitly
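Mapping routes against existing E2E coverage, as steps 1–2 above describe, can be sketched in shell. Everything here is a hypothetical throwaway example: the route syntax, the `e2e/` spec layout, and the `/login` and `/settings` paths are illustrative assumptions, not the package's actual conventions:

```shell
# Illustrative tree: two routes defined, only one exercised by an E2E spec.
demo=$(mktemp -d)
mkdir -p "$demo/src/routes" "$demo/e2e"
printf 'app.get("/login", loginHandler)\napp.get("/settings", settingsHandler)\n' > "$demo/src/routes/app.ts"
printf 'test("user can log in", () => visit("/login"))\n' > "$demo/e2e/login.spec.ts"

# Step 1: enumerate entry points (quoted route paths) from the codebase.
grep -rho '"/[a-z]*"' "$demo/src/routes" | sort -u > "$demo/routes.txt"

# Step 2: for each route, check whether any E2E spec exercises it.
while read -r route; do
  if grep -rq -- "$route" "$demo/e2e"; then
    echo "covered: $route"
  else
    echo "missing: $route"
  fi
done < "$demo/routes.txt"
```

The `missing:` lines are the raw material for step 4's prioritized scenario gaps.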
@@ -0,0 +1,16 @@
+ Re-audit the routes or scenarios that were judged insufficient in the previous E2E audit.
+
+ **Important:** Review the supervisor's verification results and understand:
+ - Unaudited flows or scenarios
+ - Coverage claims lacking evidence
+ - Specific feedback on issue quality or scope
+
+ **What to do:**
+ 1. Read the flagged route-related code and corresponding E2E tests in full
+ 2. Re-check the coverage claims for the flagged scenarios and identify what was previously skipped or weakly evidenced
+ 3. Update the audit result in issue-ready form with concrete evidence, explicit scope coverage, and missing-item reasons where applicable
+
+ **Strictly prohibited:**
+ - Modifying E2E tests or production code
+ - Claiming a scenario is covered without citing the actual test evidence
+ - Skipping a flagged route because it "looks fine"
@@ -0,0 +1,11 @@
+ Verify the completeness and quality of the E2E audit itself.
+
+ **Important:** Refer to the audit plan report: {report:01-e2e-audit-plan.md}
+
+ **Verification procedure:**
+ 1. Cross-check the full route and flow inventory in the plan against the audited scenarios in the audit report
+ 2. Reject if any important entry point, user flow, unhappy path, permission variant, or recovery path from the plan is missing from the audit result without an explicit reason
+ 3. Verify the audit report includes concrete evidence for covered and missing scenarios, not just high-level claims
+ 4. Verify the report includes the enumeration commands used and that they are sufficient to support the claimed scope
+ 5. Sample-read a few high-risk routes and corresponding tests yourself to validate the coverage claims
+ 6. Require re-audit if issue titles, priorities, or recommended actions are too vague to be filed directly
@@ -0,0 +1,22 @@
+ Decompose the E2E audit, assign flows to each part, and execute in parallel.
+
+ **Important:** Refer to the plan report: {report:01-e2e-audit-plan.md}
+
+ **What to do:**
+ 1. Review the user flow list, existing scenarios, and risk areas from the plan report
+ 2. Split the audit into 3 groups by feature area or route cluster
+ 3. Assign exclusive ownership so every audited flow is reviewed once
+
+ **Each part's instruction MUST include:**
+ - Assigned routes, entry points, and corresponding E2E files
+ - The happy paths, failure paths, and permission variants to verify
+ - Required audit procedure:
+ 1. Read the relevant code for the assigned flows
+ 2. Read the corresponding E2E tests in full
+ 3. Record covered and missing scenarios with concrete evidence
+ - Completion criteria: every assigned flow has been audited and findings are reported in issue-ready form
+
+ **Constraints:**
+ - Each part is read-only
+ - Do not modify E2E tests or production code
+ - Do not audit routes outside the assignment
@@ -28,5 +28,9 @@ Review {report:coder-decisions.md} to understand the recorded design decisions.
  1. First, extract previous open findings and preliminarily classify as `new / persists / resolved`
  2. Review the change diff and detect issues based on the architecture and design criteria above
  - Cross-check changes against REJECT criteria tables defined in knowledge
+ - If you find a DRY violation, require it to be fixed
+ - Before proposing a fix, verify that the consolidation target fits existing responsibility boundaries, contracts, and public API shape
+ - If you require a new wrapper, helper, or public API, explain why that abstraction target is the natural one
+ - If the proposed abstraction goes beyond the task spec or plan, state why the additional scope is necessary and justified
  3. For each detected issue, classify as blocking/non-blocking based on Policy's scope determination table and judgment rules
  4. If there is even one blocking issue (`new` or `persists`), judge as REJECT
@@ -23,5 +23,7 @@ Review {report:coder-decisions.md} to understand the recorded design decisions.
  1. First, extract previous open findings and preliminarily classify as `new / persists / resolved`
  2. Review the change diff and detect issues based on the quality assurance criteria above
  - Cross-check changes against REJECT criteria tables defined in knowledge
+ - Even if tests pass, verify whether any additional change outside the task or plan is justified
+ - If review-driven follow-up changes expand the design, evaluate whether that extra change is actually necessary
  3. For each detected issue, classify as blocking/non-blocking based on Policy's scope determination table and judgment rules
  4. If there is even one blocking issue (`new` or `persists`), judge as REJECT
@@ -4,15 +4,28 @@ Review the changes from a security perspective. Check for the following vulnerab
  - Data exposure risks
  - Cryptographic weaknesses

+ **Primary sources to review:**
+ - Review `order.md` to understand requirements and prohibitions.
+ - Review `plan.md` to understand intended scope and design direction.
+ - Review {report:coder-decisions.md} to understand the recorded design decisions.
+ - Do not dismiss documented decisions as FP by default. Re-evaluate them against `order.md`, `plan.md`, and the actual code.

- **Design decisions reference:**
- Review {report:coder-decisions.md} to understand the recorded design decisions.
- - Do not flag intentionally documented decisions as FP
- - However, also evaluate whether the design decisions themselves are sound, and flag any problems
+ **Important:**
+ - Do not treat documented precedence rules, extension points, or configuration override behavior as vulnerabilities by themselves.
+ - Do not assume that removing an interactive confirmation or warning automatically means a security boundary regression.
+ - To issue a blocking finding, make the exploit path concrete: who controls what input, and what newly becomes possible.

  ## Judgment Procedure

- 1. Review the change diff and detect issues based on the security criteria above
- - Cross-check changes against REJECT criteria tables defined in knowledge
- 2. For each detected issue, classify as blocking/non-blocking based on Policy's scope determination table and judgment rules
- 3. If there is even one blocking issue, judge as REJECT
+ 1. Cross-check `order.md`, `plan.md`, `coder-decisions.md`, and the actual code to determine whether the behavior is intentional product behavior
+ 2. Review the change diff and extract issue candidates by cross-checking changes against REJECT criteria in knowledge
+ 3. For each candidate, verify the concrete exploit path
+ - Which actor controls the input or configuration
+ - Whether the change enables new privilege, data access, code execution, or prompt modification
+ - Whether the impact exceeds the existing documented precedence or extension model
+ 4. When configuration precedence, local/global shadowing, or non-interactive selection is involved, additionally verify:
+ - Whether the behavior is intended by `order.md` or `plan.md`
+ - Whether explicit selectors or arguments already make the user's intent clear
+ - Whether there is an actual trust-boundary break or new attack capability, rather than merely an override relationship
+ 5. For each detected issue, classify it as blocking or non-blocking based on the Policy scope table and judgment rules
+ 6. If there is even one blocking issue, judge as REJECT
@@ -5,7 +5,13 @@ Verify existing evidence for tests, builds, and functional checks, then perform
  - Does implementation match the plan?
  - Were all review movement findings properly addressed?
  - Was the original task objective achieved?
- 2. Whether each task spec requirement has been achieved
+ - Are prior review findings themselves valid against the task spec, plan, and actual code?
+ 2. Verify the task spec, plan, and decision history as primary sources
+ - Read `order.md` and extract required behavior and prohibitions
+ - Read `plan.md` and confirm intended approach and scope
+ - Read `coder-decisions.md` and confirm why the implementation moved in that direction
+ - Do not treat prior review conclusions as authoritative unless they align with all three and the code
+ 3. Whether each task spec requirement has been achieved
  - Extract requirements one by one from the task spec
  - If a single sentence contains multiple conditions or paths, split it into the smallest independently verifiable units
  - Example: treat `global/project` as separate requirements
@@ -17,14 +23,19 @@ Verify existing evidence for tests, builds, and functional checks, then perform
  - Evidence must cover the full content of the requirement row
  - Do not rely on the plan report's judgment; independently verify each requirement
  - If any requirement is unfulfilled, REJECT
- 3. Handling tests, builds, and functional checks
+ 4. Re-evaluate prior review findings
+ - Re-check each `new / persists / resolved` finding against the task spec, `plan.md`, `coder-decisions.md`, and actual code
+ - If a finding does not hold in code, classify it as `false_positive`
+ - If a finding holds technically but pushes work beyond the task objective or justified scope, classify it as `overreach`
+ - Do not leave `false_positive` / `overreach` reasoning implicit
+ 5. Handling tests, builds, and functional checks
  - Do not assume this movement will rerun commands
  - Use only evidence available in this run, such as execution logs, reports, or CI results
  - If evidence is missing, mark the item as unverified
  - If report text conflicts with execution evidence, call out the inconsistency explicitly

  **Report verification:** Read all reports in the Report Directory and
- check for any unaddressed improvement suggestions.
+ check whether any blocking finding remains unresolved and whether those findings are themselves valid.

  **Validation output contract:**
  ```markdown
@@ -45,6 +56,14 @@ Extract requirements from the task spec and verify each one individually against
  - ✅ without evidence is invalid (must verify against actual code)
  - Do not rely on plan report's judgment; independently verify each requirement

+ ## Re-evaluation of Prior Findings
+ | finding_id | Prior status | Re-evaluation | Evidence |
+ |------------|--------------|---------------|----------|
+ | {id} | new / persists / resolved | valid / false_positive / overreach | `src/file.ts:42`, `reports/plan.md` |
+
+ - If final judgment differs from prior review conclusions, explain why with evidence
+ - If marking `false_positive` or `overreach`, state whether it conflicts with the task objective, the plan, or both
+
  ## Verification Summary
  | Item | Status | Verification method |
  |------|--------|-------------------|
@@ -0,0 +1,13 @@
+ Audit the target for unit test coverage before making changes.
+
+ **What to do:**
+ 1. Enumerate the target production files, exported APIs, internal branches, error paths, boundary checks, and state transitions using Read, Glob, and Grep
+ 2. Read existing unit tests and map which behaviors are already covered
+ 3. Build a complete inventory of auditable behaviors for each target file
+ 4. Identify missing unit tests and prioritize them by regression risk
+ 5. Prepare an implementation order that covers the highest-risk gaps first
+
+ **Important:**
+ - Start from complete enumeration, not from a few obvious gaps
+ - Do not stop after identifying a handful of missing tests
+ - If the scope is unclear, state exactly which files or behaviors need clarification
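A first pass at steps 1–2 above, enumerating exported APIs and mapping them against existing unit tests, can be sketched in shell. This is an illustrative sketch only; the `tag.ts` module, the `parseTag`/`formatTag` exports, and the `test/` layout are hypothetical:

```shell
# Illustrative tree: two exported functions, only one referenced by a test.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/test"
printf 'export function parseTag() {}\nexport function formatTag() {}\n' > "$demo/src/tag.ts"
printf 'import { parseTag } from "../src/tag";\ntest("parseTag", () => parseTag())\n' > "$demo/test/tag.test.ts"

# Step 1: enumerate exported APIs.
grep -rhoE 'export function [A-Za-z]+' "$demo/src" | awk '{print $3}' > "$demo/api.txt"

# Step 2: map which exports appear in unit tests; unreferenced ones are
# candidate audit gaps (branch and error-path coverage still needs reading).
while read -r fn; do
  grep -rq "$fn" "$demo/test" && echo "tested: $fn" || echo "gap: $fn"
done < "$demo/api.txt"
```

Name-level matching like this only seeds the inventory; steps 3–5 still require reading the tests to confirm which behaviors, branches, and error paths are actually exercised.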
@@ -0,0 +1,16 @@
+ Re-audit the files or behaviors that were judged insufficient in the previous unit audit.
+
+ **Important:** Review the supervisor's verification results and understand:
+ - Unaudited files or behaviors
+ - Coverage claims lacking evidence
+ - Specific feedback on issue quality or scope
+
+ **What to do:**
+ 1. Read the flagged production files and corresponding tests in full
+ 2. Re-check the coverage claims for the flagged behaviors and identify what was previously skipped or weakly evidenced
+ 3. Update the audit result in issue-ready form with concrete evidence, explicit scope coverage, and missing-item reasons where applicable
+
+ **Strictly prohibited:**
+ - Modifying tests or production code
+ - Claiming a behavior is covered without citing the actual test evidence
+ - Skipping a flagged file or behavior because it "looks fine"
@@ -0,0 +1,11 @@
+ Verify the completeness and quality of the unit test audit itself.
+
+ **Important:** Refer to the audit plan report: {report:01-unit-audit-plan.md}
+
+ **Verification procedure:**
+ 1. Cross-check the full target inventory in the plan against the audited files and behaviors in the audit report
+ 2. Reject if any production file, exported API, branch, error path, boundary check, or state transition from the plan is missing from the audit result without an explicit reason
+ 3. Verify the audit report includes concrete evidence for both covered and missing behaviors, not just conclusions
+ 4. Verify the report includes the enumeration commands used and that they are sufficient to support the claimed scope
+ 5. Sample-read a few target production files and corresponding tests yourself to confirm the coverage claims are credible
+ 6. Require re-audit if issue titles, priorities, or recommended actions are too vague to be filed directly
@@ -0,0 +1,22 @@
+ Decompose the unit audit, assign files to each part, and execute in parallel.
+
+ **Important:** Refer to the plan report: {report:01-unit-audit-plan.md}
+
+ **What to do:**
+ 1. Review the production file list, existing tests, and audited behavior inventory from the plan report
+ 2. Split the audit into 3 groups by module or test area
+ 3. Assign exclusive ownership so every target file and behavior is audited once
+
+ **Each part's instruction MUST include:**
+ - Assigned production files and corresponding test files
+ - The behaviors, branches, error paths, and boundary checks to verify
+ - Required audit procedure:
+ 1. Read every assigned production file in full
+ 2. Read the corresponding unit tests in full
+ 3. Record covered and missing behaviors with concrete file evidence
+ - Completion criteria: every assigned target has been audited and findings are reported in issue-ready form
+
+ **Constraints:**
+ - Each part is read-only
+ - Do not modify tests or production code
+ - Do not audit files outside the assignment
@@ -18,6 +18,30 @@ Require extra scrutiny:
  - Error messages (AI may expose internal details)
  - Config files (AI may use dangerous defaults from training data)

+ ## Precedence Resolution, Override, and Trust Boundaries
+
+ Resolving multiple configuration or definition sources by precedence, intentional override behavior, and extension points are not vulnerabilities by themselves. The real question is whether the change breaks a trust boundary or gives a lower-trust actor a new attack capability.
+
+ | Criteria | Verdict |
+ |----------|---------|
+ | Behavior follows documented precedence rules within the same user and trust level | OK |
+ | An explicit selector or argument chooses the target and resolution still follows the documented precedence model | OK |
+ | A higher-precedence definition wins over a lower-precedence one, but stays within the documented customization contract and does not expand privileges or data access | Warning at most. Normally not REJECT |
+ | A lower-trust actor can override a higher-trust setting or definition and thereby gain new code execution, modify higher-trust assets, access data, or bypass authorization | REJECT |
+ | An interactive confirmation step is removed, but explicit selection already makes intent unambiguous and the trust boundary is unchanged | OK |
+ | An interactive confirmation step was the only trust-boundary control, and removing it silently enables lower-trust override | May be REJECT. Make the attack preconditions and impact concrete |
+
+ ### How to Evaluate
+
+ To treat precedence resolution or override behavior as a vulnerability, make all of the following concrete:
+
+ - Who the lower-trust actor is and what input or configuration they control
+ - What the higher-trust asset is
+ - What becomes possible only after this change
+ - Why that behavior exceeds the documented precedence or extension model
+
+ If the product already allows behavior to be customized through multiple scoped definition files or configuration sources, enabling selection among definitions at the same trust level is usually not a new attack capability by itself.
+
  ## Injection Attacks

  **SQL Injection:**
@@ -0,0 +1,26 @@
+ ```markdown
+ # Architecture Audit Plan
+
+ ## Enumeration Evidence
+ - Commands used:
+ - `rg ...`
+ - `rg --files ...`
+ - Scope notes:
+ - {how modules, layers, boundaries, and entry points were enumerated}
+
+ ## Module Inventory
+ | # | Module / Layer | Key Files | Responsibility | Main Boundaries | Risk |
+ |---|----------------|-----------|----------------|-----------------|------|
+ | 1 | {module or layer} | `src/file.ts` | {primary responsibility} | {boundary summary} | High / Medium / Low |
+
+ ## Audit Targets
+ | # | Module / Layer | What to Verify | Priority |
+ |---|----------------|----------------|----------|
+ | 1 | {module or layer} | {dependency direction, wiring, ownership, abstraction} | High / Medium / Low |
+
+ ## Audit Order
+ - {ordered module review plan}
+
+ ## Clarifications / Risks
+ - {open questions or constraints}
+ ```
@@ -0,0 +1,38 @@
+ ```markdown
+ # Architecture Audit Report
+
+ ## Result: APPROVE / IMPROVE / REJECT
+
+ ## Enumeration Evidence
+ - Commands used:
+   - `rg ...`
+   - `rg --files ...`
+ - Coverage notes:
+   - {how you confirmed the full module and boundary set was audited}
+
+ ## Audit Scope
+ | # | Module / Layer | Audited | Key Files | Boundaries Verified |
+ |---|----------------|---------|-----------|---------------------|
+ | 1 | {module or layer} | ✅ | `src/file.ts` | {boundary summary} |
+
+ ## Findings
+ | # | Severity | Category | Location | Issue | Recommended Fix |
+ |---|----------|----------|----------|-------|-----------------|
+ | 1 | High / Medium / Low | boundary / coupling / wiring / dead-code | `src/file.ts:42` | {issue description} | {fix suggestion} |
+
+ ## Modules with No Blocking Issues
+ - {modules audited with no blocking findings}
+
+ ## Suggested Issue Titles
+ 1. {Issue title}
+ 2. {Issue title}
+
+ ## Follow-up Notes
+ - {non-blocking observations or constraints}
+ - {explicit reasons for any intentionally unaudited item}
+ ```
+
+ **Cognitive load reduction rules:**
+ - APPROVE → Scope table only (15 lines max)
+ - IMPROVE → Scope table + relevant findings only
+ - REJECT → Include only blocking findings and impacted modules
@@ -5,6 +5,13 @@

  ## Severity: None / Low / Medium / High / Critical

+ ## Enumeration Evidence
+ - Commands used:
+   - `rg ...`
+   - `rg --files ...`
+ - Coverage notes:
+   - {how you confirmed the full file set was audited}
+
  ## Audit Scope
  | # | File | Audited | Risk Classification |
  |---|------|---------|-------------------|
@@ -18,9 +25,17 @@
  ## Files with No Issues
  - {list of files where no issues were detected}

+ ## Suggested Issue Titles
+ 1. {Issue title}
+ 2. {Issue title}
+
  ## Recommendations (non-blocking)
  - {security improvement suggestions}

+ ## Notes
+ - {constraints, assumptions, or audit limits}
+ - {explicit reasons for any intentionally unaudited item}
+
  ## REJECT Criteria
  - REJECT if one or more High or Critical issues exist
  ```
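The REJECT criterion above is mechanical, so a consuming tool could enforce it directly; a minimal sketch (hypothetical helper, not takt's actual code):

```typescript
// Severity levels as declared in the security audit contract above.
type Severity = "None" | "Low" | "Medium" | "High" | "Critical";

// Per the contract: REJECT if one or more High or Critical issues exist.
const mustReject = (findings: Severity[]): boolean =>
  findings.some((s) => s === "High" || s === "Critical");
```

Anything below High stays a non-blocking recommendation rather than a gate.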
@@ -0,0 +1,26 @@
+ ```markdown
+ # E2E Audit Plan
+
+ ## Enumeration Evidence
+ - Commands used:
+   - `rg ...`
+   - `rg --files ...`
+ - Scope notes:
+   - {how routes, flows, and E2E specs were enumerated}
+
+ ## Audited User Flows
+ | # | Area | Route / Entry | Existing Scenarios | Coverage Status | Risk |
+ |---|------|---------------|--------------------|-----------------|------|
+ | 1 | {feature area} | {route or entry point} | {existing test names} | Covered / Partial / Missing | High / Medium / Low |
+
+ ## Missing Scenarios
+ | # | Area | Scenario | Priority | Planned Test Location |
+ |---|------|----------|----------|-----------------------|
+ | 1 | {feature area} | {missing scenario} | High / Medium / Low | `e2e/example.spec.ts` |
+
+ ## Audit Order
+ - {ordered audit plan}
+
+ ## Clarifications / Risks
+ - {open questions or constraints}
+ ```
@@ -0,0 +1,41 @@
+ ```markdown
+ # E2E Audit Report
+
+ ## Result: APPROVE / IMPROVE / REJECT
+
+ ## Summary
+ {1-3 sentences summarizing the flow coverage situation}
+
+ ## Enumeration Evidence
+ - Commands used:
+   - `rg ...`
+   - `rg --files ...`
+ - Coverage notes:
+   - {how you confirmed the full flow set was audited}
+
+ ## Scope
+ | # | Area | Route / Entry | Existing Scenarios | Coverage Status | Risk |
+ |---|------|---------------|--------------------|-----------------|------|
+ | 1 | {feature area} | {route or entry point} | {existing test names} | Covered / Partial / Missing | High / Medium / Low |
+
+ ## Findings
+ | # | Priority | Area | Location | Gap | Recommended Action |
+ |---|----------|------|----------|-----|--------------------|
+ | 1 | High / Medium / Low | e2e-testing | `e2e/example.spec.ts` / `src/page.tsx:42` | {missing or weakly tested scenario} | {issue-ready action} |
+
+ ## No-Issue Areas
+ - {flows confirmed as adequately covered}
+
+ ## Suggested Issue Titles
+ 1. {Issue title}
+ 2. {Issue title}
+
+ ## Notes
+ - {constraints, assumptions, or audit limits}
+ - {explicit reasons for any intentionally unaudited item}
+ ```
+
+ **Cognitive load reduction rules:**
+ - APPROVE → Summary + Scope only
+ - IMPROVE → Include only relevant gaps
+ - REJECT → Include only blocking or high-priority gaps
@@ -0,0 +1,26 @@
+ ```markdown
+ # Unit Test Audit Plan
+
+ ## Enumeration Evidence
+ - Commands used:
+   - `rg ...`
+   - `rg --files ...`
+ - Scope notes:
+   - {how target production files and tests were enumerated}
+
+ ## Audit Scope
+ | # | Production File | Existing Test Files | Audited Behaviors / Branches | Coverage Status |
+ |---|-----------------|---------------------|------------------------------|-----------------|
+ | 1 | `src/file.ts` | `src/__tests__/file.test.ts` | {exported APIs, branches, errors, boundaries} | Covered / Partial / Missing |
+
+ ## Missing Test Cases
+ | # | Production File | Behavior / Branch | Priority | Planned Test Location |
+ |---|-----------------|-------------------|----------|-----------------------|
+ | 1 | `src/file.ts` | {missing behavior} | High / Medium / Low | `src/__tests__/file.test.ts` |
+
+ ## Audit Order
+ - {ordered audit plan}
+
+ ## Clarifications / Risks
+ - {open questions or constraints}
+ ```
@@ -0,0 +1,41 @@
+ ```markdown
+ # Unit Audit Report
+
+ ## Result: APPROVE / IMPROVE / REJECT
+
+ ## Summary
+ {1-3 sentences summarizing the coverage situation}
+
+ ## Enumeration Evidence
+ - Commands used:
+   - `rg ...`
+   - `rg --files ...`
+ - Coverage notes:
+   - {how you confirmed the full target set was audited}
+
+ ## Scope
+ | # | Production File | Existing Test Files | Audited Behaviors | Coverage Status |
+ |---|-----------------|---------------------|-------------------|-----------------|
+ | 1 | `src/file.ts` | `src/__tests__/file.test.ts` | {key behaviors} | Covered / Partial / Missing |
+
+ ## Findings
+ | # | Priority | Area | Location | Gap | Recommended Action |
+ |---|----------|------|----------|-----|--------------------|
+ | 1 | High / Medium / Low | unit-testing | `src/file.ts:42` | {missing or weakly tested behavior} | {issue-ready action} |
+
+ ## No-Issue Areas
+ - {files or behaviors confirmed as adequately covered}
+
+ ## Suggested Issue Titles
+ 1. {Issue title}
+ 2. {Issue title}
+
+ ## Notes
+ - {constraints, assumptions, or audit limits}
+ - {explicit reasons for any intentionally unaudited item}
+ ```
+
+ **Cognitive load reduction rules:**
+ - APPROVE → Summary + Scope only
+ - IMPROVE → Include only relevant gaps
+ - REJECT → Include only blocking or high-priority gaps
@@ -11,7 +11,8 @@ Read the provided information (report, agent response, or conversation log) and
  1. Review the information provided in the instruction (report/response/conversation log)
  2. Identify the judgment result (APPROVE/REJECT, etc.) or work outcome from the information
  3. Output the corresponding tag in one line according to the decision criteria table
- 4. **If you cannot determine, clearly state "Cannot determine"**
+ 4. If the provided information contains internal contradictions, do not output a tag; clearly state "Cannot determine"
+ 5. **If you cannot determine, clearly state "Cannot determine"**

  ## What NOT to do

@@ -19,6 +20,7 @@ Read the provided information (report, agent response, or conversation log) and
  - Do NOT use tools
  - Do NOT check additional files or analyze code
  - Do NOT modify or expand the provided information
+ - Do NOT force a tag when the report contradicts itself

  ## Output Format

@@ -37,6 +39,13 @@ If any of the following applies, clearly state "Cannot determine":
  - The provided information does not match any of the judgment criteria
  - Multiple criteria may apply
  - Insufficient information
+ - The report's conclusion conflicts with its own evidence
+
+ Examples of contradictions:
+ - `Result: APPROVE` but unresolved `new` / `persists` findings remain
+ - A requirements table contains ❌ while the result says APPROVE
+ - The report claims verification was completed while evidence is explicitly missing
+ - The re-evaluation of prior findings conflicts with the final conclusion
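The second contradiction in the list above is mechanical enough to illustrate in code; a minimal sketch (a hypothetical helper, not part of the judge, which reads reports as an LLM rather than running a script):

```typescript
// Flags a report whose verdict line says APPROVE while a requirements
// table still contains a failed row (❌) — the second example above.
function looksContradictory(report: string): boolean {
  const approved = /Result:\s*APPROVE\s*$/m.test(report);
  return approved && report.includes("❌");
}
```

A REJECT verdict alongside ❌ rows is consistent and would not be flagged.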

  Example output:

@@ -44,4 +53,4 @@ Example output:
  Cannot determine: Insufficient information
  ```

- **Important:** Respect the result shown in the provided information as-is and output the corresponding tag number. If uncertain, do NOT guess - state "Cannot determine" instead.
+ **Important:** Respect the result shown in the provided information as-is and output the corresponding tag number only when the report is internally consistent. If uncertain, do NOT guess - state "Cannot determine" instead.
@@ -40,3 +40,6 @@ Security cannot be retrofitted. It must be built in from the design stage; "we'l
  - How to fix it

  **Remember**: You are the security gatekeeper. Never let vulnerable code pass.
+
+ Also distinguish intended product precedence and extension behavior from actual trust-boundary breaks.
+ Do not label something a vulnerability based only on the presence or absence of a confirmation prompt; make the attacker, control point, and impact concrete.