@qball-inc/the-bulwark 1.0.0
- package/.claude-plugin/plugin.json +43 -0
- package/agents/bulwark-fix-validator.md +633 -0
- package/agents/bulwark-implementer.md +391 -0
- package/agents/bulwark-issue-analyzer.md +308 -0
- package/agents/bulwark-standards-reviewer.md +221 -0
- package/agents/plan-creation-architect.md +323 -0
- package/agents/plan-creation-eng-lead.md +352 -0
- package/agents/plan-creation-po.md +300 -0
- package/agents/plan-creation-qa-critic.md +334 -0
- package/agents/product-ideation-competitive-analyzer.md +298 -0
- package/agents/product-ideation-idea-validator.md +268 -0
- package/agents/product-ideation-market-researcher.md +292 -0
- package/agents/product-ideation-pattern-documenter.md +308 -0
- package/agents/product-ideation-segment-analyzer.md +303 -0
- package/agents/product-ideation-strategist.md +259 -0
- package/agents/statusline-setup.md +97 -0
- package/hooks/hooks.json +59 -0
- package/package.json +45 -0
- package/scripts/hooks/cleanup-stale.sh +13 -0
- package/scripts/hooks/enforce-quality.sh +166 -0
- package/scripts/hooks/implementer-quality.sh +256 -0
- package/scripts/hooks/inject-protocol.sh +52 -0
- package/scripts/hooks/suggest-pipeline.sh +175 -0
- package/scripts/hooks/track-pipeline-start.sh +37 -0
- package/scripts/hooks/track-pipeline-stop.sh +52 -0
- package/scripts/init-rules.sh +35 -0
- package/scripts/init.sh +151 -0
- package/skills/anthropic-validator/SKILL.md +607 -0
- package/skills/anthropic-validator/references/agents-checklist.md +131 -0
- package/skills/anthropic-validator/references/commands-checklist.md +102 -0
- package/skills/anthropic-validator/references/hooks-checklist.md +151 -0
- package/skills/anthropic-validator/references/mcp-checklist.md +136 -0
- package/skills/anthropic-validator/references/plugins-checklist.md +148 -0
- package/skills/anthropic-validator/references/skills-checklist.md +85 -0
- package/skills/assertion-patterns/SKILL.md +296 -0
- package/skills/bug-magnet-data/SKILL.md +284 -0
- package/skills/bug-magnet-data/context/cli-args.md +91 -0
- package/skills/bug-magnet-data/context/db-query.md +104 -0
- package/skills/bug-magnet-data/context/file-contents.md +103 -0
- package/skills/bug-magnet-data/context/http-body.md +91 -0
- package/skills/bug-magnet-data/context/process-spawn.md +123 -0
- package/skills/bug-magnet-data/data/booleans/boundaries.yaml +143 -0
- package/skills/bug-magnet-data/data/collections/arrays.yaml +114 -0
- package/skills/bug-magnet-data/data/collections/objects.yaml +123 -0
- package/skills/bug-magnet-data/data/concurrency/race-conditions.yaml +118 -0
- package/skills/bug-magnet-data/data/concurrency/state-machines.yaml +115 -0
- package/skills/bug-magnet-data/data/dates/boundaries.yaml +137 -0
- package/skills/bug-magnet-data/data/dates/invalid.yaml +132 -0
- package/skills/bug-magnet-data/data/dates/timezone.yaml +118 -0
- package/skills/bug-magnet-data/data/encoding/charset.yaml +79 -0
- package/skills/bug-magnet-data/data/encoding/normalization.yaml +105 -0
- package/skills/bug-magnet-data/data/formats/email.yaml +154 -0
- package/skills/bug-magnet-data/data/formats/json.yaml +187 -0
- package/skills/bug-magnet-data/data/formats/url.yaml +165 -0
- package/skills/bug-magnet-data/data/language-specific/javascript.yaml +182 -0
- package/skills/bug-magnet-data/data/language-specific/python.yaml +174 -0
- package/skills/bug-magnet-data/data/language-specific/rust.yaml +148 -0
- package/skills/bug-magnet-data/data/numbers/boundaries.yaml +161 -0
- package/skills/bug-magnet-data/data/numbers/precision.yaml +89 -0
- package/skills/bug-magnet-data/data/numbers/special.yaml +69 -0
- package/skills/bug-magnet-data/data/strings/boundaries.yaml +109 -0
- package/skills/bug-magnet-data/data/strings/injection.yaml +208 -0
- package/skills/bug-magnet-data/data/strings/special-chars.yaml +190 -0
- package/skills/bug-magnet-data/data/strings/unicode.yaml +139 -0
- package/skills/bug-magnet-data/references/external-lists.md +115 -0
- package/skills/bulwark-brainstorm/SKILL.md +563 -0
- package/skills/bulwark-brainstorm/references/at-teammate-prompts.md +60 -0
- package/skills/bulwark-brainstorm/references/role-critical-analyst.md +78 -0
- package/skills/bulwark-brainstorm/references/role-development-lead.md +66 -0
- package/skills/bulwark-brainstorm/references/role-product-delivery-lead.md +79 -0
- package/skills/bulwark-brainstorm/references/role-product-manager.md +62 -0
- package/skills/bulwark-brainstorm/references/role-project-sme.md +59 -0
- package/skills/bulwark-brainstorm/references/role-technical-architect.md +66 -0
- package/skills/bulwark-research/SKILL.md +298 -0
- package/skills/bulwark-research/references/viewpoint-contrarian.md +63 -0
- package/skills/bulwark-research/references/viewpoint-direct-investigation.md +62 -0
- package/skills/bulwark-research/references/viewpoint-first-principles.md +65 -0
- package/skills/bulwark-research/references/viewpoint-practitioner.md +62 -0
- package/skills/bulwark-research/references/viewpoint-prior-art.md +66 -0
- package/skills/bulwark-scaffold/SKILL.md +330 -0
- package/skills/bulwark-statusline/SKILL.md +161 -0
- package/skills/bulwark-statusline/scripts/statusline.sh +144 -0
- package/skills/bulwark-verify/SKILL.md +519 -0
- package/skills/code-review/SKILL.md +428 -0
- package/skills/code-review/examples/anti-patterns/linting.ts +181 -0
- package/skills/code-review/examples/anti-patterns/security.ts +91 -0
- package/skills/code-review/examples/anti-patterns/standards.ts +195 -0
- package/skills/code-review/examples/anti-patterns/type-safety.ts +108 -0
- package/skills/code-review/examples/recommended/linting.ts +195 -0
- package/skills/code-review/examples/recommended/security.ts +154 -0
- package/skills/code-review/examples/recommended/standards.ts +231 -0
- package/skills/code-review/examples/recommended/type-safety.ts +181 -0
- package/skills/code-review/frameworks/angular.md +218 -0
- package/skills/code-review/frameworks/django.md +235 -0
- package/skills/code-review/frameworks/express.md +207 -0
- package/skills/code-review/frameworks/flask.md +298 -0
- package/skills/code-review/frameworks/generic.md +146 -0
- package/skills/code-review/frameworks/react.md +152 -0
- package/skills/code-review/frameworks/vue.md +244 -0
- package/skills/code-review/references/linting-patterns.md +221 -0
- package/skills/code-review/references/security-patterns.md +125 -0
- package/skills/code-review/references/standards-patterns.md +246 -0
- package/skills/code-review/references/type-safety-patterns.md +130 -0
- package/skills/component-patterns/SKILL.md +131 -0
- package/skills/component-patterns/references/pattern-cli-command.md +118 -0
- package/skills/component-patterns/references/pattern-database.md +166 -0
- package/skills/component-patterns/references/pattern-external-api.md +139 -0
- package/skills/component-patterns/references/pattern-file-parser.md +168 -0
- package/skills/component-patterns/references/pattern-http-server.md +162 -0
- package/skills/component-patterns/references/pattern-process-spawner.md +133 -0
- package/skills/continuous-feedback/SKILL.md +327 -0
- package/skills/continuous-feedback/references/collect-instructions.md +81 -0
- package/skills/continuous-feedback/references/specialize-code-review.md +82 -0
- package/skills/continuous-feedback/references/specialize-general.md +98 -0
- package/skills/continuous-feedback/references/specialize-test-audit.md +81 -0
- package/skills/create-skill/SKILL.md +359 -0
- package/skills/create-skill/references/agent-conventions.md +194 -0
- package/skills/create-skill/references/agent-template.md +195 -0
- package/skills/create-skill/references/content-guidance.md +291 -0
- package/skills/create-skill/references/decision-framework.md +124 -0
- package/skills/create-skill/references/template-pipeline.md +217 -0
- package/skills/create-skill/references/template-reference-heavy.md +111 -0
- package/skills/create-skill/references/template-research.md +210 -0
- package/skills/create-skill/references/template-script-driven.md +172 -0
- package/skills/create-skill/references/template-simple.md +80 -0
- package/skills/create-subagent/SKILL.md +353 -0
- package/skills/create-subagent/references/agent-conventions.md +268 -0
- package/skills/create-subagent/references/content-guidance.md +232 -0
- package/skills/create-subagent/references/decision-framework.md +134 -0
- package/skills/create-subagent/references/template-single-agent.md +192 -0
- package/skills/fix-bug/SKILL.md +241 -0
- package/skills/governance-protocol/SKILL.md +116 -0
- package/skills/init/SKILL.md +341 -0
- package/skills/issue-debugging/SKILL.md +385 -0
- package/skills/issue-debugging/references/anti-patterns.md +245 -0
- package/skills/issue-debugging/references/debug-report-schema.md +227 -0
- package/skills/mock-detection/SKILL.md +511 -0
- package/skills/mock-detection/references/false-positive-prevention.md +402 -0
- package/skills/mock-detection/references/stub-patterns.md +236 -0
- package/skills/pipeline-templates/SKILL.md +215 -0
- package/skills/pipeline-templates/references/code-change-workflow.md +277 -0
- package/skills/pipeline-templates/references/code-review.md +336 -0
- package/skills/pipeline-templates/references/fix-validation.md +421 -0
- package/skills/pipeline-templates/references/new-feature.md +335 -0
- package/skills/pipeline-templates/references/research-brainstorm.md +161 -0
- package/skills/pipeline-templates/references/research-planning.md +257 -0
- package/skills/pipeline-templates/references/test-audit.md +389 -0
- package/skills/pipeline-templates/references/test-execution-fix.md +238 -0
- package/skills/plan-creation/SKILL.md +497 -0
- package/skills/product-ideation/SKILL.md +372 -0
- package/skills/product-ideation/references/analysis-frameworks.md +161 -0
- package/skills/session-handoff/SKILL.md +139 -0
- package/skills/session-handoff/references/examples.md +223 -0
- package/skills/setup-lsp/SKILL.md +312 -0
- package/skills/setup-lsp/references/server-registry.md +85 -0
- package/skills/setup-lsp/references/troubleshooting.md +135 -0
- package/skills/subagent-output-templating/SKILL.md +415 -0
- package/skills/subagent-output-templating/references/examples.md +440 -0
- package/skills/subagent-prompting/SKILL.md +364 -0
- package/skills/subagent-prompting/references/examples.md +342 -0
- package/skills/test-audit/SKILL.md +531 -0
- package/skills/test-audit/references/known-limitations.md +41 -0
- package/skills/test-audit/references/priority-classification.md +30 -0
- package/skills/test-audit/references/prompts/deep-mode-detection.md +83 -0
- package/skills/test-audit/references/prompts/synthesis.md +57 -0
- package/skills/test-audit/references/rewrite-instructions.md +46 -0
- package/skills/test-audit/references/schemas/audit-output.yaml +100 -0
- package/skills/test-audit/references/schemas/diagnostic-output.yaml +49 -0
- package/skills/test-audit/scripts/data-flow-analyzer.ts +509 -0
- package/skills/test-audit/scripts/integration-mock-detector.ts +462 -0
- package/skills/test-audit/scripts/package.json +20 -0
- package/skills/test-audit/scripts/skip-detector.ts +211 -0
- package/skills/test-audit/scripts/verification-counter.ts +295 -0
- package/skills/test-classification/SKILL.md +310 -0
- package/skills/test-fixture-creation/SKILL.md +295 -0
@@ -0,0 +1,511 @@
---
name: mock-detection
description: Deep mock appropriateness analysis for Test Audit pipeline
user-invocable: false
---

# Mock Detection

Prompt template for deep mock appropriateness analysis using call graph tracing. Designed for a Sonnet sub-agent to detect T1-T4 violations and track violation scope.

---

## When to Use This Skill

**This is an internal skill loaded by the orchestrator during the Test Audit pipeline.**

| Context | Action |
|---------|--------|
| `/test-audit` invoked | Orchestrator loads this skill for Stage 2 |
| Test Audit pipeline triggered by hook | Orchestrator loads this skill for Stage 2 |
| Need deep mock analysis | Load directly as prompt template for Sonnet |
| Files flagged by test-classification | Analyze only `needs_deep_analysis: true` files |

**DO NOT use for:**
- Direct user invocation (not user-invocable)
- Surface-level classification (use `test-classification` skill)
- Full audit synthesis (use `test-audit` skill)

---

## Role in Test Audit Pipeline

This skill provides the **second stage** prompt template:

```
test-audit (P0.8) orchestrates:
  Stage 1: test-classification (Haiku) → classification YAML
  Stage 2: mock-detection (Sonnet) → violations YAML  ← THIS SKILL
  Stage 3: synthesis (Sonnet) → audit report
```

The orchestrator loads this skill and constructs a 4-part prompt for a general-purpose Sonnet sub-agent.

---

## 4-Part Prompt Template

### GOAL

Analyze flagged test files for T1-T4 violations using the mock appropriateness rubric and call graph analysis. Track the full scope of each violation for test effectiveness calculation.

### CONSTRAINTS

- Do NOT modify any files
- Only analyze files with `needs_deep_analysis: true` from classification
- Use call graph analysis to detect broken integration chains
- Track violation scope (all affected lines, not just the violation line)
- Provide full context for each violation (line, snippet, reason, fix)
- Complete within 50 tool calls

### CONTEXT

**Classification output:** `{classification_yaml_path}`

**Files to analyze:** List of files with `needs_deep_analysis: true`

**Mock appropriateness rubric:** See "Mock Appropriateness Rubric" section below

**T1-T4 detection patterns:** See "T1-T4 Detection Patterns" section below

**Violation scope tracking:** See "Violation Scope Tracking" section below

**Extended stub/fake patterns:** See `references/stub-patterns.md` for the Meszaros taxonomy, class hierarchy detection, and factory function classification

**False positive prevention:** See `references/false-positive-prevention.md` for the two-tier allowlist (Universal Safe / Context-Dependent) and decision tree. Consult it BEFORE flagging borderline patterns.

### OUTPUT

Write violations to: `logs/mock-detection-{YYYYMMDD-HHMMSS}.yaml`

Write diagnostics to: `logs/diagnostics/mock-detection-{YYYYMMDD-HHMMSS}.yaml`

Use the schema specified in "Output Schema" section below.
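
As a rough illustration of how an orchestrator might fill in this template, the sketch below assembles the four parts into one prompt string and derives the timestamped output paths. The helper names (`formatTimestamp`, `buildMockDetectionPrompt`), the input shape, and the use of UTC are assumptions made for the example; only the section names and the `logs/mock-detection-{YYYYMMDD-HHMMSS}.yaml` pattern come from this skill.

```javascript
// Hypothetical helper: formats a Date as YYYYMMDD-HHMMSS (UTC is an assumption).
function formatTimestamp(date) {
  const pad = (n) => String(n).padStart(2, "0");
  return (
    `${date.getUTCFullYear()}${pad(date.getUTCMonth() + 1)}${pad(date.getUTCDate())}` +
    `-${pad(date.getUTCHours())}${pad(date.getUTCMinutes())}${pad(date.getUTCSeconds())}`
  );
}

// Hypothetical assembler for the GOAL / CONSTRAINTS / CONTEXT / OUTPUT prompt.
function buildMockDetectionPrompt({ classificationYamlPath, files, now }) {
  const stamp = formatTimestamp(now);
  // Only files flagged by classification are handed to the sub-agent.
  const flagged = files.filter((f) => f.needs_deep_analysis === true);
  const outputPath = `logs/mock-detection-${stamp}.yaml`;
  const diagnosticsPath = `logs/diagnostics/mock-detection-${stamp}.yaml`;
  const prompt = [
    "GOAL: Analyze flagged test files for T1-T4 violations.",
    "CONSTRAINTS: Do NOT modify any files. Complete within 50 tool calls.",
    `CONTEXT: Classification output: ${classificationYamlPath}`,
    `Files to analyze:\n${flagged.map((f) => `- ${f.path}`).join("\n")}`,
    `OUTPUT: Write violations to ${outputPath} and diagnostics to ${diagnosticsPath}`,
  ].join("\n\n");
  return { prompt, outputPath, diagnosticsPath };
}
```

Using one timestamp for both paths keeps the violations file and its diagnostics file paired, which is a design choice of this sketch rather than a stated requirement.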

---

## Mock Appropriateness Rubric

Determine whether mocks are appropriate based on test type:

| Test Type | Expected Mocks (OK) | Inappropriate Mocks (VIOLATION) |
|-----------|---------------------|--------------------------------|
| **Unit** | External deps (DB, HTTP, fs) to isolate unit | Mocking function/module under test (T1) |
| **Integration** | Unrelated systems only | Mocking integration boundaries (T3), broken chain (T3+) |
| **E2E** | Almost never | Any mock breaking end-to-end flow |

### Mixed-Type Files (MANDATORY)

Test files commonly contain multiple test types in different describe blocks (e.g., unit tests at top, integration tests at bottom). You MUST evaluate mock appropriateness **per describe block/section**, not per file. A `jest.fn()` that is safe in a unit test section is a T3 violation in an integration test section of the same file.

Classification signals (language-agnostic; they apply to TypeScript, Python, Java, Go, Ruby, etc.):
- Block/suite name containing keywords: `integration`, `e2e`, `end-to-end`, `acceptance`, `system`
- Preceding comments or section headers: `// INTEGRATION TESTS`, `# E2E`, `/* system tests */`
- Setup patterns within the block (real DB connections = integration, browser launch = e2e)

If AST integration-mock metadata is available (from `just integration-mocks`), use it as ground truth for section boundaries and mock locations. Validate the AST leads and add any that the AST missed.

**BINDING: AST classification is final.** When the AST script classifies a section as integration or e2e, that classification is NOT subject to LLM override. You MUST evaluate mocks in that section against integration/e2e rules, even if you believe the section is "actually" a unit test. Dismissing an AST T3 lead by re-classifying the section as a different test type is a rule violation. If you believe the section is mislabeled, note it as advisory, but still flag T3 violations against the classified type.

See `references/false-positive-prevention.md` § "Worked Example: Mixed-Type File" for a concrete demonstration.
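
The name and comment signals above can be approximated with a small keyword check. The sketch below is illustrative only: `classifyBlock` is a hypothetical helper, the `integration-or-e2e` and `unclassified` labels are assumptions (the skill does not say whether `acceptance` or `system` maps to integration or e2e), and setup-pattern signals are not covered. An AST-provided type, when present, is returned unchanged to reflect the binding rule.

```javascript
// Hypothetical classifier for one describe block/section.
// astType: classification from the AST script, if available (treated as final).
function classifyBlock(suiteName, precedingComment = "", astType = null) {
  if (astType) return astType; // AST classification is not overridden.

  const text = `${suiteName} ${precedingComment}`.toLowerCase();
  if (/\be2e\b|\bend-to-end\b/.test(text)) return "e2e";
  if (/\bintegration\b/.test(text)) return "integration";
  // Assumption: these keywords signal a non-unit block without saying which kind.
  if (/\bacceptance\b|\bsystem\b/.test(text)) return "integration-or-e2e";
  return "unclassified"; // no keyword signal; fall back to other evidence
}
```

Word boundaries keep names such as `filesystem helpers` from matching the `system` keyword.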

### Key Principle

A mock is inappropriate when it **defeats the purpose of the test**:
- Unit test claims to test function X but mocks function X → T1
- Integration test claims to verify HTTP calls but mocks HTTP → T3
- Test verifies function was called but not what it produced → T2

---

## T1-T4 Detection Patterns

### T1: Mocking System Under Test

**Severity:** Critical
**Priority:** P0 (False Confidence)

**Detection patterns:**
- `jest.spyOn(ModuleUnderTest, 'functionBeingTested')`
- `jest.mock('./file-being-tested')`
- `vi.mock()` on the module imported in the test's subject
- Mock intercepts the exact function the test claims to verify

**Call graph check:** Trace from test assertion back to setup. If mock sits between "action" and "assertion" for the claimed behavior, it's T1.

### T2: Verifying Calls Not Results

**Severity:** High
**Priority:** P1 (Incomplete Verification)

**Detection patterns:**
- `expect(fn).toHaveBeenCalled()` without result assertion
- `expect(fn).toHaveBeenCalledWith(...)` without verifying the effect
- `verify(mock).someMethod()` without outcome check

**Call graph check:** After the `toHaveBeenCalled` assertion, is there a result/state assertion for the same operation?
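
A first-pass version of the T2 check can be written as a line scan. This is a heuristic sketch, not the skill's detector: `findCallOnlyAssertions` is a hypothetical name, it only recognizes Jest/Vitest-style matchers, and it treats any later non-call `expect(...)` in the same test body as a result assertion without confirming it targets the same operation.

```javascript
// Matchers that only prove a call happened (Jest/Vitest style).
const CALL_ONLY = /\.toHaveBeenCalled(With|Times)?\(/;
// Any expect(...) line; combined with CALL_ONLY to find result/state assertions.
const ANY_EXPECT = /\bexpect\(/;

// Returns zero-based indices of call-only assertions that are not followed
// by any result/state assertion later in the same test body.
function findCallOnlyAssertions(testBodyLines) {
  const flagged = [];
  testBodyLines.forEach((line, i) => {
    if (!CALL_ONLY.test(line)) return;
    const followedByResult = testBodyLines
      .slice(i + 1)
      .some((later) => ANY_EXPECT.test(later) && !CALL_ONLY.test(later));
    if (!followedByResult) flagged.push(i);
  });
  return flagged;
}
```

Hits from a scan like this are leads for the call graph check, since a regex cannot tell whether the later assertion actually verifies the effect of the mocked call.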

### T3: Mock at Integration Boundary

**Severity:** Critical
**Priority:** P1 (Incomplete Verification)

**Detection patterns:**
- In `.integration.*` file: `jest.mock('node-fetch')`, `jest.mock('fs')`, `jest.mock('http')`
- In integration test: Mocking the very boundary being integrated

**Call graph check:** Does the test claim to verify "integration with X" while mocking X?

### T3+: Broken Integration Chain

**Severity:** Critical
**Priority:** P0 (False Confidence)

**Detection patterns:**
- `mockData` used where real function output should flow
- Data manually constructed instead of flowing from upstream function
- Integration test with hardcoded intermediate values
- Property access on manually-constructed objects used in downstream calls (e.g., `mockOrder.id` passed into new objects)

**Class hierarchy signals** (see `references/stub-patterns.md` for full taxonomy):
- Classes named Fake*/Stub*/Mock*/InMemory*/Test* that implement interfaces or extend base classes
- Manual stub objects without naming conventions (all methods are no-ops or return hardcoded values)
- Factory functions prefixed with `buildMock*` or `createFake*`

**Call graph check:** Trace data flow. If Component A should output to Component B, but test injects `mockAOutput` into B, the chain is broken.
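
The naming conventions among the class hierarchy signals lend themselves to a simple pattern match. The sketch below assumes identifiers and their kinds have already been extracted from the test file; `classifyIdentifier` and its return labels are hypothetical. A name match is only a lead: the signal also requires that the class implements an interface or extends a base class, and manual stubs without naming conventions are not covered here.

```javascript
// Prefixes taken from the class hierarchy signals; the capital letter after
// the prefix avoids matching ordinary words such as "Testimonial".
const DOUBLE_CLASS = /^(Fake|Stub|Mock|InMemory|Test)[A-Z]\w*$/;
const DOUBLE_FACTORY = /^(buildMock|createFake)[A-Z]\w*$/;

// kind: "class" or "function" (assumed to come from an earlier parsing step).
function classifyIdentifier(name, kind) {
  if (kind === "class" && DOUBLE_CLASS.test(name)) return "test-double-class";
  if (kind === "function" && DOUBLE_FACTORY.test(name)) return "test-double-factory";
  return null;
}
```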

### T4: No Test Execution Verification

**Severity:** Medium
**Priority:** P2 (Pattern Issues)

**Detection patterns:**
- Test file exists but `just test` was not run
- Tests pass in isolation but skip in suite
- Manual notation: "Requires V3 manual verification"

**Note:** T4 is primarily a process check. Flag for manual review.

---

## Violation Scope Tracking

Track the **full scope** of each violation - not just the violation line, but all lines affected by it. This enables accurate test effectiveness calculation.

### Scope Calculation Rules

| Violation Type | Scope Definition |
|----------------|------------------|
| **T1 (Mock SUT)** | All lines that use the mock: assertions depending on mock, calls using mock return value |
| **T2 (Call-only)** | The assertion line itself (single line) |
| **T3 (Mock boundary)** | All lines using the mocked boundary (similar to T1) |
| **T3+ (Broken chain)** | All lines using the incorrect/mocked data downstream |

### Example

```javascript
// Line 15: Mock setup (violation line)
const mockSpawn = jest.spyOn(child_process, 'spawn')
  .mockReturnValue(mockProcess);

// Lines 20-95: All use mockSpawn results
const proxy = startProxy();              // Line 20 - uses mock
await proxy.waitForReady();              // Line 21 - uses mock
expect(proxy.port).toBe(8080);           // Line 25 - assertion on mock
// ... more lines using mock ...
expect(proxy.isRunning()).toBe(true);    // Line 95 - still mock

// violation_scope: [15, 95]
// affected_lines: 81 (inclusive range: 95 - 15 + 1)
```

---

## Call Graph Analysis Approach

For each flagged file, perform systematic analysis:

### Step 1: Identify Test Claims

What does each test claim to verify? Look at:
- Test name/description (`it('starts proxy correctly', ...)`)
- Assertion statements (what is being expected?)

### Step 2: Trace Data Flow

How does data flow from setup → action → assertion?
- Where is input created?
- What transforms the input?
- What does the assertion check?

### Step 3: Locate Mock Interception

Where do mocks intercept this flow?
- Is the mock between the action and the claimed verification?
- Does the mock replace real behavior the test should exercise?

### Step 4: Evaluate Appropriateness

Does the mock defeat the test's purpose?
- If test claims "proxy starts" but mocks spawn → T1
- If test claims "API integration" but mocks fetch → T3
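
Once Steps 1-3 have produced a description of one mock, Step 4 reduces to a small decision. The sketch below is a simplified model under stated assumptions: `evaluateMock` and its input fields are hypothetical, `claimedBehaviorCalls` stands for the calls whose real behavior the test claims to verify (judged by the analyst in Steps 1-2), and the `manual-review` result for e2e tests is an assumption, since the rubric names no rule for that case. T2 and T3+ are not modeled.

```javascript
// Hypothetical Step 4 decision for a single mock.
// testType: "unit" | "integration" | "e2e"
// mockedTarget: what the mock replaces, e.g. "child_process.spawn"
// claimedBehaviorCalls: calls the test claims to exercise for real
// boundaries: systems an integration test claims to integrate with
function evaluateMock({ testType, mockedTarget, claimedBehaviorCalls = [], boundaries = [] }) {
  if (testType === "unit" && claimedBehaviorCalls.includes(mockedTarget)) {
    return "T1"; // mock replaces the behavior under test
  }
  if (testType === "integration" && boundaries.includes(mockedTarget)) {
    return "T3"; // mock sits on the boundary being integrated
  }
  if (testType === "e2e") {
    return "manual-review"; // assumption: rubric says mocks are "almost never" OK
  }
  return null; // no violation under this simplified rubric
}
```

For the proxy case above, a unit test claiming "proxy starts" with `child_process.spawn` among its claimed-behavior calls yields `T1`, while a unit test that mocks an unrelated database dependency yields `null`.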

---

## Output Schema

```yaml
metadata:
  skill: mock-detection
  timestamp: "{ISO-8601}"
  classification_source: logs/test-classification-{YYYYMMDD-HHMMSS}.yaml
  model: sonnet
  files_analyzed: 5

violations:
  - file: tests/proxy.test.ts
    line: 15
    violation_scope: [15, 95]
    affected_lines: 81
    rule: T1
    severity: critical
    priority: P0
    pattern: "jest.spyOn(child_process, 'spawn')"
    code_snippet: |
      const mockSpawn = jest.spyOn(child_process, 'spawn')
        .mockReturnValue(mockProcess);
    reason: |
      Test claims to verify "proxy starts correctly" but mocks spawn().
      This provides false confidence - mock always succeeds.
      Lines 15-95 all use this mock, making them ineffective.
    suggested_fix: |
      Replace mock with real spawn. Use port check to verify proxy started.

  - file: tests/api.integration.ts
    line: 8
    violation_scope: [8, 45]
    affected_lines: 38
    rule: T3
    severity: critical
    priority: P1
    pattern: "jest.mock('node-fetch')"
    code_snippet: |
      jest.mock('node-fetch');
      // ... later in test
      const response = await fetchUserData(userId);
    reason: |
      Integration test should verify real HTTP communication.
      Mocking fetch defeats the purpose of integration testing.
    suggested_fix: |
      Remove jest.mock('node-fetch'). Use test server or MSW.

  - file: tests/workflow.integration.ts
    line: 42
    violation_scope: [42, 78]
    affected_lines: 37
    rule: T3+
    severity: critical
    priority: P0
    pattern: "Broken integration chain"
    code_snippet: |
      const result = await processOrder(mockOrderData);
      // mockOrderData should come from createOrder() output
    reason: |
      Test uses mockOrderData instead of real createOrder() output.
      This breaks the integration chain - no real integration tested.
    suggested_fix: |
      Replace mockOrderData with: const order = await createOrder(input);

  - file: tests/config.test.ts
    line: 42
    violation_scope: [42, 42]
    affected_lines: 1
    rule: T2
    severity: high
    priority: P1
    pattern: "expect(db.save).toHaveBeenCalled()"
    code_snippet: |
      await saveConfig(newConfig);
      expect(db.save).toHaveBeenCalled();
    reason: |
      Verifies db.save was called but not what was saved.
      Call verification without result verification is incomplete.
    suggested_fix: |
      Add result verification: expect(saved.value).toBe(newConfig.value);

totals:
  critical: 3
  high: 1
  medium: 0
  low: 0
  total_affected_lines: 157

file_summaries:
  # For each file, compute affected_lines as the UNION of all violation_scope ranges
  # (merge overlapping/identical ranges). Do NOT sum individual affected_lines values.
  # Ranges are inclusive: two violations both scoped to [228, 269] = 42 affected lines, not 84.
  - file: tests/proxy.test.ts
    verification_lines: 95
    affected_lines: 81
    test_effectiveness: 15%
  - file: tests/api.integration.ts
    verification_lines: 55
    affected_lines: 38
    test_effectiveness: 31%
  - file: tests/workflow.integration.ts
    verification_lines: 50
    affected_lines: 37
    test_effectiveness: 26%
  - file: tests/config.test.ts
    verification_lines: 40
    affected_lines: 1
    test_effectiveness: 98%

summary: |
  Analyzed 5 flagged files. Found 4 violations affecting 157 lines.
  3 files below 95% test effectiveness threshold.
  P0 violations (false confidence): proxy.test.ts, workflow.integration.ts
  P1 violations (incomplete): api.integration.ts, config.test.ts
```
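
The union rule in the `file_summaries` comment can be sketched as an interval merge over inclusive `[start, end]` ranges. The function names are hypothetical, and the effectiveness formula (share of verification lines not covered by any violation scope, rounded to a whole percent) is an assumption inferred from the example values, for instance 40 verification lines with 1 affected line giving 98%.

```javascript
// Counts distinct lines covered by a set of inclusive [start, end] ranges.
function countAffectedLines(scopes) {
  const sorted = scopes.map(([s, e]) => [s, e]).sort((a, b) => a[0] - b[0]);
  let total = 0;
  let current = null;
  for (const [start, end] of sorted) {
    if (current && start <= current[1] + 1) {
      // Overlapping or adjacent range: extend the current merged range.
      current[1] = Math.max(current[1], end);
    } else {
      if (current) total += current[1] - current[0] + 1;
      current = [start, end];
    }
  }
  if (current) total += current[1] - current[0] + 1;
  return total;
}

// Assumed formula: percentage of verification lines unaffected by violations.
function testEffectiveness(verificationLines, affectedLines) {
  if (verificationLines <= 0) return 0;
  return Math.round(((verificationLines - affectedLines) * 100) / verificationLines);
}
```

With this sketch, `countAffectedLines([[228, 269], [228, 269]])` evaluates to 42, matching the comment in the schema, and `testEffectiveness(40, 1)` evaluates to 98.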

---

## Diagnostic Output

Write diagnostic output to `logs/diagnostics/mock-detection-{YYYYMMDD-HHMMSS}.yaml`:

```yaml
diagnostic:
  skill: mock-detection
  timestamp: "{ISO-8601}"
  model: sonnet

  execution:
    tool_calls: 35
    files_analyzed: 5
    analysis_depth: "call graph tracing"

  decisions:
    - file: tests/proxy.test.ts
      decision: T1_violation
      call_graph_analysis: |
        Test claims: "proxy starts correctly"
        Action: startProxy() calls child_process.spawn()
        Mock: jest.spyOn intercepts spawn()
        Result: Assertion verifies mock behavior, not real spawn
      confidence: high

    - file: tests/config.test.ts
      decision: T2_violation
      call_graph_analysis: |
        Assertion: toHaveBeenCalled() on db.save
        Missing: No assertion on saved data value
        Scope: Single assertion line (minimal impact)
      confidence: high

  errors: []
```

---

## Priority Classification

### P0: False Confidence

Tests that pass but provide no real assurance:
- **T1**: Mock hides real failures - test always passes regardless of SUT behavior
- **T3+**: Broken chain means integration is never actually tested

### P1: Incomplete Verification

Tests that run real code but don't fully verify it:
- **T2**: Call happened but effect not verified
- **T3**: Integration boundary mocked (partial integration)

### P2: Pattern Issues

Style and organization issues:
- Minor mock patterns
- Test structure recommendations
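
The severity and priority assigned to each rule in this skill can be collected into one lookup, which also makes the `totals` block straightforward to derive. The `RULES` table below restates values from the detection pattern sections; the function name and the choice to throw on an unknown rule are assumptions of this sketch.

```javascript
// Severity and priority per rule, as stated in the T1-T4 sections.
const RULES = {
  T1: { severity: "critical", priority: "P0" },
  T2: { severity: "high", priority: "P1" },
  T3: { severity: "critical", priority: "P1" },
  "T3+": { severity: "critical", priority: "P0" },
  T4: { severity: "medium", priority: "P2" },
};

// Counts violations per severity, in the shape of the `totals` block.
function computeTotals(violations) {
  const totals = { critical: 0, high: 0, medium: 0, low: 0 };
  for (const v of violations) {
    const rule = RULES[v.rule];
    if (!rule) throw new Error(`Unknown rule: ${v.rule}`);
    totals[rule.severity] += 1;
  }
  return totals;
}
```

For one violation each of T1, T3, T3+, and T2, this gives three critical and one high.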

---

## Integration Notes

### Orchestrator Usage

The orchestrator (P0.8) constructs the full prompt by:

1. Loading this skill content
2. Including classification YAML path in CONTEXT
3. Spawning: `Task(subagent_type="general-purpose", model="sonnet", prompt=...)`
4. Reading output from `logs/mock-detection-{YYYYMMDD-HHMMSS}.yaml`

### Upstream Input

From P0.6 (test-classification):
- `needs_deep_analysis: true` file list
- `verification_lines` count per file
- `mock_indicators` as analysis starting points

### Downstream Output

To P0.8 (test-audit synthesis):
- Violation details with scope tracking
- `affected_lines` per file
- Pre-calculated `test_effectiveness` per file

---

## Batching for Scale

When processing many flagged files (>10), the orchestrator must batch detection to avoid context limits.

### Batching Instructions

```
IF flagged_file_count > 10:
  Split flagged files into batches of 10-15
  FOR each batch:
    Spawn Sonnet sub-agent with batch file list
    Include verification_lines from classification for each file
    Collect violations YAML for batch
  Merge all batch results into single detection output
ELSE:
  Process all flagged files in single sub-agent call
```
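
The splitting step might look like the sketch below. `splitIntoBatches` is a hypothetical helper; it caps batches at 15 files and spreads files evenly, which is one reasonable reading of "batches of 10-15". Note that some counts (16 to 19 files, for example) cannot be divided into batches that all fall within 10-15, so this sketch favors the upper bound and allows smaller batches.

```javascript
// Splits flagged files into evenly sized batches of at most maxBatchSize.
// At or below the threshold, everything stays in a single batch.
function splitIntoBatches(files, threshold = 10, maxBatchSize = 15) {
  if (files.length <= threshold) return [files.slice()];

  const batchCount = Math.ceil(files.length / maxBatchSize);
  const baseSize = Math.floor(files.length / batchCount);
  let remainder = files.length % batchCount;

  const batches = [];
  let offset = 0;
  for (let i = 0; i < batchCount; i++) {
    // The first `remainder` batches take one extra file each.
    const size = baseSize + (remainder > 0 ? 1 : 0);
    if (remainder > 0) remainder -= 1;
    batches.push(files.slice(offset, offset + size));
    offset += size;
  }
  return batches;
}
```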

### Batch Merge Strategy

When merging batch results:
1. Combine all `violations` arrays
2. Recalculate `totals` across all batches
3. Combine all `file_summaries`
4. Preserve individual violation details exactly
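
The four merge steps can be sketched as follows, assuming each batch result has already been parsed from YAML into an object with `violations` and `file_summaries` arrays. `mergeBatchResults` is a hypothetical name, and summing per-file `affected_lines` into `total_affected_lines` assumes each file appears in exactly one batch.

```javascript
// Merges parsed batch outputs into one detection result.
function mergeBatchResults(batchResults) {
  // Steps 1 and 4: concatenate violations without copying or altering them.
  const violations = batchResults.flatMap((r) => r.violations);
  // Step 3: concatenate per-file summaries.
  const fileSummaries = batchResults.flatMap((r) => r.file_summaries);

  // Step 2: recompute totals from the merged data instead of adding batch totals.
  const totals = { critical: 0, high: 0, medium: 0, low: 0, total_affected_lines: 0 };
  for (const v of violations) {
    if (v.severity in totals) totals[v.severity] += 1;
  }
  for (const f of fileSummaries) {
    totals.total_affected_lines += f.affected_lines;
  }

  return { violations, totals, file_summaries: fileSummaries };
}
```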

### Parallel Execution

For optimal performance, spawn batch sub-agents in parallel:

```
Task(subagent_type="general-purpose", model="sonnet", prompt=batch1_prompt, run_in_background=true)
Task(subagent_type="general-purpose", model="sonnet", prompt=batch2_prompt, run_in_background=true)
...
```

Read all outputs after completion, then merge.
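
The "start everything, then read all outputs" pattern corresponds to starting every batch before awaiting any of them. In the sketch below, `spawnSubAgent` is a hypothetical stand-in for the `Task` call and is passed in by the caller; it is assumed to return a promise that resolves once the batch output is available.

```javascript
// Starts every batch sub-agent up front, then waits for all of them.
// spawnSubAgent is a caller-supplied, hypothetical wrapper around Task.
async function runBatchesInParallel(batchPrompts, spawnSubAgent) {
  const pending = batchPrompts.map((prompt) =>
    spawnSubAgent({ subagent_type: "general-purpose", model: "sonnet", prompt })
  );
  // Resolves with results in batch order; rejects if any batch fails.
  return Promise.all(pending);
}
```

If partial results should survive a failed batch, `Promise.allSettled` could be used in place of `Promise.all`, at the cost of handling per-batch statuses during the merge.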

---

## References

| Document | Purpose |
|----------|---------|
| `references/stub-patterns.md` | Meszaros test double taxonomy, class hierarchy detection, factory function classification |
| `references/false-positive-prevention.md` | Two-tier allowlist (Universal Safe / Context-Dependent), decision tree for violation evaluation |

## Related Skills

- `test-classification` (P0.6) - Surface classification (upstream)
- `test-audit` (P0.8) - Orchestration and synthesis (downstream)
- `pipeline-templates` (P0.3) - Test Audit pipeline definition