agentic-qe 1.6.1 → 1.8.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (194)
  1. package/.claude/skills/sherlock-review/SKILL.md +786 -0
  2. package/CHANGELOG.md +651 -0
  3. package/README.md +52 -8
  4. package/dist/agents/BaseAgent.d.ts +30 -10
  5. package/dist/agents/BaseAgent.d.ts.map +1 -1
  6. package/dist/agents/BaseAgent.js +115 -43
  7. package/dist/agents/BaseAgent.js.map +1 -1
  8. package/dist/agents/CoverageAnalyzerAgent.js +2 -2
  9. package/dist/agents/CoverageAnalyzerAgent.js.map +1 -1
  10. package/dist/agents/FleetCommanderAgent.d.ts +16 -0
  11. package/dist/agents/FleetCommanderAgent.d.ts.map +1 -1
  12. package/dist/agents/FleetCommanderAgent.js +35 -20
  13. package/dist/agents/FleetCommanderAgent.js.map +1 -1
  14. package/dist/agents/LearningAgent.d.ts +2 -2
  15. package/dist/agents/LearningAgent.d.ts.map +1 -1
  16. package/dist/agents/LearningAgent.js +4 -4
  17. package/dist/agents/LearningAgent.js.map +1 -1
  18. package/dist/agents/TestExecutorAgent.d.ts +9 -0
  19. package/dist/agents/TestExecutorAgent.d.ts.map +1 -1
  20. package/dist/agents/TestExecutorAgent.js +60 -0
  21. package/dist/agents/TestExecutorAgent.js.map +1 -1
  22. package/dist/agents/examples/batchAnalyze.d.ts +252 -0
  23. package/dist/agents/examples/batchAnalyze.d.ts.map +1 -0
  24. package/dist/agents/examples/batchAnalyze.js +259 -0
  25. package/dist/agents/examples/batchAnalyze.js.map +1 -0
  26. package/dist/agents/examples/batchGenerate.d.ts +153 -0
  27. package/dist/agents/examples/batchGenerate.d.ts.map +1 -0
  28. package/dist/agents/examples/batchGenerate.js +166 -0
  29. package/dist/agents/examples/batchGenerate.js.map +1 -0
  30. package/dist/agents/generateWithPII.d.ts +128 -0
  31. package/dist/agents/generateWithPII.d.ts.map +1 -0
  32. package/dist/agents/generateWithPII.js +175 -0
  33. package/dist/agents/generateWithPII.js.map +1 -0
  34. package/dist/agents/index.d.ts.map +1 -1
  35. package/dist/agents/index.js +0 -2
  36. package/dist/agents/index.js.map +1 -1
  37. package/dist/agents/lifecycle/AgentLifecycleManager.d.ts +5 -0
  38. package/dist/agents/lifecycle/AgentLifecycleManager.d.ts.map +1 -1
  39. package/dist/agents/lifecycle/AgentLifecycleManager.js +10 -0
  40. package/dist/agents/lifecycle/AgentLifecycleManager.js.map +1 -1
  41. package/dist/cli/commands/agentdb/learn.d.ts.map +1 -1
  42. package/dist/cli/commands/agentdb/learn.js +190 -71
  43. package/dist/cli/commands/agentdb/learn.js.map +1 -1
  44. package/dist/cli/commands/debug/agent.d.ts.map +1 -1
  45. package/dist/cli/commands/debug/agent.js +40 -13
  46. package/dist/cli/commands/debug/agent.js.map +1 -1
  47. package/dist/cli/commands/debug/diagnostics.js +38 -11
  48. package/dist/cli/commands/debug/diagnostics.js.map +1 -1
  49. package/dist/cli/commands/debug/health-check.js +47 -12
  50. package/dist/cli/commands/debug/health-check.js.map +1 -1
  51. package/dist/cli/commands/debug/profile.js +7 -7
  52. package/dist/cli/commands/debug/profile.js.map +1 -1
  53. package/dist/cli/commands/debug/trace.js +4 -4
  54. package/dist/cli/commands/debug/trace.js.map +1 -1
  55. package/dist/cli/commands/debug/troubleshoot.js +41 -27
  56. package/dist/cli/commands/debug/troubleshoot.js.map +1 -1
  57. package/dist/cli/commands/init.d.ts +6 -3
  58. package/dist/cli/commands/init.d.ts.map +1 -1
  59. package/dist/cli/commands/init.js +71 -54
  60. package/dist/cli/commands/init.js.map +1 -1
  61. package/dist/cli/commands/learn/index.d.ts +4 -0
  62. package/dist/cli/commands/learn/index.d.ts.map +1 -1
  63. package/dist/cli/commands/learn/index.js +57 -0
  64. package/dist/cli/commands/learn/index.js.map +1 -1
  65. package/dist/cli/commands/test/clean.d.ts.map +1 -1
  66. package/dist/cli/commands/test/clean.js +26 -9
  67. package/dist/cli/commands/test/clean.js.map +1 -1
  68. package/dist/cli/commands/test/debug.js +6 -7
  69. package/dist/cli/commands/test/debug.js.map +1 -1
  70. package/dist/cli/commands/test/diff.js +4 -37
  71. package/dist/cli/commands/test/diff.js.map +1 -1
  72. package/dist/cli/commands/test/profile.js +7 -40
  73. package/dist/cli/commands/test/profile.js.map +1 -1
  74. package/dist/cli/commands/test/trace.js +4 -37
  75. package/dist/cli/commands/test/trace.js.map +1 -1
  76. package/dist/cli/index.js +14 -0
  77. package/dist/cli/index.js.map +1 -1
  78. package/dist/core/ArtifactWorkflow.d.ts +4 -0
  79. package/dist/core/ArtifactWorkflow.d.ts.map +1 -1
  80. package/dist/core/ArtifactWorkflow.js +34 -13
  81. package/dist/core/ArtifactWorkflow.js.map +1 -1
  82. package/dist/core/coordination/BlackboardCoordination.d.ts +4 -0
  83. package/dist/core/coordination/BlackboardCoordination.d.ts.map +1 -1
  84. package/dist/core/coordination/BlackboardCoordination.js +28 -22
  85. package/dist/core/coordination/BlackboardCoordination.js.map +1 -1
  86. package/dist/core/coordination/ConsensusGating.d.ts +4 -0
  87. package/dist/core/coordination/ConsensusGating.d.ts.map +1 -1
  88. package/dist/core/coordination/ConsensusGating.js +25 -18
  89. package/dist/core/coordination/ConsensusGating.js.map +1 -1
  90. package/dist/core/memory/AgentDBManager.d.ts +5 -0
  91. package/dist/core/memory/AgentDBManager.d.ts.map +1 -1
  92. package/dist/core/memory/AgentDBManager.js +19 -1
  93. package/dist/core/memory/AgentDBManager.js.map +1 -1
  94. package/dist/core/memory/AgentDBService.d.ts.map +1 -1
  95. package/dist/core/memory/AgentDBService.js +6 -3
  96. package/dist/core/memory/AgentDBService.js.map +1 -1
  97. package/dist/core/memory/RealAgentDBAdapter.d.ts +8 -0
  98. package/dist/core/memory/RealAgentDBAdapter.d.ts.map +1 -1
  99. package/dist/core/memory/RealAgentDBAdapter.js +74 -17
  100. package/dist/core/memory/RealAgentDBAdapter.js.map +1 -1
  101. package/dist/core/memory/ReasoningBankAdapter.d.ts +4 -0
  102. package/dist/core/memory/ReasoningBankAdapter.d.ts.map +1 -1
  103. package/dist/core/memory/ReasoningBankAdapter.js +20 -0
  104. package/dist/core/memory/ReasoningBankAdapter.js.map +1 -1
  105. package/dist/core/memory/SwarmMemoryManager.d.ts +8 -0
  106. package/dist/core/memory/SwarmMemoryManager.d.ts.map +1 -1
  107. package/dist/core/memory/SwarmMemoryManager.js +33 -0
  108. package/dist/core/memory/SwarmMemoryManager.js.map +1 -1
  109. package/dist/learning/ImprovementLoop.js +2 -2
  110. package/dist/learning/ImprovementLoop.js.map +1 -1
  111. package/dist/learning/LearningEngine.d.ts +11 -7
  112. package/dist/learning/LearningEngine.d.ts.map +1 -1
  113. package/dist/learning/LearningEngine.js +157 -73
  114. package/dist/learning/LearningEngine.js.map +1 -1
  115. package/dist/learning/StateExtractor.d.ts +1 -1
  116. package/dist/learning/StateExtractor.d.ts.map +1 -1
  117. package/dist/learning/StateExtractor.js +62 -13
  118. package/dist/learning/StateExtractor.js.map +1 -1
  119. package/dist/mcp/handlers/filtered/coverage-analyzer-filtered.d.ts +83 -0
  120. package/dist/mcp/handlers/filtered/coverage-analyzer-filtered.d.ts.map +1 -0
  121. package/dist/mcp/handlers/filtered/coverage-analyzer-filtered.js +130 -0
  122. package/dist/mcp/handlers/filtered/coverage-analyzer-filtered.js.map +1 -0
  123. package/dist/mcp/handlers/filtered/flaky-detector-filtered.d.ts +58 -0
  124. package/dist/mcp/handlers/filtered/flaky-detector-filtered.d.ts.map +1 -0
  125. package/dist/mcp/handlers/filtered/flaky-detector-filtered.js +84 -0
  126. package/dist/mcp/handlers/filtered/flaky-detector-filtered.js.map +1 -0
  127. package/dist/mcp/handlers/filtered/index.d.ts +47 -0
  128. package/dist/mcp/handlers/filtered/index.d.ts.map +1 -0
  129. package/dist/mcp/handlers/filtered/index.js +63 -0
  130. package/dist/mcp/handlers/filtered/index.js.map +1 -0
  131. package/dist/mcp/handlers/filtered/performance-tester-filtered.d.ts +57 -0
  132. package/dist/mcp/handlers/filtered/performance-tester-filtered.d.ts.map +1 -0
  133. package/dist/mcp/handlers/filtered/performance-tester-filtered.js +83 -0
  134. package/dist/mcp/handlers/filtered/performance-tester-filtered.js.map +1 -0
  135. package/dist/mcp/handlers/filtered/quality-assessor-filtered.d.ts +57 -0
  136. package/dist/mcp/handlers/filtered/quality-assessor-filtered.d.ts.map +1 -0
  137. package/dist/mcp/handlers/filtered/quality-assessor-filtered.js +93 -0
  138. package/dist/mcp/handlers/filtered/quality-assessor-filtered.js.map +1 -0
  139. package/dist/mcp/handlers/filtered/security-scanner-filtered.d.ts +54 -0
  140. package/dist/mcp/handlers/filtered/security-scanner-filtered.d.ts.map +1 -0
  141. package/dist/mcp/handlers/filtered/security-scanner-filtered.js +73 -0
  142. package/dist/mcp/handlers/filtered/security-scanner-filtered.js.map +1 -0
  143. package/dist/mcp/handlers/filtered/test-executor-filtered.d.ts +61 -0
  144. package/dist/mcp/handlers/filtered/test-executor-filtered.d.ts.map +1 -0
  145. package/dist/mcp/handlers/filtered/test-executor-filtered.js +117 -0
  146. package/dist/mcp/handlers/filtered/test-executor-filtered.js.map +1 -0
  147. package/dist/mcp/handlers/phase2/Phase2Tools.js +2 -2
  148. package/dist/mcp/handlers/phase2/Phase2Tools.js.map +1 -1
  149. package/dist/mcp/tools/deprecated.d.ts +8 -8
  150. package/dist/scripts/backup-helper.d.ts +64 -0
  151. package/dist/scripts/backup-helper.d.ts.map +1 -0
  152. package/dist/scripts/backup-helper.js +251 -0
  153. package/dist/scripts/backup-helper.js.map +1 -0
  154. package/dist/scripts/migrate-with-backup.d.ts +15 -0
  155. package/dist/scripts/migrate-with-backup.d.ts.map +1 -0
  156. package/dist/scripts/migrate-with-backup.js +194 -0
  157. package/dist/scripts/migrate-with-backup.js.map +1 -0
  158. package/dist/security/pii-tokenization.d.ts +216 -0
  159. package/dist/security/pii-tokenization.d.ts.map +1 -0
  160. package/dist/security/pii-tokenization.js +325 -0
  161. package/dist/security/pii-tokenization.js.map +1 -0
  162. package/dist/utils/Config.d.ts.map +1 -1
  163. package/dist/utils/Config.js +14 -5
  164. package/dist/utils/Config.js.map +1 -1
  165. package/dist/utils/Database.d.ts.map +1 -1
  166. package/dist/utils/Database.js +5 -2
  167. package/dist/utils/Database.js.map +1 -1
  168. package/dist/utils/EmbeddingGenerator.d.ts +35 -0
  169. package/dist/utils/EmbeddingGenerator.d.ts.map +1 -0
  170. package/dist/utils/EmbeddingGenerator.js +72 -0
  171. package/dist/utils/EmbeddingGenerator.js.map +1 -0
  172. package/dist/utils/Logger.d.ts +1 -1
  173. package/dist/utils/Logger.d.ts.map +1 -1
  174. package/dist/utils/Logger.js +4 -4
  175. package/dist/utils/Logger.js.map +1 -1
  176. package/dist/utils/SecurityScanner.js +1 -1
  177. package/dist/utils/SecurityScanner.js.map +1 -1
  178. package/dist/utils/batch-operations.d.ts +215 -0
  179. package/dist/utils/batch-operations.d.ts.map +1 -0
  180. package/dist/utils/batch-operations.js +266 -0
  181. package/dist/utils/batch-operations.js.map +1 -0
  182. package/dist/utils/filtering.d.ts +180 -0
  183. package/dist/utils/filtering.d.ts.map +1 -0
  184. package/dist/utils/filtering.js +288 -0
  185. package/dist/utils/filtering.js.map +1 -0
  186. package/dist/utils/prompt-cache-examples.d.ts +111 -0
  187. package/dist/utils/prompt-cache-examples.d.ts.map +1 -0
  188. package/dist/utils/prompt-cache-examples.js +416 -0
  189. package/dist/utils/prompt-cache-examples.js.map +1 -0
  190. package/dist/utils/prompt-cache.d.ts +305 -0
  191. package/dist/utils/prompt-cache.d.ts.map +1 -0
  192. package/dist/utils/prompt-cache.js +448 -0
  193. package/dist/utils/prompt-cache.js.map +1 -0
  194. package/package.json +6 -3
@@ -0,0 +1,786 @@
1
+ ---
2
+ name: "Sherlock Review"
3
+ description: "Evidence-based investigative code review using deductive reasoning to determine what actually happened versus what was claimed. Use when verifying implementation claims, investigating bugs, validating fixes, or conducting root cause analysis. Elementary approach to finding truth through systematic observation."
4
+ ---
5
+
6
+ # Sherlock Review
7
+
8
+ ## What This Skill Does
9
+
10
+ Conducts methodical, evidence-based investigation of code, tests, and system behavior using Holmesian deductive reasoning. Unlike traditional code reviews that focus on style and best practices, Sherlock Review investigates **what actually happened** versus **what was claimed to happen**, seeing what others miss through systematic observation and logical deduction.
11
+
12
+ ## Prerequisites
13
+
14
+ - Access to codebase and version control history
15
+ - Ability to run tests and reproduce issues
16
+ - Understanding of the domain and system architecture
17
+ - Critical thinking and skepticism
18
+
19
+ ---
20
+
21
+ ## Quick Start (Elementary Method)
22
+
23
+ ### The 3-Step Investigation
24
+
25
+ ```bash
26
+ # 1. OBSERVE: Gather all evidence
27
+ git log --oneline -10
28
+ git diff <commit>
29
+ grep -r "claimed feature" .
30
+
31
+ # 2. DEDUCE: What does the evidence actually show?
32
+ npm test
33
+ git blame <file>
34
+
35
+ # 3. CONCLUDE: Does evidence support the claim?
36
+ # Document findings with evidence
37
+ ```
38
+
39
+ ---
40
+
41
+ ## Investigation Methodology
42
+
43
+ ### Level 1: Initial Observation (The Crime Scene)
44
+
45
+ **Principle**: "You see, but you do not observe. The distinction is clear."
46
+
47
+ #### What to Examine First
48
+
49
+ 1. **The Claim**: What was supposed to happen?
50
+ - PR description
51
+ - Commit messages
52
+ - Issue tickets
53
+ - Documentation updates
54
+
55
+ 2. **The Evidence**: What actually exists?
56
+ - Actual code changes
57
+ - Test coverage
58
+ - Build/test results
59
+ - Runtime behavior
60
+
61
+ 3. **The Timeline**: When did things happen?
62
+ - Commit history
63
+ - File modification times
64
+ - Test execution logs
65
+ - Deployment records
66
+
67
+ #### Evidence Collection Checklist
68
+
69
+ ```markdown
70
+ ## Evidence Collection
71
+
72
+ ### The Claim
73
+ - [ ] Read PR/issue description thoroughly
74
+ - [ ] Note all claimed features/fixes
75
+ - [ ] Identify specific assertions made
76
+ - [ ] Record expected behavior
77
+
78
+ ### The Code
79
+ - [ ] Examine actual file changes
80
+ - [ ] Review implementation details
81
+ - [ ] Check for edge cases
82
+ - [ ] Verify error handling
83
+
84
+ ### The Tests
85
+ - [ ] Count test cases added/modified
86
+ - [ ] Run tests independently
87
+ - [ ] Check test assertions
88
+ - [ ] Verify test coverage
89
+
90
+ ### The Behavior
91
+ - [ ] Run the code locally
92
+ - [ ] Test claimed scenarios
93
+ - [ ] Try edge cases
94
+ - [ ] Reproduce reported fixes
95
+ ```
96
+
97
+ ---
98
+
99
+ ## Level 2: Deductive Analysis (Elementary Reasoning)
100
+
101
+ ### The Sherlock Framework
102
+
103
+ #### 1. Eliminate the Impossible
104
+
105
+ **Method**: Systematically rule out what cannot be true
106
+
107
+ ```markdown
108
+ ## Investigation Notes
109
+
110
+ ### Claim: "Fixed user authentication bug"
111
+
112
+ #### Evidence Review:
113
+ - ✓ Modified auth.js (lines 45-67)
114
+ - ✓ Added 2 new test cases
115
+ - ✗ No changes to login flow
116
+ - ✗ No database migration
117
+ - ✗ Tests pass but don't cover reported scenario
118
+
119
+ #### Deductions:
120
+ - IMPOSSIBLE: Fix covers all auth scenarios (no login flow changes)
121
+ - POSSIBLE: Fix covers specific password reset case
122
+ - LIKELY: Fix is partial, limited to one code path
123
+ ```
124
+
125
+ #### 2. Follow the Evidence Chain
126
+
127
+ **Method**: Connect observable facts to logical conclusions
128
+
129
+ ```markdown
130
+ ## Evidence Chain
131
+
132
+ ### Observation 1: Test passes locally
133
+ ### Observation 2: Test fails in CI
134
+ ### Observation 3: Different Node versions
135
+
136
+ ### Chain of Reasoning:
137
+ 1. Test behavior differs by environment
138
+ 2. Environment difference is Node version
139
+ 3. Code uses Node-version-specific API
140
+ 4. Therefore: Fix is environment-dependent
141
+ 5. Conclusion: Claim of "fixed" is incomplete
142
+ ```
143
+
144
+ #### 3. Question Everything
145
+
146
+ **Critical Questions to Ask**:
147
+
148
+ - Does the code actually do what the commit message claims?
149
+ - Do the tests verify the claimed fix?
150
+ - Can the bug be reproduced under conditions the tests do not cover?
151
+ - Are there edge cases not considered?
152
+ - Does "works on my machine" equal "properly fixed"?
153
+
154
+ ---
155
+
156
+ ## Level 3: Systematic Investigation Process
157
+
158
+ ### Step-by-Step Sherlock Review
159
+
160
+ #### Step 1: Read the Case File
161
+
162
+ ```bash
163
+ # Examine the claim
164
+ git show <commit>
165
+ cat PR_DESCRIPTION.md
166
+
167
+ # Note specific assertions:
168
+ # - "Fixes race condition in async handler"
169
+ # - "Adds comprehensive error handling"
170
+ # - "Improves performance by 40%"
171
+ ```
172
+
173
+ #### Step 2: Examine the Evidence
174
+
175
+ ```bash
176
+ # What actually changed?
177
+ git diff main..feature-branch
178
+
179
+ # Count the facts:
180
+ FILES_CHANGED=$(git diff --name-only main..feature-branch | wc -l)
181
+ LINES_ADDED=$(git diff --numstat main..feature-branch | awk '{ added += $1 } END { print added + 0 }')
182
+ TESTS_ADDED=$(git diff main..feature-branch | grep -c '^+.*test(')
183
+
184
+ echo "Files modified: $FILES_CHANGED (lines added: $LINES_ADDED)"
185
+ echo "Tests added: $TESTS_ADDED"
186
+ ```
187
+
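The counting step above can also be done on the output of `git diff --numstat`, which prints one `added<TAB>removed<TAB>path` line per file and uses `-` for binary files. The following is a minimal sketch, assuming that output has already been captured as a string; the helper name `summarizeNumstat`, the test-file pattern, and the sample paths are hypothetical.

```javascript
// Hypothetical helper: summarize the text printed by
// `git diff --numstat main..feature-branch`.
// Each line has the form "<added>\t<removed>\t<path>"; binary files use "-".
function summarizeNumstat(numstatText, testFilePattern = /\.(test|spec)\.[jt]sx?$/) {
  const summary = { filesChanged: 0, linesAdded: 0, linesRemoved: 0, testFilesChanged: 0 };
  for (const line of numstatText.split('\n')) {
    if (line.trim() === '') continue;
    const [added, removed, ...pathParts] = line.split('\t');
    const path = pathParts.join('\t');
    summary.filesChanged += 1;
    // "-" marks a binary file: count the file, but add nothing to the line totals.
    summary.linesAdded += added === '-' ? 0 : Number(added);
    summary.linesRemoved += removed === '-' ? 0 : Number(removed);
    if (testFilePattern.test(path)) summary.testFilesChanged += 1;
  }
  return summary;
}

// Made-up input, for illustration only.
const sample = [
  '12\t3\tsrc/handlers/async-handler.js',
  '40\t0\ttests/async-handler.test.js',
  '-\t-\tdocs/diagram.png',
].join('\n');

console.log(summarizeNumstat(sample));
```

Counting changed test files is only a proxy for "tests added"; the assertions in those files still need to be read.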
188
+ #### Step 3: Test the Theory
189
+
190
+ ```bash
191
+ # Run claimed fixes through scientific method
192
+ npm test -- --coverage
193
+
194
+ # Test edge cases not covered:
195
+ node scripts/test-edge-cases.js
196
+
197
+ # Reproduce original bug:
198
+ git checkout <bug-commit>
199
+ npm test -- <failing-test>
200
+ git checkout <fix-commit>
201
+ npm test -- <failing-test>
202
+ ```
203
+
204
+ #### Step 4: Cross-Examine the Code
205
+
206
+ **Questions for Code Interrogation**:
207
+
208
+ ```javascript
209
+ // CLAIMED: "Handles all null cases"
210
+ function processData(data) {
211
+ if (data === null) return null; // ✓ Handles null
212
+ return data.items.map(x => x); // ✗ Doesn't handle data.items === null
213
+ }
214
+ // VERDICT: Claim is FALSE - only handles top-level null
215
+ ```
216
+
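The verdict above is stronger when it rests on a reproducible check rather than on reading alone. A minimal sketch, assuming the function under review can be called directly: it feeds `processData` the inputs the claim covers and records which ones throw. The `probe` helper and the input labels are hypothetical.

```javascript
// The function under review, copied from the claim being investigated.
function processData(data) {
  if (data === null) return null;
  return data.items.map(x => x);
}

// Hypothetical probe: call a function with each input and record
// whether it returned or threw, instead of trusting the comment.
function probe(fn, inputs) {
  return inputs.map(({ label, value }) => {
    try {
      fn(value);
      return { label, ok: true };
    } catch (err) {
      return { label, ok: false, error: err.constructor.name };
    }
  });
}

const results = probe(processData, [
  { label: 'null', value: null },
  { label: 'undefined', value: undefined },
  { label: 'items is null', value: { items: null } },
  { label: 'items missing', value: {} },
  { label: 'valid', value: { items: [1, 2] } },
]);

for (const r of results) {
  console.log(`${r.ok ? 'PASS' : 'FAIL'} ${r.label}${r.ok ? '' : ` (${r.error})`}`);
}
```

Only the `null` and `valid` inputs pass; the other three throw a `TypeError`, which is the evidence behind the "only handles top-level null" verdict.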
217
+ #### Step 5: Compile the Evidence Report
218
+
219
+ ```markdown
220
+ ## Sherlock Investigation Report
221
+
222
+ ### Case: PR #123 "Fix race condition in async handler"
223
+
224
+ ### Claimed Facts:
225
+ 1. "Eliminates race condition"
226
+ 2. "Adds mutex locking"
227
+ 3. "100% thread safe"
228
+
229
+ ### Evidence Examined:
230
+ - File: src/handlers/async-handler.js
231
+ - Changes: Added `async/await`, removed callbacks
232
+ - Tests: 2 new tests for async flow
233
+ - Coverage: 85% (was 75%)
234
+
235
+ ### Deductive Analysis:
236
+
237
+ #### Claim 1: "Eliminates race condition"
238
+ **Evidence**:
239
+ - Added `await` to sequential operations
240
+ - No actual mutex/lock mechanism found
241
+ - No test for concurrent requests
242
+
243
+ **Deduction**:
244
+ - Code now sequential, not concurrent
245
+ - Race condition avoided by removing concurrency
246
+ - Not eliminated, just prevented by design change
247
+
248
+ **Verdict**: PARTIALLY TRUE (solved differently than claimed)
249
+
250
+ #### Claim 2: "Adds mutex locking"
251
+ **Evidence**:
252
+ - No mutex library imported
253
+ - No lock variables found
254
+ - No synchronization primitives
255
+
256
+ **Deduction**:
257
+ - No mutex implementation exists
258
+ - Claim is factually incorrect
259
+
260
+ **Verdict**: FALSE
261
+
262
+ #### Claim 3: "100% thread safe"
263
+ **Evidence**:
264
+ - JavaScript is single-threaded
265
+ - Node.js event loop model
266
+ - No worker threads used
267
+
268
+ **Deduction**:
269
+ - "Thread safe" is meaningless in this context
270
+ - Shows misunderstanding of runtime model
271
+
272
+ **Verdict**: NONSENSICAL
273
+
274
+ ### Final Conclusion:
275
+ The fix works but not for the reasons claimed. The race condition is avoided by making operations sequential rather than by adding thread synchronization. Tests verify sequential behavior but don't test the original concurrent scenario.
276
+
277
+ ### Recommendations:
278
+ 1. Update PR description to accurately reflect solution
279
+ 2. Add test for concurrent request handling
280
+ 3. Clarify whether sequential execution is acceptable for performance
281
+ 4. Remove incorrect technical claims about "mutex" and "thread safety"
282
+ ```
283
+
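Recommendation 2 in the sample report asks for a concurrent-request test. Node.js runs JavaScript on a single thread, but calls can still interleave at `await` points, so a read-modify-write sequence can lose updates. The sketch below shows one way such a test could look; the in-memory `store`, `unsafeIncrement`, and `serialize` are hypothetical stand-ins for the real handler and its shared state.

```javascript
// Read-modify-write with an await in the middle: another call can run
// between the read and the write, so updates can be lost.
async function unsafeIncrement(store) {
  const current = store.value;
  await Promise.resolve(); // stands in for I/O such as a database call
  store.value = current + 1;
}

// One way to remove the interleaving: queue each call behind the previous one.
function serialize(fn) {
  let tail = Promise.resolve();
  return (...args) => {
    const run = tail.then(() => fn(...args));
    tail = run.catch(() => {}); // keep the queue usable after a failure
    return run;
  };
}

// Fire `times` calls at once against a fresh store and report the final value.
async function runConcurrently(increment, times) {
  const store = { value: 0 };
  await Promise.all(Array.from({ length: times }, () => increment(store)));
  return store.value;
}

async function main() {
  const unsafeTotal = await runConcurrently(unsafeIncrement, 5);
  const safeTotal = await runConcurrently(serialize(unsafeIncrement), 5);
  console.log({ unsafeTotal, safeTotal });
  return { unsafeTotal, safeTotal };
}

const outcome = main();
```

With five simultaneous calls, the unsafe version ends at 1 because every call reads 0 before any call writes, while the serialized version ends at 5. A test of this shape exercises the original concurrent scenario instead of only the sequential path.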
284
+ ---
285
+
286
+ ## Level 4: Advanced Investigation Techniques
287
+
288
+ ### Technique 1: The Timeline Reconstruction
289
+
290
+ **Purpose**: Understand the sequence of events leading to current state
291
+
292
+ ```bash
293
+ # Build the timeline
294
+ git log --all --graph --oneline --decorate
295
+
296
+ # Examine critical commits
297
+ git log --grep="fix" --grep="bug" --all-match
298
+
299
+ # Find when bug was introduced
300
+ git bisect start
301
+ git bisect bad HEAD
302
+ git bisect good v1.0.0
303
+ ```
304
+
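`git bisect` finds the first bad commit by binary search between a known-good and a known-bad commit. The sketch below illustrates that idea on a plain array, assuming history is linear and the bug, once introduced, stays present. `findFirstBad`, the commit labels, and the `isBad` check are hypothetical; this is not how Git itself is implemented.

```javascript
// commits: ordered oldest -> newest; commits[0] is known good and the
// last entry is known bad.
// isBad: callback that tests one commit and returns true if the bug shows.
function findFirstBad(commits, isBad) {
  let good = 0;                  // index of the newest commit known to be good
  let bad = commits.length - 1;  // index of the oldest commit known to be bad
  let checks = 0;
  while (bad - good > 1) {
    const mid = Math.floor((good + bad) / 2);
    checks += 1;
    if (isBad(commits[mid])) bad = mid;
    else good = mid;
  }
  return { firstBad: commits[bad], checks };
}

// Made-up history in which the bug appears at c5.
const history = ['c0', 'c1', 'c2', 'c3', 'c4', 'c5', 'c6', 'c7'];
const buggyFrom = 5;
const result = findFirstBad(history, (c) => history.indexOf(c) >= buggyFrom);
console.log(result);
```

For these eight commits the search needs three checks to land on `c5`, which is why bisecting scales to long histories where reading every commit would not.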
305
+ ### Technique 2: The Behavioral Analysis
306
+
307
+ **Purpose**: Observe what the code actually does, not what it's supposed to do
308
+
309
+ ```javascript
310
+ // Add instrumentation
311
+ console.log('[SHERLOCK] Entering function with:', arguments);
312
+ console.log('[SHERLOCK] State before:', JSON.stringify(state));
313
+ // ... original code ...
314
+ console.log('[SHERLOCK] State after:', JSON.stringify(state));
315
+ console.log('[SHERLOCK] Returning:', result);
316
+ ```
317
+
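The same instrumentation can be applied without editing the function body by wrapping the function instead. A minimal sketch for synchronous functions whose arguments and results are JSON-serializable; `instrument`, its `log` parameter, and `applyDiscount` are hypothetical names used only for illustration.

```javascript
// Hypothetical wrapper: log inputs, output, and errors of a function
// without touching its body.
function instrument(fn, label, log = console.log) {
  return function instrumented(...args) {
    log(`[SHERLOCK] ${label} called with: ${JSON.stringify(args)}`);
    try {
      const result = fn.apply(this, args);
      log(`[SHERLOCK] ${label} returned: ${JSON.stringify(result)}`);
      return result;
    } catch (err) {
      // Record the failure, then rethrow so behavior is unchanged.
      log(`[SHERLOCK] ${label} threw: ${err.message}`);
      throw err;
    }
  };
}

// Usage with a made-up function under investigation.
const applyDiscount = (price, percent) => price - (price * percent) / 100;
const traced = instrument(applyDiscount, 'applyDiscount');
traced(200, 15);
```

Because the wrapper rethrows errors and returns the original result, observed behavior stays the same apart from the log lines. Passing a custom `log` function makes it possible to collect the trace in a test instead of printing it.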
318
+ ### Technique 3: The Stress Test
319
+
320
+ **Purpose**: Find limits and breaking points
321
+
322
+ ```bash
323
+ # Test boundary conditions by repeated runs (--iterations is illustrative; use your test runner's equivalent)
324
+ npm test -- --iterations=10000
325
+
326
+ # Test with invalid inputs
327
+ echo '{"invalid": null}' | node src/process.js
328
+
329
+ # Test resource exhaustion
330
+ ab -n 10000 -c 100 http://localhost:3000/api/endpoint
331
+ ```
332
+
333
+ ### Technique 4: The Forensic Diff
334
+
335
+ **Purpose**: Understand what changed and why
336
+
337
+ ```bash
338
+ # Compare claimed vs actual changes
339
+ git diff --word-diff main..feature-branch
340
+
341
+ # Find silent changes (no commit message mention)
342
+ git diff main..feature-branch | grep -A5 -B5 "security\|auth\|password"
343
+
344
+ # Detect code that was removed
345
+ git diff main..feature-branch | grep "^-" | grep -v "^---"
346
+ ```
347
+
348
+ ---
349
+
350
+ ## Investigation Templates
351
+
352
+ ### Template 1: Bug Fix Verification
353
+
354
+ ```markdown
355
+ ## Sherlock Investigation: Bug Fix Verification
356
+
357
+ ### The Bug Report
358
+ - **Reported**: [date]
359
+ - **Severity**: [P0/P1/P2/P3]
360
+ - **Symptoms**: [what users observed]
361
+ - **Expected**: [what should happen]
362
+
363
+ ### The Claimed Fix
364
+ - **PR**: #[number]
365
+ - **Commit**: [hash]
366
+ - **Description**: [claimed solution]
367
+
368
+ ### Evidence Collection
369
+
370
+ #### 1. Reproduce Original Bug
371
+ - [ ] Checkout commit before fix
372
+ - [ ] Follow reproduction steps
373
+ - [ ] Confirm bug exists
374
+ - [ ] Document observed behavior
375
+
376
+ #### 2. Verify Fix
377
+ - [ ] Checkout commit with fix
378
+ - [ ] Follow same reproduction steps
379
+ - [ ] Confirm bug is resolved
380
+ - [ ] Test edge cases
381
+
382
+ #### 3. Code Analysis
383
+ - [ ] Review actual code changes
384
+ - [ ] Verify logic addresses root cause
385
+ - [ ] Check for side effects
386
+ - [ ] Assess test coverage
387
+
388
+ ### Deductive Analysis
389
+
390
+ **Root Cause Claimed**: [what PR says]
391
+ **Root Cause Actual**: [what evidence shows]
392
+
393
+ **Fix Mechanism Claimed**: [how PR says it works]
394
+ **Fix Mechanism Actual**: [how it actually works]
395
+
396
+ **Coverage Claimed**: [scenarios PR claims to handle]
397
+ **Coverage Actual**: [scenarios actually handled]
398
+
399
+ ### Verdict
400
+
401
+ - [ ] Bug is fully fixed
402
+ - [ ] Bug is partially fixed
403
+ - [ ] Bug is not fixed (claim is false)
404
+ - [ ] Bug is fixed but new bugs introduced
405
+
406
+ ### Evidence Summary
407
+ [Concise summary of findings with proof]
408
+
409
+ ### Recommendations
410
+ 1. [Action based on evidence]
411
+ 2. [Action based on evidence]
412
+ ```
413
+
414
+ ### Template 2: Feature Implementation Review
415
+
416
+ ```markdown
417
+ ## Sherlock Investigation: Feature Implementation
418
+
419
+ ### The Feature Request
420
+ - **Requirement**: [what was requested]
421
+ - **Acceptance Criteria**: [how to verify]
422
+ - **User Story**: [why it's needed]
423
+
424
+ ### The Implementation Claim
425
+ - **PR**: #[number]
426
+ - **Description**: [what PR claims to deliver]
427
+ - **Scope**: [claimed completeness]
428
+
429
+ ### Evidence Examination
430
+
431
+ #### Code Changes
432
+ ~~~bash
433
+ git diff main..feature-branch --stat
434
+ ~~~
435
+
436
+ - Files changed: [count]
437
+ - Lines added: [count]
438
+ - Lines removed: [count]
439
+ - Tests added: [count]
440
+
441
+ #### Acceptance Criteria Testing
442
+
443
+ | Criterion | Claimed | Tested | Verdict |
444
+ |-----------|---------|--------|---------|
445
+ | AC1: [criterion] | ✓ | [yes/no] | [pass/fail] |
446
+ | AC2: [criterion] | ✓ | [yes/no] | [pass/fail] |
447
+ | AC3: [criterion] | ✓ | [yes/no] | [pass/fail] |
448
+
449
+ ### Deductive Analysis
450
+
451
+ **Claim**: [what PR says is implemented]
452
+
453
+ **Evidence**:
454
+ - [Fact 1 from code]
455
+ - [Fact 2 from tests]
456
+ - [Fact 3 from behavior]
457
+
458
+ **Deduction**:
459
+ - [Logical conclusion from evidence]
460
+
461
+ **Verdict**: [supported/partially supported/not supported by evidence]
462
+
463
+ ### Missing Elements
464
+ - [ ] [Feature aspect not implemented]
465
+ - [ ] [Test scenario not covered]
466
+ - [ ] [Edge case not handled]
467
+
468
+ ### Conclusion
469
+ [Evidence-based assessment of implementation completeness]
470
+ ```
471
+
472
+ ### Template 3: Performance Claim Verification
473
+
474
+ ```markdown
475
+ ## Sherlock Investigation: Performance Claims
476
+
477
+ ### The Claim
478
+ "Improved performance by [X]% in [scenario]"
479
+
480
+ ### Investigation Setup
481
+
482
+ #### Baseline Measurement
483
+ ~~~bash
484
+ git checkout [before-commit]
485
+ npm run benchmark > baseline.txt
486
+ ~~~
487
+
488
+ #### Post-Fix Measurement
489
+ ~~~bash
490
+ git checkout [after-commit]
491
+ npm run benchmark > optimized.txt
492
+ ~~~
493
+
494
+ ### Evidence Collection
495
+
496
+ #### Benchmark Results
497
+
498
+ | Metric | Before | After | Improvement | Claimed |
499
+ |--------|--------|-------|-------------|---------|
500
+ | Latency | [ms] | [ms] | [%] | [%] |
501
+ | Throughput | [req/s] | [req/s] | [%] | [%] |
502
+ | Memory | [MB] | [MB] | [%] | [%] |
503
+ | CPU | [%] | [%] | [%] | [%] |
504
+
505
+ ### Deductive Analysis
506
+
507
+ **Claimed Improvement**: [X]%
508
+ **Measured Improvement**: [Y]%
509
+ **Variance**: [X-Y]%
510
+
511
+ **Measurement Conditions**:
512
+ - Environment: [prod/dev/local]
513
+ - Load: [concurrent users/requests]
514
+ - Data size: [records/MB]
515
+
516
+ **Verdict**:
517
+ - [ ] Claim supported by evidence
518
+ - [ ] Claim exaggerated (actual: [Y]%)
519
+ - [ ] Claim not reproducible
520
+ - [ ] Claim based on cherry-picked scenario
521
+
522
+ ### Conclusion
523
+ [Evidence-based assessment with actual numbers]
524
+ ```
525
+
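The "Claimed Improvement" versus "Measured Improvement" comparison in the template can be computed rather than estimated. A minimal sketch for a lower-is-better metric such as latency; `improvementPercent`, `assessClaim`, the 5-point `tolerance`, and the sample numbers are all assumptions made for illustration, not part of any benchmark tool.

```javascript
// Relative improvement for a "lower is better" metric such as latency.
function improvementPercent(before, after) {
  if (before <= 0) throw new RangeError('before must be positive');
  return ((before - after) / before) * 100;
}

// Hypothetical verdict rule: the claim counts as "supported" when the
// measured improvement is at most `tolerance` percentage points below it.
function assessClaim(claimedPercent, before, after, tolerance = 5) {
  const measured = improvementPercent(before, after);
  const variance = claimedPercent - measured;
  let verdict;
  if (measured <= 0) verdict = 'not reproducible';
  else if (variance > tolerance) verdict = 'exaggerated';
  else verdict = 'supported';
  return { measured, variance, verdict };
}

// Made-up numbers: a claimed 40% latency improvement, measured 120 ms -> 90 ms.
console.log(assessClaim(40, 120, 90));
```

In this example the measured improvement is 25%, 15 points short of the claim, so the verdict is "exaggerated". The tolerance should reflect the run-to-run noise of the benchmark, which is why both measurements need identical conditions.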
526
+ ---
527
+
528
+ ## Holmesian Principles for QE
529
+
530
+ ### Principle 1: "Data! Data! Data!"
531
+
532
+ > "I can't make bricks without clay."
533
+
534
+ **Application**: Collect comprehensive evidence before forming conclusions
535
+
536
+ - Logs, traces, metrics
537
+ - Test results, coverage reports
538
+ - Code diffs, git history
539
+ - Reproduction steps
540
+
541
+ ### Principle 2: "Eliminate the Impossible"
542
+
543
+ > "When you have eliminated the impossible, whatever remains, however improbable, must be the truth."
544
+
545
+ **Application**: Use negative testing and boundary analysis
546
+
547
+ - Test what should NOT happen
548
+ - Verify constraints are enforced
549
+ - Check impossible inputs are rejected
550
+ - Validate error handling paths
551
+
552
+ ### Principle 3: "Observe, Don't Assume"
553
+
554
+ > "You see, but you do not observe."
555
+
556
+ **Application**: Run the code, don't just read it
557
+
558
+ - Execute tests locally
559
+ - Step through debugger
560
+ - Profile performance
561
+ - Monitor resource usage
562
+
563
+ ### Principle 4: "The Little Things Matter"
564
+
565
+ > "It has long been an axiom of mine that the little things are infinitely the most important."
566
+
567
+ **Application**: Pay attention to details others miss
568
+
569
+ - Off-by-one errors
570
+ - Null/undefined handling
571
+ - Timezone conversions
572
+ - Race conditions
573
+ - Memory leaks
574
+
575
+ ### Principle 5: "Question Everything"
576
+
577
+ > "I never guess. It is a capital mistake to theorize before one has data."
578
+
579
+ **Application**: Verify all claims empirically
580
+
581
+ - Don't trust commit messages
582
+ - Don't trust documentation
583
+ - Don't trust "it works on my machine"
584
+ - Trust only reproducible evidence
585
+
586
+ ---
587
+
588
+ ## The Sherlock Review Checklist
589
+
590
+ Before approving any PR, verify:
591
+
592
+ ### Evidence-Based Review
593
+
594
+ - [ ] **Claim vs Reality**: Does code match description?
595
+ - [ ] **Tests Verify Claims**: Do tests actually prove the fix/feature?
596
+ - [ ] **Reproducible**: Can you reproduce the bug/feature locally?
597
+ - [ ] **Edge Cases**: Are boundary conditions tested?
598
+ - [ ] **Negative Cases**: Are failure paths tested?
599
+
600
+ ### Deductive Reasoning
601
+
602
+ - [ ] **Root Cause**: Does fix address actual root cause?
603
+ - [ ] **Side Effects**: Could this break something else?
604
+ - [ ] **Performance**: Any evidence for performance claims?
605
+ - [ ] **Security**: Any security implications?
606
+ - [ ] **Assumptions**: Are all assumptions validated?
607
+
608
+ ### Observational Analysis
609
+
610
+ - [ ] **Code Quality**: Is code doing what it appears to do?
611
+ - [ ] **Error Handling**: Are errors handled or just hidden?
612
+ - [ ] **Resource Management**: Are resources properly managed?
613
+ - [ ] **Concurrency**: Any race conditions or deadlocks?
614
+ - [ ] **Data Validation**: Is input validated?
615
+
616
+ ### Timeline Verification
617
+
618
+ - [ ] **Related Changes**: Are there related commits?
619
+ - [ ] **Regression Risk**: Could this reintroduce old bugs?
620
+ - [ ] **Dependencies**: Are dependency changes necessary?
621
+ - [ ] **Migration Path**: Is there a migration and rollback plan?
622
+
623
+ ---
624
+
625
+ ## Common Investigation Scenarios
626
+
627
+ ### Scenario 1: "This Fixed the Bug"
628
+
629
+ **Investigation Steps**:
630
+ 1. Reproduce bug on commit before fix
631
+ 2. Verify bug is gone on commit with fix
632
+ 3. Check if fix addresses root cause or just symptom
633
+ 4. Test edge cases not in original bug report
634
+ 5. Verify no regression in related functionality
635
+
636
+ **Red Flags**:
637
+ - Bug "fix" that just removes error logging
638
+ - Fix that works only for specific test case
639
+ - Fix that introduces workarounds instead of solving root cause
640
+ - No test added to prevent regression
641
+
642
+ ### Scenario 2: "Improved Performance by 50%"
643
+
644
+ **Investigation Steps**:
645
+ 1. Run benchmark on baseline commit
646
+ 2. Run same benchmark on optimized commit
647
+ 3. Compare results in identical conditions
648
+ 4. Verify measurement methodology
649
+ 5. Test under realistic load
650
+
651
+ **Red Flags**:
652
+ - Performance tested only on toy data
653
+ - Comparison uses different conditions
654
+ - "Improvement" in non-critical path
655
+ - Trade-off not mentioned (e.g., memory for speed)
656
+
657
+ ### Scenario 3: "Added Comprehensive Error Handling"
658
+
659
+ **Investigation Steps**:
660
+ 1. List all error paths in code
661
+ 2. Verify each path has handling
662
+ 3. Test each error condition
663
+ 4. Check error messages are actionable
664
+ 5. Verify errors are logged/monitored
665
+
666
+ **Red Flags**:
667
+ - Errors caught but ignored (`catch {}`)
668
+ - Generic error messages
669
+ - Errors handled by crashing
670
+ - No logging of critical errors
671
+
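The first red flag, errors caught but ignored, can be searched for mechanically before reading the code in detail. The sketch below is a rough heuristic, assuming JavaScript or TypeScript source passed in as a string; `findSwallowedErrors` is a hypothetical helper, and a regular expression is not a parser, so it can miss cases or match text inside strings and comments.

```javascript
// Heuristic: a catch clause whose block contains nothing but whitespace.
const EMPTY_CATCH = /catch\s*(\([^)]*\))?\s*\{\s*\}/g;

function findSwallowedErrors(source) {
  const findings = [];
  for (const match of source.matchAll(EMPTY_CATCH)) {
    // Line number = number of newlines before the match, plus one.
    const line = source.slice(0, match.index).split('\n').length;
    findings.push({ line, text: match[0] });
  }
  return findings;
}

// Made-up source to scan.
const sampleSource = [
  'try { risky(); } catch (e) {}',
  'try { risky(); } catch { }',
  'try { risky(); } catch (e) { logger.error(e); throw e; }',
].join('\n');

console.log(findSwallowedErrors(sampleSource));
```

The first two lines of the sample are reported and the third is not, because its handler logs and rethrows. Each hit is a lead to investigate, not a verdict: an empty handler is occasionally deliberate, and a non-empty one can still hide the error.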
672
+ ---
673
+
674
+ ## Output Format
675
+
676
+ ### The Sherlock Report
677
+
678
+ ```markdown
679
+ # Sherlock Investigation Report
680
+
681
+ **Case**: [PR/Issue number and title]
682
+ **Investigator**: [Your name]
683
+ **Date**: [Investigation date]
684
+
685
+ ## Summary
686
+ [One paragraph: What was claimed, what was found, verdict]
687
+
688
+ ## Claims Examined
689
+ 1. [Claim 1]
690
+ 2. [Claim 2]
691
+ 3. [Claim 3]
692
+
693
+ ## Evidence Collected
694
+ - Code changes: [summary]
695
+ - Tests added: [count and coverage]
696
+ - Benchmarks: [results]
697
+ - Manual testing: [scenarios tested]
698
+
699
+ ## Deductive Analysis
700
+
701
+ ### Claim 1: [Claim text]
702
+ **Evidence**: [What you found]
703
+ **Deduction**: [Logical conclusion]
704
+ **Verdict**: ✓ TRUE / ✗ FALSE / ⚠ PARTIALLY TRUE
705
+
706
+ [Repeat for each claim]
707
+
708
+ ## Findings
709
+
710
+ ### What Works
711
+ - [Positive finding with evidence]
712
+
713
+ ### What Doesn't Work
714
+ - [Issue found with evidence]
715
+
716
+ ### What's Missing
717
+ - [Gap in implementation/testing]
718
+
719
+ ## Overall Verdict
720
+
721
+ - [ ] Approve: Claims fully supported by evidence
722
+ - [ ] Approve with Reservations: Claims mostly accurate
723
+ - [ ] Request Changes: Claims not supported by evidence
724
+ - [ ] Reject: Claims are false or misleading
725
+
726
+ ## Recommendations
727
+ 1. [Action item based on findings]
728
+ 2. [Action item based on findings]
729
+
730
+ ---
731
+
732
+ **Elementary Evidence**: [Link to detailed evidence files/logs]
733
+ **Reproducible**: [Yes/No - Can others verify your findings?]
734
+ ```
735
+
736
+ ---
737
+
738
+ ## Integration with AQE Fleet
739
+
740
+ ### Use Sherlock Review With:
741
+
742
+ 1. **qe-code-reviewer**: After automated review, investigate flagged issues
743
+ 2. **qe-security-auditor**: Verify security fix claims
744
+ 3. **qe-performance-validator**: Validate performance improvement claims
745
+ 4. **qe-flaky-test-hunter**: Investigate "test fixed" claims
746
+ 5. **production-validator**: Verify deployment-ready claims
747
+
748
+ ### Workflow Integration
749
+
750
+ ```bash
751
+ # 1. Automated review flags issues
752
+ aqe review --pr 123
753
+
754
+ # 2. Sherlock investigates flagged claims
755
+ # [Apply Sherlock methodology to each flag]
756
+
757
+ # 3. Document evidence-based findings
758
+ # [Generate Sherlock report]
759
+
760
+ # 4. Provide actionable feedback
761
+ # [Based on evidence, not assumptions]
762
+ ```
763
+
764
+ ---
765
+
766
+ ## Learn More
767
+
768
+ ### Recommended Reading
769
+ - "The Adventure of Silver Blaze" - Importance of negative evidence
770
+ - "A Scandal in Bohemia" - Observation vs. seeing
771
+ - "The Boscombe Valley Mystery" - Following the evidence chain
772
+
773
+ ### Related QE Skills
774
+ - `brutal-honesty-review` - Direct technical criticism
775
+ - `context-driven-testing` - Adapt to specific context
776
+ - `exploratory-testing-advanced` - Investigation techniques
777
+ - `bug-reporting-excellence` - Document findings clearly
778
+
779
+ ---
780
+
781
+ **Created**: 2025-11-15
782
+ **Category**: Quality Engineering
783
+ **Approach**: Evidence-Based Investigation
784
+ **Philosophy**: "Elementary" - Trust only what can be proven
785
+
786
+ *"It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts."* - Sherlock Holmes