agentic-qe 3.6.4 → 3.6.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (69)
  1. package/.claude/helpers/v3/product-factors/sfdipot-reference-template.html +958 -0
  2. package/.claude/helpers/v3/quality-criteria/evidence-classification.md +116 -0
  3. package/.claude/helpers/v3/quality-criteria/htsm-categories.md +139 -0
  4. package/.claude/helpers/v3/quality-criteria/quality-criteria-reference-template.html +811 -0
  5. package/.claude/helpers/v3/quality-criteria/validate-quality-criteria.ts +167 -0
  6. package/.claude/skills/README.md +21 -8
  7. package/.claude/skills/skills-manifest.json +1 -1
  8. package/README.md +45 -49
  9. package/package.json +1 -1
  10. package/v3/CHANGELOG.md +13 -0
  11. package/v3/assets/agents/v3/README.md +9 -70
  12. package/v3/assets/agents/v3/qe-accessibility-auditor.md +8 -7
  13. package/v3/assets/agents/v3/qe-bdd-generator.md +8 -7
  14. package/v3/assets/agents/v3/qe-chaos-engineer.md +9 -8
  15. package/v3/assets/agents/v3/qe-code-complexity.md +8 -7
  16. package/v3/assets/agents/v3/qe-code-intelligence.md +8 -7
  17. package/v3/assets/agents/v3/qe-contract-validator.md +8 -7
  18. package/v3/assets/agents/v3/qe-coverage-specialist.md +8 -7
  19. package/v3/assets/agents/v3/qe-defect-predictor.md +8 -7
  20. package/v3/assets/agents/v3/qe-dependency-mapper.md +8 -7
  21. package/v3/assets/agents/v3/qe-deployment-advisor.md +8 -7
  22. package/v3/assets/agents/v3/qe-flaky-hunter.md +8 -7
  23. package/v3/assets/agents/v3/qe-fleet-commander.md +8 -7
  24. package/v3/assets/agents/v3/qe-gap-detector.md +8 -7
  25. package/v3/assets/agents/v3/qe-graphql-tester.md +8 -7
  26. package/v3/assets/agents/v3/qe-impact-analyzer.md +8 -7
  27. package/v3/assets/agents/v3/qe-kg-builder.md +8 -7
  28. package/v3/assets/agents/v3/qe-load-tester.md +8 -7
  29. package/v3/assets/agents/v3/qe-message-broker-tester.md +15 -10
  30. package/v3/assets/agents/v3/qe-metrics-optimizer.md +8 -7
  31. package/v3/assets/agents/v3/qe-middleware-validator.md +15 -10
  32. package/v3/assets/agents/v3/qe-mutation-tester.md +8 -7
  33. package/v3/assets/agents/v3/qe-odata-contract-tester.md +17 -12
  34. package/v3/assets/agents/v3/qe-performance-tester.md +8 -7
  35. package/v3/assets/agents/v3/qe-property-tester.md +8 -7
  36. package/v3/assets/agents/v3/qe-qx-partner.md +74 -14
  37. package/v3/assets/agents/v3/qe-regression-analyzer.md +8 -7
  38. package/v3/assets/agents/v3/qe-requirements-validator.md +8 -7
  39. package/v3/assets/agents/v3/qe-responsive-tester.md +8 -7
  40. package/v3/assets/agents/v3/qe-retry-handler.md +8 -7
  41. package/v3/assets/agents/v3/qe-risk-assessor.md +8 -7
  42. package/v3/assets/agents/v3/qe-root-cause-analyzer.md +8 -7
  43. package/v3/assets/agents/v3/qe-sap-idoc-tester.md +16 -11
  44. package/v3/assets/agents/v3/qe-sap-rfc-tester.md +15 -10
  45. package/v3/assets/agents/v3/qe-security-auditor.md +12 -7
  46. package/v3/assets/agents/v3/qe-security-scanner.md +9 -8
  47. package/v3/assets/agents/v3/qe-soap-tester.md +15 -10
  48. package/v3/assets/agents/v3/qe-sod-analyzer.md +17 -12
  49. package/v3/assets/agents/v3/qe-test-architect.md +8 -7
  50. package/v3/assets/agents/v3/qe-transfer-specialist.md +8 -7
  51. package/v3/assets/agents/v3/subagents/qe-code-reviewer.md +8 -7
  52. package/v3/assets/agents/v3/subagents/qe-integration-reviewer.md +8 -7
  53. package/v3/assets/agents/v3/subagents/qe-performance-reviewer.md +8 -7
  54. package/v3/assets/agents/v3/subagents/qe-security-reviewer.md +8 -7
  55. package/v3/assets/agents/v3/subagents/qe-tdd-green.md +8 -7
  56. package/v3/assets/agents/v3/subagents/qe-tdd-red.md +8 -7
  57. package/v3/assets/agents/v3/subagents/qe-tdd-refactor.md +8 -7
  58. package/v3/assets/agents/v3/templates/qx-report-template.html +26 -22
  59. package/v3/dist/cli/bundle.js +97 -19
  60. package/v3/dist/init/phases/11-claude-md.d.ts.map +1 -1
  61. package/v3/dist/init/phases/11-claude-md.js +94 -16
  62. package/v3/dist/init/phases/11-claude-md.js.map +1 -1
  63. package/v3/dist/kernel/constants.d.ts +1 -1
  64. package/v3/dist/kernel/constants.js +1 -1
  65. package/v3/dist/learning/qe-unified-memory.d.ts.map +1 -1
  66. package/v3/dist/learning/qe-unified-memory.js +8 -7
  67. package/v3/dist/learning/qe-unified-memory.js.map +1 -1
  68. package/v3/dist/mcp/bundle.js +80 -1
  69. package/v3/package.json +2 -2
package/.claude/helpers/v3/quality-criteria/evidence-classification.md +116 -0
@@ -0,0 +1,116 @@
+ # Evidence Classification Guide
+
+ Guidelines for classifying evidence in Quality Criteria recommendations.
+
+ ## Evidence Types
+
+ ### Direct Evidence
+ **Definition:** Actual code quote, explicit documentation statement, or measurable fact from source.
+
+ **Requirements:**
+ - Must include `file_path:line_range` reference (e.g., `src/auth/login.ts:45-52`)
+ - Line ranges should be narrow (max 10-15 lines)
+ - Must quote or directly reference the source
+
+ **Examples:**
+ ```
+ Source: src/payment/processor.ts:123-128
+ Type: Direct
+ Finding: No input validation before API call
+ Reasoning: Unvalidated input could enable injection attacks
+ ```
+
+ ### Inferred Evidence
+ **Definition:** Logical deduction from observed patterns, architectural implications, or domain knowledge.
+
+ **Requirements:**
+ - Must show reasoning chain
+ - Can use architectural implications
+ - Should reference what was observed
+
+ **Examples:**
+ ```
+ Source: Architecture review of src/api/
+ Type: Inferred
+ Finding: No rate limiting middleware detected
+ Reasoning: API endpoints could be vulnerable to DoS; need to verify with load testing
+ ```
+
+ ### Claimed Evidence
+ **Definition:** Statement that requires verification - based on assumptions or incomplete data.
+
+ **Requirements:**
+ - Must state "requires verification" or "needs inspection to confirm"
+ - Must NOT speculate about what "could" or "might" happen
+ - Used when source is unavailable or claim needs validation
+
+ **Examples:**
+ ```
+ WRONG: "Could range from efficient to aggressive implementation"
+ RIGHT: "Poll interval not specified - requires code inspection to verify"
+ ```
+
+ ## Evidence Table Format
+
+ ```html
+ <table class="evidence-table">
+ <thead>
+ <tr>
+ <th>Source Reference</th>
+ <th>Type</th>
+ <th>Quality Implication</th>
+ <th>Reasoning</th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr>
+ <td><code>src/auth/session.ts:89-94</code></td>
+ <td><span class="evidence-type direct">Direct</span></td>
+ <td>Session tokens stored without encryption</td>
+ <td class="evidence-reasoning">Credential exposure risk if storage is compromised</td>
+ </tr>
+ </tbody>
+ </table>
+ ```
+
+ ## Source Reference Format
+
+ ### For Specific Code
+ ```
+ file_path:start_line-end_line
+ Example: src/agents/FleetCommanderAgent.ts:847-852
+ ```
+
+ ### For File-Level Metrics
+ ```
+ file_path (metric)
+ Example: src/agents/N8nBaseAgent.ts (683 LOC)
+ ```
+
+ ### For Search Results (No Matches)
+ ```
+ N/A (verified via Glob/Grep search)
+ - NOT: tests/**/n8n/**/*.test.ts (glob pattern)
+ ```
+
+ ## Reasoning Column Guidelines
+
+ The Reasoning column must explain **WHY** something matters, not **WHAT** the code does.
+
+ | WRONG (describes WHAT) | CORRECT (explains WHY) |
+ |------------------------|------------------------|
+ | "Retry logic with exponential backoff" | "Retry pattern handles transient failures; needs edge case testing for timeout exhaustion" |
+ | "Session cookie stored in memory" | "Credential in memory could leak if agent state is serialized to logs" |
+ | "getWorkflow supports forceRefresh flag" | "Cache bypass prevents stale data; but increases load on source system" |
+
+ **Formula:**
+ ```
+ {What the code does} → {Why that matters for quality} → {What could go wrong}
+ ```
+
+ ## Prohibited Patterns
+
+ - **No confidence percentages**: Use evidence types instead of "85% confident"
+ - **No vague blast radius**: Use "affects 19 agents" not "affects many"
+ - **No speculation in Claimed**: Use "requires verification" not "could be X or Y"
+ - **No keyword matching claims**: Show semantic reasoning, not keyword counts
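The `file_path:start_line-end_line` format and the Claimed-evidence rules in the guide above lend themselves to a mechanical check. The TypeScript below is a minimal sketch under stated assumptions, not the package's `validate-quality-criteria.ts`: `parseSourceRef`, `checkEvidence`, and `MAX_RANGE_LINES` are hypothetical names, the 15-line cap takes the upper end of the "max 10-15 lines" guideline, and the verification-phrase pattern is a loose reading of the guide's wording.

```typescript
// Hypothetical sketch: these names are illustrative and not taken from agentic-qe.
type EvidenceType = "Direct" | "Inferred" | "Claimed";

interface SourceRef {
  filePath: string;
  startLine: number;
  endLine: number;
}

// Assumption: upper end of the guide's "max 10-15 lines".
const MAX_RANGE_LINES = 15;

// Parse `file_path:start_line-end_line`; return null for any other shape.
function parseSourceRef(text: string): SourceRef | null {
  const match = /^(.+):(\d+)-(\d+)$/.exec(text.trim());
  if (match === null) return null;
  const filePath = match[1] ?? "";
  const startLine = Number(match[2]);
  const endLine = Number(match[3]);
  if (startLine < 1 || endLine < startLine) return null;
  return { filePath, startLine, endLine };
}

// Return a list of rule violations; an empty list means no problem was found.
function checkEvidence(source: string, type: EvidenceType, reasoning: string): string[] {
  const problems: string[] = [];

  if (type === "Direct") {
    const ref = parseSourceRef(source);
    if (ref === null) {
      problems.push("Direct evidence needs a file_path:start_line-end_line reference");
    } else if (ref.endLine - ref.startLine + 1 > MAX_RANGE_LINES) {
      problems.push("Line range is wider than the assumed 15-line cap");
    }
  }

  if (type === "Claimed") {
    // Assumption: accept close variants of "requires verification" and
    // "needs inspection to confirm", such as "requires code inspection to verify".
    if (!/requires\b.*\bverif|needs\b.*\bconfirm/i.test(reasoning)) {
      problems.push("Claimed evidence must say that it requires verification");
    }
    if (/\b(could|might)\b/i.test(reasoning)) {
      problems.push("Claimed evidence must not speculate with 'could' or 'might'");
    }
  }

  // Prohibited pattern: confidence percentages such as "85% confident".
  if (/\d+\s*%\s*confiden/i.test(reasoning)) {
    problems.push("Use an evidence type instead of a confidence percentage");
  }

  return problems;
}
```

Run against the guide's own examples, the Direct entry for `src/payment/processor.ts:123-128` and the RIGHT Claimed statement produce no problems, while the WRONG Claimed statement is flagged twice: once for the missing verification phrase and once for "Could".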
package/.claude/helpers/v3/quality-criteria/htsm-categories.md +139 -0
@@ -0,0 +1,139 @@
+ # HTSM v6.3 Quality Criteria Categories
+
+ James Bach's Heuristic Test Strategy Model (HTSM) v6.3 Quality Criteria framework.
+
+ ## 1. Capability
+ **Can it perform the required functions?**
+
+ | Subcategory | Focus |
+ |-------------|-------|
+ | Sufficiency | Does it do what it's supposed to? |
+ | Correctness | Does it do it correctly? |
+
+ **Priority Indicators:**
+ - P0: Core business functionality
+ - P1: Important features
+ - P2: Secondary features
+ - P3: Nice-to-have features
+
+ ## 2. Reliability
+ **Will it work well and resist failure?**
+
+ | Subcategory | Focus |
+ |-------------|-------|
+ | Robustness | Can it handle adverse conditions? |
+ | Error Handling | Does it handle errors gracefully? |
+ | Data Integrity | Is data protected from corruption? |
+ | Safety | Does it avoid dangerous behaviors? |
+
+ **Cannot be omitted** - All systems can fail.
+
+ ## 3. Usability
+ **How easy is it for real users?**
+
+ | Subcategory | Focus |
+ |-------------|-------|
+ | Learnability | How quickly can users learn? |
+ | Operability | How easy to operate day-to-day? |
+ | Accessibility | Can users with disabilities use it? |
+
+ ## 4. Charisma
+ **How appealing is the product?**
+
+ | Subcategory | Focus |
+ |-------------|-------|
+ | Aesthetics | Is it visually pleasing? |
+ | Uniqueness | Does it stand out? |
+ | Entrancement | Does it engage users? |
+ | Image | Does it project the right brand? |
+
+ **Note:** "Brand guidelines handled separately" is NOT a valid omission reason. Charisma is about UX testing, not brand documentation.
+
+ ## 5. Security
+ **How well protected against unauthorized use?**
+
+ | Subcategory | Focus |
+ |-------------|-------|
+ | Authentication | Who is using it? |
+ | Authorization | What are they allowed to do? |
+ | Privacy | Is personal data protected? |
+ | Security Holes | Are there vulnerabilities? |
+
+ **Cannot be omitted** - Every system has attack surface.
+
+ ## 6. Scalability
+ **How well does deployment scale?**
+
+ | Subcategory | Focus |
+ |-------------|-------|
+ | Load Handling | Behavior under increased demand |
+ | Resource Efficiency | Resource usage at scale |
+
+ ## 7. Compatibility
+ **Works with external components?**
+
+ | Subcategory | Focus |
+ |-------------|-------|
+ | Application | Works with other applications? |
+ | OS | Works with target operating systems? |
+ | Hardware | Works with target hardware? |
+ | Backward | Works with previous versions? |
+ | Product Footprint | Resource requirements acceptable? |
+
+ ## 8. Performance
+ **How speedy and responsive?**
+
+ | Subcategory | Focus |
+ |-------------|-------|
+ | Response Time | Under various conditions |
+ | Throughput | Data processing capacity |
+ | Efficiency | Resource utilization |
+
+ **Cannot be omitted** - Every system has response time.
+
+ ## 9. Installability
+ **How easily installed?**
+
+ | Subcategory | Focus |
+ |-------------|-------|
+ | System Requirements | Clear and achievable? |
+ | Configuration | Easy to configure? |
+ | Uninstallation | Clean removal? |
+ | Upgrades/Patches | Easy to update? |
+ | Administration | Easy to administer? |
+
+ **Valid omission:** Pure SaaS/browser-based with no client installation.
+
+ ## 10. Development
+ **How well can we create/test/modify?**
+
+ | Subcategory | Focus |
+ |-------------|-------|
+ | Supportability | Easy to support? |
+ | Testability | Easy to test? |
+ | Maintainability | Easy to maintain? |
+ | Portability | Easy to port? |
+ | Localizability | Easy to localize? |
+
+ **Cannot be omitted** - Always applies to software.
+
+ ---
+
+ ## Priority Assignment Guide
+
+ | Priority | Definition | Example |
+ |----------|------------|---------|
+ | **P0 (Critical)** | Failure causes immediate business/user harm | Payment failures, data breaches |
+ | **P1 (High)** | Critical to core user value proposition | Core features not working |
+ | **P2 (Medium)** | Affects satisfaction but not blocking | Secondary features |
+ | **P3 (Low)** | Nice-to-have improvements | Polish, edge case optimization |
+
+ ## Valid vs Invalid Omission Reasons
+
+ | Category | Valid Omission | Invalid Omission |
+ |----------|----------------|------------------|
+ | Installability | "Pure SaaS, no client installation" | "Handled by ops team" |
+ | Charisma | "CLI tool, visual design N/A" | "Brand guidelines separate" |
+ | Compatibility | "Single-platform by contract" | "Will test on main browsers" |
+ | Development | **NEVER** | "Team is experienced" |
+ | Security | **NEVER** | "Internal system only" |
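The omission rules in this file reduce to a small lookup: four categories are marked "Cannot be omitted" (Reliability, Security, Performance, Development), and the table lists one invalid reason for each of three others. The TypeScript below is a minimal sketch of that lookup. `reviewOmission`, `NEVER_OMIT`, `KNOWN_INVALID`, and the `Verdict` labels are hypothetical names, and the exact-match comparison of reasons is an assumption that stands in for a reviewer's judgment.

```typescript
// Hypothetical sketch: names and verdict labels are illustrative, not from agentic-qe.
type Verdict = "never-omittable" | "known-invalid-reason" | "needs-review";

// Categories the guide marks "Cannot be omitted".
const NEVER_OMIT: ReadonlySet<string> = new Set([
  "Reliability",
  "Security",
  "Performance",
  "Development",
]);

// Invalid omission reasons quoted in the guide's table, keyed by category.
const KNOWN_INVALID: Readonly<Record<string, string>> = {
  Installability: "Handled by ops team",
  Charisma: "Brand guidelines separate",
  Compatibility: "Will test on main browsers",
};

// Classify a proposed omission; anything not rejected outright still needs human review.
function reviewOmission(category: string, reason: string): Verdict {
  if (NEVER_OMIT.has(category)) return "never-omittable";
  const invalid = KNOWN_INVALID[category];
  if (invalid !== undefined && invalid.toLowerCase() === reason.trim().toLowerCase()) {
    return "known-invalid-reason";
  }
  return "needs-review";
}
```

A reason that passes this check is not thereby valid: "needs-review" only means the sketch found no rule that rejects it, so a reason such as "Pure SaaS, no client installation" still has to be confirmed against the product.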