@mclawnet/agent 0.5.9 → 0.6.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (81)
  1. package/cli.js +168 -61
  2. package/dist/__tests__/cli.test.d.ts +2 -0
  3. package/dist/__tests__/cli.test.d.ts.map +1 -0
  4. package/dist/__tests__/service-config.test.d.ts +2 -0
  5. package/dist/__tests__/service-config.test.d.ts.map +1 -0
  6. package/dist/__tests__/service-linux.test.d.ts +2 -0
  7. package/dist/__tests__/service-linux.test.d.ts.map +1 -0
  8. package/dist/__tests__/service-macos.test.d.ts +2 -0
  9. package/dist/__tests__/service-macos.test.d.ts.map +1 -0
  10. package/dist/__tests__/service-windows.test.d.ts +2 -0
  11. package/dist/__tests__/service-windows.test.d.ts.map +1 -0
  12. package/dist/backend-adapter.d.ts +2 -0
  13. package/dist/backend-adapter.d.ts.map +1 -1
  14. package/dist/chunk-CBZIH6FY.js +93 -0
  15. package/dist/chunk-CBZIH6FY.js.map +1 -0
  16. package/dist/{chunk-KHPEQTWF.js → chunk-GLO5OZAY.js} +203 -213
  17. package/dist/chunk-GLO5OZAY.js.map +1 -0
  18. package/dist/chunk-RO47ET27.js +88 -0
  19. package/dist/chunk-RO47ET27.js.map +1 -0
  20. package/dist/hub-connection.d.ts.map +1 -1
  21. package/dist/index.js +5 -3
  22. package/dist/index.js.map +1 -1
  23. package/dist/linux-6AR7SXHW.js +176 -0
  24. package/dist/linux-6AR7SXHW.js.map +1 -0
  25. package/dist/macos-XVPWIH4C.js +174 -0
  26. package/dist/macos-XVPWIH4C.js.map +1 -0
  27. package/dist/service/config.d.ts +19 -0
  28. package/dist/service/config.d.ts.map +1 -0
  29. package/dist/service/index.d.ts +6 -0
  30. package/dist/service/index.d.ts.map +1 -0
  31. package/dist/service/index.js +47 -0
  32. package/dist/service/index.js.map +1 -0
  33. package/dist/service/linux.d.ts +18 -0
  34. package/dist/service/linux.d.ts.map +1 -0
  35. package/dist/service/macos.d.ts +18 -0
  36. package/dist/service/macos.d.ts.map +1 -0
  37. package/dist/service/types.d.ts +19 -0
  38. package/dist/service/types.d.ts.map +1 -0
  39. package/dist/service/windows.d.ts +18 -0
  40. package/dist/service/windows.d.ts.map +1 -0
  41. package/dist/session-manager.d.ts +4 -7
  42. package/dist/session-manager.d.ts.map +1 -1
  43. package/dist/skill-loader.d.ts +8 -0
  44. package/dist/skill-loader.d.ts.map +1 -0
  45. package/dist/start.d.ts.map +1 -1
  46. package/dist/start.js +2 -1
  47. package/dist/windows-NLONSCDA.js +165 -0
  48. package/dist/windows-NLONSCDA.js.map +1 -0
  49. package/package.json +7 -5
  50. package/skills/academic-search/SKILL.md +147 -0
  51. package/skills/architecture/SKILL.md +294 -0
  52. package/skills/changelog-generator/SKILL.md +112 -0
  53. package/skills/chart-visualization/SKILL.md +183 -0
  54. package/skills/code-review/SKILL.md +304 -0
  55. package/skills/codebase-health/SKILL.md +281 -0
  56. package/skills/consulting-analysis/SKILL.md +584 -0
  57. package/skills/content-research-writer/SKILL.md +546 -0
  58. package/skills/data-analysis/SKILL.md +194 -0
  59. package/skills/deep-research/SKILL.md +198 -0
  60. package/skills/docx/SKILL.md +211 -0
  61. package/skills/github-deep-research/SKILL.md +207 -0
  62. package/skills/image-generation/SKILL.md +209 -0
  63. package/skills/lead-research-assistant/SKILL.md +207 -0
  64. package/skills/mcp-builder/SKILL.md +304 -0
  65. package/skills/meeting-insights-analyzer/SKILL.md +335 -0
  66. package/skills/pair-programming/SKILL.md +196 -0
  67. package/skills/pdf/SKILL.md +309 -0
  68. package/skills/performance-analysis/SKILL.md +261 -0
  69. package/skills/podcast-generation/SKILL.md +224 -0
  70. package/skills/pptx/SKILL.md +497 -0
  71. package/skills/project-learnings/SKILL.md +280 -0
  72. package/skills/security-audit/SKILL.md +211 -0
  73. package/skills/skill-creator/SKILL.md +200 -0
  74. package/skills/technical-writing/SKILL.md +286 -0
  75. package/skills/testing/SKILL.md +363 -0
  76. package/skills/video-generation/SKILL.md +247 -0
  77. package/skills/web-design-guidelines/SKILL.md +203 -0
  78. package/skills/webapp-testing/SKILL.md +162 -0
  79. package/skills/workflow-automation/SKILL.md +299 -0
  80. package/skills/xlsx/SKILL.md +305 -0
  81. package/dist/chunk-KHPEQTWF.js.map +0 -1
@@ -0,0 +1,183 @@ package/skills/chart-visualization/SKILL.md
---
name: chart-visualization
description: Design and generate data visualizations including charts, graphs, and diagrams. Use when creating bar charts, line graphs, pie charts, scatter plots, heatmaps, or any data-driven visual representation.
disable-model-invocation: true
---

# Chart Visualization

Design effective data visualizations by selecting the right chart type for your data, applying best practices for clarity and accuracy, and generating production-quality charts.

## Overview

This skill guides the entire chart creation process: understanding your data's story, choosing the appropriate visualization type, designing for clarity, and implementing with popular charting libraries. The goal is always a chart that communicates insight at a glance.

## When to Use

- Creating charts or graphs from data
- Choosing the right visualization type for a dataset
- Designing dashboards with multiple visual components
- Converting raw numbers into visual stories for presentations or reports
- Generating interactive or static data visualizations

## When NOT to Use

- **Data analysis or computation** — use the `data-analysis` skill for processing data before visualization
- **Infographic design** — for non-data-driven visual design, use general design skills
- **Diagram creation** (flowcharts, architecture diagrams) — those are structural, not data-driven

## Chart Type Selection Guide

### By Data Relationship

| What You Want to Show | Chart Type | When to Use |
|---|---|---|
| Comparison across categories | Bar chart (vertical) | ≤12 categories, emphasizing magnitude |
| Comparison with long labels | Bar chart (horizontal) | Long category names, ranking |
| Change over time | Line chart | Continuous time series, trends |
| Part of a whole | Pie / Donut chart | ≤6 slices, percentages that sum to 100% |
| Part of a whole (many categories) | Stacked bar chart | >6 categories, comparison across groups |
| Distribution | Histogram | Continuous variable frequency |
| Correlation | Scatter plot | Two continuous variables, pattern detection |
| Density / frequency | Heatmap | Two categorical axes, many data points |
| Range / uncertainty | Error bar / Box plot | Statistical spread, confidence intervals |
| Geographic data | Choropleth / Bubble map | Location-based values |
| Hierarchy | Treemap / Sunburst | Nested categories with values |
| Flow / process | Sankey diagram | Quantities flowing between stages |
| Progress toward goal | Gauge / Progress bar | Single value against target |

### Decision Flowchart

```
What is the data about?
├── Comparing values → How many categories?
│   ├── ≤5 → Grouped bar chart
│   ├── 6-12 → Horizontal bar chart
│   └── >12 → Consider filtering or small multiples
├── Showing change over time → How many series?
│   ├── 1-3 → Line chart
│   ├── 4-7 → Line chart with legend or small multiples
│   └── >7 → Heatmap or highlight key series
├── Showing proportions → How many parts?
│   ├── ≤6 → Pie or donut chart
│   └── >6 → Stacked bar or treemap
├── Showing relationships → Scatter plot or bubble chart
├── Showing distribution → Histogram or box plot
└── Showing geographic patterns → Map visualization
```
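The branching logic of the flowchart above can be sketched as a small helper. This is an illustrative encoding only, not part of the skill's runtime; the function name `suggest_chart` and its string labels are assumptions.

```python
def suggest_chart(relationship: str, n: int) -> str:
    """Encode the decision flowchart: `relationship` is what the data shows,
    `n` is the number of categories, series, or parts."""
    if relationship == "comparison":
        if n <= 5:
            return "grouped bar chart"
        if n <= 12:
            return "horizontal bar chart"
        return "filter categories or use small multiples"
    if relationship == "time":
        if n <= 3:
            return "line chart"
        if n <= 7:
            return "line chart with legend or small multiples"
        return "heatmap or highlight key series"
    if relationship == "proportion":
        return "pie or donut chart" if n <= 6 else "stacked bar or treemap"
    if relationship == "relationship":
        return "scatter plot or bubble chart"
    if relationship == "distribution":
        return "histogram or box plot"
    if relationship == "geographic":
        return "map visualization"
    raise ValueError(f"unknown relationship: {relationship}")
```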

## Design Principles

### Clarity First

1. **Title**: Always include a descriptive title that tells the story ("Sales grew 40% in Q3" rather than "Q3 Sales Data")
2. **Axes**: Label both axes with units. Use human-readable scales (1K, 1M, not 1000, 1000000)
3. **Legend**: Place inside the chart area when possible. Order matches visual order
4. **Color**: Use a consistent palette. Maximum 7 distinct colors before confusion sets in
5. **Gridlines**: Light, subtle. Remove when not needed for reading values
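Rule 2's human-readable scales can be automated rather than hand-labeled. A minimal sketch using Matplotlib's `FuncFormatter`; the `human_format` helper is illustrative:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter

def human_format(value, _pos):
    """Render 1500 as '1.5K', 2000000 as '2M', and so on."""
    for unit in ["", "K", "M", "B"]:
        if abs(value) < 1000:
            return f"{value:g}{unit}"
        value /= 1000
    return f"{value:g}T"

fig, ax = plt.subplots()
ax.bar(["Jan", "Feb"], [1_200_000, 1_500_000])
ax.yaxis.set_major_formatter(FuncFormatter(human_format))
ax.set_ylim(bottom=0)  # bars start at zero (see Honesty Rules)
```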

### Honesty Rules

- **Start Y-axis at zero** for bar charts (truncated axes exaggerate differences)
- **Use consistent intervals** on time axes (don't compress gaps)
- **Avoid 3D effects** — they distort perception of values
- **Avoid dual Y-axes** when possible — they confuse more than clarify
- **Show data density honestly** — don't cherry-pick ranges that support a narrative

### Responsive Design

- Design for the smallest expected display size first
- Use relative sizing (%, vw/vh) not fixed pixels for web
- Reduce label density on mobile — rotate or abbreviate
- Consider small multiples over complex combined charts on small screens

## Implementation Patterns

### With Chart.js (JavaScript)

```javascript
const chart = new Chart(ctx, {
  type: 'bar',
  data: {
    labels: ['Jan', 'Feb', 'Mar', 'Apr'],
    datasets: [{
      label: 'Revenue ($K)',
      data: [120, 150, 180, 210],
      backgroundColor: '#4F46E5'
    }]
  },
  options: {
    responsive: true,
    plugins: { title: { display: true, text: 'Monthly Revenue 2024' } },
    scales: { y: { beginAtZero: true } }
  }
});
```

### With Python (Matplotlib / Seaborn)

```python
import matplotlib.pyplot as plt
import seaborn as sns

# df: DataFrame with 'month' and 'revenue' columns
fig, ax = plt.subplots(figsize=(10, 6))
sns.barplot(data=df, x='month', y='revenue', ax=ax)
ax.set_title('Monthly Revenue 2024')
ax.set_ylabel('Revenue ($K)')
plt.tight_layout()
plt.savefig('revenue.png', dpi=150)
```

### With D3.js (Custom SVG)

```javascript
const width = 640, height = 400;
const margin = { top: 20, right: 20, bottom: 30, left: 40 };
// data: array of { label, value } objects

const xScale = d3.scaleBand()
  .domain(data.map(d => d.label))
  .range([margin.left, width - margin.right])
  .padding(0.1);

const yScale = d3.scaleLinear()
  .domain([0, d3.max(data, d => d.value)])
  .range([height - margin.bottom, margin.top]);

const svg = d3.select('#chart')
  .append('svg')
  .attr('viewBox', `0 0 ${width} ${height}`);

svg.selectAll('rect')
  .data(data)
  .join('rect')
  .attr('x', d => xScale(d.label))
  .attr('y', d => yScale(d.value))
  .attr('width', xScale.bandwidth())
  .attr('height', d => height - margin.bottom - yScale(d.value))
  .attr('fill', '#4F46E5');
```

## Color Palettes

### Sequential (single variable, low-to-high)
- Blues: `#EFF6FF → #1D4ED8` (light background to dark emphasis)
- Greens: `#F0FDF4 → #15803D`

### Diverging (positive/negative, above/below center)
- Red-Blue: `#DC2626 → #F5F5F5 → #2563EB`
- Orange-Purple: `#EA580C → #F5F5F5 → #7C3AED`

### Categorical (distinct groups)
- Default 7: `#4F46E5, #059669, #D97706, #DC2626, #7C3AED, #0891B2, #DB2777`
- Accessible: meets WCAG contrast for both light and dark backgrounds

### Colorblind-Safe Rules
- Never use red/green alone to distinguish categories
- Add patterns (hatching, dots) as a secondary differentiator
- Use colorblind simulation tools to verify
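The secondary-differentiator rule can be applied mechanically by pairing the default categorical palette with hatch patterns. A sketch; the `styles` helper and the hatch list are illustrative:

```python
from itertools import cycle

PALETTE = ["#4F46E5", "#059669", "#D97706", "#DC2626", "#7C3AED", "#0891B2", "#DB2777"]
HATCHES = ["", "//", "..", "xx", "\\\\", "oo", "++"]  # matplotlib-style hatch strings

def styles(categories):
    """Assign each category a color plus a hatch pattern so the chart
    stays readable without relying on color alone."""
    colors, hatches = cycle(PALETTE), cycle(HATCHES)
    return {cat: {"color": next(colors), "hatch": next(hatches)} for cat in categories}
```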

## Common Anti-Patterns

### Pie Chart Overuse
**Problem**: More than 6 slices, or slices of similar size, make pie charts unreadable.
**Fix**: Use a horizontal bar chart sorted by value.

### Spaghetti Line Charts
**Problem**: 10+ overlapping lines create visual noise.
**Fix**: Highlight 2-3 key series, gray out the rest. Or use small multiples.

### Missing Context
**Problem**: A chart without comparison points ("Is 42% good or bad?").
**Fix**: Add benchmarks, targets, or prior period comparisons.

### Overdecorated Charts
**Problem**: Gradients, shadows, 3D effects, background images.
**Fix**: Remove every element that doesn't convey data (Edward Tufte's "data-ink ratio" principle).
@@ -0,0 +1,304 @@ package/skills/code-review/SKILL.md
---
name: code-review
description: Perform systematic, multi-dimensional code reviews on pull requests, diffs, or code snippets. Use when asked to review changes, audit code quality, or assess a PR before merge.
---

# Code Review

A systematic code review skill that evaluates changes across five dimensions — correctness, security, performance, architecture, and test coverage — using a structured six-step process. Every finding is assigned a severity, confidence level, and concrete fix suggestion.

## Overview

This skill guides a thorough, opinionated code review process. The reviewer acts as a senior engineer whose job is to catch real problems before they reach production, not to rubber-stamp changes.

Core principles:
- **Be direct.** Every finding takes a clear position: "this is wrong because X" or "this works because Y." Never hedge with "you might want to consider."
- **Be systematic.** Follow the six-step process in order. Do not skip steps even when the diff looks small.
- **Be proportional.** Spend review effort proportional to risk. A one-line change to auth logic deserves more scrutiny than a 500-line CSS refactor.
- **Be constructive.** Every problem identified must include a concrete suggestion for how to fix it.
- **Be honest about uncertainty.** When you are unsure, say so explicitly with a calibrated confidence level rather than presenting guesses as facts.

The goal is not to find the maximum number of issues. The goal is to find the issues that matter and communicate them clearly.

## When to Use

Activate this skill when:
- Reviewing a pull request or merge request
- Reviewing a diff or set of code changes
- Auditing existing code for quality, security, or correctness
- Asked to provide feedback on code before it ships
- Performing a post-incident review of code that caused a bug
- Comparing two implementations to recommend which is better

## When NOT to Use

Do not use this skill when:
- **Writing new code from scratch.** Use the architecture or implementation skills instead.
- **Writing or fixing tests.** Use the testing skill instead.
- **Debugging a runtime error.** That is a debugging task, not a review task.
- **Generating boilerplate or scaffolding.** That is code generation, not review.
- **The user wants unconditional praise.** This skill is designed to find real problems. If the code is good, the review will say so — but it will not inflate quality.

## Review Process

Follow these six steps in order. Complete each step before moving to the next.

### Step 1: Understand Context

Before reading any code, establish what the change is trying to do and why.

- Read the PR title, description, and any linked issues or tickets.
- Identify the type of change: feature, bugfix, refactor, performance improvement, dependency update, configuration change.
- Note the scope: which components, services, or layers are affected.
- Identify the risk profile: does this touch auth, payments, data persistence, public APIs, or other high-risk areas?
- Form an expectation of what the diff should look like given the stated intent.

If the PR description is missing or unclear, flag this as a finding. Good code review requires good context.

### Step 2: Review Structure

Evaluate the organizational quality of the changes.

- Are new files placed in the correct directories following project conventions?
- Are files, classes, functions, and variables named clearly and consistently?
- Is the change appropriately sized? A PR with 50+ changed files may need to be split.
- Are unrelated changes mixed in (scope creep)? Flag drive-by refactors that complicate review.
- Do import/dependency changes make sense? Are new dependencies justified?
- Is dead code being added (commented-out blocks, unused imports, unreachable branches)?

### Step 3: Review Logic

Read the code line-by-line and verify correctness.

- Trace the primary execution path. Does it produce the correct result for typical inputs?
- Trace error paths. What happens when inputs are null, empty, out of range, or malformed?
- Check boundary conditions: off-by-one errors, integer overflow, empty collections, concurrent access.
- Verify state management: are mutations intentional? Is state cleaned up properly?
- Check control flow: are all branches reachable? Are switch/match statements exhaustive?
- Verify data transformations: are types correct at each step? Are conversions safe?
- Check for race conditions in async or concurrent code.
- Verify that the code actually implements what the PR description claims.

### Step 4: Review Security

Check for vulnerabilities using the OWASP Top 10 as a baseline.

- **Injection**: Are all user inputs validated and sanitized before use in SQL, shell commands, templates, or queries?
- **Authentication**: Are auth checks present on all protected endpoints? Are tokens validated correctly?
- **Authorization**: Does the code enforce proper access control? Can user A access user B's data?
- **Data exposure**: Are secrets, tokens, or PII logged, returned in error messages, or stored in plaintext?
- **Configuration**: Are default configurations secure? Are debug modes disabled for production?
- **Dependencies**: Do new dependencies have known vulnerabilities? Are versions pinned?
- **Cryptography**: Are modern algorithms used? Are random values cryptographically secure where needed?
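The injection check above reduces to one question: is user input ever concatenated into a query string? A minimal before/after sketch using Python's stdlib `sqlite3`; the table and data are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice')")

user_input = "alice' OR '1'='1"  # classic injection payload

# Flag in review: user input interpolated directly into the SQL string.
# conn.execute(f"SELECT * FROM users WHERE name = '{user_input}'")

# Suggested fix: parameterized placeholder, so the input is treated as data,
# not as SQL. The payload matches no user and returns nothing.
rows = conn.execute("SELECT * FROM users WHERE name = ?", (user_input,)).fetchall()
```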

### Step 5: Review Performance

Identify code that will degrade under load or with large inputs.

- Check algorithmic complexity. Flag O(n²) or worse when O(n) or O(n log n) alternatives exist.
- Identify N+1 query patterns in database access code.
- Check for unnecessary memory allocations: large object copies, string concatenation in loops, unbounded buffers.
- Verify that I/O operations (network, disk, database) are batched where possible.
- Check caching: is cacheable data being re-fetched? Are cache invalidation strategies correct?
- Look for blocking operations in async contexts.
- Verify pagination: are unbounded queries possible? Can a user request unlimited data?
- Check resource cleanup: are connections, file handles, and streams properly closed?
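The N+1 pattern hides easily in ORM code, so it is worth seeing schematically. A sketch with a stubbed data source that counts round trips; `FakeDB` and its methods are hypothetical stand-ins for a real database layer:

```python
class FakeDB:
    """Stand-in data source that counts round trips."""
    def __init__(self, authors):
        self.authors, self.queries = authors, 0

    def get_author(self, author_id):
        self.queries += 1              # one round trip per call
        return self.authors[author_id]

    def get_authors(self, ids):
        self.queries += 1              # single batched round trip (IN (...) query)
        return {i: self.authors[i] for i in ids}

posts = [{"author_id": i % 2} for i in range(10)]

# N+1: one query per post
db = FakeDB({0: "ada", 1: "lin"})
n_plus_1 = [db.get_author(p["author_id"]) for p in posts]

# Batched: one query for all distinct authors, then in-memory lookup
batched_db = FakeDB({0: "ada", 1: "lin"})
authors = batched_db.get_authors({p["author_id"] for p in posts})
batched = [authors[p["author_id"]] for p in posts]
```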

### Step 6: Review Tests

Evaluate whether the changes are adequately tested.

- Do new features have corresponding test cases?
- Do bug fixes include a regression test that would have caught the original bug?
- Are edge cases tested (empty input, null values, error conditions, boundary values)?
- Are tests actually asserting meaningful behavior, or are they trivially passing?
- Is mock/stub usage appropriate? Over-mocking can hide real bugs.
- Do integration tests cover the interaction between changed components?
- Are test names descriptive enough to serve as documentation?
- If tests were deleted or modified, is the reason clear and justified?
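Two of the checks above, regression tests for bug fixes and descriptive test names, combine naturally. A hedged sketch in pytest style; `clamp`, the described bug, and the test name are all illustrative:

```python
def clamp(value, low, high):
    """Hypothetical fix under review: an earlier version returned `value`
    unchanged when it fell below `low`."""
    return max(low, min(value, high))

# Regression test: the name states the scenario and the expected outcome,
# and the assertion would have caught the original bug.
def test_clamp_returns_lower_bound_when_value_below_range():
    assert clamp(-5, 0, 10) == 0
```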

## Five Review Dimensions

Each dimension receives an independent assessment. A change can score well on security but poorly on architecture.

### 1. Correctness

The code does what it claims to do, handles edge cases, and does not introduce regressions.

Key checklist:
- Logic matches stated intent in PR description
- All input types and ranges are handled (including null, undefined, empty, negative, very large)
- Error handling is exhaustive — no silently swallowed exceptions
- Return values and types are correct at every call site
- Side effects are documented or obvious from naming
- Concurrency and ordering assumptions are explicit and correct
- Backwards compatibility is maintained unless the breaking change is intentional and documented

### 2. Security

The code does not introduce vulnerabilities that could be exploited.

Key checklist:
- User input is never trusted: validated, sanitized, or parameterized before use
- Authentication is checked before any protected operation
- Authorization is checked at the data layer, not just the UI layer
- Secrets are not hardcoded, logged, or included in error responses
- Sensitive data (PII, credentials, tokens) is encrypted at rest and in transit
- CSRF, XSS, and clickjacking protections are in place for web-facing code
- Rate limiting and input size limits prevent abuse

### 3. Performance

The code performs acceptably under expected and peak load.

Key checklist:
- No O(n²) or worse algorithms where better alternatives exist
- Database queries are indexed and bounded (LIMIT clauses, pagination)
- No N+1 query patterns
- Large data sets are streamed, not loaded entirely into memory
- Expensive computations are cached with correct invalidation
- Network calls are batched and have timeouts
- No synchronous blocking in event loops or async contexts

### 4. Architecture

The code fits cleanly into the existing system and does not increase accidental complexity.

Key checklist:
- Single Responsibility: each function/class/module has one clear purpose
- Coupling is minimized: changes in module A should not require changes in module B
- Abstractions are at the right level — not over-engineered, not under-abstracted
- Public APIs are minimal and well-defined
- Naming reveals intent: a reader can understand what code does without reading the implementation
- No God objects, mega-functions, or deeply nested conditionals
- The change follows established project patterns rather than inventing new ones without justification

### 5. Test Coverage

The changes are protected by tests that will catch future regressions.

Key checklist:
- Every new public function or method has at least one test
- Every bug fix has a regression test
- Happy path, error path, and edge cases are each tested
- Tests are independent: no ordering dependencies, no shared mutable state
- Mocks are used for external dependencies, not for the code under test
- Test assertions are specific (exact values, not just "truthy")
- Test names describe the scenario and expected outcome

## Confidence Calibration

Every finding must include a confidence level. Miscalibrated confidence — presenting speculation as certainty or burying real bugs in hedging language — is a review quality failure.

### Confidence Levels

- **Certain (95%+)**: You can trace the bug or prove the vulnerability from the code alone. Example: "This SQL query interpolates user input without parameterization. This is a SQL injection vulnerability."
- **High (80-95%)**: The issue is very likely real but depends on runtime behavior or configuration you cannot fully verify. Example: "This appears to be an N+1 query. If `loadRelated` triggers a separate DB call per item, this will be O(n) queries."
- **Medium (50-80%)**: The issue is plausible but you are missing context. State what you know and what you would need to confirm. Example: "This cache has no TTL. If the underlying data changes frequently, this could serve stale results."
- **Low (below 50%)**: You are flagging a potential concern but cannot determine if it is a real problem. Example: "This recursive function does not have an obvious depth limit. It may be fine for expected input sizes, but could stack overflow with deeply nested data."

### Calibration Rules

- If you are Certain, say so directly. Do not soften with "I think" or "it seems."
- If you are at Medium or Low confidence, explicitly state what additional information would raise your confidence.
- Never present a Low confidence finding as Critical severity. If you are unsure whether something is a bug, it is a Suggestion, not a Critical.
- It is better to flag a real issue at Low confidence than to miss it entirely. Include it, but label it honestly.

## Finding Format

Report each finding using this structure:

```
### [Severity] [Short title]

**Location:** `path/to/file.ts:42-58`
**Dimension:** Correctness | Security | Performance | Architecture | Test Coverage
**Confidence:** Certain | High | Medium | Low

**Issue:** [One to three sentences describing what is wrong and why it matters.]

**Suggestion:** [Concrete fix. Show a code snippet if helpful.]
```

### Severity Levels

- **Critical**: Must fix before merge. Data loss, security vulnerability, crash, or correctness bug that affects users.
- **Important**: Should fix before merge. Significant code quality issue, performance problem under realistic load, missing error handling.
- **Suggestion**: Recommended improvement. Better naming, cleaner abstraction, more efficient approach.
- **Nitpick**: Optional style or preference issue. Only include nitpicks if the review has no more significant findings.

### Severity Assignment Rules

- A finding cannot be Critical at Low confidence. Downgrade to Important or Suggestion.
- Security vulnerabilities with user-controlled input are Critical at minimum.
- Missing tests for new features are Important, not Suggestion.
- Style issues are always Nitpick unless they violate an explicit project style guide rule.
- When in doubt between two severity levels, pick the higher one.
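The confidence gate on Critical findings is mechanical enough to express as a small check, useful when linting a finished review. A sketch; the function name and the choice of "Important" as the downgrade target are assumptions (the rule permits "Suggestion" as well):

```python
SEVERITIES = ("Nitpick", "Suggestion", "Important", "Critical")
CONFIDENCES = ("Low", "Medium", "High", "Certain")

def adjusted_severity(severity: str, confidence: str) -> str:
    """Apply the rule: a finding cannot be Critical at Low confidence."""
    assert severity in SEVERITIES and confidence in CONFIDENCES
    if severity == "Critical" and confidence == "Low":
        return "Important"  # downgrade per the rule above
    return severity
```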

## Anti-Sycophancy Rules

Code review that avoids giving honest feedback is worse than no review at all.

### Do

- **Take a position on every finding.** Say "this is a bug" or "this is correct." Do not say "this could potentially be an issue depending on context."
- **Be direct about problems.** "This function silently drops errors. Add error propagation or explicit logging" — not "It might be worth considering whether errors should be handled here."
- **Praise good work specifically.** "The error handling in this module is thorough — every external call has a timeout and a fallback" is useful feedback. "Looks good!" is not.
- **Critique the strongest version of the code.** Assume the author had reasons for their choices. If you disagree, explain why your alternative is better given those constraints.
- **Say "I don't know" when you do not know.** This is always better than guessing.

### Do Not

- Do not start reviews with "Great work!" or "Nice PR!" unless you genuinely mean it and can explain what is great about it.
- Do not use softening language: "might want to," "could consider," "it's up to you but," "just a thought."
- Do not bury critical issues inside positive sandwiches. If there is a security vulnerability, lead with it.
- Do not add filler praise to balance out negative findings.
- Do not approve with unresolved Critical or Important findings.

## Review Summary Template

After completing all six steps, produce a summary:

```
## Review Summary

**PR:** [Title or link]
**Reviewer assessment:** Approve | Approve with changes | Request changes | Block

**Risk level:** Low | Medium | High | Critical
**Change type:** Feature | Bugfix | Refactor | Performance | Configuration | Dependencies

### Dimension Scores

| Dimension     | Rating                    | Key Finding        |
|---------------|---------------------------|--------------------|
| Correctness   | Good / Concerns / Failing | [One-line summary] |
| Security      | Good / Concerns / Failing | [One-line summary] |
| Performance   | Good / Concerns / Failing | [One-line summary] |
| Architecture  | Good / Concerns / Failing | [One-line summary] |
| Test Coverage | Good / Concerns / Failing | [One-line summary] |

### Statistics

- **Total findings:** [N]
- **Critical:** [N] | **Important:** [N] | **Suggestion:** [N] | **Nitpick:** [N]

### Blocking Issues

[List Critical or Important findings. If none, state "No blocking issues."]

### Positive Observations

[1-3 specific things the code does well. Omit if nothing stands out.]

### Detailed Findings

[All findings ordered by severity: Critical first, then Important, Suggestion, Nitpick.]
```

### Assessment Decision Rules

- **Approve**: Zero Critical, zero Important, test coverage adequate.
- **Approve with changes**: Zero Critical, one or two straightforward Important findings.
- **Request changes**: One or more Important findings, or test coverage gaps.
- **Block**: Any Critical finding.
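The four rules above are nearly mechanical. A sketch of the decision as code; the `straightforward` flag stands in for the "straightforward Important findings" judgment call, and the function name is illustrative:

```python
def assessment(critical: int, important: int,
               coverage_adequate: bool = True,
               straightforward: bool = True) -> str:
    """Map finding counts to the reviewer assessment per the decision rules."""
    if critical > 0:
        return "Block"
    if important == 0 and coverage_adequate:
        return "Approve"
    if 1 <= important <= 2 and straightforward:
        return "Approve with changes"
    return "Request changes"
```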