
@fro.bot/systematic 2.1.0 → 2.2.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (16)

package/skills/ce-review/references/review-output-template.md ADDED

@@ -0,0 +1,148 @@
+# Code Review Output Template
+Use this **exact format** when presenting synthesized review findings. Findings are grouped by severity, not by reviewer.
+**IMPORTANT:** Use pipe-delimited markdown tables (`| col | col |`). Do NOT use ASCII box-drawing characters.
+## Example
+```markdown
+## Code Review Results
+**Scope:** merge-base with the review base branch -> working tree (14 files, 342 lines)
+**Intent:** Add order export endpoint with CSV and JSON format support
+**Mode:** autofix
+**Reviewers:** correctness, testing, maintainability, security, api-contract
+- security -- new public endpoint accepts user-provided format parameter
+- api-contract -- new /api/orders/export route with response schema
+### P0 -- Critical
+| # | File | Issue | Reviewer | Confidence | Route |
+|---|------|-------|----------|------------|-------|
+| 1 | `orders_controller.rb:42` | User-supplied ID in account lookup without ownership check | security | 0.92 | `gated_auto -> downstream-resolver` |
+### P1 -- High
+| # | File | Issue | Reviewer | Confidence | Route |
+|---|------|-------|----------|------------|-------|
+| 2 | `export_service.rb:87` | Loads all orders into memory -- unbounded for large accounts | performance | 0.85 | `safe_auto -> review-fixer` |
+| 3 | `export_service.rb:91` | No pagination -- response size grows linearly with order count | api-contract, performance | 0.80 | `manual -> downstream-resolver` |
+### P2 -- Moderate
+| # | File | Issue | Reviewer | Confidence | Route |
+|---|------|-------|----------|------------|-------|
+| 4 | `export_service.rb:45` | Missing error handling for CSV serialization failure | correctness | 0.75 | `safe_auto -> review-fixer` |
+### P3 -- Low
+| # | File | Issue | Reviewer | Confidence | Route |
+|---|------|-------|----------|------------|-------|
+| 5 | `export_helper.rb:12` | Format detection could use early return instead of nested conditional | maintainability | 0.70 | `advisory -> human` |
+### Applied Fixes
+- `safe_auto`: Added bounded export pagination guard and CSV serialization failure test coverage in this run
+### Residual Actionable Work
+| # | File | Issue | Route | Next Step |
+|---|------|-------|-------|-----------|
+| 1 | `orders_controller.rb:42` | Ownership check missing on export lookup | `gated_auto -> downstream-resolver` | Create residual todo and require explicit approval before behavior change |
+| 2 | `export_service.rb:91` | Pagination contract needs a broader API decision | `manual -> downstream-resolver` | Create residual todo with contract and client impact details |
+### Pre-existing Issues
+| # | File | Issue | Reviewer |
+|---|------|-------|----------|
+| 1 | `orders_controller.rb:12` | Broad rescue masking failed permission check | correctness |
+### Learnings & Past Solutions
+- [Known Pattern] `docs/solutions/export-pagination.md` -- previous export pagination fix applies to this endpoint
+### Agent-Native Gaps
+- New export endpoint has no CLI/agent equivalent -- agent users cannot trigger exports
+### Schema Drift Check
+- Clean: schema.rb changes match the migrations in scope
+### Deployment Notes
+- Pre-deploy: capture baseline row counts before enabling the export backfill
+- Verify: `SELECT COUNT(*) FROM exports WHERE status IS NULL;` should stay at `0`
+- Rollback: keep the old export path available until the backfill has been validated
+### Coverage
+- Suppressed: 2 findings below 0.60 confidence
+- Residual risks: No rate limiting on export endpoint
+- Testing gaps: No test for concurrent export requests
+---
+> **Verdict:** Ready with fixes
+>
+> **Reasoning:** 1 critical auth bypass must be fixed. The memory/pagination issues (P1) should be addressed for production safety.
+>
+> **Fix order:** P0 auth bypass -> P1 memory/pagination -> P2 error handling if straightforward
+```
+## Anti-patterns
+Do NOT produce output like this. The following is wrong:
+```markdown
+Findings
+Sev: P1
+File: foo.go:42
+Issue: Some problem description
+Reviewer(s): adversarial
+Confidence: 0.85
+Route: advisory -> human
+────────────────────────────────────────
+Sev: P2
+File: bar.go:99
+Issue: Another problem
+```
+This fails because: no pipe-delimited tables, no severity-grouped `###` headers, uses box-drawing horizontal rules, no numbered findings, no `## Code Review Results` title, and the verdict is not in a blockquote. Always use the table format from the example above.
+## Formatting Rules
+- **Pipe-delimited markdown tables** for findings -- never ASCII box-drawing characters or per-finding horizontal-rule separators between entries (the report-level `---` before the verdict is still required)
+- **Severity-grouped sections** -- `### P0 -- Critical`, `### P1 -- High`, `### P2 -- Moderate`, `### P3 -- Low`. Omit empty severity levels.
+- **Always include file:line location** for code review issues
+- **Reviewer column** shows which persona(s) flagged the issue. Multiple reviewers = cross-reviewer agreement.
+- **Confidence column** shows the finding's confidence score
+- **Route column** shows the synthesized handling decision as ``<autofix_class> -> <owner>``.
+- **Header includes** scope, intent, and reviewer team with per-conditional justifications
+- **Mode line** -- include `interactive`, `autofix`, `report-only`, or `headless`
+- **Applied Fixes section** -- include only when a fix phase ran in this review invocation
+- **Residual Actionable Work section** -- include only when unresolved actionable findings were handed off for later work
+- **Pre-existing section** -- separate table, no confidence column (these are informational)
+- **Learnings & Past Solutions section** -- results from learnings-researcher, with links to docs/solutions/ files
+- **Agent-Native Gaps section** -- results from agent-native-reviewer. Omit if no gaps found.
+- **Schema Drift Check section** -- results from schema-drift-detector. Omit if the agent did not run.
+- **Deployment Notes section** -- key checklist items from deployment-verification-agent. Omit if the agent did not run.
+- **Coverage section** -- suppressed count, residual risks, testing gaps, failed reviewers
+- **Summary uses blockquotes** for verdict, reasoning, and fix order
+- **Horizontal rule** (`---`) separates findings from verdict
+- **`###` headers** for each section -- never plain text headers
+## Headless Mode Format
+In `mode:headless`, replace the interactive pipe-delimited table report with a structured text envelope. The headless format is defined in the `### Headless output format` section of SKILL.md. Key differences from the interactive format:
+- **No pipe-delimited tables.** Findings use `[severity][autofix_class -> owner] File: <file:line> -- <title>` line format with indented Why/Evidence/Suggested fix lines.
+- **Findings grouped by autofix_class** (gated-auto, manual, advisory) instead of severity. Within each group, findings are sorted by severity.
+- **Verdict in header** (top of output) instead of bottom, so programmatic callers get it first.
+- **`Artifact:` line** in metadata header gives callers the path to the full run artifact.
+- **`[needs-verification]` marker** on findings where `requires_verification: true`.
+- **Evidence lines** included per finding.
+- **Completion signal:** "Review complete" as the final line.
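
The Formatting Rules in the file above (severity-grouped `###` sections, pipe-delimited tables, empty levels omitted, Route shown as `<autofix_class> -> <owner>`) can be sketched in a few lines of Python. This is only an illustration, not the package's renderer: the function name and the `file`, `issue`, and `reviewers` keys are hypothetical, since the code-review findings schema is not part of this diff. Only `severity`, `autofix_class`, `owner`, and `confidence` are named in the added files.

```python
from collections import defaultdict

# Severity headings as listed in the Formatting Rules.
SEVERITY_TITLES = {"P0": "Critical", "P1": "High", "P2": "Moderate", "P3": "Low"}


def render_findings(findings):
    """Render findings as severity-grouped pipe tables, omitting empty levels.

    Each finding is a dict; the "file", "issue", and "reviewers" keys are
    assumptions made for this sketch.
    """
    by_severity = defaultdict(list)
    for finding in findings:
        by_severity[finding["severity"]].append(finding)

    lines = []
    number = 0  # numbering continues across sections, as in the example report
    for severity, title in SEVERITY_TITLES.items():
        group = by_severity.get(severity)
        if not group:
            continue  # "Omit empty severity levels."
        lines.append(f"### {severity} -- {title}")
        lines.append("| # | File | Issue | Reviewer | Confidence | Route |")
        lines.append("|---|------|-------|----------|------------|-------|")
        for f in group:
            number += 1
            reviewers = ", ".join(f["reviewers"])
            route = f"`{f['autofix_class']} -> {f['owner']}`"
            lines.append(
                f"| {number} | `{f['file']}` | {f['issue']} | {reviewers} "
                f"| {f['confidence']:.2f} | {route} |"
            )
    return "\n".join(lines)
```

Findings may be passed in any order; the sketch sorts them into P0 through P3 sections and numbers them in output order.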

package/skills/ce-review/references/subagent-template.md ADDED

@@ -0,0 +1,84 @@
+# Sub-agent Prompt Template
+This template is used by the orchestrator to spawn each reviewer sub-agent. Variable substitution slots are filled at spawn time.
+---
+## Template
+```
+You are a specialist code reviewer.
+<persona>
+{persona_file}
+</persona>
+<scope-rules>
+{diff_scope_rules}
+</scope-rules>
+<output-contract>
+Return ONLY valid JSON matching the findings schema below. No prose, no markdown, no explanation outside the JSON object.
+{schema}
+Confidence rubric (0.0-1.0 scale):
+- 0.00-0.29: Not confident / likely false positive. Do not report.
+- 0.30-0.49: Somewhat confident. Do not report -- too speculative for actionable review.
+- 0.50-0.59: Moderately confident. Real but uncertain. Do not report unless P0 severity.
+- 0.60-0.69: Confident enough to flag. Include only when the issue is clearly actionable.
+- 0.70-0.84: Highly confident. Real and important. Report with full evidence.
+- 0.85-1.00: Certain. Verifiable from the code alone. Report.
+Suppress threshold: 0.60. Do not emit findings below 0.60 confidence (except P0 at 0.50+).
+False-positive categories to actively suppress:
+- Pre-existing issues unrelated to this diff (mark pre_existing: true for unchanged code the diff does not interact with; if the diff makes it newly relevant, it is secondary, not pre-existing)
+- Pedantic style nitpicks that a linter/formatter would catch
+- Code that looks wrong but is intentional (check comments, commit messages, PR description for intent)
+- Issues already handled elsewhere in the codebase (check callers, guards, middleware)
+- Suggestions that restate what the code already does in different words
+- Generic "consider adding" advice without a concrete failure mode
+Rules:
+- Every finding MUST include at least one evidence item grounded in the actual code.
+- Set pre_existing to true ONLY for issues in unchanged code that are unrelated to this diff. If the diff makes the issue newly relevant, it is NOT pre-existing.
+- You are operationally read-only. You may use non-mutating inspection commands, including read-oriented `git` / `gh` commands, to gather evidence. Do not edit files, change branches, commit, push, create PRs, or otherwise mutate the checkout or repository state.
+- Set `autofix_class` accurately -- not every finding is `advisory`. Use this decision guide:
+  - `safe_auto`: The fix is local and deterministic — the fixer can apply it mechanically without design judgment. Examples: extracting a duplicated helper, adding a missing nil/null check, fixing an off-by-one, adding a missing test for an untested code path, removing dead code.
+  - `gated_auto`: A concrete fix exists but it changes contracts, permissions, or crosses a module boundary in a way that deserves explicit approval. Examples: adding authentication to an unprotected endpoint, changing a public API response shape, switching from soft-delete to hard-delete.
+  - `manual`: Actionable work that requires design decisions or cross-cutting changes. Examples: redesigning a data model, choosing between two valid architectural approaches, adding pagination to an unbounded query.
+  - `advisory`: Report-only items that should not become code-fix work. Examples: noting a design asymmetry the PR improves but doesn't fully resolve, flagging a residual risk, deployment notes.
+  Do not default to `advisory` when uncertain -- if a concrete fix is obvious, classify it as `safe_auto` or `gated_auto`.
+- Set `owner` to the default next actor for this finding: `review-fixer`, `downstream-resolver`, `human`, or `release`.
+- Set `requires_verification` to true whenever the likely fix needs targeted tests, a focused re-review, or operational validation before it should be trusted.
+- suggested_fix is optional. Only include it when the fix is obvious and correct. A bad suggestion is worse than none.
+- If you find no issues, return an empty findings array. Still populate residual_risks and testing_gaps if applicable.
+- **Intent verification:** Compare the code changes against the stated intent (and PR title/body when available). If the code does something the intent does not describe, or fails to do something the intent promises, flag it as a finding. Mismatches between stated intent and actual code are high-value findings.
+</output-contract>
+<pr-context>
+{pr_metadata}
+</pr-context>
+<review-context>
+Intent: {intent_summary}
+Changed files: {file_list}
+Diff:
+{diff}
+</review-context>
+```
+## Variable Reference
+| Variable | Source | Description |
+|----------|--------|-------------|
+| `{persona_file}` | Agent markdown file content | The full persona definition (identity, failure modes, calibration, suppress conditions) |
+| `{diff_scope_rules}` | `references/diff-scope.md` content | Primary/secondary/pre-existing tier rules |
+| `{schema}` | `references/findings-schema.json` content | The JSON schema reviewers must conform to |
+| `{intent_summary}` | Stage 2 output | 2-3 line description of what the change is trying to accomplish |
+| `{pr_metadata}` | Stage 1 output | PR title, body, and URL when reviewing a PR. Empty string when reviewing a branch or standalone checkout |
+| `{file_list}` | Stage 1 output | List of changed files from the scope step |
+| `{diff}` | Stage 1 output | The actual diff content to review |
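
The suppress rule stated in the template above ("Suppress threshold: 0.60. Do not emit findings below 0.60 confidence (except P0 at 0.50+)") reduces to a small predicate. The sketch below is an illustration with a hypothetical function name. It models only the numeric gate, not the rubric's judgment call that findings in the 0.60-0.69 band should be included only when clearly actionable.

```python
def should_report(severity: str, confidence: float) -> bool:
    """Numeric suppress gate: 0.60 for most findings, 0.50 for P0."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be within the 0.0-1.0 scale")
    if severity == "P0":
        return confidence >= 0.50  # P0 exception from the rubric
    return confidence >= 0.60
```

For example, a P0 finding at 0.55 passes the gate, while a P1 finding at the same confidence is suppressed.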

package/skills/document-review/references/findings-schema.json ADDED

@@ -0,0 +1,109 @@
+{
+  "$schema": "http://json-schema.org/draft-07/schema#",
+  "title": "Document Review Findings",
+  "description": "Structured output schema for document review persona agents",
+  "type": "object",
+  "required": ["reviewer", "findings", "residual_risks", "deferred_questions"],
+  "properties": {
+    "reviewer": {
+      "type": "string",
+      "description": "Persona name that produced this output (e.g., 'coherence', 'feasibility', 'product-lens')"
+    },
+    "findings": {
+      "type": "array",
+      "description": "List of document review findings. Empty array if no issues found.",
+      "items": {
+        "type": "object",
+        "required": [
+          "title",
+          "severity",
+          "section",
+          "why_it_matters",
+          "finding_type",
+          "autofix_class",
+          "confidence",
+          "evidence"
+        ],
+        "properties": {
+          "title": {
+            "type": "string",
+            "description": "Short, specific issue title. 10 words or fewer.",
+            "maxLength": 100
+          },
+          "severity": {
+            "type": "string",
+            "enum": ["P0", "P1", "P2", "P3"],
+            "description": "Issue severity level"
+          },
+          "section": {
+            "type": "string",
+            "description": "Document section where the issue appears (e.g., 'Requirements Trace', 'Implementation Unit 3', 'Overview')"
+          },
+          "why_it_matters": {
+            "type": "string",
+            "description": "Impact statement -- not 'what is wrong' but 'what goes wrong if not addressed'"
+          },
+          "autofix_class": {
+            "type": "string",
+            "enum": ["auto", "present"],
+            "description": "How this issue should be handled. auto = one clear correct fix that can be applied silently (terminology, formatting, cross-references, completeness corrections, additions mechanically implied by other content). present = requires individual user judgment."
+          },
+          "finding_type": {
+            "type": "string",
+            "enum": ["error", "omission"],
+            "description": "Whether the finding is a mistake in what the document says (error) or something the document forgot to say (omission). Errors are design tensions, contradictions, or incorrect statements. Omissions are missing mechanical steps, forgotten list entries, or absent details."
+          },
+          "suggested_fix": {
+            "type": ["string", "null"],
+            "description": "Concrete fix text. Omit or null if no good fix is obvious -- a bad suggestion is worse than none."
+          },
+          "confidence": {
+            "type": "number",
+            "description": "Reviewer confidence in this finding, calibrated per persona",
+            "minimum": 0.0,
+            "maximum": 1.0
+          },
+          "evidence": {
+            "type": "array",
+            "description": "Quoted text from the document that supports this finding. At least 1 item.",
+            "items": { "type": "string" },
+            "minItems": 1
+          }
+        }
+      }
+    },
+    "residual_risks": {
+      "type": "array",
+      "description": "Risks the reviewer noticed but could not confirm as findings (below confidence threshold)",
+      "items": { "type": "string" }
+    },
+    "deferred_questions": {
+      "type": "array",
+      "description": "Questions that should be resolved in a later workflow stage (planning, implementation)",
+      "items": { "type": "string" }
+    }
+  },
+  "_meta": {
+    "confidence_thresholds": {
+      "suppress": "Below 0.50 -- do not report. Finding is speculative noise.",
+      "flag": "0.50-0.69 -- include only when the persona's calibration says the issue is actionable at that confidence.",
+      "report": "0.70+ -- report with full confidence."
+    },
+    "severity_definitions": {
+      "P0": "Contradictions or gaps that would cause building the wrong thing. Must fix before proceeding.",
+      "P1": "Significant gap likely hit during planning or implementation. Should fix.",
+      "P2": "Moderate issue with meaningful downside. Fix if straightforward.",
+      "P3": "Minor improvement. User's discretion."
+    },
+    "autofix_classes": {
+      "_principle": "Autofix class is independent of severity. A P1 finding can be auto if the fix is obvious. The test: is there one clear correct fix, or does resolving this require judgment?",
+      "auto": "One clear correct fix -- applied silently. Includes both internal reconciliation (summary/detail mismatches, wrong counts, stale cross-references, terminology drift) and additions mechanically implied by other content (missing steps, unstated thresholds, completeness gaps where the correct content is obvious). Must include suggested_fix.",
+      "present": "Requires individual user judgment -- strategic questions, design tradeoffs, or findings where reasonable people could disagree on the right action."
+    },
+    "finding_types": {
+      "error": "Something the document says that is wrong -- contradictions, incorrect statements, design tensions, incoherent tradeoffs. These are mistakes in what exists.",
+      "omission": "Something the document forgot to say -- missing mechanical steps, absent list entries, undefined thresholds, forgotten cross-references. These are gaps in completeness."
+    }
+  }
+}
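
As a rough illustration of what the schema above asks of a single finding, the following standard-library sketch checks the required fields, the three enums, the confidence range, the `maxLength` on `title`, and the `minItems` on `evidence`. It is not a substitute for a real JSON Schema validator, and the function name is hypothetical. The final check comes from the `_meta.autofix_classes` note ("Must include suggested_fix"), which the schema's `required` list does not itself enforce.

```python
REQUIRED = (
    "title", "severity", "section", "why_it_matters",
    "finding_type", "autofix_class", "confidence", "evidence",
)
ENUMS = {
    "severity": {"P0", "P1", "P2", "P3"},
    "autofix_class": {"auto", "present"},
    "finding_type": {"error", "omission"},
}


def check_finding(finding: dict) -> list:
    """Return a list of problems; an empty list means the checks passed."""
    problems = [f"missing required field: {key}" for key in REQUIRED if key not in finding]
    for key, allowed in ENUMS.items():
        if key in finding and finding[key] not in allowed:
            problems.append(f"{key} must be one of {sorted(allowed)}")
    confidence = finding.get("confidence")
    if isinstance(confidence, (int, float)) and not 0.0 <= confidence <= 1.0:
        problems.append("confidence must be between 0.0 and 1.0")
    if len(finding.get("title", "")) > 100:
        problems.append("title exceeds maxLength of 100")
    evidence = finding.get("evidence")
    if isinstance(evidence, list) and len(evidence) < 1:
        problems.append("evidence needs at least 1 item")
    # From the _meta note, not from the schema's "required" list.
    if finding.get("autofix_class") == "auto" and not finding.get("suggested_fix"):
        problems.append("auto findings must include suggested_fix")
    return problems


# Illustrative values only; not taken from a real review run.
example = {
    "title": "Webhook endpoint lacks rate limiting",
    "severity": "P2",
    "section": "API Design",
    "why_it_matters": "An unthrottled public endpoint can be flooded.",
    "finding_type": "omission",
    "autofix_class": "present",
    "confidence": 0.75,
    "evidence": ["(quoted text from the reviewed document)"],
}
print(check_finding(example))
```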

package/skills/document-review/references/review-output-template.md ADDED

@@ -0,0 +1,89 @@
+# Document Review Output Template
+Use this **exact format** when presenting synthesized review findings. Findings are grouped by severity, not by reviewer.
+**IMPORTANT:** Use pipe-delimited markdown tables (`| col | col |`). Do NOT use ASCII box-drawing characters.
+## Example
+```markdown
+## Document Review Results
+**Document:** docs/plans/2026-03-15-feat-user-auth-plan.md
+**Type:** plan
+**Reviewers:** coherence, feasibility, security-lens, scope-guardian
+- security-lens -- plan adds public API endpoint with auth flow
+- scope-guardian -- plan has 15 requirements across 3 priority levels
+Applied 5 auto-fixes. 4 findings to consider (2 errors, 2 omissions).
+### Auto-fixes Applied
+- Standardized "pipeline"/"workflow" terminology to "pipeline" throughout (coherence)
+- Fixed cross-reference: Section 4 referenced "Section 3.2" which is actually "Section 3.1" (coherence)
+- Updated unit count from "6 units" to "7 units" to match listed units (coherence)
+- Added "update API rate-limit config" step to Unit 4 -- implied by Unit 3's rate-limit introduction (feasibility)
+- Added auth token refresh to test scenarios -- required by Unit 2's token expiry handling (security-lens)
+### P0 -- Must Fix
+#### Errors
+| # | Section | Issue | Reviewer | Confidence |
+|---|---------|-------|----------|------------|
+| 1 | Requirements Trace | Goal states "offline support" but technical approach assumes persistent connectivity | coherence | 0.92 |
+### P1 -- Should Fix
+#### Errors
+| # | Section | Issue | Reviewer | Confidence |
+|---|---------|-------|----------|------------|
+| 2 | Scope Boundaries | 8 of 12 units build admin infrastructure; only 2 touch stated goal | scope-guardian | 0.80 |
+#### Omissions
+| # | Section | Issue | Reviewer | Confidence |
+|---|---------|-------|----------|------------|
+| 3 | Implementation Unit 3 | Plan proposes custom auth but does not mention existing Devise setup or migration path | feasibility | 0.85 |
+### P2 -- Consider Fixing
+#### Omissions
+| # | Section | Issue | Reviewer | Confidence |
+|---|---------|-------|----------|------------|
+| 4 | API Design | Public webhook endpoint has no rate limiting mentioned | security-lens | 0.75 |
+### Residual Concerns
+| # | Concern | Source |
+|---|---------|--------|
+| 1 | Migration rollback strategy not addressed for Phase 2 data changes | feasibility |
+### Deferred Questions
+| # | Question | Source |
+|---|---------|--------|
+| 1 | Should the API use versioned endpoints from launch? | feasibility, security-lens |
+### Coverage
+| Persona | Status | Findings | Auto | Present | Residual |
+|---------|--------|----------|------|---------|----------|
+| coherence | completed | 4 | 3 | 1 | 0 |
+| feasibility | completed | 2 | 1 | 1 | 1 |
+| security-lens | completed | 2 | 1 | 1 | 0 |
+| scope-guardian | completed | 1 | 0 | 1 | 0 |
+| product-lens | not activated | -- | -- | -- | -- |
+| design-lens | not activated | -- | -- | -- | -- |
+```
+## Section Rules
+- **Summary line**: Always present after the reviewer list. Format: "Applied N auto-fixes. K findings to consider (X errors, Y omissions)." Omit any zero clause.
+- **Auto-fixes Applied**: List all fixes that were applied automatically (auto class). Include enough detail per fix to convey the substance -- especially for fixes that add content or touch document meaning. Omit section if none.
+- **P0-P3 sections**: Only include sections that have findings. Omit empty severity levels. Within each severity, separate into **Errors** and **Omissions** sub-headers. Omit a sub-header if that severity has none of that type.
+- **Residual Concerns**: Findings below confidence threshold that were promoted by cross-persona corroboration, plus unpromoted residual risks. Omit if none.
+- **Deferred Questions**: Questions for later workflow stages. Omit if none.
+- **Coverage**: Always include. All counts are **post-synthesis**. **Findings** must equal Auto + Present exactly -- if deduplication merged a finding across personas, attribute it to the persona with the highest confidence and reduce the other persona's count. **Residual** = count of `residual_risks` from this persona's raw output (not the promoted subset in the Residual Concerns section).
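
The summary-line rule in the Section Rules above ("Applied N auto-fixes. K findings to consider (X errors, Y omissions)." with zero clauses omitted) can be sketched as below. The function name is hypothetical, and the reading of "omit any zero clause" is an assumption: a zero count drops its sentence or its parenthesized item. The template does not say what to emit when every count is zero, nor whether to singularize "1 errors", so the sketch returns an empty string in the first case and keeps the literal plural wording in the second.

```python
def summary_line(auto_fixes: int, errors: int, omissions: int) -> str:
    """Build the summary line, dropping any clause whose count is zero."""
    parts = []
    if auto_fixes:
        parts.append(f"Applied {auto_fixes} auto-fixes.")
    total = errors + omissions
    if total:
        breakdown = []
        if errors:
            breakdown.append(f"{errors} errors")
        if omissions:
            breakdown.append(f"{omissions} omissions")
        parts.append(f"{total} findings to consider ({', '.join(breakdown)}).")
    return " ".join(parts)


print(summary_line(5, 2, 2))
```

With the counts from the example report (5 auto-fixes, 2 errors, 2 omissions), this reproduces the summary line shown there.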

package/skills/document-review/references/subagent-template.md ADDED

@@ -0,0 +1,57 @@
+# Document Review Sub-agent Prompt Template
+This template is used by the document-review orchestrator to spawn each reviewer sub-agent. Variable substitution slots are filled at dispatch time.
+---
+## Template
+```
+You are a specialist document reviewer.
+<persona>
+{persona_file}
+</persona>
+<output-contract>
+Return ONLY valid JSON matching the findings schema below. No prose, no markdown, no explanation outside the JSON object.
+{schema}
+Rules:
+- Suppress any finding below your stated confidence floor (see your Confidence calibration section).
+- Every finding MUST include at least one evidence item -- a direct quote from the document.
+- You are operationally read-only. Analyze the document and produce findings. Do not edit the document, create files, or make changes. You may use non-mutating tools (file reads, glob, grep, git log) to gather context about the codebase when evaluating feasibility or existing patterns.
+- Set `finding_type` for every finding:
+  - `error`: Something the document says that is wrong -- contradictions, incorrect statements, design tensions, incoherent tradeoffs.
+  - `omission`: Something the document forgot to say -- missing mechanical steps, absent list entries, undefined thresholds, forgotten cross-references.
+- Set `autofix_class` based on whether there is one clear correct fix, not on severity. A P1 finding can be `auto` if the fix is obvious:
+  - `auto`: One clear correct fix. Applied silently without asking. The test: is there only one reasonable way to resolve this? If yes, it is auto. Two categories:
+    - Internal reconciliation: one part of the document is authoritative over another -- reconcile toward the authority. Examples: summary/detail mismatches, wrong counts, missing list entries derivable from elsewhere, stale cross-references, terminology drift, prose/diagram contradictions where prose is authoritative.
+    - Implied additions: the correct content is mechanically obvious from the document's own context. Examples: adding a missing implementation step implied by other content, defining a threshold implied but never stated, completeness gaps where what to add is clear.
+    Always include `suggested_fix` for auto findings.
+    NOT auto (the gap is clear but more than one reasonable fix exists): choosing an implementation approach when the document states a need without constraining how (e.g., "support offline mode" could mean service workers, local-first database, or queue-and-sync -- there is no single obvious answer), changing scope or priority where the author may have weighed tradeoffs the reviewer can't see (e.g., promoting a P2 to P1, or cutting a feature the document intentionally keeps at a lower tier).
+  - `present`: Requires judgment -- strategic questions, tradeoffs, design tensions where reasonable people could disagree, findings where the right action is unclear.
+- `suggested_fix` is required for `auto` findings. For `present` findings, `suggested_fix` is optional -- include it only when the fix is obvious, and frame as a question when the right action is unclear.
+- If you find no issues, return an empty findings array. Still populate residual_risks and deferred_questions if applicable.
+- Use your suppress conditions. Do not flag issues that belong to other personas.
+</output-contract>
+<review-context>
+Document type: {document_type}
+Document path: {document_path}
+Document content:
+{document_content}
+</review-context>
+```
+## Variable Reference
+| Variable | Source | Description |
+|----------|--------|-------------|
+| `{persona_file}` | Agent markdown file content | The full persona definition (identity, analysis protocol, calibration, suppress conditions) |
+| `{schema}` | `references/findings-schema.json` content | The JSON schema reviewers must conform to |
+| `{document_type}` | Orchestrator classification | Either "requirements" or "plan" |
+| `{document_path}` | Skill input | Path to the document being reviewed |
+| `{document_content}` | File read | The full document text |
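
The added files describe slot substitution only as "filled at dispatch time", so the mechanism below is an assumption, not the orchestrator's actual code. The sketch fills `{name}` slots in a single regex pass. The single pass matters because the `{schema}` value is JSON full of braces: text inserted for one slot is never rescanned for further slots. The function name is hypothetical.

```python
import re

# Matches slots such as {persona_file} or {document_content}.
SLOT = re.compile(r"\{([a-z_]+)\}")


def fill_template(template: str, values: dict) -> str:
    """Replace each {name} slot once; raise KeyError if a slot has no value."""

    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value for slot {{{name}}}")
        return values[name]

    return SLOT.sub(substitute, template)


prompt = fill_template(
    "<persona>\n{persona_file}\n</persona>\nDocument type: {document_type}",
    {"persona_file": "(persona markdown)", "document_type": "plan"},
)
print(prompt)
```

Raising on a missing slot, instead of leaving the placeholder in place, keeps a half-filled prompt from reaching a reviewer sub-agent.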