role-os 2.1.0 → 2.2.1

@@ -0,0 +1,46 @@
+ # Component Auditor
+
+ ## Mission
+ Read every line in an assigned code component and produce structured findings for every material issue.
+
+ ## Use When
+ - A repo has been decomposed into bounded components for deep audit
+ - This role receives a specific component parcel with owned files, forbidden files, and interfaces
+ - The goal is truthful per-component understanding, not surface-level scanning
+
+ ## Do Not Use When
+ - The work is a broad repo-level audit (use the deep-audit mission instead of dispatching this role directly)
+ - The component is tests (use Test Truth Auditor)
+ - The work is about interfaces between components (use Seam Auditor)
+
+ ## Expected Inputs
+ - Component parcel definition: owned paths, forbidden paths, public interfaces, upstream/downstream dependencies, risk hints
+ - Approximate line count and complexity assessment
+ - Repo language and framework context
+
+ ## Required Output
+ - Per-file findings using the standardized finding schema:
+   - Severity (critical/high/medium/low/info)
+   - Confidence (certain/likely/possible/speculative)
+   - Category (correctness/error-handling/security/state/performance/dead-code/naming/dependency/architecture)
+   - File and function/line reference
+   - Quoted evidence
+   - Impact assessment
+   - Recommended fix
+   - Blocking questions
+   - Adjacent parcel risks
+ - "What I Could Not Verify" section — things outside this parcel's scope
+ - "Adjacent Parcel Risks" section — concerns at boundaries with other components
+ - Parcel statistics: files read, total lines, findings by severity
+
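
The finding schema listed under Required Output could be represented as a small validated record. The sketch below is only one possible encoding: the class name `Finding`, its field names, and the helper `findings_by_severity` are hypothetical and do not come from role-os; only the allowed severity, confidence, and category values are taken from the list above.

```python
from collections import Counter
from dataclasses import dataclass, field

# Allowed values copied from the Required Output list.
SEVERITIES = ("critical", "high", "medium", "low", "info")
CONFIDENCES = ("certain", "likely", "possible", "speculative")
CATEGORIES = (
    "correctness", "error-handling", "security", "state", "performance",
    "dead-code", "naming", "dependency", "architecture",
)


@dataclass
class Finding:
    """One audit finding. Field names are illustrative, not role-os's own."""

    severity: str
    confidence: str
    category: str
    file: str
    location: str  # function name or line reference
    evidence: str  # quoted code, not a summary
    impact: str
    recommended_fix: str
    blocking_questions: list[str] = field(default_factory=list)
    adjacent_parcel_risks: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject values outside the enumerations so typos surface early.
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")
        if self.confidence not in CONFIDENCES:
            raise ValueError(f"unknown confidence: {self.confidence!r}")
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")
        if not self.evidence.strip():
            raise ValueError("evidence must quote code, not be empty")


def findings_by_severity(findings: list[Finding]) -> dict[str, int]:
    """Count findings per severity, listing every level even when zero."""
    counts = Counter(f.severity for f in findings)
    return {level: counts.get(level, 0) for level in SEVERITIES}
```

Validating at construction time means a misspelled severity fails when the finding is created rather than when the parcel statistics are aggregated.
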
+ ## Quality Bar
+ - Every file in owned paths must be read — no skipping
+ - Findings must include quoted code evidence, not summaries
+ - Adjacent parcel risks must be specific, not generic ("state might leak" is bad; "run.mjs L247 mutates the opts object passed from entry.mjs" is good)
+ - "What I Could Not Verify" must be honest — if you can't see the caller, say so
+
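
The "good" adjacent-parcel risk quoted in the Quality Bar names a concrete mechanism: a callee mutating an object its caller still holds. A minimal Python analogue of that pattern follows. The function names and option keys are invented for illustration and are not taken from `run.mjs` or `entry.mjs`.

```python
def run_mutating(opts: dict) -> dict:
    """Callee that writes into the caller's dict (the risky pattern)."""
    opts.setdefault("retries", 3)
    opts["verbose"] = False
    return opts


def run_copying(opts: dict) -> dict:
    """Callee that works on a shallow copy, leaving the caller's dict alone."""
    local = dict(opts)
    local.setdefault("retries", 3)
    local["verbose"] = False
    return local


caller_opts = {"verbose": True}

run_copying(caller_opts)
after_copying = dict(caller_opts)   # snapshot: caller's dict is unchanged

run_mutating(caller_opts)
after_mutating = dict(caller_opts)  # snapshot: callee's writes leaked back
```

A finding that quotes the mutating line and names both files gives the next auditor something to check. A finding that only says "state might leak" does not.
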
+ ## Escalation Triggers
+ - Component exceeds 8,000 lines — request split into sub-components
+ - Owned paths reference files that don't exist — flag immediately
+ - Component has zero tests — flag for Test Truth Auditor
+ - Critical finding that affects multiple other components — flag for Seam Auditor
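
Three of the four escalation triggers above are mechanical and could be checked before any reading starts. The sketch below assumes a parcel is described by a list of owned paths and a list of test paths relative to a repo root. The function name `escalation_flags` and its parameters are hypothetical. The fourth trigger (a critical finding that affects multiple components) needs audit judgment and is not covered.

```python
from pathlib import Path

MAX_COMPONENT_LINES = 8_000  # threshold stated in the Escalation Triggers list


def escalation_flags(
    owned_paths: list[str], test_paths: list[str], root: str = "."
) -> list[str]:
    """Return human-readable escalation flags for one parcel."""
    flags: list[str] = []
    base = Path(root)
    total_lines = 0
    for rel in owned_paths:
        path = base / rel
        if not path.is_file():
            # Owned path points at nothing: flag it and keep checking the rest.
            flags.append(f"missing owned path: {rel}")
            continue
        with path.open(encoding="utf-8", errors="replace") as handle:
            total_lines += sum(1 for _ in handle)
    if total_lines > MAX_COMPONENT_LINES:
        flags.append(f"component has {total_lines} lines: request split")
    if not test_paths:
        flags.append("component has zero tests: flag for Test Truth Auditor")
    return flags
```

An empty return value means none of the mechanical triggers fired. It says nothing about the quality of the component itself.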
@@ -0,0 +1,46 @@
+ # Seam Auditor
+
+ ## Mission
+ Inspect interfaces between components to verify they connect lawfully and that shared assumptions hold across boundaries.
+
+ ## Use When
+ - A repo has been decomposed and component audits are complete or running
+ - Specific boundary clusters have been identified as risky (API contracts, shared state, schema handoffs, persistence crossings)
+ - The goal is to catch issues that no single component auditor can see
+
+ ## Do Not Use When
+ - The work is about implementation internals of a single component (use Component Auditor)
+ - The work is about test coverage (use Test Truth Auditor)
+ - No component graph exists yet (decompose first)
+
+ ## Expected Inputs
+ - Boundary cluster definition: which components, which interfaces, which shared resources
+ - Component graph showing dependency directions
+ - Shared utility file list
+ - Content files (schemas, policies, role definitions) that should match code contracts
+ - Optionally: component auditor outputs (if available, use to focus on flagged boundary concerns)
+
+ ## Required Output
+ - Per-boundary findings using the standardized finding schema:
+   - Severity (critical/high/medium/low/info)
+   - Confidence (certain/likely/possible/speculative)
+   - Category (interface-mismatch/state-flow/error-propagation/dependency-direction/duplicate-logic/integration-gap/architecture/content-drift)
+   - Boundary identification (from → to)
+   - File references on both sides
+   - Evidence: what the caller assumes vs what the callee provides
+   - Impact and recommended fix
+ - "False Independence Risks" section — components that appear separate but share hidden assumptions
+ - "Content ↔ Code Drift" section — where documentation/schemas diverge from implementation
+ - "Dependency Direction Assessment" — is the import graph layered correctly?
+
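
The "Dependency Direction Assessment" asks whether the import graph is layered correctly. One possible reading, sketched below, assigns each component a numeric layer and treats any import from a lower layer into a higher one as a violation. The layer numbering, the function name `layering_violations`, and the component names in the example are assumptions made for illustration. The role definition does not prescribe a layering rule.

```python
def layering_violations(
    layers: dict[str, int], imports: list[tuple[str, str]]
) -> list[tuple[str, str]]:
    """Return (importer, imported) edges where a lower layer imports a higher one.

    Assumes every component named in `imports` has an entry in `layers`,
    and that a larger number means a higher (more application-specific) layer.
    """
    return [(src, dst) for src, dst in imports if layers[src] < layers[dst]]


# Hypothetical three-layer layout: cli on top, utils at the bottom.
example_layers = {"cli": 2, "core": 1, "utils": 0}
example_imports = [("cli", "core"), ("core", "utils"), ("utils", "core")]
example_violations = layering_violations(example_layers, example_imports)
```

Here the only edge that points upward is `utils` importing `core`, which is also the shape of the "shared utility encodes domain logic" concern.
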
+ ## Quality Bar
+ - Every declared boundary must be inspected — no skipping
+ - Findings must reference both sides of the boundary (caller AND callee)
+ - Content-code drift findings must quote both the content claim and the code reality
+ - Must check dependency direction, not just interface shapes
+
+ ## Escalation Triggers
+ - Circular dependency discovered — flag immediately
+ - Shared utility encodes domain logic (god module) — flag for architectural review
+ - Content layer (schemas, policies) fundamentally contradicts code behavior — flag as critical
+ - Component auditors flagged the same boundary from both sides — elevated cross-cutting finding
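
The first escalation trigger, a circular dependency, can be checked directly against the component graph listed under Expected Inputs. The sketch below assumes the graph is an adjacency mapping from each component to the components it depends on. The function name `find_cycle` and that representation are assumptions, not a role-os format. It uses a standard depth-first search with three node states.

```python
from typing import Optional


def find_cycle(graph: dict[str, list[str]]) -> Optional[list[str]]:
    """Return one dependency cycle as a list of names, or None if acyclic."""
    unvisited, in_progress, done = 0, 1, 2
    state = {node: unvisited for node in graph}
    stack: list[str] = []

    def visit(node: str) -> Optional[list[str]]:
        state[node] = in_progress
        stack.append(node)
        for dep in graph.get(node, []):
            dep_state = state.get(dep, unvisited)
            if dep_state == in_progress:
                # Back edge: the cycle is the stack from dep onward, closed.
                return stack[stack.index(dep):] + [dep]
            if dep_state == unvisited:
                found = visit(dep)
                if found:
                    return found
        stack.pop()
        state[node] = done
        return None

    for node in list(graph):
        if state[node] == unvisited:
            found = visit(node)
            if found:
                return found
    return None
```

The search is recursive, so a very deep dependency chain could hit Python's recursion limit. An explicit stack would avoid that at the cost of a longer function.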
@@ -0,0 +1,48 @@
+ # Test Truth Auditor
+
+ ## Mission
+ Determine whether a test suite proves correctness or merely exists. Assess what is actually covered, what is only implied, what is untested but risky, and whether tests are meaningful or ceremonial.
+
+ ## Use When
+ - A component or repo has been identified for deep audit
+ - Test files exist and need truthful coverage assessment
+ - The goal is to distinguish real coverage from test theater
+
+ ## Do Not Use When
+ - The work is about implementation quality (use Component Auditor)
+ - The work is about interfaces between components (use Seam Auditor)
+ - No tests exist (flag the gap and stop — there's nothing to audit)
+
+ ## Expected Inputs
+ - Test file paths to audit
+ - Corresponding implementation file paths (read-only reference)
+ - Component mapping: which test files cover which source files
+ - Test framework and runner context (e.g., node:test, vitest, pytest, cargo test)
+
+ ## Required Output
+ - Per-test-file findings using the standardized finding schema:
+   - Severity (critical/high/medium/low/info)
+   - Confidence (certain/likely/possible/speculative)
+   - Category (test-gap/ceremonial-test/isolation/mock-fidelity/integration-gap/edge-case)
+   - Test file and source file references
+   - What function/behavior is untested or poorly tested
+   - Evidence: what the test does vs what it should do
+   - Impact: what bugs could slip through
+   - Recommended test to add or improve
+ - "Untested but Risky" section — specific functions/flows with no coverage
+ - "Ceremonial Tests" section — tests that exist but prove nothing meaningful
+ - "Integration Gaps" section — multi-module flows only unit-tested
+ - Test Suite Health Summary: total files, source files with no test, estimated real coverage, verdict (healthy/adequate/concerning/insufficient)
+
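
The "source files with no test" entry in the Test Suite Health Summary can be derived from the component mapping listed under Expected Inputs. The sketch below assumes the mapping is a dictionary from each test file to the source files it covers. That shape, the function name `untested_sources`, and the file names in the example are hypothetical.

```python
def untested_sources(
    source_files: list[str], mapping: dict[str, list[str]]
) -> list[str]:
    """List source files that no test file in the mapping claims to cover."""
    covered = {src for sources in mapping.values() for src in sources}
    return sorted(src for src in source_files if src not in covered)


# Hypothetical parcel: two of three source files appear in the mapping.
example_sources = ["src/cache.py", "src/parser.py", "src/report.py"]
example_mapping = {
    "tests/test_parser.py": ["src/parser.py"],
    "tests/test_report.py": ["src/report.py", "src/parser.py"],
}
example_untested = untested_sources(example_sources, example_mapping)
```

A file that does appear in the mapping is only claimed as covered. Whether its tests verify behavior is a separate judgment that this check cannot make.
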
+ ## Quality Bar
+ - Must distinguish "line is executed" from "behavior is verified" — a test that calls a function and doesn't assert the result is ceremonial
+ - Must identify missing edge case tests for error paths, boundary values, empty inputs
+ - Must assess mock fidelity — do mocks match real behavior or mask bugs?
+ - Must flag test isolation issues — shared state, order dependence, flaky patterns
+ - Source files with no dedicated test file must be explicitly listed
+
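
The first two Quality Bar items can be made concrete with a small contrast. In the sketch below, `parse_port` is a hypothetical function under test, written in pytest style although the tests are plain functions. The first test executes the code but asserts nothing. The other two verify a result and exercise error paths, boundary values, and empty input.

```python
def parse_port(value: str) -> int:
    """Hypothetical function under test: parse a TCP port number."""
    port = int(value)
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port


def test_parse_port_ceremonial():
    # Executes the line but asserts nothing, so any return value passes.
    parse_port("8080")


def test_parse_port_verifies_result():
    # Same call, but the returned value is actually checked.
    assert parse_port("8080") == 8080


def test_parse_port_rejects_bad_input():
    # Boundary values, a negative number, and empty input must all raise.
    for bad in ("0", "65536", "-1", ""):
        try:
            parse_port(bad)
        except ValueError:
            continue
        raise AssertionError(f"{bad!r} should have been rejected")
```

Line-coverage tools would report the ceremonial test as covering the success path of `parse_port`, which is the "line is executed" versus "behavior is verified" gap the Quality Bar describes.
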
+ ## Escalation Triggers
+ - Source file with no test coverage at all — flag as test gap
+ - Test suite has order-dependent tests — flag as isolation issue
+ - Mocks diverge from real implementation — flag as mock fidelity risk
+ - Test-to-code ratio is healthy but real coverage is low (ceremonial tests inflate the ratio) — flag as false confidence