@trohde/earos 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (135)
  1. package/README.md +156 -0
  2. package/assets/init/.agents/skills/earos-artifact-gen/SKILL.md +106 -0
  3. package/assets/init/.agents/skills/earos-artifact-gen/references/interview-guide.md +313 -0
  4. package/assets/init/.agents/skills/earos-artifact-gen/references/output-guide.md +367 -0
  5. package/assets/init/.agents/skills/earos-assess/SKILL.md +212 -0
  6. package/assets/init/.agents/skills/earos-assess/references/calibration-benchmarks.md +160 -0
  7. package/assets/init/.agents/skills/earos-assess/references/output-templates.md +311 -0
  8. package/assets/init/.agents/skills/earos-assess/references/scoring-protocol.md +281 -0
  9. package/assets/init/.agents/skills/earos-calibrate/SKILL.md +153 -0
  10. package/assets/init/.agents/skills/earos-calibrate/references/agreement-metrics.md +188 -0
  11. package/assets/init/.agents/skills/earos-calibrate/references/calibration-protocol.md +263 -0
  12. package/assets/init/.agents/skills/earos-create/SKILL.md +257 -0
  13. package/assets/init/.agents/skills/earos-create/references/criterion-writing-guide.md +268 -0
  14. package/assets/init/.agents/skills/earos-create/references/dependency-rules.md +193 -0
  15. package/assets/init/.agents/skills/earos-create/references/rubric-interview-guide.md +123 -0
  16. package/assets/init/.agents/skills/earos-create/references/validation-checklist.md +238 -0
  17. package/assets/init/.agents/skills/earos-profile-author/SKILL.md +251 -0
  18. package/assets/init/.agents/skills/earos-profile-author/references/criterion-writing-guide.md +280 -0
  19. package/assets/init/.agents/skills/earos-profile-author/references/design-methods.md +158 -0
  20. package/assets/init/.agents/skills/earos-profile-author/references/profile-checklist.md +173 -0
  21. package/assets/init/.agents/skills/earos-remediate/SKILL.md +118 -0
  22. package/assets/init/.agents/skills/earos-remediate/references/output-template.md +199 -0
  23. package/assets/init/.agents/skills/earos-remediate/references/remediation-patterns.md +330 -0
  24. package/assets/init/.agents/skills/earos-report/SKILL.md +85 -0
  25. package/assets/init/.agents/skills/earos-report/references/portfolio-template.md +181 -0
  26. package/assets/init/.agents/skills/earos-report/references/single-artifact-template.md +168 -0
  27. package/assets/init/.agents/skills/earos-review/SKILL.md +130 -0
  28. package/assets/init/.agents/skills/earos-review/references/challenge-patterns.md +163 -0
  29. package/assets/init/.agents/skills/earos-review/references/output-template.md +180 -0
  30. package/assets/init/.agents/skills/earos-template-fill/SKILL.md +177 -0
  31. package/assets/init/.agents/skills/earos-template-fill/references/evidence-writing-guide.md +186 -0
  32. package/assets/init/.agents/skills/earos-template-fill/references/section-rubric-mapping.md +200 -0
  33. package/assets/init/.agents/skills/earos-validate/SKILL.md +113 -0
  34. package/assets/init/.agents/skills/earos-validate/references/fix-patterns.md +281 -0
  35. package/assets/init/.agents/skills/earos-validate/references/validation-checks.md +287 -0
  36. package/assets/init/.claude/CLAUDE.md +4 -0
  37. package/assets/init/AGENTS.md +293 -0
  38. package/assets/init/CLAUDE.md +635 -0
  39. package/assets/init/README.md +507 -0
  40. package/assets/init/calibration/gold-set/.gitkeep +0 -0
  41. package/assets/init/calibration/results/.gitkeep +0 -0
  42. package/assets/init/core/core-meta-rubric.yaml +643 -0
  43. package/assets/init/docs/consistency-report.md +325 -0
  44. package/assets/init/docs/getting-started.md +194 -0
  45. package/assets/init/docs/profile-authoring-guide.md +51 -0
  46. package/assets/init/docs/terminology.md +126 -0
  47. package/assets/init/earos.manifest.yaml +104 -0
  48. package/assets/init/evaluations/.gitkeep +0 -0
  49. package/assets/init/examples/aws-event-driven-order-processing/artifact.yaml +2056 -0
  50. package/assets/init/examples/aws-event-driven-order-processing/evaluation.yaml +973 -0
  51. package/assets/init/examples/aws-event-driven-order-processing/report.md +244 -0
  52. package/assets/init/examples/example-solution-architecture.evaluation.yaml +136 -0
  53. package/assets/init/examples/multi-cloud-data-analytics/artifact.yaml +715 -0
  54. package/assets/init/overlays/data-governance.yaml +94 -0
  55. package/assets/init/overlays/regulatory.yaml +154 -0
  56. package/assets/init/overlays/security.yaml +92 -0
  57. package/assets/init/profiles/adr.yaml +225 -0
  58. package/assets/init/profiles/capability-map.yaml +223 -0
  59. package/assets/init/profiles/reference-architecture.yaml +426 -0
  60. package/assets/init/profiles/roadmap.yaml +205 -0
  61. package/assets/init/profiles/solution-architecture.yaml +227 -0
  62. package/assets/init/research/architecture-assessment-rubrics-research.docx +0 -0
  63. package/assets/init/research/architecture-assessment-rubrics-research.md +566 -0
  64. package/assets/init/research/reference-architecture-research.md +751 -0
  65. package/assets/init/standard/EAROS.md +1426 -0
  66. package/assets/init/standard/schemas/artifact.schema.json +1295 -0
  67. package/assets/init/standard/schemas/artifact.uischema.json +65 -0
  68. package/assets/init/standard/schemas/evaluation.schema.json +284 -0
  69. package/assets/init/standard/schemas/rubric.schema.json +383 -0
  70. package/assets/init/templates/evaluation-record.template.yaml +58 -0
  71. package/assets/init/templates/new-profile.template.yaml +65 -0
  72. package/bin.js +188 -0
  73. package/dist/assets/_basePickBy-BVu6YmSW.js +1 -0
  74. package/dist/assets/_baseUniq-CWRzQDz_.js +1 -0
  75. package/dist/assets/arc-CyDBhtDM.js +1 -0
  76. package/dist/assets/architectureDiagram-2XIMDMQ5-BH6O4dvN.js +36 -0
  77. package/dist/assets/blockDiagram-WCTKOSBZ-2xmwdjpg.js +132 -0
  78. package/dist/assets/c4Diagram-IC4MRINW-BNmPRFJF.js +10 -0
  79. package/dist/assets/channel-CiySTNoJ.js +1 -0
  80. package/dist/assets/chunk-4BX2VUAB-DGQTvirp.js +1 -0
  81. package/dist/assets/chunk-55IACEB6-DNMAQAC_.js +1 -0
  82. package/dist/assets/chunk-FMBD7UC4-BJbVTQ5o.js +15 -0
  83. package/dist/assets/chunk-JSJVCQXG-BCxUL74A.js +1 -0
  84. package/dist/assets/chunk-KX2RTZJC-H7wWZOfz.js +1 -0
  85. package/dist/assets/chunk-NQ4KR5QH-BK4RlTQF.js +220 -0
  86. package/dist/assets/chunk-QZHKN3VN-0chxDV5g.js +1 -0
  87. package/dist/assets/chunk-WL4C6EOR-DexfQ-AV.js +189 -0
  88. package/dist/assets/classDiagram-VBA2DB6C-D7luWJQn.js +1 -0
  89. package/dist/assets/classDiagram-v2-RAHNMMFH-D7luWJQn.js +1 -0
  90. package/dist/assets/clone-ylgRbd3D.js +1 -0
  91. package/dist/assets/cose-bilkent-S5V4N54A-DS2IOCfZ.js +1 -0
  92. package/dist/assets/cytoscape.esm-CyJtwmzi.js +331 -0
  93. package/dist/assets/dagre-KLK3FWXG-BbSoTTa3.js +4 -0
  94. package/dist/assets/defaultLocale-DX6XiGOO.js +1 -0
  95. package/dist/assets/diagram-E7M64L7V-C9TvYgv0.js +24 -0
  96. package/dist/assets/diagram-IFDJBPK2-DowUMWrg.js +43 -0
  97. package/dist/assets/diagram-P4PSJMXO-BL6nrnQF.js +24 -0
  98. package/dist/assets/erDiagram-INFDFZHY-rXPRl8VM.js +70 -0
  99. package/dist/assets/flowDiagram-PKNHOUZH-DBRM99-W.js +162 -0
  100. package/dist/assets/ganttDiagram-A5KZAMGK-INcWFsBT.js +292 -0
  101. package/dist/assets/gitGraphDiagram-K3NZZRJ6-DMwpfE91.js +65 -0
  102. package/dist/assets/graph-DLQn37b-.js +1 -0
  103. package/dist/assets/index-BFFITMT8.js +650 -0
  104. package/dist/assets/index-H7f6VTz1.css +1 -0
  105. package/dist/assets/infoDiagram-LFFYTUFH-B0f4TWRM.js +2 -0
  106. package/dist/assets/init-Gi6I4Gst.js +1 -0
  107. package/dist/assets/ishikawaDiagram-PHBUUO56-CsU6XimZ.js +70 -0
  108. package/dist/assets/journeyDiagram-4ABVD52K-CQ7ibNib.js +139 -0
  109. package/dist/assets/kanban-definition-K7BYSVSG-DzEN7THt.js +89 -0
  110. package/dist/assets/katex-B1X10hvy.js +261 -0
  111. package/dist/assets/layout-C0dvb42R.js +1 -0
  112. package/dist/assets/linear-j4a8mGj7.js +1 -0
  113. package/dist/assets/mindmap-definition-YRQLILUH-DP8iEuCf.js +68 -0
  114. package/dist/assets/ordinal-Cboi1Yqb.js +1 -0
  115. package/dist/assets/pieDiagram-SKSYHLDU-BpIAXgAm.js +30 -0
  116. package/dist/assets/quadrantDiagram-337W2JSQ-DrpXn5Eg.js +7 -0
  117. package/dist/assets/requirementDiagram-Z7DCOOCP-Bg7EwHlG.js +73 -0
  118. package/dist/assets/sankeyDiagram-WA2Y5GQK-BWagRs1F.js +10 -0
  119. package/dist/assets/sequenceDiagram-2WXFIKYE-q5jwhivG.js +145 -0
  120. package/dist/assets/stateDiagram-RAJIS63D-B_J9pE-2.js +1 -0
  121. package/dist/assets/stateDiagram-v2-FVOUBMTO-Q_1GcybB.js +1 -0
  122. package/dist/assets/timeline-definition-YZTLITO2-dv0jgQ0z.js +61 -0
  123. package/dist/assets/treemap-KZPCXAKY-Dt1dkIE7.js +162 -0
  124. package/dist/assets/vennDiagram-LZ73GAT5-BdO5RgRZ.js +34 -0
  125. package/dist/assets/xychartDiagram-JWTSCODW-CpDVe-8v.js +7 -0
  126. package/dist/index.html +23 -0
  127. package/export-docx.js +1583 -0
  128. package/init.js +353 -0
  129. package/manifest-cli.mjs +207 -0
  130. package/package.json +83 -0
  131. package/schemas/artifact.schema.json +1295 -0
  132. package/schemas/artifact.uischema.json +65 -0
  133. package/schemas/evaluation.schema.json +284 -0
  134. package/schemas/rubric.schema.json +383 -0
  135. package/serve.js +238 -0
@@ -0,0 +1,186 @@ package/assets/init/.agents/skills/earos-template-fill/references/evidence-writing-guide.md (new file)
# Evidence Writing Guide — EAROS Template Fill

This file shows how to write architecture content that produces strong EAROS evidence. Read this when helping authors draft specific sections or reviewing content they've provided.

---

## The Core Principle: Explicit Over Implicit

EAROS evaluators score what is stated, not what they can infer. An author who understands this writes explicitly: stating their reasoning, naming their stakeholders, listing their assumptions, and mapping their decisions to drivers.

The single most common improvement that raises EAROS scores is replacing implied content with stated content.

**Implied (scores 1–2):** "The event-driven architecture enables scalability."

**Stated (scores 3–4):** "The event-driven architecture was chosen to address the scalability driver (Driver-3: handle 10x traffic peaks). Alternatives considered: synchronous REST (rejected: cascade failure risk under load), message queue fan-out (rejected: requires managed MQ service not approved in current cloud account)."

The content is the same architectural idea — but the explicit version provides evidence anchors that an assessor can score against each level descriptor.

---

## Writing Patterns by Section Type

### Pattern 1 — The Stakeholder Table

**Weak (scores 1):**
> "Audience: Technical stakeholders and business owners."

**Adequate (scores 2):**
> "Stakeholders: Solution Architect, Platform Team, Security Review Board, Service Owner."

**Strong (scores 3–4):**

| Stakeholder | Role | Primary Concern | Addressed In |
|------------|------|----------------|-------------|
| Solution Architect | Document owner | Completeness, architectural soundness | All sections |
| Platform Team | Operations | Deployment topology, runbook requirements | Section 5, Appendix B |
| Security Review Board | Governance | Control compliance, threat model | Section 6, Appendix A |
| Service Owner | Business | Cost model, SLA commitments | Section 7 |

**Why the table works:** It provides direct evidence for STK-01 (named stakeholders with concerns) AND for CVP-01 (views addressing stakeholder concerns). One table serves multiple criteria.

---

### Pattern 2 — The Scope Block

**Weak (scores 0–1):**
> "This document covers the payments architecture."

**Adequate (scores 2):**
> "In scope: Payments service, API gateway, upstream banking core.
> Out of scope: authentication, reporting."

**Strong (scores 3–4):**
```
IN SCOPE:
- Payments Service (new) — core payment processing logic
- Notification Service (existing, modified) — payment confirmation events
- Banking Core API (existing, upstream) — account validation and settlement

OUT OF SCOPE:
- Authentication/Authorization — handled by IAM Platform (see IAM-ARCH-2024-001)
- Analytics Pipeline — separate initiative (ANALYTICS-2025 roadmap item)
- Mobile App — consumer of this service, not modified in this initiative

ASSUMPTIONS:
- Banking Core API contract is stable for 12 months (contact: payments-arch@company.com)
- Mobile app team provides test harness for integration testing by Q2 2026
- Existing AWS EU-West-1 account remains the deployment target

CONSTRAINTS:
- No new PII data stores — any personal data must flow through the approved Data Platform
- GDPR data residency requirements apply — all processing within EU
```

**Why this works:** Each section maps directly to what SCP-01 requires. An assessor can find evidence for each level descriptor immediately. Score 3 requires all four blocks present; score 4 adds consistency verification (e.g., scope boundary tested across all views).
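The four-block structure lends itself to a mechanical pre-submission check. A minimal sketch in Python; the header strings are the ones used in this example, not an EAROS-mandated vocabulary:

```python
# Flag scope sections that are missing one of the four structural blocks.
# Header names follow the example above; adapt them to your house style.
REQUIRED_BLOCKS = ("IN SCOPE:", "OUT OF SCOPE:", "ASSUMPTIONS:", "CONSTRAINTS:")

def missing_scope_blocks(section_text: str) -> list[str]:
    """Return the block headers absent from a scope section."""
    return [block for block in REQUIRED_BLOCKS if block not in section_text]

sample = """IN SCOPE:
- Payments Service (new)
OUT OF SCOPE:
- Analytics Pipeline
ASSUMPTIONS:
- Banking Core API contract is stable for 12 months
"""
print(missing_scope_blocks(sample))  # ['CONSTRAINTS:'] — this draft caps SCP-01 below 3
```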

---

### Pattern 3 — The Decision Record

**Weak (scores 1):**
> "We chose event-driven architecture for scalability."

**Adequate (scores 2):**
> "Event-driven architecture was chosen because it enables decoupling between producers and consumers, supporting the scalability requirements."

**Strong (scores 3–4):**
```
Decision: Adopt event-driven architecture using Apache Kafka for inter-service communication

Context: Payment volume is expected to scale 10x during peak periods (Driver-3). The current
synchronous REST integration pattern creates cascade failures when Banking Core API
latency increases (observed in P95 incident Aug 2025, Incident-2025-0143).

Options considered:
A. Synchronous REST (rejected): cascade failure risk, confirmed by incident analysis
B. Message queue fan-out (rejected): requires managed MQ service not in approved catalog
C. Event-driven Kafka (selected): approved platform service, proven at 2x current volume

Rationale: Option C addresses the scalability driver without introducing unapproved
dependencies. Operational overhead (Kafka expertise) accepted — Platform Team
confirmed capability.

Revisit trigger: If Kafka proves operationally burdensome by Q3 2026 review, re-evaluate B.
```

**Why this works:** Provides TRC-01 evidence (link to driver), RAT-01 evidence (trade-offs considered), and ACT-01 evidence (revisit condition named). One decision record serves three criteria.

---

### Pattern 4 — The Risk Table

**Absent (scores 0):**
> "Risks: TBD"

**Weak (scores 1):**
> "Risks: Performance, security, integration."

**Adequate (scores 2):**

| Risk | Likelihood | Impact | Mitigation |
|------|-----------|--------|------------|
| API latency degradation | Medium | High | Circuit breaker pattern |
| Data loss on Kafka failure | Low | Critical | Persistent disk, replication factor 3 |

**Strong (scores 3–4):**

| Risk | Likelihood | Impact | Mitigation | Owner | Residual Risk |
|------|-----------|--------|------------|-------|--------------|
| Banking Core API SLA breach | Medium | High | Circuit breaker + fallback to cached data; see RUNBOOK-PAY-003 | Platform Eng | Low — fallback tested to 15min outage |
| Kafka consumer lag during peak | Medium | Medium | Auto-scaling consumer group; alert at 5min lag | Payments Eng | Medium — depends on Auto-scaling SLA |
| PII data in event payload | Low | Critical | Event schema validation gate; DLP scanning on all topics | Security | Low — schema registry prevents unknown fields |

**The most commonly missing columns:** Residual Risk and Owner. "Mitigation: TBD" or "Owner: TBD" caps RAT-01 at score 2.
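Whether a risk table carries those columns, and actually fills them, can be checked mechanically. A sketch over a markdown table, assuming the column names used in the strong example above:

```python
# Flag risk rows whose Owner or Residual Risk cell is empty or "TBD".
def incomplete_risk_rows(table: str) -> list[str]:
    lines = [ln for ln in table.strip().splitlines() if ln.strip().startswith("|")]
    header = [c.strip() for c in lines[0].strip().strip("|").split("|")]
    flagged = []
    for row in lines[2:]:  # skip the header and separator rows
        cells = dict(zip(header, (c.strip() for c in row.strip().strip("|").split("|"))))
        if cells.get("Owner", "TBD") in ("", "TBD") or cells.get("Residual Risk", "TBD") in ("", "TBD"):
            flagged.append(cells.get("Risk", row))
    return flagged

table = """\
| Risk | Likelihood | Impact | Mitigation | Owner | Residual Risk |
|------|-----------|--------|------------|-------|--------------|
| API latency degradation | Medium | High | Circuit breaker | Platform Eng | Low |
| Kafka consumer lag | Medium | Medium | Auto-scaling | TBD | Medium |
"""
print(incomplete_risk_rows(table))  # ['Kafka consumer lag'] — this row caps RAT-01 at 2
```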

---

### Pattern 5 — The Compliance Section

**The most common failure mode (scores 0):**
> "The solution will comply with all applicable security and regulatory standards."

**Marginal (scores 1):**
> "The solution addresses GDPR and ISO 27001 requirements."

**Adequate (scores 2):**
> "GDPR controls applied: data minimization (only payment reference stored, not card number), right to erasure implemented via Payments Data API. ISO 27001: encryption at rest and in transit implemented."

**Strong (scores 3–4):**

| Control | Standard | How Addressed | Evidence |
|---------|----------|--------------|---------|
| Data minimization | GDPR Art. 5(1)(c) | Card numbers never stored — only payment tokens via PCI-DSS vault | Section 4.2, Data Flow Diagram |
| Encryption at rest | ISO 27001 A.10.1 | AES-256 on all storage; key rotation quarterly | Section 5.3 |
| Access control | Enterprise Security Baseline v3 | RBAC via IAM Platform; no direct DB access | Section 5.4 |
| PCI DSS SAQ-A | PCI DSS 3.2.1 | Delegated to payment gateway (Stripe); scope reduction confirmed by Security Review 2025-Q3 | Appendix A |

---

## The "Explicit Over Implicit" Checklist

Use this to review any section before submission:

- [ ] Are stakeholders **named** (not "technical teams") with **specific concerns**?
- [ ] Is scope **listed** (not described) with explicit in-scope AND out-of-scope?
- [ ] Are assumptions **stated** (not implied from the design choices)?
- [ ] Are drivers **referenced by name** in decision rationale (not just alluded to)?
- [ ] Are risks **in a table** with mitigations AND owners AND residual risk?
- [ ] Is compliance **mapped to specific named controls** (not stated as assertion)?
- [ ] Are component names **consistent** between text and all diagrams?
- [ ] Does each diagram have a **legend** or annotation explaining its notation?
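Some checklist failures can be caught by plain phrase matching before a human pass. A rough sketch; the phrase list below is illustrative, not an official EAROS rule set:

```python
# Flag phrases that usually signal implicit-over-explicit writing.
# Hypothetical phrase-to-hint map; extend it with your own repeat offenders.
RED_FLAGS = {
    "technical stakeholders": "Name roles with specific concerns (STK-01)",
    "best practices": "Name the specific practices applied (CMP-01)",
    "all applicable": "Name each standard explicitly (CMP-01)",
    "to be determined": "Resolve before submission — gate risk",
    "tbd": "Resolve before submission — gate risk",
}

def lint_section(text: str) -> list[tuple[str, str]]:
    """Return (phrase, remediation hint) pairs found in a draft section."""
    lowered = text.lower()
    return [(phrase, hint) for phrase, hint in RED_FLAGS.items() if phrase in lowered]

draft = "The solution will comply with all applicable standards. Risk owner: TBD."
for phrase, hint in lint_section(draft):
    print(f"{phrase!r}: {hint}")
```

A match is a prompt to rewrite, not a score by itself; "tbd" inside a longer word would also match, so treat hits as candidates for review.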

---

## Common Writing Anti-Patterns

| Anti-Pattern | EAROS Problem | Fix |
|-------------|--------------|-----|
| "The architecture follows best practices" | Assertion without evidence; CMP-01 = 0 | Name the specific practices and how they're applied |
| "Risks will be managed" | Not a risk statement; RAT-01 = 0–1 | Add: what risk, likelihood, impact, mitigation, owner |
| "See attached diagram" | Diagram without narrative; CVP-01 = 1 | Add: what the diagram shows, what to look for, what boundaries mean |
| "Compliant with enterprise standards" | No named standard; CMP-01 = 0 | Name each: GDPR, ISO 27001, PCI-DSS, etc. |
| "To be determined" in gate criterion section | Immediate gate risk | These must be resolved before submission |
| Generic stakeholders | STK-01 = 1 | Name roles with specific concerns |
| Scope as a paragraph | Hard to extract in/out/assumptions; SCP-01 = 2 | Use structured lists or tables |
| One diagram only | CVP-01 = 1 | Add context, deployment, and data flow views |
@@ -0,0 +1,200 @@ package/assets/init/.agents/skills/earos-template-fill/references/section-rubric-mapping.md (new file)
# Section-to-Rubric Mapping — EAROS Template Fill

This file maps common architecture document sections to the EAROS criteria they address. Use it when walking authors through their document or checking completeness.

---

## Why This Mapping Matters

Authors structure documents for readability; EAROS evaluators assess documents for criterion coverage. These don't always align. An author can have a well-structured document that still misses three core criteria because those concerns were spread across sections without explicit treatment.

This mapping shows where criterion coverage is expected so authors can write explicitly, not just organically.

---

## Core Criteria — Section Mapping and Scoring Boundaries

### STK-01 — Stakeholder Identification
**Criterion question:** Are the decision audience and key stakeholders identified with their primary concerns?

**Expected in sections:** Introduction, Purpose, Audience, or dedicated Stakeholders section

**What assessors look for:**
- Named stakeholders (roles, not just "the business")
- Primary concern mapped to each stakeholder
- Document's decision relevance to each stakeholder

**Score 2 vs. 3 boundary:** Listed vs. listed-with-concerns-mapped

**Strong (score 3–4):**
> | Stakeholder | Role | Primary Concern | Addressed In |
> |------------|------|----------------|-------------|
> | Solution Architect | Document owner | Design completeness, soundness | All sections |
> | Platform Team | Operations | Deployment topology, runbook completeness | Section 5, Appendix B |
> | Security Review Board | Governance | Control compliance, threat model | Section 6, Appendix A |
> | Service Owner | Business | Cost model, SLA commitments | Section 7 |

**Weak (score 1):**
> "Audience: Technical teams and business stakeholders."

---

### SCP-01 — Scope and Boundaries ⚠️ GATE (major or critical depending on profile)
**Criterion question:** Are scope, out-of-scope elements, constraints, and assumptions explicitly stated?

**Expected in sections:** Scope, Boundaries, Constraints and Assumptions, or Introduction

**What assessors look for:**
- Explicit in-scope list (named components/systems)
- Explicit out-of-scope list (what isn't covered and why)
- Stated assumptions (especially ones that affect the design)
- Constraints (regulatory, technical, organizational)

**Score 2 vs. 3 boundary:** Scope defined vs. scope + exclusions + assumptions all stated

**Strong (score 3–4):**
> IN SCOPE: Payments service, Notification service, upstream Banking Core API
> OUT OF SCOPE: Authentication (handled by IAM platform — see IAM-2024-001), analytics pipeline
> ASSUMPTIONS: Banking Core API versioned contract stable for 12 months
> CONSTRAINTS: Must operate within existing AWS EU-West-1 account; GDPR data residency applies

**Weak (score 0–1):**
> "This document covers the payments service architecture."

---

### CVP-01 — Content and Viewpoints
**Criterion question:** Are the chosen views appropriate for the stated stakeholders and decision purpose?

**Expected in sections:** Architecture Views, Solution Overview, or any section with diagrams

**What assessors look for:**
- Multiple views (context, component, deployment, data flow — not just one diagram)
- Connection between views and stakeholder concerns
- Annotated diagrams with legends

**Score 2 vs. 3 boundary:** Multiple views present vs. views explicitly mapped to stakeholder concerns

---

### TRC-01 — Traceability
**Criterion question:** Are architecture decisions traceable to business drivers or requirements?

**Expected in sections:** Architecture Decisions, Design Rationale, Decision Log

**What assessors look for:**
- Explicit links from decisions to the business drivers that motivated them
- Decision record format: context → options → decision → rationale
- References back to requirement IDs or named principles

**Score 2 vs. 3 boundary:** Decisions exist vs. decisions with explicit driver references

**Strong (score 3):**
> "Decision: Adopt event-driven Kafka. Context: scalability driver (Driver-3). Options: REST synchronous (rejected: cascade failure risk), Kafka (selected). Rationale: proven at 2× current volume, approved platform service."

**Weak (score 1):**
> "We chose event-driven architecture for scalability."

---

### CON-01 — Internal Consistency
**Criterion question:** Are terminology and component naming consistent across all sections and diagrams?

**Expected everywhere:** Checked across all sections and diagrams (not a specific section)

**What assessors look for:**
- Same name for the same component in all diagrams and text
- Scope boundary consistent between all views
- API contracts consistent between description and sequence diagrams

**Score 2 vs. 3 boundary:** Minor inconsistencies vs. fully consistent with a glossary

---

### RAT-01 — Risk and Assumptions ⚠️ GATE (major in most profiles)
**Criterion question:** Are risks, assumptions, and constraints identified with mitigations and owners?

**Expected in sections:** Risks, RAID Log, Assumptions, or Risk Register

**What assessors look for:**
- Risk register with all columns: Risk, Likelihood, Impact, Mitigation, Owner, Residual Risk
- Architectural trade-offs explicitly named
- Open questions flagged with planned resolution

**Score 2 vs. 3 boundary:** Risks listed vs. risks with mitigations AND owners

**Strong (score 3–4):**
> | Risk | Likelihood | Impact | Mitigation | Owner | Residual |
> |------|-----------|--------|------------|-------|---------|
> | Banking Core API SLA breach | Medium | High | Circuit breaker + fallback to cached data | Platform Eng | Low |

**Weak (score 0–1):**
> "Risks: Performance, security, integration issues."

---

### CMP-01 — Compliance and Standards ⚠️ GATE (critical in many profiles)
**Criterion question:** Does the design address applicable compliance frameworks and enterprise standards?

**Expected in sections:** Compliance, Security, Standards References, or Architecture Decisions

**What assessors look for:**
- Named standards (not "industry standards" — specific names like GDPR, ISO 27001, PCI-DSS)
- Specific controls mapped to specific design elements
- Named exceptions with approval path

**Score 2 vs. 3 boundary:** Standards mentioned vs. specific controls mapped to design

**The critical anti-pattern** (scores 0):
> "The solution will comply with all applicable security and regulatory standards."

---

### ACT-01 — Actions and Decisions
**Criterion question:** Are the key decisions and required actions clearly stated with owners?

**Expected in sections:** Decision, Recommendations, Next Steps, or Action Log

**What assessors look for:**
- Clear decision statement (what was decided, not just what was considered)
- Named actions with owners and target dates
- Decision authority identified

---

### MNT-01 — Maintainability and Ownership
**Criterion question:** Is the document owned, versioned, and maintainable?

**Expected in sections:** Document Control, cover page, header/footer

**What assessors look for:**
- Named owner (team or role)
- Version number and date
- Change history or changelog
- Review trigger or next review date

---

## Gate Summary by Profile

| Profile | Critical Gates | Major Gates |
|---------|---------------|-------------|
| Solution architecture | CMP-01 | SCP-01, RAT-01 |
| Reference architecture | None | RA-VIEW-01, RA-IMPL-01 (see profile) |
| ADR | SCP-01 | CON-01, RAT-01 |
| Capability map | None | SCP-01, TRC-01 |
| Roadmap | None | ACT-01, TRC-01 |

> Always verify from the loaded profile YAML — this table is indicative only. Gate assignments are defined in the `gate` field of each criterion.
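Verifying against a profile means reading each criterion's `gate` field directly. A minimal sketch over an already-parsed profile (shown here as a plain dict; the `dimensions`/`criteria`/`gate.severity` shape is assumed from the descriptions in this repo, so confirm it against `rubric.schema.json`):

```python
def gates_by_severity(profile: dict) -> dict[str, list[str]]:
    """Group gated criterion IDs by their gate severity."""
    gates: dict[str, list[str]] = {}
    for dimension in profile.get("dimensions", []):
        for criterion in dimension.get("criteria", []):
            gate = criterion.get("gate")
            if gate:  # non-gated criteria carry no gate configuration
                gates.setdefault(gate["severity"], []).append(criterion["id"])
    return gates

profile = {  # hypothetical fragment of a parsed profile YAML
    "dimensions": [
        {"criteria": [
            {"id": "CMP-01", "gate": {"severity": "critical", "failure_effect": "fail"}},
            {"id": "SCP-01", "gate": {"severity": "major", "failure_effect": "cap"}},
            {"id": "STK-01"},
        ]}
    ]
}
print(gates_by_severity(profile))  # {'critical': ['CMP-01'], 'major': ['SCP-01']}
```

Running this over each profile and comparing against the table above turns "indicative only" into a checked claim.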

---

## Common Completeness Failures

1. **Scope without assumptions** — Section exists but assumptions unstated → SCP-01 capped at 2
2. **Risks without owners** — Risk table has "Owner: TBD" → RAT-01 capped at 2
3. **Compliance by assertion** — "The solution will comply with all standards" → CMP-01 = 0
4. **Single diagram** — One architecture diagram presented as complete → CVP-01 = 1
5. **Traceability implied** — Decisions made with no reference to drivers → TRC-01 = 1–2
6. **Generic stakeholders** — "Audience: technical teams" → STK-01 = 1
7. **Out-of-scope omitted** — Scope section exists but no explicit exclusions → SCP-01 = 2
@@ -0,0 +1,113 @@ package/assets/init/.agents/skills/earos-validate/SKILL.md (new file)
---
name: earos-validate
description: "Run a project health check on the EAROS repository. Validates all YAML rubric files against schemas, checks ID uniqueness, verifies cross-references, detects missing v2 fields, and reports on documentation accuracy. Triggers when the user wants to validate the EAROS repo, check rubric health, run a consistency check, verify schemas, find missing fields, or says \"validate the rubrics\", \"check the EAROS repo\", \"run a health check\", \"check for schema errors\", \"find inconsistencies\", or \"is the rubric set valid\"."
---

# EAROS Validate — Repository Health Check

You run a systematic health check on the EAROS repository. This catches errors that accumulate silently during development: ID conflicts across files, missing required fields added in v2, documentation claims that no longer match the YAML, gate configurations that contradict the status logic.

**Why this matters:** A rubric with a duplicate criterion ID will produce ambiguous evaluation records. A profile with a missing `decision_tree` field will calibrate unreliably. Documentation that says "10 criteria" when there are 11 creates confusion for authors and reviewers. These errors compound. A weekly or pre-commit health check prevents this.

## What to Load First

Read before running any checks:
1. `earos.manifest.yaml` — the authoritative registry of all rubric files; Check 8 validates it
2. `standard/schemas/rubric.schema.json` — schema for all rubric/profile/overlay YAML files
3. `standard/schemas/evaluation.schema.json` — schema for evaluation record files

Then read all YAML files in: `core/`, `profiles/`, `overlays/`, `examples/`

## The Eight Checks

Run all eight. Do not stop at the first error.

**Check 1 — Schema conformance**
For each rubric YAML, verify required top-level fields, `scoring` and `outputs` sub-fields, and kind-specific requirements (profiles must have `inherits`, overlays must not, overlays must use `append_to_base_rubric`).

**Check 2 — Criterion v2 field completeness**
Every criterion must have all 13 v2 fields: `id`, `question`, `description`, `metric_type`, `scale`, `gate`, `required_evidence`, `scoring_guide` (keys "0"–"4"), `anti_patterns`, `examples.good`, `examples.bad`, `decision_tree`, `remediation_hints`.
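This check reduces to a set difference. A sketch: `examples.good`/`examples.bad` are nested under an `examples` key, so the parsed criterion dict is flattened first (field list copied from above):

```python
# The 13 required v2 fields, with the nested examples keys in dotted form.
V2_FIELDS = {
    "id", "question", "description", "metric_type", "scale", "gate",
    "required_evidence", "scoring_guide", "anti_patterns",
    "examples.good", "examples.bad", "decision_tree", "remediation_hints",
}

def missing_v2_fields(criterion: dict) -> set[str]:
    """Return the v2 fields a parsed criterion lacks."""
    present = set(criterion) - {"examples"}
    present |= {f"examples.{key}" for key in criterion.get("examples", {})}
    return V2_FIELDS - present

criterion = {field: "..." for field in V2_FIELDS if not field.startswith("examples.")}
criterion["examples"] = {"good": "...", "bad": "..."}
del criterion["decision_tree"]
print(missing_v2_fields(criterion))  # {'decision_tree'}
```

A fuller version would also verify that `scoring_guide` carries exactly the keys "0" through "4".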

**Check 3 — ID uniqueness**
Collect all rubric IDs, dimension IDs, and criterion IDs. No duplicates allowed across any files. Criterion ID conflicts across profiles cause ambiguity in evaluation records.
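Once the IDs are collected into one list, duplicate detection is a counting exercise. A minimal sketch:

```python
from collections import Counter

def duplicate_ids(collected_ids: list[str]) -> list[str]:
    """IDs seen more than once across rubric, dimension, and criterion namespaces."""
    return sorted(i for i, count in Counter(collected_ids).items() if count > 1)

ids = ["EAROS-CORE", "STK-01", "SCP-01", "STK-01"]  # STK-01 defined twice
print(duplicate_ids(ids))  # ['STK-01']
```

Reporting which files contributed each duplicate requires tracking `(id, file)` pairs instead of bare IDs, but the detection logic is the same.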

**Check 4 — Cross-reference validation**
Profile `inherits` references must resolve to real rubric IDs. Gate configurations must have valid `severity` values and non-empty `failure_effect`. Dimension weights outside 0.5–2.0 should be flagged.

**Check 5 — Evaluation record schema check**
For each evaluation record in `examples/`: required fields, valid `status` values, valid `judgment_type` and `confidence` values per criterion. Status must match the gate failures and overall score.

**Check 6 — Documentation accuracy**
Check CLAUDE.md claims ("9 dimensions", "10 criteria", profile lists) against actual YAML content. Check README.md profile and overlay lists against actual files.

**Check 7 — YAML style conventions**
Two-space indentation, quoted numeric keys in `scoring_guide`, kebab-case filenames, no version number in filename (version is tracked inside the file only).

**Check 8 — Manifest-filesystem consistency**
Read `earos.manifest.yaml`. For each entry in `core`, `profiles`, and `overlays`:
- Verify the file exists on disk at the listed path
- Verify `rubric_id` in the manifest matches `rubric_id` in the file
- Verify `title` and `artifact_type` are consistent

Then verify completeness: every `.yaml` file in `core/`, `profiles/`, and `overlays/` must appear in the manifest. Any file on disk that is absent from the manifest is an ERROR. Any manifest entry whose file is missing is an ERROR.
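The two-way completeness rule is a symmetric set difference over relative paths. A minimal sketch; in practice `manifest_paths` would come from parsing `earos.manifest.yaml` and `disk_paths` from globbing `*.yaml` in each of the three directories (the `profiles/renamed.yaml` entry below is a made-up example of a stale manifest entry):

```python
def manifest_errors(manifest_paths: set[str], disk_paths: set[str]) -> list[str]:
    """Report both directions: listed-but-missing and on-disk-but-unlisted."""
    errors = [f"MANIFEST ENTRY MISSING ON DISK: {p}"
              for p in sorted(manifest_paths - disk_paths)]
    errors += [f"FILE NOT IN MANIFEST: {p}"
               for p in sorted(disk_paths - manifest_paths)]
    return errors

manifest_paths = {"core/core-meta-rubric.yaml", "profiles/adr.yaml", "profiles/renamed.yaml"}
disk_paths = {"core/core-meta-rubric.yaml", "profiles/adr.yaml", "overlays/security.yaml"}
for error in manifest_errors(manifest_paths, disk_paths):
    print(error)
```

The per-entry `rubric_id`, `title`, and `artifact_type` cross-checks still require opening each listed file; this sketch covers only the existence rules.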

> Read `references/validation-checks.md` for the complete check procedures with exact field paths and error message formats. Read it before running any checks — it contains the precision needed to produce actionable error messages.

## Severity Classification

| Severity | Meaning |
|----------|---------|
| **ERROR** | Missing required field, schema violation, duplicate ID, gate-status contradiction |
| **WARNING** | Style issue, extreme dimension weight, advisory-level inconsistency |

Errors must be fixed before the repository can be used in production. Warnings should be reviewed.

## Output Format

```markdown
# EAROS Repository Validation Report
Date: [today]
Files checked: [N rubric files] + [N evaluation records]

## Summary
| Check | Errors | Warnings |
|-------|--------|---------|
| Schema conformance | [N] | [N] |
| Criterion v2 completeness | [N] | [N] |
| ID uniqueness | [N] | [N] |
| Cross-references | [N] | [N] |
| Evaluation records | [N] | [N] |
| Documentation accuracy | [N] | [N] |
| YAML style | [N] | [N] |
| Manifest consistency | [N] | [N] |
| TOTAL | [N] | [N] |

Overall health: [Clean / Warnings only / Errors found]

## Errors (must fix)
[FILE] [DESCRIPTION] — each with exact field path and criterion ID where applicable

## Warnings (should review)
[FILE] [DESCRIPTION]

## Recommended Actions
[Numbered list, prioritised by severity]
```

> For common issues and how to fix them, read `references/fix-patterns.md`.

## Non-Negotiable Rules

1. **Report, don't auto-fix.** Flag problems; do not silently correct them. The user reviews and approves all changes.
2. **Be precise.** "profiles/reference-architecture.yaml CRITERION RA-VIEW-01 MISSING: decision_tree" is useful. "Some criteria have missing fields" is not.
3. **Count accurately.** Verify documentation claims against actual YAML — do not rely on memory or prior knowledge.
4. **Errors vs. warnings.** Missing required fields are errors. Style deviations are warnings. Never downgrade an error to a warning.

## When to Read References

| When | Read |
|------|------|
| Before running any checks | `references/validation-checks.md` |
| Check field paths for scoring and outputs | `references/validation-checks.md` |
| After finding errors — how to fix | `references/fix-patterns.md` |
| User asks how to fix a specific error | `references/fix-patterns.md` |
+ | User asks how to fix a specific error | `references/fix-patterns.md` |