@trohde/earos 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (135)
  1. package/README.md +156 -0
  2. package/assets/init/.agents/skills/earos-artifact-gen/SKILL.md +106 -0
  3. package/assets/init/.agents/skills/earos-artifact-gen/references/interview-guide.md +313 -0
  4. package/assets/init/.agents/skills/earos-artifact-gen/references/output-guide.md +367 -0
  5. package/assets/init/.agents/skills/earos-assess/SKILL.md +212 -0
  6. package/assets/init/.agents/skills/earos-assess/references/calibration-benchmarks.md +160 -0
  7. package/assets/init/.agents/skills/earos-assess/references/output-templates.md +311 -0
  8. package/assets/init/.agents/skills/earos-assess/references/scoring-protocol.md +281 -0
  9. package/assets/init/.agents/skills/earos-calibrate/SKILL.md +153 -0
  10. package/assets/init/.agents/skills/earos-calibrate/references/agreement-metrics.md +188 -0
  11. package/assets/init/.agents/skills/earos-calibrate/references/calibration-protocol.md +263 -0
  12. package/assets/init/.agents/skills/earos-create/SKILL.md +257 -0
  13. package/assets/init/.agents/skills/earos-create/references/criterion-writing-guide.md +268 -0
  14. package/assets/init/.agents/skills/earos-create/references/dependency-rules.md +193 -0
  15. package/assets/init/.agents/skills/earos-create/references/rubric-interview-guide.md +123 -0
  16. package/assets/init/.agents/skills/earos-create/references/validation-checklist.md +238 -0
  17. package/assets/init/.agents/skills/earos-profile-author/SKILL.md +251 -0
  18. package/assets/init/.agents/skills/earos-profile-author/references/criterion-writing-guide.md +280 -0
  19. package/assets/init/.agents/skills/earos-profile-author/references/design-methods.md +158 -0
  20. package/assets/init/.agents/skills/earos-profile-author/references/profile-checklist.md +173 -0
  21. package/assets/init/.agents/skills/earos-remediate/SKILL.md +118 -0
  22. package/assets/init/.agents/skills/earos-remediate/references/output-template.md +199 -0
  23. package/assets/init/.agents/skills/earos-remediate/references/remediation-patterns.md +330 -0
  24. package/assets/init/.agents/skills/earos-report/SKILL.md +85 -0
  25. package/assets/init/.agents/skills/earos-report/references/portfolio-template.md +181 -0
  26. package/assets/init/.agents/skills/earos-report/references/single-artifact-template.md +168 -0
  27. package/assets/init/.agents/skills/earos-review/SKILL.md +130 -0
  28. package/assets/init/.agents/skills/earos-review/references/challenge-patterns.md +163 -0
  29. package/assets/init/.agents/skills/earos-review/references/output-template.md +180 -0
  30. package/assets/init/.agents/skills/earos-template-fill/SKILL.md +177 -0
  31. package/assets/init/.agents/skills/earos-template-fill/references/evidence-writing-guide.md +186 -0
  32. package/assets/init/.agents/skills/earos-template-fill/references/section-rubric-mapping.md +200 -0
  33. package/assets/init/.agents/skills/earos-validate/SKILL.md +113 -0
  34. package/assets/init/.agents/skills/earos-validate/references/fix-patterns.md +281 -0
  35. package/assets/init/.agents/skills/earos-validate/references/validation-checks.md +287 -0
  36. package/assets/init/.claude/CLAUDE.md +4 -0
  37. package/assets/init/AGENTS.md +293 -0
  38. package/assets/init/CLAUDE.md +635 -0
  39. package/assets/init/README.md +507 -0
  40. package/assets/init/calibration/gold-set/.gitkeep +0 -0
  41. package/assets/init/calibration/results/.gitkeep +0 -0
  42. package/assets/init/core/core-meta-rubric.yaml +643 -0
  43. package/assets/init/docs/consistency-report.md +325 -0
  44. package/assets/init/docs/getting-started.md +194 -0
  45. package/assets/init/docs/profile-authoring-guide.md +51 -0
  46. package/assets/init/docs/terminology.md +126 -0
  47. package/assets/init/earos.manifest.yaml +104 -0
  48. package/assets/init/evaluations/.gitkeep +0 -0
  49. package/assets/init/examples/aws-event-driven-order-processing/artifact.yaml +2056 -0
  50. package/assets/init/examples/aws-event-driven-order-processing/evaluation.yaml +973 -0
  51. package/assets/init/examples/aws-event-driven-order-processing/report.md +244 -0
  52. package/assets/init/examples/example-solution-architecture.evaluation.yaml +136 -0
  53. package/assets/init/examples/multi-cloud-data-analytics/artifact.yaml +715 -0
  54. package/assets/init/overlays/data-governance.yaml +94 -0
  55. package/assets/init/overlays/regulatory.yaml +154 -0
  56. package/assets/init/overlays/security.yaml +92 -0
  57. package/assets/init/profiles/adr.yaml +225 -0
  58. package/assets/init/profiles/capability-map.yaml +223 -0
  59. package/assets/init/profiles/reference-architecture.yaml +426 -0
  60. package/assets/init/profiles/roadmap.yaml +205 -0
  61. package/assets/init/profiles/solution-architecture.yaml +227 -0
  62. package/assets/init/research/architecture-assessment-rubrics-research.docx +0 -0
  63. package/assets/init/research/architecture-assessment-rubrics-research.md +566 -0
  64. package/assets/init/research/reference-architecture-research.md +751 -0
  65. package/assets/init/standard/EAROS.md +1426 -0
  66. package/assets/init/standard/schemas/artifact.schema.json +1295 -0
  67. package/assets/init/standard/schemas/artifact.uischema.json +65 -0
  68. package/assets/init/standard/schemas/evaluation.schema.json +284 -0
  69. package/assets/init/standard/schemas/rubric.schema.json +383 -0
  70. package/assets/init/templates/evaluation-record.template.yaml +58 -0
  71. package/assets/init/templates/new-profile.template.yaml +65 -0
  72. package/bin.js +188 -0
  73. package/dist/assets/_basePickBy-BVu6YmSW.js +1 -0
  74. package/dist/assets/_baseUniq-CWRzQDz_.js +1 -0
  75. package/dist/assets/arc-CyDBhtDM.js +1 -0
  76. package/dist/assets/architectureDiagram-2XIMDMQ5-BH6O4dvN.js +36 -0
  77. package/dist/assets/blockDiagram-WCTKOSBZ-2xmwdjpg.js +132 -0
  78. package/dist/assets/c4Diagram-IC4MRINW-BNmPRFJF.js +10 -0
  79. package/dist/assets/channel-CiySTNoJ.js +1 -0
  80. package/dist/assets/chunk-4BX2VUAB-DGQTvirp.js +1 -0
  81. package/dist/assets/chunk-55IACEB6-DNMAQAC_.js +1 -0
  82. package/dist/assets/chunk-FMBD7UC4-BJbVTQ5o.js +15 -0
  83. package/dist/assets/chunk-JSJVCQXG-BCxUL74A.js +1 -0
  84. package/dist/assets/chunk-KX2RTZJC-H7wWZOfz.js +1 -0
  85. package/dist/assets/chunk-NQ4KR5QH-BK4RlTQF.js +220 -0
  86. package/dist/assets/chunk-QZHKN3VN-0chxDV5g.js +1 -0
  87. package/dist/assets/chunk-WL4C6EOR-DexfQ-AV.js +189 -0
  88. package/dist/assets/classDiagram-VBA2DB6C-D7luWJQn.js +1 -0
  89. package/dist/assets/classDiagram-v2-RAHNMMFH-D7luWJQn.js +1 -0
  90. package/dist/assets/clone-ylgRbd3D.js +1 -0
  91. package/dist/assets/cose-bilkent-S5V4N54A-DS2IOCfZ.js +1 -0
  92. package/dist/assets/cytoscape.esm-CyJtwmzi.js +331 -0
  93. package/dist/assets/dagre-KLK3FWXG-BbSoTTa3.js +4 -0
  94. package/dist/assets/defaultLocale-DX6XiGOO.js +1 -0
  95. package/dist/assets/diagram-E7M64L7V-C9TvYgv0.js +24 -0
  96. package/dist/assets/diagram-IFDJBPK2-DowUMWrg.js +43 -0
  97. package/dist/assets/diagram-P4PSJMXO-BL6nrnQF.js +24 -0
  98. package/dist/assets/erDiagram-INFDFZHY-rXPRl8VM.js +70 -0
  99. package/dist/assets/flowDiagram-PKNHOUZH-DBRM99-W.js +162 -0
  100. package/dist/assets/ganttDiagram-A5KZAMGK-INcWFsBT.js +292 -0
  101. package/dist/assets/gitGraphDiagram-K3NZZRJ6-DMwpfE91.js +65 -0
  102. package/dist/assets/graph-DLQn37b-.js +1 -0
  103. package/dist/assets/index-BFFITMT8.js +650 -0
  104. package/dist/assets/index-H7f6VTz1.css +1 -0
  105. package/dist/assets/infoDiagram-LFFYTUFH-B0f4TWRM.js +2 -0
  106. package/dist/assets/init-Gi6I4Gst.js +1 -0
  107. package/dist/assets/ishikawaDiagram-PHBUUO56-CsU6XimZ.js +70 -0
  108. package/dist/assets/journeyDiagram-4ABVD52K-CQ7ibNib.js +139 -0
  109. package/dist/assets/kanban-definition-K7BYSVSG-DzEN7THt.js +89 -0
  110. package/dist/assets/katex-B1X10hvy.js +261 -0
  111. package/dist/assets/layout-C0dvb42R.js +1 -0
  112. package/dist/assets/linear-j4a8mGj7.js +1 -0
  113. package/dist/assets/mindmap-definition-YRQLILUH-DP8iEuCf.js +68 -0
  114. package/dist/assets/ordinal-Cboi1Yqb.js +1 -0
  115. package/dist/assets/pieDiagram-SKSYHLDU-BpIAXgAm.js +30 -0
  116. package/dist/assets/quadrantDiagram-337W2JSQ-DrpXn5Eg.js +7 -0
  117. package/dist/assets/requirementDiagram-Z7DCOOCP-Bg7EwHlG.js +73 -0
  118. package/dist/assets/sankeyDiagram-WA2Y5GQK-BWagRs1F.js +10 -0
  119. package/dist/assets/sequenceDiagram-2WXFIKYE-q5jwhivG.js +145 -0
  120. package/dist/assets/stateDiagram-RAJIS63D-B_J9pE-2.js +1 -0
  121. package/dist/assets/stateDiagram-v2-FVOUBMTO-Q_1GcybB.js +1 -0
  122. package/dist/assets/timeline-definition-YZTLITO2-dv0jgQ0z.js +61 -0
  123. package/dist/assets/treemap-KZPCXAKY-Dt1dkIE7.js +162 -0
  124. package/dist/assets/vennDiagram-LZ73GAT5-BdO5RgRZ.js +34 -0
  125. package/dist/assets/xychartDiagram-JWTSCODW-CpDVe-8v.js +7 -0
  126. package/dist/index.html +23 -0
  127. package/export-docx.js +1583 -0
  128. package/init.js +353 -0
  129. package/manifest-cli.mjs +207 -0
  130. package/package.json +83 -0
  131. package/schemas/artifact.schema.json +1295 -0
  132. package/schemas/artifact.uischema.json +65 -0
  133. package/schemas/evaluation.schema.json +284 -0
  134. package/schemas/rubric.schema.json +383 -0
  135. package/serve.js +238 -0
## package/assets/init/.agents/skills/earos-artifact-gen/references/output-guide.md (new file, +367)
# Output Guide — EAROS Artifact Generator

How to transform interview answers into a structured artifact YAML that conforms to `standard/schemas/artifact.schema.json` and satisfies EAROS rubric evidence requirements.

---

## Core Principle: Schema is the Contract

The `artifact.schema.json` is derived from the `required_evidence` fields of the EAROS rubrics. Every required field in the schema maps directly to evidence that a rubric criterion needs. Filling the schema correctly means satisfying the evidence requirements. Do not add sections not in the schema; do not omit required sections.

Before generating output:
1. Read `standard/schemas/artifact.schema.json` to confirm the required fields for the artifact type.
2. Verify your interview answers cover each required section.
3. Flag any gaps as `[TBD: <description>]` rather than omitting them.

---
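The three-step pre-generation check can be sketched mechanically. A minimal sketch, assuming the schema's required sections appear under a top-level `required` key and that interview answers are collected into a dict (both assumptions — the real schema layout may differ):

```python
import json

# Toy stand-in for artifact.schema.json (illustrative only — the real
# schema is far larger and nests per artifact_type).
SCHEMA = json.loads('{"required": ["metadata", "business_context", "scope", "key_decisions"]}')

def find_gaps(schema, answers):
    """Return a [TBD: ...] placeholder for every required section that
    the interview answers do not cover, instead of omitting it."""
    return {
        field: f"[TBD: no interview answer covers '{field}']"
        for field in schema.get("required", [])
        if not answers.get(field)
    }

answers = {
    "metadata": {"title": "Order Processing"},
    "business_context": {"problem_statement": "Orders are processed manually."},
}
gaps = find_gaps(SCHEMA, answers)  # flags "scope" and "key_decisions"
```

The point is the convention, not the code: a gap becomes a visible `[TBD: ...]` string in the YAML, never a silently missing key.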

## YAML Structure Overview

Every artifact YAML begins with a metadata block, then sections that map to the schema:

```yaml
kind: artifact
artifact_type: <type>       # solution_architecture | reference_architecture | adr | capability_map | roadmap
schema_version: "1.0.0"
rubric_id: <id>             # e.g., EAROS-CORE-002 + EAROS-SA-001

metadata:
  title: ""
  version: ""
  status: draft             # draft | candidate_for_review | approved | deprecated
  owner: ""
  authors: []
  reviewers: []
  created_date: ""
  last_updated: ""
  review_date: ""
  supersedes: ""            # Previous version or document, if any
  audience: []              # List of intended reader groups

# Sections below vary by artifact_type — see per-type guidance
```

---

## Section-by-Section Transformation Rules

### business_context

**From interview:** Business drivers, goals, constraints from Block 2.

**YAML structure:**
```yaml
business_context:
  problem_statement: >
    [One paragraph. What problem this architecture solves. What would happen without it.]
  business_drivers:
    - id: BD-01
      driver: ""            # Name of the driver
      description: ""       # Why it matters
      priority: high        # high | medium | low
  goals:
    - id: G-01
      goal: ""
      measurable_outcome: "" # How success will be measured
  constraints:
    - type: regulatory      # regulatory | technical | commercial | timeline | resource
      constraint: ""
      impact: ""
```

**Quality check:** Every architectural decision in `key_decisions` should reference at least one business driver by ID. If it can't, the decision may lack justification.

---

### stakeholders

**From interview:** Stakeholder identification from Block 1 and Block 2.

**YAML structure:**
```yaml
stakeholders:
  - role: ""                # e.g., "Architecture Review Board"
    concerns:
      - ""                  # Primary concern
    questions:
      - ""                  # Key question they want answered
    addressed_in: []        # Section IDs or names where concerns are addressed
```

**Scoring rule (STK-01):** Score 3 requires: role, at least one concern per stakeholder, and a cross-reference to where in the document their concern is addressed. A list of names with no concerns scores 1.

---

### scope

**From interview:** Scope and boundary answers from Block 1.

**YAML structure:**
```yaml
scope:
  in_scope:
    - ""                    # List items clearly — systems, functions, data domains
  out_of_scope:
    - item: ""
      rationale: ""         # Why excluded — do not omit the rationale
  deferred:
    - item: ""
      target_phase: ""      # When this will be addressed
  assumptions:
    - id: ASM-01
      assumption: ""
      owner: ""             # Who is responsible for validating this
      validation_status: open # open | validated | invalidated
```

**Scoring rule (SCP-01):** Score 3 requires an explicit out-of-scope list with rationale. Score 4 requires assumptions listed and validated. Scope stated only as goals (not as boundaries) scores 1.

---

### architecture_views

**From interview:** Views and components from Block 3.

**YAML structure:**
```yaml
architecture_views:
  context_view:
    description: >
      [Prose describing the system boundary and external actors.]
    external_actors:
      - name: ""
        type: system        # system | user | external_service
        interaction: ""     # What it sends/receives
    diagram_ref: ""         # Path or URL to diagram file, if available

  functional_view:
    description: >
      [Prose overview of the functional decomposition.]
    components:
      - id: COMP-01
        name: ""
        responsibility: ""  # One sentence
        technology: ""      # Optional — framework, language, service
        interfaces:
          - direction: inbound # inbound | outbound
            protocol: ""    # REST, gRPC, event, batch, etc.
            consumer_provider: ""
    diagram_ref: ""

  deployment_view:
    description: >
      [Prose describing where and how the system runs.]
    infrastructure:
      - name: ""
        type: ""            # cloud, on-prem, hybrid
        provider: ""        # AWS, Azure, GCP, etc.
        region: ""
    topology_notes: >
      [Network layout, segmentation, zones, redundancy.]
    diagram_ref: ""

  data_flow:
    primary_flow:
      description: >
        [Prose walkthrough of the primary success scenario.]
      steps:
        - step: 1
          from: ""
          to: ""
          action: ""
          data: ""          # What is being passed
    secondary_flows: []     # Additional flows (async, batch, error)
    sensitive_data:
      - data_type: ""
        classification: ""  # e.g., PII, PCI, internal
        enters_at: ""       # Component or interface
        exits_at: ""        # Where it leaves or is discarded
```

**Scoring rule (RA-VIEW-01):** Score 2 requires 2–3 views. Score 3 requires all four views with adequate prose. Score 4 requires cross-references between views and a security data flow.

---

### key_decisions

**From interview:** Design decisions from Block 4.

**YAML structure:**
```yaml
key_decisions:
  - id: DEC-01
    title: ""               # Short name for the decision
    driver: BD-01           # Reference to a business_driver ID
    status: accepted        # accepted | proposed | superseded
    context: >
      [Why this decision was needed — what requirement or constraint forced a choice.]
    options_considered:
      - option: ""
        description: ""
        rejection_reason: ""
    selected_option: ""
    rationale: >
      [Why the selected option was chosen. Reference the driver and constraints.]
    tradeoffs_accepted: >
      [What was given up — be honest.]
    adr_ref: ""             # Path to full ADR if one exists
```

**Scoring rule (RAT-01):** Score 1 if decisions are stated without alternatives. Score 3 if each decision has at least two alternatives with rejection rationale. Score 4 if tradeoffs are quantified and linked to business drivers.

**Common mistake:** "We chose X" with no context. Always ask: What drove the choice? What was rejected? What was the cost?

---

### quality_attributes

**From interview:** NFRs and measurability from Block 5.

**YAML structure:**
```yaml
quality_attributes:
  - id: QA-01
    attribute: availability # availability | performance | scalability | security | maintainability | etc.
    target: ""              # MUST be measurable: "99.9% monthly uptime"
    measurement: ""         # How it is measured: "Datadog uptime monitoring"
    architectural_response: "" # What in the design delivers this target
    realised_by: []         # Component IDs that deliver this QA
    validation:
      method: ""            # load_test | dr_drill | security_assessment | none
      status: ""            # validated | planned | not_yet_tested
      evidence_ref: ""      # Link or reference to validation results
```

**Scoring rule (RA-QA-01):** "High availability" scores 0–1. "99.9% monthly uptime (< 8.7h/year)" scores 2–3. "99.9% uptime, measured by Datadog, delivered by active-active across 2 AZs, validated in DR drill 2025-11-15" scores 4.

---

### risks

**From interview:** Risk identification from Block 2 and operational concerns from Block 6.

**YAML structure:**
```yaml
risks:
  - id: RSK-01
    risk: ""                # Clear risk statement: what could go wrong
    likelihood: medium      # high | medium | low
    impact: high            # high | medium | low
    mitigation: ""          # Specific control or mechanism in place
    owner: ""
    status: open            # open | mitigated | accepted | transferred
    assumptions_at_risk:
      - assumption_id: ASM-01
        risk_if_false: ""
```

---

### compliance

**From interview:** Compliance and governance from Block 7.

**YAML structure:**
```yaml
compliance:
  applicable_standards:
    - standard: ""          # e.g., PCI DSS v4.0, GDPR, ISO 27001
      scope: ""             # How/why this standard applies
  applicable_requirements:
    - requirement: ""       # Specific clause or requirement
      satisfied_by: ""      # Component or mechanism that satisfies it
      evidence_ref: ""      # Link to evidence (DPIA, pen test, etc.)
  compliance_gaps:
    - requirement: ""
      gap: ""
      remediation_plan: ""
      owner: ""
      target_date: ""
  reviews_conducted:
    - type: ""              # DPIA | security_assessment | legal_review | etc.
      date: ""
      outcome: ""
      ref: ""
```

**Gate behaviour:** If the security or regulatory overlay is applied, compliance criteria often have critical gates. A compliance section that lists standards without evidenced controls will fail the gate.

---

### implementation_guidance

**From interview:** Next steps and phasing from Block 8.

**YAML structure:**
```yaml
implementation_guidance:
  phases:
    - id: PHASE-1
      name: ""
      deliverable: ""
      dependencies: []      # IDs of other phases or external items
      target_date: ""
  interface_contracts:
    - id: IFC-01
      interface: ""         # e.g., "POST /payments"
      consumer: ""
      provider: ""
      protocol: ""
      spec_ref: ""          # Link to OpenAPI spec, event schema, etc.
      sla: ""
  open_decisions:
    - id: OD-01
      decision: ""
      owner: ""
      target_date: ""
      options: []
  next_steps:
    - action: ""
      owner: ""
      due: ""
```

---

## Transformation Quality Checklist

Before finalizing the YAML, verify:

- [ ] Every `key_decision` references a `business_driver` ID
- [ ] Every `quality_attribute` has a measurable `target` (not just an aspiration)
- [ ] Every `stakeholder` has at least one `concern` and an `addressed_in` reference
- [ ] `scope.out_of_scope` has at least one entry with `rationale`
- [ ] `compliance` section exists if the system handles personal data, financial data, or operates in a regulated domain
- [ ] `deployment_view` includes topology notes with network segmentation information
- [ ] `data_flow.sensitive_data` is populated if the system processes classified data
- [ ] All `[TBD: ...]` placeholders are visible (not silently omitted)
- [ ] No field contains only aspirational text ("high performance", "highly available", "secure by design") without a measurable qualifier

---
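Two of the checklist items are mechanical enough to script. A sketch (hypothetical helper, not part of the package) that verifies decision→driver references and flags non-measurable quality-attribute targets by looking for a digit:

```python
import re

def check_artifact(artifact):
    """Mechanically verify two checklist items: every key_decision cites a
    known business_driver ID, and every quality_attribute target contains
    at least one number (a crude proxy for 'measurable')."""
    problems = []
    drivers = {d["id"] for d in artifact.get("business_context", {}).get("business_drivers", [])}
    for dec in artifact.get("key_decisions", []):
        if dec.get("driver") not in drivers:
            problems.append(f"{dec['id']}: driver '{dec.get('driver')}' is not a known business_driver ID")
    for qa in artifact.get("quality_attributes", []):
        if not re.search(r"\d", qa.get("target") or ""):
            problems.append(f"{qa['id']}: target '{qa.get('target')}' has no measurable qualifier")
    return problems

artifact = {
    "business_context": {"business_drivers": [{"id": "BD-01"}]},
    "key_decisions": [
        {"id": "DEC-01", "driver": "BD-01"},
        {"id": "DEC-02", "driver": "BD-99"},   # dangling reference
    ],
    "quality_attributes": [
        {"id": "QA-01", "target": "99.9% monthly uptime"},
        {"id": "QA-02", "target": "high availability"},  # aspirational only
    ],
}
problems = check_artifact(artifact)  # flags DEC-02 and QA-02
```

The digit heuristic is deliberately crude; the rubric's own scoring guides remain the authority on what counts as measurable.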

## TBD Conventions

Use `[TBD: <context>]` consistently. Include enough context for a reviewer to understand what is needed:

```yaml
# Good TBD
target: "[TBD: SLA target not yet agreed — owner: Product, target date: 2026-04-01]"

# Poor TBD
target: "[TBD]"
```

Never use `null` or omit a required field silently. TBD is preferable to omission because it signals a known gap rather than a missing answer.

---

## Post-Generation Validation

After generating the YAML:

1. **Schema check** — confirm all required fields from `artifact.schema.json` are present.
2. **Evidence check** — for each rubric criterion's `required_evidence` list, confirm the YAML contains the evidence item.
3. **Gate check** — identify any gate criteria whose required evidence is absent or TBD; these are likely to fail in review.
4. **Offer earos-assess** — "Would you like me to run a preliminary `earos-assess` on this artifact before you finalize it?"

A complete artifact YAML that passes the schema check and covers all `required_evidence` items should score at minimum 2 on every criterion — no surprises in review.
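The evidence check (item 2) can be approximated in the same spirit as the schema check. A sketch assuming `required_evidence` entries name artifact sections directly — the real rubric files may use a richer mapping:

```python
def evidence_check(rubric, artifact):
    """For each criterion, list the required_evidence items that have no
    matching section in the artifact (loaded as a dict from YAML)."""
    missing = {}
    for criterion in rubric.get("criteria", []):
        absent = [item for item in criterion.get("required_evidence", [])
                  if item not in artifact]
        if absent:
            missing[criterion["id"]] = absent
    return missing

# Toy rubric fragment (the real rubric YAML has many more criterion fields)
rubric = {"criteria": [
    {"id": "SCP-01", "required_evidence": ["scope"]},
    {"id": "RAT-01", "required_evidence": ["key_decisions", "business_context"]},
]}
artifact = {"scope": {}, "business_context": {}}
missing = evidence_check(rubric, artifact)  # {"RAT-01": ["key_decisions"]}
```

Anything reported here is exactly what Step 3's gate check should flag before review.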
## package/assets/init/.agents/skills/earos-assess/SKILL.md (new file, +212)
---
name: earos-assess
description: "Run a full EAROS evaluation on an architecture artifact. Triggers when the user wants to assess, evaluate, score, or review an architecture document using the EAROS framework. Also triggers for \"score this architecture\", \"evaluate this ADR\", \"run EAROS on this\", \"assess this capability map\", \"review this solution design\", \"is this architecture any good\", \"quality check this design\", \"grade this document\", \"what score would this get\", or any request to evaluate, rate, or assess the quality of an architecture artifact."
---

# EAROS Assessment Skill

You are running a governed architecture quality evaluation. The output must be **auditable** — every score needs a cited evidence anchor from the artifact, not an impression. The most common failure mode in architecture assessment (human and AI alike) is scoring from vibes rather than evidence. This skill prevents that.

**Before anything else:** Read `references/scoring-protocol.md` to understand the RULERS evidence-anchoring protocol. Do this before Step 2.

---

## Step 0 — Load the Rubric Files

Read these files before scoring anything. The rubric files contain the `scoring_guide` and `decision_tree` fields that define what each score level means. Do not score from memory — read the rubric.

**Start with the manifest.** Read `earos.manifest.yaml` (at the repo root) first — it is the authoritative registry of all available profiles and overlays. Use the `path` values listed in the manifest to find the files. Do not hardcode paths.

**Always load:**
- `core/core-meta-rubric.yaml` — 9 dimensions, 10 criteria, applies to every artifact

**Load the matching profile (if one exists):**
Check `earos.manifest.yaml` → `profiles` section for available profiles and their `artifact_type`. Select the entry whose `artifact_type` matches the artifact being assessed. If no match exists, use core only.

Common matches:
- Solution architecture → `profiles/solution-architecture.yaml`
- Reference architecture → `profiles/reference-architecture.yaml`
- Architecture Decision Record → `profiles/adr.yaml`
- Capability map → `profiles/capability-map.yaml`
- Roadmap → `profiles/roadmap.yaml`

**Ask the user which overlays apply (if not specified):**
Check `earos.manifest.yaml` → `overlays` section for all available overlays.
- `overlays/security.yaml` — apply when the artifact touches auth, authorization, personal data, or external integrations
- `overlays/data-governance.yaml` — apply when the artifact describes data flows, retention, or classification
- `overlays/regulatory.yaml` — apply when the artifact is in a regulated domain (payments, healthcare, financial reporting)

---

## The 8-Step Evaluation DAG

The evaluation follows a directed acyclic graph. Steps must run in order — you cannot aggregate before scoring, cannot determine status before checking gates.

```
structural_validation → content_extraction → criterion_scoring
  → cross_reference_validation → dimension_aggregation
  → challenge_pass → calibration → status_determination
```

---

### Step 1 — Structural Validation

Binary gate: is the artifact reviewable at all?

Check whether these five elements are present:
- Title and version identifier
- Named owner or author
- Purpose or scope statement
- Diagrams or structural representations
- Stakeholder or audience section

**If 3 or more are absent:** stop. Flag **Not Reviewable**. Explain exactly which elements are missing and what must be added before assessment can proceed. Do not assign criterion scores for an un-reviewable artifact.

---
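The gate above reduces to a count. A minimal sketch, using made-up element identifiers for illustration:

```python
REQUIRED_ELEMENTS = [
    "title_and_version", "named_owner", "purpose_or_scope",
    "diagrams", "stakeholder_section",
]

def structural_validation(present):
    """Step 1 as a binary gate: `present` is the set of elements found
    in the artifact. Three or more missing elements → Not Reviewable."""
    missing = [e for e in REQUIRED_ELEMENTS if e not in present]
    return {"reviewable": len(missing) < 3, "missing": missing}

# An artifact with only a title/version and a scope statement:
result = structural_validation({"title_and_version", "purpose_or_scope"})
# 3 of 5 elements missing → not reviewable
```

When `reviewable` is false, the `missing` list is exactly what the "explain which elements are missing" instruction asks you to report.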

### Step 2 — Content Extraction (RULERS Protocol)

> **Read `references/scoring-protocol.md` before this step.** It contains the full RULERS protocol, evidence classification rules, and examples of correct vs. incorrect evidence extraction.

For every criterion in the loaded rubric files:
1. Read the criterion's `required_evidence` list
2. Search the artifact for direct quotes, references, or sections that address it
3. Record an `evidence_anchor` (section heading, page, diagram label) and an `excerpt` (direct quote or close paraphrase)
4. Classify: `observed` (directly stated) / `inferred` (reasonable interpretation) / `external` (judgment based on a standard outside the artifact)

If you cannot find evidence → record `evidence_class: none`. The absence of evidence is data. Never score from impression.

---

### Step 3 — Criterion Scoring

For each criterion:
- Use the `scoring_guide` level descriptors from the rubric YAML — these are the authoritative definitions of each score level
- Use the `decision_tree` field to resolve ambiguous cases (it translates the scoring guide into observable conditions)
- Score 0–4. Use `N/A` only when the criterion genuinely cannot apply, with a written justification
- Report `confidence` (high / medium / low) separately — it does **not** change the numerical score
- If you find contradicting evidence after assigning a score, revise the score down

Minimum output per criterion:
```
criterion_id: [ID]
score: [0-4 or N/A]
evidence_class: [observed/inferred/external/none]
confidence: [high/medium/low]
evidence_anchor: "[section or location in artifact]"
excerpt: "[direct quote or close paraphrase]"
rationale: "[1-3 sentences citing the evidence]"
```

> **If scoring feels ambiguous**, see `references/scoring-protocol.md` for worked examples of good and bad scoring, and how to use `decision_tree` fields.

---

### Step 4 — Cross-Reference Validation

Check for internal consistency issues that affect scores:
- Do component names match across all diagrams and sections?
- Do interface definitions agree between API specs and sequence diagrams?
- Is the scope boundary consistent across all views?
- Do narrative claims match the diagrams?

Inconsistencies reduce scores on `CON-01` (internal consistency) in the core rubric. Note specific mismatches as evidence.

> **For cross-reference patterns**, see `references/scoring-protocol.md#cross-reference-validation`.

---

### Step 5 — Dimension Aggregation

For each dimension:
1. Average criterion scores — exclude N/A criteria from the denominator (they don't count for or against)
2. Apply the dimension weight from the rubric YAML
3. Report the weighted dimension score

A dimension score of 0.0 is not neutralized by a dimension score of 4.0. Report each dimension separately.

---
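The aggregation rule can be stated precisely in a few lines. A sketch, representing N/A as `None` (an assumption — the evaluation record itself uses the literal `N/A`):

```python
def dimension_score(criterion_scores, weight):
    """Average 0-4 criterion scores, excluding N/A (None) from the
    denominator, then apply the dimension weight from the rubric."""
    scored = [s for s in criterion_scores if s is not None]
    if not scored:
        return None  # every criterion was N/A — report the dimension as N/A
    return weight * (sum(scored) / len(scored))

# Example: three scored criteria and one N/A, dimension weight 0.15
score = dimension_score([3, 2, 4, None], weight=0.15)  # 0.15 * 3.0 = 0.45
```

Note that the N/A criterion neither raises nor lowers the average — exactly the "don't count for or against" rule above.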

### Step 6 — Challenge Pass

Before finalizing, challenge your three highest and three lowest scores:
- "What interpretation of the evidence would justify a higher score?"
- "What interpretation would justify a lower score?"
- "Am I labelling as `observed` something that is actually `inferred`?"

Revise scores where the challenge reveals weak reasoning. Flag revised scores with `revised: true`.

The purpose of this step is to catch over-scoring (the most common agent failure) and under-scoring (harsh treatment of well-evidenced but incomplete artifacts).

> **For detailed challenge methodology**, see `references/scoring-protocol.md#challenge-pass`.

---

### Step 7 — Calibration Check

> **Read `references/calibration-benchmarks.md` before this step** to sanity-check your score distribution.

Quick self-checks:
- An overall score > 3.5 should be genuinely exceptional — evidence-rich, decision-ready. If you're scoring above 3.5, confirm it's warranted.
- An overall score < 2.0 indicates a seriously deficient, near-unusable artifact. Confirm this is warranted before finalizing.
- Flag any criterion where `confidence: low` — these warrant independent human review.

---

### Step 8 — Status Determination

**Gates first** — check gate criteria before computing any weighted average. A single critical gate failure = Reject, no matter how high the average is.

| Gate type | Effect |
|-----------|--------|
| `critical` failure | Status = `reject` regardless of average |
| `major` failure | Status cannot exceed `conditional_pass` |

Then compute the weighted overall average and apply thresholds:

| Status | Threshold |
|--------|-----------|
| **Pass** | No critical gate failure + overall ≥ 3.2 + no dimension < 2.0 |
| **Conditional Pass** | No critical gate failure + overall ≥ 2.4 and < 3.2 |
| **Rework Required** | Overall < 2.4 or repeated weak dimensions |
| **Reject** | Any critical gate failure |
| **Not Reviewable** | Evidence too incomplete to score gate criteria |

---
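The gates-then-thresholds logic can be sketched as one function. Assumptions to note: Not Reviewable (a Step 1 outcome) and the "repeated weak dimensions" clause are not modeled, and a pass-level average with a dimension below 2.0 falls to `conditional_pass` here, a case the table leaves unspecified:

```python
def determine_status(overall, dimension_scores, gate_failures):
    """Step 8 sketch: gate failures are checked before any threshold
    on the weighted overall average is applied."""
    if "critical" in gate_failures:
        return "reject"  # a single critical gate failure overrides everything
    if overall >= 3.2 and min(dimension_scores) >= 2.0:
        status = "pass"
    elif overall >= 2.4:
        status = "conditional_pass"
    else:
        status = "rework_required"
    if "major" in gate_failures and status == "pass":
        status = "conditional_pass"  # a major gate failure caps the status
    return status

strong = determine_status(3.6, [3.8, 3.1, 2.6], gate_failures=[])           # "pass"
gated  = determine_status(3.6, [3.8, 3.1, 2.6], gate_failures=["critical"]) # "reject"
```

The second call illustrates the non-negotiable rule below: gates override averages.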

## Output

Produce **two outputs**: a YAML evaluation record and a markdown report.

> **Read `references/output-templates.md`** for full templates with field-by-field explanations before producing output. Mirror the structure of `examples/example-solution-architecture.evaluation.yaml`.

**YAML evaluation record** key fields:
`evaluation_id`, `rubric_id`, `artifact_ref`, `evaluation_date`, `status`, `overall_score`, `gate_failures`, `criterion_results`, `dimension_scores`, `narrative_summary`, `recommended_actions`

**Markdown report** key sections:
Traffic-light status, overall score, dimension table, gate failures, key findings (3–5 bullets), top 5 prioritized recommended actions with criterion references.

---

## Non-Negotiable Rules

1. **Evidence first.** Every score requires a cited excerpt or reference. "The artifact seems to address this" is not evidence.
2. **Gates override averages.** One critical gate failure = Reject regardless of the overall score.
3. **Confidence ≠ score.** Low confidence lowers the weight a human reviewer places on your output. It does not lower the numerical score.
4. **N/A requires justification.** One sentence explaining why the criterion genuinely cannot apply.
5. **Do not modify the rubric** during evaluation. It is locked. Changes require a version bump.
6. **Never collapse the three evaluation types** — artifact quality, architectural fitness, and governance fit are distinct judgments. Keep them separate in the narrative.

---

## When to Read Which Reference File

| When | Read |
|------|------|
| Before Step 2 (always) | `references/scoring-protocol.md` |
| When scoring is ambiguous | `references/scoring-protocol.md` |
| Before the challenge pass (Step 6) | `references/scoring-protocol.md` — section: Challenge Pass |
| Before Step 7 (calibration check) | `references/calibration-benchmarks.md` |
| Before producing output | `references/output-templates.md` |
| Unsure what a score distribution should look like | `references/calibration-benchmarks.md` |