@trohde/earos 1.0.0

Files changed (135)
  1. package/README.md +156 -0
  2. package/assets/init/.agents/skills/earos-artifact-gen/SKILL.md +106 -0
  3. package/assets/init/.agents/skills/earos-artifact-gen/references/interview-guide.md +313 -0
  4. package/assets/init/.agents/skills/earos-artifact-gen/references/output-guide.md +367 -0
  5. package/assets/init/.agents/skills/earos-assess/SKILL.md +212 -0
  6. package/assets/init/.agents/skills/earos-assess/references/calibration-benchmarks.md +160 -0
  7. package/assets/init/.agents/skills/earos-assess/references/output-templates.md +311 -0
  8. package/assets/init/.agents/skills/earos-assess/references/scoring-protocol.md +281 -0
  9. package/assets/init/.agents/skills/earos-calibrate/SKILL.md +153 -0
  10. package/assets/init/.agents/skills/earos-calibrate/references/agreement-metrics.md +188 -0
  11. package/assets/init/.agents/skills/earos-calibrate/references/calibration-protocol.md +263 -0
  12. package/assets/init/.agents/skills/earos-create/SKILL.md +257 -0
  13. package/assets/init/.agents/skills/earos-create/references/criterion-writing-guide.md +268 -0
  14. package/assets/init/.agents/skills/earos-create/references/dependency-rules.md +193 -0
  15. package/assets/init/.agents/skills/earos-create/references/rubric-interview-guide.md +123 -0
  16. package/assets/init/.agents/skills/earos-create/references/validation-checklist.md +238 -0
  17. package/assets/init/.agents/skills/earos-profile-author/SKILL.md +251 -0
  18. package/assets/init/.agents/skills/earos-profile-author/references/criterion-writing-guide.md +280 -0
  19. package/assets/init/.agents/skills/earos-profile-author/references/design-methods.md +158 -0
  20. package/assets/init/.agents/skills/earos-profile-author/references/profile-checklist.md +173 -0
  21. package/assets/init/.agents/skills/earos-remediate/SKILL.md +118 -0
  22. package/assets/init/.agents/skills/earos-remediate/references/output-template.md +199 -0
  23. package/assets/init/.agents/skills/earos-remediate/references/remediation-patterns.md +330 -0
  24. package/assets/init/.agents/skills/earos-report/SKILL.md +85 -0
  25. package/assets/init/.agents/skills/earos-report/references/portfolio-template.md +181 -0
  26. package/assets/init/.agents/skills/earos-report/references/single-artifact-template.md +168 -0
  27. package/assets/init/.agents/skills/earos-review/SKILL.md +130 -0
  28. package/assets/init/.agents/skills/earos-review/references/challenge-patterns.md +163 -0
  29. package/assets/init/.agents/skills/earos-review/references/output-template.md +180 -0
  30. package/assets/init/.agents/skills/earos-template-fill/SKILL.md +177 -0
  31. package/assets/init/.agents/skills/earos-template-fill/references/evidence-writing-guide.md +186 -0
  32. package/assets/init/.agents/skills/earos-template-fill/references/section-rubric-mapping.md +200 -0
  33. package/assets/init/.agents/skills/earos-validate/SKILL.md +113 -0
  34. package/assets/init/.agents/skills/earos-validate/references/fix-patterns.md +281 -0
  35. package/assets/init/.agents/skills/earos-validate/references/validation-checks.md +287 -0
  36. package/assets/init/.claude/CLAUDE.md +4 -0
  37. package/assets/init/AGENTS.md +293 -0
  38. package/assets/init/CLAUDE.md +635 -0
  39. package/assets/init/README.md +507 -0
  40. package/assets/init/calibration/gold-set/.gitkeep +0 -0
  41. package/assets/init/calibration/results/.gitkeep +0 -0
  42. package/assets/init/core/core-meta-rubric.yaml +643 -0
  43. package/assets/init/docs/consistency-report.md +325 -0
  44. package/assets/init/docs/getting-started.md +194 -0
  45. package/assets/init/docs/profile-authoring-guide.md +51 -0
  46. package/assets/init/docs/terminology.md +126 -0
  47. package/assets/init/earos.manifest.yaml +104 -0
  48. package/assets/init/evaluations/.gitkeep +0 -0
  49. package/assets/init/examples/aws-event-driven-order-processing/artifact.yaml +2056 -0
  50. package/assets/init/examples/aws-event-driven-order-processing/evaluation.yaml +973 -0
  51. package/assets/init/examples/aws-event-driven-order-processing/report.md +244 -0
  52. package/assets/init/examples/example-solution-architecture.evaluation.yaml +136 -0
  53. package/assets/init/examples/multi-cloud-data-analytics/artifact.yaml +715 -0
  54. package/assets/init/overlays/data-governance.yaml +94 -0
  55. package/assets/init/overlays/regulatory.yaml +154 -0
  56. package/assets/init/overlays/security.yaml +92 -0
  57. package/assets/init/profiles/adr.yaml +225 -0
  58. package/assets/init/profiles/capability-map.yaml +223 -0
  59. package/assets/init/profiles/reference-architecture.yaml +426 -0
  60. package/assets/init/profiles/roadmap.yaml +205 -0
  61. package/assets/init/profiles/solution-architecture.yaml +227 -0
  62. package/assets/init/research/architecture-assessment-rubrics-research.docx +0 -0
  63. package/assets/init/research/architecture-assessment-rubrics-research.md +566 -0
  64. package/assets/init/research/reference-architecture-research.md +751 -0
  65. package/assets/init/standard/EAROS.md +1426 -0
  66. package/assets/init/standard/schemas/artifact.schema.json +1295 -0
  67. package/assets/init/standard/schemas/artifact.uischema.json +65 -0
  68. package/assets/init/standard/schemas/evaluation.schema.json +284 -0
  69. package/assets/init/standard/schemas/rubric.schema.json +383 -0
  70. package/assets/init/templates/evaluation-record.template.yaml +58 -0
  71. package/assets/init/templates/new-profile.template.yaml +65 -0
  72. package/bin.js +188 -0
  73. package/dist/assets/_basePickBy-BVu6YmSW.js +1 -0
  74. package/dist/assets/_baseUniq-CWRzQDz_.js +1 -0
  75. package/dist/assets/arc-CyDBhtDM.js +1 -0
  76. package/dist/assets/architectureDiagram-2XIMDMQ5-BH6O4dvN.js +36 -0
  77. package/dist/assets/blockDiagram-WCTKOSBZ-2xmwdjpg.js +132 -0
  78. package/dist/assets/c4Diagram-IC4MRINW-BNmPRFJF.js +10 -0
  79. package/dist/assets/channel-CiySTNoJ.js +1 -0
  80. package/dist/assets/chunk-4BX2VUAB-DGQTvirp.js +1 -0
  81. package/dist/assets/chunk-55IACEB6-DNMAQAC_.js +1 -0
  82. package/dist/assets/chunk-FMBD7UC4-BJbVTQ5o.js +15 -0
  83. package/dist/assets/chunk-JSJVCQXG-BCxUL74A.js +1 -0
  84. package/dist/assets/chunk-KX2RTZJC-H7wWZOfz.js +1 -0
  85. package/dist/assets/chunk-NQ4KR5QH-BK4RlTQF.js +220 -0
  86. package/dist/assets/chunk-QZHKN3VN-0chxDV5g.js +1 -0
  87. package/dist/assets/chunk-WL4C6EOR-DexfQ-AV.js +189 -0
  88. package/dist/assets/classDiagram-VBA2DB6C-D7luWJQn.js +1 -0
  89. package/dist/assets/classDiagram-v2-RAHNMMFH-D7luWJQn.js +1 -0
  90. package/dist/assets/clone-ylgRbd3D.js +1 -0
  91. package/dist/assets/cose-bilkent-S5V4N54A-DS2IOCfZ.js +1 -0
  92. package/dist/assets/cytoscape.esm-CyJtwmzi.js +331 -0
  93. package/dist/assets/dagre-KLK3FWXG-BbSoTTa3.js +4 -0
  94. package/dist/assets/defaultLocale-DX6XiGOO.js +1 -0
  95. package/dist/assets/diagram-E7M64L7V-C9TvYgv0.js +24 -0
  96. package/dist/assets/diagram-IFDJBPK2-DowUMWrg.js +43 -0
  97. package/dist/assets/diagram-P4PSJMXO-BL6nrnQF.js +24 -0
  98. package/dist/assets/erDiagram-INFDFZHY-rXPRl8VM.js +70 -0
  99. package/dist/assets/flowDiagram-PKNHOUZH-DBRM99-W.js +162 -0
  100. package/dist/assets/ganttDiagram-A5KZAMGK-INcWFsBT.js +292 -0
  101. package/dist/assets/gitGraphDiagram-K3NZZRJ6-DMwpfE91.js +65 -0
  102. package/dist/assets/graph-DLQn37b-.js +1 -0
  103. package/dist/assets/index-BFFITMT8.js +650 -0
  104. package/dist/assets/index-H7f6VTz1.css +1 -0
  105. package/dist/assets/infoDiagram-LFFYTUFH-B0f4TWRM.js +2 -0
  106. package/dist/assets/init-Gi6I4Gst.js +1 -0
  107. package/dist/assets/ishikawaDiagram-PHBUUO56-CsU6XimZ.js +70 -0
  108. package/dist/assets/journeyDiagram-4ABVD52K-CQ7ibNib.js +139 -0
  109. package/dist/assets/kanban-definition-K7BYSVSG-DzEN7THt.js +89 -0
  110. package/dist/assets/katex-B1X10hvy.js +261 -0
  111. package/dist/assets/layout-C0dvb42R.js +1 -0
  112. package/dist/assets/linear-j4a8mGj7.js +1 -0
  113. package/dist/assets/mindmap-definition-YRQLILUH-DP8iEuCf.js +68 -0
  114. package/dist/assets/ordinal-Cboi1Yqb.js +1 -0
  115. package/dist/assets/pieDiagram-SKSYHLDU-BpIAXgAm.js +30 -0
  116. package/dist/assets/quadrantDiagram-337W2JSQ-DrpXn5Eg.js +7 -0
  117. package/dist/assets/requirementDiagram-Z7DCOOCP-Bg7EwHlG.js +73 -0
  118. package/dist/assets/sankeyDiagram-WA2Y5GQK-BWagRs1F.js +10 -0
  119. package/dist/assets/sequenceDiagram-2WXFIKYE-q5jwhivG.js +145 -0
  120. package/dist/assets/stateDiagram-RAJIS63D-B_J9pE-2.js +1 -0
  121. package/dist/assets/stateDiagram-v2-FVOUBMTO-Q_1GcybB.js +1 -0
  122. package/dist/assets/timeline-definition-YZTLITO2-dv0jgQ0z.js +61 -0
  123. package/dist/assets/treemap-KZPCXAKY-Dt1dkIE7.js +162 -0
  124. package/dist/assets/vennDiagram-LZ73GAT5-BdO5RgRZ.js +34 -0
  125. package/dist/assets/xychartDiagram-JWTSCODW-CpDVe-8v.js +7 -0
  126. package/dist/index.html +23 -0
  127. package/export-docx.js +1583 -0
  128. package/init.js +353 -0
  129. package/manifest-cli.mjs +207 -0
  130. package/package.json +83 -0
  131. package/schemas/artifact.schema.json +1295 -0
  132. package/schemas/artifact.uischema.json +65 -0
  133. package/schemas/evaluation.schema.json +284 -0
  134. package/schemas/rubric.schema.json +383 -0
  135. package/serve.js +238 -0
@@ -0,0 +1,280 @@ package/assets/init/.agents/skills/earos-profile-author/references/criterion-writing-guide.md
# Criterion Writing Guide — EAROS Profile Author

This file explains how to write well-formed EAROS criteria with all 13 required v2 fields. Read this before drafting any criteria.

---

## Why Criterion Quality Determines Profile Reliability

A criterion is not just a question — it is an assessment instruction. A well-written criterion tells the evaluator exactly what to look for, how to classify what they find, and what each score level means. A poorly written criterion leaves the evaluator guessing, which leads to inconsistent scores, low inter-rater reliability, and a profile that cannot be used in governance.

The 13 required v2 fields exist because each one has been found to reduce ambiguity. Missing any of them is not just a schema violation — it is a reliability risk.

---

## The 13 Required Fields

| Field | Purpose |
|-------|---------|
| `id` | Unique identifier for cross-referencing in evaluation records |
| `question` | The scoring question — what the evaluator is asking about the artifact |
| `description` | Why this matters — the quality concern this criterion encodes |
| `metric_type` | Always `ordinal` in EAROS |
| `scale` | Always `[0, 1, 2, 3, 4, "N/A"]` |
| `gate` | Gate configuration or `false` |
| `required_evidence` | List of specific things to look for in the artifact |
| `scoring_guide` | One-sentence level descriptors for scores 0–4 |
| `anti_patterns` | Common failure modes to watch for |
| `examples.good` | What strong evidence looks like (score 3–4) |
| `examples.bad` | What absent or weak evidence looks like (score 0–1) |
| `decision_tree` | Observable conditions that resolve ambiguous scoring |
| `remediation_hints` | Specific improvements that would raise the score |

---
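
A quick completeness check can enforce this table mechanically. The sketch below is illustrative rather than official EAROS tooling; it assumes each criterion is a plain dict parsed from profile YAML, with `examples.good` and `examples.bad` nested under an `examples` key.

```python
# Minimal sketch: report which of the 13 required v2 fields are
# missing from a criterion dict. Illustrative, not EAROS tooling.

REQUIRED_FIELDS = [
    "id", "question", "description", "metric_type", "scale", "gate",
    "required_evidence", "scoring_guide", "anti_patterns",
    "decision_tree", "remediation_hints",
]  # 11 top-level fields; examples.good / examples.bad make 13

def missing_fields(criterion: dict) -> list:
    """Return the required v2 fields absent from a criterion."""
    missing = [f for f in REQUIRED_FIELDS if f not in criterion]
    examples = criterion.get("examples") or {}
    for sub in ("good", "bad"):
        if sub not in examples:
            missing.append(f"examples.{sub}")
    return missing

# The incomplete example in this guide has only id, question,
# scoring_guide, and gate, so 9 of the 13 fields come back missing.
incomplete = {"id": "PM-ROOT-01", "question": "...", "scoring_guide": {}, "gate": False}
print(len(missing_fields(incomplete)))  # 9
```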

## Field-by-Field Guidance

### `question`

The question should be:
- Specific to this artifact type (not generic)
- Answerable from artifact content (not requiring external research)
- Focused on a single quality concern (not compound)

**Good:** "Does the reference architecture include context, functional, deployment, and data flow views?"
**Bad:** "Is the architecture complete and well-documented?"

The bad example fails because "complete" and "well-documented" are two different concerns, and "complete" is vague.

---

### `scoring_guide`

Write one sentence per level that describes what the artifact contains at that level — not what the evaluator should do.

**Pattern:**
```yaml
scoring_guide:
  "0": "[Absent description — criterion entirely missing or directly contradicted]"
  "1": "[Weak description — acknowledged or implied but inadequate]"
  "2": "[Partial description — present but incomplete, inconsistent, or weakly evidenced]"
  "3": "[Good description — clearly addressed with adequate evidence and only minor gaps]"
  "4": "[Strong description — fully addressed, well evidenced, internally consistent, decision-ready]"
```

**Common mistake:** Writing what the evaluator should do, not what the artifact shows.

**Wrong:**
```yaml
"3": "Check whether the artifact covers most of the required views."
```

**Correct:**
```yaml
"3": "Three or more views present with adequate detail; data flow view exists but lacks narrative."
```

The key test: could an evaluator read the level descriptor and know immediately which score to assign without needing to interpret?

---
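
This evaluator-action mistake is mechanical enough to lint for. A rough heuristic sketch: flag any descriptor that opens with an instruction verb. The verb list is my own assumption for illustration, not part of the EAROS standard.

```python
# Rough heuristic: flag scoring_guide descriptors written as evaluator
# instructions instead of artifact-content descriptions. The verb list
# is an assumption for illustration, not from the EAROS standard.

EVALUATOR_VERBS = {"check", "verify", "evaluate", "assess", "determine", "review"}

def action_language_flags(scoring_guide: dict) -> list:
    flags = []
    for level, descriptor in scoring_guide.items():
        words = descriptor.strip().split()
        if words and words[0].lower().rstrip(",.") in EVALUATOR_VERBS:
            flags.append(f"level {level}: starts with {words[0]!r}")
    return flags

guide = {
    "3": "Check whether the artifact covers most of the required views.",
    "4": "Four or more views present, all cross-referenced.",
}
print(action_language_flags(guide))  # flags level 3 only
```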

### `decision_tree`

The decision tree translates the scoring guide into observable conditions. It is especially important when:
- The scoring guide levels are close together (2 vs. 3 is often ambiguous)
- Scoring requires counting specific features
- There are compound conditions (A AND B = score 3; A OR B = score 2)

**Pattern:** Start with the lowest score condition and work upward. Use IF/THEN structure.

**Good example:**
```yaml
decision_tree: >
  Count distinct architectural views (context, component, deployment, data flow, security):
  IF 0 views THEN score 0.
  IF 1 view only THEN score 1.
  IF 2-3 views present THEN score 2.
  IF 4+ views AND data flow narrative exists THEN score 3.
  IF 4+ views AND all cross-referenced AND security view included THEN score 4.
```

**Bad example:**
```yaml
decision_tree: >
  Evaluate the completeness of the architectural views and assign a score based on quality.
```

This is not a decision tree — it just restates the criterion question.

---
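
The good example above is precise enough to execute, which is a useful property test for any decision tree. A direct translation, assuming the view set and the two boolean conditions have already been extracted from the artifact (the function and its inputs are illustrative only):

```python
# Direct translation of the view-coverage decision tree above. The
# inputs are assumed to be pre-extracted from the artifact; this is
# an illustration, not EAROS tooling.

def score_view_coverage(views: set, has_data_flow_narrative: bool,
                        all_cross_referenced: bool) -> int:
    n = len(views)
    # Check the strongest branch first so the highest satisfied
    # condition wins.
    if n >= 4 and all_cross_referenced and "security" in views:
        return 4
    if n >= 4 and has_data_flow_narrative:
        return 3
    if n >= 2:
        # The prose tree says "2-3 views"; 4+ views without a data flow
        # narrative also lands here, a case the prose version leaves
        # unstated and that translation to code makes visible.
        return 2
    if n == 1:
        return 1
    return 0

print(score_view_coverage({"context", "deployment"}, False, False))  # 2
```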

### `required_evidence`

List the specific artifact elements an evaluator should search for. These become the RULERS evidence anchors.

**Good:**
```yaml
required_evidence:
  - context diagram (C4 Level 1 or equivalent) showing system boundaries
  - deployment diagram showing infrastructure topology
  - data flow walkthrough (numbered steps or annotated sequence diagram)
  - component diagram showing service decomposition
```

**Bad:**
```yaml
required_evidence:
  - architectural documentation
  - diagrams
```

The bad example gives the evaluator no guidance on what specifically to find.

---

### `examples.good` and `examples.bad`

These are the most important fields for calibration. Include direct quotes or realistic paraphrases from the artifact — not descriptions of what good looks like.

**Good examples:**
```yaml
examples:
  good:
    - >
      "Section 3 provides C4 Level 1 context diagram. Section 4 shows container decomposition.
      Section 5 includes AWS deployment topology with AZ distribution. Section 6 contains
      a 12-step numbered data flow for the payment processing path."
  bad:
    - "See architecture diagram on page 3."
    - >
      "Figure 1: System Overview. [Single box-and-arrow diagram with no explanatory text,
      no deployment details, no data flows.]"
```

**Why quotes matter:** During calibration, reviewers compare their scores to the examples. If the examples are descriptions rather than quotes, reviewers cannot determine whether their artifact matches.

---

### Gate Guidance {#gate-guidance}

Gates prevent bad scores from being hidden by weighted averages. But every gate is a potential false reject — a gate that fails a genuinely good artifact because it misses one element.

**When to use `gate: false`:**
- The criterion contributes to the score, but a low score here doesn't invalidate the whole artifact
- Most criteria should be `gate: false` or `severity: advisory`

**When to use `severity: major`:**
- The criterion covers the most important quality dimension for this artifact type
- A score below 2 here means the artifact cannot serve its primary purpose
- Example: missing views in a viewpoint-centred profile

**When to use `severity: critical`:**
- The criterion covers a compliance-level concern — a mandatory control, regulatory requirement, or minimum governance standard
- A failure here means the artifact cannot proceed in any state
- Reserve this for absolute must-haves: usually 0–1 per profile

**Target gate distribution per profile:**
```
gate: false         → 60-70% of criteria
severity: advisory  → 10-20% of criteria
severity: major     → 1-2 criteria
severity: critical  → 0-1 criteria
```

**Over-gating example (bad):**
```yaml
# 5 major gates in a 10-criterion profile
# Result: almost any artifact with a weak section fails the whole review
# This defeats the purpose of the weighted average
```

**Under-gating example (bad):**
```yaml
# 0 gates in a security profile
# Result: an artifact with no security controls at all can still "pass" on a high average
# Gates exist precisely to prevent this
```

---
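
The target distribution is easy to tally from a drafted profile. A sketch, assuming criteria are dicts whose `gate` field is either `false` or a mapping with a `severity` key (the `enabled` flag is ignored for brevity):

```python
# Sketch: tally gate severities across a profile's criteria and flag
# over- and under-gating against the target distribution above. The
# input shape (list of criterion dicts) is an assumption.
from collections import Counter

def gate_distribution(criteria: list) -> Counter:
    tally = Counter()
    for c in criteria:
        gate = c.get("gate", False)
        if not gate:
            tally["none"] += 1
        else:
            tally[gate.get("severity", "advisory")] += 1
    return tally

def gating_warnings(criteria: list) -> list:
    d = gate_distribution(criteria)
    warnings = []
    if d["major"] > 2:
        warnings.append(f"{d['major']} major gates; target is 1-2")
    if d["critical"] > 1:
        warnings.append(f"{d['critical']} critical gates; reserve for 0-1")
    if d["major"] + d["critical"] == 0:
        warnings.append("no major/critical gates; a weak artifact can pass on average alone")
    return warnings
```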

## Complete Criterion Example

**Incomplete (bad) — will fail schema validation and produce unreliable scores:**
```yaml
- id: PM-ROOT-01
  question: "Does the post-mortem identify the root cause?"
  scoring_guide:
    "0": "No root cause"
    "3": "Root cause identified"
  gate: false
```

Missing 9 of 13 required fields. Evaluators have no guidance on scores 1, 2, 4; no evidence to look for; no examples; no decision tree.

**Complete (good) — all 13 fields present:**
```yaml
- id: PM-ROOT-01
  question: "Does the post-mortem identify the root cause with supporting evidence?"
  description: >
    Root cause identification is the primary purpose of a post-mortem. Without a
    specific, evidenced root cause, the post-mortem cannot drive effective prevention.
    "Human error" and "process failure" are not root causes — they are proxies for
    the conditions that enabled the failure.
  metric_type: ordinal
  scale: [0, 1, 2, 3, 4, "N/A"]
  gate:
    enabled: true
    severity: major
    failure_effect: Cannot pass if the post-mortem does not identify a specific root cause
  required_evidence:
    - explicit root cause statement (not just a timeline)
    - contributing factors (conditions that enabled the root cause)
    - evidence supporting the root cause conclusion (data, logs, timeline analysis)
  scoring_guide:
    "0": "No root cause section, or root cause stated as 'human error' / 'process failure' without further analysis."
    "1": "Root cause implied or mentioned superficially with no supporting evidence."
    "2": "Specific root cause stated but supporting evidence absent or limited to timeline."
    "3": "Specific root cause stated with supporting evidence and at least one contributing factor identified."
    "4": "Specific root cause with full evidence chain, multiple contributing factors, and causal relationships mapped."
  anti_patterns:
    - "'Root cause: Human error' — this is a contributing factor, not a root cause"
    - "'Root cause: TBD' — post-mortem cannot be used for prevention without this"
    - Root cause stated but no evidence provided for why this was the cause
  examples:
    good:
      - >
        "Root cause: Race condition in payment state machine between the timeout handler
        and the confirmation webhook processor. Contributing factors: (1) Missing mutex on
        shared payment state object (2) Timeout threshold (30s) shorter than downstream
        webhook delivery SLA (45s). Evidence: Log analysis shows 47 concurrent state
        transitions in 2.3s during the incident window (Appendix A)."
    bad:
      - "Root cause: Engineering team failed to test edge cases."
      - "Root cause: See timeline above."
  decision_tree: >
    IF no root cause section THEN score 0.
    IF root cause is 'human error' or 'process failure' without further drill-down THEN score 0-1.
    IF specific technical root cause stated but no evidence THEN score 2.
    IF specific root cause with supporting evidence THEN score 3.
    IF specific root cause AND full evidence chain AND contributing factors mapped THEN score 4.
  remediation_hints:
    - Apply the "5 Whys" technique to drill below human error to systemic causes
    - Attach log excerpts or data to support the stated root cause
    - Add a contributing factors section listing the conditions that enabled the root cause
```

---

## Criterion Review Checklist

Before saving a criterion, verify:

- [ ] `question` is specific and answerable from artifact content alone
- [ ] `scoring_guide` uses artifact-content language, not evaluator-action language
- [ ] `scoring_guide` distinguishes clearly between each adjacent pair (0/1, 1/2, 2/3, 3/4)
- [ ] `decision_tree` resolves the most common ambiguous case (usually 2 vs. 3)
- [ ] `required_evidence` lists specific artifact elements, not general categories
- [ ] `examples.good` contains a realistic quote or paraphrase from a strong artifact
- [ ] `examples.bad` contains the actual common failure mode, not just an empty section
- [ ] `gate` assignment is deliberate (not defaulted)
- [ ] `remediation_hints` are specific verb-first actions, not general advice
@@ -0,0 +1,158 @@ package/assets/init/.agents/skills/earos-profile-author/references/design-methods.md
# Design Methods — EAROS Profile Author

This file describes the 5 design methods for EAROS profiles. Read this before choosing a method (Step 2).

---

## Why Design Methods Matter

A profile's design method shapes its dimensional structure and criterion types. Choosing the wrong method produces criteria that feel disconnected from the artifact type — assessors struggle to apply them, calibration fails, and the profile is abandoned.

The five methods are not about the content of the architecture — they are about the primary *evaluative lens* the profile applies.

---

## Method A — Decision-Centred

**Best for:** ADRs, investment reviews, exception requests, approval documents

**Core question:** "Is this document adequate to support a governance decision?"

**Why:** Decision-focused artifacts are evaluated primarily on whether they enable a clear, informed decision. The architecture content matters less than the decision structure: was the context clear, were alternatives considered, is the rationale sound, is the decision reversible or final?

**Dimension structure typically includes:**
- Decision context and framing (why is a decision needed?)
- Options analysis (what was considered, why rejected)
- Decision statement and rationale
- Reversibility and revisit conditions
- Stakeholder alignment

**Signature criteria:**
- Options presented with comparative analysis (not just the chosen option)
- Decision consequences made explicit
- Revisit/escalation conditions named

**Example profile:** `profiles/adr.yaml`

**Key indicator to choose Method A:** The primary artifact purpose is to get approval or record a governance decision — not to describe an architecture in full.

---

## Method B — Viewpoint-Centred

**Best for:** Capability maps, reference architectures, solution architectures, platform blueprints

**Core question:** "Does this artifact address the concerns of all relevant stakeholders through appropriate architectural views?"

**Why:** Viewpoint-centred artifacts are evaluated on their completeness across multiple perspectives (context, functional, deployment, data, security) and how well those views address the stated stakeholder concerns. The presence and quality of views is the primary quality signal.

**Dimension structure typically includes:**
- Views and diagrams coverage
- Stakeholder concern coverage
- Cross-view consistency
- Notation and annotation quality

**Signature criteria:**
- Multiple views present (minimum: context, component, deployment)
- Views explicitly mapped to stakeholder concerns
- Consistent terminology and component naming across views

**Example profiles:** `profiles/reference-architecture.yaml`, `profiles/solution-architecture.yaml`

**Key indicator to choose Method B:** The artifact is expected to contain multiple diagrams and the audience needs different perspectives on the same system.

---

## Method C — Lifecycle-Centred

**Best for:** Transition designs, roadmaps, handover documents, migration plans

**Core question:** "Does this artifact support the full lifecycle — current state, future state, and the path between them?"

**Why:** Lifecycle artifacts are evaluated on whether they adequately describe the journey, not just the destination. Current state, future state, transition steps, dependencies, and rollback conditions are all essential. An artifact that only describes the target state fails because it leaves delivery teams without a path.

**Dimension structure typically includes:**
- Current state description
- Future/target state description
- Transition pathway (phases, milestones, dependencies)
- Risk and rollback
- Ownership across lifecycle phases

**Signature criteria:**
- Current state explicitly described (not just assumed)
- Transition steps sequenced with dependencies
- Rollback or abort conditions named

**Example profile:** `profiles/roadmap.yaml`

**Key indicator to choose Method C:** The artifact describes a change over time, not just a static design.

---

## Method D — Risk-Centred

**Best for:** Security architectures, regulatory compliance designs, resilience architectures, threat models

**Core question:** "Does this artifact identify, mitigate, and accept risks at a level appropriate for the risk domain?"

**Why:** Risk-centred artifacts are evaluated on completeness of risk identification and adequacy of mitigations, not just architectural soundness. The primary failure mode is incomplete risk coverage — threats not considered, mitigations not proportionate, residual risk not accepted by a named authority.

**Dimension structure typically includes:**
- Risk identification scope and completeness
- Mitigation design and proportionality
- Residual risk acceptance
- Control implementation evidence
- Compliance coverage

**Signature criteria:**
- Threat model or risk register covering defined scope
- Mitigations proportionate to risk likelihood × impact
- Named authority accepting residual risks
- Control-to-requirement traceability

**Key indicator to choose Method D:** The primary purpose of the artifact is to demonstrate that risks have been identified and managed — not just to describe the architecture.

**Note:** The security and regulatory overlays (`overlays/security.yaml`, `overlays/regulatory.yaml`) often apply alongside Method D profiles but are not substitutes for a D-method profile when the artifact is primarily risk-focused.

---

## Method E — Pattern-Library

**Best for:** Recurring reference patterns, platform blueprints, golden-path designs

**Core question:** "Is this pattern sufficiently defined, validated, and reusable that teams can adopt it without extensive customization?"

**Why:** Pattern-library artifacts are evaluated on their reusability and adoption-readiness, not just their technical correctness. A pattern that is architecturally sound but undocumented at the decision-point level isn't reusable — teams have to recreate the design rationale each time. The primary failure mode is an artifact that is a good architecture but a poor pattern.

**Dimension structure typically includes:**
- Pattern definition and applicability conditions
- Implementation completeness (is there enough to act on?)
- Reuse guidance (when to use, when not to use, variants)
- Evolution and versioning
- Validation evidence (is this proven in production?)

**Signature criteria:**
- Named applicability conditions ("use this when X, don't use when Y")
- Canonical implementation example
- Known variants documented
- Adoption metrics or proven instances

**Example profile:** `profiles/reference-architecture.yaml` (uses Method E)

**Key indicator to choose Method E:** Teams are expected to adopt this pattern repeatedly — it needs to work as a template, not just a one-time design.

---

## Choosing Between Methods — Decision Guide

| Situation | Method |
|-----------|--------|
| "We need to approve a specific decision" | A — Decision-Centred |
| "We need multiple teams to understand this system from different angles" | B — Viewpoint-Centred |
| "We're describing how to get from A to B over time" | C — Lifecycle-Centred |
| "The primary purpose is to show risks are controlled" | D — Risk-Centred |
| "Teams will use this repeatedly as a template" | E — Pattern-Library |

**When in doubt:** Method B (Viewpoint-Centred) is the most general and works well for most architecture artifacts that don't fit a more specific method.

**Combinations:** Some artifacts have secondary concerns from another method. Handle this by choosing the primary method and adding criteria from the secondary concern to relevant dimensions, rather than trying to combine two methods.
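
For scripting convenience, the decision guide can also be carried as a lookup table. This is purely a restatement of the table above; the dictionary keys paraphrase the situations and carry no extra meaning.

```python
# The decision guide above as a lookup table. A restatement for quick
# reference in scripts; the key phrasings are paraphrases, not
# normative text from the guide.

DESIGN_METHODS = {
    "approve a specific decision": "A — Decision-Centred",
    "multiple teams need different angles on one system": "B — Viewpoint-Centred",
    "describes getting from A to B over time": "C — Lifecycle-Centred",
    "primary purpose is to show risks are controlled": "D — Risk-Centred",
    "used repeatedly as a template": "E — Pattern-Library",
}

DEFAULT_METHOD = "B — Viewpoint-Centred"  # "when in doubt", per the guide
```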
@@ -0,0 +1,173 @@ package/assets/init/.agents/skills/earos-profile-author/references/profile-checklist.md
# Profile Validation Checklist — EAROS Profile Author

This checklist must be completed before publishing a profile or overlay. Read it before Step 6 (pre-publication checks).

---

## Why a Checklist?

Profiles that skip validation steps cause silent failures in evaluations. A missing field in one criterion might not be caught until the profile is used in production, by which time evaluations have already been produced on a flawed rubric. Running this checklist before publishing catches errors when they are cheap to fix.

---

## Part 1 — Structural Validation

### 1.1 Required Top-Level Fields

Check that each of these is present and correctly typed:

| Field | Required Value |
|-------|---------------|
| `rubric_id` | Unique string, format `EAROS-<ARTIFACT>-<NNN>` |
| `version` | Semver format (e.g., `1.0.0`) |
| `kind` | `profile` or `overlay` |
| `title` | Non-empty string |
| `status` | `draft` (for new profiles) |
| `effective_date` | `YYYY-MM-DD` format |
| `owner` | `enterprise-architecture` |
| `artifact_type` | snake_case string |
| `inherits` | `[EAROS-CORE-002]` (profiles only; absent for overlays) |
| `design_method` | One of the 5 valid methods |
| `dimensions` | Non-empty list |
| `scoring` | Object with required sub-fields |
| `outputs` | Object with required sub-fields |
| `calibration` | Object with `required_before_production: true` |
| `change_log` | List with at least one entry |

**Overlays specifically:**
- [ ] No `inherits` field
- [ ] `scoring.method: append_to_base_rubric`
40
+
41
+ ### 1.2 Scoring Block
42
+
43
+ ```yaml
44
+ scoring:
45
+ scale: 0-4 ordinal plus N/A
46
+ method: gates_first_then_weighted_average # overlays: append_to_base_rubric
47
+ thresholds:
48
+ pass: No critical gate failure, overall >= 3.2, no dimension < 2.0
49
+ conditional_pass: No critical gate failure, overall 2.4-3.19
50
+ rework_required: Overall < 2.4 or repeated weak dimensions
51
+ reject: Critical gate failure or mandatory control breach
52
+ not_reviewable: Evidence insufficient for core gate criteria
53
+ na_policy: Exclude N/A criteria from denominator; evaluator must justify N/A
54
+ confidence_policy: Confidence reported separately, must not modify score
55
+ ```
56
+
57
+ ### 1.3 Outputs Block
58
+
59
+ ```yaml
60
+ outputs:
61
+ require_evidence_refs: true
62
+ require_confidence: true
63
+ require_actions: true
64
+ require_evidence_class: true
65
+ require_evidence_anchors: true
66
+ ```
67
+
68
+ ---
69
+
70
+ ## Part 2 — Criterion Completeness
71
+
72
+ For every criterion, verify all 13 v2 fields are present:
73
+
74
+ | Field | Check |
75
+ |-------|-------|
76
+ | `id` | [ ] Present, unique, format `<ARTIFACT>-<AREA>-<NN>` |
77
+ | `question` | [ ] Present, specific, single concern |
78
+ | `description` | [ ] Present, explains WHY this matters |
79
+ | `metric_type` | [ ] Value is exactly `ordinal` |
80
+ | `scale` | [ ] Value is exactly `[0, 1, 2, 3, 4, "N/A"]` |
81
+ | `gate` | [ ] Either `false` or object with `enabled`, `severity`, `failure_effect` |
82
+ | `required_evidence` | [ ] Non-empty list of specific artifact elements |
83
+ | `scoring_guide` | [ ] All keys "0" through "4" present with content |
84
+ | `anti_patterns` | [ ] Non-empty list |
85
+ | `examples.good` | [ ] Non-empty list with realistic quote or paraphrase |
86
+ | `examples.bad` | [ ] Non-empty list with the actual common failure mode |
87
+ | `decision_tree` | [ ] Non-empty string with IF/THEN branches |
88
+ | `remediation_hints` | [ ] Non-empty list of verb-first actions |
89
+
90
+ ---
91
+
92
+ ## Part 3 — ID Uniqueness
93
+
94
+ Before publishing, verify no ID collisions:
95
+
96
+ - [ ] Profile `rubric_id` not already used in `core/`, `profiles/`, or `overlays/`
97
+ - [ ] All criterion IDs (`id` fields) unique across ALL rubric files in the repo
98
+ - [ ] All dimension IDs unique within this profile file
99
+
100
+ To check: scan `core/core-meta-rubric.yaml`, all files in `profiles/`, and all files in `overlays/` for any matching IDs.
101
+
102
+ **Common mistake:** Using short IDs like `D1`, `CRT-01` that collide with core rubric dimension IDs.
103
+
104
+ ---
105
+
106
+ ## Part 4 — Gate Distribution Check
107
+
108
+ Review gate assignments across the profile and verify:
109
+
110
+ | Gate Type | Target Count | Actual Count |
111
+ |-----------|-------------|--------------|
112
+ | `critical` | 0–1 | [ ] |
113
+ | `major` | 1–2 | [ ] |
114
+ | `advisory` | 0–3 | [ ] |
115
+ | `gate: false` | Most criteria | [ ] |
116
+
117
+ **Red flags:**
118
+ - More than 2 `major` gates → likely over-gating; review whether all are truly fatal
119
+ - 0 gates on a security/compliance-focused profile → likely under-gating
120
+ - `critical` gate on a non-compliance criterion → review whether this is warranted
121
+
122
+ ---
123
+
124
+ ## Part 5 — Criterion Count Check
125
+
126
+ | Check | Target | Actual |
127
+ |-------|--------|--------|
128
+ | Total profile-specific criteria | 5–12 | [ ] |
129
+ | New dimensions | 2–6 | [ ] |
130
+ | Criteria that duplicate core criteria | 0 | [ ] |
131
+
132
+ To verify: read `core/core-meta-rubric.yaml` and compare each new criterion's `question` to ensure it covers a genuinely different concern.
133
+
134
+ ---
135
+
136
+ ## Part 6 — Schema Validation
137
+
138
+ If you have a YAML validator available, validate against `standard/schemas/rubric.schema.json`.
139
+
140
+ If not, manually verify the most common schema violations:
141
+ - [ ] All string keys are quoted where required (especially numeric keys in `scoring_guide`: `"0":`, `"1":`, etc.)
142
+ - [ ] Two-space indentation throughout (not 4-space, not tabs)
143
+ - [ ] Lists use `- item` format, not inline `[item1, item2]` for multi-line lists
144
+ - [ ] Multi-line descriptions use `>` block scalar, not `|` (unless you need to preserve newlines)
145
+ - [ ] File name matches convention: `<artifact-type>.yaml` (kebab-case)
146
+
147
+ ---
148
+
149
+ ## Part 7 — Pre-Calibration Checklist
150
+
151
+ Before the profile can be used in production:
152
+
153
+ - [ ] At least 3 real artifacts collected for calibration (1 strong ≥3.2, 1 weak <2.4, 1 ambiguous)
154
+ - [ ] Calibration artifacts documented in `calibration/gold-set/` or accessible to reviewers
155
+ - [ ] 2+ reviewers identified for independent scoring
156
+ - [ ] Profile `status: draft` until calibration is complete
157
+
158
+ After successful calibration:
159
+ - [ ] Profile status changed to `candidate` (then `approved` after governance sign-off)
160
+ - [ ] Calibration results saved to `calibration/results/`
161
+ - [ ] Worked evaluation example added to `examples/`
162
+ - [ ] Profile mentioned in CHANGELOG.md
163
+
164
+ ---
165
+
166
+ ## Final Sign-Off
167
+
168
+ Before committing the profile:
169
+ - [ ] All Part 1–6 checks completed
170
+ - [ ] No duplicate IDs found
171
+ - [ ] Gate distribution reviewed and approved
172
+ - [ ] Criterion count within 5–12
173
+ - [ ] `earos-validate` skill run on the full repo (catches cross-file issues)