npm - @trohde/earos - Versions diffs - 1.0.0 - Mend

@trohde/earos 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (135) hide show

package/assets/init/research/reference-architecture-research.md ADDED Viewed

@@ -0,0 +1,751 @@
+# Reference Architecture Best Practices: Blueprints, Golden Paths, and Quality Frameworks
+*Deep research compiled March 2026*
+---
+## 1. What Makes a Good Reference Architecture
+### 1.1 Definition and Purpose
+A reference architecture is a standardised, reusable blueprint that guides the design and implementation of system architectures within a specific domain. In TOGAF's terminology, it is "a generic architecture that identifies the normal outlines of a system and provides the components, relationships, principles, and architectural guidelines" for a class of solutions. Unlike a solution architecture (which is tailored to one project), a reference architecture is meant to be broadly applicable — it captures proven patterns so that teams don't reinvent the wheel.
+Conexiam identifies two fundamental types of reference architectures: those that **expose the structure** of a system (component models, layered views) and those that **show how a system works** (data flows, interaction patterns, runtime behaviours). The best reference architectures include both.
+### 1.2 TOGAF and the Enterprise Continuum
+TOGAF positions reference architectures along an **Enterprise Continuum** — a spectrum from generic to specific:
+| Level | Description | Example |
+|-------|-------------|---------|
+| **Foundation** | Universal IT principles and building blocks | TOGAF Technical Reference Model (TRM) |
+| **Common Systems** | Cross-industry patterns | Service-oriented architecture, event-driven architecture |
+| **Industry** | Domain-specific blueprints | BIAN (banking), HL7 FHIR (healthcare), ACORD (insurance) |
+| **Organisation-Specific** | Tailored to one enterprise | "Our microservices reference architecture on AWS" |
+The further right on the continuum, the more opinionated and prescriptive the architecture becomes. A good reference architecture clearly states **where it sits** on this continuum so consumers understand how much adaptation is expected.
+TOGAF also provides two built-in reference models: the **Technical Reference Model (TRM)**, which describes a fundamental layered architecture, and the **Integrated Information Infrastructure Reference Model (III-RM)**, focused on information flow. Both serve as starting points that organisations extend.
+### 1.3 Structure and Content: Lessons from the Hyperscalers
+The major cloud providers have converged on a remarkably consistent structure for their reference architectures. Examining AWS, Azure, and Google Cloud reveals a common anatomy:
+#### AWS Reference Architectures
+AWS publishes reference architecture diagrams through the **AWS Architecture Center**, structured around:
+- **Architecture diagram** with numbered data-flow annotations
+- **Component descriptions** — each AWS service's role explained
+- **Well-Architected alignment** — mapping to the six pillars (operational excellence, security, reliability, performance efficiency, cost optimisation, sustainability)
+- **Deployment guidance** — often including CloudFormation or CDK templates
+- **Cost considerations**
+AWS also provides **AWS Solutions Constructs** — an open-source CDK library of multi-service patterns that implement Well-Architected best practices with ready-to-deploy code. These represent reference architectures at the infrastructure-as-code level.
+#### Azure Architecture Center
+Microsoft's Azure Architecture Center structures reference architectures with:
+- **Scenario overview** — business context, use cases, and intended audience
+- **Architecture diagram** — visual showing component interactions
+- **Dataflow description** — numbered walkthrough of how data moves through the system
+- **Components** — each Azure service linked to its Well-Architected service guide
+- **Well-Architected Framework considerations** — analysis across the five pillars (reliability, security, cost optimisation, operational excellence, performance efficiency)
+- **Alternatives** — discussion of technology choices and trade-offs
+- **Deploy this scenario** — ARM templates or Bicep files on GitHub
+#### Google Cloud Architecture Center
+Google Cloud's approach emphasises:
+- **Architecture framework** — organised into pillars (security, reliability, performance, cost, operations, sustainability) plus cross-pillar perspectives
+- **Simplicity** — explicit principle that if your architecture is too complex to understand, it will be difficult to implement
+- **Documentation emphasis** — Google specifically calls out that lack of documentation is a major obstacle and that a "properly documented cloud architecture establishes common language and standards"
+- **Decoupling** — separating components so they can operate independently
+- **Industry perspectives** — e.g. a dedicated Financial Services Industry (FSI) perspective aligned to each pillar
+### 1.4 Audience and Level of Detail
+A good reference architecture serves multiple audiences simultaneously and should be layered accordingly:
+| Audience | Needs | Detail Level |
+|----------|-------|-------------|
+| **Executive / Business** | Why this pattern? What value? | High-level context diagram, business benefits, cost model |
+| **Solution Architect** | How to compose the solution | Component model, integration patterns, decision points |
+| **Developer** | How to implement | API contracts, code samples, deployment templates |
+| **Operations / SRE** | How to run it | Monitoring, scaling policies, runbooks, SLOs |
+| **Security / Compliance** | How it's protected | Threat model, control mapping, compliance evidence |
+The key insight from all three hyperscalers: **the diagram is necessary but not sufficient**. Every reference architecture needs narrative explanation of data flows, explicit quality-attribute considerations, and actionable deployment artefacts.
+---
+## 2. Golden Paths and Paved Roads
+### 2.1 Origin: Spotify's Golden Path
+The concept of the **Golden Path** was popularised by Spotify Engineering in 2020 to address a specific problem: as the company grew, autonomous teams led to ecosystem fragmentation. Developers resorted to what Spotify called "rumour-driven development" — the only way to figure out how to do something was to ask a colleague who might know.
+Spotify's definition: a Golden Path is the **"opinionated and supported path to build something"** — for example, build a backend service, put up a website, or create a data pipeline.
+Key design principles:
+- **Opinionated but not mandatory**: The path represents the organisation's recommended way of doing things. Teams can deviate, but they lose platform support when they do.
+- **Convenience over restriction**: The goal is to make the right thing the easy thing, not to enforce compliance through gates.
+- **Full lifecycle coverage**: A golden path covers from project creation through CI/CD, observability, and production operations.
+- **Dramatically reduced setup time**: Spotify reported reducing service setup time from **14 days to under 5 minutes**.
+### 2.2 Related Concepts
+The golden path concept has spawned related ideas:
+- **Paved Road** (Netflix's term): Similar concept — a well-maintained, well-lit highway through the technology landscape. Netflix emphasises that the paved road includes automated security scanning, standardised observability, and pre-configured CI/CD.
+- **Silver Path**: Spotify's term for a secondary supported path — not the primary recommendation, but still maintained and supported.
+- **Guardrails vs. Gates**: Golden paths favour guardrails (guide developers towards good practices) over gates (block developers until they comply). If you stay on the path, security and observability come "for free."
+### 2.3 Backstage and Software Templates
+Spotify's internal developer portal, **Backstage** (now a CNCF project), is the primary mechanism for delivering golden paths. As of mid-2025, Backstage has become "the gravitational center of the IDP conversation."
+**Backstage Software Templates (Scaffolder)** operationalise golden paths:
+1. A platform team defines a template (e.g. "Create a Java microservice")
+2. The template encodes organisational standards: preferred language/framework, CI/CD pipeline, observability setup, security scanning
+3. A developer selects the template, fills in a few form fields
+4. Backstage creates a new repository with the golden-path configuration, with the first build already running through CI/CD
+The templates are stored as YAML definitions that can be version-controlled, reviewed, and evolved alongside the platform.
+### 2.4 Platform Engineering Context
+Golden paths are a core concept within **platform engineering** — the practice of designing, building, and maintaining self-service capabilities that enable developers to deliver software faster with reduced cognitive load.
+ThoughtWorks describes the golden path philosophy as: "If you follow and use these tools, we will give you our best support, so you will have a good experience. We're using these tools because these are the ones we recommend, and if you get stuck, we'll help you because we know exactly how to unblock you."
+Key platform engineering patterns for golden paths:
+- **Internal Developer Platform (IDP)**: The infrastructure layer that hosts and delivers golden paths
+- **Self-service provisioning**: Developers can create environments, deploy services, and access tools without tickets
+- **Production readiness scoring**: Tools like OpsLevel provide maturity dashboards that measure how closely services follow golden paths
+- **Cognitive load reduction**: The primary metric — how much mental effort is removed from the developer's workflow
+### 2.5 Design Principles for Effective Golden Paths
+Drawing from Red Hat, Spotify, and ThoughtWorks guidance, effective golden paths share these characteristics:
+1. **Start with the most painful developer journey** — identify where teams waste the most time
+2. **Cover the full lifecycle** — from "I need a new service" to "it's running in production with monitoring"
+3. **Make defaults excellent** — logging, security, observability, error handling should be pre-configured
+4. **Keep the escape hatch** — experienced engineers must be able to go "off-road" for specific needs
+5. **Measure adoption, not enforcement** — track what percentage of teams voluntarily use the path
+6. **Iterate based on feedback** — treat golden paths as products with their own roadmap
+7. **Make the path visible** — use a developer portal (Backstage, Port, OpsLevel) to showcase available paths
+---
+## 3. Reference Architecture Templates and Structures
+### 3.1 The arc42 Template
+The **arc42** template, created by Gernot Starke and Peter Hruschka, provides a 12-section framework for software architecture documentation. It is deliberately pragmatic — every section is optional, and teams choose what's relevant.
+#### The 12 Sections
+| # | Section | Purpose |
+|---|---------|---------|
+| 1 | **Introduction and Goals** | Functional requirements summary, driving forces, quality goals, stakeholder expectations |
+| 2 | **Constraints** | Technical, organisational, and regulatory constraints |
+| 3 | **Context and Scope** | System boundary, external systems, interfaces — the "big picture" |
+| 4 | **Solution Strategy** | Core architectural decisions, technology choices, key patterns |
+| 5 | **Building Block View** | Static decomposition — modules, components, layers (hierarchical) |
+| 6 | **Runtime View** | Dynamic behaviour — important scenarios, sequences, interactions |
+| 7 | **Deployment View** | Infrastructure, hardware, environments, deployment topology |
+| 8 | **Crosscutting Concepts** | Technical topics that span multiple building blocks (security, logging, error handling, persistence) |
+| 9 | **Architectural Decisions** | Key decisions with rationale (ADRs) |
+| 10 | **Quality Requirements** | Quality tree, quality scenarios, measurable targets |
+| 11 | **Risks and Technical Debt** | Known problems, risks, and their mitigations |
+| 12 | **Glossary** | Domain terms, ubiquitous language |
+Arc42's strength is its **technology-agnostic** structure — it works equally well for a microservices system, an embedded system, or a data platform.
+### 3.2 The C4 Model
+The **C4 model** (created by Simon Brown) provides an "abstraction-first" approach to diagramming software architecture. It uses four levels of zoom:
+| Level | Name | Shows | Audience |
+|-------|------|-------|----------|
+| 1 | **System Context** | The system as a box, surrounded by users and external systems | Everyone |
+| 2 | **Container** | Major runtime units (applications, databases, message queues) | Architects, developers |
+| 3 | **Component** | Components within a container | Developers |
+| 4 | **Code** | Class/module level (usually auto-generated) | Developers |
+Plus supplementary diagrams: **Dynamic** (runtime interactions), **Deployment** (infrastructure mapping).
+### 3.3 Combining arc42 and C4
+Arc42 and C4 complement each other naturally:
+| C4 Diagram | Maps to arc42 Section |
+|------------|----------------------|
+| System Context | Section 3: Context and Scope |
+| Container | Section 5: Building Block View (Level 1) |
+| Component | Section 5: Building Block View (Level 2+) |
+| Dynamic | Section 6: Runtime View |
+| Deployment | Section 7: Deployment View |
+This combination provides both the **structure** (arc42) and the **visual language** (C4) needed for comprehensive documentation.
+### 3.4 Azure Architecture Center Structure
+Microsoft's reference architectures follow a consistent template:
+1. **Architecture overview** — one-paragraph summary
+2. **Architecture diagram** — annotated visual
+3. **Dataflow** — numbered step-by-step walkthrough
+4. **Components** — list of services with links to documentation
+5. **Scenario details** — business context, use cases
+6. **Considerations** — organised by Well-Architected pillars:
+   - Reliability
+   - Security
+   - Cost optimisation
+   - Operational excellence
+   - Performance efficiency
+7. **Deploy this scenario** — ARM/Bicep templates
+8. **Contributors** — authors and reviewers
+### 3.5 AWS Reference Architecture Structure
+AWS reference architectures typically include:
+1. **Title and one-line description**
+2. **Architecture diagram** with numbered data flows
+3. **Service descriptions** — role of each AWS service
+4. **How it works** — step-by-step data flow narrative
+5. **Well-Architected considerations** — mapped to six pillars
+6. **Implementation** — CDK constructs, CloudFormation templates, or Solution Constructs
+7. **Cost estimate** — using AWS Pricing Calculator
+8. **Related resources** — links to documentation, workshops, blog posts
+### 3.6 Proposed Unified Template for Reference Architectures
+Synthesising across all frameworks, a comprehensive reference architecture should include:
+```
+1. OVERVIEW
+   1.1 Purpose and scope
+   1.2 Target audience
+   1.3 Position on the enterprise continuum (foundation / common / industry / org-specific)
+   1.4 Relationship to other reference architectures
+2. BUSINESS CONTEXT
+   2.1 Business drivers and goals
+   2.2 Use cases and scenarios
+   2.3 Key stakeholders and their concerns
+   2.4 Constraints (regulatory, organisational, technical)
+3. ARCHITECTURE VIEWS
+   3.1 Context view (C4 Level 1 — system context)
+   3.2 Functional view (C4 Level 2 — containers / building blocks)
+   3.3 Component view (C4 Level 3 — internal structure)
+   3.4 Data flow view (runtime scenarios, numbered walkthrough)
+   3.5 Deployment view (infrastructure topology)
+   3.6 Security view (threat model, control mapping)
+4. CROSSCUTTING CONCERNS
+   4.1 Security and identity
+   4.2 Observability (logging, monitoring, tracing)
+   4.3 Resilience and fault tolerance
+   4.4 Scalability and performance
+   4.5 Data management and governance
+   4.6 Integration patterns
+5. ARCHITECTURE DECISIONS
+   5.1 Key decisions with rationale (ADR format)
+   5.2 Technology choices and trade-offs
+   5.3 Patterns adopted and why
+   5.4 Alternatives considered
+6. QUALITY ATTRIBUTES
+   6.1 Quality attribute requirements (measurable)
+   6.2 Quality scenarios
+   6.3 SLOs and SLAs
+   6.4 Fitness functions for automated validation
+7. OPERATIONAL MODEL
+   7.1 Deployment strategy
+   7.2 CI/CD pipeline design
+   7.3 Monitoring and alerting
+   7.4 Incident response
+   7.5 Scaling policies
+   7.6 Disaster recovery
+8. IMPLEMENTATION GUIDANCE
+   8.1 Golden path / getting started
+   8.2 Infrastructure-as-code templates
+   8.3 API specifications
+   8.4 Sample code and starter kits
+   8.5 Migration path from existing systems
+9. GOVERNANCE
+   9.1 Compliance mapping
+   9.2 Architecture review checklist
+   9.3 Known risks and technical debt
+   9.4 Evolution roadmap
+10. GLOSSARY AND REFERENCES
+    10.1 Domain terms
+    10.2 Abbreviations
+    10.3 External references and standards
+```
+---
+## 4. Quality Criteria for Reference Architectures
+### 4.1 Fundamental Quality Dimensions
+Research and industry practice identify several dimensions that distinguish good reference architectures from poor ones:
+#### Applicability
+Four criteria indicate the applicability of a reference architecture (from academic research on reference architecture evaluation):
+1. **Range of development pathways** — How many valid design decisions remain? A reference architecture with too few pathways is overly rigid; too many means it's too vague.
+2. **Size of application area** — How broad is the domain it covers? Larger application areas increase reuse potential but risk becoming too generic.
+3. **Availability of standards and implementation guides** — Are there concrete artefacts teams can use?
+4. **Provision of submodels** — Does it provide detailed models for specific aspects (data, security, integration)?
+#### The Prescriptiveness Spectrum
+One of the most critical design decisions for a reference architecture is where to sit on the **prescriptiveness spectrum**:
+| Approach | Characteristics | When to Use |
+|----------|----------------|-------------|
+| **Highly prescriptive** | Specific technologies, exact configurations, ready-to-deploy templates | Organisations seeking standardisation, reducing decision fatigue |
+| **Balanced** | Recommended patterns with defined decision points | Organisations with diverse needs but shared principles |
+| **Flexible / guidance-oriented** | Principles and patterns without technology mandates | Federated organisations, multi-cloud strategies |
+The hyperscalers tend towards the prescriptive end (since they're promoting their own services), while frameworks like TOGAF and arc42 are deliberately technology-agnostic.
+### 4.2 Quality Criteria Checklist
+Drawing from ISO 25010 quality characteristics, TOGAF principles, and cloud provider practices:
+**Completeness**
+- Covers all relevant architectural views (structural, behavioural, deployment)
+- Addresses all quality attributes important to the domain
+- Includes both "happy path" and failure scenarios
+- Documents decisions AND their rationale
+**Clarity**
+- Architecture diagrams follow a consistent notation (C4, ArchiMate, UML)
+- Text is written for the stated audience level
+- Jargon is defined in a glossary
+- Data flows are narrated step-by-step
+**Consistency**
+- Internal consistency — no contradictions between views
+- External consistency — aligns with organisational principles and standards
+- Terminology is used consistently throughout
+**Currency**
+- Uses current, supported technology versions
+- Reflects current security best practices
+- Aligned with current vendor recommendations
+- Has a clear versioning and update strategy
+**Actionability**
+- Provides implementation artefacts (IaC templates, API specs, code samples)
+- Includes a clear "getting started" path
+- Offers deployment automation
+- Contains runnable examples or proof-of-concept code
+**Reusability**
+- Clearly delineates what is fixed vs. what is customisable
+- Uses parameterisation for environment-specific values
+- Separates concerns to allow partial adoption
+- Documents extension points
+**Operational Readiness**
+- Includes monitoring and alerting guidance
+- Defines SLOs and scaling policies
+- Addresses disaster recovery
+- Provides runbook templates
+**Evolvability**
+- Architecture decisions are recorded with rationale
+- Known limitations and technical debt are documented
+- An evolution roadmap exists
+- The architecture supports incremental adoption
+### 4.3 Common Anti-Patterns
+What makes a **bad** reference architecture:
+- **Diagram-only architectures** — A box-and-arrow diagram without narrative, decisions, or implementation guidance
+- **Stale blueprints** — Reference architectures that haven't been updated in years and recommend deprecated technologies
+- **Ivory tower designs** — Created by architects who don't build, without input from implementing teams
+- **One-size-fits-all** — No customisation points, no decision framework for when the pattern doesn't fit
+- **Missing the "-ilities"** — Focuses only on functional decomposition without addressing security, performance, resilience, or operability
+- **No golden path** — Describes what to build but not how to get started
+- **Undocumented trade-offs** — Presents one option without explaining what was considered and rejected
+---
+## 5. Assessment Rubrics for Reference Architectures
+### 5.1 Architecture Review Boards (ARBs)
+Architecture Review Boards are the primary governance mechanism for assessing reference architectures. TOGAF's Architecture Governance Framework positions the ARB as a key governing component.
+#### Typical ARB Assessment Criteria
+Based on AWS, LeanIX, and TOGAF guidance, ARBs typically evaluate:
+| Category | Assessment Questions |
+|----------|---------------------|
+| **Strategic Alignment** | Does this align with business strategy and architectural principles? |
+| **Standards Compliance** | Does it conform to enterprise standards and regulatory requirements? |
+| **Security** | Are security best practices implemented (least privilege, encryption, threat modelling)? |
+| **Scalability** | Can it handle projected growth? Are scaling mechanisms defined? |
+| **Reliability** | Are HA and fault tolerance measures adequate? |
+| **Cost** | Is the cost model understood and optimised? |
+| **Operational Readiness** | Can operations teams run this? Are monitoring and incident procedures defined? |
+| **Reusability** | Can other teams adopt this pattern? Is it sufficiently parameterised? |
+| **Technical Debt** | What debt is being introduced? Is it documented and planned for? |
+#### Decision Framework
+ARBs typically use a structured decision framework:
+- **Approve** — Architecture meets all criteria
+- **Approve with conditions** — Architecture is acceptable with specific modifications required
+- **Defer** — More information or analysis needed
+- **Reject** — Architecture does not meet minimum criteria
+Decisions are usually reached through consensus, with a voting mechanism (e.g. 2/3 majority) used in deadlock cases.
+### 5.2 A Reference Architecture Assessment Rubric
+No single universally adopted rubric exists, but synthesising from ARB practices, TOGAF, and the Well-Architected frameworks, here is a proposed assessment rubric:
+#### Scoring: 1 (Poor) — 2 (Developing) — 3 (Adequate) — 4 (Good) — 5 (Excellent)
+**A. Completeness and Coverage (weight: 20%)**
+| Criterion | 1 | 3 | 5 |
+|-----------|---|---|---|
+| Architectural views | Single diagram only | Multiple views but gaps | Full coverage: context, functional, deployment, data flow, security |
+| Quality attributes | Not addressed | Some mentioned informally | All relevant attributes with measurable targets |
+| Failure scenarios | Not considered | Basic HA described | Comprehensive failure mode analysis with recovery procedures |
+| Decision rationale | No decisions recorded | Key decisions listed | Full ADRs with context, options, consequences |
+**B. Clarity and Communication (weight: 15%)**
+| Criterion | 1 | 3 | 5 |
+|-----------|---|---|---|
+| Diagram quality | Ad-hoc notation, inconsistent | Standard notation, mostly clear | Consistent C4/ArchiMate, all elements labelled, legend provided |
+| Narrative quality | Jargon-heavy, assumes deep context | Readable but uneven | Clear, audience-appropriate, with glossary |
+| Data flow documentation | No flow description | Partial walkthrough | Complete numbered step-by-step narrative |
+**C. Actionability and Implementation (weight: 25%)**
+| Criterion | 1 | 3 | 5 |
+|-----------|---|---|---|
+| Getting started | No implementation guidance | Some code samples | Complete golden path with scaffolding templates |
+| IaC templates | None provided | Partial templates | Full deployment automation (Terraform/CDK/Bicep) |
+| API specifications | Informal descriptions | Partial OpenAPI specs | Complete, versioned API contracts |
+| CI/CD guidance | Not addressed | Pipeline described | Pipeline-as-code templates provided |
+**D. Operational Readiness (weight: 20%)**
+| Criterion | 1 | 3 | 5 |
+|-----------|---|---|---|
+| Monitoring | Not addressed | Metrics mentioned | Dashboard templates, alerting rules, SLO definitions |
+| Scaling | Not addressed | Manual scaling described | Auto-scaling policies with load testing results |
+| DR/BCP | Not addressed | Backup strategy mentioned | Full DR plan with RTO/RPO, tested runbooks |
+| Security posture | Not addressed | Basic security controls | Threat model, control mapping, compliance evidence |
+**E. Reusability and Evolvability (weight: 10%)**
+| Criterion | 1 | 3 | 5 |
+|-----------|---|---|---|
+| Customisation points | Monolithic, no variation | Some parameterisation | Clear extension points, decision framework for variants |
+| Versioning | No version control | Versioned document | Architecture-as-code in Git with change history |
+| Evolution roadmap | Static document | Known limitations listed | Roadmap with planned improvements and migration guidance |
+**F. Governance and Compliance (weight: 10%)**
+| Criterion | 1 | 3 | 5 |
+|-----------|---|---|---|
+| Standards alignment | No reference to standards | Mentions relevant standards | Explicit mapping to enterprise principles and industry standards |
+| Compliance | Not addressed | Regulatory requirements listed | Control mapping with evidence collection guidance |
+| Review process | No review mechanism | Periodic manual review | Automated fitness functions with continuous validation |
+### 5.3 Well-Architected Reviews as Assessment
+The cloud providers' Well-Architected Reviews serve as de facto assessment rubrics:
+- **AWS Well-Architected Tool**: Structured questionnaire across six pillars, producing a report with high-risk issues (HRIs) and improvement plan
+- **Azure Well-Architected Review**: Online assessment tool that evaluates workloads against the five pillars
+- **Google Cloud Architecture Review**: Review process aligned to the framework's pillars plus industry-specific perspectives
+These tools can be repurposed to assess reference architectures by treating the reference architecture as the "workload" being reviewed.
+---
+## 6. Machine-Readable Aspects
+### 6.1 Architecture as Code (AaC)
+**Architecture as Code** is the practice of describing, version-controlling, and automating the entire system architecture through machine-readable artefacts. It represents a fundamental shift from architecture-as-documents to architecture-as-code.
+Core principles:
+- Architecture definitions are **text files** stored in version control
+- Changes are tracked through **pull requests** and code review
+- Architectural rules are enforced through **automated validation**
+- Diagrams are **generated from models**, not drawn manually
+### 6.2 Diagram-as-Code Tools
+Several tools enable machine-readable architecture diagrams:
+**Structurizr DSL** (recommended for C4 model):
+```
+workspace {
+    model {
+        user = person "User"
+        system = softwareSystem "My System" {
+            webapp = container "Web Application" "Serves the UI" "React"
+            api = container "API" "Business logic" "Java/Spring Boot"
+            db = container "Database" "Stores data" "PostgreSQL"
+        }
+        user -> webapp "Uses"
+        webapp -> api "Calls" "HTTPS/JSON"
+        api -> db "Reads/writes" "JDBC"
+    }
+    views {
+        systemContext system "Context" { include * }
+        container system "Containers" { include * }
+    }
+}
+```
+Key benefit: **one model, multiple views**. Change the model and all diagrams update automatically.
+Other diagram-as-code tools include:
+- **PlantUML** — UML and C4 diagrams from text
+- **Mermaid** — Markdown-native diagrams (widely supported in GitHub, GitLab, Notion)
+- **D2** — Modern diagramming language with auto-layout
+- **Diagrams (Python)** — Cloud architecture diagrams as Python code
+- **Mingrammer Diagrams** — AWS/Azure/GCP specific diagram generation
+### 6.3 Infrastructure as Code (IaC) as Architecture
+IaC templates are themselves a form of machine-readable reference architecture:
+| Tool | Format | Strengths |
+|------|--------|-----------|
+| **Terraform** | HCL | Multi-cloud, large module ecosystem |
+| **AWS CDK** | TypeScript/Python/Java | Type-safe, composable constructs |
+| **Pulumi** | General-purpose languages | Full programming language power |
+| **Azure Bicep** | Domain-specific | Native Azure integration |
+| **Google Cloud Deployment Manager** | YAML/Jinja | Native GCP integration |
+| **Crossplane** | Kubernetes YAML | Infrastructure as Kubernetes resources |
+**AWS Solutions Constructs** exemplify this: they are CDK libraries that encode Well-Architected reference architecture patterns as deployable code.
+### 6.4 Architecture Decision Records (ADRs)
+ADRs capture significant architectural decisions in a structured, version-controlled format. The standard format (from Michael Nygard):
+```markdown
+# ADR-001: Use PostgreSQL for primary data store
+## Status
+Accepted
+## Context
+We need a relational database that supports JSONB for semi-structured data,
+has strong ecosystem support, and is available as a managed service on all
+major cloud providers.
+## Decision
+We will use PostgreSQL as our primary relational data store.
+## Consequences
+- Teams have a clear default database choice
+- We can leverage JSONB for flexible schemas
+- We need PostgreSQL expertise on the platform team
+- Applications requiring pure document storage may need a separate decision
+```
+**Machine-readable ADR formats** are emerging:
+- **Decision Reasoning Format (DRF)**: A vendor-neutral YAML/JSON format for decisions with explicit reasoning, assumptions, cognitive state, and trade-offs
+- **MADR (Markdown Architectural Decision Records)**: Structured Markdown with YAML front matter
+- **Log4brains**: Tool that manages ADRs as a searchable knowledge base
+### 6.5 API Specifications
+Reference architectures increasingly embed machine-readable API contracts:
+- **OpenAPI / Swagger** — REST API specifications
+- **AsyncAPI** — Event-driven and message-based API specifications
+- **GraphQL SDL** — Schema definition for GraphQL APIs
+- **Protocol Buffers** — gRPC service definitions
+- **JSON Schema** — Data validation contracts
+These specifications can be used for automated contract testing, documentation generation, and client SDK generation.
+### 6.6 Fitness Functions for Automated Assessment
+**Architecture fitness functions** (coined by Neal Ford and colleagues in "Building Evolutionary Architectures") are automated tests that validate architectural characteristics:
+Types of fitness functions relevant to reference architecture conformance:
+| Type | Example | Automation |
+|------|---------|-----------|
+| **Dependency rules** | "Service A must not directly depend on Service C's database" | ArchUnit, Dependency-Check |
+| **Performance budgets** | "API response time < 200ms at p99" | Load testing in CI/CD |
+| **Security policies** | "No public S3 buckets, all data encrypted at rest" | OPA/Rego policies, cfn-nag, Checkov |
+| **API conformance** | "All APIs must follow our OpenAPI conventions" | Spectral linting |
+| **Infrastructure compliance** | "All deployments must use approved instance types" | Sentinel (Terraform), AWS Config Rules |
+| **Operational readiness** | "All services must have health checks and readiness probes" | Kubernetes admission controllers |
+The key insight: fitness functions **shift governance left** — from periodic architecture reviews to continuous, automated validation in the CI/CD pipeline. They turn reference architecture principles into executable policy.
+### 6.7 Putting It Together: A Machine-Readable Reference Architecture
+A fully machine-readable reference architecture might be structured as a Git repository:
+```
+reference-architecture/
+├── README.md                          # Human-readable overview
+├── docs/
+│   ├── architecture/
+│   │   ├── workspace.dsl              # Structurizr C4 model
+│   │   ├── context.md                 # Context narrative
+│   │   └── decisions/                 # ADRs
+│   │       ├── 001-database-choice.md
+│   │       ├── 002-messaging-pattern.md
+│   │       └── ...
+│   ├── runbooks/                      # Operational runbooks
+│   └── quality/
+│       ├── quality-scenarios.yaml     # Machine-readable quality requirements
+│       └── slo-definitions.yaml       # SLO definitions
+├── infrastructure/
+│   ├── terraform/                     # IaC templates
+│   │   ├── modules/
+│   │   └── environments/
+│   └── kubernetes/                    # K8s manifests / Helm charts
+├── api/
+│   ├── openapi.yaml                   # API specification
+│   └── asyncapi.yaml                  # Event specification
+├── templates/
+│   ├── backstage-template.yaml        # Backstage scaffolder template
+│   └── cookiecutter/                  # Project scaffolding
+├── fitness-functions/
+│   ├── arch-rules/                    # ArchUnit / dependency rules
+│   ├── security-policies/             # OPA/Rego policies
+│   ├── performance-budgets/           # Load test definitions
+│   └── compliance-checks/             # Regulatory compliance checks
+├── examples/
+│   ├── sample-service/                # Reference implementation
+│   └── integration-tests/             # End-to-end tests
+└── CHANGELOG.md                       # Version history
+```
+This structure enables:
+- **Automated diagram generation** from the Structurizr DSL
+- **One-click deployment** via IaC templates
+- **Continuous compliance** via fitness functions in CI/CD
+- **Self-service adoption** via Backstage templates
+- **Version-controlled evolution** via Git history and ADRs
+- **Machine-parseable quality requirements** for automated assessment
+---
+## 7. Synthesis: Principles for Excellent Reference Architectures
+Across all the research, several overarching principles emerge:
+1. **Reference architectures are products, not projects.** They need ongoing investment, a roadmap, feedback loops, and versioning — just like the software they describe.
+2. **The best reference architectures are executable.** A reference architecture that exists only as a PDF or wiki page will drift from reality. The gold standard is architecture-as-code that can be deployed, tested, and validated automatically.
+3. **Opinions are a feature, not a bug.** The value of a reference architecture is proportional to the number of decisions it makes for you. Vague guidance that says "choose an appropriate database" is less useful than specific guidance that says "use PostgreSQL for OLTP workloads with these configurations."
+4. **Multiple audiences require multiple layers.** A single 50-page document doesn't serve executives and developers equally. Layer the architecture: context for executives, component models for architects, deployment templates for developers, runbooks for operations.
+5. **Golden paths operationalise reference architectures.** A reference architecture describes what to build; a golden path makes it easy to build it. The most effective organisations pair every reference architecture with a corresponding Backstage template or scaffolding tool.
+6. **Fitness functions close the governance loop.** Without automated validation, reference architecture conformance degrades over time. Fitness functions turn architectural principles into executable tests.
+7. **Document decisions, not just designs.** The rationale behind architectural choices is often more valuable than the choices themselves, because it enables future teams to understand when the architecture should evolve.
+---
+## Sources
+### Cloud Provider Reference Architectures
+- [AWS Architecture Center](https://aws.amazon.com/architecture/)
+- [AWS Reference Architecture Diagrams](https://aws.amazon.com/architecture/reference-architecture-diagrams/)
+- [AWS Solutions Constructs](https://aws.amazon.com/blogs/aws/aws-solutions-constructs-a-library-of-architecture-patterns-for-the-aws-cdk/)
+- [AWS Well-Architected Framework](https://docs.aws.amazon.com/wellarchitected/latest/framework/welcome.html)
+- [Azure Architecture Center](https://learn.microsoft.com/en-us/azure/architecture/)
+- [Azure Best Practices](https://learn.microsoft.com/en-us/azure/architecture/best-practices/index-best-practices)
+- [Google Cloud Architecture Center](https://docs.cloud.google.com/architecture)
+- [Google Cloud Well-Architected Framework](https://docs.cloud.google.com/architecture/framework)
+### TOGAF and Enterprise Architecture
+- [TOGAF Architecture Principles](https://pubs.opengroup.org/architecture/togaf9-doc/arch/chap20.html)
+- [TOGAF Enterprise Continuum](https://pubs.opengroup.org/architecture/togaf9-doc/arch/chap35.html)
+- [TOGAF Architecture Review Board](https://pubs.opengroup.org/architecture/togaf8-doc/arch/chap23.html)
+- [What is a Reference Architecture? — Conexiam](https://conexiam.com/what-is-a-reference-architecture/)
+### Golden Paths and Platform Engineering
+- [How We Use Golden Paths to Solve Fragmentation — Spotify Engineering](https://engineering.atspotify.com/2020/08/how-we-use-golden-paths-to-solve-fragmentation-in-our-software-ecosystem)
+- [How Spotify Leverages Paved Paths — InfoQ](https://www.infoq.com/news/2021/03/spotify-paved-paths/)
+- [What is a Golden Path? — Red Hat](https://www.redhat.com/en/topics/platform-engineering/golden-paths)
+- [Designing Golden Paths — Red Hat](https://www.redhat.com/en/blog/designing-golden-paths)
+- [Engineering Platforms and Golden Paths — ThoughtWorks](https://www.thoughtworks.com/en-us/insights/podcasts/technology-podcasts/engineering-platforms-golden-paths-building-better-developer-experiences)
+- [Platform Engineering: Rebuilding the Core — ThoughtWorks](https://www.thoughtworks.com/insights/blog/platforms/platform-engineering--rebuilding-the-core-for-developer-effectiv)
+- [Backstage Software Templates — Spotify](https://backstage.spotify.com/learn/onboarding-software-to-backstage/setting-up-software-templates/11-spotify-templates/)
+- [How to Build Golden Paths Developers Will Actually Use — Jellyfish](https://jellyfish.co/library/platform-engineering/golden-paths/)
+- [Golden Paths in Platform Engineering — Cyclops](https://cyclops-ui.com/blog/2025/04/24/golden-paths/)
+### Architecture Documentation Templates
+- [arc42 Template Overview](https://arc42.org/overview)
+- [arc42 Documentation](https://docs.arc42.org/home/)
+- [arc42 + C4 Example — bitsmuggler](https://bitsmuggler.github.io/arc42-c4-software-architecture-documentation-example/)
+- [Effective Architecture Documentation with arc42 and C4 — Torsten Mosis](https://www.linkedin.com/pulse/effective-architecture-documentation-arc42-c4-torsten-mosis)
+- [The Ultimate Guide to Software Architecture Documentation](https://www.workingsoftware.dev/software-architecture-documentation-the-ultimate-guide/)
+### Architecture as Code and Machine-Readable Formats
+- [Architecture as Code — aac.muthub.org](https://aac.muthub.org/)
+- [Introduction to Architecture as Code — aac.geon.se](https://aac.geon.se/01_introduction/)
+- [Getting Started with Architecture as Code — TechTarget](https://www.techtarget.com/searchapparchitecture/tip/Getting-started-with-architecture-as-code)
+- [Structurizr](https://structurizr.com/)
+- [Structurizr DSL](https://docs.structurizr.com/dsl)
+- [Architecture Decision Records — Joel Parker Henderson](https://github.com/joelparkerhenderson/architecture-decision-record)
+### Fitness Functions and Automated Governance
+- [Fitness Functions — Continuous Architecture](https://continuous-architecture.org/practices/fitness-functions/)
+- [Fitness Functions for Your Architecture — InfoQ](https://www.infoq.com/articles/fitness-functions-architecture/)
+- [Fitness Functions — ThoughtWorks](https://www.thoughtworks.com/insights/decoder/f/fitness-functions)
+- [Automated Architecture Compliance Using Fitness Functions — LinkedIn](https://www.linkedin.com/pulse/automated-architecture-compliance-assessments-leveraging-bayan)
+- [Governing Data Products Using Fitness Functions — Martin Fowler](https://martinfowler.com/articles/fitness-functions-data-products.html)
+### Architecture Review and Governance
+- [Build and Operate an Effective ARB — AWS Architecture Blog](https://aws.amazon.com/blogs/architecture/build-and-operate-an-effective-architecture-review-board/)
+- [Architecture Review Board — LeanIX](https://www.leanix.net/en/wiki/ea/architecture-review-board)
+- [Architecture Review Board Checklist — Hava](https://www.hava.io/blog/architecture-review-board-checklist)
+- [What Is an Enterprise Architecture Review Board? — Conexiam](https://conexiam.com/what-is-an-enterprise-architecture-review-board/)
+### Quality and Evaluation
+- [Reference Architecture — ScienceDirect](https://www.sciencedirect.com/topics/computer-science/reference-architecture)
+- [arc42 Quality Requirements (ISO 25010)](https://github.com/arc42/quality-requirements/blob/main/doc/iso-25010-definitions.txt)
+- [Software Architecture Quality Attributes — Syndicode](https://syndicode.com/blog/12-software-architecture-quality-attributes/)