onto-mcp 0.3.1 → 0.3.2

Files changed (52)
  1. package/.onto/authority/core-lexicon.yaml +1 -0
  2. package/.onto/domains/software-engineering/competency_qs.md +192 -63
  3. package/.onto/domains/software-engineering/concepts.md +67 -5
  4. package/.onto/domains/software-engineering/conciseness_rules.md +22 -2
  5. package/.onto/domains/software-engineering/dependency_rules.md +78 -8
  6. package/.onto/domains/software-engineering/domain_scope.md +181 -150
  7. package/.onto/domains/software-engineering/extension_cases.md +318 -542
  8. package/.onto/domains/software-engineering/logic_rules.md +75 -3
  9. package/.onto/domains/software-engineering/problem_framing_profile.md +29 -2
  10. package/.onto/domains/software-engineering/prompt_interface.md +122 -0
  11. package/.onto/domains/software-engineering/structure_spec.md +53 -4
  12. package/.onto/principles/llm-native-development-guideline.md +20 -0
  13. package/.onto/principles/productization-charter.md +6 -0
  14. package/.onto/processes/reconstruct/reconstruct-boundary-contract.md +278 -91
  15. package/.onto/processes/reconstruct/reconstruct-execution-ux-contract.md +45 -12
  16. package/.onto/processes/reconstruct/source-profile-contract.md +39 -6
  17. package/.onto/processes/reconstruct/top-level-concept-discovery-contract.md +387 -0
  18. package/.onto/processes/review/lens-registry.md +16 -0
  19. package/.onto/processes/shared/target-material-kind-contract.md +18 -2
  20. package/.onto/roles/axiology.md +7 -2
  21. package/AGENTS.md +3 -2
  22. package/README.md +39 -33
  23. package/dist/core-api/reconstruct-api.js +22 -5
  24. package/dist/core-api/review-api.js +1288 -533
  25. package/dist/core-runtime/cli/mock-review-unit-executor.js +17 -0
  26. package/dist/core-runtime/cli/review-invoke.js +23 -48
  27. package/dist/core-runtime/cli/run-review-prompt-execution.js +122 -0
  28. package/dist/core-runtime/path-boundary.js +58 -0
  29. package/dist/core-runtime/reconstruct/artifact-types.js +5 -0
  30. package/dist/core-runtime/reconstruct/materialize-preparation.js +54 -4
  31. package/dist/core-runtime/reconstruct/pipeline-execution-ledger.js +38 -2
  32. package/dist/core-runtime/reconstruct/post-seed-validation.js +13 -0
  33. package/dist/core-runtime/reconstruct/record.js +11 -0
  34. package/dist/core-runtime/reconstruct/run.js +1133 -26
  35. package/dist/core-runtime/reconstruct/seed-candidate-validation.js +29 -0
  36. package/dist/core-runtime/review/execution-plan-boundary.js +123 -0
  37. package/dist/core-runtime/review/materializers.js +8 -3
  38. package/dist/core-runtime/review/review-artifact-utils.js +15 -2
  39. package/dist/core-runtime/review/review-invocation-runner.js +604 -0
  40. package/dist/core-runtime/target-material-kind.js +43 -5
  41. package/dist/mcp/server.js +158 -39
  42. package/dist/mcp/tool-schemas.js +22 -2
  43. package/package.json +3 -1
  44. package/.onto/domains/llm-native-development/competency_qs.md +0 -430
  45. package/.onto/domains/llm-native-development/concepts.md +0 -242
  46. package/.onto/domains/llm-native-development/conciseness_rules.md +0 -163
  47. package/.onto/domains/llm-native-development/dependency_rules.md +0 -216
  48. package/.onto/domains/llm-native-development/domain_scope.md +0 -197
  49. package/.onto/domains/llm-native-development/extension_cases.md +0 -474
  50. package/.onto/domains/llm-native-development/logic_rules.md +0 -123
  51. package/.onto/domains/llm-native-development/prompt_interface.md +0 -49
  52. package/.onto/domains/llm-native-development/structure_spec.md +0 -245
@@ -1,183 +1,214 @@
1
1
  ---
2
- version: 3
3
- last_updated: "2026-03-31"
4
- source: bundled-domain-baseline
2
+ version: 8
3
+ last_updated: "2026-05-28"
4
+ source: zero-based-software-engineering-redesign
5
5
  status: established
6
6
  ---
7
7
 
8
- # Software Engineering Domain Domain Scope Definition
8
+ # Software Engineering Domain - Domain Scope Definition
9
9
 
10
- This is the reference document used by coverage to identify "what should exist but is missing."
11
- This domain applies when **reviewing** a software system.
10
+ This document is the coverage and axiology entrypoint for the `software-engineering`
11
+ domain. It defines what a software review must be able to notice.
12
12
 
13
- ## Major Sub-areas
14
-
15
- Classification axis: **concern** classified by the design concerns that a software system must address.
16
-
17
- Applicability markers:
18
- - **(required)**: Must be addressed in any software system review. Absence indicates a fundamental gap
19
- - **(when applicable)**: Address when the system's architecture includes the relevant pattern. Not addressing it when the pattern is absent is correct, not a gap
20
- - **(scale-dependent)**: Becomes required beyond a scale threshold. The threshold should be documented per sub-area
21
-
22
- ### Data & State
23
- - **Data Modeling** (required): entities, relationships, type definitions, schema design. Uber's Schemaless demonstrates schema-on-read (MySQL stores JSON blobs, schema evolution without migrations). Netflix EVCache handles 30M+ req/s, illustrating consistency vs availability trade-offs. Event Sourcing (Greg Young, EventStore) stores state as immutable event logs — Axon Framework implements this on the JVM with built-in CQRS. Data modeling must declare schema-on-write vs schema-on-read, as this determines migration strategy and consistency guarantees
24
- - **State Management** (required): state transitions, invariants, recovery paths, concurrency control. CQRS (Command Query Responsibility Segregation) separates read and write models — the write side enforces invariants via commands, the read side serves optimized projections. Saga patterns (choreography vs orchestration) coordinate distributed state changes across service boundaries without distributed transactions
25
- - **Event Sourcing** (when applicable): event storage, state reconstruction, projections, terminal states, partial commit prevention. Event Store (eventstore.com) provides a purpose-built database for event sourcing with built-in projections and subscriptions
26
-
27
- ### Interface & Contract
28
- - **API Design** (required): public interfaces, versioning, contracts, backward compatibility. REST maturity is measured by Richardson's Maturity Model (L0: HTTP tunnel → L3: hypermedia/HATEOAS). GraphQL (Facebook, 2015) lets clients specify exact data shapes, solving over/under-fetching. gRPC (Google) uses Protocol Buffers for schema-first, strongly-typed RPC with HTTP/2 multiplexing. Stripe's rolling API versioning pins each customer to their integration version, avoiding the "upgrade cliff." OpenAPI/Swagger provides machine-readable specs enabling automated client generation and contract testing
29
- - **Type System** (required): discriminated union, exhaustive check, type-level safety mechanisms. TypeScript's discriminated unions with exhaustive switch/case checking eliminate an entire class of runtime errors at compile time. Rust's `Result<T, E>` and `Option<T>` force callers to handle both success and failure paths (see also: concepts.md §Type Safety Mechanisms)
30
- - **Error Handling** (required): error classification, recovery strategies, fallback paths, user guidance. Error classification should distinguish operational errors (expected, recoverable: network timeout, validation failure) from programmer errors (unexpected, non-recoverable: null dereference, assertion violation). Circuit breaker patterns (Netflix Hystrix, Resilience4j) prevent cascading failures by failing fast when downstream services are unhealthy
31
- - **Requirements & Specification** (when applicable): functional and non-functional requirements capture, acceptance criteria, traceability from requirements to implementation. IEEE 830 (SRS) provides a standard format. Requirements must be testable — each requirement should map to at least one verification method. Non-functional requirements (performance, security, availability) must be quantified. Applicable when formal requirements management is practiced; implicit for small-team projects with clear verbal agreements
13
+ `software-engineering` is the canonical domain for conventional software engineering,
14
+ AI-assisted development, and LLM-powered product/runtime behavior. The former
15
+ `llm-native-development` domain is a compatibility alias only. A reviewer should not
16
+ run a second domain review to cover AI behavior; this domain activates AI-era concerns
17
+ when the target uses LLMs, agents, model providers, prompt/context contracts, retrieval,
18
+ semantic evaluation, AI-assisted workflows, or tool-call boundaries.
32
19
 
33
- ### Security & Auth
34
- - **Authentication/Authorization** (required): user identification, permission systems, access control. OAuth 2.0/OIDC are the industry standards for delegated auth. JWT enables stateless authentication but requires careful token expiration, refresh rotation, and revocation handling. RBAC vs ABAC represent different authorization models — RBAC assigns permissions to roles, ABAC evaluates policies against attributes at runtime
35
- - **Security** (required): input validation, injection prevention, data encryption, supply chain security. OWASP Top 10 (2021) classifies critical risks: A01 Broken Access Control through A10 SSRF. Log4Shell (CVE-2021-44228) demonstrated supply chain risk — a single Log4j vulnerability compromised millions of systems. The SolarWinds attack (2020) showed compromised build pipelines can inject malicious code into trusted updates, affecting 18,000+ organizations
20
+ The domain is not a MECE taxonomy. It is a lens-usable concern map. A concern is strong
21
+ enough for active scope when it can support:
36
22
 
37
- ### Verification & Quality
38
- - **Test Strategy** (required): unit/integration/E2E coverage, verification criteria, test data management. Google's Testing Pyramid: many unit tests (fast, isolated), fewer integration tests (service boundaries), minimal E2E tests (slow, brittle). Fowler distinguishes sociable tests (real collaborators) from solitary tests (test doubles). Property-based testing (QuickCheck, Hypothesis, fast-check) generates random inputs to discover edge cases. Mutation testing (Stryker, PIT) validates test suite effectiveness by checking that tests catch injected mutations (see also: structure_spec.md §Verification Structure)
39
- - **Verification Design** (when applicable): verification timing (shift-left), structural vs semantic verification, specification requirements when delegating to AI agents. Netflix Chaos Engineering (Chaos Monkey, 2011) injects failures in production to verify resilience — accepting failure as inevitable and testing recovery paths
40
- - **Performance** (scale-dependent): response time, throughput, caching strategy. Load testing tools (k6, Gatling, Locust) simulate concurrent users to identify bottlenecks before production. Key metrics: p50/p95/p99 latency, throughput (RPS), error rate under load
23
+ ```text
24
+ lens perspective -> principle -> case evidence -> actionable guideline -> CQ
25
+ ```
41
26
 
42
- ### Structure & Architecture
43
- - **Module Separation** (required): layer structure, dependency direction, separation of concerns. Hexagonal Architecture (Cockburn, 2005) isolates domain logic from infrastructure through ports and adapters, making the core testable without databases or HTTP. Clean Architecture (Robert C. Martin) enforces the Dependency Rule: dependencies point only inward. Domain-Driven Design (Evans, 2003) organizes code around bounded contexts. Microservices (Sam Newman) decompose systems into independently deployable services, trading operational complexity for deployment independence
44
- - **Data Flow** (required): input-to-processing-to-output paths, transformation chains, source of truth designation. Pipe-and-filter architecture (Unix philosophy) composes systems from small, focused transformations. Event-driven architecture uses event buses (Kafka, RabbitMQ) to decouple producers from consumers
45
- - **Event/Messaging** (when applicable): message queues, asynchronous processing, pipeline scalability. Apache Kafka provides durable, partitioned event logs enabling replay and exactly-once semantics. Message delivery guarantees (at-most-once, at-least-once, exactly-once) have fundamental trade-offs with latency and complexity
27
+ ## Domain Purpose
46
28
 
47
- ### Operations, Deployment & Maintenance
48
- - **Deployment/Operations** (scale-dependent): CI/CD, environment separation, monitoring, logging. The 12-Factor App defines 12 principles for cloud-native applications (codebase, dependencies, config, backing services, build/release/run, processes, port binding, concurrency, disposability, dev/prod parity, logs, admin processes). GitOps (Weaveworks) uses Git as the single source of truth for declarative infrastructure. Feature flags (LaunchDarkly, Unleash) decouple deployment from release, enabling progressive rollouts without redeployment. Google SRE defines error budgets, SLIs/SLOs/SLAs, and toil reduction. DORA metrics (Deployment Frequency, Lead Time, Change Failure Rate, Time to Restore) measure delivery performance
49
- - **Internationalization/Accessibility** (scale-dependent): multi-language, time zones, accessibility standards. **Applicability conditions**: Required when (1) the system serves users in multiple locales, (2) the system is subject to accessibility regulations (ADA, EAA, EN 301 549), or (3) the system is public-facing web/mobile. Not applicable for internal tools with a single-locale user base unless regulatory requirements mandate it. WCAG 2.1 defines A/AA/AAA conformance levels. ICU handles locale-aware formatting, collation, and transliteration. See concepts.md §Internationalization/Accessibility Terms for definitions
50
- - **Maintenance** (when applicable): corrective maintenance (fixing defects discovered after delivery), adaptive maintenance (accommodating environment changes — OS upgrades, dependency updates, regulatory changes), perfective maintenance (improving performance/maintainability based on user feedback), preventive maintenance (refactoring to prevent anticipated problems). IEEE 14764 classifies these four categories. Technical debt management maps to preventive maintenance. Corrective/adaptive are reactive; perfective/preventive are proactive. See concepts.md §Change Management Terms for related terminology
29
+ Software engineering review should detect whether a software system can be understood,
30
+ changed, verified, operated, trusted, and retired without hiding the authority, evidence,
31
+ or value tradeoffs that make the system work.
51
32
 
52
- ### Documentation & Consumers
53
- - **Document Design** (when applicable): dual-consumer handling for AI agents and humans, separation of contract documents vs guide documents, separation of information structure and rendering. Diátaxis classifies documentation into 4 types: tutorials (learning), how-to guides (task), reference (information), explanation (understanding). ADRs (Michael Nygard) capture the "why" behind architectural choices in structured format (context, decision, consequences). API-first design ensures contracts are defined before implementation
54
- - **Constraint Design** (when applicable): hard/soft constraint classification, invariant vs best-effort boundary, pre-inclusion vs post-verification. Hard constraints: invariants that must never be violated (data integrity, security). Soft constraints: preferences relaxed under pressure (response time targets, cache hit ratios)
33
+ The domain covers the full lifecycle:
55
34
 
56
- ## Normative System Classification
35
+ 1. acquisition and supply
36
+ 2. requirements and design
37
+ 3. implementation
38
+ 4. verification and release
39
+ 5. operation and incident response
40
+ 6. maintenance and evolution
41
+ 7. decommissioning and retirement
57
42
 
58
- Standards governing software engineering operate at three distinct layers. Each layer has different enforcement mechanisms and change velocity.
43
+ LLM-native artifacts inherit this lifecycle. Prompts, context assembly rules, tool
44
+ schemas, retrieval indexes, model/provider routes, eval rubrics, agent instructions, and
45
+ generated authority artifacts are behavior-affecting software artifacts.
59
46
 
60
- | Layer | Scope | Enforcement | Change Velocity | Examples |
61
- |---|---|---|---|---|
62
- | Layer 1 — Language/Runtime | Syntax, semantics, built-in APIs | Compiler/interpreter rejection | Slow (years) | ECMAScript, JLS, Go Spec, Rust Reference |
63
- | Layer 2 — Framework/Library | Conventions imposed by frameworks | Runtime errors, lint rules | Medium (quarterly) | React hooks rules, Spring IoC, Rails conventions |
64
- | Layer 3 — Industry/Organization | Cross-technology principles | Code review, audit | Fast (per incident) | OWASP Top 10, 12-Factor, SOLID, DDD patterns |
47
+ ## Axiology Input
65
48
 
66
- **Why this matters**: A review that cites only Layer 3 principles without checking Layer 1/2 conformance misses concrete, enforceable violations. Conversely, a review that only checks Layer 1/2 conformance without Layer 3 assessment misses systemic design issues.
49
+ Axiology uses this domain input to identify value conflicts; it does not treat these
50
+ statements as predetermined conclusions.
67
51
 
68
- ## Required Concept Categories
69
-
70
- These are concept categories that must be addressed in any software system.
71
-
72
- | Category | Description | Risk if Missing | Example of Failure |
73
- |---|---|---|---|
74
- | Happy path | Normal behavior for expected inputs | Incomplete functional definition | API returns 200 but response body format is unspecified |
75
- | Error path | Handling of abnormal inputs/states | Defenseless during failures | Unhandled exception crashes process; user sees raw stack trace |
76
- | Boundary condition | Min/max values, empty inputs, concurrent access | Edge case failures | Integer overflow in payment calculation; empty array dereference |
77
- | Lifecycle | Creation use disposal, state transitions | Resource leaks, zombie objects | Database connection pool exhaustion from unclosed connections |
78
- | Traceability | Change rationale, decision justification, audit trail | Unmaintainable | 3-year-old conditional with no comment or commit message explaining why it exists |
79
- | Source of truth | The authoritative data/definition source when inconsistencies arise | Unable to resolve information conflicts | User profile cached in 3 services diverges; no system is designated authoritative |
80
- | Concurrency | Thread safety, race conditions, deadlock prevention | Data corruption under load | Two threads updating the same balance without locking; lost update |
81
- | Idempotency | Operations that produce the same result when executed multiple times | Duplicate side effects | Payment charged twice because retry logic doesn't check for existing transaction |
82
- | Observability | Logging, metrics, and tracing for runtime behavior | Silent failures, undiagnosable production issues | Error rate spikes but no logs indicate which endpoint or upstream service is responsible |
83
-
84
- ## Reference Standards/Frameworks
85
-
86
- | Standard | Application Area | Usage | When to Apply |
87
- |---|---|---|---|
88
- | OWASP Top 10 (2021) | Security | Web application security vulnerability classification | Every web-facing system; review checklist for security |
89
- | 12-Factor App | Deployment/Operations | Cloud-native application design principles | Cloud-deployed services; SaaS architecture review |
90
- | REST Maturity Model (Richardson) | API Design | API design level assessment (L0–L3) | Reviewing or designing REST APIs |
91
- | SOLID Principles | Module separation, type system | Five principles of object-oriented design | Object-oriented codebases; module boundary review |
92
- | Event Sourcing Pattern | Event Sourcing | Event storage, state reconstruction, projections | Systems requiring full audit trail or temporal queries |
93
- | Diátaxis Framework | Document Design | Tutorial/How-to/Reference/Explanation 4-way classification | Documentation review or creation |
94
- | IEEE 830 (SRS) | Requirements | Software Requirements Specification format | Formal requirements documentation |
95
- | ISO 25010 | Quality Model | Software quality characteristics (8 characteristics, 31 sub-characteristics) | System quality attribute assessment |
96
- | TOGAF | Architecture | Enterprise architecture framework and ADM (Architecture Development Method) | Enterprise-scale architecture decisions |
97
- | C4 Model (Simon Brown) | Architecture | 4-level architecture diagramming (Context, Container, Component, Code) | Architecture documentation and communication |
98
- | Arc42 | Architecture | Pragmatic architecture documentation template (12 sections) | Documenting system architecture decisions |
99
- | Domain-Driven Design (Eric Evans) | Structure & Architecture | Bounded contexts, aggregates, ubiquitous language | Complex domains with rich business logic |
100
- | Google SRE Workbook | Operations | Error budgets, SLIs/SLOs, incident management | Production systems requiring reliability guarantees |
101
- | IEEE 14764 | Maintenance | Software maintenance process and classification (corrective/adaptive/perfective/preventive) | Systems with ongoing maintenance operations |
102
- | WCAG 2.1 | Accessibility | Web Content Accessibility Guidelines (A/AA/AAA conformance levels) | Public-facing web applications; legally mandated accessibility |
103
-
104
- ## Bias Detection Criteria
105
-
106
- - If ⌈N/2.5⌉ or more of the Major Sub-areas (§Major Sub-areas) are not represented at all → **insufficient coverage**. **N** = the number of `###` subsections under §Major Sub-areas (count at review time — do not hard-code). At review time the reviewer counts the current `###` headings under §Major Sub-areas and computes the threshold from N, so adding or removing a sub-area updates the threshold automatically without editing this rule
107
- - If concepts from a specific area account for more than 70% of the total → **area bias**
108
- - If only the happy path is defined with no error path → **path bias**
109
- - If creation/use is defined but disposal/cleanup is missing → **incomplete lifecycle**
110
- - If 2 or more data sources lack a designated source of truth → **undesignated authority**
111
- - If the document design area is missing in a system where AI agents are consumers/executors → **missing consumer perspective**
112
- - If only synchronous request-response is considered with no async patterns (queues, events, callbacks) → **concurrency blindness**. Production systems require async for resilience and scalability
113
- - If only server-side logic is addressed with no client-side considerations → **deployment bias**. Full-stack systems need design decisions on both sides of the network boundary
114
- - If testing covers only unit tests with no integration or E2E strategy → **test level bias**. Each level catches different defect classes; missing any level leaves a verification gap (see also: structure_spec.md §Verification Structure)
115
- - If security addresses only authentication without authorization → **security scope bias**. Auth without authz means every authenticated user has full access (OWASP A01)
116
- - If only read operations are designed with no write/mutation considerations → **operation bias**. Ignoring write paths produces systems that fail under mutation load
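
Reviewer note on the removed "insufficient coverage" rule above: the ⌈N/2.5⌉ threshold can be sketched as below. This is a minimal illustration, not code from the package. The function names are hypothetical, and the heading-counting logic assumes that sub-areas are exactly the `###` headings under the `## Major Sub-areas` section, as the rule text states.

```python
def count_subareas(markdown: str, section: str = "Major Sub-areas") -> int:
    """Count `###` headings that sit under the given `##` section (hypothetical helper)."""
    count = 0
    inside = False
    for line in markdown.splitlines():
        if line.startswith("## "):
            # A new level-2 heading opens or closes the section of interest.
            inside = line[3:].strip() == section
        elif inside and line.startswith("### "):
            count += 1
    return count


def coverage_threshold(n_subareas: int) -> int:
    """Return ceil(N / 2.5), computed as ceil(2N / 5) in integer arithmetic."""
    if n_subareas < 0:
        raise ValueError("n_subareas must be non-negative")
    return (2 * n_subareas + 4) // 5


def is_insufficient_coverage(missing: int, n_subareas: int) -> bool:
    """The rule fires when the number of unrepresented sub-areas reaches the threshold."""
    return missing >= coverage_threshold(n_subareas)
```

Because the threshold is derived from the counted headings, adding or removing a sub-area changes it without editing the rule. With the seven `###` headings visible in the removed text, the threshold would be 3.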
52
+ | Value commitment | Review signal |
53
+ |---|---|
54
+ | Diagnosability over false smoothness | Silent fallback, hidden repair, or unmarked degradation is suspect when it hides the failing boundary |
55
+ | Artifact truth over response truth | Public responses may summarize durable artifacts but must not become the authority seat |
56
+ | Accountability over automation theater | Agent autonomy must preserve owner, approval, audit, and recovery paths |
57
+ | Evidence over plausibility | Generated or retrieved claims need provenance when they affect trust, release, or user decisions |
58
+ | Explicit loss over invisible degradation | User-facing degradation is acceptable only when capability loss, trust status, diagnostics, and recovery are visible |
59
+ | Least agency over broad capability | Agent functionality, permissions, and autonomy must be minimized separately |
60
+ | Governance as engineering material | AI risk ownership, approval gates, incident disclosure, and continuous improvement are reviewable engineering concerns |
61
+ | Accessibility and user agency | Speed or productivity claims do not justify excluding affected users or hiding control from operators |
117
62
 
118
- ## Inter-Document Contract
63
+ ## Top-Down Concern Stack
119
64
 
120
- This section declares which file owns which cross-cutting topic, preventing rule duplication and phantom references.
65
+ Reviewers should reason from the top down before diving into local implementation detail.
121
66
 
122
- ### Rule Ownership
123
-
124
- | Cross-cutting Topic | Owner File | Other Files |
67
+ | Layer | What to look for | Primary lens consumers |
125
68
  |---|---|---|
126
- | Dependency direction rules | dependency_rules.md | structure_spec.md (references only) |
127
- | Backward compatibility classification | dependency_rules.md | logic_rules.md (references only) |
128
- | Concept definitions | concepts.md | All other files reference, do not redefine |
129
- | Structural coherence rules | structure_spec.md §Golden Relationships | Other files reference |
130
- | Conciseness criteria | conciseness_rules.md | Other files reference |
131
- | Competency questions | competency_qs.md | Other files provide inference path targets |
132
- | Error handling design | logic_rules.md §Error Handling Logic | dependency_rules.md (cascading failure mechanisms) |
133
- | Performance optimization rules | logic_rules.md §Performance Logic | structure_spec.md (thresholds), domain_scope.md (SLOs) |
134
-
135
- ### Required Substance per Sub-area
69
+ | Purpose and value | Stakeholder promises, harms, accountability, tradeoffs, non-negotiable constraints | axiology, coverage, pragmatics |
70
+ | Lifecycle and governance | Acquisition/supply, approval gates, risk ownership, incident response, retirement | coverage, axiology, evolution |
71
+ | Architecture and state | Modules, interfaces, state transitions, source of truth, concurrency, data flow | structure, logic, dependency |
72
+ | Contract and dependency truth | Types, schemas, APIs, package dependencies, provider/model/tool/corpus dependencies | dependency, logic, semantics |
73
+ | Verification and operations | Tests, static checks, semantic eval, red-team evidence, release gates, observability, drift detection | pragmatics, coverage, evolution |
74
+ | LLM-native behavior controls | Prompt/context contracts, output zero-trust, retrieval boundaries, agent agency, model routing, failure posture | logic, structure, dependency, axiology |
75
+ | Case evidence and CQ library | Case-backed guideline cards and PASS/FAIL review questions | pragmatics, evolution, coverage |
136
76
 
137
- Each sub-area declared in Major Sub-areas must have corresponding substance in at least one of:
138
- - concepts.md: term definitions
139
- - logic_rules.md or structure_spec.md or dependency_rules.md: operational rules
140
- - competency_qs.md: verification questions
77
+ ## Major Sub-areas
141
78
 
142
- A sub-area with declaration but no substance in any file is a "ghost sub-area" and must either be populated or annotated with applicability conditions.
79
+ Applicability markers:
143
80
 
144
- ### Cross-cutting Concern Attribution
81
+ - **required**: must be addressed in every software review.
82
+ - **when applicable**: required when the target uses the relevant pattern.
83
+ - **scale-dependent**: required beyond a documented scale, exposure, or risk threshold.
145
84
 
146
- When a concern spans multiple sub-areas, attribute it to the sub-area where the concern has its **primary enforcement point**:
85
+ | Sub-area | Applicability | Review substance |
86
+ |---|---|---|
87
+ | Data and state | required | entities, schemas, source of truth, invariants, migrations, state transitions, consistency, retention/disposal |
88
+ | Interface and contract | required | APIs, types, schemas, versioning, requirements, acceptance criteria, backward compatibility |
89
+ | Error and failure posture | required | error taxonomy, fail-close gates, fail-loud diagnostics, recovery paths, user-facing loss markers |
90
+ | Security and authorization | required | authn/authz, input/output validation, injection prevention, secrets, privacy, supply chain, abuse boundaries |
91
+ | Verification and quality | required | unit/integration/E2E/static checks, semantic eval, quality attributes, release gates, measurable acceptance criteria |
92
+ | Architecture and structure | required | module/layer boundaries, dependency direction, state ownership, deployment topology, consumer surfaces |
93
+ | Operations and maintenance | required for operated systems | CI/CD, observability, incident response, SLOs, drift detection, maintenance classification, retirement |
94
+ | Documentation and consumers | when applicable | human/agent readers, contract vs guide docs, authority seats, diagrams, onboarding and handoff paths |
95
+ | LLM-native and agentic behavior | when applicable | model/provider routing, prompts, context, retrieval, tools, agents, eval, provenance, failure diagnostics |
96
+ | AI governance and risk | when applicable | risk owner, approval gate, human oversight, transparency, red-team loop, incident disclosure, continuous improvement |
97
+ | Accessibility and internationalization | scale-dependent | WCAG/current accessibility baseline, locale/time/currency/text-direction behavior, assistive technology support |
98
+
99
+ ## LLM-Native Activation Conditions
100
+
101
+ Activate LLM-native review concerns when any of the following is true:
102
+
103
+ - product behavior depends on a model call, agent loop, retrieval result, or generated output.
104
+ - development/review/release workflow depends on LLM-generated artifacts.
105
+ - prompt templates, tool schemas, eval rubrics, or model/provider routes influence behavior.
106
+ - external content enters model context through files, webpages, RAG, search, or user-provided text.
107
+ - model output can trigger tool calls, persistence, authority artifacts, user-visible decisions, or downstream sinks.
108
+
109
+ When activated, the target must address:
110
+
111
+ - LLM/runtime/middleware ownership split.
112
+ - output zero-trust and sink-specific validation.
113
+ - prompt injection and external-content authority limits.
114
+ - RAG/vector ingestion, permission, poisoning, provenance, and audit boundaries.
115
+ - agent functionality, permission, and autonomy minimization.
116
+ - semantic evaluation and production drift monitoring.
117
+ - fail-loud diagnostics for development/review/authority paths.
118
+ - explicit degraded-state behavior for product paths.
119
+ - AI governance/risk ownership when behavior can materially affect users, operators, security, or release decisions.
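
Reviewer note on the activation conditions above: they form an any-of check, so one matching condition is enough to bring the LLM-native concerns into scope. The sketch below is a hedged illustration only. `TargetSignals`, its field names, and both functions are hypothetical and do not come from the package; each field paraphrases one bullet of the activation list.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class TargetSignals:
    """Hypothetical flags, one per activation condition in the added text."""
    behavior_depends_on_model: bool = False            # model call, agent loop, retrieval, generated output
    workflow_uses_llm_artifacts: bool = False          # development/review/release depends on LLM artifacts
    prompt_or_route_influences_behavior: bool = False  # prompts, tool schemas, eval rubrics, provider routes
    external_content_enters_context: bool = False      # files, webpages, RAG, search, user-provided text
    model_output_reaches_sinks: bool = False           # tool calls, persistence, authority artifacts, decisions


def matched_conditions(signals: TargetSignals) -> list[str]:
    """Return the names of the conditions that are true, in declaration order."""
    return [f.name for f in fields(signals) if getattr(signals, f.name)]


def llm_native_review_active(signals: TargetSignals) -> bool:
    """Activate LLM-native review concerns when any single condition holds."""
    return bool(matched_conditions(signals))
```

Returning the matched condition names alongside the boolean keeps the activation decision traceable, which fits the document's emphasis on diagnosability.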
147
120
 
- 1. **Primary enforcement point**: The sub-area whose rules would be violated if the concern is not addressed. Example: input validation spans Interface & Contract (API design) and Security & Auth (injection prevention). Primary enforcement: Security & Auth, because the security consequence is the enforcement driver
- 2. **Secondary references**: Other sub-areas reference the primary sub-area's rules rather than duplicating them
- 3. **Tie-breaking**: If enforcement is equally distributed, attribute to the sub-area with fewer existing items (load balancing)
+ ## Required Concept Categories
 
- ### Classification Axis Relationships
+ | Category | Risk if missing | Example failure |
+ |---|---|---|
+ | Happy path | Functional intent is incomplete | API returns 200 but response body semantics are unspecified |
+ | Error path | Failures are defenseless or hidden | Runtime catches and ignores validation errors |
+ | Boundary condition | Edge cases break correctness | Empty input, max value, clock skew, race, or overflow is unhandled |
+ | Concurrency | Parallel or asynchronous access breaks safety | Race, deadlock, resource exhaustion, or ordering assumption corrupts state |
+ | Lifecycle | Resources or obligations survive past use | Data, feature flags, services, or AI indexes have no retirement path |
+ | Traceability | Review cannot explain why behavior changed | A prompt/provider change has no decision record or eval comparison |
+ | Source of truth | Conflicts cannot be resolved | Cache, DB, generated artifact, and public response disagree |
+ | Authority boundary | A layer silently takes over another layer's responsibility | Middleware repairs semantic meaning and becomes hidden policy authority |
+ | Observability | Operators cannot diagnose behavior | Model/tool failures lack prompt, route, schema, or artifact refs |
+ | Provenance | Evidence cannot be trusted | Retrieved or generated claims lack source and builder/agent trace |
+ | Agency boundary | Automation exceeds intended control | Agent can call high-impact tools without approval or least privilege |
+ | Semantic evaluation | Route success is mistaken for quality | Schema-valid model output is hallucinated or unfaithful |
+ | Governance path | Risk has no accountable owner | AI feature ships without approval gate, incident path, or red-team feedback loop |
+
+ ## Reference Standards and Frameworks
+
+ These anchors provide review signals, not checklist-compliance obligations unless the
+ target explicitly claims conformance.
+
+ | Anchor | Use in this domain |
+ |---|---|
+ | NIST AI RMF 1.0 | AI risk framing across design, development, deployment, and use |
+ | NIST AI 600-1 GenAI Profile | GenAI governance, provenance, testing, incident disclosure, red-team loop |
+ | ISO/IEC 42001 | AI management-system concepts: ownership, objectives, risk treatment, traceability, improvement |
+ | NIST SSDF SP 800-218 and SP 800-218A | Secure development evidence for software and AI/foundation-model artifacts |
+ | OWASP LLM Top 10 2025 | Prompt injection, output handling, excessive agency, vector/embedding, supply-chain and resource risks |
+ | SLSA | Artifact provenance and verification summaries for supply-chain trust |
+ | ISO/IEC/IEEE 12207:2026 | Full software lifecycle, including acquisition, supply, operation, support, maintenance, retirement |
+ | ISO/IEC 25010:2023 | Quality characteristics as requirements, design objectives, tests, acceptance criteria, and measures |
+ | ISO/IEC/IEEE 29148 | Requirements processes and information items; preferred over IEEE 830 for current requirements work |
+ | WCAG 2.2 / ISO/IEC 40500:2025 | Current accessibility baseline for web/mobile/user-facing interfaces |
+ | OWASP Top 10 | General web application security risks |
+ | 12-Factor App, SRE, DORA | Operational design, reliability, deployability, and delivery performance |
+ | DDD, C4, Arc42, Clean/Hexagonal Architecture | Architecture documentation and boundary reasoning |
 
- The domain files use three related concern axes — they are facets of the same domain, not independent classification systems:
+ ## Bias Detection Criteria
 
- | File | Axis | Facet |
- |---|---|---|
- | domain_scope.md | concern | What design concerns exist (scope) |
- | logic_rules.md | system construction concern | What concerns are governed by rules (rules) |
- | competency_qs.md | verification concern | What concerns must be verified (questions) |
+ - If too many major sub-areas are absent, flag **coverage bias**. Count current sub-areas at review time and treat absence of roughly 40% or more as a strong signal.
+ - If a review target includes LLM behavior but omits prompts, context, model/provider, tool, retrieval, eval, and governance concerns, flag **AI-era engineering blind spot**.
+ - If fallback, repair, or graceful degradation hides the failing prompt, model, schema, tool, retrieval, or artifact boundary, flag **silent degradation bias**.
+ - If LLM, runtime, and middleware responsibilities are not separated, flag **ownership boundary bias**.
+ - If model output is trusted because the prompt requested a format, flag **output trust bias**.
+ - If retrieved material can influence claims without source provenance and permission-aware retrieval, flag **retrieval authority bias**.
+ - If agent capability, permission, and autonomy are discussed as one undifferentiated knob, flag **agency compression bias**.
+ - If governance, approval, incident response, or human oversight is treated as external policy only, flag **governance externalization bias**.
+ - If only implementation is discussed and acquisition/supply/operation/retirement are absent, flag **lifecycle narrowing bias**.
+ - If only unit tests are discussed for behavior with integration, semantic, security, or operational risk, flag **verification level bias**.
+ - If accessibility is outdated, qualitative, or omitted for a public/user-facing system, flag **accessibility currency bias**.
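The coverage-bias criterion above is the only one with a numeric signal, so its arithmetic can be sketched in Python. This is an illustration only, not code from the package: `coverage_bias` and its parameters are hypothetical names, and the hard `0.4` cutoff is an assumption, since the source says "roughly 40%" and calls it a strong signal rather than a gate.

```python
from __future__ import annotations


def coverage_bias(
    expected: set[str],
    addressed: set[str],
    threshold: float = 0.4,  # assumption: the source's "roughly 40%" treated as a cutoff
) -> tuple[bool, float, list[str]]:
    """Return (flagged, absent_ratio, absent_sub_areas) for a review target.

    `expected` is the set of sub-areas counted at review time;
    `addressed` is the subset the target actually covers.
    """
    absent = sorted(expected - addressed)
    ratio = len(absent) / len(expected) if expected else 0.0
    return ratio >= threshold, ratio, absent


# Usage sketch with five sub-area names taken from the mapping table in this file.
expected = {
    "Data and state",
    "Interface and contract",
    "Security and authorization",
    "Verification and quality",
    "Operations and maintenance",
}
flagged, ratio, absent = coverage_bias(
    expected, {"Data and state", "Interface and contract"}
)
```

With three of five sub-areas absent, the ratio is 0.6 and the target would be flagged. A reviewer would still weigh the result, because the criterion is worded as a signal.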
 
- ### Sub-area to CQ Section Mapping
+ ## Inter-Document Contract
 
- | Sub-area | CQ Sections | Coverage |
+ | Topic | Owner file | Other files |
  |---|---|---|
- | Data & State | CQ-D (Data Flow) | Full |
- | Interface & Contract | CQ-I (Change Impact), CQ-E (Error Handling), CQ-R (Requirements) | Full |
- | Security & Auth | CQ-SE (Security) | Full |
- | Verification & Quality | CQ-V (Testing/Verification), CQ-P (Performance) | Full |
- | Structure & Architecture | CQ-S (Structural Understanding), CQ-M (Event/Messaging) | Full |
- | Operations, Deployment & Maintenance | CQ-O (Deployment/Operations), CQ-MT (Maintenance) | Full |
- | Documentation & Consumers | CQ-A (AI Agent Collaboration) | Partial (AI-focused) |
-
- Cross-cutting CQ sections (not mapped to a single sub-area):
- - CQ-T (Types and Constraints) spans Interface & Contract + Data & State
- - CQ-B (Boundary Conditions) — spans all sub-areas
- - CQ-C (Concurrency) spans Data & State + Structure & Architecture
- - CQ-DE (Dependencies) — spans Structure & Architecture + Operations
+ | Domain purpose, scope, value commitments, standards anchors | domain_scope.md | Other files reference |
+ | Domain-local concept definitions, domain projections, and homonym guards | concepts.md | Other files reference; onto/productization core concepts remain canonical in `.onto/authority/`, `.onto/principles/`, or `.onto/processes/` |
+ | Logical gates, contradiction rules, failure posture, trust rules | logic_rules.md | competency_qs.md asks; prompt_interface.md applies |
+ | Structural seats and required relationships | structure_spec.md | domain_scope.md activates; competency_qs.md verifies |
+ | Dependency direction, provider/model/tool/corpus dependencies, provenance dependencies | dependency_rules.md | structure_spec.md references |
+ | Prompt, role, context, tool, response, sink interface criteria | prompt_interface.md | logic_rules.md and structure_spec.md reference |
+ | Domain-specific CQs and PASS/FAIL criteria | competency_qs.md | All rule files provide inference paths |
+ | Case-backed guideline library | extension_cases.md | competency_qs.md can reuse CQ seeds |
+ | Concept economy and duplication policy | conciseness_rules.md | All files follow |
+ | Closure axes for issue stance and synthesize | problem_framing_profile.md | Review runtime consumes |
+
+ ## Sub-area to CQ Mapping
+
+ | Sub-area | CQ sections |
+ |---|---|
+ | Data and state | CQ-D, CQ-T, CQ-B, CQ-C, CQ-SE |
+ | Interface and contract | CQ-I, CQ-R, CQ-E |
+ | Error and failure posture | CQ-E, CQ-A, CQ-G |
+ | Security and authorization | CQ-SE, CQ-A, CQ-DE, CQ-D |
+ | Verification and quality | CQ-V, CQ-P, CQ-A, CQ-G |
+ | Architecture and structure | CQ-S, CQ-M, CQ-C, CQ-DE |
+ | Operations and maintenance | CQ-O, CQ-MT, CQ-G, CQ-DE |
+ | Documentation and consumers | CQ-A, CQ-R, CQ-S |
+ | LLM-native and agentic behavior | CQ-A, CQ-SE, CQ-DE, CQ-G |
+ | AI governance and risk | CQ-G, CQ-A, CQ-O |
+ | Accessibility and internationalization | CQ-R, CQ-V, CQ-G |
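The added mapping is many-to-many, so a reverse index is a quick way to see which sub-areas a given CQ section serves. The Python sketch below transcribes the table above into a dictionary. `SUB_AREA_TO_CQ` and `sub_areas_by_cq` are hypothetical names for illustration; the package itself stores this mapping only as markdown, as far as this diff shows.

```python
from __future__ import annotations

from collections import defaultdict

# Transcribed from the "Sub-area to CQ Mapping" table added in this version.
SUB_AREA_TO_CQ: dict[str, list[str]] = {
    "Data and state": ["CQ-D", "CQ-T", "CQ-B", "CQ-C", "CQ-SE"],
    "Interface and contract": ["CQ-I", "CQ-R", "CQ-E"],
    "Error and failure posture": ["CQ-E", "CQ-A", "CQ-G"],
    "Security and authorization": ["CQ-SE", "CQ-A", "CQ-DE", "CQ-D"],
    "Verification and quality": ["CQ-V", "CQ-P", "CQ-A", "CQ-G"],
    "Architecture and structure": ["CQ-S", "CQ-M", "CQ-C", "CQ-DE"],
    "Operations and maintenance": ["CQ-O", "CQ-MT", "CQ-G", "CQ-DE"],
    "Documentation and consumers": ["CQ-A", "CQ-R", "CQ-S"],
    "LLM-native and agentic behavior": ["CQ-A", "CQ-SE", "CQ-DE", "CQ-G"],
    "AI governance and risk": ["CQ-G", "CQ-A", "CQ-O"],
    "Accessibility and internationalization": ["CQ-R", "CQ-V", "CQ-G"],
}


def sub_areas_by_cq(mapping: dict[str, list[str]]) -> dict[str, list[str]]:
    """Invert sub-area -> CQ sections into CQ section -> sub-areas (table order kept)."""
    index: dict[str, list[str]] = defaultdict(list)
    for sub_area, sections in mapping.items():
        for cq in sections:
            index[cq].append(sub_area)
    return dict(index)


CQ_TO_SUB_AREAS = sub_areas_by_cq(SUB_AREA_TO_CQ)
```

In this transcription `CQ_TO_SUB_AREAS["CQ-T"]` contains only "Data and state", while `CQ-G` and `CQ-A` each appear under six sub-areas, which shows how widely the new mapping spreads those two sections.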
 
  ## Related Documents
- - concepts.md §Architecture Core Terms — definitions of terms within this scope
- - structure_spec.md §Required Module Structure Elements specific rules for module structure, test organization
- - competency_qs.md questions this scope must be able to answer
+
+ - concepts.md - canonical terms and homonym guards
+ - logic_rules.md - logical rules, trust gates, failure posture
+ - structure_spec.md - required structures and relationships
+ - dependency_rules.md - dependency and provenance rules
+ - prompt_interface.md - prompt, role, tool, context, output, and sink criteria
+ - competency_qs.md - domain competency questions
+ - extension_cases.md - case-backed guideline cards
+ - problem_framing_profile.md - closure axes for software-engineering review findings