onto-mcp 0.3.1 → 0.3.2
- package/.onto/authority/core-lexicon.yaml +1 -0
- package/.onto/domains/software-engineering/competency_qs.md +192 -63
- package/.onto/domains/software-engineering/concepts.md +67 -5
- package/.onto/domains/software-engineering/conciseness_rules.md +22 -2
- package/.onto/domains/software-engineering/dependency_rules.md +78 -8
- package/.onto/domains/software-engineering/domain_scope.md +181 -150
- package/.onto/domains/software-engineering/extension_cases.md +318 -542
- package/.onto/domains/software-engineering/logic_rules.md +75 -3
- package/.onto/domains/software-engineering/problem_framing_profile.md +29 -2
- package/.onto/domains/software-engineering/prompt_interface.md +122 -0
- package/.onto/domains/software-engineering/structure_spec.md +53 -4
- package/.onto/principles/llm-native-development-guideline.md +20 -0
- package/.onto/principles/productization-charter.md +6 -0
- package/.onto/processes/reconstruct/reconstruct-boundary-contract.md +278 -91
- package/.onto/processes/reconstruct/reconstruct-execution-ux-contract.md +45 -12
- package/.onto/processes/reconstruct/source-profile-contract.md +39 -6
- package/.onto/processes/reconstruct/top-level-concept-discovery-contract.md +387 -0
- package/.onto/processes/review/lens-registry.md +16 -0
- package/.onto/processes/shared/target-material-kind-contract.md +18 -2
- package/.onto/roles/axiology.md +7 -2
- package/AGENTS.md +3 -2
- package/README.md +39 -33
- package/dist/core-api/reconstruct-api.js +22 -5
- package/dist/core-api/review-api.js +1288 -533
- package/dist/core-runtime/cli/mock-review-unit-executor.js +17 -0
- package/dist/core-runtime/cli/review-invoke.js +23 -48
- package/dist/core-runtime/cli/run-review-prompt-execution.js +122 -0
- package/dist/core-runtime/path-boundary.js +58 -0
- package/dist/core-runtime/reconstruct/artifact-types.js +5 -0
- package/dist/core-runtime/reconstruct/materialize-preparation.js +54 -4
- package/dist/core-runtime/reconstruct/pipeline-execution-ledger.js +38 -2
- package/dist/core-runtime/reconstruct/post-seed-validation.js +13 -0
- package/dist/core-runtime/reconstruct/record.js +11 -0
- package/dist/core-runtime/reconstruct/run.js +1133 -26
- package/dist/core-runtime/reconstruct/seed-candidate-validation.js +29 -0
- package/dist/core-runtime/review/execution-plan-boundary.js +123 -0
- package/dist/core-runtime/review/materializers.js +8 -3
- package/dist/core-runtime/review/review-artifact-utils.js +15 -2
- package/dist/core-runtime/review/review-invocation-runner.js +604 -0
- package/dist/core-runtime/target-material-kind.js +43 -5
- package/dist/mcp/server.js +158 -39
- package/dist/mcp/tool-schemas.js +22 -2
- package/package.json +3 -1
- package/.onto/domains/llm-native-development/competency_qs.md +0 -430
- package/.onto/domains/llm-native-development/concepts.md +0 -242
- package/.onto/domains/llm-native-development/conciseness_rules.md +0 -163
- package/.onto/domains/llm-native-development/dependency_rules.md +0 -216
- package/.onto/domains/llm-native-development/domain_scope.md +0 -197
- package/.onto/domains/llm-native-development/extension_cases.md +0 -474
- package/.onto/domains/llm-native-development/logic_rules.md +0 -123
- package/.onto/domains/llm-native-development/prompt_interface.md +0 -49
- package/.onto/domains/llm-native-development/structure_spec.md +0 -245
@@ -1,183 +1,214 @@
 ---
-version:
-last_updated: "2026-
-source:
+version: 8
+last_updated: "2026-05-28"
+source: zero-based-software-engineering-redesign
 status: established
 ---
 
-# Software Engineering Domain
+# Software Engineering Domain - Domain Scope Definition
 
-This is the
-
+This document is the coverage and axiology entrypoint for the `software-engineering`
+domain. It defines what a software review must be able to notice.
 
-
-
-
-
-
-
-- **(when applicable)**: Address when the system's architecture includes the relevant pattern. Not addressing it when the pattern is absent is correct, not a gap
-- **(scale-dependent)**: Becomes required beyond a scale threshold. The threshold should be documented per sub-area
-
-### Data & State
-- **Data Modeling** (required): entities, relationships, type definitions, schema design. Uber's Schemaless demonstrates schema-on-read (MySQL stores JSON blobs, schema evolution without migrations). Netflix EVCache handles 30M+ req/s, illustrating consistency vs availability trade-offs. Event Sourcing (Greg Young, EventStore) stores state as immutable event logs — Axon Framework implements this on the JVM with built-in CQRS. Data modeling must declare schema-on-write vs schema-on-read, as this determines migration strategy and consistency guarantees
-- **State Management** (required): state transitions, invariants, recovery paths, concurrency control. CQRS (Command Query Responsibility Segregation) separates read and write models — the write side enforces invariants via commands, the read side serves optimized projections. Saga patterns (choreography vs orchestration) coordinate distributed state changes across service boundaries without distributed transactions
-- **Event Sourcing** (when applicable): event storage, state reconstruction, projections, terminal states, partial commit prevention. Event Store (eventstore.com) provides a purpose-built database for event sourcing with built-in projections and subscriptions
-
-### Interface & Contract
-- **API Design** (required): public interfaces, versioning, contracts, backward compatibility. REST maturity is measured by Richardson's Maturity Model (L0: HTTP tunnel → L3: hypermedia/HATEOAS). GraphQL (Facebook, 2015) lets clients specify exact data shapes, solving over/under-fetching. gRPC (Google) uses Protocol Buffers for schema-first, strongly-typed RPC with HTTP/2 multiplexing. Stripe's rolling API versioning pins each customer to their integration version, avoiding the "upgrade cliff." OpenAPI/Swagger provides machine-readable specs enabling automated client generation and contract testing
-- **Type System** (required): discriminated union, exhaustive check, type-level safety mechanisms. TypeScript's discriminated unions with exhaustive switch/case checking eliminate an entire class of runtime errors at compile time. Rust's `Result<T, E>` and `Option<T>` force callers to handle both success and failure paths (see also: concepts.md §Type Safety Mechanisms)
-- **Error Handling** (required): error classification, recovery strategies, fallback paths, user guidance. Error classification should distinguish operational errors (expected, recoverable: network timeout, validation failure) from programmer errors (unexpected, non-recoverable: null dereference, assertion violation). Circuit breaker patterns (Netflix Hystrix, Resilience4j) prevent cascading failures by failing fast when downstream services are unhealthy
-- **Requirements & Specification** (when applicable): functional and non-functional requirements capture, acceptance criteria, traceability from requirements to implementation. IEEE 830 (SRS) provides a standard format. Requirements must be testable — each requirement should map to at least one verification method. Non-functional requirements (performance, security, availability) must be quantified. Applicable when formal requirements management is practiced; implicit for small-team projects with clear verbal agreements
+`software-engineering` is the canonical domain for conventional software engineering,
+AI-assisted development, and LLM-powered product/runtime behavior. The former
+`llm-native-development` domain is a compatibility alias only. A reviewer should not
+run a second domain review to cover AI behavior; this domain activates AI-era concerns
+when the target uses LLMs, agents, model providers, prompt/context contracts, retrieval,
+semantic evaluation, AI-assisted workflows, or tool-call boundaries.
 
-
-
-- **Security** (required): input validation, injection prevention, data encryption, supply chain security. OWASP Top 10 (2021) classifies critical risks: A01 Broken Access Control through A10 SSRF. Log4Shell (CVE-2021-44228) demonstrated supply chain risk — a single Log4j vulnerability compromised millions of systems. The SolarWinds attack (2020) showed compromised build pipelines can inject malicious code into trusted updates, affecting 18,000+ organizations
+The domain is not a MECE taxonomy. It is a lens-usable concern map. A concern is strong
+enough for active scope when it can support:
 
-
-
-
-- **Performance** (scale-dependent): response time, throughput, caching strategy. Load testing tools (k6, Gatling, Locust) simulate concurrent users to identify bottlenecks before production. Key metrics: p50/p95/p99 latency, throughput (RPS), error rate under load
+```text
+lens perspective -> principle -> case evidence -> actionable guideline -> CQ
+```
 
-
-- **Module Separation** (required): layer structure, dependency direction, separation of concerns. Hexagonal Architecture (Cockburn, 2005) isolates domain logic from infrastructure through ports and adapters, making the core testable without databases or HTTP. Clean Architecture (Robert C. Martin) enforces the Dependency Rule: dependencies point only inward. Domain-Driven Design (Evans, 2003) organizes code around bounded contexts. Microservices (Sam Newman) decompose systems into independently deployable services, trading operational complexity for deployment independence
-- **Data Flow** (required): input-to-processing-to-output paths, transformation chains, source of truth designation. Pipe-and-filter architecture (Unix philosophy) composes systems from small, focused transformations. Event-driven architecture uses event buses (Kafka, RabbitMQ) to decouple producers from consumers
-- **Event/Messaging** (when applicable): message queues, asynchronous processing, pipeline scalability. Apache Kafka provides durable, partitioned event logs enabling replay and exactly-once semantics. Message delivery guarantees (at-most-once, at-least-once, exactly-once) have fundamental trade-offs with latency and complexity
+## Domain Purpose
 
-
-
-
-- **Maintenance** (when applicable): corrective maintenance (fixing defects discovered after delivery), adaptive maintenance (accommodating environment changes — OS upgrades, dependency updates, regulatory changes), perfective maintenance (improving performance/maintainability based on user feedback), preventive maintenance (refactoring to prevent anticipated problems). IEEE 14764 classifies these four categories. Technical debt management maps to preventive maintenance. Corrective/adaptive are reactive; perfective/preventive are proactive. See concepts.md §Change Management Terms for related terminology
+Software engineering review should detect whether a software system can be understood,
+changed, verified, operated, trusted, and retired without hiding the authority, evidence,
+or value tradeoffs that make the system work.
 
-
-- **Document Design** (when applicable): dual-consumer handling for AI agents and humans, separation of contract documents vs guide documents, separation of information structure and rendering. Diátaxis classifies documentation into 4 types: tutorials (learning), how-to guides (task), reference (information), explanation (understanding). ADRs (Michael Nygard) capture the "why" behind architectural choices in structured format (context, decision, consequences). API-first design ensures contracts are defined before implementation
-- **Constraint Design** (when applicable): hard/soft constraint classification, invariant vs best-effort boundary, pre-inclusion vs post-verification. Hard constraints: invariants that must never be violated (data integrity, security). Soft constraints: preferences relaxed under pressure (response time targets, cache hit ratios)
+The domain covers the full lifecycle:
 
-
+1. acquisition and supply
+2. requirements and design
+3. implementation
+4. verification and release
+5. operation and incident response
+6. maintenance and evolution
+7. decommissioning and retirement
 
-
+LLM-native artifacts inherit this lifecycle. Prompts, context assembly rules, tool
+schemas, retrieval indexes, model/provider routes, eval rubrics, agent instructions, and
+generated authority artifacts are behavior-affecting software artifacts.
 
-
-|---|---|---|---|---|
-| Layer 1 — Language/Runtime | Syntax, semantics, built-in APIs | Compiler/interpreter rejection | Slow (years) | ECMAScript, JLS, Go Spec, Rust Reference |
-| Layer 2 — Framework/Library | Conventions imposed by frameworks | Runtime errors, lint rules | Medium (quarterly) | React hooks rules, Spring IoC, Rails conventions |
-| Layer 3 — Industry/Organization | Cross-technology principles | Code review, audit | Fast (per incident) | OWASP Top 10, 12-Factor, SOLID, DDD patterns |
+## Axiology Input
 
-
+Axiology uses this domain input to identify value conflicts; it does not treat these
+statements as predetermined conclusions.
 
-
-
-
-
-|
-
-|
-|
-|
-|
-| Traceability | Change rationale, decision justification, audit trail | Unmaintainable | 3-year-old conditional with no comment or commit message explaining why it exists |
-| Source of truth | The authoritative data/definition source when inconsistencies arise | Unable to resolve information conflicts | User profile cached in 3 services diverges; no system is designated authoritative |
-| Concurrency | Thread safety, race conditions, deadlock prevention | Data corruption under load | Two threads updating the same balance without locking; lost update |
-| Idempotency | Operations that produce the same result when executed multiple times | Duplicate side effects | Payment charged twice because retry logic doesn't check for existing transaction |
-| Observability | Logging, metrics, and tracing for runtime behavior | Silent failures, undiagnosable production issues | Error rate spikes but no logs indicate which endpoint or upstream service is responsible |
-
-## Reference Standards/Frameworks
-
-| Standard | Application Area | Usage | When to Apply |
-|---|---|---|---|
-| OWASP Top 10 (2021) | Security | Web application security vulnerability classification | Every web-facing system; review checklist for security |
-| 12-Factor App | Deployment/Operations | Cloud-native application design principles | Cloud-deployed services; SaaS architecture review |
-| REST Maturity Model (Richardson) | API Design | API design level assessment (L0–L3) | Reviewing or designing REST APIs |
-| SOLID Principles | Module separation, type system | Five principles of object-oriented design | Object-oriented codebases; module boundary review |
-| Event Sourcing Pattern | Event Sourcing | Event storage, state reconstruction, projections | Systems requiring full audit trail or temporal queries |
-| Diátaxis Framework | Document Design | Tutorial/How-to/Reference/Explanation 4-way classification | Documentation review or creation |
-| IEEE 830 (SRS) | Requirements | Software Requirements Specification format | Formal requirements documentation |
-| ISO 25010 | Quality Model | Software quality characteristics (8 characteristics, 31 sub-characteristics) | System quality attribute assessment |
-| TOGAF | Architecture | Enterprise architecture framework and ADM (Architecture Development Method) | Enterprise-scale architecture decisions |
-| C4 Model (Simon Brown) | Architecture | 4-level architecture diagramming (Context, Container, Component, Code) | Architecture documentation and communication |
-| Arc42 | Architecture | Pragmatic architecture documentation template (12 sections) | Documenting system architecture decisions |
-| Domain-Driven Design (Eric Evans) | Structure & Architecture | Bounded contexts, aggregates, ubiquitous language | Complex domains with rich business logic |
-| Google SRE Workbook | Operations | Error budgets, SLIs/SLOs, incident management | Production systems requiring reliability guarantees |
-| IEEE 14764 | Maintenance | Software maintenance process and classification (corrective/adaptive/perfective/preventive) | Systems with ongoing maintenance operations |
-| WCAG 2.1 | Accessibility | Web Content Accessibility Guidelines (A/AA/AAA conformance levels) | Public-facing web applications; legally mandated accessibility |
-
-## Bias Detection Criteria
-
-- If ⌈N/2.5⌉ or more of the Major Sub-areas (§Major Sub-areas) are not represented at all → **insufficient coverage**. **N** = the number of `###` subsections under §Major Sub-areas (count at review time — do not hard-code). At review time the reviewer counts the current `###` headings under §Major Sub-areas and computes the threshold from N, so adding or removing a sub-area updates the threshold automatically without editing this rule
-- If concepts from a specific area account for more than 70% of the total → **area bias**
-- If only the happy path is defined with no error path → **path bias**
-- If creation/use is defined but disposal/cleanup is missing → **incomplete lifecycle**
-- If 2 or more data sources lack a designated source of truth → **undesignated authority**
-- If the document design area is missing in a system where AI agents are consumers/executors → **missing consumer perspective**
-- If only synchronous request-response is considered with no async patterns (queues, events, callbacks) → **concurrency blindness**. Production systems require async for resilience and scalability
-- If only server-side logic is addressed with no client-side considerations → **deployment bias**. Full-stack systems need design decisions on both sides of the network boundary
-- If testing covers only unit tests with no integration or E2E strategy → **test level bias**. Each level catches different defect classes; missing any level leaves a verification gap (see also: structure_spec.md §Verification Structure)
-- If security addresses only authentication without authorization → **security scope bias**. Auth without authz means every authenticated user has full access (OWASP A01)
-- If only read operations are designed with no write/mutation considerations → **operation bias**. Ignoring write paths produces systems that fail under mutation load
+| Value commitment | Review signal |
+|---|---|
+| Diagnosability over false smoothness | Silent fallback, hidden repair, or unmarked degradation is suspect when it hides the failing boundary |
+| Artifact truth over response truth | Public responses may summarize durable artifacts but must not become the authority seat |
+| Accountability over automation theater | Agent autonomy must preserve owner, approval, audit, and recovery paths |
+| Evidence over plausibility | Generated or retrieved claims need provenance when they affect trust, release, or user decisions |
+| Explicit loss over invisible degradation | User-facing degradation is acceptable only when capability loss, trust status, diagnostics, and recovery are visible |
+| Least agency over broad capability | Agent functionality, permissions, and autonomy must be minimized separately |
+| Governance as engineering material | AI risk ownership, approval gates, incident disclosure, and continuous improvement are reviewable engineering concerns |
+| Accessibility and user agency | Speed or productivity claims do not justify excluding affected users or hiding control from operators |
 
-##
+## Top-Down Concern Stack
 
-
+Reviewers should reason from the top down before diving into local implementation detail.
 
-
-
-| Cross-cutting Topic | Owner File | Other Files |
+| Layer | What to look for | Primary lens consumers |
 |---|---|---|
-|
-|
-|
-|
-|
-|
-|
-| Performance optimization rules | logic_rules.md §Performance Logic | structure_spec.md (thresholds), domain_scope.md (SLOs) |
-
-### Required Substance per Sub-area
+| Purpose and value | Stakeholder promises, harms, accountability, tradeoffs, non-negotiable constraints | axiology, coverage, pragmatics |
+| Lifecycle and governance | Acquisition/supply, approval gates, risk ownership, incident response, retirement | coverage, axiology, evolution |
+| Architecture and state | Modules, interfaces, state transitions, source of truth, concurrency, data flow | structure, logic, dependency |
+| Contract and dependency truth | Types, schemas, APIs, package dependencies, provider/model/tool/corpus dependencies | dependency, logic, semantics |
+| Verification and operations | Tests, static checks, semantic eval, red-team evidence, release gates, observability, drift detection | pragmatics, coverage, evolution |
+| LLM-native behavior controls | Prompt/context contracts, output zero-trust, retrieval boundaries, agent agency, model routing, failure posture | logic, structure, dependency, axiology |
+| Case evidence and CQ library | Case-backed guideline cards and PASS/FAIL review questions | pragmatics, evolution, coverage |
 
-
-- concepts.md: term definitions
-- logic_rules.md or structure_spec.md or dependency_rules.md: operational rules
-- competency_qs.md: verification questions
+## Major Sub-areas
 
-
+Applicability markers:
 
-
+- **required**: must be addressed in every software review.
+- **when applicable**: required when the target uses the relevant pattern.
+- **scale-dependent**: required beyond a documented scale, exposure, or risk threshold.
 
-
+| Sub-area | Applicability | Review substance |
+|---|---|---|
+| Data and state | required | entities, schemas, source of truth, invariants, migrations, state transitions, consistency, retention/disposal |
+| Interface and contract | required | APIs, types, schemas, versioning, requirements, acceptance criteria, backward compatibility |
+| Error and failure posture | required | error taxonomy, fail-close gates, fail-loud diagnostics, recovery paths, user-facing loss markers |
+| Security and authorization | required | authn/authz, input/output validation, injection prevention, secrets, privacy, supply chain, abuse boundaries |
+| Verification and quality | required | unit/integration/E2E/static checks, semantic eval, quality attributes, release gates, measurable acceptance criteria |
+| Architecture and structure | required | module/layer boundaries, dependency direction, state ownership, deployment topology, consumer surfaces |
+| Operations and maintenance | required for operated systems | CI/CD, observability, incident response, SLOs, drift detection, maintenance classification, retirement |
+| Documentation and consumers | when applicable | human/agent readers, contract vs guide docs, authority seats, diagrams, onboarding and handoff paths |
+| LLM-native and agentic behavior | when applicable | model/provider routing, prompts, context, retrieval, tools, agents, eval, provenance, failure diagnostics |
+| AI governance and risk | when applicable | risk owner, approval gate, human oversight, transparency, red-team loop, incident disclosure, continuous improvement |
+| Accessibility and internationalization | scale-dependent | WCAG/current accessibility baseline, locale/time/currency/text-direction behavior, assistive technology support |
+
+## LLM-Native Activation Conditions
+
+Activate LLM-native review concerns when any of the following is true:
+
+- product behavior depends on a model call, agent loop, retrieval result, or generated output.
+- development/review/release workflow depends on LLM-generated artifacts.
+- prompt templates, tool schemas, eval rubrics, or model/provider routes influence behavior.
+- external content enters model context through files, webpages, RAG, search, or user-provided text.
+- model output can trigger tool calls, persistence, authority artifacts, user-visible decisions, or downstream sinks.
+
+When activated, the target must address:
+
+- LLM/runtime/middleware ownership split.
+- output zero-trust and sink-specific validation.
+- prompt injection and external-content authority limits.
+- RAG/vector ingestion, permission, poisoning, provenance, and audit boundaries.
+- agent functionality, permission, and autonomy minimization.
+- semantic evaluation and production drift monitoring.
+- fail-loud diagnostics for development/review/authority paths.
+- explicit degraded-state behavior for product paths.
+- AI governance/risk ownership when behavior can materially affect users, operators, security, or release decisions.
 
-
-2. **Secondary references**: Other sub-areas reference the primary sub-area's rules rather than duplicating them
-3. **Tie-breaking**: If enforcement is equally distributed, attribute to the sub-area with fewer existing items (load balancing)
+## Required Concept Categories
 
-
+| Category | Risk if missing | Example failure |
+|---|---|---|
+| Happy path | Functional intent is incomplete | API returns 200 but response body semantics are unspecified |
+| Error path | Failures are defenseless or hidden | Runtime catches and ignores validation errors |
+| Boundary condition | Edge cases break correctness | Empty input, max value, clock skew, race, or overflow is unhandled |
+| Concurrency | Parallel or asynchronous access breaks safety | Race, deadlock, resource exhaustion, or ordering assumption corrupts state |
+| Lifecycle | Resources or obligations survive past use | Data, feature flags, services, or AI indexes have no retirement path |
+| Traceability | Review cannot explain why behavior changed | A prompt/provider change has no decision record or eval comparison |
+| Source of truth | Conflicts cannot be resolved | Cache, DB, generated artifact, and public response disagree |
+| Authority boundary | A layer silently takes over another layer's responsibility | Middleware repairs semantic meaning and becomes hidden policy authority |
+| Observability | Operators cannot diagnose behavior | Model/tool failures lack prompt, route, schema, or artifact refs |
+| Provenance | Evidence cannot be trusted | Retrieved or generated claims lack source and builder/agent trace |
+| Agency boundary | Automation exceeds intended control | Agent can call high-impact tools without approval or least privilege |
+| Semantic evaluation | Route success is mistaken for quality | Schema-valid model output is hallucinated or unfaithful |
+| Governance path | Risk has no accountable owner | AI feature ships without approval gate, incident path, or red-team feedback loop |
+
+## Reference Standards and Frameworks
+
+These anchors provide review signals, not checklist-compliance obligations unless the
+target explicitly claims conformance.
+
+| Anchor | Use in this domain |
+|---|---|
+| NIST AI RMF 1.0 | AI risk framing across design, development, deployment, and use |
+| NIST AI 600-1 GenAI Profile | GenAI governance, provenance, testing, incident disclosure, red-team loop |
+| ISO/IEC 42001 | AI management-system concepts: ownership, objectives, risk treatment, traceability, improvement |
+| NIST SSDF SP 800-218 and SP 800-218A | Secure development evidence for software and AI/foundation-model artifacts |
+| OWASP LLM Top 10 2025 | Prompt injection, output handling, excessive agency, vector/embedding, supply-chain and resource risks |
+| SLSA | Artifact provenance and verification summaries for supply-chain trust |
+| ISO/IEC/IEEE 12207:2026 | Full software lifecycle, including acquisition, supply, operation, support, maintenance, retirement |
+| ISO/IEC 25010:2023 | Quality characteristics as requirements, design objectives, tests, acceptance criteria, and measures |
+| ISO/IEC/IEEE 29148 | Requirements processes and information items; preferred over IEEE 830 for current requirements work |
+| WCAG 2.2 / ISO/IEC 40500:2025 | Current accessibility baseline for web/mobile/user-facing interfaces |
+| OWASP Top 10 | General web application security risks |
+| 12-Factor App, SRE, DORA | Operational design, reliability, deployability, and delivery performance |
+| DDD, C4, Arc42, Clean/Hexagonal Architecture | Architecture documentation and boundary reasoning |
 
-
|
|
160
|
+
## Bias Detection Criteria
|
|
155
161
|
|
|
156
|
-
|
|
157
|
-
|
|
158
|
-
|
|
159
|
-
|
|
160
|
-
|
|
162
|
+
- If too many major sub-areas are absent, flag **coverage bias**. Count current sub-areas at review time and treat absence of roughly 40% or more as a strong signal.
|
|
163
|
+
- If a review target includes LLM behavior but omits prompts, context, model/provider, tool, retrieval, eval, and governance concerns, flag **AI-era engineering blind spot**.
|
|
164
|
+
- If fallback, repair, or graceful degradation hides the failing prompt, model, schema, tool, retrieval, or artifact boundary, flag **silent degradation bias**.
|
|
165
|
+
- If LLM, runtime, and middleware responsibilities are not separated, flag **ownership boundary bias**.
|
|
166
|
+
- If model output is trusted because the prompt requested a format, flag **output trust bias**.
|
|
167
|
+
- If retrieved material can influence claims without source provenance and permission-aware retrieval, flag **retrieval authority bias**.
|
|
168
|
+
- If agent capability, permission, and autonomy are discussed as one undifferentiated knob, flag **agency compression bias**.
|
|
169
|
+
- If governance, approval, incident response, or human oversight is treated as external policy only, flag **governance externalization bias**.
|
|
170
|
+
- If only implementation is discussed and acquisition/supply/operation/retirement are absent, flag **lifecycle narrowing bias**.
|
|
171
|
+
- If only unit tests are discussed for behavior with integration, semantic, security, or operational risk, flag **verification level bias**.
|
|
172
|
+
- If accessibility is outdated, qualitative, or omitted for a public/user-facing system, flag **accessibility currency bias**.
|
|
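
The coverage-bias criterion above is a simple ratio check, so a small sketch may help make it concrete. This is not part of the package: the function names are hypothetical, the sub-area names are copied from the "Sub-area to CQ Mapping" table in this same file, and "roughly 40%" is modelled here as a hard 0.40 cutoff, which is an assumption.

```python
# Hypothetical sketch of the coverage-bias check; not code from onto-mcp.
# Sub-area names are copied from the "Sub-area to CQ Mapping" table.
SUB_AREAS = [
    "Data and state",
    "Interface and contract",
    "Error and failure posture",
    "Security and authorization",
    "Verification and quality",
    "Architecture and structure",
    "Operations and maintenance",
    "Documentation and consumers",
    "LLM-native and agentic behavior",
    "AI governance and risk",
    "Accessibility and internationalization",
]

# Assumption: the source says "roughly 40% or more"; a fixed cutoff is used here.
COVERAGE_BIAS_THRESHOLD = 0.40


def absent_fraction(covered, sub_areas=SUB_AREAS):
    """Return the share of sub-areas that the review target does not cover."""
    if not sub_areas:
        raise ValueError("sub_areas must not be empty")
    covered_set = set(covered)
    absent = [name for name in sub_areas if name not in covered_set]
    return len(absent) / len(sub_areas)


def has_coverage_bias(covered, sub_areas=SUB_AREAS,
                      threshold=COVERAGE_BIAS_THRESHOLD):
    """Flag coverage bias when the absent share reaches the threshold."""
    return absent_fraction(covered, sub_areas) >= threshold
```

Because the criterion says to count sub-areas at review time, the list is a parameter rather than a constant baked into the check; a caller would pass whatever sub-areas the domain scope currently declares.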

+## Inter-Document Contract

+| Topic | Owner file | Other files |
 |---|---|---|
+| Domain purpose, scope, value commitments, standards anchors | domain_scope.md | Other files reference |
+| Domain-local concept definitions, domain projections, and homonym guards | concepts.md | Other files reference; onto/productization core concepts remain canonical in `.onto/authority/`, `.onto/principles/`, or `.onto/processes/` |
+| Logical gates, contradiction rules, failure posture, trust rules | logic_rules.md | competency_qs.md asks; prompt_interface.md applies |
+| Structural seats and required relationships | structure_spec.md | domain_scope.md activates; competency_qs.md verifies |
+| Dependency direction, provider/model/tool/corpus dependencies, provenance dependencies | dependency_rules.md | structure_spec.md references |
+| Prompt, role, context, tool, response, sink interface criteria | prompt_interface.md | logic_rules.md and structure_spec.md reference |
+| Domain-specific CQs and PASS/FAIL criteria | competency_qs.md | All rule files provide inference paths |
+| Case-backed guideline library | extension_cases.md | competency_qs.md can reuse CQ seeds |
+| Concept economy and duplication policy | conciseness_rules.md | All files follow |
+| Closure axes for issue stance and synthesize | problem_framing_profile.md | Review runtime consumes |
+
+## Sub-area to CQ Mapping
+
+| Sub-area | CQ sections |
+|---|---|
+| Data and state | CQ-D, CQ-T, CQ-B, CQ-C, CQ-SE |
+| Interface and contract | CQ-I, CQ-R, CQ-E |
+| Error and failure posture | CQ-E, CQ-A, CQ-G |
+| Security and authorization | CQ-SE, CQ-A, CQ-DE, CQ-D |
+| Verification and quality | CQ-V, CQ-P, CQ-A, CQ-G |
+| Architecture and structure | CQ-S, CQ-M, CQ-C, CQ-DE |
+| Operations and maintenance | CQ-O, CQ-MT, CQ-G, CQ-DE |
+| Documentation and consumers | CQ-A, CQ-R, CQ-S |
+| LLM-native and agentic behavior | CQ-A, CQ-SE, CQ-DE, CQ-G |
+| AI governance and risk | CQ-G, CQ-A, CQ-O |
+| Accessibility and internationalization | CQ-R, CQ-V, CQ-G |
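
The mapping table above is many-to-many, so it can be useful to read it in the reverse direction as well. The sketch below transcribes the table into a dictionary and adds a reverse lookup; the names `SUB_AREA_TO_CQ` and `sub_areas_for` are hypothetical and do not come from the package.

```python
# Hypothetical transcription of the "Sub-area to CQ Mapping" table.
SUB_AREA_TO_CQ = {
    "Data and state": ["CQ-D", "CQ-T", "CQ-B", "CQ-C", "CQ-SE"],
    "Interface and contract": ["CQ-I", "CQ-R", "CQ-E"],
    "Error and failure posture": ["CQ-E", "CQ-A", "CQ-G"],
    "Security and authorization": ["CQ-SE", "CQ-A", "CQ-DE", "CQ-D"],
    "Verification and quality": ["CQ-V", "CQ-P", "CQ-A", "CQ-G"],
    "Architecture and structure": ["CQ-S", "CQ-M", "CQ-C", "CQ-DE"],
    "Operations and maintenance": ["CQ-O", "CQ-MT", "CQ-G", "CQ-DE"],
    "Documentation and consumers": ["CQ-A", "CQ-R", "CQ-S"],
    "LLM-native and agentic behavior": ["CQ-A", "CQ-SE", "CQ-DE", "CQ-G"],
    "AI governance and risk": ["CQ-G", "CQ-A", "CQ-O"],
    "Accessibility and internationalization": ["CQ-R", "CQ-V", "CQ-G"],
}


def sub_areas_for(cq_section):
    """Return the sub-areas (in table order) that list the given CQ section."""
    return [
        area
        for area, sections in SUB_AREA_TO_CQ.items()
        if cq_section in sections
    ]


if __name__ == "__main__":
    # Show which sub-areas a single CQ section serves.
    for area in sub_areas_for("CQ-E"):
        print(area)
```

A reverse lookup like this makes shared sections visible: for example, `CQ-E` appears under both "Interface and contract" and "Error and failure posture" in the table.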

 ## Related Documents
+
+- concepts.md - canonical terms and homonym guards
+- logic_rules.md - logical rules, trust gates, failure posture
+- structure_spec.md - required structures and relationships
+- dependency_rules.md - dependency and provenance rules
+- prompt_interface.md - prompt, role, tool, context, output, and sink criteria
+- competency_qs.md - domain competency questions
+- extension_cases.md - case-backed guideline cards
+- problem_framing_profile.md - closure axes for software-engineering review findings