npm - @pilotspace/add - Versions diffs - 1.0.0 - Mend

@pilotspace/add 1.0.0

Files changed (53)

package/GETTING-STARTED.md +238 -0
package/LICENSE +20 -0
package/README.md +106 -0
package/bin/cli.js +131 -0
package/docs/00-introduction.md +46 -0
package/docs/01-principles.md +71 -0
package/docs/02-the-flow.md +93 -0
package/docs/03-step-1-specify.md +117 -0
package/docs/04-step-2-scenarios.md +78 -0
package/docs/05-step-3-contract.md +78 -0
package/docs/06-step-4-tests.md +71 -0
package/docs/07-step-5-build.md +80 -0
package/docs/08-step-6-verify.md +63 -0
package/docs/09-the-loop.md +43 -0
package/docs/10-setup-and-stages.md +75 -0
package/docs/11-governance.md +87 -0
package/docs/12-roles.md +99 -0
package/docs/13-adoption.md +67 -0
package/docs/14-foundation.md +121 -0
package/docs/README.md +70 -0
package/docs/add-competencies.png +0 -0
package/docs/add-flow.png +0 -0
package/docs/add-foundation.png +0 -0
package/docs/add-hierarchy.png +0 -0
package/docs/appendix-a-templates.md +88 -0
package/docs/appendix-b-prompts.md +119 -0
package/docs/appendix-c-glossary.md +85 -0
package/docs/appendix-d-worked-example.md +152 -0
package/docs/appendix-e-checklists.md +80 -0
package/docs/appendix-f-requirements-matrix.md +170 -0
package/package.json +47 -0
package/skill/add/SKILL.md +118 -0
package/skill/add/deltas.md +69 -0
package/skill/add/fold.md +66 -0
package/skill/add/intake.md +49 -0
package/skill/add/phases/0-setup.md +35 -0
package/skill/add/phases/1-specify.md +55 -0
package/skill/add/phases/2-scenarios.md +36 -0
package/skill/add/phases/3-contract.md +41 -0
package/skill/add/phases/4-tests.md +37 -0
package/skill/add/phases/5-build.md +38 -0
package/skill/add/phases/6-verify.md +39 -0
package/skill/add/phases/7-observe.md +32 -0
package/skill/add/run.md +152 -0
package/skill/add/scope.md +58 -0
package/tooling/add.py +1573 -0
package/tooling/templates/CONVENTIONS.md.tmpl +8 -0
package/tooling/templates/GLOSSARY.md.tmpl +3 -0
package/tooling/templates/MILESTONE.md.tmpl +25 -0
package/tooling/templates/MODEL_REGISTRY.md.tmpl +6 -0
package/tooling/templates/PROJECT.md.tmpl +42 -0
package/tooling/templates/TASK.md.tmpl +111 -0
package/tooling/templates/dependencies.allowlist.tmpl +2 -0

package/docs/10-setup-and-stages.md ADDED

@@ -0,0 +1,75 @@
+# 10 · Project setup and stages
+[← 09 The loop](./09-the-loop.md) · [Contents](./README.md) · Next: [11 Governance →](./11-governance.md)
+This chapter covers two operational matters: what you set up once per project, and how the same flow runs at different depths as a product matures.
+---
+## One-time setup
+Before the first feature, establish the foundation the whole project depends on. Done once, it makes every later checkpoint enforceable automatically.
+| Item | File | Purpose |
+|------|------|---------|
+| Repository + pipeline | — | runs the gates on every change |
+| Conventions | `CONVENTIONS.md` | naming, layout, language, formatter — the survivor layer |
+| Model record | `MODEL_REGISTRY.md` | which AI model and version the project uses, for reproducibility and audit |
+| Dependency allow-list | `dependencies.allowlist` | the packages the AI may use; the pipeline rejects others |
+| Prompt playbook | `playbook/` | the six prompts from [Appendix B](./appendix-b-prompts.md) |
+**Setup exit check**
+- [ ] The pipeline runs and is green on the empty skeleton.
+- [ ] The model is pinned.
+- [ ] The allow-list exists and the pipeline fails on any package outside it.
+- [ ] The playbook is present.
+Do not start a feature until the pipeline is green. The pipeline is what enforces every later exit check, so no one has to remember to.
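A minimal sketch, not taken from the package, of the allow-list exit check above. It assumes `dependencies.allowlist` holds one package name per line with `#` comments; that file format and the function names `parse_allowlist` and `disallowed_packages` are illustrative assumptions.

```python
from typing import Iterable


def parse_allowlist(text: str) -> set[str]:
    """Parse an allow-list: one package name per line, '#' starts a comment (assumed format)."""
    allowed = set()
    for line in text.splitlines():
        name = line.split("#", 1)[0].strip()
        if name:
            allowed.add(name)
    return allowed


def disallowed_packages(declared: Iterable[str], allowed: set[str]) -> list[str]:
    """Return the declared packages that are not on the allow-list, sorted."""
    return sorted(set(declared) - allowed)


if __name__ == "__main__":
    allowlist = parse_allowlist("# approved packages\nrequests\npytest  # test runner\n")
    offenders = disallowed_packages(["requests", "leftpad-ng"], allowlist)
    # A pipeline step would exit non-zero whenever this list is non-empty.
    print(offenders)
```

A pipeline step built this way fails closed: anything not explicitly listed is rejected, which matches the exit check "the pipeline fails on any package outside it".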
+---
+## Stages: the same flow at increasing depth
+A *stage* is one pass through the flow at a chosen depth. The steps never change between stages; what changes is how deeply you run each one. The instinct to skip steps for an early prototype is right in spirit but wrong in form — you do not skip steps, you run them lightly.
+### The depth matrix
+Depth: **Deep** (full rigor) · **Core** (real but scoped) · **Light** (just enough) · **—** (skipped or stubbed).
+| Step | Prototype | Proof of Concept | MVP | Production-Ready |
+|------|:---------:|:----------------:|:---:|:----------------:|
+| 1 Specify | Light | Deep (risky slice) | Deep | Deep |
+| (design, if UI) | **Deep** | Light | Core | Deep |
+| 2 Scenarios | Light | Core | Deep | Deep |
+| 3 Contract | — | Core | Deep | Deep |
+| 4 Tests | — | Core | Core | Deep |
+| 5 Build | Light (throwaway) | Core | Core | Deep |
+| 6 Verify | Light | Core | Core | Deep |
+| Loop / operate | — | — | Light | Deep |
+| **Typical time\*** | ~2–5 days | ~1–3 weeks | ~4–8 weeks | ~4–8+ weeks |
+| **Code is** | disposable | disposable | kept | hardened |
+\* *Ranges assume a small team on a single product slice. Scale by scope and by the number of parallel streams. The pace is set by judgment and review capacity, not by how fast the AI can type — adding more AI does not compress the human-led steps.*
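The depth matrix can also be held as data so that tooling can look up how deeply to run a step. The sketch below transcribes the table above; the names `STAGES`, `DEPTH`, `depth_for`, and `skipped_steps` are hypothetical, and `None` stands in for the table's "skipped or stubbed" cells.

```python
from typing import Optional

STAGES = ("Prototype", "Proof of Concept", "MVP", "Production-Ready")

# One row per step, one entry per stage, in the same order as STAGES.
DEPTH: dict[str, tuple[Optional[str], ...]] = {
    "Specify": ("Light", "Deep", "Deep", "Deep"),
    "Design": ("Deep", "Light", "Core", "Deep"),
    "Scenarios": ("Light", "Core", "Deep", "Deep"),
    "Contract": (None, "Core", "Deep", "Deep"),
    "Tests": (None, "Core", "Core", "Deep"),
    "Build": ("Light", "Core", "Core", "Deep"),
    "Verify": ("Light", "Core", "Core", "Deep"),
    "Loop": (None, None, "Light", "Deep"),
}


def depth_for(step: str, stage: str) -> Optional[str]:
    """Look up the depth at which a step runs in a given stage."""
    return DEPTH[step][STAGES.index(stage)]


def skipped_steps(stage: str) -> list[str]:
    """List the steps that are skipped or stubbed in a given stage."""
    return [step for step in DEPTH if depth_for(step, stage) is None]


if __name__ == "__main__":
    print(depth_for("Design", "Prototype"))
    print(skipped_steps("Prototype"))
```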
+### Stage by stage
+**Prototype — prove the experience.** Run the design deeply and everything else lightly; the code is throwaway. The achievement is that a stakeholder reacts to something tangible and a go/no-go on the concept becomes possible. Do not expect real data, tests, or anything that survives.
+**Proof of Concept — retire the biggest technical risk.** Run the contract, tests, and build *deeply but only on the single riskiest slice*. The achievement is evidence that the hardest unknown is solvable, which turns an MVP estimate from hopeful into credible. Do not expect breadth or polish.
+**MVP — deliver value to real users.** Run the full flow at a narrow scope — the first complete loop, including light observation. The achievement is real users getting value while you learn from them. Do not expect scale or full operational rigor.
+**Production-Ready — run safely at scale.** Run every step at full rigor and deepen the operate-and-learn loop: service objectives, incident response, tested rollback, gradual delivery. The achievement is a system that is tested, secure, observable, and supportable. Do not expect "zero defects"; expect managed risk with a working feedback loop.
+### What carries forward
+The durable thing is never the code:
+| Transition | Discard | Keep |
+|------------|---------|------|
+| Prototype → POC | the prototype code | the validated experience (design, flows) |
+| POC → MVP | the spike code | the validated approach + the risky-interface contract |
+| MVP → Production | nothing | everything; the code is real and is hardened |
+The survivor layer thickens as you move right: a prototype leaves you a validated design; a proof of concept adds a proven approach and a contract; the MVP adds real, kept code. By production, you are hardening, not rebuilding.

package/docs/11-governance.md ADDED

@@ -0,0 +1,87 @@
+# 11 · Governance
+[← 10 Setup and stages](./10-setup-and-stages.md) · [Contents](./README.md) · Next: [12 Roles →](./12-roles.md)
+Governance is what keeps the method honest when a team runs it at speed. It is the same regardless of which AI tool writes the code, because it lives in the process and the pipeline, not in the agent.
+---
+## The autonomy ladder
+How much the AI is allowed to do is not one switch; it is a setting chosen per area, rising with evidence and with your capacity to verify.
+| Level | The AI… | A person… | Typical use |
+|-------|---------|-----------|-------------|
+| **Suggest** | proposes options | decides and writes | early, exploratory work |
+| **Draft-and-review** | drafts artifacts | edits and approves each one | specs, scenarios, contracts |
+| **Generate-behind-gate** | generates code | reviews the change; it merges only if the contract and tests pass | the normal build |
+| **Auto-with-evidence** | generates and merges | samples and audits; auto-merge allowed only with a full evidence bundle attached | narrow, well-tested areas |
+The governing rule, restated from the principles: **operate only at the level your review capacity can sustain.** If the AI produces more than the team can verify, drop a level.
+The **per-scope default is auto-with-evidence behind a one-approval seam**: the AI drafts the front, a human approves the frozen contract once, and the build auto-gates on evidence. You *lower* a scope toward draft-and-review or suggest wherever risk is high or evidence is thin — and a high-risk or method-defining scope is *always* lowered (it is never auto-run). The default sets where you start; review capacity and risk set where you stay.
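As a rough illustration of the governing rule, the sketch below drops one level whenever output outpaces review and caps high-risk scopes. The `Autonomy` enum, the `next_level` function, the use of simple counts for review capacity, and the choice of draft-and-review as the high-risk cap are all assumptions made for this example.

```python
from enum import IntEnum


class Autonomy(IntEnum):
    SUGGEST = 1
    DRAFT_AND_REVIEW = 2
    GENERATE_BEHIND_GATE = 3
    AUTO_WITH_EVIDENCE = 4


def next_level(current: Autonomy, produced: int, reviewed: int, high_risk: bool = False) -> Autonomy:
    """Pick the autonomy level for the next cycle.

    produced / reviewed are counts of changes in the last cycle (assumed measure).
    """
    level = current
    if high_risk:
        # Assumption: a high-risk scope is capped at draft-and-review.
        level = min(level, Autonomy.DRAFT_AND_REVIEW)
    if produced > reviewed and level > Autonomy.SUGGEST:
        # The AI produced more than the team verified: drop a level.
        level = Autonomy(level - 1)
    return level


if __name__ == "__main__":
    print(next_level(Autonomy.AUTO_WITH_EVIDENCE, produced=12, reviewed=10).name)
```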
+## The gate-fail protocol and the three reports
+Every checkpoint produces three short reports — **Test** (does it pass?), **Quality** (is it well-made and conformant?), and **Risk** (what could go wrong, and who owns it?) — and resolves to exactly one outcome:
+- **`PASS`** — criteria met; proceed.
+- **`RISK-ACCEPTED`** — proceed with a signed waiver carrying a named owner, a linked ticket, and an expiry. Allowed for non-security gaps only.
+- **`HARD-STOP`** — cannot proceed. Triggered by any failing test or any security finding; overridable only by the most senior accountable owner, and never for security.
+The rule behind the protocol is *no silent skips.* A report nobody is accountable for approving is just a document; an outcome with an owner is governance.
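One way to read the protocol as code is sketched below. The `GateInput` and `Waiver` types and the `resolve_gate` function are hypothetical; the senior-owner override is left out, and treating an unwaived non-security gap as a `HARD-STOP` is an assumption drawn from the "no silent skips" rule.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Waiver:
    owner: str
    ticket: str
    expires: date


@dataclass
class GateInput:
    failing_tests: int
    security_findings: int
    other_gaps: int  # non-security quality or risk gaps
    waiver: Optional[Waiver] = None


def resolve_gate(gate: GateInput, today: date) -> str:
    """Resolve a checkpoint to exactly one outcome."""
    # Any failing test or security finding stops the change; no waiver applies.
    if gate.failing_tests > 0 or gate.security_findings > 0:
        return "HARD-STOP"
    if gate.other_gaps == 0:
        return "PASS"
    # Non-security gaps need a waiver with an owner, a ticket, and a live expiry.
    w = gate.waiver
    if w is not None and w.owner and w.ticket and w.expires >= today:
        return "RISK-ACCEPTED"
    # Assumption: a gap nobody has signed for blocks the change.
    return "HARD-STOP"


if __name__ == "__main__":
    waiver = Waiver(owner="delivery-lead", ticket="TICKET-1", expires=date(2030, 1, 1))
    print(resolve_gate(GateInput(0, 0, 2, waiver), today=date(2025, 1, 1)))
```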
+### Why each step exists (institutional memory)
+When someone proposes skipping a step "to go faster," this table is the answer:
+| Step skipped | What happens | How you notice |
+|--------------|--------------|----------------|
+| Specify | the wrong thing gets built | shipped, but users do not use it |
+| Scenarios | the feature is vague, edges missing | the AI keeps asking questions mid-build |
+| Contract | interfaces drift | front, back, and AI disagree on shapes |
+| Tests | AI code is uncontrollable | the only way to know it is right is to test by hand |
+| Verify (architecture check) | entropy explodes | the codebase is a tangle within months |
+| Operate / loop | silent rot | the same incidents recur |
+## The continuous concerns
+Four concerns are not steps but threads that run through every step, starting at project setup. Pulling them to the front ("shifting left") is far cheaper than bolting them on at the end.
+| Concern | Begins at | Enforced at the build gate by |
+|---------|-----------|-------------------------------|
+| **Security** | setup (secret scanning, dependency allow-list) | zero high-severity findings; every AI-suggested package verified to exist |
+| **Testing** | the scenarios step | coverage must not decrease; no test weakened to pass |
+| **Observability** | setup (logging/metric conventions) | instrumentation present; service objectives verified after release |
+| **Cost** | setup (an AI-usage budget per task) | a task may not exceed its budget without escalation |
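The four build-gate rules in the table could be checked together along these lines. This is a sketch under stated assumptions: the function name `build_gate_violations` and its inputs are hypothetical, and a real pipeline would gather these values from its scanners, coverage tool, and usage accounting.

```python
def build_gate_violations(
    high_severity_findings: int,
    coverage_before: float,
    coverage_after: float,
    instrumentation_present: bool,
    cost_spent: float,
    cost_budget: float,
) -> list[str]:
    """Return one message per continuous concern that fails at the build gate."""
    violations = []
    if high_severity_findings > 0:
        violations.append("security: high-severity findings present")
    if coverage_after < coverage_before:
        violations.append("testing: coverage decreased")
    if not instrumentation_present:
        violations.append("observability: instrumentation missing")
    if cost_spent > cost_budget:
        violations.append("cost: budget exceeded, escalate")
    return violations


if __name__ == "__main__":
    print(build_gate_violations(0, 81.0, 80.5, True, 4.0, 5.0))
```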
+## AI-specific governance
+A method built on AI agents needs controls older methods did not:
+- **Pin the model.** Record the model and version; re-check the prompt library before adopting an upgrade. AI output is non-deterministic, so provenance matters.
+- **Test the prompts.** The reusable instructions in `playbook/` are themselves artifacts: give each golden input/output cases, and re-check them when edited. A prompt that fails its check does not ship.
+- **Guard the supply chain.** No package outside the allow-list without human approval; verify that each suggested package actually exists, to guard against an agent inventing a plausible name that an attacker has registered.
+- **Track provenance and licensing.** License-scan both generated and pulled-in code; keep a record of what the AI produced.
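A golden-case check for a playbook prompt might look like the sketch below. Because AI output is non-deterministic, it checks that each output contains a required marker rather than comparing for exact equality; that matching rule, the `failing_golden_cases` name, and the stub standing in for a real model call are all assumptions.

```python
from typing import Callable


def failing_golden_cases(
    run_prompt: Callable[[str], str],
    cases: list[tuple[str, str]],
) -> list[str]:
    """Return the inputs whose output does not contain the expected marker."""
    return [inp for inp, expected in cases if expected not in run_prompt(inp)]


if __name__ == "__main__":
    # Stub in place of a real model call, so the check itself can be exercised.
    def stub_model(text: str) -> str:
        return f"RULES: {text.upper()}"

    cases = [("transfer between own accounts", "TRANSFER"), ("close account", "REFUND")]
    # A non-empty result means the prompt fails its check and does not ship.
    print(failing_golden_cases(stub_model, cases))
```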
+## Metrics that matter — and the anti-metrics
+Measure the scarce things:
+- **Contract stability** — how rarely the frozen contracts change; high churn is genuinely expensive.
+- **Validated requirement coverage** — the share of rules confirmed against real behavior.
+- **Review throughput** — the team's verification capacity, which sets the safe autonomy level.
+- **Delivery and reliability** — lead time, deployment frequency, change-failure rate, time to recover.
+Do **not** optimize for lines of AI code generated, code-reuse percentage, prompt counts, or velocity measured in code volume. These count the cheap, disposable thing and create incentives to keep bad code to protect a number.
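Two of the metrics above reduce to simple ratios. In the sketch below, change-failure rate is failed deployments over total deployments; the definition of contract stability as the share of frozen contracts left unchanged in a period is an assumption, since the text does not give a formula.

```python
def change_failure_rate(deployments: int, failed_deployments: int) -> float:
    """Share of deployments that caused a failure in production."""
    if deployments == 0:
        return 0.0
    return failed_deployments / deployments


def contract_stability(frozen_contracts: int, contracts_changed: int) -> float:
    """Share of frozen contracts left unchanged in the period (assumed definition)."""
    if frozen_contracts == 0:
        return 1.0
    return (frozen_contracts - contracts_changed) / frozen_contracts


if __name__ == "__main__":
    print(change_failure_rate(20, 3))
    print(contract_stability(10, 2))
```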
+## Profiles: one method, three intensities
+| | **Express** (startup) | **Standard** (most teams) | **Regulated** (audited) |
+|---|---|---|---|
+| Steps | combine Specify + Scenarios into a one-page brief; light contract | full flow | full flow; every gate defaults to `HARD-STOP` |
+| Scenarios | happy path only | happy + key alternatives | exhaustive, incl. compliance |
+| Autonomy ceiling | generate-behind-gate from day one | up to auto-with-evidence | generate-behind-gate max; the AI never merges its own work |
+| Gate default | `RISK-ACCEPTED` allowed | `PASS` required to advance | `HARD-STOP`; full audit trail |
+Choose the profile deliberately — a startup spike and a banking system are not the same risk — and run different products at different profiles as appropriate. The choice is owned by the delivery lead (see [12 Roles](./12-roles.md)).
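If the profile is recorded as configuration, the pipeline can read the autonomy ceiling from it instead of relying on memory. The sketch below transcribes three rows of the table; the `Profile` type, the `PROFILES` mapping, and `ai_may_auto_merge` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    scenarios: str
    autonomy_ceiling: str
    gate_default: str


PROFILES = {
    "express": Profile("happy path only", "generate-behind-gate", "RISK-ACCEPTED allowed"),
    "standard": Profile("happy + key alternatives", "auto-with-evidence", "PASS required to advance"),
    "regulated": Profile("exhaustive, incl. compliance", "generate-behind-gate", "HARD-STOP; full audit trail"),
}


def ai_may_auto_merge(profile_name: str) -> bool:
    """Auto-merge is only possible when the ceiling reaches auto-with-evidence."""
    return PROFILES[profile_name].autonomy_ceiling == "auto-with-evidence"


if __name__ == "__main__":
    for name in PROFILES:
        print(name, ai_may_auto_merge(name))
```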

package/docs/12-roles.md ADDED

@@ -0,0 +1,99 @@
+# 12 · Roles and responsibilities
+[← 11 Governance](./11-governance.md) · [Contents](./README.md) · Next: [13 Adoption →](./13-adoption.md)
+Everyone on an AIDD team becomes, in part, a *verifier*; most also become *authors of the artifacts*. This chapter says what each role owns and does. Find your section; each answers the same three questions — what you do, when, and what "done" means for you.
+---
+## Product / Domain Owner
+- **Mission:** ensure the right thing gets built. You guard the problem.
+- **Leads:** Specify. **Contributes to:** Scenarios; the loop (deciding what the next cycle addresses).
+- **Owns:** the problem definition, the glossary of domain terms, the prioritized backlog.
+- **Done means:** the spec states real user value in undisputed terms, with its assumptions ranked least-sure first and the one or two most likely to be wrong flagged with *why* and *what they cost*; after release, you have decided what the next loop must address.
+- **Apply it:** run the Specify prompt against a real ticket or interview, then read the AI's least-sure flag *first* and decide the one or two load-bearing assumptions before skimming the low-stakes tail. If you cannot confirm a load-bearing rule, it is not ready to build.
+## Architect / Engineering Lead
+- **Mission:** own the load-bearing surfaces and the checks that protect them.
+- **Leads:** project setup; the Contract freeze. **Accountable for:** all the durable artifacts.
+- **Owns:** `CONVENTIONS.md`, the contracts, the architecture check in verification, the model record.
+- **Done means:** contracts are frozen and versioned; the architecture check runs in the pipeline; autonomy levels match the team's real review capacity.
+- **Apply it:** treat the contract freeze as a one-way door. When a stream wants to change a frozen contract, route it as a change request that reopens Specify — never let code quietly move the surface.
+## Software Engineer (Senior)
+- **Mission:** direct the build and hold quality at the architecture check.
+- **Leads:** Build. **Contributes to:** Contract, Tests; reviews others' changes.
+- **Owns:** the implementation, the architecture conformance check, the evidence bundle on each change.
+- **Done means:** all tests pass without any test being weakened; coverage holds; architecture and security checks pass; a person has reviewed it.
+- **Apply it:** work in small batches the review can keep up with, and never let the AI edit a test to make it pass — that is the cardinal sin of the build step.
+## Software Engineer (Junior)
+- **Mission:** learn the craft by entering at the build end and growing toward judgment.
+- **Leads:** nothing yet. **Contributes to:** Build (against handed-over specs and contracts), Tests.
+- **Owns:** your tasks' code and tests; raising a flag when a spec is ambiguous — which is a contribution, not a failure.
+- **Done means:** your task's tests pass honestly, your change has a clear evidence bundle, and a senior has reviewed it.
+- **Apply it:** start with specs and contracts given to you and make red tests green without weakening them; over time move *up* toward design and specification as your judgment matures (see [13 Adoption](./13-adoption.md)).
+## QA / Test Engineer
+- **Mission:** make "done" machine-checkable; you are the safety net for AI-written code.
+- **Leads:** Tests. **Contributes to:** Scenarios (turning rules into checkable form); the loop (production monitors).
+- **Owns:** the test suite, the scenario files, the coverage target, the test report at each gate.
+- **Done means:** every scenario has a test that was red before the build; the suite is honest (nothing passes by default); coverage never regresses.
+- **Apply it:** co-author the scenarios so the path from rule to test loses nothing, and confirm the suite fails for the *right* reason before the build begins.
+## Product Designer (UI/UX)
+- **Mission:** ensure correct logic does not ship inside a poor experience.
+- **Leads:** the design portion of Specify; the Prototype stage. **Contributes to:** Scenarios (experience-side rules).
+- **Owns:** the user flows, the specification of every screen state, the design document, the clickable prototype.
+- **Done means:** every screen has all its states designed; the prototype matches the scenarios; the self-critique for generic, low-effort output has passed.
+- **Apply it:** in the Prototype stage you lead — make the experience tangible fast, and carry the design forward while the prototype code is discarded.
+## DevOps / SRE / Platform
+- **Mission:** make the continuous concerns real and run the operate-and-learn loop.
+- **Leads:** the loop / operations. **Contributes to:** setup (pipeline, observability conventions), Build (deployment, gradual delivery).
+- **Owns:** gate enforcement in the pipeline, telemetry conventions, service-objective dashboards, rollback, the cost budget.
+- **Done means:** the gate outcomes are enforced mechanically in the pipeline; instrumentation is required to pass the build gate; rollback is tested; objectives are observed after release.
+- **Apply it:** wire the gate-fail protocol into the pipeline so a `HARD-STOP` is automatic, not a meeting, and shift security checks to setup rather than the end.
+## Security Engineer
+- **Mission:** keep AI-written code from importing AI-shaped risk.
+- **Leads:** the security thread. **Contributes to:** setup (allow-list, secret scanning), Specify (threat modeling), Build (scanning), AI governance.
+- **Owns:** the dependency allow-list, the provenance and license record, the security report at each gate, the supply-chain policy.
+- **Done means:** zero high-severity findings at the build gate; every AI-suggested dependency verified real and intended; generated and pulled-in code license-scanned.
+- **Apply it:** assume the AI will at some point hardcode a secret and invent a package name; gate against both from setup, and keep security findings as `HARD-STOP`, never waivers.
+## Engineering Manager / Delivery Lead
+- **Mission:** match intensity to risk, and protect verification capacity.
+- **Leads:** profile selection and stage planning. **Contributes to:** unblocking every step; the loop (priorities).
+- **Owns:** the chosen profile, the stage roadmap, the metrics dashboard.
+- **Done means:** the team operates at an autonomy level its review capacity can sustain; metrics track the scarce things, not code volume; each stage exits on its real achievement, not a date.
+- **Apply it:** choose the profile deliberately, and watch review throughput as the true measure of velocity — if AI output outpaces review, slow the engine rather than rushing the review.
+---
+## Responsibility matrix
+`A` Accountable · `R` Responsible/Lead · `C` Consulted · `I` Informed
+| Role | Setup | Specify | Scenarios | Contract | Tests | Build | Verify | Loop |
+|------|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
+| Product / Domain | C | **R** | R | I | I | I | I | R |
+| Architect / Lead | **R/A** | C | C | **R/A** | C | A | A | C |
+| Engineer (Senior) | C | I | C | R | R | **R** | R | C |
+| Engineer (Junior) | I | I | I | I | R | R | I | I |
+| QA / Test | I | C | R | C | **R** | C | C | C |
+| Designer | I | R (design) | R | C | I | I | I | I |
+| DevOps / SRE | R | I | I | C | C | R | R | **R** |
+| Security | R | C | I | C | C | R | R | C |
+| EM / Delivery | C | C | C | C | C | C | C | C |
+> If your role is only ever `I`, you are not yet using the method — find the step where your judgment *is* the gate.

package/docs/13-adoption.md ADDED

@@ -0,0 +1,67 @@
+# 13 · Adoption and onboarding
+[← 12 Roles](./12-roles.md) · [Contents](./README.md) · Next: [14 The foundation →](./14-foundation.md)
+This chapter covers how a team starts using AIDD and how a new person becomes productive in it.
+---
+## A 90-day rollout
+Adopt the method on one real product, not as an all-at-once mandate.
+1. **Days 1–15 — Set the foundation.** Stand up the one-time setup on one pilot service: conventions, glossary, dependency allow-list, model record, and the prompt playbook ([Appendix B](./appendix-b-prompts.md)).
+2. **Days 16–45 — One feature, end to end.** Run a single feature through the whole flow at the **Express** profile. Capture friction; tune the prompts' golden cases as you go.
+3. **Days 46–75 — Turn on the gates.** Wire the three reports and the gate-fail protocol into the pipeline; introduce the autonomy ladder at the generate-behind-gate level.
+4. **Days 76–90 — Promote.** Move the pilot to the **Standard** profile, draft the **Regulated** variant for any compliance-bound product, and publish the prompts as a shared, versioned playbook.
+## Choosing a profile
+| Choose… | When… |
+|---------|-------|
+| **Express** | startup, spike, or internal tool; speed of learning dominates; small blast radius |
+| **Standard** | a normal product with real users and ordinary risk |
+| **Regulated** | finance, health, or anything audited; failure is expensive or legally consequential |
+The choice is deliberate and owned by the delivery lead; different products can run at different profiles.
+## Onboarding: enter from the build end
+The most common onboarding mistake is to start newcomers at the most abstract step. Specification and domain discovery require judgment a newcomer has not yet built. So bring people in from the *concrete* end and move them toward judgment:
+1. **Weeks 1–4 — Build and Tests.** Implement tasks against specs and contracts handed to you; write tests. Learn the architecture check and the evidence bundle.
+2. **Weeks 5–8 — Contract and design.** Start contributing to contracts and screen states; learn why the surface is frozen.
+3. **Weeks 9–12 — Scenarios and Specify.** Co-author scenarios and specs; practice removing ambiguity.
+4. **Beyond — Domain discovery.** The most abstract work comes last, once judgment is calibrated.
+You graduate *up* the flow, from execution toward direction. Deciding what to build is the senior skill, not the entry skill.
+## Tool portability
+The prompts are plain text that reference files in the repository, and the gates are enforced in the pipeline, not in the agent. So the method does not depend on any one AI coding tool — the agent is replaceable, the method is not. A conformant prompt is (1) tool-agnostic plain language, (2) anchored to repository files rather than chat memory, (3) self-describing about which model and exit criteria it assumes, and (4) checkable by the pipeline.
+| Concern | Where it lives |
+|---------|----------------|
+| Prompt discovery | a folder convention in the repo (`playbook/`) |
+| Context | repository files the prompt names explicitly |
+| Gate enforcement | the build pipeline |
+Switching tools changes the discovery convention and nothing structural.
+## First week, by role
+| Role | First-week task |
+|------|------------------|
+| Product / Domain | run the Specify prompt on a real input; produce a glossary you would defend |
+| Architect / Lead | stand up setup and freeze one contract; wire the architecture check into the pipeline |
+| Engineer (Senior) | run the Build prompt on one small task; produce a full evidence bundle |
+| Engineer (Junior) | take a handed-over spec; make a red test green without weakening it |
+| QA / Test | convert one rule into a scenario, then a failing test |
+| Designer | take a spec; produce flows, all screen states, and a clickable prototype |
+| DevOps / SRE | wire one gate report into the pipeline; add secret-scan and allow-list to setup |
+| Security | build the dependency allow-list; make security findings a `HARD-STOP` |
+| EM / Delivery | choose the pilot's profile; stand up the review-throughput metric |
+---
+> Adoption is a loop too. The method itself is a living document: every cycle should feed improvements back into your copy of these prompts and conventions.

package/docs/14-foundation.md ADDED

@@ -0,0 +1,121 @@
+# 14 · The foundation: project context across milestones
+[← 13 Adoption](./13-adoption.md) · [Contents](./README.md) · Next: [Appendix A Templates →](./appendix-a-templates.md)
+---
+## The engine needs ground
+The flow in [Part II](./02-the-flow.md) is the *engine*: Specify → Scenarios →
+Contract → Tests → Build → Verify, run as a tight loop. TDD and ADD turn inside
+that engine — write the failing test, let the AI generate code, repeat.
+But an engine needs something to stand on. Every loop quietly assumes context that
+no single task owns: *what the words mean*, *what we are building right now*, and
+*how its users experience it*. When that context lives only in someone's head, each new session —
+and each new milestone — starts cold, and the AI fills the gap with plausible
+guesses. That is the same failure the method exists to prevent ([00](./00-introduction.md)),
+one level up.
+The **foundation** is the layer that holds this context and *outlives every
+milestone*. It is not new ceremony; it is the [survivor layer](./appendix-f-requirements-matrix.md)
+the method already names, made explicit as three concerns.
+## Three concerns, one foundation
+![The engine needs ground — the TDD ⇄ ADD engine runs on a DDD · SDD · UDD foundation: context feeds up, and any loop may send a correction back down](./add-foundation.png)
+- **DDD — Domain.** The shared, precise language and the boundaries it lives in:
+  the core concepts, the modules/contexts they belong to, and the invariants that
+  must always hold — the domain model and context map behind the names. One name
+  per concept — the same names the spec, the contract, and the code all use. (The
+  [GLOSSARY](./appendix-c-glossary.md) holds the full term list; the foundation
+  holds the model those terms describe.)
+- **SDD — Spec.** *The living document.* What is being built right now and what is
+  settled versus still open. This is not a frozen plan written once — it is the
+  layer that changes as the loop learns ([01](./01-principles.md)). In ADD it does
+  not duplicate the work; it **points** to the active milestone and the frozen
+  contracts that other tasks build against.
+- **UDD — UI/UX.** *Users use the interface, not the spec.* The experience designed
+  before code: the **user flows** (happy and alternative paths), the **UI states**
+  every screen must handle (loading · empty · error · success), and a design source
+  of truth — a `DESIGN.md` or clickable prototype. The AI can generate a prototype
+  from a design system; a person owns the empathy — what the user is trying to do,
+  and what "good" feels like from their side. The scenarios ([04](./04-step-2-scenarios.md))
+  test that behaviour; the foundation keeps the design intent that makes a screen
+  worth building.
+These three foundation competencies, together with the **TDD ⇄ ADD** engine of
+[Part II](./02-the-flow.md), make up ADD's five competencies. The first four feed
+context to the fifth, where the AI executes on it:
+![ADD's five competencies — DDD · SDD · UDD · TDD · ADD: the first four are human-led and feed context to ADD, which is AI-led under your direction](./add-competencies.png)
+> The diagram's foundation (DDD · SDD · UDD) and the method's own words — survivor
+> layer · living document · ubiquitous language — name the same three ideas. This
+> chapter is where the diagram and the text finally meet.
+## One file, not three
+A foundation that takes a week to write is a foundation no one keeps current. So
+ADD realizes all three concerns as **one survivor document — `PROJECT.md`** — with
+one short section each, plus an append-only record of key decisions:
+```
+.add/PROJECT.md
+  ## Domain (DDD)                — concepts · contexts · invariants
+  ## Spec / Living Document (SDD)— → active milestone + frozen contracts
+  ## Users (UDD)                 — UI/UX: user flows · states · DESIGN.md / prototype
+  ## Key Decisions               — append-only: date · decision · why · outcome
+```
+Keep it to one screen. If a section wants to grow into a manual, that is a signal
+the detail belongs in a milestone or a contract, not the foundation. The foundation
+is the *thin, durable* context the engine reads first — not a place to relocate the
+work.
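A small check can keep the foundation thin. The sketch below verifies that the four sections from the layout above are present and that the file stays short; the 60-line limit as a stand-in for "one screen" and the name `foundation_problems` are assumptions, not part of the package.

```python
REQUIRED_SECTIONS = ("## Domain", "## Spec", "## Users", "## Key Decisions")


def foundation_problems(text: str, max_lines: int = 60) -> list[str]:
    """Report missing sections and excessive length in a PROJECT.md body."""
    lines = text.splitlines()
    headings = [ln.strip() for ln in lines if ln.strip().startswith("## ")]
    problems = [
        f"missing section: {section}"
        for section in REQUIRED_SECTIONS
        if not any(h.startswith(section) for h in headings)
    ]
    if len(lines) > max_lines:
        problems.append(f"too long: {len(lines)} lines (limit {max_lines})")
    return problems


if __name__ == "__main__":
    sample = "## Domain (DDD)\n...\n## Users (UDD)\n...\n## Key Decisions\n...\n"
    print(foundation_problems(sample))
```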
+## How it feeds the engine — and takes feedback back
+The arrow runs both ways, which is the whole point of a re-entrant method:
+- **Down → up.** At the start of any session or milestone, read `PROJECT.md`
+  before touching a task. It is the cheapest way to point the AI in the right
+  direction. `add.py status` prints a pointer to it for exactly this reason.
+- **Up → down.** When a loop reveals that the domain model was wrong, the spec
+  stance has shifted, or a user assumption did not survive contact with reality,
+  you **stop and update the foundation** — then come forward again. A passing test
+  built on a broken foundation is still the wrong software, fast.
+## Where it sits in the hierarchy
+The foundation is the **Project tier** of the document hierarchy
+([Appendix F](./appendix-f-requirements-matrix.md)) — created once, kept for the
+life of the product, owned above any single milestone.
+![Three tiers of documents — Project (the foundation, .add/PROJECT.md) → Milestone → Task: scope narrows and lifespan shortens down the stack](./add-hierarchy.png)
+| Tier | Lives in | Lifespan | Holds |
+|------|----------|----------|-------|
+| **Project** (foundation) | `.add/PROJECT.md` + survivor files | whole product | domain, spec stance, users, decisions |
+| **Milestone** | `.add/milestones/<slug>/MILESTONE.md` | one depth-bounded goal | scope, shared contracts, exit criteria |
+| **Task** | `.add/tasks/<slug>/TASK.md` | one feature | the seven-step artifacts |
+A milestone is a *version bump* to the foundation, not a fresh start: when it
+closes, fold what it validated into `PROJECT.md` (a decision, a settled domain
+term, a confirmed user journey) and open the next one against the same, now-richer,
+ground.
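The three tiers map onto predictable paths, which a helper can build from a slug. The sketch below follows the locations in the table above; the function names are hypothetical and this is not the layout code used by `add.py`.

```python
from pathlib import PurePosixPath

ADD_ROOT = PurePosixPath(".add")


def project_file() -> PurePosixPath:
    """Project tier: one foundation file for the whole product."""
    return ADD_ROOT / "PROJECT.md"


def milestone_file(slug: str) -> PurePosixPath:
    """Milestone tier: one file per depth-bounded goal."""
    return ADD_ROOT / "milestones" / slug / "MILESTONE.md"


def task_file(slug: str) -> PurePosixPath:
    """Task tier: one file per feature."""
    return ADD_ROOT / "tasks" / slug / "TASK.md"


if __name__ == "__main__":
    print(project_file())
    print(milestone_file("mvp"))
    print(task_file("transfer-funds"))
```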
+## In the tooling
+- `add.py init` scaffolds `PROJECT.md` as a survivor file — and, like every
+  survivor file, **never overwrites a hand-edited one**.
+- `add.py status` shows a one-line pointer to the foundation, so a fresh session
+  re-orients on context before code.
+- The guideline block written into `CLAUDE.md` / `AGENTS.md` tells any agent the
+  same thing: run `status`, read the foundation, then work the loop.
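The "never overwrites" behaviour can be illustrated with a create-if-absent scaffold. This sketch is not `add.py`'s implementation; it takes the most conservative reading, in which any existing file is left untouched, and `scaffold_survivor` is a hypothetical name.

```python
from pathlib import Path


def scaffold_survivor(path: Path, template: str) -> bool:
    """Write the template only if the file does not exist yet.

    Returns True when the file was created, False when an existing file was kept.
    """
    if path.exists():
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(template, encoding="utf-8")
    return True


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as workdir:
        target = Path(workdir) / ".add" / "PROJECT.md"
        print(scaffold_survivor(target, "## Domain (DDD)\n"))  # created
        print(scaffold_survivor(target, "## Domain (DDD)\n"))  # kept as-is
```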
+> **The thesis, one level up.** The engine builds the thing right; the foundation
+> keeps the engine pointed at the right thing — across every milestone, not just
+> the current one.

package/docs/README.md ADDED


@@ -0,0 +1,70 @@
+# AI-Driven Development
+### A complete, practical book on building software when AI writes the code
+**Edition:** 1.0 · **Type:** Methodology + operating manual
+---
+## What this book is
+This is a complete guide to **AIDD (AI-Driven Development)** — a way of building software in which an AI agent writes most of the code and people do the two things AI cannot reliably do alone: decide *what* to build, and *verify* that what was built is correct.
+It is written to be read once front to back, then kept open beside you as a working manual. The early chapters explain *why* the method has the shape it does; the middle chapters explain each step in detail; the later chapters explain how to operate it across a real team and product; the appendices are copy-paste reference material.
+A single worked example — *transferring money between a user's own accounts* — runs through the entire book so that every abstract step has a concrete form you can see.
+## Who it is for
+Anyone who builds software with AI in the loop: engineers, architects, testers, designers, product owners, and the managers who lead them. No part assumes you have read the others; cross-references point you to what you need.
+## The method in one paragraph
+For every feature, before AI writes any code, you write four short artifacts in order: the rules it must obey, those rules as pass/fail scenarios, the data and interface contract, and the failing tests. You then direct the AI to make the tests pass without changing them, and finally you verify the result through evidence rather than inspection. That ordered set of artifacts *is* the method. The code is disposable; the artifacts are the durable asset. Direction comes before speed, and trust comes from passing tests rather than from reading code and finding it plausible.
+## The flow
+> **Specify → Scenarios → Contract → Tests → Build → Verify → observe, then repeat.**
+---
+## Table of contents
+**Part I — Foundations**
+- [00 · The shift: why AIDD exists](./00-introduction.md)
+- [01 · Core principles](./01-principles.md)
+- [02 · The flow, and what is disposable](./02-the-flow.md)
+**Part II — The method, step by step**
+- [03 · Step 1 — Specify](./03-step-1-specify.md)
+- [04 · Step 2 — Scenarios](./04-step-2-scenarios.md)
+- [05 · Step 3 — Contract](./05-step-3-contract.md)
+- [06 · Step 4 — Tests](./06-step-4-tests.md)
+- [07 · Step 5 — Build](./07-step-5-build.md)
+- [08 · Step 6 — Verify](./08-step-6-verify.md)
+- [09 · The loop — observe and learn](./09-the-loop.md)
+**Part III — Operating the method**
+- [10 · Project setup and stages](./10-setup-and-stages.md)
+- [11 · Governance](./11-governance.md)
+- [12 · Roles and responsibilities](./12-roles.md)
+- [13 · Adoption and onboarding](./13-adoption.md)
+- [14 · The foundation: project context across milestones](./14-foundation.md)
+**Part IV — Reference**
+- [Appendix A · Templates](./appendix-a-templates.md)
+- [Appendix B · Prompt library](./appendix-b-prompts.md)
+- [Appendix C · Glossary](./appendix-c-glossary.md)
+- [Appendix D · The worked example, end to end](./appendix-d-worked-example.md)
+- [Appendix E · Checklists](./appendix-e-checklists.md)
+- [Appendix F · Document requirements matrix (Project → Milestone → Task)](./appendix-f-requirements-matrix.md)
+---
+## Conventions used in this book
+- **▶ Example** marks the running worked example.
+- **Do / Don't** boxes give the rule in its shortest form.
+- A **gate** is a checkpoint with an explicit pass/fail exit. Its outcome is always one of `PASS`, `RISK-ACCEPTED` (a signed waiver), or `HARD-STOP`.
+- File names like `SPEC.md`, `features/*.feature`, `contracts/*` refer to the artifacts you create per feature; see [Appendix A](./appendix-a-templates.md).
+- Where this book uses a plain step name, the formal phase name (for teams mapping to a larger standard) appears once in [Appendix C](./appendix-c-glossary.md).

package/docs/add-competencies.png ADDED

Binary file

package/docs/add-flow.png ADDED

Binary file

package/docs/add-foundation.png ADDED

Binary file

package/docs/add-hierarchy.png ADDED

Binary file

package/docs/appendix-a-templates.md ADDED

@@ -0,0 +1,88 @@
+# Appendix A · Templates
+[← 14 Foundation](./14-foundation.md) · [Contents](./README.md) · Next: [Appendix B Prompts →](./appendix-b-prompts.md)
+Copy-paste blanks. Project-level templates are filled once at setup; feature-level templates are filled once per feature.
+---
+## Project-level (set up once)
+### `CONVENTIONS.md`
+```
+Language/framework: <e.g. Python 3.12 / FastAPI>
+Folders: src/  tests/  contracts/  features/  playbook/
+Naming: <file case>, <type case>, verbs for functions
+Lint/format: <tools>, enforced in the pipeline
+Errors: machine-readable error codes (string enums), never free text
+Architecture: <layering and dependency rules>
+```
+### `MODEL_REGISTRY.md`
+```
+Model: <name>
+Version: <version/date>
+Adopted: <date>
+Notes: re-run the prompt golden-cases before changing this.
+```
+### `dependencies.allowlist`
+```
+# one package per line; the pipeline rejects anything not listed
+<package>==<version-or-range>
+```
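The allowlist rule ("the pipeline rejects anything not listed") can be sketched as a small check. This is one possible reading, under stated assumptions: `parse_allowlist` and `rejected` are hypothetical helpers, the pins in `ALLOWLIST` are illustrative only, and the sketch matches on package name alone without evaluating version ranges.

```python
def parse_allowlist(text: str) -> dict[str, str]:
    """Map package name -> version spec, skipping blank lines and `#` comments."""
    allowed: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name, _, spec = line.partition("==")
        allowed[name.strip().lower()] = spec.strip()
    return allowed


def rejected(requested: list[str], allowed: dict[str, str]) -> list[str]:
    """Return the requested package names that are not on the allowlist."""
    return [name for name in requested if name.lower() not in allowed]


# Illustrative pins only; a real allowlist carries the project's own choices.
ALLOWLIST = """\
# one package per line; the pipeline rejects anything not listed
fastapi==0.110.0
pydantic==2.6.4
"""

offenders = rejected(["fastapi", "requests"], parse_allowlist(ALLOWLIST))
print(offenders)  # ['requests']
```

A pipeline step could fail the build whenever `offenders` is non-empty; a production check would also need to compare the installed version against the listed spec.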
+---
+## Feature-level (once per feature)
+### `SPEC.md`
+```
+Feature: <name>
+Framings weighed: <chosen> (chosen) · <alternative> · <alternative>
+Must:
+  - <required behavior>
+Reject:
+  - <bad input / situation> -> "<error_code>"
+After:
+  - <state true once it succeeds>
+Assumptions — least-sure first:
+  ⚠ <most-likely-wrong assumption> — least sure because <why>; if wrong: <cost>
+  - [x] <confirmed / low-stakes assumption> — <one line>
+```
+### `features/<name>.feature`
+```
+Scenario: <short name>
+  Given <starting situation>
+  When <action>
+  Then <expected result>
+  And <what must remain unchanged>   # when relevant
+```
+### `contracts/<name>.md`
+```
+<METHOD> <path>   body: { <fields> }
+  200 -> { <success fields> }
+  4xx -> { error: "<code>" | "<code>" }
+Schema: <tables/fields touched, and access pattern>
+Status: FROZEN @ v<n>
+```
+### `tests/<name>_test.<ext>` (stub)
+```
+test_<scenario_name>:
+  arrange: <set up the Given>
+  act:     <do the When>
+  assert:  <check the Then>
+  assert:  <check what must stay unchanged>   # when relevant
+```
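To make the stub concrete, here is a minimal sketch of how one scenario might become arrange/act/assert tests in Python. Everything named here is an assumption for illustration: `transfer`, `TransferError`, the account names, and the amounts are hypothetical and are not taken from Appendix D. The stand-in `transfer` function exists only so the block runs; in the method itself, the tests are written first and fail until the Build step.

```python
from enum import Enum
from typing import Optional


class TransferError(str, Enum):
    """Machine-readable error codes (string enum); the code name is hypothetical."""
    INSUFFICIENT_FUNDS = "insufficient_funds"


def transfer(balances: dict[str, int], source: str, target: str,
             amount: int) -> Optional[TransferError]:
    """Stand-in implementation so the example runs end to end."""
    if balances[source] < amount:
        return TransferError.INSUFFICIENT_FUNDS
    balances[source] -= amount
    balances[target] += amount
    return None


def test_transfer_rejected_when_funds_insufficient():
    # arrange: Given checking holds 50 and savings holds 0
    balances = {"checking": 50, "savings": 0}
    # act: When the user transfers 80 from checking to savings
    result = transfer(balances, "checking", "savings", 80)
    # assert: Then the transfer is rejected with an error code
    assert result == TransferError.INSUFFICIENT_FUNDS
    # assert: And both balances remain unchanged
    assert balances == {"checking": 50, "savings": 0}


def test_transfer_moves_money_between_own_accounts():
    # arrange: Given checking holds 50 and savings holds 0
    balances = {"checking": 50, "savings": 0}
    # act: When the user transfers 30 from checking to savings
    result = transfer(balances, "checking", "savings", 30)
    # assert: Then no error is returned and the money has moved
    assert result is None
    assert balances == {"checking": 20, "savings": 30}


test_transfer_rejected_when_funds_insufficient()
test_transfer_moves_money_between_own_accounts()
```

The second `assert` in the rejection test is the "what must stay unchanged" line of the stub: a rejected request should leave state untouched.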
+### Gate outcome record
+```
+Feature: <name>   Step: <Specify|...|Verify>   Date: <date>
+Reports: Test=<pass/fail>  Quality=<pass/fail>  Risk=<summary>
+Outcome: <PASS | RISK-ACCEPTED | HARD-STOP>
+If RISK-ACCEPTED -> owner: <name>  ticket: <link>  expires: <date>
+Reviewed by: <name>
+```
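The gate outcome record lends itself to a simple completeness check. The sketch below is an assumption about how a team might validate it, not something the package ships: `Outcome`, `GateRecord`, and `problems` are hypothetical names, and only a subset of the template's fields is modelled (the `Reports` and `Date` lines are left out).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Outcome(str, Enum):
    PASS = "PASS"
    RISK_ACCEPTED = "RISK-ACCEPTED"
    HARD_STOP = "HARD-STOP"


@dataclass
class GateRecord:
    feature: str
    step: str
    outcome: Outcome
    reviewed_by: str
    owner: Optional[str] = None
    ticket: Optional[str] = None
    expires: Optional[str] = None  # kept as text; date parsing is out of scope here


def problems(record: GateRecord) -> list[str]:
    """List the fields a record is missing; an empty list means it is complete."""
    missing = []
    if not record.reviewed_by:
        missing.append("reviewed_by")
    # A waiver is only valid with an owner, a ticket, and an expiry date.
    if record.outcome is Outcome.RISK_ACCEPTED:
        for field_name in ("owner", "ticket", "expires"):
            if not getattr(record, field_name):
                missing.append(field_name)
    return missing


waiver = GateRecord("transfer", "Verify", Outcome("RISK-ACCEPTED"), "reviewer", owner="owner")
print(problems(waiver))  # ['ticket', 'expires']
```

Constructing `Outcome` from the raw string means any value other than the three allowed outcomes raises `ValueError`, which keeps free-text outcomes out of the record.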