npm - arey-pi - Versions diffs - 0.1.0 - Mend

arey-pi 0.1.0


Files changed (33)

package/LICENSE +21 -0
package/README.md +159 -0
package/agents/README.md +313 -0
package/agents/engineering-reviewer.md +78 -0
package/agents/project-evaluator.md +136 -0
package/agents/spec-author.md +82 -0
package/agents/spec-syncer.md +88 -0
package/agents/tdd-implementer.md +81 -0
package/agents/tech-lead.md +92 -0
package/package.json +48 -0
package/prompts/assess-project.md +38 -0
package/rules/README.md +57 -0
package/rules/architecture/adrs.md +257 -0
package/rules/architecture/architecture-memory.md +55 -0
package/rules/assessment/project-readiness.md +224 -0
package/rules/core/change-modes.md +63 -0
package/rules/core/conflict-resolution.md +56 -0
package/rules/core/definition-of-done.md +67 -0
package/rules/core/principles.md +63 -0
package/rules/engineering/engineering-quality.md +285 -0
package/rules/engineering/quality-tooling.md +137 -0
package/rules/engineering/rebuildability.md +49 -0
package/rules/engineering/tdd.md +86 -0
package/rules/engineering/test-quality.md +159 -0
package/rules/specs/canonical-specs.md +62 -0
package/rules/specs/database-specs.md +142 -0
package/rules/specs/gherkin-authoring.md +121 -0
package/rules/specs/language-style.md +106 -0
package/rules/specs/spec-sync.md +70 -0
package/rules/workflow/agent-workflows.md +70 -0
package/rules/workflow/ai-harness.md +177 -0
package/rules/workflow/incremental-commits.md +88 -0
package/skills/project-readiness/SKILL.md +96 -0

package/rules/architecture/adrs.md ADDED

@@ -0,0 +1,257 @@
+# Architecture Decision Records
+## Purpose
+Architecture Decision Records persist meaningful technical decisions that shape the system over time.
+ADRs are not meeting notes, implementation diaries, or paperwork.
+They exist to preserve decisions with real architectural, operational, product, or long-term maintenance impact.
+## Core Rule
+Create an ADR when a decision is significant enough that a future senior engineer or agent would need to understand why the system works that way.
+Do not create ADRs for trivial choices, obvious implementation details, or decisions that can be safely inferred from local code.
+## Quality Bar
+An ADR must be useful after the original conversation is forgotten.
+A high-quality ADR explains:
+- the context that made the decision necessary;
+- the decision that was made;
+- the options seriously considered;
+- the tradeoffs and consequences;
+- the constraints that shaped the decision;
+- the expected impact on architecture, operations, data, security, testing, or delivery;
+- when the decision should be revisited.
+If an ADR does not clarify future work, it should not exist.
+## When to Create an ADR
+Create or update an ADR for decisions involving:
+- major framework or platform choices;
+- persistence or database strategy;
+- API contracts or integration patterns;
+- authentication, authorisation, security, or privacy models;
+- eventing, queues, background jobs, or distributed systems;
+- deployment, runtime, infrastructure, or operational constraints;
+- data ownership, tenancy, retention, or migration strategy;
+- module boundaries or dependency direction;
+- substantial performance, reliability, or scalability tradeoffs;
+- accepted technical debt with meaningful consequences;
+- deviations from Arey Pi rules;
+- irreversible or expensive-to-reverse choices.
+## When Not to Create an ADR
+Do not create ADRs for:
+- tiny local refactors;
+- routine bug fixes;
+- obvious library usage within an existing standard;
+- formatting or tooling changes with no durable tradeoff;
+- implementation steps that are already clear from code and tests;
+- temporary notes that belong in a task plan;
+- decisions that have no meaningful future consequence.
+If the decision is too small for an ADR but still worth remembering, consider updating comments, docs, glossary, or existing architecture notes instead.
+## Required Structure
+Use this structure unless the project already has an equivalent ADR template:
+```md
+# ADR-NNNN: Title
+## Status
+Proposed | Accepted | Superseded | Deprecated
+## Relationship
+Supersedes:
+Superseded by:
+Amends:
+Amended by:
+Narrows:
+Expands:
+Depends on:
+Related:
+## Scope
+Where does this decision apply?
+Where does it not apply?
+## Context
+What problem, constraint, or opportunity forced a decision?
+## Decision
+What did we decide?
+## Options Considered
+What realistic alternatives were considered?
+Why were they not chosen?
+## Consequences
+What improves?
+What gets worse?
+What new constraints or responsibilities exist?
+## Impact
+Which areas are affected?
+Consider architecture, data, security, operations, testing, specs, and delivery.
+## Revisit When
+What signals should cause this decision to be reviewed?
+```
+## Location and Naming
+Default location:
+```txt
+specs/decisions/
+```
+Recommended naming:
+```txt
+ADR-0001-use-postgresql-for-primary-storage.md
+ADR-0002-adopt-event-driven-billing-integration.md
+```
+Use semantic line breaks and UK English.
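The naming convention above can be illustrated with a small helper. This sketch is not part of the package; `adr_filename` is a hypothetical name, and the slug rule (lower-case, non-alphanumeric runs collapsed to hyphens) is an assumption inferred from the two example file names.

```python
import re


def adr_filename(number: int, title: str) -> str:
    """Build a file name in the style ADR-0001-some-title.md (hypothetical helper)."""
    # Assumed slug rule: lower-case the title and collapse anything that is
    # not a letter or digit into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"ADR-{number:04d}-{slug}.md"


print(adr_filename(1, "Use PostgreSQL for primary storage"))
```

Zero-padding to four digits keeps files sorted by number in a directory listing, which appears to be the intent of the `ADR-NNNN` pattern.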
+## Status Lifecycle
+ADRs should have a clear status:
+- **Proposed:** not yet accepted;
+- **Accepted:** current decision;
+- **Superseded:** replaced by a newer ADR;
+- **Deprecated:** no longer recommended but not directly replaced.
+Do not silently edit history when a major decision changes.
+Prefer adding a new ADR that relates to the old one explicitly.
+## Relationships Between ADRs
+ADRs may overlap, refine, or replace earlier decisions.
+Those relationships must be explicit.
+Use relationship fields consistently:
+- **Supersedes:** the new ADR fully replaces an older ADR.
+- **Superseded by:** the old ADR points to the newer replacing ADR.
+- **Amends:** the new ADR partially changes an older ADR while leaving the rest valid.
+- **Amended by:** the old ADR points to a newer partial amendment.
+- **Narrows:** the new ADR restricts where an older decision applies.
+- **Expands:** the new ADR applies an older decision to additional scope.
+- **Depends on:** the new ADR relies on another decision still being valid.
+- **Related:** the ADRs are relevant to each other but do not change each other's status or scope.
+## Supersession and Overlap
+When a new ADR replaces an older decision:
+1. Create a new ADR explaining why the decision changed.
+2. Set the new ADR relationship to `Supersedes: ADR-NNNN`.
+3. Update the old ADR status to `Superseded`.
+4. Set the old ADR relationship to `Superseded by: ADR-NNNN`.
+5. Update architecture docs, Gherkin, DBML, glossary, or tests when the current system truth changes.
+When a new ADR only partially changes an older decision, do not mark the older ADR as fully superseded.
+Use `Amends`, `Narrows`, or `Expands`, and make the affected scope explicit in both ADRs where practical.
+Historical ADRs should remain readable.
+Do not delete or rewrite meaningful context to make history look cleaner.
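Steps 2 to 4 of the supersession procedure are mechanical and could be sketched as below. This is an illustration only, not code from the package: the `Adr` record and the `supersede` function are hypothetical, and steps 1 and 5 (writing the new ADR's rationale, updating docs and specs) still need a person or agent.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Adr:
    """Hypothetical in-memory view of an ADR's status and relationship fields."""
    adr_id: str
    status: str = "Proposed"
    relationships: Dict[str, List[str]] = field(default_factory=dict)


def supersede(old: Adr, new: Adr) -> None:
    """Link both ADRs explicitly; the old ADR's other content is left untouched."""
    new.relationships.setdefault("Supersedes", []).append(old.adr_id)     # step 2
    old.status = "Superseded"                                             # step 3
    old.relationships.setdefault("Superseded by", []).append(new.adr_id)  # step 4


old = Adr("ADR-0004", status="Accepted")
new = Adr("ADR-0010", status="Accepted")
supersede(old, new)
print(old.status, old.relationships, new.relationships)
```

Only the status and relationship fields change on the old record, which matches the rule that historical context must stay readable.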
+## Conflict Handling
+Accepted newer ADRs only override older ADRs when the relationship is explicit.
+If two accepted ADRs appear to conflict and no relationship explains the overlap, agents must stop and report the conflict.
+They must not silently choose the newer, older, or easier decision.
+Report conflicts with evidence:
+```txt
+ADR conflict detected:
+- ADR-0004 says: ...
+- ADR-0010 says: ...
+- No supersession, amendment, narrowing, or expansion relationship found.
+Decision required.
+```
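As an illustration of the conflict rule, the sketch below checks whether an explicit relationship links two ADRs and, if not, formats the report shown above. It is not part of the package. The dictionary shape, the function names, and the sample decisions are all hypothetical, and judging whether two decisions really conflict remains a human or agent call that this code does not attempt.

```python
from typing import Optional

# Relationship fields that can explain an overlap between two ADRs.
LINKING_FIELDS = ("Supersedes", "Superseded by", "Amends", "Amended by", "Narrows", "Expands")


def explicitly_related(a: dict, b: dict) -> bool:
    """True when either ADR names the other in one of the linking fields."""
    def names(src: dict, dst: dict) -> bool:
        rels = src.get("relationships", {})
        return any(dst["id"] in rels.get(name, []) for name in LINKING_FIELDS)
    return names(a, b) or names(b, a)


def conflict_report(a: dict, b: dict) -> Optional[str]:
    """Return the report text, or None when a relationship explains the overlap."""
    if explicitly_related(a, b):
        return None
    return "\n".join([
        "ADR conflict detected:",
        f"- {a['id']} says: {a['decision']}",
        f"- {b['id']} says: {b['decision']}",
        "- No supersession, amendment, narrowing, or expansion relationship found.",
        "Decision required.",
    ])


# Hypothetical example data.
adr_4 = {"id": "ADR-0004", "decision": "keep sessions in the primary database"}
adr_10 = {"id": "ADR-0010", "decision": "keep sessions in an in-memory cache"}
print(conflict_report(adr_4, adr_10))
```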
+## Current Architecture Truth
+ADRs preserve decision history.
+Architecture overview docs describe the current architecture.
+The current decision set is derived from:
+- accepted ADRs;
+- minus superseded or deprecated ADRs;
+- plus explicit amendments, narrowing, expansions, and dependencies.
+Because deriving current truth from many ADRs can become difficult, durable current-state architecture docs should be updated when ADRs materially change the system.
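The derivation described above could be sketched as follows, assuming each ADR is available as a dictionary with `id`, `status`, and optional `relationships` keys. That shape and the function name are hypothetical; the package itself does not ship such a tool.

```python
# Relationship fields through which one accepted ADR modifies another.
MODIFIER_FIELDS = ("Amends", "Narrows", "Expands", "Depends on")


def current_decision_set(adrs):
    """Map each accepted ADR id to the accepted ADRs that modify or depend on it."""
    # Filtering on "Accepted" drops superseded, deprecated, and proposed ADRs.
    active = {adr["id"]: adr for adr in adrs if adr["status"] == "Accepted"}
    result = {adr_id: [] for adr_id in active}
    for adr in active.values():
        for relation in MODIFIER_FIELDS:
            for target in adr.get("relationships", {}).get(relation, []):
                if target in result:
                    result[target].append((relation, adr["id"]))
    return result


# Hypothetical example data.
adrs = [
    {"id": "ADR-0001", "status": "Superseded"},
    {"id": "ADR-0002", "status": "Accepted"},
    {"id": "ADR-0003", "status": "Accepted", "relationships": {"Narrows": ["ADR-0002"]}},
    {"id": "ADR-0004", "status": "Proposed"},
]
print(current_decision_set(adrs))
```

Even this small example needs a graph walk to read one decision, which illustrates why the rule asks for current-state architecture docs to be kept up to date rather than re-derived each time.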
+## Relationship to Specs
+ADRs explain why durable technical decisions were made.
+They do not replace:
+- Gherkin behaviour specs;
+- DBML database specs;
+- tests;
+- glossary entries;
+- architecture overview docs.
+When a decision affects behaviour, update Gherkin.
+When it affects the data model, update DBML.
+When it introduces domain language, update the glossary.
+When it changes system structure, update architecture docs.
+## Agent Behaviour
+Agents must identify whether their work includes a meaningful architectural decision.
+If it does, they must do one of the following:
+- create or update an ADR;
+- update an existing ADR's status;
+- explicitly report why an ADR is not warranted.
+Agents must not create low-value ADRs just to satisfy process.
+## Review Checklist
+Before accepting an ADR, check:
+- Does it capture a real decision, not just an implementation step?
+- Is the impact significant enough to justify an ADR?
+- Is the context understandable without chat history?
+- Are alternatives and tradeoffs honest?
+- Are consequences explicit?
+- Does it say when to revisit the decision?
+- Does it declare relationships to overlapping ADRs?
+- Does it avoid silent conflict with accepted ADRs?
+- Does it link to affected specs, DBML, architecture docs, or follow-up work where useful?
+- Is it written with semantic line breaks and UK English?
+## Acceptance Rule
+An ADR is valuable only if it improves future decision-making.
+A change is not complete when it makes or changes a significant technical decision but leaves no high-quality ADR or explicit no-ADR rationale.

package/rules/architecture/architecture-memory.md ADDED

@@ -0,0 +1,55 @@
+# Architecture Memory
+## Purpose
+Architecture memory preserves durable technical decisions and system constraints outside implementation code.
+It ensures future agents can understand not just what the system does, but why it is shaped that way.
+## Canonical Architecture Sources
+Use:
+- architecture docs for current system structure and constraints;
+- ADRs for decisions, tradeoffs, and consequences;
+- glossary for domain language;
+- Gherkin for observable behaviour affected by architecture.
+Recommended locations:
+```txt
+specs/architecture/
+specs/decisions/
+specs/glossary.md
+```
+## When to Update Architecture Memory
+Update architecture memory when work introduces or changes:
+- component boundaries;
+- data ownership;
+- persistence strategy;
+- public API shape;
+- integration patterns;
+- queues/events/background jobs;
+- auth/security model;
+- deployment/runtime assumptions;
+- performance or reliability constraints;
+- important dependencies;
+- accepted technical debt;
+- major tradeoffs or rejected alternatives.
+## ADR Rule
+Create or update a high-quality ADR for non-trivial decisions that future maintainers should not have to rediscover.
+ADRs should be meaningful, decision-focused, and useful after the original conversation is forgotten.
+Use `architecture/adrs.md` as the quality bar for when ADRs are warranted and how they should be written.
+## Agent Behaviour
+Agents must not leave important architecture decisions only in code comments, task summaries, or chat history.
+If a task makes an architectural decision, persist it before marking the work complete.

package/rules/assessment/project-readiness.md ADDED

@@ -0,0 +1,224 @@
+# Project Readiness
+## Purpose
+Project Readiness evaluates whether a repository is aligned with Arey Pi.
+It is a meta-assessment across the Arey Pi rules. AI Harness, Language Style, Database Specs, and ADR quality are evaluated as first-class concerns alongside specs, TDD, test quality, quality tooling, architecture, spec sync, rebuildability, and process.
+## Core Rule
+Projects should be periodically evaluated against Arey Pi rules.
+Assessment is read-only by default. Findings should produce evidence, scores, risks, and a prioritised improvement plan before any changes are made.
+## Assessment Areas
+Evaluate the project across these Arey Pi areas.
+### Canonical Specs
+Check whether:
+- `specs/features/` exists where applicable;
+- Gherkin specs describe meaningful observable behaviour;
+- specs avoid incidental implementation details;
+- important domains, APIs, CLIs, errors, permissions, and edge cases are covered;
+- glossary, architecture docs, and ADRs exist when project complexity requires them.
+### Gherkin Authoring
+Check whether:
+- scenarios are readable and domain-focused;
+- scenarios are testable;
+- `Feature`, `Rule`, `Scenario`, and `Scenario Outline` are used clearly;
+- specs avoid duplicative or low-value scenarios;
+- scenarios can be traced to tests where practical.
+### Tests and TDD
+Check whether:
+- there is a clear test suite;
+- tests are easy to run;
+- tests can be traced to specs or requirements where practical;
+- bug fixes have regression tests;
+- test structure supports TDD;
+- tests are meaningful rather than shallow generated assertions.
+### Test Quality
+Check whether:
+- coverage is available or intentionally absent;
+- mutation testing is configured for critical code or proposed as an improvement;
+- tests assert behaviour rather than implementation mechanics;
+- edge cases and failure paths are covered for important behaviour;
+- surviving mutants or weak assertions are triaged when evidence exists.
+### Quality Tooling
+Check whether the project defines:
+- formatter;
+- linter/static analyser;
+- type checking where applicable;
+- test command;
+- composed check/validation command;
+- relevant dynamic analysis for project risk.
+If tooling is absent, assessment must recommend options and mark the project as not fully aligned.
+### Engineering Quality
+Check whether:
+- architecture is understandable;
+- boundaries and responsibilities are clear;
+- code follows consistent patterns;
+- complexity is justified;
+- the implementation reflects senior engineering standards;
+- generated or agent-authored code has been reviewed for quality.
+### Spec Sync
+Check whether:
+- specs, tests, and code appear aligned;
+- behaviour changes have corresponding Gherkin updates or no-impact reasoning;
+- database changes have precise DBML updates or no-impact reasoning;
+- architecture/ADR/glossary updates exist when durable knowledge changed;
+- Definition of Done expectations are documented.
+### Database Specs
+Evaluate Database Specs as a normal Arey Pi rule when the project uses persistent storage.
+Check whether:
+- DBML exists in `specs/database/` or another documented canonical location;
+- DBML reflects tables, columns, types, keys, relationships, constraints, indexes, and relevant notes;
+- migrations, ORM models, SQL DDL, and DBML agree;
+- database-related changes update DBML in the same change set;
+- DBML validation tooling exists or its absence is reported.
+### Rebuildability
+Check whether:
+- important behaviour can be reconstructed from specs and tests;
+- durable decisions are outside code;
+- modules can be understood without relying solely on reading the implementation;
+- architecture and domain knowledge are persisted.
+### Architecture Memory and ADRs
+Check whether:
+- architecture docs exist when needed;
+- ADRs capture meaningful non-trivial decisions rather than irrelevant process noise;
+- ADRs explain context, options, tradeoffs, consequences, impact, and revisit conditions;
+- low-value ADRs are avoided;
+- glossary captures domain language;
+- integrations, boundaries, ownership, and constraints are documented.
+### Incremental Commits
+Check whether:
+- Conventional Commits are used;
+- commits are incremental and coherent;
+- unrelated changes are not mixed together;
+- history supports review and rollback.
+### AI Harness
+Evaluate AI Harness as a normal Arey Pi rule.
+Check whether:
+- root `AGENTS.md` exists and is useful;
+- nested `AGENTS.md` files exist for subtrees that need local technology, domain, command, or safety instructions;
+- Arey Pi installation/reference is discoverable;
+- project-local skills, prompts, and subagents exist where useful;
+- technology-specific guidance is available;
+- validation/setup commands are discoverable;
+- safety rails for agents are documented.
+Missing AI harness setup is an Arey Pi alignment gap because it prevents agents from applying the other rules consistently.
+### Language Style
+Evaluate Language Style as a normal Arey Pi rule.
+Check whether:
+- project-facing prose uses UK English by default;
+- specs always use semantic line breaks;
+- touched docs preserve or improve semantic line breaks;
+- specs, docs, prompts, skills, reports, and harness instructions are consistent;
+- US spellings are avoided unless required by identifiers, APIs, quoted material, tooling, or customer terminology;
+- widespread inconsistency is captured as a follow-up rather than silently expanded.
+## Scoring
+Use a 0-5 score for each rule area:
+| Score | Meaning |
+| --- | --- |
+| 0 | Missing |
+| 1 | Poor |
+| 2 | Partial |
+| 3 | Adequate |
+| 4 | Strong |
+| 5 | Excellent |
+Avoid false precision. Scores are meant to prioritise improvement, not gamify compliance.
+## Required Report
+A readiness report should include:
+```txt
+Project Readiness Report
+- Overall readiness:
+- Rule scores:
+- Blockers:
+- Quick wins:
+- Recommended plan:
+```
+For each rule area, include:
+```txt
+- Score:
+- Evidence:
+- Findings:
+- Recommendations:
+```
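A per-area report section could be rendered as in the sketch below, which pairs the 0-5 scale with the four required fields. It is an illustration outside the package: `render_area` and the sample findings are hypothetical, and showing the score's meaning in brackets is an added convenience rather than something the rule requires.

```python
# Score meanings copied from the scoring table.
SCORE_MEANINGS = {0: "Missing", 1: "Poor", 2: "Partial", 3: "Adequate", 4: "Strong", 5: "Excellent"}


def render_area(name: str, score: int, evidence: str, findings: str, recommendations: str) -> str:
    """Render one rule area using the Score/Evidence/Findings/Recommendations fields."""
    if score not in SCORE_MEANINGS:
        raise ValueError(f"score must be an integer from 0 to 5, got {score!r}")
    return "\n".join([
        name,
        f"- Score: {score} ({SCORE_MEANINGS[score]})",
        f"- Evidence: {evidence}",
        f"- Findings: {findings}",
        f"- Recommendations: {recommendations}",
    ])


# Hypothetical example input.
section = render_area(
    "AI Harness",
    2,
    "root AGENTS.md exists; no nested files",
    "validation commands are not discoverable",
    "document setup and check commands",
)
print(section)
```

Rejecting anything other than whole scores from 0 to 5 is one way to respect the warning against false precision.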
+## Assessment Modes
+### Audit Mode
+Read-only. Produce findings and recommendations. Do not modify project files.
+### Bootstrap Mode
+After audit and user approval, implement selected improvements such as:
+- adding or updating `AGENTS.md`;
+- adding missing validation scripts;
+- adding spec skeletons;
+- adding Arey Pi prompts or skills;
+- documenting quality tooling;
+- creating ADR/glossary structure.
+Bootstrap Mode must still follow Arey Pi policies, including TDD where applicable, quality tooling, DoD, and incremental Conventional Commits.
+## Acceptance Rule
+A project is not fully ready for Arey Pi until its relevant Arey Pi rules score adequate or better for its risk and complexity.
+AI Harness readiness is one of those rules, not a separate external assessment.
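One way to read the acceptance rule is as a threshold check over the relevant rule areas. The sketch below is an assumption-laden illustration, not package code: `readiness_gaps` is a hypothetical name, unscored areas are treated as 0 (Missing), and a fixed threshold of 3 (Adequate) stands in for the judgement of what is adequate "for its risk and complexity".

```python
ADEQUATE = 3  # "Adequate" on the 0-5 scale


def readiness_gaps(scores, relevant_areas, threshold=ADEQUATE):
    """Return the relevant areas that score below the threshold, sorted by name."""
    # Assumption: an area with no recorded score counts as 0 (Missing).
    return sorted(area for area in relevant_areas if scores.get(area, 0) < threshold)


# Hypothetical example scores.
scores = {"AI Harness": 2, "Tests and TDD": 4}
relevant = ["AI Harness", "Tests and TDD", "Database Specs"]
gaps = readiness_gaps(scores, relevant)
print("fully ready" if not gaps else f"not fully ready: {gaps}")
```

Passing only the relevant areas lets a project without persistent storage, for example, leave Database Specs out of the check.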

package/rules/core/change-modes.md ADDED

@@ -0,0 +1,63 @@
+# Change Modes
+## Purpose
+Arey Pi supports both full Spec-Driven Development and direct changes. The goal is to avoid unnecessary ceremony while preserving TDD, spec sync, and rebuildability.
+## Spec-Driven Mode
+Use Spec-Driven Mode for:
+- new features;
+- non-trivial behaviour changes;
+- ambiguous requirements;
+- business rule changes;
+- public API or CLI changes;
+- architectural changes;
+- module rewrites;
+- security, permission, persistence, or integration behaviour;
+- work where future rebuildability depends on capturing intent first.
+Default flow:
+```txt
+Intent → Gherkin/Canonical Spec → Test → Code → Refactor → Spec Sync → Review
+```
+## Direct Change Mode
+Use Direct Change Mode for:
+- small obvious fixes;
+- local implementation cleanup;
+- mechanical refactors;
+- formatting or naming changes;
+- changes fully covered by existing specs/tests;
+- low-risk internal changes with no observable behaviour change.
+Default flow:
+```txt
+Intent → Test/Coverage Check → Code → Validation → Spec Sync Check → Done
+```
+## Non-Negotiables
+Direct Change Mode does not allow agents to skip:
+- TDD for production behaviour;
+- regression tests for bug fixes;
+- final spec sync;
+- conflict resolution;
+- reporting residual risks.
+## Escalation
+Switch from Direct Change Mode to Spec-Driven Mode when:
+- intent becomes ambiguous;
+- behaviour changes more than expected;
+- specs are missing or stale;
+- tests are inadequate for the risk;
+- architectural or domain decisions appear;
+- the change grows beyond the original small scope.
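The escalation triggers above amount to "any one signal is enough". A minimal sketch, assuming the six triggers are tracked as boolean flags; the class, field, and function names are hypothetical and not taken from the package.

```python
from dataclasses import dataclass


@dataclass
class EscalationSignals:
    """One flag per escalation trigger listed in the rule (hypothetical names)."""
    intent_ambiguous: bool = False
    behaviour_changed_more_than_expected: bool = False
    specs_missing_or_stale: bool = False
    tests_inadequate_for_risk: bool = False
    architectural_or_domain_decision: bool = False
    scope_grew: bool = False


def choose_mode(signals: EscalationSignals) -> str:
    """Stay in Direct Change Mode only while no trigger has fired."""
    return "Spec-Driven" if any(vars(signals).values()) else "Direct Change"


print(choose_mode(EscalationSignals()))
print(choose_mode(EscalationSignals(specs_missing_or_stale=True)))
```

The check is meant to be re-evaluated as work progresses, since several triggers (scope growth, unexpected behaviour change) only become visible part-way through a change.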

package/rules/core/conflict-resolution.md ADDED

@@ -0,0 +1,56 @@
+# Conflict Resolution
+## Purpose
+This policy defines what to do when user requests, specs, tests, and code disagree.
+## Authority Order
+Default authority order:
+1. Explicit current user instruction.
+2. Canonical specs.
+3. Tests.
+4. Existing code.
+5. Agent inference.
+This order does not mean every user instruction should silently overwrite specs. If a user instruction changes intended behaviour, specs must be updated explicitly.
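The authority order and its caveat can be sketched together: pick the highest-ranked source among those that disagree, and flag when a user instruction wins over canonical specs so that the specs get updated explicitly. This is an illustration only; the source labels and function names are hypothetical.

```python
# Highest authority first, following the default order in the rule.
AUTHORITY_ORDER = [
    "user instruction",
    "canonical specs",
    "tests",
    "existing code",
    "agent inference",
]


def highest_authority(disagreeing_sources):
    """Return the highest-ranked source among those that disagree."""
    for source in AUTHORITY_ORDER:
        if source in disagreeing_sources:
            return source
    raise ValueError("no known source in the conflict")


def needs_spec_update(disagreeing_sources):
    """True when a user instruction overrides canonical specs."""
    winner = highest_authority(disagreeing_sources)
    return winner == "user instruction" and "canonical specs" in disagreeing_sources


conflict = {"user instruction", "canonical specs", "existing code"}
print(highest_authority(conflict), needs_spec_update(conflict))
```

Keeping the spec-update check separate from the ranking reflects the rule's point that winning the ordering does not permit silently overwriting specs.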
+## Spec vs Code
+If canonical specs and code disagree, specs define intended behaviour by default.
+Agents should either align code with specs or ask whether the spec should change.
+## Spec vs Tests
+If tests and specs disagree, resolve the mismatch before relying on either one.
+Possible outcomes:
+- update tests to match specs;
+- update specs with explicit approval;
+- ask for clarification.
+## User Request vs Specs
+If the user requests behaviour that conflicts with canonical specs, the agent must treat it as a possible spec change.
+The task should include spec updates unless the user explicitly says not to persist the new behaviour, in which case the agent should clarify the intended lifecycle of the change.
+## Code vs Tests
+If tests fail against current code, agents must determine whether:
+- the code is wrong;
+- the test is stale or incorrect;
+- the spec is missing or ambiguous;
+- the environment is broken.
+Do not delete or weaken tests without understanding the mismatch.
+## Ambiguity
+Ask for clarification when resolving the conflict would require a product, domain, or architectural decision that is not already clear.
+Do not hide uncertainty by choosing the easiest implementation path.