npm - @alexandrealvaro/agentic - Versions diffs - 0.5.1-beta.1 → 0.6.0-beta.1 - Mend

@alexandrealvaro/agentic 0.5.1-beta.1 → 0.6.0-beta.1


Files changed (7)

package/README.md +32 -0
package/WORKFLOW.md +17 -27
package/package.json +1 -1
package/src/skills/claude-code/agentic-audit/SKILL.md +12 -2
package/src/skills/claude-code/agentic-philosophy/SKILL.md +2 -5
package/src/skills/codex/agentic-audit/SKILL.md +7 -0
package/src/skills/codex/agentic-philosophy/SKILL.md +1 -1

package/README.md CHANGED

@@ -88,6 +88,38 @@ npm install -g @alexandrealvaro/agentic@beta
 agentic init
 ```
+## Recommended daily sequence
+The kit ships nine universal skills plus three conditional ones — twelve discrete capabilities. The sequence below is a happy path through them for the three flows that cover most daily work. Skip steps that don't apply; the kit never enforces order.
+**Greenfield project, first non-trivial feature:**
+1. `agentic init` — install skills.
+2. `/agentic-bootstrap` — produce `AGENTS.md` (operational guide).
+3. `/agentic-architecture` — produce `ARCHITECTURE.md` once load-bearing patterns emerge.
+4. `/agentic-spec` — feature-level spec at `doc/specs/NNNN-<slug>.md` (User Scenarios, Requirements, Success Criteria).
+5. `/agentic-adr` — only when the feature forces a binding architectural decision worth recording for posterity.
+6. `/agentic-task` — work-unit decomposition; reference the spec via `Spec ref`.
+7. `/agentic-ground` — four-source research before code (`agentic-philosophy` auto-loads in parallel).
+8. Implement.
+9. `/agentic-review main..HEAD` — fresh-context §10 review before merge.
+10. `/agentic-audit` — periodic drift check across operational docs and specs.
+**Brownfield project, quick fix:**
+1. `agentic update` (only if you want upstream kit changes).
+2. Fix. `agentic-philosophy` auto-loads if the change is non-trivial.
+3. `/agentic-review` only if the fix is non-trivial. Trivial diffs skip the review.
+4. Commit.
+**Brownfield project, research-only ("what's the best way to add X?"):**
+1. `/agentic-ground` — runs the four-source research pass and surfaces the happy path with citations.
+2. Decide whether the answer becomes a spec (`/agentic-spec`) or a one-off task (`/agentic-task`).
+3. Continue from step 6 of the greenfield flow.
+The kit's discipline scales with the project's maturity. A solo PoC may legitimately skip `/agentic-spec` and `/agentic-adr` (the WORKFLOW §1 prune principle applies — don't add an artifact that wouldn't change agent behavior). A team product running on this kit is expected to use the full sequence; CI + hooks (deferred — see Task 0013) will eventually enforce the gates that today are advisory.
 ## Manual prompts
 If you prefer to skip the installer, the same artifacts can be generated by pasting prompts directly into your agent. Each prompt file has the literal text to copy, plus the matching template structure:

package/WORKFLOW.md CHANGED

@@ -62,16 +62,18 @@ Comments are exceptions. They justify *why* a non-obvious choice was made — ne
 ### Documentation Discipline
-Eight rules apply to every document the agent or the human writes in the project. They are framed against the failure modes the kit has actually seen — bloated `AGENTS.md`, README pages that drift into changelogs, decision artifacts diluted by speculation.
+The agent's authoritative copy of the eight-rule documentation discipline lives in the `agentic-philosophy` skill (`Documentation Discipline` section). The rules are summarized below for reference; the skill carries the full text agents read at session time. ADR-0008 records the canonical decision and the reconciliations against ADR-0004 (file-based task tracking) and ADR-0005 (universal agent behavior as a skill).
-1. **Definitions and decisions only.** Documentation captures what is true now and the decisions that brought it there. Speculation, history, and unfounded plans are out. A deferred decision is in scope when it is *recorded* — an accepted ADR or a task file is fundamentação; "we might do X later" without a record is speculation and is cut.
-2. **No dates, version stamps, `DRAFT` markers, or changelogs in narrative documents.** Applies to `README.md`, `AGENTS.md` / `CLAUDE.md`, `ARCHITECTURE.md`, `DESIGN.md`, specs, and any prose page. **Decision-record artifacts are exempt** — ADRs under `doc/adr/` (Nygard `Status` lifecycle requires a date) and tasks under `doc/tasks/` (append-only `Notes` log is the auditability surface). Use git history for narrative-doc evolution; keep the dated lifecycle inside the artifacts that need it.
-3. **No emoji anywhere.** Not in docs, code, source comments, commit messages, PR bodies, or skill outputs. Severity tags use plain words (`Blocker / Concern / Note`); status uses words (`accepted / superseded`); structural cues use Markdown.
-4. **Business context first.** Every document opens with *why* — the problem, the constraint, the user — before *what* and *how*. The first paragraph answers "what would break if this document didn't exist".
-5. **One scope per document. No duplication.** If two documents would say the same thing, link instead of copying. The canonical location owns the content; everywhere else references it.
-6. **Code is the primary documentation of behavior.** Comments justify *why* a non-obvious choice was made — never restate *what* the line does. If the comment is needed to explain *what*, rename or refactor.
-7. **No commented-out code; no orphan `TODO` / `FIXME` in source.** Every deferred item references a tracked work item — a GitHub Issue, or a per-task file under `doc/tasks/NNNN-*.md` (the kit's file-based tracking surface). The trace must be addressable from the source line.
-8. **Tests are living documentation of behavior.** Test names and assertions should read as the spec they enforce. When the spec changes, the test changes; when the test changes, the spec it documents must already have changed.
+1. **Definitions and decisions only.** No speculation, history, or unfounded plans.
+2. **No dates, version stamps, `DRAFT` markers, or changelogs in narrative documents.** Decision-record artifacts under `doc/adr/`, `doc/tasks/`, `doc/specs/` are exempt — their lifecycle fields are the auditability primitive.
+3. **No emoji anywhere.**
+4. **Business context first.**
+5. **One scope per document. No duplication.**
+6. **Code is the primary documentation of behavior.**
+7. **No commented-out code; no orphan `TODO` / `FIXME` in source.** Every deferred item references a GitHub Issue or a `doc/tasks/NNNN-*.md` task.
+8. **Tests are living documentation of behavior.**
+The skill body explains the rationale per rule, lists the failure modes the rules counter (bloated `AGENTS.md`, README pages drifting into changelogs, decision artifacts diluted by speculation), and walks through the reconciliations. Generator skills (`agentic-bootstrap`, `agentic-architecture`, `agentic-spec`, `agentic-task`, `agentic-adr`, `agentic-design`) reject violations of these rules at write time; `agentic-audit` flags drift across narrative docs and decision-record artifacts on demand.
 ## 3. Format by Evidence
@@ -86,29 +88,17 @@ Use XML when the prompt mixes instructions, retrieved context, examples, user in
 No format is universally best. **An observation from my practice, not benchmarked:** I've seen consistent gains when shifting prompts to XML — most noticeably with autonomous agents, where the prompt has to land alone without conversational refinement. Direct interactive use (Claude Code, Codex) tolerates loose Markdown; unattended agents don't. Claude in particular seems to respond well to XML, which I attribute to its training, but I haven't benchmarked it. Treat this as a starting hypothesis worth testing on your own target model and task before standardizing.
-## 4. Find the Happy Path
-Before implementing, ask:
-> *"What is the canonical, idiomatic way to implement [X] in [stack]? Cite official docs. List common deviations and why people take them."*
-Then check continuously, especially mid-implementation:
-> *"We are at step Y. Are we still on the happy path? If we deviated, was it deliberate?"*
-Sometimes you can't follow the happy path — that's fine. But always know where it is and why you left it.
-The kit ships `agentic-ground` (workflow-operational skill) as the bound implementation of §4 + §5: it runs a four-source research pass (official docs, validated open-source examples, in-repo patterns, git history), synthesizes the happy path with citations, and gates any deviation behind an irrefutable justification before code is written.
+## 4–5. Research Before Implementation
-## 5. Ground in Real Patterns
+Combines Find the Happy Path (canonical / idiomatic baseline) and Ground in Real Patterns (anchoring in project-specific examples). The kit treats both as one indivisible flow via `agentic-ground`; two prose sections would frame one operation as two separate practices.
-Don't dump the codebase into context. Anchor the model in a specific, project-relevant example.
+Two sub-practices, joined into one indivisible pass.
-> *"Find an existing example of [similar feature]; use that exact structure."*
+**Find the happy path.** Before implementing, ask: *"What is the canonical, idiomatic way to implement [X] in [stack]? Cite official docs. List common deviations and why people take them."* Mid-implementation: *"Are we still on the happy path? If we deviated, was it deliberate?"* Sometimes you can't follow the happy path — that's fine. Always know where it is and why you left it.
-Cite specific files, not "the codebase." Use just-in-time retrieval: pass paths or IDs and let the agent fetch via tools when it needs to read them.
+**Ground in real patterns.** Don't dump the codebase into context. Anchor the model in a specific, project-relevant example: *"Find an existing example of [similar feature]; use that exact structure."* Cite specific files, not "the codebase." Use just-in-time retrieval — pass paths or IDs and let the agent fetch via tools.
-`agentic-ground` (see §4) is the workflow-operational skill that combines this with the other three research sources — official docs, validated OSS, and git history — into one indivisible pre-implementation pass.
+The kit ships `agentic-ground` as the workflow-operational implementation of both. It runs a four-source research pass — official docs, validated open-source examples, in-repo patterns, git history — joined by AND not OR, synthesizes the happy path with citations from each source, and gates any deviation behind an irrefutable justification before code is written. Splitting the two sub-practices into separate skills would force two invocations with overlapping research outputs and fragment the synthesis context (ADR-0010).
 ## 6. Explore → Plan → Implement → Commit

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@alexandrealvaro/agentic",
-  "version": "0.5.1-beta.1",
+  "version": "0.6.0-beta.1",
   "description": "Bootstrap and audit AGENTS.md, ARCHITECTURE.md, ADRs, skills, and subagents for engineering production code with LLMs",
   "type": "module",
   "bin": {

package/src/skills/claude-code/agentic-audit/SKILL.md CHANGED

@@ -1,6 +1,6 @@
 ---
 name: agentic-audit
-description: Read-only drift audit — compare AGENTS.md, ARCHITECTURE.md, and ADR statuses against what the code actually does. Outputs a drift list, never writes files. Use when the user wants to audit, review for drift, sanity-check, or report inconsistencies between the repo's docs and its code.
+description: Read-only drift audit — compare AGENTS.md, ARCHITECTURE.md, ADR statuses, feature specs in doc/specs/, and documentation discipline against what the code actually does. Outputs a drift list, never writes files. Use when the user wants to audit, review for drift, sanity-check, or report inconsistencies between the repo's docs and its code.
 allowed-tools: Read, Glob, Grep, Bash
 ---
@@ -10,7 +10,7 @@ Read-only. Produces a drift list comparing the repo's operational docs against w
 ## Step 1 — Decide what to audit
-If the user names an artifact (`AGENTS.md`, `ARCHITECTURE.md`, ADRs), audit only that. Otherwise audit all three categories below.
+If the user names an artifact (`AGENTS.md`, `ARCHITECTURE.md`, ADRs, specs), audit only that. Otherwise audit all categories below.
 ## Step 2 — Run checks
@@ -34,6 +34,16 @@ If the user names an artifact (`AGENTS.md`, `ARCHITECTURE.md`, ADRs), audit only
 * Status field — every ADR has one of `proposed | accepted | deprecated | superseded by ADR-NNNN`.
 * Superseded chains — every "superseded by ADR-NNNN" target exists.
+### Spec drift (if `doc/specs/` exists)
+Structural integrity only — does **not** deep-audit spec text against shipped code (deferred per [ADR-0011](../../doc/adr/0011-agentic-spec-skill.md) Consequences).
+* Numbering — gaps or duplicates in `doc/specs/NNNN-*.md`?
+* Status field — every spec has one of `draft | accepted | shipped | superseded by SPEC-NNNN`.
+* Superseded chains — every "superseded by SPEC-NNNN" target exists.
+* Reciprocity — every task under `doc/tasks/NNNN-*.md` whose `Spec ref` field is non-empty points to a spec that exists. And every spec with `Status: accepted` or `shipped` has at least one entry in its `Related → Tasks` list (an accepted spec with no implementing task is a smell).
+* Success Criteria coverage — when every task that references a spec is `done`, the spec's Success Criteria checkboxes should all be checked. A spec with all tasks `done` but unchecked Success Criteria boxes is drift between work-unit completion and feature-level claim.
 ### Documentation discipline drift (`WORKFLOW.md` §2 / ADR-0008)
 Audit narrative documents — `README.md`, `AGENTS.md` / `CLAUDE.md`, `ARCHITECTURE.md`, `DESIGN.md`, any prose page under `doc/` that is not a decision-record artifact under `doc/adr/` or `doc/tasks/`:
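
Reviewer note: the first three spec-drift checks added in this hunk (numbering, status field, superseded chains) are mechanical enough to sketch. The snippet below is an illustration only, not the skill's implementation: the skill is a prompt, and the package ships no such script. The function names, the four-digit reading of `NNNN`, and the choice to look for gaps between the lowest and highest number found are all assumptions.

```python
import re
from collections import Counter

# Assumption: NNNN is exactly four digits, as in doc/specs/NNNN-<slug>.md.
SPEC_NAME = re.compile(r"^(\d{4})-.+\.md$")
# The four status values are taken from the diff; the regex shape is an assumption.
STATUS = re.compile(r"^(draft|accepted|shipped|superseded by SPEC-(\d{4}))$")


def numbering_drift(filenames):
    """Return (gaps, duplicates) among the spec numbers found in filenames."""
    numbers = [int(m.group(1)) for name in filenames if (m := SPEC_NAME.match(name))]
    counts = Counter(numbers)
    duplicates = sorted(n for n, c in counts.items() if c > 1)
    if not numbers:
        return [], duplicates
    # A gap is any number missing between the lowest and highest spec seen.
    gaps = [n for n in range(min(numbers), max(numbers) + 1) if n not in counts]
    return gaps, duplicates


def status_drift(statuses):
    """statuses maps spec number -> status string; return a list of findings."""
    findings = []
    for number, status in sorted(statuses.items()):
        match = STATUS.match(status.strip())
        if match is None:
            findings.append(f"SPEC-{number:04d}: invalid status {status!r}")
        elif match.group(2) and int(match.group(2)) not in statuses:
            # Superseded chain points at a spec that does not exist.
            findings.append(
                f"SPEC-{number:04d}: superseded target SPEC-{match.group(2)} missing"
            )
    return findings
```

Both functions take plain in-memory values (a list of filenames, a dict of statuses), so the sketch stays read-only in the same spirit as the audit skill. Collecting those values from `doc/specs/` is left out.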

package/src/skills/claude-code/agentic-philosophy/SKILL.md CHANGED

@@ -20,12 +20,9 @@ Before implementing:
 ## Ground Before Coding
-**Anchor in real patterns. Research the canonical path.**
+**Anchor in real patterns before writing code.**
-- Find the canonical/idiomatic way to do it. Note where you deviate and why.
-- Find an existing example in the codebase; reuse its structure.
-- Cite specific files, not "the codebase". Fetch via tools — don't dump code into context.
-- For non-trivial changes, explore (read-only) → plan → implement → commit. Skip for diffs you can describe in one sentence.
+For non-trivial changes, invoke `/agentic-ground` — the workflow-operational skill that runs the four-source research pass (official docs, validated open-source examples, in-repo patterns, git history) and synthesizes a happy path with citations. The skill carries the prescriptive deviation gate; this section carries the posture only. Skip for diffs you can describe in one sentence.
 ## Simplicity First

package/src/skills/codex/agentic-audit/SKILL.md CHANGED

@@ -29,6 +29,13 @@ ADR drift (if `doc/adr/` exists):
 - Status field — every ADR has one of `proposed | accepted | deprecated | superseded by ADR-NNNN`.
 - Superseded chains — every "superseded by ADR-NNNN" target exists.
+Spec drift (if `doc/specs/` exists; structural integrity only — does NOT deep-audit spec text against code, deferred per ADR-0011):
+- Numbering — gaps or duplicates in `doc/specs/NNNN-*.md`?
+- Status field — every spec has one of `draft | accepted | shipped | superseded by SPEC-NNNN`.
+- Superseded chains — every "superseded by SPEC-NNNN" target exists.
+- Reciprocity — every task with non-empty `Spec ref` points to a spec that exists; every accepted/shipped spec has at least one entry in its Related → Tasks list.
+- Success Criteria coverage — when every task referencing a spec is done, the spec's Success Criteria checkboxes should all be checked.
 Documentation discipline drift (`WORKFLOW.md` §2 / ADR-0008). Audit narrative documents — `README.md`, `AGENTS.md` / `CLAUDE.md`, `ARCHITECTURE.md`, `DESIGN.md`, any prose page under `doc/` that is not a decision-record artifact under `doc/adr/` or `doc/tasks/`:
 - Emoji — any present? Rule 3 forbids emoji anywhere (docs, code, comments, commits, skill outputs).
 - Dates / version stamps / `DRAFT` markers / changelog blocks in narrative documents — Rule 2 forbids these. Decision-record artifacts under `doc/adr/` and `doc/tasks/` are exempt.
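
Reviewer note: the Reciprocity and Success Criteria coverage checks added in this hunk amount to a cross-reference between tasks and specs. A minimal sketch follows, assuming the task and spec files have already been parsed into records. The `Task` and `Spec` dataclasses, their field names, and the finding messages are hypothetical; only the `Spec ref` field, the `done` / `accepted` / `shipped` values, and the rules themselves come from the diff.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Task:
    number: int
    status: str                     # "done" comes from the diff; other values are assumptions
    spec_ref: Optional[int] = None  # parsed from the task's `Spec ref` field, None if empty


@dataclass
class Spec:
    number: int
    status: str
    related_tasks: list = field(default_factory=list)     # task numbers under Related -> Tasks
    criteria_checked: list = field(default_factory=list)  # one bool per Success Criteria checkbox


def reciprocity_and_coverage(specs, tasks):
    """Return drift findings for the Reciprocity and Success Criteria rules."""
    by_number = {s.number: s for s in specs}
    findings = []
    # Reciprocity, task side: a non-empty Spec ref must point at an existing spec.
    for t in tasks:
        if t.spec_ref is not None and t.spec_ref not in by_number:
            findings.append(f"task {t.number:04d}: Spec ref {t.spec_ref:04d} does not exist")
    for s in specs:
        # Reciprocity, spec side: accepted or shipped specs need at least one task.
        if s.status in ("accepted", "shipped") and not s.related_tasks:
            findings.append(f"spec {s.number:04d}: {s.status} but no related tasks")
        # Coverage: all referencing tasks done implies all checkboxes checked.
        referencing = [t for t in tasks if t.spec_ref == s.number]
        if (
            referencing
            and all(t.status == "done" for t in referencing)
            and not all(s.criteria_checked)
        ):
            findings.append(f"spec {s.number:04d}: all tasks done but Success Criteria unchecked")
    return findings
```

The function only returns findings and never mutates its inputs, which matches the "outputs a drift list, never writes files" contract in the skill description.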

package/src/skills/codex/agentic-philosophy/SKILL.md CHANGED

@@ -10,7 +10,7 @@ Six behaviors apply to every non-trivial change. Bias toward caution over speed;
 <instructions>
 **Think Before Coding.** Don't assume. Don't hide confusion. Surface tradeoffs. State assumptions explicitly; ask when uncertain. If multiple interpretations exist, present them — don't pick silently. If a simpler approach exists, say so. If something is unclear, stop, name the confusion, ask.
-**Ground Before Coding.** Anchor in real patterns. Find the canonical/idiomatic way; note where you deviate and why. Find an existing example in the codebase and reuse its structure. Cite specific files, not "the codebase". Fetch via tools — don't dump code into context. For non-trivial changes, explore → plan → implement → commit. Skip for diffs describable in one sentence.
+**Ground Before Coding.** Anchor in real patterns before writing code. For non-trivial changes, invoke `/agentic-ground` — the workflow-operational skill that runs the four-source research pass (official docs, validated open-source examples, in-repo patterns, git history) and synthesizes a happy path with citations. The skill carries the prescriptive deviation gate; this section carries posture only. Skip for diffs describable in one sentence.
 **Simplicity First.** Minimum code that solves the problem. No features beyond what was asked. No abstractions for single-use code. No "flexibility" or "configurability" that wasn't requested. No error handling for impossible scenarios. Comments justify why, not what. No commented-out code; no orphan `TODO`/`FIXME` without an issue/ADR/follow-up reference. If 200 lines could be 50, rewrite.