npm - agent-eng - Versions diffs - 0.11.0 → 0.12.0 - Mend

agent-eng 0.11.0 → 0.12.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (13) hide show

package/README.md +2 -3
package/package.json +1 -1
package/src/init.js +2 -2
package/src/templates/.claude/agents/reviewer.md +10 -1
package/src/templates/.claude/scripts/update-status.sh +54 -0
package/src/templates/.claude/settings.json +31 -0
package/src/templates/CLAUDE.md +32 -3
package/src/templates/STATUS.md +48 -0
package/src/templates/architecture/decisions/0001-how-we-work.md +2 -6
package/src/templates/architecture/overview.md +1 -1
package/src/templates/orchestration.yaml +5 -37
package/src/templates/.claude/agents/custodian.md +0 -77
package/src/templates/.claude/agents/qa-tester.md +0 -82

package/README.md CHANGED Viewed

@@ -21,9 +21,8 @@ This creates the following structure in your project:
 │       ├── system-architect.md
 │       ├── planner.md
 │       ├── executor.md
-│       ├── qa-tester.md
 │       ├── reviewer.md
-│       └── custodian.md
+│       └── summarizer.md
 ├── architecture/
 │   ├── overview.md                        # High-level architecture overview
 │   └── decisions/
@@ -97,7 +96,7 @@ Defines the runtime system components, their tiers (client/service/engine/data),
 4. **Map the system** — Use the `system-architect` subagent to create your `architecture.yaml`
 5. **Plan the work** — Use the `planner` subagent to decompose your ADR into tickets
 6. **Execute** — Use the `executor` subagent to implement a ticket following your conventions
-7. **Test & Review** — Use the `qa-tester` and `reviewer` subagents to validate the work
+7. **Review** — Use the `reviewer` subagent to validate the work (tests are written during execution)
 ## License

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "agent-eng",
-  "version": "0.11.0",
+  "version": "0.12.0",
   "description": "Scaffold a structured agentic engineering workflow for AI-assisted development",
   "type": "module",
   "bin": {

package/src/init.js CHANGED Viewed

@@ -9,11 +9,10 @@ const TEMPLATES = join(__dirname, "templates");
 const STRUCTURE = [
   ".github/workflows/notify-site.yml",
   ".claude/settings.json",
+  ".claude/scripts/update-status.sh",
   ".claude/agents/architect.md",
-  ".claude/agents/custodian.md",
   ".claude/agents/executor.md",
   ".claude/agents/planner.md",
-  ".claude/agents/qa-tester.md",
   ".claude/agents/reviewer.md",
   ".claude/agents/summarizer.md",
   ".claude/agents/system-architect.md",
@@ -26,6 +25,7 @@ const STRUCTURE = [
   "tickets/example/001-example-ticket.md",
   "orchestration.yaml",
   "architecture.yaml",
+  "STATUS.md",
 ];
 function prompt(question) {

package/src/templates/.claude/agents/reviewer.md CHANGED Viewed

@@ -13,7 +13,8 @@ You are a reviewer agent. Your role is to review code changes against acceptance
 2. **Validate against ADRs** — Ensure changes follow established architectural decisions
 3. **Check conventions** — Verify code follows the relevant language conventions
 4. **Identify issues** — Flag problems but don't fix them directly
-5. **Provide actionable feedback** — Be specific about what needs to change
+5. **Check test adequacy** — Verify that tests exist for each acceptance criterion and cover key edge cases
+6. **Provide actionable feedback** — Be specific about what needs to change
 ## Constraints
@@ -31,6 +32,8 @@ You are a reviewer agent. Your role is to review code changes against acceptance
    - Do the changes violate any ADRs?
    - Do the changes follow conventions?
    - Are there obvious bugs or edge cases?
+   - Does each acceptance criterion have a corresponding test?
+   - Are critical edge cases (empty input, error states) covered by tests?
 4. Produce a review with:
    - Checklist of acceptance criteria (pass/fail)
    - List of issues (if any) with specific locations
@@ -47,6 +50,12 @@ You are a reviewer agent. Your role is to review code changes against acceptance
 - [ ] Criterion 2 — **Not satisfied**: missing error handling for empty input
 - [x] Criterion 3 — Satisfied
+### Test Coverage
+- [x] Criterion 1 — Tested in `tests/feature.test.ts:12`
+- [ ] Criterion 2 — **No test found** for empty input handling
+- [x] Criterion 3 — Tested in `tests/feature.test.ts:28`
 ### Issues
 1. **Convention violation** (`src/file.ts:15`): Missing type annotation on `processData` return value. See `conventions/typescript.md`.

package/src/templates/.claude/scripts/update-status.sh ADDED Viewed

@@ -0,0 +1,54 @@
+#!/usr/bin/env bash
+set -euo pipefail
+STATUS="STATUS.md"
+[ -f "$STATUS" ] || exit 0
+BRANCH=$(git rev-parse --abbrev-ref HEAD 2>/dev/null || echo "unknown")
+TIMESTAMP=$(date -u +"%Y-%m-%d %H:%M UTC")
+# Update timestamp
+sed -i.bak "s|^> Last updated:.*|> Last updated: ${TIMESTAMP}|" "$STATUS"
+rm -f "${STATUS}.bak"
+# Build commits table (last 10 commits)
+TMPDIR="${TMPDIR:-/tmp}"
+COMMITS_TMP="${TMPDIR}/_ae_commits.tmp"
+FILES_TMP="${TMPDIR}/_ae_files.tmp"
+{
+  echo "**Branch:** \`${BRANCH}\`  "
+  echo "**Last commit:** ${TIMESTAMP}"
+  echo ""
+  echo "| Hash | Date | Message |"
+  echo "|------|------|---------|"
+  git log --format="| \`%h\` | %ad | %s |" --date=short -10 2>/dev/null || echo "| - | - | No commits yet |"
+} > "$COMMITS_TMP"
+# Build recent file changes
+COMMIT_COUNT=$(git rev-list --count HEAD 2>/dev/null || echo "0")
+{
+  echo "**Files changed (last 5 commits):**"
+  echo ""
+  echo '```'
+  if [ "$COMMIT_COUNT" -ge 5 ] 2>/dev/null; then
+    git diff --stat HEAD~5..HEAD 2>/dev/null | head -20
+  elif [ "$COMMIT_COUNT" -gt 1 ] 2>/dev/null; then
+    git diff --stat HEAD~$((COMMIT_COUNT - 1))..HEAD 2>/dev/null | head -20
+  else
+    echo "Not enough commits yet."
+  fi
+  echo '```'
+} > "$FILES_TMP"
+# Replace AUTO:START..AUTO:END and AUTO:FILES:START..AUTO:FILES:END
+awk '
+/<!-- AUTO:START -->/ { print; while((getline line < "'"$COMMITS_TMP"'") > 0) print line; skip=1; next }
+/<!-- AUTO:END -->/ { skip=0 }
+/<!-- AUTO:FILES:START -->/ { print; while((getline line < "'"$FILES_TMP"'") > 0) print line; skip=1; next }
+/<!-- AUTO:FILES:END -->/ { skip=0 }
+skip { next }
+{ print }
+' "$STATUS" > "${STATUS}.tmp" && mv "${STATUS}.tmp" "$STATUS"
+rm -f "$COMMITS_TMP" "$FILES_TMP"

package/src/templates/.claude/settings.json CHANGED Viewed

@@ -1,4 +1,35 @@
 {
+  "hooks": {
+    "PostToolUse": [
+      {
+        "matcher": "Bash",
+        "hooks": [
+          {
+            "type": "command",
+            "if": "Bash(git commit *)",
+            "command": "bash \"$CLAUDE_PROJECT_DIR\"/.claude/scripts/update-status.sh",
+            "timeout": 30
+          },
+          {
+            "type": "command",
+            "if": "Bash(git push *)",
+            "command": "bash \"$CLAUDE_PROJECT_DIR\"/.claude/scripts/update-status.sh",
+            "timeout": 30
+          }
+        ]
+      },
+      {
+        "matcher": "Write|Edit",
+        "hooks": [
+          {
+            "type": "command",
+            "command": "lines=$(wc -l < CLAUDE.md 2>/dev/null) && [ \"${lines:-0}\" -gt 200 ] && echo \"CLAUDE.md is ${lines} lines (limit: 200). Move content to linked files.\" || true",
+            "timeout": 10
+          }
+        ]
+      }
+    ]
+  },
   "mcpServers": {
     "context7": {
       "command": "npx",

package/src/templates/CLAUDE.md CHANGED Viewed

@@ -11,10 +11,8 @@ Agents own the **process** — architecture decisions, work decomposition, quali
 | **Decide** | `/architect` agent | New feature, significant design choice, unclear requirements |
 | **Map** | `/system-architect` agent | New system or major structural change |
 | **Decompose** | `/planner` agent | ADR/spec ready, work needs to be broken into tickets |
-| **Execute** | Claude Code **plan mode** (`shift+tab`) | Implementing a specific ticket |
-| **Test** | `/qa-tester` agent | Feature implementation complete |
+| **Execute** | Claude Code **plan mode** (`shift+tab`) | Implementing a specific ticket (includes writing tests) |
 | **Review** | `/reviewer` agent | Code and tests ready for validation |
-| **Maintain** | `/custodian` agent | After a batch of work, or CLAUDE.md > 200 lines |
 | **Report** | `/summarizer` agent | Sprint or feature complete, stakeholder update needed |
 ### Why hybrid?
@@ -39,6 +37,15 @@ Agents own the **process** — architecture decisions, work decomposition, quali
 4. After implementation: run `/reviewer` to validate against acceptance criteria
 5. If the ticket touches an existing ADR's scope, verify the decision still holds
+## Testing in Plan Mode
+Plan mode writes tests as part of implementing each ticket. For every acceptance criterion:
+1. Write at least one automated test that verifies it
+2. Cover edge cases (empty, null, boundary values) and error handling
+3. Run the tests and confirm they pass before marking the ticket done
+Follow the project's existing test framework and patterns. Test observable behavior, not implementation details.
 ## Before Starting Any Ticket (in plan mode)
 1. Read the ticket fully, including all linked documents
@@ -70,6 +77,28 @@ When work can be parallelized, spin up sub-agents for independent tasks concurre
 - `tickets/` — Work items organized by feature folder, with `_backlog.md` as the sprint board
 - `conventions/` — Language and framework coding standards
 - `.claude/agents/` — Subagent definitions for each role
+- `STATUS.md` — Live project dashboard (auto-updated git data + manually maintained context)
+## CLAUDE.md Maintenance
+Keep this file lean and current (target: under 200 lines). A hook warns when it exceeds the limit.
+- When you discover a new gotcha or pattern, add it here or to the appropriate linked file
+- Route large or specialized content to separate files (e.g., `conventions/`, `docs/`) and link from here
+- Remove stale entries that no longer reflect how the project works
+- Never duplicate information that already lives in a linked file
+## STATUS.md Maintenance
+STATUS.md is a live project dashboard. Git sections (branch, commits, file changes) auto-update via a hook on every commit/push. You maintain the semantic sections:
+**Update after significant milestones** (completing a ticket, finishing a phase, hitting a blocker):
+1. **Current Phase** — Mark the active workflow phase(s) from the table
+2. **Active Work** — One paragraph: what feature/ticket is in progress, next step
+3. **Open Tickets** — Snapshot from `tickets/_backlog.md`
+4. **Risks & Blockers** — Add blockers; remove resolved ones
+5. **Session Log** — One-line entry with today's date and what was accomplished
+Keep updates brief. STATUS.md is a dashboard, not a report — use `/summarizer` for detailed retrospectives.
 ## MCP Servers

package/src/templates/STATUS.md ADDED Viewed

@@ -0,0 +1,48 @@
+# Project Status
+> Last updated: _not yet updated_
+## Current Phase
+| Phase | Status |
+|-------|--------|
+| Decide | |
+| Map | |
+| Decompose | |
+| Execute | |
+| Test | |
+| Review | |
+| Maintain | |
+| Report | |
+## Active Work
+_What is currently being worked on, the relevant tickets, and the expected next step._
+## Branch & Commits
+<!-- AUTO:START -->
+_Commit or run `.claude/scripts/update-status.sh` to populate._
+<!-- AUTO:END -->
+## Recent File Changes
+<!-- AUTO:FILES:START -->
+_Commit or run `.claude/scripts/update-status.sh` to populate._
+<!-- AUTO:FILES:END -->
+## Open Tickets
+| Ticket | Feature | Status |
+|--------|---------|--------|
+| | | |
+## Risks & Blockers
+- None currently
+## Session Log
+| Date | Summary |
+|------|---------|
+| | |

package/src/templates/architecture/decisions/0001-how-we-work.md CHANGED Viewed

@@ -25,13 +25,9 @@ We adopt a **hybrid approach**: specialized agents own the process, Claude Code
 2. **Planner** — Takes specs and ADRs as input. Decomposes work into tickets scoped for plan mode sessions. Does not implement.
-3. **QA Tester** — Writes automated tests after implementation. Covers acceptance criteria, edge cases, and regressions.
+3. **Reviewer** — Reviews diffs against acceptance criteria and linked ADRs. Checks for convention violations and test adequacy. Does not fix issues directly — flags them for fixing.
-4. **Reviewer** — Reviews diffs against acceptance criteria and linked ADRs. Checks for convention violations. Does not fix issues directly — flags them for fixing.
-5. **Custodian** — Keeps CLAUDE.md lean and current. Routes large content to external files.
-6. **Summarizer** — Produces non-technical executive summaries of completed work.
+4. **Summarizer** — Produces non-technical executive summaries of completed work.
 ### Plan mode (execution)

package/src/templates/architecture/overview.md CHANGED Viewed

@@ -17,7 +17,7 @@ Requirement → Architect → ADR/Spec → Planner → Tickets → Plan Mode →
                 (agent)                (agent)            (built-in)        (agent)
 ```
-**Process phases** (agents): Architect, System Architect, Planner, Reviewer, QA Tester, Custodian, Summarizer
+**Process phases** (agents): Architect, System Architect, Planner, Reviewer, Summarizer
 **Execution phase** (plan mode): Claude Code's built-in plan mode implements each ticket in a focused session
 ## When to Skip the Pipeline

package/src/templates/orchestration.yaml CHANGED Viewed

@@ -1,6 +1,6 @@
 name: Hybrid Agentic Workflow
 description: >
-  Agents own the process (decisions, planning, review, maintenance).
+  Agents own the process (decisions, planning, review).
   Claude Code plan mode owns execution (implementing tickets).
 agents:
@@ -68,18 +68,6 @@ agents:
     color: gray
     docLink: /.claude/agents/executor.md
-  - id: qa-tester
-    kind: validation
-    title: QA Tester
-    tagline: Writes automated tests for completed features
-    description: Writes automated tests after implementation. Covers acceptance criteria, edge cases, and regression scenarios.
-    outputs:
-      - Tests
-      - Coverage report
-      - Test plan
-    color: green
-    docLink: /.claude/agents/qa-tester.md
   - id: reviewer
     kind: validation
     title: Reviewer
@@ -92,17 +80,6 @@ agents:
     color: amber
     docLink: /.claude/agents/reviewer.md
-  - id: custodian
-    kind: maintenance
-    title: Custodian
-    tagline: Keeps CLAUDE.md lean, current, and well-routed
-    description: Maintains the project's CLAUDE.md file — adds new patterns and gotchas, removes stale entries, and routes large content to external files. Enforces a 150–200 line limit.
-    outputs:
-      - Updated CLAUDE.md
-      - Extracted reference files
-    color: blue
-    docLink: /.claude/agents/custodian.md
   - id: summarizer
     kind: reporting
     title: Summarizer
@@ -140,17 +117,13 @@ connections:
   # Validation phase
   - from: plan-mode
-    to: qa-tester
-    artifact: Code
+    to: reviewer
+    artifact: Code · Tests
   - from: executor
-    to: qa-tester
-    artifact: Code
-    type: fallback
-  - from: qa-tester
     to: reviewer
     artifact: Code · Tests
+    type: fallback
   # Feedback loops
   - from: reviewer
@@ -164,12 +137,7 @@ connections:
     artifact: Feedback
     type: feedback
-  # Maintenance triggers
-  - from: reviewer
-    to: custodian
-    artifact: Completed work context
-    type: trigger
+  # Reporting trigger
   - from: reviewer
     to: summarizer
     artifact: Completed work context

package/src/templates/.claude/agents/custodian.md DELETED Viewed

@@ -1,77 +0,0 @@
----
-name: custodian
-description: Use periodically (after a batch of tickets, or when CLAUDE.md grows past 200 lines) to keep CLAUDE.md lean, current, and routed to external files. Modifies only CLAUDE.md and the files it links to. Also keeps orchestration.yaml in sync when agents are added or removed.
-tools: Read, Write, Edit, Grep, Glob
-model: haiku
----
-You are a custodian agent. Your role is to maintain the project's `CLAUDE.md` file and `orchestration.yaml` — keeping them accurate, lean, and in sync.
-## Responsibilities
-1. **Keep CLAUDE.md current** — Update it with new patterns, gotchas, and conventions discovered during development
-2. **Keep CLAUDE.md lean** — The file must stay between 150–200 lines max to prevent context bloat
-3. **Route to external files** — Large or specialized content belongs in separate files that CLAUDE.md links to, so the main context only loads them when needed
-4. **Remove stale content** — Delete entries that no longer reflect how the project works
-5. **Sync orchestration.yaml** — When agents are added, removed, or renamed in `.claude/agents/`, update the agents list and connections in `orchestration.yaml` to match
-## Constraints
-- You only modify `CLAUDE.md`, `orchestration.yaml`, and the files `CLAUDE.md` routes to — you do not write application code
-- You never exceed 200 lines in `CLAUDE.md`
-- You preserve the existing structure and section ordering unless restructuring is necessary to stay within the line budget
-- You do not duplicate information that already lives in linked files
-## Process
-1. Read the current `CLAUDE.md` and count its lines
-2. Review recent work context — what patterns, gotchas, or conventions were discovered?
-3. Decide what to update:
-   - **Add** new patterns or gotchas that would help future sessions
-   - **Remove** stale or outdated entries
-   - **Route out** any section that has grown too large — extract it to a dedicated file and replace it with a one-line link
-4. After editing, verify the line count is within 150–200 lines
-5. If over 200 lines, identify what to extract or trim
-6. Compare `.claude/agents/*.md` files against `orchestration.yaml` — add missing agents, remove stale entries, and verify connections still make sense
-## What belongs in CLAUDE.md
-- Project identity (one sentence: what this project is)
-- Workflow overview (role table, key file paths)
-- Active conventions and patterns worth knowing upfront
-- Links to deeper references (not the references themselves)
-- Current gotchas that would surprise a new session
-## What belongs in linked files instead
-- Detailed coding conventions → `conventions/<language>.md`
-- Business context or domain knowledge → `docs/context/<topic>.md`
-- Style guides → `conventions/style.md` or similar
-- API contracts or integration details → `docs/<integration>.md`
-- Large workflow / role instructions → `.claude/agents/<role>.md`
-## Routing format
-When linking out to a separate file, use this pattern in CLAUDE.md:
-```markdown
-- **Topic name** — One-line summary. See `path/to/detail.md`
-```
-The linked file should be self-contained so it makes sense when read independently.
-## When to run
-This agent should be invoked:
-- After a batch of tickets is completed (to capture new patterns)
-- When CLAUDE.md is approaching or exceeding 200 lines
-- When a new convention or gotcha is discovered during development
-- Periodically as a hygiene pass
-## Anti-patterns to Avoid
-- Letting CLAUDE.md grow past 200 lines
-- Inlining large blocks of content that belong in separate files
-- Removing links to files without checking if the linked file still exists
-- Adding entries that duplicate what's already in a linked file
-- Writing vague entries ("be careful with X") instead of specific ones ("X requires Y because Z")

package/src/templates/.claude/agents/qa-tester.md DELETED Viewed

@@ -1,82 +0,0 @@
----
-name: qa-tester
-description: Use after a feature is implemented to write automated tests covering the acceptance criteria, edge cases, and regressions. Does not modify feature code.
-tools: Read, Write, Edit, Grep, Glob, Bash
-model: sonnet
----
-You are a QA tester agent. Your role is to write automated tests for completed features, ensuring they meet acceptance criteria and catch regressions.
-## Responsibilities
-1. **Read the ticket** — Understand what was implemented and its acceptance criteria
-2. **Read the code** — Study the implementation to understand behavior, edge cases, and boundaries
-3. **Write automated tests** — Produce tests that verify the feature works as specified
-4. **Cover edge cases** — Test boundaries, error states, and invalid inputs
-5. **Ensure regressions are caught** — Tests should break if the feature's behavior changes unexpectedly
-## Constraints
-- You **write tests only**, you do not modify feature code
-- You follow the project's existing test conventions and frameworks
-- You do not introduce new test dependencies without explicit approval
-- You test observable behavior, not implementation details
-- You write the minimum number of tests needed for confidence, not the maximum
-## Process
-1. Read the ticket and its acceptance criteria
-2. Read the implementation code (the Executor's output)
-3. Identify the project's test framework, patterns, and file locations
-4. For each acceptance criterion, write at least one test that verifies it
-5. Add tests for:
-   - Happy path (expected inputs → expected outputs)
-   - Edge cases (empty, null, boundary values)
-   - Error handling (invalid inputs, failure modes)
-   - Integration points (if the feature touches other modules)
-6. Run the tests and confirm they pass
-7. Produce a test summary
-## Output Format
-```markdown
-## Test Plan: Ticket Title
-### Test File(s)
-- `tests/feature.test.ts` — Unit tests for core logic
-- `tests/feature.integration.test.ts` — Integration tests (if applicable)
-### Coverage
-| Acceptance Criterion | Test(s) | Status |
-|---|---|---|
-| Criterion 1 | `should handle valid input` | Pass |
-| Criterion 2 | `should reject empty input`, `should reject null` | Pass |
-| Criterion 3 | `should return paginated results` | Pass |
-### Edge Cases
-- Empty input → returns empty result (not an error)
-- Concurrent calls → no race conditions
-- Large input (10k items) → completes within timeout
-### Summary
-X tests written, all passing. Covers N/N acceptance criteria.
-```
-## Test Quality Guidelines
-- **Descriptive names** — Test names should read as specifications: `should return 404 when user not found`
-- **Arrange-Act-Assert** — Each test has a clear setup, action, and verification
-- **One assertion per concept** — A test should verify one behavior, though multiple assertions are fine if they verify the same thing
-- **No test interdependence** — Tests must not depend on execution order or shared mutable state
-- **Fast by default** — Unit tests should be fast; mark slow integration tests explicitly
-## What NOT to Test
-- Third-party library internals
-- Private implementation details that may change without affecting behavior
-- Exact error message strings (test error types instead)
-- Configurations that are already validated by the framework