meto-cli 0.12.0 → 0.13.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,42 @@
# ai/rubric/

This directory holds the tester rubric and related evaluation artifacts for the project.

---

## Files

| File | Purpose |
|------|---------|
| `tester-rubric.md` | Blank rubric template — copy and fill in for each slice evaluation |
| `tester-calibration-log.md` | Running log of past misjudgments and corrected thresholds (added in slice-090) |
| `slice-NNN-score.md` | Completed rubric for slice NNN — created by @meto-tester after each evaluation |

---

## Score-to-Outcome Mapping

| Score pattern | Outcome | What happens next |
|---------------|---------|-------------------|
| All dimensions score 3 | **Clean Pass** | Task moves to `tasks-done.md` immediately |
| All dimensions score 2 or higher (at least one 2) | **Conditional Pass** | Task moves to `tasks-done.md`; tester opens a follow-up task for each 2-scored dimension if the issue is worth tracking |
| Any dimension scores 1 | **Automatic Fail** | Task moves back to `tasks-todo.md`; the full rubric with critiques is attached so the developer knows exactly what to fix |
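
Because the mapping is deterministic, it can be sketched as a small function. This is a hypothetical illustration only: the `Score` and `Outcome` types and the `outcomeFor` helper are not part of meto-cli.

```typescript
// Hypothetical sketch of the score-to-outcome mapping; names are illustrative.
type Score = 1 | 2 | 3;
type Outcome = "Clean Pass" | "Conditional Pass" | "Automatic Fail";

function outcomeFor(scores: Score[]): Outcome {
  if (scores.some((s) => s === 1)) return "Automatic Fail"; // any 1 fails outright
  if (scores.every((s) => s === 3)) return "Clean Pass";    // all 3s pass cleanly
  return "Conditional Pass";                                // all >= 2, at least one 2
}

console.log(outcomeFor([3, 3, 3, 3, 3])); // Clean Pass
console.log(outcomeFor([3, 2, 3, 3, 3])); // Conditional Pass
console.log(outcomeFor([3, 3, 2, 1, 3])); // Automatic Fail
```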

---

## Workflow

1. @meto-tester copies `tester-rubric.md` to `ai/rubric/slice-NNN-score.md` (where NNN is the slice ID).
2. Tester runs all verification commands (`npx vitest run`, `npx tsc --noEmit`, lint) and records exit codes in the rubric table.
3. Tester scores each of the 5 dimensions and writes critiques for any dimension below 3.
4. Tester writes the overall outcome (PASS / CONDITIONAL PASS / FAIL) and signs off.
5. Tester moves the task to `tasks-done.md` (PASS / CONDITIONAL PASS) or back to `tasks-todo.md` (FAIL), attaching the completed rubric file path in the task block.
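
Step 2 (run each verification command and record its exit code) can be sketched in Node. This helper is illustrative, not part of meto-cli; the stand-in `node -e` commands take the place of the real `npx vitest run`, `npx tsc --noEmit`, and lint invocations so the sketch is self-contained.

```typescript
import { spawnSync } from "node:child_process";

// Stand-in commands; a real evaluation would run
// ["npx", ["vitest", "run"]], ["npx", ["tsc", "--noEmit"]], and the lint command.
const commands: Array<[string, string[]]> = [
  ["node", ["-e", "process.exit(0)"]], // simulates a passing test suite
  ["node", ["-e", "process.exit(2)"]], // simulates a failing type check
];

// Render each command and its exit code as a row for the rubric's
// "Verification Commands Run" table.
const rows = commands.map(([cmd, args]) => {
  const { status } = spawnSync(cmd, args, { encoding: "utf8" });
  return `| \`${[cmd, ...args].join(" ")}\` | ${status} | |`;
});

console.log(rows.join("\n"));
```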

---

## NEVER DO

- Do not return a binary "pass" or "fail" without completing the rubric table.
- Do not skip running the verification commands — reading code is not evidence.
- Do not modify a completed rubric after sign-off. If a past score was wrong, record the correction in `tester-calibration-log.md`.
- Do not delete slice score files. They are the audit trail for the project's quality history.

@@ -0,0 +1,71 @@
# Tester Calibration Log — {{PROJECT_NAME}}

<!-- LOCATION: This file lives at ai/rubric/tester-calibration-log.md in the scaffolded project. -->
<!-- @meto-tester MUST read this file at session start before evaluating any slice. -->
<!-- Update "Current Calibration Rules" every time a new entry is added below. -->

---

## Current Calibration Rules

<!-- INSTRUCTIONS: Keep this section current. Each time you add a new log entry, -->
<!-- distil the rule from that entry into the list below. This section is the active -->
<!-- working rule set @meto-tester applies on every evaluation. -->
<!-- Delete example rules and replace with real ones when the first real entry is added. -->

1. **Test Coverage — partial implementation still scores 2, not 1** — If the implementation covers all acceptance criteria but lacks tests for edge cases that were never specified in the contract, score 2 (partial) rather than 1 (fail). Score 1 only when an AC is provably untested.
2. **Convention Adherence — a single stray debug log is score 2, not 1** — A committed `console.log` left in a non-critical path is a partial violation (score 2). Score 1 only when debug output meaningfully pollutes production behaviour or multiple violations exist.

---

## Log Entries

<!-- INSTRUCTIONS: Add new entries at the TOP (reverse-chronological). -->
<!-- Copy the entry template below for each new entry. -->
<!-- Fictional slice IDs (slice-000, slice-001) are used for illustration — delete these examples -->
<!-- once the first real entry is recorded. -->

---

### Entry 2 — Example

| Field | Value |
|---|---|
| **Date** | 2026-01-15 |
| **Slice ID** | slice-001 |
| **Dimension Affected** | Convention Adherence |
| **What was scored incorrectly** | Scored 1 (fail) because one `console.log` was found in a utility helper; the correct threshold is score 2 (partial) for a single isolated debug statement that does not affect production output |
| **Correct score in retrospect** | 2 — partial violation, not a full fail |
| **Rule update** | A single isolated `console.log` in a non-critical path is a partial violation (score 2); only score 1 when multiple violations exist or when debug output pollutes production behaviour |

---

### Entry 1 — Example

| Field | Value |
|---|---|
| **Date** | 2026-01-08 |
| **Slice ID** | slice-000 |
| **Dimension Affected** | Test Coverage |
| **What was scored incorrectly** | Scored 1 (fail) because an edge case not listed in the sprint contract had no test; the implementation itself covered all agreed acceptance criteria |
| **Correct score in retrospect** | 2 — all contracted criteria were covered; the missing edge case was out of scope for the contract |
| **Rule update** | Score Test Coverage based on contracted acceptance criteria only; uncontracted edge cases lower the score to 2 (partial) but never to 1 (fail) unless an AC is provably untested |

---

## Entry Template

<!-- Copy this block for each new entry. Replace all placeholder values. -->

<!--
### Entry N — [brief label]

| Field | Value |
|---|---|
| **Date** | YYYY-MM-DD |
| **Slice ID** | slice-NNN |
| **Dimension Affected** | [Code Quality / Type Safety / Test Coverage / Convention Adherence / Methodology Compliance] |
| **What was scored incorrectly** | [Describe the score you gave and why it was wrong] |
| **Correct score in retrospect** | [1 / 2 / 3 — and why] |
| **Rule update** | [One sentence: the threshold adjustment to apply in all future evaluations of this dimension] |
-->

@@ -0,0 +1,140 @@
# Tester Rubric — Slice {{SLICE_ID}}

<!-- Fill in all fields before submitting the rubric. Incomplete rubrics are not valid evaluations. -->

**Tester:** {{TESTER_NAME}}
**Date:** {{DATE}}
**Slice:** {{SLICE_ID}}

---

## Grading Dimensions

Score each dimension on a 1–3 scale using the criteria below. Every score below 3 requires a written critique (see the Critique Format section).

---

### 1. Code Quality

| Score | Criteria |
|-------|----------|
| 3 | Code is readable and well-structured, with no dead code, no magic numbers, and descriptive identifiers throughout |
| 2 | Mostly readable with minor issues: one or two unclear names, a stray commented block, or a magic number with obvious intent |
| 1 | Code is difficult to follow, contains dead code, relies on unexplained magic numbers, or has significant structural problems |

**Score:** ___

**Critique (required if score < 3):**

<!-- One sentence of actionable feedback. Point to a specific file and line number where possible. Example: "src/cli/audit/scanner.ts:42 uses the magic number 20 — extract it as a named constant MAX_BAR_WIDTH." -->

---

### 2. Type Safety

| Score | Criteria |
|-------|----------|
| 3 | No `any` types anywhere; `tsc --noEmit` exits 0 with zero errors or warnings |
| 2 | One suppressed `any` with a clear justification comment, or `tsc` exits 0 but with a minor type assertion that is technically safe |
| 1 | `any` used without justification, `tsc --noEmit` exits non-zero, or type errors are suppressed silently |

**Score:** ___

**Critique (required if score < 3):**

<!-- One sentence. Example: "src/cli/init/renderer.ts:88 uses `as any` to bypass a union type — the correct fix is to narrow the type with a type guard." -->

---

### 3. Test Coverage

| Score | Criteria |
|-------|----------|
| 3 | Failing test written before implementation (red→green confirmed), every acceptance criterion has at least one corresponding test, and `npx vitest run` exits 0 |
| 2 | Tests cover most acceptance criteria but one AC has no direct test, or the red→green order cannot be verified but tests are otherwise complete |
| 1 | Tests added after the fact without red→green discipline, one or more acceptance criteria have no test, or `npx vitest run` exits non-zero |

**Score:** ___

**Critique (required if score < 3):**

<!-- One sentence. Example: "The 'minimum pass threshold' AC has no test — add a test asserting that a dimension scored 1 causes the rubric evaluation to return FAIL regardless of other scores." -->

---

### 4. Convention Adherence

| Score | Criteria |
|-------|----------|
| 3 | Commit message matches the format `type(scope): description [agent-tag]`, file naming follows project conventions, no `console.log`, and no hardcoded scaffold content in `/src/` |
| 2 | One minor convention violation (e.g. a missing agent tag in the commit, a slightly inconsistent file name) with no functional impact |
| 1 | Commit message does not follow the format, `console.log` present in committed code, scaffold content hardcoded in source, or multiple convention violations |

**Score:** ___

**Critique (required if score < 3):**

<!-- One sentence. Example: "Commit a3f9c1b is missing the [dev-agent] tag — amend or note in the next commit message." -->

---

### 5. Methodology Compliance

| Score | Criteria |
|-------|----------|
| 3 | Sprint contract exists at `ai/contracts/slice-{{SLICE_ID}}-contract.md` and is signed by both agents, the task definition was followed exactly, and no out-of-scope work was delivered |
| 2 | Sprint contract exists and is signed but has one minor gap (e.g. an edge case that was agreed verbally but not written into the contract), or a very small out-of-scope addition with clear justification |
| 1 | No sprint contract, contract not signed before code was written, significant out-of-scope work delivered, or the task definition was materially deviated from |

**Score:** ___

**Critique (required if score < 3):**

<!-- One sentence. Example: "ai/contracts/slice-088-contract.md was not signed before implementation began — retroactive sign-off does not satisfy the contract-first rule." -->

---

## Minimum Pass Threshold

<!-- Do not modify this section. It is part of the rubric definition. -->

- All dimensions must score **2 or higher** to pass.
- Any dimension scoring **1 is an automatic fail** — the slice returns to todo regardless of other scores.
- Any dimension scoring **2** (with no 1s) is a **conditional pass** — the tester must note in the rubric what must be improved, either in the next slice or in a follow-up task.
- A score of 3 across all dimensions is a **clean pass**.

**Overall result:** PASS / CONDITIONAL PASS / FAIL _(circle one)_

---

## Critique Format

For each dimension that scored below 3, write exactly one sentence of actionable feedback. The sentence must:

1. Name the specific problem (not "code quality issues" — name the actual issue)
2. Point to a file and line number where possible
3. State the corrective action the developer should take

**Do not write critiques for dimensions that scored 3.** Silence on a dimension means it passed cleanly.

---

## Verification Commands Run

<!-- List every command you ran and its exit code. Never evaluate by reading code alone. -->

| Command | Exit Code | Notes |
|---------|-----------|-------|
| `npx vitest run` | | |
| `npx tsc --noEmit` | | |
| _(lint command if present)_ | | |

---

## Final Sign-off

By submitting this rubric, I confirm that I ran all verification commands in the current session and that every score reflects observed evidence, not assumption.

**Tester:** {{TESTER_NAME}}
**Date:** {{DATE}}
**Outcome:** PASS / CONDITIONAL PASS / FAIL _(circle one)_