@cleocode/skills 2026.4.4 → 2026.4.6

Files changed (35)
  1. package/package.json +1 -1
  2. package/skills/ct-adr-recorder/SKILL.md +175 -0
  3. package/skills/ct-adr-recorder/manifest-entry.json +30 -0
  4. package/skills/ct-adr-recorder/references/cascade.md +82 -0
  5. package/skills/ct-adr-recorder/references/examples.md +141 -0
  6. package/skills/ct-artifact-publisher/SKILL.md +146 -0
  7. package/skills/ct-artifact-publisher/manifest-entry.json +30 -0
  8. package/skills/ct-artifact-publisher/references/artifact-types.md +126 -0
  9. package/skills/ct-artifact-publisher/references/handler-interface.md +187 -0
  10. package/skills/ct-consensus-voter/SKILL.md +158 -0
  11. package/skills/ct-consensus-voter/manifest-entry.json +30 -0
  12. package/skills/ct-consensus-voter/references/matrix-examples.md +140 -0
  13. package/skills/ct-grade/references/token-tracking.md +2 -2
  14. package/skills/ct-grade/scripts/run_all.py +1 -1
  15. package/skills/ct-grade-v2-1/manifest-entry.json +1 -1
  16. package/skills/ct-ivt-looper/SKILL.md +181 -0
  17. package/skills/ct-ivt-looper/manifest-entry.json +30 -0
  18. package/skills/ct-ivt-looper/references/escalation.md +91 -0
  19. package/skills/ct-ivt-looper/references/frameworks.md +119 -0
  20. package/skills/ct-ivt-looper/references/loop-anatomy.md +156 -0
  21. package/skills/ct-orchestrator/manifest-entry.json +1 -1
  22. package/skills/ct-provenance-keeper/SKILL.md +161 -0
  23. package/skills/ct-provenance-keeper/manifest-entry.json +30 -0
  24. package/skills/ct-provenance-keeper/references/signing.md +188 -0
  25. package/skills/ct-provenance-keeper/references/slsa.md +121 -0
  26. package/skills/ct-release-orchestrator/SKILL.md +134 -0
  27. package/skills/ct-release-orchestrator/manifest-entry.json +30 -0
  28. package/skills/ct-release-orchestrator/references/composition.md +138 -0
  29. package/skills/ct-release-orchestrator/references/release-types.md +130 -0
  30. package/skills/ct-skill-creator/manifest-entry.json +1 -1
  31. package/skills/ct-skill-creator/references/provider-deployment.md +9 -9
  32. package/skills/ct-skill-validator/evals/evals.json +1 -1
  33. package/skills/ct-skill-validator/manifest-entry.json +1 -1
  34. package/skills/manifest.json +252 -16
  35. package/skills/ct-skill-creator/.cleo/.context-state.json +0 -13
@@ -0,0 +1,181 @@
1
+ ---
2
+ name: ct-ivt-looper
3
+ description: "Runs a project-agnostic autonomous Implement-then-Validate-then-Test compliance loop on any git worktree. Detects the project's test framework (vitest, jest, mocha, pytest, unittest, go-test, cargo-test, rspec, phpunit, bats, or other) and iterates until the implementation satisfies its specification, recording convergence metrics to the manifest. Use when given an implementation task that must ship verified: the IVT loop is the autonomous compliance layer enforced before any release or PR. Triggers on phrases like 'implement and verify', 'run the IVT loop', 'ship this task', 'complete implementation with tests', 'verify against spec', or any implementation task with acceptance criteria. Works in any git worktree regardless of language or framework, never hardcoded to one project's tooling."
4
+ ---
5
+
6
+ # IVT Looper
7
+
8
+ ## Overview
9
+
10
+ Runs an autonomous Implement-then-Validate-then-Test loop against any git worktree, detects the test framework in use, and iterates until the implementation converges on its specification. This skill is the autonomous compliance layer: no task ships, no release runs, and no PR opens until the loop has recorded a converged result to the manifest.
11
+
12
+ ## Core Principle
13
+
14
+ > The loop converges on the spec, not on 'tests pass'. Passing tests that don't cover the spec is a failure.
15
+
16
+ ## Immutable Constraints
17
+
18
+ | ID | Rule | Enforcement |
19
+ |----|------|-------------|
20
+ | IVT-001 | Test framework MUST be detected or declared before the loop starts. | `validateTestingProtocol` rejects entries without a `framework`; deducts 20 from score. |
21
+ | IVT-002 | Loop MUST cap iterations at `MAX_ITERATIONS` (default 5). | Hard counter in the loop; unreachable convergence forces HITL escalation. |
22
+ | IVT-003 | Every MUST requirement in the spec MUST map to at least one passing test. | Spec-to-test traceability matrix; gaps block convergence. |
23
+ | IVT-004 | Final manifest entry MUST record `framework`, `testsRun`, `testsPassed`, `testsFailed`, `ivtLoopConverged`, `ivtLoopIterations`. | `validateTestingProtocol` requires these fields; `ivtLoopConverged: false` fails validation. |
24
+ | IVT-005 | Framework detection MUST NOT be hardcoded to one project's tooling. | Detection walks the project tree; no vitest/jest-only code paths. |
25
+ | IVT-006 | Loop MUST NOT run on the main branch. | Check `git branch --show-current`; stop if on `main`/`master`/`trunk`. |
26
+ | IVT-007 | Non-convergence after `MAX_ITERATIONS` MUST escalate to HITL (exit code 65). | Agent stops, writes manifest with `ivtLoopConverged: false`, exits. |
27
+ | IVT-008 | Final manifest entry MUST set `agent_type: "testing"`. | Validator rejects any other value. |
28
+
29
+ ## The IVT Loop
30
+
31
+ ```
32
+ spec = load_spec(task_id)               # read acceptance criteria from canon
33
+ framework = detect_framework(worktree)  # see references/frameworks.md
34
+ branch_check()                          # abort if on main/master/trunk
35
+ iteration = 0
36
+
37
+ while iteration < MAX_ITERATIONS:       # IVT-002
38
+     iteration += 1
39
+
40
+     # ----- I : Implement -----
41
+     apply_patch(current_diff)           # patch generated by the implementer
42
+     ensure_provenance_tags(new_code)    # IMPL-003 tags on new functions
43
+
44
+     # ----- V : Validate -----
45
+     lint_result = run_project_linter()
46
+     type_result = run_type_checker()    # tsc / mypy / go vet / rustc
47
+     if lint_result.failed or type_result.failed:
48
+         current_diff = regenerate_fix_for(lint_result, type_result)
49
+         continue                        # back to top; the retry counts against MAX_ITERATIONS
50
+
51
+     # ----- T : Test -----
52
+     test_result = run_framework_tests() # framework-specific command
53
+     trace = spec_to_test_trace(spec)    # IVT-003: every MUST has a test
54
+     if test_result.all_passed and trace.complete:
55
+         write_manifest(
56
+             framework=framework,
57
+             iterations=iteration,
58
+             converged=True,
59
+             agent_type="testing",
60
+         )
61
+         return CONVERGED                # exit loop, exit skill with code 0
62
+
63
+     # ----- Analyze and regenerate fix -----
64
+     failure = diagnose(test_result, trace)
65
+     current_diff = regenerate_fix_for(failure)
66
+
67
+ # MAX_ITERATIONS reached without convergence
68
+ write_manifest(
69
+     framework=framework,
70
+     iterations=MAX_ITERATIONS,
71
+     converged=False,
72
+     agent_type="testing",
73
+ )
74
+ escalate_to_hitl()                      # IVT-007: exit code 65
75
+ ```
76
+
77
+ The loop is a *single* stage from the lifecycle's point of view. Implement, Validate, and Test are not three separate tasks — they are three phases of one autonomous run that either converges or escalates.
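The control flow above can be sketched as runnable Python. This is a minimal sketch, not the skill's implementation: `run_ivt_loop`, `LoopOutcome`, and the three injected callables are hypothetical names, and it assumes that a failed Validate phase consumes an iteration, which is what the hard cap in IVT-002 implies.

```python
from dataclasses import dataclass
from typing import Callable

HANDOFF_REQUIRED = 65  # exit code named in the skill text


@dataclass
class LoopOutcome:
    converged: bool
    iterations: int
    exit_code: int


def run_ivt_loop(
    implement: Callable[[int], None],    # hypothetical: applies the current patch
    validate: Callable[[], bool],        # hypothetical: True when lint and types are clean
    test_and_trace: Callable[[], bool],  # hypothetical: True when tests pass and the trace is complete
    max_iterations: int = 5,
) -> LoopOutcome:
    """One autonomous run: it either converges or escalates."""
    for iteration in range(1, max_iterations + 1):
        implement(iteration)
        if not validate():
            continue  # assumption: a failed Validate phase still uses up an iteration
        if test_and_trace():
            return LoopOutcome(converged=True, iterations=iteration, exit_code=0)
    return LoopOutcome(converged=False, iterations=max_iterations, exit_code=HANDOFF_REQUIRED)
```

Injecting the phases as callables keeps the cap and the exit codes testable without touching a real worktree.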
78
+
79
+ ## Framework Detection
80
+
81
+ Framework detection is project-agnostic: the skill walks the worktree, inspects config files, and selects the correct test command. No language or framework is special-cased above another. The full detection table lives in [references/frameworks.md](references/frameworks.md). In summary: detection reads the project manifest (`package.json`, `pyproject.toml`, `Cargo.toml`, `go.mod`, `Gemfile`, `composer.json`, or `.cleo/project-context.json#testing.command`) and selects one of: `vitest`, `jest`, `mocha`, `pytest`, `unittest`, `go-test`, `cargo-test`, `rspec`, `phpunit`, `bats`, `other`.
82
+
83
+ ## Convergence Criteria
84
+
85
+ The loop has converged when **all** of the following hold:
86
+
87
+ - Every MUST requirement in the task's linked specification has at least one passing test (spec-to-test traceability complete).
88
+ - `testsFailed == 0`, `testsRun > 0`, and `testsPassed == testsRun`.
89
+ - Project linter reports zero errors (warnings are allowed).
90
+ - Type checker reports zero errors.
91
+ - CI-equivalent local commands return zero (e.g., `pnpm run build`, `cargo build`, `go build ./...`).
92
+ - No new runtime errors were observed during the test run.
93
+
94
+ A loop that makes tests pass by deleting assertions, narrowing scope, or mocking the thing under test is **not** converged — that is a spec violation dressed up as a green run. See [references/loop-anatomy.md](references/loop-anatomy.md) for worked examples.
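The criteria above can be read as one conjunction. The following is a hedged sketch: `RunResult` and its field names are hypothetical, chosen only to mirror the bullets, and how each value is collected is left out.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    # Hypothetical container; one field per convergence criterion.
    tests_run: int
    tests_passed: int
    tests_failed: int
    lint_errors: int            # warnings are allowed, so only errors are counted
    type_errors: int
    build_exit_code: int        # CI-equivalent local command
    runtime_errors: int
    uncovered_musts: list[str]  # MUST clauses with no passing test


def is_converged(r: RunResult) -> bool:
    """All criteria must hold; a green test run alone is not enough."""
    return (
        not r.uncovered_musts
        and r.tests_run > 0
        and r.tests_failed == 0
        and r.tests_passed == r.tests_run
        and r.lint_errors == 0
        and r.type_errors == 0
        and r.build_exit_code == 0
        and r.runtime_errors == 0
    )
```

A run with every test passing but one MUST clause uncovered returns `False`, which is the case the paragraph above warns about.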
95
+
96
+ ## Branch Discipline
97
+
98
+ Before iteration 1, the skill MUST verify the current branch:
99
+
100
+ ```bash
101
+ branch=$(git branch --show-current)
102
+ if [[ "$branch" == "main" || "$branch" == "master" || "$branch" == "trunk" ]]; then
103
+     echo "Refusing to run IVT loop on protected branch: $branch"
104
+     exit 65 # HANDOFF_REQUIRED: ask HITL for a feature branch
105
+ fi
106
+ ```
107
+
108
+ The loop is destructive in the sense that it rewrites code on failed iterations. It MUST run on a feature branch or worktree. If the user invoked it on a protected branch, stop and request a feature branch.
109
+
110
+ ## Escalation on Non-Convergence
111
+
112
+ When the loop exhausts `MAX_ITERATIONS` without converging, the skill MUST:
113
+
114
+ 1. Write the manifest entry with `ivtLoopConverged: false` and the iteration count.
115
+ 2. Record the last diagnostic output in `key_findings`.
116
+ 3. Exit with code 65 (`HANDOFF_REQUIRED`).
117
+ 4. Leave the worktree in its last state — do **not** revert.
118
+
119
+ The human reviewer picks up the worktree, reads the diagnostics, and either raises the iteration cap, rewrites the spec, or manually corrects the implementation before rerunning the skill. The full escalation handoff protocol is in [references/escalation.md](references/escalation.md).
120
+
121
+ ## Integration
122
+
123
+ Record the loop outcome through `cleo check protocol`:
124
+
125
+ ```bash
126
+ # Success case: loop converged on iteration 3.
127
+ cleo check protocol \
128
+ --protocolType testing \
129
+ --framework vitest \
130
+ --testsRun 142 \
131
+ --testsPassed 142 \
132
+ --testsFailed 0 \
133
+ --ivtLoopConverged true \
134
+ --ivtLoopIterations 3
135
+
136
+ # Failure case: loop exhausted iterations, HITL escalation.
137
+ cleo check protocol \
138
+ --protocolType testing \
139
+ --framework pytest \
140
+ --testsRun 87 \
141
+ --testsPassed 84 \
142
+ --testsFailed 3 \
143
+ --ivtLoopConverged false \
144
+ --ivtLoopIterations 5
145
+ ```
146
+
147
+ Exit code 0 = loop converged and protocol is valid. Exit code 65 = `HANDOFF_REQUIRED` (non-convergence).
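For illustration, the argument list for the testing check could be assembled from a manifest entry as sketched below. The flag names are the ones shown in the examples above; `build_testing_check` and `TESTING_FIELDS` are hypothetical helper names, and the sketch only builds the argument vector without invoking `cleo`.

```python
TESTING_FIELDS = (
    "framework",
    "testsRun",
    "testsPassed",
    "testsFailed",
    "ivtLoopConverged",
    "ivtLoopIterations",
)


def build_testing_check(entry: dict) -> list[str]:
    """Turn a manifest entry into the argv for the testing protocol check."""
    missing = [name for name in TESTING_FIELDS if name not in entry]
    if missing:
        raise ValueError(f"manifest entry is missing: {', '.join(missing)}")
    argv = ["cleo", "check", "protocol", "--protocolType", "testing"]
    for name in TESTING_FIELDS:
        value = entry[name]
        # The CLI examples spell booleans as lowercase true/false.
        text = str(value).lower() if isinstance(value, bool) else str(value)
        argv += [f"--{name}", text]
    return argv
```

Refusing to build the command when a field is missing surfaces an IVT-004 gap before the validator does.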
148
+
149
+ This skill MUST also chain a validation check against the spec before its own testing check:
150
+
151
+ ```bash
152
+ cleo check protocol \
153
+ --protocolType validation \
154
+ --specMatchConfirmed true \
155
+ --testSuitePassed true \
156
+ --protocolComplianceChecked true
157
+ ```
158
+
159
+ ## Anti-Patterns
160
+
161
+ | Pattern | Problem | Solution |
162
+ |---------|---------|----------|
163
+ | Hardcoding vitest or jest | Violates IVT-005; breaks on any non-JS project | Walk the worktree and pick the framework per project |
164
+ | Running the loop on main/master | Violates IVT-006; pollutes shared branch | Check `git branch --show-current` and refuse protected branches |
165
+ | No iteration cap | Infinite loop on unreachable convergence; context burn | Enforce `MAX_ITERATIONS` (default 5); escalate to HITL on exhaustion |
166
+ | Deleting assertions to force green | Tests pass but the spec is not met | The skill requires spec-to-test traceability; gaps block convergence |
167
+ | Skipping the validation phase | Lint or type errors slip through | The loop MUST run lint and type check before the test phase every iteration |
168
+ | Treating testing as "just run the tests" | Misses the loop; one-shot runs are not compliant | Testing is a loop, not a single call; record `ivtLoopIterations` |
169
+ | Exiting without writing the manifest | Downstream skills cannot see the outcome | Always write the manifest entry — converged or not — before exiting |
170
+ | Reverting the worktree on escalation | Destroys diagnostic evidence | Leave the failed state; the human reviewer needs it |
171
+
172
+ ## Critical Rules Summary
173
+
174
+ 1. Detect the test framework from the worktree before iterating; never hardcode.
175
+ 2. Cap iterations at `MAX_ITERATIONS` (default 5); escalate on exhaustion.
176
+ 3. Every MUST in the spec MUST map to a passing test before declaring convergence.
177
+ 4. Each iteration runs all three phases: Implement, Validate, Test.
178
+ 5. Never run on `main`, `master`, or `trunk` — stop and request a feature branch.
179
+ 6. Record `framework`, `testsRun`, `testsPassed`, `testsFailed`, `ivtLoopConverged`, `ivtLoopIterations` in the manifest.
180
+ 7. On non-convergence, exit 65 and leave the worktree in its last state; do not revert it.
181
+ 8. Validate every run via `cleo check protocol --protocolType testing`.
@@ -0,0 +1,30 @@
1
+ {
2
+ "_comment": "CLEO-only metadata -- add to packages/skills/skills/manifest.json",
3
+ "name": "ct-ivt-looper",
4
+ "version": "1.0.0",
5
+ "tier": 2,
6
+ "token_budget": 10000,
7
+ "protocol": "testing",
8
+ "capabilities": {
9
+ "inputs": ["task-id", "spec-reference", "worktree-path"],
10
+ "outputs": ["manifest-entry", "ivt-convergence-report"],
11
+ "dispatch_triggers": [
12
+ "run the IVT loop",
13
+ "implement and verify",
14
+ "ship this task",
15
+ "verify against spec",
16
+ "complete implementation with tests"
17
+ ],
18
+ "compatible_subagent_types": ["general-purpose"],
19
+ "chains_to": ["ct-validator", "ct-release-orchestrator"],
20
+ "dispatch_keywords": {
21
+ "primary": ["ivt", "implement", "validate", "test"],
22
+ "secondary": ["converge", "framework", "spec", "iterate"]
23
+ }
24
+ },
25
+ "constraints": {
26
+ "max_context_tokens": 80000,
27
+ "requires_session": false,
28
+ "requires_epic": false
29
+ }
30
+ }
@@ -0,0 +1,91 @@
1
+ # IVT Non-Convergence Escalation
2
+
3
+ When the loop exhausts `MAX_ITERATIONS` without converging, the skill hands control back to a human reviewer. This file documents the hand-off contract.
4
+
5
+ ## When to Escalate
6
+
7
+ Escalation fires on exactly one condition:
8
+
9
+ ```
10
+ iteration == MAX_ITERATIONS && converged == false
11
+ ```
12
+
13
+ Other failure modes — missing framework, protected branch, credential error — also exit with code 65, but they are **pre-loop** escalations. This document covers post-loop escalation only.
14
+
15
+ ## Manifest Entry on Escalation
16
+
17
+ The escalation manifest entry MUST include:
18
+
19
+ | Field | Value | Why |
20
+ |-------|-------|-----|
21
+ | `agent_type` | `"testing"` | IVT-008; every run, converged or not |
22
+ | `framework` | detected framework | IVT-001 |
23
+ | `testsRun` | final count | Evidence of the last attempt |
24
+ | `testsPassed` | final count | — |
25
+ | `testsFailed` | final count | — |
26
+ | `ivtLoopConverged` | `false` | IVT-007 |
27
+ | `ivtLoopIterations` | `MAX_ITERATIONS` | Shows exhaustion |
28
+ | `key_findings` | diagnostic lines | Reviewer reads these first |
29
+
30
+ Example:
31
+
32
+ ```json
33
+ {
34
+ "agent_type": "testing",
35
+ "framework": "pytest",
36
+ "testsRun": 87,
37
+ "testsPassed": 84,
38
+ "testsFailed": 3,
39
+ "ivtLoopConverged": false,
40
+ "ivtLoopIterations": 5,
41
+ "key_findings": [
42
+ "tests/test_auth.py::test_expired_token failed on all 5 iterations",
43
+ "fix attempts alternated between re-raising and swallowing TokenExpiredError",
44
+ "spec clause SEC-003 not satisfied by any current test",
45
+ "worktree left at commit 3a2f1e9 on branch feature/auth-refresh"
46
+ ]
47
+ }
48
+ ```
49
+
50
+ `key_findings` is the most important field. The reviewer does not re-run the loop's diagnostics; they read the findings and decide which lever to pull.
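A small builder makes the fixed and variable parts of the escalation entry explicit. This is a sketch under assumptions: `escalation_entry` is a hypothetical helper, and rejecting an empty `key_findings` list is this sketch's own reading of the table above, not a documented validator rule.

```python
import json


def escalation_entry(
    framework: str,
    tests_run: int,
    tests_passed: int,
    tests_failed: int,
    max_iterations: int,
    key_findings: list[str],
) -> dict:
    """Build the manifest entry written when the loop exhausts its cap."""
    if not key_findings:
        # Assumption: the reviewer needs at least one diagnostic line.
        raise ValueError("key_findings must carry at least one diagnostic line")
    return {
        "agent_type": "testing",              # IVT-008
        "framework": framework,               # IVT-001
        "testsRun": tests_run,
        "testsPassed": tests_passed,
        "testsFailed": tests_failed,
        "ivtLoopConverged": False,            # IVT-007
        "ivtLoopIterations": max_iterations,  # shows exhaustion
        "key_findings": list(key_findings),
    }


if __name__ == "__main__":
    entry = escalation_entry(
        "pytest", 87, 84, 3, 5,
        ["tests/test_auth.py::test_expired_token failed on all 5 iterations"],
    )
    print(json.dumps(entry, indent=2))
```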
51
+
52
+ ## Worktree State
53
+
54
+ The skill MUST NOT revert the worktree on escalation. The reviewer needs:
55
+
56
+ - The last patch that was attempted.
57
+ - The last test output.
58
+ - Any temporary files the loop created.
59
+ - The current branch (as left by the loop).
60
+
61
+ Reverting destroys all of this evidence and forces the reviewer to re-run the loop from scratch. Do not revert.
62
+
63
+ ## Reviewer Levers
64
+
65
+ When a human reviewer picks up an escalated loop, they have four options:
66
+
67
+ | Lever | When to use | Consequence |
68
+ |-------|-------------|-------------|
69
+ | **Raise the cap** | The loop was making progress but needed more iterations | Set `MAX_ITERATIONS` higher for this task; rerun the skill |
70
+ | **Rewrite the spec** | A spec clause is impossible or contradictory | Update the spec, then rerun the loop; the trace will now match |
71
+ | **Manual correction** | The spec is fine but the loop's fix generator is stuck | Apply a human patch, then rerun; the loop resumes from the corrected state |
72
+ | **Abandon the task** | The work is no longer needed | Mark the task `cancelled`; no rerun |
73
+
74
+ The skill does not pick a lever itself. Picking a lever is a human decision.
75
+
76
+ ## Rerun Semantics
77
+
78
+ When the skill is re-invoked after an escalation, it MUST:
79
+
80
+ 1. Read the previous manifest entry for the task.
81
+ 2. Start a new iteration counter (the cap applies per invocation, not globally).
82
+ 3. Keep the worktree state; do not reset to a prior commit.
83
+ 4. Write a fresh manifest entry on completion — do not edit the old one.
84
+
85
+ The previous manifest entry stays in the canon as the record of the failed attempt. The new entry sits alongside it. This preserves the history for post-mortems.
86
+
87
+ ## Escalation Is Not Failure
88
+
89
+ A loop that converges in five iterations and a loop that escalates to HITL are both valid outcomes. Escalation is the mechanism by which the skill stays honest about its limits. Agents that hide non-convergence by spoofing convergence metrics are worse than agents that escalate cleanly.
90
+
91
+ The only actual failure mode is: skipping the escalation, marking the task done, and letting uncovered spec clauses leak into a release.
@@ -0,0 +1,119 @@
1
+ # Framework Detection
2
+
3
+ The IVT loop is project-agnostic. The test framework MUST be detected from the worktree at loop start. This file is the authoritative detection table.
4
+
5
+ ## Detection Priority
6
+
7
+ The detection walks these signals in order and stops at the first match:
8
+
9
+ 1. `.cleo/project-context.json#testing.framework` — explicit CLEO project hint.
10
+ 2. Language-specific manifest files (`package.json`, `pyproject.toml`, etc.).
11
+ 3. CI workflow files (`.github/workflows/*.yml`) — last-ditch signal.
12
+ 4. Fallback: if nothing above matches but `.cleo/project-context.json#testing.command` is populated, mark the framework as `other`.
13
+
14
+ A project-context hint always wins. If the hint contradicts the manifest (e.g., `testing.framework: vitest` but `package.json` shows jest), the skill respects the hint and treats the discrepancy as a non-blocking warning.
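The priority order can be sketched as follows. This is a deliberately reduced illustration, not the skill's detector: `detect_framework` and `DetectionError` are hypothetical names, only three JavaScript runners plus `Cargo.toml` and `go.mod` are checked, CI workflow inspection is omitted, and it assumes `#testing.framework` refers to the `testing.framework` key of the JSON file. Unlike the full table, it treats any two manifest signals as a conflict rather than checking config files.

```python
import json
from pathlib import Path


class DetectionError(Exception):
    """Would map to exit code 65 (HANDOFF_REQUIRED)."""


def _read_json(path: Path) -> dict:
    return json.loads(path.read_text()) if path.is_file() else {}


def detect_framework(root: Path) -> str:
    testing = _read_json(root / ".cleo" / "project-context.json").get("testing", {})

    # 1. An explicit project-context hint always wins.
    if testing.get("framework"):
        return testing["framework"]

    # 2. Language manifests (reduced subset for this sketch).
    dev_deps = _read_json(root / "package.json").get("devDependencies", {})
    candidates = [name for name in ("vitest", "jest", "mocha") if name in dev_deps]
    if (root / "Cargo.toml").is_file():
        candidates.append("cargo-test")
    if (root / "go.mod").is_file() and any(root.rglob("*_test.go")):
        candidates.append("go-test")
    if len(candidates) > 1:
        raise DetectionError(f"conflicting signals: {candidates}")
    if candidates:
        return candidates[0]

    # 3. CI workflow inspection is omitted here.

    # 4. Fallback: a declared command means the framework is `other`.
    if testing.get("command"):
        return "other"
    raise DetectionError("no framework detected and no testing.command hint")
```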
15
+
16
+ ## Per-Framework Detection
17
+
18
+ ### vitest
19
+
20
+ **Signal**: `package.json` has `vitest` in `devDependencies` or `vitest.config.{js,ts,mjs}` exists.
21
+
22
+ **Test command**: `pnpm vitest run` (monorepo) or `npx vitest run` (single package). Use `--reporter=json` for structured parsing.
23
+
24
+ **Edge case**: Projects with both vitest and jest installed (usually mid-migration) MUST be disambiguated by checking which config file exists. If both exist, read `.cleo/project-context.json#testing.framework` or fail with IVT-001.
25
+
26
+ ### jest
27
+
28
+ **Signal**: `package.json` has `jest` in `devDependencies` or a `jest.config.{js,ts,cjs,mjs}` file, or a `jest` key in `package.json`.
29
+
30
+ **Test command**: `npx jest --json` for structured output.
31
+
32
+ **Edge case**: React Native projects sometimes use a jest preset without a top-level config. Detect via the `preset` key inside `package.json#jest`.
33
+
34
+ ### mocha
35
+
36
+ **Signal**: `package.json` has `mocha` in `devDependencies` or `.mocharc.{js,json,cjs,yml}` exists.
37
+
38
+ **Test command**: `npx mocha --reporter json`.
39
+
40
+ **Edge case**: Mocha is often paired with chai/sinon; the test command stays the same. If `nyc` is present, prefer `npx nyc mocha --reporter json` to capture coverage in the same run.
41
+
42
+ ### pytest
43
+
44
+ **Signal**: `pyproject.toml` has `[tool.pytest.ini_options]`, or a `pytest.ini` / `setup.cfg` with a `[tool:pytest]` section, or `pytest` appears in `requirements*.txt` or `pyproject.toml`'s test dependencies.
45
+
46
+ **Test command**: `pytest --json-report --json-report-file=/tmp/pytest.json` (requires `pytest-json-report`) or fall back to `pytest -q` and parse the summary line.
47
+
48
+ **Edge case**: Projects that use `tox` wrap pytest. Prefer `tox -e py` when a `tox.ini` is present and delegate parsing to the wrapped pytest run.
49
+
50
+ ### unittest
51
+
52
+ **Signal**: No test runner in `pyproject.toml`/`requirements.txt`, but `tests/` contains files matching `test_*.py` with `unittest.TestCase` imports.
53
+
54
+ **Test command**: `python -m unittest discover -s tests -p 'test_*.py'`.
55
+
56
+ **Edge case**: unittest has no JSON reporter. The skill parses the stderr summary (`Ran N tests in T — OK` or `FAILED (errors=N)`).
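Parsing that summary could look like the sketch below. `parse_unittest_summary` is a hypothetical helper; it assumes the standard text runner's output, in which a `Ran N tests in ...` line is followed by `OK` or `FAILED (...)`, and it counts both failures and errors as failed tests.

```python
import re

_RAN = re.compile(r"^Ran (\d+) tests? in ", re.MULTILINE)
_STATUS = re.compile(r"^(OK|FAILED)(?: \((.*)\))?$", re.MULTILINE)
# The lookbehind keeps "expected failures=N" from being read as failures.
_COUNT = re.compile(r"(?<!expected )(failures|errors|skipped)=(\d+)")


def parse_unittest_summary(stderr: str) -> dict:
    """Extract manifest counts from the unittest text runner's summary."""
    ran = _RAN.search(stderr)
    if ran is None:
        raise ValueError("no 'Ran N tests' line found")
    # Only look for the status line after the 'Ran' line.
    status = _STATUS.search(stderr, ran.end())
    if status is None:
        raise ValueError("no OK/FAILED status line found")
    counts = {key: int(value) for key, value in _COUNT.findall(status.group(2) or "")}
    tests_run = int(ran.group(1))
    tests_failed = counts.get("failures", 0) + counts.get("errors", 0)
    skipped = counts.get("skipped", 0)
    return {
        "testsRun": tests_run,
        "testsPassed": tests_run - tests_failed - skipped,
        "testsFailed": tests_failed,
        "ok": status.group(1) == "OK",
    }
```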
57
+
58
+ ### go-test
59
+
60
+ **Signal**: `go.mod` file at the repo root and at least one `*_test.go` file.
61
+
62
+ **Test command**: `go test -json ./...` for structured output.
63
+
64
+ **Edge case**: Projects that use Bazel (`BUILD.bazel`) wrap go test. Prefer `bazel test //...` when Bazel is present; fall back to `go test -json ./...` otherwise.
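As a rough illustration of consuming the `go test -json` stream: each line is a JSON event with an `Action` field, and per-test events also carry a `Test` field. `summarize_go_test` is a hypothetical helper. It counts subtests as separate tests, which is an assumption of this sketch rather than a stated rule of the skill.

```python
import json


def summarize_go_test(stream: str) -> dict:
    """Count terminal per-test events in `go test -json` output."""
    counts = {"pass": 0, "fail": 0, "skip": 0}
    for line in stream.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # non-JSON lines (for example build errors) are ignored
        event = json.loads(line)
        # Package-level events have no "Test" field and are not counted.
        if "Test" in event and event.get("Action") in counts:
            counts[event["Action"]] += 1
    return {
        "testsRun": counts["pass"] + counts["fail"],
        "testsPassed": counts["pass"],
        "testsFailed": counts["fail"],
        "skipped": counts["skip"],
    }
```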
65
+
66
+ ### cargo-test
67
+
68
+ **Signal**: `Cargo.toml` at the repo root.
69
+
70
+ **Test command**: `cargo test -- --format json -Z unstable-options` (nightly toolchain) or `cargo test --message-format json` and parse the compile + test records.
71
+
72
+ **Edge case**: Monorepo workspaces use `Cargo.toml` at the root with `[workspace]`. The test command runs all workspace members via `cargo test --workspace`. For single-crate runs, the skill MUST target the specific crate with `--package <name>`.
73
+
74
+ ### rspec
75
+
76
+ **Signal**: `Gemfile` includes `rspec` or `rspec-rails`, or a `.rspec` file exists.
77
+
78
+ **Test command**: `bundle exec rspec --format json`.
79
+
80
+ **Edge case**: Rails projects may split specs across `spec/` and `spec/system/`. The default command covers both; no extra path is needed.
81
+
82
+ ### phpunit
83
+
84
+ **Signal**: `composer.json` lists `phpunit/phpunit` in `require-dev`, or a `phpunit.xml` / `phpunit.xml.dist` file exists.
85
+
86
+ **Test command**: `./vendor/bin/phpunit --log-junit /tmp/phpunit.xml` (the skill parses the JUnit XML).
87
+
88
+ **Edge case**: Laravel projects use `php artisan test`, which wraps phpunit. Prefer the artisan command when `artisan` exists at the repo root.
89
+
90
+ ### bats
91
+
92
+ **Signal**: `tests/` contains `*.bats` files and `bats-core` is available (`command -v bats`).
93
+
94
+ **Test command**: `bats --tap tests/` (TAP output, parseable line-by-line).
95
+
96
+ **Edge case**: BATS projects often use `tests/test_helper/bats-support/` and `bats-assert/`. No special handling is needed — the TAP output is framework-neutral.
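Line-by-line TAP parsing can be sketched as below. `summarize_tap` is a hypothetical helper. It assumes result lines start with `ok` or `not ok` and that skipped tests carry a `# skip` directive, and it ignores the plan line and `#` diagnostics.

```python
import re

_TAP_RESULT = re.compile(r"^(not ok|ok)\b(.*)$")


def summarize_tap(output: str) -> dict:
    """Count passed, failed, and skipped tests in TAP output."""
    passed = failed = skipped = 0
    for line in output.splitlines():
        match = _TAP_RESULT.match(line.strip())
        if match is None:
            continue  # plan line ("1..N"), diagnostics ("# ..."), or blank
        status, rest = match.groups()
        if "# skip" in rest.lower():
            skipped += 1
        elif status == "ok":
            passed += 1
        else:
            failed += 1
    return {
        "testsRun": passed + failed,
        "testsPassed": passed,
        "testsFailed": failed,
        "skipped": skipped,
    }
```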
97
+
98
+ ### other
99
+
100
+ **Signal**: None of the above match, but `.cleo/project-context.json#testing.command` is populated.
101
+
102
+ **Test command**: Run the command from `testing.command` verbatim. Parse stdout + stderr and the exit code. The skill MUST NOT guess at structure — success is exit 0 and nothing else.
103
+
104
+ **Edge case**: If the command is a shell script that wraps multiple frameworks, the skill treats the wrapper as the framework and records `framework: other` in the manifest.
105
+
106
+ ## Detection Failures
107
+
108
+ | Failure | Exit Code | Remediation |
109
+ |---------|-----------|-------------|
110
+ | No framework detected and no `testing.command` hint | 65 (HANDOFF_REQUIRED) | Add `.cleo/project-context.json#testing.framework` or `testing.command` |
111
+ | Two conflicting signals (e.g., jest and vitest both present) | 65 | Use the project-context hint to disambiguate |
112
+ | Framework detected but command not on PATH (`bats: command not found`) | 65 | Install the missing tool or update CI image |
113
+ | Framework detected but tests directory is empty | 65 | Add at least one test before running the loop |
114
+
115
+ Detection is diagnostic, not speculative. If the skill cannot pick exactly one framework, it exits 65 and lets the human reviewer decide.
116
+
117
+ ## Monorepos
118
+
119
+ Monorepo detection follows the same table but is scoped per package. The skill reads the task's declared package (from the task metadata) and walks detection from that package's root, not the monorepo root. This prevents a `pyproject.toml` in one package from confusing the detector for a JS package next to it.
@@ -0,0 +1,156 @@
1
+ # Anatomy of One IVT Iteration
2
+
3
+ This file walks through a single iteration of the IVT loop, from the failure diagnosis to the regenerated fix. Read it before modifying the loop logic.
4
+
5
+ ## Scenario
6
+
7
+ Task T5142 requires a function `normalizeEmail(input: string): string` with the following spec:
8
+
9
+ | MUST | Requirement |
10
+ |------|-------------|
11
+ | MUST-1 | Trim leading and trailing whitespace. |
12
+ | MUST-2 | Lowercase the local part and the domain. |
13
+ | MUST-3 | Reject empty strings by throwing `ValidationError`. |
14
+ | MUST-4 | Reject strings with more than one `@` by throwing `ValidationError`. |
15
+
16
+ The implementer has produced an initial patch. The loop enters iteration 1.
17
+
18
+ ## Iteration 1: Implement
19
+
20
+ The patch adds the function skeleton:
21
+
22
+ ```ts
23
+ export function normalizeEmail(input: string): string {
24
+ return input.trim().toLowerCase();
25
+ }
26
+ ```
27
+
28
+ Provenance tags are present (`@task T5142`). The Implement phase ends.
29
+
30
+ ## Iteration 1: Validate
31
+
32
+ The lint phase passes. The type check passes. No errors yet. The Validate phase ends.
33
+
34
+ ## Iteration 1: Test
35
+
36
+ The framework (vitest) is invoked:
37
+
38
+ ```
39
+ $ pnpm vitest run --reporter=json
40
+ FAIL tests/normalize-email.test.ts
41
+ normalizeEmail
42
+ ✓ trims whitespace
43
+ ✓ lowercases local and domain
44
+ ✗ throws on empty string
45
+ ✗ throws on double @
46
+ ```
47
+
48
+ Raw result: `testsRun=4, testsPassed=2, testsFailed=2`.
49
+
50
+ ### Spec-to-test trace
51
+
52
+ | Spec MUST | Test | Passing? |
53
+ |-----------|------|----------|
54
+ | MUST-1 | `trims whitespace` | yes |
55
+ | MUST-2 | `lowercases local and domain` | yes |
56
+ | MUST-3 | `throws on empty string` | **no** |
57
+ | MUST-4 | `throws on double @` | **no** |
58
+
59
+ Two failures, both tied to MUST requirements. Convergence = false. The Test phase ends.
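The trace in the table above can be computed from two inputs: a mapping from each MUST clause to the tests that cover it, and the set of tests that passed. The sketch below uses hypothetical names (`spec_to_test_trace`, `SPEC_MAP`); how the mapping is authored is not specified here.

```python
def spec_to_test_trace(spec_map: dict[str, list[str]], passing: set[str]) -> dict[str, bool]:
    """Mark each MUST clause as covered when at least one mapped test passes."""
    # any([]) is False, so a clause with no mapped tests is uncovered.
    return {
        must: any(test in passing for test in tests)
        for must, tests in spec_map.items()
    }


# Mapping for the normalizeEmail scenario in this walkthrough.
SPEC_MAP = {
    "MUST-1": ["trims whitespace"],
    "MUST-2": ["lowercases local and domain"],
    "MUST-3": ["throws on empty string"],
    "MUST-4": ["throws on double @"],
}

iteration_1 = spec_to_test_trace(SPEC_MAP, {"trims whitespace", "lowercases local and domain"})
trace_complete = all(iteration_1.values())
```

Because coverage is keyed by clause rather than by test, removing a failing test leaves its clause uncovered instead of turning the run green.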
60
+
61
+ ## Iteration 1: Diagnose
62
+
63
+ The diagnose step reads each failure and maps it to a spec clause:
64
+
65
+ | Failure | Spec Clause | Root cause |
66
+ |---------|-------------|------------|
67
+ | `throws on empty string` | MUST-3 | Implementation does not throw on empty input |
68
+ | `throws on double @` | MUST-4 | Implementation does not validate `@` count |
69
+
70
+ A fix is generated:
71
+
72
+ ```ts
73
+ export function normalizeEmail(input: string): string {
74
+ const trimmed = input.trim();
75
+ if (trimmed.length === 0) {
76
+ throw new ValidationError('email cannot be empty');
77
+ }
78
+ if ((trimmed.match(/@/g) ?? []).length !== 1) {
79
+ throw new ValidationError('email must contain exactly one @');
80
+ }
81
+ return trimmed.toLowerCase();
82
+ }
83
+ ```
84
+
85
+ The loop increments `iteration` to 2 and re-enters Implement.
86
+
87
+ ## Iteration 2: Implement, Validate, Test
88
+
89
+ Patch applies cleanly. Lint and type check pass. Vitest runs:
90
+
91
+ ```
92
+ PASS tests/normalize-email.test.ts
93
+ normalizeEmail
94
+ ✓ trims whitespace
95
+ ✓ lowercases local and domain
96
+ ✓ throws on empty string
97
+ ✓ throws on double @
98
+
99
+ Tests 4 passed (4)
100
+ ```
101
+
102
+ Spec trace is complete. All four MUST requirements map to passing tests.
103
+
104
+ ## Convergence
105
+
106
+ The loop writes:
107
+
108
+ ```json
109
+ {
110
+ "agent_type": "testing",
111
+ "framework": "vitest",
112
+ "testsRun": 4,
113
+ "testsPassed": 4,
114
+ "testsFailed": 0,
115
+ "ivtLoopConverged": true,
116
+ "ivtLoopIterations": 2,
117
+ "key_findings": [
118
+ "4/4 tests pass",
119
+ "all 4 MUST requirements covered",
120
+ "converged in 2 iterations"
121
+ ]
122
+ }
123
+ ```
124
+
125
+ Exit code 0. The task advances to the next stage.
126
+
127
+ ## What This Iteration Teaches
128
+
129
+ 1. **The loop is spec-driven, not test-driven.** If iteration 1 had produced a fifth test that passes but does not map to any MUST, it would still not count toward convergence. Convergence is defined by spec coverage, not raw pass count.
130
+
131
+ 2. **Lint and type check run every iteration.** Even though they passed in iteration 1, they run again in iteration 2. Regressions are possible and must be caught inside the loop, not after it.
132
+
133
+ 3. **The diagnose step maps failures to spec clauses, not to tests.** This is how the loop avoids the anti-pattern of deleting assertions: you cannot satisfy a spec clause by removing the test that checks it, because the trace still shows the clause as uncovered.
134
+
135
+ 4. **The loop ends with a single atomic manifest write.** The manifest is not appended to during the loop; it is written once, after convergence or escalation is decided. This keeps the canon immutable and the loop deterministic.
136
+
137
+ ## Pathological Cases
138
+
139
+ ### Fake convergence via assertion deletion
140
+
141
+ An untrained implementer might delete the failing assertion or, as here, invert it:
142
+
143
+ ```ts
144
+ // was: expect(() => normalizeEmail('')).toThrow();
145
+ // now: expect(() => normalizeEmail('')).not.toThrow();
146
+ ```
147
+
148
+ Tests now pass. The spec trace, however, still lists MUST-3 as uncovered: the assertion no longer checks the MUST clause. Convergence fails and the loop continues until MAX_ITERATIONS, then escalates. Do not accept this pattern.
149
+
150
+ ### Coverage inflation
151
+
152
+ An implementer adds 50 trivial tests to hit a coverage number. The spec trace still has gaps. Convergence fails. Coverage is advisory; the spec trace is authoritative.
153
+
154
+ ### Fix regression
155
+
156
+ Iteration 2's fix breaks the test that iteration 1 made pass. Validate or Test catches this and the loop re-diagnoses. This is normal; the iteration cap exists precisely to bound it.
@@ -2,7 +2,7 @@
2
2
  "name": "ct-orchestrator",
3
3
  "version": "4.0.0",
4
4
  "description": "Pipeline-aware orchestration skill for managing complex workflows through subagent delegation.",
5
- "path": "packages/ct-skills/skills/ct-orchestrator",
5
+ "path": "packages/skills/skills/ct-orchestrator",
6
6
  "status": "active",
7
7
  "tier": 0,
8
8
  "core": true,