@laitszkin/apollo-toolkit 3.1.5 → 3.1.7

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (22)
  1. package/CHANGELOG.md +10 -0
  2. package/analyse-app-logs/scripts/__pycache__/filter_logs_by_time.cpython-312.pyc +0 -0
  3. package/analyse-app-logs/scripts/__pycache__/log_cli_utils.cpython-312.pyc +0 -0
  4. package/analyse-app-logs/scripts/__pycache__/search_logs.cpython-312.pyc +0 -0
  5. package/docs-to-voice/scripts/__pycache__/docs_to_voice.cpython-312.pyc +0 -0
  6. package/generate-spec/scripts/__pycache__/create-specs.cpython-312.pyc +0 -0
  7. package/iterative-code-quality/README.md +11 -5
  8. package/iterative-code-quality/SKILL.md +73 -148
  9. package/iterative-code-quality/agents/openai.yaml +1 -1
  10. package/iterative-code-quality/references/coupled-core-file-strategy.md +4 -0
  11. package/iterative-code-quality/references/iteration-gates.md +16 -4
  12. package/iterative-code-quality/references/job-selection.md +73 -0
  13. package/iterative-code-quality/references/module-coverage.md +113 -0
  14. package/iterative-code-quality/references/repository-scan.md +5 -1
  15. package/iterative-code-quality/references/testing-strategy.md +10 -0
  16. package/katex/scripts/__pycache__/render_katex.cpython-312.pyc +0 -0
  17. package/open-github-issue/scripts/__pycache__/open_github_issue.cpython-312.pyc +0 -0
  18. package/package.json +1 -1
  19. package/read-github-issue/scripts/__pycache__/find_issues.cpython-312.pyc +0 -0
  20. package/read-github-issue/scripts/__pycache__/read_issue.cpython-312.pyc +0 -0
  21. package/resolve-review-comments/scripts/__pycache__/review_threads.cpython-312.pyc +0 -0
  22. package/text-to-short-video/scripts/__pycache__/enforce_video_aspect_ratio.cpython-312.pyc +0 -0
package/CHANGELOG.md CHANGED
@@ -7,6 +7,16 @@ All notable changes to this repository are documented in this file.
7
7
  ### Changed
8
8
  - None yet.
9
9
 
10
+ ## [v3.1.7] - 2026-04-23
11
+
12
+ ### Changed
13
+ - Enhance `iterative-code-quality` with module inventory and coverage-ledger guidance so agents start from the easiest useful modules, deeply read each in-scope module before completion, and return to scanning whenever unvisited modules remain.
14
+
15
+ ## [v3.1.6] - 2026-04-23
16
+
17
+ ### Changed
18
+ - Rewrite `iterative-code-quality` around a strict three-step loop of full-codebase scan, per-round job selection/refactor, and final doc/constraint sync, while moving job-specific execution guidance into reference documents so the main skill no longer reads like a serial workflow.
19
+
10
20
  ## [v3.1.5] - 2026-04-23
11
21
 
12
22
  ### Changed
package/iterative-code-quality/README.md CHANGED
@@ -1,30 +1,36 @@
1
1
  # iterative-code-quality
2
2
 
3
- Improve an existing repository through repeated, evidence-backed full-iteration refactors while preserving intended business behavior and the system's top-level macro architecture.
3
+ Improve an existing repository through a strict three-step loop of full-codebase scan, job-based refactor, and final documentation/constraint sync while preserving intended business behavior and the system's top-level macro architecture.
4
4
 
5
5
  ## Core capabilities
6
6
 
7
- - Scans the full codebase and builds a prioritized quality backlog before editing.
8
- - Treats naming, abstraction, module boundaries, logging, and tests as selectable execution directions rather than a fixed sequence.
7
+ - Runs a repository-wide scan before every refactor round and refreshes a concrete quality backlog.
8
+ - Uses a strict three-step loop: scan the codebase, choose this round's jobs and refactor, then update docs/constraints only when no actionable gap remains.
9
+ - Keeps job execution guidance in focused reference documents instead of embedding every job as a workflow step in the main skill.
10
+ - Builds a module inventory and coverage ledger so every in-scope module receives a deep-read iteration before completion.
11
+ - Starts from the easiest useful modules first, while preserving the rule that unvisited modules cannot be skipped before completion.
9
12
  - Clarifies ambiguous variable, parameter, field, helper, and test-data names.
10
13
  - Simplifies complex functions and extracts reusable helpers only when they centralize real behavior.
11
14
  - Splits mixed-responsibility code into narrower modules without changing macro architecture.
12
15
  - Repairs stale or missing logs and adds tests for important observability contracts.
13
16
  - Adds high-value unit, property-based, integration, or E2E tests based on risk.
17
+ - Does not require pre-existing tests before every refactor; for high-risk under-guarded areas, it treats test addition as the next unlock direction.
14
18
  - Uses those tests and other guardrails to justify more aggressive refactors, instead of leaving known issues in place for subjective confidence reasons.
15
19
  - Re-scans the full repository after every iteration and picks the next highest-confidence, highest-leverage directions.
16
20
  - Uses small safe refactors to prepare the ground for larger later refactors, progressing gradually from outside to inside.
17
21
  - Treats large coupled or apparently core files as staged unlock problems, not as automatic stop signals.
22
+ - Uses explicit next-job selection conditions from references so the agent can decide more concretely whether naming, simplification, modularization, logging, testing, or unlock work should happen next.
18
23
  - Runs a stage-gate full-codebase decision after every iteration to decide whether more rounds are still required.
19
24
  - Repeats the pass cycle while any known in-scope actionable quality issue remains, and forbids a completion report until the latest scan is clear or remaining items are explicitly deferred with a valid reason.
25
+ - Forbids completion while any in-scope module remains unvisited, even if already-read modules look clean.
20
26
  - Targets as many inherited repository quality problems as can be solved safely, and expects the guarded test surface to remain green after the refactor.
21
27
  - Synchronizes project docs and `AGENTS.md` through `align-project-documents` and `maintain-project-constraints` after implementation.
22
28
 
23
29
  ## Repository structure
24
30
 
25
- - `SKILL.md`: Main iterative workflow, dependencies, guardrails, and output contract.
31
+ - `SKILL.md`: Main three-step loop, dependencies, guardrails, and output contract.
26
32
  - `agents/openai.yaml`: Agent interface metadata and default prompt.
27
- - `references/`: Focused guides for scanning, naming, simplification, module boundaries, logging, testing, and iteration gates.
33
+ - `references/`: Focused guides for scanning, module coverage, job selection, naming, simplification, module boundaries, logging, testing, unlock work, and iteration gates.
28
34
 
29
35
  ## Typical usage
30
36
 
package/iterative-code-quality/SKILL.md CHANGED
@@ -1,183 +1,108 @@
1
1
  ---
2
2
  name: iterative-code-quality
3
3
  description: >-
4
- Improve an existing codebase through repeated evidence-based code-quality
5
- passes: clarify poor variable names, simplify or extract reusable functions,
6
- split oversized code into single-responsibility modules, repair stale or
7
- missing logs, and add high-value tests while preserving business behavior and
8
- system-level macro architecture. Use when users ask for comprehensive refactoring, code
9
- cleanup, maintainability hardening, naming cleanup, log alignment, or test
10
- coverage improvement across a repository.
4
+ Improve an existing codebase through repeated evidence-based repository-wide
5
+ scans, module-by-module deep-read coverage, and behavior-safe refactors until
6
+ no known in-scope actionable quality issue or unvisited in-scope module
7
+ remains: clarify poor names, simplify or extract reusable functions, split
8
+ mixed-responsibility code, repair stale or missing logs, and add high-value
9
+ tests where guardrails are missing, while preserving intended business
10
+ behavior and the system's macro architecture. Use when users ask for
11
+ comprehensive refactoring, code cleanup, maintainability hardening, naming
12
+ cleanup, log alignment, or test coverage improvement across a repository.
11
13
  ---
12
14
 
13
15
  # Iterative Code Quality
14
16
 
15
17
  ## Dependencies
16
18
 
17
- - Required: `align-project-documents` and `maintain-project-constraints` after implementation changes are complete.
18
- - Conditional: `systematic-debug` when a new or existing test reveals a real business-logic defect that must be fixed.
19
- - Optional: `discover-edge-cases` for high-risk boundary exploration before choosing missing tests; `improve-observability` for complex telemetry design.
20
- - Fallback: If required completion dependencies are unavailable, finish code and tests, then report exactly which documentation or constraint sync step could not run.
19
+ - Required: `align-project-documents` and `maintain-project-constraints` after the repository is truly iteration-complete.
20
+ - Conditional: `systematic-debug` when a newly added or existing test exposes a real business-logic defect that must be fixed at the true owner.
21
+ - Optional: `discover-edge-cases` for high-risk boundary exploration before adding tests; `improve-observability` for non-trivial telemetry design.
22
+ - Fallback: If required completion dependencies are unavailable, finish code and validation first, then report exactly which documentation or constraint-sync action could not run.
21
23
 
22
24
  ## Standards
23
25
 
24
- - Evidence: Read repository docs, project constraints, source, tests, logs, and entrypoints before editing; every rename, extraction, split, log update, or test must be backed by code context.
25
- - Execution: Continuously re-scan the full codebase, treat naming, abstraction, module boundaries, logging, and tests as selectable execution directions rather than a fixed sequence, choose the highest-confidence directions that can safely land now, and use those smaller refactors to prepare the ground for larger future refactors; validate after each iteration, then keep iterating while any known in-scope codebase quality issue remains unresolved; when tests or other reliable guardrails can prove equivalence, prefer taking the refactor instead of deferring it for subjective confidence reasons; do not produce the completion report while the scan still contains actionable gaps.
26
- - Quality: Solve as many inherited code-quality problems as safely possible without changing intended behavior or the system's macro architecture; avoid style-only churn, compatibility theater, broad rewrites, and unverified "cleanup", but do not reject a worthwhile refactor purely because it feels risky when existing or newly added guardrails can verify it safely.
27
- - Output: Deliver a concise pass-by-pass summary, changed behavior-neutral surfaces, test coverage added, validation results, and documentation/`AGENTS.md` sync status only after every known in-scope quality issue is resolved or explicitly classified as blocked, unsafe, low-value, speculative, or requiring user approval.
26
+ - Evidence: Read repository docs, project constraints, source, tests, logs, build scripts, entrypoints, and nearby abstractions before editing; every refactor and every new test must be justified by code context.
27
+ - Execution: Run a continuous three-step loop of full-codebase scan -> choose this round's jobs and refactor -> if and only if the latest full-codebase scan is clear, update docs and constraints; otherwise return to scanning immediately. Maintain a module inventory and coverage ledger so every in-scope module receives a deep-read iteration before completion. Do not treat jobs as workflow steps. Do not produce a completion report while any known in-scope actionable issue or unvisited in-scope module remains.
28
+ - Quality: Resolve as many inherited quality problems as safely possible without changing intended behavior or the system's macro architecture. Do not require pre-existing tests before every safe refactor; if an area is high-risk and weakly guarded, add the missing guardrails as part of the work instead of treating the area as untouchable.
29
+ - Output: Return iteration-by-iteration decisions, selected jobs, module coverage status, changed files, behavior-preservation evidence, tests and guardrails added, validation results, and docs/constraint sync status only after the latest scan shows no remaining known actionable in-scope issue and no unvisited in-scope module.
28
30
 
29
- ## Goal
31
+ ## Mission
30
32
 
31
- Resolve as many inherited repository quality problems as possible without breaking intended behavior, and use tests plus other reliable guardrails to prove that the refactor leaves the project in a fully green state.
33
+ Leave the repository materially cleaner by continuously scanning the whole codebase, landing the highest-value safe refactors available at each moment, and repeating until there is no known in-scope actionable quality gap left to fix.
32
34
 
33
- This skill is intentionally implementation-oriented, not report-only. It should keep scanning the full codebase, choose the best available refactor directions at each moment, apply as much safe cleanup as the repository can support, add or strengthen tests to guard the refactor, and use incremental cleanup to unlock deeper improvements over time. If a post-iteration scan finds remaining actionable gaps, continue the next iteration instead of writing a completion report.
35
+ For this skill, `macro architecture` means the system's top-level runtime shape and overall operating logic: major subsystems, top-level execution model, deployment/runtime boundaries, persistence model, service boundaries, and the end-to-end way the whole system works. Ordinary module interactions, helper extraction, local responsibility moves, internal call-boundary cleanup, and local module splits do not count as macro-architecture changes by themselves.
34
36
 
35
- For this skill, `macro architecture` means the system's top-level runtime shape and overall operating logic: major subsystems, top-level execution model, deployment/runtime boundaries, persistence model, service boundaries, and the end-to-end way the whole system works. Ordinary module interactions, helper extraction, local responsibility moves, and internal call-boundary cleanup do not count as macro-architecture changes by themselves.
37
+ ## Three-Step Loop
36
38
 
37
- ## Required Reference Loading
39
+ ### 1) Scan the repository
38
40
 
39
- Load references only when they match the active pass:
41
+ - Read root guidance first: `AGENTS.md`, `README*`, package manifests, task runners, CI/test config, and major project docs.
42
+ - Map runtime entrypoints, domain modules, external integrations, logging utilities, and current test surfaces.
43
+ - Exclude generated, vendored, lock, build-output, fixture, or snapshot files unless evidence shows they are human-maintained source.
44
+ - Build or refresh a concrete repository-wide backlog of known actionable quality issues.
45
+ - Build or refresh a module inventory and coverage ledger; every in-scope module starts as unvisited until it has received a deep-read iteration with callers, callees, tests, logs, and relevant contracts inspected.
46
+ - Re-scan the full codebase after every landed iteration, not only the files just changed.
47
+ - Load `references/repository-scan.md` for the scan checklist and backlog shaping rules.
48
+ - Load `references/module-coverage.md` for module inventory, deep-read coverage, easy-first ordering, and completion rules.
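The module inventory and coverage ledger described in Step 1 could be modelled roughly as follows. This is a minimal sketch, not the skill's actual data format: the `Coverage` states are borrowed from the completion-report wording, while `CoverageLedger`, `EXCLUDED_PATTERNS`, and the glob patterns themselves are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from fnmatch import fnmatchcase


class Coverage(Enum):
    """Ledger states, loosely following the completion-report wording."""
    UNVISITED = "unvisited"
    IMPROVED = "improved"
    VALIDATED_CLEAR = "validated-clear"
    DEFERRED = "deferred"
    EXCLUDED = "excluded"


# Hypothetical patterns for generated, vendored, lock, and build-output files.
EXCLUDED_PATTERNS = ("*/__pycache__/*", "vendor/*", "dist/*", "*.lock", "*.snap")


@dataclass
class CoverageLedger:
    """Assumed shape: one coverage status per module path."""
    entries: dict = field(default_factory=dict)

    def add_module(self, path: str) -> None:
        # Every in-scope module starts as unvisited; excluded files never do.
        excluded = any(fnmatchcase(path, pattern) for pattern in EXCLUDED_PATTERNS)
        self.entries[path] = Coverage.EXCLUDED if excluded else Coverage.UNVISITED

    def record(self, path: str, status: Coverage) -> None:
        # Only inventoried modules can receive a deep-read result.
        if path not in self.entries:
            raise KeyError(f"module not in inventory: {path}")
        self.entries[path] = status

    def unvisited(self) -> list:
        # Completion is forbidden while this list is non-empty.
        return sorted(p for p, s in self.entries.items() if s is Coverage.UNVISITED)
```

Under these assumptions, a scan would call `add_module` for each discovered file, and a deep-read iteration would call `record` with the outcome for that module.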
40
49
 
41
- - `references/repository-scan.md`: scope mapping, generated-file exclusions, and quality backlog selection.
42
- - `references/naming-and-simplification.md`: variable renames, function simplification, reusable extraction, and behavior-preservation checks.
43
- - `references/module-boundaries.md`: single-responsibility split heuristics and safe module extraction rules.
44
- - `references/coupled-core-file-strategy.md`: staged unlock strategy for large, coupled, or apparently core files that should not become stop signals.
45
- - `references/logging-alignment.md`: stale log detection, missing log criteria, and behavior-neutral observability updates.
46
- - `references/testing-strategy.md`: risk-based unit, property, integration, and E2E coverage selection.
47
- - `references/iteration-gates.md`: multi-pass quality gates, stopping criteria, and validation cadence.
50
+ ### 2) Choose this round's jobs and refactor
48
51
 
49
- ## Workflow
52
+ - Choose jobs only after the latest full-codebase scan. Jobs are optional execution directions, not ordered workflow steps.
53
+ - Select the smallest set of jobs that can safely improve the currently selected module or module cluster under current guardrails.
54
+ - Prefer easy-first module ordering: start from low-risk, high-confidence modules when doing so builds context, tests, naming clarity, or seams that make harder modules safer later.
55
+ - Do not keep revisiting familiar modules while other in-scope modules remain unvisited unless the familiar module blocks the next unvisited module's safe deep read.
56
+ - Prefer smaller, high-confidence refactors that reduce risk and prepare the ground for deeper later cleanup.
57
+ - If a desired refactor is high-risk and weakly guarded, make guardrail-building part of this round instead of stopping.
58
+ - If a file feels too coupled, too central, or too risky for a direct rewrite, do staged unlock work rather than declaring the area blocked.
59
+ - Read all directly affected callers, tests, interfaces, and logs before editing.
60
+ - Validate from narrow to broad after each bounded round, then perform a full-codebase stage-gate decision:
61
+ - if any known in-scope actionable issue still remains or any in-scope module has not received a deep-read iteration, return to Step 1;
62
+ - only continue to Step 3 when the latest scan is clear.
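The easy-first ordering and the stage-gate decision at the end of Step 2 might be expressed along these lines. This is a sketch under stated assumptions: the `Module` type, the numeric `risk` score, and the returned step labels are all hypothetical, since the skill text describes the rules only in prose.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    path: str
    risk: int  # hypothetical score: 1 = low risk, 5 = high risk
    visited: bool = False  # True once the module has had a deep-read iteration


def next_module(modules):
    """Easy-first ordering: the lowest-risk unvisited module, or None."""
    unvisited = [m for m in modules if not m.visited]
    # Break ties on path so the choice is deterministic.
    return min(unvisited, key=lambda m: (m.risk, m.path), default=None)


def stage_gate(open_issues, modules):
    """Return to scanning while any issue or unvisited module remains."""
    if open_issues or any(not m.visited for m in modules):
        return "step-1-scan"
    return "step-3-sync-docs"
```

The gate deliberately checks both conditions, so a clean backlog alone is not enough to reach Step 3 while a module is still unvisited.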
50
63
 
51
- ### 1) Establish the repository baseline
64
+ Load references for this step only as needed:
52
65
 
53
- - Read root guidance first: `AGENTS.md`, `README*`, major docs, package manifests, task runners, CI configs, and test setup.
54
- - Map runtime entrypoints, domain modules, external integrations, logging/telemetry utilities, and existing test suites.
55
- - Identify generated, vendored, lock, build-output, fixture, or snapshot files; exclude them from refactoring unless evidence shows they are human-maintained source.
56
- - Run or inspect the most relevant existing validation commands before editing when feasible, so pre-existing failures are distinguishable from new regressions.
57
- - Build an initial quality backlog with concrete file/function/test targets before changing code.
58
- - Use `references/repository-scan.md` for the scan checklist and backlog scoring.
66
+ - `references/module-coverage.md` for choosing the next module and proving every in-scope module has been deeply read.
67
+ - `references/job-selection.md` for next-job choice conditions and tie-breakers.
68
+ - `references/naming-and-simplification.md` for naming cleanup and function simplification/extraction.
69
+ - `references/module-boundaries.md` for single-responsibility module cleanup.
70
+ - `references/logging-alignment.md` for stale or missing log repair.
71
+ - `references/testing-strategy.md` for unit, property, integration, and E2E test strategy.
72
+ - `references/coupled-core-file-strategy.md` for staged unlock work on large coupled or apparently core files.
73
+ - `references/iteration-gates.md` for validation cadence, stage-gate rules, and stop criteria.
59
74
 
60
- ### 2) Execute continuous full scans with selectable directions
75
+ ### 3) Update project documents and constraints
61
76
 
62
- Do not force one fixed order such as "finish naming first, then abstraction, then modules". Instead, keep re-scanning the whole codebase and select the execution directions that are highest-confidence and highest-leverage right now.
77
+ Only enter this step when the latest full-codebase scan confirms there is no remaining known actionable in-scope quality issue and every in-scope module has received a deep-read iteration, except items explicitly classified as blocked, unsafe, speculative, low-value, excluded, or approval-dependent.
63
78
 
64
- Treat these as multi-select execution directions, not mandatory sequential stages:
65
-
66
- 1. Naming clarity for variables, parameters, fields, local helpers, and test data.
67
- 2. Function simplification and reusable extraction for duplicated or hard-coded workflows.
68
- 3. Single-responsibility module splits for oversized or mixed-concern code.
69
- 4. Logging alignment for stale, misleading, missing, or low-context diagnostics.
70
- 5. Risk-based test coverage for high-value business logic and boundary cases.
71
-
72
- Direction-selection rules:
73
-
74
- - Prefer the directions with the strongest current evidence and best guardrails.
75
- - Prefer smaller, higher-confidence refactors that unlock or de-risk larger later refactors.
76
- - Prefer outside-in progress: stabilize boundaries, callers, naming, logs, and tests around a subsystem before attempting deeper internal rewrites.
77
- - Re-evaluate the whole backlog after every landed iteration; the next best direction may change because the previous cleanup improved the local safety or clarity.
78
- - When a file appears too coupled, too central, or too risky for a direct rewrite, treat that as a prompt to switch into staged unlock work rather than a reason to stop. Load `references/coupled-core-file-strategy.md` and choose the next smallest refactor that reduces future risk.
79
-
80
- For each iteration:
81
-
82
- - Read all directly affected callers, tests, and public interfaces before editing.
83
- - Keep the scope small enough to validate and review, and select whichever directions are most justified for that scope instead of forcing every direction to appear in every iteration.
84
- - Prefer repository-native abstractions over new parallel frameworks.
85
- - Preserve public behavior, data contracts, side effects, error classes, and macro architecture.
86
- - Add or update tests in the same iteration when the change touches non-trivial logic, observability contracts, or extracted helpers.
87
- - If strong guardrails exist or can be added cheaply, prefer the clearer or more maintainable refactor instead of leaving a known issue in place due to subjective caution alone.
88
- - Validate the touched scope before starting another iteration.
89
-
90
- ### 3) Rename for clarity without churn
91
-
92
- - Rename only when the current name hides domain meaning, confuses ownership, conflicts with real units, or makes tests/logs misleading.
93
- - Prefer names that encode domain role, unit, lifecycle stage, or canonical owner.
94
- - Update all references, tests, fixtures, structured log fields, docs, and comments that describe the renamed concept.
95
- - Avoid renaming stable public API fields or persisted schema names unless the user explicitly requested a breaking migration.
96
- - Use `references/naming-and-simplification.md` before broad rename passes.
97
-
98
- ### 4) Simplify and extract reusable functions
99
-
100
- - Simplify functions when branches, temporary state, repeated transformations, or hard-coded workflows obscure the invariant.
101
- - Extract helpers only when they reduce duplication, centralize one business rule, clarify caller intent, or make a behavior testable.
102
- - Keep helper placement aligned with current module ownership.
103
- - Do not create abstractions for one-off code unless they isolate a meaningful domain rule or external contract.
104
- - If tests or equivalent guardrails can prove behavior preservation, do not let moderate implementation uncertainty block an otherwise valuable simplification or extraction.
105
- - Preserve observable behavior unless a test proves the current behavior is a defect.
106
-
107
- ### 5) Split modules by responsibility
108
-
109
- - Split code only when one file/module owns multiple change reasons, domain boundaries, external integrations, or lifecycle stages.
110
- - Define the new module's responsibility before moving code.
111
- - Keep interfaces narrow, explicit, and consistent with existing project style.
112
- - Avoid macro-architecture changes such as new top-level layers, new service boundaries, new persistence strategies, deployment/runtime model changes, or framework swaps unless the user explicitly expands scope.
113
- - When module boundaries are currently poor but can be protected by focused tests or other guardrails, choose the cleaner split instead of preserving a mixed-responsibility file out of caution alone.
114
- - Use `references/module-boundaries.md` for extraction rules and anti-patterns.
115
-
116
- ### 5.1) Handle large coupled or apparently core files through unlock work
117
-
118
- - Do not treat a large coupled file, a central orchestrator, or a historically fragile module as an automatic stop condition.
119
- - First ask: what is the next smallest refactor that lowers the risk of changing this area later without changing business behavior now?
120
- - Prefer unlock steps such as characterization tests, naming cleanup, type extraction, pure-function extraction, side-effect boundary isolation, read/write path separation, dependency seam introduction, and caller grouping.
121
- - Only stop when no such unlock step can be identified under current guardrails. If an unlock step exists, do it before reconsidering the larger refactor.
122
- - Use `references/coupled-core-file-strategy.md` whenever the current obstacle is "too coupled", "too central", or "too risky to touch directly".
123
-
124
- ### 6) Repair logging and observability drift
125
-
126
- - Compare log messages, event names, structured fields, metrics, and trace names against the current code ownership model.
127
- - Fix stale terminology after renames or refactors so logs describe the live workflow.
128
- - Add logs only at high-value decision points: branch selection, skipped work, external dependency outcome, persistence side effect, retry/rollback, and final outcome.
129
- - Use structured fields already accepted by the project; never log secrets, tokens, full sensitive payloads, or personal data.
130
- - Add tests or assertions for important log fields when the project has log-capture helpers.
131
- - Use `references/logging-alignment.md` for detailed criteria.
132
-
133
- ### 7) Add high-value tests
134
-
135
- - Start from risk, not coverage percentage.
136
- - Prioritize tests for business rules, state transitions, error handling, extracted helpers, edge cases, observability contracts, and integration boundaries.
137
- - Use unit tests for local logic, property-based tests for invariants and generated input spaces, integration tests for cross-module chains, and E2E tests only when external services are stable or can be controlled reliably.
138
- - Mock or fake external services unless the real service contract is the subject under test.
139
- - If a new test exposes an existing business-logic bug, invoke `systematic-debug`, fix the true owner, and keep the regression test.
140
- - Use `references/testing-strategy.md` for coverage selection and required `N/A` reasoning.
141
-
142
- ### 8) Iterate gradually from outside to inside until the repository is clear of known actionable issues
143
-
144
- - After each iteration, run the narrowest relevant tests first, then broaden validation until the changed scope and final repository state are adequately guarded.
145
- - Re-scan the full codebase, not only the touched area, because the best next direction may have shifted after the last cleanup.
146
- - Perform an explicit stage-gate decision after that full-codebase scan: decide whether all known in-scope issues are now resolved, whether remaining issues are only legitimately deferred categories, or whether another iteration is required right now.
147
- - Re-rank the backlog after every iteration and choose the next highest-confidence, highest-leverage direction set.
148
- - Use small external or boundary-level cleanups to make later deeper refactors safer; treat that groundwork as progress toward a thorough long-horizon refactor, not as a distraction from it.
149
- - Repeat the full iteration whenever any known in-scope actionable gap remains and can be fixed safely without changing business behavior or macro architecture.
150
- - Do not write the completion report, summarize the task as done, or hand back as complete while the latest scan still contains known actionable quality issues.
151
- - Stop only when every known in-scope issue has been resolved, or each remaining candidate is explicitly classified as low-value, speculative, blocked, unsafe, or requiring product/architecture approval.
152
- - Use `references/iteration-gates.md` for stopping criteria.
153
-
154
- ### 9) Synchronize docs and constraints
155
-
156
- After code and tests are complete:
157
-
158
- - Invoke `align-project-documents` when README, docs, architecture notes, debugging docs, setup instructions, or test guidance may have drifted.
159
- - Invoke `maintain-project-constraints` to verify `AGENTS.md` still reflects architecture, business flow, common commands, macro purpose, and coding conventions.
160
- - Update only documentation that is affected by real code, command, logging, or test changes.
79
+ - Run `align-project-documents` when README, architecture notes, setup docs, debugging docs, or test guidance may have drifted.
80
+ - Run `maintain-project-constraints` to verify `AGENTS.md` still matches the repository's real architecture, business flow, commands, and conventions.
81
+ - Update only the documentation and constraints that changed in reality because of the refactor.
161
82
 
162
83
  ## Hard Guardrails
163
84
 
164
- - Do not change intended business logic while refactoring.
165
- - Do not change the system's macro architecture—its top-level runtime shape, deployment/runtime model, persistence model, major service boundaries, or overall operating logic—unless the user explicitly approves that expanded scope.
85
+ - Do not change intended business logic while refactoring, except to fix a real defect exposed by tests and verified at the true owner.
86
+ - Do not change the system's macro architecture unless the user explicitly expands scope.
166
87
  - Do not use one-off scripts to rewrite product code.
167
- - Do not perform style-only churn that does not improve naming, reuse, modularity, observability, or test confidence.
168
- - Do not weaken tests to make refactors pass; update tests to stable invariants or fix the implementation defect.
169
- - Do not add E2E tests that depend on unreliable external services when a controlled integration test can prove the same business risk.
88
+ - Do not stop early just because a file is large, central, or historically fragile; if a safe unlock step exists, that is the next job.
89
+ - Do not stop before every in-scope module has been inventoried, deeply read, and either improved, validated as clear, or explicitly deferred/excluded with evidence.
90
+ - Do not weaken tests to make a refactor pass; fix the real defect or update stale expectations to stable invariants.
91
+ - Do not add style-only churn that does not improve naming, modularity, observability, reuse, or guardrail strength.
92
+ - Do not add unreliable E2E coverage when a controlled integration or characterization test can prove the same risk more safely.
170
93
 
171
94
  ## Completion Report
172
95
 
173
- Only write this report after the latest scan confirms there are no known actionable in-scope quality issues remaining and the relevant test/guardrail suite is green. If any such issue remains, continue iterating instead of reporting completion.
96
+ Only report completion after Step 3 is done, the latest Step 1 scan is clear, and the module coverage ledger has no unvisited in-scope module.
174
97
 
175
98
  Return:
176
99
 
177
- 1. Iterations completed, which execution directions were selected in each one, and the stage-gate decision after each full-codebase re-scan.
178
- 2. Key files changed and the quality issue each change resolved.
179
- 3. Business behavior preservation evidence.
180
- 4. Tests added or updated, including property/integration/E2E `N/A` reasons where relevant.
181
- 5. Validation commands and results.
182
- 6. Documentation and `AGENTS.md` synchronization status.
183
- 7. Remaining quality gaps, blockers, or deferred architecture/product decisions.
100
+ 1. Iterations completed and the jobs selected in each iteration.
101
+ 2. Stage-gate verdict after each full-codebase re-scan.
102
+ 3. Module coverage ledger summary: modules deep-read, improved, validated-clear, deferred, or excluded.
103
+ 4. Key files changed and the quality issue each change resolved.
104
+ 5. Business behavior preservation evidence.
105
+ 6. Tests or other guardrails added or updated, including property/integration/E2E `N/A` reasons where relevant.
106
+ 7. Validation commands and results.
107
+ 8. Documentation and `AGENTS.md` synchronization status.
108
+ 9. Remaining blocked or approval-dependent items, if any.
@@ -1,4 +1,4 @@
1
1
  interface:
2
2
  display_name: "Iterative Code Quality"
3
3
  short_description: "Refactor names, functions, modules, logs, and tests in repeated behavior-safe passes"
4
- default_prompt: "Use $iterative-code-quality to keep scanning the full repository, treat naming, simplification, reusable extraction, module-boundary cleanup, logging alignment, and testing as selectable execution directions rather than a fixed sequence, and choose the highest-confidence, highest-leverage directions available at each moment; use small safe refactors to prepare the ground for larger later refactors, progress gradually from outside to inside, and when a file feels too coupled or too central switch into staged unlock work instead of stopping; use tests or other reliable guardrails to justify aggressive cleanup without breaking intended behavior or the system's top-level runtime architecture, run a full-codebase stage-gate after each iteration to decide whether another round is required, and do not write a completion report while any known in-scope actionable issue remains; only finish after the latest full-codebase scan is clear or remaining items are explicitly classified as blocked, unsafe, low-value, speculative, or approval-dependent, and the guarded test surface is green; then run $align-project-documents and $maintain-project-constraints to synchronize docs and AGENTS.md."
4
+ default_prompt: "Use $iterative-code-quality as a strict three-step loop. Step 1: scan the full repository, refresh the actionable quality backlog, and maintain a module inventory plus coverage ledger. Step 2: choose this round's module or bounded module cluster, then choose jobs from the reference documents and land the highest-value safe refactors now; start from the easiest useful unvisited modules, jobs are selectable directions rather than workflow steps, and if a high-risk area is weakly guarded add the missing tests or other guardrails instead of stopping. If a file is too coupled or too central for direct cleanup, switch to staged unlock work and keep progressing. After validation, run a full-codebase stage-gate; if any known in-scope actionable issue remains or any in-scope module has not received a deep-read iteration, go back to Step 1 immediately. Step 3: only when the latest full-codebase scan is clear and every in-scope module is deeply read, run $align-project-documents and $maintain-project-constraints to synchronize docs and AGENTS.md. Preserve intended business behavior and the system's macro architecture, keep the guarded test surface green, and do not write a completion report while actionable gaps or unvisited modules still exist."
@@ -12,6 +12,8 @@ A large coupled file is a **decomposition signal**, not a **completion blocker**
12
12
 
13
13
  If a safe, behavior-preserving unlock step exists under current guardrails, take that step now instead of deferring the whole area.
14
14
 
15
+ If guardrails are too weak for direct cleanup, strengthening them is itself the next unlock step.
16
+
15
17
  ## First questions to ask
16
18
 
17
19
  When a file feels untouchable, ask:
@@ -57,6 +59,8 @@ Prefer the next step that maximizes:
57
59
 
58
60
  If two steps are both safe, choose the one that makes the next iteration easier.
59
61
 
62
+ If the file is high-risk and under-tested, prefer adding the smallest useful characterization tests before attempting deeper structural edits.
63
+
60
64
  ## Completion rule for coupled files
61
65
 
62
66
  Do not ask "Can I solve the whole file now?"
@@ -4,6 +4,7 @@
4
4
 
5
5
  Each iteration must have:
6
6
 
7
+ - a selected module or bounded module cluster,
7
8
  - a concrete quality target,
8
9
  - a bounded file/symbol scope,
9
10
  - one or more selected execution directions,
@@ -15,6 +16,8 @@ An iteration is not "one work type", and it also does not need to include every
15
16
 
16
17
  Avoid starting a broad second iteration before validating the first, but do not stop after a validated iteration if known actionable quality issues remain anywhere in the in-scope codebase.
17
18
 
19
+ Do not stop after a validated iteration if any in-scope module remains unvisited in the module coverage ledger.
20
+
18
21
  ## Validation cadence
19
22
 
20
23
  Run validation from narrow to broad:
@@ -39,6 +42,7 @@ The final stopping condition also requires the relevant guarded test surface to
39
42
 
40
43
  Inspect the full known quality backlog for:
41
44
 
45
+ - modules that are still unvisited or only shallowly read,
42
46
  - new naming drift from moved or extracted concepts,
43
47
  - duplicated logic that remains after extraction,
44
48
  - module boundaries that are still mixed,
@@ -52,15 +56,18 @@ Then choose the next execution directions with these priorities:
52
56
  2. strongest leverage for later deeper cleanup,
53
57
  3. lowest business-risk path toward broader system improvement.
54
58
 
59
+ Use `references/job-selection.md` to convert those priorities into a concrete next-job choice.
60
+
55
61
  ## Stage-gate after each iteration
56
62
 
57
63
  After every validated iteration, run a deliberate full-codebase decision pass:
58
64
 
59
65
  1. Re-scan the repository and refresh the known quality backlog.
60
- 2. Ask whether any known in-scope actionable issue still remains.
61
- 3. If yes, decide whether it should be addressed in the very next iteration or whether first-step unlock work is needed.
62
- 4. If the obstacle is a large, coupled, or central file, do not stop there; switch to staged unlock work and continue.
63
- 5. Only declare the repository iteration-complete when the re-scan shows no remaining actionable in-scope issue except items that are explicitly deferred under the allowed stop categories.
66
+ 2. Refresh the module coverage ledger and identify unvisited in-scope modules.
67
+ 3. Ask whether any known in-scope actionable issue still remains.
68
+ 4. If yes, decide whether it should be addressed in the very next iteration or whether first-step unlock work is needed.
69
+ 5. If the obstacle is a large, coupled, or central file, do not stop there; switch to staged unlock work and continue.
70
+ 6. Only declare the repository iteration-complete when the re-scan shows no remaining actionable in-scope issue and no unvisited in-scope module except items that are explicitly deferred or excluded under the allowed stop categories.
64
71
 
65
72
  This stage-gate is mandatory. A validated local change does not by itself mean the repository is done.
66
73
 
@@ -69,6 +76,7 @@ This stage-gate is mandatory. A validated local change does not by itself mean t
69
76
  Repeat the cycle when:
70
77
 
71
78
  - any known in-scope actionable quality issue remains unresolved,
79
+ - any in-scope module remains unvisited,
72
80
  - high-impact unclear names remain,
73
81
  - duplicated or hard-coded workflows still have safe extraction paths,
74
82
  - a module still mixes distinct responsibilities and can be split locally,
@@ -93,12 +101,16 @@ Stop only when there are no unresolved known in-scope actionable issues. Any rem
93
101
 
94
102
  If a remaining candidate cannot be placed in one of these categories, it is still an actionable gap and the agent must continue iterating rather than complete the task.
95
103
 
104
+ If an in-scope module has not received a deep-read iteration, it is still an actionable coverage gap even when the already-read modules look clean.
105
+
96
106
  ## Completion evidence
97
107
 
98
108
  The final report should make the stopping point auditable:
99
109
 
100
110
  - passes completed,
101
111
  - execution directions selected per iteration,
112
+ - module or module cluster covered per iteration,
113
+ - final module coverage ledger,
102
114
  - stage-gate verdict after each full-codebase re-scan,
103
115
  - validation commands and outcomes,
104
116
  - confirmation that the guarded test surface is green after the refactor,
@@ -0,0 +1,73 @@
1
+ # Job Selection Guide
2
+
3
+ ## Purpose
4
+
5
+ Help the agent choose the next execution direction after each full-codebase re-scan.
6
+
7
+ These are job-selection rules for Step 2 of the main skill loop. They are not workflow steps.
8
+
9
+ The goal is not to force one permanent order. The goal is to choose the next job that most safely improves the selected module or module cluster and unlocks later work.
10
+
11
+ ## Available jobs
12
+
13
+ - naming cleanup
14
+ - function simplification / extraction
15
+ - module-boundary cleanup
16
+ - logging alignment
17
+ - test addition
18
+ - staged unlock work
19
+
20
+ ## Choose `naming cleanup` when
21
+
22
+ - confusing names are the main thing blocking understanding,
23
+ - flags, units, lifecycle states, or ownership terms are misleading,
24
+ - better naming would clearly reduce the risk of a later deeper refactor.
25
+
26
+ ## Choose `function simplification / extraction` when
27
+
28
+ - duplicated logic exists across multiple call sites,
29
+ - one function mixes too many concerns,
30
+ - control flow is currently the main complexity bottleneck,
31
+ - extracting a helper would make the next test or split easier.
32
+
33
+ ## Choose `module-boundary cleanup` when
34
+
35
+ - one file or module clearly has multiple reasons to change,
36
+ - local responsibilities are already visible enough to separate,
37
+ - a safe split would reduce repeated touching of unrelated concerns.
38
+
39
+ ## Choose `logging alignment` when
40
+
41
+ - stale or missing diagnostics are the main blocker to safe validation,
42
+ - later refactors would be safer if branch decisions and outcomes were easier to observe,
43
+ - observability drift is currently hiding the real ownership model.
44
+
45
+ ## Choose `test addition` when
46
+
47
+ - the target area is high-risk and weakly guarded,
48
+ - desired cleanup is blocked mainly by missing behavior locks,
49
+ - coupling spans multiple modules and needs characterization before change,
50
+ - regression risk is too high to justify deeper refactors without stronger coverage.
51
+
52
+ ## Choose `staged unlock work` when
53
+
54
+ - the file feels too central or too coupled for direct cleanup,
55
+ - no safe full refactor exists yet, but a preparatory step does,
56
+ - you can reduce risk through naming, seam extraction, type extraction, side-effect isolation, or caller grouping,
57
+ - the best next move is to make a future refactor cheaper rather than solve the whole area now.
58
+
59
+ ## Tie-breakers
60
+
61
+ If multiple jobs are plausible, prefer the one that:
62
+
63
+ 1. increases safety for the next iteration,
64
+ 2. reduces cognitive load fastest,
65
+ 3. removes the strongest blocker to a deeper future refactor,
66
+ 4. helps an unvisited module reach deep-read coverage,
67
+ 5. preserves behavior with the clearest available guardrails.
68
+
69
+ ## Hard rule
70
+
71
+ If a high-risk area lacks enough guardrails, `test addition` or another guardrail-building job should usually win before a deeper structural refactor.
72
+
73
+ If any in-scope module remains unvisited, choose jobs that help the next easiest useful unvisited module become deeply read, improved, or validated-clear before spending another round on already-familiar areas.
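Illustrative note (not part of the package): the tie-breakers and hard rule in the job-selection guide above can be read as an ordered comparison. The sketch below is one possible interpretation. The `JobCandidate` record, its 0-3 score fields, and `choose_next_job` are hypothetical names invented for this example, and the guide says guardrail-building jobs should "usually" win, whereas this simplification makes them always win.

```python
from dataclasses import dataclass

# Assumption: only "test addition" is treated as a guardrail-building job here.
GUARDRAIL_JOBS = {"test addition"}


@dataclass(frozen=True)
class JobCandidate:
    """Hypothetical scoring record; scores are rough 0-3 estimates."""
    job: str
    safety_gain: int               # tie-breaker 1: safety for the next iteration
    cognitive_load_reduction: int  # tie-breaker 2
    blocker_removed: int           # tie-breaker 3: unblocks a deeper refactor
    helps_unvisited_module: bool   # tie-breaker 4
    guardrail_clarity: int         # tie-breaker 5


def choose_next_job(candidates, high_risk, weakly_guarded):
    """Pick one job, applying the hard rule first, then the tie-breakers."""
    if not candidates:
        raise ValueError("no candidate jobs")
    pool = list(candidates)
    # Hard rule (simplified): in a high-risk, weakly guarded area,
    # restrict the choice to guardrail-building jobs when one is available.
    if high_risk and weakly_guarded:
        guardrail_first = [c for c in pool if c.job in GUARDRAIL_JOBS]
        if guardrail_first:
            pool = guardrail_first
    # Tie-breakers compared lexicographically, in the order the guide lists them.
    return max(
        pool,
        key=lambda c: (
            c.safety_gain,
            c.cognitive_load_reduction,
            c.blocker_removed,
            c.helps_unvisited_module,
            c.guardrail_clarity,
        ),
    )
```

Treating the tie-breakers as a strict lexicographic order is a design choice of this sketch. A weighted score would be an equally plausible reading of the guide.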
@@ -0,0 +1,113 @@
1
+ # Module Coverage And Deep-Read Iterations
2
+
3
+ ## Purpose
4
+
5
+ Prevent the agent from repeatedly improving only familiar or easy files while untouched modules remain unexamined.
6
+
7
+ Use this reference in Step 1 to build the module inventory and in Step 2 to choose which module or module cluster receives the next deep-read iteration.
8
+
9
+ ## Module inventory
10
+
11
+ List every meaningful in-scope module before completion. A module may be:
12
+
13
+ - a package, app, service, route group, command group, worker, or library,
14
+ - a domain folder with a clear responsibility,
15
+ - a runtime entrypoint plus its owned helpers,
16
+ - a testable subsystem with stable callers and contracts.
17
+
18
+ Record each module with:
19
+
20
+ - module name and path roots,
21
+ - primary responsibility,
22
+ - entrypoints and public interfaces,
23
+ - key callers and callees,
24
+ - tests and guardrails,
25
+ - logging or telemetry surfaces,
26
+ - risk level and estimated ease,
27
+ - current coverage status.
28
+
29
+ Exclude generated, vendored, lock, build-output, snapshot, fixture-only, or explicitly out-of-scope areas only with evidence.
30
+
31
+ ## Coverage ledger statuses
32
+
33
+ Use simple statuses so stopping conditions are auditable:
34
+
35
+ - `unvisited`: inventoried but not deeply read yet.
36
+ - `deep-read`: callers, callees, tests, logs, contracts, and core files were inspected with enough context to judge quality.
37
+ - `refactored`: at least one behavior-neutral improvement landed for this module.
38
+ - `validated-clear`: deep read found no actionable in-scope quality issue worth changing now.
39
+ - `deferred`: an issue exists but is blocked, unsafe, speculative, approval-dependent, or requires macro-architecture/product scope.
40
+ - `excluded`: not human-maintained source or outside the user's requested scope.
41
+
42
+ Completion is not allowed while any in-scope module remains `unvisited`.
43
+
44
+ ## Easy-first module ordering
45
+
46
+ Start with the easiest useful modules when that reduces risk:
47
+
48
+ - small surface area,
49
+ - clear ownership,
50
+ - local tests or cheap guardrails,
51
+ - limited side effects,
52
+ - low public API or persistence risk,
53
+ - likely to clarify names, tests, boundaries, or seams used by harder modules.
54
+
55
+ Do not confuse easy-first with low-value churn. The chosen module should either resolve real quality issues or create context/guardrails that make later modules safer.
56
+
57
+ If multiple modules are equally easy, prefer the one that unlocks harder modules by improving shared naming, helpers, tests, logging, or dependency seams.
58
+
59
+ ## Deep-read requirements
60
+
61
+ A module iteration is not deep-read until the agent inspects:
62
+
63
+ - module entrypoints and public interfaces,
64
+ - internal core files and responsibility boundaries,
65
+ - key callers and downstream callees,
66
+ - tests, fixtures, mocks, and validation commands,
67
+ - logs, metrics, tracing, and error messages,
68
+ - configuration, persistence, and external-service contracts when relevant,
69
+ - known TODOs, comments, or docs that describe current behavior.
70
+
71
+ Do not mark a module `validated-clear` from a shallow file skim.
72
+
73
+ ## Choosing the next module
74
+
75
+ After every iteration:
76
+
77
+ 1. Re-scan the module ledger.
78
+ 2. Prefer an `unvisited` module unless a just-touched module must be stabilized before moving on.
79
+ 3. Choose the easiest useful `unvisited` module that can be deeply read and improved or validated now.
80
+ 4. If the next unvisited module is high-risk and under-guarded, choose guardrail-building jobs first.
81
+ 5. If the next unvisited module is too coupled for direct cleanup, choose staged unlock work rather than skipping it.
82
+ 6. Return to the full-codebase scan after validation and update the ledger.
83
+
84
+ Revisiting a familiar module is valid only when:
85
+
86
+ - it blocks safe deep reading of an unvisited module,
87
+ - a previous refactor created follow-up drift that must be stabilized,
88
+ - validation exposed a real defect or stale contract,
89
+ - cross-module cleanup requires touching it together with the next module.
90
+
91
+ ## Module cluster iterations
92
+
93
+ One iteration may cover a small cluster of modules when they share one boundary or invariant, such as:
94
+
95
+ - a command and its parser,
96
+ - a route and its service,
97
+ - a domain module and its test helpers,
98
+ - an integration wrapper and its local fake.
99
+
100
+ Keep clusters bounded. Do not use clustering to claim full-repository coverage without deep context.
101
+
102
+ ## Stage-gate questions
103
+
104
+ At the end of each iteration, answer:
105
+
106
+ - Which module or module cluster was deeply read?
107
+ - Which jobs were selected and why?
108
+ - What quality issue was fixed, or why is the module validated-clear?
109
+ - Which guardrails prove behavior was preserved?
110
+ - Which modules remain `unvisited`?
111
+ - Which module is the next easiest useful target?
112
+
113
+ If any in-scope module remains `unvisited`, the correct action is to return to Step 1, not to finish.
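Illustrative note (not part of the package): the coverage ledger and its stopping rule could be modelled roughly as below. The status values come from the reference above. `ModuleEntry`, `can_complete`, `next_module`, and the numeric `ease` field are hypothetical names for this sketch. Treating a bare `deep-read` status as non-terminal is an assumption drawn from the requirement that each module end up improved, validated-clear, deferred, or excluded with evidence.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    UNVISITED = "unvisited"
    DEEP_READ = "deep-read"
    REFACTORED = "refactored"
    VALIDATED_CLEAR = "validated-clear"
    DEFERRED = "deferred"
    EXCLUDED = "excluded"


# Assumption: these statuses count as a finished outcome for a module.
TERMINAL = {Status.REFACTORED, Status.VALIDATED_CLEAR, Status.DEFERRED, Status.EXCLUDED}
# Deferral and exclusion are only acceptable when evidence is recorded.
NEEDS_EVIDENCE = {Status.DEFERRED, Status.EXCLUDED}


@dataclass
class ModuleEntry:
    name: str
    path_roots: list[str]
    status: Status = Status.UNVISITED
    ease: int = 0       # hypothetical estimate, higher means easier
    evidence: str = ""  # why the module was deferred or excluded


def can_complete(ledger, open_actionable_issues):
    """Return True only when the backlog is clear and every module is settled."""
    if open_actionable_issues > 0:
        return False
    for module in ledger:
        if module.status not in TERMINAL:
            return False
        if module.status in NEEDS_EVIDENCE and not module.evidence.strip():
            return False
    return True


def next_module(ledger):
    """Easy-first ordering: pick the easiest module that is still unvisited."""
    candidates = [m for m in ledger if m.status is Status.UNVISITED]
    return max(candidates, key=lambda m: m.ease, default=None)
```

A caller would re-run `can_complete` after every stage-gate and go back to the scan step whenever it returns `False`.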
@@ -2,13 +2,14 @@
2
2
 
3
3
  ## Purpose
4
4
 
5
- Build a factual map before changing code, then choose the highest-value quality improvements.
5
+ Build a factual map before changing code, then choose the highest-value quality improvements while tracking module-by-module deep-read coverage.
6
6
 
7
7
  ## Required scan
8
8
 
9
9
  - Read `AGENTS.md`, `README*`, project docs, manifests, task runners, CI configs, and test setup.
10
10
  - List entrypoints: CLI commands, servers, workers, jobs, frontend routes, scripts, libraries, or public packages.
11
11
  - Identify core domain modules, external integrations, persistence boundaries, logging utilities, and test helpers.
12
+ - Create a module inventory and coverage ledger using `references/module-coverage.md`.
12
13
  - Inspect current git state before editing so unrelated user changes are not overwritten.
13
14
  - Identify generated, vendored, lock, snapshot, build-output, and fixture files; exclude them from refactoring unless they are human-maintained source.
14
15
 
@@ -29,6 +30,7 @@ Prioritize files or functions with:
29
30
  For each candidate record:
30
31
 
31
32
  - file path and symbol name,
33
+ - owning module or module cluster,
32
34
  - observed quality problem,
33
35
  - why it matters to maintainability or correctness confidence,
34
36
  - expected behavior-neutral change,
@@ -57,3 +59,5 @@ Score each candidate by:
57
59
  4. **Blast radius**: number of modules, public contracts, and migrations affected.
58
60
 
59
61
  Start with high-impact, high-confidence, low-blast-radius items. Escalate broad changes only when smaller passes cannot resolve the root problem.
62
+
63
+ Do not finish from backlog scoring alone. Completion also requires the module coverage ledger to show that every in-scope module has been deeply read and either improved, validated-clear, deferred, or excluded with evidence.
@@ -8,6 +8,11 @@ For every non-trivial pass, ask what could regress silently if the cleanup were
8
8
 
9
9
  Use the resulting guardrails aggressively: when tests or equivalent verification can prove behavior preservation, they should unlock bolder refactors rather than merely justify small cosmetic edits.
10
10
 
11
+ Do not require pre-existing tests before every refactor. Instead:
12
+
13
+ - if existing guardrails are already sufficient, proceed;
14
+ - if the area is high-risk and guardrails are weak, add the smallest high-value tests first and treat that as progress toward the refactor.
15
+
11
16
  The intended end state is not merely "some tests passed for touched files". The refactor is complete only when the relevant guarded test surface for the repository remains green after the cleanup.
12
17
 
13
18
  ## Unit tests
@@ -27,6 +32,8 @@ Good oracles:
27
32
  - exact error class or reason code,
28
33
  - emitted side effect or explicit lack of side effect.
29
34
 
35
+ For high-risk legacy code with weak coverage, characterization-style unit tests are often the first unlock step even before the larger cleanup happens.
36
+
30
37
  ## Property-based tests
31
38
 
32
39
  Use when logic has invariants or broad input space:
@@ -51,6 +58,8 @@ Use when the risk spans modules:
51
58
 
52
59
  For external services, prefer mocks, fakes, local emulators, or recorded stable fixtures unless the real contract is explicitly under test.
53
60
 
61
+ When risk comes from multi-module coupling rather than one local function, integration coverage is often the best guardrail to add before refactoring.
62
+
54
63
  ## E2E tests
55
64
 
56
65
  Use only when:
@@ -81,3 +90,4 @@ Consider:
81
90
  - Do not weaken existing tests to fit the refactor.
82
91
  - If old tests asserted implementation details, rewrite them around stable behavior while preserving the business invariant.
83
92
  - Once stable guardrails exist, do not refuse a maintainability-improving refactor purely because confidence feels lower than ideal; let the guardrails decide.
93
+ - If stable guardrails do not yet exist for a high-risk area, create them as the next execution direction instead of treating the refactor as blocked forever.
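Illustrative note (not part of the package): a characterization-style test of the kind the testing strategy recommends pins down what legacy code currently does before any cleanup starts. In the sketch below, `legacy_shipping_fee` is a made-up stand-in for under-tested legacy code, and the expected values were recorded from its current behavior rather than taken from any specification.

```python
import unittest


def legacy_shipping_fee(weight_kg, express):
    """Hypothetical legacy function whose behavior must be preserved."""
    fee = 5.0 + 1.5 * weight_kg
    if express:
        fee *= 2
    if weight_kg > 20:
        fee += 10
    return round(fee, 2)


class ShippingFeeCharacterization(unittest.TestCase):
    """Locks in observed outputs so a later refactor can be checked against them."""

    # (arguments, output observed from the current implementation)
    CASES = [
        ((1, False), 6.5),
        ((1, True), 13.0),
        ((20, False), 35.0),  # boundary: the surcharge applies only above 20 kg
        ((21, True), 83.0),   # the surcharge is added after express doubling
    ]

    def test_current_behavior_is_preserved(self):
        for args, expected in self.CASES:
            with self.subTest(args=args):
                self.assertEqual(legacy_shipping_fee(*args), expected)
```

Saved as a module, this can be run with `python -m unittest`. The cases deliberately sit on branch boundaries, since those are the places where a structural refactor is most likely to change behavior silently.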
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@laitszkin/apollo-toolkit",
3
- "version": "3.1.5",
3
+ "version": "3.1.7",
4
4
  "description": "Apollo Toolkit npm installer for managed skill copying across Codex, OpenClaw, and Trae.",
5
5
  "license": "MIT",
6
6
  "author": "LaiTszKin",