@laitszkin/apollo-toolkit 3.1.2 → 3.1.4

package/CHANGELOG.md CHANGED
@@ -7,6 +7,16 @@ All notable changes to this repository are documented in this file.
  ### Changed
  - None yet.

+ ## [v3.1.4] - 2026-04-23
+
+ ### Changed
+ - Refine `iterative-code-quality` so it now treats naming, abstraction, module boundaries, logging, and tests as selectable execution directions under continuous full-codebase rescans, guiding agents to choose the highest-confidence, highest-leverage gradual refactors that prepare the ground for deeper later cleanup while preserving behavior under green guardrails and a precise system-level definition of macro architecture.
+
+ ## [v3.1.3] - 2026-04-23
+
+ ### Changed
+ - Tighten `iterative-code-quality` so agents must keep iterating while any known in-scope actionable quality issue remains, must not produce a completion report until the latest scan is clear or remaining candidates are explicitly classified as blocked, unsafe, low-value, speculative, or approval-dependent, and should use tests or equivalent guardrails to support more aggressive refactors instead of deferring them for subjective confidence reasons.
+
  ## [v3.1.2] - 2026-04-23

  ### Changed
@@ -1,16 +1,21 @@
  # iterative-code-quality

- Improve an existing repository through repeated, evidence-backed code-quality passes while preserving intended business behavior and macro architecture.
+ Improve an existing repository through repeated, evidence-backed full-iteration refactors while preserving intended business behavior and the system's top-level macro architecture.

  ## Core capabilities

  - Scans the full codebase and builds a prioritized quality backlog before editing.
+ - Treats naming, abstraction, module boundaries, logging, and tests as selectable execution directions rather than a fixed sequence.
  - Clarifies ambiguous variable, parameter, field, helper, and test-data names.
  - Simplifies complex functions and extracts reusable helpers only when they centralize real behavior.
  - Splits mixed-responsibility code into narrower modules without changing macro architecture.
  - Repairs stale or missing logs and adds tests for important observability contracts.
  - Adds high-value unit, property-based, integration, or E2E tests based on risk.
- - Repeats the pass cycle until remaining issues are low-value, blocked, or require explicit product/architecture approval.
+ - Uses those tests and other guardrails to justify more aggressive refactors, instead of leaving known issues in place for subjective confidence reasons.
+ - Re-scans the full repository after every iteration and picks the next highest-confidence, highest-leverage directions.
+ - Uses small safe refactors to prepare the ground for larger later refactors, progressing gradually from outside to inside.
+ - Repeats the pass cycle while any known in-scope actionable quality issue remains, and forbids a completion report until the latest scan is clear or remaining items are explicitly deferred with a valid reason.
+ - Targets as many inherited repository quality problems as can be solved safely, and expects the guarded test surface to remain green after the refactor.
  - Synchronizes project docs and `AGENTS.md` through `align-project-documents` and `maintain-project-constraints` after implementation.

  ## Repository structure
@@ -5,7 +5,7 @@ description: >-
  passes: clarify poor variable names, simplify or extract reusable functions,
  split oversized code into single-responsibility modules, repair stale or
  missing logs, and add high-value tests while preserving business behavior and
- macro architecture. Use when users ask for comprehensive refactoring, code
+ system-level macro architecture. Use when users ask for comprehensive refactoring, code
  cleanup, maintainability hardening, naming cleanup, log alignment, or test
  coverage improvement across a repository.
  ---
@@ -22,15 +22,17 @@ description: >-
  ## Standards

  - Evidence: Read repository docs, project constraints, source, tests, logs, and entrypoints before editing; every rename, extraction, split, log update, or test must be backed by code context.
- - Execution: Work in bounded passes, prioritize behavior-neutral improvements with the highest maintainability and test value, validate after each pass, and repeat until the remaining quality gaps are low-value or unsafe to change.
- - Quality: Preserve business behavior and macro architecture unless tests expose an existing logic defect; avoid style-only churn, compatibility theater, broad rewrites, and unverified "cleanup".
- - Output: Deliver a concise pass-by-pass summary, changed behavior-neutral surfaces, test coverage added, validation results, unresolved risks, and documentation/`AGENTS.md` sync status.
+ - Execution: Continuously re-scan the full codebase, treat naming, abstraction, module boundaries, logging, and tests as selectable execution directions rather than a fixed sequence, choose the highest-confidence directions that can safely land now, and use those smaller refactors to prepare the ground for larger future refactors; validate after each iteration, then keep iterating while any known in-scope codebase quality issue remains unresolved; when tests or other reliable guardrails can prove equivalence, prefer taking the refactor instead of deferring it for subjective confidence reasons; do not produce the completion report while the scan still contains actionable gaps.
+ - Quality: Solve as many inherited code-quality problems as safely possible without changing intended behavior or the system's macro architecture; avoid style-only churn, compatibility theater, broad rewrites, and unverified "cleanup", but do not reject a worthwhile refactor purely because it feels risky when existing or newly added guardrails can verify it safely.
+ - Output: Deliver a concise pass-by-pass summary, changed behavior-neutral surfaces, test coverage added, validation results, and documentation/`AGENTS.md` sync status only after every known in-scope quality issue is resolved or explicitly classified as blocked, unsafe, low-value, speculative, or requiring user approval.

  ## Goal

- Raise code quality across an existing repository without changing intended product behavior or the system's macro architecture.
+ Resolve as many inherited repository quality problems as possible without breaking intended behavior, and use tests plus other reliable guardrails to prove that the refactor leaves the project in a fully green state.

- This skill is intentionally implementation-oriented, not report-only. It should identify high-value improvements, apply them, test them, and keep iterating until further changes would be speculative, low-value, or architecture-changing.
+ This skill is intentionally implementation-oriented, not report-only. It should keep scanning the full codebase, choose the best available refactor directions at each moment, apply as much safe cleanup as the repository can support, add or strengthen tests to guard the refactor, and use incremental cleanup to unlock deeper improvements over time. If a post-iteration scan finds remaining actionable gaps, continue the next iteration instead of writing a completion report.
+
+ For this skill, `macro architecture` means the system's top-level runtime shape and overall operating logic: major subsystems, top-level execution model, deployment/runtime boundaries, persistence model, service boundaries, and the end-to-end way the whole system works. Ordinary module interactions, helper extraction, local responsibility moves, and internal call-boundary cleanup do not count as macro-architecture changes by themselves.

  ## Required Reference Loading
@@ -54,9 +56,11 @@ Load references only when they match the active pass:
  - Build an initial quality backlog with concrete file/function/test targets before changing code.
  - Use `references/repository-scan.md` for the scan checklist and backlog scoring.

- ### 2) Execute bounded improvement passes
+ ### 2) Execute continuous full scans with selectable directions
+
+ Do not force one fixed order such as "finish naming first, then abstraction, then modules". Instead, keep re-scanning the whole codebase and select the execution directions that are highest-confidence and highest-leverage right now.

- Run focused passes in the order that fits the repository evidence. A typical order is:
+ Treat these as multi-select execution directions, not mandatory sequential stages:

  1. Naming clarity for variables, parameters, fields, local helpers, and test data.
  2. Function simplification and reusable extraction for duplicated or hard-coded workflows.
@@ -64,14 +68,22 @@ Run focused passes in the order that fits the repository evidence. A typical ord
  4. Logging alignment for stale, misleading, missing, or low-context diagnostics.
  5. Risk-based test coverage for high-value business logic and boundary cases.

- For each pass:
+ Direction-selection rules:
+
+ - Prefer the directions with the strongest current evidence and best guardrails.
+ - Prefer smaller, higher-confidence refactors that unlock or de-risk larger later refactors.
+ - Prefer outside-in progress: stabilize boundaries, callers, naming, logs, and tests around a subsystem before attempting deeper internal rewrites.
+ - Re-evaluate the whole backlog after every landed iteration; the next best direction may change because the previous cleanup improved the local safety or clarity.
+
+ For each iteration:

  - Read all directly affected callers, tests, and public interfaces before editing.
- - Keep the pass small enough to validate and review; split broad cleanups into multiple passes.
+ - Keep the scope small enough to validate and review, and select whichever directions are most justified for that scope instead of forcing every direction to appear in every iteration.
  - Prefer repository-native abstractions over new parallel frameworks.
  - Preserve public behavior, data contracts, side effects, error classes, and macro architecture.
- - Add or update tests in the same pass when the change touches non-trivial logic, observability contracts, or extracted helpers.
- - Validate the touched scope before starting another pass.
+ - Add or update tests in the same iteration when the change touches non-trivial logic, observability contracts, or extracted helpers.
+ - If strong guardrails exist or can be added cheaply, prefer the clearer or more maintainable refactor instead of leaving a known issue in place due to subjective caution alone.
+ - Validate the touched scope before starting another iteration.

  ### 3) Rename for clarity without churn
@@ -87,6 +99,7 @@ For each pass:
  - Extract helpers only when they reduce duplication, centralize one business rule, clarify caller intent, or make a behavior testable.
  - Keep helper placement aligned with current module ownership.
  - Do not create abstractions for one-off code unless they isolate a meaningful domain rule or external contract.
+ - If tests or equivalent guardrails can prove behavior preservation, do not let moderate implementation uncertainty block an otherwise valuable simplification or extraction.
  - Preserve observable behavior unless a test proves the current behavior is a defect.

  ### 5) Split modules by responsibility
@@ -94,7 +107,8 @@ For each pass:
  - Split code only when one file/module owns multiple change reasons, domain boundaries, external integrations, or lifecycle stages.
  - Define the new module's responsibility before moving code.
  - Keep interfaces narrow, explicit, and consistent with existing project style.
- - Avoid macro-architecture changes such as new layers, new service boundaries, new persistence strategies, or framework swaps unless the user explicitly expands scope.
+ - Avoid macro-architecture changes such as new top-level layers, new service boundaries, new persistence strategies, deployment/runtime model changes, or framework swaps unless the user explicitly expands scope.
+ - When module boundaries are currently poor but can be protected by focused tests or other guardrails, choose the cleaner split instead of preserving a mixed-responsibility file out of caution alone.
  - Use `references/module-boundaries.md` for extraction rules and anti-patterns.

  ### 6) Repair logging and observability drift
@@ -115,12 +129,15 @@ For each pass:
  - If a new test exposes an existing business-logic bug, invoke `systematic-debug`, fix the true owner, and keep the regression test.
  - Use `references/testing-strategy.md` for coverage selection and required `N/A` reasoning.

- ### 8) Iterate until quality gates pass
+ ### 8) Iterate gradually from outside to inside until the repository is clear of known actionable issues

- - After each pass, run the narrowest relevant tests first, then broaden validation when confidence increases.
- - Re-scan touched areas for new naming drift, duplicated helper candidates, module-boundary cracks, logging drift, and missing tests.
- - Repeat the full pass cycle when significant gaps remain and can be fixed safely without changing business behavior or macro architecture.
- - Stop only when remaining issues are low-value, speculative, blocked, or require explicit product/architecture approval.
+ - After each iteration, run the narrowest relevant tests first, then broaden validation until the changed scope and final repository state are adequately guarded.
+ - Re-scan the full codebase, not only the touched area, because the best next direction may have shifted after the last cleanup.
+ - Re-rank the backlog after every iteration and choose the next highest-confidence, highest-leverage direction set.
+ - Use small external or boundary-level cleanups to make later deeper refactors safer; treat that groundwork as progress toward a thorough long-horizon refactor, not as a distraction from it.
+ - Repeat the full iteration whenever any known in-scope actionable gap remains and can be fixed safely without changing business behavior or macro architecture.
+ - Do not write the completion report, summarize the task as done, or hand back as complete while the latest scan still contains known actionable quality issues.
+ - Stop only when every known in-scope issue has been resolved, or each remaining candidate is explicitly classified as low-value, speculative, blocked, unsafe, or requiring product/architecture approval.
  - Use `references/iteration-gates.md` for stopping criteria.
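
Editor's sketch of the loop the new section 8 describes, under stated assumptions. It is not part of the package. `run_iterations` and its `scan`, `apply_fix`, and `validate` callables are hypothetical names, `scan` is assumed to return only actionable issues with the best candidate first, and the `max_iterations` budget is an added safety net that the skill text does not mention.

```python
from typing import Callable, List


def run_iterations(
    scan: Callable[[], List[str]],
    apply_fix: Callable[[str], None],
    validate: Callable[[], bool],
    max_iterations: int = 50,  # assumption: a safety budget, not in the skill text
) -> int:
    """Keep iterating while the latest full scan reports actionable issues.

    Returns the number of iterations that landed.
    """
    landed = 0
    for _ in range(max_iterations):
        actionable = scan()  # full rescan, not only the touched area
        if not actionable:
            return landed  # latest scan is clear, so reporting is allowed
        apply_fix(actionable[0])  # take the current best candidate
        if not validate():  # guardrails must stay green before the next pass
            raise RuntimeError("guardrails are red; fix or roll back before continuing")
        landed += 1
    raise RuntimeError("iteration budget exhausted with actionable issues remaining")
```

The point of the shape is that the exit test sits on the result of a fresh scan rather than on the list of touched files, which mirrors the change from "re-scan touched areas" to "re-scan the full codebase".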

  ### 9) Synchronize docs and constraints
@@ -134,7 +151,7 @@ After code and tests are complete:
  ## Hard Guardrails

  - Do not change intended business logic while refactoring.
- - Do not change macro architecture, framework choice, storage model, deployment model, or service boundaries unless the user explicitly approves that expanded scope.
+ - Do not change the system's macro architecture—its top-level runtime shape, deployment/runtime model, persistence model, major service boundaries, or overall operating logic—unless the user explicitly approves that expanded scope.
  - Do not use one-off scripts to rewrite product code.
  - Do not perform style-only churn that does not improve naming, reuse, modularity, observability, or test confidence.
  - Do not weaken tests to make refactors pass; update tests to stable invariants or fix the implementation defect.
@@ -142,9 +159,11 @@ After code and tests are complete:

  ## Completion Report

+ Only write this report after the latest scan confirms there are no known actionable in-scope quality issues remaining and the relevant test/guardrail suite is green. If any such issue remains, continue iterating instead of reporting completion.
+
  Return:

- 1. Passes completed and why they were ordered that way.
+ 1. Iterations completed and which execution directions were selected in each one.
  2. Key files changed and the quality issue each change resolved.
  3. Business behavior preservation evidence.
  4. Tests added or updated, including property/integration/E2E `N/A` reasons where relevant.
@@ -1,4 +1,4 @@
  interface:
  display_name: "Iterative Code Quality"
  short_description: "Refactor names, functions, modules, logs, and tests in repeated behavior-safe passes"
- default_prompt: "Use $iterative-code-quality to scan this repository, build an evidence-backed quality backlog, then iteratively clarify variable names, simplify or extract reusable functions, split code into single-responsibility modules, repair stale or missing logs, and add high-value unit/property/integration/E2E tests while preserving business behavior and macro architecture; after code and tests are complete, run $align-project-documents and $maintain-project-constraints to synchronize docs and AGENTS.md."
+ default_prompt: "Use $iterative-code-quality to keep scanning the full repository, treat naming, simplification, reusable extraction, module-boundary cleanup, logging alignment, and testing as selectable execution directions rather than a fixed sequence, and choose the highest-confidence, highest-leverage directions available at each moment; use small safe refactors to prepare the ground for larger later refactors, progress gradually from outside to inside, use tests or other reliable guardrails to justify aggressive cleanup without breaking intended behavior or the system's top-level runtime architecture, and do not write a completion report while any known in-scope actionable issue remains; only finish after the latest full-codebase scan is clear or remaining items are explicitly classified as blocked, unsafe, low-value, speculative, or approval-dependent, and the guarded test surface is green; then run $align-project-documents and $maintain-project-constraints to synchronize docs and AGENTS.md."
@@ -2,15 +2,18 @@

  ## Pass discipline

- Each pass must have:
+ Each iteration must have:

  - a concrete quality target,
  - a bounded file/symbol scope,
+ - one or more selected execution directions,
  - expected behavior-neutral outcome,
  - validation plan,
  - rollback point if evidence contradicts the change.

- Avoid starting a broad second pass before validating the first.
+ An iteration is not "one work type", and it also does not need to include every direction every time. Within the selected scope, choose the subset of directions that has the best current confidence and leverage: naming, simplification, module boundaries, logging, and/or tests.
+
+ Avoid starting a broad second iteration before validating the first, but do not stop after a validated iteration if known actionable quality issues remain anywhere in the in-scope codebase.

  ## Validation cadence
@@ -28,9 +31,13 @@ If validation fails:
  - keep regression coverage for real defects,
  - do not mask failures by weakening assertions.

- ## Re-scan after each pass
+ If validation passes and the guardrails meaningfully cover the changed behavior, do not keep a known quality issue in place purely because of subjective confidence concerns.
+
+ The final stopping condition also requires the relevant guarded test surface to be green; a partially red repository is not a completed refactor outcome.

- Inspect touched areas for:
+ ## Re-scan after each iteration
+
+ Inspect the full known quality backlog for:

  - new naming drift from moved or extracted concepts,
  - duplicated logic that remains after extraction,
@@ -39,19 +46,30 @@ Inspect touched areas for:
  - tests that cover only the happy path,
  - documentation or `AGENTS.md` drift.

+ Then choose the next execution directions with these priorities:
+
+ 1. highest confidence under current guardrails,
+ 2. strongest leverage for later deeper cleanup,
+ 3. lowest business-risk path toward broader system improvement.
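
Editor's sketch of one way to apply the three priorities above; it is not taken from the package. It assumes a strict lexicographic reading (confidence first, leverage as the tie-breaker, business risk last), which the text does not spell out. `Candidate`, `rank_candidates`, and the 0 to 1 scores are hypothetical.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical scored direction; scores are illustrative 0..1 estimates."""
    direction: str
    confidence: float     # priority 1: higher is better
    leverage: float       # priority 2: higher is better
    business_risk: float  # priority 3: lower is better


def rank_candidates(candidates: List[Candidate]) -> List[Candidate]:
    """Order candidates by confidence, then leverage, then lowest risk."""
    return sorted(
        candidates,
        key=lambda c: (-c.confidence, -c.leverage, c.business_risk),
    )
```

With `naming` and `logging` tied on confidence and leverage, the lower-risk one sorts first, and a high-leverage but lower-confidence `module boundaries` candidate waits until the guardrails around it improve. A weighted score would be an equally defensible reading.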
+
  ## Continue when

  Repeat the cycle when:

+ - any known in-scope actionable quality issue remains unresolved,
  - high-impact unclear names remain,
  - duplicated or hard-coded workflows still have safe extraction paths,
  - a module still mixes distinct responsibilities and can be split locally,
  - logs are still misleading or missing at critical decisions,
  - high-value business logic remains untested and is testable.

+ Do not produce a final completion report while any item in this section is true. Continue with the next bounded iteration instead.
+
+ Prefer gradual outside-in progress: boundary cleanup, naming clarity, and guardrail strengthening should often come before deeper internal rewrites because they make the deeper work safer later.
+
  ## Stop when

- Stop when remaining candidates are:
+ Stop only when there are no unresolved known in-scope actionable issues. Any remaining candidates must be explicitly classified as one of:

  - low-value style preference,
  - speculative without concrete evidence,
@@ -61,13 +79,18 @@ Stop when remaining candidates are:
  - blocked by unavailable credentials, unstable external systems, or missing documentation,
  - untestable with the current repository tooling and too risky to change safely.

+ If a remaining candidate cannot be placed in one of these categories, it is still an actionable gap and the agent must continue iterating rather than complete the task.
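
Editor's sketch of the stop rule as a gate; it is not part of the package. The five short labels follow the classification wording in the v3.1.3 changelog entry. `unresolved_gaps` and `may_stop` are hypothetical helper names, and combining the gate with a green test surface follows the validation section of this file.

```python
from typing import Dict, List, Optional

# Deferral labels borrowed from the v3.1.3 changelog wording (illustrative).
ALLOWED_DEFERRALS = {
    "blocked", "unsafe", "low-value", "speculative", "approval-dependent",
}


def unresolved_gaps(candidates: Dict[str, Optional[str]]) -> List[str]:
    """Return candidates whose label is missing or not an allowed deferral."""
    return sorted(
        name for name, label in candidates.items()
        if label not in ALLOWED_DEFERRALS
    )


def may_stop(candidates: Dict[str, Optional[str]], suite_green: bool) -> bool:
    """Stopping needs a green guarded test surface and no unclassified leftovers."""
    return suite_green and not unresolved_gaps(candidates)
```

An unlabelled leftover, or one labelled with something outside the set such as "later", counts as an actionable gap and keeps the loop running.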
+
  ## Completion evidence

  The final report should make the stopping point auditable:

  - passes completed,
+ - execution directions selected per iteration,
  - validation commands and outcomes,
+ - confirmation that the guarded test surface is green after the refactor,
  - tests added by risk category,
  - behavior-preservation evidence,
  - docs and constraints sync status,
- - deferred items with reason and required approval or dependency.
+ - proof that the latest scan found no known actionable in-scope quality issues,
+ - deferred items with reason and required approval, dependency, or safety constraint.
@@ -21,6 +21,26 @@ Define:
  - which interfaces must remain stable,
  - which tests prove behavior did not change.

+ ## Macro architecture boundary
+
+ For this skill, `macro architecture` means the whole system's runtime shape and operating model, such as:
+
+ - major subsystems and their top-level responsibilities,
+ - deployment/runtime boundaries,
+ - persistence model and data ownership model,
+ - inter-service or inter-process boundaries,
+ - the overall execution logic by which the system operates end to end.
+
+ The following do **not** count as macro-architecture changes by themselves:
+
+ - moving logic between ordinary internal modules,
+ - extracting helpers,
+ - splitting a mixed-responsibility file into narrower local modules,
+ - clarifying call boundaries inside one subsystem,
+ - replacing duplicated local control flow with a shared internal abstraction.
+
+ Treat those changes as ordinary refactoring work unless they also alter the top-level system model above.
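
Editor's sketch of the boundary above as a lookup; it is not part of the package. `ChangeKind`, `MACRO_KINDS`, and `needs_user_approval` are hypothetical names, and each enum member paraphrases one bullet from the two lists. The approval requirement comes from the skill's hard guardrails.

```python
from enum import Enum, auto
from typing import Iterable


class ChangeKind(Enum):
    # Ordinary refactoring work
    MOVE_BETWEEN_INTERNAL_MODULES = auto()
    EXTRACT_HELPER = auto()
    SPLIT_MIXED_FILE = auto()
    CLARIFY_CALL_BOUNDARY = auto()
    SHARED_INTERNAL_ABSTRACTION = auto()
    # Macro-architecture changes
    SUBSYSTEM_RESPONSIBILITY = auto()
    DEPLOYMENT_RUNTIME_BOUNDARY = auto()
    PERSISTENCE_OR_DATA_OWNERSHIP = auto()
    SERVICE_OR_PROCESS_BOUNDARY = auto()
    END_TO_END_EXECUTION_LOGIC = auto()


MACRO_KINDS = frozenset({
    ChangeKind.SUBSYSTEM_RESPONSIBILITY,
    ChangeKind.DEPLOYMENT_RUNTIME_BOUNDARY,
    ChangeKind.PERSISTENCE_OR_DATA_OWNERSHIP,
    ChangeKind.SERVICE_OR_PROCESS_BOUNDARY,
    ChangeKind.END_TO_END_EXECUTION_LOGIC,
})


def needs_user_approval(kinds: Iterable[ChangeKind]) -> bool:
    """A change set is macro-level if any part alters the top-level system model."""
    return any(kind in MACRO_KINDS for kind in kinds)
```

A helper extraction that also moves data ownership to another store is macro-level as a whole, which matches the "unless they also alter the top-level system model" clause.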
+
  ## Safe split patterns

  - Move pure domain logic into a domain-owned helper module.
@@ -69,3 +69,5 @@ Before and after simplification, verify:
  - public API and CLI behavior remain stable,
  - log fields remain compatible unless stale names were intentionally corrected,
  - existing tests still pass and new tests cover extracted rules.
+
+ If these checks can be enforced by existing or newly added tests, do not treat subjective confidence alone as a reason to avoid the simplification.
@@ -6,6 +6,10 @@ Choose tests from the risk inventory, not from a generic coverage target.

  For every non-trivial pass, ask what could regress silently if the cleanup were wrong.

+ Use the resulting guardrails aggressively: when tests or equivalent verification can prove behavior preservation, they should unlock bolder refactors rather than merely justify small cosmetic edits.
+
+ The intended end state is not merely "some tests passed for touched files". The refactor is complete only when the relevant guarded test surface for the repository remains green after the cleanup.
+
  ## Unit tests

  Use for:
@@ -76,3 +80,4 @@ Consider:
  - Preserve failing seeds or examples from property-based tests.
  - Do not weaken existing tests to fit the refactor.
  - If old tests asserted implementation details, rewrite them around stable behavior while preserving the business invariant.
+ - Once stable guardrails exist, do not refuse a maintainability-improving refactor purely because confidence feels lower than ideal; let the guardrails decide.
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@laitszkin/apollo-toolkit",
- "version": "3.1.2",
+ "version": "3.1.4",
  "description": "Apollo Toolkit npm installer for managed skill copying across Codex, OpenClaw, and Trae.",
  "license": "MIT",
  "author": "LaiTszKin",