@laitszkin/apollo-toolkit 3.12.1 → 3.13.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +14 -0
- package/analyse-app-logs/scripts/__pycache__/filter_logs_by_time.cpython-312.pyc +0 -0
- package/analyse-app-logs/scripts/__pycache__/log_cli_utils.cpython-312.pyc +0 -0
- package/analyse-app-logs/scripts/__pycache__/search_logs.cpython-312.pyc +0 -0
- package/archive-specs/SKILL.md +0 -6
- package/commit-and-push/SKILL.md +3 -9
- package/docs-to-voice/scripts/__pycache__/docs_to_voice.cpython-312.pyc +0 -0
- package/generate-spec/SKILL.md +27 -9
- package/generate-spec/references/definition.md +12 -0
- package/generate-spec/scripts/__pycache__/create-specscpython-312.pyc +0 -0
- package/init-project-html/SKILL.md +18 -22
- package/init-project-html/references/definition.md +12 -0
- package/katex/scripts/__pycache__/render_katex.cpython-312.pyc +0 -0
- package/maintain-project-constraints/SKILL.md +11 -19
- package/merge-changes-from-local-branches/SKILL.md +11 -24
- package/open-github-issue/scripts/__pycache__/open_github_issue.cpython-312.pyc +0 -0
- package/package.json +1 -1
- package/read-github-issue/scripts/__pycache__/find_issues.cpython-312.pyc +0 -0
- package/read-github-issue/scripts/__pycache__/read_issue.cpython-312.pyc +0 -0
- package/resolve-review-comments/scripts/__pycache__/review_threads.cpython-312.pyc +0 -0
- package/solve-issues-found-during-review/SKILL.md +1 -1
- package/systematic-debug/SKILL.md +11 -38
- package/test-case-strategy/SKILL.md +10 -37
- package/text-to-short-video/scripts/__pycache__/enforce_video_aspect_ratio.cpython-312.pyc +0 -0
- package/update-project-html/SKILL.md +19 -24
- package/update-project-html/references/definition.md +12 -0
- package/version-release/SKILL.md +16 -37
- package/iterative-code-performance/LICENSE +0 -21
- package/iterative-code-performance/README.md +0 -34
- package/iterative-code-performance/SKILL.md +0 -116
- package/iterative-code-performance/agents/openai.yaml +0 -4
- package/iterative-code-performance/references/algorithmic-complexity.md +0 -58
- package/iterative-code-performance/references/allocation-and-hot-loops.md +0 -53
- package/iterative-code-performance/references/caching-and-memoization.md +0 -64
- package/iterative-code-performance/references/concurrency-and-pipelines.md +0 -61
- package/iterative-code-performance/references/coupled-hot-path-strategy.md +0 -78
- package/iterative-code-performance/references/io-batching-and-queries.md +0 -55
- package/iterative-code-performance/references/iteration-gates.md +0 -133
- package/iterative-code-performance/references/job-selection.md +0 -92
- package/iterative-code-performance/references/measurement-and-benchmarking.md +0 -78
- package/iterative-code-performance/references/module-coverage.md +0 -133
- package/iterative-code-performance/references/repository-scan.md +0 -69
- package/iterative-code-quality/LICENSE +0 -21
- package/iterative-code-quality/README.md +0 -45
- package/iterative-code-quality/SKILL.md +0 -112
- package/iterative-code-quality/agents/openai.yaml +0 -4
- package/iterative-code-quality/references/coupled-core-file-strategy.md +0 -73
- package/iterative-code-quality/references/iteration-gates.md +0 -127
- package/iterative-code-quality/references/job-selection.md +0 -78
- package/iterative-code-quality/references/logging-alignment.md +0 -67
- package/iterative-code-quality/references/module-boundaries.md +0 -83
- package/iterative-code-quality/references/module-coverage.md +0 -126
- package/iterative-code-quality/references/naming-and-simplification.md +0 -73
- package/iterative-code-quality/references/repository-scan.md +0 -65
- package/iterative-code-quality/references/testing-strategy.md +0 -95
- package/merge-conflict-resolver/SKILL.md +0 -46
- package/merge-conflict-resolver/agents/openai.yaml +0 -5
- package/spec-to-project-html/SKILL.md +0 -42
- package/spec-to-project-html/agents/openai.yaml +0 -11
- package/spec-to-project-html/references/TEMPLATE_SPEC.md +0 -113
- package/submission-readiness-check/SKILL.md +0 -39
- package/submission-readiness-check/agents/openai.yaml +0 -4
@@ -1,133 +0,0 @@
-# Performance Iteration Gates And Stopping Criteria
-
-## Pass discipline
-
-Each iteration must have:
-
-- a selected module or bounded module cluster,
-- a concrete performance target,
-- an explicit record of which performance job lenses were checked during the deep read,
-- a bounded file/symbol scope,
-- one or more selected execution directions,
-- baseline evidence or a reason measurement is unavailable,
-- a confidence assessment covering the agent's own ability to complete the optimization, the task's inherent difficulty, objective guardrail strength, benchmark quality, and rollback or repair paths,
-- expected behavior-preserving outcome,
-- validation plan,
-- rollback point if evidence contradicts the change.
-
-An iteration is not "one work type", and it also does not need to include every direction every time. Within the selected scope, choose the subset of directions that has the best current evidence and leverage: measurement, complexity, IO, caching, allocation, concurrency, and/or staged unlock work.
-
-Confidence is not a synonym for "easy". Assess whether the agent has enough understanding, skill, workload context, tests, benchmarks, validation commands, and recovery path to complete the optimization safely. A hard task can still be high-confidence when strong guardrails, characterization coverage, and clear rollback let the agent repair mistakes by making the guarded behavior green again.
-
-Avoid starting a broad second iteration before validating the first, but do not stop after a validated iteration if known actionable performance issues remain anywhere in the in-scope codebase.
-
-Do not stop after a validated iteration if any in-scope module remains unvisited in the module coverage ledger.
-
-## Validation cadence
-
-Run validation from narrow to broad:
-
-1. Formatter or type check for touched files when available.
-2. Unit tests for touched helpers and modules.
-3. Benchmarks, profiler runs, query-count checks, or operation-count checks for the optimized path when available.
-4. Integration tests for affected chains.
-5. Broader suite or build once multiple passes interact.
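The narrow-to-broad order above can be sketched as a small runner that stops at the first failing stage, so a cheap check fails before an expensive one starts. This is a minimal illustration, not part of the removed skill: `run_narrow_to_broad`, `command_check`, and the stage names are hypothetical.

```python
import subprocess
from typing import Callable, List, Optional, Tuple

# A stage is a (name, check) pair; the check returns True when it passes.
Stage = Tuple[str, Callable[[], bool]]


def run_narrow_to_broad(stages: List[Stage]) -> Optional[str]:
    """Run stages in the given order and stop at the first failure.

    Returns the name of the failing stage, or None when every stage passed.
    """
    for name, check in stages:
        if not check():
            return name
    return None


def command_check(argv: List[str]) -> Callable[[], bool]:
    """Wrap a command line as a check that passes on exit code 0.

    Raises FileNotFoundError if the executable does not exist.
    """
    def check() -> bool:
        return subprocess.run(argv, check=False).returncode == 0

    return check
```

In use, the stage list would be ordered from the formatter or type check up to the broader suite, with each entry built from whatever commands the repository actually provides.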
-
-If validation fails:
-
-- determine whether the failure is pre-existing, stale test expectation, flaky benchmark, test isolation issue, or real product bug,
-- fix the true owner,
-- keep regression coverage for real defects,
-- do not mask failures by weakening assertions or widening benchmark budgets without evidence.
-
-If validation passes and the performance plus correctness guardrails meaningfully cover the changed behavior, do not keep a known bottleneck in place purely because of subjective confidence concerns. Reassess whether the agent has enough capability and objective support to proceed; if yes, continue, and if no, choose the smallest measurement, benchmark, or unlock step that would make the next optimization credible.
-
-The final stopping condition also requires the relevant guarded test surface to be green; a partially red repository is not a completed optimization outcome.
-
-## Re-scan after each iteration
-
-Inspect the full known performance backlog for:
-
-- modules that are still unvisited or only shallowly read,
-- modules that were read but not yet checked against every available performance job lens,
-- new repeated work after moved or extracted concepts,
-- remaining N+1 calls, serial round trips, or excessive query shapes,
-- caches that are stale, unbounded, or unnecessary,
-- hot loops that still allocate avoidable objects,
-- concurrency changes that need backpressure or max-in-flight proof,
-- logs, metrics, traces, or benchmarks that now describe stale names or paths,
-- documentation or `AGENTS.md/CLAUDE.md` drift.
-
-Then choose the next execution directions with these priorities:
-
-1. strongest bottleneck evidence,
-2. largest user-visible or high-frequency impact,
-3. highest combined confidence from agent capability, workload understanding, correctness guardrails, benchmark quality, and recovery path,
-4. strongest leverage for later deeper optimization,
-5. lowest business-risk path toward broader system improvement.
-
-Use `references/job-selection.md` to convert those priorities into a concrete next-job choice.
-
-## Stage-gate after each iteration
-
-After every validated iteration, run a deliberate full-codebase decision pass:
-
-1. Re-scan the repository and refresh the known performance backlog.
-2. Refresh the module coverage ledger and identify unvisited in-scope modules.
-3. Ask whether any known in-scope actionable bottleneck still remains.
-4. If yes, decide whether it should be addressed in the very next iteration or whether measurement or unlock work is needed first.
-5. If the obstacle is a large, coupled, or central hot path, do not stop there; switch to staged unlock work and continue.
-6. Only declare the repository iteration-complete when the re-scan shows no remaining actionable in-scope bottleneck and no unvisited in-scope module except items that are explicitly deferred or excluded under the allowed stop categories.
-
-This stage-gate is mandatory. A validated local optimization does not by itself mean the repository is done.
-
-## Continue when
-
-Repeat the cycle when:
-
-- any known in-scope actionable performance issue remains unresolved,
-- any in-scope module remains unvisited,
-- measured slow paths remain,
-- clear algorithmic waste remains,
-- avoidable IO, query, or external-service round trips remain,
-- unsafe or low-value caches need removal or replacement,
-- allocation churn or hot-loop repeated work remains in a material path,
-- concurrency, backpressure, or batching gaps remain,
-- benchmarks or profiling are missing for a high-risk optimization that is otherwise actionable.
-
-Do not produce a final completion report while any item in this section is true. Continue with the next bounded iteration instead.
-
-## Stop when
-
-Stop only when there are no unresolved known in-scope actionable performance issues. Any remaining candidates must be explicitly classified as one of:
-
-- low-value micro-optimization,
-- speculative without concrete evidence,
-- production-measurement-only with no safe local proxy,
-- public contract migrations,
-- macro-architecture changes,
-- product behavior changes needing user approval,
-- blocked by unavailable credentials, unstable external systems, or missing documentation,
-- untestable with the current repository tooling and too risky to change safely.
-
-If a remaining candidate cannot be placed in one of these categories, it is still an actionable gap and the agent must continue iterating rather than complete the task.
-
-If an in-scope module has not received a performance deep-read iteration, it is still an actionable coverage gap even when the already-read modules look fast.
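One way to picture the stop rule above is a check that returns every reason the loop must continue; stopping is allowed only when the list is empty. This is a hedged sketch: the category labels are shortened, illustrative stand-ins for the allowed stop categories, and `Candidate` and `blocking_reasons` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional

# Illustrative short labels for the allowed stop categories (assumed naming).
ALLOWED_STOP_CATEGORIES = frozenset({
    "low-value-micro-optimization",
    "speculative",
    "production-measurement-only",
    "public-contract-migration",
    "macro-architecture-change",
    "needs-user-approval",
    "blocked",
    "untestable-and-risky",
})


@dataclass(frozen=True)
class Candidate:
    name: str
    category: Optional[str] = None  # None: unclassified, so still actionable


def blocking_reasons(
    candidates: List[Candidate],
    unvisited_modules: List[str],
    guarded_tests_green: bool,
) -> List[str]:
    """Return the reasons that forbid stopping; an empty list means stop is allowed."""
    reasons = []
    for candidate in candidates:
        # Anything outside the allowed categories counts as an actionable gap.
        if candidate.category not in ALLOWED_STOP_CATEGORIES:
            reasons.append(f"actionable candidate: {candidate.name}")
    for module in unvisited_modules:
        reasons.append(f"unvisited module: {module}")
    if not guarded_tests_green:
        reasons.append("guarded test surface is not green")
    return reasons
```

Returning reasons rather than a bare boolean keeps the stopping point auditable, since each remaining blocker is named.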
-
-## Completion evidence
-
-The final report should make the stopping point auditable:
-
-- passes completed,
-- execution directions selected per iteration,
-- module or module cluster covered per iteration,
-- performance job lenses checked per iteration,
-- final module coverage ledger,
-- stage-gate verdict after each full-codebase re-scan,
-- validation commands and outcomes,
-- confirmation that the guarded test surface is green after the optimization,
-- speedup, throughput, latency, CPU, memory, allocation, query-count, or complexity evidence,
-- behavior-preservation evidence,
-- docs and constraints sync status,
-- proof that the latest scan found no known actionable in-scope performance issues,
-- deferred items with reason and required approval, dependency, production data, or safety constraint.
@@ -1,92 +0,0 @@
-# Performance Job Selection Guide
-
-## Purpose
-
-Help the agent choose the next execution direction after each full-codebase performance re-scan.
-
-These are job-selection rules for Step 2 of the main skill loop. They are not workflow steps.
-
-The goal is not to force one permanent order. The goal is to choose the next job that most safely improves the selected module or module cluster and unlocks later work.
-
-Before choosing, the agent should first scan the selected module through every available performance job lens. Job selection happens after that scan; it is not a substitute for that scan.
-
-## Available jobs
-
-- measurement / benchmarking
-- algorithmic complexity / repeated-work cleanup
-- IO batching / query shaping
-- caching / memoization
-- allocation / hot-loop cleanup
-- concurrency / pipeline tuning
-- staged unlock work
-
-## Choose `measurement / benchmarking` when
-
-- the slow path is suspected but not proven,
-- multiple candidate bottlenecks compete and prioritization would otherwise be guesswork,
-- an optimization could regress correctness or user-visible latency without a baseline,
-- production symptoms exist but local reproduction or profiling is missing.
-
-## Choose `algorithmic complexity / repeated-work cleanup` when
-
-- nested loops, repeated scans, repeated sorts, or repeated filters dominate the path,
-- the same derived value is recomputed for every item or request,
-- a more appropriate data structure would reduce asymptotic or constant-factor cost,
-- the complexity change can be proven without changing business behavior.
-
-## Choose `IO batching / query shaping` when
-
-- N+1 database, network, filesystem, or external API calls exist,
-- serial round trips can be safely batched or preloaded,
-- query predicates, projections, pagination, or indexes are mismatched to real access patterns,
-- external calls lack timeout, retry, or result-shape handling that affects throughput.
-
-## Choose `caching / memoization` when
-
-- the same expensive pure or owner-scoped computation repeats,
-- cache ownership, invalidation, size bounds, and failure behavior are clear,
-- stale data cannot violate business rules,
-- a simpler repeated-work or data-structure fix is insufficient.
-
-## Choose `allocation / hot-loop cleanup` when
-
-- tight loops allocate avoidable objects, strings, buffers, regexes, or closures,
-- repeated serialization, parsing, cloning, copying, or formatting appears in a hot path,
-- memory pressure or garbage collection is part of the observed slowdown,
-- changes can preserve readability or are justified by strong hot-path evidence.
-
-## Choose `concurrency / pipeline tuning` when
-
-- safe independent work is unnecessarily serial,
-- current parallelism is unbounded or overloads downstream resources,
-- queues, workers, streams, or async tasks lack backpressure,
-- batching or bounded concurrency would improve throughput without changing ordering guarantees.
-
-## Choose `staged unlock work` when
-
-- the file feels too central or too coupled for direct optimization,
-- no safe full hot-path rewrite exists yet, but a preparatory step does,
-- you can reduce risk through measurement hooks, seam extraction, characterization tests, type extraction, data-shape clarification, or side-effect isolation,
-- the best next move is to make a future optimization cheaper rather than solve the whole area now.
-
-## Tie-breakers
-
-If multiple jobs are plausible, prefer the one that:
-
-1. addresses the highest-evidence bottleneck,
-2. improves the most user-visible or high-frequency path,
-3. increases safety for the next iteration,
-4. removes the strongest blocker to a deeper future optimization,
-5. helps an unvisited module reach performance deep-read coverage,
-6. matches the agent's self-assessed ability to understand, execute, benchmark, and repair the change under current evidence,
-7. preserves behavior with the clearest available guardrails.
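If the tie-breakers above are read as a strict priority order (an assumption; the text only says "prefer"), they amount to a lexicographic comparison. The sketch below scores each plausible job on the seven criteria and picks the maximum tuple. `JobOption`, `pick_job`, and the integer scores are hypothetical illustrations.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class JobOption:
    """One plausible job, scored per tie-breaker (higher is better)."""
    name: str
    evidence: int = 0           # 1. strength of bottleneck evidence
    path_impact: int = 0        # 2. user-visible or high-frequency impact
    safety_gain: int = 0        # 3. safety added for the next iteration
    unblocks: int = 0           # 4. removes a blocker to deeper optimization
    coverage_gain: int = 0      # 5. helps an unvisited module reach coverage
    capability_fit: int = 0     # 6. fit with the agent's self-assessed ability
    guardrail_clarity: int = 0  # 7. clarity of behavior-preserving guardrails


def pick_job(options: List[JobOption]) -> JobOption:
    """Pick the best option by comparing scores in tie-breaker order."""
    def key(option: JobOption):
        # Tuples compare element by element, so earlier criteria dominate.
        return (
            option.evidence,
            option.path_impact,
            option.safety_gain,
            option.unblocks,
            option.coverage_gain,
            option.capability_fit,
            option.guardrail_clarity,
        )

    return max(options, key=key)
```

A later criterion only matters when every earlier one is tied, which is what makes this a tie-breaker order rather than a weighted sum.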
-
-## Hard rule
-
-If performance evidence is too weak, `measurement / benchmarking` should usually win before code changes.
-
-If a high-risk hot path lacks enough correctness guardrails, benchmark or characterization guardrail work should usually win before a deeper optimization.
-
-If the area is difficult but the agent can explain the workload, behavior, affected contracts, rollback path, and available tests or benchmarks clearly, do not downgrade confidence just because the optimization is non-trivial. Strong guardrails mean accidental breakage should be repaired by returning the test and benchmark surface to green, not avoided by leaving an actionable bottleneck in place.
-
-If any in-scope module remains unvisited, choose jobs that help the next highest-evidence or easiest useful unvisited module become deeply read, improved, or validated-clear before spending another round on already-familiar areas.
@@ -1,78 +0,0 @@
-# Measurement And Benchmarking
-
-## Principle
-
-Optimize from evidence. A performance change should have at least one of:
-
-- production latency, throughput, CPU, memory, queue, or query evidence,
-- profiler output,
-- repeatable benchmark baseline,
-- test runtime profile,
-- log or trace timings,
-- clear algorithmic complexity proof tied to a plausible workload.
-
-If none exists, measurement is usually the next job.
-
-Measurement also informs confidence, but it is not the only input. The agent must assess its own ability to understand and complete the optimization, then combine that self-assessment with task difficulty, benchmark quality, correctness tests, rollback options, and repair paths. Strong tests and repeatable benchmarks should make difficult changes more actionable because failures can be diagnosed and driven back to green.
-
-## Baseline rules
-
-Before changing a hot path, record:
-
-- command, scenario, fixture, data size, seed, and environment,
-- current timing, throughput, allocation, query count, memory, or operation count,
-- variance or repeated-run notes when possible,
-- correctness oracle used with the benchmark,
-- reason if the path can only be measured in production.
-
-Do not compare a cold-cache baseline with a warm-cache after result unless that is the intended user-visible scenario.
-
-## Benchmark selection
-
-Use the cheapest reliable benchmark that proves the risk:
-
-- unit microbenchmarks for pure hot helpers,
-- integration benchmarks for query, serialization, or multi-module orchestration paths,
-- command or request benchmarks for user-visible entrypoints,
-- load tests only when concurrency, backpressure, or throughput is the actual risk,
-- profiler snapshots when the bottleneck location is unknown.
-
-Avoid expensive load tests when a deterministic benchmark or integration test proves the same performance issue.
-
-## Before/after comparisons
-
-A useful comparison names:
-
-- baseline command and result,
-- after command and result,
-- data shape and scale,
-- correctness validation,
-- variance or caveat,
-- whether the improvement is latency, throughput, CPU, memory, allocation, query count, or algorithmic complexity.
-
-If exact numbers are unstable, report operation counts, query counts, asymptotic complexity, or profiler rank change instead of pretending precision exists.
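A before/after comparison along these lines might look like the sketch below: it checks the candidate against the baseline as a correctness oracle, times both on the same input over repeated runs, and reports medians plus spread instead of a single number. `slow_unique`, `fast_unique`, and `compare` are hypothetical examples, not code from the package.

```python
import statistics
import time
from typing import Callable, Dict, List, Sequence


def slow_unique(items: Sequence[int]) -> List[int]:
    """Baseline: order-preserving dedupe with a quadratic membership test."""
    out: List[int] = []
    for item in items:
        if item not in out:
            out.append(item)
    return out


def fast_unique(items: Sequence[int]) -> List[int]:
    """Candidate: same result using dict insertion order."""
    return list(dict.fromkeys(items))


def time_runs(fn: Callable, arg: Sequence[int], repeats: int) -> List[float]:
    """Time repeated calls on the same input with a monotonic clock."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(arg)
        samples.append(time.perf_counter() - start)
    return samples


def compare(baseline: Callable, candidate: Callable,
            arg: Sequence[int], repeats: int = 5) -> Dict[str, float]:
    """Validate behavior first, then report medians and run-to-run spread."""
    if candidate(arg) != baseline(arg):
        raise ValueError("candidate changed behavior; timings are meaningless")
    before = time_runs(baseline, arg, repeats)
    after = time_runs(candidate, arg, repeats)
    return {
        "data_size": len(arg),
        "before_median_s": statistics.median(before),
        "after_median_s": statistics.median(after),
        "before_spread_s": max(before) - min(before),
        "after_spread_s": max(after) - min(after),
    }
```

The spread fields are there so a reader can judge whether the difference in medians exceeds the noise; when it does not, the counts or complexity argument should carry the claim instead.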
-
-## Guardrail design
-
-Performance guardrails should fail on meaningful regressions, not noise.
-
-Prefer:
-
-- deterministic operation or query counts,
-- bounded latency budgets with generous margins only when the environment is stable,
-- regression tests for duplicate work removal,
-- benchmark scripts documented for local use,
-- assertions on cache invalidation behavior,
-- profiler notes for manual verification when automated thresholds are unreliable.
-
-Do not add flaky timing thresholds to CI when the repository has no stable benchmark environment.
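A deterministic count-based guardrail can be sketched as follows: instead of asserting on wall-clock time, the test asserts how many round trips the optimized path makes. `CountingStore` is a hypothetical in-memory stand-in for a real data source, and `load_names` is an illustrative function under test.

```python
from typing import Dict, Iterable, List


class CountingStore:
    """Hypothetical in-memory stand-in for a database that counts round trips."""

    def __init__(self, rows: Dict[int, str]) -> None:
        self._rows = rows
        self.calls = 0

    def get_many(self, ids: Iterable[int]) -> Dict[int, str]:
        self.calls += 1  # one call models one round trip
        return {i: self._rows[i] for i in ids if i in self._rows}


def load_names(store: CountingStore, ids: List[int]) -> List[str]:
    """Fetch all names in one batched round trip, preserving request order."""
    found = store.get_many(ids)
    return [found[i] for i in ids if i in found]


def test_load_names_uses_one_round_trip() -> None:
    store = CountingStore({1: "a", 2: "b", 3: "c"})
    # Correctness oracle and performance guardrail in the same test.
    assert load_names(store, [3, 1, 2]) == ["c", "a", "b"]
    assert store.calls == 1
```

Because the assertion is on a count, the test gives the same verdict on a loaded CI runner and a fast laptop, and it fails if someone reintroduces a per-item lookup.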
-
-## Production-only measurement
-
-When a bottleneck requires production data or credentials:
-
-- capture the best local proxy evidence available,
-- document the missing data source,
-- avoid speculative rewrites that cannot be validated,
-- add safe instrumentation or benchmark hooks if approved and useful,
-- classify the remaining item as production-measurement-only if no safe local action remains.
@@ -1,133 +0,0 @@
-# Module Coverage And Performance Deep-Read Iterations
-
-## Purpose
-
-Prevent the agent from repeatedly optimizing only familiar hot paths while untouched modules remain unexamined.
-
-Use this reference in Step 1 to build the module inventory and in Step 2 to choose which module or module cluster receives the next performance deep-read iteration.
-
-Deep-read here does not mean generic reading. It means scanning the module through each available performance job lens so the agent can identify whether measurement, algorithmic complexity, repeated-work removal, IO batching, caching, allocation cleanup, concurrency work, or staged unlock work is justified.
-
-## Module inventory
-
-List every meaningful in-scope module before completion. A module may be:
-
-- a package, app, service, route group, command group, worker, or library,
-- a domain folder with a clear responsibility,
-- a runtime entrypoint plus its owned helpers,
-- a persistence/query, external-integration, queue, cache, or reporting subsystem,
-- a testable subsystem with stable callers and contracts.
-
-Record each module with:
-
-- module name and path roots,
-- primary responsibility,
-- entrypoints and public interfaces,
-- key callers and callees,
-- expected workload shape and frequency,
-- tests, benchmarks, and performance guardrails,
-- logs, metrics, traces, or profiling surfaces,
-- persistence, network, filesystem, or external API contracts,
-- risk level and estimated ease,
-- current coverage status.
-
-Exclude generated, vendored, lock, build-output, snapshot, fixture-only, or explicitly out-of-scope areas only with evidence.
-
-## Coverage ledger statuses
-
-Use simple statuses so stopping conditions are auditable:
-
-- `unvisited`: inventoried but not deeply read yet.
-- `deep-read`: callers, callees, tests, logs, benchmarks, contracts, workload shape, core files, and all available performance job lenses were inspected with enough context to judge performance.
-- `optimized`: at least one behavior-safe performance improvement landed for this module.
-- `validated-clear`: deep read found no actionable in-scope performance issue worth changing now.
-- `deferred`: an issue exists but is blocked, unsafe, speculative, approval-dependent, production-measurement-only, or requires macro-architecture/product scope.
-- `excluded`: not human-maintained source or outside the user's requested scope.
-
-Completion is not allowed while any in-scope module remains `unvisited`.
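The ledger statuses above map naturally onto an enumeration, with the completion rule as a single check over the ledger. This is a minimal sketch; the status values come from the list above, while `Status`, `unvisited_modules`, `completion_allowed`, and the module names in any ledger are hypothetical.

```python
from enum import Enum
from typing import Dict, List


class Status(Enum):
    UNVISITED = "unvisited"
    DEEP_READ = "deep-read"
    OPTIMIZED = "optimized"
    VALIDATED_CLEAR = "validated-clear"
    DEFERRED = "deferred"
    EXCLUDED = "excluded"


def unvisited_modules(ledger: Dict[str, Status]) -> List[str]:
    """Return the names of modules still marked unvisited, sorted for stable reports."""
    return sorted(name for name, status in ledger.items()
                  if status is Status.UNVISITED)


def completion_allowed(ledger: Dict[str, Status]) -> bool:
    """Completion is blocked while any module in the ledger is unvisited."""
    return not unvisited_modules(ledger)
```

Parsing a status from text with `Status("deep-read")` raises `ValueError` on an unknown label, which keeps ad hoc statuses out of the ledger.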
-
-## Easy-first and evidence-first ordering
-
-Start with the easiest useful modules when that reduces risk:
-
-- small surface area,
-- clear ownership,
-- local tests, cheap benchmarks, or profiling hooks,
-- limited side effects,
-- low public API or persistence risk,
-- likely to clarify workload shape, tests, benchmarks, caching seams, batching seams, or data structures used by harder modules.
-
-Prefer measured high-impact bottlenecks when they exist, even if they are not the easiest module.
-
-Do not confuse easy-first with low-value micro-optimization. The chosen module should either resolve real performance issues or create context/guardrails that make later hot paths safer.
-
-## Deep-read requirements
-
-A module iteration is not deep-read until the agent inspects:
-
-- module entrypoints and public interfaces,
-- internal core files and responsibility boundaries,
-- key callers and downstream callees,
-- workload size, frequency, and data-shape assumptions,
-- tests, fixtures, mocks, benchmark commands, and validation commands,
-- logs, metrics, tracing, profiler hooks, and error messages,
-- configuration, persistence, query, cache, concurrency, and external-service contracts when relevant,
-- known TODOs, comments, or docs that describe performance behavior.
-
-It also must inspect the module through each available performance job lens:
-
-- `measurement / benchmarking`: is there enough baseline evidence, or is measurement the next unlock?
-- `algorithmic complexity / repeated work`: are there avoidable scans, sorts, conversions, or duplicated computations?
-- `IO batching / queries`: are there N+1 calls, excessive round trips, poor query shapes, or serial external work?
-- `caching / memoization`: would caching help, and are ownership plus invalidation safe?
-- `allocation / hot loops`: are tight loops creating avoidable objects, strings, parsing, or serialization?
-- `concurrency / pipelines`: is work too serial, too parallel, unbounded, or missing backpressure?
-- `staged unlock work`: if the module is too coupled for direct optimization, what is the next smaller unlock step?
-
-Do not mark a module `validated-clear` from a shallow file skim.
-Do not mark a module `validated-clear` until every available performance job lens has been checked and classified as one of: actionable now, measure-first, unlock-first, deferred, excluded, or no meaningful issue found.
-
-## Choosing the next module
-
-After every iteration:
-
-1. Re-scan the module ledger.
-2. Prefer an `unvisited` module unless a just-touched module must be stabilized before moving on.
-3. Choose the highest-evidence hot module, or the easiest useful `unvisited` module that can be deeply read and improved or validated now.
-4. Scan that module through every available performance job lens before deciding what "this round" means.
-5. If the next module is high-risk and under-guarded, choose benchmark or characterization guardrails first.
-6. If the next module is too coupled for direct optimization, choose staged unlock work rather than skipping it.
-7. Return to the full-codebase scan after validation and update the ledger.
-
-Revisiting a familiar module is valid only when:
-
-- it blocks safe deep reading of an unvisited module,
-- a previous optimization created follow-up risk that must be stabilized,
-- validation exposed a real defect, stale benchmark, or stale contract,
-- cross-module optimization requires touching it together with the next module.
-
-## Module cluster iterations
-
-One iteration may cover a small cluster of modules when they share one hot path or invariant, such as:
-
-- a command and its parser,
-- a route and its service,
-- a domain module and its query adapter,
-- an integration wrapper and its retry or batching helper,
-- a worker and its queue processor.
-
-Keep clusters bounded. Do not use clustering to claim full-repository coverage without deep context.
-
-## Stage-gate questions
-
-At the end of each iteration, answer:
-
-- Which module or module cluster was deeply read?
-- Which performance job lenses were checked, and which jobs were selected and why?
-- What bottleneck was fixed, or why is the module validated-clear?
-- Which guardrails prove behavior was preserved?
-- What baseline and after evidence exists?
-- Which modules remain `unvisited`?
-- Which module is the next highest-evidence or easiest useful target?
-
-If any in-scope module remains `unvisited`, the correct action is to return to Step 1, not to finish.
@@ -1,69 +0,0 @@
-# Repository Performance Scan And Backlog Selection
-
-## Purpose
-
-Build a factual performance map before changing code, then choose the highest-value optimizations while tracking module-by-module performance deep-read coverage.
-
-## Required scan
-
-- Read `AGENTS.md/CLAUDE.md`, `README*`, project docs, manifests, task runners, CI configs, benchmark setup, profiler setup, and test setup.
-- List entrypoints: CLI commands, servers, workers, jobs, frontend routes, scripts, libraries, public packages, and scheduled tasks.
-- Identify core domain modules, persistence/query boundaries, external integrations, serialization/parsing paths, logging utilities, queues, caches, and test helpers.
-- Create a module inventory and coverage ledger using `references/module-coverage.md`.
-- For each module, scan through the available performance job lenses instead of treating scan as generic code reading.
-- Inspect current git state before editing so unrelated user changes are not overwritten.
-- Identify generated, vendored, lock, snapshot, build-output, fixture, compiled, and minified files; exclude them unless they are human-maintained source.
-
-## Performance backlog signals
-
-Prioritize files or functions with:
-
-- measured slow requests, commands, jobs, tests, startup, builds, or user-visible interactions,
-- high fan-in, high loop counts, high request frequency, or repeated invocation in long-running workers,
-- avoidable nested loops, repeated scans, repeated sorting, repeated parsing, or repeated conversions,
-- N+1 database, network, filesystem, or external API calls,
-- unbounded concurrency, serial work that can be safely batched, or pipelines without backpressure,
-- repeated serialization/deserialization or large intermediate objects,
-- allocation churn, excessive cloning/copying, or memory-pressure paths,
-- caches with missing invalidation, excessive retention, or low hit value,
-- logs, metrics, traces, or benchmarks that hide where time is spent.
-
-## Evidence to capture
-
-For each candidate record:
-
-- file path and symbol name,
-- owning module or module cluster,
-- job lens that exposed the issue,
-- performance evidence: benchmark, trace, profiler output, log timing, production symptom, or complexity analysis,
-- expected speed, throughput, allocation, IO, or complexity improvement,
-- correctness risks and behavior invariants,
-- tests, benchmarks, or validations needed to prove safety,
-- reason to defer if the candidate requires product, architecture, operational, or production-data approval.
-
-## Exclusion rules
-
-Do not optimize:
-
-- third-party, generated, compiled, or minified artifacts,
-- snapshots where churn would hide signal,
-- code the user marked as actively edited elsewhere,
-- public schema/API names or data contracts that require migration planning,
-- cold paths where the optimization makes code harder to maintain without evidence of value,
-- areas that cannot be validated and are not causing a clear performance risk.
-
-## Backlog scoring
-
-Prefer a small set of high-confidence improvements over an exhaustive sweep.
-
-Score each candidate by:
-
-1. **Impact**: latency, throughput, CPU, memory, IO, user criticality, and call frequency.
-2. **Evidence**: measurement quality or clear complexity proof.
-3. **Correctness confidence**: ability to preserve business behavior.
-4. **Validation**: ability to benchmark, test, or otherwise prove equivalence.
-5. **Blast radius**: number of modules, public contracts, persistence paths, and operational assumptions affected.
-
-Start with high-impact, high-evidence, low-blast-radius items. Escalate broad changes only when smaller passes cannot resolve the root performance problem.
-
-Do not finish from backlog scoring alone. Completion also requires the module coverage ledger to show that every in-scope module has been deeply read and either improved, validated-clear, deferred, or excluded with evidence.
@@ -1,21 +0,0 @@
-MIT License
-
-Copyright (c) 2026 LaiTszKin
-
-Permission is hereby granted, free of charge, to any person obtaining a copy
-of this software and associated documentation files (the "Software"), to deal
-in the Software without restriction, including without limitation the rights
-to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
-copies of the Software, and to permit persons to whom the Software is
-furnished to do so, subject to the following conditions:
-
-The above copyright notice and this permission notice shall be included in all
-copies or substantial portions of the Software.
-
-THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
-IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
-FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
-AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
-LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
-OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
-SOFTWARE.
@@ -1,45 +0,0 @@
-# iterative-code-quality
-
-Improve an existing repository through a strict three-step loop of full-codebase scan, job-based refactor, and final documentation/constraint sync while preserving intended business behavior and the system's top-level macro architecture.
-
-## Core capabilities
-
-- Runs a repository-wide scan before every refactor round and refreshes a concrete quality backlog.
-- Uses a strict three-step loop: scan the codebase, choose this round's jobs and refactor, then update docs/constraints only when no actionable gap remains.
-- Keeps job execution guidance in focused reference documents instead of embedding every job as a workflow step in the main skill.
-- Builds a module inventory and coverage ledger so every in-scope module receives a deep-read iteration before completion.
-- Defines deep-read as a job-oriented scan across naming, simplification, module boundaries, logging, testing, and staged unlock opportunities instead of generic reading.
-- Starts from the easiest useful modules first, while preserving the rule that unvisited modules cannot be skipped before completion.
-- Clarifies ambiguous variable, parameter, field, helper, and test-data names.
-- Simplifies complex functions and extracts reusable helpers only when they centralize real behavior.
-- Splits mixed-responsibility code into narrower modules without changing macro architecture.
-- Repairs stale or missing logs and adds tests for important observability contracts.
-- Adds high-value unit, property-based, integration, or E2E tests based on risk.
-- Does not require pre-existing tests before every refactor; for high-risk under-guarded areas, it treats test addition as the next unlock direction.
-- Requires confidence decisions to combine the agent's self-assessed ability, task complexity, guardrail strength, rollback or repair paths, and whether a strong test suite can safely drive broken refactors back to green.
-- Uses those tests and other guardrails to justify more aggressive refactors, instead of leaving known issues in place for subjective confidence reasons.
-- Re-scans the full repository after every iteration and picks the next highest-confidence, highest-leverage directions.
-- Uses small safe refactors to prepare the ground for larger later refactors, progressing gradually from outside to inside.
-- Treats large coupled or apparently core files as staged unlock problems, not as automatic stop signals.
-- Uses explicit next-job selection conditions from references so the agent can decide more concretely whether naming, simplification, modularization, logging, testing, or unlock work should happen next.
-- Runs a stage-gate full-codebase decision after every iteration to decide whether more rounds are still required.
-- Repeats the pass cycle while any known in-scope actionable quality issue remains, and forbids a completion report until the latest scan is clear or remaining items are explicitly deferred with a valid reason.
-- Forbids completion while any in-scope module remains unvisited, even if already-read modules look clean.
-- Targets as many inherited repository quality problems as can be solved safely, and expects the guarded test surface to remain green after the refactor.
-- Synchronizes project docs and `AGENTS.md/CLAUDE.md` through `align-project-documents` and `maintain-project-constraints` after implementation.
-
-## Repository structure
-
-- `SKILL.md`: Main three-step loop, dependencies, guardrails, and output contract.
-- `agents/openai.yaml`: Agent interface metadata and default prompt.
-- `references/`: Focused guides for scanning, module coverage, job selection, naming, simplification, module boundaries, logging, testing, unlock work, and iteration gates.
-
-## Typical usage
-
-```text
-Use $iterative-code-quality to improve this repository's code quality end to end without changing business behavior or macro architecture.
-```
-
-## License
-
-MIT. See `LICENSE`.