@laitszkin/apollo-toolkit 3.3.5 → 3.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (42)
  1. package/AGENTS.md +1 -0
  2. package/CHANGELOG.md +8 -0
  3. package/README.md +1 -0
  4. package/analyse-app-logs/scripts/__pycache__/filter_logs_by_time.cpython-312.pyc +0 -0
  5. package/analyse-app-logs/scripts/__pycache__/log_cli_utils.cpython-312.pyc +0 -0
  6. package/analyse-app-logs/scripts/__pycache__/search_logs.cpython-312.pyc +0 -0
  7. package/develop-new-features/README.md +9 -19
  8. package/develop-new-features/SKILL.md +14 -24
  9. package/develop-new-features/agents/openai.yaml +1 -1
  10. package/docs-to-voice/scripts/__pycache__/docs_to_voice.cpython-312.pyc +0 -0
  11. package/enhance-existing-features/README.md +9 -21
  12. package/enhance-existing-features/SKILL.md +16 -27
  13. package/enhance-existing-features/agents/openai.yaml +1 -1
  14. package/generate-spec/README.md +4 -3
  15. package/generate-spec/SKILL.md +14 -5
  16. package/generate-spec/agents/openai.yaml +1 -1
  17. package/generate-spec/references/templates/checklist.md +5 -0
  18. package/generate-spec/references/templates/tasks.md +38 -9
  19. package/generate-spec/scripts/__pycache__/create-specs.cpython-312.pyc +0 -0
  20. package/katex/scripts/__pycache__/render_katex.cpython-312.pyc +0 -0
  21. package/open-github-issue/scripts/__pycache__/open_github_issue.cpython-312.pyc +0 -0
  22. package/package.json +1 -1
  23. package/read-github-issue/scripts/__pycache__/find_issues.cpython-312.pyc +0 -0
  24. package/read-github-issue/scripts/__pycache__/read_issue.cpython-312.pyc +0 -0
  25. package/resolve-review-comments/scripts/__pycache__/review_threads.cpython-312.pyc +0 -0
  26. package/test-case-strategy/LICENSE +21 -0
  27. package/test-case-strategy/README.md +27 -0
  28. package/test-case-strategy/SKILL.md +110 -0
  29. package/test-case-strategy/agents/openai.yaml +4 -0
  30. package/test-case-strategy/references/e2e-tests.md +31 -0
  31. package/test-case-strategy/references/integration-tests.md +32 -0
  32. package/test-case-strategy/references/property-based-tests.md +43 -0
  33. package/test-case-strategy/references/unit-tests.md +59 -0
  34. package/text-to-short-video/scripts/__pycache__/enforce_video_aspect_ratio.cpython-312.pyc +0 -0
  35. package/develop-new-features/references/testing-e2e.md +0 -36
  36. package/develop-new-features/references/testing-integration.md +0 -42
  37. package/develop-new-features/references/testing-property-based.md +0 -44
  38. package/develop-new-features/references/testing-unit.md +0 -37
  39. package/enhance-existing-features/references/e2e-tests.md +0 -26
  40. package/enhance-existing-features/references/integration-tests.md +0 -30
  41. package/enhance-existing-features/references/property-based-tests.md +0 -33
  42. package/enhance-existing-features/references/unit-tests.md +0 -29
package/AGENTS.md CHANGED
@@ -58,6 +58,7 @@ This repository enables users to install and run a curated set of reusable agent
58
58
  - Users can add focused observability to opaque workflows through targeted logs, metrics, traces, and tests.
59
59
  - Users can iteratively improve repository code quality through behavior-neutral naming, simplification, module-boundary, logging, and test-coverage passes.
60
60
  - Users can iteratively improve repository performance through evidence-backed module scans, safe hot-path optimization, benchmark guardrails, batching, caching, allocation, concurrency, and repeated full-codebase stage gates.
61
+ - Users can select risk-driven test levels and define unit drift checks through a shared testing strategy skill.
61
62
  - Users can build against Jupiter's official Solana swap, token, price, lending, trigger, recurring, and portfolio APIs with an evidence-based development guide.
62
63
  - Users can render and embed math formulas with KaTeX using official documentation-backed guidance and reusable rendering scripts.
63
64
  - Users can debug software systematically by reproducing causes, validating fixes, and testing outcomes.
package/CHANGELOG.md CHANGED
@@ -7,6 +7,14 @@ All notable changes to this repository are documented in this file.
7
7
  ### Added
8
8
  - (None yet)
9
9
 
10
+ ## [v3.4.0] - 2026-04-28
11
+
12
+ ### Added
13
+ - Add `test-case-strategy`, a shared skill for selecting risk-driven test levels, defining meaningful test oracles, and adding focused unit drift checks for atomic implementation tasks.
14
+
15
+ ### Changed
16
+ - Make `generate-spec`, `develop-new-features`, and `enhance-existing-features` depend on `test-case-strategy` for test case selection, while tightening `tasks.md` into an atomic implementation queue with verification hooks.
17
+
10
18
  ## [v3.3.5] - 2026-04-28
11
19
 
12
20
  ### Changed
package/README.md CHANGED
@@ -48,6 +48,7 @@ A curated skill catalog for Codex, OpenClaw, Trae, Agents, and Claude Code with
48
48
  - solana-development
49
49
  - submission-readiness-check
50
50
  - systematic-debug
51
+ - test-case-strategy
51
52
  - text-to-short-video
52
53
  - version-release
53
54
  - video-production
package/develop-new-features/README.md CHANGED
@@ -1,12 +1,12 @@
1
1
  # develop-new-features
2
2
 
3
- A spec-first feature development skill for new behavior and greenfield work. It delegates shared planning-doc generation to `generate-spec`, then implements the approved feature with risk-driven testing.
3
+ A spec-first feature development skill for new behavior and greenfield work. It delegates shared planning-doc generation to `generate-spec`, uses `test-case-strategy` for risk-driven test selection, then implements the approved feature with focused validation.
4
4
 
5
5
  ## Key capabilities
6
6
 
7
7
  - Requires `generate-spec` before any implementation starts.
8
8
  - Treats `spec.md`, `tasks.md`, `checklist.md`, `contract.md`, and `design.md` as approval-gated artifacts, not optional notes.
9
- - Covers unit, regression, property-based, integration, E2E, and adversarial testing based on actual risk.
9
+ - Covers unit, regression, property-based, integration, E2E, adversarial, mock/fake, rollback, and unit drift-check testing based on actual risk through `test-case-strategy`.
10
10
  - Reuses existing architecture and avoids speculative expansion.
11
11
  - Backfills `spec.md`, `tasks.md`, `checklist.md`, `contract.md`, and `design.md` after implementation and testing complete.
12
12
  - Once approval is granted and implementation starts, finishes all in-scope planned tasks and applicable checklist items before yielding unless the user defers work or an external blocker prevents safe completion.
@@ -18,13 +18,8 @@ A spec-first feature development skill for new behavior and greenfield work. It
18
18
  ├── SKILL.md
19
19
  ├── README.md
20
20
  ├── LICENSE
21
- ├── agents/
22
- │   └── openai.yaml
23
- └── references/
24
-     ├── testing-unit.md
25
-     ├── testing-property-based.md
26
-     ├── testing-integration.md
27
-     └── testing-e2e.md
21
+ └── agents/
22
+     └── openai.yaml
28
23
  ```
29
24
 
30
25
  ## Workflow summary
@@ -38,17 +33,12 @@ A spec-first feature development skill for new behavior and greenfield work. It
38
33
 
39
34
  ## Testing expectations
40
35
 
41
- - Unit: changed logic, boundaries, failure paths.
42
- - Regression: pin down bug-prone or high-risk behavior.
43
- - Property-based: required for business logic unless concrete `N/A` is recorded.
44
- - Integration: cover the user-critical logic chain.
45
- - E2E: cover the most important success and denial/failure paths when justified.
46
- - Adversarial: include abuse, malformed input, privilege, replay, concurrency, and edge-combination cases when relevant.
36
+ - Use `test-case-strategy` to choose the smallest useful test level for each risk.
37
+ - Define meaningful oracles before implementation.
38
+ - Add focused unit drift checks for non-trivial atomic tasks when possible.
39
+ - Record concrete `N/A` reasons when a test level is not suitable.
47
40
 
48
41
  ## References
49
42
 
50
43
  - Shared planning workflow: `generate-spec`
51
- - Unit testing guide: `references/testing-unit.md`
52
- - Property-based testing guide: `references/testing-property-based.md`
53
- - Integration testing guide: `references/testing-integration.md`
54
- - E2E testing guide: `references/testing-e2e.md`
44
+ - Test selection and unit drift-check guide: `test-case-strategy`
package/develop-new-features/SKILL.md CHANGED
@@ -2,8 +2,9 @@
2
2
  name: develop-new-features
3
3
  description: >-
4
4
  Spec-first feature development workflow for new behavior and greenfield
5
- features. Depends on `generate-spec` for shared planning artifacts before
6
- coding, then implements the approved feature with risk-driven test coverage.
5
+ features. Depends on `generate-spec` for shared planning artifacts and
6
+ `test-case-strategy` for risk-driven test selection before coding, then
7
+ implements the approved feature with focused validation.
7
8
  Use when users ask to design or implement new features, change product
8
9
  behavior, request a planning-first process, or ask for a greenfield feature.
9
10
  Once the approved spec set exists and implementation begins, complete all
@@ -20,16 +21,16 @@ description: >-
20
21
 
21
22
  ## Dependencies
22
23
 
23
- - Required: `generate-spec` for `spec.md`, `tasks.md`, `checklist.md`, `contract.md`, `design.md`, clarification handling, approval gating, and completion-status backfill.
24
+ - Required: `generate-spec` for `spec.md`, `tasks.md`, `checklist.md`, `contract.md`, `design.md`, clarification handling, approval gating, and completion-status backfill; `test-case-strategy` for risk-driven test selection, meaningful oracle design, and unit drift checks.
24
25
  - Conditional: none.
25
26
  - Optional: none.
26
- - Fallback: If `generate-spec` is unavailable, stop and report the missing dependency.
27
+ - Fallback: If `generate-spec` or `test-case-strategy` is unavailable, stop and report the missing dependency.
27
28
 
28
29
  ## Standards
29
30
 
30
31
  - Evidence: Review authoritative docs and the existing codebase before planning or implementation.
31
32
  - Execution: Use specs only for feature work that is genuinely multi-step, cross-surface, or higher risk; skip specs for obviously small/localized work and route that work to direct implementation or the appropriate maintenance skill instead.
32
- - Quality: Add risk-based tests with property-based, regression, integration, E2E, adversarial, and rollback coverage when relevant.
33
+ - Quality: Use `test-case-strategy` to add risk-based tests with property-based, regression, integration, E2E, adversarial, and rollback coverage when relevant.
33
34
  - Output: Keep the approved planning artifacts and the final implementation aligned with actual completion results.
34
35
 
35
36
  ## Goal
@@ -93,21 +94,13 @@ Use a shared spec-generation workflow for non-trivial new feature work, then imp
93
94
 
94
95
  ### 5) Testing coverage (required)
95
96
 
96
- For every non-trivial change, evaluate all categories and add test cases or record justified `N/A`:
97
- - Start from a risk inventory, not from the happy path: assess misuse/abuse, authorization, invalid transitions, idempotency, replay/duplication, concurrency/races, data-integrity, and partial-failure/rollback risks.
98
- - Unit tests: changed logic, boundaries, failure paths, and exact error/side-effect expectations.
99
- - Regression tests: bug-prone or high-risk behavior that should never silently regress again.
100
- - Property-based tests: required for business-logic changes unless truly unsuitable; use them for invariants, generated business input spaces, state-machine/metamorphic checks when useful, and output expectation checks.
101
- - Integration tests: user-critical logic chain across modules/layers.
102
- - E2E tests: key user-visible path impacted by this change; prefer one minimal critical success path plus one highest-value denial/failure path when the risk warrants it.
103
- - Adversarial/penetration-style cases: abuse paths, malformed inputs, forged identities/privileges, invalid transitions, replay/duplication, stale/out-of-order events, toxic payload sizes, and risky edge combinations.
104
-
105
- Rules:
106
- - If E2E is too costly or unstable, add stronger integration coverage for the same risk and record the reason in the checklist.
107
- - If property-based testing is not suitable, record `N/A` with a concrete reason.
108
- - For logic chains with external services, mock or fake those services unless the real contract itself is under test; simulate diverse external states and verify the business chain remains correct.
109
- - Where the feature can partially commit work, test rollback, compensation, or no-partial-write behavior explicitly.
110
- - Each test must assert a meaningful oracle: exact business output, persisted state, emitted side effects, or intentional lack of side effects. Avoid assertion-light smoke tests and snapshot-only coverage.
97
+ Use `$test-case-strategy` for every non-trivial change.
98
+
99
+ - Start from risk inventory and requirement IDs, not from the happy path.
100
+ - Define test oracles before implementation and map them to `spec.md`, `tasks.md`, and `checklist.md`.
101
+ - For each atomic task that changes non-trivial local logic, define a focused unit drift check or record the smallest replacement verification with a concrete `N/A` reason.
102
+ - Add unit, regression, property-based, integration, E2E, adversarial, mock/fake, rollback, or no-partial-write coverage only when the risk profile warrants it.
103
+ - Each planned test must have a meaningful oracle: exact business output, persisted state, emitted side effects, or intentional lack of side effects.
111
104
  - Run relevant tests when possible and fix failures.
112
105
 
113
106
  ### 6) Completion updates
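The oracle rule in the hunk above ("exact business output, persisted state, emitted side effects, or intentional lack of side effects") can be illustrated with a small sketch. This is a hypothetical example and not part of the package: the `Cart` class, its fields, and the test names are assumptions made up purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Cart:
    """Hypothetical domain object, invented only for this illustration."""

    prices_cents: list[int] = field(default_factory=list)
    audit_log: list[str] = field(default_factory=list)

    def apply_discount(self, percent: int) -> int:
        """Return the discounted total in cents; reject invalid percentages."""
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        total = sum(self.prices_cents)
        discounted = total * (100 - percent) // 100
        # Side effect is recorded only after validation succeeds.
        self.audit_log.append(f"discount:{percent}")
        return discounted


def test_discount_exact_output_and_side_effect() -> None:
    cart = Cart(prices_cents=[1000, 250])
    # Oracle: exact business output, not merely "no exception was raised".
    assert cart.apply_discount(20) == 1000
    # Oracle: the emitted side effect is recorded exactly once.
    assert cart.audit_log == ["discount:20"]


def test_invalid_discount_has_no_side_effect() -> None:
    cart = Cart(prices_cents=[1000])
    try:
        cart.apply_discount(150)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
    # Oracle: intentional lack of side effects on the failure path.
    assert cart.audit_log == []
```

A smoke test that only calls `apply_discount` and checks that nothing raised would pass even if the arithmetic or the audit entry were wrong, which is the kind of coverage the removed rule text called "assertion-light".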
@@ -138,7 +131,4 @@ Rules:
138
131
  ## References
139
132
 
140
133
  - `$generate-spec`: shared planning and approval workflow.
141
- - `references/testing-unit.md`: unit testing principles.
142
- - `references/testing-property-based.md`: property-based testing principles.
143
- - `references/testing-integration.md`: integration testing principles.
144
- - `references/testing-e2e.md`: E2E decision and design principles.
134
+ - `$test-case-strategy`: shared test selection, oracle design, and unit drift-check workflow.
package/develop-new-features/agents/openai.yaml CHANGED
@@ -1,4 +1,4 @@
1
1
  interface:
2
2
  display_name: "Develop New Features"
3
3
  short_description: "Spec-first feature development that depends on generate-spec"
4
- default_prompt: "Use $develop-new-features to design new behavior through a spec-first workflow: review the required external docs, run $generate-spec to create and maintain docs/plans/<date>/<change_name>/... for single-spec work or docs/plans/<date>/<batch_name>/<change_name>/... plus coordination.md for parallel batches, wait for explicit approval, document material external dependency contracts in contract.md, document the architecture/design delta in design.md, record shared preparation and legacy-replacement direction in coordination.md when multiple specs will be implemented in parallel, then complete the approved implementation end-to-end with risk-driven tests and full backfill of spec.md, tasks.md, checklist.md, contract.md, design.md, and when applicable coordination.md before yielding, unless the user changes scope or an external blocker prevents safe completion."
4
+ default_prompt: "Use $develop-new-features to design new behavior through a spec-first workflow: review the required external docs, run $generate-spec to create and maintain docs/plans/<date>/<change_name>/... for single-spec work or docs/plans/<date>/<batch_name>/<change_name>/... plus coordination.md for parallel batches, wait for explicit approval, document material external dependency contracts in contract.md, document the architecture/design delta in design.md, record shared preparation and legacy-replacement direction in coordination.md when multiple specs will be implemented in parallel, use $test-case-strategy to choose risk-driven tests and unit drift checks before implementation, then complete the approved implementation end-to-end with full backfill of spec.md, tasks.md, checklist.md, contract.md, design.md, and when applicable coordination.md before yielding, unless the user changes scope or an external blocker prevents safe completion."
package/enhance-existing-features/README.md CHANGED
@@ -1,13 +1,13 @@
1
1
  # enhance-existing-features
2
2
 
3
- A brownfield feature-extension skill: map dependencies first, decide whether shared specs are required, then implement the approved change with risk-driven hardening tests.
3
+ A brownfield feature-extension skill: map dependencies first, decide whether shared specs are required, then use `test-case-strategy` to implement the change with risk-driven hardening tests and unit drift checks.
4
4
 
5
5
  ## Core capabilities
6
6
 
7
7
  - Explores dependencies and data flow before deciding how to change the system.
8
8
  - Uses `generate-spec` whenever the change is high-complexity, touches a critical module, or crosses module boundaries.
9
9
  - Requires explicit approval before coding when specs are generated.
10
- - Still requires meaningful tests even when specs are skipped.
10
+ - Still requires meaningful tests even when specs are skipped, selected through `test-case-strategy`.
11
11
  - Keeps brownfield changes focused and traceable.
12
12
  - When specs exist and are approved, finishes all in-scope planned tasks and applicable checklist items before yielding unless the user defers work or an external blocker prevents safe completion.
13
13
 
@@ -18,13 +18,8 @@ A brownfield feature-extension skill: map dependencies first, decide whether sha
18
18
  ├── SKILL.md
19
19
  ├── README.md
20
20
  ├── LICENSE
21
- ├── agents/
22
- │   └── openai.yaml
23
- └── references/
24
-     ├── unit-tests.md
25
-     ├── property-based-tests.md
26
-     ├── integration-tests.md
27
-     └── e2e-tests.md
21
+ └── agents/
22
+     └── openai.yaml
28
23
  ```
29
24
 
30
25
  ## Workflow summary
@@ -38,19 +33,12 @@ A brownfield feature-extension skill: map dependencies first, decide whether sha
38
33
 
39
34
  ## Test requirements
40
35
 
41
- - Unit: changed logic, boundaries, failure paths.
42
- - Regression: bug-prone or high-risk behavior that must not silently return.
43
- - Property-based: mandatory for business logic unless concrete `N/A` is recorded.
44
- - Integration: user-critical logic chain across layers/modules.
45
- - E2E: affected key user-visible path when the risk justifies it.
46
- - Adversarial: abuse paths, malformed inputs, privilege issues, replay, concurrency, and edge combinations when relevant.
47
-
48
- If E2E is not feasible, replace it with stronger integration coverage and record the reason.
36
+ - Use `test-case-strategy` to choose the smallest useful test level for each risk.
37
+ - Define meaningful oracles before finalizing tests.
38
+ - Add focused unit drift checks for non-trivial atomic tasks when possible.
39
+ - Record concrete `N/A` reasons when a test level is not suitable.
49
40
 
50
41
  ## References
51
42
 
52
43
  - Shared planning workflow: `generate-spec`
53
- - Unit testing guide: `references/unit-tests.md`
54
- - Property-based testing guide: `references/property-based-tests.md`
55
- - Integration testing guide: `references/integration-tests.md`
56
- - E2E testing guide: `references/e2e-tests.md`
44
+ - Test selection and unit drift-check guide: `test-case-strategy`
package/enhance-existing-features/SKILL.md CHANGED
@@ -6,26 +6,26 @@ description: >-
6
6
  before coding. When specs are needed, use `generate-spec` for planning,
7
7
  clarification, approval, and backfill, and complete approved in-scope tasks
8
8
  before yielding unless scope changes or an external blocker prevents safe
9
- completion. With or without specs, add and run relevant unit,
10
- property-based, regression, integration, E2E, and adversarial tests as
11
- applicable, use mocks for external services in logic chains, and verify
12
- meaningful business outcomes instead of smoke-only success.
9
+ completion. With or without specs, use `test-case-strategy` to select and
10
+ run relevant unit, property-based, regression, integration, E2E, adversarial,
11
+ mock/fake, and drift-check coverage, and verify meaningful business outcomes
12
+ instead of smoke-only success.
13
13
  ---
14
14
 
15
15
  # Enhance Existing Features
16
16
 
17
17
  ## Dependencies
18
18
 
19
- - Required: `generate-spec` for shared planning docs when spec-trigger conditions are met.
20
- - Conditional: `recover-missing-plan` when the user points to a required `docs/plans/...` spec set that is missing, archived, or mismatched in the current workspace.
19
+ - Required: `test-case-strategy` for risk-driven test selection, meaningful oracle design, and unit drift checks.
20
+ - Conditional: `generate-spec` for shared planning docs when spec-trigger conditions are met; `recover-missing-plan` when the user points to a required `docs/plans/...` spec set that is missing, archived, or mismatched in the current workspace.
21
21
  - Optional: none.
22
- - Fallback: If specs are required and `generate-spec` is unavailable, stop and report the missing dependency.
22
+ - Fallback: If `test-case-strategy` is unavailable, stop and report the missing dependency. If specs are required and `generate-spec` is unavailable, stop and report the missing dependency.
23
23
 
24
24
  ## Standards
25
25
 
26
26
  - Evidence: Explore the existing codebase first and verify the latest authoritative docs for the involved stack or integrations.
27
27
  - Execution: Decide whether specs are required from the actual change surface, run `generate-spec` when needed, then continue through implementation, testing, and backfill until the active scope is fully reconciled; when the user asks for a specific final behavior or architectural end state, do not substitute a preparatory or partial milestone unless the user explicitly re-scopes the request.
28
- - Quality: Add risk-based tests with property-based, regression, integration, E2E, adversarial, and rollback coverage when relevant.
28
+ - Quality: Use `test-case-strategy` to add risk-based tests with property-based, regression, integration, E2E, adversarial, and rollback coverage when relevant.
29
29
  - Output: Keep implementation and any planning artifacts traceable, updated, and aligned with actual completion results.
30
30
 
31
31
  ## Overview
@@ -106,21 +106,13 @@ If not triggered:
106
106
 
107
107
  ### 5) Testing coverage (required with or without specs)
108
108
 
109
- For every non-trivial change, evaluate all categories and add test cases or record justified `N/A`:
110
- - Start from a risk inventory, not from the happy path: assess misuse/abuse, authorization, invalid transitions, idempotency, replay/duplication, concurrency/races, data-integrity, and partial-failure/rollback risks.
111
- - Unit tests: changed logic, boundaries, failure paths, and exact error/side-effect expectations.
112
- - Regression tests: bug-prone or high-risk behavior that should never silently regress again.
113
- - Property-based tests: required for business-logic changes unless truly unsuitable; use them for invariants, generated business input spaces, state-machine/metamorphic checks when useful, and output expectation checks.
114
- - Integration tests: user-critical logic chain across modules/layers.
115
- - E2E tests: key user-visible path impacted by this change; prefer one minimal critical success path plus one highest-value denial/failure path when the risk warrants it.
116
- - Adversarial/penetration-style cases: abuse paths, malformed inputs, forged identities/privileges, invalid transitions, replay/duplication, stale/out-of-order events, toxic payload sizes, and risky edge combinations.
117
-
118
- Rules:
119
- - If E2E is too costly or unstable, add stronger integration coverage for the same risk and record the reason.
120
- - If property-based testing is not suitable, record `N/A` with a concrete reason.
121
- - For logic chains with external services, mock or fake those services unless the real contract itself is under test; simulate diverse external states and verify the business chain remains correct.
122
- - Where the feature can partially commit work, test rollback, compensation, or no-partial-write behavior explicitly.
123
- - Each test must assert a meaningful oracle: exact business output, persisted state, emitted side effects, or intentional lack of side effects. Avoid assertion-light smoke tests and snapshot-only coverage.
109
+ Use `$test-case-strategy` for every non-trivial change, even when specs are skipped.
110
+
111
+ - Start from risk inventory and changed behavior, not from the happy path.
112
+ - Define test oracles before implementation when the change is planned, and before finalizing tests when the change is discovered during brownfield exploration.
113
+ - For each atomic task that changes non-trivial local logic, define a focused unit drift check or record the smallest replacement verification with a concrete `N/A` reason.
114
+ - Add unit, regression, property-based, integration, E2E, adversarial, mock/fake, rollback, or no-partial-write coverage only when the risk profile warrants it.
115
+ - Each planned test must have a meaningful oracle: exact business output, persisted state, emitted side effects, or intentional lack of side effects.
124
116
  - Run relevant tests when possible and fix failures.
125
117
 
126
118
  ### 6) Completion updates
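The "mock/fake, rollback, or no-partial-write coverage" item in the hunk above could look roughly like the following sketch. Everything here is hypothetical and not taken from the package: `FakePaymentGateway`, `OrderStore`, and `place_order` are illustrative names, and the rollback strategy (tentative write, then undo on failure) is just one possible design.

```python
class FakePaymentGateway:
    """Hypothetical in-memory fake for an external service; can be told to fail."""

    def __init__(self, fail: bool = False) -> None:
        self.fail = fail
        self.charges: list[int] = []

    def charge(self, amount_cents: int) -> None:
        if self.fail:
            raise ConnectionError("gateway unavailable")
        self.charges.append(amount_cents)


class OrderStore:
    """Hypothetical persistence layer, reduced to a dict for the sketch."""

    def __init__(self) -> None:
        self.orders: dict[str, int] = {}


def place_order(store: OrderStore, gateway: FakePaymentGateway,
                order_id: str, amount_cents: int) -> None:
    """Write the order tentatively, charge, and undo the write if the charge fails."""
    store.orders[order_id] = amount_cents
    try:
        gateway.charge(amount_cents)
    except ConnectionError:
        del store.orders[order_id]  # roll back the tentative write
        raise


def test_success_persists_order_and_charges_once() -> None:
    store, gateway = OrderStore(), FakePaymentGateway()
    place_order(store, gateway, "o-1", 500)
    assert store.orders == {"o-1": 500}
    assert gateway.charges == [500]


def test_gateway_failure_leaves_no_partial_write() -> None:
    store, gateway = OrderStore(), FakePaymentGateway(fail=True)
    try:
        place_order(store, gateway, "o-1", 500)
    except ConnectionError:
        pass
    else:
        raise AssertionError("expected ConnectionError")
    # Oracle: persisted state is unchanged and no charge was recorded.
    assert store.orders == {}
    assert gateway.charges == []
```

Using a fake rather than the real gateway keeps the failure state deterministic, so the test can assert the persisted state after a simulated outage instead of only checking that an error surfaced.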
@@ -152,7 +144,4 @@ Rules:
152
144
  ## References
153
145
 
154
146
  - `$generate-spec`: shared planning and approval workflow.
155
- - `references/unit-tests.md`: unit testing guidance.
156
- - `references/property-based-tests.md`: property-based testing guidance.
157
- - `references/integration-tests.md`: integration testing guidance.
158
- - `references/e2e-tests.md`: E2E decision and design guidance.
147
+ - `$test-case-strategy`: shared test selection, oracle design, and unit drift-check workflow.
package/enhance-existing-features/agents/openai.yaml CHANGED
@@ -1,4 +1,4 @@
1
1
  interface:
2
2
  display_name: "enhance-existing-features"
3
3
  short_description: "Extend brownfield features with conditional generate-spec planning and risk-driven tests"
4
- default_prompt: "Use $enhance-existing-features to extend a brownfield feature: map the affected code and dependencies first, decide whether the change is high complexity / critical module / cross-module, run $generate-spec when specs are required, wait for explicit approval before coding, document material external dependency contracts in contract.md, document the architecture/design delta in design.md, and when one change is split into parallel spec sets maintain a shared coordination.md for common preparation, ownership boundaries, and legacy-replacement direction; always add risk-driven tests plus clear N/A reasons when a category truly does not apply; if the user asked for a specific final behavior or architecture state, do not stop at an enabling intermediate milestone unless the user explicitly narrows scope; if a spec set exists, finish the approved tasks and applicable checklist items and backfill spec.md, tasks.md, checklist.md, contract.md, design.md, and when applicable coordination.md before yielding unless the user changes scope or an external blocker prevents safe completion."
4
+ default_prompt: "Use $enhance-existing-features to extend a brownfield feature: map the affected code and dependencies first, decide whether the change is high complexity / critical module / cross-module, run $generate-spec when specs are required, wait for explicit approval before coding, document material external dependency contracts in contract.md, document the architecture/design delta in design.md, and when one change is split into parallel spec sets maintain a shared coordination.md for common preparation, ownership boundaries, and legacy-replacement direction; use $test-case-strategy to choose risk-driven tests, unit drift checks, and clear N/A reasons even when specs are skipped; if the user asked for a specific final behavior or architecture state, do not stop at an enabling intermediate milestone unless the user explicitly narrows scope; if a spec set exists, finish the approved tasks and applicable checklist items and backfill spec.md, tasks.md, checklist.md, contract.md, design.md, and when applicable coordination.md before yielding unless the user changes scope or an external blocker prevents safe completion."
package/generate-spec/README.md CHANGED
@@ -1,6 +1,6 @@
1
1
  # generate-spec
2
2
 
3
- A shared planning skill for feature work. It centralizes creation and maintenance of `spec.md`, `tasks.md`, `checklist.md`, `contract.md`, `design.md`, and when needed `coordination.md` so other skills can reuse one consistent approval-gated spec workflow.
3
+ A shared planning skill for feature work. It centralizes creation and maintenance of `spec.md`, `tasks.md`, `checklist.md`, `contract.md`, `design.md`, and when needed `coordination.md` so other skills can reuse one consistent approval-gated spec workflow with risk-driven test planning from `test-case-strategy`.
4
4
 
5
5
  ## Core capabilities
6
6
 
@@ -10,6 +10,7 @@ A shared planning skill for feature work. It centralizes creation and maintenanc
10
10
  - Requires clarification handling and explicit user approval before implementation starts.
11
11
  - Backfills task and checklist status after implementation and testing.
12
12
  - Keeps requirement, task, and test coverage mapping traceable.
13
+ - Uses `test-case-strategy` to choose test levels, define meaningful oracles, and add focused unit drift checks to atomic implementation tasks.
13
14
  - Standardizes external dependency contracts in `contract.md` and architecture/design deltas in `design.md`.
14
15
 
15
16
  ## Repository layout
@@ -77,8 +78,8 @@ docs/plans/<today>/membership-cutover/
77
78
  ## Authoring rules
78
79
 
79
80
  - `spec.md`: use BDD keywords `GIVEN / WHEN / THEN / AND / Requirements`.
80
- - `tasks.md`: use `## **Task N: ...**`, `- N. [ ]`, and `- N.x [ ]`.
81
- - `checklist.md`: use `- [ ]` only, adapt items to real scope, and record actual results.
81
+ - `tasks.md`: use `## **Task N: ...**` and atomic implementation queue items with allowed scope, output, completion condition, verification hook, and unit drift check.
82
+ - `checklist.md`: use `- [ ]` only, adapt items to real scope, record actual results, and map behavior risks to test IDs plus oracles selected through `test-case-strategy`.
82
83
  - `contract.md`: when external dependencies materially shape the change, record their official-source-backed invocation surface, constraints, and caller obligations in the standard dependency-record format.
83
84
  - `design.md`: record the architecture/design delta in the standard format, including affected modules, flow, invariants, tradeoffs, and validation plan.
84
85
  - `coordination.md`: for multi-spec batches only, record shared preparation, ownership boundaries, replacement direction, file ownership guardrails, known collision candidates, pre-agreed edit rules for shared surfaces, shared API/schema freeze or additive-only rules, compatibility-shim retention rules, merge order, and cross-spec integration checkpoints, but never use it to make one spec depend on another spec's implementation before it can be completed.
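The new `tasks.md` authoring rule in the hunk above lists five parts for each atomic queue item: allowed scope, output, completion condition, verification hook, and unit drift check. The actual template lives in `references/templates/tasks.md` and is not shown in this part of the diff, so the sketch below only models that field list. The `TaskItem` class and `missing_fields` helper are hypothetical names, not part of the package.

```python
from dataclasses import dataclass, fields


@dataclass
class TaskItem:
    """Hypothetical model of one queue item; field names mirror the authoring rule."""

    allowed_scope: str
    output: str
    completion_condition: str
    verification_hook: str
    unit_drift_check: str  # a concrete check, or "N/A: <reason>"


def missing_fields(item: TaskItem) -> list[str]:
    """Return the names of fields that are empty or whitespace-only."""
    return [f.name for f in fields(item) if not getattr(item, f.name).strip()]


# Example values are invented for illustration only.
complete = TaskItem(
    allowed_scope="pricing/discount.py",
    output="apply_discount handles 0-100 percent",
    completion_condition="all discount unit tests pass",
    verification_hook="run the discount unit test file",
    unit_drift_check="assert apply_discount(1250, 20) == 1000",
)
incomplete = TaskItem(
    allowed_scope="pricing/discount.py",
    output="apply_discount handles 0-100 percent",
    completion_condition="all discount unit tests pass",
    verification_hook="   ",
    unit_drift_check="N/A: pure rename, covered by existing tests",
)
```

Here `missing_fields(complete)` returns an empty list, while `missing_fields(incomplete)` flags `verification_hook`. A drift check recorded as `N/A` with a concrete reason still counts as filled in, matching the rule that unsuitable checks need a stated reason rather than an empty slot.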
package/generate-spec/SKILL.md CHANGED
@@ -1,22 +1,22 @@
1
1
  ---
2
2
  name: generate-spec
3
- description: Generate and maintain shared feature planning artifacts (`spec.md`, `tasks.md`, `checklist.md`, `contract.md`, `design.md`, and when needed `coordination.md`) from standard templates with clarification tracking, approval gating, and post-implementation backfill. Use when a workflow needs specs before coding, or when another skill needs to create/update planning docs under `docs/plans/{YYYY-MM-DD}/...`.
3
+ description: Generate and maintain shared feature planning artifacts (`spec.md`, `tasks.md`, `checklist.md`, `contract.md`, `design.md`, and when needed `coordination.md`) from standard templates with clarification tracking, approval gating, unit drift-check planning, and post-implementation backfill. Use when a workflow needs specs before coding, or when another skill needs to create/update planning docs under `docs/plans/{YYYY-MM-DD}/...`.
4
4
  ---
5
5
 
6
6
  # Generate Spec
7
7
 
8
8
  ## Dependencies
9
9
 
10
- - Required: none.
10
+ - Required: `test-case-strategy` for risk-driven test case selection, meaningful oracle design, and unit drift-check planning.
11
11
  - Conditional: none.
12
12
  - Optional: none.
13
- - Fallback: not applicable.
13
+ - Fallback: If `test-case-strategy` is unavailable, stop and report the missing dependency instead of inventing test coverage heuristics locally.
14
14
 
15
15
  ## Standards
16
16
 
17
17
  - Evidence: Review the relevant code, configs, and authoritative docs before filling requirements or test plans; when external dependencies, libraries, frameworks, APIs, or platforms are involved, checking their official documentation is mandatory during spec creation.
18
18
  - Execution: Generate the planning files first, keep each spec set tightly scoped, split broader work into multiple independent spec sets when needed, ensure every batch spec is independently completable and truly parallel-implementable without depending on another spec set to land first, surface shared-file or shared-contract collision risks during planning, resolve those coordination rules before implementation starts, complete them with traceable requirements and risks, handle clarification updates, then wait for explicit approval before implementation.
19
- - Quality: Keep `spec.md`, `tasks.md`, `checklist.md`, `contract.md`, and `design.md` synchronized, map each planned test to a concrete risk or requirement, and tailor the templates so only applicable items remain active.
19
+ - Quality: Keep `spec.md`, `tasks.md`, `checklist.md`, `contract.md`, and `design.md` synchronized, use `test-case-strategy` to map each planned test to a concrete risk or requirement, and tailor the templates so only applicable items remain active.
20
20
  - Output: Store planning artifacts under `docs/plans/{YYYY-MM-DD}/{change_name}/` for single-spec work, or `docs/plans/{YYYY-MM-DD}/{batch_name}/{change_name}/` plus `coordination.md` for multi-spec parallel work whose member specs remain independently approvable, independently implementable, and ready for concurrent worktree execution with pre-agreed collision rules, and keep them concise, executable, and easy to update.
21
21
 
22
22
  ## Goal
@@ -78,8 +78,14 @@ Own the shared planning-doc lifecycle for feature work so other skills can reuse

  - Use `## **Task N: [Task Title]**` for each main task.
  - Describe each task's purpose and the related requirement IDs.
- - Use `- N. [ ]` for tasks and `- N.x [ ]` for subtasks.
+ - Use `- N. [ ]` for atomic task items; use `- N.x [ ]` only when a task must be split into additional atomic subtasks.
+ - Treat `tasks.md` as an implementation queue, not a high-level summary.
+ - Make each checkbox atomic: one verb, one responsibility, one concrete output, and one verification hook.
+ - For every task, include allowed scope, out-of-scope guardrails, requirement/design/contract inputs, expected output, completion condition, and verification hook.
+ - If one task needs more than three files, more than one behavior slice, or an implementation decision not already captured in `design.md` or `contract.md`, split it before approval.
+ - Use `$test-case-strategy` to define test IDs and unit drift checks for non-trivial local logic before implementation starts.
  - Include explicit tasks for testing, mocks/fakes, regression coverage, and adversarial or edge-case hardening when relevant.
+ - Do not write vague tasks such as `Implement integration`, `Add tests`, or `Update docs`; replace them with task-local outputs, test IDs, and verification commands.

  ### 5) Fill `contract.md`

@@ -121,6 +127,7 @@ Own the shared planning-doc lifecycle for feature work so other skills can reuse
  - Treat the template as a starting point and adapt it to the actual scope.
  - Remove or rewrite template examples that are not part of the real plan instead of leaving them as fake work to be checked later.
  - Map observable behaviors to requirement IDs and real test case IDs.
+ - Use `$test-case-strategy` to choose the smallest test level that proves each risk and to define meaningful oracles before implementation.
  - Record risk class, oracle/assertion focus, dependency strategy, and test results.
  - Property-based coverage is required for business-logic changes unless a concrete `N/A` reason is recorded.
  - For decision sections, create as many records as needed for distinct flows or risk slices; do not collapse unrelated decisions into one record.
@@ -152,6 +159,7 @@ Own the shared planning-doc lifecycle for feature work so other skills can reuse

  - By default, write planning docs in the user's language.
  - Keep requirement IDs, task IDs, and test IDs traceable across all three files.
+ - Every non-trivial implementation task must have either a focused unit drift check, another concrete verification hook, or an explicit `N/A` reason.
  - Never allow one spec set to cover more than three modules.
  - When a request exceeds that scope, split it into independent, non-conflicting, non-dependent spec sets before approval.
  - For batch specs, independence is mandatory: each spec must describe a complete slice that can be implemented, tested, reviewed, and merged without waiting for another spec in the same batch.
@@ -169,6 +177,7 @@ Own the shared planning-doc lifecycle for feature work so other skills can reuse

  ## References

+ - `$test-case-strategy`: shared test case selection, oracle design, and unit drift-check workflow.
  - `scripts/create-specs`: shared planning file generator, exposed as `apltk create-specs`.
  - `references/templates/spec.md`: BDD requirement template.
  - `references/templates/tasks.md`: task breakdown template.
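The split rule and the vague-task rule added to `SKILL.md` above can be read as two small predicates. The following is a minimal sketch only, not code from the package: `TaskItem`, `needs_split`, and `is_vague` are hypothetical names, and the thresholds are copied from the rule text.

```python
from dataclasses import dataclass, field


# Hypothetical model of one tasks.md checkbox; not part of apollo-toolkit.
@dataclass
class TaskItem:
    title: str
    files: list[str] = field(default_factory=list)
    behavior_slices: int = 1
    # Decisions the task needs that are missing from design.md / contract.md.
    undocumented_decisions: int = 0


# Titles the rule text names as too vague to queue.
VAGUE_TITLES = {"implement integration", "add tests", "update docs"}


def needs_split(task: TaskItem) -> bool:
    """Apply the 'split it before approval' rule as worded in SKILL.md."""
    return (
        len(task.files) > 3
        or task.behavior_slices > 1
        or task.undocumented_decisions > 0
    )


def is_vague(task: TaskItem) -> bool:
    """Flag titles that match the vague examples listed in SKILL.md."""
    return task.title.strip().lower() in VAGUE_TITLES
```

A task that touches four files, or whose title is exactly `Add tests`, would be flagged here; how a real workflow counts "behavior slices" is left open by the source and is an assumption of this model.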
package/generate-spec/agents/openai.yaml CHANGED
@@ -1,4 +1,4 @@
  interface:
  display_name: "generate-spec"
  short_description: "Generate shared feature spec, task, and checklist docs before coding"
- default_prompt: "Use $generate-spec to create or update single-spec plans under docs/plans/<date>/<change_name>/ or parallel batches under docs/plans/<date>/<batch_name>/<change_name>/ with a shared coordination.md, but ensure every spec in a batch remains independently approvable, independently implementable, and safe for true parallel execution without depending on another batch spec landing first; surface shared-file or shared-contract collision risks during planning, settle ownership and additive-only rules in coordination.md before implementation starts, fill BDD requirements and risk-driven test planning, document external dependency contracts in contract.md when they materially constrain the change, write the architecture/design delta in design.md, process clarification updates, and wait for explicit approval before implementation."
+ default_prompt: "Use $generate-spec to create or update single-spec plans under docs/plans/<date>/<change_name>/ or parallel batches under docs/plans/<date>/<batch_name>/<change_name>/ with a shared coordination.md, but ensure every spec in a batch remains independently approvable, independently implementable, and safe for true parallel execution without depending on another batch spec landing first; surface shared-file or shared-contract collision risks during planning, settle ownership and additive-only rules in coordination.md before implementation starts, fill BDD requirements, use $test-case-strategy for risk-driven test planning and unit drift checks, make tasks.md an atomic implementation queue with concrete outputs and verification hooks, document external dependency contracts in contract.md when they materially constrain the change, write the architecture/design delta in design.md, process clarification updates, and wait for explicit approval before implementation."
package/generate-spec/references/templates/checklist.md CHANGED
@@ -12,6 +12,7 @@
  - Duplicate or remove decision-record blocks as needed; the final document should contain as many records as the real change requires.
  - Duplicate or remove completion-record blocks as needed; the final document should contain as many records as the real change requires.
  - Suggested test result values: `PASS / FAIL / BLOCKED / NOT RUN / N/A`.
+ - Use `$test-case-strategy` to choose test levels, define meaningful oracles, and record unit drift checks for atomic tasks.
  - For business-logic changes, property-based coverage is required unless a concrete `N/A` reason is recorded.
  - Each checklist item should map to a distinct risk; avoid repeating shallow happy-path cases.

@@ -30,6 +31,7 @@
  - Property/matrix focus: [invariant / generated business input space / external state matrix / adversarial case]
  - External dependency strategy: [none / mocked service states / near-real dependency]
  - Oracle/assertion focus: [exact output / persisted state / side effects / no partial write / compensation / emitted event / permission denial]
+ - Unit drift check: [UT-xx target unit + oracle, or N/A with reason]
  - Test result: `PASS / FAIL / BLOCKED / NOT RUN / N/A`
  - Notes (optional): [risk, limitation, observation]

@@ -41,6 +43,7 @@
  - Property/matrix focus: [invariant / generated business input space / external state matrix / adversarial case]
  - External dependency strategy: [none / mocked service states / near-real dependency]
  - Oracle/assertion focus: [exact output / persisted state / side effects / no partial write / compensation / emitted event / permission denial]
+ - Unit drift check: [UT-xx target unit + oracle, or N/A with reason]
  - Test result: `PASS / FAIL / BLOCKED / NOT RUN / N/A`
  - Notes (optional): [risk, limitation, observation]

@@ -52,11 +55,13 @@
  - Property/matrix focus: [invariant / generated business input space / external state matrix / adversarial case]
  - External dependency strategy: [none / mocked service states / near-real dependency]
  - Oracle/assertion focus: [exact output / persisted state / side effects / no partial write / compensation / emitted event / permission denial]
+ - Unit drift check: [UT-xx target unit + oracle, or N/A with reason]
  - Test result: `PASS / FAIL / BLOCKED / NOT RUN / N/A`
  - Notes (optional): [risk, limitation, observation]

  ## Required Hardening Records
  - [ ] Regression tests are added/updated for bug-prone or high-risk behavior, or `N/A` is recorded with a concrete reason.
+ - [ ] Focused unit drift checks are defined for non-trivial atomic implementation tasks, or `N/A` is recorded with the replacement verification and concrete reason.
  - [ ] Property-based coverage is added/updated for changed business logic, or `N/A` is recorded with a concrete reason.
  - [ ] External services in the business logic chain are mocked/faked for scenario testing, or `N/A` is recorded with a concrete reason.
  - [ ] Adversarial/penetration-style cases are added/updated for abuse paths and edge combinations, or `N/A` is recorded with a concrete reason.
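The new `Unit drift check` field accepts either a test ID with its target unit and oracle, or `N/A` with a reason. As a rough illustration, a reviewer script could reject values that are still placeholders or that give an ID or `N/A` with nothing after it. This is a sketch under stated assumptions: the function name `drift_check_is_filled` is hypothetical, and the `UT-<digits>` ID shape is inferred from the `UT-xx` placeholder rather than defined by the template.

```python
import re

# Assumed shapes, inferred from the placeholder
# "[UT-xx target unit + oracle, or N/A with reason]".
UT_RE = re.compile(r"UT-\d+\b(.*)")
NA_RE = re.compile(r"N/A\b(.*)", re.IGNORECASE)


def drift_check_is_filled(value: str) -> bool:
    """Return True for 'UT-<n> <unit + oracle>' or 'N/A <reason>'."""
    text = value.strip()
    if text.startswith("[") and text.endswith("]"):
        # Bracketed placeholder guidance was left in place.
        return False
    for pattern in (UT_RE, NA_RE):
        match = pattern.match(text)
        if match:
            # Require some detail after the ID or after 'N/A'.
            return bool(match.group(1).strip(" :,-()"))
    return False
```

The check is deliberately shallow: it confirms that something follows the ID or the `N/A`, not that the oracle or the reason is meaningful, which remains a review judgment.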
package/generate-spec/references/templates/tasks.md CHANGED
@@ -5,32 +5,61 @@

  ## **Task 1: [Task Title]**

- [Describe task purpose and requirement mapping (for example: maps to R1.x, core objective is [one sentence]).]
+ Purpose: [one sentence describing the narrow outcome]
+ Requirements: [R1.x]
+ Allowed scope: [files/modules/functions this task may touch]
+ Out of scope: [files/modules/behaviors this task must not change]

  - 1. [ ] [Main task item]
- - 1.1 [ ] [Subtask item]
- - 1.2 [ ] [Subtask item]
+ - Input: [requirement/design/contract evidence]
+ - Touches: [specific file/function/module]
+ - Output: [specific code/doc/test artifact]
+ - Done when: [observable completion condition]
+ - Verify with: [focused command/check/manual inspection]
+ - Unit drift check: [UT-xx target unit + oracle, or N/A with reason]
+ - Do not: [explicit implementation-drift guardrail]

  ## **Task 2: [Task Title]**

- [Describe task purpose and requirement mapping (for example: maps to R2.x, core objective is [one sentence]).]
+ Purpose: [one sentence describing the narrow outcome]
+ Requirements: [R2.x]
+ Allowed scope: [files/modules/functions this task may touch]
+ Out of scope: [files/modules/behaviors this task must not change]

  - 2. [ ] [Main task item]
- - 2.1 [ ] [Subtask item]
- - 2.2 [ ] [Subtask item]
+ - Input: [requirement/design/contract evidence]
+ - Touches: [specific file/function/module]
+ - Output: [specific code/doc/test artifact]
+ - Done when: [observable completion condition]
+ - Verify with: [focused command/check/manual inspection]
+ - Unit drift check: [UT-xx target unit + oracle, or N/A with reason]
+ - Do not: [explicit implementation-drift guardrail]

  ## **Task 3: [Task Title]**

- [Describe task purpose and requirement mapping (for example: maps to R3.x, core objective is [one sentence]).]
+ Purpose: [one sentence describing the narrow outcome]
+ Requirements: [R3.x]
+ Allowed scope: [files/modules/functions this task may touch]
+ Out of scope: [files/modules/behaviors this task must not change]

  - 3. [ ] [Main task item]
- - 3.1 [ ] [Subtask item]
- - 3.2 [ ] [Subtask item]
+ - Input: [requirement/design/contract evidence]
+ - Touches: [specific file/function/module]
+ - Output: [specific code/doc/test artifact]
+ - Done when: [observable completion condition]
+ - Verify with: [focused command/check/manual inspection]
+ - Unit drift check: [UT-xx target unit + oracle, or N/A with reason]
+ - Do not: [explicit implementation-drift guardrail]

  ## Notes
  - Task order should reflect actual implementation sequence.
  - Every main task must map back to `spec.md` requirement IDs.
+ - Treat `tasks.md` as an implementation queue, not a high-level work summary.
+ - Each checkbox must be atomic: one verb, one responsibility, one concrete output, and one verification hook.
+ - Split any task that needs more than three files, more than one behavior slice, or a design decision not already captured in `design.md` or `contract.md`.
+ - Use `$test-case-strategy` to define test IDs and unit drift checks before implementation.
  - Include explicit tasks for required test coverage (unit, regression, property-based, integration/E2E as applicable), mock scenario setup, and adversarial/edge-case hardening.
+ - Do not write vague tasks such as `Implement integration`, `Add tests`, or `Update docs`; replace them with task-local outputs, test IDs, and verification commands.
  - For batch specs, tasks must never include "wait for Spec X to land first" as a prerequisite; if such a dependency appears, re-slice the plan or move the coordination rule into `coordination.md`.
  - After execution, the agent must update each checkbox (`[x]` for done, `[ ]` for not done).
  - Remove all placeholder guidance text in square brackets after filling.
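The reworked `tasks.md` template gives every checkbox seven labelled sub-items. One way to picture the "implementation queue" requirement is a completeness check over a filled-in file. The sketch below is hypothetical and not shipped with the package: the field labels are copied from the template, while `missing_fields`, the regular expressions, and the assumption that sub-items are written as `- Label: value` bullets are illustrative choices.

```python
import re

# Field labels copied from the tasks.md template above.
REQUIRED_FIELDS = (
    "Input", "Touches", "Output", "Done when",
    "Verify with", "Unit drift check", "Do not",
)

# Matches "- 1. [ ] ..." and "- 1.1 [x] ..." checkbox lines.
ITEM_RE = re.compile(r"^\s*- (\d+(?:\.\d+)?)\.? \[[ xX]\] ")
# Matches "- Label: value" sub-items (assumed layout).
FIELD_RE = re.compile(r"^\s*- ([A-Za-z ]+): ")


def missing_fields(tasks_md: str) -> dict[str, list[str]]:
    """Map each checkbox ID to the required field labels it lacks."""
    gaps: dict[str, list[str]] = {}
    current = None
    for line in tasks_md.splitlines():
        item = ITEM_RE.match(line)
        if item:
            current = item.group(1)
            gaps[current] = list(REQUIRED_FIELDS)
            continue
        label = FIELD_RE.match(line)
        if current and label and label.group(1) in gaps[current]:
            gaps[current].remove(label.group(1))
    # Keep only the items that still have gaps.
    return {task_id: rest for task_id, rest in gaps.items() if rest}
```

An empty result means every checkbox carries all seven labels; it says nothing about whether the values are concrete, which the template's notes still leave to the author and reviewer.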
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@laitszkin/apollo-toolkit",
- "version": "3.3.5",
+ "version": "3.4.0",
  "description": "Apollo Toolkit npm installer for managed skill copying across Codex, OpenClaw, and Trae.",
  "license": "MIT",
  "author": "LaiTszKin",
package/test-case-strategy/LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2026 LaiTszKin
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.