@laitszkin/apollo-toolkit 2.0.0

Files changed (204)
  1. package/AGENTS.md +62 -0
  2. package/CHANGELOG.md +100 -0
  3. package/LICENSE +21 -0
  4. package/README.md +144 -0
  5. package/align-project-documents/SKILL.md +94 -0
  6. package/align-project-documents/agents/openai.yaml +4 -0
  7. package/analyse-app-logs/LICENSE +21 -0
  8. package/analyse-app-logs/README.md +126 -0
  9. package/analyse-app-logs/SKILL.md +121 -0
  10. package/analyse-app-logs/agents/openai.yaml +4 -0
  11. package/analyse-app-logs/references/investigation-checklist.md +58 -0
  12. package/analyse-app-logs/references/log-signal-patterns.md +52 -0
  13. package/answering-questions-with-research/SKILL.md +46 -0
  14. package/answering-questions-with-research/agents/openai.yaml +4 -0
  15. package/bin/apollo-toolkit.js +7 -0
  16. package/commit-and-push/LICENSE +21 -0
  17. package/commit-and-push/README.md +26 -0
  18. package/commit-and-push/SKILL.md +70 -0
  19. package/commit-and-push/agents/openai.yaml +4 -0
  20. package/commit-and-push/references/branch-naming.md +15 -0
  21. package/commit-and-push/references/commit-messages.md +19 -0
  22. package/deep-research-topics/LICENSE +21 -0
  23. package/deep-research-topics/README.md +43 -0
  24. package/deep-research-topics/SKILL.md +84 -0
  25. package/deep-research-topics/agents/openai.yaml +4 -0
  26. package/develop-new-features/LICENSE +21 -0
  27. package/develop-new-features/README.md +52 -0
  28. package/develop-new-features/SKILL.md +105 -0
  29. package/develop-new-features/agents/openai.yaml +4 -0
  30. package/develop-new-features/references/testing-e2e.md +35 -0
  31. package/develop-new-features/references/testing-integration.md +42 -0
  32. package/develop-new-features/references/testing-property-based.md +44 -0
  33. package/develop-new-features/references/testing-unit.md +37 -0
  34. package/discover-edge-cases/CHANGELOG.md +19 -0
  35. package/discover-edge-cases/LICENSE +21 -0
  36. package/discover-edge-cases/README.md +87 -0
  37. package/discover-edge-cases/SKILL.md +124 -0
  38. package/discover-edge-cases/agents/openai.yaml +4 -0
  39. package/discover-edge-cases/references/architecture-edge-cases.md +41 -0
  40. package/discover-edge-cases/references/code-edge-cases.md +46 -0
  41. package/docs-to-voice/.env.example +106 -0
  42. package/docs-to-voice/CHANGELOG.md +71 -0
  43. package/docs-to-voice/LICENSE +21 -0
  44. package/docs-to-voice/README.md +118 -0
  45. package/docs-to-voice/SKILL.md +107 -0
  46. package/docs-to-voice/agents/openai.yaml +4 -0
  47. package/docs-to-voice/scripts/docs_to_voice.py +1385 -0
  48. package/docs-to-voice/scripts/docs_to_voice.sh +11 -0
  49. package/docs-to-voice/tests/test_docs_to_voice_api_max_chars.py +210 -0
  50. package/docs-to-voice/tests/test_docs_to_voice_sentence_timeline.py +115 -0
  51. package/docs-to-voice/tests/test_docs_to_voice_settings.py +43 -0
  52. package/docs-to-voice/tests/test_docs_to_voice_speech_rate.py +57 -0
  53. package/enhance-existing-features/CHANGELOG.md +35 -0
  54. package/enhance-existing-features/LICENSE +21 -0
  55. package/enhance-existing-features/README.md +54 -0
  56. package/enhance-existing-features/SKILL.md +120 -0
  57. package/enhance-existing-features/agents/openai.yaml +4 -0
  58. package/enhance-existing-features/references/e2e-tests.md +25 -0
  59. package/enhance-existing-features/references/integration-tests.md +30 -0
  60. package/enhance-existing-features/references/property-based-tests.md +33 -0
  61. package/enhance-existing-features/references/unit-tests.md +29 -0
  62. package/feature-propose/LICENSE +21 -0
  63. package/feature-propose/README.md +23 -0
  64. package/feature-propose/SKILL.md +107 -0
  65. package/feature-propose/agents/openai.yaml +4 -0
  66. package/feature-propose/references/enhancement-features.md +25 -0
  67. package/feature-propose/references/important-features.md +25 -0
  68. package/feature-propose/references/mvp-features.md +25 -0
  69. package/feature-propose/references/performance-features.md +25 -0
  70. package/financial-research/SKILL.md +208 -0
  71. package/financial-research/agents/openai.yaml +4 -0
  72. package/financial-research/assets/weekly_market_report_template.md +45 -0
  73. package/fix-github-issues/SKILL.md +98 -0
  74. package/fix-github-issues/agents/openai.yaml +4 -0
  75. package/fix-github-issues/scripts/list_issues.py +148 -0
  76. package/fix-github-issues/tests/test_list_issues.py +127 -0
  77. package/generate-spec/LICENSE +21 -0
  78. package/generate-spec/README.md +61 -0
  79. package/generate-spec/SKILL.md +96 -0
  80. package/generate-spec/agents/openai.yaml +4 -0
  81. package/generate-spec/references/templates/checklist.md +78 -0
  82. package/generate-spec/references/templates/spec.md +55 -0
  83. package/generate-spec/references/templates/tasks.md +35 -0
  84. package/generate-spec/scripts/create-specs +123 -0
  85. package/harden-app-security/CHANGELOG.md +27 -0
  86. package/harden-app-security/LICENSE +21 -0
  87. package/harden-app-security/README.md +46 -0
  88. package/harden-app-security/SKILL.md +127 -0
  89. package/harden-app-security/agents/openai.yaml +4 -0
  90. package/harden-app-security/references/agent-attack-catalog.md +117 -0
  91. package/harden-app-security/references/common-software-attack-catalog.md +168 -0
  92. package/harden-app-security/references/red-team-extreme-scenarios.md +81 -0
  93. package/harden-app-security/references/risk-checklist.md +78 -0
  94. package/harden-app-security/references/security-test-patterns-agent.md +101 -0
  95. package/harden-app-security/references/security-test-patterns-finance.md +88 -0
  96. package/harden-app-security/references/test-snippets.md +73 -0
  97. package/improve-observability/SKILL.md +114 -0
  98. package/improve-observability/agents/openai.yaml +4 -0
  99. package/learn-skill-from-conversations/CHANGELOG.md +15 -0
  100. package/learn-skill-from-conversations/LICENSE +22 -0
  101. package/learn-skill-from-conversations/README.md +47 -0
  102. package/learn-skill-from-conversations/SKILL.md +85 -0
  103. package/learn-skill-from-conversations/agents/openai.yaml +4 -0
  104. package/learn-skill-from-conversations/scripts/extract_recent_conversations.py +369 -0
  105. package/learn-skill-from-conversations/tests/test_extract_recent_conversations.py +176 -0
  106. package/learning-error-book/SKILL.md +112 -0
  107. package/learning-error-book/agents/openai.yaml +4 -0
  108. package/learning-error-book/assets/error_book_template.md +66 -0
  109. package/learning-error-book/scripts/render_markdown_to_pdf.py +367 -0
  110. package/lib/cli.js +338 -0
  111. package/lib/installer.js +225 -0
  112. package/maintain-project-constraints/SKILL.md +109 -0
  113. package/maintain-project-constraints/agents/openai.yaml +4 -0
  114. package/maintain-skill-catalog/README.md +18 -0
  115. package/maintain-skill-catalog/SKILL.md +66 -0
  116. package/maintain-skill-catalog/agents/openai.yaml +4 -0
  117. package/novel-to-short-video/CHANGELOG.md +53 -0
  118. package/novel-to-short-video/LICENSE +21 -0
  119. package/novel-to-short-video/README.md +63 -0
  120. package/novel-to-short-video/SKILL.md +233 -0
  121. package/novel-to-short-video/agents/openai.yaml +4 -0
  122. package/novel-to-short-video/references/plan-template.md +71 -0
  123. package/novel-to-short-video/references/roles-json.md +41 -0
  124. package/open-github-issue/LICENSE +21 -0
  125. package/open-github-issue/README.md +97 -0
  126. package/open-github-issue/SKILL.md +119 -0
  127. package/open-github-issue/agents/openai.yaml +4 -0
  128. package/open-github-issue/scripts/open_github_issue.py +380 -0
  129. package/open-github-issue/tests/test_open_github_issue.py +159 -0
  130. package/open-source-pr-workflow/CHANGELOG.md +32 -0
  131. package/open-source-pr-workflow/LICENSE +21 -0
  132. package/open-source-pr-workflow/README.md +23 -0
  133. package/open-source-pr-workflow/SKILL.md +123 -0
  134. package/open-source-pr-workflow/agents/openai.yaml +4 -0
  135. package/openai-text-to-image-storyboard/.env.example +10 -0
  136. package/openai-text-to-image-storyboard/CHANGELOG.md +49 -0
  137. package/openai-text-to-image-storyboard/LICENSE +21 -0
  138. package/openai-text-to-image-storyboard/README.md +99 -0
  139. package/openai-text-to-image-storyboard/SKILL.md +107 -0
  140. package/openai-text-to-image-storyboard/agents/openai.yaml +4 -0
  141. package/openai-text-to-image-storyboard/scripts/generate_storyboard_images.py +763 -0
  142. package/package.json +36 -0
  143. package/record-spending/SKILL.md +113 -0
  144. package/record-spending/agents/openai.yaml +4 -0
  145. package/record-spending/references/account-format.md +33 -0
  146. package/record-spending/references/workbook-layout.md +84 -0
  147. package/resolve-review-comments/SKILL.md +122 -0
  148. package/resolve-review-comments/agents/openai.yaml +4 -0
  149. package/resolve-review-comments/references/adoption-criteria.md +23 -0
  150. package/resolve-review-comments/scripts/review_threads.py +425 -0
  151. package/resolve-review-comments/tests/test_review_threads.py +74 -0
  152. package/review-change-set/LICENSE +21 -0
  153. package/review-change-set/README.md +55 -0
  154. package/review-change-set/SKILL.md +103 -0
  155. package/review-change-set/agents/openai.yaml +4 -0
  156. package/review-codebases/LICENSE +21 -0
  157. package/review-codebases/README.md +67 -0
  158. package/review-codebases/SKILL.md +109 -0
  159. package/review-codebases/agents/openai.yaml +4 -0
  160. package/scripts/install_skills.ps1 +283 -0
  161. package/scripts/install_skills.sh +262 -0
  162. package/scripts/validate_openai_agent_config.py +194 -0
  163. package/scripts/validate_skill_frontmatter.py +110 -0
  164. package/specs-to-project-docs/LICENSE +21 -0
  165. package/specs-to-project-docs/README.md +57 -0
  166. package/specs-to-project-docs/SKILL.md +111 -0
  167. package/specs-to-project-docs/agents/openai.yaml +4 -0
  168. package/specs-to-project-docs/references/templates/architecture.md +29 -0
  169. package/specs-to-project-docs/references/templates/configuration.md +29 -0
  170. package/specs-to-project-docs/references/templates/developer-guide.md +33 -0
  171. package/specs-to-project-docs/references/templates/docs-index.md +39 -0
  172. package/specs-to-project-docs/references/templates/features.md +25 -0
  173. package/specs-to-project-docs/references/templates/getting-started.md +38 -0
  174. package/specs-to-project-docs/references/templates/readme.md +49 -0
  175. package/systematic-debug/LICENSE +21 -0
  176. package/systematic-debug/README.md +81 -0
  177. package/systematic-debug/SKILL.md +59 -0
  178. package/systematic-debug/agents/openai.yaml +4 -0
  179. package/text-to-short-video/.env.example +36 -0
  180. package/text-to-short-video/LICENSE +21 -0
  181. package/text-to-short-video/README.md +82 -0
  182. package/text-to-short-video/SKILL.md +221 -0
  183. package/text-to-short-video/agents/openai.yaml +4 -0
  184. package/text-to-short-video/scripts/enforce_video_aspect_ratio.py +350 -0
  185. package/version-release/CHANGELOG.md +53 -0
  186. package/version-release/LICENSE +21 -0
  187. package/version-release/README.md +28 -0
  188. package/version-release/SKILL.md +94 -0
  189. package/version-release/agents/openai.yaml +4 -0
  190. package/version-release/references/branch-naming.md +15 -0
  191. package/version-release/references/changelog-writing.md +8 -0
  192. package/version-release/references/commit-messages.md +19 -0
  193. package/version-release/references/readme-writing.md +12 -0
  194. package/version-release/references/semantic-versioning.md +12 -0
  195. package/video-production/CHANGELOG.md +104 -0
  196. package/video-production/LICENSE +18 -0
  197. package/video-production/README.md +68 -0
  198. package/video-production/SKILL.md +213 -0
  199. package/video-production/agents/openai.yaml +4 -0
  200. package/video-production/references/plan-template.md +54 -0
  201. package/video-production/references/roles-json.md +41 -0
  202. package/weekly-financial-event-report/SKILL.md +195 -0
  203. package/weekly-financial-event-report/agents/openai.yaml +4 -0
  204. package/weekly-financial-event-report/assets/financial_event_report_template.md +53 -0
@@ -0,0 +1,52 @@
1
+ # develop-new-features
2
+
3
+ A spec-first feature development skill for new behavior and greenfield work. It delegates shared planning-doc generation to `generate-spec`, then implements the approved feature with risk-driven testing.
4
+
5
+ ## Key capabilities
6
+
7
+ - Requires `generate-spec` before any implementation starts.
8
+ - Treats `spec.md`, `tasks.md`, and `checklist.md` as approval-gated artifacts, not optional notes.
9
+ - Covers unit, regression, property-based, integration, E2E, and adversarial testing based on actual risk.
10
+ - Reuses existing architecture and avoids speculative expansion.
11
+ - Backfills planning docs after implementation and testing complete.
12
+
13
+ ## Repository layout
14
+
15
+ ```text
16
+ .
17
+ ├── SKILL.md
18
+ ├── README.md
19
+ ├── LICENSE
20
+ ├── agents/
21
+ │   └── openai.yaml
22
+ └── references/
23
+     ├── testing-unit.md
24
+     ├── testing-property-based.md
25
+     ├── testing-integration.md
26
+     └── testing-e2e.md
27
+ ```
28
+
29
+ ## Workflow summary
30
+
31
+ 1. Review only the official docs and code paths needed for the feature.
32
+ 2. Run `generate-spec` to create and maintain `docs/plans/{YYYY-MM-DD}_{change_name}/`.
33
+ 3. Wait for explicit approval on the spec set.
34
+ 4. Implement the approved behavior with minimal changes.
35
+ 5. Run risk-driven tests and backfill `tasks.md` and `checklist.md`.
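
The plan directory named in step 2 could be derived as in the sketch below. This is only an illustration under stated assumptions: the helper names `slugify`, `plan_dir`, and `plan_files` are hypothetical and not part of `generate-spec`, and the lowercase, underscore-separated form of `change_name` is a guess at the intended convention.

```python
import re
from datetime import date
from pathlib import PurePosixPath


def slugify(change_name: str) -> str:
    """Hypothetical helper: lowercase and join word runs with underscores."""
    words = re.findall(r"[a-z0-9]+", change_name.lower())
    return "_".join(words)


def plan_dir(change_name: str, today: date) -> PurePosixPath:
    """Build docs/plans/{YYYY-MM-DD}_{change_name} (slug format is an assumption)."""
    return PurePosixPath("docs/plans") / f"{today.isoformat()}_{slugify(change_name)}"


def plan_files(change_name: str, today: date) -> list[str]:
    """List the three approval-gated artifacts inside the plan directory."""
    base = plan_dir(change_name, today)
    return [str(base / name) for name in ("spec.md", "tasks.md", "checklist.md")]
```

Passing the date in explicitly, rather than calling `date.today()` inside the helper, keeps the function deterministic and easy to test.
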
36
+
37
+ ## Testing expectations
38
+
39
+ - Unit: changed logic, boundaries, failure paths.
40
+ - Regression: pin down bug-prone or high-risk behavior.
41
+ - Property-based: required for business logic unless a concrete `N/A` reason is recorded.
42
+ - Integration: cover the user-critical logic chain.
43
+ - E2E: cover the most important success and denial/failure paths when justified.
44
+ - Adversarial: include abuse, malformed input, privilege, replay, concurrency, and edge-combination cases when relevant.
45
+
46
+ ## References
47
+
48
+ - Shared planning workflow: `generate-spec`
49
+ - Unit testing guide: `references/testing-unit.md`
50
+ - Property-based testing guide: `references/testing-property-based.md`
51
+ - Integration testing guide: `references/testing-integration.md`
52
+ - E2E testing guide: `references/testing-e2e.md`
@@ -0,0 +1,105 @@
1
+ ---
2
+ name: develop-new-features
3
+ description: >-
4
+   Spec-first feature development workflow for new behavior and greenfield
5
+   features. Depends on `generate-spec` for shared planning artifacts before
6
+   coding, then implements the approved feature with risk-driven test coverage.
7
+   Use when users ask to design or implement new features, change product
8
+   behavior, request a planning-first process, or ask for a greenfield feature.
9
+   Tests must not stop at happy-path validation: for business-logic changes
10
+   require property-based testing unless explicitly `N/A` with reason, design
11
+   adversarial/regression/authorization/idempotency/concurrency coverage where
12
+   relevant, use mocks for external services in logic chains, and verify
13
+   meaningful business outcomes rather than smoke-only success.
14
+ ---
15
+
16
+ # Develop New Features
17
+
18
+ ## Dependencies
19
+
20
+ - Required: `generate-spec` for `spec.md`, `tasks.md`, `checklist.md`, clarification handling, approval gating, and status backfill.
21
+ - Conditional: none.
22
+ - Optional: none.
23
+ - Fallback: If `generate-spec` is unavailable, stop and report the missing dependency.
24
+
25
+ ## Standards
26
+
27
+ - Evidence: Review authoritative docs and the existing codebase before planning or implementation.
28
+ - Execution: Run `generate-spec` for every new feature or product-behavior change, obtain approval, then implement minimally.
29
+ - Quality: Add risk-based tests with property-based, regression, integration, E2E, adversarial, and rollback coverage when relevant.
30
+ - Output: Keep the approved planning artifacts and the final implementation aligned with actual completion results.
31
+
32
+ ## Goal
33
+
34
+ Use a shared spec-generation workflow for all new feature work, then implement the approved behavior with strong test coverage and minimal rework.
35
+
36
+ ## Workflow
37
+
38
+ ### 1) Review authoritative docs first
39
+
40
+ - Identify the stack, libraries, APIs, and external dependencies involved.
41
+ - Use official documentation as the source of truth.
42
+ - Prefer Context7 for framework/library APIs; use the web for the latest official docs when needed.
43
+ - Record only the references required for this feature.
44
+
45
+ ### 2) Run `$generate-spec`
46
+
47
+ - Specs are mandatory for every new feature, product behavior change, and greenfield project.
48
+ - Follow `$generate-spec` completely for:
49
+   - generating `docs/plans/{YYYY-MM-DD}_{change_name}/spec.md`, `tasks.md`, and `checklist.md`
50
+   - filling BDD requirements and risk-driven test plans
51
+   - handling clarification responses
52
+   - obtaining explicit approval before coding
53
+   - backfilling document status after implementation and testing
54
+ - Do not modify product code before the approved spec set exists.
55
+
56
+ ### 3) Explore architecture and reuse opportunities
57
+
58
+ - Trace entrypoints, module boundaries, data flow, and integration points relevant to the new behavior.
59
+ - Identify reusable components, patterns, and configuration paths before adding new code.
60
+ - Keep a concise map of likely files to modify so implementation stays scoped.
61
+
62
+ ### 4) Implement after approval
63
+
64
+ - Reuse existing patterns and abstractions when possible.
65
+ - Keep changes focused and avoid speculative scope expansion.
66
+ - Update environment examples only when new inputs are actually required.
67
+
68
+ ### 5) Testing coverage (required)
69
+
70
+ For every non-trivial change, evaluate all categories and either add test cases or record a justified `N/A`:
71
+ - Start from a risk inventory, not from the happy path: assess misuse/abuse, authorization, invalid transitions, idempotency, replay/duplication, concurrency/races, data-integrity, and partial-failure/rollback risks.
72
+ - Unit tests: changed logic, boundaries, failure paths, and exact error/side-effect expectations.
73
+ - Regression tests: bug-prone or high-risk behavior that should never silently regress again.
74
+ - Property-based tests: required for business-logic changes unless truly unsuitable; use them for invariants, generated business input spaces, state-machine/metamorphic checks when useful, and output expectation checks.
75
+ - Integration tests: user-critical logic chain across modules/layers.
76
+ - E2E tests: key user-visible path impacted by this change; prefer one minimal critical success path plus one highest-value denial/failure path when the risk warrants it.
77
+ - Adversarial/penetration-style cases: abuse paths, malformed inputs, forged identities/privileges, invalid transitions, replay/duplication, stale/out-of-order events, toxic payload sizes, and risky edge combinations.
78
+
79
+ Rules:
80
+ - If E2E is too costly or unstable, add stronger integration coverage for the same risk and record the reason in the checklist.
81
+ - If property-based testing is not suitable, record `N/A` with a concrete reason.
82
+ - For logic chains with external services, mock or fake those services unless the real contract itself is under test; simulate diverse external states and verify the business chain remains correct.
83
+ - Where the feature can partially commit work, test rollback, compensation, or no-partial-write behavior explicitly.
84
+ - Each test must assert a meaningful oracle: exact business output, persisted state, emitted side effects, or intentional lack of side effects. Avoid assertion-light smoke tests and snapshot-only coverage.
85
+ - Run relevant tests when possible and fix failures.
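
The rollback and meaningful-oracle rules above can be illustrated with a small test. This is a minimal sketch, assuming a hypothetical in-memory `Ledger` class invented for this example; real features would apply the same idea to their own persistence layer.

```python
class InsufficientFunds(Exception):
    pass


class Ledger:
    """Hypothetical in-memory ledger used only for this illustration."""

    def __init__(self, balances):
        self.balances = dict(balances)
        self.audit = []  # emitted side effects: one entry per committed transfer

    def transfer(self, src, dst, amount):
        snapshot = dict(self.balances)  # saved state for rollback
        try:
            if self.balances[src] < amount:
                raise InsufficientFunds(src)
            self.balances[src] -= amount
            self.balances[dst] += amount  # KeyError if dst is unknown
            self.audit.append((src, dst, amount))
        except Exception:
            self.balances = snapshot  # no partial write survives a failure
            raise


def test_failed_credit_leaves_no_partial_write():
    ledger = Ledger({"alice": 100})
    try:
        ledger.transfer("alice", "ghost", 30)
    except KeyError:
        pass
    else:
        raise AssertionError("expected KeyError for unknown destination")
    # Oracle: persisted state and side effects, not just "an error happened".
    assert ledger.balances == {"alice": 100}
    assert ledger.audit == []
```

The debit succeeds before the credit fails, so a test that only checked for the exception would miss a half-applied transfer; asserting the final balances and the empty audit log is what catches it.
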
86
+
87
+ ### 6) Completion updates
88
+
89
+ - Backfill `tasks.md` and `checklist.md` through the `$generate-spec` workflow after implementation and testing.
90
+ - Report the implemented scope, test execution, and any concrete `N/A` reasons.
91
+
92
+ ## Working Rules
93
+
94
+ - By default, write planning docs in the user's language.
95
+ - Keep implementation traceable to approved requirement IDs and planned risks.
96
+ - Prefer realism over rigid templates: add or remove test coverage only when the risk profile justifies it.
97
+ - Every planned test should justify a distinct risk; remove shallow duplicates that only prove the code "still runs".
98
+
99
+ ## References
100
+
101
+ - `$generate-spec`: shared planning and approval workflow.
102
+ - `references/testing-unit.md`: unit testing principles.
103
+ - `references/testing-property-based.md`: property-based testing principles.
104
+ - `references/testing-integration.md`: integration testing principles.
105
+ - `references/testing-e2e.md`: E2E decision and design principles.
@@ -0,0 +1,4 @@
1
+ interface:
2
+   display_name: "Develop New Features"
3
+   short_description: "Spec-first feature development that depends on generate-spec"
4
+   default_prompt: "Use $develop-new-features to design new behavior through a spec-first workflow: review the required external docs, run $generate-spec to create and maintain docs/plans/<date>_<change_name>/{spec.md,tasks.md,checklist.md}, wait for explicit approval, then implement the approved feature with risk-driven tests and backfill the planning docs after execution."
@@ -0,0 +1,35 @@
1
+ # E2E Testing Principles
2
+
3
+ ## Core rules
4
+ - E2E is not decided solely by explicit user request.
5
+ - The agent must decide E2E based on feature importance, complexity, and cross-layer risk.
6
+ - For high-risk key user paths, create the smallest necessary E2E coverage first.
7
+ - If E2E is unstable, too costly, or environment-limited, add integration coverage for equivalent risk and record the alternative.
8
+
9
+ ## Purpose
10
+ - Verify critical end-to-end user paths are usable.
11
+ - Catch behavior gaps after cross-system/cross-layer integration.
12
+ - Provide confidence close to real usage for high-risk scenarios.
13
+
14
+ ## Decision criteria
15
+ - Importance: core feature, critical revenue flow, or high-impact process.
16
+ - Complexity: multi-step state transitions, branching flows, cross-service collaboration.
17
+ - Risk: historical regressions, fragile integrations, major user-visible failures.
18
+ - Maintainability: stable environment and controllable test data.
19
+
20
+ ## Not suitable when
21
+ - Feature risk is low and unit/integration tests already cover it sufficiently.
22
+ - E2E is unstable and disproportionately expensive while integration tests can cover key risk.
23
+
24
+ ## Design guidance
25
+ - Cover only the most critical paths; avoid expanding into full UI test suites.
26
+ - Keep test data controllable (fixed seeds or recyclable fixtures).
27
+ - Prioritize stability; avoid brittle external dependencies, use controlled substitutes if needed.
28
+ - Prefer one critical success path and one highest-value denial/failure path over many shallow happy-path journeys.
29
+ - Assert business-visible outcomes, not just DOM presence: final state, permission denial, user-facing error, persisted result, or prevented duplicate action.
30
+ - Keep cost decisions explicit: document why E2E is done or not done and what alternative strategy is used.
31
+
32
+ ## Spec/checklist authoring hints
33
+ - Mark high-risk key paths in `spec.md` requirement descriptions.
34
+ - Record E2E decisions, mapped test cases, and results in `checklist.md`.
35
+ - If skipping E2E, specify replacement integration test cases (`IT-xx`) and rationale in `checklist.md`.
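
The last hint could be checked mechanically. The sketch below is illustrative only: the `E2EDecision` record and the `problems` function are hypothetical and do not describe a format that `checklist.md` or `generate-spec` defines. It flags any skipped E2E case that lacks replacement `IT-xx` cases or a rationale.

```python
from dataclasses import dataclass, field


@dataclass
class E2EDecision:
    """Hypothetical record of one E2E decision; not a format defined by the skill."""
    case_id: str                 # e.g. "E2E-01"
    done: bool
    rationale: str = ""
    replacements: list = field(default_factory=list)  # e.g. ["IT-03"]


def problems(decisions):
    """Return readable gaps: skipped E2E without IT replacement or rationale."""
    found = []
    for d in decisions:
        if d.done:
            continue
        if not d.replacements:
            found.append(f"{d.case_id}: skipped without replacement integration tests")
        if not d.rationale.strip():
            found.append(f"{d.case_id}: skipped without rationale")
    return found
```
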
@@ -0,0 +1,42 @@
1
+ # Integration Testing Principles
2
+
3
+ ## Purpose
4
+ - Verify collaboration across modules/layers and external dependencies.
5
+ - Cover integration risks unit tests cannot capture (sequence, config, IO failure).
6
+ - Validate user-critical business logic chains under realistic component interaction and controlled external-service scenarios.
7
+
8
+ ## When to use
9
+ - Interface interactions between modules (for example service ↔ repository).
10
+ - Changes touching IO dependencies such as DB, RPC, files, cache, queues.
11
+ - Behaviors that depend on configuration combinations or environment differences.
12
+ - The correctness question is about the whole business logic chain rather than one isolated function.
13
+ - As the minimum safety net when E2E is not suitable.
14
+
15
+ ## Not suitable when
16
+ - Single pure-function or pure-logic behavior (use unit tests).
17
+ - Full end-to-end user flow can be stably covered by E2E.
18
+
19
+ ## Relationship with E2E
20
+ - If change importance/complexity is high and E2E is feasible, prefer minimal E2E for key paths.
21
+ - If E2E is hard or too costly, integration tests must cover equivalent key risks.
22
+ - Record replacement mapping in `checklist.md` (E2E-xx ↔ IT-xx) with rationale.
23
+
24
+ ## Design guidance
25
+ - Focus on high-value integration points; each test should justify risk/value.
26
+ - Keep dependencies inside the application boundary near-real where practical.
27
+ - Mock/fake external services at the business-chain boundary unless the real service contract itself is what needs verification.
28
+ - Build scenario matrices for external states such as success, timeout, retries exhausted, partial data, stale data, duplicate callbacks, inconsistent responses, and permission failures.
29
+ - Add adversarial/penetration-style cases for abuse paths such as invalid transitions, replay, double-submit, forged identifiers, or out-of-order events when those risks exist.
30
+ - When workflows can partially commit, assert rollback/compensation/no-partial-write behavior instead of only final status codes.
31
+ - Assert business outcomes across boundaries: persisted state, emitted events, deduplication, retry accounting, audit trail, or intentional absence of writes/notifications.
32
+ - Add at least one regression-style integration test for the highest-risk chain whenever the change fixes a bug or touches a historically fragile path.
33
+ - Keep tests reproducible: use controlled test data and a recoverable environment.
34
+ - Keep cost controlled; avoid broad redundant coverage (leave that to unit tests).
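
A scenario matrix over external states, as described in the guidance above, might look like the following. This is a sketch under assumptions: `FakeGateway` and `place_order` are hypothetical names invented for the example, and the retry policy (three attempts, rejection is final) is chosen only to make the matrix concrete.

```python
class FakeGateway:
    """Hypothetical fake of an external payment service with scripted states."""

    def __init__(self, responses):
        self.responses = list(responses)
        self.calls = 0

    def charge(self, order_id, amount):
        self.calls += 1
        outcome = self.responses.pop(0)
        if outcome == "timeout":
            raise TimeoutError(order_id)
        return outcome  # "ok" or "rejected"


def place_order(gateway, orders, order_id, amount, max_attempts=3):
    """Business chain under test: charge with bounded retries, then persist."""
    for _ in range(max_attempts):
        try:
            result = gateway.charge(order_id, amount)
        except TimeoutError:
            continue
        if result == "ok":
            orders[order_id] = "paid"
            return "paid"
        return "declined"  # a rejection is final: no retry, no write
    return "failed"  # retries exhausted: nothing persisted


# Scenario matrix: scripted external states -> (status, persisted state, call count)
SCENARIOS = [
    (["ok"], "paid", {"o1": "paid"}, 1),
    (["timeout", "ok"], "paid", {"o1": "paid"}, 2),
    (["timeout", "timeout", "timeout"], "failed", {}, 3),
    (["rejected"], "declined", {}, 1),
]


def run_matrix():
    for responses, status, persisted, calls in SCENARIOS:
        gateway, orders = FakeGateway(responses), {}
        assert place_order(gateway, orders, "o1", 50) == status
        assert orders == persisted  # persisted state or intentional absence of writes
        assert gateway.calls == calls  # retry accounting
```

Each row asserts three things at once (returned status, persisted state, and number of external calls), so a regression that retries a rejected charge or writes an order after exhausted retries fails a specific row.
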
35
+
36
+ ## Spec/checklist authoring hints
37
+ - Dependency scope: list involved modules/external systems.
38
+ - Scenario: describe cross-module flow or critical branch.
39
+ - Risk: explain what integration failure, misconfiguration, or business-chain break this test can reveal.
40
+ - External dependency strategy: specify which services are mocked/faked versus near-real and why.
41
+ - Scenario matrix: list the external states or adversarial paths covered.
42
+ - Map behavior, test IDs, and test outcomes in `checklist.md`.
@@ -0,0 +1,44 @@
1
+ # Property-based Testing Principles
2
+
3
+ ## Purpose
4
+ - Verify invariants/properties hold across broad input spaces.
5
+ - Validate business rules by generating or exhaustively enumerating meaningful input spaces and checking outputs against expected business behavior.
6
+ - Catch combinational, adversarial, and boundary behaviors that fixed examples often miss.
7
+
8
+ ## When to use
9
+ - Algorithms, transformations, serialization/deserialization, sorting, aggregation.
10
+ - Behaviors requiring consistency or reversibility (for example round-trip).
11
+ - Data structures or state transitions with clear invariants.
12
+ - Business logic where the rule can be stated as input/output expectations, allowed states, forbidden states, or safety constraints.
13
+ - Logic chains that depend on external services, when those services can be replaced by controllable mocks/fakes and their states generated as part of the test space.
14
+
15
+ ## Not suitable when
16
+ - The main thing being validated is the real integration contract with external systems or live IO (use integration tests).
17
+ - UI/interactive flows without stable invariants.
18
+ - Very small discrete input spaces (unit tests are sufficient).
19
+
20
+ ## Design guidance
21
+ - Properties must be explicit and machine-verifiable, whether they are invariants, allowed outcome sets, rejection rules, or business-output predicates.
22
+ - Generators should cover normal cases, boundaries, extremes, malformed inputs, and suspicious/adversarial combinations.
23
+ - Prefer modeling business rules directly: generate inputs, run the logic, then assert the output/error/state transition matches the rule.
24
+ - When the behavior is stateful, prefer state-machine or sequence-based properties over isolated single-call generators.
25
+ - When exact outputs are hard to predict, use metamorphic properties (for example reordering, retrying, deduplicating, or replaying inputs should preserve an allowed relation).
26
+ - For external-service-dependent logic, mock/fake the service and generate multiple service states (success, timeout, empty, partial, stale, inconsistent, duplicate, rejected).
27
+ - Ensure reproducibility (fixed seed or replayable input generation) and preserve failing seeds/examples for regression coverage.
28
+ - Complement unit tests; avoid duplicating fixed-case tests.
29
+ - Control cost with reasonable sample counts and input-size limits.
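
The metamorphic and reproducibility points above can be sketched with the standard library alone. The `settle` function and its event tuples are hypothetical, invented for this example; a dedicated property-testing library would normally add shrinking and richer generators, which this hand-rolled loop does not attempt.

```python
import random


def settle(events):
    """Sum amounts per account, applying each event id at most once."""
    seen, totals = set(), {}
    for event_id, account, amount in events:
        if event_id in seen:
            continue  # replayed or duplicated event: ignore
        seen.add(event_id)
        totals[account] = totals.get(account, 0) + amount
    return totals


def check_replay_and_reorder_invariance(seed, cases=200):
    rng = random.Random(seed)  # fixed seed keeps failures replayable
    for case in range(cases):
        events = [
            (i, rng.choice("abc"), rng.randint(-1000, 1000))
            for i in range(rng.randint(0, 20))
        ]
        baseline = settle(events)
        # Metamorphic transform: duplicate a random subset, then shuffle.
        mutated = events + rng.sample(events, k=rng.randint(0, len(events)))
        rng.shuffle(mutated)
        # The failure message carries the seed and inputs for regression reuse.
        assert settle(mutated) == baseline, (seed, case, events)
```

The exact totals are never predicted here; the property only states that replaying and reordering events must not change them, which is the relation the guidance calls metamorphic.
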
30
+
31
+ ## Common property examples (description level)
32
+ - `deserialize(serialize(x)) == x`
33
+ - Sorted output is monotonic and preserves element multiset.
34
+ - Merge/split operations preserve total element count.
35
+ - Idempotency: repeating the same operation does not change results.
36
+ - Invalid or unauthorized generated inputs always fail with an expected error/result class.
37
+ - Generated order/payment/state-transition inputs always end in an allowed business state.
38
+ - Under generated mocked service states, the business logic chain still satisfies fallback/retry/compensation rules.
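
The first few examples in this list can be written as executable checks. The sketch below uses only the standard library with a seeded generator; `random_value`, `check_round_trip`, and `check_sort` are hypothetical helper names, and floats are deliberately left out of the generated values to keep the round-trip comparison exact.

```python
import json
import random
from collections import Counter


def random_value(rng, depth=0):
    """Generate a JSON-compatible value; floats are excluded by assumption."""
    kinds = ["int", "str", "bool", "none"] + (["list", "dict"] if depth < 3 else [])
    kind = rng.choice(kinds)
    if kind == "int":
        return rng.randint(-10**6, 10**6)
    if kind == "str":
        return "".join(rng.choice("abc é\n\"\\") for _ in range(rng.randint(0, 8)))
    if kind == "bool":
        return rng.random() < 0.5
    if kind == "none":
        return None
    if kind == "list":
        return [random_value(rng, depth + 1) for _ in range(rng.randint(0, 4))]
    return {str(i): random_value(rng, depth + 1) for i in range(rng.randint(0, 4))}


def check_round_trip(seed, cases=300):
    rng = random.Random(seed)
    for _ in range(cases):
        x = random_value(rng)
        assert json.loads(json.dumps(x)) == x  # deserialize(serialize(x)) == x


def check_sort(seed, cases=300):
    rng = random.Random(seed)
    for _ in range(cases):
        xs = [rng.randint(-50, 50) for _ in range(rng.randint(0, 30))]
        out = sorted(xs)
        assert all(a <= b for a, b in zip(out, out[1:]))  # monotonic
        assert Counter(out) == Counter(xs)  # same multiset of elements
        assert sorted(out) == out  # idempotent: sorting again changes nothing
```
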
39
+
40
+ ## Spec/checklist authoring hints
41
+ - Property/rule: one sentence stating the rule that must always hold or the allowed outcomes that must contain the result.
42
+ - Generator strategy: input range, distribution, emphasized boundaries, and any adversarial or external-state dimensions.
43
+ - Oracle/check: describe how the test decides correctness (predicate, allow-list, reference model, or expected error class).
44
+ - Purpose: explain correctness/risk reduction value of this property.
@@ -0,0 +1,37 @@
1
+ # Unit Testing Principles
2
+
3
+ ## Purpose
4
+ - Verify correctness of the smallest testable unit (function, method, or pure logic module).
5
+ - Provide fast feedback with low-cost failure localization.
6
+
7
+ ## When to use
8
+ - Core business logic and critical branches.
9
+ - Boundary conditions (upper/lower limits, null/empty, extreme values).
10
+ - Error handling and exception paths (invalid input, incompatible state, etc.).
11
+
12
+ ## Not suitable when
13
+ - Behavior requires cross-module or external dependency verification (use integration tests).
14
+ - Full user-flow validation is required (evaluate E2E first; if not suitable, use integration tests to cover risk).
15
+
16
+ ## Design guidance
17
+ - Isolate external dependencies with mock/stub/fake; avoid DB/RPC/file IO.
18
+ - Keep tests small and focused: one test, one behavior/failure mode.
19
+ - Do not stop at happy-path assertions; verify exact errors, rejected states, and intentional lack of side effects when the unit should block an action.
20
+ - Cover both success and failure branches.
21
+ - Where the input space is small and discrete, exhaustively enumerate business inputs and expected outputs.
22
+ - Prefer table-driven cases when many small business permutations share the same oracle.
23
+ - Add regression tests for bug-prone or high-risk logic so previously broken behavior cannot silently return.
24
+ - If the unit owns authorization, invalid transition, idempotency, or concurrency decisions, test those denials explicitly.
25
+ - Keep tests reproducible: avoid nondeterministic time/random/global state.
26
+ - Avoid assertion-light smoke tests and snapshot-only coverage unless the snapshot has a strict business oracle behind it.
27
+ - Map tests to requirements: each core requirement should have at least one unit test.
28
+
29
+ ## Spec/checklist authoring hints
30
+ - Scenario: describe input and initial state mapped to one requirement/boundary.
31
+ - Expected result: verifiable output, state change, or error.
32
+ - Purpose: explain which risk or bug type this test prevents.
33
+
34
+ ## Common examples (description level)
35
+ - Return a specific error when input is out of allowed range.
36
+ - Handle empty list/empty string input with expected behavior.
37
+ - Ensure output still matches its definition after a state or flag switch.
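The table-driven guidance and the "specific error when input is out of allowed range" example can be sketched with the standard `unittest` module. `validate_percentage` and `OutOfRangeError` are hypothetical names introduced only for illustration.

```python
import unittest

class OutOfRangeError(ValueError):
    """Hypothetical error class for inputs outside the allowed range."""

def validate_percentage(value):
    """Hypothetical unit under test: accept 0..100 inclusive, reject everything else."""
    if value is None:
        raise OutOfRangeError("value is required")
    if not 0 <= value <= 100:
        raise OutOfRangeError(f"{value} is outside 0..100")
    return value

class ValidatePercentageTest(unittest.TestCase):
    def test_accepted_boundaries(self):
        # Table of inputs that share one oracle: the value is returned unchanged.
        for value in (0, 1, 50, 99, 100):
            with self.subTest(value=value):
                self.assertEqual(validate_percentage(value), value)

    def test_rejected_inputs(self):
        # Failure branch: boundary neighbours and null must raise the exact error class.
        for value in (-1, 101, None):
            with self.subTest(value=value):
                with self.assertRaises(OutOfRangeError):
                    validate_percentage(value)
```

Each `subTest` reports its own failing row, so one broken boundary does not hide the others.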
@@ -0,0 +1,19 @@
1
+ # Changelog
2
+
3
+ All notable changes to this project will be documented in this file.
4
+
5
+ ## [v0.2.1] - 2026-02-17
6
+
7
+ ### Changed
8
+ - Remove `submit-changes` skill dependency from no-diff release flow while keeping the PR workflow.
9
+ - Update skill and README guidance to use direct git commit/push before opening a PR.
10
+
11
+ ## [v0.2.0] - 2026-02-17
12
+
13
+ ### Added
14
+ - Add no-diff workflow guidance to scan the whole codebase for actionable edge cases.
15
+ - Add release-flow guidance for no-diff fixes: create worktree, use `submit-changes`, and open a PR.
16
+
17
+ ### Changed
18
+ - Clarify scope selection logic: `git diff` path uses changed files only; no-diff path uses full-codebase scan.
19
+ - Expand README examples to include a no-diff prompt and expected execution path.
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 LaiTszKin
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,87 @@
1
+ # discover-edge-cases
2
+
3
+ `discover-edge-cases` is a Codex skill for discovering reproducible edge-case risks and coverage gaps.
4
+
5
+ ## Brief introduction
6
+
7
+ This skill is discovery-oriented. It scans the current diff by default, or the full codebase
8
+ when there is no diff, then validates the highest-risk edge cases with concrete evidence.
9
+ It does not write tests, patch code, or open PRs.
10
+
11
+ It follows a strict workflow:
12
+ 1. Detect whether `git diff` exists.
13
+ 2. Inspect only changed files plus minimal dependencies, or perform a full-project scan when no diff exists.
14
+ 3. Run `harden-app-security` as an adversarial dependency for code-affecting scope.
15
+ 4. Probe the highest-risk edge cases and gather concrete evidence.
16
+ 5. Reproduce confirmed issues at least twice and check nearby variants.
17
+ 6. Prioritize confirmed findings and report hardening guidance only.
18
+
19
+ ## When to use
20
+
21
+ Use this skill when a task asks you to:
22
+ - find edge-case risks in a diff or codebase,
23
+ - validate unusual inputs and error paths,
24
+ - assess hardening gaps around null/empty/boundary handling,
25
+ - review retries, timeouts, degradation paths, or stateful failure modes.
26
+
27
+ ## Core principles
28
+
29
+ - Scope is `git diff` plus the minimal dependency chain by default.
30
+ - If `git diff` is empty, run a full-codebase scan focused on high-risk modules.
31
+ - Treat prior authorship as irrelevant; even code written earlier in the same conversation must be challenged like third-party code.
32
+ - Decisions must be evidence-based; speculative ideas stay marked as hypotheses.
33
+ - Keep only reproducible findings with exact evidence.
34
+ - Run `harden-app-security` as a required adversarial cross-check for code-affecting scope.
35
+ - Report recommended fixes and test ideas, but do not implement them in this skill.
36
+
37
+ ## External API requirements
38
+
39
+ When the selected scope involves external API calls, this skill requires checks for:
40
+ - health/availability handling,
41
+ - graceful handling of `429` and `500` responses,
42
+ - actionable error logging (status code, request id, retry count, latency).
43
+
44
+ ## Example
45
+
46
+ Prompt example:
47
+
48
+ ```text
49
+ Please review this PR diff and find the 3 highest-risk edge cases.
50
+ Validate null input, boundary timestamp, and API 429 retry behavior.
51
+ Only report confirmed findings with reproduction evidence and suggested test coverage.
52
+ ```
53
+
54
+ Expected behavior:
55
+ - only changed files and minimal dependency chain are investigated,
56
+ - each finding includes reproducible evidence,
57
+ - speculative ideas are separated from confirmed issues,
58
+ - the output stays discovery-only with no code edits.
59
+
60
+ No-diff prompt example:
61
+
62
+ ```text
63
+ There is no git diff in this repo. Scan the whole codebase for high-risk edge cases.
64
+ If you find any actionable issues, reproduce them with evidence and report the highest-priority findings only.
65
+ ```
66
+
67
+ ## References
68
+
69
+ - [`SKILL.md`](./SKILL.md) - full workflow and execution rules.
70
+ - [`references/architecture-edge-cases.md`](./references/architecture-edge-cases.md) - cross-module/system-level edge-case checklist.
71
+ - [`references/code-edge-cases.md`](./references/code-edge-cases.md) - code-level input, boundary, and error-path checklist.
72
+
73
+ ## Repository structure
74
+
75
+ ```text
76
+ .
77
+ ├── LICENSE
78
+ ├── SKILL.md
79
+ ├── README.md
80
+ └── references
81
+     ├── architecture-edge-cases.md
82
+     └── code-edge-cases.md
83
+ ```
84
+
85
+ ## License
86
+
87
+ MIT
@@ -0,0 +1,124 @@
1
+ ---
2
+ name: discover-edge-cases
3
+ description: Discover reproducible edge-case risks in changed code or a selected codebase scope, prove them with concrete evidence, and report prioritized findings without modifying implementation. Use when users ask to find edge cases, assess hardening gaps, or validate that unusual inputs and error paths are covered.
4
+ ---
5
+
6
+ # Discover Edge Cases
7
+
8
+ ## Dependencies
9
+
10
+ - Required: none.
11
+ - Conditional: `harden-app-security` for code-affecting scopes before finalizing the report.
12
+ - Optional: none.
13
+ - Fallback: If `harden-app-security` is unavailable for a code-affecting scope, stop and report the missing dependency.
14
+
15
+ ## Standards
16
+
17
+ - Evidence: Keep only reproducible findings backed by code, tests, runtime output, or direct reproduction steps.
18
+ - Execution: Determine scope first, run focused probes, confirm reproducibility, then report findings without remediation.
19
+ - Quality: Separate confirmed findings from hypotheses and cover boundary, failure, stateful, and observability edge cases that matter to the scope.
20
+ - Output: Return prioritized findings, edge-case evidence, risk assessment, hardening guidance, and residual risk only.
21
+
22
+ ## Overview
23
+
24
+ Use this skill to discover edge-case failures and coverage gaps with evidence-first analysis. The goal is to surface reproducible findings, not to remediate them.
25
+
26
+ ## Non-negotiable Boundaries
27
+
28
+ - This skill is discovery-only: do not edit code, do not add or modify tests, and do not open PRs.
29
+ - Keep only reproducible findings with clear evidence.
30
+ - Mark unverified ideas as hypotheses and separate them from confirmed findings.
31
+ - If the task also requires remediation, finish this discovery pass first, then hand off confirmed findings to another implementation workflow.
32
+ - Discard authorship bias completely: treat code written earlier in the conversation or by this agent as untrusted until evidence proves otherwise.
33
+
34
+ ## Workflow
35
+
36
+ ### 1) Determine scan scope (required)
37
+
38
+ - Run `git diff --name-only` first.
39
+ - If diff exists: inspect only changed files plus the minimum dependency chain required to validate suspected edge cases.
40
+ - If no diff exists: scan the full project, prioritizing core domain logic, external API boundaries, stateful workflows, and concurrency-sensitive modules.
41
+ - If no actionable issue is found, report `No actionable edge-case finding identified` and stop.
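The scope decision above can be sketched as a small helper. This is an illustrative assumption about how an agent or script might apply the rule, not the skill's own implementation; `select_scope` and `read_changed_files` are hypothetical names. Only `git diff --name-only` comes from the workflow text.

```python
import subprocess

def select_scope(diff_output):
    """Map the text printed by `git diff --name-only` to a scan scope."""
    changed = [line.strip() for line in diff_output.splitlines() if line.strip()]
    if changed:
        # Diff exists: inspect only the changed files (plus minimal dependencies).
        return {"mode": "diff", "files": changed}
    # No diff: fall back to a full-project scan.
    return {"mode": "full-scan", "files": []}

def read_changed_files(repo_dir="."):
    """Run `git diff --name-only` in repo_dir and return its stdout."""
    result = subprocess.run(
        ["git", "diff", "--name-only"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout
```

A caller would combine the two as `select_scope(read_changed_files())`; keeping the decision in a pure function makes it testable without a repository.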
42
+
43
+ ### 2) Build a factual baseline
44
+
45
+ - Read the relevant code paths end-to-end before judging behavior.
46
+ - Re-derive behavior from code, tests, runtime output, and reproduced inputs only; ignore prior intent, authorship, or confidence from earlier turns.
47
+ - Clarify input/output contracts: types, valid ranges, null handling, ordering assumptions, retry/error behavior, and state transitions.
48
+ - Run existing tests or a minimal reproduction when needed to confirm actual vs expected behavior.
49
+ - Record exact evidence with file references (`path:line`) and observable symptoms.
50
+
51
+ ### 3) Execute focused edge-case probes
52
+
53
+ Prioritize 2-5 high-risk cases directly tied to the selected scope:
54
+
55
+ - Empty collections / empty strings / None / null
56
+ - Boundary values: 0, 1, -1, max/min limits, overflow
57
+ - Duplicate, ordering, sorting, or deduplication assumptions
58
+ - Exception paths: external dependency failure, timeout, retry, or partial data missing
59
+ - Invalid formats: malformed strings, invalid date/timezone, or unexpected types
60
+ - Concurrency/reentrancy: repeated calls, state contamination, or race windows
61
+ - Architecture-level edge cases: backpressure, resource exhaustion, timeout propagation, or partial commit/rollback behavior
62
+
63
+ For broader coverage, load references as needed:
64
+
65
+ - `references/architecture-edge-cases.md`
66
+ - `references/code-edge-cases.md`
67
+
68
+ #### External API checks
69
+
70
+ If the scope includes external API calls, validate:
71
+
72
+ - observable health/availability handling,
73
+ - degradation behavior for at least HTTP 429 and 500,
74
+ - actionable error logging (status code, request id, retry count, latency) to avoid silent failures.
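As a reference point for what these checks look for, the sketch below shows one way a caller might degrade on HTTP 429 and 500 while logging the four fields named above. It is a minimal example under stated assumptions: `send` is a hypothetical callable returning `(status_code, request_id)`, and the retry limit and backoff values are illustrative.

```python
import logging
import time

logger = logging.getLogger("external_api")

RETRYABLE = {429, 500}

def call_with_retry(send, max_retries=3, base_delay=0.5, sleep=time.sleep):
    """Call `send()` until it returns a non-retryable status or retries run out.

    Returns (last_status, retries_used). `send` is a hypothetical callable
    returning (status_code, request_id).
    """
    for attempt in range(max_retries + 1):
        started = time.monotonic()
        status, request_id = send()
        latency_ms = (time.monotonic() - started) * 1000
        if status not in RETRYABLE:
            return status, attempt
        # Actionable log line: status code, request id, retry count, latency.
        logger.warning(
            "external API failure status=%s request_id=%s retry_count=%s latency_ms=%.1f",
            status, request_id, attempt, latency_ms,
        )
        if attempt < max_retries:
            sleep(base_delay * 2 ** attempt)  # exponential backoff between attempts
    return status, max_retries
```

Injecting `sleep` lets a probe verify the backoff schedule without waiting, and the bounded loop is what prevents a retry storm when the dependency stays down.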
75
+
76
+ ### 4) Confirm reproducibility
77
+
78
+ - Reproduce each confirmed issue at least twice through the same trigger path.
79
+ - For high-risk findings, try nearby variants such as boundary neighbors, empty vs null, malformed vs well-typed invalid input, repeated calls, and stale ordering.
80
+ - Capture the exact command, request, or input together with the observed failure or missing protection.
81
+ - Keep unverified ideas as hypotheses only.
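The reproducibility rule can be illustrated with a small harness. This is a sketch only: `confirm_finding`, `average`, and `fails_with_zero_division` are hypothetical names, and the trigger is assumed to return `True` when the faulty behavior is observed.

```python
def confirm_finding(trigger, failing_input, variants=(), runs=2):
    """Re-run a trigger and record whether the failure reproduces every time."""
    reproduced = all(trigger(failing_input) for _ in range(runs))
    # Nearby variants (boundary neighbours, empty vs null) are recorded, not assumed.
    variant_results = {repr(v): trigger(v) for v in variants}
    return {
        "status": "confirmed" if reproduced else "hypothesis",
        "input": repr(failing_input),
        "runs": runs,
        "variants": variant_results,
    }

def average(values):
    """Hypothetical unit under test: no guard for empty input."""
    return sum(values) / len(values)

def fails_with_zero_division(values):
    """Trigger: True only when the specific failure under investigation occurs."""
    try:
        average(values)
    except ZeroDivisionError:
        return True
    except TypeError:
        return False  # a different failure mode; track it as a separate finding
    return False
```

Calling `confirm_finding(fails_with_zero_division, [], variants=([0], None))` marks the empty-list case as confirmed and shows that the neighboring inputs do not hit the same failure.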
82
+
83
+ ### 5) Prioritize confirmed findings
84
+
85
+ - Rank findings by user impact, exploitability or frequency, and blast radius.
86
+ - Call out data-integrity, state corruption, silent failure, retry storm, and cross-module propagation risks explicitly.
87
+ - Prefer fewer, stronger findings over many speculative ones.
88
+
89
+ ### 6) Report findings only
90
+
91
+ Deliver:
92
+
93
+ 1. Findings (highest risk first)
94
+ - Title and severity/priority
95
+ - Evidence (`path:line`)
96
+ - Reproduction steps or triggering input
97
+ - Broken expectation/invariant
98
+ 2. Edge-case evidence
99
+ - Preconditions
100
+ - Observed behavior
101
+ - Reproducibility notes and nearby variant results
102
+ 3. Risk assessment
103
+ - Impact, likelihood, and scope
104
+ - Why this matters in system context
105
+ 4. Hardening guidance (advice only)
106
+ - Recommended fix direction
107
+ - Suggested test coverage to add during remediation
108
+ 5. Residual risk
109
+ - Hypotheses, unknowns, and next validation ideas
110
+
111
+ ## Minimum Coverage
112
+
113
+ Apply all relevant checks for the selected scope:
114
+
115
+ - Input validation: empty/null/malformed/unexpected-type handling
116
+ - Boundary behavior: zero/one/min/max/overflow/ordering edges
117
+ - Failure behavior: timeout, retry, partial dependency failure, degraded mode
118
+ - Stateful behavior: idempotency, replay, concurrency, rollback, duplicate processing
119
+ - Observability: actionable errors and logging for failures that would otherwise be silent
120
+
121
+ ## Resources
122
+
123
+ - `references/architecture-edge-cases.md`: cross-module/system-level edge-case checklist.
124
+ - `references/code-edge-cases.md`: code-level input, boundary, and error-path checklist.
@@ -0,0 +1,4 @@
1
+ interface:
2
+ display_name: "Discover Edge Cases"
3
+ short_description: "Discover reproducible edge-case risks and coverage gaps"
4
+ default_prompt: "Use $discover-edge-cases to scan the current diff first (or the full codebase when there is no diff), discard any bias toward code written earlier in the conversation, run $harden-app-security as an adversarial cross-check for code-affecting scope, identify the highest-risk reproducible edge-case findings, validate them with concrete evidence, prioritize the confirmed risks, and report hardening and test recommendations without modifying code."
@@ -0,0 +1,41 @@
1
+ # Common Architecture-level Edge Cases (Reference List)
2
+
3
+ ## How to use
4
+ - Pick only 2-5 items directly related to the current change; avoid exhaustive scans.
5
+ - If changes involve external dependencies/concurrency/scheduling/messaging, prioritize matching sections.
6
+
7
+ ## Concurrency and synchronization
8
+ - Race conditions: concurrent updates to the same resource cause overwrite/lost updates
9
+ - Deadlock/livelock: inconsistent lock ordering, reentrant lock misuse, or busy-wait loops
10
+ - Visibility/memory consistency: cross-thread state is not synchronized
11
+ - Async task leaks: background tasks not cancelled or cleaned up
12
+
13
+ ## Backpressure and resources
14
+ - Backpressure failure: slow downstream causes upstream queue growth, OOM, or queue saturation
15
+ - Resource starvation: high-priority tasks monopolize resources and starve lower-priority work
16
+ - Connection pool exhaustion: unreleased or delayed-release connections
17
+ - File/socket leaks: exception paths skip close/release
18
+
19
+ ## Distributed systems
20
+ - Network partition/intermittent unreachable state: requires retry/degrade/isolation strategy
21
+ - Retry storms: retry amplification under failure
22
+ - Consistency gaps: stale reads or partial writes
23
+ - Duplicate messages: at-least-once delivery causes duplicate processing
24
+ - Message ordering: reordering/out-of-order events corrupt state
25
+ - Clock skew: time-based ordering/expiration becomes incorrect
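For the duplicate-message item above, a probe typically delivers the same message twice and checks that the side effect happens once. The sketch below shows the behavior such a probe expects; `IdempotentConsumer` and its `message_id` key are hypothetical, and the in-memory set stands in for whatever durable store a real system would use.

```python
class IdempotentConsumer:
    """Apply each message at most once, keyed by a hypothetical `message_id`."""

    def __init__(self):
        self.seen_ids = set()
        self.balance = 0

    def handle(self, message_id, amount):
        if message_id in self.seen_ids:
            return False  # duplicate delivery: skip the side effect
        self.seen_ids.add(message_id)
        self.balance += amount
        return True
```

If the second delivery of the same `message_id` changes `balance`, the consumer is not safe under at-least-once delivery.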
26
+
27
+ ## Timeout and cancellation
28
+ - Timeout not propagated: child tasks continue and consume resources
29
+ - Non-reentrant cancellation: retrying after a cancelled operation leaves inconsistent state
30
+ - Timeout boundary flapping: unstable behavior near timeout thresholds
31
+
32
+ ## Error handling and rollback
33
+ - Partial success: multi-step writes complete only partially
34
+ - Rollback failure: compensation action fails and leaves inconsistent data
35
+ - Swallowed exceptions: errors are neither surfaced nor logged
36
+ - Missing idempotency: retries create duplicate side effects
37
+
38
+ ## Deployment and versioning
39
+ - Rolling upgrade mismatch: old/new versions run together with inconsistent behavior
40
+ - Config drift: node configurations diverge
41
+ - Hot reload instability: temporary unavailability or state loss during reload