@laitszkin/apollo-toolkit 2.0.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/AGENTS.md +62 -0
- package/CHANGELOG.md +100 -0
- package/LICENSE +21 -0
- package/README.md +144 -0
- package/align-project-documents/SKILL.md +94 -0
- package/align-project-documents/agents/openai.yaml +4 -0
- package/analyse-app-logs/LICENSE +21 -0
- package/analyse-app-logs/README.md +126 -0
- package/analyse-app-logs/SKILL.md +121 -0
- package/analyse-app-logs/agents/openai.yaml +4 -0
- package/analyse-app-logs/references/investigation-checklist.md +58 -0
- package/analyse-app-logs/references/log-signal-patterns.md +52 -0
- package/answering-questions-with-research/SKILL.md +46 -0
- package/answering-questions-with-research/agents/openai.yaml +4 -0
- package/bin/apollo-toolkit.js +7 -0
- package/commit-and-push/LICENSE +21 -0
- package/commit-and-push/README.md +26 -0
- package/commit-and-push/SKILL.md +70 -0
- package/commit-and-push/agents/openai.yaml +4 -0
- package/commit-and-push/references/branch-naming.md +15 -0
- package/commit-and-push/references/commit-messages.md +19 -0
- package/deep-research-topics/LICENSE +21 -0
- package/deep-research-topics/README.md +43 -0
- package/deep-research-topics/SKILL.md +84 -0
- package/deep-research-topics/agents/openai.yaml +4 -0
- package/develop-new-features/LICENSE +21 -0
- package/develop-new-features/README.md +52 -0
- package/develop-new-features/SKILL.md +105 -0
- package/develop-new-features/agents/openai.yaml +4 -0
- package/develop-new-features/references/testing-e2e.md +35 -0
- package/develop-new-features/references/testing-integration.md +42 -0
- package/develop-new-features/references/testing-property-based.md +44 -0
- package/develop-new-features/references/testing-unit.md +37 -0
- package/discover-edge-cases/CHANGELOG.md +19 -0
- package/discover-edge-cases/LICENSE +21 -0
- package/discover-edge-cases/README.md +87 -0
- package/discover-edge-cases/SKILL.md +124 -0
- package/discover-edge-cases/agents/openai.yaml +4 -0
- package/discover-edge-cases/references/architecture-edge-cases.md +41 -0
- package/discover-edge-cases/references/code-edge-cases.md +46 -0
- package/docs-to-voice/.env.example +106 -0
- package/docs-to-voice/CHANGELOG.md +71 -0
- package/docs-to-voice/LICENSE +21 -0
- package/docs-to-voice/README.md +118 -0
- package/docs-to-voice/SKILL.md +107 -0
- package/docs-to-voice/agents/openai.yaml +4 -0
- package/docs-to-voice/scripts/docs_to_voice.py +1385 -0
- package/docs-to-voice/scripts/docs_to_voice.sh +11 -0
- package/docs-to-voice/tests/test_docs_to_voice_api_max_chars.py +210 -0
- package/docs-to-voice/tests/test_docs_to_voice_sentence_timeline.py +115 -0
- package/docs-to-voice/tests/test_docs_to_voice_settings.py +43 -0
- package/docs-to-voice/tests/test_docs_to_voice_speech_rate.py +57 -0
- package/enhance-existing-features/CHANGELOG.md +35 -0
- package/enhance-existing-features/LICENSE +21 -0
- package/enhance-existing-features/README.md +54 -0
- package/enhance-existing-features/SKILL.md +120 -0
- package/enhance-existing-features/agents/openai.yaml +4 -0
- package/enhance-existing-features/references/e2e-tests.md +25 -0
- package/enhance-existing-features/references/integration-tests.md +30 -0
- package/enhance-existing-features/references/property-based-tests.md +33 -0
- package/enhance-existing-features/references/unit-tests.md +29 -0
- package/feature-propose/LICENSE +21 -0
- package/feature-propose/README.md +23 -0
- package/feature-propose/SKILL.md +107 -0
- package/feature-propose/agents/openai.yaml +4 -0
- package/feature-propose/references/enhancement-features.md +25 -0
- package/feature-propose/references/important-features.md +25 -0
- package/feature-propose/references/mvp-features.md +25 -0
- package/feature-propose/references/performance-features.md +25 -0
- package/financial-research/SKILL.md +208 -0
- package/financial-research/agents/openai.yaml +4 -0
- package/financial-research/assets/weekly_market_report_template.md +45 -0
- package/fix-github-issues/SKILL.md +98 -0
- package/fix-github-issues/agents/openai.yaml +4 -0
- package/fix-github-issues/scripts/list_issues.py +148 -0
- package/fix-github-issues/tests/test_list_issues.py +127 -0
- package/generate-spec/LICENSE +21 -0
- package/generate-spec/README.md +61 -0
- package/generate-spec/SKILL.md +96 -0
- package/generate-spec/agents/openai.yaml +4 -0
- package/generate-spec/references/templates/checklist.md +78 -0
- package/generate-spec/references/templates/spec.md +55 -0
- package/generate-spec/references/templates/tasks.md +35 -0
- package/generate-spec/scripts/create-specs +123 -0
- package/harden-app-security/CHANGELOG.md +27 -0
- package/harden-app-security/LICENSE +21 -0
- package/harden-app-security/README.md +46 -0
- package/harden-app-security/SKILL.md +127 -0
- package/harden-app-security/agents/openai.yaml +4 -0
- package/harden-app-security/references/agent-attack-catalog.md +117 -0
- package/harden-app-security/references/common-software-attack-catalog.md +168 -0
- package/harden-app-security/references/red-team-extreme-scenarios.md +81 -0
- package/harden-app-security/references/risk-checklist.md +78 -0
- package/harden-app-security/references/security-test-patterns-agent.md +101 -0
- package/harden-app-security/references/security-test-patterns-finance.md +88 -0
- package/harden-app-security/references/test-snippets.md +73 -0
- package/improve-observability/SKILL.md +114 -0
- package/improve-observability/agents/openai.yaml +4 -0
- package/learn-skill-from-conversations/CHANGELOG.md +15 -0
- package/learn-skill-from-conversations/LICENSE +22 -0
- package/learn-skill-from-conversations/README.md +47 -0
- package/learn-skill-from-conversations/SKILL.md +85 -0
- package/learn-skill-from-conversations/agents/openai.yaml +4 -0
- package/learn-skill-from-conversations/scripts/extract_recent_conversations.py +369 -0
- package/learn-skill-from-conversations/tests/test_extract_recent_conversations.py +176 -0
- package/learning-error-book/SKILL.md +112 -0
- package/learning-error-book/agents/openai.yaml +4 -0
- package/learning-error-book/assets/error_book_template.md +66 -0
- package/learning-error-book/scripts/render_markdown_to_pdf.py +367 -0
- package/lib/cli.js +338 -0
- package/lib/installer.js +225 -0
- package/maintain-project-constraints/SKILL.md +109 -0
- package/maintain-project-constraints/agents/openai.yaml +4 -0
- package/maintain-skill-catalog/README.md +18 -0
- package/maintain-skill-catalog/SKILL.md +66 -0
- package/maintain-skill-catalog/agents/openai.yaml +4 -0
- package/novel-to-short-video/CHANGELOG.md +53 -0
- package/novel-to-short-video/LICENSE +21 -0
- package/novel-to-short-video/README.md +63 -0
- package/novel-to-short-video/SKILL.md +233 -0
- package/novel-to-short-video/agents/openai.yaml +4 -0
- package/novel-to-short-video/references/plan-template.md +71 -0
- package/novel-to-short-video/references/roles-json.md +41 -0
- package/open-github-issue/LICENSE +21 -0
- package/open-github-issue/README.md +97 -0
- package/open-github-issue/SKILL.md +119 -0
- package/open-github-issue/agents/openai.yaml +4 -0
- package/open-github-issue/scripts/open_github_issue.py +380 -0
- package/open-github-issue/tests/test_open_github_issue.py +159 -0
- package/open-source-pr-workflow/CHANGELOG.md +32 -0
- package/open-source-pr-workflow/LICENSE +21 -0
- package/open-source-pr-workflow/README.md +23 -0
- package/open-source-pr-workflow/SKILL.md +123 -0
- package/open-source-pr-workflow/agents/openai.yaml +4 -0
- package/openai-text-to-image-storyboard/.env.example +10 -0
- package/openai-text-to-image-storyboard/CHANGELOG.md +49 -0
- package/openai-text-to-image-storyboard/LICENSE +21 -0
- package/openai-text-to-image-storyboard/README.md +99 -0
- package/openai-text-to-image-storyboard/SKILL.md +107 -0
- package/openai-text-to-image-storyboard/agents/openai.yaml +4 -0
- package/openai-text-to-image-storyboard/scripts/generate_storyboard_images.py +763 -0
- package/package.json +36 -0
- package/record-spending/SKILL.md +113 -0
- package/record-spending/agents/openai.yaml +4 -0
- package/record-spending/references/account-format.md +33 -0
- package/record-spending/references/workbook-layout.md +84 -0
- package/resolve-review-comments/SKILL.md +122 -0
- package/resolve-review-comments/agents/openai.yaml +4 -0
- package/resolve-review-comments/references/adoption-criteria.md +23 -0
- package/resolve-review-comments/scripts/review_threads.py +425 -0
- package/resolve-review-comments/tests/test_review_threads.py +74 -0
- package/review-change-set/LICENSE +21 -0
- package/review-change-set/README.md +55 -0
- package/review-change-set/SKILL.md +103 -0
- package/review-change-set/agents/openai.yaml +4 -0
- package/review-codebases/LICENSE +21 -0
- package/review-codebases/README.md +67 -0
- package/review-codebases/SKILL.md +109 -0
- package/review-codebases/agents/openai.yaml +4 -0
- package/scripts/install_skills.ps1 +283 -0
- package/scripts/install_skills.sh +262 -0
- package/scripts/validate_openai_agent_config.py +194 -0
- package/scripts/validate_skill_frontmatter.py +110 -0
- package/specs-to-project-docs/LICENSE +21 -0
- package/specs-to-project-docs/README.md +57 -0
- package/specs-to-project-docs/SKILL.md +111 -0
- package/specs-to-project-docs/agents/openai.yaml +4 -0
- package/specs-to-project-docs/references/templates/architecture.md +29 -0
- package/specs-to-project-docs/references/templates/configuration.md +29 -0
- package/specs-to-project-docs/references/templates/developer-guide.md +33 -0
- package/specs-to-project-docs/references/templates/docs-index.md +39 -0
- package/specs-to-project-docs/references/templates/features.md +25 -0
- package/specs-to-project-docs/references/templates/getting-started.md +38 -0
- package/specs-to-project-docs/references/templates/readme.md +49 -0
- package/systematic-debug/LICENSE +21 -0
- package/systematic-debug/README.md +81 -0
- package/systematic-debug/SKILL.md +59 -0
- package/systematic-debug/agents/openai.yaml +4 -0
- package/text-to-short-video/.env.example +36 -0
- package/text-to-short-video/LICENSE +21 -0
- package/text-to-short-video/README.md +82 -0
- package/text-to-short-video/SKILL.md +221 -0
- package/text-to-short-video/agents/openai.yaml +4 -0
- package/text-to-short-video/scripts/enforce_video_aspect_ratio.py +350 -0
- package/version-release/CHANGELOG.md +53 -0
- package/version-release/LICENSE +21 -0
- package/version-release/README.md +28 -0
- package/version-release/SKILL.md +94 -0
- package/version-release/agents/openai.yaml +4 -0
- package/version-release/references/branch-naming.md +15 -0
- package/version-release/references/changelog-writing.md +8 -0
- package/version-release/references/commit-messages.md +19 -0
- package/version-release/references/readme-writing.md +12 -0
- package/version-release/references/semantic-versioning.md +12 -0
- package/video-production/CHANGELOG.md +104 -0
- package/video-production/LICENSE +18 -0
- package/video-production/README.md +68 -0
- package/video-production/SKILL.md +213 -0
- package/video-production/agents/openai.yaml +4 -0
- package/video-production/references/plan-template.md +54 -0
- package/video-production/references/roles-json.md +41 -0
- package/weekly-financial-event-report/SKILL.md +195 -0
- package/weekly-financial-event-report/agents/openai.yaml +4 -0
- package/weekly-financial-event-report/assets/financial_event_report_template.md +53 -0
@@ -0,0 +1,52 @@
# develop-new-features

A spec-first feature development skill for new behavior and greenfield work. It delegates shared planning-doc generation to `generate-spec`, then implements the approved feature with risk-driven testing.

## Key capabilities

- Requires `generate-spec` before any implementation starts.
- Treats `spec.md`, `tasks.md`, and `checklist.md` as approval-gated artifacts, not optional notes.
- Covers unit, regression, property-based, integration, E2E, and adversarial testing based on actual risk.
- Reuses existing architecture and avoids speculative expansion.
- Backfills planning docs after implementation and testing complete.

## Repository layout

```text
.
├── SKILL.md
├── README.md
├── LICENSE
├── agents/
│   └── openai.yaml
└── references/
    ├── testing-unit.md
    ├── testing-property-based.md
    ├── testing-integration.md
    └── testing-e2e.md
```

## Workflow summary

1. Review only the official docs and code paths needed for the feature.
2. Run `generate-spec` to create and maintain `docs/plans/{YYYY-MM-DD}_{change_name}/`.
3. Wait for explicit approval on the spec set.
4. Implement the approved behavior with minimal changes.
5. Run risk-driven tests and backfill `tasks.md` and `checklist.md`.
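
The plan directory named in step 2 can be sketched in a few lines of Python. This is only an illustration of the `docs/plans/{YYYY-MM-DD}_{change_name}/` layout, not the code `generate-spec` runs: the helper names `plan_dir` and `missing_plan_files` and the slug normalisation of the change name are assumptions made for the example.

```python
import re
from datetime import date
from pathlib import Path

# The three planning artifacts the workflow treats as approval-gated.
PLAN_FILES = ("spec.md", "tasks.md", "checklist.md")


def plan_dir(change_name: str, on: date, root: Path = Path("docs/plans")) -> Path:
    """Build docs/plans/{YYYY-MM-DD}_{change_name} for one change."""
    # Hypothetical normalisation: lowercase, runs of other characters become "-".
    slug = re.sub(r"[^a-z0-9]+", "-", change_name.lower()).strip("-")
    if not slug:
        raise ValueError("change_name must contain at least one letter or digit")
    return root / f"{on.isoformat()}_{slug}"


def missing_plan_files(directory: Path) -> list[str]:
    """Return the planning artifacts that do not exist yet in `directory`."""
    return [name for name in PLAN_FILES if not (directory / name).is_file()]
```

A caller could refuse to touch product code while `missing_plan_files(...)` is non-empty, which mirrors the "wait for explicit approval on the spec set" step only in the narrow sense that the files exist; approval itself still has to come from a person.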

## Testing expectations

- Unit: changed logic, boundaries, failure paths.
- Regression: pin down bug-prone or high-risk behavior.
- Property-based: required for business logic unless concrete `N/A` is recorded.
- Integration: cover the user-critical logic chain.
- E2E: cover the most important success and denial/failure paths when justified.
- Adversarial: include abuse, malformed input, privilege, replay, concurrency, and edge-combination cases when relevant.

## References

- Shared planning workflow: `generate-spec`
- Unit testing guide: `references/testing-unit.md`
- Property-based testing guide: `references/testing-property-based.md`
- Integration testing guide: `references/testing-integration.md`
- E2E testing guide: `references/testing-e2e.md`
@@ -0,0 +1,105 @@
---
name: develop-new-features
description: >-
  Spec-first feature development workflow for new behavior and greenfield
  features. Depends on `generate-spec` for shared planning artifacts before
  coding, then implements the approved feature with risk-driven test coverage.
  Use when users ask to design or implement new features, change product
  behavior, request a planning-first process, or ask for a greenfield feature.
  Tests must not stop at happy-path validation: for business-logic changes
  require property-based testing unless explicitly `N/A` with reason, design
  adversarial/regression/authorization/idempotency/concurrency coverage where
  relevant, use mocks for external services in logic chains, and verify
  meaningful business outcomes rather than smoke-only success.
---

# Develop New Features

## Dependencies

- Required: `generate-spec` for `spec.md`, `tasks.md`, `checklist.md`, clarification handling, approval gating, and status backfill.
- Conditional: none.
- Optional: none.
- Fallback: If `generate-spec` is unavailable, stop and report the missing dependency.

## Standards

- Evidence: Review authoritative docs and the existing codebase before planning or implementation.
- Execution: Run `generate-spec` for every new feature or product-behavior change, obtain approval, then implement minimally.
- Quality: Add risk-based tests with property-based, regression, integration, E2E, adversarial, and rollback coverage when relevant.
- Output: Keep the approved planning artifacts and the final implementation aligned with actual completion results.

## Goal

Use a shared spec-generation workflow for all new feature work, then implement the approved behavior with strong test coverage and minimal rework.

## Workflow

### 1) Review authoritative docs first

- Identify the stack, libraries, APIs, and external dependencies involved.
- Use official documentation as the source of truth.
- Prefer Context7 for framework/library APIs; use web for the latest official docs when needed.
- Record only the references required for this feature.

### 2) Run `$generate-spec`

- Specs are mandatory for every new feature, product behavior change, and greenfield project.
- Follow `$generate-spec` completely for:
  - generating `docs/plans/{YYYY-MM-DD}_{change_name}/spec.md`, `tasks.md`, and `checklist.md`
  - filling BDD requirements and risk-driven test plans
  - handling clarification responses
  - obtaining explicit approval before coding
  - backfilling document status after implementation and testing
- Do not modify product code before the approved spec set exists.

### 3) Explore architecture and reuse opportunities

- Trace entrypoints, module boundaries, data flow, and integration points relevant to the new behavior.
- Identify reusable components, patterns, and configuration paths before adding new code.
- Keep a concise map of likely files to modify so implementation stays scoped.

### 4) Implement after approval

- Reuse existing patterns and abstractions when possible.
- Keep changes focused and avoid speculative scope expansion.
- Update environment examples only when new inputs are actually required.

### 5) Testing coverage (required)

For every non-trivial change, evaluate all categories and add test cases or record justified `N/A`:
- Start from a risk inventory, not from the happy path: assess misuse/abuse, authorization, invalid transitions, idempotency, replay/duplication, concurrency/races, data-integrity, and partial-failure/rollback risks.
- Unit tests: changed logic, boundaries, failure paths, and exact error/side-effect expectations.
- Regression tests: bug-prone or high-risk behavior that should never silently regress again.
- Property-based tests: required for business-logic changes unless truly unsuitable; use them for invariants, generated business input spaces, state-machine/metamorphic checks when useful, and output expectation checks.
- Integration tests: user-critical logic chain across modules/layers.
- E2E tests: key user-visible path impacted by this change; prefer one minimal critical success path plus one highest-value denial/failure path when the risk warrants it.
- Adversarial/penetration-style cases: abuse paths, malformed inputs, forged identities/privileges, invalid transitions, replay/duplication, stale/out-of-order events, toxic payload sizes, and risky edge combinations.

Rules:
- If E2E is too costly or unstable, add stronger integration coverage for the same risk and record the reason in the checklist.
- If property-based testing is not suitable, record `N/A` with a concrete reason.
- For logic chains with external services, mock or fake those services unless the real contract itself is under test; simulate diverse external states and verify the business chain remains correct.
- Where the feature can partially commit work, test rollback, compensation, or no-partial-write behavior explicitly.
- Each test must assert a meaningful oracle: exact business output, persisted state, emitted side effects, or intentional lack of side effects. Avoid assertion-light smoke tests and snapshot-only coverage.
- Run relevant tests when possible and fix failures.
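
As a rough illustration of the rules above, the sketch below fakes an external service, drives it into a failure state, and asserts on persisted state and side effects rather than on "no exception". Everything in it is hypothetical and invented for the example (`FakeGateway`, `OrderStore`, `place_order`, the order IDs); none of it comes from the package.

```python
class PaymentTimeout(Exception):
    """Raised by the fake gateway to simulate an external timeout."""


class FakeGateway:
    """Stand-in for an external payment service with a scripted state."""

    def __init__(self, state: str):
        self.state = state      # "ok" or "timeout"
        self.charges = []       # side effects recorded for assertions

    def charge(self, order_id: str, amount: int) -> None:
        if self.state == "timeout":
            raise PaymentTimeout(order_id)
        self.charges.append((order_id, amount))


class OrderStore:
    """In-memory persistence so tests can inspect the stored state."""

    def __init__(self):
        self.orders = {}


def place_order(store, gateway, order_id, amount):
    """Persist the order only after a successful charge; replays are no-ops."""
    if order_id in store.orders:        # idempotency: same ID is never charged twice
        return store.orders[order_id]
    if amount <= 0:
        raise ValueError("amount must be positive")
    gateway.charge(order_id, amount)    # may raise; nothing has been persisted yet
    store.orders[order_id] = {"amount": amount, "status": "paid"}
    return store.orders[order_id]


def test_timeout_leaves_no_partial_write():
    store, gateway = OrderStore(), FakeGateway("timeout")
    try:
        place_order(store, gateway, "o-1", 500)
    except PaymentTimeout:
        pass
    else:
        raise AssertionError("expected PaymentTimeout")
    # Oracle: intentional lack of side effects.
    assert store.orders == {}
    assert gateway.charges == []


def test_replay_charges_once():
    store, gateway = OrderStore(), FakeGateway("ok")
    first = place_order(store, gateway, "o-1", 500)
    second = place_order(store, gateway, "o-1", 500)
    # Oracle: exact persisted state and exactly one emitted side effect.
    assert first == second == {"amount": 500, "status": "paid"}
    assert gateway.charges == [("o-1", 500)]
```

Both tests would still pass a code review that only asked "does it run", so the value lies in the oracles: the first checks that a failed charge writes nothing, the second that a replayed request produces exactly one charge.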

### 6) Completion updates

- Backfill `tasks.md` and `checklist.md` through `$generate-spec` workflow after implementation and testing.
- Report the implemented scope, test execution, and any concrete `N/A` reasons.

## Working Rules

- By default, write planning docs in the user's language.
- Keep implementation traceable to approved requirement IDs and planned risks.
- Prefer realism over rigid templates: add or remove test coverage only when the risk profile justifies it.
- Every planned test should justify a distinct risk; remove shallow duplicates that only prove the code "still runs".

## References

- `$generate-spec`: shared planning and approval workflow.
- `references/testing-unit.md`: unit testing principles.
- `references/testing-property-based.md`: property-based testing principles.
- `references/testing-integration.md`: integration testing principles.
- `references/testing-e2e.md`: E2E decision and design principles.
@@ -0,0 +1,4 @@
interface:
  display_name: "Develop New Features"
  short_description: "Spec-first feature development that depends on generate-spec"
  default_prompt: "Use $develop-new-features to design new behavior through a spec-first workflow: review the required external docs, run $generate-spec to create and maintain docs/plans/<date>_<change_name>/{spec.md,tasks.md,checklist.md}, wait for explicit approval, then implement the approved feature with risk-driven tests and backfill the planning docs after execution."
@@ -0,0 +1,35 @@
# E2E Testing Principles

## Core rules
- E2E is not decided solely by explicit user request.
- The agent must decide E2E based on feature importance, complexity, and cross-layer risk.
- For high-risk key user paths, create the smallest necessary E2E coverage first.
- If E2E is unstable, too costly, or environment-limited, add integration coverage for equivalent risk and record the alternative.

## Purpose
- Verify critical end-to-end user paths are usable.
- Catch behavior gaps after cross-system/cross-layer integration.
- Provide confidence close to real usage for high-risk scenarios.

## Decision criteria
- Importance: core feature, critical revenue flow, or high-impact process.
- Complexity: multi-step state transitions, branching flows, cross-service collaboration.
- Risk: historical regressions, fragile integrations, major user-visible failures.
- Maintainability: stable environment and controllable test data.

## Not suitable when
- Feature risk is low and unit/integration tests already cover it sufficiently.
- E2E is unstable and disproportionately expensive while integration tests can cover key risk.

## Design guidance
- Cover only the most critical paths; avoid expanding into full UI test suites.
- Keep test data controllable (fixed seeds or recyclable fixtures).
- Prioritize stability; avoid brittle external dependencies, use controlled substitutes if needed.
- Prefer one critical success path and one highest-value denial/failure path over many shallow happy-path journeys.
- Assert business-visible outcomes, not just DOM presence: final state, permission denial, user-facing error, persisted result, or prevented duplicate action.
- Keep cost decisions explicit: document why E2E is done or not done and what alternative strategy is used.

## Spec/checklist authoring hints
- Mark high-risk key paths in `spec.md` requirement descriptions.
- Record E2E decisions, mapped test cases, and results in `checklist.md`.
- If skipping E2E, specify replacement integration test cases (`IT-xx`) and rationale in `checklist.md`.
@@ -0,0 +1,42 @@
# Integration Testing Principles

## Purpose
- Verify collaboration across modules/layers and external dependencies.
- Cover integration risks unit tests cannot capture (sequence, config, IO failure).
- Validate user-critical business logic chains under realistic component interaction and controlled external-service scenarios.

## When to use
- Interface interactions between modules (for example service ↔ repository).
- Changes touching IO dependencies such as DB, RPC, files, cache, queues.
- Behaviors that depend on configuration combinations or environment differences.
- The correctness question is about the whole business logic chain rather than one isolated function.
- As minimum safety replacement when E2E is not suitable.

## Not suitable when
- Single pure-function or pure-logic behavior (use unit tests).
- Full end-to-end user flow can be stably covered by E2E.

## Relationship with E2E
- If change importance/complexity is high and E2E is feasible, prefer minimal E2E for key paths.
- If E2E is hard or too costly, integration tests must cover equivalent key risks.
- Record replacement mapping in `checklist.md` (E2E-xx ↔ IT-xx) with rationale.

## Design guidance
- Focus on high-value integration points; each test should justify risk/value.
- Keep dependencies inside the application boundary near-real where practical.
- Mock/fake external services at the business-chain boundary unless the real service contract itself is what needs verification.
- Build scenario matrices for external states such as success, timeout, retries exhausted, partial data, stale data, duplicate callbacks, inconsistent responses, and permission failures.
- Add adversarial/penetration-style cases for abuse paths such as invalid transitions, replay, double-submit, forged identifiers, or out-of-order events when those risks exist.
- When workflows can partially commit, assert rollback/compensation/no-partial-write behavior instead of only final status codes.
- Assert business outcomes across boundaries: persisted state, emitted events, deduplication, retry accounting, audit trail, or intentional absence of writes/notifications.
- Add at least one regression-style integration test for the highest-risk chain whenever the change fixes a bug or touches a historically fragile path.
- Keep reproducible: controlled test data and recoverable environment.
- Keep cost controlled; avoid broad redundant coverage (leave that to unit tests).
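
One way to read the "scenario matrix" guidance is as a table of external-state sequences, each paired with the business outcome it must produce. The sketch below assumes a hypothetical payment-callback handler; `Ledger`, `apply_callback`, the event tuples, and the scenario names are invented for illustration and cover only a few of the states listed above.

```python
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """Application-side state that the external callbacks act on."""
    balance: int = 0
    seen_event_ids: set = field(default_factory=set)
    audit: list = field(default_factory=list)


def apply_callback(ledger: Ledger, event: tuple) -> None:
    """Apply one provider callback; duplicates and non-settled events never credit."""
    event_id, status, amount = event
    if event_id in ledger.seen_event_ids:
        ledger.audit.append(("duplicate", event_id))    # deduplication is audited
        return
    ledger.seen_event_ids.add(event_id)
    if status == "settled":
        ledger.balance += amount
        ledger.audit.append(("credited", event_id))
    else:
        ledger.audit.append(("ignored", event_id))


# Scenario name -> (callback sequence from the faked provider, expected balance).
SCENARIOS = {
    "success": ([("e1", "settled", 100)], 100),
    "duplicate callback": ([("e1", "settled", 100), ("e1", "settled", 100)], 100),
    "permission failure": ([("e1", "rejected", 100)], 0),
    "timeout then retry succeeds": ([("e1", "timeout", 100), ("e2", "settled", 100)], 100),
}


def run_matrix() -> dict:
    """Run every scenario against a fresh ledger and check the business outcome."""
    results = {}
    for name, (events, expected_balance) in SCENARIOS.items():
        ledger = Ledger()
        for event in events:
            apply_callback(ledger, event)
        assert ledger.balance == expected_balance, name
        results[name] = ledger
    return results
```

Returning the ledgers lets a test also assert on the audit trail, for example that the duplicate scenario recorded one credit and one duplicate rather than silently dropping the second callback.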

## Spec/checklist authoring hints
- Dependency scope: list involved modules/external systems.
- Scenario: describe cross-module flow or critical branch.
- Risk: explain what integration failure, misconfiguration, or business-chain break this test can reveal.
- External dependency strategy: specify which services are mocked/faked versus near-real and why.
- Scenario matrix: list the external states or adversarial paths covered.
- Map behavior, test IDs, and test outcomes in `checklist.md`.
@@ -0,0 +1,44 @@
# Property-based Testing Principles

## Purpose
- Verify invariants/properties hold across broad input spaces.
- Validate business rules by generating or exhaustively enumerating meaningful input spaces and checking outputs against expected business behavior.
- Catch combinational, adversarial, and boundary behaviors that fixed examples often miss.

## When to use
- Algorithms, transformations, serialization/deserialization, sorting, aggregation.
- Behaviors requiring consistency or reversibility (for example round-trip).
- Data structures or state transitions with clear invariants.
- Business logic where the rule can be stated as input/output expectations, allowed states, forbidden states, or safety constraints.
- Logic chains that depend on external services, when those services can be replaced by controllable mocks/fakes and their states generated as part of the test space.

## Not suitable when
- The main thing being validated is the real integration contract with external systems or live IO (use integration tests).
- UI/interactive flows without stable invariants.
- Very small discrete input spaces (unit tests are sufficient).

## Design guidance
- Properties must be explicit and machine-verifiable, whether they are invariants, allowed outcome sets, rejection rules, or business-output predicates.
- Generators should cover normal cases, boundaries, extremes, malformed inputs, and suspicious/adversarial combinations.
- Prefer modeling business rules directly: generate inputs, run the logic, then assert the output/error/state transition matches the rule.
- When the behavior is stateful, prefer state-machine or sequence-based properties over isolated single-call generators.
- When exact outputs are hard to predict, use metamorphic properties (for example reordering, retrying, deduplicating, or replaying inputs should preserve an allowed relation).
- For external-service-dependent logic, mock/fake the service and generate multiple service states (success, timeout, empty, partial, stale, inconsistent, duplicate, rejected).
- Ensure reproducibility (fixed seed or replayable input generation) and preserve failing seeds/examples for regression coverage.
- Complement unit tests; avoid duplicating fixed-case tests.
- Control cost with reasonable sample counts and input-size limits.
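
The metamorphic and reproducibility points above can be sketched with nothing but the standard library. This example uses a hand-rolled generator seeded through `random.Random` instead of a dedicated property-testing library, which is a simplification chosen to keep it self-contained; `order_total`, `gen_items`, and the boundary quantities are assumptions made for the illustration.

```python
import random


def order_total(items):
    """Logic under test: sum of quantity * unit price (in cents) over line items."""
    return sum(qty * price for qty, price in items)


def gen_items(rng, max_items=20):
    """Generate line items, mixing ordinary and boundary quantities (0, 999)."""
    return [
        (rng.choice([0, 1, 2, 999]), rng.randint(0, 10_000))
        for _ in range(rng.randint(0, max_items))   # input-size limit controls cost
    ]


def check_reorder_preserves_total(seed, samples=200):
    """Metamorphic check: shuffling the line items must not change the total.

    Returns None when the property holds for every sample, otherwise a dict
    with the seed and the failing input so it can be replayed as a regression.
    """
    rng = random.Random(seed)   # fixed seed makes the whole run replayable
    for index in range(samples):
        items = gen_items(rng)
        shuffled = items[:]
        rng.shuffle(shuffled)
        if order_total(items) != order_total(shuffled):
            return {"seed": seed, "sample": index, "items": items}
    return None
```

The check never needs to know the correct total for a generated order; it only relies on the relation between two runs, which is what makes the property metamorphic.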

## Common property examples (description level)
- `deserialize(serialize(x)) == x`
- Sorted output is monotonic and preserves element multiset.
- Merge/split operations preserve total element count.
- Idempotency: repeating the same operation does not change results.
- Invalid or unauthorized generated inputs always fail with an expected error/result class.
- Generated order/payment/state-transition inputs always end in an allowed business state.
- Under generated mocked service states, the business logic chain still satisfies fallback/retry/compensation rules.
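
Three of the properties listed above (round trip, sorted output, idempotency) might look like this in executable form. The sketch stands in JSON for "serialize", the built-in `sorted` for the sort under test, and a hypothetical `normalize` function for the idempotent operation; these choices and the seeded generator are assumptions for the example only.

```python
import json
import random
from collections import Counter


def gen_record(rng):
    """Generate a small JSON-compatible record, including empty and non-ASCII tags."""
    return {
        "id": rng.randint(-10**9, 10**9),
        "tags": [rng.choice(["", "a", "é", "x y"]) for _ in range(rng.randint(0, 5))],
        "active": rng.choice([True, False, None]),
    }


def normalize(tags):
    """Hypothetical operation under test: deduplicate and sort tags."""
    return sorted(set(tags))


def run_properties(seed=0, samples=300):
    """Check the three properties on `samples` generated inputs; return the count."""
    rng = random.Random(seed)
    for _ in range(samples):
        record = gen_record(rng)
        # Round trip: deserialize(serialize(x)) == x
        assert json.loads(json.dumps(record)) == record

        # Sorted output is monotonic and preserves the element multiset.
        values = [rng.randint(-50, 50) for _ in range(rng.randint(0, 30))]
        out = sorted(values)
        assert all(a <= b for a, b in zip(out, out[1:]))
        assert Counter(out) == Counter(values)

        # Idempotency: applying the operation twice equals applying it once.
        once = normalize(record["tags"])
        assert normalize(once) == once
    return samples
```

The round-trip property holds here only because the generator stays inside JSON's native types; a generator that produced tuples or non-string keys would fail it, which is a reminder that the generator strategy is part of the property's definition.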

## Spec/checklist authoring hints
- Property/rule: one sentence stating the rule that must always hold or the allowed outcomes that must contain the result.
- Generator strategy: input range, distribution, emphasized boundaries, and any adversarial or external-state dimensions.
- Oracle/check: describe how the test decides correctness (predicate, allow-list, reference model, or expected error class).
- Purpose: explain correctness/risk reduction value of this property.
@@ -0,0 +1,37 @@
|
|
|
1
|
+
# Unit Testing Principles
|
|
2
|
+
|
|
3
|
+
## Purpose
|
|
4
|
+
- Verify correctness of the smallest testable unit (function, method, or pure logic module).
|
|
5
|
+
- Provide fast feedback with low-cost failure localization.
|
|
6
|
+
|
|
7
|
+
## When to use
|
|
8
|
+
- Core business logic and critical branches.
|
|
9
|
+
- Boundary conditions (upper/lower limits, null/empty, extreme values).
|
|
10
|
+
- Error handling and exception paths (invalid input, incompatible state, etc.).
|
|
11
|
+
|
|
12
|
+
## Not suitable when
|
|
13
|
+
- Behavior requires cross-module or external dependency verification (use integration tests).
|
|
14
|
+
- Full user-flow validation is required (evaluate E2E first; if not suitable, use integration tests to cover risk).
|
|
15
|
+
|
|
16
|
+
## Design guidance
|
|
17
|
+
- Isolate external dependencies with mock/stub/fake; avoid DB/RPC/file IO.
|
|
18
|
+
- Keep tests small and focused: one test, one behavior/failure mode.
|
|
19
|
+
- Do not stop at happy-path assertions; verify exact errors, rejected states, and intentional lack of side effects when the unit should block an action.
|
|
20
|
+
- Cover both success and failure branches.
|
|
21
|
+
- Where the input space is small and discrete, exhaustively enumerate business inputs and expected outputs.
|
|
22
|
+
- Prefer table-driven cases when many small business permutations share the same oracle.
|
|
23
|
+
- Add regression tests for bug-prone or high-risk logic so previously broken behavior cannot silently return.
|
|
24
|
+
- If the unit owns authorization, invalid transition, idempotency, or concurrency decisions, test those denials explicitly.
|
|
25
|
+
- Keep tests reproducible: avoid nondeterministic time/random/global state.
|
|
26
|
+
- Avoid assertion-light smoke tests and snapshot-only coverage unless the snapshot has a strict business oracle behind it.
|
|
27
|
+
- Map tests to requirements: each core requirement should have at least one unit test.
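
A table-driven layout is one way to apply several of the points above at once: boundary values, exact error checks, and one behavior per row. The sketch below is illustrative only; `unit_price`, `OutOfRange`, and the pricing tiers are hypothetical and not part of this package.

```python
class OutOfRange(ValueError):
    """Hypothetical error type raised for out-of-range quantities."""


def unit_price(quantity):
    """Hypothetical unit under test: tiered price in cents."""
    if not 1 <= quantity <= 100:
        raise OutOfRange(f"quantity {quantity} outside 1..100")
    return 500 if quantity < 10 else 450


CASES = [
    # (name, quantity, expected price or expected exception type)
    ("lower bound", 1, 500),
    ("just below tier", 9, 500),
    ("tier boundary", 10, 450),
    ("upper bound", 100, 450),
    ("zero rejected", 0, OutOfRange),
    ("above max rejected", 101, OutOfRange),
]


def run_cases():
    """Return the names of failing rows; an empty list means all rows pass."""
    failures = []
    for name, quantity, expected in CASES:
        try:
            actual = unit_price(quantity)
        except OutOfRange:
            actual = OutOfRange
        if actual != expected:
            failures.append(name)
    return failures
```

Every row shares the same oracle, so adding a new business permutation costs one line.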
|
|
28
|
+
|
|
29
|
+
## Spec/checklist authoring hints
|
|
30
|
+
- Scenario: describe input and initial state mapped to one requirement/boundary.
|
|
31
|
+
- Expected result: verifiable output, state change, or error.
|
|
32
|
+
- Purpose: explain which risk or bug type this test prevents.
|
|
33
|
+
|
|
34
|
+
## Common examples (description level)
|
|
35
|
+
- Return a specific error when input is out of allowed range.
|
|
36
|
+
- Handle empty list/empty string input with expected behavior.
|
|
37
|
+
- Ensure output matches the specified behavior after a state or flag switch.
|
|
@@ -0,0 +1,19 @@
|
|
|
1
|
+
# Changelog
|
|
2
|
+
|
|
3
|
+
All notable changes to this project will be documented in this file.
|
|
4
|
+
|
|
5
|
+
## [v0.2.1] - 2026-02-17
|
|
6
|
+
|
|
7
|
+
### Changed
|
|
8
|
+
- Remove `submit-changes` skill dependency from no-diff release flow while keeping the PR workflow.
|
|
9
|
+
- Update skill and README guidance to use direct git commit/push before opening a PR.
|
|
10
|
+
|
|
11
|
+
## [v0.2.0] - 2026-02-17
|
|
12
|
+
|
|
13
|
+
### Added
|
|
14
|
+
- Add no-diff workflow guidance to scan the whole codebase for actionable edge cases.
|
|
15
|
+
- Add release-flow guidance for no-diff fixes: create worktree, use `submit-changes`, and open a PR.
|
|
16
|
+
|
|
17
|
+
### Changed
|
|
18
|
+
- Clarify scope selection logic: `git diff` path uses changed files only; no-diff path uses full-codebase scan.
|
|
19
|
+
- Expand README examples to include a no-diff prompt and expected execution path.
|
|
@@ -0,0 +1,21 @@
|
|
|
1
|
+
MIT License
|
|
2
|
+
|
|
3
|
+
Copyright (c) 2026 LaiTszKin
|
|
4
|
+
|
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
|
6
|
+
of this software and associated documentation files (the "Software"), to deal
|
|
7
|
+
in the Software without restriction, including without limitation the rights
|
|
8
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
|
9
|
+
copies of the Software, and to permit persons to whom the Software is
|
|
10
|
+
furnished to do so, subject to the following conditions:
|
|
11
|
+
|
|
12
|
+
The above copyright notice and this permission notice shall be included in all
|
|
13
|
+
copies or substantial portions of the Software.
|
|
14
|
+
|
|
15
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
|
16
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
|
17
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
|
18
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
|
19
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
|
20
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
|
21
|
+
SOFTWARE.
|
|
@@ -0,0 +1,87 @@
|
|
|
1
|
+
# discover-edge-cases
|
|
2
|
+
|
|
3
|
+
`discover-edge-cases` is a Codex skill for discovering reproducible edge-case risks and coverage gaps.
|
|
4
|
+
|
|
5
|
+
## Brief introduction
|
|
6
|
+
|
|
7
|
+
This skill is discovery-oriented. It scans the current diff by default, or the full codebase
|
|
8
|
+
when there is no diff, then validates the highest-risk edge cases with concrete evidence.
|
|
9
|
+
It does not write tests, patch code, or open PRs.
|
|
10
|
+
|
|
11
|
+
It follows a strict workflow:
|
|
12
|
+
1. Detect whether `git diff` exists.
|
|
13
|
+
2. Inspect only changed files plus minimal dependencies, or perform a full-project scan when no diff exists.
|
|
14
|
+
3. Run `harden-app-security` as an adversarial dependency for code-affecting scope.
|
|
15
|
+
4. Probe the highest-risk edge cases and gather concrete evidence.
|
|
16
|
+
5. Reproduce confirmed issues at least twice and check nearby variants.
|
|
17
|
+
6. Prioritize confirmed findings and report hardening guidance only.
|
|
18
|
+
|
|
19
|
+
## When to use
|
|
20
|
+
|
|
21
|
+
Use this skill when a task asks you to:
|
|
22
|
+
- find edge-case risks in a diff or codebase,
|
|
23
|
+
- validate unusual inputs and error paths,
|
|
24
|
+
- assess hardening gaps around null/empty/boundary handling,
|
|
25
|
+
- review retries, timeouts, degradation paths, or stateful failure modes.
|
|
26
|
+
|
|
27
|
+
## Core principles
|
|
28
|
+
|
|
29
|
+
- Scope is `git diff` plus the minimal dependency chain by default.
|
|
30
|
+
- If `git diff` is empty, run a full-codebase scan focused on high-risk modules.
|
|
31
|
+
- Treat prior authorship as irrelevant; even code written earlier in the same conversation must be challenged like third-party code.
|
|
32
|
+
- Decisions must be evidence-based; speculative ideas stay marked as hypotheses.
|
|
33
|
+
- Keep only reproducible findings with exact evidence.
|
|
34
|
+
- Run `harden-app-security` as a required adversarial cross-check for code-affecting scope.
|
|
35
|
+
- Report recommended fixes and test ideas, but do not implement them in this skill.
|
|
36
|
+
|
|
37
|
+
## External API requirements
|
|
38
|
+
|
|
39
|
+
When the selected scope involves external API calls, this skill requires checks for:
|
|
40
|
+
- health/availability handling,
|
|
41
|
+
- graceful handling of `429` and `500` responses,
|
|
42
|
+
- actionable error logging (status code, request id, retry count, latency).
|
|
43
|
+
|
|
44
|
+
## Example
|
|
45
|
+
|
|
46
|
+
Prompt example:
|
|
47
|
+
|
|
48
|
+
```text
|
|
49
|
+
Please review this PR diff and find the 3 highest-risk edge cases.
|
|
50
|
+
Validate null input, boundary timestamp, and API 429 retry behavior.
|
|
51
|
+
Only report confirmed findings with reproduction evidence and suggested test coverage.
|
|
52
|
+
```
|
|
53
|
+
|
|
54
|
+
Expected behavior:
|
|
55
|
+
- only changed files and minimal dependency chain are investigated,
|
|
56
|
+
- each finding includes reproducible evidence,
|
|
57
|
+
- speculative ideas are separated from confirmed issues,
|
|
58
|
+
- the output stays discovery-only with no code edits.
|
|
59
|
+
|
|
60
|
+
No-diff prompt example:
|
|
61
|
+
|
|
62
|
+
```text
|
|
63
|
+
There is no git diff in this repo. Scan the whole codebase for high-risk edge cases.
|
|
64
|
+
If you find any actionable issues, reproduce them with evidence and report the highest-priority findings only.
|
|
65
|
+
```
|
|
66
|
+
|
|
67
|
+
## References
|
|
68
|
+
|
|
69
|
+
- [`SKILL.md`](./SKILL.md) - full workflow and execution rules.
|
|
70
|
+
- [`references/architecture-edge-cases.md`](./references/architecture-edge-cases.md) - cross-module/system-level edge-case checklist.
|
|
71
|
+
- [`references/code-edge-cases.md`](./references/code-edge-cases.md) - code-level input, boundary, and error-path checklist.
|
|
72
|
+
|
|
73
|
+
## Repository structure
|
|
74
|
+
|
|
75
|
+
```text
|
|
76
|
+
.
|
|
77
|
+
├── LICENSE
|
|
78
|
+
├── SKILL.md
|
|
79
|
+
├── README.md
|
|
80
|
+
└── references
|
|
81
|
+
    ├── architecture-edge-cases.md
|
|
82
|
+
    └── code-edge-cases.md
|
|
83
|
+
```
|
|
84
|
+
|
|
85
|
+
## License
|
|
86
|
+
|
|
87
|
+
MIT
|
|
@@ -0,0 +1,124 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: discover-edge-cases
|
|
3
|
+
description: Discover reproducible edge-case risks in changed code or a selected codebase scope, prove them with concrete evidence, and report prioritized findings without modifying implementation. Use when users ask to find edge cases, assess hardening gaps, or validate that unusual inputs and error paths are covered.
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# Discover Edge Cases
|
|
7
|
+
|
|
8
|
+
## Dependencies
|
|
9
|
+
|
|
10
|
+
- Required: none.
|
|
11
|
+
- Conditional: `harden-app-security` for code-affecting scopes before finalizing the report.
|
|
12
|
+
- Optional: none.
|
|
13
|
+
- Fallback: If the `harden-app-security` cross-check is unavailable for a code-affecting scope, stop and report the missing dependency.
|
|
14
|
+
|
|
15
|
+
## Standards
|
|
16
|
+
|
|
17
|
+
- Evidence: Keep only reproducible findings backed by code, tests, runtime output, or direct reproduction steps.
|
|
18
|
+
- Execution: Determine scope first, run focused probes, confirm reproducibility, then report findings without remediation.
|
|
19
|
+
- Quality: Separate confirmed findings from hypotheses and cover boundary, failure, stateful, and observability edge cases that matter to the scope.
|
|
20
|
+
- Output: Return prioritized findings, edge-case evidence, risk assessment, hardening guidance, and residual risk only.
|
|
21
|
+
|
|
22
|
+
## Overview
|
|
23
|
+
|
|
24
|
+
Use this skill to discover edge-case failures and coverage gaps with evidence-first analysis. The goal is to surface reproducible findings, not to remediate them.
|
|
25
|
+
|
|
26
|
+
## Non-negotiable Boundaries
|
|
27
|
+
|
|
28
|
+
- This skill is discovery-only: do not edit code, do not add or modify tests, and do not open PRs.
|
|
29
|
+
- Keep only reproducible findings with clear evidence.
|
|
30
|
+
- Mark unverified ideas as hypotheses and separate them from confirmed findings.
|
|
31
|
+
- If the task also requires remediation, finish this discovery pass first, then hand off confirmed findings to another implementation workflow.
|
|
32
|
+
- Discard authorship bias completely: treat code written earlier in the conversation or by this agent as untrusted until evidence proves otherwise.
|
|
33
|
+
|
|
34
|
+
## Workflow
|
|
35
|
+
|
|
36
|
+
### 1) Determine scan scope (required)
|
|
37
|
+
|
|
38
|
+
- Run `git diff --name-only` first.
|
|
39
|
+
- If diff exists: inspect only changed files plus the minimum dependency chain required to validate suspected edge cases.
|
|
40
|
+
- If no diff exists: scan the full project, prioritizing core domain logic, external API boundaries, stateful workflows, and concurrency-sensitive modules.
|
|
41
|
+
- If no actionable issue is found, report `No actionable edge-case finding identified` and stop.
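
The scope decision in this step can be sketched as a small helper. This is an assumption-laden illustration, not the skill's implementation: `select_scope` and `detect_scope` are hypothetical names, and the returned dictionary shape is invented for the example.

```python
import subprocess


def select_scope(diff_output):
    """Map `git diff --name-only` output to a scan mode and a file list."""
    changed = [line.strip() for line in diff_output.splitlines() if line.strip()]
    if changed:
        return {"mode": "diff", "files": changed}
    return {"mode": "full-scan", "files": []}


def detect_scope(repo_dir="."):
    """Run git in repo_dir and classify the result (requires git on PATH)."""
    result = subprocess.run(
        ["git", "diff", "--name-only"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return select_scope(result.stdout)
```

Plain `git diff` reports unstaged working-tree changes only; staged changes would need `git diff --cached`, which this sketch does not run.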
|
|
42
|
+
|
|
43
|
+
### 2) Build a factual baseline
|
|
44
|
+
|
|
45
|
+
- Read the relevant code paths end-to-end before judging behavior.
|
|
46
|
+
- Re-derive behavior from code, tests, runtime output, and reproduced inputs only; ignore prior intent, authorship, or confidence from earlier turns.
|
|
47
|
+
- Clarify input/output contracts: types, valid ranges, null handling, ordering assumptions, retry/error behavior, and state transitions.
|
|
48
|
+
- Run existing tests or a minimal reproduction when needed to confirm actual vs expected behavior.
|
|
49
|
+
- Record exact evidence with file references (`path:line`) and observable symptoms.
|
|
50
|
+
|
|
51
|
+
### 3) Execute focused edge-case probes
|
|
52
|
+
|
|
53
|
+
Prioritize 2-5 high-risk cases directly tied to the selected scope:
|
|
54
|
+
|
|
55
|
+
- Empty collections / empty strings / None / null
|
|
56
|
+
- Boundary values: 0, 1, -1, max/min limits, overflow
|
|
57
|
+
- Duplicate, ordering, sorting, or deduplication assumptions
|
|
58
|
+
- Exception paths: external dependency failure, timeout, retry, or partially missing data
|
|
59
|
+
- Invalid formats: malformed strings, invalid date/timezone, or unexpected types
|
|
60
|
+
- Concurrency/reentrancy: repeated calls, state contamination, or race windows
|
|
61
|
+
- Architecture-level edge cases: backpressure, resource exhaustion, timeout propagation, or partial commit/rollback behavior
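
For the boundary-value probes above, a small harness can generate neighbors of a range and record what happens for each one. A minimal sketch, assuming an inclusive integer range; `boundary_probes` and `probe` are hypothetical helper names.

```python
def boundary_probes(lo, hi):
    """Values at, just inside, and just outside an inclusive [lo, hi] range."""
    candidates = [lo - 1, lo, lo + 1, hi - 1, hi, hi + 1, 0, 1, -1]
    unique = []
    for value in candidates:
        if value not in unique:  # keep order, drop duplicates
            unique.append(value)
    return unique


def probe(func, inputs):
    """Call func on each input and record the result or the exception type."""
    report = {}
    for value in inputs:
        try:
            report[repr(value)] = ("ok", func(value))
        except Exception as exc:  # broad on purpose: the probe records any failure
            report[repr(value)] = ("error", type(exc).__name__)
    return report
```

For example, `probe(lambda n: 100 // n, boundary_probes(1, 10))` records a `ZeroDivisionError` for the input `0`, which is the kind of observable symptom the report should cite.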
|
|
62
|
+
|
|
63
|
+
For broader coverage, load references as needed:
|
|
64
|
+
|
|
65
|
+
- `references/architecture-edge-cases.md`
|
|
66
|
+
- `references/code-edge-cases.md`
|
|
67
|
+
|
|
68
|
+
#### External API checks
|
|
69
|
+
|
|
70
|
+
If the scope includes external API calls, validate:
|
|
71
|
+
|
|
72
|
+
- observable health/availability handling,
|
|
73
|
+
- degradation behavior for at least HTTP 429 and 500,
|
|
74
|
+
- actionable error logging (status code, request id, retry count, latency) to avoid silent failures.
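
As a reference point for what "graceful" handling of 429 and 500 could look like, here is a hedged sketch of a retry wrapper with actionable logging. The `send` callable, its `(status_code, request_id, body)` return shape, and the backoff policy are all assumptions made for illustration.

```python
import logging
import time

log = logging.getLogger("external_api")


def call_with_retry(send, max_attempts=3, base_delay=0.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    send is a hypothetical callable returning (status_code, request_id, body).
    Returns (status_code, body, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        started = time.monotonic()
        status, request_id, body = send()
        latency_ms = (time.monotonic() - started) * 1000
        if status not in (429, 500):
            return status, body, attempt
        # Log the fields a responder needs: status, request id, retry count, latency.
        log.warning(
            "external call failed status=%s request_id=%s attempt=%s latency_ms=%.1f",
            status, request_id, attempt, latency_ms,
        )
        if attempt < max_attempts:
            sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff
    return status, body, max_attempts
```

A production client would usually also honor a `Retry-After` header on 429 responses and cap total retry time; both are left out here to keep the sketch short.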
|
|
75
|
+
|
|
76
|
+
### 4) Confirm reproducibility
|
|
77
|
+
|
|
78
|
+
- Reproduce each confirmed issue at least twice through the same trigger path.
|
|
79
|
+
- For high-risk findings, try nearby variants such as boundary neighbors, empty vs null, malformed vs well-typed invalid input, repeated calls, and stale ordering.
|
|
80
|
+
- Capture the exact command, request, or input together with the observed failure or missing protection.
|
|
81
|
+
- Keep unverified ideas as hypotheses only.
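
The "at least twice" rule above can be made mechanical. The helper below is a sketch only; `reproduce` is a hypothetical name, and comparing exception type names is a simplifying assumption about what counts as "the same failure".

```python
def reproduce(trigger, runs=2):
    """Run trigger() `runs` times; confirmed only if every run fails the same way."""
    outcomes = []
    for _ in range(runs):
        try:
            trigger()
            outcomes.append(None)  # no failure observed on this run
        except Exception as exc:
            outcomes.append(type(exc).__name__)
    confirmed = outcomes[0] is not None and len(set(outcomes)) == 1
    return {
        "status": "confirmed" if confirmed else "hypothesis",
        "outcomes": outcomes,
    }
```

A trigger that fails on one run but not the next stays a hypothesis and is reported separately from confirmed findings.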
|
|
82
|
+
|
|
83
|
+
### 5) Prioritize confirmed findings
|
|
84
|
+
|
|
85
|
+
- Rank findings by user impact, exploitability or frequency, and blast radius.
|
|
86
|
+
- Call out data-integrity, state corruption, silent failure, retry storm, and cross-module propagation risks explicitly.
|
|
87
|
+
- Prefer fewer, stronger findings over many speculative ones.
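
One possible way to make the ranking above repeatable is a simple score. The 1-5 scales and the multiplicative formula below are assumptions for illustration; the skill itself does not prescribe a scoring formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    title: str
    impact: int        # 1 (minor) .. 5 (data loss or outage)
    likelihood: int    # 1 (rare) .. 5 (frequent or easily exploitable)
    blast_radius: int  # 1 (single request) .. 5 (cross-module)

    def score(self):
        return self.impact * self.likelihood * self.blast_radius


def prioritize(findings, limit=5):
    """Highest score first; ties broken by title so the order is stable."""
    return sorted(findings, key=lambda f: (-f.score(), f.title))[:limit]
```

The `limit` parameter reflects the preference for fewer, stronger findings.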
|
|
88
|
+
|
|
89
|
+
### 6) Report findings only
|
|
90
|
+
|
|
91
|
+
Deliver:
|
|
92
|
+
|
|
93
|
+
1. Findings (highest risk first)
|
|
94
|
+
   - Title and severity/priority
|
|
95
|
+
   - Evidence (`path:line`)
|
|
96
|
+
   - Reproduction steps or triggering input
|
|
97
|
+
   - Broken expectation/invariant
|
|
98
|
+
2. Edge-case evidence
|
|
99
|
+
   - Preconditions
|
|
100
|
+
   - Observed behavior
|
|
101
|
+
   - Reproducibility notes and nearby variant results
|
|
102
|
+
3. Risk assessment
|
|
103
|
+
   - Impact, likelihood, and scope
|
|
104
|
+
   - Why this matters in system context
|
|
105
|
+
4. Hardening guidance (advice only)
|
|
106
|
+
   - Recommended fix direction
|
|
107
|
+
   - Suggested test coverage to add during remediation
|
|
108
|
+
5. Residual risk
|
|
109
|
+
   - Hypotheses, unknowns, and next validation ideas
|
|
110
|
+
|
|
111
|
+
## Minimum Coverage
|
|
112
|
+
|
|
113
|
+
Apply all relevant checks for the selected scope:
|
|
114
|
+
|
|
115
|
+
- Input validation: empty/null/malformed/unexpected-type handling
|
|
116
|
+
- Boundary behavior: zero/one/min/max/overflow/ordering edges
|
|
117
|
+
- Failure behavior: timeout, retry, partial dependency failure, degraded mode
|
|
118
|
+
- Stateful behavior: idempotency, replay, concurrency, rollback, duplicate processing
|
|
119
|
+
- Observability: actionable errors and logging for failures that would otherwise be silent
|
|
120
|
+
|
|
121
|
+
## Resources
|
|
122
|
+
|
|
123
|
+
- `references/architecture-edge-cases.md`: cross-module/system-level edge-case checklist.
|
|
124
|
+
- `references/code-edge-cases.md`: code-level input, boundary, and error-path checklist.
|
|
@@ -0,0 +1,4 @@
|
|
|
1
|
+
interface:
|
|
2
|
+
display_name: "Discover Edge Cases"
|
|
3
|
+
short_description: "Discover reproducible edge-case risks and coverage gaps"
|
|
4
|
+
default_prompt: "Use $discover-edge-cases to scan the current diff first (or the full codebase when there is no diff), discard any bias toward code written earlier in the conversation, run $harden-app-security as an adversarial cross-check for code-affecting scope, identify the highest-risk reproducible edge-case findings, validate them with concrete evidence, prioritize the confirmed risks, and report hardening and test recommendations without modifying code."
|
|
@@ -0,0 +1,41 @@
|
|
|
1
|
+
# Common Architecture-level Edge Cases (Reference List)
|
|
2
|
+
|
|
3
|
+
## How to use
|
|
4
|
+
- Pick only 2-5 items directly related to the current change; avoid exhaustive scans.
|
|
5
|
+
- If changes involve external dependencies/concurrency/scheduling/messaging, prioritize matching sections.
|
|
6
|
+
|
|
7
|
+
## Concurrency and synchronization
|
|
8
|
+
- Race conditions: concurrent updates to the same resource cause overwrite/lost updates
|
|
9
|
+
- Deadlock/livelock: inconsistent lock ordering, reentrant lock misuse, or busy-wait loops
|
|
10
|
+
- Visibility/memory consistency: cross-thread state is not synchronized
|
|
11
|
+
- Async task leaks: background tasks not cancelled or cleaned up
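
The lost-update race in the first item can be replayed deterministically, without threads, by forcing the risky interleaving by hand. The sketch below uses a version check as one possible guard; `VersionedCounter` and `VersionConflict` are hypothetical names, and a real store would need the compare and the write to be atomic.

```python
class VersionConflict(Exception):
    """Raised when a write is based on a stale read."""


class VersionedCounter:
    """Hypothetical shared resource guarded by a version number."""

    def __init__(self):
        self.value = 0
        self.version = 0

    def read(self):
        return self.value, self.version

    def write(self, new_value, expected_version):
        # Reject the write if the resource changed since the caller's read.
        if expected_version != self.version:
            raise VersionConflict(
                f"expected v{expected_version}, found v{self.version}"
            )
        self.value = new_value
        self.version += 1


def interleaved_increments(counter):
    """Replay the risky interleaving: both writers read before either writes."""
    a_value, a_version = counter.read()
    b_value, b_version = counter.read()
    counter.write(a_value + 1, a_version)
    try:
        counter.write(b_value + 1, b_version)
        return "lost update"  # second write silently overwrote the first
    except VersionConflict:
        return "conflict detected"
```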
|
|
12
|
+
|
|
13
|
+
## Backpressure and resources
|
|
14
|
+
- Backpressure failure: slow downstream causes upstream queue growth, OOM, or queue saturation
|
|
15
|
+
- Resource starvation: high-priority tasks monopolize resources so lower-priority work never runs
|
|
16
|
+
- Connection pool exhaustion: unreleased or delayed-release connections
|
|
17
|
+
- File/socket leaks: exception paths skip close/release
|
|
18
|
+
|
|
19
|
+
## Distributed systems
|
|
20
|
+
- Network partition or intermittently unreachable peers: requires a retry, degradation, or isolation strategy
|
|
21
|
+
- Retry storms: retry amplification under failure
|
|
22
|
+
- Consistency gaps: stale reads or partial writes
|
|
23
|
+
- Duplicate messages: at-least-once delivery causes duplicate processing
|
|
24
|
+
- Message ordering: reordering/out-of-order events corrupt state
|
|
25
|
+
- Clock skew: time-based ordering/expiration becomes incorrect
|
|
26
|
+
|
|
27
|
+
## Timeout and cancellation
|
|
28
|
+
- Timeout not propagated: child tasks continue and consume resources
|
|
29
|
+
- Non-reentrant cancellation: cancelling is not safe to repeat, so a retry after cancellation leaves inconsistent state
|
|
30
|
+
- Timeout boundary flapping: unstable behavior near timeout thresholds
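
Timeout propagation is easier to probe when every child step receives the parent's remaining budget. The sketch below shows that idea with an injected clock so it can be tested deterministically; `run_steps` and `DeadlineExceeded` are hypothetical names, not part of any library.

```python
class DeadlineExceeded(Exception):
    """Raised when the shared deadline has passed."""


def run_steps(steps, timeout_s, now):
    """Run (name, step) pairs under one shared deadline.

    Each step receives its remaining budget in seconds, so a child can stop
    early instead of running past the parent's timeout. `now` is a callable
    returning the current time in seconds (for example time.monotonic).
    """
    deadline = now() + timeout_s
    completed = []
    for name, step in steps:
        budget = deadline - now()
        if budget <= 0:
            raise DeadlineExceeded(
                f"no budget left before step {name!r}; completed={completed}"
            )
        step(budget)
        completed.append(name)
    return completed
```

In real code `now` would typically be `time.monotonic`; a fake clock makes the near-threshold cases from the list above repeatable.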
|
|
31
|
+
|
|
32
|
+
## Error handling and rollback
|
|
33
|
+
- Partial success: multi-step writes complete only partially
|
|
34
|
+
- Rollback failure: compensation action fails and leaves inconsistent data
|
|
35
|
+
- Swallowed exceptions: errors are neither surfaced nor logged
|
|
36
|
+
- Missing idempotency: retries create duplicate side effects
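
The missing-idempotency case is often checked by replaying the same request and counting side effects. A minimal sketch, assuming an in-memory store; `PaymentLedger` and its receipt shape are hypothetical.

```python
class PaymentLedger:
    """Hypothetical service that records charges keyed by an idempotency key."""

    def __init__(self):
        self.charges = []   # side effects actually performed
        self._results = {}  # idempotency key -> first result

    def charge(self, idempotency_key, amount_cents):
        # A retried request with a known key returns the stored result
        # instead of creating a second charge.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges.append(amount_cents)
        receipt = {"charge_no": len(self.charges), "amount_cents": amount_cents}
        self._results[idempotency_key] = receipt
        return receipt
```

A probe then calls `charge` twice with the same key and checks that `charges` holds exactly one entry.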
|
|
37
|
+
|
|
38
|
+
## Deployment and versioning
|
|
39
|
+
- Rolling upgrade mismatch: old/new versions run together with inconsistent behavior
|
|
40
|
+
- Config drift: node configurations diverge
|
|
41
|
+
- Hot reload instability: temporary unavailability or state loss during reload
|