@laitszkin/apollo-toolkit 3.3.5 → 3.4.0

Files changed (42)
  1. package/AGENTS.md +1 -0
  2. package/CHANGELOG.md +8 -0
  3. package/README.md +1 -0
  4. package/analyse-app-logs/scripts/__pycache__/filter_logs_by_time.cpython-312.pyc +0 -0
  5. package/analyse-app-logs/scripts/__pycache__/log_cli_utils.cpython-312.pyc +0 -0
  6. package/analyse-app-logs/scripts/__pycache__/search_logs.cpython-312.pyc +0 -0
  7. package/develop-new-features/README.md +9 -19
  8. package/develop-new-features/SKILL.md +14 -24
  9. package/develop-new-features/agents/openai.yaml +1 -1
  10. package/docs-to-voice/scripts/__pycache__/docs_to_voice.cpython-312.pyc +0 -0
  11. package/enhance-existing-features/README.md +9 -21
  12. package/enhance-existing-features/SKILL.md +16 -27
  13. package/enhance-existing-features/agents/openai.yaml +1 -1
  14. package/generate-spec/README.md +4 -3
  15. package/generate-spec/SKILL.md +14 -5
  16. package/generate-spec/agents/openai.yaml +1 -1
  17. package/generate-spec/references/templates/checklist.md +5 -0
  18. package/generate-spec/references/templates/tasks.md +38 -9
  19. package/generate-spec/scripts/__pycache__/create-specscpython-312.pyc +0 -0
  20. package/katex/scripts/__pycache__/render_katex.cpython-312.pyc +0 -0
  21. package/open-github-issue/scripts/__pycache__/open_github_issue.cpython-312.pyc +0 -0
  22. package/package.json +1 -1
  23. package/read-github-issue/scripts/__pycache__/find_issues.cpython-312.pyc +0 -0
  24. package/read-github-issue/scripts/__pycache__/read_issue.cpython-312.pyc +0 -0
  25. package/resolve-review-comments/scripts/__pycache__/review_threads.cpython-312.pyc +0 -0
  26. package/test-case-strategy/LICENSE +21 -0
  27. package/test-case-strategy/README.md +27 -0
  28. package/test-case-strategy/SKILL.md +110 -0
  29. package/test-case-strategy/agents/openai.yaml +4 -0
  30. package/test-case-strategy/references/e2e-tests.md +31 -0
  31. package/test-case-strategy/references/integration-tests.md +32 -0
  32. package/test-case-strategy/references/property-based-tests.md +43 -0
  33. package/test-case-strategy/references/unit-tests.md +59 -0
  34. package/text-to-short-video/scripts/__pycache__/enforce_video_aspect_ratio.cpython-312.pyc +0 -0
  35. package/develop-new-features/references/testing-e2e.md +0 -36
  36. package/develop-new-features/references/testing-integration.md +0 -42
  37. package/develop-new-features/references/testing-property-based.md +0 -44
  38. package/develop-new-features/references/testing-unit.md +0 -37
  39. package/enhance-existing-features/references/e2e-tests.md +0 -26
  40. package/enhance-existing-features/references/integration-tests.md +0 -30
  41. package/enhance-existing-features/references/property-based-tests.md +0 -33
  42. package/enhance-existing-features/references/unit-tests.md +0 -29
@@ -0,0 +1,27 @@
1
+ # test-case-strategy
2
+
3
+ Shared testing strategy skill for choosing risk-driven test cases across spec generation, new-feature implementation, and brownfield feature changes.
4
+
5
+ ## Core capabilities
6
+
7
+ - Selects the smallest useful test level for each risk: unit, regression, property-based, integration, E2E, adversarial, or mock/fake scenario coverage.
8
+ - Defines meaningful oracles before implementation so tests verify requirements instead of echoing newly written code.
9
+ - Adds focused unit drift checks for atomic implementation tasks.
10
+ - Records concrete test IDs, target units or flows, fixture strategy, verification hooks, and `N/A` reasons.
11
+ - Provides reusable references for unit, property-based, integration, and E2E test decisions.
12
+
13
+ ## Repository structure
14
+
15
+ - `SKILL.md`: Shared test selection workflow and output contract.
16
+ - `agents/openai.yaml`: Agent interface metadata and default prompt.
17
+ - `references/`: Focused guides for unit drift checks, property-based tests, integration tests, and E2E tests.
18
+
19
+ ## Typical usage
20
+
21
+ ```text
22
+ Use $test-case-strategy to choose the right tests for this change and define the unit drift checks before implementation.
23
+ ```
24
+
25
+ ## License
26
+
27
+ MIT. See `LICENSE`.
@@ -0,0 +1,110 @@
1
+ ---
2
+ name: test-case-strategy
3
+ description: Select and design risk-driven test cases for agent implementation work. Use when specs, new-feature work, brownfield feature changes, refactors, or bug fixes need a concrete decision about unit, regression, property-based, integration, E2E, adversarial, mock/fake, or drift-check coverage.
4
+ ---
5
+
6
+ # Test Case Strategy
7
+
8
+ ## Dependencies
9
+
10
+ - Required: none.
11
+ - Conditional: none.
12
+ - Optional: none.
13
+ - Fallback: not applicable.
14
+
15
+ ## Standards
16
+
17
+ - Evidence: Base every test decision on changed behavior, requirement IDs, risk class, dependency shape, and existing coverage; do not add tests only because a template lists them.
18
+ - Execution: Choose the smallest test level that can prove the risk, then escalate to broader tests only when lower-level tests cannot observe the behavior or contract.
19
+ - Quality: Each test must have a meaningful oracle derived from the requirement, design, or contract rather than from the implementation just written.
20
+ - Output: Return concrete test case IDs, test level, target unit or flow, oracle, fixture/mock strategy, command or verification hook, and any `N/A` reason.
21
+
22
+ ## Goal
23
+
24
+ Provide one shared testing decision workflow for spec generation and implementation skills so agents select useful tests consistently and use fast focused checks to detect implementation drift.
25
+
26
+ ## Workflow
27
+
28
+ ### 1) Build the risk inventory
29
+
30
+ - Identify changed behaviors, requirement IDs, and affected modules.
31
+ - Classify risks before choosing test types:
32
+ - boundary
33
+ - regression
34
+ - authorization or permission denial
35
+ - invalid transition
36
+ - idempotency, replay, or duplicate submission
37
+ - concurrency or race
38
+ - data integrity
39
+ - external failure or inconsistent dependency state
40
+ - partial write, rollback, or compensation
41
+ - adversarial abuse
42
+ - Reuse existing coverage only after naming the exact suite, test case, and risk it already proves.
43
+
44
+ ### 2) Choose the narrowest valid test level
45
+
46
+ - Use unit tests for isolated changed logic, boundaries, denials, exact errors, no-side-effect expectations, and fast implementation drift checks.
47
+ - Use regression tests when a bug-prone or historically fragile behavior must not silently return.
48
+ - Use property-based tests for business rules with describable invariants, generated input spaces, valid/invalid state transitions, or external-state matrices that can be mocked.
49
+ - Use integration tests for cross-module chains, repository/service/API/event interactions, configuration wiring, persistence, IO, and controlled external-service scenarios.
50
+ - Use E2E tests only for critical user-visible paths whose risk is not sufficiently proven by lower-level tests; keep them minimal and stable.
51
+ - Use adversarial tests for malformed input, forged identities, invalid transitions, replay, stale/out-of-order events, toxic payload sizes, and risky edge combinations.
52
+ - If E2E is too costly or unstable, replace it with integration coverage for the same risk and record the replacement.
53
+
54
+ ### 3) Define the oracle before implementation
55
+
56
+ - Derive expected behavior from `spec.md`, `design.md`, `contract.md`, official documentation, or existing intended behavior.
57
+ - Never derive the oracle from the new implementation after writing it.
58
+ - Prefer exact assertions:
59
+ - exact output
60
+ - exact error class or denial reason
61
+ - persisted state
62
+ - emitted event
63
+ - retry or compensation accounting
64
+ - intentional absence of writes, notifications, or side effects
65
+ - allowed state transition or rejection
66
+ - Avoid assertion-light smoke tests, snapshot-only tests, and tests that only prove "does not throw" unless that is the real requirement.
67
+
68
+ ### 4) Add unit drift checks for atomic implementation tasks
69
+
70
+ - For each non-trivial atomic task, decide whether a focused unit drift check is possible.
71
+ - A unit drift check should name:
72
+ - target unit: function, method, module, policy, parser, mapper, validator, or state transition owner
73
+ - input state: minimal fixture, table row, fake dependency state, or boundary value
74
+ - oracle: exact output, error, state change, or no-side-effect assertion
75
+ - command: focused test command or existing test filter to run immediately after the task
76
+ - If no unit drift check is possible, record the smallest replacement check and a concrete reason.
77
+ - Do not allow broad integration or E2E tests to hide missing unit checks for locally owned business logic.
78
+
79
+ ### 5) Record the decision
80
+
81
+ Use this compact record in planning docs or implementation summaries:
82
+
83
+ ```text
84
+ Test ID: UT-01 / REG-01 / PBT-01 / IT-01 / E2E-01 / ADV-01
85
+ Requirement/Risk: R?.? / boundary | regression | ...
86
+ Target: function/module/flow
87
+ Fixture or mock strategy: ...
88
+ Oracle: exact output/error/state/no-side-effect
89
+ Verification hook: command or existing suite
90
+ N/A reason: only when this test level is not suitable
91
+ ```
92
+
93
+ ## Working Rules
94
+
95
+ - Start from risk and oracle, not from a desired test count.
96
+ - Prefer fast focused tests for implementation drift detection.
97
+ - Keep one test focused on one behavior or one failure mode.
98
+ - Use table-driven unit tests when many discrete business cases share one oracle.
99
+ - Use mocks/fakes for external services in business logic chains unless the real external contract itself is under test.
100
+ - Keep fixtures reproducible with fixed clocks, seeds, and controlled dependency states.
101
+ - Preserve failing generated examples or seeds as regression coverage.
102
+ - Record `N/A` only with a concrete reason tied to scope, risk, or observability.
103
+ - When a spec exists, map test IDs back to `tasks.md` and `checklist.md`.
104
+
105
+ ## References
106
+
107
+ - `references/unit-tests.md`: unit test and drift-check design.
108
+ - `references/property-based-tests.md`: property-based test selection and oracle design.
109
+ - `references/integration-tests.md`: integration test selection and external-state scenarios.
110
+ - `references/e2e-tests.md`: E2E decision and replacement rules.
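
Reviewer note: the compact record in step 5 of the new `SKILL.md` could be checked mechanically. The sketch below is one possible reading, not part of the package: `DecisionRecord`, `LEVEL_PREFIXES`, and `problems()` are hypothetical names, and the rule that an `N/A` record must carry a non-empty reason is inferred from the "Working Rules" section.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical: ID prefixes copied from the "Test ID" line of the record template.
LEVEL_PREFIXES = {"UT", "REG", "PBT", "IT", "E2E", "ADV"}


@dataclass(frozen=True)
class DecisionRecord:
    """One test decision, mirroring the fields of the compact record."""

    test_id: str
    requirement_or_risk: str
    target: str
    fixture_strategy: str
    oracle: str
    verification_hook: str
    na_reason: Optional[str] = None  # set only when the test level is not suitable

    def problems(self) -> list:
        """Return a list of human-readable problems; empty means the record is usable."""
        found = []
        prefix, _, number = self.test_id.rpartition("-")
        if prefix not in LEVEL_PREFIXES or not number.isdigit():
            found.append("test_id must look like UT-01 or IT-02")
        if self.na_reason is None:
            # A real test needs an oracle and a way to run it.
            if not self.oracle.strip():
                found.append("oracle is required")
            if not self.verification_hook.strip():
                found.append("verification hook is required")
        elif not self.na_reason.strip():
            found.append("N/A needs a concrete reason")
        return found
```

A record such as `DecisionRecord("UT-01", "R1.2 / boundary", "parse_amount", "table row", "raises ValueError for negatives", "pytest -k parse_amount")` would pass, while an `E2E-01` record with a blank `na_reason` would be flagged. The requirement ID and command here are made up for illustration.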
@@ -0,0 +1,4 @@
1
+ interface:
2
+ display_name: "Test Case Strategy"
3
+ short_description: "Select risk-driven tests and unit drift checks"
4
+ default_prompt: "Use $test-case-strategy to choose the smallest useful test level for each changed behavior, starting from requirement IDs, risk class, dependency shape, and existing coverage. Define oracles before implementation, prefer focused unit drift checks for atomic tasks, escalate to property-based, integration, E2E, adversarial, or mock/fake scenario coverage only when the risk requires it, and return concrete test IDs, target unit or flow, fixture strategy, verification hook, and any N/A reason."
@@ -0,0 +1,31 @@
1
+ # E2E Tests
2
+
3
+ ## Purpose
4
+
5
+ - Verify critical user-visible paths at the closest practical level to real usage.
6
+ - Catch cross-layer behavior gaps that lower-level tests cannot prove.
7
+
8
+ ## Required when
9
+
10
+ - A change affects a high-impact user-visible flow.
11
+ - The flow is multi-step, cross-system, historically fragile, revenue-critical, permission-sensitive, or hard to reason about from lower-level tests alone.
12
+ - The environment and test data can be kept stable enough for maintainable coverage.
13
+
14
+ ## Not suitable when
15
+
16
+ - Unit and integration tests already prove the actual risk.
17
+ - The environment is unstable or external dependencies would make the test flaky.
18
+ - The cost is disproportionate and integration tests can cover the same risk.
19
+
20
+ ## Design rules
21
+
22
+ - Keep E2E minimal: one critical success path plus one highest-value denial or failure path when warranted.
23
+ - Assert business-visible outcomes, not just DOM presence or status-code success.
24
+ - Use controlled test data and avoid brittle external dependencies.
25
+ - Prefer replacing expensive E2E with stronger integration tests over adding flaky coverage.
26
+ - Record the decision per flow or risk slice; do not collapse unrelated paths into one global E2E decision.
27
+
28
+ ## Recording
29
+
30
+ - Record `E2E-xx`, target flow, business-visible oracle, setup, command, and result.
31
+ - If skipped, record the replacement `IT-xx` cases and concrete rationale.
@@ -0,0 +1,32 @@
1
+ # Integration Tests
2
+
3
+ ## Purpose
4
+
5
+ - Verify collaboration across modules, layers, repositories, handlers, services, event flows, persistence, configuration, or controlled external-service scenarios.
6
+ - Cover risks unit tests cannot observe: sequence, wiring, persistence, IO, config, retry, fallback, and cross-boundary side effects.
7
+
8
+ ## Required when
9
+
10
+ - The changed behavior depends on service/repository/API/event/module collaboration.
11
+ - The correctness question is about a user-critical business chain rather than one isolated unit.
12
+ - External dependency states affect business behavior and can be mocked or faked.
13
+ - E2E is not suitable but equivalent cross-layer risk still needs coverage.
14
+
15
+ ## Not suitable when
16
+
17
+ - A pure unit owns the behavior and unit tests can fully observe the risk.
18
+ - A stable E2E test is required and feasible for a critical user-visible path.
19
+
20
+ ## Design rules
21
+
22
+ - Keep dependencies inside the application boundary near-real when practical.
23
+ - Mock or fake external services unless the real service contract itself is under test.
24
+ - Build scenario matrices for success, timeout, retry exhaustion, partial data, stale data, duplicate callback, inconsistent response, and permission failure.
25
+ - Assert business outcomes across boundaries: persisted state, emitted event, deduplication, retry accounting, audit trail, compensation, or no partial write.
26
+ - Include adversarial paths when invalid transition, replay, double-submit, forged identifier, or out-of-order event risks exist.
27
+ - Keep data reconstructable and cleanup reliable.
28
+
29
+ ## Recording
30
+
31
+ - Record `IT-xx`, modules involved, external dependency strategy, scenario matrix, oracle, requirement mapping, and command.
32
+ - If integration replaces E2E, map the replacement explicitly.
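
Reviewer note: the scenario-matrix rule in the new `integration-tests.md` can be illustrated with a scripted fake. Everything in this sketch is hypothetical (`FakeGateway`, `OrderRepository`, `pay_order`, the three-attempt retry limit); it only shows the shape of a matrix that asserts retry accounting and the absence of a partial write, not how any real project wires its services.

```python
class GatewayTimeout(Exception):
    """Raised by the fake gateway to simulate an external timeout."""


class FakeGateway:
    """Fake external service that replays a scripted list of outcomes."""

    def __init__(self, outcomes):
        self._outcomes = list(outcomes)
        self.calls = 0

    def charge(self, order_id, amount):
        self.calls += 1
        outcome = self._outcomes.pop(0)
        if outcome == "timeout":
            raise GatewayTimeout(order_id)
        return outcome


class OrderRepository:
    """In-memory stand-in for persistence inside the application boundary."""

    def __init__(self):
        self.paid = {}

    def mark_paid(self, order_id, amount):
        self.paid[order_id] = amount


def pay_order(gateway, repo, order_id, amount, max_attempts=3):
    """Business chain under test: charge with retries, persist only on success."""
    attempts = 0
    while attempts < max_attempts:
        attempts += 1
        try:
            gateway.charge(order_id, amount)
        except GatewayTimeout:
            continue
        repo.mark_paid(order_id, amount)
        return {"status": "paid", "attempts": attempts}
    return {"status": "failed", "attempts": attempts}


# (name, scripted gateway outcomes, expected status, expected attempts, expected persisted state)
SCENARIOS = [
    ("success", ["ok"], "paid", 1, {"o-1": 100}),
    ("timeout then success", ["timeout", "ok"], "paid", 2, {"o-1": 100}),
    ("retry exhaustion", ["timeout"] * 3, "failed", 3, {}),
]


def run_matrix():
    """Run every scenario against fresh fakes and return the names that passed."""
    passed = []
    for name, outcomes, status, attempts, persisted in SCENARIOS:
        gateway, repo = FakeGateway(outcomes), OrderRepository()
        result = pay_order(gateway, repo, "o-1", 100)
        assert result == {"status": status, "attempts": attempts}, name
        assert repo.paid == persisted, name  # no partial write on failure
        assert gateway.calls == attempts, name  # retry accounting
        passed.append(name)
    return passed
```

The expected values in `SCENARIOS` play the role of the oracle: they are written down per external state rather than read back from `pay_order`.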
@@ -0,0 +1,43 @@
1
+ # Property-Based Tests
2
+
3
+ ## Purpose
4
+
5
+ - Verify invariants across broad or generated input spaces.
6
+ - Validate business rules as allowed outputs, forbidden outputs, valid transitions, rejection rules, or safety constraints.
7
+ - Catch combinational, adversarial, and boundary behaviors that fixed examples miss.
8
+
9
+ ## Required when
10
+
11
+ - Logic has describable invariants: calculation, transformation, sorting, aggregation, serialization, normalization, deduplication, routing, ranking, or state transition.
12
+ - Business rules can be expressed as predicates, allow-lists, forbidden states, or metamorphic relationships.
13
+ - External-service-dependent logic can use mocks/fakes to generate service states.
14
+
15
+ ## Not suitable when
16
+
17
+ - The only meaningful risk is a real external integration contract.
18
+ - The input space is tiny and better covered by exhaustive unit tests.
19
+ - The behavior is UI-only or lacks a stable machine-verifiable oracle.
20
+
21
+ ## Design rules
22
+
23
+ - State the property in one sentence before writing generators.
24
+ - Generate normal cases, boundaries, extremes, malformed inputs, and suspicious/adversarial combinations.
25
+ - Prefer direct business-rule predicates over vague structural checks.
26
+ - Use state-machine or sequence properties for stateful flows.
27
+ - Use metamorphic properties when exact outputs are hard to predict.
28
+ - Preserve failing seeds/examples as regression tests.
29
+ - Control sample count and input size so the suite remains practical.
30
+
31
+ ## Common properties
32
+
33
+ - Round trip: `decode(encode(x)) == x`.
34
+ - Sorting is monotonic and preserves the input multiset.
35
+ - Merge/split operations preserve total count or value.
36
+ - Replaying the same command remains idempotent.
37
+ - Generated invalid or unauthorized inputs always fail with an expected result class.
38
+ - Generated transitions always end in allowed states or explicit rejections.
39
+ - Under generated mocked service states, fallback, retry, and compensation rules still hold.
40
+
41
+ ## Recording
42
+
43
+ - Record `PBT-xx`, property, generator strategy, oracle, requirement mapping, and replay instructions for failures.
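
Reviewer note: two of the "Common properties" in the new `property-based-tests.md` (round trip, and sorting that is monotonic and preserves the multiset) can be sketched with only the standard library. This is a deliberately simplified stand-in: a seeded `random.Random` replaces a real property-based testing library, there is no shrinking, and `random_record`, `check_round_trip`, and `check_sorting` are hypothetical helper names. JSON encoding is used as the example codec purely for illustration.

```python
import json
import random
from collections import Counter


def random_record(rng):
    """Generate a small JSON-compatible dict, including empty and extreme values."""
    keys = ["", "a", "key with space", "ünïcode"]
    values = [0, -1, 2**31, "", "text", None, True, [1, 2], {"nested": []}]
    return {rng.choice(keys): rng.choice(values) for _ in range(rng.randint(0, 4))}


def check_round_trip(seed, samples=200):
    """Property: decode(encode(x)) == x for every generated record."""
    rng = random.Random(seed)  # fixed seed keeps failures replayable
    for _ in range(samples):
        record = random_record(rng)
        assert json.loads(json.dumps(record)) == record, (seed, record)


def check_sorting(seed, samples=200):
    """Property: sorted output is monotonic and preserves the input multiset."""
    rng = random.Random(seed)
    for _ in range(samples):
        items = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        result = sorted(items)
        assert all(a <= b for a, b in zip(result, result[1:])), (seed, items)
        assert Counter(result) == Counter(items), (seed, items)
```

Because the seed and the failing input are part of each assertion message, a failure can be replayed and then pinned as a fixed regression example, which is what the "Preserve failing seeds/examples" rule asks for.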
@@ -0,0 +1,59 @@
1
+ # Unit Tests And Drift Checks
2
+
3
+ ## Purpose
4
+
5
+ - Verify the smallest changed behavior unit: function, method, policy, parser, mapper, validator, state transition owner, or pure logic module.
6
+ - Localize failures quickly while the agent is still implementing a single task.
7
+ - Detect implementation drift by comparing the changed unit against an oracle defined before the code change.
8
+
9
+ ## Required when
10
+
11
+ - A task changes non-trivial local logic.
12
+ - A requirement has boundary, denial, validation, state transition, idempotency, authorization, error, or no-side-effect behavior.
13
+ - The input space is small and discrete enough to enumerate expected outputs.
14
+ - A bug-prone behavior needs a fixed regression example.
15
+ - The task can be verified without DB, RPC, filesystem, browser, queue, or multi-module orchestration.
16
+
17
+ ## Not suitable when
18
+
19
+ - Correctness depends on real cross-module collaboration, persistence, IO, or configuration wiring.
20
+ - The main risk is the external integration contract itself.
21
+ - The behavior is only observable through a user-visible end-to-end path.
22
+
23
+ ## Drift-check record
24
+
25
+ ```text
26
+ Unit drift check:
27
+ Target unit: [function/module/policy]
28
+ Requirement: [R?.?]
29
+ Fixture/input: [minimal state or table row]
30
+ Oracle: [exact output/error/state/no-side-effect]
31
+ Test case ID: [UT-xx or REG-xx]
32
+ Run after task: [focused command or test filter]
33
+ N/A reason: [only if no unit check can observe the task]
34
+ ```
35
+
36
+ ## Design rules
37
+
38
+ - Define the oracle from the spec, design, contract, official docs, or established intended behavior before implementation.
39
+ - Keep one test focused on one behavior or failure mode.
40
+ - Use table-driven tests for small business matrices.
41
+ - Cover both accepted and rejected states when the unit owns a decision.
42
+ - Assert exact errors, result classes, state changes, and intentional lack of side effects.
43
+ - Mock, stub, or fake external dependencies; unit tests should not need DB/RPC/file/browser IO.
44
+ - Control time, randomness, environment variables, and global state.
45
+ - Do not accept tests that only assert "does not throw", "returns truthy", or snapshot shape unless those are the real business oracle.
46
+
47
+ ## Useful unit cases
48
+
49
+ - Boundary value is accepted at the limit and rejected outside it.
50
+ - Invalid input returns the specified error and performs no write.
51
+ - Unauthorized actor is denied with the expected reason.
52
+ - Repeating an idempotent call returns the same outcome without duplicate side effects.
53
+ - State transition from `A` to `B` is allowed, while `A` to `C` is rejected.
54
+ - Mapper/parser preserves required fields and rejects malformed variants.
55
+
56
+ ## Recording
57
+
58
+ - In specs, map each `UT-xx` or `REG-xx` to requirement IDs and checklist items.
59
+ - In direct implementation, report test IDs, command, result, and any `N/A` reason.
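
Reviewer note: the "Useful unit cases" in the new `unit-tests.md` lend themselves to a table-driven check. The sketch below assumes a hypothetical unit `reserve` with a hypothetical limit of 10; the table is meant to be written from the requirement before the unit is implemented, so that it can serve as the drift-check oracle.

```python
class LimitExceeded(ValueError):
    """Hypothetical error class that the requirement specifies for out-of-range input."""


MAX_QUANTITY = 10  # hypothetical limit taken from an imagined requirement


def reserve(stock_log, quantity):
    """Unit under test: record a reservation only when the quantity is within limits."""
    if quantity < 1 or quantity > MAX_QUANTITY:
        raise LimitExceeded(f"quantity must be between 1 and {MAX_QUANTITY}")
    stock_log.append(quantity)
    return quantity


# (case id, quantity, expected error class or None)
CASES = [
    ("UT-01 lower limit accepted", 1, None),
    ("UT-02 upper limit accepted", 10, None),
    ("UT-03 below limit rejected", 0, LimitExceeded),
    ("UT-04 above limit rejected", 11, LimitExceeded),
]


def run_cases():
    """Run every table row against a fresh log and return the case IDs that passed."""
    passed = []
    for case_id, quantity, expected_error in CASES:
        stock_log = []
        if expected_error is None:
            assert reserve(stock_log, quantity) == quantity, case_id
            assert stock_log == [quantity], case_id
        else:
            try:
                reserve(stock_log, quantity)
            except Exception as exc:
                # Exact error class, not merely "something was raised".
                assert type(exc) is expected_error, case_id
            else:
                raise AssertionError(f"{case_id}: expected {expected_error.__name__}")
            assert stock_log == [], case_id  # intentional absence of side effects
        passed.append(case_id)
    return passed
```

Each row covers one behavior, and the rejected rows assert both the exact error class and that nothing was written, matching the "performs no write" case in the list above.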
@@ -1,36 +0,0 @@
1
- # E2E Testing Principles
2
-
3
- ## Core rules
4
- - E2E is not decided solely by explicit user request.
5
- - The agent must decide E2E based on feature importance, complexity, and cross-layer risk.
6
- - For high-risk key user paths, create the smallest necessary E2E coverage first.
7
- - If E2E is unstable, too costly, or environment-limited, add integration coverage for equivalent risk and record the alternative.
8
-
9
- ## Purpose
10
- - Verify critical end-to-end user paths are usable.
11
- - Catch behavior gaps after cross-system/cross-layer integration.
12
- - Provide confidence close to real usage for high-risk scenarios.
13
-
14
- ## Decision criteria
15
- - Importance: core feature, critical revenue flow, or high-impact process.
16
- - Complexity: multi-step state transitions, branching flows, cross-service collaboration.
17
- - Risk: historical regressions, fragile integrations, major user-visible failures.
18
- - Maintainability: stable environment and controllable test data.
19
-
20
- ## Not suitable when
21
- - Feature risk is low and unit/integration tests already cover it sufficiently.
22
- - E2E is unstable and disproportionately expensive while integration tests can cover key risk.
23
-
24
- ## Design guidance
25
- - Cover only the most critical paths; avoid expanding into full UI test suites.
26
- - Keep test data controllable (fixed seeds or recyclable fixtures).
27
- - Prioritize stability; avoid brittle external dependencies, use controlled substitutes if needed.
28
- - Prefer one critical success path and one highest-value denial/failure path over many shallow happy-path journeys.
29
- - Assert business-visible outcomes, not just DOM presence: final state, permission denial, user-facing error, persisted result, or prevented duplicate action.
30
- - Keep cost decisions explicit: document why E2E is done or not done and what alternative strategy is used.
31
-
32
- ## Spec/checklist authoring hints
33
- - Mark high-risk key paths in `spec.md` requirement descriptions.
34
- - Record E2E decisions, mapped test cases, and results in `checklist.md`.
35
- - When different flows need different strategies, create multiple decision records instead of forcing one shared E2E decision for the whole feature.
36
- - If skipping E2E, specify replacement integration test cases (`IT-xx`) and rationale in `checklist.md`.
@@ -1,42 +0,0 @@
1
- # Integration Testing Principles
2
-
3
- ## Purpose
4
- - Verify collaboration across modules/layers and external dependencies.
5
- - Cover integration risks unit tests cannot capture (sequence, config, IO failure).
6
- - Validate user-critical business logic chains under realistic component interaction and controlled external-service scenarios.
7
-
8
- ## When to use
9
- - Interface interactions between modules (for example service ↔ repository).
10
- - Changes touching IO dependencies such as DB, RPC, files, cache, queues.
11
- - Behaviors that depend on configuration combinations or environment differences.
12
- - The correctness question is about the whole business logic chain rather than one isolated function.
13
- - As minimum safety replacement when E2E is not suitable.
14
-
15
- ## Not suitable when
16
- - Single pure-function or pure-logic behavior (use unit tests).
17
- - Full end-to-end user flow can be stably covered by E2E.
18
-
19
- ## Relationship with E2E
20
- - If change importance/complexity is high and E2E is feasible, prefer minimal E2E for key paths.
21
- - If E2E is hard or too costly, integration tests must cover equivalent key risks.
22
- - Record replacement mapping in `checklist.md` (E2E-xx ↔ IT-xx) with rationale.
23
-
24
- ## Design guidance
25
- - Focus on high-value integration points; each test should justify risk/value.
26
- - Keep dependencies inside the application boundary near-real where practical.
27
- - Mock/fake external services at the business-chain boundary unless the real service contract itself is what needs verification.
28
- - Build scenario matrices for external states such as success, timeout, retries exhausted, partial data, stale data, duplicate callbacks, inconsistent responses, and permission failures.
29
- - Add adversarial/penetration-style cases for abuse paths such as invalid transitions, replay, double-submit, forged identifiers, or out-of-order events when those risks exist.
30
- - When workflows can partially commit, assert rollback/compensation/no-partial-write behavior instead of only final status codes.
31
- - Assert business outcomes across boundaries: persisted state, emitted events, deduplication, retry accounting, audit trail, or intentional absence of writes/notifications.
32
- - Add at least one regression-style integration test for the highest-risk chain whenever the change fixes a bug or touches a historically fragile path.
33
- - Keep reproducible: controlled test data and recoverable environment.
34
- - Keep cost controlled; avoid broad redundant coverage (leave that to unit tests).
35
-
36
- ## Spec/checklist authoring hints
37
- - Dependency scope: list involved modules/external systems.
38
- - Scenario: describe cross-module flow or critical branch.
39
- - Risk: explain what integration failure, misconfiguration, or business-chain break this test can reveal.
40
- - External dependency strategy: specify which services are mocked/faked versus near-real and why.
41
- - Scenario matrix: list the external states or adversarial paths covered.
42
- - Map behavior, test IDs, and test outcomes in `checklist.md`.
@@ -1,44 +0,0 @@
1
- # Property-based Testing Principles
2
-
3
- ## Purpose
4
- - Verify invariants/properties hold across broad input spaces.
5
- - Validate business rules by generating or exhaustively enumerating meaningful input spaces and checking outputs against expected business behavior.
6
- - Catch combinational, adversarial, and boundary behaviors that fixed examples often miss.
7
-
8
- ## When to use
9
- - Algorithms, transformations, serialization/deserialization, sorting, aggregation.
10
- - Behaviors requiring consistency or reversibility (for example round-trip).
11
- - Data structures or state transitions with clear invariants.
12
- - Business logic where the rule can be stated as input/output expectations, allowed states, forbidden states, or safety constraints.
13
- - Logic chains that depend on external services, when those services can be replaced by controllable mocks/fakes and their states generated as part of the test space.
14
-
15
- ## Not suitable when
16
- - The main thing being validated is the real integration contract with external systems or live IO (use integration tests).
17
- - UI/interactive flows without stable invariants.
18
- - Very small discrete input spaces (unit tests are sufficient).
19
-
20
- ## Design guidance
21
- - Properties must be explicit and machine-verifiable, whether they are invariants, allowed outcome sets, rejection rules, or business-output predicates.
22
- - Generators should cover normal cases, boundaries, extremes, malformed inputs, and suspicious/adversarial combinations.
23
- - Prefer modeling business rules directly: generate inputs, run the logic, then assert the output/error/state transition matches the rule.
24
- - When the behavior is stateful, prefer state-machine or sequence-based properties over isolated single-call generators.
25
- - When exact outputs are hard to predict, use metamorphic properties (for example reordering, retrying, deduplicating, or replaying inputs should preserve an allowed relation).
26
- - For external-service-dependent logic, mock/fake the service and generate multiple service states (success, timeout, empty, partial, stale, inconsistent, duplicate, rejected).
27
- - Ensure reproducibility (fixed seed or replayable input generation) and preserve failing seeds/examples for regression coverage.
28
- - Complement unit tests; avoid duplicating fixed-case tests.
29
- - Control cost with reasonable sample counts and input-size limits.
30
-
31
- ## Common property examples (description level)
32
- - `deserialize(serialize(x)) == x`
33
- - Sorted output is monotonic and preserves element multiset.
34
- - Merge/split operations preserve total element count.
35
- - Idempotency: repeating the same operation does not change results.
36
- - Invalid or unauthorized generated inputs always fail with an expected error/result class.
37
- - Generated order/payment/state-transition inputs always end in an allowed business state.
38
- - Under generated mocked service states, the business logic chain still satisfies fallback/retry/compensation rules.
39
-
40
- ## Spec/checklist authoring hints
41
- - Property/rule: one sentence stating the rule that must always hold or the allowed outcomes that must contain the result.
42
- - Generator strategy: input range, distribution, emphasized boundaries, and any adversarial or external-state dimensions.
43
- - Oracle/check: describe how the test decides correctness (predicate, allow-list, reference model, or expected error class).
44
- - Purpose: explain correctness/risk reduction value of this property.
@@ -1,37 +0,0 @@
1
- # Unit Testing Principles
2
-
3
- ## Purpose
4
- - Verify correctness of the smallest testable unit (function, method, or pure logic module).
5
- - Provide fast feedback with low-cost failure localization.
6
-
7
- ## When to use
8
- - Core business logic and critical branches.
9
- - Boundary conditions (upper/lower limits, null/empty, extreme values).
10
- - Error handling and exception paths (invalid input, incompatible state, etc.).
11
-
12
- ## Not suitable when
13
- - Behavior requires cross-module or external dependency verification (use integration tests).
14
- - Full user-flow validation is required (evaluate E2E first; if not suitable, use integration tests to cover risk).
15
-
16
- ## Design guidance
17
- - Isolate external dependencies with mock/stub/fake; avoid DB/RPC/file IO.
18
- - Keep tests small and focused: one test, one behavior/failure mode.
19
- - Do not stop at happy-path assertions; verify exact errors, rejected states, and intentional lack of side effects when the unit should block an action.
20
- - Cover both success and failure branches.
21
- - Where the input space is small and discrete, exhaustively enumerate business inputs and expected outputs.
22
- - Prefer table-driven cases when many small business permutations share the same oracle.
23
- - Add regression tests for bug-prone or high-risk logic so previously broken behavior cannot silently return.
24
- - If the unit owns authorization, invalid transition, idempotency, or concurrency decisions, test those denials explicitly.
25
- - Keep tests reproducible: avoid nondeterministic time/random/global state.
26
- - Avoid assertion-light smoke tests and snapshot-only coverage unless the snapshot has a strict business oracle behind it.
27
- - Map tests to requirements: each core requirement should have at least one unit test.
28
-
29
- ## Spec/checklist authoring hints
30
- - Scenario: describe input and initial state mapped to one requirement/boundary.
31
- - Expected result: verifiable output, state change, or error.
32
- - Purpose: explain which risk or bug type this test prevents.
33
-
34
- ## Common examples (description level)
35
- - Return a specific error when input is out of allowed range.
36
- - Handle empty list/empty string input with expected behavior.
37
- - Ensure output matches definition after state/flag switching.
@@ -1,26 +0,0 @@
- # E2E Testing Guide
-
- ## Purpose
- - Verify critical user-visible paths at end-to-end level.
- - Increase confidence in real behavior after cross-layer integration.
-
- ## Required when
- - If changes impact key user-visible flows, add or update E2E tests.
- - E2E must still be evaluated even when specs are not used; if not applicable, record explicit rationale.
-
- ## E2E decision rules
- - Prefer E2E for high-risk, high-impact, multi-step flow changes.
- - Integration tests may replace E2E when E2E is too costly, unstable, or hard to maintain.
- - When replacing E2E, provide equivalent risk coverage and record replacement cases plus reasons.
-
- ## Design guidance
- - Focus on minimal critical path coverage; avoid over-expansion.
- - Use stable test data and reproducible flows.
- - Prioritize business outcomes over brittle UI details.
- - Prefer one critical success path and one highest-value denial/failure path over many shallow happy-path journeys.
- - Assert business-visible outcomes, not just DOM presence: final state, permission denial, user-facing error, persisted result, or prevented duplicate action.
-
- ## Recording rules
- - Specs flow: record E2E or replacement strategy with outcomes in `checklist.md`.
- - When different flows need different strategies, create multiple decision records in `checklist.md` so each flow/risk slice keeps its own rationale and linked case IDs.
- - Non-specs flow: explain E2E execution or replacement testing with rationale in the response.
@@ -1,30 +0,0 @@
- # Integration Testing Guide
-
- ## Purpose
- - Verify correctness of cross-layer/cross-module collaboration.
- - Focus especially on user-critical logic chains.
- - Validate business outcomes across the full changed chain, not just connectivity.
-
- ## Required when
- - Any change affecting service/repository/API handlers/event flows should add or update integration tests.
- - Integration tests for user-critical logic chains are required even when specs are not used.
-
- ## Coverage focus
- - Key data flow from entrypoint to output.
- - Cross-module contract and configuration interaction.
- - Common failure patterns (timeout, data inconsistency, external dependency failure).
- - External dependency state changes and fallback/compensation behavior.
- - Adversarial/abuse paths such as invalid transitions, replay, duplication, forged identifiers, or out-of-order events when relevant.
-
- ## Design guidance
- - Prefer near-real dependencies inside the application boundary; mock/fake external services unless the real service contract itself is under test.
- - Build scenario matrices for external states such as success, timeout, retries exhausted, partial data, stale data, duplicate callbacks, inconsistent responses, and permission failures.
- - Keep test data reconstructable and cleanable.
- - When workflows can partially commit, assert rollback/compensation/no-partial-write behavior instead of only final status codes.
- - Assert business outcomes across boundaries: persisted state, emitted events, deduplication, retry accounting, audit trail, or intentional absence of writes/notifications.
- - Add at least one regression-style integration test for the highest-risk chain whenever the change fixes a bug or touches a historically fragile path.
- - Each test case should map to an explainable risk.
-
- ## Recording rules
- - Specs flow: record IT cases and outcomes in `checklist.md`.
- - Non-specs flow: list user-critical integration tests, mocked external scenarios, adversarial cases, and outcomes in the response.
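The scenario-matrix and no-partial-write guidance in the removed integration guide could look roughly like the sketch below. The payment chain, `FakePaymentGateway`, `InMemoryOrderRepo`, and the result strings are all hypothetical names invented for this example.

```python
class FakePaymentGateway:
    """Hypothetical stand-in for an external service with a scripted state."""

    def __init__(self, state):
        self.state = state
        self.calls = 0

    def charge(self, order_id, amount):
        self.calls += 1
        if self.state == "timeout":
            raise TimeoutError("gateway timed out")
        if self.state == "rejected":
            return {"status": "rejected"}
        return {"status": "ok"}


class InMemoryOrderRepo:
    """Near-real repository kept inside the application boundary."""

    def __init__(self):
        self.paid = set()

    def mark_paid(self, order_id):
        self.paid.add(order_id)


def pay_order(order_id, amount, gateway, repo):
    """Hypothetical chain under test: charge, then persist only on success."""
    if order_id in repo.paid:
        return "already_paid"  # deduplicate replayed requests
    try:
        response = gateway.charge(order_id, amount)
    except TimeoutError:
        return "retry_later"
    if response["status"] != "ok":
        return "declined"
    repo.mark_paid(order_id)
    return "paid"


# Scenario matrix: external state -> (expected result, expected persisted state).
SCENARIOS = {
    "ok": ("paid", {"o-1"}),
    "timeout": ("retry_later", set()),
    "rejected": ("declined", set()),
}


def run_matrix():
    for state, (expected_result, expected_paid) in SCENARIOS.items():
        gateway, repo = FakePaymentGateway(state), InMemoryOrderRepo()
        assert pay_order("o-1", 10, gateway, repo) == expected_result, state
        assert repo.paid == expected_paid, state  # no partial write on failure

    # Replay: a duplicate request must not reach the gateway a second time.
    gateway, repo = FakePaymentGateway("ok"), InMemoryOrderRepo()
    pay_order("o-1", 10, gateway, repo)
    assert pay_order("o-1", 10, gateway, repo) == "already_paid"
    assert gateway.calls == 1


run_matrix()
```

The assertions check persisted state and call counts, not only the returned status, which is the distinction the guide draws between business outcomes and mere connectivity.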
@@ -1,33 +0,0 @@
- # Property-based Testing Guide
-
- ## Purpose
- - Verify invariants across large input combinations.
- - Validate business rules by generating or exhaustively enumerating meaningful input spaces and checking outputs against expected business behavior.
- - Catch combinational, adversarial, and boundary behavior that fixed examples often miss.
-
- ## Required when
- - If changes include logic with describable invariants (calculation, transformation, sorting, aggregation, serialization), add/update property-based tests.
- - If changes include business rules that can be expressed as allowed outputs, forbidden outputs, valid transitions, rejection rules, or safety constraints, add/update property-based tests.
- - If logic depends on external services but the service can be replaced with a mock/fake to generate service states, property-based tests should cover those state combinations too.
- - If not applicable, record `N/A` with a concrete reason.
-
- ## Common properties
- - Round-trip: `decode(encode(x)) == x`
- - Idempotency: repeated execution does not change the result
- - Monotonicity/conservation/set invariance
- - Generated invalid or unauthorized inputs always fail with an expected result/error class
- - Generated state transitions always end in an allowed business state
- - Under generated mocked service states, the business logic chain preserves fallback/retry/compensation rules
-
- ## Design guidance
- - Properties must be machine-verifiable, whether they are invariants, allow-lists, rejection rules, or business-output predicates.
- - Generator strategy should include normal cases, boundaries, extremes, malformed inputs, and suspicious/adversarial combinations.
- - Prefer modeling the business rule directly: generate inputs, run the logic, then assert output/error/state transition matches the rule.
- - When the behavior is stateful, prefer state-machine or sequence-based properties over isolated single-call generators.
- - When exact outputs are hard to predict, use metamorphic properties (for example reordering, retrying, deduplicating, or replaying inputs should preserve an allowed relation).
- - For external-service-dependent logic, mock/fake the service and generate multiple service states (success, timeout, empty, partial, stale, inconsistent, duplicate, rejected).
- - Control execution cost while preserving reproducibility, and preserve failing seeds/examples for regression coverage.
-
- ## Recording rules
- - If specs are used, record cases and outcomes in `checklist.md`.
- - If specs are not used, record cases, external-state coverage, adversarial coverage, or `N/A` reasons in the response.
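As a rough illustration of the round-trip and idempotency properties listed in the removed guide, the sketch below uses only the standard library with a seeded generator. A dedicated library such as Hypothesis would normally add shrinking and richer generators. The `encode`, `decode`, and `normalize` functions and the chosen value pools are assumptions made for this example.

```python
import json
import random


def encode(record):
    """Hypothetical serializer under test."""
    return json.dumps(record, sort_keys=True)


def decode(text):
    """Hypothetical deserializer under test."""
    return json.loads(text)


def normalize(tags):
    """Hypothetical idempotent operation: de-duplicate and sort tags."""
    return sorted(set(tags))


def random_record(rng):
    """Generate a small record that mixes normal values with boundaries."""
    keys = ["", "a", "key with space", "ünïcode"]
    values = [0, -1, 2**31, "", "text", None, True, [], [1, 2]]
    return {rng.choice(keys): rng.choice(values) for _ in range(rng.randint(0, 4))}


def check_properties(seed, runs=200):
    rng = random.Random(seed)  # a fixed seed keeps any failure reproducible
    for _ in range(runs):
        # Round-trip: decode(encode(x)) == x
        record = random_record(rng)
        assert decode(encode(record)) == record, (seed, record)

        # Idempotency: applying normalize twice equals applying it once.
        tags = [rng.choice("abc") for _ in range(rng.randint(0, 6))]
        once = normalize(tags)
        assert normalize(once) == once, (seed, tags)


check_properties(seed=12345)
```

Including the seed and the generated input in the assertion message makes it straightforward to keep a failing case as a regression example.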
@@ -1,29 +0,0 @@
- # Unit Testing Guide
-
- ## Purpose
- - Verify correctness of a single function/module and localize failures quickly.
- - Cover both success and failure paths for the smallest changed behavior unit.
-
- ## Required when
- - Any non-trivial logic change should add or update unit tests.
- - Unit test evaluation is required even when specs are not used.
-
- ## Coverage focus
- - Core logic branches and boundary values.
- - Error handling, validation failures, and incompatible states.
- - Function paths with highest regression risk.
-
- ## Design guidance
- - Isolate external dependencies (mock/stub/fake).
- - Keep tests small and focused: one behavior per test.
- - Do not stop at happy-path assertions; verify exact errors, rejected states, and intentional lack of side effects when the unit should block an action.
- - Where the input space is small and discrete, exhaustively enumerate business inputs and expected outputs.
- - Prefer table-driven cases when many small business permutations share the same oracle.
- - Add regression tests for bug-prone or high-risk logic so previously broken behavior cannot silently return.
- - If the unit owns authorization, invalid transition, idempotency, or concurrency decisions, test those denials explicitly.
- - Keep tests reproducible and fast.
- - Avoid assertion-light smoke tests and snapshot-only coverage unless the snapshot has a strict business oracle behind it.
-
- ## Recording rules
- - If specs are used, record mapped test cases and results in `checklist.md`.
- - If specs are not used, list test IDs and results in the response.
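The guidance on explicit denials and "intentional lack of side effects" might be exercised as in the following sketch. The `close_account` function, `PermissionDenied`, and `RecordingMailer` are hypothetical and exist only to show the shape of such a test.

```python
import unittest


class PermissionDenied(Exception):
    """Hypothetical error raised when the caller may not perform the action."""


class RecordingMailer:
    """Fake collaborator that records calls instead of sending mail."""

    def __init__(self):
        self.sent = []

    def send(self, to, subject):
        self.sent.append((to, subject))


def close_account(actor_role, account, mailer):
    """Hypothetical unit under test: only admins may close an open account."""
    if actor_role != "admin":
        raise PermissionDenied("only admins may close accounts")
    if account["status"] == "closed":
        return account  # idempotent: closing twice changes nothing
    account["status"] = "closed"
    mailer.send(account["owner"], "Account closed")
    return account


class CloseAccountTest(unittest.TestCase):
    def setUp(self):
        self.account = {"owner": "a@example.com", "status": "open"}
        self.mailer = RecordingMailer()

    def test_non_admin_is_denied_without_side_effects(self):
        with self.assertRaises(PermissionDenied):
            close_account("viewer", self.account, self.mailer)
        # The denial must leave state and notifications untouched.
        self.assertEqual(self.account["status"], "open")
        self.assertEqual(self.mailer.sent, [])

    def test_closing_twice_notifies_once(self):
        close_account("admin", self.account, self.mailer)
        close_account("admin", self.account, self.mailer)
        self.assertEqual(self.account["status"], "closed")
        self.assertEqual(len(self.mailer.sent), 1)
```

Using a recording fake instead of a real mailer keeps the test fast and lets it assert the absence of a call, which a status-only check would miss.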