@laitszkin/apollo-toolkit 3.3.5 → 3.4.0
- package/AGENTS.md +1 -0
- package/CHANGELOG.md +8 -0
- package/README.md +1 -0
- package/analyse-app-logs/scripts/__pycache__/filter_logs_by_time.cpython-312.pyc +0 -0
- package/analyse-app-logs/scripts/__pycache__/log_cli_utils.cpython-312.pyc +0 -0
- package/analyse-app-logs/scripts/__pycache__/search_logs.cpython-312.pyc +0 -0
- package/develop-new-features/README.md +9 -19
- package/develop-new-features/SKILL.md +14 -24
- package/develop-new-features/agents/openai.yaml +1 -1
- package/docs-to-voice/scripts/__pycache__/docs_to_voice.cpython-312.pyc +0 -0
- package/enhance-existing-features/README.md +9 -21
- package/enhance-existing-features/SKILL.md +16 -27
- package/enhance-existing-features/agents/openai.yaml +1 -1
- package/generate-spec/README.md +4 -3
- package/generate-spec/SKILL.md +14 -5
- package/generate-spec/agents/openai.yaml +1 -1
- package/generate-spec/references/templates/checklist.md +5 -0
- package/generate-spec/references/templates/tasks.md +38 -9
- package/generate-spec/scripts/__pycache__/create-specs.cpython-312.pyc +0 -0
- package/katex/scripts/__pycache__/render_katex.cpython-312.pyc +0 -0
- package/open-github-issue/scripts/__pycache__/open_github_issue.cpython-312.pyc +0 -0
- package/package.json +1 -1
- package/read-github-issue/scripts/__pycache__/find_issues.cpython-312.pyc +0 -0
- package/read-github-issue/scripts/__pycache__/read_issue.cpython-312.pyc +0 -0
- package/resolve-review-comments/scripts/__pycache__/review_threads.cpython-312.pyc +0 -0
- package/test-case-strategy/LICENSE +21 -0
- package/test-case-strategy/README.md +27 -0
- package/test-case-strategy/SKILL.md +110 -0
- package/test-case-strategy/agents/openai.yaml +4 -0
- package/test-case-strategy/references/e2e-tests.md +31 -0
- package/test-case-strategy/references/integration-tests.md +32 -0
- package/test-case-strategy/references/property-based-tests.md +43 -0
- package/test-case-strategy/references/unit-tests.md +59 -0
- package/text-to-short-video/scripts/__pycache__/enforce_video_aspect_ratio.cpython-312.pyc +0 -0
- package/develop-new-features/references/testing-e2e.md +0 -36
- package/develop-new-features/references/testing-integration.md +0 -42
- package/develop-new-features/references/testing-property-based.md +0 -44
- package/develop-new-features/references/testing-unit.md +0 -37
- package/enhance-existing-features/references/e2e-tests.md +0 -26
- package/enhance-existing-features/references/integration-tests.md +0 -30
- package/enhance-existing-features/references/property-based-tests.md +0 -33
- package/enhance-existing-features/references/unit-tests.md +0 -29
@@ -0,0 +1,27 @@
+# test-case-strategy
+
+Shared testing strategy skill for choosing risk-driven test cases across spec generation, new-feature implementation, and brownfield feature changes.
+
+## Core capabilities
+
+- Selects the smallest useful test level for each risk: unit, regression, property-based, integration, E2E, adversarial, or mock/fake scenario coverage.
+- Defines meaningful oracles before implementation so tests verify requirements instead of echoing newly written code.
+- Adds focused unit drift checks for atomic implementation tasks.
+- Records concrete test IDs, target units or flows, fixture strategy, verification hooks, and `N/A` reasons.
+- Provides reusable references for unit, property-based, integration, and E2E test decisions.
+
+## Repository structure
+
+- `SKILL.md`: Shared test selection workflow and output contract.
+- `agents/openai.yaml`: Agent interface metadata and default prompt.
+- `references/`: Focused guides for unit drift checks, property-based tests, integration tests, and E2E tests.
+
+## Typical usage
+
+```text
+Use $test-case-strategy to choose the right tests for this change and define the unit drift checks before implementation.
+```
+
+## License
+
+MIT. See `LICENSE`.
@@ -0,0 +1,110 @@
+---
+name: test-case-strategy
+description: Select and design risk-driven test cases for agent implementation work. Use when specs, new-feature work, brownfield feature changes, refactors, or bug fixes need a concrete decision about unit, regression, property-based, integration, E2E, adversarial, mock/fake, or drift-check coverage.
+---
+
+# Test Case Strategy
+
+## Dependencies
+
+- Required: none.
+- Conditional: none.
+- Optional: none.
+- Fallback: not applicable.
+
+## Standards
+
+- Evidence: Base every test decision on changed behavior, requirement IDs, risk class, dependency shape, and existing coverage; do not add tests only because a template lists them.
+- Execution: Choose the smallest test level that can prove the risk, then escalate to broader tests only when lower-level tests cannot observe the behavior or contract.
+- Quality: Each test must have a meaningful oracle derived from the requirement, design, or contract rather than from the implementation just written.
+- Output: Return concrete test case IDs, test level, target unit or flow, oracle, fixture/mock strategy, command or verification hook, and any `N/A` reason.
+
+## Goal
+
+Provide one shared testing decision workflow for spec generation and implementation skills so agents select useful tests consistently and use fast focused checks to detect implementation drift.
+
+## Workflow
+
+### 1) Build the risk inventory
+
+- Identify changed behaviors, requirement IDs, and affected modules.
+- Classify risks before choosing test types:
+  - boundary
+  - regression
+  - authorization or permission denial
+  - invalid transition
+  - idempotency, replay, or duplicate submission
+  - concurrency or race
+  - data integrity
+  - external failure or inconsistent dependency state
+  - partial write, rollback, or compensation
+  - adversarial abuse
+- Reuse existing coverage only after naming the exact suite, test case, and risk it already proves.
+
+### 2) Choose the narrowest valid test level
+
+- Use unit tests for isolated changed logic, boundaries, denials, exact errors, no-side-effect expectations, and fast implementation drift checks.
+- Use regression tests when a bug-prone or historically fragile behavior must not silently return.
+- Use property-based tests for business rules with describable invariants, generated input spaces, valid/invalid state transitions, or external-state matrices that can be mocked.
+- Use integration tests for cross-module chains, repository/service/API/event interactions, configuration wiring, persistence, IO, and controlled external-service scenarios.
+- Use E2E tests only for critical user-visible paths whose risk is not sufficiently proven by lower-level tests; keep them minimal and stable.
+- Use adversarial tests for malformed input, forged identities, invalid transitions, replay, stale/out-of-order events, toxic payload sizes, and risky edge combinations.
+- If E2E is too costly or unstable, replace it with integration coverage for the same risk and record the replacement.
+
+### 3) Define the oracle before implementation
+
+- Derive expected behavior from `spec.md`, `design.md`, `contract.md`, official documentation, or existing intended behavior.
+- Never derive the oracle from the new implementation after writing it.
+- Prefer exact assertions:
+  - exact output
+  - exact error class or denial reason
+  - persisted state
+  - emitted event
+  - retry or compensation accounting
+  - intentional absence of writes, notifications, or side effects
+  - allowed state transition or rejection
+- Avoid assertion-light smoke tests, snapshot-only tests, and tests that only prove "does not throw" unless that is the real requirement.
+
+### 4) Add unit drift checks for atomic implementation tasks
+
+- For each non-trivial atomic task, decide whether a focused unit drift check is possible.
+- A unit drift check should name:
+  - target unit: function, method, module, policy, parser, mapper, validator, or state transition owner
+  - input state: minimal fixture, table row, fake dependency state, or boundary value
+  - oracle: exact output, error, state change, or no-side-effect assertion
+  - command: focused test command or existing test filter to run immediately after the task
+- If no unit drift check is possible, record the smallest replacement check and a concrete reason.
+- Do not allow broad integration or E2E tests to hide missing unit checks for locally owned business logic.
+
+### 5) Record the decision
+
+Use this compact record in planning docs or implementation summaries:
+
+```text
+Test ID: UT-01 / REG-01 / PBT-01 / IT-01 / E2E-01 / ADV-01
+Requirement/Risk: R?.? / boundary | regression | ...
+Target: function/module/flow
+Fixture or mock strategy: ...
+Oracle: exact output/error/state/no-side-effect
+Verification hook: command or existing suite
+N/A reason: only when this test level is not suitable
+```
+
+## Working Rules
+
+- Start from risk and oracle, not from a desired test count.
+- Prefer fast focused tests for implementation drift detection.
+- Keep one test focused on one behavior or one failure mode.
+- Use table-driven unit tests when many discrete business cases share one oracle.
+- Use mocks/fakes for external services in business logic chains unless the real external contract itself is under test.
+- Keep fixtures reproducible with fixed clocks, seeds, and controlled dependency states.
+- Preserve failing generated examples or seeds as regression coverage.
+- Record `N/A` only with a concrete reason tied to scope, risk, or observability.
+- When a spec exists, map test IDs back to `tasks.md` and `checklist.md`.
+
+## References
+
+- `references/unit-tests.md`: unit test and drift-check design.
+- `references/property-based-tests.md`: property-based test selection and oracle design.
+- `references/integration-tests.md`: integration test selection and external-state scenarios.
+- `references/e2e-tests.md`: E2E decision and replacement rules.
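The compact record from step 5 of the workflow above could be held as a small data structure so that missing oracles or vague `N/A` entries are caught mechanically. This is only an illustrative sketch, not part of the package: `DecisionRecord`, its field names, and the `parse_quantity` target are hypothetical, and the ID pattern is inferred from the `UT-01 / REG-01 / ...` examples in the template.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# ID shape inferred from the record template (UT-01, REG-01, PBT-01, ...).
TEST_ID = re.compile(r"^(UT|REG|PBT|IT|E2E|ADV)-\d+$")


@dataclass(frozen=True)
class DecisionRecord:
    """Hypothetical in-memory form of one test decision record."""
    test_id: str
    requirement_or_risk: str
    target: str
    fixture_strategy: str = ""
    oracle: str = ""
    verification_hook: str = ""
    na_reason: Optional[str] = None

    def problems(self) -> List[str]:
        """Return a list of rule violations; an empty list means the record is usable."""
        found = []
        if not TEST_ID.match(self.test_id):
            found.append("test ID must look like UT-01, REG-01, PBT-01, IT-01, E2E-01, or ADV-01")
        if self.na_reason is not None:
            # N/A is allowed only with a concrete reason.
            if self.na_reason.strip().lower() in ("", "n/a", "none"):
                found.append("N/A needs a concrete reason")
            return found
        # A test that will actually run needs an oracle and a way to run it.
        for name in ("oracle", "verification_hook"):
            if not getattr(self, name).strip():
                found.append(f"{name} is required")
        return found


record = DecisionRecord(
    test_id="UT-01",
    requirement_or_risk="R1.2 / boundary",
    target="parse_quantity",  # hypothetical function name
    fixture_strategy="table row with quantity 0",
    oracle="raises ValueError",
    verification_hook="pytest -k parse_quantity",
)
print(record.problems())
```

Returning a list of problems instead of raising on the first one lets a planning document report every gap in a record at once.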
@@ -0,0 +1,4 @@
+interface:
+  display_name: "Test Case Strategy"
+  short_description: "Select risk-driven tests and unit drift checks"
+  default_prompt: "Use $test-case-strategy to choose the smallest useful test level for each changed behavior, starting from requirement IDs, risk class, dependency shape, and existing coverage. Define oracles before implementation, prefer focused unit drift checks for atomic tasks, escalate to property-based, integration, E2E, adversarial, or mock/fake scenario coverage only when the risk requires it, and return concrete test IDs, target unit or flow, fixture strategy, verification hook, and any N/A reason."
@@ -0,0 +1,31 @@
+# E2E Tests
+
+## Purpose
+
+- Verify critical user-visible paths at the closest practical level to real usage.
+- Catch cross-layer behavior gaps that lower-level tests cannot prove.
+
+## Required when
+
+- A change affects a high-impact user-visible flow.
+- The flow is multi-step, cross-system, historically fragile, revenue-critical, permission-sensitive, or hard to reason about from lower-level tests alone.
+- The environment and test data can be kept stable enough for maintainable coverage.
+
+## Not suitable when
+
+- Unit and integration tests already prove the actual risk.
+- The environment is unstable or external dependencies would make the test flaky.
+- The cost is disproportionate and integration tests can cover the same risk.
+
+## Design rules
+
+- Keep E2E minimal: one critical success path plus one highest-value denial or failure path when warranted.
+- Assert business-visible outcomes, not just DOM presence or status-code success.
+- Use controlled test data and avoid brittle external dependencies.
+- Prefer replacing expensive E2E with stronger integration tests over adding flaky coverage.
+- Record the decision per flow or risk slice; do not collapse unrelated paths into one global E2E decision.
+
+## Recording
+
+- Record `E2E-xx`, target flow, business-visible oracle, setup, command, and result.
+- If skipped, record the replacement `IT-xx` cases and concrete rationale.
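The per-flow recording rule in the E2E guide above (one decision per flow, and a skipped E2E must name its replacement `IT-xx` cases and a rationale) can be checked with a few lines of code. The sketch below is an assumption about how such a check might look; the function name, the dictionary layout, and the example flows are all hypothetical.

```python
import re
from typing import Dict, List

IT_ID = re.compile(r"^IT-\d+$")


def check_e2e_decisions(decisions: Dict[str, dict]) -> List[str]:
    """Validate one E2E decision per flow; return human-readable problems."""
    problems = []
    for flow, decision in decisions.items():
        if decision.get("run_e2e"):
            continue  # flow is covered by an E2E case, nothing to replace
        replacements = decision.get("replacements", [])
        if not replacements or not all(IT_ID.match(r) for r in replacements):
            problems.append(f"{flow}: skipped E2E needs replacement IT-xx cases")
        if not decision.get("rationale", "").strip():
            problems.append(f"{flow}: skipped E2E needs a concrete rationale")
    return problems


# Hypothetical flows: each one carries its own decision.
decisions = {
    "checkout": {"run_e2e": True},
    "refund": {
        "run_e2e": False,
        "replacements": ["IT-03", "IT-04"],
        "rationale": "sandbox payment provider is too unstable for stable E2E",
    },
    "export": {"run_e2e": False},
}
print(check_e2e_decisions(decisions))
```

Keying the decisions by flow makes it hard to collapse unrelated paths into one global answer, which is the failure mode the guide warns about.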
@@ -0,0 +1,32 @@
+# Integration Tests
+
+## Purpose
+
+- Verify collaboration across modules, layers, repositories, handlers, services, event flows, persistence, configuration, or controlled external-service scenarios.
+- Cover risks unit tests cannot observe: sequence, wiring, persistence, IO, config, retry, fallback, and cross-boundary side effects.
+
+## Required when
+
+- The changed behavior depends on service/repository/API/event/module collaboration.
+- The correctness question is about a user-critical business chain rather than one isolated unit.
+- External dependency states affect business behavior and can be mocked or faked.
+- E2E is not suitable but equivalent cross-layer risk still needs coverage.
+
+## Not suitable when
+
+- A pure unit owns the behavior and unit tests can fully observe the risk.
+- A stable E2E test is required and feasible for a critical user-visible path.
+
+## Design rules
+
+- Keep dependencies inside the application boundary near-real when practical.
+- Mock or fake external services unless the real service contract itself is under test.
+- Build scenario matrices for success, timeout, retry exhaustion, partial data, stale data, duplicate callback, inconsistent response, and permission failure.
+- Assert business outcomes across boundaries: persisted state, emitted event, deduplication, retry accounting, audit trail, compensation, or no partial write.
+- Include adversarial paths when invalid transition, replay, double-submit, forged identifier, or out-of-order event risks exist.
+- Keep data reconstructable and cleanup reliable.
+
+## Recording
+
+- Record `IT-xx`, modules involved, external dependency strategy, scenario matrix, oracle, requirement mapping, and command.
+- If integration replaces E2E, map the replacement explicitly.
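To make the scenario-matrix rule in the integration guide above concrete, here is a minimal sketch that drives a service through success, retry, retry exhaustion, and duplicate submission against a fake external gateway. Everything in it (`FakeGateway`, `OrderService`, the order ID, the retry limit of 3) is a hypothetical stand-in chosen for illustration, not code from the package.

```python
from dataclasses import dataclass, field
from typing import Dict, List


class GatewayTimeout(Exception):
    """Raised by the fake gateway to simulate an external timeout."""


class FakeGateway:
    """Fake external service that replays a scripted list of outcomes."""

    def __init__(self, outcomes):
        self._outcomes = list(outcomes)
        self.calls = 0

    def charge(self, order_id, amount):
        self.calls += 1
        if self._outcomes.pop(0) == "timeout":
            raise GatewayTimeout(order_id)


@dataclass
class OrderService:
    gateway: FakeGateway
    max_attempts: int = 3
    orders: Dict[str, str] = field(default_factory=dict)  # persisted state
    events: List[str] = field(default_factory=list)       # emitted events

    def pay(self, order_id, amount):
        if self.orders.get(order_id) == "paid":
            return "duplicate"  # deduplication: never charge twice
        for _ in range(self.max_attempts):
            try:
                self.gateway.charge(order_id, amount)
            except GatewayTimeout:
                continue  # retry until attempts run out
            self.orders[order_id] = "paid"
            self.events.append(f"paid:{order_id}")
            return "paid"
        return "failed"  # retry exhaustion: no partial write, no event


# (scenario, gateway outcomes, expected result, expected calls, expected state)
SCENARIOS = [
    ("success", ["ok"], "paid", 1, {"o1": "paid"}),
    ("retry then success", ["timeout", "ok"], "paid", 2, {"o1": "paid"}),
    ("retry exhaustion", ["timeout"] * 3, "failed", 3, {}),
]


def run_matrix():
    """Assert business outcomes, not just return values, for each scenario."""
    for name, outcomes, result, calls, state in SCENARIOS:
        gateway = FakeGateway(outcomes)
        service = OrderService(gateway)
        assert service.pay("o1", 100) == result, name
        assert gateway.calls == calls, name          # retry accounting
        assert service.orders == state, name         # persisted state
        expected_events = ["paid:o1"] if result == "paid" else []
        assert service.events == expected_events, name
    return len(SCENARIOS)


def run_duplicate_submission():
    """Adversarial path: a double submit must not charge or emit twice."""
    gateway = FakeGateway(["ok"])
    service = OrderService(gateway)
    first, second = service.pay("o1", 100), service.pay("o1", 100)
    return first, second, gateway.calls, service.events
```

Each row asserts persisted state, emitted events, and call counts together, so a scenario cannot pass on a correct return value alone.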
@@ -0,0 +1,43 @@
+# Property-Based Tests
+
+## Purpose
+
+- Verify invariants across broad or generated input spaces.
+- Validate business rules as allowed outputs, forbidden outputs, valid transitions, rejection rules, or safety constraints.
+- Catch combinational, adversarial, and boundary behaviors that fixed examples miss.
+
+## Required when
+
+- Logic has describable invariants: calculation, transformation, sorting, aggregation, serialization, normalization, deduplication, routing, ranking, or state transition.
+- Business rules can be expressed as predicates, allow-lists, forbidden states, or metamorphic relationships.
+- External-service-dependent logic can use mocks/fakes to generate service states.
+
+## Not suitable when
+
+- The only meaningful risk is a real external integration contract.
+- The input space is tiny and better covered by exhaustive unit tests.
+- The behavior is UI-only or lacks a stable machine-verifiable oracle.
+
+## Design rules
+
+- State the property in one sentence before writing generators.
+- Generate normal cases, boundaries, extremes, malformed inputs, and suspicious/adversarial combinations.
+- Prefer direct business-rule predicates over vague structural checks.
+- Use state-machine or sequence properties for stateful flows.
+- Use metamorphic properties when exact outputs are hard to predict.
+- Preserve failing seeds/examples as regression tests.
+- Control sample count and input size so the suite remains practical.
+
+## Common properties
+
+- Round trip: `decode(encode(x)) == x`.
+- Sorting is monotonic and preserves the input multiset.
+- Merge/split operations preserve total count or value.
+- Replaying the same command remains idempotent.
+- Generated invalid or unauthorized inputs always fail with an expected result class.
+- Generated transitions always end in allowed states or explicit rejections.
+- Under generated mocked service states, fallback, retry, and compensation rules still hold.
+
+## Recording
+
+- Record `PBT-xx`, property, generator strategy, oracle, requirement mapping, and replay instructions for failures.
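Two of the common properties listed in the property-based guide above, round trip and multiset-preserving sort, can be sketched with nothing but the standard library. This is a simplified stand-in for a real property-based testing library (which would also shrink failing inputs); the generator, the seed, and the sample count are assumptions chosen for illustration, with `json` playing the role of `encode`/`decode`.

```python
import json
import random
from collections import Counter


def random_record(rng):
    """Generate a small JSON-compatible record, mixing boundary and normal values."""
    return {
        "id": rng.choice([0, 1, -1, 2**31 - 1, rng.randint(-1000, 1000)]),
        "name": rng.choice(["", "a", "caf\u00e9", "x" * 50]),
        "tags": [rng.choice(["", "a", "b"]) for _ in range(rng.randint(0, 4))],
    }


def check_properties(seed, samples=200):
    """Run both properties; a fixed seed makes any failure replayable."""
    rng = random.Random(seed)
    for i in range(samples):
        # Property 1 (round trip): decode(encode(x)) == x.
        record = random_record(rng)
        assert json.loads(json.dumps(record)) == record, (seed, i, record)

        # Property 2: sorting is monotonic and preserves the input multiset.
        values = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        result = sorted(values)
        assert all(a <= b for a, b in zip(result, result[1:])), (seed, i, values)
        assert Counter(result) == Counter(values), (seed, i, values)
    return samples


print(check_properties(seed=1234))
```

The assertion messages carry the seed, the sample index, and the failing input, which is the information needed to preserve a failure as a fixed regression example.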
@@ -0,0 +1,59 @@
+# Unit Tests And Drift Checks
+
+## Purpose
+
+- Verify the smallest changed behavior unit: function, method, policy, parser, mapper, validator, state transition owner, or pure logic module.
+- Localize failures quickly while the agent is still implementing a single task.
+- Detect implementation drift by comparing the changed unit against an oracle defined before the code change.
+
+## Required when
+
+- A task changes non-trivial local logic.
+- A requirement has boundary, denial, validation, state transition, idempotency, authorization, error, or no-side-effect behavior.
+- The input space is small and discrete enough to enumerate expected outputs.
+- A bug-prone behavior needs a fixed regression example.
+- The task can be verified without DB, RPC, filesystem, browser, queue, or multi-module orchestration.
+
+## Not suitable when
+
+- Correctness depends on real cross-module collaboration, persistence, IO, or configuration wiring.
+- The main risk is the external integration contract itself.
+- The behavior is only observable through a user-visible end-to-end path.
+
+## Drift-check record
+
+```text
+Unit drift check:
+Target unit: [function/module/policy]
+Requirement: [R?.?]
+Fixture/input: [minimal state or table row]
+Oracle: [exact output/error/state/no-side-effect]
+Test case ID: [UT-xx or REG-xx]
+Run after task: [focused command or test filter]
+N/A reason: [only if no unit check can observe the task]
+```
+
+## Design rules
+
+- Define the oracle from the spec, design, contract, official docs, or established intended behavior before implementation.
+- Keep one test focused on one behavior or failure mode.
+- Use table-driven tests for small business matrices.
+- Cover both accepted and rejected states when the unit owns a decision.
+- Assert exact errors, result classes, state changes, and intentional lack of side effects.
+- Mock, stub, or fake external dependencies; unit tests should not need DB/RPC/file/browser IO.
+- Control time, randomness, environment variables, and global state.
+- Do not accept tests that only assert "does not throw", "returns truthy", or snapshot shape unless those are the real business oracle.
+
+## Useful unit cases
+
+- Boundary value is accepted at the limit and rejected outside it.
+- Invalid input returns the specified error and performs no write.
+- Unauthorized actor is denied with the expected reason.
+- Repeating an idempotent call returns the same outcome without duplicate side effects.
+- State transition from `A` to `B` is allowed, while `A` to `C` is rejected.
+- Mapper/parser preserves required fields and rejects malformed variants.
+
+## Recording
+
+- In specs, map each `UT-xx` or `REG-xx` to requirement IDs and checklist items.
+- In direct implementation, report test IDs, command, result, and any `N/A` reason.
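A table-driven drift check for the "state transition from `A` to `B` is allowed, while `A` to `C` is rejected" case in the unit guide above might look like the following sketch. The `Document` class, its states, the `TransitionDenied` error, and the `UT-xx` IDs are hypothetical examples; the point is the shape of the assertions: exact error, unchanged state, and no side effect on rejection.

```python
import unittest


class TransitionDenied(Exception):
    """Exact error class the oracle expects for a rejected transition."""


# Allowed transitions, written down from the (hypothetical) requirement.
ALLOWED = {("draft", "submitted"), ("submitted", "approved")}


class Document:
    def __init__(self, state="draft"):
        self.state = state
        self.audit_log = []  # side effect we expect to stay empty on denial

    def move_to(self, new_state):
        if (self.state, new_state) not in ALLOWED:
            raise TransitionDenied(f"{self.state} -> {new_state}")
        self.audit_log.append((self.state, new_state))
        self.state = new_state


class TransitionTableTest(unittest.TestCase):
    # (test id, start state, target state, allowed?)
    CASES = [
        ("UT-01", "draft", "submitted", True),
        ("UT-02", "draft", "approved", False),
        ("UT-03", "approved", "draft", False),
    ]

    def test_transitions(self):
        for test_id, start, target, allowed in self.CASES:
            with self.subTest(test_id=test_id):
                doc = Document(start)
                if allowed:
                    doc.move_to(target)
                    self.assertEqual(doc.state, target)
                    self.assertEqual(doc.audit_log, [(start, target)])
                else:
                    with self.assertRaises(TransitionDenied) as ctx:
                        doc.move_to(target)
                    # Exact denial reason, unchanged state, no side effect.
                    self.assertEqual(str(ctx.exception), f"{start} -> {target}")
                    self.assertEqual(doc.state, start)
                    self.assertEqual(doc.audit_log, [])


def run_suite():
    """Run only this focused check, as one would right after the task."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TransitionTableTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Using `subTest` keeps each table row reported under its own test ID, so a failure still points at a single behavior.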
Binary file
@@ -1,36 +0,0 @@
-# E2E Testing Principles
-
-## Core rules
-- E2E is not decided solely by explicit user request.
-- The agent must decide E2E based on feature importance, complexity, and cross-layer risk.
-- For high-risk key user paths, create the smallest necessary E2E coverage first.
-- If E2E is unstable, too costly, or environment-limited, add integration coverage for equivalent risk and record the alternative.
-
-## Purpose
-- Verify critical end-to-end user paths are usable.
-- Catch behavior gaps after cross-system/cross-layer integration.
-- Provide confidence close to real usage for high-risk scenarios.
-
-## Decision criteria
-- Importance: core feature, critical revenue flow, or high-impact process.
-- Complexity: multi-step state transitions, branching flows, cross-service collaboration.
-- Risk: historical regressions, fragile integrations, major user-visible failures.
-- Maintainability: stable environment and controllable test data.
-
-## Not suitable when
-- Feature risk is low and unit/integration tests already cover it sufficiently.
-- E2E is unstable and disproportionately expensive while integration tests can cover key risk.
-
-## Design guidance
-- Cover only the most critical paths; avoid expanding into full UI test suites.
-- Keep test data controllable (fixed seeds or recyclable fixtures).
-- Prioritize stability; avoid brittle external dependencies, use controlled substitutes if needed.
-- Prefer one critical success path and one highest-value denial/failure path over many shallow happy-path journeys.
-- Assert business-visible outcomes, not just DOM presence: final state, permission denial, user-facing error, persisted result, or prevented duplicate action.
-- Keep cost decisions explicit: document why E2E is done or not done and what alternative strategy is used.
-
-## Spec/checklist authoring hints
-- Mark high-risk key paths in `spec.md` requirement descriptions.
-- Record E2E decisions, mapped test cases, and results in `checklist.md`.
-- When different flows need different strategies, create multiple decision records instead of forcing one shared E2E decision for the whole feature.
-- If skipping E2E, specify replacement integration test cases (`IT-xx`) and rationale in `checklist.md`.
@@ -1,42 +0,0 @@
-# Integration Testing Principles
-
-## Purpose
-- Verify collaboration across modules/layers and external dependencies.
-- Cover integration risks unit tests cannot capture (sequence, config, IO failure).
-- Validate user-critical business logic chains under realistic component interaction and controlled external-service scenarios.
-
-## When to use
-- Interface interactions between modules (for example service ↔ repository).
-- Changes touching IO dependencies such as DB, RPC, files, cache, queues.
-- Behaviors that depend on configuration combinations or environment differences.
-- The correctness question is about the whole business logic chain rather than one isolated function.
-- As minimum safety replacement when E2E is not suitable.
-
-## Not suitable when
-- Single pure-function or pure-logic behavior (use unit tests).
-- Full end-to-end user flow can be stably covered by E2E.
-
-## Relationship with E2E
-- If change importance/complexity is high and E2E is feasible, prefer minimal E2E for key paths.
-- If E2E is hard or too costly, integration tests must cover equivalent key risks.
-- Record replacement mapping in `checklist.md` (E2E-xx ↔ IT-xx) with rationale.
-
-## Design guidance
-- Focus on high-value integration points; each test should justify risk/value.
-- Keep dependencies inside the application boundary near-real where practical.
-- Mock/fake external services at the business-chain boundary unless the real service contract itself is what needs verification.
-- Build scenario matrices for external states such as success, timeout, retries exhausted, partial data, stale data, duplicate callbacks, inconsistent responses, and permission failures.
-- Add adversarial/penetration-style cases for abuse paths such as invalid transitions, replay, double-submit, forged identifiers, or out-of-order events when those risks exist.
-- When workflows can partially commit, assert rollback/compensation/no-partial-write behavior instead of only final status codes.
-- Assert business outcomes across boundaries: persisted state, emitted events, deduplication, retry accounting, audit trail, or intentional absence of writes/notifications.
-- Add at least one regression-style integration test for the highest-risk chain whenever the change fixes a bug or touches a historically fragile path.
-- Keep reproducible: controlled test data and recoverable environment.
-- Keep cost controlled; avoid broad redundant coverage (leave that to unit tests).
-
-## Spec/checklist authoring hints
-- Dependency scope: list involved modules/external systems.
-- Scenario: describe cross-module flow or critical branch.
-- Risk: explain what integration failure, misconfiguration, or business-chain break this test can reveal.
-- External dependency strategy: specify which services are mocked/faked versus near-real and why.
-- Scenario matrix: list the external states or adversarial paths covered.
-- Map behavior, test IDs, and test outcomes in `checklist.md`.
@@ -1,44 +0,0 @@
-# Property-based Testing Principles
-
-## Purpose
-- Verify invariants/properties hold across broad input spaces.
-- Validate business rules by generating or exhaustively enumerating meaningful input spaces and checking outputs against expected business behavior.
-- Catch combinational, adversarial, and boundary behaviors that fixed examples often miss.
-
-## When to use
-- Algorithms, transformations, serialization/deserialization, sorting, aggregation.
-- Behaviors requiring consistency or reversibility (for example round-trip).
-- Data structures or state transitions with clear invariants.
-- Business logic where the rule can be stated as input/output expectations, allowed states, forbidden states, or safety constraints.
-- Logic chains that depend on external services, when those services can be replaced by controllable mocks/fakes and their states generated as part of the test space.
-
-## Not suitable when
-- The main thing being validated is the real integration contract with external systems or live IO (use integration tests).
-- UI/interactive flows without stable invariants.
-- Very small discrete input spaces (unit tests are sufficient).
-
-## Design guidance
-- Properties must be explicit and machine-verifiable, whether they are invariants, allowed outcome sets, rejection rules, or business-output predicates.
-- Generators should cover normal cases, boundaries, extremes, malformed inputs, and suspicious/adversarial combinations.
-- Prefer modeling business rules directly: generate inputs, run the logic, then assert the output/error/state transition matches the rule.
-- When the behavior is stateful, prefer state-machine or sequence-based properties over isolated single-call generators.
-- When exact outputs are hard to predict, use metamorphic properties (for example reordering, retrying, deduplicating, or replaying inputs should preserve an allowed relation).
-- For external-service-dependent logic, mock/fake the service and generate multiple service states (success, timeout, empty, partial, stale, inconsistent, duplicate, rejected).
-- Ensure reproducibility (fixed seed or replayable input generation) and preserve failing seeds/examples for regression coverage.
-- Complement unit tests; avoid duplicating fixed-case tests.
-- Control cost with reasonable sample counts and input-size limits.
-
-## Common property examples (description level)
-- `deserialize(serialize(x)) == x`
-- Sorted output is monotonic and preserves element multiset.
-- Merge/split operations preserve total element count.
-- Idempotency: repeating the same operation does not change results.
-- Invalid or unauthorized generated inputs always fail with an expected error/result class.
-- Generated order/payment/state-transition inputs always end in an allowed business state.
-- Under generated mocked service states, the business logic chain still satisfies fallback/retry/compensation rules.
-
-## Spec/checklist authoring hints
-- Property/rule: one sentence stating the rule that must always hold or the allowed outcomes that must contain the result.
-- Generator strategy: input range, distribution, emphasized boundaries, and any adversarial or external-state dimensions.
-- Oracle/check: describe how the test decides correctness (predicate, allow-list, reference model, or expected error class).
-- Purpose: explain correctness/risk reduction value of this property.
@@ -1,37 +0,0 @@
|
|
|
1
|
-
# Unit Testing Principles
|
|
2
|
-
|
|
3
|
-
## Purpose
|
|
4
|
-
- Verify correctness of the smallest testable unit (function, method, or pure logic module).
|
|
5
|
-
- Provide fast feedback with low-cost failure localization.
|
|
6
|
-
|
|
7
|
-
## When to use
|
|
8
|
-
- Core business logic and critical branches.
|
|
9
|
-
- Boundary conditions (upper/lower limits, null/empty, extreme values).
|
|
10
|
-
- Error handling and exception paths (invalid input, incompatible state, etc.).
|
|
11
|
-
|
|
12
|
-
## Not suitable when
|
|
13
|
-
- Behavior requires cross-module or external dependency verification (use integration tests).
|
|
14
|
-
- Full user-flow validation is required (evaluate E2E first; if not suitable, use integration tests to cover risk).
|
|
15
|
-
|
|
16
|
-
## Design guidance
|
|
17
|
-
- Isolate external dependencies with mock/stub/fake; avoid DB/RPC/file IO.
|
|
18
|
-
- Keep tests small and focused: one test, one behavior/failure mode.
|
|
19
|
-
- Do not stop at happy-path assertions; verify exact errors, rejected states, and intentional lack of side effects when the unit should block an action.
|
|
20
|
-
- Cover both success and failure branches.
|
|
21
|
-
- Where the input space is small and discrete, exhaustively enumerate business inputs and expected outputs.
|
|
22
|
-
- Prefer table-driven cases when many small business permutations share the same oracle.
|
|
23
|
-
- Add regression tests for bug-prone or high-risk logic so previously broken behavior cannot silently return.
|
|
24
|
-
- If the unit owns authorization, invalid transition, idempotency, or concurrency decisions, test those denials explicitly.
|
|
25
|
-
- Keep tests reproducible: avoid nondeterministic time/random/global state.
|
|
26
|
-
- Avoid assertion-light smoke tests and snapshot-only coverage unless the snapshot has a strict business oracle behind it.
|
|
27
|
-
- Map tests to requirements: each core requirement should have at least one unit test.
|
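The table-driven and exact-error bullets above can be illustrated with a short sketch. `shipping_fee` and its fee bands are hypothetical, chosen only because the input space is small and discrete.

```python
def shipping_fee(weight_kg, express):
    """Hypothetical unit under test: fee depends on a weight band and an express flag."""
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    base = 5 if weight_kg <= 1 else 9
    return base * 2 if express else base


# One row per business case: (weight_kg, express, expected fee).
FEE_CASES = [
    (0.5, False, 5),
    (1, False, 5),      # boundary: exactly 1 kg stays in the lower band
    (1.01, False, 9),
    (0.5, True, 10),
    (2, True, 18),
]


def test_fee_table():
    for weight, express, expected in FEE_CASES:
        assert shipping_fee(weight, express) == expected, (weight, express)


def test_rejects_non_positive_weight():
    for weight in (0, -1):
        try:
            shipping_fee(weight, express=False)
        except ValueError as exc:
            # verify the exact error, not just that something failed
            assert str(exc) == "weight must be positive"
        else:
            raise AssertionError("expected ValueError for weight=%r" % weight)
```

All rows share one oracle (the expected fee), so a new business permutation costs one added line rather than a new test function.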
|
28
|
-
|
|
29
|
-
## Spec/checklist authoring hints
|
|
30
|
-
- Scenario: describe input and initial state mapped to one requirement/boundary.
|
|
31
|
-
- Expected result: verifiable output, state change, or error.
|
|
32
|
-
- Purpose: explain which risk or bug type this test prevents.
|
|
33
|
-
|
|
34
|
-
## Common examples (description level)
|
|
35
|
-
- Return a specific error when input is out of allowed range.
|
|
36
|
-
- Handle empty list/empty string input with expected behavior.
|
|
37
|
-
- Ensure output matches its definition after a state or flag is switched.
|
|
@@ -1,26 +0,0 @@
|
|
|
1
|
-
# E2E Testing Guide
|
|
2
|
-
|
|
3
|
-
## Purpose
|
|
4
|
-
- Verify critical user-visible paths at end-to-end level.
|
|
5
|
-
- Increase confidence in real behavior after cross-layer integration.
|
|
6
|
-
|
|
7
|
-
## Required when
|
|
8
|
-
- If changes impact key user-visible flows, add or update E2E tests.
|
|
9
|
-
- E2E must still be evaluated even when specs are not used; if not applicable, record explicit rationale.
|
|
10
|
-
|
|
11
|
-
## E2E decision rules
|
|
12
|
-
- Prefer E2E for high-risk, high-impact, multi-step flow changes.
|
|
13
|
-
- Integration tests may replace E2E when E2E is too costly, unstable, or hard to maintain.
|
|
14
|
-
- When replacing E2E, provide equivalent risk coverage and record replacement cases plus reasons.
|
|
15
|
-
|
|
16
|
-
## Design guidance
|
|
17
|
-
- Focus on minimal critical path coverage; avoid over-expansion.
|
|
18
|
-
- Use stable test data and reproducible flows.
|
|
19
|
-
- Prioritize business outcomes over brittle UI details.
|
|
20
|
-
- Prefer one critical success path and one highest-value denial/failure path over many shallow happy-path journeys.
|
|
21
|
-
- Assert business-visible outcomes, not just DOM presence: final state, permission denial, user-facing error, persisted result, or prevented duplicate action.
|
|
22
|
-
|
|
23
|
-
## Recording rules
|
|
24
|
-
- Specs flow: record E2E or replacement strategy with outcomes in `checklist.md`.
|
|
25
|
-
- When different flows need different strategies, create multiple decision records in `checklist.md` so each flow/risk slice keeps its own rationale and linked case IDs.
|
|
26
|
-
- Non-specs flow: explain E2E execution or replacement testing with rationale in the response.
|
|
@@ -1,30 +0,0 @@
|
|
|
1
|
-
# Integration Testing Guide
|
|
2
|
-
|
|
3
|
-
## Purpose
|
|
4
|
-
- Verify correctness of cross-layer/cross-module collaboration.
|
|
5
|
-
- Focus especially on user-critical logic chains.
|
|
6
|
-
- Validate business outcomes across the full changed chain, not just connectivity.
|
|
7
|
-
|
|
8
|
-
## Required when
|
|
9
|
-
- Any change affecting services, repositories, API handlers, or event flows should be accompanied by new or updated integration tests.
|
|
10
|
-
- Integration tests for user-critical logic chains are required even when specs are not used.
|
|
11
|
-
|
|
12
|
-
## Coverage focus
|
|
13
|
-
- Key data flow from entrypoint to output.
|
|
14
|
-
- Cross-module contract and configuration interaction.
|
|
15
|
-
- Common failure patterns (timeout, data inconsistency, external dependency failure).
|
|
16
|
-
- External dependency state changes and fallback/compensation behavior.
|
|
17
|
-
- Adversarial/abuse paths such as invalid transitions, replay, duplication, forged identifiers, or out-of-order events when relevant.
|
|
18
|
-
|
|
19
|
-
## Design guidance
|
|
20
|
-
- Prefer near-real dependencies inside the application boundary; mock/fake external services unless the real service contract itself is under test.
|
|
21
|
-
- Build scenario matrices for external states such as success, timeout, retries exhausted, partial data, stale data, duplicate callbacks, inconsistent responses, and permission failures.
|
|
22
|
-
- Keep test data reconstructable and cleanable.
|
|
23
|
-
- When workflows can partially commit, assert rollback/compensation/no-partial-write behavior instead of only final status codes.
|
|
24
|
-
- Assert business outcomes across boundaries: persisted state, emitted events, deduplication, retry accounting, audit trail, or intentional absence of writes/notifications.
|
|
25
|
-
- Add at least one regression-style integration test for the highest-risk chain whenever the change fixes a bug or touches a historically fragile path.
|
|
26
|
-
- Each test case should map to an explainable risk.
|
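The scenario-matrix and no-partial-write bullets above could be sketched as follows. `FakeGateway`, `InMemoryStore`, and `place_order` are hypothetical stand-ins; a real integration test would use the project's own repository and handlers inside the application boundary.

```python
class FakeGateway:
    """Hypothetical fake payment gateway; `state` selects the simulated behaviour."""

    def __init__(self, state):
        self.state = state
        self.calls = 0

    def charge(self, order_id, amount):
        self.calls += 1
        if self.state == "timeout":
            raise TimeoutError("simulated timeout")
        if self.state == "permission_failure":
            raise PermissionError("simulated denial")
        return {"order_id": order_id, "status": "paid"}


class InMemoryStore:
    """Stand-in for a repository inside the application boundary."""

    def __init__(self):
        self.orders = {}
        self.events = []


def place_order(store, gateway, order_id, amount, max_attempts=2):
    """Hypothetical chain: reserve, charge with retry, then commit or compensate."""
    store.orders[order_id] = "reserved"
    for _ in range(max_attempts):
        try:
            gateway.charge(order_id, amount)
        except TimeoutError:
            continue  # retry on timeout only
        except PermissionError:
            break     # denial is final, no retry
        else:
            store.orders[order_id] = "paid"
            store.events.append(("order_paid", order_id))
            return True
    del store.orders[order_id]  # compensation: remove the reservation
    return False


# Scenario matrix: external state -> (expected result, expected gateway calls).
SCENARIOS = {
    "success": (True, 1),
    "timeout": (False, 2),             # retries exhausted
    "permission_failure": (False, 1),  # not retried
}


def run_matrix():
    results = {}
    for state, (expected_ok, expected_calls) in SCENARIOS.items():
        store, gateway = InMemoryStore(), FakeGateway(state)
        ok = place_order(store, gateway, "o-1", 25)
        assert ok is expected_ok and gateway.calls == expected_calls
        if ok:
            assert store.orders == {"o-1": "paid"}
            assert store.events == [("order_paid", "o-1")]
        else:
            # intentional absence of writes and events after a failure
            assert store.orders == {} and store.events == []
        results[state] = ok
    return results
```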
|
27
|
-
|
|
28
|
-
## Recording rules
|
|
29
|
-
- Specs flow: record IT cases and outcomes in `checklist.md`.
|
|
30
|
-
- Non-specs flow: list user-critical integration tests, mocked external scenarios, adversarial cases, and outcomes in the response.
|
|
@@ -1,33 +0,0 @@
|
|
|
1
|
-
# Property-based Testing Guide
|
|
2
|
-
|
|
3
|
-
## Purpose
|
|
4
|
-
- Verify invariants across large input combinations.
|
|
5
|
-
- Validate business rules by generating or exhaustively enumerating meaningful input spaces and checking outputs against expected business behavior.
|
|
6
|
-
- Catch combinatorial, adversarial, and boundary behavior that fixed examples often miss.
|
|
7
|
-
|
|
8
|
-
## Required when
|
|
9
|
-
- If changes include logic with describable invariants (calculation, transformation, sorting, aggregation, serialization), add/update property-based tests.
|
|
10
|
-
- If changes include business rules that can be expressed as allowed outputs, forbidden outputs, valid transitions, rejection rules, or safety constraints, add/update property-based tests.
|
|
11
|
-
- If logic depends on external services but the service can be replaced with a mock/fake to generate service states, property-based tests should cover those state combinations too.
|
|
12
|
-
- If not applicable, record `N/A` with a concrete reason.
|
|
13
|
-
|
|
14
|
-
## Common properties
|
|
15
|
-
- Round-trip: `decode(encode(x)) == x`
|
|
16
|
-
- Idempotency: repeated execution does not change the result
|
|
17
|
-
- Monotonicity/conservation/set invariance
|
|
18
|
-
- Generated invalid or unauthorized inputs always fail with an expected result/error class
|
|
19
|
-
- Generated state transitions always end in an allowed business state
|
|
20
|
-
- Under generated mocked service states, the business logic chain preserves fallback/retry/compensation rules
|
|
21
|
-
|
|
22
|
-
## Design guidance
|
|
23
|
-
- Properties must be machine-verifiable, whether they are invariants, allow-lists, rejection rules, or business-output predicates.
|
|
24
|
-
- Generator strategy should include normal cases, boundaries, extremes, malformed inputs, and suspicious/adversarial combinations.
|
|
25
|
-
- Prefer modeling the business rule directly: generate inputs, run the logic, then assert output/error/state transition matches the rule.
|
|
26
|
-
- When the behavior is stateful, prefer state-machine or sequence-based properties over isolated single-call generators.
|
|
27
|
-
- When exact outputs are hard to predict, use metamorphic properties (for example reordering, retrying, deduplicating, or replaying inputs should preserve an allowed relation).
|
|
28
|
-
- For external-service-dependent logic, mock/fake the service and generate multiple service states (success, timeout, empty, partial, stale, inconsistent, duplicate, rejected).
|
|
29
|
-
- Control execution cost while preserving reproducibility, and preserve failing seeds/examples for regression coverage.
|
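For the stateful case mentioned above, a sequence-based property might be sketched like this. The order lifecycle in `ALLOWED` is a hypothetical example, and seeded `random` is used in place of a property-testing library's state-machine support.

```python
import random

# Hypothetical order lifecycle: allowed transitions per state.
ALLOWED = {
    "new": {"pay": "paid", "cancel": "cancelled"},
    "paid": {"ship": "shipped", "refund": "refunded"},
    "shipped": {},
    "cancelled": {},
    "refunded": {},
}
ACTIONS = ["pay", "cancel", "ship", "refund"]


def apply_action(state, action):
    """Unit under test: return the next state or raise on an invalid transition."""
    next_state = ALLOWED[state].get(action)
    if next_state is None:
        raise ValueError("invalid transition: %s -> %s" % (state, action))
    return next_state


def run_sequences(seed, sequences=200, max_len=8):
    """Drive random action sequences; return failing (seed, actions) pairs."""
    rng = random.Random(seed)
    failures = []
    for _ in range(sequences):
        actions = [rng.choice(ACTIONS) for _ in range(rng.randint(1, max_len))]
        state, accepted = "new", []
        for action in actions:
            try:
                state = apply_action(state, action)
                accepted.append(action)
            except ValueError:
                pass  # rejected action: state stays as it was
        # Properties: the final state is a known business state, and an order
        # can only be shipped or refunded after a payment was accepted.
        ok = state in ALLOWED
        if state in ("shipped", "refunded"):
            ok = ok and "pay" in accepted
        if not ok:
            failures.append((seed, actions))
    return failures
```

A failing pair returned by `run_sequences` can be kept as a fixed regression case, which matches the guidance on preserving failing seeds and examples.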
|
30
|
-
|
|
31
|
-
## Recording rules
|
|
32
|
-
- If specs are used, record cases and outcomes in `checklist.md`.
|
|
33
|
-
- If specs are not used, record cases, external-state coverage, adversarial coverage, or `N/A` reasons in the response.
|
|
@@ -1,29 +0,0 @@
|
|
|
1
|
-
# Unit Testing Guide
|
|
2
|
-
|
|
3
|
-
## Purpose
|
|
4
|
-
- Verify correctness of a single function/module and localize failures quickly.
|
|
5
|
-
- Cover both success and failure paths for the smallest changed behavior unit.
|
|
6
|
-
|
|
7
|
-
## Required when
|
|
8
|
-
- Any non-trivial logic change should be accompanied by new or updated unit tests.
|
|
9
|
-
- Unit test evaluation is required even when specs are not used.
|
|
10
|
-
|
|
11
|
-
## Coverage focus
|
|
12
|
-
- Core logic branches and boundary values.
|
|
13
|
-
- Error handling, validation failures, and incompatible states.
|
|
14
|
-
- Function paths with highest regression risk.
|
|
15
|
-
|
|
16
|
-
## Design guidance
|
|
17
|
-
- Isolate external dependencies (mock/stub/fake).
|
|
18
|
-
- Keep tests small and focused: one behavior per test.
|
|
19
|
-
- Do not stop at happy-path assertions; verify exact errors, rejected states, and intentional lack of side effects when the unit should block an action.
|
|
20
|
-
- Where the input space is small and discrete, exhaustively enumerate business inputs and expected outputs.
|
|
21
|
-
- Prefer table-driven cases when many small business permutations share the same oracle.
|
|
22
|
-
- Add regression tests for bug-prone or high-risk logic so previously broken behavior cannot silently return.
|
|
23
|
-
- If the unit owns authorization, invalid transition, idempotency, or concurrency decisions, test those denials explicitly.
|
|
24
|
-
- Keep tests reproducible and fast.
|
|
25
|
-
- Avoid assertion-light smoke tests and snapshot-only coverage unless the snapshot has a strict business oracle behind it.
|
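The guidance on testing denials explicitly, including the intentional lack of side effects, can be sketched as below. `Account`, `NotAuthorized`, and `withdraw` are hypothetical names used only for illustration.

```python
class Account:
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance


class NotAuthorized(Exception):
    pass


def withdraw(account, actor, amount, audit_log):
    """Hypothetical unit that owns the authorization decision."""
    if actor != account.owner:
        raise NotAuthorized("actor %r may not withdraw" % actor)
    if amount <= 0 or amount > account.balance:
        raise ValueError("invalid amount")
    account.balance -= amount
    audit_log.append(("withdraw", actor, amount))


def test_unauthorized_withdraw_is_blocked_without_side_effects():
    account, audit_log = Account("alice", 100), []
    try:
        withdraw(account, "mallory", 10, audit_log)
    except NotAuthorized:
        pass
    else:
        raise AssertionError("expected NotAuthorized")
    assert account.balance == 100  # no state change
    assert audit_log == []         # no audit entry for a blocked action


def test_authorized_withdraw_succeeds():
    account, audit_log = Account("alice", 100), []
    withdraw(account, "alice", 30, audit_log)
    assert account.balance == 70
    assert audit_log == [("withdraw", "alice", 30)]
```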
|
26
|
-
|
|
27
|
-
## Recording rules
|
|
28
|
-
- If specs are used, record mapped test cases and results in `checklist.md`.
|
|
29
|
-
- If specs are not used, list test IDs and results in the response.
|