@neikyun/ciel 6.11.0 → 6.11.2

Files changed (39)
  1. package/assets/.claude/hooks/memory-engine.py +29 -4
  2. package/assets/.claude/settings.json +8 -8
  3. package/assets/commands/ciel-create-skill.md +2 -2
  4. package/assets/commands/ciel-status.md +1 -1
  5. package/assets/platforms/opencode/.opencode/agents/ciel-improver.md +2 -2
  6. package/assets/platforms/opencode/.opencode/commands/ciel-create-skill.md +2 -2
  7. package/assets/platforms/opencode/.opencode/commands/ciel-memory-bootstrap.md +195 -0
  8. package/assets/skills/workflow/adr-auto/SKILL.md +88 -0
  9. package/assets/skills/workflow/ai-failure-modes-detector/SKILL.md +180 -0
  10. package/assets/skills/workflow/ask-window/SKILL.md +119 -0
  11. package/assets/skills/workflow/avec-quoi-versioner/SKILL.md +111 -0
  12. package/assets/skills/workflow/ci-watcher/SKILL.md +194 -0
  13. package/assets/skills/workflow/critiquer-auditor/SKILL.md +135 -0
  14. package/assets/skills/workflow/critiquer-auditor/reference.md +134 -0
  15. package/assets/skills/workflow/debug-reasoning-rca/SKILL.md +174 -0
  16. package/assets/skills/workflow/depth-classifier/SKILL.md +118 -0
  17. package/assets/skills/workflow/diverge/SKILL.md +91 -0
  18. package/assets/skills/workflow/doc-validator-official/SKILL.md +196 -0
  19. package/assets/skills/workflow/evaluer-sizer/SKILL.md +112 -0
  20. package/assets/skills/workflow/faire-gatekeeper/SKILL.md +99 -0
  21. package/assets/skills/workflow/flux-narrator/SKILL.md +93 -0
  22. package/assets/skills/workflow/memoire/SKILL.md +198 -0
  23. package/assets/skills/workflow/memoire-consolidator/SKILL.md +91 -0
  24. package/assets/skills/workflow/meta-critiquer/SKILL.md +112 -0
  25. package/assets/skills/workflow/modern-patterns-checker/SKILL.md +166 -0
  26. package/assets/skills/workflow/pattern-fitness-check/SKILL.md +108 -0
  27. package/assets/skills/workflow/playwright-visual-critic/SKILL.md +98 -0
  28. package/assets/skills/workflow/pr-review-responder/SKILL.md +214 -0
  29. package/assets/skills/workflow/prouver-verifier/SKILL.md +184 -0
  30. package/assets/skills/workflow/prouver-verifier/reference.md +152 -0
  31. package/assets/skills/workflow/quoi-framer/SKILL.md +91 -0
  32. package/assets/skills/workflow/relire-critic/SKILL.md +99 -0
  33. package/assets/skills/workflow/security-regression-check/SKILL.md +86 -0
  34. package/assets/skills/workflow/self-consistency-verifier/SKILL.md +85 -0
  35. package/assets/skills/workflow/spike-mode/SKILL.md +101 -0
  36. package/assets/skills/workflow/stride-analyzer/SKILL.md +96 -0
  37. package/assets/skills/workflow/stride-analyzer/reference.md +144 -0
  38. package/assets/skills/workflow/test-strategy-vitest-playwright/SKILL.md +119 -0
  39. package/package.json +1 -1
@@ -0,0 +1,119 @@
1
+ ---
2
+ name: test-strategy-vitest-playwright
3
+ description: "How to plan a test strategy: test pyramid (70/20/10), what to test at each level (unit/integration/E2E), what to mock vs. hit real, property-based testing for boundaries, and keeping the suite fast. 2026 convention: browser-native runners, accessibility-tree assertions over screenshots."
4
+ allowed-tools: Read, Grep, Glob, Bash
5
+ ---
6
+
7
+ # Test Strategy — Pyramid, Not Ice-Cream Cone
8
+
9
+ ## What this covers
10
+
11
+ How to decide which tests go where, what to mock, and how to keep a test suite fast. The anti-pattern is 70% E2E Playwright, 5% unit — slow CI, flaky, expensive. The 2026 pyramid: most tests at the unit level, very few real-browser E2E.
12
+
13
+ ## Core principle
14
+
15
+ **Most tests should be unit tests.** E2E is for critical user paths across 3+ components, not coverage inflation. If you're writing E2E because "it's hard to isolate", the code needs a refactor, not more tests.
16
+
17
+ ## The 2026 pyramid (target ratios)
18
+
19
+ ```
20
+ ┌───────────────┐
21
+ │   E2E (10%)   │  Playwright: critical user paths only
22
+ ├───────────────┤
23
+ │  Integ (20%)  │  Vitest + MSW (no real network) OR test DB
24
+ ├───────────────┤
25
+ │               │
26
+ │  Unit (70%)   │  Vitest: pure logic, reducers, utils
27
+ │               │
28
+ └───────────────┘
29
+ ```
30
+
31
+ Property-based (`fast-check`) crosscuts all levels for boundary conditions.
32
+
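To make the target ratios above concrete, the sketch below computes each level's share from raw test counts and reports the levels that drift from 70/20/10. The names `PYRAMID_TARGETS` and `pyramidDrift` and the 10-point default tolerance are illustrative assumptions, not part of Vitest, Playwright, or any other tool.

```typescript
type Level = "unit" | "integration" | "e2e";

// Target shares from the pyramid, as fractions of the whole suite.
const PYRAMID_TARGETS: Record<Level, number> = {
  unit: 0.7,
  integration: 0.2,
  e2e: 0.1,
};

// Returns the levels whose actual share differs from its target by more than
// `tolerance` (assumed default: 10 percentage points). An empty suite reports
// no drift.
function pyramidDrift(counts: Record<Level, number>, tolerance = 0.1): Level[] {
  const total = counts.unit + counts.integration + counts.e2e;
  if (total === 0) return [];
  const levels: Level[] = ["unit", "integration", "e2e"];
  return levels.filter(
    (level) => Math.abs(counts[level] / total - PYRAMID_TARGETS[level]) > tolerance,
  );
}

// An ice-cream-cone suite: mostly E2E, almost no unit tests.
const coneDrift = pyramidDrift({ unit: 5, integration: 25, e2e: 70 });
```

For the ice-cream-cone counts, `coneDrift` contains `unit` and `e2e`. Because the ratios are targets rather than quotas, the tolerance can be widened for features that legitimately skew toward one level.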
33
+ ## Unit testing (Vitest)
34
+
35
+ **When**: pure function, reducer, class method with deterministic input→output.
36
+
37
+ - Test ONE behavior per test (not "mega tests")
38
+ - No real filesystem, no real network, no real DB
39
+ - Run in < 50ms each
40
+ - Should fail if implementation logic breaks (not if formatting breaks)
41
+
42
+ ```typescript
43
+ it('returns offset 0 when page is 0', () => {
44
+   expect(paginate({ page: 0, size: 10 }).offset).toBe(0);
45
+ });
46
+ ```
47
+
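The `paginate` helper exercised by the test above is not shown. One minimal sketch, assuming zero-based pages and an `offset`/`limit` result shape (both assumptions made for illustration), might look like this:

```typescript
interface PageRequest {
  page: number; // zero-based page index (assumption)
  size: number; // items per page
}

interface PageWindow {
  offset: number; // index of the first item on the page
  limit: number; // maximum number of items to return
}

// Hypothetical implementation of `paginate`: pure, deterministic, and free of
// I/O, which is what makes it a good unit-test target.
function paginate({ page, size }: PageRequest): PageWindow {
  if (!Number.isInteger(page) || page < 0) {
    throw new RangeError(`page must be a non-negative integer, got ${page}`);
  }
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError(`size must be a positive integer, got ${size}`);
  }
  return { offset: page * size, limit: size };
}
```

Each rejected input (negative page, zero size) would get its own single-behavior test rather than being folded into one large case.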
48
+ ## Integration testing (Vitest + MSW)
49
+
50
+ **When**: module touches an external system (HTTP API, DB, cache) but you want fast deterministic runs.
51
+
52
+ - MSW mocks the HTTP layer at the network level (not at the `fetch` level)
53
+ - Seed the test with a realistic fixture response
54
+ - DB: use `vitest-environment` + SQLite in-memory OR Testcontainers for the real engine
55
+ - Run in < 500ms each
56
+
57
+ **Anti-pattern**: mocking your own modules. If you mock your own user-service, you're just testing that you wrote mocks correctly.
58
+
59
+ ## E2E testing (Playwright)
60
+
61
+ **When**: critical user path across ≥ 3 components (login → browse → checkout → confirm).
62
+
63
+ - **Accessibility-tree assertions** (`page.getByRole('button', { name: 'Submit' })`) — deterministic, doesn't break on CSS changes
64
+ - **Avoid screenshot assertions** for behavior — use for visual regression only, and only on static content
65
+ - Seed DB via a test setup script, NOT through the UI (too slow)
66
+ - One test = one user journey, not twelve
67
+
68
+ ## Property-based testing (fast-check)
69
+
70
+ **When**: boundary conditions are the risk — off-by-one, null, empty, max int, unicode.
71
+
72
+ - State the PROPERTY ("sorting is idempotent: sort(sort(x)) === sort(x)")
73
+ - Let fast-check generate 100+ inputs
74
+ - Use `fc.pre()` to filter invalid inputs (not to avoid branches of logic)
75
+
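The shape of a property check can be sketched without any library. The version below is a hand-rolled stand-in and not fast-check's API: `mulberry32`, `randomIntArray`, and `checkProperty` are illustrative names, and the sketch does no shrinking of counterexamples, which fast-check would provide.

```typescript
// Small seeded PRNG (mulberry32) so that a failing run can be replayed.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Generates an integer array of length 0..20, mixing in boundary values.
function randomIntArray(rand: () => number): number[] {
  const edges = [0, -1, 1, Number.MAX_SAFE_INTEGER, Number.MIN_SAFE_INTEGER];
  const length = Math.floor(rand() * 21);
  return Array.from({ length }, () =>
    rand() < 0.2
      ? edges[Math.floor(rand() * edges.length)]
      : Math.floor(rand() * 2001) - 1000,
  );
}

// Runs `property` against `runs` generated inputs and returns the first
// counterexample, or undefined if every input satisfied the property.
function checkProperty<T>(
  generate: (rand: () => number) => T,
  property: (input: T) => boolean,
  { seed = 42, runs = 100 } = {},
): T | undefined {
  const rand = mulberry32(seed);
  for (let i = 0; i < runs; i++) {
    const input = generate(rand);
    if (!property(input)) return input;
  }
  return undefined;
}

const sortAsc = (xs: number[]): number[] => [...xs].sort((a, b) => a - b);

// Property: sorting is idempotent, i.e. sort(sort(x)) equals sort(x) element-wise.
const sortIsIdempotent = (xs: number[]): boolean => {
  const once = sortAsc(xs);
  const twice = sortAsc(once);
  return once.length === twice.length && once.every((x, i) => x === twice[i]);
};

const counterexample = checkProperty(randomIntArray, sortIsIdempotent);
```

Here `counterexample` stays `undefined` because the property holds. The fixed seed is the point of the design: a failure found in CI can be reproduced locally with the same inputs.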
76
+ ## What to mock, what to hit real
77
+
78
+ | System | Mock? | Rationale |
79
+ |---|---|---|
80
+ | External HTTP APIs | Yes (MSW) | Flaky, slow, rate-limited |
81
+ | Internal microservices | Yes (MSW) for unit/integ; real for E2E | Keep blast radius small |
82
+ | Database | Real (in-memory or container) | Too many bugs hide in ORM/raw-SQL mismatch |
83
+ | Time (`Date.now`) | Yes (vi.useFakeTimers) | Non-determinism otherwise |
84
+ | Randomness | Yes (seeded PRNG) | Same reason |
85
+ | Filesystem | Real (temp dir) for integ; mock for unit | `memfs` is fine for pure tests |
86
+ | Auth tokens | Real signed test token | Mocked tokens hide signature-validation bugs |
87
+ | Third-party SDK | Mock at module boundary | Not at network level |
88
+
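For the time row, the effect of fake timers on `Date.now` can be pictured with a hand-rolled stand-in. `withFrozenTime` and `isExpired` below are hypothetical helpers written for illustration only. In a Vitest suite the built-in `vi.useFakeTimers` would normally be used instead, and this sketch only handles synchronous callbacks.

```typescript
// Temporarily pins Date.now() to `nowMs`, runs `fn`, then restores the real clock.
function withFrozenTime<T>(nowMs: number, fn: () => T): T {
  const realNow = Date.now;
  Date.now = () => nowMs;
  try {
    return fn();
  } finally {
    Date.now = realNow; // always restore, even if `fn` throws
  }
}

// Hypothetical code under test: its result depends on the current time.
function isExpired(expiresAtMs: number): boolean {
  return Date.now() >= expiresAtMs;
}

const deadline = Date.UTC(2026, 0, 1); // 2026-01-01T00:00:00Z
const justBefore = withFrozenTime(deadline - 1, () => isExpired(deadline));
const atDeadline = withFrozenTime(deadline, () => isExpired(deadline));
```

With the clock pinned, `justBefore` is `false` and `atDeadline` is `true` on every run, which is the determinism the table asks for.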
89
+ ## Key points
90
+
91
+ - Pyramid ratios are targets, not strict quotas — a pure-UI feature may skew E2E higher; a pure-algorithm feature may be 95% unit
92
+ - No E2E without unit first
93
+ - One test per behavior: tests named `it('does many things', ...)` are a code smell
94
+ - Avoid snapshot tests for dynamic output — they become "update snapshots" rituals that don't catch bugs
95
+ - Accessibility-tree > CSS selectors in Playwright — `getByRole` survives refactors
96
+ - Flaky test policy: first flake → debug. Second flake → quarantine (`test.skip` and open an issue). Third flake → delete unless Critical
97
+
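The flaky-test policy in the list above is a small decision rule, so it can be written down as one. The sketch below is illustrative: `flakeAction` is a hypothetical helper, and the `"keep"` outcome for Critical tests is an assumption, since the policy only says such tests are not deleted.

```typescript
type FlakeAction = "debug" | "quarantine" | "delete" | "keep";

// Maps how many times a test has flaked to the action the policy prescribes.
// `critical` marks tests that must not be deleted; what counts as Critical is
// left to the team.
function flakeAction(flakeCount: number, critical = false): FlakeAction {
  if (!Number.isInteger(flakeCount) || flakeCount < 1) {
    throw new RangeError(`flakeCount must be a positive integer, got ${flakeCount}`);
  }
  if (flakeCount === 1) return "debug";
  if (flakeCount === 2) return "quarantine"; // test.skip plus a tracking issue
  return critical ? "keep" : "delete";
}
```

Encoding the rule this way makes the escalation explicit: no test is deleted without having been debugged and quarantined first.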
98
+ ## Common anti-patterns
99
+
100
+ 1. **Ice-cream cone**: 70% E2E, 5% unit — slow, flaky, expensive to maintain
101
+ 2. **Coverage theater**: high coverage number but tests don't catch real bugs
102
+ 3. **Mocking yourself**: mocking your own modules proves nothing except that mocks work
103
+ 4. **Mega-tests**: one test covering 5 scenarios — split them
104
+ 5. **Screenshot assertions for behavior**: brittle, break on font changes; use accessibility-tree assertions instead
105
+ 6. **E2E as unit replacement**: E2E tests are roughly 100x slower than unit tests; use them only for integration across real browsers
106
+
107
+ ## How to verify your test strategy is good
108
+
109
+ - **Runtime budget**: total suite < 3 min for pre-commit + CI
110
+ - **Mutation testing**: change production code → does a test fail?
111
+ - **New person test**: can someone understand the feature from tests alone?
112
+ - **Bug regression**: when a bug is found, add a test that would have caught it at the lowest level possible
113
+
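The mutation-testing check above can be illustrated by hand. In this sketch the production function, the mutant, and the tiny "suite" are all hypothetical examples. A real mutation-testing tool would generate mutants automatically rather than relying on a hand-written one.

```typescript
type Clamp = (value: number, min: number, max: number) => number;

// Production code under test.
const clamp: Clamp = (value, min, max) => Math.min(Math.max(value, min), max);

// A hand-written mutant: the inner `Math.max` is swapped for `Math.min`.
const clampMutant: Clamp = (value, min, max) => Math.min(Math.min(value, min), max);

// The "suite": returns true when every check passes for the given implementation.
function suitePasses(impl: Clamp): boolean {
  return (
    impl(5, 0, 10) === 5 && // value inside the range is unchanged
    impl(-3, 0, 10) === 0 && // value below the range is raised to min
    impl(42, 0, 10) === 10 // value above the range is lowered to max
  );
}

// A mutant is "killed" when the suite passes on the original but fails on the mutant.
function killsMutant(original: Clamp, mutant: Clamp): boolean {
  return suitePasses(original) && !suitePasses(mutant);
}

const killed = killsMutant(clamp, clampMutant);
```

`killed` is `true` here because the in-range check fails on the mutant. A suite that only checked `impl(-3, 0, 10) === 0` would let this mutant survive, which is the signal that coverage is not catching real bugs.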
114
+ ## References
115
+
116
+ - defined.net/blog/modern-frontend-testing — Vitest + Storybook + Playwright stack
117
+ - playwright.dev/docs/best-practices — accessibility-tree assertions
118
+ - fast-check.dev — property-based testing in TS/JS
119
+ - hypothesis.works — property-based testing in Python (equivalent concepts)
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@neikyun/ciel",
3
- "version": "6.11.0",
3
+ "version": "6.11.2",
4
4
  "description": "Ciel — Deep-reasoning pipeline for LLM-assisted development. OpenCode plugin + multi-platform CLI (OpenCode, Claude Code, more).",
5
5
  "main": "./dist/plugin/index.js",
6
6
  "types": "./dist/plugin/index.d.ts",