@neikyun/ciel 6.10.1 → 6.11.1

Files changed (40)
  1. package/assets/.claude/hooks/memory-engine.py +256 -0
  2. package/assets/commands/ciel-audit.md +42 -0
  3. package/assets/commands/ciel-create-skill.md +2 -2
  4. package/assets/commands/ciel-status.md +1 -1
  5. package/assets/platforms/opencode/.opencode/agents/ciel-improver.md +2 -2
  6. package/assets/platforms/opencode/.opencode/commands/ciel-create-skill.md +2 -2
  7. package/assets/platforms/opencode/.opencode/commands/ciel-memory-bootstrap.md +195 -0
  8. package/assets/skills/ciel/SKILL.md +2 -1
  9. package/assets/skills/workflow/adr-auto/SKILL.md +88 -0
  10. package/assets/skills/workflow/ai-failure-modes-detector/SKILL.md +180 -0
  11. package/assets/skills/workflow/ask-window/SKILL.md +119 -0
  12. package/assets/skills/workflow/avec-quoi-versioner/SKILL.md +111 -0
  13. package/assets/skills/workflow/ci-watcher/SKILL.md +194 -0
  14. package/assets/skills/workflow/critiquer-auditor/SKILL.md +135 -0
  15. package/assets/skills/workflow/critiquer-auditor/reference.md +134 -0
  16. package/assets/skills/workflow/debug-reasoning-rca/SKILL.md +174 -0
  17. package/assets/skills/workflow/depth-classifier/SKILL.md +118 -0
  18. package/assets/skills/workflow/diverge/SKILL.md +91 -0
  19. package/assets/skills/workflow/doc-validator-official/SKILL.md +196 -0
  20. package/assets/skills/workflow/evaluer-sizer/SKILL.md +112 -0
  21. package/assets/skills/workflow/faire-gatekeeper/SKILL.md +99 -0
  22. package/assets/skills/workflow/flux-narrator/SKILL.md +93 -0
  23. package/assets/skills/workflow/memoire/SKILL.md +198 -0
  24. package/assets/skills/workflow/memoire-consolidator/SKILL.md +91 -0
  25. package/assets/skills/workflow/meta-critiquer/SKILL.md +112 -0
  26. package/assets/skills/workflow/modern-patterns-checker/SKILL.md +166 -0
  27. package/assets/skills/workflow/pattern-fitness-check/SKILL.md +108 -0
  28. package/assets/skills/workflow/playwright-visual-critic/SKILL.md +98 -0
  29. package/assets/skills/workflow/pr-review-responder/SKILL.md +214 -0
  30. package/assets/skills/workflow/prouver-verifier/SKILL.md +184 -0
  31. package/assets/skills/workflow/prouver-verifier/reference.md +152 -0
  32. package/assets/skills/workflow/quoi-framer/SKILL.md +91 -0
  33. package/assets/skills/workflow/relire-critic/SKILL.md +99 -0
  34. package/assets/skills/workflow/security-regression-check/SKILL.md +86 -0
  35. package/assets/skills/workflow/self-consistency-verifier/SKILL.md +85 -0
  36. package/assets/skills/workflow/spike-mode/SKILL.md +101 -0
  37. package/assets/skills/workflow/stride-analyzer/SKILL.md +96 -0
  38. package/assets/skills/workflow/stride-analyzer/reference.md +144 -0
  39. package/assets/skills/workflow/test-strategy-vitest-playwright/SKILL.md +119 -0
  40. package/package.json +1 -1
@@ -0,0 +1,144 @@
1
+ # stride-analyzer — Reference
2
+
3
+ ## STRIDE — detailed category probes
4
+
5
+ ### S — Spoofing (identity)
6
+
7
+ Can I impersonate another user/service/system?
8
+
9
+ Probes:
10
+ - Grep for `userId` / `user_id` coming from request params vs resolved server-side (JWT, session)
11
+ - Grep for identity claims trusted without verification (e.g. `X-User-Id` header accepted as-is)
12
+ - Check auth middleware ordering: is authentication before authorization?
13
+ - WebSocket/SSE: is the same auth applied? (common gap: REST auth is bulletproof, WS accepts any token)
14
+
15
+ Evidence format:
16
+ ```
17
+ - Spoofing: userId extracted from JWT claim at JwtMiddleware.kt:45 — not client-supplied ✓
18
+ ```
19
+
20
+ ### T — Tampering (data integrity)
21
+
22
+ Can input be modified in transit or at rest without detection?
23
+
24
+ Probes:
25
+ - HTTPS everywhere? Grep for `http://` (non-localhost)
26
+ - CSRF tokens on state-changing endpoints?
27
+ - Signed cookies / signed JWTs? What algorithm? (HS256 shares one secret between signer and verifier; RS256 lets verifiers hold only the public key)
28
+ - Database writes: is the audit trail immutable? (INSERT-only tables for events)
29
+
30
+ ### R — Repudiation (non-repudiation)
31
+
32
+ Can a user deny having performed an action?
33
+
34
+ Probes:
35
+ - Audit log coverage: what events are logged? With what identity?
36
+ - Log tampering resistance: append-only? Logged externally?
37
+ - Timestamp source: server-controlled? Synced?
38
+
39
+ ### I — Information Disclosure
40
+
41
+ What information leaks to unauthorized parties?
42
+
43
+ Probes:
44
+ - Error messages: do they include stack traces / SQL / paths / credentials?
45
+ - Logs: do they contain PII, secrets, tokens?
46
+ - API responses: over-fetching? `SELECT *` instead of projected columns?
47
+ - 404 vs 403 distinction: does the status code (or response timing) reveal whether a resource exists?
48
+ - Autocomplete endpoints: leak usernames / emails?
49
+
50
+ ### D — Denial of Service
51
+
52
+ Can this be flooded or exhausted?
53
+
54
+ Probes:
55
+ - Rate limiting: per-IP? per-user? per-endpoint?
56
+ - Resource bounds: max payload size? max query depth (GraphQL)? max file upload?
57
+ - Algorithmic complexity: O(n²) loops on user-controlled n?
58
+ - Connection pooling: max connections? timeout?
59
+ - Regex catastrophic backtracking on user input?
60
+
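A per-key token bucket is one common way to answer the rate-limiting probe. The sketch below is illustrative only: `TokenBucket`, `allowRequest`, and the capacity and refill numbers are hypothetical, and the caller passes the clock in so the behavior stays deterministic.

```typescript
// Minimal token bucket: `capacity` bounds bursts, `refillPerSecond` bounds sustained rate.
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private readonly capacity: number;
  private readonly refillPerSecond: number;

  constructor(capacity: number, refillPerSecond: number, nowMs: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity;
    this.lastRefillMs = nowMs;
  }

  // Returns true if the request is allowed, false if it should be rejected.
  tryConsume(nowMs: number): boolean {
    const elapsedSeconds = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = nowMs;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// One bucket per key (per user, per IP, per endpoint: the caller decides).
// Real code would also need eviction so this map does not grow without bound.
const buckets = new Map<string, TokenBucket>();

function allowRequest(key: string, nowMs: number): boolean {
  let bucket = buckets.get(key);
  if (bucket === undefined) {
    bucket = new TokenBucket(2, 1, nowMs); // assumed limits: burst of 2, 1 token per second
    buckets.set(key, bucket);
  }
  return bucket.tryConsume(nowMs);
}
```

Keying the bucket by user, IP, or endpoint maps directly onto the three sub-questions of the rate-limiting probe.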
61
+ ### E — Elevation of Privilege
62
+
63
+ Can I access what I shouldn't?
64
+
65
+ Probes:
66
+ - RBAC/ABAC correctness: does the permission check run before the action?
67
+ - Horizontal privilege escalation: can user A read user B's data with API manipulation?
68
+ - Vertical privilege escalation: can user become admin via some path?
69
+ - Mass assignment: can user set `isAdmin` via PATCH body?
70
+
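The mass-assignment probe usually comes down to whether the handler copies the request body wholesale. A minimal allow-list sketch, assuming hypothetical field names (`displayName`, `bio`) and a hypothetical helper `pickPatchable`:

```typescript
// Hypothetical fields a user may change about themselves.
const PATCHABLE_FIELDS = ["displayName", "bio"] as const;
type PatchableField = (typeof PATCHABLE_FIELDS)[number];

// Copy only allow-listed keys; anything else (e.g. isAdmin) is silently dropped.
function pickPatchable(body: Record<string, unknown>): Partial<Record<PatchableField, unknown>> {
  const patch: Partial<Record<PatchableField, unknown>> = {};
  for (const field of PATCHABLE_FIELDS) {
    if (Object.prototype.hasOwnProperty.call(body, field)) {
      patch[field] = body[field];
    }
  }
  return patch;
}
```

An allow-list fails closed: a privileged field added to the model later stays unpatchable until someone lists it explicitly.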
71
+ ## OPS lens (overlaid on STRIDE)
72
+
73
+ - **Unclosed connections**: grep for `conn.close()` / `client.close()` / `try-with-resources` / `use {}` — every open should have a close
74
+ - **Memory leaks**: long-lived caches without eviction? Unbounded collections? Listeners not removed?
75
+ - **Locks**: deadlock-prone order? Held across I/O?
76
+ - **100x volume**: if traffic grew 100x tomorrow, what breaks first?
77
+
78
+ ## Killer checklist — detail
79
+
80
+ ### Same field = same validation everywhere
81
+
82
+ If `email` is validated one way in `RegisterRoute.kt` and another way in `ProfileUpdateRoute.kt`, an attacker uses the weaker one. Validation must be centralized.
83
+
84
+ ```bash
85
+ # Find all places email is validated
86
+ grep -rn "email" --include='*.kt' src/ | grep -iE 'valid|sanitize|check'
87
+ ```
88
+
89
+ Evidence: all call sites converge on a single validator.
90
+
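What "converging on a single validator" can look like, sketched in TypeScript rather than the Kotlin of the example routes. The handler names and the validation rule itself are assumptions for illustration, not a full RFC 5322 check.

```typescript
// Single source of truth for email validation.
function validateEmail(raw: string): string {
  const email = raw.trim().toLowerCase();
  // Length is checked first so the regex never runs on oversized input.
  if (email.length > 254 || !/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email)) {
    throw new Error("invalid email");
  }
  return email;
}

// Both hypothetical handlers delegate to the same validator,
// so there is no weaker path for an attacker to pick.
function register(input: { email: string }): { email: string } {
  return { email: validateEmail(input.email) };
}

function updateProfile(input: { email: string }): { email: string } {
  return { email: validateEmail(input.email) };
}
```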
91
+ ### Same domain = same auth on ALL transports
92
+
93
+ REST endpoint has auth; WebSocket channel for the same resource doesn't (or uses different auth). Attacker bypasses via WebSocket.
94
+
95
+ ```bash
96
+ grep -rn "authenticate" src/ --include='*.kt'
97
+ grep -rni "socket\|sse\|webflux" src/
98
+ ```
99
+
100
+ ### Identity resolved server-side
101
+
102
+ ```bash
103
+ # Any userId coming from request body/path?
104
+ grep -rn 'call.parameters\["userId"\]' src/
105
+ grep -rn 'request.body.userId' src/
106
+ # Should all be via JWT/session claim
107
+ ```
108
+
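A sketch of the target state for those grep hits. It assumes a hypothetical request shape where auth middleware has already verified the token and attached `claims`; none of these names come from a real framework.

```typescript
// `claims` is set by auth middleware after verifying the token;
// `body` is whatever the client sent and is untrusted.
interface AuthedRequest {
  claims: { sub: string };
  body: Record<string, unknown>;
}

// The acting user always comes from verified claims, never from the body.
function resolveUserId(req: AuthedRequest): string {
  return req.claims.sub;
}

// Optional hardening: reject bodies that name a different user outright.
function assertNoSpoofedUserId(req: AuthedRequest): void {
  const claimed = req.body["userId"];
  if (claimed !== undefined && claimed !== req.claims.sub) {
    throw new Error("userId in body does not match authenticated identity");
  }
}
```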
109
+ ### SQL parameterized
110
+
111
+ ```bash
112
+ # Find string interpolation in SQL
113
+ grep -rn "\\\$" src/ --include='*.kt' | grep -iE 'sql|query'
114
+ grep -rn '"SELECT.*" *+ ' src/
115
+ ```
116
+
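For contrast with the interpolation patterns the greps look for, here is a minimal sketch of a parameterized query built with a tagged template. The `sql` helper and the PostgreSQL-style `$1, $2` numbering are assumptions; real drivers and query builders provide their own equivalent.

```typescript
interface SqlQuery {
  text: string;      // statement with placeholders only
  values: unknown[]; // user-supplied values, sent separately to the driver
}

// Interpolated values become $1, $2, ... instead of being spliced into the text.
function sql(strings: TemplateStringsArray, ...values: unknown[]): SqlQuery {
  let text = strings[0] ?? "";
  for (let i = 0; i < values.length; i++) {
    text += `$${i + 1}` + (strings[i + 1] ?? "");
  }
  return { text, values };
}

const hostile = "x' OR '1'='1";
const query = sql`SELECT id FROM users WHERE email = ${hostile} AND tenant = ${7}`;
```

The hostile string ends up in `query.values`, never in `query.text`, so it cannot change the shape of the statement.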
117
+ ### PII anonymization
118
+
119
+ ```bash
120
+ # Find logging of user fields
121
+ grep -rn "logger.info.*user" src/
122
+ grep -rn "println.*email\|println.*phone" src/
123
+ ```
124
+
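When those greps find user fields in log calls, one possible fix is to mask before logging. A small sketch; `maskEmail` and `redactForLog` are hypothetical helpers and the masking format is an arbitrary choice.

```typescript
// Keep the first character and the domain, hide the rest of the local part.
function maskEmail(email: string): string {
  const at = email.indexOf("@");
  if (at <= 0) return "***"; // not a recognizable address: reveal nothing
  return `${email.charAt(0)}***${email.slice(at)}`;
}

// Build the object that is safe to hand to a logger.
function redactForLog(user: { id: string; email: string }): { id: string; email: string } {
  return { id: user.id, email: maskEmail(user.email) };
}
```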
125
+ ## Multi-PR delegation
126
+
127
+ When the same reviewer has done 2+ STRIDE passes on related PRs in one session, blind spots compound. Delegate the 2nd pass to a subagent:
128
+
129
+ ```
130
+ Task(subagent_type="Explore", prompt="""
131
+ Run STRIDE PASSE 2 on this diff. Fresh eyes, no session history.
132
+ CHANGED_FILES: [...]
133
+ FOCUS: category you feel is weakest
134
+ """)
135
+ ```
136
+
137
+ ## Stale item rotation
138
+
139
+ Tracked via `learnings-capture`: if a killer checklist item passes (✓) in 10+ audits without catching anything, flag for review. Either:
140
+
141
+ - The codebase is genuinely clean on that dimension → consider removing item
142
+ - The item is too vague to fail → tighten the check
143
+
144
+ Replace with a newer, more specific check.
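
The rotation rule can be stated as a filter. This is a sketch only: the `ChecklistStats` shape is hypothetical and says nothing about how `learnings-capture` actually stores its data.

```typescript
interface ChecklistStats {
  item: string;
  passes: number;  // audits where the item passed (✓)
  catches: number; // audits where the item caught a real issue
}

const STALE_PASS_THRESHOLD = 10;

// Items that keep passing without ever catching anything are candidates for review.
function findStaleItems(stats: ChecklistStats[]): string[] {
  return stats
    .filter((s) => s.passes >= STALE_PASS_THRESHOLD && s.catches === 0)
    .map((s) => s.item);
}
```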
@@ -0,0 +1,119 @@
1
+ ---
2
+ name: test-strategy-vitest-playwright
3
+ description: How to plan a test strategy — test pyramid (70/20/10), what to test at each level (unit/integration/E2E), what to mock vs hit real, property-based testing for boundaries, and keeping the suite fast. 2026 convention: browser-native runners, accessibility-tree assertions over screenshots.
4
+ allowed-tools: Read, Grep, Glob, Bash
5
+ ---
6
+
7
+ # Test Strategy — Pyramid, Not Ice-Cream Cone
8
+
9
+ ## What this covers
10
+
11
+ How to decide which tests go where, what to mock, and how to keep a test suite fast. The anti-pattern is 70% E2E Playwright, 5% unit — slow CI, flaky, expensive. The 2026 pyramid: most tests at the unit level, very few real-browser E2E.
12
+
13
+ ## Core principle
14
+
15
+ **Most tests should be unit tests.** E2E is for critical user paths across 3+ components, not coverage inflation. If you're writing E2E because "it's hard to isolate", the code needs a refactor, not more tests.
16
+
17
+ ## The 2026 pyramid (target ratios)
18
+
19
+ ```
20
+ ┌───────────────┐
21
+ │ E2E (10%) │ Playwright — critical user paths only
22
+ ├───────────────┤
23
+ │ Integ (20%) │ Vitest + MSW (no real network) OR test DB
24
+ ├───────────────┤
25
+ │ │
26
+ │ Unit (70%) │ Vitest — pure logic, reducers, utils
27
+ │ │
28
+ └───────────────┘
29
+ ```
30
+
31
+ Property-based (`fast-check`) crosscuts all levels for boundary conditions.
32
+
33
+ ## Unit testing (Vitest)
34
+
35
+ **When**: pure function, reducer, class method with deterministic input→output.
36
+
37
+ - Test ONE behavior per test (not "mega tests")
38
+ - No real filesystem, no real network, no real DB
39
+ - Run in < 50ms each
40
+ - Should fail if implementation logic breaks (not if formatting breaks)
41
+
42
+ ```typescript
43
+ it('paginates offset correctly when page is 0', () => {
44
+ expect(paginate({ page: 0, size: 10 }).offset).toBe(0);
45
+ });
46
+ ```
47
+
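The test above assumes a `paginate` function that is not shown. One possible zero-based implementation, with the return shape (`offset`, `limit`) and the input checks being assumptions:

```typescript
interface PageRequest {
  page: number; // zero-based page index
  size: number; // items per page
}

interface PageWindow {
  offset: number;
  limit: number;
}

// Pure function: same input, same output, no I/O. That is what makes it a unit-test target.
function paginate({ page, size }: PageRequest): PageWindow {
  if (!Number.isInteger(page) || page < 0) {
    throw new RangeError("page must be a non-negative integer");
  }
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  return { offset: page * size, limit: size };
}
```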
48
+ ## Integration testing (Vitest + MSW)
49
+
50
+ **When**: module touches an external system (HTTP API, DB, cache) but you want fast deterministic runs.
51
+
52
+ - MSW mocks the HTTP layer at the network level (not at the `fetch` level)
53
+ - Seed the test with a realistic fixture response
54
+ - DB: use `vitest-environment` + SQLite in-memory OR Testcontainers for the real engine
55
+ - Run in < 500ms each
56
+
57
+ **Anti-pattern**: mocking your own modules. If you mock your own user-service, you're just testing that you wrote mocks correctly.
58
+
59
+ ## E2E testing (Playwright)
60
+
61
+ **When**: critical user path across ≥ 3 components (login → browse → checkout → confirm).
62
+
63
+ - **Accessibility-tree assertions** (`page.getByRole('button', { name: 'Submit' })`) — deterministic, doesn't break on CSS changes
64
+ - **Avoid screenshot assertions** for behavior — use for visual regression only, and only on static content
65
+ - Seed DB via a test setup script, NOT through the UI (too slow)
66
+ - One test = one user journey, not twelve
67
+
68
+ ## Property-based testing (fast-check)
69
+
70
+ **When**: boundary conditions are the risk — off-by-one, null, empty, max int, unicode.
71
+
72
+ - State the PROPERTY ("sorting is idempotent: sort(sort(x)) === sort(x)")
73
+ - Let fast-check generate 100+ inputs
74
+ - Use `fc.pre()` to filter invalid inputs (not to avoid branches of logic)
75
+
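To make the idea concrete without pulling in a dependency, here is a hand-rolled property check for the idempotence example. It is not fast-check: there is no shrinking and the generator is a simple seeded PRNG. With fast-check the same property would be written as `fc.assert(fc.property(fc.array(fc.integer()), ...))`. All names below are hypothetical.

```typescript
// Small seeded generator so every run sees the same inputs.
function seededRandom(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

const sortNumbers = (xs: number[]): number[] => [...xs].sort((a, b) => a - b);

// Run `property` against `runs` generated arrays; return the first counterexample, or null.
function checkProperty(
  property: (xs: number[]) => boolean,
  runs = 200,
  seed = 42,
): number[] | null {
  const rand = seededRandom(seed);
  for (let i = 0; i < runs; i++) {
    const length = Math.floor(rand() * 20); // includes the empty array
    const xs = Array.from({ length }, () => Math.floor(rand() * 2001) - 1000);
    if (!property(xs)) return xs;
  }
  return null;
}

// The PROPERTY: sorting an already sorted array changes nothing.
const sortIsIdempotent = (xs: number[]): boolean =>
  JSON.stringify(sortNumbers(sortNumbers(xs))) === JSON.stringify(sortNumbers(xs));
```

The point is the shape of the test: one stated property, many generated inputs, and a counterexample returned when the property fails.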
76
+ ## What to mock, what to hit real
77
+
78
+ | System | Mock? | Rationale |
79
+ |---|---|---|
80
+ | External HTTP APIs | Yes (MSW) | Flaky, slow, rate-limited |
81
+ | Internal microservices | Yes (MSW) for unit/integ; real for E2E | Keep blast radius small |
82
+ | Database | Real (in-memory or container) | Too many bugs hide in ORM/raw-SQL mismatch |
83
+ | Time (`Date.now`) | Yes (vi.useFakeTimers) | Non-determinism otherwise |
84
+ | Randomness | Yes (seeded PRNG) | Same reason |
85
+ | Filesystem | Real (temp dir) for integ; mock for unit | `memfs` is fine for pure tests |
86
+ | Auth tokens | Real signed test token | Mocked tokens hide signature-validation bugs |
87
+ | Third-party SDK | Mock at module boundary | Not at network level |
88
+
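For the time row, `vi.useFakeTimers` replaces the global clock. A complementary pattern is to inject the clock, which makes the dependency visible in the signature. A minimal sketch with hypothetical names:

```typescript
// A clock is just a function returning epoch milliseconds.
type Clock = () => number;

// Production callers omit `now`; tests pass a fixed clock.
function isExpired(expiresAtMs: number, now: Clock = Date.now): boolean {
  return now() >= expiresAtMs;
}

const frozenAt = (ms: number): Clock => () => ms;
```

With `frozenAt`, a test can pin the boundary exactly: one millisecond before expiry versus the expiry instant itself.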
89
+ ## Key points
90
+
91
+ - Pyramid ratios are targets, not strict quotas — a pure-UI feature may skew E2E higher; a pure-algorithm feature may be 95% unit
92
+ - No E2E without unit first
93
+ - One test per behavior: tests named `it('does many things', ...)` are a code smell
94
+ - Avoid snapshot tests for dynamic output — they become "update snapshots" rituals that don't catch bugs
95
+ - Accessibility-tree > CSS selectors in Playwright — `getByRole` survives refactors
96
+ - Flaky test policy: first flake → debug. Second flake → quarantine (`test.skip` + ISSUE). Third flake → delete unless Critical
97
+
98
+ ## Common anti-patterns
99
+
100
+ 1. **Ice-cream cone**: 70% E2E, 5% unit — slow, flaky, expensive to maintain
101
+ 2. **Coverage theater**: high coverage number but tests don't catch real bugs
102
+ 3. **Mocking yourself**: mocking your own modules proves nothing except that mocks work
103
+ 4. **Mega-tests**: one test covering 5 scenarios — split them
104
+ 5. **Screenshot assertions for behavior**: brittle, break on font changes; use accessibility-tree assertions instead
105
+ 6. **E2E as unit replacement**: E2E tests are roughly 100x slower than unit tests; use them only for integration across real browsers
106
+
107
+ ## How to verify your test strategy is good
108
+
109
+ - **Runtime budget**: total suite < 3 min for pre-commit + CI
110
+ - **Mutation testing**: change production code → does a test fail?
111
+ - **New person test**: can someone understand the feature from tests alone?
112
+ - **Bug regression**: when a bug is found, add a test that would have caught it at the lowest level possible
113
+
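The mutation-testing check can be illustrated by hand. In this sketch a single operator is mutated and two candidate tests are compared; the functions are hypothetical, and a real mutation-testing tool would generate the mutants automatically.

```typescript
type Offset = (page: number, size: number) => number;

const original: Offset = (page, size) => page * size;
const mutant: Offset = (page, size) => page + size; // operator mutated: * became +

// Weak test: 2 * 2 and 2 + 2 are both 4, so the mutant survives.
const weakTest = (f: Offset): boolean => f(2, 2) === 4;

// Strong test: 3 * 10 is 30 but 3 + 10 is 13, so the mutant is killed.
const strongTest = (f: Offset): boolean => f(3, 10) === 30;

// A mutant is "killed" when a test passes on the original and fails on the mutant.
function kills(test: (f: Offset) => boolean, orig: Offset, mut: Offset): boolean {
  return test(orig) && !test(mut);
}
```

Both tests give the same line coverage, yet only one of them would notice the bug. That gap is what the mutation check measures.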
114
+ ## References
115
+
116
+ - defined.net/blog/modern-frontend-testing — Vitest + Storybook + Playwright stack
117
+ - playwright.dev/docs/best-practices — accessibility-tree assertions
118
+ - fast-check.dev — property-based testing in TS/JS
119
+ - hypothesis.works — property-based testing in Python (equivalent concepts)
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@neikyun/ciel",
3
- "version": "6.10.1",
3
+ "version": "6.11.1",
4
4
  "description": "Ciel — Deep-reasoning pipeline for LLM-assisted development. OpenCode plugin + multi-platform CLI (OpenCode, Claude Code, more).",
5
5
  "main": "./dist/plugin/index.js",
6
6
  "types": "./dist/plugin/index.d.ts",