@sylphx/flow 2.16.0 → 2.16.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/CHANGELOG.md CHANGED
@@ -1,5 +1,11 @@
1
1
  # @sylphx/flow
2
2
 
3
+ ## 2.16.1 (2025-12-17)
4
+
5
+ ### ♻️ Refactoring
6
+
7
+ - **commands:** simplify /continue and /review - think, don't checklist ([754eec1](https://github.com/SylphxAI/flow/commit/754eec1211719fc68f25ce47510e5797a33e1469))
8
+
3
9
  ## 2.16.0 (2025-12-17)
4
10
 
5
11
  ### ✨ Features
@@ -1,160 +1,98 @@
1
1
  ---
2
2
  name: continue
3
- description: Continue incomplete work - finish features, fix bugs, complete TODOs
3
+ description: Continue incomplete work - find gaps, finish features, fix what's broken
4
4
  agent: coder
5
5
  ---
6
6
 
7
- # Continue Incomplete Work
7
+ # Continue
8
8
 
9
- Scan codebase for incomplete work. Prioritize. Finish.
9
+ Find what's incomplete. Finish it.
10
10
 
11
11
  ## Mandate
12
12
 
13
- * Perform a **deep, thorough scan** of incomplete work in this codebase.
14
- * **Delegate to multiple workers** to research different aspects in parallel; you act as the **final gate** to synthesize and verify quality.
15
- * **Research then Act**: understand full scope first, then **implement fixes directly**. Don't just report finish.
16
- * **Single-pass delivery**: no deferrals; deliver complete implementation.
17
- * **Be thorough**: incomplete work hides in comments, stubs, error messages, and "temporary" solutions.
18
-
19
- ## Discovery Approach
20
-
21
- ### Phase 1: Code Analysis
22
- Scan for explicit incomplete markers:
23
- - `TODO`, `FIXME`, `XXX`, `HACK`, `BUG`, `@todo`
24
- - `NotImplementedError`, `throw new Error('not implemented')`
25
- - Stub implementations (hardcoded returns, empty catches, `pass`)
26
- - `test.skip`, `it.skip`, empty test files
27
- - Commented-out code, debug statements
28
-
29
- ### Phase 2: Role-Based Simulation
30
-
31
- **👤 User Perspective** — Walk through every user-facing flow:
32
- - Onboarding: Can a new user complete setup without confusion?
33
- - Happy path: Does the core feature work end-to-end?
34
- - Error states: What happens when things go wrong? Are messages helpful?
35
- - Edge cases: Empty states, first-time use, account limits, expired sessions
36
- - Accessibility: Keyboard nav, screen readers, color contrast
37
- - Mobile/responsive: Does it work on all devices?
38
-
39
- **🔧 Developer Perspective** — Evaluate DX and maintainability:
40
- - Setup: Can someone clone and run in < 5 minutes?
41
- - Documentation: Are APIs documented? Examples provided?
42
- - Error messages: Do stack traces help identify root cause?
43
- - Debugging: Are there logs at appropriate levels?
44
- - Testing: Can tests run locally? Are mocks available?
45
- - Dependencies: Any outdated, deprecated, or vulnerable packages?
46
-
47
- **🛡️ Admin/Ops Perspective** — Consider operational readiness:
48
- - Monitoring: Can you tell if the system is healthy?
49
- - Logging: Is there enough info to debug production issues?
50
- - Configuration: Can settings be changed without code deploy?
51
- - Backup/Recovery: Can data be restored if something fails?
52
- - Security: Are admin actions audited? Permissions enforced?
53
- - Scaling: What happens under 10x load?
54
-
55
- ### Phase 3: Scenario Simulation
56
-
57
- Run through these scenarios mentally or via code paths:
58
-
59
- | Scenario | Questions to Answer |
60
- |----------|---------------------|
61
- | New user signup | Every step works? Validation clear? Email sent? |
62
- | Returning user login | Session handling? Password reset works? |
63
- | Core action fails | Error shown? User knows what to do? Data preserved? |
64
- | Network offline | Graceful degradation? Retry logic? |
65
- | Concurrent users | Race conditions? Locks? Optimistic updates? |
66
- | Bad actor attempts | Input sanitized? Rate limited? Logged? |
67
- | Admin intervention | Can support help user? Audit trail exists? |
68
-
69
- ## Execution Process
70
-
71
- 1. **Parallel Discovery** (delegate to workers):
72
- - Worker 1: Code markers & stubs (grep TODO, FIXME, placeholders)
73
- - Worker 2: User journey simulation (trace main flows, find dead ends)
74
- - Worker 3: Developer experience audit (setup, docs, error messages)
75
- - Worker 4: Ops readiness check (logging, monitoring, config)
76
- - Worker 5: Test coverage & edge cases (skipped tests, missing scenarios)
77
-
78
- 2. **Synthesize & Prioritize**:
79
- - Collect all findings
80
- - Group by severity: Critical → High → Medium → Low
81
- - Critical: Security issues, data loss risks, broken features
82
- - High: User-facing bugs, incomplete core features
83
- - Medium: Code quality, missing tests
84
- - Low: Documentation, style issues
85
-
86
- 3. **Implement Fixes**:
87
- - Start with Critical items
88
- - Complete each fix fully before moving on
89
- - Run tests after each significant change
90
- - Commit atomically per logical fix
91
-
92
- 4. **Deep Dive with /review** (when needed):
93
- If issues cluster in a specific domain, invoke `/review <domain>` for thorough analysis:
94
-
95
- | Domain | When to Invoke |
96
- |--------|----------------|
97
- | `auth` | Auth flow issues, session bugs, SSO problems |
98
- | `security` | Validation gaps, injection risks, secrets exposure |
99
- | `billing` | Payment bugs, subscription issues, webhook failures |
100
- | `performance` | Slow pages, bundle bloat, unnecessary re-renders |
101
- | `database` | Schema issues, missing indexes, N+1 queries |
102
- | `observability` | Missing logs, no alerts, debugging blind spots |
103
- | `i18n` | Hardcoded strings, locale issues, RTL bugs |
104
-
105
- Full domain list: auth, account-security, privacy, billing, pricing, ledger, security, trust-safety, uiux, seo, pwa, performance, i18n, database, data-architecture, storage, observability, operability, delivery, growth, referral, support, admin, discovery, code-quality
106
-
107
- 5. **Loop: Re-invoke /continue**:
108
- After completing fixes, **invoke `/continue` again** to:
109
- - Verify fixes didn't introduce new issues
110
- - Discover issues that were hidden by previous bugs
111
- - Continue until no Critical/High items remain
112
-
113
- **Exit condition**: No Critical or High severity items found.
114
-
115
- ## Output Format
116
-
117
- ### Discovery Summary
118
- ```
119
- ## Incomplete Work Found
13
+ * **Think, don't checklist.** Understand the project first. What is it trying to do? What would "done" look like?
14
+ * **Delegate workers** for parallel research. You synthesize and verify.
15
+ * **Fix, don't report.** Implement solutions directly.
16
+ * **One pass.** No deferrals. Complete each fix fully.
120
17
 
121
- ### Critical (X items)
122
- - [ ] File:line - Description
18
+ ## How to Find Incomplete Work
123
19
 
124
- ### High (X items)
125
- - [ ] File:line - Description
20
+ Don't grep for TODO and call it done. Incomplete work hides in:
126
21
 
127
- ### Medium (X items)
128
- - [ ] File:line - Description
22
+ **What's explicitly unfinished** — Yes, scan for TODO/FIXME/HACK. But ask: why are they there? What was the person avoiding?
129
23
 
130
- ### Low (X items)
131
- - [ ] File:line - Description
132
- ```
24
+ **What's implicitly broken** — Code that "works" but:
25
+ - Fails silently (empty catch blocks, swallowed errors)
26
+ - Works only in happy path (no validation, no edge cases)
27
+ - Works but confuses users (unclear errors, missing feedback)
28
+ - Works but can't be debugged (no logs, no context)
29
+
30
+ **What's missing entirely** — Features referenced but not implemented. UI that leads nowhere. Promises in docs that code doesn't deliver.
31
+
32
+ ## The Real Test
33
+
34
+ For each part of the system, ask:
35
+
36
+ > "If I were a user trying to accomplish their goal, where would I get stuck?"
37
+
38
+ > "If this broke at 3am, could someone figure out why?"
39
+
40
+ > "If requirements changed tomorrow, what would be painful to modify?"
41
+
42
+ > "If we had 100x traffic, what would fall over first?"
43
+
44
+ These questions reveal incompleteness that checklists miss.
45
+
46
+ ## Execution
47
+
48
+ 1. **Understand the project** — Read README, main entry points, core flows. What is this thing supposed to do?
49
+
50
+ 2. **Walk the critical paths** — Trace actual user journeys through code. Where does the path get uncertain, error-prone, or incomplete?
51
+
52
+ 3. **Find the gaps** — Not just TODOs, but:
53
+ - Dead ends (code that starts something but doesn't finish)
54
+ - Weak spots (minimal implementation that will break under pressure)
55
+ - Missing pieces (what's referenced but doesn't exist)
56
+
57
+ 4. **Prioritize by impact** — What blocks users? What causes data loss? What makes debugging impossible? Fix those first.
58
+
59
+ 5. **Fix completely** — Don't patch. Understand root cause. Implement proper solution. Test it works.
60
+
61
+ ## When to Go Deeper
62
+
63
+ If issues cluster in a domain, invoke `/review <domain>` for thorough analysis:
133
64
 
134
- ### Progress Updates
135
65
  ```
136
- Fixed: [description] (file:line)
137
- Blocked: [description] - needs [reason]
138
- Deep dive: invoking /review [domain]
139
- Loop: re-invoking /continue
66
+ /review auth — Auth flow issues
67
+ /review security — Validation gaps, injection risks
68
+ /review database — Schema issues, missing indexes
69
+ /review performance Slow paths, bundle bloat
140
70
  ```
141
71
 
142
- ### Completion
143
- ```
144
- ## Continue Complete
72
+ Full list: auth, account-security, privacy, billing, pricing, ledger, security, trust-safety, uiux, seo, pwa, performance, i18n, database, data-architecture, storage, observability, operability, delivery, growth, referral, support, admin, discovery, code-quality
73
+
74
+ ## Loop
75
+
76
+ After fixing, ask: "Did my fixes introduce new gaps? Did fixing X reveal Y was also broken?"
145
77
 
146
- X issues fixed
147
- ⚠ X issues need manual intervention
148
- Invoked /review for: [domains]
78
+ If yes → continue. If no Critical/High issues remain → done.
79
+
80
+ ## Output
149
81
 
150
- Next: [/continue again | no further action needed]
151
82
  ```
83
+ ## What I Found
84
+
85
+ [Describe the gaps discovered — not a checklist, but an understanding of what's incomplete and why]
86
+
87
+ ## What I Fixed
88
+
89
+ - [Description of fix and why it matters]
152
90
 
153
- ## Driving Questions
91
+ ## What Remains
154
92
 
155
- 1. **User**: What flows break or confuse real users? Where do they get stuck?
156
- 2. **Developer**: What would frustrate someone onboarding to this codebase?
157
- 3. **Admin/Ops**: What would make a 3am incident harder to debug?
158
- 4. **Security**: What incomplete validation could be exploited?
159
- 5. **Quality**: What "temporary" solutions became permanent debt?
160
- 6. **Scale**: What works now but will break at 10x growth?
93
+ - [Issues that need human decision or are blocked]
94
+
95
+ ## Next
96
+
97
+ [/continue again | /review <domain> | done]
98
+ ```
@@ -1,94 +1,68 @@
1
1
  ---
2
2
  name: review
3
- description: Review codebase by domain - usage: /review auth, /review billing, etc.
3
+ description: Review codebase by domain - /review <what to review>
4
4
  agent: coder
5
5
  args:
6
- - name: domain
7
- description: Domain to review (auth, billing, security, etc.)
6
+ - name: topic
7
+ description: What to review (e.g., auth, security, billing, "the login flow", "why it's slow")
8
8
  required: true
9
9
  ---
10
10
 
11
11
  # Review: $ARGS
12
12
 
13
- Perform a focused review of the specified domain(s) in this codebase.
14
-
15
13
  ## Mandate
16
14
 
17
- * **Delegate to multiple workers** to research different aspects in parallel; you act as the **final gate** to synthesize and verify quality.
18
- * **Deep, thorough analysis**: don't skim understand root causes and systemic patterns.
19
- * **Review then Act**: identify issues, then **implement fixes directly**. Don't just report fix.
20
- * **Single-pass delivery**: no deferrals; deliver complete implementation.
21
- * **Explore beyond the spec**: identify gaps, inefficiencies, and opportunities the checklist doesn't cover.
22
-
23
- ## Guidelines Reference
24
-
25
- Based on the domain(s) requested, read the relevant guideline(s) below. You may read multiple guidelines in parallel if the review spans multiple domains.
26
-
27
- | Domain | Guideline | Description |
28
- |--------|-----------|-------------|
29
- | **Identity & Auth** |||
30
- | `auth` | `/guideline-auth` | Sign-in, SSO, passkeys, verification |
31
- | `account-security` | `/guideline-account-security` | MFA, session management, account recovery |
32
- | `privacy` | `/guideline-privacy` | Data handling, consent, GDPR/CCPA |
33
- | **Billing & Revenue** |||
34
- | `billing` | `/guideline-billing` | Stripe integration, webhooks, subscriptions |
35
- | `pricing` | `/guideline-pricing` | Pricing models, tiers, feature gating |
36
- | `ledger` | `/guideline-ledger` | Transaction records, audit trails, reconciliation |
37
- | **Security** |||
38
- | `security` | `/guideline-security` | OWASP, input validation, secrets management |
39
- | `trust-safety` | `/guideline-trust-safety` | Abuse prevention, rate limiting, fraud |
40
- | **Frontend & UX** |||
41
- | `uiux` | `/guideline-uiux` | Design system, accessibility, interactions |
42
- | `seo` | `/guideline-seo` | Meta tags, structured data, crawlability |
43
- | `pwa` | `/guideline-pwa` | Service workers, offline, installability |
44
- | `performance` | `/guideline-performance` | Core Web Vitals, bundle size, caching |
45
- | `i18n` | `/guideline-i18n` | Localization, routing, hreflang |
46
- | **Data** |||
47
- | `database` | `/guideline-database` | Schema design, indexes, migrations |
48
- | `data-architecture` | `/guideline-data-architecture` | Data models, relationships, integrity |
49
- | `storage` | `/guideline-storage` | File uploads, CDN, blob storage |
50
- | **Operations** |||
51
- | `observability` | `/guideline-observability` | Logging, metrics, tracing, alerts |
52
- | `operability` | `/guideline-operability` | Deployment, rollback, feature flags |
53
- | `delivery` | `/guideline-delivery` | CI/CD, testing, release process |
54
- | **Growth & Support** |||
55
- | `growth` | `/guideline-growth` | Onboarding, activation, retention |
56
- | `referral` | `/guideline-referral` | Referral programs, viral loops |
57
- | `support` | `/guideline-support` | Help systems, tickets, documentation |
58
- | **Admin & Discovery** |||
59
- | `admin` | `/guideline-admin` | Admin panel, RBAC, config management |
60
- | `discovery` | `/guideline-discovery` | Feature discovery, competitive analysis |
61
- | **Code Quality** |||
62
- | `code-quality` | `/guideline-code-quality` | Patterns, testing, maintainability |
63
-
64
- ## Execution
65
-
66
- 1. Parse the `$ARGS` to identify which domain(s) to review
67
- 2. Read the corresponding guideline file(s) — **read in parallel** if multiple domains
68
- 3. Use the guideline's Tech Stack, Non-Negotiables, Context, and Driving Questions to guide your review
69
- 4. Delegate workers to investigate different aspects simultaneously
70
- 5. Synthesize findings and implement fixes
71
-
72
- ## Multi-Domain Reviews
73
-
74
- You can review multiple domains at once:
75
- - `/review auth security` — Review both auth and security
76
- - `/review billing pricing ledger` — Full revenue stack review
77
- - `/review performance seo pwa` — Frontend optimization review
78
-
79
- When reviewing multiple domains, look for **cross-cutting concerns** and **gaps between domains**.
80
-
81
- ## Output Format
15
+ * **Understand first.** Don't treat guidelines as checklists absorb the principles, then apply judgment.
16
+ * **Think like the failure mode.** Security? Think like an attacker. Performance? Think like a slow network. Auth? Think like a confused user.
17
+ * **Delegate workers** for parallel research. You synthesize and verify.
18
+ * **Fix, don't report.** Implement solutions directly.
19
+
20
+ ## Available Guidelines
21
+
22
+ Read relevant guideline(s) based on what you're reviewing:
23
+
24
+ | Guideline | Domain |
25
+ |-----------|--------|
26
+ | `/guideline-auth` | Sign-in, SSO, passkeys, verification |
27
+ | `/guideline-account-security` | MFA, sessions, account recovery |
28
+ | `/guideline-privacy` | Data handling, consent, GDPR/CCPA |
29
+ | `/guideline-billing` | Stripe, webhooks, subscriptions |
30
+ | `/guideline-pricing` | Pricing models, tiers, feature gating |
31
+ | `/guideline-ledger` | Transactions, audit trails, reconciliation |
32
+ | `/guideline-security` | OWASP, validation, secrets |
33
+ | `/guideline-trust-safety` | Abuse prevention, rate limiting, fraud |
34
+ | `/guideline-uiux` | Design system, accessibility |
35
+ | `/guideline-seo` | Meta tags, structured data, crawlability |
36
+ | `/guideline-pwa` | Service workers, offline, installability |
37
+ | `/guideline-performance` | Core Web Vitals, bundle size, caching |
38
+ | `/guideline-i18n` | Localization, routing, hreflang |
39
+ | `/guideline-database` | Schema, indexes, migrations |
40
+ | `/guideline-data-architecture` | Data models, relationships, integrity |
41
+ | `/guideline-storage` | File uploads, CDN, blob storage |
42
+ | `/guideline-observability` | Logging, metrics, tracing, alerts |
43
+ | `/guideline-operability` | Deployment, rollback, feature flags |
44
+ | `/guideline-delivery` | CI/CD, testing, release process |
45
+ | `/guideline-growth` | Onboarding, activation, retention |
46
+ | `/guideline-referral` | Referral programs, viral loops |
47
+ | `/guideline-support` | Help systems, tickets, documentation |
48
+ | `/guideline-admin` | Admin panel, RBAC, config |
49
+ | `/guideline-discovery` | Feature discovery, competitive analysis |
50
+ | `/guideline-code-quality` | Patterns, testing, maintainability |
51
+
52
+ ## Output
82
53
 
83
54
  ```
84
- ## Review: [domain(s)]
55
+ ## Review: [topic]
56
+
57
+ ### Understanding
58
+ [How this is implemented. Architecture, choices, tradeoffs.]
85
59
 
86
- ### Critical Issues
87
- - [ ] Issue description Fix implemented
60
+ ### Issues
61
+ [What's wrong and why it matters]
88
62
 
89
- ### Improvements Made
90
- - ✓ What was fixed/improved
63
+ ### Fixed
64
+ [Changes made]
91
65
 
92
- ### Recommendations
93
- - Future considerations (if can't fix now)
66
+ ### Remaining
67
+ [Needs human decision or blocked]
94
68
  ```
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@sylphx/flow",
3
- "version": "2.16.0",
3
+ "version": "2.16.1",
4
4
  "description": "One CLI to rule them all. Unified orchestration layer for Claude Code, OpenCode, Cursor and all AI development tools. Auto-detection, auto-installation, auto-upgrade.",
5
5
  "type": "module",
6
6
  "bin": {