npm - @sylphx/flow - Versions diffs - 2.16.0 → 2.16.1 - Mend

@sylphx/flow 2.16.0 → 2.16.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (4) hide show

package/CHANGELOG.md +6 -0
package/assets/slash-commands/continue.md +76 -138
package/assets/slash-commands/review.md +51 -77
package/package.json +1 -1

package/CHANGELOG.md CHANGED Viewed

@@ -1,5 +1,11 @@
 # @sylphx/flow
+## 2.16.1 (2025-12-17)
+### ♻️ Refactoring
+- **commands:** simplify /continue and /review - think, don't checklist ([754eec1](https://github.com/SylphxAI/flow/commit/754eec1211719fc68f25ce47510e5797a33e1469))
 ## 2.16.0 (2025-12-17)
 ### ✨ Features

package/assets/slash-commands/continue.md CHANGED Viewed

@@ -1,160 +1,98 @@
 ---
 name: continue
-description: Continue incomplete work - finish features, fix bugs, complete TODOs
+description: Continue incomplete work - find gaps, finish features, fix what's broken
 agent: coder
 ---
-# Continue Incomplete Work
+# Continue
-Scan codebase for incomplete work. Prioritize. Finish.
+Find what's incomplete. Finish it.
 ## Mandate
-* Perform a **deep, thorough scan** of incomplete work in this codebase.
-* **Delegate to multiple workers** to research different aspects in parallel; you act as the **final gate** to synthesize and verify quality.
-* **Research then Act**: understand full scope first, then **implement fixes directly**. Don't just report — finish.
-* **Single-pass delivery**: no deferrals; deliver complete implementation.
-* **Be thorough**: incomplete work hides in comments, stubs, error messages, and "temporary" solutions.
-## Discovery Approach
-### Phase 1: Code Analysis
-Scan for explicit incomplete markers:
-- `TODO`, `FIXME`, `XXX`, `HACK`, `BUG`, `@todo`
-- `NotImplementedError`, `throw new Error('not implemented')`
-- Stub implementations (hardcoded returns, empty catches, `pass`)
-- `test.skip`, `it.skip`, empty test files
-- Commented-out code, debug statements
-### Phase 2: Role-Based Simulation
-**👤 User Perspective** — Walk through every user-facing flow:
-- Onboarding: Can a new user complete setup without confusion?
-- Happy path: Does the core feature work end-to-end?
-- Error states: What happens when things go wrong? Are messages helpful?
-- Edge cases: Empty states, first-time use, account limits, expired sessions
-- Accessibility: Keyboard nav, screen readers, color contrast
-- Mobile/responsive: Does it work on all devices?
-**🔧 Developer Perspective** — Evaluate DX and maintainability:
-- Setup: Can someone clone and run in < 5 minutes?
-- Documentation: Are APIs documented? Examples provided?
-- Error messages: Do stack traces help identify root cause?
-- Debugging: Are there logs at appropriate levels?
-- Testing: Can tests run locally? Are mocks available?
-- Dependencies: Any outdated, deprecated, or vulnerable packages?
-**🛡️ Admin/Ops Perspective** — Consider operational readiness:
-- Monitoring: Can you tell if the system is healthy?
-- Logging: Is there enough info to debug production issues?
-- Configuration: Can settings be changed without code deploy?
-- Backup/Recovery: Can data be restored if something fails?
-- Security: Are admin actions audited? Permissions enforced?
-- Scaling: What happens under 10x load?
-### Phase 3: Scenario Simulation
-Run through these scenarios mentally or via code paths:
-| Scenario | Questions to Answer |
-|----------|---------------------|
-| New user signup | Every step works? Validation clear? Email sent? |
-| Returning user login | Session handling? Password reset works? |
-| Core action fails | Error shown? User knows what to do? Data preserved? |
-| Network offline | Graceful degradation? Retry logic? |
-| Concurrent users | Race conditions? Locks? Optimistic updates? |
-| Bad actor attempts | Input sanitized? Rate limited? Logged? |
-| Admin intervention | Can support help user? Audit trail exists? |
-## Execution Process
-1. **Parallel Discovery** (delegate to workers):
-   - Worker 1: Code markers & stubs (grep TODO, FIXME, placeholders)
-   - Worker 2: User journey simulation (trace main flows, find dead ends)
-   - Worker 3: Developer experience audit (setup, docs, error messages)
-   - Worker 4: Ops readiness check (logging, monitoring, config)
-   - Worker 5: Test coverage & edge cases (skipped tests, missing scenarios)
-2. **Synthesize & Prioritize**:
-   - Collect all findings
-   - Group by severity: Critical → High → Medium → Low
-   - Critical: Security issues, data loss risks, broken features
-   - High: User-facing bugs, incomplete core features
-   - Medium: Code quality, missing tests
-   - Low: Documentation, style issues
-3. **Implement Fixes**:
-   - Start with Critical items
-   - Complete each fix fully before moving on
-   - Run tests after each significant change
-   - Commit atomically per logical fix
-4. **Deep Dive with /review** (when needed):
-   If issues cluster in a specific domain, invoke `/review <domain>` for thorough analysis:
-   | Domain | When to Invoke |
-   |--------|----------------|
-   | `auth` | Auth flow issues, session bugs, SSO problems |
-   | `security` | Validation gaps, injection risks, secrets exposure |
-   | `billing` | Payment bugs, subscription issues, webhook failures |
-   | `performance` | Slow pages, bundle bloat, unnecessary re-renders |
-   | `database` | Schema issues, missing indexes, N+1 queries |
-   | `observability` | Missing logs, no alerts, debugging blind spots |
-   | `i18n` | Hardcoded strings, locale issues, RTL bugs |
-   Full domain list: auth, account-security, privacy, billing, pricing, ledger, security, trust-safety, uiux, seo, pwa, performance, i18n, database, data-architecture, storage, observability, operability, delivery, growth, referral, support, admin, discovery, code-quality
-5. **Loop: Re-invoke /continue**:
-   After completing fixes, **invoke `/continue` again** to:
-   - Verify fixes didn't introduce new issues
-   - Discover issues that were hidden by previous bugs
-   - Continue until no Critical/High items remain
-   **Exit condition**: No Critical or High severity items found.
-## Output Format
-### Discovery Summary
-```
-## Incomplete Work Found
+* **Think, don't checklist.** Understand the project first. What is it trying to do? What would "done" look like?
+* **Delegate workers** for parallel research. You synthesize and verify.
+* **Fix, don't report.** Implement solutions directly.
+* **One pass.** No deferrals. Complete each fix fully.
-### Critical (X items)
-- [ ] File:line - Description
+## How to Find Incomplete Work
-### High (X items)
-- [ ] File:line - Description
+Don't grep for TODO and call it done. Incomplete work hides in:
-### Medium (X items)
-- [ ] File:line - Description
+**What's explicitly unfinished** — Yes, scan for TODO/FIXME/HACK. But ask: why are they there? What was the person avoiding?
-### Low (X items)
-- [ ] File:line - Description
-```
+**What's implicitly broken** — Code that "works" but:
+- Fails silently (empty catch blocks, swallowed errors)
+- Works only in happy path (no validation, no edge cases)
+- Works but confuses users (unclear errors, missing feedback)
+- Works but can't be debugged (no logs, no context)
+**What's missing entirely** — Features referenced but not implemented. UI that leads nowhere. Promises in docs that code doesn't deliver.
+## The Real Test
+For each part of the system, ask:
+> "If I were a user trying to accomplish their goal, where would I get stuck?"
+> "If this broke at 3am, could someone figure out why?"
+> "If requirements changed tomorrow, what would be painful to modify?"
+> "If we had 100x traffic, what would fall over first?"
+These questions reveal incompleteness that checklists miss.
+## Execution
+1. **Understand the project** — Read README, main entry points, core flows. What is this thing supposed to do?
+2. **Walk the critical paths** — Trace actual user journeys through code. Where does the path get uncertain, error-prone, or incomplete?
+3. **Find the gaps** — Not just TODOs, but:
+   - Dead ends (code that starts something but doesn't finish)
+   - Weak spots (minimal implementation that will break under pressure)
+   - Missing pieces (what's referenced but doesn't exist)
+4. **Prioritize by impact** — What blocks users? What causes data loss? What makes debugging impossible? Fix those first.
+5. **Fix completely** — Don't patch. Understand root cause. Implement proper solution. Test it works.
+## When to Go Deeper
+If issues cluster in a domain, invoke `/review <domain>` for thorough analysis:
-### Progress Updates
 ```
-✓ Fixed: [description] (file:line)
-⚠ Blocked: [description] - needs [reason]
-→ Deep dive: invoking /review [domain]
-↻ Loop: re-invoking /continue
+/review auth        — Auth flow issues
+/review security    — Validation gaps, injection risks
+/review database    — Schema issues, missing indexes
+/review performance — Slow paths, bundle bloat
 ```
-### Completion
-```
-## Continue Complete
+Full list: auth, account-security, privacy, billing, pricing, ledger, security, trust-safety, uiux, seo, pwa, performance, i18n, database, data-architecture, storage, observability, operability, delivery, growth, referral, support, admin, discovery, code-quality
+## Loop
+After fixing, ask: "Did my fixes introduce new gaps? Did fixing X reveal Y was also broken?"
-✓ X issues fixed
-⚠ X issues need manual intervention
-→ Invoked /review for: [domains]
+If yes → continue. If no Critical/High issues remain → done.
+## Output
-Next: [/continue again | no further action needed]
 ```
+## What I Found
+[Describe the gaps discovered — not a checklist, but an understanding of what's incomplete and why]
+## What I Fixed
+- [Description of fix and why it matters]
-## Driving Questions
+## What Remains
-1. **User**: What flows break or confuse real users? Where do they get stuck?
-2. **Developer**: What would frustrate someone onboarding to this codebase?
-3. **Admin/Ops**: What would make a 3am incident harder to debug?
-4. **Security**: What incomplete validation could be exploited?
-5. **Quality**: What "temporary" solutions became permanent debt?
-6. **Scale**: What works now but will break at 10x growth?
+- [Issues that need human decision or are blocked]
+## Next
+[/continue again | /review <domain> | done]
+```

package/assets/slash-commands/review.md CHANGED Viewed

@@ -1,94 +1,68 @@
 ---
 name: review
-description: Review codebase by domain - usage: /review auth, /review billing, etc.
+description: Review codebase by domain - /review <what to review>
 agent: coder
 args:
-  - name: domain
-    description: Domain to review (auth, billing, security, etc.)
+  - name: topic
+    description: What to review (e.g., auth, security, billing, "the login flow", "why it's slow")
     required: true
 ---
 # Review: $ARGS
-Perform a focused review of the specified domain(s) in this codebase.
 ## Mandate
-* **Delegate to multiple workers** to research different aspects in parallel; you act as the **final gate** to synthesize and verify quality.
-* **Deep, thorough analysis**: don't skim — understand root causes and systemic patterns.
-* **Review then Act**: identify issues, then **implement fixes directly**. Don't just report — fix.
-* **Single-pass delivery**: no deferrals; deliver complete implementation.
-* **Explore beyond the spec**: identify gaps, inefficiencies, and opportunities the checklist doesn't cover.
-## Guidelines Reference
-Based on the domain(s) requested, read the relevant guideline(s) below. You may read multiple guidelines in parallel if the review spans multiple domains.
-| Domain | Guideline | Description |
-|--------|-----------|-------------|
-| **Identity & Auth** |||
-| `auth` | `/guideline-auth` | Sign-in, SSO, passkeys, verification |
-| `account-security` | `/guideline-account-security` | MFA, session management, account recovery |
-| `privacy` | `/guideline-privacy` | Data handling, consent, GDPR/CCPA |
-| **Billing & Revenue** |||
-| `billing` | `/guideline-billing` | Stripe integration, webhooks, subscriptions |
-| `pricing` | `/guideline-pricing` | Pricing models, tiers, feature gating |
-| `ledger` | `/guideline-ledger` | Transaction records, audit trails, reconciliation |
-| **Security** |||
-| `security` | `/guideline-security` | OWASP, input validation, secrets management |
-| `trust-safety` | `/guideline-trust-safety` | Abuse prevention, rate limiting, fraud |
-| **Frontend & UX** |||
-| `uiux` | `/guideline-uiux` | Design system, accessibility, interactions |
-| `seo` | `/guideline-seo` | Meta tags, structured data, crawlability |
-| `pwa` | `/guideline-pwa` | Service workers, offline, installability |
-| `performance` | `/guideline-performance` | Core Web Vitals, bundle size, caching |
-| `i18n` | `/guideline-i18n` | Localization, routing, hreflang |
-| **Data** |||
-| `database` | `/guideline-database` | Schema design, indexes, migrations |
-| `data-architecture` | `/guideline-data-architecture` | Data models, relationships, integrity |
-| `storage` | `/guideline-storage` | File uploads, CDN, blob storage |
-| **Operations** |||
-| `observability` | `/guideline-observability` | Logging, metrics, tracing, alerts |
-| `operability` | `/guideline-operability` | Deployment, rollback, feature flags |
-| `delivery` | `/guideline-delivery` | CI/CD, testing, release process |
-| **Growth & Support** |||
-| `growth` | `/guideline-growth` | Onboarding, activation, retention |
-| `referral` | `/guideline-referral` | Referral programs, viral loops |
-| `support` | `/guideline-support` | Help systems, tickets, documentation |
-| **Admin & Discovery** |||
-| `admin` | `/guideline-admin` | Admin panel, RBAC, config management |
-| `discovery` | `/guideline-discovery` | Feature discovery, competitive analysis |
-| **Code Quality** |||
-| `code-quality` | `/guideline-code-quality` | Patterns, testing, maintainability |
-## Execution
-1. Parse the `$ARGS` to identify which domain(s) to review
-2. Read the corresponding guideline file(s) — **read in parallel** if multiple domains
-3. Use the guideline's Tech Stack, Non-Negotiables, Context, and Driving Questions to guide your review
-4. Delegate workers to investigate different aspects simultaneously
-5. Synthesize findings and implement fixes
-## Multi-Domain Reviews
-You can review multiple domains at once:
-- `/review auth security` — Review both auth and security
-- `/review billing pricing ledger` — Full revenue stack review
-- `/review performance seo pwa` — Frontend optimization review
-When reviewing multiple domains, look for **cross-cutting concerns** and **gaps between domains**.
-## Output Format
+* **Understand first.** Don't treat guidelines as checklists — absorb the principles, then apply judgment.
+* **Think like the failure mode.** Security? Think like an attacker. Performance? Think like a slow network. Auth? Think like a confused user.
+* **Delegate workers** for parallel research. You synthesize and verify.
+* **Fix, don't report.** Implement solutions directly.
+## Available Guidelines
+Read relevant guideline(s) based on what you're reviewing:
+| Guideline | Domain |
+|-----------|--------|
+| `/guideline-auth` | Sign-in, SSO, passkeys, verification |
+| `/guideline-account-security` | MFA, sessions, account recovery |
+| `/guideline-privacy` | Data handling, consent, GDPR/CCPA |
+| `/guideline-billing` | Stripe, webhooks, subscriptions |
+| `/guideline-pricing` | Pricing models, tiers, feature gating |
+| `/guideline-ledger` | Transactions, audit trails, reconciliation |
+| `/guideline-security` | OWASP, validation, secrets |
+| `/guideline-trust-safety` | Abuse prevention, rate limiting, fraud |
+| `/guideline-uiux` | Design system, accessibility |
+| `/guideline-seo` | Meta tags, structured data, crawlability |
+| `/guideline-pwa` | Service workers, offline, installability |
+| `/guideline-performance` | Core Web Vitals, bundle size, caching |
+| `/guideline-i18n` | Localization, routing, hreflang |
+| `/guideline-database` | Schema, indexes, migrations |
+| `/guideline-data-architecture` | Data models, relationships, integrity |
+| `/guideline-storage` | File uploads, CDN, blob storage |
+| `/guideline-observability` | Logging, metrics, tracing, alerts |
+| `/guideline-operability` | Deployment, rollback, feature flags |
+| `/guideline-delivery` | CI/CD, testing, release process |
+| `/guideline-growth` | Onboarding, activation, retention |
+| `/guideline-referral` | Referral programs, viral loops |
+| `/guideline-support` | Help systems, tickets, documentation |
+| `/guideline-admin` | Admin panel, RBAC, config |
+| `/guideline-discovery` | Feature discovery, competitive analysis |
+| `/guideline-code-quality` | Patterns, testing, maintainability |
+## Output
 ```
-## Review: [domain(s)]
+## Review: [topic]
+### Understanding
+[How this is implemented. Architecture, choices, tradeoffs.]
-### Critical Issues
-- [ ] Issue description → Fix implemented
+### Issues
+[What's wrong and why it matters]
-### Improvements Made
-- ✓ What was fixed/improved
+### Fixed
+[Changes made]
-### Recommendations
-- Future considerations (if can't fix now)
+### Remaining
+[Needs human decision or blocked]
 ```

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
 	"name": "@sylphx/flow",
-	"version": "2.16.0",
+	"version": "2.16.1",
 	"description": "One CLI to rule them all. Unified orchestration layer for Claude Code, OpenCode, Cursor and all AI development tools. Auto-detection, auto-installation, auto-upgrade.",
 	"type": "module",
 	"bin": {