@sylphx/flow 2.10.0 → 2.11.0

package/CHANGELOG.md CHANGED
@@ -1,5 +1,11 @@
  # @sylphx/flow
 
+ ## 2.11.0 (2025-12-17)
+
+ ### ✨ Features
+
+ - **commands:** add tech stack and replace checklists with exploration questions ([0772b1d](https://github.com/SylphxAI/flow/commit/0772b1d788ddf2074729d04e67ef3d94b138de10))
+
  ## 2.10.0 (2025-12-17)
 
  Replace saas-* commands with 24 focused /review-* commands.
@@ -12,6 +12,12 @@ agent: coder
  * **Delegate to multiple workers** to research different aspects in parallel; you act as the **final gate** to synthesize and verify quality.
  * Deliverables must be stated as **findings, gaps, and actionable recommendations**.
  * **Single-pass delivery**: no deferrals; deliver a complete assessment.
+ * **Explore beyond the spec**: identify improvements for user safety and threat detection.
+
+ ## Tech Stack
+
+ * **Auth**: better-auth
+ * **Framework**: Next.js
 
  ## Review Scope
 
@@ -26,26 +32,17 @@ agent: coder
  * Key security event visibility (and export where applicable)
  * All server-enforced and auditable
 
- ### Security Event Management
-
- * Login attempts (success/failure) logged
- * Password changes logged
- * MFA enrollment/removal logged
- * Session creation/revocation logged
- * Suspicious activity detection
- * Security event notifications
-
  ### Recovery Governance
 
  * Account recovery flow secure
  * Support-assisted recovery with strict audit logging
  * Step-up verification for sensitive actions
 
- ## Verification Checklist
+ ## Key Areas to Explore
 
- - [ ] Session/device visibility exists
- - [ ] Session revocation works
- - [ ] MFA management available
- - [ ] Identity provider linking visible
- - [ ] Security events logged and visible
- - [ ] Recovery flow is secure and audited
+ * What visibility do users have into their active sessions and devices?
+ * How robust is the MFA implementation and enrollment flow?
+ * What security events are logged and how accessible are they to users?
+ * How does the account recovery flow prevent social engineering attacks?
+ * What step-up authentication exists for sensitive actions?
+ * How are compromised accounts detected and handled?
@@ -12,6 +12,13 @@ agent: coder
  * **Delegate to multiple workers** to research different aspects in parallel; you act as the **final gate** to synthesize and verify quality.
  * Deliverables must be stated as **findings, gaps, and actionable recommendations**.
  * **Single-pass delivery**: no deferrals; deliver a complete assessment.
+ * **Explore beyond the spec**: identify operational improvements and safety enhancements.
+
+ ## Tech Stack
+
+ * **Framework**: Next.js
+ * **API**: tRPC
+ * **Database**: Neon (Postgres)
 
  ## Review Scope
 
@@ -35,20 +42,17 @@ agent: coder
  ### Admin Analytics and Reporting (Mandatory)
 
  * Provide comprehensive dashboards/reports for business, growth, billing, referral, support, and security/abuse signals, governed by RBAC.
- * Reporting must be consistent with system-of-record truth and auditable when derived from privileged actions.
 
  ### Admin Operational Management (Mandatory)
 
  * Tools for user/account management, entitlements/access management, lifecycle actions, and issue resolution workflows.
- * Actions affecting access, money/credits, or security posture require appropriate step-up controls and must be fully auditable; reversibility must follow domain rules.
-
- ## Verification Checklist
-
- - [ ] RBAC implemented (least privilege)
- - [ ] Admin bootstrap secure (not file seeding)
- - [ ] Bootstrap disables after completion
- - [ ] Config management with history
- - [ ] Feature flags governed
- - [ ] Admin dashboards exist
- - [ ] Impersonation has safeguards
- - [ ] All admin actions audited
+ * Actions affecting access, money/credits, or security posture require appropriate step-up controls and must be fully auditable.
+
+ ## Key Areas to Explore
+
+ * How secure is the admin bootstrap process?
+ * What RBAC gaps exist that could lead to privilege escalation?
+ * How comprehensive is the audit logging for sensitive operations?
+ * What admin workflows are missing or painful?
+ * How does impersonation work and what safeguards exist?
+ * What visibility do admins have into system health and issues?
@@ -12,6 +12,12 @@ agent: coder
  * **Delegate to multiple workers** to research different aspects in parallel; you act as the **final gate** to synthesize and verify quality.
  * Deliverables must be stated as **findings, gaps, and actionable recommendations**.
  * **Single-pass delivery**: no deferrals; deliver a complete assessment.
+ * **Explore beyond the spec**: identify improvements for security, usability, and reliability.
+
+ ## Tech Stack
+
+ * **Auth**: better-auth
+ * **Framework**: Next.js
 
  ## Review Scope
 
@@ -27,21 +33,10 @@ agent: coder
  * **Email verification is mandatory** baseline for high-impact capabilities.
  * **Phone verification is optional** and used as risk-based step-up (anti-abuse, higher-trust flows, recovery); consent-aware and data-minimizing.
 
- ### Authentication Best Practices
-
- * Secure session management
- * Token rotation and expiry
- * Brute force protection
- * Account enumeration prevention
- * Secure password reset flow
- * Remember me functionality (secure)
-
- ## Verification Checklist
+ ## Key Areas to Explore
 
- - [ ] All SSO providers implemented
- - [ ] Missing provider secrets hide UI option
- - [ ] Multi-provider linking works safely
- - [ ] Passkeys (WebAuthn) supported
- - [ ] Email verification enforced
- - [ ] Phone verification optional/step-up
- - [ ] Session management secure
+ * How does the current auth implementation compare to best practices?
+ * What security vulnerabilities exist in the sign-in flows?
+ * How can the user experience be improved while maintaining security?
+ * What edge cases are not handled (account recovery, provider outages, etc.)?
+ * How does session management handle concurrent devices?
@@ -12,6 +12,12 @@ agent: coder
  * **Delegate to multiple workers** to research different aspects in parallel; you act as the **final gate** to synthesize and verify quality.
  * Deliverables must be stated as **findings, gaps, and actionable recommendations**.
  * **Single-pass delivery**: no deferrals; deliver a complete assessment.
+ * **Explore beyond the spec**: identify revenue leakage and billing reliability improvements.
+
+ ## Tech Stack
+
+ * **Payments**: Stripe
+ * **Workflows**: Upstash Workflows + QStash
 
  ## Review Scope
 
@@ -28,7 +34,7 @@ agent: coder
  * **Webhook trust is mandatory**: webhook origin must be verified (signature verification and replay resistance)
  * The Stripe **event id** must be used as the idempotency and audit correlation key
  * Unverifiable events must be rejected and must trigger alerting
- * **Out-of-order behavior must be explicit**: all webhook handlers must define and enforce a clear out-of-order strategy (event ordering is not guaranteed even for the same subscription)
+ * **Out-of-order behavior must be explicit**: all webhook handlers must define and enforce a clear out-of-order strategy
 
  ### State Machine
 
@@ -36,12 +42,11 @@ agent: coder
  * Handle: trial, past_due, unpaid, canceled, refund, dispute
  * UI only shows interpretable, non-ambiguous states
 
- ## Verification Checklist
+ ## Key Areas to Explore
 
- - [ ] Billing state machine defined
- - [ ] Webhook signature verification
- - [ ] Idempotent webhook handling (event id)
- - [ ] Out-of-order handling defined
- - [ ] All Stripe states mapped
- - [ ] UI reflects server-truth only
- - [ ] Refund/dispute handling works
+ * How robust is the webhook handling for all Stripe events?
+ * What happens when webhooks arrive out of order?
+ * How does the UI handle ambiguous billing states?
+ * What revenue leakage exists (failed renewals, dunning, etc.)?
+ * How are disputes and chargebacks handled end-to-end?
+ * What is the testing strategy for billing edge cases?
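
The webhook rules above (verify the signature, reject unverifiable events, key idempotency on the Stripe event id) translate into a handler along these lines. A minimal sketch using Stripe's Node SDK; `markProcessed` is a hypothetical store (e.g. a unique-keyed Postgres table) that returns false when the event id was already recorded:

```ts
import Stripe from "stripe";

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!);

// Hypothetical store keyed by event id; returns false on a duplicate.
declare function markProcessed(eventId: string): Promise<boolean>;

export async function handleStripeWebhook(rawBody: string, signature: string) {
  let event: Stripe.Event;
  try {
    // Signature verification: unverifiable events are rejected outright
    // (and should trigger alerting per the requirement above).
    event = stripe.webhooks.constructEvent(
      rawBody,
      signature,
      process.env.STRIPE_WEBHOOK_SECRET!,
    );
  } catch {
    throw new Error("unverifiable webhook");
  }

  // The event id is the idempotency and audit correlation key.
  if (!(await markProcessed(event.id))) return; // duplicate delivery: no-op

  // Dispatch on event.type; each handler must tolerate out-of-order arrival,
  // e.g. by comparing the event's created timestamp against stored state.
}
```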
@@ -12,6 +12,14 @@ agent: coder
  * **Delegate to multiple workers** to research different aspects in parallel; you act as the **final gate** to synthesize and verify quality.
  * Deliverables must be stated as **findings, gaps, and actionable recommendations**.
  * **Single-pass delivery**: no deferrals; deliver a complete assessment.
+ * **Explore beyond the spec**: identify maintainability issues and technical debt.
+
+ ## Tech Stack
+
+ * **Runtime**: Bun
+ * **Linting/Formatting**: Biome
+ * **Testing**: Bun test
+ * **Language**: TypeScript (strict)
 
  ## Review Scope
 
@@ -25,39 +33,12 @@ agent: coder
  * Precise naming; remove dead/unused code.
  * Upgrade all packages to latest stable; avoid deprecated patterns.
 
- ### Tooling Baseline
-
- * **Bun** for runtime
- * **Biome** for linting/formatting
- * **Bun test** for testing
- * Strict TypeScript
-
- ### CI Requirements
-
- * Biome lint/format passes
- * Strict TS typecheck passes
- * Unit + E2E tests pass
- * Build succeeds
-
- ### Code Quality Best Practices
-
- * Consistent code style
- * Meaningful variable/function names
- * Single responsibility principle
- * DRY (Don't Repeat Yourself)
- * SOLID principles
- * No circular dependencies
- * Reasonable file sizes
- * Clear module boundaries
-
- ## Verification Checklist
-
- - [ ] Biome lint passes
- - [ ] Biome format passes
- - [ ] TypeScript strict mode
- - [ ] No type errors
- - [ ] Tests exist and pass
- - [ ] Build succeeds
- - [ ] No TODOs/hacks
- - [ ] No dead code
- - [ ] Packages up to date
+ ## Key Areas to Explore
+
+ * What areas of the codebase have the most technical debt?
+ * Where are types weak or using `any` inappropriately?
+ * What test coverage gaps exist for critical paths?
+ * What code patterns are inconsistent across the codebase?
+ * What dependencies are outdated or have known vulnerabilities?
+ * Where do "god files" or overly complex modules exist?
+ * What naming inconsistencies make the code harder to understand?
@@ -12,6 +12,14 @@ agent: coder
  * **Delegate to multiple workers** to research different aspects in parallel; you act as the **final gate** to synthesize and verify quality.
  * Deliverables must be stated as **findings, gaps, and actionable recommendations**.
  * **Single-pass delivery**: no deferrals; deliver a complete assessment.
+ * **Explore beyond the spec**: identify improvements for consistency, reliability, and scalability.
+
+ ## Tech Stack
+
+ * **API**: tRPC
+ * **Framework**: Next.js
+ * **Database**: Neon (Postgres)
+ * **ORM**: Drizzle
 
  ## Review Scope
 
@@ -26,11 +34,10 @@ agent: coder
  * **Server-truth is authoritative**: UI state must never contradict server-truth. Where asynchronous confirmation exists, UI must represent that state unambiguously and remain explainable.
  * **Auditability chain is mandatory** for any high-value mutation: who/when/why, before/after state, and correlation to the triggering request/job/webhook must be recorded and queryable.
 
- ## Verification Checklist
+ ## Key Areas to Explore
 
- - [ ] Clear domain boundaries defined
- - [ ] Server enforcement for all authorization
- - [ ] Explicit consistency model documented
- - [ ] State machine for billing/entitlements exists
- - [ ] No dual-write violations
- - [ ] Auditability chain implemented
+ * How well are domain boundaries defined and enforced?
+ * Where does client-side trust leak into authorization decisions?
+ * What consistency guarantees exist and are they sufficient?
+ * How does the system handle eventual consistency edge cases?
+ * What would break if a webhook is processed out of order?
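
The auditability chain named above ("who/when/why, before/after state, and correlation") has a natural record shape. A sketch of what each high-value mutation might persist; the field names are illustrative, not the project's schema:

```ts
// Illustrative shape; written in the same transaction as the mutation itself
// so the audit record cannot be lost when the mutation commits.
type AuditRecord = {
  actorId: string;       // who performed the mutation
  occurredAt: string;    // when (ISO-8601 timestamp)
  reason: string;        // why, e.g. "support ticket" or "stripe webhook"
  entity: string;        // what was mutated, e.g. "subscription:sub_123"
  before: unknown;       // state prior to the mutation
  after: unknown;        // state after the mutation
  correlationId: string; // ties back to the triggering request/job/webhook
};
```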
@@ -12,6 +12,12 @@ agent: coder
  * **Delegate to multiple workers** to research different aspects in parallel; you act as the **final gate** to synthesize and verify quality.
  * Deliverables must be stated as **findings, gaps, and actionable recommendations**.
  * **Single-pass delivery**: no deferrals; deliver a complete assessment.
+ * **Explore beyond the spec**: identify improvements for performance, reliability, and maintainability.
+
+ ## Tech Stack
+
+ * **Database**: Neon (Postgres)
+ * **ORM**: Drizzle
 
  ## Review Scope
 
@@ -21,20 +27,11 @@ agent: coder
  * Deterministic, reproducible, environment-safe; linear/auditable history; no drift.
  * CI must fail if schema changes are not represented by migrations.
 
- ### Database Best Practices
-
- * Schema design follows normalization principles where appropriate
- * Indexes exist for commonly queried columns
- * Foreign key constraints are properly defined
- * No orphaned tables or columns
- * Proper use of data types (no stringly-typed data)
- * Timestamps use consistent timezone handling
-
- ## Verification Checklist
+ ## Key Areas to Explore
 
- - [ ] All migrations committed and complete
- - [ ] No schema drift detected
- - [ ] CI blocks on migration integrity
- - [ ] Indexes cover common queries
- - [ ] Foreign keys properly defined
- - [ ] Data types appropriate
+ * Is the schema well-designed for the domain requirements?
+ * Are there missing indexes that could improve query performance?
+ * How are database connections pooled and managed?
+ * What is the backup and disaster recovery strategy?
+ * Are there any N+1 query problems or inefficient access patterns?
+ * How does the schema handle soft deletes, auditing, and data lifecycle?
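
The N+1 question above is easiest to see in code. A sketch against Drizzle with the Neon serverless driver; the `users`/`posts` tables and `authorId` column are hypothetical:

```ts
import { neon } from "@neondatabase/serverless";
import { drizzle } from "drizzle-orm/neon-http";
import { eq, inArray } from "drizzle-orm";
import { users, posts } from "./schema"; // hypothetical tables

const db = drizzle(neon(process.env.DATABASE_URL!));

export async function loadFeed() {
  const allUsers = await db.select().from(users);

  // N+1 anti-pattern: one round trip per user.
  // for (const u of allUsers) {
  //   await db.select().from(posts).where(eq(posts.authorId, u.id));
  // }

  // Single batched query instead: one round trip for all authors.
  const ids = allUsers.map((u) => u.id);
  return db.select().from(posts).where(inArray(posts.authorId, ids));
}
```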
@@ -12,6 +12,14 @@ agent: coder
  * **Delegate to multiple workers** to research different aspects in parallel; you act as the **final gate** to synthesize and verify quality.
  * Deliverables must be stated as **findings, gaps, and actionable recommendations**.
  * **Single-pass delivery**: no deferrals; deliver a complete assessment.
+ * **Explore beyond the spec**: identify gaps in automated verification and release safety.
+
+ ## Tech Stack
+
+ * **CI**: GitHub Actions
+ * **Testing**: Bun test
+ * **Linting**: Biome
+ * **Platform**: Vercel
 
  ## Review Scope
 
@@ -19,34 +27,12 @@ agent: coder
 
  CI must block merges/deploys when failing:
 
- #### Code Quality Gates
- - [ ] Biome lint/format passes
- - [ ] Strict TS typecheck passes
- - [ ] Unit + E2E tests pass (Bun test)
- - [ ] Build succeeds
-
- #### Data Integrity Gates
- - [ ] Migration integrity checks pass
- - [ ] No schema drift
-
- #### i18n Gates
- - [ ] Missing translation keys fail build
- - [ ] `/en/*` returns 301 redirect
- - [ ] hreflang/x-default correct
- - [ ] Sitemap contains only true variants
-
- #### Performance Gates
- - [ ] Performance budget verification for key journeys
- - [ ] Core Web Vitals within thresholds
- - [ ] Release-blocking regression detection
-
- #### Security Gates
- - [ ] CSP/HSTS/security headers verified
- - [ ] CSRF protection tested
-
- #### Consent Gates
- - [ ] Analytics/marketing consent gating verified
- - [ ] Newsletter eligibility and firing rules tested
+ * Code Quality: Biome lint/format, strict TS typecheck, unit + E2E tests, build
+ * Data Integrity: Migration integrity checks, no schema drift
+ * i18n: Missing translation keys fail build, `/en/*` redirects, hreflang correct
+ * Performance: Budget verification, Core Web Vitals thresholds, regression detection
+ * Security: CSP/HSTS/headers verified, CSRF protection tested
+ * Consent: Analytics/marketing consent gating verified
 
  ### Automation Requirement
 
@@ -54,18 +40,18 @@ CI must block merges/deploys when failing:
 
  ### Configuration Gates
 
- * Build/startup must fail-fast when required configuration/secrets are missing or invalid for the target environment.
+ * Build/startup must fail-fast when required configuration/secrets are missing or invalid.
 
  ### Operability Gates
 
  * Observability and alerting configured for critical anomalies
- * Workflow dead-letter handling is operable, visible, and supports controlled replay
+ * Workflow dead-letter handling is operable and supports controlled replay
 
- ## Verification Checklist
+ ## Key Areas to Explore
 
- - [ ] All gates automated (no manual)
- - [ ] CI blocks on failures
- - [ ] Config fail-fast works
- - [ ] Operability gates met
- - [ ] No TODOs/hacks/workarounds
- - [ ] No dead/unused code
+ * What release gates are missing or insufficient?
+ * Where does manual verification substitute for automation?
+ * How fast is the CI pipeline and what slows it down?
+ * What flaky tests exist and how do they affect reliability?
+ * How does the deployment process handle rollbacks?
+ * What post-deployment verification exists?
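
The configuration gate above is commonly implemented by validating the environment at module load, so both builds and startup fail before serving traffic. A minimal sketch with zod; the variable names are examples only:

```ts
import { z } from "zod";

// Example schema: the variable names are illustrative.
const Env = z.object({
  DATABASE_URL: z.string().url(),
  STRIPE_SECRET_KEY: z.string().min(1),
  STRIPE_WEBHOOK_SECRET: z.string().min(1),
});

const parsed = Env.safeParse(process.env);
if (!parsed.success) {
  // Fail fast: refuse to build or boot with missing/invalid configuration.
  console.error("Invalid environment:", parsed.error.flatten().fieldErrors);
  process.exit(1);
}

export const env = parsed.data;
```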
@@ -12,6 +12,7 @@ agent: coder
  * **Delegate to multiple workers** to research different aspects in parallel; you act as the **final gate** to synthesize and verify quality.
  * Deliverables must be stated as **findings, gaps, and actionable recommendations**.
  * **Single-pass delivery**: no deferrals; deliver a complete assessment.
+ * **Explore beyond the spec**: this review IS the exploration—think broadly and creatively.
 
  ## Review Scope
 
@@ -21,29 +22,13 @@ agent: coder
  * **Pricing/monetization review**: packaging/entitlements, lifecycle semantics, legal/tax/invoice implications; propose conversion/churn improvements within constraints.
  * **Competitive research**: features, extensibility, guidance patterns, pricing/packaging norms; convert insights into testable acceptance criteria.
 
- ### Discovery Areas
+ ## Key Areas to Explore
 
  * What features are competitors offering that we lack?
- * What pricing models are common in the market?
- * What UX patterns are users expecting?
- * What integrations would add value?
- * What automation opportunities exist?
- * What self-service capabilities are missing?
-
- ### Analysis Framework
-
- * Market positioning
- * Feature gap analysis
- * Pricing benchmarking
- * User journey optimization
- * Technical debt opportunities
- * Scalability considerations
-
- ## Verification Checklist
-
- - [ ] Competitor analysis completed
- - [ ] Feature gaps identified
- - [ ] Pricing reviewed
- - [ ] UX patterns researched
- - [ ] Integration opportunities listed
- - [ ] Recommendations prioritized
+ * What pricing models are common in the market and how do we compare?
+ * What UX patterns are users expecting based on industry standards?
+ * What integrations would add significant value?
+ * What automation opportunities exist to reduce manual work?
+ * What self-service capabilities are users asking for?
+ * What technical capabilities could enable new business models?
+ * Where are the biggest gaps between current state and market expectations?
@@ -12,6 +12,12 @@ agent: coder
  * **Delegate to multiple workers** to research different aspects in parallel; you act as the **final gate** to synthesize and verify quality.
  * Deliverables must be stated as **findings, gaps, and actionable recommendations**.
  * **Single-pass delivery**: no deferrals; deliver a complete assessment.
+ * **Explore beyond the spec**: identify growth opportunities and conversion improvements.
+
+ ## Tech Stack
+
+ * **Analytics**: PostHog
+ * **Framework**: Next.js
 
  ## Review Scope
 
@@ -41,22 +47,11 @@ agent: coder
  * Monitored
  * Protected against regressions
 
- ### Growth Best Practices
-
- * Clear value proposition on landing
- * Frictionless signup
- * Time-to-value optimization
- * Activation milestones defined
- * Re-engagement campaigns
- * Churn prediction/prevention
- * NPS/satisfaction tracking
-
- ## Verification Checklist
-
- - [ ] Onboarding flow exists
- - [ ] Onboarding instrumented
- - [ ] Sharing mechanisms work
- - [ ] Sharing is consent-aware
- - [ ] Retention metrics tracked
- - [ ] Re-engagement exists
- - [ ] Growth metrics dashboarded
+ ## Key Areas to Explore
+
+ * What is the current activation rate and where do users drop off?
+ * How can time-to-value be reduced for new users?
+ * What viral mechanics exist and how effective are they?
+ * What retention patterns exist and what predicts churn?
+ * How does the product re-engage dormant users?
+ * What experiments could drive meaningful growth improvements?
@@ -12,6 +12,12 @@ agent: coder
  * **Delegate to multiple workers** to research different aspects in parallel; you act as the **final gate** to synthesize and verify quality.
  * Deliverables must be stated as **findings, gaps, and actionable recommendations**.
  * **Single-pass delivery**: no deferrals; deliver a complete assessment.
+ * **Explore beyond the spec**: identify improvements for coverage, quality, and user experience.
+
+ ## Tech Stack
+
+ * **i18n**: next-intl
+ * **Framework**: Next.js
 
  ## Review Scope
 
@@ -39,12 +45,10 @@ agent: coder
  * No indexable locale-prefixed duplicates unless primary content is truly localized; otherwise redirect to canonical.
  * Canonical/hreflang/sitemap must reflect only true localized variants.
 
- ## Verification Checklist
+ ## Key Areas to Explore
 
- - [ ] All locales supported
- - [ ] `/en/*` returns 301 redirect
- - [ ] Missing keys fail build
- - [ ] No hardcoded strings
- - [ ] Intl formatting used
- - [ ] UGC has single canonical URL
- - [ ] hreflang tags correct
+ * How complete and consistent are the translations across all locales?
+ * What user-facing strings are hardcoded and missing from localization?
+ * How does the routing handle edge cases (unknown locales, malformed URLs)?
+ * What is the translation workflow and how can it be improved?
+ * How does the system handle RTL languages if needed in the future?
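
The `/en/*` redirect invariant referenced throughout these commands fits naturally in Next.js middleware. A sketch, assuming English is the default (unprefixed) locale:

```ts
// middleware.ts
import { NextResponse, type NextRequest } from "next/server";

export function middleware(request: NextRequest) {
  const { pathname } = request.nextUrl;
  // /en and /en/* are duplicates of the canonical unprefixed URLs:
  // permanently redirect instead of serving indexable copies.
  if (pathname === "/en" || pathname.startsWith("/en/")) {
    const url = request.nextUrl.clone();
    url.pathname = pathname.replace(/^\/en/, "") || "/";
    return NextResponse.redirect(url, 301);
  }
  return NextResponse.next();
}

export const config = { matcher: ["/en", "/en/:path*"] };
```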
@@ -12,6 +12,13 @@ agent: coder
  * **Delegate to multiple workers** to research different aspects in parallel; you act as the **final gate** to synthesize and verify quality.
  * Deliverables must be stated as **findings, gaps, and actionable recommendations**.
  * **Single-pass delivery**: no deferrals; deliver a complete assessment.
+ * **Explore beyond the spec**: identify improvements for accuracy, auditability, and reconciliation.
+
+ ## Tech Stack
+
+ * **Payments**: Stripe
+ * **Database**: Neon (Postgres)
+ * **ORM**: Drizzle
 
  ## Review Scope
 
@@ -21,20 +28,11 @@ agent: coder
  * Deterministic precision (no floats), idempotent posting, concurrency safety, transactional integrity, and auditability are required.
  * Monetary flows must be currency-based and reconcilable with Stripe; credits (if used) must be governed as non-cash entitlements.
 
- ### Ledger Requirements
-
- * Append-only transaction log
- * Double-entry bookkeeping where applicable
- * Idempotent transaction posting (idempotency keys)
- * Concurrency-safe balance calculations
- * Audit trail for all mutations
- * Reconciliation with external systems (Stripe)
-
- ## Verification Checklist
+ ## Key Areas to Explore
 
- - [ ] Immutable ledger pattern used (not mutable balance)
- - [ ] No floating point for money
- - [ ] Idempotent posting implemented
- - [ ] Concurrency safety verified
- - [ ] Full audit trail exists
- - [ ] Reconciliation with Stripe possible
+ * Is there a balance/credits system and how is it implemented?
+ * If mutable balances exist, what are the risks and how to migrate to immutable ledger?
+ * How does the system handle concurrent transactions?
+ * What is the reconciliation process with Stripe?
+ * How are edge cases handled (refunds, disputes, partial payments)?
+ * What audit trail exists for financial mutations?
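
The precision and idempotency requirements above become concrete in the posting path: integer minor units, and a unique idempotency key that turns replays into no-ops. A sketch in plain SQL over a structural client type; the table and column names are illustrative:

```ts
// Minimal structural type so the sketch stands alone; in the real stack this
// would be the Drizzle/Neon client.
type Db = { query(sql: string, params: unknown[]): Promise<unknown> };

type LedgerEntry = {
  account: string;
  amountCents: bigint;    // integer minor units; never floats
  currency: string;       // e.g. "usd"
  idempotencyKey: string; // e.g. the Stripe event id
};

// Assumes an append-only ledger_entries table with a unique index on
// idempotency_key; ON CONFLICT DO NOTHING makes retries and replays no-ops.
export async function postEntry(db: Db, e: LedgerEntry) {
  await db.query(
    `INSERT INTO ledger_entries (account, amount_cents, currency, idempotency_key)
     VALUES ($1, $2, $3, $4)
     ON CONFLICT (idempotency_key) DO NOTHING`,
    [e.account, e.amountCents, e.currency, e.idempotencyKey],
  );
}
```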
@@ -12,6 +12,13 @@ agent: coder
  * **Delegate to multiple workers** to research different aspects in parallel; you act as the **final gate** to synthesize and verify quality.
  * Deliverables must be stated as **findings, gaps, and actionable recommendations**.
  * **Single-pass delivery**: no deferrals; deliver a complete assessment.
+ * **Explore beyond the spec**: identify blind spots and debugging improvements.
+
+ ## Tech Stack
+
+ * **Error Tracking**: Sentry
+ * **Analytics**: PostHog
+ * **Platform**: Vercel
 
  ## Review Scope
 
@@ -26,30 +33,11 @@ agent: coder
  * Abuse spikes
  * Drift detection
 
- ### Logging Best Practices
-
- * Structured JSON logs
- * Correlation ID in all logs
- * Request ID propagation
- * User context (anonymized where needed)
- * Error stack traces
- * Performance timing
- * No PII in logs
-
- ### Monitoring
-
- * Sentry for error tracking
- * PostHog for product analytics
- * Server metrics (latency, throughput, errors)
- * Database query performance
- * External service health
-
- ## Verification Checklist
-
- - [ ] Structured logging implemented
- - [ ] Correlation IDs propagate
- - [ ] Sentry configured
- - [ ] SLO/SLI defined
- - [ ] Alerts for critical failures
- - [ ] No PII in logs
- - [ ] Dashboards for key metrics
+ ## Key Areas to Explore
+
+ * How easy is it to debug a production issue end-to-end?
+ * What blind spots exist where errors go unnoticed?
+ * How effective are the current alerts (signal vs noise)?
+ * What SLOs/SLIs are defined and are they meaningful?
+ * How does log correlation work across async boundaries?
+ * What dashboards exist and do they answer the right questions?
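
The "log correlation across async boundaries" question above usually resolves to context propagation. A minimal Node sketch using AsyncLocalStorage so every log line carries the request's correlation id without threading it through call signatures:

```ts
import { AsyncLocalStorage } from "node:async_hooks";
import { randomUUID } from "node:crypto";

const requestContext = new AsyncLocalStorage<{ correlationId: string }>();

// Structured JSON log line; picks up the correlation id from ambient context.
export function log(
  level: "info" | "error",
  message: string,
  fields: Record<string, unknown> = {},
) {
  console.log(JSON.stringify({
    level,
    message,
    correlationId: requestContext.getStore()?.correlationId,
    time: new Date().toISOString(),
    ...fields,
  }));
}

// Wrap each request/job so all awaited work inherits the same id.
export function withCorrelation<T>(fn: () => Promise<T>): Promise<T> {
  return requestContext.run({ correlationId: randomUUID() }, fn);
}
```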
@@ -12,6 +12,13 @@ agent: coder
  * **Delegate to multiple workers** to research different aspects in parallel; you act as the **final gate** to synthesize and verify quality.
  * Deliverables must be stated as **findings, gaps, and actionable recommendations**.
  * **Single-pass delivery**: no deferrals; deliver a complete assessment.
+ * **Explore beyond the spec**: identify operational risks and reliability improvements.
+
+ ## Tech Stack
+
+ * **Workflows**: Upstash Workflows + QStash
+ * **Cache**: Upstash Redis
+ * **Platform**: Vercel
 
  ## Review Scope
 
@@ -21,7 +28,7 @@ agent: coder
  * Define controlled retries/backoff
  * **Dead-letter handling must exist and be observable and operable**
  * **Safe replay must be supported**
- * Side-effects (email/billing/ledger/entitlements) must be governed such that they are either proven effectively-once or safely re-entrant without duplicated economic/security consequences
+ * Side-effects (email/billing/ledger/entitlements) must be governed such that they are either proven effectively-once or safely re-entrant
 
  ### Drift Detection (Hard Requirement)
 
@@ -33,22 +40,11 @@ agent: coder
  * Define safe rollout posture with backward compatibility
  * Rollback expectations for billing/ledger/auth changes
 
- ### Operability Best Practices
-
- * Health check endpoints
- * Graceful shutdown
- * Circuit breakers for external services
- * Timeout configuration
- * Retry with exponential backoff
- * Bulk operation controls
- * Maintenance mode capability
-
- ## Verification Checklist
-
- - [ ] Async jobs idempotent
- - [ ] DLQ exists and observable
- - [ ] Safe replay supported
- - [ ] Drift detection implemented
- - [ ] Remediation playbooks exist
- - [ ] Rollback strategy defined
- - [ ] Health checks present
+ ## Key Areas to Explore
+
+ * How does the system handle job failures and retries?
+ * What happens to messages that fail permanently (DLQ)?
+ * How are operators notified of and able to resolve stuck workflows?
+ * What drift can occur between systems and how is it detected?
+ * How safe is the deployment process for critical paths?
+ * What runbooks exist for common operational issues?
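
The effectively-once requirement for side-effects can be enforced with a claim key, sketched here against Upstash Redis (`SET ... NX` claims the job id atomically); the key naming and TTL are illustrative:

```ts
import { Redis } from "@upstash/redis";

const redis = Redis.fromEnv();

// Run a side-effect at most once per job id. Replays return early; failures
// release the claim so the queue's retry/backoff can run the effect again.
export async function runOnce(jobId: string, effect: () => Promise<void>) {
  const claimed = await redis.set(`job:${jobId}:done`, "1", {
    nx: true,         // only set if the key does not already exist
    ex: 60 * 60 * 24, // keep the claim for a day (illustrative TTL)
  });
  if (claimed === null) return; // already claimed: replay is a no-op

  try {
    await effect();
  } catch (err) {
    await redis.del(`job:${jobId}:done`); // allow a retry to reclaim
    throw err;
  }
}
```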
@@ -12,6 +12,13 @@ agent: coder
  * **Delegate to multiple workers** to research different aspects in parallel; you act as the **final gate** to synthesize and verify quality.
  * Deliverables must be stated as **findings, gaps, and actionable recommendations**.
  * **Single-pass delivery**: no deferrals; deliver a complete assessment.
+ * **Explore beyond the spec**: identify bottlenecks and optimization opportunities.
+
+ ## Tech Stack
+
+ * **Framework**: Next.js (SSR/ISR/Static)
+ * **Platform**: Vercel
+ * **Tooling**: Bun
 
  ## Review Scope
 
@@ -23,34 +30,18 @@ agent: coder
  * Monitor Core Web Vitals and server latency
  * Alert on regressions
 
- ### Performance Budget Verification
-
- * **Performance budget verification** for key journeys (including Core Web Vitals-related thresholds) with release-blocking regression detection
-
- ### Core Web Vitals
+ ### Core Web Vitals Targets
 
  * LCP (Largest Contentful Paint) < 2.5s
  * FID (First Input Delay) < 100ms
  * CLS (Cumulative Layout Shift) < 0.1
  * INP (Interaction to Next Paint) < 200ms
 
- ### Performance Best Practices
-
- * Image optimization (WebP, lazy loading)
- * Code splitting
- * Tree shaking
- * Bundle size monitoring
- * Font optimization
- * Critical CSS
- * Preloading key resources
- * Server response time < 200ms
-
- ## Verification Checklist
-
- - [ ] Performance budgets defined
- - [ ] Core Web Vitals within targets
- - [ ] Regression detection active
- - [ ] Caching boundaries defined
- - [ ] Images optimized
- - [ ] Bundle size acceptable
- - [ ] No render-blocking resources
+ ## Key Areas to Explore
+
+ * What are the current Core Web Vitals scores and where do they fall short?
+ * Which pages or components are the biggest performance bottlenecks?
+ * How effective is the current caching strategy?
+ * What opportunities exist for code splitting and lazy loading?
+ * How does the bundle size compare to industry benchmarks?
+ * What database queries are slow and how can they be optimized?
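
Field measurement against the targets above is typically done with the `web-vitals` package, reporting from real sessions. A sketch; the `/api/vitals` endpoint is hypothetical:

```ts
import { onCLS, onINP, onLCP } from "web-vitals";

// Beacon each metric so the data survives page unload.
function report(metric: { name: string; value: number; id: string }) {
  navigator.sendBeacon("/api/vitals", JSON.stringify(metric));
}

onCLS(report);
onINP(report);
onLCP(report);
```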
@@ -12,6 +12,11 @@ agent: coder
  * **Delegate to multiple workers** to research different aspects in parallel; you act as the **final gate** to synthesize and verify quality.
  * Deliverables must be stated as **findings, gaps, and actionable recommendations**.
  * **Single-pass delivery**: no deferrals; deliver a complete assessment.
+ * **Explore beyond the spec**: identify monetization opportunities and pricing strategy improvements.
+
+ ## Tech Stack
+
+ * **Payments**: Stripe
 
  ## Review Scope
 
@@ -30,19 +35,11 @@ agent: coder
  * All actions must be governed by RBAC, step-up controls, and audit logs.
  * Stripe Dashboard is treated as monitoring/emergency access; non-admin Stripe changes must be detectable (drift), alertable, and remediable.
 
- ### Pricing Strategy
-
- * Clear tier differentiation
- * Feature gating by plan
- * Upgrade/downgrade paths defined
- * Proration handling
- * Trial-to-paid conversion flow
-
- ## Verification Checklist
+ ## Key Areas to Explore
 
- - [ ] Stripe is system-of-record
- - [ ] New prices don't mutate existing
- - [ ] Grandfathering policy implemented
- - [ ] Pricing admin exists with RBAC
- - [ ] Stripe Dashboard drift detectable
- - [ ] Upgrade/downgrade paths work
+ * How does the pricing model compare to competitors?
+ * What friction exists in the upgrade/downgrade paths?
+ * How is grandfathering implemented and communicated?
+ * What tools exist for pricing experimentation (A/B tests)?
+ * How are pricing changes rolled out safely?
+ * What analytics exist for pricing optimization decisions?
@@ -12,6 +12,14 @@ agent: coder
  * **Delegate to multiple workers** to research different aspects in parallel; you act as the **final gate** to synthesize and verify quality.
  * Deliverables must be stated as **findings, gaps, and actionable recommendations**.
  * **Single-pass delivery**: no deferrals; deliver a complete assessment.
+ * **Explore beyond the spec**: identify compliance gaps and privacy improvements.
+
+ ## Tech Stack
+
+ * **Analytics**: PostHog
+ * **Email**: Resend
+ * **Tag Management**: GTM (marketing only)
+ * **Observability**: Sentry
 
  ## Review Scope
 
@@ -32,19 +40,17 @@ agent: coder
  * Define deletion/deactivation semantics
  * Deletion propagation to third parties
  * Export where applicable
- * DR/backup posture aligned with retention
  * **Define data classification, retention periods, deletion propagation to third-party processors, and explicit exceptions** (legal/tax/anti-fraud)
- * Ensure external disclosures match behavior
 
  ### Behavioral Consistency
 
  * **Behavioral consistency is required**: policy and disclosures must match actual behavior across UI, data handling, logging/observability, analytics, support operations, and marketing tags; mismatches are release-blocking.
 
- ## Verification Checklist
+ ## Key Areas to Explore
 
- - [ ] Consent gates analytics/marketing
- - [ ] No tracking without consent
- - [ ] PII scrubbed from logs/Sentry/PostHog
- - [ ] Data deletion flow works
- - [ ] Deletion propagates to third parties
- - [ ] Privacy policy matches actual behavior
+ * Does the consent implementation actually block tracking before consent?
+ * Where does PII leak into logs, analytics, or error tracking?
+ * How does account deletion propagate to all third-party services?
+ * Does the privacy policy accurately reflect actual data practices?
+ * What data retention policies exist and are they enforced?
+ * How would the system handle a GDPR data subject access request?
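
The first question above ("does consent actually block tracking?") usually reduces to one configuration flag plus an explicit opt-in. A posthog-js sketch, assuming consent state is managed elsewhere in the app:

```ts
import posthog from "posthog-js";

// Start opted out: nothing is captured until the user grants consent.
posthog.init(process.env.NEXT_PUBLIC_POSTHOG_KEY!, {
  opt_out_capturing_by_default: true,
});

export function onConsentGranted() {
  posthog.opt_in_capturing(); // begin capturing only after explicit consent
}

export function onConsentRevoked() {
  posthog.opt_out_capturing();
}
```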
@@ -12,6 +12,12 @@ agent: coder
  * **Delegate to multiple workers** to research different aspects in parallel; you act as the **final gate** to synthesize and verify quality.
  * Deliverables must be stated as **findings, gaps, and actionable recommendations**.
  * **Single-pass delivery**: no deferrals; deliver a complete assessment.
+ * **Explore beyond the spec**: identify engagement opportunities and offline capabilities.
+
+ ## Tech Stack
+
+ * **Framework**: Next.js
+ * **Platform**: Vercel
 
  ## Review Scope
 
@@ -27,25 +33,11 @@ agent: coder
  * Authenticated and entitlement-sensitive routes must have explicit cache-control and SW rules
  * Must be validated by tests to prevent stale or unauthorized state exposure
 
- ### PWA Best Practices
-
- * Installable (meets PWA criteria)
- * Offline fallback page
- * App icons (all sizes)
- * Splash screen configured
- * Theme color defined
- * Start URL configured
- * Display mode appropriate
- * Cache versioning strategy
- * Cache invalidation on deploy
-
- ## Verification Checklist
-
- - [ ] Valid manifest.json
- - [ ] Service worker registered
- - [ ] SW doesn't cache auth content
- - [ ] Cache-control headers correct
- - [ ] Push notifications work (if applicable)
- - [ ] Installable on mobile
- - [ ] Offline fallback exists
- - [ ] Cache invalidation tested
+ ## Key Areas to Explore
+
+ * Does the PWA meet installation criteria on all platforms?
+ * What is the offline experience and how can it be improved?
+ * How does the service worker handle cache invalidation on deploys?
+ * What push notification capabilities exist and how are they used?
+ * Are there any caching bugs that expose stale or unauthorized content?
+ * How does the PWA experience compare to native app expectations?
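
The caching rule above (never serve authenticated content from the SW cache) comes down to the fetch handler. A service-worker sketch; the route prefixes are assumptions, not the app's actual paths:

```ts
// sw.ts (compiled with the "webworker" lib so FetchEvent is typed)
declare const self: ServiceWorkerGlobalScope;

self.addEventListener("fetch", (event) => {
  const url = new URL(event.request.url);

  // Never serve authenticated or entitlement-sensitive routes from cache.
  if (url.pathname.startsWith("/api/") || url.pathname.startsWith("/dashboard")) {
    event.respondWith(fetch(event.request));
    return;
  }

  // Everything else: cache-first with network fallback.
  event.respondWith(
    caches.match(event.request).then((cached) => cached ?? fetch(event.request)),
  );
});
```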
@@ -12,6 +12,12 @@ agent: coder
  * **Delegate to multiple workers** to research different aspects in parallel; you act as the **final gate** to synthesize and verify quality.
  * Deliverables must be stated as **findings, gaps, and actionable recommendations**.
  * **Single-pass delivery**: no deferrals; deliver a complete assessment.
+ * **Explore beyond the spec**: identify growth opportunities and fraud prevention improvements.
+
+ ## Tech Stack
+
+ * **Analytics**: PostHog
+ * **Database**: Neon (Postgres)
 
  ## Review Scope
 
@@ -34,21 +40,11 @@ agent: coder
  * Clawback conditions
  * Auditable manual review/appeal posture where applicable
 
- ### Referral Best Practices
-
- * Clear attribution window
- * Reward triggers well-defined
- * Double-sided rewards (referrer + referee)
- * Fraud detection signals
- * Admin visibility into referral chains
- * Automated clawback on refund/chargeback
-
- ## Verification Checklist
+ ## Key Areas to Explore
 
- - [ ] Attribution tracking works
- - [ ] Reward lifecycle defined
- - [ ] Velocity controls implemented
- - [ ] Device/account linkage checked
- - [ ] Clawback on fraud/refund
- - [ ] Admin can review referrals
- - [ ] Localized referral messaging
+ * How effective is the current referral program at driving growth?
+ * What fraud patterns have been observed and how are they mitigated?
+ * How does the attribution model handle edge cases (multiple touches, expired links)?
+ * What is the reward fulfillment process and where can it fail?
+ * How do users discover and share referral links?
+ * What analytics exist to measure referral program ROI?
@@ -12,6 +12,13 @@ agent: coder
  * **Delegate to multiple workers** to research different aspects in parallel; you act as the **final gate** to synthesize and verify quality.
  * Deliverables must be stated as **findings, gaps, and actionable recommendations**.
  * **Single-pass delivery**: no deferrals; deliver a complete assessment.
+ * **Explore beyond the spec**: identify vulnerabilities and hardening opportunities.
+
+ ## Tech Stack
+
+ * **Rate Limiting**: Upstash Redis
+ * **Framework**: Next.js
+ * **Platform**: Vercel
 
  ## Review Scope
 
@@ -31,23 +38,17 @@ agent: coder
  * Supply-chain hygiene
  * Measurable security
 
- ### Verification Requirements
-
- * **Security controls must be verifiable**: CSP/HSTS/security headers and CSRF (where applicable) must be covered by automated checks or security tests and included in release gates.
-
  ### Configuration and Secrets Governance
 
  * Required configuration must fail-fast at build/startup
  * Strict environment isolation (dev/stage/prod)
  * Rotation and incident remediation posture must be auditable and exercisable
 
- ## Verification Checklist
+ ## Key Areas to Explore
 
- - [ ] OWASP Top 10:2025 addressed
- - [ ] CSP headers configured
- - [ ] HSTS enabled
- - [ ] CSRF protection where needed
- - [ ] Rate limiting implemented
- - [ ] No plaintext passwords anywhere
- - [ ] MFA for admin roles
- - [ ] Security headers tested in CI
+ * What OWASP Top 10 vulnerabilities exist in the current implementation?
+ * How comprehensive are the security headers (CSP, HSTS, etc.)?
+ * Where is rate limiting missing or insufficient?
+ * How are secrets managed and what is the rotation strategy?
+ * What attack vectors exist for the authentication flows?
+ * How does the system detect and respond to security incidents?
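
For the rate-limiting question, the stack already names Upstash Redis, and the companion `@upstash/ratelimit` package is the usual shape. A sketch with an illustrative window of 10 requests per 10 seconds:

```ts
import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";

// Sliding window: at most 10 requests per 10 seconds per identifier.
const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  limiter: Ratelimit.slidingWindow(10, "10 s"),
});

// identifier is typically a user id or client IP; map failures to HTTP 429.
export async function enforceRateLimit(identifier: string) {
  const { success } = await ratelimit.limit(identifier);
  if (!success) throw new Error("rate_limited");
}
```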
@@ -12,6 +12,11 @@ agent: coder
  * **Delegate to multiple workers** to research different aspects in parallel; you act as the **final gate** to synthesize and verify quality.
  * Deliverables must be stated as **findings, gaps, and actionable recommendations**.
  * **Single-pass delivery**: no deferrals; deliver a complete assessment.
+ * **Explore beyond the spec**: identify improvements for discoverability and search rankings.
+
+ ## Tech Stack
+
+ * **Framework**: Next.js (SSR-first for indexable/discovery)
 
  ## Review Scope
 
@@ -36,25 +41,11 @@ agent: coder
  * UGC canonical redirects
  * Locale routing invariants
 
- ### SEO Best Practices
-
- * Unique titles per page
- * Meta descriptions present
- * Heading hierarchy (single H1)
- * Image alt text
- * Internal linking structure
- * Page speed optimization
- * Mobile-friendly
- * No duplicate content
-
- ## Verification Checklist
-
- - [ ] All pages have unique titles
- - [ ] Meta descriptions present
- - [ ] Open Graph tags complete
- - [ ] Canonical URLs correct
- - [ ] hreflang implemented
- - [ ] sitemap.xml exists and valid
- - [ ] robots.txt configured
- - [ ] schema.org markup present
- - [ ] SSR for indexable content
+ ## Key Areas to Explore
+
+ * How does the site perform in search engine results currently?
+ * What pages are missing proper metadata or structured data?
+ * How does the sitemap handle dynamic content and pagination?
+ * Are there duplicate content issues or canonicalization problems?
+ * What opportunities exist for featured snippets or rich results?
+ * How does page load performance affect SEO rankings?
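
Canonical URLs and hreflang alternates in the Next.js App Router are declared through the Metadata API. A sketch; the domain and locale list are placeholders, and only true localized variants should be declared, per the invariant above:

```ts
import type { Metadata } from "next";

// Placeholder URLs: declare only locales that are genuinely localized.
export const metadata: Metadata = {
  alternates: {
    canonical: "https://example.com/pricing",
    languages: {
      de: "https://example.com/de/pricing",
      fr: "https://example.com/fr/pricing",
    },
  },
};
```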
@@ -12,6 +12,12 @@ agent: coder
  * **Delegate to multiple workers** to research different aspects in parallel; you act as the **final gate** to synthesize and verify quality.
  * Deliverables must be stated as **findings, gaps, and actionable recommendations**.
  * **Single-pass delivery**: no deferrals; deliver a complete assessment.
+ * **Explore beyond the spec**: identify improvements for security, reliability, and cost efficiency.
+
+ ## Tech Stack
+
+ * **Storage**: Vercel Blob
+ * **Platform**: Vercel
 
  ## Review Scope
 
@@ -22,20 +28,11 @@ agent: coder
  * The server must validate the Blob URL/key ownership and namespace, and must match it against the originating upload intent (who/what/where/expiry/constraints) before attaching it to any resource.
  * The system must support safe retries and idempotent finalize; expired/abandoned intents must be cleanable and auditable.
 
- ### Storage Best Practices
-
- * No direct client-to-storage writes without server authorization
- * Upload size limits enforced server-side
- * File type validation (not just extension, actual content)
- * Malware scanning if applicable
- * Cleanup of orphaned/abandoned uploads
- * CDN caching strategy defined
-
- ## Verification Checklist
+ ## Key Areas to Explore
 
- - [ ] Intent-based upload flow implemented
- - [ ] Server-issued short-lived authorization
- - [ ] Server validates blob ownership before attach
- - [ ] Idempotent finalize endpoint
- - [ ] Abandoned uploads cleanable
- - [ ] File type/size validation server-side
+ * How are uploads currently implemented and do they follow intent-based pattern?
+ * What security vulnerabilities exist in the upload flow?
+ * How are abandoned/orphaned uploads handled?
+ * What is the cost implication of current storage patterns?
+ * How does the system handle large files and upload failures?
+ * What content validation (type, size, malware) exists?
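
The finalize step described above (validate intent ownership, namespace, and expiry before attaching a blob, idempotently) might look roughly like this. A sketch; the `UploadIntent` shape and field names are illustrative:

```ts
// Illustrative intent record, presumably persisted when the upload was authorized.
type UploadIntent = {
  id: string;
  userId: string;
  keyPrefix: string;   // namespace the client was authorized to write into
  expiresAt: number;   // epoch millis
  finalizedAt?: number;
};

export async function finalizeUpload(
  intent: UploadIntent,
  blobKey: string,
  callerUserId: string,
) {
  // Match the blob against the originating intent: who/where/expiry.
  if (intent.userId !== callerUserId) throw new Error("intent does not belong to caller");
  if (Date.now() > intent.expiresAt) throw new Error("intent expired");
  if (!blobKey.startsWith(intent.keyPrefix)) throw new Error("blob outside authorized namespace");

  // Idempotent finalize: a retried request is a no-op.
  if (intent.finalizedAt) return;
  intent.finalizedAt = Date.now();

  // ...attach blobKey to the target resource and write an audit record here.
}
```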
@@ -12,6 +12,12 @@ agent: coder
  * **Delegate to multiple workers** to research different aspects in parallel; you act as the **final gate** to synthesize and verify quality.
  * Deliverables must be stated as **findings, gaps, and actionable recommendations**.
  * **Single-pass delivery**: no deferrals; deliver a complete assessment.
+ * **Explore beyond the spec**: identify user satisfaction improvements and support efficiency gains.
+
+ ## Tech Stack
+
+ * **Email**: Resend
+ * **Framework**: Next.js
 
  ## Review Scope
 
@@ -27,29 +33,11 @@ agent: coder
  * Newsletter subscription/preferences must be consent-aware
  * Unsubscribe enforcement must be reliable
 
- ### Support Best Practices
-
- * Clear contact methods
- * Response time expectations
- * Help center / FAQ
- * Ticket tracking
- * Escalation paths
- * Support-assisted account recovery (with audit)
-
- ### Communications
-
- * Transactional emails (Resend)
- * Marketing emails (consent-gated)
- * In-app notifications
- * Push notifications (consent-gated)
- * Email templates localized
-
- ## Verification Checklist
+ ## Key Areas to Explore
 
- - [ ] Contact page discoverable
- - [ ] Contact page localized
- - [ ] WCAG AA compliant
- - [ ] Newsletter consent-aware
- - [ ] Unsubscribe works reliably
- - [ ] Transactional emails work
- - [ ] Support recovery audited
+ * How do users find help when they need it?
+ * What self-service support options exist (FAQ, help center)?
+ * How are support requests tracked and resolved?
+ * What is the email deliverability and engagement rate?
+ * How does the newsletter system handle bounces and complaints?
+ * What support tools do agents have for helping users?
@@ -12,6 +12,12 @@ agent: coder
  * **Delegate to multiple workers** to research different aspects in parallel; you act as the **final gate** to synthesize and verify quality.
  * Deliverables must be stated as **findings, gaps, and actionable recommendations**.
  * **Single-pass delivery**: no deferrals; deliver a complete assessment.
+ * **Explore beyond the spec**: identify usability improvements and design inconsistencies.
+
+ ## Tech Stack
+
+ * **Framework**: Next.js
+ * **Icons**: Iconify
 
  ## Review Scope
 
@@ -33,24 +39,11 @@ agent: coder
  * Localized and measurable
  * Governed by eligibility and frequency controls
 
- ### Design System Best Practices
-
- * Consistent spacing scale
- * Typography scale defined
- * Color tokens (semantic, not hardcoded)
- * Component library documented
- * Motion/animation guidelines
- * Error state patterns
- * Loading state patterns
- * Empty state patterns
-
- ## Verification Checklist
-
- - [ ] Design tokens used consistently
- - [ ] Dark/light theme works
- - [ ] WCAG AA verified
- - [ ] No CLS issues
- - [ ] Mobile-first responsive
- - [ ] No emoji in UI
- - [ ] Guidance present on key flows
- - [ ] Guidance is dismissible + re-entrable
+ ## Key Areas to Explore
+
+ * How consistent is the design system across the application?
+ * What accessibility issues exist and affect real users?
+ * Where do users get confused or drop off in key flows?
+ * How does the mobile experience compare to desktop?
+ * What guidance/onboarding is missing for complex features?
+ * How does the dark/light theme implementation handle edge cases?
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@sylphx/flow",
- "version": "2.10.0",
+ "version": "2.11.0",
  "description": "One CLI to rule them all. Unified orchestration layer for Claude Code, OpenCode, Cursor and all AI development tools. Auto-detection, auto-installation, auto-upgrade.",
  "type": "module",
  "bin": {