@loomfsm/bundle-code 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (81)

package/LICENSE +201 -0
package/agents/acceptance.md +141 -0
package/agents/api-contract.md +89 -0
package/agents/architect.md +52 -0
package/agents/challenger-reviewer.md +104 -0
package/agents/classifier.md +74 -0
package/agents/code-analyzer.md +43 -0
package/agents/context-doc-verifier.md +94 -0
package/agents/dependency-auditor.md +42 -0
package/agents/implementer.md +135 -0
package/agents/logic-reviewer.md +132 -0
package/agents/migration.md +55 -0
package/agents/performance.md +95 -0
package/agents/plan-conformance.md +127 -0
package/agents/plan-grounding-check.md +106 -0
package/agents/planner.md +143 -0
package/agents/playwright.md +68 -0
package/agents/research.md +52 -0
package/agents/security.md +88 -0
package/agents/style-reviewer.md +85 -0
package/agents/test.md +206 -0
package/agents/ui-consistency.md +75 -0
package/dist/manifest.d.ts +2 -0
package/dist/manifest.js +34 -0
package/dist/manifest.js.map +1 -0
package/dist/src/bundle.d.ts +2 -0
package/dist/src/bundle.js +424 -0
package/dist/src/bundle.js.map +1 -0
package/dist/src/index.d.ts +5 -0
package/dist/src/index.js +14 -0
package/dist/src/index.js.map +1 -0
package/dist/src/invariants.d.ts +10 -0
package/dist/src/invariants.js +208 -0
package/dist/src/invariants.js.map +1 -0
package/dist/src/policy-resolver.d.ts +2 -0
package/dist/src/policy-resolver.js +65 -0
package/dist/src/policy-resolver.js.map +1 -0
package/dist/src/sandbox-rules.d.ts +2 -0
package/dist/src/sandbox-rules.js +40 -0
package/dist/src/sandbox-rules.js.map +1 -0
package/dist/test/bundle.test.d.ts +1 -0
package/dist/test/bundle.test.js +289 -0
package/dist/test/bundle.test.js.map +1 -0
package/dist/test/sandbox-rules.test.d.ts +1 -0
package/dist/test/sandbox-rules.test.js +73 -0
package/dist/test/sandbox-rules.test.js.map +1 -0
package/knowledge/references/api-design.md +188 -0
package/knowledge/references/arch-patterns.md +106 -0
package/knowledge/references/caching.md +190 -0
package/knowledge/references/concurrency.md +195 -0
package/knowledge/references/db-postgres.md +153 -0
package/knowledge/references/e2e-flutter.md +56 -0
package/knowledge/references/e2e-playwright.md +53 -0
package/knowledge/references/error-handling.md +208 -0
package/knowledge/references/next-app-router.md +231 -0
package/knowledge/references/observability.md +169 -0
package/knowledge/references/optimization-strategy.md +197 -0
package/knowledge/references/perf-flutter.md +62 -0
package/knowledge/references/perf-nestjs.md +59 -0
package/knowledge/references/perf-python.md +50 -0
package/knowledge/references/perf-react.md +52 -0
package/knowledge/references/react19.md +176 -0
package/knowledge/references/redis.md +175 -0
package/knowledge/references/security-backend.md +219 -0
package/knowledge/references/test-flutter.md +65 -0
package/knowledge/references/test-nestjs.md +82 -0
package/knowledge/references/test-python.md +76 -0
package/knowledge/references/test-react.md +66 -0
package/knowledge/references/test-strategy.md +175 -0
package/knowledge/references/ui-flutter.md +56 -0
package/knowledge/references/ui-web.md +51 -0
package/package.json +34 -0
package/schemas/agent-feedback.schema.json +80 -0
package/schemas/category-vocab.json +170 -0
package/schemas/classifier-output.schema.json +53 -0
package/schemas/finding.schema.json +92 -0
package/schemas/pipeline-state.schema.json +238 -0
package/schemas/reviewer-output.schema.json +62 -0
package/schemas/state-extension.schema.json +53 -0
package/schemas/validator-output.schema.json +48 -0
package/stack-candidates.yaml +248 -0

package/knowledge/references/concurrency.md ADDED

@@ -0,0 +1,195 @@
+---
+tags: [concurrency, async, parallel, race-condition, atomicity, locks, retry]
+stack_signals: []
+summary: |
+  Concurrency design and race-condition reasoning — atomicity is bought, not
+  assumed. Patterns for Promise.all, async gather, queues, locks, and shared
+  state mutation.
+when_to_load: |
+  Task touches async functions, parallel work, queues, locks, atomic
+  operations, retry/timeout logic, request handlers under load, background
+  jobs, or race-condition-prone state mutations. Diff including Promise.all,
+  asyncio.gather, parallel HTTP calls, mutex/lock usage, or read-modify-write
+  patterns on shared state also qualifies.
+agent_hints: [challenger-reviewer, logic-reviewer, security]
+---
+# Concurrency — Senior Stance
+## When this applies
+Load when task touches: async functions, parallel work, queues, locks, atomic operations, retry/timeout logic, request handlers under load, background jobs, race-condition-prone state mutations. Reviewer (especially Challenger) auto-loads when diff includes `Promise.all`, `asyncio.gather`, parallel HTTP calls, mutex/lock usage, or any read-modify-write pattern on shared state.
+## Default Stance
+Concurrency bugs hide in dev (single user, single thread) and surface in prod (10k QPS). Default to "is there a race condition here?" before "does this look right?". Atomicity is bought, not assumed. Two operations are atomic only when explicitly composed atomically (transaction, single SQL statement, atomic CPU instruction, single Redis command). Everything else is racy.
+## Patterns (use these)
+### Atomic operations only via primitives
+- Database: single statement (`UPDATE … SET n = n + 1`) is atomic. Multi-statement requires a transaction with proper isolation.
+- Redis: single command is atomic. Multi-command needs MULTI or Lua script.
+- In-process: language-level atomics (`atomic.AddInt64`, `Mutex`, `synchronized`, etc.) — never your own flag-and-check loop.
+### Locking strategies
+- **Optimistic locking** — read with version, write `WHERE version = X`. Retry on failure. Best when contention is low.
+- **Pessimistic locking** — `SELECT … FOR UPDATE`. Blocks others. Best when contention is high but lock duration is short.
+- **Distributed lock** — Redis SET NX EX (single-node) for soft mutex. Postgres advisory lock for hard cross-process mutex. NEVER Redlock for hard mutex (see redis.md).
+### Idempotency over locking
+For external-facing operations, idempotency keys + DB unique constraints often beat distributed locks:
+- Client sends `Idempotency-Key`. Server records it with response. Replays return cached.
+- DB unique constraint prevents duplicates if multiple workers process the same job.
+- Cheaper than locks; failure mode is "rejected duplicate", not "deadlock".
+### Backpressure
+Every queue has a bounded size. Every worker pool has a bounded count. When full, callers must back off (429 / drop / slow-path) — NOT pile on. Without backpressure, queue grows unbounded → memory pressure → slow shutdowns → cascading outage.
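A minimal sketch of the bounded-queue rule above, assuming a hypothetical in-memory `BoundedQueue` class (the name and API are illustrative, not from the package). A full queue tells the caller to back off instead of growing:

```typescript
// Hypothetical bounded queue: rejects new work when full instead of growing.
class BoundedQueue<T> {
  private readonly items: T[] = [];
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Returns false when the queue is full so the caller can back off
  // (for example, respond with HTTP 429) rather than pile on.
  tryEnqueue(item: T): boolean {
    if (this.items.length >= this.capacity) return false;
    this.items.push(item);
    return true;
  }

  // Removes and returns the oldest item, or undefined when empty.
  dequeue(): T | undefined {
    return this.items.shift();
  }

  get size(): number {
    return this.items.length;
  }
}
```

A request handler would map a `false` return to 429 or a slow path; a durable queue would need the same cap enforced on the broker side.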
+### Timeouts everywhere
+Every external call (HTTP, DB query, Redis, RPC) has an explicit timeout. Default = "wait forever" = ticking time bomb.
+- HTTP client timeouts: connect, read, total. Set all three.
+- DB statement timeout (Postgres `statement_timeout`).
+- Test that the timeout actually fires (chaos engineering, fault injection).
+### Retry policy with jitter
+On transient failures (5xx, timeout, connection-reset):
+- Exponential backoff: `base * 2^attempt`.
+- Jitter: random 0-base added per attempt. Without jitter, retries from many clients synchronize → thundering herd.
+- Cap attempts: 3-5. Beyond that, escalate (DLQ, alert).
+- Don't retry non-idempotent operations without idempotency keys.
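The retry rules above can be sketched as follows. This is an illustration under stated assumptions: `backoffDelayMs`, `retryWithBackoff`, and the `isTransient` callback are hypothetical names, and the operation is assumed to be idempotent or protected by an idempotency key.

```typescript
// Delay before retrying after 0-based `attempt`: base * 2^attempt plus jitter in [0, base).
function backoffDelayMs(
  attempt: number,
  baseMs: number,
  random: () => number = Math.random,
): number {
  return baseMs * 2 ** attempt + random() * baseMs;
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Retries `op` while `isTransient(error)` is true, up to `maxAttempts` total tries.
async function retryWithBackoff<T>(
  op: () => Promise<T>,
  isTransient: (error: unknown) => boolean,
  maxAttempts = 4,
  baseMs = 100,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await op();
    } catch (error) {
      const lastAttempt = attempt + 1 >= maxAttempts;
      // Rethrow so the caller can escalate (DLQ, alert) once retries are exhausted.
      if (lastAttempt || !isTransient(error)) throw error;
      await sleep(backoffDelayMs(attempt, baseMs));
    }
  }
}
```

Injecting `random` keeps the delay calculation deterministic in tests; with `baseMs = 100` and a jitter draw of 0.5, attempt 3 waits 850 ms.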
+### Circuit breakers
+Wrap external dependencies in a circuit breaker:
+- **Closed:** normal traffic.
+- **Open:** fail fast (return cached/error) for N seconds. Set when error rate exceeds threshold.
+- **Half-open:** let one request through; if it succeeds, close.
+Saves the downstream from your retry storm during its outage.
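A minimal sketch of the three breaker states, assuming a hypothetical `CircuitBreaker` class. As a simplification it opens after a count of failures since the last success rather than on an error rate over a window, and the clock is injected so the transitions can be tested without waiting.

```typescript
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private current: BreakerState = "closed";
  private failures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly openMs: number;
  private readonly now: () => number;

  constructor(failureThreshold: number, openMs: number, now: () => number = Date.now) {
    this.failureThreshold = failureThreshold;
    this.openMs = openMs;
    this.now = now;
  }

  // Call before each request; false means fail fast without calling downstream.
  allowRequest(): boolean {
    if (this.current === "open" && this.now() - this.openedAt >= this.openMs) {
      this.current = "half-open"; // let exactly one probe request through
      return true;
    }
    return this.current === "closed";
  }

  recordSuccess(): void {
    this.current = "closed";
    this.failures = 0;
  }

  recordFailure(): void {
    this.failures += 1;
    // A failed probe reopens immediately; otherwise open at the threshold.
    if (this.current === "half-open" || this.failures >= this.failureThreshold) {
      this.current = "open";
      this.openedAt = this.now();
    }
  }

  getState(): BreakerState {
    return this.current;
  }
}
```

Callers wrap each dependency call: check `allowRequest()`, then report the outcome with `recordSuccess()` or `recordFailure()`.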
+### Single-writer principle
+For any piece of state, exactly one component writes; everyone else reads. Multi-writer state without coordination = bug pending.
+- DB row contention → queue + single worker per partition.
+- Cache invalidation → write-through from the same component that owns the source.
+- Filesystem mutation → owned by one process.
+### Lock ordering to prevent deadlocks
+If you must take multiple locks, always acquire them in the same global order. `lock(A) → lock(B)` everywhere; never `lock(B) → lock(A)` somewhere else.
+### Read-modify-write must be guarded
+```ts
+// BAD
+const v = await store.get(k);
+await store.put(k, v + 1);
+// GOOD (atomic)
+await store.increment(k);
+// OR
+await store.casUpdate(k, expectedVersion, v + 1);
+```
+## Anti-Patterns (DO NOT)
+### Lock then await external call
+```ts
+mutex.acquire();
+const result = await fetch(externalUrl); // 30s
+mutex.release();
+```
+**Why it bites:** lock held for entire external call duration. Other waiters block 30s. External slowdown = your service slowdown × concurrency.
+**Rule:** do work outside the lock. Lock only the critical section that touches shared state.
+### Concurrent retries without idempotency
+3 clients retry the same `POST /charge` after a timeout. 3 charges happen.
+**Rule:** idempotency keys are mandatory before allowing retries.
+### Unbounded `Promise.all` / `asyncio.gather`
+```ts
+const results = await Promise.all(items.map(item => fetchOne(item)));
+```
+With `items.length = 10000`, you fire 10000 concurrent requests. DB connection pool exhausted, target service rate-limited, OOM possible.
+**Rule:** bounded concurrency (`p-limit`, `asyncio.Semaphore`, worker pool with cap).
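A dependency-free sketch of bounded concurrency, for cases where `p-limit` is not available. `mapWithLimit` is a hypothetical helper name; it keeps at most `limit` calls in flight and preserves input order in the results.

```typescript
// Maps `items` through `worker` with at most `limit` calls in flight at once.
async function mapWithLimit<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;

  // Each runner claims the next unprocessed index until none remain.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      await worker(items[index], index).then((value) => {
        results[index] = value;
      });
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}
```

If any worker rejects, the returned promise rejects with that error; runners already in flight finish their current item, so callers that need cancellation would have to add it.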
+### Sleep-based "wait for" loops
+```ts
+while (!ready()) await sleep(100);
+```
+Polls with no exit condition, adds up to one polling interval of latency, and multiplies wakeups as waiters grow.
+**Rule:** event-driven (subscribe, await condition, future/promise resolved by notifier).
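One way to replace the polling loop is a one-shot signal that the notifier resolves. This is a sketch with a hypothetical `ReadySignal` class; the timeout is included so a waiter cannot hang forever.

```typescript
// One-shot readiness signal: waiters await a promise that the notifier resolves.
class ReadySignal {
  private resolveReady: () => void = () => {};
  private readonly ready: Promise<void>;

  constructor() {
    this.ready = new Promise<void>((resolve) => {
      this.resolveReady = resolve;
    });
  }

  // Called once by whoever makes the resource ready.
  notify(): void {
    this.resolveReady();
  }

  // Resolves when notified; rejects after `timeoutMs` so the wait is bounded.
  wait(timeoutMs: number): Promise<void> {
    return new Promise<void>((resolve, reject) => {
      const timer = setTimeout(
        () => reject(new Error("timed out waiting for ready")),
        timeoutMs,
      );
      this.ready.then(() => {
        clearTimeout(timer);
        resolve();
      });
    });
  }
}
```

Waiters run only when `notify()` fires, so there is no polling interval to tune.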
+### Shared mutable state across requests
+Module-level counter, cache map, "last user" reference.
+**Rule:** request-scoped state, OR explicit mutex/atomic, OR push to external store (Redis).
+### Async work fired and not awaited
+```ts
+async function handle(req) {
+  doExpensiveWorkInBackground(); // unawaited
+  return { ok: true };
+}
+```
+**Why it bites:** rejection unhandled, lifecycle untied to request, lambda may freeze before completion, no error propagation.
+**Rule:** if work is fire-and-forget, push to a real queue (with at-least-once semantics). Don't rely on event loop background tasks.
+### Default-no-timeout on HTTP/DB clients
+Many libraries default to no timeout. One slow downstream pins your worker forever.
+**Rule:** explicit timeouts at client construction. Reject implicit defaults.
+### Retrying on non-idempotent operations
+`POST /charge` failed → retry → second charge.
+**Rule:** retries only when (a) idempotency key in place, OR (b) operation is naturally idempotent (PUT with full state, DELETE).
+### Distributed transactions (2PC across services)
+"Both services must commit OR both rollback."
+**Why it bites:** 2PC has known failure modes (coordinator crashes), needs participant cooperation, has latency cost. Most "distributed transactions" in the wild are buggy.
+**Rule:** use saga pattern (compensating actions on failure) OR design so eventual consistency is acceptable.
+### `forEach` with async callback
+```ts
+items.forEach(async (i) => await save(i));
+```
+**Why it bites:** `forEach` doesn't await the promises — they all fire concurrently AND the function returns before completion. You think it's sequential; it isn't.
+**Rule:** `for...of` with await for sequential; `Promise.all` (with concurrency cap) for parallel.
+### Testing without concurrency
+Tests run single-threaded; bug never reproduces. Then prod has 100 QPS, race condition fires.
+**Rule:** test concurrent paths explicitly (parallel calls in a single test, fault injection).
+## Decision Framework
+| Situation | Choice |
+|---|---|
+| Counter incremented from many places | Atomic `INCR` (Redis) or DB `UPDATE … SET n = n + 1` |
+| Read-modify-write | Transaction with `SELECT … FOR UPDATE`, OR optimistic lock with version, OR idempotent rewrite |
+| Process N items in parallel | Bounded concurrency (semaphore, worker pool, `p-limit(N)`) |
+| External call inside critical section | Refactor: do call outside, lock only the shared-state mutation |
+| Retry on transient error | Exponential backoff + jitter + cap attempts |
+| Strong cross-service consistency | Saga pattern with compensating actions; avoid 2PC |
+| Mutex across replicas | Postgres advisory lock or single-node Redis lock; NOT Redlock for safety |
+| Background work after request | Real queue with persistence; not fire-and-forget Promise |
+| Long-poll vs WebSocket vs SSE | SSE for server-to-client streaming; WS for bidirectional; polling only when neither available |
+## Cost Model
+| Pattern | Cost when wrong |
+|---|---|
+| Lock around external call | 1 slow downstream → all locked-paths slow → cascading outage |
+| Unbounded parallel calls | DB pool exhaustion, target rate limit, OOM |
+| No timeout on HTTP client | One stuck request pins a worker forever |
+| Retry without jitter | Thundering herd amplifies downstream outage |
+| Read-modify-write race | Last-write-wins data corruption; silent until audit catches it |
+| Fire-and-forget background work | Up to 100% of those calls lost on lambda freeze / process restart |
+| Synchronization via sleep loop | Wastes CPU, scales to ~10 concurrent before degrading |
+## Red Flags in Diff
+- `Promise.all(arr.map(...))` where `arr.length` could be > 100 → flag (bounded concurrency needed).
+- New `setTimeout(fn, 0)` / `setImmediate` to "fix race condition" → flag (almost always wrong fix).
+- `await fetch(url)` without explicit timeout option → flag.
+- New retry loop without backoff/jitter/cap → flag.
+- `mutex.acquire()` holding across an `await` of an external call → flag.
+- `forEach(async ...)` → flag (doesn't await).
+- New module-level `let`/`var` mutated in a request handler → flag (shared mutable state).
+- Read-then-write on a row without transaction or version check → flag (race condition).
+- `try { await ... } catch {}` swallowing all errors silently → flag.
+- New background task fired with `void doWork()` or unawaited promise → flag.
+- New external HTTP/DB/Redis client constructed inline per request (not pooled) → flag.
+- "Retry forever" loop without exit condition → flag.
+- Distributed lock implemented as `setIfNotExists` then separate `expire` → flag (race window — see redis.md).
+- 2PC / "atomic across services" claim in plan or code → flag for saga refactor.

package/knowledge/references/db-postgres.md ADDED

@@ -0,0 +1,153 @@
+---
+tags: [postgres, sql, database, migrations, query-perf, n-plus-one, indexes, backend]
+stack_signals:
+  - project_type: [backend, monorepo]
+summary: |
+  PostgreSQL query and migration discipline — EXPLAIN ANALYZE before merging,
+  N+1 hunting, index design, migration safety on production-sized tables.
+when_to_load: |
+  Task touches SQL files, ORM schema (Prisma *.prisma, TypeORM entities,
+  SQLAlchemy models), migrations, raw queries, query builders, or DB
+  connection setup. Diff including *.sql, schema changes, or query-shape
+  changes qualifies.
+agent_hints: [logic-reviewer, performance, challenger-reviewer]
+---
+# PostgreSQL — Senior Stance
+## When this applies
+Load when the task touches: SQL files, ORM schema (Prisma `*.prisma`, TypeORM entities, SQLAlchemy models), migrations, raw queries, query builders, or DB connection setup. Reviewer auto-loads when diff includes `*.sql`, schema changes, or query-shape changes.
+## Default Stance
+Treat the DB as the slowest and most expensive component. Every query is until-proven-otherwise a potential N+1, missing index, or full table scan. EXPLAIN before merging anything that's not trivially indexed. Migrations on production-sized tables are operational events, not code changes — they are designed for rollback, then run.
+## Patterns (use these)
+### Always run EXPLAIN (ANALYZE) on new queries
+For any query touching > 1 table or > 10K rows. Look for:
+- `Seq Scan` on tables > 10K rows → missing index.
+- `Nested Loop` with high outer cardinality → index missing or wrong.
+- `Filter:` removing > 90% of rows → index doesn't cover the predicate.
+- Hash/Sort spilling to disk → query needs rewriting or work_mem tuning.
+### Index choice
+- **B-tree** — equality, range, ORDER BY. Default.
+- **Partial index** — `WHERE status = 'active'` predicate that hits 5% of rows. Massive win on storage and write cost.
+- **Composite index** — order matters. Leading column = most selective AND most often filtered.
+- **GIN** — JSONB containment, full-text, array containment.
+- **GiST** — geo, range types, similarity.
+- **BRIN** — append-only timestamp columns on huge tables.
+- Covering index (`INCLUDE`) — when query can be answered from index alone (index-only scan).
+### Transactions and isolation
+- Default `READ COMMITTED` is fine for most. Don't lower it without thinking.
+- `REPEATABLE READ` for read-modify-write that needs to see consistent snapshot. Note: serialization failures must be retried by the caller.
+- `SERIALIZABLE` for true correctness across rows but cost is real — only when needed.
+- Wrap multi-step writes in a transaction. Always.
+- Avoid long-running transactions: they hold locks AND prevent VACUUM from reclaiming dead tuples → table bloat.
+### Migration safety on big tables
+- Adding a NOT NULL column with a default → on PG ≥ 11 this is metadata-only when the default is a constant (non-volatile) expression, but **verify on the actual PG version** in use. On older versions, or with a volatile default, the whole table is rewritten → outage on >10M rows.
+- Adding an index → use `CREATE INDEX CONCURRENTLY` outside transaction. Plain `CREATE INDEX` locks writes for duration.
+- Dropping a column → two-phase: ignore in app first (deploy), then drop in next migration. Never drop and deploy together.
+- Renaming a column → never rename live. Add new, dual-write, backfill, switch reads, drop old. Multiple deploys.
+- Foreign key add on populated table → `NOT VALID` then `VALIDATE CONSTRAINT` separately. Validation still scans the table but does not block writes; a plain add holds a write-blocking lock for the whole check.
+### Connection pooling
+- Use a pool. Limit per-instance connections to (max_connections - reserved) / instance_count.
+- For serverless / function compute → use a connection pooler (PgBouncer in transaction mode, RDS Proxy, Supabase pooler). Direct connections from Lambda = burns through max_connections in seconds.
+- Pool size > 20 per instance is almost always wrong; tune with `pg_stat_activity` not by guessing.
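The per-instance limit above is simple arithmetic. A sketch with a hypothetical helper name, rounding down so the fleet never exceeds the server limit:

```typescript
// Upper bound on pool size per app instance:
// (max_connections - reserved) / instance_count, rounded down.
function maxPoolSizePerInstance(
  maxConnections: number,
  reserved: number,
  instanceCount: number,
): number {
  if (instanceCount <= 0) throw new RangeError("instanceCount must be positive");
  return Math.max(0, Math.floor((maxConnections - reserved) / instanceCount));
}
```

With `max_connections = 100`, 10 reserved for admin and migrations, and 6 instances, each instance may hold at most 15 connections. This is a ceiling, not a target; the right size still comes from `pg_stat_activity`.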
+### N+1 detection
+- ORM lazy-load in a loop is the canonical case.
+- Fix: explicit `include` / `select_related` / `JOIN`, or batch loader (DataLoader pattern).
+- Cost: 100 rows × 5ms per N+1 query = 500ms latency, instead of one 20ms join.
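A sketch of the batch fix, assuming hypothetical `Post` and `Author` shapes and a `fetchAuthorsByIds` data-access function that stands in for a single `WHERE id = ANY($1)` query. The loop body never touches the database:

```typescript
interface Author {
  id: number;
  name: string;
}

interface Post {
  id: number;
  authorId: number;
}

// Hypothetical data-access function: one query for all requested ids.
type FetchAuthorsByIds = (ids: number[]) => Promise<Author[]>;

async function attachAuthors(
  posts: Post[],
  fetchAuthorsByIds: FetchAuthorsByIds,
): Promise<Array<Post & { author: Author | undefined }>> {
  // Collect each author id once, then load them in a single round trip.
  const uniqueIds = Array.from(new Set(posts.map((post) => post.authorId)));
  const authors = await fetchAuthorsByIds(uniqueIds);

  const byId = new Map<number, Author>();
  for (const author of authors) byId.set(author.id, author);

  // Pure in-memory join; no query inside the loop.
  return posts.map((post) => ({ ...post, author: byId.get(post.authorId) }));
}
```

For 100 posts this issues one query instead of 100, which is the same effect an ORM `include` or a DataLoader achieves.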
+## Anti-Patterns (DO NOT)
+### `SELECT *` in production code
+**Why it bites:** schema evolves, app pulls bytes it doesn't need (network + memory), index-only scan unreachable, breaking change when adding sensitive column.
+**Rule:** explicit column list. Always.
+### Implicit casts in WHERE
+`WHERE id = $1` where `id` is `bigint` and `$1` is sent as `numeric` (or `WHERE id::text = '123'`): PG casts the column, so the plain index on `id` cannot be used. Worse on JSONB.
+**Rule:** match types in predicates, especially on indexed columns.
+### `OFFSET N` for pagination on big tables
+**Why it bites:** OFFSET 10000 = read and discard 10000 rows every page. O(N) per page request. Page 1000 = 10M row scans cumulative.
+**Rule:** keyset pagination — `WHERE id > $last_id ORDER BY id LIMIT 50`. Stable, indexable, scales.
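A sketch of building the keyset query with parameters only. The `orders` table and its `id` and `total` columns are hypothetical, and the `{ text, values }` return shape is an assumption chosen to match the query-config object that node-postgres accepts.

```typescript
interface SqlQuery {
  text: string;
  values: number[];
}

// Builds one page of a keyset-paginated query over a hypothetical `orders` table.
// `lastId` is the largest id seen on the previous page, or null for the first page.
function ordersPageQuery(lastId: number | null, pageSize: number): SqlQuery {
  if (pageSize < 1) throw new RangeError("pageSize must be at least 1");
  if (lastId === null) {
    return {
      text: "SELECT id, total FROM orders ORDER BY id LIMIT $1",
      values: [pageSize],
    };
  }
  return {
    text: "SELECT id, total FROM orders WHERE id > $1 ORDER BY id LIMIT $2",
    values: [lastId, pageSize],
  };
}
```

The client passes back the last `id` it received instead of a page number, so every page is an index range scan of `pageSize` rows regardless of depth.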
+### `COUNT(*)` on large filtered tables for "total pages"
+**Why it bites:** scans matching rows. Slow on > 1M rows.
+**Rule:** approximate counts (PG `pg_class.reltuples`), or "show next page exists" instead of "total count", or cached count.
+### `WHERE col IN (subquery returning millions)`
+**Why it bites:** PG builds hash of millions of rows. May spill to disk.
+**Rule:** rewrite as JOIN or EXISTS, or batch the outer query.
+### Long-running transaction holding locks
+**Why it bites:** blocks DDL, blocks VACUUM, can deadlock writers, table bloat. Especially in ORMs that auto-open transactions per request and a slow handler keeps it open.
+**Rule:** open transaction at last possible moment, commit at first possible moment. Never wait on external API inside a transaction.
+### `CREATE INDEX` (without CONCURRENTLY) on prod table > 1M rows
+**Why it bites:** takes a `SHARE` lock on the table for the whole index build. Reads continue, but every INSERT/UPDATE/DELETE blocks. Outage.
+**Rule:** always `CREATE INDEX CONCURRENTLY` on prod-sized tables. Run outside migration framework if needed.
+### Foreign key without index on referencing column
+**Why it bites:** every DELETE (or key UPDATE) on the parent must look up matching child rows. Without an index → full scan of the child table → lock contention.
+**Rule:** always index the FK side. ORMs don't always do this automatically — verify.
+### `TEXT` for unbounded user input without limit
+**Why it bites:** abuse vector. A single 50MB body kills row size, replication lag, query memory.
+**Rule:** explicit `VARCHAR(N)` or `CHECK (length(col) <= N)`.
+### JSONB as the schema
+Storing all data in `data jsonb` column to "avoid migrations".
+**Why it bites:** no FK, no constraints, can't index efficiently without GIN per query shape, debugging is harder, query planner can't optimize. JSONB is great for sparse/variant data; not as a substitute for schema.
+**Rule:** structured data → columns. Variant/sparse → JSONB.
+### Generated SQL with string concatenation
+**Why it bites:** SQL injection, query plan cache miss, parser overhead.
+**Rule:** parameterized queries. Always. ORMs do this; raw ``pg.query(`SELECT ... ${userInput}`)`` does not.
+### "Soft delete" everywhere via `deleted_at`
+**Why it bites:** every query needs `WHERE deleted_at IS NULL`. Forget once → leak. Indexes need partial (`WHERE deleted_at IS NULL`) or they index dead rows. Joins surface deleted rows unexpectedly.
+**Rule:** soft-delete only what truly needs audit/recovery; hard-delete the rest. If soft-deleted, partial-index everything.
+## Decision Framework
+| Situation | Choice | Why |
+|---|---|---|
+| New filter column on hot read path | Index. Composite if multiple filters together | Filter at index, not after fetch |
+| Pagination on big table | Keyset, not OFFSET | OFFSET is O(N) per page |
+| Multi-row update inside request handler | Transaction with `SELECT FOR UPDATE` | Prevent concurrent overwrite |
+| Table > 100M rows, time-series | Partition by time (monthly/daily) | Scans target one partition |
+| Adding NOT NULL column to prod | Add nullable → backfill → set NOT NULL | Avoid rewrite lock |
+| Need atomic counter | `UPDATE ... SET n = n + 1 RETURNING n` | Single statement is atomic |
+| Concurrent write contention on row | Queue + single worker, OR row-level lock with retry | Don't lock-spin on hot row |
+| Need consistent read across queries | Single transaction with `REPEATABLE READ` | Snapshot stability |
+## Cost Model (orders of magnitude)
+| Operation | Time |
+|---|---|
+| Index lookup (B-tree, cached) | 0.1ms |
+| Sequential scan, 1M rows | 100-500ms |
+| Sequential scan, 100M rows | 10-60s — usually unacceptable |
+| Index-only scan | 2-5x faster than index scan + heap fetch |
+| Single transaction commit (fsync) | 1-10ms |
+| FK check on indexed child | 0.1ms |
+| FK check on un-indexed child, 1M rows | 100ms+ |
+## Red Flags in Diff
+- Raw query strings using template literals / `format()` with user-derived input → SQL injection.
+- New filter / sort / join column without corresponding index in same migration → flag.
+- `OFFSET` in pagination on table that may grow > 10K rows → flag.
+- New FK without index on the referencing column → flag.
+- `CREATE INDEX` without `CONCURRENTLY` in prod-targeted migration → flag.
+- ORM call inside loop body → N+1 candidate.
+- Transaction wrapping HTTP/external call → flag (long-running transaction risk).
+- New `DROP COLUMN` / `RENAME` in single migration without staged rollout note → flag.
+- `.findMany()` / `.find()` without `take` / `LIMIT` on potentially-large set → flag.
+- New `JSONB` column where 80% of fields are always-present → flag (probably should be columns).

package/knowledge/references/e2e-flutter.md ADDED

@@ -0,0 +1,56 @@
+---
+tags: [e2e, flutter, integration-test, mobile]
+stack_signals:
+  - language: [dart]
+  - project_type: [mobile, frontend-app]
+summary: |
+  Flutter integration test patterns — IntegrationTestWidgetsFlutterBinding,
+  Key-based finders, pumpAndSettle, provider-override mocks.
+when_to_load: |
+  Task writes Flutter integration tests, OR project has integration_test/
+  directory. Validation step asserts end-to-end behavior on a Flutter app.
+agent_hints: [test, acceptance]
+---
+# E2E: Flutter Integration Tests
+## Detection
+`integration_test/` directory or `pubspec.yaml` with Flutter
+## Process
+1. Read existing `integration_test/` files for patterns (test groups, pumping, finders)
+2. Write tests for flows in "Manual Test Steps" section of plan
+3. Run: `flutter test integration_test/` (or specific file)
+## Rules
+- Use `IntegrationTestWidgetsFlutterBinding.ensureInitialized()`
+- Find widgets via `find.byKey`, `find.byType`, `find.text` — prefer `Key` for stability
+- Use `tester.pumpAndSettle()` after actions, not arbitrary delays
+- Mock backend via dependency injection / provider overrides, not real network
+- Group tests with `group()` per feature
+- Test on at least one platform (Android emulator or iOS simulator)
+## pumpAndSettle Timeout
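+Default timeout is 10 minutes, so a screen that never settles stalls the run for a long time. `timeout` is the third positional argument; pass a shorter one to fail fast:
+```dart
+// Arguments: pump interval (default), engine phase (default), timeout.
+await tester.pumpAndSettle(const Duration(milliseconds: 100), EnginePhase.sendSemanticsUpdate, const Duration(seconds: 30));
+```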
+If pumpAndSettle never settles (infinite animation like a progress indicator), use `pump()` with specific duration instead.
+## Screenshots
+Capture screenshots during tests for debugging or visual regression:
+```dart
+final binding = IntegrationTestWidgetsFlutterBinding.ensureInitialized();
+await binding.takeScreenshot('step_name');
+```
+## CI Execution
+- Android: run on emulator started in CI (`flutter emulators --launch <emulator-id>`)
+- iOS: run on simulator (`open -a Simulator`)
+- Or use Firebase Test Lab / AWS Device Farm for real devices
+- Integration tests require a running device — cannot run headless like unit tests
+## Platform Permissions
+- Camera, location, storage permissions need to be pre-granted in test setup
+- Android: use `adb shell pm grant` in CI before running tests
+- iOS: use `xcrun simctl privacy` to grant permissions to simulator

package/knowledge/references/e2e-playwright.md ADDED

@@ -0,0 +1,53 @@
+---
+tags: [e2e, playwright, web, integration-test, frontend]
+stack_signals:
+  - language: [typescript, javascript]
+  - project_type: [frontend-app, monorepo]
+summary: |
+  Playwright E2E patterns — page object usage, getByRole / getByLabel /
+  getByText selector preference, test.describe per feature.
+when_to_load: |
+  Task writes E2E tests, OR project has Playwright config / e2e directory
+  with *.spec.ts. Validation step asserts end-to-end behavior on a web
+  stack.
+agent_hints: [test, acceptance]
+---
+# E2E: Playwright (Web)
+## Detection
+`e2e/` or `tests/` with `*.spec.ts` + Playwright config
+## Process
+1. Read existing Playwright tests for structure (page objects, fixtures, helpers)
+2. Write tests for every flow in "Manual Test Steps" section of plan
+3. Run: command from CLAUDE.md (usually `npm run test:e2e`)
+## Rules
+- Follow existing page object model if project uses one
+- Use existing fixtures and helpers
+- Prefer: `getByRole`, `getByLabel`, `getByText` over CSS selectors
+- Use `test.describe` blocks per feature
+- No `waitForTimeout` — wait for network/element instead
+- Run against local dev server
+## Authentication
+- Use `storageState` to save/restore auth session (avoid login on every test)
+- Create a `global-setup.ts` that logs in once and saves state
+- Share state via `test.use({ storageState: 'auth.json' })`
+## API Interception
+- `page.route('**/api/endpoint', handler)` to mock backend responses
+- Use for: testing error states, offline mode, slow network simulation
+- Prefer: intercept at network level, not mocking the fetch function
+## Debugging
+- `--headed` flag to see browser during development
+- `--trace on` to record a trace for every test; `--trace retain-on-failure` to keep traces only for failed tests
+- Trace viewer: `npx playwright show-trace trace.zip`
+- Screenshots on failure: set `use: { screenshot: 'only-on-failure' }` in `playwright.config.ts`; call `page.screenshot()` for ad-hoc captures
+## Parallelism & Isolation
+- Test files run in parallel by default; tests within a file run in order, and each test gets a fresh browser context
+- Don't share state between tests (no shared variables, no test ordering)
+- Use `test.describe.serial` only when order truly matters

package/knowledge/references/error-handling.md ADDED

@@ -0,0 +1,208 @@
+---
+tags: [error-handling, retry, fallback, circuit-breaker, exception, resilience]
+stack_signals: []
+summary: |
+  Error-handling design — errors are first-class, fail-fast over
+  swallow-and-continue. Patterns for retry, fallback, circuit breakers,
+  error envelopes, and dead-letter queues.
+when_to_load: |
+  Task touches try/catch blocks, error responses, retry logic, circuit
+  breakers, fallback paths, error envelopes, exception types, error logging,
+  or dead-letter queues. Diff including new external calls, new HTTP
+  handlers, new background jobs, or any change to error-handling code also
+  qualifies.
+agent_hints: [logic-reviewer, challenger-reviewer, security]
+---
+# Error Handling — Senior Stance
+## When this applies
+Load when task touches: try/catch blocks, error responses, retry logic, circuit breakers, fallback paths, error envelopes, exception types, error logging, dead-letter queues. Reviewer auto-loads when diff includes new external calls, new HTTP handlers, new background jobs, or any change to error-handling code.
+## Default Stance
+Errors are first-class. Every external call can fail; every input can be malformed; every assumption can be violated. The question is never "what if it fails?" — it's "how does it fail, and what's the right user-facing outcome?". Default to fail-fast and surface (with proper logging) over swallow-and-continue. Resilience comes from explicit policy (retry, fallback, degrade), not from defensive `try/catch` everywhere.
+## Patterns (use these)
+### Error categorization (decide once, route consistently)
+- **Validation** (4xx) — caller's fault, no retry, surface to user. Don't log as error (noise).
+- **Authentication / authorization** (401/403) — caller's fault, no retry.
+- **Not found** (404) — caller's fault OR auth-by-existence; never expose internal.
+- **Conflict** (409) — caller's fault (idempotency-key conflict, optimistic-lock fail). May retry with new key.
+- **Rate limit** (429) — caller's fault, retry with backoff after `Retry-After`.
+- **Transient downstream** (502/503/504) — not caller's fault, retry with backoff.
+- **Internal error** (500) — server's fault, alert, do NOT retry blindly (might be deterministic bug).
+### Retry policy (per category)
+- Idempotent + transient (5xx, timeout, connection-reset): exponential backoff + jitter, cap 3-5 attempts.
+- Non-idempotent: retry only with idempotency key. Otherwise — fail-fast.
+- 4xx (caller's fault): never retry.
+- 429: respect `Retry-After`. If header missing, default backoff.
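The per-category policy above can be written as one decision function. This is a sketch: `decideRetry` and its parameters are hypothetical, and `idempotent` means either naturally idempotent or protected by an idempotency key.

```typescript
type RetryDecision = { retry: false } | { retry: true; waitMs: number };

function decideRetry(
  status: number,
  idempotent: boolean,
  retryAfterSeconds: number | undefined,
  defaultBackoffMs: number,
): RetryDecision {
  if (status === 429) {
    // Respect Retry-After when present; otherwise fall back to default backoff.
    const waitMs =
      retryAfterSeconds !== undefined ? retryAfterSeconds * 1000 : defaultBackoffMs;
    return { retry: true, waitMs };
  }
  // Any other 4xx is the caller's fault: never retry.
  if (status >= 400 && status < 500) return { retry: false };

  const transient = status === 502 || status === 503 || status === 504;
  if (transient && idempotent) return { retry: true, waitMs: defaultBackoffMs };

  // Includes 500 (possibly a deterministic bug) and non-idempotent calls.
  return { retry: false };
}
```

Keeping the decision in one place means every client in the codebase routes the same status the same way; the backoff value passed in would come from the exponential-with-jitter policy.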
+### Circuit breaker
+Wrap each external dependency:
+- **Closed** (normal): pass through.
+- **Open** (when error rate > threshold over window): fail-fast for N seconds.
+- **Half-open**: let one request through; success → close.
+- Saves the downstream from your retry storm during its outage.
+### Fallback path
+For non-critical features, define a "degraded" answer:
+- Recommendation engine times out → return popular items.
+- Personalization service down → return generic content.
+- Cache miss + DB slow → serve stale cache.
+Document the fallback in code AND in the dashboard so operators see "we're degraded, not broken".
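One way to make the degraded answer explicit in code is to return it with a flag, so callers and dashboards can tell "degraded" from "healthy". A minimal sketch; `Degradable`, `withFallback`, and the `onDegraded` hook are hypothetical names.

```typescript
interface Degradable<T> {
  value: T;
  degraded: boolean; // true when the fallback produced the value
}

async function withFallback<T>(
  primary: () => Promise<T>,
  fallback: () => T | Promise<T>,
  onDegraded: (cause: unknown) => void = () => {},
): Promise<Degradable<T>> {
  try {
    return { value: await primary(), degraded: false };
  } catch (cause) {
    // Hook for a log line or metric so operators can see the degradation.
    onDegraded(cause);
    return { value: await fallback(), degraded: true };
  }
}
```

This shape suits non-critical features only; for critical paths a failed primary should surface as an error rather than be papered over.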
+### Error envelope (consistent shape)
+For HTTP:
+```json
+{
+  "error": {
+    "code": "VALIDATION_ERROR",
+    "message": "Email is required",
+    "details": [{ "field": "email", "rule": "required" }],
+    "request_id": "req_abc123"
+  }
+}
+```
+Same shape for every error. Frontend has one error parser, one error UI.
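The envelope above could be typed once and shared by server and frontend. A sketch under the assumption that `details` is always an array of `{ field, rule }` pairs, as in the JSON example; `ErrorEnvelope`, `toEnvelope`, and `isErrorEnvelope` are illustrative names.

```typescript
interface ErrorDetail {
  field: string;
  rule: string;
}

interface ErrorEnvelope {
  error: {
    code: string;
    message: string;
    details: ErrorDetail[];
    request_id: string;
  };
}

// Server side: the only place that builds an error body.
function toEnvelope(
  code: string,
  message: string,
  requestId: string,
  details: ErrorDetail[] = [],
): ErrorEnvelope {
  return { error: { code, message, details, request_id: requestId } };
}

// Client side: the single parser that guards every error response.
function isErrorEnvelope(body: unknown): body is ErrorEnvelope {
  if (typeof body !== "object" || body === null) return false;
  const inner = (body as { error?: unknown }).error;
  if (typeof inner !== "object" || inner === null) return false;
  const fields = inner as Record<string, unknown>;
  return (
    typeof fields["code"] === "string" &&
    typeof fields["message"] === "string" &&
    typeof fields["request_id"] === "string"
  );
}
```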
+### Typed errors
+Distinguish error categories at the type level:
+- TS: `class ValidationError extends Error`, `class NotFoundError extends Error`, etc.
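+- Python: `class ValidationError(Exception)` hierarchy (derive from `Exception`, not `BaseException`, which is reserved for system-exiting exceptions such as `KeyboardInterrupt` and `SystemExit`).
+- Rust: `Result<T, E>` with enum E.
+- Go: returned `error` values, matched with `errors.Is` / `errors.As`.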
+Handlers can `instanceof` / pattern-match to decide the right HTTP status and log level.
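For the TypeScript case, a minimal sketch of the hierarchy plus the top-level mapping. The base class name `AppError`, the `mapError` function, and the specific codes are assumptions for illustration; the `Object.setPrototypeOf` call keeps `instanceof` working when compiling to older targets.

```typescript
class AppError extends Error {
  constructor(message: string) {
    super(message);
    this.name = new.target.name;
    // Restore the prototype chain so instanceof works on ES5 targets too.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

class ValidationError extends AppError {}
class NotFoundError extends AppError {}

type LogLevel = "info" | "warn" | "error";

interface MappedError {
  status: number;
  code: string;
  level: LogLevel;
}

// Top-level translation: inner code throws specific types, this decides
// the HTTP status and log level in one place.
function mapError(err: unknown): MappedError {
  if (err instanceof ValidationError) {
    return { status: 400, code: "VALIDATION_ERROR", level: "info" };
  }
  if (err instanceof NotFoundError) {
    return { status: 404, code: "NOT_FOUND", level: "info" };
  }
  // Anything untyped is unexpected: generic 500, logged at error level.
  return { status: 500, code: "INTERNAL_ERROR", level: "error" };
}
```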
+### Fail-fast on unknown state
+If state is corrupt/inconsistent, crash the request (or process) loudly rather than continue with bad data. A loud failure is debuggable; a silently propagating bug is not.
+### Dead-letter queue (DLQ)
+For background jobs, after retry exhaustion → push to DLQ. Don't drop. Don't loop forever.
+- DLQ size monitored; alert whenever it grows.
+- Operator can inspect, fix, replay.
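The retry-then-park flow can be illustrated with a toy in-memory queue. This is only a sketch of the control flow: `InMemoryQueue`, `Job`, and `DeadLetter` are hypothetical, and a real queue would persist jobs and delay re-delivery with backoff.

```typescript
interface Job<T> {
  id: string;
  payload: T;
  attempts: number;
}

interface DeadLetter<T> {
  job: Job<T>;
  lastError: string;
}

class InMemoryQueue<T> {
  readonly pending: Job<T>[] = [];
  readonly deadLetters: DeadLetter<T>[] = [];

  constructor(private readonly maxAttempts: number) {}

  enqueue(id: string, payload: T): void {
    this.pending.push({ id, payload, attempts: 0 });
  }

  // Process one job. Failures are re-queued until maxAttempts, then parked.
  processNext(handler: (payload: T) => void): void {
    const job = this.pending.shift();
    if (job === undefined) return;
    job.attempts += 1;
    try {
      handler(job.payload);
    } catch (err) {
      const lastError = err instanceof Error ? err.message : String(err);
      if (job.attempts >= this.maxAttempts) {
        this.deadLetters.push({ job, lastError }); // not dropped, not looped
      } else {
        this.pending.push(job);
      }
    }
  }

  // Operator action after a fix: move a dead letter back with a fresh budget.
  replay(id: string): boolean {
    const index = this.deadLetters.findIndex((entry) => entry.job.id === id);
    const entry = this.deadLetters[index];
    if (entry === undefined) return false;
    this.deadLetters.splice(index, 1);
    this.pending.push({ ...entry.job, attempts: 0 });
    return true;
  }
}
```

Monitoring would watch `deadLetters.length` (or its equivalent in the real broker) and alert when it grows.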
+### Error context preservation
+When wrapping/rethrowing:
+- TS: `throw new Error('parse failed', { cause: originalError })` — keeps the cause chain (ES2022 `cause` option; Node.js 16.9+).
+- Python: `raise NewError(...) from original` — same.
+- Don't bury the original. Log the chain when surfacing.
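"Log the chain when surfacing" might look like the sketch below, which walks `cause` links and emits one line per error. `describeChain` and `wrapError` are hypothetical helpers; `wrapError` attaches `cause` by hand only so the sketch compiles where the ES2022 `ErrorOptions` typings are unavailable, and in a modern codebase the two-argument `Error` constructor does the same job.

```typescript
// Attach a cause the way the ES2022 constructor option does
// (non-enumerable, writable, configurable).
function wrapError(message: string, cause: unknown): Error {
  const wrapped = new Error(message);
  Object.defineProperty(wrapped, "cause", {
    value: cause,
    enumerable: false,
    writable: true,
    configurable: true,
  });
  return wrapped;
}

// One line per link in the chain, outermost first. maxDepth guards
// against accidental cycles.
function describeChain(err: unknown, maxDepth = 10): string[] {
  const lines: string[] = [];
  let current: unknown = err;
  for (let depth = 0; current !== undefined && current !== null && depth < maxDepth; depth++) {
    if (current instanceof Error) {
      lines.push(`${current.name}: ${current.message}`);
      current = (current as { cause?: unknown }).cause;
    } else {
      lines.push(String(current)); // non-Error causes end the chain
      break;
    }
  }
  return lines;
}
```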
+### Timeouts as deliberate errors
+Every external call has a timeout. Timeout fires → that's a normal error path, not a panic. Handle it: retry (if eligible), fallback, return 503 to caller.
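A sketch of turning a timeout into an ordinary typed error, using only `Promise` and `setTimeout`. `CallTimeoutError` and `withTimeout` are hypothetical names. Note the limitation: this stops waiting but does not cancel the underlying call, so clients that accept an abort signal should be given one as well.

```typescript
class CallTimeoutError extends Error {
  constructor(readonly timeoutMs: number) {
    super(`timed out after ${timeoutMs} ms`);
    this.name = "CallTimeoutError";
    // Restore the prototype chain so instanceof works on ES5 targets too.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Rejects with CallTimeoutError if `work` has not settled within timeoutMs.
function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new CallTimeoutError(timeoutMs)), timeoutMs);
    work.then(
      (value) => {
        clearTimeout(timer); // settle first, then drop the timer
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}
```

A handler can then match `CallTimeoutError` like any other typed error and choose between retry (if eligible), fallback, or a 503 to the caller.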
+## Anti-Patterns (DO NOT)
+### Empty catch blocks
+```ts
+try { await externalCall(); } catch (e) {}
+```
+**Why it bites:** error swallowed, no log, no metric. Bug invisible until prod incident.
+**Rule:** every catch logs OR rethrows OR has explicit "ignore-because-X" comment with reasoning.
+### Catch-all at the top of every function
+```ts
+async function handleRequest() {
+  try {
+    // entire body
+  } catch (e) {
+    return { error: 'something went wrong' };
+  }
+}
+```
+**Why it bites:** loses error categorization, no proper status code, no useful logs. Caller can't tell validation error from infrastructure failure.
+**Rule:** centralized error middleware/handler that maps typed errors → HTTP responses. Inner code throws specific error types; top-level translates.
+### Logging "error" for every catch including expected ones
+Validation failure logs at ERROR level; oncall paged; turns out it's user typo.
+**Rule:** ERROR for unexpected; WARN for expected-but-noteworthy; INFO for normal flow. Validation failures = INFO or DEBUG.
+### Error message includes stack trace in user-facing response
+`{ "error": "TypeError: cannot read property 'foo' of undefined at ..."}` — leaks internal structure, security risk, terrible UX.
+**Rule:** user gets `code` + safe `message` + `request_id`. Operators look up `request_id` in logs to see the stack.
+### Retry without idempotency
+Retry storm on `POST /charge` → 3 charges. Real prod incident waiting to happen.
+**Rule:** retry only with idempotency key OR for naturally idempotent ops (PUT full state, DELETE).
+### Retry without backoff/jitter
+Tight retry loop: target down → 100 clients × 10 retries × 0 delay = 1000 back-to-back requests hammering the target during its outage. Outage prolonged.
+**Rule:** exponential backoff + jitter + max attempts. Always.
+### Generic `Error` for everything
+`throw new Error('user not found')` then catch and `instanceof Error` check. Can't distinguish from any other error.
+**Rule:** typed error classes. `class NotFoundError extends Error`. Handler matches type → status code.
+### Wrapping every error in a generic envelope, losing original
+```ts
+try { await externalCall(); } catch (e) { throw new InternalError('failed'); }
+```
+Original cause lost. Debug requires guessing.
+**Rule:** include `cause`/`from`. Preserve the chain.
+### "Just retry" as the only resilience strategy
+Retries are useful for transient failures, useless for deterministic bugs. Retrying a SQL syntax error 5 times wastes time.
+**Rule:** distinguish transient (retry) from deterministic (fail-fast, alert). Don't retry 4xx, deterministic 5xx, parse errors.
+### Throw-then-catch as control flow
+Using exceptions for normal branching (e.g., "user not found" as a normal flow path) → exceptions are slow + obscure intent.
+**Rule:** sentinel return values (`null`, `Option`, `Result`) for expected absence. Exceptions for unexpected.
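In TypeScript the simplest form of this rule is putting the absence in the return type, so the compiler forces callers to handle it. A small sketch; `UserRecord`, `findUser`, and `greeting` are made-up names.

```typescript
interface UserRecord {
  id: string;
  email: string;
}

// Expected absence is part of the signature; nothing is thrown for a miss.
function findUser(users: ReadonlyMap<string, UserRecord>, id: string): UserRecord | undefined {
  return users.get(id);
}

function greeting(users: ReadonlyMap<string, UserRecord>, id: string): string {
  const user = findUser(users, id);
  // Plain branching instead of try/catch around a "not found" exception.
  return user === undefined ? "Hello, guest" : `Hello, ${user.email}`;
}
```

Where a miss really is unexpected (for example, a foreign key that should always resolve), throwing a typed `NotFoundError` at that call site remains appropriate.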
+### Error logged at every layer AND passed up to the caller
+Same error logged at every layer it bubbles through → 5 log lines per error → log volume multiplied on every failing request.
+**Rule:** log once, at the boundary where the error is surfaced. Inner layers rethrow without logging.
+### Background job retries forever
+No max attempts → poisoned message loops forever, eats workers, blocks queue.
+**Rule:** max attempts. Then DLQ. Then alert.
+### `process.exit(1)` in library code
+Library kills the host process on error. Caller can't recover.
+**Rule:** library throws; only the application's main loop / signal handler decides whether to exit.
+## Decision Framework
+| Failure | Response |
+|---|---|
+| Caller sent invalid input | 4xx with error envelope; log INFO |
+| Caller not authenticated | 401; log INFO |
+| Resource not found | 404; log INFO unless suspicious pattern |
+| Idempotency-key conflict | 409 with previous response; log INFO |
+| Downstream HTTP timeout | retry (idempotent) or 503 (non-idempotent); log WARN |
+| Downstream HTTP 5xx | retry with backoff; log WARN |
+| Downstream rate-limited (429) | respect Retry-After; log WARN |
+| Database connection lost | retry once with new connection; if that fails, 503; log ERROR |
+| Validation passes but business rule violated | 422 with specifics; log INFO |
+| Unexpected exception | 500 with generic message + request_id; log ERROR + alert |
+| Background job fails | retry per policy; on exhaustion → DLQ + alert |
+| Critical invariant violated mid-request | log ERROR + abort request (don't return partial bad data) |
+## Cost Model
+| Pattern | Cost when wrong |
+|---|---|
+| Empty catch block | Bug invisible; surfaces only as user complaint or incident |
+| Retry without idempotency | Duplicate writes; data corruption; potential financial loss |
+| No circuit breaker on flaky downstream | Your service degrades when downstream does; cascading outage |
+| Generic 500 on validation errors | Frontend shows "something went wrong"; UX suffers; oncall paged falsely |
+| No DLQ on background jobs | Silent data loss; processing gaps invisible |
+| Stack trace in user response | Information leak; security review failure |
+| Excess error logging | Log volume cost; signal-to-noise drops; real errors hidden |
+## Red Flags in Diff
+- `try { ... } catch (e) {}` empty catch → flag.
+- `try { ... } catch (e) { console.log(e); }` log-and-continue without rethrow or specific handling → flag (likely swallowing).
+- New external call without timeout option → flag.
+- New retry loop without exponential backoff + jitter + cap → flag.
+- New retry on a non-idempotent op without idempotency key → flag.
+- New catch-all in a request handler returning generic error → flag (use error middleware).
+- `throw new Error('...')` for distinct error categories without typed subclasses → flag.
+- Stack trace / internal type included in error response body → flag (info leak).
+- New background job without DLQ destination on exhaustion → flag.
+- Logging at ERROR level for expected paths (validation, 404 from search) → flag (alert noise).
+- New `process.exit` / `os._exit` / `panic!` outside main entrypoint → flag.
+- Error message constructed by string-concatenating user input → flag (log injection).
+- New error envelope shape that doesn't match the project's existing format → flag (drift).
+- Catching base `Exception` / `Error` and silently mapping to 200 success response → flag immediately.