npm - @coralai/sps-cli - Versions diffs - 0.41.2 → 0.43.0 - Mend

@coralai/sps-cli 0.41.2 → 0.43.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (168)

package/README.md +34 -3
package/dist/commands/cardAdd.d.ts +1 -1
package/dist/commands/cardAdd.d.ts.map +1 -1
package/dist/commands/cardAdd.js +16 -6
package/dist/commands/cardAdd.js.map +1 -1
package/dist/commands/cardDashboard.js +1 -1
package/dist/commands/cardDashboard.js.map +1 -1
package/dist/commands/doctor.d.ts +9 -0
package/dist/commands/doctor.d.ts.map +1 -1
package/dist/commands/doctor.js +3 -314
package/dist/commands/doctor.js.map +1 -1
package/dist/commands/hookCommand.d.ts.map +1 -1
package/dist/commands/hookCommand.js +6 -7
package/dist/commands/hookCommand.js.map +1 -1
package/dist/commands/pmCommand.js +1 -1
package/dist/commands/pmCommand.js.map +1 -1
package/dist/commands/projectInit.d.ts.map +1 -1
package/dist/commands/projectInit.js +60 -37
package/dist/commands/projectInit.js.map +1 -1
package/dist/commands/setup.d.ts.map +1 -1
package/dist/commands/setup.js +3 -30
package/dist/commands/setup.js.map +1 -1
package/dist/commands/skillCommand.d.ts +2 -0
package/dist/commands/skillCommand.d.ts.map +1 -0
package/dist/commands/skillCommand.js +235 -0
package/dist/commands/skillCommand.js.map +1 -0
package/dist/commands/tick.js +1 -1
package/dist/commands/tick.js.map +1 -1
package/dist/core/checklist.d.ts +22 -0
package/dist/core/checklist.d.ts.map +1 -0
package/dist/core/checklist.js +38 -0
package/dist/core/checklist.js.map +1 -0
package/dist/core/checklist.test.d.ts +2 -0
package/dist/core/checklist.test.d.ts.map +1 -0
package/dist/core/checklist.test.js +74 -0
package/dist/core/checklist.test.js.map +1 -0
package/dist/core/config.d.ts +1 -1
package/dist/core/config.d.ts.map +1 -1
package/dist/core/config.js +1 -1
package/dist/core/config.js.map +1 -1
package/dist/core/config.test.js +7 -4
package/dist/core/config.test.js.map +1 -1
package/dist/core/context.d.ts +1 -1
package/dist/core/context.d.ts.map +1 -1
package/dist/core/skillStore.d.ts +46 -0
package/dist/core/skillStore.d.ts.map +1 -0
package/dist/core/skillStore.js +197 -0
package/dist/core/skillStore.js.map +1 -0
package/dist/core/skillStore.test.d.ts +2 -0
package/dist/core/skillStore.test.d.ts.map +1 -0
package/dist/core/skillStore.test.js +190 -0
package/dist/core/skillStore.test.js.map +1 -0
package/dist/engines/EventHandler.test.js +3 -3
package/dist/engines/EventHandler.test.js.map +1 -1
package/dist/engines/MonitorEngine.js +2 -2
package/dist/engines/MonitorEngine.js.map +1 -1
package/dist/engines/SchedulerEngine.js +1 -1
package/dist/engines/SchedulerEngine.js.map +1 -1
package/dist/engines/StageEngine.js +3 -3
package/dist/engines/StageEngine.js.map +1 -1
package/dist/engines/engine-pipeline-adapter.test.js +2 -2
package/dist/engines/engine-pipeline-adapter.test.js.map +1 -1
package/dist/interfaces/TaskBackend.d.ts +3 -1
package/dist/interfaces/TaskBackend.d.ts.map +1 -1
package/dist/main.js +19 -17
package/dist/main.js.map +1 -1
package/dist/models/types.d.ts +16 -1
package/dist/models/types.d.ts.map +1 -1
package/dist/providers/MarkdownTaskBackend.d.ts +2 -1
package/dist/providers/MarkdownTaskBackend.d.ts.map +1 -1
package/dist/providers/MarkdownTaskBackend.js +28 -5
package/dist/providers/MarkdownTaskBackend.js.map +1 -1
package/dist/providers/registry.d.ts.map +1 -1
package/dist/providers/registry.js +5 -7
package/dist/providers/registry.js.map +1 -1
package/package.json +1 -1
package/project-template/.claude/hooks/start.sh +44 -0
package/project-template/.claude/settings.json +1 -1
package/skills/architecture-decision-records/SKILL.md +207 -0
package/skills/backend/SKILL.md +62 -0
package/skills/backend/references/api-design.md +168 -0
package/skills/backend/references/caching.md +181 -0
package/skills/backend/references/data-access.md +173 -0
package/skills/backend/references/layering.md +181 -0
package/skills/backend/references/observability.md +190 -0
package/skills/backend/references/resilience.md +201 -0
package/skills/backend/references/security.md +186 -0
package/skills/backend-architect/SKILL.md +119 -0
package/skills/code-reviewer/SKILL.md +143 -0
package/skills/coding-standards/SKILL.md +60 -0
package/skills/coding-standards/references/clean-code.md +258 -0
package/skills/coding-standards/references/code-review.md +192 -0
package/skills/coding-standards/references/commits-and-prs.md +226 -0
package/skills/coding-standards/references/error-strategy.md +193 -0
package/skills/coding-standards/references/naming.md +185 -0
package/skills/coding-standards/references/tdd.md +171 -0
package/skills/database/SKILL.md +53 -0
package/skills/database/references/indexing.md +190 -0
package/skills/database/references/migrations.md +199 -0
package/skills/database/references/nosql.md +185 -0
package/skills/database/references/queries.md +295 -0
package/skills/database/references/scaling.md +203 -0
package/skills/database/references/schema.md +191 -0
package/skills/database-optimizer/SKILL.md +168 -0
package/skills/debugging-workflow/SKILL.md +244 -0
package/skills/devops/SKILL.md +55 -0
package/skills/devops/references/ci-cd.md +204 -0
package/skills/devops/references/containers.md +272 -0
package/skills/devops/references/deploy.md +201 -0
package/skills/devops/references/iac.md +252 -0
package/skills/devops/references/observability.md +228 -0
package/skills/devops/references/secrets.md +178 -0
package/skills/devops-automator/SKILL.md +164 -0
package/skills/frontend/SKILL.md +52 -0
package/skills/frontend/references/accessibility.md +222 -0
package/skills/frontend/references/components.md +206 -0
package/skills/frontend/references/performance.md +219 -0
package/skills/frontend/references/routing.md +209 -0
package/skills/frontend/references/state.md +190 -0
package/skills/frontend/references/testing.md +216 -0
package/skills/frontend-developer/SKILL.md +115 -0
package/skills/git-workflow/SKILL.md +355 -0
package/skills/golang/SKILL.md +49 -0
package/skills/golang/references/concurrency.md +284 -0
package/skills/golang/references/errors.md +241 -0
package/skills/golang/references/idioms.md +285 -0
package/skills/golang/references/testing.md +238 -0
package/skills/java/SKILL.md +50 -0
package/skills/java/references/concurrency.md +194 -0
package/skills/java/references/idioms.md +283 -0
package/skills/java/references/testing.md +228 -0
package/skills/kotlin/SKILL.md +47 -0
package/skills/kotlin/references/coroutines.md +240 -0
package/skills/kotlin/references/idioms.md +268 -0
package/skills/kotlin/references/testing.md +219 -0
package/skills/mobile/SKILL.md +50 -0
package/skills/mobile/references/architecture.md +204 -0
package/skills/mobile/references/navigation.md +158 -0
package/skills/mobile/references/performance.md +152 -0
package/skills/mobile/references/platform.md +166 -0
package/skills/mobile/references/state-and-data.md +174 -0
package/skills/python/SKILL.md +51 -0
package/skills/python/THIRD_PARTY.md +14 -0
package/skills/python/references/async.md +218 -0
package/skills/python/references/error-handling.md +254 -0
package/skills/python/references/idioms.md +279 -0
package/skills/python/references/packaging.md +233 -0
package/skills/python/references/testing.md +269 -0
package/skills/python/references/typing.md +292 -0
package/skills/qa-tester/SKILL.md +186 -0
package/skills/rust/SKILL.md +50 -0
package/skills/rust/references/async.md +224 -0
package/skills/rust/references/errors.md +240 -0
package/skills/rust/references/ownership.md +263 -0
package/skills/rust/references/testing.md +274 -0
package/skills/rust/references/traits.md +250 -0
package/skills/security-engineer/SKILL.md +157 -0
package/skills/swift/SKILL.md +48 -0
package/skills/swift/references/concurrency.md +280 -0
package/skills/swift/references/idioms.md +334 -0
package/skills/swift/references/testing.md +229 -0
package/skills/typescript/SKILL.md +51 -0
package/skills/typescript/references/async.md +241 -0
package/skills/typescript/references/errors.md +208 -0
package/skills/typescript/references/idioms.md +246 -0
package/skills/typescript/references/testing.md +225 -0
package/skills/typescript/references/tooling.md +208 -0
package/skills/typescript/references/types.md +259 -0

package/skills/architecture-decision-records/SKILL.md ADDED

@@ -0,0 +1,207 @@
+---
+name: architecture-decision-records
+description: Workflow skill — write, review, and maintain ADRs. Capture the *why* behind technical decisions so future readers don't re-litigate them.
+origin: ecc-fork (https://github.com/affaan-m/everything-claude-code, MIT)
+---
+# Architecture Decision Records (ADRs)
+Short, versioned documents capturing a single technical decision: what we decided, why, and what would make us reconsider it.
+## When to load
+- Making a technical decision with non-trivial reach (affects multiple teams or components, or will stand for more than 6 months).
+- Introducing a new technology, service, pattern.
+- Deprecating a significant piece of infrastructure.
+- Reviewing someone's proposed ADR.
+- Wondering "why do we do X?" and finding no record.
+## Why ADRs matter
+A codebase without ADRs has this conversation every six months:
+> "Why are we using Kafka here? MQ would be simpler."
+> "I think… performance? I wasn't here when we decided."
+The decision gets made again, people compromise on different tradeoffs, the choice drifts. An ADR records the decision while the context is fresh, so the next discussion starts from facts, not vibes.
+## Anatomy of an ADR
+```
+# ADR-0007: Adopt Postgres as the primary OLTP store
+Date: 2026-04-21
+Status: Accepted
+Deciders: Alice (CTO), Bob (Staff Eng), Carol (Platform)
+## Context
+We need a primary OLTP store for the new user service. Current options
+considered: Postgres, MySQL, DynamoDB, CockroachDB.
+Constraints:
+- Must run in both AWS and on-prem (current requirement from Customer X).
+- Expect 10k QPS peak, 1 TB at year 2.
+- Team has strong Postgres experience; no DynamoDB experience.
+- Budget constraint: self-hosted preferred over managed where reasonable.
+## Decision
+Adopt Postgres 16 as the primary OLTP store for the user service,
+managed via RDS in AWS and self-hosted on-prem.
+## Consequences
+Positive:
+- Team already fluent; hiring pool large.
+- JSONB + strong relational semantics covers 95% of our model.
+- Rich ecosystem (partitioning, logical replication, extensions).
+Negative:
+- Horizontal scaling requires sharding (future problem if we grow past
+  a single-instance + read-replica topology).
+- Less native cloud integration than DynamoDB on AWS.
+## Alternatives considered
+- MySQL: team less familiar; similar capability otherwise.
+- DynamoDB: no on-prem story, access-pattern-locked schema design.
+- CockroachDB: stronger horizontal scaling; team has no ops experience.
+## Reconsider if
+- We need genuine multi-region write active/active.
+- On-prem requirement is dropped.
+- Operational burden of sharding exceeds the effort to migrate.
+## Related
+- ADR-0003 (record why we split auth from user service)
+- ADR-0005 (picked AWS as primary cloud)
+```
+## Structure — keep it short
+Sections:
+1. **Context** — the situation and constraints.
+2. **Decision** — one paragraph. What we're doing.
+3. **Consequences** — positive + negative + neutral effects.
+4. **Alternatives considered** — what else we weighed.
+5. **Reconsider if** — conditions that should trigger a revisit.
+6. **Related** — links to prior ADRs, docs, tickets.
+Two pages max. ADRs that bloat into design docs stop getting read.
+## Numbering & status
+Sequential: `ADR-0001-...md` in `docs/adr/` or similar. Status:
+| Status | Meaning |
+|---|---|
+| **Proposed** | Up for review |
+| **Accepted** | Approved and in effect |
+| **Rejected** | Considered, not adopted |
+| **Deprecated** | No longer applied; kept for history |
+| **Superseded by ADR-XXXX** | Replaced; link the successor |
+Don't edit accepted ADRs. Write a new one that supersedes, and update the old one's status to `Superseded by ADR-NNNN`.
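The sequential numbering described above can be scripted. The following is a minimal sketch, assuming filenames follow the `ADR-0001-...md` pattern; the function name `next_adr_filename` and the slug rules are hypothetical and not part of the package.

```python
import re

# Matches names such as "ADR-0007-adopt-postgres.md" (assumed naming pattern).
ADR_NAME = re.compile(r"^ADR-(\d{4})-.+\.md$")


def next_adr_filename(existing: list[str], title: str) -> str:
    """Return the next sequential ADR filename for a short title."""
    numbers = [int(m.group(1)) for name in existing if (m := ADR_NAME.match(name))]
    next_number = max(numbers, default=0) + 1
    # Lowercase the title and collapse anything non-alphanumeric into hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"ADR-{next_number:04d}-{slug}.md"
```

Files that do not match the pattern (an index or README) are ignored, so the next number depends only on existing ADRs.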
+## When to write one
+Rule of thumb: if someone will ask "why did we do this?" in six months, there should be an ADR.
+Triggers:
+- Adopting or replacing infrastructure (DB, queue, cache, build tool).
+- Choosing a communication style (REST vs. gRPC, sync vs. async).
+- Non-obvious architectural constraints (single-writer model, tenant isolation scheme).
+- Significant policy: code style, review rules, SLO definitions.
+- Deprecations and removals.
+Don't write one for:
+- Naming a variable.
+- Choosing an icon size.
+- Local refactors without reach beyond the file.
+## The review
+Treat an ADR like a PR. Open it for comment with `Status: Proposed`. Reviewers focus on:
+- Are the constraints accurate?
+- Are the alternatives real alternatives?
+- Are the consequences honest (including the painful ones)?
+- Is the "reconsider if" section a real re-opener?
+Timebox the review — ADRs that linger in review lose momentum. A week is usually enough.
+## Who writes / approves
+- **Author**: the engineer proposing or doing the work.
+- **Reviewers**: peers, tech lead, any team directly affected.
+- **Approver**: usually the senior engineer / architect responsible for the area. One approver is enough; more than three is a committee.
+## Living with ADRs
+The document isn't the point — the decision is. Refer to ADRs in:
+- PR descriptions ("This implements the approach in ADR-0012").
+- Onboarding docs ("Our conventions live in `docs/adr/`").
+- Incident postmortems (when a decision's tradeoff bit).
+A directory of ADRs is the most compact onboarding material you can give a new engineer.
+## Tools
+Minimal stack:
+- `docs/adr/NNNN-short-title.md` in the repo.
+- A script or `adr-tools` / `log4brains` for numbering.
+- Index file listing all ADRs and statuses.
+Heavier options (Confluence, Notion) work, but markdown-in-repo wins for:
+- Version control (the decision is versioned with the code that enacts it).
+- Easy diff when an ADR is updated.
+- No hunting across multiple surfaces.
+## What a good ADR feels like
+- A reader can decide "should I care about this?" from the title + first sentence.
+- A new hire reading it a year later can understand the choice without asking.
+- The "reconsider if" section is specific enough that an engineer in 2028 knows when to revisit.
+## What a bad ADR looks like
+- Title: "ADR-12: Kafka" (no decision; no context).
+- 15 pages describing the system in full, decision buried on page 9.
+- No alternatives. No constraints. Reads like a sales pitch for the chosen option.
+- No "reconsider if" — the decision looks eternal.
+- Written after the decision was shipped, recast to fit what was built.
+## Tradeoffs to always name
+- **Write now vs. write later**: writing during the decision takes 30 min; reconstructing it a year later takes hours and produces lies.
+- **Rigor vs. effort**: short-and-honest beats long-and-idealized.
+- **Formal vs. casual process**: start casual; formalize as the org grows.
+- **Centralized vs. team-local ADRs**: team-local for team-scoped decisions; central for cross-team.
+## Common failure modes
+| Failure | Why |
+|---|---|
+| No ADRs written | Decisions get re-litigated; tribal knowledge rots |
+| ADRs written but ignored | Not linked from PRs / docs; unfindable |
+| ADRs written post-hoc to justify | Lose the "we considered X and Y" honesty |
+| ADRs that are 20 pages | Nobody reads them; collapse to summary |
+| ADRs that keep getting edited | Write a new one that supersedes |
+| "ADR" that just says "we'll use X" | Decision without context / alternatives / consequences |
+## Anti-patterns
+- Writing an ADR to lock down a decision that hasn't actually been discussed.
+- Using ADRs as RFC-lite without a clear question and clear options.
+- Updating an accepted ADR to change the decision — write a new superseding ADR.
+- Endless review cycles (> 2 weeks) — call consensus and accept; iterate if reality disagrees later.
+- Hiding ADRs in Confluence under three levels of navigation — in the repo is best.
+- Treating ADRs as permission — the ADR records a decision, it doesn't replace engineering judgment on specifics.
+## Pair with
+- [`coding-standards/references/code-review.md`](../coding-standards/references/code-review.md) — review discipline.
+- [`backend-architect`](../backend-architect/SKILL.md) — the role that most often drives ADRs.

package/skills/backend/SKILL.md ADDED

@@ -0,0 +1,62 @@
+---
+name: backend
+description: Backend end skill — API design, layering, data access, caching, auth, resilience, observability. Language-neutral. Combine with a language skill (`python`, `typescript`, `golang`, etc.) for syntax, and with persona skills (`backend-architect`, `database-optimizer`) for mindset.
+origin: ecc-fork + original (https://github.com/affaan-m/everything-claude-code, MIT)
+---
+# Backend
+Server-side architecture patterns. **Language-neutral by design** — examples use pseudocode or diagrams, never a specific language. Pair with a language skill for idiomatic implementation.
+## When to load
+- Designing or reviewing server-side code (API, service, worker)
+- Deciding layering (repository, service, controller, domain)
+- Data access: queries, transactions, migrations, N+1, connection pooling
+- Caching, queuing, background jobs
+- Authentication, authorization, rate limiting, input validation
+- Resilience: retries, timeouts, circuit breakers, idempotency
+- Observability: structured logging, metrics, traces, health checks
+## Core principles
+1. **Keep the domain ignorant of infrastructure.** Business logic doesn't import HTTP, DB drivers, or queues directly — those cross the boundary through interfaces.
+2. **The caller should be able to swap the implementation.** If you can't replace the DB with an in-memory fake in tests, your layering is wrong.
+3. **Every write is either idempotent or transactional.** Retries must be safe.
+4. **Input validation happens at the edge.** Once data is inside the domain, it is trusted.
+5. **Timeouts on every outbound call.** No unbounded network wait. Ever.
+6. **Never log secrets, tokens, PII.** Redact at the logger, not at the call site.
+7. **Observability is not optional.** A request you can't trace is a bug you can't fix.
+8. **Errors cross the boundary as data, not as exceptions.** The HTTP layer decides status codes; the domain raises domain errors.
+## How to use references
+| Reference | When to load |
+|---|---|
+| [`references/api-design.md`](references/api-design.md) | REST/GraphQL/gRPC conventions, versioning, error format, pagination |
+| [`references/layering.md`](references/layering.md) | Repository / service / controller, hexagonal, dependency direction |
+| [`references/data-access.md`](references/data-access.md) | Transactions, N+1, migrations, connection pooling |
+| [`references/caching.md`](references/caching.md) | Cache-aside, write-through, TTL, invalidation, stampede protection |
+| [`references/security.md`](references/security.md) | AuthN vs authZ, sessions vs tokens, RBAC, rate limiting, input validation |
+| [`references/resilience.md`](references/resilience.md) | Retries, timeouts, circuit breakers, idempotency, background jobs |
+| [`references/observability.md`](references/observability.md) | Structured logging, metrics, traces, health checks, correlation IDs |
+## Language binding
+This skill has no language-specific content. For concrete syntax:
+- Python backend → load `python` + this skill
+- TypeScript/Node → load `typescript` + this skill
+- Go → load `golang` + this skill
+- etc.
+## Forbidden patterns (auto-reject)
+- Business logic that imports HTTP request/response objects directly
+- DB queries issued from controllers (bypass the repository)
+- Outbound HTTP / DB call with no timeout
+- Writes that aren't idempotent AND aren't in a transaction
+- Secrets or tokens in logs
+- Unvalidated input reaching the domain layer
+- Catch-all `500 Internal Server Error` as the only error response
+- Silent swallowing of background-job failures (no dead-letter, no alert)

package/skills/backend/references/api-design.md ADDED

@@ -0,0 +1,168 @@
+# API Design
+REST, GraphQL, gRPC conventions. Focus on contracts, not implementations.
+## Style selection
+| Style | Good for | Weak for |
+|---|---|---|
+| REST | CRUD resources, public APIs, cacheable reads | Rich queries, partial responses, real-time |
+| GraphQL | Client-driven shape, many UIs against one backend | Simple CRUD, caching, rate limiting per field |
+| gRPC | Service-to-service, strict schemas, streaming | Browsers without a proxy, public APIs |
+When in doubt, start with REST. Switch later if the pain justifies the churn.
+## REST resource conventions
+Nouns, plural, lowercase, hyphenated. Hierarchy reflects ownership.
+```
+GET    /users                      # list
+GET    /users/{id}                 # read one
+POST   /users                      # create
+PUT    /users/{id}                 # full replace
+PATCH  /users/{id}                 # partial update
+DELETE /users/{id}                 # delete
+GET    /users/{id}/orders          # sub-resources
+POST   /users/{id}/orders
+```
+Avoid verbs in paths (`/getUser`, `/createOrder`). If an action truly doesn't fit CRUD, sub-resource it: `POST /orders/{id}/cancel`.
+## Pagination
+Cursor-based for anything that can grow. Offset/limit is fine for small fixed sets.
+```
+# Cursor (preferred for large / infinite lists)
+GET /events?cursor=eyJpZCI6MTIzfQ&limit=50
+# Response
+{
+  "data": [...],
+  "next_cursor": "eyJpZCI6MTczfQ",
+  "has_more": true
+}
+# Offset (fine for small admin views)
+GET /users?offset=0&limit=20
+```
+Offset pagination breaks silently when rows are inserted during paging; cursors don't.
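One way to build opaque cursors like those in the example above is base64url-encoded JSON holding the last seen id; the sample value `eyJpZCI6MTIzfQ` appears to decode to `{"id":123}`. This is a sketch under that assumption. The helper names (`encode_cursor`, `decode_cursor`, `list_page`) and the in-memory row list are hypothetical stand-ins for a real query.

```python
import base64
import json
from typing import Optional


def encode_cursor(position: dict) -> str:
    """Serialize a position as compact JSON, then unpadded base64url."""
    raw = json.dumps(position, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def decode_cursor(cursor: str) -> dict:
    """Restore the padding that encode_cursor stripped, then decode."""
    padded = cursor + "=" * (-len(cursor) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def list_page(rows: list[dict], cursor: Optional[str], limit: int) -> dict:
    """Return the page after `cursor`; `rows` must be sorted by ascending id."""
    if cursor is None:
        matching = rows
    else:
        last_id = decode_cursor(cursor)["id"]
        matching = [r for r in rows if r["id"] > last_id]
    page = matching[:limit]
    has_more = len(matching) > limit
    return {
        "data": page,
        "next_cursor": encode_cursor({"id": page[-1]["id"]}) if has_more else None,
        "has_more": has_more,
    }
```

Because each page is defined by "id greater than the last one seen", a row inserted while a client is paging does not shift the window the way an offset would.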
+## Filtering, sorting
+```
+GET /orders?status=paid&created_after=2026-01-01&sort=-created_at&limit=20
+# Sort prefix: - for descending
+sort=-created_at,name
+```
+Whitelist allowed filter and sort fields. Never pass user-provided strings into query builders without validation.
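The whitelist rule above can be sketched as a small parser for the `sort` parameter. The allowed field set and the name `parse_sort` are assumptions chosen for illustration.

```python
# Hypothetical whitelist; a real service would define this per resource.
ALLOWED_SORT_FIELDS = frozenset({"created_at", "name", "status"})


def parse_sort(param: str, allowed=ALLOWED_SORT_FIELDS) -> list[tuple[str, str]]:
    """Parse 'sort=-created_at,name' style values into (field, direction) pairs."""
    result = []
    for token in param.split(","):
        token = token.strip()
        if not token:
            continue
        descending = token.startswith("-")
        field = token[1:] if descending else token
        # Reject anything not explicitly allowed before it reaches a query builder.
        if field not in allowed:
            raise ValueError(f"unsupported sort field: {field!r}")
        result.append((field, "desc" if descending else "asc"))
    return result
```

Only names that pass the check are returned, so the query layer never sees raw user strings.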
+## Error responses
+Consistent shape everywhere. Problem Details (RFC 9457) is a reasonable default.
+```json
+{
+  "type": "https://errors.example.com/validation",
+  "title": "Validation failed",
+  "status": 422,
+  "detail": "email is required",
+  "errors": [
+    { "field": "email", "code": "required" },
+    { "field": "age", "code": "out_of_range" }
+  ],
+  "request_id": "req_01HX..."
+}
+```
+Rules:
+- `status` matches the HTTP status.
+- `request_id` correlates with server logs.
+- Never leak stack traces to clients.
+## HTTP status codes
+| Code | Use for |
+|---|---|
+| 200 | Successful read or update |
+| 201 | Resource created; include `Location` header |
+| 202 | Async accepted; polling URL in body or `Location` |
+| 204 | Success with no body (e.g. DELETE) |
+| 400 | Malformed request (bad JSON, missing path param) |
+| 401 | No / invalid auth |
+| 403 | Authenticated but forbidden |
+| 404 | Resource does not exist (or is hidden from this caller) |
+| 409 | Conflict (duplicate, version mismatch) |
+| 422 | Well-formed but semantically invalid |
+| 429 | Rate limited; include `Retry-After` |
+| 500 | Unexpected server error |
+| 503 | Dependency down / overloaded |
+`400` vs `422`: parse error vs validation error. `403` vs `404`: exposing 403 leaks existence of the resource — return `404` when that leak matters.
+## Versioning
+Pick one and be consistent.
+| Strategy | Example | Trade-off |
+|---|---|---|
+| URL | `/v1/users`, `/v2/users` | Simple; clutters paths; forces clients to migrate wholesale |
+| Header | `Accept: application/vnd.api+json;version=2` | Clean URLs; harder to test in curl |
+| Query | `/users?v=2` | Easy; often accidentally cached |
+Bump the major version only for breaking changes. Additive changes (new optional fields) go in the same version.
+## Idempotency
+Any non-GET request that retries must be safe. Accept an `Idempotency-Key` header for unsafe methods.
+```
+POST /payments
+Idempotency-Key: 7a8b9c...
+```
+Server stores `(key, request_hash) -> response` for N hours. Same key + same body → return stored response. Same key + different body → 409.
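The storage rule above can be sketched with an in-memory store. This is a minimal illustration, not a production design: the class and exception names are hypothetical, the dictionary stands in for a shared database or cache, and the code is not thread-safe.

```python
import hashlib
import json
import time


class IdempotencyConflict(Exception):
    """Same key reused with a different request body (maps to HTTP 409)."""


class IdempotencyStore:
    def __init__(self, ttl_seconds: float = 24 * 3600, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (request_hash, response, expires_at)

    def execute(self, key: str, body: dict, handler):
        """Run handler(body) once per key; replay the stored response on retry."""
        request_hash = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode("utf-8")
        ).hexdigest()
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[2] > now:
            stored_hash, response, _ = entry
            if stored_hash != request_hash:
                raise IdempotencyConflict(key)
            return response  # same key + same body: no second side effect
        response = handler(body)
        self._entries[key] = (request_hash, response, now + self._ttl)
        return response
```

A retried `POST /payments` with the same key and body gets the original response back, and the handler runs only once.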
+## GraphQL conventions
+- One endpoint: `POST /graphql`.
+- Mutations return the modified object plus a client-defined selection, so the UI can update without a refetch.
+- Don't expose database IDs; use opaque global IDs (Relay spec) if you want pagination or federation.
+- Enforce max query depth and complexity to prevent DoS-by-query.
+## gRPC conventions
+- Use proto3.
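+- Every field is optional; breaking changes happen when you renumber a field or change its type. Renaming keeps the wire format intact but breaks generated code and JSON mappings.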
+- Stream only when the payload doesn't fit one response.
+- Put auth in metadata, not in the request message.
+## Response shape
+Keep it flat. Don't wrap with `{ success: true, data: ... }` unless your framework forces it — HTTP status already signals success.
+```
+# Single resource
+{ "id": "u_01H", "name": "Alice", "email": "a@x.com" }
+# Collection
+{ "data": [...], "next_cursor": "...", "has_more": false }
+# Errors: see above
+```
+Consistency matters more than cleverness. Pick a shape, document it, follow it.
+## Documentation
+Every public endpoint has:
+- path, method, auth requirement
+- request body schema
+- success response schema (with example)
+- listed error codes
+OpenAPI / Protobuf schemas are the contract. Hand-written prose docs drift and lie.

package/skills/backend/references/caching.md ADDED

@@ -0,0 +1,181 @@
+# Caching
+Rules, strategies, pitfalls. Cache-aside covers 90% of cases.
+## Cache-aside (lazy loading)
+Application checks cache first; on miss, loads from source and populates cache.
+```
+get(id):
+    v = cache.get(key(id))
+    if v is not None:
+        return v                      # hit
+    v = source.load(id)               # miss
+    if v is not None:
+        cache.set(key(id), v, ttl=5min)
+    return v
+```
+Pros: simple; stale data only appears on cached keys; source is authoritative.
+Cons: first reader after expiry pays full latency; risk of **cache stampede** when many readers miss together.
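The cache-aside pseudocode above can be made concrete as follows. This is a sketch with an in-process TTL cache standing in for Redis or similar; `TTLCache`, `get_user`, and the injected `load_from_source` callable are hypothetical names.

```python
import time


class TTLCache:
    """Tiny in-process cache where every entry expires."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value, ttl: float):
        self._data[key] = (value, self._clock() + ttl)


def get_user(user_id, cache: TTLCache, load_from_source, ttl: float = 300.0):
    """Cache-aside read: check the cache, fall back to the source, populate."""
    key = f"user:v2:{user_id}"
    value = cache.get(key)
    if value is not None:
        return value                      # hit
    value = load_from_source(user_id)     # miss
    if value is not None:
        cache.set(key, value, ttl)
    return value
```

Passing the clock in keeps expiry testable without sleeping.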
+## Write-through
+Application writes to both the source and the cache in the same operation (usually source first, then cache). The two writes are not truly atomic, so plan for the second one failing.
+```
+save(entity):
+    source.save(entity)
+    cache.set(key(entity.id), entity, ttl=5min)
+```
+Pros: cache is always fresh after a write.
+Cons: writes are slower; if cache write fails, you have stale data (decide: rollback, or fire-and-forget with expiry).
+## Write-behind (deferred)
+Application writes to cache; a background job flushes to source later. Rare — only for very high write volume and tolerance for delayed durability. Almost always the wrong choice; you're trading data loss risk for write throughput.
+## What to cache (and what NOT to)
+**Cache-friendly**:
+- Read-heavy, changes rarely (config, product catalog, user profile)
+- Expensive to compute (rendered HTML, aggregations, vector search)
+- Idempotent reads
+**Avoid caching**:
+- Per-user personalized data with high cardinality (cache hit rate too low)
+- Rapidly changing data (reconciliation cost > cache benefit)
+- Anything where staleness is a correctness bug (balances, seat availability)
+## TTL strategy
+Every cache entry must expire. No TTL = memory leak.
+| Data type | Starting TTL |
+|---|---|
+| Static config | 1–24 h |
+| User profile | 5–60 min |
+| Hot aggregation | 10 s – 5 min |
+| Computed render | minutes |
+| Feature flag eval | 30–60 s |
+Add a small random jitter (±10%) so entries don't all expire at the same instant and trigger a stampede.
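A jittered TTL can be computed in one line. The sketch below assumes a uniform spread around the base value; the function name `jittered_ttl` and the injectable `rng` parameter are hypothetical.

```python
import random


def jittered_ttl(base_seconds: float, spread: float = 0.10, rng=random) -> float:
    """Return base_seconds scaled by a random factor in [1 - spread, 1 + spread]."""
    return base_seconds * (1.0 + rng.uniform(-spread, spread))
```

For a 5-minute base TTL this yields values between roughly 270 and 330 seconds, so keys written together expire spread over a one-minute window.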
+## Invalidation
+The second hardest problem in computing. Three approaches:
+1. **TTL only** — simple; tolerate staleness up to TTL. Default choice.
+2. **Explicit invalidation** — on write, delete the cache key. Works if your mutation paths are countable.
+   ```
+   save(user):
+       db.update(user)
+       cache.delete(key(user.id))
+   ```
+3. **Event-driven** — publish `UserUpdated`; subscribers invalidate their caches. Needed when many services cache the same entity.
+Don't try to *update* the cache on write in complex systems — delete instead and let the next read repopulate. Updates race; deletes don't.
+## Cache key design
+Stable, explicit, version-prefixed.
+```
+# ✅
+user:v2:{user_id}
+product:v1:{sku}:detail
+list:orders:v1:user={uid}:status=paid:cursor={c}
+# ❌
+u_123                  # ambiguous across services
+users:123:details      # no version
+${JSON.stringify(query)}  # fragile; order-dependent
+```
+A version prefix lets you deploy a new value format without misreading entries written in the old one; old keys simply age out.
+## Stampede protection
+When a hot key expires, many requests miss at once and pile onto the source. Two fixes:
+### Single-flight / coalescing
+In-process: at most one loader per key; concurrent callers wait for the same result.
+```
+load(key):
+    with singleFlight(key):
+        return source.load(key)
+```
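The `singleFlight(key)` primitive in the pseudocode above is left abstract. One possible in-process shape, using only `threading`, is sketched below; the class names are hypothetical, and it coalesces callers only within a single process.

```python
import threading


class _Call:
    """Shared state for one in-flight load."""

    def __init__(self):
        self.done = threading.Event()
        self.result = None
        self.error = None


class SingleFlight:
    def __init__(self):
        self._lock = threading.Lock()
        self._calls = {}  # key -> _Call currently in flight

    def do(self, key, loader):
        """Run loader() at most once per key at a time; others wait for its result."""
        with self._lock:
            call = self._calls.get(key)
            leader = call is None
            if leader:
                call = _Call()
                self._calls[key] = call
        if leader:
            try:
                call.result = loader()
            except BaseException as exc:
                call.error = exc
            finally:
                with self._lock:
                    del self._calls[key]
                call.done.set()  # wake every waiting caller
        else:
            call.done.wait()
        if call.error is not None:
            raise call.error
        return call.result
```

Callers that arrive after the leader finishes start a fresh load, so this bounds concurrency per key without caching anything itself.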
+### Probabilistic early expiration (XFetch)
+Before the TTL, some fraction of readers voluntarily refresh.
+```
+get(key):
+    v, ttl_remaining = cache.get_with_ttl(key)
+    if v is None or should_refresh_early(ttl_remaining):
+        v = source.load(key)
+        cache.set(key, v, ttl=5min)
+    return v
+```
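The pseudocode above leaves `should_refresh_early` undefined. A sketch of the usual XFetch rule follows: refresh when `-delta * beta * ln(u)` is at least the remaining TTL, where `u` is uniform in (0, 1]. The extra `recompute_seconds` parameter (the `delta` term, the typical cost of reloading the value) and the `beta` default are assumptions that the original signature does not show.

```python
import math
import random


def should_refresh_early(
    ttl_remaining: float,
    recompute_seconds: float,
    beta: float = 1.0,
    rng=random.random,
) -> bool:
    """Decide probabilistically whether this reader should refresh before expiry."""
    if ttl_remaining <= 0:
        return True  # already expired
    u = 1.0 - rng()  # rng() is in [0, 1), so u is in (0, 1] and log(u) is defined
    return -recompute_seconds * beta * math.log(u) >= ttl_remaining
```

The chance of an early refresh is `exp(-ttl_remaining / (recompute_seconds * beta))`, so it climbs as expiry approaches and is higher for values that are slow to recompute.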
+## Negative caching
+Cache misses are expensive if they happen constantly (e.g., 404 lookups). Cache the absence too, with a short TTL.
+```
+get(id):
+    v = cache.get(key(id))
+    if v is MISSING_SENTINEL:
+        return None                 # known-not-found
+    if v is not None:
+        return v
+    v = source.load(id)
+    if v is None:
+        cache.set(key(id), MISSING_SENTINEL, ttl=30s)   # cache the absence briefly
+    else:
+        cache.set(key(id), v, ttl=5min)
+    return v
+```
+Short TTL — don't cache `None` for hours; the item may just have been created.
+## HTTP-level caching
+For public GET endpoints, let the HTTP layer cache. Free, correctly implemented, respected by CDNs.
+```
+Cache-Control: public, max-age=300, s-maxage=600, stale-while-revalidate=60
+ETag: "abc123"
+```
+- `max-age`: browser/client
+- `s-maxage`: shared caches (CDN)
+- `stale-while-revalidate`: serve stale while refreshing in the background
+- `ETag` + `If-None-Match`: 304 responses save bandwidth
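The `ETag` and `If-None-Match` exchange can be sketched as a framework-free function. The names `make_etag` and `respond` are hypothetical, the tag is assumed to be a truncated content hash, and only exact tag matches and `*` are handled (weak validators are out of scope here).

```python
import hashlib
from typing import Optional

CACHE_CONTROL = "public, max-age=300, s-maxage=600, stale-while-revalidate=60"


def make_etag(body: bytes) -> str:
    """Derive a quoted strong ETag from the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str]):
    """Return (status, headers, body); 304 with an empty body when the tag matches."""
    etag = make_etag(body)
    headers = {"Cache-Control": CACHE_CONTROL, "ETag": etag}
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if "*" in candidates or etag in candidates:
            return 304, headers, b""
    return 200, headers, body
```

A client that replays the tag it received gets a 304 and reuses its cached copy; once the body changes, the tag changes and the full response is sent again.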
+## Local (in-process) cache vs distributed
+| | Local (in-process) | Distributed (Redis, Memcached) |
+|---|---|---|
+| Latency | Nanoseconds | ~1 ms |
+| Consistency across instances | No — each pod has its own | Yes |
+| Size | Limited to process memory | Limited to cluster |
+| Eviction | LRU, LFU | LRU, LFU, TTL |
+| Cost | Free | Infra + ops |
+| Invalidation | Hard across pods | One call |
+Use local for small hot data; distributed for shared state. Don't mix carelessly — a per-pod cache that's supposed to be consistent will drift.
+## Anti-patterns
+| Anti-pattern | Why bad |
+|---|---|
+| No TTL anywhere | Memory leak; stale data forever |
+| Caching mutable objects by reference | Next reader mutates the cached copy |
+| Caching per-user data with high cardinality | Low hit rate; wastes memory |
+| Cache key includes a timestamp that changes every request | Every request is a miss |
+| Serializing cache writes into the request path without timeout | Cache outage → requests hang |
+| Reading cache without a fallback path | Cache is a dependency; treat it as optional |
+| Storing secrets in shared cache | Secret sprawl across cluster |