npm - dojo.md - Versions diffs - 0.1.0 → 0.2.0 - Mend

dojo.md 0.1.0 → 0.2.0

Files changed (243)

package/courses/postgresql-query-optimization/scenarios/level-4/read-replica-optimization.yaml ADDED

@@ -0,0 +1,62 @@
+meta:
+  id: read-replica-optimization
+  level: 4
+  course: postgresql-query-optimization
+  type: output
+  description: "Optimize read replica architecture — design replica routing, handle replication lag, and balance read workloads across PostgreSQL replicas"
+  tags: [PostgreSQL, read-replicas, replication-lag, load-balancing, routing, expert]
+state: {}
+trigger: |
+  Your platform has outgrown a single PostgreSQL primary. Read queries
+  account for 85% of total load, and the primary is at 90% CPU during
+  peak hours. You've added 3 read replicas, but they're not helping as
+  much as expected — some replicas are overloaded while others are idle,
+  and users are seeing stale data.
+  Current architecture:
+  - Primary: r6g.4xlarge, 100K queries/sec peak
+  - Replica 1: r6g.2xlarge — serving analytics team (heavy queries)
+  - Replica 2: r6g.4xlarge — serving API reads (high throughput)
+  - Replica 3: r6g.2xlarge — mostly idle (recently added)
+  Problems observed:
+  1. Replication lag: Replica 1 lags 30 seconds during analytics queries
+     because heavy queries consume all CPU, slowing WAL replay.
+  2. Stale reads: A user updates their profile, then immediately sees
+     the old data because the read is routed to a replica.
+  3. Uneven load: Application uses round-robin routing, but analytics
+     queries on Replica 1 make it 10x slower than Replica 2.
+  4. Failover gaps: When the primary failed, Replica 1 was promoted but
+     it was 30 seconds behind — 30 seconds of data was lost.
+  5. Connection routing: Each service has hardcoded replica endpoints.
+     Adding Replica 3 requires updating 20 service configurations.
+  Architecture questions:
+  - How to route reads intelligently (by query type, not round-robin)?
+  - How to handle read-after-write consistency without always hitting
+    the primary?
+  - How to prevent analytics queries from impacting OLTP replicas?
+  - How to manage replication lag and make promotion decisions safely?
+  - Should you use synchronous replication for any replicas?
+  Task: Design the optimized read replica architecture. Write: the
+  replica topology (sizing, roles, sync vs async), the query routing
+  strategy (intelligent routing, not round-robin), the consistency
+  model (handling stale reads), the replication lag management plan,
+  and the failover strategy with RPO guarantees.
+assertions:
+  - type: llm_judge
+    criteria: "Replica topology separates workloads — dedicates specific replicas to OLTP reads vs analytics, sizes replicas appropriately for their workload, and explains the sync vs async trade-off (sync for zero-data-loss failover candidate, async for analytics where lag is acceptable). Addresses all 5 observed problems"
+    weight: 0.35
+    description: "Workload-aware replica topology"
+  - type: llm_judge
+    criteria: "Query routing is intelligent — uses middleware or proxy (e.g., PgPool-II, HAProxy, application-level routing) to route by query type (writes→primary, OLTP reads→OLTP replicas, analytics→analytics replica), implements read-after-write consistency (session stickiness to primary for N seconds after write, or check replication LSN), and handles adding/removing replicas without application changes"
+    weight: 0.35
+    description: "Intelligent query routing"
+  - type: llm_judge
+    criteria: "Replication lag and failover are managed — monitors lag per replica (pg_stat_replication, replay_lag), removes lagging replicas from read pool, uses synchronous replication for at least one failover candidate to guarantee RPO=0, and defines promotion criteria (which replica to promote, how to handle lag gap)"
+    weight: 0.30
+    description: "Lag management and failover"
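
The routing and consistency ideas in this scenario (route by query type, drop lagging replicas from the OLTP pool, pin a session to the primary briefly after a write) can be sketched in a few lines. This is only an illustration under stated assumptions: the `Router` and `Replica` classes, the role labels, and the 5-second and 10-second thresholds are hypothetical and do not come from the package. In practice the lag values would be sampled from something like `pg_stat_replication`, and the logic would live in a proxy or shared client library.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    role: str           # "oltp" or "analytics" (hypothetical labels)
    lag_seconds: float  # assumed to be sampled periodically from the primary


@dataclass
class Router:
    replicas: list
    max_lag_seconds: float = 5.0  # assumed threshold for leaving the OLTP pool
    pin_seconds: float = 10.0     # assumed read-after-write window
    _last_write: dict = field(default_factory=dict)

    def record_write(self, session_id, now=None):
        """Remember when this session last wrote."""
        self._last_write[session_id] = time.monotonic() if now is None else now

    def route(self, session_id, kind, now=None):
        """Return a target name for a query of kind 'write', 'oltp', or 'analytics'."""
        now = time.monotonic() if now is None else now
        if kind == "write":
            return "primary"
        last = self._last_write.get(session_id)
        if kind == "oltp" and last is not None and now - last < self.pin_seconds:
            return "primary"  # recent writers read their own writes
        candidates = [r for r in self.replicas if r.role == kind]
        if kind == "oltp":
            # Only OLTP reads are lag-sensitive; analytics tolerates stale data.
            candidates = [r for r in candidates if r.lag_seconds <= self.max_lag_seconds]
        if not candidates:
            return "primary"
        return min(candidates, key=lambda r: r.lag_seconds).name
```

Because services would ask the router for a target instead of holding replica endpoints, adding a replica becomes a change to the `replicas` list rather than to 20 service configurations.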

package/courses/postgresql-query-optimization/scenarios/level-4/vendor-evaluation.yaml ADDED

@@ -0,0 +1,73 @@
+meta:
+  id: vendor-evaluation
+  level: 4
+  course: postgresql-query-optimization
+  type: output
+  description: "Evaluate PostgreSQL hosting vendors — compare RDS, Aurora, self-managed, Citus, and other options for enterprise workloads"
+  tags: [PostgreSQL, vendor-evaluation, RDS, Aurora, Citus, TCO, expert]
+state: {}
+trigger: |
+  Your company is evaluating PostgreSQL hosting options for a
+  platform migration. The CTO wants a comprehensive comparison
+  with TCO analysis over 3 years.
+  Current state: Self-managed PostgreSQL on bare metal
+  - 5 database clusters, 20 instances total
+  - 10TB total data
+  - 100K queries/second peak
+  - 3 DBAs managing the infrastructure
+  - Annual infrastructure cost: $800K
+  - Annual DBA salary cost: $600K
+  - Monthly operational incidents: 3 (average)
+  Vendors to evaluate:
+  1. AWS RDS for PostgreSQL:
+  - Managed service, automated backups, Multi-AZ
+  - db.r6g.4xlarge ($2.46/hr) × 20 instances
+  2. AWS Aurora PostgreSQL:
+  - Distributed storage, up to 15 read replicas
+  - db.r6g.4xlarge ($3.28/hr) × 10 instances (Aurora needs fewer)
+  3. Self-managed on AWS EC2:
+  - Full control, Patroni for HA
+  - r6g.4xlarge ($0.81/hr) × 20 instances + EBS storage
+  4. Citus Cloud (Azure):
+  - Distributed PostgreSQL, horizontal scaling
+  - Multi-tenant SaaS pattern support
+  5. Neon (serverless PostgreSQL):
+  - Scale to zero, branching, compute-storage separation
+  - Pay per compute-second
+  Evaluation criteria:
+  - Total Cost of Ownership (3-year)
+  - Performance (latency, throughput)
+  - Availability (SLA guarantees)
+  - Operational burden (DBA time saved)
+  - Migration complexity
+  - Lock-in risk
+  - Compliance (SOC 2, PCI DSS, HIPAA)
+  Task: Write the vendor evaluation report. Include: the TCO
+  comparison (3-year costs including hidden costs), the performance
+  benchmark plan, the migration risk assessment per vendor, the
+  recommendation with justification, and the executive summary.
+assertions:
+  - type: llm_judge
+    criteria: "TCO analysis includes hidden costs — considers not just instance pricing but also storage, IOPS, data transfer, backup storage, DBA time reduction (or increase), migration cost, training, and the cost of vendor lock-in. 3-year totals are calculated for each option"
+    weight: 0.35
+    description: "Comprehensive TCO analysis"
+  - type: llm_judge
+    criteria: "Evaluation is balanced — each vendor has pros and cons documented, the recommendation is justified by the company's specific needs (not just cheapest), and lock-in risk is honestly assessed (Aurora's proprietary storage layer, Citus's distributed query limitations)"
+    weight: 0.35
+    description: "Balanced vendor evaluation"
+  - type: llm_judge
+    criteria: "Migration risk and operational impact are assessed — estimates migration timeline per vendor, identifies features that don't transfer (extensions, custom configs), and calculates DBA staffing changes (managed services may reduce 3 DBAs to 1, but requires cloud expertise)"
+    weight: 0.30
+    description: "Migration and operational assessment"
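
As a starting point for the 3-year TCO comparison, the instance-only portion can be worked out directly from the hourly rates and counts quoted in the trigger. The sketch below assumes on-demand pricing, 8,760 hours per year, and no discounts; it deliberately excludes the hidden costs the first assertion asks for (storage, IOPS, data transfer, backups, migration, staffing), so it is a lower bound rather than a TCO.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760; ignores leap days
YEARS = 3

# (hourly rate in USD, instance count) as quoted in the scenario text
options = {
    "RDS": (2.46, 20),
    "Aurora": (3.28, 10),
    "EC2 self-managed": (0.81, 20),
}


def instance_cost(rate_per_hour, count, years=YEARS):
    """On-demand instance cost only; every other cost category is excluded."""
    return round(rate_per_hour * count * HOURS_PER_YEAR * years)


three_year = {name: instance_cost(rate, n) for name, (rate, n) in options.items()}

# Current state from the scenario: $800K infrastructure plus $600K DBA salaries per year.
baseline = (800_000 + 600_000) * YEARS

for name, cost in three_year.items():
    print(f"{name}: ${cost:,}")
print(f"Current bare metal (infra + DBAs): ${baseline:,}")
```

This gives $1,292,976 for RDS, $861,984 for Aurora, and $425,736 for EC2 instances over three years, against a $4,200,000 baseline. The baseline includes salaries while the vendor figures do not, so the numbers are not yet comparable until staffing and the other hidden costs are added to each option.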

package/courses/rest-api-error-handling/course.yaml ADDED

@@ -0,0 +1,11 @@
+id: rest-api-error-handling
+name: "REST API Error Handling"
+description: >
+  Master REST API error handling from basic HTTP status codes to enterprise
+  error governance. Learn error response design, RFC 7807 Problem Details,
+  retry and circuit breaker patterns, distributed system error propagation,
+  error budgets, compliance-aware error handling, and API reliability
+  architecture for large-scale systems.
+levels: 5
+scenarios_per_level: 10
+tags: [development, REST, API, error-handling, HTTP, status-codes, reliability, DevOps]

package/courses/rest-api-error-handling/scenarios/level-1/authentication-errors.yaml ADDED

@@ -0,0 +1,71 @@
+meta:
+  id: authentication-errors
+  level: 1
+  course: rest-api-error-handling
+  type: output
+  description: "Handle authentication errors — design error responses for auth failures without leaking security information"
+  tags: [REST, API, authentication, 401, 403, security, beginner]
+state: {}
+trigger: |
+  You're working on the authentication error handling for a healthcare
+  appointment booking API. The security team reviewed your error
+  responses and flagged several issues.
+  Current error responses (flagged as problematic):
+  1. Missing token:
+     401 { "error": "No Authorization header provided" }
+     Security note: "This is fine, but be consistent"
+  2. Malformed token:
+     401 { "error": "JWT format invalid: expected 3 parts, got 1" }
+     Security note: "Reveals token implementation (JWT)"
+  3. Expired token:
+     401 { "error": "Token expired at 2026-02-26T14:30:00Z, current
+     time is 2026-02-27T08:15:00Z" }
+     Security note: "Reveals server clock — useful for timing attacks"
+  4. Invalid signature:
+     401 { "error": "JWT signature verification failed using RS256" }
+     Security note: "Reveals signing algorithm"
+  5. Valid token but user deactivated:
+     401 { "error": "User account deactivated since 2026-01-15" }
+     Security note: "Confirms account existence and status"
+  6. Valid token but wrong role (patient trying admin endpoint):
+     403 { "error": "User john.doe@email.com has role 'patient',
+     requires role 'admin'" }
+     Security note: "Reveals user email and role system"
+  7. Valid token but IP not in allowlist:
+     403 { "error": "IP 192.168.1.50 not in allowlist for admin API" }
+     Security note: "Reveals IP allowlist exists"
+  8. API key instead of bearer token:
+     401 { "error": "Expected Bearer token, received API key
+     sk-abc...xyz" }
+     Security note: "Echoes back the credential!"
+  Task: Rewrite all 8 error responses to be secure (no information
+  leakage) while still being useful enough for legitimate developers
+  to debug. Explain the security principle behind each change. Then
+  write guidelines for the team on what authentication errors should
+  and should not reveal.
+assertions:
+  - type: llm_judge
+    criteria: "Rewritten errors eliminate information leakage — no JWT implementation details, no server timestamps, no user emails, no role system details, no IP allowlist information, and absolutely no credential echoing. Each response gives enough information to debug without revealing security internals"
+    weight: 0.35
+    description: "Secure error responses"
+  - type: llm_judge
+    criteria: "Security principles are correctly explained — covers information disclosure risks (token type, algorithm, clock), account enumeration prevention, principle of least information, and the difference between logging detailed errors server-side vs returning generic errors to clients"
+    weight: 0.35
+    description: "Correct security principles"
+  - type: llm_judge
+    criteria: "Team guidelines are practical — provides clear do/don't list for auth error messages, explains server-side logging of detailed errors for debugging, and addresses the balance between security and developer experience"
+    weight: 0.30
+    description: "Practical team guidelines"
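
One way to meet the "generic to the client, detailed on the server" principle from the assertions is a single lookup from internal failure reasons to public responses. The reason names, error codes, and messages below are hypothetical, chosen only to cover the 8 cases in the trigger; the sketch also assumes callers never pass the credential itself in `detail`.

```python
import json
import logging
import uuid

logger = logging.getLogger("auth")

_INVALID = (401, "invalid_credentials", "The provided credentials are invalid.")
_FORBIDDEN = (403, "forbidden", "You do not have permission to perform this action.")

# Hypothetical internal reasons mapped to what a client is allowed to see.
PUBLIC_ERRORS = {
    "missing_token": (401, "authentication_required", "Authentication is required."),
    "malformed_token": _INVALID,
    "expired_token": (401, "token_expired", "The access token has expired."),
    "bad_signature": _INVALID,
    "account_deactivated": _INVALID,
    "wrong_scheme": _INVALID,
    "insufficient_role": _FORBIDDEN,
    "ip_not_allowed": _FORBIDDEN,
}


def auth_error(reason, detail):
    """Return (status, json_body) for the client and log the detail server-side."""
    status, code, message = PUBLIC_ERRORS.get(reason, _INVALID)
    request_id = str(uuid.uuid4())
    # The detail never leaves the server; request_id lets support correlate the two.
    logger.warning("auth failure reason=%s request_id=%s detail=%s",
                   reason, request_id, detail)
    body = {"error": {"code": code, "message": message, "request_id": request_id}}
    return status, json.dumps(body)
```

Several distinct failures intentionally collapse into the same public response, which is what prevents account enumeration and implementation disclosure; the `request_id` is what keeps the response debuggable for legitimate developers.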

package/courses/rest-api-error-handling/scenarios/level-1/content-negotiation-errors.yaml ADDED

@@ -0,0 +1,63 @@
+meta:
+  id: content-negotiation-errors
+  level: 1
+  course: rest-api-error-handling
+  type: output
+  description: "Handle content negotiation errors — respond correctly when clients send or request unsupported content types"
+  tags: [REST, API, content-type, accept, 406, 415, beginner]
+state: {}
+trigger: |
+  You're building an API that primarily serves JSON but also supports
+  XML for legacy enterprise clients. Users keep hitting confusing
+  errors related to content types.
+  Support tickets this week:
+  Ticket 1: "I'm sending a POST with form data (Content-Type:
+  application/x-www-form-urlencoded) but getting a 500 error with
+  'Unexpected token' in the response."
+  Ticket 2: "I set Accept: application/xml but got back JSON anyway. My XML
+  parser is choking on it."
+  Ticket 3: "I'm uploading a file with Content-Type:
+  multipart/form-data to the /reports endpoint but it says 'invalid JSON'."
+  Ticket 4: "My request works in Postman but fails in my code. I
+  think it's because Postman auto-sets Content-Type but my HTTP
+  library doesn't."
+  Ticket 5: "I'm sending JSON but with Content-Type: text/plain
+  and getting a 400 error that says 'Request body is empty' — but
+  the body is definitely there!"
+  Ticket 6: "I need CSV export of my data. GET /reports with Accept:
+  text/csv returns JSON. Is CSV supported?"
+  Current behavior: The API ignores the Accept header entirely and
+  always returns JSON. For request bodies, it tries to parse
+  everything as JSON regardless of Content-Type, causing cryptic
+  errors.
+  Task: Design the correct content negotiation error handling. For
+  each ticket, explain: what's going wrong, what the correct HTTP
+  status code and response should be, and what headers the response
+  should include. Then write the content negotiation middleware logic
+  that handles both request (Content-Type) and response (Accept)
+  negotiation properly.
+assertions:
+  - type: llm_judge
+    criteria: "Correct status codes for each scenario — 415 Unsupported Media Type when the server can't parse the request body's content type, 406 Not Acceptable when the server can't produce the client's requested format, and appropriate error messages explaining what formats are supported"
+    weight: 0.35
+    description: "Correct content negotiation status codes"
+  - type: llm_judge
+    criteria: "All 6 tickets are addressed with clear explanations — explains why form-encoded data fails JSON parsing, why missing Content-Type causes issues, how Accept header should drive response format, and includes the supported content types in error responses"
+    weight: 0.35
+    description: "All tickets addressed"
+  - type: llm_judge
+    criteria: "Middleware logic is complete — validates Content-Type on requests with bodies, negotiates Accept header for responses, falls back to JSON when no Accept header is sent, and includes proper Content-Type header on all responses including error responses"
+    weight: 0.30
+    description: "Complete middleware logic"
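
The middleware logic the task asks for can be sketched as one framework-agnostic function that checks `Content-Type` first and `Accept` second. This is a simplified illustration, not a complete negotiator: it assumes `headers` is a plain dict with canonical header names, it treats JSON and XML as the only supported types (per the trigger), and it ignores `q` weights in `Accept`.

```python
SUPPORTED_REQUEST_TYPES = {"application/json", "application/xml"}
SUPPORTED_RESPONSE_TYPES = ["application/json", "application/xml"]  # first is the default
BODY_METHODS = {"POST", "PUT", "PATCH"}


def media_type(header_value):
    """Strip parameters such as '; charset=utf-8' and normalise case."""
    return header_value.split(";", 1)[0].strip().lower()


def negotiate(method, headers, has_body):
    """Return (status, response_content_type).

    Status 200 means negotiation passed and the route handler may run.
    Error responses are always described in JSON.
    """
    if has_body and method in BODY_METHODS:
        sent = media_type(headers.get("Content-Type", ""))
        if sent not in SUPPORTED_REQUEST_TYPES:
            return 415, "application/json"  # missing or unsupported request type

    accept = headers.get("Accept", "*/*")  # no Accept header: fall back to JSON
    for item in accept.split(","):
        wanted = media_type(item)
        if wanted in ("*/*", "application/*"):
            return 200, SUPPORTED_RESPONSE_TYPES[0]
        if wanted in SUPPORTED_RESPONSE_TYPES:
            return 200, wanted
    return 406, "application/json"  # nothing the client accepts can be produced
```

Under these assumptions, the form-encoded and `text/plain` tickets resolve to 415, and the `text/csv` ticket resolves to 406; in both cases the error body would list the supported types.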

package/courses/rest-api-error-handling/scenarios/level-1/error-logging-basics.yaml ADDED

@@ -0,0 +1,63 @@
+meta:
+  id: error-logging-basics
+  level: 1
+  course: rest-api-error-handling
+  type: output
+  description: "Set up API error logging — design structured error logs that make debugging fast without logging sensitive data"
+  tags: [REST, API, logging, debugging, structured-logs, beginner]
+state: {}
+trigger: |
+  You're debugging a production issue on a food delivery API.
+  Customers are getting 500 errors when placing orders, but your
+  current logs look like this:
+  2026-02-27 10:15:33 ERROR: Something went wrong
+  2026-02-27 10:15:34 ERROR: Something went wrong
+  2026-02-27 10:15:35 ERROR: null
+  2026-02-27 10:15:36 ERROR: Something went wrong
+  2026-02-27 10:15:37 ERROR: [object Object]
+  You have no idea: which endpoint failed, what the request was,
+  which user was affected, what the actual error was, or how to
+  reproduce it.
+  After 2 hours of blind debugging, you find it's a null pointer
+  in the restaurant menu lookup. You realize you need proper error
+  logging.
+  Your API has these components:
+  - Order placement (POST /orders)
+  - Payment processing (POST /orders/:id/pay)
+  - Restaurant communication (POST /restaurants/:id/notify)
+  - Delivery assignment (POST /deliveries)
+  - User notifications (POST /notifications)
+  Sensitive data that must NOT appear in logs:
+  - Credit card numbers, CVVs
+  - User passwords
+  - Full addresses (only city/zip is OK)
+  - API keys and tokens
+  - Phone numbers (mask all but last 4)
+  Task: Design the structured error logging system. Show: the log
+  format (structured JSON), what fields every error log must include,
+  what sensitive fields must be masked or excluded, example log entries
+  for 3 different error types (validation error, third-party timeout,
+  unhandled exception), and how to search these logs to debug the
+  original "customers getting 500 errors" issue quickly.
+assertions:
+  - type: llm_judge
+    criteria: "Log format is structured and searchable — uses JSON format with consistent fields (timestamp, level, request_id, method, path, status_code, error_type, message, duration_ms, user_id). Includes enough context to reproduce issues without needing to add more logging"
+    weight: 0.35
+    description: "Structured searchable log format"
+  - type: llm_judge
+    criteria: "Sensitive data handling is thorough — credit cards, passwords, tokens are never logged. Phone numbers and addresses are properly masked. Explains the masking strategy (middleware vs per-field) and how to prevent accidental logging of request/response bodies containing sensitive data"
+    weight: 0.35
+    description: "Thorough sensitive data handling"
+  - type: llm_judge
+    criteria: "Example log entries are realistic and the debugging workflow is practical — shows how to filter logs by request_id, status_code, error_type to quickly identify the root cause. The 3 example entries demonstrate different error severities and contexts"
+    weight: 0.30
+    description: "Realistic examples and debugging workflow"
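
A minimal sketch of the structured log entry and masking rules described in this scenario follows. The field names mirror the ones listed in the first assertion; the `sanitize` helper, the `context` field, and the exact key names it drops (`card_number`, `cvv`, and so on) are assumptions for illustration, since the package does not define a payload schema.

```python
import json
import re
from datetime import datetime, timezone

DROP_FIELDS = {"card_number", "cvv", "password", "api_key", "token", "authorization"}
ADDRESS_KEEP = {"city", "zip"}


def mask_phone(value):
    """Keep only the last 4 digits of a phone number."""
    digits = re.sub(r"\D", "", value)
    return "*" * max(len(digits) - 4, 0) + digits[-4:]


def sanitize(payload):
    """Return a copy of payload that is safe to log, recursing into nested dicts."""
    clean = {}
    for key, value in payload.items():
        lowered = key.lower()
        if lowered in DROP_FIELDS:
            continue  # never logged, not even masked
        if lowered == "phone" and isinstance(value, str):
            clean[key] = mask_phone(value)
        elif lowered == "address" and isinstance(value, dict):
            clean[key] = {k: v for k, v in value.items() if k in ADDRESS_KEEP}
        elif isinstance(value, dict):
            clean[key] = sanitize(value)
        else:
            clean[key] = value
    return clean


def error_log(request_id, method, path, status_code, error_type, message,
              duration_ms, user_id=None, context=None):
    """Build one JSON log line with the same fields on every error."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "level": "error",
        "request_id": request_id,
        "method": method,
        "path": path,
        "status_code": status_code,
        "error_type": error_type,
        "message": message,
        "duration_ms": duration_ms,
        "user_id": user_id,
        "context": sanitize(context or {}),
    }
    return json.dumps(entry)
```

With entries in this shape, the original incident reduces to filtering on `status_code` 500 and `path` `/orders`, then following one `request_id` through the remaining fields.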

package/courses/rest-api-error-handling/scenarios/level-1/error-response-format.yaml ADDED

@@ -0,0 +1,58 @@
+meta:
+  id: error-response-format
+  level: 1
+  course: rest-api-error-handling
+  type: output
+  description: "Design error response format — create a consistent JSON error response structure for a REST API"
+  tags: [REST, API, error-response, JSON, format, beginner]
+state: {}
+trigger: |
+  You're building a recipe sharing API and different endpoints return
+  errors in completely different formats. Your team lead says this is
+  confusing for frontend developers and asks you to standardize.
+  Current inconsistent error responses across your API:
+  Endpoint 1 - POST /recipes (validation error):
+  { "error": "Title is required" }
+  Endpoint 2 - GET /recipes/999 (not found):
+  { "message": "Recipe not found", "code": 404 }
+  Endpoint 3 - POST /recipes (auth error):
+  { "status": "error", "reason": "Invalid token" }
+  Endpoint 4 - POST /recipes/5/reviews (rate limit):
+  "Too many requests, please try again later"
+  Endpoint 5 - PATCH /recipes/5 (validation errors):
+  { "errors": ["Title too long", "Invalid category"] }
+  Frontend developers are complaining:
+  - "I never know which field to check for the error message"
+  - "Sometimes I get a string, sometimes an object"
+  - "Validation errors come in different shapes"
+  - "I can't tell if an error is my fault or the server's fault"
+  - "There's no way to show specific field errors in the form"
+  Task: Design a unified error response format for the entire API.
+  Show: the standard error response schema (with required and optional
+  fields), examples for each of the 5 error types above rewritten in
+  the new format, guidelines for when to include each optional field,
+  and the implementation approach (middleware vs per-handler).
+assertions:
+  - type: llm_judge
+    criteria: "Error format is well-designed — includes a machine-readable error code/type, a human-readable message, and the HTTP status code. Handles single errors and multiple validation errors in the same structure. Includes a way to associate errors with specific fields for form validation"
+    weight: 0.35
+    description: "Well-designed error format"
+  - type: llm_judge
+    criteria: "All 5 error types are rewritten consistently — validation errors show field-level detail, not-found errors include resource identification, auth errors distinguish authentication vs authorization, rate limit errors include retry-after information, and all follow the same top-level structure"
+    weight: 0.35
+    description: "Consistent error rewrites"
+  - type: llm_judge
+    criteria: "Implementation guidance is practical — recommends middleware/error handler pattern for consistency, explains how to throw standardized errors from route handlers, and addresses how to prevent internal error details from leaking to the client"
+    weight: 0.30
+    description: "Practical implementation guidance"
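
The unified format this scenario asks for could take many shapes; the sketch below shows one possible envelope with a machine-readable code, a human-readable message, the HTTP status, and optional field-level details. The `error_response` helper and the field names (`details`, `retry_after_seconds`) are illustrative choices, not something the package prescribes.

```python
import json


def error_response(status, code, message, details=None, retry_after=None):
    """Build (status, headers, body) using one envelope shape for every endpoint."""
    error = {"status": status, "code": code, "message": message}
    if details:
        # One entry per problem so a form can attach each message to its field.
        error["details"] = details
    headers = {"Content-Type": "application/json"}
    if retry_after is not None:
        error["retry_after_seconds"] = retry_after
        headers["Retry-After"] = str(retry_after)
    return status, headers, json.dumps({"error": error})


# The five inconsistent responses from the scenario, rewritten in the same shape.
examples = [
    error_response(400, "validation_failed", "The request is invalid.",
                   details=[{"field": "title", "message": "Title is required"}]),
    error_response(404, "recipe_not_found", "Recipe 999 was not found."),
    error_response(401, "invalid_token", "Authentication failed."),
    error_response(429, "rate_limited", "Too many requests.", retry_after=30),
    error_response(400, "validation_failed", "The request is invalid.",
                   details=[{"field": "title", "message": "Title too long"},
                            {"field": "category", "message": "Invalid category"}]),
]
```

Single and multiple validation errors share the `details` list, and the 4xx versus 5xx status tells the frontend whether the fault is the client's or the server's. A central error-handling middleware would be the natural place to call such a helper.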

package/courses/rest-api-error-handling/scenarios/level-1/first-error-handling-shift.yaml ADDED

@@ -0,0 +1,67 @@
+meta:
+  id: first-error-handling-shift
+  level: 1
+  course: rest-api-error-handling
+  type: output
+  description: "First error handling shift — handle a stream of real-time API errors during a product launch event"
+  tags: [REST, API, error-handling, shift-simulation, launch, beginner]
+state: {}
+trigger: |
+  You're on your first on-call shift for a ticket marketplace API.
+  The company is launching sales for a major concert tonight and
+  traffic will spike 10x. Your job is to review the API's error
+  handling before the launch.
+  During your pre-launch review, you find these issues:
+  Issue 1 — No global error handler:
+  If any route throws an unhandled exception, Express returns:
+  <!DOCTYPE html><html><body><pre>TypeError: Cannot read property
+  'email' of undefined<br>&nbsp; at /app/src/routes/tickets.js:45
+  </pre></body></html>
+  (HTML error page with stack trace)
+  Issue 2 — Inconsistent error shapes:
+  - /tickets returns { "error": "..." }
+  - /orders returns { "message": "...", "code": 123 }
+  - /payments returns { "errors": ["..."] }
+  - /users returns { "status": "fail", "data": { "message": "..." } }
+  Issue 3 — Database errors leak:
+  When the database constraint prevents duplicate ticket purchases:
+  "duplicate key value violates unique constraint
+  'orders_user_id_ticket_id_key'"
+  Issue 4 — No rate limiting on search:
+  GET /tickets/search has no rate limit. During the last sale, bots
+  made 50,000 requests/second and the API crashed.
+  Issue 5 — Payment errors are too vague:
+  POST /orders/:id/pay returns 400 { "error": "Payment failed" }
+  for every payment issue (declined card, insufficient funds,
+  expired card, processing timeout, fraud detection).
+  Issue 6 — Timeout errors unhandled:
+  The external payment processor sometimes takes 30+ seconds. The API
+  has no timeout, so requests hang until the client gives up.
+  Task: Fix all 6 issues before the launch. For each, write: what's
+  wrong (the risk for tonight's launch), the fix (code structure or
+  configuration), and the expected error response after the fix.
+  Prioritize the fixes by launch impact.
+assertions:
+  - type: llm_judge
+    criteria: "All 6 issues are fixed with correct solutions — global error handler catches unhandled exceptions (no HTML/stack traces), error format is standardized, database errors are translated to user-friendly messages, rate limiting is implemented for search, payment errors are specific but secure, and timeout handling is added with appropriate status codes"
+    weight: 0.35
+    description: "All issues fixed correctly"
+  - type: llm_judge
+    criteria: "Prioritization is sound — identifies which issues pose the greatest risk for a high-traffic launch event (no global error handler and no rate limiting are critical, inconsistent formats are important but not urgent). Each fix includes the expected error response"
+    weight: 0.35
+    description: "Sound prioritization"
+  - type: llm_judge
+    criteria: "Solutions are launch-ready — fixes are quick to implement (no major refactors), handle the specific high-traffic scenario (10x spike), and include monitoring so the on-call engineer can see errors in real-time during the launch"
+    weight: 0.30
+    description: "Launch-ready solutions"
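
Issue 6 (calls to the payment processor hanging for 30+ seconds) is the kind of fix that fits in a few lines. The sketch below uses `asyncio.wait_for` to bound the call and return 504 Gateway Timeout. The scenario's API is Express, so this Python version only illustrates the pattern; `charge` is a hypothetical coroutine function standing in for the processor client, and the 10-second budget is an assumed value.

```python
import asyncio

PAYMENT_TIMEOUT_SECONDS = 10.0  # assumed budget; tune against real processor latency


async def charge_with_timeout(charge, timeout=PAYMENT_TIMEOUT_SECONDS):
    """Await the hypothetical `charge` coroutine function and return (status, body)."""
    try:
        result = await asyncio.wait_for(charge(), timeout=timeout)
    except asyncio.TimeoutError:
        body = {
            "error": {
                "code": "payment_timeout",
                "message": "The payment processor did not respond in time.",
            }
        }
        return 504, body
    return 200, result
```

A timeout does not prove the charge failed, so a production version would likely pair this with an idempotency key or a status lookup before any retry.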

package/courses/rest-api-error-handling/scenarios/level-1/http-status-codes.yaml ADDED

@@ -0,0 +1,46 @@
+meta:
+  id: http-status-codes
+  level: 1
+  course: rest-api-error-handling
+  type: output
+  description: "Classify HTTP status codes — match API scenarios to correct status codes across 2xx, 4xx, and 5xx families"
+  tags: [REST, API, HTTP, status-codes, beginner]
+state: {}
+trigger: |
+  You're a junior backend developer who just joined a team building a
+  task management API. During your first code review, the senior
+  developer flagged that your endpoints return wrong status codes.
+  Your current mistakes:
+  1. POST /tasks returns 200 when a task is created (should be 201)
+  2. DELETE /tasks/:id returns 200 with empty body (should be 204)
+  3. GET /tasks/:id returns 200 with { error: "not found" } when the
+     task doesn't exist (should be 404)
+  4. POST /tasks with missing "title" field returns 500 (should be 400)
+  5. GET /tasks with expired auth token returns 403 (should be 401)
+  6. PUT /tasks/:id on someone else's task returns 404 (should be 403)
+  7. POST /tasks when the database is down returns 400 (should be 503)
+  8. PATCH /tasks/:id with Content-Type: text/plain returns 500
+     (should be 415)
+  Task: For each of the 8 mistakes, explain: (1) why your current
+  status code is wrong, (2) what the correct status code is and why,
+  and (3) the general rule for when to use that status code. Then
+  create a quick-reference cheat sheet mapping common API scenarios
+  to their correct HTTP status codes.
+assertions:
+  - type: llm_judge
+    criteria: "Correctly identifies all 8 status code fixes — 201 Created for resource creation, 204 No Content for successful deletion, 404 Not Found for missing resources, 400 Bad Request for validation errors, 401 Unauthorized for expired/missing auth, 403 Forbidden for insufficient permissions, 503 Service Unavailable for downstream failures, 415 Unsupported Media Type for wrong content types"
+    weight: 0.35
+    description: "Correct status code identification"
+  - type: llm_judge
+    criteria: "Explanations demonstrate understanding of status code semantics — distinguishes between 401 (identity unknown) and 403 (identity known, permission denied), between 400 (client mistake) and 500 (server mistake), and between 200 (response body) and 204 (no body)"
+    weight: 0.35
+    description: "Status code semantic understanding"
+  - type: llm_judge
+    criteria: "Cheat sheet is practical and organized — groups status codes by family (2xx success, 4xx client error, 5xx server error), covers common API scenarios beyond just the 8 mistakes, and is formatted for quick reference"
+    weight: 0.30
+    description: "Practical cheat sheet"
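
The cheat sheet requested in the task can be kept as data next to the code that uses it. The sketch below encodes the 8 corrections from the trigger using the standard library's `http.HTTPStatus`; the scenario wording in the dictionary keys is paraphrased for illustration.

```python
from http import HTTPStatus

CHEAT_SHEET = {
    "resource created (POST)": HTTPStatus.CREATED,
    "deleted, nothing to return": HTTPStatus.NO_CONTENT,
    "resource does not exist": HTTPStatus.NOT_FOUND,
    "request fails validation": HTTPStatus.BAD_REQUEST,
    "credentials missing or expired": HTTPStatus.UNAUTHORIZED,
    "authenticated but not permitted": HTTPStatus.FORBIDDEN,
    "dependency such as the database is down": HTTPStatus.SERVICE_UNAVAILABLE,
    "request body in an unsupported media type": HTTPStatus.UNSUPPORTED_MEDIA_TYPE,
}

FAMILIES = {2: "success", 4: "client error", 5: "server error"}


def family(status):
    """Classify a status code by its first digit."""
    return FAMILIES[int(status) // 100]


for scenario, status in CHEAT_SHEET.items():
    print(f"{int(status)} {status.phrase:<24} {family(status):<13} {scenario}")
```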

package/courses/rest-api-error-handling/scenarios/level-1/not-found-errors.yaml ADDED

@@ -0,0 +1,52 @@
+meta:
+  id: not-found-errors
+  level: 1
+  course: rest-api-error-handling
+  type: output
+  description: "Handle not-found errors — distinguish between missing resources, wrong URLs, and deleted content with appropriate responses"
+  tags: [REST, API, 404, not-found, routing, beginner]
+state: {}
+trigger: |
+  You're maintaining a blog API and users are reporting confusing
+  behavior. All of these return the same generic 404 response:
+  { "error": "Not found" }
+  But they represent very different situations:
+  1. GET /posts/42 — Post exists but is in draft status (not published)
+  2. GET /posts/99999 — Post ID doesn't exist at all
+  3. GET /posts/abc — Post ID is not a valid integer
+  4. GET /postz/42 — Typo in the URL path (postz instead of posts)
+  5. GET /posts/42 — Post existed but was deleted 3 days ago
+  6. GET /posts/42 — Post exists but belongs to a different tenant
+     in your multi-tenant system
+  7. GET /users/5/posts/42 — User 5 exists but post 42 belongs to
+     user 8
+  8. GET /v1/posts/42 — API v1 is deprecated, only v2 exists
+  The frontend developer says: "I can't tell if the user mistyped
+  the URL, if the post was deleted, if it's a permission issue, or
+  if the endpoint doesn't exist. They all look the same!"
+  Task: Design the appropriate response for each of the 8 scenarios.
+  Not all of them should be 404 — decide which HTTP status code
+  fits each case and explain why. Write the response body for each
+  scenario. Then create a decision tree for "when the resource isn't
+  there" that the team can follow.
+assertions:
+  - type: llm_judge
+    criteria: "Status code selection is correct for each case — distinguishes between 404 (truly not found), 400 (invalid ID format), 410 Gone (deleted content), 403 (wrong tenant/ownership), and appropriate handling for draft content and deprecated API versions. Not all 8 should be 404"
+    weight: 0.35
+    description: "Correct status code selection"
+  - type: llm_judge
+    criteria: "Response bodies are helpful without leaking information — draft posts don't reveal that the post exists (security consideration in multi-tenant), deleted posts may use 410 Gone with deletion context, invalid IDs explain the expected format, and deprecated versions redirect to the current version"
+    weight: 0.35
+    description: "Helpful secure responses"
+  - type: llm_judge
+    criteria: "Decision tree is clear and usable — provides a logical flow for developers to determine the right response when a resource isn't found, considers security implications (don't confirm existence to unauthorized users), and handles edge cases like soft deletes and multi-tenancy"
+    weight: 0.30
+    description: "Clear decision tree"
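
The decision tree for "when the resource isn't there" can be written as an ordered series of checks. The ordering and several status choices below are one defensible reading, not the package's answer: the assertions list 403 as an option for wrong-tenant access while also asking not to confirm existence, so the sketch exposes that choice as a `hide_existence` flag. The function name, parameters, and the shape of the `post` dict are hypothetical.

```python
def not_found_status(version_current, route_known, id_valid, post, tenant,
                     hide_existence=True):
    """Pick a status code for GET on a post.

    `post` is None when no row exists, otherwise a dict with the keys
    'tenant', 'deleted' (bool), and 'status' ('draft' or 'published').
    """
    if not version_current:
        return 308  # deprecated API version: redirect to the current one
    if not route_known:
        return 404  # unknown path such as /postz/42
    if not id_valid:
        return 400  # malformed ID such as /posts/abc
    if post is None:
        return 404  # the ID has never existed
    if post["tenant"] != tenant:
        # Hiding existence returns 404; otherwise the caller learns the post is real.
        return 404 if hide_existence else 403
    if post["deleted"]:
        return 410  # existed once and was removed
    if post["status"] == "draft":
        return 404  # unpublished content stays invisible to readers
    return 200
```

Checks run from the least to the most information-revealing, so a caller who fails an early check learns nothing about later ones.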

package/courses/rest-api-error-handling/scenarios/level-1/rate-limiting-errors.yaml ADDED

@@ -0,0 +1,56 @@
+meta:
+  id: rate-limiting-errors
+  level: 1
+  course: rest-api-error-handling
+  type: output
+  description: "Implement rate limiting errors — design 429 responses with proper headers and retry guidance for API consumers"
+  tags: [REST, API, rate-limiting, 429, throttling, beginner]
+state: {}
+trigger: |
+  You're building a weather data API that offers free and paid tiers.
+  The current rate limiting returns 429 with this response:
+  HTTP/1.1 429
+  { "error": "Too many requests" }
+  Customers are complaining:
+  - "When can I retry? 1 second? 1 hour?"
+  - "How many requests do I have left?"
+  - "I didn't even know there was a rate limit until my app broke"
+  - "My batch job got rate limited and I lost 2 hours of processing"
+  - "The paid plan says 1000 requests/minute but I got limited at 800"
+  Rate limit tiers:
+  - Free: 60 requests/minute, 1000 requests/day
+  - Basic ($29/mo): 600 requests/minute, 50,000 requests/day
+  - Pro ($99/mo): 3000 requests/minute, unlimited daily
+  - Enterprise: custom limits
+  Additional rules:
+  - Burst allowance: 2x the per-minute limit for up to 10 seconds
+  - Different limits for different endpoints (search: 10/min free,
+    forecast: 60/min free)
+  - Authenticated vs unauthenticated requests have different limits
+  Task: Design the complete rate limiting error response. Include:
+  the 429 response body with all necessary information, the rate
+  limit headers (standard and custom), the response headers that
+  should appear on ALL responses (not just 429) so clients can
+  proactively manage their usage, and a guide for API consumers
+  on how to handle rate limits gracefully (backoff strategies).
+assertions:
+  - type: llm_judge
+    criteria: "429 response includes standard rate limit headers — X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset (or Retry-After), with clear documentation of header semantics. Response body includes retry timing, current limit, and which limit was hit (per-minute vs daily)"
+    weight: 0.35
+    description: "Proper 429 response with headers"
+  - type: llm_judge
+    criteria: "Rate limit information appears on ALL responses — successful responses include remaining quota and reset time so clients can proactively avoid hitting limits. The design handles the complexity of multiple limit windows (per-minute and daily) and per-endpoint limits"
+    weight: 0.35
+    description: "Proactive rate limit headers"
+  - type: llm_judge
+    criteria: "Consumer guidance is practical — explains exponential backoff with jitter, how to read rate limit headers proactively, how to build a request queue that respects limits, and addresses the batch processing use case (how to process large workloads within rate limits)"
+    weight: 0.30
+    description: "Practical consumer guidance"
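
Both halves of this scenario, the headers a server attaches to every response and the backoff a client applies, can be sketched briefly. The header names are the ones named in the assertions; the function names, the epoch-seconds convention for `X-RateLimit-Reset`, and the 1-second base and 60-second cap are assumptions for illustration. The backoff shown is the "full jitter" variant, where the delay is drawn uniformly between zero and the exponential ceiling.

```python
import random


def rate_limit_headers(limit, remaining, reset_epoch, now_epoch):
    """Headers for every response; Retry-After is added only once the quota is spent."""
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(remaining, 0)),
        "X-RateLimit-Reset": str(reset_epoch),  # assumed to be Unix epoch seconds
    }
    if remaining <= 0:
        headers["Retry-After"] = str(max(reset_epoch - now_epoch, 0))
    return headers


def backoff_delay(attempt, base=1.0, cap=60.0, retry_after=None, rng=random.random):
    """Seconds to wait before retry number `attempt` (0-based).

    A server-sent Retry-After takes precedence over the computed delay.
    """
    if retry_after is not None:
        return float(retry_after)
    return rng() * min(cap, base * 2 ** attempt)
```

A batch job would read `X-RateLimit-Remaining` on each successful response and slow down before reaching zero, falling back to `backoff_delay` only when a 429 arrives anyway.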