dojo.md 0.2.1 → 0.2.3
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/courses/GENERATION_LOG.md +20 -0
- package/courses/api-documentation-writing/course.yaml +12 -0
- package/courses/api-documentation-writing/scenarios/level-1/authentication-basics.yaml +46 -0
- package/courses/api-documentation-writing/scenarios/level-1/data-types-formats.yaml +45 -0
- package/courses/api-documentation-writing/scenarios/level-1/endpoint-description.yaml +45 -0
- package/courses/api-documentation-writing/scenarios/level-1/error-documentation.yaml +45 -0
- package/courses/api-documentation-writing/scenarios/level-1/first-documentation-shift.yaml +47 -0
- package/courses/api-documentation-writing/scenarios/level-1/getting-started-guide.yaml +42 -0
- package/courses/api-documentation-writing/scenarios/level-1/pagination-docs.yaml +51 -0
- package/courses/api-documentation-writing/scenarios/level-1/request-parameters.yaml +46 -0
- package/courses/api-documentation-writing/scenarios/level-1/request-response-examples.yaml +48 -0
- package/courses/api-documentation-writing/scenarios/level-1/status-codes.yaml +45 -0
- package/courses/api-documentation-writing/scenarios/level-2/error-patterns.yaml +48 -0
- package/courses/api-documentation-writing/scenarios/level-2/intermediate-documentation-shift.yaml +48 -0
- package/courses/api-documentation-writing/scenarios/level-2/oauth-documentation.yaml +47 -0
- package/courses/api-documentation-writing/scenarios/level-2/openapi-specification.yaml +46 -0
- package/courses/api-documentation-writing/scenarios/level-2/rate-limiting-docs.yaml +45 -0
- package/courses/api-documentation-writing/scenarios/level-2/request-body-schemas.yaml +46 -0
- package/courses/api-documentation-writing/scenarios/level-2/schema-definitions.yaml +41 -0
- package/courses/api-documentation-writing/scenarios/level-2/swagger-redoc-rendering.yaml +43 -0
- package/courses/api-documentation-writing/scenarios/level-2/validation-documentation.yaml +47 -0
- package/courses/api-documentation-writing/scenarios/level-2/versioning-changelog.yaml +42 -0
- package/courses/api-documentation-writing/scenarios/level-3/advanced-documentation-shift.yaml +43 -0
- package/courses/api-documentation-writing/scenarios/level-3/api-style-guide.yaml +40 -0
- package/courses/api-documentation-writing/scenarios/level-3/code-samples-multilang.yaml +40 -0
- package/courses/api-documentation-writing/scenarios/level-3/content-architecture.yaml +47 -0
- package/courses/api-documentation-writing/scenarios/level-3/deprecation-communication.yaml +44 -0
- package/courses/api-documentation-writing/scenarios/level-3/interactive-api-explorer.yaml +42 -0
- package/courses/api-documentation-writing/scenarios/level-3/migration-guides.yaml +42 -0
- package/courses/api-documentation-writing/scenarios/level-3/sdk-documentation.yaml +40 -0
- package/courses/api-documentation-writing/scenarios/level-3/webhook-documentation.yaml +48 -0
- package/courses/api-documentation-writing/scenarios/level-3/websocket-sse-docs.yaml +47 -0
- package/courses/api-documentation-writing/scenarios/level-4/api-changelog-management.yaml +44 -0
- package/courses/api-documentation-writing/scenarios/level-4/api-governance-standards.yaml +41 -0
- package/courses/api-documentation-writing/scenarios/level-4/api-product-strategy.yaml +41 -0
- package/courses/api-documentation-writing/scenarios/level-4/developer-portal-design.yaml +48 -0
- package/courses/api-documentation-writing/scenarios/level-4/docs-as-code.yaml +41 -0
- package/courses/api-documentation-writing/scenarios/level-4/documentation-localization.yaml +46 -0
- package/courses/api-documentation-writing/scenarios/level-4/documentation-metrics.yaml +45 -0
- package/courses/api-documentation-writing/scenarios/level-4/documentation-testing.yaml +41 -0
- package/courses/api-documentation-writing/scenarios/level-4/expert-documentation-shift.yaml +45 -0
- package/courses/api-documentation-writing/scenarios/level-4/multi-audience-docs.yaml +46 -0
- package/courses/api-documentation-writing/scenarios/level-5/ai-powered-documentation.yaml +44 -0
- package/courses/api-documentation-writing/scenarios/level-5/api-first-documentation.yaml +45 -0
- package/courses/api-documentation-writing/scenarios/level-5/api-marketplace-docs.yaml +42 -0
- package/courses/api-documentation-writing/scenarios/level-5/board-api-strategy.yaml +48 -0
- package/courses/api-documentation-writing/scenarios/level-5/documentation-program-strategy.yaml +42 -0
- package/courses/api-documentation-writing/scenarios/level-5/documentation-team-structure.yaml +47 -0
- package/courses/api-documentation-writing/scenarios/level-5/dx-competitive-advantage.yaml +46 -0
- package/courses/api-documentation-writing/scenarios/level-5/ecosystem-documentation.yaml +45 -0
- package/courses/api-documentation-writing/scenarios/level-5/industry-documentation-patterns.yaml +46 -0
- package/courses/api-documentation-writing/scenarios/level-5/master-documentation-shift.yaml +46 -0
- package/courses/code-review-feedback-writing/course.yaml +12 -0
- package/courses/code-review-feedback-writing/scenarios/level-1/approve-vs-request-changes.yaml +48 -0
- package/courses/code-review-feedback-writing/scenarios/level-1/asking-questions.yaml +50 -0
- package/courses/code-review-feedback-writing/scenarios/level-1/clear-comment-writing.yaml +45 -0
- package/courses/code-review-feedback-writing/scenarios/level-1/constructive-tone.yaml +43 -0
- package/courses/code-review-feedback-writing/scenarios/level-1/first-review-shift.yaml +46 -0
- package/courses/code-review-feedback-writing/scenarios/level-1/giving-praise.yaml +44 -0
- package/courses/code-review-feedback-writing/scenarios/level-1/nitpick-etiquette.yaml +44 -0
- package/courses/code-review-feedback-writing/scenarios/level-1/providing-context.yaml +46 -0
- package/courses/code-review-feedback-writing/scenarios/level-1/reviewing-small-prs.yaml +43 -0
- package/courses/code-review-feedback-writing/scenarios/level-1/style-vs-logic.yaml +48 -0
- package/courses/code-review-feedback-writing/scenarios/level-2/architectural-feedback.yaml +52 -0
- package/courses/code-review-feedback-writing/scenarios/level-2/intermediate-review-shift.yaml +46 -0
- package/courses/code-review-feedback-writing/scenarios/level-2/performance-feedback.yaml +50 -0
- package/courses/code-review-feedback-writing/scenarios/level-2/reviewing-breaking-changes.yaml +44 -0
- package/courses/code-review-feedback-writing/scenarios/level-2/reviewing-complex-prs.yaml +43 -0
- package/courses/code-review-feedback-writing/scenarios/level-2/reviewing-documentation.yaml +47 -0
- package/courses/code-review-feedback-writing/scenarios/level-2/reviewing-error-handling.yaml +50 -0
- package/courses/code-review-feedback-writing/scenarios/level-2/reviewing-tests.yaml +53 -0
- package/courses/code-review-feedback-writing/scenarios/level-2/security-review-comments.yaml +50 -0
- package/courses/code-review-feedback-writing/scenarios/level-2/suggesting-alternatives.yaml +42 -0
- package/courses/code-review-feedback-writing/scenarios/level-3/cross-team-review.yaml +45 -0
- package/courses/code-review-feedback-writing/scenarios/level-3/mentoring-through-review.yaml +46 -0
- package/courses/code-review-feedback-writing/scenarios/level-3/reviewing-unfamiliar-code.yaml +43 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-1/first-debugging-shift.yaml +66 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-1/hcl-syntax-errors.yaml +65 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-1/plan-output-reading.yaml +71 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-1/provider-configuration.yaml +62 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-1/resource-creation-failures.yaml +54 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-1/resource-references.yaml +70 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-1/state-file-basics.yaml +73 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-1/terraform-fmt-validate.yaml +58 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-1/variable-and-output-errors.yaml +78 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-2/count-vs-for-each.yaml +58 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-2/dependency-management.yaml +80 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-2/intermediate-debugging-shift.yaml +66 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-2/lifecycle-rules.yaml +51 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-2/locals-and-expressions.yaml +58 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-2/module-structure.yaml +75 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-2/provisioner-pitfalls.yaml +64 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-2/remote-state-backend.yaml +55 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-2/terraform-import.yaml +55 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-2/workspace-management.yaml +51 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-3/advanced-debugging-shift.yaml +63 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-3/api-rate-limiting.yaml +50 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-3/conditional-resources.yaml +66 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-3/drift-detection.yaml +66 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-3/dynamic-blocks.yaml +71 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-3/large-scale-refactoring.yaml +59 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-3/multi-provider-config.yaml +69 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-3/state-surgery.yaml +57 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-3/terraform-cloud-enterprise.yaml +59 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-3/terraform-debugging.yaml +51 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-4/blast-radius-management.yaml +51 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-4/cicd-pipeline-design.yaml +50 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-4/compliance-as-code.yaml +46 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-4/cost-estimation-governance.yaml +42 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-4/expert-debugging-shift.yaml +51 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-4/iac-organization-strategy.yaml +45 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-4/incident-response-iac.yaml +47 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-4/infrastructure-testing.yaml +41 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-4/module-registry-design.yaml +45 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-4/multi-account-strategy.yaml +57 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-5/board-infrastructure-investment.yaml +53 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-5/disaster-recovery-iac.yaml +47 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-5/enterprise-iac-transformation.yaml +48 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-5/iac-technology-evolution.yaml +49 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-5/ma-infrastructure-consolidation.yaml +54 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-5/master-debugging-shift.yaml +53 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-5/multi-cloud-strategy.yaml +49 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-5/platform-engineering.yaml +47 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-5/regulatory-compliance-automation.yaml +47 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-5/terraform-vs-alternatives.yaml +46 -0
- package/dist/cli/commands/generate.d.ts.map +1 -1
- package/dist/cli/commands/generate.js +2 -1
- package/dist/cli/commands/generate.js.map +1 -1
- package/dist/cli/commands/train.d.ts.map +1 -1
- package/dist/cli/commands/train.js +6 -3
- package/dist/cli/commands/train.js.map +1 -1
- package/dist/cli/index.js +9 -6
- package/dist/cli/index.js.map +1 -1
- package/dist/cli/run-demo.js +3 -2
- package/dist/cli/run-demo.js.map +1 -1
- package/dist/engine/model-utils.d.ts +6 -0
- package/dist/engine/model-utils.d.ts.map +1 -1
- package/dist/engine/model-utils.js +28 -1
- package/dist/engine/model-utils.js.map +1 -1
- package/dist/engine/training.d.ts.map +1 -1
- package/dist/engine/training.js +4 -3
- package/dist/engine/training.js.map +1 -1
- package/dist/generator/course-generator.d.ts.map +1 -1
- package/dist/generator/course-generator.js +4 -3
- package/dist/generator/course-generator.js.map +1 -1
- package/dist/mcp/server.d.ts.map +1 -1
- package/dist/mcp/server.js +7 -3
- package/dist/mcp/server.js.map +1 -1
- package/dist/mcp/session-manager.d.ts.map +1 -1
- package/dist/mcp/session-manager.js +3 -2
- package/dist/mcp/session-manager.js.map +1 -1
- package/package.json +3 -2
package/courses/code-review-feedback-writing/scenarios/level-1/approve-vs-request-changes.yaml
ADDED
@@ -0,0 +1,48 @@
+meta:
+  id: approve-vs-request-changes
+  level: 1
+  course: code-review-feedback-writing
+  type: output
+  description: "Make review decisions — learn when to approve, approve with comments, request changes, or block a PR based on the issues found"
+  tags: [code-review, approval, request-changes, decision, blocking, beginner]
+
+state: {}
+
+trigger: |
+  You need to make a review decision for 5 different PRs. For each,
+  decide: Approve, Approve with Comments, Request Changes, or Block.
+
+  PR #1: Feature works correctly, has tests, but variable names could
+  be more descriptive. No bugs found.
+
+  PR #2: Adds a new API endpoint. Logic is correct but there's no
+  input validation — any string is accepted for the email field.
+
+  PR #3: Fixes a critical production bug (users seeing other users'
+  data). The fix is correct but the code style is inconsistent with
+  the rest of the file.
+
+  PR #4: Implements a payment flow. You notice the Stripe API key is
+  hardcoded in the source file instead of using environment variables.
+
+  PR #5: Large refactor that changes the authentication middleware.
+  You don't understand the approach and you're not sure if it handles
+  edge cases. No tests were updated.
+
+  Task: For each PR, state your decision and write the review comment
+  explaining it. Then write a general decision framework that other
+  reviewers can follow.
+
+assertions:
+  - type: llm_judge
+    criteria: "Decisions match the severity of issues — PR #1: Approve with comments (naming is a nit, not a blocker — 'Approved! Consider renaming x to orderTotal for clarity, but not blocking on this.'). PR #2: Request changes (missing input validation is a real bug — email validation is a security/data integrity concern). PR #3: Approve (critical production fix shouldn't be blocked by style — 'Approving to unblock the fix. Let's follow up with a style cleanup in a separate PR.'). PR #4: Request changes or Block (hardcoded API key is a security incident waiting to happen — this MUST be fixed before merge). PR #5: Request changes (lack of tests for auth middleware is risky, and the reviewer's uncertainty means the PR needs more explanation or a walkthrough). Each decision is calibrated to risk, not personal preference"
+    weight: 0.35
+    description: "Calibrated decisions"
+  - type: llm_judge
+    criteria: "Review comments explain the reasoning — each comment: states the decision upfront, explains what was reviewed, highlights what's good, explains what needs to change and why. PR #3 comment acknowledges urgency: 'I see this fixes the data leak issue — approving to get this out ASAP. The style inconsistency is minor and can be cleaned up separately.' PR #4 comment explains severity: 'The Stripe API key on line 42 would be committed to Git history. Even if we change it later, the key is permanently exposed. This needs to use process.env.STRIPE_SECRET_KEY instead.' PR #5 comment admits uncertainty: 'I'm not confident I understand the new auth flow — could you add tests for the edge cases or walk me through it?'"
+    weight: 0.35
+    description: "Explained reasoning"
+  - type: llm_judge
+    criteria: "Decision framework provides clear criteria — framework: Approve: no bugs, no security issues, any remaining comments are optional. Approve with Comments: no blocking issues, comments are suggestions/nits that won't cause problems if ignored. Request Changes: bugs that must be fixed, missing validation, missing tests for critical paths, security vulnerabilities. Block (rare): critical security issue (exposed secrets, auth bypass), data loss risk, fundamentally wrong approach requiring redesign. Rules: never block on style alone, never approve known security issues, urgency can shift the threshold (critical fix > style consistency), admit uncertainty rather than blindly approving. Anti-patterns: approving everything to be 'nice' (unsafe), blocking everything to be 'thorough' (slows team down)"
+    weight: 0.30
+    description: "Decision framework"
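The PR #4 criterion above prescribes reading the Stripe key from `process.env.STRIPE_SECRET_KEY` instead of hardcoding it. A minimal sketch of that pattern — the function name and error message are illustrative, not from the package:

```javascript
// Sketch of the PR #4 fix: read the secret from the environment so it
// never lands in source files or Git history. The env object is injected
// for testability; a real app would pass process.env.
function getStripeKey(env) {
  const key = env.STRIPE_SECRET_KEY;
  if (!key) {
    // Fail fast at startup rather than at the first payment attempt.
    throw new Error("STRIPE_SECRET_KEY is not set");
  }
  return key;
}
```

Failing fast when the variable is missing turns a latent production bug into an obvious deploy-time error.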
package/courses/code-review-feedback-writing/scenarios/level-1/asking-questions.yaml
ADDED
@@ -0,0 +1,50 @@
+meta:
+  id: asking-questions
+  level: 1
+  course: code-review-feedback-writing
+  type: output
+  description: "Ask effective questions — use questions in code reviews to understand context, surface hidden assumptions, and guide the author to better solutions"
+  tags: [code-review, questions, understanding, socratic, curiosity, beginner]
+
+state: {}
+
+trigger: |
+  You're reviewing code and encounter things you don't fully understand.
+  Instead of assuming the author made a mistake, you want to ask
+  questions. But questions in code reviews can go wrong:
+
+  Bad questions (disguised demands):
+  - "Why didn't you use a Map here?" (implies they should have)
+  - "Shouldn't this be async?" (tells, doesn't ask)
+  - "Don't you think this is too complex?" (rhetorical)
+
+  Good questions (genuine curiosity):
+  - "What's the expected behavior when user is null?"
+  - "I'm curious about the choice to use recursion here — was there
+    a specific reason over iteration?"
+  - "Could you help me understand the business rule behind this check?"
+
+  Code you're reviewing has:
+  1. A custom retry mechanism instead of using the existing library
+  2. A database query that seems to fetch more data than needed
+  3. A complex regex with no comment explaining the pattern
+  4. A try/catch that catches all exceptions with the same handler
+  5. An unusual data structure choice (array of arrays instead of Map)
+
+  Task: Write genuine questions for each of the 5 situations. Then
+  write a guide on when to ask questions vs make statements in code
+  reviews.
+
+assertions:
+  - type: llm_judge
+    criteria: "Questions are genuine, not passive-aggressive — each question: (1) Custom retry: 'I noticed we have a retry utility in lib/retry.ts — was there a specific requirement that the existing library didn't handle? I'm wondering if we could reuse it here.' Not: 'Why are you reinventing the wheel?' (2) Over-fetching: 'This query fetches all user fields but I only see email used below — is there a reason to fetch the full record? (Maybe for caching or a future use?)' Not: 'You're fetching too much data.' (3) Regex: 'Could you add a comment explaining what this regex matches? I'm having trouble parsing it.' Not: 'This regex is unreadable.' (4) Catch-all: 'What should happen if this throws a NetworkError vs a ValidationError? Right now both are handled the same way.' (5) Array of arrays: 'I'm curious about the array-of-arrays structure — was there a performance consideration? I was thinking a Map might work here but you may know something I don't.'"
+    weight: 0.35
+    description: "Genuine questions"
+  - type: llm_judge
+    criteria: "Questions open dialogue and surface context — each question acknowledges that the author may have a good reason ('was there a specific requirement', 'you may know something I don't', 'maybe for caching'). Questions invite explanation, not defense. The reviewer shows humility: 'I'm having trouble parsing it' (the problem is mine, not yours). Questions are specific to the code, not generic. Each question includes what the reviewer observed and what they expected, creating a productive gap for discussion. Questions that reveal assumptions: 'What happens when...' (reveals the reviewer's mental model for the author to correct or confirm)"
+    weight: 0.35
+    description: "Opens dialogue"
+  - type: llm_judge
+    criteria: "Guide distinguishes questions from statements effectively — use questions when: you genuinely don't understand something, the author may have context you lack, there are multiple valid approaches, you want the author to think through implications ('What happens if this list is empty?'). Use statements when: there's a clear bug (null pointer, off-by-one), security vulnerability (no ambiguity — this must be fixed), style guide violation (team has agreed on this). Never use: rhetorical questions ('Don't you think...?'), questions that are really demands ('Shouldn't this be...?'), questions when you already know the answer (condescending). Technique: Socratic questioning — ask questions that guide the author to discover the issue themselves, which builds understanding better than being told"
+    weight: 0.30
+    description: "Questions vs statements guide"
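Item 4 in the scenario above (a try/catch with one handler for every exception) hinges on the question "What should happen if this throws a NetworkError vs a ValidationError?" A minimal sketch of differentiated handling — the error class names and return values are hypothetical, not from the scenario's codebase:

```javascript
// Sketch of the catch-all question's implied fix: route different error
// types to different recovery strategies instead of one blanket handler.
class NetworkError extends Error {}
class ValidationError extends Error {}

function handleSaveError(err) {
  if (err instanceof NetworkError) return "retry";            // transient: retry is safe
  if (err instanceof ValidationError) return "show-form-error"; // the user can fix this
  throw err; // unknown errors should surface, not be silently swallowed
}
```

Rethrowing the unknown case is the key point: a catch-all that absorbs everything hides real bugs.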
package/courses/code-review-feedback-writing/scenarios/level-1/clear-comment-writing.yaml
ADDED
@@ -0,0 +1,45 @@
+meta:
+  id: clear-comment-writing
+  level: 1
+  course: code-review-feedback-writing
+  type: output
+  description: "Write clear review comments — craft code review comments that are specific, actionable, and easy to understand"
+  tags: [code-review, comments, clarity, actionable, feedback, beginner]
+
+state: {}
+
+trigger: |
+  You're reviewing a teammate's pull request. You've found several
+  issues but your comments need to be clear. Here are examples of
+  unclear comments you've seen from other reviewers:
+
+  - "This looks wrong" (what looks wrong? why?)
+  - "Fix this" (fix what? how?)
+  - "I don't like this approach" (why? what's the alternative?)
+  - "Hmm" (what does that even mean?)
+  - "nit" (with no explanation)
+
+  You've found these issues in the PR:
+  1. A variable named "d" that holds a date value
+  2. A function that's 80 lines long doing three distinct things
+  3. A SQL query built with string concatenation (injection risk)
+  4. A missing null check before accessing user.address.city
+  5. A TODO comment with no ticket reference
+
+  Task: Write a clear, specific, actionable review comment for each
+  of the 5 issues. Each comment should tell the author exactly what
+  the problem is, why it matters, and what to do about it.
+
+assertions:
+  - type: llm_judge
+    criteria: "Each comment identifies the specific problem — not vague statements like 'this could be better' but specific: (1) 'd' is ambiguous — suggest 'createdDate' or 'expirationDate' with reasoning (readability, searchability). (2) Function does X, Y, and Z — suggest extracting into three functions with proposed names. (3) String concatenation in SQL is a security vulnerability (SQL injection) — suggest parameterized queries with example. (4) If user.address is null, accessing .city throws TypeError — suggest optional chaining or null check. (5) TODO without context becomes permanent dead code — link to a ticket or remove. Each comment references the specific line/code, not just a general observation"
+    weight: 0.35
+    description: "Specific problems"
+  - type: llm_judge
+    criteria: "Comments explain WHY, not just WHAT — variable naming: 'd' is hard to search for and requires reading surrounding code to understand. Long function: harder to test, harder to reuse, higher cognitive load. SQL injection: attacker could drop tables or exfiltrate data, this is OWASP top 10. Null check: production crash when user hasn't entered address (real user scenario). Each comment includes the impact of not fixing the issue — what bad thing happens? Comments prioritize severity: SQL injection is blocking (must fix), variable naming is a nit (nice to fix). The reviewer demonstrates they understand the code's context and purpose"
+    weight: 0.35
+    description: "Explains why"
+  - type: llm_judge
+    criteria: "Comments suggest specific solutions — not just 'fix this' but concrete suggestions: rename 'd' to 'orderDate' (or whatever the context suggests). Extract validation logic into validateOrder(), extraction logic into extractItems(), notification logic into notifyCustomer(). Replace string concatenation with parameterized query (show the corrected code snippet). Add 'user?.address?.city' or 'if (user.address) { ... }'. Either link the TODO to JIRA-1234 or remove it if the task is done. Each suggestion is implementable — the author could make the change based solely on the comment. Where multiple solutions exist, briefly mention alternatives"
+    weight: 0.30
+    description: "Suggests solutions"
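The criteria above ask the reviewer to show a parameterized query and an optional-chaining null check. A minimal sketch of both fixes, assuming a node-postgres-style `query(text, values)` interface; only the statement object is built here so the shape is testable without a database, and all names are illustrative:

```javascript
// Issue 3 fix: a parameterized query keeps user input out of the SQL text,
// so "'; DROP TABLE users; --" is just a string value, never executable SQL.
function findUserByEmail(email) {
  return {
    text: "SELECT id, email FROM users WHERE email = $1",
    values: [email], // the driver binds this; it never touches the SQL text
  };
}

// Issue 4 fix: optional chaining avoids the TypeError when user.address
// is missing, with a nullish-coalescing fallback for display.
function cityOf(user) {
  return user?.address?.city ?? "unknown";
}
```

Contrast with the vulnerable version the criteria flag: `"SELECT ... WHERE email = '" + email + "'"`, where the input becomes part of the statement itself.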
package/courses/code-review-feedback-writing/scenarios/level-1/constructive-tone.yaml
ADDED
@@ -0,0 +1,43 @@
+meta:
+  id: constructive-tone
+  level: 1
+  course: code-review-feedback-writing
+  type: output
+  description: "Use constructive tone — rewrite harsh or demanding code review comments into collaborative, respectful feedback that still communicates the issue clearly"
+  tags: [code-review, tone, constructive, collaborative, respect, beginner]
+
+state: {}
+
+trigger: |
+  A senior engineer on your team leaves harsh code review comments.
+  The team is frustrated and some are avoiding submitting PRs. Here
+  are real comments from recent reviews:
+
+  1. "Why would you ever do it this way? This is terrible."
+  2. "You clearly don't understand how async works."
+  3. "This code is spaghetti. Rewrite the whole thing."
+  4. "Did you even test this? It obviously doesn't work."
+  5. "I already told you not to use var. Do you not read my comments?"
+  6. "This is a junior mistake. Use a Map instead of nested loops."
+
+  Each comment has a valid technical point buried under the harsh tone.
+
+  Task: Rewrite all 6 comments to maintain the technical feedback
+  while using a constructive, respectful tone. For each, show the
+  original, explain what's wrong with the tone, and provide the
+  rewritten version. Then write a brief guide on code review tone
+  best practices.
+
+assertions:
+  - type: llm_judge
+    criteria: "Rewrites preserve the technical point — every rewrite still communicates the actual issue: (1) the approach has problems → suggest the better approach with reasoning. (2) async usage is incorrect → explain the specific async issue and how to fix it. (3) code needs restructuring → identify specific tangled parts and suggest refactoring steps. (4) there's a bug → describe the expected vs actual behavior. (5) var should be const/let → explain why with example. (6) Map is more efficient than nested loops → explain time complexity difference. Technical substance is not diluted — the feedback is still direct and clear, just not hostile"
+    weight: 0.35
+    description: "Preserves technical point"
+  - type: llm_judge
+    criteria: "Tone shifts from attacking person to discussing code — patterns used: question instead of accuse ('Have you considered using X?' vs 'You should use X'), focus on code not person ('This function could be simplified' vs 'You wrote confusing code'), assume positive intent ('I think the intent here is X — would Y achieve that more cleanly?'), offer help ('Happy to pair on this if it would help'), use 'we' language ('We generally prefer const/let in our codebase'). No sarcasm, no condescension, no 'obviously' or 'clearly' or 'just'. Each rewrite explains what was wrong with the original tone — personal attack, assumption of incompetence, frustration directed at person"
+    weight: 0.35
+    description: "Respectful tone"
+  - type: llm_judge
+    criteria: "Tone guide provides actionable principles — principles: (1) critique code, not people (say 'this function' not 'you'), (2) ask questions before assuming mistakes ('Is there a reason this uses var?' allows for intentional choices), (3) phrase suggestions not demands ('Consider using...' vs 'Use...'), (4) acknowledge good work alongside critique, (5) be explicit about severity (prefix with 'nit:' for style, 'blocking:' for must-fix), (6) remember there's a human reading your comment (re-read before posting). Anti-patterns to avoid: 'obviously', 'just', 'simply' (implies it should be easy), exclamation marks on criticism, ALL CAPS. Real impact: harsh reviews slow teams down — developers delay PRs, avoid asking for review, or leave the team"
+    weight: 0.30
+    description: "Tone guide"
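Comment 6 above ("Use a Map instead of nested loops") rests on a time-complexity difference the criteria ask the rewrite to explain: indexing one list in a Map turns an O(n*m) nested scan into O(n+m). A minimal sketch with illustrative data shapes (the `user`/`order` fields are hypothetical):

```javascript
// Joining orders to users by id. Building a Map index is O(n); each order
// lookup is then O(1), so the whole join is O(n+m) instead of O(n*m).
function joinOrdersToUsers(users, orders) {
  const byId = new Map(users.map((u) => [u.id, u])); // O(n) index build
  return orders.map((o) => ({ ...o, user: byId.get(o.userId) })); // O(m) lookups
}
```

This framing also models the tone fix: explaining *why* the Map helps is the constructive version of "this is a junior mistake."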
@@ -0,0 +1,46 @@
+meta:
+  id: first-review-shift
+  level: 1
+  course: code-review-feedback-writing
+  type: output
+  description: "Combined beginner shift — write a complete code review for a medium-sized PR covering clarity, tone, severity categorization, and approval decisions"
+  tags: [code-review, combined, shift-simulation, complete-review, beginner]
+
+state: {}
+
+trigger: |
+  You have a PR to review: "Add user profile settings page"
+  Author: A mid-level developer, 6 months on the team.
+  PR: 150 lines across 4 files.
+
+  Issues you've found (in order of discovery):
+  1. Profile photo upload accepts any file type (no validation)
+  2. The component is named "Settings" — too generic
+  3. Email validation uses a simple regex that rejects valid emails
+  4. Missing semicolons on 3 lines
+  5. The save button doesn't disable while submitting (double-submit risk)
+  6. Good: error states are handled with clear user-facing messages
+  7. A console.log left in from debugging
+  8. The API call doesn't handle network errors (just catches and ignores)
+  9. Good: the form resets to saved values on cancel (nice UX touch)
+  10. Test file only covers the happy path — no error case tests
+
+  Task: Write the complete review including: a summary comment,
+  inline comments for each issue (properly categorized and toned),
+  praise for the good things, and your approve/request-changes
+  decision. This should demonstrate everything a good code review
+  includes.
+
+assertions:
+  - type: llm_judge
+    criteria: "Summary comment sets the right tone and overview — opens with acknowledgment: 'Good work on the profile settings page — the error handling UX is really nice and the cancel/reset behavior is a great touch.' Then overview: 'I found a few issues to address before merge — the main ones are file type validation on photo upload and network error handling. I also left some smaller suggestions.' Severity summary: 'X blocking items, Y suggestions, Z nits.' The author immediately knows: the review is constructive, there are specific things to fix, and the reviewer recognized good work"
+    weight: 0.35
+    description: "Summary comment"
+  - type: llm_judge
+    criteria: "Inline comments are properly categorized and toned — blocking: file type validation (#1, security), email regex (#3, correctness), error swallowing (#8, reliability), missing error tests (#10, quality). Suggestion: generic component name (#2), double-submit (#5), console.log (#7). Nit: semicolons (#4, should be automated). Praise: error states (#6), cancel behavior (#9). Each blocking comment explains the risk clearly. Each suggestion uses 'consider' or 'would you mind' language. Nits are explicitly marked non-blocking. Praise is specific ('the toast messages on save failure give users actionable guidance — nice touch'). Comments flow from most to least important"
+    weight: 0.35
+    description: "Categorized inline comments"
+  - type: llm_judge
+    criteria: "Decision is appropriate with clear next steps — request changes: 4 blocking items need to be addressed (file type validation, email regex, error handling, error tests). Next steps are clear: 'Please address the blocking items. The suggestions and nits are up to you — happy to re-review quickly once the blocking items are fixed.' The review is actionable: the author knows exactly what must change, what's optional, and when to request re-review. Tone throughout: collaborative, respectful, specific. The author should feel supported, not attacked. Re-review expectation: 'This is close — once the validation and error handling are addressed, I think it's ready to ship.'"
+    weight: 0.30
+    description: "Decision and next steps"
@@ -0,0 +1,44 @@
+meta:
+  id: giving-praise
+  level: 1
+  course: code-review-feedback-writing
+  type: output
+  description: "Give meaningful praise — write positive review comments that reinforce good practices and make code review a positive experience"
+  tags: [code-review, praise, positive-feedback, reinforcement, motivation, beginner]
+
+state: {}
+
+trigger: |
+  Your team's code reviews are 100% criticism — reviewers only comment
+  when something is wrong. This creates anxiety around PRs and makes
+  developers dread reviews. You want to model positive feedback.
+
+  You're reviewing a PR that implements a new caching layer. You notice
+  several things done well:
+
+  1. The cache invalidation strategy handles edge cases gracefully
+  2. The author wrote comprehensive tests including race conditions
+  3. Complex business logic is broken into small, well-named functions
+  4. The PR description clearly explains the problem, approach, and
+     trade-offs considered
+  5. Error handling includes specific error types with helpful messages
+  6. The author proactively added a monitoring dashboard for cache
+     hit rates
+
+  Task: Write specific praise comments for each of the 6 positive
+  observations. Then write a guide on how to incorporate positive
+  feedback into code reviews without it feeling hollow or performative.
+
+assertions:
+  - type: llm_judge
+    criteria: "Praise is specific, not generic — each comment references the exact thing that's good and why it matters. Not: 'Nice work!' or 'LGTM!' Instead: (1) 'The cache invalidation handling for concurrent writes (lines 45-60) is really well thought out — this prevents the stale data issue we had in the orders service.' (2) 'Love the race condition tests — especially the test at line 120 that simulates concurrent cache reads during invalidation. This catches exactly the kind of bug that's hard to find in production.' (3) 'The function decomposition here is excellent — readProcessingQueue(), validateCacheEntry(), and evictStaleEntries() each do one thing and are easy to test independently.' Each praise explains WHAT is good and WHY it matters"
+    weight: 0.35
+    description: "Specific praise"
+  - type: llm_judge
+    criteria: "Praise reinforces patterns worth repeating — the praise comments teach: (1) cache invalidation: 'This is exactly how we should handle invalidation — I'd love to see this pattern adopted in other services.' (2) Testing: 'These race condition tests are a great example of testing what matters, not just chasing coverage.' (3) Function naming: points out that these names make the code self-documenting. (4) PR description: 'The trade-off analysis in the PR description is really helpful — it makes the review much easier when I understand why alternatives were rejected.' Each praise implicitly sets an example for other developers. Praise isn't just about making the author feel good — it's about amplifying best practices"
+    weight: 0.35
+    description: "Reinforces patterns"
+  - type: llm_judge
+    criteria: "Guide makes praise practical and authentic — principles: (1) be specific (point to exact lines/functions, not 'good job'), (2) explain impact ('this will prevent production incidents' is more meaningful than 'nice'), (3) praise effort on hard problems (cache invalidation IS hard — acknowledge it), (4) don't force it (only praise what genuinely impresses you), (5) ratio: aim for at least one positive comment per review (not necessarily equal to negatives). When praise feels fake: it's because it's vague ('looks good!') or obligatory ('starting with something positive'). Authentic praise comes from recognizing genuine skill, effort, or thoughtfulness. Timing: start the review with what you noticed that's good, then move to improvements — sets collaborative tone. Impact: teams that balance positive and constructive feedback have faster PR turnaround and lower developer turnover"
+    weight: 0.30
+    description: "Practical praise guide"
@@ -0,0 +1,44 @@
+meta:
+  id: nitpick-etiquette
+  level: 1
+  course: code-review-feedback-writing
+  type: output
+  description: "Handle nitpicks properly — learn when to raise minor style issues, how to label them, and when to let them go in favor of automated tools"
+  tags: [code-review, nitpicks, style, etiquette, automation, beginner]
+
+state: {}
+
+trigger: |
+  You're reviewing a PR that correctly implements a complex caching
+  layer. The logic is sound, tests are comprehensive, and it solves
+  a real performance problem. But you've noticed 8 minor style issues:
+
+  1. Inconsistent quotes (mix of single and double quotes)
+  2. One function is 32 lines (team guideline is 30 max)
+  3. A variable named "cacheTimeout" when team uses "cache_timeout" style
+  4. Missing JSDoc on one public method
+  5. An import statement that could be shorter with a barrel export
+  6. A comment with a typo ("retrive" instead of "retrieve")
+  7. Extra blank line between two functions
+  8. Using "!= null" instead of "!== null"
+
+  The author spent a week on this feature and is eager to ship it.
+
+  Task: Decide which of these 8 items to raise and which to skip.
+  For items you raise, write the comment with a proper "nit:" prefix
+  and make clear it's non-blocking. Explain your decision criteria.
+  Also propose which of these could be automated away.
+
+assertions:
+  - type: llm_judge
+    criteria: "Decisions about what to raise are well-reasoned — raise with nit prefix: naming convention (#3, consistency matters for team), typo (#6, quick fix), strict equality (#8, potential bug source). Skip or mention once: inconsistent quotes (#1, automate with Prettier), 32 vs 30 lines (#2, not worth blocking a complex feature), missing JSDoc (#4, can be follow-up), barrel import (#5, pure preference), blank line (#7, automate with formatter). Key principle: don't bury important feedback under a mountain of nits. If you raise 8 nits, the author sees 8 problems and feels demoralized despite doing great architectural work. Raise 2-3 meaningful nits, skip the rest"
+    weight: 0.35
+    description: "Selective raising"
+  - type: llm_judge
+    criteria: "Nit comments are properly labeled and non-blocking — each nit is prefixed with 'nit:' and explicitly marked non-blocking: 'nit: Our codebase uses snake_case for variables — would you mind renaming cacheTimeout to cache_timeout? Non-blocking, feel free to address in a follow-up.' The review approval is not held up by nits — approve the PR, mention nits as suggestions. Acknowledge the good work FIRST: 'Great implementation of the caching layer — the eviction strategy is really clean. A few minor style nits below, but this is ready to merge.' Nits never use demanding language ('fix this'), always use collaborative language ('would you mind', 'consider changing')"
+    weight: 0.35
+    description: "Non-blocking nits"
+  - type: llm_judge
+    criteria: "Automation recommendations are practical — automate these: quotes consistency (Prettier/ESLint), blank lines (Prettier), strict equality (ESLint strict-equality rule), function length (ESLint max-lines-per-function), import style (ESLint import rules). Cannot automate: naming convention consistency (requires understanding context), JSDoc completeness (tools exist but often noisy), typos (spell checkers for code exist but have false positives). Recommendation: invest 30 minutes setting up ESLint/Prettier to eliminate 5 of these 8 nits forever. Every nit you automate is a nit you never have to write again. The best code review comment is one you never had to make because a tool caught it first"
+    weight: 0.30
+    description: "Automation recommendations"
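The automation assertion above names concrete ESLint rules. A minimal sketch of what that setup could look like, assuming a flat-config ESLint project (the exact rule set is an illustrative choice, not part of the scenario):

```typescript
// eslint.config.ts — illustrative only; the rule names are standard ESLint rules.
export default [
  {
    rules: {
      // Nit #8: forbid loose equality so "!= null" style issues never reach human review
      eqeqeq: ["error", "always"],
      // Nit #2: encode the 30-line team guideline instead of policing it by hand
      "max-lines-per-function": ["warn", { max: 30, skipBlankLines: true }],
      // Dead-code nits: unused imports and variables get flagged automatically
      "no-unused-vars": "warn",
    },
  },
];
// Quote style (#1) and blank lines (#7) are better delegated to Prettier.
```

With this in place, the reviewer's remaining nits shrink to the ones that genuinely need human judgment (#3 naming, #6 typo).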
@@ -0,0 +1,46 @@
+meta:
+  id: providing-context
+  level: 1
+  course: code-review-feedback-writing
+  type: output
+  description: "Provide context in comments — write review comments that include enough background for the author to understand the issue without additional back-and-forth"
+  tags: [code-review, context, explanation, self-contained, beginner]
+
+state: {}
+
+trigger: |
+  You review a lot of code and notice a pattern: your comments
+  generate multiple rounds of "What do you mean?" replies. Examples
+  of your low-context comments:
+
+  - "This won't scale" (at what load? why?)
+  - "Use the helper for this" (which helper? where?)
+  - "This conflicts with the RFC" (which RFC? which section?)
+  - "We had a bug like this before" (when? what happened?)
+  - "This pattern is discouraged" (by whom? documented where?)
+
+  You're reviewing a PR that adds a new user search endpoint. Issues:
+  1. The search uses LIKE '%term%' on a 10M row table
+  2. The response doesn't include pagination
+  3. The endpoint path is GET /searchUsers instead of GET /users/search
+  4. A race condition exists between checking and updating user count
+  5. The error response format differs from all other endpoints
+
+  Task: Write high-context review comments for each issue. Each
+  comment should be self-contained — the author shouldn't need to
+  ask follow-up questions or look anything up to understand the
+  comment and act on it.
+
+assertions:
+  - type: llm_judge
+    criteria: "Each comment provides complete context — (1) LIKE query: explain that '%term%' prevents index usage on a 10M row table, expected response time will be 5-10 seconds, suggest full-text search index or Elasticsearch, link to your team's search pattern documentation if applicable. (2) Pagination: 'Our API standard returns paginated responses for list endpoints (see /api/v1/orders as an example). Without pagination, a search returning 50K results would timeout and consume excessive memory. Suggest cursor-based pagination matching our existing pattern.' (3) URL path: 'Our API follows RESTful conventions — resources as nouns, actions via HTTP methods. GET /users/search follows our existing pattern (see GET /products/search). The style guide is at [link].' (4) Race condition: explain the specific sequence of operations that causes the race, show which concurrent requests trigger it, suggest the fix (transaction, atomic operation, or optimistic locking). (5) Error format: show the standard format with example, point to the shared error utility function"
+    weight: 0.35
+    description: "Complete context"
+  - type: llm_judge
+    criteria: "Comments include references and evidence — each comment links to relevant documentation, code examples, or precedent. 'We use cursor-based pagination — see OrdersController.ts:45 for the pattern.' 'The REST naming convention is documented in our style guide: [URL].' 'We had a similar LIKE query performance issue in PR #1234 — the fix was adding a GIN index.' Evidence makes comments authoritative, not just opinion. Numbers where applicable: '10M rows means this query will do a full table scan taking ~8 seconds.' Specific file paths, function names, and line numbers rather than vague references to 'the existing pattern'"
+    weight: 0.35
+    description: "References and evidence"
+  - type: llm_judge
+    criteria: "Comments eliminate need for follow-up — each comment answers: what's the problem, why it matters, what's the fix, and where to look for examples. The author can implement the fix without asking a single follow-up question. Test: read each comment as if you're a new developer on the team — could you fix the issue based only on this comment? If not, more context is needed. Include: before/after code snippets where helpful, links to relevant documentation, similar patterns in the codebase. The goal: one round of review, not three rounds of clarification. Time investment: spending 2 extra minutes writing a high-context comment saves 30 minutes of back-and-forth"
+    weight: 0.30
+    description: "Eliminates follow-up"
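The race condition in issue #4 (check-then-update on the user count) is exactly the kind of thing a high-context comment can demonstrate inline. A minimal sketch of the interleaving and the fix, using an in-memory stand-in for the database (all names here are illustrative):

```typescript
type Store = { userCount: number };

// Reproduces the lost update from issue #4: both "requests" read the count
// before either writes, so two increments produce only one.
function lostUpdate(store: Store): number {
  const readA = store.userCount; // request A reads the current count
  const readB = store.userCount; // request B reads before A has written
  store.userCount = readA + 1;   // A writes
  store.userCount = readB + 1;   // B overwrites A's increment
  return store.userCount;
}

// The fix: one atomic read-modify-write per request. Against a real
// database this would be `UPDATE ... SET user_count = user_count + 1`
// (or a transaction with row locking) so the database serializes writes.
function atomicIncrement(store: Store): number {
  store.userCount += 1;
  return store.userCount;
}
```

Showing the interleaving like this is what turns "a race condition exists" into a comment the author can act on without follow-up questions.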
@@ -0,0 +1,43 @@
+meta:
+  id: reviewing-small-prs
+  level: 1
+  course: code-review-feedback-writing
+  type: output
+  description: "Review small PRs — practice writing thorough review feedback for small, focused pull requests covering bug fixes, refactors, and features"
+  tags: [code-review, small-pr, focused, thorough, practice, beginner]
+
+state: {}
+
+trigger: |
+  You have three small PRs to review today. Each is under 100 lines.
+  Your task is to write a complete review for each.
+
+  PR #1 — Bug fix (23 lines): Fixes "users can't log in with email
+  containing a plus sign (user+tag@gmail.com)". Changes the email
+  validation regex in auth.ts.
+
+  PR #2 — Refactor (67 lines): Extracts duplicate API error handling
+  into a shared utility function. Touches 4 files.
+
+  PR #3 — Feature (89 lines): Adds a "copy to clipboard" button to
+  the API key display page. New component with a 3-second "Copied!"
+  tooltip.
+
+  Task: Write a complete review for each PR. For each, include:
+  a summary comment (overall assessment), inline comments on specific
+  code, and your approve/request-changes decision with reasoning.
+  Show what a thorough review of a small PR looks like.
+
+assertions:
+  - type: llm_judge
+    criteria: "Each review has a structured summary comment — PR #1 summary: acknowledge the bug fix, confirm the regex change is correct for plus signs, ask about edge cases (other special characters? RFC 5322 compliance?), note whether tests were added. PR #2 summary: praise the deduplication effort, verify the shared utility handles all 4 usage patterns, check for behavior changes. PR #3 summary: confirm the UX (tooltip duration, accessibility), check clipboard API usage (permissions, fallback for older browsers). Each summary: starts with what the PR does well, identifies concerns, states the decision (approve/request changes). Summaries are 3-5 sentences, not essays"
+    weight: 0.35
+    description: "Structured summaries"
+  - type: llm_judge
+    criteria: "Inline comments address PR-specific concerns — PR #1 (regex): does the new regex still reject invalid emails? Is '+' the only character that was broken? Are there tests for edge cases (multiple plus signs, plus at start/end)? Was the old regex documented — why was '+' excluded originally? PR #2 (refactor): does the shared utility preserve all error handling behaviors? Are the 4 call sites truly identical or are there subtle differences? Is the function well-named and well-placed? PR #3 (feature): does the clipboard API require HTTPS? What happens when clipboard access is denied? Is the tooltip accessible (aria-label)? Does the 3-second timeout clean up properly (memory leak on rapid clicks)? Comments are specific to the actual changes, not generic"
+    weight: 0.35
+    description: "PR-specific feedback"
+  - type: llm_judge
+    criteria: "Approve/request-changes decisions are justified — each decision has clear reasoning. PR #1: if regex is correct and tested, approve with suggestion to add more edge case tests. If regex could reject valid emails, request changes. PR #2: approve if behavior is preserved, request changes if any call site's behavior changed subtly. PR #3: approve if clipboard fallback and accessibility are handled, request changes if no fallback for non-HTTPS or no accessibility support. The threshold is clear: approve means 'ready to merge, comments are optional.' Request changes means 'must address these before merge.' Don't request changes for nits. Acknowledge when you're not sure — 'I'm not confident about the regex edge cases — could you verify?' is better than either blindly approving or blocking"
+    weight: 0.30
+    description: "Decision reasoning"
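For PR #1, a thorough inline comment can show the failing case directly. A sketch of the bug class and a more permissive check — both regexes are illustrative, neither is full RFC 5322 validation, and production code should prefer a vetted library:

```typescript
// A naive pattern of the kind described in PR #1 — its local-part character
// class omits '+', so user+tag@gmail.com is rejected.
const naiveEmail = /^[A-Za-z0-9.]+@[A-Za-z0-9.-]+$/;

// A more permissive sketch that also allows '+', '_' and '-' in the local
// part and requires a dotted domain with an alphabetic TLD.
const fixedEmail = /^[A-Za-z0-9.+_-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$/;

function isValidEmail(email: string): boolean {
  return fixedEmail.test(email);
}
```

Pasting the concrete rejected input (`user+tag@gmail.com`) into the review comment is what lets the author verify the fix in seconds.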
@@ -0,0 +1,48 @@
+meta:
+  id: style-vs-logic
+  level: 1
+  course: code-review-feedback-writing
+  type: output
+  description: "Distinguish style from logic — categorize review comments by severity and type, separating cosmetic preferences from functional issues"
+  tags: [code-review, severity, style, logic, categorization, nit, beginner]
+
+state: {}
+
+trigger: |
+  You're reviewing a PR with 15 comments to make. You need to
+  categorize them properly so the author knows what's critical vs
+  what's optional. Your comments are:
+
+  1. Missing semicolons (linter would catch this)
+  2. Race condition in concurrent database writes
+  3. Inconsistent indentation (tabs vs spaces)
+  4. Unvalidated user input passed to shell command
+  5. Variable named "data" — could be more descriptive
+  6. API key hardcoded in source code
+  7. Prefer arrow functions over function declarations (style preference)
+  8. Off-by-one error in pagination logic
+  9. Unused import statement
+  10. Missing error handling — function swallows exceptions silently
+  11. Single-letter variable in a 3-line loop (acceptable)
+  12. Potential memory leak from unclosed database connection
+  13. Comment says "TODO: fix later" with no context
+  14. Long ternary that could be an if/else for readability
+  15. Business logic doesn't match acceptance criteria
+
+  Task: Categorize all 15 items into severity levels and write
+  appropriate review comments for each. Show how to prefix comments
+  to signal severity. Explain the categorization rationale.
+
+assertions:
+  - type: llm_judge
+    criteria: "Items are correctly categorized by severity — blocking/must-fix: race condition (#2), shell command injection (#4), hardcoded API key (#6), off-by-one error (#8), swallowed exceptions (#10), memory leak (#12), wrong business logic (#15). These are bugs, security vulnerabilities, or incorrect behavior. Suggestion/should-fix: descriptive variable name (#5), unused import (#9), TODO without context (#13). Nice-to-have/nit: semicolons (#1), indentation (#3), arrow functions (#7), single-letter variable (#11), ternary readability (#14). Clear rationale for each: security/correctness issues are blocking, maintainability issues are suggestions, pure style preferences are nits"
+    weight: 0.35
+    description: "Correct categorization"
+  - type: llm_judge
+    criteria: "Comments use severity prefixes consistently — prefix system: 'blocking:' or 'must-fix:' for issues that must be addressed before merge. 'suggestion:' for improvements that should be made. 'nit:' for optional style preferences. 'question:' for things the reviewer is unsure about. 'praise:' for things done well. Each comment uses the appropriate prefix. Blocking comments explain the specific risk (security vulnerability, data corruption, production crash). Nits are explicitly marked as optional ('nit: feel free to ignore'). The author can quickly scan prefixes to prioritize their work — address all blocking items, consider suggestions, optionally handle nits"
+    weight: 0.35
+    description: "Severity prefixes"
+  - type: llm_judge
+    criteria: "Guidance on when to block vs suggest vs nit is practical — blocking criteria: security vulnerabilities, correctness bugs, data loss risk, breaking changes without migration. Suggest criteria: maintainability issues, missing error handling, unclear naming, dead code. Nit criteria: pure style preferences, formatting (especially if no linter), alternative approaches that are equivalent. Rule of thumb: 'Would I wake someone up at 3am for this?' — if yes, it's blocking. 'Would this cause a bug in 6 months?' — suggestion. 'Does this bother only me?' — nit. Also: don't block on nits, don't nit on critical bugs (everything looks minor when prefixed 'nit:'). Automation recommendation: let linters handle style, humans focus on logic"
+    weight: 0.30
+    description: "Categorization guidance"
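Item #8 (off-by-one in pagination) is the archetypal blocking bug that looks cosmetic at a glance. A sketch of the bug class and its fix — the scenario doesn't show the real implementation, so these functions are purely illustrative:

```typescript
// Buggy: using `page * size` as the start index only works for 0-based
// page numbers; with 1-based pages it silently skips the first `size` items.
function pageSliceBuggy<T>(items: T[], page: number, size: number): T[] {
  return items.slice(page * size, page * size + size);
}

// Fixed for 1-based page numbers: shift the page index down before scaling.
function pageSlice<T>(items: T[], page: number, size: number): T[] {
  const start = (page - 1) * size;
  return items.slice(start, start + size);
}
```

A one-character-looking difference that drops real data is exactly why correctness issues get the 'blocking:' prefix while formatting stays a nit.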
@@ -0,0 +1,52 @@
+meta:
+  id: architectural-feedback
+  level: 2
+  course: code-review-feedback-writing
+  type: output
+  description: "Give architectural feedback — review code for design patterns, separation of concerns, and consistency with the system's architecture"
+  tags: [code-review, architecture, design, patterns, separation-of-concerns, intermediate]
+
+state: {}
+
+trigger: |
+  You're reviewing a PR that adds a notification system. The code
+  works and has tests, but the architecture concerns you:
+
+  1. The notification logic is embedded directly in the OrderController
+     — it checks order status and sends emails/SMS inline with the
+     order processing logic
+
+  2. The email templates are hardcoded strings in the notification
+     code, not in template files
+
+  3. The PR introduces a new database connection for notifications
+     instead of using the existing connection pool
+
+  4. There's no interface/abstraction for notification channels —
+     adding push notifications later would require modifying the
+     existing email/SMS code
+
+  5. The notification sending is synchronous — if the email service
+     is slow, order processing is slow
+
+  The code works perfectly in tests. The author will argue: "It works,
+  why change it?"
+
+  Task: Write architectural review feedback for each of the 5 concerns.
+  For each, explain the current design problem, the future risk, and
+  propose a better approach. Address the "it works, why change it?"
+  objection.
+
+assertions:
+  - type: llm_judge
+    criteria: "Each concern identifies a specific architectural principle violation — (1) Mixed concerns: OrderController handles orders AND notifications — violates single responsibility. Risk: changes to notification logic require touching order code. Fix: extract NotificationService. (2) Hardcoded templates: templates change independently from code (marketing wants to update copy). Risk: every text change requires a code deploy. Fix: template files or a template engine. (3) Separate connection: multiple connection pools waste resources, harder to manage transactions. Fix: share existing pool. (4) No abstraction: adding push notifications means modifying email/SMS code (open/closed principle violation). Fix: NotificationChannel interface with EmailChannel, SMSChannel implementations. (5) Synchronous: email service timeout blocks order completion. Fix: async queue (send notification event, process separately). Each concern: principle, current risk, future risk, concrete fix"
+    weight: 0.35
+    description: "Architectural concerns"
+  - type: llm_judge
+    criteria: "Feedback addresses 'it works, why change it?' constructively — acknowledge: 'The implementation is correct and well-tested — I'm not questioning whether it works. My concern is about how it will work as the system grows.' Address each concern with a concrete scenario: 'When marketing wants to change the email copy, they'll need an engineering deploy (template concern).' 'When we add push notifications next quarter (it's on the roadmap), we'll need to refactor this code AND the tests (abstraction concern).' 'If SendGrid has a 30-second timeout, users wait 30 seconds to place an order (sync concern).' Frame as investment: 'Fixing this now takes 2 hours. Fixing it later when 10 features depend on this pattern takes 2 weeks.' Not condescending: 'I've seen this pattern cause problems in [specific project/service]'"
+    weight: 0.35
+    description: "Addresses objection"
+  - type: llm_judge
+    criteria: "Proposed alternatives are concrete and proportionate — each suggestion includes enough detail to implement: extract NotificationService with send(type, recipient, data) method. Show the interface: NotificationChannel with send() method, EmailChannel and SMSChannel implementations. Show the queue pattern: emit OrderCompleted event, NotificationWorker listens and sends. Don't over-engineer: 'We don't need a full event bus — a simple background job queue is sufficient for now.' Prioritize: 'If you only address one concern, make notifications async (#5) — that's the production risk. The others are maintainability and can be follow-up PRs.' Offer to pair: 'Happy to pair on the refactor if you'd like — the extraction should be straightforward.'"
+    weight: 0.30
+    description: "Proportionate alternatives"
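The NotificationChannel abstraction proposed for concern #4 can be sketched in a few lines. The names (NotificationChannel, EmailChannel, SmsChannel) are illustrative; the point is that adding push notifications later means adding a class, not editing the existing email/SMS code:

```typescript
interface Notification {
  recipient: string;
  body: string;
}

// Concern #4: one interface, many channels (open/closed principle).
interface NotificationChannel {
  send(n: Notification): string;
}

class EmailChannel implements NotificationChannel {
  send(n: Notification): string {
    return `email to ${n.recipient}: ${n.body}`; // stand-in for the real SMTP call
  }
}

class SmsChannel implements NotificationChannel {
  send(n: Notification): string {
    return `sms to ${n.recipient}: ${n.body}`; // stand-in for the SMS gateway
  }
}

// The caller depends only on the interface — a PushChannel plugs in unchanged.
function notifyAll(channels: NotificationChannel[], n: Notification): string[] {
  return channels.map((c) => c.send(n));
}
```

In the reviewed PR this `notifyAll` call would live in a NotificationService (concern #1) and be invoked from a queue worker rather than inline in OrderController (concern #5).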
@@ -0,0 +1,46 @@
+meta:
+  id: intermediate-review-shift
+  level: 2
+  course: code-review-feedback-writing
+  type: output
+  description: "Combined intermediate shift — review a complex PR covering architecture, security, performance, tests, and breaking changes across multiple concerns"
+  tags: [code-review, combined, shift-simulation, multi-concern, intermediate]
+
+state: {}
+
+trigger: |
+  You're reviewing a 300-line PR: "Add subscription billing to the
+  payments service." The PR touches:
+
+  - New SubscriptionService class (120 lines)
+  - Modified PaymentController (40 lines — new endpoints)
+  - Database migration (25 lines — new subscriptions table)
+  - Modified existing PaymentService (30 lines — shared logic extracted)
+  - Tests (85 lines)
+
+  Issues across multiple dimensions:
+  Architecture: billing logic mixed with subscription lifecycle management
+  Security: subscription plan change doesn't verify user owns the subscription
+  Performance: recurring charge loop processes subscriptions sequentially
+  Tests: only happy path tested, no cancellation or failed payment scenarios
+  Breaking change: PaymentService.charge() signature changed (used by 3 other services)
+  Documentation: no API docs for the 2 new endpoints
+  Good: clean error types, proper use of database transactions
+
+  Task: Write the complete review covering all dimensions. Organize
+  feedback thematically. Prioritize what must be fixed vs what can
+  be follow-up work. Make an approval decision.
+
+assertions:
+  - type: llm_judge
+    criteria: "Review is organized thematically with clear priorities — structure: (1) Summary (overall assessment + good things), (2) Blocking issues (security, breaking change, critical test gaps), (3) Architecture suggestions (billing/lifecycle separation), (4) Performance concerns (sequential processing), (5) Follow-up items (documentation, additional tests). Blocking items are clearly marked: 'These must be addressed before merge.' Suggestions are marked: 'These would improve the code but can be separate PRs.' The reviewer has clearly prioritized — security auth check is #1 (someone could modify another user's subscription), breaking change is #2 (other services will break), missing cancellation tests is #3"
+    weight: 0.35
+    description: "Organized priorities"
+  - type: llm_judge
+    criteria: "Each dimension gets substantive feedback — security: 'Line 45: this endpoint modifies a subscription by ID but doesn't check if the authenticated user owns that subscription. An attacker could change any user's subscription plan. Add: verify subscription.userId === authenticatedUser.id.' Breaking change: 'PaymentService.charge() now takes a SubscriptionContext parameter — the 3 services calling this will break. Can we make the parameter optional with a default?' Performance: 'Processing 10,000 subscriptions sequentially at 200ms each = 33 minutes. Consider: batch processing, or better: a job queue that processes subscriptions concurrently.' Tests: suggest specific test cases for cancellation mid-cycle, failed payment retry, plan upgrade/downgrade. Architecture: suggest separating SubscriptionLifecycleService from BillingService"
+    weight: 0.35
+    description: "Multi-dimension feedback"
+  - type: llm_judge
+    criteria: "Decision balances thoroughness with pragmatism — request changes: security issue (#1 priority) and breaking change (#2) must be fixed. But acknowledge: 'This is solid work overall — the transaction handling and error types are clean. The core subscription logic is correct.' Don't demand everything at once: 'I'd suggest addressing the security and breaking change issues now, and creating follow-up tickets for: (1) separating billing from lifecycle management, (2) parallel subscription processing, (3) API documentation, (4) additional test coverage for edge cases.' Timeline suggestion: 'The follow-up PRs don't block launch but should be done within the sprint.' Offer to re-review quickly: 'Once the auth check and backward-compatible charge() signature are in, I can re-review same day.'"
+    weight: 0.30
+    description: "Balanced decision"
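The backward-compatible fix suggested for the breaking change (make the new `charge()` parameter optional so the three existing callers keep working) can be sketched as follows. All names are illustrative stand-ins, not the real payments-service API:

```typescript
interface SubscriptionContext {
  subscriptionId: string;
}

// The new parameter is optional: legacy one-off payment callers pass nothing
// and keep their old behavior; the new subscription path passes a context.
function charge(amountCents: number, ctx?: SubscriptionContext): string {
  if (ctx) {
    return `charged ${amountCents} for subscription ${ctx.subscriptionId}`;
  }
  return `charged ${amountCents}`; // legacy path, behavior unchanged
}
```

This is the kind of concrete alternative that turns "this will break 3 services" from a complaint into an actionable review comment.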