dojo.md 0.2.2 → 0.2.4
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/courses/GENERATION_LOG.md +29 -0
- package/courses/api-documentation-writing/course.yaml +12 -0
- package/courses/api-documentation-writing/scenarios/level-1/authentication-basics.yaml +46 -0
- package/courses/api-documentation-writing/scenarios/level-1/data-types-formats.yaml +45 -0
- package/courses/api-documentation-writing/scenarios/level-1/endpoint-description.yaml +45 -0
- package/courses/api-documentation-writing/scenarios/level-1/error-documentation.yaml +45 -0
- package/courses/api-documentation-writing/scenarios/level-1/first-documentation-shift.yaml +47 -0
- package/courses/api-documentation-writing/scenarios/level-1/getting-started-guide.yaml +42 -0
- package/courses/api-documentation-writing/scenarios/level-1/pagination-docs.yaml +51 -0
- package/courses/api-documentation-writing/scenarios/level-1/request-parameters.yaml +46 -0
- package/courses/api-documentation-writing/scenarios/level-1/request-response-examples.yaml +48 -0
- package/courses/api-documentation-writing/scenarios/level-1/status-codes.yaml +45 -0
- package/courses/api-documentation-writing/scenarios/level-2/error-patterns.yaml +48 -0
- package/courses/api-documentation-writing/scenarios/level-2/intermediate-documentation-shift.yaml +48 -0
- package/courses/api-documentation-writing/scenarios/level-2/oauth-documentation.yaml +47 -0
- package/courses/api-documentation-writing/scenarios/level-2/openapi-specification.yaml +46 -0
- package/courses/api-documentation-writing/scenarios/level-2/rate-limiting-docs.yaml +45 -0
- package/courses/api-documentation-writing/scenarios/level-2/request-body-schemas.yaml +46 -0
- package/courses/api-documentation-writing/scenarios/level-2/schema-definitions.yaml +41 -0
- package/courses/api-documentation-writing/scenarios/level-2/swagger-redoc-rendering.yaml +43 -0
- package/courses/api-documentation-writing/scenarios/level-2/validation-documentation.yaml +47 -0
- package/courses/api-documentation-writing/scenarios/level-2/versioning-changelog.yaml +42 -0
- package/courses/api-documentation-writing/scenarios/level-3/advanced-documentation-shift.yaml +43 -0
- package/courses/api-documentation-writing/scenarios/level-3/api-style-guide.yaml +40 -0
- package/courses/api-documentation-writing/scenarios/level-3/code-samples-multilang.yaml +40 -0
- package/courses/api-documentation-writing/scenarios/level-3/content-architecture.yaml +47 -0
- package/courses/api-documentation-writing/scenarios/level-3/deprecation-communication.yaml +44 -0
- package/courses/api-documentation-writing/scenarios/level-3/interactive-api-explorer.yaml +42 -0
- package/courses/api-documentation-writing/scenarios/level-3/migration-guides.yaml +42 -0
- package/courses/api-documentation-writing/scenarios/level-3/sdk-documentation.yaml +40 -0
- package/courses/api-documentation-writing/scenarios/level-3/webhook-documentation.yaml +48 -0
- package/courses/api-documentation-writing/scenarios/level-3/websocket-sse-docs.yaml +47 -0
- package/courses/api-documentation-writing/scenarios/level-4/api-changelog-management.yaml +44 -0
- package/courses/api-documentation-writing/scenarios/level-4/api-governance-standards.yaml +41 -0
- package/courses/api-documentation-writing/scenarios/level-4/api-product-strategy.yaml +41 -0
- package/courses/api-documentation-writing/scenarios/level-4/developer-portal-design.yaml +48 -0
- package/courses/api-documentation-writing/scenarios/level-4/docs-as-code.yaml +41 -0
- package/courses/api-documentation-writing/scenarios/level-4/documentation-localization.yaml +46 -0
- package/courses/api-documentation-writing/scenarios/level-4/documentation-metrics.yaml +45 -0
- package/courses/api-documentation-writing/scenarios/level-4/documentation-testing.yaml +41 -0
- package/courses/api-documentation-writing/scenarios/level-4/expert-documentation-shift.yaml +45 -0
- package/courses/api-documentation-writing/scenarios/level-4/multi-audience-docs.yaml +46 -0
- package/courses/api-documentation-writing/scenarios/level-5/ai-powered-documentation.yaml +44 -0
- package/courses/api-documentation-writing/scenarios/level-5/api-first-documentation.yaml +45 -0
- package/courses/api-documentation-writing/scenarios/level-5/api-marketplace-docs.yaml +42 -0
- package/courses/api-documentation-writing/scenarios/level-5/board-api-strategy.yaml +48 -0
- package/courses/api-documentation-writing/scenarios/level-5/documentation-program-strategy.yaml +42 -0
- package/courses/api-documentation-writing/scenarios/level-5/documentation-team-structure.yaml +47 -0
- package/courses/api-documentation-writing/scenarios/level-5/dx-competitive-advantage.yaml +46 -0
- package/courses/api-documentation-writing/scenarios/level-5/ecosystem-documentation.yaml +45 -0
- package/courses/api-documentation-writing/scenarios/level-5/industry-documentation-patterns.yaml +46 -0
- package/courses/api-documentation-writing/scenarios/level-5/master-documentation-shift.yaml +46 -0
- package/courses/code-review-feedback-writing/course.yaml +12 -0
- package/courses/code-review-feedback-writing/scenarios/level-1/approve-vs-request-changes.yaml +48 -0
- package/courses/code-review-feedback-writing/scenarios/level-1/asking-questions.yaml +50 -0
- package/courses/code-review-feedback-writing/scenarios/level-1/clear-comment-writing.yaml +45 -0
- package/courses/code-review-feedback-writing/scenarios/level-1/constructive-tone.yaml +43 -0
- package/courses/code-review-feedback-writing/scenarios/level-1/first-review-shift.yaml +46 -0
- package/courses/code-review-feedback-writing/scenarios/level-1/giving-praise.yaml +44 -0
- package/courses/code-review-feedback-writing/scenarios/level-1/nitpick-etiquette.yaml +44 -0
- package/courses/code-review-feedback-writing/scenarios/level-1/providing-context.yaml +46 -0
- package/courses/code-review-feedback-writing/scenarios/level-1/reviewing-small-prs.yaml +43 -0
- package/courses/code-review-feedback-writing/scenarios/level-1/style-vs-logic.yaml +48 -0
- package/courses/code-review-feedback-writing/scenarios/level-2/architectural-feedback.yaml +52 -0
- package/courses/code-review-feedback-writing/scenarios/level-2/intermediate-review-shift.yaml +46 -0
- package/courses/code-review-feedback-writing/scenarios/level-2/performance-feedback.yaml +50 -0
- package/courses/code-review-feedback-writing/scenarios/level-2/reviewing-breaking-changes.yaml +44 -0
- package/courses/code-review-feedback-writing/scenarios/level-2/reviewing-complex-prs.yaml +43 -0
- package/courses/code-review-feedback-writing/scenarios/level-2/reviewing-documentation.yaml +47 -0
- package/courses/code-review-feedback-writing/scenarios/level-2/reviewing-error-handling.yaml +50 -0
- package/courses/code-review-feedback-writing/scenarios/level-2/reviewing-tests.yaml +53 -0
- package/courses/code-review-feedback-writing/scenarios/level-2/security-review-comments.yaml +50 -0
- package/courses/code-review-feedback-writing/scenarios/level-2/suggesting-alternatives.yaml +42 -0
- package/courses/code-review-feedback-writing/scenarios/level-3/advanced-review-shift.yaml +48 -0
- package/courses/code-review-feedback-writing/scenarios/level-3/api-design-review.yaml +47 -0
- package/courses/code-review-feedback-writing/scenarios/level-3/cross-team-review.yaml +45 -0
- package/courses/code-review-feedback-writing/scenarios/level-3/database-migration-review.yaml +48 -0
- package/courses/code-review-feedback-writing/scenarios/level-3/design-pattern-feedback.yaml +48 -0
- package/courses/code-review-feedback-writing/scenarios/level-3/mentoring-through-review.yaml +46 -0
- package/courses/code-review-feedback-writing/scenarios/level-3/production-incident-review.yaml +42 -0
- package/courses/code-review-feedback-writing/scenarios/level-3/reviewing-senior-code.yaml +47 -0
- package/courses/code-review-feedback-writing/scenarios/level-3/reviewing-unfamiliar-code.yaml +43 -0
- package/courses/code-review-feedback-writing/scenarios/level-3/speed-vs-thoroughness.yaml +46 -0
- package/courses/code-review-feedback-writing/scenarios/level-4/automated-review-strategy.yaml +44 -0
- package/courses/code-review-feedback-writing/scenarios/level-4/expert-review-shift.yaml +46 -0
- package/courses/code-review-feedback-writing/scenarios/level-4/review-culture-design.yaml +41 -0
- package/courses/code-review-feedback-writing/scenarios/level-4/review-guidelines-standards.yaml +45 -0
- package/courses/code-review-feedback-writing/scenarios/level-4/review-load-balancing.yaml +39 -0
- package/courses/code-review-feedback-writing/scenarios/level-4/review-metrics.yaml +39 -0
- package/courses/code-review-feedback-writing/scenarios/level-4/review-process-optimization.yaml +48 -0
- package/courses/code-review-feedback-writing/scenarios/level-4/scaling-review-process.yaml +45 -0
- package/courses/code-review-feedback-writing/scenarios/level-4/security-review-standards.yaml +41 -0
- package/courses/code-review-feedback-writing/scenarios/level-4/training-reviewers.yaml +42 -0
- package/courses/code-review-feedback-writing/scenarios/level-5/board-quality-metrics.yaml +44 -0
- package/courses/code-review-feedback-writing/scenarios/level-5/knowledge-transfer-at-scale.yaml +42 -0
- package/courses/code-review-feedback-writing/scenarios/level-5/ma-review-alignment.yaml +50 -0
- package/courses/code-review-feedback-writing/scenarios/level-5/master-review-shift.yaml +49 -0
- package/courses/code-review-feedback-writing/scenarios/level-5/review-competitive-advantage.yaml +48 -0
- package/courses/code-review-feedback-writing/scenarios/level-5/review-organizational-learning.yaml +46 -0
- package/courses/code-review-feedback-writing/scenarios/level-5/review-roi-analysis.yaml +51 -0
- package/courses/code-review-feedback-writing/scenarios/level-5/review-velocity-impact.yaml +44 -0
- package/courses/code-review-feedback-writing/scenarios/level-5/scaling-reviews-100-plus.yaml +45 -0
- package/courses/code-review-feedback-writing/scenarios/level-5/toxic-culture-transformation.yaml +46 -0
- package/courses/technical-rfc-writing/course.yaml +11 -0
- package/courses/technical-rfc-writing/scenarios/level-1/first-rfc-shift.yaml +45 -0
- package/courses/technical-rfc-writing/scenarios/level-1/implementation-planning.yaml +47 -0
- package/courses/technical-rfc-writing/scenarios/level-1/open-questions.yaml +46 -0
- package/courses/technical-rfc-writing/scenarios/level-1/problem-statement.yaml +41 -0
- package/courses/technical-rfc-writing/scenarios/level-1/proposing-solutions.yaml +49 -0
- package/courses/technical-rfc-writing/scenarios/level-1/rfc-structure.yaml +41 -0
- package/courses/technical-rfc-writing/scenarios/level-1/risks-and-mitigations.yaml +43 -0
- package/courses/technical-rfc-writing/scenarios/level-1/scoping-an-rfc.yaml +49 -0
- package/courses/technical-rfc-writing/scenarios/level-1/success-metrics.yaml +43 -0
- package/courses/technical-rfc-writing/scenarios/level-1/writing-for-audience.yaml +42 -0
- package/courses/technical-rfc-writing/scenarios/level-2/risk-assessment-matrix.yaml +43 -0
- package/courses/technical-rfc-writing/scenarios/level-2/technical-design-detail.yaml +42 -0
- package/courses/technical-rfc-writing/scenarios/level-2/trade-off-analysis.yaml +43 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-1/first-debugging-shift.yaml +66 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-1/plan-output-reading.yaml +71 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-1/resource-creation-failures.yaml +54 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-1/resource-references.yaml +70 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-1/state-file-basics.yaml +73 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-1/terraform-fmt-validate.yaml +58 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-2/count-vs-for-each.yaml +58 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-2/dependency-management.yaml +80 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-2/intermediate-debugging-shift.yaml +66 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-2/lifecycle-rules.yaml +51 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-2/locals-and-expressions.yaml +58 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-2/module-structure.yaml +75 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-2/provisioner-pitfalls.yaml +64 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-2/remote-state-backend.yaml +55 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-2/terraform-import.yaml +55 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-2/workspace-management.yaml +51 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-3/advanced-debugging-shift.yaml +63 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-3/api-rate-limiting.yaml +50 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-3/conditional-resources.yaml +66 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-3/drift-detection.yaml +66 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-3/dynamic-blocks.yaml +71 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-3/large-scale-refactoring.yaml +59 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-3/multi-provider-config.yaml +69 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-3/state-surgery.yaml +57 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-3/terraform-cloud-enterprise.yaml +59 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-3/terraform-debugging.yaml +51 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-4/blast-radius-management.yaml +51 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-4/cicd-pipeline-design.yaml +50 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-4/compliance-as-code.yaml +46 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-4/cost-estimation-governance.yaml +42 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-4/expert-debugging-shift.yaml +51 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-4/iac-organization-strategy.yaml +45 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-4/incident-response-iac.yaml +47 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-4/infrastructure-testing.yaml +41 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-4/module-registry-design.yaml +45 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-4/multi-account-strategy.yaml +57 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-5/board-infrastructure-investment.yaml +53 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-5/disaster-recovery-iac.yaml +47 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-5/enterprise-iac-transformation.yaml +48 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-5/iac-technology-evolution.yaml +49 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-5/ma-infrastructure-consolidation.yaml +54 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-5/master-debugging-shift.yaml +53 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-5/multi-cloud-strategy.yaml +49 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-5/platform-engineering.yaml +47 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-5/regulatory-compliance-automation.yaml +47 -0
- package/courses/terraform-infrastructure-setup/scenarios/level-5/terraform-vs-alternatives.yaml +46 -0
- package/dist/cli/commands/generate.d.ts.map +1 -1
- package/dist/cli/commands/generate.js +2 -1
- package/dist/cli/commands/generate.js.map +1 -1
- package/dist/cli/commands/train.d.ts.map +1 -1
- package/dist/cli/commands/train.js +6 -3
- package/dist/cli/commands/train.js.map +1 -1
- package/dist/cli/index.js +9 -6
- package/dist/cli/index.js.map +1 -1
- package/dist/cli/run-demo.js +3 -2
- package/dist/cli/run-demo.js.map +1 -1
- package/dist/engine/model-utils.d.ts +6 -0
- package/dist/engine/model-utils.d.ts.map +1 -1
- package/dist/engine/model-utils.js +28 -1
- package/dist/engine/model-utils.js.map +1 -1
- package/dist/engine/training.d.ts.map +1 -1
- package/dist/engine/training.js +4 -3
- package/dist/engine/training.js.map +1 -1
- package/dist/evaluator/judge.d.ts +7 -1
- package/dist/evaluator/judge.d.ts.map +1 -1
- package/dist/evaluator/judge.js +50 -11
- package/dist/evaluator/judge.js.map +1 -1
- package/dist/generator/course-generator.d.ts.map +1 -1
- package/dist/generator/course-generator.js +4 -3
- package/dist/generator/course-generator.js.map +1 -1
- package/dist/mcp/server.d.ts.map +1 -1
- package/dist/mcp/server.js +7 -3
- package/dist/mcp/server.js.map +1 -1
- package/dist/mcp/session-manager.d.ts.map +1 -1
- package/dist/mcp/session-manager.js +3 -2
- package/dist/mcp/session-manager.js.map +1 -1
- package/dist/types/index.d.ts +1 -1
- package/dist/types/index.d.ts.map +1 -1
- package/package.json +1 -1
package/courses/code-review-feedback-writing/scenarios/level-4/scaling-review-process.yaml
ADDED
@@ -0,0 +1,45 @@
+meta:
+  id: scaling-review-process
+  level: 4
+  course: code-review-feedback-writing
+  type: output
+  description: "Scale review process for team growth — design review process changes needed as an engineering team grows from 20 to 50 to 100 engineers"
+  tags: [code-review, scaling, growth, process-change, milestones, expert]
+
+state: {}
+
+trigger: |
+  Your engineering team is growing rapidly: 20 engineers today,
+  50 in 6 months, 100 in 18 months. Your current review process
+  works well at 20 but you know it won't scale:
+
+  At 20: everyone knows everyone, informal reviewer selection, the
+  tech lead reviews most PRs, one Slack channel, everyone in one
+  timezone.
+
+  At 50 (6 months): 4 teams, some people don't know each other,
+  need formal assignment, tech lead is bottleneck, cross-team reviews
+  needed, 2 timezones.
+
+  At 100 (18 months): 8 teams, most people are strangers, complex
+  assignment needed, multiple tech leads, heavy cross-team
+  dependencies, 4 timezones, new offices.
+
+  Task: Design the review process evolution for each milestone.
+  For each stage: what changes, what stays the same, what to
+  introduce, what to retire. Include the transitions between stages
+  and how to introduce changes without disrupting current flow.
+
+assertions:
+  - type: llm_judge
+    criteria: "Each milestone has specific process changes — at 20 (current): informal assignment works, tech lead as backstop, single channel, trust-based. Works because everyone knows the code and each other. At 50 (6 months): introduce CODEOWNERS for auto-assignment, replace single channel with per-team channels + cross-team bot, 12-hour SLA (timezone-aware), written review guidelines (can't be tribal knowledge at 50), reviewer rotation to prevent silos, security review requirement for sensitive code. At 100 (18 months): automated tiered review (risk-based assignment), cross-timezone review routing, reviewer capacity management, review quality monitoring, formal reviewer training program, escalation paths for conflicts. Each change: what problem it solves, when to introduce it, how to validate it's working"
+    weight: 0.35
+    description: "Milestone changes"
+  - type: llm_judge
+    criteria: "What stays constant provides continuity — constants across all sizes: (1) every production code change is reviewed by someone, (2) reviews are respectful and constructive, (3) security-sensitive code gets extra scrutiny, (4) authors respond to feedback within 24 hours. These are culture, not process — culture should be constant even as process evolves. What retires: (1) tech lead reviewing everything (at 50 — they become a bottleneck), (2) informal assignment (at 50 — replaced by CODEOWNERS), (3) single Slack channel (at 50 — too noisy). Transition principle: never remove something without replacing it. Don't just stop tech lead reviews — introduce the replacement system first, run both in parallel for 2 weeks, then transition"
+    weight: 0.35
+    description: "Constants and transitions"
+  - type: llm_judge
+    criteria: "Transition plans minimize disruption — each transition: (1) announce the change and reasoning 2 weeks before, (2) pilot with one team for 2 weeks, (3) gather feedback and adjust, (4) roll out to all teams, (5) measure impact for 4 weeks, (6) iterate. Specific transitions: 20→50: 'We're introducing CODEOWNERS to replace manual assignment. For the next 2 weeks, Team A will pilot while other teams continue as-is. We'll share findings in the all-hands.' 50→100: 'We're adding tiered review requirements. Low-risk changes get faster review (1 approval, 4-hour SLA), high-risk changes get more thorough review (2 approvals, security review).' Change fatigue: don't introduce all changes at once. One process change per month maximum. Measure before and after each change. Rollback plan for every change: 'If tiered reviews increase defect rate by more than 2%, we revert and reassess.'"
+    weight: 0.30
+    description: "Transition plans"
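The CODEOWNERS auto-assignment this scenario introduces at the 50-engineer milestone can be a handful of path rules. A minimal sketch of a GitHub CODEOWNERS file; the org and team names are hypothetical:

# .github/CODEOWNERS (illustrative; org and team names are invented)
# Later rules take precedence, so the catch-all goes first.
*                     @acme/eng-leads
/infra/               @acme/platform-team
/services/payments/   @acme/payments-team
/services/auth/       @acme/auth-team @acme/security-champions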
package/courses/code-review-feedback-writing/scenarios/level-4/security-review-standards.yaml
ADDED
@@ -0,0 +1,41 @@
+meta:
+  id: security-review-standards
+  level: 4
+  course: code-review-feedback-writing
+  type: output
+  description: "Create security review standards — design security-focused review checklists, training, and processes that make security review a standard part of every engineer's review practice"
+  tags: [code-review, security, standards, checklist, OWASP, organization, expert]
+
+state: {}
+
+trigger: |
+  Your company handles sensitive data (PII, payment information).
+  Security vulnerabilities have been found in production 3 times
+  this quarter — all were in code that passed code review.
+
+  Post-incident analysis revealed:
+  - Reviewers focused on functionality, not security
+  - No security review checklist exists
+  - Only 2 engineers (out of 40) have security training
+  - SQL injection in a new endpoint (reviewer didn't check queries)
+  - XSS in user-generated content rendering (reviewer checked logic only)
+  - IDOR vulnerability (reviewer didn't verify authorization checks)
+
+  Task: Design the security review program. Include: security review
+  checklist by code area, training program to upskill all reviewers
+  on security basics, escalation criteria for security-sensitive
+  changes, and integration with automated security scanning.
+
+assertions:
+  - type: llm_judge
+    criteria: "Security checklist is specific by code area — API endpoints: (1) input validation on ALL user inputs, (2) authorization check (not just authentication — can THIS user access THIS resource?), (3) rate limiting configured, (4) response doesn't expose sensitive data (PII, internal IDs). Database queries: (1) parameterized queries (no string concatenation), (2) ORM queries checked for injection via raw SQL, (3) sensitive data encrypted at rest. Frontend: (1) user content escaped/sanitized before rendering, (2) no unsafe HTML injection without proper sanitization via DOMPurify or equivalent, (3) CSP headers configured. Authentication: (1) passwords hashed with bcrypt (not MD5/SHA), (2) session tokens rotated on login, (3) failed login rate limiting. File handling: (1) file type validation (not just extension), (2) size limits, (3) stored outside webroot. Each item: what to look for, example vulnerable code, example fixed code"
+    weight: 0.35
+    description: "Security checklist"
+  - type: llm_judge
+    criteria: "Training program upskills all reviewers — tier 1 (all engineers, 2 hours): OWASP top 10 overview, how to spot injection/XSS/IDOR in code review, hands-on exercise (review a PR with planted vulnerabilities). Tier 2 (team leads, 4 hours): threat modeling basics, security review prioritization (which PRs need deep security review), how to escalate security findings. Tier 3 (security champions, 2 days): advanced vulnerability patterns, penetration testing basics, security architecture review. Security champions program: 1 champion per team, additional training, available for security review consultations. Refresher: quarterly 30-minute security review workshop with real examples from your codebase (anonymized). Assessment: after training, each engineer must successfully review a test PR and find 3 of 5 planted vulnerabilities"
+    weight: 0.35
+    description: "Training program"
+  - type: llm_judge
+    criteria: "Escalation and automation integration are practical — escalation criteria: changes to auth/session management, changes to data access patterns, new API endpoints accepting user input, changes to encryption/hashing, dependency updates with security advisories — all require security champion review. Automated scanning integration: (1) Semgrep rules for OWASP top 10 in CI (blocks PR if critical findings), (2) dependency scanning (Snyk/Dependabot) as advisory, (3) secret detection (GitLeaks/TruffleHog) blocks on detection. Human + automation collaboration: automated tools catch known patterns, human reviewers catch logic flaws (IDOR, business logic bypass). Security review SLA: security-flagged PRs reviewed within 4 hours (dedicated security champion on rotation). Metrics: security vulnerability escape rate (target: 0 from reviewed code), percentage of PRs with security checklist completed, time to security review"
+    weight: 0.30
+    description: "Escalation and automation"
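The checklist asks for "example vulnerable code, example fixed code" per item. For the parameterized-queries item, a minimal TypeScript sketch using node-postgres; the table and function names are invented:

import { Pool } from "pg";

const pool = new Pool();

// Vulnerable: user input concatenated into SQL. This is the pattern
// the checklist tells a reviewer to flag.
async function findUserVulnerable(email: string) {
  return pool.query("SELECT id, email FROM users WHERE email = '" + email + "'");
}

// Fixed: parameterized query; the driver passes the value out of band,
// so it can never be interpreted as SQL.
async function findUserFixed(email: string) {
  return pool.query("SELECT id, email FROM users WHERE email = $1", [email]);
}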
package/courses/code-review-feedback-writing/scenarios/level-4/training-reviewers.yaml
ADDED
@@ -0,0 +1,42 @@
+meta:
+  id: training-reviewers
+  level: 4
+  course: code-review-feedback-writing
+  type: output
+  description: "Train code reviewers — design a reviewer training program that develops both technical review skills and feedback communication skills"
+  tags: [code-review, training, skills, feedback-quality, program, expert]
+
+state: {}
+
+trigger: |
+  You're building a reviewer training program for your organization.
+  Problems you need to solve:
+
+  - New hires don't know what to look for in reviews
+  - Some reviewers focus only on style, missing logic bugs
+  - Others focus only on logic, ignoring maintainability
+  - Senior reviewers give great feedback but can't articulate HOW
+    they review (unconscious competence)
+  - Junior reviewers rubber-stamp PRs ("LGTM" with no real review)
+  - Feedback tone varies from excellent to hostile
+  - No one teaches review skills — it's assumed you "just know"
+
+  Task: Design a complete reviewer training program. Include: skill
+  levels (what good reviewing looks like at each level), training
+  curriculum, practice exercises, assessment criteria, and ongoing
+  development. This program should produce consistently high-quality
+  reviewers.
+
+assertions:
+  - type: llm_judge
+    criteria: "Skill levels define progressive reviewer competency — Level 1 (apprentice): can identify obvious bugs, style issues, missing tests. Reviews small PRs (< 100 lines). Comments are clear and respectful. Level 2 (practitioner): identifies architectural issues, security concerns, performance problems. Reviews medium PRs. Provides code suggestions, not just problem identification. Level 3 (expert): reviews complex cross-cutting changes, mentors through review, identifies systemic issues. Calibrates feedback to PR risk and author experience. Level 4 (multiplier): improves team review culture, coaches other reviewers, designs review processes. Each level: observable behaviors, example review comments, promotion criteria. Progression: ~3 months per level for an engaged learner"
+    weight: 0.35
+    description: "Skill levels"
+  - type: llm_judge
+    criteria: "Training curriculum covers both technical and communication skills — technical track: (1) code reading skills (how to understand unfamiliar code quickly), (2) bug pattern recognition (common bugs by language: null references, off-by-one, race conditions), (3) security review checklist (OWASP top 10 for code review), (4) performance review patterns (N+1, memory leaks, complexity), (5) architecture review (SOLID, coupling, cohesion). Communication track: (1) writing clear comments (specific, actionable, contextual), (2) tone and empathy (constructive, not critical), (3) severity calibration (blocking vs suggestion vs nit), (4) asking good questions (genuine vs rhetorical). Exercises: review a prepared PR with planted bugs (compare findings against answer key), rewrite harsh comments constructively, practice reviewing code in an unfamiliar language"
+    weight: 0.35
+    description: "Curriculum"
+  - type: llm_judge
+    criteria: "Practice and assessment are hands-on — practice exercises: (1) bug hunting: review a PR with 5 planted bugs of varying severity — graded on what you find AND what you correctly skip (no false positives). (2) Comment workshop: rewrite 10 poorly-written review comments — scored on clarity, tone, and actionability. (3) Triage exercise: given 20 issues, categorize as blocking/suggestion/nit — compare against expert calibration. (4) Live pair review: review a real PR with an expert reviewer, discuss approach differences. Assessment: quarterly review calibration — all reviewers review the same PR independently, then compare results. Highlights: who catches what, tone differences, severity calibration. Ongoing: monthly 'review of reviews' — manager samples recent review comments and provides coaching feedback. Recognition: highlight excellent review comments in team channel to show what 'good' looks like"
+    weight: 0.30
+    description: "Practice and assessment"
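The bug-hunting exercise above needs PRs with plausible planted bugs. One illustrative TypeScript example of the off-by-one class a trainee should catch; the function is invented for the exercise:

// Exercise PR under review: "return the last n items of a list".
function lastN<T>(items: T[], n: number): T[] {
  // Planted bug: the window is shifted left by one, so the final
  // element is always dropped.
  return items.slice(items.length - n - 1, items.length - 1);
}

// A passing trainee flags the off-by-one and suggests:
//   items.slice(Math.max(0, items.length - n))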
package/courses/code-review-feedback-writing/scenarios/level-5/board-quality-metrics.yaml
ADDED
@@ -0,0 +1,44 @@
+meta:
+  id: board-quality-metrics
+  level: 5
+  course: code-review-feedback-writing
+  type: output
+  description: "Present engineering quality to the board — translate code review and quality metrics into board-level business language connecting engineering practices to company outcomes"
+  tags: [code-review, board-presentation, quality-metrics, business-impact, executive, master]
+
+state: {}
+
+trigger: |
+  The board wants quarterly engineering quality reports. They don't
+  understand code review metrics — they understand revenue, risk,
+  customer impact, and competitive positioning.
+
+  Engineering data available:
+  - Defect escape rate: 4% (down from 12% a year ago)
+  - Mean time to recovery (MTTR): 45 minutes (down from 4 hours)
+  - Deployment frequency: 15/day (up from 2/week)
+  - Change failure rate: 3% (down from 10%)
+  - Code review cycle time: 1.2 days (down from 7 days)
+  - Developer satisfaction: 8.1/10 (up from 3.5/10)
+  - Attrition rate: 8% (down from 25%)
+  - Customer-reported bugs: 3/month (down from 15/month)
+
+  Task: Write the board-level engineering quality report. Translate
+  every metric into business impact. Connect code review improvements
+  to revenue protection, customer trust, competitive speed, and talent
+  retention. Make the board understand why engineering quality matters
+  to their investment.
+
+assertions:
+  - type: llm_judge
+    criteria: "Metrics are translated to business language — defect escape 4%: 'For every 100 changes we ship, 96 work perfectly the first time. This means customers experience fewer disruptions and our support team handles 80% fewer escalations.' MTTR 45 min: 'When issues occur, customers are impacted for under an hour instead of half a business day. This protects our SLA commitments and customer trust.' Deploy frequency 15/day: 'We can respond to market changes and customer requests in hours, not weeks. This is a competitive advantage.' Change failure 3%: 'Our quality gates catch 97% of potential issues before they reach customers.' Customer bugs 3/month: 'Customer-reported issues dropped 80% year-over-year, directly improving NPS and reducing churn risk.' Each metric tells a story about business impact, not engineering process"
+    weight: 0.35
+    description: "Business translation"
+  - type: llm_judge
+    criteria: "Financial impact of quality improvements is quantified — attrition reduction: 25% → 8% = retained ~25 engineers × $150K replacement cost = $3.75M saved in recruiting/onboarding. Customer bugs: 15/month → 3/month = 12 fewer customer escalations × $10K average handling cost = $1.44M/year saved. Faster recovery: 4 hours → 45 minutes = reduced customer impact during outages, protecting $X revenue per hour of downtime. Deploy speed: 2/week → 15/day = features reach customers 50x faster, reducing time-to-revenue for new capabilities. Quality investment: $300K (tooling + process improvements + training) generated $5M+ in retained talent + reduced incident costs + faster delivery. ROI: 16x return on quality investment. Frame: 'Engineering quality is not a cost center — it's a revenue protection and acceleration system.'"
+    weight: 0.35
+    description: "Financial quantification"
+  - type: llm_judge
+    criteria: "Competitive positioning and forward-looking strategy are included — competitive context: 'Our deployment frequency (15/day) puts us in the top 10% of companies our size. Our change failure rate (3%) is elite level (per DORA research). Competitors operating at 2 deploys/week cannot respond to market changes as quickly.' Forward-looking: 'Next quarter we're investing in: (1) automated security review (reduces vulnerability exposure time by 80%), (2) AI-assisted code review (accelerates review by 30%, freeing senior engineers for architecture), (3) cross-team knowledge sharing (reduces knowledge silos, improving our bus factor from 2 to 5 across critical systems).' Risk section: 'Key engineering risks: (1) if we stop investing in quality, defect rates will regress within 2 quarters, (2) losing key senior engineers would impact our bus factor on 3 critical systems.' Recommendation: 'Continue quality investment. The compounding returns (faster delivery + fewer bugs + better retention) are the foundation of our technical competitive advantage.'"
+    weight: 0.30
+    description: "Competitive strategy"
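The financial-quantification assertion compresses several calculations; recomputing them makes the expected arithmetic concrete. A TypeScript sketch using only the scenario's own assumed dollar figures, not real data:

// All inputs are the scenario's assumptions.
const retentionSavings = 25 * 150_000;            // ~25 retained engineers x $150K = $3.75M
const escalationSavings = (15 - 3) * 12 * 10_000; // 12 fewer escalations/mo x $10K = $1.44M/yr
const investment = 300_000;                       // tooling + process + training

const totalReturn = retentionSavings + escalationSavings; // $5.19M, before incident
                                                          // and velocity gains
const roi = (totalReturn - investment) / investment;      // ~16.3, matching the "16x" claim
console.log(roi.toFixed(1) + "x return on quality investment");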
package/courses/code-review-feedback-writing/scenarios/level-5/knowledge-transfer-at-scale.yaml
ADDED
@@ -0,0 +1,42 @@
+meta:
+  id: knowledge-transfer-at-scale
+  level: 5
+  course: code-review-feedback-writing
+  type: output
+  description: "Design knowledge transfer through reviews — create systems that preserve institutional knowledge when key engineers leave, using code review as the primary knowledge capture mechanism"
+  tags: [code-review, knowledge-transfer, institutional-knowledge, bus-factor, retention, master]
+
+state: {}
+
+trigger: |
+  Your principal engineer (15 years at the company, built the core
+  platform) just gave 4 weeks notice. They're the only person who
+  deeply understands:
+  - The payment reconciliation system (6 other engineers can maintain
+    it, but only at a surface level)
+  - Why certain architectural decisions were made (not documented)
+  - How to debug the most complex production incidents
+  - The relationships between 30 microservices
+
+  You have 4 weeks to capture as much knowledge as possible. But
+  you also need a long-term strategy so this never happens again.
+
+  Task: Design both the emergency knowledge transfer plan (4 weeks)
+  and the long-term knowledge preservation strategy using code review
+  as the primary mechanism. Include: immediate capture methods, how
+  to restructure reviews to prevent knowledge concentration, and how
+  to measure knowledge distribution across the team.
+
+assertions:
+  - type: llm_judge
+    criteria: "Emergency 4-week plan maximizes knowledge capture — week 1: (1) record video walkthroughs of the 5 most critical systems (payment reconciliation first), (2) pair programming sessions with 2-3 engineers on each critical system (record these too), (3) document all 'why' decisions in ADRs (Architecture Decision Records). Week 2: (1) have the departing engineer review all recent PRs in critical systems — their review comments capture tacit knowledge, (2) create runbooks for the 10 most common production incidents they handle. Week 3: (1) shadow debugging sessions — the departing engineer debugs a real issue while 2 engineers observe and document the process, (2) system architecture diagrams with relationships between services. Week 4: (1) Q&A sessions — other engineers bring their 'I always wondered...' questions, (2) handoff: verify 2+ engineers can independently operate each critical system"
+    weight: 0.35
+    description: "Emergency plan"
+  - type: llm_judge
+    criteria: "Long-term review strategy prevents knowledge concentration — review rotation policy: no engineer reviews more than 50% of PRs in any code area (prevents single-point-of-knowledge). Mandatory cross-review: every PR to critical systems gets reviewed by someone who DOESN'T usually work in that area (forces knowledge spreading). 'Explain your review' comments: reviewers who catch subtle issues must explain HOW they knew (captures the pattern recognition that experts have). Review knowledge base: significant review discussions are tagged, categorized, and searchable — becomes a 'why we did this' knowledge base. Architectural review requirement: changes to critical systems require a review comment explaining 'why this approach' not just 'does this work.' Bus factor tracking: dashboard showing how many engineers can independently review/maintain each system — alert when any system drops below 3"
+    weight: 0.35
+    description: "Prevention strategy"
+  - type: llm_judge
+    criteria: "Knowledge distribution measurement is systematic — metrics: (1) bus factor per system: number of engineers who have reviewed PRs in the system in the last 90 days AND resolved a production issue in the system. Target: minimum 3 per critical system. (2) Knowledge Gini coefficient: how concentrated is review expertise? 0 = perfectly distributed, 1 = one person knows everything. Track monthly, target < 0.4. (3) Review breadth per engineer: number of distinct code areas each engineer has reviewed — increasing means knowledge is spreading. (4) Independent operation test: quarterly exercise where each critical system is 'owned' by a different engineer for a week — can they handle incidents? Organizational change: (1) promote knowledge sharing in performance reviews, (2) engineer titles should require breadth of system knowledge, not just depth, (3) 'knowledge sharing bonus' for engineers who actively distribute their expertise through reviews"
+    weight: 0.30
+    description: "Distribution measurement"
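The "Knowledge Gini coefficient" in the last assertion is the standard Gini index applied to per-engineer review counts for a system. A minimal TypeScript sketch of that computation; the input data is invented:

// Gini over review counts: 0 = evenly spread knowledge, 1 = one person
// does all the reviewing. Uses the sorted-array formula
// G = (2 * sum(i * x_i)) / (n * sum(x_i)) - (n + 1) / n, i = 1..n ascending.
function knowledgeGini(reviewCounts: number[]): number {
  const xs = [...reviewCounts].sort((a, b) => a - b);
  const n = xs.length;
  const total = xs.reduce((s, x) => s + x, 0);
  if (n === 0 || total === 0) return 0;
  const weighted = xs.reduce((s, x, i) => s + (i + 1) * x, 0);
  return (2 * weighted) / (n * total) - (n + 1) / n;
}

// PRs reviewed per engineer in one system over 90 days (hypothetical):
console.log(knowledgeGini([40, 3, 2, 1, 1, 1])); // ~0.70, well above the 0.4 target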
package/courses/code-review-feedback-writing/scenarios/level-5/ma-review-alignment.yaml
ADDED
@@ -0,0 +1,50 @@
+meta:
+  id: ma-review-alignment
+  level: 5
+  course: code-review-feedback-writing
+  type: output
+  description: "Align review practices post-M&A — unify code review cultures when two engineering organizations merge, preserving the best of both while creating a cohesive new culture"
+  tags: [code-review, M&A, culture-alignment, merger, integration, master]
+
+state: {}
+
+trigger: |
+  Your company (80 engineers, structured review process, Python/Go
+  stack) just acquired a startup (30 engineers, fast-moving review
+  culture, Node.js/TypeScript stack). You need to unify the code
+  review practices.
+
+  Your company: 2 required approvals, security review for auth changes,
+  detailed PR templates, 24-hour SLA, formal review guidelines,
+  automated linting, code review training program.
+
+  Acquired startup: 1 approval ("trust the developer"), no security
+  review requirement, minimal PR descriptions, ship same-day,
+  no written guidelines, minimal automation, review by vibes.
+
+  Tensions:
+  - Startup engineers feel "slowed down by bureaucracy"
+  - Your engineers worry about "quality regression from the acquisition"
+  - Startup's best engineer says "I'll leave if you make me wait
+    2 days for a review"
+  - Your security team says "their code has 3 critical vulnerabilities
+    we found in due diligence"
+
+  Task: Design the review alignment strategy. Include: assessment
+  of both cultures' strengths, unified review framework, migration
+  timeline, how to handle resistance from both sides, and the target
+  state that preserves the best of both cultures.
+
+assertions:
+  - type: llm_judge
+    criteria: "Assessment respects both cultures' strengths — your company strengths: security rigor, consistent quality, thorough documentation, knowledge sharing through detailed reviews. Weaknesses: slow (24-hour SLA is too long), heavy process (2 approvals for everything is overkill for low-risk changes). Startup strengths: speed, developer trust, low friction, quick iteration. Weaknesses: no security review (3 vulnerabilities found), no documentation (knowledge in people's heads), inconsistent quality. Key insight: neither culture is 'right' — the unified culture should be faster than yours and more rigorous than theirs. 'The startup's speed is their competitive advantage. Our rigor is ours. The merged culture must have both.'"
+    weight: 0.35
+    description: "Culture assessment"
+  - type: llm_judge
+    criteria: "Unified framework takes the best of both — new framework: tiered reviews (adopted from startup speed + your company rigor). Tier 1 (low risk: docs, config, style-only): 1 approval, 4-hour SLA (startup speed). Tier 2 (standard: features, refactors): 1 approval, 12-hour SLA (compromise). Tier 3 (high risk: auth, payments, data): 2 approvals including security reviewer, 24-hour SLA (your company rigor). Auto-classification: CI determines tier based on changed files (CODEOWNERS + risk labels). Result: 60% of PRs get Tier 1 treatment (faster than before for both teams), 30% get Tier 2, 10% get Tier 3 (only when truly needed). Both sides benefit: your engineers get faster reviews for routine changes, startup engineers get security review only where it matters. Automation from your team + speed culture from startup"
+    weight: 0.35
+    description: "Unified framework"
+  - type: llm_judge
+    criteria: "Migration and resistance management are political — timeline: month 1 (mutual learning): engineers from both teams review each other's code, cross-pollination builds understanding. Month 2 (pilot): unified framework piloted on one integrated team (mix of both cultures). Month 3 (expand): all teams adopt unified framework with feedback iteration. Resistance handling: startup engineer threatening to leave: 'We hear you — Tier 1 reviews will be faster than your current process. Security review only applies to auth and payments code, which you'd want someone checking anyway.' Your security team: 'The startup code will go through a security audit sprint before integration. Going forward, Tier 3 classification ensures security review where it matters.' Key principle: don't force either culture to adopt the other — create something new that both contributed to. Name it: 'Combined Engineering Review Standard' — both teams co-authored. Leadership: joint review council with leads from both organizations"
+    weight: 0.30
+    description: "Migration strategy"
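The unified framework hinges on CI auto-classifying each PR into a tier from its changed files. A minimal TypeScript sketch of that rule; the path patterns and risk areas are assumptions, not a prescribed implementation:

type Tier = 1 | 2 | 3;

// Tier 3: high-risk areas needing 2 approvals plus security review.
const HIGH_RISK = [/^services\/auth\//, /^services\/payments\//, /migrations\//];
// Tier 1: low-risk changes eligible for the 4-hour, 1-approval fast path.
const LOW_RISK = [/\.md$/, /^docs\//, /\.(css|scss)$/];

function classify(changedFiles: string[]): Tier {
  if (changedFiles.some((f) => HIGH_RISK.some((re) => re.test(f)))) return 3;
  if (changedFiles.every((f) => LOW_RISK.some((re) => re.test(f)))) return 1;
  return 2; // standard features and refactors
}

console.log(classify(["docs/setup.md"]));                    // 1
console.log(classify(["services/auth/session.ts", "a.md"])); // 3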
package/courses/code-review-feedback-writing/scenarios/level-5/master-review-shift.yaml
ADDED
@@ -0,0 +1,49 @@
+meta:
+  id: master-review-shift
+  level: 5
+  course: code-review-feedback-writing
+  type: output
+  description: "Combined master shift — present a complete code review transformation strategy to the board after acquiring two companies, unifying three engineering cultures while maintaining delivery velocity"
+  tags: [code-review, combined, shift-simulation, transformation, board, master]
+
+state: {}
+
+trigger: |
+  You're the CTO of a newly formed $200M ARR company created by
+  merging three organizations:
+
+  Company A (yours, 120 engineers): Mature review process, high
+  quality, but slow (7-day average merge time)
+
+  Company B (acquired, 60 engineers): Fast review culture, good
+  velocity, but 15% defect escape rate and toxic senior reviewer
+  who's also their best architect
+
+  Company C (acquired, 40 engineers): No formal review process,
+  developers self-merge, highest velocity but unpredictable quality
+
+  Board mandate: "Unify engineering, maintain delivery speed, improve
+  quality, reduce attrition. Present a plan in 2 weeks."
+
+  Total: 220 engineers, 15 teams, 3 tech stacks, 3 review cultures,
+  $2M available for process/tooling investment.
+
+  Task: Write the board presentation for the unified code review
+  strategy. Include: current state assessment, target state, 12-month
+  transformation roadmap, investment plan, risk mitigation (the
+  toxic reviewer, self-merge culture, speed regression), and
+  expected business outcomes.
+
+assertions:
+  - type: llm_judge
+    criteria: "Current state assessment is honest and actionable — Company A: high quality but slow — review is a bottleneck, developers wait a week to ship. Cost: $X in delayed features. Company B: good velocity but quality issues — 15% defect rate costs $Y in incident response. Toxic reviewer: brilliant but driving attrition — 3 engineers left because of them. Company C: fastest but unpredictable — customer-facing bugs erode trust. Self-merge means no knowledge sharing — bus factor of 1 on multiple systems. Combined risk: if we take the worst of each culture, we get slow + buggy + no review. Opportunity: if we take the best, we get fast + high quality + knowledge sharing. The merged culture must be better than any individual culture"
+    weight: 0.35
+    description: "Current assessment"
+  - type: llm_judge
+    criteria: "12-month roadmap is phased and measured — quarter 1 (foundation): (1) address toxic reviewer immediately (coaching, expectations, consequences if unchanged), (2) unified review guidelines published, (3) CI/CD automation across all three stacks (Prettier/ESLint equivalents), (4) tiered review requirements (risk-based), (5) Company C implements minimum review requirement (1 approval). Quarter 2 (integration): (1) cross-team review exchanges, (2) reviewer training for Company C engineers, (3) shared metrics dashboard, (4) reduce Company A merge time to 3 days. Quarter 3 (optimization): (1) automated reviewer assignment, (2) cross-training on critical systems, (3) review quality calibration across all teams. Quarter 4 (maturity): (1) unified culture with team autonomy, (2) all metrics at target, (3) sustainable process that works at 300+ engineers. Each quarter: measurable milestones, risk checkpoints, course correction criteria"
+    weight: 0.35
+    description: "Phased roadmap"
+  - type: llm_judge
+    criteria: "Board presentation connects to business outcomes — investment: $2M over 12 months — $800K tooling (CI/CD, automation, monitoring), $600K additional headcount (2 DevEx engineers, 1 engineering manager), $400K training, $200K process design and facilitation. Returns: (1) quality improvement: Company B defect rate 15% → 5% = $Z saved in incidents. (2) Velocity: Company A merge time 7 → 3 days = 40% more features shipped. (3) Retention: address toxic culture + unify processes → reduce attrition from 18% to 10% = $W saved in hiring. (4) Knowledge sharing: reduce bus factor risk across all three codebases. Total projected return: 4x ROI on $2M investment. Risks: (1) toxic reviewer escalation (mitigation: clear consequences, replacement plan), (2) Company C resistance to any review (mitigation: start with low-friction requirements, demonstrate value), (3) speed regression during transition (mitigation: tiered approach, fast track for low-risk changes). Success criteria at 12 months: merge time < 3 days, defect escape < 5%, satisfaction > 7/10, attrition < 12%"
+    weight: 0.30
+    description: "Business outcomes"
package/courses/code-review-feedback-writing/scenarios/level-5/review-competitive-advantage.yaml
ADDED
@@ -0,0 +1,48 @@
+meta:
+  id: review-competitive-advantage
+  level: 5
+  course: code-review-feedback-writing
+  type: output
+  description: "Position review culture as competitive advantage — articulate how excellent code review practices create defensible advantages in talent acquisition, product quality, and engineering velocity"
+  tags: [code-review, competitive-advantage, talent, engineering-brand, culture, master]
+
+state: {}
+
+trigger: |
+  You're competing with FAANG companies for engineering talent. Your
+  startup can't match their compensation, but your engineering culture
+  is exceptional. Specifically, your code review culture is a
+  differentiator:
+
+  - Engineers grow faster because reviews are mentoring opportunities
+  - Code quality is high because reviews catch issues early
+  - Knowledge is shared because reviews cross-pollinate ideas
+  - Psychological safety is high because reviews are respectful
+  - Engineers ship fast because reviews are efficient (< 24 hours)
+
+  Evidence:
+  - Glassdoor: 4.8/5 for engineering culture (competitor average: 3.9)
+  - 40% of hires come from referrals (engineers recommend the company)
+  - Engineering blog posts about review culture get 100K+ views
+  - Time-to-productivity for new hires: 3 weeks (industry: 3 months)
+  - Retention: 92% (industry: 80%)
+
+  Task: Write the strategic narrative of code review culture as
+  competitive advantage. Include: how to communicate this in
+  recruiting, how to maintain it as you grow, how to measure the
+  business impact, and how to share the approach publicly without
+  losing the advantage.
+
+assertions:
+  - type: llm_judge
+    criteria: "Review culture is framed as a strategic asset — thesis: 'Our code review culture is a compounding competitive advantage. It attracts better engineers, develops them faster, and produces higher quality software — creating a virtuous cycle that competitors can't easily replicate.' Three pillars: (1) Talent magnet: engineers want to work somewhere they'll grow. Mentoring through review is visible proof of growth culture. (2) Quality multiplier: reviews catch bugs, share knowledge, and maintain standards — the compound effect is software that rarely breaks and is easy to evolve. (3) Speed enabler: thorough reviews enable confident shipping — teams deploy 15x/day because they trust the quality gates. Defensibility: culture takes years to build and can't be acquired or copied quickly. A competitor can match our tech stack in months but can't match our review culture in less than 2-3 years"
+    weight: 0.35
+    description: "Strategic narrative"
+  - type: llm_judge
+    criteria: "Recruiting and public sharing are strategic — recruiting: every engineering interview includes a 'review a real PR' exercise — candidates experience the culture firsthand. Job descriptions: 'You'll receive thoughtful code reviews from senior engineers who invest in your growth.' Candidate testimonials: engineers share review examples that taught them something. Public sharing: engineering blog series 'How we do code reviews' — transparent about practices but competitive advantage is in execution, not knowledge. Conference talks about review culture (thought leadership attracts talent). Open-source review guidelines (builds reputation, attracts talent who align with values). Why sharing doesn't lose the advantage: 'The practices are simple. The execution is hard. Knowing how we do reviews doesn't help competitors replicate 4 years of culture-building.'"
+    weight: 0.35
+    description: "Recruiting and sharing"
+  - type: llm_judge
+    criteria: "Growth maintenance and measurement are addressed — maintaining at scale: (1) review culture document is part of Day 1 onboarding, (2) new managers trained on 'review culture guardian' role, (3) every promotion criteria includes 'quality of code reviews given and received,' (4) quarterly culture health survey with review-specific questions, (5) 'culture carriers' program — engineers who exemplify the culture and mentor others. Warning signs: (1) review turnaround creeping up, (2) satisfaction scores dropping, (3) reviews becoming rubber stamps, (4) new hires reporting different culture than expected. Business measurement: (1) referral rate as culture proxy (employees recommend what they're proud of), (2) Glassdoor score trend, (3) time-to-productivity trend, (4) retention by tenure cohort. Economic value: calculate total value of retention advantage + recruiting cost savings + quality advantage + velocity advantage = 'our review culture is worth $X million per year in tangible business outcomes'"
+    weight: 0.30
+    description: "Maintenance and measurement"
package/courses/code-review-feedback-writing/scenarios/level-5/review-organizational-learning.yaml
ADDED
|
@@ -0,0 +1,46 @@
+meta:
+  id: review-organizational-learning
+  level: 5
+  course: code-review-feedback-writing
+  type: output
+  description: "Design reviews as organizational learning — transform code review from quality gate into the primary mechanism for knowledge sharing and skill development across the engineering organization"
+  tags: [code-review, organizational-learning, knowledge-sharing, skill-development, master]
+
+state: {}
+
+trigger: |
+  You lead engineering at a 200-person tech company. You notice that
+  despite having excellent documentation and onboarding, knowledge
+  stays siloed:
+
+  - The payments team doesn't know how the auth team solved a similar
+    caching problem
+  - Junior developers learn the "what" from docs but not the "why"
+    from experienced engineers
+  - When a senior engineer leaves, their tacit knowledge vanishes
+  - Code review comments contain valuable architecture decisions that
+    are lost after the PR is merged
+  - Teams reinvent solutions that other teams already built
+
+  You want to transform code review from a "quality gate" (pass/fail)
+  into a "knowledge network" (learning system).
+
+  Task: Design code review as an organizational learning system.
+  Include: how reviews capture and distribute knowledge, how to
+  measure learning outcomes, cross-team review programs, knowledge
+  preservation from review discussions, and the organizational
+  change needed to support this transformation.
+
+assertions:
+  - type: llm_judge
+    criteria: "Review as knowledge capture is systematic — knowledge capture: (1) review comments that explain 'why' are tagged and searchable (not just lost in PR history); Architecture Decision Records (ADRs) are generated from significant review discussions. (2) Pattern library: when a reviewer suggests a pattern, it's captured in a shared 'patterns' repository with the original context. (3) Decision search: 'Why did we choose PostgreSQL over DynamoDB for this service?' → search review discussions for the original debate. (4) Onboarding knowledge: a new engineer reads reviews of their team's code to understand design decisions. Implementation: GitHub Discussions for significant decisions, auto-linking from PRs to related decisions, a tagging system for review insights"
+    weight: 0.35
+    description: "Knowledge capture"
+  - type: llm_judge
+    criteria: "Cross-team learning programs break silos — cross-team review rotation: each engineer reviews 1 PR/month from another team. Purpose: learn different approaches, share patterns, build organizational understanding. Format: 'review exchange' — Team A reviews Team B's PR, Team B reviews Team A's PR, followed by a 30-min discussion. Knowledge broker role: senior engineers designated as 'connectors' who review across teams and identify reuse opportunities ('Team C solved this same problem — see PR #4567'). Review reading group: monthly session where the team reads and discusses an excellent review (from any team) — what made it good? What can we learn? Community of practice: reviewers who focus on specific domains (security, performance, testing) meet monthly to share findings"
+    weight: 0.35
+    description: "Cross-team learning"
+  - type: llm_judge
+    criteria: "Learning outcomes are measurable — metrics: (1) knowledge distribution: are more people qualified to review each code area over time? (bus-factor improvement). (2) Pattern reuse: how often do teams adopt patterns discovered through cross-team reviews? (3) Onboarding velocity: do new engineers who read review history ramp up faster? (4) Repeat issues: do the same types of bugs decrease over time? (evidence that review learning is working). (5) Teaching effectiveness: do reviewers who explain 'why' produce authors who need less guidance over time? Cultural-shift metrics: (1) Do engineers perceive reviews as learning opportunities? (survey). (2) Do engineers voluntarily request reviews from unfamiliar teams? (behavior). (3) Are review discussions referenced in design documents? (knowledge flow). Organizational change: leadership must model learning-through-review, review time must be valued (not seen as 'not coding'), and review quality must be in performance criteria"
+    weight: 0.30
+    description: "Learning metrics"
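One way the "Knowledge capture" assertion could be exercised in practice: harvest review comments that carry a decision marker into a searchable log. A sketch under stated assumptions; the `ReviewComment` shape and the `#decision` tag are invented for illustration, not a real GitHub API type or convention:

```python
# Harvest decision-bearing review comments into a searchable log.
# ReviewComment is a stand-in shape: in a real system, comments would be
# fetched from the code host's API first.
from dataclasses import dataclass

@dataclass
class ReviewComment:
    pr: int
    author: str
    body: str

def extract_decisions(comments: list[ReviewComment], tag: str = "#decision"):
    """Keep only comments whose body carries the (hypothetical) decision tag."""
    return [
        {"pr": c.pr, "author": c.author,
         "decision": c.body.replace(tag, "").strip()}
        for c in comments if tag in c.body
    ]

def search_decisions(log: list[dict], term: str) -> list[dict]:
    """Naive keyword search over the captured decision log."""
    return [d for d in log if term.lower() in d["decision"].lower()]

comments = [
    ReviewComment(4567, "dana", "#decision Chose PostgreSQL over DynamoDB: "
                                "we need transactional joins across tenants."),
    ReviewComment(4601, "lee", "nit: rename this variable"),
]
log = extract_decisions(comments)
print(search_decisions(log, "postgresql"))
```

A production version would likely replace the naive search with full-text indexing and auto-link each captured decision back to its PR.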
package/courses/code-review-feedback-writing/scenarios/level-5/review-roi-analysis.yaml ADDED
@@ -0,0 +1,51 @@
+meta:
+  id: review-roi-analysis
+  level: 5
+  course: code-review-feedback-writing
+  type: output
+  description: "Calculate code review ROI — build a rigorous cost-benefit analysis of code review investment including time costs, defect prevention, knowledge sharing, and talent retention"
+  tags: [code-review, ROI, cost-benefit, financial-analysis, business-case, master]
+
+state: {}
+
+trigger: |
+  The CFO challenges your engineering budget: "Your engineers spend
+  30% of their time reviewing each other's code. That's 18 full-time
+  engineer equivalents doing nothing but reading code. Can we cut
+  reviews in half and hire 10 fewer engineers?"
+
+  You need a rigorous ROI analysis. Data:
+
+  Engineering costs:
+  - 60 engineers × $180K average total comp = $10.8M/year
+  - 30% on review = $3.24M/year on review activities
+  - Average time-to-hire: 3 months, cost: $45K per hire
+
+  Quality data:
+  - With current reviews: 4% defect escape rate (20 production bugs/year)
+  - Average production bug cost: $25K (investigation, fix, customer impact)
+  - Before reviews (2 years ago): 18% defect escape rate (90 bugs/year)
+
+  Knowledge/talent data:
+  - Attrition: 8% (industry: 15%)
+  - New hire productivity: full speed in 3 months (industry: 6 months)
+  - Knowledge sharing: 65% of cross-team knowledge via reviews
+
+  Task: Build the financial model for the CFO. Include: the true cost
+  of reviews, value of defect prevention, value of knowledge sharing,
+  value of talent retention, and the optimal review investment level.
+  Answer: can we cut reviews without consequence?
+
+assertions:
+  - type: llm_judge
+    criteria: "Cost model is rigorous and transparent — review cost: $3.24M/year (30% of engineering time). But not all review time is equal. Breakdown: 40% reading/understanding code (this IS knowledge sharing), 30% writing feedback (this IS mentoring), 20% style/formatting discussion (this SHOULD be automated), 10% back-and-forth disagreements (this CAN be reduced). The actual waste is ~30% of review time (~$970K), not the full $3.24M. Reduction opportunity: automate style checks (eliminating 20% of review time = $648K savings), streamline the process (less back-and-forth = $324K savings). Total recoverable: ~$972K without touching quality review time. This is very different from 'cutting reviews in half'"
+    weight: 0.35
+    description: "Cost model"
+  - type: llm_judge
+    criteria: "Value model quantifies benefits across dimensions — defect prevention: currently 20 bugs/year × $25K = $500K. Without reviews (historical): 90 bugs/year × $25K = $2.25M. Net prevention value: $1.75M/year. If we cut reviews in half, estimate 50 bugs/year × $25K = $1.25M (a net loss of $750K). Talent retention: the 8% vs 15% attrition difference = 4.2 fewer departures × ($45K hiring + $90K ramp-up + $180K first-year productivity loss) = $1.32M/year saved. Onboarding speed: 3- vs 6-month ramp = 3 months × $15K/month × 10 new hires/year = $450K value. Knowledge sharing: prevents knowledge silos, reduces bus-factor risk (quantify: if a key engineer leaves unplanned, estimated recovery cost is $500K per critical system). Total value of the current review investment: $3.5M+ annually for a $3.24M cost = positive ROI"
+    weight: 0.35
+    description: "Value quantification"
+  - type: llm_judge
+    criteria: "Recommendation addresses the CFO's specific question — 'Can we cut reviews in half? No — but we can make reviews 30% more efficient.' The 30% efficiency gain: (1) automate style/formatting ($648K saved), (2) reduce review rounds with better PR templates ($324K saved), (3) total savings: $972K/year — equivalent to hiring 5.4 additional engineers. The 10 hires the CFO wants to avoid: our review efficiency improvements free up over half of that capacity without the quality and retention risks. Optimal review investment: ~20-22% of engineering time (down from 30%) achieves the same quality outcomes. Below 15%: defect rate increases measurably. Below 10%: knowledge sharing collapses and attrition rises. Visualization: show the diminishing-returns curve — quality vs review investment percentage. Present as: 'We're over-investing in the wrong parts of review. Let me redirect that investment, not eliminate it.'"
+    weight: 0.30
+    description: "Actionable recommendation"
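The "Cost model" and "Value quantification" assertions pin the expected answer to specific figures. A quick Python check, using only the scenario's own numbers, confirms they are internally consistent:

```python
# Consistency check for the scenario's figures. The 40/30/20/10 split of
# review time comes from the "Cost model" assertion; all values are
# annual USD taken from the trigger data above.
ENGINEERS, AVG_COMP = 60, 180_000
REVIEW_SHARE = 0.30

review_cost = ENGINEERS * AVG_COMP * REVIEW_SHARE      # $3.24M/year

# Only style debates (20%) and excess back-and-forth (10%) count as waste;
# reading code and writing feedback are knowledge sharing and mentoring.
recoverable = review_cost * (0.20 + 0.10)              # ~$972K

# Defect prevention: 90 bugs/year historically vs 20 now, $25K each.
prevention_value = (90 - 20) * 25_000                  # $1.75M

# Retention: 8% vs 15% attrition, i.e. 4.2 fewer departures per year,
# each costing hiring + ramp-up + first-year productivity loss.
retention_value = ENGINEERS * (0.15 - 0.08) * (45_000 + 90_000 + 180_000)

# Onboarding: 3 months faster ramp for ~10 hires/year at $15K/month.
onboarding_value = 3 * (AVG_COMP / 12) * 10            # $450K

total_value = prevention_value + retention_value + onboarding_value
print(f"cost ${review_cost:,.0f} | recoverable ${recoverable:,.0f} | "
      f"value ${total_value:,.0f}")
```

Running this prints $3,240,000 in cost, $972,000 recoverable, and $3,523,000 in value, matching the $3.24M, ~$972K, and "$3.5M+" quoted in the assertions.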
package/courses/code-review-feedback-writing/scenarios/level-5/review-velocity-impact.yaml ADDED
@@ -0,0 +1,44 @@
+meta:
+  id: review-velocity-impact
+  level: 5
+  course: code-review-feedback-writing
+  type: output
+  description: "Analyze review impact on velocity — quantify how code review practices affect engineering team velocity, quality, and delivery predictability"
+  tags: [code-review, velocity, engineering-metrics, delivery, impact-analysis, master]
+
+state: {}
+
+trigger: |
+  The CEO asks: "Code review takes 30% of our engineering time. Is it
+  worth it? Should we reduce review requirements to ship faster?"
+
+  Data available:
+  - Engineering time: 30% of developer hours spent on review activities
+  - Cycle time: average 7 days from PR open to production (review is
+    4 of those 7 days)
+  - Defect rate: teams with thorough reviews have 3x fewer production bugs
+  - But: teams with the fastest reviews ship 2x more features per quarter
+  - Developer satisfaction: inversely correlated with review wait time
+  - Knowledge sharing: 60% of cross-team knowledge transfer happens
+    through code reviews
+  - Onboarding: new engineers who receive detailed reviews reach full
+    productivity 40% faster
+
+  Task: Write the analysis for the CEO. Include: the true cost of
+  code review, the true cost of NOT reviewing, optimization
+  opportunities (maintain quality, reduce time), and a recommendation
+  with projected impact. Use data to tell the story.
+
+assertions:
+  - type: llm_judge
+    criteria: "Cost analysis is honest and complete — cost of review: 30% of engineering time = $X million/year (calculated from team size and average salary); review occupies 4 of the 7 cycle-time days — a bottleneck. Opportunity cost: features not shipped during review wait time. Cost of NOT reviewing: 3x more production bugs × average bug-fix cost ($5K-$50K each) = $Y million/year in bug costs. Knowledge silos: without reviews, knowledge stays in individual heads (bus-factor risk). Onboarding: 40% slower onboarding = $Z additional cost per new hire. Comparison: review investment should be compared to the alternative (more bugs, slower onboarding, knowledge loss), not to zero. Key insight: 'The question isn't whether to do reviews — it's whether we're doing them efficiently.'"
+    weight: 0.35
+    description: "Cost analysis"
+  - type: llm_judge
+    criteria: "Optimization recommendations maintain quality while reducing time — 'We can reduce review time from 4 days to 1.5 days without reducing quality.' Specific interventions: (1) Automate style/formatting (saves 40% of review time — Prettier, ESLint). (2) PR size limits (smaller PRs are reviewed faster — 200 vs 500 lines saves 2-3 hours per review). (3) Reviewer-assignment automation (eliminates the 1-2 day assignment delay). (4) Review SLA (24-hour first-review commitment). (5) High-quality first submissions (PR templates and a self-review checklist reduce review rounds). Projected impact: review time from 30% to 20% of engineering hours, cycle time from 7 to 4 days, feature delivery +30%, quality maintained or improved. Each recommendation comes with expected time savings and implementation cost"
+    weight: 0.35
+    description: "Optimization plan"
+  - type: llm_judge
+    criteria: "Recommendation is data-driven and CEO-friendly — framing: 'Code review is engineering infrastructure — like CI/CD or monitoring. The question isn't whether to invest, but how to invest efficiently.' Recommendation: optimize, don't reduce. Reducing review thoroughness has compounding costs: bugs cost 10x more to fix in production than in review, knowledge loss is irreversible, onboarding slows permanently. Projected outcomes: invest $50K in automation + 90-day process improvement → save $200K/year in review time + $150K/year in prevented bugs + $100K/year in faster onboarding = $450K annual return on a $50K investment. Timeline: improvements visible in 30 days (automation), fully realized in 90 days (process changes). Dashboard: CEO-level metrics — cycle time trend, defect rate trend, developer satisfaction trend, features delivered per quarter"
+    weight: 0.30
+    description: "CEO recommendation"
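The "CEO recommendation" assertion projects a $450K annual return on a $50K investment. A small sketch of that payback arithmetic, using the scenario's illustrative figures:

```python
# Payback arithmetic for the optimization plan: a one-off $50K investment
# against the three recurring savings streams named in the assertion.
investment = 50_000
annual_savings = {
    "review time recovered": 200_000,
    "bugs prevented": 150_000,
    "faster onboarding": 100_000,
}
total = sum(annual_savings.values())                 # $450K/year
first_year_roi = (total - investment) / investment   # 800%
payback_months = 12 * investment / total             # ~1.3 months
print(f"${total:,}/year return, first-year ROI {first_year_roi:.0%}, "
      f"payback in {payback_months:.1f} months")
```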
package/courses/code-review-feedback-writing/scenarios/level-5/scaling-reviews-100-plus.yaml ADDED
@@ -0,0 +1,45 @@
+meta:
+  id: scaling-reviews-100-plus
+  level: 5
+  course: code-review-feedback-writing
+  type: output
+  description: "Scale reviews to 100+ engineers — design review processes that maintain quality and culture at large scale across multiple teams, timezones, and experience levels"
+  tags: [code-review, scaling, large-team, process-design, multi-timezone, master]
+
+state: {}
+
+trigger: |
+  Your company is growing from 50 to 200 engineers over 18 months.
+  The current review process works well at 50 but won't scale:
+
+  - Manual reviewer assignment (team lead picks reviewers)
+  - One review channel in Slack (50 engineers, manageable noise)
+  - Informal mentoring through review (seniors know all juniors)
+  - Single coding standard (everyone agreed in one meeting)
+  - Everyone knows everyone (trust is personal, not institutional)
+
+  At 200 engineers across 4 timezones:
+  - Team leads can't know everyone's expertise
+  - The Slack channel becomes unusable noise
+  - Seniors can't personally mentor 150 juniors
+  - Standards need to be written (can't be tribal knowledge)
+  - Trust must be institutional (process-based, not personal)
+
+  Task: Design the review process for 200 engineers. Include:
+  automated assignment, tiered review requirements, timezone-aware
+  scheduling, knowledge preservation systems, and how to maintain
+  the collaborative culture that worked at 50 people.
+
+assertions:
+  - type: llm_judge
+    criteria: "Automated systems replace manual coordination — assignment: CODEOWNERS + automated round-robin weighted by: expertise (code area familiarity), load (current open reviews), timezone (prefer the same or an adjacent timezone for sync discussions), cross-pollination (10% of assignments go to engineers outside the team for knowledge sharing). Tiered requirements: Tier 1 (low risk: docs, config, style) — 1 approval, any engineer. Tier 2 (standard: features, refactors) — 1 approval, team member. Tier 3 (high risk: security, data, infrastructure) — 2 approvals, at least 1 senior. Tiers are auto-detected: CI analyzes changed files against CODEOWNERS patterns. Slack: replace the single channel with per-team channels + a cross-team review request bot. Dashboard: real-time review queue visibility per team"
+    weight: 0.35
+    description: "Automated systems"
+  - type: llm_judge
+    criteria: "Timezone awareness prevents review latency — async-first design: reviews should be completable without real-time discussion (high-quality PR descriptions, thorough first review pass). Timezone routing: assign reviewers in the same or a +/- 4-hour timezone when possible. Follow-the-sun for urgent reviews: if no reviewer is available in the requester's timezone, auto-escalate to the next timezone's pool. SLA adjustment by timezone overlap: same timezone = 8 hours, adjacent = 16 hours, opposite = 24 hours. Review handoff: if a review discussion is complex, write a summary for the next timezone's reviewer rather than waiting for sync. Critical path: reviews blocking production deploys get priority routing regardless of timezone. Meeting-light: replace synchronous design reviews with async RFC documents + async review comments"
+    weight: 0.35
+    description: "Timezone design"
+  - type: llm_judge
+    criteria: "Culture preservation strategy is explicit — what to preserve from the 50-person culture: (1) constructive tone — encode in written guidelines with examples, enforce via manager review. (2) Mentoring — replace personal mentoring with a structured program: each junior paired with a mentor-reviewer for 6 months. (3) Trust — shift from personal trust ('I know Alex does good work') to process trust ('this PR passed our quality gates'). (4) Shared ownership — cross-team review rotation prevents silos. What to let go: (1) everyone knowing everyone (impossible at 200), (2) a single review channel (too noisy), (3) ad-hoc standards (must be documented). New at scale: (1) written review culture document (onboarding reading), (2) review quality calibration (quarterly, team-level), (3) review recognition program (celebrate excellent reviews monthly), (4) review retrospectives (what's working, what's not, quarterly). Success metric: 'Would a new hire who joined 3 months ago describe our review culture the same way a 3-year veteran would?'"
+    weight: 0.30
+    description: "Culture preservation"
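A minimal sketch of the weighted reviewer assignment described in the "Automated systems" assertion; the weights, the load saturation point, and the `Reviewer` shape are all assumptions for illustration:

```python
# Score reviewer candidates on expertise, current load, and timezone
# proximity, then pick the best. Weights are arbitrary starting points.
from dataclasses import dataclass

@dataclass
class Reviewer:
    name: str
    expertise: float     # 0..1 familiarity with the changed code area
    open_reviews: int    # current review load
    utc_offset: int

def score(r: Reviewer, author_offset: int,
          w_exp: float = 0.5, w_load: float = 0.3, w_tz: float = 0.2) -> float:
    load_penalty = min(r.open_reviews / 5, 1.0)       # saturate at 5 reviews
    tz_gap = min(abs(r.utc_offset - author_offset), 12) / 12
    return w_exp * r.expertise - w_load * load_penalty - w_tz * tz_gap

def assign(candidates: list[Reviewer], author_offset: int) -> Reviewer:
    return max(candidates, key=lambda r: score(r, author_offset))

pool = [Reviewer("ana", 0.9, 4, -5), Reviewer("bo", 0.6, 1, -4),
        Reviewer("chen", 0.8, 2, 8)]
print(assign(pool, author_offset=-5).name)  # "bo": less expert but unloaded
```

The CODEOWNERS filtering and the 10% cross-pollination assignments the criteria describe would sit upstream of this scoring step, shaping the candidate pool before any scores are computed.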