npm - dojo.md - Versions diffs - 0.1.0 → 0.2.0 - Mend

dojo.md 0.1.0 → 0.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (243) hide show

package/courses/github-actions-cicd/scenarios/level-4/cicd-incident-response.yaml ADDED Viewed

@@ -0,0 +1,60 @@
+meta:
+  id: cicd-incident-response
+  level: 4
+  course: github-actions-cicd
+  type: output
+  description: "Respond to CI/CD incidents — handle platform outages, security breaches, and deployment failures at the organizational level"
+  tags: [github, actions, incident-response, outage, security, postmortem, expert]
+state: {}
+trigger: |
+  You're the Head of Platform Engineering handling 3 major CI/CD
+  incidents simultaneously. Each requires a different response strategy.
+  Incident 1 — GitHub Actions outage (Severity: P1):
+  GitHub Actions has been experiencing degraded performance for 2 hours.
+  Workflow runs are queuing but not starting. GitHub's status page says
+  "Investigating." Your entire engineering org (500 developers) is
+  blocked. 15 production deployments are queued, including a critical
+  security patch. The CEO asks: "When will engineering be unblocked?"
+  You need to: determine the blast radius, activate the business
+  continuity plan, get the security patch deployed manually, and
+  communicate to the organization.
+  Incident 2 — Credential leak via CI logs (Severity: P1):
+  A workflow accidentally printed a database connection string (with
+  password) to the build logs. The logs are public (open-source repo).
+  The credentials provide read/write access to the production database
+  containing 2M user records. The log has been visible for 6 hours.
+  You need to: assess data exposure, rotate credentials, scrub the
+  logs, notify the security team, and determine if data was accessed.
+  Incident 3 — Deployment pipeline corruption (Severity: P2):
+  A misconfigured reusable workflow update caused all deployments to
+  deploy to the wrong environment. For 4 hours, staging deployments
+  went to production and production deployments went to staging. 3
+  untested features are now in production, and production data is being
+  written by staging code. No data loss yet, but data consistency is
+  at risk.
+  Task: Handle all 3 incidents. For each, write: the incident
+  commander's playbook (first 15 minutes, first hour, resolution),
+  the communication plan (who gets told what and when), the technical
+  resolution steps, and the post-incident review (root cause, corrective
+  actions, process improvements). Include: the unified incident report
+  for the CTO and the long-term platform resilience improvements.
+assertions:
+  - type: llm_judge
+    criteria: "Incident response follows proper IC protocol — establishes incident commander, defines clear communication channels, provides regular status updates, and each incident has a clear escalation path and resolution criteria"
+    weight: 0.35
+    description: "Proper IC protocol"
+  - type: llm_judge
+    criteria: "Technical resolutions are correct — GitHub outage has a BCP (fallback to self-hosted or manual deploy for security patch), credential leak has immediate rotation + log scrubbing + access audit, and environment swap is fixed with immediate rollback and data integrity verification"
+    weight: 0.35
+    description: "Correct technical resolutions"
+  - type: llm_judge
+    criteria: "Post-incident improvements are systemic — identifies that all 3 incidents reveal platform resilience gaps (no GitHub fallback, secrets in logs possible, environment config as single point of failure) and proposes permanent fixes"
+    weight: 0.30
+    description: "Systemic post-incident improvements"

package/courses/github-actions-cicd/scenarios/level-4/cicd-org-design.yaml ADDED Viewed

@@ -0,0 +1,59 @@
+meta:
+  id: cicd-org-design
+  level: 4
+  course: github-actions-cicd
+  type: output
+  description: "Design CI/CD organization — build teams, define roles, and create career paths for CI/CD platform engineering"
+  tags: [github, actions, org-design, platform-engineering, teams, career-paths, expert]
+state: {}
+trigger: |
+  You're the VP of Engineering creating a dedicated Platform Engineering
+  team that owns CI/CD. The company has 600 engineers and no centralized
+  CI/CD ownership — each team manages their own workflows, leading to
+  inconsistency, duplication, and poor developer experience.
+  Current pain points:
+  - 40% of developer time in surveys is "fighting CI/CD issues"
+  - No standard deployment process across 200 repos
+  - 3 different cloud authentication methods (static keys, OIDC, SSO)
+  - Self-hosted runners managed by 3 different teams with no coordination
+  - Security team manually audits CI/CD configurations quarterly
+  - New team onboarding takes 2 weeks just for CI/CD setup
+  Organizational mandate:
+  - Create a Platform Engineering team focused on CI/CD
+  - Headcount: 15 FTEs (mix of engineers, SREs, and program managers)
+  - Budget: $3M/year (headcount + infrastructure)
+  - Goal: Reduce "fighting CI/CD" from 40% to 10% within 1 year
+  Design questions:
+  1. Team structure: How to organize 15 people across different functions
+  2. Interaction model: How does the platform team serve 600 engineers
+  3. Service catalog: What services does the team provide
+  4. Career ladder: How do platform engineers grow
+  5. Success metrics: How to measure the team's impact
+  6. Funding model: Chargeback to teams vs central funding
+  Task: Design the complete organization. Write: the team structure
+  (roles, responsibilities, reporting), the service catalog (what the
+  team provides and what it doesn't), the interaction model with product
+  teams (self-service platform, office hours, embedded support), the
+  career ladder (from junior platform engineer to staff/principal), the
+  success metrics and OKRs for the first year, and the funding model
+  with cost allocation.
+assertions:
+  - type: llm_judge
+    criteria: "Team structure is balanced — includes platform engineers (build tooling), SREs (reliability/observability), and a program manager (governance/adoption), with clear roles and the right ratio for a 15-person team"
+    weight: 0.35
+    description: "Balanced team structure"
+  - type: llm_judge
+    criteria: "Service catalog is clear about boundaries — defines what the team owns (shared workflows, runner fleet, deployment pipelines) vs what product teams own (app-specific tests, feature flags), with a self-service model that scales beyond the 15-person team"
+    weight: 0.35
+    description: "Clear service catalog boundaries"
+  - type: llm_judge
+    criteria: "Success metrics are measurable — includes both output metrics (platform uptime, deployment frequency) and outcome metrics (developer satisfaction, time-to-first-deploy for new teams), and the OKRs target reducing 'fighting CI/CD' from 40% to 10%"
+    weight: 0.30
+    description: "Measurable success metrics"

package/courses/github-actions-cicd/scenarios/level-4/cicd-platform-architecture.yaml ADDED Viewed

@@ -0,0 +1,63 @@
+meta:
+  id: cicd-platform-architecture
+  level: 4
+  course: github-actions-cicd
+  type: output
+  description: "Architect a CI/CD platform — design the internal platform that manages runners, workflows, and deployments at enterprise scale"
+  tags: [github, actions, platform, architecture, runners, scaling, expert]
+state: {}
+trigger: |
+  You're the Staff Engineer designing the internal CI/CD platform for
+  a 800-engineer organization. The platform sits between GitHub Actions
+  and the cloud infrastructure, providing a managed experience.
+  Platform requirements:
+  1. Runner management: Auto-scaled runner pools for different workloads
+     (CI, deployment, ML training, iOS builds)
+  2. Workflow orchestration: Standard deployment pipelines that teams
+     adopt via reusable workflows
+  3. Secret management: Centralized secret store (HashiCorp Vault)
+     integrated with GitHub Actions via OIDC
+  4. Artifact management: Central artifact store with retention policies
+  5. Observability: Metrics, logs, and traces for all workflow runs
+  6. Cost management: Per-team cost allocation and budget enforcement
+  Scale requirements:
+  - 5,000 workflow runs per day
+  - 200 concurrent jobs during peak
+  - 4 runner pools: Linux CI (150 runners), Linux GPU (10), macOS (20),
+    Windows (20)
+  - 99.9% uptime (CI platform down = all teams blocked)
+  - < 30 second job startup time (cold start)
+  Technical constraints:
+  - Runs on existing Kubernetes cluster (AWS EKS)
+  - Must use actions-runner-controller (ARC) for runner management
+  - Must integrate with existing Vault, Datadog, and PagerDuty
+  - Must support both GitHub Enterprise Cloud and self-hosted runners
+  - Budget: $500K/year for infrastructure
+  Task: Design the CI/CD platform architecture. Write: the system
+  architecture (components, data flow, integrations), the runner
+  auto-scaling design (scale-to-zero, warm pools, GPU scheduling),
+  the observability stack (metrics to collect, dashboards, alerts),
+  the cost management system (per-team tracking, budget alerts),
+  and the disaster recovery plan (what happens when the platform goes
+  down). Include architecture diagrams (text-based) and capacity
+  planning calculations.
+assertions:
+  - type: llm_judge
+    criteria: "Architecture handles the scale — ARC configuration supports 200 concurrent jobs with proper auto-scaling policies, runner pool isolation prevents noisy-neighbor issues, cold start is addressed with warm pools, and the $500K budget is allocated across compute, storage, and networking"
+    weight: 0.35
+    description: "Scale-handling architecture"
+  - type: llm_judge
+    criteria: "Observability is comprehensive — tracks runner utilization, job queue depth, wait time, success rates, and cost per team. Integrates with Datadog for metrics and PagerDuty for alerts. Dashboards are designed for different audiences (engineers, managers, finance)"
+    weight: 0.35
+    description: "Comprehensive observability"
+  - type: llm_judge
+    criteria: "Disaster recovery is enterprise-grade — defines RTO and RPO for the CI platform, has a fallback plan when the platform is down (GitHub-hosted runners as failover), and the 99.9% uptime target is addressed with redundancy and monitoring"
+    weight: 0.30
+    description: "Enterprise-grade disaster recovery"

package/courses/github-actions-cicd/scenarios/level-4/cicd-training-program.yaml ADDED Viewed

@@ -0,0 +1,65 @@
+meta:
+  id: cicd-training-program
+  level: 4
+  course: github-actions-cicd
+  type: output
+  description: "Design CI/CD training program — create a comprehensive curriculum for building GitHub Actions expertise across an organization"
+  tags: [github, actions, training, education, certification, onboarding, expert]
+state: {}
+trigger: |
+  You're designing a company-wide GitHub Actions training program for
+  your 500-engineer organization. Currently, CI/CD knowledge is
+  concentrated in 10 people, and the rest of the org depends on them
+  for any workflow changes.
+  Current problems the training must address:
+  - Only 10 engineers can modify CI/CD workflows confidently
+  - New hires take 4 weeks before they can run their first deployment
+  - 40% of CI/CD support tickets are "how do I do X in GitHub Actions"
+  - Teams copy-paste workflows without understanding them
+  - Security misconfigurations are found quarterly in workflow audits
+  - 3 incidents in the last year caused by poorly written workflows
+  Target audience tiers:
+  1. All engineers: Basic workflow literacy (read and understand CI/CD)
+  2. Team leads: Workflow authoring (create and maintain team workflows)
+  3. Platform contributors: Advanced automation (build reusable
+     workflows, custom actions, platform integrations)
+  4. Security engineers: Security review of workflows
+  Available formats:
+  - Self-paced modules (budget: $40K to develop)
+  - Monthly live workshops (budget: $20K/year)
+  - Sandbox environment (dedicated GitHub org for practice)
+  - Certification program (gamification)
+  - Weekly office hours with the platform team
+  Constraints:
+  - Maximum 4 hours of training per engineer per quarter
+  - Must be available asynchronously (remote-first company)
+  - Must show measurable impact within 6 months
+  - Must reduce CI/CD support tickets by 50%
+  Task: Design the complete training program. Write: the curriculum
+  for each tier (learning objectives, modules, hands-on exercises),
+  the sandbox environment design (pre-configured repos for practice),
+  the certification system (levels, requirements, rewards), the
+  measurement framework (how to prove training reduces tickets and
+  incidents), and the rollout plan. Include sample exercises for each
+  tier.
+assertions:
+  - type: llm_judge
+    criteria: "Curriculum is tier-appropriate — all-engineers learn to read workflows and trigger deployments, team leads learn to author workflows and manage secrets, platform contributors learn reusable workflows and custom actions, and security engineers learn vulnerability assessment"
+    weight: 0.35
+    description: "Tier-appropriate curriculum"
+  - type: llm_judge
+    criteria: "Sandbox environment is practical — provides pre-configured repos with progressive exercises, simulates real scenarios (failing tests, flaky CI, deployment issues), and allows safe experimentation without affecting production"
+    weight: 0.35
+    description: "Practical sandbox environment"
+  - type: llm_judge
+    criteria: "Measurement framework proves ROI — tracks leading indicators (training completion, sandbox exercise pass rates) and lagging indicators (support ticket volume, incident count, time-to-first-deployment for new hires), with a clear before/after comparison"
+    weight: 0.30
+    description: "ROI-proving measurement framework"

package/courses/github-actions-cicd/scenarios/level-4/cicd-vendor-evaluation.yaml ADDED Viewed

@@ -0,0 +1,59 @@
+meta:
+  id: cicd-vendor-evaluation
+  level: 4
+  course: github-actions-cicd
+  type: output
+  description: "Evaluate CI/CD platforms — conduct a thorough comparison of GitHub Actions vs GitLab CI vs CircleCI vs Jenkins for enterprise adoption"
+  tags: [github, actions, vendor-evaluation, GitLab, CircleCI, Jenkins, expert]
+state: {}
+trigger: |
+  Your company is evaluating CI/CD platforms for a 500-engineer
+  organization. The current state: 60% use GitHub Actions, 25% use
+  Jenkins (legacy), and 15% use CircleCI (from an acquisition). The
+  CTO wants to consolidate to a single platform within 12 months.
+  Evaluation criteria (weighted by importance):
+  1. Feature completeness (25%): Workflow capabilities, integrations
+  2. Security (20%): OIDC, secret management, supply chain, audit
+  3. Scalability (15%): Concurrent jobs, runner management, performance
+  4. Cost (15%): Per-minute pricing, storage, data transfer
+  5. Developer experience (15%): YAML syntax, debugging, documentation
+  6. Enterprise features (10%): SSO, audit logs, compliance, governance
+  Platforms to evaluate:
+  A. GitHub Actions (current primary)
+  B. GitLab CI/CD (considered for integrated DevOps platform)
+  C. CircleCI (current secondary from acquisition)
+  D. Jenkins (current legacy, considering modern alternatives)
+  Scale requirements:
+  - 500 engineers, 200 repos
+  - 2,000 workflow runs per day
+  - Multi-cloud deployment (AWS, GCP, Azure)
+  - SOC 2 + PCI DSS compliance required
+  - Must support self-hosted runners for compliance
+  - Monorepo support needed for 3 teams
+  Task: Conduct the complete vendor evaluation. Write: the evaluation
+  matrix (score each platform on each criterion with justification),
+  the total cost of ownership analysis (3-year TCO including migration),
+  the risk assessment for each platform (vendor lock-in, outage history,
+  feature roadmap), the migration plan from the current 3-platform state
+  to the recommended single platform, and the executive recommendation
+  with decision rationale. Include the pilot program design.
+assertions:
+  - type: llm_judge
+    criteria: "Evaluation matrix is rigorous — each platform is scored on all 6 criteria with specific justifications (not generic), scores reflect real platform differences, and the weights are applied correctly to produce a weighted total"
+    weight: 0.35
+    description: "Rigorous evaluation matrix"
+  - type: llm_judge
+    criteria: "TCO analysis is complete — includes license/usage costs, migration effort (engineering hours), training costs, ongoing maintenance, and accounts for the different pricing models (per-minute vs per-user vs self-hosted). 3-year projection for all 4 options"
+    weight: 0.35
+    description: "Complete TCO analysis"
+  - type: llm_judge
+    criteria: "Executive recommendation is clear and defensible — makes a specific choice (not 'it depends'), addresses vendor lock-in risk, includes a migration timeline from 3 platforms to 1, and the pilot program validates the choice before full commitment"
+    weight: 0.30
+    description: "Clear executive recommendation"

package/courses/github-actions-cicd/scenarios/level-4/enterprise-cicd-governance.yaml ADDED Viewed

@@ -0,0 +1,55 @@
+meta:
+  id: enterprise-cicd-governance
+  level: 4
+  course: github-actions-cicd
+  type: output
+  description: "Design enterprise CI/CD governance — standardize workflows, enforce policies, and manage GitHub Actions across a large organization"
+  tags: [github, actions, enterprise, governance, policy, standardization, expert]
+state: {}
+trigger: |
+  You're the Director of Platform Engineering at a 500-engineer company
+  with 200+ repositories across 3 GitHub Enterprise organizations
+  (legacy from acquisitions). The CEO has mandated CI/CD standardization
+  after 3 incidents caused by inconsistent deployment practices.
+  Current chaos:
+  - 200+ repos with 400+ unique workflow files
+  - No standard deployment process (some repos deploy on merge, some
+    need manual approval, some have no CI at all)
+  - 15% of repos have no branch protection
+  - Third-party actions are used without vetting (supply chain risk)
+  - No visibility into CI/CD costs per team
+  - Different teams use different cloud providers with no standard auth
+  - Compliance requirements differ by BU (SOC 2, PCI, HIPAA)
+  Governance requirements:
+  1. Policy layer: Define what's allowed and required across all repos
+  2. Platform layer: Provide standard workflows teams can adopt
+  3. Enforcement layer: Automatically detect and remediate violations
+  4. Visibility layer: Dashboard showing compliance status per repo
+  5. Exception layer: Process for teams to request policy exceptions
+  Task: Design the enterprise CI/CD governance framework. Write: the
+  policy document (required controls for all repos, with BU-specific
+  extensions), the platform offerings (standard reusable workflows,
+  approved action allowlist, shared runner pools), the enforcement
+  automation (organization rulesets, required workflows, action
+  allowlist), the compliance dashboard design, and the exception
+  process. Include: the migration plan for the 200+ repos and the
+  organizational model (centralized platform team vs federated ownership).
+assertions:
+  - type: llm_judge
+    criteria: "Governance framework is comprehensive — covers policy definition, platform standardization, automated enforcement, visibility dashboards, and exception handling. The 5 layers work together cohesively"
+    weight: 0.35
+    description: "Comprehensive governance framework"
+  - type: llm_judge
+    criteria: "Enforcement is automated not manual — uses GitHub organization rulesets, required workflows, action allowlists (only approved actions can be used), and automated compliance scanning. Non-compliant repos are detected and flagged automatically"
+    weight: 0.35
+    description: "Automated enforcement"
+  - type: llm_judge
+    criteria: "Migration plan is realistic for 200+ repos — phased approach (not big-bang), provides standard workflows that teams can adopt incrementally, handles the 3-org consolidation, and includes success metrics"
+    weight: 0.30
+    description: "Realistic migration plan"

package/courses/github-actions-cicd/scenarios/level-4/expert-cicd-shift.yaml ADDED Viewed

@@ -0,0 +1,60 @@
+meta:
+  id: expert-cicd-shift
+  level: 4
+  course: github-actions-cicd
+  type: output
+  description: "Expert CI/CD shift — navigate vendor crises, compliance emergencies, and organizational challenges simultaneously"
+  tags: [github, actions, shift-simulation, crisis, vendor, compliance, expert]
+state: {}
+trigger: |
+  You're the VP of Platform Engineering during a week where 4 crises
+  converge on the CI/CD platform.
+  Crisis 1 — GitHub pricing change:
+  GitHub announced a 60% price increase for Actions minutes effective
+  in 90 days. Your current spend is $1.6M/year, projected to become
+  $2.56M/year. The CFO wants options: optimize, migrate to self-hosted,
+  or switch vendors. You need a cost mitigation plan by Friday.
+  Crisis 2 — Compliance audit finding:
+  The SOC 2 auditor discovered that 12% of production deployments have
+  no associated PR or code review (they were deployed via manual
+  workflow_dispatch triggers). This violates the change management
+  control. The auditor is asking for a corrective action plan within
+  2 weeks. The CTO wants to know how this happened.
+  Crisis 3 — Platform team attrition:
+  3 of your 15 platform engineers (including the runner fleet expert
+  and the reusable workflow architect) gave notice this week. They're
+  joining a competitor. Critical knowledge is at risk:
+  - Runner auto-scaling configuration (undocumented)
+  - Custom GitHub App for deployment orchestration (single maintainer)
+  - Vault integration for secret management (complex setup)
+  Crisis 4 — Developer revolt:
+  An internal blog post titled "Our CI/CD is Broken" got 200 upvotes.
+  Key complaints: "CI takes 30 minutes," "deployments fail randomly
+  and no one investigates," "the platform team says they'll fix it but
+  nothing changes." The CEO forwarded it asking for your response.
+  Task: Handle all 4 crises. For each, write: the immediate triage
+  (what to do today), the 2-week action plan, the 90-day resolution,
+  and the stakeholder communication. Then write the integrated strategy
+  memo showing how these crises are interconnected and the unified
+  approach to resolving them while maintaining team morale.
+assertions:
+  - type: llm_judge
+    criteria: "Cost mitigation plan is practical — analyzes options (optimize existing workflows, expand self-hosted, partial vendor migration), provides specific savings projections for each, and the recommendation balances cost with migration risk within the 90-day window"
+    weight: 0.35
+    description: "Practical cost mitigation"
+  - type: llm_judge
+    criteria: "Compliance remediation is achievable — addresses the workflow_dispatch bypass with branch protection and required approvals, provides the corrective action plan within the 2-week deadline, and prevents recurrence through automated enforcement"
+    weight: 0.35
+    description: "Achievable compliance remediation"
+  - type: llm_judge
+    criteria: "Integrated strategy connects all crises — shows how attrition impacts the other 3 crises (knowledge loss makes optimization harder), how the developer revolt is connected to underinvestment, and the unified approach addresses root causes rather than symptoms"
+    weight: 0.30
+    description: "Integrated crisis strategy"

package/courses/github-actions-cicd/scenarios/level-5/cicd-ai-future.yaml ADDED Viewed

@@ -0,0 +1,63 @@
+meta:
+  id: cicd-ai-future
+  level: 5
+  course: github-actions-cicd
+  type: output
+  description: "Design the future of AI-powered CI/CD — architect intelligent pipelines that predict, prevent, and self-heal"
+  tags: [github, actions, AI, ML, intelligent-pipelines, future, master]
+state: {}
+trigger: |
+  You're the VP of AI/ML at a CI/CD platform company designing the
+  next generation of intelligent CI/CD. Your platform serves 5,000
+  organizations running 10M pipeline executions per month. You need to
+  design the AI system that defines CI/CD in 2028-2030.
+  Current AI capabilities (baseline):
+  - Predictive test selection: 60% test reduction, 95% fault detection
+  - Flaky test detection: Identifies flaky tests with 85% accuracy
+  - Basic failure classification: Categorizes failures (infra/code/dep)
+  - Cost recommendations: Suggests caching and parallelism improvements
+  Target capabilities (next 3 years):
+  1. Predictive CI: Before running CI, predict which tests will fail
+     based on the code diff, reducing CI from 20 min to 3 min
+  2. Self-healing pipelines: Automatically fix common failures (retry
+     transient errors, update cached dependencies, adjust timeouts)
+  3. Intelligent deployment: Predict deployment risk, automatically
+     choose deployment strategy (canary % based on risk), and auto-
+     rollback before users are impacted
+  4. Natural language pipelines: "Deploy to staging with extra load
+     testing" → generates and executes the workflow
+  5. Cross-organization learning: Learn patterns from anonymized data
+     across all customers to improve recommendations
+  Constraints:
+  - Customer pipeline code must never leave their tenant
+  - False positives in test selection must be < 1% (missing a real
+    failure is unacceptable)
+  - Natural language pipeline generation must be reviewed before
+    execution (no autonomous deployment without approval)
+  - Cross-org learning must use differential privacy
+  Task: Design the AI-powered CI/CD system. Write: the product vision
+  (what CI/CD looks like in 2030), the ML architecture (models,
+  training, inference, privacy), the safety framework (preventing AI
+  from causing deployments of broken code), the ethical considerations
+  (cross-org learning, developer surveillance), and the go-to-market
+  roadmap. Include the key technical challenges and proposed solutions.
+assertions:
+  - type: llm_judge
+    criteria: "Product vision is compelling and specific — describes concrete developer experiences in 2030 (not vague AI promises), shows the progression from current capabilities to target state, and acknowledges what AI cannot do (domain-specific testing, business logic validation)"
+    weight: 0.35
+    description: "Compelling specific vision"
+  - type: llm_judge
+    criteria: "ML architecture handles constraints — solves the privacy problem (federated learning or on-tenant models), achieves <1% false negative rate in test selection with confidence calibration, and the natural language pipeline generation includes human-in-the-loop review"
+    weight: 0.35
+    description: "Constraint-handling ML architecture"
+  - type: llm_judge
+    criteria: "Safety framework prevents AI-caused outages — defines clear boundaries (AI suggests, humans approve for production), includes circuit breakers for self-healing (rate limits on automated fixes), and addresses the ethical concerns of cross-org learning with differential privacy"
+    weight: 0.30
+    description: "Outage-preventing safety framework"

package/courses/github-actions-cicd/scenarios/level-5/cicd-behavioral-science.yaml ADDED Viewed

@@ -0,0 +1,70 @@
+meta:
+  id: cicd-behavioral-science
+  level: 5
+  course: github-actions-cicd
+  type: output
+  description: "Apply behavioral science to CI/CD — use psychology and behavioral economics to improve developer adoption and reduce CI friction"
+  tags: [github, actions, behavioral-science, psychology, developer-experience, master]
+state: {}
+trigger: |
+  You're the Head of Developer Experience applying behavioral science
+  to CI/CD. Despite having excellent CI/CD infrastructure, developer
+  behavior isn't changing. You have the tools, but adoption and best
+  practices are lagging.
+  Observed behavioral patterns:
+  1. Learned helplessness: After years of slow, flaky CI, developers
+     don't trust the system even after improvements. 60% still run
+     tests locally before pushing (wasting 30 min/day) because "CI
+     might be broken again."
+  2. Present bias: Developers skip writing tests and security scans
+     to ship faster now, even though this causes 3x more incidents
+     later. The cost is delayed and distributed; the benefit is
+     immediate and personal.
+  3. Choice overload: The platform team offers 15 different workflow
+     templates, 8 runner types, and 12 optional integrations.
+     Developers choose the default (minimal) configuration 80% of
+     the time, missing valuable features.
+  4. Social proof gap: Teams with excellent CI/CD practices are
+     invisible. Teams don't know what "good" looks like because
+     there's no benchmarking or sharing of best practices.
+  5. Status quo bias: Teams resist migrating from Jenkins to GitHub
+     Actions (even when Actions is objectively better) because "Jenkins
+     works and we know it."
+  6. Feedback delay: When CI catches a bug, the feedback appears
+     15 minutes after the push. Developers have context-switched and
+     the feedback feels like an interruption rather than a helpful
+     catch.
+  Data available:
+  - 2 years of CI/CD usage data across 800 engineers
+  - Developer experience surveys (quarterly)
+  - A/B testing capability (workflow and UI modifications)
+  Task: Design a behavioral intervention program for CI/CD. For each
+  bias, write: the evidence from your data, the behavioral intervention
+  (nudge, default optimization, or environmental design), the
+  implementation in GitHub Actions (workflow changes, notifications,
+  gamification), and the A/B test design. Then write the unified
+  "Behavioral CI/CD" framework and the ethical considerations.
+assertions:
+  - type: llm_judge
+    criteria: "Interventions are behavior-specific — addresses learned helplessness with trust-building (CI reliability dashboard, streak counters), present bias with immediate feedback and CI-as-a-service defaults, choice overload with opinionated defaults and progressive disclosure"
+    weight: 0.35
+    description: "Behavior-specific interventions"
+  - type: llm_judge
+    criteria: "Implementation is practical in GitHub Actions — interventions are embedded in the CI/CD workflow itself (not separate tools), uses workflow summary annotations, PR comments, Slack notifications, and gamification elements that fit the developer workflow"
+    weight: 0.35
+    description: "Practical GitHub Actions implementation"
+  - type: llm_judge
+    criteria: "A/B test designs are rigorous — each has a clear hypothesis, control group, sample size consideration, success metric, and duration. Ethical considerations address developer privacy and the line between nudging and manipulation"
+    weight: 0.30
+    description: "Rigorous A/B test designs"

package/courses/github-actions-cicd/scenarios/level-5/cicd-board-strategy.yaml ADDED Viewed

@@ -0,0 +1,56 @@
+meta:
+  id: cicd-board-strategy
+  level: 5
+  course: github-actions-cicd
+  type: output
+  description: "Board-level CI/CD strategy — present CI/CD as a strategic capability to the board of a public company"
+  tags: [github, actions, board, strategy, governance, competitive-advantage, master]
+state: {}
+trigger: |
+  You're the CTO of a $5B public technology company presenting to the
+  board of directors. The board has 3 CI/CD-related agenda items.
+  Agenda Item 1 — Competitive intelligence:
+  A competitor just published that they deploy 500 times per day with
+  99.99% success rate. Your company deploys 50 times per day with 95%
+  success. The board asks: "Are we falling behind? Is this a risk to
+  our market position?"
+  Your data: Your company releases features 40% faster year-over-year,
+  but the competitor's absolute velocity is 10x yours. However, your
+  change failure rate is improving while theirs has been static.
+  Agenda Item 2 — $15M infrastructure investment:
+  You're requesting $15M over 3 years to build an internal developer
+  platform with CI/CD at its core. The investment includes: platform
+  engineering team (20 FTEs), self-hosted runner infrastructure, custom
+  tooling, and AI-powered pipeline optimization.
+  The board asks: "What's the ROI? When does this pay back? What if
+  we just use GitHub Actions as-is without the platform investment?"
+  Agenda Item 3 — M&A technology assessment:
+  The company is acquiring a 500-engineer competitor. Due diligence
+  reveals: they use GitLab CI self-hosted, have no deployment
+  automation (manual deploys), and their CI takes 2 hours per run.
+  The board asks: "What's the integration risk? Timeline? Cost?"
+  Task: Prepare the board materials. For each agenda item, write: the
+  1-page board brief, the data visualizations (described in text), the
+  Q&A preparation (5 likely questions and answers), and the governance
+  recommendations. End with a unified strategic narrative connecting
+  all 3 items.
+assertions:
+  - type: llm_judge
+    criteria: "Competitive analysis is nuanced — doesn't panic about the 10x gap, explains that absolute deploy frequency is less important than rate of improvement and change failure rate, and connects CI/CD velocity to business outcomes (time-to-market, customer satisfaction)"
+    weight: 0.35
+    description: "Nuanced competitive analysis"
+  - type: llm_judge
+    criteria: "Investment case is board-ready — presents the $15M as a 3-year investment with clear payback timeline, quantifies the 'do nothing' risk (cost of not investing), and the ROI model includes both tangible returns (cost savings) and intangible (developer productivity, talent retention)"
+    weight: 0.35
+    description: "Board-ready investment case"
+  - type: llm_judge
+    criteria: "M&A assessment is realistic — quantifies the integration risk (timeline, cost, talent retention risk during migration), proposes a phased approach, and connects the platform investment (Item 2) to faster M&A integration capability"
+    weight: 0.30
+    description: "Realistic M&A assessment"