dojo.md 0.1.0 → 0.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (243)
  1. package/courses/GENERATION_LOG.md +27 -0
  2. package/courses/api-error-handling/course.yaml +16 -0
  3. package/courses/api-error-handling/scenarios/level-1/error-response-format.yaml +131 -0
  4. package/courses/api-error-handling/scenarios/level-1/http-status-codes-basics.yaml +90 -0
  5. package/courses/api-error-handling/scenarios/level-1/rate-limiting-basics.yaml +135 -0
  6. package/courses/api-error-handling/scenarios/level-1/request-validation-errors.yaml +208 -0
  7. package/courses/api-error-handling/scenarios/level-2/circuit-breaker-pattern.yaml +189 -0
  8. package/courses/api-error-handling/scenarios/level-2/idempotency-retry-logic.yaml +159 -0
  9. package/courses/api-error-handling/scenarios/level-2/rfc-7807-problem-details.yaml +178 -0
  10. package/courses/api-error-handling/scenarios/level-2/webhook-error-handling.yaml +211 -0
  11. package/courses/api-error-handling/scenarios/level-3/distributed-tracing-errors.yaml +275 -0
  12. package/courses/github-actions-cicd/course.yaml +10 -0
  13. package/courses/github-actions-cicd/scenarios/level-1/actions-and-runners.yaml +58 -0
  14. package/courses/github-actions-cicd/scenarios/level-1/basic-workflow-syntax.yaml +52 -0
  15. package/courses/github-actions-cicd/scenarios/level-1/branch-protection-checks.yaml +63 -0
  16. package/courses/github-actions-cicd/scenarios/level-1/environment-variables-secrets.yaml +65 -0
  17. package/courses/github-actions-cicd/scenarios/level-1/first-cicd-shift.yaml +62 -0
  18. package/courses/github-actions-cicd/scenarios/level-1/job-dependencies-outputs.yaml +62 -0
  19. package/courses/github-actions-cicd/scenarios/level-1/simple-ci-pipeline.yaml +57 -0
  20. package/courses/github-actions-cicd/scenarios/level-1/workflow-debugging.yaml +90 -0
  21. package/courses/github-actions-cicd/scenarios/level-1/workflow-status-notifications.yaml +59 -0
  22. package/courses/github-actions-cicd/scenarios/level-1/workflow-triggers.yaml +56 -0
  23. package/courses/github-actions-cicd/scenarios/level-2/concurrency-control.yaml +58 -0
  24. package/courses/github-actions-cicd/scenarios/level-2/conditional-execution.yaml +60 -0
  25. package/courses/github-actions-cicd/scenarios/level-2/custom-actions-development.yaml +55 -0
  26. package/courses/github-actions-cicd/scenarios/level-2/dependency-caching.yaml +58 -0
  27. package/courses/github-actions-cicd/scenarios/level-2/deployment-workflows.yaml +61 -0
  28. package/courses/github-actions-cicd/scenarios/level-2/github-packages-publishing.yaml +59 -0
  29. package/courses/github-actions-cicd/scenarios/level-2/intermediate-cicd-shift.yaml +68 -0
  30. package/courses/github-actions-cicd/scenarios/level-2/matrix-builds.yaml +59 -0
  31. package/courses/github-actions-cicd/scenarios/level-2/reusable-workflows.yaml +61 -0
  32. package/courses/github-actions-cicd/scenarios/level-2/workflow-cost-optimization.yaml +61 -0
  33. package/courses/github-actions-cicd/scenarios/level-3/advanced-cicd-shift.yaml +64 -0
  34. package/courses/github-actions-cicd/scenarios/level-3/compliance-automation.yaml +68 -0
  35. package/courses/github-actions-cicd/scenarios/level-3/docker-action-development.yaml +65 -0
  36. package/courses/github-actions-cicd/scenarios/level-3/github-environments.yaml +65 -0
  37. package/courses/github-actions-cicd/scenarios/level-3/monorepo-ci.yaml +68 -0
  38. package/courses/github-actions-cicd/scenarios/level-3/oidc-cloud-deployments.yaml +55 -0
  39. package/courses/github-actions-cicd/scenarios/level-3/release-automation.yaml +61 -0
  40. package/courses/github-actions-cicd/scenarios/level-3/security-hardening.yaml +63 -0
  41. package/courses/github-actions-cicd/scenarios/level-3/self-hosted-runners.yaml +60 -0
  42. package/courses/github-actions-cicd/scenarios/level-3/workflow-optimization.yaml +59 -0
  43. package/courses/github-actions-cicd/scenarios/level-4/cicd-data-architecture.yaml +63 -0
  44. package/courses/github-actions-cicd/scenarios/level-4/cicd-economics-roi.yaml +63 -0
  45. package/courses/github-actions-cicd/scenarios/level-4/cicd-executive-communication.yaml +58 -0
  46. package/courses/github-actions-cicd/scenarios/level-4/cicd-incident-response.yaml +60 -0
  47. package/courses/github-actions-cicd/scenarios/level-4/cicd-org-design.yaml +59 -0
  48. package/courses/github-actions-cicd/scenarios/level-4/cicd-platform-architecture.yaml +63 -0
  49. package/courses/github-actions-cicd/scenarios/level-4/cicd-training-program.yaml +65 -0
  50. package/courses/github-actions-cicd/scenarios/level-4/cicd-vendor-evaluation.yaml +59 -0
  51. package/courses/github-actions-cicd/scenarios/level-4/enterprise-cicd-governance.yaml +55 -0
  52. package/courses/github-actions-cicd/scenarios/level-4/expert-cicd-shift.yaml +60 -0
  53. package/courses/github-actions-cicd/scenarios/level-5/cicd-ai-future.yaml +63 -0
  54. package/courses/github-actions-cicd/scenarios/level-5/cicd-behavioral-science.yaml +70 -0
  55. package/courses/github-actions-cicd/scenarios/level-5/cicd-board-strategy.yaml +56 -0
  56. package/courses/github-actions-cicd/scenarios/level-5/cicd-consulting-engagement.yaml +61 -0
  57. package/courses/github-actions-cicd/scenarios/level-5/cicd-industry-benchmarks.yaml +63 -0
  58. package/courses/github-actions-cicd/scenarios/level-5/cicd-ma-integration.yaml +73 -0
  59. package/courses/github-actions-cicd/scenarios/level-5/cicd-product-development.yaml +68 -0
  60. package/courses/github-actions-cicd/scenarios/level-5/cicd-regulatory-landscape.yaml +72 -0
  61. package/courses/github-actions-cicd/scenarios/level-5/comprehensive-cicd-system.yaml +66 -0
  62. package/courses/github-actions-cicd/scenarios/level-5/master-cicd-shift.yaml +76 -0
  63. package/courses/github-pr-review/scenarios/level-2/api-change-review.yaml +82 -0
  64. package/courses/github-pr-review/scenarios/level-2/automated-review-tooling.yaml +53 -0
  65. package/courses/github-pr-review/scenarios/level-2/cross-team-review.yaml +61 -0
  66. package/courses/github-pr-review/scenarios/level-2/intermediate-review-shift.yaml +66 -0
  67. package/courses/github-pr-review/scenarios/level-2/performance-review-patterns.yaml +99 -0
  68. package/courses/github-pr-review/scenarios/level-2/review-disagreement-resolution.yaml +64 -0
  69. package/courses/github-pr-review/scenarios/level-2/review-metrics-analysis.yaml +63 -0
  70. package/courses/github-pr-review/scenarios/level-2/review-turnaround-sla.yaml +54 -0
  71. package/courses/github-pr-review/scenarios/level-2/stacked-pr-review.yaml +65 -0
  72. package/courses/github-pr-review/scenarios/level-3/advanced-review-shift.yaml +65 -0
  73. package/courses/github-pr-review/scenarios/level-3/ai-powered-review.yaml +58 -0
  74. package/courses/github-pr-review/scenarios/level-3/compliance-review-process.yaml +64 -0
  75. package/courses/github-pr-review/scenarios/level-3/cross-functional-review.yaml +60 -0
  76. package/courses/github-pr-review/scenarios/level-3/incident-driven-review.yaml +63 -0
  77. package/courses/github-pr-review/scenarios/level-3/large-scale-review-operations.yaml +55 -0
  78. package/courses/github-pr-review/scenarios/level-3/monorepo-review-process.yaml +68 -0
  79. package/courses/github-pr-review/scenarios/level-3/review-automation-platform.yaml +61 -0
  80. package/courses/github-pr-review/scenarios/level-3/review-culture-design.yaml +62 -0
  81. package/courses/github-pr-review/scenarios/level-3/review-data-pipeline.yaml +62 -0
  82. package/courses/github-pr-review/scenarios/level-4/enterprise-review-operations.yaml +61 -0
  83. package/courses/github-pr-review/scenarios/level-4/expert-review-shift.yaml +62 -0
  84. package/courses/github-pr-review/scenarios/level-4/review-data-architecture.yaml +69 -0
  85. package/courses/github-pr-review/scenarios/level-4/review-economics-roi.yaml +63 -0
  86. package/courses/github-pr-review/scenarios/level-4/review-executive-communication.yaml +61 -0
  87. package/courses/github-pr-review/scenarios/level-4/review-incident-postmortem.yaml +69 -0
  88. package/courses/github-pr-review/scenarios/level-4/review-org-design.yaml +62 -0
  89. package/courses/github-pr-review/scenarios/level-4/review-platform-architecture.yaml +64 -0
  90. package/courses/github-pr-review/scenarios/level-4/review-training-program.yaml +66 -0
  91. package/courses/github-pr-review/scenarios/level-4/review-vendor-evaluation.yaml +76 -0
  92. package/courses/github-pr-review/scenarios/level-5/comprehensive-review-system.yaml +68 -0
  93. package/courses/github-pr-review/scenarios/level-5/master-review-shift.yaml +73 -0
  94. package/courses/github-pr-review/scenarios/level-5/review-ai-future.yaml +69 -0
  95. package/courses/github-pr-review/scenarios/level-5/review-behavioral-science.yaml +66 -0
  96. package/courses/github-pr-review/scenarios/level-5/review-board-strategy.yaml +62 -0
  97. package/courses/github-pr-review/scenarios/level-5/review-consulting-engagement.yaml +62 -0
  98. package/courses/github-pr-review/scenarios/level-5/review-devtools-product.yaml +71 -0
  99. package/courses/github-pr-review/scenarios/level-5/review-industry-benchmarks.yaml +64 -0
  100. package/courses/github-pr-review/scenarios/level-5/review-ma-integration.yaml +76 -0
  101. package/courses/github-pr-review/scenarios/level-5/review-regulatory-landscape.yaml +78 -0
  102. package/courses/postgresql-query-optimization/course.yaml +11 -0
  103. package/courses/postgresql-query-optimization/scenarios/level-1/explain-analyze-basics.yaml +80 -0
  104. package/courses/postgresql-query-optimization/scenarios/level-1/first-optimization-shift.yaml +77 -0
  105. package/courses/postgresql-query-optimization/scenarios/level-1/index-fundamentals.yaml +76 -0
  106. package/courses/postgresql-query-optimization/scenarios/level-1/join-basics.yaml +73 -0
  107. package/courses/postgresql-query-optimization/scenarios/level-1/n-plus-one-queries.yaml +62 -0
  108. package/courses/postgresql-query-optimization/scenarios/level-1/query-rewriting-basics.yaml +69 -0
  109. package/courses/postgresql-query-optimization/scenarios/level-1/select-star-problems.yaml +69 -0
  110. package/courses/postgresql-query-optimization/scenarios/level-1/slow-query-diagnosis.yaml +63 -0
  111. package/courses/postgresql-query-optimization/scenarios/level-1/vacuum-and-statistics.yaml +62 -0
  112. package/courses/postgresql-query-optimization/scenarios/level-1/where-clause-optimization.yaml +74 -0
  113. package/courses/postgresql-query-optimization/scenarios/level-2/autovacuum-tuning.yaml +76 -0
  114. package/courses/postgresql-query-optimization/scenarios/level-2/composite-index-design.yaml +81 -0
  115. package/courses/postgresql-query-optimization/scenarios/level-2/covering-indexes.yaml +74 -0
  116. package/courses/postgresql-query-optimization/scenarios/level-2/cte-optimization.yaml +83 -0
  117. package/courses/postgresql-query-optimization/scenarios/level-2/intermediate-optimization-shift.yaml +66 -0
  118. package/courses/postgresql-query-optimization/scenarios/level-2/join-optimization.yaml +72 -0
  119. package/courses/postgresql-query-optimization/scenarios/level-2/partial-and-expression-indexes.yaml +75 -0
  120. package/courses/postgresql-query-optimization/scenarios/level-2/query-planner-settings.yaml +62 -0
  121. package/courses/postgresql-query-optimization/scenarios/level-2/subquery-optimization.yaml +67 -0
  122. package/courses/postgresql-query-optimization/scenarios/level-2/window-function-optimization.yaml +63 -0
  123. package/courses/postgresql-query-optimization/scenarios/level-3/advanced-optimization-shift.yaml +71 -0
  124. package/courses/postgresql-query-optimization/scenarios/level-3/connection-pooling.yaml +60 -0
  125. package/courses/postgresql-query-optimization/scenarios/level-3/full-text-search-optimization.yaml +66 -0
  126. package/courses/postgresql-query-optimization/scenarios/level-3/jsonb-optimization.yaml +88 -0
  127. package/courses/postgresql-query-optimization/scenarios/level-3/lock-contention-analysis.yaml +80 -0
  128. package/courses/postgresql-query-optimization/scenarios/level-3/materialized-view-optimization.yaml +73 -0
  129. package/courses/postgresql-query-optimization/scenarios/level-3/parallel-query-execution.yaml +74 -0
  130. package/courses/postgresql-query-optimization/scenarios/level-3/partitioning-strategies.yaml +71 -0
  131. package/courses/postgresql-query-optimization/scenarios/level-3/specialized-index-types.yaml +67 -0
  132. package/courses/postgresql-query-optimization/scenarios/level-3/write-optimization.yaml +65 -0
  133. package/courses/postgresql-query-optimization/scenarios/level-4/data-architecture-analytics.yaml +64 -0
  134. package/courses/postgresql-query-optimization/scenarios/level-4/database-executive-communication.yaml +64 -0
  135. package/courses/postgresql-query-optimization/scenarios/level-4/database-migration-planning.yaml +57 -0
  136. package/courses/postgresql-query-optimization/scenarios/level-4/enterprise-database-governance.yaml +52 -0
  137. package/courses/postgresql-query-optimization/scenarios/level-4/expert-optimization-shift.yaml +73 -0
  138. package/courses/postgresql-query-optimization/scenarios/level-4/high-availability-architecture.yaml +62 -0
  139. package/courses/postgresql-query-optimization/scenarios/level-4/optimizer-internals.yaml +69 -0
  140. package/courses/postgresql-query-optimization/scenarios/level-4/performance-sla-design.yaml +58 -0
  141. package/courses/postgresql-query-optimization/scenarios/level-4/read-replica-optimization.yaml +62 -0
  142. package/courses/postgresql-query-optimization/scenarios/level-4/vendor-evaluation.yaml +73 -0
  143. package/courses/rest-api-error-handling/course.yaml +11 -0
  144. package/courses/rest-api-error-handling/scenarios/level-1/authentication-errors.yaml +71 -0
  145. package/courses/rest-api-error-handling/scenarios/level-1/content-negotiation-errors.yaml +63 -0
  146. package/courses/rest-api-error-handling/scenarios/level-1/error-logging-basics.yaml +63 -0
  147. package/courses/rest-api-error-handling/scenarios/level-1/error-response-format.yaml +58 -0
  148. package/courses/rest-api-error-handling/scenarios/level-1/first-error-handling-shift.yaml +67 -0
  149. package/courses/rest-api-error-handling/scenarios/level-1/http-status-codes.yaml +46 -0
  150. package/courses/rest-api-error-handling/scenarios/level-1/not-found-errors.yaml +52 -0
  151. package/courses/rest-api-error-handling/scenarios/level-1/rate-limiting-errors.yaml +56 -0
  152. package/courses/rest-api-error-handling/scenarios/level-1/request-validation-errors.yaml +59 -0
  153. package/courses/rest-api-error-handling/scenarios/level-1/server-error-handling.yaml +55 -0
  154. package/courses/rest-api-error-handling/scenarios/level-2/api-versioning-errors.yaml +66 -0
  155. package/courses/rest-api-error-handling/scenarios/level-2/batch-request-errors.yaml +61 -0
  156. package/courses/rest-api-error-handling/scenarios/level-2/circuit-breaker-pattern.yaml +52 -0
  157. package/courses/rest-api-error-handling/scenarios/level-2/error-code-taxonomy.yaml +62 -0
  158. package/courses/rest-api-error-handling/scenarios/level-2/error-monitoring-alerting.yaml +53 -0
  159. package/courses/rest-api-error-handling/scenarios/level-2/intermediate-error-shift.yaml +69 -0
  160. package/courses/rest-api-error-handling/scenarios/level-2/pagination-errors.yaml +66 -0
  161. package/courses/rest-api-error-handling/scenarios/level-2/retry-and-idempotency.yaml +60 -0
  162. package/courses/rest-api-error-handling/scenarios/level-2/rfc7807-problem-details.yaml +60 -0
  163. package/courses/rest-api-error-handling/scenarios/level-2/webhook-error-handling.yaml +55 -0
  164. package/courses/rest-api-error-handling/scenarios/level-3/advanced-error-shift.yaml +72 -0
  165. package/courses/rest-api-error-handling/scenarios/level-3/api-gateway-errors.yaml +71 -0
  166. package/courses/rest-api-error-handling/scenarios/level-3/async-api-errors.yaml +67 -0
  167. package/courses/rest-api-error-handling/scenarios/level-3/caching-error-scenarios.yaml +65 -0
  168. package/courses/rest-api-error-handling/scenarios/level-3/chaos-engineering-apis.yaml +62 -0
  169. package/courses/rest-api-error-handling/scenarios/level-3/database-error-handling.yaml +79 -0
  170. package/courses/rest-api-error-handling/scenarios/level-3/distributed-error-propagation.yaml +63 -0
  171. package/courses/rest-api-error-handling/scenarios/level-3/error-budgets-sre.yaml +61 -0
  172. package/courses/rest-api-error-handling/scenarios/level-3/error-correlation.yaml +58 -0
  173. package/courses/rest-api-error-handling/scenarios/level-3/graphql-vs-rest-errors.yaml +73 -0
  174. package/courses/rest-api-error-handling/scenarios/level-4/compliance-error-handling.yaml +65 -0
  175. package/courses/rest-api-error-handling/scenarios/level-4/enterprise-error-governance.yaml +62 -0
  176. package/courses/rest-api-error-handling/scenarios/level-4/error-analytics-platform.yaml +65 -0
  177. package/courses/rest-api-error-handling/scenarios/level-4/error-cost-optimization.yaml +63 -0
  178. package/courses/rest-api-error-handling/scenarios/level-4/error-executive-communication.yaml +60 -0
  179. package/courses/rest-api-error-handling/scenarios/level-4/error-handling-architecture.yaml +67 -0
  180. package/courses/rest-api-error-handling/scenarios/level-4/error-org-design.yaml +68 -0
  181. package/courses/rest-api-error-handling/scenarios/level-4/error-sla-design.yaml +65 -0
  182. package/courses/rest-api-error-handling/scenarios/level-4/error-training-program.yaml +61 -0
  183. package/courses/rest-api-error-handling/scenarios/level-4/expert-error-shift.yaml +63 -0
  184. package/courses/rest-api-error-handling/scenarios/level-5/comprehensive-error-system.yaml +68 -0
  185. package/courses/rest-api-error-handling/scenarios/level-5/error-ai-future.yaml +75 -0
  186. package/courses/rest-api-error-handling/scenarios/level-5/error-behavioral-science.yaml +73 -0
  187. package/courses/rest-api-error-handling/scenarios/level-5/error-board-strategy.yaml +60 -0
  188. package/courses/rest-api-error-handling/scenarios/level-5/error-consulting-engagement.yaml +58 -0
  189. package/courses/rest-api-error-handling/scenarios/level-5/error-industry-benchmarks.yaml +72 -0
  190. package/courses/rest-api-error-handling/scenarios/level-5/error-ma-integration.yaml +68 -0
  191. package/courses/rest-api-error-handling/scenarios/level-5/error-product-development.yaml +66 -0
  192. package/courses/rest-api-error-handling/scenarios/level-5/error-regulatory-landscape.yaml +80 -0
  193. package/courses/rest-api-error-handling/scenarios/level-5/master-error-shift.yaml +73 -0
  194. package/dist/cli/commands/add.d.ts.map +1 -1
  195. package/dist/cli/commands/add.js +6 -5
  196. package/dist/cli/commands/add.js.map +1 -1
  197. package/dist/cli/commands/generate.d.ts.map +1 -1
  198. package/dist/cli/commands/generate.js +4 -0
  199. package/dist/cli/commands/generate.js.map +1 -1
  200. package/dist/cli/commands/list.d.ts.map +1 -1
  201. package/dist/cli/commands/list.js +6 -18
  202. package/dist/cli/commands/list.js.map +1 -1
  203. package/dist/cli/commands/train.d.ts.map +1 -1
  204. package/dist/cli/commands/train.js +18 -18
  205. package/dist/cli/commands/train.js.map +1 -1
  206. package/dist/cli/index.js +93 -55
  207. package/dist/cli/index.js.map +1 -1
  208. package/dist/cli/run-demo.js +2 -1
  209. package/dist/cli/run-demo.js.map +1 -1
  210. package/dist/cli/setup.d.ts +18 -0
  211. package/dist/cli/setup.d.ts.map +1 -0
  212. package/dist/cli/setup.js +154 -0
  213. package/dist/cli/setup.js.map +1 -0
  214. package/dist/engine/agent-bridge.d.ts +5 -2
  215. package/dist/engine/agent-bridge.d.ts.map +1 -1
  216. package/dist/engine/agent-bridge.js +36 -9
  217. package/dist/engine/agent-bridge.js.map +1 -1
  218. package/dist/engine/loader.d.ts +21 -0
  219. package/dist/engine/loader.d.ts.map +1 -1
  220. package/dist/engine/loader.js +54 -1
  221. package/dist/engine/loader.js.map +1 -1
  222. package/dist/engine/training-loop.d.ts.map +1 -1
  223. package/dist/engine/training-loop.js +1 -0
  224. package/dist/engine/training-loop.js.map +1 -1
  225. package/dist/engine/training.d.ts.map +1 -1
  226. package/dist/engine/training.js +1 -0
  227. package/dist/engine/training.js.map +1 -1
  228. package/dist/generator/skill-generator.d.ts +1 -1
  229. package/dist/generator/skill-generator.d.ts.map +1 -1
  230. package/dist/generator/skill-generator.js +21 -2
  231. package/dist/generator/skill-generator.js.map +1 -1
  232. package/dist/mcp/server.d.ts.map +1 -1
  233. package/dist/mcp/server.js +11 -26
  234. package/dist/mcp/server.js.map +1 -1
  235. package/dist/mcp/session-manager.d.ts +3 -1
  236. package/dist/mcp/session-manager.d.ts.map +1 -1
  237. package/dist/mcp/session-manager.js +44 -22
  238. package/dist/mcp/session-manager.js.map +1 -1
  239. package/dist/types/schemas.d.ts +38 -13
  240. package/dist/types/schemas.d.ts.map +1 -1
  241. package/dist/types/schemas.js +9 -5
  242. package/dist/types/schemas.js.map +1 -1
  243. package/package.json +1 -1
@@ -0,0 +1,63 @@
+ meta:
+   id: window-function-optimization
+   level: 2
+   course: postgresql-query-optimization
+   type: output
+   description: "Optimize window functions — reduce sorting overhead and memory usage in analytical queries"
+   tags: [PostgreSQL, window-functions, analytics, sorting, intermediate]
+
+ state: {}
+
+ trigger: |
+   Your analytics team writes queries with multiple window functions
+   that are extremely slow. The DBA says "window functions cause
+   re-sorting" but the team doesn't understand why.
+
+   Problem query (takes 45 seconds on 10M rows):
+   SELECT
+     id, department, salary, hire_date,
+     ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC)
+       as salary_rank,
+     SUM(salary) OVER (PARTITION BY department) as dept_total,
+     AVG(salary) OVER (PARTITION BY department) as dept_avg,
+     RANK() OVER (ORDER BY salary DESC) as global_rank,
+     LAG(salary) OVER (PARTITION BY department ORDER BY hire_date)
+       as prev_salary,
+     SUM(salary) OVER (ORDER BY hire_date ROWS BETWEEN UNBOUNDED
+       PRECEDING AND CURRENT ROW) as running_total
+   FROM employees;
+
+   EXPLAIN ANALYZE shows 5 separate Sort operations:
+   - Sort 1: PARTITION BY department ORDER BY salary DESC (for salary_rank)
+   - Sort 2: PARTITION BY department (for dept_total, dept_avg)
+   - Sort 3: ORDER BY salary DESC (for global_rank)
+   - Sort 4: PARTITION BY department ORDER BY hire_date (for prev_salary)
+   - Sort 5: ORDER BY hire_date (for running_total)
+
+   Each sort processes all 10M rows. Five sorts × 10M rows = slow.
+
+   Additional challenges:
+   1. The query spills to disk (work_mem too small for 10M-row sorts)
+   2. Can any window functions share a sort operation?
+   3. Is there a way to pre-compute some of these values?
+   4. The running_total window frame is expensive — alternatives?
+
+   Task: Optimize this query. Show: how to consolidate window
+   definitions to minimize sorts, how to use named WINDOW clauses,
+   which window functions can share sort operations, work_mem tuning
+   for window functions, and alternative approaches for running totals
+   at scale.
+
+ assertions:
+   - type: llm_judge
+     criteria: "Window consolidation reduces sorts — groups window functions with compatible PARTITION BY and ORDER BY into shared window definitions using named WINDOW clauses. dept_total and dept_avg share the same window, salary_rank and prev_salary may be reorganized to share sorts"
+     weight: 0.35
+     description: "Sort consolidation"
+   - type: llm_judge
+     criteria: "work_mem and disk spill are addressed — explains that each Sort node uses work_mem independently, so 5 sorts × work_mem must fit in RAM. Recommends appropriate work_mem for the query size and shows how to set it per-session for analytical queries"
+     weight: 0.35
+     description: "Memory tuning"
+   - type: llm_judge
+     criteria: "Alternative approaches for scale are provided — suggests pre-computing aggregates in materialized views, using CTEs to separate window function groups, or restructuring to avoid global window functions (ORDER BY salary DESC without PARTITION) that require sorting the entire table"
+     weight: 0.30
+     description: "Scalable alternatives"
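The "which window functions can share a sort" question in this scenario can be reasoned about with a small model. The sketch below is only an illustration, not PostgreSQL's planner: it assumes a window needs its input sorted by its PARTITION BY columns followed by its ORDER BY columns, and that a window whose sort key is a prefix of another window's key can reuse that sort. The names `WINDOW_FUNCTIONS`, `group_by_window`, and `share_prefix_sorts` are hypothetical; any real saving should be confirmed with EXPLAIN ANALYZE.

```python
from collections import defaultdict

# (output column, PARTITION BY columns, ORDER BY columns), copied from the problem query.
WINDOW_FUNCTIONS = [
    ("salary_rank",   ("department",), ("salary DESC",)),
    ("dept_total",    ("department",), ()),
    ("dept_avg",      ("department",), ()),
    ("global_rank",   (),              ("salary DESC",)),
    ("prev_salary",   ("department",), ("hire_date",)),
    ("running_total", (),              ("hire_date",)),
]


def sort_key(partition, order):
    """Assumed sort requirement: partition columns first, then order columns."""
    return tuple(partition) + tuple(order)


def group_by_window(functions):
    """Group output columns that use an identical window definition."""
    groups = defaultdict(list)
    for name, partition, order in functions:
        groups[sort_key(partition, order)].append(name)
    return dict(groups)


def share_prefix_sorts(groups):
    """Fold a window into a longer one when its sort key is a prefix of it."""
    merged = {}
    # Visit the longest keys first so shorter prefixes can attach to them.
    for key in sorted(groups, key=len, reverse=True):
        host = next((k for k in merged if k[:len(key)] == key), None)
        merged.setdefault(host if host is not None else key, []).extend(groups[key])
    return merged


if __name__ == "__main__":
    distinct = group_by_window(WINDOW_FUNCTIONS)
    shared = share_prefix_sorts(distinct)
    print(len(distinct), "distinct window definitions")
    for key, names in shared.items():
        print(key, "->", names)
```

Under these assumptions, dept_total and dept_avg ride along with the (department, salary DESC) sort used by salary_rank, while global_rank, prev_salary, and running_total each still need their own ordering. In the query itself, each group would map to one named window, for example `WINDOW dept_salary AS (PARTITION BY department ORDER BY salary DESC)`.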
@@ -0,0 +1,71 @@
+ meta:
+   id: advanced-optimization-shift
+   level: 3
+   course: postgresql-query-optimization
+   type: output
+   description: "Advanced optimization shift — rescue a failing database under production load with multiple concurrent crises"
+   tags: [PostgreSQL, optimization, shift-simulation, production, advanced]
+
+ state: {}
+
+ trigger: |
+   You're the senior DBA called in to a production emergency. The main
+   database for a fintech platform (processing $10M/day) is failing
+   under load.
+
+   System: PostgreSQL 16, 64 cores, 512GB RAM, 10TB NVMe
+   Database: 3TB across 200 tables
+   Load: 50,000 queries/second (normally 30,000)
+
+   Crisis timeline:
+
+   9:00 AM — Traffic spike (Black Friday sale):
+   - Queries/second jumped from 30K to 50K
+   - Connection pool exhausted (200/200 connections)
+   - New customers getting "connection refused" errors
+
+   9:15 AM — Table bloat discovered:
+   - The transactions table (500M rows, 200GB) is 60% bloated
+   - Autovacuum has been failing for 3 weeks (blocked by a cron job
+     that opens a transaction and never closes it)
+   - Dead tuples: 300M
+   - Sequential scans on this bloated table are 2.5x slower
+
+   9:30 AM — Query plan regression:
+   - The pricing query (runs 10,000 times/second) switched from Index
+     Scan to Seq Scan after an ANALYZE ran with stale statistics
+   - This single query is now consuming 40% of CPU
+   - Before regression: 2ms per query
+   - After regression: 50ms per query
+
+   9:45 AM — Disk space alert:
+   - WAL directory is 500GB (normally 10GB)
+   - A stuck replication slot from a decommissioned replica is
+     preventing WAL cleanup
+   - Disk: 9.5TB / 10TB used (95%)
+
+   10:00 AM — Deadlock storm:
+   - The payment processing code has a deadlock pattern:
+     Process A: Lock order (accounts → transactions)
+     Process B: Lock order (transactions → accounts)
+   - 100 deadlocks in the last 15 minutes
+   - Each deadlock rolls back one transaction → customer sees error
+
+   Task: Stabilize the database. Write: the triage order (what to fix
+   first), the immediate actions for each crisis, the root cause fixes,
+   and the post-incident improvements to prevent this combination of
+   issues from recurring.
+
+ assertions:
+   - type: llm_judge
+     criteria: "Triage order is correct — disk space first (95% full is existential: drop replication slot to free WAL), then query plan regression (40% CPU from one query: force correct plan or recreate statistics), then connections (PgBouncer or increase max_connections), then deadlocks (fix lock ordering), then bloat (long-term fix)"
+     weight: 0.35
+     description: "Correct triage order"
+   - type: llm_judge
+     criteria: "Immediate actions are specific — DROP REPLICATION SLOT for the stuck slot, SET enable_seqscan = off or pin the pricing query plan, kill the cron job holding the open transaction, PgBouncer for connection pooling, and fix the lock ordering in application code"
+     weight: 0.35
+     description: "Specific immediate actions"
+   - type: llm_judge
+     criteria: "Post-incident improvements prevent recurrence — replication slot monitoring with alerting, idle_in_transaction_session_timeout to prevent autovacuum blocking, query plan monitoring to detect regressions, connection pooling as default architecture, and deadlock detection in CI/CD testing"
+     weight: 0.30
+     description: "Recurrence prevention"
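The deadlock pattern in this scenario (Process A locks accounts then transactions, Process B the reverse) is commonly addressed by acquiring locks in one canonical order. The sketch below shows that idea with in-process `threading.Lock` objects standing in for the database locks; `LOCKS`, `lock_order`, `with_locks`, and `run_concurrently` are hypothetical names, and alphabetical ordering is an arbitrary choice for illustration.

```python
import threading
from contextlib import ExitStack

# Hypothetical stand-ins for the locks taken by the payment code.
LOCKS = {"accounts": threading.Lock(), "transactions": threading.Lock()}


def lock_order(resources):
    """Return the requested resources in one canonical (alphabetical) order."""
    return sorted(set(resources))


def with_locks(resources, work):
    """Acquire every lock in canonical order, run work(), release in reverse."""
    with ExitStack() as stack:
        for name in lock_order(resources):
            stack.enter_context(LOCKS[name])
        return work()


def run_concurrently(iterations=200):
    """Run two workers that request the locks in opposite orders."""
    done = []

    def process(resources):
        for _ in range(iterations):
            with_locks(resources, lambda: done.append(1))

    a = threading.Thread(target=process, args=(["accounts", "transactions"],))
    b = threading.Thread(target=process, args=(["transactions", "accounts"],))
    a.start()
    b.start()
    a.join(timeout=10)
    b.join(timeout=10)
    # Second value is True if either worker is still blocked after the timeout.
    return len(done), a.is_alive() or b.is_alive()


if __name__ == "__main__":
    completed, stuck = run_concurrently()
    print("completed:", completed, "stuck:", stuck)
```

Because both workers end up taking `accounts` before `transactions` regardless of the order they ask for, neither can hold one lock while waiting on the other. In the application code the analogous change would be to touch the two tables in the same order in every payment path.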
@@ -0,0 +1,60 @@
+ meta:
+   id: connection-pooling
+   level: 3
+   course: postgresql-query-optimization
+   type: output
+   description: "Implement connection pooling — deploy and tune PgBouncer for high-concurrency applications"
+   tags: [PostgreSQL, PgBouncer, connection-pooling, concurrency, advanced]
+
+ state: {}
+
+ trigger: |
+   Your application is hitting PostgreSQL connection limits and
+   performance is degrading. During peak traffic:
+
+   Symptoms:
+   - max_connections = 200, all slots used
+   - New connections queue and time out after 5 seconds
+   - Each connection consumes ~10MB of RAM (200 × 10MB = 2GB just for
+     connections)
+   - Connection creation takes 50-100ms (TCP + SSL + auth)
+   - Application creates and destroys connections per request
+     (no pooling)
+   - Serverless functions (Lambda) open hundreds of connections
+     simultaneously
+
+   Application stack:
+   - 20 application servers (Node.js)
+   - 50 serverless functions (AWS Lambda)
+   - 5 background job workers
+   - 3 analytics/reporting services
+   - 1 admin dashboard
+
+   Current connection distribution:
+   - App servers: 20 × 10 connections = 200 (at the limit!)
+   - Lambda: 50 × 1-5 connections = 50-250 (spiky, blows past limit)
+   - Workers: 5 × 5 connections = 25
+   - Analytics: 3 × 10 connections = 30 (long-running queries)
+   - Admin: 1 × 5 connections = 5
+   Total potential: 310-510 connections (but limit is 200)
+
+   Task: Deploy PgBouncer to solve the connection problem. Write:
+   the PgBouncer configuration (pool mode, pool sizes per application
+   type), the architecture diagram (where PgBouncer sits), the
+   migration plan (how to switch applications to PgBouncer), the
+   limitations (what breaks in transaction pooling mode), and the
+   monitoring setup.
+
+ assertions:
49
+ - type: llm_judge
50
+ criteria: "PgBouncer configuration is correct — uses transaction pooling mode for most apps (not session mode), configures different pool sizes per database/user (larger for app servers, smaller for Lambda), and the total server-side connections are well under max_connections"
51
+ weight: 0.35
52
+ description: "Correct PgBouncer configuration"
53
+ - type: llm_judge
+     criteria: "Limitations of transaction pooling are addressed — protocol-level prepared statements don't work in transaction mode unless PgBouncer tracks them (max_prepared_statements, available from PgBouncer 1.21), session variables are lost between transactions, LISTEN/NOTIFY doesn't work, advisory locks need session mode, and the analytics long-running queries may need a separate pool or direct connection"
+     weight: 0.35
+     description: "Transaction pooling limitations"
+   - type: llm_judge
+     criteria: "Architecture and migration plan are practical — PgBouncer deployment location (same host, sidecar, or separate), application configuration changes (connection string swap), monitoring (pgbouncer SHOW commands, stats database), and a phased rollout that doesn't risk all traffic at once"
+     weight: 0.30
+     description: "Practical architecture and migration"
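The connection arithmetic in this scenario can be checked with a short calculation. The sketch below reproduces the 310-510 client-side total from the trigger and then tests one possible set of server-side pool sizes against `max_connections`. The per-client figures come from the scenario; the `POOL_SIZES` and `RESERVED` values are purely illustrative assumptions, not recommended settings.

```python
MAX_CONNECTIONS = 200  # from the scenario

# client type -> (instances, min connections each, max connections each), from the scenario.
DEMAND = {
    "app_servers": (20, 10, 10),
    "lambda": (50, 1, 5),
    "workers": (5, 5, 5),
    "analytics": (3, 10, 10),
    "admin": (1, 5, 5),
}

# Hypothetical server-side pool sizes behind the pooler (assumed values).
POOL_SIZES = {
    "app_servers": 60,
    "lambda": 30,
    "workers": 10,
    "analytics": 15,
    "admin": 3,
}
RESERVED = 10  # assumed headroom for maintenance and superuser sessions


def client_demand(demand):
    """Return the (low, high) range of connections clients may try to open."""
    low = sum(count * lo for count, lo, _ in demand.values())
    high = sum(count * hi for count, _, hi in demand.values())
    return low, high


def server_side_total(pools, reserved):
    """Return the connections the pooler could open against PostgreSQL."""
    return sum(pools.values()) + reserved


if __name__ == "__main__":
    low, high = client_demand(DEMAND)
    total = server_side_total(POOL_SIZES, RESERVED)
    print(f"client demand: {low}-{high}, limit: {MAX_CONNECTIONS}")
    print(f"server-side budget: {total} (fits: {total < MAX_CONNECTIONS})")
```

The point of the exercise is that client-side demand can exceed the limit as long as the pooler's server-side budget stays under it; the actual pool sizes would need to be derived from measured concurrency rather than from these placeholder numbers.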
@@ -0,0 +1,66 @@
+ meta:
+   id: full-text-search-optimization
+   level: 3
+   course: postgresql-query-optimization
+   type: output
+   description: "Optimize full-text search — build fast, relevant search using tsvector, GIN indexes, and ranking functions"
+   tags: [PostgreSQL, full-text-search, tsvector, GIN, ranking, advanced]
+
+ state: {}
+
11
+ trigger: |
12
+ Your knowledge base has 5M articles and users complain that
13
+ search is "slow and returns irrelevant results." Current
14
+ implementation uses LIKE queries:
15
+
16
+ SELECT * FROM articles
17
+ WHERE title ILIKE '%postgresql performance%'
18
+ OR body ILIKE '%postgresql performance%'
19
+ ORDER BY created_at DESC LIMIT 20;
20
+
21
+ Problems:
22
+ 1. ILIKE with leading wildcard: Seq Scan on 5M rows, 8 seconds
23
+ 2. No relevance ranking: Results sorted by date, not relevance
24
+ 3. No stemming: Searching "running" doesn't find "run" or "ran"
25
+ 4. No phrase matching: "postgresql performance" matches articles
26
+ where "postgresql" and "performance" are in different paragraphs
27
+ 5. No highlighting: Users can't see why a result matched
28
+
29
+ Requirements:
30
+ - Search must be < 100ms
31
+ - Results ranked by relevance (not just date)
32
+ - Support for phrases, stemming, and Boolean operators
33
+ - Highlight matching text in results
34
+ - Support for multiple languages (English, Spanish, German)
35
+ - Boosted ranking for title matches vs body matches
36
+
37
+ Articles table:
38
+ CREATE TABLE articles (
39
+ id SERIAL PRIMARY KEY,
40
+ title TEXT NOT NULL,
41
+ body TEXT NOT NULL,
42
+ language VARCHAR(10) DEFAULT 'english',
43
+ category VARCHAR(50),
44
+ published_at TIMESTAMP,
45
+ author_id INTEGER
46
+ );
47
+
48
+ Task: Migrate from ILIKE to full-text search. Write: the tsvector
49
+ column and GIN index setup, the search query with tsquery and
50
+ ranking, the relevance tuning (title boost, recency factor), the
51
+ multi-language support approach, and the maintenance plan
52
+ (index updates, dictionary management).
53
+
54
+ assertions:
55
+ - type: llm_judge
56
+ criteria: "FTS implementation is correct — creates a generated tsvector column combining title (weight A) and body (weight B/C), GIN index on the tsvector column, search uses plainto_tsquery or websearch_to_tsquery, and ts_rank or ts_rank_cd for relevance scoring"
57
+ weight: 0.35
58
+ description: "Correct FTS implementation"
59
+ - type: llm_judge
60
+ criteria: "Relevance tuning addresses the requirements — title matches are boosted (using tsvector weights A/B/C/D), recency can factor into ranking (combined score of ts_rank + recency bonus), phrase matching uses phraseto_tsquery, and ts_headline provides result highlighting"
61
+ weight: 0.35
62
+ description: "Relevance tuning"
63
+ - type: llm_judge
64
+ criteria: "Multi-language and maintenance are addressed — uses language-specific text search configurations (english, spanish, german) per article, handles the GIN index maintenance (fastupdate setting, periodic reindex), and performance is estimated to be <100ms for the 5M article dataset with proper indexing"
65
+ weight: 0.30
66
+ description: "Multi-language and maintenance"
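The second criterion mentions a combined score of text relevance plus a recency bonus. One possible shape for that blend is sketched below in Python. The function name, the 0.2 weight, and the 30-day half-life are hypothetical tuning choices, not values from the scenario; in practice the same arithmetic would live in the SQL ORDER BY next to ts_rank.

```python
import math


def combined_score(text_rank, age_days, recency_weight=0.2, half_life_days=30.0):
    """Blend a text-relevance score with a decaying recency bonus.

    text_rank stands in for the value a ranking function would return.
    The bonus starts at recency_weight for a brand-new article and halves
    every half_life_days (both parameters are illustrative assumptions).
    """
    recency_bonus = recency_weight * math.pow(0.5, age_days / half_life_days)
    return text_rank + recency_bonus


# Two articles with the same text relevance: the newer one ranks higher.
fresh = combined_score(text_rank=0.5, age_days=0)
stale = combined_score(text_rank=0.5, age_days=365)
print(fresh > stale)
```

An additive bonus keeps a strongly relevant old article ahead of a weakly relevant new one, as long as the weight stays small relative to typical rank values.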
@@ -0,0 +1,88 @@
1
+ meta:
2
+ id: jsonb-optimization
3
+ level: 3
4
+ course: postgresql-query-optimization
5
+ type: output
6
+ description: "Optimize JSONB queries — index and query JSON documents efficiently without sacrificing flexibility"
7
+ tags: [PostgreSQL, JSONB, JSON, indexing, semi-structured, advanced]
8
+
9
+ state: {}
10
+
11
+ trigger: |
12
+ Your SaaS platform stores product configurations in a JSONB column.
13
+ The schema-less flexibility was great for development, but now
14
+ queries are slow at scale (10M rows).
15
+
16
+ Table:
17
+ CREATE TABLE products (
18
+ id SERIAL PRIMARY KEY,
19
+ tenant_id INTEGER NOT NULL,
20
+ name TEXT NOT NULL,
21
+ config JSONB NOT NULL
22
+ );
23
+
24
+ Sample config JSONB:
25
+ {
26
+ "category": "electronics",
27
+ "brand": "Samsung",
28
+ "price": 999.99,
29
+ "specs": {
30
+ "weight": 0.5,
31
+ "dimensions": { "width": 15, "height": 7, "depth": 0.8 },
32
+ "colors": ["black", "white", "blue"]
33
+ },
34
+ "tags": ["premium", "flagship", "5G"],
35
+ "reviews_count": 2500,
36
+ "in_stock": true
37
+ }
38
+
39
+ Slow queries:
40
+
41
+ Q1: Filter by top-level key:
42
+ WHERE config->>'brand' = 'Samsung'
43
+ (Returns 500K rows, Seq Scan on 10M)
44
+
45
+ Q2: Filter by nested key:
46
+ WHERE config->'specs'->>'weight' < '1.0'
47
+ (Text comparison, not numeric! '10.5' < '9.5' = true)
48
+
49
+ Q3: Array containment:
50
+ WHERE config->'tags' @> '"premium"'
51
+ (Check if tags array contains 'premium')
52
+
53
+ Q4: Existence check:
54
+ WHERE config ? 'warranty'
55
+ (Does the key 'warranty' exist?)
56
+
57
+ Q5: Complex filter:
58
+ WHERE config @> '{"brand": "Samsung", "in_stock": true}'
59
+ AND (config->>'price')::numeric BETWEEN 500 AND 1000
60
+
61
+ Q6: Aggregation on JSONB:
62
+ SELECT config->>'category', AVG((config->>'price')::numeric)
63
+ FROM products GROUP BY config->>'category';
64
+
65
+ Performance issues:
66
+ - Q2 compares strings, not numbers (wrong results)
67
+ - Full GIN index is 2GB for 10M rows
68
+ - Some queries can't use GIN (numeric range)
69
+ - TOAST storage for large JSONB (>2KB) causes slowdown
70
+
71
+ Task: Optimize all 6 queries. For each, write: the correct query
72
+ syntax, the optimal index (GIN, expression, or partial), and the
73
+ expected performance. Then discuss when to extract JSONB fields
74
+ into regular columns vs keeping them in JSONB.
75
+
76
+ assertions:
77
+ - type: llm_judge
78
+ criteria: "All 6 queries are optimized with correct indexing: GIN with jsonb_ops (needed for the ? existence check in Q4) or jsonb_path_ops (containment only, Q3 and Q5), expression indexes for specific field queries (Q1: (config->>'brand'), Q2: ((config->'specs'->>'weight')::numeric)), and explains why GIN can't help numeric range queries"
79
+ weight: 0.35
80
+ description: "Correct JSONB indexing"
81
+ - type: llm_judge
82
+ criteria: "Q2 type comparison bug is fixed — the text comparison issue is identified (string ordering vs numeric ordering), the fix uses explicit casting ((config->'specs'->>'weight')::numeric), and the expression index uses the cast to support the corrected query"
83
+ weight: 0.35
84
+ description: "Type comparison bug fixed"
85
+ - type: llm_judge
86
+ criteria: "Column extraction guidance is practical — recommends extracting frequently-queried, well-defined fields (brand, price, category) into regular columns with proper types, keeping flexible/rare fields in JSONB, and using generated columns for the hybrid approach. Discusses TOAST storage impact"
87
+ weight: 0.30
88
+ description: "Practical extraction guidance"
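The Q2 bug comes down to string ordering versus numeric ordering. The Python sketch below reproduces the difference outside the database. Python compares strings by code point, which is similar to (but not guaranteed identical to) the database's collation, so treat it as an illustration of the principle rather than of PostgreSQL's exact behavior. The helper names are hypothetical.

```python
from decimal import Decimal


def text_less_than(a, b):
    """Compare two values as text, character by character."""
    return a < b


def numeric_less_than(a, b):
    """Compare two values after casting them to numbers."""
    return Decimal(a) < Decimal(b)


# As text, "10.5" sorts before "9.5" because "1" comes before "9".
print(text_less_than("10.5", "9.5"))     # True
# As numbers, 10.5 is not less than 9.5.
print(numeric_less_than("10.5", "9.5"))  # False
```

This is why the cast has to appear in both the query and the expression index: the index must be built on the numeric value for a numeric range comparison to use it.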
@@ -0,0 +1,80 @@
1
+ meta:
2
+ id: lock-contention-analysis
3
+ level: 3
4
+ course: postgresql-query-optimization
5
+ type: output
6
+ description: "Diagnose lock contention — identify and resolve blocking queries, deadlocks, and lock escalation issues"
7
+ tags: [PostgreSQL, locks, deadlocks, contention, concurrency, advanced]
8
+
9
+ state: {}
10
+
11
+ trigger: |
12
+ Your application is experiencing intermittent slowdowns. Some
13
+ requests take 30+ seconds when they normally take 50ms. The
14
+ monitoring shows queries waiting for locks.
15
+
16
+ pg_stat_activity during a slowdown shows:
17
+
18
+ PID 1001: UPDATE orders SET status = 'shipped'
19
+ WHERE id = 42;
20
+ wait_event_type: Lock, wait_event: transactionid
21
+ state: active, waiting: true, duration: 28s
22
+
23
+ PID 1002: UPDATE orders SET status = 'paid'
24
+ WHERE id = 42;
25
+ wait_event_type: Lock, wait_event: transactionid
26
+ state: active, waiting: true, duration: 25s
27
+
28
+ PID 1003: SELECT * FROM orders WHERE id = 42;
29
+ state: active, waiting: false, duration: 0.1ms
30
+ (SELECT is not blocked — MVCC!)
31
+
32
+ PID 999: BEGIN;
33
+ UPDATE orders SET tracking_number = 'ABC123'
34
+ WHERE id = 42;
35
+ -- Transaction started 30 seconds ago, no COMMIT yet
36
+ state: idle in transaction
37
+
38
+ Diagnosis: PID 999 holds a row lock on orders id=42, PIDs 1001
39
+ and 1002 are waiting for it. PID 999 is "idle in transaction" —
40
+ the application opened a transaction, did an UPDATE, then went
41
+ to do something else (call an external API?) before committing.
42
+
43
+ Additional issues found:
44
+
45
+ Issue 2 — Deadlock:
46
+ ERROR: deadlock detected
47
+ Process 2001 waits for ShareLock on transaction 5000;
48
+ blocked by process 2002.
49
+ Process 2002 waits for ShareLock on transaction 4999;
50
+ blocked by process 2001.
51
+ (Two processes updating the same rows in opposite order)
52
+
53
+ Issue 3 — DDL blocking DML:
54
+ ALTER TABLE orders ADD COLUMN notes TEXT;
55
+ This takes an AccessExclusive lock, blocking ALL queries on the
56
+ orders table for 45 seconds during the ALTER.
57
+
58
+ Issue 4 — VACUUM blocked by long transaction:
59
+ Autovacuum on orders table can't reclaim dead tuples because
60
+ PID 3001 has a transaction open for 2 hours (analytics query).
61
+ Dead tuples: 5M and growing.
62
+
63
+ Task: Resolve all 4 lock issues. For each, explain: the lock type
64
+ involved, why the contention occurs, the immediate fix, and the
65
+ preventive measure. Then write the monitoring queries to detect
66
+ each issue proactively.
67
+
68
+ assertions:
69
+ - type: llm_judge
70
+ criteria: "All 4 issues are correctly diagnosed — idle-in-transaction holding row locks (fix: idle_in_transaction_session_timeout), deadlock from inconsistent lock ordering (fix: application-level ordering), DDL blocking (fix: lock_timeout + retry, or use CREATE INDEX CONCURRENTLY approach), VACUUM blocked by long transaction (fix: old_snapshot_threshold or statement_timeout for analytics)"
71
+ weight: 0.35
72
+ description: "All issues correctly diagnosed"
73
+ - type: llm_judge
74
+ criteria: "Lock types are correctly identified: row-level locks (FOR UPDATE, FOR SHARE), table-level locks (ACCESS EXCLUSIVE for DDL, ROW EXCLUSIVE for DML), and transaction ID locks. Explains MVCC's role (SELECTs don't block, only writes contend)"
75
+ weight: 0.35
76
+ description: "Correct lock type identification"
77
+ - type: llm_judge
78
+ criteria: "Monitoring queries are practical — uses pg_stat_activity to find idle-in-transaction sessions, pg_locks to detect blocking chains, log_lock_waits for automatic logging, and includes alerting thresholds (alert when any transaction is idle-in-transaction for > 60 seconds)"
79
+ weight: 0.30
80
+ description: "Practical monitoring queries"
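Issue 2 is caused by two processes updating the same rows in opposite order, and the suggested fix is application-level ordering. A simplified Python model of that idea follows. It only considers two transactions and the order in which they touch shared rows; the function names are hypothetical, and a real application would apply the same ordering to its UPDATE or SELECT ... FOR UPDATE statements.

```python
def lock_order(row_ids):
    """Return row ids in ascending, de-duplicated order.

    If every transaction locks rows in this order, no two transactions
    can hold locks that the other is waiting for in a cycle.
    """
    return sorted(set(row_ids))


def can_deadlock(order_a, order_b):
    """Return True if two transactions take some pair of shared rows
    in opposite order (the classic two-process deadlock pattern)."""
    shared = sorted(set(order_a) & set(order_b))
    pos_a = {row: i for i, row in enumerate(order_a)}
    pos_b = {row: i for i, row in enumerate(order_b)}
    for i, x in enumerate(shared):
        for y in shared[i + 1:]:
            if (pos_a[x] < pos_a[y]) != (pos_b[x] < pos_b[y]):
                return True
    return False


# Opposite order: can deadlock. Same sorted order: cannot.
print(can_deadlock([1, 2], [2, 1]))
print(can_deadlock(lock_order([2, 1]), lock_order([1, 2])))
```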
@@ -0,0 +1,73 @@
1
+ meta:
2
+ id: materialized-view-optimization
3
+ level: 3
4
+ course: postgresql-query-optimization
5
+ type: output
6
+ description: "Optimize with materialized views — design refresh strategies, concurrent refresh, incremental maintenance, and query routing for expensive aggregations"
7
+ tags: [PostgreSQL, materialized-views, refresh, aggregation, caching, advanced]
8
+
9
+ state: {}
10
+
11
+ trigger: |
12
+ Your analytics dashboard queries aggregate data from a 500M-row
13
+ events table. The dashboard loads in 45 seconds because every page
14
+ view re-executes expensive aggregations. Your team wants to use
15
+ materialized views but has concerns about data freshness and refresh
16
+ cost.
17
+
18
+ The expensive dashboard query (45 seconds):
19
+ SELECT
20
+ date_trunc('hour', created_at) AS hour,
21
+ event_type,
22
+ COUNT(*) AS event_count,
23
+ COUNT(DISTINCT user_id) AS unique_users,
24
+ AVG(EXTRACT(EPOCH FROM duration)) AS avg_duration_secs
25
+ FROM events
26
+ WHERE created_at >= NOW() - INTERVAL '7 days'
27
+ GROUP BY 1, 2
28
+ ORDER BY 1 DESC, 2;
29
+
30
+ Current state:
31
+ - events table: 500M rows, 200GB, partitioned by month
32
+ - 50 new events/second (4.3M/day)
33
+ - Dashboard accessed by 30 analysts, 200 page views/day
34
+ - Data freshness requirement: within 15 minutes
35
+ - Full refresh of a materialized view takes 8 minutes
36
+ - During refresh, the old data must remain queryable
37
+
38
+ Challenges:
39
+ 1. REFRESH MATERIALIZED VIEW takes an ACCESS EXCLUSIVE lock by default
40
+ — blocks all reads during the 8-minute refresh
41
+ 2. REFRESH MATERIALIZED VIEW CONCURRENTLY requires a UNIQUE INDEX and
42
+ is slower (12 minutes) but doesn't block reads
43
+ 3. No built-in incremental refresh — the entire view is recomputed
44
+ 4. Multiple materialized views for different aggregations (hourly,
45
+ daily, weekly) — refresh coordination is needed
46
+ 5. The view must include the last 15 minutes of data, but refreshing
47
+ every 15 minutes means the view is always being refreshed
48
+
49
+ Additional materialized view patterns to evaluate:
50
+ A. Single matview + CONCURRENTLY refresh every 15 minutes
51
+ B. Two matviews (swap pattern): refresh one while serving from the other
52
+ C. Matview for historical data + live query for recent 15 minutes (UNION)
53
+ D. pg_ivm extension for incremental materialized views
54
+ E. Continuous aggregation (TimescaleDB approach)
55
+
56
+ Task: Design the materialized view strategy. Write: the recommended
57
+ approach with justification, the refresh scheduling architecture,
58
+ how to handle the freshness requirement, the indexing strategy for
59
+ the materialized view, and monitoring for refresh health.
60
+
61
+ assertions:
62
+ - type: llm_judge
63
+ criteria: "Recommended approach handles the freshness-vs-performance trade-off: the hybrid approach (option C: matview for historical + live query for recent data via UNION ALL) or a swap pattern (option B) is recommended with justification. Explains why simple CONCURRENTLY refresh alone isn't sufficient (a 12-minute refresh on a 15-minute schedule leaves only a 3-minute gap between refreshes, and the data is already about 12 minutes old when each refresh completes)"
64
+ weight: 0.35
65
+ description: "Sound matview strategy"
66
+ - type: llm_judge
67
+ criteria: "Refresh architecture is production-ready — includes scheduling (pg_cron or external scheduler), CONCURRENTLY to avoid blocking reads, unique index requirement for concurrent refresh, error handling for failed refreshes, and monitoring (track refresh duration, detect stale views, alert on refresh failures)"
68
+ weight: 0.35
69
+ description: "Production-ready refresh architecture"
70
+ - type: llm_judge
71
+ criteria: "Indexing and query routing are addressed — materialized view has appropriate indexes for dashboard query patterns, explains how to route queries to use the matview (direct query or view rewriting), and discusses trade-offs of each approach (storage cost, refresh CPU, staleness window)"
72
+ weight: 0.30
73
+ description: "Indexing and query routing"
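The freshness concern in challenge 5 is timing arithmetic, sketched below in Python. The sketch assumes (this is not stated in the scenario) that a refreshed view reflects the data as of roughly when the refresh started, and that refreshes start on a fixed schedule. The function names are hypothetical.

```python
def worst_case_staleness(interval_min, refresh_min):
    """Oldest the served data can get, in minutes.

    A refresh that starts at time t is served until the next refresh
    finishes at t + interval_min + refresh_min.
    """
    return interval_min + refresh_min


def refresh_duty_cycle(interval_min, refresh_min):
    """Fraction of wall-clock time spent refreshing."""
    return refresh_min / interval_min


# Scenario numbers: CONCURRENTLY refresh takes 12 minutes, scheduled every 15.
print(worst_case_staleness(interval_min=15, refresh_min=12))
print(refresh_duty_cycle(interval_min=15, refresh_min=12))
```

Under these assumptions a single matview refreshed every 15 minutes cannot meet a 15-minute freshness requirement on its own, which is the motivation for pairing the matview with a live query over the most recent window.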
@@ -0,0 +1,74 @@
1
+ meta:
2
+ id: parallel-query-execution
3
+ level: 3
4
+ course: postgresql-query-optimization
5
+ type: output
6
+ description: "Optimize parallel query execution — configure and tune PostgreSQL parallel workers for analytical workloads"
7
+ tags: [PostgreSQL, parallel-query, workers, analytics, advanced]
8
+
9
+ state: {}
10
+
11
+ trigger: |
12
+ Your analytics database has a 64-core server but analytical queries
13
+ only use 1 core. EXPLAIN shows no parallel workers despite large
14
+ table scans.
15
+
16
+ Server: 64 cores, 512GB RAM, NVMe storage
17
+ PostgreSQL 16
18
+ Database: 2TB across 50 tables
19
+
20
+ Current settings (defaults):
21
+ max_parallel_workers = 8
22
+ max_parallel_workers_per_gather = 2
23
+ max_worker_processes = 8
24
+ parallel_setup_cost = 1000
25
+ parallel_tuple_cost = 0.1
26
+ min_parallel_table_scan_size = 8MB
27
+ min_parallel_index_scan_size = 512kB
28
+
29
+ Queries that should parallelize but don't:
30
+
31
+ Q1: Full table aggregate (100M rows):
32
+ SELECT COUNT(*), AVG(amount), SUM(amount)
33
+ FROM transactions
34
+ WHERE created_at >= '2026-01-01';
35
+ EXPLAIN shows: Seq Scan (no Gather node). Why?
36
+
37
+ Q2: GROUP BY with many groups (100M rows):
38
+ SELECT customer_id, SUM(amount)
39
+ FROM transactions
40
+ GROUP BY customer_id;
41
+ Uses parallel workers but is still slow — partial aggregate
42
+ + final aggregate creates bottleneck at Gather.
43
+
44
+ Q3: Complex JOIN (50M × 10M):
45
+ SELECT t.*, c.name
46
+ FROM transactions t
47
+ JOIN customers c ON c.id = t.customer_id
48
+ WHERE t.amount > 1000;
49
+ Parallel scan on transactions but nested loop JOIN is serial.
50
+
51
+ Q4: Partitioned table scan (500M rows, 60 partitions):
52
+ SELECT region, COUNT(*)
53
+ FROM events
54
+ GROUP BY region;
55
+ Should scan partitions in parallel but scans sequentially.
56
+
57
+ Task: Tune parallel query settings for this 64-core server. For
58
+ each query, diagnose why parallelism isn't working and fix it.
59
+ Write the optimal configuration, the expected speedup, and the
60
+ limitations of parallel queries in PostgreSQL.
61
+
62
+ assertions:
63
+ - type: llm_judge
64
+ criteria: "Configuration is correct for 64-core server — max_parallel_workers increased to 32+ (not 64, need cores for other work), max_parallel_workers_per_gather increased to 8-16, parallel_setup_cost reduced for more aggressive parallelization, and explains why max_worker_processes must be >= max_parallel_workers"
65
+ weight: 0.35
66
+ description: "Correct parallel configuration"
67
+ - type: llm_judge
68
+ criteria: "Each query's parallelism issue is diagnosed — Q1 may not parallelize due to cost estimate (reduce parallel_setup_cost), Q2 has Gather bottleneck (increase max_parallel_workers_per_gather), Q3 needs parallel-aware JOIN (hash join can parallelize, nested loop may not), Q4 needs enable_parallel_append for partitioned tables"
69
+ weight: 0.35
70
+ description: "Per-query diagnosis"
71
+ - type: llm_judge
72
+ criteria: "Limitations are honestly stated — not all operations parallelize (correlated subqueries, some functions, writes), Gather node is a bottleneck for result assembly, parallel workers consume memory (work_mem per worker), and adding workers has diminishing returns (Amdahl's law)"
73
+ weight: 0.30
74
+ description: "Honest limitations"
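The last criterion cites Amdahl's law for the diminishing returns of extra workers. A short Python sketch of the formula follows. The 90% parallel fraction is a hypothetical figure chosen for illustration, and `processes` here counts every process working on the query, which is a simplification of how PostgreSQL's leader and workers share the load.

```python
def amdahl_speedup(parallel_fraction, processes):
    """Theoretical speedup when only part of the work can run in parallel.

    speedup = 1 / ((1 - p) + p / n)
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processes < 1:
        raise ValueError("processes must be at least 1")
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / processes)


# With 90% of the work parallelizable, doubling processes helps less each time.
for n in (2, 4, 8, 16, 32):
    print(n, round(amdahl_speedup(0.9, n), 2))
```

With a 10% serial portion the speedup can never reach 10x regardless of core count, which is one reason to cap max_parallel_workers_per_gather well below 64.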
@@ -0,0 +1,71 @@
1
+ meta:
2
+ id: partitioning-strategies
3
+ level: 3
4
+ course: postgresql-query-optimization
5
+ type: output
6
+ description: "Design partitioning strategies — partition large tables for query performance, maintenance efficiency, and data lifecycle management"
7
+ tags: [PostgreSQL, partitioning, range, list, hash, advanced]
8
+
9
+ state: {}
10
+
11
+ trigger: |
12
+ Your IoT platform stores sensor data in a single events table that
13
+ has grown to 5 billion rows (2TB). Performance has degraded to the
14
+ point where even simple queries take minutes.
15
+
16
+ Table:
17
+ CREATE TABLE events (
18
+ id BIGSERIAL,
19
+ device_id INTEGER NOT NULL,
20
+ event_type VARCHAR(50) NOT NULL,
21
+ payload JSONB NOT NULL,
22
+ region VARCHAR(10) NOT NULL,
23
+ created_at TIMESTAMP NOT NULL
24
+ );
25
+
26
+ Current indexes: (device_id, created_at), (event_type, created_at)
27
+ Index sizes: 800GB total (on top of the 2TB of table data)
28
+
29
+ Query patterns:
30
+ Q1 (80% of queries): Recent events for a device
31
+ WHERE device_id = ? AND created_at >= NOW() - INTERVAL '7 days'
32
+ Q2 (10%): Regional aggregation
33
+ WHERE region = ? AND created_at BETWEEN ? AND ?
34
+ Q3 (5%): Event type analysis
35
+ WHERE event_type = ? AND created_at >= ?
36
+ Q4 (5%): Data archival
37
+ DELETE FROM events WHERE created_at < NOW() - INTERVAL '2 years'
38
+
39
+ Partitioning options to evaluate:
40
+ A. Range partition by created_at (monthly)
41
+ B. List partition by region (5 regions)
42
+ C. Hash partition by device_id (64 partitions)
43
+ D. Composite: Range by created_at, then List by region
44
+ E. Composite: Range by created_at, then Hash by device_id
45
+
46
+ Constraints:
47
+ - Q1 must be < 100ms
48
+ - Monthly archival must be < 1 minute (currently takes 6 hours)
49
+ - No more than 500 total partitions (planner overhead)
50
+ - Must migrate from monolithic table with < 1 hour downtime
51
+
52
+ Task: Evaluate all 5 partitioning options. Recommend the best one
53
+ with justification. Write: the CREATE TABLE DDL, the partition
54
+ creation automation (pg_partman or custom), the migration plan from
55
+ the monolithic table, the impact on each query pattern, and the
56
+ ongoing maintenance (partition creation, archival, VACUUM per
57
+ partition).
58
+
59
+ assertions:
60
+ - type: llm_judge
61
+ criteria: "Partitioning evaluation analyzes all 5 options: range by date enables fast archival (DROP TABLE on the old partition, optionally after DETACH PARTITION) and prunes for time-based queries, hash by device_id distributes load evenly, composite enables multi-dimensional pruning. The chosen strategy balances all 4 query patterns, not just Q1"
62
+ weight: 0.35
63
+ description: "Thorough option evaluation"
64
+ - type: llm_judge
65
+ criteria: "Migration plan minimizes downtime — uses pg_partman or logical replication to migrate data, creates partitions in advance, uses concurrent index builds on partitions, and handles the transition from monolithic to partitioned with <1 hour downtime"
66
+ weight: 0.35
67
+ description: "Low-downtime migration plan"
68
+ - type: llm_judge
69
+ criteria: "Ongoing maintenance is automated — partition creation for future months/weeks, old partition archival via DROP (instead of DELETE), per-partition VACUUM (much faster than table-wide), and monitoring for partition imbalance or missing partitions"
70
+ weight: 0.30
71
+ description: "Automated ongoing maintenance"
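The 500-partition constraint can be checked against each option with simple arithmetic. The Python sketch below counts leaf partitions only and assumes 24 retained months, based on the two-year archival window in Q4. The function name and the `future_months` parameter are hypothetical.

```python
def partition_count(months_retained, sub_partitions=1, future_months=0):
    """Leaf partitions for monthly range partitioning with optional sub-partitioning."""
    return (months_retained + future_months) * sub_partitions


MAX_PARTITIONS = 500  # constraint stated in the scenario

options = {
    "A: monthly range": partition_count(24),
    "D: monthly range x 5 regions": partition_count(24, sub_partitions=5),
    "E: monthly range x 64 hash": partition_count(24, sub_partitions=64),
}

for name, count in options.items():
    print(name, count, "ok" if count <= MAX_PARTITIONS else "over limit")
```

Under these assumptions option E with 64 hash buckets exceeds the limit, so a composite range-plus-hash design would need far fewer buckets (at most 500 // 24 = 20) or coarser time ranges.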