gsd-trae 1.0.0 → 1.0.2

This diff shows the changes between publicly released package versions as they appear in their respective public registries. It is provided for informational purposes only.
Files changed (763)
  1. package/CHANGELOG.md +40 -0
  2. package/README.md +7 -76
  3. package/assets/screenshot.png +0 -0
  4. package/package.json +12 -3
  5. package/.claude/settings.local.json +0 -8
  6. package/.gitmodules +0 -6
  7. package/.trae/project_rules.md +0 -56
  8. package/.trae/rules/project_rules.md +0 -56
  9. package/.vscode/code-counter/code-counter.db +0 -0
  10. package/.vscode/settings.json +0 -5
  11. package/refs/gsd/.github/CODEOWNERS +0 -2
  12. package/refs/gsd/.github/FUNDING.yml +0 -1
  13. package/refs/gsd/.github/ISSUE_TEMPLATE/bug_report.yml +0 -59
  14. package/refs/gsd/.github/ISSUE_TEMPLATE/feature_request.yml +0 -37
  15. package/refs/gsd/.github/pull_request_template.md +0 -24
  16. package/refs/gsd/.github/workflows/auto-label-issues.yml +0 -21
  17. package/refs/gsd/CHANGELOG.md +0 -1520
  18. package/refs/gsd/LICENSE +0 -21
  19. package/refs/gsd/README.md +0 -704
  20. package/refs/gsd/SECURITY.md +0 -33
  21. package/refs/gsd/agents/gsd-codebase-mapper.md +0 -764
  22. package/refs/gsd/agents/gsd-debugger.md +0 -1246
  23. package/refs/gsd/agents/gsd-executor.md +0 -469
  24. package/refs/gsd/agents/gsd-integration-checker.md +0 -443
  25. package/refs/gsd/agents/gsd-phase-researcher.md +0 -546
  26. package/refs/gsd/agents/gsd-plan-checker.md +0 -690
  27. package/refs/gsd/agents/gsd-planner.md +0 -1275
  28. package/refs/gsd/agents/gsd-project-researcher.md +0 -621
  29. package/refs/gsd/agents/gsd-research-synthesizer.md +0 -239
  30. package/refs/gsd/agents/gsd-roadmapper.md +0 -642
  31. package/refs/gsd/agents/gsd-verifier.md +0 -573
  32. package/refs/gsd/assets/gsd-logo-2000-transparent.png +0 -0
  33. package/refs/gsd/assets/gsd-logo-2000-transparent.svg +0 -17
  34. package/refs/gsd/assets/gsd-logo-2000.png +0 -0
  35. package/refs/gsd/assets/gsd-logo-2000.svg +0 -21
  36. package/refs/gsd/assets/terminal.svg +0 -68
  37. package/refs/gsd/bin/install.js +0 -2090
  38. package/refs/gsd/commands/gsd/add-phase.md +0 -43
  39. package/refs/gsd/commands/gsd/add-tests.md +0 -41
  40. package/refs/gsd/commands/gsd/add-todo.md +0 -47
  41. package/refs/gsd/commands/gsd/audit-milestone.md +0 -36
  42. package/refs/gsd/commands/gsd/check-todos.md +0 -45
  43. package/refs/gsd/commands/gsd/cleanup.md +0 -18
  44. package/refs/gsd/commands/gsd/complete-milestone.md +0 -136
  45. package/refs/gsd/commands/gsd/debug.md +0 -167
  46. package/refs/gsd/commands/gsd/discuss-phase.md +0 -83
  47. package/refs/gsd/commands/gsd/execute-phase.md +0 -41
  48. package/refs/gsd/commands/gsd/health.md +0 -22
  49. package/refs/gsd/commands/gsd/help.md +0 -22
  50. package/refs/gsd/commands/gsd/insert-phase.md +0 -32
  51. package/refs/gsd/commands/gsd/join-discord.md +0 -18
  52. package/refs/gsd/commands/gsd/list-phase-assumptions.md +0 -46
  53. package/refs/gsd/commands/gsd/map-codebase.md +0 -71
  54. package/refs/gsd/commands/gsd/new-milestone.md +0 -44
  55. package/refs/gsd/commands/gsd/new-project.md +0 -42
  56. package/refs/gsd/commands/gsd/new-project.md.bak +0 -1041
  57. package/refs/gsd/commands/gsd/pause-work.md +0 -38
  58. package/refs/gsd/commands/gsd/plan-milestone-gaps.md +0 -34
  59. package/refs/gsd/commands/gsd/plan-phase.md +0 -45
  60. package/refs/gsd/commands/gsd/progress.md +0 -24
  61. package/refs/gsd/commands/gsd/quick.md +0 -41
  62. package/refs/gsd/commands/gsd/reapply-patches.md +0 -110
  63. package/refs/gsd/commands/gsd/remove-phase.md +0 -31
  64. package/refs/gsd/commands/gsd/research-phase.md +0 -189
  65. package/refs/gsd/commands/gsd/resume-work.md +0 -40
  66. package/refs/gsd/commands/gsd/set-profile.md +0 -34
  67. package/refs/gsd/commands/gsd/settings.md +0 -36
  68. package/refs/gsd/commands/gsd/update.md +0 -37
  69. package/refs/gsd/commands/gsd/verify-work.md +0 -38
  70. package/refs/gsd/docs/USER-GUIDE.md +0 -471
  71. package/refs/gsd/docs/context-monitor.md +0 -96
  72. package/refs/gsd/get-shit-done/bin/gsd-tools.cjs +0 -585
  73. package/refs/gsd/get-shit-done/bin/lib/commands.cjs +0 -553
  74. package/refs/gsd/get-shit-done/bin/lib/config.cjs +0 -162
  75. package/refs/gsd/get-shit-done/bin/lib/core.cjs +0 -411
  76. package/refs/gsd/get-shit-done/bin/lib/frontmatter.cjs +0 -299
  77. package/refs/gsd/get-shit-done/bin/lib/init.cjs +0 -710
  78. package/refs/gsd/get-shit-done/bin/lib/milestone.cjs +0 -215
  79. package/refs/gsd/get-shit-done/bin/lib/phase.cjs +0 -870
  80. package/refs/gsd/get-shit-done/bin/lib/roadmap.cjs +0 -298
  81. package/refs/gsd/get-shit-done/bin/lib/state.cjs +0 -521
  82. package/refs/gsd/get-shit-done/bin/lib/template.cjs +0 -222
  83. package/refs/gsd/get-shit-done/bin/lib/verify.cjs +0 -772
  84. package/refs/gsd/get-shit-done/references/checkpoints.md +0 -776
  85. package/refs/gsd/get-shit-done/references/continuation-format.md +0 -249
  86. package/refs/gsd/get-shit-done/references/decimal-phase-calculation.md +0 -65
  87. package/refs/gsd/get-shit-done/references/git-integration.md +0 -248
  88. package/refs/gsd/get-shit-done/references/git-planning-commit.md +0 -38
  89. package/refs/gsd/get-shit-done/references/model-profile-resolution.md +0 -34
  90. package/refs/gsd/get-shit-done/references/model-profiles.md +0 -92
  91. package/refs/gsd/get-shit-done/references/phase-argument-parsing.md +0 -61
  92. package/refs/gsd/get-shit-done/references/planning-config.md +0 -196
  93. package/refs/gsd/get-shit-done/references/questioning.md +0 -145
  94. package/refs/gsd/get-shit-done/references/tdd.md +0 -263
  95. package/refs/gsd/get-shit-done/references/ui-brand.md +0 -160
  96. package/refs/gsd/get-shit-done/references/verification-patterns.md +0 -612
  97. package/refs/gsd/get-shit-done/templates/DEBUG.md +0 -164
  98. package/refs/gsd/get-shit-done/templates/UAT.md +0 -247
  99. package/refs/gsd/get-shit-done/templates/VALIDATION.md +0 -76
  100. package/refs/gsd/get-shit-done/templates/codebase/architecture.md +0 -255
  101. package/refs/gsd/get-shit-done/templates/codebase/concerns.md +0 -310
  102. package/refs/gsd/get-shit-done/templates/codebase/conventions.md +0 -307
  103. package/refs/gsd/get-shit-done/templates/codebase/integrations.md +0 -280
  104. package/refs/gsd/get-shit-done/templates/codebase/stack.md +0 -186
  105. package/refs/gsd/get-shit-done/templates/codebase/structure.md +0 -285
  106. package/refs/gsd/get-shit-done/templates/codebase/testing.md +0 -480
  107. package/refs/gsd/get-shit-done/templates/config.json +0 -37
  108. package/refs/gsd/get-shit-done/templates/context.md +0 -283
  109. package/refs/gsd/get-shit-done/templates/continue-here.md +0 -78
  110. package/refs/gsd/get-shit-done/templates/debug-subagent-prompt.md +0 -91
  111. package/refs/gsd/get-shit-done/templates/discovery.md +0 -146
  112. package/refs/gsd/get-shit-done/templates/milestone-archive.md +0 -123
  113. package/refs/gsd/get-shit-done/templates/milestone.md +0 -115
  114. package/refs/gsd/get-shit-done/templates/phase-prompt.md +0 -569
  115. package/refs/gsd/get-shit-done/templates/planner-subagent-prompt.md +0 -117
  116. package/refs/gsd/get-shit-done/templates/project.md +0 -184
  117. package/refs/gsd/get-shit-done/templates/requirements.md +0 -231
  118. package/refs/gsd/get-shit-done/templates/research-project/ARCHITECTURE.md +0 -204
  119. package/refs/gsd/get-shit-done/templates/research-project/FEATURES.md +0 -147
  120. package/refs/gsd/get-shit-done/templates/research-project/PITFALLS.md +0 -200
  121. package/refs/gsd/get-shit-done/templates/research-project/STACK.md +0 -120
  122. package/refs/gsd/get-shit-done/templates/research-project/SUMMARY.md +0 -170
  123. package/refs/gsd/get-shit-done/templates/research.md +0 -552
  124. package/refs/gsd/get-shit-done/templates/retrospective.md +0 -54
  125. package/refs/gsd/get-shit-done/templates/roadmap.md +0 -202
  126. package/refs/gsd/get-shit-done/templates/state.md +0 -176
  127. package/refs/gsd/get-shit-done/templates/summary-complex.md +0 -59
  128. package/refs/gsd/get-shit-done/templates/summary-minimal.md +0 -41
  129. package/refs/gsd/get-shit-done/templates/summary-standard.md +0 -48
  130. package/refs/gsd/get-shit-done/templates/summary.md +0 -248
  131. package/refs/gsd/get-shit-done/templates/user-setup.md +0 -311
  132. package/refs/gsd/get-shit-done/templates/verification-report.md +0 -322
  133. package/refs/gsd/get-shit-done/workflows/add-phase.md +0 -111
  134. package/refs/gsd/get-shit-done/workflows/add-tests.md +0 -350
  135. package/refs/gsd/get-shit-done/workflows/add-todo.md +0 -157
  136. package/refs/gsd/get-shit-done/workflows/audit-milestone.md +0 -297
  137. package/refs/gsd/get-shit-done/workflows/check-todos.md +0 -176
  138. package/refs/gsd/get-shit-done/workflows/cleanup.md +0 -152
  139. package/refs/gsd/get-shit-done/workflows/complete-milestone.md +0 -763
  140. package/refs/gsd/get-shit-done/workflows/diagnose-issues.md +0 -219
  141. package/refs/gsd/get-shit-done/workflows/discovery-phase.md +0 -289
  142. package/refs/gsd/get-shit-done/workflows/discuss-phase.md +0 -542
  143. package/refs/gsd/get-shit-done/workflows/execute-phase.md +0 -449
  144. package/refs/gsd/get-shit-done/workflows/execute-plan.md +0 -448
  145. package/refs/gsd/get-shit-done/workflows/health.md +0 -156
  146. package/refs/gsd/get-shit-done/workflows/help.md +0 -489
  147. package/refs/gsd/get-shit-done/workflows/insert-phase.md +0 -129
  148. package/refs/gsd/get-shit-done/workflows/list-phase-assumptions.md +0 -178
  149. package/refs/gsd/get-shit-done/workflows/map-codebase.md +0 -315
  150. package/refs/gsd/get-shit-done/workflows/new-milestone.md +0 -382
  151. package/refs/gsd/get-shit-done/workflows/new-project.md +0 -1116
  152. package/refs/gsd/get-shit-done/workflows/pause-work.md +0 -122
  153. package/refs/gsd/get-shit-done/workflows/plan-milestone-gaps.md +0 -274
  154. package/refs/gsd/get-shit-done/workflows/plan-phase.md +0 -569
  155. package/refs/gsd/get-shit-done/workflows/progress.md +0 -381
  156. package/refs/gsd/get-shit-done/workflows/quick.md +0 -453
  157. package/refs/gsd/get-shit-done/workflows/remove-phase.md +0 -154
  158. package/refs/gsd/get-shit-done/workflows/research-phase.md +0 -73
  159. package/refs/gsd/get-shit-done/workflows/resume-project.md +0 -306
  160. package/refs/gsd/get-shit-done/workflows/set-profile.md +0 -80
  161. package/refs/gsd/get-shit-done/workflows/settings.md +0 -213
  162. package/refs/gsd/get-shit-done/workflows/transition.md +0 -544
  163. package/refs/gsd/get-shit-done/workflows/update.md +0 -219
  164. package/refs/gsd/get-shit-done/workflows/verify-phase.md +0 -242
  165. package/refs/gsd/get-shit-done/workflows/verify-work.md +0 -569
  166. package/refs/gsd/hooks/gsd-check-update.js +0 -62
  167. package/refs/gsd/hooks/gsd-context-monitor.js +0 -122
  168. package/refs/gsd/hooks/gsd-statusline.js +0 -108
  169. package/refs/gsd/package.json +0 -50
  170. package/refs/gsd/scripts/build-hooks.js +0 -43
  171. package/refs/gsd/tests/commands.test.cjs +0 -661
  172. package/refs/gsd/tests/helpers.cjs +0 -40
  173. package/refs/gsd/tests/init.test.cjs +0 -205
  174. package/refs/gsd/tests/milestone.test.cjs +0 -98
  175. package/refs/gsd/tests/phase.test.cjs +0 -1241
  176. package/refs/gsd/tests/roadmap.test.cjs +0 -265
  177. package/refs/gsd/tests/state.test.cjs +0 -302
  178. package/refs/gsd/tests/verify.test.cjs +0 -80
  179. package/refs/vbenchmark/.agent/agents/codebase-explorer.md +0 -224
  180. package/refs/vbenchmark/.agent/agents/debugger.md +0 -180
  181. package/refs/vbenchmark/.agent/agents/documenter.md +0 -166
  182. package/refs/vbenchmark/.agent/agents/implementer.md +0 -70
  183. package/refs/vbenchmark/.agent/agents/orchestrator.md +0 -212
  184. package/refs/vbenchmark/.agent/agents/researcher.md +0 -80
  185. package/refs/vbenchmark/.agent/agents/reviewer.md +0 -184
  186. package/refs/vbenchmark/.agent/agents/tester.md +0 -170
  187. package/refs/vbenchmark/.agent/commands/commit.md +0 -29
  188. package/refs/vbenchmark/.agent/commands/debug.md +0 -59
  189. package/refs/vbenchmark/.agent/commands/document.md +0 -52
  190. package/refs/vbenchmark/.agent/commands/gather-context.md +0 -58
  191. package/refs/vbenchmark/.agent/commands/init.md +0 -56
  192. package/refs/vbenchmark/.agent/commands/preset-help.md +0 -50
  193. package/refs/vbenchmark/.agent/commands/refactor.md +0 -71
  194. package/refs/vbenchmark/.agent/commands/research.md +0 -37
  195. package/refs/vbenchmark/.agent/commands/review.md +0 -38
  196. package/refs/vbenchmark/.agent/commands/test.md +0 -61
  197. package/refs/vbenchmark/.agent/rules/01-code-quality.md +0 -33
  198. package/refs/vbenchmark/.agent/rules/02-typescript-go.md +0 -46
  199. package/refs/vbenchmark/.agent/rules/03-security-git.md +0 -34
  200. package/refs/vbenchmark/.agent/rules/04-architecture.md +0 -40
  201. package/refs/vbenchmark/.agent/sync.js +0 -536
  202. package/refs/vbenchmark/.agent/workflows/commit.md +0 -29
  203. package/refs/vbenchmark/.agent/workflows/debug.md +0 -59
  204. package/refs/vbenchmark/.agent/workflows/document.md +0 -52
  205. package/refs/vbenchmark/.agent/workflows/gather-context.md +0 -58
  206. package/refs/vbenchmark/.agent/workflows/init.md +0 -56
  207. package/refs/vbenchmark/.agent/workflows/preset-help.md +0 -50
  208. package/refs/vbenchmark/.agent/workflows/refactor.md +0 -71
  209. package/refs/vbenchmark/.agent/workflows/research.md +0 -37
  210. package/refs/vbenchmark/.agent/workflows/review.md +0 -38
  211. package/refs/vbenchmark/.agent/workflows/test.md +0 -61
  212. package/refs/vbenchmark/.claude/commands/agentic-dev/apply.md +0 -222
  213. package/refs/vbenchmark/.claude/commands/agentic-dev/done.md +0 -166
  214. package/refs/vbenchmark/.claude/commands/agentic-dev/proposal.md +0 -220
  215. package/refs/vbenchmark/.claude/commands/openspec/apply.md +0 -23
  216. package/refs/vbenchmark/.claude/commands/openspec/archive.md +0 -27
  217. package/refs/vbenchmark/.claude/commands/openspec/proposal.md +0 -28
  218. package/refs/vbenchmark/.clinerules/01-rules.md +0 -73
  219. package/refs/vbenchmark/.clinerules/02-agents.md +0 -34
  220. package/refs/vbenchmark/.cursor/commands/commit.md +0 -29
  221. package/refs/vbenchmark/.cursor/commands/debug.md +0 -59
  222. package/refs/vbenchmark/.cursor/commands/document.md +0 -52
  223. package/refs/vbenchmark/.cursor/commands/gather-context.md +0 -58
  224. package/refs/vbenchmark/.cursor/commands/init.md +0 -56
  225. package/refs/vbenchmark/.cursor/commands/preset-help.md +0 -50
  226. package/refs/vbenchmark/.cursor/commands/refactor.md +0 -71
  227. package/refs/vbenchmark/.cursor/commands/research.md +0 -37
  228. package/refs/vbenchmark/.cursor/commands/review.md +0 -38
  229. package/refs/vbenchmark/.cursor/commands/test.md +0 -61
  230. package/refs/vbenchmark/.cursor/rules/agents.mdc +0 -1357
  231. package/refs/vbenchmark/.factory/droids/codebase-explorer.md +0 -224
  232. package/refs/vbenchmark/.factory/droids/debugger.md +0 -180
  233. package/refs/vbenchmark/.factory/droids/documenter.md +0 -166
  234. package/refs/vbenchmark/.factory/droids/implementer.md +0 -70
  235. package/refs/vbenchmark/.factory/droids/orchestrator.md +0 -212
  236. package/refs/vbenchmark/.factory/droids/researcher.md +0 -80
  237. package/refs/vbenchmark/.factory/droids/reviewer.md +0 -184
  238. package/refs/vbenchmark/.factory/droids/tester.md +0 -170
  239. package/refs/vbenchmark/.gemini/workflows/commit.md +0 -29
  240. package/refs/vbenchmark/.gemini/workflows/debug.md +0 -59
  241. package/refs/vbenchmark/.gemini/workflows/document.md +0 -52
  242. package/refs/vbenchmark/.gemini/workflows/gather-context.md +0 -58
  243. package/refs/vbenchmark/.gemini/workflows/init.md +0 -56
  244. package/refs/vbenchmark/.gemini/workflows/preset-help.md +0 -50
  245. package/refs/vbenchmark/.gemini/workflows/refactor.md +0 -71
  246. package/refs/vbenchmark/.gemini/workflows/research.md +0 -37
  247. package/refs/vbenchmark/.gemini/workflows/review.md +0 -38
  248. package/refs/vbenchmark/.gemini/workflows/test.md +0 -61
  249. package/refs/vbenchmark/.github/CODEOWNERS +0 -20
  250. package/refs/vbenchmark/.github/FUNDING.yml +0 -4
  251. package/refs/vbenchmark/.github/ISSUE_TEMPLATE/bug-report.yml +0 -76
  252. package/refs/vbenchmark/.github/ISSUE_TEMPLATE/new-task.yml +0 -106
  253. package/refs/vbenchmark/.github/PULL_REQUEST_TEMPLATE.md +0 -38
  254. package/refs/vbenchmark/.github/copilot-instructions.md +0 -73
  255. package/refs/vbenchmark/.github/workflows/ci.yaml +0 -33
  256. package/refs/vbenchmark/.github/workflows/vercel-auto-pr.yml +0 -478
  257. package/refs/vbenchmark/.github/workflows/vercel-deploy.yaml +0 -487
  258. package/refs/vbenchmark/.github/workflows/vercel-pr-command.yaml +0 -337
  259. package/refs/vbenchmark/.github/workflows/vercel-project-init.yaml +0 -208
  260. package/refs/vbenchmark/.opencode/agent/codebase-explorer.md +0 -224
  261. package/refs/vbenchmark/.opencode/agent/debugger.md +0 -180
  262. package/refs/vbenchmark/.opencode/agent/documenter.md +0 -166
  263. package/refs/vbenchmark/.opencode/agent/implementer.md +0 -70
  264. package/refs/vbenchmark/.opencode/agent/orchestrator.md +0 -212
  265. package/refs/vbenchmark/.opencode/agent/researcher.md +0 -80
  266. package/refs/vbenchmark/.opencode/agent/reviewer.md +0 -184
  267. package/refs/vbenchmark/.opencode/agent/tester.md +0 -170
  268. package/refs/vbenchmark/.opencode/command/commit.md +0 -29
  269. package/refs/vbenchmark/.opencode/command/debug.md +0 -59
  270. package/refs/vbenchmark/.opencode/command/document.md +0 -52
  271. package/refs/vbenchmark/.opencode/command/gather-context.md +0 -58
  272. package/refs/vbenchmark/.opencode/command/init.md +0 -56
  273. package/refs/vbenchmark/.opencode/command/preset-help.md +0 -50
  274. package/refs/vbenchmark/.opencode/command/refactor.md +0 -71
  275. package/refs/vbenchmark/.opencode/command/research.md +0 -37
  276. package/refs/vbenchmark/.opencode/command/review.md +0 -38
  277. package/refs/vbenchmark/.opencode/command/test.md +0 -61
  278. package/refs/vbenchmark/.trae/project_rules.md +0 -73
  279. package/refs/vbenchmark/.windsurf/rules/rules.md +0 -85
  280. package/refs/vbenchmark/AGENTS.md +0 -73
  281. package/refs/vbenchmark/CONTRIBUTING.md +0 -332
  282. package/refs/vbenchmark/Caddyfile +0 -3
  283. package/refs/vbenchmark/LICENSE +0 -47
  284. package/refs/vbenchmark/README.md +0 -354
  285. package/refs/vbenchmark/docker-compose.prod.yaml +0 -35
  286. package/refs/vbenchmark/docker-compose.yaml +0 -53
  287. package/refs/vbenchmark/docs/TASK_EXPANSION_PLAN.md +0 -211
  288. package/refs/vbenchmark/docs/THESIS.md +0 -441
  289. package/refs/vbenchmark/docs/categories/code-evolution.md +0 -138
  290. package/refs/vbenchmark/openspec/changes/init-vibecodingbench/design.md +0 -111
  291. package/refs/vbenchmark/openspec/changes/init-vibecodingbench/proposal.md +0 -15
  292. package/refs/vbenchmark/openspec/changes/init-vibecodingbench/specs/evaluation/spec.md +0 -105
  293. package/refs/vbenchmark/openspec/changes/init-vibecodingbench/specs/leaderboard/spec.md +0 -68
  294. package/refs/vbenchmark/openspec/changes/init-vibecodingbench/specs/task-definition/spec.md +0 -45
  295. package/refs/vbenchmark/openspec/changes/init-vibecodingbench/specs/task-runner/spec.md +0 -49
  296. package/refs/vbenchmark/openspec/changes/init-vibecodingbench/tasks.md +0 -413
  297. package/refs/vbenchmark/package.json +0 -51
  298. package/refs/vbenchmark/packages/cli/eslint.config.js +0 -16
  299. package/refs/vbenchmark/packages/cli/package.json +0 -35
  300. package/refs/vbenchmark/packages/cli/src/agents/index.ts +0 -655
  301. package/refs/vbenchmark/packages/cli/src/commands/eval.ts +0 -197
  302. package/refs/vbenchmark/packages/cli/src/commands/list.ts +0 -63
  303. package/refs/vbenchmark/packages/cli/src/commands/run.ts +0 -147
  304. package/refs/vbenchmark/packages/cli/src/evaluator.ts +0 -125
  305. package/refs/vbenchmark/packages/cli/src/index.ts +0 -21
  306. package/refs/vbenchmark/packages/cli/src/lib/task-variation.ts +0 -153
  307. package/refs/vbenchmark/packages/cli/src/loader.ts +0 -258
  308. package/refs/vbenchmark/packages/cli/src/reporter.ts +0 -222
  309. package/refs/vbenchmark/packages/cli/src/runtime/docker.ts +0 -385
  310. package/refs/vbenchmark/packages/cli/tsconfig.json +0 -8
  311. package/refs/vbenchmark/packages/dashboard/Dockerfile +0 -42
  312. package/refs/vbenchmark/packages/dashboard/index.html +0 -21
  313. package/refs/vbenchmark/packages/dashboard/package.json +0 -29
  314. package/refs/vbenchmark/packages/dashboard/postcss.config.js +0 -6
  315. package/refs/vbenchmark/packages/dashboard/public/favicon.svg +0 -24
  316. package/refs/vbenchmark/packages/dashboard/public/logo.png +0 -0
  317. package/refs/vbenchmark/packages/dashboard/public/logo.svg +0 -39
  318. package/refs/vbenchmark/packages/dashboard/src/App.tsx +0 -1468
  319. package/refs/vbenchmark/packages/dashboard/src/data/category-performance.json +0 -1
  320. package/refs/vbenchmark/packages/dashboard/src/data/leaderboard.json +0 -1
  321. package/refs/vbenchmark/packages/dashboard/src/data/task-results.json +0 -1
  322. package/refs/vbenchmark/packages/dashboard/src/data/tasks.json +0 -1
  323. package/refs/vbenchmark/packages/dashboard/src/index.css +0 -3
  324. package/refs/vbenchmark/packages/dashboard/src/main.tsx +0 -13
  325. package/refs/vbenchmark/packages/dashboard/src/vite-env.d.ts +0 -9
  326. package/refs/vbenchmark/packages/dashboard/tailwind.config.js +0 -11
  327. package/refs/vbenchmark/packages/dashboard/tsconfig.json +0 -21
  328. package/refs/vbenchmark/packages/dashboard/tsconfig.node.json +0 -11
  329. package/refs/vbenchmark/packages/dashboard/vercel.json +0 -6
  330. package/refs/vbenchmark/packages/dashboard/vite.config.ts +0 -28
  331. package/refs/vbenchmark/packages/evaluator/eslint.config.js +0 -16
  332. package/refs/vbenchmark/packages/evaluator/package.json +0 -24
  333. package/refs/vbenchmark/packages/evaluator/src/index.ts +0 -15
  334. package/refs/vbenchmark/packages/evaluator/src/runners/functional.ts +0 -88
  335. package/refs/vbenchmark/packages/evaluator/src/runners/quality.ts +0 -140
  336. package/refs/vbenchmark/packages/evaluator/src/runners/security.ts +0 -94
  337. package/refs/vbenchmark/packages/evaluator/src/runners/visual.ts +0 -108
  338. package/refs/vbenchmark/packages/evaluator/src/types.d.ts +0 -19
  339. package/refs/vbenchmark/packages/evaluator/tsconfig.json +0 -8
  340. package/refs/vbenchmark/packages/leaderboard/Dockerfile +0 -38
  341. package/refs/vbenchmark/packages/leaderboard/drizzle.config.ts +0 -10
  342. package/refs/vbenchmark/packages/leaderboard/eslint.config.js +0 -16
  343. package/refs/vbenchmark/packages/leaderboard/fly.toml +0 -29
  344. package/refs/vbenchmark/packages/leaderboard/package.json +0 -36
  345. package/refs/vbenchmark/packages/leaderboard/src/app.ts +0 -29
  346. package/refs/vbenchmark/packages/leaderboard/src/components/BrowserPreview.tsx +0 -190
  347. package/refs/vbenchmark/packages/leaderboard/src/components/ComparisonView.tsx +0 -205
  348. package/refs/vbenchmark/packages/leaderboard/src/components/LeaderboardTable.tsx +0 -150
  349. package/refs/vbenchmark/packages/leaderboard/src/components/LiveRunCard.tsx +0 -133
  350. package/refs/vbenchmark/packages/leaderboard/src/components/SubmissionForm.tsx +0 -406
  351. package/refs/vbenchmark/packages/leaderboard/src/components/SubmitForm.tsx +0 -293
  352. package/refs/vbenchmark/packages/leaderboard/src/components/TerminalStream.tsx +0 -111
  353. package/refs/vbenchmark/packages/leaderboard/src/config/pricing.ts +0 -206
  354. package/refs/vbenchmark/packages/leaderboard/src/db/index.ts +0 -31
  355. package/refs/vbenchmark/packages/leaderboard/src/db/schema.ts +0 -125
  356. package/refs/vbenchmark/packages/leaderboard/src/index.ts +0 -13
  357. package/refs/vbenchmark/packages/leaderboard/src/lib/websocket.ts +0 -124
  358. package/refs/vbenchmark/packages/leaderboard/src/routes/leaderboard.ts +0 -698
  359. package/refs/vbenchmark/packages/leaderboard/src/routes/live.ts +0 -175
  360. package/refs/vbenchmark/packages/leaderboard/src/routes/submissions.ts +0 -183
  361. package/refs/vbenchmark/packages/leaderboard/src/routes/tasks.ts +0 -215
  362. package/refs/vbenchmark/packages/leaderboard/tests/api.test.ts +0 -228
  363. package/refs/vbenchmark/packages/leaderboard/tsconfig.json +0 -9
  364. package/refs/vbenchmark/scripts/deploy.sh +0 -70
  365. package/refs/vbenchmark/tasks/ai-integration/advanced/context-management/PROMPT.md +0 -15
  366. package/refs/vbenchmark/tasks/ai-integration/advanced/context-management/task.yaml +0 -16
  367. package/refs/vbenchmark/tasks/ai-integration/advanced/evaluation-framework/PROMPT.md +0 -15
  368. package/refs/vbenchmark/tasks/ai-integration/advanced/evaluation-framework/task.yaml +0 -16
  369. package/refs/vbenchmark/tasks/ai-integration/advanced/guardrails-safety/PROMPT.md +0 -15
  370. package/refs/vbenchmark/tasks/ai-integration/advanced/guardrails-safety/task.yaml +0 -16
  371. package/refs/vbenchmark/tasks/ai-integration/advanced/memory-system/PROMPT.md +0 -15
  372. package/refs/vbenchmark/tasks/ai-integration/advanced/memory-system/task.yaml +0 -16
  373. package/refs/vbenchmark/tasks/ai-integration/advanced/model-routing/PROMPT.md +0 -15
  374. package/refs/vbenchmark/tasks/ai-integration/advanced/model-routing/task.yaml +0 -16
  375. package/refs/vbenchmark/tasks/ai-integration/advanced/multi-agent-system/PROMPT.md +0 -15
  376. package/refs/vbenchmark/tasks/ai-integration/advanced/multi-agent-system/task.yaml +0 -16
  377. package/refs/vbenchmark/tasks/ai-integration/advanced/prompt-optimization/PROMPT.md +0 -15
  378. package/refs/vbenchmark/tasks/ai-integration/advanced/prompt-optimization/task.yaml +0 -16
  379. package/refs/vbenchmark/tasks/ai-integration/advanced/reasoning-chain/PROMPT.md +0 -15
  380. package/refs/vbenchmark/tasks/ai-integration/advanced/reasoning-chain/task.yaml +0 -16
  381. package/refs/vbenchmark/tasks/ai-integration/advanced/streaming-pipeline/PROMPT.md +0 -15
  382. package/refs/vbenchmark/tasks/ai-integration/advanced/streaming-pipeline/task.yaml +0 -16
  383. package/refs/vbenchmark/tasks/ai-integration/advanced/tool-use-orchestration/PROMPT.md +0 -15
  384. package/refs/vbenchmark/tasks/ai-integration/advanced/tool-use-orchestration/task.yaml +0 -16
  385. package/refs/vbenchmark/tasks/ai-integration/agents/code-review-agent/PROMPT.md +0 -64
  386. package/refs/vbenchmark/tasks/ai-integration/agents/code-review-agent/task.yaml +0 -24
  387. package/refs/vbenchmark/tasks/ai-integration/agents/research-agent/PROMPT.md +0 -61
  388. package/refs/vbenchmark/tasks/ai-integration/agents/research-agent/task.yaml +0 -24
  389. package/refs/vbenchmark/tasks/ai-integration/agents/web-scraper-agent/PROMPT.md +0 -57
  390. package/refs/vbenchmark/tasks/ai-integration/agents/web-scraper-agent/task.yaml +0 -24
  391. package/refs/vbenchmark/tasks/ai-integration/embeddings/duplicate-detection/PROMPT.md +0 -50
  392. package/refs/vbenchmark/tasks/ai-integration/embeddings/duplicate-detection/task.yaml +0 -24
  393. package/refs/vbenchmark/tasks/ai-integration/embeddings/recommendation-engine/PROMPT.md +0 -51
  394. package/refs/vbenchmark/tasks/ai-integration/embeddings/recommendation-engine/task.yaml +0 -24
  395. package/refs/vbenchmark/tasks/ai-integration/embeddings/semantic-search/PROMPT.md +0 -50
  396. package/refs/vbenchmark/tasks/ai-integration/embeddings/semantic-search/task.yaml +0 -24
  397. package/refs/vbenchmark/tasks/ai-integration/fine-tuning/classification-model/PROMPT.md +0 -50
  398. package/refs/vbenchmark/tasks/ai-integration/fine-tuning/classification-model/task.yaml +0 -24
  399. package/refs/vbenchmark/tasks/ai-integration/function-calling/api-orchestrator/PROMPT.md +0 -60
  400. package/refs/vbenchmark/tasks/ai-integration/function-calling/api-orchestrator/task.yaml +0 -24
  401. package/refs/vbenchmark/tasks/ai-integration/function-calling/calendar-assistant/PROMPT.md +0 -50
  402. package/refs/vbenchmark/tasks/ai-integration/function-calling/calendar-assistant/task.yaml +0 -24
  403. package/refs/vbenchmark/tasks/ai-integration/function-calling/database-query/PROMPT.md +0 -62
  404. package/refs/vbenchmark/tasks/ai-integration/function-calling/database-query/task.yaml +0 -24
  405. package/refs/vbenchmark/tasks/ai-integration/multimodal/chart-interpreter/PROMPT.md +0 -60
  406. package/refs/vbenchmark/tasks/ai-integration/multimodal/chart-interpreter/task.yaml +0 -24
  407. package/refs/vbenchmark/tasks/ai-integration/multimodal/image-captioning/PROMPT.md +0 -49
  408. package/refs/vbenchmark/tasks/ai-integration/multimodal/image-captioning/task.yaml +0 -24
  409. package/refs/vbenchmark/tasks/ai-integration/rag-chatbot/code-assistant/PROMPT.md +0 -51
  410. package/refs/vbenchmark/tasks/ai-integration/rag-chatbot/code-assistant/task.yaml +0 -24
  411. package/refs/vbenchmark/tasks/ai-integration/rag-chatbot/doc-search/PROMPT.md +0 -51
  412. package/refs/vbenchmark/tasks/ai-integration/rag-chatbot/doc-search/task.yaml +0 -24
  413. package/refs/vbenchmark/tasks/ai-integration/rag-chatbot/pdf-qa/PROMPT.md +0 -76
  414. package/refs/vbenchmark/tasks/ai-integration/rag-chatbot/pdf-qa/docker-compose.yaml +0 -30
  415. package/refs/vbenchmark/tasks/ai-integration/rag-chatbot/pdf-qa/task.yaml +0 -30
  416. package/refs/vbenchmark/tasks/ai-integration/rag-chatbot/pdf-qa/tests/functional/qa.test.py +0 -146
  417. package/refs/vbenchmark/tasks/ai-integration/rag-chatbot/support-bot/PROMPT.md +0 -51
  418. package/refs/vbenchmark/tasks/ai-integration/rag-chatbot/support-bot/task.yaml +0 -24
  419. package/refs/vbenchmark/tasks/ai-integration/structured-output/contract-analyzer/PROMPT.md +0 -67
  420. package/refs/vbenchmark/tasks/ai-integration/structured-output/contract-analyzer/task.yaml +0 -24
  421. package/refs/vbenchmark/tasks/ai-integration/structured-output/invoice-parser/PROMPT.md +0 -61
  422. package/refs/vbenchmark/tasks/ai-integration/structured-output/invoice-parser/task.yaml +0 -27
  423. package/refs/vbenchmark/tasks/ai-integration/structured-output/receipt-scanner/PROMPT.md +0 -65
  424. package/refs/vbenchmark/tasks/ai-integration/structured-output/receipt-scanner/task.yaml +0 -24
  425. package/refs/vbenchmark/tasks/ai-integration/structured-output/resume-parser/PROMPT.md +0 -70
  426. package/refs/vbenchmark/tasks/ai-integration/structured-output/resume-parser/task.yaml +0 -24
  427. package/refs/vbenchmark/tasks/api-integrations/advanced/api-analytics/PROMPT.md +0 -15
  428. package/refs/vbenchmark/tasks/api-integrations/advanced/api-analytics/task.yaml +0 -16
  429. package/refs/vbenchmark/tasks/api-integrations/advanced/api-gateway/PROMPT.md +0 -15
  430. package/refs/vbenchmark/tasks/api-integrations/advanced/api-gateway/task.yaml +0 -16
  431. package/refs/vbenchmark/tasks/api-integrations/advanced/api-mocking/PROMPT.md +0 -15
  432. package/refs/vbenchmark/tasks/api-integrations/advanced/api-mocking/task.yaml +0 -16
  433. package/refs/vbenchmark/tasks/api-integrations/advanced/contract-testing/PROMPT.md +0 -15
  434. package/refs/vbenchmark/tasks/api-integrations/advanced/contract-testing/task.yaml +0 -16
  435. package/refs/vbenchmark/tasks/api-integrations/advanced/graphql-federation/PROMPT.md +0 -15
  436. package/refs/vbenchmark/tasks/api-integrations/advanced/graphql-federation/task.yaml +0 -16
  437. package/refs/vbenchmark/tasks/api-integrations/advanced/grpc-gateway/PROMPT.md +0 -15
  438. package/refs/vbenchmark/tasks/api-integrations/advanced/grpc-gateway/task.yaml +0 -16
  439. package/refs/vbenchmark/tasks/api-integrations/advanced/rate-limiter/PROMPT.md +0 -15
  440. package/refs/vbenchmark/tasks/api-integrations/advanced/rate-limiter/task.yaml +0 -16
  441. package/refs/vbenchmark/tasks/api-integrations/advanced/request-validator/PROMPT.md +0 -15
  442. package/refs/vbenchmark/tasks/api-integrations/advanced/request-validator/task.yaml +0 -16
  443. package/refs/vbenchmark/tasks/api-integrations/advanced/sdk-generator/PROMPT.md +0 -15
  444. package/refs/vbenchmark/tasks/api-integrations/advanced/sdk-generator/task.yaml +0 -16
  445. package/refs/vbenchmark/tasks/api-integrations/advanced/webhook-processor/PROMPT.md +0 -15
  446. package/refs/vbenchmark/tasks/api-integrations/advanced/webhook-processor/task.yaml +0 -16
  447. package/refs/vbenchmark/tasks/api-integrations/analytics/mixpanel-events/PROMPT.md +0 -42
  448. package/refs/vbenchmark/tasks/api-integrations/analytics/mixpanel-events/task.yaml +0 -24
  449. package/refs/vbenchmark/tasks/api-integrations/analytics/segment-tracking/PROMPT.md +0 -42
  450. package/refs/vbenchmark/tasks/api-integrations/analytics/segment-tracking/task.yaml +0 -24
  451. package/refs/vbenchmark/tasks/api-integrations/auth-provider/oauth2-github/PROMPT.md +0 -42
  452. package/refs/vbenchmark/tasks/api-integrations/auth-provider/oauth2-github/task.yaml +0 -24
  453. package/refs/vbenchmark/tasks/api-integrations/auth-provider/okta-integration/PROMPT.md +0 -44
  454. package/refs/vbenchmark/tasks/api-integrations/auth-provider/okta-integration/task.yaml +0 -24
  455. package/refs/vbenchmark/tasks/api-integrations/auth-provider/saml-sso/PROMPT.md +0 -42
  456. package/refs/vbenchmark/tasks/api-integrations/auth-provider/saml-sso/task.yaml +0 -24
  457. package/refs/vbenchmark/tasks/api-integrations/communication/discord-webhook/PROMPT.md +0 -44
  458. package/refs/vbenchmark/tasks/api-integrations/communication/discord-webhook/task.yaml +0 -24
  459. package/refs/vbenchmark/tasks/api-integrations/communication/slack-bot/PROMPT.md +0 -42
  460. package/refs/vbenchmark/tasks/api-integrations/communication/slack-bot/task.yaml +0 -24
  461. package/refs/vbenchmark/tasks/api-integrations/communication/twilio-sms/PROMPT.md +0 -42
  462. package/refs/vbenchmark/tasks/api-integrations/communication/twilio-sms/task.yaml +0 -24
  463. package/refs/vbenchmark/tasks/api-integrations/email/transactional/PROMPT.md +0 -82
  464. package/refs/vbenchmark/tasks/api-integrations/email/transactional/task.yaml +0 -27
  465. package/refs/vbenchmark/tasks/api-integrations/maps/google-maps-geocoding/PROMPT.md +0 -41
  466. package/refs/vbenchmark/tasks/api-integrations/maps/google-maps-geocoding/task.yaml +0 -24
  467. package/refs/vbenchmark/tasks/api-integrations/maps/mapbox-directions/PROMPT.md +0 -41
  468. package/refs/vbenchmark/tasks/api-integrations/maps/mapbox-directions/task.yaml +0 -24
  469. package/refs/vbenchmark/tasks/api-integrations/payment/crypto-payments/PROMPT.md +0 -43
  470. package/refs/vbenchmark/tasks/api-integrations/payment/crypto-payments/task.yaml +0 -24
  471. package/refs/vbenchmark/tasks/api-integrations/payment/paypal-integration/PROMPT.md +0 -41
  472. package/refs/vbenchmark/tasks/api-integrations/payment/paypal-integration/task.yaml +0 -24
  473. package/refs/vbenchmark/tasks/api-integrations/social/twitter-api/PROMPT.md +0 -41
  474. package/refs/vbenchmark/tasks/api-integrations/social/twitter-api/task.yaml +0 -24
  475. package/refs/vbenchmark/tasks/api-integrations/storage/cloudinary-upload/PROMPT.md +0 -43
  476. package/refs/vbenchmark/tasks/api-integrations/storage/cloudinary-upload/task.yaml +0 -24
  477. package/refs/vbenchmark/tasks/api-integrations/storage/gcs-streaming/PROMPT.md +0 -43
  478. package/refs/vbenchmark/tasks/api-integrations/storage/gcs-streaming/task.yaml +0 -24
  479. package/refs/vbenchmark/tasks/api-integrations/storage/s3-presigned-urls/PROMPT.md +0 -41
  480. package/refs/vbenchmark/tasks/api-integrations/storage/s3-presigned-urls/task.yaml +0 -24
  481. package/refs/vbenchmark/tasks/api-integrations/stripe/checkout-session/PROMPT.md +0 -41
  482. package/refs/vbenchmark/tasks/api-integrations/stripe/checkout-session/task.yaml +0 -24
  483. package/refs/vbenchmark/tasks/api-integrations/stripe/payment-webhook/PROMPT.md +0 -60
  484. package/refs/vbenchmark/tasks/api-integrations/stripe/payment-webhook/docker-compose.yaml +0 -38
  485. package/refs/vbenchmark/tasks/api-integrations/stripe/payment-webhook/task.yaml +0 -31
  486. package/refs/vbenchmark/tasks/api-integrations/stripe/payment-webhook/tests/webhook.test.ts +0 -193
  487. package/refs/vbenchmark/tasks/api-integrations/stripe/subscription-portal/PROMPT.md +0 -41
  488. package/refs/vbenchmark/tasks/api-integrations/stripe/subscription-portal/task.yaml +0 -24
  489. package/refs/vbenchmark/tasks/code-evolution/advanced/api-deprecation/PROMPT.md +0 -15
  490. package/refs/vbenchmark/tasks/code-evolution/advanced/api-deprecation/task.yaml +0 -16
  491. package/refs/vbenchmark/tasks/code-evolution/advanced/ast-refactoring/PROMPT.md +0 -15
  492. package/refs/vbenchmark/tasks/code-evolution/advanced/ast-refactoring/task.yaml +0 -16
  493. package/refs/vbenchmark/tasks/code-evolution/advanced/concurrency-fix/PROMPT.md +0 -15
  494. package/refs/vbenchmark/tasks/code-evolution/advanced/concurrency-fix/task.yaml +0 -16
  495. package/refs/vbenchmark/tasks/code-evolution/advanced/database-schema-migration/PROMPT.md +0 -15
  496. package/refs/vbenchmark/tasks/code-evolution/advanced/database-schema-migration/task.yaml +0 -16
  497. package/refs/vbenchmark/tasks/code-evolution/advanced/dead-code-elimination/PROMPT.md +0 -15
  498. package/refs/vbenchmark/tasks/code-evolution/advanced/dead-code-elimination/task.yaml +0 -16
  499. package/refs/vbenchmark/tasks/code-evolution/advanced/dependency-upgrade/PROMPT.md +0 -15
  500. package/refs/vbenchmark/tasks/code-evolution/advanced/dependency-upgrade/task.yaml +0 -16
  501. package/refs/vbenchmark/tasks/code-evolution/advanced/memory-optimization/PROMPT.md +0 -15
  502. package/refs/vbenchmark/tasks/code-evolution/advanced/memory-optimization/task.yaml +0 -16
  503. package/refs/vbenchmark/tasks/code-evolution/advanced/monorepo-extraction/PROMPT.md +0 -15
  504. package/refs/vbenchmark/tasks/code-evolution/advanced/monorepo-extraction/task.yaml +0 -16
  505. package/refs/vbenchmark/tasks/code-evolution/advanced/performance-profiling/PROMPT.md +0 -15
  506. package/refs/vbenchmark/tasks/code-evolution/advanced/performance-profiling/task.yaml +0 -16
  507. package/refs/vbenchmark/tasks/code-evolution/advanced/type-migration/PROMPT.md +0 -15
  508. package/refs/vbenchmark/tasks/code-evolution/advanced/type-migration/task.yaml +0 -16
  509. package/refs/vbenchmark/tasks/code-evolution/legacy-migration/callback-to-async/PROMPT.md +0 -47
  510. package/refs/vbenchmark/tasks/code-evolution/legacy-migration/callback-to-async/task.yaml +0 -24
  511. package/refs/vbenchmark/tasks/code-evolution/legacy-migration/express-to-fastify/PROMPT.md +0 -49
  512. package/refs/vbenchmark/tasks/code-evolution/legacy-migration/express-to-fastify/base-code/src/app.ts +0 -22
  513. package/refs/vbenchmark/tasks/code-evolution/legacy-migration/express-to-fastify/task.yaml +0 -37
  514. package/refs/vbenchmark/tasks/code-evolution/legacy-migration/express-to-fastify/tests/api.test.ts +0 -70
  515. package/refs/vbenchmark/tasks/code-evolution/legacy-migration/flask-to-fastapi/PROMPT.md +0 -46
  516. package/refs/vbenchmark/tasks/code-evolution/legacy-migration/flask-to-fastapi/task.yaml +0 -24
  517. package/refs/vbenchmark/tasks/code-evolution/legacy-migration/java-to-kotlin/PROMPT.md +0 -45
  518. package/refs/vbenchmark/tasks/code-evolution/legacy-migration/java-to-kotlin/task.yaml +0 -24
  519. package/refs/vbenchmark/tasks/code-evolution/legacy-migration/jquery-to-react/PROMPT.md +0 -47
  520. package/refs/vbenchmark/tasks/code-evolution/legacy-migration/jquery-to-react/task.yaml +0 -24
  521. package/refs/vbenchmark/tasks/code-evolution/legacy-migration/rest-to-grpc/PROMPT.md +0 -47
  522. package/refs/vbenchmark/tasks/code-evolution/legacy-migration/rest-to-grpc/task.yaml +0 -24
  523. package/refs/vbenchmark/tasks/code-evolution/performance/async-refactor/PROMPT.md +0 -47
  524. package/refs/vbenchmark/tasks/code-evolution/performance/async-refactor/task.yaml +0 -24
  525. package/refs/vbenchmark/tasks/code-evolution/performance/memory-leak-fix/PROMPT.md +0 -47
  526. package/refs/vbenchmark/tasks/code-evolution/performance/memory-leak-fix/task.yaml +0 -24
  527. package/refs/vbenchmark/tasks/code-evolution/performance/query-optimization/PROMPT.md +0 -49
  528. package/refs/vbenchmark/tasks/code-evolution/performance/query-optimization/task.yaml +0 -24
  529. package/refs/vbenchmark/tasks/code-evolution/refactoring/class-to-hooks/PROMPT.md +0 -96
  530. package/refs/vbenchmark/tasks/code-evolution/refactoring/class-to-hooks/task.yaml +0 -27
  531. package/refs/vbenchmark/tasks/code-evolution/refactoring/dependency-injection/PROMPT.md +0 -47
  532. package/refs/vbenchmark/tasks/code-evolution/refactoring/dependency-injection/task.yaml +0 -24
  533. package/refs/vbenchmark/tasks/code-evolution/refactoring/error-handling/PROMPT.md +0 -48
  534. package/refs/vbenchmark/tasks/code-evolution/refactoring/error-handling/task.yaml +0 -24
  535. package/refs/vbenchmark/tasks/code-evolution/refactoring/monolith-to-modules/PROMPT.md +0 -50
  536. package/refs/vbenchmark/tasks/code-evolution/refactoring/monolith-to-modules/task.yaml +0 -24
  537. package/refs/vbenchmark/tasks/code-evolution/refactoring/orm-migration/PROMPT.md +0 -47
  538. package/refs/vbenchmark/tasks/code-evolution/refactoring/orm-migration/task.yaml +0 -24
  539. package/refs/vbenchmark/tasks/code-evolution/security/secrets-rotation/PROMPT.md +0 -49
  540. package/refs/vbenchmark/tasks/code-evolution/security/secrets-rotation/task.yaml +0 -24
  541. package/refs/vbenchmark/tasks/code-evolution/security/sql-injection-fix/PROMPT.md +0 -50
  542. package/refs/vbenchmark/tasks/code-evolution/security/sql-injection-fix/task.yaml +0 -24
  543. package/refs/vbenchmark/tasks/code-evolution/security/xss-prevention/PROMPT.md +0 -47
  544. package/refs/vbenchmark/tasks/code-evolution/security/xss-prevention/task.yaml +0 -24
  545. package/refs/vbenchmark/tasks/code-evolution/testing/add-unit-tests/PROMPT.md +0 -48
  546. package/refs/vbenchmark/tasks/code-evolution/testing/add-unit-tests/task.yaml +0 -24
  547. package/refs/vbenchmark/tasks/code-evolution/testing/e2e-playwright/PROMPT.md +0 -50
  548. package/refs/vbenchmark/tasks/code-evolution/testing/e2e-playwright/task.yaml +0 -24
  549. package/refs/vbenchmark/tasks/code-evolution/testing/pytest-fixtures/PROMPT.md +0 -47
  550. package/refs/vbenchmark/tasks/code-evolution/testing/pytest-fixtures/task.yaml +0 -24
  551. package/refs/vbenchmark/tasks/frontend/accessibility/keyboard-shortcuts/PROMPT.md +0 -44
  552. package/refs/vbenchmark/tasks/frontend/accessibility/keyboard-shortcuts/task.yaml +0 -24
  553. package/refs/vbenchmark/tasks/frontend/accessibility/screen-reader-nav/PROMPT.md +0 -44
  554. package/refs/vbenchmark/tasks/frontend/accessibility/screen-reader-nav/task.yaml +0 -24
  555. package/refs/vbenchmark/tasks/frontend/advanced/canvas-editor/PROMPT.md +0 -15
  556. package/refs/vbenchmark/tasks/frontend/advanced/canvas-editor/task.yaml +0 -16
  557. package/refs/vbenchmark/tasks/frontend/advanced/micro-frontend/PROMPT.md +0 -15
  558. package/refs/vbenchmark/tasks/frontend/advanced/micro-frontend/task.yaml +0 -16
  559. package/refs/vbenchmark/tasks/frontend/advanced/offline-first/PROMPT.md +0 -15
  560. package/refs/vbenchmark/tasks/frontend/advanced/offline-first/task.yaml +0 -16
  561. package/refs/vbenchmark/tasks/frontend/advanced/realtime-collab/PROMPT.md +0 -15
  562. package/refs/vbenchmark/tasks/frontend/advanced/realtime-collab/task.yaml +0 -16
  563. package/refs/vbenchmark/tasks/frontend/advanced/service-worker/PROMPT.md +0 -15
  564. package/refs/vbenchmark/tasks/frontend/advanced/service-worker/task.yaml +0 -16
  565. package/refs/vbenchmark/tasks/frontend/advanced/state-machine/PROMPT.md +0 -15
  566. package/refs/vbenchmark/tasks/frontend/advanced/state-machine/task.yaml +0 -16
  567. package/refs/vbenchmark/tasks/frontend/advanced/virtual-list/PROMPT.md +0 -15
  568. package/refs/vbenchmark/tasks/frontend/advanced/virtual-list/task.yaml +0 -16
  569. package/refs/vbenchmark/tasks/frontend/advanced/wasm-integration/PROMPT.md +0 -15
  570. package/refs/vbenchmark/tasks/frontend/advanced/wasm-integration/task.yaml +0 -16
  571. package/refs/vbenchmark/tasks/frontend/advanced/web-worker/PROMPT.md +0 -15
  572. package/refs/vbenchmark/tasks/frontend/advanced/web-worker/task.yaml +0 -16
  573. package/refs/vbenchmark/tasks/frontend/advanced/webgl-visualization/PROMPT.md +0 -15
  574. package/refs/vbenchmark/tasks/frontend/advanced/webgl-visualization/task.yaml +0 -16
  575. package/refs/vbenchmark/tasks/frontend/animation/page-transitions/PROMPT.md +0 -44
  576. package/refs/vbenchmark/tasks/frontend/animation/page-transitions/task.yaml +0 -24
  577. package/refs/vbenchmark/tasks/frontend/components/data-grid/PROMPT.md +0 -59
  578. package/refs/vbenchmark/tasks/frontend/components/data-grid/task.yaml +0 -24
  579. package/refs/vbenchmark/tasks/frontend/components/date-range-picker/PROMPT.md +0 -57
  580. package/refs/vbenchmark/tasks/frontend/components/date-range-picker/task.yaml +0 -24
  581. package/refs/vbenchmark/tasks/frontend/components/file-uploader/PROMPT.md +0 -55
  582. package/refs/vbenchmark/tasks/frontend/components/file-uploader/task.yaml +0 -24
  583. package/refs/vbenchmark/tasks/frontend/components/form-builder/PROMPT.md +0 -96
  584. package/refs/vbenchmark/tasks/frontend/components/form-builder/task.yaml +0 -28
  585. package/refs/vbenchmark/tasks/frontend/components/rich-text-editor/PROMPT.md +0 -45
  586. package/refs/vbenchmark/tasks/frontend/components/rich-text-editor/task.yaml +0 -24
  587. package/refs/vbenchmark/tasks/frontend/figma-to-code/dashboard-layout/PROMPT.md +0 -50
  588. package/refs/vbenchmark/tasks/frontend/figma-to-code/dashboard-layout/task.yaml +0 -25
  589. package/refs/vbenchmark/tasks/frontend/figma-to-code/landing-page/PROMPT.md +0 -49
  590. package/refs/vbenchmark/tasks/frontend/figma-to-code/landing-page/task.yaml +0 -25
  591. package/refs/vbenchmark/tasks/frontend/figma-to-code/mobile-app-screen/PROMPT.md +0 -51
  592. package/refs/vbenchmark/tasks/frontend/figma-to-code/mobile-app-screen/task.yaml +0 -24
  593. package/refs/vbenchmark/tasks/frontend/figma-to-code/pricing-card/PROMPT.md +0 -93
  594. package/refs/vbenchmark/tasks/frontend/figma-to-code/pricing-card/docker-compose.yaml +0 -23
  595. package/refs/vbenchmark/tasks/frontend/figma-to-code/pricing-card/task.yaml +0 -30
  596. package/refs/vbenchmark/tasks/frontend/figma-to-code/pricing-card/tests/visual/diff.test.ts +0 -107
  597. package/refs/vbenchmark/tasks/frontend/figma-to-code/pricing-card/tests/visual/interaction.test.ts +0 -88
  598. package/refs/vbenchmark/tasks/frontend/performance/image-lazy-load/PROMPT.md +0 -43
  599. package/refs/vbenchmark/tasks/frontend/performance/image-lazy-load/task.yaml +0 -24
  600. package/refs/vbenchmark/tasks/frontend/performance/infinite-scroll/PROMPT.md +0 -44
  601. package/refs/vbenchmark/tasks/frontend/performance/infinite-scroll/task.yaml +0 -24
  602. package/refs/vbenchmark/tasks/frontend/state-management/collaborative-editor/PROMPT.md +0 -44
  603. package/refs/vbenchmark/tasks/frontend/state-management/collaborative-editor/task.yaml +0 -24
  604. package/refs/vbenchmark/tasks/frontend/state-management/shopping-cart/PROMPT.md +0 -53
  605. package/refs/vbenchmark/tasks/frontend/state-management/shopping-cart/task.yaml +0 -24
  606. package/refs/vbenchmark/tasks/frontend/visualization/chart-dashboard/PROMPT.md +0 -83
  607. package/refs/vbenchmark/tasks/frontend/visualization/chart-dashboard/task.yaml +0 -28
  608. package/refs/vbenchmark/tasks/frontend/visualization/gantt-chart/PROMPT.md +0 -57
  609. package/refs/vbenchmark/tasks/frontend/visualization/gantt-chart/task.yaml +0 -24
  610. package/refs/vbenchmark/tasks/frontend/visualization/map-dashboard/PROMPT.md +0 -44
  611. package/refs/vbenchmark/tasks/frontend/visualization/map-dashboard/task.yaml +0 -24
  612. package/refs/vbenchmark/tasks/frontend/visualization/realtime-charts/PROMPT.md +0 -43
  613. package/refs/vbenchmark/tasks/frontend/visualization/realtime-charts/task.yaml +0 -24
  614. package/refs/vbenchmark/tasks/glue-code/advanced/blue-green-deploy/PROMPT.md +0 -15
  615. package/refs/vbenchmark/tasks/glue-code/advanced/blue-green-deploy/task.yaml +0 -16
  616. package/refs/vbenchmark/tasks/glue-code/advanced/canary-release/PROMPT.md +0 -15
  617. package/refs/vbenchmark/tasks/glue-code/advanced/canary-release/task.yaml +0 -16
  618. package/refs/vbenchmark/tasks/glue-code/advanced/change-data-capture/PROMPT.md +0 -15
  619. package/refs/vbenchmark/tasks/glue-code/advanced/change-data-capture/task.yaml +0 -16
  620. package/refs/vbenchmark/tasks/glue-code/advanced/config-management/PROMPT.md +0 -15
  621. package/refs/vbenchmark/tasks/glue-code/advanced/config-management/task.yaml +0 -16
  622. package/refs/vbenchmark/tasks/glue-code/advanced/data-pipeline/PROMPT.md +0 -15
  623. package/refs/vbenchmark/tasks/glue-code/advanced/data-pipeline/task.yaml +0 -16
  624. package/refs/vbenchmark/tasks/glue-code/advanced/distributed-tracing/PROMPT.md +0 -15
  625. package/refs/vbenchmark/tasks/glue-code/advanced/distributed-tracing/task.yaml +0 -16
  626. package/refs/vbenchmark/tasks/glue-code/advanced/log-aggregation/PROMPT.md +0 -15
  627. package/refs/vbenchmark/tasks/glue-code/advanced/log-aggregation/task.yaml +0 -16
  628. package/refs/vbenchmark/tasks/glue-code/advanced/schema-registry/PROMPT.md +0 -15
  629. package/refs/vbenchmark/tasks/glue-code/advanced/schema-registry/task.yaml +0 -16
  630. package/refs/vbenchmark/tasks/glue-code/advanced/secret-rotation/PROMPT.md +0 -15
  631. package/refs/vbenchmark/tasks/glue-code/advanced/secret-rotation/task.yaml +0 -16
  632. package/refs/vbenchmark/tasks/glue-code/advanced/stream-processing/PROMPT.md +0 -15
  633. package/refs/vbenchmark/tasks/glue-code/advanced/stream-processing/task.yaml +0 -16
  634. package/refs/vbenchmark/tasks/glue-code/api-sync/rest-to-graphql/PROMPT.md +0 -66
  635. package/refs/vbenchmark/tasks/glue-code/api-sync/rest-to-graphql/task.yaml +0 -27
  636. package/refs/vbenchmark/tasks/glue-code/caching/redis-cache/PROMPT.md +0 -82
  637. package/refs/vbenchmark/tasks/glue-code/caching/redis-cache/task.yaml +0 -27
  638. package/refs/vbenchmark/tasks/glue-code/data-transform/avro-schema-evolution/PROMPT.md +0 -51
  639. package/refs/vbenchmark/tasks/glue-code/data-transform/avro-schema-evolution/task.yaml +0 -24
  640. package/refs/vbenchmark/tasks/glue-code/data-transform/csv-normalizer/PROMPT.md +0 -49
  641. package/refs/vbenchmark/tasks/glue-code/data-transform/csv-normalizer/task.yaml +0 -24
  642. package/refs/vbenchmark/tasks/glue-code/data-transform/excel-to-json/PROMPT.md +0 -67
  643. package/refs/vbenchmark/tasks/glue-code/data-transform/excel-to-json/task.yaml +0 -28
  644. package/refs/vbenchmark/tasks/glue-code/data-transform/excel-to-json/tests/transform.test.py +0 -137
  645. package/refs/vbenchmark/tasks/glue-code/data-transform/json-to-xml/PROMPT.md +0 -45
  646. package/refs/vbenchmark/tasks/glue-code/data-transform/json-to-xml/task.yaml +0 -24
  647. package/refs/vbenchmark/tasks/glue-code/data-transform/protobuf-converter/PROMPT.md +0 -44
  648. package/refs/vbenchmark/tasks/glue-code/data-transform/protobuf-converter/task.yaml +0 -24
  649. package/refs/vbenchmark/tasks/glue-code/etl/cdc-pipeline/PROMPT.md +0 -52
  650. package/refs/vbenchmark/tasks/glue-code/etl/cdc-pipeline/task.yaml +0 -27
  651. package/refs/vbenchmark/tasks/glue-code/etl/database-sync/PROMPT.md +0 -51
  652. package/refs/vbenchmark/tasks/glue-code/etl/database-sync/task.yaml +0 -24
  653. package/refs/vbenchmark/tasks/glue-code/etl/s3-to-warehouse/PROMPT.md +0 -50
  654. package/refs/vbenchmark/tasks/glue-code/etl/s3-to-warehouse/task.yaml +0 -24
  655. package/refs/vbenchmark/tasks/glue-code/file-processing/image-resizer/PROMPT.md +0 -52
  656. package/refs/vbenchmark/tasks/glue-code/file-processing/image-resizer/task.yaml +0 -24
  657. package/refs/vbenchmark/tasks/glue-code/file-processing/pdf-merger/PROMPT.md +0 -50
  658. package/refs/vbenchmark/tasks/glue-code/file-processing/pdf-merger/task.yaml +0 -24
  659. package/refs/vbenchmark/tasks/glue-code/file-processing/video-transcoder/PROMPT.md +0 -50
  660. package/refs/vbenchmark/tasks/glue-code/file-processing/video-transcoder/task.yaml +0 -27
  661. package/refs/vbenchmark/tasks/glue-code/migration/data-backfill/PROMPT.md +0 -50
  662. package/refs/vbenchmark/tasks/glue-code/migration/data-backfill/task.yaml +0 -24
  663. package/refs/vbenchmark/tasks/glue-code/migration/database-versioning/PROMPT.md +0 -50
  664. package/refs/vbenchmark/tasks/glue-code/migration/database-versioning/task.yaml +0 -24
  665. package/refs/vbenchmark/tasks/glue-code/queue/kafka-producer/PROMPT.md +0 -49
  666. package/refs/vbenchmark/tasks/glue-code/queue/kafka-producer/task.yaml +0 -27
  667. package/refs/vbenchmark/tasks/glue-code/queue/rabbitmq-consumer/PROMPT.md +0 -50
  668. package/refs/vbenchmark/tasks/glue-code/queue/rabbitmq-consumer/task.yaml +0 -27
  669. package/refs/vbenchmark/tasks/glue-code/queue/sqs-batch-processor/PROMPT.md +0 -47
  670. package/refs/vbenchmark/tasks/glue-code/queue/sqs-batch-processor/task.yaml +0 -24
  671. package/refs/vbenchmark/tasks/glue-code/scheduler/cron-job-manager/PROMPT.md +0 -52
  672. package/refs/vbenchmark/tasks/glue-code/scheduler/cron-job-manager/task.yaml +0 -27
  673. package/refs/vbenchmark/tasks/glue-code/scheduler/delayed-tasks/PROMPT.md +0 -51
  674. package/refs/vbenchmark/tasks/glue-code/scheduler/delayed-tasks/task.yaml +0 -27
  675. package/refs/vbenchmark/tasks/saas-core/advanced/api-versioning/PROMPT.md +0 -15
  676. package/refs/vbenchmark/tasks/saas-core/advanced/api-versioning/task.yaml +0 -16
  677. package/refs/vbenchmark/tasks/saas-core/advanced/circuit-breaker/PROMPT.md +0 -13
  678. package/refs/vbenchmark/tasks/saas-core/advanced/circuit-breaker/task.yaml +0 -16
  679. package/refs/vbenchmark/tasks/saas-core/advanced/compliance-gdpr/PROMPT.md +0 -15
  680. package/refs/vbenchmark/tasks/saas-core/advanced/compliance-gdpr/task.yaml +0 -16
  681. package/refs/vbenchmark/tasks/saas-core/advanced/cqrs-pattern/PROMPT.md +0 -13
  682. package/refs/vbenchmark/tasks/saas-core/advanced/cqrs-pattern/task.yaml +0 -16
  683. package/refs/vbenchmark/tasks/saas-core/advanced/data-encryption/PROMPT.md +0 -15
  684. package/refs/vbenchmark/tasks/saas-core/advanced/data-encryption/task.yaml +0 -16
  685. package/refs/vbenchmark/tasks/saas-core/advanced/distributed-locking/PROMPT.md +0 -46
  686. package/refs/vbenchmark/tasks/saas-core/advanced/distributed-locking/task.yaml +0 -24
  687. package/refs/vbenchmark/tasks/saas-core/advanced/event-sourcing/PROMPT.md +0 -23
  688. package/refs/vbenchmark/tasks/saas-core/advanced/event-sourcing/task.yaml +0 -16
  689. package/refs/vbenchmark/tasks/saas-core/advanced/feature-flags-ab/PROMPT.md +0 -15
  690. package/refs/vbenchmark/tasks/saas-core/advanced/feature-flags-ab/task.yaml +0 -16
  691. package/refs/vbenchmark/tasks/saas-core/advanced/saga-orchestration/PROMPT.md +0 -13
  692. package/refs/vbenchmark/tasks/saas-core/advanced/saga-orchestration/task.yaml +0 -16
  693. package/refs/vbenchmark/tasks/saas-core/advanced/webhook-delivery/PROMPT.md +0 -15
  694. package/refs/vbenchmark/tasks/saas-core/advanced/webhook-delivery/task.yaml +0 -16
  695. package/refs/vbenchmark/tasks/saas-core/audit/activity-logging/PROMPT.md +0 -50
  696. package/refs/vbenchmark/tasks/saas-core/audit/activity-logging/task.yaml +0 -27
  697. package/refs/vbenchmark/tasks/saas-core/auth/jwt-refresh-tokens/PROMPT.md +0 -50
  698. package/refs/vbenchmark/tasks/saas-core/auth/jwt-refresh-tokens/task.yaml +0 -27
  699. package/refs/vbenchmark/tasks/saas-core/auth/magic-link-email/PROMPT.md +0 -53
  700. package/refs/vbenchmark/tasks/saas-core/auth/magic-link-email/task.yaml +0 -27
  701. package/refs/vbenchmark/tasks/saas-core/auth/mfa-totp/PROMPT.md +0 -79
  702. package/refs/vbenchmark/tasks/saas-core/auth/mfa-totp/task.yaml +0 -27
  703. package/refs/vbenchmark/tasks/saas-core/auth/rbac-permissions/PROMPT.md +0 -51
  704. package/refs/vbenchmark/tasks/saas-core/auth/rbac-permissions/task.yaml +0 -27
  705. package/refs/vbenchmark/tasks/saas-core/auth/session-management/PROMPT.md +0 -52
  706. package/refs/vbenchmark/tasks/saas-core/auth/session-management/task.yaml +0 -27
  707. package/refs/vbenchmark/tasks/saas-core/auth/supabase-oauth/PROMPT.md +0 -45
  708. package/refs/vbenchmark/tasks/saas-core/auth/supabase-oauth/docker-compose.yaml +0 -47
  709. package/refs/vbenchmark/tasks/saas-core/auth/supabase-oauth/task.yaml +0 -32
  710. package/refs/vbenchmark/tasks/saas-core/auth/supabase-oauth/tests/auth.test.ts +0 -59
  711. package/refs/vbenchmark/tasks/saas-core/billing/invoice-generation/PROMPT.md +0 -53
  712. package/refs/vbenchmark/tasks/saas-core/billing/invoice-generation/task.yaml +0 -27
  713. package/refs/vbenchmark/tasks/saas-core/billing/stripe-subscriptions/PROMPT.md +0 -51
  714. package/refs/vbenchmark/tasks/saas-core/billing/stripe-subscriptions/task.yaml +0 -27
  715. package/refs/vbenchmark/tasks/saas-core/billing/usage-metering/PROMPT.md +0 -52
  716. package/refs/vbenchmark/tasks/saas-core/billing/usage-metering/task.yaml +0 -27
  717. package/refs/vbenchmark/tasks/saas-core/crud/dashboard-table/PROMPT.md +0 -48
  718. package/refs/vbenchmark/tasks/saas-core/crud/dashboard-table/task.yaml +0 -28
  719. package/refs/vbenchmark/tasks/saas-core/multi-tenant/org-isolation/PROMPT.md +0 -50
  720. package/refs/vbenchmark/tasks/saas-core/multi-tenant/org-isolation/task.yaml +0 -27
  721. package/refs/vbenchmark/tasks/saas-core/multi-tenant/subdomain-routing/PROMPT.md +0 -50
  722. package/refs/vbenchmark/tasks/saas-core/multi-tenant/subdomain-routing/task.yaml +0 -27
  723. package/refs/vbenchmark/tasks/saas-core/notifications/email-queue/PROMPT.md +0 -53
  724. package/refs/vbenchmark/tasks/saas-core/notifications/email-queue/task.yaml +0 -27
  725. package/refs/vbenchmark/tasks/saas-core/notifications/in-app-alerts/PROMPT.md +0 -51
  726. package/refs/vbenchmark/tasks/saas-core/notifications/in-app-alerts/task.yaml +0 -27
  727. package/refs/vbenchmark/tasks/saas-core/notifications/push-notifications/PROMPT.md +0 -51
  728. package/refs/vbenchmark/tasks/saas-core/notifications/push-notifications/task.yaml +0 -27
  729. package/refs/vbenchmark/tasks/saas-core/realtime/websocket-chat/PROMPT.md +0 -80
  730. package/refs/vbenchmark/tasks/saas-core/realtime/websocket-chat/task.yaml +0 -27
  731. package/refs/vbenchmark/tasks/saas-core/search/full-text-search/PROMPT.md +0 -51
  732. package/refs/vbenchmark/tasks/saas-core/search/full-text-search/task.yaml +0 -27
  733. package/refs/vbenchmark/tasks/saas-core/security/rate-limiter/PROMPT.md +0 -99
  734. package/refs/vbenchmark/tasks/saas-core/security/rate-limiter/task.yaml +0 -27
  735. package/refs/vbenchmark/tasks/saas-core/settings/user-preferences/PROMPT.md +0 -78
  736. package/refs/vbenchmark/tasks/saas-core/settings/user-preferences/task.yaml +0 -27
  737. package/refs/vbenchmark/templates/fastapi-postgres/docker-compose.yaml +0 -36
  738. package/refs/vbenchmark/templates/fastapi-postgres/pyproject.toml +0 -34
  739. package/refs/vbenchmark/templates/fastapi-postgres/src/__init__.py +0 -0
  740. package/refs/vbenchmark/templates/fastapi-postgres/src/config.py +0 -12
  741. package/refs/vbenchmark/templates/fastapi-postgres/src/database.py +0 -15
  742. package/refs/vbenchmark/templates/fastapi-postgres/src/main.py +0 -51
  743. package/refs/vbenchmark/templates/fastapi-postgres/src/models.py +0 -12
  744. package/refs/vbenchmark/templates/fastapi-postgres/src/schemas.py +0 -20
  745. package/refs/vbenchmark/templates/go-fiber/docker-compose.yaml +0 -34
  746. package/refs/vbenchmark/templates/go-fiber/go.mod +0 -33
  747. package/refs/vbenchmark/templates/go-fiber/go.sum +0 -68
  748. package/refs/vbenchmark/templates/go-fiber/main.go +0 -98
  749. package/refs/vbenchmark/templates/nextjs-supabase/.env.example +0 -3
  750. package/refs/vbenchmark/templates/nextjs-supabase/docker-compose.yaml +0 -68
  751. package/refs/vbenchmark/templates/nextjs-supabase/src/app/globals.css +0 -13
  752. package/refs/vbenchmark/templates/nextjs-supabase/src/app/layout.tsx +0 -19
  753. package/refs/vbenchmark/templates/nextjs-supabase/src/app/page.tsx +0 -38
  754. package/refs/vbenchmark/templates/nextjs-supabase/src/lib/supabase/client.ts +0 -8
  755. package/refs/vbenchmark/templates/nextjs-supabase/src/lib/supabase/server.ts +0 -32
  756. package/refs/vbenchmark/templates/rust-axum/Cargo.lock +0 -2371
  757. package/refs/vbenchmark/templates/rust-axum/Cargo.toml +0 -16
  758. package/refs/vbenchmark/templates/rust-axum/docker-compose.yaml +0 -34
  759. package/refs/vbenchmark/templates/rust-axum/migrations/20240101000000_init.sql +0 -20
  760. package/refs/vbenchmark/templates/rust-axum/src/main.rs +0 -121
  761. package/refs/vbenchmark/tsconfig.base.json +0 -18
  762. package/refs/vbenchmark/turbo.json +0 -23
  763. package/refs/vbenchmark/vercel.json +0 -10
@@ -1,354 +0,0 @@
- <p align="center">
- <h1 align="center">🚀 VibeCodingBench</h1>
- <p align="center">
- <strong>The benchmark that measures what AI coding agents actually do in production</strong>
- </p>
- <p align="center">
- <a href="#why-vibecodingbench">Why</a> •
- <a href="#quick-start">Quick Start</a> •
- <a href="#task-categories">Tasks</a> •
- <a href="#evaluation">Evaluation</a> •
- <a href="#leaderboard">Leaderboard</a> •
- <a href="#contributing">Contributing</a>
- </p>
- <p align="center">
- <img src="https://img.shields.io/badge/tasks-180-blue" alt="Tasks">
- <img src="https://img.shields.io/badge/models-15-green" alt="Models">
- <img src="https://img.shields.io/badge/languages-10-orange" alt="Languages">
-
- <img src="https://img.shields.io/badge/version-1.0.0-brightgreen" alt="Version">
- </p>
- </p>
-
- ---
-
25
- ## Why VibeCodingBench?
26
-
27
- **Existing benchmarks are disconnected from reality.** See our [full thesis](docs/THESIS.md) for detailed analysis.
28
-
29
- | Benchmark | Focus | Real-World Signal | Limitation |
30
- |-----------|-------|-------------------|------------|
31
- | HumanEval | Algorithmic puzzles | ❌ Low | Not production code |
32
- | SWE-bench | Bug fixes in 12 repos | ⚠️ Medium | [63% suspicious patches](https://runloop.ai/blog/swe-bench-deep-dive-unmasking-the-limitations-of-a-popular-benchmark) |
33
- | SWE-bench Pro | Multi-file tasks | ⚠️ Medium | [70% → 23% performance drop](https://scale.com/leaderboard/swe_bench_pro_public) |
34
- | **VibeCodingBench** | Full-stack features | ✅ **High** | Production-aligned tasks |
35
-
36
- ### The Evidence
37
-
38
- **Developer Time Distribution** ([Sonar Research](https://www.sonarsource.com/blog/how-much-time-do-developers-spend-actually-writing-code/)):
39
- - Writing new code: 32% | Code maintenance: 19% | Testing: 12%
40
- - Developers code only **52 minutes/day** on average
41
-
42
- **The Boilerplate Burden** ([GitHub Octoverse 2025](https://github.blog/news-insights/octoverse/)):
43
- - 2.4M repos use Notebooks (+75% YoY)
44
- - 1.9M repos use Dockerfiles (+120% YoY)
45
- - Developers need help with **repetitive patterns**: auth, CRUD, integrations
46
-
47
- **SWE-EVO Exposes the Gap** ([arxiv:2512.18470](https://arxiv.org/abs/2512.18470)):
48
- - Best models: 65% on simple fixes → **only 21% on code evolution**
49
- - "Current AI agents struggle with comprehensive planning and execution"
50
-
51
- **Quality Beyond Pass Rate** ([Qodo 2025](https://www.qodo.ai/reports/state-of-ai-code-quality/)):
52
- - "Claude Sonnet 4 averaged **2.11 issues per passing task**"
53
- - Pass rate alone hides production risks
54
-
55
- **Developer Frustration** ([Stack Overflow 2025](https://survey.stackoverflow.co/2025/)):
56
- - 66% cite "AI solutions that are almost right, but not quite" as their top frustration
57
- - 45% say "debugging AI-generated code is more time-consuming"
58
-
59
- ## Quick Start
60
-
61
- ### From Source
62
-
63
- ```bash
64
- git clone https://github.com/alt-research/vibe-coding-benchmark-public.git
65
- cd vibe-coding-benchmark-public
66
- npm install
67
- npm run build
68
-
69
- # List tasks
70
- node packages/cli/dist/index.js list
71
-
72
- # Run a task with mock agent
73
- node packages/cli/dist/index.js run saas-core/auth/supabase-oauth --agent mock
74
-
75
- # Run with real agent (requires API key)
76
- export ANTHROPIC_API_KEY=your_key
77
- node packages/cli/dist/index.js run saas-core/auth/supabase-oauth --agent claude
78
-
79
- # Run full evaluation across agents
80
- node packages/cli/dist/index.js eval --agents claude,glm,minimax
81
-
82
- # Watch live execution
83
- node packages/cli/dist/index.js run <task-id> --agent claude --live
84
- ```
85
-
86
- ## Task Categories
87
-
88
- | Category | Weight | Tasks | Languages | Examples |
89
- |----------|--------|-------|-----------|----------|
90
- | **SaaS Core** | 25% | 20 | TS, Go, Python, Java, Rust | `supabase-oauth`, `jwt-refresh-tokens`, `rbac-permissions` |
91
- | **Glue Code** | 20% | 20 | Python, Go, TS, Java, Rust | `csv-normalizer`, `kafka-producer`, `cdc-pipeline` |
92
- | **AI Integration** | 20% | 20 | Python, TS, Go | `pdf-qa`, `research-agent`, `semantic-search` |
93
- | **Frontend** | 15% | 20 | React, Vue, Svelte, RN | `landing-page`, `data-grid`, `collaborative-editor` |
94
- | **API Integrations** | 10% | 20 | TS, Go, Python, Java | `checkout-session`, `twilio-sms`, `saml-sso` |
95
- | **Code Evolution** | 10% | 20 | TS, Python, Go, Kotlin | `flask-to-fastapi`, `java-to-kotlin`, `secrets-rotation` |
96
-
97
- **Total: 180 tasks** across **10 languages and frameworks** (TypeScript, Python, Go, Java, Kotlin, Rust, C#, React, Vue, Svelte)
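The README does not say how the category weights are combined, so the following is only a sketch under one assumption: that the overall score is a weighted mean of per-category scores on a 0-100 scale. The weights come from the table; `weighted_category_score` and its input dictionary are hypothetical names introduced for illustration.

```python
# Category weights copied from the Task Categories table.
CATEGORY_WEIGHTS = {
    "saas-core": 0.25,
    "glue-code": 0.20,
    "ai-integration": 0.20,
    "frontend": 0.15,
    "api-integrations": 0.10,
    "code-evolution": 0.10,
}


def weighted_category_score(category_scores: dict[str, float]) -> float:
    """Weighted mean of per-category scores (assumed aggregation, not confirmed by the source)."""
    missing = set(CATEGORY_WEIGHTS) - set(category_scores)
    if missing:
        raise ValueError(f"missing category scores: {sorted(missing)}")
    return sum(
        weight * category_scores[name]
        for name, weight in CATEGORY_WEIGHTS.items()
    )
```

Because the weights sum to 100%, a model scoring the same value in every category gets that value back as its overall score.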
98
-
99
- ### Language Distribution
100
-
101
- Based on [GitHub Octoverse 2025](https://github.blog/news-insights/octoverse/) and [Stack Overflow Developer Survey 2025](https://survey.stackoverflow.co/2025/):
102
-
103
- | Language | % of Tasks | Rationale |
104
- |----------|------------|-----------|
105
- | TypeScript/JavaScript | 40% | #1 on GitHub, dominant in web dev |
106
- | Python | 25% | #2 on GitHub, AI/ML leader |
107
- | Go | 15% | Rising for cloud-native, microservices |
108
- | Java/Kotlin | 10% | Enterprise, Android development |
109
- | Rust | 5% | Systems programming, performance-critical |
110
- | C# | 5% | Enterprise, game development |
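As a rough worked example, the percentages in the table can be turned into task counts for a given suite size. The total is left as a parameter because the README quotes both 180 tasks and 6 categories of 20; `tasks_per_language` is a hypothetical helper, not part of the benchmark CLI.

```python
# Language shares (percent of tasks) copied from the Language Distribution table.
LANGUAGE_SHARES = {
    "TypeScript/JavaScript": 40,
    "Python": 25,
    "Go": 15,
    "Java/Kotlin": 10,
    "Rust": 5,
    "C#": 5,
}


def tasks_per_language(total_tasks: int) -> dict[str, int]:
    """Convert percentage shares into task counts.

    Counts are rounded, so they may not sum to total_tasks when the
    shares do not divide the total evenly.
    """
    if sum(LANGUAGE_SHARES.values()) != 100:
        raise ValueError("language shares must sum to 100%")
    return {
        language: round(total_tasks * share / 100)
        for language, share in LANGUAGE_SHARES.items()
    }
```

For a 120-task suite this gives 48, 30, 18, 12, 6, and 6 tasks, in table order.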
111
-
112
- ### Task Structure
113
-
114
- Each task is a self-contained directory:
115
-
116
- ```
117
- tasks/saas-core/auth/supabase-oauth/
118
- ├── task.yaml # Metadata, constraints
119
- ├── PROMPT.md # Instructions for the agent
120
- ├── tests/ # Evaluation tests
121
- │ └── auth.test.ts # Playwright E2E tests
122
- ├── docker-compose.yaml # Services (DB, mock APIs)
123
- └── golden/ # Reference implementation (optional)
124
- ```
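A small layout check along these lines could catch incomplete task directories before a run. This is a sketch, not the benchmark's own validation: it assumes `task.yaml`, `PROMPT.md`, and `tests/` are mandatory, and it leaves `docker-compose.yaml` and `golden/` unchecked because the tree above does not state that they are required. `missing_task_entries` is a hypothetical name.

```python
from pathlib import Path

# Assumption: these entries are mandatory for every task.
REQUIRED_FILES = ("task.yaml", "PROMPT.md")
REQUIRED_DIRS = ("tests",)


def missing_task_entries(task_dir: Path) -> list[str]:
    """Return the required files and directories that are absent from task_dir."""
    missing = [name for name in REQUIRED_FILES if not (task_dir / name).is_file()]
    missing += [f"{name}/" for name in REQUIRED_DIRS if not (task_dir / name).is_dir()]
    return missing
```

An empty list means the directory has the assumed minimum layout, for example `missing_task_entries(Path("tasks/saas-core/auth/supabase-oauth"))`.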
125
-
126
- **Hot-reload support**: Add new tasks while the benchmark is running!
127
-
128
- ## Evaluation
129
-
130
- ### Multi-Dimensional Scoring
131
-
132
- We measure what senior engineers care about:
133
-
134
- | Dimension | Weight | Method | Why It Matters |
135
- |-----------|--------|--------|----------------|
136
- | **Functional** | 40% | Playwright E2E, Pass@k | Does it work? |
137
- | **Visual** | 20% | Pixel diff vs reference | Does it look right? |
138
- | **Quality** | 20% | ESLint + Semgrep + complexity | Is it maintainable? |
139
- | **Cost** | 10% | Tokens used, context pollution | Is it efficient? |
140
- | **Speed** | 10% | Wall-clock time, step count | Is it fast? |
141
-
142
- ### Security Gate
143
-
144
- Any vulnerability rated **Critical** or **High** is an **automatic fail**. We scan with Semgrep using OWASP rules.
145
-
146
- ### The Scoring Formula
147
-
148
- ```
149
- Final = (Functional × 0.4) + (Visual × 0.2) + (Quality × 0.2)
150
- - (Cost Penalty) - (Speed Penalty)
151
-
152
- Security Fail → Final = 0
153
- ```
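The formula can be written as a short function. This is a minimal sketch: the README does not define how the cost and speed penalties are computed, so they are taken here as already-computed inputs on the same scale as the scores, and `final_score` with its parameter names is hypothetical.

```python
def final_score(
    functional: float,
    visual: float,
    quality: float,
    cost_penalty: float = 0.0,
    speed_penalty: float = 0.0,
    security_failed: bool = False,
) -> float:
    """Apply the scoring formula from the README.

    Scores are expected on a 0-100 scale. How the penalties are derived
    is not specified in the source, so they are passed in as-is.
    """
    # The security gate overrides everything else.
    if security_failed:
        return 0.0
    weighted = functional * 0.4 + visual * 0.2 + quality * 0.2
    return weighted - cost_penalty - speed_penalty
```

Checking the security gate first keeps the override explicit: a run with perfect functional, visual, and quality scores still returns 0 when a Critical or High finding is present.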
154
-
155
- ## Supported Agents
156
-
157
- | Agent | Model | Status | Config | Pricing (Input/Output per MTok) |
158
- |-------|-------|--------|--------|--------------------------------|
159
- | Claude | Haiku 4.5 | ✅ Supported | `ANTHROPIC_API_KEY` | $1.00 / $5.00 |
160
- | Claude | Opus 4.5 | ✅ Supported | `ANTHROPIC_API_KEY` | $5.00 / $25.00 |
161
- | Qwen | Qwen3-Max | ✅ Supported | `QWEN_API_KEY` | $1.20 / $6.00 |
162
- | GLM | GLM-4.7 | ✅ Supported | `GLM_API_KEY` | $0.60 / $2.20 |
163
- | MiniMax | M2.1 | ✅ Supported | `MINIMAX_API_KEY` | $0.30 / $1.20 |
164
- | OpenAI | GPT-5.2 | ✅ Supported | `OPENAI_API_KEY` | $1.75 / $14.00 |
165
- | DeepSeek | Chat-v3 | ✅ Supported | `DEEPSEEK_API_KEY` | $0.40 / $1.60 |
166
- | Gemini | 3-Flash Preview | ✅ Supported | `GOOGLE_API_KEY` | $0.50 / $3.00 |
167
-
168
- ## Leaderboard
169
-
170
- ```
171
- 📈 LEADERBOARD (2026-01-27) - 180 tasks evaluated, 15 models
172
-
173
- ╔══════╤══════════════════════╤═══════╤═══════════╤════════════╤════════════╤══════════════╤═════════════╗
174
- ║ Rank │ Model │ Final │ Pass Rate │ Total Cost │ Total Time │ Avg Time/Task│ Total Tokens║
175
- ╟──────┼──────────────────────┼───────┼───────────┼────────────┼────────────┼──────────────┼─────────────╢
176
- ║ #1 │ Claude Opus 4.5 │ 89.2% │ 100.0% │ $12.31 │ 2h 12m │ 44s │ 648K ║
177
- ╟──────┼──────────────────────┼───────┼───────────┼────────────┼────────────┼──────────────┼─────────────╢
178
- ║ #2 │ Claude Haiku 4.5 │ 89.0% │ 99.4% │ $3.03 │ 1h 5m │ 22s │ 798K ║
179
- ╟──────┼──────────────────────┼───────┼───────────┼────────────┼────────────┼──────────────┼─────────────╢
180
- ║ #3 │ Grok 4 Fast │ 88.8% │ 98.9% │ $0.21 │ 1h 57m │ 70s │ 520K ║
181
- ╟──────┼──────────────────────┼───────┼───────────┼────────────┼────────────┼──────────────┼─────────────╢
182
- ║ #4 │ OpenAI GPT-5.2 │ 88.8% │ 98.3% │ $5.01 │ 1h 24m │ 28s │ 485K ║
183
- ╟──────┼──────────────────────┼───────┼───────────┼────────────┼────────────┼──────────────┼─────────────╢
184
- ║ #5 │ Qwen3 Max │ 88.6% │ 100.0% │ $5.42 │ 2h 15m │ 45s │ 949K ║
185
- ╟──────┼──────────────────────┼───────┼───────────┼────────────┼────────────┼──────────────┼─────────────╢
186
- ║ #6 │ Claude Sonnet 4.5 │ 88.6% │ 98.3% │ $6.98 │ 2h 6m │ 42s │ 612K ║
187
- ╟──────┼──────────────────────┼───────┼───────────┼────────────┼────────────┼──────────────┼─────────────╢
188
- ║ #7 │ GLM 4-Plus │ 88.2% │ 98.9% │ $0.93 │ 4h 49m │ 96s │ 794K ║
189
- ╟──────┼──────────────────────┼───────┼───────────┼────────────┼────────────┼──────────────┼─────────────╢
190
- ║ #8 │ DeepSeek v3.2 │ 88.2% │ 98.3% │ $0.50 │ 4h 29m │ 90s │ 543K ║
191
- ╟──────┼──────────────────────┼───────┼───────────┼────────────┼────────────┼──────────────┼─────────────╢
192
- ║ #9 │ Grok 4 │ 88.0% │ 97.8% │ $5.47 │ 2h 5m │ 75s │ 480K ║
193
- ╟──────┼──────────────────────┼───────┼───────────┼────────────┼────────────┼──────────────┼─────────────╢
194
- ║ #10 │ MiniMax M2.1 │ 87.4% │ 99.4% │ $2.40 │ 8h 15m │ 165s │ 2.78M ║
195
- ╟──────┼──────────────────────┼───────┼───────────┼────────────┼────────────┼──────────────┼─────────────╢
196
- ║ #11 │ Grok 4.1 Fast │ 86.8% │ 97.2% │ $0.24 │ 2h 27m │ 89s │ 580K ║
197
- ╟──────┼──────────────────────┼───────┼───────────┼────────────┼────────────┼──────────────┼─────────────╢
198
- ║ #12 │ Gemini 3 Pro Preview │ 85.8% │ 95.0% │ $10.34 │ 1h 36m │ 32s │ 738K ║
199
- ╟──────┼──────────────────────┼───────┼───────────┼────────────┼────────────┼──────────────┼─────────────╢
200
- ║ #13 │ GLM-4.7 │ 83.9% │ 85.6% │ $0.73 │ 2h 50m │ 57s │ 623K ║
201
- ╟──────┼──────────────────────┼───────┼───────────┼────────────┼────────────┼──────────────┼─────────────╢
202
- ║ #14 │ GLM 4.7 Flash │ 83.8% │ 92.8% │ $1.11 │ 2h 15m │ 45s │ 650K ║
203
- ╟──────┼──────────────────────┼───────┼───────────┼────────────┼────────────┼──────────────┼─────────────╢
204
- ║ #15 │ Gemini 3 Flash │ 83.4% │ 92.2% │ $0.86 │ 1h 23m │ 28s │ 384K ║
205
- ╚══════╧══════════════════════╧═══════╧═══════════╧════════════╧════════════╧══════════════╧═════════════╝
206
- ```
207
-
208
- ### Pricing (OpenRouter 2026-01-27)
209
-
210
- | Model | Input $/M | Output $/M |
211
- |-------|-----------|------------|
212
- | Claude Opus 4.5 | $5.00 | $25.00 |
213
- | Claude Sonnet 4.5 | $3.00 | $15.00 |
214
- | Claude Haiku 4.5 | $1.00 | $5.00 |
215
- | Qwen3 Max | $1.20 | $6.00 |
216
- | OpenAI GPT-5.2 | $1.75 | $14.00 |
217
- | Grok 4 | $3.00 | $15.00 |
218
- | Grok 4 Fast | $0.20 | $0.50 |
219
- | Grok 4.1 Fast | $0.20 | $0.50 |
220
- | GLM 4-Plus/4.7 | $0.40 | $1.50 |
221
- | GLM 4.7 Flash | $0.07 | $0.40 |
222
- | DeepSeek v3.2 | $0.30 | $1.20 |
223
- | Gemini 3 Flash | $0.50 | $3.00 |
224
- | Gemini 3 Pro | $2.00 | $12.00 |
225
- | MiniMax M2.1 | $0.27 | $1.12 |
226
-
227
- ### Detailed Metrics
228
-
229
- | Model | Functional | Quality | Cost/Task | Tokens/Task |
230
- |-------|------------|---------|-----------|-------------|
231
- | Claude Opus 4.5 | 85.0% | 80.0% | $0.0684 | 3,599 |
232
- | Claude Haiku 4.5 | 84.5% | 79.6% | $0.0168 | 4,435 |
233
- | Grok 4 Fast | 84.1% | 80.0% | $0.0012 | 2,889 |
234
- | Qwen3 Max | 85.0% | 80.0% | $0.0301 | 5,273 |
235
- | OpenAI GPT-5.2 | 83.6% | 79.6% | $0.0278 | 2,694 |
236
- | Claude Sonnet 4.5 | 83.6% | 80.0% | $0.0388 | 3,400 |
237
- | GLM 4-Plus | 84.1% | 80.0% | $0.0052 | 4,412 |
238
- | DeepSeek v3.2 | 83.6% | 80.0% | $0.0028 | 3,015 |
239
- | Grok 4 | 83.6% | 80.0% | $0.0304 | 2,667 |
240
- | MiniMax M2.1 | 84.5% | 80.0% | $0.0133 | 15,436 |
241
- | Grok 4.1 Fast | 82.6% | 78.7% | $0.0013 | 3,222 |
242
- | Gemini 3 Pro Preview | 80.8% | 77.3% | $0.0574 | 4,099 |
243
- | GLM-4.7 | 72.7% | 79.6% | $0.0041 | 3,464 |
244
- | GLM 4.7 Flash | 78.9% | 79.6% | $0.0062 | 3,611 |
245
- | Gemini 3 Flash | 78.4% | 75.1% | $0.0048 | 2,133 |
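For the rows checked here, the Cost/Task column is consistent with the leaderboard's Total Cost divided by the 180 tasks evaluated. The sketch below reproduces that arithmetic for two rows; the interpretation is inferred from the published numbers rather than stated in the README, and `per_task` is a hypothetical helper.

```python
TASKS_EVALUATED = 180  # from the leaderboard header


def per_task(total: float, tasks: int = TASKS_EVALUATED) -> float:
    """Average a run-level total (cost, seconds, tokens) over the evaluated tasks."""
    if tasks <= 0:
        raise ValueError("tasks must be positive")
    return total / tasks


# Claude Opus 4.5: total cost $12.31, total time 2h 12m (7,920 seconds).
opus_cost_per_task = round(per_task(12.31), 4)
opus_seconds_per_task = round(per_task(2 * 3600 + 12 * 60))

# Claude Haiku 4.5: total cost $3.03.
haiku_cost_per_task = round(per_task(3.03), 4)
```

These come out to $0.0684 and 44 seconds per task for Claude Opus 4.5 and $0.0168 per task for Claude Haiku 4.5, matching the corresponding table entries.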
246
-
247
- **Live Dashboard**: https://vibecoding.llmbench.xyz
248
-
249
- ## Contributing
250
-
251
- We welcome contributions! See [CONTRIBUTING.md](CONTRIBUTING.md) for details.
252
-
253
- ### Adding a New Task
254
-
255
- 1. **Create task directory**:
256
- ```bash
257
- mkdir -p tasks/<category>/<subcategory>/<task-name>
258
- ```
259
-
260
- 2. **Add task.yaml**:
261
- ```yaml
262
- name: My New Task
263
- category: saas-core
264
- difficulty: medium
265
- stack: nextjs-supabase
266
- tags: [typescript, auth]
267
- ```
268
-
269
- 3. **Write PROMPT.md** with clear requirements
270
-
271
- 4. **Add tests** (Playwright for web, pytest for Python)
272
-
273
- 5. **Submit PR** using the template
274
-
275
- ## Architecture
276
-
277
- ```
278
- vibecodingbench/
279
- ├── packages/
280
- │ ├── cli/ # CLI tool
281
- │ ├── evaluator/ # Scoring engine
282
- │ └── leaderboard/ # Web dashboard
283
- ├── tasks/ # 120 benchmark tasks
284
- │ ├── saas-core/ # 20 tasks
285
- │ ├── glue-code/ # 20 tasks
286
- │ ├── ai-integration/ # 20 tasks
287
- │ ├── frontend/ # 20 tasks
288
- │ ├── api-integrations/ # 20 tasks
289
- │ └── code-evolution/ # 20 tasks
290
- ├── templates/ # Starter codebases
291
- │ ├── nextjs-supabase/
292
- │ ├── fastapi-postgres/
293
- │ ├── go-fiber/
294
- │ └── rust-axum/
295
- └── docker/ # Base images
296
- ```
297
-
298
- ## Deployment
299
-
300
- ### Self-Hosted (Docker)
301
-
302
- ```bash
303
- # Build and run production stack
304
- ./scripts/deploy.sh docker
305
-
306
- # Or in background
307
- ./scripts/deploy.sh docker --detach
308
-
309
- # Services available at:
310
- # - Dashboard: http://localhost:3000
311
- # - API: http://localhost:3001
312
- ```
313
-
314
- ### Fly.io
315
-
316
- ```bash
317
- cd packages/leaderboard
318
- fly launch --config fly.toml
319
- fly deploy
320
- ```
321
-
322
- ## Environment Setup
323
-
324
- ```bash
325
- # Required
326
- export ANTHROPIC_API_KEY=... # Claude (Anthropic)
327
- export OPENAI_API_KEY=... # OpenAI
328
- export GOOGLE_API_KEY=... # Gemini (Google AI)
329
-
330
- # Optional
331
- export GLM_API_KEY=... # GLM (Zhipu AI)
332
- export MINIMAX_API_KEY=... # MiniMax
333
- export QWEN_API_KEY=... # Qwen (Alibaba DashScope)
334
- export DEEPSEEK_API_KEY=... # DeepSeek
335
- ```
336
-
337
- ## Citation
338
-
339
- If you use VibeCodingBench in your research, please cite:
340
-
341
- ```bibtex
342
- @software{vibecodingbench2025,
343
- title = {VibeCodingBench: A Benchmark for AI Coding Agents on Real-World Developer Tasks},
344
- year = {2025},
345
- url = {https://github.com/alt-research/vibe-coding-benchmark-public}
346
- }
347
- ```
348
-
349
-
350
- ---
351
-
352
- <p align="center">
353
- <sub>Built with ❤️ by the open-source community</sub>
354
- </p>
@@ -1,35 +0,0 @@
1
- version: '3.8'
2
-
3
- services:
4
- leaderboard:
5
- build:
6
- context: ./packages/leaderboard
7
- dockerfile: Dockerfile
8
- ports:
9
- - "3001:3001"
10
- environment:
11
- - NODE_ENV=production
12
- - PORT=3001
13
- - DATABASE_URL=${DATABASE_URL:-}
14
- restart: unless-stopped
15
- healthcheck:
16
- test: ["CMD", "wget", "-qO-", "http://localhost:3001/health"]
17
- interval: 30s
18
- timeout: 10s
19
- retries: 3
20
-
21
- dashboard:
22
- build:
23
- context: ./packages/dashboard
24
- dockerfile: Dockerfile
25
- ports:
26
- - "3000:3000"
27
- environment:
28
- - VITE_API_URL=http://leaderboard:3001
29
- depends_on:
30
- - leaderboard
31
- restart: unless-stopped
32
-
33
- networks:
34
- default:
35
- name: vibecodingbench
@@ -1,53 +0,0 @@
1
- version: '3.8'
2
-
3
- services:
4
- postgres:
5
- image: postgres:16-alpine
6
- container_name: benchmark-postgres
7
- environment:
8
- POSTGRES_USER: benchmark
9
- POSTGRES_PASSWORD: benchmark123
10
- POSTGRES_DB: vibecodingbench
11
- ports:
12
- - "5432:5432"
13
- volumes:
14
- - postgres_data:/var/lib/postgresql/data
15
- healthcheck:
16
- test: ["CMD-SHELL", "pg_isready -U benchmark -d vibecodingbench"]
17
- interval: 5s
18
- timeout: 5s
19
- retries: 5
20
-
21
- leaderboard:
22
- build:
23
- context: ./packages/leaderboard
24
- dockerfile: Dockerfile
25
- ports:
26
- - "3001:3001"
27
- environment:
28
- - NODE_ENV=production
29
- - PORT=3001
30
- - DATABASE_URL=postgresql://benchmark:benchmark123@postgres:5432/vibecodingbench
31
- depends_on:
32
- postgres:
33
- condition: service_healthy
34
- restart: unless-stopped
35
-
36
- dashboard:
37
- build:
38
- context: ./packages/dashboard
39
- dockerfile: Dockerfile
40
- ports:
41
- - "3000:3000"
42
- environment:
43
- - VITE_API_URL=http://leaderboard:3001
44
- depends_on:
45
- - leaderboard
46
- restart: unless-stopped
47
-
48
- volumes:
49
- postgres_data:
50
-
51
- networks:
52
- default:
53
- name: vibecodingbench
@@ -1,211 +0,0 @@
1
- # VibeCodingBench Task Expansion Plan
2
-
3
- ## Overview
4
- Expanding from 18 tasks to 120 tasks (20 per category) with multi-language support.
5
-
6
- ## Language Distribution
7
- Based on GitHub Octoverse 2025 and Stack Overflow 2025:
8
- - **TypeScript/JavaScript**: 40% (most used on GitHub)
9
- - **Python**: 25% (dominant in AI/data)
10
- - **Go**: 15% (cloud-native, microservices)
11
- - **Java/Kotlin**: 10% (enterprise)
12
- - **Rust**: 5% (systems, performance)
13
- - **C#**: 5% (enterprise, game dev)
14
-
15
- ---
16
-
17
- ## Category 1: saas-core (20 tasks)
18
-
19
- ### Existing (6):
20
- 1. auth/supabase-oauth (TypeScript)
21
- 2. auth/mfa-totp (TypeScript)
22
- 3. crud/dashboard-table (TypeScript)
23
- 4. settings/user-preferences (TypeScript)
24
- 5. realtime/websocket-chat (TypeScript)
25
- 6. security/rate-limiter (TypeScript)
26
-
27
- ### New (14):
28
- 7. auth/jwt-refresh-tokens (Go) - Implement JWT with refresh token rotation
29
- 8. auth/magic-link-email (Python/FastAPI) - Passwordless email authentication
30
- 9. auth/rbac-permissions (Java/Spring) - Role-based access control system
31
- 10. auth/session-management (Rust/Actix) - Secure session handling with Redis
32
- 11. billing/stripe-subscriptions (TypeScript) - Subscription management with Stripe
33
- 12. billing/usage-metering (Go) - Track and bill based on API usage
34
- 13. billing/invoice-generation (Python) - Generate PDF invoices with line items
35
- 14. multi-tenant/org-isolation (TypeScript) - Database-per-tenant isolation
36
- 15. multi-tenant/subdomain-routing (Go) - Route requests by subdomain
37
- 16. notifications/email-queue (Python) - Async email notification system
38
- 17. notifications/push-notifications (TypeScript) - Web push with service workers
39
- 18. notifications/in-app-alerts (Java/Spring) - Real-time in-app notifications
40
- 19. audit/activity-logging (Go) - Comprehensive audit trail system
41
- 20. search/full-text-search (TypeScript) - Elasticsearch integration for search
42
-
43
- ---
44
-
45
- ## Category 2: glue-code (20 tasks)
46
-
47
- ### Existing (3):
48
- 1. data-transform/excel-to-json (Python)
49
- 2. api-sync/rest-to-graphql (TypeScript)
50
- 3. caching/redis-cache (TypeScript)
51
-
52
- ### New (17):
53
- 4. data-transform/csv-normalizer (Python) - Clean and normalize CSV data
54
- 5. data-transform/json-to-xml (Go) - Bidirectional JSON/XML conversion
55
- 6. data-transform/protobuf-converter (Rust) - Protocol buffer serialization
56
- 7. data-transform/avro-schema-evolution (Java) - Handle Avro schema changes
57
- 8. etl/database-sync (Python) - Sync data between PostgreSQL and MongoDB
58
- 9. etl/s3-to-warehouse (Go) - Load S3 files into data warehouse
59
- 10. etl/cdc-pipeline (TypeScript) - Change data capture with Debezium
60
- 11. queue/rabbitmq-consumer (Python) - Reliable message queue processing
61
- 12. queue/kafka-producer (Go) - High-throughput Kafka event publishing
62
- 13. queue/sqs-batch-processor (TypeScript) - AWS SQS batch processing
63
- 14. scheduler/cron-job-manager (Go) - Distributed cron job scheduling
64
- 15. scheduler/delayed-tasks (Python) - Celery-based delayed task execution
65
- 16. file-processing/image-resizer (Rust) - High-performance image processing
66
- 17. file-processing/pdf-merger (Python) - Merge and manipulate PDFs
67
- 18. file-processing/video-transcoder (Go) - FFmpeg-based video processing
68
- 19. migration/database-versioning (TypeScript) - Schema migration system
69
- 20. migration/data-backfill (Python) - Backfill data with progress tracking
70
-
71
- ---
72
-
73
- ## Category 3: ai-integration (20 tasks)
74
-
75
- ### Existing (2):
76
- 1. structured-output/invoice-parser (Python)
77
- 2. rag-chatbot/pdf-qa (Python)
78
-
79
- ### New (18):
80
- 3. structured-output/resume-parser (Python) - Extract structured data from resumes
81
- 4. structured-output/receipt-scanner (TypeScript) - OCR + LLM receipt extraction
82
- 5. structured-output/contract-analyzer (Python) - Legal document analysis
83
- 6. rag-chatbot/code-assistant (TypeScript) - Codebase Q&A with RAG
84
- 7. rag-chatbot/support-bot (Python) - Customer support with knowledge base
85
- 8. rag-chatbot/doc-search (Go) - Multi-document semantic search
86
- 9. agents/web-scraper-agent (Python) - Autonomous web data extraction
87
- 10. agents/research-agent (TypeScript) - Multi-step research automation
88
- 11. agents/code-review-agent (Python) - Automated PR review with LLM
89
- 12. function-calling/api-orchestrator (TypeScript) - LLM-driven API calls
90
- 13. function-calling/database-query (Python) - Natural language to SQL
91
- 14. function-calling/calendar-assistant (TypeScript) - Schedule management agent
92
- 15. embeddings/semantic-search (Python) - Vector similarity search
93
- 16. embeddings/recommendation-engine (Go) - Content recommendations
94
- 17. embeddings/duplicate-detection (Python) - Find similar documents
95
- 18. fine-tuning/classification-model (Python) - Fine-tune for text classification
96
- 19. multimodal/image-captioning (Python) - Generate image descriptions
97
- 20. multimodal/chart-interpreter (TypeScript) - Extract data from chart images
98
-
99
- ---
100
-
101
- ## Category 4: frontend (20 tasks)
102
-
103
- ### Existing (3):
104
- 1. figma-to-code/pricing-card (TypeScript)
105
- 2. visualization/chart-dashboard (TypeScript)
106
- 3. components/form-builder (TypeScript)
107
-
108
- ### New (17):
109
- 4. figma-to-code/landing-page (TypeScript/React) - Full landing page from design
110
- 5. figma-to-code/dashboard-layout (TypeScript/Vue) - Admin dashboard UI
111
- 6. figma-to-code/mobile-app-screen (TypeScript/React Native) - Mobile UI
112
- 7. components/data-grid (TypeScript/React) - Advanced data grid with virtual scroll
113
- 8. components/rich-text-editor (TypeScript/Vue) - WYSIWYG editor with plugins
114
- 9. components/file-uploader (TypeScript/React) - Drag-drop with preview
115
- 10. components/date-range-picker (TypeScript/Svelte) - Complex date selection
116
- 11. visualization/realtime-charts (TypeScript/React) - Live updating charts
117
- 12. visualization/map-dashboard (TypeScript/Vue) - Geographic data viz
118
- 13. visualization/gantt-chart (TypeScript/React) - Project timeline view
119
- 14. state-management/shopping-cart (TypeScript/React) - Complex cart with Redux
120
- 15. state-management/collaborative-editor (TypeScript/Vue) - Real-time collab
121
- 16. accessibility/screen-reader-nav (TypeScript/React) - WCAG compliant nav
122
- 17. accessibility/keyboard-shortcuts (TypeScript/Vue) - Full keyboard support
123
- 18. performance/infinite-scroll (TypeScript/React) - Virtualized infinite list
124
- 19. performance/image-lazy-load (TypeScript/Svelte) - Optimized image loading
125
- 20. animation/page-transitions (TypeScript/React) - Smooth route animations
126
-
127
- ---
128
-
129
- ## Category 5: api-integrations (20 tasks)
130
-
131
- ### Existing (2):
132
- 1. stripe/payment-webhook (TypeScript)
133
- 2. email/transactional (TypeScript)
134
-
135
- ### New (18):
136
- 3. stripe/checkout-session (Go) - Create Stripe checkout flows
137
- 4. stripe/subscription-portal (TypeScript) - Customer billing portal
138
- 5. payment/paypal-integration (Python) - PayPal payments and refunds
139
- 6. payment/crypto-payments (TypeScript) - Accept cryptocurrency
140
- 7. storage/s3-presigned-urls (Go) - Secure file uploads to S3
141
- 8. storage/cloudinary-upload (TypeScript) - Image upload and transform
142
- 9. storage/gcs-streaming (Python) - Stream large files to GCS
143
- 10. auth-provider/oauth2-github (Go) - GitHub OAuth integration
144
- 11. auth-provider/saml-sso (Java/Spring) - Enterprise SAML SSO
145
- 12. auth-provider/okta-integration (TypeScript) - Okta user management
146
- 13. communication/twilio-sms (Python) - SMS notifications
147
- 14. communication/slack-bot (TypeScript) - Slack app with slash commands
148
- 15. communication/discord-webhook (Go) - Discord notifications
149
- 16. maps/google-maps-geocoding (TypeScript) - Address to coordinates
150
- 17. maps/mapbox-directions (Python) - Route calculation
151
- 18. analytics/segment-tracking (TypeScript) - Event tracking pipeline
152
- 19. analytics/mixpanel-events (Go) - User behavior analytics
153
- 20. social/twitter-api (Python) - Tweet posting and monitoring
154
-
155
- ---
156
-
157
- ## Category 6: code-evolution (20 tasks)
158
-
159
- ### Existing (2):
160
- 1. legacy-migration/express-to-fastify (TypeScript)
161
- 2. refactoring/class-to-hooks (TypeScript)
162
-
163
- ### New (18):
164
- 3. legacy-migration/callback-to-async (TypeScript) - Callback hell to async/await
165
- 4. legacy-migration/jquery-to-react (TypeScript) - jQuery app to React
166
- 5. legacy-migration/flask-to-fastapi (Python) - Flask to FastAPI migration
167
- 6. legacy-migration/java-to-kotlin (Kotlin) - Java codebase to Kotlin
168
- 7. legacy-migration/rest-to-grpc (Go) - REST API to gRPC
169
- 8. refactoring/monolith-to-modules (TypeScript) - Extract modules from monolith
170
- 9. refactoring/orm-migration (Python) - SQLAlchemy to async SQLModel
171
- 10. refactoring/dependency-injection (Java/Spring) - Add DI to legacy code
172
- 11. refactoring/error-handling (Go) - Standardize error handling
173
- 12. testing/add-unit-tests (TypeScript) - Add tests to untested code
174
- 13. testing/e2e-playwright (TypeScript) - Add E2E tests with Playwright
175
- 14. testing/pytest-fixtures (Python) - Refactor tests with fixtures
176
- 15. performance/query-optimization (Python) - Optimize slow DB queries
177
- 16. performance/memory-leak-fix (TypeScript) - Fix memory leaks in Node.js
178
- 17. performance/async-refactor (Python) - Sync to async for I/O bound
179
- 18. security/sql-injection-fix (Python) - Fix SQL injection vulnerabilities
180
- 19. security/xss-prevention (TypeScript) - Add XSS protection
181
- 20. security/secrets-rotation (Go) - Implement secrets rotation
182
-
183
- ---
184
-
185
- ## Implementation Priority
186
-
187
- ### Phase 1 (High Priority - Common Tasks)
188
- - All auth tasks
189
- - All billing tasks
190
- - All payment integrations
191
- - RAG and agent tasks
192
-
193
- ### Phase 2 (Medium Priority - Enterprise)
194
- - Multi-tenant tasks
195
- - SAML/SSO integrations
196
- - Audit logging
197
- - Migration tasks
198
-
199
- ### Phase 3 (Lower Priority - Specialized)
200
- - Multimodal AI tasks
201
- - Advanced visualizations
202
- - Performance optimization tasks
203
-
204
- ---
205
-
206
- ## Sources
207
- - [GitHub Octoverse 2025](https://github.blog/news-insights/octoverse/octoverse-a-new-developer-joins-github-every-second-as-ai-leads-typescript-to-1/)
208
- - [Stack Overflow Developer Survey 2025](https://survey.stackoverflow.co/2025/)
209
- - [HackerRank Real-World Coding Challenges 2025](https://www.hackerrank.com/writing/design-real-world-coding-challenges-junior-backend-developer-screening-2025)
210
- - [LangChain State of Agent Engineering](https://www.langchain.com/state-of-agent-engineering)
211
- - [WorkOS Multi-Tenant Architecture Guide](https://workos.com/blog/developers-guide-saas-multi-tenant-architecture)