claude-flow-novice 2.9.0 → 2.10.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (353)
  1. package/.claude/agents/cfn-dev-team/CLAUDE.md +1086 -0
  2. package/.claude/agents/cfn-dev-team/README.md +116 -0
  3. package/.claude/agents/cfn-dev-team/architecture/api-designer-persona.md +149 -0
  4. package/.claude/agents/cfn-dev-team/architecture/base-template-generator.md +196 -0
  5. package/.claude/agents/cfn-dev-team/architecture/goal-planner.md +183 -0
  6. package/.claude/agents/cfn-dev-team/architecture/planner.md +182 -0
  7. package/.claude/agents/cfn-dev-team/architecture/system-architect.md +162 -0
  8. package/.claude/agents/cfn-dev-team/coordinators/cfn-frontend-coordinator.md +540 -0
  9. package/.claude/agents/cfn-dev-team/coordinators/cfn-v3-coordinator.md +20 -14
  10. package/.claude/agents/cfn-dev-team/coordinators/consensus-builder.md +167 -0
  11. package/.claude/agents/cfn-dev-team/dev-ops/devops-engineer.md +148 -0
  12. package/.claude/agents/cfn-dev-team/dev-ops/github-commit-agent.md +118 -0
  13. package/.claude/agents/cfn-dev-team/dev-ops/kubernetes-specialist.md +540 -0
  14. package/.claude/agents/cfn-dev-team/developers/backend-dev.md +20 -0
  15. package/.claude/agents/cfn-dev-team/developers/data/data-engineer.md +585 -0
  16. package/.claude/agents/cfn-dev-team/developers/database/database-architect.md +276 -0
  17. package/.claude/agents/cfn-dev-team/developers/dev-backend-api.md +147 -0
  18. package/.claude/agents/cfn-dev-team/developers/frontend/mobile-dev.md +218 -0
  19. package/.claude/agents/cfn-dev-team/developers/{react-frontend-engineer.md → frontend/react-frontend-engineer.md} +53 -5
  20. package/.claude/agents/cfn-dev-team/developers/frontend/spec-mobile-react-native.md +199 -0
  21. package/.claude/agents/cfn-dev-team/developers/graphql-specialist.md +615 -0
  22. package/.claude/agents/cfn-dev-team/developers/rust-developer.md +174 -0
  23. package/.claude/agents/cfn-dev-team/documentation/README-VALIDATION.md +243 -0
  24. package/.claude/agents/cfn-dev-team/documentation/agent-type-guidelines.md +465 -0
  25. package/.claude/agents/cfn-dev-team/documentation/api-docs.md +103 -0
  26. package/.claude/agents/cfn-dev-team/documentation/docs-api-openapi.md +98 -0
  27. package/.claude/agents/cfn-dev-team/documentation/pseudocode.md +159 -0
  28. package/.claude/agents/cfn-dev-team/documentation/specification.md +157 -0
  29. package/.claude/agents/cfn-dev-team/product-owners/accessibility-advocate-persona.md +109 -0
  30. package/.claude/agents/cfn-dev-team/{coordinators → product-owners}/cto-agent.md +8 -6
  31. package/.claude/agents/cfn-dev-team/product-owners/power-user-persona.md +190 -0
  32. package/.claude/agents/cfn-dev-team/{coordinators → product-owners}/product-owner.md +85 -59
  33. package/.claude/agents/cfn-dev-team/reviewers/quality/analyze-code-quality.md +141 -0
  34. package/.claude/agents/cfn-dev-team/reviewers/quality/code-analyzer.md +200 -0
  35. package/.claude/agents/cfn-dev-team/reviewers/quality/cyclomatic-complexity-reducer.md +321 -0
  36. package/.claude/agents/cfn-dev-team/reviewers/quality/perf-analyzer.md +238 -0
  37. package/.claude/agents/cfn-dev-team/reviewers/quality/performance-benchmarker.md +101 -0
  38. package/.claude/agents/cfn-dev-team/reviewers/quality/quality-metrics.md +375 -0
  39. package/.claude/agents/cfn-dev-team/reviewers/quality/security-specialist.md +193 -0
  40. package/.claude/agents/cfn-dev-team/reviewers/reviewer.md +39 -0
  41. package/.claude/agents/cfn-dev-team/testers/interaction-tester.md +31 -0
  42. package/.claude/agents/cfn-dev-team/testers/load-testing-specialist.md +469 -0
  43. package/.claude/agents/cfn-dev-team/testers/playwright-tester.md +24 -0
  44. package/.claude/agents/cfn-dev-team/testers/tester.md +20 -0
  45. package/.claude/agents/cfn-dev-team/utility/agent-builder.md +151 -0
  46. package/.claude/agents/cfn-dev-team/utility/analyst.md +178 -0
  47. package/.claude/agents/cfn-dev-team/utility/claude-code-expert.md +1043 -0
  48. package/.claude/agents/cfn-dev-team/utility/code-booster.md +139 -0
  49. package/.claude/agents/cfn-dev-team/utility/context-curator.md +99 -0
  50. package/.claude/agents/cfn-dev-team/{developers → utility}/researcher.md +6 -4
  51. package/.claude/commands/cfn/CFN_LOOP_FRONTEND.md +741 -0
  52. package/.claude/commands/cfn/CFN_LOOP_TASK_MODE.md +353 -0
  53. package/.claude/commands/cfn/cfn-loop-frontend.md +555 -0
  54. package/.claude/commands/cfn/cfn-loop.md +168 -7
  55. package/{CFN-CLAUDE.md → .claude/root-claude-distribute/CFN-CLAUDE.md} +23 -3
  56. package/.claude/skills/cfn-ace-system/SKILL.md +364 -0
  57. package/.claude/skills/cfn-ace-system/add-bullet.sh +145 -0
  58. package/.claude/skills/cfn-ace-system/analyze-anti-pattern-effectiveness.sh +56 -0
  59. package/.claude/skills/cfn-ace-system/classify-task.sh +18 -0
  60. package/.claude/skills/cfn-ace-system/export-ace-metrics.sh +48 -0
  61. package/.claude/skills/cfn-ace-system/extract-tags.sh +385 -0
  62. package/.claude/skills/cfn-ace-system/format-negative-context.sh +180 -0
  63. package/.claude/skills/cfn-ace-system/init-indexes.sql +160 -0
  64. package/.claude/skills/cfn-ace-system/invoke-context-curate.sh +192 -0
  65. package/.claude/skills/cfn-ace-system/invoke-context-inject.sh +361 -0
  66. package/.claude/skills/cfn-ace-system/invoke-context-query.sh +139 -0
  67. package/.claude/skills/cfn-ace-system/invoke-context-reflect.sh +343 -0
  68. package/.claude/skills/cfn-ace-system/invoke-context-stats.sh +227 -0
  69. package/.claude/skills/cfn-ace-system/log-merge.sh +67 -0
  70. package/.claude/skills/cfn-ace-system/monitor-injection-performance.sh +138 -0
  71. package/.claude/skills/cfn-ace-system/optimize-injection-pipeline.sh +169 -0
  72. package/.claude/skills/cfn-ace-system/query-anti-patterns.sh +276 -0
  73. package/.claude/skills/cfn-ace-system/query-contexts.sh +150 -0
  74. package/.claude/skills/cfn-ace-system/query-reflections.sh +35 -0
  75. package/.claude/skills/cfn-ace-system/schema/001-create-context-reflections.sql +237 -0
  76. package/.claude/skills/cfn-ace-system/schema/README.md +723 -0
  77. package/.claude/skills/cfn-ace-system/schema/SCHEMA_DESIGN_SUMMARY.md +564 -0
  78. package/.claude/skills/cfn-ace-system/schema/populate-test-data-simple.sh +62 -0
  79. package/.claude/skills/cfn-ace-system/schema/populate-test-data.sh +247 -0
  80. package/.claude/skills/cfn-ace-system/schema/run-migration.sh +231 -0
  81. package/.claude/skills/cfn-ace-system/schema/validate-schema.sql +280 -0
  82. package/.claude/skills/cfn-ace-system/score-relevance-adapter.sh +138 -0
  83. package/.claude/skills/cfn-ace-system/score-relevance.sh +253 -0
  84. package/.claude/skills/cfn-ace-system/sprint-7-lessons.json +46 -0
  85. package/.claude/skills/cfn-ace-system/store-reflection.sh +46 -0
  86. package/.claude/skills/cfn-ace-system/test-ace-skill.sh +312 -0
  87. package/.claude/skills/cfn-ace-system/track-ab-test.sh +42 -0
  88. package/.claude/skills/cfn-ace-system/update-reflection.sh +41 -0
  89. package/.claude/skills/cfn-agent-discovery/SKILL.md +40 -0
  90. package/.claude/skills/cfn-agent-discovery/agents-registry-clean.json +0 -0
  91. package/.claude/skills/cfn-agent-discovery/agents-registry-fixed.json +19 -0
  92. package/.claude/skills/cfn-agent-discovery/agents-registry.json +718 -0
  93. package/.claude/skills/cfn-agent-discovery/discover-agents.py +184 -0
  94. package/.claude/skills/cfn-agent-discovery/discover-agents.sh +87 -0
  95. package/.claude/skills/cfn-agent-discovery/invoke-registry.sh +11 -0
  96. package/.claude/skills/cfn-agent-discovery/temp_script.py +0 -0
  97. package/.claude/skills/cfn-agent-execution/execute-agent.sh +126 -0
  98. package/.claude/skills/cfn-agent-output-processing/SKILL.md +359 -0
  99. package/.claude/skills/cfn-agent-selector/SKILL.md +90 -0
  100. package/.claude/skills/cfn-agent-selector/select-agents.sh +112 -0
  101. package/.claude/skills/cfn-agent-spawning/SKILL.md +135 -0
  102. package/.claude/skills/cfn-agent-spawning/agent-selection-guide.md +814 -0
  103. package/.claude/skills/cfn-agent-spawning/check-dependencies.sh +30 -0
  104. package/.claude/skills/cfn-agent-spawning/spawn-agent.sh +263 -0
  105. package/.claude/skills/cfn-agent-spawning/spawn-templates.sh +613 -0
  106. package/.claude/skills/cfn-analytics/description-refinement-guide.md +164 -0
  107. package/.claude/skills/cfn-analytics/log-skill-invocation.js +122 -0
  108. package/.claude/skills/cfn-analytics/run-production-criteria-tests.sh +126 -0
  109. package/.claude/skills/cfn-analytics/skill-analytics-dashboard.js +113 -0
  110. package/.claude/skills/cfn-analytics/skill-invocation-hook.sh +28 -0
  111. package/.claude/skills/cfn-analytics/skill-invocations.sql +58 -0
  112. package/.claude/skills/cfn-analytics/test-corpus.json +32 -0
  113. package/.claude/skills/cfn-analytics/test-data-generator.js +115 -0
  114. package/.claude/skills/cfn-analytics/test-manual-override-rate.js +285 -0
  115. package/.claude/skills/cfn-analytics/validate-skill-selection.js +188 -0
  116. package/.claude/skills/cfn-config-management/SKILL.md +34 -0
  117. package/.claude/skills/cfn-config-management/check-dependencies.sh +56 -0
  118. package/.claude/skills/cfn-config-management/config.json +32 -0
  119. package/.claude/skills/cfn-config-management/manage-config.sh +113 -0
  120. package/.claude/skills/cfn-event-bus/SKILL.md +412 -0
  121. package/.claude/skills/cfn-event-bus/config.json +111 -0
  122. package/.claude/skills/cfn-event-bus/eventbus-wrapper.cjs +69 -0
  123. package/.claude/skills/cfn-event-bus/invoke-event-publish.sh +147 -0
  124. package/.claude/skills/cfn-event-bus/invoke-event-subscribe.sh +171 -0
  125. package/.claude/skills/cfn-event-bus/invoke-lifecycle-track.sh +201 -0
  126. package/.claude/skills/cfn-event-bus/test-event-bus.sh +280 -0
  127. package/.claude/skills/cfn-fleet-manager/SKILL.md +412 -0
  128. package/.claude/skills/cfn-fleet-manager/config.json +60 -0
  129. package/.claude/skills/cfn-fleet-manager/invoke-fleet-allocate.sh +182 -0
  130. package/.claude/skills/cfn-fleet-manager/invoke-fleet-balance.sh +239 -0
  131. package/.claude/skills/cfn-fleet-manager/invoke-fleet-metrics.sh +193 -0
  132. package/.claude/skills/cfn-fleet-manager/invoke-fleet-register.sh +124 -0
  133. package/.claude/skills/cfn-fleet-manager/test-fleet-manager.sh +345 -0
  134. package/.claude/skills/cfn-hook-pipeline/SKILL.md +148 -0
  135. package/.claude/skills/cfn-hook-pipeline/auto-resolve.sh +66 -0
  136. package/.claude/skills/cfn-hook-pipeline/check-dependencies.sh +40 -0
  137. package/.claude/skills/cfn-hook-pipeline/feedback-resolver.sh +452 -0
  138. package/.claude/skills/cfn-hook-pipeline/post-edit-handler.sh +154 -0
  139. package/.claude/skills/cfn-hook-pipeline/security-scan.json +60 -0
  140. package/.claude/skills/cfn-hook-pipeline/security-scanner.sh +121 -0
  141. package/.claude/skills/cfn-hook-pipeline/test-root-warning-resolution.sh +148 -0
  142. package/.claude/skills/cfn-hybrid-routing/SKILL.md +46 -0
  143. package/.claude/skills/cfn-hybrid-routing/check-dependencies.sh +52 -0
  144. package/.claude/skills/cfn-hybrid-routing/config.json +26 -0
  145. package/.claude/skills/cfn-hybrid-routing/spawn-worker.sh +44 -0
  146. package/.claude/skills/cfn-loop-orchestration/SKILL.md +299 -0
  147. package/.claude/skills/cfn-loop-orchestration/helpers/auto-tune-timeouts.sh +228 -0
  148. package/.claude/skills/cfn-loop-orchestration/helpers/consensus.sh +84 -0
  149. package/.claude/skills/cfn-loop-orchestration/helpers/context-injection.sh +142 -0
  150. package/.claude/skills/cfn-loop-orchestration/helpers/context-lookup.sh +359 -0
  151. package/.claude/skills/cfn-loop-orchestration/helpers/deliverable-verifier.sh +71 -0
  152. package/.claude/skills/cfn-loop-orchestration/helpers/gate-check.sh +90 -0
  153. package/.claude/skills/cfn-loop-orchestration/helpers/iteration-manager.sh +87 -0
  154. package/.claude/skills/cfn-loop-orchestration/helpers/spawn-agents.sh +271 -0
  155. package/.claude/skills/cfn-loop-orchestration/helpers/timeout-calculator.sh +51 -0
  156. package/.claude/skills/cfn-loop-orchestration/inject-loop-context.sh +41 -0
  157. package/.claude/skills/cfn-loop-orchestration/monitor-execution.sh +156 -0
  158. package/.claude/skills/cfn-loop-orchestration/orchestrate.sh +884 -0
  159. package/.claude/skills/cfn-loop-orchestration/orchestrate.sh.backup +840 -0
  160. package/.claude/skills/cfn-loop-orchestration/security_utils.sh +99 -0
  161. package/.claude/skills/cfn-loop-orchestration/test-cfn-orchestration.sh +281 -0
  162. package/.claude/skills/cfn-loop-orchestration/test-edge-cases.sh +188 -0
  163. package/.claude/skills/cfn-loop-validation/SKILL.md +353 -0
  164. package/.claude/skills/cfn-loop-validation/check-dependencies.sh +31 -0
  165. package/.claude/skills/cfn-loop-validation/config.json +161 -0
  166. package/.claude/skills/cfn-loop-validation/consensus-calculator.js +477 -0
  167. package/.claude/skills/cfn-loop-validation/evidence-chain.sql +163 -0
  168. package/.claude/skills/cfn-loop-validation/examples/README.md +453 -0
  169. package/.claude/skills/cfn-loop-validation/examples/coordinator-full-cfn-loop.sh +234 -0
  170. package/.claude/skills/cfn-loop-validation/examples/coordinator-loop2-consensus.sh +132 -0
  171. package/.claude/skills/cfn-loop-validation/examples/coordinator-loop3-gate.sh +115 -0
  172. package/.claude/skills/cfn-loop-validation/examples/coordinator-redis-integration.sh +186 -0
  173. package/.claude/skills/cfn-loop-validation/orchestrate-cfn-loop.sh +252 -0
  174. package/.claude/skills/cfn-loop-validation/validate-iteration.sh +134 -0
  175. package/.claude/skills/cfn-process-lifecycle/SKILL.md +39 -0
  176. package/.claude/skills/cfn-process-lifecycle/check-dependencies.sh +58 -0
  177. package/.claude/skills/cfn-process-lifecycle/config.json +39 -0
  178. package/.claude/skills/cfn-process-lifecycle/process-manager.sh +144 -0
  179. package/.claude/skills/cfn-product-owner-decision/SKILL.md +332 -0
  180. package/.claude/skills/cfn-product-owner-decision/execute-decision.sh +176 -0
  181. package/.claude/skills/cfn-product-owner-decision/parse-decision.sh +66 -0
  182. package/.claude/skills/cfn-product-owner-decision/validate-deliverables.sh +82 -0
  183. package/.claude/skills/cfn-redis-coordination/AGENT_LOGGING.md +280 -0
  184. package/.claude/skills/cfn-redis-coordination/BZPOPMIN_FIX_SUMMARY.md +209 -0
  185. package/.claude/skills/cfn-redis-coordination/HEARTBEAT.md +57 -0
  186. package/.claude/skills/cfn-redis-coordination/HEARTBEAT_MONITORING.md +267 -0
  187. package/.claude/skills/cfn-redis-coordination/LOGGING.md +260 -0
  188. package/.claude/skills/cfn-redis-coordination/SECURITY_REVIEW.md +25 -0
  189. package/.claude/skills/cfn-redis-coordination/SHUTDOWN_HANDLING.md +164 -0
  190. package/.claude/skills/cfn-redis-coordination/SKILL.md +720 -0
  191. package/.claude/skills/cfn-redis-coordination/agent-log.sh +124 -0
  192. package/.claude/skills/cfn-redis-coordination/agent-recovery.sh +75 -0
  193. package/.claude/skills/cfn-redis-coordination/analyze-task-complexity.sh +277 -0
  194. package/.claude/skills/cfn-redis-coordination/cancel-swarm.sh +221 -0
  195. package/.claude/skills/cfn-redis-coordination/cfn-loop-exec.sh +468 -0
  196. package/.claude/skills/cfn-redis-coordination/cfn-loop-relaunch.sh +29 -0
  197. package/.claude/skills/cfn-redis-coordination/check-dependencies.sh +32 -0
  198. package/.claude/skills/cfn-redis-coordination/collect-confidence-scores.sh +179 -0
  199. package/.claude/skills/cfn-redis-coordination/collect-results.sh +75 -0
  200. package/.claude/skills/cfn-redis-coordination/complete-swarm.sh +75 -0
  201. package/.claude/skills/cfn-redis-coordination/config.json +61 -0
  202. package/.claude/skills/cfn-redis-coordination/data/cfn-loop.db +0 -0
  203. package/.claude/skills/cfn-redis-coordination/demos/phase4-wake-queue-test-report.md +82 -0
  204. package/.claude/skills/cfn-redis-coordination/demos/test-bzpopmin-fix.sh +274 -0
  205. package/.claude/skills/cfn-redis-coordination/demos/test-cancel-swarm.sh +276 -0
  206. package/.claude/skills/cfn-redis-coordination/demos/test-dlq.sh +129 -0
  207. package/.claude/skills/cfn-redis-coordination/demos/test-iteration-feedback.sh +320 -0
  208. package/.claude/skills/cfn-redis-coordination/demos/test-orchestrator.sh +249 -0
  209. package/.claude/skills/cfn-redis-coordination/demos/test-priority-wake-phase4-unix.sh +148 -0
  210. package/.claude/skills/cfn-redis-coordination/demos/test-priority-wake-phase4.sh +163 -0
  211. package/.claude/skills/cfn-redis-coordination/demos/test-priority-wake.sh +138 -0
  212. package/.claude/skills/cfn-redis-coordination/demos/test-quick-fix.sh +81 -0
  213. package/.claude/skills/cfn-redis-coordination/demos/test-quorum-absolute.sh +45 -0
  214. package/.claude/skills/cfn-redis-coordination/demos/test-quorum-fallback.sh +68 -0
  215. package/.claude/skills/cfn-redis-coordination/demos/test-quorum-percentage.sh +56 -0
  216. package/.claude/skills/cfn-redis-coordination/demos/test-quorum-with-retry.sh +81 -0
  217. package/.claude/skills/cfn-redis-coordination/demos/test-quorum.sh +57 -0
  218. package/.claude/skills/cfn-redis-coordination/demos/test-shutdown-handling.sh +187 -0
  219. package/.claude/skills/cfn-redis-coordination/demos/test-shutdown.sh +160 -0
  220. package/.claude/skills/cfn-redis-coordination/demos/test-utils-unix.sh +97 -0
  221. package/.claude/skills/cfn-redis-coordination/demos/test-utils.sh +97 -0
  222. package/.claude/skills/cfn-redis-coordination/demos/test-waiting-mode.sh +59 -0
  223. package/.claude/skills/cfn-redis-coordination/examples/README.md +73 -0
  224. package/.claude/skills/cfn-redis-coordination/examples/grafana-dashboard.json +352 -0
  225. package/.claude/skills/cfn-redis-coordination/examples/hierarchical-pattern.sh +127 -0
  226. package/.claude/skills/cfn-redis-coordination/examples/mesh-pattern.sh +171 -0
  227. package/.claude/skills/cfn-redis-coordination/examples/timeout-handling.sh +227 -0
  228. package/.claude/skills/cfn-redis-coordination/examples/waiting-mode-pattern.sh +239 -0
  229. package/.claude/skills/cfn-redis-coordination/execute-product-owner-decision.sh +258 -0
  230. package/.claude/skills/cfn-redis-coordination/get-agent-timeout.sh +177 -0
  231. package/.claude/skills/cfn-redis-coordination/heartbeat-functions.sh +137 -0
  232. package/.claude/skills/cfn-redis-coordination/heartbeat-protocol.md +106 -0
  233. package/.claude/skills/cfn-redis-coordination/heartbeat.sh +126 -0
  234. package/.claude/skills/cfn-redis-coordination/init-swarm.sh +148 -0
  235. package/.claude/skills/cfn-redis-coordination/invoke-redis-pattern.sh +220 -0
  236. package/.claude/skills/cfn-redis-coordination/invoke-waiting-mode.sh +283 -0
  237. package/.claude/skills/cfn-redis-coordination/invoke-waiting-mode.sh.backup-p7 +423 -0
  238. package/.claude/skills/cfn-redis-coordination/list-active-swarms.sh +147 -0
  239. package/.claude/skills/cfn-redis-coordination/log-event.sh +109 -0
  240. package/.claude/skills/cfn-redis-coordination/metrics-export.sh +674 -0
  241. package/.claude/skills/cfn-redis-coordination/metrics-schema.json +66 -0
  242. package/.claude/skills/cfn-redis-coordination/metrics-storage.md +31 -0
  243. package/.claude/skills/cfn-redis-coordination/monitor-cfn-violations.sh +391 -0
  244. package/.claude/skills/cfn-redis-coordination/monitor-heartbeats.sh +101 -0
  245. package/.claude/skills/cfn-redis-coordination/orchestrate-cfn-loop-v3.sh +141 -0
  246. package/.claude/skills/cfn-redis-coordination/orchestrate-cfn-loop.sh +31 -0
  247. package/.claude/skills/cfn-redis-coordination/orchestrate-cfn-loop.sh.backup +38 -0
  248. package/.claude/skills/cfn-redis-coordination/orchestrate-cfn-loop.sh.backup-1761167675 +1672 -0
  249. package/.claude/skills/cfn-redis-coordination/orchestrate-cfn-loop.sh.backup-p5 +1604 -0
  250. package/.claude/skills/cfn-redis-coordination/orchestrate-cfn-loop.sh.backup-phase1 +1550 -0
  251. package/.claude/skills/cfn-redis-coordination/orchestrate-cfn-loop.sh.backup-phase2 +1621 -0
  252. package/.claude/skills/cfn-redis-coordination/orchestrate-cfn-loop.sh.backup-phase3 +1621 -0
  253. package/.claude/skills/cfn-redis-coordination/orchestrate-cfn-loop.sh.bak +0 -0
  254. package/.claude/skills/cfn-redis-coordination/orchestrate-cfn-loop.sh.broken +1627 -0
  255. package/.claude/skills/cfn-redis-coordination/orchestrate-cfn-loop.sh.corrupted +80 -0
  256. package/.claude/skills/cfn-redis-coordination/orchestrate-cfn-loop.sh.deprecated +1864 -0
  257. package/.claude/skills/cfn-redis-coordination/priority-wake-mechanism.md +75 -0
  258. package/.claude/skills/cfn-redis-coordination/priority_wake.py +134 -0
  259. package/.claude/skills/cfn-redis-coordination/query-dlq.sh +162 -0
  260. package/.claude/skills/cfn-redis-coordination/query-logs.sh +103 -0
  261. package/.claude/skills/cfn-redis-coordination/redis-pattern.sh +619 -0
  262. package/.claude/skills/cfn-redis-coordination/retrieve-context.sh +58 -0
  263. package/.claude/skills/cfn-redis-coordination/select-specialist-agent.sh +371 -0
  264. package/.claude/skills/cfn-redis-coordination/semantic-match-tfidf.py +252 -0
  265. package/.claude/skills/cfn-redis-coordination/send-heartbeat.sh +165 -0
  266. package/.claude/skills/cfn-redis-coordination/signal.sh +38 -0
  267. package/.claude/skills/cfn-redis-coordination/store-context.sh +86 -0
  268. package/.claude/skills/cfn-redis-coordination/store-epic-context.sh +123 -0
  269. package/.claude/skills/cfn-redis-coordination/test-context-injection.sh +354 -0
  270. package/.claude/skills/cfn-redis-coordination/test-timeout-enforcement.sh +513 -0
  271. package/.claude/skills/cfn-redis-coordination/tests/convert-line-endings.sh +15 -0
  272. package/.claude/skills/cfn-redis-coordination/tests/dlq-functionality-test.sh +102 -0
  273. package/.claude/skills/cfn-redis-coordination/tests/edge-cases-test.sh +99 -0
  274. package/.claude/skills/cfn-redis-coordination/tests/integration-test.sh +170 -0
  275. package/.claude/skills/cfn-redis-coordination/tests/retry-mechanism-test.sh +82 -0
  276. package/.claude/skills/cfn-redis-coordination/tests/run-test-suite.sh +92 -0
  277. package/.claude/skills/cfn-redis-coordination/tests/run-tests.sh +4 -0
  278. package/.claude/skills/cfn-redis-coordination/tests/test-heartbeat-monitoring.sh +418 -0
  279. package/.claude/skills/cfn-redis-coordination/tests/test-heartbeat-simple.sh +124 -0
  280. package/.claude/skills/cfn-redis-coordination/tests/test-primitives.sh +166 -0
  281. package/.claude/skills/cfn-redis-coordination/tests/test-utils.sh +54 -0
  282. package/.claude/skills/cfn-redis-coordination/tests/test_coordination_primitives.sh.deprecated +20 -0
  283. package/.claude/skills/cfn-redis-coordination/tests/test_utils.sh +49 -0
  284. package/.claude/skills/cfn-redis-coordination/v2_modularization/core_orchestration.sh +76 -0
  285. package/.claude/skills/cfn-redis-coordination/validate-parameters.sh +492 -0
  286. package/.claude/skills/cfn-sqlite-memory/IMPLEMENTATION_REPORT.md +393 -0
  287. package/.claude/skills/cfn-sqlite-memory/QUICK_REFERENCE.md +204 -0
  288. package/.claude/skills/cfn-sqlite-memory/SKILL.md +415 -0
  289. package/.claude/skills/cfn-sqlite-memory/acl-queries.sql +452 -0
  290. package/.claude/skills/cfn-sqlite-memory/check-dependencies.sh +36 -0
  291. package/.claude/skills/cfn-sqlite-memory/config.json +45 -0
  292. package/.claude/skills/cfn-sqlite-memory/memory-cli.sh +88 -0
  293. package/.claude/skills/cfn-sqlite-memory/test-state-persistence.js +187 -0
  294. package/.claude/skills/cfn-sqlite-memory/ttl-cleanup.sh +274 -0
  295. package/.claude/skills/cfn-test-execution/SKILL.md +128 -0
  296. package/.claude/skills/cfn-test-execution/check-dependencies.sh +36 -0
  297. package/.claude/skills/cfn-test-execution/test-cache-reader.sh +134 -0
  298. package/.claude/skills/cfn-test-execution/test-concurrent-conflicts.sh +115 -0
  299. package/.claude/skills/cfn-test-execution/test-coordinator-pattern.sh +109 -0
  300. package/.claude/skills/cfn-transparency-middleware/Cargo.toml +18 -0
  301. package/.claude/skills/cfn-transparency-middleware/SECURITY.md +41 -0
  302. package/.claude/skills/cfn-transparency-middleware/SKILL.md +91 -0
  303. package/.claude/skills/cfn-transparency-middleware/TEST_RESULTS.md +174 -0
  304. package/.claude/skills/cfn-transparency-middleware/config.json +31 -0
  305. package/.claude/skills/cfn-transparency-middleware/examples/basic-usage.ts +39 -0
  306. package/.claude/skills/cfn-transparency-middleware/examples/batch-processing.ts +52 -0
  307. package/.claude/skills/cfn-transparency-middleware/examples/custom-filtering.ts +61 -0
  308. package/.claude/skills/cfn-transparency-middleware/invoke-transparency-filter.sh +98 -0
  309. package/.claude/skills/cfn-transparency-middleware/invoke-transparency-init.sh +224 -0
  310. package/.claude/skills/cfn-transparency-middleware/invoke-transparency-level.sh +333 -0
  311. package/.claude/skills/cfn-transparency-middleware/invoke-transparency-metrics.sh +345 -0
  312. package/.claude/skills/cfn-transparency-middleware/invoke-transparency-observe.sh +140 -0
  313. package/.claude/skills/cfn-transparency-middleware/invoke-transparency-stop.sh +235 -0
  314. package/.claude/skills/cfn-transparency-middleware/memory_query.rs +85 -0
  315. package/.claude/skills/cfn-transparency-middleware/memory_repository.rs +140 -0
  316. package/.claude/skills/cfn-transparency-middleware/memory_schema.rs +64 -0
  317. package/.claude/skills/cfn-transparency-middleware/middleware-config.sh +29 -0
  318. package/.claude/skills/cfn-transparency-middleware/performance-benchmark.sh +79 -0
  319. package/.claude/skills/cfn-transparency-middleware/test-e2e.sh +406 -0
  320. package/.claude/skills/cfn-transparency-middleware/test-integration.sh +162 -0
  321. package/.claude/skills/cfn-transparency-middleware/test-transparency-skill.sh +368 -0
  322. package/.claude/skills/cfn-transparency-middleware/test-transparency-skill.sh.unix +126 -0
  323. package/.claude/skills/cfn-transparency-middleware/tests/input-validation.sh +93 -0
  324. package/.claude/skills/cfn-transparency-middleware/wrap-agent.sh +132 -0
  325. package/.claude/skills/cfn-webapp-testing/SCREENSHOT_NAMING_CONVENTION.md +547 -0
  326. package/.claude/skills/cfn-webapp-testing/SKILL.md +877 -0
  327. package/.claude/skills/cfn-webapp-testing/capture-screenshot.sh +238 -0
  328. package/.claude/skills/cfn-webapp-testing/cfn-loop-integration.sh +265 -0
  329. package/.claude/skills/cfn-webapp-testing/compare-screenshots.sh +199 -0
  330. package/.claude/skills/cfn-webapp-testing/init-storage.sh +150 -0
  331. package/.claude/skills/cfn-webapp-testing/set-baseline.sh +196 -0
  332. package/.claude/skills/cfn-webapp-testing/test-webapp-testing.sh +233 -0
  333. package/README.md +51 -2
  334. package/dist/ace/ace-reflector.js +109 -10
  335. package/dist/ace/ace-reflector.js.map +1 -1
  336. package/dist/agents/agent-loader.js +165 -146
  337. package/dist/agents/agent-loader.js.map +1 -1
  338. package/dist/cli/agent-executor.js +1 -1
  339. package/dist/cli/agent-executor.js.map +1 -1
  340. package/dist/cli/config-manager.js +109 -91
  341. package/dist/cli/config-manager.js.map +1 -1
  342. package/package.json +43 -7
  343. package/readme/README.md +15 -4
  344. package/scripts/init-project.js +84 -29
  345. package/scripts/run-marketing-tests.sh +43 -0
  346. package/scripts/update_paths.sh +47 -0
  347. package/tools/install-lizard.sh +37 -0
  348. package/tools/simple-complexity.sh +44 -0
  349. package/.claude/agents/cfn-dev-team/developers/coder.md +0 -270
  350. package/.claude/agents/cfn-dev-team/developers/state-architect.md +0 -127
  351. package/.claude/agents/cfn-dev-team/reviewers/code-quality-validator.md +0 -128
  352. package/.claude/agents/cfn-dev-team/developers/{ui-designer.md → frontend/ui-designer.md} +0 -0
  353. package/.claude/agents/cfn-dev-team/{coordinators → product-owners}/product-owner-agent.md +0 -0
@@ -0,0 +1,585 @@
+ ---
+ name: data-engineer
+ description: |
+   MUST BE USED for data pipeline design, ETL processes, data warehousing, and data quality.
+   Use PROACTIVELY for data ingestion, transformation, orchestration, data lakes, streaming.
+   ALWAYS delegate for "ETL pipeline", "data warehouse", "data lake", "Apache Airflow", "data quality".
+   Keywords - ETL, data pipeline, Airflow, data warehouse, data lake, streaming, Kafka, Spark, dbt, data quality
+ tools: [Read, Write, Edit, Bash, Grep, Glob, TodoWrite]
+ model: sonnet
+ type: specialist
+ acl_level: 1
+ validation_hooks:
+   - agent-template-validator
+   - test-coverage-validator
+ lifecycle:
+   pre_task: |
+     sqlite-cli exec "INSERT INTO agents (id, type, status, spawned_at) VALUES ('${AGENT_ID}', 'data-engineer', 'active', CURRENT_TIMESTAMP)"
+   post_task: |
+     sqlite-cli exec "UPDATE agents SET status = 'completed', confidence = ${CONFIDENCE_SCORE}, completed_at = CURRENT_TIMESTAMP WHERE id = '${AGENT_ID}'"
+ ---
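The `pre_task` and `post_task` hooks in the front matter splice `${AGENT_ID}` and `${CONFIDENCE_SCORE}` directly into the SQL text. As a minimal sketch of the same two statements with bound parameters, the following uses Python's standard `sqlite3` module. The column names come from the hooks; the column types, the in-memory database, and the helper names `register_agent` and `complete_agent` are assumptions made for illustration and are not part of the package.

```python
import sqlite3

# Hypothetical schema: column names are taken from the hooks, types are assumed.
SCHEMA = """
CREATE TABLE IF NOT EXISTS agents (
    id TEXT PRIMARY KEY,
    type TEXT NOT NULL,
    status TEXT NOT NULL,
    confidence REAL,
    spawned_at TEXT,
    completed_at TEXT
)
"""


def register_agent(conn, agent_id, agent_type="data-engineer"):
    """Mirror of the pre_task INSERT, using bound parameters."""
    conn.execute(
        "INSERT INTO agents (id, type, status, spawned_at) "
        "VALUES (?, ?, 'active', CURRENT_TIMESTAMP)",
        (agent_id, agent_type),
    )
    conn.commit()


def complete_agent(conn, agent_id, confidence):
    """Mirror of the post_task UPDATE; returns the number of rows updated."""
    cur = conn.execute(
        "UPDATE agents SET status = 'completed', confidence = ?, "
        "completed_at = CURRENT_TIMESTAMP WHERE id = ?",
        (float(confidence), agent_id),
    )
    conn.commit()
    return cur.rowcount


# Usage against a throwaway in-memory database
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
register_agent(conn, "agent-1")
complete_agent(conn, "agent-1", 0.9)
```

With bound parameters, an agent ID containing quotes is stored as plain data rather than being interpreted as SQL, and `float(confidence)` rejects a non-numeric score before it reaches the database.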
+
+ # Data Engineer Agent
+
+ ## Core Responsibilities
+ - Design and build data pipelines (ETL/ELT)
+ - Implement data warehousing solutions
+ - Ensure data quality and validation
+ - Orchestrate data workflows
+ - Design streaming data architectures
+ - Optimize data processing performance
+ - Implement data governance practices
+
+ ## Technical Expertise
+
+ ### Data Pipeline Orchestration
+
+ #### Apache Airflow DAGs
38
+ ```python
39
+ from airflow import DAG
40
+ from airflow.operators.python import PythonOperator
41
+ from airflow.providers.postgres.operators.postgres import PostgresOperator
42
+ from airflow.providers.amazon.aws.transfers.s3_to_redshift import S3ToRedshiftOperator
43
+ from datetime import datetime, timedelta
44
+
45
+ default_args = {
46
+ 'owner': 'data-engineering',
47
+ 'depends_on_past': False,
48
+ 'email': ['alerts@example.com'],
49
+ 'email_on_failure': True,
50
+ 'email_on_retry': False,
51
+ 'retries': 3,
52
+ 'retry_delay': timedelta(minutes=5),
53
+ }
54
+
55
+ dag = DAG(
56
+ 'etl_user_analytics',
57
+ default_args=default_args,
58
+ description='ETL pipeline for user analytics',
59
+ schedule_interval='0 2 * * *', # Daily at 2 AM
60
+ start_date=datetime(2024, 1, 1),
61
+ catchup=False,
62
+ tags=['analytics', 'users'],
63
+ )
64
+
+ def extract_users(**context):
+     """Extract users from production database"""
+     import psycopg2
+     import pandas as pd
+
+     conn = psycopg2.connect(
+         host='prod-db.example.com',
+         database='app',
+         user='readonly_user',
+         password='***'  # masked placeholder; supply the real value from a secret store
+     )
+
+     query = """
+         SELECT user_id, email, created_at, last_login
+         FROM users
+         WHERE updated_at >= %(yesterday)s
+     """
+
+     execution_date = context['execution_date']
+     yesterday = execution_date - timedelta(days=1)
+
+     try:
+         df = pd.read_sql(query, conn, params={'yesterday': yesterday})
+     finally:
+         conn.close()  # release the connection even if the query fails
+
+     # Save to S3
+     s3_path = f"s3://data-lake/staging/users/{execution_date.date()}/users.parquet"
+     df.to_parquet(s3_path, compression='snappy')
+
+     return s3_path
+
+ def transform_users(**context):
+     """Transform and enrich user data"""
+     import pandas as pd
+
+     # Retrieve the staging path returned by the extract task (via XCom)
+     s3_path = context['task_instance'].xcom_pull(task_ids='extract_users')
+
+     df = pd.read_parquet(s3_path)
+
+     # Transformations (assumes created_at and last_login are timezone-naive timestamps)
+     now = pd.Timestamp.now()
+     df['account_age_days'] = (now - df['created_at']).dt.days
+     df['is_active'] = (now - df['last_login']).dt.days < 30
+     df['user_segment'] = df['account_age_days'].apply(
+         lambda x: 'new' if x < 30 else 'returning' if x < 180 else 'loyal'
+     )
+
+     # Data quality checks (explicit raises, since assert is skipped under python -O)
+     if df['email'].isna().any():
+         raise ValueError("Null emails found")
+     if not df['user_id'].is_unique:
+         raise ValueError("Duplicate user IDs found")
+
+     # Save transformed data
+     output_path = s3_path.replace('/staging/', '/transformed/')
+     df.to_parquet(output_path, compression='snappy')
+
+     return output_path
+
120
+ # Task definitions
121
+ extract_task = PythonOperator(
122
+ task_id='extract_users',
123
+ python_callable=extract_users,
124
+ dag=dag,
125
+ )
126
+
127
+ transform_task = PythonOperator(
128
+ task_id='transform_users',
129
+ python_callable=transform_users,
130
+ dag=dag,
131
+ )
132
+
133
+ load_task = S3ToRedshiftOperator(
134
+ task_id='load_to_warehouse',
135
+ s3_bucket='data-lake',
136
+ s3_key='transformed/users/{{ ds }}/users.parquet',
137
+ schema='analytics',
138
+ table='users_daily',
139
+ copy_options=['PARQUET', 'TRUNCATECOLUMNS'],
140
+ redshift_conn_id='redshift_default',
141
+ aws_conn_id='aws_default',
142
+ dag=dag,
143
+ )
144
+
145
+ data_quality_check = PostgresOperator(
146
+ task_id='data_quality_check',
147
+ postgres_conn_id='redshift_default',
148
+ sql="""
149
+ SELECT
150
+ COUNT(*) as row_count,
151
+ COUNT(DISTINCT user_id) as unique_users,
152
+ SUM(CASE WHEN email IS NULL THEN 1 ELSE 0 END) as null_emails
153
+ FROM analytics.users_daily
154
+ WHERE load_date = '{{ ds }}';
155
+ """,
156
+ dag=dag,
157
+ )
158
+
159
+ # Task dependencies
160
+ extract_task >> transform_task >> load_task >> data_quality_check
161
+ ```
+
+ #### Prefect Flows (Modern Alternative)
+ ```python
+ from prefect import flow, task
+ from prefect.blocks.system import Secret
+ import pandas as pd
+
+ @task(retries=3, retry_delay_seconds=300)
+ def extract_data(source: str, date: str) -> pd.DataFrame:
+     """Extract data from source"""
+     # Implementation: replace this placeholder with a real read from `source`
+     df = pd.DataFrame()
+     return df
+
+ @task
+ def transform_data(df: pd.DataFrame) -> pd.DataFrame:
+     """Apply transformations"""
+     # Business logic: the placeholder copies the input unchanged
+     transformed_df = df.copy()
+     return transformed_df
+
+ @task
+ def validate_data(df: pd.DataFrame) -> bool:
+     """Data quality checks"""
+     # Explicit raises, because assert is skipped under python -O
+     if not df.notna().all().all():
+         raise ValueError("Null values found")
+     if len(df) == 0:
+         raise ValueError("Empty dataset")
+     return True
+
+ @task
+ def load_data(df: pd.DataFrame, destination: str):
+     """Load to destination"""
+     # Implementation
+     pass
+
+ @flow(name="user-analytics-etl")
+ def etl_pipeline(execution_date: str):
+     df = extract_data("production_db", execution_date)
+     transformed = transform_data(df)
+     validate_data(transformed)
+     load_data(transformed, "warehouse")
+
+ if __name__ == "__main__":
+     etl_pipeline("2024-01-15")
+ ```
+
+ ### Data Transformation (dbt)
+
+ #### dbt Model
+ ```sql
+ -- models/analytics/users_enriched.sql
+ {{
+   config(
+     materialized='incremental',
+     unique_key='user_id',
+     on_schema_change='sync_all_columns',
+     partition_by={
+       "field": "created_at",
+       "data_type": "date"
+     }
+   )
+ }}
+
+ WITH base_users AS (
+   SELECT
+     user_id,
+     email,
+     username,
+     created_at,
+     last_login,
+     subscription_tier
+   FROM {{ source('production', 'users') }}
+   {% if is_incremental() %}
+   WHERE updated_at > (SELECT MAX(updated_at) FROM {{ this }})
+   {% endif %}
+ ),
+
+ user_activity AS (
+   SELECT
+     user_id,
+     COUNT(DISTINCT session_id) AS total_sessions,
+     COUNT(*) AS total_events,
+     MAX(event_timestamp) AS last_activity
+   FROM {{ ref('events') }}
+   GROUP BY user_id
+ ),
+
+ user_purchases AS (
+   SELECT
+     user_id,
+     COUNT(*) AS total_purchases,
+     SUM(amount) AS total_revenue,
+     AVG(amount) AS avg_order_value
+   FROM {{ ref('orders') }}
+   WHERE status = 'completed'
+   GROUP BY user_id
+ )
+
+ SELECT
+   u.user_id,
+   u.email,
+   u.username,
+   u.created_at,
+   u.last_login,
+   u.subscription_tier,
+
+   -- Activity metrics
+   COALESCE(a.total_sessions, 0) AS total_sessions,
+   COALESCE(a.total_events, 0) AS total_events,
+   a.last_activity,
+
+   -- Purchase metrics
+   COALESCE(p.total_purchases, 0) AS total_purchases,
+   COALESCE(p.total_revenue, 0) AS total_revenue,
+   COALESCE(p.avg_order_value, 0) AS avg_order_value,
+
+   -- Derived fields
+   DATE_DIFF('day', u.created_at, CURRENT_DATE) AS account_age_days,
+   DATE_DIFF('day', u.last_login, CURRENT_DATE) AS days_since_login,
+
+   CASE
+     WHEN DATE_DIFF('day', u.last_login, CURRENT_DATE) <= 7 THEN 'active'
+     WHEN DATE_DIFF('day', u.last_login, CURRENT_DATE) <= 30 THEN 'at_risk'
+     ELSE 'churned'
+   END AS user_status,
+
+   CURRENT_TIMESTAMP AS updated_at
+
+ FROM base_users u
+ LEFT JOIN user_activity a ON u.user_id = a.user_id
+ LEFT JOIN user_purchases p ON u.user_id = p.user_id
+ ```
+
+ #### dbt Tests
+ ```yaml
+ # models/analytics/schema.yml
+ version: 2
+
+ models:
+   - name: users_enriched
+     description: "Enriched user data with activity and purchase metrics"
+     columns:
+       - name: user_id
+         description: "Unique user identifier"
+         tests:
+           - unique
+           - not_null
+
+       - name: email
+         description: "User email address"
+         tests:
+           - not_null
+           - unique
+
+       - name: total_revenue
+         description: "Total revenue from user purchases"
+         tests:
+           - not_null
+           - dbt_utils.accepted_range:
+               min_value: 0
+               inclusive: true
+
+       - name: user_status
+         description: "User engagement status"
+         tests:
+           - accepted_values:
+               values: ['active', 'at_risk', 'churned']
+ ```
+
+ ### Streaming Data Processing
+
+ #### Apache Kafka Consumer (Python)
+ ```python
+ from kafka import KafkaConsumer
+ import json
+ import psycopg2
+
+ consumer = KafkaConsumer(
+     'user-events',
+     bootstrap_servers=['kafka-broker-1:9092', 'kafka-broker-2:9092'],
+     auto_offset_reset='earliest',
+     enable_auto_commit=False,  # Commit offsets only after the batch is stored
+     group_id='analytics-consumer',
+     value_deserializer=lambda x: json.loads(x.decode('utf-8'))
+ )
+
+ # Database connection ('***' is a placeholder; load real credentials from a secrets store)
+ conn = psycopg2.connect(
+     host='analytics-db.example.com',
+     database='events',
+     user='writer',
+     password='***'
+ )
+ cursor = conn.cursor()
+
+ batch = []
+ batch_size = 1000
+
+ for message in consumer:
+     event = message.value
+
+     # Data validation
+     if not all(k in event for k in ['user_id', 'event_type', 'timestamp']):
+         continue
+
+     batch.append((
+         event['user_id'],
+         event['event_type'],
+         json.dumps(event.get('properties', {})),  # psycopg2 cannot adapt a raw dict
+         event['timestamp']
+     ))
+
+     # Batch insert
+     if len(batch) >= batch_size:
+         cursor.executemany(
+             """
+             INSERT INTO events (user_id, event_type, properties, timestamp)
+             VALUES (%s, %s, %s, %s)
+             """,
+             batch
+         )
+         conn.commit()
+         consumer.commit()  # Offsets advance only once the rows are durable
+         batch.clear()
+ ```
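The consumer above skips events that are missing required fields, which makes bad input invisible. A minimal sketch of the dead letter queue idea, assuming a hypothetical helper `route_events` and an in-memory list standing in for a real dead-letter topic or table:

```python
REQUIRED_FIELDS = ("user_id", "event_type", "timestamp")


def route_events(events):
    """Split events into valid ones and dead-lettered ones with a reason."""
    valid, dead_letters = [], []
    for event in events:
        missing = [field for field in REQUIRED_FIELDS if field not in event]
        if missing:
            # Keep the original payload so it can be inspected or replayed later
            dead_letters.append({"event": event, "reason": f"missing fields: {missing}"})
        else:
            valid.append(event)
    return valid, dead_letters


sample = [
    {"user_id": "u1", "event_type": "click", "timestamp": "2024-01-15T10:00:00"},
    {"user_id": "u2"},  # incomplete event
]
valid, dead_letters = route_events(sample)
```

In a real consumer the dead-lettered records would likely be published to a separate topic or written to a quarantine table rather than held in memory.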
+
+ #### Apache Spark Structured Streaming
+ ```python
+ from pyspark.sql import SparkSession
+ from pyspark.sql.functions import from_json, col, window
+ from pyspark.sql.types import StructType, StructField, StringType, TimestampType
+
+ spark = SparkSession.builder \
+     .appName("EventProcessing") \
+     .getOrCreate()
+
+ # Define schema
+ schema = StructType([
+     StructField("user_id", StringType()),
+     StructField("event_type", StringType()),
+     StructField("timestamp", TimestampType()),
+     StructField("properties", StringType())
+ ])
+
+ # Read from Kafka
+ df = spark \
+     .readStream \
+     .format("kafka") \
+     .option("kafka.bootstrap.servers", "kafka-broker:9092") \
+     .option("subscribe", "user-events") \
+     .load()
+
+ # Parse JSON
+ events = df.select(
+     from_json(col("value").cast("string"), schema).alias("data")
+ ).select("data.*")
+
+ # Aggregations with windowing. The 10-minute watermark is an illustrative
+ # value that bounds state kept for late events; tune it to the workload.
+ event_counts = events \
+     .withWatermark("timestamp", "10 minutes") \
+     .groupBy(
+         window(col("timestamp"), "5 minutes"),
+         col("event_type")
+     ) \
+     .count()
+
+ # Write to sink
+ query = event_counts \
+     .writeStream \
+     .outputMode("update") \
+     .format("console") \
+     .start()
+
+ query.awaitTermination()
+ ```
+
+ ### Data Quality Framework
+
+ #### Great Expectations
+ ```python
+ # Uses the legacy Pandas dataset API (great_expectations releases before 1.0)
+ import great_expectations as ge
+
+ # Load data
+ df = ge.read_csv('data/users.csv')
+
+ # Expectations
+ df.expect_column_values_to_not_be_null('user_id')
+ df.expect_column_values_to_be_unique('user_id')
+ df.expect_column_values_to_match_regex('email', r'^[\w\.-]+@[\w\.-]+\.\w+$')
+ df.expect_column_values_to_be_between('age', min_value=0, max_value=120)
+ df.expect_column_values_to_be_in_set('status', ['active', 'inactive', 'suspended'])
+
+ # Validation
+ validation_result = df.validate()
+
+ if not validation_result['success']:
+     print("Data quality issues found:")
+     for result in validation_result['results']:
+         if not result['success']:
+             print(f"  - {result['expectation_config']['expectation_type']}")
+ ```
+
+ #### Custom Data Quality Checks
+ ```python
+ import pandas as pd
+
+ def validate_data_quality(df: pd.DataFrame) -> dict:
+     """Comprehensive data quality validation"""
+
+     issues = []
+
+     # Completeness
+     null_counts = df.isnull().sum()
+     if null_counts.any():
+         issues.append({
+             'type': 'completeness',
+             'severity': 'high',
+             'details': null_counts[null_counts > 0].to_dict()
+         })
+
+     # Uniqueness
+     duplicate_cols = ['user_id', 'email']
+     for col in duplicate_cols:
+         if col in df.columns:
+             duplicates = int(df[col].duplicated().sum())
+             if duplicates > 0:
+                 issues.append({
+                     'type': 'uniqueness',
+                     'severity': 'critical',
+                     'column': col,
+                     'count': duplicates
+                 })
+
+     # Validity (na=False treats missing emails as non-matching instead of NaN)
+     if 'email' in df.columns:
+         invalid_emails = ~df['email'].str.match(r'^[\w\.-]+@[\w\.-]+\.\w+$', na=False)
+         if invalid_emails.sum() > 0:
+             issues.append({
+                 'type': 'validity',
+                 'severity': 'medium',
+                 'column': 'email',
+                 'count': int(invalid_emails.sum())
+             })
+
+     # Consistency
+     if 'created_at' in df.columns and 'updated_at' in df.columns:
+         inconsistent = df['created_at'] > df['updated_at']
+         if inconsistent.sum() > 0:
+             issues.append({
+                 'type': 'consistency',
+                 'severity': 'high',
+                 'details': 'created_at after updated_at',
+                 'count': int(inconsistent.sum())
+             })
+
+     return {
+         'passed': len(issues) == 0,
+         'issues': issues,
+         'row_count': len(df),
+         'column_count': len(df.columns)
+     }
+ ```
+
+ ## Data Architecture Patterns
+
+ ### Lambda Architecture
+ ```
+ Batch Layer: Historical data → Spark → Data Warehouse
+ Speed Layer: Real-time data → Kafka → Stream Processing → Serving DB
+ Serving Layer: Query interface combining batch and real-time views
+ ```
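As a rough illustration of the serving layer, the sketch below merges a precomputed batch view with a real-time increment at query time. The function name `serve_event_counts` and the dictionary-shaped views are hypothetical, chosen only for this example.

```python
from collections import Counter


def serve_event_counts(batch_view: dict, speed_view: dict) -> dict:
    """Combine batch-layer totals with speed-layer increments.

    batch_view: counts computed by the batch layer up to its last run.
    speed_view: counts for events that arrived after that run.
    """
    merged = Counter(batch_view)
    merged.update(speed_view)  # adds counts key by key
    return dict(merged)


batch_view = {"login": 1200, "purchase": 85}
speed_view = {"login": 14, "signup": 3}
combined = serve_event_counts(batch_view, speed_view)
```

The sketch assumes the two views never cover the same events; a real serving layer has to enforce that boundary, typically by discarding speed-layer data once the batch layer has caught up.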
526
+
527
+ ### Kappa Architecture
528
+ ```
529
+ Single Stream: All data → Kafka → Stream Processing → Storage
530
+ Reprocessing: Replay from Kafka for batch jobs
531
+ ```
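The reprocessing idea can be sketched with a plain list standing in for the Kafka log: the same processing function serves both live updates and full replays. The names `process` and `event_log` are illustrative assumptions, not part of any library.

```python
from collections import defaultdict


def process(log: list, from_offset: int = 0) -> dict:
    """Fold an append-only event log into per-user event counts.

    Replaying with from_offset=0 rebuilds the full state, which is how a
    Kappa-style system reprocesses history with the same streaming code.
    """
    counts = defaultdict(int)
    for event in log[from_offset:]:
        counts[event["user_id"]] += 1
    return dict(counts)


event_log = [
    {"user_id": "u1", "event_type": "login"},
    {"user_id": "u2", "event_type": "login"},
    {"user_id": "u1", "event_type": "purchase"},
]

full_state = process(event_log)                   # full replay from the start
recent_state = process(event_log, from_offset=2)  # only events after offset 2
```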
532
+
533
+ ### Medallion Architecture (Lakehouse)
534
+ ```
535
+ Bronze Layer: Raw data (unchanged, append-only)
536
+ Silver Layer: Cleaned, validated, deduplicated
537
+ Gold Layer: Business-level aggregations, curated datasets
538
+ ```
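A small in-memory sketch of the three layers, assuming hypothetical helpers `to_silver` and `to_gold` and a simplistic deduplication key of `(user_id, amount)`; a real pipeline would usually deduplicate on an event or order identifier.

```python
from collections import defaultdict

# Bronze: raw records kept exactly as received (append-only)
bronze = [
    {"user_id": "u1", "amount": "10.50"},
    {"user_id": "u1", "amount": "10.50"},  # duplicate
    {"user_id": "u2", "amount": "oops"},   # invalid amount
    {"user_id": "u3", "amount": "4.00"},
]


def to_silver(rows):
    """Validate, type-cast, and deduplicate bronze rows."""
    seen, silver = set(), []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # drop rows that fail validation
        key = (row["user_id"], amount)
        if key not in seen:
            seen.add(key)
            silver.append({"user_id": row["user_id"], "amount": amount})
    return silver


def to_gold(rows):
    """Aggregate silver rows into revenue per user."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["user_id"]] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
```

Note that `bronze` is never modified, so the silver and gold layers can always be rebuilt from it.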
+
+ ## Best Practices
+
+ ### Data Pipeline Design
+ 1. **Idempotency**: Pipelines can be rerun without side effects
+ 2. **Incremental Processing**: Only process new/changed data
+ 3. **Error Handling**: Retry logic, dead letter queues
+ 4. **Monitoring**: Data quality metrics, pipeline SLAs
+ 5. **Testing**: Unit tests for transformations, integration tests
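
The idempotency point can be illustrated with a keyed write: loading the same batch twice leaves the table unchanged. This is a minimal sketch using the standard-library `sqlite3` module as a stand-in warehouse; the `load_users` helper and the two-column `users_daily` table are assumptions for the example.

```python
import sqlite3


def load_users(conn, rows):
    """Write rows keyed on user_id so a rerun overwrites instead of duplicating."""
    conn.executemany(
        "INSERT OR REPLACE INTO users_daily (user_id, email) VALUES (?, ?)",
        rows,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users_daily (user_id TEXT PRIMARY KEY, email TEXT)")

batch = [("u1", "a@example.com"), ("u2", "b@example.com")]
load_users(conn, batch)
load_users(conn, batch)  # rerun of the same batch

row_count = conn.execute("SELECT COUNT(*) FROM users_daily").fetchone()[0]
```

A plain `INSERT` would have produced four rows (or a constraint error) on the second run; keying the write on `user_id` is what makes the rerun safe.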
+
+ ### Performance Optimization
+ 1. **Partitioning**: Partition by date for time-series data
+ 2. **Compression**: Use Parquet/ORC with Snappy compression
+ 3. **Predicate Pushdown**: Filter early in pipeline
+ 4. **Columnar Storage**: Optimize for analytical queries
+ 5. **Caching**: Cache intermediate results
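
To make the partitioning point concrete, the sketch below groups records under Hive-style `dt=YYYY-MM-DD` prefixes so that a query for one day only needs to read one prefix. The function `partition_by_date` and the `s3://data-lake/events` base path are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def partition_by_date(records, base_path="s3://data-lake/events"):
    """Group records under dt=YYYY-MM-DD partition prefixes."""
    partitions = defaultdict(list)
    for record in records:
        day = datetime.fromisoformat(record["timestamp"]).date().isoformat()
        partitions[f"{base_path}/dt={day}"].append(record)
    return dict(partitions)


records = [
    {"user_id": "u1", "timestamp": "2024-01-15T10:00:00"},
    {"user_id": "u2", "timestamp": "2024-01-15T23:59:59"},
    {"user_id": "u3", "timestamp": "2024-01-16T00:00:01"},
]
partitions = partition_by_date(records)
```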
+
+ ### Data Governance
+ 1. **Data Catalog**: Document schemas, lineage, owners
+ 2. **Access Control**: Role-based permissions
+ 3. **PII Handling**: Encryption, masking, retention policies
+ 4. **Data Lineage**: Track data flow from source to destination
+ 5. **Audit Logging**: Track data access and modifications
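
For the PII handling item, one possible approach is to pseudonymize identifiers with a keyed hash and mask values shown to people. The helpers `pseudonymize` and `mask_email` below are illustrative assumptions, and the hard-coded key is for demonstration only; a real key would come from a secrets manager.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 digest so the raw value is not stored."""
    return hmac.new(secret_key, value.lower().encode("utf-8"), hashlib.sha256).hexdigest()


def mask_email(email: str) -> str:
    """Keep the first character and the domain, mask the rest of the local part."""
    local, _, domain = email.partition("@")
    return f"{local[:1]}{'*' * max(len(local) - 1, 0)}@{domain}"


key = b"demo-key-not-for-production"
token = pseudonymize("Alice@Example.com", key)
masked = mask_email("alice@example.com")
```

The keyed hash keeps joins possible (the same email always maps to the same token) while making the token hard to reverse without the key.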
+
+ ## Deliverables
+
+ 1. **Pipeline Code**: Airflow DAGs, dbt models, Spark jobs
+ 2. **Data Quality Tests**: Great Expectations, custom validators
+ 3. **Documentation**: Data dictionary, pipeline diagrams, runbooks
+ 4. **Monitoring Dashboards**: Pipeline health, data quality metrics
+ 5. **Performance Report**: Processing times, resource utilization
+
+ ## Confidence Reporting
+
+ ✅ Report high confidence when:
+ - Pipelines tested with production-like data volume
+ - Data quality checks implemented and passing
+ - Error handling and retries configured
+ - Monitoring and alerting set up
+ - Documentation complete
+
+ ❌ DO NOT report >0.80 confidence without:
+ - Testing full pipeline end-to-end
+ - Validating data quality at each stage
+ - Verifying idempotency (can rerun safely)
+ - Performance testing with realistic data volumes
+ - Documenting data lineage and transformations
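
The 0.80 rule above could be enforced mechanically. The sketch below is one possible reading of it, with hypothetical check names and a hypothetical `reported_confidence` helper; it caps the score unless every listed check has passed.

```python
REQUIRED_CHECKS = (
    "end_to_end_tested",
    "quality_validated_each_stage",
    "idempotency_verified",
    "performance_tested",
    "lineage_documented",
)


def reported_confidence(raw_confidence: float, checks: dict) -> float:
    """Cap confidence at 0.80 unless every required check has passed."""
    if all(checks.get(name, False) for name in REQUIRED_CHECKS):
        return raw_confidence
    return min(raw_confidence, 0.80)


all_passed = {name: True for name in REQUIRED_CHECKS}
missing_one = dict(all_passed, idempotency_verified=False)

capped = reported_confidence(0.95, missing_one)
uncapped = reported_confidence(0.95, all_passed)
```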