@djm204/agent-skills 1.0.0
- package/README.md +597 -0
- package/bin/cli.js +8 -0
- package/package.json +55 -0
- package/src/index.js +1817 -0
- package/src/index.test.js +1264 -0
- package/templates/_shared/code-quality.mdc +52 -0
- package/templates/_shared/communication.mdc +43 -0
- package/templates/_shared/core-principles.mdc +67 -0
- package/templates/_shared/git-workflow.mdc +48 -0
- package/templates/_shared/security-fundamentals.mdc +41 -0
- package/templates/agents/utility-agent/.cursor/rules/action-control.mdc +71 -0
- package/templates/agents/utility-agent/.cursor/rules/context-management.mdc +61 -0
- package/templates/agents/utility-agent/.cursor/rules/hallucination-prevention.mdc +58 -0
- package/templates/agents/utility-agent/.cursor/rules/overview.mdc +34 -0
- package/templates/agents/utility-agent/.cursor/rules/token-optimization.mdc +71 -0
- package/templates/agents/utility-agent/CLAUDE.md +513 -0
- package/templates/business/market-intelligence/.cursor/rules/data-sources.mdc +62 -0
- package/templates/business/market-intelligence/.cursor/rules/overview.mdc +55 -0
- package/templates/business/market-intelligence/.cursor/rules/reporting.mdc +59 -0
- package/templates/business/market-intelligence/.cursor/rules/risk-signals.mdc +63 -0
- package/templates/business/market-intelligence/.cursor/rules/sentiment-analysis.mdc +70 -0
- package/templates/business/market-intelligence/.cursor/rules/trend-detection.mdc +72 -0
- package/templates/business/market-intelligence/CLAUDE.md +371 -0
- package/templates/business/marketing-expert/.cursor/rules/brand-strategy.mdc +74 -0
- package/templates/business/marketing-expert/.cursor/rules/campaign-planning.mdc +60 -0
- package/templates/business/marketing-expert/.cursor/rules/growth-frameworks.mdc +69 -0
- package/templates/business/marketing-expert/.cursor/rules/market-analysis.mdc +70 -0
- package/templates/business/marketing-expert/.cursor/rules/marketing-analytics.mdc +71 -0
- package/templates/business/marketing-expert/.cursor/rules/overview.mdc +56 -0
- package/templates/business/marketing-expert/CLAUDE.md +567 -0
- package/templates/business/predictive-maintenance/.cursor/rules/alerting.mdc +56 -0
- package/templates/business/predictive-maintenance/.cursor/rules/asset-lifecycle.mdc +71 -0
- package/templates/business/predictive-maintenance/.cursor/rules/failure-prediction.mdc +65 -0
- package/templates/business/predictive-maintenance/.cursor/rules/maintenance-scheduling.mdc +61 -0
- package/templates/business/predictive-maintenance/.cursor/rules/overview.mdc +55 -0
- package/templates/business/predictive-maintenance/.cursor/rules/sensor-analytics.mdc +66 -0
- package/templates/business/predictive-maintenance/CLAUDE.md +529 -0
- package/templates/business/product-manager/.cursor/rules/communication.mdc +77 -0
- package/templates/business/product-manager/.cursor/rules/discovery.mdc +79 -0
- package/templates/business/product-manager/.cursor/rules/metrics.mdc +75 -0
- package/templates/business/product-manager/.cursor/rules/overview.mdc +47 -0
- package/templates/business/product-manager/.cursor/rules/prioritization.mdc +66 -0
- package/templates/business/product-manager/.cursor/rules/requirements.mdc +79 -0
- package/templates/business/product-manager/CLAUDE.md +593 -0
- package/templates/business/project-manager/.cursor/rules/overview.mdc +53 -0
- package/templates/business/project-manager/.cursor/rules/reporting.mdc +68 -0
- package/templates/business/project-manager/.cursor/rules/risk-management.mdc +71 -0
- package/templates/business/project-manager/.cursor/rules/scheduling.mdc +67 -0
- package/templates/business/project-manager/.cursor/rules/scope-management.mdc +66 -0
- package/templates/business/project-manager/.cursor/rules/stakeholder-management.mdc +70 -0
- package/templates/business/project-manager/CLAUDE.md +540 -0
- package/templates/business/regulatory-sentinel/.cursor/rules/compliance-tracking.mdc +74 -0
- package/templates/business/regulatory-sentinel/.cursor/rules/impact-assessment.mdc +62 -0
- package/templates/business/regulatory-sentinel/.cursor/rules/monitoring.mdc +67 -0
- package/templates/business/regulatory-sentinel/.cursor/rules/overview.mdc +55 -0
- package/templates/business/regulatory-sentinel/.cursor/rules/reporting.mdc +61 -0
- package/templates/business/regulatory-sentinel/.cursor/rules/risk-classification.mdc +73 -0
- package/templates/business/regulatory-sentinel/CLAUDE.md +572 -0
- package/templates/business/resource-allocator/.cursor/rules/capacity-modeling.mdc +65 -0
- package/templates/business/resource-allocator/.cursor/rules/coordination.mdc +67 -0
- package/templates/business/resource-allocator/.cursor/rules/crisis-management.mdc +64 -0
- package/templates/business/resource-allocator/.cursor/rules/demand-prediction.mdc +52 -0
- package/templates/business/resource-allocator/.cursor/rules/overview.mdc +76 -0
- package/templates/business/resource-allocator/.cursor/rules/scheduling.mdc +63 -0
- package/templates/business/resource-allocator/CLAUDE.md +525 -0
- package/templates/business/strategic-negotiator/.cursor/rules/contract-analysis.mdc +60 -0
- package/templates/business/strategic-negotiator/.cursor/rules/deal-structuring.mdc +66 -0
- package/templates/business/strategic-negotiator/.cursor/rules/game-theory.mdc +64 -0
- package/templates/business/strategic-negotiator/.cursor/rules/overview.mdc +55 -0
- package/templates/business/strategic-negotiator/.cursor/rules/preparation.mdc +79 -0
- package/templates/business/strategic-negotiator/.cursor/rules/scenario-modeling.mdc +66 -0
- package/templates/business/strategic-negotiator/CLAUDE.md +640 -0
- package/templates/business/supply-chain/.cursor/rules/cost-modeling.mdc +67 -0
- package/templates/business/supply-chain/.cursor/rules/demand-forecasting.mdc +67 -0
- package/templates/business/supply-chain/.cursor/rules/inventory-management.mdc +69 -0
- package/templates/business/supply-chain/.cursor/rules/logistics.mdc +61 -0
- package/templates/business/supply-chain/.cursor/rules/overview.mdc +64 -0
- package/templates/business/supply-chain/.cursor/rules/supplier-evaluation.mdc +66 -0
- package/templates/business/supply-chain/CLAUDE.md +590 -0
- package/templates/business/supply-chain-harmonizer/.cursor/rules/disruption-response.mdc +67 -0
- package/templates/business/supply-chain-harmonizer/.cursor/rules/inventory-rebalancing.mdc +63 -0
- package/templates/business/supply-chain-harmonizer/.cursor/rules/overview.mdc +65 -0
- package/templates/business/supply-chain-harmonizer/.cursor/rules/rerouting.mdc +64 -0
- package/templates/business/supply-chain-harmonizer/.cursor/rules/scenario-simulation.mdc +68 -0
- package/templates/business/supply-chain-harmonizer/.cursor/rules/stakeholder-notifications.mdc +61 -0
- package/templates/business/supply-chain-harmonizer/CLAUDE.md +600 -0
- package/templates/creative/brand-guardian/.cursor/rules/brand-voice.mdc +64 -0
- package/templates/creative/brand-guardian/.cursor/rules/content-review.mdc +47 -0
- package/templates/creative/brand-guardian/.cursor/rules/ethical-guidelines.mdc +47 -0
- package/templates/creative/brand-guardian/.cursor/rules/multi-channel.mdc +49 -0
- package/templates/creative/brand-guardian/.cursor/rules/overview.mdc +58 -0
- package/templates/creative/brand-guardian/.cursor/rules/visual-identity.mdc +64 -0
- package/templates/creative/brand-guardian/CLAUDE.md +634 -0
- package/templates/creative/content-creation-expert/.cursor/rules/content-strategy.mdc +65 -0
- package/templates/creative/content-creation-expert/.cursor/rules/copywriting.mdc +59 -0
- package/templates/creative/content-creation-expert/.cursor/rules/editorial-operations.mdc +65 -0
- package/templates/creative/content-creation-expert/.cursor/rules/multimedia-production.mdc +64 -0
- package/templates/creative/content-creation-expert/.cursor/rules/overview.mdc +58 -0
- package/templates/creative/content-creation-expert/.cursor/rules/seo-content.mdc +75 -0
- package/templates/creative/content-creation-expert/CLAUDE.md +568 -0
- package/templates/creative/narrative-architect/.cursor/rules/collaboration.mdc +62 -0
- package/templates/creative/narrative-architect/.cursor/rules/continuity-tracking.mdc +56 -0
- package/templates/creative/narrative-architect/.cursor/rules/overview.mdc +68 -0
- package/templates/creative/narrative-architect/.cursor/rules/story-bible.mdc +77 -0
- package/templates/creative/narrative-architect/.cursor/rules/timeline-management.mdc +60 -0
- package/templates/creative/narrative-architect/.cursor/rules/world-building.mdc +78 -0
- package/templates/creative/narrative-architect/CLAUDE.md +737 -0
- package/templates/creative/social-media-expert/.cursor/rules/audience-growth.mdc +62 -0
- package/templates/creative/social-media-expert/.cursor/rules/community-management.mdc +67 -0
- package/templates/creative/social-media-expert/.cursor/rules/content-strategy.mdc +60 -0
- package/templates/creative/social-media-expert/.cursor/rules/overview.mdc +48 -0
- package/templates/creative/social-media-expert/.cursor/rules/platform-strategy.mdc +64 -0
- package/templates/creative/social-media-expert/.cursor/rules/social-analytics.mdc +64 -0
- package/templates/creative/social-media-expert/CLAUDE.md +624 -0
- package/templates/creative/trend-forecaster/.cursor/rules/cultural-analysis.mdc +59 -0
- package/templates/creative/trend-forecaster/.cursor/rules/forecasting-methods.mdc +63 -0
- package/templates/creative/trend-forecaster/.cursor/rules/overview.mdc +58 -0
- package/templates/creative/trend-forecaster/.cursor/rules/reporting.mdc +61 -0
- package/templates/creative/trend-forecaster/.cursor/rules/signal-analysis.mdc +74 -0
- package/templates/creative/trend-forecaster/.cursor/rules/trend-lifecycle.mdc +75 -0
- package/templates/creative/trend-forecaster/CLAUDE.md +717 -0
- package/templates/creative/ux-designer/.cursor/rules/accessibility.mdc +69 -0
- package/templates/creative/ux-designer/.cursor/rules/emotional-design.mdc +59 -0
- package/templates/creative/ux-designer/.cursor/rules/handoff.mdc +73 -0
- package/templates/creative/ux-designer/.cursor/rules/information-architecture.mdc +62 -0
- package/templates/creative/ux-designer/.cursor/rules/interaction-design.mdc +66 -0
- package/templates/creative/ux-designer/.cursor/rules/overview.mdc +61 -0
- package/templates/creative/ux-designer/.cursor/rules/research.mdc +61 -0
- package/templates/creative/ux-designer/.cursor/rules/visual-design.mdc +68 -0
- package/templates/creative/ux-designer/CLAUDE.md +124 -0
- package/templates/dogfood/project-overview.mdc +12 -0
- package/templates/dogfood/project-structure.mdc +82 -0
- package/templates/dogfood/rules-creation-best-practices.mdc +45 -0
- package/templates/education/educator/.cursor/rules/accessibility.mdc +67 -0
- package/templates/education/educator/.cursor/rules/assessment.mdc +68 -0
- package/templates/education/educator/.cursor/rules/curriculum.mdc +57 -0
- package/templates/education/educator/.cursor/rules/engagement.mdc +65 -0
- package/templates/education/educator/.cursor/rules/instructional-design.mdc +69 -0
- package/templates/education/educator/.cursor/rules/overview.mdc +57 -0
- package/templates/education/educator/.cursor/rules/retention.mdc +64 -0
- package/templates/education/educator/CLAUDE.md +338 -0
- package/templates/engineering/blockchain/.cursor/rules/defi-patterns.mdc +48 -0
- package/templates/engineering/blockchain/.cursor/rules/gas-optimization.mdc +77 -0
- package/templates/engineering/blockchain/.cursor/rules/overview.mdc +41 -0
- package/templates/engineering/blockchain/.cursor/rules/security.mdc +61 -0
- package/templates/engineering/blockchain/.cursor/rules/smart-contracts.mdc +64 -0
- package/templates/engineering/blockchain/.cursor/rules/testing.mdc +77 -0
- package/templates/engineering/blockchain/.cursor/rules/web3-integration.mdc +47 -0
- package/templates/engineering/blockchain/CLAUDE.md +389 -0
- package/templates/engineering/cli-tools/.cursor/rules/architecture.mdc +76 -0
- package/templates/engineering/cli-tools/.cursor/rules/arguments.mdc +65 -0
- package/templates/engineering/cli-tools/.cursor/rules/distribution.mdc +40 -0
- package/templates/engineering/cli-tools/.cursor/rules/error-handling.mdc +67 -0
- package/templates/engineering/cli-tools/.cursor/rules/overview.mdc +58 -0
- package/templates/engineering/cli-tools/.cursor/rules/testing.mdc +42 -0
- package/templates/engineering/cli-tools/.cursor/rules/user-experience.mdc +43 -0
- package/templates/engineering/cli-tools/CLAUDE.md +356 -0
- package/templates/engineering/data-engineering/.cursor/rules/data-modeling.mdc +71 -0
- package/templates/engineering/data-engineering/.cursor/rules/data-quality.mdc +78 -0
- package/templates/engineering/data-engineering/.cursor/rules/overview.mdc +49 -0
- package/templates/engineering/data-engineering/.cursor/rules/performance.mdc +71 -0
- package/templates/engineering/data-engineering/.cursor/rules/pipeline-design.mdc +79 -0
- package/templates/engineering/data-engineering/.cursor/rules/security.mdc +79 -0
- package/templates/engineering/data-engineering/.cursor/rules/testing.mdc +75 -0
- package/templates/engineering/data-engineering/CLAUDE.md +974 -0
- package/templates/engineering/devops-sre/.cursor/rules/capacity-planning.mdc +49 -0
- package/templates/engineering/devops-sre/.cursor/rules/change-management.mdc +51 -0
- package/templates/engineering/devops-sre/.cursor/rules/chaos-engineering.mdc +50 -0
- package/templates/engineering/devops-sre/.cursor/rules/disaster-recovery.mdc +54 -0
- package/templates/engineering/devops-sre/.cursor/rules/incident-management.mdc +56 -0
- package/templates/engineering/devops-sre/.cursor/rules/observability.mdc +50 -0
- package/templates/engineering/devops-sre/.cursor/rules/overview.mdc +76 -0
- package/templates/engineering/devops-sre/.cursor/rules/postmortems.mdc +49 -0
- package/templates/engineering/devops-sre/.cursor/rules/runbooks.mdc +49 -0
- package/templates/engineering/devops-sre/.cursor/rules/slo-sli.mdc +46 -0
- package/templates/engineering/devops-sre/.cursor/rules/toil-reduction.mdc +52 -0
- package/templates/engineering/devops-sre/CLAUDE.md +1007 -0
- package/templates/engineering/fullstack/.cursor/rules/api-contracts.mdc +79 -0
- package/templates/engineering/fullstack/.cursor/rules/architecture.mdc +79 -0
- package/templates/engineering/fullstack/.cursor/rules/overview.mdc +61 -0
- package/templates/engineering/fullstack/.cursor/rules/shared-types.mdc +77 -0
- package/templates/engineering/fullstack/.cursor/rules/testing.mdc +72 -0
- package/templates/engineering/fullstack/CLAUDE.md +349 -0
- package/templates/engineering/ml-ai/.cursor/rules/data-engineering.mdc +71 -0
- package/templates/engineering/ml-ai/.cursor/rules/deployment.mdc +43 -0
- package/templates/engineering/ml-ai/.cursor/rules/model-development.mdc +44 -0
- package/templates/engineering/ml-ai/.cursor/rules/monitoring.mdc +45 -0
- package/templates/engineering/ml-ai/.cursor/rules/overview.mdc +42 -0
- package/templates/engineering/ml-ai/.cursor/rules/security.mdc +51 -0
- package/templates/engineering/ml-ai/.cursor/rules/testing.mdc +44 -0
- package/templates/engineering/ml-ai/CLAUDE.md +1136 -0
- package/templates/engineering/mobile/.cursor/rules/navigation.mdc +75 -0
- package/templates/engineering/mobile/.cursor/rules/offline-first.mdc +68 -0
- package/templates/engineering/mobile/.cursor/rules/overview.mdc +76 -0
- package/templates/engineering/mobile/.cursor/rules/performance.mdc +78 -0
- package/templates/engineering/mobile/.cursor/rules/testing.mdc +77 -0
- package/templates/engineering/mobile/CLAUDE.md +233 -0
- package/templates/engineering/platform-engineering/.cursor/rules/ci-cd.mdc +51 -0
- package/templates/engineering/platform-engineering/.cursor/rules/developer-experience.mdc +48 -0
- package/templates/engineering/platform-engineering/.cursor/rules/infrastructure-as-code.mdc +62 -0
- package/templates/engineering/platform-engineering/.cursor/rules/kubernetes.mdc +51 -0
- package/templates/engineering/platform-engineering/.cursor/rules/observability.mdc +52 -0
- package/templates/engineering/platform-engineering/.cursor/rules/overview.mdc +44 -0
- package/templates/engineering/platform-engineering/.cursor/rules/security.mdc +74 -0
- package/templates/engineering/platform-engineering/.cursor/rules/testing.mdc +59 -0
- package/templates/engineering/platform-engineering/CLAUDE.md +850 -0
- package/templates/engineering/qa-engineering/.cursor/rules/automation.mdc +71 -0
- package/templates/engineering/qa-engineering/.cursor/rules/metrics.mdc +68 -0
- package/templates/engineering/qa-engineering/.cursor/rules/overview.mdc +45 -0
- package/templates/engineering/qa-engineering/.cursor/rules/quality-gates.mdc +54 -0
- package/templates/engineering/qa-engineering/.cursor/rules/test-design.mdc +59 -0
- package/templates/engineering/qa-engineering/.cursor/rules/test-strategy.mdc +62 -0
- package/templates/engineering/qa-engineering/CLAUDE.md +726 -0
- package/templates/engineering/testing/.cursor/rules/advanced-techniques.mdc +44 -0
- package/templates/engineering/testing/.cursor/rules/ci-cd-integration.mdc +43 -0
- package/templates/engineering/testing/.cursor/rules/overview.mdc +61 -0
- package/templates/engineering/testing/.cursor/rules/performance-testing.mdc +39 -0
- package/templates/engineering/testing/.cursor/rules/quality-metrics.mdc +74 -0
- package/templates/engineering/testing/.cursor/rules/reliability.mdc +39 -0
- package/templates/engineering/testing/.cursor/rules/tdd-methodology.mdc +52 -0
- package/templates/engineering/testing/.cursor/rules/test-data.mdc +46 -0
- package/templates/engineering/testing/.cursor/rules/test-design.mdc +45 -0
- package/templates/engineering/testing/.cursor/rules/test-types.mdc +71 -0
- package/templates/engineering/testing/CLAUDE.md +1134 -0
- package/templates/engineering/unity-dev-expert/.cursor/rules/csharp-architecture.mdc +61 -0
- package/templates/engineering/unity-dev-expert/.cursor/rules/multiplayer-networking.mdc +67 -0
- package/templates/engineering/unity-dev-expert/.cursor/rules/overview.mdc +56 -0
- package/templates/engineering/unity-dev-expert/.cursor/rules/performance-optimization.mdc +76 -0
- package/templates/engineering/unity-dev-expert/.cursor/rules/physics-rendering.mdc +59 -0
- package/templates/engineering/unity-dev-expert/.cursor/rules/ui-systems.mdc +59 -0
- package/templates/engineering/unity-dev-expert/CLAUDE.md +534 -0
- package/templates/engineering/web-backend/.cursor/rules/api-design.mdc +64 -0
- package/templates/engineering/web-backend/.cursor/rules/authentication.mdc +69 -0
- package/templates/engineering/web-backend/.cursor/rules/database-patterns.mdc +73 -0
- package/templates/engineering/web-backend/.cursor/rules/error-handling.mdc +66 -0
- package/templates/engineering/web-backend/.cursor/rules/overview.mdc +74 -0
- package/templates/engineering/web-backend/.cursor/rules/security.mdc +60 -0
- package/templates/engineering/web-backend/.cursor/rules/testing.mdc +74 -0
- package/templates/engineering/web-backend/CLAUDE.md +366 -0
- package/templates/engineering/web-frontend/.cursor/rules/accessibility.mdc +75 -0
- package/templates/engineering/web-frontend/.cursor/rules/component-patterns.mdc +76 -0
- package/templates/engineering/web-frontend/.cursor/rules/overview.mdc +77 -0
- package/templates/engineering/web-frontend/.cursor/rules/performance.mdc +73 -0
- package/templates/engineering/web-frontend/.cursor/rules/state-management.mdc +71 -0
- package/templates/engineering/web-frontend/.cursor/rules/styling.mdc +69 -0
- package/templates/engineering/web-frontend/.cursor/rules/testing.mdc +75 -0
- package/templates/engineering/web-frontend/CLAUDE.md +399 -0
- package/templates/languages/cpp-expert/.cursor/rules/concurrency.mdc +68 -0
- package/templates/languages/cpp-expert/.cursor/rules/error-handling.mdc +65 -0
- package/templates/languages/cpp-expert/.cursor/rules/memory-and-ownership.mdc +68 -0
- package/templates/languages/cpp-expert/.cursor/rules/modern-cpp.mdc +75 -0
- package/templates/languages/cpp-expert/.cursor/rules/overview.mdc +37 -0
- package/templates/languages/cpp-expert/.cursor/rules/performance.mdc +74 -0
- package/templates/languages/cpp-expert/.cursor/rules/testing.mdc +70 -0
- package/templates/languages/cpp-expert/.cursor/rules/tooling.mdc +77 -0
- package/templates/languages/cpp-expert/CLAUDE.md +242 -0
- package/templates/languages/csharp-expert/.cursor/rules/aspnet-core.mdc +78 -0
- package/templates/languages/csharp-expert/.cursor/rules/async-patterns.mdc +71 -0
- package/templates/languages/csharp-expert/.cursor/rules/dependency-injection.mdc +76 -0
- package/templates/languages/csharp-expert/.cursor/rules/error-handling.mdc +65 -0
- package/templates/languages/csharp-expert/.cursor/rules/language-features.mdc +74 -0
- package/templates/languages/csharp-expert/.cursor/rules/overview.mdc +47 -0
- package/templates/languages/csharp-expert/.cursor/rules/performance.mdc +66 -0
- package/templates/languages/csharp-expert/.cursor/rules/testing.mdc +78 -0
- package/templates/languages/csharp-expert/.cursor/rules/tooling.mdc +78 -0
- package/templates/languages/csharp-expert/CLAUDE.md +360 -0
- package/templates/languages/golang-expert/.cursor/rules/concurrency.mdc +79 -0
- package/templates/languages/golang-expert/.cursor/rules/error-handling.mdc +77 -0
- package/templates/languages/golang-expert/.cursor/rules/interfaces-and-types.mdc +77 -0
- package/templates/languages/golang-expert/.cursor/rules/overview.mdc +74 -0
- package/templates/languages/golang-expert/.cursor/rules/performance.mdc +76 -0
- package/templates/languages/golang-expert/.cursor/rules/production-patterns.mdc +76 -0
- package/templates/languages/golang-expert/.cursor/rules/stdlib-and-tooling.mdc +68 -0
- package/templates/languages/golang-expert/.cursor/rules/testing.mdc +77 -0
- package/templates/languages/golang-expert/CLAUDE.md +361 -0
- package/templates/languages/java-expert/.cursor/rules/concurrency.mdc +69 -0
- package/templates/languages/java-expert/.cursor/rules/error-handling.mdc +70 -0
- package/templates/languages/java-expert/.cursor/rules/modern-java.mdc +74 -0
- package/templates/languages/java-expert/.cursor/rules/overview.mdc +42 -0
- package/templates/languages/java-expert/.cursor/rules/performance.mdc +69 -0
- package/templates/languages/java-expert/.cursor/rules/persistence.mdc +74 -0
- package/templates/languages/java-expert/.cursor/rules/spring-boot.mdc +73 -0
- package/templates/languages/java-expert/.cursor/rules/testing.mdc +79 -0
- package/templates/languages/java-expert/.cursor/rules/tooling.mdc +76 -0
- package/templates/languages/java-expert/CLAUDE.md +325 -0
- package/templates/languages/javascript-expert/.cursor/rules/language-deep-dive.mdc +74 -0
- package/templates/languages/javascript-expert/.cursor/rules/node-patterns.mdc +77 -0
- package/templates/languages/javascript-expert/.cursor/rules/overview.mdc +66 -0
- package/templates/languages/javascript-expert/.cursor/rules/performance.mdc +64 -0
- package/templates/languages/javascript-expert/.cursor/rules/react-patterns.mdc +70 -0
- package/templates/languages/javascript-expert/.cursor/rules/testing.mdc +76 -0
- package/templates/languages/javascript-expert/.cursor/rules/tooling.mdc +72 -0
- package/templates/languages/javascript-expert/.cursor/rules/typescript-deep-dive.mdc +77 -0
- package/templates/languages/javascript-expert/CLAUDE.md +479 -0
- package/templates/languages/kotlin-expert/.cursor/rules/coroutines.mdc +75 -0
- package/templates/languages/kotlin-expert/.cursor/rules/error-handling.mdc +69 -0
- package/templates/languages/kotlin-expert/.cursor/rules/frameworks.mdc +76 -0
- package/templates/languages/kotlin-expert/.cursor/rules/language-features.mdc +78 -0
- package/templates/languages/kotlin-expert/.cursor/rules/overview.mdc +38 -0
- package/templates/languages/kotlin-expert/.cursor/rules/performance.mdc +73 -0
- package/templates/languages/kotlin-expert/.cursor/rules/testing.mdc +70 -0
- package/templates/languages/kotlin-expert/.cursor/rules/tooling.mdc +67 -0
- package/templates/languages/kotlin-expert/CLAUDE.md +276 -0
- package/templates/languages/python-expert/.cursor/rules/async-python.mdc +71 -0
- package/templates/languages/python-expert/.cursor/rules/overview.mdc +76 -0
- package/templates/languages/python-expert/.cursor/rules/patterns-and-idioms.mdc +77 -0
- package/templates/languages/python-expert/.cursor/rules/performance.mdc +74 -0
- package/templates/languages/python-expert/.cursor/rules/testing.mdc +77 -0
- package/templates/languages/python-expert/.cursor/rules/tooling.mdc +77 -0
- package/templates/languages/python-expert/.cursor/rules/type-system.mdc +77 -0
- package/templates/languages/python-expert/.cursor/rules/web-and-apis.mdc +76 -0
- package/templates/languages/python-expert/CLAUDE.md +264 -0
- package/templates/languages/ruby-expert/.cursor/rules/concurrency-and-threading.mdc +65 -0
- package/templates/languages/ruby-expert/.cursor/rules/error-handling.mdc +69 -0
- package/templates/languages/ruby-expert/.cursor/rules/idioms-and-style.mdc +76 -0
- package/templates/languages/ruby-expert/.cursor/rules/overview.mdc +60 -0
- package/templates/languages/ruby-expert/.cursor/rules/performance.mdc +68 -0
- package/templates/languages/ruby-expert/.cursor/rules/rails-and-frameworks.mdc +60 -0
- package/templates/languages/ruby-expert/.cursor/rules/testing.mdc +56 -0
- package/templates/languages/ruby-expert/.cursor/rules/tooling.mdc +52 -0
- package/templates/languages/ruby-expert/CLAUDE.md +102 -0
- package/templates/languages/rust-expert/.cursor/rules/concurrency.mdc +69 -0
- package/templates/languages/rust-expert/.cursor/rules/ecosystem-and-tooling.mdc +76 -0
- package/templates/languages/rust-expert/.cursor/rules/error-handling.mdc +76 -0
- package/templates/languages/rust-expert/.cursor/rules/overview.mdc +62 -0
- package/templates/languages/rust-expert/.cursor/rules/ownership-and-borrowing.mdc +70 -0
- package/templates/languages/rust-expert/.cursor/rules/performance-and-unsafe.mdc +70 -0
- package/templates/languages/rust-expert/.cursor/rules/testing.mdc +73 -0
- package/templates/languages/rust-expert/.cursor/rules/traits-and-generics.mdc +76 -0
- package/templates/languages/rust-expert/CLAUDE.md +283 -0
- package/templates/languages/swift-expert/.cursor/rules/concurrency.mdc +77 -0
- package/templates/languages/swift-expert/.cursor/rules/error-handling.mdc +76 -0
- package/templates/languages/swift-expert/.cursor/rules/language-features.mdc +78 -0
- package/templates/languages/swift-expert/.cursor/rules/overview.mdc +46 -0
- package/templates/languages/swift-expert/.cursor/rules/performance.mdc +69 -0
- package/templates/languages/swift-expert/.cursor/rules/swiftui.mdc +77 -0
- package/templates/languages/swift-expert/.cursor/rules/testing.mdc +75 -0
- package/templates/languages/swift-expert/.cursor/rules/tooling.mdc +77 -0
- package/templates/languages/swift-expert/CLAUDE.md +275 -0
- package/templates/professional/documentation/.cursor/rules/adr.mdc +65 -0
- package/templates/professional/documentation/.cursor/rules/api-documentation.mdc +64 -0
- package/templates/professional/documentation/.cursor/rules/code-comments.mdc +75 -0
- package/templates/professional/documentation/.cursor/rules/maintenance.mdc +58 -0
- package/templates/professional/documentation/.cursor/rules/overview.mdc +48 -0
- package/templates/professional/documentation/.cursor/rules/readme-standards.mdc +70 -0
- package/templates/professional/documentation/CLAUDE.md +120 -0
- package/templates/professional/executive-assistant/.cursor/rules/calendar.mdc +51 -0
- package/templates/professional/executive-assistant/.cursor/rules/confidentiality.mdc +53 -0
- package/templates/professional/executive-assistant/.cursor/rules/email.mdc +49 -0
- package/templates/professional/executive-assistant/.cursor/rules/meetings.mdc +39 -0
- package/templates/professional/executive-assistant/.cursor/rules/overview.mdc +42 -0
- package/templates/professional/executive-assistant/.cursor/rules/prioritization.mdc +48 -0
- package/templates/professional/executive-assistant/.cursor/rules/stakeholder-management.mdc +50 -0
- package/templates/professional/executive-assistant/.cursor/rules/travel.mdc +43 -0
- package/templates/professional/executive-assistant/CLAUDE.md +620 -0
- package/templates/professional/grant-writer/.cursor/rules/budgets.mdc +55 -0
- package/templates/professional/grant-writer/.cursor/rules/compliance.mdc +47 -0
- package/templates/professional/grant-writer/.cursor/rules/funding-research.mdc +47 -0
- package/templates/professional/grant-writer/.cursor/rules/narrative.mdc +58 -0
- package/templates/professional/grant-writer/.cursor/rules/overview.mdc +68 -0
- package/templates/professional/grant-writer/.cursor/rules/post-award.mdc +59 -0
- package/templates/professional/grant-writer/.cursor/rules/review-criteria.mdc +51 -0
- package/templates/professional/grant-writer/.cursor/rules/sustainability.mdc +48 -0
- package/templates/professional/grant-writer/CLAUDE.md +577 -0
- package/templates/professional/knowledge-synthesis/.cursor/rules/document-management.mdc +51 -0
- package/templates/professional/knowledge-synthesis/.cursor/rules/knowledge-graphs.mdc +63 -0
- package/templates/professional/knowledge-synthesis/.cursor/rules/overview.mdc +74 -0
- package/templates/professional/knowledge-synthesis/.cursor/rules/research-workflow.mdc +50 -0
- package/templates/professional/knowledge-synthesis/.cursor/rules/search-retrieval.mdc +62 -0
- package/templates/professional/knowledge-synthesis/.cursor/rules/summarization.mdc +61 -0
- package/templates/professional/knowledge-synthesis/CLAUDE.md +593 -0
- package/templates/professional/life-logistics/.cursor/rules/financial-optimization.mdc +78 -0
- package/templates/professional/life-logistics/.cursor/rules/negotiation.mdc +68 -0
- package/templates/professional/life-logistics/.cursor/rules/overview.mdc +75 -0
- package/templates/professional/life-logistics/.cursor/rules/research-methodology.mdc +76 -0
- package/templates/professional/life-logistics/.cursor/rules/scheduling.mdc +68 -0
- package/templates/professional/life-logistics/.cursor/rules/task-management.mdc +47 -0
- package/templates/professional/life-logistics/CLAUDE.md +601 -0
- package/templates/professional/research-assistant/.cursor/rules/citation-attribution.mdc +61 -0
- package/templates/professional/research-assistant/.cursor/rules/information-synthesis.mdc +65 -0
- package/templates/professional/research-assistant/.cursor/rules/overview.mdc +56 -0
- package/templates/professional/research-assistant/.cursor/rules/research-methodologies.mdc +54 -0
- package/templates/professional/research-assistant/.cursor/rules/search-strategies.mdc +57 -0
- package/templates/professional/research-assistant/.cursor/rules/source-evaluation.mdc +59 -0
- package/templates/professional/research-assistant/CLAUDE.md +318 -0
- package/templates/professional/wellness-orchestrator/.cursor/rules/adaptive-planning.mdc +69 -0
- package/templates/professional/wellness-orchestrator/.cursor/rules/data-integration.mdc +60 -0
- package/templates/professional/wellness-orchestrator/.cursor/rules/fitness-programming.mdc +66 -0
- package/templates/professional/wellness-orchestrator/.cursor/rules/nutrition-planning.mdc +57 -0
- package/templates/professional/wellness-orchestrator/.cursor/rules/overview.mdc +76 -0
- package/templates/professional/wellness-orchestrator/.cursor/rules/sleep-optimization.mdc +68 -0
- package/templates/professional/wellness-orchestrator/CLAUDE.md +573 -0
@@ -0,0 +1,78 @@
+---
+description: Data Quality
+alwaysApply: false
+---
+
+# Data Quality
+
+Patterns for validating, monitoring, and ensuring data quality.
|
|
9
|
+
|
|
10
|
+
## Quality Dimensions
|
|
11
|
+
|
|
12
|
+
- **Completeness** — required data is present (no unexpected nulls)
|
|
13
|
+
- **Accuracy** — data reflects reality (values in expected ranges)
|
|
14
|
+
- **Consistency** — data agrees across systems (totals match line items)
|
|
15
|
+
- **Timeliness** — data is fresh enough (within SLA)
|
|
16
|
+
- **Uniqueness** — no unwanted duplicates (primary keys are unique)
|
|
17
|
+
- **Validity** — data conforms to business rules (email matches regex)
|
|
18
|
+
|
|
## Validation Patterns

- **Schema validation**: enforce expected column names, types, and nullability before processing
- **Null checks**: assert zero nulls in required columns; fail the pipeline on violation
- **Uniqueness checks**: `groupBy(keys).count().filter("count > 1")` must be empty
- **Range checks**: numeric values within `[min, max]`; flag or reject out-of-range values
- **Referential integrity**: left-anti join to find orphan foreign keys

```python
from pyspark.sql import functions as F

def check_required_columns(df, required):
    # DataQualityError is assumed to be a project-defined exception
    for col in required:
        if df.filter(F.col(col).isNull()).count() > 0:
            raise DataQualityError(f"Nulls in required column: {col}")
    return df
```

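The uniqueness, range, and referential-integrity checks listed above can also be sketched without Spark. The following is a minimal plain-Python illustration, not the template's implementation; the function names `duplicate_keys`, `orphan_keys`, and `out_of_range` and the list-of-dicts row shape are assumptions made for this example.

```python
from collections import Counter


def duplicate_keys(rows, keys):
    """Key tuples that occur more than once (must be empty for a unique key)."""
    counts = Counter(tuple(row[k] for k in keys) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


def orphan_keys(child_rows, parent_rows, fk, pk):
    """Foreign-key values in the child with no matching parent (an anti-join)."""
    parent_ids = {row[pk] for row in parent_rows}
    return sorted({row[fk] for row in child_rows if row[fk] not in parent_ids})


def out_of_range(rows, column, lo, hi):
    """Rows whose value falls outside the closed interval [lo, hi]."""
    return [row for row in rows if not lo <= row[column] <= hi]
```

Each function returns the offending keys or rows rather than a boolean, so the caller can decide whether to fail the pipeline or only log the violation.
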
## Severity Levels

- **Critical**: pipeline must fail (null PKs, duplicates on unique keys)
- **Warning**: log and continue (invalid emails, future dates); emit metrics for trending

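One way to wire the two severity levels into a pipeline is sketched below. This is an illustration under stated assumptions: `Severity`, `CheckResult`, `enforce`, and the locally defined `DataQualityError` are hypothetical names for this example, and metric emission is reduced to a log call.

```python
import logging
from dataclasses import dataclass
from enum import Enum

logger = logging.getLogger("data_quality")


class Severity(Enum):
    CRITICAL = "critical"
    WARNING = "warning"


@dataclass
class CheckResult:
    name: str
    passed: bool
    severity: Severity


class DataQualityError(Exception):
    """Raised when at least one critical check fails."""


def enforce(results):
    """Log warning-level failures, raise on critical ones; return warning names."""
    warnings = [r.name for r in results if not r.passed and r.severity is Severity.WARNING]
    critical = [r.name for r in results if not r.passed and r.severity is Severity.CRITICAL]
    for name in warnings:
        logger.warning("data quality warning: %s", name)
    if critical:
        raise DataQualityError(f"Critical checks failed: {', '.join(critical)}")
    return warnings
```

Warnings are logged before the critical check raises, so a failing run still records every violation it saw.
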
## Freshness Monitoring

- Define SLA per table (e.g., orders: 2h, daily_sales: 24h)
- Compare `MAX(timestamp_column)` against current time
- Alert on SLA breaches with severity matching business impact

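The freshness comparison above could look roughly like the sketch below. The `FRESHNESS_SLA` mapping and the `freshness_breach` function are hypothetical names; the two SLA values mirror the examples in the list, and `max_timestamp` is assumed to be the already-fetched `MAX(timestamp_column)` as a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical per-table SLAs, mirroring the examples in the list above
FRESHNESS_SLA = {
    "orders": timedelta(hours=2),
    "daily_sales": timedelta(hours=24),
}


def freshness_breach(table, max_timestamp, now=None):
    """Return how far past its SLA a table is, or None if it is fresh."""
    now = now or datetime.now(timezone.utc)
    lag = now - max_timestamp
    sla = FRESHNESS_SLA[table]
    return lag - sla if lag > sla else None
```

Returning the size of the overrun, rather than a boolean, lets the alerting layer scale severity with how stale the table is.
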
## Anomaly Detection

- **Volume**: compare today's record count against rolling 30-day mean ± 3 std devs
- **Distribution drift**: detect shifts in categorical value distributions (KL divergence)
- Zero-count partitions or sudden spikes both warrant alerts

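Both checks above reduce to a few lines of arithmetic. The sketch below is a plain-Python illustration, assuming the daily counts and per-category counts have already been collected; `volume_is_anomalous`, `kl_divergence`, and the `eps` smoothing constant are choices made for this example, not part of the template.

```python
import math
from statistics import mean, stdev


def volume_is_anomalous(history, today, n_sigmas=3.0):
    """True if today's count lies outside mean ± n_sigmas * std of the history."""
    mu, sigma = mean(history), stdev(history)
    return abs(today - mu) > n_sigmas * sigma


def kl_divergence(baseline, current, eps=1e-9):
    """KL(current || baseline) over category counts; eps smooths missing categories."""
    cats = set(baseline) | set(current)
    b_total = sum(baseline.values()) + eps * len(cats)
    c_total = sum(current.values()) + eps * len(cats)
    kl = 0.0
    for cat in cats:
        p = (current.get(cat, 0) + eps) / c_total
        q = (baseline.get(cat, 0) + eps) / b_total
        kl += p * math.log(p / q)
    return kl
```

A zero-count partition is caught by the volume check as long as the history itself is not near zero; the drift threshold for `kl_divergence` has to be tuned per column.
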
## Quarantine Pattern

Separate good and bad records; route failures to a quarantine table for investigation.

```python
from pyspark.sql import functions as F

good = df.filter("order_id IS NOT NULL AND total_amount >= 0")
# exceptAll keeps duplicate bad rows; subtract() would collapse them (EXCEPT DISTINCT)
bad = df.exceptAll(good)
if bad.count() > 0:
    bad.withColumn("_reason", F.lit("validation_failed")) \
        .write.mode("append").saveAsTable("quarantine.orders")
```

## DBT Tests

```yaml
columns:
  - name: order_id
    tests: [not_null, unique]
  - name: status
    tests: [{accepted_values: {values: [pending, confirmed, shipped]}}]
```

## Anti-Patterns

- Silently dropping bad records without logging or quarantine
- Only validating in development; skipping checks in production
- No freshness monitoring: stale data is served silently
@@ -0,0 +1,49 @@
---
description: Data Engineering
alwaysApply: false
---

# Data Engineering

Guidelines for building robust, scalable data platforms and pipelines.

## Scope

- Batch pipelines (ETL/ELT) and stream processing
- Data warehouses, lakehouses, analytics engineering (DBT)
- Data quality and observability systems

## Key Principles

- **Idempotency is non-negotiable**: every pipeline must produce identical results on re-run
- **Data quality is a feature**: validate early, monitor continuously, alert proactively
- **Schema is a contract**: breaking changes require coordination and versioning
- **Observability over debugging**: instrument everything; never debug production with ad-hoc queries
- **Cost-aware engineering**: compute and storage have real costs; optimize deliberately

## Data Architecture Layers

- **Bronze/Raw**: exact copy of source, append-only (freshness: minutes to hours)
- **Silver/Curated**: validated, typed, deduplicated (freshness: hours)
- **Gold/Marts**: business-ready aggregates (freshness: hours to daily)

## Pipeline Design Essentials

- Use delete-then-insert or merge for idempotency; never append blindly
- Use `execution_date`, not `current_timestamp()`, for determinism
- Parameterize source/target tables and dates for testability
- Support backfill via parameterized execution dates
- Implement dead letter queues for invalid records

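The delete-then-insert pattern can be demonstrated end to end with an in-memory SQLite database. This is a sketch under stated assumptions: the `orders` table, its columns, and the `load_partition` helper are hypothetical, and the table name is fixed here only to keep the example short.

```python
import sqlite3


def load_partition(conn, execution_date, rows):
    """Replace one date partition atomically so re-runs do not duplicate rows."""
    with conn:  # commits on success, rolls back on exception
        conn.execute("DELETE FROM orders WHERE order_date = ?", (execution_date,))
        conn.executemany(
            "INSERT INTO orders (order_id, order_date, total) VALUES (?, ?, ?)",
            [(order_id, execution_date, total) for order_id, total in rows],
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id TEXT, order_date TEXT, total REAL)")
rows = [("o1", 10.0), ("o2", 25.5)]
for _ in range(2):  # simulate a re-run of the same execution date
    load_partition(conn, "2024-01-15", rows)
row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

Because the delete and the insert share one transaction and the partition key comes from the `execution_date` parameter, running the load twice leaves the same two rows in place.
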
## Anti-Patterns

- Appending without clearing the target partition (creates duplicates on re-run)
- Processing only today's data without a late-arrival window
- Hardcoded table names and connection strings in pipeline code

## Definition of Done

- [ ] Idempotency verified (re-run produces same result)
- [ ] Data quality checks implemented and passing
- [ ] Schema documented; monitoring and alerting configured
- [ ] Cost estimate understood
@@ -0,0 +1,71 @@
---
description: Performance Optimization
alwaysApply: false
---

# Performance Optimization

Patterns for efficient, cost-effective data pipelines.

## Partitioning

- Partition by columns used in most query filters (usually date)
- Target 100K–10M records per partition; avoid over-partitioning
- Compact small files: `OPTIMIZE table` and set `maxRecordsPerFile`

```python
# Good: single partition column matching query patterns
df.write.partitionBy("order_date").saveAsTable("curated.orders")

# Bad: over-partitioning creates millions of tiny files
df.write.partitionBy("order_date", "customer_id", "product_id").saveAsTable(t)
```

## Query Optimization

- **Predicate pushdown**: filter on native columns; UDFs block pushdown (full scan)
- **Column pruning**: `.select()` needed columns before transforms, not after
- **Broadcast joins**: `broadcast(small_df)` for dimension tables (<100MB)
- **Aggregate before join**: reduce shuffle by aggregating first, joining second
- **Avoid unnecessary sorts**: only `orderBy` when needed; `limit` first when possible

```python
# Good: native filter pushes to storage
df.filter(F.col("order_date") > F.date_sub(F.current_date(), 7))

# Bad: UDF blocks pushdown, causes full scan
df.filter(is_recent_udf(F.col("order_date")))
```

## Caching

- Cache intermediate DataFrames reused multiple times
- Always `unpersist()` in a `finally` block
- Use `MEMORY_AND_DISK` for large datasets that may spill

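The `finally` rule above can be packaged as a context manager so callers cannot forget the cleanup. This is a minimal sketch: `cached` is a hypothetical helper that relies only on the object exposing `persist()` and `unpersist()`, which PySpark DataFrames do; choosing a storage level is left out.

```python
from contextlib import contextmanager


@contextmanager
def cached(df):
    """Persist a DataFrame-like object and always release it afterwards."""
    df.persist()
    try:
        yield df
    finally:
        # Runs on normal exit and when the body raises
        df.unpersist()
```

Used as `with cached(orders) as o: ...`, every reuse of `o` inside the block reads from the cache, and the cache is released even if one of the downstream actions fails.
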
## Shuffle Optimization

- Aggregate before join to reduce shuffle volume
- `coalesce(n)` before write (no shuffle); `repartition(n)` only when increasing parallelism
- Tune `spark.sql.shuffle.partitions` (default 200); on Spark 3.0+, adaptive query execution (`spark.sql.adaptive.enabled`) can coalesce shuffle partitions at runtime, while the `"auto"` value is a Databricks-specific setting

## Z-Ordering (Delta Lake)

Co-locate data for multi-column filter queries:

```sql
OPTIMIZE curated.orders ZORDER BY (customer_id, product_id)
```

## Cost Management

- Use incremental processing; avoid full-refresh when unnecessary
- Apply data retention: `DELETE` + `VACUUM` old partitions
- Monitor pipeline compute costs and bytes scanned weekly
- Prefer `zstd` compression for better ratio

## Anti-Patterns

- Filtering on derived columns that prevent partition pruning
- Reading all columns then selecting late in the pipeline
- Caching without cleanup: memory leaks across jobs
@@ -0,0 +1,79 @@
---
description: Pipeline Design
alwaysApply: false
---

# Pipeline Design

Patterns for building reliable data pipelines.

## Core Principles

- **Idempotency**: same inputs, same outputs, every time
- **Determinism**: use `execution_date`, not `current_timestamp()`; seed random operations
- **Atomicity**: all-or-nothing writes; use Delta/Iceberg transactions or staging+swap

## Pipeline Patterns

### Batch Full Refresh
Use when the source is small or doesn't support incremental loads.

```python
df = spark.read.table(source)
df.write.mode("overwrite").saveAsTable(target)
```

### Batch Incremental (Watermark)
Use when data is large and the source supports change tracking.

```python
from delta.tables import DeltaTable
from pyspark.sql import functions as F

# get_watermark is assumed to be a project helper returning the max loaded wm_col value
last_wm = get_watermark(target, wm_col)
new = spark.read.table(source).filter(F.col(wm_col) > last_wm)
# Upsert by key so a re-run updates rows instead of duplicating them
DeltaTable.forName(spark, target).alias("t") \
    .merge(new.alias("s"), "t.id = s.id") \
    .whenMatchedUpdateAll().whenNotMatchedInsertAll().execute()
```

### CDC (Change Data Capture)
Use when tracking all changes for point-in-time queries.

```python
from delta.tables import DeltaTable

# Assumes cdc_events holds at most one (the latest) event per id
DeltaTable.forName(spark, target).alias("t") \
    .merge(cdc_events.alias("s"), "t.id = s.id") \
    .whenMatchedDelete(condition="s.op = 'DELETE'") \
    .whenMatchedUpdateAll(condition="s.op = 'UPDATE'") \
    .whenNotMatchedInsertAll(condition="s.op = 'INSERT'").execute()
```

### Streaming
Use when low latency is required from event streams.

```python
# brokers, topic, ckpt, and target are pipeline parameters; process maps DataFrame -> DataFrame
spark.readStream.format("kafka") \
    .option("kafka.bootstrap.servers", brokers) \
    .option("subscribe", topic).load() \
    .transform(process) \
    .writeStream.format("delta") \
    .option("checkpointLocation", ckpt).toTable(target)
```

## Late Data Handling

- **Batch**: reprocess a rolling window (e.g., last 3 days) to catch late arrivals
- **Streaming**: use `.withWatermark("event_time", "1 hour")` to define late-data tolerance

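For the batch case, the rolling window is just a list of partition dates derived from the execution date. The sketch below assumes that "last 3 days" includes the execution date itself; `reprocess_window` and `lookback_days` are hypothetical names for this example.

```python
from datetime import date, timedelta


def reprocess_window(execution_date, lookback_days=3):
    """Dates to reprocess, oldest first, ending at the execution date (inclusive)."""
    return [
        execution_date - timedelta(days=offset)
        for offset in range(lookback_days - 1, -1, -1)
    ]
```

Each returned date would then be reloaded with an idempotent partition overwrite, so records that arrived late for an earlier day are picked up without creating duplicates.
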
## Orchestration

- Design DAGs with explicit data dependencies between tasks
- Retry with exponential backoff: 3 retries, 5min initial, 1hr max
- Support backfill via parameterized `execution_date`

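The retry policy above (3 retries, 5 minutes initial, 1 hour cap) can be written out as follows. This is a sketch, not an orchestrator API: `backoff_delays` and `run_with_retries` are hypothetical helpers, the doubling factor of 2 is an assumption, and most orchestrators expose equivalent settings directly.

```python
import time


def backoff_delays(retries=3, initial_s=300, max_s=3600, factor=2):
    """Delay in seconds before each retry: initial * factor**attempt, capped at max_s."""
    return [min(initial_s * factor ** attempt, max_s) for attempt in range(retries)]


def run_with_retries(task, retries=3, sleep=time.sleep, **backoff):
    """Run task(); on failure wait and retry, re-raising after the last retry."""
    delays = backoff_delays(retries=retries, **backoff)
    for attempt in range(retries + 1):
        try:
            return task()
        except Exception:
            if attempt == retries:
                raise
            sleep(delays[attempt])
```

Passing `sleep` as a parameter keeps the helper testable: a test can record the requested delays instead of actually waiting.
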
## Error Handling

- **Fail fast**: validate critical assumptions (nulls, empty input) before transforms
- **Dead letter queue**: route invalid records to a DLQ table for investigation
- Log record counts at each stage; emit metrics for monitoring

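The three practices above fit together in a single validation stage. The following plain-Python sketch assumes records are dictionaries with `order_id` and `total_amount` fields; `validate`, `split_records`, and the `_reason` field are illustrative names, and writing the dead-letter list to a DLQ table is left to the caller.

```python
import logging

logger = logging.getLogger("pipeline")


def validate(record):
    """Return a rejection reason, or None if the record is valid."""
    if record.get("order_id") is None:
        return "missing order_id"
    if record.get("total_amount") is None or record["total_amount"] < 0:
        return "invalid total_amount"
    return None


def split_records(records):
    """Fail fast on empty input, then split into (valid, dead_letter) lists."""
    if not records:
        raise ValueError("empty input batch")
    valid, dead_letter = [], []
    for record in records:
        reason = validate(record)
        if reason is None:
            valid.append(record)
        else:
            dead_letter.append({**record, "_reason": reason})
    logger.info(
        "input=%d valid=%d dead_letter=%d",
        len(records), len(valid), len(dead_letter),
    )
    return valid, dead_letter
```

Recording a specific reason per rejected record makes the DLQ far easier to triage than a single generic failure flag.
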
## Anti-Patterns

- Blind append without partition clearing (duplicates on re-run)
- Hidden dependencies via side effects instead of explicit table reads
- Hardcoded table names instead of parameterized source/target
@@ -0,0 +1,79 @@
---
description: Security & Governance
alwaysApply: false
---

# Security & Governance

Patterns for securing data and maintaining compliance.

## Data Classification

- **Public**: shareable externally; no special controls
- **Internal**: business data; requires authentication
- **Confidential**: sensitive business data; encryption + access control
- **Restricted**: PII/regulated data; encryption, masking, audit logging required

Tag tables with classification metadata (`data_classification`, `contains_pii`, `data_owner`).

## PII Handling

Strategies by sensitivity:
- **Hash** (SHA-256): one-way, for matching without exposing values
- **Encrypt** (Fernet): reversible, for authorized access
- **Mask**: partial display (`jo****@email.com`)
- **Redact**: replace with `[REDACTED]`

```python
from pyspark.sql import functions as F

def mask_pii(df, strategy):
    for col, method in strategy.items():
        if col not in df.columns:
            continue
        if method == "hash":
            # sha2 expects string/binary input, so cast first
            df = df.withColumn(col, F.sha2(F.col(col).cast("string"), 256))
        elif method == "redact":
            df = df.withColumn(col, F.lit("[REDACTED]"))
    return df
```

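The function above covers hashing and redaction; the masking strategy, and a keyed variant of hashing, can be sketched in plain Python as follows. `mask_email` and `pseudonymize` are hypothetical helpers. Note that `pseudonymize` swaps in HMAC-SHA-256 with a secret key instead of a bare SHA-256, an assumption made here because unkeyed hashes of low-entropy values such as emails can be reversed by guessing.

```python
import hashlib
import hmac


def mask_email(email, visible=2):
    """Keep the first `visible` characters of the local part; mask the rest."""
    local, sep, domain = email.partition("@")
    if not sep:
        return "[REDACTED]"  # not an email address: fall back to redaction
    # A fixed-width mask avoids leaking the length of the local part
    return local[:visible] + "*" * 4 + "@" + domain


def pseudonymize(value, key):
    """Deterministic keyed hash: equal inputs still match, but the key is required."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
```

The key passed to `pseudonymize` should come from a secrets manager rather than from code, in line with the secrets guidance in this file.
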
## Access Control

- **RBAC**: create roles (analyst, engineer, admin); grant minimal permissions per role
- **Column-level**: create masked views for non-privileged users; grant view access, not table access
- **Row-level**: filter rows in views by user attributes (e.g., region)
- **Least privilege**: default to restricted; require approval for elevation

```sql
-- Analysts get masked view, not raw table
GRANT SELECT ON curated.customers_masked TO data_analyst;
```

## Audit Logging

- Log all data access: user, table, operation, row count, timestamp
- Log schema changes: before/after schema, change type, user
- Append-only audit tables; never delete audit records
- Review access patterns for anomalies (unusual volume, off-hours access)

## Data Retention

- Define retention policy per layer (raw: 90d, curated: 3yr, audit: 7yr)
- Archive before delete when required
- Support GDPR right-to-erasure: delete by `customer_id` across all tables, log deletion

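A per-layer retention policy boils down to computing a cutoff date and selecting the partitions older than it. The sketch below uses the example durations from the list above, approximating years as 365 days; `RETENTION`, `retention_cutoff`, and `expired_partitions` are hypothetical names, and the actual delete or archive step is out of scope.

```python
from datetime import date, timedelta

# Example policy from the list above; 3yr and 7yr are approximated in days
RETENTION = {
    "raw": timedelta(days=90),
    "curated": timedelta(days=3 * 365),
    "audit": timedelta(days=7 * 365),
}


def retention_cutoff(layer, today):
    """Partitions dated strictly before the returned date are eligible for deletion."""
    return today - RETENTION[layer]


def expired_partitions(layer, partition_dates, today):
    """Sorted partition dates that have aged out of the layer's retention period."""
    cutoff = retention_cutoff(layer, today)
    return sorted(d for d in partition_dates if d < cutoff)
```

Passing `today` explicitly, rather than reading the clock inside the function, keeps the retention job deterministic and easy to backfill or test.
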
## Secrets Management

- Never hardcode credentials; use environment variables or secrets managers
- Support credential rotation without downtime
- Re-encrypt data when rotating encryption keys

```python
import os

# Bad: hardcoded
conn = "postgresql://user:pass@host/db"

# Good: from environment or secrets manager
conn = f"postgresql://{os.environ['DB_USER']}:{os.environ['DB_PASS']}@{os.environ['DB_HOST']}/db"
```

## Anti-Patterns

- Granting `ALL PRIVILEGES` to broad groups instead of minimal role-based access
- Storing PII unencrypted in bronze/raw layer without masking downstream
- No audit logging on sensitive table access
@@ -0,0 +1,75 @@
---
description: Data Pipeline Testing
alwaysApply: false
---

# Data Pipeline Testing

Strategies for testing data pipelines effectively.

## Testing Pyramid

- **Unit**: single transformation, fast, test logic in isolation
- **Integration**: end-to-end pipeline flow, medium speed
- **Contract**: schema/interface compatibility, prevent breaking changes
- **Data Quality**: validate production data, slowest

## Unit Tests

Test individual transformations with small, controlled DataFrames.

```python
from decimal import Decimal

def test_calculate_order_total(spark):
    # spark is assumed to be a pytest fixture providing a SparkSession
    df = spark.createDataFrame([{"order_id": "1", "quantity": 2, "unit_price": Decimal("10.00")}])
    assert calculate_order_total(df).collect()[0]["total"] == Decimal("20.00")
```

## Integration Tests

Test full pipeline flows in isolated test databases. Use fixtures to create/drop DBs.

```python
from datetime import date

def test_idempotency(spark, test_db):
    # Run the same execution date twice; the row count must not change
    for _ in range(2):
        run_pipeline(source=f"{test_db}.raw", target=f"{test_db}.curated", date=date(2024, 1, 15))
    assert spark.table(f"{test_db}.curated").count() == 3  # Not doubled
```

## Contract Tests

Ensure schema backward compatibility for downstream consumers.

```python
from pyspark.sql.types import DecimalType, StringType

def test_schema_backward_compatible(spark):
    schema = spark.table("curated.orders").schema
    required = {"order_id": StringType(), "total_amount": DecimalType(12, 2)}
    for col, expected_type in required.items():
        assert col in schema.fieldNames()
        assert schema[col].dataType == expected_type
```

## DBT Tests

```yaml
models:
  - name: orders
    columns:
      - name: order_id
        tests: [not_null, unique]
      - name: customer_id
        tests: [not_null, {relationships: {to: ref('customers'), field: customer_id}}]
```

## Best Practices

- Test behavior, not implementation (test WHAT, not HOW)
- Each test sets up its own data; no shared mutable state
- Use descriptive names: `test_calculate_total_handles_zero_quantity`
- Test edge cases: nulls, empty inputs, duplicates, large values
- Use `assert_df_equality` (chispa) for DataFrame comparisons

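The naming and edge-case advice above can be illustrated with a few self-contained tests. The `calculate_total` function here is a hypothetical stand-in for a real transformation, written in plain Python so the tests run without Spark; each test builds its own inputs and asserts only on observable behavior.

```python
from decimal import Decimal


def calculate_total(quantity, unit_price):
    """Hypothetical function under test: line total as a Decimal, None if inputs are missing."""
    if quantity is None or unit_price is None:
        return None
    return Decimal(quantity) * unit_price


def test_calculate_total_handles_zero_quantity():
    assert calculate_total(0, Decimal("10.00")) == Decimal("0.00")


def test_calculate_total_handles_null_price():
    assert calculate_total(2, None) is None


def test_calculate_total_handles_large_values():
    assert calculate_total(10**9, Decimal("0.01")) == Decimal("10000000.00")
```

Each test name states the scenario it covers, so a failure report reads as a description of the broken behavior.
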
## Anti-Patterns

- Only testing the happy path; skipping null/empty/duplicate cases
- Tests depending on shared state or execution order
- Testing implementation details (e.g., checking query plans)