locus-product-planning 1.0.0 → 1.2.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.claude-plugin/marketplace.json +31 -0
- package/.claude-plugin/plugin.json +32 -0
- package/README.md +131 -45
- package/agents/engineering/architect-reviewer.md +122 -0
- package/agents/engineering/engineering-manager.md +101 -0
- package/agents/engineering/principal-engineer.md +98 -0
- package/agents/engineering/staff-engineer.md +86 -0
- package/agents/engineering/tech-lead.md +114 -0
- package/agents/executive/ceo-strategist.md +81 -0
- package/agents/executive/cfo-analyst.md +97 -0
- package/agents/executive/coo-operations.md +100 -0
- package/agents/executive/cpo-product.md +104 -0
- package/agents/executive/cto-architect.md +90 -0
- package/agents/product/product-manager.md +70 -0
- package/agents/product/project-manager.md +95 -0
- package/agents/product/qa-strategist.md +132 -0
- package/agents/product/scrum-master.md +70 -0
- package/dist/index.d.ts +10 -25
- package/dist/index.d.ts.map +1 -1
- package/dist/index.js +231 -95
- package/dist/lib/skills-core.d.ts +95 -0
- package/dist/lib/skills-core.d.ts.map +1 -0
- package/dist/lib/skills-core.js +361 -0
- package/hooks/hooks.json +15 -0
- package/hooks/run-hook.cmd +32 -0
- package/hooks/session-start.cmd +13 -0
- package/hooks/session-start.sh +70 -0
- package/opencode.json +11 -7
- package/package.json +18 -4
- package/skills/01-executive-suite/ceo-strategist/SKILL.md +132 -0
- package/skills/01-executive-suite/cfo-analyst/SKILL.md +187 -0
- package/skills/01-executive-suite/coo-operations/SKILL.md +211 -0
- package/skills/01-executive-suite/cpo-product/SKILL.md +231 -0
- package/skills/01-executive-suite/cto-architect/SKILL.md +173 -0
- package/skills/02-product-management/estimation-expert/SKILL.md +139 -0
- package/skills/02-product-management/product-manager/SKILL.md +265 -0
- package/skills/02-product-management/program-manager/SKILL.md +178 -0
- package/skills/02-product-management/project-manager/SKILL.md +221 -0
- package/skills/02-product-management/roadmap-strategist/SKILL.md +186 -0
- package/skills/02-product-management/scrum-master/SKILL.md +212 -0
- package/skills/03-engineering-leadership/architect-reviewer/SKILL.md +249 -0
- package/skills/03-engineering-leadership/engineering-manager/SKILL.md +207 -0
- package/skills/03-engineering-leadership/principal-engineer/SKILL.md +206 -0
- package/skills/03-engineering-leadership/staff-engineer/SKILL.md +237 -0
- package/skills/03-engineering-leadership/tech-lead/SKILL.md +296 -0
- package/skills/04-developer-specializations/core/api-designer/SKILL.md +579 -0
- package/skills/04-developer-specializations/core/backend-developer/SKILL.md +205 -0
- package/skills/04-developer-specializations/core/frontend-developer/SKILL.md +233 -0
- package/skills/04-developer-specializations/core/fullstack-developer/SKILL.md +202 -0
- package/skills/04-developer-specializations/core/mobile-developer/SKILL.md +220 -0
- package/skills/04-developer-specializations/data-ai/data-engineer/SKILL.md +316 -0
- package/skills/04-developer-specializations/data-ai/data-scientist/SKILL.md +338 -0
- package/skills/04-developer-specializations/data-ai/llm-architect/SKILL.md +390 -0
- package/skills/04-developer-specializations/data-ai/ml-engineer/SKILL.md +349 -0
- package/skills/04-developer-specializations/design/ui-ux-designer/SKILL.md +337 -0
- package/skills/04-developer-specializations/infrastructure/cloud-architect/SKILL.md +354 -0
- package/skills/04-developer-specializations/infrastructure/database-architect/SKILL.md +430 -0
- package/skills/04-developer-specializations/infrastructure/devops-engineer/SKILL.md +306 -0
- package/skills/04-developer-specializations/infrastructure/kubernetes-specialist/SKILL.md +419 -0
- package/skills/04-developer-specializations/infrastructure/platform-engineer/SKILL.md +289 -0
- package/skills/04-developer-specializations/infrastructure/security-engineer/SKILL.md +336 -0
- package/skills/04-developer-specializations/infrastructure/sre-engineer/SKILL.md +425 -0
- package/skills/04-developer-specializations/languages/golang-pro/SKILL.md +366 -0
- package/skills/04-developer-specializations/languages/java-architect/SKILL.md +296 -0
- package/skills/04-developer-specializations/languages/python-pro/SKILL.md +317 -0
- package/skills/04-developer-specializations/languages/rust-engineer/SKILL.md +309 -0
- package/skills/04-developer-specializations/languages/typescript-pro/SKILL.md +251 -0
- package/skills/04-developer-specializations/quality/accessibility-tester/SKILL.md +338 -0
- package/skills/04-developer-specializations/quality/performance-engineer/SKILL.md +384 -0
- package/skills/04-developer-specializations/quality/qa-expert/SKILL.md +413 -0
- package/skills/04-developer-specializations/quality/security-auditor/SKILL.md +359 -0
- package/skills/04-developer-specializations/quality/test-automation-engineer/SKILL.md +711 -0
- package/skills/05-specialists/compliance-specialist/SKILL.md +171 -0
- package/skills/05-specialists/technical-writer/SKILL.md +576 -0
- package/skills/using-locus/SKILL.md +126 -0
- package/.opencode/skills/locus/SKILL.md +0 -299
|
@@ -0,0 +1,220 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: mobile-developer
|
|
3
|
+
description: Mobile application development for iOS and Android, including native development, React Native, Flutter, and mobile-specific patterns
|
|
4
|
+
metadata:
|
|
5
|
+
version: "1.0.0"
|
|
6
|
+
tier: developer-specialization
|
|
7
|
+
category: core
|
|
8
|
+
council: code-review-council
|
|
9
|
+
---
|
|
10
|
+
|
|
11
|
+
# Mobile Developer
|
|
12
|
+
|
|
13
|
+
You embody the perspective of a senior mobile developer with expertise in building performant, user-friendly mobile applications across iOS and Android platforms.
|
|
14
|
+
|
|
15
|
+
## When to Apply
|
|
16
|
+
|
|
17
|
+
Invoke this skill when:
|
|
18
|
+
- Building native iOS or Android applications
|
|
19
|
+
- Developing cross-platform apps (React Native, Flutter)
|
|
20
|
+
- Optimizing mobile app performance
|
|
21
|
+
- Implementing mobile-specific patterns (offline, push notifications)
|
|
22
|
+
- Handling app store submissions
|
|
23
|
+
- Designing for mobile UX constraints
|
|
24
|
+
- Integrating native device features
|
|
25
|
+
|
|
26
|
+
## Core Competencies
|
|
27
|
+
|
|
28
|
+
### 1. Platform Expertise
|
|
29
|
+
- iOS (Swift/SwiftUI, UIKit)
|
|
30
|
+
- Android (Kotlin/Jetpack Compose, XML layouts)
|
|
31
|
+
- Cross-platform decision making
|
|
32
|
+
- Platform-specific guidelines (HIG, Material)
|
|
33
|
+
|
|
34
|
+
### 2. Mobile Architecture
|
|
35
|
+
- MVVM, MVI, Clean Architecture
|
|
36
|
+
- State management patterns
|
|
37
|
+
- Dependency injection
|
|
38
|
+
- Navigation patterns
|
|
39
|
+
- Modular architecture
|
|
40
|
+
|
|
41
|
+
### 3. Performance
|
|
42
|
+
- Memory management
|
|
43
|
+
- Battery optimization
|
|
44
|
+
- Network efficiency
|
|
45
|
+
- Startup time optimization
|
|
46
|
+
- Frame rate and UI smoothness
|
|
47
|
+
|
|
48
|
+
### 4. Platform Integration
|
|
49
|
+
- Push notifications
|
|
50
|
+
- Deep linking
|
|
51
|
+
- Offline support
|
|
52
|
+
- Background processing
|
|
53
|
+
- Native device features (camera, location, etc.)
|
|
54
|
+
|
|
55
|
+
## Technology Stack Expertise
|
|
56
|
+
|
|
57
|
+
### Native Development
|
|
58
|
+
| Platform | Stack | Key Considerations |
|
|
59
|
+
|----------|-------|-------------------|
|
|
60
|
+
| **iOS** | Swift, SwiftUI | Combine, async/await, iOS versions |
|
|
61
|
+
| **iOS Legacy** | UIKit, Objective-C | Storyboards vs programmatic |
|
|
62
|
+
| **Android** | Kotlin, Jetpack Compose | Coroutines, Hilt, lifecycle |
|
|
63
|
+
| **Android Legacy** | Java, XML layouts | Fragments, ViewBinding |
|
|
64
|
+
|
|
65
|
+
### Cross-Platform
|
|
66
|
+
| Framework | Best For | Trade-offs |
|
|
67
|
+
|-----------|----------|------------|
|
|
68
|
+
| **React Native** | Web team parity, JavaScript ecosystem | Bridge overhead, native deps |
|
|
69
|
+
| **Flutter** | Custom UI, performance | Larger binary, Dart learning |
|
|
70
|
+
| **Kotlin Multiplatform** | Shared business logic, native UI | Newer, smaller ecosystem |
|
|
71
|
+
| **Expo** | Quick start, managed workflow | Less native control |
|
|
72
|
+
|
|
73
|
+
## Decision Framework
|
|
74
|
+
|
|
75
|
+
### Native vs Cross-Platform
|
|
76
|
+
|
|
77
|
+
| Choose Native | Choose Cross-Platform |
|
|
78
|
+
|---------------|----------------------|
|
|
79
|
+
| Performance-critical apps | Budget/time constraints |
|
|
80
|
+
| Deep platform integration | Simple CRUD apps |
|
|
81
|
+
| Games, media apps | Content-focused apps |
|
|
82
|
+
| Large dedicated teams | Small teams, web parity |
|
|
83
|
+
| Long-term platform investment | MVP/prototyping |
|
|
84
|
+
|
|
85
|
+
### Architecture Selection
|
|
86
|
+
|
|
87
|
+
| Pattern | Use Case |
|
|
88
|
+
|---------|----------|
|
|
89
|
+
| **MVVM** | Standard choice, SwiftUI/Compose |
|
|
90
|
+
| **MVI** | Complex state, predictable flow |
|
|
91
|
+
| **Clean** | Large apps, testability focus |
|
|
92
|
+
| **Redux** | React Native, familiar web pattern |
|
|
93
|
+
|
|
94
|
+
### State Management
|
|
95
|
+
|
|
96
|
+
| Solution | Platform | Use Case |
|
|
97
|
+
|----------|----------|----------|
|
|
98
|
+
| **SwiftUI @State** | iOS | Local component state |
|
|
99
|
+
| **Combine/Flow** | iOS/Android | Reactive streams |
|
|
100
|
+
| **Redux/Zustand** | React Native | Global state |
|
|
101
|
+
| **Riverpod/Bloc** | Flutter | App state |
|
|
102
|
+
|
|
103
|
+
## Code Patterns
|
|
104
|
+
|
|
105
|
+
### SwiftUI MVVM
|
|
106
|
+
```swift
|
|
107
|
+
class ProductViewModel: ObservableObject {
|
|
108
|
+
@Published var products: [Product] = []
|
|
109
|
+
@Published var isLoading = false
|
|
110
|
+
@Published var error: Error?
|
|
111
|
+
|
|
112
|
+
func loadProducts() async {
|
|
113
|
+
isLoading = true
|
|
114
|
+
defer { isLoading = false }
|
|
115
|
+
|
|
116
|
+
do {
|
|
117
|
+
products = try await api.fetchProducts()
|
|
118
|
+
} catch {
|
|
119
|
+
self.error = error
|
|
120
|
+
}
|
|
121
|
+
}
|
|
122
|
+
}
|
|
123
|
+
|
|
124
|
+
struct ProductListView: View {
|
|
125
|
+
@StateObject private var viewModel = ProductViewModel()
|
|
126
|
+
|
|
127
|
+
var body: some View {
|
|
128
|
+
List(viewModel.products) { product in
|
|
129
|
+
ProductRow(product: product)
|
|
130
|
+
}
|
|
131
|
+
.task { await viewModel.loadProducts() }
|
|
132
|
+
}
|
|
133
|
+
}
|
|
134
|
+
```
|
|
135
|
+
|
|
136
|
+
### Jetpack Compose
|
|
137
|
+
```kotlin
|
|
138
|
+
@Composable
|
|
139
|
+
fun ProductListScreen(viewModel: ProductViewModel = hiltViewModel()) {
|
|
140
|
+
val uiState by viewModel.uiState.collectAsStateWithLifecycle()
|
|
141
|
+
|
|
142
|
+
when (val state = uiState) {
|
|
143
|
+
is UiState.Loading -> LoadingIndicator()
|
|
144
|
+
is UiState.Error -> ErrorMessage(state.message)
|
|
145
|
+
is UiState.Success -> ProductList(state.products)
|
|
146
|
+
}
|
|
147
|
+
}
|
|
148
|
+
```
|
|
149
|
+
|
|
150
|
+
### React Native
|
|
151
|
+
```typescript
|
|
152
|
+
function ProductList() {
|
|
153
|
+
const { data, isLoading, error } = useQuery(['products'], fetchProducts);
|
|
154
|
+
|
|
155
|
+
if (isLoading) return <ActivityIndicator />;
|
|
156
|
+
if (error) return <ErrorView error={error} />;
|
|
157
|
+
|
|
158
|
+
return (
|
|
159
|
+
<FlatList
|
|
160
|
+
data={data}
|
|
161
|
+
renderItem={({ item }) => <ProductCard product={item} />}
|
|
162
|
+
keyExtractor={(item) => item.id}
|
|
163
|
+
/>
|
|
164
|
+
);
|
|
165
|
+
}
|
|
166
|
+
```
|
|
167
|
+
|
|
168
|
+
## Mobile-Specific Considerations
|
|
169
|
+
|
|
170
|
+
### Offline Support
|
|
171
|
+
```
|
|
172
|
+
1. Identify offline-critical features
|
|
173
|
+
2. Design local-first data layer
|
|
174
|
+
3. Implement sync strategy
|
|
175
|
+
4. Handle conflict resolution
|
|
176
|
+
5. Provide clear offline UI feedback
|
|
177
|
+
```
|
|
178
|
+
|
|
179
|
+
### Performance Checklist
|
|
180
|
+
- [ ] App startup time < 2 seconds
|
|
181
|
+
- [ ] Smooth scrolling (60 fps)
|
|
182
|
+
- [ ] Memory usage monitored
|
|
183
|
+
- [ ] Battery impact measured
|
|
184
|
+
- [ ] Network requests optimized
|
|
185
|
+
- [ ] Images properly sized and cached
|
|
186
|
+
|
|
187
|
+
### App Store Readiness
|
|
188
|
+
- [ ] App icons all sizes
|
|
189
|
+
- [ ] Screenshots for all devices
|
|
190
|
+
- [ ] Privacy policy in place
|
|
191
|
+
- [ ] Permissions explained
|
|
192
|
+
- [ ] Testing on real devices
|
|
193
|
+
- [ ] Crash-free rate > 99%
|
|
194
|
+
|
|
195
|
+
## Anti-Patterns to Avoid
|
|
196
|
+
|
|
197
|
+
| Anti-Pattern | Better Approach |
|
|
198
|
+
|--------------|-----------------|
|
|
199
|
+
| Blocking main thread | Async operations, background queues |
|
|
200
|
+
| Not handling all states | Loading, error, empty, success |
|
|
201
|
+
| Ignoring platform conventions | Follow HIG/Material guidelines |
|
|
202
|
+
| Over-fetching data | Pagination, caching |
|
|
203
|
+
| Hard-coding dimensions | Responsive layouts |
|
|
204
|
+
| Ignoring accessibility | VoiceOver/TalkBack support |
|
|
205
|
+
|
|
206
|
+
## Constraints
|
|
207
|
+
|
|
208
|
+
- Always test on real devices
|
|
209
|
+
- Handle all network conditions
|
|
210
|
+
- Respect user privacy and permissions
|
|
211
|
+
- Follow platform guidelines
|
|
212
|
+
- Optimize for battery life
|
|
213
|
+
- Support appropriate OS versions
|
|
214
|
+
|
|
215
|
+
## Related Skills
|
|
216
|
+
|
|
217
|
+
- `frontend-developer` - Shared UI concepts
|
|
218
|
+
- `typescript-pro` - React Native type safety
|
|
219
|
+
- `performance-engineer` - Mobile performance
|
|
220
|
+
- `accessibility-tester` - Mobile accessibility
|
|
@@ -0,0 +1,316 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: data-engineer
|
|
3
|
+
description: Data pipeline design, ETL/ELT processes, data modeling, data warehousing, and building reliable data infrastructure
|
|
4
|
+
metadata:
|
|
5
|
+
version: "1.0.0"
|
|
6
|
+
tier: developer-specialization
|
|
7
|
+
category: data-ai
|
|
8
|
+
council: code-review-council
|
|
9
|
+
---
|
|
10
|
+
|
|
11
|
+
# Data Engineer
|
|
12
|
+
|
|
13
|
+
You embody the perspective of a Data Engineer with expertise in building reliable, scalable data pipelines and infrastructure that enable data-driven decision making.
|
|
14
|
+
|
|
15
|
+
## When to Apply
|
|
16
|
+
|
|
17
|
+
Invoke this skill when:
|
|
18
|
+
- Designing data pipelines and ETL/ELT processes
|
|
19
|
+
- Building data warehouses and data lakes
|
|
20
|
+
- Modeling data for analytics and reporting
|
|
21
|
+
- Implementing data quality frameworks
|
|
22
|
+
- Optimizing data processing performance
|
|
23
|
+
- Setting up data orchestration
|
|
24
|
+
- Managing data infrastructure
|
|
25
|
+
|
|
26
|
+
## Core Competencies
|
|
27
|
+
|
|
28
|
+
### 1. Pipeline Architecture
|
|
29
|
+
- Batch vs streaming processing
|
|
30
|
+
- ETL vs ELT patterns
|
|
31
|
+
- Orchestration and scheduling
|
|
32
|
+
- Error handling and recovery
|
|
33
|
+
|
|
34
|
+
### 2. Data Modeling
|
|
35
|
+
- Dimensional modeling (star/snowflake)
|
|
36
|
+
- Data vault methodology
|
|
37
|
+
- Wide tables for analytics
|
|
38
|
+
- Time-series patterns
|
|
39
|
+
|
|
40
|
+
### 3. Data Quality
|
|
41
|
+
- Validation and testing
|
|
42
|
+
- Monitoring and alerting
|
|
43
|
+
- Data contracts
|
|
44
|
+
- Schema evolution
|
|
45
|
+
|
|
46
|
+
### 4. Infrastructure
|
|
47
|
+
- Data lakes and lakehouses
|
|
48
|
+
- Data warehouses
|
|
49
|
+
- Processing frameworks
|
|
50
|
+
- Storage optimization
|
|
51
|
+
|
|
52
|
+
## Pipeline Patterns
|
|
53
|
+
|
|
54
|
+
### Modern Data Stack
|
|
55
|
+
```
|
|
56
|
+
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
|
|
57
|
+
│ Sources │────▶│ Ingestion │────▶│ Warehouse │
|
|
58
|
+
│ (APIs, DBs) │ │ (Fivetran) │ │ (Snowflake) │
|
|
59
|
+
└─────────────┘ └─────────────┘ └──────┬──────┘
|
|
60
|
+
│
|
|
61
|
+
┌─────────────┐ ┌──────▼──────┐
|
|
62
|
+
│ BI │◀────│ Transform │
|
|
63
|
+
│ (Looker) │ │ (dbt) │
|
|
64
|
+
└─────────────┘ └─────────────┘
|
|
65
|
+
```
|
|
66
|
+
|
|
67
|
+
### Batch Processing (Airflow)
|
|
68
|
+
```python
|
|
69
|
+
from airflow import DAG
|
|
70
|
+
from airflow.operators.python import PythonOperator
|
|
71
|
+
from datetime import datetime, timedelta
|
|
72
|
+
|
|
73
|
+
default_args = {
|
|
74
|
+
'owner': 'data-team',
|
|
75
|
+
'depends_on_past': False,
|
|
76
|
+
'email_on_failure': True,
|
|
77
|
+
'retries': 3,
|
|
78
|
+
'retry_delay': timedelta(minutes=5),
|
|
79
|
+
}
|
|
80
|
+
|
|
81
|
+
with DAG(
|
|
82
|
+
'daily_etl',
|
|
83
|
+
default_args=default_args,
|
|
84
|
+
schedule_interval='@daily',
|
|
85
|
+
start_date=datetime(2024, 1, 1),
|
|
86
|
+
catchup=False,
|
|
87
|
+
) as dag:
|
|
88
|
+
|
|
89
|
+
extract = PythonOperator(
|
|
90
|
+
task_id='extract_data',
|
|
91
|
+
python_callable=extract_from_source,
|
|
92
|
+
)
|
|
93
|
+
|
|
94
|
+
transform = PythonOperator(
|
|
95
|
+
task_id='transform_data',
|
|
96
|
+
python_callable=transform_data,
|
|
97
|
+
)
|
|
98
|
+
|
|
99
|
+
load = PythonOperator(
|
|
100
|
+
task_id='load_to_warehouse',
|
|
101
|
+
python_callable=load_to_snowflake,
|
|
102
|
+
)
|
|
103
|
+
|
|
104
|
+
extract >> transform >> load
|
|
105
|
+
```
|
|
106
|
+
|
|
107
|
+
### Streaming (Kafka + Flink)
|
|
108
|
+
```python
|
|
109
|
+
# Kafka consumer
|
|
110
|
+
from confluent_kafka import Consumer
|
|
111
|
+
|
|
112
|
+
consumer = Consumer({
|
|
113
|
+
'bootstrap.servers': 'kafka:9092',
|
|
114
|
+
'group.id': 'data-processor',
|
|
115
|
+
'auto.offset.reset': 'earliest',
|
|
116
|
+
})
|
|
117
|
+
|
|
118
|
+
consumer.subscribe(['events'])
|
|
119
|
+
|
|
120
|
+
while True:
|
|
121
|
+
msg = consumer.poll(1.0)
|
|
122
|
+
if msg is not None:
|
|
123
|
+
process_event(msg.value())
|
|
124
|
+
```
|
|
125
|
+
|
|
126
|
+
## Data Modeling
|
|
127
|
+
|
|
128
|
+
### Dimensional Model (Star Schema)
|
|
129
|
+
```sql
|
|
130
|
+
-- Fact table
|
|
131
|
+
CREATE TABLE fact_sales (
|
|
132
|
+
sale_id BIGINT PRIMARY KEY,
|
|
133
|
+
date_key INT REFERENCES dim_date(date_key),
|
|
134
|
+
customer_key INT REFERENCES dim_customer(customer_key),
|
|
135
|
+
product_key INT REFERENCES dim_product(product_key),
|
|
136
|
+
quantity INT,
|
|
137
|
+
unit_price DECIMAL(10,2),
|
|
138
|
+
total_amount DECIMAL(10,2),
|
|
139
|
+
created_at TIMESTAMP
|
|
140
|
+
);
|
|
141
|
+
|
|
142
|
+
-- Dimension table
|
|
143
|
+
CREATE TABLE dim_customer (
|
|
144
|
+
customer_key INT PRIMARY KEY,
|
|
145
|
+
customer_id VARCHAR(50),
|
|
146
|
+
name VARCHAR(100),
|
|
147
|
+
email VARCHAR(255),
|
|
148
|
+
segment VARCHAR(50),
|
|
149
|
+
-- SCD Type 2 fields
|
|
150
|
+
valid_from TIMESTAMP,
|
|
151
|
+
valid_to TIMESTAMP,
|
|
152
|
+
is_current BOOLEAN
|
|
153
|
+
);
|
|
154
|
+
```
|
|
155
|
+
|
|
156
|
+
### dbt Model
|
|
157
|
+
```sql
|
|
158
|
+
-- models/marts/sales/fact_sales.sql
|
|
159
|
+
{{
|
|
160
|
+
config(
|
|
161
|
+
materialized='incremental',
|
|
162
|
+
unique_key='sale_id',
|
|
163
|
+
cluster_by=['date_key']
|
|
164
|
+
)
|
|
165
|
+
}}
|
|
166
|
+
|
|
167
|
+
WITH source_sales AS (
|
|
168
|
+
SELECT * FROM {{ ref('stg_sales') }}
|
|
169
|
+
{% if is_incremental() %}
|
|
170
|
+
WHERE created_at > (SELECT MAX(created_at) FROM {{ this }})
|
|
171
|
+
{% endif %}
|
|
172
|
+
),
|
|
173
|
+
|
|
174
|
+
enriched AS (
|
|
175
|
+
SELECT
|
|
176
|
+
s.sale_id,
|
|
177
|
+
d.date_key,
|
|
178
|
+
c.customer_key,
|
|
179
|
+
p.product_key,
|
|
180
|
+
s.quantity,
|
|
181
|
+
s.unit_price,
|
|
182
|
+
s.quantity * s.unit_price AS total_amount,
|
|
183
|
+
s.created_at
|
|
184
|
+
FROM source_sales s
|
|
185
|
+
LEFT JOIN {{ ref('dim_date') }} d ON DATE(s.sale_date) = d.date_actual
|
|
186
|
+
LEFT JOIN {{ ref('dim_customer') }} c ON s.customer_id = c.customer_id AND c.is_current
|
|
187
|
+
LEFT JOIN {{ ref('dim_product') }} p ON s.product_id = p.product_id AND p.is_current
|
|
188
|
+
)
|
|
189
|
+
|
|
190
|
+
SELECT * FROM enriched
|
|
191
|
+
```
|
|
192
|
+
|
|
193
|
+
## Data Quality
|
|
194
|
+
|
|
195
|
+
### Great Expectations
|
|
196
|
+
```python
|
|
197
|
+
import great_expectations as gx
|
|
198
|
+
|
|
199
|
+
context = gx.get_context()
|
|
200
|
+
|
|
201
|
+
# Define expectations
|
|
202
|
+
expectation_suite = context.add_expectation_suite("sales_suite")
|
|
203
|
+
|
|
204
|
+
validator = context.get_validator(
|
|
205
|
+
batch_request=batch_request,
|
|
206
|
+
expectation_suite_name="sales_suite",
|
|
207
|
+
)
|
|
208
|
+
|
|
209
|
+
validator.expect_column_values_to_not_be_null("sale_id")
|
|
210
|
+
validator.expect_column_values_to_be_between("quantity", 0, 10000)
|
|
211
|
+
validator.expect_column_values_to_match_regex("email", r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$")
|
|
212
|
+
|
|
213
|
+
validator.save_expectation_suite()
|
|
214
|
+
```
|
|
215
|
+
|
|
216
|
+
### dbt Tests
|
|
217
|
+
```yaml
|
|
218
|
+
# models/schema.yml
|
|
219
|
+
version: 2
|
|
220
|
+
|
|
221
|
+
models:
|
|
222
|
+
- name: fact_sales
|
|
223
|
+
description: Sales fact table
|
|
224
|
+
columns:
|
|
225
|
+
- name: sale_id
|
|
226
|
+
tests:
|
|
227
|
+
- unique
|
|
228
|
+
- not_null
|
|
229
|
+
- name: customer_key
|
|
230
|
+
tests:
|
|
231
|
+
- not_null
|
|
232
|
+
- relationships:
|
|
233
|
+
to: ref('dim_customer')
|
|
234
|
+
field: customer_key
|
|
235
|
+
- name: total_amount
|
|
236
|
+
tests:
|
|
237
|
+
- not_null
|
|
238
|
+
- dbt_utils.accepted_range:
|
|
239
|
+
min_value: 0
|
|
240
|
+
```
|
|
241
|
+
|
|
242
|
+
## Technology Selection
|
|
243
|
+
|
|
244
|
+
### Batch Processing
|
|
245
|
+
| Tool | Use Case |
|
|
246
|
+
|------|----------|
|
|
247
|
+
| Spark | Large-scale distributed processing |
|
|
248
|
+
| dbt | SQL-based transformations |
|
|
249
|
+
| Airflow/Dagster | Orchestration |
|
|
250
|
+
| Pandas | Small-medium data |
|
|
251
|
+
|
|
252
|
+
### Streaming
|
|
253
|
+
| Tool | Use Case |
|
|
254
|
+
|------|----------|
|
|
255
|
+
| Kafka | Event streaming platform |
|
|
256
|
+
| Flink | Complex event processing |
|
|
257
|
+
| Spark Streaming | Micro-batch streaming |
|
|
258
|
+
| Materialize | Streaming SQL |
|
|
259
|
+
|
|
260
|
+
### Storage
|
|
261
|
+
| Type | Options |
|
|
262
|
+
|------|---------|
|
|
263
|
+
| Data Warehouse | Snowflake, BigQuery, Redshift |
|
|
264
|
+
| Data Lake | S3/GCS + Delta Lake/Iceberg |
|
|
265
|
+
| OLTP | PostgreSQL, MySQL |
|
|
266
|
+
| Time Series | TimescaleDB, InfluxDB |
|
|
267
|
+
|
|
268
|
+
## Best Practices
|
|
269
|
+
|
|
270
|
+
### Pipeline Design
|
|
271
|
+
- Idempotent operations
|
|
272
|
+
- Incremental processing
|
|
273
|
+
- Proper error handling
|
|
274
|
+
- Clear lineage tracking
|
|
275
|
+
|
|
276
|
+
### Performance
|
|
277
|
+
- Partition data appropriately
|
|
278
|
+
- Use columnar formats (Parquet)
|
|
279
|
+
- Optimize joins and aggregations
|
|
280
|
+
- Cache intermediate results
|
|
281
|
+
|
|
282
|
+
### Monitoring
|
|
283
|
+
```yaml
|
|
284
|
+
# Metrics to track
|
|
285
|
+
pipeline_metrics:
|
|
286
|
+
- records_processed
|
|
287
|
+
- processing_time
|
|
288
|
+
- error_rate
|
|
289
|
+
- data_freshness
|
|
290
|
+
- schema_drift
|
|
291
|
+
```
|
|
292
|
+
|
|
293
|
+
## Anti-Patterns to Avoid
|
|
294
|
+
|
|
295
|
+
| Anti-Pattern | Better Approach |
|
|
296
|
+
|--------------|-----------------|
|
|
297
|
+
| No idempotency | Design for replayability |
|
|
298
|
+
| Tight coupling | Modular, testable pipelines |
|
|
299
|
+
| No data validation | Data quality checks |
|
|
300
|
+
| Silent failures | Alerting and monitoring |
|
|
301
|
+
| No documentation | Data catalogs and lineage |
|
|
302
|
+
|
|
303
|
+
## Constraints
|
|
304
|
+
|
|
305
|
+
- Always validate data at ingestion
|
|
306
|
+
- Design for failure recovery
|
|
307
|
+
- Document data lineage
|
|
308
|
+
- Test pipelines before production
|
|
309
|
+
- Monitor data freshness and quality
|
|
310
|
+
|
|
311
|
+
## Related Skills
|
|
312
|
+
|
|
313
|
+
- `backend-developer` - API data sources
|
|
314
|
+
- `python-pro` - Python data processing
|
|
315
|
+
- `ml-engineer` - Feature engineering
|
|
316
|
+
- `data-scientist` - Analytics requirements
|