@groupby/ai-dev 0.5.7 → 0.5.8

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (40)
  1. package/package.json +1 -1
  2. package/teams/fhr-ai-team/github/PULL_REQUEST_TEMPLATE/full.md +31 -0
  3. package/teams/fhr-ai-team/github/PULL_REQUEST_TEMPLATE/light.md +7 -0
  4. package/teams/fhr-ai-team/github/copilot-instructions.md +24 -0
  5. package/teams/fhr-ai-team/github/instructions/python.instructions.md +23 -0
  6. package/teams/fhr-ai-team/github/pull_request_template.md +21 -0
  7. package/teams/fhr-ai-team/prompts/brainstorm.md +7 -0
  8. package/teams/fhr-ai-team/prompts/plan-algo-tests.md +7 -0
  9. package/teams/fhr-ai-team/prompts/plan.md +7 -0
  10. package/teams/fhr-ai-team/prompts/pr-description.md +7 -0
  11. package/teams/fhr-ai-team/prompts/test.md +7 -0
  12. package/teams/fhr-ai-team/resources/AGENTS.md +55 -0
  13. package/teams/fhr-ai-team/resources/CLAUDE.md +52 -0
  14. package/teams/fhr-ai-team/resources/README.md +51 -0
  15. package/teams/fhr-ai-team/resources/claude-code-setup.md +60 -0
  16. package/teams/fhr-ai-team/resources/copilot-setup.md +64 -0
  17. package/teams/fhr-ai-team/resources/onboarding.md +179 -0
  18. package/teams/fhr-ai-team/resources/opencode-install.md +29 -0
  19. package/teams/fhr-ai-team/resources/opencode-setup.md +43 -0
  20. package/teams/fhr-ai-team/skills/algo-test-planning/SKILL.md +192 -0
  21. package/teams/fhr-ai-team/skills/algo-test-planning/references/pipeline-registry.md +280 -0
  22. package/teams/fhr-ai-team/skills/brainstorming/SKILL.md +111 -0
  23. package/teams/fhr-ai-team/skills/e2e-testing/SKILL.md +163 -0
  24. package/teams/fhr-ai-team/skills/grill-me/SKILL.md +10 -0
  25. package/teams/fhr-ai-team/skills/ml-tooling-dev/SKILL.md +313 -0
  26. package/teams/fhr-ai-team/skills/ml-tooling-dev/references/kubectl-debug.md +165 -0
  27. package/teams/fhr-ai-team/skills/ml-tooling-dev/references/mongodb-config.md +218 -0
  28. package/teams/fhr-ai-team/skills/ml-tooling-dev/references/pipeline-configs.md +190 -0
  29. package/teams/fhr-ai-team/skills/ml-tooling-dev/references/pipeline-steps.md +182 -0
  30. package/teams/fhr-ai-team/skills/ml-tooling-dev/scripts/kf_logs.py +203 -0
  31. package/teams/fhr-ai-team/skills/ml-tooling-dev/scripts/kf_query.py +233 -0
  32. package/teams/fhr-ai-team/skills/ml-tooling-dev/scripts/kf_wait.py +195 -0
  33. package/teams/fhr-ai-team/skills/ml-tooling-dev/scripts/mlflow_query.py +252 -0
  34. package/teams/fhr-ai-team/skills/ml-tooling-dev/scripts/mongo_predictor.py +352 -0
  35. package/teams/fhr-ai-team/skills/naming-conventions-reviewer/SKILL.md +230 -0
  36. package/teams/fhr-ai-team/skills/naming-conventions-reviewer/references/dataset-naming.md +190 -0
  37. package/teams/fhr-ai-team/skills/naming-conventions-reviewer/references/domain-vocabulary.md +447 -0
  38. package/teams/fhr-ai-team/skills/naming-conventions-reviewer/references/repo-dependency-graph.md +264 -0
  39. package/teams/fhr-ai-team/skills/planning/SKILL.md +138 -0
  40. package/teams/fhr-ai-team/skills/pr-description/SKILL.md +94 -0
@@ -0,0 +1,43 @@
1
+ # OpenCode Setup
2
+
3
+ ## Installation
4
+
5
+ ```bash
6
+ git clone https://github.com/Attraqt/ai.agent-skills.git
7
+ ```
8
+
9
+ Open the cloned directory in OpenCode. No additional setup is required.
10
+
11
+ ## How It Works
12
+
13
+ - `AGENTS.md` at the repo root is loaded automatically and provides agent instructions
14
+ - Skills in the `skills/` directory are discovered via directory convention
15
+ - The agent automatically selects the appropriate skill based on your natural language request
16
+ - No slash commands are needed; the agent detects intent and routes to the right workflow
17
+
18
+ ## Skill Routing
19
+
20
+ | Your request | Skill invoked |
21
+ |-------------|---------------|
22
+ | "Let's brainstorm how to..." | `skills/brainstorming/SKILL.md` |
23
+ | "Plan the implementation for..." | `skills/planning/SKILL.md` |
24
+ | "Test the pipeline for..." | `skills/algo-test-planning/SKILL.md` |
25
+ | "Run the tests for..." | `skills/e2e-testing/SKILL.md` |
26
+ | "Check my Kubeflow run" | `skills/ml-tooling-dev/SKILL.md` |
27
+ | "Review the naming in..." | `skills/naming-conventions-reviewer/SKILL.md` |
28
+
29
+ ## Using in Other Projects
30
+
31
+ To use ai.pierre skills when working in a different project repo:
32
+
33
+ 1. Ensure `AGENTS.md` is copied or symlinked to your project root
34
+ 2. Copy the `skills/` directory (or specific skills you need) into your project
35
+ 3. The agent will auto-discover and apply them
36
+
37
+ Alternatively, reference the skills directory in your OpenCode configuration to load them globally.
38
+
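The three steps above could be scripted roughly as follows. This is a minimal sketch, not part of the package: `install_skills` and its parameters are hypothetical names, it copies rather than symlinks `AGENTS.md`, and it assumes the source repo has the `AGENTS.md` plus `skills/` layout described in this file.

```python
import shutil
from pathlib import Path
from typing import Optional


def install_skills(source_repo: Path, project_root: Path,
                   skills: Optional[list] = None) -> list:
    """Copy AGENTS.md and skill directories into another project (hypothetical helper)."""
    installed = []

    # Step 1: place AGENTS.md at the project root.
    agents_dst = project_root / "AGENTS.md"
    shutil.copy2(source_repo / "AGENTS.md", agents_dst)
    installed.append(agents_dst)

    # Step 2: copy every skill, or only the named ones.
    skills_src = source_repo / "skills"
    if skills is None:
        skills = sorted(p.name for p in skills_src.iterdir() if p.is_dir())
    for name in skills:
        dst = project_root / "skills" / name
        # dirs_exist_ok lets the helper be re-run to refresh an existing copy.
        shutil.copytree(skills_src / name, dst, dirs_exist_ok=True)
        installed.append(dst)

    return installed
```

Calling `install_skills(Path("ai.agent-skills"), Path("."), skills=["planning"])` would copy only the planning skill; omitting `skills` copies all of them.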
39
+ ## Notes
40
+
41
+ - The skill routing depends on the model consistently following rules in `AGENTS.md`
42
+ - For best results, keep your requests natural and descriptive
43
+ - The agent will use the `question` tool (OpenCode equivalent of AskUserQuestion) to gather requirements interactively
@@ -0,0 +1,192 @@
1
+ ---
2
+ name: algo-test-planning
3
+ description: >
4
+ Use when the user wants to plan or configure a test for an algo pipeline. Guides through
5
+ pipeline selection, config gathering, and Kubeflow config JSON generation via a 3-stage
6
+ interactive flow using AskUserQuestion. Covers full multi-step pipelines and base
7
+ single-step pipelines.
8
+ ---
9
+
10
+ # Algo Pipeline Test Planning
11
+
12
+ ## Overview
13
+
14
+ This skill guides you through planning and configuring a test run for an ML pipeline.
15
+ It produces a complete Kubeflow config JSON and a test plan with launch and verification steps.
16
+
17
+ All pipeline operations target the **DEV environment only**.
18
+
19
+ ## Stage 1: Pipeline Type Selection
20
+
21
+ Use AskUserQuestion to ask:
22
+
23
+ **Question:** "What type of pipeline do you want to test?"
24
+
25
+ | Option | Description |
26
+ |--------|-------------|
27
+ | Full pipeline | Multi-step end-to-end pipeline (e.g., learning + evaluation + encoding) |
28
+ | Base/single-step pipeline | Single step using a base pipeline template |
29
+
30
+ ---
31
+
32
+ ## Stage 2a: Full Pipeline Selection
33
+
34
+ If the user chose "Full pipeline", use AskUserQuestion to ask which pipeline.
35
+
36
+ Consult `references/pipeline-registry.md` for the complete list grouped by domain.
37
+ Present the most relevant options based on context, or let the user search.
38
+
39
+ Common full pipelines by domain:
40
+
41
+ **Semantic Search:**
42
+ - `semantic_search_learning_with_generated_analytics_pipeline`
43
+ - `semantic_search_item_encoding_pipeline`
44
+
45
+ **Visual Search / CLIP:**
46
+ - `clip_learning_pipeline`
47
+ - `clip_item_encoding_pipeline`
48
+
49
+ **Tagging:**
50
+ - `tagging_learning_pipeline`
51
+ - `transformer_tagging_learning_pipeline`
52
+
53
+ **Image:**
54
+ - `image_encoder_learning_pipeline`
55
+ - `image_classifier_pipeline`
56
+
57
+ **Shop the Look:**
58
+ - `shop_the_look_learning_pipeline`
59
+
60
+ **FM / Recommendations:**
61
+ - `fm_learning_pipeline`
62
+
63
+ **Text Encoder:**
64
+ - `text_encoder_learning_pipeline`
65
+
66
+ Then gather pipeline-specific config fields based on the selected pipeline's requirements
67
+ (see `references/pipeline-registry.md` for required fields per pipeline).
68
+
69
+ ---
70
+
71
+ ## Stage 2b: Base Pipeline Selection
72
+
73
+ If the user chose "Base/single-step pipeline", use AskUserQuestion to ask:
74
+
75
+ **Question:** "Which base pipeline type?"
76
+
77
+ | Option | Description |
78
+ |--------|-------------|
79
+ | `python_batch_pipeline` | Standard Python batch jobs |
80
+ | `large_python_batch_pipeline` | GPU/high-memory Python batch jobs |
81
+ | `scala_batch_pipeline` | Scala-based batch jobs |
82
+ | `spark_scala_batch_pipeline` | Spark Scala batch jobs |
83
+
84
+ Then ask:
85
+ - **Strategy ID** (what job to run, e.g., `semantic-search-learning`, `item-images-single-encoding`)
86
+ - **Docker image name** (e.g., `semantic-search`, `algo-fm-batch`)
87
+ - **Arguments** specific to the strategy (varies by step type)
88
+
89
+ Key rules for base pipelines:
90
+ - `python_batch_pipeline` and `large_python_batch_pipeline` use `batch_config.arguments` for custom params
91
+ - `scala_batch_pipeline` and `spark_scala_batch_pipeline` use `batch_config.custom_params` (NOT `arguments`)
92
+ - GPU jobs must include `gpu_vendor: "nvidia.com/gpu"` and `gpu_accelerator_name: "nvidia-l4"` for L4 nodes
93
+
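The key rules above can be illustrated with a small builder. This is a sketch under assumptions: `build_batch_config` is a hypothetical helper, and nesting the GPU fields directly inside `batch_config` is a guess, since the diff names the fields but does not show where they sit in the full config.

```python
PYTHON_PIPELINES = {"python_batch_pipeline", "large_python_batch_pipeline"}
SCALA_PIPELINES = {"scala_batch_pipeline", "spark_scala_batch_pipeline"}


def build_batch_config(pipeline: str, params: dict, use_gpu: bool = False) -> dict:
    """Put custom params under the key that the base pipeline type expects."""
    if pipeline in PYTHON_PIPELINES:
        batch_config = {"arguments": dict(params)}
    elif pipeline in SCALA_PIPELINES:
        # Scala pipelines read custom_params, NOT arguments.
        batch_config = {"custom_params": dict(params)}
    else:
        raise ValueError(f"Not a base pipeline: {pipeline}")

    if use_gpu:
        # Values for L4 nodes, as stated in the rules above.
        # ASSUMPTION: these fields live inside batch_config.
        batch_config["gpu_vendor"] = "nvidia.com/gpu"
        batch_config["gpu_accelerator_name"] = "nvidia-l4"

    return {"batch_config": batch_config}
```

Rejecting unknown pipeline names early keeps a typo from silently producing a config with the wrong parameter key.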
94
+ ---
95
+
96
+ ## Stage 3: Config Gathering and JSON Generation
97
+
98
+ Use AskUserQuestion sequentially for each required input:
99
+
100
+ ### 3.1 Predictor ID
101
+ Ask for the MongoDB ObjectId (e.g., `64f0a12b5856b11b7aa4e71e`).
102
+ This identifies the tenant/predictor whose config will be used.
103
+
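A MongoDB ObjectId is 12 bytes, written as 24 hexadecimal characters, so the predictor ID can be checked for shape before it is used in a query. The helper name `is_valid_object_id` is hypothetical, and the check says nothing about whether the predictor actually exists.

```python
import re

# 12 bytes rendered as exactly 24 hex characters.
_OBJECT_ID_RE = re.compile(r"[0-9a-fA-F]{24}")


def is_valid_object_id(value: str) -> bool:
    """Return True if value has the textual shape of a MongoDB ObjectId."""
    return _OBJECT_ID_RE.fullmatch(value) is not None
```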
104
+ ### 3.2 Experiment Name
105
+ Discover available experiments:
106
+ ```bash
107
+ python3 scripts/kf_query.py --experiments
108
+ ```
109
+ Or let the user provide one directly.
110
+
111
+ ### 3.3 Strategy ID
112
+ Based on the pipeline or step selected. Reference `skills/ml-tooling-dev/references/pipeline-configs.md`
113
+ for the canonical strategy ID list.
114
+
115
+ ### 3.4 Image Version
116
+ Verify the version exists in Kubeflow:
117
+ ```bash
118
+ python3 scripts/kf_query.py --pipeline-versions <pipeline_name>
119
+ ```
120
+ Use the most recent `version_name` from the output (e.g., `"0.1.271"`).
121
+
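If the version names have already been collected into a list, the highest one can be picked numerically. This sketch assumes dotted numeric names such as `0.1.271` and treats the highest number as the most recent; the output format of `kf_query.py` is not shown in this diff, so parsing it is left out, and `latest_version` is a hypothetical name.

```python
def latest_version(version_names: list) -> str:
    """Return the highest dotted-numeric version name."""
    def as_numbers(name: str) -> tuple:
        # Compare component-wise as integers so that 271 > 99.
        return tuple(int(part) for part in name.split("."))

    return max(version_names, key=as_numbers)
```

A plain string comparison would rank `0.1.99` above `0.1.271`, which is why the components are converted to integers first.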
122
+ ### 3.5 Dataset Paths (if applicable)
123
+ GCS paths from previous pipeline runs. Discover via:
124
+ ```bash
125
+ python3 scripts/kf_query.py <previous_run_id>
126
+ ```
127
+ Check Kubeflow UI -> run -> succeeded steps -> Output artifacts tab.
128
+
129
+ ### 3.6 MLflow Run ID (if applicable)
130
+ For evaluation or encoding steps that need a trained model:
131
+ ```bash
132
+ python3 scripts/mlflow_query.py model-for-predictor <predictor_id>
133
+ ```
134
+
135
+ ### 3.7 MongoDB Config Check
136
+ Read current training hyperparameters:
137
+ ```bash
138
+ mongosh "mongodb://10.11.96.21:27017/earlybirds" --quiet --eval '
139
+ const doc = db.predictors.findOne({"_id": ObjectId("<PREDICTOR_ID>")});
140
+ print(JSON.stringify(doc.config.batch, null, 2));
141
+ '
142
+ ```
143
+ Present the current config to the user. Ask if any changes are needed before the test run.
144
+ If changes are needed, generate the `updateOne` command (see `skills/ml-tooling-dev/references/mongodb-config.md`).
145
+
146
+ ### 3.8 Resource Overrides
147
+ Use defaults from `skills/ml-tooling-dev/references/pipeline-configs.md` unless the user specifies:
148
+ - CPU/memory requests and limits
149
+ - GPU type and count
150
+ - Disk size
151
+
152
+ ---
153
+
154
+ ## Output
155
+
156
+ Generate the following:
157
+
158
+ ### 1. Complete Kubeflow Config JSON
159
+ A ready-to-submit JSON file. Save to `/tmp/<pipeline>-<predictor_id>-test.json`.
160
+
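The naming scheme above could be applied as follows. This is a sketch: `save_test_config` is a hypothetical helper, and the `directory` parameter exists only so the target can be overridden (it defaults to `/tmp` as stated above).

```python
import json
from pathlib import Path


def save_test_config(config: dict, pipeline: str, predictor_id: str,
                     directory: Path = Path("/tmp")) -> Path:
    """Write config to <directory>/<pipeline>-<predictor_id>-test.json."""
    path = Path(directory) / f"{pipeline}-{predictor_id}-test.json"
    path.write_text(json.dumps(config, indent=2))
    return path
```

The returned path can then be passed to the launch command, which expects the location of the config file.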
161
+ ### 2. Pre-Launch Checklist
162
+ - [ ] `version_name` verified via `kf_query.py --pipeline-versions`
163
+ - [ ] MongoDB config confirmed (show current values)
164
+ - [ ] Dataset paths validated (exist in GCS)
165
+ - [ ] Experiment exists in Kubeflow
166
+
167
+ ### 3. Launch Command
168
+ ```bash
169
+ cd attraqt-kubeflow-configs/scripts
170
+ python -m run -c <absolute_path_to_config>
171
+ ```
172
+
173
+ ### 4. Verification Steps
174
+ - Monitor run: `python3 scripts/kf_query.py <run_id>`
175
+ - Check failed steps: `python3 scripts/kf_query.py <run_id> --failed`
176
+ - Expected step outcomes for each pipeline step
177
+ - Pod log patterns to watch for
178
+
179
+ ### 5. Failure Recovery
180
+ - Debug failed steps: see `skills/ml-tooling-dev/references/kubectl-debug.md`
181
+ - Common failure patterns and fixes
182
+ - How to re-run individual failed steps
183
+
184
+ ---
185
+
186
+ ## Skill Dependencies
187
+
188
+ This skill invokes `ai.pierre:ml-tooling-dev` for:
189
+ - Config templates and validation
190
+ - Kubeflow/MLflow query commands
191
+ - MongoDB read/update operations
192
+ - kubectl debugging commands
@@ -0,0 +1,280 @@
1
+ # Pipeline Registry
2
+
3
+ Complete registry of all AI pipelines from `attraqt-kubeflow-pipelines`.
4
+ Source: `kubeflow_pipelines/pipelines/ai/__init__.py`
5
+
6
+ ---
7
+
8
+ ## Base Pipelines (Single-Step Templates)
9
+
10
+ These are generic pipeline wrappers for running a single batch job step.
11
+
12
+ | Pipeline | Type | Use case |
13
+ |----------|------|----------|
14
+ | `python_batch_pipeline` | Python | Standard Python batch jobs |
15
+ | `large_python_batch_pipeline` | Python | GPU/high-memory Python batch jobs |
16
+ | `scala_batch_pipeline` | Scala | Scala-based batch jobs |
17
+ | `spark_scala_batch_pipeline` | Spark Scala | Spark Scala batch jobs |
18
+
19
+ **Config differences:**
20
+ - Python pipelines use `batch_config.arguments` for custom params
21
+ - Scala pipelines use `batch_config.custom_params` (NOT `arguments`)
22
+ - `large_python_batch_pipeline` supports GPU: set `gpu_vendor: "nvidia.com/gpu"` and `gpu_accelerator_name: "nvidia-l4"`
23
+
24
+ ---
25
+
26
+ ## Semantic Search
27
+
28
+ | Pipeline | Type | Description |
29
+ |----------|------|-------------|
30
+ | `semantic_search_learning_pipeline` | Full | Learning only |
31
+ | `semantic_search_learning_with_generated_analytics_pipeline` | Full | Learning + analytics generation (most common) |
32
+ | `semantic_search_item_encoding_pipeline` | Full | Item encoding after training |
33
+ | `export_huggingface_sentence_transformer_model_pipeline` | Script | Export HuggingFace sentence transformer model |
34
+
35
+ ---
36
+
37
+ ## Search (Text-only, no images)
38
+
39
+ | Pipeline | Type | Description |
40
+ |----------|------|-------------|
41
+ | `search_rnn_learning_pipeline_without_images` | Full | RNN-based search learning |
42
+ | `search_llm_learning_pipeline_without_images` | Full | LLM-based search learning |
43
+ | `search_item_encoding_pipeline_without_images` | Full | Item encoding (text only) |
44
+
45
+ ---
46
+
47
+ ## Search + CLIP (Text + Images)
48
+
49
+ | Pipeline | Type | Description |
50
+ |----------|------|-------------|
51
+ | `clip_search_rnn_learning_pipeline` | Full | CLIP + RNN search learning |
52
+ | `clip_search_llm_learning_pipeline` | Full | CLIP + LLM search learning |
53
+ | `clip_search_llm_learning_pipeline_with_data_augmentation` | Full | CLIP + LLM with data augmentation |
54
+ | `clip_search_vertical_rnn_learning_pipeline` | Full | Vertical (per-tenant) CLIP + RNN learning |
55
+ | `clip_search_vertical_llm_learning_pipeline` | Full | Vertical (per-tenant) CLIP + LLM learning |
56
+ | `clip_large_search_vertical_rnn_learning_pipeline` | Full | Large vertical CLIP + RNN (GPU) |
57
+ | `clip_large_search_vertical_llm_learning_pipeline` | Full | Large vertical CLIP + LLM (GPU) |
58
+ | `clip_search_item_encoding_pipeline` | Full | CLIP search item encoding |
59
+
60
+ ---
61
+
62
+ ## Search + Image Encoder
63
+
64
+ | Pipeline | Type | Description |
65
+ |----------|------|-------------|
66
+ | `image_encoder_search_item_encoding_pipeline_with_images` | Full | Image encoder search encoding (with images) |
67
+ | `image_encoder_search_item_encoding_pipeline_without_images` | Full | Image encoder search encoding (without images) |
68
+
69
+ ---
70
+
71
+ ## Search Evaluation
72
+
73
+ | Pipeline | Type | Description |
74
+ |----------|------|-------------|
75
+ | `search_evaluation_pipeline` | Full | Standard search evaluation |
76
+ | `search_llm_evaluation_pipeline` | Full | LLM-based search evaluation |
77
+
78
+ ---
79
+
80
+ ## Computer Vision - CLIP
81
+
82
+ | Pipeline | Type | Description |
83
+ |----------|------|-------------|
84
+ | `clip_learning_pipeline` | Full | CLIP model learning |
85
+ | `clip_vertical_learning_pipeline` | Full | Per-tenant CLIP learning |
86
+ | `large_clip_learning_pipeline` | Full | Large CLIP learning (GPU) |
87
+ | `clip_item_images_single_encoding_pipeline` | Full | CLIP single image encoding |
88
+ | `export_huggingface_clip_model_pipeline` | Script | Export HuggingFace CLIP model |
89
+
90
+ ---
91
+
92
+ ## Computer Vision - Image Encoder
93
+
94
+ | Pipeline | Type | Description |
95
+ |----------|------|-------------|
96
+ | `computer_vision_learning_pipeline` | Full | Image encoder learning |
97
+ | `computer_vision_vertical_learning_pipeline` | Full | Per-tenant image encoder learning |
98
+ | `large_computer_vision_vertical_learning_pipeline` | Full | Large vertical learning (GPU) |
99
+ | `computer_vision_item_images_single_encoding_pipeline` | Full | Image encoding |
100
+
101
+ ---
102
+
103
+ ## Computer Vision - SAM
104
+
105
+ | Pipeline | Type | Description |
106
+ |----------|------|-------------|
107
+ | `export_huggingface_sam_model_pipeline` | Script | Export HuggingFace SAM model |
108
+
109
+ ---
110
+
111
+ ## FM (Factorization Machines / Recommendations)
112
+
113
+ | Pipeline | Type | Description |
114
+ |----------|------|-------------|
115
+ | `fm_global_initialization_pipeline` | Full | FM global model initialization |
116
+ | `fm_global_incremental_pipeline` | Full | FM global incremental update |
117
+ | `fm_complementarity_initialization_pipeline` | Full | FM complementarity initialization |
118
+ | `fm_complementarity_incremental_pipeline` | Full | FM complementarity incremental update |
119
+
120
+ ---
121
+
122
+ ## GPT (Generative)
123
+
124
+ | Pipeline | Type | Description |
125
+ |----------|------|-------------|
126
+ | `gpt_initialization_pipeline_with_images` | Full | GPT init (with images) |
127
+ | `gpt_initialization_pipeline_without_images` | Full | GPT init (text only) |
128
+ | `gpt_incremental_pipeline_with_images` | Full | GPT incremental (with images) |
129
+ | `gpt_incremental_pipeline_without_images` | Full | GPT incremental (text only) |
130
+ | `gpt_item_encoding_pipeline_with_images` | Full | GPT item encoding (with images) |
131
+ | `gpt_item_encoding_pipeline_without_images` | Full | GPT item encoding (text only) |
132
+
133
+ ---
134
+
135
+ ## Tagging
136
+
137
+ | Pipeline | Type | Description |
138
+ |----------|------|-------------|
139
+ | `tagging_learning_pipeline` | Full | Tagging model learning |
140
+ | `tagging_item_tagging_pipeline` | Full | Apply tagging to items |
141
+ | `tagging_item_macro_tagging_pipeline` | Full | Apply macro tagging to items |
142
+
143
+ ---
144
+
145
+ ## Shop the Look
146
+
147
+ | Pipeline | Type | Description |
148
+ |----------|------|-------------|
149
+ | `shop_the_look_recommendation_pipeline` | Full | STL recommendations |
150
+ | `shop_the_look_recommendation_with_segmentation_pipeline` | Full | STL with image segmentation |
151
+ | `shop_the_look_recommendation_without_segmentation_pipeline` | Full | STL without segmentation |
152
+ | `shop_the_look_recommendation_with_outfit_detection_pipeline` | Full | STL with outfit detection |
153
+ | `outfit_image_classification_learning_pipeline` | Full | Outfit classifier learning |
154
+ | `outfit_image_classification_vertical_learning_pipeline` | Full | Per-tenant outfit classifier |
155
+
156
+ ---
157
+
158
+ ## YOLO (Object Detection)
159
+
160
+ | Pipeline | Type | Description |
161
+ |----------|------|-------------|
162
+ | `yolo_model_fine_tuning_pipeline` | Full | YOLO model fine-tuning |
163
+ | `yolo_model_fine_tuning_vertical_pipeline` | Full | Per-tenant YOLO fine-tuning |
164
+ | `export_ultralytics_yolo_model_pipeline` | Script | Export Ultralytics YOLO model |
165
+
166
+ ---
167
+
168
+ ## Item and Analytic Data
169
+
170
+ | Pipeline | Type | Description |
171
+ |----------|------|-------------|
172
+ | `xo_item_data_pipeline` | Full | XO item data ingestion |
173
+ | `fhr_item_data_pipeline` | Full | FHR item data ingestion |
174
+ | `fhr_item_data_pipeline_legacy` | Full | FHR item data (legacy) |
175
+ | `cidp_item_data_pipeline` | Full | CIDP item data ingestion |
176
+ | `fhr_analytic_incremental_data_pipeline` | Full | FHR analytics incremental |
177
+ | `fhr_analytic_incremental_data_pipeline_legacy` | Full | FHR analytics incremental (legacy) |
178
+ | `fhr_analytic_data_pipeline_legacy` | Full | FHR analytics (legacy) |
179
+
180
+ ---
181
+
182
+ ## NLP
183
+
184
+ | Pipeline | Type | Description |
185
+ |----------|------|-------------|
186
+ | `nlp_word_tokenizer_pipeline` | Full | Word tokenizer training |
187
+ | `nlp_character_tokenizer_pipeline` | Full | Character tokenizer training |
188
+
189
+ ---
190
+
191
+ ## Content-Based
192
+
193
+ | Pipeline | Type | Description |
194
+ |----------|------|-------------|
195
+ | `content_based_word2vec_pipeline` | Full | Word2Vec content-based recommendations |
196
+
197
+ ---
198
+
199
+ ## ALS (Alternating Least Squares)
200
+
201
+ | Pipeline | Type | Description |
202
+ |----------|------|-------------|
203
+ | `als_pipeline` | Full | ALS collaborative filtering |
204
+
205
+ ---
206
+
207
+ ## FP-Growth
208
+
209
+ | Pipeline | Type | Description |
210
+ |----------|------|-------------|
211
+ | `fp_growth_items_pipeline` | Full | FP-Growth item associations |
212
+ | `fp_growth_categories_pipeline` | Full | FP-Growth category associations |
213
+
214
+ ---
215
+
216
+ ## Pass-Through (Graph)
217
+
218
+ | Pipeline | Type | Description |
219
+ |----------|------|-------------|
220
+ | `pass_through_scored_graph_pipeline` | Full | Scored graph pass-through |
221
+ | `pass_through_unscored_graph_1_pipeline` | Full | Unscored graph variant 1 |
222
+ | `pass_through_unscored_graph_2_pipeline` | Full | Unscored graph variant 2 |
223
+ | `pass_through_source_to_items_unscored_graph_pipeline` | Full | Source-to-items unscored graph |
224
+
225
+ ---
226
+
227
+ ## Autocomplete
228
+
229
+ | Pipeline | Type | Description |
230
+ |----------|------|-------------|
231
+ | `autocomplete_pipeline` | Full | Autocomplete model training |
232
+
233
+ ---
234
+
235
+ ## Miscellaneous
236
+
237
+ | Pipeline | Type | Description |
238
+ |----------|------|-------------|
239
+ | `basic_pipeline` | Full | Basic/generic pipeline template |
240
+ | `sessions_pipeline` | Full | Session data processing |
241
+ | `bigquery_cleanup_pipeline` | Full | BigQuery data cleanup |
242
+ | `gibberish_pipeline` | Full | Gibberish detection |
243
+ | `dummy_ai_scores_pipeline` | Full | Dummy AI scores (testing) |
244
+ | `item_tagging_pipeline` | Full | Items enrichment tagging |
245
+ | `merch_agent_data_pipeline` | Full | Merch agent data preparation |
246
+ | `lakefs_garbage_collection_pipeline` | Full | LakeFS garbage collection |
247
+
248
+ ---
249
+
250
+ ## Label Studio
251
+
252
+ | Pipeline | Type | Description |
253
+ |----------|------|-------------|
254
+ | `outfit_tasks_import_pipeline` | Script | Import outfit tasks to Label Studio |
255
+ | `outfit_annotations_export_pipeline` | Script | Export outfit annotations from Label Studio |
256
+ | `yolo_tasks_import_pipeline` | Script | Import YOLO tasks to Label Studio |
257
+ | `yolo_annotations_export_pipeline` | Script | Export YOLO annotations from Label Studio |
258
+
259
+ ---
260
+
261
+ ## Monitoring and Maintenance
262
+
263
+ | Pipeline | Type | Description |
264
+ |----------|------|-------------|
265
+ | `activity_monitoring` | Monitoring | Activity monitoring |
266
+ | `experiments_with_consecutive_failed_runs_monitoring_pipeline` | Monitoring | Failed experiments monitoring |
267
+ | `runs_with_abnormal_duration_cleaning_pipeline` | Monitoring | Abnormal duration cleanup |
268
+ | `gcs_cleaning_pipeline` | Script | GCS storage cleanup |
269
+ | `gcs_activities_copy_pipeline` | Script | GCS activities data copy |
270
+ | `image_download_pipeline` | Script | Image download utility |
271
+ | `inference_data_cleaning_pipeline` | Script | Inference data cleanup |
272
+
273
+ ---
274
+
275
+ ## Total: ~93 pipelines
276
+
277
+ - 4 base pipelines
278
+ - ~70 full (multi-step) pipelines
279
+ - ~12 script/utility pipelines
280
+ - ~7 monitoring/maintenance pipelines
@@ -0,0 +1,111 @@
1
+ ---
2
+ name: brainstorming
3
+ description: >
4
+ Use when the user wants to brainstorm, design, or explore a new feature, improvement,
5
+ or architecture decision. Discovers AI team repos via gh, searches existing code before
6
+ proposing solutions, and gathers requirements interactively via AskUserQuestion.
7
+ ---
8
+
9
+ # Codebase-Aware Brainstorming
10
+
11
+ ## Hard Gate
12
+
13
+ Do NOT invoke any implementation skill, write any code, scaffold any project, or take any
14
+ implementation action until you have presented a design and the user has approved it.
15
+
16
+ ## Process
17
+
18
+ ### Step 1: Discover AI Team Repos
19
+
20
+ Run the following to get the current repo landscape:
21
+
22
+ ```bash
23
+ gh repo list Attraqt --json name,description --limit 200 --no-archived
24
+ ```
25
+
26
+ Filter results for repos matching `ai.*`, `algo.*`, `ebap-*`, `attraqt-kubeflow-*`, and `*-toolbox` patterns.
27
+ Present the user with a summary of the relevant repos grouped by category:
28
+
29
+ | Category | Pattern | Purpose |
30
+ |----------|---------|---------|
31
+ | ML algorithms | `algo.*` | Model training, inference, evaluation |
32
+ | ML training | `algo.*-ml` | Kubeflow-based model training/fine-tuning |
33
+ | AI services | `ai.*` | FastAPI/Streamlit microservices |
34
+ | Toolboxes | `*-toolbox` | Shared Python libraries |
35
+ | Kubeflow infra | `attraqt-kubeflow-*` | Pipeline configs and definitions |
36
+ | Platform infra | `ebap-*` | Early Birds AI Platform |
37
+
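The filtering and grouping step can be sketched against the JSON that `gh repo list --json name,description` prints (an array of objects with `name` and `description` keys). The function name and the rule that the first matching pattern wins are assumptions; `algo.*-ml` is listed before `algo.*` so that training repos are not swallowed by the broader pattern.

```python
import json
from fnmatch import fnmatchcase

# Order matters: more specific patterns come first (an assumption of this sketch).
CATEGORY_PATTERNS = [
    ("ML training", "algo.*-ml"),
    ("ML algorithms", "algo.*"),
    ("AI services", "ai.*"),
    ("Toolboxes", "*-toolbox"),
    ("Kubeflow infra", "attraqt-kubeflow-*"),
    ("Platform infra", "ebap-*"),
]


def categorize_repos(gh_json: str) -> dict:
    """Group repo names by the first category pattern they match."""
    grouped = {}
    for repo in json.loads(gh_json):
        name = repo["name"]
        for category, pattern in CATEGORY_PATTERNS:
            if fnmatchcase(name, pattern):
                grouped.setdefault(category, []).append(name)
                break  # repos matching no pattern are simply dropped
    return grouped
```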
38
+ ### Step 2: Explore Project Context
39
+
40
+ For repos relevant to the brainstorm topic:
41
+ - Read their `CLAUDE.md` or `README.md` for architecture context
42
+ - Check recent git history (`git log --oneline -20`) for active development areas
43
+ - Scan directory structure to understand component layout
44
+
45
+ ### Step 3: Search Before Proposing
46
+
47
+ **MANDATORY:** Before proposing any solution, search the codebase for existing utilities,
48
+ patterns, and implementations related to the topic.
49
+
50
+ - Use Grep/Glob across relevant repos
51
+ - Check shared libraries: `earlybirds_commons`, `torch_toolbox`, `item-toolbox`, `nlp-toolbox`, `eb_tensorflow`
52
+ - Report findings to the user: "I found X in repo Y that does something similar"
53
+
54
+ If existing code covers part of the need, build on it rather than proposing greenfield work.
55
+
56
+ ### Step 4: Gather Requirements
57
+
58
+ Use the AskUserQuestion tool to gather requirements interactively.
59
+
60
+ Rules:
61
+ - **One question per message.** Do not batch multiple questions.
62
+ - **Prefer multiple-choice** over open-ended questions. Provide 2-4 concrete options based on what you found in the codebase.
63
+ - Cover these dimensions (not all at once; ask only what is relevant):
64
+ - Scope: what is in/out
65
+ - Target repos: which repos are affected
66
+ - Constraints: performance, compatibility, timeline
67
+ - Dependencies: what must exist first
68
+ - Users: who benefits from this
69
+
70
+ ### Step 5: Propose 2-3 Approaches
71
+
72
+ For each approach, include:
73
+ - **Summary:** one-sentence description
74
+ - **Trade-offs:** pros, cons, effort
75
+ - **Repos affected:** which repos need changes
76
+ - **Reuse opportunities:** what existing code can be leveraged
77
+ - **Concrete code references:** point to specific files/functions in real repos
78
+
79
+ ### Step 6: Present Design in Sections
80
+
81
+ Break the design into focused sections, each covering one concern.
82
+ Wait for user feedback between sections. Sections might include:
83
+ - Data model / schema changes
84
+ - API contracts
85
+ - Pipeline configuration
86
+ - Integration points with existing code
87
+ - Testing strategy
88
+
89
+ ### Step 7: Write Design Document
90
+
91
+ After user approval, save the design to `docs/specs/YYYY-MM-DD-<topic>-design.md`
92
+ in the relevant project repo. Include:
93
+ - Problem statement
94
+ - Chosen approach (with rationale)
95
+ - Detailed design per section
96
+ - Open questions (if any remain)
97
+ - References to existing code being reused
98
+
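The file name pattern `docs/specs/YYYY-MM-DD-<topic>-design.md` can be built like this. It is a sketch: `design_doc_path` is a hypothetical helper, and turning the topic into a lowercase hyphenated slug is an assumption, since the diff does not say how `<topic>` should be written.

```python
import re
from datetime import date
from pathlib import Path
from typing import Optional


def design_doc_path(topic: str, today: Optional[date] = None) -> Path:
    """Return docs/specs/YYYY-MM-DD-<topic>-design.md for the given topic."""
    today = today or date.today()
    # ASSUMPTION: topic becomes lowercase words joined by single hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    return Path("docs/specs") / f"{today.isoformat()}-{slug}-design.md"
```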
99
+ ### Step 8: Self-Review
100
+
101
+ Before presenting the final spec, review it for:
102
+ - Placeholders or vague language ("TBD", "as appropriate", "handle errors")
103
+ - Contradictions between sections
104
+ - Scope creep beyond what was agreed
105
+ - Missing error paths or edge cases
106
+ - Naming convention violations (invoke `ai.pierre:naming-conventions-reviewer` if code is shown)
107
+
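The placeholder check in the first bullet lends itself to a mechanical pass before the manual review. This sketch only looks for the three phrases quoted above; the function name is hypothetical, and the other review items (contradictions, scope creep, missing edge cases) still need a human or model read.

```python
import re

# Phrases taken from the self-review list above.
PLACEHOLDER_PHRASES = ("TBD", "as appropriate", "handle errors")


def find_placeholders(text: str) -> list:
    """Return (line_number, phrase) pairs for each placeholder phrase found."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for phrase in PLACEHOLDER_PHRASES:
            # Whole-phrase, case-insensitive match.
            if re.search(rf"\b{re.escape(phrase)}\b", line, flags=re.IGNORECASE):
                hits.append((line_number, phrase))
    return hits
```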
108
+ ### Step 9: User Review and Transition
109
+
110
+ Present the spec for final user review. After approval, offer to invoke `/plan` to create
111
+ implementation tasks from the approved design.