data_drain 0.3.0 → 0.3.1
- checksums.yaml +4 -4
- data/.rubocop.yml +40 -1
- data/CHANGELOG.md +32 -0
- data/CLAUDE.md +14 -0
- data/README.md +2 -0
- data/data_drain.gemspec +1 -1
- data/docs/IMPROVEMENT_PLAN.md +122 -21
- data/docs/execution/archive/v0.3.0-OBSERVACIONES.md +136 -0
- data/docs/execution/archive/v0.3.0.md +1111 -0
- data/docs/execution/v0.3.1-OBSERVACIONES.md +146 -0
- data/docs/execution/v0.3.1.md +842 -0
- data/lib/data_drain/engine.rb +3 -2
- data/lib/data_drain/file_ingestor.rb +1 -1
- data/lib/data_drain/observability.rb +2 -0
- data/lib/data_drain/storage/base.rb +12 -0
- data/lib/data_drain/storage/local.rb +1 -3
- data/lib/data_drain/storage/s3.rb +5 -3
- data/lib/data_drain/types/json_type.rb +1 -0
- data/lib/data_drain/validations.rb +2 -0
- data/lib/data_drain/version.rb +2 -1
- data/lib/data_drain.rb +1 -0
- data/skill/references/antipatrones.md +10 -0
- data/skill/references/postgres-tuning.md +14 -0
- metadata +6 -2
|
@@ -0,0 +1,842 @@
|
|
|
1
|
+
# Execution Plan: v0.3.1

**Target release:** v0.3.1: Quality, CI, and DX (closes the original roadmap)
**Roadmap items:** 12, 13, 14, 15, 16, 17, 18, 19, 22, 23, 24 ([see IMPROVEMENT_PLAN.md](../IMPROVEMENT_PLAN.md))
**Suggested branch:** `feature/v0.3.1`
**Base:** `main` (contains v0.3.0)
**Status:** Not started
**Last updated:** 2026-04-15 (review incorporated)
|
|
9
|
+
|
|
10
|
+
---
|
|
11
|
+
|
|
12
|
+
## Context

v0.3.0 completed the core refactor and advanced observability. v0.3.1 **closes the original 24-item roadmap** by paying down quality debt: RuboCop on specs, a Ruby version matrix, YARD coverage, CI polish, and cosmetic fixes. It also fixes one pre-existing workflow bug (the branch trigger).

**Breaking:** `required_ruby_version` rises from `>= 3.0.0` to `>= 3.2` (Ruby 3.0 and 3.1 are officially EOL).
|
|
17
|
+
|
|
18
|
+
---
|
|
19
|
+
|
|
20
|
+
## Agent review: incorporated

Reviewed by **big-pickle** on 2026-04-15 (`docs/execution/v0.3.1-OBSERVACIONES.md`). The review supplied measured baselines and raised 6 issues with the plan. Every observation is incorporated in this revision:
|
|
23
|
+
|
|
24
|
+
### Measured baseline vs. original plan

| Aspect | Original plan estimate | Measured |
|--------|------------------------|----------|
| RuboCop offenses in `spec/` | ~48 | **37** (12 files) |
| `COUNT(*)` in `lib/` | not quantified | **3** (engine.rb×2, file_ingestor.rb×1) |
| `stub_const` in specs | 2 files (S3 only) | **2 files: S3 + GlueRunner** ← item 19 scope widened |
| Coverage threshold | 80% | **80%** (actual coverage 97.49%) |
| CI branch trigger | main | **master** (pre-existing bug) ← fixed in item 14 |
| Ruby matrix | 3.4.4 only | **3.4.4 only** |
|
|
34
|
+
|
|
35
|
+
### Breakdown of the 37 offenses in `spec/`

```
Metrics/BlockLength          29   ← long describe/context blocks
Layout/LineLength             5   ← safe auto-correct
Naming/VariableNumber         2   ← test_1/test_2
Style/MultilineBlockChain     1   ← auto-correct
---
Total                        37
```
|
|
45
|
+
|
|
46
|
+
### 3 decisions made

1. **Minimum Ruby: `>= 3.2`** (big-pickle's Option B).
   - Rationale: Ruby 3.0 has been EOL since 2024-03-30 and Ruby 3.1 since 2025-03-31. CI is shorter (3 versions instead of 5) and matches RuboCop's `TargetRubyVersion: 3.2`.
   - Preventive BREAKING change, documented in the CHANGELOG.

2. **Postgres in CI: mocks are enough (Option C).**
   - Confirmed: Engine tests use `instance_double(PG::Connection)` and `instance_double(PG::Result)` (see `engine_spec.rb`, since v0.2.0).
   - Postgres is NOT started as a service container.
   - Tests against a real PG remain a possible set of optional integration tests (out of scope for v0.3.1; to be opened post-roadmap).

3. **`rubocop-rspec`: dropped for v0.3.1.**
   - Adding it would widen the scope (new RSpec-specific offenses).
   - Recorded as item 25 (post-roadmap) in case it is needed later.
|
|
60
|
+
|
|
61
|
+
### Plan issues (first-pass resolution)

| # | big-pickle issue | Resolution |
|---|------------------|------------|
| 4.1 | Optional `rubocop-rspec` was ambiguous | Explicitly dropped (decision 3) |
| 4.2 | Items 14 and 17 had an implicit dependency | **Merged into Phase 1** |
| 4.3 | CI workflow triggers on `master`, not `main` | Included in the Phase 1 fix |
| 4.4 | DuckDB binary is amd64-only | OK for now (runners are amd64) |
| 4.5 | DuckDB compatibility with Ruby 3.2/3.3 | Explicit verification in Phase 1.7 |
| 4.6 | Postgres in CI | Resolved by decision 2 (mocks) |
|
|
71
|
+
|
|
72
|
+
### Second-pass issues from big-pickle (also incorporated)

| # | Issue | Severity | Resolution |
|---|-------|----------|------------|
| 6.1 | Matrix YAML in the plan was ambiguous | Material | Phase 1.5 rewritten with explicit indentation and a note that `strategy` does NOT contain `steps`. YAML syntax validation added |
| 6.2 | Badge URL depends on the workflow file name | Low | Phase 2.2 adds a verification step (`ls .github/workflows/`) and a `<WORKFLOW_FILENAME>` placeholder |
| 6.3 | YARD estimate too low (4-6h → 5-8h actual) | Adjustment | v0.3.1 total updated to 15-21h. Phase 5.1 shows the breakdown (~47 items × 3-5 min) |
| 6.4 | Bundler cache is not used in CI | Improvement | Added as post-roadmap item 30 (saves ~6 min per matrix run) |
| 6.5 | Mark items `[~]` before executing | Operational | New Phase 0.2.1: edit `IMPROVEMENT_PLAN.md` to mark items 12-19 and 22-24 as `[~]` before starting |
|
|
81
|
+
|
|
82
|
+
### Item 13 absorbed into Phase 0

A 30-minute inline warm-up. It no longer has its own phase.

### Item 19 scope widened

Covers `glue_runner_spec.rb` in addition to `s3_spec.rb` (big-pickle found `stub_const` there too).
|
|
89
|
+
|
|
90
|
+
---
|
|
91
|
+
|
|
92
|
+
## Release items (execution order)

| Phase | Items | Estimate |
|-------|-------|----------|
| 0 | Setup + **Item 13** (`build_path_base`, inline) | 45min |
| 1 | **Items 17 + 14 + 18** (merged: RuboCop on specs + workflow + Ruby matrix + branch fix) | 4-6h |
| 2 | **Items 22 + 24** (RuboCop cache + CI badge) | 40min |
| 3 | **Item 19** (`stub_responses` migration: S3 + Glue) | 2h |
| 4 | **Items 23 + 16** (coverage threshold bump + Friendly SQL) | 1-2h |
| 5 | **Items 12 + 15** (YARD + DEBUG/tuning docs) | 7-10h |
| 6 | Release | 30min |

**Revised total:** 15-21h (2-3 days). That is below the original estimate (17-24h) thanks to the merged phases and the lower measured baseline, and slightly above the first revision (14-19h) because of the YARD adjustment from big-pickle's observation 6.3.
|
|
105
|
+
|
|
106
|
+
---
|
|
107
|
+
|
|
108
|
+
## Prerequisites (Phase 0)
|
|
109
|
+
|
|
110
|
+
### 0.1 Verificar entorno
|
|
111
|
+
|
|
112
|
+
- [ ] `git checkout main && git pull`
|
|
113
|
+
- [ ] Versión actual en `lib/data_drain/version.rb` = `"0.3.0"`
|
|
114
|
+
- [ ] `bundle exec rspec` pasa (143 specs, coverage 97.49%)
|
|
115
|
+
- [ ] `bundle exec rubocop lib/` sin ofensas
|
|
116
|
+
- [ ] `bundle exec rubocop spec/ 2>&1 | tail -5` — reporta **37 ofensas** (confirmar)
|
|
117
|
+
|
|
118
|
+
### 0.2 Crear branch
|
|
119
|
+
|
|
120
|
+
- [ ] `git checkout -b feature/v0.3.1`
|
|
121
|
+
|
|
122
|
+
### 0.2.1 Mark items as in progress (big-pickle observation 6.5)

`IMPROVEMENT_PLAN.md` convention: before starting an item, mark it `[~]` (in progress). This keeps the project's shared record consistent when multiple agents work in parallel.

- [ ] Edit `docs/IMPROVEMENT_PLAN.md`: change items 12, 13, 14, 15, 16, 17, 18, 19, 22, 23, 24 from `[ ]` to `[~]`.
- [ ] Commit: `chore: marcar items v0.3.1 como en progreso`
- [ ] They will be marked `[x]` in Phase 6.4 when the release is closed.
|
|
129
|
+
|
|
130
|
+
### 0.3 Item 13: extract `build_path_base` into `Storage::Base` (warm-up, 30min)

**Context:** `Storage::Local#build_path` (local.rb:26-29) and `Storage::S3#build_path` (s3.rb:21-24) duplicate the same logic. `Storage::Base` exists but holds no shared logic, so the common part moves there.

- [ ] Edit `lib/data_drain/storage/base.rb` and add at the end of the class:
```ruby
protected

# @param bucket [String]
# @param folder_name [String]
# @param partition_path [String, nil]
# @return [String] path without protocol prefix or glob suffix
def build_path_base(bucket, folder_name, partition_path)
  base = File.join(bucket, folder_name)
  base = File.join(base, partition_path) if partition_path && !partition_path.empty?
  base
end
```

- [ ] Refactor `lib/data_drain/storage/local.rb#build_path`:
```ruby
def build_path(bucket, folder_name, partition_path)
  "#{build_path_base(bucket, folder_name, partition_path)}/**/*.parquet"
end
```

- [ ] Refactor `lib/data_drain/storage/s3.rb#build_path`:
```ruby
def build_path(bucket, folder_name, partition_path)
  "s3://#{build_path_base(bucket, folder_name, partition_path)}/**/*.parquet"
end
```

- [ ] Validate: `bundle exec rspec spec/data_drain/storage/` and `bundle exec rubocop lib/data_drain/storage/`
|
|
164
|
+
|
|
165
|
+
- [ ] Commit: `refactor(storage): extraer build_path_base a Base (item 13)`
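
The refactor in this step can be tried out in isolation with a minimal sketch. The `StorageSketch` namespace and the sample paths are assumptions made so the example runs without the gem; only `build_path_base` and `build_path` come from the plan.

```ruby
# Stand-alone sketch of the refactor; StorageSketch is a hypothetical namespace,
# not the gem's real DataDrain::Storage module.
module StorageSketch
  class Base
    protected

    # Shared path logic: bucket/folder[/partition], no protocol, no glob.
    def build_path_base(bucket, folder_name, partition_path)
      base = File.join(bucket, folder_name)
      base = File.join(base, partition_path) if partition_path && !partition_path.empty?
      base
    end
  end

  class Local < Base
    def build_path(bucket, folder_name, partition_path)
      "#{build_path_base(bucket, folder_name, partition_path)}/**/*.parquet"
    end
  end

  class S3 < Base
    def build_path(bucket, folder_name, partition_path)
      "s3://#{build_path_base(bucket, folder_name, partition_path)}/**/*.parquet"
    end
  end
end

puts StorageSketch::Local.new.build_path("/var/data", "versions", "isp_id=42")
puts StorageSketch::S3.new.build_path("my-bucket", "versions", nil)
```

Because `build_path_base` is `protected`, subclasses can call it but it stays out of the adapters' public interface, which seems to be the intent of the `protected` keyword in the snippet for `base.rb`.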
|
|
166
|
+
|
|
167
|
+
### Checkpoint Fase 0
|
|
168
|
+
|
|
169
|
+
- [ ] Branch creado
|
|
170
|
+
- [ ] Baseline: 37 ofensas specs confirmadas
|
|
171
|
+
- [ ] Item 13 cerrado (commit + tests verdes)
|
|
172
|
+
|
|
173
|
+
---
|
|
174
|
+
|
|
175
|
+
## Phase 1: Items 17 + 14 + 18 (full CI + Ruby matrix)
|
|
176
|
+
|
|
177
|
+
**Roadmap:**
|
|
178
|
+
- [Item 17](../IMPROVEMENT_PLAN.md#item-17--arreglar-48-ofensas-rubocop-en-spec-y-re-habilitar-en-ci)
|
|
179
|
+
- [Item 14](../IMPROVEMENT_PLAN.md#item-14--ci-con-github-actions)
|
|
180
|
+
- [Item 18](../IMPROVEMENT_PLAN.md#item-18--matrix-ruby-en-ci-32-33-34)
|
|
181
|
+
|
|
182
|
+
**Context:** merged per big-pickle's observation 4.2. Item 17 should not be merged without item 14: RuboCop would pass locally, but CI would not run it. This phase also includes the `master`→`main` branch fix (observation 4.3) and the Ruby matrix (item 18).
|
|
183
|
+
|
|
184
|
+
### 1.1 Remove the `spec/` exclusion from RuboCop

- [ ] Edit `.rubocop.yml`:
  - Remove:
```yaml
AllCops:
  Exclude:
    - spec/
```
  - Keep `TargetRubyVersion: 3.2` and the other settings.
- [ ] Verify: `bundle exec rubocop spec/ 2>&1 | tail -3` and confirm 37 offenses.
|
|
195
|
+
|
|
196
|
+
### 1.2 Auto-correct seguro
|
|
197
|
+
|
|
198
|
+
- [ ] `bundle exec rubocop spec/ --autocorrect`
|
|
199
|
+
  - Fixes automatically: 5 `Layout/LineLength` and 1 `Style/MultilineBlockChain`. `Metrics/BlockLength` is not auto-correctable and is handled in 1.3.
|
|
200
|
+
- [ ] `bundle exec rspec` — verificar tests verdes tras auto-correct.
|
|
201
|
+
- [ ] Commit: `style(spec): rubocop autocorrect safe (item 17)`
|
|
202
|
+
|
|
203
|
+
### 1.3 Config para ofensas no auto-correctibles
|
|
204
|
+
|
|
205
|
+
The 29 `Metrics/BlockLength` offenses cannot be auto-corrected: they are legitimately long `describe`/`context` blocks. Exclude specs from the limit:

- [ ] Edit `.rubocop.yml` and add:
```yaml
Metrics/BlockLength:
  Exclude:
    - spec/**/*_spec.rb
    - data_drain.gemspec
```
|
|
214
|
+
|
|
215
|
+
- [ ] For the 2 `Naming/VariableNumber` offenses (e.g. `test_1`/`test_2`): decide whether to rename or relax the cop.
  - Option A: rename to `first_test`, `second_test`
  - Option B: add config:
```yaml
Naming/VariableNumber:
  EnforcedStyle: snake_case # accepts test_1; the default normalcase is what flags it
```
  - **Recommendation:** A (rename). Clearer, and the change is trivial.
|
|
223
|
+
|
|
224
|
+
- [ ] `bundle exec rubocop` — verificar 0 ofensas en todo el proyecto.
|
|
225
|
+
- [ ] Commit: `chore(rubocop): config para Metrics/BlockLength + rename variables (item 17)`
|
|
226
|
+
|
|
227
|
+
### 1.4 Fix CI workflow: `master` → `main`

big-pickle observation 4.3: the workflow trigger points at `master`, but the repo uses `main`. Pre-existing bug.

- [ ] Edit `.github/workflows/main.yml`:
```yaml
on:
  push:
    branches: [ main ]   # ← was master
  pull_request:
    branches: [ main ]   # ← was master
```
|
|
239
|
+
|
|
240
|
+
### 1.5 Matrix Ruby 3.2 / 3.3 / 3.4 (item 18)
|
|
241
|
+
|
|
242
|
+
**Decision:** `required_ruby_version = ">= 3.2"` (see decision 1 in the agent review section).
|
|
243
|
+
|
|
244
|
+
- [ ] Editar `data_drain.gemspec`:
|
|
245
|
+
```ruby
|
|
246
|
+
spec.required_ruby_version = ">= 3.2"
|
|
247
|
+
```
|
|
248
|
+
|
|
249
|
+
- [ ] Edit `.github/workflows/main.yml`. Replace the current workflow content (only the `build` job; the `on:` trigger was already fixed in 1.4) with:

```yaml
# Explicit indentation: ALL attributes (runs-on, env, strategy, steps)
# are siblings under `build:`. `strategy:` does NOT contain `steps:`.
# (big-pickle observation 6.1.)
jobs:
  build:
    runs-on: ubuntu-latest
    name: Ruby ${{ matrix.ruby }}
    env:
      FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: true
    strategy:
      fail-fast: false
      matrix:
        ruby: ["3.2", "3.3", "3.4"]
    steps:
      - uses: actions/checkout@v4
      - uses: ruby/setup-ruby@v1
        with:
          ruby-version: ${{ matrix.ruby }}
          bundler-cache: false
      - name: Download DuckDB library
        run: |
          curl -sOL https://github.com/duckdb/duckdb/releases/download/v1.4.4/libduckdb-linux-amd64.zip
          unzip -o libduckdb-linux-amd64.zip -d libduckdb
      - name: Install DuckDB library
        run: |
          sudo cp libduckdb/libduckdb.so /usr/local/lib/
          sudo cp libduckdb/duckdb.h /usr/local/include/
          sudo ldconfig
      - name: Install gems
        run: bundle install --jobs 4 --retry 3
      - name: Run tests
        run: bundle exec rspec
```
|
|
285
|
+
|
|
286
|
+
- [ ] Validar sintaxis YAML antes de pushear: `yq eval '.' .github/workflows/main.yml` o `python -c "import yaml; yaml.safe_load(open('.github/workflows/main.yml'))"`.
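
A syntax check alone will not catch `steps:` accidentally nested under `strategy:`, since that is still valid YAML. As a rough sketch, a structural check can be written in Ruby with the standard `yaml` library; the helper name `workflow_problems` and the two trimmed fixture workflows are assumptions for illustration, not part of the project.

```ruby
require "yaml"

# Hypothetical helper: returns a list of structural problems in a workflow.
# Malformed YAML raises Psych::SyntaxError, which doubles as the syntax check.
def workflow_problems(yaml_text)
  build = YAML.safe_load(yaml_text).fetch("jobs").fetch("build")
  problems = []
  %w[runs-on strategy steps].each do |key|
    problems << "build is missing #{key}" unless build.key?(key)
  end
  strategy = build["strategy"]
  if strategy.is_a?(Hash) && strategy.key?("steps")
    problems << "strategy must not contain steps"
  end
  problems
end

# Trimmed fixture with the intended layout: steps is a sibling of strategy.
def good_workflow
  <<~YAML
    jobs:
      build:
        runs-on: ubuntu-latest
        strategy:
          fail-fast: false
          matrix:
            ruby: ["3.2", "3.3", "3.4"]
        steps:
          - uses: actions/checkout@v4
  YAML
end

# Trimmed fixture with the mistake from observation 6.1: steps under strategy.
def bad_workflow
  <<~YAML
    jobs:
      build:
        runs-on: ubuntu-latest
        strategy:
          matrix:
            ruby: ["3.2"]
          steps:
            - uses: actions/checkout@v4
  YAML
end

puts workflow_problems(good_workflow).inspect
puts workflow_problems(bad_workflow).inspect
```

To check the real file, the same helper could be called with `File.read(".github/workflows/main.yml")`.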
|
|
287
|
+
|
|
288
|
+
### 1.6 Add RuboCop to the workflow (item 14)

- [ ] Add a step before "Run tests":
```yaml
      - name: Run RuboCop
        run: bundle exec rubocop --no-color
```
|
|
295
|
+
|
|
296
|
+
### 1.7 Local Ruby compatibility check (observation 4.5)

Before pushing, verify that the `duckdb` gem compiles on Ruby 3.2 and 3.3:

- [ ] If chruby/rbenv/asdf is available:
```bash
chruby 3.2.x && bundle install && bundle exec rspec
chruby 3.3.x && bundle install && bundle exec rspec
```
- [ ] If Ruby 3.2/3.3 are not installed locally: rely on CI and fix whatever fails. Possible problems:
  - Post-3.3 syntax used by mistake (e.g. the implicit `it` block parameter is 3.4+)
  - The `duckdb` gem's C extension has no precompiled binary for some version
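
To see which interpreters the new `required_ruby_version` admits, the constraint can be evaluated with RubyGems' own `Gem::Requirement` and `Gem::Version` classes. This is only a sketch: the helper name `supported_ruby?` and the sample version strings are illustrative assumptions.

```ruby
require "rubygems"

# Hypothetical helper: does a Ruby version satisfy the gemspec constraint ">= 3.2"?
def supported_ruby?(version_string)
  Gem::Requirement.new(">= 3.2").satisfied_by?(Gem::Version.new(version_string))
end

%w[3.0.7 3.1.6 3.2.0 3.3.5 3.4.4].each do |version|
  puts "#{version}: #{supported_ruby?(version) ? 'accepted' : 'rejected'}"
end

# The interpreter running this script:
puts "current (#{RUBY_VERSION}): #{supported_ruby?(RUBY_VERSION) ? 'accepted' : 'rejected'}"
```

Running it under each locally installed Ruby gives a quick sanity check before relying on the CI matrix.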
|
|
308
|
+
|
|
309
|
+
### 1.8 Commits Fase 1
|
|
310
|
+
|
|
311
|
+
- [ ] (Si falta separar) `ci: agregar rubocop al workflow (item 14)`
|
|
312
|
+
- [ ] `ci: matrix Ruby 3.2/3.3/3.4 + required_ruby_version bump (item 18)`
|
|
313
|
+
- [ ] `fix(ci): branch trigger master → main (observación big-pickle)`
|
|
314
|
+
|
|
315
|
+
Commit order is flexible. What matters is that after the last one, `git push origin feature/v0.3.1` yields green CI on all 3 Ruby versions.
|
|
316
|
+
|
|
317
|
+
### 1.9 Push y verificar
|
|
318
|
+
|
|
319
|
+
- [ ] `git push origin feature/v0.3.1`
|
|
320
|
+
- [ ] Esperar CI. Verificar que los 3 jobs (Ruby 3.2/3.3/3.4) pasan.
|
|
321
|
+
- [ ] Si 3.2 o 3.3 falla:
|
|
322
|
+
- Leer logs de compilación de `duckdb` gem
|
|
323
|
+
- Leer logs de rspec (sintaxis incompatible)
|
|
324
|
+
- Aplicar fixes y commit con prefijo `fix(ci):`
|
|
325
|
+
|
|
326
|
+
### Checkpoint Fase 1
|
|
327
|
+
|
|
328
|
+
- [ ] 0 ofensas rubocop en todo el proyecto
|
|
329
|
+
- [ ] CI corre en las 3 versiones Ruby
|
|
330
|
+
- [ ] RuboCop incluido en CI
|
|
331
|
+
- [ ] Workflow dispara en `main` (no `master`)
|
|
332
|
+
- [ ] `required_ruby_version = ">= 3.2"`
|
|
333
|
+
|
|
334
|
+
---
|
|
335
|
+
|
|
336
|
+
## Phase 2: Items 22 + 24 (RuboCop cache + CI badge)
|
|
337
|
+
|
|
338
|
+
**Roadmap:**
|
|
339
|
+
- [Item 22](../IMPROVEMENT_PLAN.md#item-22--cache-de-rubocop-en-ci)
|
|
340
|
+
- [Item 24](../IMPROVEMENT_PLAN.md#item-24--ci-badge-en-readme)
|
|
341
|
+
|
|
342
|
+
### 2.1 Cache RuboCop (item 22)
|
|
343
|
+
|
|
344
|
+
- [ ] Edit `.github/workflows/main.yml` and add a step before "Run RuboCop":
```yaml
      - uses: actions/cache@v4
        with:
          path: ~/.cache/rubocop_cache
          key: rubocop-${{ matrix.ruby }}-${{ hashFiles('.rubocop.yml') }}
          restore-keys: rubocop-${{ matrix.ruby }}-
```
- [ ] Push and verify:
  - First run: cache miss, normal RuboCop time.
  - Second run (with `.rubocop.yml` unchanged): cache hit, RuboCop time < 5s.
|
|
355
|
+
|
|
356
|
+
### 2.2 CI badge (item 24)
|
|
357
|
+
|
|
358
|
+
- [ ] **Verificar nombre real del workflow** (observación 6.2 big-pickle):
|
|
359
|
+
```bash
|
|
360
|
+
ls .github/workflows/
|
|
361
|
+
```
|
|
362
|
+
If it is `main.yml`, use the URL below. If it was renamed (e.g. `ci.yml`), adjust the badge path accordingly.
|
|
363
|
+
|
|
364
|
+
- [ ] Editar `README.md`. Insertar después del título:
|
|
365
|
+
```markdown
|
|
366
|
+
# DataDrain
|
|
367
|
+
|
|
368
|
+
[![CI](https://github.com/gedera/data_drain/actions/workflows/<WORKFLOW_FILENAME>/badge.svg)](https://github.com/gedera/data_drain/actions/workflows/<WORKFLOW_FILENAME>)
|
|
369
|
+
|
|
370
|
+
Micro-framework Ruby para extraer...
|
|
371
|
+
```
|
|
372
|
+
Replace `<WORKFLOW_FILENAME>` with the real name verified above (typically `main.yml`).
|
|
373
|
+
|
|
374
|
+
### 2.3 Commits + push
|
|
375
|
+
|
|
376
|
+
- [ ] Commit: `ci: cache rubocop por ruby version y config hash (item 22)`
|
|
377
|
+
- [ ] Commit: `docs(readme): agregar CI badge (item 24)`
|
|
378
|
+
- [ ] Push y verificar badge verde.
|
|
379
|
+
|
|
380
|
+
### Checkpoint Fase 2
|
|
381
|
+
|
|
382
|
+
- [ ] Cache rubocop activo (tiempo subsecuentes <5s)
|
|
383
|
+
- [ ] Badge verde visible en README
|
|
384
|
+
|
|
385
|
+
---
|
|
386
|
+
|
|
387
|
+
## Phase 3: Item 19 (migrate tests to `stub_responses`)
|
|
388
|
+
|
|
389
|
+
**Roadmap:** [Item 19](../IMPROVEMENT_PLAN.md#item-19--migrar-tests-s3-de-stub_const-a-aws_s3_client_stub_responses)
|
|
390
|
+
|
|
391
|
+
**big-pickle observation:** scope widened. Both `s3_spec.rb` and `glue_runner_spec.rb` contain `stub_const("Aws::...", ...)`.
|
|
392
|
+
|
|
393
|
+
### 3.1 Migrar `spec/data_drain/storage/s3_spec.rb`
|
|
394
|
+
|
|
395
|
+
- [ ] Replace `stub_const("Aws::S3::Client", Class.new ...)` with:
```ruby
let(:s3_client) { Aws::S3::Client.new(stub_responses: true, region: "us-east-1") }

before do
  allow(Aws::S3::Client).to receive(:new).and_return(s3_client)
end
```

- [ ] Tests for `destroy_partitions`:
```ruby
it "arma prefix con folder y primera partition key" do
  s3_client.stub_responses(:list_objects_v2, ->(context) {
    expect(context.params[:bucket]).to eq("my-bucket")
    expect(context.params[:prefix]).to eq("versions/isp_id=42/")
    { contents: [] }
  })

  adapter.destroy_partitions(
    "my-bucket", "versions", %i[isp_id year month], { isp_id: 42 }
  )
end

it "borra objetos matching pattern completo" do
  s3_client.stub_responses(:list_objects_v2, {
    contents: [
      { key: "versions/isp_id=42/year=2026/month=3/data.parquet" },
      { key: "versions/isp_id=42/year=2026/month=3/metadata.parquet" }
    ]
  })

  deleted_keys = []
  s3_client.stub_responses(:delete_objects, ->(context) {
    deleted_keys.concat(context.params[:delete][:objects].map { |o| o[:key] })
    { deleted: deleted_keys.map { |k| { key: k } } }
  })

  adapter.destroy_partitions(
    "my-bucket", "versions", %i[isp_id year month], { isp_id: 42, year: 2026, month: 3 }
  )

  expect(deleted_keys).to include(
    "versions/isp_id=42/year=2026/month=3/data.parquet",
    "versions/isp_id=42/year=2026/month=3/metadata.parquet"
  )
end
```
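
The two specs imply a two-step behavior: list objects under a prefix built from the folder and the first partition key, then delete only the keys that match every requested partition. The plan does not show how `destroy_partitions` does this, so the following is one hypothetical reading, using only the standard library. The helper names `list_prefix` and `matching_keys` are invented for the sketch.

```ruby
# Hypothetical helper: prefix for list_objects_v2, built from the folder and
# the first partition key only (what the first spec asserts).
def list_prefix(folder_name, partition_keys, partitions)
  first_key = partition_keys.first
  return "#{folder_name}/" unless first_key && partitions.key?(first_key)

  "#{folder_name}/#{first_key}=#{partitions[first_key]}/"
end

# Hypothetical helper: keep only object keys whose path contains every
# requested key=value partition segment.
def matching_keys(object_keys, partition_keys, partitions)
  wanted = partition_keys.select { |key| partitions.key?(key) }
                         .map { |key| "#{key}=#{partitions[key]}" }
  object_keys.select do |object_key|
    segments = object_key.split("/")
    wanted.all? { |segment| segments.include?(segment) }
  end
end

keys = %i[isp_id year month]
listed = [
  "versions/isp_id=42/year=2026/month=3/data.parquet",
  "versions/isp_id=42/year=2026/month=4/data.parquet"
]

puts list_prefix("versions", keys, { isp_id: 42 })
puts matching_keys(listed, keys, { isp_id: 42, year: 2026, month: 3 }).inspect
```

If the real adapter builds its prefix or filter differently, the specs remain the source of truth; the sketch only illustrates the behavior they assert.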
|
|
442
|
+
|
|
443
|
+
### 3.2 Migrar `spec/data_drain/glue_runner_spec.rb` (scope ampliado)
|
|
444
|
+
|
|
445
|
+
- [ ] Replace `stub_const("Aws::Glue::Client", Class.new ...)` with the `stub_responses` equivalent:
```ruby
let(:glue_client) { Aws::Glue::Client.new(stub_responses: true, region: "us-east-1") }

before do
  allow(Aws::Glue::Client).to receive(:new).and_return(glue_client)
end
```

- [ ] Rewrite the tests using `stub_responses`:
```ruby
it "retorna true cuando SUCCEEDED inmediato" do
  glue_client.stub_responses(:start_job_run, { job_run_id: "run-123" })
  glue_client.stub_responses(:get_job_run, {
    job_run: { job_run_state: "SUCCEEDED", error_message: nil }
  })

  result = described_class.run_and_wait("my-job", { "--key" => "val" })
  expect(result).to be true
end

it "hace polling hasta SUCCEEDED" do
  glue_client.stub_responses(:start_job_run, { job_run_id: "run-123" })
  glue_client.stub_responses(:get_job_run, [
    { job_run: { job_run_state: "RUNNING", error_message: nil } },
    { job_run: { job_run_state: "SUCCEEDED", error_message: nil } }
  ])

  allow(Kernel).to receive(:sleep)

  result = described_class.run_and_wait("my-job", {}, polling_interval: 5)
  expect(result).to be true
end
```
|
|
479
|
+
|
|
480
|
+
### 3.3 Validación
|
|
481
|
+
|
|
482
|
+
- [ ] `bundle exec rspec spec/data_drain/storage/s3_spec.rb spec/data_drain/glue_runner_spec.rb`
|
|
483
|
+
- [ ] `grep -rn "stub_const.*Aws::" spec/` — debe retornar vacío.
|
|
484
|
+
- [ ] Coverage igual o superior.
|
|
485
|
+
|
|
486
|
+
### 3.4 Commit
|
|
487
|
+
|
|
488
|
+
- [ ] Commit: `refactor(spec): migrar stub_const a stub_responses en S3 y Glue (item 19)`
|
|
489
|
+
|
|
490
|
+
### Checkpoint Fase 3
|
|
491
|
+
|
|
492
|
+
- [ ] 0 `stub_const("Aws::...", ...)` en spec/
|
|
493
|
+
- [ ] Tests S3 + Glue verdes
|
|
494
|
+
|
|
495
|
+
---
|
|
496
|
+
|
|
497
|
+
## Phase 4: Items 23 + 16 (coverage threshold + Friendly SQL)
|
|
498
|
+
|
|
499
|
+
### 4.1 Item 23 — Bump coverage threshold a 90%
|
|
500
|
+
|
|
501
|
+
Trivial. No new tests are needed (current coverage 97.49% > 90%).

- [ ] Edit `spec/spec_helper.rb`:
```ruby
SimpleCov.start do
  add_filter "/spec/"
  minimum_coverage 90 # was 80
end
```

- [ ] `bundle exec rspec` must pass (97.49% > 90%).
|
|
512
|
+
|
|
513
|
+
- [ ] Commit: `test: subir SimpleCov minimum_coverage a 90% (item 23)`
|
|
514
|
+
|
|
515
|
+
### 4.2 Item 16 — DuckDB Friendly SQL (`count()`)
|
|
516
|
+
|
|
517
|
+
**big-pickle observation:** there are 3 `COUNT(*)` in `lib/` (engine.rb×2, file_ingestor.rb×1). All three become `count()`; the `SELECT *` inside the `COPY` stays as is.

- [ ] Edit `lib/data_drain/engine.rb`:
  - Line ~163 (`get_postgres_count`): `SELECT COUNT(*) AS row_count` → `SELECT count() AS row_count` ✅
  - Line ~199 (`verify_integrity`): `SELECT COUNT(*) FROM read_parquet(...)` → `SELECT count() FROM read_parquet(...)` ✅
  - Line ~206 (`export_to_parquet`, inside the COPY): **LEAVE `SELECT *` AS IS**. The `SELECT *` inside `COPY (SELECT ... FROM postgres_query...) TO ...` is standard, readable SQL; changing it adds nothing.

- [ ] Edit `lib/data_drain/file_ingestor.rb`:
  - Line ~53 (`source_count`): `SELECT COUNT(*) FROM #{reader_function}` → `SELECT count() FROM #{reader_function}` ✅

- [ ] `grep -rn "COUNT(\*)" lib/` should return nothing, since the `COPY` statement contains `SELECT *`, not `COUNT(*)`.

### 4.3 Validation

- [ ] `bundle exec rspec`: tests green (`count()` is valid DuckDB SQL). If any of these statements is sent verbatim to PostgreSQL rather than parsed by DuckDB, keep `COUNT(*)` there, because PostgreSQL rejects a bare `count()`.

### 4.4 Commits

- [ ] Separate commit: `refactor(engine,file_ingestor): count() friendly SQL de DuckDB (item 16)`

### Checkpoint Phase 4

- [ ] Coverage threshold = 90
- [ ] All 3 `COUNT(*)` replaced with `count()`
- [ ] Tests green
|
|
542
|
+
|
|
543
|
+
---
|
|
544
|
+
|
|
545
|
+
## Phase 5: Items 12 + 15 (docs: YARD + DEBUG/tuning)
|
|
546
|
+
|
|
547
|
+
### 5.1 Item 12 — YARD coverage 50% → 90%
|
|
548
|
+
|
|
549
|
+
**Revised estimate (big-pickle observation 6.3):** 5-8h, not 4-6h. Actual scope:

| Component | Items to document |
|-----------|-------------------|
| Configuration | ~21 (18 attrs + 3 methods) |
| Observability | 3 |
| Observability::Timing | 2 |
| Storage::Base/Local/S3 | ~12 |
| Errors | 4-5 |
| Record | 5 |
| **Total** | **~47 items × 3-5 min = 5-8h** |
|
|
560
|
+
|
|
561
|
+
#### 5.1.1 Medir baseline
|
|
562
|
+
|
|
563
|
+
- [ ] `yard stats --list-undoc`
|
|
564
|
+
- [ ] Documentar:
|
|
565
|
+
> Métodos públicos sin documentar: _______________
|
|
566
|
+
> Cobertura actual: _______________%
|
|
567
|
+
|
|
568
|
+
#### 5.1.2 Documentar por prioridad
|
|
569
|
+
|
|
570
|
+
**Prioridad alta (high-visibility):**
|
|
571
|
+
|
|
572
|
+
- [ ] **Configuration** — todos los atributos con YARD:
|
|
573
|
+
```ruby
|
|
574
|
+
# @!attribute [rw] storage_mode
|
|
575
|
+
# @return [Symbol] :local o :s3. Default :local.
|
|
576
|
+
# @!attribute [rw] aws_region
|
|
577
|
+
# @return [String, nil] Obligatorio si storage_mode == :s3.
|
|
578
|
+
# @!attribute [rw] aws_access_key_id
|
|
579
|
+
# @return [String, nil] Opcional — si nil usa AWS credential_chain (IAM role/env/~/.aws).
|
|
580
|
+
# @!attribute [rw] aws_secret_access_key
|
|
581
|
+
# @return [String, nil] Opcional — ver aws_access_key_id.
|
|
582
|
+
# @!attribute [rw] db_host
|
|
583
|
+
# @return [String] Host PostgreSQL. Default "127.0.0.1".
|
|
584
|
+
# @!attribute [rw] db_port
|
|
585
|
+
# @return [Integer] Default 5432.
|
|
586
|
+
# @!attribute [rw] db_user
|
|
587
|
+
# @return [String, nil] Obligatorio para Engine.
|
|
588
|
+
# @!attribute [rw] db_pass
|
|
589
|
+
# @return [String, nil] Opcional (auth peer/trust/IAM puede tener nil).
|
|
590
|
+
# @!attribute [rw] db_name
|
|
591
|
+
# @return [String, nil] Obligatorio para Engine.
|
|
592
|
+
# @!attribute [rw] batch_size
|
|
593
|
+
# @return [Integer] Default 5000. Registros por DELETE en purga.
|
|
594
|
+
# @!attribute [rw] throttle_delay
|
|
595
|
+
# @return [Float] Default 0.5. Segundos entre lotes.
|
|
596
|
+
# @!attribute [rw] idle_in_transaction_session_timeout
|
|
597
|
+
# @return [Integer, nil] Ms. 0 = DESACTIVADO (default). nil = no setear.
|
|
598
|
+
# @!attribute [rw] limit_ram
|
|
599
|
+
# @return [String, nil] Ej. "2GB". Límite memoria DuckDB.
|
|
600
|
+
# @!attribute [rw] tmp_directory
|
|
601
|
+
# @return [String, nil] Spill-to-disk DuckDB.
|
|
602
|
+
# @!attribute [rw] logger
|
|
603
|
+
# @return [Logger] Default Logger.new($stdout).
|
|
604
|
+
# @!attribute [rw] vacuum_after_purge
|
|
605
|
+
# @return [Boolean] Default false. Si true ejecuta VACUUM ANALYZE post-purga.
|
|
606
|
+
# @!attribute [rw] slow_batch_threshold_s
|
|
607
|
+
# @return [Integer] Default 30. Segundos por batch que disparan warning.
|
|
608
|
+
# @!attribute [rw] slow_batch_alert_after
|
|
609
|
+
# @return [Integer] Default 5. Lotes lentos consecutivos antes de engine.purge_degraded.
|
|
610
|
+
class Configuration
|
|
611
|
+
attr_accessor :storage_mode, :aws_region, ...
|
|
612
|
+
end
|
|
613
|
+
```
|
|
614
|
+
|
|
615
|
+
- [ ] **Observability** (métodos privados):
|
|
616
|
+
```ruby
|
|
617
|
+
# Emite un log estructurado de forma segura.
|
|
618
|
+
# Garantiza que el logging nunca interrumpa el proceso principal (Resilience).
|
|
619
|
+
#
|
|
620
|
+
# @param level [Symbol] :debug, :info, :warn, :error
|
|
621
|
+
# @param event [String] Ej. "engine.complete"
|
|
622
|
+
# @param metadata [Hash] Pares clave-valor de contexto
|
|
623
|
+
# @return [void]
|
|
624
|
+
def safe_log(level, event, metadata = {}); ...; end
|
|
625
|
+
|
|
626
|
+
# @param error [Exception]
|
|
627
|
+
# @return [Hash] :error_class y :error_message (truncado a 200 chars)
|
|
628
|
+
def exception_metadata(error); ...; end
|
|
629
|
+
|
|
630
|
+
# @return [String] Primer namespace de la clase en snake_case (ej. "data_drain")
|
|
631
|
+
def observability_name; ...; end
|
|
632
|
+
```
|
|
633
|
+
|
|
634
|
+
- [ ] **Observability::Timing** (mixin):
|
|
635
|
+
```ruby
|
|
636
|
+
# @return [Float] Segundos del clock monotónico
|
|
637
|
+
def monotonic; ...; end
|
|
638
|
+
|
|
639
|
+
# Mide duración de un bloque y acumula en @durations.
|
|
640
|
+
# @param step_name [Symbol] Clave para @durations
|
|
641
|
+
# @yield Bloque a medir
|
|
642
|
+
# @return [Object] Resultado del bloque
|
|
643
|
+
def timed(step_name); ...; end
|
|
644
|
+
```

- [ ] **Storage::Base / Local / S3**: complete YARD where it is missing.

- [ ] **Errors**: document the cause of each one:
```ruby
# Raised when the configuration is invalid or incomplete.
# @see Configuration#validate!
class ConfigurationError < Error; end

# Raised when COUNT(*) in Postgres != COUNT(*) in Parquet.
# Note: currently Engine#call returns false instead of raising it.
class IntegrityError < Error; end

# Problems interacting with local disk, S3, or DuckDB.
class StorageError < Error; end
```
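The note on `IntegrityError` implies that callers currently check a boolean rather than rescue an exception. A hedged sketch of what that could look like from the caller's side: the `SketchDrain` namespace, the `Error < StandardError` base class, the `call_or_raise` helper, and `FakeEngine` are hypothetical; only the three subclass names and the "returns false" behavior come from the plan.

```ruby
module SketchDrain
  class Error < StandardError; end # assumed base class
  class ConfigurationError < Error; end
  class IntegrityError < Error; end
  class StorageError < Error; end
end

# Hypothetical caller-side wrapper: turns a false result into IntegrityError.
def call_or_raise(engine)
  return true if engine.call

  raise SketchDrain::IntegrityError, "Postgres and Parquet row counts differ"
end

# Fake engine whose #call returns a preset result, for illustration only.
FakeEngine = Struct.new(:result) do
  def call
    result
  end
end
```

Because every class shares one base, a caller that does not care about the specific cause can rescue `SketchDrain::Error` alone.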

#### 5.1.3 Validate

- [ ] `yard stats --list-undoc`: target ≥ 90% coverage.
- [ ] Generate docs locally: `yard doc && open doc/index.html`

#### 5.1.4 Commit

- [ ] Commit: `docs(yard): coverage 50% → 90%+ (item 12)`

#### 5.1.5 Fallback (Plan B)

If YARD takes > 6h:
- Cover only Configuration + Observability + Observability::Timing in v0.3.1
- Defer Storage::* and Errors details to post-roadmap v0.4.0

### 5.2 Item 15: DEBUG in block form + tuning table

#### 5.2.1 CLAUDE.md DEBUG reinforcement

- [ ] Edit the "Logging" section of `CLAUDE.md` and add:
```markdown
### DEBUG in block form (mandatory)

Always use the block form to avoid the serialization cost when DEBUG is off:

✅ Correct:
logger.debug { "query=#{expensive_serialize(obj)}" }

❌ Incorrect: always evaluated, even when DEBUG is off:
logger.debug("query=#{expensive_serialize(obj)}")
```
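The difference between the two forms can be checked with the standard library `Logger` alone. In this sketch, `CountingSerializer` and `debug_serialization_counts` are hypothetical names standing in for `expensive_serialize`; the code counts how often the expensive step runs while the logger level is INFO.

```ruby
require "logger"
require "stringio"

# Hypothetical stand-in for an expensive serializer; counts its invocations.
class CountingSerializer
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def call(obj)
    @calls += 1
    obj.inspect
  end
end

def debug_serialization_counts
  logger = Logger.new(StringIO.new)
  logger.level = Logger::INFO # DEBUG is off
  payload = { table: "events", rows: 1_000 }

  eager = CountingSerializer.new
  # The string argument is built before #debug is even called.
  logger.debug("query=#{eager.call(payload)}")

  lazy = CountingSerializer.new
  # The block is never invoked because the severity is below the logger level.
  logger.debug { "query=#{lazy.call(payload)}" }

  { eager: eager.calls, lazy: lazy.calls }
end
```

Calling `debug_serialization_counts` reports one serialization for the eager form and none for the block form, which is the cost the rule is meant to avoid.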

- [ ] Update antipattern 11 (DEBUG without a block) in `skill/references/antipatrones.md` with a real DataDrain example (e.g. debug logging of the full export query).

#### 5.2.2 postgres-tuning.md: parameter tuning table

- [ ] Edit `skill/references/postgres-tuning.md` and add a section:
```markdown
## Tuning DataDrain parameters by table size

| Table rows | `batch_size` | `throttle_delay` | `vacuum_after_purge` | `slow_batch_threshold_s` |
|------------|-------------|-----------------|---------------------|-------------------------|
| <1M | 5000 | 0.1 | false | 30 |
| 1M-100M | 5000 | 0.5 | true | 30 |
| 100M-1B | 10000 | 1.0 | true | 60 |
| >1B | migrate to partitioning (see above) | | | |

Operational context:
- **Concurrent OLTP**: high throttle_delay (≥0.5s) to avoid saturation.
- **Cold tables** (no user queries): throttle_delay 0 is fine.
- **slow_batch_threshold_s**: set higher on large tables, because each batch legitimately takes longer.
```
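The table can also be read as a lookup. The helper below is a hypothetical sketch (DataDrain does not ship `SketchTuning`); it only restates the rows above, and the handling of exact boundaries (for example a table with exactly 100M rows) is an assumption, since the table leaves them open.

```ruby
# Hypothetical helper that mirrors the tuning table; not part of the gem.
module SketchTuning
  TIERS = [
    { max_rows: 1_000_000, batch_size: 5_000, throttle_delay: 0.1,
      vacuum_after_purge: false, slow_batch_threshold_s: 30 },
    { max_rows: 100_000_000, batch_size: 5_000, throttle_delay: 0.5,
      vacuum_after_purge: true, slow_batch_threshold_s: 30 },
    { max_rows: 1_000_000_000, batch_size: 10_000, throttle_delay: 1.0,
      vacuum_after_purge: true, slow_batch_threshold_s: 60 }
  ].freeze

  # Returns the suggested parameters for a table with row_count rows,
  # or nil from 1B rows upward, where the table recommends partitioning instead.
  def self.suggest(row_count)
    tier = TIERS.find { |candidate| row_count < candidate[:max_rows] }
    tier&.reject { |key, _| key == :max_rows }
  end
end
```

The values are starting points only; the operational context listed in the snippet (concurrent OLTP versus cold tables) still takes precedence over the row count.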

#### 5.2.3 Commit

- [ ] Commit: `docs: DEBUG en bloque + tabla tuning por tamaño (item 15)`

### Phase 5 checkpoint

- [ ] YARD ≥ 90% (or Plan B applied)
- [ ] CLAUDE.md includes the DEBUG example
- [ ] postgres-tuning.md includes the tuning table

---

## Phase 6: Release

### 6.1 Final lint + tests

- [ ] `bundle exec rubocop`: 0 offenses (lib + spec).
- [ ] `bundle exec rspec`: coverage ≥ 90%.
- [ ] CI green on Ruby 3.2, 3.3, 3.4.

### 6.2 CHANGELOG

- [ ] Edit `CHANGELOG.md` and add at the top:
```markdown
## [0.3.1] - 2026-XX-XX

### BREAKING (preventive)
- `required_ruby_version` bumped to `">= 3.2"` (Ruby 3.0 and 3.1 are EOL).

### Refactor
- Extracted `Storage::Base#build_path_base` to remove duplication between Local and S3. (item 13)
- Internal SQL queries adopt DuckDB's friendly `count()` syntax (2 of 3 occurrences). (item 16)

### Tests
- 37 RuboCop offenses in `spec/` fixed; RuboCop runs on the whole project. (item 17)
- S3 + GlueRunner tests migrated from `stub_const` to the native `Aws::*::Client.stub_responses`. (item 19)
- SimpleCov `minimum_coverage` raised to 90% (current coverage 97.49%). (item 23)

### CI
- Ruby 3.2 / 3.3 / 3.4 matrix. (item 18)
- RuboCop added to the workflow. (item 14)
- Fix: workflow trigger corrected from `master` to `main`. (item 14)
- RuboCop cache keyed by Ruby version + config hash. (item 22)
- CI badge in README. (item 24)

### Docs
- YARD coverage 50% → 90%+: Configuration, Observability, Observability::Timing, Errors, Storage::*. (item 12)
- CLAUDE.md reinforcement: block-form DEBUG is mandatory, with an example. (item 15)
- skill/references/postgres-tuning.md: new table for tuning DataDrain parameters by table size. (item 15)

### Roadmap
- **Original roadmap completed: 24/24 items closed.** Post-roadmap items moved to the "Follow-ups" section of IMPROVEMENT_PLAN.md.
```

### 6.3 Version bump

- [ ] `lib/data_drain/version.rb`: `VERSION = "0.3.1"`
- [ ] `bundle install`

### 6.4 Update roadmap

- [ ] `docs/IMPROVEMENT_PLAN.md`:
  - Items 12, 13, 14, 15, 16, 17, 18, 19, 22, 23, 24 → `[x]`
  - Executive summary: "Original roadmap 24/24 completed"
  - Add a "Follow-ups post-roadmap" section with:
    - Item 25: `fetch_dead_tuple_count` returns `nil` instead of `-1` (cosmetic, logs)
    - Item 26: document `lock_configuration` + httpfs behavior in the skill
    - Item 27: integration tests against real Postgres (service container in CI, `:integration` tag)
    - Item 28: `rubocop-rspec` plugin + new RSpec offenses
    - Item 29: native declarative partitioning in Engine (detect it and use DROP PARTITION)
    - Item 30: enable `bundler-cache: true` in CI after DuckDB setup (big-pickle observation 6.4); saves ~6min/run on the 3-version matrix

### 6.5 Release commit

- [ ] Commit: `chore: release v0.3.1 — calidad, CI y DX (cierre roadmap 24/24)`

### 6.6 PR + merge + tag

- [ ] `git push origin feature/v0.3.1`
- [ ] `gh pr create --title "v0.3.1: calidad, CI y DX (cierre roadmap 24/24)"` with the CHANGELOG as body
- [ ] Wait for green CI (3 Ruby jobs)
- [ ] Merge
- [ ] Tag: `git tag v0.3.1 && git push origin v0.3.1`

### 6.7 Post-merge

- [ ] Archive: `git mv docs/execution/v0.3.1.md docs/execution/archive/v0.3.1.md`
- [ ] Commit: `chore: archive v0.3.1 plan, roadmap 24/24 completado`
- [ ] Update memory: mark `project_data_drain.md` with status "roadmap 24/24 completed, v0.3.1 latest".

---

## Final validation

- [ ] CI green on main for Ruby 3.2/3.3/3.4
- [ ] Coverage ≥ 90%
- [ ] `bundle exec rubocop` with no offenses
- [ ] YARD ≥ 90%
- [ ] Green badge
- [ ] Tag v0.3.1
- [ ] 11 items marked `[x]` (roadmap 24/24)
- [ ] Plan archived
- [ ] CHANGELOG complete

---

## Plan B: blocking scenarios

| If... | Then... |
|-------|-------------|
| Item 17: real offenses > 37 after enabling a cop that was excluded | Evaluate cop by cop. Relax the config or fix manually depending on cost/benefit. |
| Item 18: CI fails on Ruby 3.2 or 3.3 due to syntax | Diagnose the specific error. Typically: replace incompatible post-3.3 syntax. |
| Item 18: the `duckdb` gem does not compile on Ruby 3.2 | Update the gem to a version with a prebuilt binary for 3.2, or raise `required_ruby_version` to `>= 3.3`. |
| Item 19: `stub_responses` does not cover some AWS SDK method we use | Keep `stub_const` for that specific case and migrate the rest. |
| Item 12: YARD takes > 6h | Apply Plan B: cover only Configuration + Observability + Timing in v0.3.1. Defer the rest to v0.4.0. |
| CI matrix > 10min | Use `fail-fast: false` + parallelize if needed. |
| Item 23: coverage drops below 90% after RuboCop auto-correct | Unlikely (auto-correct does not remove code). If it happens: add tests or lower the threshold to 85%. |

---

## Notes for the executing agent

- **Phase 1 is the most invasive.** It merges 3 items (17 + 14 + 18) + a bug fix. Granular commits within the phase make bisecting easier.
- **Every phase ends green:** rspec + rubocop + CI if the phase requires it.
- **big-pickle observation 4.5 (DuckDB + Ruby compat)** requires real verification in the first CI run with the matrix. If it fails, document it in Plan B.
- **`rubocop-rspec` is NOT added in v0.3.1** (decision 3). Document it as a future post-roadmap item.
- **Roadmap 24/24:** when v0.3.1 closes, `IMPROVEMENT_PLAN.md` is left in a "complete" state. New items go to "Follow-ups".
- **Project memory** is updated in Phase 6.7 to reflect the final state.