data_drain 0.3.2 → 0.5.0

@@ -0,0 +1,1216 @@
1
+ # Execution Plan: v0.4.0
2
+
3
+ **Target release:** v0.4.0, Glue Jobs Lifecycle (post-roadmap)
4
+ **Items:** 32, 33, 34, 35, 36 (new, not in the original roadmap)
5
+ **Suggested branch:** `feature/v0.4.0`
6
+ **Base:** `main` (contains v0.3.1, roadmap 24/24 closed)
7
+ **Status:** Not started
8
+ **Last updated:** 2026-04-15
9
+
10
+ ---
11
+
12
+ ## Context
13
+
14
+ DataDrain currently only runs pre-existing Glue Jobs (`GlueRunner.run_and_wait`). To automate the full lifecycle (infrastructure as code), this release adds creation, update, and deletion of Jobs.
15
+
16
+ **Use case:** a scheduled process (monthly cron) that makes sure the job exists with the correct configuration before running it:
17
+
18
+ ```ruby
19
+ DataDrain::GlueRunner.ensure_job(
20
+ name: "data-drain-export-versions",
21
+ role_arn: "arn:aws:iam::123:role/GlueServiceRole",
22
+ script_location: "s3://my-bucket/scripts/export.py",
23
+ glue_version: "4.0",
24
+ worker_type: "G.1X",
25
+ number_of_workers: 5
26
+ )
27
+ DataDrain::GlueRunner.run_and_wait("data-drain-export-versions", { ... }, max_wait_seconds: 3600)
28
+ ```
29
+
30
+ ## Approved design decisions
31
+
32
+ | # | Decision | Chosen |
33
+ |---|---------|---------|
34
+ | A | API shape | **A2: idempotent `ensure_job` + atomic operations** |
35
+ | B | Script upload to S3 | **B1: caller's responsibility** (the gem does not upload) |
36
+ | C | IAM/bucket validation before create | **C1: no validation** (AWS errors are clear) |
37
+
38
+ ---
39
+
40
+ ## Release items
41
+
42
+ | Phase | Item | Summary | Estimate |
43
+ |------|------|---------|------------|
44
+ | 0 | n/a | Baseline setup + branch | 15min |
45
+ | 1 | **34** | Query helpers: `job_exists?`, `get_job` | 1h |
46
+ | 2 | **32** | Atomic operations: `create_job`, `update_job`, `delete_job` | 4-5h |
47
+ | 3 | **33** | Idempotent `ensure_job` (compares and reconciles) | 3-4h |
48
+ | 4 | **35** | Tests with `Aws::Glue::Client.stub_responses` (consolidation) | 3h (partly inline in phases 1-3) |
49
+ | 5 | **36** | Docs: `glue-jobs-lifecycle.md` + README examples | 2h |
50
+ | 6 | n/a | Release | 30min |
51
+
52
+ **Estimated total:** 13-16h, about 2 focused days.
53
+
54
+ **Breaking changes:** none. The release only adds methods to the existing `GlueRunner` module. `run_and_wait` is untouched.
55
+
56
+ ---
57
+
58
+ ## Agent review: incorporated
59
+
60
+ Review by **big-pickle** on 2026-04-15 (`docs/execution/v0.4.0-OBSERVACIONES.md`). 5 observations, all incorporated (with a correction to #5):
61
+
62
+ | # | Severity | Resolution | Location |
63
+ |---|-----------|-----------|-----------|
64
+ | 1 | **Blocking** | Create `validate_glue_name!` (regex `\A[a-zA-Z0-9_-]+\z`): Glue accepts `-`, the current regex does not. Applied BEFORE Phase 1. | Phase 0.4 |
65
+ | 2 | Medium | Refine `changed_fields` to treat `extracted[field].nil? && !desired_config.key?(field)` as "no opinion" (not a diff) | Phase 3.2 |
66
+ | 3 | Medium | Inline test for Phase 2.3 (`update_job`) captures the params to validate the API shape (no `:name` inside `job_update`, `command` with its full shape) | Phase 2.5 |
67
+ | 4 | Low | Explicit check: `extract_current_config` does NOT include `created_on`, `last_modified_on`, `allocated_capacity`. Document it | Phase 3.2 |
68
+ | 5 | ⚠️ Partial correction | Big-pickle states that `Hash#==` in Ruby compares references. **This is incorrect:** `Hash#==` compares content (keys + values). It is still worth adding an explicit test to confirm Hash equality for `default_arguments`. No code change, but a test is added. | Phase 4.1 |
69
+
70
+ **Technical note for observation #5:**
71
+
72
+ ```ruby
73
+ { "--k" => "v" } == { "--k" => "v" } # => true (compares content)
74
+ { "--k" => "v" } != { "--k" => "v" } # => false
75
+ ```
76
+
77
+ Real existing risks (not raised by big-pickle):
78
+ - If AWS returns an SDK type such as `Aws::Glue::Types::JobUpdate` rather than a plain Hash, the comparison would fail. The plan already calls `default_arguments&.to_h || {}`, which converts it to a Hash. OK.
79
+ - If keys come back from the AWS SDK as Symbol instead of String (or vice versa), the comparison fails. Verify with a test.
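Editor's sketch: both risks above can be checked in plain Ruby, without the AWS SDK. The helpers `normalize_args` and `same_arguments?` are hypothetical names introduced for illustration. They are not part of DataDrain, and the sketch assumes the arguments arrive as something that responds to `to_h`.

```ruby
# Hypothetical helper (not part of DataDrain): stringify keys so that
# Symbol-keyed and String-keyed arguments compare equal. nil becomes {}.
def normalize_args(args)
  (args || {}).to_h.transform_keys(&:to_s)
end

# Hypothetical helper: content comparison after key normalization.
def same_arguments?(desired, current)
  normalize_args(desired) == normalize_args(current)
end

string_keyed = { "--TempDir" => "s3://tmp/", "--job-language" => "python" }
reordered    = { "--job-language" => "python", "--TempDir" => "s3://tmp/" }
symbol_keyed = { "--TempDir": "s3://tmp/", "--job-language": "python" }

puts(string_keyed == reordered)                    # content matches, order is ignored
puts(string_keyed.equal?(reordered))               # identity check: different objects
puts(string_keyed == symbol_keyed)                 # Symbol keys are not String keys
puts(same_arguments?(string_keyed, symbol_keyed))  # equal once keys are normalized
```

Whether `ensure_job` should normalize keys or simply document that `default_arguments` must use String keys is a design choice the plan leaves open.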
80
+
81
+ ---
82
+
83
+ ## Execution order and dependencies
84
+
85
+ ```
86
+ Phase 0: setup
87
+
88
+
89
+ Phase 1: Item 34 (job_exists? + get_job) ──► foundation, no deps
90
+ │ required by 32 (delete_job) and 33 (ensure_job)
91
+
92
+ Phase 2: Item 32 (atomic create + update + delete) ──► uses get_job for delete safety
93
+
94
+
95
+ Phase 3: Item 33 (idempotent ensure_job) ──► uses get_job + create_job/update_job
96
+ │ compares desired hash vs current
97
+
98
+ Phase 4: Item 35 (test consolidation) ──► tests inline in phases 1-3; this phase
99
+ │ validates final coverage
100
+
101
+ Phase 5: Item 36 (docs) ──► glue-jobs-lifecycle.md + README + skill
102
+
103
+
104
+ Phase 6: Release
105
+ ```
106
+
107
+ ---
108
+
109
+ ## Prerequisites (Phase 0)
110
+
111
+ ### 0.1 Verify the environment
112
+
113
+ - [ ] `git checkout main && git pull`
114
+ - [ ] Current version in `lib/data_drain/version.rb` = `"0.3.1"`
115
+ - [ ] `bundle exec rspec` passes (coverage ≥ 90%)
116
+ - [ ] `bundle exec rubocop` reports no offenses
117
+ - [ ] CI green on main for Ruby 3.2/3.3/3.4
118
+
119
+ ### 0.2 Create the branch
120
+
121
+ - [ ] `git checkout -b feature/v0.4.0`
122
+
123
+ ### 0.3 Mark items as in progress
124
+
125
+ Convention from `IMPROVEMENT_PLAN.md`:
126
+
127
+ - [ ] Edit the "Follow-ups post-roadmap" section of `docs/IMPROVEMENT_PLAN.md`:
128
+ - Add items 32, 33, 34, 35, 36 with details (same structure as item 17).
129
+ - Mark them `[~]` (in progress).
130
+ - [ ] Commit: `chore: agregar items 32-36 al roadmap como en progreso`
131
+
132
+ ### 0.4 Create `validate_glue_name!` (BLOCKING: big-pickle observation 1)
133
+
134
+ AWS Glue Job names accept hyphens (`-`). The current `Validations.validate_identifier!` uses the regex `\A[a-zA-Z_][a-zA-Z0-9_]*\z`, which does NOT accept `-`. Without this fix, the examples in this plan (`name: "data-drain-export-versions"`) would fail.
135
+
136
+ **Decision:** create a separate method rather than modifying `validate_identifier!` (that would affect `table_name`, `primary_key`, and `folder_name`, which MUST remain strict SQL identifiers).
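Editor's sketch: the difference between the two patterns quoted above can be seen by running a few names through both. The constant names and the `classify_name` helper are illustrative only. The plan itself names the Glue pattern `GLUE_NAME_REGEX` and does not name the SQL one.

```ruby
# Patterns quoted from the plan; constant names here are illustrative.
SQL_IDENTIFIER_PATTERN = /\A[a-zA-Z_][a-zA-Z0-9_]*\z/
GLUE_JOB_NAME_PATTERN  = /\A[a-zA-Z0-9_-]+\z/

# Hypothetical helper: reports which of the two patterns accept a value.
def classify_name(value)
  {
    sql_identifier: SQL_IDENTIFIER_PATTERN.match?(value.to_s),
    glue_name: GLUE_JOB_NAME_PATTERN.match?(value.to_s)
  }
end

["data-drain-export-versions", "my_job_v2", "2024_export", "job; DROP", ""].each do |name|
  puts "#{name.inspect}: #{classify_name(name)}"
end
```

Names with hyphens or a leading digit pass only the Glue pattern, while whitespace, `;`, and the empty string are rejected by both, which is why a separate validator does not loosen the SQL-facing checks.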
137
+
138
+ - [ ] Edit `lib/data_drain/validations.rb`. Add:
139
+ ```ruby
140
+ GLUE_NAME_REGEX = /\A[a-zA-Z0-9_-]+\z/
141
+
142
+ # Validates an AWS Glue Job name. Allows alphanumerics, `_` and `-`.
143
+ # More permissive than validate_identifier! because AWS allows it.
144
+ #
145
+ # @param field_name [Symbol]
146
+ # @param value [String]
147
+ # @raise [DataDrain::ConfigurationError]
148
+ def self.validate_glue_name!(field_name, value)
149
+ return if GLUE_NAME_REGEX.match?(value.to_s)
150
+
151
+ raise ConfigurationError,
152
+ "#{field_name} '#{value}' debe ser un Glue Job name válido (alfanumérico, '-', '_')"
153
+ end
154
+ ```
155
+
156
+ - [ ] Tests in `spec/data_drain/validations_spec.rb`:
157
+ ```ruby
158
+ describe ".validate_glue_name!" do
159
+ it "no levanta para nombres con guiones" do
160
+ expect { described_class.validate_glue_name!(:name, "data-drain-export-versions") }.not_to raise_error
161
+ end
162
+
163
+ it "no levanta para alfanumérico simple" do
164
+ expect { described_class.validate_glue_name!(:name, "myJob123") }.not_to raise_error
165
+ end
166
+
167
+ it "no levanta para guiones bajos" do
168
+ expect { described_class.validate_glue_name!(:name, "my_job_v2") }.not_to raise_error
169
+ end
170
+
171
+ it "rechaza espacios" do
172
+ expect { described_class.validate_glue_name!(:name, "my job") }.to raise_error(DataDrain::ConfigurationError)
173
+ end
174
+
175
+ it "rechaza punto y coma (SQL injection)" do
176
+ expect { described_class.validate_glue_name!(:name, "job; DROP") }.to raise_error(DataDrain::ConfigurationError)
177
+ end
178
+
179
+ it "rechaza nombre vacío" do
180
+ expect { described_class.validate_glue_name!(:name, "") }.to raise_error(DataDrain::ConfigurationError)
181
+ end
182
+ end
183
+ ```
184
+
185
+ - [ ] Validate: `bundle exec rspec spec/data_drain/validations_spec.rb`
186
+ - [ ] Commit: `feat(validations): validate_glue_name! para Glue Job names con '-' (pre-fase v0.4.0)`
187
+
188
+ ### Phase 0 checkpoint
189
+
190
+ - [ ] Branch created
191
+ - [ ] Items listed in the roadmap as `[~]`
192
+ - [ ] Baseline green
193
+ - [ ] `validate_glue_name!` implemented and tested (resolves big-pickle observation 1)
194
+
195
+ ---
196
+
197
+ ## Phase 1, Item 34: Query helpers `job_exists?` + `get_job`
198
+
199
+ **Foundation.** Required by items 32 (delete safety) and 33 (idempotent ensure).
200
+
201
+ ### 1.1 Implementation
202
+
203
+ - [ ] Edit `lib/data_drain/glue_runner.rb`. Add at the end of the class:
204
+ ```ruby
205
+ # Checks whether a Glue Job exists.
206
+ #
207
+ # @param name [String] Job name in AWS.
208
+ # @return [Boolean]
209
+ def self.job_exists?(name)
210
+ !get_job(name).nil?
211
+ end
212
+
213
+ # Returns the Job's current data, or nil if it does not exist.
214
+ #
215
+ # @param name [String]
216
+ # @return [Aws::Glue::Types::Job, nil]
217
+ def self.get_job(name)
218
+ config = DataDrain.configuration
219
+ config.validate!
220
+ client = Aws::Glue::Client.new(region: config.aws_region)
221
+ client.get_job(job_name: name).job
222
+ rescue Aws::Glue::Errors::EntityNotFoundException
223
+ nil
224
+ end
225
+ ```
226
+
227
+ - [ ] Keep the existing `extend Observability` and `private_class_method`.
228
+
229
+ ### 1.2 Tests
230
+
231
+ - [ ] Edit `spec/data_drain/glue_runner_spec.rb`. Add at the end:
232
+ ```ruby
233
+ describe ".get_job" do
234
+ let(:glue_client) { Aws::Glue::Client.new(stub_responses: true, region: "us-east-1") }
235
+
236
+ before do
237
+ allow(Aws::Glue::Client).to receive(:new).and_return(glue_client)
238
+ end
239
+
240
+ it "retorna el Job si existe" do
241
+ glue_client.stub_responses(:get_job, {
242
+ job: { name: "my-job", role: "arn:aws:iam::123:role/Glue" }
243
+ })
244
+
245
+ job = described_class.get_job("my-job")
246
+ expect(job.name).to eq("my-job")
247
+ end
248
+
249
+ it "retorna nil si EntityNotFoundException" do
250
+ glue_client.stub_responses(:get_job, "EntityNotFoundException")
251
+
252
+ expect(described_class.get_job("nonexistent")).to be_nil
253
+ end
254
+
255
+ it "propaga otros errores AWS" do
256
+ glue_client.stub_responses(:get_job, "InternalServiceException")
257
+
258
+ expect { described_class.get_job("my-job") }.to raise_error(Aws::Glue::Errors::ServiceError)
259
+ end
260
+ end
261
+
262
+ describe ".job_exists?" do
263
+ let(:glue_client) { Aws::Glue::Client.new(stub_responses: true, region: "us-east-1") }
264
+
265
+ before do
266
+ allow(Aws::Glue::Client).to receive(:new).and_return(glue_client)
267
+ end
268
+
269
+ it "true si get_job retorna Job" do
270
+ glue_client.stub_responses(:get_job, { job: { name: "my-job" } })
271
+ expect(described_class.job_exists?("my-job")).to be true
272
+ end
273
+
274
+ it "false si get_job retorna nil" do
275
+ glue_client.stub_responses(:get_job, "EntityNotFoundException")
276
+ expect(described_class.job_exists?("nonexistent")).to be false
277
+ end
278
+ end
279
+ ```
280
+
281
+ ### 1.3 Validation
282
+
283
+ - [ ] `bundle exec rspec spec/data_drain/glue_runner_spec.rb`
284
+ - [ ] `bundle exec rubocop lib/data_drain/glue_runner.rb`
285
+
286
+ ### 1.4 Commit
287
+
288
+ - [ ] Commit: `feat(glue): job_exists? + get_job helpers consultivos (item 34)`
289
+
290
+ ### Phase 1 checkpoint
291
+
292
+ - [ ] Helpers working against stubs
293
+ - [ ] EntityNotFoundException → nil (not propagated)
294
+ - [ ] Other AWS errors do propagate
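Editor's sketch: the last two checkpoint items come down to rescuing one specific error class while letting its siblings propagate. The example below uses stand-in error classes so that it runs without `aws-sdk-glue`. In the real code these would be `Aws::Glue::Errors::EntityNotFoundException` and `Aws::Glue::Errors::ServiceError`. `lookup_job` and `job_present?` are hypothetical names.

```ruby
# Stand-in error hierarchy (assumption: mirrors the SDK, where the
# not-found error is a subclass of the generic service error).
class FakeServiceError < StandardError; end
class FakeEntityNotFound < FakeServiceError; end

# Hypothetical lookup: the block either returns a job or raises.
def lookup_job
  yield
rescue FakeEntityNotFound
  nil # "not found" is an expected outcome, so it becomes nil
end

def job_present?(&block)
  !lookup_job(&block).nil?
end

puts(job_present? { { name: "my-job" } })
puts(job_present? { raise FakeEntityNotFound, "no such job" })

begin
  lookup_job { raise FakeServiceError, "throttled" }
rescue FakeServiceError => e
  puts "propagated: #{e.message}"
end
```

Because only the subclass is rescued, throttling or permission errors still reach the caller instead of being mistaken for "job does not exist".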
295
+
296
+ ---
297
+
298
+ ## Phase 2, Item 32: Atomic operations `create_job` + `update_job` + `delete_job`
299
+
300
+ ### 2.1 Configuration hash definition
301
+
302
+ To avoid signatures with many kwargs, use a well-defined config hash. Internal helpers translate it to the AWS API.
303
+
304
+ - [ ] Document the supported options (a subset of Aws::Glue::Types::JobUpdate):
305
+ ```
306
+ name [String, REQUIRED]
307
+ role_arn [String, REQUIRED]
308
+ script_location [String, REQUIRED] ("s3://...")
309
+ glue_version [String, default "4.0"]
310
+ worker_type [String, default "G.1X"] (G.1X, G.2X, G.4X, G.8X)
311
+ number_of_workers [Integer, default 5]
312
+ timeout_minutes [Integer, default 2880] (48h)
313
+ max_retries [Integer, default 0]
314
+ max_concurrent_runs [Integer, default 1]
315
+ command_name [String, default "glueetl"] (or "pythonshell")
316
+ python_version [String, default "3"]
317
+ default_arguments [Hash, default {}]
318
+ description [String, optional]
319
+ ```
320
+
321
+ ### 2.2 Implement `create_job`
322
+
323
+ - [ ] Add to `lib/data_drain/glue_runner.rb`:
324
+ ```ruby
325
+ # Creates a Glue Job.
325
+ #
326
+ # @param config [Hash] See "Configuration hash definition" in the docs.
327
+ # @return [String] The name of the created job.
328
+ # @raise [Aws::Glue::Errors::AlreadyExistsException] If it already exists.
329
+ # @raise [DataDrain::ConfigurationError] If required fields are missing.
331
+ def self.create_job(config)
332
+ validate_job_config!(config)
333
+ config_for_aws = build_aws_job_params(config)
334
+
335
+ aws_config = DataDrain.configuration
336
+ aws_config.validate!
337
+ client = Aws::Glue::Client.new(region: aws_config.aws_region)
338
+
339
+ @logger = aws_config.logger
340
+ safe_log(:info, "glue_runner.job_create",
341
+ { job: config[:name],
342
+ glue_version: config_for_aws[:glue_version],
343
+ worker_type: config_for_aws[:worker_type],
344
+ number_of_workers: config_for_aws[:number_of_workers] })
345
+
346
+ client.create_job(config_for_aws)
347
+ config[:name]
348
+ rescue Aws::Glue::Errors::ServiceError => e
349
+ safe_log(:error, "glue_runner.job_create_error",
350
+ { job: config[:name] }.merge(exception_metadata(e)))
351
+ raise
352
+ end
353
+
354
+ # @api private
355
+ def self.validate_job_config!(config)
356
+ %i[name role_arn script_location].each do |field|
357
+ val = config[field]
358
+ next unless val.nil? || val.to_s.empty?
359
+
360
+ raise DataDrain::ConfigurationError, "config[:#{field}] es obligatorio para Glue Job"
361
+ end
362
+
363
+ DataDrain::Validations.validate_glue_name!(:name, config[:name])
364
+ end
365
+
366
+ # @api private
367
+ def self.build_aws_job_params(config)
368
+ {
369
+ name: config[:name],
370
+ description: config[:description],
371
+ role: config[:role_arn],
372
+ command: {
373
+ name: config.fetch(:command_name, "glueetl"),
374
+ script_location: config[:script_location],
375
+ python_version: config.fetch(:python_version, "3")
376
+ }.compact,
377
+ default_arguments: config.fetch(:default_arguments, {}),
378
+ glue_version: config.fetch(:glue_version, "4.0"),
379
+ worker_type: config.fetch(:worker_type, "G.1X"),
380
+ number_of_workers: config.fetch(:number_of_workers, 5),
381
+ timeout: config.fetch(:timeout_minutes, 2880),
382
+ max_retries: config.fetch(:max_retries, 0),
383
+ execution_property: { max_concurrent_runs: config.fetch(:max_concurrent_runs, 1) }
384
+ }.compact
385
+ end
386
+ ```
387
+
388
+ ### 2.3 Implement `update_job`
389
+
390
+ - [ ] Add:
391
+ ```ruby
392
+ # Updates an existing Glue Job.
393
+ #
394
+ # @param config [Hash] Same fields as create_job.
395
+ # @return [String] Name of the updated job.
396
+ # @raise [Aws::Glue::Errors::EntityNotFoundException] If it does not exist.
397
+ def self.update_job(config)
398
+ validate_job_config!(config)
399
+ aws_params = build_aws_job_params(config)
400
+
401
+ aws_config = DataDrain.configuration
402
+ aws_config.validate!
403
+ client = Aws::Glue::Client.new(region: aws_config.aws_region)
404
+
405
+ @logger = aws_config.logger
406
+
407
+ # AWS API: update_job takes { job_name:, job_update: {...} }. job_update
408
+ # does NOT include :name (the job is identified by job_name), but it does include :command, :role, etc.
409
+ job_update = aws_params.except(:name)
410
+
411
+ safe_log(:info, "glue_runner.job_update",
412
+ { job: config[:name] })
413
+
414
+ client.update_job(job_name: config[:name], job_update: job_update)
415
+ config[:name]
416
+ rescue Aws::Glue::Errors::ServiceError => e
417
+ safe_log(:error, "glue_runner.job_update_error",
418
+ { job: config[:name] }.merge(exception_metadata(e)))
419
+ raise
420
+ end
421
+ ```
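Editor's sketch: the request shape for an update can be derived from the create-style params with a single hash operation. This assumes, per the AWS SDK for Ruby documentation, that `update_job` expects the keys `job_name` and `job_update`. `to_update_request` is a hypothetical helper and the sample values are illustrative.

```ruby
# Hypothetical helper: turn create-style params into an update request.
# `reject` is used for portability; on Ruby 3.0+ it is equivalent to
# create_params.except(:name).
def to_update_request(create_params)
  {
    job_name: create_params.fetch(:name),
    job_update: create_params.reject { |key, _value| key == :name }
  }
end

create_params = {
  name: "data-drain-export-versions",
  role: "arn:aws:iam::123:role/GlueServiceRole",
  command: { name: "glueetl", script_location: "s3://my-bucket/scripts/export.py" },
  worker_type: "G.1X",
  number_of_workers: 5
}

request = to_update_request(create_params)
puts request[:job_name]
puts request[:job_update].keys.inspect
```

A spec that captures `context.params` can then assert on `job_name` at the top level and on the absence of `:name` inside `job_update`.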
422
+
423
+ ### 2.4 Implement `delete_job`
424
+
425
+ - [ ] Add:
426
+ ```ruby
427
+ # Deletes a Glue Job. No-op if it does not exist (similar to DROP TABLE IF EXISTS).
428
+ #
429
+ # @param name [String]
430
+ # @return [Boolean] true if it was deleted, false if it did not exist.
431
+ def self.delete_job(name)
432
+ DataDrain::Validations.validate_glue_name!(:name, name)
433
+
434
+ config = DataDrain.configuration
435
+ config.validate!
436
+ client = Aws::Glue::Client.new(region: config.aws_region)
437
+
438
+ @logger = config.logger
439
+
440
+ client.delete_job(job_name: name)
441
+ safe_log(:info, "glue_runner.job_delete", { job: name })
442
+ true
443
+ rescue Aws::Glue::Errors::EntityNotFoundException
444
+ safe_log(:info, "glue_runner.job_delete_skipped", { job: name, reason: "not_found" })
445
+ false
446
+ rescue Aws::Glue::Errors::ServiceError => e
447
+ safe_log(:error, "glue_runner.job_delete_error",
448
+ { job: name }.merge(exception_metadata(e)))
449
+ raise
450
+ end
451
+ ```
452
+
453
+ **Decision:** `delete_job` is **idempotent** (it does not raise if the job is missing). This is friendlier for callers that want to tear down without checking for existence first.
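Editor's sketch: what idempotent deletion looks like from the caller's side, using a stand-in client so the example runs without AWS. `FakeGlue`, `FakeNotFound`, and `delete_if_present` are hypothetical names. The real method rescues `Aws::Glue::Errors::EntityNotFoundException` as shown in the plan.

```ruby
# Stand-ins so the sketch runs without aws-sdk-glue.
class FakeNotFound < StandardError; end

class FakeGlue
  def initialize(job_names)
    @job_names = job_names
  end

  # Raises when the job is absent, like the real service would.
  def delete_job(job_name:)
    raise FakeNotFound, job_name unless @job_names.delete(job_name)
  end
end

# Returns true when a job was deleted, false when there was nothing to delete.
def delete_if_present(client, name)
  client.delete_job(job_name: name)
  true
rescue FakeNotFound
  false
end

client = FakeGlue.new(["data-drain-export-versions"])
puts delete_if_present(client, "data-drain-export-versions") # first call deletes
puts delete_if_present(client, "data-drain-export-versions") # second call is a no-op
```

The Boolean return value lets a tear-down script log whether anything was actually removed without treating a repeated run as a failure.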
454
+
455
+ ### 2.5 Tests
456
+
457
+ - [ ] Add to `spec/data_drain/glue_runner_spec.rb`:
458
+ ```ruby
459
+ describe ".create_job" do
460
+ let(:glue_client) { Aws::Glue::Client.new(stub_responses: true, region: "us-east-1") }
461
+
462
+ before do
463
+ allow(Aws::Glue::Client).to receive(:new).and_return(glue_client)
464
+ end
465
+
466
+ let(:valid_config) do
467
+ {
468
+ name: "my-job",
469
+ role_arn: "arn:aws:iam::123:role/GlueRole",
470
+ script_location: "s3://my-bucket/scripts/export.py"
471
+ }
472
+ end
473
+
474
+ it "crea con defaults razonables" do
475
+ created_params = nil
476
+ glue_client.stub_responses(:create_job, lambda { |context|
477
+ created_params = context.params
478
+ { name: "my-job" }
479
+ })
480
+
481
+ result = described_class.create_job(valid_config)
482
+ expect(result).to eq("my-job")
483
+ expect(created_params[:role]).to eq("arn:aws:iam::123:role/GlueRole")
484
+ expect(created_params[:glue_version]).to eq("4.0")
485
+ expect(created_params[:worker_type]).to eq("G.1X")
486
+ expect(created_params[:number_of_workers]).to eq(5)
487
+ expect(created_params[:command][:name]).to eq("glueetl")
488
+ expect(created_params[:command][:python_version]).to eq("3")
489
+ end
490
+
491
+ it "respeta worker_type custom" do
492
+ created_params = nil
493
+ glue_client.stub_responses(:create_job, lambda { |context|
494
+ created_params = context.params
495
+ { name: "my-job" }
496
+ })
497
+
498
+ described_class.create_job(valid_config.merge(worker_type: "G.4X", number_of_workers: 20))
499
+ expect(created_params[:worker_type]).to eq("G.4X")
500
+ expect(created_params[:number_of_workers]).to eq(20)
501
+ end
502
+
503
+ it "rechaza config sin name" do
504
+ expect {
505
+ described_class.create_job(valid_config.merge(name: nil))
506
+ }.to raise_error(DataDrain::ConfigurationError, /name/)
507
+ end
508
+
509
+ it "rechaza config sin role_arn" do
510
+ expect {
511
+ described_class.create_job(valid_config.merge(role_arn: nil))
512
+ }.to raise_error(DataDrain::ConfigurationError, /role_arn/)
513
+ end
514
+
515
+ it "rechaza name con caracteres inválidos" do
516
+ expect {
517
+ described_class.create_job(valid_config.merge(name: "my-job; DROP"))
518
+ }.to raise_error(DataDrain::ConfigurationError, /name/)
519
+ end
520
+
521
+ it "ACEPTA name con guiones (Glue convention)" do
522
+ glue_client.stub_responses(:create_job, { name: "data-drain-export-versions" })
523
+ expect {
524
+ described_class.create_job(valid_config.merge(name: "data-drain-export-versions"))
525
+ }.not_to raise_error
526
+ end
527
+
528
+ it "loguea glue_runner.job_create_error y propaga si falla" do
529
+ glue_client.stub_responses(:create_job, "AlreadyExistsException")
530
+
531
+ expect {
532
+ described_class.create_job(valid_config)
533
+ }.to raise_error(Aws::Glue::Errors::AlreadyExistsException)
534
+ end
535
+ end
536
+
537
+ describe ".update_job" do
538
+ let(:glue_client) { Aws::Glue::Client.new(stub_responses: true, region: "us-east-1") }
539
+
540
+ before do
541
+ allow(Aws::Glue::Client).to receive(:new).and_return(glue_client)
542
+ end
543
+
544
+ let(:valid_config) do
545
+ {
546
+ name: "my-job",
547
+ role_arn: "arn:aws:iam::123:role/GlueRole",
548
+ script_location: "s3://my-bucket/scripts/export.py"
549
+ }
550
+ end
551
+
552
+ # big-pickle observation 3: capture params to validate the update_job API shape
553
+ it "envía job_update SIN :name (es path param)" do
554
+ captured_params = nil
555
+ glue_client.stub_responses(:update_job, lambda { |context|
556
+ captured_params = context.params
557
+ { job_name: "my-job" }
558
+ })
559
+
560
+ described_class.update_job(valid_config)
561
+
562
+ expect(captured_params[:job_name]).to eq("my-job")
563
+ expect(captured_params[:job_update]).to be_a(Hash)
564
+ expect(captured_params[:job_update]).not_to have_key(:name) # ← clave
565
+ expect(captured_params[:job_update][:role]).to eq("arn:aws:iam::123:role/GlueRole")
566
+ expect(captured_params[:job_update][:command]).to be_a(Hash)
567
+ expect(captured_params[:job_update][:command][:script_location]).to eq("s3://my-bucket/scripts/export.py")
568
+ end
569
+
570
+ it "propaga EntityNotFoundException si el job no existe" do
571
+ glue_client.stub_responses(:update_job, "EntityNotFoundException")
572
+
573
+ expect {
574
+ described_class.update_job(valid_config)
575
+ }.to raise_error(Aws::Glue::Errors::EntityNotFoundException)
576
+ end
577
+ end
578
+
579
+ describe ".delete_job" do
580
+ let(:glue_client) { Aws::Glue::Client.new(stub_responses: true, region: "us-east-1") }
581
+
582
+ before do
583
+ allow(Aws::Glue::Client).to receive(:new).and_return(glue_client)
584
+ end
585
+
586
+ it "borra y retorna true si existe" do
587
+ glue_client.stub_responses(:delete_job, { job_name: "my-job" })
588
+
589
+ expect(described_class.delete_job("my-job")).to be true
590
+ end
591
+
592
+ it "retorna false si no existe (idempotente)" do
593
+ glue_client.stub_responses(:delete_job, "EntityNotFoundException")
594
+
595
+ expect(described_class.delete_job("nonexistent")).to be false
596
+ end
597
+
598
+ it "propaga otros errores AWS" do
599
+ glue_client.stub_responses(:delete_job, "ServiceUnavailable")
600
+
601
+ expect {
602
+ described_class.delete_job("my-job")
603
+ }.to raise_error(Aws::Glue::Errors::ServiceError)
604
+ end
605
+
606
+ it "valida name como identificador" do
607
+ expect {
608
+ described_class.delete_job("my-job; DROP")
609
+ }.to raise_error(DataDrain::ConfigurationError)
610
+ end
611
+ end
612
+ ```
613
+
614
+ ### 2.6 Phase 2 validation
615
+
616
+ - [ ] `bundle exec rspec spec/data_drain/glue_runner_spec.rb`
617
+ - [ ] `bundle exec rubocop lib/data_drain/glue_runner.rb`
618
+
619
+ ### 2.7 Commit
620
+
621
+ - [ ] Commit: `feat(glue): create_job + update_job + delete_job atómicos (item 32)`
622
+
623
+ ### Phase 2 checkpoint
624
+
625
+ - [ ] 3 atomic operations + validation helpers
626
+ - [ ] Tests cover happy path + edge cases + AWS errors
627
+ - [ ] `delete_job` is idempotent
628
+ - [ ] `name` identifiers validated with a regex
629
+
630
+ ---
631
+
632
+ ## Phase 3, Item 33: Idempotent `ensure_job`
633
+
634
+ ### 3.1 Comparison design
635
+
636
+ **Challenge:** AWS `get_job` returns default fields that the caller did not set (e.g. `MaxRetries: 0`). A naive comparison of the current hash against the desired one would report false diffs.
637
+
638
+ **Solution:** compare ONLY the fields the caller sends explicitly. Everything else counts as "no opinion".
639
+
640
+ Strategy:
641
+ ```ruby
642
+ # Compare the set of fields in the caller's config against the Job's current values
643
+ def diff_fields(desired_config, current_job)
644
+ changes = []
645
+ comparable_fields(desired_config).each do |field|
646
+ desired = desired_config[field]
647
+ current = extract_current_value(current_job, field)
648
+ changes << field if desired != current
649
+ end
650
+ changes
651
+ end
652
+ ```
653
+
654
+ Comparable fields (from the subset the caller sets):
655
+ - `description`, `role_arn`, `script_location`, `glue_version`,
656
+ - `worker_type`, `number_of_workers`, `timeout_minutes`,
657
+ - `max_retries`, `max_concurrent_runs`, `command_name`, `python_version`,
658
+ - `default_arguments`
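Editor's sketch: a worked instance of the "no opinion" rule using plain hashes in place of the AWS `Job` struct. `fields_to_update` is a hypothetical name that mirrors the intent of the plan's `changed_fields`, and the sample values are illustrative.

```ruby
COMPARABLE_FIELDS = %i[description role_arn script_location glue_version worker_type
                       number_of_workers timeout_minutes max_retries max_concurrent_runs
                       command_name python_version default_arguments].freeze

# Returns the fields the caller set explicitly whose values differ from the
# current ones. Fields absent from `desired` are skipped ("no opinion").
def fields_to_update(desired, current)
  COMPARABLE_FIELDS.select do |field|
    desired.key?(field) && desired[field] != current[field]
  end
end

current = { role_arn: "arn:aws:iam::123:role/GlueRole", worker_type: "G.1X",
            number_of_workers: 3, max_retries: 5 }
desired = { role_arn: "arn:aws:iam::123:role/GlueRole", worker_type: "G.1X",
            number_of_workers: 5 }

p fields_to_update(desired, current) # prints [:number_of_workers]
```

`max_retries` differs from any default but is not reported, because the caller never mentioned it. An explicit `max_retries: 0` in `desired` would be reported.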
659
+
660
+ ### 3.2 Implementation
661
+
662
+ - [ ] Add to `lib/data_drain/glue_runner.rb`:
663
+ ```ruby
664
+ # Ensures the Glue Job exists with the desired config. Idempotent.
665
+ #
666
+ # - If it does not exist → create_job
667
+ # - If it exists but differs → update_job (logs changed_fields)
668
+ # - If it exists and matches → no-op (logs unchanged)
669
+ #
670
+ # @param config [Hash] Same structure as create_job.
671
+ # @return [Symbol] :created | :updated | :unchanged
672
+ def self.ensure_job(config)
673
+ validate_job_config!(config)
674
+
675
+ current = get_job(config[:name])
676
+ return create_and_log(config) if current.nil?
677
+
678
+ changed = changed_fields(config, current)
679
+ if changed.empty?
680
+ @logger = DataDrain.configuration.logger
681
+ safe_log(:info, "glue_runner.job_unchanged", { job: config[:name] })
682
+ :unchanged
683
+ else
684
+ update_and_log(config, changed)
685
+ end
686
+ end
687
+
688
+ # @api private
689
+ def self.create_and_log(config)
690
+ create_job(config)
691
+ :created
692
+ end
693
+
694
+ # @api private
695
+ def self.update_and_log(config, changed_fields)
696
+ @logger = DataDrain.configuration.logger
697
+ safe_log(:info, "glue_runner.job_update",
698
+ { job: config[:name], changed_fields: changed_fields })
699
+ update_job(config)
700
+ :updated
701
+ end
702
+
703
+ # @api private
704
+ # @return [Array<Symbol>] keys that differ between desired and current
705
+ #
706
+ # Refinement (big-pickle observation 2): if extracted returns nil for a
707
+ # field AND the caller did not specify it, it is NOT considered a diff
708
+ # ("no opinion" on both sides).
709
+ def self.changed_fields(desired_config, current_job)
710
+ extracted = extract_current_config(current_job)
711
+
712
+ %i[description role_arn script_location glue_version worker_type
713
+ number_of_workers timeout_minutes max_retries max_concurrent_runs
714
+ command_name python_version default_arguments].select do |field|
715
+ next false unless desired_config.key?(field)
716
+ # If extracted is nil (AWS did not return the field) and the caller DID specify it,
717
+ # it is still a diff (needs an update). If the caller did not specify it either,
718
+ # the `unless desired_config.key?(field)` guard above already discarded it.
719
+
720
+ desired_config[field] != extracted[field]
721
+ end
722
+ end
723
+
724
+ # @api private
725
+ # @return [Hash] config extracted from the current Job, in the caller's config shape
726
+ #
727
+ # IMPORTANT (big-pickle observation 4): this method does NOT extract:
728
+ # - created_on / last_modified_on (timestamps, always differ)
729
+ # - allocated_capacity (deprecated by AWS, replaced by number_of_workers)
730
+ # - log_uri / connections / non_overridable_arguments (not supported yet)
731
+ # If they are added in the future, make sure they have a counterpart in
732
+ # build_aws_job_params to avoid false diffs in ensure_job.
733
+ def self.extract_current_config(job)
734
+ {
735
+ description: job.description,
736
+ role_arn: job.role,
737
+ script_location: job.command&.script_location,
738
+ glue_version: job.glue_version,
739
+ worker_type: job.worker_type,
740
+ number_of_workers: job.number_of_workers,
741
+ timeout_minutes: job.timeout,
742
+ max_retries: job.max_retries,
743
+ max_concurrent_runs: job.execution_property&.max_concurrent_runs,
744
+ command_name: job.command&.name,
745
+ python_version: job.command&.python_version,
746
+ default_arguments: job.default_arguments&.to_h || {}
747
+ }
748
+ end
749
+ ```
750
+
751
+ ### 3.3 Tests
752
+
753
+ - [ ] Add to `spec/data_drain/glue_runner_spec.rb`:
754
+ ```ruby
755
+ describe ".ensure_job" do
756
+ let(:glue_client) { Aws::Glue::Client.new(stub_responses: true, region: "us-east-1") }
757
+
758
+ before do
759
+ allow(Aws::Glue::Client).to receive(:new).and_return(glue_client)
760
+ end
761
+
762
+ let(:base_config) do
763
+ {
764
+ name: "my-job",
765
+ role_arn: "arn:aws:iam::123:role/GlueRole",
766
+ script_location: "s3://my-bucket/scripts/export.py",
767
+ worker_type: "G.1X",
768
+ number_of_workers: 5
769
+ }
770
+ end
771
+
772
+ it "retorna :created si el job no existe" do
773
+ glue_client.stub_responses(:get_job, "EntityNotFoundException")
774
+ glue_client.stub_responses(:create_job, { name: "my-job" })
775
+
776
+ expect(described_class.ensure_job(base_config)).to eq(:created)
777
+ end
778
+
779
+ it "retorna :unchanged si el job existe con misma config" do
780
+ glue_client.stub_responses(:get_job, {
781
+ job: {
782
+ name: "my-job",
783
+ role: "arn:aws:iam::123:role/GlueRole",
784
+ command: { script_location: "s3://my-bucket/scripts/export.py", name: "glueetl", python_version: "3" },
785
+ worker_type: "G.1X",
786
+ number_of_workers: 5
787
+ }
788
+ })
789
+
790
+ expect(described_class.ensure_job(base_config)).to eq(:unchanged)
791
+ end
792
+
793
+ it "retorna :updated si difiere algún campo" do
794
+ glue_client.stub_responses(:get_job, {
795
+ job: {
796
+ name: "my-job",
797
+ role: "arn:aws:iam::123:role/GlueRole",
798
+ command: { script_location: "s3://my-bucket/scripts/export.py" },
799
+ worker_type: "G.1X",
800
+ number_of_workers: 3 # ← difiere de 5
801
+ }
802
+ })
803
+ glue_client.stub_responses(:update_job, { job_name: "my-job" })
804
+
805
+ expect(described_class.ensure_job(base_config)).to eq(:updated)
806
+ end
807
+
808
+ it "loguea changed_fields en :updated" do
809
+ glue_client.stub_responses(:get_job, {
810
+ job: {
811
+ name: "my-job",
812
+ role: "arn:aws:iam::123:role/GlueRole",
813
+ command: { script_location: "s3://OLD-bucket/scripts/export.py" }, # difiere
814
+ worker_type: "G.1X",
815
+ number_of_workers: 3 # difiere
816
+ }
817
+ })
818
+ glue_client.stub_responses(:update_job, { job_name: "my-job" })
819
+
820
+ logs = capture_logs { described_class.ensure_job(base_config) }
821
+ update_log = logs.find { |l| l.include?("glue_runner.job_update") }
822
+ expect(update_log).to include("changed_fields=")
823
+ expect(update_log).to match(/script_location|number_of_workers/)
824
+ end
825
+
826
+ it "ignora campos no especificados por el caller" do
827
+ # caller only specifies the required fields (name, role_arn, script_location)
828
+ partial_config = { name: "my-job", role_arn: "arn:...", script_location: "s3://..." }
829
+
830
+ glue_client.stub_responses(:get_job, {
831
+ job: {
832
+ name: "my-job",
833
+ role: "arn:...",
834
+ command: { script_location: "s3://..." },
835
+ max_retries: 5 # field NOT requested by the caller, must not trigger an update
836
+ }
837
+ })
838
+
839
+ expect(described_class.ensure_job(partial_config)).to eq(:unchanged)
840
+ end
841
+ end
842
+ ```
843
+
844
+ ### 3.4 Phase 3 validation
845
+
846
+ - [ ] `bundle exec rspec spec/data_drain/glue_runner_spec.rb`
847
+ - [ ] `bundle exec rubocop lib/data_drain/glue_runner.rb`
848
+
849
+ ### 3.5 Commit
850
+
851
+ - [ ] Commit: `feat(glue): ensure_job idempotente con diff de campos (item 33)`
852
+
853
+ ### Phase 3 checkpoint
854
+
855
+ - [ ] `ensure_job` returns `:created | :updated | :unchanged`
856
+ - [ ] Comparison ignores fields not set by the caller (no false positives)
857
+ - [ ] `changed_fields` is logged on update
858
+
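The comparison rule in this checkpoint (only fields the caller sets are compared) can be sketched in plain Ruby. This is a minimal illustration, not the gem's implementation: `changed_fields` is a hypothetical helper, and both arguments are assumed to be flat hashes that already use the same key names.

```ruby
# Hypothetical helper (not part of DataDrain): returns the keys the caller
# set whose values differ from the current job. Keys that appear only in
# `current` (for example AWS defaults) are ignored.
def changed_fields(desired, current)
  desired.each_with_object([]) do |(key, value), changed|
    changed << key unless current[key] == value
  end
end

desired = { worker_type: "G.1X", number_of_workers: 5 }
current = { worker_type: "G.1X", number_of_workers: 3, max_retries: 5 }

changed_fields(desired, current)
# => [:number_of_workers]
changed_fields({ worker_type: "G.1X" }, current)
# => []
```

An empty result would map to `:unchanged`, a non-empty one to `:updated`. With plain `==`, an Integer and a String holding the same digits count as different, so type normalization would have to happen before this step.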
859
+ ---
860
+
861
+ ## Phase 4 - Item 35: Test consolidation + coverage
862
+
863
+ Tests are written inline in phases 1-3. This phase validates final coverage plus edge cases.
864
+
865
+ ### 4.1 Edge cases
866
+
867
+ - [ ] Test: `ensure_job` whose `default_arguments: { "--key" => "val" }` changes → `:updated`.
868
+ - [ ] Test: `ensure_job` with a timeout that differs only by type (Integer vs String).
869
+ - [ ] Test: `delete_job` with an empty name → `ConfigurationError`.
870
+ - [ ] Test: transient network errors (`Aws::Glue::Errors::ServiceUnavailable`) propagate to the caller.
871
+
872
+ ### 4.1.1 Hash equality tests (big-pickle observation 5)
873
+
874
+ Ruby's `Hash#==` compares contents, not object identity. Even so, there are real edge cases with the AWS SDK that need explicit coverage:
875
+
876
+ - [ ] Test: `default_arguments` with identical contents (String keys) → `:unchanged`:
877
+ ```ruby
878
+ it "returns :unchanged when default_arguments has the same String keys" do
879
+ config = base_config.merge(default_arguments: { "--TempDir" => "s3://tmp/" })
880
+ glue_client.stub_responses(:get_job, {
881
+ job: {
882
+ name: "my-job", role: "...",
883
+ command: { script_location: "..." },
884
+ default_arguments: { "--TempDir" => "s3://tmp/" }
885
+ }
886
+ })
887
+
888
+ expect(described_class.ensure_job(config)).to eq(:unchanged)
889
+ end
890
+ ```
891
+
892
+ - [ ] Test: `default_arguments` with a different key order but the same data → `:unchanged` (`Hash#==` ignores order):
893
+ ```ruby
894
+ it "returns :unchanged when default_arguments keys are in a different order" do
895
+ config = base_config.merge(default_arguments: { "--A" => "1", "--B" => "2" })
896
+ glue_client.stub_responses(:get_job, {
897
+ job: {
898
+ name: "my-job", role: "...",
899
+ command: { script_location: "..." },
900
+ default_arguments: { "--B" => "2", "--A" => "1" } # different order
901
+ }
902
+ })
903
+
904
+ expect(described_class.ensure_job(config)).to eq(:unchanged)
905
+ end
906
+ ```
907
+
908
+ - [ ] Test: `default_arguments` with a different value → `:updated`:
909
+ ```ruby
910
+ it "returns :updated when a default_arguments value differs" do
911
+ config = base_config.merge(default_arguments: { "--TempDir" => "s3://NEW/" })
912
+ glue_client.stub_responses(:get_job, {
913
+ job: {
914
+ name: "my-job", role: "...",
915
+ command: { script_location: "..." },
916
+ default_arguments: { "--TempDir" => "s3://OLD/" }
917
+ }
918
+ })
919
+ glue_client.stub_responses(:update_job, { job_name: "my-job" })
920
+
921
+ expect(described_class.ensure_job(config)).to eq(:updated)
922
+ end
923
+ ```
924
+
925
+ - [ ] Test: AWS returns Symbol keys (unlikely, but verify):
926
+ ```ruby
927
+ it "handles default_arguments when AWS returns Symbol keys (defensive)" do
928
+ # If the AWS SDK ever returns { :"--TempDir" => ... } instead of String keys,
929
+ # the comparison would report a spurious difference. Defensive test: if it
930
+ # fails, add normalization in extract_current_config: default_arguments.transform_keys(&:to_s).
931
+ config = base_config.merge(default_arguments: { "--TempDir" => "s3://tmp/" })
932
+ # AWS does not appear to return Symbol keys here, but validate the behavior.
933
+ end
934
+ ```
935
+
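The three `Hash#==` behaviors these tests rely on can be checked in isolation, without the AWS SDK. The sketch below is illustrative only: `normalize_arguments` is a hypothetical helper that mirrors the `transform_keys(&:to_s)` normalization mentioned in the test comment, and it is not confirmed to exist in the gem.

```ruby
same_order  = { "--A" => "1", "--B" => "2" }
other_order = { "--B" => "2", "--A" => "1" }
symbol_keys = { "--A": "1", "--B": "2" } # keys are Symbols here, not Strings

same_order == other_order # => true  (insertion order is ignored)
same_order == symbol_keys # => false (String key != Symbol key)

# Hypothetical normalization: nil-safe, and forces String keys so that
# Symbol-keyed input compares equal to String-keyed input.
def normalize_arguments(arguments)
  (arguments || {}).to_h.transform_keys(&:to_s)
end

normalize_arguments(symbol_keys) == same_order # => true
normalize_arguments(nil)                       # => {}
```

Normalizing both sides before comparing would keep `ensure_job` from reporting `:updated` when only the key type differs.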
936
+ ### 4.2 Coverage
937
+
938
+ - [ ] `bundle exec rspec`: verify coverage ≥ 90% (current threshold).
939
+ - [ ] If it drops: add tests for the uncovered branches.
940
+
941
+ ### 4.3 Commit
942
+
943
+ - [ ] Commit: `test(glue): cobertura edge cases lifecycle Jobs (item 35)`
944
+
945
+ ### Phase 4 checkpoint
946
+
947
+ - [ ] Coverage holds steady or increases
948
+ - [ ] No flaky tests across 3 consecutive runs
949
+
950
+ ---
951
+
952
+ ## Phase 5 - Item 36: Docs
953
+
954
+ ### 5.1 `skill/references/glue-jobs-lifecycle.md`
955
+
956
+ - [ ] Create a new file:
957
+ ```markdown
958
+ # Glue Jobs Lifecycle
959
+
960
+ DataDrain v0.4.0+ provides lifecycle management for AWS Glue Jobs:
961
+ create, update, delete, and idempotently ensure (`ensure_job`).
962
+
963
+ ## Prerequisites
964
+
965
+ - IAM role with `glue:CreateJob`, `glue:UpdateJob`, `glue:DeleteJob`, `glue:GetJob`.
966
+ - PySpark/Python script already uploaded to S3 (the gem does NOT upload scripts).
967
+ - IAM role that runs the Glue Job (separate from the role that creates it; see the AWS docs).
968
+
969
+ ## Atomic operations
970
+
971
+ ### create_job
972
+
973
+ ```ruby
974
+ DataDrain::GlueRunner.create_job(
975
+ name: "data-drain-export-versions",
976
+ role_arn: "arn:aws:iam::123:role/GlueServiceRole",
977
+ script_location: "s3://my-bucket/scripts/export.py"
978
+ )
979
+ ```
980
+
981
+ Defaults: `glue_version: "4.0"`, `worker_type: "G.1X"`, `number_of_workers: 5`,
982
+ `timeout: 2880` (48h), `command_name: "glueetl"`, `python_version: "3"`.
983
+
984
+ ### update_job
985
+
986
+ Takes the same config hash as `create_job`. Fails if the Job does not exist.
987
+
988
+ ### delete_job
989
+
990
+ ```ruby
991
+ DataDrain::GlueRunner.delete_job("data-drain-export-versions")
992
+ # => true if the Job was deleted, false if it did not exist (idempotent)
993
+ ```
994
+
995
+ ## Idempotent: `ensure_job` (recommended)
996
+
997
+ ```ruby
998
+ DataDrain::GlueRunner.ensure_job(
999
+ name: "data-drain-export-versions",
1000
+ role_arn: "...",
1001
+ script_location: "s3://...",
1002
+ worker_type: "G.4X",
1003
+ number_of_workers: 10,
1004
+ default_arguments: { "--TempDir" => "s3://my-bucket/temp/" }
1005
+ )
1006
+ # => :created | :updated | :unchanged
1007
+ ```
1008
+
1009
+ Algorithm:
1010
+ 1. `get_job(name)`: if nil → `create_job`, returns `:created`
1011
+ 2. If the Job exists, compare the fields set by the caller against the current Job
1012
+ 3. If they differ → `update_job`, returns `:updated` (logs `changed_fields`)
1013
+ 4. If they match → no-op, returns `:unchanged`
1014
+
1015
+ **Important:** `ensure_job` only compares fields the caller sets. Undeclared
1016
+ fields (e.g. `max_retries` if you do not pass it) do NOT trigger an update, even if AWS
1017
+ returns them with default values.
1018
+
1019
+ ## Query helpers
1020
+
1021
+ ```ruby
1022
+ DataDrain::GlueRunner.job_exists?("...") # Boolean
1023
+ DataDrain::GlueRunner.get_job("...") # Aws::Glue::Types::Job or nil
1024
+ ```
1025
+
1026
+ ## Full pattern: ensure + run + tear-down
1027
+
1028
+ ```ruby
1029
+ job_name = "data-drain-export-versions"
1030
+
1031
+ DataDrain::GlueRunner.ensure_job(
1032
+ name: job_name,
1033
+ role_arn: ENV["GLUE_ROLE_ARN"],
1034
+ script_location: "s3://#{bucket}/scripts/export.py",
1035
+ worker_type: "G.1X",
1036
+ number_of_workers: 5
1037
+ )
1038
+
1039
+ DataDrain::GlueRunner.run_and_wait(
1040
+ job_name,
1041
+ { "--start_date" => start_date.to_fs(:db), ... },
1042
+ max_wait_seconds: 3600
1043
+ )
1044
+
1045
+ DataDrain::Engine.new(
1046
+ bucket: bucket, table_name: "versions", ...,
1047
+ skip_export: true
1048
+ ).call
1049
+
1050
+ # Optional: cleanup
1051
+ # DataDrain::GlueRunner.delete_job(job_name)
1052
+ ```
1053
+
1054
+ ## Telemetry events
1055
+
1056
+ - `glue_runner.job_create` (INFO): `job`, `glue_version`, `worker_type`, `number_of_workers`
1057
+ - `glue_runner.job_update` (INFO): `job`, `changed_fields` (in ensure)
1058
+ - `glue_runner.job_unchanged` (INFO): `job`
1059
+ - `glue_runner.job_delete` (INFO): `job`
1060
+ - `glue_runner.job_delete_skipped` (INFO): `job`, `reason: "not_found"`
1061
+ - `glue_runner.job_create_error` (ERROR): `job`, `error_class`, `error_message`
1062
+ - `glue_runner.job_update_error` (ERROR): same fields as `job_create_error`
1063
+ - `glue_runner.job_delete_error` (ERROR): same fields as `job_create_error`
1064
+
1065
+ ## Limitations (v0.4.0)
1066
+
1067
+ - **No script upload.** The caller is responsible for uploading scripts to S3 before `create_job`.
1068
+ - **No pre-create validation of IAM/bucket.** AWS errors are clear, so the gem adds no checks.
1069
+ - **No management of Glue Workflows/Triggers/Crawlers.** Jobs only.
1070
+ - **No support for `connections:` in the Job config.** If you need JDBC connections inside the Job, add them through the AWS Console or a direct `update_job` call.
1071
+ ```
1072
+
1073
+ ### 5.2 README
1074
+
1075
+ - [ ] Edit `README.md`, section "Orquestación con AWS Glue"; add a sub-section:
1076
+ ```markdown
1077
+ ### Glue Jobs management (v0.4.0+)
1078
+
1079
+ To automate create/update/delete of Glue Jobs:
1080
+
1081
+ DataDrain::GlueRunner.ensure_job(
1082
+ name: "my-export-job",
1083
+ role_arn: ENV["GLUE_ROLE_ARN"],
1084
+ script_location: "s3://my-bucket/scripts/export.py"
1085
+ )
1086
+
1087
+ Details: [`skill/references/glue-jobs-lifecycle.md`](skill/references/glue-jobs-lifecycle.md).
1088
+ ```
1089
+
1090
+ ### 5.3 SKILL.md + events
1091
+
1092
+ - [ ] Edit `skill/SKILL.md`, section "Referencias"; add:
1093
+ ```markdown
1094
+ - [Glue Jobs Lifecycle](references/glue-jobs-lifecycle.md): create, update, and delete Glue Jobs
1095
+ ```
1096
+
1097
+ - [ ] Edit `skill/references/eventos-telemetria.md`. Add a section "GlueRunner — Lifecycle":
1098
+ - `glue_runner.job_create`, `job_update`, `job_unchanged`, `job_delete`, `job_delete_skipped`
1099
+ - `glue_runner.job_create_error`, `job_update_error`, `job_delete_error`
1100
+
1101
+ - [ ] Edit `skill/references/api-detallada.md`, GlueRunner section: add the 6 new methods (`create_job`, `update_job`, `delete_job`, `ensure_job`, `get_job`, `job_exists?`).
1102
+
1103
+ ### 5.4 Commit
1104
+
1105
+ - [ ] Commit: `docs: glue-jobs-lifecycle + ejemplos README + eventos (item 36)`
1106
+
1107
+ ### Phase 5 checkpoint
1108
+
1109
+ - [ ] `glue-jobs-lifecycle.md` covers all the methods
1110
+ - [ ] README mentions the new feature
1111
+ - [ ] Events are cataloged
1112
+
1113
+ ---
1114
+
1115
+ ## Phase 6 - Release
1116
+
1117
+ ### 6.1 Final lint + tests
1118
+
1119
+ - [ ] `bundle exec rubocop`: 0 offenses.
1120
+ - [ ] `bundle exec rspec`: coverage ≥ 90%.
1121
+ - [ ] CI green on Ruby 3.2/3.3/3.4.
1122
+
1123
+ ### 6.2 CHANGELOG
1124
+
1125
+ - [ ] Edit `CHANGELOG.md`; add at the top:
1126
+ ```markdown
1127
+ ## [0.4.0] - 2026-XX-XX
1128
+
1129
+ ### Features
1130
+ - **Glue Jobs Lifecycle:** new `GlueRunner` methods for managing Glue Jobs:
1131
+ - `create_job(config)`: creates a Job with sensible defaults (Glue 4.0, G.1X, 5 workers)
1132
+ - `update_job(config)`: updates an existing Job
1133
+ - `delete_job(name)`: idempotent (does not raise if the Job does not exist)
1134
+ - `ensure_job(config)`: declarative and idempotent, returns `:created | :updated | :unchanged`
1135
+ - `get_job(name)`: returns the Job or nil
1136
+ - `job_exists?(name)`: boolean
1137
+ Enables "infra as code" without managing Jobs through the Console or Terraform. (items 32, 33, 34)
1138
+
1139
+ ### New telemetry
1140
+ - `glue_runner.job_create`, `job_update`, `job_unchanged`, `job_delete`, `job_delete_skipped`
1141
+ - `glue_runner.job_create_error`, `job_update_error`, `job_delete_error`
1142
+
1143
+ ### Docs
1144
+ - `skill/references/glue-jobs-lifecycle.md` (item 36)
1145
+ - README section "Glue Jobs management" updated
1146
+
1147
+ ### Tests
1148
+ - Coverage stays ≥ 90% (actual ~97%+).
1149
+ - `Aws::Glue::Client.new(stub_responses: true)` stubs for all the new methods.
1150
+ ```
1151
+
1152
+ ### 6.3 Version bump
1153
+
1154
+ - [ ] `lib/data_drain/version.rb`: `VERSION = "0.4.0"`
1155
+ - [ ] `bundle install`
1156
+
1157
+ ### 6.4 Update roadmap
1158
+
1159
+ - [ ] `docs/IMPROVEMENT_PLAN.md`, section "Follow-ups post-roadmap":
1160
+ - Items 32, 33, 34, 35, 36 → `[x]`
1161
+
1162
+ ### 6.5 Commit release
1163
+
1164
+ - [ ] Commit: `chore: release v0.4.0 — Glue Jobs Lifecycle`
1165
+
1166
+ ### 6.6 PR + merge + tag
1167
+
1168
+ - [ ] `git push origin feature/v0.4.0`
1169
+ - [ ] `gh pr create --title "v0.4.0: Glue Jobs Lifecycle (create/update/delete/ensure)"`
1170
+ - [ ] CI green
1171
+ - [ ] Merge
1172
+ - [ ] Tag: `git tag v0.4.0 && git push origin v0.4.0`
1173
+
1174
+ ### 6.7 Post-merge
1175
+
1176
+ - [ ] Archive the plan: `git mv docs/execution/v0.4.0.md docs/execution/archive/v0.4.0.md`
1177
+ - [ ] Commit: `chore: archive v0.4.0 plan, items 32-36 [x]`
1178
+ - [ ] Update project memory.
1179
+
1180
+ ---
1181
+
1182
+ ## Final validation
1183
+
1184
+ - [ ] CI green across the Ruby matrix
1185
+ - [ ] Coverage ≥ 90%
1186
+ - [ ] `bundle exec rubocop` with no offenses
1187
+ - [ ] Tag v0.4.0 created
1188
+ - [ ] 5 items marked `[x]`
1189
+ - [ ] Plan archived
1190
+ - [ ] CHANGELOG updated
1191
+
1192
+ ---
1193
+
1194
+ ## Plan B: blocking scenarios
1195
+
1196
+ | If... | Then... |
1197
+ |-------|-------------|
1198
+ | The `ensure_job` diff produces false positives (AWS returns default fields that we did not set) | Refine `extract_current_config` with `compact` or explicit exclusions. Add tests with a stub that returns extra fields. |
1199
+ | `Aws::Glue::Errors::EntityNotFoundException` is not the exact class name in some SDK version | Check the real name with `Aws::Glue::Errors.constants`. Adjust the rescue. |
1200
+ | `update_job` rejects `:name` inside `job_update` (contrary to what we documented) | Test against a real stub. Adjust with `aws_params.except(:name)`. |
1201
+ | Tests with `stub_responses` do not support `EntityNotFoundException` as a string name | Use an instance: `glue_client.stub_responses(:get_job, Aws::Glue::Errors::EntityNotFoundException.new(nil, "msg"))`. |
1202
+ | Item 33 (`ensure_job`) takes >4h because of diff edge cases | Cut scope: ship v0.4.0 with only the atomic operations (items 32+34) + helpers, and defer `ensure_job` to v0.4.1. |
1203
+ | IAM permissions are missing during real testing | Document the minimum required set in glue-jobs-lifecycle.md. |
1204
+ | The AWS SDK returns `default_arguments` with Symbol keys (not String) | Normalize in `extract_current_config`: `default_arguments&.to_h&.transform_keys(&:to_s) || {}`. The dedicated test in Phase 4.1.1 detects this case. |
1205
+ | The Glue API requires `:command` with its full shape in `update_job` (not partial) | `build_aws_job_params` already sends the full `:command` hash (name + script_location + python_version). The Phase 2.5 test validates the shape. |
1206
+
1207
+ ---
1208
+
1209
+ ## Notes for the executing agent
1210
+
1211
+ - **Item 33 (`ensure_job`) is the most complex.** Tests must cover the field diff thoroughly to avoid false positives.
1212
+ - **Every commit ends green:** rspec + rubocop before moving on.
1213
+ - **`delete_job` is idempotent by design.** Document this decision clearly in YARD.
1214
+ - **`name` identifiers** are validated with `Validations.validate_identifier!` (existing, regex `\A[a-zA-Z_][a-zA-Z0-9_]*\z`). If a caller uses names containing `-` (common in AWS), **adjust the regex or create a validation specific to Glue Job names** (which allow `-`).
1215
+ - **Pre-phase action:** verify whether Glue Job names allow `-`. If so (I believe they do), adapt `validate_identifier!` or create a separate `validate_glue_name!` in `Validations`.
1216
+ - **Coordination with big-pickle:** after uploading the plan, request a review (as with v0.3.0 and v0.3.1).
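A separate Glue name validation along the lines suggested in the notes could look like the sketch below. Everything in it is an assumption for illustration: the `GLUE_JOB_NAME` pattern is deliberately conservative (letters, digits, `_` and `-`, at most 255 characters) and is not the rule AWS enforces, and it raises `ArgumentError` only to stay self-contained, where the gem would presumably raise its own `ConfigurationError`.

```ruby
# Assumed pattern: the existing identifier regex extended to allow "-"
# after the first character, capped at 255 characters in total.
GLUE_JOB_NAME = /\A[a-zA-Z_][a-zA-Z0-9_-]{0,254}\z/

# Hypothetical validator: returns the name when valid, raises otherwise.
def validate_glue_name!(name)
  return name if name.is_a?(String) && name.match?(GLUE_JOB_NAME)

  raise ArgumentError, "invalid Glue job name: #{name.inspect}"
end

validate_glue_name!("data-drain-export-versions") # => "data-drain-export-versions"

begin
  validate_glue_name!("")
rescue ArgumentError => e
  e.message # => "invalid Glue job name: \"\""
end
```

Keeping this apart from `validate_identifier!` would leave the stricter rule in place for SQL-style identifiers, where a hyphen is not wanted.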