brisk-provider-kubernetes 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml ADDED
@@ -0,0 +1,7 @@
+ ---
+ SHA256:
+   metadata.gz: 882e58cd13f8b798cc6e6beceb16a38a4140d940ddeeb9cc774bacc74e269a13
+   data.tar.gz: 9040dd237daac56141cc8554089135377dee0eb6d308693b0e6fee046266b882
+ SHA512:
+   metadata.gz: 8fa771314fc4cda075e6ade74e7b82241acd0c8f3752159ea82f4673db620e18c98bda64c880d50e7e7fe5864d012a69982a57cdb14346d58c9401bcca569037
+   data.tar.gz: c16a8dc635b63c3ba1d661ab01c066d536e42e900c9693b883667c6d0378994f081a7d0aaf300a2cfaa38838d39f22b7e3ab0180cedb96d052d70b911fb335fe
data/README.md ADDED
@@ -0,0 +1,435 @@
+ # Brisk Provider: Kubernetes
+ 
+ Run Brisk CI workers as Kubernetes pods in your cluster. This provider manages the full lifecycle of worker pods, from creation to cleanup.
+ 
+ ## Features
+ 
+ - ✅ **Dynamic Pod Creation** - Automatically creates worker pods on demand
+ - ✅ **Auto-scaling** - Works with the Kubernetes Horizontal Pod Autoscaler
+ - ✅ **Health Management** - Uses Kubernetes liveness and readiness probes
+ - ✅ **Resource Limits** - Configurable CPU and memory requests/limits
+ - ✅ **Node Selection** - Support for node selectors and tolerations
+ - ✅ **Namespace Isolation** - Run workers in dedicated namespaces
+ - ✅ **Orphan Cleanup** - Automatic reconciliation of orphaned pods
+ 
+ ## Installation
+ 
+ Add this line to your application's Gemfile:
+ 
+ ```ruby
+ gem 'brisk-provider-kubernetes'
+ ```
+ 
+ Or install from source:
+ 
+ ```ruby
+ gem 'brisk-provider-kubernetes', git: 'https://github.com/brisktest/brisk-provider-kubernetes'
+ ```
+ 
+ Then execute:
+ 
+ ```bash
+ bundle install
+ ```
+ 
+ The provider will auto-register when Rails boots.
+ ## Configuration
+ 
+ ### Prerequisites
+ 
+ 1. **Kubernetes Cluster** - You need a running Kubernetes cluster
+ 2. **kubectl Access** - A configured kubeconfig or an in-cluster service account
+ 3. **Namespace** - Create a namespace for Brisk workers:
+ 
+ ```bash
+ kubectl create namespace brisk-workers
+ ```
+ 
+ ### Environment Variables
+ 
+ **For local development (using kubeconfig):**
+ ```bash
+ KUBECONFIG=/path/to/kubeconfig
+ K8S_NAMESPACE=brisk-workers  # Optional, defaults to brisk-workers
+ BRISK_API_ENDPOINT=api.brisk.dev:443
+ ```
+ 
+ **For in-cluster deployment (using a service account):**
+ ```bash
+ K8S_NAMESPACE=brisk-workers  # Optional
+ BRISK_API_ENDPOINT=api.brisk.dev:443
+ ```
+ 
+ ### RBAC Permissions
+ 
+ The Brisk API needs these Kubernetes permissions:
+ 
+ ```yaml
+ apiVersion: v1
+ kind: ServiceAccount
+ metadata:
+   name: brisk-api
+   namespace: brisk-system
+ ---
+ apiVersion: rbac.authorization.k8s.io/v1
+ kind: Role
+ metadata:
+   name: brisk-worker-manager
+   namespace: brisk-workers
+ rules:
+   - apiGroups: [""]
+     resources: ["pods"]
+     verbs: ["get", "list", "watch", "create", "delete"]
+   - apiGroups: [""]
+     resources: ["pods/log"]
+     verbs: ["get"]
+ ---
+ apiVersion: rbac.authorization.k8s.io/v1
+ kind: RoleBinding
+ metadata:
+   name: brisk-api-worker-manager
+   namespace: brisk-workers
+ subjects:
+   - kind: ServiceAccount
+     name: brisk-api
+     namespace: brisk-system
+ roleRef:
+   kind: Role
+   name: brisk-worker-manager
+   apiGroup: rbac.authorization.k8s.io
+ ```
+ 
+ Apply with:
+ ```bash
+ kubectl apply -f rbac.yaml
+ ```
+ 
+ ### Project Configuration
+ 
+ Create a project using the Kubernetes provider:
+ 
+ ```ruby
+ project = Project.create!(
+   name: "My K8s Project",
+   worker_provider: 'kubernetes',
+   provider_config: {
+     # Required
+     'namespace' => 'brisk-workers',
+ 
+     # Optional: Custom environment variables
+     'env' => {
+       'CUSTOM_VAR' => 'value',
+       'DEBUG' => 'true'
+     },
+ 
+     # Optional: Resource defaults
+     'default_memory_mb' => 4096,
+     'default_cpu_count' => 2,
+ 
+     # Optional: Node selection
+     'node_selector' => {
+       'workload-type' => 'ci'
+     },
+ 
+     # Optional: Tolerations for tainted nodes
+     'tolerations' => [
+       {
+         'key' => 'ci-workload',
+         'operator' => 'Equal',
+         'value' => 'true',
+         'effect' => 'NoSchedule'
+       }
+     ]
+   }
+ )
+ ```
+ 
+ ## Usage
+ 
+ Once configured, the provider works automatically:
+ 
+ ```ruby
+ # Start a test run
+ jobrun = project.jobruns.create!(...)
+ 
+ # Provider automatically:
+ # 1. Finds available worker pods
+ # 2. Creates new pods if needed
+ # 3. Allocates them to the jobrun
+ # 4. Cleans up after tests complete
+ 
+ # Check worker status
+ project.workers.in_use.each do |worker|
+   puts "Worker #{worker.id}: #{worker.state}"
+   puts "  Pod: #{worker.machine.uid}"
+   puts "  Node: #{worker.machine.json_data['node_name']}"
+ end
+ ```
+ 
+ ## Pod Configuration
+ 
+ Worker pods are created with:
+ 
+ ### Container Spec
+ - **Image**: From `project.image.url`
+ - **Ports**:
+   - 50051 (gRPC for worker communication)
+   - 8081 (HTTP for health checks)
+ - **Environment**:
+   - `BRISK_PROJECT_ID`
+   - `BRISK_PROJECT_TOKEN`
+   - `BRISK_API_ENDPOINT`
+   - `WORKER_CONCURRENCY`
+   - Custom vars from `provider_config['env']`
+ 
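The environment assembly above can be sketched in plain Ruby: a fixed base list plus whatever `provider_config['env']` contains, both in the Kubernetes container `env` shape. The method name and keyword arguments here are illustrative, not the provider's actual internal API:

```ruby
# Sketch: merge the base worker env with custom vars from provider_config.
# Each entry follows the Kubernetes container `env` item shape.
def build_env_vars(project_id:, token:, endpoint:, concurrency:, custom: {})
  base = [
    { name: 'BRISK_PROJECT_ID',    value: project_id.to_s },
    { name: 'BRISK_PROJECT_TOKEN', value: token },
    { name: 'BRISK_API_ENDPOINT',  value: endpoint },
    { name: 'WORKER_CONCURRENCY',  value: concurrency.to_s }
  ]
  custom_env = custom.map { |k, v| { name: k.to_s, value: v.to_s } }
  base + custom_env
end

env = build_env_vars(project_id: 42, token: 't0k3n',
                     endpoint: 'api.brisk.dev:443', concurrency: 4,
                     custom: { 'DEBUG' => 'true' })
# env.last => { name: "DEBUG", value: "true" }
```

Custom variables come last, so the base Brisk variables always appear first in the container spec.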
+ ### Resources
+ ```yaml
+ resources:
+   requests:
+     memory: "4096Mi"   # Configurable
+     cpu: "2"           # Configurable
+   limits:
+     memory: "6144Mi"   # 1.5x requests
+     cpu: "3"           # 1.5x requests
+ ```
+ 
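The 1.5× burst headroom is simple arithmetic over the configured requests. A standalone sketch (the method name is made up for illustration; `to_i` keeps the memory quantity a clean integral `Mi` value):

```ruby
# Sketch: derive resource limits as 1.5x the requested resources,
# producing Kubernetes quantity strings.
def burst_limits(memory_mb, cpu_count)
  {
    requests: { memory: "#{memory_mb}Mi", cpu: cpu_count.to_s },
    limits:   { memory: "#{(memory_mb * 1.5).to_i}Mi", cpu: (cpu_count * 1.5).to_s }
  }
end

burst_limits(4096, 2)
# => { requests: { memory: "4096Mi", cpu: "2" },
#      limits:   { memory: "6144Mi", cpu: "3.0" } }
```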
+ ### Health Checks
+ ```yaml
+ livenessProbe:
+   httpGet:
+     path: /health
+     port: 8081
+   initialDelaySeconds: 10
+   periodSeconds: 30
+ 
+ readinessProbe:
+   httpGet:
+     path: /ready
+     port: 8081
+   initialDelaySeconds: 5
+   periodSeconds: 10
+ ```
+ 
+ ### Labels
+ All pods are labeled with:
+ - `brisk-project-id: <project_id>`
+ - `brisk-role: worker`
+ - `app: brisk-worker`
+ 
+ Use these labels for monitoring, network policies, etc.
+ 
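These labels combine into the selector strings used throughout this README (`kubectl ... -l` and the pods API's `labelSelector`). A small sketch — the helper name is illustrative:

```ruby
# Sketch: turn a label hash into a Kubernetes label-selector string,
# as passed to `kubectl get pods -l ...` or the API's labelSelector.
def label_selector(labels)
  labels.map { |k, v| "#{k}=#{v}" }.join(',')
end

label_selector('brisk-project-id' => 42, 'brisk-role' => 'worker')
# => "brisk-project-id=42,brisk-role=worker"
```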
+ ## Advanced Configuration
+ 
+ ### Custom Resource Limits
+ 
+ ```ruby
+ project.update!(
+   provider_config: {
+     'namespace' => 'brisk-workers',
+     'default_memory_mb' => 8192,  # 8GB RAM
+     'default_cpu_count' => 4      # 4 CPUs
+   }
+ )
+ ```
+ 
+ ### Node Selection
+ 
+ Run workers on specific nodes:
+ 
+ ```ruby
+ project.update!(
+   provider_config: {
+     'namespace' => 'brisk-workers',
+     'node_selector' => {
+       'workload-type' => 'ci',
+       'disk-type' => 'ssd'
+     }
+   }
+ )
+ ```
+ 
+ ### Tolerations
+ 
+ Run on tainted nodes:
+ 
+ ```ruby
+ project.update!(
+   provider_config: {
+     'namespace' => 'brisk-workers',
+     'tolerations' => [
+       {
+         'key' => 'ci-workload',
+         'operator' => 'Equal',
+         'value' => 'true',
+         'effect' => 'NoSchedule'
+       }
+     ]
+   }
+ )
+ ```
+ 
+ ### Multiple Namespaces
+ 
+ Run different projects in different namespaces:
+ 
+ ```ruby
+ # Project 1 - staging environment
+ project1 = Project.create!(
+   name: "Staging Tests",
+   worker_provider: 'kubernetes',
+   provider_config: { 'namespace' => 'brisk-staging' }
+ )
+ 
+ # Project 2 - production environment
+ project2 = Project.create!(
+   name: "Production Tests",
+   worker_provider: 'kubernetes',
+   provider_config: { 'namespace' => 'brisk-production' }
+ )
+ ```
+ 
+ ## Monitoring
+ 
+ ### View Pods
+ 
+ ```bash
+ # List all Brisk worker pods
+ kubectl get pods -n brisk-workers -l app=brisk-worker
+ 
+ # Watch pod status
+ kubectl get pods -n brisk-workers -l app=brisk-worker -w
+ 
+ # View logs
+ kubectl logs -n brisk-workers <pod-name>
+ ```
+ 
+ ### Metrics
+ 
+ Worker pods expose endpoints on port 8081:
+ - `/health` - Liveness check
+ - `/ready` - Readiness check
+ - `/metrics` - Prometheus metrics (if enabled)
+ 
+ ### Events
+ 
+ ```bash
+ # View pod events
+ kubectl describe pod -n brisk-workers <pod-name>
+ 
+ # Watch events
+ kubectl get events -n brisk-workers -w
+ ```
+ 
+ ## Troubleshooting
+ 
+ ### Pods Not Starting
+ 
+ Check events:
+ ```bash
+ kubectl describe pod -n brisk-workers <pod-name>
+ ```
+ 
+ Common issues:
+ - **ImagePullBackOff**: Check the image URL and registry credentials
+ - **Pending**: Check resource requests against available node capacity
+ - **CrashLoopBackOff**: Check pod logs for errors
+ 
+ ### Workers Not Connecting
+ 
+ 1. Check pod logs:
+    ```bash
+    kubectl logs -n brisk-workers <pod-name>
+    ```
+ 
+ 2. Verify connectivity:
+    ```bash
+    kubectl exec -n brisk-workers <pod-name> -- curl http://api.brisk.dev
+    ```
+ 
+ 3. Check environment variables:
+    ```bash
+    kubectl exec -n brisk-workers <pod-name> -- env | grep BRISK
+    ```
+ 
+ ### Orphaned Pods
+ 
+ The provider automatically cleans up orphaned pods via reconciliation. To clean up manually:
+ 
+ ```bash
+ # Delete all pods for a project
+ kubectl delete pods -n brisk-workers -l brisk-project-id=<project_id>
+ 
+ # Delete all Brisk worker pods
+ kubectl delete pods -n brisk-workers -l app=brisk-worker
+ ```
+ 
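Reconciliation boils down to a set difference: pods carrying Brisk labels in the cluster, minus the machine records the database knows about, are orphans. A standalone sketch (the function name is illustrative):

```ruby
# Sketch: pods that carry Brisk worker labels but have no matching
# machine record are orphaned; reconciliation deletes them.
def orphaned_pods(cluster_pod_names, known_machine_uids)
  cluster_pod_names - known_machine_uids
end

orphaned_pods(
  ['brisk-worker-42-a1b2', 'brisk-worker-42-dead'],
  ['brisk-worker-42-a1b2']
)
# => ["brisk-worker-42-dead"]
```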
+ ## Development
+ 
+ ### Running Tests
+ 
+ ```bash
+ bundle exec rspec
+ ```
+ 
+ ### Testing Locally
+ 
+ 1. Start a minikube or kind cluster
+ 2. Configure your kubeconfig
+ 3. Add the gem to your Gemfile with the `path:` option
+ 4. Create a test project with `worker_provider: 'kubernetes'`
+ 
+ ### Debugging
+ 
+ Enable debug logging:
+ 
+ ```ruby
+ # config/environments/development.rb
+ Rails.logger.level = :debug
+ ```
+ 
+ View provider logs:
+ ```ruby
+ Rails.logger.tagged("K8s Provider") do
+   # Provider operations will be logged
+ end
+ ```
+ 
+ ## Architecture
+ 
+ ### Provider → PodService → Kubernetes API
+ 
+ ```
+ Provider (provider.rb)
+   └─> PodService (pod_service.rb)
+         └─> K8s Client (k8s-ruby gem)
+               └─> Kubernetes API
+ ```
+ 
+ ### Pod Lifecycle
+ 
+ 1. **Creation**: Provider creates the pod via PodService
+ 2. **Registration**: Pod starts and registers via gRPC
+ 3. **Allocation**: Worker is allocated to a jobrun
+ 4. **Execution**: Tests run on the worker
+ 5. **Cleanup**: Pod is deleted after the job completes
+ 
+ ### Health Management
+ 
+ - Kubernetes manages pod health via probes
+ - Brisk disables its internal health tracking (sets `last_checked_at` far into the future)
+ - Dead pods are automatically restarted by Kubernetes
+ 
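The "far future" trick in the second bullet is just a sentinel timestamp: a worker whose `last_checked_at` is ahead of the current time never looks due for a Brisk-side health check. A plain-Ruby sketch of that idea (the helper names are made up; the real code uses ActiveSupport's `10.years.from_now`):

```ruby
# Sketch: a far-future sentinel timestamp marks a worker as "never needs
# a Brisk-side health check", since Kubernetes probes own pod health.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

def health_check_sentinel(now = Time.now)
  now + 10 * SECONDS_PER_YEAR
end

def needs_health_check?(last_checked_at, now = Time.now)
  last_checked_at < now
end

needs_health_check?(health_check_sentinel) # => false
```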
+ ## Contributing
+ 
+ Bug reports and pull requests are welcome on GitHub at https://github.com/brisktest/brisk-provider-kubernetes.
+ 
+ ## License
+ 
+ The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
+ 
+ ## Support
+ 
+ - Documentation: https://docs.brisk.dev/providers/kubernetes
+ - Issues: https://github.com/brisktest/brisk-provider-kubernetes/issues
+ - Discussions: https://github.com/brisktest/brisk/discussions
data/Rakefile ADDED
@@ -0,0 +1,8 @@
+ # frozen_string_literal: true
+ 
+ require 'bundler/gem_tasks'
+ require 'rspec/core/rake_task'
+ 
+ RSpec::Core::RakeTask.new(:spec)
+ 
+ task default: :spec
data/lib/brisk/providers/kubernetes/engine.rb ADDED
@@ -0,0 +1,37 @@
+ # frozen_string_literal: true
+ 
+ module Brisk
+   module Providers
+     module Kubernetes
+       # Rails Engine for the Kubernetes provider.
+       # Auto-registers the provider when the gem is loaded.
+       class Engine < ::Rails::Engine
+         isolate_namespace Brisk::Providers::Kubernetes
+ 
+         # Auto-register the provider when Rails boots
+         initializer 'brisk_provider_kubernetes.register', before: :load_config_initializers do |app|
+           app.config.to_prepare do
+             if defined?(::Providers::ProviderRegistry)
+               ::Providers::ProviderRegistry.instance.register(
+                 'kubernetes',
+                 Brisk::Providers::Kubernetes::Provider
+               )
+               Rails.logger.info '✓ Kubernetes Provider registered'
+             else
+               Rails.logger.warn '⚠ Cannot register Kubernetes Provider - ProviderRegistry not found'
+             end
+           end
+         end
+ 
+         # Make any migrations from this gem available to the host app
+         initializer 'brisk_provider_kubernetes.migrations' do |app|
+           unless app.root.to_s.match?(root.to_s)
+             config.paths['db/migrate'].expanded.each do |expanded_path|
+               app.config.paths['db/migrate'] << expanded_path
+             end
+           end
+         end
+       end
+     end
+   end
+ end
data/lib/brisk/providers/kubernetes/pod_service.rb ADDED
@@ -0,0 +1,244 @@
+ # frozen_string_literal: true
+ 
+ module Brisk
+   module Providers
+     module Kubernetes
+       # Service for managing Kubernetes pods as Brisk workers
+       class PodService
+         attr_reader :project, :client
+ 
+         def initialize(project, client)
+           @project = project
+           @client = client
+         end
+ 
+         # Get or create worker pods for a job run
+         # @param jobrun [Jobrun] The job run
+         # @param workers_needed [Integer] Number of workers needed
+         # @return [Array<Worker>] Array of workers
+         def get_workers_for_project(_jobrun, workers_needed)
+           namespace = project.provider_config['namespace'] || 'brisk-workers'
+ 
+           # Find available workers (pods in Ready state)
+           available_workers = find_available_workers(namespace)
+ 
+           if available_workers.size >= workers_needed
+             # Use existing workers
+             available_workers.first(workers_needed)
+           else
+             # Create additional workers
+             needed = workers_needed - available_workers.size
+             Rails.logger.info "[K8s] Creating #{needed} new worker pods"
+ 
+             new_workers = needed.times.map do
+               create_worker_pod(default_machine_config)
+             end
+ 
+             available_workers + new_workers.compact
+           end
+         end
+ 
+         # Create a worker pod
+         # @param machine_config [Hash] Machine configuration
+         # @return [Machine] The created machine
+         def create_worker_pod(machine_config)
+           namespace = project.provider_config['namespace'] || 'brisk-workers'
+           pod_name = "brisk-worker-#{project.id}-#{SecureRandom.hex(4)}"
+ 
+           pod_spec = build_pod_spec(pod_name, machine_config)
+ 
+           Rails.logger.debug "[K8s] Creating pod: #{pod_name}"
+           # create_resource expects a K8s::Resource, not a raw Hash
+           client.api('v1').resource('pods', namespace: namespace).create_resource(K8s::Resource.new(pod_spec))
+ 
+           # Wait for the pod to get an IP address (with timeout)
+           pod = wait_for_pod_ip(namespace, pod_name, timeout: 30)
+ 
+           # Create machine record
+           machine = Machine.create!(
+             uid: pod.metadata.name,
+             provider: 'kubernetes',
+             ip_address: pod.status.podIP,
+             host_ip: pod.status.hostIP,
+             state: 'running',
+             config: machine_config,
+             json_data: {
+               namespace: namespace,
+               pod_name: pod.metadata.name,
+               node_name: pod.spec.nodeName
+             }
+           )
+ 
+           Rails.logger.info "[K8s] Created pod #{pod_name} on node #{pod.spec.nodeName}"
+           machine
+         rescue K8s::Error::API => e
+           Rails.logger.error "[K8s] Failed to create pod: #{e.message}"
+           raise ::Providers::ProviderError, "Kubernetes API error: #{e.message}"
+         end
+ 
+         # Restart a pod (delete and recreate)
+         # @param worker [Worker] The worker to restart
+         # @return [Machine] The new machine
+         def restart_pod(worker)
+           namespace = worker.machine.json_data['namespace'] || 'brisk-workers'
+           old_pod_name = worker.machine.uid
+ 
+           # Delete the old pod
+           begin
+             client.api('v1').resource('pods', namespace: namespace).delete(
+               old_pod_name,
+               propagationPolicy: 'Foreground'
+             )
+           rescue K8s::Error::NotFound
+             # Already deleted, continue
+           end
+ 
+           # Create a new pod
+           create_worker_pod(worker.machine.config || default_machine_config)
+         end
+ 
+         private
+ 
+         # Find available workers (free and in ready state)
+         # @param namespace [String] Kubernetes namespace
+         # @return [Array<Worker>] Available workers
+         def find_available_workers(namespace)
+           # Find all pods for this project
+           pods = client.api('v1').resource('pods', namespace: namespace).list(
+             labelSelector: "brisk-project-id=#{project.id},brisk-role=worker"
+           )
+ 
+           # Get workers from the database that are free
+           pod_uids = pods.map { |p| p.metadata.name }
+           machines = project.machines.where(provider: 'kubernetes', uid: pod_uids)
+ 
+           project.workers.free.not_stale.where(machine: machines).to_a
+         end
+ 
+         # Wait for a pod to get an IP address
+         # @param namespace [String] Kubernetes namespace
+         # @param pod_name [String] Pod name
+         # @param timeout [Integer] Timeout in seconds
+         # @return [K8s::Resource] Pod with IP address
+         def wait_for_pod_ip(namespace, pod_name, timeout: 30)
+           deadline = Time.current + timeout
+ 
+           loop do
+             pod = client.api('v1').resource('pods', namespace: namespace).get(pod_name)
+ 
+             return pod if pod.status.podIP.present?
+ 
+             raise ::Providers::ProviderError, "Timeout waiting for pod IP: #{pod_name}" if Time.current > deadline
+ 
+             sleep 1
+           end
+         rescue K8s::Error::NotFound
+           raise ::Providers::ProviderError, "Pod not found: #{pod_name}"
+         end
+ 
+         # Build the Kubernetes pod specification
+         # @param pod_name [String] Pod name
+         # @param machine_config [Hash] Machine configuration
+         # @return [Hash] Pod specification
+         def build_pod_spec(pod_name, machine_config)
+           namespace = project.provider_config['namespace'] || 'brisk-workers'
+ 
+           {
+             apiVersion: 'v1',
+             kind: 'Pod',
+             metadata: {
+               name: pod_name,
+               namespace: namespace,
+               labels: {
+                 'brisk-project-id' => project.id.to_s,
+                 'brisk-role' => 'worker',
+                 'app' => 'brisk-worker'
+               },
+               annotations: {
+                 'brisk.dev/project-name' => project.name,
+                 'brisk.dev/created-at' => Time.current.iso8601
+               }
+             },
+             spec: {
+               restartPolicy: 'Never',
+               containers: [
+                 {
+                   name: 'worker',
+                   image: project.image.url,
+                   imagePullPolicy: 'Always',
+                   env: build_env_vars,
+                   resources: build_resources(machine_config),
+                   ports: [
+                     { name: 'grpc', containerPort: 50_051, protocol: 'TCP' },
+                     { name: 'health', containerPort: 8081, protocol: 'TCP' }
+                   ],
+                   livenessProbe: {
+                     httpGet: { path: '/health', port: 8081 },
+                     initialDelaySeconds: 10,
+                     periodSeconds: 30
+                   },
+                   readinessProbe: {
+                     httpGet: { path: '/ready', port: 8081 },
+                     initialDelaySeconds: 5,
+                     periodSeconds: 10
+                   }
+                 }
+               ],
+               # Optional: node selector for specific node pools
+               nodeSelector: machine_config[:node_selector] || {},
+               # Optional: tolerations for tainted nodes
+               tolerations: machine_config[:tolerations] || []
+             }
+           }
+         end
+ 
+         # Build environment variables for the worker container
+         # @return [Array<Hash>] Environment variables
+         def build_env_vars
+           base_env = [
+             { name: 'BRISK_PROJECT_ID', value: project.id.to_s },
+             { name: 'BRISK_PROJECT_TOKEN', value: project.project_token },
+             { name: 'BRISK_API_ENDPOINT', value: ENV['BRISK_API_ENDPOINT'] || 'api.brisk.dev:443' },
+             { name: 'WORKER_CONCURRENCY', value: project.worker_concurrency.to_s }
+           ]
+ 
+           # Add custom environment variables from provider config
+           custom_env = project.provider_config.fetch('env', {}).map do |key, value|
+             { name: key.to_s, value: value.to_s }
+           end
+ 
+           base_env + custom_env
+         end
+ 
+         # Build resource requests and limits
+         # @param machine_config [Hash] Machine configuration
+         # @return [Hash] Resource specification
+         def build_resources(machine_config)
+           memory = machine_config[:memory_mb] || 4096
+           cpu = machine_config[:cpu_count] || 2
+ 
+           {
+             requests: {
+               memory: "#{memory}Mi",
+               cpu: cpu.to_s
+             },
+             limits: {
+               memory: "#{(memory * 1.5).to_i}Mi", # Allow 50% burst; to_i keeps the quantity integral
+               cpu: (cpu * 1.5).to_s
+             }
+           }
+         end
+ 
+         # Default machine configuration
+         # @return [Hash] Default configuration
+         def default_machine_config
+           {
+             memory_mb: 4096,
+             cpu_count: 2,
+             node_selector: {},
+             tolerations: []
+           }
+         end
+       end
+     end
+   end
+ end
data/lib/brisk/providers/kubernetes/provider.rb ADDED
@@ -0,0 +1,246 @@
+ # frozen_string_literal: true
+ 
+ require 'k8s-ruby'
+ 
+ module Brisk
+   module Providers
+     module Kubernetes
+       # Kubernetes provider for Brisk.
+       # Manages workers as Kubernetes pods in a cluster.
+       class Provider < ::Providers::BaseProvider
+         TRACER = defined?(MyAppTracer) ? MyAppTracer : nil
+ 
+         # Get workers for a project by finding available pods
+         # @param jobrun [Jobrun] The job run requesting workers
+         # @return [Array<Worker>] Array of allocated workers
+         def get_workers_for_project(jobrun)
+           workers_needed = project.worker_concurrency
+           Rails.logger.info "[K8s] Getting #{workers_needed} workers for jobrun #{jobrun.id}"
+ 
+           service = Brisk::Providers::Kubernetes::PodService.new(project, k8s_client)
+           service.get_workers_for_project(jobrun, workers_needed)
+         end
+ 
+         # Create a new worker pod
+         # @param machine_config [Hash] Machine configuration
+         # @return [Machine] The created machine record
+         def create_worker(machine_config)
+           validate_config!
+ 
+           Rails.logger.info "[K8s] Creating worker pod with config: #{machine_config}"
+ 
+           service = Brisk::Providers::Kubernetes::PodService.new(project, k8s_client)
+           service.create_worker_pod(machine_config)
+         rescue K8s::Error::API => e
+           raise ::Providers::ProviderError, "Failed to create Kubernetes pod: #{e.message}"
+         end
+ 
+         # Start a worker pod (recreate it if it is not running)
+         # @param worker [Worker] The worker to start
+         def start_worker(worker)
+           Rails.logger.info "[K8s] Starting worker #{worker.id}"
+ 
+           pod_name = worker.machine.uid
+           namespace = k8s_namespace
+ 
+           client = k8s_client
+           pod = client.api('v1').resource('pods', namespace: namespace).get(pod_name)
+ 
+           unless pod.status.phase == 'Running'
+             # Restart the pod by deleting and recreating it
+             service = Brisk::Providers::Kubernetes::PodService.new(project, client)
+             service.restart_pod(worker)
+           end
+         rescue K8s::Error::NotFound
+           Rails.logger.warn "[K8s] Pod #{pod_name} not found, recreating"
+           create_worker(worker.machine.config)
+         rescue K8s::Error::API => e
+           raise ::Providers::ProviderError, "Failed to start pod: #{e.message}"
+         end
+ 
+         # Stop a worker pod (delete it but keep the machine record)
+         # @param worker [Worker] The worker to stop
+         def stop_worker(worker)
+           Rails.logger.info "[K8s] Stopping worker #{worker.id}"
+ 
+           pod_name = worker.machine.uid
+           namespace = k8s_namespace
+ 
+           # Delete the pod; a new one is created on demand when needed
+           client = k8s_client
+           client.api('v1').resource('pods', namespace: namespace).delete(pod_name, propagationPolicy: 'Foreground')
+ 
+           worker.machine.update(state: 'stopped')
+         rescue K8s::Error::NotFound
+           Rails.logger.warn "[K8s] Pod #{pod_name} already deleted"
+         rescue K8s::Error::API => e
+           raise ::Providers::ProviderError, "Failed to stop pod: #{e.message}"
+         end
+ 
+         # Suspend a worker pod (not supported, falls back to stop)
+         # @param worker [Worker] The worker to suspend
+         def suspend_worker(worker)
+           Rails.logger.info "[K8s] Suspending worker #{worker.id} (using stop)"
+           stop_worker(worker)
+         end
+ 
+         # Destroy a worker pod permanently
+         # @param worker [Worker] The worker to destroy
+         def destroy_worker(worker)
+           Rails.logger.info "[K8s] Destroying worker #{worker.id}"
+ 
+           pod_name = worker.machine.uid
+           namespace = k8s_namespace
+ 
+           client = k8s_client
+           client.api('v1').resource('pods', namespace: namespace).delete(
+             pod_name,
+             propagationPolicy: 'Foreground',
+             gracePeriodSeconds: 30
+           )
+ 
+           worker.machine.update(state: 'terminated', finished_at: Time.current)
+         rescue K8s::Error::NotFound
+           Rails.logger.warn "[K8s] Pod #{pod_name} already deleted"
+         rescue K8s::Error::API => e
+           Rails.logger.error "[K8s] Failed to destroy pod: #{e.message}"
+           # Continue even if deletion fails
+         end
+ 
+         # Reconcile workers - clean up orphaned pods
+         def reconcile_workers
+           Rails.logger.info "[K8s] Reconciling workers for project #{project.id}"
+ 
+           namespace = k8s_namespace
+           client = k8s_client
+ 
+           # Find all pods with the project label
+           pods = client.api('v1').resource('pods', namespace: namespace).list(
+             labelSelector: "brisk-project-id=#{project.id},brisk-role=worker"
+           )
+ 
+           # Get all known machines
+           known_uids = project.machines.where(provider: 'kubernetes').pluck(:uid)
+ 
+           # Delete orphaned pods
+           orphaned = pods.reject { |pod| known_uids.include?(pod.metadata.name) }
+           orphaned.each do |pod|
+             Rails.logger.info "[K8s] Deleting orphaned pod: #{pod.metadata.name}"
+             client.api('v1').resource('pods', namespace: namespace).delete(
+               pod.metadata.name,
+               propagationPolicy: 'Background'
+             )
+           end
+ 
+           Rails.logger.info "[K8s] Reconciliation complete: #{orphaned.size} orphaned pods deleted"
+         rescue K8s::Error::API => e
+           Rails.logger.error "[K8s] Reconciliation failed: #{e.message}"
+         end
+ 
+         # Feature support
+         # @param feature [Symbol] Feature name
+         # @return [Boolean] Whether the feature is supported
+         def supports?(feature)
+           # Can create pods on demand and use the Horizontal Pod Autoscaler
+           %i[dynamic_creation auto_scale].include?(feature)
+         end
+ 
+         # Called after workers are allocated
+         # @param workers [Array<Worker>] Allocated workers
+         def after_worker_allocated(workers)
+           Rails.logger.debug "[K8s] #{workers.size} workers allocated, no balancing needed"
+           # Kubernetes handles scheduling and balancing
+         end
+ 
+         # Called after a worker is freed
+         # @param worker [Worker] The freed worker
+         def after_worker_freed(worker)
+           Rails.logger.info "[K8s] Scheduling cleanup for worker #{worker.id}"
+           # Schedule cleanup after a delay to allow for reuse:
+           # CleanupKubernetesWorkerJob.set(wait: 5.minutes).perform_later(worker.id)
+ 
+           # For now, just log - pods will be cleaned up by reconciliation
+         end
+ 
+         # Should Brisk track health for this worker?
+         # Kubernetes manages pod health via liveness/readiness probes.
+         # @param worker [Worker] The worker
+         # @return [Boolean] false - Kubernetes manages health
+         def should_track_health?(_worker)
+           false
+         end
+ 
+         # Register worker metadata when the worker registers via gRPC
+         # @param worker [Worker] The worker
+         # @param params [Hash] Registration parameters
+         # @return [Hash] Metadata to set on the worker
+         def register_worker_metadata(_worker, _params)
+           {
+             last_checked_at: 10.years.from_now # Kubernetes manages health
+           }
+         end
+ 
+         # Does this provider manage this machine?
+         # @param machine [Machine] The machine
+         # @return [Boolean] true if the machine is a Kubernetes pod
+         def manages_machine?(machine)
+           machine.provider == 'kubernetes'
+         end
+ 
+         private
+ 
+         # Get the Kubernetes client
+         # @return [K8s::Client] Kubernetes client
+         def k8s_client
+           @k8s_client ||= if in_cluster?
+                             # Build the client from the pod's service account
+                             # (token, CA certificate and API server env vars)
+                             K8s::Client.in_cluster_config
+                           else
+                             # Use a kubeconfig for local development
+                             kubeconfig_path = ENV['KUBECONFIG'] || File.expand_path('~/.kube/config')
+                             K8s::Client.config(K8s::Config.load_file(kubeconfig_path))
+                           end
+         rescue StandardError => e
+           raise ::Providers::ConfigurationError, "Failed to load Kubernetes config: #{e.message}"
+         end
+ 
+         # Check if running inside a Kubernetes cluster
+         # @return [Boolean] true if running in-cluster
+         def in_cluster?
+           File.exist?('/var/run/secrets/kubernetes.io/serviceaccount/token')
+         end
+ 
+         # Get the Kubernetes namespace for this project
+         # @return [String] Namespace name
+         def k8s_namespace
+           project.provider_config['namespace'] || ENV['K8S_NAMESPACE'] || 'brisk-workers'
+         end
+ 
+         # Validate the provider configuration
+         # @raise [Providers::ConfigurationError] if configuration is invalid
+         def validate_config!
+           raise ::Providers::ConfigurationError, 'Kubernetes namespace not configured' unless k8s_namespace.present?
+ 
+           # Validate that an image is configured
+           return if project.image&.url.present?
+ 
+           raise ::Providers::ConfigurationError, 'Worker image not configured'
+         end
+       end
+     end
+   end
+ end
data/lib/brisk/providers/kubernetes/version.rb ADDED
@@ -0,0 +1,9 @@
+ # frozen_string_literal: true
+ 
+ module Brisk
+   module Providers
+     module Kubernetes
+       VERSION = '1.0.0'
+     end
+   end
+ end
data/lib/brisk-provider-kubernetes.rb ADDED
@@ -0,0 +1,6 @@
+ # frozen_string_literal: true
+ 
+ require_relative 'brisk/providers/kubernetes/version'
+ require_relative 'brisk/providers/kubernetes/engine'
+ require_relative 'brisk/providers/kubernetes/pod_service'
+ require_relative 'brisk/providers/kubernetes/provider'
metadata ADDED
@@ -0,0 +1,139 @@
+ --- !ruby/object:Gem::Specification
+ name: brisk-provider-kubernetes
+ version: !ruby/object:Gem::Version
+   version: 1.0.0
+ platform: ruby
+ authors:
+ - Brisk Team
+ autorequire:
+ bindir: bin
+ cert_chain: []
+ date: 2025-10-28 00:00:00.000000000 Z
+ dependencies:
+ - !ruby/object:Gem::Dependency
+   name: k8s-ruby
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '0.10'
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '0.10'
+ - !ruby/object:Gem::Dependency
+   name: rails
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '7.0'
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '7.0'
+ - !ruby/object:Gem::Dependency
+   name: factory_bot_rails
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '6.0'
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '6.0'
+ - !ruby/object:Gem::Dependency
+   name: rspec-rails
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '6.0'
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '6.0'
+ - !ruby/object:Gem::Dependency
+   name: sqlite3
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '2.1'
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '2.1'
+ - !ruby/object:Gem::Dependency
+   name: webmock
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '3.0'
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '3.0'
+ description: Manage Brisk workers as Kubernetes pods in your cluster
+ email:
+ - support@brisk.dev
+ executables: []
+ extensions: []
+ extra_rdoc_files: []
+ files:
+ - README.md
+ - Rakefile
+ - lib/brisk-provider-kubernetes.rb
+ - lib/brisk/providers/kubernetes/engine.rb
+ - lib/brisk/providers/kubernetes/pod_service.rb
+ - lib/brisk/providers/kubernetes/provider.rb
+ - lib/brisk/providers/kubernetes/version.rb
+ homepage: https://github.com/brisktest/brisk-provider-kubernetes
+ licenses:
+ - MIT
+ metadata:
+   homepage_uri: https://github.com/brisktest/brisk-provider-kubernetes
+   source_code_uri: https://github.com/brisktest/brisk-provider-kubernetes
+   changelog_uri: https://github.com/brisktest/brisk-provider-kubernetes/blob/main/CHANGELOG.md
+   documentation_uri: https://docs.brisk.dev/providers/kubernetes
+   rubygems_mfa_required: 'true'
+ post_install_message:
+ rdoc_options: []
+ require_paths:
+ - lib
+ required_ruby_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ">="
+     - !ruby/object:Gem::Version
+       version: 3.2.0
+ required_rubygems_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ">="
+     - !ruby/object:Gem::Version
+       version: '0'
+ requirements: []
+ rubygems_version: 3.5.16
+ signing_key:
+ specification_version: 4
+ summary: Kubernetes provider for Brisk CI
+ test_files: []