tetra-rp 0.5.5__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,806 @@
1
+ Metadata-Version: 2.4
2
+ Name: tetra_rp
3
+ Version: 0.5.5
4
+ Summary: A Python library for distributed inference and serving of machine learning models
5
+ Author-email: Marut Pandya <pandyamarut@gmail.com>, Patrick Rachford <prachford@icloud.com>, Dean Quinanola <dean.quinanola@runpod.io>
6
+ License: MIT
7
+ Classifier: Development Status :: 3 - Alpha
8
+ Classifier: Programming Language :: Python :: 3
9
+ Classifier: License :: OSI Approved :: MIT License
10
+ Classifier: Operating System :: OS Independent
11
+ Requires-Python: <3.14,>=3.9
12
+ Description-Content-Type: text/markdown
13
+ Requires-Dist: cloudpickle>=3.1.1
14
+ Requires-Dist: runpod~=1.7.9
15
+ Requires-Dist: python-dotenv>=1.0.0
16
+
17
+ # Tetra: Serverless computing for AI workloads
18
+
19
+ Tetra is a Python SDK that streamlines the development and deployment of AI workflows on Runpod's [Serverless infrastructure](http://docs.runpod.io/serverless/overview). Write Python functions locally, and Tetra handles the infrastructure: provisioning GPUs and CPUs, managing dependencies, and transferring data, so you can focus on building AI applications.
20
+
21
+ You can find a repository of prebuilt Tetra examples at [runpod/tetra-examples](https://github.com/runpod/tetra-examples).
22
+
23
+ > [!NOTE]
24
+ > **New feature - Consolidated template management:** `PodTemplate` overrides now seamlessly integrate with `ServerlessResource` defaults, providing more consistent resource configuration and reducing deployment complexity.
25
+
26
+ ## Table of contents
27
+
28
29
+ - [Getting started](#getting-started)
30
+ - [Key concepts](#key-concepts)
31
+ - [How it works](#how-it-works)
32
+ - [Use cases](#use-cases)
33
+ - [Advanced features](#advanced-features)
34
+ - [Configuration](#configuration)
35
+ - [Workflow examples](#workflow-examples)
36
+ - [Contributing](#contributing)
37
+ - [Troubleshooting](#troubleshooting)
+ - [License](#license)
38
+
39
+ ## Getting started
40
+
41
+ Before you can use Tetra, you'll need:
42
+
43
+ - Python 3.9 or later (below 3.14) installed on your local machine.
44
+ - A Runpod account with an API key ([sign up here](https://runpod.io/console)).
45
+ - Basic knowledge of Python and async programming.
46
+
47
+ ### Step 1: Install Tetra
48
+
49
+ ```bash
50
+ pip install tetra_rp
51
+ ```
52
+
53
+ ### Step 2: Set your API key
54
+
55
+ Generate an API key from the [Runpod account settings](https://docs.runpod.io/get-started/api-keys) page and set it as an environment variable:
56
+
57
+ ```bash
58
+ export RUNPOD_API_KEY=[YOUR_API_KEY]
59
+ ```
60
+
61
+ Or save it in a `.env` file in your project directory:
62
+
63
+ ```bash
64
+ echo "RUNPOD_API_KEY=[YOUR_API_KEY]" > .env
65
+ ```
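+
+ Tetra reads `RUNPOD_API_KEY` from the environment. If you keep the key in a `.env` file and want to be sure it's loaded before anything else runs, you can load it explicitly with `python-dotenv` (already a Tetra dependency). A minimal sketch:
+
+ ```python
+ from dotenv import load_dotenv
+
+ # Copy RUNPOD_API_KEY (and any other variables) from .env into the process environment
+ load_dotenv()
+ ```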
66
+
67
+ ### Step 3: Write your first Tetra function
68
+
69
+ Add the following code to a new Python file:
70
+
71
+ ```python
72
+ import asyncio
73
+ from tetra_rp import remote, LiveServerless
74
+
75
+ # Configure GPU resources
76
+ gpu_config = LiveServerless(name="tetra-quickstart")
77
+
78
+ @remote(
79
+ resource_config=gpu_config,
80
+ dependencies=["torch", "numpy"]
81
+ )
82
+ def gpu_compute(data):
83
+ import torch
84
+ import numpy as np
85
+
86
+ # This runs on a GPU in Runpod's cloud
87
+ tensor = torch.tensor(data, device="cuda")
88
+ result = tensor.sum().item()
89
+
90
+ return {
91
+ "result": result,
92
+ "device": torch.cuda.get_device_name(0)
93
+ }
94
+
95
+ async def main():
96
+ # This runs locally
97
+ result = await gpu_compute([1, 2, 3, 4, 5])
98
+ print(f"Sum: {result['result']}")
99
+ print(f"Computed on: {result['device']}")
100
+
101
+ if __name__ == "__main__":
102
+ asyncio.run(main())
103
+ ```
104
+
105
+ Run the example:
106
+
107
+ ```bash
108
+ python your_script.py
109
+ ```
110
+
111
+ ## Key concepts
112
+
113
+ ### Remote functions
114
+
115
+ Tetra's `@remote` decorator marks functions for execution on Runpod's infrastructure. Everything inside the decorated function runs remotely, while code outside runs locally.
116
+
117
+ ```python
118
+ @remote(resource_config=config, dependencies=["pandas"])
119
+ def process_data(data):
120
+ # This code runs remotely
121
+ import pandas as pd
122
+ df = pd.DataFrame(data)
123
+ return df.describe().to_dict()
124
+
125
+ async def main():
126
+ # This code runs locally
127
+ result = await process_data(my_data)
128
+ ```
129
+
130
+ ### Resource configuration
131
+
132
+ Tetra provides fine-grained control over hardware allocation through configuration objects:
133
+
134
+ ```python
135
+ from tetra_rp import LiveServerless, GpuGroup, CpuInstanceType, PodTemplate
136
+
137
+ # GPU configuration
138
+ gpu_config = LiveServerless(
139
+ name="ml-inference",
140
+ gpus=[GpuGroup.AMPERE_80], # A100 80GB
141
+ workersMax=5,
142
+ template=PodTemplate(containerDiskInGb=100) # Extra disk space
143
+ )
144
+
145
+ # CPU configuration
146
+ cpu_config = LiveServerless(
147
+ name="data-processor",
148
+ instanceIds=[CpuInstanceType.CPU5C_4_16], # 4 vCPU, 16GB RAM
149
+ workersMax=3
150
+ )
151
+ ```
152
+
153
+ ### Dependency management
154
+
155
+ Specify Python packages in the decorator, and Tetra installs them automatically:
156
+
157
+ ```python
158
+ @remote(
159
+ resource_config=gpu_config,
160
+ dependencies=["transformers==4.36.0", "torch", "pillow"]
161
+ )
162
+ def generate_image(prompt):
163
+ # Import inside the function
164
+ from transformers import pipeline
165
+ import torch
166
+ from PIL import Image
167
+
168
+ # Your code here
169
+ ```
170
+
171
+ ### Parallel execution
172
+
173
+ Run multiple remote functions concurrently using Python's async capabilities:
174
+
175
+ ```python
176
+ # Process multiple items in parallel
177
+ results = await asyncio.gather(
178
+ process_item(item1),
179
+ process_item(item2),
180
+ process_item(item3)
181
+ )
182
+ ```
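+
+ For larger batches, the same pattern works with a generator expression, so one remote call is dispatched per item while `workersMax` bounds how many workers handle them at once:
+
+ ```python
+ # Fan out one remote call per item and wait for all results
+ results = await asyncio.gather(*(process_item(item) for item in items))
+ ```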
183
+
184
+ ## How it works
185
+
186
+ Tetra orchestrates workflow execution through a multi-step process:
187
+
188
+ 1. **Function identification**: The `@remote` decorator marks functions for remote execution, enabling Tetra to distinguish between local and remote operations.
189
+ 2. **Dependency analysis**: Tetra automatically analyzes function dependencies to construct an optimal execution order, ensuring data flows correctly between sequential and parallel operations.
190
+ 3. **Resource provisioning and execution**: For each remote function, Tetra:
191
+ - Dynamically provisions endpoint and worker resources on Runpod's infrastructure.
192
+ - Serializes and securely transfers input data to the remote worker.
193
+ - Executes the function on the remote infrastructure with the specified GPU or CPU resources.
194
+ - Returns results to your local environment for further processing.
195
+ 4. **Data orchestration**: Results flow seamlessly between functions according to your local Python code structure, maintaining the same programming model whether functions run locally or remotely.
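+
+ Conceptually, the hand-off in step 3 is a serialize-execute-deserialize round trip. The sketch below illustrates the idea with `cloudpickle` (one of Tetra's dependencies); it's a simplification for intuition, not Tetra's actual wire protocol:
+
+ ```python
+ import cloudpickle
+
+ def add(a, b):
+     return a + b
+
+ # Client side: bundle the function and its arguments for the remote worker
+ payload = cloudpickle.dumps((add, (2, 3), {}))
+
+ # Worker side: unpack, execute, and serialize the result for the trip back
+ func, args, kwargs = cloudpickle.loads(payload)
+ result_payload = cloudpickle.dumps(func(*args, **kwargs))
+
+ # Client side again: deserialize the result
+ print(cloudpickle.loads(result_payload))  # 5
+ ```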
196
+
197
+ ## Use cases
198
+
199
+ Tetra is well-suited for a diverse range of AI and data processing workloads:
200
+
201
+ - **Multi-modal AI pipelines**: Orchestrate unified workflows combining text, image, and audio models with GPU acceleration.
202
+ - **Distributed model training**: Scale training operations across multiple GPU workers for faster model development.
203
+ - **AI research experimentation**: Rapidly prototype and test complex model combinations without infrastructure overhead.
204
+ - **Production inference systems**: Deploy sophisticated multi-stage inference pipelines for real-world applications.
205
+ - **Data processing workflows**: Efficiently process large datasets using CPU workers for general computation and GPU workers for accelerated tasks.
206
+ - **Hybrid GPU/CPU workflows**: Optimize cost and performance by combining CPU preprocessing with GPU inference.
207
+
208
+ ## Advanced features
209
+
210
+ ### Custom Docker images
211
+
212
+ `LiveServerless` resources use a fixed Docker image that's optimized for the Tetra runtime and supports full remote code execution. For specialized environments that require a custom Docker image, use `ServerlessEndpoint` or `CpuServerlessEndpoint`:
213
+
214
+ ```python
215
+ from tetra_rp import ServerlessEndpoint
216
+
217
+ custom_gpu = ServerlessEndpoint(
218
+ name="custom-ml-env",
219
+ imageName="pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime",
220
+ gpus=[GpuGroup.AMPERE_80]
221
+ )
222
+ ```
223
+
224
+ Unlike `LiveServerless`, these endpoints only support dictionary payloads in the form of `{"input": {...}}` (similar to a traditional [Serverless endpoint request](https://docs.runpod.io/serverless/endpoints/send-requests)), and cannot execute arbitrary Python functions remotely.
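+
+ The fields inside `input` are whatever your endpoint's handler expects; the names below are purely illustrative:
+
+ ```python
+ # ServerlessEndpoint / CpuServerlessEndpoint requests are plain dictionaries
+ # shaped like a traditional Runpod Serverless request.
+ payload = {
+     "input": {
+         "prompt": "A serene mountain landscape at sunset",  # handler-specific field (illustrative)
+     }
+ }
+ ```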
225
+
226
+ ### Persistent storage
227
+
228
+ Attach network volumes for model caching:
229
+
230
+ ```python
231
+ config = LiveServerless(
232
+ name="model-server",
233
+ networkVolumeId="vol_abc123", # Your volume ID
234
+ template=PodTemplate(containerDiskInGb=100)
235
+ )
236
+ ```
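+
+ Inside a remote function you can then point your framework's cache at the attached volume so weights persist across cold starts. On Runpod Serverless, network volumes are typically mounted at `/runpod-volume`; the path and the use of `HF_HOME` below are assumptions to adapt to your own setup:
+
+ ```python
+ @remote(resource_config=config, dependencies=["transformers", "torch"])
+ def load_model(model_id):
+     import os
+
+     # Cache downloads on the network volume so later workers reuse them
+     os.environ["HF_HOME"] = "/runpod-volume/huggingface"
+
+     from transformers import AutoModel
+     AutoModel.from_pretrained(model_id)
+     return {"cached": model_id}
+ ```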
237
+
238
+ ### Environment variables
239
+
240
+ Pass configuration to remote functions:
241
+
242
+ ```python
243
+ config = LiveServerless(
244
+ name="api-worker",
245
+ env={"HF_TOKEN": "your_token", "MODEL_ID": "gpt2"}
246
+ )
247
+ ```
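+
+ Values passed through `env` are set as ordinary environment variables on the worker, so remote functions can read them with `os.environ`:
+
+ ```python
+ @remote(resource_config=config)
+ def load_pipeline():
+     import os
+
+     model_id = os.environ["MODEL_ID"]
+     token_set = os.environ.get("HF_TOKEN") is not None
+     return {"model_id": model_id, "token_set": token_set}
+ ```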
248
+
249
+ ## Configuration
250
+
251
+ ### GPU configuration parameters
252
+
253
+ The following parameters can be used with `LiveServerless` (full remote code execution) and `ServerlessEndpoint` (dictionary payload only) to configure your Runpod GPU endpoints:
254
+
255
+ | Parameter | Description | Default | Example Values |
256
+ |--------------------|-------------------------------------------------|---------------|-------------------------------------|
257
+ | `name` | (Required) Name for your endpoint | `""` | `"stable-diffusion-server"` |
258
+ | `gpus` | GPU pool IDs that can be used by workers | `[GpuGroup.ANY]` | `[GpuGroup.ADA_24]` for RTX 4090 |
259
+ | `gpuCount` | Number of GPUs per worker | 1 | 1, 2, 4 |
260
+ | `workersMin` | Minimum number of workers | 0 | Set to 1 for persistence |
261
+ | `workersMax` | Maximum number of workers | 3 | Higher for more concurrency |
262
+ | `idleTimeout` | Minutes before scaling down | 5 | 10, 30, 60 |
263
+ | `env` | Environment variables | `None` | `{"HF_TOKEN": "xyz"}` |
264
+ | `networkVolumeId` | Persistent storage ID | `None` | `"vol_abc123"` |
265
+ | `executionTimeoutMs`| Max execution time (ms) | 0 (no limit) | 600000 (10 min) |
266
+ | `scalerType` | Scaling strategy | `QUEUE_DELAY` | `REQUEST_COUNT` |
267
+ | `scalerValue` | Scaling parameter value | 4 | 1-10 range typical |
268
+ | `locations` | Preferred datacenter locations | `None` | `"us-east,eu-central"` |
269
+ | `imageName` | Custom Docker image (`ServerlessEndpoint` only) | Fixed for LiveServerless | `"pytorch/pytorch:latest"`, `"my-registry/custom:v1.0"` |
270
+
271
+ ### CPU configuration parameters
272
+
273
+ The same GPU configuration parameters above apply to `LiveServerless` (full remote code execution) and `CpuServerlessEndpoint` (dictionary payload only), with these additional CPU-specific parameters:
274
+
275
+ | Parameter | Description | Default | Example Values |
276
+ |--------------------|-------------------------------------------------|---------------|-------------------------------------|
277
+ | `instanceIds` | CPU Instance Types (forces a CPU endpoint type) | `None` | `[CpuInstanceType.CPU5C_2_4]` |
278
+ | `imageName` | Custom Docker image (`CpuServerlessEndpoint` only) | Fixed for `LiveServerless` | `"python:3.11-slim"`, `"my-registry/custom:v1.0"` |
279
+
280
+ ### Resource class comparison
281
+
282
+ | Feature | LiveServerless | ServerlessEndpoint | CpuServerlessEndpoint |
283
+ |---------|----------------|-------------------|----------------------|
284
+ | **Remote code execution** | ✅ Full Python function execution | ❌ Dictionary payloads only | ❌ Dictionary payloads only |
285
+ | **Custom Docker images** | ❌ Fixed optimized images | ✅ Any Docker image | ✅ Any Docker image |
286
+ | **Use case** | Dynamic remote functions | Traditional API endpoints | Traditional CPU endpoints |
287
+ | **Function returns** | Any Python object | Dictionary only | Dictionary only |
288
+ | **@remote decorator** | Full functionality | Limited to payload passing | Limited to payload passing |
289
+
290
+ ### Available GPU types
291
+
292
+ Some common GPU groups available through `GpuGroup`:
293
+
294
+ - `GpuGroup.ANY` - Any available GPU (default)
295
+ - `GpuGroup.ADA_24` - NVIDIA GeForce RTX 4090
296
+ - `GpuGroup.AMPERE_80` - NVIDIA A100 80GB
297
+ - `GpuGroup.AMPERE_48` - NVIDIA A40, RTX A6000
298
+ - `GpuGroup.AMPERE_24` - NVIDIA RTX A5000, L4, RTX 3090
299
+
300
+ ### Available CPU instance types
+
302
+ - `CpuInstanceType.CPU3G_1_4` - (cpu3g-1-4) 3rd gen general purpose, 1 vCPU, 4GB RAM
303
+ - `CpuInstanceType.CPU3G_2_8` - (cpu3g-2-8) 3rd gen general purpose, 2 vCPU, 8GB RAM
304
+ - `CpuInstanceType.CPU3G_4_16` - (cpu3g-4-16) 3rd gen general purpose, 4 vCPU, 16GB RAM
305
+ - `CpuInstanceType.CPU3G_8_32` - (cpu3g-8-32) 3rd gen general purpose, 8 vCPU, 32GB RAM
306
+ - `CpuInstanceType.CPU3C_1_2` - (cpu3c-1-2) 3rd gen compute-optimized, 1 vCPU, 2GB RAM
307
+ - `CpuInstanceType.CPU3C_2_4` - (cpu3c-2-4) 3rd gen compute-optimized, 2 vCPU, 4GB RAM
308
+ - `CpuInstanceType.CPU3C_4_8` - (cpu3c-4-8) 3rd gen compute-optimized, 4 vCPU, 8GB RAM
309
+ - `CpuInstanceType.CPU3C_8_16` - (cpu3c-8-16) 3rd gen compute-optimized, 8 vCPU, 16GB RAM
310
+ - `CpuInstanceType.CPU5C_1_2` - (cpu5c-1-2) 5th gen compute-optimized, 1 vCPU, 2GB RAM
311
+ - `CpuInstanceType.CPU5C_2_4` - (cpu5c-2-4) 5th gen compute-optimized, 2 vCPU, 4GB RAM
312
+ - `CpuInstanceType.CPU5C_4_8` - (cpu5c-4-8) 5th gen compute-optimized, 4 vCPU, 8GB RAM
313
+ - `CpuInstanceType.CPU5C_8_16` - (cpu5c-8-16) 5th gen compute-optimized, 8 vCPU, 16GB RAM
314
+
315
+ ## Workflow examples
316
+
317
+ ### Basic GPU workflow
318
+
319
+ ```python
320
+ import asyncio
321
+ from tetra_rp import remote, LiveServerless
322
+
323
+ # Simple GPU configuration
324
+ gpu_config = LiveServerless(name="example-gpu-server")
325
+
326
+ @remote(
327
+ resource_config=gpu_config,
328
+ dependencies=["torch", "numpy"]
329
+ )
330
+ def gpu_compute(data):
331
+ import torch
332
+ import numpy as np
333
+
334
+ # Convert to tensor and perform computation on GPU
335
+ tensor = torch.tensor(data, device="cuda")
336
+ result = tensor.sum().item()
337
+
338
+ # Get GPU info
339
+ gpu_info = torch.cuda.get_device_properties(0)
340
+
341
+ return {
342
+ "result": result,
343
+ "gpu_name": gpu_info.name,
344
+ "cuda_version": torch.version.cuda
345
+ }
346
+
347
+ async def main():
348
+ result = await gpu_compute([1, 2, 3, 4, 5])
349
+ print(f"Result: {result['result']}")
350
+ print(f"Computed on: {result['gpu_name']} with CUDA {result['cuda_version']}")
351
+
352
+ if __name__ == "__main__":
353
+ asyncio.run(main())
354
+ ```
355
+
356
+ ### Advanced GPU workflow with template configuration
357
+
358
+ ```python
359
+ import asyncio
360
+ from tetra_rp import remote, LiveServerless, GpuGroup, PodTemplate
361
+
362
+ # Advanced GPU configuration with consolidated template overrides
363
+ sd_config = LiveServerless(
364
+ gpus=[GpuGroup.AMPERE_80], # A100 80GB GPUs
365
+ name="example_image_gen_server",
366
+ template=PodTemplate(containerDiskInGb=100), # Large disk for models
367
+ workersMax=3,
368
+ idleTimeout=10
369
+ )
370
+
371
+ @remote(
372
+ resource_config=sd_config,
373
+ dependencies=["diffusers", "transformers", "torch", "accelerate", "safetensors"]
374
+ )
375
+ def generate_image(prompt, width=512, height=512):
376
+ import torch
377
+ from diffusers import StableDiffusionPipeline
378
+ import io
379
+ import base64
380
+
381
+ # Load pipeline (benefits from large container disk)
382
+ pipeline = StableDiffusionPipeline.from_pretrained(
383
+ "runwayml/stable-diffusion-v1-5",
384
+ torch_dtype=torch.float16
385
+ )
386
+ pipeline = pipeline.to("cuda")
387
+
388
+ # Generate image
389
+ image = pipeline(prompt=prompt, width=width, height=height).images[0]
390
+
391
+ # Convert to base64 for return
392
+ buffered = io.BytesIO()
393
+ image.save(buffered, format="PNG")
394
+ img_str = base64.b64encode(buffered.getvalue()).decode()
395
+
396
+ return {"image": img_str, "prompt": prompt}
397
+
398
+ async def main():
399
+ result = await generate_image("A serene mountain landscape at sunset")
400
+ print(f"Generated image for: {result['prompt']}")
401
+ # Save image locally if needed
402
+ # img_data = base64.b64decode(result["image"])
403
+ # with open("output.png", "wb") as f:
404
+ # f.write(img_data)
405
+
406
+ if __name__ == "__main__":
407
+ asyncio.run(main())
408
+ ```
409
+
410
+ ### Basic CPU workflow
411
+
412
+ ```python
413
+ import asyncio
414
+ from tetra_rp import remote, LiveServerless, CpuInstanceType
415
+
416
+ # Simple CPU configuration
417
+ cpu_config = LiveServerless(
418
+ name="example-cpu-server",
419
+ instanceIds=[CpuInstanceType.CPU5G_2_8], # 2 vCPU, 8GB RAM
420
+ )
421
+
422
+ @remote(
423
+ resource_config=cpu_config,
424
+ dependencies=["pandas", "numpy"]
425
+ )
426
+ def cpu_data_processing(data):
427
+ import pandas as pd
428
+ import numpy as np
429
+ import platform
430
+
431
+ # Process data using CPU
432
+ df = pd.DataFrame(data)
433
+
434
+ return {
435
+ "row_count": len(df),
436
+ "column_count": len(df.columns) if not df.empty else 0,
437
+ "mean_values": df.select_dtypes(include=[np.number]).mean().to_dict(),
438
+ "system_info": platform.processor(),
439
+ "platform": platform.platform()
440
+ }
441
+
442
+ async def main():
443
+ sample_data = [
444
+ {"name": "Alice", "age": 30, "score": 85},
445
+ {"name": "Bob", "age": 25, "score": 92},
446
+ {"name": "Charlie", "age": 35, "score": 78}
447
+ ]
448
+
449
+ result = await cpu_data_processing(sample_data)
450
+ print(f"Processed {result['row_count']} rows on {result['platform']}")
451
+ print(f"Mean values: {result['mean_values']}")
452
+
453
+ if __name__ == "__main__":
454
+ asyncio.run(main())
455
+ ```
456
+
457
+ ### Advanced CPU workflow with template configuration
458
+
459
+ ```python
460
+ import asyncio
461
+ import numpy as np
462
+ from tetra_rp import remote, LiveServerless, CpuInstanceType, PodTemplate
463
+
464
+ # Advanced CPU configuration with template overrides
465
+ data_processing_config = LiveServerless(
466
+ name="advanced-cpu-processor",
467
+ instanceIds=[CpuInstanceType.CPU5C_4_16, CpuInstanceType.CPU3C_4_8], # Fallback options
468
+ template=PodTemplate(
469
+ containerDiskInGb=20, # Extra disk space for data processing
470
+ env=[{"key": "PYTHONPATH", "value": "/workspace"}] # Custom environment
471
+ ),
472
+ workersMax=5,
473
+ idleTimeout=15,
474
+ env={"PROCESSING_MODE": "batch", "DEBUG": "false"} # Additional env vars
475
+ )
476
+
477
+ @remote(
478
+ resource_config=data_processing_config,
479
+ dependencies=["pandas", "numpy", "scipy", "scikit-learn"]
480
+ )
481
+ def advanced_data_analysis(dataset, analysis_type="full"):
482
+ import pandas as pd
483
+ import numpy as np
484
+ from sklearn.preprocessing import StandardScaler
485
+ from sklearn.decomposition import PCA
486
+ import platform
487
+
488
+ # Create DataFrame
489
+ df = pd.DataFrame(dataset)
490
+
491
+ # Perform analysis based on type
492
+ results = {
493
+ "platform": platform.platform(),
494
+ "dataset_shape": df.shape,
495
+ "memory_usage": df.memory_usage(deep=True).sum()
496
+ }
497
+
498
+ if analysis_type == "full":
499
+ # Advanced statistical analysis
500
+ numeric_cols = df.select_dtypes(include=[np.number]).columns
501
+ if len(numeric_cols) > 0:
502
+ # Standardize data
503
+ scaler = StandardScaler()
504
+ scaled_data = scaler.fit_transform(df[numeric_cols])
505
+
506
+ # PCA analysis
507
+ pca = PCA(n_components=min(len(numeric_cols), 3))
508
+ pca_result = pca.fit_transform(scaled_data)
509
+
510
+ results.update({
511
+ "correlation_matrix": df[numeric_cols].corr().to_dict(),
512
+ "pca_explained_variance": pca.explained_variance_ratio_.tolist(),
513
+ "pca_shape": pca_result.shape
514
+ })
515
+
516
+ return results
517
+
518
+ async def main():
519
+ # Generate sample dataset
520
+ sample_data = [
521
+ {"feature1": np.random.randn(), "feature2": np.random.randn(),
522
+ "feature3": np.random.randn(), "category": f"cat_{i%3}"}
523
+ for i in range(1000)
524
+ ]
525
+
526
+ result = await advanced_data_analysis(sample_data, "full")
527
+ print(f"Processed dataset with shape: {result['dataset_shape']}")
528
+ print(f"Memory usage: {result['memory_usage']} bytes")
529
+ print(f"PCA explained variance: {result.get('pca_explained_variance', 'N/A')}")
530
+
531
+ if __name__ == "__main__":
532
+ asyncio.run(main())
533
+ ```
534
+
535
+ ### Hybrid GPU/CPU workflow
536
+
537
+ ```python
538
+ import asyncio
+ import numpy as np
539
+ from tetra_rp import remote, LiveServerless, GpuGroup, CpuInstanceType, PodTemplate
540
+
541
+ # GPU configuration for model inference
542
+ gpu_config = LiveServerless(
543
+ name="ml-inference-gpu",
544
+ gpus=[GpuGroup.AMPERE_24], # RTX 3090/A5000
545
+ template=PodTemplate(containerDiskInGb=50), # Space for models
546
+ workersMax=2
547
+ )
548
+
549
+ # CPU configuration for data preprocessing
550
+ cpu_config = LiveServerless(
551
+ name="data-preprocessor",
552
+ instanceIds=[CpuInstanceType.CPU5C_4_16], # 4 vCPU, 16GB RAM
553
+ template=PodTemplate(
554
+ containerDiskInGb=30,
555
+ env=[{"key": "NUMPY_NUM_THREADS", "value": "4"}]
556
+ ),
557
+ workersMax=3
558
+ )
559
+
560
+ @remote(
561
+ resource_config=cpu_config,
562
+ dependencies=["pandas", "numpy", "scikit-learn"]
563
+ )
564
+ def preprocess_data(raw_data):
565
+ import pandas as pd
566
+ import numpy as np
567
+ from sklearn.preprocessing import StandardScaler
568
+
569
+ # Data cleaning and preprocessing
570
+ df = pd.DataFrame(raw_data)
571
+
572
+ # Handle missing values
573
+ df = df.fillna(df.mean(numeric_only=True))
574
+
575
+ # Normalize numeric features
576
+ numeric_cols = df.select_dtypes(include=[np.number]).columns
577
+ if len(numeric_cols) > 0:
578
+ scaler = StandardScaler()
579
+ df[numeric_cols] = scaler.fit_transform(df[numeric_cols])
580
+
581
+ return {
582
+ "processed_data": df.to_dict('records'),
583
+ "shape": df.shape,
584
+ "columns": list(df.columns)
585
+ }
586
+
587
+ @remote(
588
+ resource_config=gpu_config,
589
+ dependencies=["torch", "transformers", "numpy"]
590
+ )
591
+ def run_inference(processed_data):
592
+ import torch
593
+ import numpy as np
594
+
595
+ # Simulate ML model inference on GPU
596
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
597
+
598
+ # Convert to tensor
599
+ data_array = np.array([list(item.values()) for item in processed_data["processed_data"]])
600
+ tensor = torch.tensor(data_array, dtype=torch.float32).to(device)
601
+
602
+ # Simple neural network simulation
603
+ with torch.no_grad():
604
+ # Simulate model computation
605
+ result = torch.nn.functional.softmax(tensor.mean(dim=1), dim=0)
606
+ predictions = result.cpu().numpy().tolist()
607
+
608
+ return {
609
+ "predictions": predictions,
610
+ "device_used": str(device),
611
+ "input_shape": tensor.shape
612
+ }
613
+
614
+ async def ml_pipeline(raw_dataset):
615
+ """Complete ML pipeline: CPU preprocessing -> GPU inference"""
616
+ print("Step 1: Preprocessing data on CPU...")
617
+ preprocessed = await preprocess_data(raw_dataset)
618
+ print(f"Preprocessed data shape: {preprocessed['shape']}")
619
+
620
+ print("Step 2: Running inference on GPU...")
621
+ results = await run_inference(preprocessed)
622
+ print(f"Inference completed on: {results['device_used']}")
623
+
624
+ return {
625
+ "preprocessing": preprocessed,
626
+ "inference": results
627
+ }
628
+
629
+ async def main():
630
+ # Sample dataset
631
+ raw_data = [
632
+ {"feature1": np.random.randn(), "feature2": np.random.randn(),
633
+ "feature3": np.random.randn(), "label": i % 2}
634
+ for i in range(100)
635
+ ]
636
+
637
+ # Run the complete pipeline
638
+ results = await ml_pipeline(raw_data)
639
+
640
+ print("\nPipeline Results:")
641
+ print(f"Data processed: {results['preprocessing']['shape']}")
642
+ print(f"Predictions generated: {len(results['inference']['predictions'])}")
643
+ print(f"GPU device: {results['inference']['device_used']}")
644
+
645
+ if __name__ == "__main__":
646
+ asyncio.run(main())
647
+ ```
648
+
649
+ ### Multi-stage ML pipeline example
650
+
651
+ ```python
652
+ import asyncio
654
+ from tetra_rp import remote, LiveServerless
655
+
656
+ # Configure Runpod resources
657
+ runpod_config = LiveServerless(name="multi-stage-pipeline-server")
658
+
659
+ # Feature extraction on GPU
660
+ @remote(
661
+ resource_config=runpod_config,
662
+ dependencies=["torch", "transformers"]
663
+ )
664
+ def extract_features(texts):
665
+ import torch
666
+ from transformers import AutoTokenizer, AutoModel
667
+
668
+ tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
669
+ model = AutoModel.from_pretrained("bert-base-uncased")
670
+ model.to("cuda")
671
+
672
+ features = []
673
+ for text in texts:
674
+ inputs = tokenizer(text, return_tensors="pt").to("cuda")
675
+ with torch.no_grad():
676
+ outputs = model(**inputs)
677
+ features.append(outputs.last_hidden_state[:, 0].cpu().numpy().tolist()[0])
678
+
679
+ return features
680
+
681
+ # Classification on GPU
682
+ @remote(
683
+ resource_config=runpod_config,
684
+ dependencies=["torch", "sklearn"]
685
+ )
686
+ def classify(features, labels=None):
687
+ import torch
688
+ import numpy as np
689
+ from sklearn.linear_model import LogisticRegression
690
+
691
+ # When called without labels, the first element of `features` is the trained model's coefficients dict and the remaining elements are the features to classify
+ features_np = np.array(features[1:] if labels is None and isinstance(features, list) and len(features)>0 and isinstance(features[0], dict) else features)
692
+
693
+ if labels is not None:
694
+ labels_np = np.array(labels)
695
+ classifier = LogisticRegression()
696
+ classifier.fit(features_np, labels_np)
697
+
698
+ coefficients = {
699
+ "coef": classifier.coef_.tolist(),
700
+ "intercept": classifier.intercept_.tolist(),
701
+ "classes": classifier.classes_.tolist()
702
+ }
703
+ return coefficients
704
+ else:
705
+ coefficients = features[0]
706
+
707
+ classifier = LogisticRegression()
708
+ classifier.coef_ = np.array(coefficients["coef"])
709
+ classifier.intercept_ = np.array(coefficients["intercept"])
710
+ classifier.classes_ = np.array(coefficients["classes"])
711
+
712
+ # Predict
713
+ predictions = classifier.predict(features_np)
714
+ probabilities = classifier.predict_proba(features_np)
715
+
716
+ return {
717
+ "predictions": predictions.tolist(),
718
+ "probabilities": probabilities.tolist()
719
+ }
720
+
721
+ # Complete pipeline
722
+ async def text_classification_pipeline(train_texts, train_labels, test_texts):
723
+ train_features = await extract_features(train_texts)
724
+ test_features = await extract_features(test_texts)
725
+
726
+ model_coeffs = await classify(train_features, train_labels)
727
+
728
+ # For inference, pass model coefficients along with test features
729
+ # The classify function expects a list where the first element is the model (coeffs)
730
+ # and subsequent elements are features for prediction.
731
+ predictions = await classify([model_coeffs] + test_features)
732
+
733
+ return predictions
734
+ ```
735
+
736
+ ### More examples
737
+
738
+ You can find many more examples in the [tetra-examples repository](https://github.com/runpod/tetra-examples).
739
+
740
+ You can also clone the examples repository and run them directly:
741
+
742
+ ```bash
743
+ git clone https://github.com/runpod/tetra-examples.git
744
+ cd tetra-examples
745
+ python -m examples.example
746
+ python -m examples.image_gen
747
+ python -m examples.matrix_operations
748
+ ```
749
+
750
+ ## Contributing
751
+
752
+ We welcome contributions to Tetra! Whether you're fixing bugs, adding features, or improving documentation, your help makes this project better.
753
+
754
+ ### Development Setup
755
+
756
+ 1. Fork and clone the repository
757
+ 2. Set up your development environment following the project guidelines
758
+ 3. Make your changes following our coding standards
759
+ 4. Test your changes thoroughly
760
+ 5. Submit a pull request
761
+
762
+ ### Release Process
763
+
764
+ This project uses an automated release system built on Release Please. For detailed information about how releases work, including conventional commits, versioning, and the CI/CD pipeline, see our [Release System Documentation](RELEASE_SYSTEM.md).
765
+
766
+ **Quick reference for contributors:**
767
+ - Use conventional commits: `feat:`, `fix:`, `docs:`, etc.
768
+ - CI automatically runs quality checks on all PRs
769
+ - Release PRs are created automatically when changes are merged to main
770
+ - Releases are published to PyPI automatically when release PRs are merged
771
+
772
+ ## Troubleshooting
773
+
774
+ ### Authentication errors
775
+
776
+ Verify your API key is set correctly:
777
+
778
+ ```bash
779
+ echo $RUNPOD_API_KEY # Should show your key
780
+ ```
781
+
782
+ ### Import errors in remote functions
783
+
784
+ Remember to import packages inside remote functions:
785
+
786
+ ```python
787
+ @remote(dependencies=["requests"])
788
+ def fetch_data(url):
789
+ import requests # Import here, not at top of file
790
+ return requests.get(url).json()
791
+ ```
792
+
793
+ ### Performance optimization
794
+
795
+ - Set `workersMin=1` to keep workers warm and avoid cold starts
796
+ - Use `idleTimeout` to balance cost and responsiveness
797
+ - Choose appropriate GPU types for your workload
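+
+ For example, a configuration tuned for responsiveness (at the cost of one always-on worker) might look like this; the values are illustrative:
+
+ ```python
+ from tetra_rp import LiveServerless, GpuGroup
+
+ low_latency_config = LiveServerless(
+     name="low-latency-inference",
+     gpus=[GpuGroup.AMPERE_24],  # match the GPU tier to your model
+     workersMin=1,               # keep one worker warm to avoid cold starts
+     workersMax=3,
+     idleTimeout=10,             # minutes before extra workers scale down
+ )
+ ```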
798
+
799
+ ## License
800
+
801
+ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
802
+
803
+ <p align="center">
804
+ <a href="https://github.com/yourusername/tetra">Tetra</a> •
805
+ <a href="https://runpod.io">Runpod</a>
806
+ </p>