gitgreen 1.3.0 → 1.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,488 +1,383 @@
1
- # GitGreen CLI
1
+ <p align="center">
2
+ <h1 align="center">GitGreen</h1>
3
+ <p align="center">
4
+ <strong>Carbon footprint tracking for GitLab CI/CD pipelines</strong>
5
+ </p>
6
+ <p align="center">
7
+ <a href="#quick-start">Quick Start</a> •
8
+ <a href="#installation">Installation</a> •
9
+ <a href="#usage">Usage</a> •
10
+ <a href="#methodology">Methodology</a> •
11
+ <a href="#faq">FAQ</a>
12
+ </p>
13
+ </p>
14
+
15
+ ---
16
+
17
+ [![npm version](https://img.shields.io/npm/v/gitgreen.svg)](https://www.npmjs.com/package/gitgreen)
18
+ [![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)
19
+ [![Node.js](https://img.shields.io/badge/Node.js-20+-339933?logo=node.js&logoColor=white)](https://nodejs.org/)
20
+
21
+ GitGreen measures the carbon emissions of your CI/CD jobs using real CPU/RAM metrics, grid carbon intensity data, and research-backed power models. Works with **GCP** and **AWS** runners.
2
22
 
3
- Self-contained carbon calculation CLI for GitLab jobs (no API server). It reuses the same power profiles and budget reporting as the existing implementation, pulls Electricity Maps intensity, and can post Merge Request notes via `CI_JOB_TOKEN`. Works with both GCP and AWS runners.
4
-
5
- ## Prerequisites
6
-
7
- Before installing GitGreen CLI, ensure you have the following installed and configured:
8
-
9
- ### Required
10
-
11
- - **Node.js** (v20 or higher) - [Download](https://nodejs.org/)
12
- - Required to run the CLI tool
13
- - Check version: `node --version`
23
+ ```
24
+ Total emissions: 2.847 gCO₂e
25
+ ├── CPU: 1.923 gCO₂e
26
+ ├── RAM: 0.412 gCO₂e
27
+ └── Scope 3: 0.512 gCO₂e (embodied carbon)
28
+ ```
14
29
 
15
- - **npm** or **pnpm** - Package manager (comes with Node.js)
16
- - This project uses `pnpm@8.15.4` as the package manager
17
- - Install pnpm: `npm install -g pnpm`
30
+ ## Quick Start
18
31
 
19
- - **Git** - [Download](https://git-scm.com/)
20
- - Required for detecting GitLab project information during initialization
21
- - Check version: `git --version`
32
+ ```bash
33
+ # Install globally
34
+ npm install -g gitgreen
22
35
 
23
- ### Cloud Provider CLIs (Required based on provider)
36
+ # Initialize in your GitLab project
37
+ cd your-repo
38
+ gitgreen init
39
+ ```
24
40
 
25
- - **Google Cloud SDK (gcloud CLI)** - [Installation Guide](https://cloud.google.com/sdk/docs/install-sdk)
26
- - Required for GCP provider setup and authentication
27
- - After installation, authenticate: `gcloud auth login`
28
- - Set default project: `gcloud config set project YOUR_PROJECT_ID`
29
- - Check version: `gcloud --version`
41
+ The wizard configures everything: provider credentials, carbon budgets, and CI/CD integration.
30
42
 
31
- - **AWS Credentials** - [AWS CLI Installation](https://aws.amazon.com/cli/) (optional, but recommended)
32
- - Required for AWS provider (can use environment variables or AWS CLI config)
33
- - Configure credentials: `aws configure`
34
- - Or set environment variables: `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_REGION`
43
+ ## Table of Contents
35
44
 
36
- ### Optional (but recommended)
45
+ - [Installation](#installation)
46
+ - [Usage](#usage)
47
+ - [Advanced](#advanced)
48
+ - [Providers](#providers)
49
+ - [Output Integrations](#output-integrations)
50
+ - [Database Schema](#database-schema)
51
+ - [Methodology](#methodology)
52
+ - [Architecture](#architecture)
53
+ - [Configuration](#configuration)
54
+ - [FAQ](#faq)
55
+ - [Contributing](#contributing)
56
+ - [References](#references)
57
+ - [License](#license)
37
58
 
38
- - **GitLab CLI (glab)** - [Installation Guide](https://docs.gitlab.com/cli/)
39
- - Recommended for easier GitLab authentication and CI/CD variable management
40
- - After installation, authenticate: `glab auth login`
41
- - Check version: `glab --version`
42
- - If not installed, you'll need to provide a GitLab Personal Access Token manually
59
+ ## Installation
43
60
 
44
- ### API Keys
61
+ ### Prerequisites
45
62
 
46
- - **Electricity Maps API Key** - [Get free key](https://api-portal.electricitymaps.com)
47
- - Required for carbon intensity data
48
- - You'll be prompted for this during `gitgreen init`
63
+ | Requirement | Version | Notes |
64
+ |-------------|---------|-------|
65
+ | Node.js | 20+ | [Download](https://nodejs.org/) |
66
+ | Git | Any | For GitLab project detection |
67
+ | gcloud CLI | Any | Required for GCP ([Install](https://cloud.google.com/sdk/docs/install-sdk)) |
68
+ | AWS CLI | Any | Required for AWS ([Install](https://aws.amazon.com/cli/)) |
69
+ | glab CLI | Any | Optional, for easier GitLab auth ([Install](https://docs.gitlab.com/cli/)) |
49
70
 
50
- ## Install
51
- - From npm (global CLI):
52
- ```bash
53
- npm install -g gitgreen
54
- gitgreen --help
55
- ```
71
+ ### Install
56
72
 
57
- Run tests:
58
73
  ```bash
59
- npm test
74
+ npm install -g gitgreen
60
75
  ```
61
76
 
62
- Stress test multiple live configs (build first, real APIs):
63
- ```bash
64
- npm build
65
- npm stress
66
- ```
77
+ ### API Keys
67
78
 
68
- ## Usage
79
+ - **Electricity Maps** (required): [Get free API key](https://api-portal.electricitymaps.com)
69
80
 
70
- ### Initializing a New GitLab Project
81
+ ## Usage
71
82
 
72
- To get started with carbon tracking in any GitLab project (any project with a remote pointing to a GitLab instance), run:
83
+ ### Initialize a Project
73
84
 
74
85
  ```bash
75
- # In your repo
76
86
  gitgreen init
77
87
  ```
78
88
 
79
- The initialization wizard will guide you through the setup process:
80
- - Configure provider/machine/region settings
81
- - Set carbon budgets
82
- - Configure MR note preferences
83
- - Choose to use an existing runner or spin up a new one with supported providers
84
-
85
- After initialization, the wizard will:
86
- - Append a ready-made job to your `.gitlab-ci.yml`
87
- - Print the CI/CD variable checklist (ELECTRICITY_MAPS_API_KEY, provider credentials, budget flags)
88
-
89
- ### How It Works
90
-
91
- Once initialized, all subsequent pipelines will run on the configured runner, and their performance will be automatically measured. The carbon tracking is implemented using GitLab CI/CD components as a final step in your pipeline. The carbon tracking job itself is not computationally expensive, so it adds minimal overhead to your CI/CD workflows.
92
-
93
- Key environment variables:
94
- - `ELECTRICITY_MAPS_API_KEY` (required) - Get a free key from https://api-portal.electricitymaps.com
95
- - `GCP_PROJECT_ID` / `GOOGLE_CLOUD_PROJECT` (required for GCP) - Your GCP project ID
96
- - `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` (required for AWS CloudWatch) - Credentials with CloudWatch read access on the runner
97
- - `AWS_REGION` (for AWS) - Region of the runner EC2 instance
98
-
99
- The `gitgreen init` wizard will automatically set all necessary GitLab CI/CD variables for you.
100
-
101
- ## Providers
102
-
103
- GitGreen supports multiple cloud providers:
104
-
105
- - **GCP**: Fully wired — the CLI parser expects GCP Monitoring JSON, the GitLab component polls GCE metrics, and `gitgreen init` provisions or reuses GCP runners.
106
- - **AWS**: CloudWatch-backed — the CLI can pull metrics directly via `--from-cloudwatch`, the GitLab component fetches CloudWatch CPU/RAM data, and `gitgreen init` can configure or provision EC2 runners.
107
-
108
- ## Architecture
109
- Runtime flow:
110
- ```
111
- CPU/RAM timeseries (GCP Monitoring JSON or custom collector)
112
- |
113
- v
114
- +------------------+ +----------------------------+
115
- | CLI (gitgreen) |--->| CarbonCalculator |
116
- | - parse metrics | | - PowerProfileRepository |<- machine data in data/*.json
117
- | - CLI/env opts | | - ZoneMapper (region→zone) |<- static + runtime PUE mapping
118
- +------------------+ | - IntensityProvider |<- Electricity Maps API
119
- +-------------+------------+
120
- |
121
- v
122
- +-----------------------------+
123
- | Report formatter |
124
- | - Markdown/JSON artifacts |
125
- | - Budget evaluation |
126
- +-------------+---------------+
127
- |
128
- v
129
- +-----------------------------+
130
- | GitLab client (optional) |
131
- | - MR note via CI_JOB_TOKEN |
132
- +-----------------------------+
133
- ```
134
-
135
- GitLab CI path:
136
- ```
137
- Pipeline starts → component script fetches CPU/RAM timeseries from GCP Monitoring or AWS CloudWatch
138
- → writes JSON files → runs `gitgreen --provider gcp|aws ...`
139
- → emits `carbon-report.md` / `carbon-report.json`
140
- → optional MR note when CI_JOB_TOKEN is present
141
- ```
142
-
143
- ## Output Integrations
144
-
145
- During `gitgreen init` you can opt into exporting GitGreen data to external systems. The wizard includes an integration step with two optional sinks:
89
+ The wizard will:
90
+ 1. Configure your cloud provider (GCP/AWS)
91
+ 2. Set machine type and region
92
+ 3. Configure carbon budgets
93
+ 4. Set up database exports (optional)
94
+ 5. Append the tracking job to `.gitlab-ci.yml`
95
+ 6. Store credentials in GitLab CI/CD variables
146
96
 
147
- - **Per-job carbon data** emissions, runtime, and runner tags for every CI job.
148
- - **Runner inventory** – the machine catalog that powers your GitLab runners, including machine type and scope 3 estimates.
97
+ **That's it!** Run your next pipeline and carbon emissions will be calculated automatically. Results appear as MR comments, in your configured database, and as job artifacts.
149
98
 
150
- Built-in connectors today:
151
- - **MySQL** – populates `GITGREEN_JOB_MYSQL_*` / `GITGREEN_RUNNER_MYSQL_*` and inserts rows through a standard MySQL client.
152
- - **PostgreSQL** – captures host, port, credentials, schema, table, and SSL mode (`GITGREEN_JOB_POSTGRES_*` / `GITGREEN_RUNNER_POSTGRES_*`) for storage in Postgres.
99
+ ---
153
100
 
154
- When you select either connector, the wizard captures host, port, username, password, database, and target table names and stores them in CI/CD variables. It immediately connects with those credentials to ensure the database, schema, and table exist (job sinks also create a `<table>_timeseries` table linked via foreign key). During CI, the GitGreen CLI automatically detects those env vars and:
101
+ ## Advanced
155
102
 
156
- - runs `gitgreen migrate --scope job|runner` to apply any pending migrations (tracked per DB via `gitgreen_migrations`);
157
- - writes each carbon calculation (typed summary columns plus CPU/RAM timeseries rows) and optional runner inventory snapshot into the configured sink.
103
+ ### Providers
158
104
 
159
- ### Extending the interface
105
+ | Provider | Metrics Source | Status |
106
+ |----------|---------------|--------|
107
+ | **GCP** | Cloud Monitoring API | Full support |
108
+ | **AWS** | CloudWatch API | Full support |
160
109
 
161
- Additional connectors can be added without touching the wizard logic. Each destination implements the `OutputIntegration` interface in `src/lib/integrations/output-integrations.ts`, which specifies:
110
+ ### Output Integrations
162
111
 
163
- 1. Display metadata (`id`, `name`, `description`)
164
- 2. The data target it handles (`job` vs `runner`)
165
- 3. Prompted credential fields (label, env var key, input type, default, mask flag)
112
+ Export carbon data to external databases for analytics and dashboards.
166
113
 
167
- To add another sink (for example PostgreSQL or a webhook), create a new entry in that file with the fields your integration needs. Re-run `gitgreen init` and the option will automatically appear in the integration step.
114
+ #### Supported Connectors
168
115
 
169
- ### Database migrations
116
+ | Connector | Job Data | Runner Inventory |
117
+ |-----------|----------|------------------|
118
+ | MySQL | `GITGREEN_JOB_MYSQL_*` | `GITGREEN_RUNNER_MYSQL_*` |
119
+ | PostgreSQL | `GITGREEN_JOB_POSTGRES_*` | `GITGREEN_RUNNER_POSTGRES_*` |
170
120
 
171
- Structured sinks rely on migrations tracked in `gitgreen_migrations`. Run them whenever you update GitGreen or change table names:
121
+ #### Database Migrations
172
122
 
173
123
  ```bash
174
- gitgreen migrate --scope job # apply job sink migrations (summary + timeseries)
175
- gitgreen migrate --scope runner # apply runner inventory migrations
176
- gitgreen migrate --scope all # convenience wrapper (used by the GitLab component)
124
+ gitgreen migrate --scope job # Job emissions tables
125
+ gitgreen migrate --scope runner # Runner inventory tables
126
+ gitgreen migrate --scope all # Both
177
127
  ```
178
128
 
179
- The GitLab component automatically runs `gitgreen migrate --scope job` and `--scope runner` before calculating emissions, so pipelines stay in sync even when you change versions.
129
+ Migrations run automatically in CI pipelines.
180
130
 
181
- ### Table schemas
131
+ ### Database Schema
182
132
 
183
- #### Job table (per-job emissions)
133
+ <details>
134
+ <summary><strong>Job Table</strong> (per-job emissions)</summary>
184
135
 
185
136
  | Column | Type | Description |
186
137
  |--------|------|-------------|
187
- | `id` | BIGSERIAL | Auto-incrementing primary key |
188
- | `ingested_at` | TIMESTAMPTZ | When the record was inserted |
189
- | `provider` | TEXT | Cloud provider (`gcp` or `aws`) |
190
- | `region` | TEXT | Cloud region/zone where the job ran |
191
- | `machine_type` | TEXT | Instance type (e.g., `e2-standard-4`, `t3.medium`) |
192
- | `cpu_points` | INT | Number of CPU metric data points collected |
193
- | `ram_points` | INT | Number of RAM metric data points collected |
194
- | `runtime_seconds` | INT | Total job duration in seconds |
195
- | `total_emissions` | DOUBLE | Total carbon emissions in gCO2eq (cpu + ram + scope3) |
196
- | `cpu_emissions` | DOUBLE | Emissions from CPU usage in gCO2eq |
197
- | `ram_emissions` | DOUBLE | Emissions from RAM usage in gCO2eq |
198
- | `scope3_emissions` | DOUBLE | Embodied carbon (manufacturing, shipping, disposal) in gCO2eq |
199
- | `carbon_intensity` | DOUBLE | Grid carbon intensity in gCO2eq/kWh from Electricity Maps |
200
- | `pue` | DOUBLE | Power Usage Effectiveness of the data center |
201
- | `carbon_budget` | DOUBLE | Configured carbon budget threshold in gCO2eq |
202
- | `over_budget` | BOOLEAN | Whether total emissions exceeded the budget |
203
- | `gitlab_project_id` | BIGINT | GitLab project ID |
204
- | `gitlab_pipeline_id` | BIGINT | GitLab pipeline ID |
205
- | `gitlab_job_id` | BIGINT | GitLab job ID |
206
- | `gitlab_job_name` | TEXT | Name of the GitLab job |
207
- | `runner_id` | TEXT | GitLab runner ID |
208
- | `runner_description` | TEXT | Runner description from GitLab |
209
- | `runner_tags` | TEXT | Comma-separated runner tags |
210
- | `runner_version` | TEXT | GitLab runner version |
211
- | `runner_revision` | TEXT | GitLab runner revision |
212
- | `payload` | JSONB | Full calculation result as JSON (for extensibility) |
213
-
214
- #### Job timeseries table
138
+ | `id` | BIGSERIAL | Primary key |
139
+ | `ingested_at` | TIMESTAMPTZ | Insert timestamp |
140
+ | `provider` | TEXT | `gcp` or `aws` |
141
+ | `region` | TEXT | Cloud region/zone |
142
+ | `machine_type` | TEXT | Instance type |
143
+ | `runtime_seconds` | INT | Job duration |
144
+ | `total_emissions` | DOUBLE | Total gCO₂eq |
145
+ | `cpu_emissions` | DOUBLE | CPU gCO₂eq |
146
+ | `ram_emissions` | DOUBLE | RAM gCO₂eq |
147
+ | `scope3_emissions` | DOUBLE | Embodied gCO₂eq |
148
+ | `carbon_intensity` | DOUBLE | Grid intensity (gCO₂eq/kWh) |
149
+ | `pue` | DOUBLE | Power Usage Effectiveness |
150
+ | `carbon_budget` | DOUBLE | Budget threshold |
151
+ | `over_budget` | BOOLEAN | Budget exceeded |
152
+ | `gitlab_project_id` | BIGINT | GitLab project |
153
+ | `gitlab_pipeline_id` | BIGINT | Pipeline ID |
154
+ | `gitlab_job_id` | BIGINT | Job ID |
155
+ | `gitlab_job_name` | TEXT | Job name |
156
+ | `payload` | JSONB | Full result JSON |
157
+
158
+ </details>
159
+
160
+ <details>
161
+ <summary><strong>Timeseries Table</strong> (CPU/RAM metrics)</summary>
215
162
 
216
163
  | Column | Type | Description |
217
164
  |--------|------|-------------|
218
- | `id` | BIGSERIAL | Auto-incrementing primary key |
219
- | `job_id` | BIGINT | Foreign key to the job table |
220
- | `metric` | TEXT | Metric name: `cpu`, `ram_used`, or `ram_size` |
221
- | `ts` | TIMESTAMPTZ | Timestamp of the data point |
222
- | `value` | DOUBLE | Metric value (CPU utilization %, RAM bytes used, or RAM bytes total) |
165
+ | `id` | BIGSERIAL | Primary key |
166
+ | `job_id` | BIGINT | Foreign key to job |
167
+ | `metric` | TEXT | `cpu`, `ram_used`, `ram_size` |
168
+ | `ts` | TIMESTAMPTZ | Timestamp |
169
+ | `value` | DOUBLE | Metric value |
170
+
171
+ </details>
223
172
 
224
- #### Runner inventory table
173
+ <details>
174
+ <summary><strong>Runner Inventory Table</strong></summary>
225
175
 
226
176
  | Column | Type | Description |
227
177
  |--------|------|-------------|
228
- | `id` | BIGSERIAL | Auto-incrementing primary key |
229
- | `ingested_at` | TIMESTAMPTZ | When the record was inserted |
178
+ | `id` | BIGSERIAL | Primary key |
230
179
  | `runner_id` | TEXT | GitLab runner ID |
231
- | `runner_description` | TEXT | Runner description |
232
- | `runner_version` | TEXT | GitLab runner version |
233
- | `runner_revision` | TEXT | GitLab runner revision |
234
- | `runner_platform` | TEXT | OS platform (e.g., `linux`) |
235
- | `runner_architecture` | TEXT | CPU architecture (e.g., `amd64`) |
236
- | `runner_executor` | TEXT | Executor type (e.g., `docker`, `shell`) |
237
- | `runner_tags` | TEXT | Comma-separated runner tags |
238
180
  | `machine_type` | TEXT | Instance type |
239
181
  | `provider` | TEXT | Cloud provider |
240
- | `region` | TEXT | Cloud region |
241
- | `gcp_project_id` | TEXT | GCP project ID (if applicable) |
242
- | `gcp_instance_id` | TEXT | GCP instance ID |
243
- | `gcp_zone` | TEXT | GCP zone |
244
- | `aws_region` | TEXT | AWS region (if applicable) |
245
- | `aws_instance_id` | TEXT | AWS EC2 instance ID |
246
- | `last_job_machine_type` | TEXT | Machine type from most recent job |
247
- | `last_job_region` | TEXT | Region from most recent job |
248
- | `last_job_provider` | TEXT | Provider from most recent job |
249
- | `last_job_runtime_seconds` | INT | Runtime of most recent job |
250
- | `last_job_total_emissions` | DOUBLE | Emissions from most recent job in gCO2eq |
251
- | `last_job_recorded_at` | TIMESTAMPTZ | When the most recent job was recorded |
252
- | `payload` | JSONB | Full runner metadata as JSON |
253
-
254
- ## Adding a provider
255
- 1. Extend `CloudProvider` and the provider guard in `src/index.ts` so the calculator accepts the new key.
256
- 2. Add machine power data (`<provider>_machine_power_profiles.json`) and, if needed, CPU profiles to `data/`, then update `PowerProfileRepository.loadMachineData` to load it.
257
- 3. Map regions to Electricity Maps zones and a PUE default in `ZoneMapper` (or via `data/runtime-pue-mappings.json` for runtime overrides).
258
- 4. Parse that provider's metrics into the `TimeseriesPoint` shape (timestamp + numeric value) alongside RAM size/usage, and update the CLI/init/templates to pull those metrics.
259
- 5. Wire any CI automation (runner tags, MR note flags) to pass the correct provider, machine type, and region strings.
260
-
261
- ## Publish
262
- - Ensure version bump in `package.json`
263
- - Run `pnpm -C node-module build && pnpm -C node-module test`
264
- - Publish: `pnpm -C node-module publish --access public`
182
+ | `region` | TEXT | Region |
183
+ | `last_job_total_emissions` | DOUBLE | Last job emissions |
184
+ | `payload` | JSONB | Full metadata |
265
185
 
266
- ## FAQ
186
+ </details>
267
187
 
268
- ### Are my API keys and credentials secure?
188
+ ## Methodology
269
189
 
270
- Yes. GitGreen does not save your keys locally. During `gitgreen init`, all sensitive credentials (API keys, service account keys, AWS credentials) are stored only in GitLab CI/CD variables, which are encrypted and managed by GitLab. The CLI never writes credentials to disk on your local machine.
190
+ GitGreen's calculations follow the research-backed methodology developed by [re:cinq](https://re-cinq.com/blog/cloud-cpu-energy-consumption) and [Teads](https://github.com/re-cinq/emissions-data), adapted for CI/CD workloads.
271
191
 
272
- ### Does GitGreen call any third-party APIs?
192
+ ### Formula
273
193
 
274
- GitGreen only calls APIs that you explicitly configure and authorize:
194
+ ```
195
+ E_total = E_operational + E_embodied
275
196
 
276
- - **Electricity Maps API** - For carbon intensity data (requires your API key)
277
- - **GitLab API** - Only when using `glab` CLI or providing a PAT, to set CI/CD variables
278
- - **GCP Monitoring API** - To fetch CPU/RAM metrics from your GCP project (requires your service account)
279
- - **AWS CloudWatch API** - To fetch CPU/RAM metrics from your AWS account (requires your AWS credentials)
197
+ E_operational = (P_cpu + P_ram) × runtime_hours × PUE × carbon_intensity
198
+ E_embodied = scope3_hourly × runtime_hours
199
+ ```
280
200
 
281
- GitGreen does not operate any backend services or call any APIs owned by the GitGreen project. All API calls are made directly from your environment or CI/CD pipeline using your credentials.
201
+ | Variable | Description | Source |
202
+ |----------|-------------|--------|
203
+ | `P_cpu` | CPU power (kW) | Interpolated from utilization |
204
+ | `P_ram` | RAM power (0.5 W/GB) | Industry standard for DDR4 |
205
+ | `PUE` | Data center efficiency | Google/AWS published data |
206
+ | `carbon_intensity` | Grid emissions (gCO₂eq/kWh) | Electricity Maps API |
207
+ | `scope3_hourly` | Embodied carbon rate | Dell R740 LCA study |
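The operational half of the formula can be sketched in a few lines. This is an illustrative calculation with made-up inputs, not GitGreen's internal code (function and variable names are assumptions):

```javascript
// Sketch of E_operational from the formula above.
// Units: power in kW, runtime in hours, intensity in gCO2eq/kWh.
function operationalEmissions(pCpuKw, pRamKw, runtimeHours, pue, intensity) {
  return (pCpuKw + pRamKw) * runtimeHours * pue * intensity;
}

// e.g. 30 W of CPU + 2 W of RAM for a 10-minute job,
// PUE 1.1, grid intensity 400 gCO2eq/kWh:
const g = operationalEmissions(0.030, 0.002, 10 / 60, 1.1, 400);
console.log(g.toFixed(3)); // → 2.347 gCO2eq
```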
282
208
 
283
- ### How do I know GitGreen won't destroy my `.gitlab-ci.yml` file?
209
+ ### CPU Power Model
284
210
 
285
- GitGreen takes safety precautions when modifying your `.gitlab-ci.yml`:
211
+ CPU power is **non-linear** with utilization. We use cubic spline interpolation across measured data points:
286
212
 
287
- 1. **Automatic backups** - Before making any changes, GitGreen creates a backup of your existing `.gitlab-ci.yml` file
288
- 2. **Backup location** - The backup path is printed to the console (e.g., `/tmp/gitlab-ci-backup-{timestamp}.yml`)
289
- 3. **Append-only changes** - GitGreen only appends new content to your CI file; it never removes or modifies existing jobs
290
- 4. **Confirmation prompt** - You're asked to confirm before any changes are made
213
+ | Utilization | Power (% of TDP) |
214
+ |-------------|------------------|
215
+ | 0% (idle) | 1.7% |
216
+ | 10% | 3.4% |
217
+ | 50% | 16.9% |
218
+ | 100% | 100% |
291
219
 
292
- If something goes wrong, you can restore your file from the backup location that was printed.
220
+ **VM Power Correction:** Cloud VMs share physical CPUs. We scale power by the ratio of VM vCPUs to physical threads:
293
221
 
294
- ### Can I use GitGreen without the `init` wizard?
222
+ ```
223
+ P_vm = TDP × ratio × (vm_vcpus / physical_threads)
224
+ ```
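A worked example of this correction, using the Xeon Gold 6268CL row from the table below (the function name is illustrative, not GitGreen's API):

```javascript
// P_vm = TDP × ratio × (vm_vcpus / physical_threads)
function vmPowerWatts(tdpWatts, ratio, vmVcpus, physicalThreads) {
  return tdpWatts * ratio * (vmVcpus / physicalThreads);
}

// An 80-vCPU VM on a Xeon Gold 6268CL (48 threads, 205 W TDP)
// at 100% utilization (ratio = 1.0):
const watts = vmPowerWatts(205, 1.0, 80, 48);
console.log(watts.toFixed(0)); // → 342 (vs. a naive 205 W × 80 = 16,400 W)
```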
295
225
 
296
- Yes. You can configure GitGreen manually by:
226
+ | CPU | Cores | Threads | TDP | Source |
227
+ |-----|-------|---------|-----|--------|
228
+ | Intel Xeon Gold 6268CL | 24 | 48 | 205W | [eBay/Intel](https://www.ebay.com/p/2321792675) |
229
+ | Intel Xeon Platinum 8481C | 56 | 112 | 350W | [TechPowerUp](https://www.techpowerup.com/cpu-specs/xeon-platinum-8481c.c3992) |
230
+ | AMD EPYC 7B12 | 64 | 128 | 240W | [Newegg](https://www.newegg.com/p/1FR-00G6-00026) |
231
+ | Ampere Altra Q64-30 | 64 | 64 | 180W | [Ampere](https://amperecomputing.com/en/briefs/ampere-altra-family-product-brief) |
297
232
 
298
- 1. Setting the required CI/CD variables in your GitLab project settings
299
- 2. Adding the GitLab CI/CD component to your `.gitlab-ci.yml` manually
300
- 3. Running `gitgreen` directly with command-line options instead of using the component
233
+ ### Scope 3 (Embodied Carbon)
301
234
 
302
- The `init` wizard is a convenience tool, but all functionality is available through manual configuration.
235
+ Manufacturing emissions amortized over hardware lifespan (6 years):
303
236
 
304
- ### What data does GitGreen collect?
237
+ | Component | Emissions |
238
+ |-----------|-----------|
239
+ | Base server | ~1000 kgCO₂eq |
240
+ | Per CPU | ~100 kgCO₂eq |
241
+ | Per 32GB DIMM | ~44 kgCO₂eq |
242
+ | Per SSD | ~50-100 kgCO₂eq |
305
243
 
306
- GitGreen only collects:
244
+ *Source: Dell PowerEdge R740 Life Cycle Assessment*
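The amortization works out as follows (a minimal sketch; per-VM allocation is omitted):

```javascript
// Embodied kgCO2eq spread over a 6-year server lifespan,
// giving an hourly gCO2eq rate.
const LIFESPAN_HOURS = 6 * 365 * 24; // 52,560 h

function embodiedHourlyRate(kgCO2eq) {
  return (kgCO2eq * 1000) / LIFESPAN_HOURS; // gCO2eq per hour
}

// A ~1000 kgCO2eq base server amortizes to roughly 19 gCO2eq/hour:
console.log(embodiedHourlyRate(1000).toFixed(1)); // → 19.0
```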
307
245
 
308
- - **CPU and RAM usage metrics** from your cloud provider (GCP Monitoring or AWS CloudWatch)
309
- - **Pipeline metadata** (start/end times) from GitLab CI/CD environment variables
246
+ ### Accuracy & Limitations
310
247
 
311
- GitGreen does not collect any personal information, code, or data from your repositories. All calculations happen locally in your CI/CD pipeline.
248
+ **Expected Accuracy:**
249
+ - Relative comparisons: Excellent (comparing job A vs B)
250
+ - Absolute values: ±15-25% for CPU-bound workloads
312
251
 
313
- ### Can I run GitGreen locally for testing?
252
+ **Known Limitations:**
314
253
 
315
- Yes. You can run `gitgreen` directly with your own metrics files:
254
+ | Limitation | Impact | Notes |
255
+ |------------|--------|-------|
256
+ | No GPU modeling | High | GPU jobs underestimated |
257
+ | Fixed RAM power | Low | 0.5 W/GB industry standard |
258
+ | No network/storage I/O | Low | <5% of typical job power |
316
259
 
317
- **GCP example:**
318
- ```bash
319
- gitgreen --provider gcp \
320
- --machine e2-standard-4 \
321
- --region us-central1-a \
322
- --cpu-timeseries cpu.json \
323
- --ram-used-timeseries ram-used.json \
324
- --ram-size-timeseries ram-size.json \
325
- --out-md report.md
326
- ```
260
+ ## Architecture
327
261
 
328
- **AWS example:**
329
- ```bash
330
- gitgreen --provider aws \
331
- --machine t3.medium \
332
- --region us-east-1 \
333
- --cpu-timeseries cpu.json \
334
- --ram-used-timeseries ram-used.json \
335
- --ram-size-timeseries ram-size.json \
336
- --out-md report.md
337
262
  ```
338
-
339
- Or use `--from-cloudwatch` to fetch metrics directly from AWS CloudWatch:
340
- ```bash
341
- gitgreen --provider aws \
342
- --machine t3.medium \
343
- --region us-east-1 \
344
- --from-cloudwatch \
345
- --out-md report.md
263
+ ┌─────────────────────────────────────────────────────────────┐
264
+ │ GitLab Pipeline │
265
+ ├─────────────────────────────────────────────────────────────┤
266
+ │ ┌─────────────┐ ┌──────────────────────────────────┐ │
267
+ │ │ Your Jobs │───▶│ GitGreen Carbon Tracking Job │ │
268
+ └─────────────┘ │ ├─ Fetch CPU/RAM metrics │ │
269
+ │ │ ├─ Calculate emissions │ │
270
+ │ │ ├─ Post MR comment │ │
271
+ │ │ └─ Export to database │ │
272
+ │ └──────────────────────────────────┘ │
273
+ └─────────────────────────────────────────────────────────────┘
274
+
275
+ ┌────────────────────┼────────────────────┐
276
+ ▼ ▼ ▼
277
+ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
278
+ │ GCP Monitoring │ │ Electricity Maps│ │ MySQL/Postgres │
279
+ │ AWS CloudWatch │ │ API │ │ (optional) │
280
+ └─────────────────┘ └─────────────────┘ └─────────────────┘
346
281
  ```
347
282
 
348
- This is useful for testing before integrating into your CI/CD pipeline.
283
+ ## Configuration
349
284
 
350
- ## Carbon Calculation Methodology
285
+ ### Adding a New Provider
351
286
 
352
- GitGreen's carbon calculations are based on the methodology developed by [re:cinq](https://re-cinq.com/blog/cloud-cpu-energy-consumption) and [Teads](https://github.com/re-cinq/emissions-data), adapted for CI/CD workloads.
287
+ 1. Add machine power profiles to `data/<provider>_machine_power_profiles.json`
288
+ 2. Update `PowerProfileRepository.loadMachineData()`
289
+ 3. Map regions to Electricity Maps zones in `ZoneMapper`
290
+ 4. Parse metrics into `TimeseriesPoint` format
291
+ 5. Wire CI automation for provider-specific settings
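Step 4 above can be sketched like this — a hypothetical converter from a provider's raw metric rows into the `TimeseriesPoint` shape (timestamp + numeric value); the raw input field names are invented for illustration:

```javascript
// Map raw provider metric rows to { timestamp, value } points.
// `time` and `avg` are hypothetical field names of the raw API response.
function toTimeseriesPoints(rawRows) {
  return rawRows.map((row) => ({
    timestamp: new Date(row.time).toISOString(),
    value: Number(row.avg),
  }));
}

const points = toTimeseriesPoints([
  { time: "2024-01-01T00:00:00Z", avg: "42.5" },
]);
console.log(points[0].value); // 42.5
```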
353
292
 
354
- ### Formula
293
+ ### Source Data Files
355
294
 
356
- The total carbon emissions for a CI job are calculated as:
295
+ | File | Description |
296
+ |------|-------------|
297
+ | `data/cpu_physical_specs.json` | Physical CPU specs with sources |
298
+ | `data/cpu_power_profiles.json` | TDP and power ratios |
299
+ | `data/gcp_machine_power_profiles.json` | GCP machine mappings |
300
+ | `data/aws_machine_power_profiles.json` | AWS instance mappings |
301
+ | `data/source/*.csv` | Original re:cinq research data |
357
302
 
358
- ```
359
- E_total = E_operational + E_embodied
303
+ ## FAQ
360
304
 
361
- E_operational = (P_cpu + P_ram) × runtime_hours × PUE × carbon_intensity
362
- E_embodied = scope3_emissions_hourly × runtime_hours
363
- ```
305
+ <details>
306
+ <summary><strong>Are my credentials secure?</strong></summary>
364
307
 
365
- Where:
366
- - **P_cpu**: CPU power consumption in kW, interpolated from utilization
367
- - **P_ram**: RAM power consumption (0.5 W/GB × used GB)
368
- - **PUE**: Power Usage Effectiveness of the data center (typically 1.1-1.2 for hyperscalers)
369
- - **carbon_intensity**: Grid carbon intensity in gCO2eq/kWh from Electricity Maps API
370
- - **scope3_emissions_hourly**: Amortized embodied carbon from manufacturing
308
+ Yes. GitGreen stores all credentials in GitLab CI/CD variables (encrypted). Nothing is written to your local disk.
309
+ </details>
371
310
 
372
- ### CPU Power Interpolation
311
+ <details>
312
+ <summary><strong>What APIs does GitGreen call?</strong></summary>
373
313
 
374
- CPU power is not linear with utilization. We use **cubic spline interpolation** across 4 measured points:
314
+ Only APIs you explicitly configure:
315
+ - Electricity Maps API (carbon intensity)
316
+ - GCP Monitoring API (metrics)
317
+ - AWS CloudWatch API (metrics)
318
+ - GitLab API (MR comments, via CI_JOB_TOKEN)
375
319
 
376
- | Utilization | Power Ratio (of TDP) |
377
- |-------------|---------------------|
378
- | 0% (idle) | ~1.7% |
379
- | 10% | ~3.4% |
380
- | 50% | ~16.9% |
381
- | 100% | 100% |
320
+ GitGreen has no backend server.
321
+ </details>
382
322
 
383
- Power at any utilization is calculated as:
384
- ```
385
- P_cpu(util) = CubicSpline([0, 10, 50, 100], [W_idle, W_10, W_50, W_100])(util)
386
- ```
323
+ <details>
324
+ <summary><strong>Will it break my .gitlab-ci.yml?</strong></summary>
387
325
 
388
- The power values are derived from CPU TDP (Thermal Design Power) multiplied by empirically measured ratios from the [re:cinq emissions-data](https://github.com/re-cinq/emissions-data) project.
326
+ No. GitGreen creates a backup before any changes, only appends content (never modifies existing jobs), and asks for confirmation.
327
+ </details>
389
328
 
390
- ### Physical vCPU Correction
329
+ <details>
330
+ <summary><strong>Can I run it without the wizard?</strong></summary>
391
331
 
392
- Cloud VMs share physical CPUs. To accurately estimate power for a VM, we scale by the ratio of VM vCPUs to physical CPU threads:
332
+ Yes. Set CI/CD variables manually and run `gitgreen` with CLI options.
333
+ </details>
393
334
 
394
- ```
395
- P_vm = TDP × ratio × (vm_vcpus / physical_threads)
396
- ```
335
+ ## Contributing
397
336
 
398
- Physical thread counts are sourced from official specifications:
337
+ 1. Fork the repository
338
+ 2. Create a feature branch
339
+ 3. Run tests: `npm test`
340
+ 4. Submit a pull request
399
341
 
400
- | CPU | Cores | Threads | TDP | Source |
401
- |-----|-------|---------|-----|--------|
402
- | Intel Xeon Gold 6268CL | 24 | 48 | 205W | [eBay/Intel](https://www.ebay.com/p/2321792675) |
403
- | Intel Xeon Gold 6253CL | 18 | 36 | 205W | [PassMark](https://www.cpubenchmark.net/cpu.php?cpu=Intel+Xeon+Gold+6253CL) |
404
- | Intel Xeon Platinum 8481C | 56 | 112 | 350W | [TechPowerUp](https://www.techpowerup.com/cpu-specs/xeon-platinum-8481c.c3992) |
405
- | Intel Xeon Platinum 8373C | 36 | 72 | 300W | [Wikipedia](https://en.wikipedia.org/wiki/List_of_Intel_Xeon_processors_(Ice_Lake-based)) |
406
- | AMD EPYC 7B12 | 64 | 128 | 240W | [Newegg](https://www.newegg.com/p/1FR-00G6-00026) |
407
- | Ampere Altra Q64-30 | 64 | 64 | 180W | [Ampere](https://amperecomputing.com/en/briefs/ampere-altra-family-product-brief) |
408
-
409
- **Example:** For `n2-standard-80` (80 vCPUs, Xeon Gold 6268CL with 48 threads):
410
- - Old (inflated): 205W × 80 = 16,400W at 100%
411
- - Corrected: 205W × (80/48) = 342W at 100%
342
+ ## References
412
343
 
413
- ### Scope 3 (Embodied Carbon)
414
-
415
- Scope 3 emissions represent the carbon footprint of manufacturing, shipping, and disposing of hardware. Based on Dell PowerEdge R740 LCA data:
416
-
417
- | Component | Emissions (kgCO2eq) |
418
- |-----------|---------------------|
419
- | Base server (1 socket, low DRAM) | ~1000 |
420
- | Per additional CPU | ~100 |
421
- | Per 32GB DIMM | ~44 |
422
- | Per SSD | ~50-100 |
344
+ ### Research & Methodology
423
345
 
424
- These are amortized over a 6-year server lifespan (~0.019 gCO2eq/hour conversion factor) and allocated proportionally to VM resources.
346
+ | Source | Description |
347
+ |--------|-------------|
348
+ | [re:cinq Cloud CPU Energy Consumption](https://re-cinq.com/blog/cloud-cpu-energy-consumption) | CPU power modeling methodology |
349
+ | [re:cinq emissions-data](https://github.com/re-cinq/emissions-data) | Machine power profiles and ratios |
350
+ | [Teads Engineering](https://engineering.teads.com/) | Original research on cloud carbon |
351
+ | Dell PowerEdge R740 LCA | Scope 3 embodied carbon data |
425
352
 
426
353
  ### Data Sources
427
354
 
428
355
  | Data | Source |
429
356
  |------|--------|
430
- | CPU power ratios | [re:cinq emissions-data](https://github.com/re-cinq/emissions-data) |
431
- | GCP machine specs | [Google Cloud CPU Platforms](https://cloud.google.com/compute/docs/cpu-platforms) |
432
- | Carbon intensity | [Electricity Maps API](https://www.electricitymaps.com/) |
433
- | PUE values | [Google Data Center Efficiency](https://www.google.com/about/datacenters/efficiency/) |
434
- | Scope 3 LCA data | Dell PowerEdge R740 Life Cycle Assessment |
435
-
436
- ### Accuracy & Limitations
437
-
438
- #### What's grounded in real research
357
+ | Real-time carbon intensity | [Electricity Maps API](https://www.electricitymaps.com/) |
358
+ | GCP machine specifications | [Google Cloud CPU Platforms](https://cloud.google.com/compute/docs/cpu-platforms) |
359
+ | GCP data center PUE | [Google Data Center Efficiency](https://www.google.com/about/datacenters/efficiency/) |
360
+ | Intel CPU specifications | [Intel ARK](https://ark.intel.com/) |
361
+ | AMD CPU specifications | [AMD Product Pages](https://www.amd.com/en/products/specifications/processors) |
362
+ | Ampere CPU specifications | [Ampere Product Briefs](https://amperecomputing.com/briefs) |
439
363
 
440
- | Component | Methodology | Source |
441
- |-----------|-------------|--------|
442
- | CPU power ratios | Measured at 0%, 10%, 50%, 100% utilization | [re:cinq/Teads](https://github.com/re-cinq/emissions-data) |
443
- | Physical CPU specs | Official vendor specifications | Intel ARK, TechPowerUp, AMD, Ampere |
444
- | Carbon intensity | Real-time grid data | [Electricity Maps API](https://www.electricitymaps.com/) |
445
- | PUE values | Published data center efficiency | [Google](https://www.google.com/about/datacenters/efficiency/) |
446
- | Scope 3 methodology | Life cycle assessment | Dell PowerEdge R740 LCA |
364
+ ### CPU Specifications Used
447
365
 
448
- #### Expected accuracy
449
-
450
- - **Relative accuracy**: Excellent for comparing jobs (A vs B) and tracking trends
451
- - **Absolute accuracy**: ±15-25% for CPU-bound workloads
366
+ | CPU | Cores | Threads | TDP | Source |
367
+ |-----|-------|---------|-----|--------|
368
+ | Intel Xeon Gold 6268CL | 24 | 48 | 205W | [Product Listing](https://www.ebay.com/p/2321792675) |
369
+ | Intel Xeon Gold 6253CL | 18 | 36 | 205W | [PassMark](https://www.cpubenchmark.net/cpu.php?cpu=Intel+Xeon+Gold+6253CL) |
370
+ | Intel Xeon Platinum 8481C | 56 | 112 | 350W | [TechPowerUp](https://www.techpowerup.com/cpu-specs/xeon-platinum-8481c.c3992) |
371
+ | Intel Xeon Platinum 8373C | 36 | 72 | 300W | [Wikipedia](https://en.wikipedia.org/wiki/List_of_Intel_Xeon_processors_(Ice_Lake-based)) |
372
+ | AMD EPYC 7B12 | 64 | 128 | 240W | [Newegg](https://www.newegg.com/p/1FR-00G6-00026) |
373
+ | Ampere Altra Q64-30 | 64 | 64 | 180W | [Ampere Brief](https://amperecomputing.com/en/briefs/ampere-altra-family-product-brief) |
452
374
 
453
- #### Known limitations
375
+ ## License
454
376
 
455
- | Limitation | Impact | Notes |
456
- |------------|--------|-------|
457
- | RAM uses fixed 0.5 W/GB | Low | Industry standard estimate for DDR4 |
458
- | Scope 3 only for some AWS types | Medium | GCP scope 3 data not yet available |
459
- | No GPU modeling | High (if using GPUs) | GPU-heavy jobs will be underestimated |
460
- | No network I/O modeling | Low | Typically <5% of job power |
461
- | No storage I/O modeling | Low | Typically <5% of job power |
462
- | Multi-tenant overhead | Low | Actual power may be 5-10% lower due to shared resources |
463
- | CPU specs may be incomplete | Medium | Falls back to unscaled profile if CPU not in database |
464
-
465
- #### What this means for you
466
-
467
- - **CI/CD optimization**: The relative comparisons are reliable - if job A shows 2x the emissions of job B, that's meaningful
468
- - **Absolute reporting**: Use the numbers for directional guidance, not precise carbon accounting
469
- - **Trend tracking**: Week-over-week and month-over-month trends are accurate
470
- - **GPU workloads**: Currently underestimated - GPU power not modeled
471
-
472
- ### Source Data
473
-
474
- Raw source data files are available in `data/`:
475
- - `cpu_physical_specs.json` - Physical CPU specs with thread counts and sources (our research)
476
- - `cpu_power_profiles.json` - TDP and power ratios per CPU type
477
- - `gcp_machine_power_profiles.json` - GCP machine type to power mappings
478
- - `aws_machine_power_profiles.json` - AWS instance type to power mappings
479
-
480
- Original re:cinq data in `data/source/`:
481
- - `GCP Machine types - CPU Profiles.csv` - TDP and power ratios per CPU
482
- - `GCP Machine types - Instances.csv` - Machine type to CPU mappings
483
- - `GCP Machine types - Scope 3 Ratios.csv` - Embodied carbon factors
484
- - `GCP Machine types - Dell R740 LCA.csv` - Life cycle assessment reference
377
+ MIT License - see [LICENSE](LICENSE) for details.
485
378
 
486
- ## License
379
+ ---
487
380
 
488
- MIT License - see [LICENSE](LICENSE) file for details.
381
+ <p align="center">
382
+ <sub>Built with care for a sustainable software future.</sub>
383
+ </p>
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "gitgreen",
3
- "version": "1.3.0",
3
+ "version": "1.4.0",
4
4
  "description": "GitGreen CLI for carbon reporting in GitLab pipelines (GCP/AWS)",
5
5
  "main": "dist/index.js",
6
6
  "types": "dist/index.d.ts",
@@ -1,268 +0,0 @@
1
- #!/usr/bin/env python3
2
- """
3
- Build complete GCP machine power profiles by correlating data across multiple CSV files.
4
- The power data is spread across different files and needs to be combined.
5
- """
6
- import pandas as pd
7
- import json
8
- import numpy as np
9
-
10
- def find_actual_power_data():
11
- """
12
- Search through all CSV files to find where the actual power consumption data is
13
- """
14
- print("=== SEARCHING FOR ACTUAL POWER DATA ===")
15
-
16
- # 1. Check Machine Ratios file
17
- print("\n--- Machine Ratios File ---")
18
- ratios_df = pd.read_csv("data/GCP Machine types - Machine Ratios.csv")
19
-
20
- # Look for non-NaN power data
21
- power_cols = ['CPU Name', 'PkgWatt Idle', 'PkgWatt CPUStress 10%', 'PkgWatt CPUStress 50%', 'PkgWatt Average 100%']
22
- power_data = ratios_df[power_cols].dropna()
23
-
24
- print(f"Rows with power data: {len(power_data)}")
25
- if len(power_data) > 0:
26
- print("Sample power data from Machine Ratios:")
27
- print(power_data.head())
28
-
29
- # 2. Check CPU Profiles file
30
- print("\n--- CPU Profiles File ---")
31
- cpu_df = pd.read_csv("data/GCP Machine types - CPU Profiles.csv")
32
-
33
- # Look for ratio data that can be used to calculate power
34
- ratio_cols = ['Processor SKU', 'TDP (W)', 'IDLE ratio', '10% ratio', '50% Ratio']
35
- cpu_power = cpu_df[ratio_cols].dropna()
36
-
37
- print(f"Rows with CPU ratio data: {len(cpu_power)}")
38
- if len(cpu_power) > 0:
39
- print("Sample CPU ratio data:")
40
- print(cpu_power.head())
41
-
42
- # 3. Check Bare Metal Profiles (this had actual power data in the first analysis)
43
- print("\n--- Bare Metal Power Profiles ---")
44
- bare_metal_df = pd.read_csv("data/GCP Machine types - Bare Metal Power Profiles.csv")
45
-
46
- # Find rows with actual power measurements
47
- bare_metal_power_cols = ['Product Name', 'PkgWatt Idle', 'PkgWatt CPUStress 10%', 'PkgWatt CPUStress 50%', 'PkgWatt CPUStress 100%']
48
- bare_metal_power = bare_metal_df[bare_metal_power_cols].dropna()
49
-
50
- print(f"Rows with bare metal power data: {len(bare_metal_power)}")
51
- if len(bare_metal_power) > 0:
52
- print("Sample bare metal power data:")
53
- print(bare_metal_power.head())
54
-
55
- return power_data, cpu_power, bare_metal_power
56
-
57
- def correlate_cpu_to_machines():
58
- """
59
- Try to correlate CPU types to GCP machine instances
60
- """
61
- print("\n=== CORRELATING CPU TYPES TO GCP MACHINES ===")
62
-
63
- # Load instances file
64
- instances_df = pd.read_csv("data/GCP Machine types - Instances.csv")
65
-
66
- # Look at CPU information in instances
67
- cpu_info_cols = ['Instance type', 'Platform CPU Name', 'Instance vCPU', 'Instance Memory (in GB)']
68
- cpu_info = instances_df[cpu_info_cols].dropna(subset=['Platform CPU Name'])
69
-
70
- print(f"Machines with CPU information: {len(cpu_info)}")
71
- if len(cpu_info) > 0:
72
- print("Sample CPU information:")
73
- print(cpu_info.head(10))
74
-
75
- # Get unique CPU types used in GCP
76
- unique_cpus = cpu_info['Platform CPU Name'].unique()
77
- print(f"\nUnique CPU types in GCP: {len(unique_cpus)}")
78
- for cpu in unique_cpus[:10]: # Show first 10
79
- print(f" - {cpu}")
80
-
81
- return cpu_info
82
-
83
- def build_power_profiles_from_ratios():
84
- """
85
- Build power profiles using CPU ratios and TDP values
86
- """
87
- print("\n=== BUILDING POWER PROFILES FROM CPU RATIOS ===")
88
-
89
- # Load CPU profiles with ratios
90
- cpu_df = pd.read_csv("data/GCP Machine types - CPU Profiles.csv")
91
-
92
- # Clean and process the data
93
- ratio_data = cpu_df[['Processor SKU', 'TDP (W)', 'IDLE ratio', '10% ratio', '50% Ratio']].dropna()
94
-
95
- if len(ratio_data) > 0:
96
- print(f"CPUs with ratio data: {len(ratio_data)}")
97
-
98
- # Calculate actual power consumption from ratios
99
- power_profiles = {}
100
-
101
- for _, row in ratio_data.iterrows():
102
- cpu_name = row['Processor SKU']
103
- tdp_watts = row['TDP (W)']
104
- idle_ratio = row['IDLE ratio']
105
- ratio_10 = row['10% ratio']
106
- ratio_50 = row['50% Ratio']
107
-
108
- # Calculate power at different utilization levels
109
- # Using TDP and ratios to estimate power consumption
110
- power_profiles[cpu_name] = {
111
- 'tdp_watts': tdp_watts,
112
- 'power_profile': [
113
- {'percentage': 0, 'watts': tdp_watts * idle_ratio},
114
- {'percentage': 10, 'watts': tdp_watts * ratio_10},
115
- {'percentage': 50, 'watts': tdp_watts * ratio_50},
116
- {'percentage': 100, 'watts': tdp_watts} # Assume 100% = TDP
117
- ]
118
- }
119
-
120
- print("\nSample calculated power profiles:")
121
- for i, (cpu_name, profile) in enumerate(power_profiles.items()):
122
- if i < 3: # Show first 3
123
- print(f"\n{cpu_name}:")
124
- print(f" TDP: {profile['tdp_watts']}W")
125
- for point in profile['power_profile']:
126
- print(f" {point['percentage']}%: {point['watts']:.1f}W")
127
-
128
- # Save CPU power profiles
129
- with open('cpu_power_profiles.json', 'w') as f:
130
- json.dump(power_profiles, f, indent=2)
131
-
132
- print(f"\n✅ Saved {len(power_profiles)} CPU power profiles to cpu_power_profiles.json")
133
-
134
- return power_profiles
135
- else:
136
- print("❌ No usable ratio data found")
137
- return {}
138
-
139
- def map_gcp_machines_to_power():
140
- """
141
- Map GCP machine types to their CPU power profiles
142
- """
143
- print("\n=== MAPPING GCP MACHINES TO POWER PROFILES ===")
144
-
145
- # Load instances and find CPU mappings
146
- instances_df = pd.read_csv("data/GCP Machine types - Instances.csv")
147
- cpu_info = instances_df[['Instance type', 'Platform CPU Name', 'Instance vCPU', 'Instance Memory (in GB)']].dropna(subset=['Platform CPU Name'])
148
-
149
- # Load CPU power profiles
150
- try:
151
- with open('cpu_power_profiles.json', 'r') as f:
152
- cpu_power_profiles = json.load(f)
153
- except (FileNotFoundError, json.JSONDecodeError):
154
- print("❌ CPU power profiles not found. Run build_power_profiles_from_ratios() first.")
155
- return {}
156
-
157
- # Map GCP machines to their power profiles
158
- gcp_machine_profiles = {}
159
-
160
- for _, row in cpu_info.iterrows():
161
- machine_type = row['Instance type']
162
- cpu_name = row['Platform CPU Name']
163
- vcpus = row['Instance vCPU']
164
- memory_gb = row['Instance Memory (in GB)']
165
-
166
- # Find matching CPU power profile (exact match or partial match)
167
- matching_cpu = None
168
- for cpu_profile_name in cpu_power_profiles.keys():
169
- if cpu_name in cpu_profile_name or cpu_profile_name in cpu_name:
170
- matching_cpu = cpu_profile_name
171
- break
172
-
173
- if matching_cpu:
174
- # Scale power consumption based on vCPUs (since profiles are per-CPU)
175
- base_profile = cpu_power_profiles[matching_cpu]
176
-
177
- gcp_machine_profiles[machine_type] = {
178
- 'vcpus': vcpus,
179
- 'memory_gb': memory_gb,
180
- 'platform_cpu': cpu_name,
181
- 'matched_cpu_profile': matching_cpu,
182
- 'cpu_power_profile': [
183
- {
184
- 'percentage': point['percentage'],
185
- 'watts': point['watts'] * (vcpus / base_profile.get('vcpus', 1)) # Scale by vCPU count
186
- }
187
- for point in base_profile['power_profile']
188
- ]
189
- }
190
-
191
- if gcp_machine_profiles:
192
- print(f"✅ Mapped {len(gcp_machine_profiles)} GCP machines to power profiles")
193
-
194
- # Show sample mappings
195
- print("\nSample GCP machine power profiles:")
196
- for i, (machine, profile) in enumerate(gcp_machine_profiles.items()):
197
- if i < 3:
198
- print(f"\n{machine} ({profile['vcpus']} vCPUs, {profile['memory_gb']}GB):")
199
- print(f" CPU: {profile['platform_cpu']}")
200
- print(f" Matched profile: {profile['matched_cpu_profile']}")
201
- for point in profile['cpu_power_profile']:
202
- print(f" {point['percentage']}%: {point['watts']:.1f}W")
203
-
204
- # Save complete GCP machine power profiles
205
- with open('gcp_machine_power_profiles.json', 'w') as f:
206
- json.dump(gcp_machine_profiles, f, indent=2)
207
-
208
- print(f"\n✅ Saved complete GCP machine power profiles to gcp_machine_power_profiles.json")
209
-
210
- return gcp_machine_profiles
211
- else:
212
- print("❌ Could not map any GCP machines to power profiles")
213
- return {}
214
-
215
- def final_recommendations():
216
- """
217
- Provide final implementation recommendations based on available data
218
- """
219
- print("\n" + "="*60)
220
- print("FINAL IMPLEMENTATION RECOMMENDATIONS")
221
- print("="*60)
222
-
223
- try:
224
- with open('gcp_machine_power_profiles.json', 'r') as f:
225
- profiles = json.load(f)
226
-
227
- print("✅ SUCCESS: Ready for re:cinq implementation!")
228
- print(f" - {len(profiles)} GCP machine types with power profiles")
229
- print(" - Power consumption curves: 0%, 10%, 50%, 100% CPU utilization")
230
- print(" - Data ready for cubic spline interpolation")
231
-
232
- print("\n📋 IMPLEMENTATION STEPS:")
233
- print("1. Load gcp_machine_power_profiles.json in CarbonService")
234
- print("2. Extract machine type from GitLab runner tags")
235
- print("3. Use cubic-spline package for power interpolation")
236
- print("4. Get real-time carbon intensity from Electricity Maps API")
237
- print("5. Apply Google data center PUE values")
238
- print("6. Calculate: interpolated_power(kW) × runtime(h) × PUE × carbon_intensity")
239
-
240
- print(f"\n🔧 SAMPLE CALCULATION CODE:")
241
- sample_machine = list(profiles.keys())[0]
242
- sample_profile = profiles[sample_machine]
243
- print(f"// Example for {sample_machine}")
244
- print("const powerProfile = [")
245
- for point in sample_profile['cpu_power_profile']:
246
- print(f" {{ percentage: {point['percentage']}, watts: {point['watts']:.1f} }},")
247
- print("];")
248
- print("const powerWatts = cubicSplineInterpolation(powerProfile, cpuUtilization);")
249
-
250
- except FileNotFoundError:
251
- print("❌ No machine power profiles generated")
252
- print(" Run the correlation functions first")
253
-
254
- if __name__ == "__main__":
255
- # Step 1: Find where actual power data exists
256
- machine_ratios_power, cpu_ratios, bare_metal_power = find_actual_power_data()
257
-
258
- # Step 2: Correlate CPU information
259
- cpu_info = correlate_cpu_to_machines()
260
-
261
- # Step 3: Build power profiles from available data
262
- cpu_profiles = build_power_profiles_from_ratios()
263
-
264
- # Step 4: Map GCP machines to power profiles
265
- gcp_profiles = map_gcp_machines_to_power()
266
-
267
- # Step 5: Final recommendations
268
- final_recommendations()