carbongate 0.2.0__tar.gz

@@ -0,0 +1,8 @@
1
+ .venv/
2
+ dist/
3
+ *.egg-info/
4
+ .pytest_cache/
5
+ __pycache__/
6
+ *.pyc
7
+ .DS_Store
8
+ *.parquet.bak
@@ -0,0 +1,530 @@
1
+ Metadata-Version: 2.4
2
+ Name: carbongate
3
+ Version: 0.2.0
4
+ Summary: Shift-left carbon analysis for Terraform — suggests lower-carbon cloud regions in pull requests.
5
+ Project-URL: Homepage, https://github.com/carbongate/carbongate
6
+ Project-URL: Repository, https://github.com/carbongate/carbongate
7
+ Project-URL: Documentation, https://github.com/carbongate/carbongate#readme
8
+ Project-URL: Issues, https://github.com/carbongate/carbongate/issues
9
+ Author: CarbonGate
10
+ License: MIT
11
+ Keywords: aws,carbon,devops,shift-left,sustainability,terraform
12
+ Classifier: Development Status :: 4 - Beta
13
+ Classifier: Environment :: Console
14
+ Classifier: Intended Audience :: Developers
15
+ Classifier: License :: OSI Approved :: MIT License
16
+ Classifier: Operating System :: OS Independent
17
+ Classifier: Programming Language :: Python :: 3
18
+ Classifier: Programming Language :: Python :: 3.11
19
+ Classifier: Programming Language :: Python :: 3.12
20
+ Classifier: Programming Language :: Python :: 3.13
21
+ Classifier: Topic :: Software Development :: Quality Assurance
22
+ Classifier: Topic :: System :: Monitoring
23
+ Requires-Python: >=3.11
24
+ Requires-Dist: duckdb>=1.0
25
+ Requires-Dist: polars>=1.0
26
+ Requires-Dist: python-hcl2>=5.0
27
+ Requires-Dist: requests>=2.28
28
+ Requires-Dist: typer>=0.12
29
+ Provides-Extra: dev
30
+ Requires-Dist: build>=1.0; extra == 'dev'
31
+ Requires-Dist: pytest>=8.0; extra == 'dev'
32
+ Requires-Dist: twine>=5.0; extra == 'dev'
33
+ Provides-Extra: mcp
34
+ Requires-Dist: mcp>=1.0; extra == 'mcp'
35
+ Description-Content-Type: text/markdown
36
+
37
+ # 🌱 CarbonGate
38
+
39
+ > **Shift-left carbon analysis for Terraform. Suggests lower-carbon cloud regions right in your Pull Requests.**
40
+
41
+ [🇫🇷 Lire en français](#carbongate--analyse-carbone-shift-left-pour-terraform)
42
+
43
+ ---
44
+
45
+ CarbonGate is a **deterministic, zero-token, zero-latency** CLI tool that analyzes Terraform infrastructure diffs and recommends AWS regions with lower carbon intensity — before you merge.
46
+
47
+ ### The Sweet Spot
48
+
49
+ You give away the software (free, open-source, runs on the customer's own CI runner). You sell the data (a fresh Parquet feed of carbon intensity built from emissions.dev; annual averages, not hourly readings). No dashboards. No LLM hallucinations. Just a YAML file, a ruthless parser, a DuckDB engine faster than electricity, and a gate (the data) that you control.
50
+
51
+ ### For buyers & sales calls
52
+
53
+ One-page pitch (FR), demo script, pricing summary, objections: **[docs/SALES_ONE_PAGER.md](docs/SALES_ONE_PAGER.md)** · product sheet: **[docs/FICHE_PRODUIT.md](docs/FICHE_PRODUIT.md)**.
54
+
55
+ ### For executives (FR)
56
+
57
+ **[IMPACT© methodology](docs/IMPACT_METHODOLOGY.md)** — audit framework, ROI ranges, CSRD scope, 3× fee-back guarantee.
58
+ **[IMPACT Pre-Scan](docs/IMPACT_METHODOLOGY.md#0-impact-pre-scan-qualification--avant-audit-payant)** — qualification on 1 repo + 30-minute call → 1-page deliverable (free, subject to conditions).
59
+ **[Case study #0](docs/CASE_STUDY_VPC_MODULE.md)** — public `terraform-aws-vpc` demo (~86 % modeled carbon savings eu-west-1 → eu-north-1; not a paying client).
60
+ **[Trigger events playbook](gtm/docs/TRIGGER_EVENTS.md)** — CSRD / FinOps signals, CFO outreach, CHASE prioritization.
61
+ Executive audit template: **[gtm/templates/audit_report_exec.md](gtm/templates/audit_report_exec.md)** · KPI Shield: **[CFO](gtm/templates/shield_kpi_cfo.md)** / **[CTO](gtm/templates/shield_kpi_cto.md)**.
62
+
63
+ ---
64
+
65
+ ## How It Works
66
+
67
+ ```
68
+ Developer pushes Terraform on GitHub
69
+
70
+
71
+ GitHub Action triggers (ubuntu-latest)
72
+
73
+ ├── 1. Parse diff → extract AWS regions (regex, <0.1s)
74
+ ├── 2. Query DuckDB over Parquet → find lowest-carbon alternative in same zone (<0.01s)
75
+ ├── 3. If savings ≥ 30%, format Markdown PR comment
76
+ └── 4. Post comment via GitHub API
77
+ ```
78
+
79
+ No carbon-data API call per PR run. The intelligence is baked into the Parquet file. Your compute cost: $0.00.
80
+
81
+ ## Quick Start
82
+
83
+ ```bash
84
+ pip install carbongate
85
+ carbongate --diff-file path/to/pr.diff
86
+ carbongate --hcl-file main.tf --tfvars-file terraform.tfvars # optional
87
+ carbongate --plan-json plan.json # terraform plan -json
88
+ carbongate --plan-json plan.json --json # machine-readable output
89
+ ```
90
+
91
+ Install with MCP support: `pip install "carbongate[mcp]"` (the quotes keep shells such as zsh from expanding the brackets)
92
+
93
+ ### Agent-native usage (MCP)
94
+
95
+ CarbonGate exposes an MCP server for AI agents (Claude, Cursor, etc.):
96
+
97
+ ```bash
98
+ carbongate serve
99
+ ```
100
+
101
+ Configure your MCP client:
102
+
103
+ ```json
104
+ {
105
+ "mcpServers": {
106
+ "carbongate": {
107
+ "command": "carbongate",
108
+ "args": ["serve"]
109
+ }
110
+ }
111
+ }
112
+ ```
113
+
114
+ Tools exposed:
115
+
116
+ | Tool | Input | Description |
117
+ |------|-------|-------------|
118
+ | **analyze_plan** | `plan_json_path` | Analyze a `terraform plan -json` file for carbon savings |
119
+ | **analyze_diff** | `diff_path` + `tfvars_path?` + `threshold_pct?` | Analyze a git diff file for carbon savings |
120
+ | **suggest_region** | `region` | Get the best lower-carbon alternative for an AWS region |
121
+
122
+ JSON output is versioned (`schema_version: 1`) for agent workflow stability.
123
+ [JSON Schema](docs/schema/carbongate-v1.json) available for automated validation.
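An agent consuming the versioned envelope might guard on `schema_version` before reading anything else. The following is a minimal sketch, not part of CarbonGate itself: `parse_report` and `SUPPORTED_SCHEMA_VERSION` are hypothetical names, and only the `schema_version`, `suggestions`, and `current_region` keys are taken from this README.

```python
import json

# Hypothetical constant: the envelope version this consumer understands.
SUPPORTED_SCHEMA_VERSION = 1


def parse_report(raw: str) -> list[dict]:
    """Return the suggestions list, refusing envelopes of an unknown version."""
    report = json.loads(raw)
    version = report.get("schema_version")
    if version != SUPPORTED_SCHEMA_VERSION:
        raise ValueError(f"unsupported schema_version: {version!r}")
    # An absent "suggestions" key is treated as "nothing to report".
    return report.get("suggestions", [])


if __name__ == "__main__":
    sample = '{"schema_version": 1, "suggestions": [{"current_region": "us-east-1"}]}'
    for suggestion in parse_report(sample):
        print(suggestion["current_region"])
```

Failing fast on an unexpected version keeps an agent workflow from silently misreading a future envelope layout.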
124
+
125
+ ### MCP example: analyze a diff
126
+
127
+ ```python
128
+ # Agent calls via MCP — no shell required
129
+ analyze_diff(
130
+ diff_path="/path/to/terraform.diff",
131
+ tfvars_path="/path/to/terraform.tfvars", # optional: resolves var.region
132
+ threshold_pct=20.0 # optional: lower threshold
133
+ )
134
+ # → {"schema_version": 1, "suggestions": [{"current_region": "us-east-1", ...}]}
135
+ ```
136
+
137
+ From source (dev):
138
+
139
+ ```bash
140
+ git clone https://github.com/carbongate/carbongate.git && cd carbongate
141
+ pip install -e ".[dev]"
142
+ carbongate --diff-file tests/fixtures/sample.diff
143
+ ```
144
+
145
+ Regenerate carbon data (optional): `EMISSIONS_DEV_API_KEY=em_live_xxxx python data_packer.py`
146
+
147
+ ### Output Example
148
+
149
+ ```markdown
150
+ ## 🌱 CarbonGate Shift-Left Analysis
151
+
152
+ ⚠️ **Optimization detected** for your infrastructure deployment:
153
+
154
+ - **Current Region:** `us-east-1` (Carbon: 322.0 gCO₂eq/kWh)
155
+ - **Suggested Region:** `us-west-2` (Carbon: 122.5 gCO₂eq/kWh)
156
+
157
+ 💰 **Estimated Carbon Savings:** `62.0%`
158
+ 📉 **Spot Price Multiplier:** `0.8x` (Lower is cheaper)
159
+
160
+ > 💡 **Action:** Consider updating your Terraform config to use the suggested
161
+ > region(s) and reduce both carbon footprint and cloud costs.
162
+ ```
163
+
164
+ ## Architecture
165
+
166
+ ```
167
+ carbongate/
168
+ ├── data_packer.py # Polars → Fetches real CI from emissions.dev, writes Parquet
169
+ ├── query_engine.py # DuckDB in-process: get_best_alternative(region) → dict
170
+ ├── tf_parser.py # Deterministic regex + plan JSON: extract AWS regions
171
+ ├── judge.py # Decision engine: ≥30% savings threshold, Markdown + JSON
172
+ ├── carbongate_cli.py # Typer CLI entry point (carbongate analyze / serve)
173
+ ├── mcp_server.py # MCP server for AI agent integration (3 tools)
174
+ ├── entrypoint.sh # GitHub Action entrypoint (fetches diff via API)
175
+ ├── Dockerfile # python:3.12-slim, ready for GitHub Actions
176
+ ├── action.yml # GitHub Action definition
177
+ ├── pyproject.toml # Package metadata (pip install .)
178
+ └── data/
179
+ ├── eu/eu_carbon.parquet # EU region carbon data
180
+ └── us/us_carbon.parquet # US region carbon data
181
+ ```
182
+
183
+ ### Module Dependency Graph
184
+
185
+ ```
186
+ ┌─────────────┐
187
+ │ carbongate_cli │ Typer CLI, GitHub API integration
188
+ └──────┬──────┘
189
+
190
+ ┌──────▼──────┐
191
+ diff.txt ────────►│ judge.py │ Decision engine (≥30% threshold)
192
+ └──┬──────┬───┘
193
+ │ │
194
+ ┌────────▼──┐ ┌─▼───────────┐
195
+ │ tf_parser │ │query_engine │ DuckDB over Parquet
196
+ └───────────┘ └──────┬───────┘
197
+
198
+ ┌────────▼──────────┐
199
+ │ data_packer.py │ Polars → Parquet
200
+ │ emissions.dev │ Real CI data
201
+ └───────────────────┘
202
+ ```
203
+
204
+ ### Data Flow
205
+
206
+ ```
207
+ emissions.dev API (real CI data)
208
+
209
+
210
+ data_packer.py ─── Polars DataFrame ───► data/{eu,us}/*.parquet
211
+
212
+
213
+ query_engine.py ─── DuckDB read_parquet() ───► get_best_alternative()
214
+
215
+
216
+ tf_parser.py ─── extract regions from diff ──► judge.py ──► Markdown PR comment
217
+ ```
218
+
219
+
220
+ ## Region coverage
221
+
222
+ | Detected in PR diff | Notes |
223
+ |---------------------|-------|
224
+ | Literal `region = "eu-west-1"` | provider, resources, locals |
225
+ | `availability_zone` | Mapped to region when possible |
226
+ | `region = var.region` | Resolved if `terraform.tfvars` / `--tfvars-file` sets `region` or `aws_region` |
227
+ | Auto tfvars | `terraform.tfvars`, `*.auto.tfvars`, `*.tfvars` next to input file |
228
+
229
+ **Not covered (yet):** unresolved `var.region` without tfvars, GCP/Azure, hourly carbon scheduling.
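The detection rules in the table can be approximated with two regular expressions applied to added lines only. This is a sketch under stated assumptions, not the actual `tf_parser.py`: the patterns and the `extract_regions` helper are hypothetical, and tfvars resolution is left out, so `var.region` stays unresolved here.

```python
import re

# Hypothetical patterns: a literal region assignment, and an availability
# zone whose trailing letter is dropped to recover the region.
REGION_RE = re.compile(r'\bregion\s*=\s*"([a-z]{2}-[a-z]+-\d)"')
AZ_RE = re.compile(r'\bavailability_zone\s*=\s*"([a-z]{2}-[a-z]+-\d)[a-z]"')


def extract_regions(diff_text: str) -> list[str]:
    """Collect regions from added diff lines, in first-seen order."""
    regions: list[str] = []
    for line in diff_text.splitlines():
        # Skip removed/context lines and the "+++ b/file" header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for pattern in (REGION_RE, AZ_RE):
            for match in pattern.finditer(line):
                if match.group(1) not in regions:
                    regions.append(match.group(1))
    return regions


if __name__ == "__main__":
    sample = "\n".join([
        "--- a/main.tf",
        "+++ b/main.tf",
        '-  region = "us-west-2"',
        '+  region = "eu-west-1"',
        '+  availability_zone = "us-east-1a"',
        "+  region = var.region",
    ])
    print(extract_regions(sample))
```

Restricting the scan to `+` lines is what makes a diff containing only removed lines produce no regions, consistent with the silent-output cases described in this README.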
230
+
231
+ See [Infracost integration](docs/infracost-integration.md), [PyPI publish checklist](docs/PYPI_PUBLISH.md), and [sales one-pager](docs/SALES_ONE_PAGER.md) (demo commands, pricing, objections).
232
+
233
+
234
+ ## Supported Regions
235
+
236
+ | AWS Region | Country | Data Source | Best EU/US pair (feed) | Typical savings |
237
+ |---------------|---------------|--------------------|-------------------------|-----------------|
238
+ | `eu-west-1` | Ireland | emissions.dev (IE) | → `eu-north-1` | ~86% |
239
+ | `eu-west-2` | United Kingdom| emissions.dev (GB) | → `eu-north-1` | ~84% |
240
+ | `eu-west-3` | France | emissions.dev (FR) | → `eu-north-1` | ~17% (below threshold) |
241
+ | `eu-central-1`| Germany | emissions.dev (DE) | → `eu-north-1` | ~90% |
242
+ | `eu-north-1` | Sweden | emissions.dev (SE) | already optimal | — |
243
+ | `us-east-1` | Virginia | eGRID 2023 (×0.92) | → `us-west-2` | ~62% |
244
+ | `us-east-2` | Ohio | eGRID 2023 (×1.25) | → `us-west-2` | ~72% |
245
+ | `us-west-1` | California | eGRID 2023 (×0.55) | → `us-west-2` | ~36% |
246
+ | `us-west-2` | Oregon | eGRID 2023 (×0.35) | already optimal | — |
247
+
248
+ Savings are vs the lowest-carbon region in the same zone (EU or US), using the bundled Parquet feed. See `query_engine.get_best_alternative()`.
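The "lowest-carbon region in the same zone" rule can be illustrated without DuckDB. The sketch below is an assumption-laden stand-in for `query_engine.get_best_alternative()`: it uses an in-memory dict instead of the Parquet feed, contains only the four intensity values quoted in this README, and the `suggested_region` and `savings_pct` keys are hypothetical.

```python
# Figures quoted in this README (gCO2eq/kWh); the real feed lives in Parquet.
CARBON_INTENSITY = {
    "eu-west-3": 42.0,
    "eu-north-1": 35.0,
    "us-east-1": 322.0,
    "us-west-2": 122.5,
}


def zone_of(region: str) -> str:
    """Zone is taken as the region prefix: 'eu' or 'us'."""
    return region.split("-", 1)[0]


def best_alternative(region: str, table: dict[str, float] = CARBON_INTENSITY):
    """Return the lowest-carbon peer in the same zone, or None if already best."""
    current = table.get(region)
    if current is None:
        return None  # unknown region: nothing to suggest
    peers = {r: ci for r, ci in table.items() if zone_of(r) == zone_of(region)}
    best_region = min(peers, key=peers.get)
    if best_region == region:
        return None  # already optimal in its zone
    best = peers[best_region]
    return {
        "current_region": region,
        "suggested_region": best_region,
        "savings_pct": round((current - best) / current * 100, 1),
    }


if __name__ == "__main__":
    print(best_alternative("us-east-1"))
    print(best_alternative("us-west-2"))
```

Keeping the comparison inside one zone means an EU workload is never pointed at a US region, which matches the pairs listed in the table.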
249
+
250
+ ### 30% savings threshold
251
+
252
+ CarbonGate only posts a PR comment when estimated carbon savings are **≥ 30%** (`SAVINGS_THRESHOLD_PCT` in `judge.py`, overridable with `--threshold`). This avoids noisy suggestions for marginal gains.
253
+
254
+ **Silent output is expected**, not a bug, when:
255
+
256
+ - The region is already near-optimal in its zone (e.g. `eu-west-3` at ~42 gCO₂eq/kWh vs `eu-north-1` at ~35 → **~17% savings**, below threshold).
257
+ - No resolvable region in the diff (e.g. `var.region` without tfvars, or only removed lines).
258
+
259
+ **Example — `eu-west-3` (no suggestion):**
260
+
261
+ ```bash
262
+ carbongate --diff-file tests/fixtures/eu-west-3-only.diff
263
+ ```
264
+
265
+ ```
266
+ 🔍 Analyzing diff for carbon optimization opportunities...
267
+ ✅ No carbon optimization detected (all regions within 30% of optimal).
268
+ ```
269
+
270
+ **Example — `eu-west-1` (suggestion emitted):**
271
+
272
+ ```bash
273
+ carbongate --diff-file gtm/audits/terraform-aws-vpc/audit_input.diff
274
+ ```
275
+
276
+ ```markdown
277
+ ## 🌱 CarbonGate Shift-Left Analysis
278
+ ...
279
+ 💰 **Estimated Carbon Savings:** `86.4%`
280
+ ```
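The threshold gate itself is simple arithmetic. A minimal sketch follows, assuming savings are computed relative to the current region's intensity (which is consistent with the ~17% and 62.0% figures in this README); `savings_pct` and `should_comment` are hypothetical helper names, while `SAVINGS_THRESHOLD_PCT` mirrors the constant named above.

```python
SAVINGS_THRESHOLD_PCT = 30.0  # default; the CLI exposes --threshold to override


def savings_pct(current_ci: float, best_ci: float) -> float:
    """Relative reduction in carbon intensity, as a percentage of the current value."""
    if current_ci <= 0:
        raise ValueError("current carbon intensity must be positive")
    return (current_ci - best_ci) / current_ci * 100.0


def should_comment(current_ci: float, best_ci: float,
                   threshold_pct: float = SAVINGS_THRESHOLD_PCT) -> bool:
    """True when the estimated savings meet or exceed the threshold."""
    return savings_pct(current_ci, best_ci) >= threshold_pct


if __name__ == "__main__":
    # eu-west-3 (~42) vs eu-north-1 (~35): below the default threshold.
    print(round(savings_pct(42.0, 35.0), 1), should_comment(42.0, 35.0))
    # us-east-1 (322.0) vs us-west-2 (122.5): above the default threshold.
    print(round(savings_pct(322.0, 122.5), 1), should_comment(322.0, 122.5))
```

Lowering the threshold (for example to 15) would turn the `eu-west-3` case into a suggestion, which is the effect of passing a smaller `--threshold` value.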
281
+
282
+
283
+ ### CLI options (Phase 0)
284
+
285
+ | Flag | Description |
286
+ |------|-------------|
287
+ | `analyze --diff-file` | Git diff from PR |
288
+ | `analyze --hcl-file` | Static `.tf` scan (synthetic diff) |
289
+ | `analyze --plan-json` | `terraform plan -json` / `terraform show -json` output |
290
+ | `analyze --tfvars-file` | Explicit tfvars (repeatable); also auto-discovered |
291
+ | `analyze --threshold` | Min savings % to comment (default `30`) |
292
+ | `analyze --api-key` | Fresh Parquet from emissions.dev |
293
+ | `analyze --json` | Machine-readable JSON output (all input modes) |
294
+ | `serve` | Start MCP server for AI agent integration |
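For the `--plan-json` mode, region extraction amounts to a recursive walk over the plan document. The sketch below is illustrative only and is not the real `tf_parser.py`: `find_regions` is a hypothetical helper, and the sample dict is a hand-written fragment in the general shape of `terraform show -json` output, where provider configuration wraps literals in `constant_value`.

```python
import re

# Hypothetical pattern for AWS-style region names such as "eu-west-1".
REGION_PATTERN = re.compile(r"^[a-z]{2}-[a-z]+-\d$")


def find_regions(node, found=None):
    """Recursively collect string values stored under a 'region' key."""
    found = [] if found is None else found
    if isinstance(node, dict):
        for key, value in node.items():
            candidate = value
            # Provider config stores literals as {"constant_value": "..."}.
            if key == "region" and isinstance(value, dict):
                candidate = value.get("constant_value")
            if key == "region" and isinstance(candidate, str):
                if REGION_PATTERN.match(candidate) and candidate not in found:
                    found.append(candidate)
            else:
                find_regions(value, found)
    elif isinstance(node, list):
        for item in node:
            find_regions(item, found)
    return found


if __name__ == "__main__":
    plan = {
        "configuration": {"provider_config": {"aws": {"expressions": {
            "region": {"constant_value": "eu-west-1"}}}}},
        "resource_changes": [{"change": {"after": {
            "region": "us-east-1", "tags": {"region": "n/a"}}}}],
    }
    print(find_regions(plan))
```

Validating each candidate against the pattern keeps unrelated `region` keys, such as a free-form tag, out of the result.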
295
+
296
+ ## Configuration
297
+
298
+ | Variable | Required | Description |
299
+ |----------|----------|-------------|
300
+ | `EMISSIONS_DEV_API_KEY` | No | emissions.dev API key (free: 500 req/month). Without it, uses realistic mock data. |
301
+
302
+
303
+ ### Using the Action (published vs local)
304
+
305
+ **Published (after release on GitHub):**
306
+
307
+ ```yaml
308
+ - uses: carbongate/carbongate@v0.2
309
+ with:
310
+ github-token: ${{ secrets.GITHUB_TOKEN }}
311
+ api-key: ${{ secrets.CARBONGATE_API_KEY }}
312
+ ```
313
+
314
+ **Same repository (development):**
315
+
316
+ ```yaml
317
+ - uses: ./
318
+ with:
319
+ github-token: ${{ secrets.GITHUB_TOKEN }}
320
+ ```
321
+
322
+ See [GitHub Marketplace checklist](docs/GITHUB_MARKETPLACE.md).
323
+
324
+ ## GitHub Action Setup
325
+
326
+ Add to `.github/workflows/carbongate.yml` (or use [cost + carbon](.github/workflows/cost-and-carbon.yml) with Infracost):
327
+
328
+ ```yaml
329
+ name: CarbonGate Analysis
330
+ on:
331
+ pull_request:
332
+ paths:
333
+ - '**/*.tf'
334
+ - '**/*.tfvars'
335
+
336
+ jobs:
337
+ carbongate:
338
+ runs-on: ubuntu-latest
339
+ steps:
340
+ - uses: carbongate/carbongate@v0.2
341
+ with:
342
+ github-token: ${{ secrets.GITHUB_TOKEN }}
343
+ api-key: ${{ secrets.CARBONGATE_API_KEY }}
344
+ ```
345
+
346
+ ## Tech Stack
347
+
348
+ | Component | Technology | Why |
349
+ |-----------|------------|-----|
350
+ | Data processing | **Polars** | Zero-copy, columnar, faster than Pandas |
351
+ | Query engine | **DuckDB** | In-process OLAP, queries the bundled Parquet in <0.01s |
352
+ | CLI | **Typer** | Type-safe Click wrapper |
353
+ | Data format | **Apache Parquet** | Compressed columnar, 2MB per zone |
354
+ | Data source | **emissions.dev** | Free tier, annual CI per country + cloud region |
355
+ | Packaging | **Docker** / **pip** | GitHub Action + PyPI ready (85 pytest tests) |
356
+
357
+ ## Performance
358
+
359
+ - **Parse diff**: <0.1s (regex)
360
+ - **Query DuckDB**: <0.01s (in-memory Parquet)
361
+ - **End-to-end**: <0.2s
362
+ - **Carbon-data API calls per PR run**: 0 (data pre-baked in Parquet)
363
+ - **Compute cost per PR**: $0.00
364
+
365
+ ## License
366
+
367
+ MIT
368
+
369
+ ---
370
+
371
+ # CarbonGate — Analyse Carbone Shift-Left pour Terraform
372
+
373
+ ## 🇫🇷 Français
374
+
375
+ CarbonGate est un outil CLI **déterministe, zéro-token, zéro-latence** qui analyse les diffs d'infrastructure Terraform et recommande des régions AWS avec une intensité carbone plus faible — avant le merge.
376
+
377
+ ### Le Sweet Spot
378
+
379
+ Tu donnes le logiciel (gratuit, open-source, tourne sur leur propre runner CI). Tu vends la donnée (feed Parquet frais avec l'intensité carbone temps réel d'emissions.dev). Pas de dashboards. Pas d'hallucinations LLM. Juste un fichier YAML, un parser impitoyable, un moteur DuckDB plus rapide que l'électricité, et une porte d'entrée (la donnée) que tu contrôles.
380
+
381
+ ### Comment ça marche
382
+
383
+ ```
384
+ Le développeur pousse du Terraform sur GitHub
385
+
386
+
387
+ GitHub Action se déclenche (ubuntu-latest)
388
+
389
+ ├── 1. Parse le diff → extrait les régions AWS (regex, <0.1s)
390
+ ├── 2. Interroge DuckDB sur le Parquet → trouve l'alternative la moins carbonée dans la même zone (0.01s)
391
+ ├── 3. Si économies ≥ 30%, formate un commentaire Markdown PR
392
+ └── 4. Poste le commentaire via l'API GitHub
393
+ ```
394
+
395
+ Aucun appel API par exécution de PR. L'intelligence est pré-cuite dans le fichier Parquet. Ton coût compute : 0,00 €.
396
+
397
+ ### Démarrage Rapide
398
+
399
+ ```bash
400
+ git clone https://github.com/carbongate/carbongate.git
401
+ cd carbongate
402
+ pip install -e .
403
+ export EMISSIONS_DEV_API_KEY=em_live_xxxx # Gratuit sur https://emissions.dev
404
+ python data_packer.py
405
+ carbongate --diff-file chemin/vers/terraform.diff
406
+ ```
407
+
408
+ ### Architecture
409
+
410
+ ```
411
+ carbongate/
412
+ ├── data_packer.py # Polars → Récupère CI réelle d'emissions.dev, écrit Parquet
413
+ ├── query_engine.py # DuckDB in-process : get_best_alternative(region) → dict
414
+ ├── tf_parser.py # Regex déterministe : extrait les régions AWS du diff git
415
+ ├── judge.py # Moteur de décision : seuil ≥30%, formatage Markdown
416
+ ├── carbongate_cli.py # Point d'entrée CLI Typer (analyze / serve)
417
+ ├── mcp_server.py # Serveur MCP pour intégration agent IA (3 tools)
418
+ ├── entrypoint.sh # Entrypoint GitHub Action
419
+ ├── Dockerfile # python:3.12-slim, prêt pour GitHub Actions
420
+ ├── action.yml # Définition GitHub Action
421
+ ├── pyproject.toml # Métadonnées package (pip install .)
422
+ └── data/
423
+ ├── eu/eu_carbon.parquet
424
+ └── us/us_carbon.parquet
425
+ ```
426
+
427
+ ### Régions supportées
428
+
429
+ | Région AWS | Pays | Source données | Meilleur pair (feed) | Économie typique |
430
+ |---------------|---------------|--------------------|----------------------|------------------|
431
+ | `eu-west-1` | Irlande | emissions.dev (IE) | → `eu-north-1` | ~86 % |
432
+ | `eu-west-2` | Royaume-Uni | emissions.dev (GB) | → `eu-north-1` | ~84 % |
433
+ | `eu-west-3` | France | emissions.dev (FR) | → `eu-north-1` | ~17 % (sous seuil) |
434
+ | `eu-central-1`| Allemagne | emissions.dev (DE) | → `eu-north-1` | ~90 % |
435
+ | `eu-north-1` | Suède | emissions.dev (SE) | déjà optimal | — |
436
+ | `us-east-1` | Virginie | eGRID 2023 (×0.92) | → `us-west-2` | ~62 % |
437
+ | `us-east-2` | Ohio | eGRID 2023 (×1.25) | → `us-west-2` | ~72 % |
438
+ | `us-west-1` | Californie | eGRID 2023 (×0.55) | → `us-west-2` | ~36 % |
439
+ | `us-west-2` | Oregon | eGRID 2023 (×0.35) | déjà optimal | — |
440
+
441
+ ### Seuil 30 %
442
+
443
+ CarbonGate ne commente une PR que si l'économie carbone estimée est **≥ 30 %** (`SAVINGS_THRESHOLD_PCT` dans `judge.py`). Exemple : `eu-west-3` (~42 gCO₂eq/kWh) vs `eu-north-1` (~35) → **~17 %** → sortie silencieuse (« No carbon optimization detected »), pas un bug.
444
+
445
+ ---
446
+
447
+ ## AI / LLM / Agent Usage
448
+
449
+ ### For AI Agents (Claude Code, Cursor, Copilot, etc.)
450
+
451
+ CarbonGate is designed to be **operated by AI agents** as well as humans. The codebase follows these agent-friendly conventions:
452
+
453
+ 1. **Single-file modules**: each `.py` file is a self-contained module with a clear single responsibility
454
+ 2. **Type-safe**: all public functions use `TypedDict` or explicit return types
455
+ 3. **Deterministic**: zero LLM calls in the pipeline; everything is regex + SQL
456
+ 4. **Silent on success**: functions return `None` when there's nothing to report; no spam
457
+ 5. **Idempotent**: `data_packer.py` can be re-run safely; `query_engine.py` reads latest data
458
+ 6. **Mock fallback**: works without API keys for dev/testing
459
+
460
+ ### For LLM Context Windows
461
+
462
+ When an AI agent reads this project, the optimal files to load are:
463
+
464
+ | Priority | File | Purpose |
465
+ |----------|------|---------|
466
+ | 1 | `README.md` | This file — architecture overview |
467
+ | 2 | `pyproject.toml` | Dependencies and entry points |
468
+ | 3 | `carbongate_cli.py` | Understand the CLI surface |
469
+ | 4 | `judge.py` | Decision logic + Markdown format |
470
+ | 5 | `query_engine.py` | DuckDB query patterns |
471
+ | 6 | `tf_parser.py` | Region extraction regex |
472
+ | 7 | `data_packer.py` | Data pipeline (only if modifying data sources) |
473
+
474
+ ### llms.txt
475
+
476
+ ```
477
+ # CarbonGate — Shift-left carbon analysis for Terraform
478
+ # Architecture (7 files, 1019 lines)
479
+
480
+ ## Core Pipeline
481
+ carbongate_cli.py: Typer CLI (--diff-file, --hcl-file, --plan-json, --json, --threshold, serve)
482
+ judge.py: Decision engine (≥30% savings threshold), Markdown + JSON formatter
483
+ tf_parser.py: hcl2 + regex + plan JSON recursive; tfvars; var.region resolution
484
+ query_engine.py: DuckDB in-process query engine over Parquet carbon data
485
+
486
+ ## Agent Integration
487
+ mcp_server.py: FastMCP server — 3 tools: analyze_plan, analyze_diff, suggest_region
488
+ Entry point: carbongate serve or carbongate-serve
489
+ JSON output: schema_version: 1 envelope
490
+
491
+ ## Data Pipeline
492
+ data_packer.py: Polars-based ETL. Fetches real CI from emissions.dev API, falls back to mock data.
493
+ Reads EMISSIONS_DEV_API_KEY from env. Writes data/{eu,us}/*.parquet.
494
+
495
+ ## Deployment
496
+ action.yml: GitHub Action definition (docker-based)
497
+ entrypoint.sh: Extracts PR diff via GitHub API, runs carbongate (supports INPUT_PLAN_JSON_FILE)
498
+ Dockerfile: python:3.12-slim with curl, jq, duckdb, polars
499
+ pyproject.toml: Package metadata, dependencies, entry points (carbongate, carbongate-serve)
500
+
501
+ ## Data
502
+ data/eu/eu_carbon.parquet: EU region carbon intensity (5 regions × latest timestamp)
503
+ data/us/us_carbon.parquet: US region carbon intensity (4 regions × latest timestamp)
504
+
505
+ ## External APIs
506
+ emissions.dev /v1/electricity/grid: country-level grid carbon intensity (free tier: 500 req/month)
507
+ GitHub REST API: POST /repos/{owner}/{repo}/issues/{pr_number}/comments
508
+
509
+ ## Key Design Decisions
510
+ - Annual average CI (not real-time): structural advantage matters more than hourly fluctuations for infra decisions
511
+ - No LLM calls in pipeline: purely deterministic (regex + SQL)
512
+ - Data pre-baked as Parquet: zero API calls per PR run, $0 compute cost for user
513
+ - AWS region → country mapping: static table, no runtime lookups
514
+ - 30% savings threshold: avoids noisy suggestions for marginal improvements
515
+ ```
516
+
517
+ ---
518
+
519
+ ## Competitor Landscape (2026)
520
+
521
+ | Tool | Position | CarbonGate Advantage |
522
+ |------|----------|---------------------|
523
+ | **Infracost + InfraCarbon** | PR comments, closed backend | Open-source CLI, Parquet data feed, no dashboard lock-in |
524
+ | **CarbonAware** | Scheduling, K8s/Airflow | Pre-merge analysis, not post-deploy |
525
+ | **Cloud Carbon Footprint** | Post-deploy reporting dashboard | Shift-left: catch it *before* merge |
526
+ | **Climatiq** | General carbon API | Deprecated cloud endpoint (Sep 2026) |
527
+
528
+ ---
529
+
530
+ Built with ❤️ and DuckDB.