queryguard-cli 0.1.0 (tar.gz)

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
LICENSE
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2026 Mark de Haan
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
PKG-INFO
@@ -0,0 +1,124 @@
+ Metadata-Version: 2.4
+ Name: queryguard-cli
+ Version: 0.1.0
+ Summary: A BigQuery Cost Analysis CLI Tool
+ License-File: LICENSE
+ Keywords: bigquery,gcp,cost-optimization,finops,cli
+ Author: Mark de Haan
+ Author-email: markdehaan90@gmail.com
+ Requires-Python: >=3.12,<4.0
+ Classifier: Intended Audience :: Developers
+ Classifier: Intended Audience :: System Administrators
+ Classifier: Operating System :: OS Independent
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3.12
+ Classifier: Programming Language :: Python :: 3.13
+ Classifier: Programming Language :: Python :: 3.14
+ Classifier: Topic :: Database
+ Classifier: Topic :: Utilities
+ Requires-Dist: db-dtypes (>=1.2.0,<2.0.0)
+ Requires-Dist: google-cloud-bigquery (>=3.40.0,<4.0.0)
+ Requires-Dist: rich (>=14.2.0,<15.0.0)
+ Requires-Dist: typer (>=0.21.1,<0.22.0)
+ Project-URL: Repository, https://github.com/mark-de-haan/query-guard-cli
+ Description-Content-Type: text/markdown
+
+ # QueryGuard CLI 🛡️
+
+ **The Forensic Auditor for your BigQuery Bill.**
+
+ [![Python 3.12+](https://img.shields.io/badge/python-3.12+-blue.svg)](https://www.python.org/downloads/)
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+ [![Code Style: Black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
+
+ **QueryGuard** (`bqg`) is a CLI tool that hunts down expensive BigQuery queries across your entire Google Cloud organization. It connects to the `INFORMATION_SCHEMA`, calculates exact costs based on regional pricing, and flags high-risk patterns like `SELECT *` or missing `LIMIT` clauses.
+
+ Stop guessing who spent the budget. **Know.**
+
+ ---
+
+ ## ⚡ Features
+
+ * **🌍 Global Auto-Discovery**: Automatically scans your project to find active regions and queries them in parallel. No more guessing if data is in `us-central1` or `europe-west3`.
+
+ * **💸 Forensic Cost Analysis**: Calculates costs based on the **exact datacenter pricing** (e.g., pricing Zurich queries at $8.75/TiB vs. US queries at $6.25/TiB).
+
+ * **🚩 Risk Detection**: Instantly flags bad habits:
+     * `SELECT *` usage
+     * Queries without `LIMIT`
+     * Heavy scans (>100 GB)
+     * Wrapper scripts vs. actual compute
+
+ * **🤖 Bot Filtering**: Use `--humans-only` to filter out service accounts and Looker bots, focusing strictly on manual engineering errors.
+ * **🚀 High Performance**: Uses multi-threaded execution to audit dozens of regions in seconds.
+
+ ---
+
+ ## 📦 Installation
+
+ ### Option 1: Using Pip
+ ```bash
+ pip install queryguard-cli
+ ```
+
+ ### Option 2: From Source (Poetry)
+ ```bash
+ # Clone the repo
+ git clone git@github.com:mark-de-haan/query-guard-cli.git
+
+ # Navigate
+ cd query-guard-cli
+
+ # Install locally
+ poetry install
+ ```
+
+ ## 🚀 Quick Start
+ Ensure you are authenticated with Google Cloud:
+ ```bash
+ gcloud auth application-default login
+ ```
+
+ Run a forensic scan on your primary project for the last 7 days:
+ ```bash
+ bqg scan --project my-gcp-project
+ ```
+
+ #### Global scanning
+ Audit every active region globally to find hidden costs:
+ ```bash
+ bqg scan --project my-gcp-project --global
+ ```
+
+ ## 🛠 Usage Guide
+ ### The `scan` Command
+ | Flag | Short | Description |
+ | ----- | ----- | ----------- |
+ | `--project` | `-p` | Required. The GCP Project ID to audit. |
+ | `--global` | `-g` | Auto-discover active regions and scan them all in parallel. |
+ | `--region` | `-r` | Scan a specific region (e.g., `europe-west1`). Ignored if `--global` is set. |
+ | `--days` | `-d` | Lookback window in days (Default: 7). |
+ | `--humans-only` | | Hides service accounts (e.g., gserviceaccount, monitoring) to find manual errors. |
+ | `--limit` | `-l` | Number of expensive queries to display (Default: 10). |
+
+ #### Examples
+ Find who is running expensive queries manually:
+ ```bash
+ bqg scan -p my-data-warehouse --global --humans-only --days 30
+ ```
+
+ Audit a specific region for a deep dive:
+ ```bash
+ bqg scan -p my-data-warehouse -r europe-west3
+ ```
+
+ ## 🤝 Contributing
+ Contributions are welcome! Please check the issues page.
+ 1. Fork the Project
+ 2. Create your Feature Branch (`git checkout -b feat/AmazingFeature`)
+ 3. Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
+ 4. Push to the Branch (`git push origin feat/AmazingFeature`)
+ 5. Open a Pull Request
+
+ ## 📄 License
+ Distributed under the MIT License. See LICENSE for more information.
README.md
@@ -0,0 +1,99 @@
+ # QueryGuard CLI 🛡️
+
+ **The Forensic Auditor for your BigQuery Bill.**
+
+ [![Python 3.12+](https://img.shields.io/badge/python-3.12+-blue.svg)](https://www.python.org/downloads/)
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+ [![Code Style: Black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
+
+ **QueryGuard** (`bqg`) is a CLI tool that hunts down expensive BigQuery queries across your entire Google Cloud organization. It connects to the `INFORMATION_SCHEMA`, calculates exact costs based on regional pricing, and flags high-risk patterns like `SELECT *` or missing `LIMIT` clauses.
+
+ Stop guessing who spent the budget. **Know.**
+
+ ---
+
+ ## ⚡ Features
+
+ * **🌍 Global Auto-Discovery**: Automatically scans your project to find active regions and queries them in parallel. No more guessing if data is in `us-central1` or `europe-west3`.
+
+ * **💸 Forensic Cost Analysis**: Calculates costs based on the **exact datacenter pricing** (e.g., pricing Zurich queries at $8.75/TiB vs. US queries at $6.25/TiB).
+
+ * **🚩 Risk Detection**: Instantly flags bad habits:
+     * `SELECT *` usage
+     * Queries without `LIMIT`
+     * Heavy scans (>100 GB)
+     * Wrapper scripts vs. actual compute
+
+ * **🤖 Bot Filtering**: Use `--humans-only` to filter out service accounts and Looker bots, focusing strictly on manual engineering errors.
+ * **🚀 High Performance**: Uses multi-threaded execution to audit dozens of regions in seconds.
+
+ ---
+
+ ## 📦 Installation
+
+ ### Option 1: Using Pip
+ ```bash
+ pip install queryguard-cli
+ ```
+
+ ### Option 2: From Source (Poetry)
+ ```bash
+ # Clone the repo
+ git clone git@github.com:mark-de-haan/query-guard-cli.git
+
+ # Navigate
+ cd query-guard-cli
+
+ # Install locally
+ poetry install
+ ```
+
+ ## 🚀 Quick Start
+ Ensure you are authenticated with Google Cloud:
+ ```bash
+ gcloud auth application-default login
+ ```
+
+ Run a forensic scan on your primary project for the last 7 days:
+ ```bash
+ bqg scan --project my-gcp-project
+ ```
+
+ #### Global scanning
+ Audit every active region globally to find hidden costs:
+ ```bash
+ bqg scan --project my-gcp-project --global
+ ```
+
+ ## 🛠 Usage Guide
+ ### The `scan` Command
+ | Flag | Short | Description |
+ | ----- | ----- | ----------- |
+ | `--project` | `-p` | Required. The GCP Project ID to audit. |
+ | `--global` | `-g` | Auto-discover active regions and scan them all in parallel. |
+ | `--region` | `-r` | Scan a specific region (e.g., `europe-west1`). Ignored if `--global` is set. |
+ | `--days` | `-d` | Lookback window in days (Default: 7). |
+ | `--humans-only` | | Hides service accounts (e.g., gserviceaccount, monitoring) to find manual errors. |
+ | `--limit` | `-l` | Number of expensive queries to display (Default: 10). |
+
+ #### Examples
+ Find who is running expensive queries manually:
+ ```bash
+ bqg scan -p my-data-warehouse --global --humans-only --days 30
+ ```
+
+ Audit a specific region for a deep dive:
+ ```bash
+ bqg scan -p my-data-warehouse -r europe-west3
+ ```
+
+ ## 🤝 Contributing
+ Contributions are welcome! Please check the issues page.
+ 1. Fork the Project
+ 2. Create your Feature Branch (`git checkout -b feat/AmazingFeature`)
+ 3. Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
+ 4. Push to the Branch (`git push origin feat/AmazingFeature`)
+ 5. Open a Pull Request
+
+ ## 📄 License
+ Distributed under the MIT License. See LICENSE for more information.
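The regional pricing described in the README is easiest to see with a worked example. The sketch below is illustrative only: the per-TiB rates ($6.25 for `us`, $8.75 for `europe-west6`) are the ones quoted above, while the 2 TiB scan size is a made-up figure.

```python
# Illustrative cost math: on-demand cost = TiB billed x regional price per TiB.
# The 2 TiB scan is hypothetical; the rates mirror the ones quoted in the README.
bytes_billed = 2 * 1024**4            # 2 TiB expressed in bytes
tebibytes = bytes_billed / (1024**4)  # back to TiB

cost_us = tebibytes * 6.25      # us rate              -> $12.50
cost_zurich = tebibytes * 8.75  # europe-west6 (Zurich) -> $17.50

print(f"us: ${cost_us:.2f} vs europe-west6: ${cost_zurich:.2f}")
```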
pyproject.toml
@@ -0,0 +1,32 @@
+ [tool.poetry]
+ name = "queryguard-cli"
+ version = "0.1.0"
+ description = "A BigQuery Cost Analysis CLI Tool"
+ authors = ["Mark de Haan <markdehaan90@gmail.com>"]
+ readme = "README.md"
+ packages = [{include = "queryguard"}]
+ repository = "https://github.com/mark-de-haan/query-guard-cli"
+ keywords = ["bigquery", "gcp", "cost-optimization", "finops", "cli"]
+ classifiers = [
+     "Topic :: Database",
+     "Topic :: Utilities",
+     "Intended Audience :: Developers",
+     "Intended Audience :: System Administrators",
+     "Programming Language :: Python :: 3",
+     "Programming Language :: Python :: 3.12",
+     "Operating System :: OS Independent",
+ ]
+
+ [tool.poetry.dependencies]
+ python = "^3.12"
+ typer = "^0.21.1"
+ rich = "^14.2.0"
+ google-cloud-bigquery = "^3.40.0"
+ db-dtypes = "^1.2.0"
+
+ [tool.poetry.scripts]
+ bqg = "queryguard.main:app"
+
+ [build-system]
+ requires = ["poetry-core"]
+ build-backend = "poetry.core.masonry.api"
File without changes
queryguard/analysis.py
@@ -0,0 +1,67 @@
+ import re
+
+ # Pricing per TiB (approximate on-demand rates)
+ REGION_PRICING_TABLE: dict[str, float] = {
+     # --- The "Standard" Tier ($6.25) ---
+     "us": 6.25,
+     "us-central1": 6.25,
+     "us-east1": 6.25,
+     "us-east4": 6.25,
+     "us-west1": 6.25,
+     "eu": 6.25,
+     "europe-west1": 6.25,  # Belgium
+     "europe-north1": 6.25,  # Finland
+
+     # --- The "High Energy" Tier (~$7.50 - $8.75) ---
+     "europe-west4": 7.50,  # Netherlands
+     "europe-west2": 7.82,  # London
+     "europe-west3": 8.13,  # Frankfurt
+     "europe-west6": 8.75,  # Zurich
+     "us-west2": 8.44,  # Los Angeles
+     "us-west3": 8.44,  # Salt Lake City
+
+     # --- The "APAC & Premium" Tier ($7.40 - $11.25) ---
+     "southamerica-east1": 11.25,  # Sao Paulo
+     "asia-northeast1": 7.40,  # Tokyo
+     "asia-southeast1": 7.80,  # Singapore
+     "me-central2": 10.00,  # Dammam
+ }
+
+ DEFAULT_PRICE: float = 6.25
+
+
+ class ForensicAuditor:
+
+     @staticmethod
+     def get_price_per_tib(region: str) -> float:
+         """Returns the price per TiB for the given region, defaulting to $6.25."""
+         return REGION_PRICING_TABLE.get(region.lower(), DEFAULT_PRICE)
+
+     @staticmethod
+     def calculate_cost(bytes_billed: int, region: str) -> float:
+         """Calculates cost based on region-specific pricing."""
+         if not bytes_billed:
+             return 0.0
+
+         tebibytes: float = bytes_billed / (1024**4)
+         price = ForensicAuditor.get_price_per_tib(region)
+         return tebibytes * price
+
+     @staticmethod
+     def analyze_query(sql: str, bytes_billed: int) -> list[str]:
+         """Returns a list of detected issues."""
+         risks: list[str] = []
+
+         # 1. SELECT *
+         if re.search(r'SELECT\s+\*\s+', sql, re.IGNORECASE):
+             risks.append("SELECT *")
+
+         # 2. Missing LIMIT
+         if "LIMIT" not in sql.upper():
+             risks.append("NO LIMIT")
+
+         # 3. High scan volume (> 100 GB)
+         if bytes_billed > (100 * 1024**3):
+             risks.append("HEAVY SCAN")
+
+         return risks
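As a quick orientation, here is a minimal sketch of calling the auditor above directly, assuming the module is importable as `queryguard.analysis`; the SQL text and byte count are hypothetical.

```python
# Sketch: exercising ForensicAuditor with hypothetical inputs.
from queryguard.analysis import ForensicAuditor

sql = "SELECT * FROM `proj.sales.events`"
bytes_billed = 150 * 1024**3  # 150 GiB billed

cost = ForensicAuditor.calculate_cost(bytes_billed, "europe-west6")  # (150/1024) TiB * $8.75
risks = ForensicAuditor.analyze_query(sql, bytes_billed)

print(f"${cost:.2f}", risks)  # ~$1.28 ['SELECT *', 'NO LIMIT', 'HEAVY SCAN']
```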
queryguard/client.py
@@ -0,0 +1,121 @@
+ from concurrent import futures
+ import sys
+ from typing import cast
+
+ from google.auth import default
+ from google.auth.credentials import Credentials
+ from google.cloud.bigquery import Client as BigQueryClient
+
+
+ def get_bq_client(project_id: str | None = None) -> BigQueryClient:
+     """
+     Authenticates using local Application Default Credentials (ADC).
+     Returns a configured BigQueryClient.
+     """
+     try:
+         credentials, default_project = default()
+         target_project = project_id or default_project
+
+         if not target_project:
+             print(
+                 "Error: No Google Cloud project found. Please pass --project or set a default in gcloud.")
+             sys.exit(1)
+
+         return BigQueryClient(project=target_project, credentials=cast(Credentials, credentials))
+
+     except Exception as e:
+         print(f"Authentication Error: {e}")
+         print("Tip: Run 'gcloud auth application-default login' to authenticate.")
+         sys.exit(1)
+
+
+ def discover_active_regions(client: BigQueryClient, project_id: str) -> list[str]:
+     """
+     Auto-detects active regions by listing datasets.
+     Accesses _properties directly as DatasetListItem does not expose .location.
+     """
+     print(f" ... Auto-discovering active regions for {project_id} ...")
+     try:
+         datasets = list(client.list_datasets(project=project_id))
+     except Exception as e:
+         print(f" Warning: Could not list datasets to discover regions ({e})")
+         return ["us", "eu"]  # Fallback defaults
+
+     regions = set()
+     for dataset in datasets:
+         # Accessing the raw resource dict
+         props = dataset._properties
+         if "location" in props:
+             regions.add(props["location"].lower())
+
+     found = list(regions)
+     if not found:
+         print(" Warning: No datasets found. Defaulting to 'us'.")
+         return ["us"]
+
+     print(f" Found active data in: {', '.join(found)}")
+     return found
+
+
+ def _fetch_single_region(client: BigQueryClient, project_id: str, region: str, days: int, limit: int) -> list[dict]:
+     """Worker: Scans one specific region."""
+     table_id = f"`{project_id}`.`region-{region}`.INFORMATION_SCHEMA.JOBS"
+
+     query = f"""
+         SELECT
+             job_id,
+             user_email,
+             total_bytes_billed,
+             query,
+             creation_time,
+             total_slot_ms,
+             statement_type
+         FROM
+             {table_id}
+         WHERE
+             creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {days} DAY)
+             AND job_type = 'QUERY'
+             AND statement_type != 'SCRIPT'
+             AND total_bytes_billed > 0
+         ORDER BY
+             total_bytes_billed DESC
+         LIMIT {limit}
+     """
+     try:
+         query_job = client.query(query)
+         return [dict(row) | {"region": region} for row in query_job.result()]
+     except Exception:
+         return []
+
+
+ def fetch_recent_jobs(client: BigQueryClient, project_id: str, region: str, days: int, global_scan: bool, limit: int) -> list[dict]:
+     """
+     Fetches jobs from a single region OR auto-discovers all regions if global_scan is True.
+     """
+     if not global_scan:
+         return _fetch_single_region(client, project_id, region.lower(), days, limit)
+
+     regions_to_scan: list[str] = discover_active_regions(client, project_id)
+     all_jobs: list[dict] = []
+
+     print(f" ... Scanning {len(regions_to_scan)} regions in parallel ...")
+     with futures.ThreadPoolExecutor(max_workers=10) as executor:
+         future_to_region = {
+             executor.submit(_fetch_single_region, client, project_id, r, days, limit): r
+             for r in regions_to_scan
+         }
+
+         for future in futures.as_completed(future_to_region):
+             data = future.result()
+             if data:
+                 all_jobs.extend(data)
+
+     seen_jobs = set()
+     unique_jobs = []
+     for job in all_jobs:
+         if job['job_id'] not in seen_jobs:
+             unique_jobs.append(job)
+             seen_jobs.add(job['job_id'])
+
+     unique_jobs.sort(key=lambda x: x.get('total_bytes_billed', 0), reverse=True)
+     return unique_jobs[:limit]
1
+ from rich.console import Console
2
+ from rich.panel import Panel
3
+ from rich.table import Table
4
+
5
+
6
+ def print_audit_results(
7
+ console: Console,
8
+ table: Table,
9
+ total_spend: float,
10
+ displayed_count: int
11
+ ) -> None:
12
+ """Prints the audit results in a formatted table."""
13
+ console.print(table)
14
+ console.print(Panel(
15
+ f"[bold]Total Cost in View: ${total_spend:.2f}[/bold]\n"
16
+ f"Showing {displayed_count} queries.",
17
+ title="Audit Summary",
18
+ border_style="white",
19
+ expand=False
20
+ ))
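A small usage sketch for the helper above: build a Rich table, then hand it to `print_audit_results`. The rows and totals are hypothetical sample data.

```python
# Sketch: feeding sample data into print_audit_results.
from rich.console import Console
from rich.table import Table

from queryguard.console_utils import print_audit_results

console = Console()
table = Table(title="Top Expensive Queries", header_style="bold cyan")
table.add_column("User")
table.add_column("Cost", justify="right", style="green")
table.add_row("alice", "$12.50")
table.add_row("bob", "$3.10")

print_audit_results(console, table, total_spend=15.60, displayed_count=2)
```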
queryguard/main.py
@@ -0,0 +1,127 @@
+ from rich.console import Console
+ from rich.panel import Panel
+ from rich.progress import Progress, SpinnerColumn, TextColumn
+ from rich.table import Table
+ from typer import Context, Exit, Option, Typer
+
+ from google.cloud.bigquery import Client as BigQueryClient
+
+ from .analysis import ForensicAuditor
+ from .client import fetch_recent_jobs, get_bq_client
+ from .console_utils import print_audit_results
+
+ app: Typer = Typer()
+ console: Console = Console()
+
+
+ @app.callback(invoke_without_command=True)
+ def main_callback(ctx: Context):
+     """
+     QueryGuard CLI - Audit BigQuery spend and find cost anomalies.
+     """
+     if ctx.invoked_subcommand is None:
+         console.print(Panel(
+             "[bold]QueryGuard CLI[/bold]\n\n"
+             "The forensic auditor for your BigQuery bills.\n\n"
+             "Usage:\n"
+             "  [cyan]bqg scan[/cyan]         Start a forensic audit\n"
+             "  [cyan]bqg scan --help[/cyan]  Show all options",
+             title="Welcome",
+             border_style="blue"
+         ))
+         raise Exit()
+
+
+ @app.command("scan")
+ def scan(
+     project_id: str = Option(None, "--project", "-p", help="GCP Project ID. Defaults to local config."),
+     region: str = Option("us", "--region", "-r", help="BigQuery Region (e.g. us, eu, europe-west4)."),
+     days: int = Option(7, "--days", "-d", help="Lookback window in days."),
+     global_scan: bool = Option(False, "--global", "-g", help="Auto-discover and scan all active regions."),
+     limit: int = Option(10, "--limit", "-l", help="Number of queries to show."),
+     humans_only: bool = Option(False, "--humans-only", help="Filter out service accounts and bots."),
+ ):
+     """
+     Audit BigQuery spend. Use --humans-only to find manual errors.
+     """
+
+     client: BigQueryClient = get_bq_client(project_id)
+     target_project_id: str = client.project
+
+     region_display = "GLOBAL (Auto-Discovery)" if global_scan else region
+
+     console.print("[bold]QueryGuard Forensic Scan[/bold]")
+     console.print(f"Target: [cyan]{target_project_id}[/cyan] | Region: [cyan]{region_display}[/cyan] | Lookback: [cyan]{days} days[/cyan]\n")
+
+     with Progress(
+         SpinnerColumn(),
+         TextColumn("[progress.description]{task.description}"),
+         transient=True,
+     ) as progress:
+         progress.add_task(description="Scanning audit logs...", total=None)
+         rows = fetch_recent_jobs(client, target_project_id, region, days, global_scan, limit=limit * 5)
+         jobs = list(rows)
+
+     if not jobs:
+         console.print("[yellow]No billed queries found in the specified period.[/yellow]")
+         raise Exit()
+
+     # Build the results table
+     table: Table = Table(
+         title=f"Top Expensive Queries ({'Humans Only' if humans_only else 'All Users'})",
+         border_style="white",
+         box=None,
+         header_style="bold cyan"
+     )
+
+     table.add_column("User", style="white", no_wrap=True)
+     table.add_column("Region", justify="right")
+     table.add_column("Data", justify="right")
+     table.add_column("Cost", justify="right", style="green")
+     table.add_column("Flags", style="red")
+     table.add_column("Query Snippet", style="dim")
+
+     displayed_count: int = 0
+     total_spend: float = 0.0
+
+     job: dict
+     for job in jobs:
+         sql = job.get('query') or ""
+         user_email = job.get('user_email') or "unknown"
+         total_bytes_billed = job.get('total_bytes_billed') or 0
+         job_region: str = job.get('region') or "us"
+
+         if humans_only:
+             # Common patterns for bots/service accounts
+             if "gserviceaccount" in user_email or "monitoring" in user_email:
+                 continue
+
+         # FIXME calculate more precise region if global scan
+         cost: float = ForensicAuditor.calculate_cost(total_bytes_billed, job_region)
+         total_spend += cost
+         risks: list[str] = ForensicAuditor.analyze_query(sql, total_bytes_billed)
+
+         gb_scanned: float = total_bytes_billed / (1024**3)
+
+         clean_query: str = " ".join(sql.replace("\n", " ").split())
+         query_snippet: str = clean_query[:60] + "..." if len(clean_query) > 60 else clean_query
+
+         table.add_row(
+             user_email.split('@')[0],
+             job_region,
+             f"{gb_scanned:.2f} GB",
+             f"${cost:.2f}",
+             ", ".join(risks) if risks else "[dim]OK[/dim]",
+             query_snippet
+         )
+
+         displayed_count += 1
+         if displayed_count >= limit:
+             break
+
+     print_audit_results(console, table, total_spend, displayed_count)
+
+
+
+ if __name__ == "__main__":
+     app()