PyPI - aisbom-cli - Versions diffs - 0.1.0__tar.gz - Mend

aisbom-cli 0.1.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (7)

aisbom_cli-0.1.0/PKG-INFO +101 -0
aisbom_cli-0.1.0/README.md +81 -0
aisbom_cli-0.1.0/aisbom/__init__.py +1 -0
aisbom_cli-0.1.0/aisbom/cli.py +93 -0
aisbom_cli-0.1.0/aisbom/safety.py +54 -0
aisbom_cli-0.1.0/aisbom/scanner.py +126 -0
aisbom_cli-0.1.0/pyproject.toml +21 -0

aisbom_cli-0.1.0/PKG-INFO ADDED

@@ -0,0 +1,101 @@
+Metadata-Version: 2.4
+Name: aisbom-cli
+Version: 0.1.0
+Summary: An AI Supply Chain security tool that detects Pickle bombs and generates CycloneDX SBOMs for Machine Learning models.
+Author: Ajoy L
+Author-email: lab700xdev@gmail.com
+Requires-Python: >=3.11,<4.0
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Programming Language :: Python :: 3.13
+Classifier: Programming Language :: Python :: 3.14
+Requires-Dist: click (<8.2.0)
+Requires-Dist: cyclonedx-python-lib (>=8.5.0,<9.0.0)
+Requires-Dist: pip-requirements-parser (>=32.0.1,<33.0.0)
+Requires-Dist: rich (>=13.7.1,<14.0.0)
+Requires-Dist: typer[all] (>=0.12.5,<0.13.0)
+Project-URL: Repository, https://github.com/Lab700xOrg/aisbom
+Description-Content-Type: text/markdown
+# AIsbom: The Supply Chain for Artificial Intelligence
+![License](https://img.shields.io/badge/license-Apache%202.0-blue)
+![Python](https://img.shields.io/badge/python-3.10%2B-blue)
+![Compliance](https://img.shields.io/badge/standard-CycloneDX-green)
+**AIsbom** is a specialized security scanner for Machine Learning artifacts. Unlike generic SBOM tools that only parse `requirements.txt`, AIsbom performs **Deep Binary Introspection** on model files (`.pt`, `.pkl`, `.safetensors`) to detect risks hidden inside the serialized weights.
+---
+## 🚀 The Problem
+AI models are not just text files; they are executable programs.
+*   **PyTorch (`.pt`)** files are Zip archives containing Pickle bytecode.
+*   **Pickle** files can execute arbitrary code (RCE) instantly upon loading.
+*   Legacy scanners see a binary blob and ignore it. **We look inside.**
+## ✨ Features
+*   **🧠 Deep Introspection:** Peeks inside PyTorch Zip structures without loading weights into RAM.
+*   **💣 Pickle Bomb Detector:** Disassembles bytecode to detect `os.system`, `subprocess`, and `eval` calls before they run.
+*   **🛡️ Compliance Ready:** Generates standard [CycloneDX v1.6](https://cyclonedx.org/) JSON for enterprise integration (Dependency-Track, ServiceNow).
+*   **⚡ Blazing Fast:** Scans GB-sized models quickly by reading only headers and the embedded pickle stream (capped at 10 MB), never the full weights.
+---
+## 📦 Installation
+```bash
+git clone https://github.com/Lab700xOrg/aisbom.git
+cd aisbom
+pip install -e .
+```
+---
+## 🛠️ Usage
+1. Scan a directory
+Pass any directory containing your ML project. AIsbom will find requirements files AND model artifacts.
+```bash
+aisbom scan ./my-ml-project
+```
+2. Output
+You will see a risk assessment table in your terminal:
+🧠 AI Model Artifacts Found
+| Filename | Framework | Risk Level |
+| :--- | :--- | :--- |
+| `bert_finetune.pt` | PyTorch | 🔴 **CRITICAL** (RCE Detected: posix.system) |
+| `safe_model.safetensors` | SafeTensors | 🟢 **LOW** (Binary Safe) |
+A compliant `sbom.json` will be generated in the current directory.
+---
+## 🔒 Security Logic
+AIsbom uses a static analysis engine to disassemble Python Pickle opcodes. It looks for specific GLOBAL and STACK_GLOBAL instructions that reference dangerous modules:
+* os / posix (System calls)
+* subprocess (Shell execution)
+* builtins.eval / exec (Dynamic code execution)
+* socket (Network reverse shells)
+---
+## 🧪 Verification & Safety
+Security tools require trust. To keep the repository free of live malware, **we provide the *source code* to generate a test "Pickle Bomb" locally.** AIsbom detects the *structure* of the threat, not just a known file hash.
+**To verify the engine yourself:**
+1.  Inspect `demo_data/generate_malware.py`. You will see it uses standard Python libraries to create a payload that simulates an `os.system` call.
+2.  Run the generator:
+    ```bash
+    python demo_data/generate_malware.py
+    ```
+3.  Scan the newly created artifact:
+    ```bash
+    aisbom scan demo_data
+    ```

aisbom_cli-0.1.0/README.md ADDED

@@ -0,0 +1,81 @@
+# AIsbom: The Supply Chain for Artificial Intelligence
+![License](https://img.shields.io/badge/license-Apache%202.0-blue)
+![Python](https://img.shields.io/badge/python-3.10%2B-blue)
+![Compliance](https://img.shields.io/badge/standard-CycloneDX-green)
+**AIsbom** is a specialized security scanner for Machine Learning artifacts. Unlike generic SBOM tools that only parse `requirements.txt`, AIsbom performs **Deep Binary Introspection** on model files (`.pt`, `.pkl`, `.safetensors`) to detect risks hidden inside the serialized weights.
+---
+## 🚀 The Problem
+AI models are not just text files; they are executable programs.
+*   **PyTorch (`.pt`)** files are Zip archives containing Pickle bytecode.
+*   **Pickle** files can execute arbitrary code (RCE) instantly upon loading.
+*   Legacy scanners see a binary blob and ignore it. **We look inside.**
+## ✨ Features
+*   **🧠 Deep Introspection:** Peeks inside PyTorch Zip structures without loading weights into RAM.
+*   **💣 Pickle Bomb Detector:** Disassembles bytecode to detect `os.system`, `subprocess`, and `eval` calls before they run.
+*   **🛡️ Compliance Ready:** Generates standard [CycloneDX v1.6](https://cyclonedx.org/) JSON for enterprise integration (Dependency-Track, ServiceNow).
+*   **⚡ Blazing Fast:** Scans GB-sized models quickly by reading only headers and the embedded pickle stream (capped at 10 MB), never the full weights.
+---
+## 📦 Installation
+```bash
+git clone https://github.com/Lab700xOrg/aisbom.git
+cd aisbom
+pip install -e .
+```
+---
+## 🛠️ Usage
+1. Scan a directory
+Pass any directory containing your ML project. AIsbom will find requirements files AND model artifacts.
+```bash
+aisbom scan ./my-ml-project
+```
+2. Output
+You will see a risk assessment table in your terminal:
+🧠 AI Model Artifacts Found
+| Filename | Framework | Risk Level |
+| :--- | :--- | :--- |
+| `bert_finetune.pt` | PyTorch | 🔴 **CRITICAL** (RCE Detected: posix.system) |
+| `safe_model.safetensors` | SafeTensors | 🟢 **LOW** (Binary Safe) |
+A compliant `sbom.json` will be generated in the current directory.
+---
+## 🔒 Security Logic
+AIsbom uses a static analysis engine to disassemble Python Pickle opcodes. It looks for specific GLOBAL and STACK_GLOBAL instructions that reference dangerous modules:
+* os / posix (System calls)
+* subprocess (Shell execution)
+* builtins.eval / exec (Dynamic code execution)
+* socket (Network reverse shells)
+---
+## 🧪 Verification & Safety
+Security tools require trust. To keep the repository free of live malware, **we provide the *source code* to generate a test "Pickle Bomb" locally.** AIsbom detects the *structure* of the threat, not just a known file hash.
+**To verify the engine yourself:**
+1.  Inspect `demo_data/generate_malware.py`. You will see it uses standard Python libraries to create a payload that simulates an `os.system` call.
+2.  Run the generator:
+    ```bash
+    python demo_data/generate_malware.py
+    ```
+3.  Scan the newly created artifact:
+    ```bash
+    aisbom scan demo_data
+    ```
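
The README's "look inside the Zip" idea can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the package's own scanner: the archive layout (`archive/data.pkl`), the helper names `list_pickle_imports` and `imports_in_zip_model`, and the 10 MB read cap are all hypothetical choices for the example. The pickle is only disassembled with `pickletools.genops`; it is never loaded.

```python
import io
import pickle
import pickletools
import subprocess
import zipfile

MAX_PICKLE_BYTES = 10 * 1024 * 1024  # assumed read cap for this sketch


def list_pickle_imports(data: bytes) -> list[tuple[str, str]]:
    """Return (module, name) pairs referenced by GLOBAL / STACK_GLOBAL opcodes."""
    imports = []
    recent_strings = []  # last two string literals seen before a STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(io.BytesIO(data)):
        if opcode.name in ("SHORT_BINUNICODE", "BINUNICODE", "UNICODE"):
            recent_strings = (recent_strings + [arg])[-2:]
        elif opcode.name == "GLOBAL":
            # pickletools reports the argument as "module name"
            module, name = arg.split(" ", 1)
            imports.append((module, name))
        elif opcode.name == "STACK_GLOBAL" and len(recent_strings) == 2:
            imports.append((recent_strings[0], recent_strings[1]))
            recent_strings = []
    return imports


def imports_in_zip_model(blob: bytes) -> dict[str, list[tuple[str, str]]]:
    """Map each .pkl member of a Zip-based model file to the imports it references."""
    found = {}
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        for member in archive.namelist():
            if member.endswith(".pkl"):
                with archive.open(member) as fh:
                    found[member] = list_pickle_imports(fh.read(MAX_PICKLE_BYTES))
    return found


# Build a stand-in "model" in memory: a Zip with one pickle that merely
# references subprocess.run. Pickling a function by reference runs nothing.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("archive/data.pkl", pickle.dumps(subprocess.run, protocol=4))
    archive.writestr("archive/version", "3\n")

report = imports_in_zip_model(buffer.getvalue())
print(report)
```

Because the reference to `subprocess.run` is recovered from opcodes alone, the same approach works on files that would be unsafe to open with a real unpickler.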

aisbom_cli-0.1.0/aisbom/__init__.py ADDED

@@ -0,0 +1 @@
+# This file makes aisbom a Python package

aisbom_cli-0.1.0/aisbom/cli.py ADDED

@@ -0,0 +1,93 @@
+import typer
+import json
+from rich.console import Console
+from rich.table import Table
+from rich.panel import Panel
+from cyclonedx.model.bom import Bom
+from cyclonedx.model.component import Component, ComponentType
+from cyclonedx.output.json import JsonV1Dot5, JsonV1Dot6
+# Import our new logic engine
+from .scanner import DeepScanner
+app = typer.Typer()
+console = Console()
+@app.callback(invoke_without_command=True)
+def main(
+    directory: str = typer.Argument(".", help="Target directory to scan"),
+    output: str = typer.Option("sbom.json", help="Output file path"),
+    schema_version: str = typer.Option("1.6", help="CycloneDX schema version (default is 1.6)", case_sensitive=False, rich_help_panel="Advanced Options")
+):
+    """
+    Deep Introspection Scan: Analyzes binary headers and dependency manifests.
+    """
+    console.print(Panel.fit(f"🚀 [bold cyan]AIsbom[/bold cyan] Scanning: [underline]{directory}[/underline]"))
+    # 1. Run the Logic
+    scanner = DeepScanner(directory)
+    results = scanner.scan()
+    # 2. Render Results (UI)
+    if results['artifacts']:
+        table = Table(title="🧠 AI Model Artifacts Found")
+        table.add_column("Filename", style="cyan")
+        table.add_column("Framework", style="magenta")
+        table.add_column("Risk Level", style="bold red")
+        table.add_column("Metadata", style="dim")
+        for art in results['artifacts']:
+            risk_style = "green" if "LOW" in art['risk_level'] else "red"
+            table.add_row(
+                art['name'],
+                art['framework'],
+                f"[{risk_style}]{art['risk_level']}[/{risk_style}]",
+                str(art.get('details', ''))[:40] + "..."
+            )
+        console.print(table)
+    else:
+        console.print("[yellow]No AI models found.[/yellow]")
+    if results['dependencies']:
+        console.print(f"\n📦 Found [bold]{len(results['dependencies'])}[/bold] Python libraries.")
+    if results['errors']:
+        console.print("\n[bold red]⚠️ Errors Encountered:[/bold red]")
+        for err in results['errors']:
+            console.print(f"  - Could not parse [yellow]{err['file']}[/yellow]: {err['error']}")
+    # 3. Generate CycloneDX SBOM (Standard Compliance)
+    bom = Bom()
+    # Add Models
+    for art in results['artifacts']:
+        c = Component(
+            name=art['name'],
+            type=ComponentType.MACHINE_LEARNING_MODEL,
+            # We shove our risk assessment into the description for now
+            description=f"Risk: {art['risk_level']} | Framework: {art['framework']}"
+        )
+        bom.components.add(c)
+    # Add Libraries
+    for dep in results['dependencies']:
+        c = Component(
+            name=dep['name'],
+            version=dep['version'],
+            type=ComponentType.LIBRARY
+        )
+        bom.components.add(c)
+    # 4. Save to Disk
+    if schema_version == "1.5":
+        outputter = JsonV1Dot5(bom)
+    else:
+        outputter = JsonV1Dot6(bom)
+    with open(output, "w") as f:
+        f.write(outputter.output_as_string())
+    console.print(f"\n[bold green]✔ Compliance Artifact Generated:[/bold green] {output} (CycloneDX v{schema_version})")
+if __name__ == "__main__":
+    app()

aisbom_cli-0.1.0/aisbom/safety.py ADDED

@@ -0,0 +1,54 @@
+import pickletools
+import io
+from typing import List
+# The "Blocklist" of dangerous modules and functions
+# If a pickle references any of these, loading it can execute arbitrary code.
+DANGEROUS_GLOBALS = {
+    "os": {"system", "popen", "execl", "execvp"},
+    "subprocess": {"Popen", "call", "check_call", "check_output", "run"},
+    "builtins": {"eval", "exec", "compile", "open"},
+    "posix": {"system", "popen"},
+    "webbrowser": {"open"},
+    "socket": {"socket", "connect"},
+}
+def scan_pickle_stream(data: bytes) -> List[str]:
+    """
+    Disassembles a pickle stream and checks for dangerous imports.
+    Returns a list of detected threats (e.g., ["os.system"]).
+    """
+    threats = []
+    memo = []  # Used to track recent string literals for STACK_GLOBAL
+    try:
+        stream = io.BytesIO(data)
+        for opcode, arg, pos in pickletools.genops(stream):
+            # Track the last few string literals we've seen on the stack
+            if opcode.name in ("SHORT_BINUNICODE", "UNICODE", "BINUNICODE"):
+                memo.append(arg)
+                if len(memo) > 2:
+                    memo.pop(0)
+            if opcode.name == "GLOBAL":
+                # pickletools reports the arg as "module name" (joined by a space)
+                if isinstance(arg, str) and " " in arg:
+                    module, name = arg.split(" ", 1)
+                    if module in DANGEROUS_GLOBALS and name in DANGEROUS_GLOBALS[module]:
+                        threats.append(f"{module}.{name}")
+            elif opcode.name == "STACK_GLOBAL":
+                # Takes two arguments from the stack: module and name
+                if len(memo) == 2:
+                    module, name = memo
+                    if module in DANGEROUS_GLOBALS and name in DANGEROUS_GLOBALS[module]:
+                        threats.append(f"{module}.{name}")
+                # Clear memo after use to avoid false positives
+                memo.clear()
+    except Exception:
+        # Avoid crashing on malformed pickles; threats found so far are still returned
+        pass
+    return threats
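
For reference, here is a short sketch of what `pickletools.genops` yields for the two import opcodes that the scanner inspects. The helper name `opcode_trace` is hypothetical. Pickling a function by reference (here `subprocess.run`) only records its module and name, so nothing executes. `fix_imports=False` is passed for protocol 2 because the default setting rewrites some Python 3 module names to their Python 2 equivalents.

```python
import pickle
import pickletools
import subprocess


def opcode_trace(data: bytes) -> list[tuple[str, object]]:
    """Return (opcode name, argument) pairs without unpickling anything."""
    return [(op.name, arg) for op, arg, _pos in pickletools.genops(data)]


# Protocol 2 emits a single GLOBAL opcode; pickletools joins the module and
# attribute name with a space in the reported argument.
trace_v2 = opcode_trace(pickle.dumps(subprocess.run, protocol=2, fix_imports=False))

# Protocol 4 pushes two strings and then emits STACK_GLOBAL with no argument.
trace_v4 = opcode_trace(pickle.dumps(subprocess.run, protocol=4))

for name, arg in trace_v2 + trace_v4:
    if name in ("GLOBAL", "STACK_GLOBAL", "SHORT_BINUNICODE"):
        print(f"{name:<18} {arg!r}")
```

The practical consequence is that a detector has to handle two shapes: one opcode carrying both names, and one argument-less opcode whose operands are the two preceding string pushes.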

aisbom_cli-0.1.0/aisbom/scanner.py ADDED

@@ -0,0 +1,126 @@
+import os
+import json
+import zipfile
+import struct
+from typing import List, Dict, Any
+from pathlib import Path
+from .safety import scan_pickle_stream
+from pip_requirements_parser import RequirementsFile
+# Constants for file types make the code cleaner and easier to extend
+PYTORCH_EXTENSIONS = {'.pt', '.pth', '.bin'}
+SAFETENSORS_EXTENSION = '.safetensors'
+REQUIREMENTS_FILENAME = 'requirements.txt'
+class DeepScanner:
+    def __init__(self, root_path: str):
+        self.root_path = Path(root_path)
+        self.artifacts = []
+        self.dependencies = []
+        self.errors = []
+    def scan(self):
+        """Orchestrates the scan of the directory."""
+        # Use rglob for a more concise way to recursively find files
+        for full_path in self.root_path.rglob("*"):
+            if full_path.is_file():
+                ext = full_path.suffix.lower()
+                # 1. Scan AI Artifacts
+                if ext in PYTORCH_EXTENSIONS:
+                    self.artifacts.append(self._inspect_pytorch(full_path))
+                elif ext == SAFETENSORS_EXTENSION:
+                    self.artifacts.append(self._inspect_safetensors(full_path))
+                # 2. Scan Dependency Manifests
+                elif full_path.name == REQUIREMENTS_FILENAME:
+                    self._parse_requirements(full_path)
+        return {"artifacts": self.artifacts, "dependencies": self.dependencies, "errors": self.errors}
+    def _inspect_pytorch(self, path: Path) -> Dict[str, Any]:
+        """Peeks inside a PyTorch file structure and SCANS for malware."""
+        meta = {
+            "name": path.name,
+            "type": "machine-learning-model",
+            "framework": "PyTorch",
+            "risk_level": "UNKNOWN",
+            "details": {}
+        }
+        try:
+            if zipfile.is_zipfile(path):
+                with zipfile.ZipFile(path, 'r') as z:
+                    files = z.namelist()
+                    # 1. Find the data file (usually archive/data.pkl or just data.pkl)
+                    pickle_files = [f for f in files if f.endswith('.pkl')]
+                    threats = []
+                    if pickle_files:
+                        # 2. Extract and Scan the pickle bytes
+                        # We only scan the first few MBs or the main file to be fast
+                        main_pkl = pickle_files[0]
+                        with z.open(main_pkl) as f:
+                            # Read first 10MB max to prevent zip bombs
+                            content = f.read(10 * 1024 * 1024)
+                            threats = scan_pickle_stream(content)
+                    # 3. Assess Risk
+                    if threats:
+                        meta["risk_level"] = f"CRITICAL (RCE Detected: {', '.join(threats)})"
+                    elif pickle_files:
+                        meta["risk_level"] = "MEDIUM (Pickle Present)"
+                    else:
+                        meta["risk_level"] = "LOW (No bytecode found)"
+                    meta["details"] = {"internal_files": len(files), "threats": threats}
+            else:
+                meta["risk_level"] = "CRITICAL (Legacy Binary)"
+        except Exception as e:
+            meta["error"] = str(e)
+        return meta
+    def _inspect_safetensors(self, path: Path) -> Dict[str, Any]:
+        """Reads the JSON header from a .safetensors file."""
+        meta = {
+            "name": path.name,
+            "type": "machine-learning-model",
+            "framework": "SafeTensors",
+            "risk_level": "LOW", # Safe by design
+            "details": {}
+        }
+        try:
+            with open(path, 'rb') as f:
+                # First 8 bytes = header length
+                length_bytes = f.read(8)
+                if len(length_bytes) == 8:
+                    header_len = struct.unpack('<Q', length_bytes)[0]
+                    header_json = json.loads(f.read(header_len))
+                    meta["details"] = {
+                        "tensors": len([k for k in header_json if k != "__metadata__"]),
+                        "metadata": header_json.get("__metadata__", {})
+                    }
+        except Exception as e:
+            meta["error"] = str(e)
+        return meta
+    def _parse_requirements(self, path: Path):
+        """Parses requirements.txt into individual components."""
+        try:
+            req_file = RequirementsFile.from_file(path)
+            for req in req_file.requirements:
+                if req.name:
+                    # Robust version extraction
+                    version = "unknown"
+                    specs = list(req.specifier) if req.specifier else []
+                    if specs:
+                        # Grab the first version number found, e.g. "==1.2.0" -> "1.2.0"
+                        version = specs[0].version
+                    self.dependencies.append({
+                        "name": req.name,
+                        "version": version,
+                        "type": "library"
+                    })
+        except Exception as e:
+            self.errors.append({"file": str(path), "error": str(e)})
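
The safetensors header read can be sketched in isolation as follows. This is an illustrative variant, not the package's code: the function names `read_safetensors_header` and `summarize` and the 100 MB cap (`MAX_HEADER_BYTES`) are assumptions for the example. It reads the 8-byte little-endian length, refuses implausibly large headers before allocating, and counts tensors separately from the optional `__metadata__` entry.

```python
import io
import json
import struct

MAX_HEADER_BYTES = 100 * 1024 * 1024  # assumed safety cap for this sketch


def read_safetensors_header(stream) -> dict:
    """Parse the JSON header: 8-byte little-endian length, then that many bytes."""
    prefix = stream.read(8)
    if len(prefix) != 8:
        raise ValueError("file too short to hold a header length")
    (header_len,) = struct.unpack("<Q", prefix)
    if header_len > MAX_HEADER_BYTES:
        raise ValueError(f"declared header length {header_len} exceeds cap")
    raw = stream.read(header_len)
    if len(raw) != header_len:
        raise ValueError("header is truncated")
    return json.loads(raw)


def summarize(header: dict) -> dict:
    """Count tensor entries, keeping the optional __metadata__ block separate."""
    tensors = {k: v for k, v in header.items() if k != "__metadata__"}
    return {"tensors": len(tensors), "metadata": header.get("__metadata__", {})}


# Build a tiny stand-in file in memory: one 2x2 float32 tensor (16 bytes).
header = {
    "__metadata__": {"format": "pt"},
    "weight": {"dtype": "F32", "shape": [2, 2], "data_offsets": [0, 16]},
}
header_bytes = json.dumps(header).encode("utf-8")
blob = struct.pack("<Q", len(header_bytes)) + header_bytes + bytes(16)

print(summarize(read_safetensors_header(io.BytesIO(blob))))
# prints {'tensors': 1, 'metadata': {'format': 'pt'}}
```

Checking the declared length before calling `read` matters because the length field is attacker-controlled; without a cap, a forged value could request an enormous allocation.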

aisbom_cli-0.1.0/pyproject.toml ADDED

@@ -0,0 +1,21 @@
+[tool.poetry]
+name = "aisbom-cli"
+version = "0.1.0"
+description = "An AI Supply Chain security tool that detects Pickle bombs and generates CycloneDX SBOMs for Machine Learning models."
+authors = ["Ajoy L <lab700xdev@gmail.com>"]
+readme = "README.md"
+packages = [{include = "aisbom"}]
+repository = "https://github.com/Lab700xOrg/aisbom"
+[tool.poetry.dependencies]
+python = "^3.11"
+typer = {extras = ["all"], version = "^0.12.5"}
+rich = "^13.7.1"
+cyclonedx-python-lib = "^8.5.0"
+pip-requirements-parser = "^32.0.1"
+click = "<8.2.0"
+[build-system]
+requires = ["poetry-core"]
+build-backend = "poetry.core.masonry.api"
+[tool.poetry.scripts]
+aisbom = "aisbom.cli:app"