PyPI - coreinsight-cli - Versions diffs - 0.2.9__tar.gz → 0.3.0__tar.gz - Mend

coreinsight-cli 0.2.9tar.gz → 0.3.0tar.gz

Files changed (32)

coreinsight_cli-0.3.0/PKG-INFO ADDED

@@ -0,0 +1,173 @@
+Metadata-Version: 2.4
+Name: coreinsight-cli
+Version: 0.3.0
+Summary: Local-first AI performance profiler that mathematically verifies optimizations for Python, C++, and CUDA
+Author: Varun Jani
+License: GPL-3.0-or-later
+Project-URL: Homepage, https://github.com/Prais3/coreinsight_cli
+Project-URL: Bug Tracker, https://github.com/Prais3/coreinsight_cli/issues
+Keywords: performance,profiling,optimization,llm,cuda,cpp,python,hpc,benchmarking
+Classifier: Development Status :: 3 - Alpha
+Classifier: Intended Audience :: Developers
+Classifier: License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.9
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Topic :: Software Development :: Debuggers
+Classifier: Topic :: System :: Hardware
+Requires-Python: >=3.9
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Requires-Dist: rich>=13.0
+Requires-Dist: docker>=6.0
+Requires-Dist: tree-sitter==0.21.3
+Requires-Dist: tree-sitter-languages
+Requires-Dist: langchain>=0.2.0
+Requires-Dist: langchain-core>=0.2.0
+Requires-Dist: langchain-ollama>=0.1.0
+Requires-Dist: langchain-google-genai>=1.0.0
+Requires-Dist: langchain-openai>=0.1.0
+Requires-Dist: langchain-anthropic>=0.1.0
+Requires-Dist: pydantic>=2.0
+Requires-Dist: chromadb>=0.5.0
+Requires-Dist: sentence-transformers>=3.0.0
+Requires-Dist: textual>=0.60.0
+Requires-Dist: psutil>=5.9
+Provides-Extra: compat
+Requires-Dist: pysqlite3-binary>=0.5.0; extra == "compat"
+Dynamic: license-file
+# CoreInsight
+**AI-powered performance profiler for Python, C++, and CUDA.**
+CoreInsight finds hardware bottlenecks in your code, generates optimized replacements, and verifies the speedup mathematically inside an isolated Docker sandbox — all running locally on your machine.
+---
+## Install
+```bash
+pip install coreinsight-cli
+```
+**Requirements:** Python 3.9+ · Docker Desktop · [Ollama](https://ollama.com/download) (for local inference)
+---
+## Quick start
+```bash
+# Configure your AI provider (defaults to Ollama + llama3.2)
+coreinsight configure
+# Run the built-in demo
+coreinsight demo
+# Analyse your own file
+coreinsight analyze path/to/your_file.py
+```
+---
+## What it does
+CoreInsight runs a full optimization pipeline on every function it extracts:
+1. **Bottleneck analysis**
+2. **Code generation**
+3. **Sandbox verification**
+4. **Hardware profiling**
+Every result is stored in a local vector database. On repeat analyses, matching patterns are recalled instantly — no LLM call, no sandbox spin-up.
+---
+## Commands
+| Command | Description |
+|:--------|:------------|
+| `coreinsight analyze <file>` | Analyse a `.py`, `.cpp`, or `.cu` file |
+| `coreinsight demo [--lang cpp]` | Run on a built-in example |
+| `coreinsight configure` | Set up AI provider and API keys |
+| `coreinsight configure --pro-key <key>` | Activate Pro tier |
+| `coreinsight memory` | Inspect stored optimizations |
+| `coreinsight memory --clear` | Wipe the memory store |
+| `coreinsight memory --export out.csv` | Export memory to CSV or Markdown |
+| `coreinsight index [--dir <path>]` | Index a repo for cross-file RAG context |
+| `coreinsight scan [--dir <path>]` | Rank hotspots by complexity without LLM |
+| `coreinsight view` | Launch the interactive TUI |
+All commands accept `--no-docker` to skip sandboxing when Docker is unavailable.
+---
+## Supported languages
+| Language | Analysis | Benchmarking | Correctness |
+|:---------|:--------:|:------------:|:-----------:|
+| Python   | ✅ | ✅ | ✅ |
+| C++      | ✅ | ✅ | ✅ |
+| CUDA     | ✅ | ✅ | — |
+---
+## AI providers
+| Provider | Tier | Notes |
+|:---------|:----:|:------|
+| Ollama | Free | `ollama pull llama3.2` |
+| LM Studio / vLLM | Free | Any OpenAI-compatible server |
+| OpenAI | Pro | GPT 5.3 recommended |
+| Anthropic | Pro | Claude 4.6 Sonnet recommended |
+| Google Gemini | Pro | Gemini 2.5 Pro recommended |
+Local providers run entirely on-device. No code leaves your machine unless you configure a cloud provider.
+---
+## Pro — free during beta
+Pro unlocks cloud providers and AI-free hardware profiling.
+Keys are being distributed manually during the beta.
+**Request a key → [tally.so/r/xXZ9YE](https://tally.so/r/xXZ9YE)**
+```bash
+coreinsight configure --pro-key <your-key>
+```
+---
+## Privacy
+- **Local providers** — nothing leaves your machine
+- **Cloud providers** — only the function code you analyse is sent to the provider API, under your own key
+- The memory store lives at `~/.coreinsight/memory_db` on your filesystem
+---
+## Troubleshooting
+**Docker not running**
+```
+open Docker Desktop, or: sudo systemctl start docker
+```
+**Ollama model not found**
+```bash
+ollama pull llama3.2
+```
+**ChromaDB / SQLite error**
+```bash
+pip install pysqlite3-binary
+```
+---
+## Links
+- PyPI: [pypi.org/project/coreinsight-cli](https://pypi.org/project/coreinsight-cli/)
+- GitHub: [github.com/Prais3/coreinsight_cli](https://github.com/Prais3/coreinsight_cli)

coreinsight_cli-0.3.0/README.md ADDED

@@ -0,0 +1,133 @@
+# CoreInsight
+**AI-powered performance profiler for Python, C++, and CUDA.**
+CoreInsight finds hardware bottlenecks in your code, generates optimized replacements, and verifies the speedup mathematically inside an isolated Docker sandbox — all running locally on your machine.
+---
+## Install
+```bash
+pip install coreinsight-cli
+```
+**Requirements:** Python 3.9+ · Docker Desktop · [Ollama](https://ollama.com/download) (for local inference)
+---
+## Quick start
+```bash
+# Configure your AI provider (defaults to Ollama + llama3.2)
+coreinsight configure
+# Run the built-in demo
+coreinsight demo
+# Analyse your own file
+coreinsight analyze path/to/your_file.py
+```
+---
+## What it does
+CoreInsight runs a full optimization pipeline on every function it extracts:
+1. **Bottleneck analysis**
+2. **Code generation**
+3. **Sandbox verification**
+4. **Hardware profiling**
+Every result is stored in a local vector database. On repeat analyses, matching patterns are recalled instantly — no LLM call, no sandbox spin-up.
+---
+## Commands
+| Command | Description |
+|:--------|:------------|
+| `coreinsight analyze <file>` | Analyse a `.py`, `.cpp`, or `.cu` file |
+| `coreinsight demo [--lang cpp]` | Run on a built-in example |
+| `coreinsight configure` | Set up AI provider and API keys |
+| `coreinsight configure --pro-key <key>` | Activate Pro tier |
+| `coreinsight memory` | Inspect stored optimizations |
+| `coreinsight memory --clear` | Wipe the memory store |
+| `coreinsight memory --export out.csv` | Export memory to CSV or Markdown |
+| `coreinsight index [--dir <path>]` | Index a repo for cross-file RAG context |
+| `coreinsight scan [--dir <path>]` | Rank hotspots by complexity without LLM |
+| `coreinsight view` | Launch the interactive TUI |
+All commands accept `--no-docker` to skip sandboxing when Docker is unavailable.
+---
+## Supported languages
+| Language | Analysis | Benchmarking | Correctness |
+|:---------|:--------:|:------------:|:-----------:|
+| Python   | ✅ | ✅ | ✅ |
+| C++      | ✅ | ✅ | ✅ |
+| CUDA     | ✅ | ✅ | — |
+---
+## AI providers
+| Provider | Tier | Notes |
+|:---------|:----:|:------|
+| Ollama | Free | `ollama pull llama3.2` |
+| LM Studio / vLLM | Free | Any OpenAI-compatible server |
+| OpenAI | Pro | GPT 5.3 recommended |
+| Anthropic | Pro | Claude 4.6 Sonnet recommended |
+| Google Gemini | Pro | Gemini 2.5 Pro recommended |
+Local providers run entirely on-device. No code leaves your machine unless you configure a cloud provider.
+---
+## Pro — free during beta
+Pro unlocks cloud providers and AI-free hardware profiling.
+Keys are being distributed manually during the beta.
+**Request a key → [tally.so/r/xXZ9YE](https://tally.so/r/xXZ9YE)**
+```bash
+coreinsight configure --pro-key <your-key>
+```
+---
+## Privacy
+- **Local providers** — nothing leaves your machine
+- **Cloud providers** — only the function code you analyse is sent to the provider API, under your own key
+- The memory store lives at `~/.coreinsight/memory_db` on your filesystem
+---
+## Troubleshooting
+**Docker not running**
+```
+open Docker Desktop, or: sudo systemctl start docker
+```
+**Ollama model not found**
+```bash
+ollama pull llama3.2
+```
+**ChromaDB / SQLite error**
+```bash
+pip install pysqlite3-binary
+```
+---
+## Links
+- PyPI: [pypi.org/project/coreinsight-cli](https://pypi.org/project/coreinsight-cli/)
+- GitHub: [github.com/Prais3/coreinsight_cli](https://github.com/Prais3/coreinsight_cli)
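
The "ChromaDB / SQLite error" entry in the README above only gives the `pip install pysqlite3-binary` step. Installing that package does not by itself change which SQLite library Python's `sqlite3` module uses, so a common companion step is to swap the module before ChromaDB is imported. The sketch below shows that swap under stated assumptions: the helper name `ensure_modern_sqlite` is hypothetical, the `(3, 35, 0)` threshold is the minimum ChromaDB is generally documented to require, and nothing in this diff confirms whether coreinsight-cli performs the swap itself.

```python
import sqlite3
import sys

# Assumed minimum SQLite version for ChromaDB; verify against the ChromaDB docs.
MIN_SQLITE = (3, 35, 0)


def ensure_modern_sqlite() -> bool:
    """Return True if a new-enough SQLite is available, swapping in pysqlite3 if needed.

    Hypothetical helper: call it before anything imports chromadb.
    """
    if sqlite3.sqlite_version_info >= MIN_SQLITE:
        return True  # the system SQLite is already recent enough
    try:
        import pysqlite3  # provided by `pip install pysqlite3-binary`
    except ImportError:
        return False  # caller can print the README's install hint
    # Later `import sqlite3` statements now resolve to the bundled build.
    sys.modules["sqlite3"] = pysqlite3
    return pysqlite3.sqlite_version_info >= MIN_SQLITE


if __name__ == "__main__":
    print("SQLite OK" if ensure_modern_sqlite() else "Run: pip install pysqlite3-binary")
```

The swap only affects imports that happen after the call, which is why it has to run at program start rather than at the point where the error is caught.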

{coreinsight_cli-0.2.9 → coreinsight_cli-0.3.0}/coreinsight/analyzer.py RENAMED

@@ -1,6 +1,6 @@
 import re
 import logging
-from typing import Optional, List
+from typing import Callable, Optional, List
 from pydantic import BaseModel, Field
 from langchain_core.output_parsers import JsonOutputParser
@@ -45,6 +45,43 @@ def _is_truncated(raw: str) -> bool:
 logger = logging.getLogger(__name__)
+# -------------------------------------------------------------------------------------
+# Prompt compression: keeps SMALL-tier models (≤7B) within their 4 096-token context budget.
+# -------------------------------------------------------------------------------------
+_SMALL_CONTEXT_CHAR_LIMIT = 1_200   # ~300 tokens — enough for signatures
+_SMALL_CODE_CHAR_LIMIT    = 2_000   # ~500 tokens — function body cap
+def _compress_for_small_model(
+    context: str,
+    code: str,
+    model_tier: str,
+) -> tuple:
+    """
+    Aggressively trims RAG context and target code for SMALL-tier models so
+    the entire prompt + format instructions + response fit within 4 096 tokens.
+    Returns (compressed_context, compressed_code). No-op for MEDIUM / LARGE.
+    """
+    from coreinsight.prompts import ModelTier
+    if model_tier != ModelTier.SMALL:
+        return context, code
+    if context and len(context) > _SMALL_CONTEXT_CHAR_LIMIT:
+        context = (
+            context[:_SMALL_CONTEXT_CHAR_LIMIT]
+            + "\n\n# [context truncated — top dependencies shown only]"
+        )
+    if code and len(code) > _SMALL_CODE_CHAR_LIMIT:
+        lines = code.splitlines()
+        kept  = lines[:60]
+        if len(lines) > 60:
+            kept.append(
+                f"# ... [{len(lines) - 60} lines truncated for small model]"
+            )
+        code = "\n".join(kept)
+    return context, code
 class Bottleneck(BaseModel):
     line: int = Field(description="The approximate line number of the issue in the original code")
@@ -255,30 +292,59 @@ class AnalyzerAgent:
         )
         self.chain = self.prompt | self.json_llm | self.parser
-    def analyze(self, code: str, language: str, context: str = "", hardware_target: str = "Generic CPU"):
+    def analyze(
+        self,
+        code: str,
+        language: str,
+        context: str = "",
+        hardware_target: str = "Generic CPU",
+        stream_callback: Optional[Callable[[str], None]] = None,
+    ):
+        context, code = _compress_for_small_model(context, code, self.model_tier)
         try:
+            if stream_callback is not None:
+                # Stream raw tokens → accumulate → parse at end.
+                # Keeps the cursor alive on slow local models instead of hanging.
+                raw_chain   = self.prompt | self.json_llm
+                accumulated = ""
+                for chunk in raw_chain.stream({
+                    "language":        language,
+                    "code_content":    code,
+                    "context":         context,
+                    "hardware_target": hardware_target,
+                }):
+                    token = chunk.content if hasattr(chunk, "content") else str(chunk)
+                    if isinstance(token, list):
+                        token = "".join(
+                            t.get("text", "") if isinstance(t, dict) else str(t)
+                            for t in token
+                        )
+                    if token:
+                        accumulated += token
+                        stream_callback(token)
+                return self.parser.parse(accumulated)
             return self.chain.invoke({
-                "language": language,
-                "code_content": code,
-                "context": context,
+                "language":        language,
+                "code_content":    code,
+                "context":         context,
                 "hardware_target": hardware_target,
             })
         except OutputParserException:
             return {
-                "severity": "Error",
-                "issue": "AI Output Parsing Failed",
-                "reasoning": "The model failed to return valid JSON.",
-                "suggestion": "Try running the analysis again or use a larger parameter model.",
-                "bottlenecks": [],
+                "severity":       "Error",
+                "issue":          "AI Output Parsing Failed",
+                "reasoning":      "The model failed to return valid JSON.",
+                "suggestion":     "Try running the analysis again or use a larger parameter model.",
+                "bottlenecks":    [],
                 "optimized_code": None,
             }
         except Exception as e:
             return {
-                "severity": "Error",
-                "issue": str(e),
-                "reasoning": "System error during analysis pipeline.",
-                "suggestion": "Check LLM API keys and connectivity.",
-                "bottlenecks": [],
+                "severity":       "Error",
+                "issue":          str(e),
+                "reasoning":      "System error during analysis pipeline.",
+                "suggestion":     "Check LLM API keys and connectivity.",
+                "bottlenecks":    [],
                 "optimized_code": None,
             }
@@ -296,11 +362,37 @@ class AnalyzerAgent:
             lines.pop(0)
         return "\n".join(lines).strip()
-    def _invoke_code_chain(self, template: str, variables: dict, language: str) -> str:
+    def _invoke_code_chain(
+        self,
+        template: str,
+        variables: dict,
+        language: str,
+        stream_callback: Optional[Callable[[str], None]] = None,
+    ) -> str:
         """Shared invocation + extraction logic for harness and fix chains."""
         chain = PromptTemplate.from_template(template) | self.base_llm
         try:
-            result = chain.invoke(variables)
+            if stream_callback is not None:
+                accumulated = ""
+                for chunk in chain.stream(variables):
+                    token = chunk.content if hasattr(chunk, "content") else str(chunk)
+                    if isinstance(token, list):
+                        token = "".join(
+                            t.get("text", "") if isinstance(t, dict) else str(t)
+                            for t in token
+                        )
+                    if token:
+                        accumulated += token
+                        stream_callback(token)
+                raw = accumulated
+            else:
+                result = chain.invoke(variables)
+                raw = result.content if hasattr(result, "content") else str(result)
+                if isinstance(raw, list):
+                    raw = "\n".join(
+                        item["text"] if isinstance(item, dict) and "text" in item else str(item)
+                        for item in raw
+                    )
         except Exception as e:
             err = str(e).lower()
             if any(h in err for h in _TRUNCATION_HINTS):
@@ -309,12 +401,6 @@ class AnalyzerAgent:
                     f"or a model with a larger context window. Detail: {e}"
                 ) from e
             raise
-        raw = result.content if hasattr(result, "content") else str(result)
-        if isinstance(raw, list):
-            raw = "\n".join(
-                item["text"] if isinstance(item, dict) and "text" in item else str(item)
-                for item in raw
-            )
         if _is_truncated(raw):
             logger.warning(
                 f"LLM output appears truncated (len={len(raw)}). "
@@ -334,21 +420,26 @@ class AnalyzerAgent:
         language: str,
         context: str = "",
         hardware_target: str = "Generic CPU",
+        stream_callback: Optional[Callable[[str], None]] = None,
     ) -> str:
         try:
+            context, original_code = _compress_for_small_model(
+                context, original_code, self.model_tier
+            )
             tiered_template = _HARNESS_TEMPLATE + HARNESS_ADDENDUM.get(self.model_tier, "")
             return self._invoke_code_chain(
                 tiered_template,
                 {
-                    "language": language,
-                    "func_name": func_name,
-                    "original": original_code,
-                    "optimized": optimized_code,
-                    "context": context,
+                    "language":        language,
+                    "func_name":       func_name,
+                    "original":        original_code,
+                    "optimized":       optimized_code,
+                    "context":         context,
                     "hardware_target": hardware_target,
                 },
                 language,
+                stream_callback=stream_callback,
             )
         except Exception as e:
             is_python = language.lower() == "python"
@@ -365,21 +456,26 @@ class AnalyzerAgent:
         error_logs: str,
         language: str,
         context: str = "",
+        stream_callback: Optional[Callable[[str], None]] = None,
     ) -> str:
         try:
+            context, original_code = _compress_for_small_model(
+                context, original_code, self.model_tier
+            )
             tiered_template = _FIX_TEMPLATE + HARNESS_ADDENDUM.get(self.model_tier, "")
             return self._invoke_code_chain(
                 tiered_template,
                 {
-                    "language": language,
-                    "func_name": func_name,
-                    "original": original_code,
-                    "bad_harness": bad_harness,
+                    "language":   language,
+                    "func_name":  func_name,
+                    "original":   original_code,
+                    "bad_harness":bad_harness,
                     "error_logs": error_logs,
-                    "context": context,
+                    "context":    context,
                 },
                 language,
+                stream_callback=stream_callback,
             )
         except Exception as e:
             is_python = language.lower() == "python"
@@ -577,8 +673,29 @@ class BottleneckAgent:
         language:        str,
         context:         str = "",
         hardware_target: str = "Generic CPU",
+        stream_callback: Optional[Callable[[str], None]] = None,
     ) -> dict:
+        context, code = _compress_for_small_model(context, code, self.model_tier)
         try:
+            if stream_callback is not None:
+                raw_chain   = self._prompt | self._json_llm
+                accumulated = ""
+                for chunk in raw_chain.stream({
+                    "language":        language,
+                    "code_content":    code,
+                    "context":         context,
+                    "hardware_target": hardware_target,
+                }):
+                    token = chunk.content if hasattr(chunk, "content") else str(chunk)
+                    if isinstance(token, list):
+                        token = "".join(
+                            t.get("text", "") if isinstance(t, dict) else str(t)
+                            for t in token
+                        )
+                    if token:
+                        accumulated += token
+                        stream_callback(token)
+                return self.parser.parse(accumulated)
             return self._chain.invoke({
                 "language":        language,
                 "code_content":    code,
@@ -651,6 +768,9 @@ class OptimizerAgent:
         Returns original_code on any failure so the pipeline can continue.
         """
         try:
+            context, original_code = _compress_for_small_model(
+                context or "", original_code, self.model_tier
+            )
             chain  = PromptTemplate.from_template(self._template) | self._base_llm
             result = chain.invoke({
                 "language":        language,
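
The analyzer.py hunks above add two behaviours: prompt trimming for small models, and a streamed-token loop that appears three times with the same chunk normalisation. The sketch below reproduces both in isolation so they can be run without LangChain. The names `compress`, `chunk_text`, and `accumulate` are hypothetical, the `is_small` flag stands in for the `ModelTier.SMALL` comparison, and the truncation messages are shortened. It is an illustration of the logic in the diff, not the package's code.

```python
from typing import Callable, Iterable, Optional, Tuple

SMALL_CONTEXT_CHAR_LIMIT = 1_200  # mirrors _SMALL_CONTEXT_CHAR_LIMIT in the diff
SMALL_CODE_CHAR_LIMIT = 2_000     # mirrors _SMALL_CODE_CHAR_LIMIT in the diff
MAX_CODE_LINES = 60               # the literal 60 used in the diff


def compress(context: str, code: str, is_small: bool) -> Tuple[str, str]:
    """Trim context by characters and code by lines; no-op unless is_small."""
    if not is_small:
        return context, code
    if len(context) > SMALL_CONTEXT_CHAR_LIMIT:
        context = context[:SMALL_CONTEXT_CHAR_LIMIT] + "\n\n# [context truncated]"
    if len(code) > SMALL_CODE_CHAR_LIMIT:
        lines = code.splitlines()
        kept = lines[:MAX_CODE_LINES]
        if len(lines) > MAX_CODE_LINES:
            kept.append(f"# ... [{len(lines) - MAX_CODE_LINES} lines truncated]")
        code = "\n".join(kept)
    return context, code


def chunk_text(chunk: object) -> str:
    """Normalise one streamed chunk to plain text.

    Covers the three shapes handled in the diff: an object with .content,
    a list of {"text": ...} parts, and anything else via str().
    """
    token = chunk.content if hasattr(chunk, "content") else str(chunk)
    if isinstance(token, list):
        token = "".join(
            part.get("text", "") if isinstance(part, dict) else str(part)
            for part in token
        )
    return token


def accumulate(
    chunks: Iterable[object],
    on_token: Optional[Callable[[str], None]] = None,
) -> str:
    """Forward each non-empty token to on_token and return the joined text."""
    parts = []
    for chunk in chunks:
        token = chunk_text(chunk)
        if token:
            parts.append(token)
            if on_token is not None:
                on_token(token)
    return "".join(parts)
```

One consequence of the trimming logic as written in the diff: the code cap is triggered by character count but applied by line count, so a function longer than 2,000 characters with 60 or fewer lines passes through essentially unchanged. Factoring the loop into a helper like `accumulate` is a design suggestion, not something the 0.3.0 release does.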

{coreinsight_cli-0.2.9 → coreinsight_cli-0.3.0}/coreinsight/config.py RENAMED

@@ -1,4 +1,6 @@
 import json
+import hashlib
+import urllib.request
 from pathlib import Path
 from rich.console import Console
 from rich.prompt import Prompt, Confirm
@@ -8,17 +10,17 @@ CONFIG_FILE = Path.home() / ".coreinsight" / "config.json"
 PRO_WAITLIST_URL = "https://tally.so/r/xXZ9YE"
-# Raw URL of GitHub Gist - beta testing for new pro users
-PRO_KEYS_GIST_URL = "https://gist.githubusercontent.com/Prais3/4a57cf927734c6678602ff2066fc080c/raw/b4347c6ffea869490afb9a828802ec882ecd0eca/valid_keys.json"
+# Cloudflare Worker endpoint for Pro key validation (v0.3.0+)
+PRO_KEY_VALIDATION_URL = "https://coreinsight.coreinsight-dev.workers.dev/"
 CLOUD_PROVIDERS = ["openai", "anthropic", "google"]
 FREE_TIER_LIMITS = {
-    "max_functions":     3,
-    "max_retries":       2,
-    "num_test_cases":    8,
+    "max_functions":      None,  # unlimited
+    "max_retries":        3,
+    "num_test_cases":     5,
     "hardware_profiling": False,
-    "max_files": 2,
+    "max_files":          None,
 }
 PRO_TIER_LIMITS = {
@@ -106,11 +108,20 @@ def run_configure(pro_key: str = None, agent_mode: str = None):
         key_hash = hashlib.sha256(pro_key.encode()).hexdigest()
         try:
-            req = urllib.request.Request(PRO_KEYS_GIST_URL)
-            with urllib.request.urlopen(req, timeout=5) as response:
-                valid_hashes = json.loads(response.read().decode())
-            if key_hash in valid_hashes:
+            payload = json.dumps({"hash": key_hash}).encode()
+            req = urllib.request.Request(
+                PRO_KEY_VALIDATION_URL,
+                data=payload,
+                headers={
+                    "Content-Type": "application/json",
+                    "User-Agent": "coreinsight-cli/0.3.0",
+                },
+                method="POST",
+            )
+            with urllib.request.urlopen(req, timeout=8) as response:
+                result = json.loads(response.read().decode())
+            if result.get("valid"):
                 config["pro"] = True
                 save_config(config)
                 console.print("[bold green]✅ Pro tier activated![/bold green]")
@@ -118,8 +129,11 @@ def run_configure(pro_key: str = None, agent_mode: str = None):
                 config["pro"] = False
                 save_config(config)
                 console.print("[red]❌ Invalid or revoked Pro key.[/red]")
-        except Exception as e:
-            console.print("[red]⚠️ Could not verify key. Please check your internet connection or try again later.[/red]")
+        except Exception:
+            console.print(
+                "[red]⚠️ Could not verify key — check your internet connection "
+                "or try again later.[/red]"
+            )
         return
     if agent_mode is not None:
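
The config.py change above replaces "download the list of valid hashes and check membership locally" with "POST one SHA-256 hash and read a `valid` flag from the JSON reply". The sketch below builds the same kind of request without sending it, so the payload shape can be inspected offline. The function names and the `example.invalid` URL are placeholders, and the `{"valid": ...}` response shape is inferred only from the `result.get("valid")` call in the diff.

```python
import hashlib
import json
import urllib.request

# Placeholder URL for illustration; the real endpoint is PRO_KEY_VALIDATION_URL in the diff.
VALIDATION_URL = "https://example.invalid/validate"


def build_validation_request(pro_key: str, url: str = VALIDATION_URL) -> urllib.request.Request:
    """Build (but do not send) a POST carrying only the SHA-256 hash of the key."""
    key_hash = hashlib.sha256(pro_key.encode()).hexdigest()
    payload = json.dumps({"hash": key_hash}).encode()
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def is_valid(response_body: bytes) -> bool:
    """Interpret a JSON reply; a missing 'valid' field counts as invalid."""
    result = json.loads(response_body.decode())
    return bool(result.get("valid"))
```

With this shape the raw key never leaves the machine, only its hash does, and the server no longer has to publish the full list of valid hashes as the Gist-based check in 0.2.9 did.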
