PyPI - odin-engine - Versions diffs - 0.1.0__tar.gz - Mend

odin-engine 0.1.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (70)

odin_engine-0.1.0/LICENSE +21 -0
odin_engine-0.1.0/MANIFEST.in +11 -0
odin_engine-0.1.0/PKG-INFO +456 -0
odin_engine-0.1.0/README.md +419 -0
odin_engine-0.1.0/benchmarks/__init__.py +17 -0
odin_engine-0.1.0/benchmarks/datasets.py +284 -0
odin_engine-0.1.0/benchmarks/metrics.py +275 -0
odin_engine-0.1.0/benchmarks/run_ablation.py +279 -0
odin_engine-0.1.0/benchmarks/run_npll_benchmark.py +270 -0
odin_engine-0.1.0/docs/AGENT_INTEGRATION_GUIDE.md +188 -0
odin_engine-0.1.0/docs/ARCHITECTURE.md +872 -0
odin_engine-0.1.0/npll/__init__.py +10 -0
odin_engine-0.1.0/npll/bootstrap.py +474 -0
odin_engine-0.1.0/npll/core/__init__.py +34 -0
odin_engine-0.1.0/npll/core/knowledge_graph.py +309 -0
odin_engine-0.1.0/npll/core/logical_rules.py +497 -0
odin_engine-0.1.0/npll/core/mln.py +475 -0
odin_engine-0.1.0/npll/inference/__init__.py +41 -0
odin_engine-0.1.0/npll/inference/e_step.py +420 -0
odin_engine-0.1.0/npll/inference/elbo.py +435 -0
odin_engine-0.1.0/npll/inference/m_step.py +577 -0
odin_engine-0.1.0/npll/npll_model.py +632 -0
odin_engine-0.1.0/npll/scoring/__init__.py +43 -0
odin_engine-0.1.0/npll/scoring/embeddings.py +442 -0
odin_engine-0.1.0/npll/scoring/probability.py +403 -0
odin_engine-0.1.0/npll/scoring/scoring_module.py +370 -0
odin_engine-0.1.0/npll/training/__init__.py +25 -0
odin_engine-0.1.0/npll/training/evaluation.py +497 -0
odin_engine-0.1.0/npll/training/npll_trainer.py +521 -0
odin_engine-0.1.0/npll/utils/__init__.py +48 -0
odin_engine-0.1.0/npll/utils/batch_utils.py +493 -0
odin_engine-0.1.0/npll/utils/config.py +145 -0
odin_engine-0.1.0/npll/utils/math_utils.py +339 -0
odin_engine-0.1.0/odin/__init__.py +20 -0
odin_engine-0.1.0/odin/engine.py +264 -0
odin_engine-0.1.0/odin_engine.egg-info/PKG-INFO +456 -0
odin_engine-0.1.0/odin_engine.egg-info/SOURCES.txt +68 -0
odin_engine-0.1.0/odin_engine.egg-info/dependency_links.txt +1 -0
odin_engine-0.1.0/odin_engine.egg-info/requires.txt +13 -0
odin_engine-0.1.0/odin_engine.egg-info/top_level.txt +4 -0
odin_engine-0.1.0/pyproject.toml +50 -0
odin_engine-0.1.0/retrieval/__init__.py +50 -0
odin_engine-0.1.0/retrieval/adapters.py +140 -0
odin_engine-0.1.0/retrieval/adapters_arango.py +1418 -0
odin_engine-0.1.0/retrieval/aggregators.py +707 -0
odin_engine-0.1.0/retrieval/beam.py +127 -0
odin_engine-0.1.0/retrieval/budget.py +60 -0
odin_engine-0.1.0/retrieval/cache.py +159 -0
odin_engine-0.1.0/retrieval/confidence.py +88 -0
odin_engine-0.1.0/retrieval/eval.py +49 -0
odin_engine-0.1.0/retrieval/linker.py +87 -0
odin_engine-0.1.0/retrieval/metrics.py +105 -0
odin_engine-0.1.0/retrieval/metrics_motifs.py +36 -0
odin_engine-0.1.0/retrieval/orchestrator.py +571 -0
odin_engine-0.1.0/retrieval/ppr/__init__.py +12 -0
odin_engine-0.1.0/retrieval/ppr/anchors.py +41 -0
odin_engine-0.1.0/retrieval/ppr/bippr.py +61 -0
odin_engine-0.1.0/retrieval/ppr/engines.py +257 -0
odin_engine-0.1.0/retrieval/ppr/global_pr.py +76 -0
odin_engine-0.1.0/retrieval/ppr/indexes.py +78 -0
odin_engine-0.1.0/retrieval/ppr.py +156 -0
odin_engine-0.1.0/retrieval/ppr_cache.py +25 -0
odin_engine-0.1.0/retrieval/scoring.py +294 -0
odin_engine-0.1.0/retrieval/utils/__init__.py +0 -0
odin_engine-0.1.0/retrieval/utils/pii_redaction.py +36 -0
odin_engine-0.1.0/retrieval/writers/__init__.py +9 -0
odin_engine-0.1.0/retrieval/writers/arango_writer.py +28 -0
odin_engine-0.1.0/retrieval/writers/base.py +21 -0
odin_engine-0.1.0/retrieval/writers/janus_writer.py +36 -0
odin_engine-0.1.0/setup.cfg +4 -0

odin_engine-0.1.0/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 Prescott Data
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

odin_engine-0.1.0/MANIFEST.in ADDED

@@ -0,0 +1,11 @@
+# Include documentation
+include README.md
+include LICENSE
+include docs/ARCHITECTURE.md
+include docs/AGENT_INTEGRATION_GUIDE.md
+# Exclude build artifacts
+global-exclude *.pyc
+global-exclude __pycache__
+global-exclude *.so
+global-exclude .git*

odin_engine-0.1.0/PKG-INFO ADDED

@@ -0,0 +1,456 @@
+Metadata-Version: 2.4
+Name: odin-engine
+Version: 0.1.0
+Summary: Odin: Graph intelligence engine with PPR + NPLL for AI agent navigation
+Author: Prescott Data
+License: MIT
+Project-URL: Homepage, https://bitbucket.org/dromos/odin-kg-engine
+Keywords: knowledge-graph,graph-intelligence,pagerank,neural-logic,ai-agents,graph-exploration
+Classifier: Development Status :: 4 - Beta
+Classifier: Intended Audience :: Developers
+Classifier: Intended Audience :: Science/Research
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.9
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Programming Language :: Python :: 3.13
+Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
+Classifier: Topic :: Database
+Requires-Python: >=3.9
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Requires-Dist: torch>=2.0.0
+Requires-Dist: python-arango>=7.0
+Requires-Dist: numpy>=1.20.0
+Requires-Dist: networkx>=3.0
+Requires-Dist: scipy>=1.10.0
+Requires-Dist: scikit-learn>=1.3.0
+Requires-Dist: gremlinpython>=3.5.0
+Provides-Extra: dev
+Requires-Dist: pytest>=7.0; extra == "dev"
+Requires-Dist: pytest-cov; extra == "dev"
+Requires-Dist: flake8; extra == "dev"
+Requires-Dist: bandit; extra == "dev"
+Dynamic: license-file
+# Odin KG Engine
+**Graph Intelligence for Autonomous AI Agents**
+Odin is a production-ready Python library that transforms how AI agents navigate knowledge graphs. It combines structural graph algorithms (Personalized PageRank), semantic plausibility scoring (NPLL), and pattern detection to guide agents toward high-signal discoveries in complex, multi-domain graphs.
+**Built for:** Healthcare analytics, fraud detection, regulatory compliance, supply chain intelligence, and any domain where autonomous agents need to discover patterns in graphs with 10K-5M entities.
+[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)
+[![Tests](https://img.shields.io/badge/tests-62%20passing-green.svg)](tests/)
+[![License](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)
+[![PyPI](https://img.shields.io/badge/pypi-odin--kg--engine-blue)](https://pypi.org/project/odin-kg-engine/)
+---
+## The Problem
+AI agents exploring knowledge graphs face three critical challenges:
+1. **Exponential Path Growth** - A 3-hop exploration from a single node in a densely connected graph can generate 100K+ paths, most of which are noise
+2. **Semantic Invalidity** - Naive traversal follows edges that violate domain logic (e.g., `Patient → diagnosed_by → Medication`)
+3. **No Prioritization** - Without ranking, agents waste turns analyzing low-value paths while missing critical patterns
+**Traditional approaches fail:**
+- **BFS/DFS**: Exponential explosion, no signal filtering
+- **Fixed Cypher Queries**: Only finds patterns you already know exist
+- **Random Walk**: No convergence guarantees, wasted compute
+- **LLM Prompting Alone**: Hallucinates relationships, can't verify graph structure
+## The Solution
+Odin provides a **guided exploration framework** that acts as a navigation compass for agents:
+```
+┌─────────────┐    ┌──────────────────────────────────────┐    ┌─────────────────┐
+│ Seed        │ -> │ PPR → Beam Search → NPLL → Aggregate │ -> │ Ranked Paths +  │
+│ Entities    │    │        (Odin Engine)                 │    │ Patterns + Score│
+└─────────────┘    └──────────────────────────────────────┘    └─────────────────┘
+```
+**Key Components:**
+| Component | Purpose | Impact |
+|-----------|---------|--------|
+| **Personalized PageRank (PPR)** | Identifies structurally important nodes as starting points | Reduces search space by 80% |
+| **Beam Search** | Efficiently explores top-K paths at each hop | Prevents exponential explosion |
+| **NPLL (Neural Probabilistic Logic)** | Scores edge plausibility using learned rules from your graph | Filters 60-90% of semantically invalid paths |
+| **Motif Detection** | Surfaces recurring patterns (e.g., "A→B→C appears 47 times") | Automatic anomaly detection |
+| **Triage Scoring** | 0-100 importance ranking for agent prioritization | Focuses agent compute on high-signal areas |
+**Results:** 10x faster exploration, 5x more relevant discoveries, agents focus on high-signal regions.
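To make the PPR row of the table concrete, here is a minimal sketch of Personalized PageRank by power iteration. It is an illustration only, not the code in `retrieval/ppr/`; the function name, the `alpha` restart probability, and the toy edge list are all assumptions made for this example.

```python
from collections import defaultdict


def personalized_pagerank(edges, seeds, alpha=0.15, iterations=50):
    """Power-iteration PPR over a directed edge list.

    alpha is the probability of teleporting back to the seed set.
    """
    out = defaultdict(list)
    nodes = set(seeds)
    for src, dst in edges:
        out[src].append(dst)
        nodes.update((src, dst))

    # Restart distribution: uniform over the seeds, zero elsewhere.
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        nxt = {n: alpha * restart[n] for n in nodes}
        for n in nodes:
            mass = (1.0 - alpha) * rank[n]
            targets = out[n]
            if targets:
                share = mass / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # Dangling node: hand its mass back to the seeds.
                for s in seeds:
                    nxt[s] += mass / len(seeds)
        rank = nxt
    return rank


# Hypothetical toy graph, loosely modelled on the claim/provider examples.
edges = [
    ("claim", "provider"),
    ("provider", "audit"),
    ("audit", "claim"),
    ("claim", "patient"),
]
scores = personalized_pagerank(edges, seeds=["claim"])
for node in sorted(scores, key=scores.get, reverse=True):
    print(f"{node}: {scores[node]:.3f}")
```

Nodes close to the seed accumulate the most probability mass, which is why PPR scores are a reasonable way to choose where exploration should start.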
+---
+## Quick Start
+### Installation
+```bash
+# From PyPI (recommended)
+pip install odin-engine
+# From source
+git clone https://github.com/prescott-data/odin-kg-engine.git
+cd odin-kg-engine
+pip install -e .
+```
+**Requirements:**
+- Python 3.9+
+- ArangoDB 3.10+ (or compatible graph database)
+- PyTorch 2.0+ (for NPLL)
+### 5-Minute Integration
+```python
+from arango import ArangoClient
+from odin import OdinEngine
+# 1. Connect to your knowledge graph
+client = ArangoClient(hosts="http://localhost:8529")
+db = client.db("my_database", username="user", password="pass")
+# 2. Initialize Odin (auto-trains NPLL from your graph on first run)
+engine = OdinEngine(db=db, community_id="my_community")
+# 3. Explore from seed entities
+result = engine.retrieve(
+    seeds=["entity/claim_123", "entity/provider_456"],
+    max_paths=50,
+    hop_limit=3,
+)
+# 4. Access scored paths
+print(f"Found {len(result['paths'])} paths")
+print(f"Triage Score: {result['triage']['score']}/100")
+for path in result['paths'][:5]:
+    nodes = " → ".join(path['nodes'])
+    print(f"  [{path['score']:.2f}] {nodes}")
+```
+**Output Example:**
+```
+Found 47 paths
+Triage Score: 87/100
+  [0.94] claim_123 → billed_by → provider_456 → flagged_in → audit_07
+  [0.89] claim_123 → has_diagnosis → sepsis_dx → rare_in → nursing_home_cluster
+  [0.82] provider_456 → prescribed → medication_999 → contraindicated_with → patient_history
+```
+### Self-Managing Intelligence
+Odin automatically manages its NPLL model lifecycle:
+1. **First Run**: Extracts edge patterns from your graph, trains NPLL model (~2-5 minutes)
+2. **Stores Weights**: Saves learned parameters in ArangoDB collection (`NPLLWeights`)
+3. **Subsequent Runs**: Loads weights from DB and rebuilds model in ~30 seconds
+**No separate ML pipeline, no .pt files, no DevOps overhead.** Just initialize `OdinEngine` and it handles everything.
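The lifecycle above follows a load-or-train pattern. The sketch below shows that pattern with a plain dictionary standing in for the `NPLLWeights` collection; `load_or_train`, the JSON encoding, and the dummy weights are assumptions for illustration and do not reflect how Odin actually serializes its model.

```python
import json


def load_or_train(store, key, train_fn):
    """Return stored weights for `key`; train and persist them on a miss."""
    blob = store.get(key)
    if blob is not None:
        return json.loads(blob), "loaded"
    weights = train_fn()
    store[key] = json.dumps(weights)
    return weights, "trained"


def fake_training():
    # Placeholder for a real training run; returns dummy rule weights.
    return {"rule_0": 0.42, "rule_1": 1.3}


weights_store = {}  # stands in for a database collection
print(load_or_train(weights_store, "my_community", fake_training)[1])
print(load_or_train(weights_store, "my_community", fake_training)[1])
```

The first call reports `trained` and the second `loaded`, mirroring the first-run and subsequent-run behaviour described in the list.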
+---
+## Core Features
+### 1. Intelligent Path Finding
+```python
+# Start from suspicious entities, let Odin find connections
+result = engine.retrieve(
+    seeds=["claim/CLM_99285"],
+    max_paths=100,
+    hop_limit=4,
+)
+# Returns scored paths with:
+# - Structural importance (PPR)
+# - Semantic plausibility (NPLL)
+# - Pattern frequency (motif detection)
+```
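The `hop_limit` and top-K behaviour can be pictured with a small beam search. This is a sketch under stated assumptions, not the code in `retrieval/beam.py`: the adjacency dictionary, the multiplicative path score, and the `edge_score` callback are hypothetical stand-ins for the graph accessor and the NPLL scorer.

```python
import heapq


def beam_search(neighbors, seeds, edge_score, hop_limit=3, beam_width=10):
    """Expand paths hop by hop, keeping only the best beam_width per hop."""
    beam = [(1.0, (seed,)) for seed in seeds]
    found = []
    for _ in range(hop_limit):
        candidates = []
        for score, path in beam:
            last = path[-1]
            for nxt in neighbors.get(last, ()):
                if nxt in path:  # skip cycles
                    continue
                candidates.append((score * edge_score(last, nxt), path + (nxt,)))
        # Pruning step: everything outside the top beam_width is dropped.
        beam = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
        found.extend(beam)
    return sorted(found, key=lambda c: c[0], reverse=True)


# Hypothetical toy graph and edge plausibilities.
neighbors = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
plausibility = {("a", "b"): 0.9, ("a", "c"): 0.2, ("b", "d"): 0.8, ("c", "d"): 0.9}

paths = beam_search(
    neighbors,
    seeds=["a"],
    edge_score=lambda h, t: plausibility[(h, t)],
    hop_limit=2,
    beam_width=1,
)
for score, path in paths:
    print(f"[{score:.2f}] {' → '.join(path)}")
```

With `beam_width=1` the low-scoring branch through `c` is never expanded, which is how a beam keeps the number of explored paths bounded at every hop.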
+### 2. Edge Plausibility Scoring
+```python
+# Validate specific relationships
+score = engine.score_edge(
+    head="entity/patient_001",
+    relation="treated_by",
+    tail="entity/doctor_smith"
+)
+# Returns: 0.0 (impossible) to 1.0 (highly plausible)
+# Use in agent decision loops (`agent` and `path` come from your own agent code)
+if score > 0.7:
+    agent.investigate_further(path)
+```
+### 3. Anchor Node Discovery
+```python
+# Find most important nodes in your graph
+anchors = engine.find_anchors(
+    seeds=["community/insurance_claims"],
+    topn=20
+)
+# Returns top-N nodes by PageRank for targeted exploration
+```
+### 4. Pattern Detection
+```python
+# Automatic motif extraction
+motifs = result['aggregates']['motifs']
+for motif in motifs:
+    print(f"{motif['pattern']} appears {motif['count']} times")
+# Example output:
+# Claim → has_policyholder → Person (47 instances)
+# Provider → billed_by → Claim → flagged_in → Audit (12 instances)
+```
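Motif counts of this kind can be approximated by counting runs of consecutive relations across the returned paths. The sketch below assumes each path has already been reduced to a list of relation names; `count_motifs` and the sample data are hypothetical, and the real aggregators in `retrieval/aggregators.py` may work differently.

```python
from collections import Counter


def count_motifs(relation_paths, length=2):
    """Count every run of `length` consecutive relations across all paths."""
    counts = Counter()
    for relations in relation_paths:
        for i in range(len(relations) - length + 1):
            counts[" → ".join(relations[i:i + length])] += 1
    return counts


# Hypothetical relation sequences extracted from three paths.
relation_paths = [
    ["billed_by", "flagged_in"],
    ["billed_by", "flagged_in", "reviewed_by"],
    ["has_diagnosis"],
]
for pattern, count in count_motifs(relation_paths).most_common():
    print(f"{pattern} appears {count} times")
```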
+---
+## Architecture
+Odin is composed of a retrieval orchestrator that coordinates four components:
+```
+┌─────────────────────────────────────────────────────────────────────────────┐
+│                              ODIN ENGINE                                    │
+│                                                                             │
+│  ┌───────────────────────────────────────────────────────────────────────┐  │
+│  │                    RETRIEVAL ORCHESTRATOR                             │  │
+│  │  Coordinates PPR → Beam Search → NPLL → Aggregation pipeline         │  │
+│  └───────────────────────────────────────────────────────────────────────┘  │
+│                                                                             │
+│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐   │
+│  │ Graph        │  │ PPR Engine   │  │ NPLL Model   │  │ Aggregators  │   │
+│  │ Accessor     │  │ (Anchors)    │  │ (Confidence) │  │ (Motifs)     │   │
+│  │ + Cache      │  │              │  │              │  │              │   │
+│  └──────────────┘  └──────────────┘  └──────────────┘  └──────────────┘   │
+└─────────────────────────────────────────────────────────────────────────────┘
+```
+**Component Details:**
+| Layer | Responsibility | Performance |
+|-------|----------------|-------------|
+| **Graph Accessor** | LRU-cached graph queries | <10ms avg per node fetch |
+| **PPR Engine** | Compute importance scores | ~200ms for 1K nodes |
+| **Beam Search** | Multi-hop path exploration | 50-500ms depending on beam width |
+| **NPLL Confidence** | Semantic edge filtering | ~5ms per edge (cached) |
+| **Aggregators** | Pattern extraction | ~100ms post-retrieval |
+**Typical End-to-End Latency:** 300-800ms for 50-path retrieval
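The Graph Accessor row describes LRU-cached graph queries. A minimal way to picture that is `functools.lru_cache` wrapped around a neighbour lookup. The in-memory `GRAPH` dictionary and `fetch_neighbors` are assumptions for this sketch; the shipped cache lives in `retrieval/cache.py` and is not reproduced here.

```python
from functools import lru_cache

# Hypothetical in-memory graph standing in for database queries.
GRAPH = {
    "claim_123": ["provider_456", "sepsis_dx"],
    "provider_456": ["audit_07"],
}
FETCHES = []  # records every lookup that actually reaches the "database"


@lru_cache(maxsize=5000)
def fetch_neighbors(node_id):
    """Return the neighbours of a node, hitting the backing store once per id."""
    FETCHES.append(node_id)
    return tuple(GRAPH.get(node_id, ()))


fetch_neighbors("claim_123")
fetch_neighbors("claim_123")  # served from the cache
print(fetch_neighbors.cache_info())
print(f"backing-store lookups: {len(FETCHES)}")
```

Returning a tuple rather than a list keeps cached values immutable, so one caller cannot accidentally change what later callers receive.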
+---
+## Use Cases
+### 1. Healthcare Fraud Detection
+**Scenario:** Find providers billing unusual procedure combinations
+```python
+engine = OdinEngine(db, community_id="medicare_claims")
+result = engine.retrieve(
+    seeds=["provider/high_volume_clinic"],
+    max_paths=100,
+)
+# Odin surfaces: "CPT_99285 + CPT_office_visit" pattern in 34 claims
+```
+### 2. Supply Chain Risk Analysis
+**Scenario:** Identify cascading supplier dependencies
+```python
+result = engine.retrieve(
+    seeds=["supplier/critical_vendor"],
+    hop_limit=5,  # Deep supply chain exploration
+)
+# Discovers: Tier-3 supplier affects 47 downstream products
+```
+### 3. Regulatory Compliance Checks
+**Scenario:** Validate entity relationships against compliance rules
+```python
+score = engine.score_edge(
+    head="entity/investment_fund",
+    relation="managed_by",
+    tail="entity/sanctioned_entity"
+)
+if score > 0.5:
+    compliance_agent.flag_for_review()
+```
+---
+## Documentation
+| Document | Description |
+|----------|-------------|
+| [**Architecture**](docs/ARCHITECTURE.md) | Complete technical design (872 lines) |
+| [**Agent Integration Guide**](docs/AGENT_INTEGRATION_GUIDE.md) | How to integrate with AI agents (188 lines) |
+| [**API Reference**](#api-reference) | Full method signatures and parameters |
+---
+## API Reference
+### OdinEngine
+```python
+from odin import OdinEngine
+# `db` is the ArangoDB StandardDatabase handle from the Quick Start.
+# The values shown are the documented defaults.
+engine = OdinEngine(
+    db=db,                    # StandardDatabase: ArangoDB connection
+    community_id="global",    # str: scope for exploration
+    cache_size=5000,          # int: LRU cache size
+    auto_train=True,          # bool: auto-train NPLL if needed
+    community_mode="none",    # str: "none" | "mapping"
+)
+```
+**Methods:**
+#### `retrieve(seeds, max_paths, hop_limit, **kwargs)`
+Find and score paths from seed entities.
+**Parameters:**
+- `seeds: List[str]` - Starting entity IDs (e.g., `["entity/123"]`)
+- `max_paths: int = 50` - Maximum paths to return
+- `hop_limit: int = 3` - Maximum hops from seeds
+- `beam_width: int = 10` - Top-K paths to explore at each hop
+**Returns:**
+```python
+{
+    "paths": [
+        {
+            "nodes": ["entity/A", "entity/B", "entity/C"],
+            "edges": [{"relation": "relates_to", "score": 0.89}],
+            "score": 0.94
+        }
+    ],
+    "triage": {"score": 87, "confidence": "high"},
+    "aggregates": {
+        "motifs": [{"pattern": "A→B→C", "count": 12}],
+        "node_frequencies": {"entity/A": 47}
+    }
+}
+```
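Given a result in the shape documented above, an agent will usually want only the strongest paths. The helper below is a sketch that relies solely on the `paths`, `nodes`, and `score` keys shown in the return value; `top_paths`, its thresholds, and the sample data are illustrative assumptions rather than part of the Odin API.

```python
def top_paths(result, min_score=0.7, limit=5):
    """Render the highest-scoring paths at or above min_score as strings."""
    kept = [p for p in result["paths"] if p["score"] >= min_score]
    kept.sort(key=lambda p: p["score"], reverse=True)
    return [" → ".join(p["nodes"]) for p in kept[:limit]]


# Hypothetical result following the documented structure.
sample = {
    "paths": [
        {"nodes": ["entity/A", "entity/B"], "edges": [], "score": 0.55},
        {"nodes": ["entity/A", "entity/C", "entity/D"], "edges": [], "score": 0.94},
        {"nodes": ["entity/A", "entity/E"], "edges": [], "score": 0.81},
    ],
    "triage": {"score": 87, "confidence": "high"},
    "aggregates": {"motifs": [], "node_frequencies": {}},
}
for line in top_paths(sample):
    print(line)
```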
+#### `score_edge(head, relation, tail)`
+Score plausibility of a single edge.
+**Parameters:**
+- `head: str` - Source entity ID
+- `relation: str` - Edge type
+- `tail: str` - Target entity ID
+**Returns:** `float` (0.0-1.0)
+#### `find_anchors(seeds, topn)`
+Get top-N most important nodes by PageRank.
+**Parameters:**
+- `seeds: List[str]` - Seed entities for personalized PageRank
+- `topn: int = 20` - Number of top nodes to return
+**Returns:** `List[Tuple[str, float]]` - (entity_id, ppr_score)
+#### `retrain_model(force_retrain=True)`
+Force NPLL model retraining (use after major graph updates).
+---
+## Performance
+| Metric | Value | Notes |
+|--------|-------|-------|
+| **Graph Scale** | 10K-5M entities | Tested on healthcare, insurance, supply chain |
+| **Typical Latency** | 300-800ms | For 50-path retrieval with 3-hop limit |
+| **Cache Hit Rate** | >80% | After warm-up period |
+| **NPLL Training** | 2-5 minutes | First run, then cached |
+| **NPLL Loading** | ~30 seconds | Subsequent runs |
+| **Memory** | ~500MB-2GB | Depends on cache size and graph density |
+**Tested Deployments:**
+- Healthcare KG: 2.3M entities, 8.7M edges (avg 450ms retrieval)
+- Insurance Claims: 850K entities, 4.1M edges (avg 320ms retrieval)
+- Supply Chain: 450K entities, 2.3M edges (avg 280ms retrieval)
+---
+## Testing
+```bash
+# Run all tests (62 passing)
+pytest tests/ -v
+# Unit tests only
+pytest tests/unit/ -v
+# Integration tests
+pytest tests/integration/ -v
+# With coverage report
+pytest tests/ --cov=odin --cov=npll --cov=retrieval --cov-report=html
+```
+---
+## Contributing
+We welcome contributions! Areas of interest:
+- **Scalability**: Optimizations for graphs >10M entities
+- **Algorithms**: Alternative PPR implementations, new aggregators
+- **Database Support**: Neo4j, Neptune adapters
+- **Benchmarks**: Academic dataset comparisons
+---
+## License
+MIT License - see [LICENSE](LICENSE) for details.
+---
+## Authors
+**Prescott Data**
+- Muyukani Kizito - Lead Engineer
+- Elizabeth Nyambere - NPLL & GNN Research
+---
+## Citation
+If you use Odin in academic work, please cite:
+```bibtex
+@software{odin_kg_engine,
+  title={Odin: Graph Intelligence for Autonomous AI Agents},
+  author={Prescott Data},
+  year={2026},
+  url={https://github.com/prescott-data/odin-kg-engine}
+}
+```
+---
+## Links
+- [PyPI Package](https://pypi.org/project/odin-kg-engine/)
+- [Documentation](docs/)
+- [GitHub Repository](https://github.com/prescott-data/odin-kg-engine)
+- [Prescott Data](https://prescottdata.io)