PyPI - atlas-mem - Versions diffs - 2.0.0__tar.gz - Mend

atlas-mem 2.0.0__tar.gz

Files changed (12)

atlas_mem-2.0.0/PKG-INFO +448 -0
atlas_mem-2.0.0/README.md +407 -0
atlas_mem-2.0.0/atlas_mem/__init__.py +17 -0
atlas_mem-2.0.0/atlas_mem/client.py +254 -0
atlas_mem-2.0.0/atlas_mem/cognitive_brain.py +679 -0
atlas_mem-2.0.0/atlas_mem.egg-info/PKG-INFO +448 -0
atlas_mem-2.0.0/atlas_mem.egg-info/SOURCES.txt +10 -0
atlas_mem-2.0.0/atlas_mem.egg-info/dependency_links.txt +1 -0
atlas_mem-2.0.0/atlas_mem.egg-info/requires.txt +21 -0
atlas_mem-2.0.0/atlas_mem.egg-info/top_level.txt +1 -0
atlas_mem-2.0.0/pyproject.toml +65 -0
atlas_mem-2.0.0/setup.cfg +4 -0

atlas_mem-2.0.0/PKG-INFO ADDED

@@ -0,0 +1,448 @@
+Metadata-Version: 2.4
+Name: atlas-mem
+Version: 2.0.0
+Summary: Cognitive AI memory for agents — episodic, semantic, and working memory with multi-hop graph reasoning.
+Author-email: Bsyncs <noreply@verify.bsyncs.com>
+License-Expression: MIT
+Project-URL: Homepage, https://atlas.bsyncs.com
+Project-URL: Documentation, https://docs.bsyncs.com
+Project-URL: Repository, https://github.com/janhavi2409/atlas
+Project-URL: Bug Tracker, https://github.com/janhavi2409/atlas/issues
+Project-URL: Changelog, https://github.com/janhavi2409/atlas/blob/main/CHANGELOG.md
+Keywords: ai,agents,memory,knowledge-graph,llm,episodic-memory,semantic-memory,cognitive-ai,openai,langchain,crewai,llamaindex
+Classifier: Development Status :: 5 - Production/Stable
+Classifier: Intended Audience :: Developers
+Classifier: Topic :: Software Development :: Libraries :: Python Modules
+Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.9
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Programming Language :: Python :: 3.13
+Requires-Python: >=3.9
+Description-Content-Type: text/markdown
+Requires-Dist: requests>=2.28.0
+Provides-Extra: async
+Requires-Dist: httpx>=0.27.0; extra == "async"
+Provides-Extra: langchain
+Requires-Dist: langchain-core>=0.2.0; extra == "langchain"
+Requires-Dist: pydantic>=2.0; extra == "langchain"
+Provides-Extra: crewai
+Requires-Dist: crewai>=0.28.0; extra == "crewai"
+Provides-Extra: llamaindex
+Requires-Dist: llama-index-core>=0.10.0; extra == "llamaindex"
+Provides-Extra: all
+Requires-Dist: httpx>=0.27.0; extra == "all"
+Requires-Dist: langchain-core>=0.2.0; extra == "all"
+Requires-Dist: pydantic>=2.0; extra == "all"
+Requires-Dist: crewai>=0.28.0; extra == "all"
+Requires-Dist: llama-index-core>=0.10.0; extra == "all"
+# ATLAS Memory SDK
+> Knowledge graph memory for AI agents — persistent, temporal, multi-tenant.
+ATLAS gives your AI agents long-term memory that understands **time**. Unlike vector databases that treat all facts equally, ATLAS knows that "we switched to PostgreSQL last week" should outrank "we started with MySQL three months ago" — automatically.
+---
+## What problem does this solve?
+Every AI agent today suffers from the same limitation: it only knows what's in its context window. The moment a conversation ends, everything is forgotten. Workarounds like stuffing chat history into prompts don't scale — they're expensive, hit token limits, and have no concept of which facts are still true.
+ATLAS solves this by giving agents a **structured, queryable memory layer** that:
+- Extracts facts automatically from natural language — no schema design needed
+- Scores retrieval by semantic relevance **and** recency — newer facts win
+- Resolves conflicts automatically: when a fact is updated, the newest version supersedes the old one in search results
+- Persists across sessions, deployments, and model swaps
+- Isolates each customer's data in its own namespace
+---
+## How it works
+When an agent ingests text, ATLAS runs it through three steps:
+**1. Fact extraction** — A small local language model (Qwen 0.5B) reads the text and extracts atomic `Subject → Relation → Object` triples. A spaCy grammar filter removes noise (pronouns, determiners, prepositional fragments).
+```
+"Project Apollo now uses PostgreSQL for better JSONB support"
+→ [ Project Apollo | uses | PostgreSQL ]
+```
+**2. Graph storage** — Facts are stored as nodes and edges in a dedicated Neo4j instance, scoped to the caller's namespace. Every edge carries a `created_at` timestamp. Entity names are resolved to canonical forms via a Redis vector cache — so "Postgres", "PostgreSQL", and "the database" all map to the same node.
+**3. Temporal retrieval** — When an agent searches, ATLAS scores every candidate fact with:
+```
+Score = (0.20 × Semantic similarity)
+      + (0.70 × Recency)           ← dominant signal
+      + (0.10 × Usage frequency)
+      + (0.50 × Relation-query alignment)
+```
+Recency carries the largest weight (0.70) by design, so a fact stored today beats a contradicting fact from last month. This is what makes temporal reasoning work without any special configuration.
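As a minimal sketch of the weighted sum above, the snippet below recomputes a score from its four components. The names `WEIGHTS` and `atlas_score` are illustrative and not part of the SDK; only the weight values come from the formula.

```python
# Weights copied from the formula above; the names are illustrative only.
WEIGHTS = {"V": 0.20, "R": 0.70, "F": 0.10, "A": 0.50}


def atlas_score(v: float, r: float, f: float, a: float) -> float:
    """Weighted sum of semantic similarity, recency, frequency, and alignment."""
    return (WEIGHTS["V"] * v + WEIGHTS["R"] * r
            + WEIGHTS["F"] * f + WEIGHTS["A"] * a)


# A recent, moderately similar fact versus an older, more similar one.
fresh = atlas_score(v=0.74, r=0.999, f=1.0, a=0.23)
stale = atlas_score(v=0.90, r=0.10, f=1.0, a=0.23)
print(round(fresh, 2))   # 1.06
print(fresh > stale)     # True
```

Because the four weights sum to 1.5 rather than 1, a score can exceed 1 even though each component stays within 0 to 1.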
+---
+## The retrieval pipeline (under the hood)
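+For engineers who want to understand what happens on every `/sdk/search` call (in step 7, α, β, γ, and δ are the weights 0.20, 0.70, 0.10, and 0.50 from the scoring formula above):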
+```
+Query text
+    │
+    ▼
+1. Keyword extraction (SLM)          "What DB?" → ["Database", "Project Apollo"]
+    │
+    ▼
+2. Canonical resolution (Redis)      "Postgres" → "PostgreSQL"
+    │
+    ▼
+3. Hub detection (live graph stats)  High-degree nodes dampened as context
+    │
+    ▼
+4. Seed node finding (centroid sim)  Signal-only probe vector, keyword boost
+    │
+    ▼
+5. 2-hop BFS traversal              Collect all reachable facts from seeds
+    │
+    ▼
+6. Semantic floor filter            Drop facts with V < 0.22 (noise removal)
+    │
+    ▼
+7. Score each fact                  α·V + β·R + γ·F + δ·A
+    │
+    ▼
+8. Conflict resolution              Newest fact wins per (Subject, Relation) slot
+    │
+    ▼
+Top-k results
+```
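Steps 6 and 8 of the pipeline above can be sketched in a few lines. This is an illustration under stated assumptions, not the service's code: the `Candidate` class and `top_k` function are hypothetical, and each candidate is assumed to arrive with its step 7 score already computed.

```python
from dataclasses import dataclass

SEMANTIC_FLOOR = 0.22  # threshold quoted in step 6


@dataclass
class Candidate:  # hypothetical container for one retrieved fact
    subject: str
    relation: str
    obj: str
    v: float           # semantic similarity to the query
    created_at: float  # Unix timestamp stored on the edge
    score: float       # combined score from step 7


def top_k(candidates: list[Candidate], k: int = 5) -> list[Candidate]:
    # Step 6: drop facts below the semantic floor.
    kept = [c for c in candidates if c.v >= SEMANTIC_FLOOR]
    # Step 8: keep only the newest fact per (subject, relation) slot.
    newest: dict[tuple[str, str], Candidate] = {}
    for c in kept:
        slot = (c.subject, c.relation)
        if slot not in newest or c.created_at > newest[slot].created_at:
            newest[slot] = c
    # Rank the survivors by score and return the best k.
    return sorted(newest.values(), key=lambda c: c.score, reverse=True)[:k]


facts = [
    Candidate("Project Apollo", "uses", "MySQL", 0.70, 1_700_000_000, 0.40),
    Candidate("Project Apollo", "uses", "PostgreSQL", 0.74, 1_710_000_000, 1.06),
    Candidate("Project Apollo", "is", "a project", 0.10, 1_710_000_000, 0.90),
]
print([c.obj for c in top_k(facts)])  # ['PostgreSQL']
```

The third fact is removed by the floor despite its high score, and the MySQL fact loses its slot to the newer PostgreSQL fact.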
+---
+## API reference
+Base URL: `https://your-domain.com` (or `http://localhost:8000` locally)
+All endpoints require the `x-api-key` header except `/sdk/health`.
+---
+### `POST /sdk/ingest`
+Extract facts from text and store them in memory.
+**Headers**
+```
+x-api-key: atlas_...
+Content-Type: application/json
+```
+**Body**
+```json
+{
+    "text": "Project Apollo now uses PostgreSQL as the primary database.",
+    "session_id": "optional-user-or-conversation-id"
+}
+```
+**Response**
+```json
+{
+    "status": "ok",
+    "facts_stored": 1,
+    "facts": [
+        {
+            "subject": "Project Apollo",
+            "relation": "uses",
+            "object": "PostgreSQL"
+        }
+    ],
+    "namespace": "your_org_abc123:optional-session-id"
+}
+```
+**Tips for best extraction quality:**
+- Use direct declarative sentences: `"X uses Y"`, `"X is Y"`, `"X is hosted on Y"`
+- Avoid reported-speech summaries such as `"The user mentioned that..."`; state the fact directly, because the SLM rejects pronoun subjects by design
+- One concept per sentence gives cleaner triples than long compound sentences
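For callers who are not using the SDK client, the request described above can be assembled with the standard library alone. This is a sketch: `build_ingest_request` is a hypothetical helper, and the base URL and key are the placeholders used elsewhere in this README. The request is only built here, not sent.

```python
import json
import urllib.request


def build_ingest_request(base_url, api_key, text, session_id=None):
    """Build (but do not send) a POST /sdk/ingest request."""
    payload = {"text": text}
    if session_id is not None:
        payload["session_id"] = session_id
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/sdk/ingest",
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


req = build_ingest_request(
    "http://localhost:8000",
    "atlas_...",
    "Project Apollo uses PostgreSQL as the primary database.",
    session_id="user-123",
)
print(req.get_method(), req.full_url)
# To send it against a running service: urllib.request.urlopen(req)
```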
+---
+### `POST /sdk/search`
+Search memory with a natural language query.
+**Body**
+```json
+{
+    "query": "What database does Project Apollo use?",
+    "k": 5,
+    "session_id": "optional-same-session-id-as-ingest"
+}
+```
+**Response**
+```json
+{
+    "query": "What database does Project Apollo use?",
+    "results": [
+        {
+            "fact": "Project Apollo uses PostgreSQL",
+            "subject": "Project Apollo",
+            "relation": "uses",
+            "object": "PostgreSQL",
+            "score": 1.06,
+            "recency": 0.999,
+            "V": 0.74,
+            "R": 0.999,
+            "F": 1.0,
+            "A": 0.23
+        }
+    ],
+    "count": 1,
+    "namespace": "your_org_abc123:optional-session-id"
+}
+```
+Score breakdown per result:
+- `V` — How semantically similar this fact is to your query (0–1)
+- `R` — How recent this fact is; decays exponentially with age (0–1)
+- `F` — How frequently this entity has been referenced (0–1)
+- `A` — How well the relation phrase aligns with the query intent (0–1)
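The exponential decay behind `R` can be illustrated as follows. The README does not state the decay rate, so the 30-day half-life below is purely an assumption for the example, as are the names `HALF_LIFE_DAYS` and `recency`.

```python
import math

HALF_LIFE_DAYS = 30.0  # hypothetical value; the actual decay rate is not documented


def recency(age_days: float, half_life_days: float = HALF_LIFE_DAYS) -> float:
    """Exponential decay: 1.0 for a brand-new fact, halving every half-life."""
    return math.exp(-math.log(2) * age_days / half_life_days)


print(recency(0))              # 1.0
print(round(recency(30), 3))   # 0.5
print(round(recency(90), 3))   # 0.125
```

Whatever the real constant is, the shape is the same: recent facts sit near 1 and older ones fall toward 0 without ever reaching it.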
+---
+### `DELETE /sdk/clear`
+Delete all facts stored under this API key, or only those in one session when `session_id` is supplied.
+```
+DELETE /sdk/clear?session_id=optional-session-id
+```
+---
+### `GET /sdk/usage`
+Check monthly operation usage and limits.
+**Response**
+```json
+{
+    "org_name": "ACME Corp",
+    "tier": "starter",
+    "price": "$29/mo",
+    "ops_used": 1240,
+    "ops_limit": 50000,
+    "ops_remaining": 48760
+}
+```
+---
+### `GET /sdk/health`
+Liveness probe. No authentication required.
+```json
+{"status": "healthy", "service": "atlas-sdk", "version": "1.0.0"}
+```
+---
+## Pricing
+1 operation = 1 ingest call **or** 1 search call. Resets monthly.
+| Plan | Price | Ops / month | Best for |
+|---|---|---|---|
+| **Free** | $0 | 1,000 | Development and testing |
+| **Starter** | $29 | 50,000 | Indie developers, small teams |
+| **Pro** | $99 | 500,000 | Growing products |
+| **Scale** | $299 | 5,000,000 | High-volume AI applications |
+| **Enterprise** | Custom | Unlimited | Self-hosted, SLA, compliance |
+**Overage:** $0.01 per 100 ops beyond your plan limit (Starter and above).
+**Enterprise plan includes:**
+- Self-hosted deployment (your VPC, your data never leaves)
+- SSO and audit logs
+- Custom SLA with guaranteed uptime
+- Dedicated customer success manager
+- Custom onboarding and integration support
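As a worked example of the overage rule above, the sketch below estimates a monthly bill from the plan table. It assumes overage is billed pro rata rather than in whole blocks of 100 ops, which the README does not specify; `PLANS` and `monthly_cost` are illustrative names.

```python
# Plan -> (monthly price in USD, included operations), taken from the table above.
PLANS = {
    "starter": (29, 50_000),
    "pro": (99, 500_000),
    "scale": (299, 5_000_000),
}
OVERAGE_PER_100_OPS = 0.01  # USD; applies to Starter and above


def monthly_cost(plan: str, ops_used: int) -> float:
    """Base price plus pro-rata overage for operations beyond the plan limit."""
    price, included = PLANS[plan]
    extra_ops = max(0, ops_used - included)
    return round(price + extra_ops / 100 * OVERAGE_PER_100_OPS, 2)


print(monthly_cost("starter", 1_240))    # 29.0
print(monthly_cost("starter", 60_000))   # 30.0
```

Here 60,000 operations on Starter is 10,000 over the limit, which is 100 blocks of 100 ops at $0.01 each, adding $1.00 to the $29 base price.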
+---
+## Integration guides
+### Python (any agent)
+```python
+from atlas_sdk import AtlasMemory
+memory = AtlasMemory(
+    api_key="atlas_...",
+    base_url="https://your-domain.com",
+    session_id="user-123"       # optional — isolates per user/conversation
+)
+# Store a fact
+memory.ingest("The project deadline is March 31st.")
+# Retrieve relevant context
+results = memory.search("When is the deadline?")
+# Inject into your LLM prompt
+context = memory.format_context(results)
+```
+---
+### OpenAI function calling
+```python
+import json
+from atlas_sdk import AtlasMemory
+from openai import OpenAI
+memory = AtlasMemory(api_key="atlas_...", base_url="https://your-domain.com")
+client = OpenAI()
+user_message = "What database does Project Apollo use?"  # example user input
+response = client.chat.completions.create(
+    model="gpt-4o",
+    messages=[{"role": "user", "content": user_message}],
+    tools=memory.get_openai_tools(),   # returns save + search tool definitions
+    tool_choice="auto",
+)
+# Handle tool calls
+for tc in response.choices[0].message.tool_calls or []:
+    result = memory.handle_tool_call(tc.function.name, json.loads(tc.function.arguments))
+```
+The SDK provides two tools out of the box:
+- `atlas_save_memory` — the agent calls this when it learns something worth remembering
+- `atlas_search_memory` — the agent calls this before answering questions about past context
+---
+### LangChain
+```python
+from atlas_sdk import AtlasMemory
+from langchain.agents import create_openai_tools_agent
+memory = AtlasMemory(api_key="atlas_...", base_url="https://your-domain.com")
+tools = memory.get_langchain_tools()   # returns LangChain Tool objects
+# Drop directly into any LangChain agent (`llm` and `prompt` are your own chat model and prompt template)
+agent = create_openai_tools_agent(llm, tools, prompt)
+```
+---
+### Any REST-capable framework
+ATLAS is framework-agnostic. If your agent can make HTTP requests, it can use ATLAS.
+```
+POST /sdk/ingest    { "text": "...", "session_id": "..." }
+POST /sdk/search    { "query": "...", "k": 5, "session_id": "..." }
+```
+Compatible with: CrewAI, AutoGen, Semantic Kernel, Haystack, custom agents, n8n, Make, Zapier.
+---
+## Namespacing and data isolation
+Every API key maps to a unique namespace. No two customers share graph nodes, edges, or vector cache entries — enforced at the database query level, not the application level.
+The optional `session_id` parameter creates a sub-namespace within your key's namespace. Use this to isolate memory per user, per conversation, or per agent:
+```
+API key namespace:     acme_corp_a1b2c3
+With session_id:       acme_corp_a1b2c3:user-456
+```
+A search with `session_id: user-456` only sees facts ingested with the same `session_id`. Different users never see each other's memories.
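The namespace composition shown above amounts to joining the key's namespace and the session with a colon. The helper below is a hypothetical illustration of that rule, not a function exposed by the SDK.

```python
from typing import Optional


def namespace_for(key_namespace: str, session_id: Optional[str] = None) -> str:
    """Return the key's namespace, with ':<session_id>' appended when one is given."""
    return f"{key_namespace}:{session_id}" if session_id else key_namespace


print(namespace_for("acme_corp_a1b2c3"))              # acme_corp_a1b2c3
print(namespace_for("acme_corp_a1b2c3", "user-456"))  # acme_corp_a1b2c3:user-456

# Two sessions under the same key never resolve to the same namespace.
assert namespace_for("acme_corp_a1b2c3", "user-456") != namespace_for("acme_corp_a1b2c3", "user-789")
```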
+---
+## Self-hosted architecture
+The SDK service runs as a standalone Docker container alongside two dedicated databases:
+```
+Your infrastructure
+├── sdk-service        (FastAPI, port 5008, ~5GB RAM)
+│   ├── Qwen 0.5B      (fact extraction SLM, ~1.5GB)
+│   ├── MiniLM         (sentence embedder, ~0.5GB)
+│   └── spaCy          (grammar filter, ~50MB)
+├── neo4j-sdk          (knowledge graph, dedicated)
+└── redis-sdk          (vector cache + API key store, dedicated)
+```
+The SDK databases are fully isolated from any other services in your stack. They do not share storage with your application's Neo4j or Redis instances.
+Models are downloaded on first boot to a persistent Docker volume (`hf_cache`) and reused on every subsequent restart — no re-downloading on rebuild.
+---
+## Admin operations
+Create and manage API keys via the admin endpoints. Requires the `x-admin-secret` header.
+```bash
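+# ATLAS_URL and ATLAS_ADMIN_SECRET are placeholder environment variables you set yourself
+# Create a key
+curl -X POST "$ATLAS_URL/sdk/admin/create-key" \
+  -H "x-admin-secret: $ATLAS_ADMIN_SECRET" \
+  -H "Content-Type: application/json" \
+  -d '{ "org_name": "ACME Corp", "tier": "starter" }'
+# List all keys
+curl "$ATLAS_URL/sdk/admin/keys" -H "x-admin-secret: $ATLAS_ADMIN_SECRET"
+# Upgrade a tier
+curl -X PATCH "$ATLAS_URL/sdk/admin/upgrade?api_key=atlas_...&new_tier=pro" \
+  -H "x-admin-secret: $ATLAS_ADMIN_SECRET"
+# Deactivate a key
+curl -X DELETE "$ATLAS_URL/sdk/admin/deactivate?api_key=atlas_..." \
+  -H "x-admin-secret: $ATLAS_ADMIN_SECRET"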
+```
+---
+## FAQ
+**Does ATLAS replace my vector database?**
+No. ATLAS is a memory layer for agents — it stores structured facts with temporal context. A vector database stores embeddings for semantic search over documents. They serve different purposes and can be used together.
+**What happens when a fact changes?**
+Ingest the updated fact with the same subject and relation. ATLAS stores both facts, but the conflict resolver surfaces only the most recent one in search results. Old facts are preserved for audit purposes.
+**How long do facts persist?**
+Indefinitely, until you call `DELETE /sdk/clear`. There is no automatic expiry.
+**Can I use ATLAS without the SLM (bring my own triples)?**
+Not in this version. A direct triple ingestion endpoint (`POST /sdk/ingest/triple`) is on the roadmap.
+**Is the data encrypted?**
+Data is encrypted at rest by the underlying Neo4j and Redis storage engines. Transport is encrypted via HTTPS (configure your reverse proxy). For stricter compliance requirements, use the Enterprise self-hosted plan.
+**What languages does fact extraction support?**
+English only in this version. Multilingual support is on the roadmap.
+---
+## Changelog
+### v1.0.0
+- Initial release
+- Multi-tenant namespace isolation
+- Temporal scoring: α·V + β·R + γ·F + δ·A
+- Dynamic hub detection (no hardcoded ontologies)
+- OpenAI function calling integration
+- LangChain tool wrappers
+- Tier-based rate limiting (Free / Starter / Pro / Scale / Enterprise)
+- Dedicated Neo4j and Redis instances per deployment