knowledgevault 0.3.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (50)

knowledgevault-0.3.0/LICENSE +21 -0
knowledgevault-0.3.0/PKG-INFO +448 -0
knowledgevault-0.3.0/README.md +410 -0
knowledgevault-0.3.0/knowledgevault.egg-info/PKG-INFO +448 -0
knowledgevault-0.3.0/knowledgevault.egg-info/SOURCES.txt +48 -0
knowledgevault-0.3.0/knowledgevault.egg-info/dependency_links.txt +1 -0
knowledgevault-0.3.0/knowledgevault.egg-info/entry_points.txt +3 -0
knowledgevault-0.3.0/knowledgevault.egg-info/requires.txt +20 -0
knowledgevault-0.3.0/knowledgevault.egg-info/top_level.txt +1 -0
knowledgevault-0.3.0/kvault/__init__.py +62 -0
knowledgevault-0.3.0/kvault/cli/__init__.py +0 -0
knowledgevault-0.3.0/kvault/cli/check.py +250 -0
knowledgevault-0.3.0/kvault/cli/main.py +724 -0
knowledgevault-0.3.0/kvault/core/__init__.py +20 -0
knowledgevault-0.3.0/kvault/core/frontmatter.py +94 -0
knowledgevault-0.3.0/kvault/core/index.py +462 -0
knowledgevault-0.3.0/kvault/core/observability.py +479 -0
knowledgevault-0.3.0/kvault/core/research.py +255 -0
knowledgevault-0.3.0/kvault/core/storage.py +322 -0
knowledgevault-0.3.0/kvault/matching/__init__.py +29 -0
knowledgevault-0.3.0/kvault/matching/alias.py +153 -0
knowledgevault-0.3.0/kvault/matching/base.py +160 -0
knowledgevault-0.3.0/kvault/matching/domain.py +118 -0
knowledgevault-0.3.0/kvault/matching/fuzzy.py +127 -0
knowledgevault-0.3.0/kvault/mcp/__init__.py +5 -0
knowledgevault-0.3.0/kvault/mcp/server.py +1538 -0
knowledgevault-0.3.0/kvault/mcp/state.py +239 -0
knowledgevault-0.3.0/kvault/mcp/validation.py +293 -0
knowledgevault-0.3.0/kvault/orchestrator/__init__.py +27 -0
knowledgevault-0.3.0/kvault/orchestrator/context.py +337 -0
knowledgevault-0.3.0/kvault/orchestrator/enforcer.py +509 -0
knowledgevault-0.3.0/kvault/orchestrator/runner.py +1443 -0
knowledgevault-0.3.0/kvault/orchestrator/state_machine.py +453 -0
knowledgevault-0.3.0/kvault/templates/CLAUDE.md +311 -0
knowledgevault-0.3.0/kvault/templates/__init__.py +0 -0
knowledgevault-0.3.0/kvault/templates/category_summary.md +10 -0
knowledgevault-0.3.0/kvault/templates/journal_entry.md +8 -0
knowledgevault-0.3.0/kvault/templates/root_summary.md +40 -0
knowledgevault-0.3.0/pyproject.toml +116 -0
knowledgevault-0.3.0/setup.cfg +4 -0
knowledgevault-0.3.0/tests/test_check.py +202 -0
knowledgevault-0.3.0/tests/test_e2e_cli.py +192 -0
knowledgevault-0.3.0/tests/test_frontmatter.py +158 -0
knowledgevault-0.3.0/tests/test_index.py +190 -0
knowledgevault-0.3.0/tests/test_init.py +137 -0
knowledgevault-0.3.0/tests/test_matching.py +365 -0
knowledgevault-0.3.0/tests/test_observability.py +224 -0
knowledgevault-0.3.0/tests/test_orchestrator.py +872 -0
knowledgevault-0.3.0/tests/test_research.py +198 -0
knowledgevault-0.3.0/tests/test_storage.py +239 -0

knowledgevault-0.3.0/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 Eddie Landesberg
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

knowledgevault-0.3.0/PKG-INFO ADDED

@@ -0,0 +1,448 @@
+Metadata-Version: 2.4
+Name: knowledgevault
+Version: 0.3.0
+Summary: Config-driven knowledge graph framework for extracting structured knowledge from unstructured data
+Author: Eddie Landesberg
+License: MIT
+Project-URL: Homepage, https://github.com/cimo-labs/kvault
+Project-URL: Documentation, https://github.com/cimo-labs/kvault#readme
+Project-URL: Repository, https://github.com/cimo-labs/kvault
+Keywords: knowledge-graph,entity-extraction,llm,data-processing,claude-code,mcp,personal-knowledge-base
+Classifier: Development Status :: 3 - Alpha
+Classifier: Intended Audience :: Developers
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.9
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Requires-Python: >=3.9
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Requires-Dist: pyyaml>=6.0
+Requires-Dist: click>=8.0
+Requires-Dist: pydantic>=2.0
+Provides-Extra: dev
+Requires-Dist: pytest>=7.0; extra == "dev"
+Requires-Dist: pytest-cov>=4.0; extra == "dev"
+Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
+Requires-Dist: black>=23.0; extra == "dev"
+Requires-Dist: mypy>=1.0; extra == "dev"
+Requires-Dist: ruff>=0.1.0; extra == "dev"
+Requires-Dist: pre-commit>=3.0; extra == "dev"
+Provides-Extra: sdk
+Requires-Dist: claude-code-sdk>=0.1.0; extra == "sdk"
+Provides-Extra: mcp
+Requires-Dist: mcp>=1.0.0; python_version >= "3.10" and extra == "mcp"
+Dynamic: license-file
+# kvault
+Agent-first knowledge graph framework. Build knowledge graphs from unstructured data using intelligent agents.
+## Philosophy
+**The agent IS the pipeline.** Claude (or another LLM) does extraction, research, decisions, and propagation. kvault provides tools, not workflows.
+```
+┌─────────────────────────────────────────────────────────────┐
+│  EntityIndex    MatchStrategies    ObservabilityLogger      │
+│  (fast lookup)  (fuzzy, alias)     (debug & improve)        │
+│                                                             │
+│  SimpleStorage  (YAML frontmatter in _summary.md preferred) │
+└─────────────────────────────────────────────────────────────┘
+Agent (Claude) does:
+  - Read input
+  - Research (using EntityIndex + MatchStrategies)
+  - Decide (using its reasoning)
+  - Write (using SimpleStorage)
+  - Propagate (update parent summaries)
+  - Log (using ObservabilityLogger)
+```
+## Getting Started with Claude Code
+The fastest way to get a personal knowledge base running with Claude Code:
+```bash
+# 1. Install kvault with MCP support
+pip install kvault[mcp]
+# 2. Initialize a new knowledge base
+kvault init my_kb --name "Your Name"
+# 3. Verify it's clean
+kvault check --kb-root my_kb
+```
+Then add the MCP server to `.claude/settings.json`:
+```json
+{
+  "mcpServers": {
+    "kvault": {
+      "command": "kvault-mcp",
+      "env": {}
+    }
+  }
+}
+```
+And add the integrity hook (catches stale summaries before each prompt):
+```json
+{
+  "hooks": {
+    "UserPromptSubmit": [
+      {
+        "type": "command",
+        "command": "kvault check --kb-root /absolute/path/to/my_kb"
+      }
+    ]
+  }
+}
+```
+Customize the generated `CLAUDE.md` with your personal details, then start adding entities.
+## Installation
+```bash
+pip install kvault
+```
+Or install from source:
+```bash
+git clone https://github.com/cimo-labs/kvault
+cd kvault
+pip install -e .
+```
+## Quick Start
+```python
+from pathlib import Path
+from kvault import (
+    EntityIndex,
+    SimpleStorage,
+    ObservabilityLogger,
+    EntityResearcher
+)
+# Initialize
+kg_root = Path("my_knowledge_base")
+index = EntityIndex(kg_root / ".kvault" / "index.db")
+storage = SimpleStorage(kg_root)
+logger = ObservabilityLogger(kg_root / ".kvault" / "logs.db")
+researcher = EntityResearcher(index)
+# 1. Research - find existing entities
+matches = researcher.research("Alice Smith", email="alice@anthropic.com")
+action, target, confidence = researcher.suggest_action("Alice Smith")
+logger.log_research("Alice Smith", "alice smith",
+                    [m.__dict__ for m in matches], action)
+# 2. Decide - agent determines what to do
+if action == "create":
+    entity_path = "people/collaborators/alice_smith"
+    logger.log_decide("Alice Smith", "create",
+                      "No existing match found", confidence)
+# 3. Write - create/update the entity
+storage.create_entity(entity_path, {
+    "created": "2026-01-05",
+    "updated": "2026-01-05",
+    "source": "email:123",
+    "aliases": ["Alice", "alice@anthropic.com"]
+}, summary="# Alice Smith\n\nResearch scientist at Anthropic.")
+logger.log_write(entity_path, "create", "Created new entity")
+# 4. Update index
+index.add(entity_path, "Alice Smith",
+          ["Alice", "alice@anthropic.com"], "people")
+# 5. Propagate - update parent summaries
+ancestors = storage.get_ancestors(entity_path)
+logger.log_propagate(entity_path, ancestors)
+```
+## Core Components
+### EntityIndex
+SQLite-backed entity index with full-text search for fast lookups.
+```python
+from kvault import EntityIndex
+index = EntityIndex(Path("index.db"))
+# Add entity
+index.add("people/alice", "Alice Smith",
+          aliases=["Alice", "alice@example.com"],
+          category="people")
+# Search
+results = index.search("Alice")
+# Find by alias
+entry = index.find_by_alias("alice@example.com")
+# Find by email domain
+entries = index.find_by_email_domain("example.com")
+# Rebuild from filesystem
+count = index.rebuild(Path("knowledge_graph"))
+```
+### SimpleStorage
+Filesystem storage with minimal 4-field schema.
+```python
+from kvault import SimpleStorage
+storage = SimpleStorage(Path("knowledge_graph"))
+# Create entity
+storage.create_entity("people/alice", {
+    "created": "2026-01-05",
+    "updated": "2026-01-05",
+    "source": "manual",
+    "aliases": ["Alice"]
+}, summary="# Alice\n\nDescription here.")
+# Update entity
+storage.update_entity("people/alice",
+                      meta={"source": "email:123"},
+                      summary="# Alice\n\nUpdated description.")
+# Read
+meta = storage.read_meta("people/alice")
+summary = storage.read_summary("people/alice")
+# Navigate hierarchy
+ancestors = storage.get_ancestors("people/collaborators/alice")
+# Returns: ["people/collaborators", "people"]
+```
+### ObservabilityLogger
+Phase-based logging for debugging and system improvement.
+```python
+from kvault import ObservabilityLogger
+logger = ObservabilityLogger(Path("logs.db"))
+# Log phases
+logger.log_input([{"name": "Alice"}], source="email")
+logger.log_research("Alice", "alice", matches, "create")
+logger.log_decide("Alice", "create", "No match found", confidence=0.95)
+logger.log_write("people/alice", "create", "Created entity")
+logger.log_propagate("people/alice", ["people"])
+logger.log_error("validation_failed", entity="Alice",
+                 details={"field": "email"})
+# Query logs
+errors = logger.get_errors()
+decisions = logger.get_decisions(action="create")
+low_conf = logger.get_low_confidence(threshold=0.7)
+summary = logger.get_session_summary()
+```
+### EntityResearcher
+Research existing entities before creating new ones.
+```python
+from kvault import EntityResearcher, EntityIndex
+index = EntityIndex(Path("index.db"))
+researcher = EntityResearcher(index)
+# Find matches
+matches = researcher.research("Alice Smith", email="alice@example.com")
+# Get suggestion
+action, path, confidence = researcher.suggest_action("Alice Smith")
+# Returns: ("create", None, 0.95)  or  ("update", "people/alice", 0.90)
+# Quick checks
+exists = researcher.exists("Alice Smith", threshold=0.9)
+best = researcher.best_match("Alice Smith")
+```
+### Matching Strategies
+Pluggable strategies for entity deduplication.
+```python
+from kvault import (
+    AliasMatchStrategy,
+    FuzzyNameMatchStrategy,
+    EmailDomainMatchStrategy
+)
+# Alias matching - exact match (score: 1.0)
+alias_strategy = AliasMatchStrategy()
+# Fuzzy name matching (score: 0.85-0.99)
+fuzzy_strategy = FuzzyNameMatchStrategy(threshold=0.85)
+# Email domain matching (score: 0.85-0.95)
+domain_strategy = EmailDomainMatchStrategy()
+```
+## Storage Format
+### YAML Frontmatter (Preferred)
+Entities are stored as a single `_summary.md` file with YAML frontmatter:
+```markdown
+---
+created: 2026-01-05
+updated: 2026-01-05
+source: email:123
+aliases: [Alice, alice@anthropic.com, +14155551234]
+phone: +14155551234
+email: alice@anthropic.com
+relationship_type: colleague
+context: Met at NeurIPS 2024
+---
+# Alice Smith
+Research scientist at Anthropic working on causal discovery.
+## Background
+Collaborator on interpretability project.
+## Interactions
+- 2026-01-05: Initial contact logged
+## Notes
+- Interested in causal representation learning
+```
+**Required fields:** `created`, `updated`, `source`, `aliases`
+**Optional fields:** `phone`, `email`, `relationship_type`, `context`, `related_to`, `last_interaction`, `status`
+### Legacy Format (_meta.json)
+Separate `_meta.json` files are still supported for backward compatibility:
+```json
+{
+  "created": "2026-01-05",
+  "last_updated": "2026-01-05",
+  "sources": ["email:123"],
+  "aliases": ["Alice", "alice@anthropic.com"]
+}
+```
+**Note:** New entities should use YAML frontmatter. The index rebuilder supports both formats.
+## Development
+```bash
+# Install dev dependencies
+pip install -e ".[dev]"
+# Run tests
+pytest
+# Format code
+black kvault/
+# Type check
+mypy kvault/
+```
+## MCP Server (Claude Code Integration)
+The kvault MCP server provides direct tool access for Claude Code, enabling the 6-step workflow without subprocess parsing.
+### Installation
+```bash
+pip install kvault[mcp]  # Install with MCP support
+```
+### Configuration
+Add to `.claude/settings.json`:
+```json
+{
+  "mcpServers": {
+    "kvault": {
+      "command": "kvault-mcp",
+      "env": {}
+    }
+  }
+}
+```
+### Available Tools
+| Category | Tools |
+|----------|-------|
+| **Init** | `kvault_init`, `kvault_status` |
+| **Index** | `kvault_search`, `kvault_find_by_alias`, `kvault_find_by_email_domain`, `kvault_rebuild_index` |
+| **Entity** | `kvault_read_entity`, `kvault_write_entity`, `kvault_list_entities`, `kvault_delete_entity`, `kvault_move_entity` |
+| **Summary** | `kvault_read_summary`, `kvault_write_summary`, `kvault_get_parent_summaries` |
+| **Research** | `kvault_research` |
+| **Workflow** | `kvault_log_phase`, `kvault_write_journal`, `kvault_validate_transition` |
+### Example Workflow
+```
+1. kvault_init(kg_root="/path/to/kb")
+2. kvault_research(name="John Doe", phone="+14155551234")
+3. kvault_write_entity(path="people/contacts/john_doe", meta={...}, content="...", create=true)
+4. kvault_get_parent_summaries(path="people/contacts/john_doe")
+5. kvault_write_summary(path="people/contacts", content="...")
+6. kvault_write_journal(actions=[...], source="manual")
+7. kvault_rebuild_index()
+```
+### Benefits
+- **Structured JSON responses** - No regex parsing of CLI output
+- **Direct control** - Each tool call is explicit and debuggable
+- **Session state** - Track workflow progress across calls
+- **No timeouts** - Individual tools complete quickly
+---
+## CLI Usage
+```bash
+pip install -e ".[dev]"
+# Initialize a new KB
+kvault init my_kb --name "Alice"
+# Check KB integrity (propagation, journal, index, frontmatter, branching)
+kvault check --kb-root my_kb
+kvault check                      # Auto-detects KB root from cwd
+# Process a corpus
+kvault process --corpus /path/to/corpus --kg-root /path/to/kg --dry-run
+kvault process --corpus /path/to/corpus --kg-root /path/to/kg --apply
+# Rebuild and search the index
+kvault index rebuild --kg-root /path/to/kg
+kvault index search --db /path/to/kg/.kvault/index.db --query "Acme"
+# Session summary (observability)
+kvault log summary --db /path/to/kg/.kvault/logs.db
+```
+## License
+MIT
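
Reviewer note on the README's propagation step: the SimpleStorage section documents `storage.get_ancestors("people/collaborators/alice")` as returning `["people/collaborators", "people"]`, that is, parent paths ordered nearest first with the KB root excluded. The sketch below reproduces only that documented ordering with the standard library. The function name `ancestors` is hypothetical and this is not kvault's implementation; `kvault/core/storage.py` is not shown in this diff, so the real method may behave differently on edge cases.

```python
from pathlib import PurePosixPath


def ancestors(entity_path: str) -> list[str]:
    """Return the parent paths of entity_path, nearest first, excluding the root.

    Hypothetical helper that mirrors the ordering the README documents for
    SimpleStorage.get_ancestors; it does not touch the filesystem.
    """
    parts = PurePosixPath(entity_path).parts
    # Walk from the immediate parent up to the top-level category.
    return ["/".join(parts[:i]) for i in range(len(parts) - 1, 0, -1)]


print(ancestors("people/collaborators/alice"))
# → ['people/collaborators', 'people']
```

Under this reading, an agent performing the "Propagate" step would update the `_summary.md` of each returned path in order, so the most specific summary is refreshed before the broader category above it.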