npm - hana-kgvector - Versions diffs - 0.1.0 - Mend

hana-kgvector 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (14) hide show

package/LICENSE +21 -0
package/README.md +449 -0
package/dist/Stream-JW2S2DUH.js +6 -0
package/dist/Stream-JW2S2DUH.js.map +1 -0
package/dist/chunk-I3F3SOHM.js +726 -0
package/dist/chunk-I3F3SOHM.js.map +1 -0
package/dist/chunk-VUNV25KB.js +16 -0
package/dist/chunk-VUNV25KB.js.map +1 -0
package/dist/index.d.ts +382 -0
package/dist/index.js +1036 -0
package/dist/index.js.map +1 -0
package/dist/lib-J24D2XQX.js +1256 -0
package/dist/lib-J24D2XQX.js.map +1 -0
package/package.json +59 -0

package/LICENSE ADDED Viewed

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

package/README.md ADDED Viewed

@@ -0,0 +1,449 @@
+# hana-kgvector
+A TypeScript framework for building **hybrid GraphRAG** applications using SAP HANA Cloud as the unified backend for knowledge graphs (RDF) and vector embeddings.
+## Features
+- **Unified Storage**: SAP HANA Cloud for both RDF triples (Knowledge Graph Engine) and vector embeddings (Vector Engine)
+- **PropertyGraphIndex**: LlamaIndex-inspired API for building and querying property graphs
+- **Hybrid Retrieval**: Combine vector similarity search with graph traversal
+- **Schema-Guided Extraction**: Extract entities and relations from documents using LLMs
+- **Multi-Tenancy**: Isolate data into logical "Spaces" for different domains
+- **LLM Agnostic**: Works with any LLM via LiteLLM proxy (OpenAI, Anthropic, Azure, etc.)
+> 📚 **New to hana-kgvector?** Check out the [Step-by-Step Tutorial](./TUTORIAL.md) for a complete guide.
+## Installation
+```bash
+pnpm add hana-kgvector
+# or
+npm install hana-kgvector
+```
+## Quick Start
+### 1. Setup Environment
+Create a `.env.local` file:
+```env
+# SAP HANA Cloud
+HANA_HOST=your-hana-instance.hanacloud.ondemand.com:443
+HANA_USER=your_user
+HANA_PASSWORD=your_password
+# LiteLLM Proxy
+LITELLM_PROXY_URL=http://localhost:4000
+LITELLM_API_KEY=your_key
+# Models
+DEFAULT_LLM_MODEL=gpt-4o-mini
+DEFAULT_EMBEDDING_MODEL=text-embedding-3-small
+```
+### 2. Create a PropertyGraphIndex
+```typescript
+import {
+  loadEnv,
+  createHanaConnection,
+  HanaPropertyGraphStore,
+  PropertyGraphIndex,
+  SchemaLLMPathExtractor,
+  ImplicitPathExtractor,
+} from "hana-kgvector";
+import OpenAI from "openai";
+// Load environment
+loadEnv();
+// Connect to HANA
+const conn = await createHanaConnection({
+  host: process.env.HANA_HOST!,
+  user: process.env.HANA_USER!,
+  password: process.env.HANA_PASSWORD!,
+});
+// Create OpenAI client (via LiteLLM)
+const openai = new OpenAI({
+  apiKey: process.env.LITELLM_API_KEY,
+  baseURL: process.env.LITELLM_PROXY_URL,
+});
+// Create embed model adapter
+const embedModel = {
+  async getTextEmbedding(text: string) {
+    const res = await openai.embeddings.create({
+      model: process.env.DEFAULT_EMBEDDING_MODEL ?? "text-embedding-3-small",
+      input: text,
+      encoding_format: "base64", // Required for some LiteLLM proxy configurations
+    });
+    return res.data[0].embedding;
+  },
+  async getTextEmbeddingBatch(texts: string[]) {
+    if (texts.length === 0) return [];
+    const res = await openai.embeddings.create({
+      model: process.env.DEFAULT_EMBEDDING_MODEL ?? "text-embedding-3-small",
+      input: texts,
+      encoding_format: "base64",
+    });
+    return res.data.map((d) => d.embedding);
+  },
+};
+// Create LLM client adapter
+const llmClient = {
+  async structuredPredict<T>(schema: any, prompt: string): Promise<T> {
+    const res = await openai.chat.completions.create({
+      model: process.env.DEFAULT_LLM_MODEL ?? "gpt-4o-mini",
+      messages: [{ role: "user", content: prompt }],
+      response_format: { type: "json_object" },
+    });
+    let content = res.choices[0]?.message?.content ?? "{}";
+    // Strip markdown code blocks if present (some LLMs wrap JSON in ```json...```)
+    content = content.replace(/^```(?:json)?\s*\n?/i, "").replace(/\n?```\s*$/i, "").trim();
+    return JSON.parse(content);
+  },
+};
+// Create HANA-backed graph store
+const graphStore = new HanaPropertyGraphStore(conn, {
+  graphName: "my_knowledge_graph",  // RDF named graph identifier
+  // vectorDimension is auto-detected from first embedding
+});
+// Create PropertyGraphIndex with extractors
+const index = new PropertyGraphIndex({
+  propertyGraphStore: graphStore,
+  embedModel,
+  kgExtractors: [
+    new SchemaLLMPathExtractor({
+      llm: llmClient,
+      schema: {
+        entityTypes: ["PERSON", "ORGANIZATION", "LOCATION", "PRODUCT"],
+        relationTypes: ["WORKS_AT", "LOCATED_IN", "PRODUCES", "KNOWS"],
+        validationSchema: [
+          ["PERSON", "WORKS_AT", "ORGANIZATION"],
+          ["PERSON", "KNOWS", "PERSON"],
+          ["ORGANIZATION", "LOCATED_IN", "LOCATION"],
+          ["ORGANIZATION", "PRODUCES", "PRODUCT"],
+        ],
+      },
+    }),
+    new ImplicitPathExtractor(),
+  ],
+  embedKgNodes: true,
+});
+```
+### 3. Insert Documents
+```typescript
+await index.insert([
+  {
+    id: "doc_1",
+    text: "Alice works at SAP in Walldorf. She collaborates with Bob.",
+    metadata: { documentId: "company_info" },
+  },
+  {
+    id: "doc_2",
+    text: "SAP produces enterprise software and is headquartered in Germany.",
+    metadata: { documentId: "company_info" },
+  },
+]);
+```
+### 4. Query the Graph
+```typescript
+// Simple query
+const results = await index.query("Who works at SAP?");
+for (const result of results) {
+  console.log(`[${result.score.toFixed(3)}] ${result.node.text}`);
+}
+// Advanced: Use retriever directly
+import { VectorContextRetriever } from "hana-kgvector";
+const retriever = new VectorContextRetriever({
+  graphStore,
+  embedModel,
+  similarityTopK: 5,
+  pathDepth: 2,  // Traverse 2 hops from matched nodes
+});
+const nodes = await retriever.retrieve({ queryStr: "SAP employees" });
+```
+## Architecture
+```
+┌────────────────────────────────────────────────────────────────────┐
+│                        hana-kgvector                               │
+├────────────────────────────────────────────────────────────────────┤
+│                                                                    │
+│  ┌────────────────────┐  ┌──────────────────┐  ┌────────────────┐  │
+│  │ PropertyGraphIndex │  │   Extractors     │  │  Retrievers    │  │
+│  │  - insert()        │  │  - SchemaLLM     │  │  - Vector      │  │
+│  │  - query()         │  │  - Implicit      │  │  - PGRetriever │  │
+│  └────────┬───────────┘  └──────────────────┘  └────────────────┘  │
+│           │                                                        │
+│           ▼                                                        │
+│  ┌──────────────────────────────────────────────────────────┐      │
+│  │              HanaPropertyGraphStore                      │      │
+│  │  - upsertNodes()   - vectorQuery()   - getRelMap()       │      │
+│  └──────────────────────────────────────────────────────────┘      │
+│           │                                                        │
+│           ▼                                                        │
+│  ┌──────────────────────┐    ┌─────────────────────┐               │
+│  │   HANA Vector Engine │    │   HANA KG Engine    │               │
+│  │   (REAL_VECTOR)      │    │   (SPARQL_EXECUTE)  │               │
+│  └──────────────────────┘    └─────────────────────┘               │
+│                                                                    │
+└────────────────────────────────────────────────────────────────────┘
+```
+## Core Components
+### PropertyGraphIndex
+Main entry point for building and querying knowledge graphs.
+```typescript
+const index = new PropertyGraphIndex({
+  propertyGraphStore: graphStore,  // Required: HANA-backed store
+  embedModel,                       // Optional: for vector search
+  kgExtractors: [...],             // Optional: extraction pipeline
+  embedKgNodes: true,              // Embed entity nodes
+});
+```
+### HanaPropertyGraphStore
+HANA-backed implementation of PropertyGraphStore interface.
+```typescript
+const store = new HanaPropertyGraphStore(conn, {
+  graphName: "my_graph",              // RDF named graph identifier
+  vectorTableName: "MY_VECTORS",      // Optional: custom table name
+  // vectorDimension auto-detected from embeddings (supports 1536, 3072, etc.)
+});
+```
+### Extractors
+Transform text nodes into entities and relations.
+| Extractor | Description |
+|-----------|-------------|
+| `SchemaLLMPathExtractor` | Schema-guided extraction with LLM |
+| `ImplicitPathExtractor` | Extract structure-based relations (CHUNK → DOCUMENT) |
+### Retrievers
+Retrieve relevant context from the graph.
+| Retriever | Description |
+|-----------|-------------|
+| `VectorContextRetriever` | Vector similarity → graph traversal |
+| `PGRetriever` | Orchestrates multiple sub-retrievers |
+## Configuration Reference
+### HanaPropertyGraphStore Options
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `graphName` | `string` | Required | RDF named graph identifier (e.g., `"my_knowledge_graph"`) |
+| `vectorTableName` | `string` | Auto-generated | Custom table name for vector storage |
+| `llamaNodesTableName` | `string` | Auto-generated | Custom table name for document nodes |
+| `resetTables` | `boolean` | `false` | Drop and recreate tables on init (dev/test only) |
+### PropertyGraphIndex Options
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `propertyGraphStore` | `PropertyGraphStore` | Required | HANA-backed graph store instance |
+| `embedModel` | `EmbedModel` | - | Embedding model for vector search |
+| `kgExtractors` | `TransformComponent[]` | `[ImplicitPathExtractor]` | Pipeline of entity/relation extractors |
+| `embedKgNodes` | `boolean` | `true` | Generate embeddings for extracted entity nodes |
+| `showProgress` | `boolean` | `false` | Log progress during extraction |
+### Query/Retrieval Options
+These options can be passed to `index.query()` or `index.asRetriever()`:
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `similarityTopK` | `number` | `4` | Number of top similar nodes to retrieve via vector search |
+| `pathDepth` | `number` | `1` | Graph traversal depth (hops) from matched nodes |
+| `limit` | `number` | `30` | Maximum triplets/results to return after graph expansion |
+| `similarityScore` | `number` | - | Minimum similarity threshold (0.0-1.0) to filter results |
+| `crossCheckBoost` | `boolean` | `true` | Enable cross-check boosting (see below) |
+| `crossCheckBoostFactor` | `number` | `1.25` | Score multiplier for cross-check matches |
+**Example:**
+```typescript
+// Retrieve more results with deeper graph traversal
+const results = await index.query("Tech companies in California", {
+  similarityTopK: 10,    // More initial matches
+  pathDepth: 2,          // Traverse 2 hops
+  limit: 50,             // Return up to 50 results
+  similarityScore: 0.5,  // Only results with score >= 0.5
+  crossCheckBoost: true, // Enable provenance-based boosting
+});
+```
+### Cross-Check Boosting
+Cross-check boosting is an advanced retrieval feature that improves result quality by combining vector similarity with graph provenance:
+1. **Vector search** finds semantically similar entity nodes
+2. **Graph traversal** expands to find related facts/triplets
+3. **Cross-check**: If a graph fact originated from the same document as a vector-matched entity, its score is boosted
+This rewards results that are **both semantically relevant AND have explicit graph connections**, improving precision for complex queries.
+```typescript
+// Disable cross-check boosting for raw vector scores
+const results = await index.query("Apple CEO", {
+  crossCheckBoost: false,
+});
+// Increase boost factor for stronger provenance preference
+const results = await index.query("Apple CEO", {
+  crossCheckBoostFactor: 1.5,  // 50% boost instead of default 25%
+});
+```
+### SchemaLLMPathExtractor Options
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `llm` | `LLMClient` | Required | LLM client for entity extraction |
+| `schema.entityTypes` | `string[]` | Required | Allowed entity types (e.g., `["PERSON", "ORG"]`) |
+| `schema.relationTypes` | `string[]` | Required | Allowed relation types (e.g., `["WORKS_AT"]`) |
+| `schema.validationSchema` | `[string,string,string][]` | - | Valid triplet patterns (e.g., `["PERSON", "WORKS_AT", "ORG"]`) |
+| `maxTripletsPerChunk` | `number` | `10` | Max entities/relations to extract per document |
+| `strict` | `boolean` | `true` | Only allow relations defined in validationSchema |
+| `extractPromptTemplate` | `string` | Built-in | Custom prompt template for extraction |
+### VectorContextRetriever Options
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `graphStore` | `PropertyGraphStore` | Required | Graph store instance |
+| `embedModel` | `EmbedModel` | Required | Embedding model for query embedding |
+| `similarityTopK` | `number` | `4` | Number of top similar nodes |
+| `pathDepth` | `number` | `1` | Graph traversal depth |
+| `limit` | `number` | `30` | Max results after expansion |
+| `similarityScore` | `number` | - | Minimum similarity threshold |
+| `includeText` | `boolean` | `true` | Include source text in results |
+| `crossCheckBoost` | `boolean` | `true` | Enable cross-check boosting |
+| `crossCheckBoostFactor` | `number` | `1.25` | Score multiplier for provenance matches |
+## Multi-Tenancy
+Isolate data for different domains using separate graph names:
+```typescript
+// Tenant 1: Finance data
+const financeStore = new HanaPropertyGraphStore(conn, {
+  graphName: "finance_contracts",
+});
+const financeIndex = new PropertyGraphIndex({
+  propertyGraphStore: financeStore,
+  embedModel,
+  kgExtractors: [...],
+});
+// Tenant 2: HR data (completely isolated)
+const hrStore = new HanaPropertyGraphStore(conn, {
+  graphName: "hr_data",
+});
+const hrIndex = new PropertyGraphIndex({
+  propertyGraphStore: hrStore,
+  embedModel,
+  kgExtractors: [...],
+});
+```
+Each `graphName` creates:
+- A separate RDF named graph for knowledge graph data
+- A separate vector table for embeddings
+## Low-Level Access
+### Direct SPARQL Access
+```typescript
+import { HanaSparqlStore } from "hana-kgvector";
+const sparql = new HanaSparqlStore(conn);
+// Execute SPARQL query
+const result = await sparql.execute({
+  sparql: `SELECT ?s ?p ?o FROM <my-graph> WHERE { ?s ?p ?o } LIMIT 10`,
+});
+// Load Turtle data
+await sparql.loadTurtle({
+  turtle: `<urn:entity:1> <urn:rel:knows> <urn:entity:2> .`,
+  graphName: "urn:hkv:my_graph",
+});
+```
+## Requirements
+- **Node.js** 20+
+- **SAP HANA Cloud** with:
+  - Vector Engine enabled (GA since Q1 2024)
+  - Knowledge Graph Engine enabled (GA since Q1 2025)
+  - Minimum 3 vCPUs / 48 GB memory
+- **LiteLLM Proxy** (recommended) or direct LLM API access
+## Scripts
+```bash
+# Build
+pnpm run build
+# Test
+pnpm run test
+# Validate HANA connection
+pnpm run phase0:hana
+# Validate LiteLLM connection
+pnpm run phase0:litellm
+# Run PropertyGraphIndex smoke test
+pnpm run smoke:pg
+# Run quality test suite (comprehensive testing)
+pnpm exec tsx scripts/test-quality.ts
+```
+## Quality Test Results
+The quality test suite validates entity extraction, vector retrieval, and graph traversal:
+| Test | Score |
+|------|-------|
+| Entity Extraction (Organizations, People, Locations) | 100% |
+| Relation Extraction | 100% |
+| Vector Retrieval Relevance | 100% |
+| Graph Traversal | 100% |
+| Data Persistence (Vectors + RDF Triples) | 100% |
+| **Overall** | **97.3%** |
+## License
+MIT
+## Contributing
+Contributions welcome! Please read the PRD.md for architectural decisions and design principles.

package/dist/Stream-JW2S2DUH.js ADDED Viewed

@@ -0,0 +1,6 @@
+import {
+  require_Stream
+} from "./chunk-I3F3SOHM.js";
+import "./chunk-VUNV25KB.js";
+export default require_Stream();
+//# sourceMappingURL=Stream-JW2S2DUH.js.map

package/dist/Stream-JW2S2DUH.js.map ADDED Viewed

	@@ -0,0 +1 @@
1	+ {"version":3,"sources":[],"sourcesContent":[],"mappings":"","names":[]}