knowledgetree-rag 0.1.0__tar.gz

@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2025 Knowledge Tree
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,267 @@
1
+ Metadata-Version: 2.4
2
+ Name: knowledgetree-rag
3
+ Version: 0.1.0
4
+ Summary: Knowledge-graph-powered RAG sidecar for infrastructure querying
5
+ Author: Knowledge Tree
6
+ License-Expression: MIT
7
+ Project-URL: Homepage, https://github.com/knowledgetree-dev/kt-rag
8
+ Requires-Python: >=3.10
9
+ Description-Content-Type: text/markdown
10
+ License-File: LICENSE
11
+ Requires-Dist: lightrag-hku[api]>=1.4.0
12
+ Requires-Dist: sentence-transformers>=3.0.0
13
+ Requires-Dist: transformers>=4.40.0
14
+ Requires-Dist: httpx>=0.27.0
15
+ Requires-Dist: structlog>=24.1.0
16
+ Requires-Dist: python-dotenv>=1.0.0
17
+ Requires-Dist: tenacity>=8.2.0
18
+ Requires-Dist: asyncpg>=0.29.0
19
+ Dynamic: license-file
20
+
21
+ <div align="center">
22
+ <h1>kt-rag</h1>
23
+ <p>Knowledge-graph-powered RAG sidecar for infrastructure querying</p>
24
+ </div>
25
+
26
+ <p align="center">
27
+ <a href="https://github.com/knowledgetree-dev/kt-rag/releases"><img src="https://img.shields.io/github/v/release/knowledgetree-dev/kt-rag?style=flat&label=version" alt="Version"></a>
28
+ <a href="https://github.com/knowledgetree-dev/kt-rag/actions/workflows/ci.yml"><img src="https://img.shields.io/github/actions/workflow/status/knowledgetree-dev/kt-rag/ci.yml?branch=main&style=flat&label=CI" alt="CI"></a>
29
+ <a href="LICENSE"><img src="https://img.shields.io/badge/license-MIT-blue.svg?style=flat" alt="MIT License"></a>
30
+ <a href="https://pypi.org/project/knowledgetree-rag/"><img src="https://img.shields.io/pypi/v/knowledgetree-rag?style=flat&label=PyPI" alt="PyPI"></a>
31
+ </p>
32
+
33
+ ---
34
+
35
+ Turn your infrastructure graph into a searchable knowledge base. kt-rag feeds discovered infrastructure data — services, databases, Kubernetes clusters, DNS zones, CI/CD pipelines — into a [LightRAG](https://github.com/HKUDS/LightRAG) knowledge graph, then lets you ask questions in plain English.
36
+
37
+ **What you can do:**
38
+
39
+ - "Which services depend on Postgres?" — get a structured answer with confidence scoring
40
+ - "What's the blast radius if us-east-1 goes down?" — trace dependencies across providers
41
+ - "Show me all EC2 instances tagged `production` in eu-west-1" — filter by metadata
42
+ - "Generate a runbook for the auth service" — auto-document from live graph data
43
+ - "What changed in my infrastructure since last week?" — incremental sync awareness
44
+
45
+ ## How It Works
46
+
47
+ ```
48
+ ┌─────────────┐ ┌──────────┐ ┌──────────┐ ┌──────┐
49
+ │ Discovery │───▶│ kt-rag │───▶│ LightRAG │───▶│ LLM │
50
+ │ Plugins │ │ seed/sync│ │ KG Store │ │ │
51
+ └─────────────┘ └──────────┘ └──────────┘ └──────┘
52
+
53
+ ┌────┴────┐ ┌─────┴──────┐
54
+ │ API │ │ Query CLI │
55
+ │ Server │ │ / Library │
56
+ └─────────┘ └─────────────┘
57
+ ```
58
+
59
+ 1. **Plugins discover** your infrastructure (AWS, GitHub, K8s, etc.)
60
+ 2. **kt-rag seeds** entities and relationships into LightRAG's graph store
61
+ 3. **LightRAG** indexes everything for hybrid search (graph + vector)
62
+ 4. **Your question** triggers retrieval-augmented generation through an LLM
63
+
64
+ ## Quick Start
65
+
66
+ ### Install
67
+
68
+ ```bash
69
+ pip install knowledgetree-rag
70
+ ```
71
+
72
+ Or from source:
73
+
74
+ ```bash
75
+ git clone https://github.com/knowledgetree-dev/kt-rag.git
76
+ cd kt-rag
77
+ pip install -e .
78
+ ```
79
+
80
+ ### Configure
81
+
82
+ Copy the template and set your LLM API key:
83
+
84
+ ```bash
85
+ cp .env.example .env
86
+ # Edit .env with your LLM provider and API key
87
+ ```
88
+
89
+ Minimal `.env`:
90
+
91
+ ```ini
92
+ LLM_BINDING_API_KEY=sk-...
93
+ ```
94
+
95
+ ### Seed the graph
96
+
97
+ Populate LightRAG from your infrastructure data:
98
+
99
+ ```bash
100
+ kt-rag-seed
101
+ ```
102
+
103
+ ### Query
104
+
105
+ ```bash
106
+ kt-rag-query "which services depend on postgres?"
107
+ ```
108
+
109
+ Example output:
110
+
111
+ ```
112
+ Response: The following services depend on Postgres:
113
+ - auth-service (production, us-east-1)
114
+ - user-api (staging, eu-west-1)
115
+ - analytics-backend (production, us-west-2)
116
+ - docs-service (production, us-east-1)
117
+
118
+ Confidence: 0.82
119
+ ```
120
+
121
+ ### Start the API server
122
+
123
+ ```bash
124
+ kt-rag-server
125
+ # Listening on http://0.0.0.0:8085
126
+ ```
127
+
128
+ ```bash
129
+ curl -s http://localhost:8085/api/v1/rag/health
130
+ ```
131
+
132
+ ## CLI Commands
133
+
134
+ | Command | Description |
135
+ |---------|-------------|
136
+ | `kt-rag-seed` | Full seed from Knowledge Tree into LightRAG |
137
+ | `kt-rag-query "..."` | Natural-language query (supports `--filter`, `--min-score`, `--profile`) |
138
+ | `kt-rag-server` | REST API on port 8085 |
139
+
140
+ ## API
141
+
142
+ ### Query
143
+
144
+ ```http
145
+ POST /api/v1/rag/query
146
+ {
147
+ "question": "which services depend on postgres?",
148
+ "mode": "hybrid",
149
+ "filter": {"provider": "aws", "region": "us-east-1"},
150
+ "profile_name": "claude"
151
+ }
152
+ ```
153
+
154
+ **Response:**
155
+
156
+ ```json
157
+ {
158
+ "response": "The following services depend on Postgres...",
159
+ "confidence": 0.82,
160
+ "metadata": {
161
+ "mode": "hybrid",
162
+ "profile": "claude",
163
+ "tokens_used": 1247
164
+ }
165
+ }
166
+ ```
167
+
168
+ ### Profile Management
169
+
170
+ ```bash
171
+ # List profiles
172
+ GET /api/v1/rag/profiles
173
+
174
+ # Create profile
175
+ POST /api/v1/rag/profiles
176
+ {
177
+ "name": "claude",
178
+ "provider": "anthropic",
179
+ "api_key": "sk-ant-...",
180
+ "model": "claude-sonnet-4-20250514"
181
+ }
182
+
183
+ # Delete profile
184
+ DELETE /api/v1/rag/profiles/{name}
185
+ ```
186
+
187
+ ## Multi-LLM Profiles
188
+
189
+ Switch between LLM providers at query time:
190
+
191
+ ```bash
192
+ # Named profiles (persisted)
193
+ kt-rag-query "what's my blast radius?" --profile claude
194
+
195
+ # Ad-hoc via headers (server mode)
196
+ curl -H "Content-Type: application/json" -H "X-KT-LLM-Provider: anthropic" \
197
+ -H "X-KT-LLM-Api-Key: sk-ant-..." \
198
+ -d '{"question": "..."}' \
199
+ http://localhost:8085/api/v1/rag/query
200
+ ```
201
+
202
+ ## Configuration
203
+
204
+ All configuration is read from environment variables or a `.env` file:
205
+
206
+ | Variable | Default | Description |
207
+ |----------|---------|-------------|
208
+ | `LLM_BINDING` | `openai` | LLM provider (`openai`, `anthropic`, `ollama`, `azure`) |
209
+ | `LLM_BINDING_API_KEY` | — | API key |
210
+ | `LLM_BINDING_MODEL` | `gpt-4o` | Model name |
211
+ | `EMBEDDING_BINDING` | `openai` | Embedding provider |
212
+ | `EMBEDDING_BINDING_MODEL` | `text-embedding-3-small` | Embedding model |
213
+ | `LIGHTRAG_DIR` | `./output` | LightRAG working directory |
214
+ | `LIGHTRAG_PORT` | `8085` | API server port |
215
+ | `KT_API_URL` | `http://localhost:8080` | Knowledge Tree API URL |
216
+ | `KT_API_TOKEN` | — | Knowledge Tree auth token |
217
+ | `PG_URI` | `postgresql://localhost:5432/knowledgetree` | Database (read-only seed) |
218
+ | `SYNC_INTERVAL` | `300` | Sync interval in seconds |
219
+
220
+ ## Confidence Scoring
221
+
222
+ Every query response includes a 0.0–1.0 confidence score based on:
223
+
224
+ - **Retrieval quality** (30%) — how well the query matched stored entities
225
+ - **Context coverage** (30%) — how completely the retrieved data covers the question
226
+ - **Factual support** (40%) — whether the LLM's answer is grounded in retrieved facts
227
+
228
+ Set a minimum threshold:
229
+
230
+ ```bash
231
+ kt-rag-query "what repos are production?" --min-score 0.6
232
+ ```
233
+
234
+ ## Contributing
235
+
236
+ PRs are welcome. For feature requests or major changes, open an issue first.
237
+
238
+ ```bash
239
+ # Dev setup
240
+ python -m venv .venv && source .venv/bin/activate
241
+ pip install -e ".[dev]"
242
+
243
+ # Run tests
244
+ pytest tests/ -v --tb=short
245
+
246
+ # Type check
247
+ mypy .
248
+ ```
249
+
250
+ ## Project Structure
251
+
252
+ ```
253
+ ├── seed.py # Full graph seed
254
+ ├── sync.py # Incremental sync
255
+ ├── query.py # CLI query entry point
256
+ ├── server.py # FastAPI server
257
+ ├── config.py # Environment config loader
258
+ ├── profile_store.py # LLM provider profile persistence
259
+ ├── scorer.py # Confidence scoring
260
+ ├── kt_client.py # Knowledge Tree API client
261
+ ├── tests/ # Test suite
262
+ └── .env.example # Config template
263
+ ```
264
+
265
+ ## License
266
+
267
+ MIT
@@ -0,0 +1,247 @@
1
+ <div align="center">
2
+ <h1>kt-rag</h1>
3
+ <p>Knowledge-graph-powered RAG sidecar for infrastructure querying</p>
4
+ </div>
5
+
6
+ <p align="center">
7
+ <a href="https://github.com/knowledgetree-dev/kt-rag/releases"><img src="https://img.shields.io/github/v/release/knowledgetree-dev/kt-rag?style=flat&label=version" alt="Version"></a>
8
+ <a href="https://github.com/knowledgetree-dev/kt-rag/actions/workflows/ci.yml"><img src="https://img.shields.io/github/actions/workflow/status/knowledgetree-dev/kt-rag/ci.yml?branch=main&style=flat&label=CI" alt="CI"></a>
9
+ <a href="LICENSE"><img src="https://img.shields.io/badge/license-MIT-blue.svg?style=flat" alt="MIT License"></a>
10
+ <a href="https://pypi.org/project/knowledgetree-rag/"><img src="https://img.shields.io/pypi/v/knowledgetree-rag?style=flat&label=PyPI" alt="PyPI"></a>
11
+ </p>
12
+
13
+ ---
14
+
15
+ Turn your infrastructure graph into a searchable knowledge base. kt-rag feeds discovered infrastructure data — services, databases, Kubernetes clusters, DNS zones, CI/CD pipelines — into a [LightRAG](https://github.com/HKUDS/LightRAG) knowledge graph, then lets you ask questions in plain English.
16
+
17
+ **What you can do:**
18
+
19
+ - "Which services depend on Postgres?" — get a structured answer with confidence scoring
20
+ - "What's the blast radius if us-east-1 goes down?" — trace dependencies across providers
21
+ - "Show me all EC2 instances tagged `production` in eu-west-1" — filter by metadata
22
+ - "Generate a runbook for the auth service" — auto-document from live graph data
23
+ - "What changed in my infrastructure since last week?" — incremental sync awareness
24
+
25
+ ## How It Works
26
+
27
+ ```
28
+ ┌─────────────┐ ┌──────────┐ ┌──────────┐ ┌──────┐
29
+ │ Discovery │───▶│ kt-rag │───▶│ LightRAG │───▶│ LLM │
30
+ │ Plugins │ │ seed/sync│ │ KG Store │ │ │
31
+ └─────────────┘ └──────────┘ └──────────┘ └──────┘
32
+
33
+ ┌────┴────┐ ┌─────┴──────┐
34
+ │ API │ │ Query CLI │
35
+ │ Server │ │ / Library │
36
+ └─────────┘ └─────────────┘
37
+ ```
38
+
39
+ 1. **Plugins discover** your infrastructure (AWS, GitHub, K8s, etc.)
40
+ 2. **kt-rag seeds** entities and relationships into LightRAG's graph store
41
+ 3. **LightRAG** indexes everything for hybrid search (graph + vector)
42
+ 4. **Your question** triggers retrieval-augmented generation through an LLM
43
+
44
+ ## Quick Start
45
+
46
+ ### Install
47
+
48
+ ```bash
49
+ pip install knowledgetree-rag
50
+ ```
51
+
52
+ Or from source:
53
+
54
+ ```bash
55
+ git clone https://github.com/knowledgetree-dev/kt-rag.git
56
+ cd kt-rag
57
+ pip install -e .
58
+ ```
59
+
60
+ ### Configure
61
+
62
+ Copy the template and set your LLM API key:
63
+
64
+ ```bash
65
+ cp .env.example .env
66
+ # Edit .env with your LLM provider and API key
67
+ ```
68
+
69
+ Minimal `.env`:
70
+
71
+ ```ini
72
+ LLM_BINDING_API_KEY=sk-...
73
+ ```
74
+
75
+ ### Seed the graph
76
+
77
+ Populate LightRAG from your infrastructure data:
78
+
79
+ ```bash
80
+ kt-rag-seed
81
+ ```
82
+
83
+ ### Query
84
+
85
+ ```bash
86
+ kt-rag-query "which services depend on postgres?"
87
+ ```
88
+
89
+ Example output:
90
+
91
+ ```
92
+ Response: The following services depend on Postgres:
93
+ - auth-service (production, us-east-1)
94
+ - user-api (staging, eu-west-1)
95
+ - analytics-backend (production, us-west-2)
96
+ - docs-service (production, us-east-1)
97
+
98
+ Confidence: 0.82
99
+ ```
100
+
101
+ ### Start the API server
102
+
103
+ ```bash
104
+ kt-rag-server
105
+ # Listening on http://0.0.0.0:8085
106
+ ```
107
+
108
+ ```bash
109
+ curl -s http://localhost:8085/api/v1/rag/health
110
+ ```
111
+
112
+ ## CLI Commands
113
+
114
+ | Command | Description |
115
+ |---------|-------------|
116
+ | `kt-rag-seed` | Full seed from Knowledge Tree into LightRAG |
117
+ | `kt-rag-query "..."` | Natural-language query (supports `--filter`, `--min-score`, `--profile`) |
118
+ | `kt-rag-server` | REST API on port 8085 |
119
+
120
+ ## API
121
+
122
+ ### Query
123
+
124
+ ```http
125
+ POST /api/v1/rag/query
126
+ {
127
+ "question": "which services depend on postgres?",
128
+ "mode": "hybrid",
129
+ "filter": {"provider": "aws", "region": "us-east-1"},
130
+ "profile_name": "claude"
131
+ }
132
+ ```
133
+
134
+ **Response:**
135
+
136
+ ```json
137
+ {
138
+ "response": "The following services depend on Postgres...",
139
+ "confidence": 0.82,
140
+ "metadata": {
141
+ "mode": "hybrid",
142
+ "profile": "claude",
143
+ "tokens_used": 1247
144
+ }
145
+ }
146
+ ```
147
+
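The request and response shapes above can be exercised from Python with only the standard library. This is a minimal sketch, assuming the server runs on the default port 8085 and accepts a JSON body; the helper names `build_query_request` and `ask` are hypothetical and not part of kt-rag.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8085"  # default server address from this README


def build_query_request(question, mode="hybrid", filters=None,
                        profile_name=None, base_url=BASE_URL):
    """Build (but do not send) a POST request for /api/v1/rag/query.

    Hypothetical helper: field names mirror the JSON example above.
    """
    payload = {"question": question, "mode": mode}
    if filters:
        payload["filter"] = filters
    if profile_name:
        payload["profile_name"] = profile_name
    return urllib.request.Request(
        url=f"{base_url}/api/v1/rag/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question, **kwargs):
    """Send the query and return (response text, confidence)."""
    request = build_query_request(question, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as resp:
        body = json.load(resp)
    return body["response"], body["confidence"]
```

With a server running, `ask("which services depend on postgres?", profile_name="claude")` would return the answer text and its confidence score, assuming the response has the shape shown above.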
148
+ ### Profile Management
149
+
150
+ ```bash
151
+ # List profiles
152
+ GET /api/v1/rag/profiles
153
+
154
+ # Create profile
155
+ POST /api/v1/rag/profiles
156
+ {
157
+ "name": "claude",
158
+ "provider": "anthropic",
159
+ "api_key": "sk-ant-...",
160
+ "model": "claude-sonnet-4-20250514"
161
+ }
162
+
163
+ # Delete profile
164
+ DELETE /api/v1/rag/profiles/{name}
165
+ ```
166
+
167
+ ## Multi-LLM Profiles
168
+
169
+ Switch between LLM providers at query time:
170
+
171
+ ```bash
172
+ # Named profiles (persisted)
173
+ kt-rag-query "what's my blast radius?" --profile claude
174
+
175
+ # Ad-hoc via headers (server mode)
176
+ curl -H "Content-Type: application/json" -H "X-KT-LLM-Provider: anthropic" \
177
+ -H "X-KT-LLM-Api-Key: sk-ant-..." \
178
+ -d '{"question": "..."}' \
179
+ http://localhost:8085/api/v1/rag/query
180
+ ```
181
+
182
+ ## Configuration
183
+
184
+ All configuration is read from environment variables or a `.env` file:
185
+
186
+ | Variable | Default | Description |
187
+ |----------|---------|-------------|
188
+ | `LLM_BINDING` | `openai` | LLM provider (`openai`, `anthropic`, `ollama`, `azure`) |
189
+ | `LLM_BINDING_API_KEY` | — | API key |
190
+ | `LLM_BINDING_MODEL` | `gpt-4o` | Model name |
191
+ | `EMBEDDING_BINDING` | `openai` | Embedding provider |
192
+ | `EMBEDDING_BINDING_MODEL` | `text-embedding-3-small` | Embedding model |
193
+ | `LIGHTRAG_DIR` | `./output` | LightRAG working directory |
194
+ | `LIGHTRAG_PORT` | `8085` | API server port |
195
+ | `KT_API_URL` | `http://localhost:8080` | Knowledge Tree API URL |
196
+ | `KT_API_TOKEN` | — | Knowledge Tree auth token |
197
+ | `PG_URI` | `postgresql://localhost:5432/knowledgetree` | Database (read-only seed) |
198
+ | `SYNC_INTERVAL` | `300` | Sync interval in seconds |
199
+
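To illustrate how the defaults in the table could be resolved, here is a minimal sketch of an environment-based loader. It is not the project's `config.py`: the `Settings` class and `load_settings` function are hypothetical, and only a subset of the variables is shown.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for a subset of the variables in the table."""
    llm_binding: str
    llm_api_key: Optional[str]
    llm_model: str
    lightrag_dir: str
    lightrag_port: int
    kt_api_url: str
    sync_interval: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from `env` (os.environ by default), applying table defaults."""
    env = os.environ if env is None else env
    return Settings(
        llm_binding=env.get("LLM_BINDING", "openai"),
        llm_api_key=env.get("LLM_BINDING_API_KEY"),  # no default in the table
        llm_model=env.get("LLM_BINDING_MODEL", "gpt-4o"),
        lightrag_dir=env.get("LIGHTRAG_DIR", "./output"),
        lightrag_port=int(env.get("LIGHTRAG_PORT", "8085")),
        kt_api_url=env.get("KT_API_URL", "http://localhost:8080"),
        sync_interval=int(env.get("SYNC_INTERVAL", "300")),
    )
```

Passing an explicit mapping instead of relying on `os.environ` keeps the loader easy to test; values from a `.env` file would need to be loaded into the environment first, for example with `python-dotenv`.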
200
+ ## Confidence Scoring
201
+
202
+ Every query response includes a 0.0–1.0 confidence score based on:
203
+
204
+ - **Retrieval quality** (30%) — how well the query matched stored entities
205
+ - **Context coverage** (30%) — how completely the retrieved data covers the question
206
+ - **Factual support** (40%) — whether the LLM's answer is grounded in retrieved facts
207
+
208
+ Set a minimum threshold:
209
+
210
+ ```bash
211
+ kt-rag-query "what repos are production?" --min-score 0.6
212
+ ```
213
+
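The 30/30/40 weighting described in this section amounts to a weighted sum. The sketch below assumes each component is already normalized to the range 0.0 to 1.0. The names `combine_confidence` and `passes_threshold` are hypothetical and not taken from `scorer.py`, and treating `--min-score` as an inclusive lower bound is also an assumption.

```python
# Hypothetical sketch of the 30/30/40 weighting; not the actual scorer.py code.
WEIGHTS = {
    "retrieval_quality": 0.30,
    "context_coverage": 0.30,
    "factual_support": 0.40,
}


def combine_confidence(components):
    """Return the weighted sum of component scores, clamped to [0.0, 1.0]."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        score = components[name]
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"{name} must be within [0.0, 1.0], got {score}")
        total += weight * score
    return round(min(max(total, 0.0), 1.0), 4)


def passes_threshold(confidence, min_score):
    """Assumed --min-score semantics: keep answers at or above the threshold."""
    return confidence >= min_score
```

For example, component scores of 0.9, 0.8 and 0.8 give 0.27 + 0.24 + 0.32 = 0.83, which would pass a `--min-score 0.6` threshold under these assumptions.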
214
+ ## Contributing
215
+
216
+ PRs are welcome. For feature requests or major changes, open an issue first.
217
+
218
+ ```bash
219
+ # Dev setup
220
+ python -m venv .venv && source .venv/bin/activate
221
+ pip install -e ".[dev]"
222
+
223
+ # Run tests
224
+ pytest tests/ -v --tb=short
225
+
226
+ # Type check
227
+ mypy .
228
+ ```
229
+
230
+ ## Project Structure
231
+
232
+ ```
233
+ ├── seed.py # Full graph seed
234
+ ├── sync.py # Incremental sync
235
+ ├── query.py # CLI query entry point
236
+ ├── server.py # FastAPI server
237
+ ├── config.py # Environment config loader
238
+ ├── profile_store.py # LLM provider profile persistence
239
+ ├── scorer.py # Confidence scoring
240
+ ├── kt_client.py # Knowledge Tree API client
241
+ ├── tests/ # Test suite
242
+ └── .env.example # Config template
243
+ ```
244
+
245
+ ## License
246
+
247
+ MIT