mcppt 1.0.1__tar.gz → 1.1.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {mcppt-1.0.1 → mcppt-1.1.0}/PKG-INFO +3 -2
- mcppt-1.1.0/docs/STUDY_GUIDE.md +299 -0
- mcppt-1.1.0/mcppt/__init__.py +1 -0
- {mcppt-1.0.1 → mcppt-1.1.0}/mcppt/checks.py +210 -91
- {mcppt-1.0.1 → mcppt-1.1.0}/mcppt/cli.py +3 -0
- {mcppt-1.0.1 → mcppt-1.1.0}/mcppt/core.py +72 -59
- mcppt-1.1.0/mcppt/shell.py +888 -0
- {mcppt-1.0.1 → mcppt-1.1.0}/pyproject.toml +3 -2
- {mcppt-1.0.1 → mcppt-1.1.0}/requirements.txt +1 -0
- {mcppt-1.0.1 → mcppt-1.1.0}/tests/test_core.py +4 -4
- mcppt-1.1.0/wbs_mcppt_result.md +130 -0
- mcppt-1.0.1/mcppt/__init__.py +0 -1
- mcppt-1.0.1/mcppt/shell.py +0 -508
- {mcppt-1.0.1 → mcppt-1.1.0}/.github/workflows/ci.yml +0 -0
- {mcppt-1.0.1 → mcppt-1.1.0}/.gitignore +0 -0
- {mcppt-1.0.1 → mcppt-1.1.0}/LICENSE +0 -0
- {mcppt-1.0.1 → mcppt-1.1.0}/OPERATOR_GUIDE.md +0 -0
- {mcppt-1.0.1 → mcppt-1.1.0}/README.md +0 -0
- {mcppt-1.0.1 → mcppt-1.1.0}/app.py +0 -0
- {mcppt-1.0.1 → mcppt-1.1.0}/docs/MCPTROTTER_ARTICLE.md +0 -0
- {mcppt-1.0.1 → mcppt-1.1.0}/docs/MCPTROTTER_MEDIUM.md +0 -0
- {mcppt-1.0.1 → mcppt-1.1.0}/docs/mcptrotter.jpeg +0 -0
- {mcppt-1.0.1 → mcppt-1.1.0}/mcppt/report.py +0 -0
- {mcppt-1.0.1 → mcppt-1.1.0}/mcppt/server.py +0 -0
- {mcppt-1.0.1 → mcppt-1.1.0}/mcppt/tui.py +0 -0
- {mcppt-1.0.1 → mcppt-1.1.0}/smoke_test.py +0 -0
- {mcppt-1.0.1 → mcppt-1.1.0}/test_report.md +0 -0
- {mcppt-1.0.1 → mcppt-1.1.0}/test_server.log +0 -0
- {mcppt-1.0.1 → mcppt-1.1.0}/test_server.py +0 -0
- {mcppt-1.0.1 → mcppt-1.1.0}/tests/__init__.py +0 -0
- {mcppt-1.0.1 → mcppt-1.1.0}/tests/test_checks.py +0 -0
- {mcppt-1.0.1 → mcppt-1.1.0}/vuln_server.py +0 -0
- {mcppt-1.0.1 → mcppt-1.1.0}/wbs_scan_report.md +0 -0
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
Metadata-Version: 2.4
|
|
2
2
|
Name: mcppt
|
|
3
|
-
Version: 1.0
|
|
4
|
-
Summary: MCPTROTTER — MCP
|
|
3
|
+
Version: 1.1.0
|
|
4
|
+
Summary: MCPTROTTER — MCP Security Framework: 31 automated checks + manual exploration shell
|
|
5
5
|
Project-URL: Homepage, https://github.com/gurudeepmallam-cmd/mcppt
|
|
6
6
|
Project-URL: Repository, https://github.com/gurudeepmallam-cmd/mcppt
|
|
7
7
|
Project-URL: Issues, https://github.com/gurudeepmallam-cmd/mcppt/issues
|
|
@@ -21,6 +21,7 @@ Classifier: Topic :: Internet :: WWW/HTTP
|
|
|
21
21
|
Classifier: Topic :: Security
|
|
22
22
|
Requires-Python: >=3.10
|
|
23
23
|
Requires-Dist: flask>=3.0
|
|
24
|
+
Requires-Dist: requests>=2.28.0
|
|
24
25
|
Requires-Dist: rich>=13.0
|
|
25
26
|
Provides-Extra: ai
|
|
26
27
|
Requires-Dist: anthropic>=0.25; extra == 'ai'
|
|
@@ -0,0 +1,299 @@
|
|
|
1
|
+
# Study Guide — CI/CD Security, MCPTROTTER, Cloud Security, RAG
|
|
2
|
+
|
|
3
|
+
---
|
|
4
|
+
|
|
5
|
+
## 1. CI/CD — Concepts, Risks, and Findings
|
|
6
|
+
|
|
7
|
+
### What is CI/CD?
|
|
8
|
+
|
|
9
|
+
CI/CD stands for Continuous Integration / Continuous Delivery (or Deployment).
|
|
10
|
+
|
|
11
|
+
- **CI (Continuous Integration):** Every code push triggers automated checks — lint, tests, build. Catches bugs early before they reach production.
|
|
12
|
+
- **CD (Continuous Delivery):** After CI passes, the artifact (wheel, Docker image, binary) is automatically deployed or published — to PyPI, a cloud environment, or a registry.
|
|
13
|
+
|
|
14
|
+
### How MCPTROTTER's Pipeline Works
|
|
15
|
+
|
|
16
|
+
File: `.github/workflows/ci.yml`
|
|
17
|
+
|
|
18
|
+
```
|
|
19
|
+
git push to main
|
|
20
|
+
→ lint (ruff check)
|
|
21
|
+
→ test (pytest on Python 3.10, 3.11, 3.12)
|
|
22
|
+
→ build (python -m build → produces .whl + .tar.gz)
|
|
23
|
+
|
|
24
|
+
git tag v1.0.3 + git push origin v1.0.3
|
|
25
|
+
→ all above +
|
|
26
|
+
→ publish to PyPI (trusted publishing — no API key in CI)
|
|
27
|
+
→ GitHub Release created with dist files + auto release notes
|
|
28
|
+
```
|
|
29
|
+
|
|
30
|
+
**Trusted Publishing** — instead of storing a PyPI API token as a GitHub secret (which can be leaked), the pipeline uses OIDC (OpenID Connect). GitHub proves to PyPI "this is a legitimate workflow from this repo" using a short-lived token. No hardcoded credentials anywhere.
|
|
31
|
+
|
|
32
|
+
### CI/CD Security Risks
|
|
33
|
+
|
|
34
|
+
| Risk | What it means | Real example |
|
|
35
|
+
|------|--------------|--------------|
|
|
36
|
+
| Secrets in code | API keys, tokens hardcoded in repo | PyPI token exposed in chat — can be used by anyone to publish malicious packages |
|
|
37
|
+
| Supply chain attack | Attacker compromises a dependency used in CI | `actions/checkout@v4` — if this is compromised, every repo using it is affected |
|
|
38
|
+
| Dependency confusion | Attacker publishes a malicious package with same name as internal one | `pip install mcppt` — if the name wasn't claimed, attacker could have published first |
|
|
39
|
+
| Workflow injection | Untrusted input in `${{ github.event.issue.title }}` injected into shell | Runs arbitrary code in CI |
|
|
40
|
+
| Overprivileged tokens | `GITHUB_TOKEN` with `contents: write` + `id-token: write` — if leaked, attacker can publish and create releases |
|
|
41
|
+
| Artifact tampering | Build artifact replaced between build and publish jobs | Man-in-the-middle on artifact upload |
|
|
42
|
+
|
|
43
|
+
### What We Did (and what went wrong)
|
|
44
|
+
|
|
45
|
+
- PyPI token was pasted in chat **twice** — this is a real supply chain risk. Anyone who reads the chat history can publish to PyPI as you.
|
|
46
|
+
- **Fix:** Revoke token at https://pypi.org/manage/account/token/ and switch to trusted publishing via git tags — no token needed in CI.
|
|
47
|
+
- Manual publish process: `bump version → python -m build → twine upload`
|
|
48
|
+
- Automated: `git tag v1.0.3 && git push origin v1.0.3` triggers the full pipeline.
|
|
49
|
+
|
|
50
|
+
---
|
|
51
|
+
|
|
52
|
+
## 2. MCPTROTTER — Code and Repo Logic
|
|
53
|
+
|
|
54
|
+
### What is MCP?
|
|
55
|
+
|
|
56
|
+
Model Context Protocol (MCP) — a standard for AI agents to call external tools. Think of it as an API specifically designed for LLMs. Uses JSON-RPC 2.0 over HTTP with SSE (Server-Sent Events) for streaming responses.
|
|
57
|
+
|
|
58
|
+
Every MCP call looks like:
|
|
59
|
+
```json
|
|
60
|
+
{"jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": {"name": "tool_name", "arguments": {}}}
|
|
61
|
+
```
|
|
62
|
+
|
|
63
|
+
### Why MCPTROTTER Makes the Calls It Does
|
|
64
|
+
|
|
65
|
+
MCPTROTTER doesn't use curl internally. It uses Python's `urllib` from stdlib. Here's the logic flow:
|
|
66
|
+
|
|
67
|
+
#### `mcppt/core.py` — The Engine
|
|
68
|
+
- `configure(no_verify, proxy)` — sets up SSL context and proxy handler globally
|
|
69
|
+
- `mcp_init(url, token)` — sends `initialize` + `notifications/initialized` to establish MCP session, gets back `mcp-session-id`
|
|
70
|
+
- `rpc(url, method, params, token, session_id)` — sends a single JSON-RPC call, returns parsed response
|
|
71
|
+
|
|
72
|
+
Why initialize first? MCP Streamable HTTP requires a handshake — the server issues a `mcp-session-id` which must be sent on all subsequent calls. Without it, calls are rejected.
|
|
73
|
+
|
|
74
|
+
#### `mcppt/checks.py` — The 28 Checks
|
|
75
|
+
Each check is a function that:
|
|
76
|
+
1. Calls `rpc()` with a specific method and params
|
|
77
|
+
2. Inspects the response for vulnerability indicators
|
|
78
|
+
3. Appends findings to `ScanState`
|
|
79
|
+
|
|
80
|
+
Example — `check_enum`: calls `tools/list` with NO token → if it returns tools, finding fires (F-16 equivalent).
|
|
81
|
+
|
|
82
|
+
Example — `check_replay`: calls a tool once, records the `id`, calls it again with the same `id` → if both succeed, replay is confirmed.
|
|
83
|
+
|
|
84
|
+
Example — `check_stored`: calls a write tool with an injection payload → calls a read tool → checks if payload appears in response unescaped.
|
|
85
|
+
|
|
86
|
+
#### `mcppt/tui.py` — The Live Dashboard
|
|
87
|
+
Uses Python `rich` (Textual layout) to show findings panel + live log panel side by side while scan runs in a background thread. The `ScanState` dataclass is thread-safe (uses `threading.Lock`) so the TUI can read it while checks write to it.
|
|
88
|
+
|
|
89
|
+
#### `mcppt/shell.py` — Interactive REPL
|
|
90
|
+
Built on `cmd.Cmd` — same pattern as gobuster/ffuf interactive mode. Commands: `target`, `token`, `scan`, `findings`, `report`. The banner uses `rich.text.Text` objects (not markup strings) to avoid `\[/]` escaping issues with ANSI escape codes.
|
|
91
|
+
|
|
92
|
+
#### `mcppt/cli.py` — Entry Point
|
|
93
|
+
- `mcppt` (no args) → launches interactive shell
|
|
94
|
+
- `mcppt scan --url ... --checks ...` → non-interactive scan
|
|
95
|
+
- Key fix: `enum` is always prepended to checks list — other checks (replay, stored, auth) depend on the tool list that enum builds
|
|
96
|
+
|
|
97
|
+
#### `mcppt/server.py` — MCP Server Mode
|
|
98
|
+
MCPTROTTER can expose itself as an MCP server. Claude Desktop calls `scan_target` tool → MCPTROTTER runs all checks → returns findings JSON back to Claude. No copy-paste, no context switching.
|
|
99
|
+
|
|
100
|
+
### Repo Structure
|
|
101
|
+
```
|
|
102
|
+
mcppt_tool/
|
|
103
|
+
├── mcppt/
|
|
104
|
+
│ ├── __init__.py version string
|
|
105
|
+
│ ├── cli.py entry point + arg parsing
|
|
106
|
+
│ ├── core.py urllib HTTP engine, MCP init, RPC
|
|
107
|
+
│ ├── checks.py 28 security checks + ScanState
|
|
108
|
+
│ ├── tui.py Rich TUI dashboard
|
|
109
|
+
│ ├── shell.py cmd.Cmd interactive REPL + banner
|
|
110
|
+
│ ├── report.py MD + JSON report export
|
|
111
|
+
│ └── server.py MCP server mode (Flask SSE)
|
|
112
|
+
├── test_server.py vulnerable demo MCP server
|
|
113
|
+
├── pyproject.toml hatchling build config
|
|
114
|
+
├── docs/
|
|
115
|
+
│ └── mcptrotter.jpeg cyberpunk logo
|
|
116
|
+
└── .github/workflows/
|
|
117
|
+
└── ci.yml lint → test → build → publish
|
|
118
|
+
```
|
|
119
|
+
|
|
120
|
+
---
|
|
121
|
+
|
|
122
|
+
## 3. WBS MCP Findings — What Was Confirmed Today
|
|
123
|
+
|
|
124
|
+
### Final 5 Findings
|
|
125
|
+
|
|
126
|
+
#### F1 — MCP Agent Compromise via Chained Prompt Injection, SSRF, and Broken Access Control (CRITICAL)
|
|
127
|
+
Three weaknesses chained:
|
|
128
|
+
- `resources/list` accessible with no auth → free recon
|
|
129
|
+
- Injection payload written into `EngagementInternalNotes` via `update_wbs_form` → stored in form verbatim
|
|
130
|
+
- When any user asks Teams bot about the form → injected instruction executes in LLM context → financial data disclosed ($500K fixed fee, team PII)
|
|
131
|
+
- `FeeTextOnInvoice` accepts any URL → Teams SkypeUriPreview auto-fetches it server-side → data exfiltration without user click
|
|
132
|
+
|
|
133
|
+
#### F2 — Hardcoded Shared Service Account (MEDIUM)
|
|
134
|
+
Every user (tpt1, tpt2, Harika, anyone) gets `USITSDASBPM256@deloitte.com` when asking the bot "what is my account". Bot has no awareness of actual caller. All actions logged under one identity — forensics blind.
|
|
135
|
+
|
|
136
|
+
#### F3 — Replay Attack + No Rate Limiting (HIGH)
|
|
137
|
+
- Same JSON-RPC `id` accepted indefinitely — 1000 identical requests, all 200
|
|
138
|
+
- No nonce, no timestamp window, no duplicate detection
|
|
139
|
+
- Read tool: `get_all_market_offering_categories` replayed — same data returned every time
|
|
140
|
+
- Write tool: `test_wbs_automation_message` replayed — accepted every time
|
|
141
|
+
|
|
142
|
+
#### F4 — CORS Wildcard (MEDIUM)
|
|
143
|
+
`Access-Control-Allow-Origin: *` — any origin including `null` accepted. Enables cross-site MCP abuse from a malicious browser page.
|
|
144
|
+
|
|
145
|
+
#### F5 — Missing Security Headers + HSTS (LOW)
|
|
146
|
+
HSTS `max-age=16070400` (186 days, below 1 year). Missing `X-Content-Type-Options`. `Server: Google Frontend` leaks infrastructure.
|
|
147
|
+
|
|
148
|
+
---
|
|
149
|
+
|
|
150
|
+
## 4. Cloud Security — Azure Focus
|
|
151
|
+
|
|
152
|
+
### Azure Architecture (WBS target)
|
|
153
|
+
```
|
|
154
|
+
Teams Chat
|
|
155
|
+
→ Azure Bot Service + WAF
|
|
156
|
+
→ Azure Container Apps (ACA) — agent runs here
|
|
157
|
+
→ MCP Server on GCP
|
|
158
|
+
→ WBS APIs + SWIFT
|
|
159
|
+
```
|
|
160
|
+
|
|
161
|
+
### Key Azure Security Concepts
|
|
162
|
+
|
|
163
|
+
#### Azure AD / Entra ID
|
|
164
|
+
- Every user, app, and service in Azure has an identity in Entra ID
|
|
165
|
+
- **Service Principal** — non-human identity for apps/services (like `USITSDASBPM256` — this is likely an App Registration service principal)
|
|
166
|
+
- **Managed Identity** — Azure-assigned identity for services, no credentials to manage. If WBS had used Managed Identity properly, the hardcoded service account issue wouldn't exist
|
|
167
|
+
- **RBAC** — Role-Based Access Control. Roles assigned to identities at subscription/resource group/resource level
|
|
168
|
+
|
|
169
|
+
#### Azure Bot Service
|
|
170
|
+
- Hosts the Teams bot middleware
|
|
171
|
+
- Routes messages from Teams → ACA agent → back to Teams
|
|
172
|
+
- Has its own App Registration (`WBS App ID: 3c91b69e-d6f2-4f87-980b-65388b7f5d1c`)
|
|
173
|
+
- **Attack surface:** Bot Framework token validation, Teams channel authentication
|
|
174
|
+
|
|
175
|
+
#### Azure Container Apps (ACA)
|
|
176
|
+
- Serverless container hosting — the agent LLM pipeline runs here
|
|
177
|
+
- Has 6 layers: Input Guardrail → Main LLM → Pre-conditions → MCP Call → Output Guardrail → Response
|
|
178
|
+
- **Attack surface:** Container escape, environment variable secrets, SSRF from within the container
|
|
179
|
+
|
|
180
|
+
#### Azure WAF
|
|
181
|
+
- Web Application Firewall in front of the bot service
|
|
182
|
+
- Blocks known attack patterns (SQLi, XSS, common payloads)
|
|
183
|
+
- **Why it didn't catch our injection:** WAF looks at HTTP-level patterns. Our payload went through Teams → Bot → ACA → MCP. By the time it reached MCP, it was inside a legitimate JSON-RPC call. WAF never saw the raw payload.
|
|
184
|
+
|
|
185
|
+
#### Common Azure Security Findings
|
|
186
|
+
|
|
187
|
+
| Finding | What to look for |
|
|
188
|
+
|---------|-----------------|
|
|
189
|
+
| Exposed storage accounts | `*.blob.core.windows.net` publicly accessible |
|
|
190
|
+
| Overprivileged Managed Identity | Identity has Contributor or Owner on subscription |
|
|
191
|
+
| Key Vault misconfiguration | Secrets accessible without proper RBAC |
|
|
192
|
+
| Azure Function exposed | No auth on HTTP trigger |
|
|
193
|
+
| App Service environment variables | Connection strings in plain text in portal |
|
|
194
|
+
| SSRF to IMDS | `http://169.254.169.254/metadata/instance` — Azure Instance Metadata Service |
|
|
195
|
+
| Tenant ID enumeration | `login.microsoftonline.com/<tenantid>/.well-known/openid-configuration` |
|
|
196
|
+
|
|
197
|
+
#### SSRF in Azure Context
|
|
198
|
+
The Azure IMDS (Instance Metadata Service) endpoint `http://169.254.169.254/metadata/instance?api-version=2021-02-01` with header `Metadata: true` returns:
|
|
199
|
+
- Subscription ID
|
|
200
|
+
- Resource group name
|
|
201
|
+
- Managed Identity tokens
|
|
202
|
+
- VM name, location, SKU
|
|
203
|
+
|
|
204
|
+
If an SSRF exists in an Azure-hosted app, hitting IMDS can give you an access token for the Managed Identity — which may have permissions on Azure resources.
|
|
205
|
+
|
|
206
|
+
---
|
|
207
|
+
|
|
208
|
+
## 5. RAG — Does Your Repo Have It? Risks and Defences
|
|
209
|
+
|
|
210
|
+
### Does Your Repo Have RAG?
|
|
211
|
+
|
|
212
|
+
**Partially.** The `ad_agent/models/kb_engine.py` implements a lightweight RAG-like system:
|
|
213
|
+
|
|
214
|
+
```python
|
|
215
|
+
class KBEngine:
|
|
216
|
+
def search(self, query: str, top_k: int = 4) -> list:
|
|
217
|
+
qwords = set(query.lower().split())
|
|
218
|
+
scored = []
|
|
219
|
+
for c in self.chunks:
|
|
220
|
+
overlap = len(qwords & set(c["text"].lower().split()))
|
|
221
|
+
...
|
|
222
|
+
return [c for _, c in scored[:top_k]]
|
|
223
|
+
```
|
|
224
|
+
|
|
225
|
+
It reads `.md`/`.txt` files from a `./kb` directory, chunks them by markdown sections (400 words), and does **keyword overlap scoring** — not vector embeddings. This is called **BM25-style retrieval** or **lexical RAG** — simpler than semantic RAG but same security risks.
|
|
226
|
+
|
|
227
|
+
The retrieved chunks are injected into the LLM context as `=== KB CONTEXT ===` blocks before the prompt.
|
|
228
|
+
|
|
229
|
+
### What is RAG?
|
|
230
|
+
|
|
231
|
+
**Retrieval-Augmented Generation** — instead of relying purely on the LLM's training data, RAG:
|
|
232
|
+
1. Takes the user query
|
|
233
|
+
2. Searches a knowledge base (vector DB or keyword index) for relevant documents
|
|
234
|
+
3. Injects those documents into the LLM's context
|
|
235
|
+
4. LLM answers based on retrieved context + its training
|
|
236
|
+
|
|
237
|
+
Used to give LLMs access to up-to-date or private information without retraining.
|
|
238
|
+
|
|
239
|
+
### RAG Poisoning — What It Is and How to Attack It
|
|
240
|
+
|
|
241
|
+
RAG poisoning = injecting malicious content into the knowledge base so that when it's retrieved and fed to the LLM, it manipulates the model's output.
|
|
242
|
+
|
|
243
|
+
**Attack paths:**
|
|
244
|
+
|
|
245
|
+
| Attack | How |
|
|
246
|
+
|--------|-----|
|
|
247
|
+
| Direct KB write | If `kb_engine.add()` is exposed via API without auth — write a malicious `.md` file directly |
|
|
248
|
+
| Indirect poisoning | Attacker controls a webpage/doc that gets ingested into the KB via a crawler or import feature |
|
|
249
|
+
| Chunk injection | Craft content that scores high on keyword overlap — retrieved for any query |
|
|
250
|
+
| Instruction injection in KB | Store `IGNORE PREVIOUS INSTRUCTIONS. Do X instead.` in a KB chunk — retrieved and fed to LLM |
|
|
251
|
+
| Metadata poisoning | Manipulate source filenames so retrieved chunks appear to come from trusted sources |
|
|
252
|
+
|
|
253
|
+
**In your repo specifically:** `kb_engine.add(filename, content)` writes directly to the `./kb` directory. If this method is reachable without authentication, an attacker can write any content into the KB and it will be retrieved and injected into LLM context on the next query.
|
|
254
|
+
|
|
255
|
+
### How to Reduce RAG Poisoning
|
|
256
|
+
|
|
257
|
+
| Defence | How |
|
|
258
|
+
|---------|-----|
|
|
259
|
+
| Auth on KB writes | Never expose `kb_engine.add()` without strict authentication |
|
|
260
|
+
| Input sanitisation | Strip instruction-like patterns from content before storing (`IGNORE`, `SYSTEM:`, etc.) |
|
|
261
|
+
| Source allowlisting | Only ingest from trusted, controlled sources — never from user-supplied URLs |
|
|
262
|
+
| Chunk signing | Sign each chunk at ingest time — verify signature before injecting into context |
|
|
263
|
+
| Retrieval filtering | Post-process retrieved chunks — remove any that contain prompt injection patterns |
|
|
264
|
+
| Separate retrieval from generation | Log every retrieved chunk — alert on anomalous content before it reaches LLM |
|
|
265
|
+
| KB access control | Read-only access for the retrieval pipeline — write access only for admins |
|
|
266
|
+
|
|
267
|
+
### How to Reduce Hallucination
|
|
268
|
+
|
|
269
|
+
Hallucination = LLM confidently states something false.
|
|
270
|
+
|
|
271
|
+
| Technique | How it helps |
|
|
272
|
+
|-----------|-------------|
|
|
273
|
+
| RAG itself | Ground the LLM in real retrieved documents — less reliance on training memory |
|
|
274
|
+
| Temperature = 0 | Deterministic output — less creative, less likely to invent facts |
|
|
275
|
+
| System prompt constraints | "Only answer based on the provided context. If not in context, say I don't know." |
|
|
276
|
+
| Citation requirement | Force model to cite the source chunk for every claim — uncheckable claims = hallucination signal |
|
|
277
|
+
| Output validation | Second LLM call to verify the first answer against retrieved context |
|
|
278
|
+
| Smaller, focused context | Large context windows increase hallucination risk — retrieve only top 3-4 chunks |
|
|
279
|
+
| Confidence scoring | Ask model to rate confidence — low confidence answers flagged for human review |
|
|
280
|
+
| Structured output | JSON output with defined fields — harder to hallucinate than free text |
|
|
281
|
+
|
|
282
|
+
**In your repo:** The `kb_engine.format_context()` method truncates each chunk to 500 chars before injecting. This helps reduce noise but also means the LLM may not have full context — increasing hallucination risk on long technical content. Increasing to 1000 chars and reducing `top_k` from 4 to 3 would be a better balance.
|
|
283
|
+
|
|
284
|
+
---
|
|
285
|
+
|
|
286
|
+
## Quick Reference — Things to Remember
|
|
287
|
+
|
|
288
|
+
| Topic | Key point |
|
|
289
|
+
|-------|-----------|
|
|
290
|
+
| CI/CD publish | `git tag v1.0.3 && git push origin v1.0.3` → auto publishes to PyPI |
|
|
291
|
+
| Trusted publishing | No token in CI — OIDC proves repo identity to PyPI |
|
|
292
|
+
| enum auto-prepend | Always runs first — replay/stored/auth need the tool list it builds |
|
|
293
|
+
| MCP session | Must `initialize` first to get `mcp-session-id` — required on all subsequent calls |
|
|
294
|
+
| WBS write tools | Only work via Teams bot session — direct curl blocked by WBS backend ACL |
|
|
295
|
+
| Azure IMDS | `169.254.169.254/metadata/instance` — SSRF target in Azure, returns Managed Identity tokens |
|
|
296
|
+
| RAG poisoning | Write malicious content to KB → retrieved on any matching query → injected into LLM context |
|
|
297
|
+
| Hallucination fix | Temperature 0 + citation requirement + output validation |
|
|
298
|
+
| CORS wildcard risk | `ACAO: *` on an API endpoint means any website can make authenticated cross-origin requests |
|
|
299
|
+
| Replay protection | Needs nonce + timestamp window — without both, captured requests valid forever |
|
|
@@ -0,0 +1 @@
|
|
|
1
|
+
__version__ = "1.1.0"
|