docuware-mcp 0.1.0__tar.gz

@@ -0,0 +1,117 @@
+ Metadata-Version: 2.4
+ Name: docuware-mcp
+ Version: 0.1.0
+ Summary: MCP server exposing a DocuWare DMS to LLM-based agents
+ Author: Stefan Schönberger
+ Author-email: Stefan Schönberger <stefan@sniner.dev>
+ License-Expression: BSD-3-Clause
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Operating System :: OS Independent
+ Classifier: Programming Language :: Python :: 3.10
+ Classifier: Programming Language :: Python :: 3.11
+ Classifier: Programming Language :: Python :: 3.12
+ Classifier: Programming Language :: Python :: 3.13
+ Requires-Dist: docuware-client>=0.7.10
+ Requires-Dist: fastmcp>=2.0.0
+ Requires-Python: >=3.10
+ Description-Content-Type: text/markdown
+
+ # docuware-mcp
+
+ A Model Context Protocol (MCP) server that exposes a DocuWare DMS to
+ LLM-based agents through a database-style API.
+
+ This is an independent project with no affiliation to DocuWare GmbH.
+
+ ## Configuration
+
+ Credentials are read from the `docuware-client` standard environment
+ variables:
+
+ ```
+ DW_URL=https://dms.example.com
+ DW_USERNAME=service_account
+ DW_PASSWORD=<secret>
+ DW_ORG=<org>
+ ```
+
+ Alternatively, point `DW_CREDENTIALS_FILE` at a JSON file — useful for
+ switching between test and production systems, or for keeping secrets
+ out of shell history:
+
+ ```
+ DW_CREDENTIALS_FILE=/path/to/credentials.json
+ ```
+
+ The file holds the same settings as the environment variables, under these keys:
+
+ ```json
+ {
+   "url": "https://dms.example.com",
+   "username": "service_account",
+   "password": "<secret>",
+   "organization": "Acme GmbH"
+ }
+ ```
+
+ `organization` is optional if the service account belongs to a single
+ organization. Make sure the file is not world-readable (`chmod 600`).
+
+ For internal DocuWare installations with self-signed or private-CA
+ certificates, TLS verification can be disabled with
+ `DW_VERIFY_CERT=false`. **Do not use this against production systems** —
+ it disables protection against man-in-the-middle attacks.
+
+ OAuth2 requires DocuWare 7.10 or later.
+
+ ## Use with an MCP client
+
+ `docuware-mcp` is a stdio-based MCP server: an MCP client (Claude
+ Desktop, Claude Code, …) launches it as a subprocess and talks to it
+ over stdin/stdout. You don't run it yourself — the client does.
+
+ The recommended install path is via [`uv`](https://docs.astral.sh/uv/),
+ because `uvx` will fetch and run the package on demand without a global
+ install. Install `uv` once (`brew install uv` on macOS,
+ `curl -LsSf https://astral.sh/uv/install.sh | sh` on Linux,
+ `irm https://astral.sh/uv/install.ps1 | iex` in PowerShell on Windows),
+ then add this entry to your client's MCP config:
+
+ ```json
+ {
+   "mcpServers": {
+     "docuware": {
+       "command": "uvx",
+       "args": ["docuware-mcp"],
+       "env": {
+         "DW_CREDENTIALS_FILE": "/path/to/credentials.json"
+       }
+     }
+   }
+ }
+ ```
+
+ The config file lives at:
+
+ - **Claude Desktop**: `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS), `%APPDATA%\Claude\claude_desktop_config.json` (Windows)
+ - **Claude Code**: `.mcp.json` in your project root (or run `claude mcp add docuware -- uvx docuware-mcp`)
+
+ Restart the client after editing. The `docuware` server should then
+ appear in the available-tools list, exposing `list_archives`,
+ `describe_archive`, `search`, `get_document`, and `status`.
+
+ ### Running directly (for debugging)
+
+ If you've cloned this repo and want to poke at the server with the
+ [MCP Inspector](https://github.com/modelcontextprotocol/inspector)
+ or call it from a script:
+
+ ```
+ docuware-mcp
+ ```
+
+ Speaks MCP over stdio — same protocol the clients above use.
+
+ ## License
+
+ BSD-3-Clause.
@@ -0,0 +1,99 @@
+ # docuware-mcp
+
+ A Model Context Protocol (MCP) server that exposes a DocuWare DMS to
+ LLM-based agents through a database-style API.
+
+ This is an independent project with no affiliation to DocuWare GmbH.
+
+ ## Configuration
+
+ Credentials are read from the `docuware-client` standard environment
+ variables:
+
+ ```
+ DW_URL=https://dms.example.com
+ DW_USERNAME=service_account
+ DW_PASSWORD=<secret>
+ DW_ORG=<org>
+ ```
+
+ Alternatively, point `DW_CREDENTIALS_FILE` at a JSON file — useful for
+ switching between test and production systems, or for keeping secrets
+ out of shell history:
+
+ ```
+ DW_CREDENTIALS_FILE=/path/to/credentials.json
+ ```
+
+ The file holds the same settings as the environment variables, under these keys:
+
+ ```json
+ {
+   "url": "https://dms.example.com",
+   "username": "service_account",
+   "password": "<secret>",
+   "organization": "Acme GmbH"
+ }
+ ```
+
+ `organization` is optional if the service account belongs to a single
+ organization. Make sure the file is not world-readable (`chmod 600`).
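As a rough illustration of the `chmod 600` advice, the sketch below writes a credentials file that is readable by its owner only. It assumes a POSIX system; the helper name `write_credentials` and the temporary path are hypothetical and not part of `docuware-mcp`. Only the JSON keys are taken from the README.

```python
import json
import os
import stat
import tempfile


def write_credentials(path, url, username, password, organization=None):
    """Write a credentials JSON file with mode 0600 (hypothetical helper)."""
    data = {"url": url, "username": username, "password": password}
    if organization is not None:
        # The README describes this key as optional.
        data["organization"] = organization
    # Create the file with owner-only permissions from the start.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(data, fh, indent=2)
    # Tighten the mode in case the file already existed with looser bits.
    os.chmod(path, 0o600)


# Demo in a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "credentials.json")
    write_credentials(demo_path, "https://dms.example.com", "service_account", "<secret>")
    demo_mode = stat.S_IMODE(os.stat(demo_path).st_mode)
    with open(demo_path, encoding="utf-8") as fh:
        demo_data = json.load(fh)
```

Passing the mode to `os.open` avoids a window in which the file exists with default permissions before `chmod` runs.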
+
+ For internal DocuWare installations with self-signed or private-CA
+ certificates, TLS verification can be disabled with
+ `DW_VERIFY_CERT=false`. **Do not use this against production systems** —
+ it disables protection against man-in-the-middle attacks.
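The server source in this diff accepts more than the literal `false`: the values `0`, `false`, `no`, and `off` disable verification, compared case-insensitively with surrounding whitespace ignored. An unset variable, or any other value, leaves verification on. The standalone sketch below mirrors that parsing. The name `verify_cert` is hypothetical, and the environment is passed in as a mapping only to make the behaviour easy to try out.

```python
from typing import Mapping

# Values that switch TLS verification off, as in the server module of this diff.
_DISABLING_VALUES = ("0", "false", "no", "off")


def verify_cert(env: Mapping[str, str]) -> bool:
    """Return True unless DW_VERIFY_CERT is set to a disabling value."""
    raw = env.get("DW_VERIFY_CERT")
    if raw is None:
        return True  # unset: verification stays on
    return raw.strip().lower() not in _DISABLING_VALUES


default_on = verify_cert({})
switched_off = verify_cert({"DW_VERIFY_CERT": " False "})
unknown_value = verify_cert({"DW_VERIFY_CERT": "maybe"})
```

Verification is the default, so a typo in the variable's value leaves TLS checks enabled.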
+
+ OAuth2 requires DocuWare 7.10 or later.
+
+ ## Use with an MCP client
+
+ `docuware-mcp` is a stdio-based MCP server: an MCP client (Claude
+ Desktop, Claude Code, …) launches it as a subprocess and talks to it
+ over stdin/stdout. You don't run it yourself — the client does.
+
+ The recommended install path is via [`uv`](https://docs.astral.sh/uv/),
+ because `uvx` will fetch and run the package on demand without a global
+ install. Install `uv` once (`brew install uv` on macOS,
+ `curl -LsSf https://astral.sh/uv/install.sh | sh` on Linux,
+ `irm https://astral.sh/uv/install.ps1 | iex` in PowerShell on Windows),
+ then add this entry to your client's MCP config:
+
+ ```json
+ {
+   "mcpServers": {
+     "docuware": {
+       "command": "uvx",
+       "args": ["docuware-mcp"],
+       "env": {
+         "DW_CREDENTIALS_FILE": "/path/to/credentials.json"
+       }
+     }
+   }
+ }
+ ```
+
+ The config file lives at:
+
+ - **Claude Desktop**: `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS), `%APPDATA%\Claude\claude_desktop_config.json` (Windows)
+ - **Claude Code**: `.mcp.json` in your project root (or run `claude mcp add docuware -- uvx docuware-mcp`)
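If the config file already lists other servers, the `docuware` entry has to be merged in rather than pasted over them. The sketch below shows one way to do that on an in-memory dict. The function name `add_docuware_server` and the `other` server in the demo are made up for illustration. The entry itself is the one shown in the README.

```python
import json


def add_docuware_server(config: dict, credentials_file: str) -> dict:
    """Return a copy of an MCP client config with the docuware entry added."""
    merged = dict(config)
    # Copy the existing server map so other entries are preserved.
    servers = dict(merged.get("mcpServers", {}))
    servers["docuware"] = {
        "command": "uvx",
        "args": ["docuware-mcp"],
        "env": {"DW_CREDENTIALS_FILE": credentials_file},
    }
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
updated = add_docuware_server(existing, "/path/to/credentials.json")
rendered = json.dumps(updated, indent=2)
```

To apply this to a real file, load it with `json.load`, pass the result through the function, and write `rendered` back. Keeping a backup of the original file first is a sensible precaution.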
+
+ Restart the client after editing. The `docuware` server should then
+ appear in the available-tools list, exposing `list_archives`,
+ `describe_archive`, `search`, `get_document`, and `status`.
+
+ ### Running directly (for debugging)
+
+ If you've cloned this repo and want to poke at the server with the
+ [MCP Inspector](https://github.com/modelcontextprotocol/inspector)
+ or call it from a script:
+
+ ```
+ docuware-mcp
+ ```
+
+ Speaks MCP over stdio — same protocol the clients above use.
+
+ ## License
+
+ BSD-3-Clause.
@@ -0,0 +1,57 @@
+ [project]
+ name = "docuware-mcp"
+ version = "0.1.0"
+ description = "MCP server exposing a DocuWare DMS to LLM-based agents"
+ authors = [{ name = "Stefan Schönberger", email = "stefan@sniner.dev" }]
+ requires-python = ">=3.10"
+ readme = "README.md"
+ license = "BSD-3-Clause"
+ classifiers = [
+     "Programming Language :: Python :: 3",
+     "Operating System :: OS Independent",
+     "Programming Language :: Python :: 3.10",
+     "Programming Language :: Python :: 3.11",
+     "Programming Language :: Python :: 3.12",
+     "Programming Language :: Python :: 3.13",
+ ]
+ dependencies = [
+     "docuware-client>=0.7.10",
+     "fastmcp>=2.0.0",
+ ]
+
+ [project.scripts]
+ docuware-mcp = "docuware_mcp.server:main"
+
+ [dependency-groups]
+ dev = [
+     "pytest>=8.2.2,<9",
+     "basedpyright>=1.38.2,<2",
+     "ruff>=0.15.5,<0.16",
+ ]
+
+ [tool.uv]
+ default-groups = "all"
+
+ [build-system]
+ requires = ["uv-build>=0.10.9,<0.12"]
+ build-backend = "uv_build"
+
+ [tool.uv.build-backend]
+ module-name = "docuware_mcp"
+
+ [tool.pytest.ini_options]
+ addopts = "-ra -q"
+ testpaths = ["tests"]
+
+ [tool.pyright]
+ typeCheckingMode = "standard"
+ useLibraryCodeForTypes = true
+ venvPath = "."
+ venv = ".venv"
+
+ [tool.ruff]
+ line-length = 96
+
+ [tool.ruff.lint]
+ ignore = ["E402"]
+ per-file-ignores = { "__init__.py" = ["F401"] }
@@ -0,0 +1 @@
+ __version__ = "0.1.0"
@@ -0,0 +1,271 @@
+ """Filter DSL → docuware-client conditions translation.
+
+ Operators (v1):
+
+ - bare value (or ``{"eq": v}``): exact match
+ - ``{"like": "*pattern*"}``: wildcard match (``*`` and ``?``)
+ - ``{"gte": v}`` / ``{"lte": v}``: ≥ / ≤ inclusive
+ - ``{"between": [low, high]}``: low ≤ field ≤ high (use ``null`` for open bound)
+ - ``{"empty": true}`` (or value ``null``): field is empty
+
+ Combinator (top-level): ``"AND"`` (default) or ``"OR"``. Flat — no nested groups.
+
+ Limitations (per design — see docuware-mcp-design-decisions.md):
+
+ - No ``gt`` / ``lt`` (DW only has inclusive ranges)
+ - No ``ne``, ``in``, ``regex`` (would require backend-fakery)
+ - No nested boolean groups (DW backend has one global combinator)
+ """
+
+ from __future__ import annotations
+
+ from datetime import date, datetime
+ from typing import Any, Dict, List
+
+ import docuware
+
+ from docuware_mcp.schema import ArchiveSchema, FieldSchema
+
+
+ class FilterValidationError(ValueError):
+     """Raised when the filter DSL contains an invalid construct."""
+
+
+ _DW_METACHARS = "()*?"
+
+
+ def _escape_dw(value: str, *, escape_wildcards: bool) -> str:
+     """Escape DocuWare metacharacters in a string value.
+
+     ``*`` and ``?`` are escaped only when ``escape_wildcards`` is True. Already
+     backslash-escaped sequences are left alone, so the function is idempotent.
+     """
+     chars = _DW_METACHARS if escape_wildcards else "()"
+     out: List[str] = []
+     i = 0
+     n = len(value)
+     while i < n:
+         c = value[i]
+         if c == "\\" and i + 1 < n and value[i + 1] in _DW_METACHARS:
+             out.append(c)
+             out.append(value[i + 1])
+             i += 2
+             continue
+         if c in chars:
+             out.append("\\")
+         out.append(c)
+         i += 1
+     return "".join(out)
+
+
+ def _coerce(value: Any, fld: FieldSchema) -> Any:
+     """Coerce a JSON value to the type matching the field's DW type.
+
+     Strings remain strings on text-like fields; on typed fields, strings are
+     parsed into ``date`` or ``datetime`` (ISO 8601), ``int``, or ``float``.
+     Failures raise :class:`FilterValidationError` with an LLM-actionable message.
+     """
+     if value is None:
+         return None
+
+     ftype = (fld.type or "").lower()
+
+     if ftype == "date":
+         if isinstance(value, datetime):
+             return value.date()
+         if isinstance(value, date):
+             return value
+         if isinstance(value, str):
+             try:
+                 return date.fromisoformat(value)
+             except ValueError as exc:
+                 raise FilterValidationError(
+                     f"Field {fld.name!r} is type Date — value {value!r} is not a "
+                     f"valid ISO date (YYYY-MM-DD)"
+                 ) from exc
+         raise FilterValidationError(
+             f"Field {fld.name!r} is type Date — expected ISO date string, got "
+             f"{type(value).__name__}"
+         )
+
+     if ftype == "datetime":
+         if isinstance(value, datetime):
+             return value
+         if isinstance(value, date):
+             return datetime(value.year, value.month, value.day)
+         if isinstance(value, str):
+             try:
+                 return datetime.fromisoformat(value)
+             except ValueError as exc:
+                 raise FilterValidationError(
+                     f"Field {fld.name!r} is type DateTime — value {value!r} is not "
+                     f"valid ISO 8601"
+                 ) from exc
+         raise FilterValidationError(
+             f"Field {fld.name!r} is type DateTime — expected ISO datetime string, got "
+             f"{type(value).__name__}"
+         )
+
+     if ftype in ("numeric", "int"):
+         if isinstance(value, bool):
+             raise FilterValidationError(
+                 f"Field {fld.name!r} is type Numeric — got bool"
+             )
+         if isinstance(value, int):
+             return value
+         if isinstance(value, str):
+             try:
+                 return int(value)
+             except ValueError as exc:
+                 raise FilterValidationError(
+                     f"Field {fld.name!r} is type Numeric — value {value!r} is not an integer"
+                 ) from exc
+         raise FilterValidationError(
+             f"Field {fld.name!r} is type Numeric — expected integer, got "
+             f"{type(value).__name__}"
+         )
+
+     if ftype == "decimal":
+         if isinstance(value, bool):
+             raise FilterValidationError(
+                 f"Field {fld.name!r} is type Decimal — got bool"
+             )
+         if isinstance(value, (int, float)):
+             return float(value)
+         if isinstance(value, str):
+             try:
+                 return float(value)
+             except ValueError as exc:
+                 raise FilterValidationError(
+                     f"Field {fld.name!r} is type Decimal — value {value!r} is not a number"
+                 ) from exc
+         raise FilterValidationError(
+             f"Field {fld.name!r} is type Decimal — expected number, got "
+             f"{type(value).__name__}"
+         )
+
+     # Text, Memo, Keyword, unknown: pass strings through, stringify others.
+     return value if isinstance(value, str) else str(value)
+
+
+ def _format(value: Any, *, escape_wildcards: bool) -> Any:
+     """Apply DW metacharacter escaping; pass non-strings through unchanged.
+
+     The returned value is suitable for the dict form of
+     :meth:`docuware.SearchDialog.search` when called with
+     ``quote=QuoteMode.NONE`` (i.e. the client does no further escaping).
+     """
+     if value is None:
+         return None
+     if isinstance(value, str):
+         return _escape_dw(value, escape_wildcards=escape_wildcards)
+     return value
+
+
+ def _translate_one(fld: FieldSchema, spec: Any) -> Any:
+     """Translate one ``(field, spec)`` pair to the value form expected by DW."""
+     allowed = set(fld.operators)
+
+     # Shorthand: null value means "field is empty"
+     if spec is None:
+         if "empty" not in allowed:
+             raise FilterValidationError(
+                 f"Field {fld.name!r} (type={fld.type}) does not support 'empty' check"
+             )
+         return None
+
+     # Operator dict
+     if isinstance(spec, dict):
+         if len(spec) != 1:
+             raise FilterValidationError(
+                 f"Field {fld.name!r}: filter must be a single-operator dict, got "
+                 f"keys {list(spec.keys())}"
+             )
+         op, raw = next(iter(spec.items()))
+         if op not in allowed:
+             raise FilterValidationError(
+                 f"Field {fld.name!r} (type={fld.type}) does not support operator {op!r}. "
+                 f"Allowed: {', '.join(sorted(allowed))}"
+             )
+
+         if op == "empty":
+             return None
+
+         if op == "eq":
+             return _format(_coerce(raw, fld), escape_wildcards=True)
+
+         if op == "like":
+             if not isinstance(raw, str):
+                 raise FilterValidationError(
+                     f"Field {fld.name!r}: 'like' requires a string value, got "
+                     f"{type(raw).__name__}"
+                 )
+             return _format(raw, escape_wildcards=False)
+
+         if op == "gte":
+             return [_format(_coerce(raw, fld), escape_wildcards=True), None]
+
+         if op == "lte":
+             return [None, _format(_coerce(raw, fld), escape_wildcards=True)]
+
+         if op == "between":
+             if not isinstance(raw, (list, tuple)) or len(raw) != 2:
+                 raise FilterValidationError(
+                     f"Field {fld.name!r}: 'between' requires a 2-element list "
+                     f"[low, high] (use null for an open bound)"
+                 )
+             low, high = raw
+             return [
+                 _format(_coerce(low, fld), escape_wildcards=True) if low is not None else None,
+                 _format(_coerce(high, fld), escape_wildcards=True) if high is not None else None,
+             ]
+
+         # Unreachable: 'allowed' check above filtered unsupported operators.
+         raise FilterValidationError(
+             f"Internal: operator {op!r} accepted but not implemented"
+         )
+
+     # Bare value = eq
+     if "eq" not in allowed:
+         raise FilterValidationError(
+             f"Field {fld.name!r} (type={fld.type}) does not support 'eq'"
+         )
+     return _format(_coerce(spec, fld), escape_wildcards=True)
+
+
+ def build_conditions(filters: Dict[str, Any], schema: ArchiveSchema) -> Dict[str, Any]:
+     """Translate the filter DSL to a dict for ``SearchDialog.search()``.
+
+     The returned dict must be passed to docuware-client with
+     ``quote=docuware.QuoteMode.NONE`` because all values have already been
+     escaped according to their per-operator wildcard intent.
+     """
+     if not isinstance(filters, dict):
+         raise FilterValidationError(
+             f"filters must be an object, got {type(filters).__name__}"
+         )
+
+     out: Dict[str, Any] = {}
+     for fname, spec in filters.items():
+         try:
+             fld = schema.field_by_name(fname)
+         except KeyError:
+             raise FilterValidationError(
+                 f"Unknown field {fname!r}. Available: {', '.join(schema.field_names())}"
+             ) from None
+         if fld.internal_id in out:
+             raise FilterValidationError(
+                 f"Field {fname!r} resolves to {fld.internal_id!r} which already has "
+                 f"a condition (multiple conditions on the same field are not supported)"
+             )
+         out[fld.internal_id] = _translate_one(fld, spec)
+     return out
+
+
+ def parse_combinator(value: str) -> docuware.Operation:
+     upper = (value or "AND").upper()
+     if upper == "AND":
+         return docuware.Operation.AND
+     if upper == "OR":
+         return docuware.Operation.OR
+     raise FilterValidationError(f"combinator must be 'AND' or 'OR', got {value!r}")
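To make the filter module's translation concrete, here is a simplified, standalone sketch of the value shapes it produces: a scalar for `eq` and `like`, a `[low, high]` pair for range operators, and `None` for `empty`. It deliberately omits the schema lookup, per-field operator checks, and type coercion that the real module performs, and it does not import `docuware`. The names `escape_dw` and `to_condition` are illustrative stand-ins for the private helpers above.

```python
from typing import Any

METACHARS = "()*?"


def escape_dw(value: str, *, escape_wildcards: bool) -> str:
    """Backslash-escape metacharacters, leaving existing escapes untouched."""
    chars = METACHARS if escape_wildcards else "()"
    out = []
    i = 0
    while i < len(value):
        c = value[i]
        if c == "\\" and i + 1 < len(value) and value[i + 1] in METACHARS:
            out.append(value[i:i + 2])  # already escaped: copy both characters
            i += 2
            continue
        out.append("\\" + c if c in chars else c)
        i += 1
    return "".join(out)


def to_condition(spec: Any) -> Any:
    """Map one DSL spec to its value shape (no coercion or schema checks)."""

    def fmt(v: Any, wildcards_literal: bool = True) -> Any:
        if isinstance(v, str):
            return escape_dw(v, escape_wildcards=wildcards_literal)
        return v

    if spec is None:
        return None  # shorthand for "field is empty"
    if not isinstance(spec, dict):
        return fmt(spec)  # bare value means eq
    if len(spec) != 1:
        raise ValueError("filter must be a single-operator dict")
    op, raw = next(iter(spec.items()))
    if op == "empty":
        return None
    if op == "eq":
        return fmt(raw)
    if op == "like":
        return fmt(raw, wildcards_literal=False)  # keep * and ? as wildcards
    if op == "gte":
        return [fmt(raw), None]
    if op == "lte":
        return [None, fmt(raw)]
    if op == "between":
        low, high = raw
        return [fmt(low), fmt(high)]
    raise ValueError(f"unsupported operator {op!r}")


examples = {
    "bare": to_condition("a*b"),
    "like": to_condition({"like": "inv*"}),
    "gte": to_condition({"gte": 100}),
    "between": to_condition({"between": ["2024-01-01", None]}),
}
```

The difference between `eq` and `like` is only in how `*` and `?` are treated: `eq` escapes them so they match literally, while `like` passes them through as wildcards.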
@@ -0,0 +1,106 @@
+ """Archive schema and field-type → operator mapping.
+
+ The mapping is a static table held in the MCP server. We deliberately do
+ not derive it from DocuWare's own ``OperatorTable``: the MCP-facing
+ operator vocabulary is intentionally smaller than what DW would accept,
+ so the LLM's mental model stays simple and consistent.
+ """
+
+ from __future__ import annotations
+
+ import logging
+ from dataclasses import asdict, dataclass, field
+ from typing import Any, Dict, FrozenSet, List, Optional
+
+ import docuware
+
+ log = logging.getLogger(__name__)
+
+
+ # DocuWare DWFieldType values observed in the wild are case-insensitive.
+ # Keys here are lowercased before lookup.
+ _TYPE_OPERATORS: Dict[str, FrozenSet[str]] = {
+     "text": frozenset({"eq", "like", "empty"}),
+     "memo": frozenset({"eq", "like", "empty"}),
+     "keyword": frozenset({"eq", "like", "empty"}),
+     "keywords": frozenset({"eq", "like", "empty"}),
+     "numeric": frozenset({"eq", "gte", "lte", "between", "empty"}),
+     "int": frozenset({"eq", "gte", "lte", "between", "empty"}),
+     "decimal": frozenset({"eq", "gte", "lte", "between", "empty"}),
+     "date": frozenset({"eq", "gte", "lte", "between", "empty"}),
+     "datetime": frozenset({"eq", "gte", "lte", "between", "empty"}),
+ }
+
+ _DEFAULT_OPERATORS: FrozenSet[str] = frozenset({"eq", "empty"})
+
+
+ def allowed_operators_for_type(dw_type: Optional[str]) -> FrozenSet[str]:
+     """Return the operator set the MCP DSL allows on a field of this DW type."""
+     if not dw_type:
+         return _DEFAULT_OPERATORS
+     return _TYPE_OPERATORS.get(dw_type.lower(), _DEFAULT_OPERATORS)
+
+
+ @dataclass(frozen=True)
+ class FieldSchema:
+     name: str
+     internal_id: str
+     type: Optional[str]
+     length: int
+     operators: List[str]
+     select_list: Optional[List[str]] = None
+
+     def to_dict(self) -> Dict[str, Any]:
+         return asdict(self)
+
+
+ @dataclass
+ class ArchiveSchema:
+     name: str
+     internal_id: str
+     fields: List[FieldSchema] = field(default_factory=list)
+
+     def field_by_name(self, name: str) -> FieldSchema:
+         cf = name.casefold()
+         for f in self.fields:
+             if f.name.casefold() == cf or f.internal_id.casefold() == cf:
+                 return f
+         raise KeyError(name)
+
+     def field_names(self) -> List[str]:
+         return [f.name for f in self.fields]
+
+     def to_dict(self) -> Dict[str, Any]:
+         return {
+             "name": self.name,
+             "internal_id": self.internal_id,
+             "fields": [f.to_dict() for f in self.fields],
+         }
+
+
+ def describe_dialog(
+     dialog: docuware.SearchDialog, archive_name: str, archive_id: str
+ ) -> ArchiveSchema:
+     """Build an ArchiveSchema from a SearchDialog's field definitions."""
+     out: List[FieldSchema] = []
+     for sf in dialog.fields.values():
+         ops = sorted(allowed_operators_for_type(sf.type))
+         select_list: Optional[List[str]] = None
+         if sf.type and sf.type.lower() in ("keyword", "keywords"):
+             try:
+                 values = sf.values()
+                 if values:
+                     select_list = [str(v) for v in values]
+             except Exception as exc:
+                 log.debug("select_list fetch failed for field %r: %s", sf.id, exc)
+         out.append(
+             FieldSchema(
+                 name=sf.name,
+                 internal_id=sf.id,
+                 type=sf.type,
+                 length=sf.length if sf.length is not None else -1,
+                 operators=ops,
+                 select_list=select_list,
+             )
+         )
+     return ArchiveSchema(name=archive_name, internal_id=archive_id, fields=out)
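The static type-to-operator table is small enough to try out in isolation. The sketch below rebuilds it in a more compact form and shows the two properties the schema module relies on: lookups are case-insensitive, and unknown or missing types fall back to `eq` and `empty`. The names `allowed_operators` and `check_operator` are illustrative and not part of the package. The operator sets are copied from the table above.

```python
from typing import FrozenSet, Optional

TEXT_OPS = frozenset({"eq", "like", "empty"})
RANGE_OPS = frozenset({"eq", "gte", "lte", "between", "empty"})
DEFAULT_OPS = frozenset({"eq", "empty"})

# Same content as the table above, grouped by operator set.
TYPE_OPERATORS = {
    **{t: TEXT_OPS for t in ("text", "memo", "keyword", "keywords")},
    **{t: RANGE_OPS for t in ("numeric", "int", "decimal", "date", "datetime")},
}


def allowed_operators(dw_type: Optional[str]) -> FrozenSet[str]:
    """Case-insensitive lookup with a conservative fallback."""
    if not dw_type:
        return DEFAULT_OPS
    return TYPE_OPERATORS.get(dw_type.lower(), DEFAULT_OPS)


def check_operator(field_name: str, dw_type: Optional[str], op: str) -> None:
    """Raise ValueError with the allowed set if op is not valid for the type."""
    allowed = allowed_operators(dw_type)
    if op not in allowed:
        raise ValueError(
            f"Field {field_name!r} (type={dw_type}) does not support operator {op!r}. "
            f"Allowed: {', '.join(sorted(allowed))}"
        )


date_ops = sorted(allowed_operators("Date"))
unknown_ops = sorted(allowed_operators("Table"))
```

Listing the allowed operators in the error message lets a calling agent correct its query without first having to call `describe_archive` again.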
@@ -0,0 +1,326 @@
+ """docuware-mcp — MCP server exposing a DocuWare DMS to LLM-based agents."""
+
+ from __future__ import annotations
+
+ import logging
+ import os
+ import time
+ from typing import Any, Dict, Iterable, List, Optional, Set, Tuple
+
+ import docuware
+ from fastmcp import FastMCP
+
+ from docuware_mcp.filters import (
+     FilterValidationError,
+     build_conditions,
+     parse_combinator,
+ )
+ from docuware_mcp.schema import ArchiveSchema, describe_dialog
+
+ log = logging.getLogger("docuware_mcp")
+
+
+ _SCHEMA_TTL_SECONDS = 300.0
+
+ _client: Optional[docuware.Client] = None
+ _schema_cache: Dict[str, Tuple[float, ArchiveSchema]] = {}
+
+
+ def _verify_cert_from_env() -> bool:
+     raw = os.environ.get("DW_VERIFY_CERT")
+     if raw is None:
+         return True
+     return raw.strip().lower() not in ("0", "false", "no", "off")
+
+
+ def _get_client() -> docuware.Client:
+     global _client
+     if _client is None:
+         verify = _verify_cert_from_env()
+         if not verify:
+             log.warning("DW_VERIFY_CERT disabled — TLS certificate not verified")
+         creds_file = os.environ.get("DW_CREDENTIALS_FILE")
+         if creds_file:
+             log.info("Connecting to DocuWare with credentials from %s", creds_file)
+             _client = docuware.connect(
+                 credentials_file=creds_file, verify_certificate=verify
+             )
+         else:
+             log.info("Connecting to DocuWare via docuware.connect()")
+             _client = docuware.connect(verify_certificate=verify)
+     return _client
+
+
+ def _resolve_archive(client: docuware.Client, name_or_id: str) -> docuware.FileCabinet:
+     """Resolve an archive across all organizations by ID or display name.
+
+     Baskets are excluded — this server exposes archives only.
+     """
+     candidates: List[docuware.FileCabinet] = []
+     cf = name_or_id.casefold()
+     for org in client.organizations:
+         for fc in org.file_cabinets:
+             if not isinstance(fc, docuware.FileCabinet) or fc.is_basket:
+                 continue
+             if fc.id == name_or_id:
+                 return fc
+             if fc.name.casefold() == cf:
+                 candidates.append(fc)
+     if not candidates:
+         raise ValueError(f"Archive not found: {name_or_id!r}")
+     if len(candidates) > 1:
+         names = [
+             f"{c.name} [id={c.id}, org={c.organization.name}]" for c in candidates
+         ]
+         raise ValueError(
+             f"Archive name {name_or_id!r} is ambiguous across organizations. "
+             f"Use the internal ID instead. Candidates: {names}"
+         )
+     return candidates[0]
+
+
+ def _get_search_dialog(fc: docuware.FileCabinet) -> docuware.SearchDialog:
+     dlg = fc.search_dialog(required=True)
+     if not isinstance(dlg, docuware.SearchDialog):
+         raise RuntimeError(
+             f"Archive {fc.name!r} returned unexpected dialog type {type(dlg).__name__}"
+         )
+     return dlg
+
+
+ def _get_schema(client: docuware.Client, archive: str) -> ArchiveSchema:
+     now = time.monotonic()
+     cached = _schema_cache.get(archive)
+     if cached and (now - cached[0]) < _SCHEMA_TTL_SECONDS:
+         return cached[1]
+     fc = _resolve_archive(client, archive)
+     dlg = _get_search_dialog(fc)
+     schema = describe_dialog(dlg, fc.name, fc.id)
+     _schema_cache[archive] = (now, schema)
+     _schema_cache[fc.id] = (now, schema)
+     log.info("Loaded schema for archive %r [id=%s]: %d fields",
+              fc.name, fc.id, len(schema.fields))
+     return schema
+
+
+ def _fields_to_dict(
+     field_values: Any, allowed_ids: Optional[Set[str]] = None
+ ) -> Dict[str, Any]:
+     """Render a list of FieldValue objects to a ``{name: value}`` dict.
+
+     If ``allowed_ids`` is given, only fields whose internal ID is in the set
+     are emitted. This is used to suppress DocuWare system metadata that isn't
+     part of the archive's search-dialog schema (and thus not described to
+     callers via :func:`describe_archive`).
+     """
+     out: Dict[str, Any] = {}
+     for fv in field_values or []:
+         fid = getattr(fv, "id", None)
+         if allowed_ids is not None and fid not in allowed_ids:
+             continue
+         name = getattr(fv, "name", None) or fid
+         if not name:
+             continue
+         value = getattr(fv, "value", None)
+         if value is not None and hasattr(value, "isoformat"):
+             value = value.isoformat()
+         out[name] = value
+     return out
+
+
+ def _extract_doc_id(field_values: Iterable[Any]) -> Optional[str]:
+     """Pull the DWDOCID value out of a FieldValue list as a string."""
+     for fv in field_values or []:
+         if getattr(fv, "id", None) == "DWDOCID":
+             value = getattr(fv, "value", None)
+             return str(value) if value is not None else None
+     return None
+
+
+ # --- MCP server ---
+
+ mcp = FastMCP("docuware-mcp")
+
+
+ @mcp.tool()
+ def list_archives() -> List[Dict[str, str]]:
+     """List archives accessible to the configured DocuWare service account.
+
+     Analogous to ``SHOW DATABASES``. Returns each archive's display name,
+     internal ID, and the organization it belongs to. Baskets are excluded.
+     """
+     client = _get_client()
+     out: List[Dict[str, str]] = []
+     for org in client.organizations:
+         for fc in org.file_cabinets:
+             if fc.is_basket:
+                 continue
+             out.append({
+                 "name": fc.name,
+                 "id": fc.id,
+                 "organization": org.name,
+             })
+     log.info("list_archives → %d archives", len(out))
+     return out
+
+
+ @mcp.tool()
+ def describe_archive(archive: str) -> Dict[str, Any]:
+     """Describe an archive's schema: fields, types, and allowed operators.
+
+     Analogous to ``DESCRIBE TABLE``. The ``operators`` list per field tells you
+     which operators are valid in :func:`search`'s ``filters`` for that field.
+     Keyword fields with a defined value list also expose ``select_list``.
+
+     Args:
+         archive: Display name or internal ID of the archive.
+     """
+     client = _get_client()
+     schema = _get_schema(client, archive)
+     return schema.to_dict()
+
+
+ @mcp.tool()
+ def search(
+     archive: str,
+     filters: Optional[Dict[str, Any]] = None,
+     combinator: str = "AND",
+     limit: int = 25,
+     offset: int = 0,
+ ) -> Dict[str, Any]:
+     """Search documents in an archive using the structured filter DSL.
+
+     Args:
+         archive: Display name or internal ID of the archive.
+         filters: Dict mapping field names to either a bare value (= ``eq``) or
+             a single-operator dict like ``{"gte": 100}``. Supported operators:
+             ``eq``, ``like``, ``gte``, ``lte``, ``between``, ``empty``. Use
+             :func:`describe_archive` to see which operators each field accepts.
+         combinator: How multiple conditions are combined: ``"AND"`` (default)
+             or ``"OR"``. DocuWare does not support mixed AND/OR in one query.
+         limit: Maximum results to return (1–200, default 25).
+         offset: Number of results to skip (client-side slicing).
+
+     Returns:
+         A dict with ``items`` (list of result dicts containing ``id``,
+         ``title``, ``content_type``, ``fields``), ``count`` (server-reported
+         total when known), ``limit``, and ``offset``.
+     """
+     if not 1 <= limit <= 200:
+         raise ValueError("limit must be between 1 and 200")
+     if offset < 0:
+         raise ValueError("offset must be >= 0")
+
+     client = _get_client()
+     schema = _get_schema(client, archive)
+     fc = _resolve_archive(client, archive)
+
+     if not filters:
+         raise ValueError(
+             "search currently requires at least one filter condition. "
+             "Match-everything is not yet implemented in v1."
+         )
+
+     try:
+         conditions = build_conditions(filters, schema)
+     except FilterValidationError as exc:
+         raise ValueError(str(exc)) from None
+
+     op = parse_combinator(combinator)
+     dlg = _get_search_dialog(fc)
+
+     log.info(
+         "search archive=%r [id=%s] filters=%s combinator=%s limit=%d offset=%d",
+         fc.name, fc.id, list(filters.keys()), combinator, limit, offset,
+     )
+
+     result_iter = dlg.search(conditions, operation=op, quote=docuware.QuoteMode.NONE)
+     allowed_ids = {f.internal_id for f in schema.fields}
+
+     items: List[Dict[str, Any]] = []
+     skipped = 0
+     for item in result_iter:
+         if skipped < offset:
+             skipped += 1
+             continue
+         if len(items) >= limit:
+             break
+         items.append({
+             "id": _extract_doc_id(item.fields),
+             "title": item.title,
+             "content_type": item.content_type,
+             "fields": _fields_to_dict(item.fields, allowed_ids=allowed_ids),
+         })
+
+     return {
+         "items": items,
+         "count": getattr(result_iter, "count", None),
+         "limit": limit,
+         "offset": offset,
+     }
+
+
+ @mcp.tool()
+ def get_document(archive: str, document_id: str) -> Dict[str, Any]:
+     """Fetch a single document's metadata by primary-key ID.
+
+     Returns index field values, title, and content type. Does not return file
+     content — binary download will be a separate tool.
+
+     Args:
+         archive: Display name or internal ID of the archive.
+         document_id: DocuWare document ID (DWDOCID).
+     """
+     client = _get_client()
+     schema = _get_schema(client, archive)
+     fc = _resolve_archive(client, archive)
+     doc = fc.get_document(document_id)
+     allowed_ids = {f.internal_id for f in schema.fields}
+     return {
+         "id": str(getattr(doc, "id", document_id)),
+         "title": getattr(doc, "title", None),
+         "content_type": getattr(doc, "content_type", None),
+         "fields": _fields_to_dict(getattr(doc, "fields", None), allowed_ids=allowed_ids),
+     }
+
+
+ @mcp.tool()
+ def status() -> Dict[str, Any]:
+     """Connection health: organizations and visible archive count.
+
+     Useful as a first call to verify credentials and surface what the
+     configured service account can actually see.
+     """
+     try:
+         client = _get_client()
+         orgs_info = []
+         archive_count = 0
+         for org in client.organizations:
+             archives = [fc for fc in org.file_cabinets if not fc.is_basket]
+             archive_count += len(archives)
+             orgs_info.append({
+                 "name": org.name,
+                 "id": org.id,
+                 "archive_count": len(archives),
+             })
+         return {
+             "connected": True,
+             "organizations": orgs_info,
+             "archive_count": archive_count,
+         }
+     except Exception as exc:
+         log.exception("status check failed")
+         return {"connected": False, "error": str(exc)}
+
+
+ def main() -> None:
+     """Entry point — run the MCP server over stdio."""
+     logging.basicConfig(
+         level=logging.INFO,
+         format="%(asctime)s %(levelname)s %(name)s: %(message)s",
+     )
+     mcp.run()
+
+
+ if __name__ == "__main__":
+     main()
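The `search` tool pages on the client side: it walks the result iterator, skips `offset` items, and stops once `limit` items are collected. The sketch below shows the same windowing with `itertools.islice`. This is an alternative formulation, not the code the server uses, and the helper name `page` is hypothetical. The bounds checks mirror the ones in `search`.

```python
from itertools import islice
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def page(results: Iterable[T], *, limit: int, offset: int) -> List[T]:
    """Return at most `limit` items after skipping `offset` items."""
    if not 1 <= limit <= 200:
        raise ValueError("limit must be between 1 and 200")
    if offset < 0:
        raise ValueError("offset must be >= 0")
    # islice pulls items lazily and stops at offset + limit.
    return list(islice(results, offset, offset + limit))


pulled: List[int] = []


def fake_results():
    """Stand-in for a lazy search result iterator; records what was fetched."""
    for i in range(1000):
        pulled.append(i)
        yield i


window = page(fake_results(), limit=2, offset=3)
```

Either way, a large `offset` still costs a full walk over the skipped results, because the slicing happens after the items have been fetched.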