thinharness 0.1.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (48) hide show
  1. thinharness-0.1.0/CHANGELOG.md +5 -0
  2. thinharness-0.1.0/LICENSE +21 -0
  3. thinharness-0.1.0/MANIFEST.in +2 -0
  4. thinharness-0.1.0/PKG-INFO +316 -0
  5. thinharness-0.1.0/README.md +277 -0
  6. thinharness-0.1.0/docs/THIRD_PARTY_NOTICES.md +28 -0
  7. thinharness-0.1.0/docs/architecture.md +543 -0
  8. thinharness-0.1.0/docs/decisions.md +90 -0
  9. thinharness-0.1.0/docs/docs.md +105 -0
  10. thinharness-0.1.0/docs/releasing.md +33 -0
  11. thinharness-0.1.0/docs/table.md +62 -0
  12. thinharness-0.1.0/pyproject.toml +92 -0
  13. thinharness-0.1.0/setup.cfg +4 -0
  14. thinharness-0.1.0/tests/test_file_tools.py +213 -0
  15. thinharness-0.1.0/tests/test_harness.py +553 -0
  16. thinharness-0.1.0/tests/test_hooks.py +450 -0
  17. thinharness-0.1.0/tests/test_mcp.py +751 -0
  18. thinharness-0.1.0/tests/test_mcp_optional_dependency.py +29 -0
  19. thinharness-0.1.0/tests/test_parallel_llm.py +626 -0
  20. thinharness-0.1.0/tests/test_parallel_tools.py +229 -0
  21. thinharness-0.1.0/tests/test_providers.py +371 -0
  22. thinharness-0.1.0/tests/test_resume.py +532 -0
  23. thinharness-0.1.0/tests/test_skills.py +115 -0
  24. thinharness-0.1.0/tests/test_structured_output.py +557 -0
  25. thinharness-0.1.0/tests/test_subagents.py +483 -0
  26. thinharness-0.1.0/tests/test_tool_retry.py +441 -0
  27. thinharness-0.1.0/tests/test_tracing.py +197 -0
  28. thinharness-0.1.0/thinharness/__init__.py +143 -0
  29. thinharness-0.1.0/thinharness/core.py +1044 -0
  30. thinharness-0.1.0/thinharness/defaults.py +10 -0
  31. thinharness-0.1.0/thinharness/hooks.py +277 -0
  32. thinharness-0.1.0/thinharness/output.py +295 -0
  33. thinharness-0.1.0/thinharness/providers.py +991 -0
  34. thinharness-0.1.0/thinharness/py.typed +1 -0
  35. thinharness-0.1.0/thinharness/subagents.py +340 -0
  36. thinharness-0.1.0/thinharness/tools/__init__.py +46 -0
  37. thinharness-0.1.0/thinharness/tools/base.py +375 -0
  38. thinharness-0.1.0/thinharness/tools/filesystem.py +632 -0
  39. thinharness-0.1.0/thinharness/tools/jsonl.py +313 -0
  40. thinharness-0.1.0/thinharness/tools/mcp.py +391 -0
  41. thinharness-0.1.0/thinharness/tools/parallel_llm.py +356 -0
  42. thinharness-0.1.0/thinharness/tools/skills.py +259 -0
  43. thinharness-0.1.0/thinharness/tracing.py +271 -0
  44. thinharness-0.1.0/thinharness.egg-info/PKG-INFO +316 -0
  45. thinharness-0.1.0/thinharness.egg-info/SOURCES.txt +46 -0
  46. thinharness-0.1.0/thinharness.egg-info/dependency_links.txt +1 -0
  47. thinharness-0.1.0/thinharness.egg-info/requires.txt +17 -0
  48. thinharness-0.1.0/thinharness.egg-info/top_level.txt +1 -0
@@ -0,0 +1,5 @@
1
+ # Changelog
2
+
3
+ ## 0.1.0 - Unreleased
4
+
5
+ - Initial PyPI release.
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Ryan Brown
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,2 @@
1
+ include CHANGELOG.md
2
+ recursive-include docs *.md
@@ -0,0 +1,316 @@
1
+ Metadata-Version: 2.4
2
+ Name: thinharness
3
+ Version: 0.1.0
4
+ Summary: Minimal filesystem agent harness with provider-backed Responses-like models.
5
+ Author: thinharness maintainers
6
+ License-Expression: MIT
7
+ Project-URL: Homepage, https://github.com/ryanbbrown/thinharness
8
+ Project-URL: Repository, https://github.com/ryanbbrown/thinharness
9
+ Project-URL: Issues, https://github.com/ryanbbrown/thinharness/issues
10
+ Project-URL: Changelog, https://github.com/ryanbbrown/thinharness/blob/main/CHANGELOG.md
11
+ Keywords: agents,responses-api,filesystem,skills,harness
12
+ Classifier: Development Status :: 3 - Alpha
13
+ Classifier: Intended Audience :: Developers
14
+ Classifier: Operating System :: OS Independent
15
+ Classifier: Programming Language :: Python :: 3
16
+ Classifier: Programming Language :: Python :: 3 :: Only
17
+ Classifier: Programming Language :: Python :: 3.11
18
+ Classifier: Programming Language :: Python :: 3.12
19
+ Classifier: Topic :: Software Development :: Libraries :: Python Modules
20
+ Classifier: Typing :: Typed
21
+ Requires-Python: >=3.11
22
+ Description-Content-Type: text/markdown
23
+ License-File: LICENSE
24
+ Requires-Dist: httpx>=0.27
25
+ Requires-Dist: pydantic>=2.13.4
26
+ Provides-Extra: dev
27
+ Requires-Dist: pyright>=1.1.409; extra == "dev"
28
+ Requires-Dist: pytest>=8; extra == "dev"
29
+ Requires-Dist: pytest-asyncio>=0.25; extra == "dev"
30
+ Requires-Dist: pytest-cov>=7; extra == "dev"
31
+ Requires-Dist: ruff>=0.14; extra == "dev"
32
+ Provides-Extra: tracing
33
+ Requires-Dist: opentelemetry-api>=1.24; extra == "tracing"
34
+ Requires-Dist: opentelemetry-sdk>=1.24; extra == "tracing"
35
+ Requires-Dist: opentelemetry-exporter-otlp-proto-http>=1.24; extra == "tracing"
36
+ Provides-Extra: mcp
37
+ Requires-Dist: mcp>=1.23.0; extra == "mcp"
38
+ Dynamic: license-file
39
+
40
+ <p align="center">
41
+ <img src="assets/ThinHarness.svg" alt="ThinHarness" width="360">
42
+ </p>
43
+
44
+ <p align="center">
45
+ <br/>
46
+ A minimal, opinionated agent harness &mdash;
47
+ <br/>
48
+ focused scope, readable core, easy to fork.
49
+ <br/><br/>
50
+ </p>
51
+
52
+ <div align="center">
53
+
54
+ [![CI](https://img.shields.io/github/actions/workflow/status/ryanbbrown/thinharness/ci.yml?branch=main&label=CI)](https://github.com/ryanbbrown/thinharness/actions/workflows/ci.yml)
55
+ [![License](https://img.shields.io/badge/License-MIT-blue.svg)](https://github.com/ryanbbrown/thinharness/blob/main/LICENSE)
56
+ [![PyPI](https://img.shields.io/pypi/v/thinharness.svg)](https://pypi.org/project/thinharness/)
57
+
58
+ </div>
59
+
60
+ ## Why this exists
61
+
62
+ Filesystem-based agent harnesses are simple but powerful: easily auditable, flexible, and they work just as well for non-coding business tasks like research over a corpus, workflow automation, or multi-step analysis. But the harnesses that provide filesystem primitives are either coding agents (Claude Agent SDK) or are massive and highly abstracted (deepagents, Agno). Even if you don't want filesystem tools, the general-purpose agent harness libraries are missing features (see table below) — or large enough that it's a pain when you (inevitably) need to customize.
63
+
64
+ So I built one. The core agent loop isn't that complicated. Provider call, parse tool calls, run them, feed results back, repeat. ThinHarness is **4,938 lines of Python** across 15 files. The whole thing. Small enough to actually read. You can audit it. You can fork it without inheriting a fork-maintenance problem, because there isn't much there to drift.
65
+
66
+ <!--
67
+ LOC measurement scope: strict framework-only. Each row strips clearly
68
+ non-framework code from the upstream package — platform/deployment layers,
69
+ domain-specific modalities (voice/realtime), eval/optimizer suites, UI/CLI
70
+ tools, A2A/declarative wire protocols, code-executor backends. Provider
71
+ implementations stay IN (they're part of what you import to use the library).
72
+ The exact tokei command + upstream commit hash for each row is in an HTML
73
+ comment above the row, so the number is reproducible. Measured 2026-05-16
74
+ against the commit pinned in each row's comment.
75
+ -->
76
+
77
+ <div align="center">
78
+
79
+ <table>
80
+ <thead>
81
+ <tr>
82
+ <td align="left" width="256" bgcolor="#eaeef2"><b>Library</b></td>
83
+ <td align="center" width="70" bgcolor="#eaeef2"><b>LOC<sup>1</sup></b></td>
84
+ <td align="center" width="62" bgcolor="#eaeef2"><b>Tool<br>retries<sup>2</sup></b></td>
85
+ <td align="center" width="70" bgcolor="#eaeef2"><b>Subagents</b></td>
86
+ <td align="center" width="68" bgcolor="#eaeef2"><b>Structured<br>output</b></td>
87
+ <td align="center" width="52" bgcolor="#eaeef2"><b>Skills</b></td>
88
+ <td align="center" width="82" bgcolor="#eaeef2"><b>FS<br>tools</b></td>
89
+ <td align="center" width="62" bgcolor="#eaeef2"><b>OTel<br>tracing</b></td>
90
+ </tr>
91
+ </thead>
92
+ <tbody>
93
+ <!-- LOC: tokei thinharness/ -t Python · ryanbbrown/thinharness working tree, measured 2026-05-18 -->
94
+ <tr>
95
+ <td align="left" bgcolor="#f6f8fa"><b>ThinHarness</b></td>
96
+ <td align="right" bgcolor="#f6f8fa"><b>4,938</b></td>
97
+ <td align="center" bgcolor="#f6f8fa"><b>✅</b></td>
98
+ <td align="center" bgcolor="#f6f8fa"><b>✅</b></td>
99
+ <td align="center" bgcolor="#f6f8fa"><b>✅</b></td>
100
+ <td align="center" bgcolor="#f6f8fa"><b>✅</b></td>
101
+ <td align="center" bgcolor="#f6f8fa"><b>✅</b></td>
102
+ <td align="center" bgcolor="#f6f8fa"><b>✅</b></td>
103
+ </tr>
104
+ <!-- LOC: tokei src/claude_agent_sdk/ -t Python --exclude testing · anthropics/claude-agent-sdk-python @ c352a50 -->
105
+ <tr>
106
+ <td align="left" bgcolor="#ffffff">
107
+ <img src="https://cdn.jsdelivr.net/npm/simple-icons@latest/icons/anthropic.svg" width="20" height="20" align="absmiddle" alt="">
108
+ &nbsp;Claude&nbsp;Agent&nbsp;SDK<sup>3</sup>
109
+ </td>
110
+ <td align="right" bgcolor="#ffffff">8,202</td>
111
+ <td align="center" bgcolor="#ffffff">❌</td>
112
+ <td align="center" bgcolor="#ffffff">✅</td>
113
+ <td align="center" bgcolor="#ffffff">❌</td>
114
+ <td align="center" bgcolor="#ffffff">✅</td>
115
+ <td align="center" bgcolor="#ffffff">✅</td>
116
+ <td align="center" bgcolor="#ffffff">⚠️</td>
117
+ </tr>
118
+ <!-- LOC: tokei src/smolagents/ -t Python --exclude cli.py --exclude gradio_ui.py --exclude vision_web_browser.py · huggingface/smolagents @ 025b6ad -->
119
+ <tr>
120
+ <td align="left" bgcolor="#ffffff">
121
+ <img src="https://cdn.jsdelivr.net/npm/simple-icons@latest/icons/huggingface.svg" width="20" height="20" align="absmiddle" alt="">
122
+ &nbsp;smolagents
123
+ </td>
124
+ <td align="right" bgcolor="#ffffff">10,091</td>
125
+ <td align="center" bgcolor="#ffffff">❌</td>
126
+ <td align="center" bgcolor="#ffffff">✅</td>
127
+ <td align="center" bgcolor="#ffffff">✅</td>
128
+ <td align="center" bgcolor="#ffffff">❌</td>
129
+ <td align="center" bgcolor="#ffffff">❌</td>
130
+ <td align="center" bgcolor="#ffffff">✅</td>
131
+ </tr>
132
+ <!-- LOC: tokei libs/deepagents/deepagents/ -t Python · langchain-ai/deepagents @ 7465d77 -->
133
+ <!-- Substrate (see footnote 4): deepagents is a thin wrapper over LangChain/LangGraph.
134
+ Effective import surface ≈105k LOC, measured with the same strict filter as the rest of the table:
135
+ tokei libs/langgraph/langgraph/ libs/prebuilt/langgraph/ -t Python · langchain-ai/langgraph @ 076e2a3 => 26,144
136
+ tokei libs/core/langchain_core/ -t Python --exclude document_loaders --exclude documents --exclude embeddings --exclude indexing --exclude retrievers.py --exclude vectorstores --exclude cross_encoders.py · langchain-ai/langchain @ 73d4fd9 => 54,992
137
+ tokei libs/langchain_v1/langchain/ -t Python --exclude embeddings · langchain-ai/langchain @ 73d4fd9 => ~9,000
138
+ deepagents itself: 15,369
139
+ -->
140
+ <tr>
141
+ <td align="left" bgcolor="#ffffff">
142
+ <img src="https://cdn.jsdelivr.net/npm/simple-icons@latest/icons/langchain.svg" width="20" height="20" align="absmiddle" alt="">
143
+ &nbsp;deepagents<sup>4</sup>
144
+ </td>
145
+ <td align="right" bgcolor="#ffffff">15,369</td>
146
+ <td align="center" bgcolor="#ffffff">❌</td>
147
+ <td align="center" bgcolor="#ffffff">✅</td>
148
+ <td align="center" bgcolor="#ffffff">❌</td>
149
+ <td align="center" bgcolor="#ffffff">✅</td>
150
+ <td align="center" bgcolor="#ffffff">✅</td>
151
+ <td align="center" bgcolor="#ffffff">❌</td>
152
+ </tr>
153
+ <!-- LOC: tokei src/strands/ -t Python --exclude experimental --exclude vended_plugins --exclude multiagent/a2a · strands-agents/sdk-python @ 1232230 -->
154
+ <tr>
155
+ <td align="left" bgcolor="#ffffff">
156
+ <img src="https://cdn.jsdelivr.net/npm/simple-icons@latest/icons/amazonwebservices.svg" width="20" height="20" align="absmiddle" alt="">
157
+ &nbsp;AWS Strands
158
+ </td>
159
+ <td align="right" bgcolor="#ffffff">25,494</td>
160
+ <td align="center" bgcolor="#ffffff">⚠️</td>
161
+ <td align="center" bgcolor="#ffffff">✅</td>
162
+ <td align="center" bgcolor="#ffffff">✅</td>
163
+ <td align="center" bgcolor="#ffffff">❌</td>
164
+ <td align="center" bgcolor="#ffffff">❌</td>
165
+ <td align="center" bgcolor="#ffffff">✅</td>
166
+ </tr>
167
+ <!-- LOC: tokei python/packages/core/agent_framework/ -t Python --exclude _evaluation.py --exclude a2a --exclude ag_ui --exclude chatkit --exclude declarative --exclude devui --exclude hyperlight --exclude lab --exclude orchestrations --exclude mem0 --exclude redis --exclude microsoft · microsoft/agent-framework @ a60e541 -->
168
+ <tr>
169
+ <td align="left" bgcolor="#ffffff">
170
+ <img src="https://cdn.jsdelivr.net/npm/simple-icons@latest/icons/microsoft.svg" width="20" height="20" align="absmiddle" alt="">
171
+ &nbsp;Microsoft<br>
172
+ Agent Framework
173
+ </td>
174
+ <td align="right" bgcolor="#ffffff">34,751</td>
175
+ <td align="center" bgcolor="#ffffff">❌</td>
176
+ <td align="center" bgcolor="#ffffff">✅</td>
177
+ <td align="center" bgcolor="#ffffff">✅</td>
178
+ <td align="center" bgcolor="#ffffff">✅</td>
179
+ <td align="center" bgcolor="#ffffff">❌</td>
180
+ <td align="center" bgcolor="#ffffff">✅</td>
181
+ </tr>
182
+ <!-- LOC: tokei pydantic_ai_slim/pydantic_ai/ -t Python --exclude _a2a.py --exclude ag_ui.py --exclude ui --exclude durable_exec --exclude embeddings --exclude ext · pydantic/pydantic-ai @ ac684b2 -->
183
+ <tr>
184
+ <td align="left" bgcolor="#ffffff">
185
+ <img src="https://cdn.jsdelivr.net/npm/simple-icons@latest/icons/pydantic.svg" width="20" height="20" align="absmiddle" alt="">
186
+ &nbsp;Pydantic AI
187
+ </td>
188
+ <td align="right" bgcolor="#ffffff">51,231</td>
189
+ <td align="center" bgcolor="#ffffff">✅</td>
190
+ <td align="center" bgcolor="#ffffff">❌</td>
191
+ <td align="center" bgcolor="#ffffff">✅</td>
192
+ <td align="center" bgcolor="#ffffff">❌</td>
193
+ <td align="center" bgcolor="#ffffff">❌</td>
194
+ <td align="center" bgcolor="#ffffff">✅</td>
195
+ </tr>
196
+ <!-- LOC: tokei src/google/adk/ -t Python --exclude a2a --exclude apps --exclude cli --exclude cloud --exclude code_executors --exclude environment --exclude evaluation --exclude examples --exclude integrations --exclude optimization --exclude platform · google/adk-python @ bd062ec -->
197
+ <tr>
198
+ <td align="left" bgcolor="#ffffff">
199
+ <img src="https://cdn.jsdelivr.net/npm/simple-icons@latest/icons/google.svg" width="20" height="20" align="absmiddle" alt="">
200
+ &nbsp;Google ADK
201
+ </td>
202
+ <td align="right" bgcolor="#ffffff">57,392</td>
203
+ <td align="center" bgcolor="#ffffff">⚠️</td>
204
+ <td align="center" bgcolor="#ffffff">✅</td>
205
+ <td align="center" bgcolor="#ffffff">✅</td>
206
+ <td align="center" bgcolor="#ffffff">✅</td>
207
+ <td align="center" bgcolor="#ffffff">❌</td>
208
+ <td align="center" bgcolor="#ffffff">✅</td>
209
+ </tr>
210
+ <!-- LOC: tokei src/agents/ -t Python --exclude realtime --exclude voice --exclude extensions/experimental --exclude extensions/visualization.py · openai/openai-agents-python @ 4bd459e -->
211
+ <tr>
212
+ <td align="left" bgcolor="#ffffff">
213
+ <img src="https://cdn.jsdelivr.net/npm/simple-icons@latest/icons/openai.svg" width="20" height="20" align="absmiddle" alt="">
214
+ &nbsp;OpenAI&nbsp;Agents&nbsp;SDK
215
+ </td>
216
+ <td align="right" bgcolor="#ffffff">72,410</td>
217
+ <td align="center" bgcolor="#ffffff">✅</td>
218
+ <td align="center" bgcolor="#ffffff">✅</td>
219
+ <td align="center" bgcolor="#ffffff">✅</td>
220
+ <td align="center" bgcolor="#ffffff">❌</td>
221
+ <td align="center" bgcolor="#ffffff">❌</td>
222
+ <td align="center" bgcolor="#ffffff">✅</td>
223
+ </tr>
224
+ <!-- LOC: tokei libs/agno/agno/{agent,agents,approval,compression,factory,guardrails,hooks,memory,models,reasoning,registry,run,session,skills,team,tools,tracing,utils} -t Python · agno-agi/agno @ bb7ddb0 -->
225
+ <tr>
226
+ <td align="left" bgcolor="#ffffff">
227
+ <img src="assets/agno-a.svg" width="20" height="20" align="absmiddle" alt="">
228
+ &nbsp;Agno
229
+ </td>
230
+ <td align="right" bgcolor="#ffffff">106,852</td>
231
+ <td align="center" bgcolor="#ffffff">⚠️</td>
232
+ <td align="center" bgcolor="#ffffff">✅</td>
233
+ <td align="center" bgcolor="#ffffff">✅</td>
234
+ <td align="center" bgcolor="#ffffff">✅</td>
235
+ <td align="center" bgcolor="#ffffff">✅</td>
236
+ <td align="center" bgcolor="#ffffff">✅</td>
237
+ </tr>
238
+ </tbody>
239
+ </table>
240
+ <p align="center"><sub><em>* Table focuses on features that differentiate the harnesses. All listed also support MCP, lifecycle hooks, and multi-turn conversations.</em></sub></p>
241
+
242
+ <p align="left">
243
+ <sub>1. LOC excludes anything that is not the core agent harness framework. See raw README source comments for exact commands.<br>
244
+ 2. Tool retries: a documented primitive (e.g. Pydantic AI's <code>ModelRetry</code>) that lets tools signal "model passed bad args — retry with this feedback," distinct from generic exception propagation.<br>
245
+ 3. Claude Agent SDK shells out to the Claude Code CLI binary, which is 200k+ LOC.<br>
246
+ 4. deepagents is a thin wrapper over LangChain/LangGraph; effective import surface is ≈105k LOC.</sub>
247
+ </p>
248
+
249
+ </div>
250
+
251
+ <br>
252
+
253
+ See [docs/table.md](docs/table.md) for per-cell rationale and how the LOC numbers are measured.
254
+
255
+ ## Opinions
256
+
257
+ ThinHarness has opinions. They are the reason it stays small.
258
+
259
+ **No bash.** Business agents don't need a shell. Bash is a giant security surface, and agents mess up when writing shell commands more often than you'd initially expect. Cut it and most of those failures stop being possible.
260
+
261
+ **Skills are tools, not auto-discovery.** Skills live in directories you point at explicitly. The agent calls `skill_read` and `skill_run` like any other tool. No interactive scan of the workspace, no global skill marketplace, no magic. SDK use is deliberate; the auto-discovery design is for interactive coding agents and doesn't belong here.
262
+
263
+ **Search is a top priority.** The `search` tool is a Python port of [pgr](https://github.com/entireio/pgr)'s ranking; pgr [built benchmarks for agentic search](https://entire.io/blog/improving-agentic-search-in-coding-agents) and came up with a great way of exposing ripgrep to agents without raw bash. There's also a `jsonl_search` variant, because JSONL is the right shape when you're replacing RAG with agent-driven search over structured data (line-delimited, naturally chunked, `jq` + `rg`).
264
+
265
+ **Parallel LLM calls, built in.** Fan out from inside the harness when a workflow needs reliability beyond a single agent loop — majority vote, ensembled extraction. Set `builtin_parallel_llm_model` to enable the default `parallel_llm` tool for plain-text batches; for validated structured output per call, instantiate `ParallelLlmTool` yourself with `output_type` (a Pydantic model). Each call is stateless, and large batches can write JSON to `output_file`.
266
+
267
+ **Three providers, no matrix.** ThinHarness ships small provider classes for OpenAI, Anthropic, and OpenRouter. If your gateway speaks one of those protocols, you swap a base URL and move on. If not, the provider classes are small enough to fork or replace, and ignoring the bundled ones costs you nothing
268
+
269
+ **No compaction.** Compaction is a workaround for context windows filling up across long, accumulating runs — useful for interactive coding sessions that sprawl over hours. For SDK-based business agents, the right answer to "context is getting big" is almost always better task decomposition: shorter runs, separate harness instances, narrower subagents.
270
+
271
+ **No deployment layer.** Agents still need serving, auth, storage, retries, and observability in production. ThinHarness does not try to own that stack. A bundled deployment layer might work for some teams, but it will miss plenty of real production shapes; instead of adding more code and more options, ThinHarness stays an SDK and lets the host application own deployment.
272
+
273
+ ## Install
274
+
275
+ ```bash
276
+ uv add thinharness # or pip install thinharness
277
+ ```
278
+
279
+ Requires Python 3.11+.
280
+
281
+ ## Use
282
+
283
+ ```python
284
+ import asyncio
285
+ from thinharness import Harness, HarnessConfig
286
+
287
+ async def main():
288
+ async with Harness(HarnessConfig(root=".", model="openai:gpt-5.2")) as harness:
289
+ result = await harness.run("Read README.md and summarize it.")
290
+ print(result.text)
291
+
292
+ asyncio.run(main())
293
+ ```
294
+
295
+ There's a synchronous wrapper (`Harness(...).run_sync(...)`), Pydantic-typed structured output, lifecycle hooks, subagents, and path-scoped FS tools. The whole library is 15 files; the loop you care about is in [`thinharness/core.py`](thinharness/core.py) and the tools live in [`thinharness/tools/`](thinharness/tools/). Reading those files is faster than reading the docs would be.
296
+
297
+ ## Features
298
+
299
+ - **Filesystem tools:** `read`, `write`, `edit`, `search`, `list`, `glob`, and `jsonl_search` with root-scoped path policies.
300
+ - **Structured output:** Pydantic-validated results with native, tool, prompted, and text modes.
301
+ - **Hooks:** lifecycle and tool-call interception for prompt submission, tool calls, subagents, limits, and run boundaries.
302
+ - **Subagents:** opt-in delegation through a built-in `subagent` tool and explicit `SubAgentConfig`.
303
+ - **Parallel LLM:** opt-in `parallel_llm` fan-out for batches of independent one-shot prompts, plus `ParallelLlmTool(...).spec()` for renameable tools with explicit model, path, prompt, and retry settings.
304
+ - **Resume:** clean new-turn continuation through opaque provider session state.
305
+ - **MCP:** optional MCP client support with lazy tool discovery and collision checks.
306
+ - **Parallel tool calls:** same-turn tool batches run concurrently when every called tool is parallel-safe.
307
+ - **Tool retries and limit notices:** retryable argument/model mistakes use `ModelRetry`; near-limit guidance can warn the model before configured request or tool-call budgets are exhausted. Notices are harness-owned model input, not hooks or configurable callbacks. Parent and child runs compute notices from their own local budgets.
308
+ - **Tracing:** OpenTelemetry-compatible spans for runs, provider calls, tools, and subagents.
309
+
310
+ ## Status
311
+
312
+ Pre-1.0. APIs may shift, but I don't expect dramatic changes. Forking is a real option, not just a theoretical one: the codebase is small enough that pulling upstream changes into your fork by hand stays cheap. Each major feature (MCP, subagents, jsonl_search, parallel_llm, skills) lives in its own file with no hidden dependencies. If you don't use one, that's even less code to worry about. If you want to delete it entirely, that's a one-shot 10-word prompt to a coding agent.
313
+
314
+ ## License
315
+
316
+ MIT. Search ranking adapted from [pgr](https://github.com/entireio/pgr); see [docs/THIRD_PARTY_NOTICES.md](docs/THIRD_PARTY_NOTICES.md).
@@ -0,0 +1,277 @@
1
+ <p align="center">
2
+ <img src="assets/ThinHarness.svg" alt="ThinHarness" width="360">
3
+ </p>
4
+
5
+ <p align="center">
6
+ <br/>
7
+ A minimal, opinionated agent harness &mdash;
8
+ <br/>
9
+ focused scope, readable core, easy to fork.
10
+ <br/><br/>
11
+ </p>
12
+
13
+ <div align="center">
14
+
15
+ [![CI](https://img.shields.io/github/actions/workflow/status/ryanbbrown/thinharness/ci.yml?branch=main&label=CI)](https://github.com/ryanbbrown/thinharness/actions/workflows/ci.yml)
16
+ [![License](https://img.shields.io/badge/License-MIT-blue.svg)](https://github.com/ryanbbrown/thinharness/blob/main/LICENSE)
17
+ [![PyPI](https://img.shields.io/pypi/v/thinharness.svg)](https://pypi.org/project/thinharness/)
18
+
19
+ </div>
20
+
21
+ ## Why this exists
22
+
23
+ Filesystem-based agent harnesses are simple but powerful: easily auditable, flexible, and they work just as well for non-coding business tasks like research over a corpus, workflow automation, or multi-step analysis. But the harnesses that provide filesystem primitives are either coding agents (Claude Agent SDK) or are massive and highly abstracted (deepagents, Agno). Even if you don't want filesystem tools, the general-purpose agent harness libraries are missing features (see table below) — or large enough that it's a pain when you (inevitably) need to customize.
24
+
25
+ So I built one. The core agent loop isn't that complicated. Provider call, parse tool calls, run them, feed results back, repeat. ThinHarness is **4,938 lines of Python** across 15 files. The whole thing. Small enough to actually read. You can audit it. You can fork it without inheriting a fork-maintenance problem, because there isn't much there to drift.
26
+
27
+ <!--
28
+ LOC measurement scope: strict framework-only. Each row strips clearly
29
+ non-framework code from the upstream package — platform/deployment layers,
30
+ domain-specific modalities (voice/realtime), eval/optimizer suites, UI/CLI
31
+ tools, A2A/declarative wire protocols, code-executor backends. Provider
32
+ implementations stay IN (they're part of what you import to use the library).
33
+ The exact tokei command + upstream commit hash for each row is in an HTML
34
+ comment above the row, so the number is reproducible. Measured 2026-05-16
35
+ against the commit pinned in each row's comment.
36
+ -->
37
+
38
+ <div align="center">
39
+
40
+ <table>
41
+ <thead>
42
+ <tr>
43
+ <td align="left" width="256" bgcolor="#eaeef2"><b>Library</b></td>
44
+ <td align="center" width="70" bgcolor="#eaeef2"><b>LOC<sup>1</sup></b></td>
45
+ <td align="center" width="62" bgcolor="#eaeef2"><b>Tool<br>retries<sup>2</sup></b></td>
46
+ <td align="center" width="70" bgcolor="#eaeef2"><b>Subagents</b></td>
47
+ <td align="center" width="68" bgcolor="#eaeef2"><b>Structured<br>output</b></td>
48
+ <td align="center" width="52" bgcolor="#eaeef2"><b>Skills</b></td>
49
+ <td align="center" width="82" bgcolor="#eaeef2"><b>FS<br>tools</b></td>
50
+ <td align="center" width="62" bgcolor="#eaeef2"><b>OTel<br>tracing</b></td>
51
+ </tr>
52
+ </thead>
53
+ <tbody>
54
+ <!-- LOC: tokei thinharness/ -t Python · ryanbbrown/thinharness working tree, measured 2026-05-18 -->
55
+ <tr>
56
+ <td align="left" bgcolor="#f6f8fa"><b>ThinHarness</b></td>
57
+ <td align="right" bgcolor="#f6f8fa"><b>4,938</b></td>
58
+ <td align="center" bgcolor="#f6f8fa"><b>✅</b></td>
59
+ <td align="center" bgcolor="#f6f8fa"><b>✅</b></td>
60
+ <td align="center" bgcolor="#f6f8fa"><b>✅</b></td>
61
+ <td align="center" bgcolor="#f6f8fa"><b>✅</b></td>
62
+ <td align="center" bgcolor="#f6f8fa"><b>✅</b></td>
63
+ <td align="center" bgcolor="#f6f8fa"><b>✅</b></td>
64
+ </tr>
65
+ <!-- LOC: tokei src/claude_agent_sdk/ -t Python --exclude testing · anthropics/claude-agent-sdk-python @ c352a50 -->
66
+ <tr>
67
+ <td align="left" bgcolor="#ffffff">
68
+ <img src="https://cdn.jsdelivr.net/npm/simple-icons@latest/icons/anthropic.svg" width="20" height="20" align="absmiddle" alt="">
69
+ &nbsp;Claude&nbsp;Agent&nbsp;SDK<sup>3</sup>
70
+ </td>
71
+ <td align="right" bgcolor="#ffffff">8,202</td>
72
+ <td align="center" bgcolor="#ffffff">❌</td>
73
+ <td align="center" bgcolor="#ffffff">✅</td>
74
+ <td align="center" bgcolor="#ffffff">❌</td>
75
+ <td align="center" bgcolor="#ffffff">✅</td>
76
+ <td align="center" bgcolor="#ffffff">✅</td>
77
+ <td align="center" bgcolor="#ffffff">⚠️</td>
78
+ </tr>
79
+ <!-- LOC: tokei src/smolagents/ -t Python --exclude cli.py --exclude gradio_ui.py --exclude vision_web_browser.py · huggingface/smolagents @ 025b6ad -->
80
+ <tr>
81
+ <td align="left" bgcolor="#ffffff">
82
+ <img src="https://cdn.jsdelivr.net/npm/simple-icons@latest/icons/huggingface.svg" width="20" height="20" align="absmiddle" alt="">
83
+ &nbsp;smolagents
84
+ </td>
85
+ <td align="right" bgcolor="#ffffff">10,091</td>
86
+ <td align="center" bgcolor="#ffffff">❌</td>
87
+ <td align="center" bgcolor="#ffffff">✅</td>
88
+ <td align="center" bgcolor="#ffffff">✅</td>
89
+ <td align="center" bgcolor="#ffffff">❌</td>
90
+ <td align="center" bgcolor="#ffffff">❌</td>
91
+ <td align="center" bgcolor="#ffffff">✅</td>
92
+ </tr>
93
+ <!-- LOC: tokei libs/deepagents/deepagents/ -t Python · langchain-ai/deepagents @ 7465d77 -->
94
+ <!-- Substrate (see footnote 4): deepagents is a thin wrapper over LangChain/LangGraph.
95
+ Effective import surface ≈105k LOC, measured with the same strict filter as the rest of the table:
96
+ tokei libs/langgraph/langgraph/ libs/prebuilt/langgraph/ -t Python · langchain-ai/langgraph @ 076e2a3 => 26,144
97
+ tokei libs/core/langchain_core/ -t Python --exclude document_loaders --exclude documents --exclude embeddings --exclude indexing --exclude retrievers.py --exclude vectorstores --exclude cross_encoders.py · langchain-ai/langchain @ 73d4fd9 => 54,992
98
+ tokei libs/langchain_v1/langchain/ -t Python --exclude embeddings · langchain-ai/langchain @ 73d4fd9 => ~9,000
99
+ deepagents itself: 15,369
100
+ -->
101
+ <tr>
102
+ <td align="left" bgcolor="#ffffff">
103
+ <img src="https://cdn.jsdelivr.net/npm/simple-icons@latest/icons/langchain.svg" width="20" height="20" align="absmiddle" alt="">
104
+ &nbsp;deepagents<sup>4</sup>
105
+ </td>
106
+ <td align="right" bgcolor="#ffffff">15,369</td>
107
+ <td align="center" bgcolor="#ffffff">❌</td>
108
+ <td align="center" bgcolor="#ffffff">✅</td>
109
+ <td align="center" bgcolor="#ffffff">❌</td>
110
+ <td align="center" bgcolor="#ffffff">✅</td>
111
+ <td align="center" bgcolor="#ffffff">✅</td>
112
+ <td align="center" bgcolor="#ffffff">❌</td>
113
+ </tr>
114
+ <!-- LOC: tokei src/strands/ -t Python --exclude experimental --exclude vended_plugins --exclude multiagent/a2a · strands-agents/sdk-python @ 1232230 -->
115
+ <tr>
116
+ <td align="left" bgcolor="#ffffff">
117
+ <img src="https://cdn.jsdelivr.net/npm/simple-icons@latest/icons/amazonwebservices.svg" width="20" height="20" align="absmiddle" alt="">
118
+ &nbsp;AWS Strands
119
+ </td>
120
+ <td align="right" bgcolor="#ffffff">25,494</td>
121
+ <td align="center" bgcolor="#ffffff">⚠️</td>
122
+ <td align="center" bgcolor="#ffffff">✅</td>
123
+ <td align="center" bgcolor="#ffffff">✅</td>
124
+ <td align="center" bgcolor="#ffffff">❌</td>
125
+ <td align="center" bgcolor="#ffffff">❌</td>
126
+ <td align="center" bgcolor="#ffffff">✅</td>
127
+ </tr>
128
+ <!-- LOC: tokei python/packages/core/agent_framework/ -t Python --exclude _evaluation.py --exclude a2a --exclude ag_ui --exclude chatkit --exclude declarative --exclude devui --exclude hyperlight --exclude lab --exclude orchestrations --exclude mem0 --exclude redis --exclude microsoft · microsoft/agent-framework @ a60e541 -->
129
+ <tr>
130
+ <td align="left" bgcolor="#ffffff">
131
+ <img src="https://cdn.jsdelivr.net/npm/simple-icons@latest/icons/microsoft.svg" width="20" height="20" align="absmiddle" alt="">
132
+ &nbsp;Microsoft<br>
133
+ Agent Framework
134
+ </td>
135
+ <td align="right" bgcolor="#ffffff">34,751</td>
136
+ <td align="center" bgcolor="#ffffff">❌</td>
137
+ <td align="center" bgcolor="#ffffff">✅</td>
138
+ <td align="center" bgcolor="#ffffff">✅</td>
139
+ <td align="center" bgcolor="#ffffff">✅</td>
140
+ <td align="center" bgcolor="#ffffff">❌</td>
141
+ <td align="center" bgcolor="#ffffff">✅</td>
142
+ </tr>
143
+ <!-- LOC: tokei pydantic_ai_slim/pydantic_ai/ -t Python --exclude _a2a.py --exclude ag_ui.py --exclude ui --exclude durable_exec --exclude embeddings --exclude ext · pydantic/pydantic-ai @ ac684b2 -->
144
+ <tr>
145
+ <td align="left" bgcolor="#ffffff">
146
+ <img src="https://cdn.jsdelivr.net/npm/simple-icons@latest/icons/pydantic.svg" width="20" height="20" align="absmiddle" alt="">
147
+ &nbsp;Pydantic AI
148
+ </td>
149
+ <td align="right" bgcolor="#ffffff">51,231</td>
150
+ <td align="center" bgcolor="#ffffff">✅</td>
151
+ <td align="center" bgcolor="#ffffff">❌</td>
152
+ <td align="center" bgcolor="#ffffff">✅</td>
153
+ <td align="center" bgcolor="#ffffff">❌</td>
154
+ <td align="center" bgcolor="#ffffff">❌</td>
155
+ <td align="center" bgcolor="#ffffff">✅</td>
156
+ </tr>
157
+ <!-- LOC: tokei src/google/adk/ -t Python --exclude a2a --exclude apps --exclude cli --exclude cloud --exclude code_executors --exclude environment --exclude evaluation --exclude examples --exclude integrations --exclude optimization --exclude platform · google/adk-python @ bd062ec -->
158
+ <tr>
159
+ <td align="left" bgcolor="#ffffff">
160
+ <img src="https://cdn.jsdelivr.net/npm/simple-icons@latest/icons/google.svg" width="20" height="20" align="absmiddle" alt="">
161
+ &nbsp;Google ADK
162
+ </td>
163
+ <td align="right" bgcolor="#ffffff">57,392</td>
164
+ <td align="center" bgcolor="#ffffff">⚠️</td>
165
+ <td align="center" bgcolor="#ffffff">✅</td>
166
+ <td align="center" bgcolor="#ffffff">✅</td>
167
+ <td align="center" bgcolor="#ffffff">✅</td>
168
+ <td align="center" bgcolor="#ffffff">❌</td>
169
+ <td align="center" bgcolor="#ffffff">✅</td>
170
+ </tr>
171
+ <!-- LOC: tokei src/agents/ -t Python --exclude realtime --exclude voice --exclude extensions/experimental --exclude extensions/visualization.py · openai/openai-agents-python @ 4bd459e -->
172
+ <tr>
173
+ <td align="left" bgcolor="#ffffff">
174
+ <img src="https://cdn.jsdelivr.net/npm/simple-icons@latest/icons/openai.svg" width="20" height="20" align="absmiddle" alt="">
175
+ &nbsp;OpenAI&nbsp;Agents&nbsp;SDK
176
+ </td>
177
+ <td align="right" bgcolor="#ffffff">72,410</td>
178
+ <td align="center" bgcolor="#ffffff">✅</td>
179
+ <td align="center" bgcolor="#ffffff">✅</td>
180
+ <td align="center" bgcolor="#ffffff">✅</td>
181
+ <td align="center" bgcolor="#ffffff">❌</td>
182
+ <td align="center" bgcolor="#ffffff">❌</td>
183
+ <td align="center" bgcolor="#ffffff">✅</td>
184
+ </tr>
185
+ <!-- LOC: tokei libs/agno/agno/{agent,agents,approval,compression,factory,guardrails,hooks,memory,models,reasoning,registry,run,session,skills,team,tools,tracing,utils} -t Python · agno-agi/agno @ bb7ddb0 -->
186
+ <tr>
187
+ <td align="left" bgcolor="#ffffff">
188
+ <img src="assets/agno-a.svg" width="20" height="20" align="absmiddle" alt="">
189
+ &nbsp;Agno
190
+ </td>
191
+ <td align="right" bgcolor="#ffffff">106,852</td>
192
+ <td align="center" bgcolor="#ffffff">⚠️</td>
193
+ <td align="center" bgcolor="#ffffff">✅</td>
194
+ <td align="center" bgcolor="#ffffff">✅</td>
195
+ <td align="center" bgcolor="#ffffff">✅</td>
196
+ <td align="center" bgcolor="#ffffff">✅</td>
197
+ <td align="center" bgcolor="#ffffff">✅</td>
198
+ </tr>
199
+ </tbody>
200
+ </table>
201
+ <p align="center"><sub><em>* Table focuses on features that differentiate the harnesses. All listed also support MCP, lifecycle hooks, and multi-turn conversations.</em></sub></p>
202
+
203
+ <p align="left">
204
+ <sub>1. LOC excludes anything that is not the core agent harness framework. See raw README source comments for exact commands.<br>
205
+ 2. Tool retries: a documented primitive (e.g. Pydantic AI's <code>ModelRetry</code>) that lets tools signal "model passed bad args — retry with this feedback," distinct from generic exception propagation.<br>
206
+ 3. Claude Agent SDK shells out to the Claude Code CLI binary, which is 200k+ LOC.<br>
207
+ 4. deepagents is a thin wrapper over LangChain/LangGraph; effective import surface is ≈105k LOC.</sub>
208
+ </p>
209
+
210
+ </div>
211
+
212
+ <br>
213
+
214
+ See [docs/table.md](docs/table.md) for per-cell rationale and how the LOC numbers are measured.
215
+
216
+ ## Opinions
217
+
218
+ ThinHarness has opinions. They are the reason it stays small.
219
+
220
+ **No bash.** Business agents don't need a shell. Bash is a giant security surface, and agents mess up when writing shell commands more often than you'd initially expect. Cut it and most of those failures stop being possible.
221
+
222
+ **Skills are tools, not auto-discovery.** Skills live in directories you point at explicitly. The agent calls `skill_read` and `skill_run` like any other tool. No interactive scan of the workspace, no global skill marketplace, no magic. SDK use is deliberate; the auto-discovery design is for interactive coding agents and doesn't belong here.
223
+
224
+ **Search is a top priority.** The `search` tool is a Python port of [pgr](https://github.com/entireio/pgr)'s ranking; pgr [built benchmarks for agentic search](https://entire.io/blog/improving-agentic-search-in-coding-agents) and came up with a great way of exposing ripgrep to agents without raw bash. There's also a `jsonl_search` variant, because JSONL is the right shape when you're replacing RAG with agent-driven search over structured data (line-delimited, naturally chunked, `jq` + `rg`).
225
+
226
+ **Parallel LLM calls, built in.** Fan out from inside the harness when a workflow needs reliability beyond a single agent loop — majority vote, ensembled extraction. Set `builtin_parallel_llm_model` to enable the default `parallel_llm` tool for plain-text batches; for validated structured output per call, instantiate `ParallelLlmTool` yourself with `output_type` (a Pydantic model). Each call is stateless, and large batches can write JSON to `output_file`.
227
+
228
+ **Three providers, no matrix.** ThinHarness ships small provider classes for OpenAI, Anthropic, and OpenRouter. If your gateway speaks one of those protocols, you swap a base URL and move on. If not, the provider classes are small enough to fork or replace, and ignoring the bundled ones costs you nothing
229
+
230
+ **No compaction.** Compaction is a workaround for context windows filling up across long, accumulating runs — useful for interactive coding sessions that sprawl over hours. For SDK-based business agents, the right answer to "context is getting big" is almost always better task decomposition: shorter runs, separate harness instances, narrower subagents.
231
+
232
+ **No deployment layer.** Agents still need serving, auth, storage, retries, and observability in production. ThinHarness does not try to own that stack. A bundled deployment layer might work for some teams, but it will miss plenty of real production shapes; instead of adding more code and more options, ThinHarness stays an SDK and lets the host application own deployment.
233
+
234
+ ## Install
235
+
236
+ ```bash
237
+ uv add thinharness # or pip install thinharness
238
+ ```
239
+
240
+ Requires Python 3.11+.
241
+
242
+ ## Use
243
+
244
+ ```python
245
+ import asyncio
246
+ from thinharness import Harness, HarnessConfig
247
+
248
+ async def main():
249
+ async with Harness(HarnessConfig(root=".", model="openai:gpt-5.2")) as harness:
250
+ result = await harness.run("Read README.md and summarize it.")
251
+ print(result.text)
252
+
253
+ asyncio.run(main())
254
+ ```
255
+
256
+ There's a synchronous wrapper (`Harness(...).run_sync(...)`), Pydantic-typed structured output, lifecycle hooks, subagents, and path-scoped FS tools. The whole library is 15 files; the loop you care about is in [`thinharness/core.py`](thinharness/core.py) and the tools live in [`thinharness/tools/`](thinharness/tools/). Reading those files is faster than reading the docs would be.
257
+
258
+ ## Features
259
+
260
+ - **Filesystem tools:** `read`, `write`, `edit`, `search`, `list`, `glob`, and `jsonl_search` with root-scoped path policies.
261
+ - **Structured output:** Pydantic-validated results with native, tool, prompted, and text modes.
262
+ - **Hooks:** lifecycle and tool-call interception for prompt submission, tool calls, subagents, limits, and run boundaries.
263
+ - **Subagents:** opt-in delegation through a built-in `subagent` tool and explicit `SubAgentConfig`.
264
+ - **Parallel LLM:** opt-in `parallel_llm` fan-out for batches of independent one-shot prompts, plus `ParallelLlmTool(...).spec()` for renameable tools with explicit model, path, prompt, and retry settings.
265
+ - **Resume:** clean new-turn continuation through opaque provider session state.
266
+ - **MCP:** optional MCP client support with lazy tool discovery and collision checks.
267
+ - **Parallel tool calls:** same-turn tool batches run concurrently when every called tool is parallel-safe.
268
+ - **Tool retries and limit notices:** retryable argument/model mistakes use `ModelRetry`; near-limit guidance can warn the model before configured request or tool-call budgets are exhausted. Notices are harness-owned model input, not hooks or configurable callbacks. Parent and child runs compute notices from their own local budgets.
269
+ - **Tracing:** OpenTelemetry-compatible spans for runs, provider calls, tools, and subagents.
270
+
271
+ ## Status
272
+
273
+ Pre-1.0. APIs may shift, but I don't expect dramatic changes. Forking is a real option, not just a theoretical one: the codebase is small enough that pulling upstream changes into your fork by hand stays cheap. Each major feature (MCP, subagents, jsonl_search, parallel_llm, skills) lives in its own file with no hidden dependencies. If you don't use one, that's even less code to worry about. If you want to delete it entirely, that's a one-shot 10-word prompt to a coding agent.
274
+
275
+ ## License
276
+
277
+ MIT. Search ranking adapted from [pgr](https://github.com/entireio/pgr); see [docs/THIRD_PARTY_NOTICES.md](docs/THIRD_PARTY_NOTICES.md).