state-harness 0.2.0__tar.gz

Files changed (54)
  1. state_harness-0.2.0/.github/ISSUE_TEMPLATE/bug_report.md +30 -0
  2. state_harness-0.2.0/.github/ISSUE_TEMPLATE/feature_request.md +18 -0
  3. state_harness-0.2.0/.github/workflows/ci.yml +43 -0
  4. state_harness-0.2.0/.github/workflows/release.yml +101 -0
  5. state_harness-0.2.0/.gitignore +249 -0
  6. state_harness-0.2.0/CHANGELOG.md +33 -0
  7. state_harness-0.2.0/CONTRIBUTING.md +66 -0
  8. state_harness-0.2.0/Cargo.lock +410 -0
  9. state_harness-0.2.0/Cargo.toml +22 -0
  10. state_harness-0.2.0/LICENSE.md +42 -0
  11. state_harness-0.2.0/PKG-INFO +483 -0
  12. state_harness-0.2.0/README.md +451 -0
  13. state_harness-0.2.0/SECURITY.md +48 -0
  14. state_harness-0.2.0/benchmarks/README.md +182 -0
  15. state_harness-0.2.0/benchmarks/analyze_results.py +201 -0
  16. state_harness-0.2.0/benchmarks/mint/configs/gemini_baseline_coding_humaneval.json +36 -0
  17. state_harness-0.2.0/benchmarks/mint/configs/gemini_baseline_coding_mbpp.json +36 -0
  18. state_harness-0.2.0/benchmarks/mint/configs/gemini_baseline_reasoning_gsm8k.json +36 -0
  19. state_harness-0.2.0/benchmarks/mint/configs/gemini_baseline_reasoning_math.json +36 -0
  20. state_harness-0.2.0/benchmarks/mint/gemini_agent.py +59 -0
  21. state_harness-0.2.0/benchmarks/mint/gemini_feedback_agent.py +93 -0
  22. state_harness-0.2.0/benchmarks/mint/mint_harness.py +148 -0
  23. state_harness-0.2.0/benchmarks/mint/run_harness_mint.py +269 -0
  24. state_harness-0.2.0/benchmarks/mint/run_mint.sh +86 -0
  25. state_harness-0.2.0/benchmarks/mint/setup_mint.sh +129 -0
  26. state_harness-0.2.0/benchmarks/rerun_retail.sh +154 -0
  27. state_harness-0.2.0/benchmarks/run_definitive_benchmark.sh +130 -0
  28. state_harness-0.2.0/benchmarks/run_domain_airline.sh +48 -0
  29. state_harness-0.2.0/benchmarks/run_domain_retail.sh +64 -0
  30. state_harness-0.2.0/benchmarks/run_domain_telecom.sh +48 -0
  31. state_harness-0.2.0/benchmarks/run_full_benchmark.sh +191 -0
  32. state_harness-0.2.0/benchmarks/swe_bench/docker_run.patch +59 -0
  33. state_harness-0.2.0/benchmarks/swe_bench/flow_configs/swebench_baseline.json +150 -0
  34. state_harness-0.2.0/benchmarks/swe_bench/flow_configs/swebench_harness.json +155 -0
  35. state_harness-0.2.0/benchmarks/swe_bench/harness_loop.py +310 -0
  36. state_harness-0.2.0/benchmarks/swe_bench/run_benchmark.sh +177 -0
  37. state_harness-0.2.0/benchmarks/swe_bench/setup.sh +138 -0
  38. state_harness-0.2.0/examples/failure_report_demo.py +123 -0
  39. state_harness-0.2.0/examples/quickstart.py +32 -0
  40. state_harness-0.2.0/examples/tau3_bench_agent.py +211 -0
  41. state_harness-0.2.0/pyproject.toml +54 -0
  42. state_harness-0.2.0/pyrightconfig.json +16 -0
  43. state_harness-0.2.0/python/state_harness/LICENSE +190 -0
  44. state_harness-0.2.0/python/state_harness/__init__.py +78 -0
  45. state_harness-0.2.0/python/state_harness/_core.pyi +172 -0
  46. state_harness-0.2.0/python/state_harness/adapters.py +399 -0
  47. state_harness-0.2.0/python/state_harness/diagnostics.py +689 -0
  48. state_harness-0.2.0/python/state_harness/sdk.py +770 -0
  49. state_harness-0.2.0/scripts/health_check.py +76 -0
  50. state_harness-0.2.0/src/lib.rs +65 -0
  51. state_harness-0.2.0/src/lyapunov.rs +940 -0
  52. state_harness-0.2.0/src/rg.rs +799 -0
  53. state_harness-0.2.0/src/vsa.rs +985 -0
  54. state_harness-0.2.0/tests/test_integration.py +651 -0
@@ -0,0 +1,30 @@
1
+ ---
2
+ name: Bug report
3
+ about: Report a bug in state-harness
4
+ title: "[BUG] "
5
+ labels: bug
6
+ ---
7
+
8
+ **Describe the bug**
9
+ A clear description of what the bug is.
10
+
11
+ **To reproduce**
12
+ ```python
13
+ # Minimal code to reproduce the issue
14
+ from state_harness import GrowthRatioGuard
15
+
16
+ guard = GrowthRatioGuard(token_budget=50_000)
17
+ # ...
18
+ ```
19
+
20
+ **Expected behavior**
21
+ What you expected to happen.
22
+
23
+ **Actual behavior**
24
+ What actually happened. Include the full error traceback if applicable.
25
+
26
+ **Environment**
27
+ - OS: [e.g., macOS 15, Ubuntu 24.04]
28
+ - Python: [e.g., 3.12]
29
+ - state-harness version: [e.g., 0.2.0]
30
+ - Install method: [pip / maturin develop]
@@ -0,0 +1,18 @@
1
+ ---
2
+ name: Feature request
3
+ about: Suggest a feature or improvement
4
+ title: "[FEATURE] "
5
+ labels: enhancement
6
+ ---
7
+
8
+ **Problem**
9
+ What problem does this feature solve? What's the current pain point?
10
+
11
+ **Proposed solution**
12
+ Describe what you'd like to see.
13
+
14
+ **Alternatives considered**
15
+ Any other approaches you've thought about.
16
+
17
+ **Use case**
18
+ How would you use this feature in your agent system?
@@ -0,0 +1,43 @@
1
+ name: CI
2
+
3
+ on:
4
+ push:
5
+ branches: [main]
6
+ pull_request:
7
+ branches: [main]
8
+
9
+ env:
10
+ PYO3_USE_ABI3_FORWARD_COMPATIBILITY: "1"
11
+
12
+ jobs:
13
+ lint:
14
+ runs-on: ubuntu-latest
15
+ steps:
16
+ - uses: actions/checkout@v4
17
+ - uses: dtolnay/rust-toolchain@stable
18
+ with:
19
+ components: clippy, rustfmt
20
+ - uses: actions/setup-python@v5
21
+ with:
22
+ python-version: "3.12"
23
+ - run: cargo fmt --check
24
+ - run: cargo clippy -- -D warnings
25
+
26
+ test:
27
+ runs-on: ${{ matrix.os }}
28
+ strategy:
29
+ matrix:
30
+ os: [ubuntu-latest, macos-latest]
31
+ python-version: ["3.10", "3.12"]
32
+ steps:
33
+ - uses: actions/checkout@v4
34
+ - uses: dtolnay/rust-toolchain@stable
35
+ - uses: actions/setup-python@v5
36
+ with:
37
+ python-version: ${{ matrix.python-version }}
38
+ - name: Build and test
39
+ run: |
40
+ pip install maturin pytest
41
+ maturin build --release
42
+ pip install target/wheels/*.whl
43
+ pytest tests/ -v
@@ -0,0 +1,101 @@
1
+ name: Release to PyPI
2
+
3
+ on:
4
+ push:
5
+ branches: [main]
6
+ tags:
7
+ - "v*"
8
+
9
+ permissions:
10
+ contents: write
11
+
12
+ env:
13
+ PYO3_USE_ABI3_FORWARD_COMPATIBILITY: "1"
14
+
15
+ jobs:
16
+ build:
17
+ runs-on: ${{ matrix.os }}
18
+ strategy:
19
+ matrix:
20
+ include:
21
+ - os: ubuntu-latest
22
+ target: x86_64-unknown-linux-gnu
23
+ - os: ubuntu-latest
24
+ target: aarch64-unknown-linux-gnu
25
+ - os: macos-latest
26
+ target: x86_64-apple-darwin
27
+ - os: macos-latest
28
+ target: aarch64-apple-darwin
29
+ - os: windows-latest
30
+ target: x86_64-pc-windows-msvc
31
+ steps:
32
+ - uses: actions/checkout@v4
33
+ - uses: actions/setup-python@v5
34
+ with:
35
+ python-version: "3.12"
36
+ - uses: dtolnay/rust-toolchain@stable
37
+ with:
38
+ targets: ${{ matrix.target }}
39
+ - name: Build wheels
40
+ uses: PyO3/maturin-action@v1
41
+ with:
42
+ target: ${{ matrix.target }}
43
+ args: --release --out dist --interpreter python3.10 python3.11 python3.12
44
+ manylinux: auto
45
+ - uses: actions/upload-artifact@v4
46
+ with:
47
+ name: wheels-${{ matrix.target }}
48
+ path: dist
49
+
50
+ sdist:
51
+ runs-on: ubuntu-latest
52
+ steps:
53
+ - uses: actions/checkout@v4
54
+ - name: Build sdist
55
+ uses: PyO3/maturin-action@v1
56
+ with:
57
+ command: sdist
58
+ args: --out dist
59
+ - uses: actions/upload-artifact@v4
60
+ with:
61
+ name: sdist
62
+ path: dist
63
+
64
+ publish:
65
+ needs: [build, sdist]
66
+ if: startsWith(github.ref, 'refs/tags/v')
67
+ runs-on: ubuntu-latest
68
+ environment: release
69
+ permissions:
70
+ id-token: write # for trusted publishing
71
+ steps:
72
+ - uses: actions/download-artifact@v4
73
+ with:
74
+ pattern: wheels-*
75
+ merge-multiple: true
76
+ path: dist
77
+ - uses: actions/download-artifact@v4
78
+ with:
79
+ name: sdist
80
+ path: dist
81
+ - name: Publish to PyPI
82
+ uses: pypa/gh-action-pypi-publish@release/v1
83
+
84
+ github-release:
85
+ needs: publish
86
+ if: startsWith(github.ref, 'refs/tags/v')
87
+ runs-on: ubuntu-latest
88
+ permissions:
89
+ contents: write
90
+ steps:
91
+ - uses: actions/checkout@v4
92
+ - uses: actions/download-artifact@v4
93
+ with:
94
+ pattern: wheels-*
95
+ merge-multiple: true
96
+ path: dist
97
+ - name: Create GitHub Release
98
+ uses: softprops/action-gh-release@v2
99
+ with:
100
+ files: dist/*
101
+ generate_release_notes: true
@@ -0,0 +1,249 @@
1
+ # === State-Harness specifics ===
2
+ # Secrets — NEVER commit these
3
+ .env
4
+ .env.*
5
+ *.key
6
+ *credentials*.json
7
+
8
+ # Logs from benchmark runs
9
+ *.log
10
+
11
+ # Benchmark results (large, released separately)
12
+ benchmark_results/
13
+
14
+ # Rust build artifacts
15
+ target/
16
+ Cargo.lock
17
+
18
+ # Virtual environments
19
+ .venv/
20
+
21
+ # === Standard Python gitignore ===
22
+
23
+ # Byte-compiled / optimized / DLL files
24
+ __pycache__/
25
+ *.py[codz]
26
+ *$py.class
27
+
28
+ # C extensions
29
+ *.so
30
+
31
+ # Distribution / packaging
32
+ .Python
33
+ build/
34
+ develop-eggs/
35
+ dist/
36
+ downloads/
37
+ eggs/
38
+ .eggs/
39
+ lib/
40
+ lib64/
41
+ parts/
42
+ sdist/
43
+ var/
44
+ wheels/
45
+ share/python-wheels/
46
+ *.egg-info/
47
+ .installed.cfg
48
+ *.egg
49
+ MANIFEST
50
+
51
+ # PyInstaller
52
+ # Usually these files are written by a python script from a template
53
+ # before PyInstaller builds the exe, so as to inject date/other infos into it.
54
+ *.manifest
55
+ *.spec
56
+
57
+ # Installer logs
58
+ pip-log.txt
59
+ pip-delete-this-directory.txt
60
+
61
+ # Unit test / coverage reports
62
+ htmlcov/
63
+ .tox/
64
+ .nox/
65
+ .coverage
66
+ .coverage.*
67
+ .cache
68
+ nosetests.xml
69
+ coverage.xml
70
+ *.cover
71
+ *.py.cover
72
+ .hypothesis/
73
+ .pytest_cache/
74
+ cover/
75
+
76
+ # Translations
77
+ *.mo
78
+ *.pot
79
+
80
+ # Django stuff:
81
+ *.log
82
+ local_settings.py
83
+ db.sqlite3
84
+ db.sqlite3-journal
85
+
86
+ # Flask stuff:
87
+ instance/
88
+ .webassets-cache
89
+
90
+ # Scrapy stuff:
91
+ .scrapy
92
+
93
+ # Sphinx documentation
94
+ docs/_build/
95
+
96
+ # PyBuilder
97
+ .pybuilder/
98
+ target/
99
+
100
+ # Jupyter Notebook
101
+ .ipynb_checkpoints
102
+
103
+ # IPython
104
+ profile_default/
105
+ ipython_config.py
106
+
107
+ # pyenv
108
+ # For a library or package, you might want to ignore these files since the code is
109
+ # intended to run in multiple environments; otherwise, check them in:
110
+ # .python-version
111
+
112
+ # pipenv
113
+ # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
114
+ # However, in case of collaboration, if having platform-specific dependencies or dependencies
115
+ # having no cross-platform support, pipenv may install dependencies that don't work, or not
116
+ # install all needed dependencies.
117
+ # Pipfile.lock
118
+
119
+ # UV
120
+ # Similar to Pipfile.lock, it is generally recommended to include uv.lock in version control.
121
+ # This is especially recommended for binary packages to ensure reproducibility, and is more
122
+ # commonly ignored for libraries.
123
+ # uv.lock
124
+
125
+ # poetry
126
+ # Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
127
+ # This is especially recommended for binary packages to ensure reproducibility, and is more
128
+ # commonly ignored for libraries.
129
+ # https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
130
+ # poetry.lock
131
+ # poetry.toml
132
+
133
+ # pdm
134
+ # Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
135
+ # pdm recommends including project-wide configuration in pdm.toml, but excluding .pdm-python.
136
+ # https://pdm-project.org/en/latest/usage/project/#working-with-version-control
137
+ # pdm.lock
138
+ # pdm.toml
139
+ .pdm-python
140
+ .pdm-build/
141
+
142
+ # pixi
143
+ # Similar to Pipfile.lock, it is generally recommended to include pixi.lock in version control.
144
+ # pixi.lock
145
+ # Pixi creates a virtual environment in the .pixi directory, just like venv module creates one
146
+ # in the .venv directory. It is recommended not to include this directory in version control.
147
+ .pixi
148
+
149
+ # PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
150
+ __pypackages__/
151
+
152
+ # Celery stuff
153
+ celerybeat-schedule
154
+ celerybeat.pid
155
+
156
+ # Redis
157
+ *.rdb
158
+ *.aof
159
+ *.pid
160
+
161
+ # RabbitMQ
162
+ mnesia/
163
+ rabbitmq/
164
+ rabbitmq-data/
165
+
166
+ # ActiveMQ
167
+ activemq-data/
168
+
169
+ # SageMath parsed files
170
+ *.sage.py
171
+
172
+ # Environments
173
+ .env
174
+ .envrc
175
+ .venv
176
+ env/
177
+ venv/
178
+ ENV/
179
+ env.bak/
180
+ venv.bak/
181
+
182
+ # Spyder project settings
183
+ .spyderproject
184
+ .spyproject
185
+
186
+ # Rope project settings
187
+ .ropeproject
188
+
189
+ # mkdocs documentation
190
+ /site
191
+
192
+ # mypy
193
+ .mypy_cache/
194
+ .dmypy.json
195
+ dmypy.json
196
+
197
+ # Pyre type checker
198
+ .pyre/
199
+
200
+ # pytype static type analyzer
201
+ .pytype/
202
+
203
+ # Cython debug symbols
204
+ cython_debug/
205
+
206
+ # PyCharm
207
+ # JetBrains specific template is maintained in a separate JetBrains.gitignore that can
208
+ # be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
209
+ # and can be added to the global gitignore or merged into this file. For a more nuclear
210
+ # option (not recommended) you can uncomment the following to ignore the entire idea folder.
211
+ # .idea/
212
+
213
+ # Abstra
214
+ # Abstra is an AI-powered process automation framework.
215
+ # Ignore directories containing user credentials, local state, and settings.
216
+ # Learn more at https://abstra.io/docs
217
+ .abstra/
218
+
219
+ # Visual Studio Code
220
+ # Visual Studio Code specific template is maintained in a separate VisualStudioCode.gitignore
221
+ # that can be found at https://github.com/github/gitignore/blob/main/Global/VisualStudioCode.gitignore
222
+ # and can be added to the global gitignore or merged into this file. However, if you prefer,
223
+ # you could uncomment the following to ignore the entire vscode folder
224
+ # .vscode/
225
+ # Temporary file for partial code execution
226
+ tempCodeRunnerFile.py
227
+
228
+ # Ruff stuff:
229
+ .ruff_cache/
230
+
231
+ # PyPI configuration file
232
+ .pypirc
233
+
234
+ # Marimo
235
+ marimo/_static/
236
+ marimo/_lsp/
237
+ __marimo__/
238
+
239
+ # Streamlit
240
+ .streamlit/secrets.toml
241
+
242
+ # Rust / Cargo
243
+ target/
244
+ *.pdb
245
+ Cargo.lock
246
+
247
+ # Benchmark results (generated, large JSON files — re-run to reproduce)
248
+ benchmark_results/
249
+ benchmark_overnight*.log
@@ -0,0 +1,33 @@
1
+ # Changelog
2
+
3
+ All notable changes to this project will be documented in this file.
4
+
5
+ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
6
+ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
+
8
+ ## [0.2.0] - 2026-05-31
9
+
10
+ ### Added
11
+ - **GrowthRatioGuard**: Self-calibrating circuit breaker that normalizes token usage against a warmup baseline. Recommended over raw `BoundaryGuard` for most use cases.
12
+ - **FailureReport**: Zero-cost failure diagnostics — classifies failure patterns (spiral, retry storm, policy drift, early explosion, budget exhaustion), provides evidence, and suggests specific fixes. No LLM calls required.
13
+ - **FailurePattern enum**: Structured failure classification with confidence scores.
14
+ - **Suggestion dataclass**: Actionable fix recommendations with severity levels.
15
+ - **Framework adapters**: `LangGraphMiddleware` and `VanillaHook` for drop-in integration.
16
+ - **Benchmark results**: τ³-bench airline (58% pass, 9% savings), SWE-bench Verified (49.5% savings, 68.8% precision), MINT (0.8% savings, zero trips).
17
+ - **Research paper**: Published empirical validation across three benchmarks.
18
+
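The GrowthRatioGuard entry above describes normalizing token usage against a warmup baseline. The idea might be sketched as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the class name `ToyGrowthRatioGuard`, the parameters `warmup_steps` and `max_ratio`, and the choice of the mean as the baseline are all hypothetical.

```python
from typing import List


class ToyGrowthRatioGuard:
    """Hypothetical sketch: trip when per-step token usage grows too far past a warmup baseline."""

    def __init__(self, warmup_steps: int = 3, max_ratio: float = 3.0) -> None:
        self.warmup_steps = warmup_steps  # assumed: number of steps used to calibrate
        self.max_ratio = max_ratio        # assumed: allowed growth over the baseline
        self._warmup: List[int] = []

    def observe(self, tokens: int) -> bool:
        """Record one step's token count; return True if the guard trips."""
        if len(self._warmup) < self.warmup_steps:
            # Still calibrating: collect samples and never trip.
            self._warmup.append(tokens)
            return False
        # Baseline is the mean of the warmup samples (floored at 1 to avoid division by zero).
        baseline = max(sum(self._warmup) / len(self._warmup), 1.0)
        return tokens / baseline > self.max_ratio


guard = ToyGrowthRatioGuard(warmup_steps=3, max_ratio=3.0)
results = [guard.observe(t) for t in (100, 120, 80, 250, 400)]
```

Because the threshold is a ratio rather than an absolute token count, the same setting applies to agents whose typical step sizes differ widely, which is presumably what "self-calibrating" refers to.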
19
+ ### Changed
20
+ - Bumped Cargo.toml version to match pyproject.toml.
21
+ - Standardized all repository URLs to canonical GitHub location.
22
+
23
+ ## [0.1.0] - 2026-05-20
24
+
25
+ ### Added
26
+ - **LyapunovMonitor** (Rust): Discrete-time energy tracker with circuit breaker semantics. Tracks V(k) = S(k) + λθ(k) and trips when ΔV ≥ 0 for W consecutive steps.
27
+ - **RGDecimator** (Rust): TF-IDF-based conversation history compression with structural keyword retention. First and last messages always preserved.
28
+ - **HolographicEngine** (Rust): VSA-based policy drift detection using 10,000-dimensional bipolar hypervectors. Constant-time cosine similarity checks.
29
+ - **BoundaryGuard** (Python): Context manager wrapping the Lyapunov monitor with Python ergonomics.
30
+ - **`@boundary_guard` decorator**: Function-level monitoring for individual agent steps.
31
+ - **MonitorGroup**: Manage multiple monitors for multi-agent orchestration.
32
+ - **Custom exceptions**: `StabilityViolation`, `BudgetExhausted`, `PermanentFailure`.
33
+ - **Type stubs** (`_core.pyi`): Full type coverage for IDE support.
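The trip rule stated in the 0.1.0 entry for LyapunovMonitor (V(k) = S(k) + λθ(k), tripping when ΔV ≥ 0 for W consecutive steps) can be sketched in a few lines of Python. This is a minimal illustration, not the Rust implementation in `src/lyapunov.rs`: the class name `ToyLyapunovMonitor`, its parameters, and the reading of S and θ as two per-step scalar signals are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyLyapunovMonitor:
    """Hypothetical sketch of the rule: trip when V fails to decrease for `window` steps in a row."""

    lam: float = 1.0   # λ, weight on the θ term (assumed default)
    window: int = 3    # W, consecutive non-decreasing steps before tripping (assumed default)
    _prev_v: Optional[float] = None
    _streak: int = 0

    def step(self, s: float, theta: float) -> bool:
        """Feed one step's signals; return True if the breaker trips."""
        v = s + self.lam * theta  # V(k) = S(k) + λθ(k)
        if self._prev_v is not None and v - self._prev_v >= 0:
            self._streak += 1     # ΔV ≥ 0: energy did not decrease
        else:
            self._streak = 0      # first step, or energy decreased: reset the streak
        self._prev_v = v
        return self._streak >= self.window


monitor = ToyLyapunovMonitor(lam=0.5, window=2)
trips = [monitor.step(s, th) for s, th in [(10, 2), (8, 2), (9, 2), (9, 4)]]
```

In this run V takes the values 11, 9, 10, 11, so the breaker trips on the fourth step, after two consecutive steps with ΔV ≥ 0.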
@@ -0,0 +1,66 @@
1
+ # Contributing to state-harness
2
+
3
+ Thanks for considering a contribution. This document covers setup, style, and process.
4
+
5
+ ## Development Environment
6
+
7
+ ### Prerequisites
8
+
9
+ - Python ≥ 3.10
10
+ - Rust toolchain ([rustup.rs](https://rustup.rs/))
11
+ - [maturin](https://github.com/PyO3/maturin) ≥ 1.5
12
+
13
+ ### Setup
14
+
15
+ ```bash
16
+ git clone https://github.com/vishal-dehurdle/state-harness.git
17
+ cd state-harness
18
+
19
+ python -m venv .venv && source .venv/bin/activate
20
+
21
+ pip install maturin pytest
22
+ maturin develop --release
23
+ ```
24
+
25
+ ### Running Tests
26
+
27
+ ```bash
28
+ pytest tests/
29
+ ```
30
+
31
+ Tests cover the full Rust↔Python interface: Lyapunov monitor, RG decimator, holographic engine, growth ratio guard, and failure diagnostics.
32
+
33
+ ## Code Style
34
+
35
+ ### Rust (`src/`)
36
+
37
+ - Follow standard `rustfmt` formatting
38
+ - Use `///` doc comments on all public items
39
+ - Section dividers (`// ─── Section ───`) are fine for organizing large files
40
+ - Keep comments focused on *why*, not *what* — the code should be self-explanatory
41
+
42
+ ### Python (`python/`, `tests/`)
43
+
44
+ - Follow [PEP 8](https://peps.python.org/pep-0008/)
45
+ - Type hints on all public functions
46
+ - Docstrings on all public classes and methods
47
+ - Section dividers (`# ─── Section ───`) are fine for organizing large files
48
+
49
+ ## Pull Requests
50
+
51
+ 1. Fork the repo and create a branch from `main`
52
+ 2. Make your changes with tests
53
+ 3. Ensure `pytest tests/` passes
54
+ 4. Ensure `cargo clippy` reports no warnings
55
+ 5. Open a PR with a clear description of what and why
56
+
57
+ ## License
58
+
59
+ By contributing, you agree that:
60
+
61
+ - **Rust contributions** (`src/`) are licensed under BSL 1.1 (converts to Apache 2.0 on May 26, 2030)
62
+ - **Python contributions** (`python/`, `tests/`, `examples/`, `benchmarks/`) are licensed under Apache 2.0
63
+
64
+ ## Questions?
65
+
66
+ Open a [Discussion](https://github.com/vishal-dehurdle/state-harness/discussions) or file an [Issue](https://github.com/vishal-dehurdle/state-harness/issues).