context-compiler 0.5.2__tar.gz → 0.6.2__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {context_compiler-0.5.2 → context_compiler-0.6.2}/.github/workflows/ci.yml +4 -1
- {context_compiler-0.5.2 → context_compiler-0.6.2}/.github/workflows/publish-pypi.yml +36 -6
- context_compiler-0.6.2/.github/workflows/stress-tests.yml +47 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/.pre-commit-config.yaml +2 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/PKG-INFO +83 -23
- {context_compiler-0.5.2 → context_compiler-0.6.2}/README.md +80 -22
- {context_compiler-0.5.2 → context_compiler-0.6.2}/demos/06_llm_context_compaction.py +2 -2
- context_compiler-0.6.2/docs/README.md +20 -0
- context_compiler-0.6.2/docs/llm-preprocessor.md +44 -0
- context_compiler-0.6.2/evals/litellm_proxy_additional_findings.md +89 -0
- context_compiler-0.6.2/evals/litellm_proxy_behavioral_comparisons.md +120 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/evals/swe-bench/swe-bench.py +6 -2
- {context_compiler-0.5.2 → context_compiler-0.6.2}/examples/06_transcript_replay.py +2 -2
- {context_compiler-0.5.2 → context_compiler-0.6.2}/examples/integrations/README.md +15 -5
- context_compiler-0.6.2/examples/integrations/litellm/README.md +64 -0
- context_compiler-0.6.2/examples/integrations/litellm/basic.py +124 -0
- context_compiler-0.6.2/examples/integrations/litellm/with_preprocessor.py +220 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/examples/integrations/litellm_proxy/README.md +37 -1
- {context_compiler-0.5.2 → context_compiler-0.6.2}/examples/integrations/litellm_proxy/config.example.yaml +3 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/examples/integrations/litellm_proxy/context_compiler_precall_hook.py +53 -20
- context_compiler-0.6.2/examples/integrations/litellm_proxy/context_compiler_precall_hook_with_preprocessor.py +270 -0
- context_compiler-0.6.2/examples/integrations/openwebui/README.md +84 -0
- context_compiler-0.6.2/examples/integrations/openwebui/open_webui_pipe.py +303 -0
- context_compiler-0.6.2/examples/integrations/openwebui/open_webui_pipe_with_preprocessor.py +374 -0
- context_compiler-0.6.2/experimental/__init__.py +1 -0
- context_compiler-0.6.2/experimental/preprocessor/README.md +72 -0
- context_compiler-0.6.2/experimental/preprocessor/__init__.py +24 -0
- context_compiler-0.6.2/experimental/preprocessor/constants.py +29 -0
- context_compiler-0.6.2/experimental/preprocessor/heuristic_precompiler.py +239 -0
- context_compiler-0.6.2/experimental/preprocessor/output_validation.py +102 -0
- context_compiler-0.6.2/experimental/preprocessor/prompt_utils.py +57 -0
- context_compiler-0.6.2/experimental/preprocessor/prompts/default.txt +129 -0
- context_compiler-0.6.2/experimental/preprocessor/prompts/llama.txt +114 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/pyproject.toml +5 -2
- {context_compiler-0.5.2 → context_compiler-0.6.2}/src/context_compiler/__init__.py +4 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/src/context_compiler/engine.py +11 -3
- context_compiler-0.6.2/tests/fixtures/v2/step/012_clear_premise_populated_update.json +32 -0
- context_compiler-0.6.2/tests/fixtures/v2/step/013_clear_premise_already_null_update.json +32 -0
- context_compiler-0.6.2/tests/fixtures/v2/step/014_reset_policies_populated_update.json +29 -0
- context_compiler-0.6.2/tests/fixtures/v2/step/015_reset_policies_already_empty_update.json +26 -0
- context_compiler-0.6.2/tests/fixtures/v2/step/016_clear_state_populated_update.json +28 -0
- context_compiler-0.6.2/tests/fixtures/v2/step/017_clear_state_already_empty_update.json +26 -0
- context_compiler-0.6.2/tests/test_precompiler_heuristic.py +252 -0
- context_compiler-0.6.2/tests/test_precompiler_heuristic_properties.py +148 -0
- context_compiler-0.6.2/tests/test_precompiler_output_validation.py +50 -0
- context_compiler-0.6.2/tests/test_precompiler_prompt_utils.py +67 -0
- context_compiler-0.6.2/tests/test_precompiler_validator_properties.py +173 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/uv.lock +6 -2
- context_compiler-0.5.2/docs/README.md +0 -6
- context_compiler-0.5.2/docs/llm-preprocessor.md +0 -149
- context_compiler-0.5.2/examples/integrations/litellm_sdk.py +0 -95
- {context_compiler-0.5.2 → context_compiler-0.6.2}/.gitignore +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/AGENTS.md +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/CONTRIBUTING.md +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/LICENSE +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/demos/01_llm_contradiction_clarify.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/demos/02_llm_constraint_guardrail.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/demos/03_llm_premise_guardrail.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/demos/04_llm_tool_denylist_guardrail.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/demos/05_llm_prompt_drift_vs_state.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/demos/07_llm_prompt_vs_state.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/demos/README.md +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/demos/__init__.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/demos/common.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/demos/llm_client.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/demos/run_demo.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/docs/DescriptionAndMilestones.md +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/docs/DirectiveGrammarSpec.md +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/docs/multi-engine.md +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/evals/swe-bench/README.md +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/evals/swe-bench/RUBRIC.md +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/evals/swe-bench/manifest.json +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/evals/swe-bench/tasks/django__django-12453.json +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/evals/swe-bench/tasks/django__django-13158.json +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/evals/swe-bench/tasks/django__django-13964.json +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/evals/swe-bench/tasks/django__django-15252.json +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/evals/swe-bench/tasks/matplotlib__matplotlib-23299.json +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/evals/swe-bench/tasks/psf__requests-1963.json +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/examples/01_persistent_guardrails.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/examples/02_configuration_and_correction.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/examples/03_ambiguity_with_clarification.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/examples/04_tool_governance_denylist.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/examples/05_llm_integration_pattern.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/examples/07_single_policy_correction.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/examples/README.md +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/examples/_util.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/src/context_compiler/const.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/src/context_compiler/repl.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/fixtures/README.md +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/fixtures/v2/step/001_set_premise_update.json +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/fixtures/v2/step/002_use_item_normalization.json +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/fixtures/v2/step/003_conflict_prohibit_clarify.json +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/fixtures/v2/step/004_remove_policy_missing_idempotent_update.json +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/fixtures/v2/step/005_exact_prefix_passthrough_leading_space.json +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/fixtures/v2/step/006_near_miss_set_premise_to.json +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/fixtures/v2/step/007_near_miss_change_premise_missing_to.json +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/fixtures/v2/step/008_replace_missing_source_clarify_prompt.json +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/fixtures/v2/step/009_pending_affirmative_normalized_token.json +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/fixtures/v2/step/010_pending_negative_normalized_token.json +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/fixtures/v2/step/011_pending_unmatched_reuses_prompt.json +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/fixtures/v2/transcript/001_user_only_replay_state.json +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/fixtures/v2/transcript/002_non_string_user_content_ignored.json +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/fixtures/v2/transcript/003_stops_at_first_clarify_later_yes.json +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/fixtures/v2/transcript/004_stops_at_first_clarify_later_no.json +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/test_04_grammar_edge_cases.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/test_04_llm_tool_governance.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/test_07_llm_prompt_engineering_comparison.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/test_demo_01_04_behavior.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/test_demo_05_prompt_contract.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/test_demo_07_output_clarity.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/test_demo_compaction.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/test_engine.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/test_examples.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/test_fixtures.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/test_llm_client.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/test_llm_demos.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/test_properties.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/test_repl.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/test_repl_properties.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/test_run_demo.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/test_smoke.py +0 -0
- {context_compiler-0.5.2 → context_compiler-0.6.2}/tests/test_transcript_replay.py +0 -0
@@ -34,8 +34,11 @@ jobs:
       - name: Ruff (format)
         run: uv run ruff format --check .
 
+      - name: Install mypy integration dependencies
+        run: uv pip install --python .venv/bin/python litellm pydantic
+
       - name: Mypy
-        run: uv run mypy src
+        run: uv run mypy src examples evals/swe-bench demos
 
       - name: Install Hypothesis for tests
         run: uv pip install --python .venv/bin/python hypothesis
@@ -3,13 +3,41 @@ name: Publish to PyPI
 on:
   release:
     types: [published]
-  workflow_dispatch:
 
 permissions:
   contents: read
   id-token: write
 
 jobs:
+  stress-tests:
+    name: Stress tests (release gate)
+    runs-on: ubuntu-latest
+
+    steps:
+      - uses: actions/checkout@v6
+
+      - name: Set up Python
+        uses: actions/setup-python@v6
+        with:
+          python-version: "3.12"
+
+      - name: Install uv
+        uses: astral-sh/setup-uv@v7
+        with:
+          enable-cache: true
+
+      - name: Install dev and demos dependencies
+        run: uv sync --extra dev --extra demos
+
+      - name: Run pytest stress loop
+        shell: bash
+        run: |
+          loops="10"
+          for i in $(seq 1 "$loops"); do
+            echo "== stress run $i/$loops =="
+            uv run pytest -q
+          done
+
   build:
     name: Build distributions
     runs-on: ubuntu-latest
@@ -22,11 +50,13 @@ jobs:
         with:
           python-version: "3.12"
 
+      - name: Install uv
+        uses: astral-sh/setup-uv@v7
+        with:
+          enable-cache: true
+
       - name: Build sdist and wheel
-        run:
-          python -m pip install --upgrade pip
-          python -m pip install build
-          python -m build
+        run: uv run --with build python -m build
 
       - name: Show built distributions
         run: |
@@ -41,7 +71,7 @@ jobs:
 
   publish:
     name: Publish to PyPI
-    needs: build
+    needs: [build, stress-tests]
     runs-on: ubuntu-latest
     environment:
       name: pypi
@@ -0,0 +1,47 @@
+name: Stress Tests
+
+on:
+  workflow_dispatch:
+    inputs:
+      stress_loops:
+        description: Number of full pytest stress loops
+        required: false
+        default: "10"
+  schedule:
+    - cron: "0 3 * * *"
+
+jobs:
+  stress-tests:
+    name: Stress tests
+    runs-on: ubuntu-latest
+    env:
+      STRESS_LOOPS: ${{ github.event_name == 'workflow_dispatch' && github.event.inputs.stress_loops || '10' }}
+
+    steps:
+      - uses: actions/checkout@v6
+
+      - name: Set up Python
+        uses: actions/setup-python@v6
+        with:
+          python-version: "3.12"
+
+      - name: Install uv
+        uses: astral-sh/setup-uv@v7
+        with:
+          enable-cache: true
+
+      - name: Install dev and demos dependencies
+        run: uv sync --extra dev --extra demos
+
+      - name: Run pytest stress loop
+        shell: bash
+        run: |
+          loops="${STRESS_LOOPS}"
+          if ! [[ "$loops" =~ ^[1-9][0-9]*$ ]]; then
+            echo "Invalid stress loop count: $loops"
+            exit 1
+          fi
+          for i in $(seq 1 "$loops"); do
+            echo "== stress run $i/$loops =="
+            uv run pytest -q
+          done
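The loop-count guard in the stress-test workflow accepts only positive integers without a leading zero, and falls back to 10 when no input is supplied. A minimal Python sketch of the same check follows. The helper name `parse_stress_loops` is hypothetical and is not part of the package or the workflow; it only mirrors the bash pattern `^[1-9][0-9]*$`.

```python
import re
from typing import Optional

# Same shape the workflow's bash guard enforces: a positive integer, no leading zero.
_LOOP_COUNT = re.compile(r"[1-9][0-9]*")


def parse_stress_loops(raw: Optional[str], default: int = 10) -> int:
    """Return a validated loop count (hypothetical helper, for illustration only)."""
    if raw is None or raw == "":
        # Mirrors the `|| '10'` fallback used when no input is supplied.
        return default
    # fullmatch requires the whole string to match, like ^...$ in the bash test.
    if _LOOP_COUNT.fullmatch(raw) is None:
        raise ValueError(f"Invalid stress loop count: {raw}")
    return int(raw)
```

Rejecting `0`, `01`, and non-numeric input up front means the `seq` loop never runs zero times and silently reports success.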
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: context-compiler
-Version: 0.
+Version: 0.6.2
 Summary: Deterministic conversational state engine for LLM applications.
 Project-URL: Homepage, https://github.com/rlippmann/context-compiler
 Project-URL: Repository, https://github.com/rlippmann/context-compiler
@@ -26,6 +26,8 @@ Requires-Dist: pre-commit; extra == 'dev'
 Requires-Dist: pytest; extra == 'dev'
 Requires-Dist: pytest-cov; extra == 'dev'
 Requires-Dist: ruff<1.0,>=0.12; extra == 'dev'
+Provides-Extra: experimental
+Requires-Dist: litellm>=1.0.0; extra == 'experimental'
 Description-Content-Type: text/markdown
 
 
@@ -118,18 +120,32 @@ The host supplies the authoritative state to the model so the constraint persist
 
 ---
 
-##
+## Deterministic behavior (examples)
 
-
+LLMs interpret intent. Context Compiler enforces it.
 
-
+**Explicit directive**
+```text
+set premise concise replies
+```
+- Base model: silently accepts / rewrites
+- Context Compiler: applies a deterministic state update
 
--
-
-
-
-
--
+**State-dependent operation**
+```text
+clear state
+use podman instead of docker
+```
+- Base model: generic explanation
+- Context Compiler: rejects (“No exact policy found for 'docker'…”)
+
+**Lifecycle enforcement**
+```text
+clear state
+change premise to formal tone
+```
+- Base model: conversational rewrite guidance
+- Context Compiler: clarifies (“No premise exists yet…”)
 
 ---
 
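The explicit-directive and lifecycle cases in the hunk above can be made concrete with a toy model. The sketch below is not the package's engine: `ToyEngine`, the dictionaries it returns, and its message text are hypothetical stand-ins that only mimic the described behavior, namely a deterministic update for an explicit directive and a clarification when `change premise to ...` arrives while no premise exists.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyEngine:
    """Hypothetical stand-in for a deterministic directive engine (not the real API)."""

    premise: Optional[str] = None

    def step(self, user_input: str) -> dict:
        text = user_input.strip()
        if text == "clear state":
            self.premise = None
            return {"kind": "update"}
        if text.startswith("change premise to "):
            if self.premise is None:
                # Lifecycle rule: nothing to change yet, so ask instead of mutating.
                return {"kind": "clarify", "prompt": "No premise exists yet."}
            self.premise = text[len("change premise to "):]
            return {"kind": "update"}
        if text.startswith("set premise "):
            self.premise = text[len("set premise "):]
            return {"kind": "update"}
        # Anything else is ordinary conversation and leaves state untouched.
        return {"kind": "passthrough"}


engine = ToyEngine()
engine.step("clear state")
decision = engine.step("change premise to formal tone")
print(decision["kind"], engine.premise)
```

Because the toy keeps no hidden inputs, replaying the same turns on a fresh instance always ends in the same state, which is the property the README attributes to the real compiler.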
@@ -183,8 +199,8 @@ Meaning:
 |---|---|
 | `create_engine(state=None)` | Create a new compiler engine; optional `state` provides initial authoritative state (validated/canonicalized). |
 | `step(user_input)` | Parse one user turn and return a deterministic `Decision`. |
-| `compile_transcript(messages)` | Replay a transcript from a fresh engine and return either final state or a confirmation prompt. |
-| `engine.apply_transcript(messages)` | Replay a transcript onto the current engine state and return either final state or a confirmation prompt. |
+| `compile_transcript(messages: Transcript)` | Replay a transcript from a fresh engine and return either final state or a confirmation prompt. |
+| `engine.apply_transcript(messages: Transcript)` | Replay a transcript onto the current engine state and return either final state or a confirmation prompt. |
 | `engine.state` | Read current authoritative in-memory state snapshot. |
 | `get_premise_value(state)` | Read the current premise value from a state snapshot. |
 | `get_policy_items(state, value=None)` | Read policy items from a state snapshot (all, `use`, or `prohibit`). |
@@ -252,27 +268,65 @@ For full directive grammar and edge-case behavior, see [DirectiveGrammarSpec.md]
 
 ---
 
-##
+## Guarantees
 
-
+- State changes only through explicit user directives or confirmation.
+- Identical input sequences produce identical compiler state.
+- Model responses never modify compiler state.
+- Ambiguous directives trigger clarification instead of changing state.
+
+These invariants are verified through behavioral tests and Hypothesis-based property tests.
 
 ---
 
-##
+## Evidence
+
+### Behavioral correctness (key examples)
+
+Concrete behavioral comparisons (base model vs compiler) are available here:
+
+- [Open WebUI integration README](examples/integrations/openwebui/README.md)
+
+These demonstrate deterministic clarification, state enforcement, and conflict handling.
+
+### Cross-model evaluation
+
+- Models tested: `llama3.1:8b`, `gpt-4o-mini`, `gpt-4.1`, `gpt-5`, `claude-sonnet-4`, `claude-opus-4`
+- Pass-rate summary: baseline (LLM only) `2–4 / 6`; with compiler `6 / 6`; with compiler + compaction `6 / 6`.
+
+### Efficiency
+
+- Context reduction in long conversations: up to `99%`
+- Prompt size reduction: about `50%`
+
+### Additional results
+
+- [SWE curated results (compiler vs baseline)](evals/swe-bench/README.md) — cross-model evaluation on 6 tasks showing mostly positive deltas
 
-- [LLM preprocessor](docs/llm-preprocessor.md)
-- [Multiple engines](docs/multi-engine.md)
 
 ---
 
-## Guarantees
 
-
-- Identical input sequences produce identical compiler state.
-- Model responses never modify compiler state.
-- Ambiguous directives trigger clarification instead of changing state.
+## Optional: LLM Preprocessor (Experimental)
 
-
+An optional host-side preprocessor can convert natural-language instructions
+into canonical directives before compilation.
+
+It is designed to be conservative and must be used with validation:
+
+- heuristic-first, with LLM fallback when needed
+- all outputs must be validated with `parse_precompiler_output(...)`
+- raw outputs must not be passed directly to the compiler
+
+See [LLM preprocessor](docs/llm-preprocessor.md) and
+[`experimental/preprocessor/`](experimental/preprocessor/) for details.
+
+
+## Advanced topics
+
+- [Multiple engines](docs/multi-engine.md)
+
+For a full documentation map, see [docs/README.md](docs/README.md).
 
 ---
 
@@ -285,6 +339,12 @@ More detailed design and milestone documents are available in:
 
 ---
 
+## Conformance Fixtures
+
+Cross-language conformance tests are defined in [`tests/fixtures/`](tests/fixtures/).
+
+---
+
 ## License
 
 Apache-2.0.
@@ -88,18 +88,32 @@ The host supplies the authoritative state to the model so the constraint persist
 
 ---
 
-##
+## Deterministic behavior (examples)
 
-
+LLMs interpret intent. Context Compiler enforces it.
 
-
+**Explicit directive**
+```text
+set premise concise replies
+```
+- Base model: silently accepts / rewrites
+- Context Compiler: applies a deterministic state update
 
--
-
-
-
-
--
+**State-dependent operation**
+```text
+clear state
+use podman instead of docker
+```
+- Base model: generic explanation
+- Context Compiler: rejects (“No exact policy found for 'docker'…”)
+
+**Lifecycle enforcement**
+```text
+clear state
+change premise to formal tone
+```
+- Base model: conversational rewrite guidance
+- Context Compiler: clarifies (“No premise exists yet…”)
 
 ---
 
@@ -153,8 +167,8 @@ Meaning:
 |---|---|
 | `create_engine(state=None)` | Create a new compiler engine; optional `state` provides initial authoritative state (validated/canonicalized). |
 | `step(user_input)` | Parse one user turn and return a deterministic `Decision`. |
-| `compile_transcript(messages)` | Replay a transcript from a fresh engine and return either final state or a confirmation prompt. |
-| `engine.apply_transcript(messages)` | Replay a transcript onto the current engine state and return either final state or a confirmation prompt. |
+| `compile_transcript(messages: Transcript)` | Replay a transcript from a fresh engine and return either final state or a confirmation prompt. |
+| `engine.apply_transcript(messages: Transcript)` | Replay a transcript onto the current engine state and return either final state or a confirmation prompt. |
 | `engine.state` | Read current authoritative in-memory state snapshot. |
 | `get_premise_value(state)` | Read the current premise value from a state snapshot. |
 | `get_policy_items(state, value=None)` | Read policy items from a state snapshot (all, `use`, or `prohibit`). |
@@ -222,27 +236,65 @@ For full directive grammar and edge-case behavior, see [DirectiveGrammarSpec.md]
 
 ---
 
-##
+## Guarantees
 
-
+- State changes only through explicit user directives or confirmation.
+- Identical input sequences produce identical compiler state.
+- Model responses never modify compiler state.
+- Ambiguous directives trigger clarification instead of changing state.
+
+These invariants are verified through behavioral tests and Hypothesis-based property tests.
 
 ---
 
-##
+## Evidence
+
+### Behavioral correctness (key examples)
+
+Concrete behavioral comparisons (base model vs compiler) are available here:
+
+- [Open WebUI integration README](examples/integrations/openwebui/README.md)
+
+These demonstrate deterministic clarification, state enforcement, and conflict handling.
+
+### Cross-model evaluation
+
+- Models tested: `llama3.1:8b`, `gpt-4o-mini`, `gpt-4.1`, `gpt-5`, `claude-sonnet-4`, `claude-opus-4`
+- Pass-rate summary: baseline (LLM only) `2–4 / 6`; with compiler `6 / 6`; with compiler + compaction `6 / 6`.
+
+### Efficiency
+
+- Context reduction in long conversations: up to `99%`
+- Prompt size reduction: about `50%`
+
+### Additional results
+
+- [SWE curated results (compiler vs baseline)](evals/swe-bench/README.md) — cross-model evaluation on 6 tasks showing mostly positive deltas
 
-- [LLM preprocessor](docs/llm-preprocessor.md)
-- [Multiple engines](docs/multi-engine.md)
 
 ---
 
-## Guarantees
 
-
-- Identical input sequences produce identical compiler state.
-- Model responses never modify compiler state.
-- Ambiguous directives trigger clarification instead of changing state.
+## Optional: LLM Preprocessor (Experimental)
 
-
+An optional host-side preprocessor can convert natural-language instructions
+into canonical directives before compilation.
+
+It is designed to be conservative and must be used with validation:
+
+- heuristic-first, with LLM fallback when needed
+- all outputs must be validated with `parse_precompiler_output(...)`
+- raw outputs must not be passed directly to the compiler
+
+See [LLM preprocessor](docs/llm-preprocessor.md) and
+[`experimental/preprocessor/`](experimental/preprocessor/) for details.
+
+
+## Advanced topics
+
+- [Multiple engines](docs/multi-engine.md)
+
+For a full documentation map, see [docs/README.md](docs/README.md).
 
 ---
 
@@ -255,6 +307,12 @@ More detailed design and milestone documents are available in:
 
 ---
 
+## Conformance Fixtures
+
+Cross-language conformance tests are defined in [`tests/fixtures/`](tests/fixtures/).
+
+---
+
 ## License
 
 Apache-2.0.
@@ -1,6 +1,6 @@
 """Demo 6: host-side prompt replacement from authoritative compiled state."""
 
-from context_compiler import compile_transcript, get_premise_value
+from context_compiler import Transcript, compile_transcript, get_premise_value
 from demos.common import compact_user_turns, is_verbose, print_info_report
 
 DEMO_NAME = "06_context_compaction — superseded directives eliminated"
@@ -40,7 +40,7 @@ def _build_turns(turn_count: int) -> list[str]:
 
 
 def _compile_premise(turns: list[str]) -> str:
-    messages:
+    messages: Transcript = [{"role": "user", "content": turn} for turn in turns]
     result = compile_transcript(messages)
     assert result["kind"] == "state"
     compiled_premise = get_premise_value(result["state"])
@@ -0,0 +1,20 @@
+# Documentation Index
+
+## Start Here
+- [Project README](../README.md)
+
+## Core Concepts
+- [Directive Grammar](DirectiveGrammarSpec.md)
+
+## Integrations
+- [Open WebUI integration](../examples/integrations/openwebui/README.md)
+
+## Preprocessor
+- [LLM preprocessor](llm-preprocessor.md)
+
+## Evaluation & Evidence
+- [Behavioral comparisons (Open WebUI)](../examples/integrations/openwebui/README.md)
+- [SWE curated results](../evals/swe-bench/README.md)
+
+## Project Background
+- [Description and Milestones](DescriptionAndMilestones.md)
@@ -0,0 +1,44 @@
+# LLM Preprocessor (Optional, Experimental)
+
+The experimental preprocessor is an optional host-side layer that can convert
+natural-language messages into canonical Context Compiler directives before
+compilation.
+
+The compiler remains deterministic and authoritative. The preprocessor does not
+replace core parsing or state semantics.
+
+Install path for integrations using this layer:
+`pip install "context-compiler[experimental]"`.
+
+Integration runtimes must use installed-package imports/resources for this
+layer. Do not rely on repo-relative preprocessor paths.
+
+## Required flow
+
+Recommended conceptual flow:
+
+1. heuristic precompile
+2. validate candidate output
+3. LLM fallback precompile (only when needed)
+4. validate candidate output
+5. If a valid directive is produced, pass it to the compiler.
+Otherwise pass the original input unchanged.
+
+All preprocessor outputs, including heuristic outputs, must be validated with
+`parse_precompiler_output(...)` before being applied.
+
+Raw heuristic/LLM outputs must not be passed directly to the compiler.
+
+## Limits
+
+The preprocessor is best-effort and intentionally conservative. Ambiguous,
+reported, quoted, or mixed-intent inputs may still require abstention or host
+clarification behavior.
+
+## Status
+
+This preprocessor surface is experimental and may evolve independently of the
+core engine.
+
+For concrete module usage, prompt guidance, and integration details, see:
+[`experimental/preprocessor/README.md`](../experimental/preprocessor/README.md).
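The five-step flow described in the preprocessor document (heuristic, validate, LLM fallback only when needed, validate, otherwise pass the input through unchanged) can be sketched as a small host-side function. Everything named here is an assumption made for illustration: `preprocess` and its three callable parameters are hypothetical, and `validate` merely plays the role the document assigns to `parse_precompiler_output(...)`, whose real signature is not shown in this diff and is not assumed.

```python
from typing import Callable, Optional

# Hypothetical shapes: each stage returns a candidate directive or None to abstain;
# the validator returns a cleaned directive or None to reject the candidate.
Precompile = Callable[[str], Optional[str]]
Validate = Callable[[str], Optional[str]]


def preprocess(
    user_input: str,
    heuristic: Precompile,
    llm_fallback: Precompile,
    validate: Validate,
) -> str:
    """Return a validated directive, or the original input unchanged."""
    # Stages run in order; the LLM stage is reached only if the heuristic
    # stage abstained or produced something the validator rejected.
    for precompile in (heuristic, llm_fallback):
        candidate = precompile(user_input)
        if candidate is None:
            continue
        directive = validate(candidate)
        if directive is not None:
            return directive  # only validated output is handed to the compiler
    return user_input
```

Routing every candidate through the same validator, including heuristic output, keeps the rule "raw outputs must not be passed directly to the compiler" in one place instead of relying on each stage to behave.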
@@ -0,0 +1,89 @@
+# LiteLLM Proxy Additional Findings
+
+Model: `ollama/qwen2.5:14b-instruct`
+
+- Limitations/caveats:
+  - Confirm follow-up (`yes`) does not resolve the prior confirm in current replay-only proxy flow.
+  - Last-turn-only preprocessing can fail to persist earlier canonicalization effects across subsequent replay.
+- Additional LiteLLM-surface behavior:
+  - Structured mixed-content user payloads can trigger upstream LiteLLM/Ollama message-shape validation errors.
+  - Structured text-part near-miss inputs still show a meaningful preprocessor lifecycle win over basic proxy.
+
+## Finding 1 — confirm follow-up loops (replay limitation)
+
+**Prompt sequence**
+1. `clear state`
+2. `use podman instead of docker`
+3. `yes, keep existing policies and use podman`
+
+**Vanilla**
+- Step 2/3: generic Podman migration/help text.
+
+**Basic proxy**
+- Step 2: confirm clarify (`No exact policy found for "docker" ... Confirm to use "podman" ...`).
+- Step 3: same confirm clarify repeats.
+
+**Preprocessor proxy**
+- Step 2: same confirm clarify.
+- Step 3: same confirm clarify repeats.
+
+**Why it matters**
+Current replay-based proxy behavior does not treat natural-language “yes” as explicit confirm resolution, so this can loop until user supplies an explicit directive path.
+
+## Finding 2 — last-turn-only preprocessing is non-persistent across replay (replay limitation)
+
+**Prompt sequence**
+1. `clear state`
+2. `set premise to concise replies`
+3. `Explain TCP in detail.`
+
+**Vanilla**
+- Conversationally accepts premise-like instruction, then gives normal long-form answer.
+
+**Basic proxy**
+- Step 2: syntax clarify (`Did you mean 'set premise concise replies'?`).
+- Step 3: same syntax clarify repeats.
+
+**Preprocessor proxy**
+- Step 2: canonicalized update (`Premise set to concise replies ...`).
+- Step 3: syntax clarify reappears (`Did you mean 'set premise concise replies'?`).
+
+**Why it matters**
+Only the latest replay turn is preprocessed; earlier raw near-miss text in transcript can still drive later replay outcomes.
+
+## Finding 3 — structured mixed content can fail upstream validation (LiteLLM-surface caveat)
+
+**Prompt sequence**
+1. `clear state`
+2. user content parts: text (`set premise to concise replies`) + non-text (`input_image`)
+3. `What is TCP?`
+
+**Vanilla**
+- Upstream request fails with invalid user message shape error.
+
+**Basic proxy**
+- Blocks at compiler clarify before upstream model call.
+
+**Preprocessor proxy**
+- Step 2 hits upstream validation error path; later turn can return clarify.
+
+**Why it matters**
+In proxy mode, forwarded request messages remain unchanged; LiteLLM/Ollama payload validation behavior can dominate outcomes for mixed content shapes.
+
+## Finding 4 — structured text-part near-miss still yields stronger lifecycle result (LiteLLM-surface win)
+
+**Prompt sequence**
+1. `clear state`
+2. user content text parts: `change premise` + `concise replies`
+
+**Vanilla**
+- Conversational acceptance of style change.
+
+**Basic proxy**
+- Syntax clarify only (`Did you mean 'change premise to concise replies'?`).
+
+**Preprocessor proxy**
+- Lifecycle clarify (`No premise exists yet. Use 'set premise ...' first.`).
+
+**Why it matters**
+For structured text-part inputs, preprocessor canonicalization can move past syntax-only clarify and reach the stronger lifecycle-semantic outcome.
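Finding 2 is easier to follow with a toy replay. The sketch below is an illustration under stated assumptions, not the proxy hook's code: `toy_step`, `canonicalize`, and `replay` are hypothetical, the toy grammar knows only `clear state` and `set premise ...`, and the `preprocess_all` switch is a variation added here for contrast rather than something the findings propose.

```python
from typing import List, Optional, Tuple

PREFIX_NEAR_MISS = "set premise to "
PREFIX_CANONICAL = "set premise "


def canonicalize(turn: str) -> str:
    """Hypothetical preprocessor: rewrite the near-miss into canonical form."""
    if turn.startswith(PREFIX_NEAR_MISS):
        return PREFIX_CANONICAL + turn[len(PREFIX_NEAR_MISS):]
    return turn


def toy_step(premise: Optional[str], turn: str) -> Tuple[Optional[str], Optional[str]]:
    """Hypothetical one-turn compiler: returns (new_premise, clarify_prompt)."""
    if turn == "clear state":
        return None, None
    if turn.startswith(PREFIX_NEAR_MISS):
        # Near-miss syntax: ask instead of changing state.
        return premise, f"Did you mean '{canonicalize(turn)}'?"
    if turn.startswith(PREFIX_CANONICAL):
        return turn[len(PREFIX_CANONICAL):], None
    return premise, None  # ordinary chat turn: no state change


def replay(turns: List[str], preprocess_all: bool) -> Optional[str]:
    """Replay from a fresh state; return the first clarify prompt, else None."""
    premise: Optional[str] = None
    for index, turn in enumerate(turns):
        is_last = index == len(turns) - 1
        if preprocess_all or is_last:
            turn = canonicalize(turn)
        premise, prompt = toy_step(premise, turn)
        if prompt is not None:
            return prompt  # the toy replay stops at the first clarify
    return None
```

With the prompt sequence from Finding 2, replaying the first two turns with last-turn-only preprocessing produces no clarify, while replaying all three turns brings the syntax clarify back, because turn 2 is no longer the last turn and is replayed in its raw form.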