inferroute 0.3.1__tar.gz

@@ -0,0 +1,56 @@
1
+ name: Publish to PyPI
2
+
3
+ # Publishes `inferroute` to PyPI via Trusted Publishing (OIDC) — no API token
4
+ # is stored in the repo. Fires when a GitHub Release is published (the normal
5
+ # path), and also exposes a manual button for re-runs.
6
+ #
7
+ # One-time setup on PyPI (Account → Publishing → Add a pending publisher):
8
+ # PyPI project name : inferroute
9
+ # Owner : InferRoute
10
+ # Repository name : inferroute
11
+ # Workflow filename : release.yml
12
+ # Environment name : pypi
13
+ # The environment name MUST match the `environment:` block below.
14
+
15
+ on:
16
+   release:
17
+     types: [published]
18
+   workflow_dispatch:
19
+
20
+ jobs:
21
+   build:
22
+     name: Build distributions
23
+     runs-on: ubuntu-latest
24
+     steps:
25
+       - uses: actions/checkout@v4
26
+       - uses: actions/setup-python@v5
27
+         with:
28
+           python-version: "3.12"
29
+       - name: Build sdist and wheel
30
+         run: |
31
+           python -m pip install --upgrade build
32
+           python -m build
33
+       - name: Check metadata
34
+         run: |
35
+           python -m pip install --upgrade twine
36
+           python -m twine check dist/*
37
+       - uses: actions/upload-artifact@v4
38
+         with:
39
+           name: dist
40
+           path: dist/
41
+
42
+   publish:
43
+     name: Publish to PyPI
44
+     needs: build
45
+     runs-on: ubuntu-latest
46
+     environment:
47
+       name: pypi
48
+       url: https://pypi.org/p/inferroute
49
+     permissions:
50
+       id-token: write # required for Trusted Publishing (OIDC)
51
+     steps:
52
+       - uses: actions/download-artifact@v4
53
+         with:
54
+           name: dist
55
+           path: dist/
56
+       - uses: pypa/gh-action-pypi-publish@release/v1
@@ -0,0 +1,11 @@
1
+ __pycache__/
2
+ *.py[cod]
3
+ .pytest_cache/
4
+ .mypy_cache/
5
+ .ruff_cache/
6
+ *.egg-info/
7
+ build/
8
+ dist/
9
+ .venv/
10
+ .env
11
+ .envrc
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Henry Declety / inferroute
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,137 @@
1
+ Metadata-Version: 2.4
2
+ Name: inferroute
3
+ Version: 0.3.1
4
+ Summary: Launch Claude Code through inferroute — typed less, routed smarter
5
+ Project-URL: Homepage, https://inferroute.ai
6
+ Project-URL: Source, https://github.com/InferRoute/inferroute
7
+ Project-URL: Issues, https://github.com/InferRoute/inferroute/issues
8
+ Author-email: Henry Declety <henry@inferroute.ai>
9
+ License-Expression: MIT
10
+ License-File: LICENSE
11
+ Keywords: ai,claude,claude-code,gateway,llm,router
12
+ Classifier: Development Status :: 4 - Beta
13
+ Classifier: Environment :: Console
14
+ Classifier: Intended Audience :: Developers
15
+ Classifier: License :: OSI Approved :: MIT License
16
+ Classifier: Programming Language :: Python :: 3 :: Only
17
+ Classifier: Topic :: Software Development
18
+ Requires-Python: >=3.10
19
+ Requires-Dist: httpx<1.0,>=0.27
20
+ Requires-Dist: textual<2.0,>=0.78
21
+ Provides-Extra: local
22
+ Requires-Dist: click>=8.1; extra == 'local'
23
+ Requires-Dist: fastapi>=0.111; extra == 'local'
24
+ Requires-Dist: httpx>=0.27; extra == 'local'
25
+ Requires-Dist: uvicorn[standard]>=0.30; extra == 'local'
26
+ Description-Content-Type: text/markdown
27
+
28
+ # ir — Claude Code through inferroute
29
+
30
+ Tiny launcher: one command (`ir`) spawns `claude` with the model you pick
31
+ and your inferroute key, so you don't have to set env vars every time.
32
+ **Your normal `claude` is never touched** — `ir` only configures the
33
+ subprocess it spawns.
34
+
35
+ You choose the model per session — there's no auto-routing. Optionally, an
36
+ on-device recorder logs your choices privately (see [Local recording](#local-recording-optional)).
37
+
38
+ ## Install
39
+
40
+ ```bash
41
+ pipx install inferroute
42
+ ir login
43
+ ```
44
+
45
+ That's it. `ir login` saves your API key to `~/.config/inferroute/credentials`
46
+ (mode 600) and verifies it against the inferroute API. Update any time with
47
+ `pipx upgrade inferroute`.
48
+
49
+ Sign up for a key at https://inferroute.ai if you don't have one.
50
+
51
+ ## Commands
52
+
53
+ ```
54
+ ir                    open the model picker, then launch (same as `ir choose`)
55
+ ir --model NAME       pin a model: short alias or canonical id; any claude flag passes through
56
+ ir choose             interactive picker: pick a model, then launch
57
+ ir anthropic          escape hatch: plain Claude, your own setup
58
+ ir status             personal usage view (TUI)
59
+ ir login              save / refresh your inferroute API key
60
+ ir help               one-screen explainer
61
+
62
+ ir add recording      turn on private on-device recording (see below)
63
+ ir data show          what's been recorded (counts, models, size)
64
+ ir data export DIR    copy the metadata layer  ·  ir data wipe    delete it all
65
+ ir remove recording   stop recording
66
+ ```
67
+
68
+ `NAME` is a short alias (`minimax`, `minimax-m3`, `kimi`, `glm`) or any
69
+ canonical id (e.g. `claude-opus-4-8`), which passes through verbatim.
70
+
71
+ Every command bakes in `--dangerously-skip-permissions`, which is the
72
+ right default for an agentic workflow. If you want the prompts back,
73
+ just use plain `claude`.
74
+
75
+ ## What does each command do under the hood?
76
+
77
+ ```
78
+ ir --model NAME  →  ANTHROPIC_BASE_URL=<recorder daemon if running, \
79
+                       else https://api.inferroute.ai> \
80
+                     ANTHROPIC_AUTH_TOKEN=$saved_key \
81
+                     exec claude --dangerously-skip-permissions \
82
+                       --model <resolved-id>
83
+ ir anthropic     →  exec claude --dangerously-skip-permissions
84
+                     (no env mutation; pure pass-through)
85
+ ```
86
+
87
+ The env vars are scoped to the spawned `claude` process. Your shell
88
+ doesn't see them. When the local recorder daemon is running it sits in front
89
+ of `api.inferroute.ai` — it records locally, then forwards there (the cloud
90
+ still does its own backend fallbacks).
91
+
92
+ ## Local recording (optional)
93
+
94
+ `ir add recording` runs a small daemon on `localhost:5005` that records — **on
95
+ this machine only** — which model you pick per task and how the turn went, so
96
+ you (or a future personal router) can learn your preferences. It is **never
97
+ uploaded; we never see it.**
98
+
99
+ - Default is `full` — choices, outcomes, **and** prompt text. The prompt text is
100
+ what actually lets it learn your preferences, and it never leaves this machine.
101
+ Pick `minimal` at install for choices + outcomes only (no prompt text), which is
102
+ lighter but can't train a personal router.
103
+ - Inspect it any time with `ir data show`, copy the metadata layer with
104
+ `ir data export DIR` (never raw prompt text), or delete everything with
105
+ `ir data wipe`.
106
+ - It lives under `~/.inferroute`. Remove the daemon with `ir remove recording`
107
+ (`--purge` also deletes the recorded data).
108
+
109
+ ## Where your config lives
110
+
111
+ | Path | Contents |
112
+ |---|---|
113
+ | `~/.config/inferroute/credentials` | Your inferroute API key + base URL (mode 600). Edit by hand or re-run `ir login`. |
114
+
115
+ ## Privacy
116
+
117
+ The CLI talks to:
118
+ - `https://api.inferroute.ai` for your traffic (everything except `ir anthropic`)
119
+ - `https://inferroute.ai` only on `ir help` — to print the signup URL
120
+
121
+ It does **not** phone home and sends no telemetry. If you enable `ir add recording`,
122
+ that data is written **only** to `~/.inferroute` on your machine and is never
123
+ uploaded; inspect or delete it any time with `ir data show` / `ir data wipe`.
124
+ What your `claude --model …` session sends to its upstream is your responsibility.
125
+
126
+ `ir anthropic` is literally `exec claude` plus one flag — the inferroute
127
+ service sees nothing.
128
+
129
+ ## Source
130
+
131
+ https://github.com/InferRoute/inferroute
132
+
133
+ Issues / feature requests / PRs welcome.
134
+
135
+ ## License
136
+
137
+ MIT — see [LICENSE](LICENSE).
@@ -0,0 +1,110 @@
1
+ # ir — Claude Code through inferroute
2
+
3
+ Tiny launcher: one command (`ir`) spawns `claude` with the model you pick
4
+ and your inferroute key, so you don't have to set env vars every time.
5
+ **Your normal `claude` is never touched** — `ir` only configures the
6
+ subprocess it spawns.
7
+
8
+ You choose the model per session — there's no auto-routing. Optionally, an
9
+ on-device recorder logs your choices privately (see [Local recording](#local-recording-optional)).
10
+
11
+ ## Install
12
+
13
+ ```bash
14
+ pipx install inferroute
15
+ ir login
16
+ ```
17
+
18
+ That's it. `ir login` saves your API key to `~/.config/inferroute/credentials`
19
+ (mode 600) and verifies it against the inferroute API. Update any time with
20
+ `pipx upgrade inferroute`.
21
+
22
+ Sign up for a key at https://inferroute.ai if you don't have one.
23
+
24
+ ## Commands
25
+
26
+ ```
27
+ ir                    open the model picker, then launch (same as `ir choose`)
28
+ ir --model NAME       pin a model: short alias or canonical id; any claude flag passes through
29
+ ir choose             interactive picker: pick a model, then launch
30
+ ir anthropic          escape hatch: plain Claude, your own setup
31
+ ir status             personal usage view (TUI)
32
+ ir login              save / refresh your inferroute API key
33
+ ir help               one-screen explainer
34
+
35
+ ir add recording      turn on private on-device recording (see below)
36
+ ir data show          what's been recorded (counts, models, size)
37
+ ir data export DIR    copy the metadata layer  ·  ir data wipe    delete it all
38
+ ir remove recording   stop recording
39
+ ```
40
+
41
+ `NAME` is a short alias (`minimax`, `minimax-m3`, `kimi`, `glm`) or any
42
+ canonical id (e.g. `claude-opus-4-8`), which passes through verbatim.
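
The alias rule above can be pictured as a dictionary lookup with a pass-through fallback. This is only a sketch: the alias names come from the README, but the canonical ids on the right-hand side are placeholders invented for illustration, and the real table inside the CLI is not shown here.

```python
# Hypothetical alias table. The keys are the aliases listed in the README;
# the values are made-up placeholders, not the real inferroute model ids.
ALIASES = {
    "minimax": "example/minimax-latest",
    "minimax-m3": "example/minimax-m3",
    "kimi": "example/kimi-latest",
    "glm": "example/glm-latest",
}


def resolve_model(name: str) -> str:
    """Return the canonical id for a short alias, or the name unchanged."""
    return ALIASES.get(name, name)
```

With this shape, an unknown name such as `claude-opus-4-8` is returned verbatim, which matches the "passes through verbatim" behaviour described above.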
43
+
44
+ Every command bakes in `--dangerously-skip-permissions`, which is the
45
+ right default for an agentic workflow. If you want the prompts back,
46
+ just use plain `claude`.
47
+
48
+ ## What does each command do under the hood?
49
+
50
+ ```
51
+ ir --model NAME  →  ANTHROPIC_BASE_URL=<recorder daemon if running, \
52
+                       else https://api.inferroute.ai> \
53
+                     ANTHROPIC_AUTH_TOKEN=$saved_key \
54
+                     exec claude --dangerously-skip-permissions \
55
+                       --model <resolved-id>
56
+ ir anthropic     →  exec claude --dangerously-skip-permissions
57
+                     (no env mutation; pure pass-through)
58
+ ```
59
+
60
+ The env vars are scoped to the spawned `claude` process. Your shell
61
+ doesn't see them. When the local recorder daemon is running it sits in front
62
+ of `api.inferroute.ai` — it records locally, then forwards there (the cloud
63
+ still does its own backend fallbacks).
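
A minimal sketch of how that scoping could work in Python, assuming the launcher copies the parent environment and hands the copy to the child. The function names (`build_launch`, `launch`) and the `recorder_url` parameter are hypothetical; only the two environment variable names, the flag, and the default base URL are taken from the README.

```python
import os

DEFAULT_BASE_URL = "https://api.inferroute.ai"


def build_launch(model_id, api_key, recorder_url=None, base_env=None):
    """Return (argv, env) for the child `claude` process.

    The parent environment is copied, never mutated, so the calling shell
    and any plain `claude` run outside `ir` are unaffected.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: the caller passes recorder_url when the local daemon is up.
    env["ANTHROPIC_BASE_URL"] = recorder_url or DEFAULT_BASE_URL
    env["ANTHROPIC_AUTH_TOKEN"] = api_key
    argv = ["claude", "--dangerously-skip-permissions", "--model", model_id]
    return argv, env


def launch(argv, env):
    """Replace the current process with `claude`; does not return on success."""
    os.execvpe(argv[0], argv, env)
```

Because `os.execvpe` receives the copied mapping, the variables exist only in the process image that becomes `claude`.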
64
+
65
+ ## Local recording (optional)
66
+
67
+ `ir add recording` runs a small daemon on `localhost:5005` that records — **on
68
+ this machine only** — which model you pick per task and how the turn went, so
69
+ you (or a future personal router) can learn your preferences. It is **never
70
+ uploaded; we never see it.**
71
+
72
+ - Default is `full` — choices, outcomes, **and** prompt text. The prompt text is
73
+ what actually lets it learn your preferences, and it never leaves this machine.
74
+ Pick `minimal` at install for choices + outcomes only (no prompt text), which is
75
+ lighter but can't train a personal router.
76
+ - Inspect it any time with `ir data show`, copy the metadata layer with
77
+ `ir data export DIR` (never raw prompt text), or delete everything with
78
+ `ir data wipe`.
79
+ - It lives under `~/.inferroute`. Remove the daemon with `ir remove recording`
80
+ (`--purge` also deletes the recorded data).
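
The difference between `full`, `minimal`, and the exported metadata layer can be sketched as two small filters. The record fields (`model`, `outcome`, `prompt`) are assumptions made for this example; the recorder's actual on-disk schema is not documented here.

```python
def to_stored_record(turn: dict, mode: str = "full") -> dict:
    """Keep choice and outcome always; keep prompt text only in `full` mode."""
    if mode not in ("full", "minimal"):
        raise ValueError(f"unknown recording mode: {mode!r}")
    record = {"model": turn["model"], "outcome": turn["outcome"]}
    if mode == "full":
        record["prompt"] = turn["prompt"]
    return record


def to_export_record(stored: dict) -> dict:
    """Metadata layer only: raw prompt text is never included in an export."""
    return {key: value for key, value in stored.items() if key != "prompt"}
```

In this sketch an export of a `full` record and a `minimal` record look the same, since both drop the prompt text.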
81
+
82
+ ## Where your config lives
83
+
84
+ | Path | Contents |
85
+ |---|---|
86
+ | `~/.config/inferroute/credentials` | Your inferroute API key + base URL (mode 600). Edit by hand or re-run `ir login`. |
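
One way to get a mode-600 file is to create it with restrictive permissions from the start rather than tightening them afterwards. The sketch below assumes a JSON layout with `api_key` and `base_url` keys, which is a guess for illustration; the real file format is not specified in this README.

```python
import json
import os
from pathlib import Path


def save_credentials(path: Path, api_key: str, base_url: str) -> None:
    """Write the key and base URL to `path`, readable by the owner only."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create with 0o600 so the file is never briefly readable by others.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fh:
        json.dump({"api_key": api_key, "base_url": base_url}, fh)
    # Tighten permissions if the file already existed with a looser mode.
    os.chmod(path, 0o600)
```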
87
+
88
+ ## Privacy
89
+
90
+ The CLI talks to:
91
+ - `https://api.inferroute.ai` for your traffic (everything except `ir anthropic`)
92
+ - `https://inferroute.ai` only on `ir help` — to print the signup URL
93
+
94
+ It does **not** phone home and sends no telemetry. If you enable `ir add recording`,
95
+ that data is written **only** to `~/.inferroute` on your machine and is never
96
+ uploaded; inspect or delete it any time with `ir data show` / `ir data wipe`.
97
+ What your `claude --model …` session sends to its upstream is your responsibility.
98
+
99
+ `ir anthropic` is literally `exec claude` plus one flag — the inferroute
100
+ service sees nothing.
101
+
102
+ ## Source
103
+
104
+ https://github.com/InferRoute/inferroute
105
+
106
+ Issues / feature requests / PRs welcome.
107
+
108
+ ## License
109
+
110
+ MIT — see [LICENSE](LICENSE).
@@ -0,0 +1,120 @@
1
+ # Releasing inferroute
2
+
3
+ Two artifact families need to ship together for a clean release: the Python
4
+ package (PyPI) and the classifier model bundle (GitHub Releases). The CLI's
5
+ default `classifier_bootstrap_url` points at the GitHub release's "latest"
6
+ download path, so users running `ir add local-routing` get whichever model
7
+ was attached to the most recent tag.
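
For orientation, the two GitHub URL shapes involved can be sketched as follows. The repository slug comes from the package's Source URL; the helper names are hypothetical, and the exact default value of `classifier_bootstrap_url` is not shown in this document, so treat the result as the expected pattern rather than the configured string.

```python
REPO = "InferRoute/inferroute"  # taken from the project's Source URL


def latest_asset_url(asset_name: str, repo: str = REPO) -> str:
    """URL that GitHub redirects to the named asset of the latest release."""
    return f"https://github.com/{repo}/releases/latest/download/{asset_name}"


def tagged_asset_url(tag: str, asset_name: str, repo: str = REPO) -> str:
    """URL of the named asset attached to one specific release tag."""
    return f"https://github.com/{repo}/releases/download/{tag}/{asset_name}"
```

The "latest" form is what lets already-installed clients pick up a new bundle without a CLI upgrade, while the tagged form is what a manifest pins its file entries to.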
8
+
9
+ ## 1. Cut the Python release
10
+
11
+ ```bash
12
+ # Bump the version in pyproject.toml, commit, tag
13
+ vim pyproject.toml # e.g. 0.1.0 → 0.2.0
14
+ git commit -am "release v0.2.0"
15
+ git tag v0.2.0
16
+ git push origin main --tags
17
+
18
+ # Build the sdist and wheel
19
+ rm -rf dist/
20
+ python -m build
21
+
22
+ # Upload to PyPI (uses your ~/.pypirc or twine env vars)
23
+ python -m twine upload dist/*
24
+ ```
25
+
26
+ ## 2. Build the classifier bundle
27
+
28
+ The bundle is the contents of `~/.inferroute/models/classifier-v0/` packed
29
+ as individual files + a `classifier-v0-manifest.json` describing them.
30
+
31
+ ```bash
32
+ # From inferroute-local-experiments — the canonical source of the trained model
33
+ cd ~/workspaces/inferroute/inferroute-local-experiments
34
+ MODEL_DIR=out/classifier-v0-longer
35
+ BUNDLE_DIR=/tmp/classifier-v0-bundle-$(date +%Y%m%d)
36
+ mkdir -p $BUNDLE_DIR/onnx
37
+
38
+ # Copy the four required files (and their data sidecar if present)
39
+ cp $MODEL_DIR/onnx/model.onnx $BUNDLE_DIR/onnx/
40
+ [ -f $MODEL_DIR/onnx/model.onnx.data ] && cp $MODEL_DIR/onnx/model.onnx.data $BUNDLE_DIR/onnx/
41
+ cp $MODEL_DIR/tokenizer.json $BUNDLE_DIR/
42
+ cp $MODEL_DIR/calibration.json $BUNDLE_DIR/
43
+ cp $MODEL_DIR/label_to_int.json $BUNDLE_DIR/
44
+
45
+ # Generate the manifest (sha256 each file, point url at the release tag)
46
+ TAG=v0.2.0 # match the Python release tag
47
+ .venv/bin/python - <<EOF
48
+ import hashlib, json
49
+ from pathlib import Path
50
+ base = Path("$BUNDLE_DIR")
51
+ release_url = "https://github.com/InferRoute/inferroute/releases/download/$TAG"
52
+ files = []
53
+ for rel in ["onnx/model.onnx", "tokenizer.json", "calibration.json", "label_to_int.json"]:
54
+     p = base / rel
55
+     if not p.exists():
56
+         raise SystemExit(f"missing required file: {p}")
57
+     sha = hashlib.sha256(p.read_bytes()).hexdigest()
58
+     files.append({"path": rel, "url": f"{release_url}/{p.name}", "sha256": sha})
59
+ # Optional sidecar (large model weights)
60
+ for sidecar in base.glob("onnx/model.onnx.data"):
61
+     sha = hashlib.sha256(sidecar.read_bytes()).hexdigest()
62
+     files.append({"path": f"onnx/{sidecar.name}", "url": f"{release_url}/{sidecar.name}", "sha256": sha})
63
+ manifest = {"version": "$TAG", "files": files}
64
+ (base / "classifier-v0-manifest.json").write_text(json.dumps(manifest, indent=2))
65
+ print(json.dumps(manifest, indent=2))
66
+ EOF
67
+ ```
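
Before uploading, it can be worth re-checking the bundle against the manifest that was just written. The following is a sketch of such a check, relying only on the `files`, `path`, and `sha256` keys produced by the script above; the function name `verify_bundle` is made up for this example.

```python
import hashlib
import json
from pathlib import Path


def verify_bundle(bundle_dir: Path, manifest_name: str = "classifier-v0-manifest.json") -> list:
    """Return the manifest paths whose file is missing or has a wrong sha256."""
    manifest = json.loads((bundle_dir / manifest_name).read_text())
    bad = []
    for entry in manifest["files"]:
        target = bundle_dir / entry["path"]
        if not target.is_file():
            bad.append(entry["path"])
            continue
        digest = hashlib.sha256(target.read_bytes()).hexdigest()
        if digest != entry["sha256"]:
            bad.append(entry["path"])
    return bad
```

An empty list means every listed file is present and matches its recorded hash.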
68
+
69
+ ## 3. Attach to the GitHub Release
70
+
71
+ Upload these as release assets to the matching tag (`v0.2.0`):
72
+
73
+ - `classifier-v0-manifest.json` ← the daemon's entry point; must be named exactly this
74
+ - `model.onnx` ← rename so the URLs in the manifest match
75
+ - `model.onnx.data` ← if your export uses external weights
76
+ - `tokenizer.json`
77
+ - `calibration.json`
78
+ - `label_to_int.json`
79
+
80
+ Easiest via `gh` (drop the `model.onnx.data` line if your export has no external weights):
81
+
82
+ ```bash
83
+ gh release create $TAG \
84
+ --title "$TAG" \
85
+ --notes "$(git log --oneline $(git describe --tags --abbrev=0 HEAD^)..HEAD)" \
86
+ $BUNDLE_DIR/classifier-v0-manifest.json \
87
+ $BUNDLE_DIR/onnx/model.onnx \
88
+ $BUNDLE_DIR/onnx/model.onnx.data \
89
+ $BUNDLE_DIR/tokenizer.json \
90
+ $BUNDLE_DIR/calibration.json \
91
+ $BUNDLE_DIR/label_to_int.json
92
+ ```
93
+
94
+ The daemon's default `classifier_bootstrap_url` uses `/releases/latest/download/`,
95
+ so once this release is published, every user running `ir add local-routing`
96
+ gets these files automatically.
97
+
98
+ ## 4. Smoke-test the published release
99
+
100
+ ```bash
101
+ # In a clean env that doesn't have the model on disk
102
+ pip install --upgrade inferroute
103
+ rm -rf ~/.inferroute/models/classifier-v0
104
+ ir add local-routing --no-service --no-shell-edit --yes
105
+ # Should print: ✓ Installed model version v0.2.0 at ~/.inferroute/models/classifier-v0
106
+ ls ~/.inferroute/models/classifier-v0/
107
+ ```
108
+
109
+ ## Common slip-ups
110
+
111
+ - **Filename mismatch**: the manifest's `url` field must match the actual
112
+ filename you uploaded. `model.onnx.data` in the manifest doesn't help if
113
+ you uploaded it as `weights.bin`.
114
+ - **Tag drift**: if you bump the Python version but forget to tag, the next
115
+ `ir add local-routing` still pulls the OLD model from `/releases/latest/`.
116
+ Always tag both halves at the same version.
117
+ - **Pre-release tag**: GitHub's `/releases/latest/` ignores pre-releases. If
118
+ you ship a `v0.3.0-rc1` and don't mark it as a full release, users still
119
+ see `v0.2.0`. Use this on purpose during testing, but remember to flip
120
+ the "Set as the latest release" toggle for the real ship.
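
The filename-mismatch slip-up in particular is easy to catch mechanically. Here is a small sketch, assuming you have the manifest as a dict and a set of the asset names you actually uploaded (for example, copied from the release page); the function name `missing_assets` is hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def missing_assets(manifest: dict, uploaded_names: set) -> list:
    """Return manifest URLs whose final path segment is not an uploaded asset."""
    missing = []
    for entry in manifest["files"]:
        name = PurePosixPath(urlparse(entry["url"]).path).name
        if name not in uploaded_names:
            missing.append(entry["url"])
    return missing
```

Any URL returned here points at a file that the release does not contain under that name, which is the `weights.bin` situation described above.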
@@ -0,0 +1,8 @@
1
+ """inferroute CLI — launch Claude Code through inferroute with one command."""
2
+ from importlib.metadata import PackageNotFoundError, version
3
+
4
+ try:
5
+     # Single source of truth: the installed package version from pyproject.
6
+     __version__ = version("inferroute")
7
+ except PackageNotFoundError:  # running from a source tree that isn't installed
8
+     __version__ = "0.0.0+dev"
@@ -0,0 +1,7 @@
1
+ """Allow `python -m inferroute_cli`."""
2
+ import sys
3
+
4
+ from .main import main
5
+
6
+ if __name__ == "__main__":
7
+     sys.exit(main())