petsitter 0.1.0__py3-none-any.whl

@@ -0,0 +1,280 @@
+ Metadata-Version: 2.4
+ Name: petsitter
+ Version: 0.1.0
+ Summary: OpenAI-compatible proxy that adds functionality to models through tricks
+ License-File: LICENSE.MIT
+ Requires-Python: >=3.10
+ Requires-Dist: click>=8.1.0
+ Requires-Dist: httpx>=0.25.0
+ Requires-Dist: pydantic>=2.0.0
+ Requires-Dist: starlette>=0.34.0
+ Requires-Dist: uvicorn>=0.24.0
+ Provides-Extra: test
+ Requires-Dist: pytest-asyncio>=0.21.0; extra == 'test'
+ Requires-Dist: pytest>=7.0.0; extra == 'test'
+ Description-Content-Type: text/markdown
+
+ # Petsitter
+
+ **Teach old models new tricks.**
+
+ Petsitter is an OpenAI-compatible proxy that layers smart harnesses on top of language models, giving them capabilities they don't natively have. Smaller models can't do tool calling? Petsitter tricks them into it. Need structured JSON output? Petsitter will loop until it gets it right.
+
+ But that's only the beginning. Cyclomatic complexity? Halstead metrics? Chidamber and Kemerer? Why not!
+
+ ## Who Is This For?
+
+ - **You run local models** (Ollama, llama.cpp, vLLM, SGLang) and miss OpenAI's features
+ - **You use small/cheap models** that lack tool calling or JSON mode
+ - **You build agentic systems** that need consistent capabilities across different models
+ - **You want to experiment** with prompt engineering tricks without changing your application code
+
+ ## What Does It Do?
+
+ Petsitter sits between your application and your model, intercepting requests and responses to apply "tricks": pluggable transformations that add functionality through:
+
+ 1. **Prompt engineering**: Inject instructions and tool definitions
+ 2. **Context manipulation**: Modify messages before/after the model sees them
+ 3. **Retry loops**: Call the model again if output doesn't meet requirements
+ 4. **Response transformation**: Convert outputs to expected formats (e.g., OpenAI tool_calls)
+
+ ## Why Use It?
+
+ - **No model changes required**: Works with any OpenAI-compatible endpoint
+ - **Pluggable architecture**: Write your own tricks in Python
+ - **Transparent to your app**: Point your existing code at petsitter instead of the model
+ - **Mix and match**: Combine multiple tricks for compound effects
+
+ ---
+
+ ## Installation
+
+ ```bash
+ # Create virtual environment
+ uv venv
+
+ # Activate it
+ source .venv/bin/activate
+
+ # Install petsitter (a venv created by `uv venv` does not include pip by default)
+ uv pip install -e .
+ ```
+
+ ## Quick Start
+
+ ```bash
+ # Start your model backend (e.g., Ollama)
+ ollama serve
+
+ # Activate the virtual environment
+ source .venv/bin/activate
+
+ # Run petsitter with tricks
+ ./petsitter --model_url http://localhost:11434 \
+     --model_name llama3:8b \
+     --trick tricks/json_mode.py \
+     --trick tricks/tool_call.py \
+     --listen_on localhost:8080
+ ```
+
+ Now point your AI applications to `http://localhost:8080/v1`.
+
+ ## CLI Options
+
+ | Option | Required | Description |
+ |--------|----------|-------------|
+ | `--model_url` | Yes | Base URL of upstream model (e.g., `http://localhost:11434`) |
+ | `--model_name` | No | Model name (optional for vLLM, SGLang, llama.cpp) |
+ | `--api_key` | No | API key for upstream (if required) |
+ | `--trick` | No | Path to a trick module (can be repeated) |
+ | `--listen_on` | No | Host:port to listen on (default: `localhost:8080`) |
+
+ ## Built-in Tricks
+
+ ### JSON Mode (`tricks/json_mode.py`)
+
+ Enforces valid JSON output by:
+ - Adding formatting instructions to the system prompt
+ - Retrying with feedback if the response isn't valid JSON
+ - Stripping markdown code blocks
+
+ ```bash
+ ./petsitter --model_url http://localhost:11434 --trick tricks/json_mode.py
+ ```
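Review note: the validation step described above can be sketched as follows, assuming the model may wrap its reply in a single markdown code fence. `tricks/json_mode.py` is not listed in this wheel's RECORD, so `strip_code_fence` and `parse_json_reply` are hypothetical helpers for illustration, not the trick's actual code.

```python
import json
import re

# Hypothetical helpers; the real tricks/json_mode.py is not part of this wheel.
# Matches one fenced block with an optional language tag, e.g. a "json" fence.
_FENCE = re.compile(r"^\s*```[a-zA-Z0-9_-]*\s*\n(.*?)\n?\s*```\s*$", re.DOTALL)


def strip_code_fence(text: str) -> str:
    """Return the body of a single fenced block, or the stripped text if unfenced."""
    match = _FENCE.match(text)
    return match.group(1).strip() if match else text.strip()


def parse_json_reply(text: str):
    """Return (True, parsed_value) on success or (False, feedback_message) on failure."""
    try:
        return True, json.loads(strip_code_fence(text))
    except json.JSONDecodeError as exc:
        # The message could be fed back to the model on a retry.
        return False, f"Invalid JSON: {exc.msg} at position {exc.pos}"
```

The failure message is shaped so that a retry loop could pass it back to the model as feedback.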
+
+ ### Tool Calling (`tricks/tool_call.py`)
+
+ Enables tool calling for models without native support:
+ - Injects tool definitions into prompts
+ - Parses JSON-RPC-style tool call responses
+ - Converts them to the OpenAI `tool_calls` format
+
+ ```bash
+ ./petsitter --model_url http://localhost:11434 --trick tricks/tool_call.py
+ ```
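Review note: a rough sketch of the parse-and-convert step listed above. The reply shape (`{"method": ..., "params": ...}`) is an assumption about what "JSON-RPC-style" means here, and `to_openai_tool_call` is a hypothetical name; `tricks/tool_call.py` itself is not included in this wheel. The output follows the OpenAI `tool_calls` entry layout, where `arguments` is a JSON-encoded string.

```python
import json
import uuid
from typing import Optional


def to_openai_tool_call(reply_text: str) -> Optional[dict]:
    """Convert a JSON-RPC-style reply into one OpenAI tool_calls entry, or None."""
    try:
        data = json.loads(reply_text)
    except json.JSONDecodeError:
        return None  # plain prose, not a tool call
    if not isinstance(data, dict) or not isinstance(data.get("method"), str):
        return None
    return {
        "id": f"call_{uuid.uuid4().hex[:24]}",
        "type": "function",
        "function": {
            "name": data["method"],
            # OpenAI clients expect arguments as a JSON string, not an object.
            "arguments": json.dumps(data.get("params", {})),
        },
    }
```

Returning `None` for anything that does not parse lets a `post_hook` leave ordinary text replies untouched.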
+
+ ### List Files (`tricks/list_files.py`)
+
+ A test trick that provides a `list_files` tool, useful for exercising tool calling.
+
+ ## Creating Custom Tricks
+
+ The `Trick` class has four hooks you can implement. Each hook is optional; implement only what you need.
+
+ ### `system_prompt(to_add: str) -> str`
+
+ **When:** Called once per request, before any messages are sent to the model.
+
+ **Purpose:** Append instructions to the system prompt. `to_add` holds the system prompt built so far, and the string you return is appended to it. This is how you "prime" the model to behave a certain way.
+
+ **Example:**
+ ```python
+ def system_prompt(self, to_add: str) -> str:
+     return "IMPORTANT: Respond only in valid JSON. No markdown, no explanations."
+ ```
+
+ ### `pre_hook(context: list, params: dict) -> list`
+
+ **When:** Called after the system prompt is set, before the model receives the messages.
+
+ **Purpose:** Modify the conversation context. You can inject tool definitions, add few-shot examples, or restructure messages.
+
+ **Parameters:**
+ - `context`: List of message dicts (`[{"role": "user", "content": "..."}]`)
+ - `params`: Request parameters including `tools`, `temperature`, etc.
+
+ **Example:**
+ ```python
+ def pre_hook(self, context: list, params: dict) -> list:
+     if "tools" in params:
+         # Inject tool definitions into system prompt
+         tools_json = json.dumps(params["tools"])
+         context[0]["content"] += f"\n\nAvailable tools: {tools_json}"
+     return context
+ ```
+
+ ### `post_hook(context: list) -> list`
+
+ **When:** Called after the model responds, before the response goes back to your application.
+
+ **Purpose:** Validate, transform, or retry. This is where you can:
+ - Parse the response and convert it to a different format
+ - Detect when the model failed and call it again with feedback
+ - Extract tool calls from natural language
+
+ **Example (JSON validation with retry):**
+ ```python
+ def post_hook(self, context: list) -> list:
+     attempts = 3
+     while attempts > 0:
+         try:
+             json.loads(context[-1]["content"])
+             break  # Valid JSON, we're done
+         except json.JSONDecodeError:
+             attempts -= 1
+             if attempts == 0:
+                 break
+             # Retry with feedback
+             context = callmodel(context, "That wasn't valid JSON. Try again.")
+     return context
+ ```
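Review note: in the `src/trick.py` shipped in this wheel, `callmodel` is declared `async` and raises `ValueError` when `model_url` is empty, while `post_hook` is a plain method, so the retry example above reads best as pseudocode. Below is a minimal sketch of the same bounded retry that takes the re-ask step as a parameter. `retry_until_json` and `ask_again` are hypothetical names, not part of the package.

```python
import json
from typing import Callable


def retry_until_json(
    context: list,
    ask_again: Callable[[list, str], list],
    attempts: int = 3,
) -> list:
    """Re-ask the model until the last message parses as JSON or attempts run out."""
    for remaining in range(attempts, 0, -1):
        try:
            json.loads(context[-1]["content"])
            return context  # valid JSON, nothing more to do
        except (json.JSONDecodeError, TypeError):  # TypeError covers content=None
            if remaining == 1:
                break  # out of attempts; hand back the last reply as-is
            context = ask_again(context, "That wasn't valid JSON. Try again.")
    return context
```

With `attempts=3` the reply is validated up to three times and the model is re-asked at most twice, matching the counting in the README example.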
+
+ **Example (Tool call detection):**
+ ```python
+ def post_hook(self, context: list) -> list:
+     content = context[-1]["content"]
+     if self._looks_like_tool_call(content):
+         # Convert to OpenAI tool_calls format
+         context[-1]["tool_calls"] = [self._parse_tool_call(content)]
+         context[-1]["content"] = None
+     return context
+ ```
+
+ ### `info(capabilities: dict) -> dict`
+
+ **When:** Called when building the response to your application.
+
+ **Purpose:** Declare what capabilities this trick provides. The merged dict is returned under a top-level `capabilities` key in the chat completion response. Some frameworks check for capabilities before using certain features.
+
+ **Example:**
+ ```python
+ def info(self, capabilities: dict) -> dict:
+     capabilities["json_mode"] = True
+     capabilities["tools_support"] = True
+     return capabilities
+ ```
+
+ ## Full Trick Example
+
+ Here's a trick that makes any model respond in haiku:
+
+ ```python
+ from src.trick import Trick
+
+ class HaikuTrick(Trick):
+     """Force the model to respond only in haiku."""
+
+     def system_prompt(self, to_add: str) -> str:
+         return (
+             "You must respond only in haiku (5-7-5 syllables). "
+             "No explanations, no extra text. Just haiku."
+         )
+
+     def post_hook(self, context: list) -> list:
+         # Could add syllable counting and retry here
+         return context
+
+     def info(self, capabilities: dict) -> dict:
+         capabilities["haiku_mode"] = True
+         return capabilities
+ ```
+
+ Use it:
+ ```bash
+ ./petsitter --model_url http://localhost:11434 --trick haiku.py
+ ```
+
+ ## API Endpoints
+
+ Petsitter exposes OpenAI-compatible endpoints:
+
+ - `POST /v1/chat/completions` - Chat completions (proxied + transformed)
+ - `GET /v1/models` - List available models (proxied)
+ - `GET /health` - Health check
+
+ ## Running Tests
+
+ ```bash
+ # Activate virtual environment
+ source .venv/bin/activate
+
+ # Install test dependencies (a venv created by `uv venv` does not include pip by default)
+ uv pip install -e ".[test]"
+
+ # Run tests
+ pytest tests/
+ ```
+
+ ## Example: Using with an Agentic Framework
+
+ ```python
+ from openai import OpenAI
+
+ # Point to petsitter instead of directly to the model
+ client = OpenAI(
+     base_url="http://localhost:8080/v1",
+     api_key="not-needed"
+ )
+
+ response = client.chat.completions.create(
+     model="any-model-name",
+     messages=[{"role": "user", "content": "List files in /tmp"}],
+     tools=[{"type": "function", "function": {"name": "list_files", ...}}]
+ )
+
+ # With the tool_call trick, even small models can use tools!
+ ```
+
+ ## License
+
+ MIT
@@ -0,0 +1,11 @@
+ src/__init__.py,sha256=aUogdBrPbcR_sN60wGtqWp8NKj62XAxpURRAGHRi838,73
+ src/context.py,sha256=6fz-_CxCCzRkfKgXjemMcRCFkIlYIC_hoIwqzfM0fSM,2328
+ src/loader.py,sha256=1wN00V-waodqkt1r7T3zAVGSkNrA209Oc_8QZhvk4nk,1727
+ src/proxy.py,sha256=m9IP_mkjyGoj4QDHv1vceCxnXE7qnXrflmYJ0DMIMpo,5969
+ src/server.py,sha256=4V9n82sMhvmwCNPBhpOY0s2i3PjXZEyUiDel2nOk6AI,6316
+ src/trick.py,sha256=wn4ET7qD9T1DpFDCuWDGrKY-Y-PEh0_uNQ3DGxSgh0Q,3057
+ petsitter-0.1.0.dist-info/METADATA,sha256=XObnYYC2WRLhk0g1BfrVlPUakf9uy305YP3Z4eYZOss,8481
+ petsitter-0.1.0.dist-info/WHEEL,sha256=mffPy8wBnZQn2VnJUU5jE99KsxaSfiyMHV9Yt0aLVxs,87
+ petsitter-0.1.0.dist-info/entry_points.txt,sha256=CYTLVRD5Gm9J6u9giXkhrAewEe9xHZ-uz_wSOZZtl4E,45
+ petsitter-0.1.0.dist-info/licenses/LICENSE.MIT,sha256=5T9l8YHMRH7MQGFKfOrJ0CHHUAjpH_YKmcnAQRjCABg,1071
+ petsitter-0.1.0.dist-info/RECORD,,
@@ -0,0 +1,4 @@
+ Wheel-Version: 1.0
+ Generator: hatchling 1.30.1
+ Root-Is-Purelib: true
+ Tag: py3-none-any
@@ -0,0 +1,2 @@
+ [console_scripts]
+ petsitter = src.server:cli
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2026 Chris McKenzie
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
src/__init__.py ADDED
@@ -0,0 +1,3 @@
+ from src.trick import Trick, callmodel
+
+ __all__ = ["Trick", "callmodel"]
src/context.py ADDED
@@ -0,0 +1,98 @@
+ """Context manipulation utilities for petsitter."""
+
+ from typing import Any
+
+
+ def get_system_prompt(context: list) -> str:
+     """Extract the system prompt from context if present.
+
+     Args:
+         context: List of message dicts.
+
+     Returns:
+         System prompt content or empty string.
+     """
+     if context and context[0].get("role") == "system":
+         return context[0].get("content", "")
+     return ""
+
+
+ def set_system_prompt(context: list, content: str) -> list:
+     """Set or update the system prompt in context.
+
+     Args:
+         context: List of message dicts.
+         content: New system prompt content.
+
+     Returns:
+         Modified context (may be new list).
+     """
+     if not context:
+         return [{"role": "system", "content": content}]
+
+     if context[0].get("role") == "system":
+         context[0]["content"] = content
+     else:
+         context.insert(0, {"role": "system", "content": content})
+
+     return context
+
+
+ def append_to_system_prompt(context: list, addition: str) -> list:
+     """Append text to the system prompt.
+
+     Args:
+         context: List of message dicts.
+         addition: Text to append.
+
+     Returns:
+         Modified context.
+     """
+     current = get_system_prompt(context)
+     if current:
+         new_content = current + "\n" + addition
+     else:
+         new_content = addition
+     return set_system_prompt(context, new_content)
+
+
+ def get_last_message(context: list) -> dict | None:
+     """Get the last message in context.
+
+     Args:
+         context: List of message dicts.
+
+     Returns:
+         Last message dict or None.
+     """
+     return context[-1] if context else None
+
+
+ def set_last_message_content(context: list, content: str) -> list:
+     """Replace the content of the last message.
+
+     Args:
+         context: List of message dicts.
+         content: New content.
+
+     Returns:
+         Modified context.
+     """
+     if context:
+         context[-1]["content"] = content
+     return context
+
+
+ def add_message(context: list, role: str, content: str) -> list:
+     """Add a new message to the context.
+
+     Args:
+         context: List of message dicts.
+         role: Message role (user, assistant, system, tool).
+         content: Message content.
+
+     Returns:
+         Modified context.
+     """
+     context.append({"role": role, "content": content})
+     return context
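Review note: the helpers in `src/context.py` mostly modify the list they are given and return it, so a caller who keeps a reference to the original context will see the changes too. If that is not wanted, a non-mutating variant might look like the sketch below. `with_system_suffix` is a hypothetical name and is not part of the package.

```python
import copy


def with_system_suffix(context: list, addition: str) -> list:
    """Return a new context whose system prompt ends with `addition`.

    The input list and its message dicts are left untouched.
    """
    new_context = copy.deepcopy(context)
    if new_context and new_context[0].get("role") == "system":
        current = new_context[0].get("content", "")
        new_context[0]["content"] = f"{current}\n{addition}" if current else addition
    else:
        # No system message yet: add one at the front of the copy.
        new_context.insert(0, {"role": "system", "content": addition})
    return new_context
```

The deep copy costs a little per request, which is likely negligible next to the upstream model call.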
src/loader.py ADDED
@@ -0,0 +1,63 @@
+ """Dynamic loading of trick modules."""
+
+ import importlib.util
+ import sys
+ from pathlib import Path
+ from typing import Type
+
+ from src.trick import Trick
+
+
+ def load_trick_from_path(path: str) -> Type[Trick]:
+     """Load a Trick class from a Python file path.
+
+     Args:
+         path: File path to the trick module (e.g., 'tricks/tools.py').
+
+     Returns:
+         The Trick subclass defined in the module.
+
+     Raises:
+         FileNotFoundError: If the path doesn't exist.
+         ImportError: If no Trick subclass is found.
+     """
+     trick_path = Path(path).resolve()
+     if not trick_path.exists():
+         raise FileNotFoundError(f"Trick file not found: {path}")
+
+     module_name = trick_path.stem
+     spec = importlib.util.spec_from_file_location(module_name, trick_path)
+     if spec is None or spec.loader is None:
+         raise ImportError(f"Could not load module: {path}")
+
+     module = importlib.util.module_from_spec(spec)
+     sys.modules[module_name] = module
+     spec.loader.exec_module(module)
+
+     # Find the Trick subclass (not the base Trick itself)
+     for attr_name in dir(module):
+         attr = getattr(module, attr_name)
+         if (
+             isinstance(attr, type)
+             and issubclass(attr, Trick)
+             and attr is not Trick
+         ):
+             return attr
+
+     raise ImportError(f"No Trick subclass found in {path}")
+
+
+ def load_tricks(paths: list[str]) -> list[Trick]:
+     """Load multiple tricks from file paths.
+
+     Args:
+         paths: List of file paths to trick modules.
+
+     Returns:
+         List of instantiated Trick objects.
+     """
+     tricks = []
+     for path in paths:
+         trick_class = load_trick_from_path(path)
+         tricks.append(trick_class())
+     return tricks
src/proxy.py ADDED
@@ -0,0 +1,168 @@
+ """OpenAI API proxy handling for petsitter."""
+
+ import json
+ import logging
+ from typing import Any
+
+ import httpx
+
+ from src.context import append_to_system_prompt
+ from src.trick import Trick, callmodel
+
+ logger = logging.getLogger("petsitter")
+
+
+ class ProxyHandler:
+     """Handles proxied requests to the upstream model."""
+
+     def __init__(
+         self,
+         model_url: str,
+         model_name: str | None,
+         api_key: str = "",
+         tricks: list[Trick] | None = None,
+     ):
+         self.model_url = model_url.rstrip("/")
+         self.model_name = model_name
+         self.api_key = api_key
+         self.tricks = tricks or []
+
+     def _build_headers(self) -> dict[str, str]:
+         """Build request headers."""
+         headers = {"Content-Type": "application/json"}
+         if self.api_key:
+             headers["Authorization"] = f"Bearer {self.api_key}"
+         return headers
+
+     def _apply_system_prompt_tricks(self, system_prompt: str) -> str:
+         """Apply system_prompt hooks from all tricks."""
+         result = system_prompt
+         for trick in self.tricks:
+             addition = trick.system_prompt(result)
+             if addition:
+                 result = result + "\n" + addition if result else addition
+         return result
+
+     def _apply_pre_hooks(self, context: list, params: dict) -> list:
+         """Apply pre_hook from all tricks."""
+         result = context
+         for trick in self.tricks:
+             result = trick.pre_hook(result, params)
+         return result
+
+     def _apply_post_hooks(self, context: list) -> list:
+         """Apply post_hook from all tricks."""
+         result = context
+         for trick in self.tricks:
+             result = trick.post_hook(result)
+         return result
+
+     def _merge_capabilities(self) -> dict:
+         """Merge capabilities from all tricks."""
+         capabilities = {}
+         for trick in self.tricks:
+             capabilities = trick.info(capabilities)
+         return capabilities
+
+     async def chat_completions(self, payload: dict) -> dict:
+         """Handle /v1/chat/completions request.
+
+         Args:
+             payload: The incoming request body.
+
+         Returns:
+             The response from the upstream model (possibly modified).
+         """
+         # Extract messages and apply system prompt tricks
+         messages = payload.get("messages", [])
+         system_prompt = ""
+         if messages and messages[0].get("role") == "system":
+             system_prompt = messages[0].get("content", "")
+             messages = messages[1:]
+
+         # Apply system prompt tricks
+         new_system_prompt = self._apply_system_prompt_tricks(system_prompt)
+         if new_system_prompt:
+             messages = [{"role": "system", "content": new_system_prompt}] + messages
+
+         # Apply pre-hooks
+         messages = self._apply_pre_hooks(messages, payload)
+
+         # Build upstream request
+         upstream_payload = {
+             "model": self.model_name or payload.get("model", "default"),
+             "messages": messages,
+         }
+         # Pass through optional params (but force stream=false for upstream)
+         for key in ["temperature", "max_tokens"]:
+             if key in payload:
+                 upstream_payload[key] = payload[key]
+
+         # Don't pass tools/stream to upstream - we handle tool calling via tricks
+         # and always fetch full response for post-processing
+         # upstream_payload["stream"] = False
+
+         logger.info(f"Calling upstream model: {self.model_url}/v1/chat/completions")
+         logger.debug(f"Upstream payload: {json.dumps(upstream_payload, indent=2)}")
+
+         # Call upstream model
+         async with httpx.AsyncClient() as client:
+             response = await client.post(
+                 f"{self.model_url}/v1/chat/completions",
+                 json=upstream_payload,
+                 headers=self._build_headers(),
+                 timeout=120.0,
+             )
+
+         # Log response details for debugging
+         logger.info(f"Upstream response status: {response.status_code}")
+         logger.debug(f"Upstream response headers: {dict(response.headers)}")
+         logger.debug(f"Upstream response body: {response.text[:500] if response.text else '(empty)'}")
+
+         response.raise_for_status()
+
+         # Check for empty response
+         if not response.content:
+             logger.error(f"Empty response from upstream. Status: {response.status_code}")
+             logger.error(f"Response headers: {dict(response.headers)}")
+             raise ValueError(f"Upstream returned empty response (status {response.status_code})")
+
+         result = response.json()
+
+         logger.debug(f"Upstream response: {json.dumps(result, indent=2)}")
+
+         # Extract assistant message and build context for post-hooks
+         assistant_message = result["choices"][0]["message"]
+         context = messages + [assistant_message]
+
+         logger.debug(f"Context before post-hooks: {json.dumps(context, indent=2)}")
+
+         # Apply post-hooks
+         context = self._apply_post_hooks(context)
+
+         logger.debug(f"Context after post-hooks: {json.dumps(context, indent=2)}")
+
+         # Update result with potentially modified response
+         result["choices"][0]["message"] = context[-1]
+
+         # Merge capabilities into response if present
+         capabilities = self._merge_capabilities()
+         if capabilities:
+             result["capabilities"] = capabilities
+
+         return result
+
+     async def models(self) -> dict:
+         """Handle /v1/models request.
+
+         Returns:
+             Model listing response.
+         """
+         async with httpx.AsyncClient() as client:
+             response = await client.get(
+                 f"{self.model_url}/v1/models",
+                 headers=self._build_headers(),
+                 timeout=30.0,
+             )
+             response.raise_for_status()
+             return response.json()
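Review note: the order in which `ProxyHandler` chains tricks matters, because each `system_prompt` hook sees the prompt built so far and each `post_hook` receives the previous hook's output. The sketch below restates that chaining with a stand-in `Trick` class. `JsonHint`, `Shout`, `build_system_prompt`, and `run_post_hooks` are hypothetical names used only to make the order visible.

```python
class Trick:  # minimal stand-in mirroring two hook names from src/trick.py
    def system_prompt(self, to_add: str) -> str:
        return ""

    def post_hook(self, context: list) -> list:
        return context


class JsonHint(Trick):
    def system_prompt(self, to_add: str) -> str:
        return "Respond only in valid JSON."


class Shout(Trick):
    def post_hook(self, context: list) -> list:
        context[-1]["content"] = context[-1]["content"].upper()
        return context


def build_system_prompt(base: str, tricks: list) -> str:
    """Each trick sees the prompt so far; non-empty return values are appended."""
    result = base
    for trick in tricks:
        addition = trick.system_prompt(result)
        if addition:
            result = f"{result}\n{addition}" if result else addition
    return result


def run_post_hooks(context: list, tricks: list) -> list:
    """Feed each trick's output context into the next trick, in list order."""
    for trick in tricks:
        context = trick.post_hook(context)
    return context
```

Since tricks are loaded in the order the `--trick` options are given, that order is also the order in which prompts are appended and post-hooks run.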
src/server.py ADDED
@@ -0,0 +1,203 @@
+ """HTTP server and CLI for petsitter."""
+
+ import json
+ import logging
+ import os
+ from typing import Any
+
+ import click
+ import uvicorn
+ from starlette.applications import Starlette
+ from starlette.requests import Request
+ from starlette.responses import JSONResponse, Response, StreamingResponse
+
+ from src.loader import load_tricks
+ from src.proxy import ProxyHandler
+ from src.trick import Trick
+
+
+ def create_app(
+     model_url: str,
+     model_name: str | None,
+     api_key: str,
+     trick_paths: list[str],
+ ) -> Starlette:
+     """Create the petsitter Starlette application.
+
+     Args:
+         model_url: Base URL of the upstream model.
+         model_name: Optional model name override.
+         api_key: API key for upstream.
+         trick_paths: List of trick file paths.
+
+     Returns:
+         Configured Starlette app.
+     """
+     tricks = load_tricks(trick_paths) if trick_paths else []
+     handler = ProxyHandler(model_url, model_name, api_key, tricks)
+
+     app = Starlette()
+
+     async def stream_chat_completions(handler: ProxyHandler, payload: dict):
+         """Stream chat completions as SSE events in OpenAI format."""
+         try:
+             result = await handler.chat_completions(payload)
+
+             # Convert to streaming format with delta instead of message
+             message = result["choices"][0]["message"]
+
+             # Build streaming response with delta
+             stream_result = {
+                 "id": result.get("id", "chatcmpl-petsitter"),
+                 "object": "chat.completion.chunk",
+                 "created": result.get("created", __import__("time").time()),
+                 "model": result.get("model", "unknown"),
+                 "choices": [{
+                     "index": 0,
+                     "delta": {
+                         "role": "assistant",
+                         "content": message.get("content"),
+                     },
+                     "finish_reason": result["choices"][0].get("finish_reason", "stop"),
+                 }],
+             }
+
+             # Add tool_calls to delta if present
+             if "tool_calls" in message:
+                 stream_result["choices"][0]["delta"]["tool_calls"] = message["tool_calls"]
+
+             yield f"data: {json.dumps(stream_result)}\n\n"
+             yield "data: [DONE]\n\n"
+         except Exception as e:
+             import traceback
+             tb = traceback.format_exc()
+             click.echo(f"ERROR in stream_chat_completions: {e}")
+             click.echo(tb)
+             error_data = {
+                 "error": {"message": str(e), "type": "proxy_error"}
+             }
+             yield f"data: {json.dumps(error_data)}\n\n"
+
+     @app.route("/v1/chat/completions", methods=["POST"])
+     async def chat_completions(request: Request) -> Response:
+         """Proxy chat completions to upstream model."""
+         try:
+             payload = await request.json()
+             stream = payload.get("stream", False)
+
+             if stream:
+                 return StreamingResponse(
+                     stream_chat_completions(handler, payload),
+                     media_type="text/event-stream",
+                 )
+             else:
+                 result = await handler.chat_completions(payload)
+                 return JSONResponse(result)
+         except Exception as e:
+             import traceback
+             tb = traceback.format_exc()
+             click.echo(f"ERROR in chat_completions: {e}")
+             click.echo(tb)
+             return JSONResponse(
+                 {"error": {"message": str(e), "type": "proxy_error", "traceback": tb}},
+                 status_code=500,
+             )
+
+     @app.route("/v1/models", methods=["GET"])
+     async def models(request: Request) -> Response:
+         """Proxy models listing to upstream."""
+         try:
+             result = await handler.models()
+             return JSONResponse(result)
+         except Exception as e:
+             import traceback
+             tb = traceback.format_exc()
+             click.echo(f"ERROR in models: {e}")
+             click.echo(tb)
+             return JSONResponse(
+                 {"error": {"message": str(e), "type": "proxy_error", "traceback": tb}},
+                 status_code=500,
+             )
+
+     @app.route("/health", methods=["GET"])
+     async def health(request: Request) -> Response:
+         """Health check endpoint."""
+         return JSONResponse({"status": "ok"})
+
+     return app
+
+
+ @click.command()
+ @click.option(
+     "--model_url",
+     required=True,
+     help="Base URL of the upstream model (e.g., http://localhost:11434)",
+ )
+ @click.option(
+     "--model_name",
+     default=None,
+     help="Model name to use (optional for some backends like vllm, sglang)",
+ )
+ @click.option(
+     "--api_key",
+     default="",
+     help="API key for upstream (if required)",
+ )
+ @click.option(
+     "--trick",
+     "tricks",
+     multiple=True,
+     help="Path to a trick module (can be specified multiple times)",
+ )
+ @click.option(
+     "--listen_on",
+     default="localhost:8080",
+     help="Host:port to listen on (default: localhost:8080)",
+ )
+ def cli(
+     model_url: str,
+     model_name: str | None,
+     api_key: str,
+     tricks: tuple[str, ...],
+     listen_on: str,
+ ) -> None:
+     """Petsitter - OpenAI-compatible proxy with tricks.
+
+     Example:
+
+     \b
+     petsitter --model_url http://localhost:11434 \\
+         --model_name llama3:8b \\
+         --trick tricks/tool_call.py \\
+         --trick tricks/json_mode.py \\
+         --listen_on localhost:8080
+     """
+     # Parse listen_on
+     if ":" in listen_on:
+         host, port_str = listen_on.rsplit(":", 1)
+         port = int(port_str)
+     else:
+         host = listen_on
+         port = 8080
+
+     app = create_app(model_url, model_name, api_key, list(tricks))
+
+     click.echo(f"Starting petsitter on {host}:{port}")
+     click.echo(f"Upstream: {model_url}")
+     if model_name:
+         click.echo(f"Model: {model_name}")
+     if tricks:
+         click.echo(f"Tricks: {', '.join(tricks)}")
+
+     # Configure logging from environment
+     log_level = os.getenv("LOGLEVEL", "INFO").upper()
+     logging.basicConfig(
+         level=getattr(logging, log_level, logging.INFO),
+         format="%(levelname)s: %(message)s"
+     )
+
+     uvicorn.run(app, host=host, port=port)
+
+
+ if __name__ == "__main__":
+     cli()
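Review note: when a client sets `stream`, `src/server.py` still waits for the full upstream reply and then emits it as a single chunk followed by `[DONE]`. The sketch below isolates that conversion as a pure function so it can be tested without a server. `completion_to_sse` is a hypothetical name, and casting `created` to `int` is an assumption about what OpenAI-style clients expect, not something the shipped code does.

```python
import json
import time


def completion_to_sse(result: dict) -> list:
    """Wrap a finished chat completion as one chunk event plus the [DONE] sentinel."""
    choice = result["choices"][0]
    message = choice["message"]
    delta = {"role": "assistant", "content": message.get("content")}
    if "tool_calls" in message:
        delta["tool_calls"] = message["tool_calls"]
    chunk = {
        "id": result.get("id", "chatcmpl-petsitter"),
        "object": "chat.completion.chunk",
        # Assumption: clients expect an integer Unix timestamp here.
        "created": int(result.get("created", time.time())),
        "model": result.get("model", "unknown"),
        "choices": [{
            "index": 0,
            "delta": delta,
            "finish_reason": choice.get("finish_reason", "stop"),
        }],
    }
    # Each server-sent event is a "data: ..." line terminated by a blank line.
    return [f"data: {json.dumps(chunk)}\n\n", "data: [DONE]\n\n"]
```

Because the whole reply arrives in one event, clients get a streaming-compatible wire format but no incremental tokens.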
src/trick.py ADDED
@@ -0,0 +1,114 @@
+ """Base Trick class and callmodel utility for petsitter."""
+
+ import json
+ from typing import Any
+
+ import httpx
+
+
+ class Trick:
+     """Base class for all petsitter tricks.
+
+     Subclass this and implement any of the hooks to add functionality.
+     """
+
+     def system_prompt(self, to_add: str) -> str:
+         """Add instructions to the system prompt.
+
+         Args:
+             to_add: The current system prompt content.
+
+         Returns:
+             Text to append to the system prompt; an empty string adds nothing.
+         """
+         return ""
+
+     def pre_hook(self, context: list, params: dict) -> list:
+         """Modify context before it reaches the model.
+
+         Args:
+             context: The conversation context (list of messages).
+             params: Request parameters (tools, model, etc.).
+
+         Returns:
+             Modified context.
+         """
+         return context
+
+     def post_hook(self, context: list) -> list:
+         """Modify context after the model responds but before the response is returned to the client.
+
+         Args:
+             context: The conversation context including model response.
+
+         Returns:
+             Modified context.
+         """
+         return context
+
+     def info(self, capabilities: dict) -> dict:
+         """Declare capabilities added by this trick.
+
+         Args:
+             capabilities: Current capabilities dict.
+
+         Returns:
+             Modified capabilities dict.
+         """
+         return capabilities
+
+
+ async def callmodel(
+     context: list,
+     instruction: str = "",
+     model_url: str = "",
+     model_name: str = "",
+     api_key: str = "",
+ ) -> list:
+     """Make a follow-up call to the model.
+
+     Used by tricks that need to retry or refine model output.
+
+     Args:
+         context: Current conversation context.
+         instruction: Optional system instruction to append.
+         model_url: Base URL of the model endpoint.
+         model_name: Name of the model to use.
+         api_key: API key if required.
+
+     Returns:
+         Updated context with model response.
+     """
+     if not model_url:
+         raise ValueError("model_url is required for callmodel")
+
+     messages = context.copy()
+     if instruction:
+         # Add instruction as system message or append to existing
+         if messages and messages[0].get("role") == "system":
+             messages[0]["content"] += f"\n{instruction}"
+         else:
+             messages.insert(0, {"role": "system", "content": instruction})
+
+     payload = {
+         "model": model_name or "default",
+         "messages": messages,
+     }
+
+     headers = {"Content-Type": "application/json"}
+     if api_key:
+         headers["Authorization"] = f"Bearer {api_key}"
+
+     async with httpx.AsyncClient() as client:
+         response = await client.post(
+             f"{model_url}/v1/chat/completions",
+             json=payload,
+             headers=headers,
+             timeout=60.0,
+         )
+         response.raise_for_status()
+         result = response.json()
+
+     # Extract the assistant's response
+     assistant_message = result["choices"][0]["message"]
+     return context + [assistant_message]
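Review note: `callmodel` copies the context with `context.copy()`, which is a shallow copy. The new list still holds the same message dicts, so appending the instruction to `messages[0]["content"]` also changes the caller's system message, and repeated retries would keep stacking instructions. The sketch below contrasts the two behaviours. Both function names are hypothetical, and both assume the first message is a system message.

```python
import copy


def append_instruction_shallow(context: list, instruction: str) -> list:
    """Mirror of the shallow-copy pattern: new list, shared message dicts."""
    messages = context.copy()
    messages[0]["content"] += f"\n{instruction}"  # also mutates context[0]
    return messages


def append_instruction_deep(context: list, instruction: str) -> list:
    """Deep-copy variant: the caller's messages are left untouched."""
    messages = copy.deepcopy(context)
    messages[0]["content"] += f"\n{instruction}"
    return messages
```

A lighter alternative to `deepcopy` would be `[dict(m) for m in context]`, which copies each message dict one level deep and is enough when only `content` strings are modified.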