@openhoo/hoopilot 0.6.1 → 0.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -154,6 +154,18 @@ Equivalent environment variables:
154
154
 
155
155
  Incoming `x-request-id` headers are preserved on responses. If a request has no ID, Hoopilot generates one and returns it as `x-request-id`.
156
156
 
157
+ ## Metrics and usage
158
+
159
+ Hoopilot tracks token usage, request counts, and latency in memory while the server runs, and can report your GitHub Copilot account quota (premium-request "credit" usage).
160
+
161
+ - `GET /metrics` returns Prometheus text (`text/plain; version=0.0.4`). It exposes request counters (`hoopilot_requests_total`), upstream call counters (`hoopilot_upstream_requests_total`), token counters by model and type (`hoopilot_tokens_total{model,type}`), a request-duration histogram (`hoopilot_request_duration_seconds`), an in-flight gauge, and—once `/v1/usage` has been fetched at least once—Copilot quota gauges (`hoopilot_copilot_quota_remaining{category}`, `_entitlement`, `_used`, `_percent_remaining`). Counters reset to zero on restart, which Prometheus handles natively.
162
+ - `GET /v1/usage` returns JSON combining the proxy metrics snapshot with live Copilot quota fetched from GitHub (cached for 60 seconds). If the quota cannot be read, `copilot` is `null` and `copilot_error` explains why, but the proxy metrics are still returned.
163
+ - `hoopilot usage` prints your Copilot plan and quota from the command line.
164
+
165
+ Token usage is read from the upstream `usage` object. For streaming chat completions, usage is only available when the client sends `stream_options: {"include_usage": true}`; Hoopilot never injects it, so streamed chat requests without that flag contribute request and latency metrics but not token counts. The Responses API always reports usage, so streamed Responses requests are fully accounted.
166
+
167
+ `/metrics` and `/v1/usage` are subject to the same `HOOPILOT_API_KEY` gate as the other routes.
168
+
157
169
  ## Authentication
158
170
 
159
171
  Hoopilot supports one credential flow: GitHub Copilot OAuth browser login.
@@ -171,6 +183,7 @@ Supported authentication-related settings:
171
183
  - `HOOPILOT_GITHUB_CLIENT_ID`: GitHub OAuth app client ID override. The default uses the same GitHub Copilot OAuth app as opencode's Copilot provider.
172
184
  - `HOOPILOT_GITHUB_DOMAIN`: GitHub domain override. Default: `github.com`.
173
185
  - `COPILOT_API_BASE_URL`: upstream Copilot API base URL override. Default: `https://api.githubcopilot.com`.
186
+ - `HOOPILOT_GITHUB_API_BASE_URL`: GitHub REST API base URL used for the Copilot quota lookup. Default: `https://api.github.com`.
174
187
 
175
188
  ## Codex Auth Errors
176
189
 
@@ -201,6 +214,7 @@ If that returns `401 copilot_auth_error`, rerun `npx @openhoo/hoopilot login` an
201
214
  hoopilot [serve] [options]
202
215
  hoopilot login [options]
203
216
  hoopilot models [options]
217
+ hoopilot usage [options]
204
218
  ```
205
219
 
206
220
  Commands:
@@ -209,6 +223,7 @@ Commands:
209
223
  serve Start the proxy server (default)
210
224
  login Sign in through GitHub OAuth in a browser and verify Copilot access
211
225
  models List available GitHub Copilot model IDs
226
+ usage Show GitHub Copilot quota and premium-request usage
212
227
  update, upgrade Update hoopilot to the latest release
213
228
  ```
214
229
 
@@ -229,12 +244,14 @@ Options:
229
244
  ## Endpoints
230
245
 
231
246
  - `GET /healthz`
247
+ - `GET /metrics`
232
248
  - `GET /v1/models`
249
+ - `GET /v1/usage`
233
250
  - `POST /v1/chat/completions`
234
251
  - `POST /v1/responses`
235
252
  - `POST /v1/completions`
236
253
 
237
- `/v1/chat/completions` and `/v1/responses` are proxied to the matching Copilot endpoints as directly as possible. `/v1/completions` translates legacy completion requests and responses to the closest chat completions equivalent.
254
+ `/v1/chat/completions` and `/v1/responses` are proxied to the matching Copilot endpoints as directly as possible. `/v1/completions` translates legacy completion requests and responses to the closest chat completions equivalent. `GET /metrics` and `GET /v1/usage` report proxy metrics and Copilot quota (see [Metrics and usage](#metrics-and-usage)).
238
255
 
239
256
  ## Development
240
257