npm - @openhoo/hoopilot - Versions diffs - 0.6.0 → 0.7.0 - Mend

@openhoo/hoopilot 0.6.0 → 0.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (12) hide show

package/README.md CHANGED Viewed

@@ -97,20 +97,7 @@ $env:OPENAI_BASE_URL = "http://127.0.0.1:4141/v1"
 $env:OPENAI_API_KEY = "local-key"
 ```
-Use with Codex CLI after Hoopilot is running:
-```powershell
-$env:OPENAI_API_KEY = "local-key"
-codex -m gpt-5.5 -c 'model_reasoning_effort="xhigh"' -c 'openai_base_url="http://127.0.0.1:4141/v1"'
-```
-One-line PowerShell form:
-```powershell
-$env:OPENAI_API_KEY = "local-key"; codex -m gpt-5.5 -c 'model_reasoning_effort="xhigh"' -c 'openai_base_url="http://127.0.0.1:4141/v1"'
-```
-Or use the bundled `codexx` convenience command after Hoopilot is already running:
+Use with Codex CLI after Hoopilot is running, via the bundled `codexx` command. It runs Codex against the local server with the right model provider — selecting `gpt-5.5` over Copilot's Responses API, which a plain `openai_base_url` override does not configure (see the note below):
 ```powershell
 $env:HOOPILOT_API_KEY = "local-key"
@@ -167,6 +154,18 @@ Equivalent environment variables:
 Incoming `x-request-id` headers are preserved on responses. If a request has no ID, Hoopilot generates one and returns it as `x-request-id`.
+## Metrics and usage
+Hoopilot tracks token usage, request counts, and latency in memory while the server runs, and can report your GitHub Copilot account quota (premium-request "credit" usage).
+- `GET /metrics` returns Prometheus text (`text/plain; version=0.0.4`). It exposes request counters (`hoopilot_requests_total`), upstream call counters (`hoopilot_upstream_requests_total`), token counters by model and type (`hoopilot_tokens_total{model,type}`), a request-duration histogram (`hoopilot_request_duration_seconds`), an in-flight gauge, and—once `/v1/usage` has been fetched at least once—Copilot quota gauges (`hoopilot_copilot_quota_remaining{category}`, `_entitlement`, `_used`, `_percent_remaining`). Counters reset to zero on restart, which Prometheus handles natively.
+- `GET /v1/usage` returns JSON combining the proxy metrics snapshot with live Copilot quota fetched from GitHub (cached for 60 seconds). If the quota cannot be read, `copilot` is `null` and `copilot_error` explains why, but the proxy metrics are still returned.
+- `hoopilot usage` prints your Copilot plan and quota from the command line.
+Token usage is read from the upstream `usage` object. For streaming chat completions, usage is only available when the client sends `stream_options: {"include_usage": true}`; Hoopilot never injects it, so streamed chat requests without that flag contribute request and latency metrics but not token counts. The Responses API always reports usage, so streamed Responses requests are fully accounted.
+`/metrics` and `/v1/usage` are subject to the same `HOOPILOT_API_KEY` gate as the other routes.
 ## Authentication
 Hoopilot supports one credential flow: GitHub Copilot OAuth browser login.
@@ -184,6 +183,7 @@ Supported authentication-related settings:
 - `HOOPILOT_GITHUB_CLIENT_ID`: GitHub OAuth app client ID override. The default uses the same GitHub Copilot OAuth app as opencode's Copilot provider.
 - `HOOPILOT_GITHUB_DOMAIN`: GitHub domain override. Default: `github.com`.
 - `COPILOT_API_BASE_URL`: upstream Copilot API base URL override. Default: `https://api.githubcopilot.com`.
+- `HOOPILOT_GITHUB_API_BASE_URL`: GitHub REST API base URL used for the Copilot quota lookup. Default: `https://api.github.com`.
 ## Codex Auth Errors
@@ -203,7 +203,7 @@ Then, in another PowerShell session:
 $env:OPENAI_API_KEY = "local-key"
 Invoke-RestMethod -Headers @{ Authorization = "Bearer $env:OPENAI_API_KEY" } `
   http://127.0.0.1:4141/v1/models
-codex -m gpt-5.5 -c 'model_reasoning_effort="xhigh"' -c 'openai_base_url="http://127.0.0.1:4141/v1"'
+codexx
 ```
 If that returns `401 copilot_auth_error`, rerun `npx @openhoo/hoopilot login` and confirm the GitHub account has active Copilot access.
@@ -214,6 +214,7 @@ If that returns `401 copilot_auth_error`, rerun `npx @openhoo/hoopilot login` an
 hoopilot [serve] [options]
 hoopilot login [options]
 hoopilot models [options]
+hoopilot usage [options]
 ```
 Commands:
@@ -222,6 +223,7 @@ Commands:
 serve                             Start the proxy server (default)
 login                             Sign in through GitHub OAuth in a browser and verify Copilot access
 models                            List available GitHub Copilot model IDs
+usage                             Show GitHub Copilot quota and premium-request usage
 update, upgrade                   Update hoopilot to the latest release
 ```
@@ -230,7 +232,7 @@ Options:
 ```txt
 -p, --port <port>                 Port to listen on. Default: 4141
     --host <host>                 Host to listen on. Default: 127.0.0.1
-    --api-key <key>               Require clients to send Authorization: Bearer <key>
+    --api-key <key>               Require clients to send Authorization: Bearer <key> or x-api-key: <key>
     --auth-file <path>            OAuth credential store path
     --copilot-api-base-url <url>  Copilot API base URL override
     --log-level <level>           trace, debug, info, warn, error, fatal, or silent
@@ -242,12 +244,14 @@ Options:
 ## Endpoints
 - `GET /healthz`
+- `GET /metrics`
 - `GET /v1/models`
+- `GET /v1/usage`
 - `POST /v1/chat/completions`
 - `POST /v1/responses`
 - `POST /v1/completions`
-`/v1/chat/completions` and `/v1/responses` are proxied to the matching Copilot endpoints as directly as possible. `/v1/completions` translates legacy completion requests and responses to the closest chat completions equivalent.
+`/v1/chat/completions` and `/v1/responses` are proxied to the matching Copilot endpoints as directly as possible. `/v1/completions` translates legacy completion requests and responses to the closest chat completions equivalent. `GET /metrics` and `GET /v1/usage` report proxy metrics and Copilot quota (see [Metrics and usage](#metrics-and-usage)).
 ## Development