@protolabsai/proto 0.29.0 → 0.31.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (3)
  1. package/README.md +22 -19
  2. package/cli.js +76 -25
  3. package/package.json +2 -2
package/README.md CHANGED
@@ -27,7 +27,7 @@ At-a-glance overview vs. upstream Qwen Code. For the full architectural breakdow
27
27
  | Ignore files | `.qwenignore` | `.protoignore` + inherits `.claudeignore` patterns |
28
28
  | ACP / Zed integration | Stock | Cron-in-Session, concurrent Agent calls, SSE/HTTP MCP, internal-part filtering |
29
29
  | Extra built-in tools | Standard set | + browser automation, repo-map (PageRank), task tools, mailbox, LSP, voice/STT |
30
- | Observability | Console | Langfuse OTLP traces with harness-intervention spans (SFT-ready) |
30
+ | Observability | Console | OTLP/HTTP to LGTM stack + Langfuse, opt-in, with `gen_ai.response.thinking` and harness-intervention spans (SFT-ready) |
31
31
  | Release pipeline | Manual | Conventional-commit auto-release (`feat:` → minor, `fix:` → patch) |
32
32
  | VS Code companion | Included | Removed (focus on TUI + ACP/Zed) |
33
33
 
@@ -206,53 +206,56 @@ Both no-op outside a TTY, in screen-reader mode, or under tmux/SSH.
206
206
 
207
207
  ## Observability
208
208
 
209
- proto supports [Langfuse](https://langfuse.com) tracing out of the box. Set three environment variables and every session is fully traced: LLM calls (all providers), tool executions, subagent lifecycles, and turn hierarchy.
209
+ proto ships OpenTelemetry-native, with both a Tempo/LGTM-style ops backend and Langfuse for prompt-grade trace UI. Both are **opt-in**: nothing is sent anywhere until `telemetry.enabled` is `true`.
210
210
 
211
211
  ### Setup
212
212
 
213
- Add to the `env` block in `~/.proto/settings.json`:
213
+ Add to `~/.proto/settings.json`:
214
214
 
215
215
  ```json
216
216
  {
217
+ "telemetry": { "enabled": true },
217
218
  "env": {
219
+ "OTEL_INGRESS_TOKEN": "<bearer token from your Infisical or vault>",
218
220
  "LANGFUSE_PUBLIC_KEY": "pk-lf-...",
219
221
  "LANGFUSE_SECRET_KEY": "sk-lf-...",
220
- "LANGFUSE_BASE_URL": "https://cloud.langfuse.com"
222
+ "LANGFUSE_BASE_URL": "https://your-langfuse-instance.example.com"
221
223
  }
222
224
  }
223
225
  ```
224
226
 
225
- `LANGFUSE_BASE_URL` is optional and defaults to `https://cloud.langfuse.com`. For a self-hosted instance, set it to your deployment URL.
227
+ With `telemetry.enabled = true`:
226
228
 
227
- > **Why `settings.json` and not `.env`?** proto walks up from your CWD loading `.env` files, so a project-level `.env` with Langfuse keys would bleed into proto's tracing and mix your traces into the wrong dataset. The `env` block in `settings.json` is proto-namespaced and completely isolated from your projects.
229
+ - **OTLP traces** ship to `https://otel.proto-labs.ai` over HTTP, bearer-auth via `OTEL_INGRESS_TOKEN`. Override `telemetry.otlpEndpoint` / `telemetry.otlpProtocol` to point at a local OTel collector or a different vendor.
230
+ - **Langfuse traces** ship to `LANGFUSE_BASE_URL` (defaults to `https://cloud.langfuse.com`) when both Langfuse keys are present.
231
+
232
+ Without `telemetry.enabled = true`, neither exporter activates regardless of env vars.
233
+
234
+ > **Why `settings.json` and not `.env`?** proto walks up from your CWD loading `.env` files, so a project-level `.env` with telemetry keys would bleed into proto's tracing and mix your traces into the wrong dataset. The `env` block in `settings.json` is proto-namespaced and completely isolated from your projects.
228
235
 
229
236
  ### What gets traced
230
237
 
231
- | Span | Attributes |
232
- | --------------------- | ---------------------------------------------------------------------------------------------------- |
233
- | `turn` | `session.id`, `turn.id` — root span per user prompt |
234
- | `gen_ai chat {model}` | `gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`, `gen_ai.request.model` — one per LLM call |
235
- | `tool/{name}` | `tool.name`, `tool.type`, `tool.duration_ms` — one per tool execution |
236
- | `agent/{name}` | `agent.name`, `agent.status`, `agent.duration_ms` — one per subagent |
238
+ | Span | Attributes |
239
+ | --------------------- | ----------------------------------------------------------------------------------------------------------------------------------- |
240
+ | `turn` | `session.id`, `turn.id` — root span per user prompt |
241
+ | `gen_ai chat {model}` | `gen_ai.usage.{input,output,thinking}_tokens`, `gen_ai.request.model`, `gen_ai.response.thinking` (when present) — one per LLM call |
242
+ | `tool/{name}` | `tool.name`, `tool.type`, `tool.duration_ms` — one per tool execution |
243
+ | `agent/{name}` | `agent.name`, `agent.status`, `agent.duration_ms` — one per subagent |
237
244
 
238
245
  All three provider backends are covered: OpenAI-compatible, Anthropic, and Gemini.
239
246
 
240
247
  ### Prompt content logging
241
248
 
242
- Full prompt messages and response text are included in traces by default. To disable:
249
+ Full prompt messages, response text, and reasoning text are included in traces by default. To disable:
243
250
 
244
251
  ```json
245
252
  // ~/.proto/settings.json
246
253
  {
247
- "telemetry": { "logPrompts": false }
254
+ "telemetry": { "enabled": true, "logPrompts": false }
248
255
  }
249
256
  ```
250
257
 
251
- > **Privacy note:** `logPrompts` is enabled by default. When enabled, full prompt and response content is sent to your Langfuse instance. Set to `false` if you want traces without message content.
252
-
253
- ### Langfuse activates independently
254
-
255
- Langfuse tracing activates from env vars alone — it does not require `telemetry.enabled: true` in settings. The general telemetry pipeline (OTLP/GCP) and Langfuse are independent.
258
+ > **Privacy note:** Telemetry is off by default. When you opt in, `logPrompts` defaults to `true` — full prompt, response, and reasoning content are attached to spans (truncated at 10K chars each). Set `logPrompts: false` if you want token counts and timings without message content.
256
259
 
257
260
  ## Task Management
258
261
 
package/cli.js CHANGED
@@ -93610,7 +93610,7 @@ var require_metadata3 = __commonJS({
93610
93610
  }
93611
93611
  }
93612
93612
  __name(validate3, "validate");
93613
- var Metadata = class _Metadata {
93613
+ var Metadata2 = class _Metadata {
93614
93614
  static {
93615
93615
  __name(this, "Metadata");
93616
93616
  }
@@ -93782,7 +93782,7 @@ var require_metadata3 = __commonJS({
93782
93782
  return result;
93783
93783
  }
93784
93784
  };
93785
- exports2.Metadata = Metadata;
93785
+ exports2.Metadata = Metadata2;
93786
93786
  var bufToString = /* @__PURE__ */ __name((val) => {
93787
93787
  return Buffer.isBuffer(val) ? val.toString("base64") : val;
93788
93788
  }, "bufToString");
@@ -116794,10 +116794,10 @@ var require_grpc_exporter_transport = __commonJS({
116794
116794
  exports2.createSslCredentials = createSslCredentials;
116795
116795
  function createEmptyMetadata() {
116796
116796
  const {
116797
- Metadata
116797
+ Metadata: Metadata2
116798
116798
  // eslint-disable-next-line @typescript-eslint/no-require-imports
116799
116799
  } = require_src11();
116800
- return new Metadata();
116800
+ return new Metadata2();
116801
116801
  }
116802
116802
  __name(createEmptyMetadata, "createEmptyMetadata");
116803
116803
  exports2.createEmptyMetadata = createEmptyMetadata;
@@ -141044,11 +141044,14 @@ function parseOtlpEndpoint(otlpEndpointSetting, protocol) {
141044
141044
  }
141045
141045
  }
141046
141046
  function initializeTelemetry(config2) {
141047
- const langfuse = buildLangfuseExporters();
141048
- if (telemetryInitialized || !config2.getTelemetryEnabled() && !langfuse) {
141047
+ const debugLogger164 = createDebugLogger("OTEL");
141048
+ if (telemetryInitialized || !config2.getTelemetryEnabled()) {
141049
+ if (!telemetryInitialized && (process.env["LANGFUSE_PUBLIC_KEY"] || process.env["LANGFUSE_SECRET_KEY"])) {
141050
+ debugLogger164.debug("Langfuse env vars detected but telemetry.enabled is false \u2014 skipping. Set telemetry.enabled = true in settings to opt in.");
141051
+ }
141049
141052
  return;
141050
141053
  }
141051
- const debugLogger164 = createDebugLogger("OTEL");
141054
+ const langfuse = buildLangfuseExporters();
141052
141055
  const resource = (0, import_resources.resourceFromAttributes)({
141053
141056
  [SemanticResourceAttributes.SERVICE_NAME]: SERVICE_NAME,
141054
141057
  [SemanticResourceAttributes.SERVICE_VERSION]: process.version,
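Reviewer note: the hunk above moves the Langfuse check behind the `telemetry.enabled` gate. The sketch below restates that decision as a pure function so the behavior change is easier to see. The name `decideTelemetry` and its parameter shape are hypothetical and do not appear in the bundle; only the branching mirrors the diff.

```javascript
// Hypothetical helper: restates the early-return logic of initializeTelemetry
// from the hunk above as a pure function. Not part of the package.
function decideTelemetry({ alreadyInitialized, enabled, env }) {
  if (alreadyInitialized) {
    // Already running: nothing to start, nothing to warn about.
    return { start: false, warnLangfuseIgnored: false };
  }
  if (!enabled) {
    // As of 0.31.0, Langfuse keys alone no longer activate tracing.
    // The bundle only emits a debug message in this case.
    const hasLangfuseKey = Boolean(
      env.LANGFUSE_PUBLIC_KEY || env.LANGFUSE_SECRET_KEY
    );
    return { start: false, warnLangfuseIgnored: hasLangfuseKey };
  }
  // Opted in: exporters (OTLP and, if keys are present, Langfuse) are built.
  return { start: true, warnLangfuseIgnored: false };
}
```

Under 0.29.0 the equivalent condition was `!enabled && !langfuse`, so a session with Langfuse keys but no `telemetry.enabled` would still have started tracing. With this gate it does not.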
@@ -141066,32 +141069,49 @@ function initializeTelemetry(config2) {
141066
141069
  let logExporter;
141067
141070
  let metricReader;
141068
141071
  if (useOtlp) {
141072
+ const otlpAuthToken = process.env["OTEL_INGRESS_TOKEN"];
141073
+ const otlpHeaders = otlpAuthToken ? { Authorization: `Bearer ${otlpAuthToken}` } : void 0;
141074
+ if (!otlpAuthToken && /otel\.proto-labs\.ai/.test(parsedEndpoint ?? "")) {
141075
+ debugLogger164.debug("OTEL_INGRESS_TOKEN not set; OTLP exports to otel.proto-labs.ai will return 401.");
141076
+ }
141069
141077
  if (otlpProtocol === "http") {
141078
+ const httpAuth = otlpHeaders ? { headers: otlpHeaders } : {};
141070
141079
  spanExporter = new import_exporter_trace_otlp_http.OTLPTraceExporter({
141071
- url: parsedEndpoint
141080
+ url: parsedEndpoint,
141081
+ ...httpAuth
141072
141082
  });
141073
141083
  logExporter = new import_exporter_logs_otlp_http.OTLPLogExporter({
141074
- url: parsedEndpoint
141084
+ url: parsedEndpoint,
141085
+ ...httpAuth
141075
141086
  });
141076
141087
  metricReader = new import_sdk_metrics2.PeriodicExportingMetricReader({
141077
141088
  exporter: new import_exporter_metrics_otlp_http.OTLPMetricExporter({
141078
- url: parsedEndpoint
141089
+ url: parsedEndpoint,
141090
+ ...httpAuth
141079
141091
  }),
141080
141092
  exportIntervalMillis: 1e4
141081
141093
  });
141082
141094
  } else {
141095
+ const grpcAuth = otlpAuthToken ? (() => {
141096
+ const m3 = new import_grpc_js.Metadata();
141097
+ m3.set("authorization", `Bearer ${otlpAuthToken}`);
141098
+ return { metadata: m3 };
141099
+ })() : {};
141083
141100
  spanExporter = new import_exporter_trace_otlp_grpc.OTLPTraceExporter({
141084
141101
  url: parsedEndpoint,
141085
- compression: CompressionAlgorithm.GZIP
141102
+ compression: CompressionAlgorithm.GZIP,
141103
+ ...grpcAuth
141086
141104
  });
141087
141105
  logExporter = new import_exporter_logs_otlp_grpc.OTLPLogExporter({
141088
141106
  url: parsedEndpoint,
141089
- compression: CompressionAlgorithm.GZIP
141107
+ compression: CompressionAlgorithm.GZIP,
141108
+ ...grpcAuth
141090
141109
  });
141091
141110
  metricReader = new import_sdk_metrics2.PeriodicExportingMetricReader({
141092
141111
  exporter: new import_exporter_metrics_otlp_grpc.OTLPMetricExporter({
141093
141112
  url: parsedEndpoint,
141094
- compression: CompressionAlgorithm.GZIP
141113
+ compression: CompressionAlgorithm.GZIP,
141114
+ ...grpcAuth
141095
141115
  }),
141096
141116
  exportIntervalMillis: 1e4
141097
141117
  });
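Reviewer note: the HTTP branch of the hunk above spreads an optional `headers` object into each exporter's options. A minimal sketch of that pattern follows, assuming only what the diff shows. The function names `buildHttpAuth`, `httpExporterOptions`, and `shouldWarnMissingToken` are hypothetical. The gRPC branch, which uses a `Metadata` object instead, is not reproduced here.

```javascript
// Hypothetical helpers illustrating the HTTP auth wiring in the hunk above.
// An absent or empty token yields no `headers` key at all.
function buildHttpAuth(token) {
  const headers = token ? { Authorization: `Bearer ${token}` } : undefined;
  return headers ? { headers } : {};
}

// Options object of the shape passed to each OTLP HTTP exporter in the diff.
function httpExporterOptions(url, token) {
  return { url, ...buildHttpAuth(token) };
}

// Mirrors the debug-warning condition: no token while targeting the
// hosted endpoint, which the bundle says will answer with 401.
function shouldWarnMissingToken(endpoint, token) {
  return !token && /otel\.proto-labs\.ai/.test(endpoint ?? "");
}
```

Spreading an empty object when no token is set keeps the exporter options identical to the 0.29.0 shape for users pointing at an unauthenticated local collector.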
@@ -141158,7 +141178,7 @@ async function shutdownTelemetry() {
141158
141178
  telemetryInitialized = false;
141159
141179
  }
141160
141180
  }
141161
- var import_exporter_trace_otlp_grpc, import_exporter_logs_otlp_grpc, import_exporter_metrics_otlp_grpc, import_exporter_trace_otlp_http, import_exporter_logs_otlp_http, import_exporter_metrics_otlp_http, import_sdk_node, import_resources, import_sdk_trace_node, import_sdk_logs, import_sdk_metrics2, import_instrumentation_http, sdk, telemetryInitialized;
141181
+ var import_exporter_trace_otlp_grpc, import_exporter_logs_otlp_grpc, import_exporter_metrics_otlp_grpc, import_exporter_trace_otlp_http, import_exporter_logs_otlp_http, import_exporter_metrics_otlp_http, import_grpc_js, import_sdk_node, import_resources, import_sdk_trace_node, import_sdk_logs, import_sdk_metrics2, import_instrumentation_http, sdk, telemetryInitialized;
141162
141182
  var init_sdk = __esm({
141163
141183
  "packages/core/dist/src/telemetry/sdk.js"() {
141164
141184
  "use strict";
@@ -141171,6 +141191,7 @@ var init_sdk = __esm({
141171
141191
  import_exporter_logs_otlp_http = __toESM(require_src21(), 1);
141172
141192
  import_exporter_metrics_otlp_http = __toESM(require_src18(), 1);
141173
141193
  init_esm3();
141194
+ import_grpc_js = __toESM(require_src11(), 1);
141174
141195
  import_sdk_node = __toESM(require_src34(), 1);
141175
141196
  init_esm2();
141176
141197
  import_resources = __toESM(require_src13(), 1);
@@ -155923,7 +155944,7 @@ var init_pipeline = __esm({
155923
155944
  this.converter = new OpenAIContentConverter(this.contentGeneratorConfig.model, this.contentGeneratorConfig.schemaCompliance, this.contentGeneratorConfig.modalities ?? {});
155924
155945
  }
155925
155946
  async execute(request3, userPromptId) {
155926
- const effectiveModel = this.contentGeneratorConfig.model;
155947
+ const effectiveModel = this.resolveEffectiveModel(request3);
155927
155948
  this.converter.setModel(effectiveModel);
155928
155949
  this.converter.setModalities(this.contentGeneratorConfig.modalities ?? {});
155929
155950
  return this.executeWithErrorHandling(request3, userPromptId, false, effectiveModel, async (openaiRequest) => {
@@ -155935,7 +155956,7 @@ var init_pipeline = __esm({
155935
155956
  });
155936
155957
  }
155937
155958
  async executeStream(request3, userPromptId) {
155938
- const effectiveModel = this.contentGeneratorConfig.model;
155959
+ const effectiveModel = this.resolveEffectiveModel(request3);
155939
155960
  this.converter.setModel(effectiveModel);
155940
155961
  this.converter.setModalities(this.contentGeneratorConfig.modalities ?? {});
155941
155962
  return this.executeWithErrorHandling(request3, userPromptId, true, effectiveModel, async (openaiRequest, context2) => {
@@ -156285,6 +156306,22 @@ var init_pipeline = __esm({
156285
156306
  context2.duration = Date.now() - context2.startTime;
156286
156307
  this.config.errorHandler.handle(error40, context2, request3);
156287
156308
  }
156309
+ /**
156310
+ * Resolve which model to actually send to the upstream. Defaults to the
156311
+ * configured model. Callers may opt into using `request.model` instead by
156312
+ * setting `request.config.allowModelOverride = true` — the request.model
156313
+ * string is used verbatim and the caller takes responsibility for it being
156314
+ * valid/available on the backend (e.g. recap → "protolabs/fast" alias).
156315
+ */
156316
+ resolveEffectiveModel(request3) {
156317
+ const configured = this.contentGeneratorConfig.model;
156318
+ const allowOverride = request3.config?.["allowModelOverride"] === true;
156319
+ const requested = request3.model;
156320
+ if (allowOverride && typeof requested === "string" && requested.length > 0) {
156321
+ return requested;
156322
+ }
156323
+ return configured;
156324
+ }
156288
156325
  /**
156289
156326
  * Create request context with common properties
156290
156327
  */
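Reviewer note: `resolveEffectiveModel` only honors `request.model` when the caller sets `allowModelOverride` to exactly `true`. The standalone sketch below reproduces that rule outside the pipeline class. Passing the configured model as a first argument is an assumption made for illustration; in the bundle it is read from `this.contentGeneratorConfig.model`.

```javascript
// Standalone restatement of the method added in the hunk above.
// `configuredModel` stands in for this.contentGeneratorConfig.model.
function resolveEffectiveModel(configuredModel, request) {
  // Strict comparison: truthy values such as "true" or 1 do not opt in.
  const allowOverride = request.config?.allowModelOverride === true;
  const requested = request.model;
  if (allowOverride && typeof requested === "string" && requested.length > 0) {
    return requested; // used verbatim; the caller vouches for its validity
  }
  return configuredModel;
}
```

For example, `resolveEffectiveModel("main-model", { model: "protolabs/fast", config: {} })` falls back to `"main-model"`, because the flag is missing. The model names here are placeholders.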
@@ -169046,7 +169083,7 @@ __export(geminiContentGenerator_exports, {
169046
169083
  createGeminiContentGenerator: () => createGeminiContentGenerator
169047
169084
  });
169048
169085
  function createGeminiContentGenerator(config2, gcConfig) {
169049
- const version2 = "0.29.0";
169086
+ const version2 = "0.31.0";
169050
169087
  const userAgent2 = config2.userAgent || `QwenCode/${version2} (${process.platform}; ${process.arch})`;
169051
169088
  const baseHeaders = {
169052
169089
  "User-Agent": userAgent2
@@ -191415,7 +191452,7 @@ var init_telemetry = __esm({
191415
191452
  TelemetryTarget2["QWEN"] = "qwen";
191416
191453
  })(TelemetryTarget || (TelemetryTarget = {}));
191417
191454
  DEFAULT_TELEMETRY_TARGET = TelemetryTarget.LOCAL;
191418
- DEFAULT_OTLP_ENDPOINT = "http://localhost:4317";
191455
+ DEFAULT_OTLP_ENDPOINT = "https://otel.proto-labs.ai";
191419
191456
  }
191420
191457
  });
191421
191458
 
@@ -275205,7 +275242,7 @@ var init_config3 = __esm({
275205
275242
  return this.telemetrySettings.otlpEndpoint ?? DEFAULT_OTLP_ENDPOINT;
275206
275243
  }
275207
275244
  getTelemetryOtlpProtocol() {
275208
- return this.telemetrySettings.otlpProtocol ?? "grpc";
275245
+ return this.telemetrySettings.otlpProtocol ?? "http";
275209
275246
  }
275210
275247
  getTelemetryTarget() {
275211
275248
  return this.telemetrySettings.target ?? DEFAULT_TELEMETRY_TARGET;
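Reviewer note: taken together, the two hunks above change the fallback OTLP target from `http://localhost:4317` over gRPC to `https://otel.proto-labs.ai` over HTTP. The sketch below shows how the nullish-coalescing defaults resolve. `resolveOtlpTarget` and the constant `DEFAULT_OTLP_PROTOCOL` are hypothetical names; the bundle inlines the `"http"` literal.

```javascript
// Values taken from the 0.31.0 side of the diff.
const DEFAULT_OTLP_ENDPOINT = "https://otel.proto-labs.ai";
const DEFAULT_OTLP_PROTOCOL = "http"; // hypothetical constant name

// Hypothetical helper combining the two getters shown above.
function resolveOtlpTarget(telemetrySettings = {}) {
  return {
    // `??` only falls back on null/undefined, so explicit values always win.
    endpoint: telemetrySettings.otlpEndpoint ?? DEFAULT_OTLP_ENDPOINT,
    protocol: telemetrySettings.otlpProtocol ?? DEFAULT_OTLP_PROTOCOL,
  };
}
```

A user who relied on the old implicit defaults would presumably need to set both values explicitly, for instance `resolveOtlpTarget({ otlpEndpoint: "http://localhost:4317", otlpProtocol: "grpc" })`, to keep exporting to a local gRPC collector.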
@@ -284977,6 +285014,13 @@ var init_followup = __esm({
284977
285014
  });
284978
285015
 
284979
285016
  // packages/core/dist/src/recap/recapGenerator.js
285017
+ function pickRecapModel(config2) {
285018
+ const available = config2.getModelsConfig().getAllConfiguredModels();
285019
+ if (available.some((m3) => m3.id === PREFERRED_RECAP_MODEL_ID)) {
285020
+ return { model: PREFERRED_RECAP_MODEL_ID, isOverride: true };
285021
+ }
285022
+ return { model: config2.getModel(), isOverride: false };
285023
+ }
284980
285024
  async function generateRecap(config2, conversationHistory, abortSignal) {
284981
285025
  if (conversationHistory.length === 0)
284982
285026
  return null;
@@ -284986,9 +285030,10 @@ async function generateRecap(config2, conversationHistory, abortSignal) {
284986
285030
  ...recent,
284987
285031
  { role: "user", parts: [{ text: RECAP_PROMPT }] }
284988
285032
  ];
285033
+ const { model, isOverride } = pickRecapModel(config2);
284989
285034
  const generator = config2.getContentGenerator();
284990
285035
  const response = await generator.generateContent({
284991
- model: config2.getModel(),
285036
+ model,
284992
285037
  contents,
284993
285038
  config: {
284994
285039
  abortSignal,
@@ -284997,7 +285042,11 @@ async function generateRecap(config2, conversationHistory, abortSignal) {
284997
285042
  // tool-stripping path. Without this, assistant turns containing
284998
285043
  // tool_calls — i.e. most of the agent's actual work — are dropped
284999
285044
  // before the request leaves, starving the recap of context.
285000
- tools: []
285045
+ tools: [],
285046
+ // Opt into the model override path in the OpenAI pipeline. Pipeline
285047
+ // ignores request.model by default for safety; for recap we know the
285048
+ // alias resolves on the gateway, so honor it.
285049
+ ...isOverride ? { allowModelOverride: true } : {}
285001
285050
  }
285002
285051
  }, "recap");
285003
285052
  const text = response.candidates?.[0]?.content?.parts?.map((p2) => p2.text ?? "").join("").trim();
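Reviewer note: the recap hunks above pair `pickRecapModel` with the new `allowModelOverride` flag, so the `protolabs/fast` alias is requested only when it is among the configured models. A self-contained sketch of that pairing follows. Passing the model list and current model as plain arguments, and the helper `buildRecapRequest`, are assumptions for illustration; the bundle reads both from `config2`.

```javascript
const PREFERRED_RECAP_MODEL_ID = "protolabs/fast"; // value from the diff

// Same selection rule as in the diff, with inputs passed explicitly.
function pickRecapModel(configuredModels, currentModel) {
  if (configuredModels.some((m) => m.id === PREFERRED_RECAP_MODEL_ID)) {
    return { model: PREFERRED_RECAP_MODEL_ID, isOverride: true };
  }
  return { model: currentModel, isOverride: false };
}

// Hypothetical helper: shapes the request the way generateRecap does,
// adding allowModelOverride only when the preferred alias was picked.
function buildRecapRequest(configuredModels, currentModel, contents) {
  const { model, isOverride } = pickRecapModel(configuredModels, currentModel);
  return {
    model,
    contents,
    config: {
      tools: [],
      ...(isOverride ? { allowModelOverride: true } : {}),
    },
  };
}
```

When the alias is not configured, the request carries the session's current model and no override flag, so the pipeline behaves as it did in 0.29.0.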
@@ -285011,7 +285060,7 @@ async function generateRecap(config2, conversationHistory, abortSignal) {
285011
285060
  return null;
285012
285061
  }
285013
285062
  }
285014
- var debugLogger99, RECENT_MESSAGE_WINDOW, RECAP_PROMPT;
285063
+ var debugLogger99, RECENT_MESSAGE_WINDOW, PREFERRED_RECAP_MODEL_ID, RECAP_PROMPT;
285015
285064
  var init_recapGenerator = __esm({
285016
285065
  "packages/core/dist/src/recap/recapGenerator.js"() {
285017
285066
  "use strict";
@@ -285019,11 +285068,13 @@ var init_recapGenerator = __esm({
285019
285068
  init_debugLogger();
285020
285069
  debugLogger99 = createDebugLogger("RECAP");
285021
285070
  RECENT_MESSAGE_WINDOW = 30;
285071
+ PREFERRED_RECAP_MODEL_ID = "protolabs/fast";
285022
285072
  RECAP_PROMPT = `That last agent turn was long. Summarize where we are so the user can pick back up cold.
285023
285073
 
285024
285074
  Write exactly 1-3 short sentences. Lead with the high-level goal \u2014 what they're building or debugging, not implementation details. Then state the concrete current status or next step. No status reports, no commit recaps, no apologies.
285025
285075
 
285026
285076
  Reply with ONLY the recap text \u2014 no headers, no quotes, no preamble.`;
285077
+ __name(pickRecapModel, "pickRecapModel");
285027
285078
  __name(generateRecap, "generateRecap");
285028
285079
  }
285029
285080
  });
@@ -414942,7 +414993,7 @@ __name(getPackageJson, "getPackageJson");
414942
414993
  // packages/cli/src/utils/version.ts
414943
414994
  async function getCliVersion() {
414944
414995
  const pkgJson = await getPackageJson();
414945
- return "0.29.0";
414996
+ return "0.31.0";
414946
414997
  }
414947
414998
  __name(getCliVersion, "getCliVersion");
414948
414999
 
@@ -422714,7 +422765,7 @@ var formatDuration = /* @__PURE__ */ __name((milliseconds) => {
422714
422765
 
422715
422766
  // packages/cli/src/generated/git-commit.ts
422716
422767
  init_esbuild_shims();
422717
- var GIT_COMMIT_INFO = "c4dafcfe9";
422768
+ var GIT_COMMIT_INFO = "d77ab4b1b";
422718
422769
 
422719
422770
  // packages/cli/src/utils/systemInfo.ts
422720
422771
  async function getNpmVersion() {
@@ -490880,7 +490931,7 @@ var QwenAgent = class {
490880
490931
  async initialize(args2) {
490881
490932
  this.clientCapabilities = args2.clientCapabilities;
490882
490933
  const authMethods = buildAuthMethods();
490883
- const version2 = "0.29.0";
490934
+ const version2 = "0.31.0";
490884
490935
  return {
490885
490936
  protocolVersion: PROTOCOL_VERSION,
490886
490937
  agentInfo: {
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@protolabsai/proto",
3
- "version": "0.29.0",
3
+ "version": "0.31.0",
4
4
  "description": "proto - AI-powered coding agent",
5
5
  "repository": {
6
6
  "type": "git",
@@ -21,7 +21,7 @@
21
21
  "bundled"
22
22
  ],
23
23
  "config": {
24
- "sandboxImageUri": "ghcr.io/qwenlm/qwen-code:0.29.0"
24
+ "sandboxImageUri": "ghcr.io/qwenlm/qwen-code:0.31.0"
25
25
  },
26
26
  "dependencies": {},
27
27
  "optionalDependencies": {