@ramarivera/coding-agent-langfuse 0.1.42 → 0.1.44

package/README.md CHANGED
@@ -3,9 +3,10 @@
  Universal coding-agent Langfuse backfiller and OTLP exporter helpers.

  It imports local histories from Codex, Claude Code, Grok, OpenCode, and Pi into
- Langfuse as session traces with child observations. LLM usage records are kept
- as observation metadata so historical imports do not create Langfuse billing or
- cost rows. Tool calls remain child spans under the same session.
+ Langfuse as session traces with child observations. LLM generations include
+ Langfuse canonical `usage_details` and `cost_details` attributes so historical
+ backfills participate in Langfuse model-usage and cost dashboards. Tool calls
+ remain child spans under the same session.

  ```sh
  coding-agent-langfuse-backfill --agents codex,claude,grok,pi,opencode
@@ -35,6 +36,56 @@ npx @ramarivera/coding-agent-langfuse@latest \
  The importer is fail-fast: the first failed OTLP POST stops the run, prints the
  real network cause, and preserves local state so reruns resume cleanly.

+ ## Cost calculation
+
+ Backfill cost calculation follows Langfuse's OpenTelemetry mapping: generation
+ spans receive `langfuse.observation.usage_details` and
+ `langfuse.observation.cost_details` JSON attributes. If a source history already
+ records a total cost, that recorded value wins. Otherwise, the importer
+ calculates per-usage-type USD costs from a model catalog using rates in USD per
+ 1M tokens.
+
+ The built-in catalog covers OpenAI GPT-5.5, GPT-5.4, and GPT-5.3-Codex API list
+ pricing, Anthropic Claude Opus/Sonnet 4 API list pricing, plus the toolbox/Pi
+ models already used in local configuration, including Fireworks Kimi K2.6,
+ Fireworks DeepSeek V4 Pro, MiniMax-M3, Together DeepSeek/Kimi/GLM/MiniMax, and
+ Zai GLM. `gpt-5.5` is charged at current standard API list price by default:
+ `$5.00` input, `$0.50` cached input, and `$30.00` output per 1M tokens. GPT-5.5
+ Pro defaults to `$30.00` input and `$180.00` output per 1M tokens. Claude Opus 4
+ models default to `$15.00` input, `$1.50` cache hits, `$18.75` 5-minute cache
+ writes, `$30.00` 1-hour cache writes, and `$75.00` output per 1M tokens.
+ When a source only records a total token count without input/output/cache
+ breakdown, the importer charges that total at the model input rate and marks the
+ cost source as `calculated_total_as_input`.
+
+ Use an override only when you intentionally want a different accounting policy:
+
+ ```sh
+ npx @ramarivera/coding-agent-langfuse@latest \
+   --agents codex \
+   --cost-rates-json '{"gpt-5.5":{"input":1,"output":2,"cacheRead":0.1,"cacheWrite":0}}'
+ ```
+
+ You can also keep the policy in a JSON file and pass `--cost-rates PATH`, or set
+ `CODING_AGENT_LANGFUSE_COST_RATES_PATH` /
+ `CODING_AGENT_LANGFUSE_COST_RATES_JSON` for both manual backfills and generated
+ services. A file can be either a direct model map or `{ "rates": { ... } }`:
+
+ ```json
+ {
+   "rates": {
+     "gpt-5.5": {
+       "input": 1,
+       "output": 2,
+       "cacheRead": 0.1,
+       "cacheWrite": 0,
+       "cacheWrite5m": 0,
+       "cacheWrite1h": 0
+     }
+   }
+ }
+ ```
+
  ## Follow as a host service

  Install a live follower directly from npm. The generated service keeps inference
@@ -95,8 +146,26 @@ npx @ramarivera/coding-agent-langfuse@latest \
    --endpoint https://langfuse.ai.roxasroot.net/otel/v1/traces
  ```

- Deduplication is state-file based and keyed by agent, session id, and source
- record id. Reuse the same `--state` path for repeat repairs on a host.
+ Deduplication is state-file based and keyed by importer state identity, agent,
+ session id, and source record id. Reuse the same `--state` path for normal
+ incremental runs.
+
+ For an intentional repair replay, add `--force` to resend the selected window
+ even when the state file says those events were already sent:
+
+ ```sh
+ npx @ramarivera/coding-agent-langfuse@latest \
+   --agents claude,codex,grok,pi,opencode \
+   --since 2026-05-01T00:00:00Z \
+   --until 2026-06-01T00:00:00Z \
+   --force \
+   --endpoint https://langfuse.ai.roxasroot.net/otel/v1/traces
+ ```
+
+ The Langfuse trace/span IDs intentionally stay pinned to the original pre-cost
+ identity, while the state-file key can advance with importer payload changes.
+ That lets cost repairs replace historical zero-cost rows instead of creating a
+ new duplicate identity for the same source event.

  ## Verification

@@ -112,7 +181,7 @@ npm run test:e2e
  The e2e suite verifies:

  - Codex full session traces with messages, reasoning, tool calls, tool results,
-   and usage metadata
+   usage details, and cost details
  - Follow mode picking up newly written Codex events
  - One CLI run posting reconstructable traces for Claude Code, Codex, Grok,
    OpenCode, and Pi
@@ -6,8 +6,11 @@ type Usage = {
      reasoning?: number;
      cacheRead?: number;
      cacheWrite?: number;
+     cacheWrite5m?: number;
+     cacheWrite1h?: number;
      total?: number;
      cost?: number;
+     inputIncludesCache?: boolean;
  };
  type BackfillEvent = {
      agent: AgentName;
@@ -33,6 +36,7 @@ type BackfillOptions = {
      statePath: string;
      homeDir: string;
      dryRun: boolean;
+     force: boolean;
      follow: boolean;
      pollIntervalMs: number;
      idleExitAfterMs?: number;
@@ -44,6 +48,21 @@ type BackfillOptions = {
      maxRequestBytes: number;
      maxFieldBytes: number;
      postDelayMs: number;
+     costRates: CostCatalog;
+ };
+ type CostRates = {
+     input?: number;
+     output?: number;
+     reasoning?: number;
+     cacheRead?: number;
+     cacheWrite?: number;
+     cacheWrite5m?: number;
+     cacheWrite1h?: number;
+ };
+ type CostCatalog = Record<string, CostRates>;
+ type OtlpOptions = {
+     maxFieldBytes?: number;
+     costRates?: CostCatalog;
  };
  type RunSummary = {
      discovered: Record<string, number>;
@@ -77,9 +96,7 @@ declare function opencodeEvents(homeDir: string, options?: {
      untilMs?: number;
  }): BackfillEvent[];
  declare function fingerprint(event: BackfillEvent): string;
- declare function toOtlp(events: BackfillEvent[], options?: {
-     maxFieldBytes?: number;
- }): Record<string, unknown>;
+ declare function toOtlp(events: BackfillEvent[], options?: OtlpOptions): Record<string, unknown>;
  declare function discoverEvents(options: BackfillOptions): BackfillEvent[];
  declare function run(options: BackfillOptions): Promise<RunSummary>;
  declare function follow(options: BackfillOptions): Promise<FollowSummary>;
package/dist/backfill.js CHANGED
@@ -5,20 +5,178 @@ import { existsSync, mkdirSync, renameSync, readdirSync, readFileSync, statSync,
  import { hostname, homedir } from "node:os";
  import { dirname, join } from "node:path";
  const allAgents = ["claude", "codex", "grok", "opencode", "pi"];
- const importIdentityVersion = "v8-cached-input-token-split";
- const importIdentityVersions = {
+ const importStateIdentityVersion = "v9-cost-details";
+ const importStateIdentityVersions = {
+     claude: "v12-cost-details",
+     codex: "v10-cost-details",
+     grok: "v12-cost-details",
+     opencode: "v11-cost-details",
+     pi: "v12-cost-details",
+ };
+ const langfuseIdIdentityVersion = "v8-cached-input-token-split";
+ const langfuseIdIdentityVersions = {
      claude: "v11-tool-results",
      codex: "v9-codex-conversation-events",
      grok: "v11-chat-history-only",
      opencode: "v10-opencode-message-parts",
      pi: "v11-tool-results",
  };
+ const importPayloadVersion = "v10-cost-details";
  const defaultEndpoint = "https://langfuse.ai.roxasroot.net/otel/v1/traces";
  const deadRemoteEndpoint = "http://langfuse.ai.roxasroot.net:14318/v1/traces";
  const defaultMaxRequestBytes = 12 * 1024 * 1024;
  const defaultMaxFieldBytes = 512 * 1024;
  const defaultStatePath = join(homedir(), ".local/state/coding-agent-langfuse/backfill-v6.json");
  const currentHost = hostname();
+ const kimiFirepassRates = {
+     input: 2,
+     output: 8,
+     cacheRead: 0.3,
+     cacheWrite: 0,
+ };
+ const deepseekFireworksRates = {
+     input: 1.74,
+     output: 3.48,
+     cacheRead: 0.15,
+     cacheWrite: 0,
+ };
+ const miniMaxM3Rates = {
+     input: 0.6,
+     output: 2.4,
+     cacheRead: 0.12,
+     cacheWrite: 0,
+ };
+ const gpt55Rates = {
+     input: 5,
+     output: 30,
+     cacheRead: 0.5,
+     cacheWrite: 5,
+ };
+ const gpt55ProRates = {
+     input: 30,
+     output: 180,
+     cacheRead: 30,
+     cacheWrite: 30,
+ };
+ const gpt54Rates = {
+     input: 2.5,
+     output: 15,
+     cacheRead: 0.25,
+     cacheWrite: 2.5,
+ };
+ const gpt53CodexRates = {
+     input: 1.75,
+     output: 14,
+     cacheRead: 0.175,
+     cacheWrite: 1.75,
+ };
+ const claudeOpus4Rates = {
+     input: 15,
+     output: 75,
+     cacheRead: 1.5,
+     cacheWrite: 18.75,
+     cacheWrite5m: 18.75,
+     cacheWrite1h: 30,
+ };
+ const claudeSonnet4Rates = {
+     input: 3,
+     output: 15,
+     cacheRead: 0.3,
+     cacheWrite: 3.75,
+     cacheWrite5m: 3.75,
+     cacheWrite1h: 6,
+ };
+ const defaultCostRates = {
+     "gpt-5.5": gpt55Rates,
+     "openai/gpt-5.5": gpt55Rates,
+     "gpt-5.5-pro": gpt55ProRates,
+     "openai/gpt-5.5-pro": gpt55ProRates,
+     "gpt-5.4": gpt54Rates,
+     "openai/gpt-5.4": gpt54Rates,
+     "gpt-5.3-codex": gpt53CodexRates,
+     "openai/gpt-5.3-codex": gpt53CodexRates,
+     "claude-opus-4": claudeOpus4Rates,
+     "anthropic/claude-opus-4": claudeOpus4Rates,
+     "claude-opus-4-1": claudeOpus4Rates,
+     "anthropic/claude-opus-4-1": claudeOpus4Rates,
+     "claude-opus-4-6": claudeOpus4Rates,
+     "anthropic/claude-opus-4-6": claudeOpus4Rates,
+     "claude-opus-4-7": claudeOpus4Rates,
+     "anthropic/claude-opus-4-7": claudeOpus4Rates,
+     "claude-opus-4-8": claudeOpus4Rates,
+     "anthropic/claude-opus-4-8": claudeOpus4Rates,
+     "claude-sonnet-4": claudeSonnet4Rates,
+     "anthropic/claude-sonnet-4": claudeSonnet4Rates,
+     "claude-sonnet-4-5": claudeSonnet4Rates,
+     "anthropic/claude-sonnet-4-5": claudeSonnet4Rates,
+     "claude-sonnet-4.5": claudeSonnet4Rates,
+     "anthropic/claude-sonnet-4.5": claudeSonnet4Rates,
+     "claude-sonnet-4-6": claudeSonnet4Rates,
+     "anthropic/claude-sonnet-4-6": claudeSonnet4Rates,
+     "claude-sonnet-4.6": claudeSonnet4Rates,
+     "anthropic/claude-sonnet-4.6": claudeSonnet4Rates,
+     "accounts/fireworks/routers/kimi-k2p6-turbo": kimiFirepassRates,
+     "fireworks-firepass/accounts/fireworks/routers/kimi-k2p6-turbo": kimiFirepassRates,
+     "kimi-for-coding": kimiFirepassRates,
+     "accounts/fireworks/models/deepseek-v4-pro": deepseekFireworksRates,
+     "fireworks/accounts/fireworks/models/deepseek-v4-pro": deepseekFireworksRates,
+     "MiniMax-M3": miniMaxM3Rates,
+     "minimax/MiniMax-M3": miniMaxM3Rates,
+     "together/deepseek-ai/DeepSeek-V4-Pro": {
+         input: 2.1,
+         output: 4.4,
+         cacheRead: 0.2,
+         cacheWrite: 0,
+     },
+     "deepseek-ai/DeepSeek-V4-Pro": {
+         input: 2.1,
+         output: 4.4,
+         cacheRead: 0.2,
+         cacheWrite: 0,
+     },
+     "together/zai-org/GLM-5.1": {
+         input: 1.4,
+         output: 4.4,
+         cacheRead: 0.2,
+         cacheWrite: 0,
+     },
+     "zai-org/GLM-5.1": {
+         input: 1.4,
+         output: 4.4,
+         cacheRead: 0.2,
+         cacheWrite: 0,
+     },
+     "together/moonshotai/Kimi-K2.6": {
+         input: 1.2,
+         output: 4.5,
+         cacheRead: 0.2,
+         cacheWrite: 0,
+     },
+     "moonshotai/Kimi-K2.6": {
+         input: 1.2,
+         output: 4.5,
+         cacheRead: 0.2,
+         cacheWrite: 0,
+     },
+     "together/MiniMaxAI/MiniMax-M2.7": {
+         input: 0.3,
+         output: 1.2,
+         cacheRead: 0.06,
+         cacheWrite: 0,
+     },
+     "MiniMax-M2.7": {
+         input: 0.3,
+         output: 1.2,
+         cacheRead: 0.06,
+         cacheWrite: 0,
+     },
+     "zai/glm-5.1": {
+         input: 1.4,
+         output: 4.4,
+         cacheRead: 0.2,
+         cacheWrite: 0,
+     },
+ };
  function projectMetadata(cwd) {
      if (!cwd)
          return {};
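
The catalog above registers most models under both a bare key and a provider-prefixed key. The following is a minimal sketch of how such a table can be consulted, assuming a lookup that tries `provider/model` before the bare model name. The `findCostRates` function later in this diff follows that order and additionally tries a normalized model name, which is left out here. `lookupRates` and the two-entry `catalog` are hypothetical names used only for illustration; the rates are copied from the hunk above.

```javascript
// Two entries copied from the default catalog above (USD per 1M tokens).
const gpt55Rates = { input: 5, output: 30, cacheRead: 0.5, cacheWrite: 5 };
const catalog = {
  "gpt-5.5": gpt55Rates,
  "openai/gpt-5.5": gpt55Rates,
};

// Hypothetical helper: try the provider-qualified key first, then the bare model.
function lookupRates(event, costRates) {
  const candidates = [
    event.provider && event.model ? `${event.provider}/${event.model}` : undefined,
    event.model,
  ];
  for (const candidate of candidates) {
    if (candidate && costRates[candidate]) {
      return { modelKey: candidate, rates: costRates[candidate] };
    }
  }
  return undefined;
}

const qualified = lookupRates({ provider: "openai", model: "gpt-5.5" }, catalog);
const bare = lookupRates({ provider: "some-proxy", model: "gpt-5.5" }, catalog);
const missing = lookupRates({ model: "not-in-catalog" }, catalog);
console.log(qualified.modelKey, bare.modelKey, missing);
```

A model that matches no key yields `undefined`, which in the package's `calculateCost` means no calculated cost is attached to the span.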
@@ -46,10 +204,13 @@ Options:
  --max-request-bytes N Split OTLP POSTs below this JSON byte size (default: ${defaultMaxRequestBytes})
  --max-field-bytes N Truncate individual input/output fields above this byte size (default: ${defaultMaxFieldBytes})
  --post-delay-ms N Delay after each successful OTLP POST (default: 0)
+ --cost-rates PATH JSON model cost-rate overrides in USD per 1M tokens
+ --cost-rates-json JSON Inline JSON model cost-rate overrides in USD per 1M tokens
  --follow Keep scanning and sending newly written events
  --poll-interval-ms N Delay between --follow scans (default: 5000)
  --idle-exit-after-ms N Stop --follow after this much time without new sends
  --dry-run Discover and dedupe without sending or mutating state
+ --force Resend discovered events even when present in local state
  --help Show this help
  `;
  }
@@ -58,6 +219,7 @@ function parseArgs(argv) {
      let statePath = process.env.LANGFUSE_BACKFILL_STATE ?? defaultStatePath;
      let homeDir = process.env.HOME ?? homedir();
      let dryRun = false;
+     let force = false;
      let limit;
      let sinceMs;
      let untilMs;
@@ -71,6 +233,7 @@ function parseArgs(argv) {
      let postDelayMs = Number.parseInt(process.env.LANGFUSE_BACKFILL_POST_DELAY_MS ?? "", 10);
      if (!Number.isFinite(postDelayMs))
          postDelayMs = 0;
+     let costRates = loadCostCatalogFromEnv();
      let follow = false;
      let pollIntervalMs = 5_000;
      let idleExitAfterMs;
@@ -91,6 +254,9 @@ function parseArgs(argv) {
          if (arg === "--dry-run") {
              dryRun = true;
          }
+         else if (arg === "--force") {
+             force = true;
+         }
          else if (arg === "--endpoint") {
              endpoint = normalizeEndpoint(next());
          }
@@ -125,6 +291,12 @@ function parseArgs(argv) {
          else if (arg === "--post-delay-ms") {
              postDelayMs = Number.parseInt(next(), 10);
          }
+         else if (arg === "--cost-rates") {
+             costRates = mergeCostCatalog(costRates, loadCostCatalogFile(next()));
+         }
+         else if (arg === "--cost-rates-json") {
+             costRates = mergeCostCatalog(costRates, parseCostCatalogJson(next()));
+         }
          else if (arg === "--follow") {
              follow = true;
          }
@@ -179,6 +351,7 @@ function parseArgs(argv) {
          statePath,
          homeDir,
          dryRun,
+         force,
          follow,
          pollIntervalMs,
          idleExitAfterMs,
@@ -190,8 +363,91 @@ function parseArgs(argv) {
          maxRequestBytes,
          maxFieldBytes,
          postDelayMs,
+         costRates,
      };
  }
+ function loadCostCatalogFromEnv() {
+     let catalog = { ...defaultCostRates };
+     const path = process.env.CODING_AGENT_LANGFUSE_COST_RATES_PATH ??
+         process.env.LANGFUSE_BACKFILL_COST_RATES_PATH;
+     const inlineJson = process.env.CODING_AGENT_LANGFUSE_COST_RATES_JSON ??
+         process.env.LANGFUSE_BACKFILL_COST_RATES_JSON;
+     if (path)
+         catalog = mergeCostCatalog(catalog, loadCostCatalogFile(path));
+     if (inlineJson)
+         catalog = mergeCostCatalog(catalog, parseCostCatalogJson(inlineJson));
+     return catalog;
+ }
+ function loadCostCatalogFile(path) {
+     return parseCostCatalogJson(readFileSync(path, "utf8"), path);
+ }
+ function parseCostCatalogJson(json, source = "inline JSON") {
+     let parsed;
+     try {
+         parsed = JSON.parse(json);
+     }
+     catch (error) {
+         throw new Error(`Invalid cost rates ${source}: ${describeError(error)}`);
+     }
+     const root = asRecord(parsed);
+     const nestedRates = asRecord(root.rates);
+     const entries = Object.keys(nestedRates).length > 0 ? nestedRates : root;
+     const catalog = {};
+     for (const [modelKey, rawRates] of Object.entries(entries)) {
+         const rates = normalizeCostRates(rawRates, modelKey, source);
+         if (rates)
+             catalog[modelKey] = rates;
+     }
+     return catalog;
+ }
+ function normalizeCostRates(value, modelKey, source) {
+     const record = asRecord(value);
+     const rates = {
+         input: asNumber(record.input) ??
+             asNumber(record.input_cost) ??
+             asNumber(record.inputPerMillion),
+         output: asNumber(record.output) ??
+             asNumber(record.output_cost) ??
+             asNumber(record.outputPerMillion),
+         reasoning: asNumber(record.reasoning) ??
+             asNumber(record.reasoning_cost) ??
+             asNumber(record.reasoningPerMillion),
+         cacheRead: asNumber(record.cacheRead) ??
+             asNumber(record.cache_read) ??
+             asNumber(record.cachedInput) ??
+             asNumber(record.input_cached_tokens),
+         cacheWrite: asNumber(record.cacheWrite) ??
+             asNumber(record.cache_write) ??
+             asNumber(record.inputCacheCreation) ??
+             asNumber(record.input_cache_creation),
+         cacheWrite5m: asNumber(record.cacheWrite5m) ??
+             asNumber(record.cache_write_5m) ??
+             asNumber(record.inputCacheCreation5m) ??
+             asNumber(record.input_cache_creation_5m),
+         cacheWrite1h: asNumber(record.cacheWrite1h) ??
+             asNumber(record.cache_write_1h) ??
+             asNumber(record.inputCacheCreation1h) ??
+             asNumber(record.input_cache_creation_1h),
+     };
+     const values = Object.entries(rates).filter(([, rate]) => rate !== undefined);
+     if (values.length === 0)
+         return undefined;
+     const cleanRates = {};
+     for (const [name, rate] of values) {
+         if (rate === undefined || rate < 0) {
+             throw new Error(`Invalid ${name} cost rate for '${modelKey}' in ${source}; rates must be non-negative USD per 1M tokens.`);
+         }
+         cleanRates[name] = rate;
+     }
+     return cleanRates;
+ }
+ function mergeCostCatalog(base, override) {
+     const merged = { ...base };
+     for (const [modelKey, rates] of Object.entries(override)) {
+         merged[modelKey] = { ...(merged[modelKey] ?? {}), ...rates };
+     }
+     return merged;
+ }
  function normalizeEndpoint(endpoint) {
      if (endpoint !== deadRemoteEndpoint)
          return endpoint;
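
The override path in the hunk above merges per model and per field, and accepts either a direct model map or a `{ "rates": { ... } }` wrapper. Here is a small sketch of those two behaviors. `mergeCostCatalog` is copied from the hunk; `unwrapRates` is a hypothetical simplification of the wrapper handling in `parseCostCatalogJson`, which in the package goes through `asRecord` and `normalizeCostRates` (not reproduced here).

```javascript
// Per-model shallow merge, as in mergeCostCatalog above.
function mergeCostCatalog(base, override) {
  const merged = { ...base };
  for (const [modelKey, rates] of Object.entries(override)) {
    merged[modelKey] = { ...(merged[modelKey] ?? {}), ...rates };
  }
  return merged;
}

// Hypothetical helper: accept a direct model map or a { "rates": { ... } } wrapper.
function unwrapRates(parsed) {
  const nested = parsed.rates;
  const hasNested =
    nested !== null && typeof nested === "object" && Object.keys(nested).length > 0;
  return hasNested ? nested : parsed;
}

// gpt-5.5 defaults from this diff; the override changes only the input rate.
const defaults = { "gpt-5.5": { input: 5, output: 30, cacheRead: 0.5, cacheWrite: 5 } };
const override = unwrapRates(JSON.parse('{"rates":{"gpt-5.5":{"input":1}}}'));
const merged = mergeCostCatalog(defaults, override);
console.log(merged["gpt-5.5"]);
```

Because the merge is per field, an override that sets only `input` leaves the built-in `output` and cache rates in force. To zero out a rate, it has to be listed explicitly, as the README example does with `"cacheWrite": 0`.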
@@ -310,12 +566,29 @@ function normalizeUsage(value) {
      const record = asRecord(value);
      const nestedCost = asRecord(record.cost);
      const cache = asRecord(record.cache);
+     const cacheCreation = asRecord(record.cache_creation);
      const inputDetails = asRecord(record.input_tokens_details);
      const outputDetails = asRecord(record.output_tokens_details);
+     const directInput = asNumber(record.input);
+     const aggregateInput = asNumber(record.input_tokens) ??
+         asNumber(record.prompt_tokens);
+     const cacheWrite5m = asNumber(record.cacheWrite5m) ??
+         asNumber(record.cache_write_5m) ??
+         asNumber(record.input_cache_creation_5m) ??
+         asNumber(cacheCreation.ephemeral_5m_input_tokens);
+     const cacheWrite1h = asNumber(record.cacheWrite1h) ??
+         asNumber(record.cache_write_1h) ??
+         asNumber(record.input_cache_creation_1h) ??
+         asNumber(cacheCreation.ephemeral_1h_input_tokens);
+     const untypedCacheWrite = asNumber(record.cacheWrite) ??
+         asNumber(record.cache_creation_input_tokens) ??
+         asNumber(cache.write);
+     const hasTypedCacheWrite = cacheWrite5m !== undefined || cacheWrite1h !== undefined;
+     const hasAnthropicCacheShape = hasTypedCacheWrite ||
+         asNumber(record.cache_creation_input_tokens) !== undefined ||
+         asNumber(record.cache_read_input_tokens) !== undefined;
      const usage = {
-         input: asNumber(record.input) ??
-             asNumber(record.input_tokens) ??
-             asNumber(record.prompt_tokens),
+         input: directInput ?? aggregateInput,
          output: asNumber(record.output) ??
              asNumber(record.output_tokens) ??
              asNumber(record.completion_tokens),
@@ -329,9 +602,9 @@ function normalizeUsage(value) {
              asNumber(record.cached_tokens) ??
              asNumber(inputDetails.cached_tokens) ??
              asNumber(cache.read),
-         cacheWrite: asNumber(record.cacheWrite) ??
-             asNumber(record.cache_creation_input_tokens) ??
-             asNumber(cache.write),
+         cacheWrite: hasTypedCacheWrite ? undefined : untypedCacheWrite,
+         cacheWrite5m,
+         cacheWrite1h,
          total: asNumber(record.totalTokens) ?? asNumber(record.total_tokens) ??
              asNumber(record.total),
          cost: asNumber(nestedCost.total) ??
@@ -340,11 +613,19 @@ function normalizeUsage(value) {
              asNumber(record.total_cost) ??
              asNumber(record.cost),
      };
+     if (usage.input !== undefined) {
+         usage.inputIncludesCache = directInput === undefined && !hasAnthropicCacheShape;
+     }
      if (usage.total === undefined) {
+         const cacheRead = usage.inputIncludesCache === false ? (usage.cacheRead ?? 0) : 0;
+         const cacheWrite = (usage.cacheWrite ?? 0) +
+             (usage.cacheWrite5m ?? 0) +
+             (usage.cacheWrite1h ?? 0);
          const total = (usage.input ?? 0) +
              (usage.output ?? 0) +
              (usage.reasoning ?? 0) +
-             (usage.cacheWrite ?? 0);
+             cacheWrite +
+             cacheRead;
          if (total > 0)
              usage.total = total;
      }
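
The new `inputIncludesCache` flag records whether the input count already contains cached tokens. Read from the three `normalizeUsage` hunks above, the rule is: an aggregate field (`input_tokens` or `prompt_tokens`) is assumed to include cache tokens unless the record has Anthropic-style cache fields, and a direct `input` field is treated as already excluding them. The sketch below reproduces only that decision. `classifyInput` is a hypothetical name, the local `asNumber` is a stand-in for the package helper (whose body is not shown in this diff), and the typed 5-minute/1-hour cache-write aliases that also count as Anthropic shape are omitted for brevity.

```javascript
// Stand-in for the package's asNumber helper; its real behavior is not shown in this diff.
const asNumber = (value) =>
  typeof value === "number" && Number.isFinite(value) ? value : undefined;

// Hypothetical helper reproducing only the input / inputIncludesCache decision.
function classifyInput(record) {
  const directInput = asNumber(record.input);
  const aggregateInput = asNumber(record.input_tokens) ?? asNumber(record.prompt_tokens);
  const input = directInput ?? aggregateInput;
  if (input === undefined) return { input: undefined, inputIncludesCache: undefined };
  const hasAnthropicCacheShape =
    asNumber(record.cache_creation_input_tokens) !== undefined ||
    asNumber(record.cache_read_input_tokens) !== undefined;
  return { input, inputIncludesCache: directInput === undefined && !hasAnthropicCacheShape };
}

// Token counts are made up for illustration.
const aggregate = classifyInput({ prompt_tokens: 1000, cached_tokens: 400 });
const anthropicShaped = classifyInput({ input_tokens: 600, cache_read_input_tokens: 400 });
const direct = classifyInput({ input: 600 });
console.log(aggregate, anthropicShaped, direct);
```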
@@ -365,11 +646,14 @@ function usageDetails(usage) {
      if (!usage)
          return undefined;
      const details = {};
-     const cachedInput = usage.cacheRead ?? 0;
-     const cacheWrite = usage.cacheWrite ?? 0;
+     const cachedInput = usage.inputIncludesCache === false ? 0 : (usage.cacheRead ?? 0);
+     const cacheWriteTotal = (usage.cacheWrite ?? 0) +
+         (usage.cacheWrite5m ?? 0) +
+         (usage.cacheWrite1h ?? 0);
+     const cacheWriteInInput = usage.inputIncludesCache === false ? 0 : cacheWriteTotal;
      const regularInput = usage.input === undefined
          ? undefined
-         : Math.max(usage.input - cachedInput - cacheWriteInInput, 0);
+         : Math.max(usage.input - cachedInput - cacheWriteInInput, 0);
      if (regularInput !== undefined)
          details.input = regularInput;
      if (usage.output !== undefined)
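
The hunk above changes how the uncached ("regular") input count is derived: cache reads and cache writes are subtracted from `input` only when the input figure is believed to include them. A sketch of that subtraction in isolation follows; `regularInput` is a hypothetical wrapper around the same expressions, and the token counts are invented.

```javascript
// Hypothetical wrapper around the regular-input expressions from usageDetails.
function regularInput(usage) {
  if (usage.input === undefined) return undefined;
  const cacheIsSeparate = usage.inputIncludesCache === false;
  const cachedInput = cacheIsSeparate ? 0 : (usage.cacheRead ?? 0);
  const cacheWriteTotal =
    (usage.cacheWrite ?? 0) + (usage.cacheWrite5m ?? 0) + (usage.cacheWrite1h ?? 0);
  const cacheWriteInInput = cacheIsSeparate ? 0 : cacheWriteTotal;
  // Clamp at zero so inconsistent source records never yield negative token counts.
  return Math.max(usage.input - cachedInput - cacheWriteInInput, 0);
}

// Input includes 400 cached tokens: 1000 - 400 leaves 600 regular tokens.
const included = regularInput({ input: 1000, cacheRead: 400, inputIncludesCache: true });
// Input already excludes cache: nothing is subtracted.
const separate = regularInput({ input: 600, cacheRead: 400, inputIncludesCache: false });
// Flag missing: treated like "included", and the result is clamped at zero.
const clamped = regularInput({ input: 100, cacheRead: 400 });
console.log(included, separate, clamped);
```

Note that an unset flag falls on the "included" side, which matches the pre-0.1.44 behavior of always subtracting.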
@@ -380,10 +664,92 @@ function usageDetails(usage) {
          details.input_cached_tokens = usage.cacheRead;
      if (usage.cacheWrite !== undefined)
          details.input_cache_creation = usage.cacheWrite;
+     if (usage.cacheWrite5m !== undefined) {
+         details.input_cache_creation_5m = usage.cacheWrite5m;
+     }
+     if (usage.cacheWrite1h !== undefined) {
+         details.input_cache_creation_1h = usage.cacheWrite1h;
+     }
+     if (usage.cacheWrite === undefined && cacheWriteTotal > 0) {
+         details.input_cache_creation = cacheWriteTotal;
+     }
      if (usage.total !== undefined)
          details.total = usage.total;
      return Object.keys(details).length > 0 ? details : undefined;
  }
+ function calculateCost(event, usage, costRates) {
+     if (!usage)
+         return undefined;
+     if (event.usage?.cost !== undefined) {
+         return {
+             details: { total: roundCost(event.usage.cost) },
+             source: "recorded",
+         };
+     }
+     const match = findCostRates(event, costRates);
+     if (!match)
+         return undefined;
+     const { rates, modelKey } = match;
+     const details = {};
+     setCostPart(details, "input", usage.input, rates.input);
+     setCostPart(details, "output", usage.output, rates.output);
+     setCostPart(details, "output_reasoning", usage.output_reasoning, rates.reasoning ?? rates.output);
+     setCostPart(details, "input_cached_tokens", usage.input_cached_tokens, rates.cacheRead ?? rates.input);
+     const hasTypedCacheWrite = usage.input_cache_creation_5m !== undefined ||
+         usage.input_cache_creation_1h !== undefined;
+     setCostPart(details, "input_cache_creation_5m", usage.input_cache_creation_5m, rates.cacheWrite5m ?? rates.cacheWrite ?? rates.input);
+     setCostPart(details, "input_cache_creation_1h", usage.input_cache_creation_1h, rates.cacheWrite1h ?? rates.cacheWrite ?? rates.input);
+     if (!hasTypedCacheWrite) {
+         setCostPart(details, "input_cache_creation", usage.input_cache_creation, rates.cacheWrite ?? rates.input);
+     }
+     const calculatedTotal = Object.values(details).reduce((sum, value) => sum + value, 0);
+     let source = "calculated";
+     if (calculatedTotal === 0 &&
+         usage.total !== undefined &&
+         usage.total > 0 &&
+         rates.input !== undefined) {
+         for (const key of Object.keys(details))
+             delete details[key];
+         setCostPart(details, "input", usage.total, rates.input);
+         source = "calculated_total_as_input";
+     }
+     if (Object.keys(details).length === 0)
+         return undefined;
+     details.total = roundCost(Object.values(details).reduce((sum, value) => sum + value, 0));
+     return {
+         details,
+         source,
+         modelKey,
+         rates,
+     };
+ }
+ function findCostRates(event, costRates) {
+     const modelName = normalizeModelName(event.model);
+     const candidates = [
+         event.provider && event.model ? `${event.provider}/${event.model}` : undefined,
+         event.provider && modelName ? `${event.provider}/${modelName}` : undefined,
+         event.model,
+         modelName,
+     ];
+     const seen = new Set();
+     for (const candidate of candidates) {
+         if (!candidate || seen.has(candidate))
+             continue;
+         seen.add(candidate);
+         const rates = costRates[candidate];
+         if (rates)
+             return { modelKey: candidate, rates };
+     }
+     return undefined;
+ }
+ function setCostPart(details, key, tokens, usdPerMillionTokens) {
+     if (tokens === undefined || usdPerMillionTokens === undefined)
+         return;
+     details[key] = roundCost((tokens * usdPerMillionTokens) / 1_000_000);
+ }
+ function roundCost(value) {
+     return Number(value.toFixed(12));
+ }
  function isGenerationEvent(event) {
      return event.usage !== undefined && event.role !== "user" &&
          event.role !== "developer" && event.role !== "system";
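
The arithmetic in `calculateCost` is tokens times the USD-per-1M rate, divided by one million, rounded to 12 decimal places, and summed into `total`. As a worked illustration, here is that calculation for the `gpt-5.5` default rates from this diff. The token counts are made up, and `costPart` is a hypothetical return-value variant of `setCostPart`.

```javascript
// Same rounding and per-part arithmetic as roundCost / setCostPart above.
const roundCost = (value) => Number(value.toFixed(12));
function costPart(tokens, usdPerMillionTokens) {
  if (tokens === undefined || usdPerMillionTokens === undefined) return undefined;
  return roundCost((tokens * usdPerMillionTokens) / 1_000_000);
}

// gpt-5.5 default rates from this diff; the token counts are invented.
const rates = { input: 5, output: 30, cacheRead: 0.5 };
const usage = { input: 200_000, input_cached_tokens: 800_000, output: 50_000 };

const details = {
  input: costPart(usage.input, rates.input), // 200k tokens at $5 per 1M
  input_cached_tokens: costPart(usage.input_cached_tokens, rates.cacheRead ?? rates.input),
  output: costPart(usage.output, rates.output), // 50k tokens at $30 per 1M
};
details.total = roundCost(details.input + details.input_cached_tokens + details.output);

// Fallback when only a total token count is known: bill all of it at the input rate.
const totalOnly = { input: costPart(1_000_000, rates.input) };
console.log(details, totalOnly);
```

With these numbers the parts come to $1.00 input, $0.40 cached input, and $1.50 output, for a total of $2.90. The total-only fallback prices one million tokens at $5.00, which the package labels `calculated_total_as_input`.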
@@ -1124,19 +1490,25 @@ function stableId(input) {
      return createHash("sha256").update(input).digest("hex").slice(0, 32);
  }
  function importIdentity(event) {
-     return importIdentityVersions[event.agent] ?? importIdentityVersion;
+     return importStateIdentityVersions[event.agent] ?? importStateIdentityVersion;
+ }
+ function langfuseIdIdentity(event) {
+     return langfuseIdIdentityVersions[event.agent] ?? langfuseIdIdentityVersion;
  }
  function fingerprint(event) {
      return `${importIdentity(event)}:${event.agent}:${event.sessionId}:${event.recordId}`;
  }
+ function langfuseFingerprint(event) {
+     return `${langfuseIdIdentity(event)}:${event.agent}:${event.sessionId}:${event.recordId}`;
+ }
  function traceFingerprint(event) {
-     return `${importIdentity(event)}:${event.agent}:${event.sessionId}`;
+     return `${langfuseIdIdentity(event)}:${event.agent}:${event.sessionId}`;
  }
  function traceId(event) {
      return stableId(traceFingerprint(event));
  }
  function spanId(event) {
-     return stableId(fingerprint(event)).slice(0, 16);
+     return stableId(langfuseFingerprint(event)).slice(0, 16);
  }
  function rootSpanId(event) {
      return stableId(`${traceFingerprint(event)}:root`).slice(0, 16);
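
This hunk splits one identity into two: the state-file key (`fingerprint`) moves to the new `*-cost-details` versions, while the strings hashed into trace and span IDs stay on the previous versions. A minimal sketch of the effect for a single Codex event follows. The version strings are copied from this diff; the event itself is hypothetical, and the SHA-256 step in `stableId` is left out because equal input strings hash to equal IDs.

```javascript
// Version tables abbreviated to one agent; values copied from this diff.
const importStateIdentityVersions = { codex: "v10-cost-details" };
const langfuseIdIdentityVersions = { codex: "v9-codex-conversation-events" };

// Key stored in the local state file: changes whenever the state identity is bumped.
const stateKey = (e) =>
  `${importStateIdentityVersions[e.agent]}:${e.agent}:${e.sessionId}:${e.recordId}`;
// String hashed into the span ID: stays on the older identity.
const langfuseKey = (e) =>
  `${langfuseIdIdentityVersions[e.agent]}:${e.agent}:${e.sessionId}:${e.recordId}`;

// Hypothetical event, for illustration only.
const event = { agent: "codex", sessionId: "session-1", recordId: "record-7" };

// In 0.1.42 the state key and the span-ID input were this same string.
const oldKey = `v9-codex-conversation-events:${event.agent}:${event.sessionId}:${event.recordId}`;

console.log(stateKey(event) !== oldKey); // true: the event counts as unsent again
console.log(langfuseKey(event) === oldKey); // true: it maps to the same span ID as before
```

So an upgraded importer re-sends events recorded under the old state key, and the re-sent spans carry the IDs they had before the upgrade.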
@@ -1189,6 +1561,7 @@ function limitEventPayload(event, maxFieldBytes) {
  }
  function toOtlp(events, options = {}) {
      const maxFieldBytes = options.maxFieldBytes ?? defaultMaxFieldBytes;
+     const costRates = options.costRates ?? loadCostCatalogFromEnv();
      const spansByTrace = new Map();
      for (const rawEvent of events) {
          const event = limitEventPayload(rawEvent, maxFieldBytes);
@@ -1226,6 +1599,9 @@ function toOtlp(events, options = {}) {
                  attr("langfuse.trace.metadata.project_path", firstProject.projectPath),
                  attr("langfuse.trace.metadata.project_name", firstProject.projectName),
                  attr("langfuse.trace.metadata.project_folder", firstProject.projectFolder),
+                 attr("langfuse.trace.metadata.import_payload_version", importPayloadVersion),
+                 attr("langfuse.trace.metadata.import_state_identity", importIdentity(first)),
+                 attr("langfuse.trace.metadata.langfuse_id_identity", langfuseIdIdentity(first)),
                  attr("langfuse.observation.metadata.agent", first.agent),
                  attr("langfuse.observation.metadata.host", currentHost),
                  attr("langfuse.observation.metadata.machine", currentHost),
@@ -1236,6 +1612,9 @@ function toOtlp(events, options = {}) {
                  attr("langfuse.observation.metadata.project_path", firstProject.projectPath),
                  attr("langfuse.observation.metadata.project_name", firstProject.projectName),
                  attr("langfuse.observation.metadata.project_folder", firstProject.projectFolder),
+                 attr("langfuse.observation.metadata.import_payload_version", importPayloadVersion),
+                 attr("langfuse.observation.metadata.import_state_identity", importIdentity(first)),
+                 attr("langfuse.observation.metadata.langfuse_id_identity", langfuseIdIdentity(first)),
                  attr("source.path", first.sourcePath),
                  attr("cwd", first.cwd),
                  attr("project.path", firstProject.projectPath),
@@ -1259,6 +1638,7 @@ function toOtlp(events, options = {}) {
              const modelName = normalizeModelName(event.model);
              const generation = isGenerationEvent(event);
              const usage = usageDetails(event.usage);
+             const cost = generation ? calculateCost(event, usage, costRates) : undefined;
              const eventProject = projectMetadata(event.cwd);
              const attributes = [
                  attr("service.name", `agent.${event.agent}`),
@@ -1286,7 +1666,16 @@ function toOtlp(events, options = {}) {
  attr("langfuse.observation.metadata.project_folder", eventProject.projectFolder),
  attr("langfuse.observation.metadata.model", modelName ?? event.model),
  attr("langfuse.observation.metadata.provider", event.provider),
+ attr("langfuse.observation.metadata.import_payload_version", importPayloadVersion),
+ attr("langfuse.observation.metadata.import_state_identity", importIdentity(event)),
+ attr("langfuse.observation.metadata.langfuse_id_identity", langfuseIdIdentity(event)),
+ attr("langfuse.observation.usage_details", generation ? usage : undefined),
+ attr("langfuse.observation.cost_details", cost?.details),
  attr("langfuse.observation.metadata.usage_details", usage),
+ attr("langfuse.observation.metadata.cost_details", cost?.details),
+ attr("langfuse.observation.metadata.cost_source", cost?.source),
+ attr("langfuse.observation.metadata.cost_model_key", cost?.modelKey),
+ attr("langfuse.observation.metadata.cost_rates", cost?.rates),
  attr("langfuse.observation.metadata.recorded_cost", event.usage?.cost),
  attr("langfuse.observation.input", event.input),
  attr("langfuse.observation.output", event.output),
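
The hunk above attaches `cost?.details`, `cost?.source`, and `cost?.rates` to generation spans, but the body of `calculateCost` is not part of this excerpt. As a rough illustration only, the following sketch shows how per-usage-type costs could be derived from rates expressed in USD per 1M tokens, with a recorded cost taking precedence as the README describes. The helper name `sketchCostDetails`, the shape of its arguments, and the `"recorded"`/`"calculated"` labels are assumptions made for this example, not the package's API.

```javascript
// Hypothetical helper for illustration; this is NOT the package's calculateCost.
// Assumed shapes:
//   usage:           { [usageType]: tokenCount }
//   ratesPerMillion: { [usageType]: USD per 1M tokens }
function sketchCostDetails(usage, ratesPerMillion, recordedCost) {
    // A total cost already recorded by the source history wins.
    if (typeof recordedCost === "number") {
        return { source: "recorded", details: { total: recordedCost } };
    }
    if (!usage || !ratesPerMillion) {
        return undefined;
    }
    const details = {};
    let total = 0;
    for (const [usageType, tokens] of Object.entries(usage)) {
        const rate = ratesPerMillion[usageType];
        // Usage types without a numeric rate are skipped rather than guessed.
        if (typeof rate !== "number" || typeof tokens !== "number") {
            continue;
        }
        details[usageType] = (tokens * rate) / 1_000_000;
        total += details[usageType];
    }
    if (Object.keys(details).length === 0) {
        return undefined;
    }
    details.total = total;
    return { source: "calculated", details };
}
```

Returning `undefined` when nothing can be priced mirrors the optional chaining in the hunk (`cost?.details`), so an attribute would simply be omitted instead of carrying a zero cost.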
@@ -1450,7 +1839,9 @@ async function run(options) {
  for (const event of events) {
      discovered[event.agent] = (discovered[event.agent] ?? 0) + 1;
  }
- const unsent = events.filter((event) => state.sent[fingerprint(event)] === undefined);
+ const unsent = options.force
+     ? events
+     : events.filter((event) => state.sent[fingerprint(event)] === undefined);
  const selected = options.limit === undefined
      ? unsent
      : unsent.slice(0, options.limit);
@@ -1465,6 +1856,7 @@ async function run(options) {
          batchSize: options.batchSize,
          maxRequestBytes: options.maxRequestBytes,
          maxFieldBytes: options.maxFieldBytes,
+         costRates: options.costRates,
      });
  }
  catch (error) {
@@ -1490,6 +1882,7 @@ async function run(options) {
  try {
      await postOtlp(options.endpoint, batch, {
          maxFieldBytes: options.maxFieldBytes,
+         costRates: options.costRates,
      });
      for (const event of batch) {
          state.sent[fingerprint(event)] = new Date().toISOString();
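
The `run` hunks above change event selection so that `options.force` bypasses the sent-state filter while `options.limit` still applies afterwards. A minimal standalone sketch of that selection step follows; `selectEvents` and its parameter layout are hypothetical and only mirror the expressions visible in the diff.

```javascript
// Hypothetical stand-in for the selection step shown in the diff.
// sent:        map of fingerprint -> ISO timestamp of a previous successful send
// fingerprint: function returning a stable key for an event (assumed to be supplied by the caller)
function selectEvents(events, sent, fingerprint, { force = false, limit } = {}) {
    // With force, every discovered event is re-sent regardless of saved state.
    const unsent = force
        ? events
        : events.filter((event) => sent[fingerprint(event)] === undefined);
    // The limit is applied after the force/state decision, as in the hunk.
    return limit === undefined ? unsent : unsent.slice(0, limit);
}
```

Because the limit is applied last, combining `force` with a limit re-sends only the first events in discovery order, not the first unsent ones.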
package/dist/service.d.ts CHANGED
@@ -13,6 +13,8 @@ type ServiceOptions = {
  batchSize: number;
  pollIntervalMs: number;
  postDelayMs: number;
+ costRatesPath?: string;
+ costRatesJson?: string;
  npxPath: string;
  since?: string;
  dryRun: boolean;
package/dist/service.js CHANGED
@@ -24,6 +24,8 @@ Service options:
  --batch-size N           OTLP spans per POST (default: 10)
  --poll-interval-ms N     Delay between --follow scans (default: 5000)
  --post-delay-ms N        Delay after each successful OTLP POST (default: 0)
+ --cost-rates PATH        JSON model cost-rate overrides in USD per 1M tokens
+ --cost-rates-json JSON   Inline JSON model cost-rate overrides in USD per 1M tokens
  --since ISO_OR_MS        Optional lower bound for events the follower may send
  --working-directory DIR  Directory the service starts in (default: --home)
  --path VALUE             PATH value injected into the service environment
@@ -49,6 +51,10 @@ function parseServiceArgs(argv) {
  let batchSize = 10;
  let pollIntervalMs = 5_000;
  let postDelayMs = 0;
+ let costRatesPath = process.env.CODING_AGENT_LANGFUSE_COST_RATES_PATH ??
+     process.env.LANGFUSE_BACKFILL_COST_RATES_PATH;
+ let costRatesJson = process.env.CODING_AGENT_LANGFUSE_COST_RATES_JSON ??
+     process.env.LANGFUSE_BACKFILL_COST_RATES_JSON;
  let since;
  let dryRun = false;
  let start = true;
@@ -95,6 +101,12 @@ function parseServiceArgs(argv) {
  else if (arg === "--post-delay-ms") {
      postDelayMs = parseNonNegativeInt(arg, next());
  }
+ else if (arg === "--cost-rates") {
+     costRatesPath = next();
+ }
+ else if (arg === "--cost-rates-json") {
+     costRatesJson = next();
+ }
  else if (arg === "--since") {
      since = next();
  }
@@ -134,6 +146,8 @@ function parseServiceArgs(argv) {
  batchSize,
  pollIntervalMs,
  postDelayMs,
+ costRatesPath,
+ costRatesJson,
  since,
  dryRun,
  start,
@@ -282,6 +296,10 @@ function buildFollowCommand(options) {
      ];
      if (options.since)
          command.push("--since", options.since);
+     if (options.costRatesPath)
+         command.push("--cost-rates", options.costRatesPath);
+     if (options.costRatesJson)
+         command.push("--cost-rates-json", options.costRatesJson);
      return command;
  }
  function renderSystemdUnit(options, command) {
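
Taken together, the `service.js` hunks above give the cost-rate options a three-step resolution: the `CODING_AGENT_LANGFUSE_*` environment variable, then the `LANGFUSE_BACKFILL_*` fallback, then an explicit flag that overrides both, after which any resolved value is forwarded to the follow command. The sketch below condenses that flow into a single function for illustration; `resolveCostRateOptions` is a hypothetical name, and the environment is passed as a plain object where the real code reads `process.env`.

```javascript
// Hypothetical condensed version of the option handling shown in the diff.
// argv: array of CLI arguments; env: object with the same keys as process.env.
function resolveCostRateOptions(argv, env) {
    // Environment defaults: the newer variable name wins over the older one.
    let costRatesPath = env.CODING_AGENT_LANGFUSE_COST_RATES_PATH ??
        env.LANGFUSE_BACKFILL_COST_RATES_PATH;
    let costRatesJson = env.CODING_AGENT_LANGFUSE_COST_RATES_JSON ??
        env.LANGFUSE_BACKFILL_COST_RATES_JSON;
    // Explicit flags override whatever the environment provided.
    for (let i = 0; i < argv.length; i += 1) {
        if (argv[i] === "--cost-rates") {
            costRatesPath = argv[++i];
        }
        else if (argv[i] === "--cost-rates-json") {
            costRatesJson = argv[++i];
        }
    }
    // Forward only the options that ended up with a value.
    const command = [];
    if (costRatesPath)
        command.push("--cost-rates", costRatesPath);
    if (costRatesJson)
        command.push("--cost-rates-json", costRatesJson);
    return { costRatesPath, costRatesJson, command };
}
```

Note that `??` only falls through on `null` or `undefined`, so an environment variable set to an empty string would still shadow the fallback variable, although the truthiness checks would then keep it out of the forwarded command.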
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@ramarivera/coding-agent-langfuse",
-   "version": "0.1.42",
+   "version": "0.1.44",
    "description": "Universal coding-agent Langfuse backfiller and live OTLP helpers",
    "type": "module",
    "license": "MIT",