copilot-metrics 0.1.0 → 0.1.1

package/CHANGELOG.md CHANGED
@@ -1,5 +1,19 @@
  # Changelog

+ ## 0.1.1 - 2026-05-30
+
+ ### Added
+
+ - Setup-once report flow: report commands automatically import configured VS Code, Copilot CLI, and hook JSONL sources before querying.
+ - Idempotent import fingerprints so repeated reports or imports do not double-count previously ingested JSONL rows.
+ - Complete label token reporting for input, output, cache read, cache creation, and reasoning tokens in human and JSON output.
+ - Hook-only label status so attribution evidence from hooks remains visible without implying token-bearing usage.
+
+ ### Changed
+
+ - `setup all` now persists the central data directory config, matching `init` for setup-once usage.
+ - Hook commands support installed executable shims as well as checkout-local JavaScript entrypoints.
+
  ## 0.1.0 - 2026-05-30

  First local release candidate for `copilot-metrics`.
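Reviewer sketch: the idempotent import fingerprints listed under 0.1.1 can be pictured with a small standalone example. It assumes a fingerprint built from the source name, the resolved file path, the line number, and the serialized row, which is what `rawFingerprint` in `package/src/ingest.js` does in this release. The in-memory `Set`, the `importRows` helper, and the event names are hypothetical stand-ins for the SQLite lookup and real hook data, not the package's API.

```javascript
'use strict';

const crypto = require('node:crypto');
const path = require('node:path');

// Hash source, absolute file path, line number, and payload, separated by NUL bytes.
function rawFingerprint(source, file, record) {
  return crypto
    .createHash('sha256')
    .update(source)
    .update('\0')
    .update(path.resolve(file))
    .update('\0')
    .update(String(record.line))
    .update('\0')
    .update(JSON.stringify(record.value))
    .digest('hex');
}

// Hypothetical stand-in for the store: remembers fingerprints already ingested.
const seen = new Set();

// Returns only the rows whose fingerprint has not been seen before.
function importRows(source, file, records) {
  const fresh = [];
  for (const record of records) {
    const fingerprint = rawFingerprint(source, file, record);
    if (seen.has(fingerprint)) continue;
    seen.add(fingerprint);
    fresh.push(record);
  }
  return fresh;
}

// Made-up hook rows for illustration.
const rows = [
  { line: 1, value: { event: 'sessionStart' } },
  { line: 2, value: { event: 'userPromptSubmitted' } },
];

const firstPass = importRows('hooks', 'copilot-hooks.jsonl', rows);
const secondPass = importRows('hooks', 'copilot-hooks.jsonl', rows);
const appended = importRows('hooks', 'copilot-hooks.jsonl', [
  ...rows,
  { line: 3, value: { event: 'sessionEnd' } },
]);

console.log(firstPass.length, secondPass.length, appended.length);
```

Because the line number is part of the hash, an identical payload on a different line counts as a new row. Under this scheme, rewriting earlier lines of a JSONL file (rather than appending) would shift line numbers and cause those rows to be ingested again.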
package/README.md CHANGED
@@ -9,8 +9,8 @@ Costs are estimates, not official billing records. GitHub billing remains the so
  From npm:

  ```bash
- npx copilot-metrics@0.1.0 --help
- npx copilot-metrics@0.1.0 init
+ npx copilot-metrics@0.1.1 --help
+ npx copilot-metrics@0.1.1 init
  ```

  From this checkout:
@@ -38,8 +38,8 @@ export COPILOT_METRICS_HOME=/path/to/copilot-metrics-data
  Useful commands:

  ```bash
- npx copilot-metrics@0.1.0 init
- npx copilot-metrics@0.1.0 paths --json
+ npx copilot-metrics@0.1.1 init
+ npx copilot-metrics@0.1.1 paths --json
  ```

  ## Configure Telemetry
@@ -47,13 +47,13 @@ npx copilot-metrics@0.1.0 paths --json
  Print VS Code Insiders Copilot Chat OpenTelemetry settings:

  ```bash
- npx copilot-metrics@0.1.0 setup vscode
+ npx copilot-metrics@0.1.1 setup vscode
  ```

  Print Copilot CLI OpenTelemetry environment exports:

  ```bash
- npx copilot-metrics@0.1.0 setup copilot-cli
+ npx copilot-metrics@0.1.1 setup copilot-cli
  ```

  Content capture is disabled by default. Do not enable richer prompt capture unless you explicitly accept the privacy tradeoff.
@@ -63,14 +63,14 @@ Content capture is disabled by default. Do not enable richer prompt capture unle
  Preview repo-local hook config. The default `--surface both` emits the Copilot CLI lower camel case hook format:

  ```bash
- npx copilot-metrics@0.1.0 hooks preview --scope local --surface both
+ npx copilot-metrics@0.1.1 hooks preview --scope local --surface both
  ```

  Install repo-local or user-global hook config:

  ```bash
- npx copilot-metrics@0.1.0 hooks install --scope local --surface both
- npx copilot-metrics@0.1.0 hooks install --scope global --surface both
+ npx copilot-metrics@0.1.1 hooks install --scope local --surface both
+ npx copilot-metrics@0.1.1 hooks install --scope global --surface both
  ```

  Local install writes `.github/hooks/copilot-metrics.json`. Global install updates `~/.copilot/settings.json` idempotently, replacing prior `copilot-metrics` hook entries while preserving other settings and hooks. Use `--surface vscode` for VS Code-only PascalCase events or `--surface copilot-cli` for CLI-native lower camel case events. The hook logger writes redacted JSONL metadata to the central data directory. It extracts Jira-style labels such as `DEMO-12345` from safe metadata and does not store full prompt text by default.
@@ -80,33 +80,35 @@ Local install writes `.github/hooks/copilot-metrics.json`. Global install update
  Initialize the local SQLite store and import JSONL files:

  ```bash
- npx copilot-metrics@0.1.0 store init
- npx copilot-metrics@0.1.0 import --source vscode --file ~/.local/share/copilot-metrics/telemetry/vscode-copilot-otel.jsonl
- npx copilot-metrics@0.1.0 import --source copilot-cli --file ~/.local/share/copilot-metrics/telemetry/copilot-cli-otel.jsonl
- npx copilot-metrics@0.1.0 import --source hooks --file ~/.local/share/copilot-metrics/hooks/copilot-hooks.jsonl
+ npx copilot-metrics@0.1.1 store init
+ npx copilot-metrics@0.1.1 import --source vscode --file ~/.local/share/copilot-metrics/telemetry/vscode-copilot-otel.jsonl
+ npx copilot-metrics@0.1.1 import --source copilot-cli --file ~/.local/share/copilot-metrics/telemetry/copilot-cli-otel.jsonl
+ npx copilot-metrics@0.1.1 import --source hooks --file ~/.local/share/copilot-metrics/hooks/copilot-hooks.jsonl
  ```

- Imports persist raw records, normalized LLM usage records, hook events, label evidence, and import warnings.
+ Imports persist raw records, normalized LLM usage records, hook events, label evidence, and import warnings. Re-importing the same JSONL rows is idempotent.

  ## Reports

  Run local reports from the SQLite store:

  ```bash
- npx copilot-metrics@0.1.0 report labels
- npx copilot-metrics@0.1.0 report label DEMO-12345
- npx copilot-metrics@0.1.0 report label DEMO-12345 --detail
- npx copilot-metrics@0.1.0 report models
- npx copilot-metrics@0.1.0 report repos
- npx copilot-metrics@0.1.0 report unattributed
+ npx copilot-metrics@0.1.1 report labels
+ npx copilot-metrics@0.1.1 report label DEMO-12345
+ npx copilot-metrics@0.1.1 report label DEMO-12345 --detail
+ npx copilot-metrics@0.1.1 report models
+ npx copilot-metrics@0.1.1 report repos
+ npx copilot-metrics@0.1.1 report unattributed
  ```

  Every report supports `--json`:

  ```bash
- npx copilot-metrics@0.1.0 report labels --json
+ npx copilot-metrics@0.1.1 report labels --json
  ```

+ Report commands automatically import newly appended configured VS Code, Copilot CLI, and hook JSONL files before querying. Label reports include input, output, cache read, cache creation, and reasoning token totals. Labels seen only in hooks remain visible as hook-only evidence with zero usage records, so attribution hints do not imply token-bearing usage.
+
  ## Attribution Model

  The default extractor finds Jira-style labels such as `DEMO-12345` from safe metadata including hook labels, branch names, cwd/path values, repo metadata, and task hints.
@@ -168,7 +170,7 @@ The manual prompt performs one harmless tool call so Copilot CLI hook execution
  ## Current Limits

  - Costs are estimates, not official billing records.
- - Official GitHub usage report reconciliation is not included in `0.1.0`.
- - Local OTLP collector mode is not included in `0.1.0`.
- - Richer prompt/content capture and redaction controls are not included in `0.1.0`.
+ - Official GitHub usage report reconciliation is not included in `0.1.1`.
+ - Local OTLP collector mode is not included in `0.1.1`.
+ - Richer prompt/content capture and redaction controls are not included in `0.1.1`.
  - Dashboard views are deferred until the CLI/query model proves useful.
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "copilot-metrics",
-   "version": "0.1.0",
+   "version": "0.1.1",
    "description": "Local-first Copilot usage telemetry setup and reporting tools.",
    "type": "commonjs",
    "homepage": "https://github.com/nnexai/copilot-metrics#readme",
package/src/cli.js CHANGED
@@ -12,7 +12,7 @@ const {
  } = require('./setup');
  const { appendHookEvent, readJsonFromStream } = require('./hook-logger');
  const { initStore } = require('./sqlite-store');
- const { ingestFile } = require('./ingest');
+ const { autoImportConfiguredSources, ingestFile } = require('./ingest');
  const { MODEL_PRICES, PRICING_VERSION } = require('./pricing');
  const {
    labelOverview,
@@ -236,6 +236,12 @@ async function main(args, io) {
    }

    if (command === 'report') {
+     ensureDataDirs(paths);
+     await autoImportConfiguredSources(paths, {
+       cwd: io.cwd,
+       extractors: loadConfiguredExtractors(paths.configJson, io.cwd),
+     });
+
      if (subcommand === 'labels') {
        const rows = await labelOverview(paths.usageDb);
        writeOutput(io.stdout, json ? { labels: rows } : formatLabels(rows), json);
package/src/ingest.js CHANGED
@@ -1,10 +1,14 @@
  'use strict';

+ const crypto = require('node:crypto');
+ const fs = require('node:fs');
+ const path = require('node:path');
  const { readJsonl } = require('./jsonl');
  const { normalizePayload, normalizeHookEvent } = require('./otel');
  const { estimateCost, PRICING_VERSION } = require('./pricing');
- const { insertImport } = require('./sqlite-store');
+ const { existingRawFingerprints, insertImport } = require('./sqlite-store');
  const { attachUsageLabelEvidence, attachHookLabelEvidence } = require('./labels');
+ const { loadConfiguredExtractors } = require('./label-extractors');

  function enrichCosts(records) {
    return records.map((record) => {
@@ -21,14 +25,39 @@ function enrichCosts(records) {
    });
  }

+ function rawFingerprint(source, file, record) {
+   return crypto
+     .createHash('sha256')
+     .update(source)
+     .update('\0')
+     .update(path.resolve(file))
+     .update('\0')
+     .update(String(record.line))
+     .update('\0')
+     .update(JSON.stringify(record.value))
+     .digest('hex');
+ }
+
  async function ingestFile(options) {
    const { dbPath, file, source } = options;
    const parsed = readJsonl(file);
    const warnings = [...parsed.warnings];
+   const sourceFile = path.resolve(file);
+   const parsedRecords = parsed.records.map((record) => ({
+     ...record,
+     raw_fingerprint: rawFingerprint(source, sourceFile, record),
+   }));
+   const existing = await existingRawFingerprints(
+     dbPath,
+     source,
+     sourceFile,
+     parsedRecords.map((record) => record.raw_fingerprint),
+   );
+   const newRecords = parsedRecords.filter((record) => !existing.has(record.raw_fingerprint));
    const usageRecords = [];
    const hookEvents = [];

-   for (const record of parsed.records) {
+   for (const record of newRecords) {
      if (source === 'hooks') {
        const event = normalizeHookEvent(record.value, source, record.line);
        if (event) hookEvents.push(event);
@@ -50,13 +79,15 @@ async function ingestFile(options) {
      }
    }

-   await insertImport(dbPath, source, parsed.records, enrichedUsage, enrichedHooks, warnings);
+   await insertImport(dbPath, source, sourceFile, newRecords, enrichedUsage, enrichedHooks, warnings);

    return {
      source,
      file,
      dbPath,
      raw_records: parsed.records.length,
+     new_raw_records: newRecords.length,
+     skipped_existing_records: parsed.records.length - newRecords.length,
      usage_records: enrichedUsage.length,
      hook_events: enrichedHooks.length,
      label_evidence: enrichedUsage.reduce((sum, usage) => sum + (usage.label_evidence || []).length, 0)
@@ -66,6 +97,53 @@ async function ingestFile(options) {
    };
  }

+ function configuredSourceFiles(paths, config = {}) {
+   const sourceConfig = config.sources || {};
+   const telemetryConfig = config.telemetry || {};
+   const files = [
+     { source: 'vscode', file: sourceConfig.vscode?.telemetry || telemetryConfig.vscode || paths.vscodeOtelJsonl },
+     { source: 'hooks', file: sourceConfig.vscode?.hooks || paths.hookEventsJsonl },
+     { source: 'copilot-cli', file: sourceConfig.copilotCli?.telemetry || telemetryConfig.copilotCli || paths.copilotCliOtelJsonl },
+     { source: 'hooks', file: sourceConfig.copilotCli?.hooks || paths.hookEventsJsonl },
+   ];
+   const seen = new Set();
+   return files
+     .filter((entry) => entry.file)
+     .map((entry) => ({ source: entry.source, file: path.resolve(entry.file) }))
+     .filter((entry) => {
+       const key = `${entry.source}\0${entry.file}`;
+       if (seen.has(key)) return false;
+       seen.add(key);
+       return true;
+     });
+ }
+
+ function readConfig(configJson) {
+   if (!fs.existsSync(configJson)) return {};
+   return JSON.parse(fs.readFileSync(configJson, 'utf8'));
+ }
+
+ async function autoImportConfiguredSources(paths, options = {}) {
+   const config = readConfig(paths.configJson);
+   const extractors = options.extractors || loadConfiguredExtractors(paths.configJson, options.cwd || process.cwd());
+   const results = [];
+   for (const entry of configuredSourceFiles(paths, config)) {
+     if (!fs.existsSync(entry.file)) {
+       results.push({ ...entry, skipped: true, reason: 'missing_file' });
+       continue;
+     }
+     results.push(await ingestFile({
+       dbPath: paths.usageDb,
+       file: entry.file,
+       source: entry.source,
+       extractors,
+     }));
+   }
+   return results;
+ }
+
  module.exports = {
+   autoImportConfiguredSources,
+   configuredSourceFiles,
    ingestFile,
  };
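Reviewer sketch: the de-duplication in `configuredSourceFiles` can be exercised on its own. The function body below is a standalone copy of the logic added in this file; the `examplePaths` object and the override path are made-up values for illustration, since the real defaults come from the package's path resolver, which is not part of this diff.

```javascript
'use strict';

const path = require('node:path');

// Standalone copy of the source-list logic added in ingest.js.
function configuredSourceFiles(paths, config = {}) {
  const sourceConfig = config.sources || {};
  const telemetryConfig = config.telemetry || {};
  const files = [
    { source: 'vscode', file: sourceConfig.vscode?.telemetry || telemetryConfig.vscode || paths.vscodeOtelJsonl },
    { source: 'hooks', file: sourceConfig.vscode?.hooks || paths.hookEventsJsonl },
    { source: 'copilot-cli', file: sourceConfig.copilotCli?.telemetry || telemetryConfig.copilotCli || paths.copilotCliOtelJsonl },
    { source: 'hooks', file: sourceConfig.copilotCli?.hooks || paths.hookEventsJsonl },
  ];
  const seen = new Set();
  return files
    .filter((entry) => entry.file)
    .map((entry) => ({ source: entry.source, file: path.resolve(entry.file) }))
    .filter((entry) => {
      // A (source, absolute path) pair is imported at most once per run.
      const key = `${entry.source}\0${entry.file}`;
      if (seen.has(key)) return false;
      seen.add(key);
      return true;
    });
}

// Hypothetical default locations, for illustration only.
const examplePaths = {
  vscodeOtelJsonl: '/data/telemetry/vscode-copilot-otel.jsonl',
  copilotCliOtelJsonl: '/data/telemetry/copilot-cli-otel.jsonl',
  hookEventsJsonl: '/data/hooks/copilot-hooks.jsonl',
};

// With no overrides, both hook entries resolve to the same file and collapse into one.
const defaults = configuredSourceFiles(examplePaths, {});

// A hypothetical override gives the Copilot CLI its own hook log, so both hook files are kept.
const overridden = configuredSourceFiles(examplePaths, {
  sources: { copilotCli: { hooks: '/data/hooks/cli-hooks.jsonl' } },
});

console.log(defaults.map((entry) => entry.source));
console.log(overridden.map((entry) => entry.source));
```

With the default paths the shared hook log is listed once, so a single report run reads it once; the fingerprint check would skip repeated rows in any case.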
package/src/reports.js CHANGED
@@ -1,6 +1,6 @@
  'use strict';

- const { queryRows } = require('./sqlite-store');
+ const { initStore, queryRows } = require('./sqlite-store');
  const { canonicalLabel } = require('./label-extractors');

  function n(value) {
@@ -31,6 +31,7 @@ function table(headers, rows) {
  }

  async function labelOverview(dbPath) {
+   await initStore(dbPath);
    return queryRows(dbPath, `
  SELECT
    labels.label,
@@ -43,6 +44,10 @@ SELECT
    COALESCE((SELECT SUM(cache_creation_tokens) FROM usage_records WHERE id IN (SELECT DISTINCT usage_record_id FROM label_evidence WHERE label = labels.label AND usage_record_id IS NOT NULL)), 0) AS cache_creation_tokens,
    COALESCE((SELECT SUM(reasoning_tokens) FROM usage_records WHERE id IN (SELECT DISTINCT usage_record_id FROM label_evidence WHERE label = labels.label AND usage_record_id IS NOT NULL)), 0) AS reasoning_tokens,
    COALESCE((SELECT SUM(COALESCE(estimated_ai_credits, 0)) FROM usage_records WHERE id IN (SELECT DISTINCT usage_record_id FROM label_evidence WHERE label = labels.label AND usage_record_id IS NOT NULL)), 0) AS estimated_ai_credits,
+   CASE
+     WHEN (SELECT COUNT(DISTINCT usage_record_id) FROM label_evidence WHERE label = labels.label AND usage_record_id IS NOT NULL) = 0 THEN 'hook-only'
+     ELSE 'token-bearing'
+   END AS token_status,
    (SELECT MIN(COALESCE(ur.timestamp, le.timestamp, le.imported_at)) FROM label_evidence le LEFT JOIN usage_records ur ON ur.id = le.usage_record_id WHERE le.label = labels.label) AS first_seen,
    (SELECT MAX(COALESCE(ur.timestamp, le.timestamp, le.imported_at)) FROM label_evidence le LEFT JOIN usage_records ur ON ur.id = le.usage_record_id WHERE le.label = labels.label) AS last_seen,
    (SELECT MAX(estimate_label) FROM usage_records WHERE id IN (SELECT DISTINCT usage_record_id FROM label_evidence WHERE label = labels.label AND usage_record_id IS NOT NULL)) AS estimate_label
@@ -51,6 +56,7 @@ ORDER BY estimated_ai_credits DESC, labels.label`);
  }

  async function labelSummary(dbPath, label) {
+   await initStore(dbPath);
    const rows = await queryRows(dbPath, `
  SELECT
    labels.label,
@@ -63,6 +69,10 @@ SELECT
    COALESCE((SELECT SUM(cache_creation_tokens) FROM usage_records WHERE id IN (SELECT DISTINCT usage_record_id FROM label_evidence WHERE label = labels.label AND usage_record_id IS NOT NULL)), 0) AS cache_creation_tokens,
    COALESCE((SELECT SUM(reasoning_tokens) FROM usage_records WHERE id IN (SELECT DISTINCT usage_record_id FROM label_evidence WHERE label = labels.label AND usage_record_id IS NOT NULL)), 0) AS reasoning_tokens,
    COALESCE((SELECT SUM(COALESCE(estimated_ai_credits, 0)) FROM usage_records WHERE id IN (SELECT DISTINCT usage_record_id FROM label_evidence WHERE label = labels.label AND usage_record_id IS NOT NULL)), 0) AS estimated_ai_credits,
+   CASE
+     WHEN (SELECT COUNT(DISTINCT usage_record_id) FROM label_evidence WHERE label = labels.label AND usage_record_id IS NOT NULL) = 0 THEN 'hook-only'
+     ELSE 'token-bearing'
+   END AS token_status,
    (SELECT MIN(COALESCE(ur.timestamp, le.timestamp, le.imported_at)) FROM label_evidence le LEFT JOIN usage_records ur ON ur.id = le.usage_record_id WHERE le.label = labels.label) AS first_seen,
    (SELECT MAX(COALESCE(ur.timestamp, le.timestamp, le.imported_at)) FROM label_evidence le LEFT JOIN usage_records ur ON ur.id = le.usage_record_id WHERE le.label = labels.label) AS last_seen,
    (SELECT MAX(estimate_label) FROM usage_records WHERE id IN (SELECT DISTINCT usage_record_id FROM label_evidence WHERE label = labels.label AND usage_record_id IS NOT NULL)) AS estimate_label
@@ -71,6 +81,7 @@ FROM (SELECT DISTINCT label FROM label_evidence WHERE label = ?) labels`, [canon
  }

  async function labelDetails(dbPath, label) {
+   await initStore(dbPath);
    return queryRows(dbPath, `
  SELECT
    le.label,
@@ -87,6 +98,9 @@ SELECT
    ur.resolved_model,
    ur.input_tokens,
    ur.output_tokens,
+   ur.cache_read_tokens,
+   ur.cache_creation_tokens,
+   ur.reasoning_tokens,
    ur.estimated_ai_credits,
    ur.estimate_label,
    COALESCE(ur.timestamp, le.timestamp, le.imported_at) AS timestamp
@@ -97,6 +111,7 @@ ORDER BY timestamp, le.source_type, le.source_field`, [canonicalLabel(label)]);
  }

  async function modelReport(dbPath) {
+   await initStore(dbPath);
    return queryRows(dbPath, `
  SELECT
    COALESCE(resolved_model, requested_model, 'unknown') AS model,
@@ -114,6 +129,7 @@ ORDER BY estimated_ai_credits DESC, model`);
  }

  async function repoReport(dbPath) {
+   await initStore(dbPath);
    return queryRows(dbPath, `
  SELECT
    COALESCE(repo, 'unknown') AS repo,
@@ -130,6 +146,7 @@ ORDER BY estimated_ai_credits DESC, repo, cwd`);
  }

  async function unattributedReport(dbPath) {
+   await initStore(dbPath);
    return queryRows(dbPath, `
  SELECT
    ur.id,
@@ -156,13 +173,18 @@ ORDER BY ur.timestamp, ur.id`);
  function formatLabels(rows) {
    return [
      table(
-       ['Label', 'Sessions', 'Input', 'Output', 'Credits', 'Evidence', 'Last seen'],
+       ['Label', 'Sessions', 'Usage', 'Input', 'Output', 'Cache read', 'Cache create', 'Reasoning', 'Credits', 'Status', 'Evidence', 'Last seen'],
        rows.map((row) => [
          row.label,
          row.sessions,
+         row.usage_records,
          formatNumber(row.input_tokens),
          formatNumber(row.output_tokens),
+         formatNumber(row.cache_read_tokens),
+         formatNumber(row.cache_creation_tokens),
+         formatNumber(row.reasoning_tokens),
          formatCredits(row.estimated_ai_credits),
+         row.token_status,
          row.evidence_count,
          row.last_seen || '',
        ]),
@@ -176,18 +198,35 @@ function formatLabelSummary(summary, details = null) {
    if (!summary) return 'No usage found for label.';
    const lines = [
      table(
-       ['Label', 'Sessions', 'Input', 'Output', 'Credits', 'Evidence'],
-       [[summary.label, summary.sessions, formatNumber(summary.input_tokens), formatNumber(summary.output_tokens), formatCredits(summary.estimated_ai_credits), summary.evidence_count]],
+       ['Label', 'Sessions', 'Usage', 'Input', 'Output', 'Cache read', 'Cache create', 'Reasoning', 'Credits', 'Status', 'Evidence'],
+       [[
+         summary.label,
+         summary.sessions,
+         summary.usage_records,
+         formatNumber(summary.input_tokens),
+         formatNumber(summary.output_tokens),
+         formatNumber(summary.cache_read_tokens),
+         formatNumber(summary.cache_creation_tokens),
+         formatNumber(summary.reasoning_tokens),
+         formatCredits(summary.estimated_ai_credits),
+         summary.token_status,
+         summary.evidence_count,
+       ]],
      ),
    ];
    if (details) {
      lines.push('', table(
-       ['Source', 'Field', 'Session', 'Model', 'Credits', 'Value'],
+       ['Source', 'Field', 'Session', 'Model', 'Input', 'Output', 'Cache read', 'Cache create', 'Reasoning', 'Credits', 'Value'],
        details.map((row) => [
          row.source_type,
          row.source_field,
          row.session_id || '',
          row.resolved_model || '',
+         formatNumber(row.input_tokens),
+         formatNumber(row.output_tokens),
+         formatNumber(row.cache_read_tokens),
+         formatNumber(row.cache_creation_tokens),
+         formatNumber(row.reasoning_tokens),
          formatCredits(row.estimated_ai_credits),
          row.source_value || '',
        ]),
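Reviewer sketch: the `token_status` CASE expression added to the label queries classifies a label by whether any of its evidence rows point at a usage record. The same rule can be written in plain JavaScript as below. The evidence rows and the `DEMO-67890` label are made-up sample data, and `tokenStatusByLabel` is a hypothetical helper, not a function in the package.

```javascript
'use strict';

// Made-up label_evidence rows: usage_record_id is null when the label
// was seen only in a hook event.
const labelEvidence = [
  { label: 'DEMO-12345', usage_record_id: 1 },
  { label: 'DEMO-12345', usage_record_id: 1 },
  { label: 'DEMO-12345', usage_record_id: null },
  { label: 'DEMO-67890', usage_record_id: null },
];

// Mirrors the CASE expression: a label with zero distinct non-null
// usage_record_id values is 'hook-only', otherwise 'token-bearing'.
function tokenStatusByLabel(rows) {
  const usageIds = new Map();
  for (const row of rows) {
    if (!usageIds.has(row.label)) usageIds.set(row.label, new Set());
    if (row.usage_record_id !== null) usageIds.get(row.label).add(row.usage_record_id);
  }
  const status = {};
  for (const [label, ids] of usageIds) {
    status[label] = ids.size === 0 ? 'hook-only' : 'token-bearing';
  }
  return status;
}

const statuses = tokenStatusByLabel(labelEvidence);
console.log(statuses);
```

A hook-only label still appears in the report, but its token and credit columns come out as zero because no usage record is attached to it.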
package/src/setup.js CHANGED
@@ -45,6 +45,7 @@ function ensureDataDirs(paths) {
    if (!fs.existsSync(paths.configJson)) {
      writePrivateFile(paths.configJson, `${JSON.stringify({
        version: 1,
+       dataHome: paths.home,
        contentCapture: false,
        telemetry: {
          vscode: paths.vscodeOtelJsonl,
@@ -93,6 +94,11 @@ function packageBinCommand(cwd) {
    return path.join(cwd, 'bin', 'copilot-metrics.js');
  }

+ function commandInvocation(command) {
+   const quoted = shellQuote(command);
+   return command.endsWith('.js') ? `node ${quoted}` : quoted;
+ }
+
  function hookEventsForSurface(surface) {
    if (surface === 'copilot-cli' || surface === 'both') return COPILOT_CLI_HOOK_EVENTS;
    if (surface === 'vscode') return VSCODE_HOOK_EVENTS;
@@ -100,7 +106,7 @@ function hookEventsForSurface(surface) {
  }

  function hookCommand(command, event, metricsHome) {
-   return `COPILOT_METRICS_HOME=${shellQuote(metricsHome)} node ${shellQuote(command)} hook-log --event ${shellQuote(event)} --quiet`;
+   return `COPILOT_METRICS_HOME=${shellQuote(metricsHome)} ${commandInvocation(command)} hook-log --event ${shellQuote(event)} --quiet`;
  }

  function hookConfig(paths, options = {}) {
@@ -171,6 +177,7 @@ function installHook(paths, options = {}) {

  function setupSnapshot(options = {}) {
    const paths = resolvePaths(options);
+   ensureDataDirs(paths);
    return {
      paths,
      vscode: vscodeSettings(paths),
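Reviewer sketch: the effect of `commandInvocation` on generated hook commands can be seen by running the two cases side by side. The package's own `shellQuote` is not part of this diff, so the version below is an assumption (POSIX-style single quoting); the paths and the event name are likewise made up for illustration.

```javascript
'use strict';

// Assumed helper: the package's real shellQuote is not shown in this diff.
// This version wraps the value in single quotes, POSIX-shell style.
function shellQuote(value) {
  return `'${String(value).replace(/'/g, `'\\''`)}'`;
}

// Same logic as the diff: prefix with `node` only for .js entrypoints.
function commandInvocation(command) {
  const quoted = shellQuote(command);
  return command.endsWith('.js') ? `node ${quoted}` : quoted;
}

function hookCommand(command, event, metricsHome) {
  return `COPILOT_METRICS_HOME=${shellQuote(metricsHome)} ${commandInvocation(command)} hook-log --event ${shellQuote(event)} --quiet`;
}

// Hypothetical paths and event name, for illustration only.
const fromCheckout = hookCommand('/repo/bin/copilot-metrics.js', 'sessionStart', '/data');
const fromShim = hookCommand('/usr/local/bin/copilot-metrics', 'sessionStart', '/data');

console.log(fromCheckout);
console.log(fromShim);
```

The checkout-local entrypoint keeps its `node` prefix, while an installed shim is invoked directly. In 0.1.0 the shim would have been passed to `node` as if it were a script.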
package/src/sqlite-store.js CHANGED
@@ -29,6 +29,19 @@ function persistDatabase(dbPath, db) {
    }
  }

+ function hasColumn(db, table, column) {
+   const result = db.exec(`PRAGMA table_info(${table})`);
+   if (!result.length) return false;
+   const nameIndex = result[0].columns.indexOf('name');
+   return result[0].values.some((row) => row[nameIndex] === column);
+ }
+
+ function addColumnIfMissing(db, table, column, definition) {
+   if (!hasColumn(db, table, column)) {
+     db.run(`ALTER TABLE ${table} ADD COLUMN ${column} ${definition}`);
+   }
+ }
+
  async function initStore(dbPath) {
    const db = await openDatabase(dbPath);
    db.run(`
@@ -36,7 +49,9 @@ CREATE TABLE IF NOT EXISTS raw_records (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    imported_at TEXT NOT NULL,
    source TEXT NOT NULL,
+   source_file TEXT,
    line INTEGER NOT NULL,
+   raw_fingerprint TEXT,
    payload_json TEXT NOT NULL
  );
  CREATE TABLE IF NOT EXISTS usage_records (
@@ -105,6 +120,9 @@ CREATE TABLE IF NOT EXISTS import_warnings (
    message TEXT NOT NULL
  );
  `);
+   addColumnIfMissing(db, 'raw_records', 'source_file', 'TEXT');
+   addColumnIfMissing(db, 'raw_records', 'raw_fingerprint', 'TEXT');
+   db.run('CREATE UNIQUE INDEX IF NOT EXISTS idx_raw_records_fingerprint ON raw_records (source, source_file, raw_fingerprint)');
    persistDatabase(dbPath, db);
  }

@@ -147,7 +165,25 @@ function insertLabelEvidence(db, importedAt, evidenceRows) {
    );
  }

- async function insertImport(dbPath, source, rawRecords, usageRecords, hookEvents, warnings) {
+ async function existingRawFingerprints(dbPath, source, sourceFile, fingerprints) {
+   await initStore(dbPath);
+   if (!fingerprints.length) return new Set();
+   const db = await openDatabase(dbPath);
+   const existing = new Set();
+   const statement = db.prepare('SELECT 1 FROM raw_records WHERE source = ? AND source_file = ? AND raw_fingerprint = ? LIMIT 1');
+   try {
+     for (const fingerprint of fingerprints) {
+       statement.bind([source, sourceFile, fingerprint]);
+       if (statement.step()) existing.add(fingerprint);
+       statement.reset();
+     }
+   } finally {
+     statement.free();
+   }
+   return existing;
+ }
+
+ async function insertImport(dbPath, source, sourceFile, rawRecords, usageRecords, hookEvents, warnings) {
    await initStore(dbPath);
    const db = await openDatabase(dbPath);
    const importedAt = new Date().toISOString();
@@ -156,8 +192,8 @@ async function insertImport(dbPath, source, rawRecords, usageRecords, hookEvents
    try {
      runPrepared(
        db,
-       'INSERT INTO raw_records (imported_at, source, line, payload_json) VALUES (?, ?, ?, ?)',
-       rawRecords.map((record) => [importedAt, source, record.line, JSON.stringify(record.value)]),
+       'INSERT OR IGNORE INTO raw_records (imported_at, source, source_file, line, raw_fingerprint, payload_json) VALUES (?, ?, ?, ?, ?, ?)',
+       rawRecords.map((record) => [importedAt, source, sourceFile, record.line, record.raw_fingerprint || null, JSON.stringify(record.value)]),
      );

      const labelEvidence = [];
@@ -283,6 +319,7 @@ async function queryRows(dbPath, sql, params = []) {
  }

  module.exports = {
+   existingRawFingerprints,
    initStore,
    insertImport,
    queryOne,
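Reviewer sketch: the `addColumnIfMissing` migration helper can be tried without a real database. The example assumes the driver returns result sets shaped as `[{ columns, values }]`, which is what the `hasColumn` code in this diff implies. `createStubDb` is a hypothetical in-memory stand-in written for this illustration, not part of the package or of any SQLite driver.

```javascript
'use strict';

// Hypothetical stub exposing exec() and run() in the shape the diff relies on:
// exec() returns [{ columns, values }], or [] when there are no rows.
function createStubDb(columnNames) {
  const names = [...columnNames];
  return {
    statements: [],
    exec(sql) {
      if (!sql.startsWith('PRAGMA table_info(')) return [];
      if (!names.length) return [];
      return [{
        // Column layout of SQLite's PRAGMA table_info output.
        columns: ['cid', 'name', 'type', 'notnull', 'dflt_value', 'pk'],
        values: names.map((name, cid) => [cid, name, 'TEXT', 0, null, 0]),
      }];
    },
    run(sql) {
      this.statements.push(sql);
      const match = /ADD COLUMN (\w+)/.exec(sql);
      if (match) names.push(match[1]);
    },
  };
}

// Same logic as the helpers added in the diff.
function hasColumn(db, table, column) {
  const result = db.exec(`PRAGMA table_info(${table})`);
  if (!result.length) return false;
  const nameIndex = result[0].columns.indexOf('name');
  return result[0].values.some((row) => row[nameIndex] === column);
}

function addColumnIfMissing(db, table, column, definition) {
  if (!hasColumn(db, table, column)) {
    db.run(`ALTER TABLE ${table} ADD COLUMN ${column} ${definition}`);
  }
}

// Start from the 0.1.0 raw_records columns, then apply the 0.1.1 additions.
const db = createStubDb(['id', 'imported_at', 'source', 'line', 'payload_json']);
addColumnIfMissing(db, 'raw_records', 'source_file', 'TEXT');
addColumnIfMissing(db, 'raw_records', 'source_file', 'TEXT'); // second call is a no-op
addColumnIfMissing(db, 'raw_records', 'raw_fingerprint', 'TEXT');

console.log(db.statements);
```

Checking the column list before each `ALTER TABLE` lets `initStore` run on every report without failing on a store that was already migrated.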