vg-coder-cli 2.0.59 → 2.0.61

package/bugs/bug1.md DELETED
@@ -1,493 +0,0 @@
1
- # Bug 1: `open-tab` API silently downgrades model — no validation, no error
2
-
3
- **Reporter**: medgraph integration (chrome-mcp-vgcoder consumer)
4
- **Date filed**: 2026-05-10
5
- **Severity**: 🟠 MAJOR — silent quality degradation, hard for caller to detect
6
- **Affects**: `vetgo-server2.duckdns.org` (server2, account `phathuy.vetgo@gmail.com`), production deployment
7
- **Service URL**: `https://vetgo.webmcp.vn/vg/api/launcher/open-tab`
8
-
9
- ---
10
-
11
- ## Status: 🟢 RESOLVED (2026-05-10, v2.0.57)
12
-
13
- **Real root cause** (finally identified after 5 rounds of debugging):
14
-
15
- `vetgo-auto/scripts/aistudio.google.com/main.js` hardcodes
16
- `VG_DEFAULT_MODEL='gemini-3-flash-preview'`. `task-worker.js:handleTaskExecute`
17
- calls `startNewChat()` at the start of every task, which navigates to
18
- `/prompts/new_chat?model=gemini-3-flash-preview` (dropping the `?model=` pinned by the caller).
19
- The task runs on Flash regardless of what the caller requested.
20
-
21
- **Evidence** (verified on server2 ULTRA, 2026-05-10):
22
- - Before v2.0.57: pin `gemini-3.1-pro-preview` → observed via noVNC: tab navigates to
23
- Pro → reloads as Flash → `actualModel: gemini-3-flash-preview`
24
- - After v2.0.57: pin `gemini-3.1-pro-preview` → tab stays on Pro → `actualModel:
25
- gemini-3.1-pro-preview` ✅
26
-
27
- **Fix shipped**:
28
- - v2.0.52: `open-tab` response adds `requested_model` / `actual_model` /
29
- `fallback_occurred` (URL-based, partial detection)
30
- - v2.0.53: `task.result.actualModel`: the worker scrapes the `<ms-model-selector>` DOM
31
- after the task completes (accurate detection)
32
- - v2.0.55: `_pinnedModelByEmail` Map: `_recycleWorkerTab` reopens with the pinned
33
- model (keeps the pin across multiple tasks)
34
- - **v2.0.57: REAL FIX**: `getTargetModel()` reads the model from the current URL +
35
- sessionStorage cache, replacing every hardcoded `VG_DEFAULT_MODEL`. `startNewChat`
36
- and `pinPromptModel` no longer override the caller's pin.
37
-
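The resolution order described for the v2.0.57 fix (current URL first, then a sessionStorage cache) can be illustrated with a minimal sketch. This is not the shipped `getTargetModel()`: the function name `resolveTargetModel`, the cache key `vg_target_model`, the in-memory storage stand-in, and the final fallback to the Flash default are all assumptions made for illustration.

```javascript
// Hypothetical sketch of "URL first, then cached pin, then default".
// DEFAULT_MODEL is the default named in this report; CACHE_KEY is an assumed key.
const DEFAULT_MODEL = 'gemini-3-flash-preview';
const CACHE_KEY = 'vg_target_model';

function resolveTargetModel(href, storage) {
  // 1. A ?model= value on the current URL wins and refreshes the cache.
  const fromUrl = new URL(href).searchParams.get('model');
  if (fromUrl) {
    storage.setItem(CACHE_KEY, fromUrl);
    return fromUrl;
  }
  // 2. Otherwise reuse the last pinned model, if one was cached.
  const cached = storage.getItem(CACHE_KEY);
  if (cached) return cached;
  // 3. Only fall back to the default when nothing was ever pinned.
  return DEFAULT_MODEL;
}

// In-memory stand-in for sessionStorage so the sketch runs outside a browser.
function createMemoryStorage() {
  const data = new Map();
  return {
    getItem: (key) => (data.has(key) ? data.get(key) : null),
    setItem: (key, value) => { data.set(key, String(value)); },
  };
}
```

In a browser the second argument would be `window.sessionStorage`; the stand-in only exists so the resolution order can be exercised in Node.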
38
- **Recommended client pattern**:
39
- 1. Pin the model via `POST /api/launcher/open-tab` with body `{model: "..."}`
40
- 2. Submit the task, then check `task.result.actualModel === expectedModel`
41
- 3. If they differ, AI Studio fell back (the account has no access). Decide whether to retry,
42
- fail fast, or accept, depending on the use case.
43
-
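The client-side check in steps 2 and 3 can be sketched as a small pure function. Only the `actualModel` field of the task result comes from this report; the function name `checkModel` and the `status` values are hypothetical, and the HTTP calls that pin the model and submit the task are deliberately left out.

```javascript
// Hypothetical helper: classify a finished task by comparing the model the
// caller pinned with the model the worker reported in task.result.actualModel.
function checkModel(expectedModel, taskResult) {
  const actualModel =
    taskResult && typeof taskResult.actualModel === 'string'
      ? taskResult.actualModel
      : null;

  if (actualModel === null) {
    // Older workers may not report the field; treat this as unverifiable.
    return { status: 'unknown', expectedModel, actualModel };
  }
  if (actualModel === expectedModel) {
    return { status: 'match', expectedModel, actualModel };
  }
  // The pinned model was not honored; the caller decides whether to
  // retry, fail fast, or accept the result.
  return { status: 'fallback', expectedModel, actualModel };
}
```

A caller with strict quality requirements would likely treat both `fallback` and `unknown` as failures, while a dev or testing pipeline might accept them and log a warning.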
44
- **Code refs**:
45
- - `vetgo-auto/scripts/aistudio.google.com/main.js:20-37`: `getTargetModel()`
46
- (real fix, primary)
47
- - `src/server/views/js/features/task-worker.js:210-236`: `readActualModel()`
48
- scrapes the DOM (verification layer)
49
- - `src/server/task-queue.js:42-44, 491-494`: `_pinnedModelByEmail` persists
50
- across recycles (secondary fix)
51
- - `vetgo-auto/chrome/src/launcher.ts`: open-tab handler, URL-based detection
52
- (legacy; not reliable with newer AI Studio versions)
53
- - `INTEGRATION.md "Verify model thực"`: client docs with the recommended pattern
54
-
55
- ---
56
-
57
- ## Summary
58
-
59
- The `POST /api/launcher/open-tab` endpoint accepts any string in the
60
- `model` field, returns a success response with that model name in the
61
- URL, but **the actual Chromium tab silently loads a different model**
62
- (typically `gemini-3-flash-preview`) when the requested model is not
63
- available to the worker's logged-in Google account.
64
-
65
- The API caller has **no way to detect** the downgrade — both the API
66
- response and the URL contain the requested model name. Only by visually
67
- inspecting the AI Studio sidebar via noVNC can you see which model is
68
- actually loaded.
69
-
70
- This causes downstream tasks to run on a different model than the
71
- caller assumes, which silently degrades extraction quality without any
72
- warning.
73
-
74
- ---
75
-
76
- ## How to reproduce
77
-
78
- ### Step 1 — Request `gemini-3-pro-preview` (or any model not available to the account)
79
-
80
- ```bash
81
- BASE=https://vetgo.webmcp.vn/vg
82
-
83
- # Close existing tab first
84
- curl -X POST -H 'Content-Type: application/json' -d '{}' $BASE/api/launcher/close-tab
85
-
86
- # Open with pro-preview model
87
- curl -X POST -H 'Content-Type: application/json' \
88
- -d '{"model":"gemini-3-pro-preview"}' \
89
- $BASE/api/launcher/open-tab
90
- ```
91
-
92
- ### Step 2 — Observe API response (looks successful)
93
-
94
- ```json
95
- {
96
- "ok": true,
97
- "tabId": 477055248,
98
- "windowId": 477055217,
99
- "url": "https://aistudio.google.com/prompts/new_chat?model=gemini-3-pro-preview"
100
- }
101
- ```
102
-
103
- HTTP status: `200 OK`. URL embeds the requested model. Caller assumes success.
104
-
105
- ### Step 3 — Inspect actual tab via noVNC at `https://vetgo.webmcp.vn/vnc.html`
106
-
107
- Look at AI Studio sidebar (right pane). Observe:
108
-
109
- - **Sidebar title**: "Gemini 3 Flash Preview" (NOT "Gemini 3 Pro Preview")
110
- - **Model identifier** under title: `gemini-3-flash-preview`
111
- - **Browser URL bar**: shows `?model=gemini-3-flash-preview` (changed from request)
112
-
113
- Screenshot evidence (medgraph user observed 2026-05-10):
114
- ```
115
- URL: aistudio.google.com/prompts/new_chat?model=gemini-3-flash-preview
116
- Sidebar: "Gemini 3 Flash Preview"
117
- "gemini-3-flash-preview"
118
- "Our most intelligent model built for speed,
119
- combining frontier intelligence with superior
120
- search and grounding."
121
- ```
122
-
123
- ### Step 4 — Confirm task uses Flash, not Pro
124
-
125
- Submit a multimodal task and measure duration:
126
-
127
- ```bash
128
- curl -F prompt="Describe this PDF chapter structure as JSON" \
129
- -F files=@/tmp/test_chapter.pdf \
130
- $BASE/api/tasks
131
- ```
132
-
133
- Observed: 38-page PDF processed in **58 seconds**.
134
-
135
- Expected on `gemini-3-pro-preview`: ~3–5 minutes for 38-page PDF (per
136
- Google's published Pro model latency benchmarks).
137
-
138
- → 58s ≪ Pro baseline strongly indicates Flash, not Pro.
139
-
140
- ---
141
-
142
- ## Expected behavior
143
-
144
- `/api/launcher/open-tab` SHOULD **either**:
145
-
146
- **Option A (preferred — fail-fast)**:
147
- - Validate the requested `model` against a whitelist of models actually available to the worker account
148
- - If model not available, return `400 Bad Request` with body
149
- ```json
150
- {
151
- "ok": false,
152
- "error": "model_not_available",
153
- "requested": "gemini-3-pro-preview",
154
- "available": ["gemini-3-flash-preview", "gemini-2.5-flash", "gemini-2.5-pro"],
155
- "message": "Model 'gemini-3-pro-preview' not accessible to account phathuy.vetgo@gmail.com. Choose from available list."
156
- }
157
- ```
158
-
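A minimal sketch of the Option A check follows, assuming the per-account list of available models has already been obtained somehow. The function name `validateRequestedModel` and the `{ status, body }` return shape are hypothetical; the error body mirrors the JSON proposed above.

```javascript
// Hypothetical fail-fast validation: reject a requested model that is not in
// the account's list of available models, using the error body proposed above.
function validateRequestedModel(requested, available, account) {
  if (available.includes(requested)) {
    return { status: 200, body: { ok: true } };
  }
  return {
    status: 400,
    body: {
      ok: false,
      error: 'model_not_available',
      requested,
      available,
      message:
        `Model '${requested}' not accessible to account ${account}. ` +
        'Choose from available list.',
    },
  };
}
```

A route handler would call this before opening the tab and send `status` and `body` back unchanged. How the `available` list is built is out of scope for the sketch.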
159
- **Option B (acceptable — report fallback)**:
160
- - Allow tab to open with whatever AI Studio gives back
161
- - Detect the actual loaded model from the resulting tab URL or DOM
162
- - Return the **actually loaded** model in the response
163
- ```json
164
- {
165
- "ok": true,
166
- "tabId": 477055248,
167
- "requested_model": "gemini-3-pro-preview",
168
- "actual_model": "gemini-3-flash-preview",
169
- "fallback_occurred": true,
170
- "url": "https://aistudio.google.com/prompts/new_chat?model=gemini-3-flash-preview"
171
- }
172
- ```
173
-
174
- Either way, the caller MUST be informed when the loaded model differs
175
- from the requested model.
176
-
177
- ---
178
-
179
- ## Actual behavior (broken)
180
-
181
- API response:
182
- ```json
183
- {
184
- "ok": true,
185
- "tabId": 477055248,
186
- "windowId": 477055217,
187
- "url": "https://aistudio.google.com/prompts/new_chat?model=gemini-3-pro-preview"
188
- }
189
- ```
190
-
191
- But Chromium tab loads `?model=gemini-3-flash-preview` (visible in noVNC
192
- URL bar + sidebar). The `url` field in the API response is **the
193
- requested URL, not the loaded URL**.
194
-
195
- ---
196
-
197
- ## Root cause hypotheses (for fixer to investigate)
198
-
199
- 1. **AI Studio frontend silently falls back** when the account lacks
200
- access to a preview model. The Chromium tab navigates from
201
- `?model=gemini-3-pro-preview` → `?model=gemini-3-flash-preview`
202
- without any error toast.
203
-
204
- 2. **`open-tab` handler returns the requested URL immediately** without
205
- waiting for navigation to settle. It does NOT poll the tab's actual
206
- final URL after AI Studio's redirect.
207
-
208
- 3. **No model whitelist validation** in `open-tab` — any string is
209
- accepted, even nonsense like `gemini-99-fake`. Tested with multiple
210
- strings including `gemini-3-pro` (note: without `-preview` suffix);
211
- all return `ok: true` regardless of validity. Each request returns
212
- the requested URL verbatim, regardless of whether AI Studio actually
213
- honors that model.
214
-
215
- ---
216
-
217
- ## Test cases for fixer
218
-
219
- | Input model | Account permission | Expected result |
220
- |-------------|-------------------|-----------------|
221
- | `gemini-3-flash-preview` | ✅ has access | open success, actual_model = requested |
222
- | `gemini-2.5-flash` | ✅ has access | open success, actual_model = requested |
223
- | `gemini-3-pro-preview` | ❌ no access | Option A: 400 error / Option B: ok with `actual_model: gemini-3-flash-preview, fallback_occurred: true` |
224
- | `gemini-3-pro` | ❌ doesn't exist | Option A: 400 invalid model / Option B: report actual fallback |
225
- | `gemini-99-fake` | ❌ doesn't exist | Option A: 400 invalid model |
226
- | (omitted `model` field) | — | Use default model, log which default chosen |
227
-
228
- ---
229
-
230
- ## Workaround (current — manual)
231
-
232
- Until fixed, callers must:
233
-
234
- 1. Open tab via API
235
- 2. Open `https://vetgo.webmcp.vn/vnc.html` separately
236
- 3. Visually inspect AI Studio sidebar to verify model
237
- 4. If wrong model loaded, manually click model dropdown in AI Studio UI
238
- to select correct one
239
-
240
- This defeats the purpose of programmatic tab control. The bug is
241
- particularly insidious because:
242
-
243
- - **Output quality silently degrades** (Flash vs Pro for the same task)
244
- - **No error logged anywhere** — task succeeds, looks fine
245
- - **Token count similar** so caller can't detect via metrics
246
- - **Only manual visual inspection** reveals the issue
247
-
248
- In the medgraph case, this caused a recon task to run on Flash when
249
- caller assumed Pro, leading to ~5x faster completion time but with
250
- unknown quality trade-off. For production extraction of clinical
251
- veterinary protocols (high-liability content), this silent downgrade
252
- is unacceptable.
253
-
254
- ---
255
-
256
- ## Evidence collected (2026-05-10 03:29 UTC)
257
-
258
- ### API responses
259
-
260
- ```bash
261
- # Test 1: Request pro-preview
262
- $ curl -X POST -d '{"model":"gemini-3-pro-preview"}' \
263
- $BASE/api/launcher/open-tab
264
- {"ok":true,"tabId":477055264,"windowId":477055217,
265
- "url":"https://aistudio.google.com/prompts/new_chat?model=gemini-3-pro-preview"}
266
-
267
- # Test 2: Request 2.5-pro (might also fallback if account lacks access)
268
- $ curl -X POST -d '{"model":"gemini-2.5-pro"}' \
269
- $BASE/api/launcher/open-tab
270
- {"ok":true,"tabId":477055260,"windowId":477055217,
271
- "url":"https://aistudio.google.com/prompts/new_chat?model=gemini-2.5-pro"}
272
-
273
- # Test 3: Request gemini-3-pro (no -preview suffix, possibly invalid)
274
- $ curl -X POST -d '{"model":"gemini-3-pro"}' \
275
- $BASE/api/launcher/open-tab
276
- {"ok":true,"tabId":477055262,"windowId":477055217,
277
- "url":"https://aistudio.google.com/prompts/new_chat?model=gemini-3-pro"}
278
- ```
279
-
280
- All three return `ok: true` regardless of whether the account actually
281
- has access to the model. The API is unable to distinguish a successful
282
- load from a silent fallback.
283
-
284
- ### Visual evidence from noVNC (after Test 1)
285
-
286
- Browser URL bar reads: `aistudio.google.com/prompts/new_chat?model=gemini-3-flash-preview`
287
-
288
- AI Studio Run-settings sidebar shows:
289
- ```
290
- Gemini 3 Flash Preview
291
- gemini-3-flash-preview
292
- Our most intelligent model built for speed,
293
- combining frontier intelligence with superior
294
- search and grounding.
295
- ```
296
-
297
- NOT `gemini-3-pro-preview` as requested.
298
-
299
- ### Task duration evidence
300
-
301
- Submitted PDF (38 pages, 2.2MB, structured-extraction prompt):
302
- - Task ID: `t_1778383267520_b0c68b`
303
- - `durationMs`: `58047` ms (≈ 58 seconds)
304
- - Worker: `phathuy.vetgo@gmail.com`
305
-
306
- Reference benchmarks (Google AI Studio published latencies, Apr 2026):
307
- - `gemini-3-flash-preview` typical 30-60s for 30+ page PDF + JSON output
308
- - `gemini-3-pro-preview` typical 3-9 minutes for same workload
309
-
310
- Observed 58s strongly aligns with Flash, not Pro.
311
-
312
- ---
313
-
314
- ## Fix priority justification
315
-
316
- This bug breaks the **only programmatic mechanism for model selection**
317
- in chrome-mcp-vgcoder. README.md (lines 64-69) documents `open-tab` as
318
- the way to "lock model" — but in practice, this lock is non-functional
319
- when the account lacks the requested model.
320
-
321
- For automated pipelines (e.g. medgraph Layer 2 extraction), this means:
322
-
323
- - Cannot guarantee task quality without manual noVNC verification
324
- - Cannot run unattended batch jobs with confidence
325
- - Cannot programmatically detect/recover from model unavailability
326
- - Account quota issues are masked as "task succeeded" when actually
327
- running on a degraded model
328
-
329
- Recommend Option A (fail-fast validation) for production deployments
330
- to surface account permission issues immediately, with Option B as a
331
- fallback for graceful degradation in dev/testing.
332
-
333
- ---
334
-
335
- ## Suggested implementation pointers
336
-
337
- For the AI agent fixing this bug:
338
-
339
- 1. **Locate `open-tab` handler** — likely in `vg-coder-cli` source under
340
- `src/server/launcher/` or similar. Search for handler accepting POST
341
- on path matching `/api/launcher/open-tab`.
342
-
343
- 2. **Add post-navigation poll**: after Chromium navigates to the
344
- requested URL, wait for AI Studio frontend to settle (debounce ~2-3s
345
- or watch for DOM "model selector loaded" event), then read the
346
- actual `model` query param from `tab.url` (the URL after any
347
- redirects).
348
-
349
- 3. **Compare requested vs actual**:
350
- ```js
351
- const actualModel = new URL(tab.url).searchParams.get('model');
352
- const requestedModel = req.body.model;
353
- const fallbackOccurred = actualModel && actualModel !== requestedModel;
354
- ```
355
-
356
- 4. **For Option A**: also maintain a per-account model whitelist
357
- (probably needs to scrape AI Studio model dropdown DOM once at
358
- worker boot and cache).
359
-
360
- 5. **Test on multiple accounts**: server2 (`phathuy.vetgo`) and
361
- server3 (`udymec`) may have different model access — the fix must
362
- work for both.
363
-
364
- 6. **Update README.md**: document the new response shape and any
365
- account-tier limitations on which models are available.
366
-
367
- 7. **Add integration test** in `vg-coder-cli/tests/` that asserts
368
- `actual_model === requested_model` after `open-tab`, and fails
369
- loudly when fallback occurs without warning.
370
-
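The comparison in pointer 3 can be wrapped into a function that builds the Option B response shape. This is a sketch under assumptions: the name `describeOpenTab` is hypothetical, and it presumes the final tab URL really reflects the loaded model, which the debug timeline in this report found not to hold on current AI Studio versions.

```javascript
// Hypothetical builder for the Option B response: compare the requested model
// with the ?model= value of the tab URL after navigation has settled.
function describeOpenTab(requestedModel, finalTabUrl) {
  const actualModel = new URL(finalTabUrl).searchParams.get('model');
  // If the URL carries no model parameter, the loaded model is unknown and
  // this check alone cannot claim that a fallback happened.
  const fallbackOccurred = actualModel !== null && actualModel !== requestedModel;
  return {
    ok: true,
    requested_model: requestedModel,
    actual_model: actualModel,
    fallback_occurred: fallbackOccurred,
    url: finalTabUrl,
  };
}
```

Because a missing `model` parameter yields `actual_model: null`, callers should read `null` as "not verified" and not as "no fallback".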
371
- ---
372
-
373
- ## Related code areas (for fixer to grep)
374
-
375
- - Handler: search for `app.post('/api/launcher/open-tab'` or similar route registration
376
- - Tab navigation: search for `chrome.tabs.update` or `chrome.tabs.create` with `url:` containing `aistudio.google.com`
377
- - Worker registration: `meta.domain === 'aistudio.google.com'` likely involves model param parsing already
378
- - Reference: `chrome-mcp-vgcoder/README.md` lines 64-69 document the contract this bug violates
379
-
380
- ---
381
-
382
- ## Out of scope for this bug
383
-
384
- - Fixing AI Studio's silent fallback behavior itself (that's Google's UI)
385
- - Adding model selection per-task (current architecture is per-worker; this bug only addresses
386
- per-worker model selection accuracy)
387
- - Quota management / fallback strategy when preview models hit limits
388
-
389
- ---
390
-
391
- ## Debug timeline (2026-05-10)
392
-
393
- The fix took 5 rounds because the root cause was guessed wrong several times. Recorded here to avoid repeating the mistakes:
394
-
395
- ### Round 1 (v2.0.52) — URL-based detection ❌
396
- Hypothesis: AI Studio redirects the URL on fallback. Added `requested_model` /
397
- `actual_model` derived from the URL to the open-tab response.
398
- **Wrong because**: current AI Studio versions do **not** redirect the URL (they may have
399
- in the past?). The URL stays the same even on fallback.
400
-
401
- ### Round 2 (v2.0.53) — DOM scrape after task ✅ partial
402
- Worker `readActualModel()` scrapes `<ms-model-selector>` after the task completes. Verified
403
- to work with a free-tier account (request a fake model → the DOM returns the real model).
404
-
405
- ### Round 3 — Test with ULTRA account, still Flash
406
- Hypothesis: the model pin is lost after `_recycleWorkerTab` (close + reopen with the default).
407
- v2.0.55 adds the `_pinnedModelByEmail` Map.
408
- **Partly right**: the pin persists across recycles. But the test still ran on Flash.
409
-
410
- ### Round 4 — Hypothesis "AI Studio strips ?model="
411
- Observation: the URL during the task is `/prompts/new_chat` (no query). Guessed that AI Studio
412
- auto-cleans the URL after the prompt is submitted.
413
- **Wrong because**: vg-coder's own code navigates the URL (not AI Studio).
414
-
415
- ### Round 5 (v2.0.57) — REAL ROOT CAUSE ✅
416
- The user observed via noVNC: the tab navigates correctly to Pro → reloads as Flash before
417
- the chat. Grepping the code found `vetgo-auto/scripts/aistudio.google.com/main.js`
418
- hardcoding `VG_DEFAULT_MODEL='gemini-3-flash-preview'`. `startNewChat()` at the start of
419
- every task navigates to `/prompts/new_chat?model=Flash` → overrides the caller's pin.
420
- Fix: `getTargetModel()` resolves the model dynamically from the URL + sessionStorage. **Verified working.**
421
-
422
- ### Round 6 — CI missed a step (real-real fix)
423
-
424
- After v2.0.57 was deployed, the server3 test still failed. Verifying the bundle on the server:
425
- - `grep getTargetModel /usr/local/lib/node_modules/vg-coder-cli/dist/vg-coder-bundle.js` → 0 matches
426
- - v2.0.57 published successfully, but the new code was **not** in the bundle
427
-
428
- **Cause**: `vetgo-auto/scripts/aistudio.google.com/main.js` is deployed via Firebase
429
- RTDB (`ENV/VGCODER`), NOT bundled into the npm package. The extension fetches the script at
430
- runtime from Firebase. The CI `publish.yml` only runs `build:extension`, `build:copy`,
431
- and `build:inject`; it skips `deploy-scripts`. The new code had to be pushed manually by running
432
- `cd vetgo-auto && node deploy-scripts.js` from a local machine.
433
-
434
- **CI fix**: add an `npm run deploy-scripts` step to `publish.yml` (commit
435
- 83186ba). On the next version bump, CI pushes to Firebase automatically.
436
-
437
- ### Round 7 — File upload race (server4 Windows-specific)
438
-
439
- After the bug1 fix worked on all 3 servers for text-only tasks, multimodal was tested:
440
- - server2/3 (Linux native): image + PDF on Pro work ✅
441
- - server4 (Docker Desktop Windows): for both image + PDF, the model replies "no file
442
- uploaded yet" even though the chip shows the file attached in the UI.
443
-
444
- Two user screenshots show:
445
- 1. Chip "feline-xray-chest7.jpg Loading...": the file is still uploading
446
- 2. Chip "feline-xray-chest7.jpg 1,101 tokens": upload finished, but the textarea
447
- is still empty and the worker has not pasted the prompt yet
448
-
449
- DOM inspection found that the chip element is `<ms-prompt-media>` /
450
- `[data-test-id="prompt-media-container"]`, which does NOT match the old selectors
451
- `ms-prompt-chip-file, ms-file-chip, ms-attachment-chip`. The wait loop passes with
452
- `chips=0`, falls through the 30s timeout → Run is submitted before tokenization finishes → AI
453
- Studio silently drops the file.
454
-
455
- **Specific cause**: 2 overlapping problems:
456
- 1. Outdated selectors (AI Studio renamed the element in 2026)
457
- 2. No check that the token count has finalized (the chip appears right after the drop; tokenization
458
- finishes 5-15s later, especially with Windows fs latency)
459
-
460
- **Fix v2.0.58**:
461
- - Update CHIP_SELECTORS to add `ms-prompt-media`, `[data-test-id="prompt-media-container"]`
462
- - Add a wait loop: the chip text must match `/[\d,]+ tokens/` (supports commas) AND
463
- must NOT contain `Calculating|Processing|Uploading|Loading`
464
- - Timeout 60s → proceed anyway
465
-
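The readiness rule from the v2.0.58 fix can be expressed as a pure predicate over chip text, sketched below. The two regular expressions are the ones quoted above; the function names and the `expectedCount` parameter (used so that zero matched chips never counts as ready) are assumptions for illustration, and the DOM polling and 60s timeout are left out.

```javascript
// Patterns quoted in the fix description above.
const TOKEN_COUNT = /[\d,]+ tokens/;
const BUSY = /Calculating|Processing|Uploading|Loading/;

// A chip is ready once it shows a token count and no in-progress label.
function isChipReady(chipText) {
  return TOKEN_COUNT.test(chipText) && !BUSY.test(chipText);
}

// Hypothetical aggregate check: every expected file must have a ready chip.
// Requiring at least expectedCount chips guards against a dead selector that
// matches nothing and would otherwise pass with zero chips.
function allChipsReady(chipTexts, expectedCount) {
  return chipTexts.length >= expectedCount && chipTexts.every(isChipReady);
}
```

A worker would poll the chip elements, pass their text content to `allChipsReady`, and paste the prompt only once it returns `true` or the timeout expires.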
466
- **Result**: the server4 image task went from 105s → 23s after fixing the selectors + regex. All 3
467
- servers run multimodal Pro end-to-end.
468
-
469
- ### Lessons learned
470
-
471
- - **Visual observation (noVNC) > async DOM eval**: the user report "tab reloads before
472
- the chat" + the screenshots of the chip going "Loading..." → "1,101 tokens" with an empty textarea were
473
- the decisive clues. DOM evals at different moments give fragmented data that is hard to
474
- piece together; only real-time visuals show the state transitions.
475
- - **Grep for hardcoded constants before guessing at external behavior**: the first 4 rounds guessed
476
- that AI Studio was doing something (redirect, strip query). Round 5 found the hardcoded
477
- `VG_DEFAULT_MODEL` in our own code.
478
- - **Verify the deployed artifact on the target server**: Round 6 showed that a version bump
479
- does NOT guarantee the new code runs if the deploy pipeline has a gap. After each fix, grep
480
- for the new symbol in the actual production file; do not trust "CI passed = code runs".
481
- - **Selectors change over time (round 7)**: AI Studio's Angular app renames
482
- tags/classes with each version. Match both new + legacy selectors in an array. When
483
- the wait loop passes with count=0 (warning ignored), the selector may be dead.
484
- - **Multi-issue stacking**: Round 7 had 2 overlapping causes (outdated selectors
485
- AND tokenize race). Fixing 1 cause is not enough; verify fully E2E after each fix.
486
- - **Multiple detection layers are valuable**: `actualModel` (DOM scrape) has been the
487
- correct source of truth since v2.0.53: it confirmed the bug was real and defined the
488
- expected behavior for the fix, independent of whichever fix failed.
489
- - **Firebase deploy is not in sync with npm publish**: the AIChat code in
490
- `vetgo-auto/scripts/aistudio.google.com/*.js` is deployed separately via Firebase
491
- RTDB. Bumping the npm package version does NOT push this code (before the v2.0.58 CI fix).
492
- To test a fix quickly: `cd vetgo-auto && node deploy-scripts.js` + restart Chromium;
493
- there is NO need to rebuild the image or update the package.