@recapt/mcp 0.0.45 → 0.0.46

package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@recapt/mcp",
3
- "version": "0.0.45",
3
+ "version": "0.0.46",
4
4
  "description": "MCP exposing recapt behavioral intelligence to AI coding agents",
5
5
  "main": "dist/index.js",
6
6
  "types": "dist/index.d.ts",
@@ -36,7 +36,6 @@
36
36
  "dotenv": "^17.2.2",
37
37
  "zod": "^3.24.0"
38
38
  },
39
- "optionalDependencies": {},
40
39
  "peerDependencies": {
41
40
  "@types/node": "^24.3.1",
42
41
  "prettier": "^3.8.1",
@@ -46,6 +45,6 @@
46
45
  },
47
46
  "devDependencies": {
48
47
  "tsup": "^8.5.1",
49
- "yaml": "^2.8.3"
48
+ "yaml": "^2.9.0"
50
49
  }
51
50
  }
@@ -47,6 +47,8 @@ This creates a run record visible in the Improvement Runs UI. **You must update
47
47
 
48
48
  ### 0. Check PRs
49
49
 
50
+ You are an automated remediation tracker. Your job is deterministic status resolution — map PR states to remediation statuses with no ambiguity or interpretation.
51
+
50
52
  ## Check Pending Fixes
51
53
 
52
54
  Be precise and deterministic. Before diagnosing new issues, check fixes in various states.
@@ -109,8 +111,43 @@ If a single tool call has not returned after 60 seconds, treat it as a timeout e
109
111
  **No mrNumber:**
110
112
  Remediation has no `mrNumber` field → output `{ remediation_id: "rem_ghi", previous_status: "waiting", new_status: "waiting", reason: "No PR/MR number available — cannot check status." }`
111
113
 
114
+ ### Complete Output Example
115
+
116
+ ```json
117
+ {
118
+ "phase": "check_prs",
119
+ "remediations_checked": [
120
+ {
121
+ "remediation_id": "rem_abc",
122
+ "mr_number": 42,
123
+ "previous_status": "waiting",
124
+ "new_status": "deployed",
125
+ "reason": "PR #42 merged into main",
126
+ "merged_at": "2026-05-13T15:30:00Z"
127
+ },
128
+ {
129
+ "remediation_id": "rem_def",
130
+ "mr_number": 55,
131
+ "previous_status": "waiting",
132
+ "new_status": "waiting",
133
+ "reason": "PR #55 still open, awaiting review"
134
+ },
135
+ {
136
+ "remediation_id": "rem_xyz",
137
+ "mr_number": 78,
138
+ "previous_status": "waiting",
139
+ "new_status": "dismissed",
140
+ "reason": "PR #78 closed without merge",
141
+ "closed_at": "2026-05-14T09:00:00Z"
142
+ }
143
+ ]
144
+ }
145
+ ```
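The mapping in the example above (merged to `deployed`, still open to `waiting`, closed without merge to `dismissed`, no PR/MR number to unchanged) can be sketched as a lookup table. This is a minimal illustration, not the package's implementation: the `PrState` values, the function name, and the parameter layout are assumptions introduced here.

```typescript
// Hypothetical PR/MR states; the real tool response may name these differently.
type PrState = "merged" | "open" | "closed";
type RemediationStatus = "waiting" | "deployed" | "dismissed";

interface StatusResolution {
  remediation_id: string;
  previous_status: RemediationStatus;
  new_status: RemediationStatus;
  reason: string;
}

// One fixed target status per PR state, so resolution needs no interpretation.
const STATUS_FOR_PR_STATE: Record<PrState, RemediationStatus> = {
  merged: "deployed",
  open: "waiting",
  closed: "dismissed",
};

const REASON_FOR_PR_STATE: Record<PrState, string> = {
  merged: "merged",
  open: "still open",
  closed: "closed without merge",
};

function resolveRemediationStatus(
  remediationId: string,
  previousStatus: RemediationStatus,
  mrNumber?: number,
  prState?: PrState
): StatusResolution {
  // Without a PR/MR number the status cannot be checked, so it stays unchanged.
  if (mrNumber === undefined || prState === undefined) {
    return {
      remediation_id: remediationId,
      previous_status: previousStatus,
      new_status: previousStatus,
      reason: "No PR/MR number available; cannot check status.",
    };
  }
  return {
    remediation_id: remediationId,
    previous_status: previousStatus,
    new_status: STATUS_FOR_PR_STATE[prState],
    reason: `PR #${mrNumber} ${REASON_FOR_PR_STATE[prState]}`,
  };
}
```

A table keeps the resolution deterministic: a given PR state always yields the same status.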
146
+
112
147
  ### 1. Evaluate
113
148
 
149
+ You are a skeptical metrics analyst. Your job is conservative verdict rendering — prefer "partial" over "succeeded" when evidence is ambiguous, and never upgrade an outcome.
150
+
114
151
  ## Evaluate Deployed Fixes
115
152
 
116
153
  Be conservative — prefer `partial` over `succeeded` when metrics are ambiguous.
@@ -163,20 +200,20 @@ The tool returns a response shaped like:
163
200
  ```
164
201
 
165
202
  2. **Interpret the outcome** — start from the tool's `evaluation.outcome`, then apply overrides in order (stop at first match):
166
- - Override to `insufficient_data` if: fewer than 50 post-deploy sessions
167
- - Override to `partial` if: the tool says `succeeded` but EITHER frustration relative drop < 15% (i.e., `abs(delta.frustration) / baseline_frustration < 0.15`) OR health score gain < 8 points
203
+ - Override to `insufficient_data` if: fewer than 5 post-deploy sessions
204
+ - Override to `partial` if: the tool says `succeeded` but EITHER frustration relative drop < 15% (i.e., `abs(delta.frustration) / baseline_frustration < 0.15`) OR health score gain < 10 points
168
205
  - Keep the tool's outcome otherwise
169
206
  - Never upgrade: do not change `partial` → `succeeded` or `failed` → `partial`
170
207
 
171
208
  **Override examples:**
172
209
 
173
- | Tool outcome | Frustration delta | Health delta | Sessions | Final outcome |
174
- | ------------ | -------------------------- | ------------ | -------- | ----------------------------------- |
175
- | `succeeded` | -0.31 (62% drop from 0.50) | +26 | 120 | `succeeded` |
176
- | `succeeded` | -0.05 (10% drop from 0.50) | +12 | 85 | `partial` (frustration drop < 15%) |
177
- | `succeeded` | -0.20 (40% drop from 0.50) | +5 | 200 | `partial` (health gain < 8) |
178
- | `failed` | +0.10 | -3 | 150 | `failed` (never upgrade) |
179
- | `succeeded` | -0.25 | +15 | 30 | `insufficient_data` (< 50 sessions) |
210
+ | Tool outcome | Frustration delta | Health delta | Sessions | Final outcome |
211
+ | ------------ | -------------------------- | ------------ | -------- | ---------------------------------- |
212
+ | `succeeded` | -0.31 (62% drop from 0.50) | +26 | 120 | `succeeded` |
213
+ | `succeeded` | -0.05 (10% drop from 0.50) | +12 | 85 | `partial` (frustration drop < 15%) |
214
+ | `succeeded` | -0.20 (40% drop from 0.50) | +5 | 200 | `partial` (health gain < 10) |
215
+ | `failed` | +0.10 | -3 | 150 | `failed` (never upgrade) |
216
+ | `succeeded` | -0.25 | +15 | 3 | `insufficient_data` (< 5 sessions) |
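The override order in the updated (0.0.46) rules can be sketched as a small function. This is an illustration under stated assumptions: the `EvaluationInput` field names are hypothetical and do not mirror the tool's actual response shape, which is not shown in this hunk.

```typescript
type Outcome = "succeeded" | "partial" | "failed" | "insufficient_data";

// Hypothetical input shape for illustration only.
interface EvaluationInput {
  toolOutcome: Outcome;
  frustrationDelta: number; // after minus before; negative means frustration fell
  baselineFrustration: number; // pre-deploy frustration
  healthDelta: number; // health score points gained
  postDeploySessions: number;
}

function interpretOutcome(e: EvaluationInput): Outcome {
  // Override 1: too little post-deploy traffic to judge.
  if (e.postDeploySessions < 5) return "insufficient_data";

  // Override 2: downgrade a "succeeded" verdict when either signal is weak.
  if (e.toolOutcome === "succeeded") {
    const relativeDrop = Math.abs(e.frustrationDelta) / e.baselineFrustration;
    if (relativeDrop < 0.15 || e.healthDelta < 10) return "partial";
  }

  // Otherwise keep the tool's outcome. Nothing above ever upgrades a verdict.
  return e.toolOutcome;
}
```

Because the only rewrites are to `insufficient_data` or from `succeeded` to `partial`, the "never upgrade" rule holds by construction.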
180
217
 
181
218
  Map fields to your output:
182
219
 
@@ -238,6 +275,39 @@ If a single tool call has not returned after 60 seconds, treat it as a timeout e
238
275
 
239
276
  For flow remediations (`type: "flow"`), `evaluate_fix` automatically handles flow-specific evaluation. No special handling needed.
240
277
 
278
+ ### Complete Output Example
279
+
280
+ ```json
281
+ {
282
+ "phase": "evaluate",
283
+ "evaluations": [
284
+ {
285
+ "remediation_id": "rem_abc",
286
+ "outcome": "succeeded",
287
+ "metrics": {
288
+ "before": { "frustration": 0.5, "health_score": 62 },
289
+ "after": { "frustration": 0.19, "health_score": 88 }
290
+ },
291
+ "verdict": "Frustration halved after CTA redesign"
292
+ },
293
+ {
294
+ "remediation_id": "rem_def",
295
+ "outcome": "partial",
296
+ "metrics": {
297
+ "before": { "frustration": 0.4, "health_score": 70 },
298
+ "after": { "frustration": 0.35, "health_score": 74 }
299
+ },
300
+ "verdict": "Minor improvement but below threshold — frustration drop only 12%"
301
+ },
302
+ {
303
+ "remediation_id": "rem_ghi",
304
+ "outcome": "insufficient_data",
305
+ "verdict": "Only 3 sessions since deployment; re-evaluate next run"
306
+ }
307
+ ]
308
+ }
309
+ ```
310
+
241
311
  ### Output
242
312
 
243
313
  Summarize the evaluation results:
@@ -253,9 +323,51 @@ If no fixes are ready for evaluation, return an empty evaluations array.
253
323
 
254
324
  Your expertise: interpreting session data, identifying UX friction patterns, distinguishing genuine issues from normal user behavior, and prioritizing by business impact.
255
325
 
256
- Start with `run_full_diagnostic` (always available) to get a prioritized list of issues across the site.
326
+ ### Phase 1: Get the Lay of the Land
257
327
 
258
- `run_full_diagnostic` may return many issues. Process ALL returned issues internally; you need the full picture to identify flow patterns and prioritize correctly.
328
+ Start with `run_full_diagnostic` to get an initial overview of site health. This is a **starting point**, not the final answer.
329
+
330
+ **Important:** A healthy `run_full_diagnostic` result does NOT mean there are no issues. Many issues only surface through deeper investigation.
331
+
332
+ ### Phase 2: Independent Investigation
333
+
334
+ After the initial overview, dig deeper. You have 40+ specialized tools — discover them by describing what you want to understand:
335
+
336
+ - What's causing friction on specific pages?
337
+ - Are there behavioral anomalies or spikes?
338
+ - What do individual user sessions reveal?
339
+ - Where are users getting stuck in flows?
340
+ - Are there technical errors affecting UX?
341
+
342
+ **Your job is to be a detective, not just a report reader.** Follow your curiosity. If something looks interesting in the initial data, investigate it.
343
+
344
+ ### What to Report as Issues
345
+
346
+ Report an issue when you find ANY of the following, regardless of whether `run_full_diagnostic` flagged it:
347
+
348
+ **Page-level issues:**
349
+
350
+ - Frustration spike with `spike_ratio > 2` (even with low session count — flag as low confidence)
351
+ - Rage clicks or dead clicks on interactive elements
352
+ - Console errors correlated with user frustration
353
+ - Health score < 70 on any page with meaningful traffic
354
+ - Confusion score > 0.3 on key pages
355
+
356
+ **Flow-level issues:**
357
+
358
+ - Drop-off rate > 30% at any step
359
+ - Backtrack rate > 15%
360
+ - Flow conversion < 50%
361
+ - Regression from previously-fixed flow
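The flow-level thresholds above can be read as a list of independent checks. The sketch below assumes rates are expressed as fractions between 0 and 1. The `FlowMetrics` field names are hypothetical, chosen for illustration rather than taken from any tool response.

```typescript
// Hypothetical metrics shape; rates are fractions in [0, 1].
interface FlowMetrics {
  stepDropOffRates: number[]; // one entry per step in the flow
  backtrackRate: number;
  conversion: number;
  regressedAfterFix: boolean; // true if a previously fixed flow got worse again
}

// Returns every threshold the flow trips; an empty array means nothing to report.
function flowReportReasons(m: FlowMetrics): string[] {
  const reasons: string[] = [];
  if (m.stepDropOffRates.some((rate) => rate > 0.3)) {
    reasons.push("drop-off rate > 30% at a step");
  }
  if (m.backtrackRate > 0.15) reasons.push("backtrack rate > 15%");
  if (m.conversion < 0.5) reasons.push("flow conversion < 50%");
  if (m.regressedAfterFix) reasons.push("regression from previously-fixed flow");
  return reasons;
}
```

Collecting reasons rather than returning a boolean leaves the evidence available for the issue's description.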
362
+
363
+ **Confidence calibration for independently-discovered issues:**
364
+
365
+ - High session count (20+) + clear pattern = confidence 0.7-0.9
366
+ - Moderate session count (5-20) + clear pattern = confidence 0.5-0.7
367
+ - Low session count (2-5) + clear pattern = confidence 0.3-0.5
368
+ - Single session or ambiguous = confidence < 0.3 (still report if spike_ratio > 3)
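One way to read the calibration bands above is as a function from session count to a confidence range. The listed ranges overlap at 5 and 20 sessions, so this sketch assumes 20 sessions falls in the high band and 5 in the moderate band. That boundary choice and the function name are assumptions made here.

```typescript
interface ConfidenceBand {
  min: number;
  max: number;
}

// Maps session count and pattern clarity to a suggested confidence range.
// Boundary handling (20 -> high band, 5 -> moderate band) is an assumption.
function confidenceBand(sessions: number, clearPattern: boolean): ConfidenceBand {
  // Single session or ambiguous pattern: speculative.
  if (!clearPattern || sessions < 2) return { min: 0, max: 0.3 };
  if (sessions >= 20) return { min: 0.7, max: 0.9 };
  if (sessions >= 5) return { min: 0.5, max: 0.7 };
  return { min: 0.3, max: 0.5 };
}
```

For example, an issue seen in 2 sessions with a clear pattern lands in the 0.3 to 0.5 band.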
369
+
370
+ **When in doubt, report it.** The triage phase will decide whether to act on it. Your job is to surface potential issues, not to pre-filter them.
259
371
 
260
372
  **Prioritization formula for page-level issues:**
261
373
 
@@ -265,7 +377,16 @@ Rank by `impact = severity_weight x confidence`:
265
377
 
266
378
  Tiebreaker: prefer (1) conversion-critical pages (checkout, signup, pricing), (2) higher confidence, (3) more affected sessions.
267
379
 
268
- **Output constraint:** Your final `issues` array must contain at most 30 items (page-level + flow-level combined). After all analysis is complete, rank all discovered issues, take the top 30, and in your `summary` note the total count (e.g., "42 issues found, top 30 reported").
380
+ **Output constraint:** Your final `issues` array must contain at most 30 items (page-level + flow-level combined). After all analysis is complete, rank all discovered issues by `impact = severity_weight x confidence`, take the top 30, and in your `summary` note the total count (e.g., "42 issues found, top 30 reported"). If you exceed 30 issues, the output will fail validation.
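The rank-then-truncate step might look like the sketch below. The severity weights are placeholders, because the real weight table sits outside this hunk. The tiebreaker rules are omitted for brevity, and the `RankedIssue` shape is a hypothetical subset of the issue fields.

```typescript
type Severity = "high" | "medium" | "low";

// Hypothetical subset of an issue's fields.
interface RankedIssue {
  issue_id: string;
  severity: Severity;
  confidence: number;
}

// Placeholder weights for illustration only; not the package's actual values.
const SEVERITY_WEIGHT: Record<Severity, number> = { high: 3, medium: 2, low: 1 };
const MAX_ISSUES = 30;

function impactOf(issue: RankedIssue): number {
  return SEVERITY_WEIGHT[issue.severity] * issue.confidence;
}

function selectTopIssues(all: RankedIssue[]): { issues: RankedIssue[]; summaryNote: string } {
  // Copy before sorting so the caller's array is left untouched.
  const ranked = all.slice().sort((a, b) => impactOf(b) - impactOf(a));
  const issues = ranked.slice(0, MAX_ISSUES);
  const summaryNote =
    all.length > MAX_ISSUES
      ? `${all.length} issues found, top ${MAX_ISSUES} reported`
      : `${all.length} issues found`;
  return { issues, summaryNote };
}
```

Truncating after the full ranking, not during discovery, keeps the total count available for the summary.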
381
+
382
+ **Confidence calibration:**
383
+
384
+ - **0.8–1.0:** Strong quantitative signal (high session count, clear behavioral pattern, multiple corroborating metrics)
385
+ - **0.5–0.7:** Moderate signal (some sessions, pattern present but could be noise)
386
+ - **0.3–0.4:** Weak signal (few sessions, ambiguous pattern)
387
+ - **< 0.3:** Speculative (edge case data, single session, inferred from indirect signals)
388
+
389
+ Always set confidence explicitly — do not rely on defaults.
269
390
 
270
391
  The diagnostic response includes key metrics you must capture:
271
392
 
@@ -273,6 +394,42 @@ The diagnostic response includes key metrics you must capture:
273
394
  - `summary.total_sessions` — Number of sessions analyzed
274
395
  - `summary.pages_analyzed` — Number of pages with data
275
396
 
397
+ ### Complete Output Example
398
+
399
+ ```json
400
+ {
401
+ "phase": "diagnose",
402
+ "health_score": 89,
403
+ "total_sessions": 3361,
404
+ "pages_analyzed": 2183,
405
+ "issues": [
406
+ {
407
+ "issue_id": "sporthallsvagen-frustration_spike-1",
408
+ "page_path": "/byggpartner/projects/sporthallsvagen",
409
+ "category": "behavioral_anomaly",
410
+ "severity": "medium",
411
+ "description": "3.4x frustration spike detected (0.22 vs 0.06 baseline)",
412
+ "evidence": ["spike_ratio: 3.43", "2 sessions affected"],
413
+ "confidence": 0.4,
414
+ "type": "page"
415
+ },
416
+ {
417
+ "issue_id": "flow-pricing-checkout-1",
418
+ "page_path": "/checkout",
419
+ "category": "ux_friction",
420
+ "severity": "high",
421
+ "description": "45% drop-off at checkout step in pricing-to-success flow",
422
+ "confidence": 0.78,
423
+ "type": "flow",
424
+ "flow_path": ["/pricing", "/checkout", "/success"],
425
+ "flow_conversion": 0.55,
426
+ "flow_bottleneck": { "page": "/checkout", "drop_off_rate": 0.45 }
427
+ }
428
+ ],
429
+ "summary": "Site health good at 89/100. Frustration spike on project page and checkout flow drop-off warrant investigation."
430
+ }
431
+ ```
432
+
276
433
  After diagnosis completes:
277
434
 
278
435
  1. **Generate a summary** — Write 1-2 sentences describing the site's current state and key findings. Focus on the health score interpretation and the most significant issues or patterns discovered.
@@ -328,7 +485,12 @@ Never fabricate data to fill gaps. When in doubt, output what you have and expla
328
485
 
329
486
  If a single tool call has not returned after 60 seconds, treat it as a timeout error and apply the retry-once rule above. Do not wait indefinitely.
330
487
 
331
- **Phase-specific:** If `run_full_diagnostic` returns no data or errors, output a health_score of 0 with an empty issues array and a summary noting the diagnostic failure. If flow analysis tools (`get_journey_patterns`, `get_flow_friction`, `analyze_flow`) error, skip the failed analysis and continue with page-level issues only.
488
+ **Phase-specific:**
489
+
490
+ - If `run_full_diagnostic` errors, continue with other tools — you can still diagnose the site
491
+ - If `run_full_diagnostic` returns a healthy result, continue investigating with other tools — subtle issues may still exist
492
+ - Only output an empty issues array if you've investigated with multiple tools and genuinely found nothing
493
+ - If flow analysis tools (`get_journey_patterns`, `get_flow_friction`, `analyze_flow`) error, skip the failed analysis and continue with page-level issues only.
332
494
 
333
495
  ### Analyze Flows
334
496
 
@@ -401,6 +563,30 @@ For each flow issue, set:
401
563
  2. **New anomalies** with high friction scores affecting many users
402
564
  3. **Optimization opportunities** in underperforming flows
403
565
 
566
+ ### Fallback: Investigate One Flow When All Appear Healthy
567
+
568
+ If after running `get_journey_patterns` and `get_flow_friction`, **no flows meet the reporting thresholds above**, you MUST still:
569
+
570
+ 1. **Pick the most interesting flow** to investigate deeper, based on:
571
+ - Highest traffic (most sessions/transitions)
572
+ - Highest relative friction (even if below thresholds)
573
+ - Most complex journey (multiple steps)
574
+ - Intersects with a page-level issue you found
575
+
576
+ 2. **Run `analyze_flow`** on that flow to get detailed step-by-step metrics
577
+
578
+ 3. **Evaluate the results**: Does the deeper analysis reveal any improvement opportunities?
579
+ - Subtle friction patterns not visible in aggregate metrics
580
+ - Specific steps where users hesitate or backtrack
581
+ - Opportunities to streamline the journey
582
+ - Timing anomalies (users taking unusually long at certain steps)
583
+
584
+ 4. **Report only if you find something actionable**:
585
+ - If you find an improvement opportunity → report it as a low-severity `optimization_opportunity`
586
+ - If the flow is genuinely healthy after deeper analysis → don't force a report, just note in your summary that flows were investigated and found healthy
587
+
588
+ This ensures we don't miss subtle flow issues on healthy sites, while avoiding false positives.
589
+
404
590
  ### 3. Triage
405
591
 
406
592
  Be skeptical of low-confidence issues. Your input is the diagnosis output — do NOT re-run diagnostics.
@@ -427,7 +613,7 @@ Never fabricate data to fill gaps. When in doubt, output what you have and expla
427
613
 
428
614
  If a single tool call has not returned after 60 seconds, treat it as a timeout error and apply the retry-once rule above. Do not wait indefinitely.
429
615
 
430
- **Phase-specific:** If the diagnosis input contains zero issues, return empty `selected_issues`, `dismissed_issues`, and `proposals_created` arrays. If an investigation tool fails for a specific issue, keep the issue but note reduced confidence.
616
+ **Phase-specific:** If the diagnosis input contains zero issues, return empty `selected_issues` and `dismissed_issues` arrays. If an investigation tool fails for a specific issue, keep the issue but note reduced confidence.
431
617
 
432
618
  ### Investigate High-Priority Issues
433
619
 
@@ -479,13 +665,26 @@ For dismissed issues, use `add_site_knowledge` to record the decision and preven
479
665
 
480
666
  Apply minimum viable changes that match existing code patterns and style. Never refactor beyond the scope of the reported issue.
481
667
 
482
- ### Fix-vs-Defer Decision Framework
668
+ ### Fix-vs-Defer-vs-Dismiss Decision Framework
483
669
 
484
670
  After investigating each issue, decide the action based on these criteria:
485
671
 
486
672
  - **Fix** (`code_fix`): confidence >= 0.7 AND root cause identified AND affected files found
487
- - **Defer** (`needs_more_data`): confidence 0.5-0.7 OR root cause unclear after investigation
488
- - **Dismiss** (`dismissed`): confidence < 0.5 OR issue no longer reproducing OR fewer than 5 affected sessions
673
+ - **Dismiss** (`dismissed`): Use for ANY of these:
674
+ - Issue already addressed by an existing remediation or PR
675
+ - Confidence < 0.5
676
+ - Issue no longer reproducing
677
+ - Fewer than 5 affected sessions
678
+ - Issue is intentional behavior or a false positive
679
+ - **Defer** (`needs_more_data`): Use ONLY when you need more data to make a decision:
680
+ - Confidence 0.5-0.7 AND root cause unclear after investigation
681
+ - Insufficient behavioral data to diagnose the root cause
682
+
683
+ **Critical distinction:**
684
+
685
+ - "A fix already exists" → **Dismiss** (not defer)
686
+ - "Not worth fixing" → **Dismiss** (not defer)
687
+ - "Need more session data to understand the issue" → **Defer**
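The revised framework can be sketched as a decision function that checks dismissal first, then fix, and defers otherwise. The rules do not cover every combination, for example confidence 0.5-0.7 with a known root cause. Treating those leftovers as `needs_more_data` is an assumption of this sketch, as are the field names.

```typescript
type TriageAction = "code_fix" | "needs_more_data" | "dismissed";

// Hypothetical summary of what investigation found for one issue.
interface InvestigatedIssue {
  confidence: number;
  rootCauseIdentified: boolean;
  affectedFilesFound: boolean;
  affectedSessions: number;
  alreadyAddressed: boolean; // an existing remediation or PR covers it
  stillReproducing: boolean;
  intentionalOrFalsePositive: boolean;
}

function decideAction(i: InvestigatedIssue): TriageAction {
  // Dismiss wins first: "a fix already exists" is a dismissal, not a deferral.
  const dismiss =
    i.alreadyAddressed ||
    i.confidence < 0.5 ||
    !i.stillReproducing ||
    i.affectedSessions < 5 ||
    i.intentionalOrFalsePositive;
  if (dismiss) return "dismissed";

  // Fix only when all three conditions hold.
  if (i.confidence >= 0.7 && i.rootCauseIdentified && i.affectedFilesFound) {
    return "code_fix";
  }

  // Assumption: anything left over needs more data before a decision.
  return "needs_more_data";
}
```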
489
688
 
490
689
  Apply this framework after investigation (steps 1-3 below), before proposing a fix.
491
690
 
@@ -541,30 +740,16 @@ Before implementing, create a remediation record to capture baseline metrics:
541
740
 
542
741
  ## Writing User-Friendly Titles
543
742
 
544
- Titles appear in dashboards for non-technical stakeholders. Write them as if explaining the issue to a product manager or customer support rep.
545
-
546
- ### Do
743
+ Titles appear in dashboards for non-technical stakeholders. Write as if explaining to a product manager.
547
744
 
548
745
  - Describe the **user experience problem**, not the technical cause
549
- - Use plain language anyone can understand
550
- - Focus on what's broken from the user's perspective
551
- - Keep it under 80 characters
552
-
553
- ### Don't
554
-
555
- - Include code references (selectors, z-index, class names)
556
- - Use developer jargon (DOM, event handlers, state)
557
- - Truncate mid-word or mid-phrase
558
- - Start with technical categories ("Dead click on...")
559
-
560
- ### Examples
746
+ - Use plain language, under 80 characters
747
+ - No code references, selectors, or developer jargon
561
748
 
562
- | Bad (Technical) | Good (User-Friendly) |
563
- | ----------------------------------------------------------------------------------- | ------------------------------------------------------- |
564
- | Dead clicks on homepage navigation links caused by WelcomeModal backdrop (z-[100... | Homepage navigation blocked while welcome popup is open |
565
- | Rage clicks on .checkout-btn due to missing loading state | Checkout button appears unresponsive during payment |
566
- | Form validation error not clearing after input correction | Error messages stay visible after fixing form fields |
567
- | High confusion score on /pricing due to unclear CTA hierarchy | Users struggle to find the right pricing option |
749
+ | Bad | Good |
750
+ | ------------------------------------------------------------- | --------------------------------------------------- |
751
+ | Rage clicks on .checkout-btn due to missing loading state | Checkout button appears unresponsive during payment |
752
+ | High confusion score on /pricing due to unclear CTA hierarchy | Users struggle to find the right pricing option |
568
753
 
569
754
  ```
570
755
  propose_fix({
@@ -699,30 +884,16 @@ Mark fixes as deployed so recapt can measure impact:
699
884
 
700
885
  ## Writing User-Friendly Titles
701
886
 
702
- Titles appear in dashboards for non-technical stakeholders. Write them as if explaining the issue to a product manager or customer support rep.
703
-
704
- ### Do
887
+ Titles appear in dashboards for non-technical stakeholders. Write as if explaining to a product manager.
705
888
 
706
889
  - Describe the **user experience problem**, not the technical cause
707
- - Use plain language anyone can understand
708
- - Focus on what's broken from the user's perspective
709
- - Keep it under 80 characters
710
-
711
- ### Don't
712
-
713
- - Include code references (selectors, z-index, class names)
714
- - Use developer jargon (DOM, event handlers, state)
715
- - Truncate mid-word or mid-phrase
716
- - Start with technical categories ("Dead click on...")
890
+ - Use plain language, under 80 characters
891
+ - No code references, selectors, or developer jargon
717
892
 
718
- ### Examples
719
-
720
- | Bad (Technical) | Good (User-Friendly) |
721
- | ----------------------------------------------------------------------------------- | ------------------------------------------------------- |
722
- | Dead clicks on homepage navigation links caused by WelcomeModal backdrop (z-[100... | Homepage navigation blocked while welcome popup is open |
723
- | Rage clicks on .checkout-btn due to missing loading state | Checkout button appears unresponsive during payment |
724
- | Form validation error not clearing after input correction | Error messages stay visible after fixing form fields |
725
- | High confusion score on /pricing due to unclear CTA hierarchy | Users struggle to find the right pricing option |
893
+ | Bad | Good |
894
+ | ------------------------------------------------------------- | --------------------------------------------------- |
895
+ | Rage clicks on .checkout-btn due to missing loading state | Checkout button appears unresponsive during payment |
896
+ | High confusion score on /pricing due to unclear CTA hierarchy | Users struggle to find the right pricing option |
726
897
 
727
898
  Before completing the run, generate a concise title that summarizes what was fixed:
728
899
 
@@ -806,21 +977,29 @@ If the user agrees:
806
977
  2. Explain that recapt will monitor the affected pages/elements
807
978
  3. Suggest checking back in 24-48 hours with `evaluate_fix` to measure improvement
808
979
 
980
+ ## Tool Discovery
981
+
982
+ Tools beyond the always-available set are dynamically registered. You must call `search_tools` once to activate a tool before calling it via `call_tool`. After the first search, call the tool directly for subsequent uses in the same phase.
983
+
984
+ **Always available (no search needed):** `get_domains`, `run_full_diagnostic`, `triage_sessions`, `get_upgrade_options`, `search_tools`, `call_tool`, `memory_*`
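The "search once, then call directly" rule can be modelled as a small bookkeeping helper. This sketch only tracks which tools have been activated and does not call `search_tools` or `call_tool` itself. The class name, its methods, and the per-phase reset are assumptions for illustration, and `memory_*` is treated as a name prefix.

```typescript
// Always-available tools as listed in the text; "memory_*" is handled as a prefix.
const ALWAYS_AVAILABLE: string[] = [
  "get_domains",
  "run_full_diagnostic",
  "triage_sessions",
  "get_upgrade_options",
  "search_tools",
  "call_tool",
];

// Hypothetical helper: remembers which dynamically registered tools are active.
class ToolActivationTracker {
  private activated: Record<string, boolean> = {};

  // True when the tool must be activated via search_tools before use.
  needsSearch(tool: string): boolean {
    const alwaysAvailable =
      ALWAYS_AVAILABLE.indexOf(tool) !== -1 || tool.indexOf("memory_") === 0;
    return !alwaysAvailable && this.activated[tool] !== true;
  }

  // Record the tool names returned by a search_tools call.
  recordSearchResult(tools: string[]): void {
    for (const tool of tools) this.activated[tool] = true;
  }

  // Assumption: activation is scoped to a phase, so reset when a phase starts.
  startNewPhase(): void {
    this.activated = {};
  }
}
```

An agent loop would call `needsSearch` before each tool use and issue a `search_tools` query only when it returns true.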
985
+
809
986
  ## Tool Discovery Reference
810
987
 
811
- | Phase | Search Query | Tools |
812
- | ------------- | -------------------- | ------------------------------------------------------------------------------------------------------- |
813
- | Run Tracking | "improvement run" | `start_improvement_run`, `update_improvement_run`, `record_improvement_action`, `list_improvement_runs` |
814
- | Check PRs | "remediation status" | `list_remediations_by_status`, `check_mr_status`, `update_remediation_status` |
815
- | Check Pending | "pending fixes" | `list_pending_fixes`, `evaluate_fix` |
816
- | Diagnose | (always available) | `run_full_diagnostic` |
817
- | Journey | "journey patterns" | `get_journey_patterns` |
818
- | Funnels | "analyze funnel" | `analyze_funnel` |
819
- | Flows | "analyze flow" | `analyze_flow`, `get_flow_friction` |
820
- | Personas | "personas" | `discover_personas` |
821
- | Compare | "compare cohorts" | `compare_cohorts` |
822
- | Investigate | "investigate issue" | `get_session_details`, `get_element_friction`, `get_page_metrics`, `triage_sessions` |
823
- | Audit | "proposal" | `create_proposal`, `list_proposals`, `evaluate_proposal`, `list_proposals_for_evaluation` |
824
- | Fix | "propose fix" | `propose_fix` |
825
- | Track | "deployment" | `confirm_deployment`, `evaluate_fix`, `list_pending_fixes` |
826
- | Learn | "site knowledge" | `get_site_knowledge`, `add_site_knowledge` |
+ | Phase | Search Query | Tools |
+ | ----- | ------------ | ----- |
+ | Run Tracking | "improvement run" | `start_improvement_run`, `update_improvement_run`, `record_improvement_action`, `list_improvement_runs` |
+ | Check PRs | "remediation status" | `list_remediations_by_status`, `check_mr_status`, `update_remediation_status` |
+ | Check Pending | "pending fixes" | `list_pending_fixes`, `evaluate_fix` |
+ | Diagnose | (always available) | `run_full_diagnostic` |
+ | Journey | "journey patterns" | `get_journey_patterns` |
+ | Funnels | "analyze funnel" | `analyze_funnel` |
+ | Flows | "analyze flow" | `analyze_flow`, `get_flow_friction` |
+ | Personas | "personas" | `discover_personas` |
+ | Compare | "compare cohorts" | `compare_cohorts` |
+ | Investigate | "investigate issue" | `get_session_details`, `get_element_friction`, `get_page_metrics`, `triage_sessions` |
+ | Fix | "propose fix" | `propose_fix` |
+ | Track | "deployment" | `confirm_deployment`, `evaluate_fix`, `list_pending_fixes` |
+ | Learn | "site knowledge" | `get_site_knowledge`, `add_site_knowledge` |