@ainyc/canonry 4.18.1 → 4.19.0
- package/assets/agent-workspace/skills/canonry-setup/SKILL.md +1 -0
- package/assets/agent-workspace/skills/canonry-setup/references/server-side-traffic.md +167 -0
- package/dist/{chunk-7VDM3JBI.js → chunk-OHPZXTFC.js} +42 -4
- package/dist/cli.js +1 -1
- package/dist/index.js +1 -1
- package/package.json +6 -6
@@ -88,6 +88,7 @@ GA4 is a first-class signal alongside citation tracking. Connect once with `cano
 | `references/aeo-analysis.md` | Interpreting sweep output, diagnosing regressions, planning content fixes |
 | `references/indexing.md` | Submitting URLs, checking GSC/Bing coverage, fixing indexing gaps |
 | `references/wordpress-integration.md` | Connecting to WordPress, editing pages, pushing staging → live |
+| `references/server-side-traffic.md` | Wiring server-log evidence (Cloud Run today; WordPress / others later) for AI Visibility — Server-Side. Connect, sync, manage sources, troubleshoot. |
 
 ---
 
@@ -0,0 +1,167 @@
+# Server-side traffic (AI Visibility — Server-Side)
+
+Server-side traffic ingestion captures **what AI engines actually do in
+your server logs** — bots crawling pages, AI products sending
+click-through arrivals — in addition to the citation data that measures
+**what models say** about you. The two surfaces are independent.
+
+## When to use it
+
+Reach for server-side traffic when an analyst or operator asks:
+
+- *"Is GPTBot / ClaudeBot / PerplexityBot actually fetching my pages?"*
+- *"Which paths are AI engines paying attention to?"*
+- *"Are users clicking through from chatgpt.com / claude.ai / etc.?"*
+- *"My citation rate is fine but there's no traffic — why?"*
+
+GA4 referrals (chatgpt.com → your site) catch click-throughs after they
+land. Server logs catch the upstream bot activity AND referrals at the
+edge — including arrivals GA4 missed because of cookie consent, ad
+blockers, or analytics gaps.
+
+## Architecture
+
+Three tables, populated from server-log adapters:
+
+| Table | What's in it |
+|---|---|
+| `crawler_events_hourly` | One row per `(project, source, hour, bot, verification, path, status)` — bot crawls rolled up by hour |
+| `ai_referral_events_hourly` | One row per `(project, source, hour, product, source_domain, evidence_type, landing_path, status)` — click-through arrivals rolled up by hour |
+| `raw_event_samples` | Bounded forensic samples (≤100 per sync) for spot-checking |
+
+Each `traffic_sources` row is one server-log integration for a project.
+Today's only adapter is `cloud-run`; future adapters slot in by
+implementing the same contract.
+
+## Connecting a Cloud Run source
+
+```bash
+# 1. Create a service account in the Cloud project that hosts the Cloud Run
+# service. Grant it `roles/logging.viewer`. Download the JSON key.
+
+# 2. Connect from canonry CLI:
+canonry traffic connect cloud-run <project> \
+  --gcp-project <gcp-project-id> \
+  --service-account-key <path/to/key.json>
+
+# 3. (Optional) narrow to a specific service or location:
+canonry traffic connect cloud-run <project> \
+  --gcp-project <id> \
+  --service-account-key <path> \
+  --service my-service-name \
+  --location us-east1
+```
+
+Credentials are stored in `~/.canonry/config.yaml` (not the DB). The
+canonical key lives only on the host that runs `canonry serve`. The
+sync flow does NOT echo the private key back in any response.
+
+## Syncing data
+
+```bash
+# Manual sync — defaults to a 30-day lookback on the first run; subsequent
+# runs are clamped forward to lastSyncedAt to avoid re-pulling.
+canonry traffic sync <project> --source <id>
+
+# Override the lookback window (minutes):
+canonry traffic sync <project> --source <id> --since-minutes 4320 # 3 days
+```
+
+Cross-sync dedupe via the `last_event_ids` ring buffer means re-running a
+sync over an overlapping window cannot double-count rolled-up hourly
+hits. Safe to schedule (see "Scheduling" below) or trigger from CI.
+
+## Inspecting source state
+
+```bash
+# All sources with last-24h totals + latest sync run (single-call):
+canonry traffic status <project> --format json
+
+# Just the source list:
+canonry traffic sources <project> --format json
+
+# Windowed events (defaults to last 24h):
+canonry traffic events <project> --kind crawler --limit 200 --format json
+canonry traffic events <project> --kind ai-referral --since 2026-04-01 --until 2026-04-30
+```
+
+The `traffic status` composite returns the same per-source detail
+(24h crawler hits, AI-referral arrivals, raw-event-sample count, latest
+sync-run summary) whether you reach it via the CLI, the API, or the
+MCP `canonry_traffic_status` tool.
+
+## Where the data shows up
+
+| Surface | What's rendered |
+|---|---|
+| Project dashboard `/projects/:name/activity` | Live source table + 24h totals + GA4 referrals (combined view) |
+| Top-level `/traffic` route | Cross-project source admin (connect, sync, archive) |
+| `canonry report <project>` (HTML + SPA) | "AI Visibility — Server-Side" section, ranked above Indexing Health |
+| `canonry doctor --project <name>` | `traffic.source.connected`, `recent-data`, `credentials`, `scopes` checks |
+| MCP toolkit `traffic` | Tools: `canonry_traffic_status`, `_sources_list`, `_source_get`, `_events`, `_connect_cloud_run`, `_sync` |
+
+## Doctor signals
+
+The doctor checks are adapter-agnostic. When they fail or warn:
+
+| Check | Code | What to do |
+|---|---|---|
+| `traffic.source.connected` | `traffic.source.none` | No source — `canonry traffic connect cloud-run …` |
+| `traffic.source.connected` | `traffic.source.all-errored` | Re-connect the source. The check's `details.lastError` shows the underlying reason. |
+| `traffic.source.recent-data` | `traffic.recent-data.stale` | Last sync was >7d ago. Run `canonry traffic sync …` or schedule a recurring sync. |
+| `traffic.source.recent-data` | `traffic.recent-data.empty` | Source connected but no data in 30d. Verify config and credentials with `canonry traffic sources <project>`. |
+| `traffic.source.credentials` | `traffic.credentials.resolve-failed` | Service-account key in `~/.canonry/config.yaml` is invalid or expired. Re-connect. |
+
+## Scheduling
+
+`canonry schedule` supports `--kind traffic-sync`. Recurring syncs are
+safe because of the `last_event_ids` cross-sync dedupe ring buffer
+described above. Recommended cadence:
+
+| Cadence | Use case |
+|---|---|
+| `0 */6 * * *` (every 6h) | Production agencies tracking active client sites |
+| `0 0 * * *` (daily) | Lower-traffic sites or local dev |
+| Manual only | First few weeks while validating data |
+
+## Telemetry
+
+Every successful or failed sync emits a `traffic.synced` event to the
+canonry telemetry pipeline:
+
+```jsonc
+{
+  "event": "traffic.synced",
+  "errorCode": "PROVIDER_AUTH", // present only when status='failed'
+  "properties": {
+    "status": "completed" | "failed",
+    "sourceType": "cloud-run", // adapter type
+    "sourceId": "<uuid>", // opaque
+    "pulledEvents": 234,
+    "crawlerHits": 200,
+    "aiReferralHits": 12,
+    "durationMs": 4150
+  }
+}
+```
+
+Counts are aggregate. The sourceId is an opaque UUID. No raw paths,
+domains, or PII are surfaced.
+
+## Limits & caveats
+
+- **Path-level citation cross-reference is not implemented yet.** The
+  citation store is domain-grain (`query_snapshots.cited_domains`). A
+  future iteration that lands URL-grain citation evidence will extend
+  the `topCrawledPaths` entry with a `citationState` flag. Until then,
+  treat the report's crawled-paths table as "engine attention" — the
+  signal is the bot fetched it, not whether it was cited.
+- **Verified vs unverified.** The headline numbers count only
+  rDNS-verified hits. Unverified bots claim a known UA but couldn't be
+  cross-confirmed via reverse-DNS — they may be the real bot or an
+  imitator. Don't promote unverified counts in client-facing copy.
+- **Cloud Run only in v1.** WordPress plugin and other adapters are
+  planned. The doctor checks and the report renderer are already
+  adapter-agnostic — adding a new adapter is just a new entry in
+  `traffic_sources.source_type` and a `TrafficSourceValidator`
+  registration.
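The Architecture section of the added `server-side-traffic.md` describes `crawler_events_hourly` as one row per `(project, source, hour, bot, verification, path, status)`. The following is a minimal sketch of how such an hourly roll-up could be computed. It is not the package's implementation: the function name `rollUpCrawlerEvents`, the input event shape, the `hits` counter, and the sample values are all assumptions made for illustration.

```javascript
// Hypothetical sketch: the event shape and the `hits` field are assumed, not taken from the package.
function rollUpCrawlerEvents(events) {
  const buckets = new Map();
  for (const e of events) {
    // Truncate the timestamp to the start of its UTC hour.
    const hour = new Date(e.timestamp).toISOString().slice(0, 13) + ":00:00.000Z";
    // One bucket per documented key tuple.
    const key = JSON.stringify([e.project, e.source, hour, e.bot, e.verification, e.path, e.status]);
    const row = buckets.get(key) ?? {
      project: e.project,
      source: e.source,
      hour,
      bot: e.bot,
      verification: e.verification,
      path: e.path,
      status: e.status,
      hits: 0
    };
    row.hits += 1;
    buckets.set(key, row);
  }
  return [...buckets.values()];
}

// Illustrative input only; these values are made up.
const base = { project: "demo", source: "src-1", bot: "GPTBot", verification: "verified", path: "/pricing", status: 200 };
const rows = rollUpCrawlerEvents([
  { ...base, timestamp: "2026-04-30T12:05:00Z" },
  { ...base, timestamp: "2026-04-30T12:55:00Z" },
  { ...base, timestamp: "2026-04-30T13:01:00Z" }
]);
console.log(rows.map((r) => [r.hour, r.hits]));
// [ [ '2026-04-30T12:00:00.000Z', 2 ], [ '2026-04-30T13:00:00.000Z', 1 ] ]
```

Keying on the full tuple means two fetches of the same path by the same bot in the same hour collapse into one row, while a different status code or verification state produces a separate row.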
@@ -17118,6 +17118,7 @@ async function trafficRoutes(app, opts) {
     Math.min(windowEnd.getTime(), Math.max(requestedStartMs, lastSyncedMs))
   );
   const startedAt = windowEnd.toISOString();
+  const syncStartedAtMs = windowEnd.getTime();
   const runId = crypto20.randomUUID();
   app.db.insert(runs).values({
     id: runId,
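The first context line of this hunk is the expression that clamps the sync window start. As a rough illustration of what that expression does in isolation, here is a small sketch. The helper name `clampWindowStart` and the sample timestamps are hypothetical, and treating a never-synced source as `lastSyncedMs = 0` is an assumption; only the `Math.min(..., Math.max(...))` expression comes from the diff.

```javascript
// Hypothetical helper wrapping the clamp expression shown in the hunk above.
function clampWindowStart(windowEnd, requestedStartMs, lastSyncedMs) {
  // Start no earlier than the last sync, and no later than the window end.
  const startMs = Math.min(windowEnd.getTime(), Math.max(requestedStartMs, lastSyncedMs));
  return new Date(startMs);
}

const windowEnd = new Date("2026-04-30T12:00:00.000Z");
const thirtyDaysMs = 30 * 24 * 60 * 60 * 1000;
const requestedStartMs = windowEnd.getTime() - thirtyDaysMs; // 30-day lookback
const lastSyncedMs = Date.parse("2026-04-29T12:00:00.000Z"); // synced one day ago

const windowStart = clampWindowStart(windowEnd, requestedStartMs, lastSyncedMs);
console.log(windowStart.toISOString()); // 2026-04-29T12:00:00.000Z
```

With these inputs the requested 30-day lookback is pulled forward to the last sync time, which matches the "clamped forward to lastSyncedAt" behaviour described in the added documentation.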
@@ -17129,19 +17130,32 @@ async function trafficRoutes(app, opts) {
     startedAt,
     createdAt: startedAt
   }).run();
-  const markFailed = (msg) => {
+  const markFailed = (msg, errorCode) => {
     const failedAt = (/* @__PURE__ */ new Date()).toISOString();
     app.db.transaction((tx) => {
       tx.update(runs).set({ status: RunStatuses.failed, error: msg, finishedAt: failedAt }).where(eq23(runs.id, runId)).run();
       tx.update(trafficSources).set({ status: TrafficSourceStatuses.error, lastError: msg, updatedAt: failedAt }).where(eq23(trafficSources.id, sourceRow.id)).run();
     });
+    try {
+      opts.onTrafficSynced?.({
+        status: "failed",
+        sourceType: sourceRow.sourceType,
+        sourceId: sourceRow.id,
+        pulledEvents: 0,
+        crawlerHits: 0,
+        aiReferralHits: 0,
+        durationMs: Date.now() - syncStartedAtMs,
+        errorCode
+      });
+    } catch {
+    }
   };
   let accessToken;
   try {
     accessToken = await resolveAccessToken2(credential);
   } catch (e) {
     const msg = e instanceof Error ? e.message : String(e);
-    markFailed(msg);
+    markFailed(msg, "PROVIDER_AUTH");
     throw providerError(`Failed to resolve Cloud Run access token: ${msg}`);
   }
   let allEvents = [];
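The added `try { opts.onTrafficSynced?.(…) } catch { }` block makes the telemetry hook best-effort: a missing callback is a no-op, and a throwing callback cannot change the sync outcome. The sketch below isolates that pattern. `reportSync`, the `received` array, and the sample event values are hypothetical; only the optional-call-inside-try/catch shape is taken from the diff.

```javascript
// Hypothetical stand-in for the hook call site; not the package's code.
function reportSync(opts, event) {
  try {
    // Optional chaining skips the call when no callback is registered.
    opts.onTrafficSynced?.(event);
  } catch {
    // Swallow telemetry errors so they cannot mask the sync result.
  }
}

const received = [];
const failedEvent = {
  status: "failed",
  sourceType: "cloud-run",
  sourceId: "example-id", // illustrative value
  pulledEvents: 0,
  crawlerHits: 0,
  aiReferralHits: 0,
  durationMs: 12,
  errorCode: "PROVIDER_AUTH"
};

reportSync({ onTrafficSynced: (e) => received.push(e) }, failedEvent); // delivered
reportSync({}, failedEvent); // no callback registered: nothing happens
reportSync({ onTrafficSynced: () => { throw new Error("telemetry down"); } }, failedEvent); // error swallowed

console.log(received.length); // 1
```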
@@ -17158,7 +17172,7 @@ async function trafficRoutes(app, opts) {
     allEvents = page.events;
   } catch (e) {
     const msg = e instanceof Error ? e.message : String(e);
-    markFailed(msg);
+    markFailed(msg, "PROVIDER_PULL");
     throw providerError(`Cloud Run pull failed: ${msg}`);
   }
   const seenEventIds = new Set(parseJsonColumn(sourceRow.lastEventIds, []));
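The last context line of this hunk seeds a `Set` from the stored `lastEventIds`, which is the cross-sync dedupe the added documentation calls a ring buffer. The rest of that logic is not visible in the diff, so the following is only a sketch of one way such a bounded dedupe could work. The function name `dedupeAgainstRing`, the `{ id }` event shape, and the capacity of 3 are assumptions.

```javascript
// Hypothetical sketch of bounded cross-sync dedupe; not the package's implementation.
function dedupeAgainstRing(events, lastEventIds, capacity) {
  const seen = new Set(lastEventIds);
  const fresh = [];
  for (const event of events) {
    if (seen.has(event.id)) continue; // already counted by an earlier sync or earlier in this batch
    seen.add(event.id);
    fresh.push(event);
  }
  // Keep only the most recent `capacity` IDs for the next run.
  const ring = [...lastEventIds, ...fresh.map((e) => e.id)].slice(-capacity);
  return { fresh, ring };
}

const first = dedupeAgainstRing([{ id: "a" }, { id: "b" }], [], 3);
// The second sync overlaps the first: "b" was already pulled.
const second = dedupeAgainstRing([{ id: "b" }, { id: "c" }, { id: "d" }], first.ring, 3);

console.log(second.fresh.map((e) => e.id)); // [ 'c', 'd' ]
console.log(second.ring); // [ 'b', 'c', 'd' ]
```

In this sketch an ID that has been evicted from the ring could be counted again if it reappears in a later pull, so the capacity would need to cover the largest expected overlap between syncs.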
@@ -17292,6 +17306,18 @@ async function trafficRoutes(app, opts) {
     entityType: "traffic_source",
     entityId: sourceRow.id
   });
+  try {
+    opts.onTrafficSynced?.({
+      status: "completed",
+      sourceType: sourceRow.sourceType,
+      sourceId: sourceRow.id,
+      pulledEvents: report.totals.normalizedEvents,
+      crawlerHits: report.totals.crawlerHits,
+      aiReferralHits: report.totals.aiReferralHits,
+      durationMs: Date.now() - syncStartedAtMs
+    });
+  } catch {
+  }
   const response = {
     sourceId: sourceRow.id,
     runId,
@@ -18633,7 +18659,8 @@ async function apiRoutes(app, opts) {
   await api.register(trafficRoutes, {
     cloudRunCredentialStore: opts.cloudRunCredentialStore,
     pullCloudRunEvents: opts.pullCloudRunEvents,
-    resolveCloudRunAccessToken: opts.resolveCloudRunAccessToken
+    resolveCloudRunAccessToken: opts.resolveCloudRunAccessToken,
+    onTrafficSynced: opts.onTrafficSynced
   });
   await api.register(backlinksRoutes, {
     getBacklinksStatus: opts.getBacklinksStatus,
@@ -25830,6 +25857,17 @@ async function createServer(opts) {
     wordpressConnectionStore,
     ga4CredentialStore,
     cloudRunCredentialStore,
+    onTrafficSynced: (event) => {
+      trackEvent("traffic.synced", {
+        status: event.status,
+        sourceType: event.sourceType,
+        sourceId: event.sourceId,
+        pulledEvents: event.pulledEvents,
+        crawlerHits: event.crawlerHits,
+        aiReferralHits: event.aiReferralHits,
+        durationMs: event.durationMs
+      }, event.errorCode ? { errorCode: event.errorCode } : void 0);
+    },
     onRunCreated: (runId, projectId, providers2, location) => {
       jobRunner.executeRun(runId, projectId, providers2, location).catch((err) => {
         app.log.error({ runId, err }, "Job runner failed");
package/dist/cli.js
CHANGED
package/dist/index.js
CHANGED
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "@ainyc/canonry",
-  "version": "4.
+  "version": "4.19.0",
   "type": "module",
   "description": "Agent-first open-source AEO operating platform - track how answer engines cite your domain",
   "license": "FSL-1.1-ALv2",
@@ -63,19 +63,19 @@
     "@ainyc/canonry-db": "0.0.0",
     "@ainyc/canonry-intelligence": "0.0.0",
     "@ainyc/canonry-integration-bing": "0.0.0",
-    "@ainyc/canonry-api-routes": "0.0.0",
-    "@ainyc/canonry-integration-commoncrawl": "0.0.0",
     "@ainyc/canonry-integration-cloud-run": "0.0.0",
     "@ainyc/canonry-contracts": "0.0.0",
     "@ainyc/canonry-integration-google": "0.0.0",
     "@ainyc/canonry-integration-traffic": "0.0.0",
-    "@ainyc/canonry-
+    "@ainyc/canonry-api-routes": "0.0.0",
     "@ainyc/canonry-provider-cdp": "0.0.0",
+    "@ainyc/canonry-integration-wordpress": "0.0.0",
     "@ainyc/canonry-provider-claude": "0.0.0",
-    "@ainyc/canonry-
+    "@ainyc/canonry-integration-commoncrawl": "0.0.0",
     "@ainyc/canonry-provider-local": "0.0.0",
+    "@ainyc/canonry-provider-openai": "0.0.0",
     "@ainyc/canonry-provider-perplexity": "0.0.0",
-    "@ainyc/canonry-provider-
+    "@ainyc/canonry-provider-gemini": "0.0.0"
   },
   "scripts": {
     "build": "tsx scripts/copy-agent-assets.ts && tsup && tsx build-web.ts",