@ainyc/canonry 4.83.0 → 4.85.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (32) hide show
  1. package/assets/agent-workspace/skills/aero/SKILL.md +14 -10
  2. package/assets/agent-workspace/skills/aero/references/aeo-discovery.md +7 -5
  3. package/assets/agent-workspace/skills/aero/references/memory-patterns.md +1 -1
  4. package/assets/agent-workspace/skills/aero/references/orchestration.md +22 -20
  5. package/assets/agent-workspace/skills/aero/references/regression-playbook.md +11 -9
  6. package/assets/agent-workspace/skills/aero/references/reporting.md +14 -9
  7. package/assets/agent-workspace/skills/aero/soul.md +5 -5
  8. package/assets/agent-workspace/skills/canonry/SKILL.md +1 -1
  9. package/assets/agent-workspace/skills/canonry/references/aeo-analysis.md +84 -36
  10. package/assets/agent-workspace/skills/canonry/references/canonry-cli.md +44 -10
  11. package/assets/assets/{BacklinksPage-OrSg_iPA.js → BacklinksPage-CDAv0ggn.js} +1 -1
  12. package/assets/assets/{ChartPrimitives-DPBhAT_r.js → ChartPrimitives-CnAmsyt7.js} +1 -1
  13. package/assets/assets/{ProjectPage-CpMcEmtw.js → ProjectPage-C9KEgRxD.js} +1 -1
  14. package/assets/assets/{RunRow-2Rty0BAH.js → RunRow-CVZ5o8fg.js} +1 -1
  15. package/assets/assets/{RunsPage-B3ahqf8s.js → RunsPage-Bzy5c0MZ.js} +1 -1
  16. package/assets/assets/{SettingsPage-BIjeI85q.js → SettingsPage-B1ocxPBe.js} +1 -1
  17. package/assets/assets/{TrafficPage-DjGoj691.js → TrafficPage-D2zepQOC.js} +1 -1
  18. package/assets/assets/{TrafficSourceDetailPage-BgKG-2q3.js → TrafficSourceDetailPage-C7JuAkaK.js} +1 -1
  19. package/assets/assets/{arrow-left-Cf7wmru1.js → arrow-left-Bv3CWylm.js} +1 -1
  20. package/assets/assets/{extract-error-message-CANxezte.js → extract-error-message-BtVid5TP.js} +1 -1
  21. package/assets/assets/{index-CGlPx_cu.js → index-DmNti_xn.js} +72 -72
  22. package/assets/assets/{trash-2-6nHJZrvy.js → trash-2-BoimCsYz.js} +1 -1
  23. package/assets/index.html +1 -1
  24. package/dist/{chunk-GOWH42QV.js → chunk-3K3QRSYE.js} +709 -426
  25. package/dist/{chunk-Y3O3HBMN.js → chunk-62YB3ML7.js} +49 -1
  26. package/dist/{chunk-NRACXNI7.js → chunk-7BMSWI2K.js} +6 -4
  27. package/dist/{chunk-BNF3HXBW.js → chunk-I2BJC3DT.js} +1021 -942
  28. package/dist/cli.js +165 -29
  29. package/dist/index.js +4 -4
  30. package/dist/{intelligence-service-FHQM7YMA.js → intelligence-service-AHHBQKRD.js} +2 -2
  31. package/dist/mcp.js +2 -2
  32. package/package.json +7 -7
@@ -103,24 +103,35 @@ Use `--probe` whenever you're testing on your own initiative — verifying a fix
103
103
 
104
104
  `snapshot` does not create a project or write to the DB. It generates category queries, runs providers, and produces a report for prospecting.
105
105
 
106
- ## Citation Data
106
+ ## Mention + Citation Data
107
+
108
+ Two independent signals per (query × provider): **mention** (`answerMentioned` — brand named in the answer text; the **primary** read) and **citation** (`cited`/`citedDomains` — domain in the grounding sources; **secondary**). Read mention first. Never compute one from the other; never coerce `answerMentioned` null → false (null = "not checked").
107
109
 
108
110
  ```bash
109
- cnry evidence <project> # per-query cited/not-cited + mentioned/not-mentioned
111
+ cnry evidence <project> # per-query [C/c][M/m] cell + Mentioned: X / Y / Cited: X / Y
110
112
  cnry evidence <project> --format json # JSON output
111
113
  cnry history <project> # audit trail
112
114
  cnry export <project> --include-results # export as YAML
113
- cnry backfill answer-visibility # recompute citationState from stored answers
114
- cnry backfill answer-visibility --dry-run # preview which snapshots would change
115
- cnry backfill answer-mentions # recompute answerMentioned from stored answers (honors brandAliases)
115
+ cnry backfill answer-mentions # recompute answerMentioned (primary) from stored answers (honors brandAliases)
116
116
  cnry backfill answer-mentions --dry-run
117
+ cnry backfill answer-visibility # recompute citationState (secondary) from stored answers
118
+ cnry backfill answer-visibility --dry-run # preview which snapshots would change
117
119
  cnry backfill insights <project> # recompute insights for completed runs
118
120
  cnry backfill insights <project> --since 2026-04-01 --dry-run
119
121
  ```
120
122
 
121
- Output uses a two-glyph cell per (query × provider): `[C/c][M/m]` — uppercase = present, lowercase = absent, `–` = no snapshot. Always print the legend before the table; never collapse the two signals into one cell.
123
+ Output uses a two-glyph cell per (query × provider): `[C/c][M/m]` — uppercase = present, lowercase = absent, `–` = no snapshot. **C/c = cited** (secondary), **M/m = mentioned** (primary). Always print the legend before the table; never collapse the two signals into one cell.
124
+
125
+ ```
126
+ Legend: [C/c][M/m] C=cited c=not-cited M=mentioned m=not-mentioned –=no snapshot
127
+
128
+ [C][M] acme corp ny ← mentioned AND cited
129
+ [c][M] best crm for smb ← mentioned, not cited (mention win, citation gap)
130
+ [C][m] crm pricing ← cited, not mentioned (citation without share of voice)
131
+ [c][m] free crm tools ← neither
132
+ ```
122
133
 
123
- Summary: `Cited: X / Y` and `Mentioned: X / Y` are reported independently — a query can be one, both, or neither.
134
+ Summary: `Mentioned: X / Y` (primary) and `Cited: X / Y` (secondary) are reported independently — a query can be one, both, or neither.
124
135
 
125
136
  ## Reports
126
137
 
@@ -144,8 +155,8 @@ Behavior to know when narrating numbers from the report:
144
155
 
145
156
  ```bash
146
157
  cnry analytics <project> # default analytics view
147
- cnry analytics <project> --feature metrics # citation rate trends
148
- cnry analytics <project> --feature gaps # brand gap analysis (cited/gap/uncited)
158
+ cnry analytics <project> --feature metrics # mention + citation rate trends (BrandMetricsDto: mentionTrend primary, trend secondary)
159
+ cnry analytics <project> --feature gaps # brand gap analysis — mention buckets (mentionedQueries[]/mentionGap[]/notMentioned[]) primary, cited buckets (cited[]/gap[]/uncited[]) secondary
149
160
  cnry analytics <project> --feature sources # source breakdown by category
150
161
  cnry analytics <project> --window 7d # time window: 7d, 30d, 90d, all
151
162
  ```
@@ -167,6 +178,24 @@ cnry sources <project> --rank --format jsonl # stream the ranked domains, one
167
178
  - The ranked list is **not truncated** by default (the old top-5-per-category cap is gone). Pass `--limit N` to cap each list; the response carries `truncatedDomainCount` / `truncatedCitedSlots` so totals always reconcile.
168
179
  - Counts are **cited slots** (grounding citations), so a domain cited 3× in one answer counts 3. Probe runs are excluded.
169
180
 
181
+ ### Aggregated visibility stats (`cnry visibility-stats`)
182
+
183
+ Per-query mention (answer-text) and citation (source-list) **counts with a sample size**, pooled across many answer-visibility runs — the data to compute a confidence-aware proportion (e.g. Wilson) or detect drift without fetching every run. Backed by `GET /api/v1/projects/<name>/visibility-stats` and the `canonry_visibility_stats` MCP tool. Probe runs and non-`answer-visibility` runs are excluded; only completed/partial runs count.
184
+
185
+ ```bash
186
+ cnry visibility-stats <project> # all runs; per-query cited/total + mentioned/checked + pooled TOTAL
187
+ cnry visibility-stats <project> --last-runs 10 # most recent 10 runs (mutually exclusive with --since/--until)
188
+ cnry visibility-stats <project> --since 2026-06-01 --until 2026-06-30 # ISO date/time window on run createdAt
189
+ cnry visibility-stats <project> --by-provider # per-provider breakdown (counts sum to the pooled counts)
190
+ cnry visibility-stats <project> --format jsonl # stream one record per query, stamped with project + runCount
191
+ ```
192
+
193
+ - **Tri-state aware:** `checked` counts only snapshots where `answerMentioned` was recorded — `null` ("not checked") is **excluded**, never counted as not-mentioned. So `checked` is the correct `n` for a mention proportion. `mentionRate = mentioned/checked`; `citedRate = cited/total` (citation_state is always populated, so the citation `n` is `total`). Both rates are `null` when their denominator is 0 (undefined over no samples).
194
+ - **Date-only window:** `--since`/`--until` accept a full ISO instant or a bare `YYYY-MM-DD`. A date-only `--until 2026-06-30` covers the **whole** UTC day (through 23:59:59.999), so same-day runs are included; a date-only `--since` is that day's start.
195
+ - **Unbounded by default:** with no `--since`/`--until`/`--last-runs`, every completed/partial run is pooled (`window.runCount` reports how many). For a recent sample, bound it with `--last-runs N`.
196
+ - **`groupBy` in the payload:** present (`"provider"`) only with `--by-provider`; omitted otherwise (absent = no breakdown) — the generated SDK types it `groupBy?: 'provider'`.
197
+ - **mention vs cited stay independent** — a model can do either, both, or neither. Don't read one from the other.
198
+
170
199
  ## Technical AEO (site audit)
171
200
 
172
201
  Site-wide technical audit (structured data, AI-readable content, AI-crawler access, content depth/freshness/extractability, …) powered by `@ainyc/aeo-audit`'s `runSitemapAudit`. Runs as the `site-audit` run kind — crawls the project's `sitemap.xml` and scores every reachable page, then rolls up into one 0–100 site score. Pure HTTP, no LLM cost; a large site can take minutes, so it runs in the background.
@@ -190,7 +219,7 @@ cnry insights <project> # list active insights (regressio
190
219
  cnry insights <project> --dismissed # include dismissed insights
191
220
  cnry insights <project> --format json # JSON output
192
221
  cnry insights dismiss <project> <id> # dismiss an insight
193
- cnry health <project> # latest citation health snapshot
222
+ cnry health <project> # latest citation health snapshot (citation-only — see known gap below)
194
223
  cnry health <project> --history # health trend over time
195
224
  cnry health <project> --history --limit 10 # limit history entries
196
225
  cnry health <project> --format json # JSON output
@@ -198,6 +227,8 @@ cnry backfill insights <project> # backfill insights for all comple
198
227
  cnry backfill insights <project> --from-run <id> --to-run <id> # backfill a range
199
228
  ```
200
229
 
230
+ > **Known gap (mention-first read):** `cnry health` is **citation-only** today — it has no mention dimension. For the primary mention-first read, use `cnry overview` and `cnry get <project> scores.mentionCoverage.value` / `cnry get <project> scores.mentionShare.value` until health is extended.
231
+
201
232
  ## Queries & Competitors
202
233
 
203
234
  ```bash
@@ -237,6 +268,8 @@ Available events: `citation.lost`, `citation.gained`, `run.completed`, `run.fail
237
268
 
238
269
  `insight.critical` and `insight.high` fire when the intelligence engine generates critical- or high-severity insights after a sweep completes.
239
270
 
271
+ > **No mention events yet.** Notification events cover the citation signal only — there are **no** `mention.lost` / `mention.gained` events today. For mention-first monitoring, read `scores.mentionCoverage` / `scores.mentionShare` via `cnry overview` (or the `insight.*` events, which can be driven by mention-side insights); do not wire automation to mention events that aren't emitted.
272
+
240
273
  ## Provider Settings & Quotas
241
274
 
242
275
  ```bash
@@ -640,6 +673,7 @@ Compact reference for the composite / keyed commands agents read most (shapes ca
640
673
  | `cnry doctor [--project p] [--all]` | `{ scope, project, generatedAt, durationMs, summary{total,ok,warn,fail,skipped}, checks[] }` — `DoctorReportDto` @ `contracts/doctor.ts`. `checks[]` = `CheckResultDto{ id, category, scope, title, status(ok\|warn\|fail\|skipped), code, summary, remediation?, details?, durationMs }`. With `--all`: an object keyed by `__global__` + each project name, each value a full report. | ✅ one check / line as `{project, …check}`; still exits non-zero if any `fail` |
641
674
  | `cnry analytics <p> [--feature metrics\|gaps\|sources] [--window 7d\|30d\|90d\|all]` | Object **keyed by feature**: `{ metrics?, gaps?, sources? }` (all three present with no `--feature`; one with `--feature X`). `metrics`=`BrandMetricsDto{ window, buckets[], overall, byProvider, trend, mentionTrend, queryChanges[] }`; `gaps`=`GapAnalysisDto{ cited[], gap[], uncited[], mentionedQueries[], mentionGap[], notMentioned[], runId, window }` (each `[]`=`GapQuery`); `sources`=`SourceBreakdownDto` (same shape as `cnry sources`, below). @ `contracts/analytics.ts` | → degrades to the `json` document |
642
675
  | `cnry sources <p> [--rank] [--limit N] [--by-provider] [--window …]` | `SourceBreakdownDto{ overall[], byQuery, ranked, byProvider, runId, window, limit }` @ `contracts/analytics.ts`. `ranked`/each `byProvider[name]` = `RankedSourceList{ totalCitedSlots, domainTotal, entries[], truncatedDomainCount, truncatedCitedSlots, bySurfaceClass[] }`; `entries[]`=`SourceRankEntry{ domain, count, percentage, category, label, surfaceClass }`; `bySurfaceClass[]`=`SurfaceClassCount{ surfaceClass, label, count, percentage, domainCount }`. `surfaceClass` ∈ own \| direct-competitor \| ota-aggregator \| editorial-media \| other. | ✅ streams `ranked.entries` one / line as `{project, …entry}` |
676
+ | `cnry visibility-stats <p> [--since <iso>] [--until <iso>] [--last-runs N] [--by-provider]` | `VisibilityStatsDto{ project, groupBy, window{since,until,lastRuns,runCount}, totals, byProvider?[], queries[] }` @ `contracts/visibility-stats.ts`. Each query / provider / totals entry = `{ total, checked, mentioned, cited, mentionRate, citedRate }` (+ `query`/`queryId`/`firstObserved`/`lastObserved` on queries, + `provider`/observed on provider entries). `checked`=snapshots with non-null `answerMentioned` (tri-state n for mention); `mentionRate=mentioned/checked`, `citedRate=cited/total`, both `null` on a 0 denominator. `byProvider`/per-query `providers` present only with `--by-provider`; counts sum to pooled. | ✅ streams `queries` one / line as `{project, runCount, …query}` |
643
677
  | `cnry google coverage <p>` (index coverage) | `{ summary{total,indexed,notIndexed,deindexed,percentage}, lastInspectedAt, lastSyncedAt, indexed[], notIndexed[], deindexed[], reasonGroups[] }` — `GscCoverageSummaryDto` @ `contracts/google.ts`. `indexed[]`/`notIndexed[]`=`GscUrlInspectionDto`, `deindexed[]`=`GscDeindexedRowDto`. | → degrades to the `json` document. The single-array reads `google inspections` / `coverage-history` / `deindexed` **stream** `jsonl`. |
644
678
  | `cnry ga traffic <p> [--window …]` | Object summary — `GA4TrafficSummaryDto` / `GaTrafficResponse` @ `contracts/ga.ts`: `{ totalSessions, totalOrganicSessions, totalDirectSessions, totalUsers, aiSessionsDeduped, aiUsersDeduped, aiSessionsBySession, aiUsersBySession, socialSessions, socialUsers, channelBreakdown{organic,social,direct,ai,other→{sessions,sharePct,sharePctDisplay}}, *SharePct (+ `*Display`), topPages[], aiReferrals[], aiReferralLandingPages[], socialReferrals[], lastSyncedAt, periodStart, periodEnd }`. | → degrades to the `json` document |
645
679
  | `cnry ga attribution <p> [--trend]` | Object — a **renamed projection** of `GaTrafficResponse` (⚠️ field names differ from the DTO): `aiSessions`(←`aiSessionsDeduped`), `organicSessions`(←`totalOrganicSessions`), `directSessions`(←`totalDirectSessions`), plus `totalSessions, totalUsers, aiUsers, aiSessionsBySession, aiUsersBySession, socialSessions, socialUsers, {ai,social,organic,direct}SharePct (+ `*Display`), otherSessions, otherSharePct, channelBreakdown, aiReferrals[], aiReferralLandingPages[], socialReferrals[], periodStart, periodEnd`. With `--trend`: drops `periodStart/End`, adds `trend` (`GaAttributionTrendResponse`). Assembled inline in `commands/ga.ts`. | → degrades to the `json` document |
@@ -1 +1 @@
1
- import{r as n,j as e}from"./vendor-tanstack-Dq7p98wZ.js";import{c as E,bO as O,ae as V,bP as U,bQ as Y,bR as K,g as l,aH as L,T as f,B as z,i as A,an as J,bS as X,bT as Z,ah as ee,bU as se}from"./index-CGlPx_cu.js";import{C as ae,D as te,T as ne,a as ce}from"./trash-2-6nHJZrvy.js";import"./vendor-radix-B57xfQbP.js";import"./vendor-recharts-ClRVR6aX.js";import"./vendor-markdown-DK7fbRNb.js";const ie=[["circle",{cx:"12",cy:"12",r:"10",key:"1mglay"}],["line",{x1:"12",x2:"12",y1:"8",y2:"12",key:"1pkeuh"}],["line",{x1:"12",x2:"12.01",y1:"16",y2:"16",key:"4dfq90"}]],re=E("circle-alert",ie);const le=[["path",{d:"M15 3h6v6",key:"1q9fwt"}],["path",{d:"M10 14 21 3",key:"gplh6r"}],["path",{d:"M18 13v6a2 2 0 0 1-2 2H5a2 2 0 0 1-2-2V8a2 2 0 0 1 2-2h6",key:"a6xqqp"}]],de=E("external-link",le);function h({children:a,label:m="More info",placement:t="top",className:d}){const o=n.useId(),[y,r]=n.useState(!1);return e.jsxs("span",{className:`relative inline-flex ${d??""}`,children:[e.jsx("button",{type:"button","aria-label":m,"aria-describedby":y?o:void 0,className:"inline-flex h-4 w-4 items-center justify-center rounded-full text-zinc-500 hover:text-zinc-200 focus:text-zinc-200 focus:outline-none focus-visible:ring-1 focus-visible:ring-zinc-500",onMouseEnter:()=>r(!0),onMouseLeave:()=>r(!1),onFocus:()=>r(!0),onBlur:()=>r(!1),children:e.jsx(ce,{className:"h-3.5 w-3.5","aria-hidden":!0})}),y&&e.jsx("span",{id:o,role:"tooltip",className:`absolute z-50 w-64 rounded border border-zinc-700 bg-zinc-900 px-3 py-2 text-xs font-normal leading-relaxed text-zinc-200 shadow-lg ${t==="top"?"bottom-full mb-2":"top-full mt-2"} left-1/2 -translate-x-1/2 whitespace-normal`,children:a})]})}const oe="https://commoncrawl.org/web-graphs";function w(a){return a==null?"—":a>=1e12?`${(a/1e12).toFixed(1)} TB`:a>=1e9?`${(a/1e9).toFixed(1)} GB`:a>=1e6?`${(a/1e6).toFixed(1)} MB`:a>=1e3?`${(a/1e3).toFixed(1)} KB`:`${a} B`}function b(a){if(!a)return"—";const m=Date.now()-new Date(a).getTime(),t=Math.floor(m/6e4);if(t<1)return"just now";if(t<60)return`${t}m ago`;const d=Math.floor(t/60);return d<24?`${d}h ago`:`${Math.floor(d/24)}d ago`}function v(a){switch(a){case"ready":return"positive";case"failed":return"negative";case"downloading":case"querying":case"queued":return"caution"}}function fe(){const[a,m]=n.useState(null),[t,d]=n.useState(null),[o,y]=n.useState([]),[r,P]=n.useState([]),[u,M]=n.useState(null),[D,S]=n.useState(!0),[C,B]=n.useState(!1),[p,R]=n.useState(!1),[N,g]=n.useState(""),[F,k]=n.useState(!1),[q,i]=n.useState(null),[I,x]=n.useState(null),j=n.useCallback(async()=>{S(!0),i(null);try{const[s,c,_,H,W]=await Promise.all([O(),V().catch(()=>null),U().catch(()=>[]),Y().catch(()=>[]),K().catch(()=>null)]);m(s),d(c),y(_),P(H),M(W)}catch(s){i(s instanceof Error?s.message:"Failed to load backlinks status")}finally{S(!1)}},[]);n.useEffect(()=>{j()},[j]);async function T(){B(!0),i(null),x(null);try{const s=await X();x(s.alreadyPresent?`DuckDB already installed (${s.version}).`:`Installed DuckDB ${s.version}.`),await j()}catch(s){i(s instanceof Error?s.message:"Failed to install DuckDB")}finally{B(!1)}}async function $(){const s=N.trim()||void 0;R(!0),i(null),x(null);try{const c=await Z(s);x(s?`Queued sync for ${c.release}. Download + query runs in the background.`:`Queued sync for auto-discovered release ${c.release}. Download + query runs in the background.`),g(""),k(!1),await j()}catch(c){c instanceof ee&&c.code==="MISSING_DEPENDENCY"?i("DuckDB is not installed. Install it first."):i(c instanceof Error?c.message:"Failed to trigger sync")}finally{R(!1)}}async function Q(s){i(null),x(null);try{await se(s),x(`Pruned cached release ${s}.`),await j()}catch(c){i(c instanceof Error?c.message:"Failed to prune release")}}const G=t?.status==="ready"&&r.every(s=>s.release!==t.release);return e.jsxs("div",{className:"page-container",children:[e.jsx("div",{className:"page-header",children:e.jsxs("div",{className:"page-header-left",children:[e.jsx("h1",{className:"page-title",children:"Backlinks"}),e.jsx("p",{className:"page-subtitle",children:"Find domains that link to your projects, computed from the open Common Crawl web graph. Runs entirely on your machine — nothing is sent to third parties."})]})}),e.jsx(l,{className:"surface-card p-4 mb-6 border-amber-800/60",children:e.jsxs("div",{className:"flex items-start gap-3",children:[e.jsx(L,{className:"h-5 w-5 text-amber-400 shrink-0 mt-0.5","aria-hidden":!0}),e.jsxs("div",{className:"text-sm text-zinc-300 leading-relaxed",children:[e.jsx("p",{className:"font-medium text-amber-200",children:"Heads up — a release sync is a large download."}),e.jsxs("ul",{className:"mt-1.5 space-y-1 text-zinc-400",children:[e.jsxs("li",{children:[e.jsx("span",{className:"text-zinc-200",children:"~16 GB"})," of gzipped vertex + edge files per release, stored at"," ",e.jsx("code",{className:"text-zinc-300",children:"~/.canonry/cache/commoncrawl/"}),"."]}),e.jsxs("li",{children:[e.jsx("span",{className:"text-zinc-200",children:"10–20 min on a fast connection"})," for the download, then ~5 min for the DuckDB query."]}),e.jsx("li",{children:"One sync covers every project in this workspace. Releases are immutable, so the download only happens once per release."})]})]})]})}),e.jsxs("section",{className:"page-section-divider",children:[e.jsx("div",{className:"section-head section-head-inline",children:e.jsxs("div",{children:[e.jsx("p",{className:"eyebrow eyebrow-soft",children:"About"}),e.jsx("h2",{children:"How it works"})]})}),e.jsxs(l,{className:"surface-card p-5",children:[e.jsxs("p",{className:"text-sm text-zinc-400 leading-relaxed max-w-3xl mb-4",children:["Common Crawl publishes a quarterly snapshot of the public web’s hyperlink graph. Canonry downloads one"," ",e.jsx("span",{className:"text-zinc-200",children:"release"})," at a time and extracts backlinks for every project in this workspace in a single pass."]}),e.jsxs("ol",{className:"space-y-3 text-sm text-zinc-400 max-w-3xl",children:[e.jsxs("li",{className:"flex gap-3",children:[e.jsx("span",{className:"shrink-0 inline-flex h-6 w-6 items-center justify-center rounded-full border border-zinc-700 bg-zinc-900 text-xs font-semibold text-zinc-300 tabular-nums",children:"1"}),e.jsxs("span",{children:[e.jsx("span",{className:"text-zinc-200 font-medium",children:"Download (one-time, ~16 GB)"})," — vertex + edge files cached to"," ",e.jsx("code",{className:"text-zinc-300",children:"~/.canonry/cache/commoncrawl/"}),". Runs once per release; subsequent operations reuse the cache."]})]}),e.jsxs("li",{className:"flex gap-3",children:[e.jsx("span",{className:"shrink-0 inline-flex h-6 w-6 items-center justify-center rounded-full border border-zinc-700 bg-zinc-900 text-xs font-semibold text-zinc-300 tabular-nums",children:"2"}),e.jsxs("span",{children:[e.jsx("span",{className:"text-zinc-200 font-medium",children:"Query (~5 min)"})," — one DuckDB pass scans the cached files and extracts referring domains for every project’s canonical domain. DuckDB is only used to ",e.jsx("span",{className:"text-zinc-200",children:"read"})," these dumps; it doesn’t store any canonry state."]})]}),e.jsxs("li",{className:"flex gap-3",children:[e.jsx("span",{className:"shrink-0 inline-flex h-6 w-6 items-center justify-center rounded-full border border-zinc-700 bg-zinc-900 text-xs font-semibold text-zinc-300 tabular-nums",children:"3"}),e.jsxs("span",{children:[e.jsx("span",{className:"text-zinc-200 font-medium",children:"Persist"})," — results land in the same SQLite database the rest of canonry uses. After the first sync, per-project reads (and re-run extracts against the cached release) are instant."]})]})]})]})]}),q&&e.jsx(l,{className:"surface-card p-4 mb-4 border-rose-800/60",children:e.jsx("p",{className:"text-sm text-rose-300",children:q})}),I&&e.jsx(l,{className:"surface-card p-4 mb-4 border-emerald-800/60",children:e.jsx("p",{className:"text-sm text-emerald-300",children:I})}),e.jsxs("section",{className:"page-section-divider",children:[e.jsxs("div",{className:"section-head section-head-inline",children:[e.jsxs("div",{children:[e.jsx("p",{className:"eyebrow eyebrow-soft",children:"Dependency"}),e.jsxs("h2",{className:"flex items-center gap-2",children:["DuckDB install status",e.jsxs(h,{label:"Why DuckDB?",children:[e.jsx("span",{className:"block",children:"DuckDB is a query engine canonry uses to scan the ~16 GB Common Crawl dumps and pull out your referring domains."}),e.jsxs("span",{className:"mt-2 block text-zinc-400",children:["It does ",e.jsx("span",{className:"text-zinc-200",children:"not"})," store any canonry data — your backlink results live in SQLite alongside the rest of your projects. DuckDB is purely a tool for processing the raw CSV files."]}),e.jsxs("span",{className:"mt-2 block text-zinc-500",children:["Installed on demand (not bundled) into ",e.jsx("code",{className:"text-zinc-300",children:"~/.canonry/plugins/"})," so users who never run backlinks don’t pay the ~40 MB install cost."]})]})]})]}),a?.duckdbInstalled?e.jsx(f,{tone:"positive",children:"Installed"}):e.jsx(f,{tone:"caution",children:"Not installed"})]}),e.jsx(l,{className:"surface-card p-5",children:D?e.jsx("p",{className:"text-sm text-zinc-500",children:"Checking…"}):a?.duckdbInstalled?e.jsxs("div",{className:"flex items-start gap-3",children:[e.jsx(ae,{className:"h-5 w-5 text-emerald-400 shrink-0 mt-0.5","aria-hidden":!0}),e.jsxs("div",{children:[e.jsxs("p",{className:"text-sm text-zinc-200",children:["Version ",a.duckdbVersion??"unknown"," installed at"," ",e.jsx("code",{className:"text-zinc-300",children:a.pluginDir})]}),e.jsxs("p",{className:"text-xs text-zinc-500 mt-1",children:["Required spec: ",a.duckdbSpec]})]})]}):e.jsxs("div",{className:"flex items-start gap-3",children:[e.jsx(re,{className:"h-5 w-5 text-amber-400 shrink-0 mt-0.5","aria-hidden":!0}),e.jsxs("div",{className:"flex-1",children:[e.jsx("p",{className:"text-sm text-zinc-200",children:"DuckDB is not installed. It’s the query engine canonry uses to scan Common Crawl dumps — required before you can run a release sync or per-project extract."}),e.jsx("p",{className:"text-xs text-zinc-500 mt-1",children:"Installing doesn’t touch your project data. DuckDB only reads the downloaded CSV files; backlink results are written to the same SQLite database canonry already uses."}),a&&e.jsxs("p",{className:"text-xs text-zinc-500 mt-1",children:["Will be installed into ",e.jsx("code",{className:"text-zinc-300",children:a.pluginDir})," (~40 MB)."]}),e.jsx("div",{className:"mt-3",children:e.jsxs(z,{type:"button",size:"sm",disabled:C,onClick:A(T),children:[e.jsx(te,{className:"h-4 w-4 mr-1.5","aria-hidden":!0}),C?"Installing…":"Install DuckDB"]})})]})]})})]}),e.jsxs("section",{className:"page-section-divider",children:[e.jsxs("div",{className:"section-head section-head-inline",children:[e.jsxs("div",{children:[e.jsx("p",{className:"eyebrow eyebrow-soft",children:"Latest sync"}),e.jsxs("h2",{className:"flex items-center gap-2",children:["Release sync",e.jsx(h,{label:"What is a release sync?",children:"A release sync downloads one Common Crawl dump (~16 GB) and extracts backlinks for every project in this workspace in one pass. This is the heavy job — subsequent per-project re-runs skip the download and just re-query the cached files."})]})]}),t&&e.jsx(f,{tone:v(t.status),children:t.status})]}),e.jsxs(l,{className:"surface-card p-5",children:[e.jsxs("p",{className:"text-xs text-zinc-500 max-w-3xl mb-4",children:["A release is one Common Crawl dump (e.g. ",e.jsx("code",{className:"text-zinc-400",children:"cc-main-2026-jan-feb-mar"}),"). Syncing it downloads the graph and populates backlinks for every project in this workspace."]}),t?e.jsxs("div",{className:"space-y-2 text-sm",children:[e.jsxs("p",{className:"text-zinc-200",children:["Release ",e.jsx("code",{className:"text-zinc-300",children:t.release})]}),t.phaseDetail&&e.jsx("p",{className:"text-zinc-500",children:t.phaseDetail}),e.jsxs("div",{className:"grid grid-cols-2 md:grid-cols-4 gap-4 text-xs text-zinc-500 pt-2",children:[e.jsxs("div",{children:[e.jsx("p",{className:"text-zinc-600 uppercase tracking-wide",children:"Projects"}),e.jsx("p",{className:"text-zinc-300 mt-0.5",children:t.projectsProcessed??"—"})]}),e.jsxs("div",{children:[e.jsxs("p",{className:"text-zinc-600 uppercase tracking-wide flex items-center gap-1",children:["Rows",e.jsx(h,{label:"What are rows?",children:"Total number of (project, referring domain) pairs persisted in SQLite from this sync, across every project in the workspace."})]}),e.jsx("p",{className:"text-zinc-300 mt-0.5",children:t.domainsDiscovered??"—"})]}),e.jsxs("div",{children:[e.jsx("p",{className:"text-zinc-600 uppercase tracking-wide",children:"Started"}),e.jsx("p",{className:"text-zinc-300 mt-0.5",children:b(t.downloadStartedAt??t.createdAt)})]}),e.jsxs("div",{children:[e.jsx("p",{className:"text-zinc-600 uppercase tracking-wide",children:"Finished"}),e.jsx("p",{className:"text-zinc-300 mt-0.5",children:b(t.queryFinishedAt)})]})]}),t.error&&e.jsx("p",{className:"text-sm text-rose-400 pt-2",children:t.error})]}):e.jsx("p",{className:"text-sm text-zinc-500",children:"No release sync has run in this workspace yet."}),G&&e.jsx("div",{className:"mt-4 rounded border border-amber-800/60 bg-amber-950/20 p-3",children:e.jsxs("div",{className:"flex items-start gap-2",children:[e.jsx(L,{className:"h-4 w-4 text-amber-400 shrink-0 mt-0.5","aria-hidden":!0}),e.jsxs("div",{className:"text-xs text-zinc-300 leading-relaxed",children:[e.jsx("p",{className:"font-medium text-amber-200",children:"Cached files for this release are missing."}),e.jsxs("p",{className:"mt-1 text-zinc-400",children:["The sync record in the database says this release finished successfully, but the ~16 GB dump at"," ",e.jsxs("code",{className:"text-zinc-300",children:["~/.canonry/cache/commoncrawl/",t?.release,"/"]})," isn’t on disk. Your backlink data is still intact (it lives in SQLite), but per-project re-run extracts will fail until you either re-sync this release or start a new one."]})]})]})}),e.jsxs("div",{className:"mt-4 rounded border border-zinc-800 bg-zinc-900/40 p-3",children:[e.jsxs("div",{className:"flex items-start justify-between gap-3 mb-3",children:[e.jsxs("div",{children:[e.jsx("p",{className:"text-[10px] uppercase tracking-wide text-zinc-500",children:"Auto-detected release"}),u?e.jsxs("p",{className:"text-sm text-zinc-200 mt-0.5",children:[e.jsx("code",{className:"text-zinc-100",children:u.release}),e.jsxs("span",{className:"ml-2 text-xs text-zinc-500",children:["— vertex ",w(u.vertexBytes),", edges ",w(u.edgesBytes)]})]}):e.jsx("p",{className:"text-sm text-zinc-500 mt-0.5",children:D?"Probing Common Crawl…":"Could not auto-detect — pass an explicit release below."}),e.jsxs("a",{href:oe,target:"_blank",rel:"noopener noreferrer",className:"mt-1 inline-flex items-center gap-1 text-xs text-zinc-400 hover:text-zinc-200 focus:text-zinc-200 focus:outline-none focus-visible:ring-1 focus-visible:ring-zinc-500 rounded",children:["Browse all Common Crawl web-graph releases",e.jsx(de,{className:"h-3 w-3","aria-hidden":!0})]})]}),e.jsxs("div",{className:"flex items-center gap-2 shrink-0",children:[e.jsxs(z,{type:"button",size:"sm",disabled:p||!a?.duckdbInstalled||!u&&!N.trim(),onClick:A($),children:[e.jsx(J,{className:"h-4 w-4 mr-1.5","aria-hidden":!0}),p?"Queuing…":"Run sync"]}),e.jsxs(h,{label:"What does Run sync do?",children:[e.jsxs("span",{className:"block",children:["Downloads the auto-detected (or chosen) Common Crawl release (~16 GB) to"," ",e.jsx("code",{className:"text-zinc-300",children:"~/.canonry/cache/commoncrawl/"}),", then runs a single DuckDB query that extracts referring domains for every project in this workspace."]}),e.jsxs("span",{className:"mt-2 block text-zinc-400",children:["First time for a release: ",e.jsx("span",{className:"text-zinc-200",children:"~10–20 min download + ~5 min query"}),". Re-running the same release later: ",e.jsx("span",{className:"text-zinc-200",children:"skips download, just re-queries"})," (~5 min)."]})]})]})]}),F?e.jsxs("div",{className:"flex flex-wrap items-center gap-2",children:[e.jsx("input",{type:"text",className:"flex-1 min-w-[240px] rounded border border-zinc-700 bg-transparent px-2.5 py-1.5 text-sm text-zinc-200 placeholder-zinc-600 focus:border-zinc-500 focus:outline-none",placeholder:"cc-main-2026-jan-feb-mar",value:N,onChange:s=>g(s.target.value),disabled:p,autoFocus:!0}),e.jsx("button",{type:"button",className:"text-xs text-zinc-500 hover:text-zinc-300 focus:text-zinc-300 focus:outline-none focus-visible:ring-1 focus-visible:ring-zinc-500 rounded",onClick:()=>{g(""),k(!1)},disabled:p,children:"Cancel"})]}):e.jsx("button",{type:"button",className:"text-xs text-zinc-500 hover:text-zinc-300 focus:text-zinc-300 focus:outline-none focus-visible:ring-1 focus-visible:ring-zinc-500 rounded",onClick:()=>k(!0),disabled:p,children:"Use a different release →"})]}),!a?.duckdbInstalled&&e.jsx("p",{className:"text-xs text-zinc-600 mt-2",children:"Install DuckDB first to enable sync."})]})]}),e.jsxs("section",{className:"page-section-divider",children:[e.jsx("div",{className:"section-head section-head-inline",children:e.jsxs("div",{children:[e.jsx("p",{className:"eyebrow eyebrow-soft",children:"Cached releases"}),e.jsxs("h2",{className:"flex items-center gap-2",children:["Local disk cache",e.jsxs(h,{label:"What is this?",children:[e.jsxs("span",{className:"block",children:["Raw Common Crawl dumps stored at"," ",e.jsx("code",{className:"text-zinc-300",children:"~/.canonry/cache/commoncrawl/<release>/"}),". Each release takes ~16 GB."]}),e.jsxs("span",{className:"mt-2 block text-zinc-400",children:["These files are needed to re-run per-project extracts against a release without re-downloading. Pruning here ",e.jsx("span",{className:"text-zinc-200",children:"does not delete your backlink data"})," — that lives in SQLite."]})]})]})]})}),e.jsx("p",{className:"text-xs text-zinc-500 mb-3 max-w-3xl",children:"Each cached release is a ~16 GB pair of gzipped files. They’re needed to re-query the graph (e.g. for a newly-added project) without re-downloading. Safe to prune — backlink results persist in SQLite."}),e.jsx(l,{className:"surface-card overflow-hidden",children:e.jsxs("table",{className:"w-full text-sm",children:[e.jsx("thead",{children:e.jsxs("tr",{className:"border-b border-zinc-800 text-left text-xs uppercase tracking-wide text-zinc-600",children:[e.jsx("th",{className:"px-4 py-2 font-medium",children:"Release"}),e.jsx("th",{className:"px-4 py-2 font-medium",children:"Sync status"}),e.jsx("th",{className:"px-4 py-2 text-right font-medium",children:"Size"}),e.jsx("th",{className:"px-4 py-2 font-medium",children:"Last used"}),e.jsx("th",{className:"px-4 py-2 font-medium sr-only",children:"Actions"})]})}),e.jsxs("tbody",{children:[r.map(s=>e.jsxs("tr",{className:"border-b border-zinc-900 last:border-0",children:[e.jsx("td",{className:"px-4 py-2 text-zinc-200",children:e.jsx("code",{children:s.release})}),e.jsx("td",{className:"px-4 py-2",children:s.syncStatus?e.jsx(f,{tone:v(s.syncStatus),children:s.syncStatus}):e.jsx("span",{className:"text-zinc-600",children:"—"})}),e.jsx("td",{className:"px-4 py-2 text-right text-zinc-400 tabular-nums",children:w(s.bytes)}),e.jsx("td",{className:"px-4 py-2 text-zinc-400",children:b(s.lastUsedAt)}),e.jsx("td",{className:"px-4 py-2 text-right",children:e.jsxs("div",{className:"inline-flex items-center gap-1",children:[e.jsxs(z,{type:"button",variant:"outline",size:"sm",onClick:()=>{Q(s.release)},children:[e.jsx(ne,{className:"h-4 w-4 mr-1.5","aria-hidden":!0}),"Prune"]}),e.jsx(h,{label:"What does Prune do?",placement:"top",children:"Deletes the ~16 GB cache for this release from disk. Backlink results already in SQLite remain untouched. To re-run extracts against this release, you’d have to sync it again (another ~16 GB download)."})]})})]},s.release)),r.length===0&&e.jsx("tr",{children:e.jsx("td",{className:"px-4 py-4 text-sm text-zinc-500",colSpan:5,children:"No cached releases on this machine. If you ran a sync from a different machine (or deleted the cache), the backlink data is still in the database — but you’ll need to re-sync a release to run new extracts."})})]})]})})]}),o.length>1&&e.jsxs("section",{className:"page-section-divider",children:[e.jsx("div",{className:"section-head section-head-inline",children:e.jsxs("div",{children:[e.jsx("p",{className:"eyebrow eyebrow-soft",children:"History"}),e.jsx("h2",{children:"Past release syncs"})]})}),e.jsx(l,{className:"surface-card overflow-hidden",children:e.jsxs("table",{className:"w-full text-sm",children:[e.jsx("thead",{children:e.jsxs("tr",{className:"border-b border-zinc-800 text-left text-xs uppercase tracking-wide text-zinc-600",children:[e.jsx("th",{className:"px-4 py-2 font-medium",children:"Release"}),e.jsx("th",{className:"px-4 py-2 font-medium",children:"Status"}),e.jsx("th",{className:"px-4 py-2 text-right font-medium",children:"Projects"}),e.jsx("th",{className:"px-4 py-2 text-right font-medium",children:"Rows"}),e.jsx("th",{className:"px-4 py-2 font-medium",children:"Finished"})]})}),e.jsx("tbody",{children:o.map(s=>e.jsxs("tr",{className:"border-b border-zinc-900 last:border-0",children:[e.jsx("td",{className:"px-4 py-2 text-zinc-200",children:e.jsx("code",{children:s.release})}),e.jsx("td",{className:"px-4 py-2",children:e.jsx(f,{tone:v(s.status),children:s.status})}),e.jsx("td",{className:"px-4 py-2 text-right text-zinc-400 tabular-nums",children:s.projectsProcessed??"—"}),e.jsx("td",{className:"px-4 py-2 text-right text-zinc-400 tabular-nums",children:s.domainsDiscovered??"—"}),e.jsx("td",{className:"px-4 py-2 text-zinc-400",children:b(s.queryFinishedAt??s.updatedAt)})]},s.id))})]})})]})]})}export{fe as BacklinksPage};
1
+ import{r as n,j as e}from"./vendor-tanstack-Dq7p98wZ.js";import{c as E,bO as O,ae as V,bP as U,bQ as Y,bR as K,g as l,aH as L,T as f,B as z,i as A,an as J,bS as X,bT as Z,ah as ee,bU as se}from"./index-DmNti_xn.js";import{C as ae,D as te,T as ne,a as ce}from"./trash-2-BoimCsYz.js";import"./vendor-radix-B57xfQbP.js";import"./vendor-recharts-ClRVR6aX.js";import"./vendor-markdown-DK7fbRNb.js";const ie=[["circle",{cx:"12",cy:"12",r:"10",key:"1mglay"}],["line",{x1:"12",x2:"12",y1:"8",y2:"12",key:"1pkeuh"}],["line",{x1:"12",x2:"12.01",y1:"16",y2:"16",key:"4dfq90"}]],re=E("circle-alert",ie);const le=[["path",{d:"M15 3h6v6",key:"1q9fwt"}],["path",{d:"M10 14 21 3",key:"gplh6r"}],["path",{d:"M18 13v6a2 2 0 0 1-2 2H5a2 2 0 0 1-2-2V8a2 2 0 0 1 2-2h6",key:"a6xqqp"}]],de=E("external-link",le);function h({children:a,label:m="More info",placement:t="top",className:d}){const o=n.useId(),[y,r]=n.useState(!1);return e.jsxs("span",{className:`relative inline-flex ${d??""}`,children:[e.jsx("button",{type:"button","aria-label":m,"aria-describedby":y?o:void 0,className:"inline-flex h-4 w-4 items-center justify-center rounded-full text-zinc-500 hover:text-zinc-200 focus:text-zinc-200 focus:outline-none focus-visible:ring-1 focus-visible:ring-zinc-500",onMouseEnter:()=>r(!0),onMouseLeave:()=>r(!1),onFocus:()=>r(!0),onBlur:()=>r(!1),children:e.jsx(ce,{className:"h-3.5 w-3.5","aria-hidden":!0})}),y&&e.jsx("span",{id:o,role:"tooltip",className:`absolute z-50 w-64 rounded border border-zinc-700 bg-zinc-900 px-3 py-2 text-xs font-normal leading-relaxed text-zinc-200 shadow-lg ${t==="top"?"bottom-full mb-2":"top-full mt-2"} left-1/2 -translate-x-1/2 whitespace-normal`,children:a})]})}const oe="https://commoncrawl.org/web-graphs";function w(a){return a==null?"—":a>=1e12?`${(a/1e12).toFixed(1)} TB`:a>=1e9?`${(a/1e9).toFixed(1)} GB`:a>=1e6?`${(a/1e6).toFixed(1)} MB`:a>=1e3?`${(a/1e3).toFixed(1)} KB`:`${a} B`}function b(a){if(!a)return"—";const m=Date.now()-new Date(a).getTime(),t=Math.floor(m/6e4);if(t<1)return"just now";if(t<60)return`${t}m ago`;const d=Math.floor(t/60);return d<24?`${d}h ago`:`${Math.floor(d/24)}d ago`}function v(a){switch(a){case"ready":return"positive";case"failed":return"negative";case"downloading":case"querying":case"queued":return"caution"}}function fe(){const[a,m]=n.useState(null),[t,d]=n.useState(null),[o,y]=n.useState([]),[r,P]=n.useState([]),[u,M]=n.useState(null),[D,S]=n.useState(!0),[C,B]=n.useState(!1),[p,R]=n.useState(!1),[N,g]=n.useState(""),[F,k]=n.useState(!1),[q,i]=n.useState(null),[I,x]=n.useState(null),j=n.useCallback(async()=>{S(!0),i(null);try{const[s,c,_,H,W]=await Promise.all([O(),V().catch(()=>null),U().catch(()=>[]),Y().catch(()=>[]),K().catch(()=>null)]);m(s),d(c),y(_),P(H),M(W)}catch(s){i(s instanceof Error?s.message:"Failed to load backlinks status")}finally{S(!1)}},[]);n.useEffect(()=>{j()},[j]);async function T(){B(!0),i(null),x(null);try{const s=await X();x(s.alreadyPresent?`DuckDB already installed (${s.version}).`:`Installed DuckDB ${s.version}.`),await j()}catch(s){i(s instanceof Error?s.message:"Failed to install DuckDB")}finally{B(!1)}}async function $(){const s=N.trim()||void 0;R(!0),i(null),x(null);try{const c=await Z(s);x(s?`Queued sync for ${c.release}. Download + query runs in the background.`:`Queued sync for auto-discovered release ${c.release}. Download + query runs in the background.`),g(""),k(!1),await j()}catch(c){c instanceof ee&&c.code==="MISSING_DEPENDENCY"?i("DuckDB is not installed. Install it first."):i(c instanceof Error?c.message:"Failed to trigger sync")}finally{R(!1)}}async function Q(s){i(null),x(null);try{await se(s),x(`Pruned cached release ${s}.`),await j()}catch(c){i(c instanceof Error?c.message:"Failed to prune release")}}const G=t?.status==="ready"&&r.every(s=>s.release!==t.release);return e.jsxs("div",{className:"page-container",children:[e.jsx("div",{className:"page-header",children:e.jsxs("div",{className:"page-header-left",children:[e.jsx("h1",{className:"page-title",children:"Backlinks"}),e.jsx("p",{className:"page-subtitle",children:"Find domains that link to your projects, computed from the open Common Crawl web graph. Runs entirely on your machine — nothing is sent to third parties."})]})}),e.jsx(l,{className:"surface-card p-4 mb-6 border-amber-800/60",children:e.jsxs("div",{className:"flex items-start gap-3",children:[e.jsx(L,{className:"h-5 w-5 text-amber-400 shrink-0 mt-0.5","aria-hidden":!0}),e.jsxs("div",{className:"text-sm text-zinc-300 leading-relaxed",children:[e.jsx("p",{className:"font-medium text-amber-200",children:"Heads up — a release sync is a large download."}),e.jsxs("ul",{className:"mt-1.5 space-y-1 text-zinc-400",children:[e.jsxs("li",{children:[e.jsx("span",{className:"text-zinc-200",children:"~16 GB"})," of gzipped vertex + edge files per release, stored at"," ",e.jsx("code",{className:"text-zinc-300",children:"~/.canonry/cache/commoncrawl/"}),"."]}),e.jsxs("li",{children:[e.jsx("span",{className:"text-zinc-200",children:"10–20 min on a fast connection"})," for the download, then ~5 min for the DuckDB query."]}),e.jsx("li",{children:"One sync covers every project in this workspace. Releases are immutable, so the download only happens once per release."})]})]})]})}),e.jsxs("section",{className:"page-section-divider",children:[e.jsx("div",{className:"section-head section-head-inline",children:e.jsxs("div",{children:[e.jsx("p",{className:"eyebrow eyebrow-soft",children:"About"}),e.jsx("h2",{children:"How it works"})]})}),e.jsxs(l,{className:"surface-card p-5",children:[e.jsxs("p",{className:"text-sm text-zinc-400 leading-relaxed max-w-3xl mb-4",children:["Common Crawl publishes a quarterly snapshot of the public web’s hyperlink graph. Canonry downloads one"," ",e.jsx("span",{className:"text-zinc-200",children:"release"})," at a time and extracts backlinks for every project in this workspace in a single pass."]}),e.jsxs("ol",{className:"space-y-3 text-sm text-zinc-400 max-w-3xl",children:[e.jsxs("li",{className:"flex gap-3",children:[e.jsx("span",{className:"shrink-0 inline-flex h-6 w-6 items-center justify-center rounded-full border border-zinc-700 bg-zinc-900 text-xs font-semibold text-zinc-300 tabular-nums",children:"1"}),e.jsxs("span",{children:[e.jsx("span",{className:"text-zinc-200 font-medium",children:"Download (one-time, ~16 GB)"})," — vertex + edge files cached to"," ",e.jsx("code",{className:"text-zinc-300",children:"~/.canonry/cache/commoncrawl/"}),". Runs once per release; subsequent operations reuse the cache."]})]}),e.jsxs("li",{className:"flex gap-3",children:[e.jsx("span",{className:"shrink-0 inline-flex h-6 w-6 items-center justify-center rounded-full border border-zinc-700 bg-zinc-900 text-xs font-semibold text-zinc-300 tabular-nums",children:"2"}),e.jsxs("span",{children:[e.jsx("span",{className:"text-zinc-200 font-medium",children:"Query (~5 min)"})," — one DuckDB pass scans the cached files and extracts referring domains for every project’s canonical domain. DuckDB is only used to ",e.jsx("span",{className:"text-zinc-200",children:"read"})," these dumps; it doesn’t store any canonry state."]})]}),e.jsxs("li",{className:"flex gap-3",children:[e.jsx("span",{className:"shrink-0 inline-flex h-6 w-6 items-center justify-center rounded-full border border-zinc-700 bg-zinc-900 text-xs font-semibold text-zinc-300 tabular-nums",children:"3"}),e.jsxs("span",{children:[e.jsx("span",{className:"text-zinc-200 font-medium",children:"Persist"})," — results land in the same SQLite database the rest of canonry uses. After the first sync, per-project reads (and re-run extracts against the cached release) are instant."]})]})]})]})]}),q&&e.jsx(l,{className:"surface-card p-4 mb-4 border-rose-800/60",children:e.jsx("p",{className:"text-sm text-rose-300",children:q})}),I&&e.jsx(l,{className:"surface-card p-4 mb-4 border-emerald-800/60",children:e.jsx("p",{className:"text-sm text-emerald-300",children:I})}),e.jsxs("section",{className:"page-section-divider",children:[e.jsxs("div",{className:"section-head section-head-inline",children:[e.jsxs("div",{children:[e.jsx("p",{className:"eyebrow eyebrow-soft",children:"Dependency"}),e.jsxs("h2",{className:"flex items-center gap-2",children:["DuckDB install status",e.jsxs(h,{label:"Why DuckDB?",children:[e.jsx("span",{className:"block",children:"DuckDB is a query engine canonry uses to scan the ~16 GB Common Crawl dumps and pull out your referring domains."}),e.jsxs("span",{className:"mt-2 block text-zinc-400",children:["It does ",e.jsx("span",{className:"text-zinc-200",children:"not"})," store any canonry data — your backlink results live in SQLite alongside the rest of your projects. DuckDB is purely a tool for processing the raw CSV files."]}),e.jsxs("span",{className:"mt-2 block text-zinc-500",children:["Installed on demand (not bundled) into ",e.jsx("code",{className:"text-zinc-300",children:"~/.canonry/plugins/"})," so users who never run backlinks don’t pay the ~40 MB install cost."]})]})]})]}),a?.duckdbInstalled?e.jsx(f,{tone:"positive",children:"Installed"}):e.jsx(f,{tone:"caution",children:"Not installed"})]}),e.jsx(l,{className:"surface-card p-5",children:D?e.jsx("p",{className:"text-sm text-zinc-500",children:"Checking…"}):a?.duckdbInstalled?e.jsxs("div",{className:"flex items-start gap-3",children:[e.jsx(ae,{className:"h-5 w-5 text-emerald-400 shrink-0 mt-0.5","aria-hidden":!0}),e.jsxs("div",{children:[e.jsxs("p",{className:"text-sm text-zinc-200",children:["Version ",a.duckdbVersion??"unknown"," installed at"," ",e.jsx("code",{className:"text-zinc-300",children:a.pluginDir})]}),e.jsxs("p",{className:"text-xs text-zinc-500 mt-1",children:["Required spec: ",a.duckdbSpec]})]})]}):e.jsxs("div",{className:"flex items-start gap-3",children:[e.jsx(re,{className:"h-5 w-5 text-amber-400 shrink-0 mt-0.5","aria-hidden":!0}),e.jsxs("div",{className:"flex-1",children:[e.jsx("p",{className:"text-sm text-zinc-200",children:"DuckDB is not installed. It’s the query engine canonry uses to scan Common Crawl dumps — required before you can run a release sync or per-project extract."}),e.jsx("p",{className:"text-xs text-zinc-500 mt-1",children:"Installing doesn’t touch your project data. DuckDB only reads the downloaded CSV files; backlink results are written to the same SQLite database canonry already uses."}),a&&e.jsxs("p",{className:"text-xs text-zinc-500 mt-1",children:["Will be installed into ",e.jsx("code",{className:"text-zinc-300",children:a.pluginDir})," (~40 MB)."]}),e.jsx("div",{className:"mt-3",children:e.jsxs(z,{type:"button",size:"sm",disabled:C,onClick:A(T),children:[e.jsx(te,{className:"h-4 w-4 mr-1.5","aria-hidden":!0}),C?"Installing…":"Install DuckDB"]})})]})]})})]}),e.jsxs("section",{className:"page-section-divider",children:[e.jsxs("div",{className:"section-head section-head-inline",children:[e.jsxs("div",{children:[e.jsx("p",{className:"eyebrow eyebrow-soft",children:"Latest sync"}),e.jsxs("h2",{className:"flex items-center gap-2",children:["Release sync",e.jsx(h,{label:"What is a release sync?",children:"A release sync downloads one Common Crawl dump (~16 GB) and extracts backlinks for every project in this workspace in one pass. This is the heavy job — subsequent per-project re-runs skip the download and just re-query the cached files."})]})]}),t&&e.jsx(f,{tone:v(t.status),children:t.status})]}),e.jsxs(l,{className:"surface-card p-5",children:[e.jsxs("p",{className:"text-xs text-zinc-500 max-w-3xl mb-4",children:["A release is one Common Crawl dump (e.g. ",e.jsx("code",{className:"text-zinc-400",children:"cc-main-2026-jan-feb-mar"}),"). Syncing it downloads the graph and populates backlinks for every project in this workspace."]}),t?e.jsxs("div",{className:"space-y-2 text-sm",children:[e.jsxs("p",{className:"text-zinc-200",children:["Release ",e.jsx("code",{className:"text-zinc-300",children:t.release})]}),t.phaseDetail&&e.jsx("p",{className:"text-zinc-500",children:t.phaseDetail}),e.jsxs("div",{className:"grid grid-cols-2 md:grid-cols-4 gap-4 text-xs text-zinc-500 pt-2",children:[e.jsxs("div",{children:[e.jsx("p",{className:"text-zinc-600 uppercase tracking-wide",children:"Projects"}),e.jsx("p",{className:"text-zinc-300 mt-0.5",children:t.projectsProcessed??"—"})]}),e.jsxs("div",{children:[e.jsxs("p",{className:"text-zinc-600 uppercase tracking-wide flex items-center gap-1",children:["Rows",e.jsx(h,{label:"What are rows?",children:"Total number of (project, referring domain) pairs persisted in SQLite from this sync, across every project in the workspace."})]}),e.jsx("p",{className:"text-zinc-300 mt-0.5",children:t.domainsDiscovered??"—"})]}),e.jsxs("div",{children:[e.jsx("p",{className:"text-zinc-600 uppercase tracking-wide",children:"Started"}),e.jsx("p",{className:"text-zinc-300 mt-0.5",children:b(t.downloadStartedAt??t.createdAt)})]}),e.jsxs("div",{children:[e.jsx("p",{className:"text-zinc-600 uppercase tracking-wide",children:"Finished"}),e.jsx("p",{className:"text-zinc-300 mt-0.5",children:b(t.queryFinishedAt)})]})]}),t.error&&e.jsx("p",{className:"text-sm text-rose-400 pt-2",children:t.error})]}):e.jsx("p",{className:"text-sm text-zinc-500",children:"No release sync has run in this workspace yet."}),G&&e.jsx("div",{className:"mt-4 rounded border border-amber-800/60 bg-amber-950/20 p-3",children:e.jsxs("div",{className:"flex items-start gap-2",children:[e.jsx(L,{className:"h-4 w-4 text-amber-400 shrink-0 mt-0.5","aria-hidden":!0}),e.jsxs("div",{className:"text-xs text-zinc-300 leading-relaxed",children:[e.jsx("p",{className:"font-medium text-amber-200",children:"Cached files for this release are missing."}),e.jsxs("p",{className:"mt-1 text-zinc-400",children:["The sync record in the database says this release finished successfully, but the ~16 GB dump at"," ",e.jsxs("code",{className:"text-zinc-300",children:["~/.canonry/cache/commoncrawl/",t?.release,"/"]})," isn’t on disk. Your backlink data is still intact (it lives in SQLite), but per-project re-run extracts will fail until you either re-sync this release or start a new one."]})]})]})}),e.jsxs("div",{className:"mt-4 rounded border border-zinc-800 bg-zinc-900/40 p-3",children:[e.jsxs("div",{className:"flex items-start justify-between gap-3 mb-3",children:[e.jsxs("div",{children:[e.jsx("p",{className:"text-[10px] uppercase tracking-wide text-zinc-500",children:"Auto-detected release"}),u?e.jsxs("p",{className:"text-sm text-zinc-200 mt-0.5",children:[e.jsx("code",{className:"text-zinc-100",children:u.release}),e.jsxs("span",{className:"ml-2 text-xs text-zinc-500",children:["— vertex ",w(u.vertexBytes),", edges ",w(u.edgesBytes)]})]}):e.jsx("p",{className:"text-sm text-zinc-500 mt-0.5",children:D?"Probing Common Crawl…":"Could not auto-detect — pass an explicit release below."}),e.jsxs("a",{href:oe,target:"_blank",rel:"noopener noreferrer",className:"mt-1 inline-flex items-center gap-1 text-xs text-zinc-400 hover:text-zinc-200 focus:text-zinc-200 focus:outline-none focus-visible:ring-1 focus-visible:ring-zinc-500 rounded",children:["Browse all Common Crawl web-graph releases",e.jsx(de,{className:"h-3 w-3","aria-hidden":!0})]})]}),e.jsxs("div",{className:"flex items-center gap-2 shrink-0",children:[e.jsxs(z,{type:"button",size:"sm",disabled:p||!a?.duckdbInstalled||!u&&!N.trim(),onClick:A($),children:[e.jsx(J,{className:"h-4 w-4 mr-1.5","aria-hidden":!0}),p?"Queuing…":"Run sync"]}),e.jsxs(h,{label:"What does Run sync do?",children:[e.jsxs("span",{className:"block",children:["Downloads the auto-detected (or chosen) Common Crawl release (~16 GB) to"," ",e.jsx("code",{className:"text-zinc-300",children:"~/.canonry/cache/commoncrawl/"}),", then runs a single DuckDB query that extracts referring domains for every project in this workspace."]}),e.jsxs("span",{className:"mt-2 block text-zinc-400",children:["First time for a release: ",e.jsx("span",{className:"text-zinc-200",children:"~10–20 min download + ~5 min query"}),". Re-running the same release later: ",e.jsx("span",{className:"text-zinc-200",children:"skips download, just re-queries"})," (~5 min)."]})]})]})]}),F?e.jsxs("div",{className:"flex flex-wrap items-center gap-2",children:[e.jsx("input",{type:"text",className:"flex-1 min-w-[240px] rounded border border-zinc-700 bg-transparent px-2.5 py-1.5 text-sm text-zinc-200 placeholder-zinc-600 focus:border-zinc-500 focus:outline-none",placeholder:"cc-main-2026-jan-feb-mar",value:N,onChange:s=>g(s.target.value),disabled:p,autoFocus:!0}),e.jsx("button",{type:"button",className:"text-xs text-zinc-500 hover:text-zinc-300 focus:text-zinc-300 focus:outline-none focus-visible:ring-1 focus-visible:ring-zinc-500 rounded",onClick:()=>{g(""),k(!1)},disabled:p,children:"Cancel"})]}):e.jsx("button",{type:"button",className:"text-xs text-zinc-500 hover:text-zinc-300 focus:text-zinc-300 focus:outline-none focus-visible:ring-1 focus-visible:ring-zinc-500 rounded",onClick:()=>k(!0),disabled:p,children:"Use a different release →"})]}),!a?.duckdbInstalled&&e.jsx("p",{className:"text-xs text-zinc-600 mt-2",children:"Install DuckDB first to enable sync."})]})]}),e.jsxs("section",{className:"page-section-divider",children:[e.jsx("div",{className:"section-head section-head-inline",children:e.jsxs("div",{children:[e.jsx("p",{className:"eyebrow eyebrow-soft",children:"Cached releases"}),e.jsxs("h2",{className:"flex items-center gap-2",children:["Local disk cache",e.jsxs(h,{label:"What is this?",children:[e.jsxs("span",{className:"block",children:["Raw Common Crawl dumps stored at"," ",e.jsx("code",{className:"text-zinc-300",children:"~/.canonry/cache/commoncrawl/<release>/"}),". Each release takes ~16 GB."]}),e.jsxs("span",{className:"mt-2 block text-zinc-400",children:["These files are needed to re-run per-project extracts against a release without re-downloading. Pruning here ",e.jsx("span",{className:"text-zinc-200",children:"does not delete your backlink data"})," — that lives in SQLite."]})]})]})]})}),e.jsx("p",{className:"text-xs text-zinc-500 mb-3 max-w-3xl",children:"Each cached release is a ~16 GB pair of gzipped files. They’re needed to re-query the graph (e.g. for a newly-added project) without re-downloading. Safe to prune — backlink results persist in SQLite."}),e.jsx(l,{className:"surface-card overflow-hidden",children:e.jsxs("table",{className:"w-full text-sm",children:[e.jsx("thead",{children:e.jsxs("tr",{className:"border-b border-zinc-800 text-left text-xs uppercase tracking-wide text-zinc-600",children:[e.jsx("th",{className:"px-4 py-2 font-medium",children:"Release"}),e.jsx("th",{className:"px-4 py-2 font-medium",children:"Sync status"}),e.jsx("th",{className:"px-4 py-2 text-right font-medium",children:"Size"}),e.jsx("th",{className:"px-4 py-2 font-medium",children:"Last used"}),e.jsx("th",{className:"px-4 py-2 font-medium sr-only",children:"Actions"})]})}),e.jsxs("tbody",{children:[r.map(s=>e.jsxs("tr",{className:"border-b border-zinc-900 last:border-0",children:[e.jsx("td",{className:"px-4 py-2 text-zinc-200",children:e.jsx("code",{children:s.release})}),e.jsx("td",{className:"px-4 py-2",children:s.syncStatus?e.jsx(f,{tone:v(s.syncStatus),children:s.syncStatus}):e.jsx("span",{className:"text-zinc-600",children:"—"})}),e.jsx("td",{className:"px-4 py-2 text-right text-zinc-400 tabular-nums",children:w(s.bytes)}),e.jsx("td",{className:"px-4 py-2 text-zinc-400",children:b(s.lastUsedAt)}),e.jsx("td",{className:"px-4 py-2 text-right",children:e.jsxs("div",{className:"inline-flex items-center gap-1",children:[e.jsxs(z,{type:"button",variant:"outline",size:"sm",onClick:()=>{Q(s.release)},children:[e.jsx(ne,{className:"h-4 w-4 mr-1.5","aria-hidden":!0}),"Prune"]}),e.jsx(h,{label:"What does Prune do?",placement:"top",children:"Deletes the ~16 GB cache for this release from disk. Backlink results already in SQLite remain untouched. To re-run extracts against this release, you’d have to sync it again (another ~16 GB download)."})]})})]},s.release)),r.length===0&&e.jsx("tr",{children:e.jsx("td",{className:"px-4 py-4 text-sm text-zinc-500",colSpan:5,children:"No cached releases on this machine. If you ran a sync from a different machine (or deleted the cache), the backlink data is still in the database — but you’ll need to re-sync a release to run new extracts."})})]})]})})]}),o.length>1&&e.jsxs("section",{className:"page-section-divider",children:[e.jsx("div",{className:"section-head section-head-inline",children:e.jsxs("div",{children:[e.jsx("p",{className:"eyebrow eyebrow-soft",children:"History"}),e.jsx("h2",{children:"Past release syncs"})]})}),e.jsx(l,{className:"surface-card overflow-hidden",children:e.jsxs("table",{className:"w-full text-sm",children:[e.jsx("thead",{children:e.jsxs("tr",{className:"border-b border-zinc-800 text-left text-xs uppercase tracking-wide text-zinc-600",children:[e.jsx("th",{className:"px-4 py-2 font-medium",children:"Release"}),e.jsx("th",{className:"px-4 py-2 font-medium",children:"Status"}),e.jsx("th",{className:"px-4 py-2 text-right font-medium",children:"Projects"}),e.jsx("th",{className:"px-4 py-2 text-right font-medium",children:"Rows"}),e.jsx("th",{className:"px-4 py-2 font-medium",children:"Finished"})]})}),e.jsx("tbody",{children:o.map(s=>e.jsxs("tr",{className:"border-b border-zinc-900 last:border-0",children:[e.jsx("td",{className:"px-4 py-2 text-zinc-200",children:e.jsx("code",{children:s.release})}),e.jsx("td",{className:"px-4 py-2",children:e.jsx(f,{tone:v(s.status),children:s.status})}),e.jsx("td",{className:"px-4 py-2 text-right text-zinc-400 tabular-nums",children:s.projectsProcessed??"—"}),e.jsx("td",{className:"px-4 py-2 text-right text-zinc-400 tabular-nums",children:s.domainsDiscovered??"—"}),e.jsx("td",{className:"px-4 py-2 text-zinc-400",children:b(s.queryFinishedAt??s.updatedAt)})]},s.id))})]})})]})]})}export{fe as BacklinksPage};
@@ -1 +1 @@
1
- import{c as n}from"./index-CGlPx_cu.js";const r=[["path",{d:"m15 18-6-6 6-6",key:"1wnfg3"}]],s=n("chevron-left",r),f={contentStyle:{backgroundColor:"#18181b",border:"1px solid #3f3f46",borderRadius:8,fontSize:12},labelStyle:{color:"#e4e4e7"},itemStyle:{color:"#a1a1aa"}},d={fill:"#71717a",fontSize:11},l="#27272a",S="#27272a",a=["#34d399","#60a5fa","#f472b6","#facc15","#a78bfa","#fb923c","#22d3ee","#f87171"],c={gemini:"#60a5fa",openai:"#34d399",claude:"#fb923c",perplexity:"#22d3ee",local:"#a78bfa"};function R(t,e=0){return c[t]??a[e%a.length]}const T={text:"#a1a1aa",textDim:"#71717a",textFaint:"#52525b",surface:"#27272a"},u={positive:"#34d399",positiveDeep:"#10b981"};function o(t){const e=String(t);return e.includes("T")?new Date(e):new Date(e+"T00:00:00")}function C(t){return o(String(t)).toLocaleDateString(void 0,{month:"short",day:"numeric",year:"numeric"})}function _(t){const e=o(t);return`${e.getMonth()+1}/${e.getDate()}`}export{S as C,d as a,C as b,f as c,T as d,a as e,_ as f,u as g,s as h,l as i,R as p};
1
+ import{c as n}from"./index-DmNti_xn.js";const r=[["path",{d:"m15 18-6-6 6-6",key:"1wnfg3"}]],s=n("chevron-left",r),f={contentStyle:{backgroundColor:"#18181b",border:"1px solid #3f3f46",borderRadius:8,fontSize:12},labelStyle:{color:"#e4e4e7"},itemStyle:{color:"#a1a1aa"}},d={fill:"#71717a",fontSize:11},l="#27272a",S="#27272a",a=["#34d399","#60a5fa","#f472b6","#facc15","#a78bfa","#fb923c","#22d3ee","#f87171"],c={gemini:"#60a5fa",openai:"#34d399",claude:"#fb923c",perplexity:"#22d3ee",local:"#a78bfa"};function R(t,e=0){return c[t]??a[e%a.length]}const T={text:"#a1a1aa",textDim:"#71717a",textFaint:"#52525b",surface:"#27272a"},u={positive:"#34d399",positiveDeep:"#10b981"};function o(t){const e=String(t);return e.includes("T")?new Date(e):new Date(e+"T00:00:00")}function C(t){return o(String(t)).toLocaleDateString(void 0,{month:"short",day:"numeric",year:"numeric"})}function _(t){const e=o(t);return`${e.getMonth()+1}/${e.getDate()}`}export{S as C,d as a,C as b,f as c,T as d,a as e,_ as f,u as g,s as h,l as i,R as p};