webctx 0.1.0

package/AGENTS.md ADDED
@@ -0,0 +1,84 @@
1
+ # AGENTS.md
2
+
3
+ Guidance for coding agents working in `webctx`.
4
+
5
+ ## Purpose
6
+
7
+ This repo contains the pure Go port of the `webctx` CLI.
8
+
9
+ It is no longer a generic starter template. Treat the current CLI behavior as the source of truth unless the user explicitly asks to change it.
10
+
11
+ ## Architecture
12
+
13
+ - `cmd/webctx/main.go`: process entrypoint, exit-code based dispatch.
14
+ - `internal/app/app.go`: CLI parsing and top-level command routing.
15
+ - `internal/app/tools.go`: search provider clients, ranking, formatting, and HTTP helpers.
16
+ - `internal/app/scrape.go`: GitHub raw-content optimization, `.md` fetch path, Firecrawl queue, and env loading.
17
+ - `internal/app/app_test.go`: unit tests for CLI behavior and core helpers.
18
+ - `bin/webctx.js`: npm shim that invokes the packaged native binary.
19
+ - `scripts/postinstall.js`: downloads release binary on install, falls back to `go build`.
20
+ - `.github/workflows/release.yml`: tag-driven release pipeline.
21
+ - `docs/porting-status.md`: progress log and remaining work for future agents.
22
+
23
+ ## Local commands
24
+
25
+ Use `make` targets:
26
+
27
+ - `make fmt`
28
+ - `make test`
29
+ - `make vet`
30
+ - `make lint`
31
+ - `make check`
32
+ - `make build`
33
+ - `make build-all`
34
+ - `make install-local`
35
+
36
+ Direct commands:
37
+
38
+ - `go test ./...`
39
+ - `go vet ./...`
40
+ - `npm run lint`
41
+
42
+ ## Current CLI contract
43
+
44
+ Preserve these commands unless the user explicitly asks to change them:
45
+
46
+ - `webctx search <query> [--exclude domain1,domain2] [--keyword phrase]`
47
+ - `webctx read-link <url>`
48
+ - `webctx map-site <url>`
49
+ - `webctx --version`
50
+
51
+ Behavioral expectations:
52
+
53
+ - `search` combines Brave, Tavily, and Exa results, then re-ranks them with duplicate-aware scoring.
54
+ - `read-link` keeps the current GitHub raw-content fast path, `.md` fast path, and Firecrawl fallback settings.
55
+ - `map-site` keeps the current Firecrawl map request settings.
56
+ - The CLI should remain agent-friendly and emit plain markdown/text output.
57
+
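Note (not part of the diff): the "combines Brave, Tavily, and Exa results" step can be pictured as a concurrent fan-out in which each provider gets its own timeout and a failing provider simply contributes nothing. The following is a minimal sketch under stated assumptions: `Result`, `searchFunc`, `fanOut`, and the two stub providers are hypothetical names, and the real clients in `internal/app/tools.go` may be organized differently.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// Result is a hypothetical, simplified search hit.
type Result struct {
	Provider string
	URL      string
}

// searchFunc is a hypothetical provider client signature.
type searchFunc func(ctx context.Context, query string) ([]Result, error)

// fanOut queries all providers concurrently. Each call gets its own timeout,
// and a provider that errors or times out is skipped rather than failing the search.
func fanOut(ctx context.Context, query string, timeout time.Duration, providers []searchFunc) []Result {
	var (
		mu  sync.Mutex
		wg  sync.WaitGroup
		all []Result
	)
	for _, search := range providers {
		wg.Add(1)
		go func(search searchFunc) {
			defer wg.Done()
			pctx, cancel := context.WithTimeout(ctx, timeout)
			defer cancel()
			res, err := search(pctx, query)
			if err != nil {
				return // drop this provider's contribution
			}
			mu.Lock()
			defer mu.Unlock()
			all = append(all, res...)
		}(search)
	}
	wg.Wait()
	return all
}

// fastProvider is a stub that answers immediately.
func fastProvider(ctx context.Context, query string) ([]Result, error) {
	return []Result{{Provider: "fast", URL: "https://example.com/?q=" + query}}, nil
}

// slowProvider is a stub that never answers before its deadline.
func slowProvider(ctx context.Context, query string) ([]Result, error) {
	<-ctx.Done()
	return nil, ctx.Err()
}

// runDemo runs both stubs with a short per-provider timeout.
func runDemo() []Result {
	return fanOut(context.Background(), "golang", 20*time.Millisecond, []searchFunc{fastProvider, slowProvider})
}

func main() {
	for _, r := range runDemo() {
		fmt.Println(r.Provider, r.URL)
	}
}
```

In this sketch the slow stub is cut off by its own deadline, so only the fast provider's result survives; re-ranking would then run over whatever was collected.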
58
+ ## How to change things safely
59
+
60
+ 1. Keep the binary naming convention unchanged unless you also update postinstall/workflow:
61
+    - release assets: `<cli>_<goos>_<goarch>[.exe]`
62
+    - npm-installed binary path: `bin/<cli>-bin` (or `bin/<cli>.exe` on Windows)
63
+
64
+ 2. If changing search behavior, compare against the TypeScript porting notes in `docs/porting-status.md` first.
65
+
66
+ 3. If adding dependencies, commit `go.sum` and make sure the workflow still passes on a clean checkout.
67
+
68
+ 4. If you change release artifacts or version plumbing, update `Makefile`, `.github/workflows/release.yml`, and `scripts/postinstall.js` together.
69
+
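Note (not part of the diff): the naming contract in item 1 is easy to get subtly wrong, so here is a small Go sketch of both names. `assetName` and `installedName` are hypothetical helpers written for illustration; the package itself derives these names in the `Makefile` and `bin/webctx.js`.

```go
package main

import (
	"fmt"
	"runtime"
)

// assetName builds a release asset name of the form <cli>_<goos>_<goarch>[.exe].
func assetName(cli, goos, goarch string) string {
	name := fmt.Sprintf("%s_%s_%s", cli, goos, goarch)
	if goos == "windows" {
		name += ".exe"
	}
	return name
}

// installedName mirrors the npm shim's lookup: <cli>.exe on Windows, <cli>-bin elsewhere.
func installedName(cli, goos string) string {
	if goos == "windows" {
		return cli + ".exe"
	}
	return cli + "-bin"
}

func main() {
	fmt.Println(assetName("webctx", runtime.GOOS, runtime.GOARCH))
	fmt.Println(installedName("webctx", runtime.GOOS))
}
```

Any change to either pattern has to be mirrored in the release workflow and in `scripts/postinstall.js`, which is why the list above asks for those files to be updated together.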
70
+ ## Release contract
71
+
72
+ Release pipeline triggers on `v*` tags and expects:
73
+
74
+ - `NPM_TOKEN` GitHub secret present.
75
+ - npm package name in `package.json` is publishable under your account/org.
76
+ - repository URL matches the release origin used by `scripts/postinstall.js`.
77
+
78
+ Release binaries should embed the tagged version into `internal/buildinfo.Version` so `webctx --version` matches the release tag.
79
+
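Note (not part of the diff): version embedding of this kind relies on the Go linker's `-X` flag, which can only overwrite a package-level string variable. The sketch below uses `main.Version` as a stand-in for `internal/buildinfo.Version`; the `versionString` helper and the `"dev"` fallback inside it are assumptions made for illustration.

```go
package main

import "fmt"

// Version stands in for internal/buildinfo.Version. It must be a package-level
// string variable so that the linker's -X flag can replace its value.
var Version = "dev"

// versionString returns the bare version, falling back to "dev" if the
// variable was overridden with an empty string.
func versionString() string {
	if Version == "" {
		return "dev"
	}
	return Version
}

func main() {
	fmt.Println(versionString())
}
```

Building this sketch with `go build -ldflags "-X main.Version=0.1.0"` would make it print `0.1.0`; the package's `Makefile` passes the same flag with the full `github.com/amxv/webctx/internal/buildinfo.Version` path.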
80
+ ## Guardrails
81
+
82
+ - Prefer additive changes and keep the CLI output stable.
83
+ - Do not silently change Firecrawl request settings unless the user explicitly wants behavioral changes.
84
+ - Do not reintroduce MCP/server code unless requested; this repo is intentionally CLI-only.
package/CONTRIBUTORS.md ADDED
@@ -0,0 +1,93 @@
1
+ # CONTRIBUTORS.md
2
+
3
+ Maintainer notes for `webctx`.
4
+
5
+ ## Prerequisites
6
+
7
+ - Go `1.26+`
8
+ - Node `18+`
9
+ - npm account with publish rights for the package name in `package.json`
10
+ - GitHub repo admin access
11
+
12
+ ## Local development
13
+
14
+ ```bash
15
+ make check
16
+ make build
17
+ ./dist/webctx --help
18
+ ```
19
+
20
+ Example command checks:
21
+
22
+ ```bash
23
+ ./dist/webctx --version
24
+ ./dist/webctx search "golang http client"
25
+ ./dist/webctx read-link https://github.com/amxv/webctx-ts/blob/main/cli.ts
26
+ ./dist/webctx map-site https://example.com
27
+ ```
28
+
29
+ Install the command locally (`make install-local` copies the binary to `~/.local/bin`, which must be on your `PATH`):
30
+
31
+ ```bash
32
+ make install-local
33
+ webctx --help
34
+ ```
35
+
36
+ ## Release process
37
+
38
+ 1. Ensure `main` is green:
39
+
40
+ ```bash
41
+ make check
42
+ ```
43
+
44
+ 2. Confirm the release workflow is targeting `webctx` and that `package.json` still points to the correct GitHub repository.
45
+
46
+ 3. Create and push the release tag (the target runs `git tag` and `git push origin` for `v<VERSION>`):
47
+
48
+ ```bash
49
+ make release-tag VERSION=0.1.0
50
+ ```
51
+
52
+ 4. GitHub Actions `release` workflow runs automatically:
53
+    - quality checks
54
+    - cross-platform binary build
55
+    - GitHub release publish
56
+    - npm publish
57
+
58
+ ## Required GitHub secret
59
+
60
+ - `NPM_TOKEN`: npm automation token with publish rights for your package.
61
+
62
+ Set via GitHub CLI:
63
+
64
+ ```bash
65
+ gh secret set NPM_TOKEN --repo amxv/webctx
66
+ ```
67
+
68
+ ## npm token setup
69
+
70
+ Create token at npm:
71
+
72
+ - Profile -> Access Tokens -> Create New Token
73
+ - Use an automation/granular token scoped to required package/org
74
+
75
+ Validate auth locally:
76
+
77
+ ```bash
78
+ npm whoami
79
+ ```
80
+
81
+ ## Notes on package naming
82
+
83
+ `webctx` is already configured. If you ever rename or move the package, update all of the following together:
84
+
85
+ - `package.json`
86
+ - `bin/webctx.js`
87
+ - `scripts/postinstall.js`
88
+ - `.github/workflows/release.yml`
89
+ - `Makefile`
90
+
91
+ ## Porting reference
92
+
93
+ The repo includes `docs/porting-status.md` as the running reference for what was ported from `webctx-ts`, what was intentionally excluded, and what future agents should verify before making behavior changes.
package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 amxv
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/Makefile ADDED
@@ -0,0 +1,69 @@
1
+ SHELL := /bin/bash
2
+
3
+ GO ?= go
4
+ GOFMT ?= gofmt
5
+ BIN_NAME ?= webctx
6
+ CMD_PATH ?= ./cmd/$(BIN_NAME)
7
+ DIST_DIR ?= dist
8
+ BIN_PATH ?= $(DIST_DIR)/$(BIN_NAME)
9
+ VERSION ?= $(shell node -p "require('./package.json').version" 2>/dev/null)
10
+ LDFLAGS ?= -s -w -X github.com/amxv/webctx/internal/buildinfo.Version=$(if $(VERSION),$(VERSION),dev)
11
+
12
+ .PHONY: help fmt test vet lint check build build-all install-local clean release-tag
13
+
14
+ help:
15
+ 	@echo "webctx command runner"
16
+ 	@echo ""
17
+ 	@echo "Targets:"
18
+ 	@echo " make fmt - format Go files"
19
+ 	@echo " make test - run go test ./..."
20
+ 	@echo " make vet - run go vet ./..."
21
+ 	@echo " make lint - run Node script checks"
22
+ 	@echo " make check - fmt + test + vet + lint"
23
+ 	@echo " make build - build local binary to dist/webctx"
24
+ 	@echo " make build-all - build release binaries for 5 target platforms"
25
+ 	@echo " make install-local - install CLI to ~/.local/bin/webctx"
26
+ 	@echo " make clean - remove dist artifacts"
27
+ 	@echo " make release-tag - create and push git tag (requires VERSION=x.y.z)"
28
+
29
+ fmt:
30
+ 	@$(GOFMT) -w $$(find . -type f -name '*.go' -not -path './dist/*')
31
+
32
+ test:
33
+ 	@$(GO) test ./...
34
+
35
+ vet:
36
+ 	@$(GO) vet ./...
37
+
38
+ lint:
39
+ 	@npm run lint
40
+
41
+ check: fmt test vet lint
42
+
43
+ build:
44
+ 	@mkdir -p $(DIST_DIR)
45
+ 	@$(GO) build -trimpath -ldflags="$(LDFLAGS)" -o $(BIN_PATH) $(CMD_PATH)
46
+
47
+ build-all:
48
+ 	@mkdir -p $(DIST_DIR)
49
+ 	@for target in "darwin amd64" "darwin arm64" "linux amd64" "linux arm64" "windows amd64"; do \
50
+ 		set -- $$target; \
51
+ 		GOOS=$$1; GOARCH=$$2; \
52
+ 		EXT=""; \
53
+ 		if [ "$$GOOS" = "windows" ]; then EXT=".exe"; fi; \
54
+ 		echo "Building $(BIN_NAME) for $$GOOS/$$GOARCH"; \
55
+ 		CGO_ENABLED=0 GOOS=$$GOOS GOARCH=$$GOARCH $(GO) build -trimpath -ldflags="$(LDFLAGS)" -o "$(DIST_DIR)/$(BIN_NAME)_$${GOOS}_$${GOARCH}$${EXT}" $(CMD_PATH); \
56
+ 	done
57
+
58
+ install-local: build
59
+ 	@mkdir -p $$HOME/.local/bin
60
+ 	@install -m 755 $(BIN_PATH) $$HOME/.local/bin/$(BIN_NAME)
61
+ 	@echo "Installed $(BIN_NAME) to $$HOME/.local/bin/$(BIN_NAME)"
62
+
63
+ clean:
64
+ 	@rm -rf $(DIST_DIR)
65
+
66
+ release-tag:
67
+ 	@test -n "$(VERSION)" || (echo "Usage: make release-tag VERSION=x.y.z" && exit 1)
68
+ 	@git tag "v$(VERSION)"
69
+ 	@git push origin "v$(VERSION)"
package/README.md ADDED
@@ -0,0 +1,95 @@
1
+ # webctx
2
+
3
+ `webctx` is a pure Go CLI for agent-friendly web search and page extraction.
4
+
5
+ ## What it does
6
+
7
+ - `search`: combines Brave, Tavily, and Exa search results, deduplicates them, and re-ranks them
8
+ - `read-link`: returns clean markdown for a single URL using a GitHub raw-content path, a `.md` fast path, and Firecrawl scraping fallback
9
+ - `map-site`: returns a sitemap-style list of URLs and metadata from Firecrawl
10
+
11
+ ## Install
12
+
13
+ Global npm install:
14
+
15
+ ```bash
16
+ npm i -g webctx
17
+ webctx --help
18
+ ```
19
+
20
+ Build from source:
21
+
22
+ ```bash
23
+ git clone https://github.com/amxv/webctx.git
24
+ cd webctx
25
+ make build
26
+ ./dist/webctx --help
27
+ ```
28
+
29
+ ## Commands
30
+
31
+ ```bash
32
+ webctx --help
33
+ webctx --version
34
+ webctx search <query> [--exclude domain1,domain2] [--keyword phrase]
35
+ webctx read-link <url>
36
+ webctx map-site <url>
37
+ ```
38
+
39
+ Examples:
40
+
41
+ ```bash
42
+ webctx search "next.js server components"
43
+ webctx search "react hooks" --exclude youtube.com,vimeo.com
44
+ webctx search "drizzle orm" --keyword "migration guide"
45
+ webctx read-link https://docs.example.com/guide
46
+ webctx map-site https://example.com
47
+ ```
48
+
49
+ ## Environment variables
50
+
51
+ The CLI loads `.env.local` when present and reads provider credentials from the environment.
52
+
53
+ Quick start:
54
+
55
+ ```bash
56
+ cp .env.local.example .env.local
57
+ ```
58
+
59
+ Required by command:
60
+
61
+ - `search`
62
+   - `BRAVE_API_KEY`
63
+   - `TAVILY_API_KEY`
64
+   - `EXA_API_KEY`
65
+ - `read-link`
66
+   - `FIRECRAWL_API_KEY` for non-GitHub / non-`.md` URLs
67
+ - `map-site`
68
+   - `FIRECRAWL_API_KEY`
69
+
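Note (not part of the diff): loading `.env.local` and checking per-command credentials could look roughly like the Go sketch below. It is illustrative only: `parseEnv` and `missingKeys` are hypothetical helpers, and the parsing rules shown (`#` comments, optional quotes around values) are assumptions rather than the documented behavior of the loader in `internal/app/scrape.go`.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseEnv reads KEY=VALUE lines, skipping blank lines and # comments.
// Surrounding single or double quotes on the value are stripped.
func parseEnv(content string) map[string]string {
	vars := make(map[string]string)
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, ok := strings.Cut(line, "=")
		if !ok {
			continue // not a KEY=VALUE line
		}
		key = strings.TrimSpace(key)
		value = strings.Trim(strings.TrimSpace(value), `"'`)
		if key != "" {
			vars[key] = value
		}
	}
	return vars
}

// missingKeys reports which required keys are absent or empty.
func missingKeys(vars map[string]string, required ...string) []string {
	var missing []string
	for _, k := range required {
		if vars[k] == "" {
			missing = append(missing, k)
		}
	}
	return missing
}

func main() {
	vars := parseEnv("# example\nBRAVE_API_KEY=abc\nTAVILY_API_KEY=\"def\"\n")
	fmt.Println(missingKeys(vars, "BRAVE_API_KEY", "TAVILY_API_KEY", "EXA_API_KEY"))
}
```

Checking the required keys up front lets a command fail with a clear message naming the missing variable instead of surfacing a provider's authentication error.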
70
+ ## Release and distribution
71
+
72
+ This repo publishes in two ways:
73
+
74
+ - GitHub Releases for native binaries
75
+ - npm for `npm i -g webctx`
76
+
77
+ The release workflow triggers on `v*` tags and does the following:
78
+
79
+ 1. runs Go and Node quality checks
80
+ 2. builds cross-platform binaries
81
+ 3. creates a GitHub Release with those assets
82
+ 4. publishes the npm package using the tag version
83
+
84
+ ## Project layout
85
+
86
+ - `cmd/webctx/main.go`: CLI entrypoint
87
+ - `internal/app/`: CLI parsing, search, ranking, scrape, and Firecrawl queue logic
88
+ - `internal/buildinfo/`: build-time version plumbing for `--version`
89
+ - `bin/webctx.js`: npm shim that invokes the packaged native binary
90
+ - `scripts/postinstall.js`: downloads the release binary on install and falls back to local `go build`
91
+ - `.github/workflows/release.yml`: tag-driven release pipeline
92
+ - `AGENTS.md`: guidance for coding agents
93
+ - `CONTRIBUTORS.md`: maintainer/release notes
94
+
95
+ See `AGENTS.md` and `CONTRIBUTORS.md` for repo-specific implementation and maintenance details.
package/bin/webctx.js ADDED
@@ -0,0 +1,28 @@
1
+ #!/usr/bin/env node
2
+
3
+ const fs = require("node:fs");
4
+ const path = require("node:path");
5
+ const { spawnSync } = require("node:child_process");
6
+
7
+ const pkg = require("../package.json");
8
+ const cliName = pkg.config?.cliBinaryName || "webctx";
9
+ const executableName = process.platform === "win32" ? `${cliName}.exe` : `${cliName}-bin`;
10
+ const executablePath = path.join(__dirname, executableName);
11
+
12
+ if (!fs.existsSync(executablePath)) {
13
+   console.error(`${cliName} binary is not installed. Re-run: npm rebuild -g ${pkg.name}`);
14
+   process.exit(1);
15
+ }
16
+
17
+ const child = spawnSync(executablePath, process.argv.slice(2), { stdio: "inherit" });
18
+
19
+ if (child.error) {
20
+   console.error(child.error.message);
21
+   process.exit(1);
22
+ }
23
+
24
+ if (child.signal) {
25
+   process.kill(process.pid, child.signal);
26
+ }
27
+
28
+ process.exit(child.status ?? 1);
package/cmd/webctx/main.go ADDED
@@ -0,0 +1,11 @@
1
+ package main
2
+
3
+ import (
4
+ 	"os"
5
+
6
+ 	"github.com/amxv/webctx/internal/app"
7
+ )
8
+
9
+ func main() {
10
+ 	os.Exit(app.Run(os.Args[1:], os.Stdout, os.Stderr))
11
+ }
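Note (not part of the diff): the entrypoint above delegates everything to `app.Run` and turns its return value into the process exit code. The body of `app.Run` is not shown in this diff, so the following is only a sketch of that pattern: `run`, `runCaptured`, the `io.Writer` parameter types, the specific exit codes, and the messages are all assumptions.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// run is a hypothetical stand-in for app.Run: it writes to the given streams
// and returns an exit code instead of calling os.Exit itself.
func run(args []string, stdout, stderr io.Writer) int {
	if len(args) == 0 {
		fmt.Fprintln(stderr, "usage: webctx <command> [args]")
		return 2
	}
	switch args[0] {
	case "--version":
		fmt.Fprintln(stdout, "dev")
		return 0
	case "search", "read-link", "map-site":
		if len(args) < 2 {
			fmt.Fprintf(stderr, "%s: missing argument\n", args[0])
			return 2
		}
		fmt.Fprintf(stdout, "would run %s on %q\n", args[0], args[1])
		return 0
	default:
		fmt.Fprintf(stderr, "unknown command: %s\n", args[0])
		return 1
	}
}

// runCaptured calls run with in-memory buffers, which is how a unit test
// can observe output and exit code without spawning a process.
func runCaptured(args ...string) (code int, stdout, stderr string) {
	var out, errOut bytes.Buffer
	code = run(args, &out, &errOut)
	return code, out.String(), errOut.String()
}

func main() {
	code, _, errOut := runCaptured("read-link")
	fmt.Print(errOut)
	fmt.Println("exit code:", code)
}
```

Keeping `os.Exit` in `main` only means deferred cleanup inside the dispatcher still runs, and tests can assert on exit codes directly.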
package/docs/porting-status.md ADDED
@@ -0,0 +1,173 @@
1
+ # webctx TypeScript -> Go porting status
2
+
3
+ This document is the handoff reference for future agents working on `amxv/webctx`.
4
+
5
+ ## Goal
6
+
7
+ Port the CLI behavior from `amxv/webctx-ts` into pure Go while keeping the command-line interface and provider behavior effectively one-to-one for the CLI use case.
8
+
9
+ Scope intentionally excludes the MCP/server/dashboard pieces from the TypeScript repo.
10
+
11
+ ## Source areas reviewed in `webctx-ts`
12
+
13
+ - `cli.ts`
14
+ - `tools/search.ts`
15
+ - `tools/read-link.ts`
16
+ - `tools/map-site.ts`
17
+ - `lib/search/brave.ts`
18
+ - `lib/search/tavily.ts`
19
+ - `lib/search/exa.ts`
20
+ - `lib/ranking.ts`
21
+ - `lib/utils.ts`
22
+ - `lib/scraping.ts`
23
+ - `lib/firecrawl-queue.ts`
24
+ - `lib/rate-limiter.ts`
25
+
26
+ ## Completed
27
+
28
+ ### CLI surface
29
+
30
+ Implemented in Go:
31
+
32
+ - `webctx search <query> [--exclude domain1,domain2] [--keyword phrase]`
33
+ - `webctx read-link <url>`
34
+ - `webctx map-site <url>`
35
+ - `webctx --help`
36
+ - `webctx --version`
37
+
38
+ Notes:
39
+
40
+ - `--version` now prints the bare version string, matching the TypeScript CLI.
41
+ - Error handling is exit-code based in Go rather than promise rejection based.
42
+
43
+ ### Search port
44
+
45
+ Implemented:
46
+
47
+ - Brave HTTP client
48
+ - Tavily HTTP client using direct HTTP API instead of the TypeScript SDK
49
+ - Exa HTTP client
50
+ - provider fan-out with per-provider timeout
51
+ - duplicate-aware reranking
52
+ - excluded-domain filtering
53
+ - HTML entity decoding
54
+ - keyword truncation to 5 words for Exa include-text mode
55
+ - top 35 result output limit
56
+
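Note (not part of the diff): the scoring formula behind "duplicate-aware reranking" is not shown in this document. One plausible reading is that a URL returned by several providers accumulates score from each of them. The sketch below illustrates that idea with reciprocal-rank scoring; the formula, the URL normalization, and the names `Hit`, `Ranked`, `normalizeURL`, and `rerank` are assumptions, not the ported algorithm. Only the limit of 35 comes from the list above.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Hit is a hypothetical, simplified provider result.
type Hit struct {
	URL   string
	Title string
}

// Ranked is a merged result with its accumulated score.
type Ranked struct {
	Hit
	Score     float64
	Providers int // number of provider lists that returned this URL
}

// normalizeURL collapses trivial variants so duplicates share one key.
func normalizeURL(u string) string {
	u = strings.TrimSpace(u)
	u = strings.TrimPrefix(u, "https://")
	u = strings.TrimPrefix(u, "http://")
	u = strings.TrimPrefix(u, "www.")
	return strings.TrimSuffix(u, "/")
}

// rerank merges provider lists. Each list adds 1/(rank+1) to a URL's score,
// so results found by several providers rise above single-provider results.
func rerank(lists [][]Hit, limit int) []Ranked {
	byKey := make(map[string]*Ranked)
	var order []string
	for _, list := range lists {
		seen := make(map[string]bool) // count each URL once per provider
		for rank, h := range list {
			key := normalizeURL(h.URL)
			if seen[key] {
				continue
			}
			seen[key] = true
			r, ok := byKey[key]
			if !ok {
				r = &Ranked{Hit: h}
				byKey[key] = r
				order = append(order, key)
			}
			r.Score += 1.0 / float64(rank+1)
			r.Providers++
		}
	}
	out := make([]Ranked, 0, len(order))
	for _, k := range order {
		out = append(out, *byKey[k])
	}
	sort.SliceStable(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	if len(out) > limit {
		out = out[:limit]
	}
	return out
}

func main() {
	first := []Hit{{URL: "https://a.example"}, {URL: "https://b.example"}}
	second := []Hit{{URL: "https://www.b.example/"}, {URL: "https://c.example"}}
	for _, r := range rerank([][]Hit{first, second}, 35) {
		fmt.Printf("%.2f %d %s\n", r.Score, r.Providers, r.URL)
	}
}
```

In this example `b.example` appears in both lists, so it outranks `a.example` even though `a.example` was the first provider's top hit.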
57
+ Current behavior matches the TypeScript CLI design:
58
+
59
+ - normal search mode queries Brave + Tavily + Exa
60
+ - keyword mode queries Exa only
61
+ - user/domain exclusions are applied after provider collection, matching the TypeScript tool flow
62
+
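Note (not part of the diff): two of the smaller behaviors listed above, truncating a keyword phrase to 5 words and filtering excluded domains after collection, can be sketched in a few lines of Go. The helpers `truncateKeyword`, `isExcluded`, and `filterExcluded` are hypothetical, and treating subdomains of an excluded domain as excluded is an assumption about the matching rule.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// truncateKeyword keeps at most max whitespace-separated words.
func truncateKeyword(phrase string, max int) string {
	words := strings.Fields(phrase)
	if len(words) > max {
		words = words[:max]
	}
	return strings.Join(words, " ")
}

// isExcluded reports whether rawURL's host equals, or is a subdomain of,
// any excluded domain.
func isExcluded(rawURL string, excluded []string) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	host := strings.ToLower(u.Hostname())
	for _, d := range excluded {
		d = strings.ToLower(strings.TrimSpace(d))
		if d == "" {
			continue
		}
		if host == d || strings.HasSuffix(host, "."+d) {
			return true
		}
	}
	return false
}

// filterExcluded applies a comma-separated --exclude value to collected URLs.
func filterExcluded(urls []string, excludeFlag string) []string {
	excluded := strings.Split(excludeFlag, ",")
	var kept []string
	for _, u := range urls {
		if !isExcluded(u, excluded) {
			kept = append(kept, u)
		}
	}
	return kept
}

func main() {
	fmt.Println(truncateKeyword("one two three four five six seven", 5))
	urls := []string{"https://www.youtube.com/watch?v=1", "https://go.dev/doc"}
	fmt.Println(filterExcluded(urls, "youtube.com,vimeo.com"))
}
```

Filtering after collection, as the list above describes, keeps the provider requests identical regardless of the `--exclude` value.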
63
+ ### Read-link port
64
+
65
+ Implemented:
66
+
67
+ - GitHub raw-content fast path
68
+ - `.md` fast path with HEAD probe
69
+ - Firecrawl scrape fallback with the same agent-oriented request settings
70
+ - PDF parser enablement for `.pdf` URLs
71
+
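Note (not part of the diff): a GitHub fast path of this kind typically rewrites a `github.com/<owner>/<repo>/blob/<ref>/<path>` URL to its `raw.githubusercontent.com/<owner>/<repo>/<ref>/<path>` equivalent and fetches that directly. The sketch below shows only the rewrite; `githubRawURL` is a hypothetical helper, and the actual detection logic in `internal/app/scrape.go` may handle more cases.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// githubRawURL converts a GitHub blob URL into the matching
// raw.githubusercontent.com URL. It returns false for anything else.
func githubRawURL(link string) (string, bool) {
	u, err := url.Parse(link)
	if err != nil || u.Hostname() != "github.com" {
		return "", false
	}
	parts := strings.Split(strings.Trim(u.EscapedPath(), "/"), "/")
	// Expect: owner, repo, "blob", ref, then at least one path segment.
	if len(parts) < 5 || parts[2] != "blob" {
		return "", false
	}
	raw := "https://raw.githubusercontent.com/" + parts[0] + "/" + parts[1] + "/" + strings.Join(parts[3:], "/")
	return raw, true
}

func main() {
	raw, ok := githubRawURL("https://github.com/amxv/webctx-ts/blob/main/cli.ts")
	fmt.Println(raw, ok)
}
```

An unauthenticated fetch of the rewritten URL only works for public repositories, which matches the live-validation note later in this file about private blob URLs falling through to the general scrape path.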
72
+ Kept settings aligned with the TypeScript CLI:
73
+
74
+ - `formats: ["markdown"]`
75
+ - `onlyMainContent: true`
76
+ - `skipTlsVerification: true`
77
+ - `blockAds: true`
78
+ - `removeBase64Images: true`
79
+ - `maxAge: 600000`
80
+ - same excluded tags list
81
+
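Note (not part of the diff): the settings above map naturally onto a JSON request body. The sketch below encodes them with a Go struct. The struct, the `url` and `excludeTags` field names, and the helper names are assumptions for illustration; the excluded tags list is not given in this document, so it is left empty rather than guessed.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// scrapeRequest is a hypothetical request body carrying the settings listed above.
type scrapeRequest struct {
	URL                 string   `json:"url"` // assumed field name
	Formats             []string `json:"formats"`
	OnlyMainContent     bool     `json:"onlyMainContent"`
	SkipTLSVerification bool     `json:"skipTlsVerification"`
	BlockAds            bool     `json:"blockAds"`
	RemoveBase64Images  bool     `json:"removeBase64Images"`
	MaxAge              int      `json:"maxAge"`
	ExcludeTags         []string `json:"excludeTags,omitempty"` // assumed field name; list not shown here
}

// newScrapeRequest fills in the documented settings for one target URL.
func newScrapeRequest(target string) scrapeRequest {
	return scrapeRequest{
		URL:                 target,
		Formats:             []string{"markdown"},
		OnlyMainContent:     true,
		SkipTLSVerification: true,
		BlockAds:            true,
		RemoveBase64Images:  true,
		MaxAge:              600000,
	}
}

// encodeScrape returns the JSON body for the given target.
func encodeScrape(target string) (string, error) {
	b, err := json.Marshal(newScrapeRequest(target))
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	body, err := encodeScrape("https://example.com")
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	fmt.Println(body)
}
```

Centralizing the settings in one constructor makes an accidental change to any of them show up as a single, reviewable diff.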
82
+ ### Map-site port
83
+
84
+ Implemented with the same Firecrawl map settings:
85
+
86
+ - `sitemap: "include"`
87
+ - `includeSubdomains: true`
88
+ - `ignoreQueryParameters: true`
89
+ - `limit: 5000`
90
+
91
+ ### Firecrawl queue / rate limiting
92
+
93
+ Implemented in Go:
94
+
95
+ - singleton-style queue wrapper
96
+ - token bucket limiter at 10 requests/minute
97
+ - serialized queue processing for Firecrawl operations
98
+
99
+ This is not a literal line-by-line port, but preserves the same operational intent.
100
+
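Note (not part of the diff): a token bucket at 10 requests/minute allows a burst of up to 10 calls and then earns one new token every 6 seconds. The sketch below shows the core arithmetic with an injected clock so it can be exercised deterministically. `tokenBucket`, `newTokenBucket`, `allow`, and `simulate` are hypothetical names; the sketch is not safe for concurrent use and leaves out the serialized queue, which the real code would need (for example via a mutex or a single worker goroutine).

```go
package main

import (
	"fmt"
	"time"
)

// tokenBucket holds up to capacity tokens and refills continuously over time.
type tokenBucket struct {
	capacity float64
	tokens   float64
	perToken time.Duration // time needed to earn one token
	last     time.Time
}

// newTokenBucket starts full: capacity requests are allowed per interval.
func newTokenBucket(capacity int, interval time.Duration, now time.Time) *tokenBucket {
	return &tokenBucket{
		capacity: float64(capacity),
		tokens:   float64(capacity),
		perToken: interval / time.Duration(capacity),
		last:     now,
	}
}

// allow refills according to elapsed time, then spends one token if available.
func (b *tokenBucket) allow(now time.Time) bool {
	if elapsed := now.Sub(b.last); elapsed > 0 {
		b.tokens += float64(elapsed) / float64(b.perToken)
		if b.tokens > b.capacity {
			b.tokens = b.capacity
		}
		b.last = now
	}
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// simulate fires burst calls at one instant, then one more call after
// waitSeconds, and reports how many burst calls passed and whether the
// late call passed.
func simulate(burst, waitSeconds int) (int, bool) {
	start := time.Date(2026, 1, 1, 0, 0, 0, 0, time.UTC)
	b := newTokenBucket(10, time.Minute, start)
	allowed := 0
	for i := 0; i < burst; i++ {
		if b.allow(start) {
			allowed++
		}
	}
	later := start.Add(time.Duration(waitSeconds) * time.Second)
	return allowed, b.allow(later)
}

func main() {
	allowed, late := simulate(12, 6)
	fmt.Println(allowed, late)
}
```

With these numbers a burst of 12 simultaneous calls lets 10 through, and a further call succeeds once 6 seconds have passed but not after only 3.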
101
+ ### Release/publish setup
102
+
103
+ Updated to be release-ready for the real `webctx` CLI:
104
+
105
+ - GitHub Actions workflow now builds `webctx` instead of the old template placeholder
106
+ - release binaries embed the tagged version into `internal/buildinfo.Version`
107
+ - npm metadata now describes the actual CLI instead of the template
108
+ - README / agent / maintainer docs updated for the real repo
109
+
110
+ ## Intentionally not ported
111
+
112
+ - MCP/server behavior
113
+ - Next.js app/dashboard code
114
+ - database/logging layers unrelated to the CLI
115
+
116
+ These can be added later only if explicitly requested.
117
+
118
+ ## Current repo files of interest
119
+
120
+ - `cmd/webctx/main.go`
121
+ - `internal/app/app.go`
122
+ - `internal/app/tools.go`
123
+ - `internal/app/scrape.go`
124
+ - `internal/app/app_test.go`
125
+ - `.github/workflows/release.yml`
126
+ - `scripts/postinstall.js`
127
+ - `README.md`
128
+
129
+ ## Verification already completed
130
+
131
+ - `go test ./...`
132
+ - `go build ./cmd/webctx`
133
+
134
+ ## Live validation notes
135
+
136
+ Live CLI validation was run against a real `.env.local` on the Sprite machine.
137
+
138
+ Confirmed working live:
139
+
140
+ - combined `search` path returns real web results
141
+ - public GitHub blob `read-link` fast path works
142
+ - Firecrawl-backed `read-link` works
143
+ - Firecrawl-backed `map-site` works
144
+
145
+ Observed external/provider constraints during live validation:
146
+
147
+ - `search --keyword` currently depends on Exa-only results and could not be fully validated because the live Exa account returned `NO_MORE_CREDITS`
148
+ - private GitHub blob URLs are not readable via unauthenticated raw-content fetch, so they fall through to the general scrape path
149
+
150
+ These findings were from live provider behavior, not from compile/test failures in the Go port.
151
+
152
+ ## Good next checks for future agents
153
+
154
+ 1. Run live end-to-end checks against real provider keys for:
155
+    - normal multi-provider search
156
+    - Exa keyword-only search mode
157
+    - GitHub raw-content read-link
158
+    - `.md` fast path read-link
159
+    - Firecrawl scrape fallback
160
+    - Firecrawl map-site
161
+
162
+ 2. Compare a handful of live outputs from `webctx-ts` and Go `webctx` for formatting parity.
163
+
164
+ 3. If performance tuning is needed, focus on:
165
+    - HTTP client reuse
166
+    - provider timeout tuning
167
+    - Firecrawl queue behavior under concurrent use
168
+
169
+ ## Constraints to preserve
170
+
171
+ - Keep the CLI output simple and agent-friendly.
172
+ - Keep the Firecrawl request settings stable unless explicitly asked to change them.
173
+ - Keep the release asset naming contract stable unless postinstall/workflow are updated together.
package/go.mod ADDED
@@ -0,0 +1,3 @@
1
+ module github.com/amxv/webctx
2
+
3
+ go 1.26