@side-quest/word-on-the-street 0.1.3
- package/CHANGELOG.md +50 -0
- package/LICENSE +21 -0
- package/README.md +513 -0
- package/dist/cli.js +835 -0
- package/dist/index.d.ts +1464 -0
- package/dist/index.js +299 -0
- package/dist/shared/chunk-c9dj9n15.js +4779 -0
- package/fixtures/algorithm-baseline/normalize-sample.json +66 -0
- package/fixtures/algorithm-baseline/v1.json +13471 -0
- package/fixtures/eval/baseline.json +125 -0
- package/fixtures/eval/oracle.json +32 -0
- package/fixtures/eval/topics.json +42 -0
- package/fixtures/models_openai_sample.json +41 -0
- package/fixtures/models_xai_sample.json +23 -0
- package/fixtures/openai_edge_cases.json +17 -0
- package/fixtures/openai_edge_cases_no_json.json +17 -0
- package/fixtures/openai_sample.json +22 -0
- package/fixtures/reddit_thread_sample.json +108 -0
- package/fixtures/telemetry/run.completed.v1.sample.json +195 -0
- package/fixtures/xai_sample.json +22 -0
- package/fixtures/youtube_sample.json +47 -0
- package/package.json +121 -0
package/CHANGELOG.md
ADDED
@@ -0,0 +1,50 @@
# Changelog

## 0.1.3

### Patch Changes

- [#24](https://github.com/nathanvale/side-quest-last-30-days/pull/24) [`f7fd622`](https://github.com/nathanvale/side-quest-last-30-days/commit/f7fd62225bd789db1b85678f4513fc1ce9906f71) Thanks [@nathanvale](https://github.com/nathanvale)! - Add --outdir flag to write output files to a custom directory instead of the default ~/.local/share/last-30-days/out/. This enables parallel CLI invocations to write to isolated directories without racing on the same output path.

## 0.1.2

### Patch Changes

- [#17](https://github.com/nathanvale/side-quest-last-30-days/pull/17) [`b9611c7`](https://github.com/nathanvale/side-quest-last-30-days/commit/b9611c7034402f975cce7eed35a4a8296319878b) Thanks [@nathanvale](https://github.com/nathanvale)! - Add comprehensive smoke test prompt for end-to-end validation

- [#19](https://github.com/nathanvale/side-quest-last-30-days/pull/19) [`145ab63`](https://github.com/nathanvale/side-quest-last-30-days/commit/145ab638b122c3eb98d9bc515edcba7b0c2b35dc) Thanks [@nathanvale](https://github.com/nathanvale)! - Fix 9 critical issues from staff engineer review: improved error handling, CLI validation, type safety, and code documentation

## 0.1.1

### Patch Changes

- [#14](https://github.com/nathanvale/side-quest-last-30-days/pull/14) [`583ba17`](https://github.com/nathanvale/side-quest-last-30-days/commit/583ba172a9ed11d0490834889dd9a11d0adcbed1) Thanks [@nathanvale](https://github.com/nathanvale)! - Fix Reddit 429 rate limiting with resilient cache fallback

  - Add 429 classification (transient vs non-retryable quota/billing)
  - Upgrade retry engine: exponential backoff with jitter, Retry-After and x-ratelimit-reset header parsing, 5 retries capped at 30s
  - Add per-source search cache with versioned keys, configurable TTL, and stale fallback on transient 429
  - Add cache concurrency safety: atomic writes and per-key file locking with stampede control
  - Add --refresh and --no-cache CLI flags for cache bypass
  - Add degraded UX messaging with per-source rate-limit attribution
  - Add retry amplification guard (skip core-subject retry after rate-limit)

## 0.1.0

### Minor Changes

- [#8](https://github.com/nathanvale/side-quest-last-30-days/pull/8) [`1504d92`](https://github.com/nathanvale/side-quest-last-30-days/commit/1504d92501d0a4095ec5deabb2912bb37729cbae) Thanks [@nathanvale](https://github.com/nathanvale)! - feat: add --days=N CLI parameter for configurable lookback window (1-365, default 30)

## 0.0.1

### Patch Changes

- [#2](https://github.com/nathanvale/side-quest-last-30-days/pull/2) [`f7736c2`](https://github.com/nathanvale/side-quest-last-30-days/commit/f7736c210db080dba1239d53e3d857ce1c4aba04) Thanks [@nathanvale](https://github.com/nathanvale)! - fix: prevent CLI from hanging after successful completion

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

Initial release.

package/LICENSE
ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2025 Nathan Vale

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

package/README.md
ADDED
@@ -0,0 +1,513 @@
# @side-quest/word-on-the-street

[npm package](https://www.npmjs.com/package/@side-quest/word-on-the-street)
[PR quality workflow](https://github.com/nathanvale/side-quest-word-on-the-street/actions/workflows/pr-quality.yml)
[License](./LICENSE)
[Bun](https://bun.sh)

Research any topic from the last 30 days across Reddit, X, YouTube, and web -- engagement-ranked results.

---

## Features

- **Multi-source search** -- Reddit (via OpenAI Responses API), X/Twitter (via xAI Responses API), YouTube (via yt-dlp), and general web search
- **Engagement-ranked results** -- multi-factor scoring: relevance x recency x engagement, with trend-aware momentum scoring
- **Smart deduplication** -- N-gram Jaccard similarity (70% threshold) for Reddit/X; exact video-ID matching for YouTube
- **Two-phase retrieval** -- phase 1 parallel search + optional phase 2 entity-driven supplemental queries
- **Watchlist** -- track topics over time with SQLite-backed run history and delta detection
- **Filesystem cache** -- versioned cache keys, file locking, atomic writes, stale-cache fallback on rate-limit errors
- **Multiple output modes** -- compact markdown, full JSON, full markdown report, reusable context snippet, or file path
- **CLI + library** -- usable as a command-line tool or imported as a typed Bun package
- **Mock mode** -- fixture-based testing without API keys (`--mock`)
- **Zero runtime deps** -- only `@side-quest/core`; everything else is native (`fetch`, `node:fs`, built-in JSON)

---

## Prerequisites

| Requirement | Notes |
|-------------|-------|
| [Bun](https://bun.sh) `>=1.2` | Runtime (Bun-only) |
| `OPENAI_API_KEY` | Required for Reddit search |
| `XAI_API_KEY` | Required for X/Twitter search |
| [yt-dlp](https://github.com/yt-dlp/yt-dlp) in `PATH` | Required for `--include-youtube` |

Both API keys are optional -- the CLI falls back gracefully to whatever sources are configured.

---

## Installation

```bash
# Global CLI install
bun add -g @side-quest/word-on-the-street

# Library only (programmatic use)
bun add @side-quest/word-on-the-street
```

---

## Quick Start

```bash
# Research a topic using all available sources
wots "Claude Code"

# Deep search with JSON output
wots "React Server Components" --deep --emit=json

# Reddit only, last 7 days
wots "Bun 1.2" --sources=reddit --days=7

# Include YouTube results
wots "AI agents" --include-youtube --emit=json

# Two-phase retrieval (extracts entities from phase 1, runs supplemental queries)
wots "TypeScript 5.9" --strategy=two-phase
```

---

## Configuration

API keys are loaded from environment variables first, then from `~/.config/wots/.env`.

```bash
# ~/.config/wots/.env
OPENAI_API_KEY=sk-...
XAI_API_KEY=xai-...

# Optional: control model selection
OPENAI_MODEL_POLICY=pinned # auto | pinned
OPENAI_MODEL_PIN=gpt-4o-search-preview # only used when policy=pinned
XAI_MODEL_POLICY=latest # latest | stable
XAI_MODEL_PIN=grok-4-1-fast # only used when policy=pinned

# Optional: override cache TTL (hours)
WOTS_CACHE_TTL=1
```

| Path | Purpose |
|------|---------|
| `~/.config/wots/.env` | API keys and model policy |
| `~/.cache/wots/` | Search result cache |
| `~/.local/share/wots/out/` | Context snippet output (default) |
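
The lookup order described in this section (environment variables first, then `~/.config/wots/.env`) can be sketched roughly as follows. This is an illustration only: `parseDotEnv` and `resolveKey` are hypothetical helper names, not the package's API, and a real dotenv parser handles more cases (quoting, escapes, `export` prefixes).

```typescript
// Hypothetical helpers for illustration; not taken from the package source.

/** Parse simple KEY=VALUE lines, skipping blank lines and full-line comments. */
function parseDotEnv(text: string): Record<string, string> {
  const out: Record<string, string> = {}
  for (const rawLine of text.split('\n')) {
    const line = rawLine.trim()
    if (line === '' || line.startsWith('#')) continue
    const eq = line.indexOf('=')
    if (eq <= 0) continue
    const key = line.slice(0, eq).trim()
    // Drop a trailing inline comment such as "pinned # auto | pinned".
    const value = line.slice(eq + 1).replace(/\s+#.*$/, '').trim()
    out[key] = value
  }
  return out
}

/** Environment variables win; the .env file only fills in missing names. */
function resolveKey(
  name: string,
  env: Record<string, string | undefined>,
  fileVars: Record<string, string>,
): string | undefined {
  return env[name] ?? fileVars[name]
}

// Usage with made-up values:
const fileVars = parseDotEnv('OPENAI_API_KEY=sk-from-file\nWOTS_CACHE_TTL=1 # hours\n')
const env = { OPENAI_API_KEY: 'sk-from-env' }
console.log(resolveKey('OPENAI_API_KEY', env, fileVars)) // value from env
console.log(resolveKey('WOTS_CACHE_TTL', env, fileVars)) // value from the file
```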

---

## Model Policy

By default, the CLI pins OpenAI to `gpt-4o-search-preview`. Override with env vars or flags:

- `OPENAI_MODEL_POLICY=pinned` + `OPENAI_MODEL_PIN=<model>` -- env var override
- `--fast` -- pins `gpt-4o` for speed
- `--cheap` -- pins `gpt-4o-mini-search-preview` for cost

Env vars take precedence over flags when both are set.

---

## CLI Reference

### Search (default command)

```
wots <topic> [options]
```

| Flag | Default | Description |
|------|---------|-------------|
| `--emit=MODE` | `compact` | Output format: `compact`, `json`, `md`, `context`, `path` |
| `--sources=MODE` | `auto` | Source selection: `auto`, `reddit`, `x`, `both`, `web` |
| `--days=N` | `30` | Lookback window in days (1-365) |
| `--quick` | - | Fewer results, faster |
| `--deep` | - | More results, comprehensive |
| `--fast` | - | Pin OpenAI model to `gpt-4o` |
| `--cheap` | - | Pin OpenAI model to `gpt-4o-mini-search-preview` |
| `--include-web` | - | Add general web search alongside Reddit/X |
| `--include-youtube` | - | Add YouTube video search (requires yt-dlp) |
| `--strategy=MODE` | `single` | Search strategy: `single` or `two-phase` |
| `--phase2-budget=N` | `5` | Max supplemental queries per source in phase 2 (1-50) |
| `--query-type=TYPE` | `auto` | Intent hint: `auto`, `prompting`, `recommendations`, `news`, `general` |
| `--refresh` | - | Bypass cache reads, force fresh search |
| `--no-cache` | - | Disable cache reads and writes entirely |
| `--outdir=PATH` | - | Write output files to PATH instead of default location |
| `--mock` | - | Use fixture data instead of real API calls |
| `--debug` | - | Enable verbose debug logging |
| `--json` | - | Structured envelope output for agents: `{ status, schema_version, data\|error }` |
| `--jsonl` | - | Newline-delimited JSON records |
| `--fields=SPEC` | - | Field projection (only with `--json`, `--jsonl`, or `--emit=json`) |
| `--quiet` | - | Suppress progress display |
| `--version` | - | Print CLI version |
| `-h`, `--help` | - | Show help message |

### Output modes

| Mode | Description |
|------|-------------|
| `compact` | Markdown summary optimized for Claude to synthesize (default) |
| `json` | Raw `Report` dict as JSON (no envelope) |
| `md` | Full markdown report |
| `context` | Writes a reusable context snippet to disk |
| `path` | Prints the path to the context file on disk |

Notes:
- `--json` returns an agent-friendly envelope `{ status, schema_version, data|error }`; `--emit=json` returns the raw report dict
- `--fields` only applies with `--json`, `--jsonl`, or `--emit=json`
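
For agents consuming `--json` output, the envelope `{ status, schema_version, data|error }` maps naturally onto a discriminated union. The sketch below is a guess at the shape: only the four field names come from this README, while the `'ok'`/`'error'` status values, the type of `schema_version`, and the layout of `error` are assumptions, and `unwrap` is a hypothetical helper.

```typescript
// Assumed shapes: only the names status, schema_version, data, and error
// appear in the README; everything else here is illustrative.
type Envelope<T> =
  | { status: 'ok'; schema_version: string | number; data: T }
  | { status: 'error'; schema_version: string | number; error: { message: string } }

/** Parse one envelope and return its data, or throw on an error envelope. */
function unwrap<T>(raw: string): T {
  const envelope = JSON.parse(raw) as Envelope<T>
  if (envelope.status === 'error') {
    throw new Error(`wots failed: ${envelope.error.message}`)
  }
  return envelope.data
}

// Usage with a made-up payload standing in for CLI stdout:
const sample = JSON.stringify({
  status: 'ok',
  schema_version: '1',
  data: { topic: 'Claude Code' },
})
const report = unwrap<{ topic: string }>(sample)
console.log(report.topic)
```

Checking `status` before touching `data` lets the compiler narrow the union, so callers cannot read `data` from an error envelope by accident.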

### Sources

| Value | Requires |
|-------|----------|
| `auto` | Uses all keys that are configured |
| `reddit` | `OPENAI_API_KEY` |
| `x` | `XAI_API_KEY` |
| `both` | Both keys |
| `web` | No keys required |

### Watch subcommand

Track topics over time. Run history is persisted to a local SQLite database.

```bash
# Add a topic to the watchlist
wots watch add "Claude Code" --every=daily

# List all watched topics
wots watch list

# Remove a topic
wots watch remove "Claude Code"

# Show run history for a topic
wots watch history "Claude Code" --limit=10
```

### Briefing subcommand

Generate a structured briefing from watchlist run history.

```bash
wots briefing "Claude Code" --period=daily
wots briefing "Claude Code" --period=weekly
```

---

## Library Usage

`@side-quest/word-on-the-street` ships a fully-typed barrel export (`src/index.ts`). All core functions are available for programmatic use without side effects.

### Scoring and deduplication

```typescript
import {
  scoreRedditItems,
  scoreXItems,
  scoreYouTubeItems,
  dedupeReddit,
  dedupeX,
  dedupeYouTube,
  sortItems,
} from '@side-quest/word-on-the-street'

// Score, rank, then deduplicate the ranked list
const scored = scoreRedditItems(rawItems)
const sorted = sortItems(scored)
const unique = dedupeReddit(sorted)
```
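
This README describes Reddit/X deduplication as N-gram Jaccard similarity with a 70% threshold (3-character N-grams, per the design notes). A minimal sketch of that technique is shown below. It is not the package's `dedupeReddit` implementation: the names `ngrams`, `jaccard`, and `dedupeByText`, the `text` field, the normalization step, and the `>=` comparison are all assumptions.

```typescript
/** Build the set of overlapping character n-grams of a normalized string. */
function ngrams(text: string, n = 3): Set<string> {
  const normalized = text.toLowerCase().replace(/\s+/g, ' ').trim()
  const grams = new Set<string>()
  if (normalized.length < n) {
    if (normalized.length > 0) grams.add(normalized)
    return grams
  }
  for (let i = 0; i <= normalized.length - n; i++) {
    grams.add(normalized.slice(i, i + n))
  }
  return grams
}

/** Jaccard similarity |A ∩ B| / |A ∪ B|; defined as 0 when both sets are empty. */
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 0
  let shared = 0
  a.forEach((gram) => {
    if (b.has(gram)) shared++
  })
  return shared / (a.size + b.size - shared)
}

/** Keep the first item of each near-duplicate group (input assumed best-first). */
function dedupeByText<T extends { text: string }>(items: T[], threshold = 0.7): T[] {
  const kept: { item: T; grams: Set<string> }[] = []
  for (const item of items) {
    const grams = ngrams(item.text)
    const isDuplicate = kept.some((k) => jaccard(k.grams, grams) >= threshold)
    if (!isDuplicate) kept.push({ item, grams })
  }
  return kept.map((k) => k.item)
}

const deduped = dedupeByText([
  { text: 'Claude Code ships new plugin system' },
  { text: 'Claude Code ships new plugin system!' },
  { text: 'Bun 1.2 released with faster installs' },
])
console.log(deduped.length) // 2: the second title is a near-duplicate of the first
```

Because the first item in each group is the one kept, a routine like this would normally run on a list that is already sorted by score.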

### Trend-aware scoring

```typescript
import { computeTrendScores } from '@side-quest/word-on-the-street'

// trendScore = momentum * 0.7 + sourceDiversityBonus * 0.3
const trendScores = computeTrendScores([...redditItems, ...xItems, ...youtubeItems])
```
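
To make the weighting concrete, here is a small worked example of the formula `trendScore = momentum * 0.7 + sourceDiversityBonus * 0.3`. The weights are taken from the comment above; the helper name `combineTrendScore` and the assumption that both inputs are normalized to the range 0 to 1 are illustrative, and how the package derives momentum is not specified here.

```typescript
// Weights from the README formula; the 0..1 input range is an assumption.
const MOMENTUM_WEIGHT = 0.7
const DIVERSITY_WEIGHT = 0.3

/** Weighted sum of the two trend components. */
function combineTrendScore(momentum: number, sourceDiversityBonus: number): number {
  return momentum * MOMENTUM_WEIGHT + sourceDiversityBonus * DIVERSITY_WEIGHT
}

// A fast-moving topic seen on one source: 0.9 * 0.7 + 0.0 * 0.3, about 0.63
const fastSingleSource = combineTrendScore(0.9, 0.0)

// A slower topic seen on every source: 0.6 * 0.7 + 1.0 * 0.3, about 0.72
const slowerMultiSource = combineTrendScore(0.6, 1.0)

console.log(fastSingleSource < slowerMultiSource) // true under these inputs
```

Under these assumed inputs, broad cross-source coverage can outweigh a moderate momentum gap, which is the effect a diversity bonus is meant to have.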

### YouTube search (requires yt-dlp)

```typescript
import { isYtDlpAvailable, searchYouTube } from '@side-quest/word-on-the-street'

if (isYtDlpAvailable()) {
  const results = await searchYouTube('Claude Code', 30, 'default')
}
```

### Two-phase retrieval orchestration

```typescript
import {
  orchestrate,
  defaultOrchestratorConfig,
} from '@side-quest/word-on-the-street'
import type { SearchAdapter, AdapterSearchConfig } from '@side-quest/word-on-the-street'

const results = await orchestrate(
  adapters,
  config,
  { ...defaultOrchestratorConfig(), strategy: 'two-phase', phase2Budget: 5 },
)
```

### Entity extraction

```typescript
import { extractEntities } from '@side-quest/word-on-the-street'

const entities = extractEntities([...redditItems, ...xItems])
// entities.handles, entities.subreddits, entities.hashtags, entities.terms
```

### Delta detection

```typescript
import { computeDelta } from '@side-quest/word-on-the-street'

const delta = computeDelta(previousEntities, currentEntities)
// delta.newEntities, delta.goneEntities, delta.risingVoices, delta.fallingVoices
```

### Watchlist management

```typescript
import { addTopic, listTopics, removeTopic, recordRun, getHistory } from '@side-quest/word-on-the-street'

await addTopic('Claude Code', 'daily')
const topics = listTopics()
await recordRun('Claude Code', { durationMs: 1200, itemCount: 42, status: 'success', errorMessage: null, summaryJson: null })
const history = getHistory('Claude Code', 10)
```

### Schema types

```typescript
import type {
  Report,
  RedditItem,
  XItem,
  YouTubeItem,
  WebSearchItem,
  Engagement,
  SubScores,
} from '@side-quest/word-on-the-street'
```

---

## Architecture

### The Newsroom Metaphor

The codebase is structured as an editorial newsroom:

```
CLI (Editor-in-Chief)         src/cli.ts
  |
  |-- openai-reddit.ts        Reporter   -> Reddit via OpenAI Responses API
  |-- xai-x.ts                Reporter   -> X/Twitter via xAI Responses API
  |-- youtube.ts              Reporter   -> YouTube via yt-dlp
  |-- websearch.ts            Stringer   -> Delegates to Claude's WebSearch tool
  |-- reddit-enrich.ts        Fact-Check -> Verifies engagement via Reddit JSON API
  |-- entity-extract.ts       Research   -> Extracts @handles, r/subs, #tags, terms
  |-- trend.ts                Analysis   -> Momentum + source diversity scoring
  |-- score.ts + dedupe.ts    Copy Desk  -> Normalizes, ranks, deduplicates
  |-- render.ts               Layout     -> Output: compact, JSON, markdown, context
  |-- retrieval/              Desk       -> Two-phase adapter orchestration
```

### Entry Points

| File | Role |
|------|------|
| `src/index.ts` | Pure barrel export -- no side effects. All library exports. |
| `src/cli.ts` | CLI orchestration and I/O. All side effects live here. |

Both are independent entry points compiled by bunup with code splitting into `dist/`.

### Source Modules (`src/lib/`)

| Module | Responsibility |
|--------|---------------|
| `cache.ts` | Filesystem cache with TTL, versioning, file locking, atomic writes |
| `config.ts` | Loads env vars from `~/.config/wots/.env` |
| `dates.ts` | Date range math, recency scoring |
| `dedupe.ts` | N-gram Jaccard similarity deduplication |
| `delta.ts` | Detects new/gone entities and rising/falling voices between runs |
| `entity-extract.ts` | Extracts @handles, r/subreddits, #hashtags, and repeated terms |
| `http.ts` | Retry logic, rate-limit parsing, error types |
| `intent.ts` | Classifies query intent to tune retrieval policy |
| `models.ts` | Auto-selects latest model from OpenAI/xAI APIs |
| `normalize.ts` | Converts raw API responses to standard schema |
| `openai-reddit.ts` | Reddit search via OpenAI Responses API |
| `reddit-enrich.ts` | Fetches real engagement data from Reddit public JSON |
| `render.ts` | Output formatting (compact, JSON, markdown, context snippet) |
| `retrieval/` | Two-phase orchestrator, query policy, adapter contracts |
| `schema.ts` | TypeScript interfaces + Report factory functions |
| `score.ts` | Multi-factor scoring: relevance x recency x engagement |
| `store.ts` | SQLite database singleton (watchlist persistence) |
| `trend.ts` | Momentum + source diversity scoring |
| `ui.ts` | Terminal progress display |
| `watchlist.ts` | CRUD operations for watched topics and run history |
| `websearch.ts` | Date extraction patterns for web results |
| `xai-x.ts` | X search via xAI Responses API |
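
The `http.ts` row above covers retry logic and rate-limit parsing, which the changelog describes as exponential backoff with jitter, `Retry-After` header parsing, and 5 retries capped at 30s. The sketch below shows one common way to build such a policy. It is not the package's code: the 1-second base delay, the "full jitter" variant, reading the 30s cap as a per-delay limit, and every function name are assumptions.

```typescript
const MAX_RETRIES = 5 // from the changelog
const MAX_DELAY_MS = 30_000 // "capped at 30s", read here as a per-delay cap
const BASE_DELAY_MS = 1_000 // assumed starting delay

/** Parse Retry-After as delta-seconds or an HTTP date; null when absent or invalid. */
function parseRetryAfterMs(header: string | null, nowMs: number = Date.now()): number | null {
  if (header === null || header.trim() === '') return null
  const seconds = Number(header)
  if (Number.isFinite(seconds)) return Math.max(0, seconds * 1000)
  const dateMs = Date.parse(header)
  return Number.isNaN(dateMs) ? null : Math.max(0, dateMs - nowMs)
}

/** Delay before retry `attempt` (0-based): server hint if present, else jittered backoff. */
function retryDelayMs(
  attempt: number,
  retryAfter: string | null,
  random: () => number = Math.random,
): number {
  const hinted = parseRetryAfterMs(retryAfter)
  if (hinted !== null) return Math.min(hinted, MAX_DELAY_MS)
  const ceiling = Math.min(BASE_DELAY_MS * 2 ** attempt, MAX_DELAY_MS)
  return Math.floor(random() * ceiling) // full jitter: uniform in [0, ceiling)
}

/** Retry on HTTP 429 up to MAX_RETRIES times, then return the last response. */
async function fetchWithRetry(url: string): Promise<Response> {
  for (let attempt = 0; ; attempt++) {
    const response = await fetch(url)
    if (response.status !== 429 || attempt >= MAX_RETRIES) return response
    const delay = retryDelayMs(attempt, response.headers.get('retry-after'))
    await new Promise((resolve) => setTimeout(resolve, delay))
  }
}
```

Injecting `random` keeps the delay calculation deterministic in tests, and jitter spreads out retries from parallel invocations so they do not all hit the API at the same instant.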

### Key Design Decisions

- **WebSearch delegation** -- The CLI outputs structured JSON instructions for Claude to use its WebSearch tool rather than making direct HTTP requests.
- **Versioned cache keys** -- Keys hash topic + source + depth + model + prompt version + date range. Prompt version bumps automatically invalidate stale entries.
- **Stale cache fallback** -- On transient 429 rate-limit errors, entries up to 24 hours old are served rather than failing hard.
- **Deduplication strategies** -- Reddit and X use 3-character N-gram Jaccard similarity at 70% threshold. YouTube uses exact video ID matching because IDs are structural identifiers, not fuzzy text.
- **Trend scoring** -- `trendScore = momentum * 0.7 + sourceDiversityBonus * 0.3`. High-engagement items beat high-keyword-match low-engagement items.
- **Library vs CLI separation** -- `src/index.ts` has no side effects; `src/cli.ts` owns all I/O. They compile to separate entry points.
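
The versioned cache key idea from the list above (hash topic + source + depth + model + prompt version + date range) can be illustrated with a short sketch. The inputs mirror that list, but the `CacheKeyParts` interface, the JSON serialization, the SHA-256 hash, and the 16-character truncation are assumptions, not the package's actual scheme.

```typescript
import { createHash } from 'node:crypto'

// Field names follow the README's list; the serialization and hash are assumed.
interface CacheKeyParts {
  topic: string
  source: string
  depth: string
  model: string
  promptVersion: string
  fromDate: string
  toDate: string
}

/** Derive a stable key: identical inputs give the same key, any change gives a new one. */
function cacheKey(parts: CacheKeyParts): string {
  // An array keeps field order fixed, so the serialized form is deterministic.
  const canonical = JSON.stringify([
    parts.topic,
    parts.source,
    parts.depth,
    parts.model,
    parts.promptVersion,
    parts.fromDate,
    parts.toDate,
  ])
  return createHash('sha256').update(canonical).digest('hex').slice(0, 16)
}

// Usage with made-up values: bumping promptVersion yields a different key,
// so entries written under the old prompt are simply never looked up again.
const base: CacheKeyParts = {
  topic: 'Claude Code',
  source: 'reddit',
  depth: 'default',
  model: 'gpt-4o-search-preview',
  promptVersion: 'v1',
  fromDate: '2025-01-01',
  toDate: '2025-01-31',
}
const keyV1 = cacheKey(base)
const keyV2 = cacheKey({ ...base, promptVersion: 'v2' })
console.log(keyV1 !== keyV2) // true
```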

---

## Development

### Setup

```bash
bun install
bun run dev # Watch mode (src/index.ts)
```

### Scripts

```bash
# Build
bun run build # Compile via bunup -> dist/
bun run clean # Remove dist/

# Quality
bun run lint # Biome lint check
bun run lint:fix # Biome lint auto-fix
bun run format # Biome format (write)
bun run check # Biome lint + format (write)
bun run typecheck # tsc --noEmit
bun run validate # Full pipeline: lint + typecheck + build + test

# Testing
bun test # Run all tests
bun test --watch # Watch mode
bun test --coverage # With coverage
bun run update:baseline # Regenerate algorithm baseline fixtures

# Package hygiene
bun run hygiene # publint + attw checks
bun run pack:dry # Inspect package contents

# Versioning
bun run version:gen # Interactive changeset generation
```

### Testing

Tests use the Bun native test runner. All test files live in `tests/`.

| File | Scope |
|------|-------|
| `tests/index.test.ts` | Integration tests -- CLI subprocess via `Bun.spawnSync()` |
| `tests/cli-output.test.ts` | CLI output format and envelope contracts |
| `tests/parse-args.test.ts` | Argument parser unit tests |
| `tests/youtube.test.ts` | YouTube parsing, scoring, deduplication, serialization |
| `tests/youtube-adapter.test.ts` | `buildYouTubeSearchArgs` unit tests |
| `tests/entity-extract.test.ts` | Entity extraction logic |
| `tests/trend.test.ts` | Trend scoring and momentum |
| `tests/intent.test.ts` | Intent classification |
| `tests/watchlist.test.ts` | Watchlist CRUD and run history |
| `tests/briefing.test.ts` | Briefing generation and rendering |
| `tests/retrieval-contracts.test.ts` | Retrieval adapter interface contracts |
| `tests/algorithm-baseline.test.ts` | Golden snapshot baseline for scoring + ranking |
| `tests/algorithm-contracts.test.ts` | Scoring, normalization, dedupe contract tests |
| `tests/field-projection.test.ts` | Field projection logic |
| `tests/output.test.ts` | Output envelope helpers |
| `tests/eval-metrics.test.ts` | Evaluation metric functions |
| `tests/eval-oracle.test.ts` | Test oracle |
| `tests/telemetry-contract.test.ts` | Telemetry schema validation |
| `tests/openai-reddit-edge.test.ts` | OpenAI Reddit edge cases |

The `--mock` flag enables fixture-based testing without API keys. Fixtures live in `fixtures/`.

**Coverage gate:** 80% minimum on lines, branches, and functions (enforced in CI).

### Algorithm Baselines

Golden snapshots in `fixtures/algorithm-baseline/` lock scoring and ranking behavior for deterministic fixtures. If algorithm behavior changes intentionally, regenerate the baseline and review the diff:

```bash
bun run update:baseline
```

| Scenario | Required checks | Lock rule |
|----------|-----------------|-----------|
| Model change (policy, pin, fallback order) | Deterministic gate | Lock only with reviewed baseline diff |
| Algorithm refactor (scoring, normalize, dedupe, trend) | Deterministic gate + `bun run update:baseline` | Lock only with reviewed baseline diff |
| Reliability changes (retry/cache/stale fallback) | Deterministic gate | Lock only if deterministic gate passes |
| CLI/reporting/telemetry refactor | Deterministic gate | Lock if deterministic gate passes |
| Docs-only changes | None | No lock workflow required |

### Code Style

- **Formatter:** Biome -- tabs, single quotes, trailing commas, 80-character line width
- **Test files:** 100-character line width
- **TypeScript:** strict mode, `verbatimModuleSyntax`, bundler module resolution
- **JSDoc required** on all exported functions

---

## CI/CD

| Workflow | Trigger | Purpose |
|----------|---------|---------|
| `pr-quality.yml` | PR, push to main | Lint, typecheck, tests, 80% coverage gate, shell script lint |
| `publish.yml` | Push to main, manual | Stable releases via changesets with OIDC provenance |
| `release.yml` | Manual | Release coordination |
| `commitlint.yml` | PR | Enforce conventional commits |
| `pr-title.yml` | PR | Validate PR title format |
| `security.yml` | Schedule | OSV dependency scanning |
| `codeql.yml` | Schedule | CodeQL static analysis |
| `dependency-review.yml` | PR | Supply chain security review |
| `dependabot-auto-merge.yml` | Dependabot PR | Auto-merge patch/minor updates |
| `package-hygiene.yml` | PR | publint + attw package correctness checks |
| `workflow-lint.yml` | PR | actionlint on workflow YAML files |
| `dismiss-stale-bot-reviews.yml` | PR synchronize | Auto-dismiss stale bot CHANGES_REQUESTED reviews |
| `version-packages-auto-merge.yml` | Changesets PR | Auto-merge version bump PRs |
| `autogenerate-changeset.yml` | PR | Auto-generate changesets for dependency updates |

Runtime support is Bun-only. Release workflows use Node 24 in CI for npm trusted publishing and Changesets compatibility.

---

## Contributing

All commit messages must follow the [Conventional Commits](https://www.conventionalcommits.org/) format, enforced by commitlint + Husky:

```
feat: add YouTube source adapter
fix(youtube): honor lookback window and preserve id case in dedupe
docs: rebuild README
```

### Changeset workflow

1. Create a feature branch from `main`
2. Make changes
3. Run `bun run version:gen` to create a changeset
4. Push the branch and open a PR
5. CI checks must pass (lint, typecheck, tests with 80% coverage)
6. Merge the PR -- the Changesets bot opens a "Version Packages" PR
7. Merge the Version PR to trigger publish to npm with provenance signing

---

## License

MIT -- see [LICENSE](./LICENSE).

---

Built by [Nathan Vale](https://github.com/nathanvale)