@botlearn/reddit-tracker 0.1.0

package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2025 BotLearn
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,35 @@
1
+ # @botlearn/reddit-tracker
2
+
3
+ > Reddit community monitoring, engagement velocity tracking, cross-subreddit correlation, and early trend detection for OpenClaw Agent
4
+
5
+ ## Installation
6
+
7
+ ```bash
8
+ # via npm
9
+ npm install @botlearn/reddit-tracker
10
+
11
+ # via clawhub
12
+ clawhub install @botlearn/reddit-tracker
13
+ ```
14
+
15
+ ## Category
16
+
17
+ Information Retrieval
18
+
19
+ ## Dependencies
20
+
21
+ None
22
+
23
+ ## Files
24
+
25
+ | File | Description |
26
+ |------|-------------|
27
+ | `manifest.json` | Skill metadata and configuration |
28
+ | `skill.md` | Role definition and activation rules |
29
+ | `knowledge/` | Domain knowledge documents |
30
+ | `strategies/` | Behavioral strategy definitions |
31
+ | `tests/` | Smoke and benchmark tests |
32
+
33
+ ## License
34
+
35
+ MIT
package/knowledge/anti-patterns.md ADDED
@@ -0,0 +1,72 @@
1
+ ---
2
+ domain: reddit-tracker
3
+ topic: anti-patterns
4
+ priority: medium
5
+ ttl: 30d
6
+ ---
7
+
8
+ # Reddit Tracker — Anti-Patterns
9
+
10
+ ## Vote & Score Anti-Patterns
11
+
12
+ ### 1. Vote Manipulation Blindness
13
+ - **Problem**: Treating upvote scores at face value without accounting for vote manipulation, brigading, or bot networks; a post with 10K upvotes may have been artificially boosted
14
+ - **Fix**: Cross-validate score with independent signals — comment quality and diversity, account age distribution of commenters, upvote-to-unique-commenter ratio. A genuinely popular post has a diverse, organic commenter base; a manipulated one has sparse or formulaic comments relative to its score
15
+
16
+ ### 2. Conflating Upvotes with Quality
17
+ - **Problem**: Assuming high-score posts represent high-quality or important information; Reddit's upvote system rewards early timing, emotional triggers, and community in-jokes more than accuracy or significance
18
+ - **Fix**: Evaluate post quality independently of score — check if claims are sourced, if expert commenters corroborate or challenge the post, and whether the upvote ratio aligns with the comment sentiment. A post with 5K upvotes but a top comment debunking it is a misinformation signal, not a quality signal
19
+
20
+ ### 3. Ignoring Vote Fuzzing
21
+ - **Problem**: Relying on exact upvote/downvote counts for precision analysis; Reddit deliberately fuzzes vote counts to prevent manipulation detection, so absolute numbers are approximate
22
+ - **Fix**: Use `upvote_ratio` and relative rank changes instead of absolute score changes. Track a post's position in the subreddit listing over time rather than its raw score. Velocity trends are reliable even when individual data points are fuzzed
23
+
24
+ ### 4. Score Snapshot Fallacy
25
+ - **Problem**: Measuring a post's score at a single point in time and drawing conclusions; a post at 500 points could be rising rapidly toward 5K or slowly decaying from 2K
26
+ - **Fix**: Always sample engagement metrics at multiple time points to calculate velocity. A minimum of 3 data points over 30+ minutes is required to establish a reliable trend direction
27
+
28
+ ## Trend Detection Anti-Patterns
29
+
30
+ ### 5. Single-Subreddit Myopia
31
+ - **Problem**: Declaring a topic as "trending on Reddit" based on activity in only one subreddit; a topic trending in r/technology may be completely unknown in broader Reddit
32
+ - **Fix**: Always check at least 3-5 related subreddits before classifying a trend as cross-community. Use the cross-subreddit spread score from best-practices.md. Qualify all trend claims with scope — "trending in r/technology" is different from "trending across Reddit"
33
+
34
+ ### 6. Repost Conflation
35
+ - **Problem**: Counting reposts and duplicates as separate trend signals; the same content reposted across subreddits by the same user or bot network inflates apparent spread
36
+ - **Fix**: Deduplicate by URL, image hash, and text similarity before counting cross-subreddit appearances. Check if the poster accounts are related (similar names, creation dates, post histories). A genuine trend has diverse original posters, not one account spamming copies
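The URL part of this deduplication can be sketched as follows. This is a minimal illustration, not the package's implementation: `canonical_url` and `distinct_posters_by_url` are hypothetical names, the normalization rules (drop `www.`, `utm_*` parameters, fragments, trailing slashes) are assumptions, and image hashing and text similarity are not covered.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Reduce a URL to a comparable form (assumed normalization rules)."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    # Drop tracking parameters; keep everything else in its original order.
    kept = [(k, v) for k, v in parse_qsl(parts.query) if not k.lower().startswith("utm_")]
    path = parts.path.rstrip("/")
    return urlunsplit((parts.scheme.lower(), host, path, urlencode(kept), ""))


def distinct_posters_by_url(posts):
    """Map each canonical URL to the number of distinct authors who posted it."""
    authors = defaultdict(set)
    for post in posts:
        authors[canonical_url(post["url"])].add(post["author"])
    return {url: len(names) for url, names in authors.items()}


posts = [
    {"url": "https://www.example.com/story/?utm_source=reddit", "author": "alice"},
    {"url": "https://example.com/story", "author": "alice"},
    {"url": "https://example.com/story#comments", "author": "bob"},
]
print(distinct_posters_by_url(posts))  # {'https://example.com/story': 2}
```

Three submissions collapse to one URL with two distinct posters, so the apparent spread is smaller than the raw post count suggests.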
37
+
38
+ ### 7. Survivorship Bias in Trend Analysis
39
+ - **Problem**: Only studying posts that reached the front page and generalizing their patterns; for every front-page post, hundreds had similar early signals but failed to break out
40
+ - **Fix**: Maintain a baseline of posts that showed early breakout signals but did NOT reach the front page. Calculate precision and recall — what fraction of flagged posts actually broke out, and what fraction of actual breakouts were detected early? Tune thresholds based on both
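Precision and recall for the early-detection flag can be computed from two sets of post IDs. A minimal sketch, assuming the caller already knows which posts were flagged and which actually broke out (`precision_recall` is an illustrative name):

```python
def precision_recall(flagged: set, broke_out: set) -> tuple:
    """Return (precision, recall) for flagged posts against real breakouts."""
    true_positives = len(flagged & broke_out)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(broke_out) if broke_out else 0.0
    return precision, recall


flagged = {"a", "b", "c", "d"}   # posts the detector flagged early
broke_out = {"a", "b", "e"}      # posts that actually reached the front page
precision, recall = precision_recall(flagged, broke_out)
print(precision)  # 0.5: half of the flagged posts broke out
print(recall)     # two of the three real breakouts were caught
```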
41
+
42
+ ### 8. Ignoring Megathread Absorption
43
+ - **Problem**: Missing that a topic's individual posts are being removed by moderators who consolidate discussion into a megathread; this makes the topic appear to have declining post volume when it actually has concentrated, high-volume discussion
44
+ - **Fix**: Monitor for megathread creation (pinned/stickied posts with "Megathread" or "Discussion Thread" in the title). When a megathread exists, measure trend engagement through the megathread's comment volume and velocity, not through individual post counts
45
+
46
+ ## Community & Context Anti-Patterns
47
+
48
+ ### 9. Ignoring Subreddit Culture
49
+ - **Problem**: Applying the same engagement thresholds and interpretation rules to all subreddits; r/science has strict sourcing norms while r/memes rewards absurdity — the same engagement patterns mean completely different things
50
+ - **Fix**: Build per-subreddit behavioral profiles. Calibrate velocity thresholds, sentiment interpretation, and content quality signals relative to each community's baseline. A 50-comment discussion in r/AskHistorians is exceptional; the same count in r/AskReddit is negligible
51
+
52
+ ### 10. Bot & Karma Farm Blindness
53
+ - **Problem**: Including engagement from bot accounts and karma farming operations in trend calculations; these accounts artificially inflate metrics and can create phantom trends
54
+ - **Fix**: Flag and discount engagement from accounts that match bot patterns: account age < 30 days + high post frequency + identical comment patterns + posts exclusively to karma-farming subreddits. Weight signals from accounts with established, diverse histories more heavily
55
+
56
+ ### 11. Astroturfing Detection Failure
57
+ - **Problem**: Failing to recognize coordinated inauthentic behavior — marketing campaigns, political operations, or corporate PR disguised as organic community interest
58
+ - **Fix**: Check for: (1) multiple accounts posting about the same product/topic within a short window with suspiciously positive framing, (2) comments that read like ad copy or talking points, (3) accounts with histories that show sudden topic pivots, (4) submission timing patterns that suggest coordination (e.g., 5 posts about the same startup within 10 minutes from accounts in different subreddits)
59
+
60
+ ### 12. Timezone Ignorance
61
+ - **Problem**: Interpreting low engagement during off-peak hours as lack of interest; a post submitted at 3 AM EST to a US-centric subreddit will naturally underperform regardless of topic quality
62
+ - **Fix**: Always normalize engagement by the subreddit's hourly activity baseline. Compare a post's velocity against other posts submitted in the same time window, not against the subreddit's all-time averages. Flag posts that show strong velocity during off-peak hours as especially significant
63
+
64
+ ## Output Anti-Patterns
65
+
66
+ ### 13. Trend Report Without Confidence Levels
67
+ - **Problem**: Presenting trend detections as binary (trending / not trending) without indicating confidence or supporting evidence strength
68
+ - **Fix**: Every trend detection should include: confidence level (high/medium/low), number of confirming signals, time horizon for the prediction, and explicit list of evidence supporting the classification
69
+
70
+ ### 14. Missing Temporal Context
71
+ - **Problem**: Reporting a trend without specifying its lifecycle phase — is it emerging, peaking, or decaying? A trend already at peak is not actionable for early detection
72
+ - **Fix**: Always classify the trend's current phase (Seed, Ignition, Surge, Peak, Decay) and provide estimated time to peak or time since peak. Include the velocity curve direction (accelerating, stable, decelerating)
package/knowledge/best-practices.md ADDED
@@ -0,0 +1,124 @@
1
+ ---
2
+ domain: reddit-tracker
3
+ topic: velocity-trend-detection-cross-subreddit-correlation
4
+ priority: high
5
+ ttl: 30d
6
+ ---
7
+
8
+ # Reddit Tracker — Best Practices
9
+
10
+ ## Velocity-Based Trend Detection
11
+
12
+ ### 1. Engagement Velocity Definition
13
+ Engagement velocity measures the rate of change in a post's engagement metrics over time, not the absolute values:
14
+ - **Upvote velocity** — Score change per minute: `(score_t2 - score_t1) / (t2 - t1)`
15
+ - **Comment velocity** — New comments per minute over a sliding window
16
+ - **Award velocity** — Awards received per hour (early awards are a strong signal)
17
+ - **Crosspost velocity** — Rate at which the post is crossposted to other subreddits
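The per-minute velocity formula can be sketched as below, assuming two snapshots of the same post taken at known Unix timestamps. The `Snapshot` type and `velocity_per_minute` function are illustrative names, not part of the package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """One observation of a post; field names mirror Reddit's post object."""
    taken_at: float      # Unix time of the observation, in seconds
    score: int
    num_comments: int


def velocity_per_minute(earlier: Snapshot, later: Snapshot, field: str = "score") -> float:
    """Rate of change of `field` between two snapshots, per minute."""
    elapsed_min = (later.taken_at - earlier.taken_at) / 60.0
    if elapsed_min <= 0:
        raise ValueError("snapshots must be in chronological order")
    return (getattr(later, field) - getattr(earlier, field)) / elapsed_min


t1 = Snapshot(taken_at=0, score=120, num_comments=10)
t2 = Snapshot(taken_at=900, score=420, num_comments=40)   # 15 minutes later
print(velocity_per_minute(t1, t2))                  # 20.0 upvotes per minute
print(velocity_per_minute(t1, t2, "num_comments"))  # 2.0 comments per minute
```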
18
+
19
+ ### 2. Velocity Curve Phases
20
+ Breakout posts typically follow a characteristic velocity curve:
21
+
22
+ | Phase | Time from Post | Velocity Behavior | Detection Signal |
23
+ |-------|---------------|-------------------|-----------------|
24
+ | Seed | 0-15 min | Low, erratic | Not yet detectable |
25
+ | Ignition | 15-60 min | Sharp acceleration | **Primary detection window** |
26
+ | Surge | 1-4 hours | Sustained high velocity | Confirmed breakout |
27
+ | Peak | 4-12 hours | Velocity plateau then decline | Maximum reach |
28
+ | Decay | 12-48 hours | Declining velocity | Trend exhaustion |
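As a rough sketch, the time windows in the table can be turned into a lookup by post age. This is a simplification: it uses age alone and ignores the velocity behavior column, and the handling of boundaries and of posts older than 48 hours is an assumption.

```python
# (exclusive upper bound in minutes, phase name), taken from the table above
PHASES = [
    (15, "Seed"),
    (60, "Ignition"),
    (4 * 60, "Surge"),
    (12 * 60, "Peak"),
    (48 * 60, "Decay"),
]


def phase_for_age(age_minutes: float) -> str:
    """Return the lifecycle phase whose time window contains the post's age."""
    if age_minutes < 0:
        raise ValueError("age must be non-negative")
    for upper_bound, name in PHASES:
        if age_minutes < upper_bound:
            return name
    return "Decay"  # assumption: anything older than 48 hours counts as decayed


print(phase_for_age(10))   # Seed
print(phase_for_age(45))   # Ignition
print(phase_for_age(300))  # Peak
```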
29
+
30
+ ### 3. Velocity Normalization
31
+ Raw velocity must be normalized to produce comparable signals:
32
+ - **By subreddit size**: Divide by subscriber count — 50 upvotes/min in a 10K sub is more significant than in a 10M sub
33
+ - **By time-of-day**: Compare against the subreddit's historical hourly engagement baseline
34
+ - **By day-of-week**: Weekend vs weekday patterns differ significantly for many communities
35
+ - **By post type**: Image posts typically accelerate faster than text posts in the same subreddit
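Two of these normalizations can be sketched as simple ratios. The function names and the per-100K scaling are illustrative choices, not values defined by the package; the baseline velocity is assumed to come from the subreddit's own history for the same hour, weekday, and post type.

```python
def per_100k_subscribers(raw_velocity: float, subscribers: int) -> float:
    """Scale a raw velocity to a notional community of 100,000 subscribers."""
    if subscribers <= 0:
        raise ValueError("subscribers must be positive")
    return raw_velocity * 100_000 / subscribers


def relative_to_baseline(raw_velocity: float, baseline_velocity: float) -> float:
    """Express a velocity as a multiple of the subreddit's historical baseline."""
    if baseline_velocity <= 0:
        raise ValueError("baseline must be positive")
    return raw_velocity / baseline_velocity


print(per_100k_subscribers(50, 10_000))      # 500.0 in a 10K sub
print(per_100k_subscribers(50, 10_000_000))  # 0.5 in a 10M sub
print(relative_to_baseline(6.0, 2.0))        # 3.0 times the usual pace
```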
36
+
37
+ ### 4. Breakout Detection Threshold
38
+ A post is flagged as a potential breakout when:
39
+ - Upvote velocity exceeds 2x the subreddit's 90th-percentile velocity for its post age
40
+ - Comment velocity exceeds 3x the median for the subreddit within the first hour
41
+ - OR the post receives 2+ awards within the first 30 minutes in a community where awards are rare
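The list does not spell out how the first two conditions combine, so the sketch below only reports which thresholds a post exceeds and leaves the combination rule to the caller. All parameter names are hypothetical; the percentile and median baselines are assumed to be precomputed per subreddit.

```python
from typing import List


def breakout_signals(
    upvote_velocity: float,
    p90_upvote_velocity: float,      # subreddit's 90th percentile for this post age
    comment_velocity: float,
    median_comment_velocity: float,  # subreddit's first-hour median
    early_awards: int,               # awards received in the first 30 minutes
    awards_are_rare: bool,
) -> List[str]:
    """Return the names of the breakout thresholds this post exceeds."""
    signals = []
    if upvote_velocity > 2 * p90_upvote_velocity:
        signals.append("upvote_velocity")
    if comment_velocity > 3 * median_comment_velocity:
        signals.append("comment_velocity")
    if awards_are_rare and early_awards >= 2:
        signals.append("early_awards")
    return signals


print(breakout_signals(25, 10, 2, 1, 2, True))  # ['upvote_velocity', 'early_awards']
```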
42
+
43
+ ### 5. Multi-Metric Confirmation
44
+ Never rely on a single velocity metric. Confirm with at least two:
45
+ - High upvote velocity + high comment velocity = strong engagement (likely genuine trend)
46
+ - High upvote velocity + low comment velocity = passive consumption (may be meme/image virality, not discussion-worthy trend)
47
+ - Low upvote velocity + high comment velocity = controversial or niche discussion (check controversial flag)
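The three combinations above map naturally onto a small lookup. In this sketch the "high" flags are assumed to be decided elsewhere (for example by the breakout thresholds), and the low/low label is an assumption since the list does not cover that case.

```python
def engagement_pattern(high_upvote_velocity: bool, high_comment_velocity: bool) -> str:
    """Label a post by which velocity metrics are elevated."""
    if high_upvote_velocity and high_comment_velocity:
        return "strong engagement"
    if high_upvote_velocity:
        return "passive consumption"
    if high_comment_velocity:
        return "controversial or niche discussion"
    return "no breakout signal"  # assumption: not defined in the list above


print(engagement_pattern(True, False))  # passive consumption
```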
48
+
49
+ ## Cross-Subreddit Correlation
50
+
51
+ ### 1. Same-Topic Detection
52
+ Identify when the same topic emerges across multiple independent subreddits:
53
+ - **URL matching** — Same link posted to different subreddits (crosspost or independent submission)
54
+ - **Keyword clustering** — Same key terms appearing in titles across different communities within a 6-hour window
55
+ - **Entity co-occurrence** — Same named entities (people, companies, products) surfacing in unrelated subreddits
56
+ - **Semantic similarity** — Post titles or bodies with high cosine similarity across communities
57
+
58
+ ### 2. Cross-Community Spread Score
59
+ Calculate a spread score to quantify how broadly a topic has penetrated:
60
+ ```
61
+ spread_score = (num_subreddits / expected_subreddits) * avg_normalized_velocity * diversity_factor
62
+ ```
63
+ - `num_subreddits` — Count of distinct subreddits where the topic appeared
64
+ - `expected_subreddits` — Baseline expectation based on the topic domain (tech news may naturally span 3-5 subreddits)
65
+ - `avg_normalized_velocity` — Average engagement velocity across all subreddits (normalized per community)
66
+ - `diversity_factor` — Higher when subreddits span different categories (e.g., both r/technology and r/stocks discussing the same company)
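A direct transcription of the spread score formula, as a sketch. How `expected_subreddits` and `diversity_factor` are estimated is left to the caller; the example values are made up for illustration.

```python
def spread_score(
    num_subreddits: int,
    expected_subreddits: float,
    avg_normalized_velocity: float,
    diversity_factor: float,
) -> float:
    """Cross-community spread score as defined in the formula above."""
    if expected_subreddits <= 0:
        raise ValueError("expected_subreddits must be positive")
    return (num_subreddits / expected_subreddits) * avg_normalized_velocity * diversity_factor


# A topic seen in 8 subreddits where 4 would be typical for its domain.
print(spread_score(8, 4, 1.5, 1.25))  # 3.75
```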
67
+
68
+ ### 3. Origin Tracing
69
+ Identify where a trend started to understand its trajectory:
70
+ - Sort all related posts by `created_utc` — the earliest post is likely the origin
71
+ - Check if the origin subreddit is a known "incubator" community (e.g., niche hobby subs often incubate trends before they reach mega-subs)
72
+ - Track the crosspost chain to map the exact spread path
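Sorting by `created_utc` can be sketched as below. Note the caveat that this finds the earliest observed post only; deleted or unsampled posts may predate it. `find_origin` and `spread_path` are illustrative names, and the sample data is invented.

```python
def find_origin(posts):
    """Return the earliest observed post (raises ValueError on an empty list)."""
    return min(posts, key=lambda post: post["created_utc"])


def spread_path(posts):
    """Subreddits in the order the topic was first observed in each."""
    ordered = sorted(posts, key=lambda post: post["created_utc"])
    return [post["subreddit"] for post in ordered]


related = [
    {"subreddit": "technology", "created_utc": 1_700_003_600},
    {"subreddit": "selfhosted", "created_utc": 1_700_000_000},
    {"subreddit": "stocks", "created_utc": 1_700_007_200},
]
print(find_origin(related)["subreddit"])  # selfhosted
print(spread_path(related))               # ['selfhosted', 'technology', 'stocks']
```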
73
+
74
+ ### 4. Propagation Pattern Classification
75
+
76
+ | Pattern | Description | Significance |
77
+ |---------|------------|--------------|
78
+ | Hub-and-Spoke | One mega-sub post spawns crossposts | Top-down virality; already mainstream |
79
+ | Grassroots | Multiple small subs independently discover topic | Organic emergence; high prediction value |
80
+ | Cascade | Topic hops through communities sequentially | Building momentum; time-sensitive |
81
+ | Synchronized | Same topic appears simultaneously in unrelated subs | External event trigger (news, product launch) |
82
+
83
+ ## Temporal Analysis Best Practices
84
+
85
+ ### 1. Peak Activity Windows
86
+ Every subreddit has characteristic activity patterns:
87
+ - US-centric subs peak 9 AM - 12 PM EST on weekdays
88
+ - Global subs have multiple peaks across time zones
89
+ - Gaming subs peak evenings and weekends
90
+ - Finance subs spike at market open/close and during events
91
+
92
+ ### 2. Anomaly Detection via Temporal Baseline
93
+ - Build a 30-day rolling baseline of hourly post and comment volume per subreddit
94
+ - Flag any hour where volume exceeds 2 standard deviations above the baseline
95
+ - Volume spikes during off-peak hours are particularly significant
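The two-standard-deviation rule can be sketched with the standard library. This assumes `history` holds the volumes observed for the same hour of day across the baseline window, and it uses the sample standard deviation; both are choices made for the illustration.

```python
import statistics
from typing import Sequence


def is_volume_anomaly(history: Sequence[float], current: float, threshold_sd: float = 2.0) -> bool:
    """True when `current` exceeds the baseline mean by more than `threshold_sd` deviations."""
    if len(history) < 2:
        raise ValueError("need at least two baseline observations")
    mean = statistics.mean(history)
    sd = statistics.stdev(history)  # sample standard deviation
    return current > mean + threshold_sd * sd


history = [100, 110, 90, 105, 95]       # posts seen in this hour on previous days
print(is_volume_anomaly(history, 120))  # True
print(is_volume_anomaly(history, 110))  # False
```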
96
+
97
+ ### 3. Trend Timing Prediction
98
+ To predict when a trend will peak:
99
+ - Measure the velocity curve slope during the Ignition phase
100
+ - Compare against historical breakout curves for the same subreddit
101
+ - Apply time-of-day correction — a post entering Surge phase during a subreddit's peak hours will peak faster
102
+
103
+ ## Sentiment Signal Extraction
104
+
105
+ ### 1. Comment Sentiment Distribution
106
+ For any trending post, analyze the comment section:
107
+ - **Positive / Negative / Neutral** ratio in top-level comments
108
+ - **Controversial flag** density — high density indicates polarizing topic
109
+ - **Sentiment shift** — Compare early comments vs. later comments to detect narrative evolution
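Once each top-level comment has a sentiment label, the ratio is a simple count. In this sketch the labels are assumed to come from some upstream classifier, which is out of scope here; `sentiment_distribution` is an illustrative name.

```python
from collections import Counter

LABELS = ("positive", "negative", "neutral")


def sentiment_distribution(labels):
    """Fraction of comments carrying each sentiment label."""
    counts = Counter(labels)
    total = sum(counts[label] for label in LABELS)
    if total == 0:
        return {label: 0.0 for label in LABELS}
    return {label: counts[label] / total for label in LABELS}


early = ["positive", "positive", "negative", "neutral"]
print(sentiment_distribution(early))
# {'positive': 0.5, 'negative': 0.25, 'neutral': 0.25}
```

Running the same function on early and late comment samples gives a basic view of sentiment shift.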
110
+
111
+ ### 2. Community Reaction Markers
112
+ - Top comment agreeing with post = community endorsement
113
+ - Top comment contradicting post = community skepticism (even if post score is high)
114
+ - Pinned moderator comment = topic requires community governance attention
115
+ - Post locked by mods = extreme engagement or rule violations
116
+
117
+ ### 3. Engagement Quality Tiers
118
+
119
+ | Tier | Indicator | Meaning |
120
+ |------|-----------|---------|
121
+ | Deep | Long comments, citations, debate threads | Genuine interest and expertise |
122
+ | Reactive | Short affirmative comments, emojis, memes | Viral moment but shallow engagement |
123
+ | Hostile | Insults, reports, mod intervention | Controversial topic, unreliable sentiment signal |
124
+ | Astroturfed | Identical phrasing, new accounts, coordinated timing | Inauthentic engagement — exclude from analysis |
package/knowledge/domain.md ADDED
@@ -0,0 +1,140 @@
1
+ ---
2
+ domain: reddit-tracker
3
+ topic: reddit-api-scoring-karma-post-types
4
+ priority: high
5
+ ttl: 30d
6
+ ---
7
+
8
+ # Reddit Platform — API, Scoring, Karma & Post Types
9
+
10
+ ## Reddit Data API
11
+
12
+ ### Authentication
13
+ - Reddit uses OAuth2 for API access; all requests require a bearer token
14
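+ - Rate limit: 100 queries per minute per OAuth client ID on the free tier, averaged over a 10-minute window; the `X-Ratelimit-Used`, `X-Ratelimit-Remaining`, and `X-Ratelimit-Reset` response headers report the live budget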
15
+ - User-Agent header is mandatory — requests without it are deprioritized or blocked
16
+ - App types: `script` (personal), `web` (3rd-party), `installed` (mobile/desktop)
17
+
18
+ ### Key Endpoints for Monitoring
19
+
20
+ #### Subreddit Listings
21
+ - `GET /r/{subreddit}/hot` — Posts ranked by Reddit's "hot" algorithm (recency + engagement)
22
+ - `GET /r/{subreddit}/new` — Chronological newest posts (critical for early detection)
23
+ - `GET /r/{subreddit}/rising` — Posts gaining traction faster than peers (best signal for breakouts)
24
+ - `GET /r/{subreddit}/top?t={hour|day|week}` — Highest-scoring posts within a time window
25
+ - `GET /r/{subreddit}/controversial?t={hour|day|week}` — Posts with high engagement but split up/down ratio
26
+
27
+ #### Post & Comment Data
28
+ - `GET /comments/{article_id}` — Full comment tree for a post
29
+ - `GET /r/{subreddit}/comments` — Latest comments across the subreddit (firehose)
30
+ - Each post object includes: `score`, `upvote_ratio`, `num_comments`, `created_utc`, `gilded`, `total_awards_received`, `num_crossposts`, and, on crossposts only, `crosspost_parent`
31
+
32
+ #### Search
33
+ - `GET /r/{subreddit}/search?q={query}&sort=new&restrict_sr=true` — Subreddit-scoped search
34
+ - `GET /search?q={query}&sort=relevance&t=day` — Site-wide search with time filter
35
+ - Supports Lucene-style syntax: `title:keyword`, `selftext:keyword`, `author:username`, `flair:tag`
36
+
37
+ ### Pagination
38
+ - Reddit uses cursor-based pagination with `after` and `before` fullname tokens
39
+ - Maximum 100 items per request (default 25)
40
+ - Listings capped at ~1000 items total — cannot paginate beyond that
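Cursor pagination can be sketched independently of any HTTP client. Here `fetch_page` is a hypothetical callable standing in for an authenticated GET that passes the `after` token (and `limit=100`) and returns the listing JSON, whose shape is `{"data": {"children": [...], "after": ...}}`; the fake fetcher exists only to make the example runnable.

```python
from typing import Callable, Iterator, Optional


def iter_listing(fetch_page: Callable[[Optional[str]], dict], max_items: int = 1000) -> Iterator[dict]:
    """Yield post objects page by page until the cursor runs out or the cap is hit."""
    after = None
    yielded = 0
    while yielded < max_items:
        listing = fetch_page(after)["data"]
        for child in listing["children"]:
            yield child["data"]
            yielded += 1
            if yielded >= max_items:
                return
        after = listing.get("after")
        if not after or not listing["children"]:
            return  # no further pages


def fake_fetch(after):
    """Stand-in for the real request: two pages keyed by cursor."""
    pages = {
        None: {"data": {"children": [{"data": {"id": "a"}}, {"data": {"id": "b"}}], "after": "t3_b"}},
        "t3_b": {"data": {"children": [{"data": {"id": "c"}}], "after": None}},
    }
    return pages[after]


ids = [post["id"] for post in iter_listing(fake_fetch)]
print(ids)  # ['a', 'b', 'c']
```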
41
+
42
+ ## Reddit Scoring System
43
+
44
+ ### Vote Score
45
+ - `score = upvotes - downvotes` (approximate — Reddit applies vote fuzzing)
46
+ - `upvote_ratio` — Fraction of votes that are upvotes (0.0 to 1.0)
47
+ - Vote fuzzing: Reddit adds/subtracts random votes to obscure true counts; ratios are more reliable than absolute scores
48
+
49
+ ### Hot Ranking Algorithm
50
+ Reddit's hot ranking combines score magnitude with recency:
51
+ ```
52
+ hot_score = sign(score) * log10(max(|score|, 1)) + (created_utc - epoch) / 45000
53
+ ```
54
+ - Logarithmic score: the first 10 votes matter as much as the next 100
55
+ - Time decay: every 45,000 seconds (12.5 hours) of age is worth one point, so an older post needs 10x the score to match a post submitted 12.5 hours later
56
+ - Epoch reference: Reddit uses a fixed epoch (December 8, 2005)
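A sketch of the hot score in the form it took in Reddit's historically open-sourced ranking code, where the sign multiplies the logarithmic term and the time term is always additive. The production algorithm may have changed since, so treat this as illustrative.

```python
import math

REDDIT_EPOCH = 1134028003  # 2005-12-08 07:46:43 UTC, in Unix seconds


def hot_score(score: int, created_utc: float) -> float:
    """Log-scaled score plus a recency bonus of one point per 45,000 seconds."""
    order = math.log10(max(abs(score), 1))
    sign = (score > 0) - (score < 0)   # 1, 0, or -1
    seconds = created_utc - REDDIT_EPOCH
    return round(sign * order + seconds / 45000, 7)


older = hot_score(1000, REDDIT_EPOCH)         # 1000 points, posted at the epoch
newer = hot_score(100, REDDIT_EPOCH + 45000)  # 100 points, posted 12.5 hours later
print(older, newer)  # 3.0 3.0
```

The two posts tie: being 12.5 hours newer offsets a tenfold difference in score.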
57
+
58
+ ### Best Ranking (Comments)
59
+ - Uses Wilson score confidence interval — favors comments with high upvote ratio AND sufficient sample size
60
+ - A comment with 10 up / 1 down outranks one with 100 up / 50 down
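The lower bound of the Wilson score interval can be computed as below. The default `z` (about 80% confidence) is the value used in Reddit's historically open-sourced code; whether production still uses it is an assumption.

```python
import math


def wilson_lower_bound(ups: int, downs: int, z: float = 1.281551565545) -> float:
    """Lower bound of the Wilson score interval for the upvote fraction."""
    n = ups + downs
    if n == 0:
        return 0.0
    p = ups / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt((p * (1 - p) + z * z / (4 * n)) / n)
    return (centre - margin) / (1 + z * z / n)


small_but_clean = wilson_lower_bound(10, 1)
large_but_split = wilson_lower_bound(100, 50)
print(small_but_clean > large_but_split)  # True
```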
61
+
62
+ ### Controversy Score
63
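+ - `controversy = (upvotes + downvotes) ^ (min(upvotes, downvotes) / max(upvotes, downvotes))` when both > 0, otherwise 0 (per Reddit's open-sourced ranking code)
64
+ - Higher controversy score = more evenly split votes and more total votes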
65
+ - Controversial posts show a dagger icon (†) on old Reddit
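A sketch of the controversy calculation as it appeared in Reddit's historically open-sourced ranking code, where the vote total is raised to the power of the minority-to-majority ratio. Because of vote fuzzing, exact up and down counts are rarely available to API clients, so this is mainly useful for understanding the ordering.

```python
def controversy(ups: int, downs: int) -> float:
    """Total votes raised to the balance between the two sides."""
    if ups <= 0 or downs <= 0:
        return 0.0
    magnitude = ups + downs
    balance = downs / ups if ups > downs else ups / downs
    return float(magnitude ** balance)


print(controversy(50, 50))                        # 100.0
print(controversy(50, 50) > controversy(90, 10))  # True: an even split ranks higher
print(controversy(100, 0))                        # 0.0
```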
66
+
67
+ ## Karma System
68
+
69
+ ### Post Karma
70
+ - Earned from upvotes on link posts and text posts
71
+ - Not 1:1 with score — Reddit applies diminishing returns (approximately logarithmic)
72
+ - Subreddit-specific karma thresholds gate posting privileges in many communities
73
+
74
+ ### Comment Karma
75
+ - Earned from upvotes on comments
76
+ - Also subject to diminishing returns at scale
77
+ - Many subreddits require minimum comment karma for participation
78
+
79
+ ### Award Karma
80
+ - Earned when posts/comments receive premium awards
81
+ - Different awards grant different karma amounts
82
+ - Awards visible as icons on the post — a signal of high perceived value
83
+
84
+ ### Karma Relevance to Trend Detection
85
+ - High-karma accounts posting about a topic is a stronger signal than low-karma accounts
86
+ - Sudden karma spikes on a topic across multiple users indicate organic interest
87
+ - Accounts with zero/very low karma posting identical content suggest coordinated campaigns
88
+
89
+ ## Post Types
90
+
91
+ ### Link Posts
92
+ - External URL submissions — the classic Reddit post type
93
+ - Engagement measured by: score, upvote_ratio, num_comments, crossposts
94
+ - A link appearing in multiple subreddits simultaneously is a strong breakout signal
95
+
96
+ ### Text Posts (Self Posts)
97
+ - User-written content with a title and body (supports Markdown)
98
+ - Often longer-form discussion starters in niche communities
99
+ - Text posts that generate high comment-to-score ratios indicate active discussion
100
+
101
+ ### Image & Video Posts
102
+ - Hosted on Reddit's own media infrastructure (i.redd.it, v.redd.it)
103
+ - Image posts in meme-oriented subreddits can go viral in under an hour
104
+ - Video view counts are available but not exposed in the standard API
105
+
106
+ ### Crossposts
107
+ - A post shared from one subreddit to another, maintaining a link to the original
108
+ - Crosspost chains are a primary indicator of content spreading across communities
109
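+ - Crossposts carry `crosspost_parent` and `crosspost_parent_list` fields in the post object; `num_crossposts` on the original counts how many times it has been crossposted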
110
+
111
+ ### Polls
112
+ - Reddit-native poll posts — votes are anonymized and results visible after voting
113
+ - High participation polls indicate community engagement with the topic
114
+
115
+ ### Live Threads & Talk Posts
116
+ - Real-time discussion formats for breaking events
117
+ - Creation of a live thread for a topic signals perceived significance
118
+
119
+ ## Subreddit Metadata
120
+
121
+ ### Key Fields for Monitoring
122
+ - `subscribers` — Total subscriber count (used for normalizing engagement)
123
+ - `active_user_count` — Users online now (available via `GET /r/{subreddit}/about`)
124
+ - `created_utc` — Subreddit creation date (older = more established baseline)
125
+ - `public_description` — Community self-description and scope
126
+
127
+ ### Subreddit Size Tiers
128
+
129
+ | Tier | Subscribers | Characteristics |
130
+ |------|------------|-----------------|
131
+ | Mega | >10M | Front-page feeder; high noise, fast velocity |
132
+ | Large | 1M-10M | Established communities; reliable trend signals |
133
+ | Medium | 100K-1M | Niche but active; early signals before mega-subs |
134
+ | Small | 10K-100K | Specialist communities; high signal-to-noise for domain topics |
135
+ | Micro | <10K | Very niche; useful for domain-expert sentiment only |
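The tier table translates into a small lookup. Which tier a boundary value such as exactly 1M belongs to is an assumption here, since the table's ranges overlap at their edges.

```python
def size_tier(subscribers: int) -> str:
    """Map a subscriber count to the tier names in the table above."""
    if subscribers > 10_000_000:
        return "Mega"
    if subscribers >= 1_000_000:
        return "Large"
    if subscribers >= 100_000:
        return "Medium"
    if subscribers >= 10_000:
        return "Small"
    return "Micro"


print(size_tier(25_000_000))  # Mega
print(size_tier(250_000))     # Medium
print(size_tier(4_200))       # Micro
```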
136
+
137
+ ### Flair System
138
+ - Post flairs categorize content within a subreddit (e.g., "Discussion", "News", "OC")
139
+ - User flairs indicate community standing or expertise
140
+ - Flair-based filtering: `GET /r/{subreddit}/search?q=flair:News` — narrows monitoring to specific content types
package/manifest.json ADDED
@@ -0,0 +1,26 @@
1
+ {
2
+ "name": "@botlearn/reddit-tracker",
3
+ "version": "0.1.0",
4
+ "description": "Reddit community monitoring, engagement velocity tracking, cross-subreddit correlation, and early trend detection for OpenClaw Agent",
5
+ "category": "information-retrieval",
6
+ "author": "BotLearn",
7
+ "benchmarkDimension": "information-retrieval",
8
+ "expectedImprovement": 30,
9
+ "dependencies": {},
10
+ "compatibility": {
11
+ "openclaw": ">=0.5.0"
12
+ },
13
+ "files": {
14
+ "skill": "skill.md",
15
+ "knowledge": [
16
+ "knowledge/domain.md",
17
+ "knowledge/best-practices.md",
18
+ "knowledge/anti-patterns.md"
19
+ ],
20
+ "strategies": [
21
+ "strategies/main.md"
22
+ ],
23
+ "smokeTest": "tests/smoke.json",
24
+ "benchmark": "tests/benchmark.json"
25
+ }
26
+ }
package/package.json ADDED
@@ -0,0 +1,35 @@
1
+ {
2
+ "name": "@botlearn/reddit-tracker",
3
+ "version": "0.1.0",
4
+ "description": "Reddit community monitoring, engagement velocity tracking, cross-subreddit correlation, and early trend detection for OpenClaw Agent",
5
+ "type": "module",
6
+ "main": "manifest.json",
7
+ "files": [
8
+ "manifest.json",
9
+ "skill.md",
10
+ "knowledge/",
11
+ "strategies/",
12
+ "tests/",
13
+ "README.md"
14
+ ],
15
+ "keywords": [
16
+ "botlearn",
17
+ "openclaw",
18
+ "skill",
19
+ "information-retrieval"
20
+ ],
21
+ "author": "BotLearn",
22
+ "license": "MIT",
23
+ "repository": {
24
+ "type": "git",
25
+ "url": "https://github.com/readai-team/botlearn-awesome-skills.git",
26
+ "directory": "packages/skills/reddit-tracker"
27
+ },
28
+ "homepage": "https://github.com/readai-team/botlearn-awesome-skills/tree/main/packages/skills/reddit-tracker",
29
+ "bugs": {
30
+ "url": "https://github.com/readai-team/botlearn-awesome-skills/issues"
31
+ },
32
+ "publishConfig": {
33
+ "access": "public"
34
+ }
35
+ }
package/skill.md ADDED
@@ -0,0 +1,43 @@
1
+ ---
2
+ name: reddit-tracker
3
+ role: Reddit Trend Analyst
4
+ version: 1.0.0
5
+ triggers:
6
+ - "reddit"
7
+ - "subreddit"
8
+ - "trending"
9
+ - "community"
10
+ - "hotspot"
11
+ - "reddit trends"
12
+ - "what's hot on reddit"
13
+ ---
14
+
15
+ # Role
16
+
17
+ You are a Reddit Trend Analyst. When activated, you monitor subreddit activity, track engagement velocity, correlate signals across communities, and predict emerging trends up to 24 hours before they peak on Reddit's front page.
18
+
19
+ # Capabilities
20
+
21
+ 1. Monitor targeted subreddits for rising posts by tracking upvote velocity, comment acceleration, and award density within configurable time windows
22
+ 2. Detect emerging trends by computing engagement velocity curves and comparing them against historical breakout patterns for the target subreddit
23
+ 3. Correlate cross-subreddit signals — identify when the same topic, URL, or narrative surfaces independently in multiple communities simultaneously
24
+ 4. Analyze community sentiment by evaluating comment tone distribution, controversial-flag ratios, and top-comment polarity within trending threads
25
+ 5. Predict trend trajectories by combining velocity metrics, cross-community spread rate, and temporal posting patterns to forecast peak timing and reach
26
+
27
+ # Constraints
28
+
29
+ 1. Never treat raw upvote counts as a reliable quality or trend signal — always normalize by subreddit size, post age, and historical baseline
30
+ 2. Never ignore Reddit's vote fuzzing — reported scores are approximate; rely on velocity and rank changes rather than absolute numbers
31
+ 3. Never conflate karma farming or repost surges with organic trend emergence — check for duplicate URLs, bot account patterns, and artificial award clustering
32
+ 4. Always distinguish between subreddit-local trends (only relevant within a niche community) and cross-community breakouts (genuine broad interest)
33
+ 5. Always account for time-zone posting patterns — a post's velocity must be interpreted relative to the subreddit's peak-activity hours
34
+
35
+ # Activation
36
+
37
+ WHEN the user requests Reddit monitoring, trend detection, or community analysis:
38
+ 1. Identify the target subreddits, topics, or keywords from the user's request
39
+ 2. Execute the monitoring strategy from strategies/main.md
40
+ 3. Apply Reddit platform knowledge from knowledge/domain.md for correct API usage and scoring interpretation
41
+ 4. Evaluate signals using velocity and correlation methods from knowledge/best-practices.md
42
+ 5. Verify findings against known pitfalls in knowledge/anti-patterns.md
43
+ 6. Output a trend report with confidence scores, predicted peak timing, and supporting evidence
package/strategies/main.md ADDED
@@ -0,0 +1,117 @@
1
+ ---
2
+ strategy: reddit-tracker
3
+ version: 1.0.0
4
+ steps: 6
5
+ ---
6
+
7
+ # Reddit Tracker Strategy
8
+
9
+ ## Step 1: Target Identification & Scope Definition
10
+ - Parse the user's request to identify: **target subreddits**, **topics/keywords**, **time horizon**, and **desired output** (trend report, sentiment summary, breakout alerts)
11
+ - IF specific subreddits are named THEN add them to the watch list directly
12
+ - IF only a topic or domain is given THEN identify the top 3-5 relevant subreddits by:
13
+ - Searching Reddit for the topic and noting which subreddits surface most frequently
14
+ - Checking subreddit recommendation resources and community directories
15
+ - Including both mega-subs (broad reach) and niche subs (early signal) for the topic
16
+ - Determine monitoring mode:
17
+ - **Snapshot** — One-time scan of current state (default if user asks "what's trending")
18
+ - **Watch** — Continuous monitoring over a specified time window (if user asks "alert me" or "track")
19
+ - Record each subreddit's subscriber count and current active user count as baseline context from knowledge/domain.md
20
+
21
+ ## Step 2: Data Collection & Velocity Tracking
22
+ - For each target subreddit, collect posts from three listing endpoints:
23
+ - `/rising` — Primary signal source for early breakout detection
24
+ - `/new` — Catch posts in the Seed phase that rising has not yet surfaced
25
+ - `/hot` — Baseline for what the community's algorithm already considers trending
26
+ - For each collected post, record the initial engagement snapshot:
27
+ - `score`, `upvote_ratio`, `num_comments`, `total_awards_received`, `created_utc`, `num_crossposts`
28
+ - IF monitoring mode is Watch THEN sample again after 15 minutes and 60 minutes to calculate velocity:
29
+ - Upvote velocity: `(score_t2 - score_t1) / elapsed_minutes`
30
+ - Comment velocity: `(comments_t2 - comments_t1) / elapsed_minutes`
31
+ - Award velocity: awards received per hour
32
+ - Normalize all velocities using subreddit size and time-of-day baselines from knowledge/best-practices.md
33
+ - Flag posts that exceed breakout detection thresholds:
34
+ - Upvote velocity > 2x the subreddit's 90th-percentile for the post's age
35
+ - Comment velocity > 3x the subreddit's median within the first hour
36
+ - 2 or more awards within the first 30 minutes (in communities where awards are uncommon)
37
+
38
+ ## Step 3: Cross-Community Correlation
39
+ - Aggregate flagged posts and rising content across all monitored subreddits
40
+ - Detect cross-subreddit topic convergence using:
41
+ - **URL deduplication** — Same link appearing in multiple subreddits (crosspost or independent)
42
+ - **Keyword overlap** — Same key terms in post titles across different communities within a 6-hour window
43
+ - **Entity matching** — Same named entities (people, companies, events) surfacing independently
44
+ - For each correlated topic, calculate the cross-community spread score:
45
+ - `spread_score = (num_subreddits / expected_subreddits) * avg_normalized_velocity * diversity_factor`
46
+ - `diversity_factor` increases when subreddits span different categories
47
+ - Classify the propagation pattern from knowledge/best-practices.md:
48
+ - Hub-and-Spoke (top-down from mega-sub)
49
+ - Grassroots (independent emergence in small subs — highest prediction value)
50
+ - Cascade (sequential community hopping)
51
+ - Synchronized (simultaneous appearance suggesting external trigger)
52
+ - IF a Grassroots pattern is detected THEN elevate the trend's priority, because grassroots signals have the highest prediction value and can precede breakouts by about 24 hours
53
+
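A small sketch of the `spread_score` formula from Step 3 is shown here. The formula itself comes from the step above, but the way `diversity_factor` grows (1.0 plus 0.25 per additional distinct category) is purely an assumption made for illustration, as are the function names and the category labels.

```python
from typing import Sequence


def diversity_factor(categories: Sequence[str]) -> float:
    """Hypothetical rule: 1.0 for one category, +0.25 per extra distinct category."""
    distinct = len(set(categories))
    return 1.0 + 0.25 * max(distinct - 1, 0)


def spread_score(num_subreddits: int,
                 expected_subreddits: float,
                 normalized_velocities: Sequence[float],
                 categories: Sequence[str]) -> float:
    """(num_subreddits / expected_subreddits) * avg_normalized_velocity * diversity_factor."""
    if expected_subreddits <= 0 or not normalized_velocities:
        return 0.0
    avg_velocity = sum(normalized_velocities) / len(normalized_velocities)
    return (num_subreddits / expected_subreddits) * avg_velocity * diversity_factor(categories)


# Four subreddits where two were expected, spanning three categories.
score = spread_score(
    num_subreddits=4,
    expected_subreddits=2,
    normalized_velocities=[1.0, 2.0, 3.0, 2.0],
    categories=["tech", "tech", "gaming", "news"],
)
print(score)  # 6.0
```

How `expected_subreddits` is chosen is not defined in this step, so it is left as a caller-supplied value.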
54
+ ## Step 4: Sentiment Analysis & Discussion Quality
55
+ - For each flagged trending post, analyze the comment section:
56
+ - Sample top 20 comments by "best" ranking (Wilson score)
57
+ - Classify each comment's sentiment: positive, negative, neutral, mixed
58
+ - Calculate the sentiment distribution ratio for the thread
59
+ - Assess discussion quality using engagement quality tiers from knowledge/best-practices.md:
60
+ - **Deep** — Long comments, citations, debate threads (genuine interest)
61
+ - **Reactive** — Short affirmations, emojis, memes (viral but shallow)
62
+ - **Hostile** — Insults, reports, mod intervention (controversial, unreliable signal)
63
+ - **Astroturfed** — Identical phrasing, new accounts, coordinated timing (exclude from analysis)
64
+ - Check for community reaction markers:
65
+ - Top comment alignment with post (endorsement vs. contradiction)
66
+ - Moderator intervention (pinned comments, post locks, flair changes)
67
+ - Controversial flag density in the comment tree
68
+ - IF astroturfing indicators are detected THEN flag the trend and discount its metrics per knowledge/anti-patterns.md
69
+ - IF top comment contradicts the post THEN note the divergence between post score and community consensus
70
+
71
+ ## Step 5: Trend Prediction & Confidence Scoring
72
+ - For each detected trend, determine its lifecycle phase:
73
+ - **Seed** (0-15 min, low/erratic velocity) — Too early for reliable prediction
74
+ - **Ignition** (15-60 min, sharp acceleration) — Primary prediction window
75
+ - **Surge** (1-4 hours, sustained high velocity) — Confirmed breakout
76
+ - **Peak** (4-12 hours, velocity plateau then decline) — Maximum reach achieved
77
+ - **Decay** (12-48 hours, declining velocity) — Trend exhaustion
78
+ - Calculate a composite trend confidence score (0-100):
79
+ - Velocity strength (30%): How far above baseline thresholds is the engagement velocity?
80
+ - Cross-community spread (25%): How many independent subreddits have surfaced this topic?
81
+ - Sentiment alignment (15%): Is community sentiment consistent and positive/engaged?
82
+ - Discussion quality (15%): Is engagement deep and organic, or shallow and potentially inauthentic?
83
+ - Temporal fit (15%): Is the timing consistent with the subreddit's peak activity patterns?
84
+ - Estimate peak timing:
85
+ - IF in Ignition phase THEN predict peak in 3-8 hours (adjusted by subreddit size and time-of-day)
86
+ - IF in Surge phase THEN predict peak in 1-4 hours
87
+ - IF already at Peak THEN report as "currently peaking" with estimated decay onset
88
+ - SELF-CHECK against anti-patterns from knowledge/anti-patterns.md:
89
+ - Is this a repost surge rather than a genuine trend?
90
+ - Could vote manipulation be inflating the signal?
91
+ - Is a megathread absorbing individual post activity?
92
+ - Are bot accounts driving the engagement?
93
+ - IF any check flags a concern THEN reduce confidence score by 15-30 points and note the risk factor
94
+
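The phase boundaries and weights in Step 5 could be combined along these lines. This is a sketch under several assumptions: the phase is inferred from post age alone (the step also considers the shape of the velocity curve), boundaries are treated as half-open intervals, each component is assumed to be pre-scaled to the range 0 to 1, and all names are hypothetical.

```python
from typing import Dict

# Weights as listed in Step 5; they sum to 1.0.
WEIGHTS = {
    "velocity_strength": 0.30,
    "cross_community_spread": 0.25,
    "sentiment_alignment": 0.15,
    "discussion_quality": 0.15,
    "temporal_fit": 0.15,
}


def phase_for_age(age_minutes: float) -> str:
    """Map post age to a lifecycle phase (age-only simplification)."""
    if age_minutes < 15:
        return "Seed"
    if age_minutes < 60:
        return "Ignition"
    if age_minutes < 4 * 60:
        return "Surge"
    if age_minutes < 12 * 60:
        return "Peak"
    return "Decay"


def confidence_score(components: Dict[str, float], penalty: float = 0.0) -> float:
    """Weighted 0-100 score, minus a self-check penalty, clamped to [0, 100]."""
    raw = 100 * sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
    return max(0.0, min(100.0, raw - penalty))


components = {name: 0.8 for name in WEIGHTS}
print(phase_for_age(45))                          # Ignition
print(round(confidence_score(components), 1))     # 80.0
print(round(confidence_score(components, 20), 1)) # 60.0
```

The `penalty` argument corresponds to the 15-30 point reduction applied when a self-check flags a concern; choosing a value inside that range is left to the caller.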
95
+ ## Step 6: Report Generation & Output
96
+ - Present findings in a structured trend report:
97
+ - **Trend Summary** — One-sentence description of the detected trend
98
+ - **Confidence** — Score (0-100) with label: Low (<40), Medium (40-69), High (>=70)
99
+ - **Phase** — Current lifecycle phase and estimated time to peak
100
+ - **Scope** — Subreddits involved, propagation pattern, spread score
101
+ - **Velocity Metrics** — Key engagement rates with normalization context
102
+ - **Sentiment** — Community sentiment distribution and quality tier
103
+ - **Evidence** — Links to the top 3-5 posts driving the trend, with per-post metrics
104
+ - **Risk Factors** — Any detected anti-patterns or confidence-reducing signals
105
+ - **Recommendation** — Actionable advice: monitor, act now, or wait for confirmation
106
+ - IF multiple trends are detected THEN rank by confidence score descending and present as a prioritized list
107
+ - IF monitoring mode is Watch THEN specify the next recommended check time based on the trend phase:
108
+ - Seed phase → recheck in 15 minutes
109
+ - Ignition phase → recheck in 30 minutes
110
+ - Surge phase → recheck in 1 hour
111
+ - Peak or Decay → recheck in 4 hours, or stop monitoring
112
+ - SELF-CHECK output completeness:
113
+ - Does every trend have a confidence score and lifecycle phase?
114
+ - Are raw metrics accompanied by normalization context?
115
+ - Is the scope correctly qualified (subreddit-local vs. cross-community)?
116
+ - Are risk factors and anti-pattern warnings included where applicable?
117
+ - IF any check fails THEN revise the report before delivering
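The ranking, labelling, and recheck rules in Step 6 can be illustrated with a short sketch. The dictionary keys (`summary`, `confidence`, `phase`) and the function names are hypothetical; the label thresholds and recheck intervals are taken from the step above, and Peak and Decay are both given the 4-hour recheck option.

```python
from typing import Dict, List

# Recheck intervals in minutes, per lifecycle phase.
RECHECK_MINUTES = {"Seed": 15, "Ignition": 30, "Surge": 60, "Peak": 240, "Decay": 240}


def confidence_label(score: float) -> str:
    """Low (<40), Medium (40-69), High (>=70)."""
    if score >= 70:
        return "High"
    if score >= 40:
        return "Medium"
    return "Low"


def prioritize(trends: List[Dict]) -> List[Dict]:
    """Sort trends by confidence (descending) and attach label and recheck time."""
    ranked = sorted(trends, key=lambda t: t["confidence"], reverse=True)
    return [
        {
            **trend,
            "label": confidence_label(trend["confidence"]),
            "recheck_in_minutes": RECHECK_MINUTES[trend["phase"]],
        }
        for trend in ranked
    ]


report = prioritize([
    {"summary": "Topic A", "confidence": 35, "phase": "Seed"},
    {"summary": "Topic B", "confidence": 82, "phase": "Surge"},
    {"summary": "Topic C", "confidence": 55, "phase": "Ignition"},
])
for entry in report:
    print(entry["summary"], entry["label"], entry["recheck_in_minutes"])
```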
@@ -0,0 +1,486 @@
1
+ {
2
+ "version": "0.0.1",
3
+ "dimension": "information-retrieval",
4
+ "tasks": [
5
+ {
6
+ "id": "bench-easy-01",
7
+ "difficulty": "easy",
8
+ "description": "Identify the current top trending topic in a single subreddit",
9
+ "input": "What is the hottest topic on r/technology right now? Give me the top trending post with its engagement metrics.",
10
+ "rubric": [
11
+ {
12
+ "criterion": "Data Retrieval",
13
+ "weight": 0.4,
14
+ "scoring": {
15
+ "5": "Retrieves current hot/rising posts from r/technology with score, upvote_ratio, num_comments, and post age",
16
+ "3": "Retrieves top posts but missing some engagement metrics",
17
+ "1": "Provides a general description without specific post data",
18
+ "0": "No data retrieved from the subreddit"
19
+ }
20
+ },
21
+ {
22
+ "criterion": "Metric Interpretation",
23
+ "weight": 0.3,
24
+ "scoring": {
25
+ "5": "Explains what the metrics mean in context — normalizes by subreddit size, notes velocity, identifies lifecycle phase",
26
+ "3": "Reports metrics but without normalization or velocity context",
27
+ "1": "Raw numbers only with no interpretation",
28
+ "0": "No metrics provided"
29
+ }
30
+ },
31
+ {
32
+ "criterion": "Output Quality",
33
+ "weight": 0.3,
34
+ "scoring": {
35
+ "5": "Structured output with post title, link, metrics, and a brief assessment of why it is trending",
36
+ "3": "Post identified with partial metrics",
37
+ "1": "Vague description of trending content",
38
+ "0": "No usable output"
39
+ }
40
+ }
41
+ ],
42
+ "expectedScoreWithout": 35,
43
+ "expectedScoreWith": 75
44
+ },
45
+ {
46
+ "id": "bench-easy-02",
47
+ "difficulty": "easy",
48
+ "description": "Check a subreddit's current activity level relative to its baseline",
49
+ "input": "Is r/worldnews more active than usual right now? Check the current activity level and compare it to what you'd expect for this time of day.",
50
+ "rubric": [
51
+ {
52
+ "criterion": "Baseline Awareness",
53
+ "weight": 0.4,
54
+ "scoring": {
55
+ "5": "References subreddit subscriber count and typical active user count; compares current activity to expected hourly baseline for this time of day and day of week",
56
+ "3": "Notes current activity level but baseline comparison is approximate or missing temporal context",
57
+ "1": "Reports current active users without any baseline reference",
58
+ "0": "No activity level assessment"
59
+ }
60
+ },
61
+ {
62
+ "criterion": "Data Collection",
63
+ "weight": 0.3,
64
+ "scoring": {
65
+ "5": "Checks active user count, recent post frequency in /new, and comment rates to triangulate activity level",
66
+ "3": "Checks one or two activity indicators",
67
+ "1": "Only mentions subscriber count without current activity data",
68
+ "0": "No data collected"
69
+ }
70
+ },
71
+ {
72
+ "criterion": "Contextual Output",
73
+ "weight": 0.3,
74
+ "scoring": {
75
+ "5": "Clear verdict (above/below/at baseline) with specific numbers and time-of-day context; notes if an event may be driving unusual activity",
76
+ "3": "Provides a verdict but without supporting numbers or temporal context",
77
+ "1": "Vague assessment without data",
78
+ "0": "No assessment provided"
79
+ }
80
+ }
81
+ ],
82
+ "expectedScoreWithout": 30,
83
+ "expectedScoreWith": 70
84
+ },
85
+ {
86
+ "id": "bench-easy-03",
87
+ "difficulty": "easy",
88
+ "description": "Find relevant subreddits for a given topic",
89
+ "input": "What are the most relevant subreddits for tracking developments in artificial intelligence? List them with subscriber counts and typical activity levels.",
90
+ "rubric": [
91
+ {
92
+ "criterion": "Subreddit Discovery",
93
+ "weight": 0.4,
94
+ "scoring": {
95
+ "5": "Identifies 5+ relevant subreddits spanning different aspects of AI (research, news, applications, ethics) including both mega-subs and niche communities",
96
+ "3": "Identifies 3-4 relevant subreddits but misses niche or specialized communities",
97
+ "1": "Only identifies 1-2 obvious subreddits (e.g., r/artificial)",
98
+ "0": "No relevant subreddits identified"
99
+ }
100
+ },
101
+ {
102
+ "criterion": "Community Profiling",
103
+ "weight": 0.3,
104
+ "scoring": {
105
+ "5": "Each subreddit includes: subscriber count, size tier, activity level, content focus, and typical post types",
106
+ "3": "Subscriber counts and basic descriptions but missing activity characterization",
107
+ "1": "Names only without metadata",
108
+ "0": "No profiling"
109
+ }
110
+ },
111
+ {
112
+ "criterion": "Strategic Value Assessment",
113
+ "weight": 0.3,
114
+ "scoring": {
115
+ "5": "Explains which subreddits are best for early trend detection vs. broad coverage vs. expert analysis; recommends a monitoring set",
116
+ "3": "Lists subreddits but doesn't differentiate their strategic value for monitoring",
117
+ "1": "Flat list with no monitoring strategy advice",
118
+ "0": "No strategic guidance"
119
+ }
120
+ }
121
+ ],
122
+ "expectedScoreWithout": 35,
123
+ "expectedScoreWith": 80
124
+ },
125
+ {
126
+ "id": "bench-med-01",
127
+ "difficulty": "medium",
128
+ "description": "Detect cross-subreddit topic convergence for a breaking event",
129
+ "input": "A major tech company just announced a significant product. Check r/technology, r/gadgets, r/apple, and r/Android to see if this topic is spreading across communities. Identify the topic, trace where it originated, and assess the cross-community spread pattern.",
130
+ "rubric": [
131
+ {
132
+ "criterion": "Cross-Subreddit Detection",
133
+ "weight": 0.3,
134
+ "scoring": {
135
+ "5": "Scans all 4 subreddits for overlapping topics using URL matching, keyword clustering, and entity co-occurrence; identifies the converging topic with specific evidence",
136
+ "3": "Checks multiple subreddits and finds overlap but uses only one correlation method",
137
+ "1": "Checks subreddits independently without systematic cross-correlation",
138
+ "0": "No cross-subreddit analysis"
139
+ }
140
+ },
141
+ {
142
+ "criterion": "Origin Tracing",
143
+ "weight": 0.25,
144
+ "scoring": {
145
+ "5": "Identifies the earliest post by timestamp, traces the crosspost chain, determines which subreddit served as the origin, and maps the propagation path",
146
+ "3": "Identifies the likely origin subreddit but doesn't trace the full propagation path",
147
+ "1": "Mentions where the topic appeared but doesn't determine origin",
148
+ "0": "No origin analysis"
149
+ }
150
+ },
151
+ {
152
+ "criterion": "Spread Pattern Classification",
153
+ "weight": 0.25,
154
+ "scoring": {
155
+ "5": "Correctly classifies the propagation pattern (hub-and-spoke, grassroots, cascade, or synchronized) with evidence; calculates spread score",
156
+ "3": "Describes the spread qualitatively but doesn't formally classify the pattern",
157
+ "1": "Notes that the topic appears in multiple places without pattern analysis",
158
+ "0": "No spread analysis"
159
+ }
160
+ },
161
+ {
162
+ "criterion": "Report Completeness",
163
+ "weight": 0.2,
164
+ "scoring": {
165
+ "5": "Structured report with topic summary, origin, spread map, per-subreddit engagement metrics, and prediction of further spread",
166
+ "3": "Reports the topic and affected subreddits but missing metrics or predictions",
167
+ "1": "Partial report with significant gaps",
168
+ "0": "No coherent report"
169
+ }
170
+ }
171
+ ],
172
+ "expectedScoreWithout": 25,
173
+ "expectedScoreWith": 65
174
+ },
175
+ {
176
+ "id": "bench-med-02",
177
+ "difficulty": "medium",
178
+ "description": "Perform velocity-based breakout detection on rising posts",
179
+ "input": "Analyze the rising posts in r/science right now. Identify any posts that are showing breakout velocity — engagement rates significantly above the subreddit's baseline. For each candidate, calculate the velocity metrics and predict whether it will reach the front page.",
180
+ "rubric": [
181
+ {
182
+ "criterion": "Velocity Calculation",
183
+ "weight": 0.3,
184
+ "scoring": {
185
+ "5": "Calculates upvote velocity, comment velocity, and award velocity for rising posts; normalizes by subreddit size (3M+ subscribers) and post age; compares against baseline thresholds",
186
+ "3": "Computes some velocity metrics but normalization is incomplete or thresholds are not referenced",
187
+ "1": "Reports engagement numbers without computing velocity (rate of change over time)",
188
+ "0": "No velocity analysis"
189
+ }
190
+ },
191
+ {
192
+ "criterion": "Breakout Identification",
193
+ "weight": 0.3,
194
+ "scoring": {
195
+ "5": "Flags posts exceeding the 90th-percentile velocity threshold for r/science; classifies each flagged post's lifecycle phase; provides multi-metric confirmation (upvote + comment velocity agreement)",
196
+ "3": "Identifies high-engagement posts but breakout criteria are loosely defined",
197
+ "1": "Lists popular posts without breakout-specific analysis",
198
+ "0": "No breakout detection"
199
+ }
200
+ },
201
+ {
202
+ "criterion": "Prediction Quality",
203
+ "weight": 0.25,
204
+ "scoring": {
205
+ "5": "Provides front-page probability estimate with reasoning based on velocity curve phase, time-of-day, and subreddit-specific breakout history; includes predicted peak timing",
206
+ "3": "Makes a prediction but without detailed supporting analysis",
207
+ "1": "Vague speculation about whether posts might trend",
208
+ "0": "No predictions made"
209
+ }
210
+ },
211
+ {
212
+ "criterion": "Anti-Pattern Awareness",
213
+ "weight": 0.15,
214
+ "scoring": {
215
+ "5": "Checks for vote manipulation signals, repost conflation, and bot activity; explicitly notes if any flagged posts have suspicious patterns",
216
+ "3": "Mentions the possibility of manipulation but doesn't perform specific checks",
217
+ "1": "No consideration of manipulation or false signals",
218
+ "0": "Analysis is likely skewed by undetected anti-patterns"
219
+ }
220
+ }
221
+ ],
222
+ "expectedScoreWithout": 20,
223
+ "expectedScoreWith": 65
224
+ },
225
+ {
226
+ "id": "bench-med-03",
227
+ "difficulty": "medium",
228
+ "description": "Analyze community sentiment on a controversial topic",
229
+ "input": "There's a heated discussion about AI regulation in r/technology and r/MachineLearning. Analyze the sentiment in both communities. I need to know: Are the communities aligned or divided? What are the dominant viewpoints? Is the discussion quality genuine or astroturfed?",
230
+ "rubric": [
231
+ {
232
+ "criterion": "Sentiment Analysis",
233
+ "weight": 0.3,
234
+ "scoring": {
235
+ "5": "Analyzes comment sentiment distribution (positive/negative/neutral/mixed) in both subreddits; compares sentiment between communities; identifies dominant viewpoints with supporting quotes",
236
+ "3": "Provides sentiment assessment for both subreddits but without systematic distribution analysis",
237
+ "1": "Vague characterization of sentiment without evidence",
238
+ "0": "No sentiment analysis"
239
+ }
240
+ },
241
+ {
242
+ "criterion": "Cross-Community Comparison",
243
+ "weight": 0.25,
244
+ "scoring": {
245
+ "5": "Compares r/technology (general audience) vs r/MachineLearning (practitioner audience) sentiment explicitly; identifies where they agree and diverge; explains the divergence in terms of community composition",
246
+ "3": "Notes differences between communities but doesn't explain the reasons",
247
+ "1": "Treats both communities as a single unit",
248
+ "0": "No comparison"
249
+ }
250
+ },
251
+ {
252
+ "criterion": "Discussion Quality Assessment",
253
+ "weight": 0.25,
254
+ "scoring": {
255
+ "5": "Evaluates engagement quality tier (Deep/Reactive/Hostile/Astroturfed) for each community; checks for coordinated messaging, bot accounts, and formulaic comments; provides evidence for quality classification",
256
+ "3": "Comments on discussion quality but without systematic evaluation against quality tiers",
257
+ "1": "No quality assessment beyond basic observation",
258
+ "0": "Assumes all engagement is genuine without checking"
259
+ }
260
+ },
261
+ {
262
+ "criterion": "Output Structure",
263
+ "weight": 0.2,
264
+ "scoring": {
265
+ "5": "Structured comparison report with per-community sentiment breakdown, dominant viewpoints, quality assessment, and a synthesis identifying key points of alignment and divergence",
266
+ "3": "Covers main points but output lacks clear structure",
267
+ "1": "Unstructured observations",
268
+ "0": "No coherent output"
269
+ }
270
+ }
271
+ ],
272
+ "expectedScoreWithout": 25,
273
+ "expectedScoreWith": 65
274
+ },
275
+ {
276
+ "id": "bench-med-04",
277
+ "difficulty": "medium",
278
+ "description": "Detect early signals of an emerging trend before it peaks",
279
+ "input": "I need to detect trends 24 hours before they hit the front page. Scan r/gaming, r/pcgaming, and r/Games for any posts currently in the Ignition or early Surge phase that haven't reached mainstream visibility yet. Focus on posts less than 2 hours old with anomalous engagement velocity.",
280
+ "rubric": [
281
+ {
282
+ "criterion": "Early Detection Methodology",
283
+ "weight": 0.35,
284
+ "scoring": {
285
+ "5": "Scans /new and /rising (not just /hot) for posts under 2 hours old; calculates velocity at this early stage; compares against historical breakout curves for gaming subreddits; correctly focuses on Ignition-phase signals",
286
+ "3": "Checks rising posts but doesn't filter by age or compare against historical breakout patterns",
287
+ "1": "Reports currently hot posts that are already mainstream — misses the early detection goal",
288
+ "0": "No early detection methodology applied"
289
+ }
290
+ },
291
+ {
292
+ "criterion": "Velocity Normalization",
293
+ "weight": 0.25,
294
+ "scoring": {
295
+ "5": "Normalizes velocity by each subreddit's size and current time-of-day baseline; accounts for different content types (game announcements vs. memes vs. discussion posts) in gaming subs",
296
+ "3": "Some normalization but missing important context factors",
297
+ "1": "Reports raw engagement numbers without normalization",
298
+ "0": "No normalization"
299
+ }
300
+ },
301
+ {
302
+ "criterion": "Multi-Signal Confirmation",
303
+ "weight": 0.2,
304
+ "scoring": {
305
+ "5": "Confirms breakout candidates using at least 2 signals (e.g., high upvote velocity + high comment velocity + early awards); notes signal agreement or disagreement",
306
+ "3": "Uses one velocity metric without multi-signal confirmation",
307
+ "1": "Relies on a single raw count (e.g., just upvote count) with no velocity calculation",
308
+ "0": "No signal confirmation"
309
+ }
310
+ },
311
+ {
312
+ "criterion": "Actionable Prediction",
313
+ "weight": 0.2,
314
+ "scoring": {
315
+ "5": "For each candidate, provides predicted peak timing, confidence score, and recommended action (monitor, act, wait); compares against similar past breakouts",
316
+ "3": "Flags candidates as potentially trending but without timing or confidence specifics",
317
+ "1": "Lists posts without prediction or actionability",
318
+ "0": "No predictions"
319
+ }
320
+ }
321
+ ],
322
+ "expectedScoreWithout": 20,
323
+ "expectedScoreWith": 65
324
+ },
325
+ {
326
+ "id": "bench-hard-01",
327
+ "difficulty": "hard",
328
+ "description": "Distinguish organic trends from coordinated manipulation campaigns",
329
+ "input": "I've noticed a sudden surge of posts about a specific cryptocurrency across r/CryptoCurrency, r/technology, r/investing, and r/wallstreetbets. Investigate whether this is organic community interest or a coordinated pump campaign. Analyze posting patterns, account authenticity, comment quality, and engagement metrics to make a determination.",
330
+ "rubric": [
331
+ {
332
+ "criterion": "Account Analysis",
333
+ "weight": 0.3,
334
+ "scoring": {
335
+ "5": "Checks posting accounts for: account age, karma history, post history diversity, recent activity patterns; identifies clusters of new or single-purpose accounts; compares against bot/shill indicators from anti-patterns",
336
+ "3": "Notes some account characteristics but doesn't perform systematic authenticity analysis",
337
+ "1": "Ignores account characteristics entirely",
338
+ "0": "No account analysis"
339
+ }
340
+ },
341
+ {
342
+ "criterion": "Coordination Detection",
343
+ "weight": 0.3,
344
+ "scoring": {
345
+ "5": "Analyzes temporal posting patterns (are posts suspiciously synchronized?); checks for identical or near-identical phrasing across posts/comments; examines whether accounts have coordinated history; identifies talking-point patterns",
346
+ "3": "Notes suspicious timing or phrasing but doesn't perform comprehensive coordination analysis",
347
+ "1": "Treats all posts as independent without checking for coordination",
348
+ "0": "No coordination analysis"
349
+ }
350
+ },
351
+ {
352
+ "criterion": "Organic vs. Inauthentic Verdict",
353
+ "weight": 0.25,
354
+ "scoring": {
355
+ "5": "Provides a clear, evidence-based verdict (organic / likely manipulated / mixed) with confidence level; lists specific evidence supporting the conclusion; quantifies the ratio of authentic vs. suspicious engagement",
356
+ "3": "Provides a verdict but supporting evidence is thin or speculative",
357
+ "1": "Hedges without committing to an assessment despite available evidence",
358
+ "0": "No determination made"
359
+ }
360
+ },
361
+ {
362
+ "criterion": "Cross-Subreddit Pattern Comparison",
363
+ "weight": 0.15,
364
+ "scoring": {
365
+ "5": "Compares the engagement pattern in each subreddit — notes if behavior differs by community (e.g., organic in r/CryptoCurrency but astroturfed in r/technology); explains why different communities may react differently",
366
+ "3": "Compares across subreddits but superficially",
367
+ "1": "Treats all subreddits as a single unit",
368
+ "0": "No cross-subreddit comparison"
369
+ }
370
+ }
371
+ ],
372
+ "expectedScoreWithout": 15,
373
+ "expectedScoreWith": 60
374
+ },
375
+ {
376
+ "id": "bench-hard-02",
377
+ "difficulty": "hard",
378
+ "description": "Full trend lifecycle analysis with prediction accuracy assessment",
379
+ "input": "Track a developing story across r/news, r/worldnews, r/politics, and relevant niche subreddits. I need: (1) origin identification — where did this story first appear on Reddit, (2) full propagation timeline across communities, (3) velocity curves at each stage, (4) sentiment evolution as the story spread, and (5) a prediction of whether it will sustain for 48+ hours or burn out quickly. Include confidence levels for all predictions.",
380
+ "rubric": [
381
+ {
382
+ "criterion": "Origin & Timeline Tracing",
383
+ "weight": 0.25,
384
+ "scoring": {
385
+ "5": "Identifies the earliest Reddit post about the story with exact timestamp; maps the complete propagation timeline showing when each subreddit picked up the topic; identifies crosspost chains and independent discoveries",
386
+ "3": "Identifies origin but timeline is incomplete or approximate",
387
+ "1": "Notes which subreddits discuss the topic but no temporal ordering",
388
+ "0": "No origin or timeline analysis"
389
+ }
390
+ },
391
+ {
392
+ "criterion": "Multi-Subreddit Velocity Analysis",
393
+ "weight": 0.25,
394
+ "scoring": {
395
+ "5": "Provides per-subreddit velocity curves showing engagement over time; identifies lifecycle phase in each community; notes velocity differences between communities and explains them (e.g., r/worldnews peaked before r/politics due to international angle)",
396
+ "3": "Provides velocity data for some subreddits but analysis is inconsistent",
397
+ "1": "Reports final engagement numbers without velocity over time",
398
+ "0": "No velocity analysis"
399
+ }
400
+ },
401
+ {
402
+ "criterion": "Sentiment Evolution",
403
+ "weight": 0.2,
404
+ "scoring": {
405
+ "5": "Tracks how sentiment changed as the story spread — e.g., initial concern in r/news, political framing in r/politics, international perspective in r/worldnews; identifies narrative shifts over time",
406
+ "3": "Provides snapshot sentiment but doesn't track evolution",
407
+ "1": "Single sentiment label for the entire story",
408
+ "0": "No sentiment analysis"
409
+ }
410
+ },
411
+ {
412
+ "criterion": "Sustainability Prediction",
413
+ "weight": 0.2,
414
+ "scoring": {
415
+ "5": "Provides a reasoned prediction of story sustainability (48h+ or burnout) based on: velocity decay rate, cross-community engagement depth, historical comparison with similar stories, and external event dependency; includes confidence level",
416
+ "3": "Makes a prediction but with limited supporting analysis",
417
+ "1": "Vague guess about story longevity",
418
+ "0": "No prediction"
419
+ }
420
+ },
421
+ {
422
+ "criterion": "Report Synthesis",
423
+ "weight": 0.1,
424
+ "scoring": {
425
+ "5": "Coherent narrative connecting all five requested elements; executive summary with key findings and confidence levels",
426
+ "3": "All elements present but not well connected",
427
+ "1": "Fragmented analysis",
428
+ "0": "Incomplete report"
429
+ }
430
+ }
431
+ ],
432
+ "expectedScoreWithout": 15,
433
+ "expectedScoreWith": 55
434
+ },
435
+ {
436
+ "id": "bench-hard-03",
437
+ "difficulty": "hard",
438
+ "description": "Predict an emerging trend 24 hours before peak from weak early signals",
439
+ "input": "You have access to r/startups, r/SaaS, r/Entrepreneur, r/smallbusiness, and r/webdev. Using early signal detection, identify any topics that are currently showing Seed or early Ignition phase characteristics that could become significant trends within 24 hours. I want you to detect trends that have NOT yet appeared on any subreddit's front page. Provide evidence for each prediction and a false-positive assessment.",
440
+ "rubric": [
441
+ {
442
+ "criterion": "Early Signal Detection",
443
+ "weight": 0.3,
444
+ "scoring": {
445
+ "5": "Systematically scans /new and /rising across all 5 subreddits for posts under 1 hour old; identifies anomalous velocity in the Seed/Ignition phase using multi-metric signals; successfully spots at least one genuinely early trend",
446
+ "3": "Checks rising in some subreddits but may miss very early signals in /new; detection criteria are not rigorous",
447
+ "1": "Reports already-visible trends rather than pre-peak signals",
448
+ "0": "No early detection capability demonstrated"
449
+ }
450
+ },
451
+ {
452
+ "criterion": "Prediction Methodology",
453
+ "weight": 0.25,
454
+ "scoring": {
455
+ "5": "Predictions are based on velocity analysis, cross-subreddit convergence checks, temporal pattern matching, and comparison with historical breakout curves; methodology is explicitly stated and repeatable",
456
+ "3": "Predictions have some analytical basis but methodology is not comprehensive",
457
+ "1": "Predictions are speculative without clear analytical grounding",
458
+ "0": "No methodology for prediction"
459
+ }
460
+ },
461
+ {
462
+ "criterion": "False Positive Assessment",
463
+ "weight": 0.25,
464
+ "scoring": {
465
+ "5": "For each flagged trend, provides: confidence score, list of confirming signals, list of potential confounds (repost, bot, niche-only), and explicit false-positive probability estimate; demonstrates awareness that most early signals do NOT become major trends",
466
+ "3": "Acknowledges uncertainty but false-positive assessment is informal",
467
+ "1": "Presents all predictions with equal confidence; no false-positive awareness",
468
+ "0": "No false-positive consideration"
469
+ }
470
+ },
471
+ {
472
+ "criterion": "Evidence & Traceability",
473
+ "weight": 0.2,
474
+ "scoring": {
475
+ "5": "Each prediction is backed by specific posts with links, metrics, and timeline; evidence is sufficient for the user to independently verify the assessment",
476
+ "3": "Some supporting evidence but not enough for independent verification",
477
+ "1": "Claims without supporting evidence",
478
+ "0": "No evidence provided"
479
+ }
480
+ }
481
+ ],
482
+ "expectedScoreWithout": 15,
483
+ "expectedScoreWith": 55
484
+ }
485
+ ]
486
+ }
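The benchmark file above does not say how per-criterion levels (0, 1, 3, 5) are combined into the 0-100 values used by `expectedScoreWithout` and `expectedScoreWith`. One plausible reading, shown here purely as an assumption, is a weighted average of `level / 5` scaled to 100. The `task_score` function and the `levels` mapping are hypothetical.

```python
from typing import Dict, List


def task_score(rubric: List[Dict], levels: Dict[str, int]) -> float:
    """Assumed aggregation: 100 * sum(weight * level / 5) over all criteria."""
    total = 0.0
    for item in rubric:
        level = levels[item["criterion"]]
        # Only levels that the rubric actually defines are accepted.
        if str(level) not in item["scoring"]:
            raise ValueError(f"undefined level {level} for {item['criterion']}")
        total += item["weight"] * level / 5
    return 100 * total


# Criteria and weights copied from bench-easy-01; scoring texts shortened.
rubric = [
    {"criterion": "Data Retrieval", "weight": 0.4, "scoring": {"5": "", "3": "", "1": "", "0": ""}},
    {"criterion": "Metric Interpretation", "weight": 0.3, "scoring": {"5": "", "3": "", "1": "", "0": ""}},
    {"criterion": "Output Quality", "weight": 0.3, "scoring": {"5": "", "3": "", "1": "", "0": ""}},
]
levels = {"Data Retrieval": 5, "Metric Interpretation": 3, "Output Quality": 3}
print(round(task_score(rubric, levels), 1))  # 76.0
```

Under this reading, a run that earns 5, 3, and 3 on bench-easy-01 lands just above that task's `expectedScoreWith` of 75.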
@@ -0,0 +1,54 @@
1
+ {
2
+ "version": "0.0.1",
3
+ "timeout": 60,
4
+ "tasks": [
5
+ {
6
+ "id": "smoke-01",
7
+ "description": "Monitor two technology subreddits for emerging trends and provide a structured trend report with velocity analysis",
8
+ "input": "Monitor r/technology and r/programming for any emerging trends in the last 24 hours. I want to know what topics are gaining unusual traction, which posts are showing breakout velocity, and whether any topics are spreading across both subreddits. Provide confidence scores and predicted peak timing for each trend.",
9
+ "rubric": [
10
+ {
11
+ "criterion": "Subreddit Monitoring & Data Collection",
12
+ "weight": 0.25,
13
+ "scoring": {
14
+ "5": "Checks rising, new, and hot listings for both subreddits; collects engagement metrics (score, comments, awards, upvote_ratio) for candidate posts; notes subreddit subscriber counts for normalization",
15
+ "3": "Checks at least one listing type for both subreddits; collects basic metrics but misses some dimensions",
16
+ "1": "Only checks one subreddit or one listing type; minimal metric collection",
17
+ "0": "No systematic data collection from subreddit listings"
18
+ }
19
+ },
20
+ {
21
+ "criterion": "Velocity Analysis & Trend Detection",
22
+ "weight": 0.3,
23
+ "scoring": {
24
+ "5": "Calculates engagement velocity (upvotes/time, comments/time); normalizes by subreddit size and time-of-day; identifies posts exceeding breakout thresholds; classifies trend lifecycle phase (Seed/Ignition/Surge/Peak/Decay)",
25
+ "3": "Identifies high-engagement posts but velocity calculation is approximate or unnormalized; lifecycle phase mentioned but not rigorously determined",
26
+ "1": "Lists popular posts by raw score without velocity or normalization analysis",
27
+ "0": "No velocity-based analysis; just reports current top posts"
28
+ }
29
+ },
30
+ {
31
+ "criterion": "Cross-Subreddit Correlation",
32
+ "weight": 0.25,
33
+ "scoring": {
34
+ "5": "Detects topics appearing in both subreddits; classifies propagation pattern (grassroots, hub-and-spoke, etc.); calculates spread score; identifies origin subreddit",
35
+ "3": "Notes overlapping topics between subreddits but without formal correlation analysis or propagation classification",
36
+ "1": "Treats each subreddit independently with no cross-community analysis",
37
+ "0": "No awareness of cross-subreddit patterns"
38
+ }
39
+ },
40
+ {
41
+ "criterion": "Report Quality & Actionability",
42
+ "weight": 0.2,
43
+ "scoring": {
44
+ "5": "Structured report with: trend summary, confidence score (0-100), lifecycle phase, predicted peak timing, sentiment snapshot, supporting evidence links, and risk factors; trends ranked by confidence",
45
+ "3": "Report includes trend descriptions and some metrics but missing confidence scores, timing predictions, or risk factors",
46
+ "1": "Unstructured list of trending topics without metrics or predictions",
47
+ "0": "No coherent report format"
48
+ }
49
+ }
50
+ ],
51
+ "passThreshold": 60
52
+ }
53
+ ]
54
+ }
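Every rubric in these task files uses criterion weights that add up to 1.0. A quick consistency check along the following lines could catch a mistyped weight before a benchmark run. The `rubric_weight_errors` function is a hypothetical helper, and the tasks are assumed to be already parsed into dictionaries (for example with `json.load`).

```python
import math
from typing import Dict, List


def rubric_weight_errors(tasks: List[Dict]) -> List[str]:
    """Return one message per task whose criterion weights do not sum to 1.0."""
    errors = []
    for task in tasks:
        total = sum(criterion["weight"] for criterion in task["rubric"])
        if not math.isclose(total, 1.0, abs_tol=1e-9):
            errors.append(f"{task['id']}: weights sum to {total:.2f}")
    return errors


# Weights copied from smoke-01, plus a deliberately broken task for contrast.
tasks = [
    {"id": "smoke-01", "rubric": [{"weight": 0.25}, {"weight": 0.3}, {"weight": 0.25}, {"weight": 0.2}]},
    {"id": "broken-example", "rubric": [{"weight": 0.5}, {"weight": 0.3}]},
]
print(rubric_weight_errors(tasks))
```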