sc-research 1.0.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CLAUDE.md +67 -0
- package/README.md +111 -0
- package/dist/cli.js +671 -0
- package/dist/index.js +6530 -0
- package/dist/test/manual-fixed-links.js +186 -0
- package/dist/visualize.js +371 -0
- package/dist/web/assets/index-FB2oq23H.css +1 -0
- package/dist/web/assets/index-sZb3bzqd.js +44 -0
- package/dist/web/data.json +577 -0
- package/dist/web/index.html +14 -0
- package/dist/web/vite.svg +11 -0
- package/package.json +52 -0
- package/templates/base/commands/controversy.md +28 -0
- package/templates/base/commands/deep-research.md +25 -0
- package/templates/base/commands/discovery.md +27 -0
- package/templates/base/commands/quick.md +14 -0
- package/templates/base/commands/rank.md +27 -0
- package/templates/base/commands/research.md +10 -0
- package/templates/base/commands/sentiment.md +28 -0
- package/templates/base/commands/test-research.md +9 -0
- package/templates/base/commands/trend.md +27 -0
- package/templates/base/commands/visualize.md +14 -0
- package/templates/base/manifest.json +119 -0
- package/templates/base/skills/communities_controversy.md +65 -0
- package/templates/base/skills/communities_discovery.md +55 -0
- package/templates/base/skills/communities_fetch.md +56 -0
- package/templates/base/skills/communities_rank.md +57 -0
- package/templates/base/skills/communities_research_test.md +64 -0
- package/templates/base/skills/communities_sentiment.md +61 -0
- package/templates/base/skills/communities_trend.md +71 -0
- package/templates/base/skills/communities_visualize.md +46 -0
- package/templates/base/skills/using_communities_research.md +146 -0
- package/templates/platforms/agent.json +21 -0
- package/templates/platforms/claude.json +21 -0
- package/templates/platforms/codebuddy.json +21 -0
- package/templates/platforms/codex.json +21 -0
- package/templates/platforms/continue.json +21 -0
- package/templates/platforms/copilot.json +18 -0
- package/templates/platforms/cursor.json +18 -0
- package/templates/platforms/droid.json +21 -0
- package/templates/platforms/gemini.json +21 -0
- package/templates/platforms/kiro.json +18 -0
- package/templates/platforms/opencode.json +21 -0
- package/templates/platforms/qoder.json +21 -0
- package/templates/platforms/roocode.json +18 -0
- package/templates/platforms/trae.json +21 -0
- package/templates/platforms/windsurf.json +18 -0
package/templates/base/commands/controversy.md
@@ -0,0 +1,28 @@
+---
+description: Find controversial debates in existing research data.
+---
+
+1. Check for data existence
+
+> Ensure `reddit_data.json` or `x_data.json` exists in the current directory.
+
+2. Check freshness for the current request
+
+> Confirm raw data matches the requested topic and date window (if provided). If it does not match, re-fetch with **deep research**: `sc-research research:deep "TOPIC"`.
+
+3. Run the controversy skill
+
+> Use the `communities_controversy` skill to identify polarizing topics, extract opposing quotes, and generate `classified_controversy.json`.
+
+4. Validate output schema
+
+> Ensure `classified_controversy.json` includes:
+>
+> - `topic`
+> - `overall_divisiveness`
+> - `controversies` array
+> - each controversy contains `topic`, `heat_score`, `divisiveness`, `side_a`, and `side_b`
+> - each side contains `position`, `supporter_count`, and `sample_quote` with `text`, `author`, `link`
+
+5. Display the results
+> Read `classified_controversy.json` and display controversies with side-by-side opposing views and heat scores.
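The field checklist in step 4 above can be sketched as a small validator. This is an editorial illustration only, not part of the package; the key names come from the checklist, and the authoritative schema remains the `ControversyData` interface in `web/src/types.ts`.

```python
# Required keys taken from the step-4 checklist (illustrative; types.ts is authoritative).
REQUIRED_TOP = {"topic", "overall_divisiveness", "controversies"}
REQUIRED_ITEM = {"topic", "heat_score", "divisiveness", "side_a", "side_b"}
REQUIRED_SIDE = {"position", "supporter_count", "sample_quote"}

def schema_problems(data: dict) -> list:
    """Return human-readable problems; an empty list means the checklist passes."""
    problems = sorted(f"missing top-level key: {k}" for k in REQUIRED_TOP - data.keys())
    for i, item in enumerate(data.get("controversies", [])):
        problems += sorted(f"controversies[{i}] missing: {k}"
                           for k in REQUIRED_ITEM - item.keys())
        for side in ("side_a", "side_b"):
            problems += sorted(f"controversies[{i}].{side} missing: {k}"
                               for k in REQUIRED_SIDE - item.get(side, {}).keys())
    return problems
```
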
package/templates/base/commands/deep-research.md
@@ -0,0 +1,25 @@
+---
+description: Deeply research a topic and route to the best analysis template. This is the default fetch depth for all templates.
+---
+
+1. Run the deep research fetcher
+
+> `sc-research research:deep "ARGUMENTS"` (you may also add `--from=YYYY-MM-DD --to=YYYY-MM-DD` for a custom window)
+
+2. Route to the right analysis mode
+
+> Follow **Intent → Template Routing** rules in `using_communities_research`.
+>
+> - If the user explicitly asks for "full analysis", "everything", or "all views": run all 4 templates in order (`rank` -> `sentiment` -> `trend` -> `controversy`).
+> - Otherwise: pick the SINGLE most suitable template for the user's question.
+
+3. Validate outputs before presenting
+
+> Confirm the generated `classified_*.json` file(s):
+>
+> - Match the requested topic (or close variant of the same topic).
+> - Match requested date window when `--from/--to` was provided.
+> - Include required fields from `web/src/types.ts`.
+
+4. Display the results
+> Present whichever template output(s) were selected after validation.
package/templates/base/commands/discovery.md
@@ -0,0 +1,27 @@
+---
+description: Discover viral topics and emerging themes from existing research data.
+---
+
+1. Check for data existence
+
+> Ensure `reddit_data.json` or `x_data.json` exists in the current directory.
+
+2. Check freshness for the current request
+
+> Confirm raw data matches the requested topic and date window (if provided). If it does not match, re-fetch with **deep research**: `sc-research research:deep "TOPIC" --mode=discovery`.
+
+3. Run the discovery skill
+
+> Use the `communities_discovery` skill to cluster posts by topic and generate `classified_discovery.json`.
+
+4. Validate output schema
+
+> Ensure `classified_discovery.json` includes:
+>
+> - `period`
+> - `total_posts_analyzed`
+> - `trending_topics` array
+> - per topic: `id`, `topic_name`, `description`, `category`, `engagement_score`, `sentiment`, `key_posts`
+
+5. Display the results
+> Read `classified_discovery.json` and display the discovered topics with their engagement scores, sentiment, and top posts.
package/templates/base/commands/quick.md
@@ -0,0 +1,14 @@
+---
+description: Quick research a topic and get a fast text answer (Reddit only, skips X).
+---
+
+1. Run the quick research fetcher (Reddit only)
+> `sc-research research "ARGUMENTS"`
+>
+> This fetches Reddit data only and skips X for speed. Use `/research` if you need both sources.
+
+2. Read the raw data
+> Read `reddit_data.json` only. Ignore `x_data.json` even if it exists from a prior run.
+
+3. Provide a concise answer
+> Synthesize a 3-5 sentence answer directly from the Reddit data. Mention the community favorite, 1-2 alternatives, and a real user quote. Do NOT generate any classified files.
package/templates/base/commands/rank.md
@@ -0,0 +1,27 @@
+---
+description: Rank and classify existing research data.
+---
+
+1. Check for data existence
+
+> Ensure `reddit_data.json` or `x_data.json` exists in the current directory.
+
+2. Check freshness for the current request
+
+> Confirm raw data matches the requested topic and date window (if provided). If it does not match, re-fetch with **deep research**: `sc-research research:deep "TOPIC"`.
+
+3. Run the ranking skill
+
+> Use the `communities_rank` skill to generate `classified_rank.json` from the existing data files.
+
+4. Validate output schema
+
+> Ensure `classified_rank.json` includes:
+>
+> - `topic`
+> - `key_insights` (array)
+> - `products` (array)
+> - per product: `rank`, `name`, `sentiment`, `mentions`, `estimated_engagement_score`, `consensus`, `pros`, `cons`
+
+5. Display the results
+> Read `classified_rank.json` and display key insights plus a summary of the ranked products.
package/templates/base/commands/research.md
@@ -0,0 +1,10 @@
+1. Run the research fetcher
+
+> `sc-research research "ARGUMENTS"` (optionally add `--from=YYYY-MM-DD --to=YYYY-MM-DD` to focus on a specific date range)
+
+2. Analyze the results
+
+> Read the generated `reddit_data.json` and `x_data.json`.
+
+3. Provide an answer
+> Based on the data, provide a concise answer including the Community Favorite, decent alternatives, and a representative quote.
package/templates/base/commands/sentiment.md
@@ -0,0 +1,28 @@
+---
+description: Analyze sentiment from existing research data.
+---
+
+1. Check for data existence
+
+> Ensure `reddit_data.json` or `x_data.json` exists in the current directory.
+
+2. Check freshness for the current request
+
+> Confirm raw data matches the requested topic and date window (if provided). If it does not match, re-fetch with **deep research**: `sc-research research:deep "TOPIC"`.
+
+3. Run the sentiment skill
+
+> Use the `communities_sentiment` skill to analyze the raw data and generate `classified_sentiment.json`.
+
+4. Validate output schema
+
+> Ensure `classified_sentiment.json` includes:
+>
+> - `topic`
+> - `overall_mood`
+> - `distribution` (`very_positive`, `positive`, `mixed`, `negative`)
+> - `by_source` with both `reddit` and `x`
+> - `product_sentiments`
+
+5. Display the results
+> Read `classified_sentiment.json` and display the overall mood, sentiment distribution, and per-product sentiment breakdown.
package/templates/base/commands/trend.md
@@ -0,0 +1,27 @@
+---
+description: Analyze discussion trends over time from existing research data.
+---
+
+1. Check for data existence
+
+> Ensure `reddit_data.json` or `x_data.json` exists in the current directory.
+
+2. Check freshness for the current request
+
+> Confirm raw data matches the requested topic and date window (if provided). If it does not match, re-fetch with **deep research**: `sc-research research:deep "TOPIC"`.
+
+3. Run the trend skill
+
+> Use the `communities_trend` skill to parse dates, choose an adaptive granularity (day/week/month), and generate `classified_trend.json`.
+
+4. Validate output schema
+
+> Ensure `classified_trend.json` includes:
+>
+> - `topic`
+> - `date_range` with `from` and `to`
+> - `timeline` entries with `period`, `post_count`, `total_engagement`, `reddit_posts`, `x_posts`
+> - `key_moments` entries with `date`, `event`, `significance`
+
+5. Display the results
+> Read `classified_trend.json` and display timeline activity, engagement trends, and key moments.
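The "adaptive granularity (day/week/month)" mentioned in the trend command's step 3 could be chosen from the span of the requested window. The thresholds below are illustrative assumptions for this note, not values taken from the package:

```python
from datetime import date

def choose_granularity(start: date, end: date) -> str:
    """Pick a timeline bucket size from the window span (assumed thresholds)."""
    span_days = (end - start).days
    if span_days <= 14:      # up to two weeks: daily buckets
        return "day"
    if span_days <= 120:     # up to about four months: weekly buckets
        return "week"
    return "month"           # anything longer: monthly buckets
```
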
package/templates/base/commands/visualize.md
@@ -0,0 +1,14 @@
+---
+description: Launch the visualization dashboard.
+---
+
+1. Check for classified data
+
+> Ensure at least one `classified_*.json` file exists (e.g., `classified_rank.json`, `classified_sentiment.json`, `classified_trend.json`, `classified_controversy.json`). If none exist, suggest running `/rank`, `/sentiment`, `/trend`, or `/controversy` first.
+
+2. Launch the visualizer
+
+> `sc-research visualize`
+
+3. Inform the user
+> Tell the user to open the URL provided by the command (usually http://localhost:5173). The dashboard will show tabs for each available classified file.
package/templates/base/manifest.json
@@ -0,0 +1,119 @@
+{
+  "version": 1,
+  "templates": [
+    {
+      "id": "communities_fetch",
+      "kind": "skill",
+      "description": "Worker skill that fetches raw discussion data from Reddit and X (Twitter) for a given topic. Returns raw JSON files.",
+      "bodyFile": "skills/communities_fetch.md"
+    },
+    {
+      "id": "communities_rank",
+      "kind": "skill",
+      "description": "Analyze raw social media data (Reddit/X) to produce a ranked, classified report with strict JSON output.",
+      "bodyFile": "skills/communities_rank.md"
+    },
+    {
+      "id": "communities_sentiment",
+      "kind": "skill",
+      "description": "Worker skill that analyzes raw social media data to produce a sentiment breakdown report with strict JSON output.",
+      "bodyFile": "skills/communities_sentiment.md"
+    },
+    {
+      "id": "communities_trend",
+      "kind": "skill",
+      "description": "Worker skill that analyzes raw social media data to produce a trend timeline report with strict JSON output.",
+      "bodyFile": "skills/communities_trend.md"
+    },
+    {
+      "id": "communities_controversy",
+      "kind": "skill",
+      "description": "Worker skill that analyzes raw social media data to identify polarizing topics and produce a controversy map with strict JSON output.",
+      "bodyFile": "skills/communities_controversy.md"
+    },
+    {
+      "id": "communities_discovery",
+      "kind": "skill",
+      "description": "Worker skill that analyzes raw social media data to discover and cluster high-signal emerging topics.",
+      "bodyFile": "skills/communities_discovery.md"
+    },
+    {
+      "id": "communities_visualize",
+      "kind": "skill",
+      "description": "Worker skill that launches a local web dashboard to visualize all available classified research data.",
+      "bodyFile": "skills/communities_visualize.md"
+    },
+    {
+      "id": "communities_research_test",
+      "kind": "skill",
+      "description": "Test the communities research skill with fixed Reddit links (no API key needed). Fetches data and returns JSON for AI classification.",
+      "bodyFile": "skills/communities_research_test.md"
+    },
+    {
+      "id": "using_communities_research",
+      "kind": "skill",
+      "description": "Orchestrator skill that understands user intent and routes to the most suitable analysis template.",
+      "bodyFile": "skills/using_communities_research.md"
+    },
+    {
+      "id": "research",
+      "kind": "command",
+      "description": "Research a topic using the Quick Answer flow.",
+      "bodyFile": "commands/research.md"
+    },
+    {
+      "id": "quick",
+      "kind": "command",
+      "description": "Run quick Reddit-only research flow.",
+      "bodyFile": "commands/quick.md"
+    },
+    {
+      "id": "deep-research",
+      "kind": "command",
+      "description": "Run deep research flow and route to best template.",
+      "bodyFile": "commands/deep-research.md"
+    },
+    {
+      "id": "rank",
+      "kind": "command",
+      "description": "Generate ranking report from current raw data.",
+      "bodyFile": "commands/rank.md"
+    },
+    {
+      "id": "sentiment",
+      "kind": "command",
+      "description": "Generate sentiment report from current raw data.",
+      "bodyFile": "commands/sentiment.md"
+    },
+    {
+      "id": "trend",
+      "kind": "command",
+      "description": "Generate trend timeline report from current raw data.",
+      "bodyFile": "commands/trend.md"
+    },
+    {
+      "id": "controversy",
+      "kind": "command",
+      "description": "Generate controversy map from current raw data.",
+      "bodyFile": "commands/controversy.md"
+    },
+    {
+      "id": "discovery",
+      "kind": "command",
+      "description": "Generate discovery clusters from current raw data.",
+      "bodyFile": "commands/discovery.md"
+    },
+    {
+      "id": "visualize",
+      "kind": "command",
+      "description": "Launch dashboard for available classified data files.",
+      "bodyFile": "commands/visualize.md"
+    },
+    {
+      "id": "test-research",
+      "kind": "command",
+      "description": "Run fixed-link test research pipeline.",
+      "bodyFile": "commands/test-research.md"
+    }
+  ]
+}
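The manifest above is a flat list of templates keyed by `kind` ("skill" or "command"). A consumer could split it with a few lines; this is an editorial sketch assuming only the JSON shape shown in the hunk, not code from the package:

```python
from collections import Counter

def kinds(manifest: dict) -> Counter:
    """Count manifest templates by kind ("skill" vs "command")."""
    return Counter(t["kind"] for t in manifest.get("templates", []))

def body_files(manifest: dict, kind: str) -> list:
    """List bodyFile paths for one kind, e.g. every command template."""
    return [t["bodyFile"] for t in manifest.get("templates", []) if t["kind"] == kind]
```

Against this manifest, `kinds(...)` would report 9 skills and 10 commands.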
package/templates/base/skills/communities_controversy.md
@@ -0,0 +1,65 @@
+---
+name: communities_controversy
+description: Worker skill that analyzes raw social media data to identify polarizing topics and produce a controversy map with strict JSON output.
+---
+
+# Communities Controversy Skill
+
+This is a **Worker Skill** that identifies the most **divisive and argumentative** topics in community discussions. It finds where opinions clash and presents opposing viewpoints side-by-side.
+
+## Input Data
+You will act on existing JSON files in the workspace (fetched by `communities_fetch`):
+- `reddit_data.json`
+- `x_data.json`
+
+## Workflow
+
+### 1. Read All Discussion Content
+Read the full text of every item in both data files. Look for patterns of **disagreement**:
+- Contradictory opinions about the same product/topic
+- "But..." / "However..." / "I disagree" / "overrated" / "overhyped" language
+- Posts where the author explicitly compares and picks sides
+- Topics where Reddit and X communities have opposing views
+
+### 2. Identify Controversies
+Find 3-5 clear controversies. Common patterns to look for:
+- **Quality vs Price debates** — Is the expensive option worth it?
+- **Technology debates** — e.g., planar vs dynamic, wired vs wireless
+- **Platform divides** — Reddit enthusiasts vs X mainstream opinion
+- **Subjective preferences** — EQ vs stock tuning, comfort vs sound quality
+- **Brand loyalty conflicts** — Sennheiser vs Hifiman, etc.
+
+### 3. Score Each Controversy
+For each identified controversy:
+- **heat_score** (1-100): Based on engagement on the divisive posts. Higher engagement on opposing posts = higher heat.
+- **divisiveness**: "High" (near 50/50 split with strong opinions), "Medium" (clear lean but vocal minority), "Low" (mostly agreed but some dissent)
+
+### 4. Extract Opposing Quotes (3 Per Side)
+For each controversy, find **THREE real quotes for each side**:
+- **Side A**: The position + 3 real quotes with author and link
+- **Side B**: The opposing position + 3 real quotes with author and link
+- Count how many commenters support each side (`supporter_count`)
+- The 3 quotes should show different perspectives/arguments within the same side (not just repeating the same point).
+
+### 5. Generate Output (Strict JSON)
+You **MUST** save the result to `classified_controversy.json`.
+
+The output **MUST** strictly follow the `ControversyData` interface defined in `web/src/types.ts`. Read that file first to understand the exact schema.
+
+**Schema source of truth (important):**
+- Do not rely on schema examples in this skill file.
+- Always read `web/src/types.ts` and follow `ControversyData` exactly.
+- If this skill text and `types.ts` ever conflict, `types.ts` wins.
+
+## Critical Rules
+1. **No External Research**: Do not fetch new data. Use only the provided JSON files.
+2. **Strict Schema**: The visualization tool will crash if the schema doesn't match `ControversyData` from `types.ts`.
+3. **Real Quotes Only**: Every entry in `sample_quotes` must be a real excerpt from the data with a real link.
+4. **3 Quotes Per Side**: Always provide 3 quotes per side to show breadth of the argument.
+5. **Genuine Controversies**: Don't manufacture disagreement. Only report controversies that actually exist in the data.
+6. **Output file**: Must be saved as `classified_controversy.json` in the project root.
+7. **Data Resilience**:
+   - If `heat_score` cannot be calculated, set it to `0`.
+   - If `supporter_count` cannot be estimated, set it to `0`.
+   - If no quotes are found, set `sample_quotes` to `[]`.
+   - Ensure all `Array` fields are at least empty arrays `[]` if no data exists, never `null` or `undefined`.
package/templates/base/skills/communities_discovery.md
@@ -0,0 +1,55 @@
+---
+name: communities_discovery
+description: Worker skill that analyzes raw social media data to identify viral topics and generate a discovery report with strict JSON output.
+---
+
+# Communities Discovery Skill
+
+This is a **Worker Skill** that identifies **viral topics and emerging themes** from community discussions. It clusters posts by topic and surfaces the most engaging content.
+
+## Input Data
+You will act on existing JSON files in the workspace (fetched by `communities_fetch`):
+- `reddit_data.json`
+- `x_data.json`
+
+## Workflow
+
+### 1. Read All Posts
+Read every item in both data files. Look for recurring themes, viral posts, and emerging conversations.
+
+### 2. Analyze & Cluster
+- Identify 3-5 distinct viral topics/themes from the combined data.
+- Group relevant posts under each topic.
+- Ignore spam or irrelevant noise.
+
+### 3. Calculate Metrics
+- **Engagement Score**: Sum of (score + comments) for Reddit, (likes + reposts) for X.
+- **Sentiment**: Determine the overall sentiment (`positive`, `negative`, `neutral`, `mixed`) for the topic.
+
+### 4. Extract Highlight Comments
+For each trending topic, extract **3 real highlight comments** from the data:
+- Each comment must be a real excerpt from an actual post or comment in the raw data.
+- Include `author`, `link`, and `platform` ("reddit" or "x") for each.
+- Choose comments that best represent the community's reaction to the topic — insightful takes, strong opinions, or popular observations.
+
+### 5. Generate Output (Strict JSON)
+You **MUST** save the result to `classified_discovery.json`.
+
+The output **MUST** strictly follow the `DiscoveryData` interface defined in `web/src/types.ts`. Read that file first to understand the exact schema.
+
+**Schema source of truth (important):**
+- Do not rely on schema examples in this skill file.
+- Always read `web/src/types.ts` and follow `DiscoveryData` exactly.
+- If this skill text and `types.ts` ever conflict, `types.ts` wins.
+
+## Critical Rules
+1. **No External Research**: Do not fetch new data. Use only the provided JSON files.
+2. **Strict Schema**: The visualization tool will crash if the schema doesn't match `DiscoveryData` from `types.ts`.
+3. **Real Content**: Base topic clustering on actual post content — don't guess.
+4. **3 Comments Per Topic**: Always provide 3 highlight comments per trending topic with real text and source links.
+5. **Output file**: Must be saved as `classified_discovery.json` in the project root.
+6. **Data Resilience**:
+   - If `engagement_score` cannot be calculated, set it to `0`.
+   - If no comments are found, set `highlight_comments` to `[]`.
+   - Ensure all `Array` fields are at least empty arrays `[]` if no data exists, never `null` or `undefined`.
+   - Ensure `period` is always a string (e.g. "Last 7 Days"), never null.
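The engagement rule in the discovery skill's step 3 (score + comments for Reddit, likes + reposts for X) can be sketched as below. This is an editorial illustration; the field names are assumptions, since the raw JSON shape is defined by `communities_fetch`, not shown here:

```python
def engagement_score(posts):
    """Sum engagement per the step-3 rule: score + comments for Reddit posts,
    likes + reposts for X posts. Field names are illustrative assumptions."""
    total = 0
    for post in posts:
        if post.get("platform") == "reddit":
            total += post.get("score", 0) + post.get("comments", 0)
        elif post.get("platform") == "x":
            total += post.get("likes", 0) + post.get("reposts", 0)
    return total
```

Missing fields default to `0`, in the spirit of the skill's "Data Resilience" rules.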
package/templates/base/skills/communities_fetch.md
@@ -0,0 +1,56 @@
+# Communities Fetch Skill
+
+This is a **Worker Skill** responsible for the "Eyes and Ears" of the research. It goes out to Reddit (and X if configured) to find real user discussions.
+
+## Capabilities
+
+- **Sources**: Reddit (via OpenAI URL discovery) + X (via XAI API)
+- **Output**: Raw JSON files (`reddit_data.json`, `x_data.json`)
+- **No Analysis**: This skill DOES NOT analyze or rank data. It only fetches it.
+
+## Usage
+
+### 1. Standard Research (Quick)
+Scans ~5 threads per source. Good for quick answers.
+
+```bash
+sc-research research "YOUR TOPIC HERE"
+```
+
+### 2. Deep Research
+Scans ~10+ threads. Good for comprehensive market analysis.
+
+```bash
+sc-research research:deep "YOUR TOPIC HERE"
+```
+
+### 3. Discovery Mode
+Fetches top/trending posts for topic clustering instead of targeted search.
+
+```bash
+sc-research research "YOUR TOPIC HERE" --mode=discovery
+```
+
+### Optional Flags
+- `--from=YYYY-MM-DD --to=YYYY-MM-DD` — Focus on a specific date range
+- `--depth=deep` — Same as using `research:deep`
+- `--mode=discovery` — Switch to discovery mode for topic clustering
+- `--source=reddit|x|both` — Limit to a specific source (default: both available)
+
+## Output Format
+
+The skill saves files to the project root:
+- `reddit_data.json`
+- `x_data.json`
+
+**Next Step**: After fetching, use an analysis skill (`communities_rank`, `communities_sentiment`, `communities_trend`, `communities_controversy`, or `communities_discovery`) to classify the data.
+
+## Error Handling
+
+| Scenario | Symptom | Action |
+|---|---|---|
+| Missing `OPENAI_API_KEY` | Process exits with auth error | Ensure `.env` has a valid `OPENAI_API_KEY` |
+| Missing `XAI_API_KEY` | X data is empty, Reddit still works | Set `XAI_API_KEY` in `.env` or accept Reddit-only results |
+| No results for topic | Output JSON has 0 items | Try broader search terms or check spelling |
+| Rate limited | API error or timeout | Wait a few minutes and retry |
+| Empty/malformed JSON output | File exists but `items` array is empty | Check topic relevance; try a more popular search term |
package/templates/base/skills/communities_rank.md
@@ -0,0 +1,57 @@
+---
+name: communities_rank
+description: Analyze raw social media data (Reddit/X) to produce a ranked, classified report with strict JSON output.
+---
+
+# Communities Ranking Skill
+
+This skill acts as a **Ranking Engine** and **Data Analyst**. It takes raw, unstructured user discussions from `communities_fetch` and converts them into a structured, quantitative report.
+
+## Input Data
+You will act on existing JSON files in the workspace:
+- `reddit_data.json`
+- `x_data.json`
+
+## Workflow
+
+### 1. Read Raw Data
+Read the input files to understand the raw community sentiment. Look for:
+- **Consensus**: What is the most agreed-upon "best" product?
+- **Debate**: What are the alternatives?
+- **Engagement**: Which posts/comments have high upvotes/likes?
+
+### 2. Analyze & Rank
+You must analyze the data to create a ranking. Use the following criteria:
+- **Rank 1**: The clear community winner (highest sentiment + engagement).
+- **Rank 2-5**: Strong contenders or "best value" alternatives.
+- **Sentiment**: Label as "Positive", "Negative", "Mixed", or "Very Positive".
+- **Engagement Score**: A derived score based on upvotes + comments + mentions.
+
+### 3. Extract Highlight Quotes
+For each ranked product, extract **3 real highlight quotes** from the data:
+- Each quote must be a real excerpt from the raw data with a real author and link.
+- Tag each quote with a `context` field:
+  - `"pro"` — The quote highlights a strength or positive experience.
+  - `"con"` — The quote highlights a weakness or negative experience.
+  - `"general"` — The quote is a general opinion or comparison.
+- Aim for a mix of pro/con/general across the 3 quotes to show balanced community opinion.
+
+### 4. Generate Output (Strict JSON)
+You **MUST** save the result to `classified_rank.json`.
+
+To determine the output schema:
+1. **Read the file** `web/src/types.ts`.
+2. **Strictly follow** the `ClassifiedData` interface defined in that file as your output schema.
+3. Ensure all fields in `ClassifiedData` (and its dependent types like `Product`) are populated.
+
+## Example Command
+To trigger this skill, the user might ask:
+- "Rank the IEMs found in the reddit data"
+- "Create a classification report for the research"
+- "Update the visualization data"
+
+## Critical Rules
+1. **No External Research**: Do not fetch new data. Use only the provided JSON files.
+2. **Strict Schema**: The visualization tool will crash if `key_insights` or `estimated_engagement_score` are missing.
+3. **Real Quotes**: Every `highlight_quotes` entry must be a real excerpt from the data with a real link.
+4. **3 Quotes Per Product**: Always provide exactly 3 highlight quotes per product with appropriate context tags.
package/templates/base/skills/communities_research_test.md
@@ -0,0 +1,64 @@
+---
+name: communities_research_test
+description: "⚙️ DEBUG/LEGACY — Test the communities research skill with fixed Reddit links (no API key needed). Fetches data and returns JSON for AI classification."
+---
+
+# Communities Research Test Skill
+
+> **⚙️ Legacy/Debug Tool** — This skill predates the current `classified_*.json` pipeline. It fetches data from fixed URLs and asks for a markdown table classification. Use it for quick pipeline sanity checks, but prefer the standard `communities_fetch` → `communities_rank` flow for real analysis.
+
+Use this skill to test the research pipeline with fixed Reddit URLs. No API key required.
+
+## Usage
+
+```bash
+sc-research test:fixed [URL1] [URL2] ...
+```
+
+Default URLs (if none provided):
+- https://www.reddit.com/r/iems/comments/1olzu0g/the_best_iem_builds_at_each_price_2025_ultimate/
+- https://www.reddit.com/r/headphones/comments/1lbcngj/new_iem_tierlist_2025/
+- https://www.reddit.com/r/iems/comments/1c7imln/iem_tier_list/
+
+## Output
+
+The script outputs a **JSON object** with the following structure:
+
+```json
+{
+  "query": "Manual Test: ...",
+  "dateRange": { "from": "YYYY-MM-DD", "to": "YYYY-MM-DD" },
+  "items": [
+    {
+      "id": "...",
+      "title": "Thread Title",
+      "content": "Full text content...",
+      "author": "username",
+      "platform": "reddit",
+      "engagement": {
+        "score": 100,
+        "upvotes": 100,
+        "comments": 50
+      },
+      "url": "https://...",
+      "createdAt": "ISO8601 Date"
+    }
+  ],
+  "sources": { "redditThreads": 3, "xPosts": 0 }
+}
```

+## AI Instructions
+
+After running, classify items by:
+1. **Product** mentioned
+2. **Sentiment** (Positive/Mixed/Negative)
+3. **Engagement** (upvotes)
+
+Present as a Markdown table:
+
+| Rank | Product | Sentiment | Mentions | Avg. Upvotes |
+|------|---------|-----------|----------|--------------|
+| 1 | ... | 👍 | ... | ... |
+
+Include top quotes with attribution.