sc-research 1.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (47)
  1. package/CLAUDE.md +67 -0
  2. package/README.md +111 -0
  3. package/dist/cli.js +671 -0
  4. package/dist/index.js +6530 -0
  5. package/dist/test/manual-fixed-links.js +186 -0
  6. package/dist/visualize.js +371 -0
  7. package/dist/web/assets/index-FB2oq23H.css +1 -0
  8. package/dist/web/assets/index-sZb3bzqd.js +44 -0
  9. package/dist/web/data.json +577 -0
  10. package/dist/web/index.html +14 -0
  11. package/dist/web/vite.svg +11 -0
  12. package/package.json +52 -0
  13. package/templates/base/commands/controversy.md +28 -0
  14. package/templates/base/commands/deep-research.md +25 -0
  15. package/templates/base/commands/discovery.md +27 -0
  16. package/templates/base/commands/quick.md +14 -0
  17. package/templates/base/commands/rank.md +27 -0
  18. package/templates/base/commands/research.md +10 -0
  19. package/templates/base/commands/sentiment.md +28 -0
  20. package/templates/base/commands/test-research.md +9 -0
  21. package/templates/base/commands/trend.md +27 -0
  22. package/templates/base/commands/visualize.md +14 -0
  23. package/templates/base/manifest.json +119 -0
  24. package/templates/base/skills/communities_controversy.md +65 -0
  25. package/templates/base/skills/communities_discovery.md +55 -0
  26. package/templates/base/skills/communities_fetch.md +56 -0
  27. package/templates/base/skills/communities_rank.md +57 -0
  28. package/templates/base/skills/communities_research_test.md +64 -0
  29. package/templates/base/skills/communities_sentiment.md +61 -0
  30. package/templates/base/skills/communities_trend.md +71 -0
  31. package/templates/base/skills/communities_visualize.md +46 -0
  32. package/templates/base/skills/using_communities_research.md +146 -0
  33. package/templates/platforms/agent.json +21 -0
  34. package/templates/platforms/claude.json +21 -0
  35. package/templates/platforms/codebuddy.json +21 -0
  36. package/templates/platforms/codex.json +21 -0
  37. package/templates/platforms/continue.json +21 -0
  38. package/templates/platforms/copilot.json +18 -0
  39. package/templates/platforms/cursor.json +18 -0
  40. package/templates/platforms/droid.json +21 -0
  41. package/templates/platforms/gemini.json +21 -0
  42. package/templates/platforms/kiro.json +18 -0
  43. package/templates/platforms/opencode.json +21 -0
  44. package/templates/platforms/qoder.json +21 -0
  45. package/templates/platforms/roocode.json +18 -0
  46. package/templates/platforms/trae.json +21 -0
  47. package/templates/platforms/windsurf.json +18 -0
package/templates/base/commands/controversy.md
@@ -0,0 +1,28 @@
+ ---
+ description: Find controversial debates in existing research data.
+ ---
+
+ 1. Check for data existence
+
+ > Ensure `reddit_data.json` or `x_data.json` exists in the current directory.
+
+ 2. Check freshness for the current request
+
+ > Confirm raw data matches the requested topic and date window (if provided). If it does not match, re-fetch with **deep research**: `sc-research research:deep "TOPIC"`.
+
+ 3. Run the controversy skill
+
+ > Use the `communities_controversy` skill to identify polarizing topics, extract opposing quotes, and generate `classified_controversy.json`.
+
+ 4. Validate output schema
+
+ > Ensure `classified_controversy.json` includes:
+ >
+ > - `topic`
+ > - `overall_divisiveness`
+ > - `controversies` array
+ > - each controversy contains `topic`, `heat_score`, `divisiveness`, `side_a`, and `side_b`
+ > - each side contains `position`, `supporter_count`, and `sample_quote` with `text`, `author`, `link`
+
+ 5. Display the results
+ > Read `classified_controversy.json` and display controversies with side-by-side opposing views and heat scores.
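For orientation, here is a minimal sketch of a `classified_controversy.json` that would satisfy the checklist above. All values are hypothetical (the topic is borrowed from the package's own IEM examples), and the `ControversyData` interface in `web/src/types.ts` remains the authoritative schema:

```json
{
  "topic": "Best IEMs under $100",
  "overall_divisiveness": "Medium",
  "controversies": [
    {
      "topic": "EQ vs stock tuning",
      "heat_score": 72,
      "divisiveness": "High",
      "side_a": {
        "position": "EQ can fix almost any tuning flaw",
        "supporter_count": 14,
        "sample_quote": {
          "text": "hypothetical excerpt arguing for EQ",
          "author": "u/example_user",
          "link": "https://www.reddit.com/r/iems/comments/example"
        }
      },
      "side_b": {
        "position": "Stock tuning should stand on its own",
        "supporter_count": 11,
        "sample_quote": {
          "text": "hypothetical excerpt arguing against EQ",
          "author": "u/another_user",
          "link": "https://www.reddit.com/r/iems/comments/example2"
        }
      }
    }
  ]
}
```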
package/templates/base/commands/deep-research.md
@@ -0,0 +1,25 @@
+ ---
+ description: Deeply research a topic and route to the best analysis template. This is the default fetch depth for all templates.
+ ---
+
+ 1. Run the deep research fetcher
+
+ > `sc-research research:deep "ARGUMENTS"` (you may also add `--from=YYYY-MM-DD --to=YYYY-MM-DD` for a custom window)
+
+ 2. Route to the right analysis mode
+
+ > Follow the **Intent → Template Routing** rules in `using_communities_research`.
+ >
+ > - If the user explicitly asks for "full analysis", "everything", or "all views": run all 4 templates in order (`rank` -> `sentiment` -> `trend` -> `controversy`).
+ > - Otherwise: pick the SINGLE most suitable template for the user's question.
+
+ 3. Validate outputs before presenting
+
+ > Confirm the generated `classified_*.json` file(s):
+ >
+ > - Match the requested topic (or a close variant of it).
+ > - Match the requested date window when `--from/--to` was provided.
+ > - Include the required fields from `web/src/types.ts`.
+
+ 4. Display the results
+ > Present whichever template output(s) were selected after validation.
package/templates/base/commands/discovery.md
@@ -0,0 +1,27 @@
+ ---
+ description: Discover viral topics and emerging themes from existing research data.
+ ---
+
+ 1. Check for data existence
+
+ > Ensure `reddit_data.json` or `x_data.json` exists in the current directory.
+
+ 2. Check freshness for the current request
+
+ > Confirm raw data matches the requested topic and date window (if provided). If it does not match, re-fetch with **deep research**: `sc-research research:deep "TOPIC" --mode=discovery`.
+
+ 3. Run the discovery skill
+
+ > Use the `communities_discovery` skill to cluster posts by topic and generate `classified_discovery.json`.
+
+ 4. Validate output schema
+
+ > Ensure `classified_discovery.json` includes:
+ >
+ > - `period`
+ > - `total_posts_analyzed`
+ > - `trending_topics` array
+ > - per topic: `id`, `topic_name`, `description`, `category`, `engagement_score`, `sentiment`, `key_posts`
+
+ 5. Display the results
+ > Read `classified_discovery.json` and display the discovered topics with their engagement scores, sentiment, and top posts.
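For orientation, a minimal sketch of a `classified_discovery.json` matching the checklist above. Field names come from the list; all values, and the exact types of `id` and of `key_posts` entries, are hypothetical. `DiscoveryData` in `web/src/types.ts` is authoritative:

```json
{
  "period": "Last 7 Days",
  "total_posts_analyzed": 120,
  "trending_topics": [
    {
      "id": "topic-1",
      "topic_name": "Budget planar IEMs",
      "description": "Growing buzz around sub-$100 planar releases.",
      "category": "Products",
      "engagement_score": 950,
      "sentiment": "positive",
      "key_posts": []
    }
  ]
}
```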
package/templates/base/commands/quick.md
@@ -0,0 +1,14 @@
+ ---
+ description: Quickly research a topic and get a fast text answer (Reddit only, skips X).
+ ---
+
+ 1. Run the quick research fetcher (Reddit only)
+ > `sc-research research "ARGUMENTS"`
+ >
+ > This fetches Reddit data only and skips X for speed. Use `/research` if you need both sources.
+
+ 2. Read the raw data
+ > Read `reddit_data.json` only. Ignore `x_data.json` even if it exists from a prior run.
+
+ 3. Provide a concise answer
+ > Synthesize a 3-5 sentence answer directly from the Reddit data. Mention the community favorite, 1-2 alternatives, and a real user quote. Do NOT generate any classified files.
package/templates/base/commands/rank.md
@@ -0,0 +1,27 @@
+ ---
+ description: Rank and classify existing research data.
+ ---
+
+ 1. Check for data existence
+
+ > Ensure `reddit_data.json` or `x_data.json` exists in the current directory.
+
+ 2. Check freshness for the current request
+
+ > Confirm raw data matches the requested topic and date window (if provided). If it does not match, re-fetch with **deep research**: `sc-research research:deep "TOPIC"`.
+
+ 3. Run the ranking skill
+
+ > Use the `communities_rank` skill to generate `classified_rank.json` from the existing data files.
+
+ 4. Validate output schema
+
+ > Ensure `classified_rank.json` includes:
+ >
+ > - `topic`
+ > - `key_insights` (array)
+ > - `products` (array)
+ > - per product: `rank`, `name`, `sentiment`, `mentions`, `estimated_engagement_score`, `consensus`, `pros`, `cons`
+
+ 5. Display the results
+ > Read `classified_rank.json` and display key insights plus a summary of the ranked products.
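For orientation, a minimal sketch of a `classified_rank.json` satisfying the checklist above (all values hypothetical; `ClassifiedData` in `web/src/types.ts` is authoritative):

```json
{
  "topic": "Best IEMs under $100",
  "key_insights": ["One model dominates recommendations across both platforms."],
  "products": [
    {
      "rank": 1,
      "name": "Example IEM",
      "sentiment": "Very Positive",
      "mentions": 34,
      "estimated_engagement_score": 1200,
      "consensus": "Widely recommended as the default pick.",
      "pros": ["Tuning", "Build quality"],
      "cons": ["Stock cable"]
    }
  ]
}
```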
package/templates/base/commands/research.md
@@ -0,0 +1,10 @@
+ 1. Run the research fetcher
+
+ > `sc-research research "ARGUMENTS"` (optionally add `--from=YYYY-MM-DD --to=YYYY-MM-DD` to focus on a specific date range)
+
+ 2. Analyze the results
+
+ > Read the generated `reddit_data.json` and `x_data.json`.
+
+ 3. Provide an answer
+ > Based on the data, provide a concise answer including the Community Favorite, decent alternatives, and a representative quote.
package/templates/base/commands/sentiment.md
@@ -0,0 +1,28 @@
+ ---
+ description: Analyze sentiment from existing research data.
+ ---
+
+ 1. Check for data existence
+
+ > Ensure `reddit_data.json` or `x_data.json` exists in the current directory.
+
+ 2. Check freshness for the current request
+
+ > Confirm raw data matches the requested topic and date window (if provided). If it does not match, re-fetch with **deep research**: `sc-research research:deep "TOPIC"`.
+
+ 3. Run the sentiment skill
+
+ > Use the `communities_sentiment` skill to analyze the raw data and generate `classified_sentiment.json`.
+
+ 4. Validate output schema
+
+ > Ensure `classified_sentiment.json` includes:
+ >
+ > - `topic`
+ > - `overall_mood`
+ > - `distribution` (`very_positive`, `positive`, `mixed`, `negative`)
+ > - `by_source` with both `reddit` and `x`
+ > - `product_sentiments`
+
+ 5. Display the results
+ > Read `classified_sentiment.json` and display the overall mood, sentiment distribution, and per-product sentiment breakdown.
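For orientation, a minimal sketch of a `classified_sentiment.json` with the fields listed above. The nested shapes of the `by_source` entries and of `product_sentiments` are not specified in this command file, so they are left empty here; the interface in `web/src/types.ts` is authoritative:

```json
{
  "topic": "Best IEMs under $100",
  "overall_mood": "Cautiously positive",
  "distribution": {
    "very_positive": 18,
    "positive": 42,
    "mixed": 28,
    "negative": 12
  },
  "by_source": {
    "reddit": {},
    "x": {}
  },
  "product_sentiments": []
}
```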
package/templates/base/commands/test-research.md
@@ -0,0 +1,9 @@
+ ---
+ description: Test the research tool with fixed URLs.
+ ---
+
+ 1. Run the fixed test script
+ > `sc-research test:fixed`
+
+ 2. Report status
+ > Report success if the command exits with code 0, otherwise report failure.
package/templates/base/commands/trend.md
@@ -0,0 +1,27 @@
+ ---
+ description: Analyze discussion trends over time from existing research data.
+ ---
+
+ 1. Check for data existence
+
+ > Ensure `reddit_data.json` or `x_data.json` exists in the current directory.
+
+ 2. Check freshness for the current request
+
+ > Confirm raw data matches the requested topic and date window (if provided). If it does not match, re-fetch with **deep research**: `sc-research research:deep "TOPIC"`.
+
+ 3. Run the trend skill
+
+ > Use the `communities_trend` skill to parse dates, choose an adaptive granularity (day/week/month), and generate `classified_trend.json`.
+
+ 4. Validate output schema
+
+ > Ensure `classified_trend.json` includes:
+ >
+ > - `topic`
+ > - `date_range` with `from` and `to`
+ > - `timeline` entries with `period`, `post_count`, `total_engagement`, `reddit_posts`, `x_posts`
+ > - `key_moments` entries with `date`, `event`, `significance`
+
+ 5. Display the results
+ > Read `classified_trend.json` and display timeline activity, engagement trends, and key moments.
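For orientation, a minimal sketch of a `classified_trend.json` with the fields listed above (all values hypothetical; the interface in `web/src/types.ts` is authoritative):

```json
{
  "topic": "Best IEMs under $100",
  "date_range": { "from": "2025-01-01", "to": "2025-03-31" },
  "timeline": [
    {
      "period": "2025-01",
      "post_count": 42,
      "total_engagement": 3100,
      "reddit_posts": 30,
      "x_posts": 12
    }
  ],
  "key_moments": [
    {
      "date": "2025-02-14",
      "event": "Hypothetical product release",
      "significance": "Post volume and engagement spike in the following week."
    }
  ]
}
```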
package/templates/base/commands/visualize.md
@@ -0,0 +1,14 @@
+ ---
+ description: Launch the visualization dashboard.
+ ---
+
+ 1. Check for classified data
+
+ > Ensure at least one `classified_*.json` file exists (e.g., `classified_rank.json`, `classified_sentiment.json`, `classified_trend.json`, `classified_controversy.json`). If none exist, suggest running `/rank`, `/sentiment`, `/trend`, or `/controversy` first.
+
+ 2. Launch the visualizer
+
+ > `sc-research visualize`
+
+ 3. Inform the user
+ > Tell the user to open the URL provided by the command (usually http://localhost:5173). The dashboard will show tabs for each available classified file.
package/templates/base/manifest.json
@@ -0,0 +1,119 @@
+ {
+   "version": 1,
+   "templates": [
+     {
+       "id": "communities_fetch",
+       "kind": "skill",
+       "description": "Worker skill that fetches raw discussion data from Reddit and X (Twitter) for a given topic. Returns raw JSON files.",
+       "bodyFile": "skills/communities_fetch.md"
+     },
+     {
+       "id": "communities_rank",
+       "kind": "skill",
+       "description": "Analyze raw social media data (Reddit/X) to produce a ranked, classified report with strict JSON output.",
+       "bodyFile": "skills/communities_rank.md"
+     },
+     {
+       "id": "communities_sentiment",
+       "kind": "skill",
+       "description": "Worker skill that analyzes raw social media data to produce a sentiment breakdown report with strict JSON output.",
+       "bodyFile": "skills/communities_sentiment.md"
+     },
+     {
+       "id": "communities_trend",
+       "kind": "skill",
+       "description": "Worker skill that analyzes raw social media data to produce a trend timeline report with strict JSON output.",
+       "bodyFile": "skills/communities_trend.md"
+     },
+     {
+       "id": "communities_controversy",
+       "kind": "skill",
+       "description": "Worker skill that analyzes raw social media data to identify polarizing topics and produce a controversy map with strict JSON output.",
+       "bodyFile": "skills/communities_controversy.md"
+     },
+     {
+       "id": "communities_discovery",
+       "kind": "skill",
+       "description": "Worker skill that analyzes raw social media data to discover and cluster high-signal emerging topics.",
+       "bodyFile": "skills/communities_discovery.md"
+     },
+     {
+       "id": "communities_visualize",
+       "kind": "skill",
+       "description": "Worker skill that launches a local web dashboard to visualize all available classified research data.",
+       "bodyFile": "skills/communities_visualize.md"
+     },
+     {
+       "id": "communities_research_test",
+       "kind": "skill",
+       "description": "Test the communities research skill with fixed Reddit links (no API key needed). Fetches data and returns JSON for AI classification.",
+       "bodyFile": "skills/communities_research_test.md"
+     },
+     {
+       "id": "using_communities_research",
+       "kind": "skill",
+       "description": "Orchestrator skill that understands user intent and routes to the most suitable analysis template.",
+       "bodyFile": "skills/using_communities_research.md"
+     },
+     {
+       "id": "research",
+       "kind": "command",
+       "description": "Research a topic using the Quick Answer flow.",
+       "bodyFile": "commands/research.md"
+     },
+     {
+       "id": "quick",
+       "kind": "command",
+       "description": "Run quick Reddit-only research flow.",
+       "bodyFile": "commands/quick.md"
+     },
+     {
+       "id": "deep-research",
+       "kind": "command",
+       "description": "Run deep research flow and route to best template.",
+       "bodyFile": "commands/deep-research.md"
+     },
+     {
+       "id": "rank",
+       "kind": "command",
+       "description": "Generate ranking report from current raw data.",
+       "bodyFile": "commands/rank.md"
+     },
+     {
+       "id": "sentiment",
+       "kind": "command",
+       "description": "Generate sentiment report from current raw data.",
+       "bodyFile": "commands/sentiment.md"
+     },
+     {
+       "id": "trend",
+       "kind": "command",
+       "description": "Generate trend timeline report from current raw data.",
+       "bodyFile": "commands/trend.md"
+     },
+     {
+       "id": "controversy",
+       "kind": "command",
+       "description": "Generate controversy map from current raw data.",
+       "bodyFile": "commands/controversy.md"
+     },
+     {
+       "id": "discovery",
+       "kind": "command",
+       "description": "Generate discovery clusters from current raw data.",
+       "bodyFile": "commands/discovery.md"
+     },
+     {
+       "id": "visualize",
+       "kind": "command",
+       "description": "Launch dashboard for available classified data files.",
+       "bodyFile": "commands/visualize.md"
+     },
+     {
+       "id": "test-research",
+       "kind": "command",
+       "description": "Run fixed-link test research pipeline.",
+       "bodyFile": "commands/test-research.md"
+     }
+   ]
+ }
package/templates/base/skills/communities_controversy.md
@@ -0,0 +1,65 @@
+ ---
+ name: communities_controversy
+ description: Worker skill that analyzes raw social media data to identify polarizing topics and produce a controversy map with strict JSON output.
+ ---
+
+ # Communities Controversy Skill
+
+ This is a **Worker Skill** that identifies the most **divisive and argumentative** topics in community discussions. It finds where opinions clash and presents opposing viewpoints side-by-side.
+
+ ## Input Data
+ You will act on existing JSON files in the workspace (fetched by `communities_fetch`):
+ - `reddit_data.json`
+ - `x_data.json`
+
+ ## Workflow
+
+ ### 1. Read All Discussion Content
+ Read the full text of every item in both data files. Look for patterns of **disagreement**:
+ - Contradictory opinions about the same product/topic
+ - "But..." / "However..." / "I disagree" / "overrated" / "overhyped" language
+ - Posts where the author explicitly compares and picks sides
+ - Topics where Reddit and X communities have opposing views
+
+ ### 2. Identify Controversies
+ Find 3-5 clear controversies. Common patterns to look for:
+ - **Quality vs Price debates** — Is the expensive option worth it?
+ - **Technology debates** — e.g., planar vs dynamic, wired vs wireless
+ - **Platform divides** — Reddit enthusiasts vs X mainstream opinion
+ - **Subjective preferences** — EQ vs stock tuning, comfort vs sound quality
+ - **Brand loyalty conflicts** — Sennheiser vs Hifiman, etc.
+
+ ### 3. Score Each Controversy
+ For each identified controversy:
+ - **heat_score** (1-100): Based on engagement on the divisive posts. Higher engagement on opposing posts = higher heat.
+ - **divisiveness**: "High" (near 50/50 split with strong opinions), "Medium" (clear lean but vocal minority), "Low" (mostly agreement with some dissent)
+
+ ### 4. Extract Opposing Quotes (3 Per Side)
+ For each controversy, find **THREE real quotes for each side**:
+ - **Side A**: The position + 3 real quotes with author and link
+ - **Side B**: The opposing position + 3 real quotes with author and link
+ - Count how many commenters support each side (`supporter_count`)
+ - The 3 quotes should show different perspectives/arguments within the same side (not just repeating the same point).
+
+ ### 5. Generate Output (Strict JSON)
+ You **MUST** save the result to `classified_controversy.json`.
+
+ The output **MUST** strictly follow the `ControversyData` interface defined in `web/src/types.ts`. Read that file first to understand the exact schema.
+
+ **Schema source of truth (important):**
+ - Do not rely on schema examples in this skill file.
+ - Always read `web/src/types.ts` and follow `ControversyData` exactly.
+ - If this skill text and `types.ts` ever conflict, `types.ts` wins.
+
+ ## Critical Rules
+ 1. **No External Research**: Do not fetch new data. Use only the provided JSON files.
+ 2. **Strict Schema**: The visualization tool will crash if the schema doesn't match `ControversyData` from `types.ts`.
+ 3. **Real Quotes Only**: Every entry in `sample_quotes` must be a real excerpt from the data with a real link.
+ 4. **3 Quotes Per Side**: Always provide 3 quotes per side to show breadth of the argument.
+ 5. **Genuine Controversies**: Don't manufacture disagreement. Only report controversies that actually exist in the data.
+ 6. **Output file**: Must be saved as `classified_controversy.json` in the project root.
+ 7. **Data Resilience**:
+    - If `heat_score` cannot be calculated, set it to `0`.
+    - If `supporter_count` cannot be estimated, set it to `0`.
+    - If no quotes are found, set `sample_quotes` to `[]`.
+    - Ensure all `Array` fields are at least empty arrays `[]` if no data exists, never `null` or `undefined`.
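To make rules 3 and 4 concrete, one side of a controversy might carry a `sample_quotes` array shaped like the sketch below. All values are hypothetical, and per the rules above the example is only an orientation aid; `ControversyData` in `web/src/types.ts` wins on any conflict:

```json
{
  "position": "The expensive option is worth it",
  "supporter_count": 12,
  "sample_quotes": [
    { "text": "hypothetical excerpt 1", "author": "u/user_a", "link": "https://www.reddit.com/r/iems/comments/a" },
    { "text": "hypothetical excerpt 2", "author": "u/user_b", "link": "https://www.reddit.com/r/iems/comments/b" },
    { "text": "hypothetical excerpt 3", "author": "x_user_c", "link": "https://x.com/x_user_c/status/1" }
  ]
}
```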
package/templates/base/skills/communities_discovery.md
@@ -0,0 +1,55 @@
+ ---
+ name: communities_discovery
+ description: Worker skill that analyzes raw social media data to identify viral topics and generate a discovery report with strict JSON output.
+ ---
+
+ # Communities Discovery Skill
+
+ This is a **Worker Skill** that identifies **viral topics and emerging themes** from community discussions. It clusters posts by topic and surfaces the most engaging content.
+
+ ## Input Data
+ You will act on existing JSON files in the workspace (fetched by `communities_fetch`):
+ - `reddit_data.json`
+ - `x_data.json`
+
+ ## Workflow
+
+ ### 1. Read All Posts
+ Read every item in both data files. Look for recurring themes, viral posts, and emerging conversations.
+
+ ### 2. Analyze & Cluster
+ - Identify 3-5 distinct viral topics/themes from the combined data.
+ - Group relevant posts under each topic.
+ - Ignore spam or irrelevant noise.
+
+ ### 3. Calculate Metrics
+ - **Engagement Score**: Sum of (score + comments) for Reddit, (likes + reposts) for X.
+ - **Sentiment**: Determine the overall sentiment (`positive`, `negative`, `neutral`, `mixed`) for the topic.
+
+ ### 4. Extract Highlight Comments
+ For each trending topic, extract **3 real highlight comments** from the data:
+ - Each comment must be a real excerpt from an actual post or comment in the raw data.
+ - Include `author`, `link`, and `platform` ("reddit" or "x") for each.
+ - Choose comments that best represent the community's reaction to the topic — insightful takes, strong opinions, or popular observations.
+
+ ### 5. Generate Output (Strict JSON)
+ You **MUST** save the result to `classified_discovery.json`.
+
+ The output **MUST** strictly follow the `DiscoveryData` interface defined in `web/src/types.ts`. Read that file first to understand the exact schema.
+
+ **Schema source of truth (important):**
+ - Do not rely on schema examples in this skill file.
+ - Always read `web/src/types.ts` and follow `DiscoveryData` exactly.
+ - If this skill text and `types.ts` ever conflict, `types.ts` wins.
+
+ ## Critical Rules
+ 1. **No External Research**: Do not fetch new data. Use only the provided JSON files.
+ 2. **Strict Schema**: The visualization tool will crash if the schema doesn't match `DiscoveryData` from `types.ts`.
+ 3. **Real Content**: Base topic clustering on actual post content — don't guess.
+ 4. **3 Comments Per Topic**: Always provide 3 highlight comments per trending topic with real text and source links.
+ 5. **Output file**: Must be saved as `classified_discovery.json` in the project root.
+ 6. **Data Resilience**:
+    - If `engagement_score` cannot be calculated, set it to `0`.
+    - If no comments are found, set `highlight_comments` to `[]`.
+    - Ensure all `Array` fields are at least empty arrays `[]` if no data exists, never `null` or `undefined`.
+    - Ensure `period` is always a string (e.g. "Last 7 Days"), never null.
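For illustration, a single highlight comment might look like the sketch below. `author`, `link`, and `platform` are named in this file; the excerpt field is shown as `text` purely as an assumption, and `DiscoveryData` in `web/src/types.ts` is the actual schema:

```json
{
  "text": "hypothetical excerpt from the raw data",
  "author": "u/example_user",
  "link": "https://www.reddit.com/r/iems/comments/example",
  "platform": "reddit"
}
```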
package/templates/base/skills/communities_fetch.md
@@ -0,0 +1,56 @@
+ # Communities Fetch Skill
+
+ This is a **Worker Skill** responsible for the "Eyes and Ears" of the research. It goes out to Reddit (and X if configured) to find real user discussions.
+
+ ## Capabilities
+
+ - **Sources**: Reddit (via OpenAI URL discovery) + X (via XAI API)
+ - **Output**: Raw JSON files (`reddit_data.json`, `x_data.json`)
+ - **No Analysis**: This skill DOES NOT analyze or rank data. It only fetches it.
+
+ ## Usage
+
+ ### 1. Standard Research (Quick)
+ Scans ~5 threads per source. Good for quick answers.
+
+ ```bash
+ sc-research research "YOUR TOPIC HERE"
+ ```
+
+ ### 2. Deep Research
+ Scans ~10+ threads. Good for comprehensive market analysis.
+
+ ```bash
+ sc-research research:deep "YOUR TOPIC HERE"
+ ```
+
+ ### 3. Discovery Mode
+ Fetches top/trending posts for topic clustering instead of targeted search.
+
+ ```bash
+ sc-research research "YOUR TOPIC HERE" --mode=discovery
+ ```
+
+ ### Optional Flags
+ - `--from=YYYY-MM-DD --to=YYYY-MM-DD` — Focus on a specific date range
+ - `--depth=deep` — Same as using `research:deep`
+ - `--mode=discovery` — Switch to discovery mode for topic clustering
+ - `--source=reddit|x|both` — Limit to a specific source (default: both, when available)
+
+ ## Output Format
+
+ The skill saves files to the project root:
+ - `reddit_data.json`
+ - `x_data.json`
+
+ **Next Step**: After fetching, use an analysis skill (`communities_rank`, `communities_sentiment`, `communities_trend`, `communities_controversy`, or `communities_discovery`) to classify the data.
+
+ ## Error Handling
+
+ | Scenario | Symptom | Action |
+ |---|---|---|
+ | Missing `OPENAI_API_KEY` | Process exits with auth error | Ensure `.env` has a valid `OPENAI_API_KEY` |
+ | Missing `XAI_API_KEY` | X data is empty, Reddit still works | Set `XAI_API_KEY` in `.env` or accept Reddit-only results |
+ | No results for topic | Output JSON has 0 items | Try broader search terms or check spelling |
+ | Rate limited | API error or timeout | Wait a few minutes and retry |
+ | Empty/malformed JSON output | File exists but `items` array is empty | Check topic relevance; try a more popular search term |
package/templates/base/skills/communities_rank.md
@@ -0,0 +1,57 @@
+ ---
+ name: communities_rank
+ description: Analyze raw social media data (Reddit/X) to produce a ranked, classified report with strict JSON output.
+ ---
+
+ # Communities Ranking Skill
+
+ This skill acts as a **Ranking Engine** and **Data Analyst**. It takes raw, unstructured user discussions from `communities_fetch` and converts them into a structured, quantitative report.
+
+ ## Input Data
+ You will act on existing JSON files in the workspace:
+ - `reddit_data.json`
+ - `x_data.json`
+
+ ## Workflow
+
+ ### 1. Read Raw Data
+ Read the input files to understand the raw community sentiment. Look for:
+ - **Consensus**: What is the most agreed-upon "best" product?
+ - **Debate**: What are the alternatives?
+ - **Engagement**: Which posts/comments have high upvotes/likes?
+
+ ### 2. Analyze & Rank
+ You must analyze the data to create a ranking. Use the following criteria:
+ - **Rank 1**: The clear community winner (highest sentiment + engagement).
+ - **Rank 2-5**: Strong contenders or "best value" alternatives.
+ - **Sentiment**: Label as "Positive", "Negative", "Mixed", or "Very Positive".
+ - **Engagement Score**: A derived score based on upvotes + comments + mentions.
+
+ ### 3. Extract Highlight Quotes
+ For each ranked product, extract **3 real highlight quotes** from the data:
+ - Each quote must be a real excerpt from the raw data with a real author and link.
+ - Tag each quote with a `context` field:
+   - `"pro"` — The quote highlights a strength or positive experience.
+   - `"con"` — The quote highlights a weakness or negative experience.
+   - `"general"` — The quote is a general opinion or comparison.
+ - Aim for a mix of pro/con/general across the 3 quotes to show balanced community opinion.
+
+ ### 4. Generate Output (Strict JSON)
+ You **MUST** save the result to `classified_rank.json`.
+
+ To determine the output schema:
+ 1. **Read the file** `web/src/types.ts`.
+ 2. **Strictly follow** the `ClassifiedData` interface defined in that file as your output schema.
+ 3. Ensure all fields in `ClassifiedData` (and its dependent types like `Product`) are populated.
+
+ ## Example Command
+ To trigger this skill, the user might ask:
+ - "Rank the IEMs found in the reddit data"
+ - "Create a classification report for the research"
+ - "Update the visualization data"
+
+ ## Critical Rules
+ 1. **No External Research**: Do not fetch new data. Use only the provided JSON files.
+ 2. **Strict Schema**: The visualization tool will crash if `key_insights` or `estimated_engagement_score` are missing.
+ 3. **Real Quotes**: Every `highlight_quotes` entry must be a real excerpt from the data with a real link.
+ 4. **3 Quotes Per Product**: Always provide exactly 3 highlight quotes per product with appropriate context tags.
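To make rules 3 and 4 concrete, a single `highlight_quotes` entry might look like the sketch below. The excerpt field name `text` is an assumption; the `context` values come from the workflow above, and `ClassifiedData` in `web/src/types.ts` is authoritative:

```json
{
  "text": "hypothetical excerpt praising the product",
  "author": "u/example_user",
  "link": "https://www.reddit.com/r/headphones/comments/example",
  "context": "pro"
}
```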
package/templates/base/skills/communities_research_test.md
@@ -0,0 +1,64 @@
+ ---
+ name: communities_research_test
+ description: "⚙️ DEBUG/LEGACY — Test the communities research skill with fixed Reddit links (no API key needed). Fetches data and returns JSON for AI classification."
+ ---
+
+ # Communities Research Test Skill
+
+ > **⚙️ Legacy/Debug Tool** — This skill predates the current `classified_*.json` pipeline. It fetches data from fixed URLs and asks for a markdown table classification. Use it for quick pipeline sanity checks, but prefer the standard `communities_fetch` → `communities_rank` flow for real analysis.
+
+ Use this skill to test the research pipeline with fixed Reddit URLs. No API key required.
+
+ ## Usage
+
+ ```bash
+ sc-research test:fixed [URL1] [URL2] ...
+ ```
+
+ Default URLs (if none provided):
+ - https://www.reddit.com/r/iems/comments/1olzu0g/the_best_iem_builds_at_each_price_2025_ultimate/
+ - https://www.reddit.com/r/headphones/comments/1lbcngj/new_iem_tierlist_2025/
+ - https://www.reddit.com/r/iems/comments/1c7imln/iem_tier_list/
+
+ ## Output
+
+ The script outputs a **JSON object** with the following structure:
+
+ ```json
+ {
+   "query": "Manual Test: ...",
+   "dateRange": { "from": "YYYY-MM-DD", "to": "YYYY-MM-DD" },
+   "items": [
+     {
+       "id": "...",
+       "title": "Thread Title",
+       "content": "Full text content...",
+       "author": "username",
+       "platform": "reddit",
+       "engagement": {
+         "score": 100,
+         "upvotes": 100,
+         "comments": 50
+       },
+       "url": "https://...",
+       "createdAt": "ISO8601 Date"
+     }
+   ],
+   "sources": { "redditThreads": 3, "xPosts": 0 }
+ }
+ ```
+
+ ## AI Instructions
+
+ After running, classify items by:
+ 1. **Product** mentioned
+ 2. **Sentiment** (Positive/Mixed/Negative)
+ 3. **Engagement** (upvotes)
+
+ Present as a Markdown table:
+
+ | Rank | Product | Sentiment | Mentions | Avg. Upvotes |
+ |------|---------|-----------|----------|--------------|
+ | 1 | ... | 👍 | ... | ... |
+
+ Include top quotes with attribution.