sc-research 1.0.13 → 1.0.14

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,36 +1,19 @@
  ---
  name: social_media_sentiment
- description: Analyze existing Reddit/X raw data and generate `classified_sentiment.json` with strict `SentimentData` output. Use for sentiment, mood, or positive-vs-negative analysis requests.
+ description: Analysis-only worker. Reads existing raw Reddit/X data and generates `classified_sentiment.json` with strict `SentimentData` output. Use for sentiment, mood, or positive-vs-negative analysis requests.
  ---

  # Social Media Sentiment Skill

- This worker measures community tone and produces evidence-backed sentiment output for the dashboard.
+ This worker measures community tone and produces evidence-backed sentiment output for the dashboard. It performs **analysis only** — fetching is handled by the orchestrator via `social_media_fetch`.

- ## Required Inputs
+ ## Prerequisites

- Use existing raw files only:
+ The following files must already exist (produced by `social_media_fetch`):

- - `reddit_data.json`
- - `x_data.json`
+ - `reddit_data.json` and/or `x_data.json`

- At least one valid source file must exist.
-
- ## Command Execution Flow
-
- Use this sequence for sentiment analysis:
-
- 1. Fetch or refresh raw data (outside this worker):
-
-    - `sc-research research:deep "TOPIC"`
-    - Optional filters: `--source=reddit|x|both --from=YYYY-MM-DD --to=YYYY-MM-DD`
-
- 2. Run this `social_media_sentiment` worker to generate `classified_sentiment.json`.
- 3. Optional visualization:
-
-    - `sc-research visualize`
-
- If raw files are missing, stale, or mismatched for the requested topic/date range, run step 1 first.
+ At least one valid source file must be present. If both are missing, **stop and report failure** — do not attempt to fetch data.

  ## Step 1: Preflight Validation

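The new `Prerequisites` rule above (read existing raw files; if both are missing, stop rather than fetch) can be sketched as a small preflight check. This is an illustration only, not code shipped in the package; the helper name `preflight` and the `SystemExit` failure behavior are my assumptions:

```python
import json
import os


def preflight(paths=("reddit_data.json", "x_data.json")):
    """Return the subset of raw files that exist and parse as JSON.

    Mirrors the documented rule: at least one valid source file must be
    present; if none is, stop and report failure instead of fetching.
    """
    valid = []
    for path in paths:
        if not os.path.exists(path):
            continue
        try:
            with open(path) as fh:
                json.load(fh)
        except (OSError, json.JSONDecodeError):
            continue  # unreadable or invalid JSON: treat as missing
        valid.append(path)
    if not valid:
        raise SystemExit("preflight failed: no valid raw data file found")
    return valid
```

A worker would call this once before analysis and abort on `SystemExit` rather than falling back to a fetch.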
@@ -93,6 +76,59 @@ Save result to:
 
  - `classified_sentiment.json`

+ ## Output Type Contract
+
+ Your output MUST match this exact shape. The dashboard detects sentiment data by checking for `distribution` (object) + `by_source` (object). Missing either field = broken tab.
+
+ ```json
+ {
+   "topic": "iPhone 16 Pro reviews",
+   "overall_mood": "Positive",
+   "distribution": {
+     "very_positive": 15,
+     "positive": 38,
+     "mixed": 22,
+     "negative": 8
+   },
+   "by_source": {
+     "reddit": {
+       "very_positive": 10,
+       "positive": 25,
+       "mixed": 15,
+       "negative": 5
+     },
+     "x": {
+       "very_positive": 5,
+       "positive": 13,
+       "mixed": 7,
+       "negative": 3
+     }
+   },
+   "product_sentiments": [
+     {
+       "name": "iPhone 16 Pro",
+       "overall": "Positive",
+       "reddit_sentiment": "Positive",
+       "x_sentiment": "Mixed",
+       "evidence_quotes": [
+         {
+           "text": "Camera upgrade alone makes it worth it",
+           "author": "u/tech_reviewer",
+           "link": "https://reddit.com/r/iphone/comments/xyz789",
+           "sentiment": "Very Positive"
+         }
+       ]
+     }
+   ]
+ }
+ ```
+
+ ### Enum Rules for Sentiment
+
+ - ALL sentiment labels: **Title Case** — `"Very Positive"`, `"Positive"`, `"Mixed"`, `"Negative"`. NEVER use `"positive"` or `"neutral"`.
+ - `distribution` keys: **snake_case** — `very_positive`, `positive`, `mixed`, `negative`. NEVER use `veryPositive` or `Very Positive` as keys.
+ - `by_source` keys: **lowercase** — `reddit`, `x`. NEVER use `Reddit` or `X`.
+
  ## Final Validation Checklist

  - JSON parse succeeds.
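The shape and enum rules added in this hunk are mechanically checkable. The following validator is a hedged sketch (it is not part of the package; the function name and error strings are mine) that enforces the documented field signatures and casing:

```python
def validate_sentiment(data):
    """Minimal shape check for a classified_sentiment.json payload.

    Checks the documented dashboard signature (`distribution` object +
    `by_source` object) and the enum casing rules. Returns a list of
    human-readable problems; an empty list means the checks passed.
    """
    labels = {"Very Positive", "Positive", "Mixed", "Negative"}
    dist_keys = {"very_positive", "positive", "mixed", "negative"}
    errors = []

    if not isinstance(data.get("distribution"), dict):
        errors.append("distribution missing or not an object")
    elif set(data["distribution"]) - dist_keys:
        errors.append("distribution keys must be snake_case sentiment buckets")

    if not isinstance(data.get("by_source"), dict):
        errors.append("by_source missing or not an object")
    elif set(data["by_source"]) - {"reddit", "x"}:
        errors.append("by_source keys must be lowercase 'reddit'/'x'")

    for product in data.get("product_sentiments", []):
        if product.get("overall") not in labels:
            errors.append(f"bad sentiment label: {product.get('overall')!r}")
    return errors
```

Running it over the example payload above should yield no errors; a payload with `veryPositive` keys or a `Reddit` source key would be flagged before the dashboard ever sees it.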
@@ -1,33 +1,19 @@
  ---
  name: social_media_trend
- description: Analyze existing Reddit/X raw data and generate `classified_trend.json` with strict `TrendData` output. Use for timeline, growth/decline, and discussion-peak analysis requests.
+ description: Analysis-only worker. Reads existing raw Reddit/X data and generates `classified_trend.json` with strict `TrendData` output. Use for timeline, growth/decline, and discussion-peak analysis requests.
  ---

  # Social Media Trend Skill

- This worker converts raw discussion timestamps into a trend timeline with key moments.
+ This worker converts raw discussion timestamps into a trend timeline with key moments. It performs **analysis only** — fetching is handled by the orchestrator via `social_media_fetch`.

- ## Required Inputs
+ ## Prerequisites

- Use existing files only:
+ The following files must already exist (produced by `social_media_fetch`):

- - `reddit_data.json`
- - `x_data.json`
+ - `reddit_data.json` and/or `x_data.json`

- At least one valid source file must exist.
-
- ## Command Execution Flow
-
- Use this sequence for trend analysis:
-
- 1. Fetch or refresh raw data (outside this worker):
-    - `sc-research research:deep "TOPIC"`
-    - Optional filters: `--source=reddit|x|both --from=YYYY-MM-DD --to=YYYY-MM-DD`
- 2. Run this `social_media_trend` worker to generate `classified_trend.json`.
- 3. Optional visualization:
-    - `sc-research visualize`
-
- If raw files are missing, stale, or mismatched for the requested topic/date range, run step 1 first.
+ At least one valid source file must be present. If both are missing, **stop and report failure** — do not attempt to fetch data.

  ## Step 1: Preflight and Date Parsing

@@ -84,6 +70,60 @@ Read `../social_media_schema/SKILL.md` and output strict `TrendData` JSON to:
 
  - `classified_trend.json`

+ ## Output Type Contract
+
+ Your output MUST match this exact shape. The dashboard detects trend data by checking for `date_range` (object) + `timeline` (array). Missing either field = broken tab.
+
+ ```json
+ {
+   "topic": "AI coding assistants",
+   "date_range": {
+     "from": "2025-01-01",
+     "to": "2025-01-31"
+   },
+   "granularity": "day",
+   "timeline": [
+     {
+       "period": "2025-01-01",
+       "post_count": 12,
+       "total_engagement": 3400,
+       "reddit_posts": 8,
+       "x_posts": 4
+     },
+     {
+       "period": "2025-01-02",
+       "post_count": 7,
+       "total_engagement": 1850,
+       "reddit_posts": 5,
+       "x_posts": 2
+     }
+   ],
+   "key_moments": [
+     {
+       "date": "2025-01-15",
+       "event": "Major product update announcement drove spike in discussion",
+       "significance": "high",
+       "url": "https://reddit.com/r/programming/comments/def456"
+     }
+   ]
+ }
+ ```
+
+ ### Period Format Rules
+
+ The `period` field in each timeline entry MUST match the selected granularity:
+
+ | Granularity | Period Format | Example        |
+ | ----------- | ------------- | -------------- |
+ | `"day"`     | `YYYY-MM-DD`  | `"2025-01-15"` |
+ | `"week"`    | `YYYY-Www`    | `"2025-W03"`   |
+ | `"month"`   | `YYYY-MM`     | `"2025-01"`    |
+
+ ### Enum Rules for Trend
+
+ - `significance` on KeyMoment: **lowercase** — `"high"`, `"medium"`, `"low"`. NEVER use `"High"`, `"Medium"`, `"Low"`.
+ - All numeric fields (`post_count`, `total_engagement`, `reddit_posts`, `x_posts`) must be numbers, NEVER strings or null.
+
  ## Final Validation Checklist

  - JSON parse succeeds.
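The period-format table added in this hunk translates directly into per-granularity patterns. The sketch below is illustrative (the regexes are my own rendering of the documented `YYYY-MM-DD` / `YYYY-Www` / `YYYY-MM` formats, and the helper is not package code):

```python
import re

# One pattern per granularity, following the Period Format Rules table.
PERIOD_RE = {
    "day": re.compile(r"^\d{4}-\d{2}-\d{2}$"),  # YYYY-MM-DD
    "week": re.compile(r"^\d{4}-W\d{2}$"),      # YYYY-Www
    "month": re.compile(r"^\d{4}-\d{2}$"),      # YYYY-MM
}


def check_timeline(granularity, timeline):
    """Return the periods that do not match the selected granularity."""
    pattern = PERIOD_RE[granularity]
    return [entry["period"] for entry in timeline
            if not pattern.match(entry["period"])]
```

A non-empty return value means the `timeline` mixes formats (e.g. a `YYYY-MM-DD` period in a `"week"`-granularity file) and the output should be corrected before it is written.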
@@ -1,15 +1,28 @@
  ---
  name: using_social_media_research
- description: This skill should be used when the user asks what people on Reddit/X think about a topic, including requests like "best/top recommendation", "compare options", "sentiment", "trend over time", "controversy/debate", "what is trending", "quick summary", or "full analysis". It acts as the entrypoint router and selects the correct fetch strategy and worker skill.
+ description: This skill should be used when the user asks what people on Reddit/X think about a topic, including requests like "best/top recommendation", "compare options", "sentiment", "trend over time", "controversy/debate", "what is trending", "quick summary", or "full analysis". It acts as the entrypoint router and delegates to the correct fetch and worker skills.
  ---

  # Using Social Media Research (Orchestrator)

- This skill is the router for the research pipeline. Its job is to choose the right path, fetch the right depth, run the right worker, and return the right output file.
+ This skill is the **single entrypoint** for the entire research pipeline.
+
+ ## Pipeline Flow (always follow this order)
+
+ ```
+ User Question
+   → Step 1: Resolve Intent (pick the right worker)
+   → Step 2: Fetch Data (delegate to social_media_fetch)
+   → Step 3: Classify (delegate to the chosen worker skill)
+   → Step 4: Present Results
+   → Step 5: Visualize (optional — run sc-research visualize)
+ ```
+
+ **No other skill or command should run fetch commands directly.** Only this orchestrator decides when and how to fetch.

  ## Auto-Trigger Cues

- Treat this as the entrypoint skill when user requests map to social-media research intent, for example:
+ Activate this skill when the user's request maps to social-media research intent:

  - "What do people think about X?"
  - "Best X according to Reddit?"
@@ -18,111 +31,116 @@ Treat this as the entrypoint skill when user requests map to social-media resear
  - "Is X trending recently?"
  - "What are people debating about X?"
  - "Give me a quick social media summary"
+ - "Full analysis of X"

- ## Core Routing Contract
-
- - For analysis intents, run exactly one worker by default.
- - Run multiple workers only when the user explicitly asks for multi-view output (for example: "full analysis", "all views", "run everything").
- - Use deep fetch for every worker analysis.
- - Use quick fetch for explicit quick-answer requests.
- - If intent is still ambiguous after routing rules, use rank as the default overview route.
- - Reuse existing raw data only if it still matches topic, source, and date range; otherwise refetch.
-
- ## Command Execution Summary
+ ## Core Contract

- After route selection, execute commands as follows:
+ - Run exactly **one** worker by default.
+ - Run **multiple** workers only when the user explicitly asks for multi-view output ("full analysis", "all views", "run everything").
+ - Always use **deep** fetch for worker analysis routes.
+ - Use **quick** fetch only for explicit quick-answer requests.
+ - If intent is ambiguous after all routing rules, default to **quick-answer mode** (direct text response, no classified file).
+ - **Delegate all fetching to `social_media_fetch`** — never run `sc-research research` commands directly from this skill or any worker.

- - Rank / Sentiment / Trend / Controversy routes:
-   - `sc-research research:deep "TOPIC" [--source=...] [--from=YYYY-MM-DD --to=YYYY-MM-DD]`
-   - then run the selected worker skill to produce the matching `classified_*.json`.
- - Discovery route:
-   - Broad weekly feed: `sc-research research:deep "DISCOVERY_WEEKLY" --mode=discovery [--source=...] [--from=... --to=...]`
-   - Topic-focused: `sc-research research:deep "TOPIC" --mode=discovery [--source=...] [--from=... --to=...]`
-   - then run `social_media_discovery`.
- - Quick answer route:
-   - `sc-research research "TOPIC" --source=reddit [--from=... --to=...]`
-   - then synthesize a short text answer (no classified file).
- - Optional dashboard view after any classified output:
-   - `sc-research visualize`
+ ---

  ## Step 1: Resolve Intent (Strict Precedence)

- Apply rules top-to-bottom:
+ Apply rules top-to-bottom. First match wins.

  1. **Explicit multi-analysis request**
-    - If user asks for "full analysis", "all views", or equivalent, run:
-      1. `social_media_rank`
-      2. `social_media_sentiment`
-      3. `social_media_trend`
-      4. `social_media_controversy`
-    - Include `social_media_discovery` only if the user also asks for emerging/viral topic discovery.
+    - Trigger: "full analysis", "all views", "run everything", or equivalent.
+    - Run all four workers in order: `social_media_rank` → `social_media_sentiment` → `social_media_trend` → `social_media_controversy`.
+    - Include `social_media_discovery` only if the user also asks about emerging/viral topics.
+
  2. **Explicit template request**
-    - If the user names a template ("sentiment", "trend", "controversy", "discovery", "rank"), that route wins.
+    - Trigger: the user names a specific analysis ("sentiment", "trend", "controversy", "discovery", "rank").
+    - Route directly to that single worker.
+
  3. **Explicit quick-answer request**
-    - If the user asks for a quick/brief answer, use quick-answer mode (no classified file).
+    - Trigger: "quick answer", "short summary", "brief".
+    - Use quick-answer mode (no classified file, direct text response).
+
  4. **Inferred strongest intent**
-    - Map by primary question intent (keywords table below).
- 5. **Fallback**
+    - Map by primary question keywords (see table below).

-    - If still ambiguous, default to `social_media_rank`.
+ 5. **Fallback**
+    - Default to **quick-answer mode** — synthesize a 3–5 sentence answer directly from the raw data. Do not produce a `classified_*.json` file.

- ## Step 2: Map Intent to Route
+ ### Intent → Worker Mapping

- | Route | Typical trigger phrases | Worker | Output |
+ | Route | Trigger phrases | Worker skill | Output file |
  | ------------ | ------------------------------------------------- | -------------------------- | ----------------------------- |
  | Rank | best, top, compare, recommendation, which one | `social_media_rank` | `classified_rank.json` |
  | Sentiment | feel, sentiment, opinion, positive/negative | `social_media_sentiment` | `classified_sentiment.json` |
  | Trend | timeline, over time, peak, growth, decline | `social_media_trend` | `classified_trend.json` |
  | Controversy | debate, divisive, disagreement, polarizing, vs | `social_media_controversy` | `classified_controversy.json` |
  | Discovery | trending topics, viral, discover themes, clusters | `social_media_discovery` | `classified_discovery.json` |
- | Quick Answer | quick answer, short summary, brief | none | direct text answer |
+ | Quick Answer | quick answer, short summary, brief | _(none)_ | direct text answer |
+
+ ### Source Preference Detection
+
+ | User wording | Source value |
+ | ------------------------------------- | ------------------------------------------------------- |
+ | "on Reddit", "subreddit", "Redditors" | `reddit` |
+ | "on X", "on Twitter", "tweets" | `x` |
+ | no explicit source | _(omit — runtime uses all sources with valid API keys)_ |
+
+ ## Step 2: Fetch Data (delegate to `social_media_fetch`)
+
+ Read the `social_media_fetch` skill and follow its instructions to fetch raw data. Provide it with:
+
+ - **topic**: the user's topic string
+ - **depth**: `deep` for all worker routes, `quick` for the quick-answer route
+ - **mode**: `discovery` for the discovery route, `research` for all others
+ - **source**: from the source preference above (omit if not specified)
+ - **date range**: `from`/`to` if the user provided dates
+
+ The fetch skill handles data freshness checks, CLI execution, and output validation. Do not duplicate that logic here.
+
+ After fetch completes, confirm that at least one raw data file (`reddit_data.json` / `x_data.json`) exists and is valid before proceeding.
+
+ ## Step 3: Classify (delegate to worker skill)

- ## Step 3: Detect Source and Fetch Depth
+ Based on the route chosen in Step 1:

- ### Source Detection
+ - **Single route**: Read the selected worker skill's instructions (e.g., `social_media_rank`) and follow them to produce the matching `classified_*.json` file.
+ - **Multi-route** (full analysis): Read and execute each worker skill in order. Each worker reads existing raw data and writes its own output file independently.
+ - **Quick answer**: Synthesize a 3–5 sentence answer directly from the raw data. Do not produce any `classified_*.json` file.

- | User wording | Source flag |
- | ------------------------------------- | ----------------------------------------------------------------------------- |
- | "on Reddit", "subreddit", "Redditors" | `--source=reddit` |
- | "on X", "on Twitter", "tweets" | `--source=x` |
- | no explicit source | no source flag (runtime uses all enabled sources based on available API keys) |
+ ### CRITICAL: Schema Enforcement

- ### Fetch Strategy
+ Before writing ANY `classified_*.json` file, you MUST:

- - **Worker routes (rank/sentiment/trend/controversy):**
-   - `sc-research research:deep "TOPIC" [--source=...] [--from=YYYY-MM-DD --to=YYYY-MM-DD]`
- - **Discovery route:**
-   - Broad weekly feed: `sc-research research:deep "DISCOVERY_WEEKLY" --mode=discovery [--source=...] [--from=... --to=...]`
-   - Topic-focused: `sc-research research:deep "TOPIC" --mode=discovery [--source=...] [--from=... --to=...]`
- - **Quick answer route:**
-   - `sc-research research "TOPIC" --source=reddit [--from=... --to=...]`
+ 1. Read the `social_media_schema` skill — it is the **single source of truth** for JSON shapes.
+ 2. Each worker skill contains an **Output Type Contract** section with a concrete JSON example — match it exactly.
+ 3. Use ONLY the enum values listed in the schema. Wrong casing = broken dashboard.

- ## Step 4: Data Freshness Check Before Fetch
+ The dashboard auto-detects each classified type by checking for specific field signatures. If required fields are missing or misnamed, the dashboard will **not show that tab at all**.

- Before refetching, check whether existing `reddit_data.json` / `x_data.json` can be reused:
+ ## Step 4: Present Results

- - Same or equivalent topic intent
- - Same requested source scope
- - Same requested date window (if provided)
- - Files are parseable and contain `items` arrays
+ - Confirm the expected `classified_*.json` file(s) exist and are parseable.
+ - Present results matching what was requested:
+   - Single route → single output summary
+   - Multi route → sectioned summary per route
+   - Quick answer → 3–5 sentence direct text

- If any check fails, run a fresh fetch.
+ ## Step 5: Visualize (optional)

- ## Step 5: Execute and Return
+ Suggest running `sc-research visualize` to view results in the web dashboard. The visualize command:

- - Run the selected worker skill.
- - Confirm expected `classified_*.json` file exists and is parseable.
- - Present only what was requested:
-   - Single-route request -> single output summary
-   - Multi-route request -> sectioned summary per route
-   - Quick answer -> 3-5 sentence direct answer, no classified output required
+ 1. Reads all `classified_*.json` files in the working directory.
+ 2. Validates each against the expected schema.
+ 3. Merges them into a single `data.json` for the dashboard.
+ 4. Opens the dashboard at `localhost:5173`.

- ## Ambiguity and Safety Rules
+ ## Safety Rules

- - If a request mixes intents without explicit multi-analysis wording, prefer one route and state what was chosen.
- - Never fabricate missing output files.
+ - If a request mixes intents without explicit multi-analysis wording, pick the single strongest route and state what was chosen.
+ - Never fabricate output files or data.
  - Never silently switch from deep to quick fetch for worker routes.
- - Keep each route independent: each `classified_*.json` can be produced without generating the others.
+ - Each `classified_*.json` file is independent; producing one never requires producing another.

  ## File Map
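The keyword table in the orchestrator's Step 1 can be approximated by a first-match router. This sketch is a simplification for illustration (route names and keywords come from the table; the matching logic, function name, and the explicit-request precedence rules 1–3 are deliberately omitted and would sit in front of it in a real implementation):

```python
# First-match routing table, in precedence order from the mapping table.
ROUTES = [
    ("rank", ("best", "top", "compare", "recommendation", "which one")),
    ("sentiment", ("feel", "sentiment", "opinion", "positive", "negative")),
    ("trend", ("timeline", "over time", "peak", "growth", "decline")),
    ("controversy", ("debate", "divisive", "disagreement", "polarizing")),
    ("discovery", ("trending topics", "viral", "discover themes", "clusters")),
]


def resolve_route(question):
    """Return the first matching route; fall back to quick-answer mode."""
    q = question.lower()
    for route, keywords in ROUTES:
        if any(keyword in q for keyword in keywords):
            return route
    return "quick_answer"
```

Note the fallback returns quick-answer mode, matching the 1.0.14 change away from defaulting to `social_media_rank`.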