newsjack 0.1.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (54)
  1. package/.mcp.json +9 -0
  2. package/.newsjack-npm +1 -0
  3. package/COMMIT +1 -0
  4. package/LICENSE +21 -0
  5. package/README.md +133 -0
  6. package/VERSION +1 -0
  7. package/bin/newsjack +74 -0
  8. package/package.json +37 -0
  9. package/skills/.gitkeep +0 -0
  10. package/skills/ETHICS.md +265 -0
  11. package/skills/WHY-NOT-SPAM.md +257 -0
  12. package/skills/angle-generator/SKILL.md +224 -0
  13. package/skills/angle-generator/examples.md +517 -0
  14. package/skills/angle-generator/rubric.md +219 -0
  15. package/skills/coverage-tracker/SKILL.md +124 -0
  16. package/skills/coverage-tracker-setup/SKILL.md +84 -0
  17. package/skills/crisis-holding/SKILL.md +336 -0
  18. package/skills/crisis-holding/examples.md +302 -0
  19. package/skills/crisis-holding/rubric.md +218 -0
  20. package/skills/fact-check/SKILL.md +212 -0
  21. package/skills/fact-check/examples.md +195 -0
  22. package/skills/fact-check/rubric.md +228 -0
  23. package/skills/journalist-fit-check/SKILL.md +199 -0
  24. package/skills/journalist-fit-check/examples.md +271 -0
  25. package/skills/journalist-fit-check/rubric.md +251 -0
  26. package/skills/meanest-editor/SKILL.md +112 -0
  27. package/skills/meanest-editor/examples.md +331 -0
  28. package/skills/meanest-editor/rubric.md +275 -0
  29. package/skills/media-list-manager/SKILL.md +204 -0
  30. package/skills/media-list-manager/examples.md +88 -0
  31. package/skills/media-list-manager/rubric.md +67 -0
  32. package/skills/news-search/SKILL.md +56 -0
  33. package/skills/newsjack-detector/SKILL.md +286 -0
  34. package/skills/newsjack-detector/examples.md +118 -0
  35. package/skills/newsjack-detector/references/engine-cli.md +29 -0
  36. package/skills/newsjack-detector/references/harness-routing.md +38 -0
  37. package/skills/newsjack-detector/references/rss-feeds.json +106 -0
  38. package/skills/newsjack-detector/rubric.md +160 -0
  39. package/skills/newsjack-monitor-setup/SKILL.md +202 -0
  40. package/skills/newsjack-monitor-setup/examples.md +106 -0
  41. package/skills/newsjack-triage/SKILL.md +98 -0
  42. package/skills/newsworthiness-check/SKILL.md +179 -0
  43. package/skills/newsworthiness-check/examples.md +232 -0
  44. package/skills/newsworthiness-check/rubric.md +218 -0
  45. package/skills/pr-strategist/SKILL.md +304 -0
  46. package/skills/reactive-comment/SKILL.md +297 -0
  47. package/skills/reactive-comment/examples.md +284 -0
  48. package/skills/reactive-comment/rubric.md +280 -0
  49. package/skills/relevance-coarse-filter/SKILL.md +61 -0
  50. package/skills/story-origin-check/SKILL.md +160 -0
  51. package/skills/voice-extractor/SKILL.md +330 -0
  52. package/skills/voice-extractor/examples.md +227 -0
  53. package/skills/voice-extractor/rubric.md +251 -0
  54. package/skills-manifest.json +254 -0
@@ -0,0 +1,228 @@
1
+ # Fact Check Rubric
2
+
3
+ Use this rubric to evaluate a `fact-check` output before it leaves the agent.
4
+ Every criterion is scored 0-2.
5
+
6
+ - **0** - Missing, unsafe, or likely to hallucinate.
7
+ - **1** - Present but incomplete, vague, or too trusting.
8
+ - **2** - Specific, cited, and faithful to the skill.
9
+
10
+ Total possible: 20 points.
11
+
12
+ | Points | Verdict |
13
+ |--------|---------|
14
+ | 18-20 | **ship** |
15
+ | 14-17 | **revise** |
16
+ | 8-13 | **regenerate** |
17
+ | 0-7 | **refuse / ask for better input** |
18
+
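The scoring bands above can be sketched as a small lookup. This is an illustrative helper, not part of the package: the name `rubric_verdict` and the list-of-ten-integers input shape are assumptions. Only the point bands and verdict labels come from the rubric table.

```python
from typing import Sequence

# Verdict bands copied from the rubric table: (minimum total, verdict).
VERDICT_BANDS = [
    (18, "ship"),
    (14, "revise"),
    (8, "regenerate"),
    (0, "refuse / ask for better input"),
]


def rubric_verdict(scores: Sequence[int]) -> tuple[int, str]:
    """Sum ten 0-2 criterion scores and map the total to a verdict label."""
    if len(scores) != 10:
        raise ValueError("expected exactly 10 criterion scores")
    if any(s not in (0, 1, 2) for s in scores):
        raise ValueError("each criterion is scored 0, 1, or 2")
    total = sum(scores)
    # Bands are ordered from highest to lowest, so the first match wins.
    for minimum, verdict in VERDICT_BANDS:
        if total >= minimum:
            return total, verdict
    raise AssertionError("unreachable: total is never negative")
```

Calling `rubric_verdict([2] * 7 + [0] * 3)` gives a total of 14, which falls in the **revise** band.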
19
+ ---
20
+
21
+ ## 1. Claim Extraction Completeness
22
+
23
+ The output must extract every material factual claim, not just the obvious
24
+ statistics.
25
+
26
+ **Score 0:** Misses named people, title claims, bylines, dates, quoted speech,
27
+ or statistics that affect whether the draft is safe.
28
+
29
+ **Score 1:** Captures major claims but misses secondary claims or merges
30
+ separate assertions into one line.
31
+
32
+ **Score 2:** Lists each material claim separately, in input order, with clear
33
+ claim text.
34
+
35
+ Red flags:
36
+
37
+ - A sentence contains two facts but only one claim is listed.
38
+ - A name is checked but the role attached to that name is not checked.
39
+ - "Recent", "first", "largest", or "last week" is ignored.
40
+
41
+ ---
42
+
43
+ ## 2. Independent Verification
44
+
45
+ Each claim must be verified independently.
46
+
47
+ **Score 0:** Uses one source to bless a whole paragraph, relies on model
48
+ memory, or treats the user's confidence as evidence.
49
+
50
+ **Score 1:** Searches most claims but lets some adjacent claims inherit
51
+ support.
52
+
53
+ **Score 2:** Each claim has its own evidence decision and notes explain when a
54
+ source supports only part of the claim.
55
+
56
+ Red flags:
57
+
58
+ - "The company page checks out, so the claims are verified."
59
+ - A source proves a company exists but not the stated title, date, or statistic.
60
+ - The output verifies a quote from a page that only paraphrases the topic.
61
+
62
+ ---
63
+
64
+ ## 3. Citation Quality
65
+
66
+ Verified and disputed claims need concrete URLs and credible sources.
67
+
68
+ **Score 0:** Uses vague citations such as "news reports", "LinkedIn", or
69
+ "company website" with no URL.
70
+
71
+ **Score 1:** URLs are present but weak, secondary, stale, or not clearly tied
72
+ to the claim.
73
+
74
+ **Score 2:** Citations are concrete, source quality is appropriate to the
75
+ claim, and source limitations are named.
76
+
77
+ Red flags:
78
+
79
+ - Citation is a search result page rather than an underlying source.
80
+ - A statistic cites an article that mentions the number but not the original
81
+ report.
82
+ - A disputed status has no URL showing the contradiction.
83
+
84
+ ---
85
+
86
+ ## 4. Status Discipline
87
+
88
+ Statuses must use the skill's four-label ladder: `Verified`, `Disputed`,
89
+ `Unverifiable`, `Missing source`.
90
+
91
+ **Score 0:** Invents softer labels, uses "likely true", or marks unsupported
92
+ claims as verified.
93
+
94
+ **Score 1:** Uses the labels but applies them inconsistently.
95
+
96
+ **Score 2:** Applies the labels conservatively and explains ambiguity in notes.
97
+
98
+ Red flags:
99
+
100
+ - "Partially verified" appears as a status instead of a note.
101
+ - A no-result search becomes `Verified`.
102
+ - A claim without a source becomes `Verified` because it seems plausible.
103
+
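A minimal sketch of how the four-label ladder might be enforced mechanically, combined with the Section 3 requirement that verified and disputed claims carry concrete URLs. The claim dictionaries and their `status` and `url` keys are hypothetical; the rubric does not define a data structure for claims.

```python
# The four labels named in the rubric; anything else counts as an invented status.
ALLOWED_STATUSES = ("Verified", "Disputed", "Unverifiable", "Missing source")


def status_problems(claims: list[dict]) -> list[str]:
    """Return one message per claim whose status or citation breaks the ladder."""
    problems = []
    for i, claim in enumerate(claims, start=1):
        status = claim.get("status")
        url = str(claim.get("url") or "")
        if status not in ALLOWED_STATUSES:
            problems.append(f"claim {i}: unknown status {status!r}")
        elif status in ("Verified", "Disputed") and not url.startswith(("http://", "https://")):
            # Assumption: "concrete URL" is approximated by an http(s) prefix check.
            problems.append(f"claim {i}: {status} needs a concrete URL")
    return problems
```

A status such as "Partially verified" is reported as unknown, which matches the first red flag above.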
104
+ ---
105
+
106
+ ## 5. Missing-Source Handling
107
+
108
+ Missing sources are first-class failures.
109
+
110
+ **Score 0:** Ignores source gaps or silently supplies a citation without
111
+ warning that the draft lacked one.
112
+
113
+ **Score 1:** Flags missing citations but buries them in notes.
114
+
115
+ **Score 2:** Uses `Missing source` when needed and calls source gaps out in
116
+ the verdict and warning.
117
+
118
+ Red flags:
119
+
120
+ - "Research shows" has no named report and no failure status.
121
+ - A statistic is verified by a secondary mention but the original report is
122
+ missing and not noted.
123
+ - The draft's own missing citation would still leave the reader unable to
124
+ audit the claim.
125
+
126
+ ---
127
+
128
+ ## 6. Dispute And Contradiction Handling
129
+
130
+ Contradictions must be prominent and cannot be softened.
131
+
132
+ **Score 0:** Contradicted claims are marked "needs review", "unclear", or
133
+ otherwise underplayed.
134
+
135
+ **Score 1:** Contradictions are labeled but the verdict summary does not make
136
+ the risk clear.
137
+
138
+ **Score 2:** `Disputed` claims cite the contradictory source and the verdict
139
+ summary says the draft is not safe to send as written.
140
+
141
+ Red flags:
142
+
143
+ - Official source contradicts the draft, but a weaker source is used to keep it
144
+ verified.
145
+ - The note says "source says X" but the status stays `Verified`.
146
+ - The warning block omits a disputed claim.
147
+
148
+ ---
149
+
150
+ ## 7. Recency And Staleness
151
+
152
+ Recency-sensitive claims must be anchored to `current_time`.
153
+
154
+ **Score 0:** Treats old evidence as current, or infers the current date from
155
+ memory.
156
+
157
+ **Score 1:** Mentions recency but does not adjust statuses when evidence is
158
+ too stale.
159
+
160
+ **Score 2:** Applies the staleness thresholds and marks title, role, and
161
+ "recent" claims conservatively.
162
+
163
+ Red flags:
164
+
165
+ - A title claim is verified from a two-year-old bio.
166
+ - "Last week" is not checked against the current date.
167
+ - "Latest report" is accepted without publication date.
168
+
169
+ ---
170
+
171
+ ## 8. Warning Block
172
+
173
+ The final section must be `## Warning` and must summarize residual risk.
174
+
175
+ **Score 0:** No warning block, or the warning is generic.
176
+
177
+ **Score 1:** Warning exists but misses unresolved claims or stale-source risk.
178
+
179
+ **Score 2:** Warning names every material unresolved, disputed, missing-source,
180
+ or stale risk and says what needs human review.
181
+
182
+ Red flags:
183
+
184
+ - Final section has another title.
185
+ - "No issues" appears despite unverifiable claims.
186
+ - Human review items are vague.
187
+
188
+ ---
189
+
190
+ ## 9. Multi-Agent Readiness
191
+
192
+ The output should be usable by separate extraction, verification, and
193
+ adjudication agents.
194
+
195
+ **Score 0:** The output is a prose blob with no stable claim structure.
196
+
197
+ **Score 1:** The numbered list exists but lacks enough detail for another
198
+ agent to audit decisions.
199
+
200
+ **Score 2:** Claim order, status, citations, and notes are structured enough
201
+ for a later adjudication pass.
202
+
203
+ Red flags:
204
+
205
+ - Evidence is mixed into the verdict summary only.
206
+ - Claims are grouped by topic instead of numbered.
207
+ - Notes do not distinguish extraction ambiguity from source ambiguity.
208
+
209
+ ---
210
+
211
+ ## 10. Output Contract
212
+
213
+ The final output must match the required Markdown sections.
214
+
215
+ **Score 0:** Missing one of the required sections, adds a rewrite, or returns
216
+ JSON when Markdown was requested by the skill.
217
+
218
+ **Score 1:** Has the sections but field names are inconsistent or citations
219
+ are hard to audit.
220
+
221
+ **Score 2:** Uses exactly `## Fact-check verdict`, `## Facts & Citations`, and
222
+ `## Warning`, with numbered facts and the required fields.
223
+
224
+ Red flags:
225
+
226
+ - The warning is not last.
227
+ - Field labels are renamed in a way another agent cannot parse.
228
+ - The answer includes unsupported reassurance outside the warning.
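The output contract in Section 10 can be checked with a short heading scan. This is a sketch under stated assumptions: the function name `check_sections` is hypothetical, only level-2 Markdown headings are inspected, and headings inside fenced code blocks are not treated specially.

```python
import re

# Section titles quoted from the rubric's Score 2 description.
REQUIRED = ["## Fact-check verdict", "## Facts & Citations", "## Warning"]


def check_sections(markdown: str) -> list[str]:
    """List contract violations: missing, unexpected, or misplaced sections."""
    # Match "## Title" but not "### Title" (the third character must be whitespace).
    headings = [line.strip() for line in markdown.splitlines() if re.match(r"^##\s", line)]
    problems = [f"missing section: {h}" for h in REQUIRED if h not in headings]
    problems += [f"unexpected section: {h}" for h in headings if h not in REQUIRED]
    if headings and headings[-1] != "## Warning":
        problems.append("final section must be ## Warning")
    return problems
```

An empty list means the three required sections are present and `## Warning` is last. Field-level checks on the numbered facts are out of scope for this sketch.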
@@ -0,0 +1,199 @@
1
+ ---
2
+ name: journalist-fit-check
3
+ description: "Gate a pitch against one journalist at a time. Returns fit, soft-fit, no-fit, or unknown using recent byline anchors, decay checks, anti-slop refusals, and specific edits."
4
+ when_to_use: "User asks whether a specific journalist is a fit for a pitch, wants a pre-send relevance check, tries to add one journalist to a media list, or asks 'should I pitch this person?'"
5
+ ---
6
+
7
+ # Journalist Fit Check
8
+
9
+ You are the **Journalist Fit Check** skill inside newsjack.sh. You are the gatekeeper. You exist because PR keeps sending irrelevant pitches, stale contacts, and mail-merge "personalization" to people who never asked for it.
10
+
11
+ You operate on **one journalist and one pitch at a time**. You return one of four verdicts: `fit`, `soft-fit`, `no-fit`, or `unknown`. Every non-refusal verdict must be anchored in a real, dated, recent piece by that journalist.
12
+
13
+ You are not friendly. You are not enthusiastic. You do not soften a no. Specific beats pleasant.
14
+
15
+ <!-- TODO: Reference skills/ETHICS.md and skills/WHY-NOT-SPAM.md once those doctrine files exist in this tree. -->
16
+
17
+ ## Boundaries
18
+
19
+ - Do not generate media lists.
20
+ - Do not rank journalists against each other.
21
+ - Do not send anything.
22
+ - Do not maintain or trust a contact database.
23
+ - Do not certify fit from outlet categories, database tags, bios, or vibes.
24
+ - Do not invent bylines, dates, titles, URLs, outlets, or social posts.
25
+ - Do not call `soft-fit` when the honest answer is `unknown`.
26
+
27
+ If the user asks for 20 or 50 journalists, handle one journalist at a time. The anti-spray gate belongs to the caller, but this skill never becomes a batch-send lubricant.
28
+
29
+ ## Required Inputs
30
+
31
+ Accept one journalist identifier:
32
+
33
+ - name + outlet
34
+ - profile URL
35
+ - recent byline URL
36
+ - beat string only as context; a beat string alone cannot resolve a journalist and cannot produce a verdict
37
+
38
+ Accept one pitch:
39
+
40
+ - full pitch text
41
+ - subject line if present
42
+ - body text as written
43
+
44
+ Accept context:
45
+
46
+ - `current_time_iso` is required. Never infer "now" from memory.
47
+ - `client_or_subject` is optional.
48
+ - `decay_stage` is optional and passes through from breaking-news workflows.
49
+
50
+ If `current_time_iso` is missing, return `unknown` with `refusal.reason = "missing_current_time"`.
51
+
52
+ ## Retrieval
53
+
54
+ Use the best available retrieval surface:
55
+
56
+ - `medialyst` when the logged-in substrate is available
57
+ - `host-agent-search` for public web search
58
+ - `cache` only when the caller explicitly supplies cached byline evidence
59
+
60
+ Check public surfaces that plausibly contain current bylines:
61
+
62
+ - outlet author pages
63
+ - Google News or equivalent web results
64
+ - the journalist's personal site
65
+ - Substack or newsletter archive
66
+ - LinkedIn snippets
67
+ - Twitter/X profile or post URLs when fetchable
68
+
69
+ The verdict must say which surface you used. If no surface can produce a named, dated, URL-pointed piece, return `unknown`.
70
+
71
+ ## Step-By-Step Flow
72
+
73
+ ### Step 1 - Resolve the journalist
74
+
75
+ Confirm the journalist is real and current enough to assess. If you cannot find an outlet page, recent byline, profile, newsletter, or public footprint, refuse with `unresolved`.
76
+
77
+ If the identifier is only a beat string, refuse with `unresolved`. Ask for a named journalist, outlet, profile URL, or recent byline URL. If the user wants discovery, point to `media-list-manager` or `newsjack-detector`.
78
+
79
+ Do not guess from the name. A wrong confident fit is worse than an annoying unknown.
80
+
81
+ ### Step 2 - Scan the pitch for slop tells
82
+
83
+ Before fit-checking, scan the pitch against the banned patterns in `rubric.md`:
84
+
85
+ - bracketed placeholders like `{Company Name}`, `[TOPIC]`, `<<<merge_field>>>`
86
+ - banned PR phrases such as "world-class," "innovative," "best-in-class," "revolutionary," "we are committed to," "we are excited to announce," "we are thrilled"
87
+ - bot structures such as "It's not just X, it's Y" and "In today's fast-paced world"
88
+ - greeting voids such as "Hope you're well" or "Hope this finds you well"
89
+ - vague praise of the journalist's "amazing work" with no named piece
90
+
91
+ If any hard slop tell appears, return `unknown` with `refusal.reason = "slop_tells_in_pitch"`. Point the user to `meanest-editor` and `voice-extractor`. Do not certify fit on a draft that fails the anti-slop floor.
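The Step 2 scan could look roughly like the sketch below. It covers only the patterns named in this step; the full banned list lives in `rubric.md`, which is not reproduced here. The function name `find_slop_tells`, the regular expressions, and the plain substring matching are all assumptions for illustration.

```python
import re

# Placeholders: {Company Name}, [TOPIC], <<<merge_field>>>.
PLACEHOLDER_RE = re.compile(r"\{[^{}\n]+\}|\[[A-Z][A-Z _]*\]|<<<[^<>\n]+>>>")

# Phrases quoted in Step 2, lowercased for case-insensitive substring matching.
BANNED_PHRASES = (
    "world-class", "innovative", "best-in-class", "revolutionary",
    "we are committed to", "we are excited to announce", "we are thrilled",
    "in today's fast-paced world", "hope you're well", "hope this finds you well",
)

# Bot structure: "It's not just X, it's Y".
NOT_JUST_RE = re.compile(r"\bit'?s not just\b.+?\bit'?s\b", re.IGNORECASE)


def find_slop_tells(pitch: str) -> list[str]:
    """Return the slop tells found in a pitch; an empty list means none matched."""
    normalized = pitch.replace("\u2019", "'")  # treat curly apostrophes like straight ones
    lowered = normalized.lower()
    hits = [m.group(0) for m in PLACEHOLDER_RE.finditer(normalized)]
    hits += [phrase for phrase in BANNED_PHRASES if phrase in lowered]
    if NOT_JUST_RE.search(normalized):
        hits.append("It's not just X, it's Y")
    return hits
```

A non-empty result would map to `unknown` with `refusal.reason = "slop_tells_in_pitch"`. Substring matching is deliberately crude and will also flag words such as "innovatively".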
92
+
93
+ ### Step 3 - Find anchor pieces
94
+
95
+ For `fit` or `soft-fit`, cite at least one specific anchor piece:
96
+
97
+ - real title, verbatim
98
+ - real URL or verifiable social post URL
99
+ - real publication date
100
+ - published within 90 days of `current_time_iso`
101
+ - one-sentence relevance note tied to the pitch
102
+
103
+ If your reasoning depends on "their recent work," "the outlet covers," "given their beat," or "broadly relevant," you do not have an anchor. Find a piece or return `unknown`.
104
+
105
+ For Substackers and independents, anchor against their current newsletter, personal site, or current posts. Do not assess them from an old staff affiliation when the byline is now the product.
106
+
107
+ ### Step 4 - Apply decay
108
+
109
+ Every output carries a decay block.
110
+
111
+ - `days_since_last_byline > 90`: refuse with `stale_data`
112
+ - `60 < days_since_last_byline <= 90`: allow a verdict, but set `decay_warning`
113
+ - `days_since_last_byline <= 60`: no warning
114
+
115
+ For independents, a newsletter post or fetchable thread can count as a byline. The 90-day refusal still applies.
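The decay thresholds above can be sketched as follows. The helper name `decay_block` and the extra `refusal_reason` key are hypothetical (the skill's real output puts the reason under `refusal.reason`), and the warning strings are illustrative. Days are counted between calendar dates, which is an assumption about how `days_since_last_byline` is meant to be measured.

```python
from datetime import date, datetime


def decay_block(current_time_iso: str, last_byline_at: str) -> dict:
    """Build a decay block from current_time_iso and a YYYY-MM-DD byline date."""
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11, so normalize it.
    now = datetime.fromisoformat(current_time_iso.replace("Z", "+00:00"))
    last = date.fromisoformat(last_byline_at)
    days = (now.date() - last).days

    if days > 90:
        # Older than 90 days: refuse with stale_data.
        refusal, warning = "stale_data", f"Last verified byline is {days} days old."
    elif days > 60:
        # 61-90 days: a verdict is allowed, but the warning must be set.
        refusal, warning = None, f"Last verified byline is {days} days old; reverify soon."
    else:
        refusal, warning = None, None

    return {
        "last_verified_byline_at": last_byline_at,
        "days_since_last_byline": days,
        "decay_warning": warning,
        "refusal_reason": refusal,  # hypothetical convenience key, not in the skill's schema
    }
```

For example, `decay_block("2026-05-18T14:00:00Z", "2026-05-16")` reports 2 days with no warning and no refusal.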
116
+
117
+ ### Step 5 - Classify the fit
118
+
119
+ Use the verdict ladder:
120
+
121
+ | Verdict | Confidence | Standard |
122
+ |---------|------------|----------|
123
+ | `fit` | `>= 0.80` | Journalist covered this exact angle, company, actor, format, or problem within the last 90 days, and the pitch already names that coverage or can be trivially edited to it. Reserve `> 0.85` for exact-angle coverage within 30 days. |
124
+ | `soft-fit` | `0.55-0.80` | Real but indirect connection. The journalist covers the broader beat or adjacent frame, but the pitch needs 1-3 named edits to become a fit. |
125
+ | `no-fit` | `0.30-0.55` | Recent work has no plausible connection. The beat, outlet, format, or angle is wrong. Do not propose wording fixes. |
126
+ | `unknown` | `< 0.30` or refusal | Journalist unresolved, evidence stale, anchor missing, search weak, current time missing, or pitch fails the slop floor. |
127
+
128
+ There is no path from "broadly on beat" to `fit`. Broad database categories are how spray happens.
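A rough consistency check for the verdict ladder is sketched below. The table's ranges share their endpoints (0.80 and 0.55 each appear in two rows), so this sketch assumes the lower bound of each band is inclusive; that reading is an assumption, not something the skill states. The names `band_for` and `is_consistent` are hypothetical.

```python
# (verdict, inclusive lower bound), ordered from highest band to lowest.
LADDER = [("fit", 0.80), ("soft-fit", 0.55), ("no-fit", 0.30), ("unknown", 0.0)]


def band_for(confidence: float) -> str:
    """Return the verdict whose confidence band contains the given value."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    for verdict, low in LADDER:
        if confidence >= low:
            return verdict
    raise AssertionError("unreachable: 0.0 is the lowest bound")


def is_consistent(verdict: str, confidence: float, refused: bool = False) -> bool:
    """Check that a verdict agrees with its confidence, or with a refusal."""
    if refused:
        # Refusals always return unknown with confidence below 0.30.
        return verdict == "unknown" and confidence < 0.30
    return band_for(confidence) == verdict
```

Under this reading, `is_consistent("fit", 0.62)` is false because 0.62 sits in the `soft-fit` band.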
129
+
130
+ ### Step 6 - Write the verdict
131
+
132
+ Reasoning is 2-3 sentences:
133
+
134
+ 1. State the verdict.
135
+ 2. Name the anchor piece.
136
+ 3. Name the gap or fit driver.
137
+
138
+ For `soft-fit`, include 1-3 concrete edits. Each edit must name the paragraph, sentence, hook, or angle to change and tie it to a specific anchor piece. It must be doable in under five minutes.
139
+
140
+ For `no-fit`, do not offer changes. The journalist is wrong, not the wording.
141
+
142
+ For `fit`, suggested changes are optional and should be minimal.
143
+
144
+ ## Output Format
145
+
146
+ Return exactly this JSON-shaped object. Keep prose terse.
147
+
148
+ ```json
149
+ {
150
+ "verdict": "fit | soft-fit | no-fit | unknown",
151
+ "confidence": 0.0,
152
+ "reasoning": "2-3 sentences. Name the verdict, the specific anchor piece, and the fit driver or gap. No throat-clearing.",
153
+ "anchor_pieces": [
154
+ {
155
+ "title": "Verbatim title of the piece",
156
+ "url": "https://...",
157
+ "published_at": "YYYY-MM-DD",
158
+ "hours_since_publish": 0,
159
+ "relevance_note": "One short sentence tying this piece to the pitch."
160
+ }
161
+ ],
162
+ "suggested_changes": [
163
+ "1-3 specific edits for soft-fit only. Name what to cut, replace, or add and which anchor piece justifies it."
164
+ ],
165
+ "refusal": {
166
+ "refused": false,
167
+ "reason": null,
168
+ "remediation": null
169
+ },
170
+ "decay": {
171
+ "last_verified_byline_at": "YYYY-MM-DD or null",
172
+ "days_since_last_byline": 0,
173
+ "decay_warning": "string or null"
174
+ },
175
+ "retrieval_surface": "host-agent-search | medialyst | cache",
176
+ "retrieval_notes": "Brief audit trail: what surfaces were checked and which URLs supplied the anchors."
177
+ }
178
+ ```
179
+
180
+ Valid refusal reasons:
181
+
182
+ - `missing_current_time`
183
+ - `stale_data`
184
+ - `unresolved`
185
+ - `slop_tells_in_pitch`
186
+ - `uncertainty_above_threshold`
187
+
188
+ When refusing, return `verdict = "unknown"`, `confidence < 0.30`, an empty `anchor_pieces` array unless the stale anchor must be shown, and a remediation that tells the user exactly what to do next.
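The refusal and suggested-change rules can be expressed as a small validator over the output object. This is a sketch, not the package's own validation: the function name `validate_verdict` and the problem messages are assumptions, and it checks only the rules stated in this file.

```python
# Refusal reasons listed in the skill.
VALID_REFUSAL_REASONS = {
    "missing_current_time", "stale_data", "unresolved",
    "slop_tells_in_pitch", "uncertainty_above_threshold",
}


def validate_verdict(obj: dict) -> list[str]:
    """Return a list of rule violations for one verdict object (empty if none)."""
    problems = []
    verdict = obj.get("verdict")
    refusal = obj.get("refusal") or {}
    changes = obj.get("suggested_changes") or []

    if refusal.get("refused"):
        if verdict != "unknown":
            problems.append("refusals must return verdict unknown")
        if obj.get("confidence", 1.0) >= 0.30:
            problems.append("refusals must carry confidence below 0.30")
        if refusal.get("reason") not in VALID_REFUSAL_REASONS:
            problems.append("refusal reason is not in the valid list")
        if not refusal.get("remediation"):
            problems.append("refusals need a remediation")

    if verdict == "soft-fit" and not 1 <= len(changes) <= 3:
        problems.append("soft-fit needs 1-3 suggested changes")
    if verdict == "no-fit" and changes:
        problems.append("no-fit must not suggest changes")
    if verdict in ("fit", "soft-fit") and not obj.get("anchor_pieces"):
        problems.append("fit and soft-fit need at least one anchor piece")
    return problems
```

A caller could run this before returning the JSON object and treat any non-empty result as a reason to regenerate.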
189
+
190
+ ## Pushback Rules
191
+
192
+ - If the user asks you to "just call it a fit," refuse. You do not have an override path.
193
+ - If the user says they will personalize later, evaluate the pitch as written. Later personalization is how spam gets mailed.
194
+ - If the user provides only a broad beat string, refuse with `unresolved` and ask for a named journalist, outlet, profile URL, or recent byline URL.
195
+ - If the best evidence is outlet-level relevance, return `unknown`, not `soft-fit`.
196
+ - If the last byline is older than 90 days, refuse even if the old beat looks perfect.
197
+ - If a breaking-news `decay_stage` is present, check whether the journalist has covered that type of fast-cycle story, not merely the calm-period beat.
198
+
199
+ Refer to `rubric.md` for scoring checks and `examples.md` for worked verdicts.
@@ -0,0 +1,271 @@
1
+ # Journalist Fit Check — Worked Examples
2
+
3
+ Each example shows the before input, the verdict object, and the after action. The point is not to flatter the pitch. The point is to stop bad targeting before it becomes email.
4
+
5
+ ---
6
+
7
+ ## Example 1: Clean Fit, Recent Anchor
8
+
9
+ ### Before
10
+
11
+ ```json
12
+ {
13
+ "journalist": { "name": "Maxwell Zeff", "outlet": "TechCrunch" },
14
+ "pitch": "Subject: Open-source eval framework for agentic AI - built after the Anthropic constitutional AI launch\n\nHi Maxwell,\n\nFollowing your Apr. 28 piece on developer adoption of Claude's agent SDK, we're launching an open-source eval harness that benchmarks agent tool-use against your suggested 'real workflow' tests - not just synthetic benchmarks. Repo opens Tuesday. Happy to give you first look + access to two pilot customers (one in legal, one in healthcare) who have measurable workflow-time data.\n\nFour-line technical summary attached. Embargo through 9am ET Tuesday if useful.\n\nJordan",
15
+ "context": {
16
+ "client_or_subject": "AgentEval, open-source agent benchmarking harness",
17
+ "current_time_iso": "2026-05-18T14:00:00Z"
18
+ }
19
+ }
20
+ ```
21
+
22
+ ### After
23
+
24
+ ```json
25
+ {
26
+ "verdict": "fit",
27
+ "confidence": 0.86,
28
+ "reasoning": "Fit. Maxwell Zeff's Apr. 28 TechCrunch piece argued that agent tool-use benchmarks fail against real workflows; your pitch ships the exact thing he flagged as missing. The embargo, repo opening, and pilot customers give him a usable scoop shape.",
29
+ "anchor_pieces": [
30
+ {
31
+ "title": "Developers say Claude's agent SDK is the easiest to deploy, hardest to evaluate",
32
+ "url": "https://techcrunch.com/2026/04/28/claude-agent-sdk-eval-gap/",
33
+ "published_at": "2026-04-28",
34
+ "hours_since_publish": 480,
35
+ "relevance_note": "The piece names workflow-time benchmarks as the missing piece; the pitch launches that exact thing."
36
+ }
37
+ ],
38
+ "suggested_changes": [],
39
+ "refusal": {
40
+ "refused": false,
41
+ "reason": null,
42
+ "remediation": null
43
+ },
44
+ "decay": {
45
+ "last_verified_byline_at": "2026-05-16",
46
+ "days_since_last_byline": 2,
47
+ "decay_warning": null
48
+ },
49
+ "retrieval_surface": "host-agent-search",
50
+ "retrieval_notes": "Checked TechCrunch author page and recent search results. Anchor URL: https://techcrunch.com/2026/04/28/claude-agent-sdk-eval-gap/. Recent bylines also found May 3, May 11, and May 16."
51
+ }
52
+ ```
53
+
54
+ ### Action
55
+
56
+ Send the pitch after normal copy polish. The target is right.
57
+
58
+ ---
59
+
60
+ ## Example 2: Soft-Fit, Needs A Reframe
61
+
62
+ ### Before
63
+
64
+ ```json
65
+ {
66
+ "journalist": { "name": "Casey Newton", "outlet": "Platformer" },
67
+ "pitch": "Hi Casey,\n\nWe just shipped AgentEval, an open-source benchmarking harness for AI agents. It's the first tool to measure agent tool-use against real workflows. Would love to set up a briefing.\n\nThanks,\nJordan",
68
+ "context": {
69
+ "client_or_subject": "AgentEval, open-source agent benchmarking harness",
70
+ "current_time_iso": "2026-05-18T14:00:00Z"
71
+ }
72
+ }
73
+ ```
74
+
75
+ ### After
76
+
77
+ ```json
78
+ {
79
+ "verdict": "soft-fit",
80
+ "confidence": 0.62,
81
+ "reasoning": "Soft-fit. Casey Newton's May 9 Platformer piece on Anthropic's external-evaluation push is adjacent, but he does not cover developer-tool launches as product news. The pitch needs to become about evals as a trust and policy lever, not about a new harness.",
82
+ "anchor_pieces": [
83
+ {
84
+ "title": "Anthropic's bet on external evaluation as a trust signal",
85
+ "url": "https://www.platformer.news/p/anthropic-external-eval-trust",
86
+ "published_at": "2026-05-09",
87
+ "hours_since_publish": 216,
88
+ "relevance_note": "He frames eval tools as a trust signal. The pitch can fit that frame, but not as a generic developer-tool launch."
89
+ }
90
+ ],
91
+ "suggested_changes": [
92
+ "Cut the opener. Replace it with: 'Your May 9 piece on Anthropic's external-eval push named the open-source gap. We're shipping that gap on Tuesday.'",
93
+ "Drop 'first tool to measure.' Replace it with the specific measurement: 'workflow-completion latency on a fixed task suite, replicable by any agent.'",
94
+ "Add one sentence on why this matters for policy and trust, not only engineering."
95
+ ],
96
+ "refusal": {
97
+ "refused": false,
98
+ "reason": null,
99
+ "remediation": null
100
+ },
101
+ "decay": {
102
+ "last_verified_byline_at": "2026-05-16",
103
+ "days_since_last_byline": 2,
104
+ "decay_warning": null
105
+ },
106
+ "retrieval_surface": "host-agent-search",
107
+ "retrieval_notes": "Checked six recent Platformer posts from Apr. 22 through May 16 plus recent public X results. Anchor URL: https://www.platformer.news/p/anthropic-external-eval-trust."
108
+ }
109
+ ```
110
+
111
+ ### Revised Pitch
112
+
113
+ ```text
114
+ Hi Casey,
115
+
116
+ Your May 9 piece on Anthropic's external-eval push named the open-source gap. We're shipping that gap on Tuesday: AgentEval, an open-source harness that measures workflow-completion latency on a fixed task suite, replicable by any agent.
117
+
118
+ Two pilot customers are using it to decide which agent workflows are safe enough for regulated teams. If useful, I can send the repo under embargo and set up a 20-minute call with the maintainer.
119
+
120
+ Jordan
121
+ ```
122
+
123
+ ---
124
+
125
+ ## Example 3: No-Fit, Do Not Rescue It
126
+
127
+ ### Before
128
+
129
+ ```json
130
+ {
131
+ "journalist": { "url": "https://www.bloomberg.com/authors/AS0E8x9aH-0w/emily-chang" },
132
+ "pitch": "Subject: New B2B accounting SaaS for freelancers\n\nHi Emily,\n\nWe just launched an accounting platform for freelancers and would love your coverage in Bloomberg.",
133
+ "context": {
134
+ "current_time_iso": "2026-05-18T14:00:00Z"
135
+ }
136
+ }
137
+ ```
138
+
139
+ ### After
140
+
141
+ ```json
142
+ {
143
+ "verdict": "no-fit",
144
+ "confidence": 0.41,
145
+ "reasoning": "No-fit. Emily Chang's recent Bloomberg work is executive interviews, venture capital, and AI company strategy, not B2B SaaS launches or freelancer tooling. A freelancer-accounting product is outside her lane and will not become relevant through copy edits.",
146
+ "anchor_pieces": [
147
+ {
148
+ "title": "Inside Anduril's pitch to the Pentagon",
149
+ "url": "https://www.bloomberg.com/news/articles/2026/05/11/anduril-pitch",
150
+ "published_at": "2026-05-11",
151
+ "hours_since_publish": 168,
152
+ "relevance_note": "Shows the company and executive-strategy level she covers; it contradicts the freelancer-SaaS ask."
153
+ }
154
+ ],
155
+ "suggested_changes": [],
156
+ "refusal": {
157
+ "refused": false,
158
+ "reason": null,
159
+ "remediation": null
160
+ },
161
+ "decay": {
162
+ "last_verified_byline_at": "2026-05-15",
163
+ "days_since_last_byline": 3,
164
+ "decay_warning": null
165
+ },
166
+ "retrieval_surface": "host-agent-search",
167
+ "retrieval_notes": "Reviewed Bloomberg author page and last five visible bylines. None touched SMB accounting, freelancer tooling, fintech for individuals, or product reviews. Anchor URL: https://www.bloomberg.com/news/articles/2026/05/11/anduril-pitch."
168
+ }
169
+ ```
170
+
171
+ ### Action
172
+
173
+ Drop the contact. Do not rewrite this for her.
174
+
175
+ ---
176
+
177
+ ## Example 4: Refusal, Stale Data
178
+
179
+ ### Before
180
+
181
+ ```json
182
+ {
183
+ "journalist": { "name": "Olivia Solon", "outlet": "NBC News" },
184
+ "pitch": "Subject: New privacy-preserving analytics platform\n\nHi Olivia,\n\nWe are launching a privacy-preserving analytics platform and thought this would be relevant to your privacy reporting.",
185
+ "context": {
186
+ "current_time_iso": "2026-05-18T14:00:00Z"
187
+ }
188
+ }
189
+ ```
190
+
191
+ ### After
192
+
193
+ ```json
194
+ {
195
+ "verdict": "unknown",
196
+ "confidence": 0.10,
197
+ "reasoning": "Refused on stale data. The most recent verifiable NBC News byline I found for Olivia Solon is from Mar. 22, 2024, which is 787 days old at current_time_iso. Do not pitch this contact until the role is reverified.",
198
+ "anchor_pieces": [],
199
+ "suggested_changes": [],
200
+ "refusal": {
201
+ "refused": true,
202
+ "reason": "stale_data",
203
+ "remediation": "Verify Olivia Solon's current role through a recent byline search, LinkedIn, Twitter/X, or her current outlet page. If she moved, re-call this skill with the updated identifier. If she left journalism, drop her from the list."
204
+ },
205
+ "decay": {
206
+ "last_verified_byline_at": "2024-03-22",
207
+ "days_since_last_byline": 787,
208
+ "decay_warning": "Last verified byline is 787 days old. This contact is not safe to use without manual verification."
209
+ },
210
+ "retrieval_surface": "host-agent-search",
211
+ "retrieval_notes": "Checked NBC News author search, site search, and public profile snippets. No NBC News byline newer than Mar. 22, 2024 was found."
212
+ }
213
+ ```
214
+
215
+ ### Action
216
+
217
+ Verify the journalist's current role before doing anything else. The pitch is not the blocker; the contact record is.
218
+
219
+ ---
220
+
221
+ ## Example 5: Independent Newsletter, Narrow Fit
222
+
223
+ ### Before
224
+
225
+ ```json
226
+ {
227
+ "journalist": { "url": "https://thewenerdweekly.substack.com" },
228
+ "pitch": "Subject: Press dinner invite - small group, AI infra founders\n\nHi Brandon,\n\nWe're hosting a small, off-the-record press dinner in NYC on June 12: eight people, all founders building AI infrastructure that real engineers actually use. No decks, no pitches, just a conversation. Would love to have you. Happy to share the guest list privately if useful.\n\nJordan, founder, AgentEval",
229
+ "context": {
230
+ "current_time_iso": "2026-05-18T14:00:00Z"
231
+ }
232
+ }
233
+ ```
234
+
235
+ ### After
236
+
237
+ ```json
238
+ {
239
+ "verdict": "fit",
240
+ "confidence": 0.82,
241
+ "reasoning": "Fit. Brandon Wenerd's May 7 Substack post told PR people to stop sending mailers and go back to small, curated press dinners or low-stakes hangouts. Your pitch offers that exact format, and the private guest-list offer respects the newsletter-persona model.",
242
+ "anchor_pieces": [
243
+ {
244
+ "title": "A PR Person Asked Me How to Pitch Substackers",
245
+ "url": "https://thewenerdweekly.substack.com/p/a-pr-person-asked-me-how-to-pitch",
246
+ "published_at": "2026-05-07",
247
+ "hours_since_publish": 264,
248
+ "relevance_note": "The post explicitly asks for small curated events instead of mailers; the pitch follows that instruction."
249
+ }
250
+ ],
251
+ "suggested_changes": [
252
+ "Optional: name one guest in the email itself. The guest list is the product; one credible name does more than another adjective."
253
+ ],
254
+ "refusal": {
255
+ "refused": false,
256
+ "reason": null,
257
+ "remediation": null
258
+ },
259
+ "decay": {
260
+ "last_verified_byline_at": "2026-05-16",
261
+ "days_since_last_byline": 2,
262
+ "decay_warning": null
263
+ },
264
+ "retrieval_surface": "host-agent-search",
265
+ "retrieval_notes": "Checked Substack archive directly. Anchor URL: https://thewenerdweekly.substack.com/p/a-pr-person-asked-me-how-to-pitch. Four recent posts found; latest post was May 16."
266
+ }
267
+ ```
268
+
269
+ ### Action
270
+
271
+ Send after adding one credible guest name. Do not turn it into a product pitch.