@botlearn/writer 0.1.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/LICENSE +21 -0
- package/README.md +35 -0
- package/knowledge/anti-patterns.md +102 -0
- package/knowledge/best-practices.md +120 -0
- package/knowledge/domain.md +183 -0
- package/manifest.json +29 -0
- package/package.json +39 -0
- package/skill.md +45 -0
- package/strategies/main.md +130 -0
- package/tests/benchmark.json +506 -0
- package/tests/smoke.json +54 -0
package/LICENSE
ADDED
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2025 BotLearn
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
package/README.md
ADDED
@@ -0,0 +1,35 @@
+# @botlearn/writer
+
+> Structured article writing with evidence-based argumentation, consistent style, and quality improvement of 60% for OpenClaw Agent
+
+## Installation
+
+```bash
+# via npm
+npm install @botlearn/writer
+
+# via clawhub
+clawhub install @botlearn/writer
+```
+
+## Category
+
+Creative Generation
+
+## Dependencies
+
+`@botlearn/summarizer`, `@botlearn/keyword-extractor`
+
+## Files
+
+| File | Description |
+|------|-------------|
+| `manifest.json` | Skill metadata and configuration |
+| `skill.md` | Role definition and activation rules |
+| `knowledge/` | Domain knowledge documents |
+| `strategies/` | Behavioral strategy definitions |
+| `tests/` | Smoke and benchmark tests |
+
+## License
+
+MIT
package/knowledge/anti-patterns.md
ADDED
@@ -0,0 +1,102 @@
+---
+domain: writer
+topic: anti-patterns
+priority: medium
+ttl: 30d
+---
+
+# Article Writing — Anti-Patterns
+
+## Thesis Anti-Patterns
+
+### 1. Weak or Missing Thesis
+- **Problem**: The article lacks a clear central argument; the reader finishes without knowing what the article was trying to prove or communicate
+- **Symptoms**: No single sentence can be identified as the thesis; the article reads as a collection of loosely related observations
+- **Fix**: Before drafting, write a one-sentence answer to "What is this article arguing?" Place it in the first two paragraphs. Every subsequent paragraph should trace back to this sentence
+
+### 2. Overly Broad Thesis
+- **Problem**: The thesis attempts to cover too much ground, making it impossible to adequately support with evidence within the article's scope
+- **Symptoms**: Phrases like "Technology is transforming everything," "There are many benefits to X," or "The world is changing rapidly"
+- **Fix**: Add qualifiers to narrow the scope — specify who, what, when, where, and under what conditions. "AI is transforming healthcare" becomes "AI-powered diagnostic tools have reduced misdiagnosis rates in radiology departments by 23% since 2022"
+
+### 3. Unfalsifiable Thesis
+- **Problem**: The thesis is so obvious or vague that no reasonable person would disagree, making the article pointless
+- **Symptoms**: "Good communication is important in business," "Cybersecurity is a growing concern"
+- **Fix**: Strengthen the claim by adding a specific position, a causal mechanism, or a contested recommendation. The thesis should provoke thought, not nods
+
+### 4. Moving Thesis
+- **Problem**: The thesis shifts mid-article as the writer discovers new points during drafting, resulting in an article that argues one thing in the introduction and a different thing in the conclusion
+- **Fix**: After completing a first draft, compare the introduction and conclusion. If they argue different things, choose one thesis and revise the entire article to align with it
+
+## Evidence Anti-Patterns
+
+### 5. Unsupported Claims
+- **Problem**: Assertions are made without any supporting evidence — the writer expects the reader to take the claim on faith
+- **Symptoms**: "It's widely known that..." "Everyone agrees that..." "Obviously..." or simply stating opinions as if they were facts
+- **Fix**: For every claim, ask: "How do I know this? What would I cite?" If there is no answer, either find evidence or remove the claim
+
+### 6. Evidence Without Analysis
+- **Problem**: Data, quotes, or examples are dropped into the article without explanation of what they mean or how they support the argument
+- **Symptoms**: A paragraph that is just a block quote followed by another quote; a statistic stated with no interpretation
+- **Fix**: Apply the PEEL structure — after every piece of evidence, include 1-2 sentences explaining its significance and connecting it to the argument
+
+### 7. Cherry-Picked Evidence
+- **Problem**: The writer selects only evidence that supports their thesis while ignoring contradicting data, undermining credibility
+- **Symptoms**: All statistics trend in the same direction; no counterarguments acknowledged; suspiciously clean narrative
+- **Fix**: Actively seek and address counterevidence. Acknowledge limitations. Use the Rogerian approach: present the opposing view fairly, then explain why your position still holds
+
+### 8. Outdated Evidence
+- **Problem**: Using data or citations that are no longer current for rapidly evolving topics
+- **Symptoms**: Citing a 2018 study about AI capabilities, referencing deprecated technology, using pre-pandemic data for workforce trends
+- **Fix**: For technology and current affairs, evidence should be less than 2 years old. For established science or history, older sources are acceptable. Always note the publication date
+
+### 9. Over-Reliance on a Single Source
+- **Problem**: The entire article's evidence base draws from one study, one author, or one organization
+- **Symptoms**: The same citation appears in every argument block; all expert quotes come from one person
+- **Fix**: Use a minimum of 3-5 independent sources per article. Cross-reference key claims across multiple sources. Diversify evidence types (statistics, expert opinion, case studies)
+
+### 10. Appeal to Authority Without Verification
+- **Problem**: Citing an "expert" without verifying their credentials or relevance to the specific topic
+- **Symptoms**: "A Harvard professor says..." without specifying their field; quoting a tech CEO on medical topics
+- **Fix**: Verify that the expert's credentials are relevant to the specific claim being made. Include their specific role, institution, and relevant qualification
+
+## Tone & Style Anti-Patterns
+
+### 11. Tone Inconsistency
+- **Problem**: The article shifts between formal and informal, professional and casual, creating a disjointed reading experience that undermines credibility
+- **Symptoms**: An opening like "Let's dive into the fascinating world of quantum computing" followed by "The superposition principle posits that quantum bits exist in a probabilistic state vector space"
+- **Fix**: Decide on a single tone before writing (see knowledge/best-practices.md tone framework). Read the full article aloud — tonal shifts become obvious when heard
+
+### 12. Register Mismatch
+- **Problem**: The vocabulary and complexity level don't match the target audience
+- **Symptoms**: Using "synergize cross-functional paradigms" for a general audience; using "cool stuff" in a white paper
+- **Fix**: Define the target audience explicitly before writing. Use the vocabulary level guidelines from knowledge/best-practices.md. Have someone from the target audience review if possible
+
+### 13. Passive Voice Overuse
+- **Problem**: Excessive passive voice makes the writing feel impersonal, weak, and harder to read
+- **Symptoms**: "It was determined that..." "The results were analyzed..." "The decision was made by the committee..."
+- **Fix**: Target 80%+ active voice. Convert: "The results were analyzed by the team" to "The team analyzed the results." Reserve passive voice for when the actor is unknown or deliberately de-emphasized
+
+### 14. Filler and Hedge Overload
+- **Problem**: Excessive hedging language weakens every claim; filler words add length without substance
+- **Symptoms**: "It could perhaps be argued that there may be some potential benefits..." "Basically, it's really quite interesting that..."
+- **Fix**: Remove hedge words on first edit pass. If a claim needs qualification, use a single precise qualifier: "In most enterprise contexts" rather than "It could potentially be the case that in some situations"
+
+## Structural Anti-Patterns
+
+### 15. Wall of Text
+- **Problem**: Long, unbroken blocks of text with no subheadings, bullet points, or visual breaks
+- **Fix**: Break content into scannable sections. Use subheadings every 200-300 words. Use lists for 3+ parallel items. Keep digital paragraphs to 3-6 sentences
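The Wall of Text fix can be checked mechanically. The following is a minimal sketch, assuming Markdown input in which headings start with `#` and blocks are separated by blank lines. The function name `find_walls_of_text`, the choice of the upper bounds (300 words, 6 sentences) as hard limits, and the naive sentence counter are illustrative assumptions and are not part of the package.

```python
import re

MAX_WORDS_BETWEEN_HEADINGS = 300  # upper bound of the 200-300 word guideline
MAX_SENTENCES_PER_PARAGRAPH = 6   # upper bound of the 3-6 sentence guideline


def count_sentences(paragraph: str) -> int:
    """Naive count: runs of '.', '!' or '?' followed by whitespace or end of text."""
    return len(re.findall(r"[.!?]+(?:\s|$)", paragraph))


def find_walls_of_text(markdown: str) -> list[str]:
    """Return human-readable warnings for overlong paragraphs and sections."""
    warnings = []
    words_since_heading = 0
    # Blocks (headings, paragraphs) are assumed to be separated by blank lines.
    for block in re.split(r"\n\s*\n", markdown.strip()):
        block = block.strip()
        if not block:
            continue
        if block.startswith("#"):
            words_since_heading = 0  # a heading resets the running word count
            continue
        words_since_heading += len(block.split())
        if count_sentences(block) > MAX_SENTENCES_PER_PARAGRAPH:
            warnings.append(f"paragraph too long: {block[:40]!r}")
        if words_since_heading > MAX_WORDS_BETWEEN_HEADINGS:
            warnings.append(f"no subheading for {words_since_heading} words")
            words_since_heading = 0  # report each overlong stretch once
    return warnings
```

A check like this only flags candidates; abbreviations and decimal numbers will confuse the sentence counter, so a human pass is still needed.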
+
+### 16. Burying the Lead
+- **Problem**: The most important or interesting information appears deep in the article instead of near the top
+- **Fix**: Apply the inverted pyramid test even for non-news articles: could a reader stop at paragraph 3 and still understand the core message? If not, restructure
+
+### 17. Weak Conclusion
+- **Problem**: The article ends with a vague summary or abruptly stops without synthesizing the arguments or providing forward-looking insight
+- **Symptoms**: "In conclusion, X is very important and we should all think about it more"
+- **Fix**: A strong conclusion should: (1) restate the thesis in light of the evidence presented, (2) synthesize the key arguments into a higher-order insight, and (3) provide a call to action, prediction, or open question for the reader
+
+### 18. Missing Transitions
+- **Problem**: Paragraphs and sections are placed adjacent to each other without logical connectors, forcing readers to infer the relationship
+- **Fix**: Every paragraph should begin with a connection to the previous one — either an explicit transition word or an echo of a key term from the prior paragraph
package/knowledge/best-practices.md
ADDED
@@ -0,0 +1,120 @@
+---
+domain: writer
+topic: thesis-evidence-style-best-practices
+priority: high
+ttl: 30d
+---
+
+# Article Writing — Best Practices
+
+## Thesis Development
+
+### 1. Thesis Placement & Clarity
+- Place the thesis statement within the first two paragraphs — readers should never have to guess the article's central argument
+- A strong thesis is **specific**, **arguable**, and **scoped**: it makes a claim that can be supported with evidence and is narrow enough to be fully addressed in the article
+- Avoid thesis statements that are merely factual ("Python is a programming language") or too broad ("Technology is changing the world")
+
+### 2. Thesis Formulation Process
+1. **Start with a question** — What specific question does this article answer?
+2. **Draft a tentative claim** — What is your initial answer to that question?
+3. **Stress-test the claim** — Can reasonable people disagree? If not, it's not a thesis
+4. **Scope it down** — Add qualifiers to make it defensible: who, when, under what conditions
+5. **Refine for precision** — Remove vague language ("some," "many," "interesting")
+
+### 3. Thesis Types by Article Purpose
+
+| Article Purpose | Thesis Type | Example |
+|----------------|-------------|---------|
+| Persuasive | Value claim | "Companies should adopt four-day workweeks because the productivity and retention gains outweigh the scheduling costs" |
+| Analytical | Causal claim | "The rise of remote work has restructured urban real estate markets by shifting demand from commercial to residential space" |
+| Explanatory | Interpretive claim | "Kubernetes adoption follows a predictable maturity curve where organizations must pass through three distinct operational phases" |
+| Comparative | Evaluative claim | "For early-stage startups, serverless architecture offers better cost efficiency than container orchestration until reaching 10,000 daily active users" |
+
+### 4. Thesis-Argument Alignment
+- Every argument block in the article must directly advance the thesis
+- Map each argument to a specific aspect of the thesis before drafting
+- If an argument doesn't connect to the thesis, either revise the argument, revise the thesis, or remove the argument
+
+## Evidence Integration
+
+### 1. Evidence Selection Criteria
+- **Relevance** — Does the evidence directly support the specific claim being made?
+- **Recency** — Is the data current enough for the topic? (Technology: < 2 years; Social science: < 5 years; History: age is acceptable)
+- **Authority** — Is the source credible and recognized in the field?
+- **Representativeness** — Does the evidence reflect the broader reality, or is it an outlier?
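The recency thresholds in the criteria above can be written as a small lookup. This is a sketch under stated assumptions: the category keys, the function name `is_recent_enough`, and the decision to compare publication years only (ignoring months) are hypothetical choices made for illustration.

```python
from datetime import date
from typing import Optional

# Maximum evidence age in years per topic category.
# None means any age is acceptable (the "History" case above).
MAX_AGE_YEARS = {
    "technology": 2,
    "social-science": 5,
    "history": None,
}


def is_recent_enough(category: str, published_year: int,
                     today: Optional[date] = None) -> bool:
    """Check a source's publication year against the recency guideline."""
    if category not in MAX_AGE_YEARS:
        raise ValueError(f"unknown category: {category!r}")
    limit = MAX_AGE_YEARS[category]
    if limit is None:
        return True
    current_year = (today or date.today()).year
    # "< N years" is read strictly: a source exactly N years old fails.
    return current_year - published_year < limit
```

Passing `today` explicitly keeps the check reproducible, which matters if it runs inside a test suite.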
+
+### 2. Evidence-to-Argument Ratio
+- Every claim should be supported by at least one piece of evidence
+- High-stakes claims (central to the thesis) need 2-3 supporting evidence items
+- Use different evidence types for the same claim when possible (e.g., a statistic + an expert quote)
+- The PEEL structure (Point-Evidence-Explain-Link) ensures every piece of evidence is analyzed, not just dropped in
+
+### 3. Evidence Introduction Patterns
+
+**Signal phrase (attribution first):**
+> According to a 2024 McKinsey report, organizations that adopted AI-assisted workflows saw a 40% reduction in repetitive task time.
+
+**Data-first (impact first):**
+> A 40% reduction in repetitive task time — that is the finding from McKinsey's 2024 survey of 500 enterprises that adopted AI-assisted workflows.
+
+**Contextual embedding (woven into narrative):**
+> When McKinsey surveyed 500 enterprises in 2024, the results surprised even optimists: AI-assisted workflows cut repetitive task time by 40%.
+
+### 4. Evidence Diversity Requirements
+- **Never rely on a single evidence type** — An article using only statistics lacks human connection; one using only anecdotes lacks rigor
+- Target a minimum of 3 evidence types per article:
+  - At least one quantitative source (statistics, data)
+  - At least one qualitative source (expert quote, case study)
+  - At least one reasoning element (logical argument, analogy)
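The diversity requirement above amounts to a coverage check over three groups. A minimal sketch follows, assuming each piece of evidence has already been labelled with a type; the labels in `EVIDENCE_GROUPS` and the function name `diversity_gaps` are hypothetical and only mirror the examples given in the list.

```python
# Hypothetical mapping from evidence type labels to the three groups named above.
EVIDENCE_GROUPS = {
    "statistic": "quantitative",
    "data": "quantitative",
    "expert-quote": "qualitative",
    "case-study": "qualitative",
    "logical-argument": "reasoning",
    "analogy": "reasoning",
}

REQUIRED_GROUPS = {"quantitative", "qualitative", "reasoning"}
MIN_EVIDENCE_TYPES = 3


def diversity_gaps(evidence_types: list[str]) -> list[str]:
    """Return unmet requirements; an empty list means the rule is satisfied."""
    distinct = set(evidence_types)
    gaps = []
    if len(distinct) < MIN_EVIDENCE_TYPES:
        gaps.append(f"only {len(distinct)} evidence type(s); need {MIN_EVIDENCE_TYPES}")
    # Unknown labels count toward the type total but cover no group.
    covered = {EVIDENCE_GROUPS[t] for t in distinct if t in EVIDENCE_GROUPS}
    for group in sorted(REQUIRED_GROUPS - covered):
        gaps.append(f"missing {group} evidence")
    return gaps
```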
+
+### 5. Source Attribution Standards
+- Always identify: **who** produced the evidence, **when**, and **in what context**
+- For statistics: include sample size and methodology when available
+- For quotes: include the person's relevant credentials or role
+- For studies: note if they are peer-reviewed, pre-print, or industry-commissioned
+
+## Style Consistency
+
+### 1. Tone Selection Framework
+
+| Audience | Context | Recommended Tone | Vocabulary Level |
+|----------|---------|-----------------|-----------------|
+| General public | Blog, magazine | Conversational, accessible | Common terms; explain jargon |
+| Industry professionals | Trade publication | Informed, collegial | Domain terms acceptable; minimal explanation |
+| Academic readers | Journal, research report | Formal, precise | Technical vocabulary expected |
+| Executive / Decision-maker | White paper, brief | Authoritative, concise | Business terms; focus on impact |
+| Developer / Technical | Technical blog, docs | Direct, practical | Code terms, specific tool names |
+
+### 2. Tone Consistency Rules
+- **Establish tone in the first paragraph** and maintain it throughout
+- If the opening is conversational ("Let's face it, nobody likes debugging"), do not shift to formal academic language in the body
+- If the opening is formal ("This analysis examines the causal relationship..."), do not insert casual asides
+- Transitions between sections should not shift register — use consistent bridging language
+
+### 3. Voice & Person Conventions
+- **First person singular ("I")** — Personal essays, opinion columns, experience-based articles
+- **First person plural ("we")** — Collaborative or organizational pieces, tutorials ("Let's build...")
+- **Second person ("you")** — How-to guides, advice articles, directly addressing the reader
+- **Third person** — News articles, analytical pieces, formal reports
+- **Do not mix** person within an article unless there is a deliberate rhetorical reason
+
+### 4. Sentence & Paragraph Rhythm
+- Vary sentence length: mix short (8-12 words) and long (20-30 words) sentences to create rhythm
+- Short sentences create emphasis. Long sentences build complexity and nuance, allowing the reader to absorb multiple related ideas in a single breath.
+- Paragraphs should be 3-6 sentences in digital content; longer paragraphs are acceptable in print
+- Each paragraph should contain exactly one main idea — the topic sentence
+
+### 5. Transition Quality
+- Every paragraph must connect logically to the one before it
+- Use explicit transitions for clarity: "However," "Furthermore," "As a result," "In contrast"
+- Use implicit transitions through keyword echo: repeat a key term from the previous paragraph's conclusion in the next paragraph's opening
+- Avoid abrupt topic changes — if a new section is necessary, use a subheading
+
+### 6. Readability Standards
+- **Flesch-Kincaid grade level** targets:
+  - General audience: Grade 8-10
+  - Professional audience: Grade 10-12
+  - Academic audience: Grade 12-16
+- Prefer active voice over passive voice (target 80%+ active sentences)
+- Eliminate filler words: "very," "really," "basically," "actually," "just"
+- Replace nominalizations with verbs: "made an improvement" becomes "improved"
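The Flesch-Kincaid targets in the Readability Standards of `knowledge/best-practices.md` can be checked with the standard grade-level formula, 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59. The sketch below is illustrative only: the syllable counter is a rough vowel-group heuristic (real tools use pronunciation dictionaries), and the names `TARGET_GRADES`, `flesch_kincaid_grade`, and `within_target` are assumptions rather than anything the package defines.

```python
import re

# Grade ranges copied from the readability targets; the keys are hypothetical.
TARGET_GRADES = {
    "general": (8, 10),
    "professional": (10, 12),
    "academic": (12, 16),
}


def count_syllables(word: str) -> int:
    """Rough heuristic: count vowel groups, discounting a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Compute 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


def within_target(text: str, audience: str) -> bool:
    """True if the estimated grade falls inside the audience's target range."""
    low, high = TARGET_GRADES[audience]
    return low <= flesch_kincaid_grade(text) <= high
```

Because the syllable heuristic is approximate, scores near a range boundary should be treated as indicative rather than exact.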
package/knowledge/domain.md
ADDED
@@ -0,0 +1,183 @@
+---
+domain: writer
+topic: article-structures-argumentation-evidence
+priority: high
+ttl: 30d
+---
+
+# Article Writing — Structures, Argumentation Frameworks & Evidence Types
+
+## Article Structure Patterns
+
+### 1. Inverted Pyramid
+
+The most important information appears first, with details following in descending order of significance. Ideal for news articles, reports, and time-sensitive content.
+
+**Structure:**
+1. **Lead** — The core finding, conclusion, or newsworthy element in 1-2 sentences
+2. **Key supporting facts** — The essential context: who, what, when, where, why, how
+3. **Background & detail** — Additional context, historical perspective, secondary quotes
+4. **Supplementary material** — Related information, further reading, tangential points
+
+**Best for:** News articles, executive summaries, blog posts where readers may not finish
+
+**Strengths:**
+- Readers get value even if they stop early
+- Forces the writer to prioritize information
+- Easy to trim from the bottom for length constraints
+
+### 2. Narrative Arc
+
+Follows a storytelling structure that engages readers emotionally while delivering information. Ideal for feature articles, case studies, and long-form content.
+
+**Structure:**
+1. **Hook / Opening scene** — An anecdote, vivid scene, or provocative question that draws the reader in
+2. **Context / Rising action** — Background information that sets the stage and introduces complexity
+3. **Central tension / Problem** — The core conflict, challenge, or question being explored
+4. **Resolution / Insight** — The answer, finding, or transformation revealed through the narrative
+5. **Denouement / Reflection** — Broader implications, lessons learned, or forward-looking perspective
+
+**Best for:** Feature articles, profiles, case studies, long-form journalism, storytelling-driven blog posts
+
+**Strengths:**
+- High reader engagement through emotional connection
+- Complex ideas become accessible through concrete stories
+- Memorable — readers retain narrative better than abstract arguments
+
+### 3. Analytical Framework
+
+Presents a systematic examination of a topic through structured analysis. Ideal for opinion pieces, thought leadership, and technical articles.
+
+**Structure:**
+1. **Thesis statement** — A clear, arguable claim stated within the first two paragraphs
+2. **Framework introduction** — The analytical lens or criteria being applied
+3. **Argument block 1** — First supporting argument with evidence and analysis
+4. **Argument block 2** — Second supporting argument with evidence and analysis
+5. **Argument block 3** — Third supporting argument (or counterargument + rebuttal)
+6. **Synthesis** — How the arguments work together to prove the thesis
+7. **Conclusion** — Restatement of thesis with broader implications and call to action
+
+**Best for:** Opinion pieces, thought leadership, academic articles, technical analysis, persuasive essays
+
+**Strengths:**
+- Rigorous and credible; shows depth of analysis
+- Easy for readers to follow the logical chain
+- Well-suited to topics requiring evidence-based persuasion
+
+### 4. Listicle / Modular
+
+Organizes content into discrete, numbered or thematic sections that can be read independently. Ideal for practical guides, how-to articles, and reference content.
+
+**Structure:**
+1. **Introduction** — Context, scope, and why this list matters
+2. **Item 1** — Standalone section with heading, explanation, and supporting detail
+3. **Item 2-N** — Each item follows the same internal structure for consistency
+4. **Conclusion / Summary** — Ties the items together and provides next steps
+
+**Best for:** How-to guides, "top N" articles, best practice compilations, tool comparisons
+
+**Strengths:**
+- Scannable; readers can jump to relevant sections
+- Lower cognitive load per section
+- Strong SEO performance due to clear heading structure
+
+### 5. Problem-Solution
+
+Identifies a specific problem, explores its dimensions, and presents a solution with evidence. Ideal for technical articles, white papers, and persuasive content.
+
+**Structure:**
+1. **Problem statement** — Define the problem clearly with data on its scope and impact
+2. **Root cause analysis** — Why the problem exists; underlying factors
+3. **Solution proposal** — The recommended approach, method, or framework
+4. **Evidence of effectiveness** — Data, case studies, or expert validation supporting the solution
+5. **Implementation guidance** — Practical steps for adopting the solution
+6. **Conclusion** — Summary of impact and call to action
+
+**Best for:** White papers, technical blog posts, consulting reports, product marketing content
+
+## Argumentation Frameworks
+
+### Toulmin Model
+
+A practical framework for constructing and evaluating arguments:
+
+| Component | Definition | Example |
+|-----------|-----------|---------|
+| **Claim** | The assertion being made | "Remote work increases productivity" |
+| **Data** | The evidence supporting the claim | "Stanford study showing 13% performance increase" |
+| **Warrant** | The logical connection between data and claim | "Fewer distractions and flexible schedules enable deeper focus" |
+| **Backing** | Support for the warrant itself | "Research on flow states confirms uninterrupted work improves output" |
+| **Qualifier** | Conditions or limitations | "For knowledge workers in roles not requiring physical presence" |
+| **Rebuttal** | Acknowledgment of counterarguments | "Though some roles benefit from in-person collaboration" |
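The six Toulmin components in the table map naturally onto a record type. The sketch below is one possible representation, not something the package ships: the class name `ToulminArgument` is hypothetical, and treating claim, data, and warrant as required while leaving backing, qualifier, and rebuttal optional is an assumption made for the example.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ToulminArgument:
    # Core components, assumed mandatory in this sketch.
    claim: str
    data: str
    warrant: str
    # Supporting components, assumed optional in this sketch.
    backing: Optional[str] = None
    qualifier: Optional[str] = None
    rebuttal: Optional[str] = None

    def missing_components(self) -> list[str]:
        """Names of components that are empty or were not provided."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Usage with the example values from the table above.
argument = ToulminArgument(
    claim="Remote work increases productivity",
    data="Stanford study showing 13% performance increase",
    warrant="Fewer distractions and flexible schedules enable deeper focus",
)
print(argument.missing_components())
```

Listing the missing components makes it easy to see which parts of an argument still need drafting before the paragraph is written.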
+
+### Classical Rhetorical Appeals (Aristotle)
+
+| Appeal | Definition | Article Application |
+|--------|-----------|-------------------|
+| **Ethos** (Credibility) | Establish authority and trustworthiness | Cite expert sources, demonstrate domain knowledge, acknowledge limitations |
+| **Pathos** (Emotion) | Connect with the reader emotionally | Use vivid anecdotes, relatable scenarios, human-impact framing |
+| **Logos** (Logic) | Appeal through reasoning and evidence | Present data, logical deductions, structured arguments, causal chains |
+
+**Balanced usage:** Strong articles blend all three appeals. Lead with ethos (establish credibility), engage with pathos (make it matter), and convince with logos (prove it with evidence).
+
+### Rogerian Argument
+
+A collaborative approach for contentious topics:
+1. **Acknowledge the opposing view** — Present it fairly and accurately
+2. **Find common ground** — Identify shared values or goals
+3. **Present your position** — With evidence, framed as building on the common ground
+4. **Propose a synthesis** — Show how both perspectives can be accommodated
+
+**Best for:** Controversial topics, opinion pieces aiming to persuade skeptics, policy discussions
+
+### PEEL Paragraph Structure
+
+A micro-framework for individual paragraphs within any article structure:
+- **P**oint — State the paragraph's main idea in one sentence
+- **E**vidence — Provide supporting data, quotes, or examples
+- **E**xplain — Analyze how the evidence supports the point
+- **L**ink — Connect back to the thesis or transition to the next paragraph
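The four PEEL parts above can be assembled in order with a simple completeness check. This is a minimal sketch; the class name `PeelParagraph` and its `render` method are hypothetical, and joining the parts with single spaces is just one way to produce the paragraph.

```python
from typing import NamedTuple


class PeelParagraph(NamedTuple):
    point: str
    evidence: str
    explain: str
    link: str

    def render(self) -> str:
        """Join the four parts in PEEL order, rejecting empty parts."""
        for name, value in self._asdict().items():
            if not value.strip():
                raise ValueError(f"PEEL part is empty: {name}")
        return " ".join(part.strip() for part in self)
```

Rejecting an empty `explain` part is what guards against evidence being dropped in without analysis.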
+
+## Evidence Types
+
+### 1. Statistical Data
+- Quantitative findings from research studies, surveys, or official reports
+- **Strength:** Objective, measurable, difficult to dispute
+- **Usage:** "According to a 2024 Gartner report, 65% of enterprises..."
+- **Caution:** Always cite the source, sample size, and methodology when available
+
+### 2. Expert Testimony
+- Quotes, opinions, or findings from recognized authorities in the field
+- **Strength:** Leverages established credibility (ethos)
+- **Usage:** Direct quotes for impact, paraphrased for flow
+- **Caution:** Verify the expert's credentials and check for conflicts of interest
+
+### 3. Case Studies
+- Detailed real-world examples demonstrating a principle in practice
+- **Strength:** Concrete, relatable, shows causation in context
+- **Usage:** Narrative description of situation, action, and outcome
+- **Caution:** Individual cases may not be generalizable; acknowledge limitations
+
+### 4. Analogies & Metaphors
+- Comparisons that make abstract concepts accessible
+- **Strength:** Bridges the knowledge gap between writer and reader
+- **Usage:** "Microservices are like a team of specialists, each handling one task..."
+- **Caution:** Analogies simplify; acknowledge where the comparison breaks down
+
+### 5. Historical Precedent
+- Past events or decisions that inform the current discussion
+- **Strength:** Provides temporal context and pattern recognition
+- **Usage:** "The 2008 financial crisis demonstrated that..."
+- **Caution:** Context changes over time; historical parallels are imperfect
+
+### 6. Logical Reasoning
+- Deductive or inductive chains that lead the reader to a conclusion
+- **Strength:** Self-contained; doesn't depend on external sources
+- **Usage:** "If A is true, and B follows from A, then C must follow..."
+- **Caution:** Ensure premises are sound; avoid hidden assumptions
+
+### 7. Anecdotal Evidence
+- Personal stories or individual experiences that illustrate a point
+- **Strength:** Highly engaging and relatable (pathos)
+- **Usage:** Opening hooks, humanizing abstract topics, illustrating impact
+- **Caution:** Not generalizable; should supplement, not replace, systematic evidence
package/manifest.json
ADDED
@@ -0,0 +1,29 @@
+{
+  "name": "@botlearn/writer",
+  "version": "0.1.0",
+  "description": "Structured article writing with evidence-based argumentation, consistent style, and quality improvement of 60% for OpenClaw Agent",
+  "category": "creative-generation",
+  "author": "BotLearn",
+  "benchmarkDimension": "creative-generation",
+  "expectedImprovement": 60,
+  "dependencies": {
+    "@botlearn/summarizer": "^1.0.0",
+    "@botlearn/keyword-extractor": "^1.0.0"
+  },
+  "compatibility": {
+    "openclaw": ">=0.5.0"
+  },
+  "files": {
+    "skill": "skill.md",
+    "knowledge": [
+      "knowledge/domain.md",
+      "knowledge/best-practices.md",
+      "knowledge/anti-patterns.md"
+    ],
+    "strategies": [
+      "strategies/main.md"
+    ],
+    "smokeTest": "tests/smoke.json",
+    "benchmark": "tests/benchmark.json"
+  }
+}
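The `files` map in this manifest points at every document the skill depends on, so a quick consistency check is to flatten it and confirm each path exists in the package. The sketch below is one possible way to do that; the helper names `listed_files` and `missing_files` are hypothetical and are not part of any BotLearn or OpenClaw tooling.

```python
from pathlib import Path

# The "files" section copied from the manifest in this diff.
MANIFEST_FILES = {
    "skill": "skill.md",
    "knowledge": [
        "knowledge/domain.md",
        "knowledge/best-practices.md",
        "knowledge/anti-patterns.md",
    ],
    "strategies": ["strategies/main.md"],
    "smokeTest": "tests/smoke.json",
    "benchmark": "tests/benchmark.json",
}


def listed_files(files: dict) -> list[str]:
    """Flatten the manifest's "files" map into a list of relative paths."""
    paths: list[str] = []
    for value in files.values():
        if isinstance(value, str):
            paths.append(value)   # single path entry
        else:
            paths.extend(value)   # list of paths
    return paths


def missing_files(files: dict, root: Path) -> list[str]:
    """Return the listed paths that do not exist under the package root."""
    return [p for p in listed_files(files) if not (root / p).is_file()]
```

Running `missing_files(MANIFEST_FILES, Path("package"))` against an unpacked tarball would list any document the manifest names but the package does not ship.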
package/package.json
ADDED
@@ -0,0 +1,39 @@
+{
+  "name": "@botlearn/writer",
+  "version": "0.1.0",
+  "description": "Structured article writing with evidence-based argumentation, consistent style, and quality improvement of 60% for OpenClaw Agent",
+  "type": "module",
+  "main": "manifest.json",
+  "files": [
+    "manifest.json",
+    "skill.md",
+    "knowledge/",
+    "strategies/",
+    "tests/",
+    "README.md"
+  ],
+  "keywords": [
+    "botlearn",
+    "openclaw",
+    "skill",
+    "creative-generation"
+  ],
+  "author": "BotLearn",
+  "license": "MIT",
+  "dependencies": {
+    "@botlearn/keyword-extractor": "0.1.0",
+    "@botlearn/summarizer": "0.1.0"
+  },
+  "repository": {
+    "type": "git",
+    "url": "https://github.com/readai-team/botlearn-awesome-skills.git",
+    "directory": "packages/skills/writer"
+  },
+  "homepage": "https://github.com/readai-team/botlearn-awesome-skills/tree/main/packages/skills/writer",
+  "bugs": {
+    "url": "https://github.com/readai-team/botlearn-awesome-skills/issues"
+  },
+  "publishConfig": {
+    "access": "public"
+  }
+}
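This `package.json` pins both `@botlearn` dependencies to `0.1.0`, while `manifest.json` in the same diff declares them as `^1.0.0`. The sketch below checks whether a pinned version falls inside a caret range. It assumes the manifest ranges follow npm-style caret semantics, which this diff does not state, and it handles only plain `MAJOR.MINOR.PATCH` versions (no pre-release tags). The function names are hypothetical.

```python
def parse(version: str) -> tuple[int, int, int]:
    """Parse "MAJOR.MINOR.PATCH" into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def satisfies_caret(version: str, caret_range: str) -> bool:
    """True if `version` lies inside an npm-style caret range such as "^1.0.0"."""
    low = parse(caret_range.lstrip("^"))
    # A caret range allows changes that keep the left-most non-zero part fixed.
    if low[0] > 0:
        upper = (low[0] + 1, 0, 0)
    elif low[1] > 0:
        upper = (0, low[1] + 1, 0)
    else:
        upper = (0, 0, low[2] + 1)
    return low <= parse(version) < upper


def mismatches(manifest_ranges: dict, pinned: dict) -> list[str]:
    """Names whose pinned version falls outside the manifest's caret range."""
    return sorted(
        name
        for name, rng in manifest_ranges.items()
        if name in pinned and not satisfies_caret(pinned[name], rng)
    )


# Values copied from manifest.json and package.json in this diff.
MANIFEST_DEPS = {
    "@botlearn/summarizer": "^1.0.0",
    "@botlearn/keyword-extractor": "^1.0.0",
}
PACKAGE_DEPS = {
    "@botlearn/keyword-extractor": "0.1.0",
    "@botlearn/summarizer": "0.1.0",
}
```

Under that assumption, `mismatches(MANIFEST_DEPS, PACKAGE_DEPS)` reports both dependencies, since `0.1.0` is below the `^1.0.0` floor. Whether that matters depends on how the agent runtime reads the manifest, which is not shown here.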
package/skill.md
ADDED
@@ -0,0 +1,45 @@
+---
+name: writer
+role: Article Writing Specialist
+version: 1.0.0
+triggers:
+  - "write an article"
+  - "write about"
+  - "compose"
+  - "draft article"
+  - "blog post"
+---
+
+# Role
+
+You are an Article Writing Specialist. When activated, you produce structured, evidence-based articles with clear thesis statements, well-supported arguments, and consistent style. You leverage summarization capabilities to distill research and keyword extraction to optimize topic coverage and SEO relevance.
+
+# Capabilities
+
+1. Construct well-organized articles following established structural patterns (inverted pyramid, narrative arc, analytical framework) appropriate to the topic and audience
+2. Develop clear, defensible thesis statements and decompose them into supporting arguments with logical progression
+3. Integrate multiple evidence types (statistical data, expert quotes, case studies, analogies) to substantiate claims while maintaining readability
+4. Maintain consistent tone, voice, and style throughout the article, adapting register to the target audience and publication context
+5. Use @botlearn/summarizer to condense research material into key points for evidence integration
+6. Use @botlearn/keyword-extractor to identify core terms, optimize topic coverage, and ensure semantic completeness
+
+# Constraints
+
+1. Never present claims without supporting evidence — every assertion must be backed by data, expert opinion, or logical reasoning
+2. Never switch tone or register mid-article without deliberate rhetorical intent — style must remain consistent from introduction to conclusion
+3. Never produce an article without a clearly identifiable thesis statement within the first two paragraphs
+4. Never use a single evidence type exclusively — diversify between statistics, expert testimony, examples, and logical arguments
+5. Always include a strong conclusion that reinforces the thesis and provides forward-looking insight or a call to action
+6. Always verify that each paragraph serves a clear purpose in advancing the overall argument
+
+# Activation
+
+WHEN the user requests article writing, composition, or blog post creation:
+1. Analyze the topic, target audience, desired length, and publication context
+2. Use @botlearn/keyword-extractor to identify core terms and related concepts for comprehensive topic coverage
+3. Use @botlearn/summarizer to distill any provided research material or references into usable evidence points
+4. Follow strategies/main.md for the 7-step writing workflow
+5. Apply knowledge/domain.md for article structure selection and argumentation framework
+6. Ensure quality using knowledge/best-practices.md for thesis development, evidence integration, and style consistency
+7. Verify against knowledge/anti-patterns.md to avoid weak thesis, unsupported claims, and tone inconsistency
+8. Output a complete, publication-ready article with clear structure and sourced evidence
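The front matter of `skill.md` lists five trigger phrases, but the diff does not say how the agent runtime matches them. As an illustration only, the sketch below assumes a case-insensitive substring match; `matches_trigger` is a hypothetical helper, not an OpenClaw API.

```python
# Trigger phrases copied from the skill.md front matter.
TRIGGERS = [
    "write an article",
    "write about",
    "compose",
    "draft article",
    "blog post",
]


def matches_trigger(request: str) -> bool:
    """True if any trigger phrase occurs in the request, ignoring case."""
    text = request.lower()
    return any(trigger in text for trigger in TRIGGERS)
```

A plain substring match is simple but loose: a request containing "decompose" would also match the `compose` trigger, so a real matcher might prefer word boundaries.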
package/strategies/main.md
ADDED
@@ -0,0 +1,130 @@
+---
+strategy: writer
+version: 1.0.0
+steps: 7
+---
+
+# Article Writing Strategy
+
+## Step 1: Topic Research & Scoping
+- Parse the user's request to identify: **topic**, **target audience**, **desired length**, **publication context**, **tone preference**
+- Use @botlearn/keyword-extractor to extract core terms, related concepts, and semantic clusters from the topic
+- Use @botlearn/summarizer to condense any provided reference materials, sources, or background documents into key evidence points
+- IF the topic is ambiguous or underspecified THEN ask clarifying questions:
+  - "Who is the target audience?"
+  - "What is the desired length and format?"
+  - "Is there a specific angle or thesis you want to pursue?"
+- Identify 3-5 key subtopics that the article must address for comprehensive coverage
+- Assess available evidence: what data, examples, or expert perspectives exist for this topic?
+
+## Step 2: Outline Construction
+- SELECT article structure from knowledge/domain.md based on topic and purpose:
+  - Persuasive or opinion → Analytical Framework
+  - News or time-sensitive → Inverted Pyramid
+  - Feature or case study → Narrative Arc
+  - How-to or reference → Listicle / Modular
+  - Technical or consulting → Problem-Solution
+- Draft a hierarchical outline:
+  - **Level 1**: Major sections (introduction, argument blocks, conclusion)
+  - **Level 2**: Key points within each section
+  - **Level 3**: Evidence placeholders (type of evidence needed for each point)
+- VERIFY outline covers all subtopics identified in Step 1
+- VERIFY each argument block maps directly to the thesis
+- IF outline has fewer than 3 argument blocks THEN expand — the article needs sufficient depth
+- IF outline has more than 5 argument blocks THEN consolidate — focus risks dilution
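Step 2 reduces to two small rules: a lookup from article purpose to structure, and a bounds check on the number of argument blocks. A minimal sketch of both follows. The purpose keys are lowercased versions of the labels in the list, and the function names and return strings are illustrative assumptions.

```python
# Purpose-to-structure mapping taken from the Step 2 list.
STRUCTURE_BY_PURPOSE = {
    "persuasive": "Analytical Framework",
    "opinion": "Analytical Framework",
    "news": "Inverted Pyramid",
    "time-sensitive": "Inverted Pyramid",
    "feature": "Narrative Arc",
    "case study": "Narrative Arc",
    "how-to": "Listicle / Modular",
    "reference": "Listicle / Modular",
    "technical": "Problem-Solution",
    "consulting": "Problem-Solution",
}


def select_structure(purpose: str) -> str:
    """Look up the article structure for a purpose label (case-insensitive)."""
    return STRUCTURE_BY_PURPOSE[purpose.strip().lower()]


def outline_advice(argument_blocks: int) -> str:
    """Apply the 3-to-5 argument block rule from Step 2."""
    if argument_blocks < 3:
        return "expand"       # too shallow
    if argument_blocks > 5:
        return "consolidate"  # focus risks dilution
    return "ok"
```

An unknown purpose raises `KeyError` here. The strategy itself does not say what to do in that case; asking a clarifying question, as in Step 1, seems a reasonable fallback.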
+
+## Step 3: Argument Building
+- SELECT argumentation framework from knowledge/domain.md:
+  - Default: Toulmin Model (Claim → Data → Warrant → Backing → Qualifier → Rebuttal)
+  - Contentious topics: Rogerian Argument (Acknowledge opposition → Common ground → Position → Synthesis)
+  - Persuasive essays: Classical Rhetorical Appeals (Ethos → Pathos → Logos)
+- For each argument block in the outline:
+  1. State the **claim** in one clear sentence
+  2. Identify the **evidence** needed (type and source)
+  3. Write the **warrant** — the logical bridge between evidence and claim
+  4. Add a **qualifier** if the claim has conditions or limitations
+  5. IF a counterargument exists THEN draft a **rebuttal**
+- VERIFY logical consistency: no argument should contradict another; each should build on the previous
+- SELF-CHECK: Does each argument independently support the thesis? Remove any that don't
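The five-part argument block in Step 3 maps naturally onto a small record type. The sketch below treats claim, evidence, and warrant as required and the qualifier and rebuttal as optional, following the conditional wording of items 4 and 5. The class and method names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ArgumentBlock:
    """One argument block as described in Step 3."""
    claim: str                        # one clear sentence
    evidence: str                     # type and source of the evidence
    warrant: str                      # bridge between evidence and claim
    qualifier: Optional[str] = None   # only if the claim has conditions
    rebuttal: Optional[str] = None    # only if a counterargument exists

    def missing_parts(self) -> list[str]:
        """Names of required parts that are empty or whitespace."""
        required = {
            "claim": self.claim,
            "evidence": self.evidence,
            "warrant": self.warrant,
        }
        return [name for name, text in required.items() if not text.strip()]
```

A block with a claim and evidence but no warrant would report `["warrant"]`, which is the gap the step most explicitly asks the writer to close.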
+
+## Step 4: Evidence Integration
+- For each evidence placeholder in the outline, SELECT evidence from available sources:
+  - Apply evidence selection criteria from knowledge/best-practices.md: relevance, recency, authority, representativeness
+  - Target a minimum of 3 different evidence types across the article (statistics, expert quotes, case studies, analogies, logical reasoning)
+  - High-stakes claims (central to thesis) require 2-3 supporting evidence items
+- Integrate evidence using appropriate introduction patterns from knowledge/best-practices.md:
+  - Signal phrase for attributed authority
+  - Data-first for maximum impact
+  - Contextual embedding for narrative flow
+- APPLY the PEEL structure to each paragraph: Point → Evidence → Explain → Link
+  - Never leave evidence unanalyzed — always explain what it means and how it connects to the argument
+- VERIFY source attribution: every piece of evidence must identify who, when, and in what context
+- CHECK against knowledge/anti-patterns.md:
+  - No unsupported claims (anti-pattern #5)
+  - No evidence without analysis (anti-pattern #6)
+  - No cherry-picked evidence (anti-pattern #7)
+  - No outdated evidence (anti-pattern #8)
+  - No single-source dependency (anti-pattern #9)
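Two of the Step 4 rules are countable: at least 3 distinct evidence types across the article, and at least 2 evidence items for each claim central to the thesis. The sketch below checks both. It assumes each evidence item is a dict with `type` and `claim` keys, a representation chosen for illustration and not defined by the package.

```python
from collections import Counter


def distinct_types(evidence: list[dict]) -> set[str]:
    """Set of evidence types used across the article."""
    return {item["type"] for item in evidence}


def meets_diversity(evidence: list[dict], minimum: int = 3) -> bool:
    """True if at least `minimum` different evidence types are used."""
    return len(distinct_types(evidence)) >= minimum


def under_supported(evidence: list[dict], central_claims: list[str],
                    minimum: int = 2) -> list[str]:
    """Central claims backed by fewer than `minimum` evidence items."""
    counts = Counter(item["claim"] for item in evidence)
    return [claim for claim in central_claims if counts[claim] < minimum]
```

The defaults mirror the numbers in the step (3 types, and the lower bound of "2-3 supporting evidence items"). Relevance, recency, and authority still need human or model judgment; this only covers the counts.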
+
+## Step 5: Draft Composition
+- Write the complete article following the outline and assembled arguments:
+  - **Introduction** (1-2 paragraphs):
+    - Open with a hook: a surprising statistic, provocative question, vivid anecdote, or bold claim
+    - Provide necessary context for the topic
+    - State the thesis clearly within the first two paragraphs
+  - **Body** (per outline):
+    - Follow the selected article structure from Step 2
+    - Use the argumentation framework from Step 3 for each argument block
+    - Embed evidence per Step 4 integration patterns
+    - Include transitions between every section and paragraph
+  - **Conclusion** (1-2 paragraphs):
+    - Restate the thesis in light of the evidence presented (do not copy-paste the introduction)
+    - Synthesize the key arguments into a higher-order insight
+    - End with a call to action, forward-looking prediction, or thought-provoking question
+- APPLY sentence rhythm: alternate short and long sentences for readability
+- MAINTAIN consistent tone throughout — refer to the tone selection framework in knowledge/best-practices.md
+
+## Step 6: Style Check & Polish
+- VERIFY tone consistency: read the full draft and flag any shifts in register, formality, or voice
+  - Check: Do the introduction and conclusion use the same register?
+  - Check: Are all sections using the same person (first, second, third)?
+  - Check: Is vocabulary level consistent throughout?
+- VERIFY readability:
+  - Paragraphs are 3-6 sentences (digital) or 4-8 sentences (print)
+  - Sentence variety: mix of short (8-12 words) and long (20-30 words)
+  - Active voice usage is at 80%+ — convert passive constructions where possible
+  - Eliminate filler words: "very," "really," "basically," "actually," "just"
+  - Replace nominalizations: "made an improvement" → "improved"
+- VERIFY transitions:
+  - Every paragraph opens with a connection to the previous one
+  - Subheadings are used every 200-300 words for scannable content
+- CHECK against knowledge/anti-patterns.md:
+  - No tone inconsistency (anti-pattern #11)
+  - No register mismatch (anti-pattern #12)
+  - No passive voice overuse (anti-pattern #13)
+  - No filler/hedge overload (anti-pattern #14)
+  - No wall of text (anti-pattern #15)
+  - No buried lead (anti-pattern #16)
+  - No weak conclusion (anti-pattern #17)
+  - No missing transitions (anti-pattern #18)
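Some of the Step 6 readability checks are mechanical enough to approximate in code. The sketch below covers two of them: the filler-word list and the 3-6 sentence paragraph range for digital content. Sentence splitting uses a naive regular expression on terminal punctuation, so abbreviations such as "e.g." will be miscounted. All function names are illustrative.

```python
import re

# Filler words named in the Step 6 checklist.
FILLERS = {"very", "really", "basically", "actually", "just"}


def split_sentences(paragraph: str) -> list[str]:
    """Naively split a paragraph after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]


def filler_words(text: str) -> list[str]:
    """Filler words found in the text, in order of appearance."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w in FILLERS]


def paragraph_length_ok(paragraph: str, low: int = 3, high: int = 6) -> bool:
    """True if the paragraph has between `low` and `high` sentences."""
    return low <= len(split_sentences(paragraph)) <= high
```

For print, the same check applies with `low=4, high=8`. Tone, register, and active-voice ratio are harder to measure this way and are left out of the sketch.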
+
+## Step 7: Revision & Final Quality Assurance
+- SELF-CHECK — Thesis integrity:
+  - Can you identify the thesis in one sentence? Is it in the first two paragraphs?
+  - Does the conclusion restate it consistently (not contradict it)?
+  - Does every argument block connect back to the thesis? (anti-pattern #4: moving thesis)
+- SELF-CHECK — Evidence sufficiency:
+  - Are there at least 3 different evidence types used?
+  - Is every major claim supported?
+  - Are counterarguments acknowledged for contentious claims?
+- SELF-CHECK — Structural coherence:
+  - Does the article follow the selected structure pattern consistently?
+  - Is the information prioritized appropriately (most important first for inverted pyramid; tension built for narrative)?
+  - Are there clear section markers and logical flow?
+- SELF-CHECK — Audience alignment:
+  - Is the vocabulary appropriate for the target audience?
+  - Is the depth of explanation correct (not too basic, not too advanced)?
+  - Would the target reader find this article valuable?
+- IF any check fails THEN revise the specific section and re-run the failed check
+- Use @botlearn/keyword-extractor to verify final keyword coverage — ensure core topic terms appear with appropriate frequency
+- Output the final article with clear formatting: title, subheadings, paragraphs, and source attributions
package/tests/benchmark.json
ADDED
@@ -0,0 +1,506 @@
+{
+  "version": "0.0.1",
+  "dimension": "creative-generation",
+  "tasks": [
+    {
+      "id": "bench-easy-01",
+      "difficulty": "easy",
+      "description": "Write a short informational article on a well-known topic",
+      "input": "Write a 400-500 word article explaining why regular code reviews improve software quality. Target audience: junior developers. Use an accessible, encouraging tone.",
+      "rubric": [
+        {
+          "criterion": "Thesis Clarity",
+          "weight": 0.3,
+          "scoring": {
+            "5": "Clear thesis within the first two paragraphs stating that code reviews improve quality, with a specific angle (e.g., catches bugs early, promotes knowledge sharing)",
+            "3": "General statement about code reviews being useful but lacking a specific, arguable claim",
+            "1": "No identifiable thesis; reads as a list of facts",
+            "0": "No central argument"
+          }
+        },
+        {
+          "criterion": "Evidence & Support",
+          "weight": 0.3,
+          "scoring": {
+            "5": "At least two concrete supporting points with evidence (statistics, examples, or expert references); each point analyzed, not just stated",
+            "3": "Supporting points present but evidence is generic or unsupported assertions",
+            "1": "Claims without any evidence",
+            "0": "No supporting arguments"
+          }
+        },
+        {
+          "criterion": "Style & Tone",
+          "weight": 0.4,
+          "scoring": {
+            "5": "Consistently accessible and encouraging tone appropriate for junior developers; no jargon without explanation; active voice; smooth flow",
+            "3": "Mostly appropriate tone with occasional shifts to overly formal or condescending register",
+            "1": "Tone mismatch — too technical or too patronizing for the audience",
+            "0": "No audience awareness"
+          }
+        }
+      ],
+      "expectedScoreWithout": 40,
+      "expectedScoreWith": 80
+    },
+    {
+      "id": "bench-easy-02",
+      "difficulty": "easy",
+      "description": "Write a practical how-to blog post with clear structure",
+      "input": "Write a 400-500 word blog post titled 'How to Write a Great Git Commit Message'. Use a listicle or modular structure with clear sections. Target audience: developers of all levels.",
+      "rubric": [
+        {
+          "criterion": "Structure",
+          "weight": 0.35,
+          "scoring": {
+            "5": "Clear modular/listicle structure with introduction, 3+ distinct sections with subheadings, and a summary/conclusion; each section is self-contained",
+            "3": "Some structure visible but sections blend together or lack clear headings",
+            "1": "Wall of text with no structural organization",
+            "0": "No discernible structure"
+          }
+        },
+        {
+          "criterion": "Content Quality",
+          "weight": 0.35,
+          "scoring": {
+            "5": "Practical, actionable advice with concrete examples of good and bad commit messages; explains the 'why' behind each practice",
+            "3": "Advice is reasonable but generic; lacks concrete examples",
+            "1": "Vague or incorrect guidance",
+            "0": "Content is irrelevant or unusable"
+          }
+        },
+        {
+          "criterion": "Completeness",
+          "weight": 0.3,
+          "scoring": {
+            "5": "Covers key aspects: subject line format, body content, imperative mood, scope, and provides a conclusion tying it together; appropriate length",
+            "3": "Covers 2-3 aspects but misses important ones",
+            "1": "Covers only one aspect superficially",
+            "0": "Incomplete or off-topic"
+          }
+        }
+      ],
+      "expectedScoreWithout": 45,
+      "expectedScoreWith": 82
+    },
+    {
+      "id": "bench-easy-03",
+      "difficulty": "easy",
+      "description": "Write a brief opinion piece with a clear position",
+      "input": "Write a 300-400 word opinion piece arguing that open-source software is essential for innovation in the tech industry. Use a professional tone suitable for a tech publication.",
+      "rubric": [
+        {
+          "criterion": "Thesis & Position",
+          "weight": 0.35,
+          "scoring": {
+            "5": "Strong, specific thesis stated early; clear position that open-source is essential (not just 'useful') with a defined scope (innovation in tech)",
+            "3": "Position is present but hedged or vague; reads more as 'open source is nice' than 'essential'",
+            "1": "No clear position; balanced overview instead of an opinion",
+            "0": "No identifiable argument"
+          }
+        },
+        {
+          "criterion": "Argumentation",
+          "weight": 0.35,
+          "scoring": {
+            "5": "At least two distinct arguments with supporting evidence; logical progression from claim to evidence to analysis; brief acknowledgment of counterpoint",
+            "3": "Arguments present but evidence is thin; no counterargument engagement",
+            "1": "Assertions without supporting logic or evidence",
+            "0": "No argumentation"
+          }
+        },
+        {
+          "criterion": "Conclusion Strength",
+          "weight": 0.3,
+          "scoring": {
+            "5": "Conclusion reinforces thesis, synthesizes arguments into a broader insight, and ends with a call to action or forward-looking statement",
+            "3": "Conclusion restates the thesis but doesn't add new insight",
+            "1": "Abrupt ending or vague summary",
+            "0": "No conclusion"
+          }
+        }
+      ],
+      "expectedScoreWithout": 40,
+      "expectedScoreWith": 78
+    },
+    {
+      "id": "bench-med-01",
+      "difficulty": "medium",
+      "description": "Write an analytical article requiring evidence integration and counterargument handling",
+      "input": "Write a 700-900 word article analyzing whether remote work increases or decreases productivity for software engineering teams. Present evidence for both sides, take a clear position, and address the strongest counterargument. Target audience: engineering managers. Professional, data-informed tone.",
+      "rubric": [
+        {
+          "criterion": "Thesis & Position",
+          "weight": 0.2,
+          "scoring": {
+            "5": "Clear, nuanced thesis within first two paragraphs; takes a specific position (not 'it depends') with qualifiers (e.g., 'for senior engineers on established teams'); conclusion is consistent",
+            "3": "Position is present but too hedged or appears late in the article",
+            "1": "No clear position; article remains neutral throughout",
+            "0": "No thesis; descriptive overview only"
+          }
+        },
+        {
+          "criterion": "Evidence Integration",
+          "weight": 0.3,
+          "scoring": {
+            "5": "Uses 3+ evidence types (statistics, case studies, expert quotes); evidence is properly attributed with source, date, and context; PEEL structure visible in body paragraphs",
+            "3": "Some evidence present but not well-attributed; relies on one evidence type",
+            "1": "Minimal evidence; mostly personal opinion or unverified claims",
+            "0": "No evidence"
+          }
+        },
+        {
+          "criterion": "Counterargument Handling",
+          "weight": 0.25,
+          "scoring": {
+            "5": "Strongest counterargument presented fairly using Rogerian approach; rebutted with specific evidence; concedes valid points while maintaining thesis",
+            "3": "Counterargument acknowledged but dismissed superficially or presented as a straw man",
+            "1": "Counterargument mentioned in passing without engagement",
+            "0": "No counterargument addressed"
+          }
+        },
+        {
+          "criterion": "Style & Structure",
+          "weight": 0.25,
+          "scoring": {
+            "5": "Analytical framework structure; consistent professional tone for engineering managers; smooth transitions; clear subheadings; varied sentence rhythm",
+            "3": "Adequate structure but some tonal shifts or weak transitions",
+            "1": "Poor structure; inconsistent tone; choppy flow",
+            "0": "Unstructured; inappropriate tone"
+          }
+        }
+      ],
+      "expectedScoreWithout": 30,
+      "expectedScoreWith": 72
+    },
+    {
+      "id": "bench-med-02",
+      "difficulty": "medium",
+      "description": "Write a narrative-driven feature article using storytelling techniques",
+      "input": "Write a 700-900 word feature article about how a fictional startup pivoted from a failing product to a successful one by embracing user feedback. Use a narrative arc structure with a compelling opening scene, rising tension, and resolution. Target audience: entrepreneurs and product managers.",
+      "rubric": [
+        {
+          "criterion": "Narrative Structure",
+          "weight": 0.3,
+          "scoring": {
+            "5": "Clear narrative arc: engaging opening scene/hook, rising action with increasing tension, central problem/pivot moment, resolution with specific outcomes, reflection with broader lessons",
+            "3": "Some narrative elements but flat tension curve; events listed chronologically without dramatic structure",
+            "1": "No narrative structure; reads as an expository article about pivots",
+            "0": "No storytelling elements"
+          }
+        },
+        {
+          "criterion": "Evidence Through Story",
+          "weight": 0.25,
+          "scoring": {
+            "5": "Business insights and evidence woven naturally into the narrative; specific details (metrics, user feedback quotes, product decisions) make the story credible and instructive",
+            "3": "Some data points included but feel forced or disconnected from the narrative",
+            "1": "Pure fiction with no instructive business evidence or lessons",
+            "0": "No evidence or data integrated"
+          }
+        },
+        {
+          "criterion": "Character & Scene",
+          "weight": 0.2,
+          "scoring": {
+            "5": "Vivid characters (founder, team members, users) with distinct voices; concrete scenes with sensory details; the reader can visualize key moments",
+            "3": "Characters are named but one-dimensional; scenes are described but lack vividness",
+            "1": "No character development; abstract descriptions",
+            "0": "No characters or scenes"
+          }
+        },
+        {
+          "criterion": "Takeaway & Relevance",
+          "weight": 0.25,
+          "scoring": {
+            "5": "Article concludes with actionable insights for the target audience; lessons are drawn from the narrative rather than appended; readers can apply the insights to their own work",
+            "3": "Lessons stated but feel disconnected from the story; generic startup advice",
+            "1": "No clear takeaway for the audience",
+            "0": "Irrelevant to target audience"
+          }
+        }
+      ],
+      "expectedScoreWithout": 25,
+      "expectedScoreWith": 68
+    },
+    {
+      "id": "bench-med-03",
+      "difficulty": "medium",
+      "description": "Write an article requiring style adaptation to a specific audience",
+      "input": "Write a 600-800 word article explaining Kubernetes container orchestration for a non-technical business audience. The article should justify why a company's leadership should invest in Kubernetes, using business language (ROI, scalability, operational efficiency) rather than technical jargon. Analogies are encouraged.",
+      "rubric": [
+        {
+          "criterion": "Audience Adaptation",
+          "weight": 0.35,
+          "scoring": {
+            "5": "Entirely business-oriented language; no unexplained technical jargon; concepts framed in terms of business value (cost savings, reliability, speed to market); effective analogies that make technical concepts intuitive",
+            "3": "Mostly business language but some technical terms leak through without explanation; analogies attempted but not fully effective",
+            "1": "Heavy technical language; reads as a developer article with a few business terms added",
+            "0": "No audience adaptation; pure technical article"
+          }
+        },
+        {
+          "criterion": "Argument for Investment",
+          "weight": 0.3,
+          "scoring": {
+            "5": "Clear business case with specific benefits (cost reduction %, uptime improvement, deployment speed); addresses likely executive concerns (cost, complexity, timeline); evidence supports each benefit",
+            "3": "Business benefits mentioned but vague ('it saves money'); doesn't address concerns",
+            "1": "No clear business case; describes what Kubernetes does but not why it matters",
+            "0": "No investment argument"
+          }
+        },
+        {
+          "criterion": "Analogies & Clarity",
+          "weight": 0.2,
+          "scoring": {
+            "5": "Uses 2+ effective analogies that accurately represent the concept while being accessible (e.g., 'Kubernetes is like a shipping logistics manager...'); acknowledges where analogies simplify",
+            "3": "One analogy present but not fully developed; or analogy is inaccurate",
+            "1": "No analogies; abstract explanations only",
+            "0": "Confusing or misleading analogies"
+          }
+        },
+        {
+          "criterion": "Completeness & Structure",
+          "weight": 0.15,
+          "scoring": {
+            "5": "Complete article with clear introduction, 3+ benefit sections, and conclusion with recommendation; appropriate length; well-organized for scannability",
+            "3": "Covers some benefits but incomplete; structure needs work",
+            "1": "Fragmentary or significantly too short/long",
+            "0": "Incomplete"
+          }
+        }
+      ],
+      "expectedScoreWithout": 25,
+      "expectedScoreWith": 70
+    },
+    {
+      "id": "bench-med-04",
+      "difficulty": "medium",
+      "description": "Write an article integrating multiple evidence types with proper attribution",
+      "input": "Write a 600-800 word article about the impact of AI on software testing, arguing that AI-assisted testing will complement rather than replace human testers. Include at least one statistic, one expert perspective, one case study or example, and one logical argument. Target: QA professionals. Professional, informed tone.",
+      "rubric": [
+        {
+          "criterion": "Evidence Diversity",
+          "weight": 0.35,
+          "scoring": {
+            "5": "Includes all 4 requested evidence types (statistic, expert perspective, case study, logical argument); each is properly attributed with source, context, and date; evidence types are distributed across different argument blocks",
+            "3": "Includes 2-3 evidence types; attribution is incomplete",
+            "1": "Only one evidence type used; poor attribution",
+            "0": "No evidence; opinion only"
+          }
+        },
+        {
+          "criterion": "Evidence Analysis",
+          "weight": 0.25,
+          "scoring": {
+            "5": "Every piece of evidence is followed by analysis explaining what it means and how it supports the argument (PEEL structure); evidence is woven into the narrative, not dropped in",
+            "3": "Some evidence is analyzed but others are stated without interpretation",
+            "1": "Evidence is listed without analysis",
+            "0": "No analysis"
+          }
+        },
+        {
+          "criterion": "Thesis & Argumentation",
+          "weight": 0.2,
+          "scoring": {
+            "5": "Clear 'complement not replace' thesis stated early; arguments logically build the case with specific examples of complementary roles",
+            "3": "Thesis present but arguments are generic",
+            "1": "Vague position; doesn't clearly argue complement vs. replace",
+            "0": "No thesis"
+          }
+        },
+        {
+          "criterion": "Audience & Tone",
+          "weight": 0.2,
+          "scoring": {
+            "5": "Professional, informed tone respecting QA professionals' expertise; uses QA-specific terminology appropriately; addresses their real concerns (job security, skill evolution)",
+            "3": "Generally appropriate but occasionally condescending or too basic for QA professionals",
+            "1": "Tone mismatch; either too academic or too casual for the audience",
+            "0": "No audience awareness"
+          }
+        }
+      ],
+      "expectedScoreWithout": 25,
+      "expectedScoreWith": 70
+    },
|
|
325
|
+
{
|
|
326
|
+
"id": "bench-hard-01",
|
|
327
|
+
"difficulty": "hard",
|
|
328
|
+
"description": "Write a long-form article on a contentious topic requiring balanced evidence, strong counterargument handling, and sophisticated structure",
|
|
329
|
+
"input": "Write a 1000-1200 word article for a technology policy publication arguing whether governments should mandate algorithmic transparency for AI systems used in hiring decisions. Take a clear position, present evidence from multiple domains (technology, law, ethics, business), engage with the two strongest counterarguments, and propose a specific policy recommendation. Formal, authoritative tone.",
|
|
330
|
+
"rubric": [
|
|
331
|
+
{
|
|
332
|
+
"criterion": "Thesis & Policy Position",
|
|
333
|
+
"weight": 0.2,
|
|
334
|
+
"scoring": {
|
|
335
|
+
"5": "Specific, nuanced thesis stated within first two paragraphs; clear policy recommendation (not just 'transparency is good'); scoped to hiring AI specifically; conclusion reinforces with implementation specifics",
|
|
336
|
+
"3": "Position present but vague or overly general; recommendation lacks specificity",
|
|
337
|
+
"1": "Balanced overview without taking a position; no policy recommendation",
|
|
338
|
+
"0": "No identifiable argument or recommendation"
|
|
339
|
+
}
|
|
340
|
+
},
|
|
341
|
+
{
|
|
342
|
+
"criterion": "Multi-Domain Evidence",
|
|
343
|
+
"weight": 0.25,
|
|
344
|
+
"scoring": {
|
|
345
|
+
"5": "Evidence drawn from all 4 requested domains (technology, law, ethics, business) with proper attribution; domains are connected, not siloed; at least 5 distinct evidence items across the article",
|
|
346
|
+
"3": "Evidence from 2-3 domains; some well-attributed, others generic",
|
|
347
|
+
"1": "Evidence from only one domain or unattributed claims",
|
|
348
|
+
"0": "No substantive evidence"
|
|
349
|
+
}
|
|
350
|
+
},
|
|
351
|
+
{
|
|
352
|
+
"criterion": "Counterargument Engagement",
|
|
353
|
+
"weight": 0.25,
|
|
354
|
+
"scoring": {
|
|
355
|
+
"5": "Two strongest counterarguments identified and presented fairly (Rogerian approach); each rebutted with specific evidence; valid points conceded; overall argument strengthened by the engagement",
|
|
356
|
+
"3": "One counterargument addressed adequately; second missing or treated as a straw man",
|
|
357
|
+
"1": "Counterarguments mentioned but not seriously engaged",
|
|
358
|
+
"0": "No counterarguments addressed"
|
|
359
|
+
}
|
|
360
|
+
},
|
|
361
|
+
{
|
|
362
|
+
"criterion": "Structure & Sophistication",
|
|
363
|
+
"weight": 0.15,
|
|
364
|
+
"scoring": {
|
|
365
|
+
"5": "Analytical framework structure with clear progression; transitions create a logical chain between sections; subheadings guide the reader; introduction and conclusion frame the article as a coherent argument",
|
|
366
|
+
"3": "Adequate structure but some sections feel disconnected; transitions need work",
|
|
367
|
+
"1": "Poor organization; arguments presented randomly",
|
|
368
|
+
"0": "No discernible structure"
|
|
369
|
+
}
|
|
370
|
+
},
|
|
371
|
+
{
|
|
372
|
+
"criterion": "Tone & Authority",
|
|
373
|
+
"weight": 0.15,
|
|
374
|
+
"scoring": {
|
|
375
|
+
"5": "Consistently formal, authoritative tone appropriate for a policy publication; precise language; no casual expressions; demonstrates deep domain familiarity through vocabulary and framing",
|
|
376
|
+
"3": "Mostly formal but occasional informalities; vocabulary sometimes imprecise",
|
|
377
|
+
"1": "Tone inconsistencies; shifts between formal and casual; vocabulary too basic for a policy audience",
|
|
378
|
+
"0": "Inappropriate tone throughout"
|
|
379
|
+
}
|
|
380
|
+
}
|
|
381
|
+
],
|
|
382
|
+
"expectedScoreWithout": 20,
|
|
383
|
+
"expectedScoreWith": 65
|
|
384
|
+
},
|
|
385
|
+
{
|
|
386
|
+
"id": "bench-hard-02",
|
|
387
|
+
"difficulty": "hard",
|
|
388
|
+
"description": "Write an article requiring creative structure choice, audience adaptation, and synthesis of complex material",
|
|
389
|
+
"input": "Write a 1000-1200 word article for a general audience explaining why quantum computing will not replace classical computers for most tasks, but will transform specific domains like drug discovery, cryptography, and optimization. The article must be engaging for non-technical readers while being technically accurate. Use a problem-solution structure: frame the common misconception first, then systematically dismantle it while explaining what quantum computing actually excels at.",
|
|
390
|
+
"rubric": [
|
|
391
|
+
{
|
|
392
|
+
"criterion": "Structure Execution",
|
|
393
|
+
"weight": 0.2,
|
|
394
|
+
"scoring": {
|
|
395
|
+
"5": "Effective problem-solution structure: clearly frames the 'quantum will replace classical' misconception as the problem; systematically addresses it through 3+ domain-specific sections; each section debunks while reframing; conclusion synthesizes",
|
|
396
|
+
"3": "Some problem-solution elements but the misconception isn't clearly framed upfront; structure drifts into general explainer",
|
|
397
|
+
"1": "No clear problem-solution structure; standard expository article",
|
|
398
|
+
"0": "No recognizable structure"
|
|
399
|
+
}
|
|
400
|
+
},
|
|
401
|
+
{
|
|
402
|
+
"criterion": "Technical Accuracy & Accessibility",
|
|
403
|
+
"weight": 0.3,
|
|
404
|
+
"scoring": {
|
|
405
|
+
"5": "Technically accurate explanation of quantum computing limitations and strengths; complex concepts made accessible through analogies and everyday language; no unexplained jargon; reader finishes with correct understanding",
|
|
406
|
+
"3": "Mostly accurate but some oversimplifications are misleading; 1-2 jargon terms without explanation",
|
|
407
|
+
"1": "Technical inaccuracies or overly simplified to the point of being wrong; heavy jargon",
|
|
408
|
+
"0": "Technically incorrect or incomprehensible to target audience"
|
|
409
|
+
}
|
|
410
|
+
},
|
|
411
|
+
{
|
|
412
|
+
"criterion": "Domain Coverage",
|
|
413
|
+
"weight": 0.2,
|
|
414
|
+
"scoring": {
|
|
415
|
+
"5": "All three requested domains (drug discovery, cryptography, optimization) covered with specific examples of quantum advantage; explains why classical computers struggle in each domain",
|
|
416
|
+
"3": "Two domains covered well; third is superficial or missing",
|
|
417
|
+
"1": "Only one domain addressed; others mentioned in passing",
|
|
418
|
+
"0": "No domain-specific coverage"
|
|
419
|
+
}
|
|
420
|
+
},
|
|
421
|
+
{
|
|
422
|
+
"criterion": "Engagement & Readability",
|
|
423
|
+
"weight": 0.15,
|
|
424
|
+
"scoring": {
|
|
425
|
+
"5": "Compelling hook; sustained reader interest through varied techniques (analogies, questions, concrete scenarios); sentences are varied in length and rhythm; paragraphs are scannable",
|
|
426
|
+
"3": "Reasonably engaging but flat in places; limited use of engagement techniques",
|
|
427
|
+
"1": "Dry, textbook-like prose; no engagement techniques",
|
|
428
|
+
"0": "Boring or confusing; reader would stop early"
|
|
429
|
+
}
|
|
430
|
+
},
|
|
431
|
+
{
|
|
432
|
+
"criterion": "Conclusion & Synthesis",
|
|
433
|
+
"weight": 0.15,
|
|
434
|
+
"scoring": {
|
|
435
|
+
"5": "Conclusion synthesizes the domain examples into a coherent view of quantum computing's actual role; provides a forward-looking perspective that is both realistic and exciting; reinforces the thesis",
|
|
436
|
+
"3": "Summary restates main points but doesn't synthesize into a new insight",
|
|
437
|
+
"1": "Weak or missing conclusion",
|
|
438
|
+
"0": "No conclusion"
|
|
439
|
+
}
|
|
440
|
+
}
|
|
441
|
+
],
|
|
442
|
+
"expectedScoreWithout": 20,
|
|
443
|
+
"expectedScoreWith": 62
|
|
444
|
+
},
|
|
445
|
+
{
|
|
446
|
+
"id": "bench-hard-03",
|
|
447
|
+
"difficulty": "hard",
|
|
448
|
+
"description": "Write an article requiring integration of dependent skills (summarizer + keyword-extractor) with complex argumentation",
|
|
449
|
+
"input": "Here is a collection of raw research notes on microservices architecture:\n\n- Netflix migrated to microservices in 2009-2012, reducing deployment time from weeks to minutes. Source: Netflix Tech Blog, 2016.\n- A 2023 InfoQ survey found 78% of organizations using microservices reported increased deployment frequency, but 65% reported increased operational complexity.\n- Martin Fowler warns about the 'microservices premium': the overhead is only justified at a certain scale. Source: martinfowler.com, 2015.\n- Amazon's two-pizza team rule emerged from their service-oriented architecture. Each team owns one service. Source: 'Working Backwards', 2021.\n- A 2024 DORA report found that elite performers using microservices deployed 973x more frequently than low performers, but the gap disappeared when controlling for team practices.\n- Kelsey Hightower tweeted: 'Monoliths are the future because the people who are currently building microservices will realize that most of their problems don't require microservices.'\n- Segment (analytics company) moved FROM microservices BACK to a monolith in 2018, citing excessive operational overhead for their team size. Source: Segment Engineering Blog.\n\nUsing this research material, write a 900-1100 word article for a CTO audience arguing whether startups should begin with a monolith or microservices. Summarize the research into coherent evidence, extract the key themes, take a clear position, and address counterarguments.",
|
|
450
|
+
"rubric": [
|
|
451
|
+
{
|
|
452
|
+
"criterion": "Research Synthesis",
|
|
453
|
+
"weight": 0.25,
|
|
454
|
+
"scoring": {
|
|
455
|
+
"5": "All 7 research notes are synthesized into coherent themes (not listed one by one); evidence is grouped by argument rather than by source; contradictions are acknowledged and resolved; demonstrates summarization capability",
|
|
456
|
+
"3": "Most research notes used but presented sequentially rather than synthesized; some grouping by theme",
|
|
457
|
+
"1": "Research notes copy-pasted or minimally rephrased; no synthesis",
|
|
458
|
+
"0": "Research notes ignored or misrepresented"
|
|
459
|
+
}
|
|
460
|
+
},
|
|
461
|
+
{
|
|
462
|
+
"criterion": "Keyword & Theme Extraction",
|
|
463
|
+
"weight": 0.15,
|
|
464
|
+
"scoring": {
|
|
465
|
+
"5": "Core themes clearly extracted (deployment frequency, operational complexity, team size, organizational maturity); keyword coverage is comprehensive; themes drive the article structure",
|
|
466
|
+
"3": "Some themes identified but article structure doesn't follow them; coverage gaps",
|
|
467
|
+
"1": "No thematic organization; random coverage of research points",
|
|
468
|
+
"0": "No theme extraction"
|
|
469
|
+
}
|
|
470
|
+
},
|
|
471
|
+
{
|
|
472
|
+
"criterion": "Argumentation Quality",
|
|
473
|
+
"weight": 0.25,
|
|
474
|
+
"scoring": {
|
|
475
|
+
"5": "Clear position stated early; Toulmin model visible (claim, data, warrant, qualifier, rebuttal); evidence from research notes used as data with analysis; counterarguments addressed with specific evidence from the notes",
|
|
476
|
+
"3": "Position taken but argumentation is shallow; evidence cited but not analyzed",
|
|
477
|
+
"1": "Weak or shifting position; evidence not connected to claims",
|
|
478
|
+
"0": "No clear argument"
|
|
479
|
+
}
|
|
480
|
+
},
|
|
481
|
+
{
|
|
482
|
+
"criterion": "Evidence Attribution",
|
|
483
|
+
"weight": 0.15,
|
|
484
|
+
"scoring": {
|
|
485
|
+
"5": "Every research note properly attributed (source, date); statistics include context; expert perspectives include credentials/relevance; case studies include outcomes",
|
|
486
|
+
"3": "Most sources attributed but some are vague ('a study found...')",
|
|
487
|
+
"1": "Poor attribution; claims presented as facts without sourcing",
|
|
488
|
+
"0": "No attribution"
|
|
489
|
+
}
|
|
490
|
+
},
|
|
491
|
+
{
|
|
492
|
+
"criterion": "CTO Audience Fit",
|
|
493
|
+
"weight": 0.2,
|
|
494
|
+
"scoring": {
|
|
495
|
+
"5": "Framed in terms of business decisions CTOs face (team size, deployment speed, operational cost); actionable recommendation with specific conditions; professional, strategic tone; addresses real tradeoffs",
|
|
496
|
+
"3": "Generally professional but doesn't specifically address CTO decision-making context",
|
|
497
|
+
"1": "Too technical (developer-focused) or too high-level (executive summary without depth)",
|
|
498
|
+
"0": "No audience awareness"
|
|
499
|
+
}
|
|
500
|
+
}
|
|
501
|
+
],
|
|
502
|
+
"expectedScoreWithout": 20,
|
|
503
|
+
"expectedScoreWith": 65
|
|
504
|
+
}
|
|
505
|
+
]
|
|
506
|
+
}
|
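As a quick sanity check on the rubric data in `benchmark.json`, the sketch below verifies that a task's criterion weights sum to 1 and that every criterion defines the same four scoring levels. It is illustrative only: the helper name `check_rubric` is hypothetical and not part of `@botlearn/writer`, and the requirement that weights sum to 1 is an assumption inferred from the values shown in the diff, not a documented rule. The sample data copies the weights of `bench-hard-01`, with the scoring descriptions shortened to placeholders.

```python
import math

# Scoring levels used by the rubric entries shown above.
EXPECTED_LEVELS = {"0", "1", "3", "5"}


def check_rubric(rubric, tol=1e-9):
    """Return a list of problems found in a rubric (empty list if none).

    Assumption: weights are meant to sum to 1.0. The package does not
    state this; it merely holds for the rubrics shown in the diff.
    """
    problems = []
    total = sum(c["weight"] for c in rubric)
    if not math.isclose(total, 1.0, abs_tol=tol):
        problems.append(f"weights sum to {total}, expected 1.0")
    for c in rubric:
        if set(c["scoring"]) != EXPECTED_LEVELS:
            problems.append(f"{c['criterion']}: unexpected scoring levels")
    return problems


# Weights copied from bench-hard-01; descriptions replaced by placeholders.
placeholder = {"5": "...", "3": "...", "1": "...", "0": "..."}
bench_hard_01 = [
    {"criterion": "Thesis & Policy Position", "weight": 0.2, "scoring": placeholder},
    {"criterion": "Multi-Domain Evidence", "weight": 0.25, "scoring": placeholder},
    {"criterion": "Counterargument Engagement", "weight": 0.25, "scoring": placeholder},
    {"criterion": "Structure & Sophistication", "weight": 0.15, "scoring": placeholder},
    {"criterion": "Tone & Authority", "weight": 0.15, "scoring": placeholder},
]

print(check_rubric(bench_hard_01))
```

A check like this could run over every task in the file before a benchmark run, so that a mistyped weight is caught before it skews the scores.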
package/tests/smoke.json
ADDED
|
@@ -0,0 +1,54 @@
|
|
|
1
|
+
{
|
|
2
|
+
"version": "0.0.1",
|
|
3
|
+
"timeout": 60,
|
|
4
|
+
"tasks": [
|
|
5
|
+
{
|
|
6
|
+
"id": "smoke-01",
|
|
7
|
+
"description": "Write a structured analytical article on a technical topic with clear thesis, evidence, and consistent style",
|
|
8
|
+
"input": "Write a 600-800 word article arguing that test-driven development (TDD) improves long-term code quality for backend services. Target audience is mid-level software engineers. Include at least two supporting arguments with evidence, address one counterargument, and maintain a professional but accessible tone throughout.",
|
|
9
|
+
"rubric": [
|
|
10
|
+
{
|
|
11
|
+
"criterion": "Thesis Clarity",
|
|
12
|
+
"weight": 0.25,
|
|
13
|
+
"scoring": {
|
|
14
|
+
"5": "Clear, specific, arguable thesis stated within the first two paragraphs; thesis is scoped to backend services and long-term quality; conclusion restates it consistently",
|
|
15
|
+
"3": "Thesis is present but vague or overly broad (e.g., 'TDD is good'); may appear late in the article",
|
|
16
|
+
"1": "No identifiable thesis; article reads as a general overview of TDD",
|
|
17
|
+
"0": "No central argument; disconnected observations"
|
|
18
|
+
}
|
|
19
|
+
},
|
|
20
|
+
{
|
|
21
|
+
"criterion": "Argument Structure & Evidence",
|
|
22
|
+
"weight": 0.3,
|
|
23
|
+
"scoring": {
|
|
24
|
+
"5": "Two or more distinct arguments with supporting evidence (statistics, case studies, or expert references); counterargument acknowledged and rebutted; PEEL structure visible in body paragraphs",
|
|
25
|
+
"3": "Arguments present but evidence is thin or generic; counterargument missing or dismissed without engagement",
|
|
26
|
+
"1": "Claims made without evidence; no logical structure to the argument progression",
|
|
27
|
+
"0": "No discernible argumentation; purely descriptive content"
|
|
28
|
+
}
|
|
29
|
+
},
|
|
30
|
+
{
|
|
31
|
+
"criterion": "Style Consistency",
|
|
32
|
+
"weight": 0.25,
|
|
33
|
+
"scoring": {
|
|
34
|
+
"5": "Consistent professional-accessible tone throughout; no register shifts; active voice dominant; smooth transitions between all sections; varied sentence rhythm",
|
|
35
|
+
"3": "Mostly consistent tone with 1-2 noticeable shifts; some awkward transitions; occasional passive voice clusters",
|
|
36
|
+
"1": "Frequent tone shifts between casual and formal; choppy transitions; monotonous sentence structure",
|
|
37
|
+
"0": "No consistent style; reads like multiple authors wrote different sections"
|
|
38
|
+
}
|
|
39
|
+
},
|
|
40
|
+
{
|
|
41
|
+
"criterion": "Article Completeness",
|
|
42
|
+
"weight": 0.2,
|
|
43
|
+
"scoring": {
|
|
44
|
+
"5": "Complete article with hook/introduction, thesis, body arguments, counterargument, and strong conclusion with call to action or forward-looking insight; appropriate length",
|
|
45
|
+
"3": "Article has introduction and body but weak conclusion; may be too short or missing counterargument section",
|
|
46
|
+
"1": "Incomplete article missing major sections (no introduction or no conclusion)",
|
|
47
|
+
"0": "Fragment or outline rather than a complete article"
|
|
48
|
+
}
|
|
49
|
+
}
|
|
50
|
+
],
|
|
51
|
+
"passThreshold": 60
|
|
52
|
+
}
|
|
53
|
+
]
|
|
54
|
+
}
|
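The smoke test pairs a weighted rubric with a `passThreshold` of 60, but the diff does not show how a score is derived from the rubric. The sketch below shows one plausible reading, under two loudly stated assumptions: each criterion is graded at one of the levels 0, 1, 3, or 5, and the final score is the weighted sum of `level / 5`, scaled to 100. The function name `weighted_score` and the sample grades are hypothetical; only the weights and the threshold come from `smoke.json`.

```python
# Weights copied from the smoke-01 rubric above.
SMOKE_01_WEIGHTS = {
    "Thesis Clarity": 0.25,
    "Argument Structure & Evidence": 0.3,
    "Style Consistency": 0.25,
    "Article Completeness": 0.2,
}

PASS_THRESHOLD = 60  # "passThreshold" from smoke.json
MAX_LEVEL = 5        # highest scoring level in each rubric


def weighted_score(weights, levels):
    """Combine per-criterion levels (0, 1, 3, or 5) into a 0-100 score.

    Assumed formula (not confirmed by the package):
    score = 100 * sum(weight * level / MAX_LEVEL).
    """
    return 100 * sum(
        weight * levels[criterion] / MAX_LEVEL
        for criterion, weight in weights.items()
    )


# Hypothetical grades for one generated article.
sample_levels = {
    "Thesis Clarity": 5,
    "Argument Structure & Evidence": 3,
    "Style Consistency": 3,
    "Article Completeness": 3,
}

score = weighted_score(SMOKE_01_WEIGHTS, sample_levels)
print(f"score={score:.1f} passed={score >= PASS_THRESHOLD}")
```

Under this reading, an article graded at level 3 on every criterion would sit right at the threshold, so a harness using floating-point arithmetic may want a small tolerance when comparing against `passThreshold`.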