research-copilot 0.2.2 → 0.2.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (72)
  1. package/app/out/main/index.mjs +124 -49
  2. package/app/out/preload/index.js +6 -0
  3. package/app/out/renderer/assets/{MilkdownMarkdownEditor-jaF-aGPn.js → MilkdownMarkdownEditor-G1iOHwc_.js} +50 -50
  4. package/app/out/renderer/assets/{arc-C1kBmvvR.js → arc-4dzvPWwv.js} +1 -1
  5. package/app/out/renderer/assets/{blockDiagram-c4efeb88-Do93X2rs.js → blockDiagram-c4efeb88-Cj5lBciZ.js} +8 -8
  6. package/app/out/renderer/assets/{c4Diagram-c83219d4-DgxxcZWC.js → c4Diagram-c83219d4-D4wJF4Nk.js} +3 -3
  7. package/app/out/renderer/assets/{channel-Co_M0Svj.js → channel-DRHGMFu6.js} +1 -1
  8. package/app/out/renderer/assets/{classDiagram-beda092f-CQlHgE6H.js → classDiagram-beda092f-DrBIQ4Da.js} +6 -6
  9. package/app/out/renderer/assets/{classDiagram-v2-2358418a-CkGG3aI2.js → classDiagram-v2-2358418a-yBhKnZh4.js} +10 -10
  10. package/app/out/renderer/assets/{clone-C18Y6dgC.js → clone-ChFuWsyG.js} +1 -1
  11. package/app/out/renderer/assets/{createText-1719965b-DGRc6nys.js → createText-1719965b-dtk9hA3P.js} +2 -2
  12. package/app/out/renderer/assets/{edges-96097737-BXvJ4fAK.js → edges-96097737-D_e1x_h6.js} +3 -3
  13. package/app/out/renderer/assets/{erDiagram-0228fc6a-CXjPp0pt.js → erDiagram-0228fc6a-CrZBuvPZ.js} +5 -5
  14. package/app/out/renderer/assets/{flowDb-c6c81e3f-CNhpbtw_.js → flowDb-c6c81e3f-DtZC9Fq3.js} +1 -1
  15. package/app/out/renderer/assets/{flowDiagram-50d868cf-KZ_BUCPA.js → flowDiagram-50d868cf-VzlFN6Qf.js} +12 -12
  16. package/app/out/renderer/assets/{flowDiagram-v2-4f6560a1-IMv50KZP.js → flowDiagram-v2-4f6560a1-BWNPZGSS.js} +12 -12
  17. package/app/out/renderer/assets/{flowchart-elk-definition-6af322e1-BFwFiPvq.js → flowchart-elk-definition-6af322e1-BRS2deVY.js} +6 -6
  18. package/app/out/renderer/assets/{ganttDiagram-a2739b55-D0-ehN-T.js → ganttDiagram-a2739b55-DRChnjlR.js} +3 -3
  19. package/app/out/renderer/assets/{gitGraphDiagram-82fe8481-DUyIR0Dv.js → gitGraphDiagram-82fe8481-BVfXECHD.js} +2 -2
  20. package/app/out/renderer/assets/{graph-DnTq2_3F.js → graph-Dl3bczvk.js} +1 -1
  21. package/app/out/renderer/assets/{index-5325376f-CBwuFbRF.js → index-5325376f-DjPany76.js} +6 -6
  22. package/app/out/renderer/assets/{index-u0FZRZON.js → index-6EerDbdL.js} +4 -4
  23. package/app/out/renderer/assets/{index-BHcU72Rm.js → index-B9tygoLQ.js} +3 -3
  24. package/app/out/renderer/assets/{index-DWU4ia28.js → index-BNdWI-dg.js} +6 -6
  25. package/app/out/renderer/assets/{index-D6r8msaQ.js → index-BPwe457Y.js} +3 -3
  26. package/app/out/renderer/assets/{index-DuhageEr.js → index-BQHeBYdD.js} +3 -3
  27. package/app/out/renderer/assets/{index-C1oXjI4L.js → index-BUUZSVeh.js} +3 -3
  28. package/app/out/renderer/assets/{index-Diy30-34.js → index-Bg-eDX1p.js} +4 -4
  29. package/app/out/renderer/assets/{index-CKXwBmK7.js → index-BkB91HdR.js} +5 -5
  30. package/app/out/renderer/assets/{index-BB-a1ajC.js → index-BlpusrHE.js} +136 -19
  31. package/app/out/renderer/assets/{index-gH-w4EHk.js → index-C226G81f.js} +3 -3
  32. package/app/out/renderer/assets/{index-DZbrRR7w.js → index-C4rx4LuD.js} +6 -6
  33. package/app/out/renderer/assets/{index-BpKrXGYD.js → index-C8-8M1o_.js} +3 -3
  34. package/app/out/renderer/assets/{index-CjffvluT.js → index-CLSRcr1i.js} +6 -6
  35. package/app/out/renderer/assets/{index-7hDGClrI.js → index-CMSALyTS.js} +3 -3
  36. package/app/out/renderer/assets/{index-yanwpi6t.js → index-CU9ZRB66.js} +6 -6
  37. package/app/out/renderer/assets/{index-bMe3RSkw.js → index-CgEKFTyJ.js} +6 -6
  38. package/app/out/renderer/assets/{index-D6jljsup.js → index-CllWzrq7.js} +3 -3
  39. package/app/out/renderer/assets/{index-ESFHcvWy.js → index-DHmQz4fM.js} +3 -3
  40. package/app/out/renderer/assets/{index-h_fNksib.js → index-DxziAmqO.js} +3 -3
  41. package/app/out/renderer/assets/{index-COZSDrEw.js → index-HZn90B-L.js} +6 -6
  42. package/app/out/renderer/assets/{index-CT1HtzVp.css → index-dNBQ09OL.css} +60 -0
  43. package/app/out/renderer/assets/{index-BQ7qz1CD.js → index-dvpto11c.js} +3 -3
  44. package/app/out/renderer/assets/{index-BVYoMX5H.js → index-hVorRCxO.js} +3 -3
  45. package/app/out/renderer/assets/{index-JT8OCsRP.js → index-nk-me1QW.js} +1 -1
  46. package/app/out/renderer/assets/{infoDiagram-8eee0895-Qra4japr.js → infoDiagram-8eee0895-CRzw5OpC.js} +2 -2
  47. package/app/out/renderer/assets/{journeyDiagram-c64418c1-BTN9SgOL.js → journeyDiagram-c64418c1-CYFcrwy9.js} +4 -4
  48. package/app/out/renderer/assets/{layout-DGrHHJdN.js → layout-2qUs4rWy.js} +2 -2
  49. package/app/out/renderer/assets/{line-DXtxdS2B.js → line-zEaIEY7C.js} +1 -1
  50. package/app/out/renderer/assets/{linear-CexrSQK6.js → linear-B-lut2jS.js} +1 -1
  51. package/app/out/renderer/assets/{mindmap-definition-8da855dc-pvG2hzEB.js → mindmap-definition-8da855dc--G6w66OU.js} +3 -3
  52. package/app/out/renderer/assets/{pieDiagram-a8764435-D_neFVMq.js → pieDiagram-a8764435-B1aj6YO1.js} +3 -3
  53. package/app/out/renderer/assets/{quadrantDiagram-1e28029f-C47W3UMp.js → quadrantDiagram-1e28029f-WcQ-5B6J.js} +3 -3
  54. package/app/out/renderer/assets/{requirementDiagram-08caed73-DW4Bo_fu.js → requirementDiagram-08caed73-DVS_F7Ld.js} +5 -5
  55. package/app/out/renderer/assets/{sankeyDiagram-a04cb91d-D_3PD7JI.js → sankeyDiagram-a04cb91d-C9DI8I5c.js} +2 -2
  56. package/app/out/renderer/assets/{sequenceDiagram-c5b8d532-BW6nGtuQ.js → sequenceDiagram-c5b8d532-BbjlXU_R.js} +3 -3
  57. package/app/out/renderer/assets/{stateDiagram-1ecb1508-CDgBJ3-T.js → stateDiagram-1ecb1508-D-RBhWue.js} +6 -6
  58. package/app/out/renderer/assets/{stateDiagram-v2-c2b004d7-CBw5TtXo.js → stateDiagram-v2-c2b004d7-5WeKw5B9.js} +10 -10
  59. package/app/out/renderer/assets/{styles-b4e223ce-DeeiEsuW.js → styles-b4e223ce-FiETMPKg.js} +1 -1
  60. package/app/out/renderer/assets/{styles-ca3715f6-CMpiebrG.js → styles-ca3715f6-DOyTWqN-.js} +1 -1
  61. package/app/out/renderer/assets/{styles-d45a18b0-CZe9hU7H.js → styles-d45a18b0-D2yoeLkD.js} +4 -4
  62. package/app/out/renderer/assets/{svgDrawCommon-b86b1483-CmJZfZzJ.js → svgDrawCommon-b86b1483-7THlOYFk.js} +1 -1
  63. package/app/out/renderer/assets/{timeline-definition-faaaa080-Beo2kiiz.js → timeline-definition-faaaa080-BVp4ikHz.js} +3 -3
  64. package/app/out/renderer/assets/{xychartDiagram-f5964ef8-DYmo7moz.js → xychartDiagram-f5964ef8-Cdp51x2C.js} +5 -5
  65. package/app/out/renderer/index.html +2 -2
  66. package/lib/skills/builtin/paper-revision/SKILL.md +467 -0
  67. package/lib/skills/builtin/paper-revision/references/evidence-strengthening.md +101 -0
  68. package/lib/skills/builtin/paper-revision/references/framing-patterns.md +119 -0
  69. package/lib/skills/builtin/paper-revision/references/reviewer-attack-catalog.md +171 -0
  70. package/lib/skills/builtin/paper-revision/references/venue-strategies.md +114 -0
  71. package/lib/skills/builtin/paper-writing/SKILL.md +1 -0
  72. package/package.json +4 -4
@@ -0,0 +1,467 @@ package/lib/skills/builtin/paper-revision/SKILL.md
---
name: paper-revision
description: "Strategically revise an existing CS/AI/Systems conference paper draft for resubmission or submission readiness. Focuses on framing diagnosis, claim crystallization, narrative unification, evidence strengthening, reviewer defense, and venue-aware polish."
category: Writing & Review
depends: []
tags: [Paper Revision, Conference Paper, Resubmission, Reviewer Defense, Framing, LaTeX, Academic Writing, Systems, NeurIPS, ICML, OSDI, NSDI, ASPLOS, SOSP, SC]
triggers: [revise paper, improve paper, strengthen paper, paper revision, reframe paper, reviewer defense, improve draft, resubmit paper, camera-ready prep, 修改论文, 改论文, 论文修改, 论文返修, 返修投稿]
---

# Strategic Paper Revision

Systematic methodology for revising an existing paper draft to maximize acceptance probability at a target venue. This skill is for **strengthening a draft that already exists**, not for writing from scratch.

## Overview

Paper revision is fundamentally different from paper writing. Writing assembles findings into a narrative. Revision asks: *given what we have, what is the strongest defensible story, and how do we make it hardest for reviewers to reject?*

The core principle: **Reviewer-adversarial thinking drives all decisions.** Every revision choice — from framing to figure captions — should be evaluated by asking: "How would the most skeptical reviewer read this?"

This skill is optimized for conference-style peer review in CS/AI/Systems venues, where framing, reviewer defense, and claim-evidence alignment determine acceptance more than sentence-level polish alone.

## When to Use This Skill

Use this skill when:
- The user has an existing CS/AI/Systems conference paper draft and wants to strengthen it for a target venue
- A conference paper was rejected and needs strategic revision for resubmission
- The user wants to reframe the core contribution, strengthen evidence, or prepare reviewer defense

Do NOT use this skill when:
- No draft exists yet — use `paper-writing` instead
- The user only wants language polish without structural changes — use `rewrite-humanize` instead
- The user only wants an evaluation score — use `scholar-evaluation` instead
- The user is writing or revising a journal manuscript, technical report, or general scientific manuscript — use `scientific-writing` instead
- The user mainly needs literature collection or citation discovery — use the `literature-search` tool first

### paper-revision vs paper-writing: Which to Use

| Situation | Use This Skill | Why |
|-----------|---------------|-----|
| Have a draft, want to strengthen it | `paper-revision` | Strategic framing, claim alignment, reviewer defense |
| No draft yet, need to write from scratch | `paper-writing` | Narrative assembly, section drafting, template setup |
| Rejected paper, need to resubmit | `paper-revision` | Diagnose what went wrong, reframe, address reviews |
| Draft exists but only needs language cleanup | `rewrite-humanize` | Pure prose polish, no structural changes |

---

## How This Skill Is Used in Practice

**This skill is NOT a linear pipeline that must run end-to-end every time.** It is a structured toolbox with a recommended order. In practice, users enter at whatever phase matches their current need, work on that phase, and stop or continue as needed.

### Entry Point Detection

Determine which phase to enter based on the user's request:

| User says | Enter at | What to do |
|-----------|----------|------------|
| "Review this draft" / "How should I improve this paper?" / "Is this ready for SC?" | **Phase 1** | Full strategic diagnosis, then recommend next steps |
| "Is our core claim right?" / "How should we frame this?" / "Brainstorm framings" | **Phase 2** | Claim crystallization and framing selection |
| "Unify the narrative" / "Fix intro/abstract/conclusion" / "Make it consistent" | **Phase 3** | Propagate framing through all sections |
| "What evidence do we need?" / "Design an experiment" / "Add numbers" | **Phase 4** | Evidence audit and strengthening |
| "What will reviewers attack?" / "Add limitations" / "Reviewer defense" | **Phase 5** | Anticipate objections and write defenses |
| "Polish for SC" / "Remove AI tone" / "Final cleanup" | **Phase 6** | Venue-appropriate language polish (delegates to `rewrite-humanize`) |
| "Do a literature search for this paper" | **Use `literature-search` tool directly** | Then return to Phase 3 or 4 to integrate |

### Progressive Engagement

A typical revision unfolds over multiple conversations:

1. **First session:** Phase 1 diagnosis → user decides on framing direction
2. **Second session:** Phase 2-3, crystallize claim and propagate through intro/abstract
3. **Third session:** Phase 3 continued, unify methods/results/conclusion
4. **Fourth session:** Phase 4-5, strengthen evidence and add reviewer defense
5. **Final session:** Phase 6, language polish

But many sessions only touch one phase. That is normal and expected.

### After Each Phase

Always end with a concrete "suggested next step" so the user can decide whether to continue, jump to a different phase, or stop. Do not assume the user wants the full pipeline.

## Revision Philosophy

### Five axioms that govern all revision decisions

1. **Framing > Structure > Evidence > Language.** Fix the highest layer first. No amount of polish saves a misframed paper.

2. **One paper, one thesis.** Every section, subsection, figure, and table must serve a single identifiable main claim. If a component doesn't serve the thesis, it either needs repositioning or removal.

3. **Claim-evidence alignment.** Never let claims exceed evidence. A paper that claims less but proves it convincingly beats a paper that claims more but leaves gaps.

4. **Proactive defense beats reactive rebuttal.** Address likely attacks in the paper itself. Reviewer goodwill is far higher for "they anticipated this" than for "they didn't think of this."

5. **Minimum change, maximum persuasion.** The goal is not to rewrite the paper; it is to find the highest-leverage changes that shift reviewer perception from reject to accept.

---

## The Six-Phase Revision Workflow

### Phase 1: Strategic Diagnosis

**Goal:** Determine whether the paper's core claim is defensible at the target venue, and identify the strongest possible framing given existing evidence.

**Steps:**

1. **Read the full draft end-to-end.** Do not start revising until you understand the complete argument.

2. **Identify what the paper currently claims.** Extract:
   - The implicit thesis (what the paper acts like it's about)
   - The explicit thesis (what the contribution bullets say)
   - Whether these two match

3. **Stress-test the core claim.** Ask:
   - If the most hostile reviewer attacks the main claim, can it survive?
   - Is the claim too broad for the evidence? Too narrow to be interesting?
   - Is the claim positioned as the paper's strongest point, or is it buried?
   - Does the claim fit the target venue's values?

4. **Generate 2-3 alternative framings.** For each, evaluate:
   - How well does existing evidence support it?
   - How hard is it for reviewers to dismiss?
   - How relevant is it to the target venue's community?

5. **Select the strongest framing.** The best framing is the one where:
   - Existing evidence most directly supports the claim
   - The claim is hardest to decompose into "just integration" or "just incremental"
   - The venue's reviewers would find it most relevant

**Key pattern: The Gap Chain.**
For papers that risk being seen as "pipeline integration," crystallize the contribution as a chain of necessary conditions. Example:

> "Prior work solves A, B, and C separately. But the problem requires A+B+C+D simultaneously. Our contribution is not any single component, but the protocol that enforces all four conditions together."

This makes the novelty architectural rather than algorithmic, which is much harder for reviewers to dismiss.

**Key pattern: The Bottleneck Shift.**
For measurement/evaluation papers, organize findings not as parallel observations but as a unified mechanism:

> "The dominant bottleneck shifts from X on task type A, to Y on task type B, to Z on task type C."

This turns "three separate findings" into "one mechanistic result."

**Key pattern: Three-Layer Claim Hierarchy.**
When collaborators disagree about the core contribution, resolve it by separating:

- **Value layer** (what practical benefit does this deliver?)
- **Problem layer** (what problem does this solve, and using what resources?)
- **Mechanism layer** (what is the technical novelty that makes it work?)

All three layers should appear in the paper, but the mechanism layer is the defensible novelty, while the value layer is what makes reviewers care.

**When to delegate:**
- If the user wants a structured quality score before strategic diagnosis, use `scholar-evaluation` first, then return here to interpret the scores as revision priorities.
- If alternative framings need deeper exploration, use `brainstorming-research-ideas` (Tension Hunting, Stakeholder Rotation, and Simplicity Test lenses are most relevant).

**Output:** A written decision on the paper's revised framing, including the main thesis, contribution hierarchy, and what NOT to claim.

---

### Phase 2: Claim Crystallization

**Goal:** Compress the revised framing into precise, quotable language that will propagate through the entire paper.

**Steps:**

1. **Write the one-sentence thesis.** This sentence must be:
   - Specific enough that no prior work obviously satisfies it
   - Broad enough to capture the paper's full contribution
   - Stated in terms of what the paper *shows*, not what it *does*

2. **Write the contribution bullets (2-3 max).** Each bullet should be:
   - A falsifiable claim
   - Supported by a specific section of the paper
   - Not decomposable into "this existed before"

3. **Write the "what we do NOT claim" boundary.** This is critical for:
   - Preventing reviewers from attacking claims you never made
   - Demonstrating maturity and scope awareness
   - Pre-empting "overgeneralization" criticisms

4. **Define the key terminology.** Pick one term for each core concept and commit to it across the entire paper. Common revision failure: the same concept called three different things in different sections.
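
Part of this commitment can be checked mechanically. A minimal sketch in Python, assuming the paper is a flat directory of `.tex` files; the synonym map and the directory name are illustrative placeholders, not part of this skill:

```python
import re
from pathlib import Path

# Illustrative synonym map: each canonical term is listed with the variants
# that should no longer appear. Build the real map from the paper's glossary.
SYNONYMS = {
    "execution trace": ["run log", "trajectory"],
    "workload": ["job mix", "task set"],
}

def find_inconsistent_terms(tex_dir: str) -> None:
    """Flag every occurrence of a non-canonical variant in the .tex sources."""
    for tex_file in sorted(Path(tex_dir).glob("*.tex")):
        text = tex_file.read_text(encoding="utf-8")
        for canonical, variants in SYNONYMS.items():
            for variant in variants:
                for match in re.finditer(re.escape(variant), text, re.IGNORECASE):
                    # Convert the match offset to a 1-based line number.
                    line_no = text.count("\n", 0, match.start()) + 1
                    print(f"{tex_file.name}:{line_no}: '{variant}' -> '{canonical}'")

find_inconsistent_terms("paper/")  # hypothetical source directory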

**Key pattern: Necessity Framing.**
For systems papers, the strongest claim form is often:

> "Deployment-facing X becomes possible only when conditions A, B, C, and D are enforced together."

This "only when enforced together" framing is much harder to attack than "we combine A, B, C, and D."

**Key pattern: Secondary Contribution Blending.**
When a secondary result (e.g., sample efficiency, cost reduction) matters but shouldn't dominate:

- Include it in the value layer of the hierarchy
- Give it one clear contribution bullet, positioned after the primary contribution
- Mention it in abstract/intro/conclusion, but never before the primary claim
- In evaluation, show it as a consequence of the primary mechanism, not a separate result

**Output:** Final thesis sentence, contribution bullets, scope boundary, and terminology glossary.

---

### Phase 3: Narrative Unification

**Goal:** Propagate the revised framing through every high-weight position in the paper, eliminating inconsistencies.

**Propagation order (do not skip steps):**

1. **Title.** Does it convey the right paper type? (Measurement study? Systems contribution? Benchmark?)

2. **Abstract.** Rewrite to match the new framing. The abstract should:
   - State the problem (not the solution) first
   - Position the contribution as the mechanism, not just the artifact
   - Include 2-3 key numbers for quantitative punch
   - End with practical significance, not just "we release X"

3. **Introduction.** This is where most revision energy should go. Check:
   - Does the opening paragraph establish why the problem matters?
   - Does the gap paragraph explain what's missing (not just "no one did this")?
   - Does the contribution paragraph match the crystallized claims?
   - Is there a clear forward pointer to the rest of the paper?

4. **Section openings.** Every section's first paragraph should answer: "What is this section's job in supporting the main thesis?"

5. **Methods/Design section.** Redefine each subsection's role:
   - Not "what component we built" but "what condition this enforces"
   - Not "how clever our algorithm is" but "why this step is necessary"

6. **Results/Evaluation section.** Rewrite the opening as a proof roadmap:
   - "Our evaluation answers N questions: (1)... (2)... (3)..."
   - Each subsection should map to one question

7. **Related work.** The ending paragraph must create a gap statement that maps directly to your contribution bullets.

8. **Conclusion.** Structure as: what was missing → what we enforced → what we demonstrated → what remains.

**Key pattern: Section Role Redefinition.**
For each section, write a one-sentence "job description." Example:

- Methods section: "This section's job is not to describe benchmark components but to build measurement credibility."
- Evaluation section: "This section's job is not to rank agents but to isolate which design choices materially change reliability."

If a section's content doesn't match its job description, revise the content.

**Key pattern: Figure Captions as Narrative.**
Figure captions are often the most-read text after abstract and introduction. Revise captions to:
- Serve the main thesis, not just describe the figure
- Include the key takeaway, not just "X vs Y"
- Use terminology consistent with the rest of the paper

**Output:** Revised title, abstract, intro, section openings, and conclusion — all telling the same story.

---

### Phase 4: Evidence Strengthening

**Goal:** Ensure every claim has quantitative support, and identify the highest-ROI experiments or analyses to add.

**Steps:**

1. **Audit claim-evidence pairs.** For each contribution bullet:
   - What specific result supports it?
   - Is there a number? (Qualitative claims are weaker than quantitative ones.)
   - Is there a comparison? (Absolute numbers are weaker than deltas.)

2. **Front-load key numbers.** The abstract and intro should contain 2-4 of the most important quantitative results. Not all results — just the ones that make the thesis concrete.

3. **Identify evidence gaps.** Common gaps:
   - Mechanism claimed but only observational evidence provided
   - Comparison to "default" but not to reasonable alternatives
   - Aggregate results but no per-category breakdown
   - Single-run results but no stability/variance information

4. **Design minimum-cost evidence additions.** Prioritize by ROI:
   - **Highest ROI:** Add numbers to claims that currently lack them
   - **High ROI:** Add a small diagnostic experiment to validate a mechanism claim
   - **Medium ROI:** Add an ablation or controlled comparison
   - **Lower ROI:** Add more baselines or larger-scale experiments

5. **Add structured artifacts.** High-value, low-cost additions:
   - **Failure taxonomy table:** What fails, where, and what mitigates it
   - **Deployment guidance table:** What to invest in for each workload regime
   - **Summary statistics table:** End-to-end reduction / cost / efficiency numbers in one place
   - **Capability comparison table:** What prior work covers vs. what you add

**Key pattern: Diagnostic Intervention.**
When you claim a mechanism (e.g., "failures are caused by semantic grounding gaps"), design a minimal experiment that tests the mechanism directly:
- Change one variable (e.g., add one sentence of expert clarification)
- Keep everything else fixed
- Show the claimed mechanism is confirmed or refuted
- Frame it as "mechanism validation," not "new method"

**Key pattern: Necessity Evidence.**
For systems papers, prove that simpler alternatives are insufficient:
- Compare against progressively simpler baselines
- Show that each removed component degrades the result
- Frame the comparison section as "Why simpler alternatives are insufficient" rather than "Baseline comparison"

**Key pattern: Method Auditability.**
For any non-trivial algorithm or heuristic, make the method reviewer-auditable:
- State exact metrics, thresholds, and decision rules
- Explain how thresholds were chosen
- Define what happens in ambiguous cases
- Describe recovery/fallback behavior
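
To make this concrete, here is a minimal sketch of what a decision rule looks like when written to that standard; the metric, threshold values, and fallback policy are hypothetical placeholders, not recommendations:

```python
# Hypothetical example of an auditable decision rule: the metric, the
# threshold, the ambiguity band, and the fallback are all stated explicitly,
# so a reviewer can re-derive every decision from the paper text.

SIMILARITY_THRESHOLD = 0.85  # chosen on a held-out dev split; report how in the paper
AMBIGUITY_MARGIN = 0.05      # scores within this band of the threshold are ambiguous

def classify_candidate(score: float) -> str:
    """Return "accept", "reject", or "manual-review" for a similarity score."""
    if score >= SIMILARITY_THRESHOLD + AMBIGUITY_MARGIN:
        return "accept"
    if score <= SIMILARITY_THRESHOLD - AMBIGUITY_MARGIN:
        return "reject"
    # Ambiguous cases are routed to manual review rather than silently
    # resolved: this is the recovery/fallback behavior reviewers ask about.
    return "manual-review"
```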

**When to delegate:**
- If literature gaps are identified, use `literature-search` to find missing references, then return here to integrate them into the gap-framing strategy.
- If a diagnostic experiment needs to be designed, use `paper-writing` experiment-design patterns as a reference.

**Output:** List of evidence gaps, prioritized additions, and any new tables or figures.

---

### Phase 5: Reviewer Defense

**Goal:** Preemptively address the most likely reviewer attacks in the paper itself.

**Steps:**

1. **List the 3-5 most likely reviewer objections.** Common categories:
   - **Novelty:** "This is just integration / incremental / obvious"
   - **Scope:** "Single site / small dataset / narrow domain"
   - **Generalization:** "Would this work elsewhere?"
   - **Baselines:** "Comparison is unfair / baseline too weak"
   - **Methodology:** "How were thresholds chosen? Is this reproducible?"
   - **Overclaim:** "The conclusion goes beyond what the evidence shows"

2. **For each objection, decide: defend in text or acknowledge as limitation?**
   - If you have evidence: defend in the relevant section
   - If you don't but the claim still holds: acknowledge scope and explain why it doesn't invalidate the claim
   - If it's a genuine limitation: put it in limitations/discussion

3. **Write preemptive defense paragraphs.** For scope limitations:
   - "We do not claim X. We claim Y, which is a controlled, high-fidelity first step."
   - "The goal is not broad universality but controlled realism."

4. **Calibrate claim language throughout.** Remove:
   - Absolute words where hedging is warranted ("always" → "typically")
   - Emotion/rhetoric ("just an illusion," "trivial")
   - Overclaims that exceed evidence

   But also remove:
   - Excessive hedging that weakens solid claims ("may perhaps suggest" → "suggests")

5. **Ensure limitations section exists and is honest but strategic.**
   - Acknowledge real limitations before reviewers find them
   - Explain why each limitation doesn't undermine the core claim
   - Position future work as natural extensions, not missing pieces

**Output:** Revised limitations section, preemptive defense paragraphs, and calibrated claim language.

---

### Phase 6: Venue-Appropriate Polish

**Goal:** Make the paper read like a mature submission to the target venue.

**Steps:**

1. **Terminology consistency pass.** Check that each core concept uses exactly one term throughout (the Phase 2 checker sketch applies here as well).

2. **Remove AI-generated prose patterns.** Common tells:
   - Formulaic transitions ("First and foremost," "It is worth noting that")
   - Redundant self-commentary ("The key insight here is...")
   - List-heavy structure where paragraphs are expected
   - Inflated vocabulary ("leverage" → "use," "delve into" → "investigate")
   - Mechanical "总分总" (general-specific-general) paragraph structure

3. **Venue-specific style.** Apply `rewrite-humanize` with venue awareness:
   - **SC/HPDC:** Direct claims, reproducibility details, restrained rhetoric, operational relevance
   - **OSDI/SOSP:** Strong problem formulation, safety boundaries, design invariants, deployment considerations
   - **NeurIPS/ICML:** Contribution clarity, ablation rigor, theoretical grounding where possible

   See `@skill/references/venue-strategies.md` for detailed guidance.

4. **Page budget check.** If over limit:
   - Compress related work first (it's the easiest to shorten without losing substance)
   - Move method details to appendix
   - Tighten figure/table spacing
   - Do NOT cut limitations or evaluation

5. **Final compilation check.** Verify LaTeX compiles cleanly. Fix:
   - Undefined references
   - Missing citations
   - Overfull/underfull boxes that affect readability
   - BibTeX warnings on critical entries
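
Much of this check can be scripted against the build log. A minimal sketch, assuming a standard log file named `paper.log` left by the LaTeX run; the filename is a placeholder:

```python
import re
from pathlib import Path

# Patterns LaTeX writes to the .log for the problems listed above.
PATTERNS = {
    "undefined reference": r"LaTeX Warning: Reference `[^']+' .* undefined",
    "missing citation": r"LaTeX Warning: Citation `[^']+' .* undefined",
    "overfull box": r"Overfull \\[hv]box",
}

log = Path("paper.log").read_text(encoding="utf-8", errors="ignore")
for label, pattern in PATTERNS.items():
    hits = re.findall(pattern, log)
    print(f"{label}: {len(hits)} occurrence(s)")
```

Note that LaTeX wraps long log lines, so a quick scan like this can undercount; treat a nonzero count as a definite problem rather than the reverse.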

**When to delegate:**
- For language-level polish, use `rewrite-humanize` with the venue context from `@skill/references/venue-strategies.md`.
- `rewrite-humanize` handles sentence-level naturalization; this phase handles terminology consistency, page budget, and compilation — the structural polish that `rewrite-humanize` does not cover.

**Output:** Polished, venue-ready manuscript.

---

## Decision Framework: What to Change and What to Leave

### Always change (high ROI, low risk)
- Misaligned title/abstract/intro framing
- Missing numbers in claims
- Inconsistent terminology
- Obvious overclaims
- Missing limitations section

### Usually change (high ROI, moderate effort)
- Section opening sentences that don't serve the thesis
- Figure captions that don't convey the takeaway
- Related work that doesn't create a gap for your contribution
- Evaluation opening that doesn't state what's being proven

### Change only if needed (moderate ROI, higher risk)
- Section structure / reordering
- Adding new experiments
- Changing the core claim itself
- Major terminology overhaul

### Almost never change (high risk, diminishing returns)
- Adding entirely new sections
- Changing the paper's fundamental direction
- Adding features/methods not already evaluated
- Rewriting code/experiments from scratch

---

## Common Revision Anti-Patterns

### 1. "Fix the writing, not the framing"
Polishing prose on a misframed paper is like painting a house with foundation problems. Always diagnose framing first.

### 2. "Add more results to compensate for weak claims"
More results don't fix a weak thesis. They make the paper longer and the weakness harder to find, but the weakness remains.

### 3. "Emphasize everything equally"
A paper with five equal contributions has zero memorable contributions. Commit to one primary claim.

### 4. "Hide limitations"
Reviewers always find them. Acknowledged limitations are forgiven; hidden limitations are punished.

### 5. "Respond to every possible criticism"
Over-defending makes the paper sound insecure. Address the 3-5 most likely attacks. Trust the rest to rebuttal.

### 6. "The abstract/intro tells a different story than the results"
This happens when revision modifies the front matter but not the evaluation framing (or vice versa). Always propagate changes end-to-end.

---

## Integration with Other Skills

Supporting skills below are optional collaborators, not prerequisites. Load them only when the current revision phase requires them.

| Phase | When to load a supporting skill | Which skill |
|-------|-------------------------------|-------------|
| Diagnosis | User wants a structured quality score before strategic diagnosis | `scholar-evaluation` |
| Claim crystallization | Alternative framings need deeper brainstorming | `brainstorming-research-ideas` |
| Evidence strengthening | Missing references identified during evidence audit | `literature-search` tool |
| Evidence strengthening | Need experiment-design patterns or checklist guidance | `paper-writing` (reference only) |
| Reviewer defense | Want to simulate a full reviewer evaluation | `scholar-evaluation` |
| Language polish | Sentence-level naturalization and de-AI-ification | `rewrite-humanize` |

---

## References

Load these as needed:

- `@skill/references/framing-patterns.md`: Common reframing strategies with examples
- `@skill/references/reviewer-attack-catalog.md`: Typical reviewer objections and defense templates
- `@skill/references/evidence-strengthening.md`: Minimum-cost evidence addition strategies
- `@skill/references/venue-strategies.md`: Venue-specific revision guidance (SC, OSDI, NeurIPS, etc.)
@@ -0,0 +1,101 @@ package/lib/skills/builtin/paper-revision/references/evidence-strengthening.md
# Evidence Strengthening Strategies

Minimum-cost additions that maximize reviewer persuasion, organized by ROI.

---

## Tier 1: Zero New Experiments (Writing Only)

These additions cost nothing but writing time and can dramatically improve reviewer perception.

### 1.1 Front-load key numbers into abstract and introduction
**Problem:** Claims are qualitative; reviewer doesn't know the magnitude of effects.
**Fix:** Extract 2-4 of the most important quantitative deltas from your results and insert them into the abstract and the introduction's central finding paragraph.
**Template:** "On [task category], [intervention] improves [metric] by [X points / X%] over [baseline], while on [other category], the same intervention changes [metric] by only [Y]."

### 1.2 Add a summary statistics table
**Problem:** Key numbers are scattered across different subsections.
**Fix:** Create a single small table that summarizes the end-to-end pipeline / reduction / efficiency story in one place.
**Columns might include:** raw input → filtered → reduced → final output → cost / samples / iterations
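
For example, a skeleton along these lines, with bracketed placeholders to be filled from your actual results:

| Stage | Items | Cumulative cost |
|-------|-------|-----------------|
| Raw input | [N0] | [C0] |
| After filtering | [N1] | [C1] |
| After reduction | [N2] | [C2] |
| Final output | [N3] | [C3] |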

### 1.3 Add a capability comparison table
**Problem:** Reviewer can't quickly see what's new vs. prior work.
**Fix:** Create a table with prior systems as rows and capabilities as columns. Check marks show what each covers. Your system should have checks in columns no other system covers.

### 1.4 Add a failure taxonomy
**Problem:** Results show when things work but not when/why they fail.
**Fix:** Create a structured table of failure modes: type, where it occurs, which configuration is vulnerable, and what mitigates it.

### 1.5 Add deployment guidance
**Problem:** Results are interesting but reviewer asks "so what should practitioners do?"
**Fix:** Create a table mapping workload/task characteristics to recommended interventions. This turns findings into actionable advice.

### 1.6 Rewrite comparator section as necessity evidence
**Problem:** Baseline comparison exists but reads as "we beat baselines."
**Fix:** Rename and reframe as "Why simpler alternatives are insufficient." The question changes from "how much do we win?" to "is the full method actually necessary?"

---

## Tier 2: Small Diagnostic Experiments

These require running a few experiments but provide disproportionate persuasive value.

### 2.1 Diagnostic intervention (mechanism validation)
**Problem:** You claim failures are caused by [mechanism X] but only have observational evidence.
**Fix:** Design a minimal, controlled intervention that directly tests the mechanism:
- Change exactly one variable (e.g., add one sentence of expert clarification)
- Keep everything else fixed (same model, same tools, same scoring)
- Show the claimed mechanism is confirmed

**Critical framing rules:**
- Call it "diagnostic probe" or "mechanism validation," NOT "new method"
- State explicitly what the intervention does NOT reveal (no answer leakage)
- State that this is not a deployable solution but a test of the failure explanation
- Allow negative results — they strengthen credibility

### 2.2 Response fidelity check
**Problem:** You claim a surrogate / proxy preserves some property, but only show feature-level similarity.
**Fix:** On a small overlap set, show that the property you actually care about (e.g., tuning ranking, improvement direction) is preserved.

### 2.3 Stability / variance check
**Problem:** Results are single-run or low-K, and reviewer questions reliability.
**Fix:** For 2-3 representative configurations, increase K modestly and show that trends/rankings are stable. Can go in appendix.
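
A minimal sketch of the ranking-stability part of this check, assuming per-configuration scores from repeated runs have already been collected; the data shape and the overlap criterion are illustrative assumptions:

```python
import statistics

def ranking_is_stable(scores: dict[str, list[float]]) -> bool:
    """Check that the ranking of configurations by mean score survives run-to-run variance.

    `scores` maps each configuration name to its per-run scores (>= 2 runs each).
    """
    ranked = sorted(scores, key=lambda c: statistics.mean(scores[c]), reverse=True)
    for better, worse in zip(ranked, ranked[1:]):
        gap = statistics.mean(scores[better]) - statistics.mean(scores[worse])
        spread = statistics.stdev(scores[better]) + statistics.stdev(scores[worse])
        # If adjacent configurations' score distributions overlap, the
        # ranking between them is not clearly stable at this K.
        if gap < spread:
            return False
    return True
```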

### 2.4 Per-item breakdown
**Problem:** Aggregate results look strong but reviewer suspects cherry-picking.
**Fix:** Show per-task / per-workload results (even just in a compact table or appendix). This provides auditability and shows the effect is broad, not driven by outliers.

---

## Tier 3: Moderate Experiments

### 3.1 Budget-matched comparison
**Problem:** Reviewer asks "what if I gave the same budget to a simpler method?"
**Fix:** Run the simplest reasonable alternative under the same resource budget. Show that your method converges faster or reaches higher quality.

### 3.2 Ablation on the conservative constraints
**Problem:** Reviewer asks "would it work without [specific constraint]?"
**Fix:** Remove one constraint at a time and show degradation. This is the strongest form of necessity evidence.

### 3.3 Cross-condition generalization
**Problem:** Reviewer questions whether results hold under different conditions.
**Fix:** Run on a small number of alternative conditions (different scale, different workload family, different model). Even 2-3 additional conditions dramatically strengthen generalization claims.

---

## How to Prioritize

### Ask these questions in order:

1. **Are there claims without numbers?** → Tier 1.1 (front-load numbers)
2. **Is there a "so what" gap?** → Tier 1.4-1.5 (failure taxonomy / deployment guidance)
3. **Is there a mechanism claim without direct evidence?** → Tier 2.1 (diagnostic intervention)
4. **Could a reviewer say "simpler methods would work"?** → Tier 1.6 or 3.2 (necessity evidence)
5. **Are key numbers scattered?** → Tier 1.2 (summary table)
6. **Is novelty unclear vs. prior work?** → Tier 1.3 (capability comparison)

### Time budget rules of thumb:
- **< 2 hours:** Do all of Tier 1
- **< 1 day:** Add 1-2 from Tier 2
- **< 1 week:** Add 1 from Tier 3
- **No time at all:** At minimum, do Tier 1.1 (front-load numbers) — it's the single highest-ROI action