npm - mindforge-cc - Versions diffs - 11.5.1 → 11.7.0 - Mend

mindforge-cc 11.5.1 → 11.7.0

Files changed (214)

package/.agent/mindforge/skill-tdd.md +53 -0
package/.agent/mindforge/skills-index.md +118 -0
package/.agent/mindforge/systematic-debug.md +60 -0
package/.agent/mindforge/wf-catalog.md +37 -0
package/.agent/mindforge/wf-code-audit.md +31 -0
package/.agent/mindforge/wf-competitive-analysis.md +31 -0
package/.agent/mindforge/wf-deep-research.md +32 -0
package/.agent/mindforge/wf-feature-planner.md +31 -0
package/.agent/mindforge/wf-incident-response.md +31 -0
package/.agent/mindforge/wf-onboard-codebase.md +31 -0
package/.agent/mindforge/wf-perf-optimize.md +31 -0
package/.agent/mindforge/wf-pr-review.md +31 -0
package/.agent/mindforge/wf-refactor-plan.md +31 -0
package/.agent/mindforge/wf-release-prep.md +31 -0
package/.agent/mindforge/wf-tdd-sprint.md +31 -0
package/.agent/mindforge/wf-tech-evaluation.md +31 -0
package/.agent/skills/1password-skill/SKILL.md +156 -0
package/.agent/skills/1password-skill/references/cli-examples.md +31 -0
package/.agent/skills/1password-skill/references/get-started.md +21 -0
package/.agent/skills/article-illustrator/SKILL.md +199 -0
package/.agent/skills/article-illustrator/references/prompt-construction.md +426 -0
package/.agent/skills/article-illustrator/references/style-presets.md +80 -0
package/.agent/skills/article-illustrator/references/styles.md +224 -0
package/.agent/skills/article-illustrator/references/usage.md +50 -0
package/.agent/skills/article-illustrator/references/workflow.md +332 -0
package/.agent/skills/arxiv/SKILL.md +275 -0
package/.agent/skills/blogwatcher/SKILL.md +130 -0
package/.agent/skills/code-wiki/SKILL.md +438 -0
package/.agent/skills/code-wiki/templates/README.md +31 -0
package/.agent/skills/code-wiki/templates/architecture.md +30 -0
package/.agent/skills/code-wiki/templates/getting-started.md +47 -0
package/.agent/skills/code-wiki/templates/module.md +38 -0
package/.agent/skills/codebase-inspection/SKILL.md +109 -0
package/.agent/skills/comic-creator/SKILL.md +240 -0
package/.agent/skills/comic-creator/references/analysis-framework.md +176 -0
package/.agent/skills/comic-creator/references/auto-selection.md +71 -0
package/.agent/skills/comic-creator/references/base-prompt.md +98 -0
package/.agent/skills/comic-creator/references/character-template.md +180 -0
package/.agent/skills/comic-creator/references/ohmsha-guide.md +85 -0
package/.agent/skills/comic-creator/references/partial-workflows.md +106 -0
package/.agent/skills/comic-creator/references/storyboard-template.md +143 -0
package/.agent/skills/comic-creator/references/workflow.md +401 -0
package/.agent/skills/concept-diagrams/SKILL.md +355 -0
package/.agent/skills/concept-diagrams/references/dashboard-patterns.md +43 -0
package/.agent/skills/concept-diagrams/references/infrastructure-patterns.md +144 -0
package/.agent/skills/concept-diagrams/references/physical-shape-cookbook.md +42 -0
package/.agent/skills/creative-ideation/SKILL.md +144 -0
package/.agent/skills/creative-ideation/references/full-prompt-library.md +110 -0
package/.agent/skills/devops-cli/SKILL.md +149 -0
package/.agent/skills/devops-cli/references/app-discovery.md +112 -0
package/.agent/skills/devops-cli/references/authentication.md +59 -0
package/.agent/skills/devops-cli/references/cli-reference.md +104 -0
package/.agent/skills/devops-cli/references/running-apps.md +171 -0
package/.agent/skills/devops-watchers/SKILL.md +103 -0
package/.agent/skills/docker-management/SKILL.md +273 -0
package/.agent/skills/domain-intel/SKILL.md +96 -0
package/.agent/skills/duckduckgo-search/SKILL.md +230 -0
package/.agent/skills/github-auth/SKILL.md +240 -0
package/.agent/skills/github-code-review/SKILL.md +474 -0
package/.agent/skills/github-code-review/references/review-output-template.md +74 -0
package/.agent/skills/github-issues/SKILL.md +363 -0
package/.agent/skills/github-issues/templates/bug-report.md +35 -0
package/.agent/skills/github-issues/templates/feature-request.md +31 -0
package/.agent/skills/github-pr-workflow/SKILL.md +360 -0
package/.agent/skills/github-pr-workflow/references/ci-troubleshooting.md +183 -0
package/.agent/skills/github-pr-workflow/references/conventional-commits.md +71 -0
package/.agent/skills/github-pr-workflow/templates/pr-body-bugfix.md +35 -0
package/.agent/skills/github-pr-workflow/templates/pr-body-feature.md +33 -0
package/.agent/skills/github-repo-management/SKILL.md +509 -0
package/.agent/skills/github-repo-management/references/github-api-cheatsheet.md +161 -0
package/.agent/skills/godmode/SKILL.md +396 -0
package/.agent/skills/godmode/references/jailbreak-templates.md +128 -0
package/.agent/skills/godmode/references/refusal-detection.md +142 -0
package/.agent/skills/hyperframes/SKILL.md +182 -0
package/.agent/skills/hyperframes/references/cli.md +185 -0
package/.agent/skills/hyperframes/references/composition.md +129 -0
package/.agent/skills/hyperframes/references/features.md +289 -0
package/.agent/skills/hyperframes/references/gsap.md +136 -0
package/.agent/skills/hyperframes/references/troubleshooting.md +137 -0
package/.agent/skills/hyperframes/references/website-to-video.md +145 -0
package/.agent/skills/jupyter-live-kernel/SKILL.md +160 -0
package/.agent/skills/kanban-orchestrator/SKILL.md +209 -0
package/.agent/skills/kanban-worker/SKILL.md +188 -0
package/.agent/skills/llm-wiki/SKILL.md +499 -0
package/.agent/skills/meme-generation/SKILL.md +122 -0
package/.agent/skills/node-inspect-debugger/SKILL.md +312 -0
package/.agent/skills/obsidian/SKILL.md +60 -0
package/.agent/skills/osint-investigation/SKILL.md +269 -0
package/.agent/skills/osint-investigation/templates/source-template.md +59 -0
package/.agent/skills/oss-forensics/SKILL.md +422 -0
package/.agent/skills/oss-forensics/references/evidence-types.md +89 -0
package/.agent/skills/oss-forensics/references/github-archive-guide.md +184 -0
package/.agent/skills/oss-forensics/references/investigation-templates.md +131 -0
package/.agent/skills/oss-forensics/references/recovery-techniques.md +164 -0
package/.agent/skills/oss-forensics/templates/forensic-report.md +151 -0
package/.agent/skills/oss-forensics/templates/malicious-package-report.md +43 -0
package/.agent/skills/parallel-cli/SKILL.md +384 -0
package/.agent/skills/pinggy-tunnel/SKILL.md +302 -0
package/.agent/skills/pixel-art/SKILL.md +209 -0
package/.agent/skills/pixel-art/references/palettes.md +49 -0
package/.agent/skills/plan/SKILL.md +331 -0
package/.agent/skills/polymarket/SKILL.md +75 -0
package/.agent/skills/polymarket/references/api-endpoints.md +220 -0
package/.agent/skills/python-debugpy/SKILL.md +368 -0
package/.agent/skills/requesting-code-review/SKILL.md +273 -0
package/.agent/skills/research-paper-writing/SKILL.md +2367 -0
package/.agent/skills/research-paper-writing/references/autoreason-methodology.md +394 -0
package/.agent/skills/research-paper-writing/references/checklists.md +434 -0
package/.agent/skills/research-paper-writing/references/citation-workflow.md +563 -0
package/.agent/skills/research-paper-writing/references/experiment-patterns.md +728 -0
package/.agent/skills/research-paper-writing/references/human-evaluation.md +476 -0
package/.agent/skills/research-paper-writing/references/paper-types.md +481 -0
package/.agent/skills/research-paper-writing/references/reviewer-guidelines.md +433 -0
package/.agent/skills/research-paper-writing/references/sources.md +191 -0
package/.agent/skills/research-paper-writing/references/writing-guide.md +474 -0
package/.agent/skills/research-paper-writing/templates/README.md +251 -0
package/.agent/skills/rest-graphql-debug/SKILL.md +507 -0
package/.agent/skills/s6-container-supervision/SKILL.md +171 -0
package/.agent/skills/scrapling/SKILL.md +328 -0
package/.agent/skills/sherlock/SKILL.md +186 -0
package/.agent/skills/simplify-code/SKILL.md +168 -0
package/.agent/skills/skill-authoring/SKILL.md +158 -0
package/.agent/skills/spike/SKILL.md +190 -0
package/.agent/skills/subagent-driven-development/SKILL.md +345 -0
package/.agent/skills/subagent-driven-development/references/context-budget-discipline.md +53 -0
package/.agent/skills/subagent-driven-development/references/gates-taxonomy.md +93 -0
package/.agent/skills/systematic-debugging/SKILL.md +360 -0
package/.agent/skills/test-driven-development/SKILL.md +336 -0
package/.agent/skills/video-orchestrator/SKILL.md +194 -0
package/.agent/skills/video-orchestrator/references/examples.md +227 -0
package/.agent/skills/video-orchestrator/references/intake.md +166 -0
package/.agent/skills/video-orchestrator/references/kanban-setup.md +278 -0
package/.agent/skills/video-orchestrator/references/monitoring.md +180 -0
package/.agent/skills/video-orchestrator/references/role-archetypes.md +298 -0
package/.agent/skills/video-orchestrator/references/tool-matrix.md +317 -0
package/.agent/skills/web-pentest/SKILL.md +332 -0
package/.agent/skills/web-pentest/references/bypass-techniques.md +133 -0
package/.agent/skills/web-pentest/references/exploitation-techniques.md +204 -0
package/.agent/skills/web-pentest/references/scope-enforcement.md +110 -0
package/.agent/skills/web-pentest/references/vuln-taxonomy.md +81 -0
package/.agent/skills/web-pentest/templates/authorization.md +69 -0
package/.agent/skills/web-pentest/templates/pentest-report.md +178 -0
package/.claude/commands/mindforge/skill-tdd.md +53 -0
package/.claude/commands/mindforge/skills-index.md +118 -0
package/.claude/commands/mindforge/systematic-debug.md +60 -0
package/.claude/commands/mindforge/wf-catalog.md +37 -0
package/.claude/commands/mindforge/wf-code-audit.md +31 -0
package/.claude/commands/mindforge/wf-competitive-analysis.md +31 -0
package/.claude/commands/mindforge/wf-deep-research.md +32 -0
package/.claude/commands/mindforge/wf-feature-planner.md +31 -0
package/.claude/commands/mindforge/wf-incident-response.md +31 -0
package/.claude/commands/mindforge/wf-onboard-codebase.md +31 -0
package/.claude/commands/mindforge/wf-perf-optimize.md +31 -0
package/.claude/commands/mindforge/wf-pr-review.md +31 -0
package/.claude/commands/mindforge/wf-refactor-plan.md +31 -0
package/.claude/commands/mindforge/wf-release-prep.md +31 -0
package/.claude/commands/mindforge/wf-tdd-sprint.md +31 -0
package/.claude/commands/mindforge/wf-tech-evaluation.md +31 -0
package/.mindforge/config.json +2 -2
package/.mindforge/dynamic-workflows/REGISTRY.md +65 -0
package/.mindforge/dynamic-workflows/index.json +171 -0
package/.mindforge/dynamic-workflows/scripts/code-audit.js +103 -0
package/.mindforge/dynamic-workflows/scripts/competitive-analysis.js +85 -0
package/.mindforge/dynamic-workflows/scripts/deep-research.js +151 -0
package/.mindforge/dynamic-workflows/scripts/feature-planner.js +104 -0
package/.mindforge/dynamic-workflows/scripts/incident-response.js +106 -0
package/.mindforge/dynamic-workflows/scripts/onboard-codebase.js +102 -0
package/.mindforge/dynamic-workflows/scripts/perf-optimize.js +128 -0
package/.mindforge/dynamic-workflows/scripts/pr-review.js +87 -0
package/.mindforge/dynamic-workflows/scripts/refactor-plan.js +121 -0
package/.mindforge/dynamic-workflows/scripts/release-prep.js +102 -0
package/.mindforge/dynamic-workflows/scripts/tdd-sprint.js +103 -0
package/.mindforge/dynamic-workflows/scripts/tech-evaluation.js +72 -0
package/.mindforge/memory/sync-manifest.json +1 -1
package/.mindforge/skills/arxiv/SKILL.md +294 -0
package/.mindforge/skills/blogwatcher/SKILL.md +147 -0
package/.mindforge/skills/code-wiki/SKILL.md +457 -0
package/.mindforge/skills/codebase-inspection/SKILL.md +126 -0
package/.mindforge/skills/concept-diagrams/SKILL.md +373 -0
package/.mindforge/skills/creative-ideation/SKILL.md +162 -0
package/.mindforge/skills/domain-intel/SKILL.md +116 -0
package/.mindforge/skills/duckduckgo-search/SKILL.md +249 -0
package/.mindforge/skills/github-code-review/SKILL.md +493 -0
package/.mindforge/skills/github-issues/SKILL.md +382 -0
package/.mindforge/skills/github-pr-workflow/SKILL.md +379 -0
package/.mindforge/skills/jupyter-live-kernel/SKILL.md +179 -0
package/.mindforge/skills/kanban-orchestrator/SKILL.md +227 -0
package/.mindforge/skills/kanban-worker/SKILL.md +206 -0
package/.mindforge/skills/meme-generation/SKILL.md +141 -0
package/.mindforge/skills/obsidian/SKILL.md +80 -0
package/.mindforge/skills/osint-investigation/SKILL.md +288 -0
package/.mindforge/skills/oss-forensics/SKILL.md +421 -0
package/.mindforge/skills/pixel-art/SKILL.md +228 -0
package/.mindforge/skills/plan/SKILL.md +350 -0
package/.mindforge/skills/requesting-code-review/SKILL.md +292 -0
package/.mindforge/skills/research-paper-writing/SKILL.md +2384 -0
package/.mindforge/skills/scrapling/SKILL.md +345 -0
package/.mindforge/skills/sherlock/SKILL.md +203 -0
package/.mindforge/skills/simplify-code/SKILL.md +187 -0
package/.mindforge/skills/spike/SKILL.md +209 -0
package/.mindforge/skills/subagent-driven-development/SKILL.md +364 -0
package/.mindforge/skills/systematic-debugging/SKILL.md +379 -0
package/.mindforge/skills/test-driven-development/SKILL.md +355 -0
package/.mindforge/skills/web-pentest/SKILL.md +327 -0
package/CHANGELOG.md +71 -0
package/MINDFORGE.md +2 -2
package/README.md +72 -3
package/RELEASENOTES.md +109 -0
package/bin/installer-core.js +6 -2
package/bin/mindforge-cli.js +7 -0
package/bin/workflows/workflow-runner.js +110 -0
package/docs/commands-reference.md +25 -0
package/docs/getting-started.md +42 -5
package/package.json +2 -1

package/.agent/skills/research-paper-writing/references/paper-types.md ADDED

@@ -0,0 +1,481 @@
+# Paper Types Beyond Empirical ML
+Guide for writing non-standard paper types: theory papers, survey/tutorial papers, benchmark/dataset papers, position papers, and reproducibility/replication papers. Each type has its own structure, evidence standards, and venue expectations.
+---
+## Contents
+- [Theory Papers](#theory-papers)
+- [Survey and Tutorial Papers](#survey-and-tutorial-papers)
+- [Benchmark and Dataset Papers](#benchmark-and-dataset-papers)
+- [Position Papers](#position-papers)
+- [Reproducibility and Replication Papers](#reproducibility-and-replication-papers)
+---
+## Theory Papers
+### When to Write a Theory Paper
+Your paper should be a theory paper if:
+- The main contribution is a theorem, bound, impossibility result, or formal characterization
+- Experiments are supplementary validation, not the core evidence
+- The contribution advances understanding rather than achieving state-of-the-art numbers
+### Structure
+```
+1. Introduction (1-1.5 pages)
+   - Problem statement and motivation
+   - Informal statement of main results
+   - Comparison to prior theoretical work
+   - Contribution bullets (state theorems informally)
+2. Preliminaries (0.5-1 page)
+   - Notation table
+   - Formal definitions
+   - Assumptions (numbered, referenced later)
+   - Known results you build on
+3. Main Results (2-3 pages)
+   - Theorem statements (formal)
+   - Proof sketches (intuition + key steps)
+   - Corollaries and special cases
+   - Discussion of tightness / optimality
+4. Experimental Validation (1-2 pages, optional but recommended)
+   - Do theoretical predictions match empirical behavior?
+   - Synthetic experiments that isolate the phenomenon
+   - Comparison to bounds from prior work
+5. Related Work (1 page)
+   - Theoretical predecessors
+   - Empirical work your theory explains
+6. Discussion & Open Problems (0.5 page)
+   - Limitations of your results
+   - Conjectures suggested by your analysis
+   - Concrete open problems
+Appendix:
+   - Full proofs
+   - Technical lemmas
+   - Extended experimental details
+```
+### Writing Theorems
+**Template for a well-stated theorem:**
+```latex
+\begin{assumption}[Bounded Gradients]\label{assum:bounded-grad}
+There exists $G > 0$ such that $\|\nabla f(x)\| \leq G$ for all $x \in \mathcal{X}$.
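+\end{assumption}
+\begin{assumption}[Smoothness]\label{assum:smoothness}
+The gradient $\nabla f$ is $L$-Lipschitz: $\|\nabla f(x) - \nabla f(y)\| \leq L\|x - y\|$ for all $x, y \in \mathcal{X}$.
+\end{assumption}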
+\begin{theorem}[Convergence Rate]\label{thm:convergence}
+Under Assumptions~\ref{assum:bounded-grad} and~\ref{assum:smoothness},
+Algorithm~\ref{alg:method} with step size $\eta = \frac{1}{\sqrt{T}}$ satisfies
+\[
+\frac{1}{T}\sum_{t=1}^{T} \mathbb{E}\left[\|\nabla f(x_t)\|^2\right]
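+\leq \frac{2(f(x_1) - f^*)}{\sqrt{T}} + \frac{L G^2}{\sqrt{T}},
+\]
+where $L$ is the smoothness constant from Assumption~\ref{assum:smoothness}. In particular, after $T = O(1/\epsilon^2)$
+iterations, some iterate $x_t$ satisfies $\mathbb{E}\left[\|\nabla f(x_t)\|^2\right] \leq \epsilon$.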
+\end{theorem}
+```
+**Rules for theorem statements:**
+- State all assumptions explicitly (numbered, with names)
+- Include the formal bound, not just "converges at rate O(·)"
+- Add a plain-language corollary: "In particular, this means..."
+- Compare to known bounds: "This improves over [prior work]'s bound of O(·) by a factor of..."
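As a worked instance of the plain-language corollary rule, the step from a $1/\sqrt{T}$ bound to an iteration count can be written out. This is a sketch only: $C$ is a placeholder (not part of the template) for whatever constant multiplies $1/\sqrt{T}$ on the right-hand side of the theorem.

```latex
% Sketch: C stands for the constant in front of 1/\sqrt{T} in the theorem.
\[
\frac{C}{\sqrt{T}} \leq \epsilon
\iff \sqrt{T} \geq \frac{C}{\epsilon}
\iff T \geq \frac{C^2}{\epsilon^2},
\]
% so T = O(1/\epsilon^2) iterations make the averaged bound at most \epsilon.
% Because a minimum never exceeds an average, at least one iterate satisfies
\[
\min_{1 \leq t \leq T} \mathbb{E}\left[\|\nabla f(x_t)\|^2\right]
\leq \frac{1}{T}\sum_{t=1}^{T} \mathbb{E}\left[\|\nabla f(x_t)\|^2\right]
\leq \epsilon .
\]
```

If the target is stated on the norm rather than the squared norm, i.e. $\|\nabla f(x_t)\| \leq \epsilon$, the same argument with $\epsilon^2$ in place of $\epsilon$ gives $T \geq C^2/\epsilon^4$, so it is worth saying explicitly which convention the corollary uses.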
+### Proof Sketches
+The proof sketch is the most important part of the main text for a theory paper. Reviewers use it to judge whether the result rests on genuine insight or only on mechanical derivation.
+**Good proof sketch pattern:**
+```latex
+\begin{proof}[Proof Sketch of Theorem~\ref{thm:convergence}]
+The key insight is that [one sentence describing the main idea].
+The proof proceeds in three steps:
+\begin{enumerate}
+\item \textbf{Decomposition.} We decompose the error into [term A]
+  and [term B] using [technique]. This reduces the problem to
+  bounding each term separately.
+\item \textbf{Bounding [term A].} By [assumption/lemma], [term A]
+  is bounded by $O(\cdot)$. The critical observation is that
+  [specific insight that makes this non-trivial].
+\item \textbf{Combining.} Choosing $\eta = 1/\sqrt{T}$ balances
+  the two terms, yielding the stated bound.
+\end{enumerate}
+The full proof, including the technical lemma for Step 2,
+appears in Appendix~\ref{app:proofs}.
+\end{proof}
+```
+**Bad proof sketch**: Restating the theorem with slightly different notation, or just saying "the proof follows standard techniques."
+### Full Proofs in Appendix
+```latex
+\appendix
+\section{Proofs}\label{app:proofs}
+\subsection{Proof of Theorem~\ref{thm:convergence}}
+We first establish two technical lemmas.
+\begin{lemma}[Descent Lemma]\label{lem:descent}
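+Under Assumption~\ref{assum:smoothness}, for any step size $\eta \leq 1/L$, the gradient step $x_{t+1} = x_t - \eta \nabla f(x_t)$ satisfies
+\[
+f(x_{t+1}) \leq f(x_t) - \eta\|\nabla f(x_t)\|^2 + \frac{\eta^2 L}{2}\|\nabla f(x_t)\|^2 \leq f(x_t) - \frac{\eta}{2}\|\nabla f(x_t)\|^2.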
+\]
+\end{lemma}
+\begin{proof}
+[Complete proof with all steps]
+\end{proof}
+% Continue with remaining lemmas and main theorem proof
+```
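Before writing out a full proof, it can help to sanity-check a lemma numerically. The sketch below checks the standard consequence $f(x_{t+1}) \leq f(x_t) - \frac{\eta}{2}\|\nabla f(x_t)\|^2$ for $\eta \leq 1/L$ on a toy problem. The quadratic objective, the plain gradient-descent update, and all function names are assumptions made for illustration; they are not part of the template.

```python
# Numerical sanity check of a descent-lemma-style inequality (illustrative sketch).
# Assumptions not in the original: f(x) = 0.5 * sum(a_i * x_i^2), plain gradient
# descent x_{t+1} = x_t - eta * grad f(x_t), and smoothness constant L = max(a_i).

def f(x, a):
    return 0.5 * sum(ai * xi * xi for ai, xi in zip(a, x))

def grad(x, a):
    return [ai * xi for ai, xi in zip(a, x)]

def sq_norm(v):
    return sum(vi * vi for vi in v)

def check_descent(a, x0, eta, steps=50, tol=1e-12):
    """True if f(x_{t+1}) <= f(x_t) - (eta/2) * ||grad f(x_t)||^2 at every step."""
    x = list(x0)
    for _ in range(steps):
        g = grad(x, a)
        x_next = [xi - eta * gi for xi, gi in zip(x, g)]
        if f(x_next, a) > f(x, a) - 0.5 * eta * sq_norm(g) + tol:
            return False
        x = x_next
    return True

curvatures = [0.5, 2.0, 10.0]  # per-coordinate curvature; L = 10
L = max(curvatures)
start = [1.0, -2.0, 3.0]

holds_small_step = check_descent(curvatures, start, eta=1.0 / L)
holds_large_step = check_descent(curvatures, start, eta=2.5 / L)
print(holds_small_step, holds_large_step)
```

The second call deliberately violates the step-size condition, so the check is expected to fail there. A passing check on one toy problem is evidence against typos in the statement, not a substitute for the proof.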
+### Common Theory Paper Pitfalls
+| Pitfall | Problem | Fix |
+|---------|---------|-----|
+| Assumptions too strong | Trivializes the result | Discuss which assumptions are necessary; prove lower bounds |
+| No comparison to existing bounds | Reviewers can't assess contribution | Add a comparison table of bounds |
+| Proof sketch is just the full proof shortened | Doesn't convey insight | Focus on the 1-2 key ideas; defer mechanics to appendix |
+| No experimental validation | Reviewers question practical relevance | Add synthetic experiments testing predictions |
+| Notation inconsistency | Confuses reviewers | Create a notation table in Preliminaries |
+| Overly complex proofs where simple ones exist | Reviewers suspect error | Prefer clarity over generality |
+### Venues for Theory Papers
+| Venue | Receptiveness to Theory | Notes |
+|-------|-------------------------|-------|
+| **NeurIPS** | Moderate | Values theory with practical implications |
+| **ICML** | High | Strong theory track |
+| **ICLR** | Moderate | Prefers theory with empirical validation |
+| **COLT** | High | Theory-focused venue |
+| **ALT** | High | Algorithmic learning theory |
+| **STOC/FOCS** | High | For TCS-flavored results; best when the contribution is primarily combinatorial/algorithmic |
+| **JMLR** | High | No page limit; good for long proofs |
+---
+## Survey and Tutorial Papers
+### When to Write a Survey
+- A subfield has matured enough that synthesis is valuable
+- You've identified connections between works that individual papers don't make
+- Newcomers to the area have no good entry point
+- The landscape has changed significantly since the last survey
+**Warning**: Surveys require genuine expertise. A survey by someone outside the field, however comprehensive, will miss nuances and mischaracterize work.
+### Structure
+```
+1. Introduction (1-2 pages)
+   - Scope definition (what's included and excluded, and why)
+   - Motivation for the survey now
+   - Overview of organization (often with a figure)
+2. Background / Problem Formulation (1-2 pages)
+   - Formal problem definition
+   - Notation (used consistently throughout)
+   - Historical context
+3. Taxonomy (the core contribution)
+   - Organize methods along meaningful axes
+   - Present taxonomy as a figure or table
+   - Each category gets a subsection
+4. Detailed Coverage (bulk of paper)
+   - For each category: representative methods, key ideas, strengths/weaknesses
+   - Comparison tables within and across categories
+   - Don't just describe — analyze and compare
+5. Experimental Comparison (if applicable)
+   - Standardized benchmark comparison
+   - Fair hyperparameter tuning for all methods
+   - Not always feasible but significantly strengthens the survey
+6. Open Problems & Future Directions (1-2 pages)
+   - Unsolved problems the field should tackle
+   - Promising but underexplored directions
+   - This section is what makes a survey a genuine contribution
+7. Conclusion
+```
+### Taxonomy Design
+The taxonomy is the core intellectual contribution of a survey. It should:
+- **Be meaningful**: Categories should correspond to real methodological differences, not arbitrary groupings
+- **Be exhaustive**: Every relevant paper should fit somewhere
+- **Be mutually exclusive** (ideally): Each paper belongs to one primary category
+- **Have informative names**: prefer "Attention-based methods" over "Category 3"
+- **Be visualized**: A figure showing the taxonomy is almost always helpful
+**Example taxonomy axes for "LLM Reasoning" survey:**
+- By technique: chain-of-thought, tree-of-thought, self-consistency, tool use
+- By training requirement: prompting-only, fine-tuned, RLHF
+- By reasoning type: mathematical, commonsense, logical, causal
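The exhaustiveness and mutual-exclusivity criteria above can be checked mechanically once the paper-to-category assignment is written down. The following is a minimal sketch: the category names follow the "by technique" axis from the example, while the paper identifiers and the `audit_taxonomy` helper are hypothetical placeholders.

```python
# Illustrative sketch: audit a survey taxonomy for coverage gaps and overlaps.
# Category names follow the "by technique" axis; the paper identifiers are
# hypothetical placeholders, not real citations.

taxonomy = {
    "chain-of-thought": {"paper_a", "paper_b"},
    "tree-of-thought": {"paper_c"},
    "self-consistency": {"paper_d"},
    "tool use": {"paper_e", "paper_b"},  # paper_b is deliberately listed twice
}
surveyed_papers = {"paper_a", "paper_b", "paper_c", "paper_d", "paper_e", "paper_f"}

def audit_taxonomy(taxonomy, papers):
    """Return (uncategorized, multiply_assigned) for a category -> papers mapping."""
    counts = {}
    for members in taxonomy.values():
        for paper in members:
            counts[paper] = counts.get(paper, 0) + 1
    uncategorized = sorted(p for p in papers if p not in counts)
    multiply_assigned = sorted(p for p, n in counts.items() if n > 1)
    return uncategorized, multiply_assigned

uncategorized, multiply_assigned = audit_taxonomy(taxonomy, surveyed_papers)
print("uncategorized:", uncategorized)          # exhaustiveness gaps
print("multiply assigned:", multiply_assigned)  # exclusivity violations
```

A paper that shows up as multiply assigned is not necessarily a mistake; it may simply need one primary category and a cross-reference, in line with the "ideally" qualifier on mutual exclusivity.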
+### Writing Standards
+- **Cite every relevant paper** — authors will check if their work is included
+- **Be fair** — don't dismiss methods you don't prefer
+- **Synthesize, don't just list** — identify patterns, trade-offs, open questions
+- **Include a comparison table** — even if qualitative (features/properties checklist)
+- **Update before submission** — check arXiv for papers published since you started writing
+### Venues for Surveys
+| Venue | Notes |
+|-------|-------|
+| **TMLR** (Survey Certification) | Accepts survey submissions; no strict page limit |
+| **JMLR** | Long format, well-respected |
+| **Foundations and Trends in ML** | Invited, but can be proposed |
+| **ACM Computing Surveys** | Broad CS audience |
+| **arXiv** (standalone) | No peer review but high visibility if well-done |
+| **Conference tutorials** | Present as tutorial at NeurIPS/ICML/ACL; write up as paper |
+---
+## Benchmark and Dataset Papers
+### When to Write a Benchmark Paper
+- Existing benchmarks don't measure what you think matters
+- A new capability has emerged with no standard evaluation
+- Existing benchmarks are saturated (e.g., all strong methods score >95%)
+- You want to standardize evaluation in a fragmented subfield
+### Structure
+```
+1. Introduction
+   - What evaluation gap does this benchmark fill?
+   - Why existing benchmarks are insufficient
+2. Task Definition
+   - Formal task specification
+   - Input/output format
+   - Evaluation criteria (what makes a good answer?)
+3. Dataset Construction
+   - Data source and collection methodology
+   - Annotation process (if human-annotated)
+   - Quality control measures
+   - Dataset statistics (size, distribution, splits)
+4. Baseline Evaluation
+   - Run strong baselines (don't just report random/majority)
+   - Show the benchmark is challenging but not impossible
+   - Human performance baseline (if feasible)
+5. Analysis
+   - Error analysis on baselines
+   - What makes items hard/easy?
+   - Construct validity: does the benchmark measure what you claim?
+6. Intended Use & Limitations
+   - What should this benchmark be used for?
+   - What should it NOT be used for?
+   - Known biases or limitations
+7. Datasheet (Appendix)
+   - Full datasheet following Datasheets for Datasets (Gebru et al., 2021)
+```
+### Evidence Standards
+Reviewers evaluate benchmark papers on different criteria from methods papers:
+| Criterion | What Reviewers Check |
+|-----------|---------------------|
+| **Novelty of evaluation** | Does this measure something existing benchmarks don't? |
+| **Construct validity** | Does the benchmark actually measure the stated capability? |
+| **Difficulty calibration** | Is it neither too easy (saturated) nor too hard (baselines at chance level)? |
+| **Annotation quality** | Agreement metrics, annotator qualifications, guidelines |
+| **Documentation** | Datasheet, license, maintenance plan |
+| **Reproducibility** | Can others use this benchmark easily? |
+| **Ethical considerations** | Bias analysis, consent, sensitive content handling |
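For the annotation-quality criterion, one commonly reported agreement metric for two annotators is Cohen's kappa, which corrects raw agreement for agreement expected by chance. The sketch below is a minimal stdlib implementation; the labels are made-up placeholders and the function name is an assumption for illustration.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a.keys() | freq_b.keys()) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)

# Hypothetical labels from two annotators on eight items.
annotator_a = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg"]
annotator_b = ["pos", "neg", "neg", "neg", "pos", "neg", "pos", "pos"]

kappa = cohens_kappa(annotator_a, annotator_b)
print(round(kappa, 3))  # 0.5: observed agreement 0.75, chance agreement 0.5
```

With more than two annotators or with ordinal labels, a different statistic (for example Fleiss' kappa or Krippendorff's alpha) is usually a better fit; whichever is used, the paper should name it and report it alongside the annotation guidelines.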
+### Dataset Documentation (Required)
+Follow the Datasheets for Datasets framework (Gebru et al., 2021):
+```
+Datasheet Questions:
+1. Motivation
+   - Why was this dataset created?
+   - Who created it and on behalf of whom?
+   - Who funded the creation?
+2. Composition
+   - What do the instances represent?
+   - How many instances are there?
+   - Does it contain all possible instances or a sample?
+   - Is there a label? If so, how was it determined?
+   - Are there recommended data splits?
+3. Collection Process
+   - How was the data collected?
+   - Who was involved in collection?
+   - Over what timeframe?
+   - Was ethical review conducted?
+4. Preprocessing
+   - What preprocessing was done?
+   - Was the "raw" data saved?
+5. Uses
+   - What tasks has this been used for?
+   - What should it NOT be used for?
+   - Are there other tasks it could be used for?
+6. Distribution
+   - How is it distributed?
+   - Under what license?
+   - Are there any restrictions?
+7. Maintenance
+   - Who maintains it?
+   - How can users contact the maintainer?
+   - Will it be updated? How?
+   - Is there an erratum?
+```
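One way to keep the datasheet from being skipped is to store it as structured data next to the dataset and check it for gaps before submission. The sketch below assumes a plain dictionary layout; the section names mirror the seven headings above, and the helper name and draft answers are hypothetical placeholders.

```python
# Illustrative sketch: flag datasheet sections that have no answered questions.
# Section names mirror the seven headings above; the draft answers are placeholders.

DATASHEET_SECTIONS = [
    "Motivation",
    "Composition",
    "Collection Process",
    "Preprocessing",
    "Uses",
    "Distribution",
    "Maintenance",
]

def missing_sections(datasheet):
    """Return sections that are absent or contain only empty answers."""
    missing = []
    for section in DATASHEET_SECTIONS:
        answers = datasheet.get(section, {})
        if not any(str(text).strip() for text in answers.values()):
            missing.append(section)
    return missing

draft = {
    "Motivation": {"Why was this dataset created?": "To evaluate <capability>."},
    "Composition": {"How many instances are there?": ""},
    "Distribution": {"Under what license?": "<license name>"},
}

print(missing_sections(draft))
```

For this draft the check reports every section except Motivation and Distribution, since Composition contains only an empty answer.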
+### Venues for Benchmark Papers
+| Venue | Notes |
+|-------|-------|
+| **NeurIPS Datasets & Benchmarks** | Dedicated track; best venue for this |
+| **ACL** (Resource papers) | NLP-focused datasets |
+| **LREC-COLING** | Language resources |
+| **TMLR** | Good for benchmarks with analysis |
+---
+## Position Papers
+### When to Write a Position Paper
+- You have an argument about how the field should develop
+- You want to challenge a widely-held assumption
+- You want to propose a research agenda based on analysis
+- You've identified a systematic problem in current methodology
+### Structure
+```
+1. Introduction
+   - State your thesis clearly in the first paragraph
+   - Why this matters now
+2. Background
+   - Current state of the field
+   - Prevailing assumptions you're challenging
+3. Argument
+   - Present your thesis with supporting evidence
+   - Evidence can be: empirical data, theoretical analysis, logical argument,
+     case studies, historical precedent
+   - Be rigorous — this isn't an opinion piece
+4. Counterarguments
+   - Engage seriously with the strongest objections
+   - Explain why they don't undermine your thesis
+   - Concede where appropriate — it strengthens credibility
+5. Implications
+   - What should the field do differently?
+   - Concrete research directions your thesis suggests
+   - How should evaluation/methodology change?
+6. Conclusion
+   - Restate thesis
+   - Call to action
+```
+### Writing Standards
+- **Lead with the strongest version of your argument** — don't hedge in the first paragraph
+- **Engage with counterarguments honestly** — the best position papers address the strongest objections, not the weakest
+- **Provide evidence** — a position paper without evidence is an editorial
+- **Be concrete** — "the field should do X" is better than "more work is needed"
+- **Don't straw-man existing work** — characterize opposing positions fairly
+### Venues for Position Papers
+| Venue | Notes |
+|-------|-------|
+| **ICML** (Position track) | Dedicated track for position papers |
+| **NeurIPS** (Workshop papers) | Workshops often welcome position pieces |
+| **ACL** (Theme papers) | When your position aligns with the conference theme |
+| **TMLR** | Accepts well-argued position papers |
+| **CACM** | For broader CS audience |
+---
+## Reproducibility and Replication Papers
+### When to Write a Reproducibility Paper
+- You attempted to reproduce a published result, whether the attempt succeeded or failed
+- You want to verify claims under different conditions
+- You've identified that a popular method's performance depends on unreported details
+### Structure
+```
+1. Introduction
+   - What paper/result are you reproducing?
+   - Why is this reproduction valuable?
+2. Original Claims
+   - State the exact claims from the original paper
+   - What evidence was provided?
+3. Methodology
+   - Your reproduction approach
+   - Differences from original (if any) and why
+   - What information was missing from the original paper?
+4. Results
+   - Side-by-side comparison with original results
+   - Statistical comparison (confidence intervals overlap?)
+   - What reproduced and what didn't?
+5. Analysis
+   - If results differ: why? What's sensitive?
+   - Hidden hyperparameters or implementation details?
+   - Robustness to seed, hardware, library versions?
+6. Recommendations
+   - For original authors: what should be clarified?
+   - For practitioners: what to watch out for?
+   - For the field: what reproducibility lessons emerge?
+```
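The "confidence intervals overlap?" check in the Results step can be sketched as follows. This is a rough illustration rather than a prescribed test: it uses a normal-approximation interval with `z = 1.96`, and the accuracy numbers, seed counts, and function names are hypothetical.

```python
import math
import statistics

def mean_ci(values, z=1.96):
    """Normal-approximation confidence interval for the mean: (mean, low, high)."""
    if len(values) < 2:
        raise ValueError("need at least two runs to estimate variability")
    mean = statistics.mean(values)
    half_width = z * statistics.stdev(values) / math.sqrt(len(values))
    return mean, mean - half_width, mean + half_width

def intervals_overlap(ci_a, ci_b):
    """True if the two (mean, low, high) intervals share at least one point."""
    _, low_a, high_a = ci_a
    _, low_b, high_b = ci_b
    return low_a <= high_b and low_b <= high_a

# Hypothetical accuracies over five seeds each (not real results).
original_runs = [0.842, 0.851, 0.847, 0.839, 0.845]
reproduced_runs = [0.815, 0.822, 0.809, 0.818, 0.820]

ci_original = mean_ci(original_runs)
ci_reproduced = mean_ci(reproduced_runs)
print(intervals_overlap(ci_original, ci_reproduced))  # False for these numbers
```

With only a handful of seeds the normal approximation is optimistic; a t-quantile (2.776 for five runs at the 95% level) gives wider intervals. Interval overlap is also a coarse screen, so a reproduction paper should still report the raw per-seed numbers.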
+### Venues for Reproducibility Papers
+| Venue | Notes |
+|-------|-------|
+| **ML Reproducibility Challenge** | Annual challenge at NeurIPS |
+| **ReScience C** | Journal dedicated to replications |
+| **TMLR** | Accepts reproductions with analysis |
+| **Workshops** | Reproducibility workshops at major conferences |