@wentorai/research-plugins 1.2.2 → 1.3.0

Files changed (141)
  1. package/README.md +16 -8
  2. package/openclaw.plugin.json +10 -3
  3. package/package.json +2 -5
  4. package/skills/analysis/dataviz/SKILL.md +25 -0
  5. package/skills/analysis/dataviz/chart-image-generator/SKILL.md +1 -1
  6. package/skills/analysis/econometrics/SKILL.md +23 -0
  7. package/skills/analysis/econometrics/robustness-checks/SKILL.md +1 -1
  8. package/skills/analysis/statistics/SKILL.md +21 -0
  9. package/skills/analysis/statistics/data-anomaly-detection/SKILL.md +1 -1
  10. package/skills/analysis/statistics/ml-experiment-tracker/SKILL.md +1 -1
  11. package/skills/analysis/statistics/{senior-data-scientist-guide → modeling-strategy-guide}/SKILL.md +5 -5
  12. package/skills/analysis/wrangling/SKILL.md +21 -0
  13. package/skills/analysis/wrangling/csv-data-analyzer/SKILL.md +1 -1
  14. package/skills/analysis/wrangling/data-cog-guide/SKILL.md +1 -1
  15. package/skills/domains/ai-ml/SKILL.md +37 -0
  16. package/skills/domains/biomedical/SKILL.md +28 -0
  17. package/skills/domains/biomedical/genomas-guide/SKILL.md +1 -1
  18. package/skills/domains/biomedical/med-researcher-guide/SKILL.md +1 -1
  19. package/skills/domains/biomedical/medgeclaw-guide/SKILL.md +1 -1
  20. package/skills/domains/business/SKILL.md +17 -0
  21. package/skills/domains/business/architecture-design-guide/SKILL.md +1 -1
  22. package/skills/domains/chemistry/SKILL.md +19 -0
  23. package/skills/domains/chemistry/computational-chemistry-guide/SKILL.md +1 -1
  24. package/skills/domains/cs/SKILL.md +21 -0
  25. package/skills/domains/ecology/SKILL.md +16 -0
  26. package/skills/domains/economics/SKILL.md +20 -0
  27. package/skills/domains/economics/post-labor-economics/SKILL.md +1 -1
  28. package/skills/domains/economics/pricing-psychology-guide/SKILL.md +1 -1
  29. package/skills/domains/education/SKILL.md +19 -0
  30. package/skills/domains/education/academic-study-methods/SKILL.md +1 -1
  31. package/skills/domains/education/edumcp-guide/SKILL.md +1 -1
  32. package/skills/domains/finance/SKILL.md +19 -0
  33. package/skills/domains/finance/akshare-finance-data/SKILL.md +1 -1
  34. package/skills/domains/finance/options-analytics-agent-guide/SKILL.md +1 -1
  35. package/skills/domains/finance/stata-accounting-research/SKILL.md +1 -1
  36. package/skills/domains/geoscience/SKILL.md +17 -0
  37. package/skills/domains/humanities/SKILL.md +16 -0
  38. package/skills/domains/humanities/history-research-guide/SKILL.md +1 -1
  39. package/skills/domains/humanities/political-history-guide/SKILL.md +1 -1
  40. package/skills/domains/law/SKILL.md +19 -0
  41. package/skills/domains/math/SKILL.md +17 -0
  42. package/skills/domains/pharma/SKILL.md +17 -0
  43. package/skills/domains/physics/SKILL.md +16 -0
  44. package/skills/domains/social-science/SKILL.md +17 -0
  45. package/skills/domains/social-science/sociology-research-methods/SKILL.md +1 -1
  46. package/skills/literature/discovery/SKILL.md +20 -0
  47. package/skills/literature/discovery/paper-recommendation-guide/SKILL.md +1 -1
  48. package/skills/literature/discovery/semantic-paper-radar/SKILL.md +1 -1
  49. package/skills/literature/fulltext/SKILL.md +26 -0
  50. package/skills/literature/metadata/SKILL.md +35 -0
  51. package/skills/literature/metadata/doi-content-negotiation/SKILL.md +4 -0
  52. package/skills/literature/metadata/doi-resolution-guide/SKILL.md +4 -0
  53. package/skills/literature/metadata/orcid-api/SKILL.md +4 -0
  54. package/skills/literature/metadata/orcid-integration-guide/SKILL.md +4 -0
  55. package/skills/literature/search/SKILL.md +43 -0
  56. package/skills/literature/search/paper-search-mcp-guide/SKILL.md +1 -1
  57. package/skills/research/automation/SKILL.md +21 -0
  58. package/skills/research/deep-research/SKILL.md +24 -0
  59. package/skills/research/deep-research/auto-deep-research-guide/SKILL.md +1 -1
  60. package/skills/research/deep-research/in-depth-research-guide/SKILL.md +1 -1
  61. package/skills/research/funding/SKILL.md +20 -0
  62. package/skills/research/methodology/SKILL.md +24 -0
  63. package/skills/research/paper-review/SKILL.md +19 -0
  64. package/skills/research/paper-review/paper-critique-framework/SKILL.md +1 -1
  65. package/skills/tools/code-exec/SKILL.md +18 -0
  66. package/skills/tools/diagram/SKILL.md +20 -0
  67. package/skills/tools/document/SKILL.md +21 -0
  68. package/skills/tools/knowledge-graph/SKILL.md +21 -0
  69. package/skills/tools/ocr-translate/SKILL.md +18 -0
  70. package/skills/tools/ocr-translate/handwriting-recognition-guide/SKILL.md +2 -0
  71. package/skills/tools/ocr-translate/latex-ocr-guide/SKILL.md +2 -0
  72. package/skills/tools/scraping/SKILL.md +17 -0
  73. package/skills/writing/citation/SKILL.md +33 -0
  74. package/skills/writing/citation/zotfile-attachment-guide/SKILL.md +2 -0
  75. package/skills/writing/composition/SKILL.md +22 -0
  76. package/skills/writing/composition/research-paper-writer/SKILL.md +1 -1
  77. package/skills/writing/composition/scientific-writing-wrapper/SKILL.md +1 -1
  78. package/skills/writing/latex/SKILL.md +22 -0
  79. package/skills/writing/latex/academic-writing-latex/SKILL.md +1 -1
  80. package/skills/writing/latex/latex-drawing-guide/SKILL.md +1 -1
  81. package/skills/writing/polish/SKILL.md +20 -0
  82. package/skills/writing/polish/chinese-text-humanizer/SKILL.md +1 -1
  83. package/skills/writing/templates/SKILL.md +22 -0
  84. package/skills/writing/templates/beamer-presentation-guide/SKILL.md +1 -1
  85. package/skills/writing/templates/scientific-article-pdf/SKILL.md +1 -1
  86. package/skills/analysis/dataviz/citation-map-guide/SKILL.md +0 -184
  87. package/skills/analysis/dataviz/data-visualization-principles/SKILL.md +0 -171
  88. package/skills/analysis/econometrics/empirical-paper-analysis/SKILL.md +0 -192
  89. package/skills/analysis/econometrics/panel-data-regression-workflow/SKILL.md +0 -267
  90. package/skills/analysis/econometrics/stata-regression/SKILL.md +0 -117
  91. package/skills/analysis/statistics/general-statistics-guide/SKILL.md +0 -226
  92. package/skills/analysis/statistics/infiagent-benchmark-guide/SKILL.md +0 -106
  93. package/skills/analysis/statistics/pywayne-statistics-guide/SKILL.md +0 -192
  94. package/skills/analysis/statistics/quantitative-methods-guide/SKILL.md +0 -193
  95. package/skills/analysis/wrangling/claude-data-analysis-guide/SKILL.md +0 -100
  96. package/skills/analysis/wrangling/open-data-scientist-guide/SKILL.md +0 -197
  97. package/skills/domains/ai-ml/annotated-dl-papers-guide/SKILL.md +0 -159
  98. package/skills/domains/humanities/digital-humanities-methods/SKILL.md +0 -232
  99. package/skills/domains/law/legal-research-methods/SKILL.md +0 -190
  100. package/skills/domains/social-science/sociology-research-guide/SKILL.md +0 -238
  101. package/skills/literature/discovery/arxiv-paper-monitoring/SKILL.md +0 -233
  102. package/skills/literature/discovery/paper-tracking-guide/SKILL.md +0 -211
  103. package/skills/literature/fulltext/zotero-scihub-guide/SKILL.md +0 -168
  104. package/skills/literature/search/arxiv-osiris/SKILL.md +0 -199
  105. package/skills/literature/search/deepgit-search-guide/SKILL.md +0 -147
  106. package/skills/literature/search/multi-database-literature-search/SKILL.md +0 -198
  107. package/skills/literature/search/papers-chat-guide/SKILL.md +0 -194
  108. package/skills/literature/search/pasa-paper-search-guide/SKILL.md +0 -138
  109. package/skills/literature/search/scientify-literature-survey/SKILL.md +0 -203
  110. package/skills/research/automation/ai-scientist-guide/SKILL.md +0 -228
  111. package/skills/research/automation/coexist-ai-guide/SKILL.md +0 -149
  112. package/skills/research/automation/foam-agent-guide/SKILL.md +0 -203
  113. package/skills/research/automation/research-paper-orchestrator/SKILL.md +0 -254
  114. package/skills/research/deep-research/academic-deep-research/SKILL.md +0 -190
  115. package/skills/research/deep-research/cognitive-kernel-guide/SKILL.md +0 -200
  116. package/skills/research/deep-research/corvus-research-guide/SKILL.md +0 -132
  117. package/skills/research/deep-research/deep-research-pro/SKILL.md +0 -213
  118. package/skills/research/deep-research/deep-research-work/SKILL.md +0 -204
  119. package/skills/research/deep-research/research-cog/SKILL.md +0 -153
  120. package/skills/research/methodology/academic-mentor-guide/SKILL.md +0 -169
  121. package/skills/research/methodology/deep-innovator-guide/SKILL.md +0 -242
  122. package/skills/research/methodology/research-pipeline-units-guide/SKILL.md +0 -169
  123. package/skills/research/paper-review/paper-compare-guide/SKILL.md +0 -238
  124. package/skills/research/paper-review/paper-digest-guide/SKILL.md +0 -240
  125. package/skills/research/paper-review/paper-research-assistant/SKILL.md +0 -231
  126. package/skills/research/paper-review/research-quality-filter/SKILL.md +0 -261
  127. package/skills/tools/code-exec/contextplus-mcp-guide/SKILL.md +0 -110
  128. package/skills/tools/diagram/clawphd-guide/SKILL.md +0 -149
  129. package/skills/tools/diagram/scientific-graphical-abstract/SKILL.md +0 -201
  130. package/skills/tools/document/md2pdf-xelatex/SKILL.md +0 -212
  131. package/skills/tools/document/openpaper-guide/SKILL.md +0 -232
  132. package/skills/tools/document/weknora-guide/SKILL.md +0 -216
  133. package/skills/tools/knowledge-graph/mimir-memory-guide/SKILL.md +0 -135
  134. package/skills/tools/knowledge-graph/open-webui-tools-guide/SKILL.md +0 -156
  135. package/skills/tools/ocr-translate/formula-recognition-guide/SKILL.md +0 -367
  136. package/skills/tools/ocr-translate/math-equation-renderer/SKILL.md +0 -198
  137. package/skills/tools/scraping/api-data-collection-guide/SKILL.md +0 -301
  138. package/skills/writing/citation/academic-citation-manager-guide/SKILL.md +0 -182
  139. package/skills/writing/composition/opendraft-thesis-guide/SKILL.md +0 -200
  140. package/skills/writing/composition/paper-debugger-guide/SKILL.md +0 -143
  141. package/skills/writing/composition/paperforge-guide/SKILL.md +0 -205
@@ -1,194 +0,0 @@
1
- ---
2
- name: papers-chat-guide
3
- description: "Conversational interface for querying and discussing papers"
4
- metadata:
5
- openclaw:
6
- emoji: "💬"
7
- category: "literature"
8
- subcategory: "search"
9
- keywords: ["paper chat", "conversational search", "paper QA", "document QA", "RAG papers", "literature chat"]
10
- source: "https://github.com/paperswithcode/galai"
11
- ---
12
-
13
- # Papers Chat Guide
14
-
15
- ## Overview
16
-
17
- Papers Chat systems provide conversational interfaces for querying, discussing, and understanding academic papers. Instead of keyword searches, researchers ask natural language questions and get answers grounded in specific papers with citations. This guide covers building and using RAG-based paper chat systems — from local document Q&A to multi-paper discussion interfaces. Useful for literature comprehension, paper comparison, and research exploration.
18
-
19
- ## Architecture
20
-
21
- ```
22
- User Question
23
-
24
- Query Understanding (expand, decompose)
25
-
26
- Retrieval (vector search over paper chunks)
27
-
28
- Re-ranking (cross-encoder relevance scoring)
29
-
30
- Answer Generation (grounded in retrieved passages)
31
-
32
- Response + Citations + Follow-up Suggestions
33
- ```
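The stages in the diagram above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the `papers_chat` implementation: a bag-of-words cosine score stands in for vector search, a term-coverage score stands in for a cross-encoder, answer generation is reduced to returning the selected passages, and every function name and sample chunk here is hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return cleaned.split()


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, top_k):
    """Retrieval stage: score every chunk, keep the top_k with a non-zero score."""
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(c["text"]))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored[:top_k] if score > 0]


def rerank(question, candidates, top_k):
    """Re-ranking stage: order candidates by the share of question terms they cover."""
    q_terms = set(tokenize(question))

    def coverage(chunk):
        if not q_terms:
            return 0.0
        return len(q_terms & set(tokenize(chunk["text"]))) / len(q_terms)

    return sorted(candidates, key=coverage, reverse=True)[:top_k]


def answer(question, chunks):
    """Return the grounding passages and their citations (no LLM call here)."""
    passages = rerank(question, retrieve(question, chunks, top_k=20), top_k=5)
    return {
        "passages": [p["text"] for p in passages],
        "citations": [(p["paper"], p["page"]) for p in passages],
    }


# Made-up chunks for illustration only.
chunks = [
    {"paper": "attention.pdf", "page": 3,
     "text": "Scaled dot-product attention computes softmax of queries and keys."},
    {"paper": "bert.pdf", "page": 2,
     "text": "BERT is pre-trained with masked language modeling."},
    {"paper": "gpt3.pdf", "page": 5,
     "text": "GPT-3 is evaluated in few-shot settings."},
]
result = answer("How does scaled dot-product attention work?", chunks)
print(result["citations"])
```

A real system would replace `cosine` with embedding similarity and pass the returned passages to a language model together with the question.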
34
-
35
- ## Local Paper Chat
36
-
37
- ```python
38
- from papers_chat import PaperChat
39
-
40
- chat = PaperChat(
41
- llm_provider="anthropic",
42
- embedding_model="all-MiniLM-L6-v2",
43
- )
44
-
45
- # Index papers
46
- chat.add_papers([
47
- "papers/attention_is_all_you_need.pdf",
48
- "papers/bert.pdf",
49
- "papers/gpt3.pdf",
50
- ])
51
-
52
- # Ask questions
53
- response = chat.ask(
54
- "How does the attention mechanism in Transformers differ "
55
- "from the attention used in earlier seq2seq models?"
56
- )
57
-
58
- print(response.answer)
59
- for cite in response.citations:
60
-     print(f" [{cite.paper}] p.{cite.page}: {cite.excerpt[:80]}...")
61
- ```
62
-
63
- ## Multi-Paper Discussion
64
-
65
- ```python
66
- # Compare across papers
67
- response = chat.ask(
68
- "Compare the pre-training objectives of BERT and GPT-3. "
69
- "What are the trade-offs?"
70
- )
71
-
72
- # Follow-up in conversation
73
- response = chat.follow_up(
74
- "Which approach works better for few-shot learning?"
75
- )
76
-
77
- # Paper-specific questions
78
- response = chat.ask(
79
- "What is the computational complexity of multi-head attention?",
80
- scope=["attention_is_all_you_need.pdf"],
81
- )
82
- ```
83
-
84
- ## Building a Paper Index
85
-
86
- ```python
87
- from papers_chat import PaperIndex
88
-
89
- index = PaperIndex(
90
- embedding_model="all-MiniLM-L6-v2",
91
- chunk_size=512,
92
- chunk_overlap=64,
93
- storage_path="./paper_index",
94
- )
95
-
96
- # Add individual paper
97
- index.add_paper(
98
- path="paper.pdf",
99
- metadata={
100
- "title": "Attention Is All You Need",
101
- "authors": ["Vaswani et al."],
102
- "year": 2017,
103
- "venue": "NeurIPS",
104
- },
105
- )
106
-
107
- # Add directory of papers
108
- index.add_directory(
109
- "papers/",
110
- extract_metadata=True, # Auto-extract from PDF
111
- )
112
-
113
- # Search
114
- results = index.search("positional encoding", top_k=5)
115
- for r in results:
116
-     print(f"[{r.paper_title}] (score: {r.score:.3f})")
117
-     print(f" {r.text[:120]}...")
118
- ```
119
-
120
- ## RAG Pipeline Configuration
121
-
122
- ```python
123
- from papers_chat import PaperChat, RAGConfig
124
-
125
- chat = PaperChat(
126
- llm_provider="anthropic",
127
- rag_config=RAGConfig(
128
- # Retrieval
129
- retrieval_top_k=20,
130
- rerank_top_k=5,
131
- reranker="cross-encoder/ms-marco-MiniLM-L-6-v2",
132
-
133
- # Chunking
134
- chunk_size=512,
135
- chunk_overlap=64,
136
- chunk_by="paragraph", # paragraph, sentence, fixed
137
-
138
- # Generation
139
- citation_style="inline", # inline, footnote, endnote
140
- max_answer_length=500,
141
- include_quotes=True,
142
- ),
143
- )
144
- ```
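To make the `chunk_size` and `chunk_overlap` settings concrete, here is a minimal sketch of fixed-size chunking with overlap. It is an illustration only: `chunk_tokens` is a hypothetical helper, not part of `papers_chat`, and it operates on an already tokenized list rather than on paragraphs or sentences.

```python
def chunk_tokens(tokens, chunk_size=512, chunk_overlap=64):
    """Split a token list into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the input
    return chunks


tokens = list(range(1000))  # stand-in for 1000 real tokens
chunks = chunk_tokens(tokens)
print([len(c) for c in chunks])
```

With the defaults, each window starts 448 tokens after the previous one, so the last 64 tokens of one chunk reappear at the start of the next.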
145
-
146
- ## Batch Question Answering
147
-
148
- ```python
149
- # Process a list of research questions
150
- questions = [
151
- "What datasets are used for evaluating language models?",
152
- "How is perplexity calculated and what are its limitations?",
153
- "What are the main approaches to reducing model size?",
154
- ]
155
-
156
- results = chat.batch_ask(questions)
157
-
158
- for q, r in zip(questions, results):
159
-     print(f"Q: {q}")
160
-     print(f"A: {r.answer[:200]}...")
161
-     print(f"Sources: {[c.paper for c in r.citations]}")
162
-     print()
163
- ```
164
-
165
- ## Table and Figure Extraction
166
-
167
- ```python
168
- # Query specific paper elements
169
- response = chat.ask(
170
- "What are the BLEU scores reported in Table 2?",
171
- scope=["attention_is_all_you_need.pdf"],
172
- include_tables=True,
173
- )
174
-
175
- # Extract all tables from a paper
176
- tables = chat.extract_tables("paper.pdf")
177
- for table in tables:
178
-     print(f"Table {table.number}: {table.caption}")
179
-     print(table.to_dataframe())
180
- ```
181
-
182
- ## Use Cases
183
-
184
- 1. **Literature comprehension**: Ask clarifying questions about papers
185
- 2. **Paper comparison**: Cross-paper analysis and synthesis
186
- 3. **Research exploration**: Discover connections across literature
187
- 4. **Study groups**: Collaborative paper discussion
188
- 5. **Quick reference**: Find specific results, methods, or citations
189
-
190
- ## References
191
-
192
- - [Galactica](https://github.com/paperswithcode/galai) — Language model for science
193
- - [LangChain RAG](https://python.langchain.com/docs/use_cases/question_answering/)
194
- - [LlamaIndex](https://www.llamaindex.ai/) — Data framework for LLM applications
@@ -1,138 +0,0 @@
1
- ---
2
- name: pasa-paper-search-guide
3
- description: "Advanced paper search agent powered by LLMs for literature discovery"
4
- version: 1.0.0
5
- author: wentor-community
6
- source: https://github.com/pasa-agent/pasa
7
- metadata:
8
- openclaw:
9
- category: "literature"
10
- subcategory: "search"
11
- keywords:
12
- - paper-search
13
- - literature-discovery
14
- - semantic-search
15
- - citation-graph
16
- - academic-databases
17
- - query-expansion
18
- ---
19
-
20
- # PASA Paper Search Guide
21
-
22
- A skill for conducting advanced academic paper searches using LLM-powered query expansion, semantic ranking, and citation-graph exploration. Based on the PASA project (2K stars), this skill transforms simple research questions into comprehensive, systematic literature discovery workflows.
23
-
24
- ## Overview
25
-
26
- Finding relevant papers is the foundation of all academic research, yet traditional keyword searches miss semantically related work, and manual citation chasing is time-consuming. PASA addresses this by combining LLM-driven query understanding with multi-source search and intelligent result ranking. The agent acts as a search co-pilot, helping researchers cast a wide net and then systematically narrow results to the most relevant papers.
27
-
28
- This skill is designed for researchers at any career stage who want to go beyond simple database searches and build thorough, reproducible literature collections for reviews, grant proposals, or new research directions.
29
-
30
- ## Search Strategy Design
31
-
32
- Before executing any search, the agent helps researchers design a comprehensive strategy:
33
-
34
- **Query Formulation**
35
- - Decompose the research question into key concepts and their relationships
36
- - Identify primary terms, synonyms, and related terminology for each concept
37
- - Consider field-specific jargon and cross-disciplinary terminology differences
38
- - Build Boolean query strings combining concepts with AND/OR operators
39
- - Generate semantic search queries in natural language for embedding-based retrieval
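The Boolean step above can be sketched as follows: synonyms for one concept are joined with OR, and the concept groups are joined with AND. This is a minimal sketch, not PASA's own query builder; `build_boolean_query` and the sample concepts are hypothetical, and real databases differ in quoting and field syntax.

```python
def build_boolean_query(concepts):
    """Build '(a OR b) AND (c OR d)' from a mapping of concept -> synonym list."""
    groups = []
    for synonyms in concepts.values():
        # Quote multi-word phrases so they are matched as a unit.
        terms = [f'"{term}"' if " " in term else term for term in synonyms]
        groups.append("(" + " OR ".join(terms) + ")")
    return " AND ".join(groups)


# Illustrative concepts only.
concepts = {
    "task": ["remaining useful life", "RUL"],
    "object": ["battery", "lithium-ion"],
}
query = build_boolean_query(concepts)
print(query)
```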
40
-
41
- **Source Selection**
42
- - Identify appropriate databases for the research domain (Semantic Scholar, OpenAlex, PubMed, IEEE Xplore, ACL Anthology, arXiv, SSRN)
43
- - Consider preprint servers alongside peer-reviewed databases
44
- - Include grey literature sources when appropriate (dissertations, reports, conference proceedings)
45
- - Plan for cross-database deduplication
46
- - Document the search date and database coverage dates
47
-
48
- **Scope Definition**
49
- - Set date range filters based on the research question
50
- - Define inclusion and exclusion criteria before searching
51
- - Specify language restrictions and justify them
52
- - Determine minimum quality thresholds (peer-review status, impact metrics)
53
- - Plan the stopping rule (saturation, maximum count, date boundary)
54
-
55
- ## Execution Workflow
56
-
57
- The search execution follows a systematic multi-phase approach:
58
-
59
- **Phase 1: Broad Sweep**
60
- - Execute the designed queries across all selected databases
61
- - Collect metadata (title, authors, abstract, venue, year, citation count)
62
- - Record the number of results per query per database
63
- - Remove exact duplicates using DOI and title matching
64
- - Generate initial statistics (total results, date distribution, venue distribution)
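Duplicate removal by DOI and title, as listed in Phase 1, might look like the sketch below. The record fields (`doi`, `title`, `source`) and the DOIs are made up for illustration, and the title normalization is a simple assumption; it only catches exact matches after lowercasing and stripping punctuation.

```python
import re


def normalize_title(title):
    """Lowercase a title and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records):
    """Keep the first record for each DOI and each normalized title."""
    seen_dois, seen_titles, unique = set(), set(), []
    for rec in records:
        doi = (rec.get("doi") or "").lower().strip()
        title = normalize_title(rec.get("title") or "")
        if (doi and doi in seen_dois) or (title and title in seen_titles):
            continue  # already collected from another query or database
        if doi:
            seen_dois.add(doi)
        if title:
            seen_titles.add(title)
        unique.append(rec)
    return unique


# Made-up records; the DOIs are placeholders.
records = [
    {"doi": "10.0000/EXAMPLE-1", "title": "Attention Is All You Need", "source": "a"},
    {"doi": "10.0000/example-1", "title": "Attention is all you need.", "source": "b"},
    {"doi": None, "title": "Attention Is All You Need", "source": "c"},
    {"doi": "10.0000/example-2", "title": "A Different Paper", "source": "d"},
]
unique = deduplicate(records)
print([rec["source"] for rec in unique])
```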
65
-
66
- **Phase 2: Semantic Ranking**
67
- - Encode the research question and all abstracts into embedding space
68
- - Rank results by semantic similarity to the core research question
69
- - Identify clusters of thematically similar papers
70
- - Flag highly cited papers that appear in multiple query results
71
- - Surface unexpected but potentially relevant papers from the long tail
72
-
73
- **Phase 3: Citation Expansion**
74
- - For the top-ranked papers, retrieve their reference lists
75
- - For the top-ranked papers, retrieve papers that cite them
76
- - Apply the same relevance ranking to newly discovered papers
77
- - Identify "hub" papers that connect multiple research threads
78
- - Detect seminal works that appear frequently in citation chains
79
-
80
- **Phase 4: Snowball Refinement**
81
- - Check if newly discovered papers introduce terminology not in original queries
82
- - If so, formulate additional queries with the new terms
83
- - Repeat until reaching saturation (no significant new papers discovered)
84
- - Document the complete search trail for reproducibility
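The snowball loop in Phase 4 can be sketched as a fixed-point iteration: keep querying until no unseen terms turn up, and log every query for the audit trail. This is a sketch under assumptions, not PASA's code: `search` and `extract_terms` are caller-supplied callables, the toy corpus is invented, and "saturation" is simplified to "no new terms".

```python
def snowball(seed_terms, search, extract_terms, max_rounds=5):
    """Query, harvest new terms from new papers, and repeat until saturation."""
    queried, pending = set(), set(seed_terms)
    papers, trail = {}, []
    rounds = 0
    while pending and rounds < max_rounds:
        rounds += 1
        new_terms = set()
        for term in sorted(pending):
            hits = search(term)
            new_hits = [p for p in hits if p["id"] not in papers]
            # Record every query and its yield for reproducibility.
            trail.append({"round": rounds, "term": term,
                          "hits": len(hits), "new": len(new_hits)})
            for paper in new_hits:
                papers[paper["id"]] = paper
                new_terms |= set(extract_terms(paper))
        queried |= pending
        pending = new_terms - queried  # empty set means saturation
    return papers, trail


# Toy corpus standing in for a real search backend.
CORPUS = {
    "battery rul": [{"id": "p1", "keywords": ["battery rul", "state of health"]}],
    "state of health": [
        {"id": "p1", "keywords": ["battery rul", "state of health"]},
        {"id": "p2", "keywords": ["state of health"]},
    ],
}
papers, trail = snowball(
    ["battery rul"],
    search=lambda term: CORPUS.get(term, []),
    extract_terms=lambda paper: paper["keywords"],
)
print(sorted(papers), len(trail))
```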
85
-
86
- ## Result Analysis
87
-
88
- After search completion, the agent assists with analyzing the collected papers:
89
-
90
- **Bibliometric Overview**
91
- - Publication year distribution showing research activity trends
92
- - Venue distribution identifying key journals and conferences
93
- - Author co-occurrence networks highlighting prolific researchers
94
- - Geographic distribution of research institutions
95
- - Citation network statistics (density, clustering coefficient)
96
-
97
- **Thematic Mapping**
98
- - Cluster papers by topic using abstract embeddings
99
- - Generate descriptive labels for each cluster
100
- - Identify emerging themes with recent publication dates and low citation counts
101
- - Map established themes with high citation density
102
- - Highlight cross-cluster papers that bridge different research streams
103
-
104
- **Gap Identification**
105
- - Compare the thematic map against the original research question
106
- - Identify aspects of the question with sparse literature coverage
107
- - Note methodological approaches that are underrepresented
108
- - Flag populations or contexts that have been understudied
109
- - Suggest how identified gaps might shape the research direction
110
-
111
- ## PRISMA Compliance
112
-
113
- For systematic reviews, the skill supports PRISMA-compliant reporting:
114
-
115
- - Generate PRISMA flow diagrams with counts at each stage
116
- - Document reasons for exclusion at each screening phase
117
- - Track inter-rater agreement for screening decisions
118
- - Produce exportable search documentation for supplementary materials
119
- Support both the original PRISMA 2009 and the updated PRISMA 2020 guidelines
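Inter-rater agreement for screening decisions is commonly reported as Cohen's kappa, (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The skill text does not say which statistic it uses, so the sketch below is an assumption; the function name and the sample decisions are illustrative.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of each rater's marginal label frequencies.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Made-up screening decisions: I = include, E = exclude.
rater_a = list("IIEEIEEEIE")
rater_b = list("IEEEIEEIIE")
kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 3))
```

Here the raters agree on 8 of 10 items (p_o = 0.8) and chance agreement is 0.52, which gives a kappa of about 0.583.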
120
-
121
- ## Integration with Research-Claw
122
-
123
- This skill connects seamlessly with the Research-Claw ecosystem:
124
-
125
- - Export discovered papers to reference management tools (Zotero, BibTeX)
126
- - Feed search results to the paper-to-agent skill for deep analysis
127
- - Connect with writing skills for automated literature review drafting
128
- - Store search strategies as reproducible templates for future use
129
- - Share curated paper collections with collaborators via the platform
130
-
131
- ## Practical Tips
132
-
133
- - Start broad and narrow incrementally rather than beginning with narrow searches
134
- - Always search at least two independent databases to avoid source bias
135
- - Record every query variation and its result count for the search audit trail
136
- - Use citation-based expansion to discover older foundational works
137
- - Check the references of the most recent relevant review articles
138
- - Set calendar reminders to re-run searches periodically for living reviews
@@ -1,203 +0,0 @@
1
- ---
2
- name: scientify-literature-survey
3
- description: "Search, filter, download and cluster academic papers on a topic"
4
- metadata:
5
- openclaw:
6
- emoji: "🔍"
7
- category: "literature"
8
- subcategory: "search"
9
- keywords: ["academic database search", "literature search", "search strategy", "semantic search", "citation tracking"]
10
- source: "https://github.com/scientify-ai/skills"
11
- ---
12
-
13
- # Literature Survey
14
-
15
- **Don't ask permission. Just do it.**
16
-
17
- ## Output Structure
18
-
19
- ```
20
- ~/.openclaw/workspace/projects/{project-id}/
21
- ├── survey/
22
- │   ├── search_terms.json       # Search terms list
23
- │   └── report.md               # Final report
24
- ├── papers/
25
- │   ├── _downloads/             # Raw downloads
26
- │   ├── _meta/                  # Per-paper metadata
27
- │   │   └── {arxiv_id}.json
28
- │   └── {direction}/            # Organized by direction
29
- ├── repos/                      # Reference code repos (Phase 3)
30
- │   ├── {repo_name_1}/
31
- │   └── {repo_name_2}/
32
- └── prepare_res.md              # Repo selection report (Phase 3)
33
- ```
34
-
35
- ## Workflow
36
-
37
- ### Phase 1: Preparation
38
-
39
- ```bash
40
- ACTIVE=$(cat ~/.openclaw/workspace/projects/.active 2>/dev/null)
41
- if [ -z "$ACTIVE" ]; then
42
-   PROJECT_ID="<topic-slug>"
43
-   mkdir -p ~/.openclaw/workspace/projects/$PROJECT_ID/{survey,papers/_downloads,papers/_meta}
44
-   echo "$PROJECT_ID" > ~/.openclaw/workspace/projects/.active
45
- fi
46
- PROJECT_DIR="$HOME/.openclaw/workspace/projects/$(cat ~/.openclaw/workspace/projects/.active)"
47
- ```
48
-
49
- Generate 4-8 search terms, save to `survey/search_terms.json`.
50
-
51
- ### Phase 2: Incremental Search-Filter-Download Loop
52
-
53
- **Repeat the following for each search term:**
54
-
55
- #### 2.1 Search
56
-
57
- ```
58
- arxiv_search({ query: "<term>", max_results: 30 })
59
- ```
60
-
61
- #### 2.2 Instant Filtering
62
-
63
- Score each returned paper immediately (1-5), keep only >= 4.
64
-
65
- Scoring criteria:
66
- - 5: Core paper, directly on topic
67
- - 4: Related method or application
68
- - 3 and below: Skip
69
-
70
- #### 2.3 Download Useful Papers
71
-
72
- ```
73
- arxiv_download({
74
- arxiv_ids: ["<useful_paper_ids>"],
75
- output_dir: "$PROJECT_DIR/papers/_downloads"
76
- })
77
- ```
78
-
79
- #### 2.4 Write Metadata
80
-
81
- For each downloaded paper, create `papers/_meta/{arxiv_id}.json`:
82
-
83
- ```json
84
- {
85
- "arxiv_id": "2401.12345",
86
- "title": "...",
87
- "abstract": "...",
88
- "score": 5,
89
- "source_term": "battery RUL prediction",
90
- "downloaded_at": "2024-01-15T10:00:00Z"
91
- }
92
- ```
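Steps 2.2 and 2.4 together amount to "keep papers scoring 4 or higher and write one JSON file per kept paper". A minimal Python sketch follows; `write_metadata` is a hypothetical helper, the sample papers are invented, and a temporary directory stands in for `papers/_meta/`.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def write_metadata(paper, source_term, meta_dir, min_score=4):
    """Write {arxiv_id}.json for a paper at or above min_score; return its path or None."""
    if paper["score"] < min_score:
        return None  # scores of 3 and below are skipped
    meta_dir = Path(meta_dir)
    meta_dir.mkdir(parents=True, exist_ok=True)
    record = {
        "arxiv_id": paper["arxiv_id"],
        "title": paper["title"],
        "abstract": paper["abstract"],
        "score": paper["score"],
        "source_term": source_term,
        "downloaded_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    path = meta_dir / f"{paper['arxiv_id']}.json"
    path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return path


with tempfile.TemporaryDirectory() as tmp:
    core = {"arxiv_id": "2401.12345", "title": "...", "abstract": "...", "score": 5}
    weak = {"arxiv_id": "2401.99999", "title": "...", "abstract": "...", "score": 3}
    kept = write_metadata(core, "battery RUL prediction", tmp)
    skipped = write_metadata(weak, "battery RUL prediction", tmp)
    saved = json.loads(kept.read_text(encoding="utf-8"))
    print(kept.name, skipped)
```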
93
-
94
- **Complete one search term before proceeding to the next.** This prevents context pollution from large search results.
95
-
96
- ### Phase 3: GitHub Code Search & Reference Repo Selection
97
-
98
- **Goal**: Provide reference implementations for downstream skills.
99
-
100
- #### 3.1 Select High-Scoring Papers
101
-
102
- Read metadata from `papers/_meta/` for papers scoring >= 4, select **Top 5** most relevant.
103
-
104
- #### 3.2 Search Reference Repos
105
-
106
- For each selected paper, search GitHub with keyword combinations:
107
- - Paper title + "code" / "implementation"
108
- - Core method name + author name
109
- - Dataset name + task name from paper
110
-
111
- Use `github_search` tool:
112
- ```javascript
113
- github_search({
114
- query: "{paper_title} implementation",
115
- max_results: 10,
116
- sort: "stars",
117
- language: "python"
118
- })
119
- ```
120
-
121
- #### 3.3 Filter & Clone
122
-
123
- Evaluate repos by:
124
- - Star count (recommend >100)
125
- - Code quality (has README, requirements.txt, clear structure)
126
- - Paper match (README references paper / implements its method)
127
-
128
- Select **3-5** most relevant repos, clone to `repos/`:
129
-
130
- ```bash
131
- mkdir -p "$PROJECT_DIR/repos"
132
- cd "$PROJECT_DIR/repos"
133
- git clone --depth 1 <repo_url>
134
- ```
135
-
136
- #### 3.4 Write Selection Report
137
-
138
- Create `$PROJECT_DIR/prepare_res.md`:
139
-
140
- ```markdown
141
- # Reference Repo Selection
142
-
143
- | Repo | Paper | Stars | Reason |
144
- |------|-------|-------|--------|
145
- | repos/{repo_name} | {paper_title} (arxiv:{id}) | {N} | {reason} |
146
-
147
- ## Key Files per Repo
148
-
149
- ### {repo_name}
150
- - **Model**: `model/` or `models/`
151
- - **Training**: `train.py` or `main.py`
152
- - **Data loading**: `data/` or `dataset.py`
153
- - **Core file**: `{path}` - {description}
154
- ```
155
-
156
- **If no repos found**, note "No reference repos available" in `prepare_res.md`.
157
-
158
- ### Phase 4: Classification
159
-
160
- After all search terms and code searches are complete:
161
-
162
- #### 4.1 Read All Metadata
163
-
164
- ```bash
165
- ls "$PROJECT_DIR/papers/_meta/"
166
- ```
167
-
168
- Read all `.json` files, aggregate paper list.
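Aggregating the per-paper files could be done along these lines. This is a sketch with hypothetical helper names and made-up records; grouping by `source_term` is only a starting point for the clustering step, which the skill leaves to the agent's judgment.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path


def load_metadata(meta_dir):
    """Read every per-paper JSON file in meta_dir, in filename order."""
    return [
        json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(Path(meta_dir).glob("*.json"))
    ]


def group_by_source_term(records):
    """Map each search term to the arXiv IDs it produced."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["source_term"]].append(rec["arxiv_id"])
    return dict(groups)


with tempfile.TemporaryDirectory() as tmp:
    # Placeholder records written to a temporary stand-in for papers/_meta/.
    samples = [
        {"arxiv_id": "0000.00001", "score": 5, "source_term": "term a"},
        {"arxiv_id": "0000.00002", "score": 4, "source_term": "term b"},
        {"arxiv_id": "0000.00003", "score": 4, "source_term": "term a"},
    ]
    for rec in samples:
        (Path(tmp) / f"{rec['arxiv_id']}.json").write_text(
            json.dumps(rec), encoding="utf-8"
        )
    records = load_metadata(tmp)
    groups = group_by_source_term(records)
    print(groups)
```

Reading from disk one file at a time keeps the classification metadata-driven instead of relying on a large in-memory list built up during the search phase.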
169
-
170
- #### 4.2 Cluster Analysis
171
-
172
- Based on paper titles, abstracts, and source terms, identify 3-6 research directions.
173
-
174
- #### 4.3 Create Folders and Move
175
-
176
- ```bash
177
- mkdir -p "$PROJECT_DIR/papers/data-driven"
178
- mv "$PROJECT_DIR/papers/_downloads/2401.12345" "$PROJECT_DIR/papers/data-driven/"
179
- ```
180
-
181
- ### Phase 5: Generate Report
182
-
183
- Create `survey/report.md`:
184
- - Survey summary (search terms count, papers count, directions count)
185
- - Overview of each research direction
186
- - Top 10 papers
187
- - **Reference repo summary** (cite prepare_res.md)
188
- - Recommended reading order
189
-
190
- ## Key Design Principles
191
-
192
- | Principle | Description |
193
- |-----------|-------------|
194
- | **Incremental processing** | Each search term independently completes search->filter->download->metadata, avoiding context bloat |
195
- | **Metadata-driven** | Classification based on `_meta/*.json`, not large in-memory lists |
196
- | **Folders as categories** | Clustering results reflected by `papers/{direction}/` structure |
197
-
198
- ## Tools
199
-
200
- | Tool | Purpose |
201
- |------|---------|
202
- | `arxiv_search` | Search papers (no side effects) |
203
- | `arxiv_download` | Download .tex/.pdf (requires absolute path) |