@wentorai/research-plugins 1.2.2 → 1.3.0
- package/README.md +16 -8
- package/openclaw.plugin.json +10 -3
- package/package.json +2 -5
- package/skills/analysis/dataviz/SKILL.md +25 -0
- package/skills/analysis/dataviz/chart-image-generator/SKILL.md +1 -1
- package/skills/analysis/econometrics/SKILL.md +23 -0
- package/skills/analysis/econometrics/robustness-checks/SKILL.md +1 -1
- package/skills/analysis/statistics/SKILL.md +21 -0
- package/skills/analysis/statistics/data-anomaly-detection/SKILL.md +1 -1
- package/skills/analysis/statistics/ml-experiment-tracker/SKILL.md +1 -1
- package/skills/analysis/statistics/{senior-data-scientist-guide → modeling-strategy-guide}/SKILL.md +5 -5
- package/skills/analysis/wrangling/SKILL.md +21 -0
- package/skills/analysis/wrangling/csv-data-analyzer/SKILL.md +1 -1
- package/skills/analysis/wrangling/data-cog-guide/SKILL.md +1 -1
- package/skills/domains/ai-ml/SKILL.md +37 -0
- package/skills/domains/biomedical/SKILL.md +28 -0
- package/skills/domains/biomedical/genomas-guide/SKILL.md +1 -1
- package/skills/domains/biomedical/med-researcher-guide/SKILL.md +1 -1
- package/skills/domains/biomedical/medgeclaw-guide/SKILL.md +1 -1
- package/skills/domains/business/SKILL.md +17 -0
- package/skills/domains/business/architecture-design-guide/SKILL.md +1 -1
- package/skills/domains/chemistry/SKILL.md +19 -0
- package/skills/domains/chemistry/computational-chemistry-guide/SKILL.md +1 -1
- package/skills/domains/cs/SKILL.md +21 -0
- package/skills/domains/ecology/SKILL.md +16 -0
- package/skills/domains/economics/SKILL.md +20 -0
- package/skills/domains/economics/post-labor-economics/SKILL.md +1 -1
- package/skills/domains/economics/pricing-psychology-guide/SKILL.md +1 -1
- package/skills/domains/education/SKILL.md +19 -0
- package/skills/domains/education/academic-study-methods/SKILL.md +1 -1
- package/skills/domains/education/edumcp-guide/SKILL.md +1 -1
- package/skills/domains/finance/SKILL.md +19 -0
- package/skills/domains/finance/akshare-finance-data/SKILL.md +1 -1
- package/skills/domains/finance/options-analytics-agent-guide/SKILL.md +1 -1
- package/skills/domains/finance/stata-accounting-research/SKILL.md +1 -1
- package/skills/domains/geoscience/SKILL.md +17 -0
- package/skills/domains/humanities/SKILL.md +16 -0
- package/skills/domains/humanities/history-research-guide/SKILL.md +1 -1
- package/skills/domains/humanities/political-history-guide/SKILL.md +1 -1
- package/skills/domains/law/SKILL.md +19 -0
- package/skills/domains/math/SKILL.md +17 -0
- package/skills/domains/pharma/SKILL.md +17 -0
- package/skills/domains/physics/SKILL.md +16 -0
- package/skills/domains/social-science/SKILL.md +17 -0
- package/skills/domains/social-science/sociology-research-methods/SKILL.md +1 -1
- package/skills/literature/discovery/SKILL.md +20 -0
- package/skills/literature/discovery/paper-recommendation-guide/SKILL.md +1 -1
- package/skills/literature/discovery/semantic-paper-radar/SKILL.md +1 -1
- package/skills/literature/fulltext/SKILL.md +26 -0
- package/skills/literature/metadata/SKILL.md +35 -0
- package/skills/literature/metadata/doi-content-negotiation/SKILL.md +4 -0
- package/skills/literature/metadata/doi-resolution-guide/SKILL.md +4 -0
- package/skills/literature/metadata/orcid-api/SKILL.md +4 -0
- package/skills/literature/metadata/orcid-integration-guide/SKILL.md +4 -0
- package/skills/literature/search/SKILL.md +43 -0
- package/skills/literature/search/paper-search-mcp-guide/SKILL.md +1 -1
- package/skills/research/automation/SKILL.md +21 -0
- package/skills/research/deep-research/SKILL.md +24 -0
- package/skills/research/deep-research/auto-deep-research-guide/SKILL.md +1 -1
- package/skills/research/deep-research/in-depth-research-guide/SKILL.md +1 -1
- package/skills/research/funding/SKILL.md +20 -0
- package/skills/research/methodology/SKILL.md +24 -0
- package/skills/research/paper-review/SKILL.md +19 -0
- package/skills/research/paper-review/paper-critique-framework/SKILL.md +1 -1
- package/skills/tools/code-exec/SKILL.md +18 -0
- package/skills/tools/diagram/SKILL.md +20 -0
- package/skills/tools/document/SKILL.md +21 -0
- package/skills/tools/knowledge-graph/SKILL.md +21 -0
- package/skills/tools/ocr-translate/SKILL.md +18 -0
- package/skills/tools/ocr-translate/handwriting-recognition-guide/SKILL.md +2 -0
- package/skills/tools/ocr-translate/latex-ocr-guide/SKILL.md +2 -0
- package/skills/tools/scraping/SKILL.md +17 -0
- package/skills/writing/citation/SKILL.md +33 -0
- package/skills/writing/citation/zotfile-attachment-guide/SKILL.md +2 -0
- package/skills/writing/composition/SKILL.md +22 -0
- package/skills/writing/composition/research-paper-writer/SKILL.md +1 -1
- package/skills/writing/composition/scientific-writing-wrapper/SKILL.md +1 -1
- package/skills/writing/latex/SKILL.md +22 -0
- package/skills/writing/latex/academic-writing-latex/SKILL.md +1 -1
- package/skills/writing/latex/latex-drawing-guide/SKILL.md +1 -1
- package/skills/writing/polish/SKILL.md +20 -0
- package/skills/writing/polish/chinese-text-humanizer/SKILL.md +1 -1
- package/skills/writing/templates/SKILL.md +22 -0
- package/skills/writing/templates/beamer-presentation-guide/SKILL.md +1 -1
- package/skills/writing/templates/scientific-article-pdf/SKILL.md +1 -1
- package/skills/analysis/dataviz/citation-map-guide/SKILL.md +0 -184
- package/skills/analysis/dataviz/data-visualization-principles/SKILL.md +0 -171
- package/skills/analysis/econometrics/empirical-paper-analysis/SKILL.md +0 -192
- package/skills/analysis/econometrics/panel-data-regression-workflow/SKILL.md +0 -267
- package/skills/analysis/econometrics/stata-regression/SKILL.md +0 -117
- package/skills/analysis/statistics/general-statistics-guide/SKILL.md +0 -226
- package/skills/analysis/statistics/infiagent-benchmark-guide/SKILL.md +0 -106
- package/skills/analysis/statistics/pywayne-statistics-guide/SKILL.md +0 -192
- package/skills/analysis/statistics/quantitative-methods-guide/SKILL.md +0 -193
- package/skills/analysis/wrangling/claude-data-analysis-guide/SKILL.md +0 -100
- package/skills/analysis/wrangling/open-data-scientist-guide/SKILL.md +0 -197
- package/skills/domains/ai-ml/annotated-dl-papers-guide/SKILL.md +0 -159
- package/skills/domains/humanities/digital-humanities-methods/SKILL.md +0 -232
- package/skills/domains/law/legal-research-methods/SKILL.md +0 -190
- package/skills/domains/social-science/sociology-research-guide/SKILL.md +0 -238
- package/skills/literature/discovery/arxiv-paper-monitoring/SKILL.md +0 -233
- package/skills/literature/discovery/paper-tracking-guide/SKILL.md +0 -211
- package/skills/literature/fulltext/zotero-scihub-guide/SKILL.md +0 -168
- package/skills/literature/search/arxiv-osiris/SKILL.md +0 -199
- package/skills/literature/search/deepgit-search-guide/SKILL.md +0 -147
- package/skills/literature/search/multi-database-literature-search/SKILL.md +0 -198
- package/skills/literature/search/papers-chat-guide/SKILL.md +0 -194
- package/skills/literature/search/pasa-paper-search-guide/SKILL.md +0 -138
- package/skills/literature/search/scientify-literature-survey/SKILL.md +0 -203
- package/skills/research/automation/ai-scientist-guide/SKILL.md +0 -228
- package/skills/research/automation/coexist-ai-guide/SKILL.md +0 -149
- package/skills/research/automation/foam-agent-guide/SKILL.md +0 -203
- package/skills/research/automation/research-paper-orchestrator/SKILL.md +0 -254
- package/skills/research/deep-research/academic-deep-research/SKILL.md +0 -190
- package/skills/research/deep-research/cognitive-kernel-guide/SKILL.md +0 -200
- package/skills/research/deep-research/corvus-research-guide/SKILL.md +0 -132
- package/skills/research/deep-research/deep-research-pro/SKILL.md +0 -213
- package/skills/research/deep-research/deep-research-work/SKILL.md +0 -204
- package/skills/research/deep-research/research-cog/SKILL.md +0 -153
- package/skills/research/methodology/academic-mentor-guide/SKILL.md +0 -169
- package/skills/research/methodology/deep-innovator-guide/SKILL.md +0 -242
- package/skills/research/methodology/research-pipeline-units-guide/SKILL.md +0 -169
- package/skills/research/paper-review/paper-compare-guide/SKILL.md +0 -238
- package/skills/research/paper-review/paper-digest-guide/SKILL.md +0 -240
- package/skills/research/paper-review/paper-research-assistant/SKILL.md +0 -231
- package/skills/research/paper-review/research-quality-filter/SKILL.md +0 -261
- package/skills/tools/code-exec/contextplus-mcp-guide/SKILL.md +0 -110
- package/skills/tools/diagram/clawphd-guide/SKILL.md +0 -149
- package/skills/tools/diagram/scientific-graphical-abstract/SKILL.md +0 -201
- package/skills/tools/document/md2pdf-xelatex/SKILL.md +0 -212
- package/skills/tools/document/openpaper-guide/SKILL.md +0 -232
- package/skills/tools/document/weknora-guide/SKILL.md +0 -216
- package/skills/tools/knowledge-graph/mimir-memory-guide/SKILL.md +0 -135
- package/skills/tools/knowledge-graph/open-webui-tools-guide/SKILL.md +0 -156
- package/skills/tools/ocr-translate/formula-recognition-guide/SKILL.md +0 -367
- package/skills/tools/ocr-translate/math-equation-renderer/SKILL.md +0 -198
- package/skills/tools/scraping/api-data-collection-guide/SKILL.md +0 -301
- package/skills/writing/citation/academic-citation-manager-guide/SKILL.md +0 -182
- package/skills/writing/composition/opendraft-thesis-guide/SKILL.md +0 -200
- package/skills/writing/composition/paper-debugger-guide/SKILL.md +0 -143
- package/skills/writing/composition/paperforge-guide/SKILL.md +0 -205
@@ -1,156 +0,0 @@
----
-name: open-webui-tools-guide
-description: "Academic research tools for Open WebUI chat interface"
-metadata:
-  openclaw:
-    emoji: "🌐"
-    category: "tools"
-    subcategory: "knowledge-graph"
-    keywords: ["Open WebUI", "research tools", "chat interface", "paper search", "academic tools", "LLM UI"]
-    source: "https://github.com/Haervwe/open-webui-tools"
----
-
-# Open WebUI Academic Tools Guide
-
-## Overview
-
-A collection of academic research tools designed for Open WebUI, the popular self-hosted LLM chat interface. These tools add paper search, citation lookup, and research capabilities directly to chat conversations: search arXiv, PubMed, and Semantic Scholar; fetch paper details; generate citations; and analyze documents, all within the familiar chat UI.
-
-## Installation
-
-```bash
-# In Open WebUI: Admin → Tools → Add Tool
-# Import from the tools collection
-
-# Or manually add tool functions
-# Copy tool JSON definitions to Open WebUI config
-```
-
-## Available Tools
-
-### Paper Search Tool
-
-```python
-# Searches arXiv, Semantic Scholar, PubMed
-# Usage in chat: "Search for papers on attention mechanisms"
-# search_arxiv, search_s2, search_pubmed, and format_results are
-# placeholder helpers that are not defined in this snippet.
-
-def search_papers(query: str, source: str = "all",
-                  max_results: int = 10) -> str:
-    """Search academic databases for papers.
-
-    Args:
-        query: Search query
-        source: "arxiv", "semantic_scholar", "pubmed", or "all"
-        max_results: Maximum results to return
-    """
-    results = []
-
-    if source in ("all", "arxiv"):
-        # Search arXiv
-        arxiv_results = search_arxiv(query, max_results)
-        results.extend(arxiv_results)
-
-    if source in ("all", "semantic_scholar"):
-        # Search Semantic Scholar
-        s2_results = search_s2(query, max_results)
-        results.extend(s2_results)
-
-    if source in ("all", "pubmed"):
-        # Search PubMed (documented as a source above; helper name assumed)
-        pubmed_results = search_pubmed(query, max_results)
-        results.extend(pubmed_results)
-
-    return format_results(results)
-```
-
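The `search_papers` snippet leaves `format_results` undefined. The following is a minimal sketch of how such a helper could render results for a chat reply. It assumes each result is a plain dict with `title`, `year`, and `url` keys; these field names are illustrative assumptions and are not taken from the open-webui-tools source.

```python
from typing import Dict, List


def format_results(results: List[Dict[str, str]]) -> str:
    """Render search results as a numbered plain-text list.

    Hypothetical helper: the dict keys used here are assumed for illustration.
    """
    if not results:
        return "No papers found."
    lines = []
    for i, paper in enumerate(results, start=1):
        title = paper.get("title", "Untitled")
        year = paper.get("year", "n.d.")
        url = paper.get("url", "")
        entry = f"{i}. {title} ({year})"
        # Append the URL only when one is available.
        if url:
            entry += f" - {url}"
        lines.append(entry)
    return "\n".join(lines)
```

Returning a single string keeps the helper consistent with the `-> str` annotation on `search_papers`.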
-### Citation Generator Tool
-
-```python
-# Generate formatted citations from DOI or title
-# Usage: "Get BibTeX for DOI 10.48550/arXiv.1706.03762"
-
-def get_citation(identifier: str,
-                 style: str = "bibtex") -> str:
-    """Get formatted citation for a paper.
-
-    Args:
-        identifier: DOI, arXiv ID, or paper title
-        style: "bibtex", "apa", "mla", "chicago"
-    """
-    paper = resolve_paper(identifier)
-    return format_citation(paper, style)
-```
-
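`get_citation` delegates to a `format_citation` helper that the snippet does not define. As a rough sketch of the BibTeX case only, the function below builds an `@article` entry from a metadata dict. The function name `format_bibtex` and the dict keys (`key`, `title`, `authors`, `year`) are assumptions made for illustration, not the tool's actual data model.

```python
from typing import Dict, List, Union


def format_bibtex(paper: Dict[str, Union[str, List[str]]]) -> str:
    """Build a BibTeX @article entry from a metadata dict.

    Hypothetical helper: keys ("key", "title", "authors", "year") are assumed.
    """
    # BibTeX separates multiple authors with the word "and".
    authors = " and ".join(paper["authors"])
    fields = [
        ("title", paper["title"]),
        ("author", authors),
        ("year", paper["year"]),
    ]
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields)
    return f"@article{{{paper['key']},\n{body}\n}}"


entry = format_bibtex({
    "key": "vaswani2017attention",
    "title": "Attention Is All You Need",
    "authors": ["Ashish Vaswani", "Noam Shazeer"],
    "year": "2017",
})
print(entry)
```

Other styles (APA, MLA, Chicago) would need their own formatting rules; this sketch does not attempt them.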
-### Paper Summary Tool
-
-```python
-# Fetch and summarize paper abstract + key points
-# Usage: "Summarize arxiv:2401.12345"
-
-def summarize_paper(paper_id: str) -> dict:
-    """Fetch paper metadata and generate summary.
-
-    Args:
-        paper_id: arXiv ID, DOI, or Semantic Scholar ID
-    """
-    paper = fetch_paper_details(paper_id)
-    # The summary is returned as a dict, so the annotation is dict, not str
-    return {
-        "title": paper.title,
-        "authors": paper.authors,
-        "abstract": paper.abstract,
-        "year": paper.year,
-        "citations": paper.citation_count,
-    }
-```
-
-## Tool Configuration
-
-```json
-{
-  "tools": {
-    "paper_search": {
-      "enabled": true,
-      "default_source": "semantic_scholar",
-      "max_results": 10,
-      "include_abstract": true
-    },
-    "citation_generator": {
-      "enabled": true,
-      "default_style": "bibtex"
-    },
-    "paper_summary": {
-      "enabled": true,
-      "include_related": true
-    }
-  }
-}
-```
-
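How Open WebUI consumes this configuration is not shown in the source. Purely to illustrate the structure, the sketch below parses a config of the same shape with the standard `json` module and lists the tools whose `enabled` flag is true. The `enabled_tools` helper and the inline `CONFIG_TEXT` sample are hypothetical.

```python
import json

# Sample config in the same shape as the block above (values are illustrative).
CONFIG_TEXT = """
{
  "tools": {
    "paper_search": {"enabled": true, "default_source": "semantic_scholar", "max_results": 10},
    "citation_generator": {"enabled": true, "default_style": "bibtex"},
    "paper_summary": {"enabled": false}
  }
}
"""


def enabled_tools(config_text: str) -> list:
    """Return the sorted names of tools whose "enabled" flag is true."""
    config = json.loads(config_text)
    tools = config.get("tools", {})
    # A missing "enabled" key is treated as disabled.
    return sorted(name for name, opts in tools.items() if opts.get("enabled", False))


print(enabled_tools(CONFIG_TEXT))
```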
-## Chat Workflow Examples
-
-```markdown
-### Research Discovery
-User: "Find recent papers on retrieval-augmented generation"
-Bot: [Uses paper_search tool] Here are 10 recent papers on RAG...
-
-### Citation Workflow
-User: "Get BibTeX for the BERT paper"
-Bot: [Uses citation_generator] @article{devlin2019bert, ...}
-
-### Paper Analysis
-User: "Tell me about arxiv:2404.19756"
-Bot: [Uses paper_summary] This paper introduces KAN...
-
-### Literature Review
-User: "Compare attention mechanisms in the top-5 cited
-       transformer papers from 2023"
-Bot: [Uses multiple tools] Searching... Here's a comparison...
-```
-
-## Use Cases
-
-1. **Chat-based research**: Search papers while chatting with an LLM
-2. **Quick citations**: Generate BibTeX without leaving the chat
-3. **Paper discovery**: Find related work through conversation
-4. **Team research**: Shared research chat with embedded tools
-5. **Teaching**: Interactive paper exploration in the classroom
-
-## References
-
-- [open-webui-tools GitHub](https://github.com/Haervwe/open-webui-tools)
-- [Open WebUI](https://github.com/open-webui/open-webui)
@@ -1,367 +0,0 @@
----
-name: formula-recognition-guide
-description: "Math OCR and formula recognition to LaTeX conversion"
-metadata:
-  openclaw:
-    emoji: "math"
-    category: "tools"
-    subcategory: "ocr-translate"
-    keywords: ["math OCR", "formula recognition", "LaTeX OCR"]
-    source: "wentor-research-plugins"
----
-
-# Formula Recognition Guide
-
-Convert mathematical formulas from images, PDFs, and handwritten notes to LaTeX code using OCR tools, neural models, and API services.
-
-## Tool Comparison
-
-| Tool | Input | Output | Accuracy | Speed | Cost |
-|------|-------|--------|----------|-------|------|
-| Mathpix | Image, PDF, screenshot | LaTeX, MathML | Excellent | Fast | Free tier (50/month), then paid |
-| LaTeX-OCR (Lukas Blecher) | Image | LaTeX | Very good | Medium | Free (open source) |
-| Pix2Text (p2t) | Image | LaTeX + text | Good | Medium | Free (open source) |
-| Nougat (Meta) | PDF pages | Markdown + LaTeX | Excellent (full page) | Slow (GPU) | Free (open source) |
-| InftyReader | Image, PDF | LaTeX, MathML | Good | Medium | Commercial |
-| Google Cloud Vision | Image | Text (limited math) | Poor for math | Fast | Pay per use |
-| img2latex (Harvard) | Image | LaTeX | Good | Medium | Free (open source) |
-
-## Mathpix API
-
-Mathpix is a commercial math OCR service that handles printed and handwritten formulas, tables, and full documents.
-
-### Setup
-
-```bash
-pip install mathpix
-# Or use the REST API directly
-```
-
-### Single Image to LaTeX
-
-```python
-import base64
-
-import requests
-
-def mathpix_ocr(image_path, app_id, app_key):
-    """Convert an image of a formula to LaTeX using Mathpix API."""
-    # Read the image and base64-encode it for the data URI (assumes a PNG)
-    with open(image_path, "rb") as f:
-        image_data = base64.b64encode(f.read()).decode()
-
-    response = requests.post(
-        "https://api.mathpix.com/v3/text",
-        headers={
-            "app_id": app_id,
-            "app_key": app_key,
-            "Content-Type": "application/json"
-        },
-        json={
-            "src": f"data:image/png;base64,{image_data}",
-            "formats": ["latex_styled", "latex_normal", "mathml"],
-            "data_options": {
-                "include_asciimath": True,
-                "include_latex": True
-            }
-        },
-        timeout=30,
-    )
-    # Fail early on HTTP errors instead of parsing an error body
-    response.raise_for_status()
-
-    result = response.json()
-    return {
-        "latex": result.get("latex_styled", ""),
-        "latex_normal": result.get("latex_normal", ""),
-        "confidence": result.get("confidence", 0),
-        "mathml": result.get("mathml", "")
-    }
-
-# Usage
-result = mathpix_ocr("equation.png", "YOUR_APP_ID", "YOUR_APP_KEY")
-print(f"LaTeX: {result['latex']}")
-print(f"Confidence: {result['confidence']:.2%}")
-```
-
-### Process a Full PDF Page
-
-```python
-import base64
-
-import requests
-
-def mathpix_pdf_page(image_path, app_id, app_key):
-    """Process a PDF page, rendered as a PNG image, with mixed text and math."""
-    with open(image_path, "rb") as f:
-        image_data = base64.b64encode(f.read()).decode()
-
-    response = requests.post(
-        "https://api.mathpix.com/v3/text",
-        headers={
-            "app_id": app_id,
-            "app_key": app_key,
-            "Content-Type": "application/json"
-        },
-        json={
-            "src": f"data:image/png;base64,{image_data}",
-            "formats": ["text", "latex_styled"],
-            "ocr": ["math", "text"],
-            "math_inline_delimiters": ["$", "$"],
-            "math_display_delimiters": ["$$", "$$"]
-        }
-    )
-
-    result = response.json()
-    return result.get("text", "")
-
-# Returns Markdown with inline $...$ and display $$...$$ math
-```
-
-## LaTeX-OCR (Open Source, Local)
-
-LaTeX-OCR by Lukas Blecher is a free, locally running model for converting formula images to LaTeX.
-
-### Installation and Usage
-
-```bash
-pip install "pix2tex[gui]"
-```
-
-```python
-from PIL import Image
-from pix2tex.cli import LatexOCR
-
-# Initialize model (downloads on first use, ~1GB)
-model = LatexOCR()
-
-# From file
-img = Image.open("equation.png")
-latex = model(img)
-print(f"LaTeX: {latex}")
-# Example output: \frac{\partial \mathcal{L}}{\partial \theta} = -\frac{1}{N} \sum_{i=1}^{N} \nabla_\theta \log p(y_i | x_i; \theta)
-```
-
-### Batch Processing
-
-```python
-from pathlib import Path
-
-from PIL import Image
-from pix2tex.cli import LatexOCR
-
-def batch_ocr(image_dir, model):
-    """Process all formula images in a directory."""
-    results = []
-    # Sort so results come back in a stable, filename-based order
-    for img_path in sorted(Path(image_dir).glob("*.png")):
-        img = Image.open(img_path)
-        latex = model(img)
-        results.append({
-            "file": img_path.name,
-            "latex": latex
-        })
-        print(f"{img_path.name}: {latex[:80]}...")
-    return results
-
-model = LatexOCR()
-results = batch_ocr("./formula_images/", model)
-```
-
-## Pix2Text (Chinese + English + Math)
-
-Pix2Text handles mixed Chinese/English text alongside mathematical formulas.
-
-```bash
-pip install pix2text
-```
-
-```python
-from pix2text import Pix2Text
-
-p2t = Pix2Text()
-
-# Recognize mixed content (text + math)
-result = p2t.recognize("mixed_content.png")
-print(result)
-# Output includes both text and LaTeX formulas
-```
-
-## Nougat (Meta): Full Document OCR
-
-Nougat converts entire academic PDF pages to Markdown with LaTeX math, preserving document structure.
-
-```bash
-pip install nougat-ocr
-```
-
-```bash
-# Convert a PDF to Markdown
-nougat path/to/paper.pdf -o output_dir/ --no-skipping
-
-# Output: Markdown files with LaTeX equations preserved
-# e.g., The loss function is $\mathcal{L}(\theta) = ...$
-```
-
-```python
-# Programmatic usage
-from nougat import NougatModel
-from nougat.utils.dataset import LazyDataset
-from nougat.postprocessing import markdown_compatible
-
-model = NougatModel.from_pretrained("facebook/nougat-base")
-model.eval()
-
-# Process pages...
-```
-
-## Screenshot-Based Workflow
-
-### macOS Workflow
-
-```bash
-# 1. Take a screenshot of the formula (Cmd+Shift+4)
-# 2. Process with LaTeX-OCR or Mathpix
-
-# Automated with a shell script:
-#!/bin/bash
-# save as ~/bin/formula-ocr.sh
-# BSD mktemp on macOS only substitutes trailing Xs, so create a temp
-# directory and place the .png inside it
-SCREENSHOT="$(mktemp -d /tmp/formula_XXXXXX)/formula.png"
-screencapture -i "$SCREENSHOT"
-python -c "
-from pix2tex.cli import LatexOCR
-from PIL import Image
-model = LatexOCR()
-img = Image.open('$SCREENSHOT')
-latex = model(img)
-print(latex)
-# Copy to clipboard
-import subprocess
-subprocess.run(['pbcopy'], input=latex.encode())
-print('Copied to clipboard!')
-"
-```
-
-### Cross-Platform with Snipping Tool
-
-```python
-from PIL import ImageGrab
-from pix2tex.cli import LatexOCR
-
-def capture_and_ocr():
-    """Capture the screen and convert it to LaTeX."""
-    # bbox=None grabs the full screen; crop with a snipping tool first,
-    # or pass a (left, top, right, bottom) box to grab only the formula
-    img = ImageGrab.grab(bbox=None)
-
-    model = LatexOCR()
-    latex = model(img)
-    print(f"\nLaTeX: {latex}")
-    return latex
-```
-
-## Post-Processing and Validation
-
-### Common OCR Errors and Fixes
-
-| OCR Error | Correct LaTeX | Fix Strategy |
-|-----------|--------------|--------------|
-| `\Sigma` vs `\sum` | Context-dependent | Check if it is a summation or sigma variable |
-| Missing subscripts | `x_i` not `xi` | Verify variable names against source |
-| Wrong delimiter size | `\left( \right)` | Add `\left` and `\right` for auto-sizing |
-| Misrecognized symbols | `\theta` vs `\Theta` | Compare against original image |
-| Missing spaces | `\frac{a}{b}c` | Add spacing commands (`\,`, `\;`, `\quad`) |
-
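Some of the fixes in the table can be partly automated with a small corrections list. The sketch below is a rough illustration under stated assumptions: the two rules and the `apply_corrections` name are hypothetical, and a real list would be built from the errors observed in your own documents. The first rule rewrites `\Sigma` to `\sum` only when its subscript contains `=`, which is a heuristic for spotting summations, not a guarantee.

```python
import re

# Illustrative rules only; extend with errors seen in your own material.
CORRECTIONS = [
    # A capital sigma whose subscript contains "=" is very likely a summation.
    (re.compile(r"\\Sigma(?=_\{[^{}]*=)"), r"\\sum"),
    # Collapse runs of spaces or tabs that OCR models sometimes emit.
    (re.compile(r"[ \t]{2,}"), " "),
]


def apply_corrections(latex: str) -> str:
    """Apply each (pattern, replacement) rule in order and trim the result."""
    for pattern, replacement in CORRECTIONS:
        latex = pattern.sub(replacement, latex)
    return latex.strip()


print(apply_corrections(r"\Sigma_{i=1}^{N}  x_i"))
```

Symbol confusions such as `\theta` versus `\Theta` cannot be resolved by pattern rules like these and still need a comparison against the original image.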
-### Validation Script
-
-```python
-import subprocess
-import tempfile
-import os
-
-def validate_latex(latex_string):
-    """Check if a LaTeX string compiles without errors."""
-    doc = f"""
-\\documentclass{{article}}
-\\usepackage{{amsmath,amssymb}}
-\\begin{{document}}
-${latex_string}$
-\\end{{document}}
-"""
-
-    with tempfile.NamedTemporaryFile(mode="w", suffix=".tex", delete=False) as f:
-        f.write(doc)
-        tex_path = f.name
-
-    try:
-        result = subprocess.run(
-            ["pdflatex", "-interaction=nonstopmode", tex_path],
-            capture_output=True, text=True, timeout=10,
-            cwd=tempfile.gettempdir()
-        )
-        success = result.returncode == 0
-        if not success:
-            # Extract error message
-            for line in result.stdout.split("\n"):
-                if line.startswith("!"):
-                    print(f"LaTeX error: {line}")
-        return success
-    except subprocess.TimeoutExpired:
-        return False
-    finally:
-        for ext in [".tex", ".pdf", ".aux", ".log"]:
-            try:
-                os.remove(tex_path.replace(".tex", ext))
-            except FileNotFoundError:
-                pass
-
-# Test
-latex = r"\frac{\partial \mathcal{L}}{\partial \theta}"
-print(f"Valid: {validate_latex(latex)}")
-```
-
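Compiling with `pdflatex` is the authoritative check, but it is comparatively slow and needs a TeX installation. As a cheap pre-check that could run first, the sketch below tests whether curly braces are balanced, skipping escaped characters such as `\{` and `\}`. The `braces_balanced` name is an assumption for illustration, and passing this check does not imply the string compiles.

```python
def braces_balanced(latex: str) -> bool:
    """Return True if unescaped { and } are balanced and properly nested."""
    depth = 0
    i = 0
    while i < len(latex):
        ch = latex[i]
        if ch == "\\":
            # Skip the backslash and the character it escapes, e.g. \{ or \}.
            i += 2
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                # A closing brace appeared before its opening brace.
                return False
        i += 1
    return depth == 0


print(braces_balanced(r"\frac{a}{b}"))
print(braces_balanced(r"\frac{a}{b"))
```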
-## Integration with Note-Taking
-
-### Obsidian / Markdown Notes
-
-```markdown
-# Lecture Notes: Statistical Mechanics
-
-The partition function is defined as:
-
-$$Z = \sum_{i} e^{-\beta E_i}$$
-
-where $\beta = 1/k_B T$ is the inverse temperature.
-
-The free energy is:
-
-$$F = -k_B T \ln Z$$
-
-[OCR'd from slide 15 using LaTeX-OCR, confidence: 0.97]
-```
-
-### Automated Pipeline
-
-```python
-def process_lecture_slides(pdf_path, output_md):
-    """Convert lecture slides with formulas to Markdown notes."""
-    from pdf2image import convert_from_path
-    from pix2tex.cli import LatexOCR
-
-    model = LatexOCR()
-    images = convert_from_path(pdf_path, dpi=200)
-
-    with open(output_md, "w") as f:
-        f.write(f"# Notes from {pdf_path}\n\n")
-        for i, img in enumerate(images):
-            f.write(f"## Slide {i+1}\n\n")
-            # Full page text extraction (use Nougat or Mathpix for best results)
-            # For formula-only images, use LaTeX-OCR:
-            try:
-                latex = model(img)
-                f.write(f"$$\n{latex}\n$$\n\n")
-            except Exception as e:
-                f.write(f"[OCR failed: {e}]\n\n")
-
-    print(f"Notes saved to {output_md}")
-```
-
-## Best Practices
-
-1. **Crop tightly**: OCR accuracy improves significantly when the formula is cropped with minimal surrounding whitespace.
-2. **Use high resolution**: 200-300 DPI gives the best results. Lower resolution degrades recognition accuracy.
-3. **Validate output**: Always compile the generated LaTeX to verify correctness before using in a manuscript.
-4. **Handle multi-line equations**: For aligned equations, process each line separately or use a full-page model like Nougat.
-5. **Combine tools**: Use Mathpix for critical formulas and LaTeX-OCR for bulk processing to balance cost and quality.
-6. **Build a corrections dictionary**: Track common OCR errors for your domain and apply automated post-processing fixes.