@wentorai/research-plugins 1.0.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/LICENSE +21 -0
- package/README.md +204 -0
- package/curated/analysis/README.md +64 -0
- package/curated/domains/README.md +104 -0
- package/curated/literature/README.md +53 -0
- package/curated/research/README.md +62 -0
- package/curated/tools/README.md +87 -0
- package/curated/writing/README.md +61 -0
- package/index.ts +39 -0
- package/mcp-configs/academic-db/ChatSpatial.json +17 -0
- package/mcp-configs/academic-db/academia-mcp.json +17 -0
- package/mcp-configs/academic-db/academic-paper-explorer.json +17 -0
- package/mcp-configs/academic-db/academic-search-mcp-server.json +17 -0
- package/mcp-configs/academic-db/agentinterviews-mcp.json +17 -0
- package/mcp-configs/academic-db/all-in-mcp.json +17 -0
- package/mcp-configs/academic-db/apple-health-mcp.json +17 -0
- package/mcp-configs/academic-db/arxiv-latex-mcp.json +17 -0
- package/mcp-configs/academic-db/arxiv-mcp-server.json +17 -0
- package/mcp-configs/academic-db/bgpt-mcp.json +17 -0
- package/mcp-configs/academic-db/biomcp.json +17 -0
- package/mcp-configs/academic-db/biothings-mcp.json +17 -0
- package/mcp-configs/academic-db/catalysishub-mcp-server.json +17 -0
- package/mcp-configs/academic-db/clinicaltrialsgov-mcp-server.json +17 -0
- package/mcp-configs/academic-db/deep-research-mcp.json +17 -0
- package/mcp-configs/academic-db/dicom-mcp.json +17 -0
- package/mcp-configs/academic-db/enrichr-mcp-server.json +17 -0
- package/mcp-configs/academic-db/fec-mcp-server.json +17 -0
- package/mcp-configs/academic-db/fhir-mcp-server-themomentum.json +17 -0
- package/mcp-configs/academic-db/fhir-mcp.json +19 -0
- package/mcp-configs/academic-db/gget-mcp.json +17 -0
- package/mcp-configs/academic-db/google-researcher-mcp.json +17 -0
- package/mcp-configs/academic-db/idea-reality-mcp.json +17 -0
- package/mcp-configs/academic-db/legiscan-mcp.json +19 -0
- package/mcp-configs/academic-db/lex.json +17 -0
- package/mcp-configs/ai-platform/Adaptive-Graph-of-Thoughts-MCP-server.json +17 -0
- package/mcp-configs/ai-platform/ai-counsel.json +17 -0
- package/mcp-configs/ai-platform/atlas-mcp-server.json +17 -0
- package/mcp-configs/ai-platform/counsel-mcp.json +17 -0
- package/mcp-configs/ai-platform/cross-llm-mcp.json +17 -0
- package/mcp-configs/ai-platform/gptr-mcp.json +17 -0
- package/mcp-configs/browser/decipher-research-agent.json +17 -0
- package/mcp-configs/browser/deep-research.json +17 -0
- package/mcp-configs/browser/everything-claude-code.json +17 -0
- package/mcp-configs/browser/gpt-researcher.json +17 -0
- package/mcp-configs/browser/heurist-agent-framework.json +17 -0
- package/mcp-configs/data-platform/4everland-hosting-mcp.json +17 -0
- package/mcp-configs/data-platform/context-keeper.json +17 -0
- package/mcp-configs/data-platform/context7.json +19 -0
- package/mcp-configs/data-platform/contextstream-mcp.json +17 -0
- package/mcp-configs/data-platform/email-mcp.json +17 -0
- package/mcp-configs/note-knowledge/ApeRAG.json +17 -0
- package/mcp-configs/note-knowledge/In-Memoria.json +17 -0
- package/mcp-configs/note-knowledge/agent-memory.json +17 -0
- package/mcp-configs/note-knowledge/aimemo.json +17 -0
- package/mcp-configs/note-knowledge/biel-mcp.json +19 -0
- package/mcp-configs/note-knowledge/cognee.json +17 -0
- package/mcp-configs/note-knowledge/context-awesome.json +17 -0
- package/mcp-configs/note-knowledge/context-mcp.json +17 -0
- package/mcp-configs/note-knowledge/conversation-handoff-mcp.json +17 -0
- package/mcp-configs/note-knowledge/cortex.json +17 -0
- package/mcp-configs/note-knowledge/devrag.json +17 -0
- package/mcp-configs/note-knowledge/easy-obsidian-mcp.json +17 -0
- package/mcp-configs/note-knowledge/engram.json +17 -0
- package/mcp-configs/note-knowledge/gnosis-mcp.json +17 -0
- package/mcp-configs/note-knowledge/graphlit-mcp-server.json +19 -0
- package/mcp-configs/reference-mgr/arxiv-cli.json +17 -0
- package/mcp-configs/reference-mgr/arxiv-search-mcp.json +17 -0
- package/mcp-configs/reference-mgr/chiken.json +17 -0
- package/mcp-configs/reference-mgr/claude-scholar.json +17 -0
- package/mcp-configs/reference-mgr/devonthink-mcp.json +17 -0
- package/mcp-configs/registry.json +447 -0
- package/openclaw.plugin.json +21 -0
- package/package.json +61 -0
- package/skills/analysis/dataviz/color-accessibility-guide/SKILL.md +230 -0
- package/skills/analysis/dataviz/geospatial-viz-guide/SKILL.md +218 -0
- package/skills/analysis/dataviz/interactive-viz-guide/SKILL.md +287 -0
- package/skills/analysis/dataviz/network-visualization-guide/SKILL.md +195 -0
- package/skills/analysis/dataviz/publication-figures-guide/SKILL.md +238 -0
- package/skills/analysis/dataviz/python-dataviz-guide/SKILL.md +195 -0
- package/skills/analysis/econometrics/causal-inference-guide/SKILL.md +197 -0
- package/skills/analysis/econometrics/iv-regression-guide/SKILL.md +198 -0
- package/skills/analysis/econometrics/panel-data-guide/SKILL.md +274 -0
- package/skills/analysis/econometrics/robustness-checks/SKILL.md +250 -0
- package/skills/analysis/econometrics/stata-regression/SKILL.md +117 -0
- package/skills/analysis/econometrics/time-series-guide/SKILL.md +235 -0
- package/skills/analysis/statistics/bayesian-statistics-guide/SKILL.md +221 -0
- package/skills/analysis/statistics/hypothesis-testing-guide/SKILL.md +210 -0
- package/skills/analysis/statistics/meta-analysis-guide/SKILL.md +206 -0
- package/skills/analysis/statistics/nonparametric-tests-guide/SKILL.md +221 -0
- package/skills/analysis/statistics/power-analysis-guide/SKILL.md +240 -0
- package/skills/analysis/statistics/sem-guide/SKILL.md +231 -0
- package/skills/analysis/statistics/survival-analysis-guide/SKILL.md +195 -0
- package/skills/analysis/wrangling/missing-data-handling/SKILL.md +224 -0
- package/skills/analysis/wrangling/pandas-data-wrangling/SKILL.md +242 -0
- package/skills/analysis/wrangling/questionnaire-design-guide/SKILL.md +234 -0
- package/skills/analysis/wrangling/text-mining-guide/SKILL.md +225 -0
- package/skills/domains/ai-ml/computer-vision-guide/SKILL.md +213 -0
- package/skills/domains/ai-ml/deep-learning-papers-guide/SKILL.md +200 -0
- package/skills/domains/ai-ml/llm-evaluation-guide/SKILL.md +194 -0
- package/skills/domains/ai-ml/prompt-engineering-research/SKILL.md +233 -0
- package/skills/domains/ai-ml/reinforcement-learning-guide/SKILL.md +254 -0
- package/skills/domains/ai-ml/transformer-architecture-guide/SKILL.md +233 -0
- package/skills/domains/biomedical/clinical-research-guide/SKILL.md +232 -0
- package/skills/domains/biomedical/clinicaltrials-api/SKILL.md +177 -0
- package/skills/domains/biomedical/epidemiology-guide/SKILL.md +200 -0
- package/skills/domains/biomedical/genomics-analysis-guide/SKILL.md +270 -0
- package/skills/domains/business/market-analysis-guide/SKILL.md +112 -0
- package/skills/domains/business/strategic-management-guide/SKILL.md +154 -0
- package/skills/domains/chemistry/computational-chemistry-guide/SKILL.md +266 -0
- package/skills/domains/chemistry/retrosynthesis-guide/SKILL.md +215 -0
- package/skills/domains/cs/algorithms-complexity-guide/SKILL.md +194 -0
- package/skills/domains/cs/dblp-api/SKILL.md +129 -0
- package/skills/domains/cs/software-engineering-research/SKILL.md +218 -0
- package/skills/domains/ecology/biodiversity-data-guide/SKILL.md +296 -0
- package/skills/domains/ecology/conservation-biology-guide/SKILL.md +198 -0
- package/skills/domains/ecology/gbif-api/SKILL.md +158 -0
- package/skills/domains/ecology/inaturalist-api/SKILL.md +173 -0
- package/skills/domains/economics/behavioral-economics-guide/SKILL.md +239 -0
- package/skills/domains/economics/development-economics-guide/SKILL.md +181 -0
- package/skills/domains/economics/fred-api/SKILL.md +189 -0
- package/skills/domains/education/curriculum-design-guide/SKILL.md +144 -0
- package/skills/domains/education/learning-science-guide/SKILL.md +150 -0
- package/skills/domains/finance/financial-data-analysis/SKILL.md +152 -0
- package/skills/domains/finance/quantitative-finance-guide/SKILL.md +151 -0
- package/skills/domains/geoscience/climate-science-guide/SKILL.md +158 -0
- package/skills/domains/geoscience/gis-remote-sensing-guide/SKILL.md +129 -0
- package/skills/domains/humanities/digital-humanities-guide/SKILL.md +181 -0
- package/skills/domains/humanities/philosophy-research-guide/SKILL.md +148 -0
- package/skills/domains/law/courtlistener-api/SKILL.md +213 -0
- package/skills/domains/law/legal-research-guide/SKILL.md +250 -0
- package/skills/domains/math/linear-algebra-applications/SKILL.md +227 -0
- package/skills/domains/math/numerical-methods-guide/SKILL.md +236 -0
- package/skills/domains/math/oeis-api/SKILL.md +158 -0
- package/skills/domains/pharma/clinical-pharmacology-guide/SKILL.md +165 -0
- package/skills/domains/pharma/drug-development-guide/SKILL.md +177 -0
- package/skills/domains/physics/computational-physics-guide/SKILL.md +300 -0
- package/skills/domains/physics/nasa-ads-api/SKILL.md +150 -0
- package/skills/domains/physics/quantum-computing-guide/SKILL.md +234 -0
- package/skills/domains/social-science/social-research-methods/SKILL.md +194 -0
- package/skills/domains/social-science/survey-research-guide/SKILL.md +182 -0
- package/skills/literature/discovery/citation-alert-guide/SKILL.md +154 -0
- package/skills/literature/discovery/conference-proceedings-guide/SKILL.md +142 -0
- package/skills/literature/discovery/literature-mapping-guide/SKILL.md +175 -0
- package/skills/literature/discovery/paper-tracking-guide/SKILL.md +211 -0
- package/skills/literature/discovery/rss-paper-feeds/SKILL.md +214 -0
- package/skills/literature/discovery/semantic-scholar-recs-guide/SKILL.md +164 -0
- package/skills/literature/fulltext/doaj-api/SKILL.md +120 -0
- package/skills/literature/fulltext/interlibrary-loan-guide/SKILL.md +163 -0
- package/skills/literature/fulltext/open-access-guide/SKILL.md +183 -0
- package/skills/literature/fulltext/pmc-oai-api/SKILL.md +184 -0
- package/skills/literature/fulltext/preprint-servers-guide/SKILL.md +128 -0
- package/skills/literature/fulltext/repository-harvesting-guide/SKILL.md +207 -0
- package/skills/literature/fulltext/unpaywall-api/SKILL.md +113 -0
- package/skills/literature/metadata/altmetrics-guide/SKILL.md +132 -0
- package/skills/literature/metadata/citation-network-guide/SKILL.md +236 -0
- package/skills/literature/metadata/crossref-api/SKILL.md +133 -0
- package/skills/literature/metadata/datacite-api/SKILL.md +126 -0
- package/skills/literature/metadata/doi-resolution-guide/SKILL.md +168 -0
- package/skills/literature/metadata/h-index-guide/SKILL.md +183 -0
- package/skills/literature/metadata/journal-metrics-guide/SKILL.md +188 -0
- package/skills/literature/metadata/opencitations-api/SKILL.md +128 -0
- package/skills/literature/metadata/orcid-api/SKILL.md +136 -0
- package/skills/literature/metadata/orcid-integration-guide/SKILL.md +178 -0
- package/skills/literature/search/arxiv-api/SKILL.md +95 -0
- package/skills/literature/search/biorxiv-api/SKILL.md +123 -0
- package/skills/literature/search/boolean-search-guide/SKILL.md +199 -0
- package/skills/literature/search/citation-chaining-guide/SKILL.md +148 -0
- package/skills/literature/search/database-comparison-guide/SKILL.md +100 -0
- package/skills/literature/search/europe-pmc-api/SKILL.md +120 -0
- package/skills/literature/search/google-scholar-guide/SKILL.md +182 -0
- package/skills/literature/search/mesh-terms-guide/SKILL.md +164 -0
- package/skills/literature/search/openalex-api/SKILL.md +134 -0
- package/skills/literature/search/pubmed-api/SKILL.md +130 -0
- package/skills/literature/search/scientify-literature-survey/SKILL.md +203 -0
- package/skills/literature/search/semantic-scholar-api/SKILL.md +134 -0
- package/skills/literature/search/systematic-search-strategy/SKILL.md +214 -0
- package/skills/research/automation/ai-scientist-guide/SKILL.md +228 -0
- package/skills/research/automation/data-collection-automation/SKILL.md +248 -0
- package/skills/research/automation/research-workflow-automation/SKILL.md +266 -0
- package/skills/research/deep-research/meta-synthesis-guide/SKILL.md +174 -0
- package/skills/research/deep-research/research-cog/SKILL.md +153 -0
- package/skills/research/deep-research/scoping-review-guide/SKILL.md +217 -0
- package/skills/research/deep-research/systematic-review-guide/SKILL.md +250 -0
- package/skills/research/funding/figshare-api/SKILL.md +163 -0
- package/skills/research/funding/grant-writing-guide/SKILL.md +233 -0
- package/skills/research/funding/nsf-grant-guide/SKILL.md +206 -0
- package/skills/research/funding/open-science-guide/SKILL.md +255 -0
- package/skills/research/funding/zenodo-api/SKILL.md +174 -0
- package/skills/research/methodology/action-research-guide/SKILL.md +201 -0
- package/skills/research/methodology/experimental-design-guide/SKILL.md +236 -0
- package/skills/research/methodology/grad-school-guide/SKILL.md +182 -0
- package/skills/research/methodology/grounded-theory-guide/SKILL.md +171 -0
- package/skills/research/methodology/mixed-methods-guide/SKILL.md +208 -0
- package/skills/research/methodology/qualitative-research-guide/SKILL.md +234 -0
- package/skills/research/methodology/scientify-idea-generation/SKILL.md +222 -0
- package/skills/research/paper-review/paper-reading-assistant/SKILL.md +266 -0
- package/skills/research/paper-review/peer-review-guide/SKILL.md +227 -0
- package/skills/research/paper-review/rebuttal-writing-guide/SKILL.md +185 -0
- package/skills/research/paper-review/scientify-write-review-paper/SKILL.md +209 -0
- package/skills/tools/code-exec/jupyter-notebook-guide/SKILL.md +178 -0
- package/skills/tools/code-exec/python-reproducibility-guide/SKILL.md +341 -0
- package/skills/tools/code-exec/r-reproducibility-guide/SKILL.md +236 -0
- package/skills/tools/code-exec/sandbox-execution-guide/SKILL.md +221 -0
- package/skills/tools/diagram/mermaid-diagram-guide/SKILL.md +269 -0
- package/skills/tools/diagram/plantuml-guide/SKILL.md +397 -0
- package/skills/tools/diagram/scientific-illustration-guide/SKILL.md +225 -0
- package/skills/tools/document/anystyle-api/SKILL.md +199 -0
- package/skills/tools/document/grobid-pdf-parsing/SKILL.md +294 -0
- package/skills/tools/document/markdown-academic-guide/SKILL.md +217 -0
- package/skills/tools/document/pdf-extraction-guide/SKILL.md +321 -0
- package/skills/tools/knowledge-graph/knowledge-graph-construction/SKILL.md +306 -0
- package/skills/tools/knowledge-graph/ontology-design-guide/SKILL.md +214 -0
- package/skills/tools/knowledge-graph/rag-methodology-guide/SKILL.md +325 -0
- package/skills/tools/ocr-translate/formula-recognition-guide/SKILL.md +367 -0
- package/skills/tools/ocr-translate/handwriting-recognition-guide/SKILL.md +211 -0
- package/skills/tools/ocr-translate/latex-ocr-guide/SKILL.md +204 -0
- package/skills/tools/ocr-translate/multilingual-research-guide/SKILL.md +234 -0
- package/skills/tools/scraping/academic-web-scraping/SKILL.md +326 -0
- package/skills/tools/scraping/api-data-collection-guide/SKILL.md +301 -0
- package/skills/tools/scraping/web-scraping-ethics-guide/SKILL.md +250 -0
- package/skills/writing/citation/bibtex-management-guide/SKILL.md +246 -0
- package/skills/writing/citation/citation-style-guide/SKILL.md +248 -0
- package/skills/writing/citation/reference-manager-comparison/SKILL.md +208 -0
- package/skills/writing/citation/zotero-api/SKILL.md +188 -0
- package/skills/writing/composition/abstract-writing-guide/SKILL.md +188 -0
- package/skills/writing/composition/discussion-writing-guide/SKILL.md +194 -0
- package/skills/writing/composition/introduction-writing-guide/SKILL.md +194 -0
- package/skills/writing/composition/literature-review-writing/SKILL.md +196 -0
- package/skills/writing/composition/methods-section-guide/SKILL.md +185 -0
- package/skills/writing/composition/response-to-reviewers/SKILL.md +215 -0
- package/skills/writing/composition/scientific-writing-guide/SKILL.md +152 -0
- package/skills/writing/latex/bibliography-management-guide/SKILL.md +206 -0
- package/skills/writing/latex/latex-drawing-guide/SKILL.md +234 -0
- package/skills/writing/latex/latex-ecosystem-guide/SKILL.md +240 -0
- package/skills/writing/latex/math-typesetting-guide/SKILL.md +231 -0
- package/skills/writing/latex/overleaf-collaboration-guide/SKILL.md +211 -0
- package/skills/writing/latex/tikz-diagrams-guide/SKILL.md +211 -0
- package/skills/writing/polish/academic-translation-guide/SKILL.md +175 -0
- package/skills/writing/polish/academic-writing-refiner/SKILL.md +143 -0
- package/skills/writing/polish/ai-writing-humanizer/SKILL.md +178 -0
- package/skills/writing/polish/grammar-checker-guide/SKILL.md +184 -0
- package/skills/writing/polish/plagiarism-detection-guide/SKILL.md +167 -0
- package/skills/writing/templates/beamer-presentation-guide/SKILL.md +263 -0
- package/skills/writing/templates/conference-paper-template/SKILL.md +219 -0
- package/skills/writing/templates/thesis-template-guide/SKILL.md +200 -0
- package/skills/writing/templates/thesis-writing-guide/SKILL.md +220 -0
- package/src/tools/arxiv.ts +131 -0
- package/src/tools/crossref.ts +112 -0
- package/src/tools/openalex.ts +174 -0
- package/src/tools/pubmed.ts +166 -0
- package/src/tools/semantic-scholar.ts +108 -0
- package/src/tools/unpaywall.ts +58 -0
package/skills/analysis/wrangling/text-mining-guide/SKILL.md
@@ -0,0 +1,225 @@
---
name: text-mining-guide
description: "Apply NLP and text mining techniques to research text data"
metadata:
  openclaw:
    emoji: "mag"
    category: "analysis"
    subcategory: "wrangling"
    keywords: ["text mining", "NLP", "topic modeling", "sentiment analysis", "text preprocessing", "natural language processing"]
    source: "wentor-research-plugins"
---

# Text Mining Guide

A skill for applying natural language processing (NLP) and text mining techniques to research data. Covers text preprocessing, feature extraction, topic modeling, sentiment analysis, and named entity recognition for analyzing surveys, abstracts, social media, and document corpora.

## Text Preprocessing Pipeline

### Standard Cleaning Steps

```python
import re


def preprocess_text(text: str, lowercase: bool = True,
                    remove_numbers: bool = False,
                    min_word_length: int = 2) -> list[str]:
    """
    Preprocess text for NLP analysis.

    Args:
        text: Raw input text
        lowercase: Convert to lowercase
        remove_numbers: Remove numeric tokens
        min_word_length: Minimum token length to keep
    """
    if lowercase:
        text = text.lower()

    # Remove URLs
    text = re.sub(r"http\S+|www\.\S+", "", text)

    # Remove HTML tags
    text = re.sub(r"<[^>]+>", "", text)

    # Remove special characters (keep apostrophes for contractions)
    text = re.sub(r"[^a-zA-Z0-9\s']", " ", text)

    # Tokenize on whitespace
    tokens = text.split()

    if remove_numbers:
        tokens = [t for t in tokens if not t.isdigit()]

    # Remove short tokens
    tokens = [t for t in tokens if len(t) >= min_word_length]

    return tokens


def remove_stopwords(tokens: list[str],
                     custom_stopwords: list[str] | None = None) -> list[str]:
    """
    Remove stopwords from a token list.
    """
    # Minimal English stopwords (extend as needed)
    default_stops = {
        "the", "a", "an", "and", "or", "but", "in", "on", "at",
        "to", "for", "of", "with", "by", "is", "was", "are", "were",
        "be", "been", "being", "have", "has", "had", "do", "does",
        "did", "will", "would", "could", "should", "may", "might",
        "this", "that", "these", "those", "it", "its", "not", "no"
    }

    if custom_stopwords:
        default_stops.update(custom_stopwords)

    return [t for t in tokens if t not in default_stops]
```

### Document-Term Matrix

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer


def build_tfidf_matrix(documents: list[str],
                       max_features: int = 5000) -> dict:
    """
    Build a TF-IDF document-term matrix.

    Args:
        documents: List of document strings
        max_features: Maximum vocabulary size
    """
    vectorizer = TfidfVectorizer(
        max_features=max_features,
        stop_words="english",
        min_df=2,            # Appear in at least 2 documents
        max_df=0.95,         # Ignore terms in >95% of documents
        ngram_range=(1, 2)   # Unigrams and bigrams
    )

    tfidf_matrix = vectorizer.fit_transform(documents)

    # Rank terms by mean TF-IDF weight across documents
    # (vectorizer.vocabulary_ maps terms to column indices, not importance)
    mean_weights = np.asarray(tfidf_matrix.mean(axis=0)).ravel()
    feature_names = vectorizer.get_feature_names_out()
    top_indices = mean_weights.argsort()[::-1][:20]

    return {
        "matrix_shape": tfidf_matrix.shape,
        "vocabulary_size": len(vectorizer.vocabulary_),
        "top_terms": [feature_names[i] for i in top_indices],
        "vectorizer": vectorizer,
        "matrix": tfidf_matrix
    }
```

## Topic Modeling

### Latent Dirichlet Allocation (LDA)

```python
from sklearn.decomposition import LatentDirichletAllocation


def run_topic_model(doc_term_matrix, vectorizer,
                    n_topics: int = 10) -> list[dict]:
    """
    Run LDA topic modeling on a document-term matrix.

    Note: LDA is formulated for raw term counts, so a CountVectorizer
    matrix is the statistically standard input; TF-IDF input works
    mechanically but is a pragmatic variation.

    Args:
        doc_term_matrix: Sparse document-term matrix
        vectorizer: Fitted vectorizer (used for feature names)
        n_topics: Number of topics to extract
    """
    lda = LatentDirichletAllocation(
        n_components=n_topics,
        random_state=42,
        max_iter=50,
        learning_method="online"
    )
    lda.fit(doc_term_matrix)

    feature_names = vectorizer.get_feature_names_out()
    topics = []

    for idx, topic_weights in enumerate(lda.components_):
        top_indices = topic_weights.argsort()[-10:][::-1]
        top_words = [feature_names[i] for i in top_indices]
        topics.append({
            "topic_id": idx,
            "top_words": top_words,
            "label": "Assign a human-readable label based on top words"
        })

    return topics
```

### Choosing the Number of Topics

```
Methods for selecting k (number of topics):
- Coherence score: Higher is better (use gensim's CoherenceModel)
- Perplexity: Lower is better (but can overfit)
- Human judgment: Do topics make interpretive sense?
- Domain knowledge: Expected number of themes in the corpus

Practical advice:
- Start with k = 5, 10, 15, 20 and compare
- Examine top words for each k -- look for coherent themes
- If topics are too broad, increase k
- If topics overlap heavily, decrease k
```
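
The perplexity comparison above can be sketched with scikit-learn; the toy corpus and candidate k values below are purely illustrative:

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Toy corpus standing in for a real collection of abstracts
docs = [
    "gene expression analysis in cancer cells",
    "tumor gene mutation and cancer therapy",
    "deep learning models for image analysis",
    "neural network training and image models",
    "survey of climate change policy",
    "climate policy and carbon emission change",
]

# LDA is formulated for raw term counts, so use CountVectorizer here
counts = CountVectorizer().fit_transform(docs)

perplexities = {}
for k in [2, 3, 4]:
    lda = LatentDirichletAllocation(n_components=k, random_state=42,
                                    max_iter=20)
    lda.fit(counts)
    perplexities[k] = lda.perplexity(counts)  # lower is better

for k, p in perplexities.items():
    print(k, round(p, 1))
```

Perplexity alone can favor degenerate solutions, so pair the numbers with a reading of each candidate model's top words before settling on k.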

## Sentiment Analysis

### Lexicon-Based Approach

```python
def simple_sentiment(text: str, positive_words: set,
                     negative_words: set) -> dict:
    """
    Basic lexicon-based sentiment scoring.

    Note: plain word counting ignores negation ("not good") and
    intensifiers, so treat the score as a rough signal.

    Args:
        text: Input text
        positive_words: Set of positive sentiment words
        negative_words: Set of negative sentiment words
    """
    tokens = text.lower().split()

    pos_count = sum(1 for t in tokens if t in positive_words)
    neg_count = sum(1 for t in tokens if t in negative_words)
    total = len(tokens)

    score = (pos_count - neg_count) / max(total, 1)

    return {
        "positive_count": pos_count,
        "negative_count": neg_count,
        "score": score,
        "label": (
            "positive" if score > 0.05
            else "negative" if score < -0.05
            else "neutral"
        )
    }
```

## Research Applications

### Common Text Mining Tasks in Research

| Task | Method | Application |
|------|--------|-------------|
| Literature mapping | Topic modeling | Identify research themes in a corpus of abstracts |
| Survey analysis | Thematic coding + sentiment | Analyze open-ended survey responses |
| Social media analysis | NER + sentiment | Track public discourse on a topic |
| Content analysis | Classification + keyword extraction | Code qualitative data at scale |
| Bibliometrics | Co-word analysis | Map intellectual structure of a field |
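
The co-word analysis in the last row can be sketched with nothing but the standard library: count how often keyword pairs co-occur on the same paper. The keyword lists here are toy examples:

```python
from collections import Counter
from itertools import combinations

# Author keywords per paper (illustrative)
papers = [
    ["topic modeling", "lda", "text mining"],
    ["text mining", "sentiment analysis"],
    ["lda", "topic modeling"],
    ["sentiment analysis", "social media", "text mining"],
]

# Count each unordered keyword pair once per paper
cooccurrence = Counter()
for keywords in papers:
    for pair in combinations(sorted(set(keywords)), 2):
        cooccurrence[pair] += 1

# The most frequent pairs form the edges of a co-word network
for pair, count in cooccurrence.most_common(3):
    print(pair, count)
```

Feeding the pair counts into a graph library then gives the intellectual-structure map described in the table.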

## Validation and Reporting

Always validate text mining results against human judgment. Report preprocessing steps, parameter choices (e.g., number of topics, min_df, max_df), and model evaluation metrics. For topic models, include the top 10-15 words per topic and representative documents. For classification, report precision, recall, and F1 on a held-out test set. Acknowledge that automated text analysis supplements but does not replace close reading.
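
The precision/recall/F1 reporting above reduces to a few counts; a minimal self-contained sketch for one class (the labels are illustrative):

```python
def precision_recall_f1(y_true: list[str], y_pred: list[str],
                        positive: str) -> tuple[float, float, float]:
    """Compute precision, recall, and F1 for one class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred)
             if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred)
             if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


y_true = ["pos", "pos", "neg", "neg", "pos"]
y_pred = ["pos", "neg", "neg", "pos", "pos"]
print(precision_recall_f1(y_true, y_pred, "pos"))
```

In practice `sklearn.metrics.classification_report` gives the same numbers for every class at once; computing them by hand once makes the reported figures auditable.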
package/skills/domains/ai-ml/computer-vision-guide/SKILL.md
@@ -0,0 +1,213 @@
---
name: computer-vision-guide
description: "Apply computer vision research methods, models, and evaluation tools"
metadata:
  openclaw:
    emoji: "eye"
    category: "domains"
    subcategory: "ai-ml"
    keywords: ["computer vision", "image classification", "object detection", "CNN", "vision transformer", "deep learning"]
    source: "wentor-research-plugins"
---

# Computer Vision Guide

A skill for conducting computer vision research, covering model architectures, dataset preparation, training pipelines, evaluation metrics, and common experimental protocols for image classification, object detection, and segmentation tasks.

## Core Tasks and Architectures

### Computer Vision Task Taxonomy

```
Image Classification:
  Input: Single image
  Output: Class label(s)
  Models: ResNet, EfficientNet, ViT, ConvNeXt

Object Detection:
  Input: Single image
  Output: Bounding boxes + class labels
  Models: YOLO (v5-v9), Faster R-CNN, DETR, RT-DETR

Semantic Segmentation:
  Input: Single image
  Output: Per-pixel class label
  Models: U-Net, DeepLab, SegFormer, Mask2Former

Instance Segmentation:
  Input: Single image
  Output: Per-pixel labels distinguishing individual objects
  Models: Mask R-CNN, Mask2Former, SAM

Image Generation:
  Input: Text prompt or noise
  Output: Generated image
  Models: Stable Diffusion, DALL-E, Imagen
```

### Model Architecture Evolution

```
CNNs (Convolutional Neural Networks):
  LeNet (1998) -> AlexNet (2012) -> VGG (2014) -> ResNet (2015)
  -> EfficientNet (2019) -> ConvNeXt (2022)

Vision Transformers:
  ViT (2020) -> DeiT (2021) -> Swin Transformer (2021)
  -> BEiT (2021) -> DINOv2 (2023)

Trend: Transformers are competitive with CNNs at scale.
Hybrid architectures combining convolutions and attention are common.
```

## Dataset Preparation

### Building a Research Dataset

```python
import random
from pathlib import Path


def organize_image_dataset(source_dir: str,
                           split_ratios: dict | None = None) -> dict:
    """
    Plan train/val/test splits for an image dataset.

    Computes per-class split sizes; extend it to actually move or
    copy files once the counts look right.

    Args:
        source_dir: Directory containing class subdirectories
        split_ratios: Dict with 'train', 'val', 'test' ratios
    """
    if split_ratios is None:
        split_ratios = {"train": 0.7, "val": 0.15, "test": 0.15}

    random.seed(42)  # Reproducible shuffling

    stats = {}
    for class_dir in sorted(Path(source_dir).iterdir()):
        if not class_dir.is_dir():
            continue

        images = list(class_dir.glob("*.jpg")) + list(class_dir.glob("*.png"))
        random.shuffle(images)

        n = len(images)
        n_train = int(n * split_ratios["train"])
        n_val = int(n * split_ratios["val"])

        stats[class_dir.name] = {
            "total": n,
            "train": n_train,
            "val": n_val,
            "test": n - n_train - n_val
        }

    return stats
```

### Data Augmentation

```python
from torchvision import transforms


def get_training_transforms(img_size: int = 224) -> transforms.Compose:
    """
    Standard data augmentation pipeline for training.

    Args:
        img_size: Target image size
    """
    return transforms.Compose([
        transforms.RandomResizedCrop(img_size, scale=(0.8, 1.0)),
        transforms.RandomHorizontalFlip(p=0.5),
        transforms.ColorJitter(brightness=0.2, contrast=0.2,
                               saturation=0.2, hue=0.1),
        transforms.RandomRotation(15),
        transforms.ToTensor(),
        transforms.Normalize(
            mean=[0.485, 0.456, 0.406],  # ImageNet channel statistics
            std=[0.229, 0.224, 0.225]
        )
    ])
```

## Training Pipeline

### Transfer Learning Workflow

```python
import torch.nn as nn
from torchvision import models


def create_classifier(num_classes: int,
                      backbone: str = "resnet50",
                      pretrained: bool = True) -> nn.Module:
    """
    Create an image classifier using transfer learning.

    Args:
        num_classes: Number of target classes
        backbone: Model architecture name
        pretrained: Whether to use ImageNet-pretrained weights
    """
    if backbone == "resnet50":
        weights = models.ResNet50_Weights.DEFAULT if pretrained else None
        model = models.resnet50(weights=weights)
        # Replace the final fully connected layer
        model.fc = nn.Linear(model.fc.in_features, num_classes)
    elif backbone == "vit_b_16":
        weights = models.ViT_B_16_Weights.DEFAULT if pretrained else None
        model = models.vit_b_16(weights=weights)
        # torchvision's ViT exposes its classification head as heads.head
        model.heads.head = nn.Linear(
            model.heads.head.in_features, num_classes
        )
    else:
        raise ValueError(f"Unknown backbone: {backbone}")

    return model
```

## Evaluation Metrics

### Metrics by Task

```
Classification:
- Top-1 Accuracy: Fraction of correct predictions
- Top-5 Accuracy: Correct class in top 5 predictions
- Precision, Recall, F1: Per-class and macro-averaged
- Confusion Matrix: Visualize class-level errors

Object Detection:
- mAP (mean Average Precision): Standard COCO metric
- mAP@0.5: AP at IoU threshold 0.5
- mAP@0.5:0.95: AP averaged over IoU thresholds 0.5 to 0.95
- AP per class: Identifies weak categories

Segmentation:
- mIoU (mean Intersection over Union): Standard metric
- Pixel Accuracy: Fraction of correctly classified pixels
- Dice Coefficient: F1 score at the pixel level
```
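
Two of these metrics are easy to get subtly wrong in custom evaluation code. As a sanity reference, here is a minimal sketch of top-k accuracy and binary IoU in PyTorch (the function names are our own, not from any library):

```python
import torch


def topk_accuracy(logits: torch.Tensor, targets: torch.Tensor, k: int = 5) -> float:
    """Fraction of samples whose true class is among the k highest-scoring logits."""
    topk = logits.topk(k, dim=1).indices              # (N, k) predicted class indices
    hits = (topk == targets.unsqueeze(1)).any(dim=1)  # (N,) true class in top k?
    return hits.float().mean().item()


def binary_iou(pred: torch.Tensor, target: torch.Tensor) -> float:
    """Intersection over union for boolean masks; defined as 1.0 when both are empty."""
    inter = (pred & target).sum().item()
    union = (pred | target).sum().item()
    return inter / union if union > 0 else 1.0
```

mIoU for multi-class segmentation is then the mean of per-class IoUs computed the same way.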

## Reproducibility Checklist

### What to Report in Papers

```
1. Architecture: Exact model name, number of parameters
2. Pretraining: Dataset and weights used for initialization
3. Training: Optimizer, learning rate schedule, batch size, epochs
4. Augmentation: Full list of augmentations with parameters
5. Hardware: GPU type, number, training time
6. Evaluation: Exact metrics, test set version, evaluation protocol
7. Code: Link to repository with training and evaluation scripts
8. Random seeds: Report seeds used; ideally report mean over 3+ seeds
```
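
For item 8, a common seeding helper looks like the sketch below (the exact flags needed for full determinism vary by PyTorch version and hardware, so treat this as best-effort):

```python
import os
import random

import numpy as np
import torch


def set_seed(seed: int = 42) -> None:
    """Seed Python, NumPy, and PyTorch RNGs for (best-effort) reproducibility."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # no-op on CPU-only machines
    os.environ["PYTHONHASHSEED"] = str(seed)
    # Deterministic cuDNN kernels trade some speed for reproducibility
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```

Call `set_seed` once at the start of each run, and report the seed values alongside the results.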

## Ethical Considerations

When collecting or using image datasets, consider consent (especially for images of people), geographic and demographic representation, potential for bias amplification, and dual-use concerns. Document each dataset's composition and limitations, following the Datasheets for Datasets framework. For generative models, implement safeguards against generating harmful content.

@@ -0,0 +1,200 @@

---
name: deep-learning-papers-guide
description: "Annotated deep learning paper implementations with code walkthroughs"
metadata:
  openclaw:
    emoji: "🧠"
    category: "domains"
    subcategory: "ai-ml"
    keywords: ["deep learning", "neural network", "Transformer", "CNN", "NLP", "computer vision"]
    source: "https://github.com/labmlai/annotated_deep_learning_paper_implementations"
---

# Deep Learning Papers Guide

## Overview

Understanding deep learning architectures requires more than reading papers -- it requires reading and writing code. The annotated_deep_learning_paper_implementations repository (65,800+ stars) provides line-by-line annotated implementations of seminal deep learning papers in PyTorch, making it one of the most valuable learning resources in the field.

This guide organizes the key architectures by category, provides implementation patterns for the most important building blocks, and offers strategies for going from paper to working code. Whether you are implementing a Transformer variant for your research, understanding a GAN architecture for your experiments, or teaching a deep learning course, these patterns accelerate the process.

The focus is on practical understanding: what each component does, why it is designed that way, and how to implement it correctly in PyTorch.

## Core Architecture Families

### Transformer Architectures

The Transformer (Vaswani et al., 2017) is the foundation of modern NLP and, increasingly, of computer vision.

#### Multi-Head Self-Attention

```python
import math

import torch
import torch.nn as nn


class MultiHeadAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.d_model = d_model
        self.n_heads = n_heads
        self.d_k = d_model // n_heads

        self.W_q = nn.Linear(d_model, d_model)
        self.W_k = nn.Linear(d_model, d_model)
        self.W_v = nn.Linear(d_model, d_model)
        self.W_o = nn.Linear(d_model, d_model)

    def forward(self, query, key, value, mask=None):
        batch_size = query.size(0)

        # Linear projections and reshape to (batch, heads, seq, d_k)
        Q = self.W_q(query).view(batch_size, -1, self.n_heads, self.d_k).transpose(1, 2)
        K = self.W_k(key).view(batch_size, -1, self.n_heads, self.d_k).transpose(1, 2)
        V = self.W_v(value).view(batch_size, -1, self.n_heads, self.d_k).transpose(1, 2)

        # Scaled dot-product attention
        scores = torch.matmul(Q, K.transpose(-2, -1)) / math.sqrt(self.d_k)
        if mask is not None:
            scores = scores.masked_fill(mask == 0, float('-inf'))
        attn = torch.softmax(scores, dim=-1)
        context = torch.matmul(attn, V)

        # Concatenate heads and project
        context = context.transpose(1, 2).contiguous().view(batch_size, -1, self.d_model)
        return self.W_o(context)
```

#### Transformer Encoder Block

```python
class TransformerBlock(nn.Module):
    def __init__(self, d_model: int, n_heads: int, d_ff: int, dropout: float = 0.1):
        super().__init__()
        self.attention = MultiHeadAttention(d_model, n_heads)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.GELU(),
            nn.Dropout(dropout),
            nn.Linear(d_ff, d_model),
            nn.Dropout(dropout)
        )
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, mask=None):
        # Pre-norm variant (used in GPT-2, ViT, and most modern architectures)
        h = self.norm1(x)
        attn_out = self.attention(h, h, h, mask)
        x = x + self.dropout(attn_out)
        x = x + self.ffn(self.norm2(x))
        return x
```

### Convolutional Neural Networks

#### ResNet Bottleneck Block

```python
class BottleneckBlock(nn.Module):
    expansion = 4

    def __init__(self, in_channels, out_channels, stride=1, downsample=None):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, 3,
                               stride=stride, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        self.conv3 = nn.Conv2d(out_channels, out_channels * self.expansion, 1, bias=False)
        self.bn3 = nn.BatchNorm2d(out_channels * self.expansion)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample

    def forward(self, x):
        identity = x
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.relu(self.bn2(self.conv2(out)))
        out = self.bn3(self.conv3(out))
        if self.downsample is not None:
            identity = self.downsample(x)
        out += identity
        return self.relu(out)
```
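
The `downsample` argument is what keeps the residual addition shape-compatible when a block changes resolution or channel count: a 1x1 strided convolution projects the identity path. A sketch with illustrative values (256 input channels projected to 128 * 4 = 512 at stride 2, as at the start of a ResNet-50 stage):

```python
import torch
import torch.nn as nn

# 1x1 projection that matches the identity path to the block's output shape
downsample = nn.Sequential(
    nn.Conv2d(256, 128 * 4, kernel_size=1, stride=2, bias=False),
    nn.BatchNorm2d(128 * 4),
)

x = torch.randn(1, 256, 56, 56)
y = downsample(x)  # spatial size halves, channels become 512
```

Within a full ResNet, this module would be passed as the `downsample` argument of the first `BottleneckBlock` in each stage.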

## Key Architecture Comparison

| Architecture | Year | Parameters | Key Innovation | Primary Domain |
|--------------|------|------------|----------------|----------------|
| ResNet | 2015 | 25M (ResNet-50) | Skip connections | Vision |
| Transformer | 2017 | Varies | Self-attention | NLP |
| BERT | 2018 | 340M (Large) | Masked language modeling | NLP |
| GPT-2 | 2019 | 1.5B | Autoregressive generation | NLP |
| ViT | 2020 | 86M (Base) | Patch-based image tokenization | Vision |
| Diffusion | 2020 | Varies | Iterative denoising | Generation |
| LLaMA | 2023 | 7B-70B | Efficient open LLM | NLP |
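
Parameter counts like these can be sanity-checked from hyperparameters alone. A back-of-envelope count for BERT-Large (24 layers, d=1024, d_ff=4096, 30522-token vocabulary, 512 positions, 2 segment types), counting weights and biases:

```python
d, d_ff, layers = 1024, 4096, 24
vocab, positions, segments = 30522, 512, 2

embeddings = (vocab + positions + segments) * d + 2 * d   # + embedding LayerNorm
attention = 4 * (d * d + d)                               # Q, K, V, O projections
ffn = (d * d_ff + d_ff) + (d_ff * d + d)                  # two linear layers
norms = 2 * 2 * d                                         # two LayerNorms per layer
per_layer = attention + ffn + norms
pooler = d * d + d

total = embeddings + layers * per_layer + pooler
print(f"{total / 1e6:.0f}M")  # prints 335M; commonly rounded up to ~340M
```

The same arithmetic applied to ViT-Base or GPT-2 recovers the figures in the table to within a few percent.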

## Training Patterns

### Standard Training Loop

```python
def train_epoch(model, dataloader, optimizer, criterion, device):
    model.train()
    total_loss = 0.0
    for data, targets in dataloader:
        data, targets = data.to(device), targets.to(device)

        optimizer.zero_grad()
        outputs = model(data)
        loss = criterion(outputs, targets)
        loss.backward()

        # Gradient clipping (crucial for Transformers)
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)

        optimizer.step()
        total_loss += loss.item()

    return total_loss / len(dataloader)
```

### Learning Rate Scheduling

```python
# Cosine annealing with warmup (standard for Transformers)
from torch.optim.lr_scheduler import CosineAnnealingLR, LinearLR, SequentialLR

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=0.01)
warmup = LinearLR(optimizer, start_factor=0.01, total_iters=1000)
cosine = CosineAnnealingLR(optimizer, T_max=50000)
scheduler = SequentialLR(optimizer, schedulers=[warmup, cosine], milestones=[1000])
```
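
It is worth printing (or plotting) the resulting schedule before a long run, since off-by-one milestones are a common bug. A scaled-down sketch of the same composition (10 warmup steps, 100-step cosine period, a single dummy parameter):

```python
import torch
from torch.optim.lr_scheduler import CosineAnnealingLR, LinearLR, SequentialLR

p = torch.nn.Parameter(torch.zeros(1))
opt = torch.optim.AdamW([p], lr=3e-4)
warmup = LinearLR(opt, start_factor=0.01, total_iters=10)
cosine = CosineAnnealingLR(opt, T_max=100)
sched = SequentialLR(opt, schedulers=[warmup, cosine], milestones=[10])

lrs = []
for _ in range(110):
    opt.step()
    sched.step()
    lrs.append(sched.get_last_lr()[0])

# lrs rises linearly for the first 10 steps, then decays along a cosine curve
```

The full-scale schedule above behaves the same way with 1000 warmup steps and a 50000-step cosine period.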

## From Paper to Code: A Methodology

1. **Read the paper twice.** First pass for high-level understanding; second pass for implementation details.
2. **Identify the core algorithm.** Usually in Section 3 or 4 of the paper.
3. **List all hyperparameters.** Create a config dict before writing any code.
4. **Implement bottom-up.** Start with the smallest building blocks (attention, normalization), then compose.
5. **Test each component in isolation.** Verify tensor shapes and gradients at each level.
6. **Reproduce a known result first.** Match the paper's numbers on a small dataset before scaling.
7. **Use the official implementation as a reference,** but write your own code for understanding.
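
Step 5 in practice is often just a handful of assertions. A minimal sketch, using PyTorch's built-in `nn.MultiheadAttention` as the component under test:

```python
import torch
import torch.nn as nn

# Shape and gradient smoke test for a single component
layer = nn.MultiheadAttention(embed_dim=64, num_heads=8, batch_first=True)
x = torch.randn(2, 10, 64, requires_grad=True)

out, weights = layer(x, x, x)
assert out.shape == (2, 10, 64)      # output preserves (batch, seq, d_model)
assert weights.shape == (2, 10, 10)  # attention weights, averaged over heads

out.sum().backward()
assert x.grad is not None and torch.isfinite(x.grad).all()
```

The same pattern applies to any custom block: check output shapes, then check that gradients flow back to the inputs without NaNs.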

## Best Practices

- **Always verify tensor shapes.** Add assert statements for dimensions during development.
- **Use mixed precision training.** `torch.cuda.amp` often gives a roughly 2x speedup with minimal accuracy loss.
- **Log everything.** Use Weights & Biases or TensorBoard for experiment tracking.
- **Start small.** Debug on a tiny dataset before running on the full one.
- **Read the appendix.** Critical details (learning rates, initialization, data augmentation) are often in the supplementary material.
- **Join the community.** Papers With Code, Reddit r/MachineLearning, and Twitter/X are where implementation details are discussed.
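
A minimal mixed-precision step looks like the sketch below. It uses CPU autocast with bfloat16 so it runs anywhere; on a GPU you would pass `device_type="cuda"` and, when using float16, wrap the backward pass with a `GradScaler`:

```python
import torch
import torch.nn.functional as F

model = torch.nn.Linear(16, 4)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(8, 16), torch.randint(0, 4, (8,))

# Forward pass runs in reduced precision; parameters and the update stay in fp32
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    loss = F.cross_entropy(model(x), y)

opt.zero_grad()
loss.backward()
opt.step()
```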

## References

- [annotated_deep_learning_paper_implementations](https://github.com/labmlai/annotated_deep_learning_paper_implementations) -- Line-by-line annotated implementations (65,800+ stars)
- [Attention Is All You Need](https://arxiv.org/abs/1706.03762) -- Original Transformer paper
- [Deep Residual Learning](https://arxiv.org/abs/1512.03385) -- ResNet paper
- [The Illustrated Transformer](https://jalammar.github.io/illustrated-transformer/) -- Jay Alammar's visual guide
- [Papers With Code](https://paperswithcode.com/) -- Paper-implementation pairs with benchmarks