cnhkmcp 2.2.0__py3-none-any.whl → 2.3.0__py3-none-any.whl
- cnhkmcp/__init__.py +1 -1
- cnhkmcp/untracked/AI/321/206/320/261/320/234/321/211/320/255/320/262/321/206/320/237/320/242/321/204/342/225/227/342/225/242/README.md +1 -1
- cnhkmcp/untracked/AI/321/206/320/261/320/234/321/211/320/255/320/262/321/206/320/237/320/242/321/204/342/225/227/342/225/242/config.json +2 -2
- cnhkmcp/untracked/AI/321/206/320/261/320/234/321/211/320/255/320/262/321/206/320/237/320/242/321/204/342/225/227/342/225/242/main.py +1 -1
- cnhkmcp/untracked/AI/321/206/320/261/320/234/321/211/320/255/320/262/321/206/320/237/320/242/321/204/342/225/227/342/225/242/vector_db/chroma.sqlite3 +0 -0
- cnhkmcp/untracked/APP/Tranformer/Transformer.py +2 -2
- cnhkmcp/untracked/APP/Tranformer/transformer_config.json +1 -1
- cnhkmcp/untracked/APP/blueprints/feature_engineering.py +2 -2
- cnhkmcp/untracked/APP/blueprints/inspiration_house.py +4 -4
- cnhkmcp/untracked/APP/blueprints/paper_analysis.py +3 -3
- cnhkmcp/untracked/APP/give_me_idea/BRAIN_Alpha_Template_Expert_SystemPrompt.md +34 -73
- cnhkmcp/untracked/APP/give_me_idea/alpha_data_specific_template_master.py +2 -2
- cnhkmcp/untracked/APP/give_me_idea/what_is_Alpha_template.md +366 -1
- cnhkmcp/untracked/APP/static/inspiration.js +345 -13
- cnhkmcp/untracked/APP/templates/index.html +11 -3
- cnhkmcp/untracked/APP/templates/transformer_web.html +1 -1
- cnhkmcp/untracked/APP/trailSomeAlphas/README.md +38 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/ace.log +66 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/enhance_template.py +588 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/requirements.txt +3 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/run_pipeline.py +1001 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/run_pipeline_step_by_step.ipynb +5258 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-data-feature-engineering/OUTPUT_TEMPLATE.md +325 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-data-feature-engineering/SKILL.md +503 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-data-feature-engineering/examples.md +244 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-data-feature-engineering/output_report/ASI_delay1_analyst11_ideas.md +285 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-data-feature-engineering/reference.md +399 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/SKILL.md +40 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/config.json +6 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769709385783386000.json +388 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769709386274840400.json +131 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769709386838244700.json +1926 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769709387369198500.json +31 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769709387908905800.json +1926 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769709388486243600.json +240 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769709389024058600.json +1926 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769709389549608700.json +41 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769709390068714000.json +110 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769709390591996900.json +36 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769709391129137100.json +31 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769709391691643500.json +41 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769709392192099200.json +31 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769709392703423500.json +46 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769709393213729400.json +246 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710186683932500.json +388 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710187165414300.json +131 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710187665211700.json +1926 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710188149193400.json +31 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710188667627400.json +1926 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710189220822000.json +240 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710189726189500.json +1926 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710190248066100.json +41 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710190768298700.json +110 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710191282588100.json +36 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710191838960900.json +31 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710192396688000.json +41 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710192941922400.json +31 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710193473524600.json +46 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710194001961200.json +246 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710420975888800.json +46 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710421647590100.json +196 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710422131378500.json +5 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710422644184400.json +196 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710423702350600.json +196 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710424244661800.json +5 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_delay1.csv +211 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/final_expressions.json +7062 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/scripts/ace.log +3 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/scripts/ace_lib.py +1514 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/scripts/fetch_dataset.py +113 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/scripts/helpful_functions.py +180 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/scripts/implement_idea.py +236 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/scripts/merge_expression_list.py +90 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/scripts/parsetab.py +60 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/template_final_enhance/op/321/206/320/220/342/225/227/321/207/342/225/227/320/243.md +434 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/template_final_enhance/sample_prompt.md +62 -0
- cnhkmcp/untracked/APP/trailSomeAlphas/skills/template_final_enhance//321/205/320/235/320/245/321/205/320/253/320/260/321/205/320/275/320/240/321/206/320/220/320/255/321/210/320/220/320/223/321/211/320/220/342/225/227/321/210/342/225/233/320/241/321/211/320/243/342/225/233.md +354 -0
- cnhkmcp/untracked/APP/usage.md +2 -2
- cnhkmcp/untracked/APP//321/210/342/224/220/320/240/321/210/320/261/320/234/321/206/320/231/320/243/321/205/342/225/235/320/220/321/206/320/230/320/241.py +388 -8
- cnhkmcp/untracked/skills/alpha-expression-verifier/scripts/validator.py +889 -0
- cnhkmcp/untracked/skills/brain-feature-implementation/scripts/implement_idea.py +4 -3
- cnhkmcp/untracked/skills/brain-improve-alpha-performance/arXiv_API_Tool_Manual.md +490 -0
- cnhkmcp/untracked/skills/brain-improve-alpha-performance/reference.md +1 -1
- cnhkmcp/untracked/skills/brain-improve-alpha-performance/scripts/arxiv_api.py +229 -0
- cnhkmcp/untracked//321/211/320/225/320/235/321/207/342/225/234/320/276/321/205/320/231/320/235/321/210/342/224/220/320/240/321/210/320/261/320/234/321/206/320/230/320/241_/321/205/320/276/320/231/321/210/320/263/320/225/321/205/342/224/220/320/225/321/210/320/266/320/221/321/204/342/225/233/320/255/321/210/342/225/241/320/246/321/205/320/234/320/225.py +35 -11
- cnhkmcp/vector_db/_manifest.json +1 -0
- cnhkmcp/vector_db/_meta.json +1 -0
- {cnhkmcp-2.2.0.dist-info → cnhkmcp-2.3.0.dist-info}/METADATA +1 -1
- {cnhkmcp-2.2.0.dist-info → cnhkmcp-2.3.0.dist-info}/RECORD +96 -30
- /cnhkmcp/untracked/{skills/expression_verifier → APP/trailSomeAlphas/skills/brain-feature-implementation}/scripts/validator.py +0 -0
- /cnhkmcp/untracked/skills/{expression_verifier → alpha-expression-verifier}/SKILL.md +0 -0
- /cnhkmcp/untracked/skills/{expression_verifier → alpha-expression-verifier}/scripts/verify_expr.py +0 -0
- {cnhkmcp-2.2.0.dist-info → cnhkmcp-2.3.0.dist-info}/WHEEL +0 -0
- {cnhkmcp-2.2.0.dist-info → cnhkmcp-2.3.0.dist-info}/entry_points.txt +0 -0
- {cnhkmcp-2.2.0.dist-info → cnhkmcp-2.3.0.dist-info}/licenses/LICENSE +0 -0
- {cnhkmcp-2.2.0.dist-info → cnhkmcp-2.3.0.dist-info}/top_level.txt +0 -0
@@ -146,7 +146,8 @@ def main():
             expression_list.append(expr)
 
     # Save results to JSON (Always save for debugging)
-
+    # Use nanosecond precision to avoid collisions when called in a tight loop.
+    timestamp = time.time_ns()
     json_output = {
         "template": args.template,
         "expression_list": expression_list
@@ -154,8 +155,8 @@ def main():
 
     output_file = dataset_dir / f"idea_{timestamp}.json"
     try:
-        with open(output_file, 'w') as f:
-            json.dump(json_output, f, indent=4)
+        with open(output_file, 'w', encoding='utf-8') as f:
+            json.dump(json_output, f, indent=4, ensure_ascii=False)
         print(f"\nSaved idea configuration to: {output_file}")
     except Exception as e:
         print(f"Error saving JSON: {e}", file=sys.stderr)
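The hunk above switches the output filename to a `time.time_ns()` timestamp and writes the JSON as UTF-8 with `ensure_ascii=False`. A minimal sketch of both ideas follows. The `save_idea` helper, the temporary directory, and the sample values are assumptions made for illustration and are not part of the package.

```python
import json
import tempfile
import time
from pathlib import Path


def save_idea(dataset_dir: Path, template: str, expression_list: list) -> Path:
    """Hypothetical helper mirroring the patched save logic."""
    # Nanosecond timestamps make filename collisions unlikely in a tight loop,
    # though not impossible on platforms with a coarse system clock.
    timestamp = time.time_ns()
    output_file = dataset_dir / f"idea_{timestamp}.json"
    payload = {"template": template, "expression_list": expression_list}
    # UTF-8 plus ensure_ascii=False keeps non-ASCII text readable on disk
    # instead of writing \uXXXX escapes.
    with open(output_file, "w", encoding="utf-8") as f:
        json.dump(payload, f, indent=4, ensure_ascii=False)
    return output_file


with tempfile.TemporaryDirectory() as tmp:
    path = save_idea(Path(tmp), "模板 {x}", ["rank(close)"])
    raw_text = path.read_text(encoding="utf-8")
    loaded = json.loads(raw_text)
```

Passing `encoding='utf-8'` explicitly matters mostly on systems whose default text encoding is not UTF-8, where writing unescaped non-ASCII characters could otherwise fail or produce a file that other tools misread.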
@@ -0,0 +1,490 @@
+# 🔍 arXiv Paper Search & Download Tool
+
+A comprehensive Python tool for searching, analyzing, and downloading research papers from arXiv using their public API. Perfect for researchers, students, and anyone interested in academic papers.
+
+## 📋 Table of Contents
+
+- [Features](#-features)
+- [Installation](#-installation)
+- [Quick Start](#-quick-start)
+- [Usage Modes](#-usage-modes)
+- [API Functions](#-api-functions)
+- [Examples](#-examples)
+- [Advanced Usage](#-advanced-usage)
+- [Troubleshooting](#-troubleshooting)
+
+## ✨ Features
+
+- **🔍 Smart Search**: Search arXiv papers by title, author, abstract, or any keyword
+- **📥 Smart Download**: Download PDFs with automatic filename renaming to paper titles
+- **📊 Result Parsing**: Automatically extract structured information (title, authors, abstract, ID)
+- **🖥️ Interactive Mode**: Command-line interface for easy searching and downloading
+- **⚡ Batch Operations**: Search multiple papers and download in sequence
+- **📈 Academic Research**: Perfect for literature reviews and research discovery
+- **🔄 Auto-Rename**: Downloaded files are automatically named using paper titles instead of cryptic IDs
+
+## 🚀 Installation
+
+### Prerequisites
+- Python 3.6 or higher
+- Internet connection for API access
+
+### Install Dependencies
+```bash
+pip install requests
+```
+
+### Download the Script
+```bash
+# Clone or download arxiv_api.py to your working directory
+```
+
+## 🎯 Quick Start
+
+### Basic Search
+```bash
+python arxiv_api.py "machine learning"
+```
+
+### Search with Custom Results
+```bash
+python arxiv_api.py "quantum computing" -n 10
+```
+
+### Search and Download First Result
+```bash
+python arxiv_api.py "deep learning" -d
+```
+
+### Interactive Mode
+```bash
+python arxiv_api.py -i
+```
+
+### Download Paper by ID (with auto-rename)
+```bash
+# In interactive mode:
+# 📚 arxiv> download 2502.05218v1
+# This will automatically rename the file to the paper's title
+```
+
+## 🎮 Usage Modes
+
+### 1. Command Line Mode
+Direct search queries from the command line.
+
+**Syntax:**
+```bash
+python arxiv_api.py [query] [options]
+```
+
+**Options:**
+- `-n, --max_results`: Maximum number of results (default: 5)
+- `-d, --download`: Download the first result automatically
+- `-i, --interactive`: Start interactive mode
+- `-h, --help`: Show help message
+
+### 2. Interactive Mode
+Interactive command-line interface for multiple operations.
+
+**Commands:**
+- `search <query> [max_results]`: Search for papers
+- `download <paper_id>`: Download a specific paper (with auto-rename)
+- `help`: Show available commands
+- `quit/exit`: Exit the program
+
+## 🔧 API Functions
+
+### Core Functions
+
+#### `search_arxiv(query, max_results=10)`
+Searches arXiv for papers using the public API.
+
+**Parameters:**
+- `query` (str): Search query string
+- `max_results` (int): Maximum number of results (default: 10)
+
+**Returns:**
+- `str`: XML response from arXiv API
+
+**Example:**
+```python
+from arxiv_api import search_arxiv
+
+results = search_arxiv("artificial intelligence", max_results=5)
+```
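The diff does not show how `search_arxiv` builds its request, so the following is only a sketch of how a search URL for the endpoint named in this manual could be composed. The `build_search_url` helper is hypothetical; `search_query`, `start`, and `max_results` are the query parameters documented for the arXiv API, and the `all:` field prefix in the sample call is an assumption about how a plain keyword search is usually written.

```python
from urllib.parse import urlencode

# Endpoint as listed in the manual's API Reference section.
ARXIV_API_URL = "http://export.arxiv.org/api/query"


def build_search_url(query: str, max_results: int = 10, start: int = 0) -> str:
    """Hypothetical helper: compose an arXiv API search URL."""
    params = {
        "search_query": query,      # passed through verbatim, e.g. "all:..." or "au:..."
        "start": start,             # zero-based offset of the first result
        "max_results": max_results,
    }
    # urlencode percent-encodes the colon and turns spaces into '+'.
    return f"{ARXIV_API_URL}?{urlencode(params)}"


url = build_search_url("all:artificial intelligence", max_results=5)
print(url)
```

Fetching `url` with `requests.get(url).text` would then yield the Atom XML string that the manual describes as the return value.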
+
+#### `get_paper_metadata(paper_id)`
+Fetches paper metadata directly from arXiv API using paper ID.
+
+**Parameters:**
+- `paper_id` (str): arXiv paper ID (e.g., "2502.05218v1")
+
+**Returns:**
+- `dict`: Paper information dictionary, or `None` if not found
+
+**Example:**
+```python
+from arxiv_api import get_paper_metadata
+
+paper_info = get_paper_metadata("2502.05218v1")
+if paper_info:
+    print(f"Title: {paper_info['title']}")
+    print(f"Authors: {', '.join(paper_info['authors'])}")
+```
+
+#### `download_paper(paper_id, output_dir=".", paper_title=None)`
+Downloads a specific paper by its arXiv ID and automatically renames it to the paper title.
+
+**Parameters:**
+- `paper_id` (str): arXiv paper ID (e.g., "2502.05218v1")
+- `output_dir` (str): Output directory (default: current directory)
+- `paper_title` (str): Paper title for filename (optional, will be fetched automatically if not provided)
+
+**Returns:**
+- `str`: File path of downloaded PDF, or `None` if failed
+
+**Features:**
+- **Auto-rename**: Automatically renames downloaded files to paper titles
+- **Smart cleaning**: Removes special characters and limits filename length
+- **Fallback**: Uses paper ID if title is unavailable
+
+**Example:**
+```python
+from arxiv_api import download_paper
+
+# Download with automatic title fetching and renaming
+filepath = download_paper("2502.05218v1")
+
+# Download with custom title
+filepath = download_paper("2502.05218v1", paper_title="My Custom Title")
+```
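The renaming rules listed for `download_paper` (strip special characters, cap the length, fall back to the paper ID) can be sketched as below. This is not the code in `arxiv_api.py`, which this diff does not show; `title_to_filename` is a hypothetical name, and the 100-character default is taken from the manual's troubleshooting note.

```python
import re


def title_to_filename(title, paper_id, max_length=100):
    """Hypothetical helper: turn a paper title into a safe PDF filename."""
    # Keep word characters, whitespace and hyphens; drop everything else.
    cleaned = re.sub(r"[^\w\s-]", "", title or "")
    # Collapse runs of whitespace into single underscores.
    cleaned = re.sub(r"\s+", "_", cleaned.strip())
    # Cap the stem length and avoid ending on a dangling underscore.
    cleaned = cleaned[:max_length].rstrip("_")
    # Fall back to the arXiv ID when no usable title remains.
    return f"{cleaned or paper_id}.pdf"


print(title_to_filename("FactorGCL: A Hypergraph-Based Factor Model", "2502.05218v1"))
print(title_to_filename(None, "2502.05218v1"))
```

Capping the stem rather than the full name keeps the `.pdf` extension intact even for very long titles.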
+
+#### `parse_search_results(xml_content)`
+Parses XML search results and extracts structured paper information.
+
+**Parameters:**
+- `xml_content` (str): XML response from arXiv API
+
+**Returns:**
+- `list`: List of dictionaries containing paper information
+
+**Paper Information Structure:**
+```python
+{
+    'title': 'Paper Title',
+    'authors': ['Author 1', 'Author 2'],
+    'abstract': 'Paper abstract...',
+    'paper_id': '2502.05218v1',
+    'published': '2025-02-05T12:37:15Z'
+}
+```
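To make the structure above concrete, here is an illustrative parser for an Atom feed of the kind the arXiv API returns, written with the standard library. It is a sketch and not the package's `parse_search_results`; `parse_feed` and `SAMPLE_FEED` are hypothetical, and the sample feed is hand-written and trimmed to the five fields the manual lists.

```python
import xml.etree.ElementTree as ET

# Atom namespace used by the feed elements.
ATOM = {"atom": "http://www.w3.org/2005/Atom"}

# Hand-written sample, not a real API response.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>http://arxiv.org/abs/2502.05218v1</id>
    <published>2025-02-05T12:37:15Z</published>
    <title>Paper
      Title</title>
    <summary>  Paper abstract...  </summary>
    <author><name>Author 1</name></author>
    <author><name>Author 2</name></author>
  </entry>
</feed>"""


def parse_feed(xml_content):
    """Illustrative parser returning one dict per <entry> element."""
    root = ET.fromstring(xml_content)
    papers = []
    for entry in root.findall("atom:entry", ATOM):
        abs_url = entry.findtext("atom:id", default="", namespaces=ATOM)
        title = entry.findtext("atom:title", default="", namespaces=ATOM)
        papers.append({
            # Titles may wrap across lines in the feed; normalise whitespace.
            "title": " ".join(title.split()),
            "authors": [a.findtext("atom:name", default="", namespaces=ATOM)
                        for a in entry.findall("atom:author", ATOM)],
            "abstract": entry.findtext("atom:summary", default="", namespaces=ATOM).strip(),
            # The entry id is the abstract URL; the paper ID follows "/abs/".
            "paper_id": abs_url.split("/abs/")[-1],
            "published": entry.findtext("atom:published", default="", namespaces=ATOM),
        })
    return papers


papers = parse_feed(SAMPLE_FEED)
print(papers[0])
```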
+
+#### `search_and_download(query, max_results=5, download_first=False)`
+Combined function that searches for papers and optionally downloads the first result.
+
+**Parameters:**
+- `query` (str): Search query string
+- `max_results` (int): Maximum number of results (default: 5)
+- `download_first` (bool): Whether to download first result (default: False)
+
+**Example:**
+```python
+from arxiv_api import search_and_download
+
+# Search and display results only
+search_and_download("machine learning", max_results=3)
+
+# Search and download first result (with auto-rename)
+search_and_download("deep learning", max_results=5, download_first=True)
+```
+
+### Interactive Mode Functions
+
+#### `interactive_mode()`
+Starts the interactive command-line interface.
+
+**Features:**
+- Command history
+- Error handling
+- User-friendly prompts
+- Multiple search sessions
+- **Smart download with auto-rename**
+
+## 📚 Examples
+
+### Example 1: Basic Paper Search
+```bash
+# Search for machine learning papers
+python arxiv_api.py "machine learning"
+
+# Output:
+# Searching arXiv for: 'machine learning'
+# --------------------------------------------------
+# Found 5 papers:
+#
+# 1. Title: Introduction to Machine Learning
+# Authors: John Doe, Jane Smith
+# Paper ID: 2103.12345
+# Published: 2021-03-15T10:30:00Z
+# Abstract: This paper introduces...
+```
+
+### Example 2: Search with Custom Results
+```bash
+# Get 10 results for quantum computing
+python arxiv_api.py "quantum computing" -n 10
+```
+
+### Example 3: Search and Download (with auto-rename)
+```bash
+# Search for papers and download the first one
+python arxiv_api.py "artificial intelligence" -d
+# Downloaded file will be automatically renamed to the paper title
+```
+
+### Example 4: Interactive Mode with Smart Download
+```bash
+python arxiv_api.py -i
+
+# 📚 arxiv> search blockchain finance 5
+# 📚 arxiv> download 2502.05218v1
+# Fetching paper information for 2502.05218v1...
+# Found paper: FactorGCL: A Hypergraph-Based Factor Model...
+# Downloaded: .\FactorGCL_A_Hypergraph-Based_Factor_Model...pdf
+# 📚 arxiv> help
+# 📚 arxiv> quit
+```
+
+### Example 5: Python Script Integration
+```python
+from arxiv_api import search_and_download, download_paper, get_paper_metadata
+
+# Search for papers on a specific topic
+search_and_download("quantitative finance China", max_results=3)
+
+# Download a specific paper with auto-rename
+download_paper("2502.05218v1")
+
+# Get paper metadata
+paper_info = get_paper_metadata("2502.05218v1")
+if paper_info:
+    print(f"Title: {paper_info['title']}")
+```
+
+## 🔍 Advanced Usage
+
+### Smart Download Features
+
+#### Automatic Filename Generation
+```python
+from arxiv_api import download_paper
+
+# The tool automatically:
+# 1. Fetches paper metadata
+# 2. Extracts the title
+# 3. Cleans the title for filename use
+# 4. Downloads and renames the file
+
+# Example output filename:
+# "FactorGCL_A_Hypergraph-Based_Factor_Model_with_Temporal_Residual_Contrastive_Learning_for_Stock_Returns_Prediction.pdf"
+```
+
+#### Custom Search Queries
+
+##### Field-Specific Searches
+```bash
+# Search by author
+python arxiv_api.py "au:Yann LeCun"
+
+# Search by title
+python arxiv_api.py "ti:deep learning"
+
+# Search by abstract
+python arxiv_api.py "abs:neural networks"
+
+# Search by category
+python arxiv_api.py "cat:cs.AI"
+```
+
+##### Complex Queries
+```bash
+# Multiple terms
+python arxiv_api.py "machine learning AND neural networks"
+
+# Exclude terms
+python arxiv_api.py "deep learning NOT reinforcement"
+
+# Date range
+python arxiv_api.py "machine learning AND submittedDate:[20230101 TO 20231231]"
+```
+
+### Batch Operations
+
+#### Download Multiple Papers with Auto-Rename
+```python
+from arxiv_api import search_arxiv, parse_search_results, download_paper
+
+# Search for papers
+query = "quantum computing"
+results = search_arxiv(query, max_results=10)
+papers = parse_search_results(results)
+
+# Download all papers (each will be automatically renamed)
+for paper in papers:
+    paper_id = paper.get('paper_id')
+    if paper_id:
+        download_paper(paper_id, output_dir="./quantum_papers")
+```
+
+#### Custom Output Formatting
+```python
+from arxiv_api import search_and_download
+
+# Custom display function
+def custom_display(papers):
+    for i, paper in enumerate(papers, 1):
+        print(f"📄 Paper {i}: {paper['title']}")
+        print(f"👥 Authors: {', '.join(paper['authors'])}")
+        print(f"🆔 ID: {paper['paper_id']}")
+        print(f"📅 Date: {paper['published']}")
+        print(f"📝 Abstract: {paper['abstract'][:150]}...")
+        print("-" * 80)
+
+# Use custom display
+search_and_download("blockchain", max_results=3)
+```
+
+## 🛠️ Troubleshooting
+
+### Common Issues
+
+#### 1. No Results Found
+**Problem:** Search returns no papers
+**Solution:**
+- Check spelling and use broader terms
+- Try different keyword combinations
+- Verify internet connection
+
+#### 2. Download Failed
+**Problem:** Paper download fails
+**Solution:**
+- Verify paper ID is correct
+- Check if paper exists on arXiv
+- Ensure write permissions in output directory
+
+#### 3. API Rate Limiting
+**Problem:** Too many requests
+**Solution:**
+- Wait between requests
+- Reduce batch size
+- Use interactive mode for multiple searches
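The "wait between requests" advice can be sketched as a small throttle that enforces a minimum gap between calls. The `Throttle` class is hypothetical and not part of `arxiv_api.py`. The three-second interval reflects arXiv's published API guidance as best understood here; treat the exact value as an assumption to verify against the current terms of use.

```python
import time


class Throttle:
    """Hypothetical helper: enforce a minimum gap between successive calls."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable for testing
        self._sleep = sleep      # injectable for testing
        self._last = None        # time of the previous call, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Assumed interval; adjust to whatever the API terms currently require.
throttle = Throttle(min_interval=3.0)
```

In a batch loop, calling `throttle.wait()` immediately before each `download_paper(...)` or `search_arxiv(...)` call spaces the requests out without slowing the first one.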
+
+#### 4. XML Parsing Errors
+**Problem:** Error parsing search results
+**Solution:**
+- Check internet connection
+- Verify API response format
+- Update the script if needed
+
+#### 5. Filename Too Long
+**Problem:** Generated filename exceeds system limits
+**Solution:**
+- The tool automatically limits filenames to 100 characters
+- Special characters are automatically cleaned
+- Fallback to paper ID if title is unavailable
+
+### Error Messages
+
+```
+Error: Failed to download paper 2502.05218v1
+```
+- Paper ID may not exist
+- Network connection issue
+- arXiv server problem
+
+```
+Error parsing XML: ...
+```
+- Malformed API response
+- Network interruption
+- API format change
+
+```
+Could not find paper information for 2502.05218v1
+```
+- Paper ID may be invalid
+- arXiv API issue
+- Network connectivity problem
+
+## 📖 API Reference
+
+### arXiv API Endpoints
+- **Search API**: `http://export.arxiv.org/api/query`
+- **Metadata API**: `http://export.arxiv.org/api/query?id_list={paper_id}`
+- **Documentation**: https://arxiv.org/help/api
+- **Rate Limits**: Be respectful, avoid excessive requests
+
+### Data Fields Available
+- **Title**: Paper title
+- **Authors**: List of author names
+- **Abstract**: Paper abstract
+- **Paper ID**: Unique arXiv identifier
+- **Published Date**: Publication timestamp
+- **Categories**: arXiv subject categories
+
+### Paper ID Format
+- **Format**: `YYMM.NNNNNvN`
+- **Example**: `2502.05218v1`
+- **Download URL**: `https://arxiv.org/pdf/{paper_id}.pdf`
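The ID format and download URL above lend themselves to a quick validation step before any network call. The sketch below is an illustration, not code from `arxiv_api.py`; `ARXIV_ID_RE` and `pdf_url` are hypothetical names. Two choices go beyond the manual and are assumptions: the version suffix is treated as optional, and four-digit sequence numbers are accepted because arXiv identifiers issued before 2015 used them.

```python
import re

# YYMM, a dot, a 4- or 5-digit sequence number, then an optional version suffix.
ARXIV_ID_RE = re.compile(r"^(?P<yymm>\d{4})\.(?P<number>\d{4,5})(?:v(?P<version>\d+))?$")


def pdf_url(paper_id):
    """Return the PDF URL for a new-style arXiv ID, or raise ValueError."""
    if not ARXIV_ID_RE.match(paper_id):
        raise ValueError(f"Not a new-style arXiv ID: {paper_id!r}")
    # URL template as given in the manual.
    return f"https://arxiv.org/pdf/{paper_id}.pdf"


print(pdf_url("2502.05218v1"))
```

Rejecting malformed IDs early gives a clearer message than the generic download failure listed under Error Messages.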
+
+### Smart Download Features
+- **Automatic Metadata Fetching**: Gets paper information before download
+- **Intelligent Filename Generation**: Converts paper titles to valid filenames
+- **Character Cleaning**: Removes special characters and spaces
+- **Length Limiting**: Ensures filenames don't exceed system limits
+- **Fallback Naming**: Uses paper ID if title is unavailable
+
+## 🤝 Contributing
+
+### Adding New Features
+1. Fork the repository
+2. Create a feature branch
+3. Implement your changes
+4. Add tests and documentation
+5. Submit a pull request
+
+### Reporting Issues
+- Check existing issues first
+- Provide detailed error messages
+- Include system information
+- Describe steps to reproduce
+
+## 📄 License
+
+This project is open source and available under the MIT License.
+
+## 🙏 Acknowledgments
+
+- **arXiv**: For providing the public API
+- **Python Community**: For excellent libraries and tools
+- **Researchers**: For contributing to open science
+
+## 📞 Support
+
+### Getting Help
+- Check this documentation first
+- Review the examples section
+- Search existing issues
+- Create a new issue for bugs
+
+### Useful Links
+- [arXiv Official Site](https://arxiv.org/)
+- [arXiv API Documentation](https://arxiv.org/help/api)
+- [Python Requests Library](https://requests.readthedocs.io/)
+
+---
+
+**Happy Researching! 🎓📚**
+
+*This tool makes academic research more accessible and efficient. Use it responsibly and respect arXiv's terms of service.*
@@ -5,7 +5,7 @@ This document outlines a systematic, repeatable workflow for enhancing alphas on
 ## Prerequisites
 - Authenticate with BRAIN (e.g., via API tool).
 - Have the alpha ID and expression ready.
-- Access to arXiv script (e.g., `arxiv_api.py`) for idea sourcing.
+- Access to arXiv script (e.g., `arxiv_api.py`) for idea sourcing, for the usage of the script, please refer to [arXiv_API_Tool_Manual.md].
 - Track progress in a log (e.g., metrics table per iteration).

 ## Step 1: Gather Alpha Information (5-10 minutes)