@yeyuan98/opencode-bioresearcher-plugin 1.3.1 → 1.4.1

Files changed (36)
  1. package/README.md +14 -0
  2. package/dist/index.js +4 -1
  3. package/dist/misc-tools/index.d.ts +3 -0
  4. package/dist/misc-tools/index.js +3 -0
  5. package/dist/misc-tools/json-extract.d.ts +13 -0
  6. package/dist/misc-tools/json-extract.js +394 -0
  7. package/dist/misc-tools/json-infer.d.ts +13 -0
  8. package/dist/misc-tools/json-infer.js +199 -0
  9. package/dist/misc-tools/json-tools.d.ts +33 -0
  10. package/dist/misc-tools/json-tools.js +187 -0
  11. package/dist/misc-tools/json-validate.d.ts +13 -0
  12. package/dist/misc-tools/json-validate.js +228 -0
  13. package/dist/skills/bioresearcher-core/README.md +210 -0
  14. package/dist/skills/bioresearcher-core/SKILL.md +128 -0
  15. package/dist/skills/bioresearcher-core/examples/contexts.json +29 -0
  16. package/dist/skills/bioresearcher-core/examples/data-exchange-example.md +303 -0
  17. package/dist/skills/bioresearcher-core/examples/template.md +49 -0
  18. package/dist/skills/bioresearcher-core/patterns/calculator.md +215 -0
  19. package/dist/skills/bioresearcher-core/patterns/data-exchange.md +406 -0
  20. package/dist/skills/bioresearcher-core/patterns/json-tools.md +263 -0
  21. package/dist/skills/bioresearcher-core/patterns/progress.md +127 -0
  22. package/dist/skills/bioresearcher-core/patterns/retry.md +110 -0
  23. package/dist/skills/bioresearcher-core/patterns/shell-commands.md +79 -0
  24. package/dist/skills/bioresearcher-core/patterns/subagent-waves.md +186 -0
  25. package/dist/skills/bioresearcher-core/patterns/table-tools.md +260 -0
  26. package/dist/skills/bioresearcher-core/patterns/user-confirmation.md +187 -0
  27. package/dist/skills/bioresearcher-core/python/template.md +273 -0
  28. package/dist/skills/bioresearcher-core/python/template.py +323 -0
  29. package/dist/skills/long-table-summary/SKILL.md +374 -0
  30. package/dist/skills/long-table-summary/__init__.py +3 -0
  31. package/dist/skills/long-table-summary/combine_outputs.py +345 -0
  32. package/dist/skills/long-table-summary/pyproject.toml +11 -0
  33. package/dist/skills/pubmed-weekly/SKILL.md +329 -329
  34. package/dist/skills/pubmed-weekly/pubmed_weekly.py +411 -411
  35. package/dist/skills/pubmed-weekly/pyproject.toml +8 -8
  36. package/package.json +7 -2
@@ -1,329 +1,329 @@
1
- ---
2
- name: pubmed-weekly
3
- description: Download PubMed daily update xml.gz files from the past week from NCBI FTP server
4
- allowedTools:
5
- - Bash
6
- - Read
7
- - Write
8
- - Question
9
- - parse_pubmed_articleSet
10
- ---
11
-
12
- # PubMed Weekly Daily Updates Download
13
-
14
- This skill downloads PubMed daily update xml.gz files from the past week (Monday-Sunday).
15
-
16
- ## Workflow Overview
17
-
18
- 1. **Python Environment Setup** (automatic): Checks for uv, installs via `python-setup-uv` skill if needed
19
- 2. **Date Calculation**: Calculates the past week's date range (Monday-Sunday)
20
- 3. **FTP Listing**: Fetches available xml.gz files from NCBI FTP server
21
- 4. **Filtering**: Filters files to include only those from the past week
22
- 5. **Download**: Downloads filtered files with retry logic (max 3 attempts per file)
23
-
24
- ## Prerequisites
25
- - Internet connection
26
- - Access to NCBI FTP server
27
- - uv package manager (will be automatically installed if not present)
28
-
29
- ## Integration with python-setup-uv
30
-
31
- This skill integrates with the `python-setup-uv` skill to ensure Python environment is properly configured.
32
-
33
- ### Prerequisite Check
34
-
35
- Before starting the download process:
36
-
37
- 1. **Check if uv is installed:**
38
- ```bash
39
- if [ -f "uv" ] || [ -f "uv.exe" ]; then
40
- echo "uv already installed"
41
- else
42
- echo "uv not found, setting up..."
43
- fi
44
- ```
45
-
46
- 2. **If uv is not installed:**
47
- - Load the `python-setup-uv` skill using the skill tool
48
- - Follow all steps EXACTLY as specified in the python-setup-uv skill
49
- - Wait for uv installation to complete
50
- - Continue with this skill's Step 1 below
51
-
52
- 3. **After uv is installed:**
53
- - The bundled script `pubmed_weekly.py` will be executed using uv
54
- - Extract the full script path from the `<skill_files>` section in skill tool output
55
-
56
- ## Steps
57
-
58
- Follow these steps EXACTLY as described.
59
-
60
- ### Step 1: Calculate Week Date Range
61
-
62
- First, determine the date range for the past week (Monday through Sunday).
63
-
64
- Extract the full path to `pubmed_weekly.py` from the `<skill_files>` section in the skill tool output.
65
-
66
- **For Unix-like shells (Git Bash / macOS / Linux):**
67
- ```bash
68
- uv run python <skill_path>/pubmed_weekly.py calculate_week
69
- ```
70
-
71
- ### Step 1: Calculate Week Date Range
72
-
73
- First, determine the date range for the past week (Monday through Sunday).
74
-
75
- Extract the full path to `pubmed_weekly.py` from the `<skill_files>` section in the skill tool output.
76
-
77
- ### Step 1: Calculate Week Date Range
78
-
79
- First, determine the date range for the past week (Monday through Sunday).
80
-
81
- **For Unix-like shells (Git Bash / macOS / Linux):**
82
- ```bash
83
- uv run python <skill_path>/pubmed_weekly.py calculate_week
84
- ```
85
-
86
- **For Windows cmd.exe:**
87
- ```bash
88
- uv.exe run python <skill_path>\pubmed_weekly.py calculate_week
89
- ```
90
-
91
- Replace `<skill_path>` with the full directory path extracted from `<skill_files>`.
92
-
93
- This will output the week folder name in format `YYYYMMDD-YYYYMMDD`.
94
-
95
- **Expected output format:**
96
- ```
97
- 20250217-20250223
98
- ```
99
-
100
- ### Step 2: Create Download Directory
101
-
102
- The `download_file` command will automatically create the directory structure when needed. No manual directory creation is required.
103
-
104
- ### Step 3: Fetch FTP File List
105
-
106
- Extract the full path to `pubmed_weekly.py` from the `<skill_files>` section and fetch the list of files from the NCBI FTP server.
107
-
108
- **For Unix-like shells:**
109
- ```bash
110
- uv run python <skill_path>/pubmed_weekly.py fetch_files
111
- ```
112
-
113
- **For Windows cmd.exe:**
114
- ```bash
115
- uv.exe run python <skill_path>\pubmed_weekly.py fetch_files
116
- ```
117
-
118
- Replace `<skill_path>` with the full directory path extracted from `<skill_files>`.
119
-
120
- This will list all daily update xml.gz files available on the FTP server.
121
-
122
- **Expected output:**
123
- ```
124
- pubmed24n1234.xml.gz pubmed24n1235.xml.gz pubmed24n1236.xml.gz
125
- ```
126
-
127
- ### Step 4: Filter Files for Past Week
128
-
129
- Extract the full path to `pubmed_weekly.py` from the `<skill_files>` section and filter the file list for the past week's daily updates.
130
-
131
- **For Unix-like shells:**
132
- ```bash
133
- uv run python <skill_path>/pubmed_weekly.py filter_files "<WEEK>" "<FILE_LIST>"
134
- ```
135
-
136
- **For Windows cmd.exe:**
137
- ```bash
138
- uv.exe run python <skill_path>\pubmed_weekly.py filter_files "<WEEK>" "<FILE_LIST>"
139
- ```
140
-
141
- Where:
142
- - `<skill_path>` is the full directory path extracted from `<skill_files>`
143
- - `<WEEK>` is the week folder name (e.g., `20250217-20250223`)
144
- - `<FILE_LIST>` is the output from Step 3 (space-separated filenames, use quotes)
145
-
146
- This will return a space-separated list of xml.gz files from the past week.
147
-
148
- **Expected output:**
149
- ```
150
- pubmed24n1234.xml.gz pubmed24n1235.xml.gz pubmed24n1236.xml.gz
151
- ```
152
-
153
- ### Step 5: Download Files with Retry
154
-
155
- Extract the full path to `pubmed_weekly.py` from the `<skill_files>` section and for each file in the filtered list, download to the target directory with retry logic.
156
-
157
- **For Unix-like shells:**
158
- ```bash
159
- for file in <FILE_LIST>; do
160
- uv run python <skill_path>/pubmed_weekly.py download_file <WEEK> $file
161
- done
162
- ```
163
-
164
- **For Windows cmd.exe:**
165
- ```bash
166
- for %f in (<FILE_LIST>) do uv.exe run python <skill_path>\pubmed_weekly.py download_file <WEEK> %f
167
- ```
168
-
169
- Where:
170
- - `<skill_path>` is the full directory path extracted from `<skill_files>`
171
- - `<FILE_LIST>` is the space-separated list from Step 4
172
- - `<WEEK>` is the week folder name
173
-
174
- Replace `<FILE_LIST>` with the space-separated list from Step 4.
175
-
176
- **Download behavior:**
177
- - Downloads one file at a time
178
- - Retries up to 3 times if download fails
179
- - Waits 2 seconds between retry attempts
180
- - After 3 failed attempts, asks user whether to abort
181
-
182
- **If a download fails after 3 retries:**
183
- Use the question tool to ask:
184
- - "Abort remaining downloads?" (options: "Yes" / "No")
185
-
186
- If user selects "Yes", stop the process and report summary.
187
- If user selects "No", skip the failed file and continue with the next one.
188
-
189
- ### Step 6: Verify Downloads
190
-
191
- After all downloads complete (or are aborted), verify the downloaded files:
192
-
193
- ```bash
194
- ls -lh .download/pubmed-daily/<WEEK>/
195
- ```
196
-
197
- Count the number of files downloaded and report the summary to the user.
198
-
199
- ### Step 7: Parse XML Files to Individual Excel Sheets
200
-
201
- For each downloaded `.xml.gz` file in `.download/pubmed-daily/<WEEK>/`, use the `parse_pubmed_articleSet` tool to convert it to an Excel file.
202
-
203
- **Tool invocation pattern:**
204
-
205
- ```
206
- parse_pubmed_articleSet
207
- filePath="<working_dir>/.download/pubmed-daily/<WEEK>/<filename>.xml.gz"
208
- outputMode="excel"
209
- outputFileName="<filename>.xlsx"
210
- outputDir="<working_dir>/.download/pubmed-daily/<WEEK>"
211
- ```
212
-
213
- **Example:**
214
- For file `pubmed24n1234.xml.gz` in week `20250217-20250223`:
215
- - Input: `.download/pubmed-daily/20250217-20250223/pubmed24n1234.xml.gz`
216
- - Output: `.download/pubmed-daily/20250217-20250223/pubmed24n1234.xlsx`
217
-
218
- **Process:**
219
- 1. List all `.xml.gz` files in the week directory
220
- 2. For each file, call `parse_pubmed_articleSet` with `outputMode="excel"`
221
- 3. The output Excel file will be saved in the same directory as the input
222
- 4. Report parsing statistics (articles processed, any errors)
223
-
224
- ### Step 8: Combine Individual Excel Files
225
-
226
- After all individual Excel files are created, combine them into a single `combined.xlsx` file using the Python script.
227
-
228
- Extract the full path to `pubmed_weekly.py` from the `<skill_files>` section.
229
-
230
- **For Unix-like shells (Git Bash / macOS / Linux):**
231
- ```bash
232
- uv run python <skill_path>/pubmed_weekly.py combine_excel "<WEEK>"
233
- ```
234
-
235
- **For Windows cmd.exe:**
236
- ```bash
237
- uv.exe run python <skill_path>\pubmed_weekly.py combine_excel "<WEEK>"
238
- ```
239
-
240
- Where:
241
- - `<skill_path>` is the full directory path extracted from `<skill_files>`
242
- - `<WEEK>` is the actual week folder name (e.g., `20250217-20250223`)
243
-
244
- **Expected behavior:**
245
- - Finds all `.xlsx` files in the week directory (excluding `combined.xlsx`)
246
- - Reads each file and combines all rows
247
- - Writes `combined.xlsx` with all articles from all files
248
- - Returns summary: total rows, source files processed
249
-
250
- **Output location:**
251
- `.download/pubmed-daily/<WEEK>/combined.xlsx`
252
-
253
- ## Python Script Details
254
-
255
- The skill includes a bundled Python script at `pubmed_weekly.py` with the following functions:
256
-
257
- ### 1. `calculate_week()` - Calculate week date range
258
-
259
- Returns week folder name in format `YYYYMMDD-YYYYMMDD` for the past week (Monday-Sunday).
260
-
261
- ### 2. `fetch_files()` - Fetch FTP file list
262
-
263
- Returns list of all xml.gz filenames from NCBI FTP server.
264
-
265
- ### 3. `filter_files(week_name, file_list)` - Filter files for the week
266
-
267
- Parameters:
268
- - `week_name`: Week folder name (e.g., `20250217-20250223`)
269
- - `file_list`: List of filenames from FTP server
270
-
271
- Returns: Space-separated list of xml.gz files that fall within the date range.
272
-
273
- ### 4. `download_file(week_name, filename)` - Download single file with retry
274
-
275
- Parameters:
276
- - `week_name`: Week folder name
277
- - `filename`: XML.gz filename to download
278
-
279
- Behavior:
280
- - Downloads from `ftp://ftp.ncbi.nlm.nih.gov/pubmed/updatefiles/<filename>`
281
- - Saves to `.download/pubmed-daily/<week_name>/<filename>` in current working directory
282
- - Creates directory structure if needed
283
- - Retries up to 3 times on failure
284
- - Returns exit code 0 on success, 1 on failure (after all retries)
285
-
286
- ### 5. `combine_excel(week_name)` - Combine Excel files into combined.xlsx
287
-
288
- Parameters:
289
- - `week_name`: Week folder name (e.g., `20250217-20250223`)
290
-
291
- Behavior:
292
- - Searches for all `.xlsx` files in `.download/pubmed-daily/<week_name>/` in current working directory
293
- - Excludes `combined.xlsx` from the list
294
- - Reads each Excel file and combines all rows
295
- - Creates `combined.xlsx` with all articles merged
296
- - Returns JSON with: success, total_rows, source_files, output_file
297
-
298
- ## Output Summary
299
-
300
- After completion, provide the user with:
301
-
302
- 1. Week date range processed
303
- 2. Number of files found for the week
304
- 3. Number of files successfully downloaded
305
- 4. Number of files failed to download (if any)
306
- 5. Download location: `.download/pubmed-daily/<WEEK>/`
307
- 6. Number of XML files parsed to Excel (Step 7)
308
- 7. Total articles in combined.xlsx (Step 8)
309
- 8. Combined file location: `.download/pubmed-daily/<WEEK>/combined.xlsx`
310
-
311
- ## Notes
312
-
313
- - This skill automatically checks for and installs uv using the `python-setup-uv` skill if not present
314
- - The Python script is bundled with this skill at `pubmed_weekly.py`
315
- - All Python commands use the full script path extracted from `<skill_files>` section
316
- - The script uses `os.getcwd()` to determine the working directory, which is naturally the opencode working directory
317
- - All output files (downloads, Excel files) are created in the opencode working directory
318
- - The FTP server path is: `ftp://ftp.ncbi.nlm.nih.gov/pubmed/updatefiles/`
319
- - Only `.xml.gz` files are downloaded
320
- - Downloads are sequential (one file at a time)
321
- - Retry logic includes 2-second delays between attempts
322
- - User has control to abort on persistent failures
323
- - The script uses Python's built-in `urllib.request` for FTP operations
324
- - The `combine_excel` command requires `openpyxl` package (auto-installed via uv)
325
- - Skill directory path is extracted from `<skill_files>` section for script location
326
- - Windows with Git Bash: Follow Unix-like shell instructions
327
- - Windows cmd.exe: Use `uv.exe run python` syntax
328
- - Step 7 uses the `parse_pubmed_articleSet` tool for XML to Excel conversion
329
- - Step 8 combines all individual Excel files into a single `combined.xlsx`
1
+ ---
2
+ name: pubmed-weekly
3
+ description: Download PubMed daily update xml.gz files from the past week from NCBI FTP server
4
+ allowedTools:
5
+ - Bash
6
+ - Read
7
+ - Write
8
+ - Question
9
+ - parse_pubmed_articleSet
10
+ ---
11
+
12
+ # PubMed Weekly Daily Updates Download
13
+
14
+ This skill downloads PubMed daily update xml.gz files from the past week (Monday-Sunday).
15
+
16
+ ## Workflow Overview
17
+
18
+ 1. **Python Environment Setup** (automatic): Checks for uv, installs via `python-setup-uv` skill if needed
19
+ 2. **Date Calculation**: Calculates the past week's date range (Monday-Sunday)
20
+ 3. **FTP Listing**: Fetches available xml.gz files from NCBI FTP server
21
+ 4. **Filtering**: Filters files to include only those from the past week
22
+ 5. **Download**: Downloads filtered files with retry logic (max 3 attempts per file)
23
+
24
+ ## Prerequisites
25
+ - Internet connection
26
+ - Access to NCBI FTP server
27
+ - uv package manager (will be automatically installed if not present)
28
+
29
+ ## Integration with python-setup-uv
30
+
31
+ This skill integrates with the `python-setup-uv` skill to ensure Python environment is properly configured.
32
+
33
+ ### Prerequisite Check
34
+
35
+ Before starting the download process:
36
+
37
+ 1. **Check if uv is installed:**
38
+ ```bash
39
+ if [ -f "uv" ] || [ -f "uv.exe" ]; then
40
+ echo "uv already installed"
41
+ else
42
+ echo "uv not found, setting up..."
43
+ fi
44
+ ```
45
+
46
+ 2. **If uv is not installed:**
47
+ - Load the `python-setup-uv` skill using the skill tool
48
+ - Follow all steps EXACTLY as specified in the python-setup-uv skill
49
+ - Wait for uv installation to complete
50
+ - Continue with this skill's Step 1 below
51
+
52
+ 3. **After uv is installed:**
53
+ - The bundled script `pubmed_weekly.py` will be executed using uv
54
+ - Extract the full script path from the `<skill_files>` section in skill tool output
55
+
56
+ ## Steps
57
+
58
+ Follow these steps EXACTLY as described.
59
+
60
+ ### Step 1: Calculate Week Date Range
61
+
62
+ First, determine the date range for the past week (Monday through Sunday).
63
+
64
+ Extract the full path to `pubmed_weekly.py` from the `<skill_files>` section in the skill tool output.
65
+
66
+ **For Unix-like shells (Git Bash / macOS / Linux):**
67
+ ```bash
68
+ uv run python <skill_path>/pubmed_weekly.py calculate_week
69
+ ```
70
+
71
+ ### Step 1: Calculate Week Date Range
72
+
73
+ First, determine the date range for the past week (Monday through Sunday).
74
+
75
+ Extract the full path to `pubmed_weekly.py` from the `<skill_files>` section in the skill tool output.
76
+
77
+ ### Step 1: Calculate Week Date Range
78
+
79
+ First, determine the date range for the past week (Monday through Sunday).
80
+
81
+ **For Unix-like shells (Git Bash / macOS / Linux):**
82
+ ```bash
83
+ uv run python <skill_path>/pubmed_weekly.py calculate_week
84
+ ```
85
+
86
+ **For Windows cmd.exe:**
87
+ ```bash
88
+ uv.exe run python <skill_path>\pubmed_weekly.py calculate_week
89
+ ```
90
+
91
+ Replace `<skill_path>` with the full directory path extracted from `<skill_files>`.
92
+
93
+ This will output the week folder name in format `YYYYMMDD-YYYYMMDD`.
94
+
95
+ **Expected output format:**
96
+ ```
97
+ 20250217-20250223
98
+ ```
99
+
100
+ ### Step 2: Create Download Directory
101
+
102
+ The `download_file` command will automatically create the directory structure when needed. No manual directory creation is required.
103
+
104
+ ### Step 3: Fetch FTP File List
105
+
106
+ Extract the full path to `pubmed_weekly.py` from the `<skill_files>` section and fetch the list of files from the NCBI FTP server.
107
+
108
+ **For Unix-like shells:**
109
+ ```bash
110
+ uv run python <skill_path>/pubmed_weekly.py fetch_files
111
+ ```
112
+
113
+ **For Windows cmd.exe:**
114
+ ```bash
115
+ uv.exe run python <skill_path>\pubmed_weekly.py fetch_files
116
+ ```
117
+
118
+ Replace `<skill_path>` with the full directory path extracted from `<skill_files>`.
119
+
120
+ This will list all daily update xml.gz files available on the FTP server.
121
+
122
+ **Expected output:**
123
+ ```
124
+ pubmed24n1234.xml.gz pubmed24n1235.xml.gz pubmed24n1236.xml.gz
125
+ ```
126
+
127
+ ### Step 4: Filter Files for Past Week
128
+
129
+ Extract the full path to `pubmed_weekly.py` from the `<skill_files>` section and filter the file list for the past week's daily updates.
130
+
131
+ **For Unix-like shells:**
132
+ ```bash
133
+ uv run python <skill_path>/pubmed_weekly.py filter_files "<WEEK>" "<FILE_LIST>"
134
+ ```
135
+
136
+ **For Windows cmd.exe:**
137
+ ```bash
138
+ uv.exe run python <skill_path>\pubmed_weekly.py filter_files "<WEEK>" "<FILE_LIST>"
139
+ ```
140
+
141
+ Where:
142
+ - `<skill_path>` is the full directory path extracted from `<skill_files>`
143
+ - `<WEEK>` is the week folder name (e.g., `20250217-20250223`)
144
+ - `<FILE_LIST>` is the output from Step 3 (space-separated filenames, use quotes)
145
+
146
+ This will return a space-separated list of xml.gz files from the past week.
147
+
148
+ **Expected output:**
149
+ ```
150
+ pubmed24n1234.xml.gz pubmed24n1235.xml.gz pubmed24n1236.xml.gz
151
+ ```
152
+
153
+ ### Step 5: Download Files with Retry
154
+
155
+ Extract the full path to `pubmed_weekly.py` from the `<skill_files>` section and for each file in the filtered list, download to the target directory with retry logic.
156
+
157
+ **For Unix-like shells:**
158
+ ```bash
159
+ for file in <FILE_LIST>; do
160
+ uv run python <skill_path>/pubmed_weekly.py download_file <WEEK> $file
161
+ done
162
+ ```
163
+
164
+ **For Windows cmd.exe:**
165
+ ```bash
166
+ for %f in (<FILE_LIST>) do uv.exe run python <skill_path>\pubmed_weekly.py download_file <WEEK> %f
167
+ ```
168
+
169
+ Where:
170
+ - `<skill_path>` is the full directory path extracted from `<skill_files>`
171
+ - `<FILE_LIST>` is the space-separated list from Step 4
172
+ - `<WEEK>` is the week folder name
173
+
174
+ Replace `<FILE_LIST>` with the space-separated list from Step 4.
175
+
176
+ **Download behavior:**
177
+ - Downloads one file at a time
178
+ - Retries up to 3 times if download fails
179
+ - Waits 2 seconds between retry attempts
180
+ - After 3 failed attempts, asks user whether to abort
181
+
182
+ **If a download fails after 3 retries:**
183
+ Use the question tool to ask:
184
+ - "Abort remaining downloads?" (options: "Yes" / "No")
185
+
186
+ If user selects "Yes", stop the process and report summary.
187
+ If user selects "No", skip the failed file and continue with the next one.
188
+
189
+ ### Step 6: Verify Downloads
190
+
191
+ After all downloads complete (or are aborted), verify the downloaded files:
192
+
193
+ ```bash
194
+ ls -lh .download/pubmed-daily/<WEEK>/
195
+ ```
196
+
197
+ Count the number of files downloaded and report the summary to the user.
198
+
199
+ ### Step 7: Parse XML Files to Individual Excel Sheets
200
+
201
+ For each downloaded `.xml.gz` file in `.download/pubmed-daily/<WEEK>/`, use the `parse_pubmed_articleSet` tool to convert it to an Excel file.
202
+
203
+ **Tool invocation pattern:**
204
+
205
+ ```
206
+ parse_pubmed_articleSet
207
+ filePath="<working_dir>/.download/pubmed-daily/<WEEK>/<filename>.xml.gz"
208
+ outputMode="excel"
209
+ outputFileName="<filename>.xlsx"
210
+ outputDir="<working_dir>/.download/pubmed-daily/<WEEK>"
211
+ ```
212
+
213
+ **Example:**
214
+ For file `pubmed24n1234.xml.gz` in week `20250217-20250223`:
215
+ - Input: `.download/pubmed-daily/20250217-20250223/pubmed24n1234.xml.gz`
216
+ - Output: `.download/pubmed-daily/20250217-20250223/pubmed24n1234.xlsx`
217
+
218
+ **Process:**
219
+ 1. List all `.xml.gz` files in the week directory
220
+ 2. For each file, call `parse_pubmed_articleSet` with `outputMode="excel"`
221
+ 3. The output Excel file will be saved in the same directory as the input
222
+ 4. Report parsing statistics (articles processed, any errors)
223
+
224
+ ### Step 8: Combine Individual Excel Files
225
+
226
+ After all individual Excel files are created, combine them into a single `combined.xlsx` file using the Python script.
227
+
228
+ Extract the full path to `pubmed_weekly.py` from the `<skill_files>` section.
229
+
230
+ **For Unix-like shells (Git Bash / macOS / Linux):**
231
+ ```bash
232
+ uv run python <skill_path>/pubmed_weekly.py combine_excel "<WEEK>"
233
+ ```
234
+
235
+ **For Windows cmd.exe:**
236
+ ```bash
237
+ uv.exe run python <skill_path>\pubmed_weekly.py combine_excel "<WEEK>"
238
+ ```
239
+
240
+ Where:
241
+ - `<skill_path>` is the full directory path extracted from `<skill_files>`
242
+ - `<WEEK>` is the actual week folder name (e.g., `20250217-20250223`)
243
+
244
+ **Expected behavior:**
245
+ - Finds all `.xlsx` files in the week directory (excluding `combined.xlsx`)
246
+ - Reads each file and combines all rows
247
+ - Writes `combined.xlsx` with all articles from all files
248
+ - Returns summary: total rows, source files processed
249
+
250
+ **Output location:**
251
+ `.download/pubmed-daily/<WEEK>/combined.xlsx`
252
+
253
+ ## Python Script Details
254
+
255
+ The skill includes a bundled Python script at `pubmed_weekly.py` with the following functions:
256
+
257
+ ### 1. `calculate_week()` - Calculate week date range
258
+
259
+ Returns week folder name in format `YYYYMMDD-YYYYMMDD` for the past week (Monday-Sunday).
260
+
261
+ ### 2. `fetch_files()` - Fetch FTP file list
262
+
263
+ Returns list of all xml.gz filenames from NCBI FTP server.
264
+
265
+ ### 3. `filter_files(week_name, file_list)` - Filter files for the week
266
+
267
+ Parameters:
268
+ - `week_name`: Week folder name (e.g., `20250217-20250223`)
269
+ - `file_list`: List of filenames from FTP server
270
+
271
+ Returns: Space-separated list of xml.gz files that fall within the date range.
272
+
273
+ ### 4. `download_file(week_name, filename)` - Download single file with retry
274
+
275
+ Parameters:
276
+ - `week_name`: Week folder name
277
+ - `filename`: XML.gz filename to download
278
+
279
+ Behavior:
280
+ - Downloads from `ftp://ftp.ncbi.nlm.nih.gov/pubmed/updatefiles/<filename>`
281
+ - Saves to `.download/pubmed-daily/<week_name>/<filename>` in current working directory
282
+ - Creates directory structure if needed
283
+ - Retries up to 3 times on failure
284
+ - Returns exit code 0 on success, 1 on failure (after all retries)
285
+
286
+ ### 5. `combine_excel(week_name)` - Combine Excel files into combined.xlsx
287
+
288
+ Parameters:
289
+ - `week_name`: Week folder name (e.g., `20250217-20250223`)
290
+
291
+ Behavior:
292
+ - Searches for all `.xlsx` files in `.download/pubmed-daily/<week_name>/` in current working directory
293
+ - Excludes `combined.xlsx` from the list
294
+ - Reads each Excel file and combines all rows
295
+ - Creates `combined.xlsx` with all articles merged
296
+ - Returns JSON with: success, total_rows, source_files, output_file
297
+
298
+ ## Output Summary
299
+
300
+ After completion, provide the user with:
301
+
302
+ 1. Week date range processed
303
+ 2. Number of files found for the week
304
+ 3. Number of files successfully downloaded
305
+ 4. Number of files failed to download (if any)
306
+ 5. Download location: `.download/pubmed-daily/<WEEK>/`
307
+ 6. Number of XML files parsed to Excel (Step 7)
308
+ 7. Total articles in combined.xlsx (Step 8)
309
+ 8. Combined file location: `.download/pubmed-daily/<WEEK>/combined.xlsx`
310
+
311
+ ## Notes
312
+
313
+ - This skill automatically checks for and installs uv using the `python-setup-uv` skill if not present
314
+ - The Python script is bundled with this skill at `pubmed_weekly.py`
315
+ - All Python commands use the full script path extracted from `<skill_files>` section
316
+ - The script uses `os.getcwd()` to determine the working directory, which is naturally the opencode working directory
317
+ - All output files (downloads, Excel files) are created in the opencode working directory
318
+ - The FTP server path is: `ftp://ftp.ncbi.nlm.nih.gov/pubmed/updatefiles/`
319
+ - Only `.xml.gz` files are downloaded
320
+ - Downloads are sequential (one file at a time)
321
+ - Retry logic includes 2-second delays between attempts
322
+ - User has control to abort on persistent failures
323
+ - The script uses Python's built-in `urllib.request` for FTP operations
324
+ - The `combine_excel` command requires `openpyxl` package (auto-installed via uv)
325
+ - Skill directory path is extracted from `<skill_files>` section for script location
326
+ - Windows with Git Bash: Follow Unix-like shell instructions
327
+ - Windows cmd.exe: Use `uv.exe run python` syntax
328
+ - Step 7 uses the `parse_pubmed_articleSet` tool for XML to Excel conversion
329
+ - Step 8 combines all individual Excel files into a single `combined.xlsx`
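
The date-range and retry behavior described in the SKILL.md text above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the contents of the bundled `pubmed_weekly.py`. The names `past_week_folder` and `run_with_retry` are hypothetical, "past week" is assumed to mean the most recent fully completed Monday-Sunday week, and the `fetch` callable stands in for whatever performs the actual FTP download.

```python
import time
from datetime import date, timedelta
from typing import Callable


def past_week_folder(today: date) -> str:
    """Return the previous Monday-Sunday week as 'YYYYMMDD-YYYYMMDD'.

    Assumption: the "past week" is the last fully completed week
    before the week that contains `today`.
    """
    # date.weekday() is 0 for Monday, so this is the Monday of the current week.
    this_monday = today - timedelta(days=today.weekday())
    last_monday = this_monday - timedelta(days=7)
    last_sunday = this_monday - timedelta(days=1)
    return f"{last_monday:%Y%m%d}-{last_sunday:%Y%m%d}"


def run_with_retry(
    fetch: Callable[[], None],
    attempts: int = 3,
    delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `fetch` up to `attempts` times, pausing `delay` seconds between tries.

    Returns 0 on success and 1 once every attempt has failed, mirroring
    the exit codes the skill text describes for `download_file`.
    `fetch` is a hypothetical stand-in for the real download call.
    """
    for attempt in range(1, attempts + 1):
        try:
            fetch()
            return 0
        except OSError as exc:  # urllib.error.URLError is a subclass of OSError
            print(f"attempt {attempt}/{attempts} failed: {exc}")
            if attempt < attempts:
                sleep(delay)  # no pause after the final failed attempt
    return 1
```

For example, `past_week_folder(date(2025, 2, 26))` returns `20250217-20250223`, which matches the expected output format shown in Step 1. Passing `sleep` as a parameter keeps the helper testable without real two-second waits.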