@yeyuan98/opencode-bioresearcher-plugin 1.4.1 → 1.5.0
- package/README.md +49 -22
- package/dist/db-tools/backends/index.d.ts +11 -0
- package/dist/db-tools/backends/index.js +48 -0
- package/dist/db-tools/backends/mongodb/backend.d.ts +15 -0
- package/dist/db-tools/backends/mongodb/backend.js +76 -0
- package/dist/db-tools/backends/mongodb/connection.d.ts +27 -0
- package/dist/db-tools/backends/mongodb/connection.js +107 -0
- package/dist/db-tools/backends/mongodb/index.d.ts +4 -0
- package/dist/db-tools/backends/mongodb/index.js +3 -0
- package/dist/db-tools/backends/mongodb/translator.d.ts +30 -0
- package/dist/db-tools/backends/mongodb/translator.js +407 -0
- package/dist/db-tools/backends/mysql/backend.d.ts +15 -0
- package/dist/db-tools/backends/mysql/backend.js +57 -0
- package/dist/db-tools/backends/mysql/connection.d.ts +25 -0
- package/dist/db-tools/backends/mysql/connection.js +83 -0
- package/dist/db-tools/backends/mysql/index.d.ts +3 -0
- package/dist/db-tools/backends/mysql/index.js +2 -0
- package/dist/db-tools/backends/mysql/translator.d.ts +7 -0
- package/dist/db-tools/backends/mysql/translator.js +67 -0
- package/dist/db-tools/core/base.d.ts +17 -0
- package/dist/db-tools/core/base.js +51 -0
- package/dist/db-tools/core/config-loader.d.ts +3 -0
- package/dist/db-tools/core/config-loader.js +46 -0
- package/dist/db-tools/core/index.d.ts +2 -0
- package/dist/db-tools/core/index.js +2 -0
- package/dist/db-tools/core/jsonc-parser.d.ts +2 -0
- package/dist/db-tools/core/jsonc-parser.js +77 -0
- package/dist/db-tools/core/validator.d.ts +16 -0
- package/dist/db-tools/core/validator.js +118 -0
- package/dist/db-tools/executor.d.ts +13 -0
- package/dist/db-tools/executor.js +54 -0
- package/dist/db-tools/index.d.ts +51 -0
- package/dist/db-tools/index.js +27 -0
- package/dist/db-tools/interface/backend.d.ts +24 -0
- package/dist/db-tools/interface/backend.js +1 -0
- package/dist/db-tools/interface/connection.d.ts +21 -0
- package/dist/db-tools/interface/connection.js +11 -0
- package/dist/db-tools/interface/index.d.ts +4 -0
- package/dist/db-tools/interface/index.js +4 -0
- package/dist/db-tools/interface/query.d.ts +60 -0
- package/dist/db-tools/interface/query.js +1 -0
- package/dist/db-tools/interface/schema.d.ts +22 -0
- package/dist/db-tools/interface/schema.js +1 -0
- package/dist/db-tools/pool.d.ts +8 -0
- package/dist/db-tools/pool.js +49 -0
- package/dist/db-tools/tools/index.d.ts +27 -0
- package/dist/db-tools/tools/index.js +191 -0
- package/dist/db-tools/tools.d.ts +27 -0
- package/dist/db-tools/tools.js +111 -0
- package/dist/db-tools/types.d.ts +94 -0
- package/dist/db-tools/types.js +40 -0
- package/dist/db-tools/utils.d.ts +33 -0
- package/dist/db-tools/utils.js +94 -0
- package/dist/index.js +5 -1
- package/dist/parser-tools/obo/index.d.ts +2 -0
- package/dist/parser-tools/obo/index.js +2 -0
- package/dist/parser-tools/obo/obo.d.ts +17 -0
- package/dist/parser-tools/obo/obo.js +216 -0
- package/dist/parser-tools/obo/types.d.ts +166 -0
- package/dist/parser-tools/obo/types.js +1 -0
- package/dist/parser-tools/obo/utils.d.ts +21 -0
- package/dist/parser-tools/obo/utils.js +411 -0
- package/dist/skills/env-jsonc-setup/SKILL.md +206 -0
- package/dist/skills/long-table-summary/SKILL.md +437 -374
- package/dist/skills/long-table-summary/combine_outputs.py +5 -14
- package/dist/skills/long-table-summary/generate_prompts.py +211 -0
- package/dist/skills/long-table-summary/pyproject.toml +8 -11
- package/package.json +3 -1
- package/dist/skills/long-table-summary/__init__.py +0 -3
|
@@ -1,374 +1,437 @@
|
|
|
1
|
-
---
|
|
2
|
-
name: long-table-summary
|
|
3
|
-
description: Batch-process large tables using parallel subagents for summarization
|
|
4
|
-
allowedTools:
|
|
5
|
-
- Bash
|
|
6
|
-
- Read
|
|
7
|
-
- Write
|
|
8
|
-
- Question
|
|
9
|
-
- Task
|
|
10
|
-
- tableListSheets
|
|
11
|
-
- tableGetSheetPreview
|
|
12
|
-
- tableGetHeaders
|
|
13
|
-
- tableGetRange
|
|
1
|
+
---
|
|
2
|
+
name: long-table-summary
|
|
3
|
+
description: Batch-process large tables using parallel subagents for summarization
|
|
4
|
+
allowedTools:
|
|
5
|
+
- Bash
|
|
6
|
+
- Read
|
|
7
|
+
- Write
|
|
8
|
+
- Question
|
|
9
|
+
- Task
|
|
10
|
+
- tableListSheets
|
|
11
|
+
- tableGetSheetPreview
|
|
12
|
+
- tableGetHeaders
|
|
13
|
+
- tableGetRange
|
|
14
|
+
---
|
|
15
|
+
|
|
16
|
+
# Long Table Summary
|
|
17
|
+
|
|
18
|
+
This skill batch-processes large tables (.xlsx, .csv, .ods) using parallel subagents.
|
|
19
|
+
|
|
20
|
+
## Workflow Overview
|
|
21
|
+
|
|
22
|
+
1. **Table Discovery**: Interview user to locate table file and confirm existence
|
|
23
|
+
2. **Sheet Selection**: If multiple sheets, prompt user to choose one
|
|
24
|
+
3. **Summarization Instructions**: Interview user for summary requirements (JSON format)
|
|
25
|
+
4. **Instruction Refinement**: Iterate to refine summarization instructions
|
|
26
|
+
5. **Topic Generation**: Autogenerate topic name from filename + comprehension of JSON
|
|
27
|
+
6. **Batch Size Prompting**: Use `question` tool to ask user for batch size
|
|
28
|
+
7. **Template Creation**: Draft subagent prompt template with JSON output schema
|
|
29
|
+
8. **Template Writing**: Write finalized template
|
|
30
|
+
9. **Prompt Generation**: Use Python script to generate batch-specific prompts
|
|
31
|
+
10. **Parallel Processing**: Launch subagents in waves of 3
|
|
32
|
+
11. **Progress Monitoring**: Report every 3 completed subagents
|
|
33
|
+
12. **Retry Failed Batches**: Up to 3 retry attempts for failed batches
|
|
34
|
+
13. **Output Combination**: Automatically combine all JSON outputs into single table
|
|
35
|
+
|
|
36
|
+
## Steps
|
|
37
|
+
|
|
38
|
+
### Step 1: Interview User for Table Location
|
|
39
|
+
|
|
40
|
+
Use `question` tool to ask for table file path:
|
|
41
|
+
|
|
42
|
+
**Question:**
|
|
43
|
+
- "What is the full path to the table file you want to process?" (supports .xlsx, .csv, .ods)
|
|
44
|
+
|
|
45
|
+
### Step 2: Confirm Table Existence and List Sheets
|
|
46
|
+
|
|
47
|
+
Use `tableListSheets` tool to verify:
|
|
48
|
+
|
|
49
|
+
```bash
|
|
50
|
+
tableListSheets(file_path="<user_provided_path>")
|
|
51
|
+
```
|
|
52
|
+
|
|
53
|
+
If the file doesn't exist or is invalid, prompt user to verify the path.
|
|
54
|
+
|
|
55
|
+
### Step 3: Handle Multiple Sheets (if applicable)
|
|
56
|
+
|
|
57
|
+
**Important:** CSV files have only one sheet. Always skip this step for CSV files.
|
|
58
|
+
|
|
59
|
+
If the file is an Excel (.xlsx) or ODS (.ods) file AND there is more than one sheet, use `question` tool to ask user to choose one.
|
|
60
|
+
|
|
61
|
+
For CSV files: Use the first/only sheet name automatically (either filename returned by tableListSheets or "Sheet1" default) without asking user.
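
The sheet-selection rule above can be sketched as a small decision helper. This is only an illustration: `needs_sheet_prompt` and `default_sheet` are hypothetical names, and the list of sheet names is assumed to be whatever `tableListSheets` returned.

```python
from pathlib import Path

# Formats that Step 3 treats as potentially multi-sheet.
MULTI_SHEET_EXTENSIONS = {".xlsx", ".ods"}


def needs_sheet_prompt(file_path: str, sheet_names: list[str]) -> bool:
    """Return True only when the user actually has to pick a sheet."""
    extension = Path(file_path).suffix.lower()
    return extension in MULTI_SHEET_EXTENSIONS and len(sheet_names) > 1


def default_sheet(sheet_names: list[str]) -> str:
    """Use the first reported sheet, falling back to "Sheet1" if none was reported."""
    return sheet_names[0] if sheet_names else "Sheet1"
```

With this shape, a CSV file never triggers a question, and a single-sheet workbook goes straight to `default_sheet`.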
|
|
62
|
+
|
|
63
|
+
### Step 4: Get Table Metadata
|
|
64
|
+
|
|
65
|
+
Use `tableGetSheetPreview` and `tableGetHeaders` to get row count and column structure.
|
|
66
|
+
|
|
67
|
+
### Step 5: Interview User for Summarization Instructions
|
|
68
|
+
|
|
69
|
+
Use `question` tool to ask user to provide summarization instructions in JSON format.
|
|
70
|
+
|
|
71
|
+
**Question:**
|
|
72
|
+
- "Please provide your summarization requirements in JSON format. For each field you want extracted, use the field name as the key and a brief description of what it represents as the value."
|
|
73
|
+
|
|
74
|
+
**Show example:**
|
|
75
|
+
```json
|
|
76
|
+
{
|
|
77
|
+
"species": "Species, one of Tier1/Tier2/NA. Tier1 includes human and monkey, Tier2 includes other animals. NA otherwise.",
|
|
78
|
+
"topic": "Main topic, one of Oncology/Immunology/General Biology/Others."
|
|
79
|
+
}
|
|
80
|
+
```
|
|
81
|
+
|
|
82
|
+
**Additional example:**
|
|
83
|
+
```json
|
|
84
|
+
{
|
|
85
|
+
"gene_mutation": "Gene mutation pattern, e.g., V600E, R173, Wild Type",
|
|
86
|
+
"clinical_significance": "Clinical relevance, one of High/Medium/Low/Unknown",
|
|
87
|
+
"therapeutic_target": "Is this a drug target? Answer Yes/No/Unknown"
|
|
88
|
+
}
|
|
89
|
+
```
|
|
90
|
+
|
|
91
|
+
**Instructions to user:**
|
|
92
|
+
- Provide any number of fields
|
|
93
|
+
- Each field should be a key (JSON property name) with a description as the value
|
|
94
|
+
- Use clear, specific descriptions that a subagent can interpret
|
|
95
|
+
- Include allowed values or examples if applicable
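
Before moving on to refinement, it may help to check that the user's reply really is a flat JSON object of field names and descriptions. The sketch below is one possible check, not part of the skill; `parse_instructions` is a hypothetical helper name.

```python
import json


def parse_instructions(raw: str) -> dict[str, str]:
    """Parse the user's JSON and verify it maps field names to text descriptions."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(data, dict) or not data:
        raise ValueError("instructions must be a non-empty JSON object")
    for field, description in data.items():
        if not field.strip():
            raise ValueError("field names must not be empty")
        if not isinstance(description, str) or not description.strip():
            raise ValueError(f"field {field!r} needs a non-empty text description")
    return data
```

Any `ValueError` here is a natural cue to repeat the question with the example JSON shown again.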
|
|
96
|
+
|
|
97
|
+
### Step 6: Summarization Instruction Refinement
|
|
98
|
+
|
|
99
|
+
After receiving user's JSON instructions, iteratively refine them.
|
|
100
|
+
|
|
101
|
+
**Question:**
|
|
102
|
+
- "Here's your summarization instruction template. Would you like to modify any field descriptions or add new fields?"
|
|
103
|
+
|
|
104
|
+
**Show the current JSON to user**
|
|
105
|
+
|
|
106
|
+
If user selects "No, it's correct", proceed to Step 7.
|
|
107
|
+
|
|
108
|
+
If user selects "Yes, I want to modify":
|
|
109
|
+
- Ask which field to modify or add
|
|
110
|
+
- Update the JSON accordingly
|
|
111
|
+
- Repeat the approval question
|
|
112
|
+
|
|
113
|
+
Continue until user explicitly confirms that the instruction JSON is correct.
|
|
114
|
+
|
|
115
|
+
### Step 7: Autogenerate Topic Name
|
|
116
|
+
|
|
117
|
+
Generate the topic name by combining:
|
|
118
|
+
- Base filename (without extension)
|
|
119
|
+
- Summary words derived from manual comprehension of user's JSON
|
|
120
|
+
|
|
121
|
+
**Algorithm:**
|
|
122
|
+
1. Extract filename: `clinical_trials_2024_data.xlsx` → `clinical_trials_2024_data`
|
|
123
|
+
- **Note:** The filename may have more than 6 words; keep the full name
|
|
124
|
+
2. Comprehend the overall content of user's JSON instructions:
|
|
125
|
+
- Read the JSON manually to understand what it's about
|
|
126
|
+
- Identify the main idea or subject matter
|
|
127
|
+
- Generate summary words (maximum 6 words, lowercase, hyphenated)
|
|
128
|
+
- Examples: `species-classification`, `gene-mutation`, `clinical-significance`
|
|
129
|
+
3. Combine with hyphens: `clinical_trials_2024_data-species-classification`
|
|
130
|
+
- **Total words may exceed 6** (filename + summary words)
|
|
131
|
+
- **Only the summary words part has a 6-word limit**
|
|
132
|
+
|
|
133
|
+
**Examples:**
|
|
134
|
+
- `clinical_trials_2024_data.xlsx` + topic about species → `clinical_trials_2024_data-species-classification`
|
|
135
|
+
- `literature_review_comprehensive.csv` + gene mutation analysis → `literature_review_comprehensive-gene-mutation`
|
|
136
|
+
- `long_descriptive_filename_multiple_words.xlsx` + immunology → `long_descriptive_filename_multiple_words-immunology`
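
The naming rule above can be expressed as a short helper. This is a sketch under assumptions: the summary words are still chosen by the agent after reading the JSON, and `build_topic_name` is a hypothetical name that only handles the mechanical part (stripping the extension, lowercasing, hyphenating, enforcing the 6-word limit).

```python
import re
from pathlib import Path


def build_topic_name(file_path: str, summary_words: list[str]) -> str:
    """Join the full base filename with up to six lowercase summary words."""
    if not 1 <= len(summary_words) <= 6:
        raise ValueError("expected between 1 and 6 summary words")
    base = Path(file_path).stem  # the filename is kept whole, whatever its length
    cleaned = [re.sub(r"[^a-z0-9]+", "", word.lower()) for word in summary_words]
    cleaned = [word for word in cleaned if word]
    if not cleaned:
        raise ValueError("summary words must contain letters or digits")
    return "-".join([base, *cleaned])


print(build_topic_name("clinical_trials_2024_data.xlsx", ["species", "classification"]))
# clinical_trials_2024_data-species-classification
```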
|
|
137
|
+
|
|
138
|
+
### Step 8: Ask User for Batch Size
|
|
139
|
+
|
|
140
|
+
Use `question` tool to explicitly prompt user about batch size.
|
|
141
|
+
|
|
142
|
+
**Question:**
|
|
143
|
+
- "How many rows should each batch contain? Recommended default: 30 rows per batch."
|
|
144
|
+
|
|
145
|
+
Use the value the user provides. Calculate the number of batches needed as `ceil(data_rows / batch_size)`, where `data_rows` is the number of rows excluding the header.
|
|
146
|
+
|
|
147
|
+
### Step 9: Calculate Batch Ranges
|
|
148
|
+
|
|
149
|
+
Example for a sheet with 90 rows in total (header in row 1, data in rows 2-90) and 30 rows per batch:
|
|
150
|
+
- Batch 1: Rows 2-31
|
|
151
|
+
- Batch 2: Rows 32-61
|
|
152
|
+
- Batch 3: Rows 62-90
|
|
153
|
+
|
|
154
|
+
**Note:** Row 1 is the header, data starts at row 2.
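
The range arithmetic can be sketched as follows. It assumes `last_row` is the sheet's final row number with the header counted as row 1; `batch_ranges` is a hypothetical helper, not a function shipped with the skill.

```python
import math


def batch_ranges(last_row: int, batch_size: int = 30, start_row: int = 2) -> list[tuple[int, int]]:
    """Return inclusive (row_start, row_end) pairs covering start_row..last_row."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if last_row < start_row:
        return []  # header only, nothing to process
    data_rows = last_row - start_row + 1
    num_batches = math.ceil(data_rows / batch_size)
    ranges = []
    for index in range(num_batches):
        row_start = start_row + index * batch_size
        # The final batch is clipped so it never runs past the last row.
        row_end = min(row_start + batch_size - 1, last_row)
        ranges.append((row_start, row_end))
    return ranges


print(batch_ranges(90))  # [(2, 31), (32, 61), (62, 90)]
```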
|
|
155
|
+
|
|
156
|
+
### Step 10: Create Subagent Prompt Template
|
|
157
|
+
|
|
158
|
+
Create a template with `{placeholder}` format (single braces):
|
|
159
|
+
|
|
160
|
+
```markdown
|
|
161
|
+
# Batch Data Summarization Task
|
|
162
|
+
|
|
163
|
+
## Input File
|
|
164
|
+
- Full path: `{file_path}`
|
|
165
|
+
- Sheet name: `{sheet_name}`
|
|
166
|
+
|
|
167
|
+
## Row Range
|
|
168
|
+
- Batch number: {batch_number}
|
|
169
|
+
- Start row: {row_start}
|
|
170
|
+
- End row: {row_end}
|
|
171
|
+
|
|
172
|
+
## Summarization Instructions
|
|
173
|
+
|
|
174
|
+
Extract the following fields from each row:
|
|
175
|
+
|
|
176
|
+
{instructions_json}
|
|
177
|
+
|
|
178
|
+
## Output Format
|
|
179
|
+
|
|
180
|
+
Your output must be a single valid JSON object with this structure:
|
|
181
|
+
|
|
182
|
+
```json
|
|
183
|
+
{
|
|
184
|
+
"batch_number": {batch_number},
|
|
185
|
+
"row_count": <number_of_rows_processed>,
|
|
186
|
+
"summaries": [
|
|
187
|
+
{
|
|
188
|
+
"row_number": <row_number>,
|
|
189
|
+
"<field_1>": "<extracted_value>",
|
|
190
|
+
"<field_2>": "<extracted_value>"
|
|
191
|
+
}
|
|
192
|
+
]
|
|
193
|
+
}
|
|
194
|
+
```
|
|
195
|
+
|
|
196
|
+
**Important:** The JSON keys for extracted values must match the field names specified in the Summarization Instructions.
|
|
197
|
+
|
|
198
|
+
## Instructions
|
|
199
|
+
|
|
200
|
+
1. Read the specified row range from the input file using the `tableGetRange` tool
|
|
201
|
+
2. For each row, extract the requested fields according to the instructions above
|
|
202
|
+
3. Map the extracted values to the JSON keys specified in the instructions
|
|
203
|
+
4. Generate concise summaries based on the extracted data
|
|
204
|
+
5. Save your output to: `{output_file}`
|
|
205
|
+
|
|
206
|
+
## Output File Path
|
|
207
|
+
Full path: `{output_file}`
|
|
208
|
+
|
|
209
|
+
**CRITICAL:** Write your final output as a markdown file (.md) containing ONLY the JSON object (no additional text or explanation).
|
|
210
|
+
```
|
|
211
|
+
|
|
212
|
+
### Step 11: Create Directory Structure
|
|
213
|
+
|
|
214
|
+
```bash
|
|
215
|
+
mkdir -p .long-table-summary/{topic}/prompts
|
|
216
|
+
mkdir -p .long-table-summary/{topic}/outputs
|
|
217
|
+
```
|
|
218
|
+
|
|
219
|
+
### Step 12: Write Template
|
|
220
|
+
|
|
221
|
+
Write the finalized template to `.long-table-summary/{topic}/subagent_template.md`.
|
|
222
|
+
|
|
223
|
+
### Step 13: Generate Subagent Prompts
|
|
224
|
+
|
|
225
|
+
Use `generate_prompts.py`:
|
|
226
|
+
|
|
227
|
+
**Before Step 13 and Step 17:** Extract the full path to the skill directory from the `<skill_files>` section in the skill tool output. Use this path as `<skill_path>` in the commands below.
|
|
228
|
+
|
|
229
|
+
**Unix-like shells:**
|
|
230
|
+
```bash
|
|
231
|
+
uv run python <skill_path>/generate_prompts.py \
|
|
232
|
+
--template .long-table-summary/{topic}/subagent_template.md \
|
|
233
|
+
--output-dir .long-table-summary/{topic}/prompts \
|
|
234
|
+
--num-batches {num_batches} \
|
|
235
|
+
--sheet-name "{sheet_name}" \
|
|
236
|
+
--file-path "{input_file}" \
|
|
237
|
+
--start-row 2 \
|
|
238
|
+
--batch-size {batch_size} \
|
|
239
|
+
--instructions '{instructions_json}'
|
|
240
|
+
```
|
|
241
|
+
|
|
242
|
+
**For Windows cmd.exe:**
|
|
243
|
+
```bat
|
|
244
|
+
uv.exe run python <skill_path>\generate_prompts.py ^
|
|
245
|
+
--template .long-table-summary\{topic}\subagent_template.md ^
|
|
246
|
+
--output-dir .long-table-summary\{topic}\prompts ^
|
|
247
|
+
--num-batches {num_batches} ^
|
|
248
|
+
--sheet-name "{sheet_name}" ^
|
|
249
|
+
--file-path "{input_file}" ^
|
|
250
|
+
--start-row 2 ^
|
|
251
|
+
--batch-size {batch_size} ^
|
|
252
|
+
--instructions "{instructions_json}"
|
|
253
|
+
```
|
|
254
|
+
|
|
255
|
+
**Note:** The `{instructions_json}` is the user-confirmed JSON from Step 6.
|
|
256
|
+
|
|
257
|
+
### Step 14: Launch Subagents in Waves of 3
|
|
258
|
+
|
|
259
|
+
Launch subagents in waves of 3, waiting for each wave to complete before starting the next.
|
|
260
|
+
|
|
261
|
+
**Wave 1:**
|
|
262
|
+
```typescript
|
|
263
|
+
task(subagent_type="general", description="Process batch 001", prompt="Read your prompt from .long-table-summary/{topic}/prompts/batch001.md and perform the task described there exactly as written.", run_in_background=true)
|
|
264
|
+
task(subagent_type="general", description="Process batch 002", prompt="Read your prompt from .long-table-summary/{topic}/prompts/batch002.md and perform the task described there exactly as written.", run_in_background=true)
|
|
265
|
+
task(subagent_type="general", description="Process batch 003", prompt="Read your prompt from .long-table-summary/{topic}/prompts/batch003.md and perform the task described there exactly as written.", run_in_background=true)
|
|
266
|
+
```
|
|
267
|
+
|
|
268
|
+
**Wait for Wave 1 to complete.**
|
|
269
|
+
|
|
270
|
+
**Wave 2 (if more batches):**
|
|
271
|
+
```typescript
|
|
272
|
+
task(subagent_type="general", description="Process batch 004", prompt="Read your prompt from .long-table-summary/{topic}/prompts/batch004.md and perform the task described there exactly as written.", run_in_background=true)
|
|
273
|
+
task(subagent_type="general", description="Process batch 005", prompt="Read your prompt from .long-table-summary/{topic}/prompts/batch005.md and perform the task described there exactly as written.", run_in_background=true)
|
|
274
|
+
task(subagent_type="general", description="Process batch 006", prompt="Read your prompt from .long-table-summary/{topic}/prompts/batch006.md and perform the task described there exactly as written.", run_in_background=true)
|
|
275
|
+
```
|
|
276
|
+
|
|
277
|
+
**Continue** launching waves of 3 until all batches are started.
|
|
278
|
+
|
|
279
|
+
**Important:** Do NOT pass the full generated prompts directly to subagents. Always direct subagents to read their respective prompt files.
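
The wave schedule can be planned up front. The sketch below only groups batch names and builds the short pointer prompt quoted above; it does not call the `task` tool, and `plan_waves` and `prompt_for` are hypothetical helper names.

```python
def plan_waves(num_batches: int, wave_size: int = 3) -> list[list[str]]:
    """Group batch names (batch001, batch002, ...) into consecutive waves."""
    names = [f"batch{number:03d}" for number in range(1, num_batches + 1)]
    return [names[i:i + wave_size] for i in range(0, len(names), wave_size)]


def prompt_for(topic: str, batch_name: str) -> str:
    """Build the pointer prompt; the full prompt text stays in the file."""
    return (
        f"Read your prompt from .long-table-summary/{topic}/prompts/{batch_name}.md "
        "and perform the task described there exactly as written."
    )


print(plan_waves(7))
# [['batch001', 'batch002', 'batch003'], ['batch004', 'batch005', 'batch006'], ['batch007']]
```

Each inner list corresponds to one wave: launch its members together, wait for all of them, then move to the next list.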
|
|
280
|
+
|
|
281
|
+
### Step 15: Monitor Progress
|
|
282
|
+
|
|
283
|
+
Every time 3 subagents complete, report progress:
|
|
284
|
+
|
|
285
|
+
**Progress report format:**
|
|
286
|
+
- "Progress: X/Y batches completed (Z%)"
|
|
287
|
+
|
|
288
|
+
For example:
|
|
289
|
+
- "Progress: 3/10 batches completed (30%)"
|
|
290
|
+
- "Progress: 6/10 batches completed (60%)"
|
|
291
|
+
- "Progress: 9/10 batches completed (90%)"
|
|
292
|
+
- "Progress: 10/10 batches completed (100%)"
|
|
293
|
+
|
|
294
|
+
Do NOT inspect individual subagent outputs midway.
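
As a rough illustration of the reporting rule, the helpers below format the progress line and decide when to emit it. The names `progress_line` and `should_report` are hypothetical, and the percentage is assumed to be rounded to a whole number.

```python
def progress_line(completed: int, total: int) -> str:
    """Format the progress message used in the examples above."""
    percent = round(100 * completed / total)
    return f"Progress: {completed}/{total} batches completed ({percent}%)"


def should_report(completed: int, total: int, every: int = 3) -> bool:
    """Report after every third completion and once more when everything is done."""
    return completed > 0 and (completed % every == 0 or completed == total)


print(progress_line(3, 10))  # Progress: 3/10 batches completed (30%)
```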
|
|
295
|
+
|
|
296
|
+
### Step 16: Retry Failed Batches
|
|
297
|
+
|
|
298
|
+
After all batches are done, check for missing outputs:
|
|
299
|
+
|
|
300
|
+
```bash
|
|
301
|
+
ls .long-table-summary/{topic}/outputs/
|
|
302
|
+
```
|
|
303
|
+
|
|
304
|
+
**Missing files** = failed batches. Collect the batch numbers of all missing files.
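
Instead of comparing the `ls` listing by eye, the missing batches can be computed. This sketch assumes each subagent writes its output as `batchNNN.md` (matching the `batch*.md` pattern that `combine_outputs.py` reads); `find_failed_batches` is a hypothetical helper.

```python
from pathlib import Path


def find_failed_batches(outputs_dir: str, num_batches: int) -> list[str]:
    """Return the names of expected batches that have no output file yet."""
    present = {path.stem for path in Path(outputs_dir).glob("batch*.md")}
    expected = [f"batch{number:03d}" for number in range(1, num_batches + 1)]
    return [name for name in expected if name not in present]
```

The returned list can be dropped directly into the question text, for example `Failed batches: batch003, batch007 (2 failures)`.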
|
|
305
|
+
|
|
306
|
+
If there are failed batches:
|
|
307
|
+
|
|
308
|
+
1. Ask user using the `question` tool:
|
|
309
|
+
- "Continue with failed batches or stop?
|
|
310
|
+
|
|
311
|
+
Failed batches: batch003, batch007 (2 failures)
|
|
312
|
+
|
|
313
|
+
• Continue - Will retry each failed batch up to 3 times
|
|
314
|
+
• Stop - Keep current outputs and proceed to final report"
|
|
315
|
+
|
|
316
|
+
(Replace the example list with the actual failed batch names and count.)
|
|
317
|
+
|
|
318
|
+
2. **Options:**
|
|
319
|
+
- "Continue with failed batches"
|
|
320
|
+
- "Stop and keep current outputs"
|
|
321
|
+
|
|
322
|
+
3. **If user selects "Continue":**
|
|
323
|
+
- For each failed batch:
|
|
324
|
+
a. Wait 2 seconds
|
|
325
|
+
b. Retry with the same `subagent_type="general"`
|
|
326
|
+
c. Up to 3 retry attempts
|
|
327
|
+
- After retries, check again for remaining failures
|
|
328
|
+
- If batches are still failing, repeat the question with the updated failed list
|
|
329
|
+
|
|
330
|
+
4. **If user selects "Stop":**
|
|
331
|
+
- Do not retry any more batches
|
|
332
|
+
- Proceed to Step 17 with whatever outputs exist
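
The retry policy for a single failed batch (wait 2 seconds, retry, at most 3 attempts) might look like the sketch below. `run_batch` is a hypothetical callable standing in for relaunching the subagent and checking that its output file now exists; `sleep` is injectable so the wait can be skipped in tests.

```python
import time
from typing import Callable


def retry_batch(
    run_batch: Callable[[], bool],
    max_attempts: int = 3,
    wait_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry one failed batch; return True as soon as an attempt succeeds."""
    for _ in range(max_attempts):
        sleep(wait_seconds)  # pause before every retry attempt
        if run_batch():
            return True
    return False  # still failing: include it in the next question to the user
```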
|
|
333
|
+
|
|
334
|
+
### Step 17: Combine All JSON Outputs
|
|
335
|
+
|
|
336
|
+
After all batches are complete (or user stops retrying), use `combine_outputs.py`:
|
|
337
|
+
|
|
338
|
+
**Before Step 13 and Step 17:** Extract the full path to the skill directory from the `<skill_files>` section in the skill tool output. Use this path as `<skill_path>` in the commands below.
|
|
339
|
+
|
|
340
|
+
**Unix-like shells:**
|
|
341
|
+
```bash
|
|
342
|
+
uv run python <skill_path>/combine_outputs.py \
|
|
343
|
+
--input-dir .long-table-summary/{topic}/outputs \
|
|
344
|
+
--output-file .long-table-summary/{topic}/combined_summary.xlsx
|
|
345
|
+
```
|
|
346
|
+
|
|
347
|
+
**For Windows cmd.exe:**
|
|
348
|
+
```bat
|
|
349
|
+
uv.exe run python <skill_path>\combine_outputs.py ^
|
|
350
|
+
--input-dir .long-table-summary\{topic}\outputs ^
|
|
351
|
+
--output-file .long-table-summary\{topic}\combined_summary.xlsx
|
|
352
|
+
```
|
|
353
|
+
|
|
354
|
+
**Expected output:**
|
|
355
|
+
- A single Excel file with all summaries combined
|
|
356
|
+
- Table format: row_number, <field_1>, <field_2>, ... (all user-requested fields)
|
|
357
|
+
- One row per input table row
|
|
358
|
+
- Sorted by row_number ascending
|
|
359
|
+
|
|
360
|
+
**Script behavior:**
|
|
361
|
+
- Reads all `batch*.md` files from the output directory
|
|
362
|
+
- Parses JSON from each file
|
|
363
|
+
- Dynamically determines all columns from the first batch's summaries (user's JSON keys)
|
|
364
|
+
- Merges all summaries into a structured table
|
|
365
|
+
- Writes the combined Excel file
|
|
366
|
+
|
|
367
|
+
### Step 18: Final Report
|
|
368
|
+
|
|
369
|
+
Provide user with:
|
|
370
|
+
1. Topic name used
|
|
371
|
+
2. Total batches processed
|
|
372
|
+
3. Number of retries (if any)
|
|
373
|
+
4. Combined output location
|
|
374
|
+
5. Row count in the combined file
|
|
375
|
+
|
|
376
|
+
## Python Scripts
|
|
377
|
+
|
|
378
|
+
### Script 1: `generate_prompts.py`
|
|
379
|
+
|
|
380
|
+
**Arguments:**
|
|
381
|
+
- `--template`: Path to subagent_template.md
|
|
382
|
+
- `--output-dir`: Directory for generated prompts
|
|
383
|
+
- `--num-batches`: Total number of batches
|
|
384
|
+
- `--sheet-name`: Sheet name
|
|
385
|
+
- `--file-path`: Full path to the input table file
|
|
386
|
+
- `--start-row`: Starting data row (default: 2)
|
|
387
|
+
- `--batch-size`: Rows per batch (default: 30)
|
|
388
|
+
- `--instructions`: User-confirmed JSON with summarization fields
|
|
389
|
+
- `--dry-run`: Validate without creating files (optional)
|
|
390
|
+
- `--verbose`: Enable verbose output for debugging (optional)
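
For orientation, the documented flags map naturally onto an `argparse` parser. The following is a sketch of how such a command line could be declared, not the shipped `generate_prompts.py`; which flags are required is an assumption, and only the two defaults (2 and 30) come from the list above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the flags documented for generate_prompts.py (illustrative only)."""
    parser = argparse.ArgumentParser(description="Generate one prompt file per batch.")
    parser.add_argument("--template", required=True, help="Path to subagent_template.md")
    parser.add_argument("--output-dir", required=True, help="Directory for generated prompts")
    parser.add_argument("--num-batches", type=int, required=True, help="Total number of batches")
    parser.add_argument("--sheet-name", required=True, help="Sheet name")
    parser.add_argument("--file-path", required=True, help="Full path to the input table file")
    parser.add_argument("--start-row", type=int, default=2, help="Starting data row")
    parser.add_argument("--batch-size", type=int, default=30, help="Rows per batch")
    parser.add_argument("--instructions", required=True, help="User-confirmed JSON string")
    parser.add_argument("--dry-run", action="store_true", help="Validate without creating files")
    parser.add_argument("--verbose", action="store_true", help="Enable verbose output")
    return parser
```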
|
|
391
|
+
|
|
392
|
+
**Placeholders to replace:**
|
|
393
|
+
- `{file_path}` → Absolute input file path
|
|
394
|
+
- `{sheet_name}` → Sheet name
|
|
395
|
+
- `{batch_number}` → Batch number (001, 002, etc.)
|
|
396
|
+
- `{row_start}` → Start row
|
|
397
|
+
- `{row_end}` → End row
|
|
398
|
+
- `{output_file}` → Output file path
|
|
399
|
+
- `{instructions_json}` → User's JSON instruction (properly escaped for markdown code block)
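
One way to perform the substitution is a single regular-expression pass over the template. This is a sketch and not necessarily how `generate_prompts.py` does it; `fill_template` is a hypothetical name. A plain `str.format` call is avoided here because the template's JSON schema contains bare `{` and `}` characters that `format` would try to interpret.

```python
import re

# Matches single-brace placeholders such as {row_start} or {output_file}.
PLACEHOLDER = re.compile(r"\{([a-z_]+)\}")


def fill_template(template: str, values: dict[str, str]) -> str:
    """Replace known {placeholders} in one pass; leave everything else untouched."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        # Unknown names are kept verbatim so they are easy to spot afterwards.
        return values.get(name, match.group(0))

    return PLACEHOLDER.sub(substitute, template)
```

Because the replacement happens in one pass, text inserted for `{instructions_json}` is never rescanned, so braces inside the user's JSON cannot be mistaken for placeholders.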
|
|
400
|
+
|
|
401
|
+
### Script 2: `combine_outputs.py`
|
|
402
|
+
|
|
403
|
+
**Arguments:**
|
|
404
|
+
- `--input-dir`: Directory containing the batch output files (`batch*.md`, each holding one JSON object)
|
|
405
|
+
- `--output-file`: Path for the combined Excel output file
|
|
406
|
+
- `--dry-run`: Validate inputs without writing output file (optional)
|
|
407
|
+
- `--verbose`: Enable verbose output for debugging (optional)
|
|
408
|
+
- `--deduplicate`: Remove duplicate row numbers (keep first occurrence) (optional)
|
|
409
|
+
- `--column-order`: Column order - 'preserve' (from first batch) or 'alphabetical' (default) (optional)
|
|
410
|
+
|
|
411
|
+
**Behavior:**
|
|
412
|
+
1. Scan the input directory for `batch*.md` files
|
|
413
|
+
2. For each file, extract the JSON content
|
|
414
|
+
3. Dynamically determine all columns from the first batch's summaries (extract all JSON keys from the first summary, excluding `batch_number` and `row_count`)
|
|
415
|
+
4. Merge all summaries by row_number
|
|
416
|
+
5. Create an Excel file with columns: row_number, <all user fields>
|
|
417
|
+
6. Sort by row_number ascending
|
|
418
|
+
7. Write to the output path
|
|
419
|
+
|
|
420
|
+
**Error handling:**
|
|
421
|
+
- If no output files are found → Return error JSON
|
|
422
|
+
- If JSON parse fails → Log error and continue with other files
|
|
423
|
+
- If duplicate row numbers exist → The last write wins (or use the `--deduplicate` flag to keep the first occurrence)
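
The merge and error-handling rules can be sketched without the Excel step. The function below is an illustration only: `merge_summaries` is a hypothetical name, it takes the already-read text of each batch file, and writing the `.xlsx` file is left out.

```python
import json


def merge_summaries(batch_texts: list[str], deduplicate: bool = False) -> list[dict]:
    """Merge per-batch summaries by row_number and return them sorted ascending."""
    merged: dict[int, dict] = {}
    for text in batch_texts:
        try:
            batch = json.loads(text)
        except json.JSONDecodeError:
            continue  # a broken file is skipped; the other files still count
        if not isinstance(batch, dict):
            continue
        for summary in batch.get("summaries", []):
            row_number = summary["row_number"]
            if deduplicate and row_number in merged:
                continue  # keep the first occurrence
            merged[row_number] = summary  # default behavior: last write wins
    return [merged[row_number] for row_number in sorted(merged)]
```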
|
|
424
|
+
|
|
425
|
+
## Notes
|
|
426
|
+
|
|
427
|
+
- The default batch size recommendation is 30 rows per batch
|
|
428
|
+
- Summarization instructions are provided as JSON with explicit field descriptions
|
|
429
|
+
- The topic name is autogenerated from the filename + manual comprehension of the user's JSON
|
|
430
|
+
- The summary words part (derived from JSON) has a maximum of 6 words
|
|
431
|
+
- The total topic name (filename + summary words) may exceed 6 words
|
|
432
|
+
- Subagent type is always `general`
|
|
433
|
+
- Subagents are launched in waves of 3 (not 5)
|
|
434
|
+
- Progress is reported every 3 completions
|
|
435
|
+
- Failed batches are retried up to 3 times
|
|
436
|
+
- Output is always combined automatically via the Python script
|
|
437
|
+
- The main agent does NOT manually read the JSON outputs
|