@yeyuan98/opencode-bioresearcher-plugin 1.4.0 → 1.5.0-alpha.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +35 -20
- package/dist/db-tools/backends/index.d.ts +11 -0
- package/dist/db-tools/backends/index.js +48 -0
- package/dist/db-tools/backends/mongodb/backend.d.ts +15 -0
- package/dist/db-tools/backends/mongodb/backend.js +76 -0
- package/dist/db-tools/backends/mongodb/connection.d.ts +27 -0
- package/dist/db-tools/backends/mongodb/connection.js +107 -0
- package/dist/db-tools/backends/mongodb/index.d.ts +4 -0
- package/dist/db-tools/backends/mongodb/index.js +3 -0
- package/dist/db-tools/backends/mongodb/translator.d.ts +30 -0
- package/dist/db-tools/backends/mongodb/translator.js +407 -0
- package/dist/db-tools/backends/mysql/backend.d.ts +15 -0
- package/dist/db-tools/backends/mysql/backend.js +57 -0
- package/dist/db-tools/backends/mysql/connection.d.ts +25 -0
- package/dist/db-tools/backends/mysql/connection.js +83 -0
- package/dist/db-tools/backends/mysql/index.d.ts +3 -0
- package/dist/db-tools/backends/mysql/index.js +2 -0
- package/dist/db-tools/backends/mysql/translator.d.ts +7 -0
- package/dist/db-tools/backends/mysql/translator.js +67 -0
- package/dist/db-tools/core/base.d.ts +17 -0
- package/dist/db-tools/core/base.js +51 -0
- package/dist/db-tools/core/config-loader.d.ts +3 -0
- package/dist/db-tools/core/config-loader.js +46 -0
- package/dist/db-tools/core/index.d.ts +2 -0
- package/dist/db-tools/core/index.js +2 -0
- package/dist/db-tools/core/jsonc-parser.d.ts +2 -0
- package/dist/db-tools/core/jsonc-parser.js +77 -0
- package/dist/db-tools/core/validator.d.ts +16 -0
- package/dist/db-tools/core/validator.js +118 -0
- package/dist/db-tools/executor.d.ts +13 -0
- package/dist/db-tools/executor.js +54 -0
- package/dist/db-tools/index.d.ts +51 -0
- package/dist/db-tools/index.js +27 -0
- package/dist/db-tools/interface/backend.d.ts +24 -0
- package/dist/db-tools/interface/backend.js +1 -0
- package/dist/db-tools/interface/connection.d.ts +21 -0
- package/dist/db-tools/interface/connection.js +11 -0
- package/dist/db-tools/interface/index.d.ts +4 -0
- package/dist/db-tools/interface/index.js +4 -0
- package/dist/db-tools/interface/query.d.ts +60 -0
- package/dist/db-tools/interface/query.js +1 -0
- package/dist/db-tools/interface/schema.d.ts +22 -0
- package/dist/db-tools/interface/schema.js +1 -0
- package/dist/db-tools/pool.d.ts +8 -0
- package/dist/db-tools/pool.js +49 -0
- package/dist/db-tools/tools/index.d.ts +27 -0
- package/dist/db-tools/tools/index.js +191 -0
- package/dist/db-tools/tools.d.ts +27 -0
- package/dist/db-tools/tools.js +111 -0
- package/dist/db-tools/types.d.ts +94 -0
- package/dist/db-tools/types.js +40 -0
- package/dist/db-tools/utils.d.ts +33 -0
- package/dist/db-tools/utils.js +94 -0
- package/dist/index.js +2 -0
- package/dist/skills/bioresearcher-core/README.md +210 -210
- package/dist/skills/bioresearcher-core/SKILL.md +128 -128
- package/dist/skills/bioresearcher-core/examples/contexts.json +29 -29
- package/dist/skills/bioresearcher-core/examples/data-exchange-example.md +303 -303
- package/dist/skills/bioresearcher-core/examples/template.md +49 -49
- package/dist/skills/bioresearcher-core/patterns/calculator.md +215 -215
- package/dist/skills/bioresearcher-core/patterns/data-exchange.md +406 -406
- package/dist/skills/bioresearcher-core/patterns/json-tools.md +263 -263
- package/dist/skills/bioresearcher-core/patterns/progress.md +127 -127
- package/dist/skills/bioresearcher-core/patterns/retry.md +110 -110
- package/dist/skills/bioresearcher-core/patterns/shell-commands.md +79 -79
- package/dist/skills/bioresearcher-core/patterns/subagent-waves.md +186 -186
- package/dist/skills/bioresearcher-core/patterns/table-tools.md +260 -260
- package/dist/skills/bioresearcher-core/patterns/user-confirmation.md +187 -187
- package/dist/skills/bioresearcher-core/python/template.md +273 -273
- package/dist/skills/bioresearcher-core/python/template.py +323 -323
- package/dist/skills/env-jsonc-setup/SKILL.md +206 -0
- package/package.json +3 -1
@@ -1,260 +1,260 @@
# Table Tools Pattern

Guide for combining subagent outputs using table tools.

## Overview

Table tools can be used to combine JSON outputs into Excel/CSV files without Python scripts for small batches.

## When to Use Table Tools vs Python

| Scenario | Use Table Tools | Use Python Script |
|----------|-----------------|-------------------|
| Files to combine | <10 files | >=10 files |
| Data size | Small (<1000 rows total) | Large (>1000 rows) |
| Complexity | Simple merge | Complex transformations |
| Performance | Acceptable overhead | Need efficiency |

## Tool: tableCreateFile

Create a new Excel/CSV file from data.

### Signature
```
tableCreateFile(
  file_path: string,
  sheet_name: string = "Sheet1",
  data: array  // Array of arrays OR array of objects
)
```

### Return Format
```json
{
  "success": true,
  "file_path": "./output.xlsx",
  "sheet_name": "Sheet1",
  "rows_created": 100,
  "message": "Successfully created Excel file with 100 rows"
}
```

### Examples

```
# Create from array of objects
tableCreateFile(
  file_path="./output.xlsx",
  sheet_name="Results",
  data=[
    {"row_number": 1, "name": "Alice", "score": 95},
    {"row_number": 2, "name": "Bob", "score": 87}
  ]
)

# Create from array of arrays
tableCreateFile(
  file_path="./output.csv",
  sheet_name="Sheet1",
  data=[
    ["row_number", "name", "score"],
    [1, "Alice", 95],
    [2, "Bob", 87]
  ]
)
```

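The dual input shape (array of objects or array of arrays) can be mimicked for the CSV case with a short stdlib helper. `create_csv_file` below is a hypothetical local stand-in for illustration, not the plugin's implementation:

```python
import csv

def create_csv_file(file_path, data):
    """Write rows to a CSV file, accepting dict rows or list rows.

    Dict rows contribute a header row taken from the first object's
    keys; list rows are written as-is, with the first row assumed to
    be the header.
    """
    if data and isinstance(data[0], dict):
        headers = list(data[0].keys())
        rows = [headers] + [[item.get(h) for h in headers] for item in data]
    else:
        rows = list(data)
    with open(file_path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    # rows_created counts data rows, excluding the header
    return {"success": True, "file_path": file_path,
            "rows_created": max(len(rows) - 1, 0)}
```

Either input shape produces the same file, which is the property the examples above rely on.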
## Tool: tableAppendRows

Append rows to an existing table file.

### Signature
```
tableAppendRows(
  file_path: string,
  sheet_name: string?,  // Optional, uses first sheet
  rows: array  // Array of arrays OR array of objects
)
```

### Return Format
```json
{
  "success": true,
  "file_path": "./output.xlsx",
  "sheet_name": "Sheet1",
  "rows_appended": 50,
  "message": "Successfully appended 50 rows"
}
```

### Examples

**RECOMMENDED: Append using array-of-arrays format (no header duplication):**
```
tableAppendRows(
  file_path="./output.xlsx",
  rows=[
    [3, "Charlie", 92],
    [4, "Diana", 88]
  ]
)
```

**Alternative: Append using objects (may duplicate headers):**
```
tableAppendRows(
  file_path="./output.xlsx",
  rows=[
    {"row_number": 3, "name": "Charlie", "score": 92}
  ]
)
```

> **Note:** When appending object-format data, some implementations may insert a duplicate header row. For reliable results, use array-of-arrays format for append operations.

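The object-to-array conversion used throughout this guide can be factored into two small helpers; the names are illustrative, not part of the plugin:

```python
def union_headers(rows):
    """Collect column names across all dict rows, preserving first-seen order."""
    headers = []
    for item in rows:
        for key in item:
            if key not in headers:
                headers.append(key)
    return headers

def rows_to_arrays(rows, headers):
    """Convert dict rows to array-of-arrays form; missing keys become None.

    The output is rectangular even for ragged objects, so every row
    aligns with the file's header order.
    """
    return [[item.get(h) for h in headers] for item in rows]
```

Deriving headers from the union of keys (rather than only the first row) guards against batches whose first object happens to omit a column.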
## Combining JSON Outputs Workflow

### Step 1: Extract All JSON

```
# Extract JSON from each output file
all_rows = []

for batch_num in range(1, num_batches + 1):
    file_path = f"./outputs/batch{batch_num:03d}.md"
    result = jsonExtract(file_path=file_path)

    if result.success:
        # Assuming data has "summaries" array
        summaries = result.data.get("summaries", [])
        all_rows.extend(summaries)
    else:
        log_error(f"Failed to extract from {file_path}")
```

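The extraction step can be sketched in plain Python. `json_extract` below is a guess at what a tool like `jsonExtract` might do internally — pull the first fenced `json` block out of a markdown document — and is offered only as a hypothetical illustration:

```python
import json
import re

FENCE = "`" * 3  # literal triple backtick, kept out of the snippet text

def json_extract(text):
    """Extract and parse the first fenced json code block in markdown text.

    Hypothetical stand-in: the real jsonExtract takes a file_path and
    may support other layouts or multiple blocks.
    """
    pattern = FENCE + r"json\s*\n(.*?)\n" + FENCE
    match = re.search(pattern, text, re.DOTALL)
    if not match:
        return {"success": False, "data": None}
    try:
        return {"success": True, "data": json.loads(match.group(1))}
    except json.JSONDecodeError:
        return {"success": False, "data": None}
```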
### Step 2: Create Combined File

```
# Create Excel file with all rows
tableCreateFile(
  file_path="./combined_summary.xlsx",
  sheet_name="Summary",
  data=all_rows
)
```

### Step 3: Append Additional Data (Optional)

```
# If processing in chunks, append to existing file using array format
# First, get headers from the created file
headers = list(all_rows[0].keys()) if all_rows else []

for chunk in chunks:
    rows = extract_rows(chunk)
    # Convert to arrays to avoid header duplication
    rows_as_arrays = [[item.get(h) for h in headers] for item in rows]
    tableAppendRows(
        file_path="./combined_summary.xlsx",
        rows=rows_as_arrays
    )
```

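Step 3 iterates over `chunks` without defining them; splitting collected rows into fixed-size chunks is a two-line helper (illustrative, not part of the plugin):

```python
def chunked(rows, size):
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]
```

Each yielded slice can then be converted and passed to a single tableAppendRows call.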
## Complete Example: Combining Batch Outputs

```
# Configuration
output_dir = "./outputs"
num_batches = 9
combined_file = "./combined_summary.xlsx"

# Collect all rows
all_rows = []
failed_batches = []

for batch_num in range(1, num_batches + 1):
    file_path = f"{output_dir}/batch{batch_num:03d}.md"

    # Extract JSON
    result = jsonExtract(file_path=file_path)

    if not result.success:
        failed_batches.append(batch_num)
        continue

    # Get summaries from batch output
    summaries = result.data.get("summaries", [])
    all_rows.extend(summaries)

# Report failures
if failed_batches:
    log_error(f"Failed batches: {failed_batches}")

# Sort by row_number
all_rows.sort(key=lambda x: x.get("row_number", 0))

# Create combined Excel
result = tableCreateFile(
    file_path=combined_file,
    sheet_name="Combined",
    data=all_rows
)

# Report result
report(f"Created {combined_file} with {result.rows_created} rows")
```

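The sort step defaults missing `row_number` values to 0, which silently interleaves such rows at the top. A defensive variant (illustrative) pushes rows without the key to the end instead:

```python
def sort_rows(rows, key="row_number"):
    """Sort dict rows by a numeric key; rows missing the key go last.

    The tuple sort key makes presence of the key the primary criterion,
    so incomplete rows are easy to spot at the bottom of the sheet.
    """
    return sorted(rows, key=lambda x: (key not in x, x.get(key, 0)))
```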
## Incremental Append Strategy

For large datasets, append incrementally using array-of-arrays format:

```
# Extract the first batch
first_batch = jsonExtract(file_path="./outputs/batch001.md")

# Create file with first batch (using objects for auto-headers)
tableCreateFile(
    file_path="./combined.xlsx",
    sheet_name="Data",
    data=first_batch.data.get("summaries", [])
)

# Get header order for array format
headers = list(first_batch.data.get("summaries", [{}])[0].keys())

# Append remaining batches using array format
for batch_num in range(2, num_batches + 1):
    file_path = f"./outputs/batch{batch_num:03d}.md"
    result = jsonExtract(file_path=file_path)

    if result.success:
        # Convert objects to arrays to avoid header duplication
        rows_as_arrays = [
            [item.get(h) for h in headers]
            for item in result.data.get("summaries", [])
        ]
        tableAppendRows(
            file_path="./combined.xlsx",
            rows=rows_as_arrays
        )
```

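For the CSV case, incremental appending needs no re-read of the file: opening in append mode is enough. `append_csv_rows` is a hypothetical CSV-only stand-in for tableAppendRows:

```python
import csv

def append_csv_rows(file_path, rows):
    """Append array-of-arrays rows to an existing CSV file.

    Array rows sidestep the header-duplication issue entirely, since
    no header inference happens on append.
    """
    with open(file_path, "a", newline="") as f:
        csv.writer(f).writerows(rows)
    return {"success": True, "file_path": file_path,
            "rows_appended": len(rows)}
```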
## Supported File Formats

| Format | Extension | Notes |
|--------|-----------|-------|
| Excel | .xlsx | Recommended |
| Excel (Legacy) | .xls | Limited support |
| ODS | .ods | OpenDocument Spreadsheet |
| CSV | .csv | Text-based, no sheets |

## Best Practices

1. **Sort before writing**: Sort rows by key field before creating file
2. **Handle failures gracefully**: Log failed extractions, continue with others
3. **Use object format for creation**: Array of objects auto-generates headers
4. **Use array format for appends**: Prefer array-of-arrays when using tableAppendRows to avoid potential header duplication
5. **Check row counts**: Verify expected vs actual row counts
6. **For large batches**: Use Python script for better performance