@brookmind/ai-toolkit 1.1.6 → 1.2.0
- package/LICENSE +21 -0
- package/README.md +42 -14
- package/dist/__tests__/index.test.d.ts +2 -0
- package/dist/__tests__/index.test.d.ts.map +1 -0
- package/dist/__tests__/index.test.js +214 -0
- package/dist/__tests__/index.test.js.map +1 -0
- package/dist/constants.d.ts +10 -0
- package/dist/constants.d.ts.map +1 -0
- package/dist/constants.js +39 -0
- package/dist/constants.js.map +1 -0
- package/dist/index.d.ts +2 -0
- package/dist/index.d.ts.map +1 -1
- package/dist/index.js +52 -335
- package/dist/index.js.map +1 -1
- package/dist/services/installers.d.ts +8 -0
- package/dist/services/installers.d.ts.map +1 -0
- package/dist/services/installers.js +79 -0
- package/dist/services/installers.js.map +1 -0
- package/dist/services/opencode.d.ts +3 -0
- package/dist/services/opencode.d.ts.map +1 -0
- package/dist/services/opencode.js +33 -0
- package/dist/services/opencode.js.map +1 -0
- package/dist/types.d.ts +10 -0
- package/dist/types.d.ts.map +1 -0
- package/dist/types.js +2 -0
- package/dist/types.js.map +1 -0
- package/dist/ui/categorize.d.ts +6 -0
- package/dist/ui/categorize.d.ts.map +1 -0
- package/dist/ui/categorize.js +69 -0
- package/dist/ui/categorize.js.map +1 -0
- package/dist/ui/choices.d.ts +6 -0
- package/dist/ui/choices.d.ts.map +1 -0
- package/dist/ui/choices.js +70 -0
- package/dist/ui/choices.js.map +1 -0
- package/dist/ui/display.d.ts +8 -0
- package/dist/ui/display.d.ts.map +1 -0
- package/dist/ui/display.js +84 -0
- package/dist/ui/display.js.map +1 -0
- package/dist/utils/fs.d.ts +5 -0
- package/dist/utils/fs.d.ts.map +1 -0
- package/dist/utils/fs.js +40 -0
- package/dist/utils/fs.js.map +1 -0
- package/dist/utils/terminal.d.ts +5 -0
- package/dist/utils/terminal.d.ts.map +1 -0
- package/dist/utils/terminal.js +18 -0
- package/dist/utils/terminal.js.map +1 -0
- package/package.json +27 -5
- package/agents/code-reviewer.md +0 -35
- package/agents/code-simplifier.md +0 -52
- package/commands/create-pr-description.md +0 -102
- package/commands/create-pr.md +0 -76
- package/commands/create-react-tests.md +0 -207
- package/mcps/context7/.mcp.json +0 -13
- package/mcps/expo-mcp/.mcp.json +0 -13
- package/mcps/figma-mcp/.mcp.json +0 -10
- package/skills/github-cli/SKILL.md +0 -125
- package/skills/pdf-processing-pro/FORMS.md +0 -610
- package/skills/pdf-processing-pro/OCR.md +0 -137
- package/skills/pdf-processing-pro/SKILL.md +0 -296
- package/skills/pdf-processing-pro/TABLES.md +0 -626
- package/skills/pdf-processing-pro/scripts/analyze_form.py +0 -307
- package/skills/react-best-practices/AGENTS.md +0 -915
- package/skills/react-best-practices/README.md +0 -127
- package/skills/react-best-practices/SKILL.md +0 -110
- package/skills/react-best-practices/metadata.json +0 -14
- package/skills/react-best-practices/rules/_sections.md +0 -41
- package/skills/react-best-practices/rules/_template.md +0 -28
- package/skills/react-best-practices/rules/advanced-event-handler-refs.md +0 -80
- package/skills/react-best-practices/rules/advanced-use-latest.md +0 -76
- package/skills/react-best-practices/rules/async-defer-await.md +0 -80
- package/skills/react-best-practices/rules/async-dependencies.md +0 -36
- package/skills/react-best-practices/rules/async-parallel.md +0 -28
- package/skills/react-best-practices/rules/async-suspense-boundaries.md +0 -100
- package/skills/react-best-practices/rules/bundle-barrel-imports.md +0 -42
- package/skills/react-best-practices/rules/bundle-conditional.md +0 -106
- package/skills/react-best-practices/rules/bundle-preload.md +0 -44
- package/skills/react-best-practices/rules/client-event-listeners.md +0 -131
- package/skills/react-best-practices/rules/client-swr-dedup.md +0 -133
- package/skills/react-best-practices/rules/js-batch-dom-css.md +0 -82
- package/skills/react-best-practices/rules/js-cache-function-results.md +0 -80
- package/skills/react-best-practices/rules/js-cache-property-access.md +0 -28
- package/skills/react-best-practices/rules/js-cache-storage.md +0 -70
- package/skills/react-best-practices/rules/js-combine-iterations.md +0 -32
- package/skills/react-best-practices/rules/js-early-exit.md +0 -50
- package/skills/react-best-practices/rules/js-hoist-regexp.md +0 -45
- package/skills/react-best-practices/rules/js-index-maps.md +0 -37
- package/skills/react-best-practices/rules/js-length-check-first.md +0 -49
- package/skills/react-best-practices/rules/js-min-max-loop.md +0 -82
- package/skills/react-best-practices/rules/js-set-map-lookups.md +0 -24
- package/skills/react-best-practices/rules/js-tosorted-immutable.md +0 -57
- package/skills/react-best-practices/rules/rendering-activity.md +0 -90
- package/skills/react-best-practices/rules/rendering-animate-svg-wrapper.md +0 -47
- package/skills/react-best-practices/rules/rendering-conditional-render.md +0 -40
- package/skills/react-best-practices/rules/rendering-content-visibility.md +0 -38
- package/skills/react-best-practices/rules/rendering-hoist-jsx.md +0 -65
- package/skills/react-best-practices/rules/rendering-svg-precision.md +0 -28
- package/skills/react-best-practices/rules/rerender-defer-reads.md +0 -39
- package/skills/react-best-practices/rules/rerender-dependencies.md +0 -45
- package/skills/react-best-practices/rules/rerender-derived-state.md +0 -29
- package/skills/react-best-practices/rules/rerender-functional-setstate.md +0 -74
- package/skills/react-best-practices/rules/rerender-lazy-state-init.md +0 -58
- package/skills/react-best-practices/rules/rerender-memo.md +0 -85
- package/skills/react-best-practices/rules/rerender-transitions.md +0 -40
- package/skills/skill-creator/LICENSE.txt +0 -202
- package/skills/skill-creator/SKILL.md +0 -209
- package/skills/skill-creator/scripts/init_skill.py +0 -303
- package/skills/skill-creator/scripts/package_skill.py +0 -110
- package/skills/skill-creator/scripts/quick_validate.py +0 -65
- package/skills/spring-boot-development/EXAMPLES.md +0 -2346
- package/skills/spring-boot-development/README.md +0 -595
- package/skills/spring-boot-development/SKILL.md +0 -1519
- package/themes/README.md +0 -68
- package/themes/claude-vivid.json +0 -72
@@ -1,610 +0,0 @@
-# PDF Form Processing Guide
-
-Complete guide for processing PDF forms in production environments.
-
-## Table of contents
-
-- Form analysis and field detection
-- Form filling workflows
-- Field types and handling
-- Validation strategies
-- Multi-page forms
-- Flattening and finalization
-- Error handling patterns
-- Production examples
-
-## Form analysis
-
-### Analyze form structure
-
-Use `analyze_form.py` to extract complete form information:
-
-```bash
-python scripts/analyze_form.py application.pdf --output schema.json
-```
-
-Output format:
-
-```json
-{
-  "full_name": {
-    "type": "text",
-    "required": true,
-    "max_length": 100,
-    "x": 120.5,
-    "y": 450.2,
-    "width": 300,
-    "height": 20
-  },
-  "date_of_birth": {
-    "type": "text",
-    "required": true,
-    "format": "MM/DD/YYYY",
-    "x": 120.5,
-    "y": 400.8,
-    "width": 150,
-    "height": 20
-  },
-  "email_newsletter": {
-    "type": "checkbox",
-    "required": false,
-    "x": 120.5,
-    "y": 350.4,
-    "width": 15,
-    "height": 15
-  },
-  "preferred_contact": {
-    "type": "radio",
-    "required": true,
-    "options": ["email", "phone", "mail"],
-    "x": 120.5,
-    "y": 300.0,
-    "width": 200,
-    "height": 60
-  }
-}
-```
-
-### Programmatic analysis
-
-```python
-from pypdf import PdfReader
-
-reader = PdfReader("form.pdf")
-# get_fields() returns None when the PDF has no interactive form fields
-fields = reader.get_fields() or {}
-
-for field_name, field_info in fields.items():
-    print(f"Field: {field_name}")
-    print(f"  Type: {field_info.get('/FT')}")
-    print(f"  Value: {field_info.get('/V')}")
-    print(f"  Flags: {field_info.get('/Ff', 0)}")
-    print()
-```
-
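The `/Ff` value printed above is a bit field. As a minimal sketch, assuming the flag positions given in the PDF specification's field-flag tables (bit 1 ReadOnly, bit 2 Required, bit 3 NoExport, plus a few type-specific bits), it can be decoded into readable names. The helper name `decode_field_flags` is hypothetical and is not part of pypdf or of this package:

```python
# Flag bits from the PDF specification's field-flag tables (1-based bit positions).
FIELD_FLAG_BITS = {
    1: "ReadOnly",
    2: "Required",
    3: "NoExport",
    13: "Multiline",      # text fields
    14: "Password",       # text fields
    15: "NoToggleToOff",  # radio buttons
    16: "Radio",          # button fields
    17: "Pushbutton",     # button fields
    18: "Combo",          # choice fields
}


def decode_field_flags(flags):
    """Return the names of the flag bits set in an /Ff integer (hypothetical helper)."""
    flags = int(flags or 0)
    return [
        name
        for bit, name in sorted(FIELD_FLAG_BITS.items())
        if flags & (1 << (bit - 1))
    ]


print(decode_field_flags(2))     # a field that is only marked Required
print(decode_field_flags(4099))  # 4096 + 2 + 1: Multiline, Required, ReadOnly bits set
```

Only a subset of the specification's flags is listed; extend the table if a form relies on others.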
-## Form filling workflows
-
-### Basic workflow
-
-```bash
-# 1. Analyze form
-python scripts/analyze_form.py template.pdf --output schema.json
-
-# 2. Prepare data
-cat > data.json << EOF
-{
-  "full_name": "John Doe",
-  "date_of_birth": "01/15/1990",
-  "email": "john@example.com",
-  "email_newsletter": true,
-  "preferred_contact": "email"
-}
-EOF
-
-# 3. Validate data
-python scripts/validate_form.py data.json schema.json
-
-# 4. Fill form
-python scripts/fill_form.py template.pdf data.json filled.pdf
-
-# 5. Flatten (optional - makes fields non-editable)
-python scripts/flatten_form.py filled.pdf final.pdf
-```
-
-### Programmatic filling
-
-```python
-from pypdf import PdfReader, PdfWriter
-
-reader = PdfReader("template.pdf")
-writer = PdfWriter()
-
-# Copy all pages together with the form definition (/AcroForm);
-# add_page() alone does not carry the form dictionary over
-writer.append(reader)
-
-# Fill form fields
-writer.update_page_form_field_values(
-    writer.pages[0],
-    {
-        "full_name": "John Doe",
-        "date_of_birth": "01/15/1990",
-        "email": "john@example.com",
-        "email_newsletter": "/Yes",  # Checkbox value
-        "preferred_contact": "/email"  # Radio value
-    }
-)
-
-# Save filled form
-with open("filled.pdf", "wb") as output:
-    writer.write(output)
-```
-
-## Field types and handling
-
-### Text fields
-
-```python
-# Values to pass to update_page_form_field_values()
-field_values = {}
-
-# Simple text
-field_values["customer_name"] = "Jane Smith"
-
-# Formatted text (dates)
-field_values["date"] = "12/25/2024"
-
-# Numbers
-field_values["amount"] = "1234.56"
-
-# Multi-line text
-field_values["comments"] = "Line 1\nLine 2\nLine 3"
-```
-
-### Checkboxes
-
-Checkboxes typically use `/Yes` for checked, `/Off` for unchecked:
-
-```python
-# Check checkbox
-field_values["agree_to_terms"] = "/Yes"
-
-# Uncheck checkbox
-field_values["newsletter_opt_out"] = "/Off"
-```
-
-**Note**: Some PDFs use different values. Check the `on_value` that `analyze_form.py` reports:
-
-```json
-{
-  "some_checkbox": {
-    "type": "checkbox",
-    "on_value": "/On",
-    "off_value": "/Off"
-  }
-}
-```
-
-### Radio buttons
-
-Radio buttons are mutually exclusive options:
-
-```python
-# Select one option from radio group
-field_values["preferred_contact"] = "/email"
-
-# Other options in same group
-# field_values["preferred_contact"] = "/phone"
-# field_values["preferred_contact"] = "/mail"
-```
-
-### Dropdown/List boxes
-
-```python
-# Single selection
-field_values["country"] = "United States"
-
-# The available options are listed in the schema:
-# "country": {
-#   "type": "dropdown",
-#   "options": ["United States", "Canada", "Mexico", ...]
-# }
-```
-
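The per-type rules above can be collected into one conversion step. The following is a minimal sketch, assuming a schema shaped like the `analyze_form.py` output shown earlier (`type`, plus optional `on_value`, `off_value`, and `options` keys). `to_pdf_values` is a hypothetical helper, not something shipped with the package:

```python
def to_pdf_values(data, schema):
    """Convert plain Python values into PDF field values by schema type (sketch)."""
    field_values = {}
    for name, value in data.items():
        field_schema = schema.get(name)
        if field_schema is None:
            continue  # not a field of this form

        field_type = field_schema.get("type")
        if field_type == "checkbox":
            on_value = field_schema.get("on_value", "/Yes")
            off_value = field_schema.get("off_value", "/Off")
            field_values[name] = on_value if value else off_value
        elif field_type == "radio":
            # Radio export values are PDF names, written with a leading slash
            text = str(value)
            field_values[name] = text if text.startswith("/") else f"/{text}"
        else:
            # Text and dropdown fields take the value as a string
            field_values[name] = str(value)
    return field_values


schema = {
    "full_name": {"type": "text"},
    "email_newsletter": {"type": "checkbox", "on_value": "/On", "off_value": "/Off"},
    "preferred_contact": {"type": "radio", "options": ["email", "phone", "mail"]},
}
data = {"full_name": "John Doe", "email_newsletter": True, "preferred_contact": "email"}
print(to_pdf_values(data, schema))
```

Keeping the conversion separate from the fill step lets the same `data.json` (with JSON booleans) feed both validation and filling.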
-## Validation strategies
-
-### Schema-based validation
-
-```python
-import json
-import sys
-
-# validate_format() is defined under "Format validation" below
-
-# Load schema from analyze_form.py output
-with open("schema.json") as f:
-    schema = json.load(f)
-
-# Load form data
-with open("data.json") as f:
-    data = json.load(f)
-
-# Validate all fields
-errors = []
-
-for field_name, field_schema in schema.items():
-    value = data.get(field_name)
-
-    # Check required fields
-    if field_schema.get("required") and not value:
-        errors.append(f"Missing required field: {field_name}")
-
-    # Check field type
-    if value and field_schema.get("type") == "text":
-        if not isinstance(value, str):
-            errors.append(f"Field {field_name} must be string")
-
-    # Check max length
-    max_length = field_schema.get("max_length")
-    if value and max_length and len(str(value)) > max_length:
-        errors.append(f"Field {field_name} exceeds max length {max_length}")
-
-    # Check format (dates, emails, etc)
-    format_type = field_schema.get("format")
-    if value and format_type:
-        if not validate_format(value, format_type):
-            errors.append(f"Field {field_name} has invalid format")
-
-if errors:
-    print("Validation errors:")
-    for error in errors:
-        print(f"  - {error}")
-    sys.exit(1)
-
-print("Validation passed")
-```
-
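The loop above does not check choice fields. As a sketch under the same schema assumptions, the checks can be wrapped in a function that also verifies radio and dropdown values against the schema's `options` list. `validate_data` is a hypothetical name, and format checks are left out to keep the example self-contained:

```python
def validate_data(data, schema):
    """Return a list of validation error messages (empty when the data is valid)."""
    errors = []
    for field_name, field_schema in schema.items():
        value = data.get(field_name)
        missing = value is None or value == ""

        if missing:
            if field_schema.get("required"):
                errors.append(f"Missing required field: {field_name}")
            continue

        max_length = field_schema.get("max_length")
        if max_length and len(str(value)) > max_length:
            errors.append(f"Field {field_name} exceeds max length {max_length}")

        options = field_schema.get("options")
        if options and value not in options:
            errors.append(f"Field {field_name} must be one of {options}")
    return errors


schema = {
    "full_name": {"type": "text", "required": True, "max_length": 100},
    "preferred_contact": {
        "type": "radio",
        "required": True,
        "options": ["email", "phone", "mail"],
    },
}
print(validate_data({"full_name": "John Doe", "preferred_contact": "fax"}, schema))
```

Unlike a plain `not value` test, this version treats `False` as a present value, so an unchecked but required checkbox is not reported as missing.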
-### Format validation
-
-```python
-import re
-from datetime import datetime
-
-def validate_format(value, format_type):
-    """Validate field format."""
-
-    if format_type == "email":
-        pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
-        return re.match(pattern, value) is not None
-
-    elif format_type == "phone":
-        # US phone: (555) 123-4567 or 555-123-4567
-        pattern = r'^\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$'
-        return re.match(pattern, value) is not None
-
-    elif format_type == "MM/DD/YYYY":
-        try:
-            datetime.strptime(value, "%m/%d/%Y")
-            return True
-        except ValueError:
-            return False
-
-    elif format_type == "SSN":
-        # XXX-XX-XXXX
-        pattern = r'^\d{3}-\d{2}-\d{4}$'
-        return re.match(pattern, value) is not None
-
-    elif format_type == "ZIP":
-        # XXXXX or XXXXX-XXXX
-        pattern = r'^\d{5}(-\d{4})?$'
-        return re.match(pattern, value) is not None
-
-    return True  # Unknown format, skip validation
-```
-
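One detail of the patterns above: `re.match` with a trailing `$` also accepts a value that ends in a newline, because `$` matches just before a final newline. A stricter variant, sketched here for the ZIP pattern only, uses `re.fullmatch`. `is_valid_zip` is a hypothetical helper:

```python
import re

ZIP_PATTERN = re.compile(r"\d{5}(-\d{4})?")


def is_valid_zip(value):
    """Accept XXXXX or XXXXX-XXXX with nothing before or after."""
    return ZIP_PATTERN.fullmatch(value) is not None


# The original pattern tolerates a trailing newline; fullmatch does not.
print(re.match(r"^\d{5}(-\d{4})?$", "12345\n") is not None)
print(is_valid_zip("12345\n"))
print(is_valid_zip("12345-6789"))
```

The same change applies to the email, phone, and SSN patterns if input may carry stray line endings.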
-## Multi-page forms
-
-### Handling multi-page forms
-
-```python
-from pypdf import PdfReader, PdfWriter
-
-reader = PdfReader("multi_page_form.pdf")
-writer = PdfWriter()
-
-# Copy all pages together with the form definition (/AcroForm)
-writer.append(reader)
-
-# Fill fields on page 1
-writer.update_page_form_field_values(
-    writer.pages[0],
-    {
-        "name_page1": "John Doe",
-        "email_page1": "john@example.com"
-    }
-)
-
-# Fill fields on page 2
-writer.update_page_form_field_values(
-    writer.pages[1],
-    {
-        "address_page2": "123 Main St",
-        "city_page2": "Springfield"
-    }
-)
-
-# Fill fields on page 3
-writer.update_page_form_field_values(
-    writer.pages[2],
-    {
-        "signature_page3": "John Doe",
-        "date_page3": "12/25/2024"
-    }
-)
-
-with open("filled_multi_page.pdf", "wb") as output:
-    writer.write(output)
-```
-
-### Identifying page-specific fields
-
-```python
-from pypdf import PdfReader
-
-reader = PdfReader("multi_page_form.pdf")
-
-# Analyze which fields are on which pages
-for page_num, page in enumerate(reader.pages, 1):
-    fields = page.get("/Annots", [])
-
-    if fields:
-        print(f"\nPage {page_num} fields:")
-        for field_ref in fields:
-            field = field_ref.get_object()
-            field_name = field.get("/T", "Unknown")
-            print(f"  - {field_name}")
-```
-
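Once the field names on each page are known, a flat data dictionary can be split into one dictionary per page before calling `update_page_form_field_values`. A minimal sketch, assuming a `page_fields` mapping (zero-based page index to field names) built from the listing above. Both `page_fields` and `split_by_page` are hypothetical names:

```python
def split_by_page(data, page_fields):
    """Group field values by the page index their field appears on."""
    per_page = {}
    for page_index, names in page_fields.items():
        values = {name: data[name] for name in names if name in data}
        if values:
            per_page[page_index] = values
    return per_page


page_fields = {
    0: ["name_page1", "email_page1"],
    1: ["address_page2", "city_page2"],
    2: ["signature_page3", "date_page3"],
}
data = {
    "name_page1": "John Doe",
    "address_page2": "123 Main St",
    "city_page2": "Springfield",
}

for page_index, values in split_by_page(data, page_fields).items():
    # With pypdf this would become:
    # writer.update_page_form_field_values(writer.pages[page_index], values)
    print(page_index, values)
```

Pages without any supplied values are skipped, so no empty update call is made for them.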
-## Flattening forms
-
-### Why flatten
-
-Flattening makes form fields non-editable, embedding values permanently:
-
-- **Security**: Prevent modifications
-- **Distribution**: Share read-only forms
-- **Printing**: Ensure correct appearance
-- **Archival**: Long-term storage
-
-### Flatten with pypdf
-
-```python
-from pypdf import PdfReader, PdfWriter
-
-reader = PdfReader("filled.pdf")
-writer = PdfWriter()
-
-# Copy all pages together with the form definition (/AcroForm)
-writer.append(reader)
-
-# Flatten all form fields
-# (verify that your installed pypdf version provides this method)
-writer.flatten_fields()
-
-# Save flattened PDF
-with open("flattened.pdf", "wb") as output:
-    writer.write(output)
-```
-
-### Using included script
-
-```bash
-python scripts/flatten_form.py filled.pdf flattened.pdf
-```
-
-## Error handling patterns
-
-### Robust form filling
-
-```python
-import logging
-import sys
-from pathlib import Path
-from pypdf import PdfReader, PdfWriter
-from pypdf.errors import PdfReadError
-
-logging.basicConfig(level=logging.INFO)
-logger = logging.getLogger(__name__)
-
-def fill_form_safe(template_path, data, output_path):
-    """Fill form with comprehensive error handling."""
-
-    try:
-        # Validate inputs
-        template = Path(template_path)
-        if not template.exists():
-            raise FileNotFoundError(f"Template not found: {template_path}")
-
-        # Read template
-        logger.info(f"Reading template: {template_path}")
-        reader = PdfReader(template_path)
-
-        if not reader.pages:
-            raise ValueError("PDF has no pages")
-
-        # Check if form has fields
-        fields = reader.get_fields()
-        if not fields:
-            logger.warning("PDF has no form fields")
-            return False
-
-        # Create writer; append() copies the pages and the form definition
-        writer = PdfWriter()
-        writer.append(reader)
-
-        # Validate data against the form's fields
-        missing_required = []
-
-        for field_name, field_info in fields.items():
-            # Check required fields (bit 2 of the /Ff flags)
-            is_required = (field_info.get("/Ff", 0) & 2) == 2
-            if is_required and field_name not in data:
-                missing_required.append(field_name)
-
-        # Check for field names in data that the form does not define
-        invalid_fields = [name for name in data if name not in fields]
-        if invalid_fields:
-            logger.warning(f"Data contains unknown fields: {invalid_fields}")
-
-        if missing_required:
-            raise ValueError(f"Missing required fields: {missing_required}")
-
-        # Fill fields (first page only; see "Multi-page forms" for other pages)
-        logger.info("Filling form fields")
-        writer.update_page_form_field_values(
-            writer.pages[0],
-            data
-        )
-
-        # Write output
-        logger.info(f"Writing output: {output_path}")
-        with open(output_path, "wb") as output:
-            writer.write(output)
-
-        logger.info("Form filled successfully")
-        return True
-
-    except PdfReadError as e:
-        logger.error(f"PDF read error: {e}")
-        return False
-
-    except FileNotFoundError as e:
-        logger.error(f"File error: {e}")
-        return False
-
-    except ValueError as e:
-        logger.error(f"Validation error: {e}")
-        return False
-
-    except Exception as e:
-        logger.error(f"Unexpected error: {e}")
-        return False
-
-# Usage
-success = fill_form_safe(
-    "template.pdf",
-    {"name": "John", "email": "john@example.com"},
-    "filled.pdf"
-)
-
-if not success:
-    sys.exit(1)
-```
-
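If writing fails midway, `open(output_path, "wb")` can leave a truncated PDF behind. One common mitigation, sketched here with the standard library only, is to write to a temporary file in the same directory and rename it into place. `write_atomically` is a hypothetical helper; with pypdf, `write_fn` would typically be `writer.write`:

```python
import os
import tempfile
from pathlib import Path


def write_atomically(output_path, write_fn):
    """Call write_fn(file_obj) on a temp file, then move it to output_path."""
    output_path = Path(output_path)
    fd, tmp_name = tempfile.mkstemp(dir=str(output_path.parent), suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp_file:
            write_fn(tmp_file)
        # Replaces the target in one step when both paths share a filesystem
        os.replace(tmp_name, output_path)
    except BaseException:
        # Remove the partial temp file and leave any existing output untouched
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


with tempfile.TemporaryDirectory() as tmp_dir:
    target = Path(tmp_dir) / "filled.pdf"
    write_atomically(target, lambda f: f.write(b"%PDF-1.7 demo"))
    print(target.read_bytes())
```

The temporary file is created next to the target on purpose: a rename across filesystems would fall back to a copy and lose the all-or-nothing behavior.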
-## Production examples
-
-### Example 1: Batch form processing
-
-```python
-import json
-from pathlib import Path
-# Assumes fill_form_safe() from "Robust form filling" is saved as fill_form_safe.py
-from fill_form_safe import fill_form_safe
-
-# Process multiple submissions
-submissions_dir = Path("submissions")
-template = "application_template.pdf"
-output_dir = Path("completed")
-output_dir.mkdir(exist_ok=True)
-
-for submission_file in submissions_dir.glob("*.json"):
-    print(f"Processing: {submission_file.name}")
-
-    # Load submission data
-    with open(submission_file) as f:
-        data = json.load(f)
-
-    # Fill form
-    applicant_id = data.get("id", "unknown")
-    output_file = output_dir / f"application_{applicant_id}.pdf"
-
-    success = fill_form_safe(template, data, output_file)
-
-    if success:
-        print(f"  ✓ Completed: {output_file}")
-    else:
-        print(f"  ✗ Failed: {submission_file.name}")
-```
-
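In the loop above, `applicant_id` comes straight from submission data and is interpolated into a file name. A cautious sketch, assuming IDs should be restricted to letters, digits, dashes, and underscores (that policy is an assumption, as is the helper name `safe_output_name`):

```python
import re


def safe_output_name(applicant_id, prefix="application"):
    """Build a file name from an untrusted ID, dropping path separators and the like."""
    cleaned = re.sub(r"[^A-Za-z0-9_-]+", "_", str(applicant_id)).strip("_")
    return f"{prefix}_{cleaned or 'unknown'}.pdf"


print(safe_output_name("A-1042"))
print(safe_output_name("../../etc/passwd"))
```

The result can then be joined to `output_dir` as before, so a crafted ID cannot point the output outside that directory.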
-### Example 2: Form with conditional logic
-
-```python
-# Assumes fill_form_safe() is saved as fill_form_safe.py, as in Example 1
-from fill_form_safe import fill_form_safe
-
-def prepare_form_data(raw_data):
-    """Prepare form data with conditional logic."""
-
-    form_data = {}
-
-    # Basic fields
-    form_data["full_name"] = raw_data["name"]
-    form_data["email"] = raw_data["email"]
-
-    # Conditional fields
-    if raw_data.get("is_student"):
-        form_data["student_id"] = raw_data["student_id"]
-        form_data["school_name"] = raw_data["school"]
-    else:
-        form_data["employer"] = raw_data.get("employer", "")
-
-    # Checkbox logic
-    form_data["newsletter"] = "/Yes" if raw_data.get("opt_in") else "/Off"
-
-    # Calculated fields
-    total = sum(raw_data.get("items", []))
-    form_data["total_amount"] = f"${total:.2f}"
-
-    return form_data
-
-# Usage
-raw_input = {
-    "name": "Jane Smith",
-    "email": "jane@example.com",
-    "is_student": True,
-    "student_id": "12345",
-    "school": "State University",
-    "opt_in": True,
-    "items": [10.00, 25.50, 15.75]
-}
-
-form_data = prepare_form_data(raw_input)
-fill_form_safe("template.pdf", form_data, "output.pdf")
-```
-
-## Best practices
-
-1. **Always analyze before filling**: Use `analyze_form.py` to understand structure
-2. **Validate early**: Check data before attempting to fill
-3. **Use logging**: Track operations for debugging
-4. **Handle errors gracefully**: Don't crash on invalid data
-5. **Test with samples**: Verify with small datasets first
-6. **Flatten when distributing**: Make read-only for recipients
-7. **Keep templates versioned**: Track form template changes
-8. **Document field mappings**: Maintain data-to-field documentation
-
-## Troubleshooting
-
-### Fields not filling
-
-1. Check field names match exactly (case-sensitive)
-2. Verify checkbox/radio values (`/Yes`, `/On`, etc.)
-3. Ensure PDF is not encrypted or protected
-4. Check if form uses XFA format (not supported by pypdf)
-
-### Encoding issues
-
-```python
-# Handle special characters
-field_values["name"] = "José García"  # Pass Unicode text as a normal str
-```
-
-### Large batch processing
-
-```python
-# Process in chunks to avoid memory issues
-# (submissions is your list of loaded records; process_batch is your own handler)
-chunk_size = 100
-
-for i in range(0, len(submissions), chunk_size):
-    chunk = submissions[i:i + chunk_size]
-    process_batch(chunk)
-```
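The slicing loop above needs the full `submissions` list in memory. A generator-based sketch that works on any iterable, such as the results of `Path.glob`, is shown below. `chunked` is a hypothetical helper built on `itertools.islice`:

```python
from itertools import islice


def chunked(iterable, chunk_size):
    """Yield lists of up to chunk_size items without materializing the whole input."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


for chunk in chunked(range(5), 2):
    print(chunk)
```

Each chunk is a plain list, so it can be passed to a batch handler in the same way as the slices in the original loop.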