acroforge 0.1.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+   metadata.gz: 700f9e41b86a58ef21a6f5d045907c1df97ae55c9d2728a465cd7b0f3ecf828e
4
+   data.tar.gz: 9859e634dbf6236e17ee1392ccf1239c8edf494be6072272218f6e2c58092c33
5
+ SHA512:
6
+   metadata.gz: 29c17e2e6958e5772cab51aa25a11187cc8d1f08cc9ccf7b2add174abacee05682b44f3f38f54dd50e6755084d5ccd5949407373da0d5d9a0e81de4257ed2348
7
+   data.tar.gz: eebeb0e39a636cba72f879cb6b4d31af3a670a01ae3692a86e47d5d248eb8a20ba656ded433f7c17f1fdf4b8d29041c4751bb2b78f2fff141c660798bbd1f805
data/CHANGELOG.md ADDED
@@ -0,0 +1,11 @@
1
+ # CHANGELOG
2
+
3
+ ## [0.1.0] - 2026-05-26
4
+
5
+ ### Added
6
+
7
+ - `AcroForge::Engine`: PDF AcroForm compilation with a spatial heuristic and dependency-injected schema and overrides.
8
+ - `AcroForge::Schema`: load, dump, normalize, and infer schemas (YAML or JSON).
9
+ - `AcroForge::Relabeler`: propose and apply field renames via heuristic-derived YAML mappings.
10
+ - `AcroForge::Validator`: type validation for payloads.
11
+ - CLI: `acroforge schema|relabel|compile|bootstrap|version|help`.
data/LICENSE.txt ADDED
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2026-present Maxwell Nana Forson
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,217 @@
1
+ # AcroForge
2
+
3
+ A Ruby toolkit for working with PDF AcroForms, especially the broken ones.
4
+
5
+ AcroForge reads, validates, relabels, and fills PDF forms. Its standout feature is the **relabeling workflow**: when a vendor ships you a fillable PDF whose internal field names look like `page0_field6`, `Text101`, or worse, AcroForge runs a spatial heuristic to figure out what each field is _actually_ for, writes its proposal to a human-reviewable YAML file, and then permanently renames the AcroForm fields once you've approved the mapping. The result is a PDF you can fill programmatically without ever again writing `pdf.fields["page0_field6"] = "Alice"`.
6
+
7
+ It works on any AcroForm PDF: loan applications, school admission forms, government paperwork, internal HR templates. Nothing in the gem is domain-specific.
8
+
9
+ ## Requirements
10
+
11
+ - Ruby `>= 2.7`
12
+ - [HexaPDF](https://hexapdf.gettalong.org/) `~> 1.0` (runtime dependency, installed automatically)
13
+
14
+ ## Installation
15
+
16
+ ```bash
17
+ gem install acroforge
18
+ ```
19
+
20
+ The `acroforge` command lands on your `PATH` automatically. RubyGems handles this the same way it does for `rails`, `bundle`, or any other Ruby CLI; tools like mise, rbenv, asdf, and rvm pick up the new binary through their shim layer without further configuration.
21
+
22
+ To use AcroForge as a library inside another Ruby project, add it to that project's `Gemfile`:
23
+
24
+ ```ruby
25
+ gem "acroforge"
26
+ ```
27
+
28
+ See the [Installation guide](https://lzcorp-solutions.github.io/acroforge/installation) for troubleshooting `PATH` issues on non-standard Ruby setups.
29
+
30
+ ## Quick start: the relabeling workflow
31
+
32
+ Given a PDF with garbage-named fields:
33
+
34
+ ```bash
35
+ # 1. Generate a starter schema (advisory; the heuristic's best guess at canonical keys)
36
+ $ acroforge schema infer broken_form.pdf --out schema.yml
37
+
38
+ # 2. Generate a draft mapping (per-field rename proposals, sorted by page/position)
39
+ $ acroforge relabel propose broken_form.pdf --schema schema.yml --out mapping.yml
40
+
41
+ # 3. Review mapping.yml in your editor: fix wrong proposals, fill in any blanks
42
+
43
+ # 4. Apply the mapping: permanently renames the AcroForm fields in place
44
+ $ acroforge relabel apply broken_form.pdf mapping.yml
45
+ ```
46
+
47
+ After step 4, the PDF's internal field names are semantic (`full_name`, `email`, `gender`, ...) and you can fill the form programmatically with confidence.
48
+
49
+ The shortcut for the first two steps:
50
+
51
+ ```bash
52
+ $ acroforge bootstrap broken_form.pdf
53
+ # writes schema.yml AND mapping.yml in one pass
54
+ ```
55
+
56
+ ## CLI
57
+
58
+ ```text
59
+ acroforge schema infer <pdf> [--out schema.yml] [--sections a,b,c]
60
+ acroforge relabel propose <pdf> [--out mapping.yml] [--schema schema.yml] [--merge|--overwrite]
61
+ acroforge relabel apply <pdf> <mapping.yml>
62
+ acroforge compile <pdf> [--schema schema.yml]
63
+ acroforge bootstrap <pdf> [--schema-out s.yml] [--mapping-out m.yml]
64
+ acroforge version
65
+ acroforge help
66
+ ```
67
+
68
+ | Subcommand | What it does |
69
+ | ----------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
70
+ | `schema infer` | Runs the heuristic on a PDF and writes a starter schema (canonical key → type + variations). Advisory; you review and edit. |
71
+ | `relabel propose` | Writes a YAML mapping file proposing a semantic name for every AcroForm field. Sorted by page → top-to-bottom → left-to-right. Default mode `--merge` preserves any `key`/`type` values you've already edited. |
72
+ | `relabel apply` | Reads a corrected mapping file and rewrites `field[:T]` / `field[:TU]` in the source PDF in place. Auto-disambiguates collisions (`full_name`, `full_name_1`, ...). |
73
+ | `compile` | Diagnostic: runs the engine and prints mapped/unmapped counts. Useful for checking heuristic coverage without writing any files. |
74
+ | `bootstrap` | Convenience: `schema infer` + `relabel propose` in one call. |
75
+
76
+ Exit codes: `0` success, `1` user error (bad args, missing file), `2` validation error, `3` internal error.
77
+
78
+ ## Library API
79
+
80
+ ```ruby
81
+ require "acroforge"
82
+
83
+ # Compile a PDF and inspect what the heuristic found.
84
+ engine = AcroForge::Engine.new(
85
+ "form.pdf",
86
+ schema: AcroForge::Schema.load("schema.yml"), # or pass a Hash directly
87
+ overrides: {}, # optional per-PDF overrides
88
+ sections: ["Personal Details", "Loan Details"] # optional section headers for scoping
89
+ )
90
+ result = engine.compile!
91
+ # => { mapped: {...}, unmapped: [...], select_options: {...}, new_fields_detected: [...] }
92
+
93
+ # Fill a form with a payload.
94
+ engine.validate_payload!(full_name: "Alice", email: "alice@example.com")
95
+ engine.fill!({ full_name: "Alice", email: "alice@example.com" }, "filled.pdf")
96
+
97
+ # Generate a starter schema from a PDF.
98
+ schema = AcroForge::Schema.infer("form.pdf")
99
+ AcroForge::Schema.dump(schema, "schema.yml")
100
+
101
+ # Run the relabeler programmatically.
102
+ AcroForge::Relabeler.propose("form.pdf", out: "mapping.yml", schema: schema)
103
+ AcroForge::Relabeler.apply!("form.pdf", "mapping.yml")
104
+
105
+ # Validate individual values.
106
+ AcroForge::Validator.valid?("alice@example.com", :email) # => true
107
+ AcroForge::Validator.valid?("not a date", :date) # => false
108
+ ```
109
+
110
+ ### Errors
111
+
112
+ - `AcroForge::ValidationError`: raised by `Engine#validate_payload!` on type mismatch.
113
+ - `AcroForge::RelabelError`: raised by `Relabeler.apply!` on malformed mapping YAML, invalid key names, or missing AcroForm.
114
+
115
+ ## Schema format
116
+
117
+ Schemas are YAML or JSON files in the "rich form":
118
+
119
+ ```yaml
120
+ full_name:
121
+   type: string
122
+   variations:
123
+     - Full Name
124
+     - First Name
125
+     - Surname
126
+ gender:
127
+   type: select
128
+   variations:
129
+     - Gender
130
+     - Sex
131
+   options:
132
+     - male
133
+     - female
134
+ amount_requested:
135
+   type: money
136
+   variations:
137
+     - Amount Requested
138
+     - Loan Amount
139
+ ```
140
+
141
+ **Field keys** (`full_name`, `gender`, ...) become Ruby symbols. **`type`** is one of `string | select | boolean | money | date | email | number`. **`variations`** are the human-readable label strings to look for on the page. **`options`** are the allowed select values (for `select` and `boolean` types).
142
+
143
+ AcroForge also accepts a legacy "shorthand" form where the value is just an array of variations. `AcroForge::Schema.normalize` upgrades it to rich form on the way in:
144
+
145
+ ```ruby
146
+ {
147
+ full_name: ["Full Name", "First Name", "Surname"],
148
+ dob: ["Date of Birth", "DOB"]
149
+ }
150
+ ```
151
+
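The shorthand-to-rich upgrade described above can be sketched in a few lines. This is not the gem's `AcroForge::Schema.normalize`: the helper name `normalize_schema_sketch` is hypothetical, and the default type of `"string"` for shorthand entries is an assumption made for the example.

```ruby
# Hypothetical stand-in for AcroForge::Schema.normalize; the real method may
# behave differently. Assumes shorthand entries default to type "string".
def normalize_schema_sketch(schema)
  schema.each_with_object({}) do |(key, value), out|
    out[key.to_sym] =
      if value.is_a?(Array)
        # Shorthand form: the value is just the list of label variations.
        { type: "string", variations: value.map(&:to_s) }
      else
        # Rich form: symbolize the inner keys and leave the content alone.
        value.each_with_object({}) { |(k, v), h| h[k.to_sym] = v }
      end
  end
end

shorthand = {
  full_name: ["Full Name", "First Name", "Surname"],
  "gender" => { "type" => "select", "variations" => ["Gender", "Sex"], "options" => %w[male female] }
}

normalized = normalize_schema_sketch(shorthand)
```

Here `normalized[:full_name]` ends up with `type` and `variations` entries, while the already-rich `gender` entry passes through with its keys symbolized.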
152
+ ## Mapping file format
153
+
154
+ `relabel propose` writes one of these. Edit the `key:` and `type:` values; the `meta:` blocks are advisory and get regenerated on the next `propose`.
155
+
156
+ ```yaml
157
+ _meta:
158
+   source_pdf: broken_form.pdf
159
+   generated_at: 2026-05-26T14:32:11Z
160
+   acroforge_version: 0.1.0
161
+   total_fields: 98
162
+
163
+ page0_field6:
164
+   key: full_name
165
+   type: string
166
+   meta:
167
+     raw_label: Full Name
168
+     confidence: high
169
+     section: personal_details
170
+     page: 0
171
+
172
+ page0_field28:
173
+   key: full_name # collision: apply! renames this one to full_name_1
174
+   type: string
175
+   meta:
176
+     raw_label: Customer Name
177
+     confidence: medium
178
+     section: personal_details
179
+     page: 0
180
+
181
+ page0_field99:
182
+   key: ~ # null = skip this field, leave its name unchanged
183
+   type: ~
184
+   meta:
185
+     raw_label: ~
186
+     confidence: none
187
+     section: ~
188
+     page: 3
189
+ ```
190
+
191
+ `key` must match `/\A[a-z][a-z0-9_]*\z/`. Invalid keys cause `apply!` to raise `RelabelError` _before_ writing anything to the PDF.
192
+
193
+ ## How the heuristic works
194
+
195
+ For each AcroForm field, AcroForge:
196
+
197
+ 1. Reads every text chunk on the page along with its bounding box.
198
+ 2. Scores nearby text against the field's widget rectangle using a mode-aware weighted heuristic (Grid-Lock, Inline Paragraph, or Standard Label depending on layout).
199
+ 3. Picks the best-scoring label and sanitises it into a snake_case key.
200
+ 4. If a `schema` is supplied, canonicalises the key against its `variations` lists.
201
+ 5. For radio groups and checkboxes, also discovers the option export values from the widget appearance states.
202
+
203
+ You can inspect what it found via `engine.field_proposals` after `compile!`. That's the data structure the Relabeler consumes.
204
+
205
+ ## Development
206
+
207
+ ```bash
208
+ bundle install
209
+ bundle exec rspec # run the test suite
210
+ bundle exec standardrb # lint
211
+ ```
212
+
213
+ Synthetic test fixtures live in `spec/fixtures/`. To regenerate them, run `ruby spec/fixtures/build_fixtures.rb`.
214
+
215
+ ## License
216
+
217
+ MIT.
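The `relabel apply` rules stated in the README (validate every `key` first, skip null keys, then suffix collisions as `full_name`, `full_name_1`, ...) can be sketched without touching a PDF. The helper `plan_renames` below is hypothetical and not part of AcroForge; it only illustrates one way the validate-then-disambiguate order could work, and it raises `ArgumentError` where the gem documents `RelabelError`.

```ruby
KEY_PATTERN = /\A[a-z][a-z0-9_]*\z/ # the pattern quoted in the README

# Hypothetical helper: returns { original_field_name => final_name }.
# Raises before producing any plan if a key is invalid.
def plan_renames(mapping)
  entries = mapping.reject { |name, _| name.to_s.start_with?("_") } # drop _meta
  keyed = entries.reject { |_, entry| entry["key"].nil? }           # null = skip

  # Validate everything up front so nothing is half-applied.
  keyed.each do |name, entry|
    next if KEY_PATTERN.match?(entry["key"].to_s)
    raise ArgumentError, "invalid key #{entry["key"].inspect} for #{name}"
  end

  seen = Hash.new(0)
  keyed.each_with_object({}) do |(name, entry), plan|
    key = entry["key"]
    count = seen[key]
    seen[key] += 1
    # First occurrence keeps the bare key; later ones get _1, _2, ...
    plan[name] = count.zero? ? key : "#{key}_#{count}"
  end
end

mapping = {
  "_meta" => { "total_fields" => 3 },
  "page0_field6" => { "key" => "full_name", "type" => "string" },
  "page0_field28" => { "key" => "full_name", "type" => "string" },
  "page0_field99" => { "key" => nil, "type" => nil }
}

plan = plan_renames(mapping)
# plan == { "page0_field6" => "full_name", "page0_field28" => "full_name_1" }
```

A fuller version would also check each suffixed name against keys already in use (for example, a field explicitly mapped to `full_name_1`), which this sketch does not attempt.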
data/Rakefile ADDED
@@ -0,0 +1,10 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "bundler/gem_tasks"
4
+ require "rspec/core/rake_task"
5
+
6
+ RSpec::Core::RakeTask.new(:spec)
7
+
8
+ require "standard/rake"
9
+
10
+ task default: %i[spec standard]
data/acroforge.gemspec ADDED
@@ -0,0 +1,37 @@
1
+ # frozen_string_literal: true
2
+
3
+ require_relative "lib/acroforge/version"
4
+
5
+ Gem::Specification.new do |spec|
6
+ spec.name = "acroforge"
7
+ spec.version = AcroForge::VERSION
8
+ spec.authors = ["Maxwell Nana Forson"]
9
+ spec.email = ["nanaforson@gmail.com"]
10
+
11
+ spec.summary = "PDF AcroForm engine with heuristic-assisted field relabeling."
12
+ spec.description = "Compile, fill, and relabel garbage-named AcroForm PDFs through a spatial heuristic and a human-reviewed mapping file."
13
+ spec.homepage = "https://github.com/Lzcorp-Solutions/acroforge"
14
+ spec.license = "MIT"
15
+ spec.required_ruby_version = ">= 2.7"
16
+
17
+ spec.metadata["homepage_uri"] = spec.homepage
18
+ spec.metadata["source_code_uri"] = "#{spec.homepage}/tree/main"
19
+ spec.metadata["bug_tracker_uri"] = "#{spec.homepage}/issues"
20
+ spec.metadata["changelog_uri"] = "#{spec.homepage}/blob/main/CHANGELOG.md"
21
+ spec.metadata["documentation_uri"] = "https://lzcorp-solutions.github.io/acroforge/"
22
+ spec.metadata["rubygems_mfa_required"] = "true"
23
+
24
+ spec.files = Dir.chdir(__dir__) do
25
+ `git ls-files -z`.split("\x0").reject do |f|
26
+ (File.expand_path(f) == __FILE__) ||
27
+ f.start_with?(*%w[bin/ test/ spec/ features/ docs/ .git .github appveyor Gemfile]) ||
28
+ %w[.rspec .standard.yml .rubocop.yml Gemfile.lock].include?(f)
29
+ end
30
+ end
31
+
32
+ spec.bindir = "exe"
33
+ spec.executables = ["acroforge"]
34
+ spec.require_paths = ["lib"]
35
+
36
+ spec.add_dependency "hexapdf", "~> 1.0"
37
+ end
data/exe/acroforge ADDED
@@ -0,0 +1,5 @@
1
+ #!/usr/bin/env ruby
2
+ # frozen_string_literal: true
3
+
4
+ require "acroforge/cli"
5
+ exit AcroForge::CLI.run(ARGV)
@@ -0,0 +1,126 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "hexapdf"
4
+
5
+ module AcroForge
6
+ class AllTextProcessor < HexaPDF::Content::Processor
7
+ def initialize
8
+ super
9
+ @raw_chunks = []
10
+ end
11
+
12
+ def show_text(str)
13
+ process_text(str)
14
+ end
15
+
16
+ def show_text_with_positioning(arr)
17
+ process_text(arr)
18
+ end
19
+
20
+ def text_chunks
21
+ merged = merge_fragments(@raw_chunks)
22
+ # merge_fragments joins adjacent chunks with a literal " ", which can
23
+ # produce strings like "M o d e O f R e p a y m e n t" when the PDF
24
+ # rendered each glyph as a separate text object. Re-run normalization
25
+ # on the merged result so the spaced-letter collapse and other fragment
26
+ # fixes get a second chance to fire on the joined text.
27
+ merged.map { |c| c.merge(text: normalize_extracted_text(c[:text])) }
28
+ end
29
+
30
+ private
31
+
32
+ def process_text(data)
33
+ text_pos = decode_text_with_positioning(data)
34
+
35
+ str = normalize_extracted_text(text_pos.string)
36
+ return if str.empty?
37
+
38
+ @raw_chunks << {
39
+ text: str,
40
+ x_min: text_pos.lower_left[0],
41
+ y_min: text_pos.lower_left[1],
42
+ x_max: text_pos.upper_right[0],
43
+ y_max: text_pos.upper_right[1]
44
+ }
45
+ rescue HexaPDF::Error
46
+ nil
47
+ end
48
+
49
+ def normalize_extracted_text(raw_text)
50
+ str = raw_text.to_s
51
+ # Unicode NFKC normalization handles ligatures (ﬁ -> fi), fullwidth
52
+ # letters, superscripts, and the rest of the Unicode "compatibility"
53
+ # subset in one pass. We guard against invalid encoding because PDFs
54
+ # occasionally smuggle raw bytes through.
55
+ str = str.unicode_normalize(:nfkc) if str.valid_encoding?
56
+ # NBSP (U+00A0) to regular space; NFKC already maps it, so this is a safety net.
57
+ str = str.tr("\u00A0", " ")
58
+ # Curly quotes, en/em dashes, zero-width chars: NFKC doesn't touch these
59
+ # (no compatibility mapping). The ellipsis U+2026 already becomes "..." under NFKC.
60
+ AcroForge::Constants::UNICODE_REPLACEMENTS.each do |from, to|
61
+ str = str.gsub(from, to)
62
+ end
63
+ str = str.gsub(/\s+/, " ").strip
64
+
65
+ # Fix split apostrophes like "Customer ' s".
66
+ str = str.gsub(/\s+['’]\s+/, "'")
67
+
68
+ # Collapse only clear spaced-letter sequences like "C u s t o m e r".
69
+ str = str.gsub(/\b(?:\p{L}\s+){2,}\p{L}\b/) { |m| m.gsub(/\s+/, "") }
70
+
71
+ # Collapsing strips ALL whitespace, so a sequence like
72
+ # "L o a n T e n o r A p p r o v e d" becomes "LoanTenorApproved".
73
+ # Recover the word breaks from the surviving capital letters.
74
+ str = str.gsub(/([a-z])([A-Z])/, '\1 \2')
75
+
76
+ # PDFs often render punctuation as separate text objects too, so we end
77
+ # up with "( ForDisbursement )" or "No ." once everything else collapses.
78
+ # Tighten the spacing around those.
79
+ str = str.gsub(/\(\s+/, "(").gsub(/\s+\)/, ")")
80
+ str = str.gsub(/(\w)\s+([.,;:!?])/, '\1\2')
81
+
82
+ # Merge split fragments that commonly appear in broken PDF extraction,
83
+ # e.g. "c ertify", "h as", "th at", "o ther".
84
+ loop do
85
+ previous = str
86
+
87
+ # One-letter consonant + token: "c ertify" => "certify", "h as" => "has".
88
+ str = str.gsub(/\b([bcdfghjklmnpqrstvwxyz])\s+([a-z]{2,})\b/i, '\\1\\2')
89
+
90
+ # Common 2-letter heads: "th at" => "that", "ot her" => "other".
91
+ str = str.gsub(/\b(th|wh|ot|pr|tr|cl|gr|fr|br|dr|st|sp|ch)\s+([a-z]{2,})\b/i, '\\1\\2')
92
+
93
+ break if str == previous
94
+ end
95
+
96
+ str.gsub(/\s+/, " ").strip
97
+ end
98
+
99
+ def merge_fragments(chunks)
100
+ sorted = chunks.sort_by { |c| [-c[:y_min], c[:x_min]] }
101
+ merged = []
102
+
103
+ sorted.each do |chunk|
104
+ if merged.empty?
105
+ merged << chunk
106
+ else
107
+ last = merged.last
108
+
109
+ if (last[:y_min] - chunk[:y_min]).abs < 8 &&
110
+ (chunk[:x_min] - last[:x_max]) < 20 &&
111
+ (chunk[:x_min] - last[:x_max]) > -5 &&
112
+ !last[:text].strip.end_with?(":")
113
+
114
+ last[:text] += " " + chunk[:text]
115
+ last[:x_max] = chunk[:x_max]
116
+ last[:y_min] = [last[:y_min], chunk[:y_min]].min
117
+ last[:y_max] = [last[:y_max], chunk[:y_max]].max
118
+ else
119
+ merged << chunk
120
+ end
121
+ end
122
+ end
123
+ merged
124
+ end
125
+ end
126
+ end
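To make the spaced-letter handling in `normalize_extracted_text` easier to follow, the sketch below replays just two of its regex steps in isolation. The method name `collapse_spaced_letters` is made up for this illustration; the two patterns are copied from the file above.

```ruby
# Standalone replay of two steps from normalize_extracted_text.
# collapse_spaced_letters is a hypothetical name used only in this sketch.
def collapse_spaced_letters(text)
  # Step 1: join runs of three or more single letters separated by whitespace.
  joined = text.gsub(/\b(?:\p{L}\s+){2,}\p{L}\b/) { |m| m.gsub(/\s+/, "") }
  # Step 2: re-insert a space at each lowercase-to-uppercase boundary.
  joined.gsub(/([a-z])([A-Z])/, '\1 \2')
end

collapse_spaced_letters("L o a n T e n o r") # => "Loan Tenor"
collapse_spaced_letters("Loan Tenor")        # => "Loan Tenor"
collapse_spaced_letters("c u s t o m e r")   # => "customer"
collapse_spaced_letters("McDonald")          # => "Mc Donald"
```

The last call shows a side effect worth knowing about: step 2 runs over the whole string, so text that was camel-cased to begin with also gains a space. Word breaks in all-lowercase or all-uppercase spaced runs cannot be recovered this way, since there is no case boundary left to split on.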
@@ -0,0 +1,137 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "yaml"
4
+ require "hexapdf"
5
+
6
+ module AcroForge
7
+ # Overlays a labeled badge on every AcroForm field in a PDF so a human
8
+ # can correlate cryptic field names (page0_field6) to what's visible on
9
+ # the page. The badge optionally shows the proposed semantic key from a
10
+ # mapping file, with colour-coding for mapped vs. unmapped entries.
11
+ module Annotator
12
+ module_function
13
+
14
+ MAPPED_COLOR = "1f7a3a" # green: in mapping with a key set
15
+ UNMAPPED_COLOR = "c2410c" # amber: in mapping with key: nil
16
+ MISSING_COLOR = "6b7280" # gray: not in mapping at all
17
+ BARE_COLOR = "1f3a8a" # blue: no mapping supplied at all
18
+
19
+ def annotate(pdf_path, out:, mapping: nil)
20
+ entries = mapping ? load_mapping(mapping) : nil
21
+
22
+ doc = HexaPDF::Document.open(pdf_path)
23
+ form = doc.acro_form(create: false)
24
+ raise RelabelError, "PDF has no AcroForm: #{pdf_path}" unless form
25
+
26
+ annotated = 0
27
+ mapped_count = 0
28
+ unmapped_count = 0
29
+ missing_count = 0
30
+
31
+ form.each_field do |field|
32
+ field.each_widget do |widget|
33
+ next unless widget[:Rect]
34
+ page = find_page_for_widget(doc, widget)
35
+ next unless page
36
+
37
+ original_name = field.full_field_name
38
+ entry = entries&.[](original_name)
39
+
40
+ color, label = if entries.nil?
41
+ [BARE_COLOR, original_name]
42
+ elsif entry.nil?
43
+ missing_count += 1
44
+ [MISSING_COLOR, "#{original_name} (not in mapping)"]
45
+ elsif entry["key"].nil? || entry["key"].to_s.empty?
46
+ unmapped_count += 1
47
+ [UNMAPPED_COLOR, "#{original_name} (no key)"]
48
+ else
49
+ mapped_count += 1
50
+ [MAPPED_COLOR, "#{original_name} -> #{entry["key"]}"]
51
+ end
52
+
53
+ draw_badge(page, widget[:Rect], label, color)
54
+ annotated += 1
55
+ end
56
+ end
57
+
58
+ doc.write(out)
59
+
60
+ {
61
+ annotated: annotated,
62
+ mapped: mapped_count,
63
+ unmapped: unmapped_count,
64
+ missing: missing_count,
65
+ out_path: out
66
+ }
67
+ end
68
+
69
+ def find_page_for_widget(doc, widget)
70
+ doc.pages.find { |page| page[:Annots]&.include?(widget) }
71
+ end
72
+
73
+ def load_mapping(arg)
74
+ return arg if arg.is_a?(Hash)
75
+ data = YAML.load_file(arg) || {}
76
+ data.reject { |k, _| k.to_s.start_with?("_") }
77
+ end
78
+
79
+ # Draw a coloured badge near the field rectangle showing the label text.
80
+ #
81
+ # Placement heuristic:
82
+ # - Text input (wide rectangle with enough height): badge goes INSIDE
83
+ # the field's empty input area. This is "free space" before the form
84
+ # is filled and never collides with the form's own labels or
85
+ # neighboring fields.
86
+ # - Small field (checkbox / radio): too small to host a badge inside,
87
+ # so the badge sits ABOVE the field. The form's label for these is
88
+ # typically to the right of the box, so above doesn't collide.
89
+ def draw_badge(page, rect, label, color_hex)
90
+ x1, y1, x2, y2 = rect.to_a
91
+ width = x2 - x1
92
+ height = y2 - y1
93
+
94
+ canvas = page.canvas(type: :overlay)
95
+
96
+ # Outline around the field so each badge is tied visually to its field,
97
+ # even when the badge sits inside an empty input.
98
+ canvas.save_graphics_state do
99
+ canvas.stroke_color(color_hex)
100
+ canvas.line_width(0.75)
101
+ canvas.rectangle(x1 - 1, y1 - 1, width + 2, height + 2).stroke
102
+ end
103
+
104
+ page_box = page.box(:media)
105
+ badge_h = 10
106
+ char_width = 3.6
107
+ max_width = page_box.width - x1 - 4
108
+ badge_w = [(label.length * char_width) + 6, max_width].min
109
+
110
+ fits_inside = width > height * 1.5 && height >= badge_h + 2
111
+
112
+ badge_x, badge_y = if fits_inside
113
+ # Inside the field, top-aligned, clipped to the field width
114
+ inside_w = [badge_w, width - 2].min
115
+ [x1 + 1, y2 - badge_h - 1].tap { |_| badge_w = inside_w }
116
+ else
117
+ # Above the small field; fall back to below it if the badge would cross the page top
118
+ above_y = y2 + 1
119
+ below_y = y1 - badge_h - 1
120
+ candidate = (above_y + badge_h <= page_box.top) ? above_y : below_y
121
+ [x1, candidate.clamp(0, page_box.top - badge_h)]
122
+ end
123
+
124
+ canvas.save_graphics_state do
125
+ canvas.fill_color(color_hex)
126
+ canvas.opacity(fill_alpha: fits_inside ? 0.85 : 0.9)
127
+ canvas.rectangle(badge_x, badge_y, badge_w, badge_h).fill
128
+ end
129
+
130
+ canvas.save_graphics_state do
131
+ canvas.fill_color("ffffff")
132
+ canvas.font("Helvetica", size: 6.5)
133
+ canvas.text(label, at: [badge_x + 3, badge_y + 2.5])
134
+ end
135
+ end
136
+ end
137
+ end
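The placement rule in `draw_badge` is easier to check when separated from the drawing calls. The sketch below reproduces only the geometry as a pure function, with no HexaPDF dependency; `badge_origin` and its keyword names are invented for this example, and `page_top` stands in for `page_box.top`.

```ruby
# Pure-Ruby replay of the badge placement rule; not part of the gem's API.
# rect is [x1, y1, x2, y2] in PDF points, as in widget[:Rect].
def badge_origin(rect, page_top:, badge_h: 10)
  x1, y1, x2, y2 = rect
  width = x2 - x1
  height = y2 - y1

  if width > height * 1.5 && height >= badge_h + 2
    # Wide, tall-enough field: badge sits inside, against the top edge.
    [:inside, x1 + 1, y2 - badge_h - 1]
  else
    above_y = y2 + 1
    below_y = y1 - badge_h - 1
    # Prefer above; switch to below when the badge would cross the page top.
    y = (above_y + badge_h <= page_top) ? above_y : below_y
    [:outside, x1, y.clamp(0, page_top - badge_h)]
  end
end

badge_origin([50, 700, 250, 720], page_top: 792) # => [:inside, 51, 709]
badge_origin([50, 600, 60, 610], page_top: 792)  # => [:outside, 50, 611]
badge_origin([50, 780, 60, 790], page_top: 792)  # => [:outside, 50, 769]
```

The three calls cover a text input, a checkbox with room above it, and a checkbox close enough to the page top that the badge drops below the field instead.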