pure_jpeg 0.1.0 → 0.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: e09848a734582d7635ff6a8c3d85b166da4f2c5b496d5c4f4e30105af4834e33
- data.tar.gz: 052b8e0a21d58eb9aa9e169e06b7cfbde44f3610775da6fed86e05875dceaea3
+ metadata.gz: 6eacb8a616f95a52625f6f5acb3c8c137306c5dcf3636e93e2e715287b429655
+ data.tar.gz: dade5c2d3b9603bb7089635977ea2e38827da1a3912c820152367f3ca643c5f9
  SHA512:
- metadata.gz: 750ea3d65bf2ae6c272998b2ef7b814954eb1b576f6df83036d931543f30b99481281e8a12ef63fc388fdc60db0c54815fc8219b66782415844b3ed8721a5975
- data.tar.gz: 16475c5174b009a7a45eec5ee288799ea7965e4e10ed02ab20d55b15de5ebbdf92e9c0196d58053b6539dfea4601d7e22016c636ee471d2e73300feb86d97f07
+ metadata.gz: ccf7a06b88c08f14ca70d944ddcc753795217e3bfbca00874484ee6f3c2a360d8cede768c4469f5b92d2789bf7bcc71d61b22f67d76fb98774af28f88132c248
+ data.tar.gz: d868a5d4f7db3b20a504bc09f9908169fd35c269b511e619e083957fd6b4bf90a3cf22985a5e6d81289a02306ba75677b94b8ff9f6d741acc7b2047979d0f2c6
data/CHANGELOG.md CHANGED
@@ -1,5 +1,31 @@
  # Changelog

+ ## 0.2.0
+
+ New features:
+
+ - Progressive JPEG decoding (SOF2) with spectral selection and successive approximation
+ - `Image#each_rgb` for iterating pixels without per-pixel struct allocation
+ - `PureJPEG::DecodeError` exception class for all decoding errors
+ - Validation of custom quantization tables (length and value range)
+
+ Performance:
+
+ - Packed integer pixel storage in `Image` eliminates per-pixel object allocation on decode (~6x faster decode)
+ - Fast path for encoder pixel extraction from packed sources (`ChunkyPNGSource`, `Image`)
+ - `BitReader#read_bits` fast path when buffer already has enough bits
+ - `BitWriter` builds a `String` directly instead of `Array` + `pack`
+ - `Huffman.build_table` returns an `Array` for O(1) lookup instead of `Hash`
+ - Faster scan data extraction using `String#index`
+
+ Fixes:
+
+ - JPEG data detection uses SOI marker check instead of null-byte heuristic
+ - `RawSource` pixels default to black instead of `nil`
+ - `BitReader` bounds check for truncated 0xFF sequences
+ - `JFIFReader` bounds check when reading past end of data
+ - Fixed dead tautological check in AC encoding EOB logic
+
  ## 0.1.0

  Initial release.
data/LICENSE CHANGED
@@ -1,6 +1,6 @@
  MIT License

- Copyright (c) 2025 Peter Cooper
+ Copyright (c) 2026 Peter Cooper

  Permission is hereby granted, free of charge, to any person obtaining a copy
  of this software and associated documentation files (the "Software"), to deal
data/README.md CHANGED
@@ -1,6 +1,15 @@
1
- # PureJPEG
1
+ <p align="center">
2
+ <img src="purejpeg.jpg" width="480" alt="PureJPEG">
3
+ </p>
2
4
 
3
- Pure Ruby JPEG encoder and decoder. Implements baseline JPEG (DCT, Huffman, 4:2:0 chroma subsampling) and exposes a variety of encoding options to adjust parts of the JPEG pipeline not normally available (I needed this to recreate the JPEG compression styles of older digital cameras - don't ask..)
5
+ # PureJPEG - Pure Ruby JPEG encoder and decoder library
6
+
7
+ Convert PNG or other pixel data to JPEG. Or the other way! Implements baseline JPEG encoding (DCT, Huffman, 4:2:0 chroma subsampling) and decodes both baseline and progressive JPEGs. Exposes a variety of encoding options to adjust parts of the JPEG pipeline not normally available (I needed this to recreate the JPEG compression styles of older digital cameras - don't ask..)
8
+
9
+ It works on CRuby 3.0+, TruffleRuby 33.0, and JRuby 10.0.
10
+
11
+ > [!NOTE]
12
+ > Rubyists might find the [AI Disclosure](#ai-disclosure) section below of interest.
4
13
 
5
14
  ## Installation
6
15
 
@@ -14,7 +23,7 @@ gem "pure_jpeg"
14
23
  gem install pure_jpeg
15
24
  ```
16
25
 
17
- There are no runtime dependencies. [ChunkyPNG](https://github.com/wvanbergen/chunky_png) is optional and for if you want to use `from_chunky_png`. I have a pure PNG encoder/decoder not far behind this that will ultimately plug in nicely too to get pure Ruby graphical bliss ;-)
26
+ There are no runtime dependencies. [ChunkyPNG](https://github.com/wvanbergen/chunky_png) is optional (though quite useful) if you want to use `from_chunky_png`. I have a pure PNG encoder/decoder not far behind this that will ultimately plug in nicely too to get 100% pure Ruby graphical bliss ;-)
18
27
 
19
28
  `examples/` contains some useful example scripts for basic JPEG to PNG and PNG to JPEG conversion if you want to do some quick tests without writing code.
20
29
 
@@ -76,7 +85,35 @@ PureJPEG.encode(source,
76
85
 
77
86
  See [CREATIVE.md](CREATIVE.md) for detailed examples of the creative encoding options.
78
87
 
79
- Each stage of the JPEG pipeline is a separate module, so individual components (DCT, quantization, Huffman coding) can be replaced or extended independently which is kinda my plan here as I made this to play around with effects.
88
+ Here's a quick example of sort of the "old digital camera" effect I was looking for though:
89
+
90
+ <table>
91
+ <tr>
92
+ <td align="center"><strong>Normal</strong></td>
93
+ <td align="center"><strong>Scrambled quantization</strong></td>
94
+ </tr>
95
+ <tr>
96
+ <td><img src="examples/peppers.jpg" width="360"></td>
97
+ <td><img src="examples/peppers-funky.jpg" width="360"></td>
98
+ </tr>
99
+ </table>
100
+
101
+ And here's what happens when you convert a PNG with transparency — JPEG doesn't support alpha, so the hidden RGB data behind transparent pixels bleeds through:
102
+
103
+ <table>
104
+ <tr>
105
+ <td align="center"><strong>PNG with transparency</strong></td>
106
+ <td align="center"><strong>Converted to JPEG</strong></td>
107
+ </tr>
108
+ <tr>
109
+ <td><img src="examples/dice.png" width="360"></td>
110
+ <td><img src="examples/dice.jpg" width="360"></td>
111
+ </tr>
112
+ </table>
113
+
114
+ I consider this a feature but you may consider it a deficiency and that a default background of white should be applied. This may be something I'll add if anyone wants it!
115
+
116
+ Note that each stage of the JPEG pipeline is a separate module, so individual components (DCT, quantization, Huffman coding) can be replaced or extended independently which is kinda my plan here as I made this to play around with effects.
80
117
 
81
118
  ## Decoding (reading JPEGs!)
82
119
 
@@ -137,24 +174,24 @@ Encoding:
137
174
  - Standard Huffman tables (Annex K)
138
175
 
139
176
  Decoding:
140
- - Baseline DCT (SOF0)
177
+ - Baseline DCT (SOF0) and Progressive DCT (SOF2)
141
178
  - 8-bit precision
142
179
  - 1-component (grayscale) and 3-component (YCbCr) images
143
180
  - Any chroma subsampling factor (4:4:4, 4:2:2, 4:2:0, etc.)
144
181
  - Restart markers (DRI/RST)
145
182
 
146
- Not supported: progressive JPEG (SOF2), arithmetic coding, 12-bit precision, multi-scan, EXIF/ICC profile preservation. Largely because I don't need these, but they are all do-able, especially with how loosely coupled this library is internally. Raise an issue if you really care about them!
183
+ Not supported: arithmetic coding, 12-bit precision, EXIF/ICC profile preservation, adding a default background for transparent sources (see what happens above!). Largely because I don't need these, but they are all do-able, especially with how loosely coupled this library is internally. Raise an issue if you really care about them!
147
184
 
148
185
  ## Performance
149
186
 
150
- On a 1024x1024 image (Ruby 3.4 on my M1 Max):
187
+ On a 1024x1024 image (Ruby 4.0.1 on my M1 Max):
151
188
 
152
189
  | Operation | Time |
153
190
  |-----------|------|
154
- | Encode (color, q85) | ~2.8s |
155
- | Decode (color) | ~12s |
191
+ | Encode (color, q85) | ~1.7s |
192
+ | Decode (color) | ~1.8s |
156
193
 
157
- The encoder uses a separable DCT with a precomputed cosine matrix and reuses all per-block buffers to minimize GC pressure (more on the optimizations below).
194
+ Both the encoder and decoder use a separable DCT with a precomputed cosine matrix and reuse all per-block buffers to minimize GC pressure. Pixel data is stored as packed integers internally to avoid per-pixel object allocation.
158
195
 
159
196
  ## Some useful `rake` tasks
160
197
 
@@ -167,13 +204,19 @@ rake profile # CPU profile with StackProf (requires the stackprof gem)
167
204
 
168
205
  ## AI Disclosure
169
206
 
170
- Claude Code did the majority of the work. However, it did require a lot of guidance as it was quite naive in its approach at first with its JPEG outputs looking very akin to those of my Kodak digital camera from 2001! It turns out it got something wrong which, amusingly, it seems devices of that era also got wrong (specifically not using the zigzag approach during quantization).
207
+ **Claude Code did the majority of the work.** The math of JPEG encoding/decoding is beyond me, except 'getting it' at a high level. I understand it like I understand the engine in my car :-)
208
+
209
+ **I have read all of the code produced.** The algorithms are above my paygrade, but I'm OK with what has been produced, and I manually fixed a variety of stylistic things along the way. For example, CC seems to like wrapping entire functions in `if` statements rather than bailing on the opposite condition.
210
+
211
+ **CC needed a lot of guidance.** Its initial JPEG algorithm was somewhat naive and output odd-looking JPEGs akin to those of my Kodak digital camera from 2001. After some back and forth and image comparisons, we figured out it was doing the quantization entirely wrong (specifically not using the zigzag approach during quantization but just going in raster order). I *like* this aesthetic, but fixed it up so that it works as a generally usable JPEG library, while adding ways to customize things so you can recreate the effect, if preferred (see `CREATIVE.md` for more on that).
212
+
213
+ **CC is lazy.** The initial implementation was VERY SLOW. It took 15 seconds to turn a 1024x1024 PNG into a JPEG, so we went down the profiling rabbit hole and found many optimizations to make it ~6x faster. CC is poor at considering the role of Ruby's GC when implementing low level algorithms and needs some prodding to make the correct optimizations. CC is also lazy to the point of recommending that you just use another language (e.g. Go or Rust) rather than do a pure Ruby version of something - despite it being possible with some extra work.
171
214
 
172
- The initial implementation was also VERY SLOW. It took about 15 seconds just to turn a 1024x1024 PNG into a JPEG, so some profiling was necessary which ended up finding a lot of possible optimizations to make it about 6x faster.
215
+ **CC's testing and cleanliness leaves a bit to be desired.** The CC-created tests were superficial, so I worked on getting them beefed up to tackle a variety of edge cases. They could still get better. It also didn't do RDoc comments, use Minitest, and a variety of other things I coerced it into working on. A good `CLAUDE.md` file could probably avoid many of these problems. I worked without one.
173
216
 
174
- The tests were also a bit superficial, so I worked on getting them beefed up to tackle a variety of edge cases, although they could still be better. It also didn't do RDoc comments, use Minitest, and a variety of other things I had to coerce it into finishing.
217
+ **The overall experience was good.** I enjoyed this project, but CC clearly requires an experienced developer to keep it on the rails and to not end up with a bunch of buggy half-working crap. Getting to the basic 'turn a PNG into a JPEG' took only twenty minutes, but the rest of making it actually widely useful took several hours more.
175
218
 
176
- I have read all of the code produced. A lot of the internals are above my paygrade but I'm generally OK with what has been produced and fixed a variety of stylistic things along the way.
219
+ **The final 10% still takes 90% of the time.** As mentioned above, the first run was quick, but getting things right has taken much longer. v0.1->0.2 has taken longer than 0.1 did! But we now have progressive JPEG support, even more optimizations, better tests, etc. etc.
177
220
 
178
221
  ## License
179
222
 
@@ -18,6 +18,12 @@ module PureJPEG
18
18
 
19
19
  def read_bits(n)
20
20
  return 0 if n == 0
21
+ # Fast path: enough bits already in the buffer
22
+ if @bits_in_buffer >= n
23
+ @bits_in_buffer -= n
24
+ return (@buffer >> @bits_in_buffer) & ((1 << n) - 1)
25
+ end
26
+ # Slow path: need to refill
21
27
  value = 0
22
28
  n.times { value = (value << 1) | read_bit }
23
29
  value
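The fast path in this hunk can be tried in isolation. What follows is a minimal sketch, not PureJPEG's actual `BitReader`: `SketchBitReader` is a hypothetical stand-in that refills one byte at a time and ignores 0xFF byte stuffing and markers, which the real `fill_buffer` has to deal with. It only aims to show that the buffered branch returns the same bits the bit-by-bit loop would.

```ruby
# Hypothetical, simplified reader: one-byte buffer, no byte stuffing, no markers.
class SketchBitReader
  def initialize(data)
    @data = data.b
    @pos = 0
    @buffer = 0
    @bits_in_buffer = 0
  end

  # Read a single bit, refilling the one-byte buffer when it is empty.
  def read_bit
    if @bits_in_buffer == 0
      raise EOFError, "out of data" if @pos >= @data.bytesize
      @buffer = @data.getbyte(@pos)
      @pos += 1
      @bits_in_buffer = 8
    end
    @bits_in_buffer -= 1
    (@buffer >> @bits_in_buffer) & 1
  end

  def read_bits(n)
    return 0 if n == 0
    # Fast path: all n bits are already buffered, so shift and mask once.
    if @bits_in_buffer >= n
      @bits_in_buffer -= n
      return (@buffer >> @bits_in_buffer) & ((1 << n) - 1)
    end
    # Slow path: assemble the value bit by bit, refilling as needed.
    value = 0
    n.times { value = (value << 1) | read_bit }
    value
  end
end

reader = SketchBitReader.new("\xA5\x3C".b) # bits: 10100101 00111100
first  = reader.read_bits(3) # buffer empty, slow path reads 101
second = reader.read_bits(5) # 5 bits left in the buffer, fast path reads 00101
third  = reader.read_bits(4) # buffer empty again, slow path reads 0011
```

The fast path replaces up to `n` method calls with one shift and one mask, which matters because Huffman decoding calls `read_bits` for nearly every coefficient.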
@@ -43,10 +49,11 @@ module PureJPEG
43
49
  private
44
50
 
45
51
  def fill_buffer
46
- raise "Unexpected end of scan data" if @pos >= @length
52
+ raise PureJPEG::DecodeError, "Unexpected end of scan data" if @pos >= @length
47
53
  byte = @data.getbyte(@pos)
48
54
  @pos += 1
49
55
  if byte == 0xFF
56
+ raise PureJPEG::DecodeError, "Unexpected end of scan data" if @pos >= @length
50
57
  next_byte = @data.getbyte(@pos)
51
58
  @pos += 1
52
59
  # 0xFF 0x00 is a stuffed 0xFF byte
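The second bounds check guards the byte that follows a 0xFF. `String#getbyte` returns `nil` past the end of the string, so without the check a scan truncated right after 0xFF would carry on with a `nil` instead of failing cleanly. Below is a rough sketch of the same idea outside the class. `SketchDecodeError` and `unstuff` are hypothetical names, and treating every non-zero follower byte as an error is a simplification for this sketch, not what PureJPEG does with markers.

```ruby
# Hypothetical error class standing in for PureJPEG::DecodeError.
class SketchDecodeError < StandardError; end

# Remove JPEG byte stuffing (0xFF 0x00 -> 0xFF) from entropy-coded data.
# Assumption: marker handling is omitted, so any other byte after 0xFF raises.
def unstuff(data)
  data = data.b
  out = String.new(encoding: Encoding::BINARY)
  pos = 0
  while pos < data.bytesize
    byte = data.getbyte(pos)
    pos += 1
    if byte == 0xFF
      # Same guard as the added line: 0xFF must not be the final byte.
      raise SketchDecodeError, "Unexpected end of scan data" if pos >= data.bytesize
      follower = data.getbyte(pos)
      pos += 1
      raise SketchDecodeError, "Markers are out of scope for this sketch" unless follower == 0x00
    end
    out << byte.chr
  end
  out
end

clean = unstuff("\x12\xFF\x00\x34".b)

truncated_error =
  begin
    unstuff("\x12\xFF".b)
    nil
  rescue SketchDecodeError => e
    e.message
  end
```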
@@ -3,7 +3,7 @@
3
3
  module PureJPEG
4
4
  class BitWriter
5
5
  def initialize
6
- @data = []
6
+ @data = String.new(capacity: 4096, encoding: Encoding::BINARY)
7
7
  @buffer = 0
8
8
  @bits_in_buffer = 0
9
9
  end
@@ -17,8 +17,8 @@ module PureJPEG
17
17
  while @bits_in_buffer >= 8
18
18
  @bits_in_buffer -= 8
19
19
  byte = (@buffer >> @bits_in_buffer) & 0xFF
20
- @data << byte
21
- @data << 0x00 if byte == 0xFF # byte stuffing
20
+ @data << byte.chr
21
+ @data << "\x00".b if byte == 0xFF # byte stuffing
22
22
  end
23
23
 
24
24
  @buffer &= (1 << @bits_in_buffer) - 1
@@ -32,7 +32,7 @@ module PureJPEG
32
32
  end
33
33
 
34
34
  def bytes
35
- @data.pack("C*")
35
+ @data
36
36
  end
37
37
  end
38
38
  end
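To check that the `BitWriter` change is behavior-preserving, the two storage strategies can be compared side by side. This is a small standalone sketch (the byte values are made up for illustration) that applies the same byte-stuffing rule in both styles and compares the results.

```ruby
bytes = [0x12, 0xFF, 0x34]

# 0.1.0 style: collect Integers in an Array, then pack once at the end.
as_array = []
bytes.each do |byte|
  as_array << byte
  as_array << 0x00 if byte == 0xFF # byte stuffing
end
packed = as_array.pack("C*")

# 0.2.0 style: append to a preallocated binary String as we go.
direct = String.new(capacity: 16, encoding: Encoding::BINARY)
bytes.each do |byte|
  direct << byte.chr
  direct << "\x00".b if byte == 0xFF # byte stuffing
end

same_output = (packed == direct)
```

Both produce the bytes `12 FF 00 34`. The String version skips the intermediate Array of Integers and the final `pack` pass over it.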
@@ -13,7 +13,7 @@ module PureJPEG
13
13
  # @param path_or_data [String] a file path or raw JPEG bytes
14
14
  # @return [Image] decoded image with pixel access
15
15
  def self.decode(path_or_data)
16
- data = if path_or_data.is_a?(String) && !path_or_data.include?("\x00") && File.exist?(path_or_data)
16
+ data = if path_or_data.is_a?(String) && !path_or_data.start_with?("\xFF\xD8".b) && File.exist?(path_or_data)
17
17
  File.binread(path_or_data)
18
18
  else
19
19
  path_or_data.b
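The new check relies on every JPEG stream starting with the SOI marker, the two bytes `FF D8`. Here is a minimal sketch of that test on its own. `jpeg_bytes?` is a hypothetical helper, not part of PureJPEG, and unlike the line in the hunk it converts the input to binary before comparing, so the comparison is byte-wise whatever encoding the string carries.

```ruby
SOI_MARKER = "\xFF\xD8".b

# Hypothetical helper: true when the string begins with the JPEG SOI marker.
def jpeg_bytes?(input)
  input.b.start_with?(SOI_MARKER)
end

jpeg_like = "\xFF\xD8\xFF\xE0".b + "JFIF" # start of a typical JFIF file
path_like = "photos/holiday.jpg"          # made-up path for illustration

is_data = jpeg_bytes?(jpeg_like)
is_path = jpeg_bytes?(path_like)
```

A string is therefore treated as raw data whenever it starts with SOI, and only otherwise considered as a possible file path.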
@@ -27,6 +27,8 @@ module PureJPEG
27
27
 
28
28
  def decode
29
29
  jfif = JFIFReader.new(@data)
30
+ return decode_progressive(jfif) if jfif.progressive
31
+
30
32
  width = jfif.width
31
33
  height = jfif.height
32
34
 
@@ -127,6 +129,290 @@ module PureJPEG
127
129
 
128
130
  private
129
131
 
132
+ # --- Progressive JPEG decoding ---
133
+
134
+ def decode_progressive(jfif)
135
+ width = jfif.width
136
+ height = jfif.height
137
+
138
+ comp_info = {}
139
+ jfif.components.each { |c| comp_info[c.id] = c }
140
+
141
+ max_h = jfif.components.map(&:h_sampling).max
142
+ max_v = jfif.components.map(&:v_sampling).max
143
+
144
+ mcu_px_w = max_h * 8
145
+ mcu_px_h = max_v * 8
146
+ mcus_x = (width + mcu_px_w - 1) / mcu_px_w
147
+ mcus_y = (height + mcu_px_h - 1) / mcu_px_h
148
+
149
+ # Coefficient buffers per component (zigzag order, pre-dequantization)
150
+ coeffs = {}
151
+ comp_blocks = {}
152
+ jfif.components.each do |c|
153
+ bx = mcus_x * c.h_sampling
154
+ by = mcus_y * c.v_sampling
155
+ coeffs[c.id] = Array.new(bx * by * 64, 0)
156
+ comp_blocks[c.id] = [bx, by]
157
+ end
158
+
159
+ restart_interval = jfif.restart_interval
160
+
161
+ jfif.scans.each do |scan|
162
+ # Build Huffman tables from this scan's snapshot (tables change between scans)
163
+ dc_tables = {}
164
+ ac_tables = {}
165
+ scan.huffman_tables.each do |(table_class, table_id), info|
166
+ table = Huffman::DecodeTable.new(info[:bits], info[:values])
167
+ if table_class == 0
168
+ dc_tables[table_id] = table
169
+ else
170
+ ac_tables[table_id] = table
171
+ end
172
+ end
173
+
174
+ reader = BitReader.new(scan.data)
175
+ ss = scan.spectral_start
176
+ se = scan.spectral_end
177
+ ah = scan.successive_high
178
+ al = scan.successive_low
179
+
180
+ if scan.components.length == 1
181
+ prog_scan_non_interleaved(reader, scan, comp_info, dc_tables, ac_tables,
182
+ coeffs, comp_blocks, restart_interval, ss, se, ah, al)
183
+ else
184
+ prog_scan_interleaved(reader, scan, comp_info, dc_tables, ac_tables,
185
+ coeffs, comp_blocks, mcus_x, mcus_y, restart_interval, ss, se, ah, al)
186
+ end
187
+ end
188
+
189
+ # Reconstruct: unzigzag, dequantize, IDCT, write to channel buffers
190
+ padded_w = mcus_x * mcu_px_w
191
+ padded_h = mcus_y * mcu_px_h
192
+ channels = {}
193
+ jfif.components.each do |c|
194
+ ch_w = (padded_w * c.h_sampling) / max_h
195
+ ch_h = (padded_h * c.v_sampling) / max_v
196
+ channels[c.id] = { data: Array.new(ch_w * ch_h, 0), width: ch_w, height: ch_h }
197
+ end
198
+
199
+ zigzag = Array.new(64, 0)
200
+ raster = Array.new(64, 0.0)
201
+ dequant = Array.new(64, 0.0)
202
+ temp = Array.new(64, 0.0)
203
+ spatial = Array.new(64, 0.0)
204
+
205
+ jfif.components.each do |c|
206
+ qt = jfif.quant_tables[c.qt_id]
207
+ ch = channels[c.id]
208
+ coeff_buf = coeffs[c.id]
209
+ bx_count, by_count = comp_blocks[c.id]
210
+
211
+ by_count.times do |block_y|
212
+ bx_count.times do |block_x|
213
+ offset = (block_y * bx_count + block_x) * 64
214
+ 64.times { |i| zigzag[i] = coeff_buf[offset + i] }
215
+
216
+ Zigzag.unreorder!(zigzag, raster)
217
+ Quantization.dequantize!(raster, qt, dequant)
218
+ DCT.inverse!(dequant, temp, spatial)
219
+ write_block(spatial, ch[:data], ch[:width], block_x * 8, block_y * 8)
220
+ end
221
+ end
222
+ end
223
+
224
+ num_components = jfif.components.length
225
+ if num_components == 1
226
+ assemble_grayscale(width, height, channels, jfif.components[0])
227
+ else
228
+ assemble_color(width, height, channels, jfif.components, max_h, max_v)
229
+ end
230
+ end
231
+
232
+ def prog_scan_non_interleaved(reader, scan, comp_info, dc_tables, ac_tables,
233
+ coeffs, comp_blocks, restart_interval, ss, se, ah, al)
234
+ sc = scan.components[0]
235
+ comp = comp_info[sc.id]
236
+ dc_tab = dc_tables[sc.dc_table_id]
237
+ ac_tab = ac_tables[sc.ac_table_id]
238
+ coeff_buf = coeffs[comp.id]
239
+ bx_count, by_count = comp_blocks[comp.id]
240
+
241
+ prev_dc = 0
242
+ eobrun = 0
243
+ mcu_count = 0
244
+
245
+ by_count.times do |block_y|
246
+ bx_count.times do |block_x|
247
+ if restart_interval > 0 && mcu_count > 0 && (mcu_count % restart_interval) == 0
248
+ reader.reset
249
+ prev_dc = 0
250
+ eobrun = 0
251
+ end
252
+
253
+ offset = (block_y * bx_count + block_x) * 64
254
+
255
+ if ss == 0
256
+ if ah == 0
257
+ prev_dc = prog_dc_first(reader, dc_tab, prev_dc, coeff_buf, offset, al)
258
+ else
259
+ prog_dc_refine(reader, coeff_buf, offset, al)
260
+ end
261
+ else
262
+ if ah == 0
263
+ eobrun = prog_ac_first(reader, ac_tab, coeff_buf, offset, ss, se, al, eobrun)
264
+ else
265
+ eobrun = prog_ac_refine(reader, ac_tab, coeff_buf, offset, ss, se, al, eobrun)
266
+ end
267
+ end
268
+
269
+ mcu_count += 1
270
+ end
271
+ end
272
+ end
273
+
274
+ def prog_scan_interleaved(reader, scan, comp_info, dc_tables, ac_tables,
275
+ coeffs, comp_blocks, mcus_x, mcus_y, restart_interval, ss, se, ah, al)
276
+ prev_dc = Hash.new(0)
277
+ mcu_count = 0
278
+
279
+ mcus_y.times do |mcu_row|
280
+ mcus_x.times do |mcu_col|
281
+ if restart_interval > 0 && mcu_count > 0 && (mcu_count % restart_interval) == 0
282
+ reader.reset
283
+ prev_dc.clear
284
+ end
285
+
286
+ scan.components.each do |sc|
287
+ comp = comp_info[sc.id]
288
+ dc_tab = dc_tables[sc.dc_table_id]
289
+ coeff_buf = coeffs[comp.id]
290
+ bx_count = comp_blocks[comp.id][0]
291
+
292
+ comp.v_sampling.times do |bv|
293
+ comp.h_sampling.times do |bh|
294
+ block_x = mcu_col * comp.h_sampling + bh
295
+ block_y = mcu_row * comp.v_sampling + bv
296
+ offset = (block_y * bx_count + block_x) * 64
297
+
298
+ if ah == 0
299
+ prev_dc[sc.id] = prog_dc_first(reader, dc_tab, prev_dc[sc.id], coeff_buf, offset, al)
300
+ else
301
+ prog_dc_refine(reader, coeff_buf, offset, al)
302
+ end
303
+ end
304
+ end
305
+ end
306
+
307
+ mcu_count += 1
308
+ end
309
+ end
310
+ end
311
+
312
+ def prog_dc_first(reader, dc_tab, prev_dc, coeff_buf, offset, al)
313
+ cat = dc_tab.decode(reader)
314
+ diff = reader.receive_extend(cat)
315
+ dc_val = prev_dc + diff
316
+ coeff_buf[offset] = dc_val << al
317
+ dc_val
318
+ end
319
+
320
+ def prog_dc_refine(reader, coeff_buf, offset, al)
321
+ coeff_buf[offset] |= (reader.read_bit << al)
322
+ end
323
+
324
+ def prog_ac_first(reader, ac_tab, coeff_buf, offset, ss, se, al, eobrun)
325
+ return eobrun - 1 if eobrun > 0
326
+
327
+ k = ss
328
+ while k <= se
329
+ symbol = ac_tab.decode(reader)
330
+ run = (symbol >> 4) & 0x0F
331
+ size = symbol & 0x0F
332
+
333
+ if size == 0
334
+ if run == 15
335
+ k += 16
336
+ else
337
+ # EOBn
338
+ eobrun = (1 << run)
339
+ eobrun += reader.read_bits(run) if run > 0
340
+ return eobrun - 1
341
+ end
342
+ else
343
+ k += run
344
+ coeff_buf[offset + k] = reader.receive_extend(size) << al
345
+ k += 1
346
+ end
347
+ end
348
+
349
+ 0
350
+ end
351
+
352
+ def prog_ac_refine(reader, ac_tab, coeff_buf, offset, ss, se, al, eobrun)
353
+ p1 = 1 << al
354
+ m1 = -(1 << al)
355
+
356
+ if eobrun > 0
357
+ ss.upto(se) do |k|
358
+ prog_refine_bit(reader, coeff_buf, offset + k, p1, m1) if coeff_buf[offset + k] != 0
359
+ end
360
+ return eobrun - 1
361
+ end
362
+
363
+ k = ss
364
+ while k <= se
365
+ symbol = ac_tab.decode(reader)
366
+ r = (symbol >> 4) & 0x0F
367
+ s = symbol & 0x0F
368
+
369
+ # Read the new coefficient value before processing the run
370
+ # (the value bits come before refinement bits in the bitstream)
371
+ new_value = nil
372
+ if s != 0
373
+ new_value = reader.receive_extend(s) << al
374
+ elsif r != 15
375
+ # EOBn: refine remaining nonzero coefficients in this block
376
+ eobrun = (1 << r)
377
+ eobrun += reader.read_bits(r) if r > 0
378
+ while k <= se
379
+ prog_refine_bit(reader, coeff_buf, offset + k, p1, m1) if coeff_buf[offset + k] != 0
380
+ k += 1
381
+ end
382
+ return eobrun - 1
383
+ end
384
+
385
+ # Advance through the band: refine nonzero coefficients, count zeros for run.
386
+ # Break when we've skipped `r` zeros and found the target zero position.
387
+ while k <= se
388
+ if coeff_buf[offset + k] != 0
389
+ prog_refine_bit(reader, coeff_buf, offset + k, p1, m1)
390
+ elsif r == 0
391
+ break
392
+ else
393
+ r -= 1
394
+ end
395
+ k += 1
396
+ end
397
+
398
+ # Place new coefficient at the target zero position
399
+ if new_value && k <= se
400
+ coeff_buf[offset + k] = new_value
401
+ end
402
+ k += 1
403
+ end
404
+
405
+ 0
406
+ end
407
+
408
+ def prog_refine_bit(reader, coeff_buf, idx, p1, m1)
409
+ if reader.read_bit == 1
410
+ coeff_buf[idx] += coeff_buf[idx] > 0 ? p1 : m1
411
+ end
412
+ end
413
+
414
+ # --- Baseline decoding helpers ---
415
+
130
416
  def decode_block(reader, dc_tab, ac_tab, prev_dc, comp_id, out)
131
417
  # DC coefficient
132
418
  dc_cat = dc_tab.decode(reader)
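Successive approximation is easier to follow with numbers. The sketch below mirrors what `prog_dc_first` and `prog_dc_refine` do to a single DC coefficient: the first scan delivers the value shifted right by `al`, and each refinement scan ORs in one more low bit. The `split_dc` half is purely illustrative (PureJPEG only decodes progressive files), both helper names are hypothetical, and DC prediction (`prev_dc`) is left out.

```ruby
# Illustrative "encoder" half (an assumption for this sketch): split a DC value
# into the coarse value sent in the first scan and the low bits that are sent
# one per refinement scan, most significant first.
def split_dc(value, al)
  coarse = value >> al
  low_bits = (al - 1).downto(0).map { |bit| (value >> bit) & 1 }
  [coarse, low_bits]
end

# Decoder half: shift the coarse value back up (as in prog_dc_first), then OR
# in each refinement bit at its position (as in prog_dc_refine).
def rebuild_dc(coarse, low_bits, al)
  coeff = coarse << al
  low_bits.each_with_index do |bit, i|
    coeff |= bit << (al - 1 - i)
  end
  coeff
end

coarse, low_bits = split_dc(45, 2)
after_first_scan = coarse << 2                     # approximation before refinement
restored = rebuild_dc(coarse, low_bits, 2)         # full precision again
restored_negative = rebuild_dc(*split_dc(-45, 2), 2)
```

For 45 with `al = 2`, the first scan yields 44 and the two refinement bits `0, 1` bring it back to 45. Ruby's arithmetic right shift makes the same OR-based refinement work for negative values.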
@@ -185,7 +471,7 @@ module PureJPEG
185
471
  dst_row = y * width
186
472
  width.times do |x|
187
473
  v = ch[:data][src_row + x]
188
- pixels[dst_row + x] = Source::Pixel.new(v, v, v)
474
+ pixels[dst_row + x] = (v << 16) | (v << 8) | v
189
475
  end
190
476
  end
191
477
  Image.new(width, height, pixels)
@@ -228,7 +514,7 @@ module PureJPEG
228
514
  g = g < 0 ? 0 : (g > 255 ? 255 : g)
229
515
  b = b < 0 ? 0 : (b > 255 ? 255 : b)
230
516
 
231
- pixels[dst_row + px] = Source::Pixel.new(r, g, b)
517
+ pixels[dst_row + px] = (r << 16) | (g << 8) | b
232
518
  end
233
519
  end
234
520
 
@@ -41,6 +41,8 @@ module PureJPEG
41
41
  @quality = quality
42
42
  @grayscale = grayscale
43
43
  @chroma_quality = chroma_quality || quality
44
+ validate_qtable!(luminance_table, "luminance_table") if luminance_table
45
+ validate_qtable!(chrominance_table, "chrominance_table") if chrominance_table
44
46
  @luminance_table = luminance_table
45
47
  @chrominance_table = chrominance_table
46
48
  @quantization_modifier = quantization_modifier
@@ -78,6 +80,13 @@ module PureJPEG
78
80
  table
79
81
  end
80
82
 
83
+ def validate_qtable!(table, name)
84
+ raise ArgumentError, "#{name} must have exactly 64 elements (got #{table.length})" unless table.length == 64
85
+ unless table.all? { |v| v.is_a?(Integer) && v >= 1 && v <= 255 }
86
+ raise ArgumentError, "#{name} elements must be integers between 1 and 255"
87
+ end
88
+ end
89
+
81
90
  def encode(io)
82
91
  width = source.width
83
92
  height = source.height
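A standalone copy of the validation makes it easy to see which tables are rejected. The function body below follows the hunk closely, lightly restructured. `qtable_error` is a hypothetical helper added only to capture the messages. The 1..255 range matches 8-bit quantization tables, where each entry is stored in a single byte, and an entry of 0 would mean dividing by zero during quantization.

```ruby
def validate_qtable!(table, name)
  unless table.length == 64
    raise ArgumentError, "#{name} must have exactly 64 elements (got #{table.length})"
  end
  return if table.all? { |v| v.is_a?(Integer) && v.between?(1, 255) }

  raise ArgumentError, "#{name} elements must be integers between 1 and 255"
end

# Hypothetical helper: returns nil for a valid table, else the error message.
def qtable_error(table)
  validate_qtable!(table, "luminance_table")
  nil
rescue ArgumentError => e
  e.message
end

flat_ok   = qtable_error(Array.new(64, 16))   # valid: 64 integers in range
too_short = qtable_error(Array.new(63, 16))   # wrong length
has_zero  = qtable_error(Array.new(64, 0))    # out of range
has_float = qtable_error(Array.new(64, 16.0)) # right range, wrong type
```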
@@ -223,13 +232,37 @@ module PureJPEG
223
232
 
224
233
  # --- Pixel extraction ---
225
234
 
235
+ # Determine RGB bit shifts for a packed_pixels source.
236
+ # ChunkyPNG uses (r<<24 | g<<16 | b<<8 | a), Image uses (r<<16 | g<<8 | b).
237
+ def packed_shifts
238
+ if source.is_a?(Image)
239
+ [16, 8, 0]
240
+ else
241
+ [24, 16, 8]
242
+ end
243
+ end
244
+
226
245
  def extract_luminance(width, height)
227
246
  luminance = Array.new(width * height)
228
- height.times do |y|
229
- row = y * width
230
- width.times do |x|
231
- pixel = source[x, y]
232
- luminance[row + x] = (0.299 * pixel.r + 0.587 * pixel.g + 0.114 * pixel.b).round.clamp(0, 255)
247
+ if source.respond_to?(:packed_pixels)
248
+ packed = source.packed_pixels
249
+ r_shift, g_shift, b_shift = packed_shifts
250
+ i = 0
251
+ (width * height).times do
252
+ color = packed[i]
253
+ r = (color >> r_shift) & 0xFF
254
+ g = (color >> g_shift) & 0xFF
255
+ b = (color >> b_shift) & 0xFF
256
+ luminance[i] = (0.299 * r + 0.587 * g + 0.114 * b).round.clamp(0, 255)
257
+ i += 1
258
+ end
259
+ else
260
+ height.times do |y|
261
+ row = y * width
262
+ width.times do |x|
263
+ pixel = source[x, y]
264
+ luminance[row + x] = (0.299 * pixel.r + 0.587 * pixel.g + 0.114 * pixel.b).round.clamp(0, 255)
265
+ end
233
266
  end
234
267
  end
235
268
  luminance
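The two shift triples returned by `packed_shifts` correspond to the two packed layouts named in its comment. As a quick sanity check, here is a small sketch that unpacks the same colour from both layouts and applies the luminance formula from `extract_luminance`. `unpack_rgb` and the constant names are hypothetical, introduced only for this example.

```ruby
IMAGE_SHIFTS  = [16, 8, 0]  # Image layout:     (r << 16) | (g << 8) | b
CHUNKY_SHIFTS = [24, 16, 8] # ChunkyPNG layout: (r << 24) | (g << 16) | (b << 8) | a

# Hypothetical helper: pull [r, g, b] out of a packed integer.
def unpack_rgb(color, shifts)
  shifts.map { |shift| (color >> shift) & 0xFF }
end

image_color  = (200 << 16) | (100 << 8) | 50
chunky_color = (200 << 24) | (100 << 16) | (50 << 8) | 255 # fully opaque

from_image  = unpack_rgb(image_color, IMAGE_SHIFTS)
from_chunky = unpack_rgb(chunky_color, CHUNKY_SHIFTS)

r, g, b = from_image
luma = (0.299 * r + 0.587 * g + 0.114 * b).round.clamp(0, 255)
```

Both layouts give `[200, 100, 50]`, so one extraction loop can serve either source once the shifts are chosen. The alpha byte in the ChunkyPNG layout is simply never read.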
@@ -241,15 +274,31 @@ module PureJPEG
241
274
  cb_data = Array.new(size)
242
275
  cr_data = Array.new(size)
243
276
 
244
- height.times do |py|
245
- row = py * width
246
- width.times do |px|
247
- pixel = source[px, py]
248
- r = pixel.r; g = pixel.g; b = pixel.b
249
- i = row + px
277
+ if source.respond_to?(:packed_pixels)
278
+ packed = source.packed_pixels
279
+ r_shift, g_shift, b_shift = packed_shifts
280
+ i = 0
281
+ size.times do
282
+ color = packed[i]
283
+ r = (color >> r_shift) & 0xFF
284
+ g = (color >> g_shift) & 0xFF
285
+ b = (color >> b_shift) & 0xFF
250
286
  y_data[i] = ( 0.299 * r + 0.587 * g + 0.114 * b).round.clamp(0, 255)
251
287
  cb_data[i] = (-0.168736 * r - 0.331264 * g + 0.5 * b + 128.0).round.clamp(0, 255)
252
288
  cr_data[i] = ( 0.5 * r - 0.418688 * g - 0.081312 * b + 128.0).round.clamp(0, 255)
289
+ i += 1
290
+ end
291
+ else
292
+ height.times do |py|
293
+ row = py * width
294
+ width.times do |px|
295
+ pixel = source[px, py]
296
+ r = pixel.r; g = pixel.g; b = pixel.b
297
+ i = row + px
298
+ y_data[i] = ( 0.299 * r + 0.587 * g + 0.114 * b).round.clamp(0, 255)
299
+ cb_data[i] = (-0.168736 * r - 0.331264 * g + 0.5 * b + 128.0).round.clamp(0, 255)
300
+ cr_data[i] = ( 0.5 * r - 0.418688 * g - 0.081312 * b + 128.0).round.clamp(0, 255)
301
+ end
253
302
  end
254
303
  end
255
304
 
@@ -33,7 +33,7 @@ module PureJPEG
33
33
  return @values[@val_ptr[len] + code - @min_code[len]]
34
34
  end
35
35
  end
36
- raise "Invalid Huffman code"
36
+ raise PureJPEG::DecodeError, "Invalid Huffman code"
37
37
  end
38
38
  end
39
39
  end
@@ -33,8 +33,8 @@ module PureJPEG
33
33
  last_nonzero = 63
34
34
  last_nonzero -= 1 while last_nonzero > 0 && zigzag[last_nonzero] == 0
35
35
 
36
- if last_nonzero == 0 && zigzag[0] == zigzag[0] # AC starts at index 1
37
- # All AC coefficients are zero
36
+ if last_nonzero == 0
37
+ # All AC coefficients are zero (AC starts at index 1)
38
38
  eob = @ac_table[0x00]
39
39
  writer.write_bits(eob[0], eob[1])
40
40
  return
@@ -64,8 +64,9 @@ module PureJPEG
64
64
 
65
65
  # Build a lookup table: symbol -> [code, code_length]
66
66
  # from the bits/values specification.
67
+ # Returns an Array indexed by symbol value for O(1) lookup.
67
68
  def self.build_table(bits, values)
68
- table = {}
69
+ table = Array.new(256)
69
70
  code = 0
70
71
  k = 0
71
72
 
@@ -3,22 +3,28 @@
3
3
  module PureJPEG
4
4
  # A decoded JPEG image with pixel-level access.
5
5
  #
6
- # Implements the same pixel source interface (+width+, +height+, +[x, y]+)
7
- # as encoder inputs, so a decoded image can be passed directly to
8
- # {PureJPEG.encode} for re-encoding.
6
+ # Internally stores pixels as packed integers (+r << 16 | g << 8 | b+) to
7
+ # avoid per-pixel object allocation. Implements the same pixel source
8
+ # interface (+width+, +height+, +[x, y]+) as encoder inputs, so a decoded
9
+ # image can be passed directly to {PureJPEG.encode} for re-encoding.
9
10
  class Image
10
11
  # @return [Integer] image width in pixels
11
12
  attr_reader :width
12
13
  # @return [Integer] image height in pixels
13
14
  attr_reader :height
14
15
 
16
+ # @return [Array<Integer>] flat row-major array of packed RGB integers.
17
+ # Format: +(r << 16) | (g << 8) | b+.
18
+ attr_reader :packed_pixels
19
+
15
20
  # @param width [Integer]
16
21
  # @param height [Integer]
17
- # @param pixels [Array<Source::Pixel>] flat row-major array of pixels
18
- def initialize(width, height, pixels)
22
+ # @param packed_pixels [Array<Integer>] flat row-major array of packed RGB
23
+ # integers in the format +(r << 16) | (g << 8) | b+
24
+ def initialize(width, height, packed_pixels)
19
25
  @width = width
20
26
  @height = height
21
- @pixels = pixels
27
+ @packed_pixels = packed_pixels
22
28
  end
23
29
 
24
30
  # Retrieve a pixel by coordinate.
@@ -27,7 +33,8 @@ module PureJPEG
27
33
  # @param y [Integer] row (0-based)
28
34
  # @return [Source::Pixel] pixel with +.r+, +.g+, +.b+ in 0-255
29
35
  def [](x, y)
30
- @pixels[y * @width + x]
36
+ color = @packed_pixels[y * @width + x]
37
+ Source::Pixel.new((color >> 16) & 0xFF, (color >> 8) & 0xFF, color & 0xFF)
31
38
  end
32
39
 
33
40
  # Set a pixel by coordinate.
@@ -37,7 +44,8 @@ module PureJPEG
37
44
  # @param pixel [Source::Pixel] replacement pixel
38
45
  # @return [Source::Pixel]
39
46
  def []=(x, y, pixel)
40
- @pixels[y * @width + x] = pixel
47
+ @packed_pixels[y * @width + x] = (pixel.r << 16) | (pixel.g << 8) | pixel.b
48
+ pixel
41
49
  end
42
50
 
43
51
  # Iterate over every pixel in the image.
@@ -53,5 +61,24 @@ module PureJPEG
53
61
  end
54
62
  end
55
63
  end
64
+
65
+ # Iterate over every pixel without allocating Pixel structs.
66
+ #
67
+ # @yieldparam x [Integer] column
68
+ # @yieldparam y [Integer] row
69
+ # @yieldparam r [Integer] red component (0-255)
70
+ # @yieldparam g [Integer] green component (0-255)
71
+ # @yieldparam b [Integer] blue component (0-255)
72
+ # @return [void]
73
+ def each_rgb
74
+ i = 0
75
+ @height.times do |y|
76
+ @width.times do |x|
77
+ color = @packed_pixels[i]
78
+ yield x, y, (color >> 16) & 0xFF, (color >> 8) & 0xFF, color & 0xFF
79
+ i += 1
80
+ end
81
+ end
82
+ end
56
83
  end
57
84
  end
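To illustrate how `each_rgb` differs from coordinate access, here is a self-contained sketch. `TinyImage` and `SketchPixel` are hypothetical stand-ins that copy only the relevant parts of `Image` and `Source::Pixel`, so the example runs without the gem.

```ruby
# Hypothetical stand-in for Source::Pixel.
SketchPixel = Struct.new(:r, :g, :b)

# Hypothetical stand-in for PureJPEG::Image, reduced to packed storage,
# coordinate access, and each_rgb.
class TinyImage
  def initialize(width, height, packed_pixels)
    @width = width
    @height = height
    @packed_pixels = packed_pixels
  end

  # Allocates one struct per call.
  def [](x, y)
    color = @packed_pixels[y * @width + x]
    SketchPixel.new((color >> 16) & 0xFF, (color >> 8) & 0xFF, color & 0xFF)
  end

  # Yields plain integers, so no per-pixel object is created.
  def each_rgb
    i = 0
    @height.times do |y|
      @width.times do |x|
        color = @packed_pixels[i]
        yield x, y, (color >> 16) & 0xFF, (color >> 8) & 0xFF, color & 0xFF
        i += 1
      end
    end
  end
end

image = TinyImage.new(2, 1, [0xFF0000, 0x0000FF]) # one red pixel, one blue pixel

visited = []
image.each_rgb { |x, y, r, g, b| visited << [x, y, r, g, b] }

red_total = 0
image.each_rgb { |_x, _y, r, _g, _b| red_total += r }
```

For whole-image passes such as histograms or channel sums, the block form avoids creating a struct for every pixel, which is the allocation cost the changelog entry refers to.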
@@ -3,10 +3,11 @@
 module PureJPEG
   class JFIFReader
     attr_reader :width, :height, :components, :quant_tables, :huffman_tables,
-                :scan_components, :scan_data, :restart_interval
+                :restart_interval, :progressive, :scans
 
     Component = Struct.new(:id, :h_sampling, :v_sampling, :qt_id)
     ScanComponent = Struct.new(:id, :dc_table_id, :ac_table_id)
+    Scan = Struct.new(:components, :spectral_start, :spectral_end, :successive_high, :successive_low, :data, :huffman_tables)
 
     def initialize(data)
       @data = data.b
@@ -14,11 +15,20 @@ module PureJPEG
       @quant_tables = {}
       @huffman_tables = {}
       @components = []
-      @scan_components = []
       @restart_interval = 0
+      @progressive = false
+      @scans = []
       parse
     end
 
+    def scan_components
+      @scans.first&.components || []
+    end
+
+    def scan_data
+      @scans.first&.data || "".b
+    end
+
     private
 
     def parse
@@ -35,14 +45,21 @@ module PureJPEG
           parse_dht
         when 0xC0 # SOF0 (baseline)
           parse_sof0
+        when 0xC2 # SOF2 (progressive)
+          parse_sof0
+          @progressive = true
         when 0xDA # SOS
-          parse_sos
-          extract_scan_data
-          return
+          scan = parse_sos
+          scan.data = extract_scan_data
+          scan.huffman_tables = @huffman_tables.dup
+          @scans << scan
+          return unless @progressive
         when 0xFE # COM (comment)
           skip_segment
         when 0xDD # DRI (restart interval)
           parse_dri
+        when 0xD9 # EOI
+          return
         else
           skip_segment
         end
@@ -50,6 +67,7 @@ module PureJPEG
     end
 
     def read_byte
+      raise PureJPEG::DecodeError, "Unexpected end of JPEG data" if @pos >= @data.bytesize
       byte = @data.getbyte(@pos)
       @pos += 1
       byte
@@ -61,7 +79,7 @@ module PureJPEG
 
     def read_marker
       byte = read_byte
-      raise "Expected 0xFF, got 0x#{byte.to_s(16)}" unless byte == 0xFF
+      raise PureJPEG::DecodeError, "Expected 0xFF, got 0x#{byte.to_s(16)}" unless byte == 0xFF
       # Skip padding 0xFF bytes
       code = read_byte
       code = read_byte while code == 0xFF
@@ -70,7 +88,7 @@ module PureJPEG
 
     def expect_marker(expected)
       marker = read_marker
-      raise "Expected marker 0x#{expected.to_s(16)}, got 0x#{marker.to_s(16)}" unless marker == expected
+      raise PureJPEG::DecodeError, "Expected marker 0x#{expected.to_s(16)}, got 0x#{marker.to_s(16)}" unless marker == expected
     end
 
     def skip_segment
@@ -136,7 +154,7 @@ module PureJPEG
       read_u16 # length
       num_components = read_byte
 
-      @scan_components = Array.new(num_components) do
+      components = Array.new(num_components) do
         id = read_byte
         tables = read_byte
         dc_id = (tables >> 4) & 0x0F
@@ -144,8 +162,12 @@ module PureJPEG
         ScanComponent.new(id, dc_id, ac_id)
       end
 
-      # Spectral selection and approximation (ignored for baseline)
-      3.times { read_byte }
+      ss = read_byte # spectral selection start
+      se = read_byte # spectral selection end
+      ahl = read_byte # successive approximation
+      ah = (ahl >> 4) & 0x0F
+      al = ahl & 0x0F
+      Scan.new(components, ss, se, ah, al, nil)
     end
 
     def parse_dri
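
The last byte of the SOS header carries two 4-bit fields, which the hunk above now keeps instead of discarding. As a small sketch of that split, assuming a hypothetical helper name `split_successive_approximation` that does not exist in pure_jpeg:

```ruby
# Hypothetical helper mirroring the nibble split in parse_sos.
def split_successive_approximation(ahl)
  high = (ahl >> 4) & 0x0F # Ah: bit position used by the previous pass
  low  = ahl & 0x0F        # Al: bit position for this pass
  [high, low]
end

p split_successive_approximation(0x01) # prints [0, 1]
p split_successive_approximation(0x10) # prints [1, 0]
```

In a progressive file, a first pass over a coefficient band has Ah = 0, and a later refinement pass has Ah equal to the Al of the pass before it.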
@@ -156,18 +178,20 @@ module PureJPEG
     # Extract entropy-coded scan data (everything from current position to EOI marker).
     def extract_scan_data
       start = @pos
+      len = @data.bytesize
       # Scan forward looking for a marker that isn't a stuffing byte or restart
-      while @pos < @data.bytesize - 1
-        if @data.getbyte(@pos) == 0xFF
-          next_byte = @data.getbyte(@pos + 1)
-          # 0x00 is byte stuffing, 0xD0-0xD7 are restart markers — all part of scan data
-          if next_byte != 0x00 && !(next_byte >= 0xD0 && next_byte <= 0xD7) && next_byte != 0xFF
-            break
-          end
+      while @pos < len - 1
+        found = @data.index("\xFF".b, @pos)
+        break unless found && found < len - 1
+        @pos = found
+        next_byte = @data.getbyte(@pos + 1)
+        # 0x00 is byte stuffing, 0xD0-0xD7 are restart markers, 0xFF is padding — all part of scan data
+        if next_byte != 0x00 && !(next_byte >= 0xD0 && next_byte <= 0xD7) && next_byte != 0xFF
+          break
         end
-        @pos += 1
+        @pos += 2
       end
-      @scan_data = @data[start...@pos]
+      @data[start...@pos]
     end
   end
 end
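
The rewritten `extract_scan_data` jumps between 0xFF bytes with `String#index` rather than testing every byte. The following is a standalone sketch of the same idea, not the gem's code: `scan_data_end` is a hypothetical helper, and, as a choice of this sketch, it steps only one byte past a 0xFF fill byte so that the byte after it is examined again.

```ruby
# Hypothetical standalone helper; not part of pure_jpeg's API.
# Returns the offset of the first 0xFF that begins a real marker at or
# after +start+, or data.bytesize if no such marker is found.
def scan_data_end(data, start)
  data = data.b
  len = data.bytesize
  pos = start
  while pos < len - 1
    found = data.index("\xFF".b, pos)
    return len unless found && found < len - 1

    next_byte = data.getbyte(found + 1)
    if next_byte == 0x00 || (0xD0..0xD7).cover?(next_byte)
      pos = found + 2 # stuffed byte or restart marker: skip both bytes
    elsif next_byte == 0xFF
      pos = found + 1 # fill byte: re-examine the following 0xFF
    else
      return found    # any other marker ends the entropy-coded data
    end
  end
  len
end

sample = "\x12\xFF\x00\x34\xFF\xD0\x56\xFF\xFF\xD9".b
puts scan_data_end(sample, 0) # prints 8
```

In the sample, the stuffed pair at offset 1 and the restart marker at offset 4 are skipped, and the result is the offset of the 0xFF that immediately precedes the EOI code 0xD9.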
@@ -19,22 +19,25 @@ module PureJPEG
 
       # @param image [ChunkyPNG::Image] the source PNG image
       def initialize(image)
-        @image = image
         @width = image.width
         @height = image.height
+        @packed_pixels = image.pixels
       end
 
+      # @return [Array<Integer>] flat row-major array of packed RGBA integers
+      attr_reader :packed_pixels
+
       # Retrieve a pixel at the given coordinate.
       #
       # @param x [Integer] column (0-based)
       # @param y [Integer] row (0-based)
       # @return [Pixel]
       def [](x, y)
-        color = @image[x, y]
+        color = @packed_pixels[y * @width + x]
         Pixel.new(
-          ChunkyPNG::Color.r(color),
-          ChunkyPNG::Color.g(color),
-          ChunkyPNG::Color.b(color)
+          (color >> 24) & 0xFF,
+          (color >> 16) & 0xFF,
+          (color >> 8) & 0xFF
         )
       end
     end
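
Unlike the decoder's 0xRRGGBB integers, the values read here are packed as 0xRRGGBBAA, so every channel sits eight bits higher and the low byte (alpha) is ignored. A short sketch of that extraction, using a hypothetical helper `rgb_from_rgba` that is not part of either library:

```ruby
# Hypothetical helper: split a packed 0xRRGGBBAA integer and drop alpha.
def rgb_from_rgba(color)
  [(color >> 24) & 0xFF, (color >> 16) & 0xFF, (color >> 8) & 0xFF]
end

opaque_orange = 0xFF8000FF
p rgb_from_rgba(opaque_orange) # prints [255, 128, 0]
```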
@@ -27,7 +27,8 @@ module PureJPEG
       def initialize(width, height, &block)
         @width = width
         @height = height
-        @pixels = Array.new(width * height)
+        black = Pixel.new(0, 0, 0)
+        @pixels = Array.new(width * height, black)
 
         if block
           height.times do |y|
@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 
 module PureJPEG
-  VERSION = "0.1.0"
+  VERSION = "0.2.0"
 end
data/lib/pure_jpeg.rb CHANGED
@@ -25,6 +25,9 @@ require_relative "pure_jpeg/decoder"
 # Supports baseline DCT (SOF0) with 8-bit precision, grayscale and YCbCr
 # color (4:2:0 chroma subsampling), and standard Huffman tables (Annex K).
 module PureJPEG
+  # Raised when decoding invalid or unsupported JPEG data.
+  class DecodeError < StandardError; end
+
   # Encode a pixel source as a JPEG.
   #
   # @param source [#width, #height, #[]] any object responding to +width+,
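
With a dedicated `DecodeError`, callers can rescue decoding failures without catching unrelated `RuntimeError`s. The sketch below shows the pattern in isolation: the `SketchJPEG` module and `ByteCursor` class are hypothetical stand-ins, and only the error class name and the message text come from the diff.

```ruby
# Hypothetical stand-ins for illustration; not pure_jpeg's own classes.
module SketchJPEG
  class DecodeError < StandardError; end

  class ByteCursor
    def initialize(data)
      @data = data.b
      @pos = 0
    end

    # Fail loudly instead of returning nil when the input is truncated.
    def read_byte
      raise DecodeError, "Unexpected end of JPEG data" if @pos >= @data.bytesize

      byte = @data.getbyte(@pos)
      @pos += 1
      byte
    end
  end
end

cursor = SketchJPEG::ByteCursor.new("\xFF\xD8")
begin
  3.times { cursor.read_byte }
rescue SketchJPEG::DecodeError => e
  puts "decode failed: #{e.message}" # prints decode failed: Unexpected end of JPEG data
end
```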
metadata CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: pure_jpeg
 version: !ruby/object:Gem::Version
-  version: 0.1.0
+  version: 0.2.0
 platform: ruby
 authors:
 - Peter Cooper
@@ -78,7 +78,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
   requirements:
   - - ">="
     - !ruby/object:Gem::Version
-      version: 2.7.0
+      version: 3.0.0
 required_rubygems_version: !ruby/object:Gem::Requirement
   requirements:
   - - ">="