pure_jpeg 0.3.0 → 0.3.2
- checksums.yaml +4 -4
- data/CHANGELOG.md +24 -0
- data/README.md +12 -7
- data/lib/pure_jpeg/bit_writer.rb +2 -2
- data/lib/pure_jpeg/dct.rb +204 -56
- data/lib/pure_jpeg/decoder.rb +52 -33
- data/lib/pure_jpeg/encoder.rb +63 -45
- data/lib/pure_jpeg/huffman/encoder.rb +17 -12
- data/lib/pure_jpeg/quantization.rb +13 -2
- data/lib/pure_jpeg/source/raw_source.rb +1 -2
- data/lib/pure_jpeg/version.rb +1 -1
- metadata +1 -1
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: 780f932b176fecdb5daab2546909fe2610325e54f7364cf482e3e2652ab614a5
|
|
4
|
+
data.tar.gz: 1fbc350f25d09b989ee6262e077bb36847c6637db325fb03c29f2083bb0ec973
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: a385db40804bf4a992d78253ba619e90b147aa5be7808b341398918f5f36c3593fbd1a979aaad9729ad874f2427fe15d31fc9a0413e5f7ea98ee44e43e125f2d
|
|
7
|
+
data.tar.gz: b9f5581f4c4f27f42460961b3b4231fd94b45be130fbdce74a28bc4888f2a2f3de08fd43b9e3efc782ccb6b4f260aa8484ff7794e7bde1362018dc3deb282b26
|
data/CHANGELOG.md
CHANGED
|
@@ -1,5 +1,29 @@
|
|
|
1
1
|
# Changelog
|
|
2
2
|
|
|
3
|
+
## 0.3.2
|
|
4
|
+
|
|
5
|
+
Performance:
|
|
6
|
+
|
|
7
|
+
- Replaced matrix-multiply float DCT with integer-scaled AAN (Arai-Agui-Nakajima) DCT from the IJG reference implementation -- all-integer, no Float allocations
|
|
8
|
+
- Fixed-point integer arithmetic for RGB/YCbCr color space conversion in both encoder and decoder
|
|
9
|
+
- Eliminated short-lived Array allocations in Huffman encoder (`category_and_bits` split into separate methods)
|
|
10
|
+
- `String#<<` with Integer instead of `byte.chr` to avoid String allocations in bit writer
|
|
11
|
+
- DCT inner loop unrolling to eliminate nested block invocations
|
|
12
|
+
- Unrolled `write_block` and `extract_block_into` inner loops
|
|
13
|
+
- Integer rounding division in quantization (no more Float division + round)
|
|
14
|
+
- Hoisted hash lookups and method calls out of per-pixel loops in decoder
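The "integer rounding division" bullet above can be illustrated with a small sketch. The gem's actual `quantization.rb` change is not reproduced here, so `rounded_div` is a hypothetical helper; it assumes a positive quantizer step and rounds ties away from zero, which is what `Float#round` does by default.

```ruby
# frozen_string_literal: true

# Hypothetical helper (not the gem's code): divide `coeff` by a positive
# quantizer step `q`, rounding to the nearest integer with ties away from
# zero, using Integer arithmetic only.
def rounded_div(coeff, q)
  half = q >> 1
  coeff >= 0 ? (coeff + half) / q : -((half - coeff) / q)
end

# Float-based reference, shown only for comparison.
def rounded_div_float(coeff, q)
  (coeff / q.to_f).round
end

[[7, 2], [-7, 2], [5, 3], [-5, 3], [100, 16], [-100, 16]].each do |coeff, q|
  puts "#{coeff}/#{q}: int=#{rounded_div(coeff, q)} float=#{rounded_div_float(coeff, q)}"
end
```

The negative branch mirrors the positive one instead of relying on Ruby's floor division, so both signs round symmetrically.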
|
|
15
|
+
|
|
16
|
+
Result: ~2.9x faster encode, ~4.6x faster decode on Ruby 4.0.2 with YJIT.
|
|
17
|
+
|
|
18
|
+
Credits: [Ufuk Kayserilioglu](https://github.com/paracycle)
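To make the Huffman bullet in the 0.3.2 notes concrete, here is a minimal sketch of the split. The old combined `category_and_bits` body is not fully visible in this diff, so the version below is an assumed reconstruction. The sketch also swaps the gem's shift loop for `Integer#bit_length`, which yields the same category.

```ruby
# frozen_string_literal: true

# Assumed shape of the old combined method: one fresh two-element Array
# is allocated per coefficient.
def category_and_bits(value)
  cat = value.abs.bit_length
  bits = value > 0 ? value : value + (1 << cat) - 1
  [cat, bits]
end

# Split form: each call returns a plain Integer, so nothing is allocated.
def category(value)
  value.abs.bit_length
end

# Extra bits for `value` given its category. Non-positive values are
# offset by (1 << cat) - 1, as in the diff's `value_bits`.
def value_bits(value, cat)
  value > 0 ? value : value + (1 << cat) - 1
end

[0, 1, -1, 5, -5, 255].each do |v|
  cat = category(v)
  puts "value=#{v} category=#{cat} bits=#{value_bits(v, cat)}"
end
```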
|
|
19
|
+
|
|
20
|
+
## 0.3.1
|
|
21
|
+
|
|
22
|
+
Fixes:
|
|
23
|
+
|
|
24
|
+
- Fixed shared `Pixel` instance bug in decoder that could corrupt pixel data
|
|
25
|
+
- Encoder validates return values from `quantization_modifier` blocks
|
|
26
|
+
|
|
3
27
|
## 0.3.0
|
|
4
28
|
|
|
5
29
|
New features:
|
data/README.md
CHANGED
|
@@ -194,25 +194,26 @@ Decoding:
|
|
|
194
194
|
|
|
195
195
|
Not supported: arithmetic coding, 12-bit precision, EXIF/ICC profile preservation, adding a default background for transparent sources (see what happens above!). Largely because I don't need these, but they are all do-able, especially with how loosely coupled this library is internally. Raise an issue if you really care about them!
|
|
196
196
|
|
|
197
|
-
Possible future improvements:
|
|
197
|
+
Possible future improvements: ICC profile rendering/conversion.
|
|
198
198
|
|
|
199
199
|
## Performance
|
|
200
200
|
|
|
201
|
-
On a 1024x1024 image (Ruby 4.0.
|
|
201
|
+
On a 1024x1024 image (Ruby 4.0.2 with YJIT on an M5):
|
|
202
202
|
|
|
203
203
|
| Operation | Time |
|
|
204
204
|
|-----------|------|
|
|
205
|
-
| Encode (color, q85) | ~
|
|
206
|
-
| Decode (
|
|
205
|
+
| Encode (color, q85) | ~0.16s |
|
|
206
|
+
| Decode (baseline) | ~0.14s |
|
|
207
|
+
| Decode (progressive) | ~0.18s |
|
|
207
208
|
|
|
208
|
-
|
|
209
|
+
The encoder and decoder use an integer-scaled AAN (Arai-Agui-Nakajima) DCT with fixed-point arithmetic throughout — no Float operations in the hot path. Color space conversion uses fixed-point integer math, and pixel data is stored as packed integers to avoid per-pixel object allocation.
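As a rough illustration of "packed integers", the sketch below stores one pixel as a single Integer. The `0xRRGGBB` layout is an assumption: it matches the `(v << 16) | (v << 8) | v` expression in the decoder diff, but the encoder reads its shifts from `packed_shifts`, so other sources may use a different order. `pack_rgb` and `unpack_rgb` are hypothetical names.

```ruby
# frozen_string_literal: true

# Pack three 8-bit channels into one Integer (assumed layout: 0xRRGGBB).
def pack_rgb(r, g, b)
  (r << 16) | (g << 8) | b
end

# Recover the channels with shifts and masks; no per-pixel object is needed.
def unpack_rgb(color)
  [(color >> 16) & 0xFF, (color >> 8) & 0xFF, color & 0xFF]
end

pixels = Array.new(4) { |i| pack_rgb(i * 10, i * 20, i * 30) }
pixels.each { |c| puts format("0x%06X -> %p", c, unpack_rgb(c)) }
```

Small Integers are immediate values in CRuby, so an Array of packed pixels holds no per-pixel heap objects.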
|
|
209
210
|
|
|
210
211
|
## Some useful `rake` tasks
|
|
211
212
|
|
|
212
213
|
```
|
|
213
214
|
bundle install
|
|
214
215
|
rake test # run the test suite
|
|
215
|
-
rake benchmark # benchmark encoding (3 runs
|
|
216
|
+
rake benchmark # benchmark encoding and decoding (3 runs each)
|
|
216
217
|
rake profile # CPU profile with StackProf (requires the stackprof gem)
|
|
217
218
|
```
|
|
218
219
|
|
|
@@ -222,7 +223,7 @@ rake profile # CPU profile with StackProf (requires the stackprof gem)
|
|
|
222
223
|
|
|
223
224
|
**I have read all of the code produced up to v0.2.0.** The algorithms are above my paygrade, but I'm OK with what has been produced, and I manually fixed a variety of stylistic things along the way. For example, CC seems to like wrapping entire functions in `if` statements rather than bailing on the opposite condition. *Later update: I have not read the ICC and optimized Huffman code yet, but it is heavily tested.*
|
|
224
225
|
|
|
225
|
-
**CC needed a lot of guidance.** Its initial JPEG algorithm was somewhat naive and output odd looking JPEGs akin to those of my
|
|
226
|
+
**CC needed a lot of guidance.** Its initial JPEG algorithm was somewhat naive and output odd looking JPEGs akin to those of my [Casio QV-10 digital camera](https://medium.com/people-gadgets/the-gadget-we-miss-the-casio-qv-10-digital-camera-c25ab786ce49) from the late 1990s. After some back and forth and image comparisons, we figured out it was doing the quantization entirely wrong (specifically not using the zigzag approach during quantization but just going in raster order). I *like* this aesthetic, but fixed it up so that it works as a generally usable JPEG library, while adding ways to customize things so you can recreate the effect, if preferred (see `CREATIVE.md` for more on that).
|
|
226
227
|
|
|
227
228
|
**CC is lazy.** The initial implementation was VERY SLOW. It took 15 seconds to turn a 1024x1024 PNG into a JPEG, so we went down the profiling rabbit hole and found many optimizations to make it ~6x faster. CC is poor at considering the role of Ruby's GC when implementing low level algorithms and needs some prodding to make the correct optimizations. CC is also lazy to the point of recommending that you just use another language (e.g. Go or Rust) rather than do a pure Ruby version of something - despite it being possible with some extra work.
|
|
228
229
|
|
|
@@ -232,6 +233,10 @@ rake profile # CPU profile with StackProf (requires the stackprof gem)
|
|
|
232
233
|
|
|
233
234
|
**The final 10% still takes 90% of the time.** As mentioned above, the first run was quick, but getting things right has taken much longer. v0.1->0.2 has taken longer than 0.1 did! But we now have progressive JPEG support, even more optimizations, better tests, etc. etc.
|
|
234
235
|
|
|
236
|
+
## Credits
|
|
237
|
+
|
|
238
|
+
- [Ufuk Kayserilioglu](https://github.com/paracycle) - Major performance optimizations including integer-scaled AAN DCT, fixed-point color space conversion, and YJIT-targeted improvements.
|
|
239
|
+
|
|
235
240
|
## License
|
|
236
241
|
|
|
237
242
|
MIT
|
data/lib/pure_jpeg/bit_writer.rb
CHANGED
|
@@ -17,8 +17,8 @@ module PureJPEG
|
|
|
17
17
|
while @bits_in_buffer >= 8
|
|
18
18
|
@bits_in_buffer -= 8
|
|
19
19
|
byte = (@buffer >> @bits_in_buffer) & 0xFF
|
|
20
|
-
@data << byte
|
|
21
|
-
@data <<
|
|
20
|
+
@data << byte
|
|
21
|
+
@data << 0x00 if byte == 0xFF # byte stuffing
|
|
22
22
|
end
|
|
23
23
|
|
|
24
24
|
@buffer &= (1 << @bits_in_buffer) - 1
|
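The hunk above is small but combines two ideas: appending an Integer to a String with `<<`, and JPEG byte stuffing (a `0x00` after every `0xFF` in entropy-coded data). The sketch below is a simplified, hypothetical writer, not the gem's `BitWriter`. It assumes the output String is binary-encoded, because on a UTF-8 String `<< 0xFF` would append a two-byte character instead of a single byte.

```ruby
# frozen_string_literal: true

# Hypothetical, simplified bit writer for illustration only.
class SketchBitWriter
  attr_reader :data

  def initialize
    @data = String.new(encoding: Encoding::BINARY)
    @buffer = 0
    @bits_in_buffer = 0
  end

  # Append the low `length` bits of `value`, most significant bit first.
  def write_bits(value, length)
    @buffer = (@buffer << length) | (value & ((1 << length) - 1))
    @bits_in_buffer += length
    while @bits_in_buffer >= 8
      @bits_in_buffer -= 8
      byte = (@buffer >> @bits_in_buffer) & 0xFF
      @data << byte                 # Integer append: no temporary String from byte.chr
      @data << 0x00 if byte == 0xFF # byte stuffing
    end
    @buffer &= (1 << @bits_in_buffer) - 1
  end
end

writer = SketchBitWriter.new
writer.write_bits(0xFF, 8)
writer.write_bits(0b1010, 4)
writer.write_bits(0b0101, 4)
puts writer.data.bytes.map { |b| format("%02X", b) }.join(" ")
```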
data/lib/pure_jpeg/dct.rb
CHANGED
|
@@ -1,10 +1,15 @@
|
|
|
1
1
|
# frozen_string_literal: true
|
|
2
2
|
|
|
3
3
|
module PureJPEG
|
|
4
|
+
# Integer-scaled DCT based on the IJG (Independent JPEG Group) reference
|
|
5
|
+
# implementation (jfdctint.c / jidctint.c). Uses the Arai-Agui-Nakajima
|
|
6
|
+
# factorization with 13-bit fixed-point constants.
|
|
7
|
+
#
|
|
8
|
+
# All arithmetic is pure Integer (additions, shifts, multiplies) — no Float
|
|
9
|
+
# operations. This is ~3x faster than the matrix-multiply float DCT under
|
|
10
|
+
# YJIT and eliminates millions of Float object allocations during decode.
|
|
4
11
|
module DCT
|
|
5
|
-
#
|
|
6
|
-
# where C(0) = 1/sqrt(2), C(k) = 1 for k > 0.
|
|
7
|
-
# This lets us do the 2D DCT as two 1D matrix-vector multiplies (separable).
|
|
12
|
+
# Keep the float matrix available for reference / testing
|
|
8
13
|
MATRIX = Array.new(8) { |k|
|
|
9
14
|
ck = k == 0 ? 0.5 / Math.sqrt(2.0) : 0.5
|
|
10
15
|
Array.new(8) { |n|
|
|
@@ -12,72 +17,215 @@ module PureJPEG
|
|
|
12
17
|
}
|
|
13
18
|
}.freeze
|
|
14
19
|
|
|
15
|
-
# Flatten for faster indexed access
|
|
16
20
|
MATRIX_FLAT = MATRIX.flatten.freeze
|
|
17
|
-
|
|
18
|
-
# Transposed matrix for inverse DCT: A^T[n][k] = A[k][n]
|
|
19
21
|
MATRIX_T_FLAT = Array.new(64) { |i| MATRIX_FLAT[(i % 8) * 8 + i / 8] }.freeze
|
|
20
22
|
|
|
21
|
-
#
|
|
22
|
-
|
|
23
|
-
|
|
24
|
-
|
|
25
|
-
|
|
26
|
-
|
|
27
|
-
|
|
28
|
-
|
|
29
|
-
|
|
30
|
-
|
|
31
|
-
|
|
32
|
-
|
|
33
|
-
|
|
34
|
-
|
|
35
|
-
|
|
23
|
+
# Fixed-point constants (13-bit precision) from IJG reference.
|
|
24
|
+
CONST_BITS = 13
|
|
25
|
+
PASS1_BITS = 2
|
|
26
|
+
|
|
27
|
+
FIX_0_298631336 = 2446
|
|
28
|
+
FIX_0_390180644 = 3196
|
|
29
|
+
FIX_0_541196100 = 4433
|
|
30
|
+
FIX_0_765366865 = 6270
|
|
31
|
+
FIX_0_899976223 = 7373
|
|
32
|
+
FIX_1_175875602 = 9633
|
|
33
|
+
FIX_1_501321110 = 12299
|
|
34
|
+
FIX_1_847759065 = 15137
|
|
35
|
+
FIX_1_961570560 = 16069
|
|
36
|
+
FIX_2_053119869 = 16819
|
|
37
|
+
FIX_2_562915447 = 20995
|
|
38
|
+
FIX_3_072711026 = 25172
|
|
39
|
+
|
|
40
|
+
CB = CONST_BITS
|
|
41
|
+
P1 = PASS1_BITS
|
|
42
|
+
CB_M_P1 = CB - P1 # 11
|
|
43
|
+
CB_P_P1_P3 = CB + P1 + 3 # 18
|
|
44
|
+
P1_P3 = P1 + 3 # 5
|
|
45
|
+
CB2_P_P1 = CB * 2 + P1 # 28 (unused, was for column even-multiplied path)
|
|
46
|
+
|
|
47
|
+
# Forward 2D DCT (in-place). Input: 64-element array of level-shifted
|
|
48
|
+
# integers (-128..127). Output: DCT coefficients (integers).
|
|
49
|
+
# The `_temp` and `_out` parameters are accepted for API compatibility
|
|
50
|
+
# but ignored; computation is done in-place on `data`.
|
|
51
|
+
def self.forward!(data, _temp = nil, _out = nil)
|
|
52
|
+
# Pass 1: process rows
|
|
53
|
+
8.times do |row|
|
|
54
|
+
i = row << 3
|
|
55
|
+
d0 = data[i]; d1 = data[i+1]; d2 = data[i+2]; d3 = data[i+3]
|
|
56
|
+
d4 = data[i+4]; d5 = data[i+5]; d6 = data[i+6]; d7 = data[i+7]
|
|
57
|
+
|
|
58
|
+
tmp0 = d0 + d7; tmp7 = d0 - d7
|
|
59
|
+
tmp1 = d1 + d6; tmp6 = d1 - d6
|
|
60
|
+
tmp2 = d2 + d5; tmp5 = d2 - d5
|
|
61
|
+
tmp3 = d3 + d4; tmp4 = d3 - d4
|
|
62
|
+
|
|
63
|
+
# Even part
|
|
64
|
+
tmp10 = tmp0 + tmp3; tmp13 = tmp0 - tmp3
|
|
65
|
+
tmp11 = tmp1 + tmp2; tmp12 = tmp1 - tmp2
|
|
66
|
+
|
|
67
|
+
data[i] = (tmp10 + tmp11) << P1
|
|
68
|
+
data[i+4] = (tmp10 - tmp11) << P1
|
|
69
|
+
|
|
70
|
+
z1 = (tmp12 + tmp13) * FIX_0_541196100
|
|
71
|
+
data[i+2] = (z1 + tmp13 * FIX_0_765366865 + (1 << (CB_M_P1 - 1))) >> CB_M_P1
|
|
72
|
+
data[i+6] = (z1 - tmp12 * FIX_1_847759065 + (1 << (CB_M_P1 - 1))) >> CB_M_P1
|
|
73
|
+
|
|
74
|
+
# Odd part
|
|
75
|
+
z1 = tmp4 + tmp7; z2 = tmp5 + tmp6
|
|
76
|
+
z3 = tmp4 + tmp6; z4 = tmp5 + tmp7
|
|
77
|
+
z5 = (z3 + z4) * FIX_1_175875602
|
|
78
|
+
|
|
79
|
+
tmp4 = tmp4 * FIX_0_298631336
|
|
80
|
+
tmp5 = tmp5 * FIX_2_053119869
|
|
81
|
+
tmp6 = tmp6 * FIX_3_072711026
|
|
82
|
+
tmp7 = tmp7 * FIX_1_501321110
|
|
83
|
+
z1 = z1 * -FIX_0_899976223
|
|
84
|
+
z2 = z2 * -FIX_2_562915447
|
|
85
|
+
z3 = z3 * -FIX_1_961570560 + z5
|
|
86
|
+
z4 = z4 * -FIX_0_390180644 + z5
|
|
87
|
+
|
|
88
|
+
data[i+7] = (tmp4 + z1 + z3 + (1 << (CB_M_P1 - 1))) >> CB_M_P1
|
|
89
|
+
data[i+5] = (tmp5 + z2 + z4 + (1 << (CB_M_P1 - 1))) >> CB_M_P1
|
|
90
|
+
data[i+3] = (tmp6 + z2 + z3 + (1 << (CB_M_P1 - 1))) >> CB_M_P1
|
|
91
|
+
data[i+1] = (tmp7 + z1 + z4 + (1 << (CB_M_P1 - 1))) >> CB_M_P1
|
|
36
92
|
end
|
|
37
93
|
|
|
38
|
-
#
|
|
39
|
-
8.times do |
|
|
40
|
-
|
|
41
|
-
|
|
42
|
-
|
|
43
|
-
|
|
44
|
-
|
|
45
|
-
|
|
46
|
-
|
|
94
|
+
# Pass 2: process columns
|
|
95
|
+
8.times do |col|
|
|
96
|
+
d0 = data[col]; d1 = data[col+8]; d2 = data[col+16]; d3 = data[col+24]
|
|
97
|
+
d4 = data[col+32]; d5 = data[col+40]; d6 = data[col+48]; d7 = data[col+56]
|
|
98
|
+
|
|
99
|
+
tmp0 = d0 + d7; tmp7 = d0 - d7
|
|
100
|
+
tmp1 = d1 + d6; tmp6 = d1 - d6
|
|
101
|
+
tmp2 = d2 + d5; tmp5 = d2 - d5
|
|
102
|
+
tmp3 = d3 + d4; tmp4 = d3 - d4
|
|
103
|
+
|
|
104
|
+
tmp10 = tmp0 + tmp3; tmp13 = tmp0 - tmp3
|
|
105
|
+
tmp11 = tmp1 + tmp2; tmp12 = tmp1 - tmp2
|
|
106
|
+
|
|
107
|
+
data[col] = (tmp10 + tmp11 + (1 << (P1_P3 - 1))) >> P1_P3
|
|
108
|
+
data[col+32] = (tmp10 - tmp11 + (1 << (P1_P3 - 1))) >> P1_P3
|
|
109
|
+
|
|
110
|
+
z1 = (tmp12 + tmp13) * FIX_0_541196100
|
|
111
|
+
data[col+16] = (z1 + tmp13 * FIX_0_765366865 + (1 << (CB_P_P1_P3 - 1))) >> CB_P_P1_P3
|
|
112
|
+
data[col+48] = (z1 - tmp12 * FIX_1_847759065 + (1 << (CB_P_P1_P3 - 1))) >> CB_P_P1_P3
|
|
113
|
+
|
|
114
|
+
z1 = tmp4 + tmp7; z2 = tmp5 + tmp6
|
|
115
|
+
z3 = tmp4 + tmp6; z4 = tmp5 + tmp7
|
|
116
|
+
z5 = (z3 + z4) * FIX_1_175875602
|
|
117
|
+
|
|
118
|
+
tmp4 = tmp4 * FIX_0_298631336
|
|
119
|
+
tmp5 = tmp5 * FIX_2_053119869
|
|
120
|
+
tmp6 = tmp6 * FIX_3_072711026
|
|
121
|
+
tmp7 = tmp7 * FIX_1_501321110
|
|
122
|
+
z1 = z1 * -FIX_0_899976223
|
|
123
|
+
z2 = z2 * -FIX_2_562915447
|
|
124
|
+
z3 = z3 * -FIX_1_961570560 + z5
|
|
125
|
+
z4 = z4 * -FIX_0_390180644 + z5
|
|
126
|
+
|
|
127
|
+
data[col+56] = (tmp4 + z1 + z3 + (1 << (CB_P_P1_P3 - 1))) >> CB_P_P1_P3
|
|
128
|
+
data[col+40] = (tmp5 + z2 + z4 + (1 << (CB_P_P1_P3 - 1))) >> CB_P_P1_P3
|
|
129
|
+
data[col+24] = (tmp6 + z2 + z3 + (1 << (CB_P_P1_P3 - 1))) >> CB_P_P1_P3
|
|
130
|
+
data[col+8] = (tmp7 + z1 + z4 + (1 << (CB_P_P1_P3 - 1))) >> CB_P_P1_P3
|
|
47
131
|
end
|
|
48
132
|
|
|
49
|
-
|
|
133
|
+
data
|
|
50
134
|
end
|
|
51
135
|
|
|
52
|
-
#
|
|
53
|
-
#
|
|
54
|
-
def self.inverse!(
|
|
55
|
-
|
|
56
|
-
|
|
57
|
-
|
|
58
|
-
|
|
59
|
-
|
|
60
|
-
|
|
61
|
-
|
|
62
|
-
|
|
63
|
-
|
|
64
|
-
|
|
65
|
-
|
|
66
|
-
|
|
136
|
+
# Inverse 2D DCT (in-place). Input: dequantized DCT coefficients (integers).
|
|
137
|
+
# Output: spatial-domain values (integers) that still need +128 level shift.
|
|
138
|
+
def self.inverse!(data, _temp = nil, _out = nil)
|
|
139
|
+
# Pass 1: process columns
|
|
140
|
+
8.times do |col|
|
|
141
|
+
d0 = data[col]; d2 = data[col+16]; d4 = data[col+32]; d6 = data[col+48]
|
|
142
|
+
d1 = data[col+8]; d3 = data[col+24]; d5 = data[col+40]; d7 = data[col+56]
|
|
143
|
+
|
|
144
|
+
# Even part
|
|
145
|
+
z1 = (d2 + d6) * FIX_0_541196100
|
|
146
|
+
tmp2 = z1 - d6 * FIX_1_847759065
|
|
147
|
+
tmp3 = z1 + d2 * FIX_0_765366865
|
|
148
|
+
|
|
149
|
+
tmp0 = (d0 + d4) << CB
|
|
150
|
+
tmp1 = (d0 - d4) << CB
|
|
151
|
+
|
|
152
|
+
tmp10 = tmp0 + tmp3; tmp13 = tmp0 - tmp3
|
|
153
|
+
tmp11 = tmp1 + tmp2; tmp12 = tmp1 - tmp2
|
|
154
|
+
|
|
155
|
+
# Odd part
|
|
156
|
+
tmp0 = d7; tmp1 = d5; tmp2 = d3; tmp3 = d1
|
|
157
|
+
z1 = tmp0 + tmp3; z2 = tmp1 + tmp2
|
|
158
|
+
z3 = tmp0 + tmp2; z4 = tmp1 + tmp3
|
|
159
|
+
z5 = (z3 + z4) * FIX_1_175875602
|
|
160
|
+
|
|
161
|
+
tmp0 = tmp0 * FIX_0_298631336
|
|
162
|
+
tmp1 = tmp1 * FIX_2_053119869
|
|
163
|
+
tmp2 = tmp2 * FIX_3_072711026
|
|
164
|
+
tmp3 = tmp3 * FIX_1_501321110
|
|
165
|
+
z1 = z1 * -FIX_0_899976223
|
|
166
|
+
z2 = z2 * -FIX_2_562915447
|
|
167
|
+
z3 = z3 * -FIX_1_961570560 + z5
|
|
168
|
+
z4 = z4 * -FIX_0_390180644 + z5
|
|
169
|
+
|
|
170
|
+
tmp0 += z1 + z3; tmp1 += z2 + z4
|
|
171
|
+
tmp2 += z2 + z3; tmp3 += z1 + z4
|
|
172
|
+
|
|
173
|
+
data[col] = (tmp10 + tmp3 + (1 << (CB_M_P1 - 1))) >> CB_M_P1
|
|
174
|
+
data[col+56] = (tmp10 - tmp3 + (1 << (CB_M_P1 - 1))) >> CB_M_P1
|
|
175
|
+
data[col+8] = (tmp11 + tmp2 + (1 << (CB_M_P1 - 1))) >> CB_M_P1
|
|
176
|
+
data[col+48] = (tmp11 - tmp2 + (1 << (CB_M_P1 - 1))) >> CB_M_P1
|
|
177
|
+
data[col+16] = (tmp12 + tmp1 + (1 << (CB_M_P1 - 1))) >> CB_M_P1
|
|
178
|
+
data[col+40] = (tmp12 - tmp1 + (1 << (CB_M_P1 - 1))) >> CB_M_P1
|
|
179
|
+
data[col+24] = (tmp13 + tmp0 + (1 << (CB_M_P1 - 1))) >> CB_M_P1
|
|
180
|
+
data[col+32] = (tmp13 - tmp0 + (1 << (CB_M_P1 - 1))) >> CB_M_P1
|
|
67
181
|
end
|
|
68
182
|
|
|
69
|
-
#
|
|
70
|
-
8.times do |
|
|
71
|
-
|
|
72
|
-
|
|
73
|
-
|
|
74
|
-
|
|
75
|
-
|
|
76
|
-
|
|
77
|
-
|
|
183
|
+
# Pass 2: process rows
|
|
184
|
+
8.times do |row|
|
|
185
|
+
i = row << 3
|
|
186
|
+
d0 = data[i]; d2 = data[i+2]; d4 = data[i+4]; d6 = data[i+6]
|
|
187
|
+
d1 = data[i+1]; d3 = data[i+3]; d5 = data[i+5]; d7 = data[i+7]
|
|
188
|
+
|
|
189
|
+
# Even part
|
|
190
|
+
z1 = (d2 + d6) * FIX_0_541196100
|
|
191
|
+
tmp2 = z1 - d6 * FIX_1_847759065
|
|
192
|
+
tmp3 = z1 + d2 * FIX_0_765366865
|
|
193
|
+
|
|
194
|
+
tmp0 = (d0 + d4) << CB
|
|
195
|
+
tmp1 = (d0 - d4) << CB
|
|
196
|
+
|
|
197
|
+
tmp10 = tmp0 + tmp3; tmp13 = tmp0 - tmp3
|
|
198
|
+
tmp11 = tmp1 + tmp2; tmp12 = tmp1 - tmp2
|
|
199
|
+
|
|
200
|
+
# Odd part
|
|
201
|
+
tmp0 = d7; tmp1 = d5; tmp2 = d3; tmp3 = d1
|
|
202
|
+
z1 = tmp0 + tmp3; z2 = tmp1 + tmp2
|
|
203
|
+
z3 = tmp0 + tmp2; z4 = tmp1 + tmp3
|
|
204
|
+
z5 = (z3 + z4) * FIX_1_175875602
|
|
205
|
+
|
|
206
|
+
tmp0 = tmp0 * FIX_0_298631336
|
|
207
|
+
tmp1 = tmp1 * FIX_2_053119869
|
|
208
|
+
tmp2 = tmp2 * FIX_3_072711026
|
|
209
|
+
tmp3 = tmp3 * FIX_1_501321110
|
|
210
|
+
z1 = z1 * -FIX_0_899976223
|
|
211
|
+
z2 = z2 * -FIX_2_562915447
|
|
212
|
+
z3 = z3 * -FIX_1_961570560 + z5
|
|
213
|
+
z4 = z4 * -FIX_0_390180644 + z5
|
|
214
|
+
|
|
215
|
+
tmp0 += z1 + z3; tmp1 += z2 + z4
|
|
216
|
+
tmp2 += z2 + z3; tmp3 += z1 + z4
|
|
217
|
+
|
|
218
|
+
data[i] = (tmp10 + tmp3 + (1 << (CB_P_P1_P3 - 1))) >> CB_P_P1_P3
|
|
219
|
+
data[i+7] = (tmp10 - tmp3 + (1 << (CB_P_P1_P3 - 1))) >> CB_P_P1_P3
|
|
220
|
+
data[i+1] = (tmp11 + tmp2 + (1 << (CB_P_P1_P3 - 1))) >> CB_P_P1_P3
|
|
221
|
+
data[i+6] = (tmp11 - tmp2 + (1 << (CB_P_P1_P3 - 1))) >> CB_P_P1_P3
|
|
222
|
+
data[i+2] = (tmp12 + tmp1 + (1 << (CB_P_P1_P3 - 1))) >> CB_P_P1_P3
|
|
223
|
+
data[i+5] = (tmp12 - tmp1 + (1 << (CB_P_P1_P3 - 1))) >> CB_P_P1_P3
|
|
224
|
+
data[i+3] = (tmp13 + tmp0 + (1 << (CB_P_P1_P3 - 1))) >> CB_P_P1_P3
|
|
225
|
+
data[i+4] = (tmp13 - tmp0 + (1 << (CB_P_P1_P3 - 1))) >> CB_P_P1_P3
|
|
78
226
|
end
|
|
79
227
|
|
|
80
|
-
|
|
228
|
+
data
|
|
81
229
|
end
|
|
82
230
|
end
|
|
83
231
|
end
|
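The `FIX_*` constants and the repeated `(x + (1 << (n - 1))) >> n` pattern in the DCT above follow a common fixed-point recipe, sketched below. `fix` and `descale` are illustrative helper names (the IJG C sources use macros with similar names), and the module is not part of the gem. As an aside, IJG's `jfdctint.c` credits this constant set to the Loeffler-Ligtenberg-Moschytz factorization; IJG's AAN variant lives in `jfdctfst.c`.

```ruby
# frozen_string_literal: true

# Illustrative helpers, not part of the gem.
module FixedPointSketch
  CONST_BITS = 13

  module_function

  # Scale a real constant by 2**CONST_BITS and round to an Integer.
  def fix(x)
    (x * (1 << CONST_BITS)).round
  end

  # Divide by 2**n with rounding: add half of the divisor, then shift.
  def descale(x, n)
    (x + (1 << (n - 1))) >> n
  end
end

c = FixedPointSketch.fix(0.541196100)
puts "FIX(0.541196100) = #{c}"
# Multiply a sample by the scaled constant, then shift the scale back out.
puts "10 * 0.541196100 ~ #{FixedPointSketch.descale(10 * c, FixedPointSketch::CONST_BITS)}"
```

Because Ruby's `>>` is an arithmetic shift on negative Integers, the same `descale` works for both signs without a separate branch.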
data/lib/pure_jpeg/decoder.rb
CHANGED
|
@@ -78,10 +78,8 @@ module PureJPEG
|
|
|
78
78
|
|
|
79
79
|
# Reusable buffers
|
|
80
80
|
zigzag = Array.new(64, 0)
|
|
81
|
-
raster = Array.new(64, 0
|
|
82
|
-
dequant = Array.new(64, 0
|
|
83
|
-
temp = Array.new(64, 0.0)
|
|
84
|
-
spatial = Array.new(64, 0.0)
|
|
81
|
+
raster = Array.new(64, 0)
|
|
82
|
+
dequant = Array.new(64, 0)
|
|
85
83
|
|
|
86
84
|
mcus_y.times do |mcu_row|
|
|
87
85
|
mcus_x.times do |mcu_col|
|
|
@@ -104,12 +102,12 @@ module PureJPEG
|
|
|
104
102
|
# Inverse pipeline: unzigzag -> dequantize -> IDCT -> level shift
|
|
105
103
|
Zigzag.unreorder!(zigzag, raster)
|
|
106
104
|
Quantization.dequantize!(raster, qt, dequant)
|
|
107
|
-
DCT.inverse!(dequant
|
|
105
|
+
DCT.inverse!(dequant)
|
|
108
106
|
|
|
109
107
|
# Write block into channel buffer
|
|
110
108
|
bx = (mcu_col * comp.h_sampling + bh) * 8
|
|
111
109
|
by = (mcu_row * comp.v_sampling + bv) * 8
|
|
112
|
-
write_block(
|
|
110
|
+
write_block(dequant, ch[:data], ch[:width], bx, by)
|
|
113
111
|
end
|
|
114
112
|
end
|
|
115
113
|
end
|
|
@@ -204,10 +202,8 @@ module PureJPEG
|
|
|
204
202
|
end
|
|
205
203
|
|
|
206
204
|
zigzag = Array.new(64, 0)
|
|
207
|
-
raster = Array.new(64, 0
|
|
208
|
-
dequant = Array.new(64, 0
|
|
209
|
-
temp = Array.new(64, 0.0)
|
|
210
|
-
spatial = Array.new(64, 0.0)
|
|
205
|
+
raster = Array.new(64, 0)
|
|
206
|
+
dequant = Array.new(64, 0)
|
|
211
207
|
|
|
212
208
|
jfif.components.each do |c|
|
|
213
209
|
qt = fetch_quant_table!(jfif, c)
|
|
@@ -222,8 +218,8 @@ module PureJPEG
|
|
|
222
218
|
|
|
223
219
|
Zigzag.unreorder!(zigzag, raster)
|
|
224
220
|
Quantization.dequantize!(raster, qt, dequant)
|
|
225
|
-
DCT.inverse!(dequant
|
|
226
|
-
write_block(
|
|
221
|
+
DCT.inverse!(dequant)
|
|
222
|
+
write_block(dequant, ch[:data], ch[:width], block_x * 8, block_y * 8)
|
|
227
223
|
end
|
|
228
224
|
end
|
|
229
225
|
end
|
|
@@ -460,12 +456,16 @@ module PureJPEG
|
|
|
460
456
|
# Write an 8x8 spatial block (level-shifted by +128) into a channel buffer.
|
|
461
457
|
def write_block(spatial, channel, ch_width, bx, by)
|
|
462
458
|
8.times do |row|
|
|
463
|
-
|
|
464
|
-
|
|
465
|
-
|
|
466
|
-
|
|
467
|
-
|
|
468
|
-
|
|
459
|
+
dst = (by + row) * ch_width + bx
|
|
460
|
+
r8 = row << 3
|
|
461
|
+
v = spatial[r8] + 128; channel[dst] = v < 0 ? 0 : (v > 255 ? 255 : v)
|
|
462
|
+
v = spatial[r8 | 1] + 128; channel[dst + 1] = v < 0 ? 0 : (v > 255 ? 255 : v)
|
|
463
|
+
v = spatial[r8 | 2] + 128; channel[dst + 2] = v < 0 ? 0 : (v > 255 ? 255 : v)
|
|
464
|
+
v = spatial[r8 | 3] + 128; channel[dst + 3] = v < 0 ? 0 : (v > 255 ? 255 : v)
|
|
465
|
+
v = spatial[r8 | 4] + 128; channel[dst + 4] = v < 0 ? 0 : (v > 255 ? 255 : v)
|
|
466
|
+
v = spatial[r8 | 5] + 128; channel[dst + 5] = v < 0 ? 0 : (v > 255 ? 255 : v)
|
|
467
|
+
v = spatial[r8 | 6] + 128; channel[dst + 6] = v < 0 ? 0 : (v > 255 ? 255 : v)
|
|
468
|
+
v = spatial[r8 | 7] + 128; channel[dst + 7] = v < 0 ? 0 : (v > 255 ? 255 : v)
|
|
469
469
|
end
|
|
470
470
|
end
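The unrolled `write_block` above relies on two small facts: `row << 3` is a multiple of 8, so OR-ing in a column number 0..7 equals adding it, and each sample is level-shifted by +128 and clamped to 0..255. A minimal check of both, using hypothetical helper names:

```ruby
# frozen_string_literal: true

# Index of (row, col) in a 64-element block. Because row << 3 has its low
# three bits clear, `|` and `+` give the same result for col in 0..7.
def block_index(row, col)
  (row << 3) | col
end

# Level shift by +128, then clamp to the 8-bit range, as in write_block.
def shift_and_clamp(v)
  v += 128
  v < 0 ? 0 : (v > 255 ? 255 : v)
end

puts block_index(3, 5)      # same as 3 * 8 + 5
puts shift_and_clamp(-200)  # below range, clamps to 0
puts shift_and_clamp(200)   # above range, clamps to 255
```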
|
|
471
471
|
|
|
@@ -493,18 +493,27 @@ module PureJPEG
|
|
|
493
493
|
|
|
494
494
|
def assemble_grayscale(width, height, channels, comp)
|
|
495
495
|
ch = channels[comp.id]
|
|
496
|
+
ch_data = ch[:data]
|
|
497
|
+
ch_width = ch[:width]
|
|
496
498
|
pixels = Array.new(width * height)
|
|
497
499
|
height.times do |y|
|
|
498
|
-
src_row = y *
|
|
500
|
+
src_row = y * ch_width
|
|
499
501
|
dst_row = y * width
|
|
500
502
|
width.times do |x|
|
|
501
|
-
v =
|
|
503
|
+
v = ch_data[src_row + x]
|
|
502
504
|
pixels[dst_row + x] = (v << 16) | (v << 8) | v
|
|
503
505
|
end
|
|
504
506
|
end
|
|
505
507
|
Image.new(width, height, pixels, icc_profile: @icc_profile)
|
|
506
508
|
end
|
|
507
509
|
|
|
510
|
+
# Fixed-point coefficients (scaled by 2^16) for YCbCr→RGB.
|
|
511
|
+
FP_R_CR = 91881 # 1.402 * 65536
|
|
512
|
+
FP_G_CB = -22554 # -0.344136 * 65536
|
|
513
|
+
FP_G_CR = -46802 # -0.714136 * 65536
|
|
514
|
+
FP_B_CB = 116130 # 1.772 * 65536
|
|
515
|
+
FP_HALF = 32768 # rounding bias
|
|
516
|
+
|
|
508
517
|
def assemble_color(width, height, channels, components, max_h, max_v)
|
|
509
518
|
# Upsample chroma channels if needed and convert YCbCr to RGB
|
|
510
519
|
y_comp, cb_comp, cr_comp = resolve_color_components(components)
|
|
@@ -513,29 +522,39 @@ module PureJPEG
|
|
|
513
522
|
cb_ch = channels[cb_comp.id]
|
|
514
523
|
cr_ch = channels[cr_comp.id]
|
|
515
524
|
|
|
525
|
+
y_data = y_ch[:data]
|
|
526
|
+
cb_data = cb_ch[:data]
|
|
527
|
+
cr_data = cr_ch[:data]
|
|
528
|
+
y_stride = y_ch[:width]
|
|
529
|
+
cb_stride = cb_ch[:width]
|
|
530
|
+
cr_stride = cr_ch[:width]
|
|
531
|
+
cb_h = cb_comp.h_sampling
|
|
532
|
+
cb_v = cb_comp.v_sampling
|
|
533
|
+
cr_h = cr_comp.h_sampling
|
|
534
|
+
cr_v = cr_comp.v_sampling
|
|
535
|
+
|
|
516
536
|
pixels = Array.new(width * height)
|
|
517
537
|
|
|
518
538
|
height.times do |py|
|
|
519
539
|
dst_row = py * width
|
|
520
|
-
y_row = py *
|
|
540
|
+
y_row = py * y_stride
|
|
521
541
|
|
|
522
542
|
# Chroma coordinates (nearest-neighbor upsampling)
|
|
523
|
-
|
|
524
|
-
|
|
525
|
-
cb_row = cb_y * cb_ch[:width]
|
|
526
|
-
cr_row = cr_y * cr_ch[:width]
|
|
543
|
+
cb_row = ((py * cb_v) / max_v) * cb_stride
|
|
544
|
+
cr_row = ((py * cr_v) / max_v) * cr_stride
|
|
527
545
|
|
|
528
546
|
width.times do |px|
|
|
529
|
-
lum =
|
|
547
|
+
lum = y_data[y_row + px]
|
|
530
548
|
|
|
531
|
-
cb_x = (px *
|
|
532
|
-
cr_x = (px *
|
|
533
|
-
|
|
534
|
-
|
|
549
|
+
cb_x = (px * cb_h) / max_h
|
|
550
|
+
cr_x = (px * cr_h) / max_h
|
|
551
|
+
cb_val = cb_data[cb_row + cb_x] - 128
|
|
552
|
+
cr_val = cr_data[cr_row + cr_x] - 128
|
|
535
553
|
|
|
536
|
-
|
|
537
|
-
|
|
538
|
-
|
|
554
|
+
# Fixed-point YCbCr→RGB (all integer arithmetic)
|
|
555
|
+
r = lum + ((FP_R_CR * cr_val + FP_HALF) >> 16)
|
|
556
|
+
g = lum + ((FP_G_CB * cb_val + FP_G_CR * cr_val + FP_HALF) >> 16)
|
|
557
|
+
b = lum + ((FP_B_CB * cb_val + FP_HALF) >> 16)
|
|
539
558
|
|
|
540
559
|
r = r < 0 ? 0 : (r > 255 ? 255 : r)
|
|
541
560
|
g = g < 0 ? 0 : (g > 255 ? 255 : g)
|
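The fixed-point YCbCr to RGB step in `assemble_color` can be tried in isolation. The sketch below copies the `FP_*` constants from the diff into a hypothetical `YcbcrSketch` module and converts a single sample. It takes raw 0..255 chroma values and subtracts 128 itself, as the decoder does before the multiply.

```ruby
# frozen_string_literal: true

# Standalone sketch; constants are copied from the decoder diff, the module
# and method names are illustrative.
module YcbcrSketch
  FP_R_CR = 91_881   # 1.402 * 65536
  FP_G_CB = -22_554
  FP_G_CR = -46_802
  FP_B_CB = 116_130  # 1.772 * 65536
  FP_HALF = 32_768   # rounding bias (0.5 in 16.16 fixed point)

  module_function

  def clamp255(v)
    v < 0 ? 0 : (v > 255 ? 255 : v)
  end

  # lum, cb, cr are 0..255 samples; returns [r, g, b] in 0..255.
  def to_rgb(lum, cb, cr)
    cb_val = cb - 128
    cr_val = cr - 128
    r = lum + ((FP_R_CR * cr_val + FP_HALF) >> 16)
    g = lum + ((FP_G_CB * cb_val + FP_G_CR * cr_val + FP_HALF) >> 16)
    b = lum + ((FP_B_CB * cb_val + FP_HALF) >> 16)
    [clamp255(r), clamp255(g), clamp255(b)]
  end
end

p YcbcrSketch.to_rgb(128, 128, 128) # neutral chroma leaves a gray
p YcbcrSketch.to_rgb(76, 85, 255)   # roughly pure red
```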
data/lib/pure_jpeg/encoder.rb
CHANGED
|
@@ -76,17 +76,27 @@ module PureJPEG
|
|
|
76
76
|
|
|
77
77
|
def build_lum_qtable
|
|
78
78
|
table = @luminance_table || Quantization.scale_table(Quantization::LUMINANCE_BASE, quality)
|
|
79
|
-
table =
|
|
79
|
+
table = apply_quantization_modifier(table, :luminance) if @quantization_modifier
|
|
80
80
|
table
|
|
81
81
|
end
|
|
82
82
|
|
|
83
83
|
def build_chr_qtable
|
|
84
84
|
table = @chrominance_table || Quantization.scale_table(Quantization::CHROMINANCE_BASE, @chroma_quality)
|
|
85
|
-
table =
|
|
85
|
+
table = apply_quantization_modifier(table, :chrominance) if @quantization_modifier
|
|
86
86
|
table
|
|
87
87
|
end
|
|
88
88
|
|
|
89
|
+
def apply_quantization_modifier(table, channel)
|
|
90
|
+
modified = @quantization_modifier.call(table, channel)
|
|
91
|
+
validate_qtable!(modified, "quantization_modifier result for #{channel}")
|
|
92
|
+
modified
|
|
93
|
+
end
|
|
94
|
+
|
|
89
95
|
def validate_qtable!(table, name)
|
|
96
|
+
unless table.respond_to?(:length) && table.respond_to?(:all?)
|
|
97
|
+
raise ArgumentError, "#{name} must be a 64-element array of integers between 1 and 255"
|
|
98
|
+
end
|
|
99
|
+
|
|
90
100
|
raise ArgumentError, "#{name} must have exactly 64 elements (got #{table.length})" unless table.length == 64
|
|
91
101
|
unless table.all? { |v| v.is_a?(Integer) && v >= 1 && v <= 255 }
|
|
92
102
|
raise ArgumentError, "#{name} elements must be integers between 1 and 255"
|
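The validation added in this hunk can be exercised on its own. The sketch below copies the checks from `validate_qtable!` into a hypothetical `QtableSketch` module and wraps a modifier call the way `apply_quantization_modifier` does. How a modifier is passed to the real encoder is not shown in this diff, so the lambdas here are called directly.

```ruby
# frozen_string_literal: true

# Standalone sketch mirroring the checks in the diff; names are illustrative.
module QtableSketch
  module_function

  def validate_qtable!(table, name)
    unless table.respond_to?(:length) && table.respond_to?(:all?)
      raise ArgumentError, "#{name} must be a 64-element array of integers between 1 and 255"
    end
    raise ArgumentError, "#{name} must have exactly 64 elements (got #{table.length})" unless table.length == 64
    unless table.all? { |v| v.is_a?(Integer) && v >= 1 && v <= 255 }
      raise ArgumentError, "#{name} elements must be integers between 1 and 255"
    end

    table
  end

  # Call the modifier and validate whatever it returned.
  def apply_modifier(modifier, table, channel)
    validate_qtable!(modifier.call(table, channel), "quantization_modifier result for #{channel}")
  end
end

base = Array.new(64, 16)
coarser = ->(table, _channel) { table.map { |v| [v * 2, 255].min } }
forgetful = ->(_table, _channel) { nil } # a block that forgot to return the table

p QtableSketch.apply_modifier(coarser, base, :luminance).first
begin
  QtableSketch.apply_modifier(forgetful, base, :chrominance)
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

The `respond_to?` guard matters because a modifier block that ends in a mutating call or a `puts` can easily return `nil`, which would otherwise fail with a `NoMethodError` on `nil.length` instead of a descriptive `ArgumentError`.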
|
@@ -195,17 +205,14 @@ module PureJPEG
|
|
|
195
205
|
padded_w = (width + 7) & ~7
|
|
196
206
|
padded_h = (height + 7) & ~7
|
|
197
207
|
|
|
198
|
-
block = Array.new(64, 0
|
|
199
|
-
temp = Array.new(64, 0.0)
|
|
200
|
-
dct = Array.new(64, 0.0)
|
|
208
|
+
block = Array.new(64, 0)
|
|
201
209
|
qbuf = Array.new(64, 0)
|
|
202
210
|
zbuf = Array.new(64, 0)
|
|
203
211
|
|
|
204
212
|
(0...padded_h).step(8) do |by|
|
|
205
213
|
(0...padded_w).step(8) do |bx|
|
|
206
214
|
extract_block_into(y_data, width, height, bx, by, block)
|
|
207
|
-
transform_block(block,
|
|
208
|
-
yield zbuf
|
|
215
|
+
yield transform_block(block, qbuf, zbuf, qtable)
|
|
209
216
|
end
|
|
210
217
|
end
|
|
211
218
|
end
|
|
@@ -268,37 +275,29 @@ module PureJPEG
|
|
|
268
275
|
mcu_w = (width + 15) & ~15
|
|
269
276
|
mcu_h = (height + 15) & ~15
|
|
270
277
|
|
|
271
|
-
block = Array.new(64, 0
|
|
272
|
-
temp = Array.new(64, 0.0)
|
|
273
|
-
dct = Array.new(64, 0.0)
|
|
278
|
+
block = Array.new(64, 0)
|
|
274
279
|
qbuf = Array.new(64, 0)
|
|
275
280
|
zbuf = Array.new(64, 0)
|
|
276
281
|
|
|
277
282
|
(0...mcu_h).step(16) do |my|
|
|
278
283
|
(0...mcu_w).step(16) do |mx|
|
|
279
284
|
extract_block_into(y_data, width, height, mx, my, block)
|
|
280
|
-
transform_block(block,
|
|
281
|
-
yield :y, zbuf
|
|
285
|
+
yield :y, transform_block(block, qbuf, zbuf, lum_qt)
|
|
282
286
|
|
|
283
287
|
extract_block_into(y_data, width, height, mx + 8, my, block)
|
|
284
|
-
transform_block(block,
|
|
285
|
-
yield :y, zbuf
|
|
288
|
+
yield :y, transform_block(block, qbuf, zbuf, lum_qt)
|
|
286
289
|
|
|
287
290
|
extract_block_into(y_data, width, height, mx, my + 8, block)
|
|
288
|
-
transform_block(block,
|
|
289
|
-
yield :y, zbuf
|
|
291
|
+
yield :y, transform_block(block, qbuf, zbuf, lum_qt)
|
|
290
292
|
|
|
291
293
|
extract_block_into(y_data, width, height, mx + 8, my + 8, block)
|
|
292
|
-
transform_block(block,
|
|
293
|
-
yield :y, zbuf
|
|
294
|
+
yield :y, transform_block(block, qbuf, zbuf, lum_qt)
|
|
294
295
|
|
|
295
296
|
extract_block_into(cb_sub, sub_w, sub_h, mx >> 1, my >> 1, block)
|
|
296
|
-
transform_block(block,
|
|
297
|
-
yield :cb, zbuf
|
|
297
|
+
yield :cb, transform_block(block, qbuf, zbuf, chr_qt)
|
|
298
298
|
|
|
299
299
|
extract_block_into(cr_sub, sub_w, sub_h, mx >> 1, my >> 1, block)
|
|
300
|
-
transform_block(block,
|
|
301
|
-
yield :cr, zbuf
|
|
300
|
+
yield :cr, transform_block(block, qbuf, zbuf, chr_qt)
|
|
302
301
|
end
|
|
303
302
|
end
|
|
304
303
|
end
|
|
@@ -323,9 +322,9 @@ module PureJPEG
|
|
|
323
322
|
|
|
324
323
|
# --- Shared block pipeline (all buffers pre-allocated) ---
|
|
325
324
|
|
|
326
|
-
def transform_block(block,
|
|
327
|
-
DCT.forward!(block
|
|
328
|
-
Quantization.quantize!(
|
|
325
|
+
def transform_block(block, qbuf, zbuf, qtable)
|
|
326
|
+
DCT.forward!(block)
|
|
327
|
+
Quantization.quantize!(block, qtable, qbuf)
|
|
329
328
|
Zigzag.reorder!(qbuf, zbuf)
|
|
330
329
|
zbuf
|
|
331
330
|
end
|
|
@@ -342,26 +341,42 @@ module PureJPEG
|
|
|
342
341
|
end
|
|
343
342
|
end
|
|
344
343
|
|
|
344
|
+
# Fixed-point coefficients (scaled by 2^16 = 65536) for RGB→YCbCr.
|
|
345
|
+
# Y = 0.299*R + 0.587*G + 0.114*B
|
|
346
|
+
# Cb = -0.168736*R - 0.331264*G + 0.5*B + 128
|
|
347
|
+
# Cr = 0.5*R - 0.418688*G - 0.081312*B + 128
|
|
348
|
+
FP_Y_R = 19595; FP_Y_G = 38470; FP_Y_B = 7471
|
|
349
|
+
FP_CB_R = -11058; FP_CB_G = -21710; FP_CB_B = 32768
|
|
350
|
+
FP_CR_R = 32768; FP_CR_G = -27440; FP_CR_B = -5328
|
|
351
|
+
FP_HALF = 32768 # rounding bias
|
|
352
|
+
FP_128 = 8388608 # 128 << 16
|
|
353
|
+
|
|
354
|
+
def clamp255(v)
|
|
355
|
+
v < 0 ? 0 : (v > 255 ? 255 : v)
|
|
356
|
+
end
|
|
357
|
+
|
|
345
358
|
def extract_luminance(width, height)
|
|
346
359
|
luminance = Array.new(width * height)
|
|
347
360
|
if source.respond_to?(:packed_pixels)
|
|
348
361
|
packed = source.packed_pixels
|
|
349
362
|
r_shift, g_shift, b_shift = packed_shifts
|
|
363
|
+
n = width * height
|
|
350
364
|
i = 0
|
|
351
|
-
|
|
365
|
+
n.times do
|
|
352
366
|
color = packed[i]
|
|
353
367
|
r = (color >> r_shift) & 0xFF
|
|
354
368
|
g = (color >> g_shift) & 0xFF
|
|
355
369
|
b = (color >> b_shift) & 0xFF
|
|
356
|
-
luminance[i] = (
|
|
370
|
+
luminance[i] = clamp255((FP_Y_R * r + FP_Y_G * g + FP_Y_B * b + FP_HALF) >> 16)
|
|
357
371
|
i += 1
|
|
358
372
|
end
|
|
359
373
|
else
|
|
360
|
-
height.times do |
|
|
361
|
-
row =
|
|
362
|
-
width.times do |
|
|
363
|
-
pixel = source[
|
|
364
|
-
|
|
374
|
+
height.times do |py|
|
|
375
|
+
row = py * width
|
|
376
|
+
width.times do |px|
|
|
377
|
+
pixel = source[px, py]
|
|
378
|
+
r = pixel.r; g = pixel.g; b = pixel.b
|
|
379
|
+
luminance[row + px] = clamp255((FP_Y_R * r + FP_Y_G * g + FP_Y_B * b + FP_HALF) >> 16)
|
|
365
380
|
end
|
|
366
381
|
end
|
|
367
382
|
end
|
|
@@ -383,9 +398,9 @@ module PureJPEG
|
|
|
383
398
|
r = (color >> r_shift) & 0xFF
|
|
384
399
|
g = (color >> g_shift) & 0xFF
|
|
385
400
|
b = (color >> b_shift) & 0xFF
|
|
386
|
-
y_data[i] = (
|
|
387
|
-
cb_data[i] = (
|
|
388
|
-
cr_data[i] = (
|
|
401
|
+
y_data[i] = clamp255((FP_Y_R * r + FP_Y_G * g + FP_Y_B * b + FP_HALF) >> 16)
|
|
402
|
+
cb_data[i] = clamp255((FP_CB_R * r + FP_CB_G * g + FP_CB_B * b + FP_128 + FP_HALF) >> 16)
|
|
403
|
+
cr_data[i] = clamp255((FP_CR_R * r + FP_CR_G * g + FP_CR_B * b + FP_128 + FP_HALF) >> 16)
|
|
389
404
|
i += 1
|
|
390
405
|
end
|
|
391
406
|
else
|
|
@@ -395,9 +410,9 @@ module PureJPEG
             pixel = source[px, py]
             r = pixel.r; g = pixel.g; b = pixel.b
             i = row + px
-            y_data[i] = (
-            cb_data[i] = (
-            cr_data[i] = (
+            y_data[i] = clamp255((FP_Y_R * r + FP_Y_G * g + FP_Y_B * b + FP_HALF) >> 16)
+            cb_data[i] = clamp255((FP_CB_R * r + FP_CB_G * g + FP_CB_B * b + FP_128 + FP_HALF) >> 16)
+            cr_data[i] = clamp255((FP_CR_R * r + FP_CR_G * g + FP_CR_B * b + FP_128 + FP_HALF) >> 16)
           end
         end
       end
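The added lines above replace float color conversion with 16-bit fixed-point arithmetic. The following is a minimal standalone sketch of that technique, not the gem's code: the constant values are an assumption (the usual JFIF coefficients scaled by 65536 and rounded, since this diff does not show the gem's definitions), and the module name `FixedPointSketch` is hypothetical.

```ruby
# Sketch of 16-bit fixed-point RGB -> YCbCr conversion.
# ASSUMPTION: coefficient values are the JFIF weights times 65536, rounded;
# the gem's actual FP_* definitions are not visible in this diff.
module FixedPointSketch
  FP_HALF = 1 << 15    # 0.5 in 16.16 fixed point, added so the shift rounds
  FP_128  = 128 << 16  # chroma offset of 128 in 16.16 fixed point

  FP_Y_R  = 19_595     #  0.299
  FP_Y_G  = 38_470     #  0.587
  FP_Y_B  = 7_471      #  0.114
  FP_CB_R = -11_058    # -0.168736
  FP_CB_G = -21_710    # -0.331264
  FP_CB_B = 32_768     #  0.5
  FP_CR_R = 32_768     #  0.5
  FP_CR_G = -27_439    # -0.418688
  FP_CR_B = -5_329     # -0.081312

  module_function

  # Clamp an Integer into the 0..255 sample range.
  def clamp255(v)
    v < 0 ? 0 : (v > 255 ? 255 : v)
  end

  # Convert one 8-bit RGB triple to [y, cb, cr] using only Integer math.
  def rgb_to_ycbcr(r, g, b)
    y  = clamp255((FP_Y_R * r + FP_Y_G * g + FP_Y_B * b + FP_HALF) >> 16)
    cb = clamp255((FP_CB_R * r + FP_CB_G * g + FP_CB_B * b + FP_128 + FP_HALF) >> 16)
    cr = clamp255((FP_CR_R * r + FP_CR_G * g + FP_CR_B * b + FP_128 + FP_HALF) >> 16)
    [y, cb, cr]
  end
end
```

Under these assumed constants, the clamp matters for saturated inputs: pure red gives a Cr value of 256 before clamping.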
@@ -432,13 +447,16 @@ module PureJPEG
       8.times do |row|
         sy = by + row
         sy = max_y if sy > max_y
-
-
-
-
-
-
-
+        src = sy * width
+        r8 = row << 3
+        x = bx;     block[r8]     = channel[src + (x > max_x ? max_x : x)] - 128
+        x = bx + 1; block[r8 | 1] = channel[src + (x > max_x ? max_x : x)] - 128
+        x = bx + 2; block[r8 | 2] = channel[src + (x > max_x ? max_x : x)] - 128
+        x = bx + 3; block[r8 | 3] = channel[src + (x > max_x ? max_x : x)] - 128
+        x = bx + 4; block[r8 | 4] = channel[src + (x > max_x ? max_x : x)] - 128
+        x = bx + 5; block[r8 | 5] = channel[src + (x > max_x ? max_x : x)] - 128
+        x = bx + 6; block[r8 | 6] = channel[src + (x > max_x ? max_x : x)] - 128
+        x = bx + 7; block[r8 | 7] = channel[src + (x > max_x ? max_x : x)] - 128
       end
       block
     end
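The unrolled lines in the hunk above copy one 8x8 block out of a flat channel array, replicating the last column and row when the block hangs over the image edge, and level-shifting each sample by 128. A rolled-up sketch of the same idea follows. The method name `extract_block_sketch` and its parameter list are hypothetical, since the real method signature is not visible in this diff.

```ruby
# Sketch: copy an 8x8 block starting at (bx, by) from a flat, row-major
# channel array into a 64-element block, clamping coordinates to the image
# edge and subtracting 128 (the JPEG level shift).
# ASSUMPTION: name and parameters are illustrative, not the gem's API.
def extract_block_sketch(channel, width, height, bx, by, block = Array.new(64, 0))
  max_x = width - 1
  max_y = height - 1
  8.times do |row|
    sy = by + row
    sy = max_y if sy > max_y       # replicate the bottom row past the edge
    src = sy * width               # offset of the source row in the channel
    r8 = row << 3                  # offset of the destination row (row * 8)
    8.times do |col|
      x = bx + col
      x = max_x if x > max_x       # replicate the right column past the edge
      block[r8 | col] = channel[src + x] - 128
    end
  end
  block
end
```

The diff's version unrolls the inner loop by hand, presumably to avoid the per-row block call; the loop form here trades that for brevity.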
data/lib/pure_jpeg/huffman/encoder.rb
CHANGED
@@ -3,17 +3,22 @@
 module PureJPEG
   module Huffman
     class Encoder
-
-
-
+      # Return the Huffman category (bit length) for a value.
+      # Avoids Array allocation compared to the combined category_and_bits.
+      def self.category(value)
+        return 0 if value == 0
+        v = value.abs
         cat = 0
-        v = abs_val
         while v > 0
           cat += 1
           v >>= 1
         end
-
-
+        cat
+      end
+
+      # Return the extra bits to encode for a value with the given category.
+      def self.value_bits(value, cat)
+        value > 0 ? value : value + (1 << cat) - 1
       end
 
       def self.each_ac_item(zigzag)
@@ -39,7 +44,7 @@ module PureJPEG
           end
 
           value = zigzag[i]
-          cat
+          cat = category(value)
           yield (run << 4) | cat, value
           i += 1
         end
@@ -73,10 +78,10 @@ module PureJPEG
       private
 
       def encode_dc(diff, writer)
-        cat
+        cat = self.class.category(diff)
         code, length = @dc_table[cat]
         writer.write_bits(code, length)
-        writer.write_bits(
+        writer.write_bits(self.class.value_bits(diff, cat), cat) if cat > 0
       end
 
       def encode_ac(zigzag, writer)
@@ -85,8 +90,8 @@ module PureJPEG
           writer.write_bits(code, length)
           next if symbol == 0x00 || symbol == 0xF0
 
-          cat
-          writer.write_bits(
+          cat = self.class.category(value)
+          writer.write_bits(self.class.value_bits(value, cat), cat)
         end
       end
     end
@@ -104,7 +109,7 @@ module PureJPEG
         diff = zigzag[0] - @prev_dc[state_key]
         @prev_dc[state_key] = zigzag[0]
 
-        cat
+        cat = Encoder.category(diff)
         @dc_frequencies[cat] += 1
 
         Encoder.each_ac_symbol(zigzag) do |symbol|
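The two helpers introduced in the Huffman encoder hunks above (`category` and `value_bits`) can be exercised on their own. The sketch below copies their logic into a hypothetical `HuffmanSketch` module so it runs without the gem; the module name is an assumption for illustration only.

```ruby
# Sketch of the JPEG magnitude category and extra-bits computation,
# mirroring the two methods added in this diff.
# ASSUMPTION: HuffmanSketch is an illustrative wrapper, not part of the gem.
module HuffmanSketch
  module_function

  # Number of bits needed to represent |value| (0 for value == 0).
  def category(value)
    return 0 if value == 0
    v = value.abs
    cat = 0
    while v > 0
      cat += 1
      v >>= 1
    end
    cat
  end

  # Extra bits written after the Huffman code: positive values are written
  # as-is, negative values are offset by (2**cat - 1) so they fit in cat bits.
  def value_bits(value, cat)
    value > 0 ? value : value + (1 << cat) - 1
  end
end

cat = HuffmanSketch.category(-5)
bits = HuffmanSketch.value_bits(-5, cat)
```

Returning the category and the bits from separate methods means neither call has to build a two-element Array, which is the allocation saving the added comment refers to.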
data/lib/pure_jpeg/quantization.rb
CHANGED
@@ -36,9 +36,20 @@ module PureJPEG
       }
     end
 
-    # Quantize a 64-element DCT block
+    # Quantize a 64-element DCT block into `out`.
+    # Uses integer rounding division (round-to-nearest) to match the
+    # behavior of Float division + round from the previous float DCT.
     def self.quantize!(block, table, out)
-
+      i = 0
+      while i < 64
+        v = block[i]; t = table[i]
+        out[i] = if v >= 0
+          (v + (t >> 1)) / t
+        else
+          -((-v + (t >> 1)) / t)
+        end
+        i += 1
+      end
       out
     end
 
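The comment in the hunk above says the integer rounding division matches Float division followed by `round`. A small sketch of that rule in isolation is given below; the helper name `rounded_div` is hypothetical and exists only to make the comparison easy to run.

```ruby
# Sketch: round-to-nearest integer division, ties away from zero.
# Ruby's Integer#/ floors toward negative infinity, so the negative branch
# divides the magnitude and negates the result.
# ASSUMPTION: rounded_div is an illustrative helper, not a gem method;
# t is expected to be a positive Integer (a quantization table entry).
def rounded_div(v, t)
  if v >= 0
    (v + (t >> 1)) / t
  else
    -((-v + (t >> 1)) / t)
  end
end

# Compare against the Float formulation over a small range of inputs.
mismatches = []
(-200..200).each do |v|
  (1..32).each do |t|
    mismatches << [v, t] unless rounded_div(v, t) == (v.to_f / t).round
  end
end
```

Handling the sign separately is what keeps ties symmetric: adding `t >> 1` and flooring without the negation would round a tie such as -6 / 4 toward zero instead of away from it.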
data/lib/pure_jpeg/source/raw_source.rb
CHANGED
@@ -27,8 +27,7 @@ module PureJPEG
       def initialize(width, height, &block)
         @width = width
         @height = height
-
-        @pixels = Array.new(width * height, black)
+        @pixels = Array.new(width * height) { Pixel.new(0, 0, 0) }
 
         if block
           height.times do |y|
data/lib/pure_jpeg/version.rb
CHANGED