react-native-image-stitcher 0.4.0 → 0.4.1

package/CHANGELOG.md CHANGED
@@ -16,6 +16,31 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
16
16
 
17
17
  ## [Unreleased]
18
18
 
19
+ ## [0.4.1] — 2026-05-23
20
+
21
+ ### Fixed
22
+ - **ARCore Image hold time** (PR #15) — `forwardToIncremental` on
23
+ Android now packs the ARCore `Image` payload synchronously and
24
+ closes the image immediately, rather than holding it across the JNI
25
+ hand-off. Eliminates the "ImageReader: maxImages exceeded" backlog
26
+ that throttled non-keyframe processing on the Galaxy A35 at high pan
27
+ rates.
28
+
29
+ ### Tooling
30
+ - **Example app Metro port pinned to 8082** (cherry-pick from
31
+ `feature/f8-frame-processor-yuv`). Mirrored across
32
+ `example/metro.config.js`, `example/package.json` scripts,
33
+ `example/ios/RNImageStitcherExample/AppDelegate.swift`, and
34
+ `example/android/gradle.properties` to keep CLI builds, IDE
35
+ builds, and Gradle invocations consistent on machines where 8081
36
+ is already taken.
37
+
38
+ ### Internal
39
+ - Lockfile sync after the v0.4.0 version bump (Podfile.lock spec
40
+ checksum + npm prune of transitive deps that had drifted from
41
+ branch experimentation). No impact on consumers — example-app
42
+ tooling only.
43
+
19
44
  ## [0.4.0] — 2026-05-23
20
45
 
21
46
  ### v0.4 settings revamp (F10)
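
The hold-time fix described in the 0.4.1 entry comes down to an ordering rule: copy the pixel payload, release the pooled buffer, and only then run slow work on the copy. The sketch below illustrates that ordering in plain Kotlin. `FrameSource`, `copyThenClose`, and `demoEventOrder` are hypothetical names invented for this illustration; they are not part of the package or of ARCore.

```kotlin
// Hypothetical stand-in for a pooled camera buffer such as an ARCore Image.
class FrameSource(
    private val pixels: ByteArray,
    private val log: MutableList<String>,
) : AutoCloseable {
    fun copyPixels(): ByteArray {
        log += "copy"
        return pixels.copyOf()
    }

    override fun close() {
        log += "close"
    }
}

// Copy the payload, release the source, then run the slow step on the copy.
fun <T> copyThenClose(source: FrameSource, slowWork: (ByteArray) -> T): T {
    val copy = try {
        source.copyPixels()
    } finally {
        // Runs whether or not the copy succeeded, so the pool slot is always returned.
        source.close()
    }
    return slowWork(copy)
}

// Records the order of events for one frame: the close precedes the slow step.
fun demoEventOrder(): List<String> {
    val log = mutableListOf<String>()
    val source = FrameSource(byteArrayOf(1, 2, 3), log)
    copyThenClose(source) { bytes ->
        log += "encode"
        bytes.size
    }
    return log
}
```

Calling `demoEventOrder()` yields `copy`, `close`, `encode` in that order, which is the property the changelog entry relies on: the buffer is back in the pool before the encode starts.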
@@ -428,133 +428,146 @@ class RNSARCameraView @JvmOverloads constructor(
428
428
  }
429
429
  return
430
430
  }
431
- try {
432
- // 2026-05-21 (v0.3) — pixel-data path. Pre-0.3 this code
433
- // unconditionally encoded the YUV camera image to JPEG and
434
- // wrote it to disk for EVERY ARCore frame at ~60 Hz (~25 ms
435
- // per frame of JPEG encode + disk I/O on the GL render
436
- // thread), regardless of whether the C++ KeyframeGate would
437
- // accept it. Now we extract the Y plane bytes (cheap
438
- // memcpy from a DirectByteBuffer), feed them to the gate
439
- // for proper Flow-strategy evaluation, and defer the JPEG
440
- // encode + disk write to the `onAccept` lambda so it only
441
- // runs on the rare frames the gate actually keeps
442
- // (typically ~6 per capture).
443
- //
444
- // Y-plane extraction for ARCore's YUV_420_888 format:
445
- // plane[0] is the luminance channel at full resolution,
446
- // pixelStride=1, rowStride may equal width OR be padded.
447
- // We pass rowStride as the C++ side's `stride` so the gate
448
- // skips padding correctly.
449
- val yPlane = image.planes[0]
450
- val yBuffer = yPlane.buffer
451
- val yStride = yPlane.rowStride
452
- val yWidth = image.width
453
- val yHeight = image.height
454
- // Copy Y bytes into a JVM-side ByteArray. Using
455
- // duplicate() so we don't mutate the original buffer's
456
- // position state (ARCore may have other readers).
457
- // For 1920×1080 Y plane that's ~2 MB; on Galaxy A35 the
458
- // memcpy itself is < 1 ms. JNI side pins via
459
- // GetPrimitiveArrayCritical so the byte[] stays a single
460
- // copy through the entire frame's lifecycle.
461
- val ySize = yStride * yHeight
462
- val yBytes = ByteArray(ySize)
463
- yBuffer.duplicate().apply { rewind() }.get(yBytes, 0, ySize)
464
-
465
- // Compute yaw + pitch from the ARCore quaternion using
466
- // the same convention the iOS Swift side uses (camera-
467
- // forward in world space). This keeps the two platforms
468
- // numerically aligned for the FoV-overlap gate.
469
- val q = camera.pose.rotationQuaternion // x, y, z, w
470
- val (yaw, pitch) = quaternionYawPitch(q)
471
-
472
- // Both FoVs + the full quaternion + intrinsics go to the
473
- // engine. V6 pose-driven path uses (qx, qy, qz, qw, fx,
474
- // fy, cx, cy, w, h) to compute the geometrically-exact
475
- // homography.
476
- val intrinsics = camera.imageIntrinsics
477
- val fx = intrinsics.focalLength[0].toDouble()
478
- val fy = intrinsics.focalLength[1].toDouble()
479
- val cxIntr = intrinsics.principalPoint[0].toDouble()
480
- val cyIntr = intrinsics.principalPoint[1].toDouble()
481
- val w = intrinsics.imageDimensions[0].toDouble()
482
- val h = intrinsics.imageDimensions[1].toDouble()
483
- val fovHRad = 2.0 * atan(w / (2.0 * fx))
484
- val fovVRad = 2.0 * atan(h / (2.0 * fy))
485
- val fovHDeg = fovHRad * 180.0 / Math.PI
486
- val fovVDeg = fovVRad * 180.0 / Math.PI
487
-
488
- // ARCore quaternion comes back in (x, y, z, w) order.
489
- val qarr = camera.pose.rotationQuaternion
490
- // P3-F: also extract translation so the KeyframeGate's
491
- // plane-based ray-projection can compute polygon overlap.
492
- // Previously these were dropped, forcing the gate into
493
- // angular-fallback even when a plane was latched.
494
- val tArr = camera.pose.translation
495
-
496
- val trackingPoor = camera.trackingState != TrackingState.TRACKING
497
- val module = IncrementalStitcher.bridgeInstance ?: return
498
- // 2026-05-15 (B3) — pass current display rotation so the
499
- // encoded JPEG gets an EXIF orientation tag. Captured into
500
- // a local val so the lambda below closes over a primitive
501
- // (avoids re-reading lastDisplayRotation if it shifts
502
- // between gate-evaluate and lambda invocation).
503
- val rotationForEncode = if (lastDisplayRotation >= 0)
504
- lastDisplayRotation else android.view.Surface.ROTATION_0
505
- // 2026-05-21 (v0.3) — eager JPEG encode is only needed when
506
- // the engine is in the legacy hybrid/firstwins live-engine
507
- // mode (which feeds JPEG paths into addFrameAtPath every
508
- // frame). In batch-keyframe mode (the production Camera
509
- // component's path), the JPEG is encoded LAZILY inside
510
- // the onAccept lambda below — only on the ~6 frames per
511
- // capture that the C++ KeyframeGate actually keeps.
512
- val legacyJpegPath: String? = if (module.isBatchKeyframeMode) {
513
- null
514
- } else {
515
- YuvImageConverter.encodeToJpeg(
516
- image,
517
- tmpJpegFile.absolutePath,
518
- jpegQuality = 70,
519
- displayRotation = rotationForEncode,
520
- )
431
+
432
+ // 2026-05-22 (audit follow-up #19) — minimise ARCore Image
433
+ // hold time.
434
+ //
435
+ // Pre-#19 the Image stayed open through the entire JNI
436
+ // ingest call AND any subsequent JPEG encode (~25 ms in
437
+ // legacy hybrid mode where every frame is encoded eagerly;
438
+ // ~25 ms in batch-keyframe mode for the ~5/60 frames the
439
+ // gate accepts). At 60 Hz ARCore that meant the Image was
440
+ // held 25-30 ms per frame on accepts, starving the Camera2
441
+ // ImageReader's circular buffer pool and risking
442
+ // "BufferQueue has been abandoned" stalls.
443
+ //
444
+ // The fix is mechanical: pack the YUV planes into a
445
+ // JVM-side NV21 byte array (~3 ms), close the Image, and
446
+ // run all subsequent work (JNI ingest + JPEG encode) on
447
+ // the copied bytes. ARCore Camera2 buffer pool stays
448
+ // healthier; latency-sensitive ARCore frames flow through
449
+ // their fixed pool instead of waiting on our JPEG path.
450
+ //
451
+ // The packed.nv21 array's first `width*height` bytes are
452
+ // the Y plane (densely packed, stride = width) — these go
453
+ // to the C++ gate as grayscale. The full array is the
454
+ // input to YuvImageConverter.encodeJpegFromNV21 if the
455
+ // gate accepts (or if we're in legacy eager-encode mode).
456
+ val packed = try {
457
+ YuvImageConverter.packNV21(image)
458
+ } finally {
459
+ // Close ASAP: every microsecond reduces buffer-pool
460
+ // pressure on Camera2. Even if packNV21 returns null
461
+ // (unsupported format), we still need to close.
462
+ try { image.close() } catch (_: Throwable) {}
463
+ } ?: run {
464
+ if (forwardLogTick % 30 == 1) {
465
+ Log.w(TAG, "forwardToIncremental: packNV21 returned null (unexpected format?)")
521
466
  }
522
- module.ingestFromARCameraView(
523
- tx = tArr[0].toDouble(),
524
- ty = tArr[1].toDouble(),
525
- tz = tArr[2].toDouble(),
526
- qx = qarr[0].toDouble(), qy = qarr[1].toDouble(),
527
- qz = qarr[2].toDouble(), qw = qarr[3].toDouble(),
528
- fx = fx, fy = fy, cx = cxIntr, cy = cyIntr,
529
- imageWidth = intrinsics.imageDimensions[0],
530
- imageHeight = intrinsics.imageDimensions[1],
531
- yaw = yaw, pitch = pitch,
532
- fovHorizDegrees = fovHDeg, fovVertDegrees = fovVDeg,
533
- trackingPoor = trackingPoor,
534
- grayData = yBytes,
535
- grayWidth = yWidth,
536
- grayHeight = yHeight,
537
- grayStride = yStride,
538
- legacyJpegPath = legacyJpegPath,
539
- onAccept = { targetPath ->
540
- // Lazy JPEG encode. Runs ONLY if the C++ KeyframeGate
541
- // accepted the frame. The ARCore Image is still open
542
- // at this point (we haven't reached `image.close()`
543
- // in the surrounding `finally` block yet), so the
544
- // encode reads raw camera pixels directly into a
545
- // JPEG at the final persistent path: no tmp file,
546
- // no second copy.
547
- YuvImageConverter.encodeToJpeg(
548
- image,
549
- targetPath,
550
- jpegQuality = 70,
551
- displayRotation = rotationForEncode,
552
- ) != null
553
- },
467
+ return
468
+ }
469
+
470
+ // Compute yaw + pitch from the ARCore quaternion using
471
+ // the same convention the iOS Swift side uses (camera-
472
+ // forward in world space). This keeps the two platforms
473
+ // numerically aligned for the FoV-overlap gate. `camera`
474
+ // (and `camera.pose`) remain valid after image.close() —
475
+ // they're ARCore Frame metadata, not pixel buffers.
476
+ val q = camera.pose.rotationQuaternion // x, y, z, w
477
+ val (yaw, pitch) = quaternionYawPitch(q)
478
+
479
+ // Both FoVs + the full quaternion + intrinsics go to the
480
+ // engine. V6 pose-driven path uses (qx, qy, qz, qw, fx,
481
+ // fy, cx, cy, w, h) to compute the geometrically-exact
482
+ // homography.
483
+ val intrinsics = camera.imageIntrinsics
484
+ val fx = intrinsics.focalLength[0].toDouble()
485
+ val fy = intrinsics.focalLength[1].toDouble()
486
+ val cxIntr = intrinsics.principalPoint[0].toDouble()
487
+ val cyIntr = intrinsics.principalPoint[1].toDouble()
488
+ val w = intrinsics.imageDimensions[0].toDouble()
489
+ val h = intrinsics.imageDimensions[1].toDouble()
490
+ val fovHRad = 2.0 * atan(w / (2.0 * fx))
491
+ val fovVRad = 2.0 * atan(h / (2.0 * fy))
492
+ val fovHDeg = fovHRad * 180.0 / Math.PI
493
+ val fovVDeg = fovVRad * 180.0 / Math.PI
494
+
495
+ // ARCore quaternion comes back in (x, y, z, w) order.
496
+ val qarr = camera.pose.rotationQuaternion
497
+ // P3-F: also extract translation so the KeyframeGate's
498
+ // plane-based ray-projection can compute polygon overlap.
499
+ // Previously these were dropped, forcing the gate into
500
+ // angular-fallback even when a plane was latched.
501
+ val tArr = camera.pose.translation
502
+
503
+ val trackingPoor = camera.trackingState != TrackingState.TRACKING
504
+ val module = IncrementalStitcher.bridgeInstance ?: return
505
+ // 2026-05-15 (B3) — pass current display rotation so the
506
+ // encoded JPEG gets an EXIF orientation tag. Captured into
507
+ // a local val so the lambda below closes over a primitive
508
+ // (avoids re-reading lastDisplayRotation if it shifts
509
+ // between gate-evaluate and lambda invocation).
510
+ val rotationForEncode = if (lastDisplayRotation >= 0)
511
+ lastDisplayRotation else android.view.Surface.ROTATION_0
512
+
513
+ // 2026-05-21 (v0.3) — eager JPEG encode is only needed when
514
+ // the engine is in the legacy hybrid/firstwins live-engine
515
+ // mode (which feeds JPEG paths into addFrameAtPath every
516
+ // frame). In batch-keyframe mode (the production Camera
517
+ // component's path), the JPEG is encoded LAZILY inside
518
+ // the onAccept lambda below — only on the ~6 frames per
519
+ // capture that the C++ KeyframeGate actually keeps.
520
+ //
521
+ // 2026-05-22 (#19) — the encode now reads from the already-
522
+ // packed NV21 bytes (`packed`), NOT from the live Image
523
+ // (which has been closed above). Same output, no Image
524
+ // hold time.
525
+ val legacyJpegPath: String? = if (module.isBatchKeyframeMode) {
526
+ null
527
+ } else {
528
+ YuvImageConverter.encodeJpegFromNV21(
529
+ packed,
530
+ tmpJpegFile.absolutePath,
531
+ jpegQuality = 70,
532
+ displayRotation = rotationForEncode,
554
533
  )
555
- } finally {
556
- image.close()
557
534
  }
535
+ module.ingestFromARCameraView(
536
+ tx = tArr[0].toDouble(),
537
+ ty = tArr[1].toDouble(),
538
+ tz = tArr[2].toDouble(),
539
+ qx = qarr[0].toDouble(), qy = qarr[1].toDouble(),
540
+ qz = qarr[2].toDouble(), qw = qarr[3].toDouble(),
541
+ fx = fx, fy = fy, cx = cxIntr, cy = cyIntr,
542
+ imageWidth = intrinsics.imageDimensions[0],
543
+ imageHeight = intrinsics.imageDimensions[1],
544
+ yaw = yaw, pitch = pitch,
545
+ fovHorizDegrees = fovHDeg, fovVertDegrees = fovVDeg,
546
+ trackingPoor = trackingPoor,
547
+ // The Y plane lives at packed.nv21[0 .. width*height).
548
+ // C++ keyframe_gate reads `height * stride` bytes and
549
+ // ignores anything past that, so passing the full NV21
550
+ // array with `grayStride = width` reads exactly the Y
551
+ // plane (UV bytes at the tail are not touched).
552
+ grayData = packed.nv21,
553
+ grayWidth = packed.width,
554
+ grayHeight = packed.height,
555
+ grayStride = packed.width,
556
+ legacyJpegPath = legacyJpegPath,
557
+ onAccept = { targetPath ->
558
+ // Lazy JPEG encode. Runs ONLY if the C++ KeyframeGate
559
+ // accepted the frame. Encodes from the pre-packed
560
+ // NV21 bytes. The ARCore Image was already closed right
561
+ // after packNV21, so the ~25 ms encode adds no
562
+ // Image-hold cost on this slow path.
563
+ YuvImageConverter.encodeJpegFromNV21(
564
+ packed,
565
+ targetPath,
566
+ jpegQuality = 70,
567
+ displayRotation = rotationForEncode,
568
+ ) != null
569
+ },
570
+ )
558
571
  }
559
572
 
560
573
  private fun applyDisplayGeometry() {
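
Two small calculations in the hunk above can be checked in isolation: the field of view derived from the pinhole intrinsics, and the indexing that lets the gate read the Y plane straight out of a tightly packed NV21 array with `grayStride = width`. The following is a minimal sketch under those assumptions; `fovDegrees` and `lumaAt` are hypothetical helper names, not functions from the package.

```kotlin
import kotlin.math.atan

// Field of view in degrees for one axis of a pinhole camera:
// fov = 2 * atan(size / (2 * focal)), with size and focal both in pixels.
fun fovDegrees(sizePx: Double, focalPx: Double): Double =
    Math.toDegrees(2.0 * atan(sizePx / (2.0 * focalPx)))

// In a tightly packed NV21 buffer the first width * height bytes are the Y plane,
// so the luma sample at (x, y) sits at index y * width + x (stride == width).
// The mask converts the signed JVM byte to an unsigned 0..255 value.
fun lumaAt(nv21: ByteArray, width: Int, x: Int, y: Int): Int =
    nv21[y * width + x].toInt() and 0xFF
```

For example, a 1920-pixel-wide image with a 960-pixel focal length gives a horizontal field of view of about 90 degrees, since `atan(1)` is 45 degrees. Because `lumaAt` never indexes past `width * height`, the chroma bytes at the tail of the array are not touched.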
@@ -14,6 +14,25 @@ import java.io.FileOutputStream
14
14
  /**
15
15
  * Convert an ARCore `Image` (YUV_420_888) to a JPEG file on disk.
16
16
  *
17
+ * 2026-05-22 (audit follow-up #19) — split into two phases so callers
18
+ * can release the underlying ARCore `Image` ASAP:
19
+ *
20
+ * 1. `packNV21(image)` — reads the Y/U/V planes into a contiguous
21
+ * JVM-side `ByteArray` (NV21 layout). Fast (~3 ms for 1920×1080).
22
+ * The caller can close the `Image` IMMEDIATELY after this returns,
23
+ * freeing the ARCore Camera2 ImageReader buffer.
24
+ *
25
+ * 2. `encodeJpegFromNV21(packed, …)` — does the slow YUV→JPEG
26
+ * conversion (~10-25 ms) on the already-extracted bytes, NOT on
27
+ * the Image. Safe to run after the Image has been closed.
28
+ *
29
+ * The pre-#19 single-call `encodeToJpeg(image, …)` API is preserved as
30
+ * a thin wrapper for callers that don't care about Image hold time
31
+ * (e.g., one-shot photo capture). Performance-critical paths
32
+ * (`RNSARCameraView.forwardToIncremental`, called at ~60 Hz on the
33
+ * GL render thread) should use the two-step API to keep Image hold
34
+ * times bounded by the ~3 ms pack step instead of the ~25 ms encode.
35
+ *
17
36
  * Why JPEG → file → re-decode by OpenCV (slightly wasteful)?
18
37
  * The incremental engine's existing API (matching iOS') consumes
19
38
  * image PATHS, not raw planes. Threading raw YUV through the
@@ -22,56 +41,182 @@ import java.io.FileOutputStream
22
41
  * next to the ~40 ms per-frame engine work — keeping the surface
23
42
  * uniform across iOS / Android paths is worth the few-ms cost.
24
43
  *
25
- * `Image` ownership: caller MUST `image.close()` after this returns.
26
- * We don't close inside because the caller may want to inspect more
27
- * fields (timestamp, format) before releasing.
44
+ * `Image` ownership: the two-step API (`packNV21` + `encodeJpegFromNV21`)
45
+ * returns control to the caller after the pack step so the caller can
46
+ * close the Image at the right moment. The legacy single-call
47
+ * `encodeToJpeg(image, …)` does NOT close the Image — caller is
48
+ * responsible for that.
28
49
  */
29
50
  internal object YuvImageConverter {
30
51
 
31
- /// Convert + write JPEG. Returns the path (no file:// prefix)
32
- /// or null on any encode/write error (caller-decides whether to
33
- /// log + drop the frame).
34
- ///
35
- /// 2026-05-15 (B3) — `displayRotation` parameter writes the
36
- /// appropriate EXIF orientation tag so RN's Image loader (and
37
- /// any other consumer that respects EXIF) displays the JPEG
38
- /// upright regardless of how the device was held at capture.
39
- ///
40
- /// Without this, Android's image loaders display the raw sensor
41
- /// pixels (typically landscape) as-is — so a portrait-held
42
- /// capture's thumbnail appears sideways. iOS's image loader
43
- /// auto-respects EXIF orientation; Android's doesn't always
44
- /// (depends on the loader path). Setting the tag covers both.
45
- ///
46
- /// `displayRotation` should be the value returned from
47
- /// `WindowManager.defaultDisplay.rotation` at capture time
48
- /// (Surface.ROTATION_0/_90/_180/_270). We assume a typical
49
- /// back-camera sensor orientation of 90° — true for all
50
- /// devices we ship to (Galaxy A35 verified). Wire through
51
- /// CameraCharacteristics.SENSOR_ORIENTATION in a follow-up
52
- /// if we ever encounter a device that differs.
53
- ///
54
- /// Default `displayRotation = Surface.ROTATION_0` (portrait)
55
- /// preserves the previous behaviour for legacy callsites that
56
- /// haven't been updated yet.
57
- fun encodeToJpeg(
58
- image: Image,
52
+ /**
53
+ * Packed NV21 pixel data extracted from an ARCore `Image`.
54
+ * Once you hold one of these, the source `Image` can be closed —
55
+ * all subsequent operations work on the JVM-side byte array.
56
+ *
57
+ * NV21 layout (single contiguous byte array):
58
+ * bytes [0 .. width*height) = Y plane (luminance),
59
+ * densely packed,
60
+ * row stride = width
61
+ * bytes [width*height .. width*height*3/2) = interleaved V-U pairs
62
+ * at half resolution
63
+ *
64
+ * The Y plane portion can be passed directly to the C++
65
+ * `keyframe_gate` as grayscale pixels with `stride = width`.
66
+ */
67
+ data class PackedYuv(
68
+ val nv21: ByteArray,
69
+ val width: Int,
70
+ val height: Int,
71
+ ) {
72
+ /** Length of the Y plane portion (bytes [0 .. ySize)). */
73
+ val ySize: Int get() = width * height
74
+
75
+ // equals + hashCode override required because `nv21` is a
76
+ // mutable array; default `data class` equality uses reference
77
+ // identity for arrays, which is rarely what callers want.
78
+ override fun equals(other: Any?): Boolean {
79
+ if (this === other) return true
80
+ if (other !is PackedYuv) return false
81
+ return width == other.width
82
+ && height == other.height
83
+ && nv21.contentEquals(other.nv21)
84
+ }
85
+ override fun hashCode(): Int {
86
+ var result = nv21.contentHashCode()
87
+ result = 31 * result + width
88
+ result = 31 * result + height
89
+ return result
90
+ }
91
+ }
92
+
93
+
94
+ /**
95
+ * Pack the Y, U, V planes of a YUV_420_888 `Image` into a
96
+ * contiguous JVM-side NV21 byte array. Returns null if the
97
+ * `Image`'s format isn't YUV_420_888 or doesn't expose 3 planes.
98
+ *
99
+ * Performance: ~3 ms for 1920×1080 on a Galaxy A35. Dominated
100
+ * by the row-by-row copy through the direct ByteBuffers backing
101
+ * the camera planes.
102
+ *
103
+ * The Y plane is densely repacked (the source rowStride may be
104
+ * padded, but we discard padding on the way in so the result has
105
+ * `rowStride = width`). This is what callers want — `cv::Mat`
106
+ * wrap on the C++ side prefers tight strides, and downstream
107
+ * `YuvImage.compressToJpeg` requires densely-packed input.
108
+ */
109
+ fun packNV21(image: Image): PackedYuv? {
110
+ if (image.format != ImageFormat.YUV_420_888) return null
111
+ val planes = image.planes
112
+ if (planes.size < 3) return null
113
+
114
+ val w = image.width
115
+ val h = image.height
116
+ val ySize = w * h
117
+ val uvSize = w * h / 2
118
+ val nv21 = ByteArray(ySize + uvSize)
119
+
120
+ // ── Y plane (luminance) ─────────────────────────────────
121
+ val yPlane = planes[0]
122
+ val yBuf = yPlane.buffer
123
+ val yRowStride = yPlane.rowStride
124
+ if (yRowStride == w) {
125
+ // Source already densely packed — single block copy.
126
+ // Use duplicate() so we don't mutate the original buffer's
127
+ // position state (defensive — ARCore may have other readers
128
+ // of the same underlying buffer, though in practice it
129
+ // shouldn't).
130
+ yBuf.duplicate().apply { rewind() }.get(nv21, 0, ySize)
131
+ } else {
132
+ // Row-by-row copy when stride > width (padded rows).
133
+ val dup = yBuf.duplicate()
134
+ var dstOffset = 0
135
+ var srcOffset = 0
136
+ for (row in 0 until h) {
137
+ dup.position(srcOffset)
138
+ dup.get(nv21, dstOffset, w)
139
+ dstOffset += w
140
+ srcOffset += yRowStride
141
+ }
142
+ }
143
+
144
+ // ── U + V planes (chroma) ───────────────────────────────
145
+ // YUV_420_888 has them subsampled 2:1 so each plane physically
146
+ // covers (w/2) × (h/2). Pixel stride is 1 (planar) or 2
147
+ // (semi-planar interleaved). NV21 expects interleaved V-U.
148
+ val uPlane = planes[1]
149
+ val vPlane = planes[2]
150
+ val uBuf = uPlane.buffer
151
+ val vBuf = vPlane.buffer
152
+ val uRowStride = uPlane.rowStride
153
+ val uPixelStride = uPlane.pixelStride
154
+ val vRowStride = vPlane.rowStride
155
+ val vPixelStride = vPlane.pixelStride
156
+
157
+ // Fast path — most Android camera2 / ARCore producers emit
158
+ // semi-planar interleaved data with pixelStride=2. In that
159
+ // case the V plane's underlying bytes physically interleave
160
+ // V-U-V-U... and copying the V plane's full byte range
161
+ // produces NV21 layout directly.
162
+ if (uPixelStride == 2 && vPixelStride == 2 &&
163
+ uRowStride == vRowStride && uRowStride == w) {
164
+ val vBytes = vBuf.remaining().coerceAtMost(uvSize)
165
+ // Defensive duplicate() again — same reasoning as Y plane.
166
+ vBuf.duplicate().apply { rewind() }.get(nv21, ySize, vBytes)
167
+ return PackedYuv(nv21, w, h)
168
+ }
169
+
170
+ // Slow path — manual interleave for planar (pixelStride=1) or
171
+ // non-tight semi-planar layouts.
172
+ var pos = ySize
173
+ val rowsUv = h / 2
174
+ val colsUv = w / 2
175
+ for (row in 0 until rowsUv) {
176
+ for (col in 0 until colsUv) {
177
+ val vIdx = row * vRowStride + col * vPixelStride
178
+ val uIdx = row * uRowStride + col * uPixelStride
179
+ nv21[pos++] = vBuf.get(vIdx)
180
+ nv21[pos++] = uBuf.get(uIdx)
181
+ }
182
+ }
183
+ return PackedYuv(nv21, w, h)
184
+ }
185
+
186
+
187
+ /**
188
+ * Encode an already-packed NV21 buffer to a JPEG file on disk.
189
+ *
190
+ * Returns the output path on success, or null on any encode/write
191
+ * error (caller decides whether to log + drop the frame).
192
+ *
193
+ * `displayRotation` writes the appropriate EXIF orientation tag
194
+ * so consumers that respect EXIF (RN's Image loader, etc.)
195
+ * display the JPEG upright regardless of how the device was held
196
+ * at capture. Should be the value from
197
+ * `WindowManager.defaultDisplay.rotation` at capture time
198
+ * (Surface.ROTATION_0 / _90 / _180 / _270).
199
+ *
200
+ * Sensor orientation 90° assumed (back camera) — verified on
201
+ * Galaxy A35. Wire `CameraCharacteristics.SENSOR_ORIENTATION`
202
+ * through in a follow-up if we hit a device that differs.
203
+ */
204
+ fun encodeJpegFromNV21(
205
+ packed: PackedYuv,
59
206
  outputPath: String,
60
207
  jpegQuality: Int = 70,
61
208
  displayRotation: Int = Surface.ROTATION_0,
62
209
  ): String? {
63
- if (image.format != ImageFormat.YUV_420_888) return null
64
- val nv21 = yuv420toNV21(image) ?: return null
65
210
  val yuvImage = YuvImage(
66
- nv21,
211
+ packed.nv21,
67
212
  ImageFormat.NV21,
68
- image.width,
69
- image.height,
213
+ packed.width,
214
+ packed.height,
70
215
  null,
71
216
  )
72
217
  val baos = ByteArrayOutputStream()
73
218
  val ok = yuvImage.compressToJpeg(
74
- Rect(0, 0, image.width, image.height),
219
+ Rect(0, 0, packed.width, packed.height),
75
220
  jpegQuality.coerceIn(1, 100),
76
221
  baos,
77
222
  )
@@ -81,8 +226,9 @@ internal object YuvImageConverter {
81
226
  } catch (e: Throwable) {
82
227
  return null
83
228
  }
229
+
84
230
  // Write EXIF orientation tag based on display rotation.
85
- // Sensor orientation 90° assumed (back camera). The math:
231
+ // The math:
86
232
  // ROTATION_0 (portrait, sensor 90° CW from screen-up)
87
233
  // → JPEG needs 90° CW to display upright → ROTATE_90 (6)
88
234
  // ROTATION_90 (landscape-left, sensor aligned with screen)
@@ -93,11 +239,11 @@ internal object YuvImageConverter {
93
239
  // → 180° → ROTATE_180 (3)
94
240
  //
95
241
  // EXIF tag set EVEN when the orientation is normal — keeps
96
- // every output JPEG self-describing for downstream
97
- // consumers (cv::Stitcher does NOT auto-honour EXIF, see
98
- // BatchStitcher.applyExifOrientation; this metadata
99
- // exists primarily for the live thumbnail strip + future
100
- // RN Image renderers).
242
+ // every output JPEG self-describing for downstream consumers.
243
+ // cv::Stitcher does NOT auto-honour EXIF (see
244
+ // BatchStitcher.applyExifOrientation); this metadata exists
245
+ // primarily for the live thumbnail strip + future RN Image
246
+ // renderers.
101
247
  val exifOrientation = when (displayRotation) {
102
248
  Surface.ROTATION_0 -> ExifInterface.ORIENTATION_ROTATE_90
103
249
  Surface.ROTATION_90 -> ExifInterface.ORIENTATION_NORMAL
@@ -114,88 +260,35 @@ internal object YuvImageConverter {
114
260
  exif.saveAttributes()
115
261
  } catch (e: Throwable) {
116
262
  // EXIF write failed — JPEG itself is still valid; just
117
- // missing the orientation hint. Caller doesn't need to
118
- // know; non-fatal.
263
+ // missing the orientation hint. Non-fatal; caller doesn't
264
+ // need to know.
119
265
  }
120
266
  return outputPath
121
267
  }
122
268
 
269
+
123
270
  /**
124
- * Pack a YUV_420_888 `Image` into a contiguous NV21 byte array.
271
+ * Single-call convenience wrapper: pack the `Image` and encode
272
+ * to JPEG in one step. Keeps the `Image` open through the entire
273
+ * ~25 ms encode — fine for one-shot photo capture, NOT
274
+ * recommended for the ~60 Hz `forwardToIncremental` path. See the
275
+ * file-level docs for the two-step alternative.
125
276
  *
126
- * The Image API exposes Y, U, V as three planes, each with its
127
- * own row stride and pixel stride. NV21 expects a single
128
- * contiguous buffer with Y plane first, then interleaved VU bytes
129
- * after. The repacking handles row + pixel strides that don't
130
- * match the dense layout.
277
+ * Caller still owns the `Image` and MUST close it afterwards;
278
+ * this function does not.
131
279
  */
132
- private fun yuv420toNV21(image: Image): ByteArray? {
133
- val w = image.width
134
- val h = image.height
135
- val ySize = w * h
136
- val uvSize = w * h / 2
137
-
138
- val nv21 = ByteArray(ySize + uvSize)
139
- val planes = image.planes
140
- if (planes.size < 3) return null
141
-
142
- // Y plane.
143
- val yPlane = planes[0]
144
- val yBuf = yPlane.buffer
145
- val yRowStride = yPlane.rowStride
146
- if (yRowStride == w) {
147
- yBuf.get(nv21, 0, ySize)
148
- } else {
149
- // Row-by-row copy when stride != width.
150
- var dstOffset = 0
151
- var srcOffset = 0
152
- for (row in 0 until h) {
153
- yBuf.position(srcOffset)
154
- yBuf.get(nv21, dstOffset, w)
155
- dstOffset += w
156
- srcOffset += yRowStride
157
- }
158
- }
159
-
160
- // U + V planes. YUV_420_888 has them subsampled 2:1 so each
161
- // covers (w/2) × (h/2). Pixel stride is 1 (planar) or 2
162
- // (semi-planar interleaved). NV21 requires interleaved VU.
163
- val uPlane = planes[1]
164
- val vPlane = planes[2]
165
- val uBuf = uPlane.buffer
166
- val vBuf = vPlane.buffer
167
- val uRowStride = uPlane.rowStride
168
- val uPixelStride = uPlane.pixelStride
169
- val vRowStride = vPlane.rowStride
170
- val vPixelStride = vPlane.pixelStride
171
-
172
- // Most camera2 / ARCore implementations on Android already
173
- // produce semi-planar interleaved data with pixelStride=2.
174
- // In that case Y plane + V plane (offset by 1) form NV21
175
- // directly with a single block copy. Detect + fast-path it.
176
- if (uPixelStride == 2 && vPixelStride == 2 &&
177
- uRowStride == vRowStride && uRowStride == w) {
178
- // The V plane in NV21 layout starts at vBuf's first byte.
179
- // Copy the entire V plane (which physically interleaves
180
- // with U bytes since pixelStride=2 means consecutive
181
- // bytes are V-U-V-U...).
182
- val vBytes = vBuf.remaining().coerceAtMost(uvSize)
183
- vBuf.get(nv21, ySize, vBytes)
184
- return nv21
185
- }
186
-
187
- // Slow path — manual interleave.
188
- var pos = ySize
189
- val rowsUv = h / 2
190
- val colsUv = w / 2
191
- for (row in 0 until rowsUv) {
192
- for (col in 0 until colsUv) {
193
- val vIdx = row * vRowStride + col * vPixelStride
194
- val uIdx = row * uRowStride + col * uPixelStride
195
- nv21[pos++] = vBuf.get(vIdx)
196
- nv21[pos++] = uBuf.get(uIdx)
197
- }
198
- }
199
- return nv21
280
+ fun encodeToJpeg(
281
+ image: Image,
282
+ outputPath: String,
283
+ jpegQuality: Int = 70,
284
+ displayRotation: Int = Surface.ROTATION_0,
285
+ ): String? {
286
+ val packed = packNV21(image) ?: return null
287
+ return encodeJpegFromNV21(
288
+ packed,
289
+ outputPath,
290
+ jpegQuality = jpegQuality,
291
+ displayRotation = displayRotation,
292
+ )
200
293
  }
201
294
  }
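
The general (slow-path) repacking that `packNV21` performs can be exercised without an Android `Image` by feeding plain `ByteBuffer` planes. The sketch below is an illustration under that assumption, not the package's implementation: `packNv21Sketch` and `demoPack` are hypothetical names, and the fast path for semi-planar sources is deliberately omitted.

```kotlin
import java.nio.ByteBuffer

// Pack separate Y, U and V planes into one NV21 array:
// width * height luma bytes, then interleaved V-U pairs at half resolution.
// Row and pixel strides follow the conventions of a YUV_420_888 plane.
fun packNv21Sketch(
    y: ByteBuffer, yRowStride: Int,
    u: ByteBuffer, uRowStride: Int, uPixelStride: Int,
    v: ByteBuffer, vRowStride: Int, vPixelStride: Int,
    width: Int, height: Int,
): ByteArray {
    val out = ByteArray(width * height * 3 / 2)
    // Luma: copy `width` bytes per row and skip any row padding.
    for (row in 0 until height) {
        for (col in 0 until width) {
            out[row * width + col] = y.get(row * yRowStride + col)
        }
    }
    // Chroma: absolute reads leave the buffers' positions untouched.
    var pos = width * height
    for (row in 0 until height / 2) {
        for (col in 0 until width / 2) {
            out[pos++] = v.get(row * vRowStride + col * vPixelStride)
            out[pos++] = u.get(row * uRowStride + col * uPixelStride)
        }
    }
    return out
}

// A 4x2 frame with a padded luma stride of 6 and planar (pixelStride = 1) chroma.
fun demoPack(): List<Int> {
    val y = ByteBuffer.wrap(byteArrayOf(1, 2, 3, 4, 0, 0, 5, 6, 7, 8, 0, 0))
    val u = ByteBuffer.wrap(byteArrayOf(20, 21))
    val v = ByteBuffer.wrap(byteArrayOf(30, 31))
    return packNv21Sketch(y, 6, u, 2, 1, v, 2, 1, 4, 2).map { it.toInt() }
}
```

In `demoPack()` the two padding bytes at the end of each luma row are dropped, and the chroma tail comes out as V-U pairs (30, 20, 31, 21). Using absolute `get(index)` reads serves the same purpose as the `duplicate()` calls in the diff: the source buffers' position state is left alone.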
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "react-native-image-stitcher",
3
- "version": "0.4.0",
3
+ "version": "0.4.1",
4
4
  "description": "Pose-aware panorama capture + stitching for React Native. One <Camera> component, both tap-to-photo and hold-to-pan modes, both AR-backed and IMU-fallback capture paths.",
5
5
  "main": "dist/index.js",
6
6
  "types": "dist/index.d.ts",