@fairfox/polly 0.55.0 → 0.57.0

@@ -219,6 +219,127 @@ export interface TransportSnapshot {
  /** `performance.now()` at the time the stats were last refreshed. */
  at: number;
  }
+ /** Reasons the adapter declined to construct an RTC slot for a peer
+ * that appeared in the signalling roster or the keyring. Recorded
+ * per-peer in {@link MeshWebRTCAdapter.getPeerStateSnapshot} so a
+ * consumer harness observing "(no slot)" can tell which gate inside
+ * the adapter stopped construction without having to log-correlate
+ * through three layers of timing. Polly issue #106 item 7.
+ *
+ * - `self`: the peerId equals the local peerId.
+ * - `not-in-keyring`: the live keyring (or captured Set) does not
+ * currently authorise this peer.
+ * - `not-present`: the peer is not in the signalling roster. The
+ * adapter only dials peers it has heard about through
+ * `peers-present` or `peer-joined`; keyring entries that have not
+ * appeared on signalling are quietly held without a slot.
+ * - `tie-break-other-side`: the lex-tie-break designates the remote
+ * peer as the initiator; we wait for their offer.
+ * - `slot-already-exists`: a slot exists already, possibly in any
+ * negotiation state.
+ * - `fatal-error`: an exception was thrown while attempting to build
+ * the slot. The accompanying {@link SlotInitiationDecision.error}
+ * string carries the message.
+ */
+ export type SlotInitiationRejectionReason = "self" | "not-in-keyring" | "not-present" | "tie-break-other-side" | "slot-already-exists" | "fatal-error";
+ /** Most-recent slot-initiation decision for a peer. Computed at
+ * snapshot time so the view always reflects the current state of the
+ * relevant gates; a `decision === "accepted"` value paired with a
+ * `slot === undefined` view on the same snapshot is the load-bearing
+ * signal for "the adapter wants to dial but isn't" — the failure
+ * shape polly#106 documents. */
+ export interface SlotInitiationDecision {
+ /** "accepted" means every gate in `shouldInitiateTo` would pass right
+ * now and a sweep tick would construct a slot. "rejected" means at
+ * least one gate failed; the `reason` names it. */
+ decision: "accepted" | "rejected";
+ /** Populated only on `rejected` decisions. */
+ reason: SlotInitiationRejectionReason | undefined;
+ /** If the previous sweep tick caught a synchronous throw while
+ * building this peer's slot, the message is preserved here for the
+ * next snapshot. The reason will be `fatal-error`. */
+ error: string | undefined;
+ /** `performance.now()` at the time the decision was computed. */
+ at: number;
+ }
+ /** Most-recent sync-handshake timeline for a peer slot. Each timestamp
+ * is `performance.now()` of the corresponding event the first time it
+ * fired on the current slot — slots that get evicted and rebuilt start
+ * over. The four fields together describe whether the adapter and the
+ * receiver downstream of it have done their part in initiating sync:
+ *
+ * - `dataChannelOpenedAt`: when the wire is ready to carry bytes.
+ * - `peerCandidateEmittedAt`: when polly emitted `peer-candidate`
+ * upward; Automerge's network subsystem hooks this event to add the
+ * peer to its known set and kick off the per-document sync exchange.
+ * If this is `undefined` long after the data channel has opened,
+ * polly never signalled "ready" to Automerge — that's a failure in
+ * the adapter.
+ * - `firstOutboundSendAt`: when polly first dispatched bytes through
+ * {@link MeshWebRTCAdapter.send} for this peer. If
+ * `peerCandidateEmittedAt` is set but this is still `undefined`,
+ * Automerge's NetworkSubsystem has not asked the adapter to send
+ * anything. The polly#106 ladder named "no handle locally" as the
+ * typical cause for this rung; the polly#107 failing-shape evidence
+ * (fourteen `$meshState` handles pre-warmed in `ready` state,
+ * `firstOutboundSendAt` still undefined long after peer-candidate
+ * fires) shows that's a misleading ladder entry. Revised:
+ *
+ * - If `repo.handles[docId]` is undefined or in a non-`ready`
+ * state for every doc this peer should sync, the consumer
+ * fix is real — pre-warm the handles via the documented
+ * `$meshState(key, initial)` factory before the slot opens.
+ *
+ * - If `repo.handles[docId]` is `ready` for every doc AND
+ * `getPeerStateSnapshot().peers[…].slot.handles[docId]
+ * .announcedToPeer` is `false`, the gate is BETWEEN Automerge's
+ * NetworkSubsystem and the adapter — not on the consumer's
+ * handle-construction path. This is the polly#107 surface;
+ * post-#107 the mesh client hooks `peer-candidate` to invoke
+ * the synchronizer's `reevaluateDocumentShare` so the
+ * `addPeer`/`addDocument` ordering race that leaves a handle
+ * un-announced gets closed by an idempotent re-evaluation.
+ *
+ * - If `announcedToPeer` becomes `true` for every relevant doc
+ * and `firstOutboundSendAt` is still undefined, the gap is in
+ * polly's own send path — that's a polly bug, not an
+ * Automerge or consumer one.
+ *
+ * - `firstInboundMessageAt`: when polly first emitted a `message` event
+ * for this peer. If `peerCandidateEmittedAt` is set on the remote and
+ * `firstOutboundSendAt` is set on the remote but this is `undefined`
+ * locally, bytes were sent across the wire but never decoded — that
+ * points at the crypto envelope or the wire fragmenter.
+ *
+ * Polly issue #106 item 7; polly#107 revised the
+ * `firstOutboundSendAt` rung to point at the polly⇄Automerge gate
+ * rather than the consumer. */
+ export interface SyncHandshakeAttemptSnapshot {
+ dataChannelOpenedAt: number | undefined;
+ peerCandidateEmittedAt: number | undefined;
+ firstOutboundSendAt: number | undefined;
+ firstInboundMessageAt: number | undefined;
+ }
+ /** Sweep-loop observability for the periodic dial re-evaluation. The
+ * sweep is what catches peers that were not in the keyring at the time
+ * of their `peer-joined` notification (polly#103) AND peers whose
+ * previous slot failed and got evicted (polly#106). Exposing its tick
+ * counter lets a consumer distinguish "sweep is running but
+ * `shouldInitiateTo` rejected the peer" from "sweep is broken and
+ * never fires" without instrumenting polly internals. */
+ export interface SweepSnapshot {
+ /** True iff the sweep timer is currently scheduled — false on the
+ * captured-set fallback path (no keyringSource) or when the interval
+ * is configured to 0. */
+ enabled: boolean;
+ /** Configured interval in milliseconds. 0 means disabled. */
+ intervalMs: number;
+ /** How many times the sweep callback has fired since `connect()`. */
+ runCount: number;
+ /** `performance.now()` at the last tick. `undefined` until the
+ * first tick fires. */
+ lastRunAt: number | undefined;
+ }
  /** Per-peer view of an in-flight initial sync. Populated by
  * {@link MeshWebRTCAdapter.handleSyncFragment} as fragments of a
  * single reassembly arrive, and reset to `undefined` once the
@@ -242,6 +363,47 @@ export interface InFlightSyncSnapshot {
  * and this stays 0. */
  applyBacklog: number;
  }
+ /** Per-handle per-peer sync bookkeeping. Closes the diagnostic gap
+ * between "handle exists in repo" (observable today via `repo.handles`)
+ * and "Automerge's NetworkSubsystem has actually told this peer about
+ * this handle" — the latter being the load-bearing question for the
+ * polly#107 failing shape, where fourteen pre-warmed handles sit in
+ * `ready` state and the peer slot is fully connected, yet Automerge
+ * has not asked the adapter to send a single sync message.
+ *
+ * Each field is stamped from the adapter's own wire path: `out` from
+ * {@link MeshWebRTCAdapter.send}, `in` from {@link dispatchMessage}'s
+ * deserialised view. Combined with the consumer's view of
+ * `repo.handles[documentId].state`, the mesh client's
+ * {@link createMeshClient}-returned `getPeerStateSnapshot` produces the
+ * full per-handle observability the polly#107 ticket asks for in
+ * item 7. */
+ export interface HandleSyncSnapshot {
+ /** `performance.now()` of the most recent send through
+ * {@link MeshWebRTCAdapter.send} carrying this `documentId` for this
+ * peer. `undefined` means Automerge has never asked the adapter to
+ * send a sync message for this document — i.e. the handle is NOT
+ * "announced to peer" in the polly#107 sense, regardless of whether
+ * the handle is `ready` in `repo.handles`. */
+ lastSyncMessageOutAt: number | undefined;
+ /** `performance.now()` of the most recent inbound message dispatched
+ * upward for this `documentId` from this peer. `undefined` means we
+ * have never received a sync message for this document from this
+ * peer — they have not announced their copy of this handle to us. */
+ lastSyncMessageInAt: number | undefined;
+ /** Byte length of the most recent outbound sync message (the raw
+ * `RTCDataChannel.send` payload, which is the crypto-wrapped envelope
+ * — not the inner Automerge sync size). `undefined` until the first
+ * outbound send. Used by the polly#107 example's wire-level
+ * transcript to distinguish "Automerge generated an empty sync
+ * message" from "Automerge generated nothing at all". */
+ lastSyncMessageOutSize: number | undefined;
+ /** Type field of the most recent outbound message for this document
+ * to this peer. Typically `"sync"` once handshake has begun, or
+ * `"request"` while the local side is still asking. `undefined`
+ * until the first outbound send. */
+ lastSyncMessageOutType: string | undefined;
+ }
  /**
  * Automerge-Repo NetworkAdapter backed by real WebRTC data channels.
  * Manages one RTCPeerConnection per remote peer and uses a supplied
@@ -301,6 +463,21 @@ export declare class MeshWebRTCAdapter extends NetworkAdapter {
  private readonly slots;
  private ready;
  private readyResolver;
+ /** Sticky cache of the most recent {@link SlotInitiationDecision}
+ * per peer. Updated on every `shouldInitiateTo` call and on every
+ * caught throw inside the sweep loop, so a snapshot taken at any
+ * moment can answer "why is there no slot for this peer right
+ * now?". Polly issue #106 item 7. */
+ private readonly lastSlotInitiationDecisions;
+ /** Tick count of the periodic sweep — incremented inside the
+ * `setInterval` callback, exposed via {@link getPeerStateSnapshot}.
+ * Lets a consumer rule out "sweep is dead" before chasing the
+ * shouldInitiateTo gates. */
+ private sweepRunCount;
+ /** `performance.now()` at the last sweep tick. Paired with
+ * {@link sweepRunCount} so a stalled sweep is visible at a glance
+ * via the snapshot's `sweep` block. */
+ private lastSweepAt;
  /** The peers this adapter will dial. Backward-compatible read accessor
  * for callers that previously iterated the `knownPeerIds` array. With
  * a {@link MeshWebRTCAdapterOptions.keyringSource} configured, the
@@ -348,10 +525,13 @@ export declare class MeshWebRTCAdapter extends NetworkAdapter {
  localPeerId: string;
  knownPeerIds: string[];
  presentPeerIds: string[];
+ sweep: SweepSnapshot;
  peers: Array<{
  peerId: string;
  knownInKeyring: boolean;
  presentInSignalling: boolean;
+ slotInitiationRejectionReason: SlotInitiationRejectionReason | undefined;
+ slotInitiationDecision: SlotInitiationDecision;
  slot: undefined | {
  signalingState: string;
  iceConnectionState: string;
@@ -361,6 +541,8 @@ export declare class MeshWebRTCAdapter extends NetworkAdapter {
  pendingRemoteIceCount: number;
  inFlightSync: InFlightSyncSnapshot | undefined;
  transport: TransportSnapshot | undefined;
+ lastSyncHandshakeAttempt: SyncHandshakeAttemptSnapshot;
+ handles: Record<string, HandleSyncSnapshot>;
  };
  }>;
  };
@@ -378,6 +560,17 @@ export declare class MeshWebRTCAdapter extends NetworkAdapter {
  * every listed peer, so a device joining into an established lobby
  * dials every knownPeer it is meant to initiate to in one pass. */
  handlePeersPresent(peerIds: string[]): void;
+ /** Construct an initiating slot inside a per-peer try/catch and
+ * record any throw as a `fatal-error` rejection on the per-peer
+ * decision map so the snapshot surface names it. Every dial entry
+ * point ({@link handlePeerJoined}, {@link handlePeersPresent},
+ * {@link addKnownPeer}, {@link refreshKnownPeers}) routes through
+ * here so a single peer's broken construction can never take down a
+ * batch of peers — pre-#106 a thrown `new RTCPeerConnection()`
+ * inside `handlePeersPresent` would skip every later peer in the
+ * same batch with no observable trace because the signalling
+ * client's frame dispatch swallowed the rejection. */
+ private tryCreateInitiatingSlot;
  /** Handle the signalling server's `peer-left` notification: a
  * previously joined peer has closed its socket. Evict any slot we
  * hold for that peer so a subsequent `peer-joined` from the same
@@ -406,9 +599,41 @@ export declare class MeshWebRTCAdapter extends NetworkAdapter {
  * dial the ones the keyring authorises that we do not already have
  * a slot for. The periodic sweep started in {@link connect} calls
  * this; consumers can call it manually to skip the wait after they
- * apply a fresh pairing token. Idempotent. */
+ * apply a fresh pairing token. Idempotent.
+ *
+ * A throw from {@link createInitiatingSlot} for one peer must not
+ * prevent the sweep from continuing to the next one — pre-#106 a
+ * synchronous throw inside `new RTCPeerConnection()` (a real risk
+ * once the page has built dozens of connections and Chrome's
+ * per-page cap is in play) skipped every later peer in the same
+ * sweep, with no observable trace because `setInterval` swallows
+ * the rejection silently. {@link tryCreateInitiatingSlot} caches
+ * the error onto the snapshot's slotInitiationRejectionReason as
+ * `fatal-error` so the failing peer is named instead of disguised
+ * as "(no slot)". */
  refreshKnownPeers(): void;
  private shouldInitiateTo;
+ /** Pure-function form of the gate cascade behind {@link shouldInitiateTo}.
+ * Returns `undefined` when every gate passes (the slot would be
+ * built); otherwise returns the named reason the dial was declined.
+ * Pulling the gates out of the boolean wrapper lets the snapshot
+ * surface name *which* gate stopped construction without the caller
+ * having to re-implement the check. Polly issue #106 item 7.
+ *
+ * The "not-present" gate is checked here even though the inbound
+ * call sites (`handlePeerJoined`, `handlePeersPresent`,
+ * `refreshKnownPeers`, `addKnownPeer`) only invoke `shouldInitiateTo`
+ * for peers in the signalling roster — so on those paths the gate
+ * never fires. It's load-bearing on the snapshot path, where the
+ * reason is computed for every peer the caller might ask about
+ * (including keyring entries that aren't currently signalling). */
+ private evaluateInitiation;
+ /** Compute the latest initiation decision for a peer at snapshot
+ * time. Prefers the cached decision when a sweep tick fixed the
+ * outcome to `fatal-error` (a thrown construction is sticky until
+ * the next successful sweep clears it); otherwise re-evaluates the
+ * gates against current state. Pure read; never mutates the map. */
+ private snapshotInitiationDecision;
  whenReady(): Promise<void>;
  /**
  * Start the adapter. Marks the adapter ready so Automerge's
@@ -428,7 +653,15 @@ export declare class MeshWebRTCAdapter extends NetworkAdapter {
  * keyring after their `peer-joined` notification has already fired.
  * No-op when no keyringSource was supplied — the captured-set
  * fallback has no live source to re-read, so the sweep would be
- * useless. No-op when the interval is configured to 0. */
+ * useless. No-op when the interval is configured to 0.
+ *
+ * Each tick increments {@link sweepRunCount} and stamps
+ * {@link lastSweepAt} *before* dispatching to
+ * {@link refreshKnownPeers} so a snapshot can distinguish "sweep is
+ * running but every peer is rejected" from "sweep is dead". An
+ * outer try/catch keeps the timer alive even if the per-peer
+ * try/catch inside `refreshKnownPeers` somehow leaks. Polly issue
+ * #106 item 7. */
  private startKnownPeersSweep;
  private stopKnownPeersSweep;
  /**
@@ -521,6 +754,19 @@ export declare class MeshWebRTCAdapter extends NetworkAdapter {
  * them — a single bad candidate must not stall the connection. */
  private flushPendingRemoteIce;
  private wireConnection;
+ /** Emit `peer-candidate` upward exactly once per slot lifetime,
+ * stamping the slot's `lastSyncHandshakeAttempt.peerCandidateEmittedAt`
+ * with the moment of emission. The "once" semantics protect Automerge's
+ * NetworkSubsystem from a double-add when both the
+ * `connectionstatechange = connected` event AND the
+ * `dataChannel.onopen` event fire on the same slot — under werift
+ * the former is sometimes flaky, under Chrome the latter sometimes
+ * fires first; pre-#106 only the connection-state path emitted, so a
+ * werift slot whose data channel opened cleanly but whose connection
+ * state never advanced past `connecting` would never signal "ready"
+ * upward. Polly issue #106 Failure B item — closing one named
+ * mechanism for "data channel open, sync never started". */
+ private emitPeerCandidateOnce;
  private wireDataChannel;
  /** Refresh the per-peer {@link TransportSnapshot} for one peer by
  * pulling {@link RTCPeerConnection.getStats} and distilling the
@@ -553,6 +799,15 @@ export declare class MeshWebRTCAdapter extends NetworkAdapter {
  * `inFlightSync` reassembly state worth bookkeeping. Small
  * single-message dispatches yield but don't touch inFlightSync. */
  private scheduleEmitMessage;
+ /** Stamp the slot's `firstInboundMessageAt` the first time a
+ * dispatched (non-fragment, non-blob) message lands for a peer. Pure
+ * observability for {@link SyncHandshakeAttemptSnapshot}; does not
+ * affect dispatch. */
+ private stampFirstInboundMessage;
+ /** Stamp `handles[documentId].lastSyncMessageInAt` for the
+ * per-handle observability layer polly#107 adds. Pure observability;
+ * does not affect dispatch. */
+ private stampHandleInbound;
  private finishInFlightSyncApply;
  private emitSyncProgress;
  private handleSyncFragment;
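
Reviewer note: the doc comments added in this file describe a diagnostic ladder, first "why is there no slot?" and then the four handshake timestamps. The sketch below shows one way a consumer might walk that ladder over a single entry of `getPeerStateSnapshot().peers`. It is an illustration only. The type declarations are local copies of the shapes shown in this diff, trimmed to the fields the sketch reads; `PeerView` and `describePeer` are hypothetical names and are not part of `@fairfox/polly`, and the returned strings are made up for the example.

```typescript
// Local copies of shapes declared in the diff; nothing is imported from the package.
type SlotInitiationRejectionReason =
  | "self" | "not-in-keyring" | "not-present"
  | "tie-break-other-side" | "slot-already-exists" | "fatal-error";

interface SlotInitiationDecision {
  decision: "accepted" | "rejected";
  reason: SlotInitiationRejectionReason | undefined;
  error: string | undefined;
  at: number;
}

interface SyncHandshakeAttemptSnapshot {
  dataChannelOpenedAt: number | undefined;
  peerCandidateEmittedAt: number | undefined;
  firstOutboundSendAt: number | undefined;
  firstInboundMessageAt: number | undefined;
}

// Hypothetical, trimmed view of one `peers[]` entry from the snapshot.
interface PeerView {
  peerId: string;
  slotInitiationDecision: SlotInitiationDecision;
  slot: undefined | { lastSyncHandshakeAttempt: SyncHandshakeAttemptSnapshot };
}

// Hypothetical helper: returns the first rung of the ladder that has not been reached.
function describePeer(peer: PeerView): string {
  const d = peer.slotInitiationDecision;
  if (peer.slot === undefined) {
    if (d.decision === "accepted") {
      // "accepted" with no slot is the failure shape the doc comment attributes to polly#106.
      return "no slot although every gate passes";
    }
    if (d.reason === "fatal-error") {
      return `no slot: construction threw (${d.error ?? "no message"})`;
    }
    return `no slot: rejected (${d.reason ?? "unknown"})`;
  }
  const h = peer.slot.lastSyncHandshakeAttempt;
  if (h.dataChannelOpenedAt === undefined) return "slot exists, data channel not open";
  if (h.peerCandidateEmittedAt === undefined) return "data channel open, peer-candidate not emitted";
  if (h.firstOutboundSendAt === undefined) return "peer-candidate emitted, nothing sent";
  if (h.firstInboundMessageAt === undefined) return "sent, nothing received";
  return "handshake observed in both directions";
}
```

A harness could call `describePeer` on each `peers` entry and log the result next to `sweep.runCount`, which by the comments above separates "sweep is running but the peer is rejected" from "sweep never fires". The third rung ("peer-candidate emitted, nothing sent") is where the revised polly#107 guidance applies, so a real check would go on to inspect the per-document `handles` record.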
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@fairfox/polly",
- "version": "0.55.0",
+ "version": "0.57.0",
  "private": false,
  "type": "module",
  "description": "Multi-execution-context framework with reactive state and cross-context messaging for Chrome extensions, PWAs, and worker-based applications",