@fluidframework/container-runtime 2.1.0-276985 → 2.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (199) hide show
  1. package/CHANGELOG.md +4 -0
  2. package/README.md +71 -18
  3. package/api-extractor/api-extractor.current.json +5 -0
  4. package/api-extractor/api-extractor.legacy.json +1 -1
  5. package/api-extractor.json +1 -1
  6. package/api-report/container-runtime.legacy.public.api.md +9 -0
  7. package/container-runtime.test-files.tar +0 -0
  8. package/dist/blobManager/blobManager.d.ts +10 -0
  9. package/dist/blobManager/blobManager.d.ts.map +1 -1
  10. package/dist/blobManager/blobManager.js +19 -0
  11. package/dist/blobManager/blobManager.js.map +1 -1
  12. package/dist/channelCollection.d.ts +1 -1
  13. package/dist/channelCollection.d.ts.map +1 -1
  14. package/dist/channelCollection.js +40 -8
  15. package/dist/channelCollection.js.map +1 -1
  16. package/dist/containerRuntime.d.ts +14 -5
  17. package/dist/containerRuntime.d.ts.map +1 -1
  18. package/dist/containerRuntime.js +151 -99
  19. package/dist/containerRuntime.js.map +1 -1
  20. package/dist/dataStoreContext.d.ts +4 -0
  21. package/dist/dataStoreContext.d.ts.map +1 -1
  22. package/dist/dataStoreContext.js +9 -3
  23. package/dist/dataStoreContext.js.map +1 -1
  24. package/dist/gc/garbageCollection.d.ts +1 -1
  25. package/dist/gc/garbageCollection.d.ts.map +1 -1
  26. package/dist/gc/garbageCollection.js +14 -8
  27. package/dist/gc/garbageCollection.js.map +1 -1
  28. package/dist/gc/gcDefinitions.d.ts +4 -2
  29. package/dist/gc/gcDefinitions.d.ts.map +1 -1
  30. package/dist/gc/gcDefinitions.js.map +1 -1
  31. package/dist/gc/gcHelpers.d.ts.map +1 -1
  32. package/dist/gc/gcHelpers.js +12 -0
  33. package/dist/gc/gcHelpers.js.map +1 -1
  34. package/dist/gc/gcTelemetry.d.ts +3 -2
  35. package/dist/gc/gcTelemetry.d.ts.map +1 -1
  36. package/dist/gc/gcTelemetry.js +6 -6
  37. package/dist/gc/gcTelemetry.js.map +1 -1
  38. package/dist/legacy.d.ts +1 -1
  39. package/dist/metadata.d.ts +7 -1
  40. package/dist/metadata.d.ts.map +1 -1
  41. package/dist/metadata.js +6 -0
  42. package/dist/metadata.js.map +1 -1
  43. package/dist/opLifecycle/batchManager.d.ts +8 -1
  44. package/dist/opLifecycle/batchManager.d.ts.map +1 -1
  45. package/dist/opLifecycle/batchManager.js +37 -16
  46. package/dist/opLifecycle/batchManager.js.map +1 -1
  47. package/dist/opLifecycle/definitions.d.ts +1 -1
  48. package/dist/opLifecycle/definitions.d.ts.map +1 -1
  49. package/dist/opLifecycle/definitions.js.map +1 -1
  50. package/dist/opLifecycle/index.d.ts +2 -2
  51. package/dist/opLifecycle/index.d.ts.map +1 -1
  52. package/dist/opLifecycle/index.js +3 -1
  53. package/dist/opLifecycle/index.js.map +1 -1
  54. package/dist/opLifecycle/opCompressor.d.ts.map +1 -1
  55. package/dist/opLifecycle/opCompressor.js +12 -8
  56. package/dist/opLifecycle/opCompressor.js.map +1 -1
  57. package/dist/opLifecycle/opGroupingManager.d.ts.map +1 -1
  58. package/dist/opLifecycle/opGroupingManager.js +14 -11
  59. package/dist/opLifecycle/opGroupingManager.js.map +1 -1
  60. package/dist/opLifecycle/opSplitter.d.ts.map +1 -1
  61. package/dist/opLifecycle/opSplitter.js +11 -6
  62. package/dist/opLifecycle/opSplitter.js.map +1 -1
  63. package/dist/opLifecycle/outbox.d.ts +22 -6
  64. package/dist/opLifecycle/outbox.d.ts.map +1 -1
  65. package/dist/opLifecycle/outbox.js +43 -21
  66. package/dist/opLifecycle/outbox.js.map +1 -1
  67. package/dist/opLifecycle/remoteMessageProcessor.d.ts +10 -8
  68. package/dist/opLifecycle/remoteMessageProcessor.d.ts.map +1 -1
  69. package/dist/opLifecycle/remoteMessageProcessor.js +39 -15
  70. package/dist/opLifecycle/remoteMessageProcessor.js.map +1 -1
  71. package/dist/packageVersion.d.ts +1 -1
  72. package/dist/packageVersion.d.ts.map +1 -1
  73. package/dist/packageVersion.js +1 -1
  74. package/dist/packageVersion.js.map +1 -1
  75. package/dist/pendingStateManager.d.ts +37 -13
  76. package/dist/pendingStateManager.d.ts.map +1 -1
  77. package/dist/pendingStateManager.js +95 -45
  78. package/dist/pendingStateManager.js.map +1 -1
  79. package/dist/public.d.ts +1 -1
  80. package/dist/scheduleManager.js +4 -0
  81. package/dist/scheduleManager.js.map +1 -1
  82. package/dist/summary/summarizerNode/summarizerNodeUtils.d.ts.map +1 -1
  83. package/dist/summary/summarizerNode/summarizerNodeUtils.js +2 -0
  84. package/dist/summary/summarizerNode/summarizerNodeUtils.js.map +1 -1
  85. package/dist/summary/summaryFormat.d.ts.map +1 -1
  86. package/dist/summary/summaryFormat.js +4 -1
  87. package/dist/summary/summaryFormat.js.map +1 -1
  88. package/internal.d.ts +1 -1
  89. package/legacy.d.ts +1 -1
  90. package/lib/blobManager/blobManager.d.ts +10 -0
  91. package/lib/blobManager/blobManager.d.ts.map +1 -1
  92. package/lib/blobManager/blobManager.js +19 -0
  93. package/lib/blobManager/blobManager.js.map +1 -1
  94. package/lib/channelCollection.d.ts +1 -1
  95. package/lib/channelCollection.d.ts.map +1 -1
  96. package/lib/channelCollection.js +40 -8
  97. package/lib/channelCollection.js.map +1 -1
  98. package/lib/containerRuntime.d.ts +14 -5
  99. package/lib/containerRuntime.d.ts.map +1 -1
  100. package/lib/containerRuntime.js +152 -100
  101. package/lib/containerRuntime.js.map +1 -1
  102. package/lib/dataStoreContext.d.ts +4 -0
  103. package/lib/dataStoreContext.d.ts.map +1 -1
  104. package/lib/dataStoreContext.js +10 -4
  105. package/lib/dataStoreContext.js.map +1 -1
  106. package/lib/gc/garbageCollection.d.ts +1 -1
  107. package/lib/gc/garbageCollection.d.ts.map +1 -1
  108. package/lib/gc/garbageCollection.js +14 -8
  109. package/lib/gc/garbageCollection.js.map +1 -1
  110. package/lib/gc/gcDefinitions.d.ts +4 -2
  111. package/lib/gc/gcDefinitions.d.ts.map +1 -1
  112. package/lib/gc/gcDefinitions.js.map +1 -1
  113. package/lib/gc/gcHelpers.d.ts.map +1 -1
  114. package/lib/gc/gcHelpers.js +12 -0
  115. package/lib/gc/gcHelpers.js.map +1 -1
  116. package/lib/gc/gcTelemetry.d.ts +3 -2
  117. package/lib/gc/gcTelemetry.d.ts.map +1 -1
  118. package/lib/gc/gcTelemetry.js +6 -6
  119. package/lib/gc/gcTelemetry.js.map +1 -1
  120. package/lib/legacy.d.ts +1 -1
  121. package/lib/metadata.d.ts +7 -1
  122. package/lib/metadata.d.ts.map +1 -1
  123. package/lib/metadata.js +4 -1
  124. package/lib/metadata.js.map +1 -1
  125. package/lib/opLifecycle/batchManager.d.ts +8 -1
  126. package/lib/opLifecycle/batchManager.d.ts.map +1 -1
  127. package/lib/opLifecycle/batchManager.js +35 -15
  128. package/lib/opLifecycle/batchManager.js.map +1 -1
  129. package/lib/opLifecycle/definitions.d.ts +1 -1
  130. package/lib/opLifecycle/definitions.d.ts.map +1 -1
  131. package/lib/opLifecycle/definitions.js.map +1 -1
  132. package/lib/opLifecycle/index.d.ts +2 -2
  133. package/lib/opLifecycle/index.d.ts.map +1 -1
  134. package/lib/opLifecycle/index.js +2 -2
  135. package/lib/opLifecycle/index.js.map +1 -1
  136. package/lib/opLifecycle/opCompressor.d.ts.map +1 -1
  137. package/lib/opLifecycle/opCompressor.js +12 -8
  138. package/lib/opLifecycle/opCompressor.js.map +1 -1
  139. package/lib/opLifecycle/opGroupingManager.d.ts.map +1 -1
  140. package/lib/opLifecycle/opGroupingManager.js +14 -11
  141. package/lib/opLifecycle/opGroupingManager.js.map +1 -1
  142. package/lib/opLifecycle/opSplitter.d.ts.map +1 -1
  143. package/lib/opLifecycle/opSplitter.js +11 -6
  144. package/lib/opLifecycle/opSplitter.js.map +1 -1
  145. package/lib/opLifecycle/outbox.d.ts +22 -6
  146. package/lib/opLifecycle/outbox.d.ts.map +1 -1
  147. package/lib/opLifecycle/outbox.js +44 -22
  148. package/lib/opLifecycle/outbox.js.map +1 -1
  149. package/lib/opLifecycle/remoteMessageProcessor.d.ts +10 -8
  150. package/lib/opLifecycle/remoteMessageProcessor.d.ts.map +1 -1
  151. package/lib/opLifecycle/remoteMessageProcessor.js +37 -14
  152. package/lib/opLifecycle/remoteMessageProcessor.js.map +1 -1
  153. package/lib/packageVersion.d.ts +1 -1
  154. package/lib/packageVersion.d.ts.map +1 -1
  155. package/lib/packageVersion.js +1 -1
  156. package/lib/packageVersion.js.map +1 -1
  157. package/lib/pendingStateManager.d.ts +37 -13
  158. package/lib/pendingStateManager.d.ts.map +1 -1
  159. package/lib/pendingStateManager.js +95 -45
  160. package/lib/pendingStateManager.js.map +1 -1
  161. package/lib/public.d.ts +1 -1
  162. package/lib/scheduleManager.js +4 -0
  163. package/lib/scheduleManager.js.map +1 -1
  164. package/lib/summary/summarizerNode/summarizerNodeUtils.d.ts.map +1 -1
  165. package/lib/summary/summarizerNode/summarizerNodeUtils.js +2 -0
  166. package/lib/summary/summarizerNode/summarizerNodeUtils.js.map +1 -1
  167. package/lib/summary/summaryFormat.d.ts.map +1 -1
  168. package/lib/summary/summaryFormat.js +4 -1
  169. package/lib/summary/summaryFormat.js.map +1 -1
  170. package/package.json +46 -31
  171. package/src/blobManager/blobManager.ts +19 -0
  172. package/src/channelCollection.ts +48 -11
  173. package/src/containerRuntime.ts +203 -133
  174. package/src/dataStoreContext.ts +22 -4
  175. package/src/gc/garbageCollection.ts +15 -10
  176. package/src/gc/gcDefinitions.ts +7 -2
  177. package/src/gc/gcHelpers.ts +18 -6
  178. package/src/gc/gcTelemetry.ts +20 -8
  179. package/src/metadata.ts +11 -1
  180. package/src/opLifecycle/README.md +0 -8
  181. package/src/opLifecycle/batchManager.ts +49 -16
  182. package/src/opLifecycle/definitions.ts +1 -1
  183. package/src/opLifecycle/index.ts +13 -2
  184. package/src/opLifecycle/opCompressor.ts +12 -8
  185. package/src/opLifecycle/opGroupingManager.ts +14 -11
  186. package/src/opLifecycle/opSplitter.ts +10 -6
  187. package/src/opLifecycle/outbox.ts +64 -26
  188. package/src/opLifecycle/remoteMessageProcessor.ts +56 -17
  189. package/src/packageVersion.ts +1 -1
  190. package/src/pendingStateManager.ts +173 -74
  191. package/src/scheduleManager.ts +6 -2
  192. package/src/summary/README.md +81 -0
  193. package/src/summary/summarizerNode/summarizerNodeUtils.ts +3 -1
  194. package/src/summary/summaryFormat.ts +3 -1
  195. package/src/summary/summaryFormats.md +69 -8
  196. package/tsconfig.json +0 -1
  197. package/src/summary/images/appTree.png +0 -0
  198. package/src/summary/images/protocolAndAppTree.png +0 -0
  199. package/src/summary/images/summaryTree.png +0 -0
@@ -14,15 +14,18 @@ import {
14
14
  extractSafePropertiesFromMessage,
15
15
  } from "@fluidframework/telemetry-utils/internal";
16
16
  import Deque from "double-ended-queue";
17
+ import { v4 as uuid } from "uuid";
17
18
 
18
- import { InboundSequencedContainerRuntimeMessage } from "./messageTypes.js";
19
- import { IBatchMetadata } from "./metadata.js";
20
- import type { BatchMessage } from "./opLifecycle/index.js";
19
+ import { type InboundSequencedContainerRuntimeMessage } from "./messageTypes.js";
20
+ import { asBatchMetadata, IBatchMetadata } from "./metadata.js";
21
+ import { BatchId, BatchMessage, generateBatchId } from "./opLifecycle/index.js";
21
22
  import { pkgVersion } from "./packageVersion.js";
22
23
 
23
24
  /**
24
25
  * This represents a message that has been submitted and is added to the pending queue when `submit` is called on the
25
26
  * ContainerRuntime. This message has either not been ack'd by the server or has not been submitted to the server yet.
27
+ *
28
+ * @remarks This is the current serialization format for pending local state when a Container is serialized.
26
29
  */
27
30
  export interface IPendingMessage {
28
31
  type: "message";
@@ -31,9 +34,30 @@ export interface IPendingMessage {
31
34
  localOpMetadata: unknown;
32
35
  opMetadata: Record<string, unknown> | undefined;
33
36
  sequenceNumber?: number;
34
- batchStartCsn?: number;
37
+ /** Info needed to compute the batchId on reconnect */
38
+ batchIdContext: {
39
+ /** The Batch's original clientId, from when it was first flushed to be submitted */
40
+ clientId: string;
41
+ /**
42
+ * The Batch's original clientSequenceNumber, from when it was first flushed to be submitted
43
+ * @remarks A negative value means it was not yet submitted when queued here (e.g. disconnected right before flush fired)
44
+ */
45
+ batchStartCsn: number;
46
+ };
35
47
  }
36
48
 
49
+ type Patch<T, U> = U & Omit<T, keyof U>;
50
+
51
+ /** First version of the type (pre-dates batchIdContext) */
52
+ type IPendingMessageV0 = Patch<IPendingMessage, { batchIdContext?: undefined }>;
53
+
54
+ /**
55
+ * Union of all supported schemas for when applying stashed ops
56
+ *
57
+ * @remarks When the format changes, this type should update to reflect all possible schemas.
58
+ */
59
+ type IPendingMessageFromStash = IPendingMessageV0 | IPendingMessage;
60
+
37
61
  export interface IPendingLocalState {
38
62
  /**
39
63
  * list of pending states, including ops and batch information
@@ -41,19 +65,18 @@ export interface IPendingLocalState {
41
65
  pendingStates: IPendingMessage[];
42
66
  }
43
67
 
44
- export interface IPendingBatchMessage {
45
- content: string;
46
- localOpMetadata: unknown;
47
- opMetadata: Record<string, unknown> | undefined;
48
- }
68
+ /** Info needed to replay/resubmit a pending message */
69
+ export type PendingMessageResubmitData = Pick<
70
+ IPendingMessage,
71
+ "content" | "localOpMetadata" | "opMetadata"
72
+ >;
49
73
 
50
74
  export interface IRuntimeStateHandler {
51
75
  connected(): boolean;
52
76
  clientId(): string | undefined;
53
77
  close(error?: ICriticalContainerError): void;
54
78
  applyStashedOp(content: string): Promise<unknown>;
55
- reSubmit(message: IPendingBatchMessage): void;
56
- reSubmitBatch(batch: IPendingBatchMessage[]): void;
79
+ reSubmitBatch(batch: PendingMessageResubmitData[], batchId: BatchId): void;
57
80
  isActiveConnection: () => boolean;
58
81
  isAttached: () => boolean;
59
82
  }
@@ -95,15 +118,19 @@ function withoutLocalOpMetadata(message: IPendingMessage): IPendingMessage {
95
118
  * It verifies that all the ops are acked, are received in the right order and batch information is correct.
96
119
  */
97
120
  export class PendingStateManager implements IDisposable {
121
+ /** Messages that will need to be resubmitted if not ack'd before the next reconnection */
98
122
  private readonly pendingMessages = new Deque<IPendingMessage>();
99
- // This queue represents already acked messages.
100
- private readonly initialMessages = new Deque<IPendingMessage>();
123
+ /** Messages stashed from a previous container, now being rehydrated. Need to be resubmitted. */
124
+ private readonly initialMessages = new Deque<IPendingMessageFromStash>();
101
125
 
102
126
  /**
103
127
  * Sequenced local ops that are saved when stashing since pending ops may depend on them
104
128
  */
105
129
  private savedOps: IPendingMessage[] = [];
106
130
 
131
+ /** Used to stand in for batchStartCsn for messages that weren't submitted (so no CSN) */
132
+ private negativeCounter: number = -1;
133
+
107
134
  private readonly disposeOnce = new Lazy<void>(() => {
108
135
  this.initialMessages.clear();
109
136
  this.pendingMessages.clear();
@@ -116,7 +143,8 @@ export class PendingStateManager implements IDisposable {
116
143
  // the correct batch metadata.
117
144
  private pendingBatchBeginMessage: ISequencedDocumentMessage | undefined;
118
145
 
119
- private clientId: string | undefined;
146
+ /** Used to ensure we don't replay ops on the same connection twice */
147
+ private clientIdFromLastReplay: string | undefined;
120
148
 
121
149
  /**
122
150
  * The pending messages count. Includes `pendingMessages` and `initialMessages` to keep in sync with
@@ -176,11 +204,11 @@ export class PendingStateManager implements IDisposable {
176
204
 
177
205
  constructor(
178
206
  private readonly stateHandler: IRuntimeStateHandler,
179
- initialLocalState: IPendingLocalState | undefined,
180
- private readonly logger: ITelemetryLoggerExt | undefined,
207
+ stashedLocalState: IPendingLocalState | undefined,
208
+ private readonly logger: ITelemetryLoggerExt,
181
209
  ) {
182
- if (initialLocalState?.pendingStates) {
183
- this.initialMessages.push(...initialLocalState.pendingStates);
210
+ if (stashedLocalState?.pendingStates) {
211
+ this.initialMessages.push(...stashedLocalState.pendingStates);
184
212
  }
185
213
  }
186
214
 
@@ -197,6 +225,16 @@ export class PendingStateManager implements IDisposable {
197
225
  * or undefined if the batch was not yet sent (e.g. by the time we flushed we lost the connection)
198
226
  */
199
227
  public onFlushBatch(batch: BatchMessage[], clientSequenceNumber: number | undefined) {
228
+ // If we're connected this is the client of the current connection,
229
+ // otherwise it's the clientId that just disconnected
230
+ // It's only undefined if we've NEVER connected. This is a tight corner case and we can
231
+ // simply make up a unique ID in this case.
232
+ const clientId = this.stateHandler.clientId() ?? uuid();
233
+
234
+ // If the batch was not yet sent, we need to assign a unique batchStartCsn
235
+ // Use a negative number to distinguish these from real CSNs
236
+ const batchStartCsn = clientSequenceNumber ?? this.negativeCounter--;
237
+
200
238
  for (const message of batch) {
201
239
  const {
202
240
  contents: content = "",
@@ -210,7 +248,8 @@ export class PendingStateManager implements IDisposable {
210
248
  content,
211
249
  localOpMetadata,
212
250
  opMetadata,
213
- batchStartCsn: clientSequenceNumber,
251
+ // Note: We only need this on the first message.
252
+ batchIdContext: { clientId, batchStartCsn },
214
253
  };
215
254
  this.pendingMessages.push(pendingMessage);
216
255
  }
@@ -245,6 +284,7 @@ export class PendingStateManager implements IDisposable {
245
284
  } else {
246
285
  nextMessage.localOpMetadata = localOpMetadata;
247
286
  // then we push onto pendingMessages which will cause PendingStateManager to resubmit when we connect
287
+ patchBatchIdContext(nextMessage); // Back compat
248
288
  this.pendingMessages.push(nextMessage);
249
289
  }
250
290
  } catch (error) {
@@ -253,6 +293,25 @@ export class PendingStateManager implements IDisposable {
253
293
  }
254
294
  }
255
295
 
296
+ /**
297
+ * Processes the incoming batch from the server. It verifies that messages are received in the right order and
298
+ * that the batch information is correct.
299
+ * @param batch - The batch that is being processed.
300
+ * @param batchStartCsn - The clientSequenceNumber of the start of this message's batch
301
+ */
302
+ public processPendingLocalBatch(
303
+ batch: InboundSequencedContainerRuntimeMessage[],
304
+ batchStartCsn: number,
305
+ ): {
306
+ message: InboundSequencedContainerRuntimeMessage;
307
+ localOpMetadata: unknown;
308
+ }[] {
309
+ return batch.map((message) => ({
310
+ message,
311
+ localOpMetadata: this.processPendingLocalMessage(message, batchStartCsn),
312
+ }));
313
+ }
314
+
256
315
  /**
257
316
  * Processes a local message once its ack'd by the server. It verifies that there was no data corruption and that
258
317
  * the batch information was preserved for batch messages.
@@ -260,37 +319,25 @@ export class PendingStateManager implements IDisposable {
260
319
  * @param batchStartCsn - The clientSequenceNumber of the start of this message's batch (assigned during submit)
261
320
  * (not to be confused with message.clientSequenceNumber - the overwritten value in case of grouped batching)
262
321
  */
263
- public processPendingLocalMessage(
322
+ private processPendingLocalMessage(
264
323
  message: InboundSequencedContainerRuntimeMessage,
265
324
  batchStartCsn: number,
266
325
  ): unknown {
267
- // Pre-processing part - This may be the start of a batch.
268
- this.maybeProcessBatchBegin(message);
269
326
  // Get the next message from the pending queue. Verify a message exists.
270
327
  const pendingMessage = this.pendingMessages.peekFront();
271
328
  assert(
272
329
  pendingMessage !== undefined,
273
330
  0x169 /* "No pending message found for this remote message" */,
274
331
  );
332
+
333
+ // This may be the start of a batch.
334
+ this.maybeProcessBatchBegin(message, batchStartCsn, pendingMessage);
335
+
275
336
  pendingMessage.sequenceNumber = message.sequenceNumber;
276
337
  this.savedOps.push(withoutLocalOpMetadata(pendingMessage));
277
338
 
278
339
  this.pendingMessages.shift();
279
340
 
280
- if (pendingMessage.batchStartCsn !== batchStartCsn) {
281
- this.logger?.sendErrorEvent({
282
- eventName: "BatchClientSequenceNumberMismatch",
283
- details: {
284
- processingBatch: !!this.pendingBatchBeginMessage,
285
- pendingBatchCsn: pendingMessage.batchStartCsn,
286
- batchStartCsn,
287
- messageBatchMetadata: (message.metadata as any)?.batch,
288
- pendingMessageBatchMetadata: (pendingMessage.opMetadata as any)?.batch,
289
- },
290
- messageDetails: extractSafePropertiesFromMessage(message),
291
- });
292
- }
293
-
294
341
  const messageContent = buildPendingMessageContent(message);
295
342
 
296
343
  // Stringified content should match
@@ -317,8 +364,31 @@ export class PendingStateManager implements IDisposable {
317
364
  /**
318
365
  * This message could be the first message in batch. If so, set batch state marking the beginning of a batch.
319
366
  * @param message - The message that is being processed.
367
+ * @param batchStartCsn - The clientSequenceNumber of the start of this message's batch (assigned during submit)
368
+ * @param pendingMessage - The corresponding pendingMessage.
320
369
  */
321
- private maybeProcessBatchBegin(message: ISequencedDocumentMessage) {
370
+ private maybeProcessBatchBegin(
371
+ message: ISequencedDocumentMessage,
372
+ batchStartCsn: number,
373
+ pendingMessage: IPendingMessage,
374
+ ) {
375
+ if (!this.isProcessingBatch) {
376
+ // Expecting the start of a batch (maybe single-message).
377
+ if (pendingMessage.batchIdContext.batchStartCsn !== batchStartCsn) {
378
+ this.logger?.sendErrorEvent({
379
+ eventName: "BatchClientSequenceNumberMismatch",
380
+ details: {
381
+ processingBatch: !!this.pendingBatchBeginMessage,
382
+ pendingBatchCsn: pendingMessage.batchIdContext.batchStartCsn,
383
+ batchStartCsn,
384
+ messageBatchMetadata: (message.metadata as any)?.batch,
385
+ pendingMessageBatchMetadata: (pendingMessage.opMetadata as any)?.batch,
386
+ },
387
+ messageDetails: extractSafePropertiesFromMessage(message),
388
+ });
389
+ }
390
+ }
391
+
322
392
  // This message is the first in a batch if the "batch" property on the metadata is set to true
323
393
  if ((message.metadata as IBatchMetadata | undefined)?.batch) {
324
394
  // We should not already be processing a batch and there should be no pending batch begin message.
@@ -406,10 +476,10 @@ export class PendingStateManager implements IDisposable {
406
476
 
407
477
  // This assert suggests we are about to send same ops twice, which will result in data loss.
408
478
  assert(
409
- this.clientId !== this.stateHandler.clientId(),
479
+ this.clientIdFromLastReplay !== this.stateHandler.clientId(),
410
480
  0x173 /* "replayPendingStates called twice for same clientId!" */,
411
481
  );
412
- this.clientId = this.stateHandler.clientId();
482
+ this.clientIdFromLastReplay = this.stateHandler.clientId();
413
483
 
414
484
  assert(
415
485
  this.initialMessages.isEmpty(),
@@ -426,54 +496,72 @@ export class PendingStateManager implements IDisposable {
426
496
  // eslint-disable-next-line @typescript-eslint/no-non-null-assertion
427
497
  let pendingMessage = this.pendingMessages.shift()!;
428
498
  remainingPendingMessagesCount--;
429
- assert(
430
- pendingMessage.opMetadata?.batch !== false,
431
- 0x41b /* We cannot process batches in chunks */,
432
- );
499
+
500
+ const batchMetadataFlag = asBatchMetadata(pendingMessage.opMetadata)?.batch;
501
+ assert(batchMetadataFlag !== false, 0x41b /* We cannot process batches in chunks */);
502
+
503
+ // The next message starts a batch (possibly single-message), and we'll need its batchId.
504
+ // We'll find batchId on this message if it was previously generated.
505
+ // Otherwise, generate it now - this is the first time resubmitting this batch.
506
+ const batchId =
507
+ asBatchMetadata(pendingMessage.opMetadata)?.batchId ??
508
+ generateBatchId(
509
+ pendingMessage.batchIdContext.clientId,
510
+ pendingMessage.batchIdContext.batchStartCsn,
511
+ );
433
512
 
434
513
  /**
435
- * We want to ensure grouped messages get processed in a batch.
514
+ * We must preserve the distinct batches on resubmit.
436
515
  * Note: It is not possible for the PendingStateManager to receive a partially acked batch. It will
437
- * either receive the whole batch ack or nothing at all.
516
+ * either receive the whole batch ack or nothing at all. @see ScheduleManager for how this works.
438
517
  */
439
- if (pendingMessage.opMetadata?.batch) {
440
- assert(
441
- remainingPendingMessagesCount > 0,
442
- 0x554 /* Last pending message cannot be a batch begin */,
518
+ if (batchMetadataFlag === undefined) {
519
+ // Single-message batch
520
+
521
+ this.stateHandler.reSubmitBatch(
522
+ [
523
+ {
524
+ content: pendingMessage.content,
525
+ localOpMetadata: pendingMessage.localOpMetadata,
526
+ opMetadata: pendingMessage.opMetadata,
527
+ },
528
+ ],
529
+ batchId,
443
530
  );
531
+ continue;
532
+ }
533
+ // else: batchMetadataFlag === true (It's a typical multi-message batch)
444
534
 
445
- const batch: IPendingBatchMessage[] = [];
446
-
447
- // check is >= because batch end may be last pending message
448
- while (remainingPendingMessagesCount >= 0) {
449
- batch.push({
450
- content: pendingMessage.content,
451
- localOpMetadata: pendingMessage.localOpMetadata,
452
- opMetadata: pendingMessage.opMetadata,
453
- });
535
+ assert(
536
+ remainingPendingMessagesCount > 0,
537
+ 0x554 /* Last pending message cannot be a batch begin */,
538
+ );
454
539
 
455
- if (pendingMessage.opMetadata?.batch === false) {
456
- break;
457
- }
458
- assert(remainingPendingMessagesCount > 0, 0x555 /* No batch end found */);
459
-
460
- // eslint-disable-next-line @typescript-eslint/no-non-null-assertion
461
- pendingMessage = this.pendingMessages.shift()!;
462
- remainingPendingMessagesCount--;
463
- assert(
464
- pendingMessage.opMetadata?.batch !== true,
465
- 0x556 /* Batch start needs a corresponding batch end */,
466
- );
467
- }
540
+ const batch: PendingMessageResubmitData[] = [];
468
541
 
469
- this.stateHandler.reSubmitBatch(batch);
470
- } else {
471
- this.stateHandler.reSubmit({
542
+ // check is >= because batch end may be last pending message
543
+ while (remainingPendingMessagesCount >= 0) {
544
+ batch.push({
472
545
  content: pendingMessage.content,
473
546
  localOpMetadata: pendingMessage.localOpMetadata,
474
547
  opMetadata: pendingMessage.opMetadata,
475
548
  });
549
+
550
+ if (pendingMessage.opMetadata?.batch === false) {
551
+ break;
552
+ }
553
+ assert(remainingPendingMessagesCount > 0, 0x555 /* No batch end found */);
554
+
555
+ // eslint-disable-next-line @typescript-eslint/no-non-null-assertion
556
+ pendingMessage = this.pendingMessages.shift()!;
557
+ remainingPendingMessagesCount--;
558
+ assert(
559
+ pendingMessage.opMetadata?.batch !== true,
560
+ 0x556 /* Batch start needs a corresponding batch end */,
561
+ );
476
562
  }
563
+
564
+ this.stateHandler.reSubmitBatch(batch, batchId);
477
565
  }
478
566
 
479
567
  // pending ops should no longer depend on previous sequenced local ops after resubmit
@@ -491,3 +579,14 @@ export class PendingStateManager implements IDisposable {
491
579
  }
492
580
  }
493
581
  }
582
+
583
+ /** For back-compat if trying to apply stashed ops that pre-date batchIdContext */
584
+ function patchBatchIdContext(
585
+ message: IPendingMessageFromStash,
586
+ ): asserts message is IPendingMessage {
587
+ const batchIdContext: IPendingMessageFromStash["batchIdContext"] = message.batchIdContext;
588
+ if (batchIdContext === undefined) {
589
+ // Using uuid guarantees uniqueness, retaining existing behavior
590
+ message.batchIdContext = { clientId: uuid(), batchStartCsn: -1 };
591
+ }
592
+ }
@@ -124,7 +124,9 @@ class ScheduleManagerCore {
124
124
  }
125
125
 
126
126
  // First message will have the batch flag set to true if doing a batched send
127
- const firstMessageMetadata = messages[0].metadata as IRuntimeMessageMetadata;
127
+ // Non null asserting because of the length check above
128
+ // eslint-disable-next-line @typescript-eslint/no-non-null-assertion
129
+ const firstMessageMetadata = messages[0]!.metadata as IRuntimeMessageMetadata;
128
130
  if (!firstMessageMetadata?.batch) {
129
131
  return;
130
132
  }
@@ -136,7 +138,9 @@ class ScheduleManagerCore {
136
138
  }
137
139
 
138
140
  // Set the batch flag to false on the last message to indicate the end of the send batch
139
- const lastMessage = messages[messages.length - 1];
141
+ // Non null asserting here because of the length check at the start of the function
142
+ // eslint-disable-next-line @typescript-eslint/no-non-null-assertion
143
+ const lastMessage = messages[messages.length - 1]!;
140
144
  // TODO: It's not clear if this shallow clone is required, as opposed to just setting "batch" to false.
141
145
  // eslint-disable-next-line @typescript-eslint/no-unnecessary-type-assertion
142
146
  lastMessage.metadata = { ...(lastMessage.metadata as any), batch: false };
@@ -0,0 +1,81 @@
1
+ ## Table of contents
2
+
3
+ - [Introduction](#introduction)
4
+ - [Summary vs snapshot](#summary-vs-snapshot)
5
+ - [Why do we need summaries?](#why-do-we-need-summaries)
6
+ - [Who generates summaries?](#who-generates-summaries)
7
+ - [When are summaries generated?](#when-are-summaries-generated)
8
+ - [How are summaries generated?](#how-are-summaries-generated)
9
+ - [Summary Lifecycle](#summary-lifecycle)
10
+ - [Single-commit vs two-commit summaries](#single-commit-vs-two-commit-summaries)
11
+ - [Incremental summaries](#incremental-summaries)
12
+ - [Resiliency](#resiliency)
13
+ - [What does a summary look like?](#what-does-a-summary-look-like)
14
+
15
+ ## Introduction
16
+
17
+ This document provides a conceptual overview of summarization without going into a lot of technical or implementation details. It describes what summaries are, how / when they are generated, and what they look like. The goal is for this to be an entry point into summarization for users and developers alike.
18
+
19
+ ### Summary vs snapshot
20
+
21
+ The terms summary and snapshot are sometimes used interchangeably. Both represent the state of a container at a point in time. They differ in some respects which are described in [this FAQ](https://fluidframework.com/docs/faq/#summarization).
22
+
23
+ ## Why do we need summaries?
24
+
25
+ A 'summary' captures the state of a container at a point in time. Without it, a client would have to apply every operation in the op log, even if those operations no longer affected the current state (e.g. op 1 inserts ‘h’ and op 2 deletes ‘h’). For very large op logs, this would be very expensive for clients to both process and download from the service.
26
+ Instead, when a client opens a collaborative document, it can download a snapshot of the container, and simply process new operations from that point forward.
27
+
28
+ ## Who generates summaries?
29
+
30
+ Summaries can be generated by any client connected in "write" mode. In the current implementation, summaries are generated by a separate non-interactive client. Using a separate client is an optimization - this client doesn't have to take local changes into account which can make the summary process more complicated. A summarizer client has the following characteristics:
31
+
32
+ - All the clients connected to the document participate in a process called "summary client election" to elect a "parent summarizer" client. Typically, it's the oldest "write" client connected to the document. The parent summarizer client spawns a "summarizer" client which is responsible for summarization.
33
+ - A summarizer client is like any other client connected to the document except that users cannot interact with this client, and it only works on the state it receives from other clients. It has a brand-new container with its own connection to services.
34
+
35
+ > Note that if the summarizer client closes, the "summary client election" process will choose a new one, if applicable. The default "summary client election" algorithm is to select the oldest "write" client as the parent summarizer client which in turn will create the summarizer client.
36
+
37
+ ## When are summaries generated?
38
+
39
+ The summarizer client periodically generates a summary based on heuristics derived from configurations such as the number of user or system ops received, the amount of time a client has been idle (hasn't received any ops), the maximum time since the last summary, the maximum number of ops since the last summary, etc. The heuristic configurations are defined by an _ISummaryConfigurationHeuristics_ interface defined in [this file](../../src/containerRuntime.ts).
40
+
41
+ The summarizer client uses a default set of configurations defined by _DefaultSummaryConfiguration_ in [this file](../../src/containerRuntime.ts). These can be overridden by providing a new set of configurations as part of container runtime options during creation.
42
+
43
+ ## How are summaries generated?
44
+
45
+ When summarization process is triggered, every object in the container's object tree that has data to be summarized is asked to generate its summary, starting at the container runtime which is at the root. There are various objects that participate in the summary process and generate its summary such as data stores, DDSes, garbage collector, blob manager, id compressor, etc. Note that the user data is in the DDSes.
46
+
47
+ ### Summary Lifecycle
48
+
49
+ The lifecycle of a summary starts when a "parent summarizer" client is elected.
50
+ - The parent summarizer spawns a non-interactive summarizer client.
51
+ - The summarizer client periodically starts a summary as per heuristics. A summary happens at a particular sequence number called the "summary sequence number" or reference sequence number for the summary.
52
+ - The container runtime (or simply runtime) generates a summary tree (described in the ["What does a summary look like?"](#what-does-a-summary-look-like) section below).
53
+ - The runtime uploads the summary tree to the Fluid storage service which returns a handle (unique id) to the summary if the upload is successful. Otherwise, it returns a failure.
54
+ - The runtime submits a "summarize" op to the Fluid ordering service containing the uploaded summary handle and the summary sequence number.
55
+ - The ordering service stamps it with a sequence number (like any other op) and broadcasts the summarize op.
56
+ - Another service on the server responds to the summarize op.
57
+ - If the summary is accepted, it sends a "summary ack" with the summary sequence number and a summary handle. This handle may or may not be the same as the one in summary op depending on whether this is a single-commit or two-commit summary. More details on this below.
58
+ - If the summary is rejected, it sends a "summary nack" with the details of the summary op.
59
+ - The runtime completes the summary process on receiving the summary ack / nack. The runtime has a timeout called "maxAckWaitTime" and if the summary op, ack or nack is not received within this time, it will fail this summary.
60
+
61
+ ### Single-commit vs two-commit summaries
62
+
63
+ By default, Fluid uses "two-commit summaries" mode where the two commits refer to the storage committing the summary twice and returning two different handles for it - One when the summary is uploaded and second, on responding to the summary op via a summary ack. In this mode, when the server receives the summary op, it augments the corresponding summary with a "protocol" blob hence generating a new commit and new handle for this summary which it returns in the summary ack.
64
+
65
+ Fluid is switching to "single-commit summary" mode where the client adds the "protocol" blob when uploading the summary. Thus, the server doesn't need to augment the summary and the summary ack is no longer required. As soon as the summary is uploaded (first commit), the summary process is complete. The "summarize" op then is just a way to indicate that a summary happened, and it has the details of the summary.
66
+
67
+ ### Incremental summaries
68
+
69
+ Summaries are incremental, i.e., if an object (or node) did not change since the last summary, it doesn't have to re-summarize its entire contents. Fluid supports the concept of a summary handle defined in [this file](../../../../../common/lib/protocol-definitions/src/summary.ts). A handle is a path to a subtree in a snapshot and it allows objects to reference a subtree in the previous snapshot, which is essentially an instruction to storage to find that subtree and populate it into the new summary.
70
+
71
+ Say that a data store or DDS did not change since the last summary, it doesn't have to go through the whole summary process described above. It can instead return an ISummaryHandle with path to its subtree in the previous snapshot. The same applies to other types of content like a single content blob within an object's summary tree.
72
+
73
+ ### Resiliency
74
+
75
+ The summarization process is designed to be resilient - Basically, a document will eventually summarize and make progress even if there are intermittent failures or disruptions. Some examples of steps taken to achieve this:
76
+ - Last summary - Usually, if the "parent summarizer" client disconnects or shuts down, the "summarizer" client also shuts down and the summarizer election process begins. However, if there are a certain number of un-summarized ops, the summarizer client will perform a "last summary" even if the parent shuts down. This is done to make progress in scenarios where new summarizer clients are closed quickly because the parent summarizer keeps disconnecting repeatedly.
77
+ - Retries - The summarizer has a retry mechanism which can identify certain types of intermittent failures either in the client or in the server. It will retry the summary attempt for these failures a certain number of times. This helps in cases where there are intermittent failures such as throttling errors from the server which go away after waiting for a while.
78
+
79
+ ## What does a summary look like?
80
+
81
+ The format of a summary is described in [summary formats](./summaryFormats.md).
@@ -73,7 +73,9 @@ export class EscapedPath {
73
73
  public static createAndConcat(pathParts: string[]): EscapedPath {
74
74
  let ret = EscapedPath.create(pathParts[0] ?? "");
75
75
  for (let i = 1; i < pathParts.length; i++) {
76
- ret = ret.concat(EscapedPath.create(pathParts[i]));
76
+ // Non null asserting here since we are iterating over pathParts
77
+ // eslint-disable-next-line @typescript-eslint/no-non-null-assertion
78
+ ret = ret.concat(EscapedPath.create(pathParts[i]!));
77
79
  }
78
80
  return ret;
79
81
  }
@@ -281,7 +281,9 @@ export async function getFluidDataStoreAttributes(
281
281
  ): Promise<ReadFluidDataStoreAttributes> {
282
282
  const attributes = await readAndParse<ReadFluidDataStoreAttributes>(
283
283
  storage,
284
- snapshot.blobs[dataStoreAttributesBlobName],
284
+ // TODO why are we non null asserting here?
285
+ // eslint-disable-next-line @typescript-eslint/no-non-null-assertion
286
+ snapshot.blobs[dataStoreAttributesBlobName]!,
285
287
  );
286
288
  // Use the snapshotFormatVersion to determine how the pkg is encoded in the snapshot.
287
289
  // For snapshotFormatVersion = "0.1" (1) or above, pkg is jsonified, otherwise it is just a string.
@@ -104,13 +104,26 @@ Each node in a snapshot tree is represented by the above interface and contains
104
104
  This section shows what a typical summary or snapshot tree in a container looks like. Some key things to note:
105
105
 
106
106
  - The diagrams in this section show some examples of existing blobs / trees that are added at each node and doesn't show an exhaustive list.
107
- - The blue boxes represent tree nodes.
108
- - The green boxes represent blobs.
109
- - The purple boxes represent attachments.
110
- - The orange boxes represent other nodes - either existing nodes that are not shown or new nodes that may be added in the future. A node can be a tree, blob or attachment.
107
+ - The blue boxes represent summary tree nodes.
108
+ - The green boxes represent summary blobs.
109
+ - The purple boxes represent summary attachments.
110
+ - The pink boxes represent other nodes - either existing nodes that are not shown or new nodes that may be added in the future. A node can be a tree, blob or attachment.
111
111
 
112
112
  A typical tree uploaded to or downloaded from storage looks like the following:
113
- ![ProtocolAndAppTree](./images/protocolAndAppTree.png)
113
+
114
+ ```mermaid
115
+ flowchart TD
116
+ classDef tree fill:#4672c4,color:#fff
117
+ classDef blob fill:#538135,color:#fff
118
+ classDef others fill:#d636bb,stroke:#4672c4,stroke-width:1px,color:#fff,stroke-dasharray: 5 5
119
+ A["/"]:::tree --> B[".protocol"]:::tree
120
+ B --> C[attributes]:::blob
121
+ B --> D["quorum members"]:::blob
122
+ B --> E["quorum proposals"]:::blob
123
+ B --> F["quorum values"]:::blob
124
+ B --> G["other nodes"]:::others
125
+ A --> H[".app (described below)"]:::tree
126
+ ```
114
127
 
115
128
  `Protocol tree` - This is the tree named `.protocol` and contains protocol level information for the container. These are used by the container to initialize.
116
129
 
@@ -129,8 +142,38 @@ The contents of the protocol tree are:
129
142
  ### App tree
130
143
 
131
144
  This is what the ".app" tree looks like which is generated by the container runtime during summary upload. The same is passed to container runtime during snapshot download:
132
- ![appTree](./images/appTree.png)
133
145
 
146
+ ```mermaid
147
+ flowchart TD
148
+ classDef tree fill:#4672c4,color:#fff
149
+ classDef blob fill:#538135,color:#fff
150
+ classDef attachment fill:#904fc2,color:#fff
151
+ classDef others fill:#d636bb,stroke:#4672c4,stroke-width:1px,color:#fff,stroke-dasharray: 5 5
152
+ classDef hidden display:none;
153
+ A[".app"]:::tree --> B[.metadata]:::blob
154
+ A --> C[.aliases]:::blob
155
+ A --> D[.idCompressor]:::blob
156
+ A --> E[.channels]:::tree
157
+ E --> F["data store 1"]:::tree
158
+ E --> G["data store 2"]:::tree
159
+ E --> H["data store N"]:::tree
160
+ G --> I[.components]:::blob
161
+ G --> J[.channels]:::tree
162
+ J --> K[.channels]:::tree
163
+ J --> L[DDS2]:::tree
164
+ L --> M[.attributes]:::blob
165
+ L --> N["other nodes"]:::others
166
+ N --> END:::hidden
167
+ L --> O[.header]:::blob
168
+ J --> P[.channels]:::tree
169
+ G --> Q["other nodes"]:::others
170
+ A --> R[gc]:::tree
171
+ A --> S["other nodes"]:::others
172
+ A --> T[.blobs]:::tree
173
+ T --> U["attachment blob 1"]:::attachment
174
+ T --> V["attachment blob N"]:::attachment
175
+
176
+ ```
134
177
  - `Container`: The root represents the container or container runtime node. Its contents are described below:
135
178
 
136
179
  - `.metadata blob` - The container level metadata such as creation time, create version, etc.
@@ -152,9 +195,27 @@ This is what the ".app" tree looks like which is generated by the container runt
152
195
  - `.header blob` - Added by some DDSs and may contain its data. Note that not all DDSs add this.
153
196
  - A DDS may add other blobs and / or trees to represent its data. Basically, a DDS can write its data in any form
154
197
 
155
- ### Summary tree distinction
198
+ ### Summary tree distinction - Incremental summaries
156
199
 
157
200
  In the visualization above, a summary tree differs from a snapshot tree in the following way:
158
201
  A summary tree supports incremental summaries via summary handles. Any node in the tree that has not changed since the previous successful summary can send a summary handle (`ISummaryHandle`) instead of sending its entire contents in a full summary. The following diagram shows this with an example where certain parts of the summary tree use a summary handle. It is a zoomed-in version of the same app tree as above, where the nodes that use summary handles are marked in red:
159
202
 
160
- ![summaryTree](./images/summaryTree.png)
203
+ ```mermaid
204
+ flowchart TD
205
+ classDef tree fill:#4672c4,color:#fff
206
+ classDef blob fill:#538135,color:#fff
207
+ classDef others fill:#d636bb,stroke:#4672c4,stroke-width:1px,color:#fff,stroke-dasharray: 5 5
208
+ classDef handle fill:#cc4343,color:#fff
209
+ A[".app"]:::tree --> B["other nodes"]:::others
210
+ A --> C[.channels]:::tree
211
+ C --> D["handle: '/data store 1'"]:::handle
212
+ C --> E["data store 2"]:::tree
213
+ E --> F[".channels"]:::tree
214
+ F --> G["handle: '/data store 2/DDS 1'"]:::handle
215
+ F --> H["DDS 2"]:::tree
216
+ H --> I["handle: '/data store 2/DDS 2/sub node'"]:::handle
217
+ F --> J["DDS N"]:::tree
218
+ E --> K["other nodes"]:::others
219
+ C --> L["data store N"]:::tree
220
+ A --> M["handle: '/gc'"]:::handle
221
+ ```
package/tsconfig.json CHANGED
@@ -5,7 +5,6 @@
5
5
  "compilerOptions": {
6
6
  "rootDir": "./src",
7
7
  "outDir": "./lib",
8
- "noUncheckedIndexedAccess": false,
9
8
  "exactOptionalPropertyTypes": false,
10
9
  },
11
10
  }
Binary file
Binary file