@fluidframework/container-runtime 2.1.1 → 2.2.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (196)

package/CHANGELOG.md +30 -0
package/README.md +2 -2
package/api-report/container-runtime.legacy.alpha.api.md +4 -3
package/container-runtime.test-files.tar +0 -0
package/dist/batchTracker.d.ts.map +1 -1
package/dist/batchTracker.js.map +1 -1
package/dist/blobManager/blobManager.d.ts.map +1 -1
package/dist/blobManager/blobManager.js +9 -0
package/dist/blobManager/blobManager.js.map +1 -1
package/dist/channelCollection.d.ts +0 -14
package/dist/channelCollection.d.ts.map +1 -1
package/dist/channelCollection.js +2 -12
package/dist/channelCollection.js.map +1 -1
package/dist/containerRuntime.d.ts +34 -6
package/dist/containerRuntime.d.ts.map +1 -1
package/dist/containerRuntime.js +181 -90
package/dist/containerRuntime.js.map +1 -1
package/dist/dataStoreContext.d.ts +9 -18
package/dist/dataStoreContext.d.ts.map +1 -1
package/dist/dataStoreContext.js +40 -78
package/dist/dataStoreContext.js.map +1 -1
package/dist/gc/garbageCollection.d.ts +0 -6
package/dist/gc/garbageCollection.d.ts.map +1 -1
package/dist/gc/garbageCollection.js +23 -66
package/dist/gc/garbageCollection.js.map +1 -1
package/dist/gc/gcConfigs.d.ts.map +1 -1
package/dist/gc/gcConfigs.js +11 -34
package/dist/gc/gcConfigs.js.map +1 -1
package/dist/gc/gcDefinitions.d.ts +9 -52
package/dist/gc/gcDefinitions.d.ts.map +1 -1
package/dist/gc/gcDefinitions.js +3 -23
package/dist/gc/gcDefinitions.js.map +1 -1
package/dist/gc/gcHelpers.d.ts.map +1 -1
package/dist/gc/gcHelpers.js +2 -6
package/dist/gc/gcHelpers.js.map +1 -1
package/dist/gc/gcSummaryStateTracker.d.ts +1 -1
package/dist/gc/gcSummaryStateTracker.d.ts.map +1 -1
package/dist/gc/gcSummaryStateTracker.js +4 -8
package/dist/gc/gcSummaryStateTracker.js.map +1 -1
package/dist/gc/gcTelemetry.d.ts +1 -9
package/dist/gc/gcTelemetry.d.ts.map +1 -1
package/dist/gc/gcTelemetry.js +3 -25
package/dist/gc/gcTelemetry.js.map +1 -1
package/dist/gc/index.d.ts +2 -2
package/dist/gc/index.d.ts.map +1 -1
package/dist/gc/index.js +2 -7
package/dist/gc/index.js.map +1 -1
package/dist/index.d.ts +1 -1
package/dist/index.d.ts.map +1 -1
package/dist/index.js +1 -2
package/dist/index.js.map +1 -1
package/dist/messageTypes.d.ts +6 -5
package/dist/messageTypes.d.ts.map +1 -1
package/dist/messageTypes.js.map +1 -1
package/dist/metadata.d.ts +9 -1
package/dist/metadata.d.ts.map +1 -1
package/dist/metadata.js +6 -1
package/dist/metadata.js.map +1 -1
package/dist/opLifecycle/index.d.ts +1 -1
package/dist/opLifecycle/index.d.ts.map +1 -1
package/dist/opLifecycle/index.js.map +1 -1
package/dist/opLifecycle/opGroupingManager.d.ts +8 -0
package/dist/opLifecycle/opGroupingManager.d.ts.map +1 -1
package/dist/opLifecycle/opGroupingManager.js +34 -2
package/dist/opLifecycle/opGroupingManager.js.map +1 -1
package/dist/opLifecycle/outbox.d.ts +1 -0
package/dist/opLifecycle/outbox.d.ts.map +1 -1
package/dist/opLifecycle/outbox.js +20 -0
package/dist/opLifecycle/outbox.js.map +1 -1
package/dist/opLifecycle/remoteMessageProcessor.d.ts +38 -19
package/dist/opLifecycle/remoteMessageProcessor.d.ts.map +1 -1
package/dist/opLifecycle/remoteMessageProcessor.js +67 -43
package/dist/opLifecycle/remoteMessageProcessor.js.map +1 -1
package/dist/packageVersion.d.ts +1 -1
package/dist/packageVersion.js +1 -1
package/dist/packageVersion.js.map +1 -1
package/dist/pendingStateManager.d.ts +33 -22
package/dist/pendingStateManager.d.ts.map +1 -1
package/dist/pendingStateManager.js +148 -105
package/dist/pendingStateManager.js.map +1 -1
package/dist/summary/summarizerNode/summarizerNode.d.ts.map +1 -1
package/dist/summary/summarizerNode/summarizerNode.js +5 -1
package/dist/summary/summarizerNode/summarizerNode.js.map +1 -1
package/dist/summary/summarizerNode/summarizerNodeWithGc.d.ts +3 -4
package/dist/summary/summarizerNode/summarizerNodeWithGc.d.ts.map +1 -1
package/dist/summary/summarizerNode/summarizerNodeWithGc.js +16 -15
package/dist/summary/summarizerNode/summarizerNodeWithGc.js.map +1 -1
package/lib/batchTracker.d.ts.map +1 -1
package/lib/batchTracker.js.map +1 -1
package/lib/blobManager/blobManager.d.ts.map +1 -1
package/lib/blobManager/blobManager.js +9 -0
package/lib/blobManager/blobManager.js.map +1 -1
package/lib/channelCollection.d.ts +0 -14
package/lib/channelCollection.d.ts.map +1 -1
package/lib/channelCollection.js +2 -11
package/lib/channelCollection.js.map +1 -1
package/lib/containerRuntime.d.ts +34 -6
package/lib/containerRuntime.d.ts.map +1 -1
package/lib/containerRuntime.js +181 -90
package/lib/containerRuntime.js.map +1 -1
package/lib/dataStoreContext.d.ts +9 -18
package/lib/dataStoreContext.d.ts.map +1 -1
package/lib/dataStoreContext.js +27 -65
package/lib/dataStoreContext.js.map +1 -1
package/lib/gc/garbageCollection.d.ts +0 -6
package/lib/gc/garbageCollection.d.ts.map +1 -1
package/lib/gc/garbageCollection.js +25 -68
package/lib/gc/garbageCollection.js.map +1 -1
package/lib/gc/gcConfigs.d.ts.map +1 -1
package/lib/gc/gcConfigs.js +12 -35
package/lib/gc/gcConfigs.js.map +1 -1
package/lib/gc/gcDefinitions.d.ts +9 -52
package/lib/gc/gcDefinitions.d.ts.map +1 -1
package/lib/gc/gcDefinitions.js +2 -22
package/lib/gc/gcDefinitions.js.map +1 -1
package/lib/gc/gcHelpers.d.ts.map +1 -1
package/lib/gc/gcHelpers.js +2 -6
package/lib/gc/gcHelpers.js.map +1 -1
package/lib/gc/gcSummaryStateTracker.d.ts +1 -1
package/lib/gc/gcSummaryStateTracker.d.ts.map +1 -1
package/lib/gc/gcSummaryStateTracker.js +4 -8
package/lib/gc/gcSummaryStateTracker.js.map +1 -1
package/lib/gc/gcTelemetry.d.ts +1 -9
package/lib/gc/gcTelemetry.d.ts.map +1 -1
package/lib/gc/gcTelemetry.js +3 -24
package/lib/gc/gcTelemetry.js.map +1 -1
package/lib/gc/index.d.ts +2 -2
package/lib/gc/index.d.ts.map +1 -1
package/lib/gc/index.js +2 -2
package/lib/gc/index.js.map +1 -1
package/lib/index.d.ts +1 -1
package/lib/index.d.ts.map +1 -1
package/lib/index.js +1 -1
package/lib/index.js.map +1 -1
package/lib/messageTypes.d.ts +6 -5
package/lib/messageTypes.d.ts.map +1 -1
package/lib/messageTypes.js.map +1 -1
package/lib/metadata.d.ts +9 -1
package/lib/metadata.d.ts.map +1 -1
package/lib/metadata.js +4 -0
package/lib/metadata.js.map +1 -1
package/lib/opLifecycle/index.d.ts +1 -1
package/lib/opLifecycle/index.d.ts.map +1 -1
package/lib/opLifecycle/index.js.map +1 -1
package/lib/opLifecycle/opGroupingManager.d.ts +8 -0
package/lib/opLifecycle/opGroupingManager.d.ts.map +1 -1
package/lib/opLifecycle/opGroupingManager.js +34 -2
package/lib/opLifecycle/opGroupingManager.js.map +1 -1
package/lib/opLifecycle/outbox.d.ts +1 -0
package/lib/opLifecycle/outbox.d.ts.map +1 -1
package/lib/opLifecycle/outbox.js +20 -0
package/lib/opLifecycle/outbox.js.map +1 -1
package/lib/opLifecycle/remoteMessageProcessor.d.ts +38 -19
package/lib/opLifecycle/remoteMessageProcessor.d.ts.map +1 -1
package/lib/opLifecycle/remoteMessageProcessor.js +67 -43
package/lib/opLifecycle/remoteMessageProcessor.js.map +1 -1
package/lib/packageVersion.d.ts +1 -1
package/lib/packageVersion.js +1 -1
package/lib/packageVersion.js.map +1 -1
package/lib/pendingStateManager.d.ts +33 -22
package/lib/pendingStateManager.d.ts.map +1 -1
package/lib/pendingStateManager.js +149 -106
package/lib/pendingStateManager.js.map +1 -1
package/lib/summary/summarizerNode/summarizerNode.d.ts.map +1 -1
package/lib/summary/summarizerNode/summarizerNode.js +5 -1
package/lib/summary/summarizerNode/summarizerNode.js.map +1 -1
package/lib/summary/summarizerNode/summarizerNodeWithGc.d.ts +3 -4
package/lib/summary/summarizerNode/summarizerNodeWithGc.d.ts.map +1 -1
package/lib/summary/summarizerNode/summarizerNodeWithGc.js +16 -15
package/lib/summary/summarizerNode/summarizerNodeWithGc.js.map +1 -1
package/package.json +21 -21
package/src/batchTracker.ts +4 -2
package/src/blobManager/blobManager.ts +9 -0
package/src/channelCollection.ts +2 -11
package/src/containerRuntime.ts +216 -121
package/src/dataStoreContext.ts +29 -93
package/src/gc/garbageCollection.ts +26 -79
package/src/gc/gcConfigs.ts +12 -45
package/src/gc/gcDefinitions.ts +10 -55
package/src/gc/gcHelpers.ts +10 -8
package/src/gc/gcSummaryStateTracker.ts +6 -9
package/src/gc/gcTelemetry.ts +3 -38
package/src/gc/index.ts +2 -6
package/src/index.ts +0 -1
package/src/messageTypes.ts +12 -11
package/src/metadata.ts +16 -2
package/src/opLifecycle/index.ts +1 -0
package/src/opLifecycle/opGroupingManager.ts +42 -3
package/src/opLifecycle/outbox.ts +30 -0
package/src/opLifecycle/remoteMessageProcessor.ts +110 -56
package/src/packageVersion.ts +1 -1
package/src/pendingStateManager.ts +209 -168
package/src/summary/README.md +31 -28
package/src/summary/summarizerNode/summarizerNode.ts +6 -1
package/src/summary/summarizerNode/summarizerNodeWithGc.ts +20 -43
package/src/summary/summaryFormats.md +25 -22

package/src/pendingStateManager.ts CHANGED

@@ -3,10 +3,8 @@
  * Licensed under the MIT License.
  */
-import { ICriticalContainerError } from "@fluidframework/container-definitions";
 import { IDisposable } from "@fluidframework/core-interfaces";
 import { assert, Lazy } from "@fluidframework/core-utils/internal";
-import { ISequencedDocumentMessage } from "@fluidframework/driver-definitions/internal";
 import {
 	ITelemetryLoggerExt,
 	DataProcessingError,
@@ -16,10 +14,13 @@ import {
 import Deque from "double-ended-queue";
 import { v4 as uuid } from "uuid";
-import { InboundSequencedContainerRuntimeMessage } from "./messageTypes.js";
-import { asBatchMetadata, IBatchMetadata } from "./metadata.js";
-import { BatchId, BatchMessage, generateBatchId } from "./opLifecycle/index.js";
-import { pkgVersion } from "./packageVersion.js";
+import {
+	type InboundContainerRuntimeMessage,
+	type InboundSequencedContainerRuntimeMessage,
+	type LocalContainerRuntimeMessage,
+} from "./messageTypes.js";
+import { asBatchMetadata, asEmptyBatchLocalOpMetadata } from "./metadata.js";
+import { BatchId, BatchMessage, generateBatchId, InboundBatch } from "./opLifecycle/index.js";
 /**
  * This represents a message that has been submitted and is added to the pending queue when `submit` is called on the
@@ -34,8 +35,8 @@ export interface IPendingMessage {
 	localOpMetadata: unknown;
 	opMetadata: Record<string, unknown> | undefined;
 	sequenceNumber?: number;
-	/** Info needed to compute the batchId on reconnect */
-	batchIdContext: {
+	/** Info about the batch this pending message belongs to, for validation and for computing the batchId on reconnect */
+	batchInfo: {
 		/** The Batch's original clientId, from when it was first flushed to be submitted */
 		clientId: string;
 		/**
@@ -43,13 +44,15 @@ export interface IPendingMessage {
 		 *	@remarks A negative value means it was not yet submitted when queued here (e.g. disconnected right before flush fired)
 		 */
 		batchStartCsn: number;
+		/** length of the batch (how many runtime messages here) */
+		length: number;
 	};
 }
 type Patch<T, U> = U & Omit<T, keyof U>;
-/** First version of the type (pre-dates batchIdContext) */
-type IPendingMessageV0 = Patch<IPendingMessage, { batchIdContext?: undefined }>;
+/** First version of the type (pre-dates batchInfo) */
+type IPendingMessageV0 = Patch<IPendingMessage, { batchInfo?: undefined }>;
 /**
  * Union of all supported schemas for when applying stashed ops
@@ -74,33 +77,47 @@ export type PendingMessageResubmitData = Pick<
 export interface IRuntimeStateHandler {
 	connected(): boolean;
 	clientId(): string | undefined;
-	close(error?: ICriticalContainerError): void;
 	applyStashedOp(content: string): Promise<unknown>;
 	reSubmitBatch(batch: PendingMessageResubmitData[], batchId: BatchId): void;
 	isActiveConnection: () => boolean;
 	isAttached: () => boolean;
 }
-/** Union of keys of T */
-type KeysOfUnion<T extends object> = T extends T ? keyof T : never;
-/** *Partial* type all possible combinations of properties and values of union T.
- * This loosens typing allowing access to all possible properties without
- * narrowing.
- */
-type AnyComboFromUnion<T extends object> = { [P in KeysOfUnion<T>]?: T[P] };
+function isEmptyBatchPendingMessage(message: IPendingMessageFromStash): boolean {
+	const content = JSON.parse(message.content);
+	return content.type === "groupedBatch" && content.contents?.length === 0;
+}
-function buildPendingMessageContent(
-	// AnyComboFromUnion is needed need to gain access to compatDetails that
-	// is only defined for some cases.
-	message: AnyComboFromUnion<InboundSequencedContainerRuntimeMessage>,
-): string {
+function buildPendingMessageContent(message: InboundSequencedContainerRuntimeMessage): string {
 	// IMPORTANT: Order matters here, this must match the order of the properties used
 	// when submitting the message.
-	const { type, contents, compatDetails } = message;
+	const { type, contents, compatDetails }: InboundContainerRuntimeMessage = message;
 	// Any properties that are not defined, won't be emitted by stringify.
 	return JSON.stringify({ type, contents, compatDetails });
 }
+function typesOfKeys<T extends object>(obj: T): Record<keyof T, string> {
+	return Object.keys(obj).reduce((acc, key) => {
+		acc[key] = typeof obj[key];
+		return acc;
+	}, {}) as Record<keyof T, string>;
+}
+function scrubAndStringify(
+	message: InboundContainerRuntimeMessage | LocalContainerRuntimeMessage,
+): string {
+	// Scrub the whole object in case there are unexpected keys
+	const scrubbed: Record<string, unknown> = typesOfKeys(message);
+	// For these known/expected keys, we can either drill in (for contents)
+	// or just use the value as-is (since it's not personal info)
+	scrubbed.contents = message.contents && typesOfKeys(message.contents);
+	scrubbed.compatDetails = message.compatDetails;
+	scrubbed.type = message.type;
+	return JSON.stringify(scrubbed);
+}
 function withoutLocalOpMetadata(message: IPendingMessage): IPendingMessage {
 	return {
 		...message,
@@ -108,6 +125,20 @@ function withoutLocalOpMetadata(message: IPendingMessage): IPendingMessage {
 	};
 }
+/**
+ * Get the effective batch ID for a pending message.
+ * If the batch ID is already present in the message's op metadata, return it.
+ * Otherwise, generate a new batch ID using the client ID and batch start CSN.
+ * @param pendingMessage - The pending message
+ * @returns The effective batch ID
+ */
+function getEffectiveBatchId(pendingMessage: IPendingMessage): string {
+	return (
+		asBatchMetadata(pendingMessage.opMetadata)?.batchId ??
+		generateBatchId(pendingMessage.batchInfo.clientId, pendingMessage.batchInfo.batchStartCsn)
+	);
+}
 /**
  * PendingStateManager is responsible for maintaining the messages that have not been sent or have not yet been
  * acknowledged by the server. It also maintains the batch information for both automatically and manually flushed
@@ -136,13 +167,6 @@ export class PendingStateManager implements IDisposable {
 		this.pendingMessages.clear();
 	});
-	// Indicates whether we are processing a batch.
-	private isProcessingBatch: boolean = false;
-	// This stores the first message in the batch that we are processing. This is used to verify that we get
-	// the correct batch metadata.
-	private pendingBatchBeginMessage: ISequencedDocumentMessage | undefined;
 	/** Used to ensure we don't replay ops on the same connection twice */
 	private clientIdFromLastReplay: string | undefined;
@@ -248,8 +272,8 @@ export class PendingStateManager implements IDisposable {
 				content,
 				localOpMetadata,
 				opMetadata,
-				// Note: We only need this on the first message.
-				batchIdContext: { clientId, batchStartCsn },
+				// Note: We only will read this off the first message, but put it on all for simplicity
+				batchInfo: { clientId, batchStartCsn, length: batch.length },
 			};
 			this.pendingMessages.push(pendingMessage);
 		}
@@ -274,7 +298,15 @@ export class PendingStateManager implements IDisposable {
 			}
 			// eslint-disable-next-line @typescript-eslint/no-non-null-assertion
 			const nextMessage = this.initialMessages.shift()!;
+			// Nothing to apply if the message is an empty batch.
+			// We still need to track it for resubmission.
 			try {
+				if (isEmptyBatchPendingMessage(nextMessage)) {
+					nextMessage.localOpMetadata = { emptyBatch: true }; // equivalent to applyStashedOp for empty batch
+					patchbatchInfo(nextMessage); // Back compat
+					this.pendingMessages.push(nextMessage);
+					continue;
+				}
 				// applyStashedOp will cause the DDS to behave as if it has sent the op but not actually send it
 				const localOpMetadata = await this.stateHandler.applyStashedOp(nextMessage.content);
 				if (!this.stateHandler.isAttached()) {
@@ -284,7 +316,7 @@ export class PendingStateManager implements IDisposable {
 				} else {
 					nextMessage.localOpMetadata = localOpMetadata;
 					// then we push onto pendingMessages which will cause PendingStateManager to resubmit when we connect
-					patchBatchIdContext(nextMessage); // Back compat
+					patchbatchInfo(nextMessage); // Back compat
 					this.pendingMessages.push(nextMessage);
 				}
 			} catch (error) {
@@ -294,153 +326,164 @@ export class PendingStateManager implements IDisposable {
 	}
 	/**
-	 * Processes a local message once its ack'd by the server. It verifies that there was no data corruption and that
-	 * the batch information was preserved for batch messages.
-	 * @param message - The message that got ack'd and needs to be processed.
-	 * @param batchStartCsn - The clientSequenceNumber of the start of this message's batch (assigned during submit)
-	 * (not to be confused with message.clientSequenceNumber - the overwritten value in case of grouped batching)
+	 * Processes an inbound batch of messages - May be local or remote.
+	 *
+	 * @param batch - The inbound batch of messages to process. Could be local or remote.
+	 * @param local - true if we submitted this batch and expect corresponding pending messages
+	 * @returns The inbound batch's messages with localOpMetadata "zipped" in.
+	 *
+	 * @remarks Closes the container if:
+	 * - The batchStartCsn doesn't match for local batches
+	 */
+	public processInboundBatch(
+		batch: InboundBatch,
+		local: boolean,
+	): {
+		message: InboundSequencedContainerRuntimeMessage;
+		localOpMetadata?: unknown;
+	}[] {
+		if (local) {
+			return this.processPendingLocalBatch(batch);
+		}
+		// No localOpMetadata for remote messages
+		return batch.messages.map((message) => ({ message }));
+	}
+	/**
+	 * Processes the incoming batch from the server that was submitted by this client.
+	 * It verifies that messages are received in the right order and that the batch information is correct.
+	 * @param batch - The inbound batch (originating from this client) to correlate with the pending local state
+	 * @returns The inbound batch's messages with localOpMetadata "zipped" in.
 	 */
-	public processPendingLocalMessage(
-		message: InboundSequencedContainerRuntimeMessage,
-		batchStartCsn: number,
+	private processPendingLocalBatch(batch: InboundBatch): {
+		message: InboundSequencedContainerRuntimeMessage;
+		localOpMetadata: unknown;
+	}[] {
+		this.onLocalBatchBegin(batch);
+		// Empty batch
+		if (batch.messages.length === 0) {
+			assert(
+				batch.emptyBatchSequenceNumber !== undefined,
+				0x9fb /* Expected sequence number for empty batch */,
+			);
+			const localOpMetadata = this.processNextPendingMessage(batch.emptyBatchSequenceNumber);
+			assert(
+				asEmptyBatchLocalOpMetadata(localOpMetadata)?.emptyBatch === true,
+				0xa20 /* Expected empty batch marker */,
+			);
+			return [];
+		}
+		return batch.messages.map((message) => ({
+			message,
+			localOpMetadata: this.processNextPendingMessage(message.sequenceNumber, message),
+		}));
+	}
+	/**
+	 * Processes the pending local copy of message that's been ack'd by the server.
+	 * @param sequenceNumber - The sequenceNumber from the server corresponding to the next pending message.
+	 * @param message - [optional] The entire incoming message, for comparing contents with the pending message for extra validation.
+	 * @throws DataProcessingError if the pending message content doesn't match the incoming message content.
+	 * @returns - The localOpMetadata of the next pending message, to be sent to whoever submitted the original message.
+	 */
+	private processNextPendingMessage(
+		sequenceNumber: number,
+		message?: InboundSequencedContainerRuntimeMessage,
 	): unknown {
-		// Get the next message from the pending queue. Verify a message exists.
 		const pendingMessage = this.pendingMessages.peekFront();
 		assert(
 			pendingMessage !== undefined,
 			0x169 /* "No pending message found for this remote message" */,
 		);
-		// This may be the start of a batch.
-		this.maybeProcessBatchBegin(message, batchStartCsn, pendingMessage);
-		pendingMessage.sequenceNumber = message.sequenceNumber;
+		pendingMessage.sequenceNumber = sequenceNumber;
 		this.savedOps.push(withoutLocalOpMetadata(pendingMessage));
 		this.pendingMessages.shift();
-		const messageContent = buildPendingMessageContent(message);
+		// message is undefined in the Empty Batch case,
+		// because we don't have an incoming message to compare and pendingMessage is just a placeholder anyway.
+		if (message !== undefined) {
+			const messageContent = buildPendingMessageContent(message);
+			// Stringified content should match
+			if (pendingMessage.content !== messageContent) {
+				const pendingContentObj = JSON.parse(
+					pendingMessage.content,
+				) as LocalContainerRuntimeMessage;
+				const incomingContentObj = JSON.parse(
+					messageContent,
+				) as InboundContainerRuntimeMessage;
+				const contentsMatch =
+					pendingContentObj.contents === incomingContentObj.contents ||
+					(pendingContentObj.contents !== undefined &&
+						incomingContentObj.contents !== undefined &&
+						JSON.stringify(pendingContentObj.contents) ===
+							JSON.stringify(incomingContentObj.contents));
+				this.logger.sendErrorEvent({
+					eventName: "unexpectedAckReceived",
+					details: {
+						pendingContentScrubbed: scrubAndStringify(pendingContentObj),
+						incomingContentScrubbed: scrubAndStringify(incomingContentObj),
+						contentsMatch,
+					},
+				});
-		// Stringified content should match
-		if (pendingMessage.content !== messageContent) {
-			this.stateHandler.close(
-				DataProcessingError.create(
+				throw DataProcessingError.create(
 					"pending local message content mismatch",
 					"unexpectedAckReceived",
 					message,
-					{
-						expectedMessageType: JSON.parse(pendingMessage.content).type,
-					},
-				),
-			);
-			return;
-		}
-		// Post-processing part - If we are processing a batch then this could be the last message in the batch.
-		this.maybeProcessBatchEnd(message);
-		return pendingMessage.localOpMetadata;
-	}
-	/**
-	 * This message could be the first message in batch. If so, set batch state marking the beginning of a batch.
-	 * @param message - The message that is being processed.
-	 * @param batchStartCsn - The clientSequenceNumber of the start of this message's batch (assigned during submit)
-	 * @param pendingMessage - The corresponding pendingMessage.
-	 */
-	private maybeProcessBatchBegin(
-		message: ISequencedDocumentMessage,
-		batchStartCsn: number,
-		pendingMessage: IPendingMessage,
-	) {
-		if (!this.isProcessingBatch) {
-			// Expecting the start of a batch (maybe single-message).
-			if (pendingMessage.batchIdContext.batchStartCsn !== batchStartCsn) {
-				this.logger?.sendErrorEvent({
-					eventName: "BatchClientSequenceNumberMismatch",
-					details: {
-						processingBatch: !!this.pendingBatchBeginMessage,
-						pendingBatchCsn: pendingMessage.batchIdContext.batchStartCsn,
-						batchStartCsn,
-						messageBatchMetadata: (message.metadata as any)?.batch,
-						pendingMessageBatchMetadata: (pendingMessage.opMetadata as any)?.batch,
-					},
-					messageDetails: extractSafePropertiesFromMessage(message),
-				});
+				);
 			}
 		}
-		// This message is the first in a batch if the "batch" property on the metadata is set to true
-		if ((message.metadata as IBatchMetadata | undefined)?.batch) {
-			// We should not already be processing a batch and there should be no pending batch begin message.
-			assert(
-				!this.isProcessingBatch && this.pendingBatchBeginMessage === undefined,
-				0x16b /* "The pending batch state indicates we are already processing a batch" */,
-			);
-			// Set the pending batch state indicating we have started processing a batch.
-			this.pendingBatchBeginMessage = message;
-			this.isProcessingBatch = true;
-		}
+		return pendingMessage.localOpMetadata;
 	}
 	/**
-	 * This message could be the last message in batch. If so, clear batch state since the batch is complete.
-	 * @param message - The message that is being processed.
+	 * Check if the incoming batch matches the batch info for the next pending message.
 	 */
-	private maybeProcessBatchEnd(message: ISequencedDocumentMessage) {
-		if (!this.isProcessingBatch) {
-			return;
-		}
-		// There should be a pending batch begin message.
+	private onLocalBatchBegin(batch: InboundBatch) {
+		// Get the next message from the pending queue. Verify a message exists.
+		const pendingMessage = this.pendingMessages.peekFront();
 		assert(
-			this.pendingBatchBeginMessage !== undefined,
-			0x16d /* "There is no pending batch begin message" */,
+			pendingMessage !== undefined,
+			0xa21 /* No pending message found as we start processing this remote batch */,
 		);
-		const batchEndMetadata = (message.metadata as IBatchMetadata | undefined)?.batch;
-		if (this.pendingMessages.isEmpty() || batchEndMetadata === false) {
-			// Get the batch begin metadata from the first message in the batch.
-			const batchBeginMetadata = (
-				this.pendingBatchBeginMessage.metadata as IBatchMetadata | undefined
-			)?.batch;
-			// There could be just a single message in the batch. If so, it should not have any batch metadata. If there
-			// are multiple messages in the batch, verify that we got the correct batch begin and end metadata.
-			if (this.pendingBatchBeginMessage === message) {
-				assert(
-					batchBeginMetadata === undefined,
-					0x16e /* "Batch with single message should not have batch metadata" */,
-				);
-			} else {
-				if (batchBeginMetadata !== true || batchEndMetadata !== false) {
-					this.stateHandler.close(
-						DataProcessingError.create(
-							"Pending batch inconsistency", // Formerly known as asserts 0x16f and 0x170
-							"processPendingLocalMessage",
-							message,
-							{
-								runtimeVersion: pkgVersion,
-								batchClientId:
-									// eslint-disable-next-line @typescript-eslint/prefer-nullish-coalescing
-									this.pendingBatchBeginMessage.clientId === null
-										? "null"
-										: this.pendingBatchBeginMessage.clientId,
-								clientId: this.stateHandler.clientId(),
-								hasBatchStart: batchBeginMetadata === true,
-								hasBatchEnd: batchEndMetadata === false,
-								messageType: message.type,
-								pendingMessagesCount: this.pendingMessagesCount,
-							},
-						),
-					);
-				}
-			}
-			// Clear the pending batch state now that we have processed the entire batch.
-			this.pendingBatchBeginMessage = undefined;
-			this.isProcessingBatch = false;
+		// Note: This could be undefined if this batch became empty on resubmit.
+		// In this case the next pending message is an empty batch marker.
+		// Empty batches became empty on Resubmit, and submit them and track them in case
+		// a different fork of this container also submitted the same batch (and it may not be empty for that fork).
+		const firstMessage = batch.messages.length > 0 ? batch.messages[0] : undefined;
+		const expectedPendingBatchLength = batch.messages.length === 0 ? 1 : batch.messages.length;
+		// We expect the incoming batch to be of the same length, starting at the same clientSequenceNumber,
+		// as the batch we originally submitted.
+		// We have another later check to compare the message contents, which we'd expect to fail if this check does,
+		// so we don't throw here, merely log.  In a later release this check may replace that one.
+		if (
+			pendingMessage.batchInfo.batchStartCsn !== batch.batchStartCsn ||
+			(pendingMessage.batchInfo.length >= 0 && // -1 length is back compat and isn't suitable for this check
+				pendingMessage.batchInfo.length !== expectedPendingBatchLength)
+		) {
+			this.logger?.sendErrorEvent({
+				eventName: "BatchInfoMismatch",
+				details: {
+					pendingBatchCsn: pendingMessage.batchInfo.batchStartCsn,
+					batchStartCsn: batch.batchStartCsn,
+					pendingBatchLength: pendingMessage.batchInfo.length,
+					batchLength: batch.messages.length,
+					pendingMessageBatchMetadata: asBatchMetadata(pendingMessage.opMetadata)?.batch,
+					messageBatchMetadata: asBatchMetadata(firstMessage?.metadata)?.batch,
+				},
+				messageDetails: firstMessage && extractSafePropertiesFromMessage(firstMessage),
+			});
 		}
 	}
@@ -482,14 +525,12 @@ export class PendingStateManager implements IDisposable {
 			assert(batchMetadataFlag !== false, 0x41b /* We cannot process batches in chunks */);
 			// The next message starts a batch (possibly single-message), and we'll need its batchId.
-			// We'll find batchId on this message if it was previously generated.
-			// Otherwise, generate it now - this is the first time resubmitting this batch.
-			const batchId =
-				asBatchMetadata(pendingMessage.opMetadata)?.batchId ??
-				generateBatchId(
-					pendingMessage.batchIdContext.clientId,
-					pendingMessage.batchIdContext.batchStartCsn,
-				);
+			const batchId = getEffectiveBatchId(pendingMessage);
+			// Resubmit no messages, with the batchId. Will result in another empty batch marker.
+			if (asEmptyBatchLocalOpMetadata(pendingMessage.localOpMetadata)?.emptyBatch === true) {
+				this.stateHandler.reSubmitBatch([], batchId);
+				continue;
+			}
 			/**
 			 * We must preserve the distinct batches on resubmit.
@@ -561,13 +602,13 @@ export class PendingStateManager implements IDisposable {
 	}
 }
-/** For back-compat if trying to apply stashed ops that pre-date batchIdContext */
-function patchBatchIdContext(
+/** For back-compat if trying to apply stashed ops that pre-date batchInfo */
+function patchbatchInfo(
 	message: IPendingMessageFromStash,
 ): asserts message is IPendingMessage {
-	const batchIdContext: IPendingMessageFromStash["batchIdContext"] = message.batchIdContext;
-	if (batchIdContext === undefined) {
+	const batchInfo: IPendingMessageFromStash["batchInfo"] = message.batchInfo;
+	if (batchInfo === undefined) {
 		// Using uuid guarantees uniqueness, retaining existing behavior
-		message.batchIdContext = { clientId: uuid(), batchStartCsn: -1 };
+		message.batchInfo = { clientId: uuid(), batchStartCsn: -1, length: -1 };
 	}
 }
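
The central change in this file is that acks are now correlated with pending messages one batch at a time (`processInboundBatch`) instead of one message at a time. The following is a minimal sketch of that "zipping" idea, not the package's implementation: every `Sketch*` name is hypothetical, the message shapes are heavily simplified, and the telemetry and `batchInfo` checks shown in the diff are left out.

```typescript
// Hypothetical, simplified shapes; the real types are defined elsewhere in the package.
interface SketchMessage {
	sequenceNumber: number;
	content: string;
}

interface SketchPending {
	content: string;
	localOpMetadata: unknown;
}

interface SketchInboundBatch {
	messages: SketchMessage[];
	// Only set for an empty batch, mirroring emptyBatchSequenceNumber in the diff.
	emptyBatchSequenceNumber?: number;
}

class SketchPendingQueue {
	private readonly pending: SketchPending[] = [];

	public submit(content: string, localOpMetadata: unknown): void {
		this.pending.push({ content, localOpMetadata });
	}

	public get count(): number {
		return this.pending.length;
	}

	public processInboundBatch(
		batch: SketchInboundBatch,
		local: boolean,
	): { message: SketchMessage; localOpMetadata?: unknown }[] {
		if (!local) {
			// Remote batches have no pending counterpart, so there is nothing to zip in.
			return batch.messages.map((message) => ({ message }));
		}
		if (batch.messages.length === 0) {
			// An empty batch is tracked by a single placeholder entry in the queue.
			this.takeNext();
			return [];
		}
		return batch.messages.map((message) => ({
			message,
			localOpMetadata: this.takeNext(message),
		}));
	}

	// Dequeue the next pending entry and, when a message is given, compare contents.
	private takeNext(message?: SketchMessage): unknown {
		const next = this.pending.shift();
		if (next === undefined) {
			throw new Error("No pending message found");
		}
		if (message !== undefined && next.content !== message.content) {
			throw new Error("pending local message content mismatch");
		}
		return next.localOpMetadata;
	}
}
```

As in the diff, a content mismatch surfaces as a thrown error rather than a call to a `close` callback, which is consistent with `close` being dropped from `IRuntimeStateHandler`.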

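The new `typesOfKeys` and `scrubAndStringify` helpers in the diff above log the shape of a mismatched message rather than its values. A self-contained sketch of that scrubbing idea follows; the `Sketch` names and the simplified message interface are assumptions made for illustration, and the handling of non-object `contents` is a choice of this sketch rather than something the diff specifies.

```typescript
// Hypothetical, simplified message shape for illustration only.
interface SketchRuntimeMessage {
	type: string;
	contents?: unknown;
	compatDetails?: unknown;
}

// Replace every value with the name of its runtime type, keeping only the keys.
function typesOfKeysSketch(obj: object): Record<string, string> {
	const result: Record<string, string> = {};
	for (const [key, value] of Object.entries(obj)) {
		result[key] = typeof value;
	}
	return result;
}

function scrubAndStringifySketch(message: SketchRuntimeMessage): string {
	// Scrub the whole object first, in case it carries unexpected keys.
	const scrubbed: Record<string, unknown> = typesOfKeysSketch(message);
	const contents = message.contents;
	if (typeof contents === "object" && contents !== null) {
		// Drill one level into contents, still recording types only.
		scrubbed["contents"] = typesOfKeysSketch(contents);
	} else if (contents !== undefined) {
		// Assumption of this sketch: never emit a primitive contents value as-is.
		scrubbed["contents"] = typeof contents;
	} else {
		scrubbed["contents"] = undefined;
	}
	// These two fields are treated as safe to log verbatim, as in the diff.
	scrubbed["compatDetails"] = message.compatDetails;
	scrubbed["type"] = message.type;
	return JSON.stringify(scrubbed);
}
```

Properties whose value is `undefined` are omitted by `JSON.stringify`, so an absent `compatDetails` does not appear in the logged string.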
package/src/summary/README.md CHANGED

@@ -9,12 +9,12 @@
     -   [Summary Lifecycle](#summary-lifecycle)
     -   [Single-commit vs two-commit summaries](#single-commit-vs-two-commit-summaries)
     -   [Incremental summaries](#incremental-summaries)
-	-   [Resiliency](#resiliency)
+    -   [Resiliency](#resiliency)
 -   [What does a summary look like?](#what-does-a-summary-look-like)
 ## Introduction
-This document provides a conceptual overview of summarization without going into a lot of technical or implementation details. It describes what summaries are, how / when they are generated and what do they look like. The goal is for this to be an entry point into summarization for users and developers alike.
+This document provides a conceptual overview of summarization. It describes what summaries are, how / when they are generated, and what they look like. The goal is for this to be an entry point into summarization for users and developers alike.
 ### Summary vs snapshot
@@ -22,23 +22,22 @@ The terms summary and snapshot are sometimes used interchangeably. Both represen
 ## Why do we need summaries?
-A 'summary' captures the state of a container at a point in time. Without it, a client would have to apply every operation in the op log, even if those operations no longer affected the current state (e.g. op 1 inserts ‘h’ and op 2 deletes ‘h’). For very large op logs, this would be very expensive for clients to both process and download from the service.
-Instead, when a client opens a collaborative document, it can download a snapshot of the container, and simply process new operations from that point forward.
+A 'summary' captures the state of a container at a point in time so that future clients can start from this point. Without it, a client would have to apply every operation in the op log, even if those operations (hereafter ops) no longer affected the current state (e.g. op 1 inserts 'h' and op 2 deletes 'h'). For large op logs, this would be very expensive for clients both to download from the service and to process.
+Instead, when a client opens a collaborative document, it downloads the latest snapshot of the container and simply processes new operations from that point forward.
 ## Who generates summaries?
-Summaries can be generated by any client connected in "write" mode. In the current implementation, summaries are generated by a separate non-interactive client. Using a separate client is an optimization - this client doesn't have to take local changes into account which can make the summary process more complicated. A summarizer client has the following characteristics:
+Summaries can be generated by any client connected in "write" mode. They are generated by a separate non-interactive client called the summarizer client. Using a separate client is an optimization - this client doesn't have to take local changes into account which can make the summary process more complicated.
+A summarizer client is like any other client connected to the document except that users cannot interact with this client, and it only works on the state it receives from other clients. It has a brand-new container with its own connection to services.
+All the clients connected to the document participate in a process called "summary client election" to elect a "parent summarizer" client. Typically, it's the oldest "write" client connected to the document. The parent summarizer client spawns a "summarizer" client which is responsible for summarization.
--   All the clients connected to the document participate in a process called "summary client election" to elect a "parent summarizer" client. Typically, it's the oldest "write" client connected to the document. The parent summarizer client spawns a "summarizer" client which is responsible for summarization.
--   A summarizer client is like any other client connected to the document except that users cannot interact with this client, and it only works on the state it receives from other clients. It has a brand-new container with its own connection to services.
-> Note that if the summarizer client closes, the "summary client election" process will choose a new one, if applicable. The default "summary client election" algorithm is to select the oldest "write" client as the parent summarizer client which in turn will create the summarizer client.
+Note: If the summarizer client closes, the "summary client election" process will choose a new one, if there are eligible clients.
 ## When are summaries generated?
-The summarizer client periodically generates summary based on heuristics calculated based on configurations such as the number of user or system ops received, the amount of time a client has been idle (hasn't received any ops), the maximum time since last summary, maximum number of ops since last summary, etc. The heuristic configurations are defined by an _ISummaryConfigurationHeuristics_ interface defined in [this file](../../src/containerRuntime.ts).
+The summarizer client periodically generates a summary based on heuristics computed from configurations such as the number of user or system operations received, the amount of time a client has been idle (hasn't received any ops), the maximum time since the last summary, the maximum number of ops since the last summary, etc. The heuristic configurations are defined by the `ISummaryConfigurationHeuristics` interface in [containerRuntime.ts in the container-runtime package][container-runtime].
-The summarizer client uses a default set of configurations defined by _DefaultSummaryConfiguration_ in [this file](../../src/containerRuntime.ts). These can be overridden by providing a new set of configurations as part of container runtime options during creation.
+The summarizer client uses a default set of configurations defined by `DefaultSummaryConfiguration` in [containerRuntime.ts in the container-runtime package][container-runtime]. These can be overridden by providing a new set of configurations as part of container runtime options during creation.
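As an illustration of how such heuristics might combine, here is a minimal sketch. The `SketchHeuristics` and `SketchState` types and their field names are hypothetical stand-ins invented for this example; they are not the actual shape of `ISummaryConfigurationHeuristics` or `DefaultSummaryConfiguration`.

```typescript
// Hypothetical, simplified heuristic settings. The real
// ISummaryConfigurationHeuristics interface may use different names and fields.
interface SketchHeuristics {
	maxOpsSinceLastSummary: number; // summarize once this many ops accumulate
	maxTimeSinceLastSummaryMs: number; // summarize if this much time has passed
	idleTimeMs: number; // summarize if no ops arrived for this long
}

// Hypothetical view of what the summarizer client has observed so far.
interface SketchState {
	opsSinceLastSummary: number;
	msSinceLastSummary: number;
	msSinceLastOp: number;
}

// Returns true when any one of the configured thresholds has been reached.
function shouldSummarize(state: SketchState, config: SketchHeuristics): boolean {
	if (state.opsSinceLastSummary === 0) {
		return false; // nothing new to summarize
	}
	return (
		state.opsSinceLastSummary >= config.maxOpsSinceLastSummary ||
		state.msSinceLastSummary >= config.maxTimeSinceLastSummaryMs ||
		state.msSinceLastOp >= config.idleTimeMs
	);
}
```

In this sketch, overriding the defaults would amount to passing a different `SketchHeuristics` object; how the real runtime options are supplied is not shown here.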
 ## How are summaries generated?
@@ -47,35 +46,39 @@ When summarization process is triggered, every object in the container's object
 ### Summary Lifecycle
 The lifecycle of a summary starts when a "parent summarizer" client is elected.
 -   The parent summarizer spawns a non-interactive summarizer client.
 -   The summarizer client periodically starts a summary as per heuristics. A summary happens at a particular sequence number called the "summary sequence number" or reference sequence number for the summary.
--   The container runtime (or simply runtime) generates a summary tree (described in the ["What does a summary look like?"](#what-does-a-summary-look-like) section below).
--   The runtime uploads the summary tree to the Fluid storage service which returns a handle (unique id) to the summary if the upload is successful. Otherwise, it returns a failure.
+-   The container runtime (hereafter runtime) generates a summary tree (described in the ["What does a summary look like?"](#what-does-a-summary-look-like) section below).
+-   The runtime uploads the summary tree to the Fluid storage service which returns a handle (unique id) to the summary if the upload is successful. Otherwise, it returns a failure. The runtime also includes the handle of the last successful summary. If this information is incorrect, the service will reject this summary. This is done to ensure that [incremental summaries](#incremental-summaries) are correct.
 -   The runtime submits a "summarize" op to the Fluid ordering service containing the uploaded summary handle and the summary sequence number.
--   The ordering service stamps it with a sequence number (like any other op) and broadcasts the summarize op.
--   Another service on the server responds to the summarize op.
-    -   If the summary is accepted, it sends a "summary ack" with the summary sequence number and a summary handle. This handle may or may not be the same as the one in summary op depending on whether this is a single-commit or two-commit summary. More details on this below.
-    -   If the summary is rejected, it sends a "summary nack" with the details of the summary op
--   The runtime completes the summary process on receiving the summary ack / nack. The runtime has a timeout called "maxAckWaitTime" and if the summary op, ack or nack is not received within this time, it will fail this summary.
-### Single-commit vs two-commit summaries
-By default, Fluid uses "two-commit summaries" mode where the two commits refer to the storage committing the summary twice and returning two different handles for it - One when the summary is uploaded and second, on responding to the summary op via a summary ack. In this mode, when the server receives the summary op, it augments the corresponding summary with a "protocol" blob hence generating a new commit and new handle for this summary which it returns in the summary ack.
-Fluid is switching to "single-commit summary" mode where the client adds the "protocol" blob when uploading the summary. Thus, the server doesn't need to augment the summary and the summary ack is no longer required. As soon as the summary is uploaded (first commit), the summary process is complete. The "summarize" op then is just a way to indicate that a summary happened, and it has details of the summary
+-   The ordering service stamps it with a sequence number (like any other op) and broadcasts the summarize op. This creates a record in the op log that a summary was submitted, and it lets other clients know about it. Non-summarizer clients don't do anything with the summarize op. The summarizer client that submitted it processes it and waits for a summary ack / nack. Future summarizer clients also process summarize ops and validate that a corresponding summary ack / nack was received.
+-   The ordering service then responds to the summarize op:
+    -   If the summary is accepted, it sends a "summary ack" with the summary sequence number and a summary handle.
+    -   If the summary is rejected, it sends a "summary nack" with the details of the summary op.
+-   The runtime processes the summary ack or nack and completes the summary process as a success or failure accordingly.
+    -   If the summary is successful, the handle in the ack becomes the last successful summary's handle, which is used when uploading subsequent summaries as described earlier.
+    -   If the summary failed, the summarizer client closes and the summary election process starts to elect a new one.
+-   The runtime has a timeout called "maxAckWaitTime" and if the summary op, ack or nack is not received within this time, it will fail this summary.
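The ack / nack handling and the "maxAckWaitTime" timeout from the list above can be sketched roughly as follows. This is a simplified model, not the runtime's code: `SummaryAttemptSketch`, `SummaryResponse`, and all method names are hypothetical, and time is passed in explicitly to keep the example deterministic.

```typescript
// Hypothetical message shapes; the real op formats carry more information.
type SummaryResponse =
	| { type: "summaryAck"; summarySequenceNumber: number; handle: string }
	| { type: "summaryNack"; summarySequenceNumber: number; message: string };

class SummaryAttemptSketch {
	// Handle of the last acked summary; sent along with the next upload.
	public lastSuccessfulHandle: string | undefined;
	private pending: { summarySequenceNumber: number; submittedAtMs: number } | undefined;
	private readonly maxAckWaitTimeMs: number;

	constructor(maxAckWaitTimeMs: number) {
		this.maxAckWaitTimeMs = maxAckWaitTimeMs;
	}

	// Called after the summary tree is uploaded and the summarize op is submitted.
	public onSummarizeOpSubmitted(summarySequenceNumber: number, nowMs: number): void {
		this.pending = { summarySequenceNumber, submittedAtMs: nowMs };
	}

	// Completes the pending attempt when its matching ack or nack arrives.
	public onResponse(response: SummaryResponse): "success" | "failure" | "ignored" {
		if (this.pending?.summarySequenceNumber !== response.summarySequenceNumber) {
			return "ignored"; // not the summary this client is waiting on
		}
		this.pending = undefined;
		if (response.type === "summaryAck") {
			this.lastSuccessfulHandle = response.handle;
			return "success";
		}
		return "failure";
	}

	// Fails the pending attempt if no ack / nack arrived within the wait time.
	public hasTimedOut(nowMs: number): boolean {
		if (this.pending === undefined) {
			return false;
		}
		if (nowMs - this.pending.submittedAtMs < this.maxAckWaitTimeMs) {
			return false;
		}
		this.pending = undefined;
		return true;
	}
}
```

Note that a nack or a timeout leaves `lastSuccessfulHandle` untouched, so the next attempt still refers to the most recent summary that was actually accepted.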
 ### Incremental summaries
-Summaries are incremental, i.e., if an object (or node) did not change since the last summary, it doesn't have to re-summarize its entire contents. Fluid supports the concept of a summary handle defined in [this file](../../../../../common/lib/protocol-definitions/src/summary.ts). A handle is a path to a subtree in a snapshot and it allows objects to reference a subtree in the previous snapshot, which is essentially an instruction to storage to find that subtree and populate into new summary.
+Summaries are incremental, i.e., if an object (or node) did not change since the last summary, it doesn't have to re-summarize its entire contents. Fluid supports the concept of a summary handle defined in [summary.ts in the protocol-definitions package][summary-protocol]. A handle is a path to a subtree in a snapshot, and it allows objects to reference a subtree in the previous snapshot, which is essentially an instruction to storage to find that subtree and populate it into the new summary.
+So, if a data store or DDS did not change since the last summary, it doesn't have to go through the whole summary process described above. It can instead return an `ISummaryHandle` with the path to its subtree in the last successful summary. The same applies to other types of content like a single content blob within an object's summary tree.
-Say that a data store or DDS did not change since the last summary, it doesn't have to go through the whole summary process described above. It can instead return an ISummaryHandle with path to its subtree in the previous snapshot. The same applies to other types of content like a single content blob within an object's summary tree.
+For incremental summaries, objects diff their content against the last summary to determine whether to send a summary handle. It's therefore crucial that the last summary information be correct, or else the new summary will be incorrect. This is why the last summary's handle is also sent during upload and validated by the service.
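A minimal sketch of the idea, assuming a much-simplified node model: the `SketchSummaryNode` and `SketchObject` types below are hypothetical stand-ins for illustration and do not match the actual definitions in summary.ts (such as `ISummaryHandle`).

```typescript
// Hypothetical, simplified summary nodes; not the real protocol definitions.
type SketchSummaryNode =
	| { kind: "blob"; content: string }
	| { kind: "tree"; children: Record<string, SketchSummaryNode> }
	| { kind: "handle"; path: string }; // path to a subtree in the last summary

// Hypothetical object (e.g. a data store or DDS) that knows whether it changed.
interface SketchObject {
	id: string;
	content: string;
	changedSinceLastSummary: boolean;
}

// Builds a summary tree, emitting a handle for every unchanged object so that
// storage can reuse that object's subtree from the last successful summary.
function summarizeObjects(objects: SketchObject[]): SketchSummaryNode {
	const children: Record<string, SketchSummaryNode> = {};
	for (const obj of objects) {
		children[obj.id] = obj.changedSinceLastSummary
			? { kind: "blob", content: obj.content }
			: { kind: "handle", path: `/${obj.id}` };
	}
	return { kind: "tree", children };
}
```

The handle paths are only meaningful relative to the last successful summary, which is why an incorrect "last summary" reference would make the resulting tree incorrect.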
 ### Resiliency
-The summarization process is designed to be resilient - Basically, a document will eventually summarize and make progress even if there are intermittent failures or disruptions. Some examples of steps taken to achieve this:
+The summarization process is designed to be resilient - A document will eventually summarize and make progress even if there are intermittent failures or disruptions. Some examples of steps taken to achieve this:
 -   Last summary - Usually, if the "parent summarizer" client disconnects or shuts down, the "summarizer" client also shuts down and the summarizer election process begins. However, if there are a certain number of un-summarized ops, the summarizer client will perform a "last summary" even if the parent shuts down. This is done to make progress in scenarios where new summarizer clients are closed quickly because the parent summarizer keeps disconnecting repeatedly.
 -   Retries - The summarizer has a retry mechanism which can identify certain types of intermittent failures either in the client or in the server. It will retry the summary attempt for these failures a certain number of times. This helps in cases where there are intermittent failures such as throttling errors from the server which go away after waiting for a while.
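The retry behavior described above can be sketched as a bounded loop that only retries failures classified as intermittent. This is an illustrative model under assumed names: `AttemptResult`, `summarizeWithRetries`, and the injected `wait` callback are hypothetical, and the sketch is synchronous for simplicity, whereas a real summarizer would be asynchronous.

```typescript
// Hypothetical outcome of one summary attempt.
type AttemptResult =
	| { ok: true; handle: string }
	| { ok: false; retryable: boolean; retryAfterMs?: number };

// Runs `attempt` up to `maxAttempts` times. Non-retryable failures stop
// immediately; retryable ones (e.g. throttling) wait before the next try.
function summarizeWithRetries(
	attempt: (attemptNumber: number) => AttemptResult,
	maxAttempts: number,
	wait: (ms: number) => void,
): AttemptResult {
	let result: AttemptResult = { ok: false, retryable: true };
	for (let n = 1; n <= maxAttempts; n++) {
		result = attempt(n);
		if (result.ok || !result.retryable) {
			return result;
		}
		if (n < maxAttempts) {
			// Honor the server's suggested delay when one was provided.
			wait(result.retryAfterMs ?? 0);
		}
	}
	return result; // the last retryable failure once attempts are exhausted
}
```

Injecting `wait` keeps the sketch testable without real timers; the cap on attempts reflects the "certain number of times" limit mentioned above.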
 ## What does a summary look like?
-The format of a summary is described in [summary formats](./summaryFormats.md).
+The format of summaries (and snapshots) is described in [summary and snapshot formats](./summaryFormats.md).
+[container-runtime]: ../../src/containerRuntime.ts
+[summary-protocol]: /common/lib/protocol-definitions/src/summary.ts