whisperai-sdk 1.0.2 → 2.0.0

Files changed (8)

package/README.md CHANGED

@@ -1,8 +1,8 @@
 # whisperai-sdk
-Unofficial TypeScript SDK for [WhisperAI](https://whisperai.com/). This package provides methods and interfaces for interacting with the WhisperAI service without external runtime dependencies (except axios).
+Unofficial TypeScript SDK for [WhisperAI](https://whisperai.com/). Version 2 uses WhisperAI's signed Google Cloud Storage resumable-upload flow and requires Node.js 22 or newer.
-**Disclaimer:** This is an unofficial implementation and is not affiliated with WhisperAI.
+> This project is not affiliated with WhisperAI.
 ## Installation
@@ -10,152 +10,126 @@ Unofficial TypeScript SDK for [WhisperAI](https://whisperai.com/). This package
 npm install whisperai-sdk
 ```
-## Usage
+## Transcribe a file
-### Initialization
-Initialize the client with your WhisperAI credentials.
+`transcribe()` performs the complete operation: authentication, upload, retries, finalization, status polling, and fetching the completed transcription.
 ```typescript
-import { WhisperClient } from 'whisperai-sdk';
+import { readFile } from "node:fs/promises"
+import { WhisperClient } from "whisperai-sdk"
 const client = new WhisperClient({
   login: {
-    email: "your-email@example.com",
-    password: "your-password"
-  },
-  // Optional: Override default settings
-  // whisperUrl: "https://whisperai.com",
-  // chunkSize: 8 * 1024 * 1024 // 8MB
-});
-```
-The client handles authentication automatically. It will log in on the first request and refresh the session if the token expires.
+    email: process.env.WHISPER_EMAIL!,
+    password: process.env.WHISPER_PASSWORD!
+  }
+})
-### Methods
+const audio = new Uint8Array(await readFile("./interview.m4a"))
+const recording = await client.transcribe(audio, {
+  filename: "interview.m4a",
+  mimeType: "audio/x-m4a",
+  durationSeconds: 120
+})
-#### User & Account
+console.log(recording.transcription.content)
+```
-Get current user information, usage statistics, and subscription details.
+The default processing timeout is 30 minutes and the default polling interval is 2 seconds.
 ```typescript
-// Get user info
-const userInfo = await client.user();
-console.log(`User: ${userInfo.firstName} ${userInfo.lastName}`);
-// Get usage stats
-const usage = await client.usage();
-console.log(`Monthly Usage: ${usage.monthlyUsageMinutes} minutes`);
-// Get subscription details
-const subscription = await client.subscriptionDetails();
+const controller = new AbortController()
+const recording = await client.transcribe(audio, metadata, {
+  timeoutMs: 45 * 60 * 1000,
+  pollIntervalMs: 3000,
+  signal: controller.signal,
+  onProgress: percentage => console.log(`Upload: ${percentage}%`)
+})
 ```
-#### Uploading Audio
+## Streams
-Upload audio files for transcription. The `upload` method handles file chunking automatically.
+For streaming uploads, provide `totalSize` so the SDK can upload without buffering the entire file. If it is omitted, the stream is buffered first to determine its size.
 ```typescript
-import fs from 'fs';
-// Read file buffer
-const buffer = fs.readFileSync('./interview.mp3');
-// Upload
-const result = await client.upload(buffer, {
-  filename: 'interview.mp3',
-  durationSeconds: 120, // Total duration in seconds
-  mimeType: 'audio/mpeg', // Optional
-  title: 'Interview with John Doe', // Optional
-  enableSpeakerDetection: true, // Optional
-  speakerCount: 'auto' // Optional: 'auto' or number
-});
-console.log(`Uploaded recording ID: ${result.id}`);
+const recording = await client.transcribe(stream, {
+  filename: "meeting.webm",
+  mimeType: "audio/webm",
+  durationSeconds: 900,
+  totalSize: contentLength
+})
 ```
-#### Transcription
+## Start without waiting
-Manage transcriptions for uploaded recordings.
+Queue workers can upload and return immediately, then check the recording later.
 ```typescript
-const recordingId = 12345;
-// Start/Request transcription
-const transcriptionJob = await client.transcription(recordingId);
-// Check recording status and get transcription result
-const recording = await client.recording(recordingId);
-if (recording.status === 'completed' && recording.transcription) {
-  console.log(recording.transcription.content);
-  // Access segments with timestamps
-  recording.transcription.segments.forEach(segment => {
-    console.log(`[${segment.start} - ${segment.end}]: ${segment.text}`);
-  });
-}
+const started = await client.startTranscription(audio, metadata)
+console.log(started.id, started.status) // processing
+const statuses = await client.recordingStatus([started.id])
+const completed = await client.waitForTranscription(started.id)
 ```
-#### Translation
+`requestTranscription(recordingId)` is available for explicitly restarting or recovering an existing recording. A normal signed upload starts processing when the upload is completed, so it does not need this extra call.
-Translate a recording to another language.
+## Upload metadata
-```typescript
-const recordingId = 12345;
+The SDK accepts the current WhisperAI transcription settings:
-// Translate to Spanish
-const translation = await client.translate(recordingId, 'es');
+```typescript
+await client.transcribe(audio, {
+  filename: "interview.m4a",
+  durationSeconds: 120,
+  language: "multi-auto",
+  enableSpeakerDetection: true,
+  speakerCount: "auto",
+  transcriptionStyle: "clean_readable",
+  importantTerms: "WhisperAI, Codex",
+  customPrompt: "Technical product interview",
+  speakerIdentificationEnabled: true,
+  speakerIdentificationMode: "role",
+  speakerIdentificationValues: ["Interviewer", "Guest"]
+})
 ```
-#### Recordings Management
-List and retrieve recordings.
+## Other methods
 ```typescript
-// Get a specific recording by ID
-const recording = await client.recording(recordingId);
-// List recordings (paginated)
-const recordingsList = await client.recordings({
-  limit: 10,
-  page: 1,
-  // search: "interview", // Optional search query
-  // status: "completed"  // Optional status filter
-});
-console.log(`Found ${recordingsList.meta.totalItems} recordings`);
+await client.user()
+await client.usage()
+await client.subscriptionDetails()
+await client.recording(recordingId)
+await client.recordings({ limit: 20, sort: "newest" })
+await client.summary()
+await client.translate(recordingId, "es")
 ```
-#### Analytics
-Get a summary of your activity.
+## Errors
 ```typescript
-const summary = await client.summary();
-console.log(`Total recordings: ${summary.recordings.total}`);
+import {
+  WhisperApiError,
+  WhisperAuthError,
+  WhisperNetworkError,
+  WhisperTimeoutError,
+  WhisperTranscriptionError,
+  WhisperUploadError
+} from "whisperai-sdk"
 ```
-## Error Handling
+Upload diagnostics are enabled by default and sent best-effort to WhisperAI. Disable them globally with `diagnostics: false` in `ClientOptions`, or per operation with `{ diagnostics: false }`.
-The SDK throws specific errors for different failure scenarios.
+## Live smoke test
-```typescript
-import { WhisperAuthError, WhisperNetworkError, WhisperApiError } from 'whisperai-sdk/errors';
-try {
-  await client.user();
-} catch (error) {
-  if (error instanceof WhisperAuthError) {
-    console.error("Authentication failed. Check credentials.");
-  } else if (error instanceof WhisperNetworkError) {
-    console.error("Network issue.");
-  } else if (error instanceof WhisperApiError) {
-    console.error(`API Error ${error.status}: ${JSON.stringify(error.data)}`);
-  } else {
-    console.error("Unknown error:", error);
-  }
-}
+```bash
+WHISPER_EMAIL=... \
+WHISPER_PASSWORD=... \
+WHISPER_AUDIO_PATH=./sample.m4a \
+WHISPER_AUDIO_DURATION_SECONDS=10 \
+bun test test/live.test.ts
 ```
 ## License
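
Note on the new "Streams" section: it says that passing `totalSize` lets the SDK upload a stream without buffering the whole file. The SDK's private `readChunks` implementation is not part of this diff, so the following is only a minimal sketch of the general technique, re-slicing a `ReadableStream<Uint8Array>` into fixed-size chunks while holding at most one chunk plus one read in memory. The name `fixedSizeChunks` and the `chunkSize` parameter are hypothetical.

```typescript
// Hypothetical helper, not the SDK's code: yields fixed-size chunks from a web stream.
async function* fixedSizeChunks(
  stream: ReadableStream<Uint8Array>,
  chunkSize: number
): AsyncGenerator<Uint8Array> {
  if (chunkSize <= 0) throw new RangeError("chunkSize must be positive")
  const reader = stream.getReader()
  let pending = new Uint8Array(0)
  try {
    for (;;) {
      const { done, value } = await reader.read()
      if (done) break
      // Append the new bytes to whatever was left over from the previous read.
      const merged = new Uint8Array(pending.length + value.length)
      merged.set(pending)
      merged.set(value, pending.length)
      pending = merged
      // Emit as many full chunks as are available.
      while (pending.length >= chunkSize) {
        yield pending.slice(0, chunkSize)
        pending = pending.slice(chunkSize)
      }
    }
    // The final chunk may be shorter than chunkSize.
    if (pending.length > 0) yield pending
  } finally {
    reader.releaseLock()
  }
}
```

Each yielded chunk could then be sent as one range of a resumable upload; knowing the total size up front is what allows every range to be labeled before the stream has been fully read.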

package/dist/client.d.ts CHANGED

@@ -1,4 +1,5 @@
-import type { ClientOptions, FinalizeChunkResponse, InitChunkResponse, InitMetaFile, RecordingResponse, RecordingsQuery, RecordingsResponse, SummaryResponse, SubscriptionDetailsResponse, TranscriptionResponse, TranslateResponse, UploadChunkResponse, UsageInfo, UserInfo } from "./types.js";
+import type { ClientOptions, CompletedRecordingResponse, FinalizeUploadResponse, InitMetaFile, OperationOptions, RecordingResponse, RecordingsQuery, RecordingsResponse, RecordingStatusResponse, SubscriptionDetailsResponse, SummaryResponse, TranscribeOptions, TranscriptionResponse, TranslateResponse, UsageInfo, UserInfo } from "./types.js";
+type AudioInput = Uint8Array | ReadableStream<Uint8Array>;
 export declare class WhisperClient {
     private cookies?;
     private readonly clientOptions;
@@ -7,36 +8,50 @@ export declare class WhisperClient {
     user(): Promise<UserInfo>;
     usage(): Promise<UsageInfo>;
     subscriptionDetails(): Promise<SubscriptionDetailsResponse>;
-    upload(file: Uint8Array | ReadableStream<Uint8Array>, meta: InitMetaFile): Promise<FinalizeChunkResponse>;
-    initChunk(fileBuffer: Uint8Array, meta: InitMetaFile): Promise<InitChunkResponse>;
-    uploadChunk(chunk: Uint8Array, recordingId: number, chunkIndex: number): Promise<UploadChunkResponse>;
-    finalizeChunk(recordingId: number): Promise<FinalizeChunkResponse>;
-    transcription(recordingId: number): Promise<TranscriptionResponse>;
+    startTranscription(file: AudioInput, meta: InitMetaFile, options?: OperationOptions): Promise<FinalizeUploadResponse>;
+    transcribe(file: AudioInput, meta: InitMetaFile, options?: TranscribeOptions): Promise<CompletedRecordingResponse>;
+    requestTranscription(recordingId: number, signal?: AbortSignal): Promise<TranscriptionResponse>;
+    recordingStatus(recordingIds: number[], signal?: AbortSignal): Promise<RecordingStatusResponse[]>;
+    waitForTranscription(recordingId: number, options?: Pick<TranscribeOptions, "pollIntervalMs" | "timeoutMs" | "signal">): Promise<CompletedRecordingResponse>;
     translate(recordingId: number, language: string): Promise<TranslateResponse>;
-    recording(recordingId: number): Promise<RecordingResponse>;
+    recording(recordingId: number, signal?: AbortSignal): Promise<RecordingResponse>;
     recordings(query?: RecordingsQuery): Promise<RecordingsResponse>;
     summary(): Promise<SummaryResponse>;
-    private uploadFromStream;
-    private uploadChunkSlice;
-    private initChunkRequest;
-    private uploadChunkRequest;
+    private buildUploadMetadata;
+    private startResumableSession;
+    private uploadToGcs;
+    private uploadRangeWithRetry;
+    private putRange;
+    private probeUpload;
+    private readChunks;
     private recall;
     private get;
     private post;
-    private postForm;
     private request;
+    private responseData;
+    private sendDiagnostic;
+    private isRetryable;
+    private retryDelay;
+    private sleep;
+    private throwIfAborted;
+    private nextOffset;
+    private toArrayBuffer;
+    private errorMessage;
+    private withDiagnosticId;
+    private createDiagnosticId;
     private get loginLink();
     private get userLink();
     private get usageLink();
     private get subscriptionDetailsLink();
-    private get initChunkLink();
-    private get uploadChunkLink();
-    private get finalizeChunkLink();
+    private get signUploadLink();
+    private completeUploadLink;
+    private get diagnosticsLink();
     private get transcriptionLink();
+    private get recordingStatusLink();
     private get summaryLink();
     private get recordingsLink();
     private translateLink;
     private recordingLink;
     private mergeCookies;
-    private getCountChunks;
 }
+export {};
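
Note on the new private members (`putRange`, `probeUpload`, `nextOffset`): their bodies are not included in this diff. As a hedged illustration of what a Google Cloud Storage resumable upload generally involves, the sketch below builds the `Content-Range` header for one chunk, builds the header for a status probe, and derives the next byte offset from the `Range` header of a `308` response. The function names are hypothetical and the header formats follow the public GCS resumable-upload protocol, not this package's source.

```typescript
// Hypothetical helpers based on the public GCS resumable-upload protocol.
// They are not taken from whisperai-sdk, whose private methods are not shown in the diff.

/** Content-Range value for a chunk that starts at `offset` and is `chunkLength` bytes long. */
function contentRange(offset: number, chunkLength: number, totalSize: number): string {
  if (chunkLength <= 0) throw new RangeError("chunkLength must be positive")
  const lastByte = offset + chunkLength - 1
  if (offset < 0 || lastByte >= totalSize) throw new RangeError("chunk exceeds totalSize")
  return `bytes ${offset}-${lastByte}/${totalSize}`
}

/** Content-Range value for an empty PUT that asks how many bytes the server has persisted. */
function probeRange(totalSize: number): string {
  return `bytes */${totalSize}`
}

/** Next byte offset to send, given the Range header of a 308 (Resume Incomplete) response. */
function nextOffsetFromRange(rangeHeader: string | null): number {
  if (!rangeHeader) return 0 // no Range header means nothing has been persisted yet
  const match = /^bytes=0-(\d+)$/.exec(rangeHeader.trim())
  if (!match) throw new Error(`Unexpected Range header: ${rangeHeader}`)
  return Number(match[1]) + 1
}
```

After a failed or interrupted chunk, a client would typically send the probe, read the `Range` header, and resume from `nextOffsetFromRange(...)` rather than restarting the upload.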

package/dist/constant.d.ts CHANGED

@@ -7,3 +7,7 @@ export declare enum WhisperStatus {
     FAILED = "failed",
     CANCELLED = "cancelled"
 }
+export declare const DEFAULT_UPLOAD_CHUNK_SIZE: number;
+export declare const DEFAULT_MAX_UPLOAD_ATTEMPTS = 5;
+export declare const DEFAULT_POLL_INTERVAL_MS = 2000;
+export declare const DEFAULT_TRANSCRIPTION_TIMEOUT_MS: number;
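
Note on the new polling constants: the declaration file gives `DEFAULT_POLL_INTERVAL_MS` as 2000 but only types `DEFAULT_TRANSCRIPTION_TIMEOUT_MS` as `number`; the README diff states a 30-minute default. The sketch below assumes that value and shows one plausible way such defaults could drive a status-polling loop. `pollUntilDone`, `PollOptions`, and the set of terminal statuses are illustrative assumptions, not the SDK's `waitForTranscription` implementation.

```typescript
// Local stand-ins for the exported constants; the timeout value is assumed from the README.
const DEFAULT_POLL_INTERVAL_MS = 2000
const DEFAULT_TRANSCRIPTION_TIMEOUT_MS = 30 * 60 * 1000

// Assumed terminal statuses; anything else is treated as "still in progress".
const TERMINAL_STATUSES = new Set(["completed", "failed", "cancelled"])

interface PollOptions {
  pollIntervalMs?: number
  timeoutMs?: number
  signal?: AbortSignal
}

function sleep(ms: number): Promise<void> {
  return new Promise(resolve => setTimeout(resolve, ms))
}

/** Calls fetchStatus until it reports a terminal status, the timeout elapses, or the signal aborts. */
async function pollUntilDone(
  fetchStatus: () => Promise<string>,
  options: PollOptions = {}
): Promise<string> {
  const pollIntervalMs = options.pollIntervalMs ?? DEFAULT_POLL_INTERVAL_MS
  const timeoutMs = options.timeoutMs ?? DEFAULT_TRANSCRIPTION_TIMEOUT_MS
  const deadline = Date.now() + timeoutMs
  for (;;) {
    if (options.signal?.aborted) throw new Error("Polling aborted")
    const status = await fetchStatus()
    if (TERMINAL_STATUSES.has(status)) return status
    if (Date.now() >= deadline) throw new Error(`Timed out after ${timeoutMs} ms`)
    await sleep(pollIntervalMs)
  }
}
```

With the real client, `fetchStatus` might wrap `client.recordingStatus([id])`, though the shape of `RecordingStatusResponse` is not visible in this diff.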

package/dist/errors.d.ts CHANGED

@@ -16,3 +16,20 @@ export declare class WhisperNetworkError extends WhisperError {
     readonly code = "NETWORK_ERROR";
     constructor(cause?: unknown);
 }
+export declare class WhisperUploadError extends WhisperError {
+    readonly diagnosticId?: string | undefined;
+    readonly code = "UPLOAD_ERROR";
+    constructor(message: string, diagnosticId?: string | undefined, cause?: unknown);
+}
+export declare class WhisperTranscriptionError extends WhisperError {
+    readonly recordingId: number;
+    readonly status: string;
+    readonly code = "TRANSCRIPTION_ERROR";
+    constructor(recordingId: number, status: string);
+}
+export declare class WhisperTimeoutError extends WhisperError {
+    readonly recordingId: number;
+    readonly timeoutMs: number;
+    readonly code = "TIMEOUT_ERROR";
+    constructor(recordingId: number, timeoutMs: number);
+}
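
Note on the three new error classes: the 2.0.0 README lists them but drops the `try`/`catch` example that 1.0.2 had. The sketch below shows how a caller might branch on them using the fields declared above. To stay runnable on its own it defines local stand-in classes that mirror the declarations (without the `cause` argument); the error messages and the `describeFailure` helper are assumptions made for illustration.

```typescript
// Local stand-ins mirroring errors.d.ts so the sketch runs without the package installed.
// In real code these classes would be imported from "whisperai-sdk".
class WhisperError extends Error {}

class WhisperUploadError extends WhisperError {
  readonly code = "UPLOAD_ERROR"
  readonly diagnosticId: string | undefined
  constructor(message: string, diagnosticId?: string) {
    super(message)
    this.diagnosticId = diagnosticId
  }
}

class WhisperTranscriptionError extends WhisperError {
  readonly code = "TRANSCRIPTION_ERROR"
  readonly recordingId: number
  readonly status: string
  constructor(recordingId: number, status: string) {
    super(`Recording ${recordingId} ended with status ${status}`) // assumed message
    this.recordingId = recordingId
    this.status = status
  }
}

class WhisperTimeoutError extends WhisperError {
  readonly code = "TIMEOUT_ERROR"
  readonly recordingId: number
  readonly timeoutMs: number
  constructor(recordingId: number, timeoutMs: number) {
    super(`Recording ${recordingId} timed out`) // assumed message
    this.recordingId = recordingId
    this.timeoutMs = timeoutMs
  }
}

/** Turns a thrown value into a short, log-friendly description. */
function describeFailure(error: unknown): string {
  if (error instanceof WhisperUploadError) {
    return `upload failed (diagnostic ${error.diagnosticId ?? "n/a"}): ${error.message}`
  }
  if (error instanceof WhisperTranscriptionError) {
    return `recording ${error.recordingId} ended with status ${error.status}`
  }
  if (error instanceof WhisperTimeoutError) {
    return `recording ${error.recordingId} not finished after ${error.timeoutMs} ms`
  }
  return error instanceof Error ? error.message : String(error)
}
```

Because `index.d.ts` now re-exports `./errors.js`, the real classes can be imported from the package root rather than from the `whisperai-sdk/errors` subpath used in the 1.0.2 README.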

package/dist/index.d.ts CHANGED

@@ -1,3 +1,4 @@
 export * from "./types.js";
 export * from "./client.js";
 export * from "./constant.js";
+export * from "./errors.js";