prompt-api-polyfill 0.1.0 → 0.2.0

package/README.md CHANGED
@@ -1,8 +1,12 @@
-# Prompt API Polyfill (Firebase AI Logic backend)
+# Prompt API Polyfill
 
 This package provides a browser polyfill for the
-[Prompt API `LanguageModel`](https://github.com/webmachinelearning/prompt-api)
-backed by **Firebase AI Logic**.
+[Prompt API `LanguageModel`](https://github.com/webmachinelearning/prompt-api),
+supporting dynamic backends:
+
+- **Firebase AI Logic**
+- **Google Gemini API**
+- **OpenAI API**
 
 When loaded in the browser, it defines a global:
 
@@ -13,8 +17,28 @@ window.LanguageModel;
 so you can use the Prompt API shape even in environments where it is not yet
 natively available.
 
-- Back end: Firebase AI Logic
-- Default model: `gemini-2.5-flash-lite` (configurable via `modelName`)
+## Supported Backends
+
+### Firebase AI Logic
+
+- **Uses**: `firebase/ai` SDK.
+- **Select by setting**: `window.FIREBASE_CONFIG`.
+- **Model**: Uses default if not specified (see
+  [`backends/defaults.js`](backends/defaults.js)).
+
+### Google Gemini API
+
+- **Uses**: `@google/generative-ai` SDK.
+- **Select by setting**: `window.GEMINI_CONFIG`.
+- **Model**: Uses default if not specified (see
+  [`backends/defaults.js`](backends/defaults.js)).
+
+### OpenAI API
+
+- **Uses**: `openai` SDK.
+- **Select by setting**: `window.OPENAI_CONFIG`.
+- **Model**: Uses default if not specified (see
+  [`backends/defaults.js`](backends/defaults.js)).
 
 ---
 
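A minimal sketch of how backend selection behaves from the page's point of view, based on the README content above: the polyfill keys off whichever config global the page defines. The `detectBackend` name and the probe order below are illustrative assumptions, not the package's actual code.

```js
// Hypothetical illustration (assumed names/order): pick a backend from the
// window config globals the README documents.
function detectBackend() {
  if (window.FIREBASE_CONFIG) {
    return { backend: 'firebase', config: window.FIREBASE_CONFIG };
  }
  if (window.GEMINI_CONFIG) {
    return { backend: 'gemini', config: window.GEMINI_CONFIG };
  }
  if (window.OPENAI_CONFIG) {
    return { backend: 'openai', config: window.OPENAI_CONFIG };
  }
  throw new Error('No backend config found on window');
}
```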
@@ -28,37 +52,78 @@ npm install prompt-api-polyfill
 
 ## Quick start
 
-1. **Create a Firebase project with Generative AI enabled** (see Configuration
-   below).
-2. **Provide your Firebase config** on `window.FIREBASE_CONFIG`.
-3. **Import the polyfill** so it can attach `window.LanguageModel`.
+### Backed by Firebase
 
-### Example (using a JSON config file)
-
-Create a `.env.json` file (see
-[Configuring `dot_env.json` / `.env.json`](#configuring-dot_envjson--envjson))
-and then use it from a browser entry point:
+1. **Create a Firebase project with Generative AI enabled**.
+2. **Provide your Firebase config** on `window.FIREBASE_CONFIG`.
+3. **Import the polyfill**.
 
 ```html
 <script type="module">
   import firebaseConfig from './.env.json' with { type: 'json' };
 
-  // Make the config available to the polyfill
+  // Set FIREBASE_CONFIG to select the Firebase backend
  window.FIREBASE_CONFIG = firebaseConfig;
 
-  // Only load the polyfill if LanguageModel is not available natively
   if (!('LanguageModel' in window)) {
     await import('prompt-api-polyfill');
   }
 
   const session = await LanguageModel.create();
-  const text = await session.prompt('Say hello from the polyfill!');
-  console.log(text);
 </script>
 ```
 
-> **Note**: The polyfill attaches `LanguageModel` to `window` as a side effect.
-> There are no named exports.
+### Backed by Gemini API
+
+1. **Get a Gemini API Key** from
+   [Google AI Studio](https://aistudio.google.com/).
+2. **Provide your API Key** on `window.GEMINI_CONFIG`.
+3. **Import the polyfill**.
+
+```html
+<script type="module">
+  // NOTE: Do not expose real keys in production source code!
+  // Set GEMINI_CONFIG to select the Gemini backend
+  window.GEMINI_CONFIG = { apiKey: 'YOUR_GEMINI_API_KEY' };
+
+  if (!('LanguageModel' in window)) {
+    await import('prompt-api-polyfill');
+  }
+
+  const session = await LanguageModel.create();
+</script>
+```
+
+### Backed by OpenAI API
+
+1. **Get an OpenAI API Key** from the
+   [OpenAI Platform](https://platform.openai.com/).
+2. **Provide your API Key** on `window.OPENAI_CONFIG`.
+3. **Import the polyfill**.
+
+```html
+<script type="module">
+  // NOTE: Do not expose real keys in production source code!
+  // Set OPENAI_CONFIG to select the OpenAI backend
+  window.OPENAI_CONFIG = { apiKey: 'YOUR_OPENAI_API_KEY' };
+
+  if (!('LanguageModel' in window)) {
+    await import('prompt-api-polyfill');
+  }
+
+  const session = await LanguageModel.create();
+</script>
+```
+
+---
+
+## Configuration
+
+### Example (using a JSON config file)
+
+Create a `.env.json` file (see
+[Configuring `dot_env.json` / `.env.json`](#configuring-dot_envjson--envjson))
+and then use it from a browser entry point.
 
 ### Example based on `index.html` in this repo
 
@@ -76,8 +141,8 @@ A simplified version of how it is wired up:
 
 ```html
 <script type="module">
-  import firebaseConfig from './.env.json' with { type: 'json' };
-  window.FIREBASE_CONFIG = firebaseConfig;
+  // Set GEMINI_CONFIG to select the Gemini backend
+  window.GEMINI_CONFIG = { apiKey: 'YOUR_GEMINI_API_KEY' };
 
   // Load the polyfill only when necessary
   if (!('LanguageModel' in window)) {
@@ -110,17 +175,20 @@ This repo ships with a template file:
 ```jsonc
 // dot_env.json
 {
-  "apiKey": "",
+  // For Firebase:
   "projectId": "",
   "appId": "",
   "modelName": "",
+
+  // For Firebase OR Gemini OR OpenAI:
+  "apiKey": "",
 }
 ```
 
 You should treat `dot_env.json` as a **template** and create a real `.env.json`
 that is **not committed** with your secrets.
 
-### 1. Create `.env.json`
+### Create `.env.json`
 
 Copy the template:
 
@@ -128,63 +196,56 @@ Copy the template:
 cp dot_env.json .env.json
 ```
 
-Then open `.env.json` and fill in the values from your Firebase project:
+Then open `.env.json` and fill in the values.
+
+**For Firebase:**
 
 ```json
 {
   "apiKey": "YOUR_FIREBASE_WEB_API_KEY",
   "projectId": "your-gcp-project-id",
   "appId": "YOUR_FIREBASE_APP_ID",
-  "modelName": "gemini-2.5-flash-lite"
+  "modelName": "choose-model-for-firebase"
 }
 ```
 
-### 2. Field-by-field explanation
-
-- `apiKey` Your **Firebase Web API key**. You can find this in the Firebase
-  Console under: _Project settings → General → Your apps → Web app_.
+**For Gemini:**
 
-- `projectId` The **GCP / Firebase project ID**, e.g. `my-ai-project`.
-
-- `appId` The **Firebase Web app ID**, e.g. `1:1234567890:web:abcdef123456`.
-
-- `modelName` (optional) The Gemini model ID to use. If omitted, the polyfill
-  defaults to:
+```json
+{
+  "apiKey": "YOUR_GEMINI_API_KEY",
+  "modelName": "choose-model-for-gemini"
+}
+```
 
-  ```json
-  "modelName": "gemini-2.5-flash-lite"
-  ```
+**For OpenAI:**
 
-  You can substitute another supported Gemini model here if desired.
+```json
+{
+  "apiKey": "YOUR_OPENAI_API_KEY",
+  "modelName": "choose-model-for-openai"
+}
+```
 
-These fields are passed directly to:
+### Field-by-field explanation
 
-- `initializeApp(firebaseConfig)` from Firebase
-- `getAI(app, { backend: new GoogleAIBackend() })` from the Firebase AI SDK
+- `apiKey`:
+  - **Firebase**: Your Firebase Web API key.
+  - **Gemini**: Your Gemini API key.
+  - **OpenAI**: Your OpenAI API key.
+- `projectId` / `appId`: **Firebase only**.
 
-and `modelName` is used to select which Gemini model to call.
+- `modelName` (optional): The model ID to use. If not provided, the polyfill
+  uses the defaults defined in [`backends/defaults.js`](backends/defaults.js).
 
 > **Important:** Do **not** commit a real `.env.json` with production
 > credentials to source control. Use `dot_env.json` as the committed template
 > and keep `.env.json` local.
 
-### 3. Wiring the config into the polyfill
+### Wiring the config into the polyfill
 
-Once `.env.json` is filled out, you can import it and expose it to the polyfill
-exactly like in `index.html`:
-
-```js
-import firebaseConfig from './.env.json' with { type: 'json' };
-
-window.FIREBASE_CONFIG = firebaseConfig;
-
-if (!('LanguageModel' in window)) {
-  await import('prompt-api-polyfill');
-}
-```
-
-From this point on, `LanguageModel.create()` will use your Firebase
-configuration.
+Once `.env.json` is filled out, you can import it and expose it to the polyfill.
+See the [Quick start](#quick-start) examples above.
 
 ---
 
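For reference, the wiring is the same pattern the Quick start uses; a minimal sketch (assign the config to the global that matches your backend):

```js
import config from './.env.json' with { type: 'json' };

// Use window.GEMINI_CONFIG or window.OPENAI_CONFIG instead as appropriate.
window.FIREBASE_CONFIG = config;

// Load the polyfill only when LanguageModel is not available natively.
if (!('LanguageModel' in window)) {
  await import('prompt-api-polyfill');
}
```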
@@ -200,8 +261,7 @@ For a complete, end-to-end example, see the `index.html` file in this directory.
 
 ## Running the demo locally
 
-1. Install dependencies and this package (if using the npm-installed version in
-   another project):
+1. Install dependencies:
 
    ```bash
    npm install
@@ -211,17 +271,34 @@ For a complete, end-to-end example, see the `index.html` file in this directory.
 
    ```bash
    cp dot_env.json .env.json
-   # then edit .env.json with your Firebase and model settings
    ```
 
 3. Serve `index.html`:
-
    ```bash
    npm start
    ```
 
-   You should see network requests to the Vertex AI / Firebase AI backend and
-   streaming responses logged in the console.
+   You should see network requests to the selected backend in the logs.
+
+---
+
+## Testing
+
+The project includes a comprehensive test suite that runs in a headless browser.
+
+### Running Browser Tests
+
+The suite uses `playwright` to run tests in a real Chromium instance. This is
+the recommended way to verify real-browser behavior and multimodal support.
+
+```bash
+npm run test:browser
+```
+
+To see the browser and DevTools while testing, you can modify
+`vitest.browser.config.js` to set `headless: false`.
+
+---
 
 ## License
 
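For the `headless: false` tweak mentioned above, a hedged sketch of what the relevant part of `vitest.browser.config.js` might look like; the shipped file may differ, so treat this as illustrative and check the Vitest browser-mode docs for the exact shape your Vitest version expects:

```js
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    browser: {
      enabled: true,
      // Set to false to watch the Chromium window and open DevTools.
      headless: false,
    },
  },
});
```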
@@ -6,7 +6,9 @@ import { Schema } from 'https://esm.run/firebase/ai';
  * @returns {Schema} - The Firebase Vertex AI Schema instance.
  */
 export function convertJsonSchemaToVertexSchema(jsonSchema) {
-  if (!jsonSchema) return undefined;
+  if (!jsonSchema) {
+    return undefined;
+  }
 
   // Extract common base parameters supported by all Schema types
   const baseParams = {
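A quick usage sketch for the converter above, assuming the module is imported; the input is an ordinary JSON Schema object, and the shape shown here is illustrative:

```js
// Convert a plain JSON Schema into the Firebase AI SDK's Schema type.
const vertexSchema = convertJsonSchemaToVertexSchema({
  type: 'object',
  properties: {
    greeting: { type: 'string' },
  },
  required: ['greeting'],
});

// Falsy input short-circuits:
// convertJsonSchemaToVertexSchema(null) === undefined
```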
@@ -1,7 +1,11 @@
 export default class MultimodalConverter {
   static async convert(type, value) {
-    if (type === 'image') return this.processImage(value);
-    if (type === 'audio') return this.processAudio(value);
+    if (type === 'image') {
+      return this.processImage(value);
+    }
+    if (type === 'audio') {
+      return this.processAudio(value);
+    }
     throw new DOMException(
       `Unsupported media type: ${type}`,
       'NotSupportedError'
@@ -16,13 +20,16 @@ export default class MultimodalConverter {
 
     // BufferSource (ArrayBuffer/View) -> Sniff or Default
     if (ArrayBuffer.isView(source) || source instanceof ArrayBuffer) {
-      const buffer = source instanceof ArrayBuffer ? source : source.buffer;
+      const u8 =
+        source instanceof ArrayBuffer
+          ? new Uint8Array(source)
+          : new Uint8Array(source.buffer, source.byteOffset, source.byteLength);
+      const buffer = u8.buffer.slice(
+        u8.byteOffset,
+        u8.byteOffset + u8.byteLength
+      );
       const base64 = this.arrayBufferToBase64(buffer);
-      // Basic sniffing for PNG/JPEG magic bytes
-      const u8 = new Uint8Array(buffer);
-      let mimeType = 'image/png'; // Default
-      if (u8[0] === 0xff && u8[1] === 0xd8) mimeType = 'image/jpeg';
-      else if (u8[0] === 0x89 && u8[1] === 0x50) mimeType = 'image/png';
+      const mimeType = this.#sniffImageMimeType(u8) || 'image/png';
 
       return { inlineData: { data: base64, mimeType } };
     }
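The normalization above matters for typed-array views: slicing to `byteOffset`/`byteLength` ensures only the view's bytes are encoded, not the whole underlying buffer. A hedged usage sketch from a module script (the fetched URL is a placeholder):

```js
// Convert raw image bytes into the { inlineData: { data, mimeType } } shape
// the backends expect.
const bytes = await (await fetch('photo.jpg')).arrayBuffer();
const part = await MultimodalConverter.convert('image', bytes);
console.log(part.inlineData.mimeType); // e.g. 'image/jpeg' after sniffing
```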
@@ -32,6 +39,111 @@ export default class MultimodalConverter {
     return this.canvasSourceToInlineData(source);
   }
 
+  static #sniffImageMimeType(u8) {
+    const len = u8.length;
+    if (len < 4) {
+      return null;
+    }
+
+    // JPEG: FF D8 FF
+    if (u8[0] === 0xff && u8[1] === 0xd8 && u8[2] === 0xff) {
+      return 'image/jpeg';
+    }
+
+    // PNG: 89 50 4E 47 0D 0A 1A 0A
+    if (
+      u8[0] === 0x89 &&
+      u8[1] === 0x50 &&
+      u8[2] === 0x4e &&
+      u8[3] === 0x47 &&
+      u8[4] === 0x0d &&
+      u8[5] === 0x0a &&
+      u8[6] === 0x1a &&
+      u8[7] === 0x0a
+    ) {
+      return 'image/png';
+    }
+
+    // GIF: GIF87a / GIF89a
+    if (u8[0] === 0x47 && u8[1] === 0x49 && u8[2] === 0x46 && u8[3] === 0x38) {
+      return 'image/gif';
+    }
+
+    // WebP: RIFF (offset 0) + WEBP (offset 8)
+    if (
+      u8[0] === 0x52 &&
+      u8[1] === 0x49 &&
+      u8[2] === 0x46 &&
+      u8[3] === 0x46 &&
+      u8[8] === 0x57 &&
+      u8[9] === 0x45 &&
+      u8[10] === 0x42 &&
+      u8[11] === 0x50
+    ) {
+      return 'image/webp';
+    }
+
+    // BMP: BM
+    if (u8[0] === 0x42 && u8[1] === 0x4d) {
+      return 'image/bmp';
+    }
+
+    // ICO: 00 00 01 00
+    if (u8[0] === 0x00 && u8[1] === 0x00 && u8[2] === 0x01 && u8[3] === 0x00) {
+      return 'image/x-icon';
+    }
+
+    // TIFF: 49 49 2A 00 (LE) / 4D 4D 00 2A (BE)
+    if (
+      (u8[0] === 0x49 && u8[1] === 0x49 && u8[2] === 0x2a) ||
+      (u8[0] === 0x4d && u8[1] === 0x4d && u8[2] === 0x00 && u8[3] === 0x2a)
+    ) {
+      return 'image/tiff';
+    }
+
+    // ISOBMFF (AVIF / HEIC / HEIF)
+    // "ftyp" at offset 4
+    if (u8[4] === 0x66 && u8[5] === 0x74 && u8[6] === 0x79 && u8[7] === 0x70) {
+      const type = String.fromCharCode(u8[8], u8[9], u8[10], u8[11]);
+      if (type === 'avif' || type === 'avis') {
+        return 'image/avif';
+      }
+      if (
+        type === 'heic' ||
+        type === 'heix' ||
+        type === 'hevc' ||
+        type === 'hevx'
+      ) {
+        return 'image/heic';
+      }
+      if (type === 'mif1' || type === 'msf1') {
+        return 'image/heif';
+      }
+    }
+
+    // JPEG XL: FF 0A or container bits
+    if (u8[0] === 0xff && u8[1] === 0x0a) {
+      return 'image/jxl';
+    }
+    // Container: 00 00 00 0C 4A 58 4C 20 0D 0A 87 0A ("JXL ")
+    if (u8[0] === 0x00 && u8[4] === 0x4a && u8[5] === 0x58 && u8[6] === 0x4c) {
+      return 'image/jxl';
+    }
+
+    // JPEG 2000
+    if (u8[0] === 0x00 && u8[4] === 0x6a && u8[5] === 0x50 && u8[6] === 0x20) {
+      return 'image/jp2';
+    }
+
+    // SVG: Check for <svg or <?xml (heuristics)
+    const preview = String.fromCharCode(...u8.slice(0, 100)).toLowerCase();
+    if (preview.includes('<svg') || preview.includes('<?xml')) {
+      return 'image/svg+xml';
+    }
+
+    return null;
+  }
+
   static async processAudio(source) {
     // Blob
     if (source instanceof Blob) {
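The sniffing code hands the buffer to `this.arrayBufferToBase64`, which is defined elsewhere in the class. For context, a standalone sketch of the usual approach to that helper (not necessarily the package's exact code):

```js
// One common way to base64-encode an ArrayBuffer in the browser.
function arrayBufferToBase64(buffer) {
  let binary = '';
  const bytes = new Uint8Array(buffer);
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}
```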
@@ -46,8 +158,20 @@ export default class MultimodalConverter {
     }
 
     // BufferSource -> Assume it's already an audio file (mp3/wav)
-    if (ArrayBuffer.isView(source) || source instanceof ArrayBuffer) {
-      const buffer = source instanceof ArrayBuffer ? source : source.buffer;
+    const isArrayBuffer =
+      source instanceof ArrayBuffer ||
+      (source &&
+        source.constructor &&
+        source.constructor.name === 'ArrayBuffer');
+    const isView =
+      ArrayBuffer.isView(source) ||
+      (source &&
+        source.buffer &&
+        (source.buffer instanceof ArrayBuffer ||
+          source.buffer.constructor.name === 'ArrayBuffer'));
+
+    if (isArrayBuffer || isView) {
+      const buffer = isArrayBuffer ? source : source.buffer;
       return {
         inlineData: {
           data: this.arrayBufferToBase64(buffer),
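The duck-typed checks above guard against cross-realm buffers: a buffer created in another realm fails `instanceof ArrayBuffer` in the parent page, while `constructor.name` still reads `'ArrayBuffer'`. A small illustration, assuming a same-origin iframe:

```js
// `buf` is created in the iframe's realm, so it uses that realm's prototypes.
const buf = iframe.contentWindow.eval('new ArrayBuffer(8)');
console.log(buf instanceof ArrayBuffer); // false in the parent realm
console.log(buf.constructor.name); // 'ArrayBuffer'
```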
@@ -65,14 +189,16 @@ export default class MultimodalConverter {
     return new Promise((resolve, reject) => {
       const reader = new FileReader();
       reader.onloadend = () => {
-        if (reader.error) reject(reader.error);
-        else
+        if (reader.error) {
+          reject(reader.error);
+        } else {
           resolve({
             inlineData: {
               data: reader.result.split(',')[1],
               mimeType: blob.type,
             },
           });
+        }
       };
       reader.readAsDataURL(blob);
     });
package/package.json CHANGED
@@ -1,7 +1,7 @@
 {
   "name": "prompt-api-polyfill",
-  "version": "0.1.0",
-  "description": "Polyfill for the Prompt API (`LanguageModel`) backed by Firebase AI Logic.",
+  "version": "0.2.0",
+  "description": "Polyfill for the Prompt API (`LanguageModel`) backed by Firebase AI Logic, Gemini API, or OpenAI API.",
   "type": "module",
   "main": "./prompt-api-polyfill.js",
   "module": "./prompt-api-polyfill.js",
@@ -22,6 +22,8 @@
     "language-model",
     "polyfill",
     "firebase",
+    "gemini",
+    "openai",
     "web-ai"
   ],
   "repository": {
@@ -35,9 +37,22 @@
   "homepage": "https://github.com/GoogleChromeLabs/web-ai-demos/tree/main/prompt-api-polyfill/README.md",
   "license": "Apache-2.0",
   "scripts": {
-    "start": "npx http-server"
+    "start": "npx http-server",
+    "test:browser": "node scripts/list-backends.js && vitest run -c vitest.browser.config.js .browser.test.js",
+    "fix": "npx prettier --write ."
   },
   "devDependencies": {
-    "http-server": "^14.1.1"
+    "@firebase/ai": "^2.6.1",
+    "@google/generative-ai": "^0.24.1",
+    "@vitest/browser": "^4.0.17",
+    "@vitest/browser-playwright": "^4.0.17",
+    "ajv": "^8.17.1",
+    "firebase": "^12.7.0",
+    "http-server": "^14.1.1",
+    "jsdom": "^27.4.0",
+    "openai": "^6.16.0",
+    "playwright": "^1.57.0",
+    "prettier-plugin-curly": "^0.4.1",
+    "vitest": "^4.0.17"
   }
 }