@huen123/llm-token-counter 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 angusmhlee113
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,162 @@
1
+ # @huen123/llm-token-counter
2
+
3
+ Count prompt text and simple chat-message tokens for mainstream LLM families without making remote API calls.
4
+
5
+ `@huen123/llm-token-counter` is a Node.js-first npm package for quick token budgeting across OpenAI, Anthropic, Gemini, Mistral, Cohere, and Llama model families. Every result includes both the token count and metadata that tells you whether the count is exact or estimated.
6
+
7
+ ## Install
8
+
9
+ ```bash
10
+ npm install @huen123/llm-token-counter
11
+ ```
12
+
13
+ ## What It Counts
14
+
15
+ - Plain text with `countTokens({ model, input })`
16
+ - Structured chat messages with `countChatTokens({ model, messages })`
17
+ - Supported model metadata with `getModelInfo(model)`
18
+ - Curated canonical model families with `listSupportedModels()`
19
+
20
+ ## Supported Precision
21
+
22
+ | Provider family | Plain text | Chat messages | Strategy |
23
+ | --- | --- | --- | --- |
24
+ | OpenAI (`gpt-*`, `o*`, `chatgpt-*`) | Exact | Estimated | `openai-tiktoken` |
25
+ | Anthropic (`claude-*`) | Estimated | Estimated | `anthropic-tokenizer` |
26
+ | Google (`gemini-*`) | Estimated | Estimated | `gemini-char4` |
27
+ | Mistral (`mistral-*`) | Estimated | Estimated | `cl100k-heuristic` |
28
+ | Cohere (`command-*`, `aya-*`) | Estimated | Estimated | `cl100k-heuristic` |
29
+ | Meta (`llama-*`) | Estimated | Estimated | `cl100k-heuristic` |
30
+
31
+ OpenAI plain-text counts use the model tokenizer directly. Chat-message counts are estimated for every provider because message wrappers vary across APIs and versions.
32
+
33
+ ## Quick Start
34
+
35
+ ```ts
36
+ import {
37
+ countChatTokens,
38
+ countTokens,
39
+ getModelInfo,
40
+ listSupportedModels,
41
+ } from "@huen123/llm-token-counter";
42
+
43
+ const promptResult = countTokens({
44
+ model: "gpt-4o",
45
+ input: "hello world!",
46
+ });
47
+
48
+ console.log(promptResult);
49
+ // {
50
+ // requestedModel: 'gpt-4o',
51
+ // resolvedModel: 'gpt-4o',
52
+ // provider: 'openai',
53
+ // family: 'gpt-4o',
54
+ // tokenCount: 3,
55
+ // precision: 'exact',
56
+ // strategy: 'openai-tiktoken'
57
+ // }
58
+
59
+ const chatResult = countChatTokens({
60
+ model: "claude-3-5-sonnet-latest",
61
+ messages: [
62
+ { role: "system", content: "Be concise." },
63
+ { role: "user", content: "Summarize this paragraph." },
64
+ ],
65
+ });
66
+
67
+ console.log(chatResult.precision); // "estimated"
68
+
69
+ console.log(getModelInfo("chatgpt-4o-latest"));
70
+ console.log(listSupportedModels());
71
+ ```
72
+
73
+ ## API
74
+
75
+ ### `countTokens({ model, input })`
76
+
77
+ Counts tokens for a plain string and returns:
78
+
79
+ ```ts
80
+ type TokenCountResult = {
81
+ requestedModel: string;
82
+ resolvedModel: string;
83
+ provider: "openai" | "anthropic" | "google" | "mistral" | "cohere" | "meta";
84
+ family: string;
85
+ tokenCount: number;
86
+ precision: "exact" | "estimated";
87
+ strategy:
88
+ | "openai-tiktoken"
89
+ | "anthropic-tokenizer"
90
+ | "gemini-char4"
91
+ | "cl100k-heuristic";
92
+ };
93
+ ```
94
+
95
+ ### `countChatTokens({ model, messages })`
96
+
97
+ Accepts only simple string chat messages:
98
+
99
+ ```ts
100
+ type ChatMessage = {
101
+ role: "system" | "user" | "assistant";
102
+ content: string;
103
+ };
104
+ ```
105
+
106
+ Chat counts add a lightweight per-message overhead estimate on top of each message content count.
107
+
108
+ ### `getModelInfo(model)`
109
+
110
+ Resolves aliases to canonical families and returns:
111
+
112
+ ```ts
113
+ type ModelInfo = {
114
+ resolvedModel: string;
115
+ provider: "openai" | "anthropic" | "google" | "mistral" | "cohere" | "meta";
116
+ family: string;
117
+ precision: "exact" | "estimated";
118
+ strategy:
119
+ | "openai-tiktoken"
120
+ | "anthropic-tokenizer"
121
+ | "gemini-char4"
122
+ | "cl100k-heuristic";
123
+ aliases: string[];
124
+ };
125
+ ```
126
+
127
+ ### `listSupportedModels()`
128
+
129
+ Returns the curated canonical families bundled in the current package release.
130
+
131
+ ## Aliases and Unknown Models
132
+
133
+ - Known aliases are normalized to canonical families before counting.
134
+ - Unknown models throw an error with close suggestions instead of silently guessing providers.
135
+ - New provider releases usually only need registry updates in a new package version.
136
+
137
+ ## Unsupported in v1
138
+
139
+ - Tool-call accounting
140
+ - Multimodal inputs such as images, files, and audio
141
+ - Full provider request-body parsing
142
+ - Pricing estimates
143
+ - Remote model-catalog updates
144
+ - Browser runtime support
145
+
146
+ ## Development
147
+
148
+ ```bash
149
+ npm test
150
+ npm run build
151
+ npm pack
152
+ ```
153
+
154
+ ## Publish Checklist
155
+
156
+ ```bash
157
+ npm test
158
+ npm run build
159
+ npm publish --access public
160
+ ```
161
+
162
+ `prepublishOnly` already runs the build and test steps before `npm publish`.
package/dist/index.cjs ADDED
@@ -0,0 +1,316 @@
1
+ "use strict";
2
+ var __create = Object.create;
3
+ var __defProp = Object.defineProperty;
4
+ var __getOwnPropDesc = Object.getOwnPropertyDescriptor;
5
+ var __getOwnPropNames = Object.getOwnPropertyNames;
6
+ var __getProtoOf = Object.getPrototypeOf;
7
+ var __hasOwnProp = Object.prototype.hasOwnProperty;
8
+ var __export = (target, all) => {
9
+ for (var name in all)
10
+ __defProp(target, name, { get: all[name], enumerable: true });
11
+ };
12
+ var __copyProps = (to, from, except, desc) => {
13
+ if (from && typeof from === "object" || typeof from === "function") {
14
+ for (let key of __getOwnPropNames(from))
15
+ if (!__hasOwnProp.call(to, key) && key !== except)
16
+ __defProp(to, key, { get: () => from[key], enumerable: !(desc = __getOwnPropDesc(from, key)) || desc.enumerable });
17
+ }
18
+ return to;
19
+ };
20
+ var __toESM = (mod, isNodeMode, target) => (target = mod != null ? __create(__getProtoOf(mod)) : {}, __copyProps(
21
+ // If the importer is in node compatibility mode or this is not an ESM
22
+ // file that has been converted to a CommonJS file using a Babel-
23
+ // compatible transform (i.e. "__esModule" has not been set), then set
24
+ // "default" to the CommonJS "module.exports" for node compatibility.
25
+ isNodeMode || !mod || !mod.__esModule ? __defProp(target, "default", { value: mod, enumerable: true }) : target,
26
+ mod
27
+ ));
28
+ var __toCommonJS = (mod) => __copyProps(__defProp({}, "__esModule", { value: true }), mod);
29
+
30
+ // src/index.ts
31
+ var index_exports = {};
32
+ __export(index_exports, {
33
+ countChatTokens: () => countChatTokens,
34
+ countTokens: () => countTokens2,
35
+ getModelInfo: () => getModelInfo,
36
+ listSupportedModels: () => listSupportedModels
37
+ });
38
+ module.exports = __toCommonJS(index_exports);
39
+
40
+ // src/adapters/count-text.ts
41
+ var anthropicTokenizer = __toESM(require("@anthropic-ai/tokenizer"), 1);
42
+ var import_js_tiktoken = require("js-tiktoken");
43
+ var CL100K = (0, import_js_tiktoken.getEncoding)("cl100k_base");
44
+ var normalizeGeminiText = (input) => input.normalize("NFC").replace(/\r\n/g, "\n");
45
+ var countOpenAiTokens = (model, input) => (0, import_js_tiktoken.encodingForModel)(model).encode(input).length;
46
+ var countAnthropicTokens = (input) => anthropicTokenizer.countTokens(input);
47
+ var countGeminiTokens = (input) => Math.ceil(normalizeGeminiText(input).length / 4);
48
+ var countCl100kTokens = (input) => CL100K.encode(input).length;
49
+ var countTextWithEntry = (entry, input) => {
50
+ switch (entry.strategy) {
51
+ case "openai-tiktoken":
52
+ return countOpenAiTokens(entry.resolvedModel, input);
53
+ case "anthropic-tokenizer":
54
+ return countAnthropicTokens(input);
55
+ case "gemini-char4":
56
+ return countGeminiTokens(input);
57
+ case "cl100k-heuristic":
58
+ return countCl100kTokens(input);
59
+ default: {
60
+ const exhaustive = entry.strategy;
61
+ throw new Error(`Unsupported token strategy: ${exhaustive}`);
62
+ }
63
+ }
64
+ };
65
+ var countChatWithEntry = (entry, messages) => {
66
+ const textCount = messages.reduce((sum, message) => sum + countTextWithEntry(entry, message.content), 0);
67
+ return textCount + messages.length * 4 + 2;
68
+ };
69
+
70
+ // src/registry/models.ts
71
+ var MODEL_REGISTRY = [
72
+ {
73
+ resolvedModel: "gpt-4",
74
+ provider: "openai",
75
+ family: "gpt-4",
76
+ precision: "exact",
77
+ strategy: "openai-tiktoken",
78
+ aliases: [],
79
+ prefixes: []
80
+ },
81
+ {
82
+ resolvedModel: "gpt-4o",
83
+ provider: "openai",
84
+ family: "gpt-4o",
85
+ precision: "exact",
86
+ strategy: "openai-tiktoken",
87
+ aliases: ["chatgpt-4o-latest"],
88
+ prefixes: ["gpt-4o-", "chatgpt-4o-"]
89
+ },
90
+ {
91
+ resolvedModel: "gpt-4o-mini",
92
+ provider: "openai",
93
+ family: "gpt-4o-mini",
94
+ precision: "exact",
95
+ strategy: "openai-tiktoken",
96
+ aliases: [],
97
+ prefixes: ["gpt-4o-mini-"]
98
+ },
99
+ {
100
+ resolvedModel: "gpt-3.5-turbo",
101
+ provider: "openai",
102
+ family: "gpt-3.5-turbo",
103
+ precision: "exact",
104
+ strategy: "openai-tiktoken",
105
+ aliases: [],
106
+ prefixes: ["gpt-3.5-turbo-"]
107
+ },
108
+ {
109
+ resolvedModel: "o1",
110
+ provider: "openai",
111
+ family: "o1",
112
+ precision: "exact",
113
+ strategy: "openai-tiktoken",
114
+ aliases: [],
115
+ prefixes: ["o1-"]
116
+ },
117
+ {
118
+ resolvedModel: "o3-mini",
119
+ provider: "openai",
120
+ family: "o3-mini",
121
+ precision: "exact",
122
+ strategy: "openai-tiktoken",
123
+ aliases: [],
124
+ prefixes: ["o3-mini-"]
125
+ },
126
+ {
127
+ resolvedModel: "claude-3-5-sonnet",
128
+ provider: "anthropic",
129
+ family: "claude-3-5-sonnet",
130
+ precision: "estimated",
131
+ strategy: "anthropic-tokenizer",
132
+ aliases: ["claude-3-5-sonnet-latest"],
133
+ prefixes: ["claude-3-5-sonnet-"]
134
+ },
135
+ {
136
+ resolvedModel: "claude-3-haiku",
137
+ provider: "anthropic",
138
+ family: "claude-3-haiku",
139
+ precision: "estimated",
140
+ strategy: "anthropic-tokenizer",
141
+ aliases: ["claude-3-haiku-latest"],
142
+ prefixes: ["claude-3-haiku-"]
143
+ },
144
+ {
145
+ resolvedModel: "gemini-2.5-pro",
146
+ provider: "google",
147
+ family: "gemini-2.5-pro",
148
+ precision: "estimated",
149
+ strategy: "gemini-char4",
150
+ aliases: [],
151
+ prefixes: ["gemini-2.5-pro-"]
152
+ },
153
+ {
154
+ resolvedModel: "gemini-2.5-flash",
155
+ provider: "google",
156
+ family: "gemini-2.5-flash",
157
+ precision: "estimated",
158
+ strategy: "gemini-char4",
159
+ aliases: [],
160
+ prefixes: ["gemini-2.5-flash-"]
161
+ },
162
+ {
163
+ resolvedModel: "mistral-large",
164
+ provider: "mistral",
165
+ family: "mistral-large",
166
+ precision: "estimated",
167
+ strategy: "cl100k-heuristic",
168
+ aliases: ["mistral-large-latest"],
169
+ prefixes: ["mistral-large-"]
170
+ },
171
+ {
172
+ resolvedModel: "command-r-plus",
173
+ provider: "cohere",
174
+ family: "command-r-plus",
175
+ precision: "estimated",
176
+ strategy: "cl100k-heuristic",
177
+ aliases: [],
178
+ prefixes: ["command-r-plus-"]
179
+ },
180
+ {
181
+ resolvedModel: "aya-expanse-32b",
182
+ provider: "cohere",
183
+ family: "aya-expanse-32b",
184
+ precision: "estimated",
185
+ strategy: "cl100k-heuristic",
186
+ aliases: [],
187
+ prefixes: ["aya-expanse-32b-"]
188
+ },
189
+ {
190
+ resolvedModel: "llama-3.1-70b",
191
+ provider: "meta",
192
+ family: "llama-3.1-70b",
193
+ precision: "estimated",
194
+ strategy: "cl100k-heuristic",
195
+ aliases: ["llama-3.1-70b-instruct"],
196
+ prefixes: ["llama-3.1-70b-"]
197
+ }
198
+ ];
199
+
200
+ // src/registry/resolve-model.ts
201
+ var normalizeModelId = (model) => model.trim().toLowerCase();
202
+ var levenshtein = (left, right) => {
203
+ const rows = left.length + 1;
204
+ const cols = right.length + 1;
205
+ const matrix = Array.from({ length: rows }, () => Array(cols).fill(0));
206
+ for (let row = 0; row < rows; row += 1) {
207
+ matrix[row][0] = row;
208
+ }
209
+ for (let col = 0; col < cols; col += 1) {
210
+ matrix[0][col] = col;
211
+ }
212
+ for (let row = 1; row < rows; row += 1) {
213
+ for (let col = 1; col < cols; col += 1) {
214
+ const cost = left[row - 1] === right[col - 1] ? 0 : 1;
215
+ matrix[row][col] = Math.min(
216
+ matrix[row - 1][col] + 1,
217
+ matrix[row][col - 1] + 1,
218
+ matrix[row - 1][col - 1] + cost
219
+ );
220
+ }
221
+ }
222
+ return matrix[left.length][right.length];
223
+ };
224
+ var normalizedName = (value) => value.replace(/[^a-z0-9]/g, "");
225
+ var findClosestModels = (model) => {
226
+ const normalizedInput = normalizedName(model);
227
+ const candidates = MODEL_REGISTRY.flatMap((entry) => [entry.resolvedModel, ...entry.aliases]);
228
+ return candidates.map((candidate) => ({
229
+ candidate,
230
+ distance: levenshtein(normalizedInput, normalizedName(candidate)),
231
+ lengthDifference: Math.abs(normalizedInput.length - normalizedName(candidate).length)
232
+ })).sort(
233
+ (left, right) => left.distance - right.distance || left.lengthDifference - right.lengthDifference || left.candidate.localeCompare(right.candidate)
234
+ ).slice(0, 3).map(({ candidate }) => candidate);
235
+ };
236
+ var findEntry = (normalizedModel) => {
237
+ const exactMatch = MODEL_REGISTRY.find(
238
+ (entry) => entry.resolvedModel === normalizedModel || entry.aliases.includes(normalizedModel)
239
+ );
240
+ if (exactMatch) {
241
+ return exactMatch;
242
+ }
243
+ return MODEL_REGISTRY.find(
244
+ (entry) => entry.prefixes.some((prefix) => normalizedModel.startsWith(prefix))
245
+ );
246
+ };
247
+ var resolveModel = (model) => {
248
+ const normalizedModel = normalizeModelId(model);
249
+ const entry = findEntry(normalizedModel);
250
+ if (!entry) {
251
+ const suggestions = findClosestModels(normalizedModel);
252
+ const suggestionText = suggestions.length > 0 ? ` Did you mean: ${suggestions.join(", ")}?` : "";
253
+ throw new Error(`Unsupported model "${model}".${suggestionText}`);
254
+ }
255
+ return entry;
256
+ };
257
+ var toModelInfo = (entry) => ({
258
+ resolvedModel: entry.resolvedModel,
259
+ provider: entry.provider,
260
+ family: entry.family,
261
+ precision: entry.precision,
262
+ strategy: entry.strategy,
263
+ aliases: entry.aliases
264
+ });
265
+
266
+ // src/index.ts
267
+ var VALID_CHAT_ROLES = /* @__PURE__ */ new Set(["system", "user", "assistant"]);
268
+ function assertStringInput(input, label) {
269
+ if (typeof input !== "string") {
270
+ throw new TypeError(`${label} must be a string`);
271
+ }
272
+ }
273
+ function assertChatMessages(messages) {
274
+ for (const message of messages) {
275
+ if (!VALID_CHAT_ROLES.has(message.role)) {
276
+ throw new TypeError(`Unsupported chat role "${String(message.role)}"`);
277
+ }
278
+ if (typeof message.content !== "string") {
279
+ throw new TypeError("Chat messages must use string content");
280
+ }
281
+ }
282
+ }
283
+ var toResult = (requestedModel, tokenCount, resolved) => ({
284
+ requestedModel,
285
+ resolvedModel: resolved.resolvedModel,
286
+ provider: resolved.provider,
287
+ family: resolved.family,
288
+ tokenCount,
289
+ precision: resolved.precision,
290
+ strategy: resolved.strategy
291
+ });
292
+ var countTokens2 = ({ model, input }) => {
293
+ assertStringInput(input, "Input");
294
+ const resolved = resolveModel(model);
295
+ return toResult(model, countTextWithEntry(resolved, input), resolved);
296
+ };
297
+ var countChatTokens = ({
298
+ model,
299
+ messages
300
+ }) => {
301
+ assertChatMessages(messages);
302
+ const resolved = resolveModel(model);
303
+ return toResult(model, countChatWithEntry(resolved, messages), {
304
+ ...resolved,
305
+ precision: "estimated"
306
+ });
307
+ };
308
+ var getModelInfo = (model) => toModelInfo(resolveModel(model));
309
+ var listSupportedModels = () => MODEL_REGISTRY.map(toModelInfo);
310
+ // Annotate the CommonJS export names for ESM import in node:
311
+ 0 && (module.exports = {
312
+ countChatTokens,
313
+ countTokens,
314
+ getModelInfo,
315
+ listSupportedModels
316
+ });
package/dist/index.d.cts ADDED (filename header missing in original report; inferred from tsup `--dts --format esm,cjs` output and the report's alphabetical file ordering)
@@ -0,0 +1,40 @@
1
+ type Provider = "openai" | "anthropic" | "google" | "mistral" | "cohere" | "meta";
2
+ type Precision = "exact" | "estimated";
3
+ type Strategy = "openai-tiktoken" | "anthropic-tokenizer" | "gemini-char4" | "cl100k-heuristic";
4
+ type ChatRole = "system" | "user" | "assistant";
5
+ interface ChatMessage {
6
+ role: ChatRole;
7
+ content: string;
8
+ }
9
+ interface CountTokensParams {
10
+ model: string;
11
+ input: string;
12
+ }
13
+ interface CountChatTokensParams {
14
+ model: string;
15
+ messages: ChatMessage[];
16
+ }
17
+ interface TokenCountResult {
18
+ requestedModel: string;
19
+ resolvedModel: string;
20
+ provider: Provider;
21
+ family: string;
22
+ tokenCount: number;
23
+ precision: Precision;
24
+ strategy: Strategy;
25
+ }
26
+ interface ModelInfo {
27
+ resolvedModel: string;
28
+ provider: Provider;
29
+ family: string;
30
+ precision: Precision;
31
+ strategy: Strategy;
32
+ aliases: string[];
33
+ }
34
+
35
+ declare const countTokens: ({ model, input }: CountTokensParams) => TokenCountResult;
36
+ declare const countChatTokens: ({ model, messages, }: CountChatTokensParams) => TokenCountResult;
37
+ declare const getModelInfo: (model: string) => ModelInfo;
38
+ declare const listSupportedModels: () => ModelInfo[];
39
+
40
+ export { type ChatMessage, type ChatRole, type CountChatTokensParams, type CountTokensParams, type ModelInfo, type Precision, type Provider, type Strategy, type TokenCountResult, countChatTokens, countTokens, getModelInfo, listSupportedModels };
package/dist/index.d.ts ADDED (filename header missing in original report; inferred — this is the file referenced by `"types": "./dist/index.d.ts"` in package.json)
@@ -0,0 +1,40 @@
1
+ type Provider = "openai" | "anthropic" | "google" | "mistral" | "cohere" | "meta";
2
+ type Precision = "exact" | "estimated";
3
+ type Strategy = "openai-tiktoken" | "anthropic-tokenizer" | "gemini-char4" | "cl100k-heuristic";
4
+ type ChatRole = "system" | "user" | "assistant";
5
+ interface ChatMessage {
6
+ role: ChatRole;
7
+ content: string;
8
+ }
9
+ interface CountTokensParams {
10
+ model: string;
11
+ input: string;
12
+ }
13
+ interface CountChatTokensParams {
14
+ model: string;
15
+ messages: ChatMessage[];
16
+ }
17
+ interface TokenCountResult {
18
+ requestedModel: string;
19
+ resolvedModel: string;
20
+ provider: Provider;
21
+ family: string;
22
+ tokenCount: number;
23
+ precision: Precision;
24
+ strategy: Strategy;
25
+ }
26
+ interface ModelInfo {
27
+ resolvedModel: string;
28
+ provider: Provider;
29
+ family: string;
30
+ precision: Precision;
31
+ strategy: Strategy;
32
+ aliases: string[];
33
+ }
34
+
35
+ declare const countTokens: ({ model, input }: CountTokensParams) => TokenCountResult;
36
+ declare const countChatTokens: ({ model, messages, }: CountChatTokensParams) => TokenCountResult;
37
+ declare const getModelInfo: (model: string) => ModelInfo;
38
+ declare const listSupportedModels: () => ModelInfo[];
39
+
40
+ export { type ChatMessage, type ChatRole, type CountChatTokensParams, type CountTokensParams, type ModelInfo, type Precision, type Provider, type Strategy, type TokenCountResult, countChatTokens, countTokens, getModelInfo, listSupportedModels };
package/dist/index.js ADDED
@@ -0,0 +1,276 @@
1
+ // src/adapters/count-text.ts
2
+ import * as anthropicTokenizer from "@anthropic-ai/tokenizer";
3
+ import { encodingForModel, getEncoding } from "js-tiktoken";
4
+ var CL100K = getEncoding("cl100k_base");
5
+ var normalizeGeminiText = (input) => input.normalize("NFC").replace(/\r\n/g, "\n");
6
+ var countOpenAiTokens = (model, input) => encodingForModel(model).encode(input).length;
7
+ var countAnthropicTokens = (input) => anthropicTokenizer.countTokens(input);
8
+ var countGeminiTokens = (input) => Math.ceil(normalizeGeminiText(input).length / 4);
9
+ var countCl100kTokens = (input) => CL100K.encode(input).length;
10
+ var countTextWithEntry = (entry, input) => {
11
+ switch (entry.strategy) {
12
+ case "openai-tiktoken":
13
+ return countOpenAiTokens(entry.resolvedModel, input);
14
+ case "anthropic-tokenizer":
15
+ return countAnthropicTokens(input);
16
+ case "gemini-char4":
17
+ return countGeminiTokens(input);
18
+ case "cl100k-heuristic":
19
+ return countCl100kTokens(input);
20
+ default: {
21
+ const exhaustive = entry.strategy;
22
+ throw new Error(`Unsupported token strategy: ${exhaustive}`);
23
+ }
24
+ }
25
+ };
26
+ var countChatWithEntry = (entry, messages) => {
27
+ const textCount = messages.reduce((sum, message) => sum + countTextWithEntry(entry, message.content), 0);
28
+ return textCount + messages.length * 4 + 2;
29
+ };
30
+
31
+ // src/registry/models.ts
32
+ var MODEL_REGISTRY = [
33
+ {
34
+ resolvedModel: "gpt-4",
35
+ provider: "openai",
36
+ family: "gpt-4",
37
+ precision: "exact",
38
+ strategy: "openai-tiktoken",
39
+ aliases: [],
40
+ prefixes: []
41
+ },
42
+ {
43
+ resolvedModel: "gpt-4o",
44
+ provider: "openai",
45
+ family: "gpt-4o",
46
+ precision: "exact",
47
+ strategy: "openai-tiktoken",
48
+ aliases: ["chatgpt-4o-latest"],
49
+ prefixes: ["gpt-4o-", "chatgpt-4o-"]
50
+ },
51
+ {
52
+ resolvedModel: "gpt-4o-mini",
53
+ provider: "openai",
54
+ family: "gpt-4o-mini",
55
+ precision: "exact",
56
+ strategy: "openai-tiktoken",
57
+ aliases: [],
58
+ prefixes: ["gpt-4o-mini-"]
59
+ },
60
+ {
61
+ resolvedModel: "gpt-3.5-turbo",
62
+ provider: "openai",
63
+ family: "gpt-3.5-turbo",
64
+ precision: "exact",
65
+ strategy: "openai-tiktoken",
66
+ aliases: [],
67
+ prefixes: ["gpt-3.5-turbo-"]
68
+ },
69
+ {
70
+ resolvedModel: "o1",
71
+ provider: "openai",
72
+ family: "o1",
73
+ precision: "exact",
74
+ strategy: "openai-tiktoken",
75
+ aliases: [],
76
+ prefixes: ["o1-"]
77
+ },
78
+ {
79
+ resolvedModel: "o3-mini",
80
+ provider: "openai",
81
+ family: "o3-mini",
82
+ precision: "exact",
83
+ strategy: "openai-tiktoken",
84
+ aliases: [],
85
+ prefixes: ["o3-mini-"]
86
+ },
87
+ {
88
+ resolvedModel: "claude-3-5-sonnet",
89
+ provider: "anthropic",
90
+ family: "claude-3-5-sonnet",
91
+ precision: "estimated",
92
+ strategy: "anthropic-tokenizer",
93
+ aliases: ["claude-3-5-sonnet-latest"],
94
+ prefixes: ["claude-3-5-sonnet-"]
95
+ },
96
+ {
97
+ resolvedModel: "claude-3-haiku",
98
+ provider: "anthropic",
99
+ family: "claude-3-haiku",
100
+ precision: "estimated",
101
+ strategy: "anthropic-tokenizer",
102
+ aliases: ["claude-3-haiku-latest"],
103
+ prefixes: ["claude-3-haiku-"]
104
+ },
105
+ {
106
+ resolvedModel: "gemini-2.5-pro",
107
+ provider: "google",
108
+ family: "gemini-2.5-pro",
109
+ precision: "estimated",
110
+ strategy: "gemini-char4",
111
+ aliases: [],
112
+ prefixes: ["gemini-2.5-pro-"]
113
+ },
114
+ {
115
+ resolvedModel: "gemini-2.5-flash",
116
+ provider: "google",
117
+ family: "gemini-2.5-flash",
118
+ precision: "estimated",
119
+ strategy: "gemini-char4",
120
+ aliases: [],
121
+ prefixes: ["gemini-2.5-flash-"]
122
+ },
123
+ {
124
+ resolvedModel: "mistral-large",
125
+ provider: "mistral",
126
+ family: "mistral-large",
127
+ precision: "estimated",
128
+ strategy: "cl100k-heuristic",
129
+ aliases: ["mistral-large-latest"],
130
+ prefixes: ["mistral-large-"]
131
+ },
132
+ {
133
+ resolvedModel: "command-r-plus",
134
+ provider: "cohere",
135
+ family: "command-r-plus",
136
+ precision: "estimated",
137
+ strategy: "cl100k-heuristic",
138
+ aliases: [],
139
+ prefixes: ["command-r-plus-"]
140
+ },
141
+ {
142
+ resolvedModel: "aya-expanse-32b",
143
+ provider: "cohere",
144
+ family: "aya-expanse-32b",
145
+ precision: "estimated",
146
+ strategy: "cl100k-heuristic",
147
+ aliases: [],
148
+ prefixes: ["aya-expanse-32b-"]
149
+ },
150
+ {
151
+ resolvedModel: "llama-3.1-70b",
152
+ provider: "meta",
153
+ family: "llama-3.1-70b",
154
+ precision: "estimated",
155
+ strategy: "cl100k-heuristic",
156
+ aliases: ["llama-3.1-70b-instruct"],
157
+ prefixes: ["llama-3.1-70b-"]
158
+ }
159
+ ];
160
+
161
+ // src/registry/resolve-model.ts
162
+ var normalizeModelId = (model) => model.trim().toLowerCase();
163
+ var levenshtein = (left, right) => {
164
+ const rows = left.length + 1;
165
+ const cols = right.length + 1;
166
+ const matrix = Array.from({ length: rows }, () => Array(cols).fill(0));
167
+ for (let row = 0; row < rows; row += 1) {
168
+ matrix[row][0] = row;
169
+ }
170
+ for (let col = 0; col < cols; col += 1) {
171
+ matrix[0][col] = col;
172
+ }
173
+ for (let row = 1; row < rows; row += 1) {
174
+ for (let col = 1; col < cols; col += 1) {
175
+ const cost = left[row - 1] === right[col - 1] ? 0 : 1;
176
+ matrix[row][col] = Math.min(
177
+ matrix[row - 1][col] + 1,
178
+ matrix[row][col - 1] + 1,
179
+ matrix[row - 1][col - 1] + cost
180
+ );
181
+ }
182
+ }
183
+ return matrix[left.length][right.length];
184
+ };
185
+ var normalizedName = (value) => value.replace(/[^a-z0-9]/g, "");
186
+ var findClosestModels = (model) => {
187
+ const normalizedInput = normalizedName(model);
188
+ const candidates = MODEL_REGISTRY.flatMap((entry) => [entry.resolvedModel, ...entry.aliases]);
189
+ return candidates.map((candidate) => ({
190
+ candidate,
191
+ distance: levenshtein(normalizedInput, normalizedName(candidate)),
192
+ lengthDifference: Math.abs(normalizedInput.length - normalizedName(candidate).length)
193
+ })).sort(
194
+ (left, right) => left.distance - right.distance || left.lengthDifference - right.lengthDifference || left.candidate.localeCompare(right.candidate)
195
+ ).slice(0, 3).map(({ candidate }) => candidate);
196
+ };
197
+ var findEntry = (normalizedModel) => {
198
+ const exactMatch = MODEL_REGISTRY.find(
199
+ (entry) => entry.resolvedModel === normalizedModel || entry.aliases.includes(normalizedModel)
200
+ );
201
+ if (exactMatch) {
202
+ return exactMatch;
203
+ }
204
+ return MODEL_REGISTRY.find(
205
+ (entry) => entry.prefixes.some((prefix) => normalizedModel.startsWith(prefix))
206
+ );
207
+ };
208
+ var resolveModel = (model) => {
209
+ const normalizedModel = normalizeModelId(model);
210
+ const entry = findEntry(normalizedModel);
211
+ if (!entry) {
212
+ const suggestions = findClosestModels(normalizedModel);
213
+ const suggestionText = suggestions.length > 0 ? ` Did you mean: ${suggestions.join(", ")}?` : "";
214
+ throw new Error(`Unsupported model "${model}".${suggestionText}`);
215
+ }
216
+ return entry;
217
+ };
218
+ var toModelInfo = (entry) => ({
219
+ resolvedModel: entry.resolvedModel,
220
+ provider: entry.provider,
221
+ family: entry.family,
222
+ precision: entry.precision,
223
+ strategy: entry.strategy,
224
+ aliases: entry.aliases
225
+ });
226
+
227
+ // src/index.ts
228
+ var VALID_CHAT_ROLES = /* @__PURE__ */ new Set(["system", "user", "assistant"]);
229
+ function assertStringInput(input, label) {
230
+ if (typeof input !== "string") {
231
+ throw new TypeError(`${label} must be a string`);
232
+ }
233
+ }
234
+ function assertChatMessages(messages) {
235
+ for (const message of messages) {
236
+ if (!VALID_CHAT_ROLES.has(message.role)) {
237
+ throw new TypeError(`Unsupported chat role "${String(message.role)}"`);
238
+ }
239
+ if (typeof message.content !== "string") {
240
+ throw new TypeError("Chat messages must use string content");
241
+ }
242
+ }
243
+ }
244
+ var toResult = (requestedModel, tokenCount, resolved) => ({
245
+ requestedModel,
246
+ resolvedModel: resolved.resolvedModel,
247
+ provider: resolved.provider,
248
+ family: resolved.family,
249
+ tokenCount,
250
+ precision: resolved.precision,
251
+ strategy: resolved.strategy
252
+ });
253
+ var countTokens2 = ({ model, input }) => {
254
+ assertStringInput(input, "Input");
255
+ const resolved = resolveModel(model);
256
+ return toResult(model, countTextWithEntry(resolved, input), resolved);
257
+ };
258
+ var countChatTokens = ({
259
+ model,
260
+ messages
261
+ }) => {
262
+ assertChatMessages(messages);
263
+ const resolved = resolveModel(model);
264
+ return toResult(model, countChatWithEntry(resolved, messages), {
265
+ ...resolved,
266
+ precision: "estimated"
267
+ });
268
+ };
269
+ var getModelInfo = (model) => toModelInfo(resolveModel(model));
270
+ var listSupportedModels = () => MODEL_REGISTRY.map(toModelInfo);
271
+ export {
272
+ countChatTokens,
273
+ countTokens2 as countTokens,
274
+ getModelInfo,
275
+ listSupportedModels
276
+ };
package/package.json ADDED
@@ -0,0 +1,53 @@
1
+ {
2
+ "name": "@huen123/llm-token-counter",
3
+ "version": "0.1.0",
4
+ "description": "Count prompt and chat message tokens for mainstream LLM families with explicit precision metadata.",
5
+ "keywords": [
6
+ "llm",
7
+ "tokens",
8
+ "tokenizer",
9
+ "openai",
10
+ "anthropic",
11
+ "gemini",
12
+ "claude"
13
+ ],
14
+ "license": "MIT",
15
+ "author": "angusmhlee113",
16
+ "type": "module",
17
+ "sideEffects": false,
18
+ "main": "./dist/index.cjs",
19
+ "module": "./dist/index.js",
20
+ "types": "./dist/index.d.ts",
21
+ "exports": {
22
+ ".": {
23
+ "types": "./dist/index.d.ts",
24
+ "import": "./dist/index.js",
25
+ "require": "./dist/index.cjs"
26
+ }
27
+ },
28
+ "files": [
29
+ "dist"
30
+ ],
31
+ "engines": {
32
+ "node": ">=18"
33
+ },
34
+ "publishConfig": {
35
+ "access": "public"
36
+ },
37
+ "scripts": {
38
+ "build": "tsup src/index.ts --format esm,cjs --dts --clean",
39
+ "test": "vitest run",
40
+ "test:watch": "vitest",
41
+ "prepublishOnly": "npm run build && npm test"
42
+ },
43
+ "dependencies": {
44
+ "@anthropic-ai/tokenizer": "^0.0.4",
45
+ "js-tiktoken": "^1.0.21"
46
+ },
47
+ "devDependencies": {
48
+ "@types/node": "^25.5.2",
49
+ "tsup": "^8.5.1",
50
+ "typescript": "^6.0.2",
51
+ "vitest": "^4.1.2"
52
+ }
53
+ }