ai-token-estimator 1.1.0 → 1.2.0

package/README.md CHANGED
@@ -1,12 +1,28 @@
  # ai-token-estimator

- Estimate token counts and costs for LLM API calls based on character count and model-specific ratios.
-
- > **Important:** This is a rough estimation tool for budgeting purposes, not a precise tokenizer. Actual token counts may vary by ±20% depending on:
- > - Content type (code vs prose)
- > - Language (CJK languages use more tokens)
- > - API message framing overhead
- > - Special characters and formatting
+ [![npm](https://img.shields.io/npm/v/ai-token-estimator.svg)](https://www.npmjs.com/package/ai-token-estimator)
+ [![CI](https://github.com/BitsAndBytesAI/ai-token-estimator/actions/workflows/ci.yml/badge.svg)](https://github.com/BitsAndBytesAI/ai-token-estimator/actions/workflows/ci.yml)
+ [![license](https://img.shields.io/npm/l/ai-token-estimator.svg)](https://github.com/BitsAndBytesAI/ai-token-estimator/blob/main/LICENSE)
+
+ The best way to estimate **tokens + input cost** for LLM calls — with **exact OpenAI tokenization** (tiktoken-compatible BPE) and optional **official provider token counting** for Claude/Gemini.
+
+ > Accuracy depends on the tokenizer mode you choose:
+ > - **Exact** for OpenAI models when you use `openai_exact` / `encode()` / `decode()`.
+ > - **Exact** for Claude/Gemini when you use `estimateAsync()` with their official count-tokens endpoints.
+ > - **Heuristic** fallback is available for speed and resilience.
+
+ ## Features
+
+ - **Exact OpenAI tokenization** (tiktoken-compatible BPE): `encode()` / `decode()` / `openai_exact`
+ - **Official provider token counting** (async):
+   - Anthropic `POST /v1/messages/count_tokens` (`anthropic_count_tokens`)
+   - Gemini `models/:countTokens` (`gemini_count_tokens`)
+ - **Fast local fallback** options:
+   - Heuristic (`heuristic`, default)
+   - Local Gemma SentencePiece approximation (`gemma_sentencepiece`)
+ - Automatic fallback to heuristic on provider failures (`fallbackToHeuristicOnError`)
+ - **Cost estimation** using a weekly auto-updated pricing/model list (GitHub Actions)
+ - TypeScript-first, ships ESM + CJS

  ## Installation

@@ -17,7 +33,7 @@ npm install ai-token-estimator
  ## Usage

  ```typescript
- import { estimate, getAvailableModels } from 'ai-token-estimator';
+ import { countTokens, estimate, getAvailableModels } from 'ai-token-estimator';

  // Basic usage
  const result = estimate({
@@ -37,6 +53,10 @@ console.log(result);
  // List available models
  console.log(getAvailableModels());
  // ['gpt-5.2', 'gpt-4o', 'claude-opus-4.5', 'gemini-3-pro', ...]
+
+ // Exact tokens for OpenAI, heuristic for others
+ console.log(countTokens({ text: 'Hello, world!', model: 'gpt-5.1' }));
+ // { tokens: 4, exact: true, encoding: 'o200k_base' }
  ```
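For non-OpenAI models there is no local exact tokenizer, so `countTokens()` returns the heuristic estimate. A minimal sketch of what to expect (the field shapes follow the 1.2.0 build; the token number itself is illustrative):

```ts
import { countTokens } from 'ai-token-estimator';

// Claude/Gemini ids take the heuristic path: `exact` is false and no
// `encoding` field is reported.
const approx = countTokens({ text: 'Hello, world!', model: 'claude-opus-4.5' });
console.log(approx.exact);  // false
console.log(approx.tokens); // character count / the model's charsPerToken, rounded up
```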

  ## Exact OpenAI tokenization (BPE)
@@ -62,6 +82,97 @@ console.log(roundTrip); // "Hello, world!"
  Supported encodings:
  `r50k_base`, `p50k_base`, `p50k_edit`, `cl100k_base`, `o200k_base`, `o200k_harmony`

+ ## Using the exact tokenizer with `estimate()`
+
+ `estimate()` is heuristic by default (fast). To opt in to exact OpenAI token counting:
+
+ ```ts
+ import { estimate } from 'ai-token-estimator';
+
+ const result = estimate({
+   text: 'Hello, world!',
+   model: 'gpt-5.1',
+   tokenizer: 'openai_exact',
+ });
+
+ console.log(result.tokenizerMode); // "openai_exact"
+ console.log(result.encodingUsed);  // "o200k_base"
+ ```
+
+ Or use `tokenizer: 'auto'` to get exact counting for OpenAI models and heuristic counting for everything else, as in the sketch below.
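A minimal sketch of `auto` mode (model ids are from the list above; the reported modes follow the 1.2.0 build):

```ts
import { estimate } from 'ai-token-estimator';

// OpenAI model: exact BPE counting is used.
const exact = estimate({ text: 'Hello!', model: 'gpt-4o', tokenizer: 'auto' });
console.log(exact.tokenizerMode); // "openai_exact"

// Non-OpenAI model: 'auto' silently falls back to the heuristic.
const rough = estimate({ text: 'Hello!', model: 'gemini-3-pro', tokenizer: 'auto' });
console.log(rough.tokenizerMode); // "heuristic"
```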
103
+
104
+ ## Provider token counting (Claude / Gemini)
105
+
106
+ If you want **more accurate token counts** for Anthropic or Gemini models, you can call their official token counting endpoints
107
+ via `estimateAsync()`. This requires API keys, and therefore should be used **server-side** (never in the browser).
108
+
109
+ If you want these modes to **fail open** (fallback to heuristic estimation) when the provider API is throttled/unavailable or the API key is invalid,
110
+ set `fallbackToHeuristicOnError: true`.
111
+
112
+ ### Anthropic: `POST /v1/messages/count_tokens`
113
+
114
+ - Env var: `ANTHROPIC_API_KEY`
115
+
116
+ ```ts
117
+ import { estimateAsync } from 'ai-token-estimator';
118
+
119
+ const out = await estimateAsync({
120
+ text: 'Hello, Claude',
121
+ model: 'claude-sonnet-4-5',
122
+ tokenizer: 'anthropic_count_tokens',
123
+ fallbackToHeuristicOnError: true,
124
+ anthropic: {
125
+ // apiKey: '...' // optional; otherwise uses process.env.ANTHROPIC_API_KEY
126
+ system: 'You are a helpful assistant',
127
+ },
128
+ });
129
+
130
+ console.log(out.estimatedTokens);
131
+ ```
132
+
133
+ ### Gemini: `models/:countTokens` (Google AI Studio)
134
+
135
+ - Env var: `GEMINI_API_KEY`
136
+
137
+ ```ts
138
+ import { estimateAsync } from 'ai-token-estimator';
139
+
140
+ const out = await estimateAsync({
141
+ text: 'The quick brown fox jumps over the lazy dog.',
142
+ model: 'gemini-2.0-flash',
143
+ tokenizer: 'gemini_count_tokens',
144
+ fallbackToHeuristicOnError: true,
145
+ gemini: {
146
+ // apiKey: '...' // optional; otherwise uses process.env.GEMINI_API_KEY
147
+ },
148
+ });
149
+
150
+ console.log(out.estimatedTokens);
151
+ ```
152
+
153
+ ### Local Gemini option: Gemma SentencePiece (approximation)
154
+
155
+ If you want a **local** tokenizer option for Gemini-like models, you can use a SentencePiece tokenizer model (e.g. Gemma's
156
+ `tokenizer.model`) via `sentencepiece-js`.
157
+
158
+ ```ts
159
+ import { estimateAsync } from 'ai-token-estimator';
160
+
161
+ const out = await estimateAsync({
162
+ text: 'Hello!',
163
+ model: 'gemini-2.0-flash',
164
+ tokenizer: 'gemma_sentencepiece',
165
+ gemma: {
166
+ modelPath: '/path/to/tokenizer.model',
167
+ },
168
+ });
169
+
170
+ console.log(out.estimatedTokens);
171
+ ```
172
+
173
+ Note:
174
+ - This is **not** an official Gemini tokenizer; treat it as an approximation unless you have verified equivalence for your models.
175
+
65
176
  ## API Reference
66
177
 
67
178
  ### `estimate(input: EstimateInput): EstimateOutput`
@@ -79,6 +190,9 @@ interface EstimateInput {
  }
  ```

+ Note:
+ - Provider-backed modes (`anthropic_count_tokens`, `gemini_count_tokens`, `gemma_sentencepiece`) are only supported in `estimateAsync()`; the synchronous `estimate()` throws if you pass one, as sketched below.
+
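A minimal sketch of the sync/async split (the error text follows the 1.2.0 build; assumes `ANTHROPIC_API_KEY` is set):

```ts
import { estimateAsync } from 'ai-token-estimator';

// Sync path: estimate() rejects provider-backed modes at runtime with
// 'Tokenizer mode "anthropic_count_tokens" requires async execution. Use estimateAsync(...) instead.'

// Async path: provider-backed modes are accepted.
const out = await estimateAsync({
  text: 'hi',
  model: 'claude-sonnet-4-5',
  tokenizer: 'anthropic_count_tokens',
});
console.log(out.tokenizerMode); // "anthropic_count_tokens"
```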
  **Returns:**

  ```typescript
@@ -93,6 +207,20 @@ interface EstimateOutput {
  }
  ```

+ ### `estimateAsync(input: EstimateAsyncInput): Promise<EstimateOutput>`
+
+ Async estimator that supports provider token counting modes:
+ - `anthropic_count_tokens` (Anthropic token count endpoint)
+ - `gemini_count_tokens` (Gemini token count endpoint)
+ - `gemma_sentencepiece` (local SentencePiece, requires `sentencepiece-js` and a model file)
+
+ API keys should be provided via env vars (`ANTHROPIC_API_KEY`, `GEMINI_API_KEY`) or passed explicitly in the config objects.
+
+ If you pass `fallbackToHeuristicOnError: true`, provider-backed modes will fall back to heuristic estimation (see the sketch below) on:
+ - invalid/expired API key (401/403)
+ - rate limiting (429)
+ - provider errors (5xx) or network issues
+
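A sketch of that fail-open behavior; `tokenizerMode` on the result tells you whether the provider count or the heuristic was used. Here a stub `fetch` (an illustrative test double, passed via the documented `fetch` option) simulates a 503 outage:

```ts
import { estimateAsync } from 'ai-token-estimator';

// Stub fetch that simulates a provider outage (HTTP 503).
const failingFetch: typeof fetch = async () =>
  new Response('{"error":{"message":"overloaded"}}', { status: 503 });

const out = await estimateAsync({
  text: 'Hello, Claude',
  model: 'claude-sonnet-4-5',
  tokenizer: 'anthropic_count_tokens',
  fallbackToHeuristicOnError: true,
  fetch: failingFetch,
});

console.log(out.tokenizerMode);   // "heuristic": the 503 triggered the fallback
console.log(out.estimatedTokens); // heuristic count (ceil of chars / charsPerToken)
```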
  ### `countTokens(input: TokenCountInput): TokenCountOutput`

  Counts tokens for a given model:
package/dist/index.cjs CHANGED
@@ -1,7 +1,9 @@
  "use strict";
+ var __create = Object.create;
  var __defProp = Object.defineProperty;
  var __getOwnPropDesc = Object.getOwnPropertyDescriptor;
  var __getOwnPropNames = Object.getOwnPropertyNames;
+ var __getProtoOf = Object.getPrototypeOf;
  var __hasOwnProp = Object.prototype.hasOwnProperty;
  var __export = (target, all) => {
    for (var name in all)
@@ -15,6 +17,14 @@ var __copyProps = (to, from, except, desc) => {
    }
    return to;
  };
+ var __toESM = (mod, isNodeMode, target) => (target = mod != null ? __create(__getProtoOf(mod)) : {}, __copyProps(
+   // If the importer is in node compatibility mode or this is not an ESM
+   // file that has been converted to a CommonJS file using a Babel-
+   // compatible transform (i.e. "__esModule" has not been set), then set
+   // "default" to the CommonJS "module.exports" for node compatibility.
+   isNodeMode || !mod || !mod.__esModule ? __defProp(target, "default", { value: mod, enumerable: true }) : target,
+   mod
+ ));
  var __toCommonJS = (mod) => __copyProps(__defProp({}, "__esModule", { value: true }), mod);

  // src/index.ts
@@ -22,10 +32,14 @@ var index_exports = {};
  __export(index_exports, {
    DEFAULT_MODELS: () => DEFAULT_MODELS,
    LAST_UPDATED: () => LAST_UPDATED,
+   countAnthropicInputTokens: () => countAnthropicInputTokens,
+   countGeminiTokens: () => countGeminiTokens,
+   countGemmaSentencePieceTokens: () => countGemmaSentencePieceTokens,
    countTokens: () => countTokens,
    decode: () => decode,
    encode: () => encode,
    estimate: () => estimate,
+   estimateAsync: () => estimateAsync,
    getAvailableModels: () => getAvailableModels,
    getModelConfig: () => getModelConfig
  });
@@ -394,14 +408,21 @@ var models = {
  Object.values(models).forEach((config) => Object.freeze(config));
  var DEFAULT_MODELS = Object.freeze(models);
  function getModelConfig(model) {
-   const config = DEFAULT_MODELS[model];
-   if (!config) {
+   const direct = DEFAULT_MODELS[model];
+   if (direct) return direct;
+   const normalized = (() => {
+     if (!model.startsWith("claude-")) return model;
+     const withoutDate = model.replace(/-\d{8}$/, "");
+     return withoutDate.replace(/-(\d+)-(\d+)$/, (_m, major, minor) => `-${major}.${minor}`);
+   })();
+   const aliased = DEFAULT_MODELS[normalized];
+   if (!aliased) {
      const available = Object.keys(DEFAULT_MODELS).join(", ");
      throw new Error(
        `Unknown model: "${model}". Available models: ${available}`
      );
    }
-   return config;
+   return aliased;
  }
  function getAvailableModels() {
    return Object.keys(DEFAULT_MODELS);
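For reference, the alias normalization added above lets dated, dash-separated Claude ids resolve to canonical model keys. A sketch with a hypothetical date suffix (the id and the `-20250101` YYYYMMDD suffix are illustrative; `claude-opus-4.5` is a key from the README's model list):

```ts
import { getModelConfig } from 'ai-token-estimator';

// The "-20250101" snapshot suffix is stripped, then the trailing "-4-5"
// is rewritten to "-4.5", so the lookup hits the canonical key.
const cfg = getModelConfig('claude-opus-4-5-20250101');
// Same config as getModelConfig('claude-opus-4.5').
console.log(cfg.charsPerToken, cfg.inputCostPerMillion);
```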
@@ -491,13 +512,17 @@ function countCodePoints(text) {
  function estimate(input) {
    const { text, model, rounding = "ceil", tokenizer = "heuristic" } = input;
    const config = getModelConfig(model);
+   const tokenizerStr = tokenizer;
+   if (tokenizerStr === "anthropic_count_tokens" || tokenizerStr === "gemini_count_tokens" || tokenizerStr === "gemma_sentencepiece") {
+     throw new Error(`Tokenizer mode "${tokenizerStr}" requires async execution. Use estimateAsync(...) instead.`);
+   }
    const characterCount = countCodePoints(text);
-   const isNonOpenAIModel2 = model.startsWith("claude-") || model.startsWith("gemini-");
+   const isNonOpenAIModel3 = model.startsWith("claude-") || model.startsWith("gemini-");
    let estimatedTokens;
    let tokenizerModeUsed = "heuristic";
    let encodingUsed;
    const shouldTryExact = tokenizer === "openai_exact" || tokenizer === "auto";
-   if (shouldTryExact && !isNonOpenAIModel2) {
+   if (shouldTryExact && !isNonOpenAIModel3) {
      try {
        estimatedTokens = encode(text, { model, allowSpecial: "none" }).length;
        tokenizerModeUsed = "openai_exact";
@@ -507,7 +532,7 @@ function estimate(input) {
          throw error;
        }
      }
-   } else if (tokenizer === "openai_exact" && isNonOpenAIModel2) {
+   } else if (tokenizer === "openai_exact" && isNonOpenAIModel3) {
      throw new Error(
        `Tokenizer mode "openai_exact" requested for non-OpenAI model: "${model}"`
      );
@@ -539,13 +564,283 @@ function estimate(input) {
    };
  }

- // src/token-counter.ts
+ // src/providers/anthropic.ts
+ function getFetch(fetchImpl) {
+   const f = fetchImpl ?? globalThis.fetch;
+   if (!f) {
+     throw new Error("globalThis.fetch is not available; pass fetch in AnthropicCountTokensParams");
+   }
+   return f;
+ }
+ function withStatus(message, status) {
+   const err = new Error(message);
+   err.status = status;
+   return err;
+ }
+ function getApiKey(explicit) {
+   const key = explicit ?? (typeof process !== "undefined" ? process.env.ANTHROPIC_API_KEY : void 0);
+   if (!key) throw withStatus("Anthropic API key missing (set ANTHROPIC_API_KEY or pass apiKey)", 401);
+   return key;
+ }
+ function asRecord(value) {
+   if (!value || typeof value !== "object" || Array.isArray(value)) return null;
+   return value;
+ }
+ async function countAnthropicInputTokens(params) {
+   const fetchImpl = getFetch(params.fetch);
+   const apiKey = getApiKey(params.apiKey);
+   const baseUrl = (params.baseUrl ?? "https://api.anthropic.com").replace(/\/+$/, "");
+   const version = params.version ?? "2023-06-01";
+   const messages = params.messages ?? (typeof params.text === "string" ? [{ role: "user", content: params.text }] : null);
+   if (!messages) {
+     throw new Error("Anthropic token counting requires either `messages` or `text`");
+   }
+   const body = {
+     model: params.model,
+     messages
+   };
+   if (typeof params.system === "string" && params.system.trim()) {
+     body.system = params.system;
+   }
+   const response = await fetchImpl(`${baseUrl}/v1/messages/count_tokens`, {
+     method: "POST",
+     headers: {
+       "content-type": "application/json",
+       "x-api-key": apiKey,
+       "anthropic-version": version
+     },
+     body: JSON.stringify(body)
+   });
+   const text = await response.text();
+   let data = null;
+   try {
+     data = text ? JSON.parse(text) : null;
+   } catch {
+   }
+   const dataObj = asRecord(data);
+   if (!response.ok) {
+     const errorObj = asRecord(dataObj?.error);
+     const msg = typeof errorObj?.message === "string" ? errorObj.message : typeof dataObj?.message === "string" ? dataObj.message : `HTTP ${response.status}`;
+     throw withStatus(`Anthropic count_tokens failed: ${msg}`, response.status);
+   }
+   const inputTokens = dataObj?.input_tokens;
+   if (typeof inputTokens !== "number" || !Number.isFinite(inputTokens) || inputTokens < 0) {
+     throw new Error("Anthropic count_tokens returned invalid input_tokens");
+   }
+   return inputTokens;
+ }
+
+ // src/providers/gemini.ts
+ function getFetch2(fetchImpl) {
+   const f = fetchImpl ?? globalThis.fetch;
+   if (!f) {
+     throw new Error("globalThis.fetch is not available; pass fetch in GeminiCountTokensParams");
+   }
+   return f;
+ }
+ function withStatus2(message, status) {
+   const err = new Error(message);
+   err.status = status;
+   return err;
+ }
+ function getApiKey2(explicit) {
+   const key = explicit ?? (typeof process !== "undefined" ? process.env.GEMINI_API_KEY : void 0);
+   if (!key) throw withStatus2("Gemini API key missing (set GEMINI_API_KEY or pass apiKey)", 401);
+   return key;
+ }
+ function toContents(text) {
+   return [{ role: "user", parts: [{ text }] }];
+ }
+ function asRecord2(value) {
+   if (!value || typeof value !== "object" || Array.isArray(value)) return null;
+   return value;
+ }
+ async function countGeminiTokens(params) {
+   const fetchImpl = getFetch2(params.fetch);
+   const apiKey = getApiKey2(params.apiKey);
+   const baseUrl = (params.baseUrl ?? "https://generativelanguage.googleapis.com").replace(/\/+$/, "");
+   const contents = params.contents ?? (typeof params.text === "string" ? toContents(params.text) : null);
+   if (!contents) {
+     throw new Error("Gemini token counting requires either `contents` or `text`");
+   }
+   const url = `${baseUrl}/v1beta/models/${encodeURIComponent(params.model)}:countTokens?key=${encodeURIComponent(apiKey)}`;
+   const response = await fetchImpl(url, {
+     method: "POST",
+     headers: { "content-type": "application/json" },
+     body: JSON.stringify({ contents })
+   });
+   const text = await response.text();
+   let data = null;
+   try {
+     data = text ? JSON.parse(text) : null;
+   } catch {
+   }
+   const dataObj = asRecord2(data);
+   if (!response.ok) {
+     const errorObj = asRecord2(dataObj?.error);
+     const msg = typeof errorObj?.message === "string" ? errorObj.message : typeof dataObj?.message === "string" ? dataObj.message : `HTTP ${response.status}`;
+     throw withStatus2(`Gemini countTokens failed: ${msg}`, response.status);
+   }
+   const totalTokens = dataObj?.totalTokens ?? dataObj?.total_tokens ?? dataObj?.total_tokens_count;
+   if (typeof totalTokens !== "number" || !Number.isFinite(totalTokens) || totalTokens < 0) {
+     throw new Error("Gemini countTokens returned invalid totalTokens");
+   }
+   return totalTokens;
+ }
+
+ // src/providers/gemma-sentencepiece.ts
+ async function loadSentencePiece() {
+   try {
+     const mod = await import("sentencepiece-js");
+     if (mod.SentencePieceProcessor || mod.cleanText) return mod;
+     if (mod.default && typeof mod.default === "object" && mod.default.SentencePieceProcessor) {
+       return mod.default;
+     }
+     return mod;
+   } catch {
+     throw new Error(
+       "Local Gemma SentencePiece tokenization requires the optional dependency `sentencepiece-js`. Install it and try again."
+     );
+   }
+ }
+ async function countGemmaSentencePieceTokens(params) {
+   const sp = await loadSentencePiece();
+   const defaults = (sp.default && typeof sp.default === "object" ? sp.default : null) ?? {};
+   const SentencePieceProcessor = sp.SentencePieceProcessor ?? defaults.SentencePieceProcessor;
+   const cleanText = sp.cleanText ?? defaults.cleanText;
+   if (!SentencePieceProcessor || typeof SentencePieceProcessor !== "function") {
+     throw new Error("sentencepiece-js did not export SentencePieceProcessor as expected");
+   }
+   const processor = new SentencePieceProcessor();
+   const loaded = processor.load(params.modelPath);
+   if (loaded instanceof Promise) await loaded;
+   const cleaned = typeof cleanText === "function" ? cleanText(params.text) : params.text;
+   const ids = processor.encodeIds(cleaned);
+   if (!Array.isArray(ids)) {
+     throw new Error("sentencepiece-js returned invalid ids from encodeIds");
+   }
+   return ids.length;
+ }
+
+ // src/estimator-async.ts
+ function countCodePoints2(text) {
+   let count = 0;
+   for (const _char of text) count++;
+   return count;
+ }
  function isNonOpenAIModel(model) {
    return model.startsWith("claude-") || model.startsWith("gemini-");
  }
+ function shouldFallbackToHeuristic(err) {
+   if (!err) return true;
+   const maybe = err;
+   const statusRaw = maybe.status;
+   const status = typeof statusRaw === "number" && Number.isFinite(statusRaw) ? statusRaw : null;
+   if (!status) return true;
+   if (status === 401 || status === 403 || status === 429) return true;
+   if (status >= 500 && status <= 599) return true;
+   return false;
+ }
+ async function estimateAsync(input) {
+   const { text, model, rounding = "ceil", tokenizer = "heuristic" } = input;
+   const config = getModelConfig(model);
+   const characterCount = countCodePoints2(text);
+   let estimatedTokens;
+   let tokenizerModeUsed = "heuristic";
+   let encodingUsed;
+   if (tokenizer === "anthropic_count_tokens") {
+     try {
+       estimatedTokens = await countAnthropicInputTokens({
+         model,
+         text,
+         system: input.anthropic?.system,
+         apiKey: input.anthropic?.apiKey,
+         baseUrl: input.anthropic?.baseUrl,
+         version: input.anthropic?.version,
+         fetch: input.fetch
+       });
+       tokenizerModeUsed = "anthropic_count_tokens";
+     } catch (error) {
+       if (input.fallbackToHeuristicOnError && shouldFallbackToHeuristic(error)) {
+         estimatedTokens = void 0;
+         tokenizerModeUsed = "heuristic";
+       } else {
+         throw error;
+       }
+     }
+   } else if (tokenizer === "gemini_count_tokens") {
+     try {
+       estimatedTokens = await countGeminiTokens({
+         model,
+         text,
+         apiKey: input.gemini?.apiKey,
+         baseUrl: input.gemini?.baseUrl,
+         fetch: input.fetch
+       });
+       tokenizerModeUsed = "gemini_count_tokens";
+     } catch (error) {
+       if (input.fallbackToHeuristicOnError && shouldFallbackToHeuristic(error)) {
+         estimatedTokens = void 0;
+         tokenizerModeUsed = "heuristic";
+       } else {
+         throw error;
+       }
+     }
+   } else if (tokenizer === "gemma_sentencepiece") {
+     const modelPath = input.gemma?.modelPath;
+     if (!modelPath) {
+       throw new Error("gemma_sentencepiece tokenizer requires gemma.modelPath (path to tokenizer.model)");
+     }
+     estimatedTokens = await countGemmaSentencePieceTokens({ modelPath, text });
+     tokenizerModeUsed = "gemma_sentencepiece";
+   } else {
+     const shouldTryExact = tokenizer === "openai_exact" || tokenizer === "auto";
+     if (shouldTryExact && !isNonOpenAIModel(model)) {
+       try {
+         estimatedTokens = encode(text, { model, allowSpecial: "none" }).length;
+         tokenizerModeUsed = "openai_exact";
+         encodingUsed = getOpenAIEncoding({ model });
+       } catch (error) {
+         if (tokenizer === "openai_exact") throw error;
+       }
+     } else if (tokenizer === "openai_exact" && isNonOpenAIModel(model)) {
+       throw new Error(`Tokenizer mode "openai_exact" requested for non-OpenAI model: "${model}"`);
+     }
+   }
+   if (estimatedTokens === void 0) {
+     const rawTokens = characterCount / config.charsPerToken;
+     switch (rounding) {
+       case "floor":
+         estimatedTokens = Math.floor(rawTokens);
+         break;
+       case "round":
+         estimatedTokens = Math.round(rawTokens);
+         break;
+       case "ceil":
+       default:
+         estimatedTokens = Math.ceil(rawTokens);
+     }
+     tokenizerModeUsed = "heuristic";
+   }
+   const estimatedInputCost = estimatedTokens * config.inputCostPerMillion / 1e6;
+   return {
+     model,
+     characterCount,
+     estimatedTokens,
+     estimatedInputCost,
+     charsPerToken: config.charsPerToken,
+     tokenizerMode: tokenizerModeUsed,
+     encodingUsed
+   };
+ }
+
+ // src/token-counter.ts
+ function isNonOpenAIModel2(model) {
+   return model.startsWith("claude-") || model.startsWith("gemini-");
+ }
  function countTokens(input) {
    const { text, model } = input;
-   if (isNonOpenAIModel(model)) {
+   if (isNonOpenAIModel2(model)) {
      return {
        tokens: estimate({ text, model }).estimatedTokens,
        exact: false
@@ -568,10 +863,14 @@ function countTokens(input) {
  0 && (module.exports = {
    DEFAULT_MODELS,
    LAST_UPDATED,
+   countAnthropicInputTokens,
+   countGeminiTokens,
+   countGemmaSentencePieceTokens,
    countTokens,
    decode,
    encode,
    estimate,
+   estimateAsync,
    getAvailableModels,
    getModelConfig
  });
package/dist/index.d.cts CHANGED
@@ -8,6 +8,13 @@ interface ModelConfig {
    inputCostPerMillion: number;
  }
  type TokenizerMode = 'heuristic' | 'openai_exact' | 'auto';
+ /**
+  * Tokenizer modes supported by `estimateAsync(...)`.
+  *
+  * This is intentionally separate from `TokenizerMode` to avoid breaking
+  * TypeScript users who exhaustively switch on the legacy `TokenizerMode` union.
+  */
+ type TokenizerModeAsync = TokenizerMode | 'anthropic_count_tokens' | 'gemini_count_tokens' | 'gemma_sentencepiece';
  /**
   * Input parameters for the estimate function.
   */
@@ -26,6 +33,53 @@ interface EstimateInput {
     */
    tokenizer?: TokenizerMode;
  }
+ interface EstimateAsyncInput extends Omit<EstimateInput, 'tokenizer'> {
+   /**
+    * Token counting strategy for async estimation.
+    * Includes provider-backed modes that require network access or local model files.
+    */
+   tokenizer?: TokenizerModeAsync;
+   /**
+    * Optional fetch implementation (useful for tests, edge runtimes, or custom fetch).
+    * Defaults to globalThis.fetch.
+    */
+   fetch?: typeof fetch;
+   /**
+    * If true, provider-backed tokenizer modes will fall back to heuristic token estimation
+    * when the provider API is throttled/unavailable or the API key is invalid.
+    *
+    * This never stores API keys; it only affects error handling.
+    *
+    * Default: false (throw on provider errors)
+    */
+   fallbackToHeuristicOnError?: boolean;
+   /**
+    * Configuration for Anthropic token counting.
+    * Only used when tokenizer === 'anthropic_count_tokens'.
+    */
+   anthropic?: {
+     apiKey?: string;
+     baseUrl?: string;
+     version?: string;
+     system?: string;
+   };
+   /**
+    * Configuration for Gemini token counting (Google AI Studio / Generative Language API).
+    * Only used when tokenizer === 'gemini_count_tokens'.
+    */
+   gemini?: {
+     apiKey?: string;
+     baseUrl?: string;
+   };
+   /**
+    * Configuration for local Gemma SentencePiece tokenization.
+    * Only used when tokenizer === 'gemma_sentencepiece'.
+    */
+   gemma?: {
+     /** Filesystem path to a SentencePiece model file (e.g. Gemma tokenizer.model). */
+     modelPath?: string;
+   };
+ }
  /**
   * Output from the estimate function.
   */
@@ -41,7 +95,7 @@ interface EstimateOutput {
    /** The chars-per-token ratio used */
    charsPerToken: number;
    /** Which tokenizer strategy was used */
-   tokenizerMode?: TokenizerMode;
+   tokenizerMode?: TokenizerModeAsync;
    /** OpenAI encoding used when tokenizerMode is `openai_exact` */
    encodingUsed?: string;
  }
@@ -65,6 +119,8 @@ interface EstimateOutput {
   */
  declare function estimate(input: EstimateInput): EstimateOutput;

+ declare function estimateAsync(input: EstimateAsyncInput): Promise<EstimateOutput>;
+
  /**
   * Default model configurations.
   *
@@ -141,4 +197,47 @@ interface TokenCountOutput {
   */
  declare function countTokens(input: TokenCountInput): TokenCountOutput;

- export { DEFAULT_MODELS, type EncodeOptions, type EstimateInput, type EstimateOutput, LAST_UPDATED, type ModelConfig, type OpenAIEncoding, type SpecialTokenHandling, type TokenCountInput, type TokenCountOutput, type TokenizerMode, countTokens, decode, encode, estimate, getAvailableModels, getModelConfig };
+ interface AnthropicCountTokensParams {
+   /** Claude model id, e.g. `claude-sonnet-4-5` */
+   model: string;
+   /** Anthropic API key. If omitted, uses process.env.ANTHROPIC_API_KEY */
+   apiKey?: string;
+   /** Text-only helper; converted into a single user message. */
+   text?: string;
+   /** Optional system prompt. */
+   system?: string;
+   /** Full messages payload (wins over `text` when provided). */
+   messages?: unknown;
+   /** Override API base URL (default: https://api.anthropic.com) */
+   baseUrl?: string;
+   /** Override Anthropic version header (default: 2023-06-01) */
+   version?: string;
+   /** Optional fetch implementation. Defaults to globalThis.fetch. */
+   fetch?: typeof fetch;
+ }
+ declare function countAnthropicInputTokens(params: AnthropicCountTokensParams): Promise<number>;
+
+ interface GeminiCountTokensParams {
+   /** Gemini model id, e.g. `gemini-2.0-flash` */
+   model: string;
+   /** Gemini API key. If omitted, uses process.env.GEMINI_API_KEY */
+   apiKey?: string;
+   /** Text-only helper; converted into a basic `contents` payload. */
+   text?: string;
+   /** Full `contents` payload (wins over `text` when provided). */
+   contents?: unknown;
+   /** Override API base URL (default: https://generativelanguage.googleapis.com) */
+   baseUrl?: string;
+   /** Optional fetch implementation. Defaults to globalThis.fetch. */
+   fetch?: typeof fetch;
+ }
+ declare function countGeminiTokens(params: GeminiCountTokensParams): Promise<number>;
+
+ interface GemmaSentencePieceCountTokensParams {
+   /** Filesystem path to a SentencePiece model file (e.g. Gemma `tokenizer.model`). */
+   modelPath: string;
+   text: string;
+ }
+ declare function countGemmaSentencePieceTokens(params: GemmaSentencePieceCountTokensParams): Promise<number>;
+
+ export { type AnthropicCountTokensParams, DEFAULT_MODELS, type EncodeOptions, type EstimateAsyncInput, type EstimateInput, type EstimateOutput, type GeminiCountTokensParams, type GemmaSentencePieceCountTokensParams, LAST_UPDATED, type ModelConfig, type OpenAIEncoding, type SpecialTokenHandling, type TokenCountInput, type TokenCountOutput, type TokenizerMode, type TokenizerModeAsync, countAnthropicInputTokens, countGeminiTokens, countGemmaSentencePieceTokens, countTokens, decode, encode, estimate, estimateAsync, getAvailableModels, getModelConfig };
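The provider helpers declared above are exported on their own, so they can be called without going through `estimateAsync()`. A minimal sketch (assumes `ANTHROPIC_API_KEY` is set; run server-side):

```ts
import { countAnthropicInputTokens } from 'ai-token-estimator';

// Resolves to the provider-reported input token count; on failure it throws
// an Error carrying a `status` field (e.g. 401, 429, or 5xx).
const tokens = await countAnthropicInputTokens({
  model: 'claude-sonnet-4-5',
  text: 'Hello, Claude',
  system: 'You are a helpful assistant',
});
console.log(tokens);
```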
package/dist/index.d.ts CHANGED
@@ -8,6 +8,13 @@ interface ModelConfig {
    inputCostPerMillion: number;
  }
  type TokenizerMode = 'heuristic' | 'openai_exact' | 'auto';
+ /**
+  * Tokenizer modes supported by `estimateAsync(...)`.
+  *
+  * This is intentionally separate from `TokenizerMode` to avoid breaking
+  * TypeScript users who exhaustively switch on the legacy `TokenizerMode` union.
+  */
+ type TokenizerModeAsync = TokenizerMode | 'anthropic_count_tokens' | 'gemini_count_tokens' | 'gemma_sentencepiece';
  /**
   * Input parameters for the estimate function.
   */
@@ -26,6 +33,53 @@ interface EstimateInput {
     */
    tokenizer?: TokenizerMode;
  }
+ interface EstimateAsyncInput extends Omit<EstimateInput, 'tokenizer'> {
+   /**
+    * Token counting strategy for async estimation.
+    * Includes provider-backed modes that require network access or local model files.
+    */
+   tokenizer?: TokenizerModeAsync;
+   /**
+    * Optional fetch implementation (useful for tests, edge runtimes, or custom fetch).
+    * Defaults to globalThis.fetch.
+    */
+   fetch?: typeof fetch;
+   /**
+    * If true, provider-backed tokenizer modes will fall back to heuristic token estimation
+    * when the provider API is throttled/unavailable or the API key is invalid.
+    *
+    * This never stores API keys; it only affects error handling.
+    *
+    * Default: false (throw on provider errors)
+    */
+   fallbackToHeuristicOnError?: boolean;
+   /**
+    * Configuration for Anthropic token counting.
+    * Only used when tokenizer === 'anthropic_count_tokens'.
+    */
+   anthropic?: {
+     apiKey?: string;
+     baseUrl?: string;
+     version?: string;
+     system?: string;
+   };
+   /**
+    * Configuration for Gemini token counting (Google AI Studio / Generative Language API).
+    * Only used when tokenizer === 'gemini_count_tokens'.
+    */
+   gemini?: {
+     apiKey?: string;
+     baseUrl?: string;
+   };
+   /**
+    * Configuration for local Gemma SentencePiece tokenization.
+    * Only used when tokenizer === 'gemma_sentencepiece'.
+    */
+   gemma?: {
+     /** Filesystem path to a SentencePiece model file (e.g. Gemma tokenizer.model). */
+     modelPath?: string;
+   };
+ }
  /**
   * Output from the estimate function.
   */
@@ -41,7 +95,7 @@ interface EstimateOutput {
    /** The chars-per-token ratio used */
    charsPerToken: number;
    /** Which tokenizer strategy was used */
-   tokenizerMode?: TokenizerMode;
+   tokenizerMode?: TokenizerModeAsync;
    /** OpenAI encoding used when tokenizerMode is `openai_exact` */
    encodingUsed?: string;
  }
@@ -65,6 +119,8 @@ interface EstimateOutput {
   */
  declare function estimate(input: EstimateInput): EstimateOutput;

+ declare function estimateAsync(input: EstimateAsyncInput): Promise<EstimateOutput>;
+
  /**
   * Default model configurations.
   *
@@ -141,4 +197,47 @@ interface TokenCountOutput {
   */
  declare function countTokens(input: TokenCountInput): TokenCountOutput;

- export { DEFAULT_MODELS, type EncodeOptions, type EstimateInput, type EstimateOutput, LAST_UPDATED, type ModelConfig, type OpenAIEncoding, type SpecialTokenHandling, type TokenCountInput, type TokenCountOutput, type TokenizerMode, countTokens, decode, encode, estimate, getAvailableModels, getModelConfig };
+ interface AnthropicCountTokensParams {
+   /** Claude model id, e.g. `claude-sonnet-4-5` */
+   model: string;
+   /** Anthropic API key. If omitted, uses process.env.ANTHROPIC_API_KEY */
+   apiKey?: string;
+   /** Text-only helper; converted into a single user message. */
+   text?: string;
+   /** Optional system prompt. */
+   system?: string;
+   /** Full messages payload (wins over `text` when provided). */
+   messages?: unknown;
+   /** Override API base URL (default: https://api.anthropic.com) */
+   baseUrl?: string;
+   /** Override Anthropic version header (default: 2023-06-01) */
+   version?: string;
+   /** Optional fetch implementation. Defaults to globalThis.fetch. */
+   fetch?: typeof fetch;
+ }
+ declare function countAnthropicInputTokens(params: AnthropicCountTokensParams): Promise<number>;
+
+ interface GeminiCountTokensParams {
+   /** Gemini model id, e.g. `gemini-2.0-flash` */
+   model: string;
+   /** Gemini API key. If omitted, uses process.env.GEMINI_API_KEY */
+   apiKey?: string;
+   /** Text-only helper; converted into a basic `contents` payload. */
+   text?: string;
+   /** Full `contents` payload (wins over `text` when provided). */
+   contents?: unknown;
+   /** Override API base URL (default: https://generativelanguage.googleapis.com) */
+   baseUrl?: string;
+   /** Optional fetch implementation. Defaults to globalThis.fetch. */
+   fetch?: typeof fetch;
+ }
+ declare function countGeminiTokens(params: GeminiCountTokensParams): Promise<number>;
+
+ interface GemmaSentencePieceCountTokensParams {
+   /** Filesystem path to a SentencePiece model file (e.g. Gemma `tokenizer.model`). */
+   modelPath: string;
+   text: string;
+ }
+ declare function countGemmaSentencePieceTokens(params: GemmaSentencePieceCountTokensParams): Promise<number>;
+
+ export { type AnthropicCountTokensParams, DEFAULT_MODELS, type EncodeOptions, type EstimateAsyncInput, type EstimateInput, type EstimateOutput, type GeminiCountTokensParams, type GemmaSentencePieceCountTokensParams, LAST_UPDATED, type ModelConfig, type OpenAIEncoding, type SpecialTokenHandling, type TokenCountInput, type TokenCountOutput, type TokenizerMode, type TokenizerModeAsync, countAnthropicInputTokens, countGeminiTokens, countGemmaSentencePieceTokens, countTokens, decode, encode, estimate, estimateAsync, getAvailableModels, getModelConfig };
package/dist/index.js CHANGED
@@ -361,14 +361,21 @@ var models = {
  Object.values(models).forEach((config) => Object.freeze(config));
  var DEFAULT_MODELS = Object.freeze(models);
  function getModelConfig(model) {
-   const config = DEFAULT_MODELS[model];
-   if (!config) {
+   const direct = DEFAULT_MODELS[model];
+   if (direct) return direct;
+   const normalized = (() => {
+     if (!model.startsWith("claude-")) return model;
+     const withoutDate = model.replace(/-\d{8}$/, "");
+     return withoutDate.replace(/-(\d+)-(\d+)$/, (_m, major, minor) => `-${major}.${minor}`);
+   })();
+   const aliased = DEFAULT_MODELS[normalized];
+   if (!aliased) {
      const available = Object.keys(DEFAULT_MODELS).join(", ");
      throw new Error(
        `Unknown model: "${model}". Available models: ${available}`
      );
    }
-   return config;
+   return aliased;
  }
  function getAvailableModels() {
    return Object.keys(DEFAULT_MODELS);
@@ -457,13 +464,17 @@ function countCodePoints(text) {
  function estimate(input) {
    const { text, model, rounding = "ceil", tokenizer = "heuristic" } = input;
    const config = getModelConfig(model);
+   const tokenizerStr = tokenizer;
+   if (tokenizerStr === "anthropic_count_tokens" || tokenizerStr === "gemini_count_tokens" || tokenizerStr === "gemma_sentencepiece") {
+     throw new Error(`Tokenizer mode "${tokenizerStr}" requires async execution. Use estimateAsync(...) instead.`);
+   }
    const characterCount = countCodePoints(text);
-   const isNonOpenAIModel2 = model.startsWith("claude-") || model.startsWith("gemini-");
+   const isNonOpenAIModel3 = model.startsWith("claude-") || model.startsWith("gemini-");
    let estimatedTokens;
    let tokenizerModeUsed = "heuristic";
    let encodingUsed;
    const shouldTryExact = tokenizer === "openai_exact" || tokenizer === "auto";
-   if (shouldTryExact && !isNonOpenAIModel2) {
+   if (shouldTryExact && !isNonOpenAIModel3) {
      try {
        estimatedTokens = encode(text, { model, allowSpecial: "none" }).length;
        tokenizerModeUsed = "openai_exact";
@@ -473,7 +484,7 @@ function estimate(input) {
          throw error;
        }
      }
-   } else if (tokenizer === "openai_exact" && isNonOpenAIModel2) {
+   } else if (tokenizer === "openai_exact" && isNonOpenAIModel3) {
      throw new Error(
        `Tokenizer mode "openai_exact" requested for non-OpenAI model: "${model}"`
      );
@@ -505,13 +516,283 @@ function estimate(input) {
    };
  }

- // src/token-counter.ts
+ // src/providers/anthropic.ts
+ function getFetch(fetchImpl) {
+   const f = fetchImpl ?? globalThis.fetch;
+   if (!f) {
+     throw new Error("globalThis.fetch is not available; pass fetch in AnthropicCountTokensParams");
+   }
+   return f;
+ }
+ function withStatus(message, status) {
+   const err = new Error(message);
+   err.status = status;
+   return err;
+ }
+ function getApiKey(explicit) {
+   const key = explicit ?? (typeof process !== "undefined" ? process.env.ANTHROPIC_API_KEY : void 0);
+   if (!key) throw withStatus("Anthropic API key missing (set ANTHROPIC_API_KEY or pass apiKey)", 401);
+   return key;
+ }
+ function asRecord(value) {
+   if (!value || typeof value !== "object" || Array.isArray(value)) return null;
+   return value;
+ }
+ async function countAnthropicInputTokens(params) {
+   const fetchImpl = getFetch(params.fetch);
+   const apiKey = getApiKey(params.apiKey);
+   const baseUrl = (params.baseUrl ?? "https://api.anthropic.com").replace(/\/+$/, "");
+   const version = params.version ?? "2023-06-01";
+   const messages = params.messages ?? (typeof params.text === "string" ? [{ role: "user", content: params.text }] : null);
+   if (!messages) {
+     throw new Error("Anthropic token counting requires either `messages` or `text`");
+   }
+   const body = {
+     model: params.model,
+     messages
+   };
+   if (typeof params.system === "string" && params.system.trim()) {
+     body.system = params.system;
+   }
+   const response = await fetchImpl(`${baseUrl}/v1/messages/count_tokens`, {
+     method: "POST",
+     headers: {
+       "content-type": "application/json",
+       "x-api-key": apiKey,
+       "anthropic-version": version
+     },
+     body: JSON.stringify(body)
+   });
+   const text = await response.text();
+   let data = null;
+   try {
+     data = text ? JSON.parse(text) : null;
+   } catch {
+   }
+   const dataObj = asRecord(data);
+   if (!response.ok) {
+     const errorObj = asRecord(dataObj?.error);
+     const msg = typeof errorObj?.message === "string" ? errorObj.message : typeof dataObj?.message === "string" ? dataObj.message : `HTTP ${response.status}`;
+     throw withStatus(`Anthropic count_tokens failed: ${msg}`, response.status);
+   }
+   const inputTokens = dataObj?.input_tokens;
+   if (typeof inputTokens !== "number" || !Number.isFinite(inputTokens) || inputTokens < 0) {
+     throw new Error("Anthropic count_tokens returned invalid input_tokens");
+   }
+   return inputTokens;
+ }
+
+ // src/providers/gemini.ts
+ function getFetch2(fetchImpl) {
+   const f = fetchImpl ?? globalThis.fetch;
+   if (!f) {
+     throw new Error("globalThis.fetch is not available; pass fetch in GeminiCountTokensParams");
+   }
+   return f;
+ }
+ function withStatus2(message, status) {
+   const err = new Error(message);
+   err.status = status;
+   return err;
+ }
+ function getApiKey2(explicit) {
+   const key = explicit ?? (typeof process !== "undefined" ? process.env.GEMINI_API_KEY : void 0);
+   if (!key) throw withStatus2("Gemini API key missing (set GEMINI_API_KEY or pass apiKey)", 401);
+   return key;
+ }
+ function toContents(text) {
+   return [{ role: "user", parts: [{ text }] }];
+ }
+ function asRecord2(value) {
+   if (!value || typeof value !== "object" || Array.isArray(value)) return null;
+   return value;
+ }
+ async function countGeminiTokens(params) {
+   const fetchImpl = getFetch2(params.fetch);
+   const apiKey = getApiKey2(params.apiKey);
+   const baseUrl = (params.baseUrl ?? "https://generativelanguage.googleapis.com").replace(/\/+$/, "");
+   const contents = params.contents ?? (typeof params.text === "string" ? toContents(params.text) : null);
+   if (!contents) {
+     throw new Error("Gemini token counting requires either `contents` or `text`");
+   }
+   const url = `${baseUrl}/v1beta/models/${encodeURIComponent(params.model)}:countTokens?key=${encodeURIComponent(apiKey)}`;
+   const response = await fetchImpl(url, {
+     method: "POST",
+     headers: { "content-type": "application/json" },
+     body: JSON.stringify({ contents })
+   });
+   const text = await response.text();
+   let data = null;
+   try {
+     data = text ? JSON.parse(text) : null;
+   } catch {
+   }
+   const dataObj = asRecord2(data);
+   if (!response.ok) {
+     const errorObj = asRecord2(dataObj?.error);
+     const msg = typeof errorObj?.message === "string" ? errorObj.message : typeof dataObj?.message === "string" ? dataObj.message : `HTTP ${response.status}`;
+     throw withStatus2(`Gemini countTokens failed: ${msg}`, response.status);
+   }
+   const totalTokens = dataObj?.totalTokens ?? dataObj?.total_tokens ?? dataObj?.total_tokens_count;
+   if (typeof totalTokens !== "number" || !Number.isFinite(totalTokens) || totalTokens < 0) {
+     throw new Error("Gemini countTokens returned invalid totalTokens");
+   }
+   return totalTokens;
+ }
+
+ // src/providers/gemma-sentencepiece.ts
+ async function loadSentencePiece() {
+   try {
+     const mod = await import("sentencepiece-js");
+     if (mod.SentencePieceProcessor || mod.cleanText) return mod;
+     if (mod.default && typeof mod.default === "object" && mod.default.SentencePieceProcessor) {
+       return mod.default;
+     }
+     return mod;
+   } catch {
+     throw new Error(
+       "Local Gemma SentencePiece tokenization requires the optional dependency `sentencepiece-js`. Install it and try again."
+     );
+   }
+ }
+ async function countGemmaSentencePieceTokens(params) {
+   const sp = await loadSentencePiece();
+   const defaults = (sp.default && typeof sp.default === "object" ? sp.default : null) ?? {};
+   const SentencePieceProcessor = sp.SentencePieceProcessor ?? defaults.SentencePieceProcessor;
+   const cleanText = sp.cleanText ?? defaults.cleanText;
+   if (!SentencePieceProcessor || typeof SentencePieceProcessor !== "function") {
+     throw new Error("sentencepiece-js did not export SentencePieceProcessor as expected");
+   }
+   const processor = new SentencePieceProcessor();
+   const loaded = processor.load(params.modelPath);
+   if (loaded instanceof Promise) await loaded;
+   const cleaned = typeof cleanText === "function" ? cleanText(params.text) : params.text;
+   const ids = processor.encodeIds(cleaned);
+   if (!Array.isArray(ids)) {
+     throw new Error("sentencepiece-js returned invalid ids from encodeIds");
+   }
+   return ids.length;
+ }
+
+ // src/estimator-async.ts
+ function countCodePoints2(text) {
+   let count = 0;
+   for (const _char of text) count++;
+   return count;
+ }
  function isNonOpenAIModel(model) {
    return model.startsWith("claude-") || model.startsWith("gemini-");
  }
+ function shouldFallbackToHeuristic(err) {
+   if (!err) return true;
+   const maybe = err;
+   const statusRaw = maybe.status;
+   const status = typeof statusRaw === "number" && Number.isFinite(statusRaw) ? statusRaw : null;
+   if (!status) return true;
+   if (status === 401 || status === 403 || status === 429) return true;
+   if (status >= 500 && status <= 599) return true;
+   return false;
+ }
+ async function estimateAsync(input) {
+   const { text, model, rounding = "ceil", tokenizer = "heuristic" } = input;
+   const config = getModelConfig(model);
+   const characterCount = countCodePoints2(text);
+   let estimatedTokens;
+   let tokenizerModeUsed = "heuristic";
+   let encodingUsed;
+   if (tokenizer === "anthropic_count_tokens") {
+     try {
+       estimatedTokens = await countAnthropicInputTokens({
+         model,
+         text,
+         system: input.anthropic?.system,
+         apiKey: input.anthropic?.apiKey,
+         baseUrl: input.anthropic?.baseUrl,
+         version: input.anthropic?.version,
+         fetch: input.fetch
+       });
+       tokenizerModeUsed = "anthropic_count_tokens";
+     } catch (error) {
+       if (input.fallbackToHeuristicOnError && shouldFallbackToHeuristic(error)) {
+         estimatedTokens = void 0;
+         tokenizerModeUsed = "heuristic";
+       } else {
+         throw error;
+       }
+     }
+   } else if (tokenizer === "gemini_count_tokens") {
+     try {
+       estimatedTokens = await countGeminiTokens({
+         model,
+         text,
+         apiKey: input.gemini?.apiKey,
+         baseUrl: input.gemini?.baseUrl,
+         fetch: input.fetch
+       });
+       tokenizerModeUsed = "gemini_count_tokens";
+     } catch (error) {
+       if (input.fallbackToHeuristicOnError && shouldFallbackToHeuristic(error)) {
+         estimatedTokens = void 0;
+         tokenizerModeUsed = "heuristic";
+       } else {
+         throw error;
+       }
+     }
+   } else if (tokenizer === "gemma_sentencepiece") {
+     const modelPath = input.gemma?.modelPath;
+     if (!modelPath) {
+       throw new Error("gemma_sentencepiece tokenizer requires gemma.modelPath (path to tokenizer.model)");
+     }
+     estimatedTokens = await countGemmaSentencePieceTokens({ modelPath, text });
+     tokenizerModeUsed = "gemma_sentencepiece";
+   } else {
+     const shouldTryExact = tokenizer === "openai_exact" || tokenizer === "auto";
+     if (shouldTryExact && !isNonOpenAIModel(model)) {
+       try {
+         estimatedTokens = encode(text, { model, allowSpecial: "none" }).length;
+         tokenizerModeUsed = "openai_exact";
+         encodingUsed = getOpenAIEncoding({ model });
+       } catch (error) {
+         if (tokenizer === "openai_exact") throw error;
+       }
+     } else if (tokenizer === "openai_exact" && isNonOpenAIModel(model)) {
+       throw new Error(`Tokenizer mode "openai_exact" requested for non-OpenAI model: "${model}"`);
+     }
+   }
+   if (estimatedTokens === void 0) {
+     const rawTokens = characterCount / config.charsPerToken;
+     switch (rounding) {
+       case "floor":
+         estimatedTokens = Math.floor(rawTokens);
+         break;
+       case "round":
+         estimatedTokens = Math.round(rawTokens);
+         break;
+       case "ceil":
+       default:
+         estimatedTokens = Math.ceil(rawTokens);
+     }
+     tokenizerModeUsed = "heuristic";
+   }
+   const estimatedInputCost = estimatedTokens * config.inputCostPerMillion / 1e6;
+   return {
+     model,
+     characterCount,
+     estimatedTokens,
+     estimatedInputCost,
+     charsPerToken: config.charsPerToken,
+     tokenizerMode: tokenizerModeUsed,
+     encodingUsed
+   };
+ }
+
+ // src/token-counter.ts
+ function isNonOpenAIModel2(model) {
+   return model.startsWith("claude-") || model.startsWith("gemini-");
+ }
  function countTokens(input) {
    const { text, model } = input;
-   if (isNonOpenAIModel(model)) {
+   if (isNonOpenAIModel2(model)) {
      return {
        tokens: estimate({ text, model }).estimatedTokens,
        exact: false
@@ -533,10 +814,14 @@ function countTokens(input) {
  export {
    DEFAULT_MODELS,
    LAST_UPDATED,
+   countAnthropicInputTokens,
+   countGeminiTokens,
+   countGemmaSentencePieceTokens,
    countTokens,
    decode,
    encode,
    estimate,
+   estimateAsync,
    getAvailableModels,
    getModelConfig
  };
package/package.json CHANGED
@@ -1,7 +1,7 @@
  {
    "name": "ai-token-estimator",
-   "version": "1.1.0",
-   "description": "Estimate token counts and costs for LLM API calls",
+   "version": "1.2.0",
+   "description": "Estimate and count tokens (incl. exact OpenAI BPE) and input costs for LLM API calls",
    "type": "module",
    "main": "./dist/index.cjs",
    "module": "./dist/index.js",
@@ -18,13 +18,17 @@
        }
      }
    },
+   "publishConfig": {
+     "access": "public"
+   },
    "files": [
      "dist",
      "LICENSE",
      "README.md"
    ],
    "dependencies": {
-     "gpt-tokenizer": "^3.4.0"
+     "gpt-tokenizer": "^3.4.0",
+     "sentencepiece-js": "^1.1.0"
    },
    "scripts": {
      "build": "tsup src/index.ts --format cjs,esm --dts",
@@ -37,8 +41,14 @@
    },
    "keywords": [
      "llm",
+     "tokenizer",
+     "token-count",
+     "token-counter",
      "tokens",
      "estimator",
+     "cost-estimator",
+     "tiktoken",
+     "bpe",
      "openai",
      "anthropic",
      "claude",