active_harness_pricing 0.1.3 → 0.1.4

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: f689ca4f8b7b527abbb2acda1b27f1204cf67df99ce45054f26a5039e8f4ee16
- data.tar.gz: df4cfba2637bbcb51712f31e832dd541fbf5b1700d0a4095fb4a50e2e371ae8b
+ metadata.gz: de32174380ea4e5c91b08472367ac37fa1cf3a8c970d268c609dcc4ee5a7f5bc
+ data.tar.gz: 2c6fe410024077d7f24bda114a535967add07180c0e663ddf17eee4d598781ea
  SHA512:
- metadata.gz: 7837a3c69a2a7f70e8c97a5617820a7ee072ea1540c5a99cde473309a1ddd3103088768b4db079095fce26b0a74a47cdbca0c8b4cf66c58c9b34e72d1daec79c
- data.tar.gz: 013d47bfe9231a842ca7ad09f6d3b181bfb253aa797be5027aad22c95a4d99ad2f8d2d1a02945c272a010c1fe6f508f515802e6bbd9b568bd9587a1a557b9191
+ metadata.gz: 10ef9ceeb213eb4bb50fad4751b8e001bdf0452193d058d4a30a8ffa8dfbb6e35442216c230a501c2caa7968082c8bcba02396d96a7f20e2b027ec3f0f3f4320
+ data.tar.gz: 3ba05d9f76e9a0813488e3b23ce7bf0643160322a37e6067f103e73931304efbad2cda8c3a200957963160887a5f3b3768b574750353aacc4a623ef9ea3c9939
data/data/README.md ADDED
@@ -0,0 +1,127 @@
+ # Pricing Data
+
+ Three JSON files with LLM model pricing from different sources, all sharing the same format. Updated daily via GitHub Actions.
+
+ | File | Source | Models | Notes |
+ |---|---|---|---|
+ | `pricepertoken.json` | [pricepertoken.com](https://pricepertoken.com) | ~550 | Includes performance data (TPS, TTFT) |
+ | `modelsdev.json` | [models.dev](https://models.dev) | ~1600 | Broadest provider coverage |
+ | `openrouter.json` | [openrouter.ai](https://openrouter.ai) | ~290 | Routing prices via OpenRouter |
+
+ ---
+
+ ## Canonical Key
+
+ Each entry is keyed by a normalized model name — lowercase, hyphen-separated, no provider prefix:
+
+ ```
+ "gpt-4o"
+ "mistral-nemo"
+ "claude-3-5-sonnet"
+ "gemini-2-5-flash"
+ ```
+
+ The same key across different files refers to the same model, which allows price comparison between sources.
+
+ **How the key is derived** from raw provider model IDs:
+
+ ```
+ "mistralai/mistral-nemo" → "mistral-nemo"
+ "global.anthropic.claude-haiku-4-5-20251001-v1:0" → "claude-haiku-4-5"
+ "gpt-4o" → "gpt-4o"
+ "models/gemini-2.5-flash" → "gemini-2-5-flash"
+ "claude-3-5-haiku-20241022" → "claude-3-5-haiku"
+ ```
+
+ Rules: strip `author/` prefix, remove date/version suffixes (`-20241022`, `-v2:0`), normalize separators to hyphens.
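The rules above can be sketched in Ruby roughly as follows. This is a minimal illustration, not the package's actual normalizer: the method name `canonical_key` and the regular expressions are assumptions, and the handling of dotted provider prefixes such as `global.anthropic.` is inferred from the examples rather than stated in the rules.

```ruby
# Hypothetical helper: derives a canonical key from a raw provider model ID.
# The patterns below are assumptions inferred from the README examples.
def canonical_key(raw_id)
  key = raw_id.downcase
  key = key.split("/").last.to_s                 # drop "author/" or "models/" prefix
  key = key.sub(/\A(?:[a-z]+\.)+(?=[a-z])/, "")  # drop dotted prefixes, e.g. "global.anthropic."
  key = key.sub(/-v\d+(?::\d+)?\z/, "")          # drop version suffix, e.g. "-v1:0"
  key = key.sub(/-\d{8}\z/, "")                  # drop date suffix, e.g. "-20241022"
  key.gsub(/[._\s]+/, "-")                       # normalize remaining separators to hyphens
end

puts canonical_key("models/gemini-2.5-flash") # prints "gemini-2-5-flash"
```

The version suffix is removed before the date suffix so that an ID ending in `-20251001-v1:0` loses both parts.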
+
+ ---
+
+ ## Entry Format
+
+ ```json
+ {
+   "mistral-nemo": {
+     "name": "Mistral Nemo",
+     "input_per_1m": 0.02,
+     "output_per_1m": 0.03,
+     "context_window": 131072,
+     "cache_read_per_1m": 0.005,
+     "tokens_per_second": 85.4,
+     "time_to_first_token": 0.61
+   }
+ }
+ ```
+
+ ### Fields
+
+ | Field | Type | Description |
+ |---|---|---|
+ | `name` | String | Display name of the model |
+ | `input_per_1m` | Float | Price per 1M input tokens, USD |
+ | `output_per_1m` | Float | Price per 1M output tokens, USD |
+ | `context_window` | Integer? | Maximum context length in tokens |
+ | `cache_read_per_1m` | Float? | Price per 1M cache-read tokens (Anthropic, OpenAI) |
+ | `tokens_per_second` | Float? | Generation speed — `pricepertoken.json` only |
+ | `time_to_first_token` | Float? | Time to first token in seconds — `pricepertoken.json` only |
+
+ Fields marked `?` are optional and not present for every model.
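As an illustration of how a consumer might treat required and optional fields differently, here is a small Ruby sketch. The `PricingEntry` struct and the `parse_entry` helper are hypothetical names, not part of the package, and the sample JSON is a trimmed copy of the entry shown above with most optional fields left out.

```ruby
require "json"

# Hypothetical value object; field names mirror the Fields table.
PricingEntry = Struct.new(
  :name, :input_per_1m, :output_per_1m, :context_window,
  :cache_read_per_1m, :tokens_per_second, :time_to_first_token,
  keyword_init: true
)

# Required fields use fetch (raises KeyError if missing);
# optional fields use [] and come back as nil when absent.
def parse_entry(hash)
  PricingEntry.new(
    name: hash.fetch("name"),
    input_per_1m: hash.fetch("input_per_1m"),
    output_per_1m: hash.fetch("output_per_1m"),
    context_window: hash["context_window"],
    cache_read_per_1m: hash["cache_read_per_1m"],
    tokens_per_second: hash["tokens_per_second"],
    time_to_first_token: hash["time_to_first_token"]
  )
end

data = JSON.parse(<<~JSON)
  {
    "mistral-nemo": {
      "name": "Mistral Nemo",
      "input_per_1m": 0.02,
      "output_per_1m": 0.03,
      "context_window": 131072
    }
  }
JSON

entry = parse_entry(data.fetch("mistral-nemo"))
puts entry.name                      # prints "Mistral Nemo"
puts entry.cache_read_per_1m.inspect # prints "nil"
```

Failing fast on a missing required field while tolerating missing optional ones keeps malformed entries from silently producing zero-cost estimates.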
+
+ ---
+
+ ## Multimodality and Per-Token-Type Pricing
+
+ Many modern models accept multiple token types, each with its own price.
+
+ ### Examples
+
+ **GPT-4o** accepts text and images:
+ - Text input tokens: `$2.5/M`
+ - Image input tokens: billed separately at ~`$1.275/M` (OpenAI counts image tiles by their own scheme)
+ - Cache-read tokens: `$1.25/M` → stored in `cache_read_per_1m`
+
+ **Gemini 2.5 Flash** accepts text, images, audio, and video:
+ - Text tokens: `$0.09/M` input / `$0.71/M` output
+ - Audio tokens: `$0.07/M` input (different rate)
+ - Images and video: separate rates per provider scheme
+
+ **Claude** supports prompt caching:
+ - Standard input tokens: `$3.0/M`
+ - Cache-read tokens: `$0.3/M` (10× cheaper) → stored in `cache_read_per_1m`
+ - Cache-write tokens: `$3.75/M` (more expensive; not stored in this format)
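To show how `cache_read_per_1m` might enter a cost estimate, the sketch below prices a single text-only request from one entry. The `estimate_cost` helper is hypothetical, and falling back to the regular input rate when no cache-read price is present is an assumption, not documented behavior. The rates are the `mistral-nemo` values from the Entry Format example; cache-write charges are left out because the format does not store them.

```ruby
TOKENS_PER_MILLION = 1_000_000.0

# Hypothetical helper: estimates the USD cost of one text-only request.
# cached_tokens is the part of input_tokens served from the prompt cache.
def estimate_cost(entry, input_tokens:, output_tokens:, cached_tokens: 0)
  input_rate = entry.fetch("input_per_1m")
  # Assumption: without a cache-read price, cached tokens are billed as normal input.
  cache_rate = entry["cache_read_per_1m"] || input_rate
  uncached_tokens = input_tokens - cached_tokens

  (uncached_tokens * input_rate +
    cached_tokens * cache_rate +
    output_tokens * entry.fetch("output_per_1m")) / TOKENS_PER_MILLION
end

entry = { "input_per_1m" => 0.02, "output_per_1m" => 0.03, "cache_read_per_1m" => 0.005 }
cost = estimate_cost(entry, input_tokens: 1_000_000, output_tokens: 500_000, cached_tokens: 400_000)
puts format("%.4f", cost) # prints "0.0290"
```

Here 600,000 uncached input tokens cost $0.012, 400,000 cached tokens cost $0.002, and 500,000 output tokens cost $0.015, for a total of about $0.029.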
+
+ ### Why only `input_per_1m` for input
+
+ The format stores the **text token rate** as the primary input price. This covers ~95% of real-world usage since most LLM requests send text.
+
+ Image, audio, and video token rates are **not unified** in this format because each provider uses different billing units (tiles, seconds, pixels). For accurate cost calculation on multimodal requests, consult the provider's API documentation directly.
+
+ ---
+
+ ## Price Differences Between Sources
+
+ The same model may have different prices across files:
+
+ ```
+ gemini-2-5-flash:
+   pricepertoken → $0.30/M input (Google's official rate)
+   modelsdev → $0.09/M input (different tier or stale data)
+   openrouter → $0.30/M input (OpenRouter routing price)
+ ```
+
+ Common reasons for differences:
+ - Providers offer multiple pricing tiers (Batch API, volume discounts)
+ - Sources update at different frequencies and may carry stale data
+ - OpenRouter adds a small markup over the underlying provider price
+
+ `PriceResolver.max_cost` selects the **highest** price across all sources as a conservative cost estimate; `PriceResolver.min_cost` selects the lowest.
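The selection described here can be illustrated with a few lines of Ruby. This is not the gem's `PriceResolver` implementation, whose signature is not shown in this diff; `input_prices`, `max_input_price`, and `min_input_price` are hypothetical helpers, and the in-memory `sources` hash stands in for the three parsed JSON files, using the `gemini-2-5-flash` figures above.

```ruby
# Stand-in for the three parsed files, keyed by source name, then canonical key.
sources = {
  "pricepertoken" => { "gemini-2-5-flash" => { "input_per_1m" => 0.30 } },
  "modelsdev"     => { "gemini-2-5-flash" => { "input_per_1m" => 0.09 } },
  "openrouter"    => { "gemini-2-5-flash" => { "input_per_1m" => 0.30 } }
}

# Hypothetical helper: collects the input price from every source that lists the model.
def input_prices(sources, key)
  sources.values.filter_map { |models| models.dig(key, "input_per_1m") }
end

# Conservative estimate: the highest price any source reports (nil if none do).
def max_input_price(sources, key)
  input_prices(sources, key).max
end

# Optimistic estimate: the lowest price any source reports (nil if none do).
def min_input_price(sources, key)
  input_prices(sources, key).min
end

puts max_input_price(sources, "gemini-2-5-flash") # prints 0.3
puts min_input_price(sources, "gemini-2-5-flash") # prints 0.09
```

Because every file uses the same canonical key, the lookup is a plain hash access per source; a model missing from every source yields `nil` rather than a price of zero.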
+
+ ---
+
+ ## Updating Data Manually
+
+ ```bash
+ ruby bin/fetch_pricepertoken # refresh pricepertoken.json
+ ruby bin/fetch_modelsdev # refresh modelsdev.json
+ ruby bin/fetch_openrouter # refresh openrouter.json
+ ```