active_harness_pricing 0.1.3 → 0.1.4
- checksums.yaml +4 -4
- data/data/README.md +127 -0
- data/data/modelsdev.json +10299 -0
- data/data/openrouter.json +1866 -0
- data/data/pricepertoken.json +3226 -0
- data/lib/active_harness/pricing/normalizer.rb +44 -0
- data/lib/active_harness/pricing/price_resolver.rb +128 -0
- data/lib/active_harness/pricing/source.rb +110 -0
- data/lib/active_harness_pricing.rb +4 -1
- metadata +9 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: de32174380ea4e5c91b08472367ac37fa1cf3a8c970d268c609dcc4ee5a7f5bc
+  data.tar.gz: 2c6fe410024077d7f24bda114a535967add07180c0e663ddf17eee4d598781ea
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 10ef9ceeb213eb4bb50fad4751b8e001bdf0452193d058d4a30a8ffa8dfbb6e35442216c230a501c2caa7968082c8bcba02396d96a7f20e2b027ec3f0f3f4320
+  data.tar.gz: 3ba05d9f76e9a0813488e3b23ce7bf0643160322a37e6067f103e73931304efbad2cda8c3a200957963160887a5f3b3768b574750353aacc4a623ef9ea3c9939
data/data/README.md
ADDED
@@ -0,0 +1,127 @@
+# Pricing Data
+
+Three JSON files with LLM model pricing from different sources, all sharing the same format. Updated daily via GitHub Actions.
+
+| File | Source | Models | Notes |
+|---|---|---|---|
+| `pricepertoken.json` | [pricepertoken.com](https://pricepertoken.com) | ~550 | Includes performance data (TPS, TTFT) |
+| `modelsdev.json` | [models.dev](https://models.dev) | ~1600 | Broadest provider coverage |
+| `openrouter.json` | [openrouter.ai](https://openrouter.ai) | ~290 | Routing prices via OpenRouter |
+
+---
+
+## Canonical Key
+
+Each entry is keyed by a normalized model name — lowercase, hyphen-separated, no provider prefix:
+
+```
+"gpt-4o"
+"mistral-nemo"
+"claude-3-5-sonnet"
+"gemini-2-5-flash"
+```
+
+The same key across different files refers to the same model, which allows price comparison between sources.
+
+**How the key is derived** from raw provider model IDs:
+
+```
+"mistralai/mistral-nemo"                          → "mistral-nemo"
+"global.anthropic.claude-haiku-4-5-20251001-v1:0" → "claude-haiku-4-5"
+"gpt-4o"                                          → "gpt-4o"
+"models/gemini-2.5-flash"                         → "gemini-2-5-flash"
+"claude-3-5-haiku-20241022"                       → "claude-3-5-haiku"
+```
+
+Rules: strip `author/` prefix, remove date/version suffixes (`-20241022`, `-v2:0`), normalize separators to hyphens.
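The derivation rules in the README can be illustrated with a small Ruby sketch. This is not the code in `normalizer.rb`, whose contents are not shown in this diff; `canonical_key` is a hypothetical helper, and the regular expressions are assumptions chosen only to reproduce the five examples listed above.

```ruby
# Hypothetical helper, NOT the gem's Normalizer: a minimal sketch of the
# README's rules, tuned only to reproduce the listed examples.
def canonical_key(raw_id)
  key = raw_id.to_s.downcase
  key = key.sub(%r{\A.*/}, "")            # strip "author/" or "models/" prefix
  key = key.sub(/\A(?:[a-z]+\.)+/, "")    # strip dotted namespaces such as "global.anthropic."
  key = key.sub(/-v\d+(?::\d+)?\z/, "")   # strip a trailing version suffix such as "-v1:0"
  key = key.sub(/-\d{8}\z/, "")           # strip a trailing date suffix such as "-20241022"
  key = key.gsub(/[^a-z0-9]+/, "-")       # normalize remaining separators to hyphens
  key.gsub(/\A-+|-+\z/, "")               # trim stray leading or trailing hyphens
end

[
  "mistralai/mistral-nemo",
  "global.anthropic.claude-haiku-4-5-20251001-v1:0",
  "gpt-4o",
  "models/gemini-2.5-flash",
  "claude-3-5-haiku-20241022"
].each do |raw_id|
  puts "#{raw_id.inspect} -> #{canonical_key(raw_id).inspect}"
end
```

The order of the substitutions matters in this sketch: the version suffix is removed before the date suffix, because in the Bedrock-style ID the date sits in front of `-v1:0`.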
+
+---
+
+## Entry Format
+
+```json
+{
+  "mistral-nemo": {
+    "name": "Mistral Nemo",
+    "input_per_1m": 0.02,
+    "output_per_1m": 0.03,
+    "context_window": 131072,
+    "cache_read_per_1m": 0.005,
+    "tokens_per_second": 85.4,
+    "time_to_first_token": 0.61
+  }
+}
+```
+
+### Fields
+
+| Field | Type | Description |
+|---|---|---|
+| `name` | String | Display name of the model |
+| `input_per_1m` | Float | Price per 1M input tokens, USD |
+| `output_per_1m` | Float | Price per 1M output tokens, USD |
+| `context_window` | Integer? | Maximum context length in tokens |
+| `cache_read_per_1m` | Float? | Price per 1M cache-read tokens (Anthropic, OpenAI) |
+| `tokens_per_second` | Float? | Generation speed — `pricepertoken.json` only |
+| `time_to_first_token` | Float? | Time to first token in seconds — `pricepertoken.json` only |
+
+Fields marked `?` are optional and not present for every model.
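To show how the per-1M fields turn into a request cost, here is a minimal Ruby sketch based on the `mistral-nemo` entry above. `estimate_cost` is a hypothetical helper, not part of the gem's API. It assumes that cached tokens are counted inside `input_tokens`, and that a model without `cache_read_per_1m` bills cached tokens at the normal input rate.

```ruby
require "json"

# The entry from the README, trimmed to the pricing fields.
ENTRY_JSON = <<~JSON
  {
    "mistral-nemo": {
      "name": "Mistral Nemo",
      "input_per_1m": 0.02,
      "output_per_1m": 0.03,
      "cache_read_per_1m": 0.005
    }
  }
JSON

TOKENS_PER_PRICE_UNIT = 1_000_000.0

# Hypothetical helper: estimated USD cost of one text-only request.
# Assumption: cached_tokens is a subset of input_tokens.
def estimate_cost(entry, input_tokens:, output_tokens:, cached_tokens: 0)
  input_rate  = entry.fetch("input_per_1m")
  output_rate = entry.fetch("output_per_1m")
  # Optional field: fall back to the normal input rate when it is absent.
  cache_rate  = entry.fetch("cache_read_per_1m", input_rate)

  uncached_tokens = input_tokens - cached_tokens
  total = uncached_tokens * input_rate +
          cached_tokens * cache_rate +
          output_tokens * output_rate
  total / TOKENS_PER_PRICE_UNIT
end

entry = JSON.parse(ENTRY_JSON).fetch("mistral-nemo")
cost = estimate_cost(entry, input_tokens: 10_000, output_tokens: 2_000, cached_tokens: 4_000)
puts format("%.6f USD", cost) # 6,000 uncached + 4,000 cached input, 2,000 output
```

With these numbers the three terms are 120, 20, and 60 (in units of USD per million), so the estimate comes to about 0.0002 USD.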
+
+---
+
+## Multimodality and Per-Token-Type Pricing
+
+Many modern models accept multiple token types, each with its own price.
+
+### Examples
+
+**GPT-4o** accepts text and images:
+- Text input tokens: `$2.5/M`
+- Image input tokens: billed separately at ~`$1.275/M` (OpenAI counts image tiles by their own scheme)
+- Cache-read tokens: `$1.25/M` → stored in `cache_read_per_1m`
+
+**Gemini 2.5 Flash** accepts text, images, audio, and video:
+- Text tokens: `$0.09/M` input / `$0.71/M` output
+- Audio tokens: `$0.07/M` input (different rate)
+- Images and video: separate rates per provider scheme
+
+**Claude** supports prompt caching:
+- Standard input tokens: `$3.0/M`
+- Cache-read tokens: `$0.3/M` (10× cheaper) → stored in `cache_read_per_1m`
+- Cache-write tokens: `$3.75/M` (more expensive; not stored in this format)
+
+### Why only `input_per_1m` for input
+
+The format stores the **text token rate** as the primary input price. This covers ~95% of real-world usage since most LLM requests send text.
+
+Image, audio, and video token rates are **not unified** in this format because each provider uses different billing units (tiles, seconds, pixels). For accurate cost calculation on multimodal requests, consult the provider's API documentation directly.
+
+---
+
+## Price Differences Between Sources
+
+The same model may have different prices across files:
+
+```
+gemini-2-5-flash:
+  pricepertoken → $0.30/M input  (Google's official rate)
+  modelsdev     → $0.09/M input  (different tier or stale data)
+  openrouter    → $0.30/M input  (OpenRouter routing price)
+```
+
+Common reasons for differences:
+- Providers offer multiple pricing tiers (Batch API, volume discounts)
+- Sources update at different frequencies and may carry stale data
+- OpenRouter adds a small markup over the underlying provider price
+
+`PriceResolver.max_cost` selects the **highest** price across all sources as a conservative cost estimate; `PriceResolver.min_cost` selects the lowest.
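The highest/lowest selection described for `PriceResolver` can be sketched in Ruby with the `gemini-2-5-flash` figures from the README. The real signatures of `PriceResolver.max_cost` and `PriceResolver.min_cost` are not visible in this diff, so the helpers below (`prices_for`, `highest_price`, `lowest_price`) are hypothetical names that illustrate the idea only.

```ruby
# Input prices (USD per 1M tokens) copied from the README example; the nested
# hash layout mirrors the JSON files but is an assumption of this sketch.
SOURCES = {
  "pricepertoken" => { "gemini-2-5-flash" => { "input_per_1m" => 0.30 } },
  "modelsdev"     => { "gemini-2-5-flash" => { "input_per_1m" => 0.09 } },
  "openrouter"    => { "gemini-2-5-flash" => { "input_per_1m" => 0.30 } }
}.freeze

# Hypothetical helper: [source_name, price] pairs for every source that
# lists the key and carries the requested field.
def prices_for(sources, key, field)
  sources.filter_map do |source_name, entries|
    price = entries.dig(key, field)
    [source_name, price] if price
  end
end

# Conservative estimate: the most expensive listing, or nil if none exists.
def highest_price(sources, key, field)
  prices_for(sources, key, field).max_by { |_source, price| price }
end

# Optimistic estimate: the cheapest listing, or nil if none exists.
def lowest_price(sources, key, field)
  prices_for(sources, key, field).min_by { |_source, price| price }
end

high = highest_price(SOURCES, "gemini-2-5-flash", "input_per_1m")
low  = lowest_price(SOURCES, "gemini-2-5-flash", "input_per_1m")
puts "highest: #{high.inspect}"
puts "lowest:  #{low.inspect}"
```

Returning the source name together with the price keeps it visible which file produced the estimate, which helps when one source looks stale.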
+
+---
+
+## Updating Data Manually
+
+```bash
+ruby bin/fetch_pricepertoken   # refresh pricepertoken.json
+ruby bin/fetch_modelsdev       # refresh modelsdev.json
+ruby bin/fetch_openrouter      # refresh openrouter.json
+```