@zhangferry-dev/tokendash 1.4.0 → 1.4.2

@@ -1,456 +0,0 @@
1
- <!DOCTYPE html>
2
- <html lang="en">
3
- <head>
4
- <meta charset="UTF-8">
5
- <meta name="viewport" content="width=device-width, initial-scale=1.0">
6
- <style>
7
- * { margin: 0; padding: 0; box-sizing: border-box; }
8
- body {
9
- font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
10
- background: #fafaf9;
11
- padding: 40px 48px;
12
- width: 900px;
13
- }
14
-
15
- .title {
16
- font-size: 22px;
17
- font-weight: 800;
18
- color: #1c1917;
19
- margin-bottom: 8px;
20
- }
21
- .subtitle {
22
- font-size: 14px;
23
- color: #78716c;
24
- margin-bottom: 36px;
25
- line-height: 1.6;
26
- }
27
-
28
- .turn {
29
- margin-bottom: 28px;
30
- position: relative;
31
- }
32
-
33
- .turn-label {
34
- display: inline-flex;
35
- align-items: center;
36
- gap: 8px;
37
- margin-bottom: 12px;
38
- }
39
- .turn-badge {
40
- background: #1c1917;
41
- color: white;
42
- font-size: 12px;
43
- font-weight: 700;
44
- padding: 3px 10px;
45
- border-radius: 6px;
46
- letter-spacing: 0.03em;
47
- }
48
- .turn-desc {
49
- font-size: 13px;
50
- color: #57534e;
51
- font-weight: 500;
52
- }
53
-
54
- .blocks {
55
- display: flex;
56
- border-radius: 10px;
57
- overflow: hidden;
58
- box-shadow: 0 1px 3px rgba(0,0,0,0.06);
59
- height: 52px;
60
- }
61
-
62
- .block {
63
- display: flex;
64
- align-items: center;
65
- justify-content: center;
66
- font-size: 12px;
67
- font-weight: 600;
68
- letter-spacing: 0.02em;
69
- position: relative;
70
- transition: all 0.2s;
71
- border-right: 1px solid rgba(255,255,255,0.3);
72
- }
73
- .block:last-child { border-right: none; }
74
-
75
- /* Colors */
76
- .block.system {
77
- background: #6366f1;
78
- color: white;
79
- }
80
- .block.history {
81
- background: #818cf8;
82
- color: white;
83
- }
84
- .block.new-input {
85
- background: #10b981;
86
- color: white;
87
- }
88
- .block.output {
89
- background: #f59e0b;
90
- color: white;
91
- }
92
- .block.cached {
93
- background: linear-gradient(135deg, #6366f1 0%, #4f46e5 100%);
94
- color: white;
95
- position: relative;
96
- }
97
- .block.cached::after {
98
- content: '';
99
- position: absolute;
100
- top: 3px;
101
- right: 3px;
102
- width: 16px;
103
- height: 16px;
104
- background: rgba(255,255,255,0.25);
105
- border-radius: 4px;
106
- display: flex;
107
- align-items: center;
108
- justify-content: center;
109
- font-size: 10px;
110
- }
111
-
112
- /* Legend */
113
- .legend {
114
- display: flex;
115
- gap: 20px;
116
- margin-top: 36px;
117
- margin-bottom: 32px;
118
- padding: 16px 20px;
119
- background: white;
120
- border-radius: 10px;
121
- border: 1px solid #e7e5e4;
122
- }
123
- .legend-item {
124
- display: flex;
125
- align-items: center;
126
- gap: 8px;
127
- font-size: 12px;
128
- font-weight: 500;
129
- color: #44403c;
130
- }
131
- .legend-dot {
132
- width: 14px;
133
- height: 14px;
134
- border-radius: 4px;
135
- flex-shrink: 0;
136
- }
137
-
138
- /* Annotation */
139
- .annotation {
140
- display: flex;
141
- gap: 16px;
142
- margin-top: 24px;
143
- }
144
- .anno-card {
145
- flex: 1;
146
- background: white;
147
- border: 1px solid #e7e5e4;
148
- border-radius: 10px;
149
- padding: 16px 18px;
150
- }
151
- .anno-card .anno-title {
152
- font-size: 13px;
153
- font-weight: 700;
154
- color: #1c1917;
155
- margin-bottom: 6px;
156
- }
157
- .anno-card .anno-text {
158
- font-size: 12px;
159
- color: #78716c;
160
- line-height: 1.7;
161
- }
162
- .anno-card.highlight {
163
- border-color: #10b981;
164
- background: #ecfdf5;
165
- }
166
- .anno-card.highlight .anno-title { color: #059669; }
167
-
168
- /* Cost comparison */
169
- .cost-row {
170
- display: flex;
171
- align-items: center;
172
- gap: 12px;
173
- margin-top: 6px;
174
- }
175
- .cost-tag {
176
- display: inline-flex;
177
- align-items: center;
178
- gap: 4px;
179
- font-size: 12px;
180
- font-weight: 700;
181
- padding: 2px 8px;
182
- border-radius: 5px;
183
- font-family: 'SF Mono', 'Fira Code', monospace;
184
- }
185
- .cost-tag.expensive {
186
- background: #fef2f2;
187
- color: #dc2626;
188
- }
189
- .cost-tag.cheap {
190
- background: #ecfdf5;
191
- color: #059669;
192
- }
193
- .cost-tag.free {
194
- background: #f0f9ff;
195
- color: #0284c7;
196
- }
197
-
198
- .arrow-container {
199
- display: flex;
200
- align-items: center;
201
- justify-content: center;
202
- margin: 4px 0;
203
- position: relative;
204
- }
205
- .arrow-down {
206
- width: 2px;
207
- height: 20px;
208
- background: #d6d3d1;
209
- position: relative;
210
- }
211
- .arrow-down::after {
212
- content: '';
213
- position: absolute;
214
- bottom: -3px;
215
- left: 50%;
216
- transform: translateX(-50%);
217
- border-left: 5px solid transparent;
218
- border-right: 5px solid transparent;
219
- border-top: 6px solid #d6d3d1;
220
- }
221
- .arrow-label {
222
- position: absolute;
223
- right: calc(50% + 16px);
224
- font-size: 11px;
225
- color: #a8a29e;
226
- font-weight: 500;
227
- }
228
-
229
- /* Hit rate bar */
230
- .hit-rate-section {
231
- margin-top: 28px;
232
- background: white;
233
- border: 1px solid #e7e5e4;
234
- border-radius: 10px;
235
- padding: 20px;
236
- }
237
- .hit-rate-title {
238
- font-size: 14px;
239
- font-weight: 700;
240
- color: #1c1917;
241
- margin-bottom: 16px;
242
- }
243
- .hit-row {
244
- display: flex;
245
- align-items: center;
246
- gap: 12px;
247
- margin-bottom: 10px;
248
- }
249
- .hit-label {
250
- font-size: 12px;
251
- font-weight: 600;
252
- color: #57534e;
253
- width: 60px;
254
- text-align: right;
255
- flex-shrink: 0;
256
- }
257
- .hit-bar-container {
258
- flex: 1;
259
- height: 22px;
260
- border-radius: 6px;
261
- overflow: hidden;
262
- display: flex;
263
- background: #f5f5f4;
264
- }
265
- .hit-bar-cached {
266
- background: #6366f1;
267
- height: 100%;
268
- display: flex;
269
- align-items: center;
270
- justify-content: center;
271
- color: white;
272
- font-size: 10px;
273
- font-weight: 700;
274
- transition: width 0.3s;
275
- }
276
- .hit-bar-new {
277
- background: #10b981;
278
- height: 100%;
279
- display: flex;
280
- align-items: center;
281
- justify-content: center;
282
- color: white;
283
- font-size: 10px;
284
- font-weight: 700;
285
- transition: width 0.3s;
286
- }
287
- .hit-value {
288
- font-size: 12px;
289
- font-weight: 700;
290
- color: #6366f1;
291
- width: 40px;
292
- flex-shrink: 0;
293
- }
294
-
295
- /* Bottom note */
296
- .note {
297
- margin-top: 24px;
298
- padding: 14px 18px;
299
- background: #fffbeb;
300
- border: 1px solid #fde68a;
301
- border-radius: 10px;
302
- font-size: 12px;
303
- color: #92400e;
304
- line-height: 1.7;
305
- }
306
- .note strong { color: #78350f; }
307
- </style>
308
- </head>
309
- <body>
310
-
311
- <div class="title">How Prompt Caching Works</div>
312
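- <div class="subtitle">In an ongoing Claude Code conversation, the system automatically caches the repeated context prefix. As the conversation gains turns, a growing share of the input hits the cache and the cost per input token falls.</div>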
313
-
314
- <!-- Turn 1 -->
315
- <div class="turn">
316
- <div class="turn-label">
317
- <span class="turn-badge">Turn 1</span>
318
- <span class="turn-desc">First turn: every token is new, nothing is cached</span>
319
- </div>
320
- <div class="blocks">
321
- <div class="block system" style="width: 25%;">System Prompt</div>
322
- <div class="block new-input" style="width: 35%;">User Input</div>
323
- <div class="block output" style="width: 40%;">Model Output</div>
324
- </div>
325
- <div class="cost-row">
326
- <span class="cost-tag expensive">Full price: $3.00/M</span>
327
- <span style="font-size: 11px; color: #a8a29e;">All input is billed at the normal rate</span>
328
- </div>
329
- </div>
330
-
331
- <div class="arrow-container">
332
- <div class="arrow-down"></div>
333
- <span class="arrow-label">Context written to cache</span>
334
- </div>
335
-
336
- <!-- Turn 2 -->
337
- <div class="turn">
338
- <div class="turn-label">
339
- <span class="turn-badge">Turn 2</span>
340
- <span class="turn-desc">Second turn: the system prompt and the Turn 1 exchange hit the cache</span>
341
- </div>
342
- <div class="blocks">
343
- <div class="block cached" style="width: 22%;">System Prompt ✓</div>
344
- <div class="block cached" style="width: 28%;">Turn 1 History ✓</div>
345
- <div class="block new-input" style="width: 18%;">New Input</div>
346
- <div class="block output" style="width: 32%;">Model Output</div>
347
- </div>
348
- <div class="cost-row">
349
- <span class="cost-tag cheap">Cached: $0.30/M</span>
350
- <span class="cost-tag free">New input: $3.00/M</span>
351
- <span style="font-size: 11px; color: #a8a29e;">50% of the input is read from cache, so the cost drops sharply</span>
352
- </div>
353
- </div>
354
-
355
- <div class="arrow-container">
356
- <div class="arrow-down"></div>
357
- <span class="arrow-label">More context written to cache</span>
358
- </div>
359
-
360
- <!-- Turn 3 -->
361
- <div class="turn">
362
- <div class="turn-label">
363
- <span class="turn-badge">Turn 3</span>
364
- <span class="turn-desc">Third turn: most of the context is already cached</span>
365
- </div>
366
- <div class="blocks">
367
- <div class="block cached" style="width: 18%;">System Prompt ✓</div>
368
- <div class="block cached" style="width: 22%;">Turn 1 ✓</div>
369
- <div class="block cached" style="width: 20%;">Turn 2 ✓</div>
370
- <div class="block new-input" style="width: 12%;">New Input</div>
371
- <div class="block output" style="width: 28%;">Model Output</div>
372
- </div>
373
- <div class="cost-row">
374
- <span class="cost-tag cheap">Cached: $0.30/M</span>
375
- <span class="cost-tag free">New input: $3.00/M</span>
376
- <span style="font-size: 11px; color: #a8a29e;">Only 16% is billed at the normal rate; the rest costs 1/10 as much</span>
377
- </div>
378
- </div>
379
-
380
- <!-- Legend -->
381
- <div class="legend">
382
- <div class="legend-item">
383
- <div class="legend-dot" style="background: #6366f1;"></div>
384
- Cache hit (1/10 price)
385
- </div>
386
- <div class="legend-item">
387
- <div class="legend-dot" style="background: #10b981;"></div>
388
- New input (normal price)
389
- </div>
390
- <div class="legend-item">
391
- <div class="legend-dot" style="background: #f59e0b;"></div>
392
- Model output
393
- </div>
394
- </div>
395
-
396
- <!-- Cache Hit Rate -->
397
- <div class="hit-rate-section">
398
- <div class="hit-rate-title">Cache hit rate by turn</div>
399
- <div class="hit-row">
400
- <span class="hit-label">Turn 1</span>
401
- <div class="hit-bar-container">
402
- <div class="hit-bar-cached" style="width: 0%;"></div>
403
- <div class="hit-bar-new" style="width: 100%;">100%</div>
404
- </div>
405
- <span class="hit-value" style="color: #dc2626;">0%</span>
406
- </div>
407
- <div class="hit-row">
408
- <span class="hit-label">Turn 2</span>
409
- <div class="hit-bar-container">
410
- <div class="hit-bar-cached" style="width: 50%;">Cached</div>
411
- <div class="hit-bar-new" style="width: 50%;">New</div>
412
- </div>
413
- <span class="hit-value">50%</span>
414
- </div>
415
- <div class="hit-row">
416
- <span class="hit-label">Turn 3</span>
417
- <div class="hit-bar-container">
418
- <div class="hit-bar-cached" style="width: 75%;">Cache hit</div>
419
- <div class="hit-bar-new" style="width: 25%;">New</div>
420
- </div>
421
- <span class="hit-value">75%</span>
422
- </div>
423
- <div class="hit-row">
424
- <span class="hit-label">Turn 5+</span>
425
- <div class="hit-bar-container">
426
- <div class="hit-bar-cached" style="width: 90%;">Cache hit: almost everything is read from cache</div>
427
- <div class="hit-bar-new" style="width: 10%;"></div>
428
- </div>
429
- <span class="hit-value">90%+</span>
430
- </div>
431
- </div>
432
-
433
- <!-- Key insights -->
434
- <div class="annotation">
435
- <div class="anno-card">
436
- <div class="anno-title">Which requests hit the cache?</div>
437
- <div class="anno-text">
438
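- A request hits the cache whenever its context shares the <strong>same prefix</strong> as an earlier request. Within one session, the System Prompt and the earlier conversation history form a fixed prefix, and each turn only appends new content. That is why the cache hit rate rises as the conversation grows longer.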
439
- </div>
440
- </div>
441
- <div class="anno-card highlight">
442
- <div class="anno-title">How much does it actually save?</div>
443
- <div class="anno-text">
444
- Take Sonnet as an example: normal input costs $3.00/M, cached input $0.30/M.<br>
445
- Suppose a 10-turn conversation accumulates 1 million input tokens, 800,000 of which hit the cache:<br>
446
- <strong>Savings = 800,000 × ($3.00 - $0.30) / 1,000,000 = $2.16</strong>
447
- </div>
448
- </div>
449
- </div>
450
-
451
- <div class="note">
452
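- <strong>Why isn't cached input free?</strong> A cache hit only skips recomputing the cached prefix. New tokens must still attend to the cached tokens in the model's attention layers during inference, and the KV cache has to stay resident in GPU memory. That is why Anthropic and OpenAI discount cached input instead of making it free; the figure used on this page is <strong>1/10</strong> of the normal price.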
453
- </div>
454
-
455
- </body>
456
- </html>
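
The hit-rate chart in the removed page can be reproduced with a small model. The sketch below is not part of the package: `hit_rates` and its parameters are hypothetical names, and it assumes, as the page does, that the system prompt plus all earlier turns are served from cache while only the newest user input is billed as fresh.

```python
from typing import List


def hit_rates(system_tokens: int, user_tokens: int,
              output_tokens: int, turns: int) -> List[float]:
    """Fraction of each turn's input tokens that is read from cache.

    Assumption (hypothetical model): every turn has the same user-input
    and model-output size, and the whole prefix from the previous request
    is cached.
    """
    rates = []
    history = 0  # tokens from earlier turns (user input + model output)
    for turn in range(1, turns + 1):
        prefix = system_tokens + history       # repeated context
        total_input = prefix + user_tokens     # what the request sends
        cached = prefix if turn > 1 else 0     # nothing is cached on turn 1
        rates.append(cached / total_input)
        history += user_tokens + output_tokens
    return rates


for turn, rate in enumerate(hit_rates(1000, 1000, 1000, 5), start=1):
    print(f"Turn {turn}: {rate:.0%} of input read from cache")
```

With equal block sizes of 1,000 tokens, the rate climbs from 0% on the first turn to 90% on the fifth. The exact percentages depend on the assumed block sizes, which is why they differ from the illustrative widths drawn in the page.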
Binary file
Binary file
Binary file
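
The savings figure quoted in the removed page ($2.16 on 1 million input tokens with 800,000 cache hits) can be checked with a few lines of arithmetic. This is a minimal sketch: `Rates`, `input_cost`, and `savings` are hypothetical names, the prices are the Sonnet figures the page quotes, and any surcharge a provider may apply when writing to the cache is ignored.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class Rates:
    """Input prices in USD per million tokens, as quoted in the page."""
    normal: float = 3.00  # uncached input
    cached: float = 0.30  # input read from cache


def input_cost(total_tokens: int, cached_tokens: int,
               rates: Rates = Rates()) -> float:
    """Input-side cost when `cached_tokens` of `total_tokens` hit the cache."""
    fresh_tokens = total_tokens - cached_tokens
    return (fresh_tokens * rates.normal
            + cached_tokens * rates.cached) / TOKENS_PER_MILLION


def savings(cached_tokens: int, rates: Rates = Rates()) -> float:
    """Money saved relative to billing every input token at the normal rate."""
    return cached_tokens * (rates.normal - rates.cached) / TOKENS_PER_MILLION


total, cached = 1_000_000, 800_000
print(f"without cache: ${input_cost(total, 0):.2f}")
print(f"with cache:    ${input_cost(total, cached):.2f}")
print(f"saved:         ${savings(cached):.2f}")
```

The saving equals the difference between the two costs, so either function can be used to cross-check the other.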