Understanding tokens, optimizing context usage, and managing AI API costs at scale
Different models use different tokenizers, so the same text can yield a different token count depending on the provider.
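Exact counts require each model's own tokenizer (e.g. OpenAI's `tiktoken` library), but a rough heuristic is useful for back-of-envelope planning. A minimal sketch, assuming the common rule of thumb of ~4 characters per token for English text:

```python
# Rough token estimate. A widely used rule of thumb for English text
# is ~4 characters (about 0.75 words) per token. Real counts require
# the model's own tokenizer, and they differ across providers.
def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)

article = "word " * 5000       # a ~5,000-word article (25,000 characters)
print(estimate_tokens(article))  # → 6250 by this heuristic
```

This is only an estimate; for billing-accurate numbers, tokenize with the specific model's tokenizer.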
| Context Window | Roughly What It Holds |
|---|---|
| 8K | ~10-page article |
| 128K | ~200-page book |
| 200K (Claude) | ~300 pages |
| 1M (Gemini) | The complete Harry Potter series |
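The page figures above can be sanity-checked with simple arithmetic. A sketch, assuming ~500 words per page and ~1.3 tokens per word (both are assumptions, not provider-published numbers):

```python
# Back-of-envelope: pages of English text per context window,
# assuming ~500 words/page and ~1.3 tokens/word (illustrative values).
TOKENS_PER_PAGE = int(500 * 1.3)  # ≈ 650 tokens per page

for window in (8_000, 128_000, 200_000, 1_000_000):
    print(f"{window:>9,} tokens ≈ {window // TOKENS_PER_PAGE:,} pages")
```

This yields roughly 12, 196, 307, and 1,538 pages, in line with the table's estimates.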
At scale, per-token prices compound quickly. The script below compares daily and monthly spend for 100,000 requests per day, each with 2,000 input and 500 output tokens, at per-million-token rates:

```python
# Compare daily and monthly API spend across models.
# Rates are USD per million tokens (input, output).
requests = 100_000       # requests per day
inp, out = 2_000, 500    # input / output tokens per request

models = [
    ("Gemini Flash", 0.15, 0.60),
    ("GPT-4o", 2.50, 10.00),
    ("Claude Haiku", 0.80, 4.00),
]

for name, in_rate, out_rate in models:
    daily = (requests * inp / 1e6 * in_rate) + (requests * out / 1e6 * out_rate)
    print(f"{name:20s} | Daily: ${daily:.2f} | Monthly: ${daily * 30:.2f}")
```

At these rates, GPT-4o works out to $1,000/day ($30,000/month), versus $60/day for Gemini Flash and $360/day for Claude Haiku.