Methodology

How the calculator gets its numbers.

The savings calculator on the homepage is a live estimate, not a guess. For each framework you can pick, we publish a workload profile — what a typical workload on that stack looks like — and we price that profile against the same model index our product runs on. When the prices change upstream, the calculator changes with them.

The formula

For each selected framework, we compute a per-call baseline cost on the primary model and a per-call optimized cost across four disjoint slices of call volume:

cacheable share: calls served as exact semantic-cache hits, so their cost goes to zero.
downshiftable share: calls re-priced at a cheaper equivalent model that does the same job.
compressible share: calls sent with prompt-compressed inputs, which saves a percentage of input tokens at the primary model's rate.
residual: everything else, left unchanged at the baseline cost.
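
The four slices above can be sketched as a single per-call cost function. This is a minimal illustration under stated assumptions, not the code in savings-math.ts: the type names, the function names, the per-million-token price fields, and the choice to leave output tokens untouched by compression are all hypothetical.

```typescript
// Hypothetical shapes for illustration; prices are USD per million tokens.
interface ModelPrice {
  inputPerMTok: number;
  outputPerMTok: number;
}

interface WorkloadProfile {
  cacheableShare: number;     // fraction of calls served from the cache
  downshiftableShare: number; // fraction re-priced at the cheaper model
  compressibleShare: number;  // fraction sent with compressed prompts
  inputReduction: number;     // fraction of input tokens saved by compression
  inputTokens: number;        // average input tokens per call
  outputTokens: number;       // average output tokens per call
}

// Cost of one call at a given model price.
function callCost(price: ModelPrice, inputTokens: number, outputTokens: number): number {
  return (inputTokens * price.inputPerMTok + outputTokens * price.outputPerMTok) / 1_000_000;
}

// Baseline and optimized per-call cost for one framework profile.
function perCallCosts(
  profile: WorkloadProfile,
  primary: ModelPrice,
  cheaper: ModelPrice,
): { baseline: number; optimized: number } {
  const rawResidual =
    1 - profile.cacheableShare - profile.downshiftableShare - profile.compressibleShare;
  // The slices are disjoint, so they cannot cover more than 100% of calls.
  if (rawResidual < -1e-9) {
    throw new Error("shares must sum to at most 1");
  }
  const residualShare = Math.max(0, rawResidual);

  const baseline = callCost(primary, profile.inputTokens, profile.outputTokens);
  const downshifted = callCost(cheaper, profile.inputTokens, profile.outputTokens);
  // Assumption: compression shrinks input tokens only; output is unchanged.
  const compressed = callCost(
    primary,
    profile.inputTokens * (1 - profile.inputReduction),
    profile.outputTokens,
  );

  // Cacheable calls cost nothing, so they add no term to the sum.
  const optimized =
    profile.downshiftableShare * downshifted +
    profile.compressibleShare * compressed +
    residualShare * baseline;

  return { baseline, optimized };
}
```

Because each slice is weighted by its share of call volume, the optimized figure is an expected cost per call, which keeps it directly comparable to the baseline.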

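The recoverable rate per framework is (baseline − optimized) / baseline. The blended rate across selected frameworks is the simple, unweighted average of their recoverable rates, capped at 40% so the figure stays honest. The cap applies to the blended figure only, so an individual framework's rate can exceed it.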
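
As a rough sketch of that arithmetic, assuming rates are held as fractions between 0 and 1: the function names, the `BLENDED_CAP` constant, and the choice to return 0 when no framework is selected are hypothetical and are not taken from savings-math.ts.

```typescript
// Cap on the blended rate, as stated in the text (40%).
const BLENDED_CAP = 0.4;

// Recoverable rate for one framework: (baseline - optimized) / baseline.
function recoverableRate(baseline: number, optimized: number): number {
  if (baseline <= 0) {
    throw new Error("baseline cost must be positive");
  }
  return (baseline - optimized) / baseline;
}

// Blended rate: unweighted mean of the per-framework rates, then capped.
function blendedRate(rates: number[]): number {
  // Assumption: with nothing selected there is nothing to recover.
  if (rates.length === 0) {
    return 0;
  }
  const mean = rates.reduce((sum, rate) => sum + rate, 0) / rates.length;
  return Math.min(mean, BLENDED_CAP);
}

// Illustrative inputs only: two frameworks whose mean exceeds the cap,
// and two whose mean stays below it.
const cappedExample = blendedRate([0.46, 0.43]);
const uncappedExample = blendedRate([0.28, 0.29]);
```

In this sketch the cap is applied after averaging, so one high-rate framework can still lift the blend up to the ceiling but never past it.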

Per-stack profiles

Models below resolve against the live @whatmodel/data knowledge base (617 models indexed today). Shares are directional estimates grounded in how each framework typically runs; we revise them as we see more real-workload data, and we record every revision in the changelog.

OpenClaw

28%

primary model: anthropic/claude-opus-4.7
cacheable share: 8%
downshiftable share: 45%
compressible share: 15% (−30% input)
avg tokens / call: 8,000 in / 1,500 out

Hermes

46%

primary model: anthropic/claude-sonnet-4.6
cacheable share: 10%
downshiftable share: 50%
compressible share: 18% (−35% input)
avg tokens / call: 6,000 in / 1,200 out

CrewAI

43%

primary model: openai/gpt-4.1
cacheable share: 7%
downshiftable share: 40%
compressible share: 20% (−35% input)
avg tokens / call: 4,000 in / 800 out

AutoGPT

39%

primary model: openai/gpt-4.1
cacheable share: 5%
downshiftable share: 38%
compressible share: 18% (−30% input)
avg tokens / call: 5,000 in / 900 out

LangGraph

39%

primary model: anthropic/claude-sonnet-4.6
cacheable share: 10%
downshiftable share: 40%
compressible share: 15% (−32% input)
avg tokens / call: 5,000 in / 1,100 out

Custom

29%

primary model: anthropic/claude-sonnet-4.6
cacheable share: 6%
downshiftable share: 32%
compressible share: 15% (−28% input)
avg tokens / call: 4,500 in / 900 out

What this is not

A workload profile is a stand-in, not a measurement of your stack. Your real recoverable rate depends on your actual model mix, prompt shapes, cache locality, and how much of your traffic is genuinely downshiftable. The calculator is a directional estimate. The number you see in your own workspace, once you connect, is the truth.

Source: apps/web/lib/landing/savings-math.ts (open source repo). Model prices come from @whatmodel/data, our open model index, rebuilt from upstream provider rates.