Break-Even Pay-Per-Token API vs Dedicated GPU

#451 · Machine Learning · Medium

Problem

Determine the break-even point, in tokens per hour, between a pay-per-token API and a dedicated GPU. Given the API's cost per token, the GPU's fixed hourly cost, and the GPU's throughput in tokens per second, compute the number of tokens per hour at which both options cost the same.

Solution

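def break_even_tokens(cost_per_token: float, gpu_hourly_cost: float, gpu_tokens_per_sec: float) -> float:
    # Tokens per hour at which API spend equals the fixed hourly GPU cost.
    # gpu_tokens_per_sec is accepted but does not affect the result.
    return gpu_hourly_cost / cost_per_token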

Explanation

  1. The API cost for N tokens is N * cost_per_token.
  2. The GPU cost for one hour is the fixed gpu_hourly_cost, regardless of how many tokens are processed (up to its throughput limit).
  3. At the break-even point the two costs are equal: N * cost_per_token = gpu_hourly_cost, so N = gpu_hourly_cost / cost_per_token.
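  4. The gpu_tokens_per_sec parameter does not enter the formula. It only tells you whether the break-even volume is achievable: the GPU can produce at most gpu_tokens_per_sec * 3600 tokens per hour, so if N exceeds that limit, a single GPU cannot reach break-even within the hour.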
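
The capacity check described in step 4 can be sketched as follows. This is an illustrative helper, not part of the reference solution: the name break_even_with_capacity, the returned tuple, and the prices and throughputs in the usage lines are all assumptions chosen for the example.

```python
def break_even_with_capacity(cost_per_token: float,
                             gpu_hourly_cost: float,
                             gpu_tokens_per_sec: float):
    """Hypothetical helper: break-even volume plus a feasibility flag.

    Returns (break_even_tokens_per_hour, gpu_capacity_tokens_per_hour, achievable).
    """
    if cost_per_token <= 0 or gpu_tokens_per_sec <= 0:
        raise ValueError("cost_per_token and gpu_tokens_per_sec must be positive")

    # Steps 1-3: N * cost_per_token = gpu_hourly_cost  =>  N = gpu_hourly_cost / cost_per_token
    break_even = gpu_hourly_cost / cost_per_token

    # Step 4: the most tokens the GPU can produce in one hour
    capacity = gpu_tokens_per_sec * 3600

    # Break-even is reachable only if it fits within the GPU's hourly capacity
    return break_even, capacity, break_even <= capacity


# Illustrative numbers (assumed): $0.00002 per token, $2.00 per GPU-hour
fast = break_even_with_capacity(0.00002, 2.0, 50)   # capacity 180,000 tokens/hour
slow = break_even_with_capacity(0.00002, 2.0, 20)   # capacity 72,000 tokens/hour
print(fast[2], slow[2])
```

With these assumed numbers the break-even volume is about 100,000 tokens per hour, which the 50 tokens/sec GPU can reach and the 20 tokens/sec GPU cannot.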

Complexity

  • Time: O(1)
  • Space: O(1)