#451 · Machine Learning · Medium
Determine the break-even point (in tokens) for using a pay-per-token API versus a dedicated GPU. Given the cost per token for the API, the fixed hourly cost of a GPU, and the GPU's throughput in tokens per second, compute the number of tokens per hour at which both options cost the same.
def break_even_tokens(cost_per_token: float, gpu_hourly_cost: float, gpu_tokens_per_sec: float) -> float:
    return gpu_hourly_cost / cost_per_token

The API cost for N tokens is N * cost_per_token.
The GPU costs gpu_hourly_cost per hour, regardless of how many tokens are processed (up to its throughput limit).
Setting the two costs equal gives N * cost_per_token = gpu_hourly_cost, so N = gpu_hourly_cost / cost_per_token.
The gpu_tokens_per_sec parameter does not enter the formula, but it can be used to verify that the break-even volume is actually achievable: the GPU can produce at most gpu_tokens_per_sec * 3600 tokens per hour.
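The achievability check mentioned above can be sketched as follows. This is a minimal illustration rather than part of the reference solution: the helper name `is_break_even_achievable` and the sample prices are assumptions chosen for the example.

```python
def break_even_tokens(cost_per_token: float, gpu_hourly_cost: float,
                      gpu_tokens_per_sec: float) -> float:
    """Tokens per hour at which the API bill equals the GPU's fixed hourly cost."""
    return gpu_hourly_cost / cost_per_token


def is_break_even_achievable(cost_per_token: float, gpu_hourly_cost: float,
                             gpu_tokens_per_sec: float) -> bool:
    """Hypothetical helper: can the GPU actually reach the break-even volume?"""
    # Upper bound on what the GPU can produce in one hour.
    max_tokens_per_hour = gpu_tokens_per_sec * 3600
    needed = break_even_tokens(cost_per_token, gpu_hourly_cost, gpu_tokens_per_sec)
    return needed <= max_tokens_per_hour


if __name__ == "__main__":
    # Assumed sample values: $0.00002 per token, $2.00 per GPU-hour.
    for tokens_per_sec in (50.0, 20.0):
        n = break_even_tokens(0.00002, 2.0, tokens_per_sec)
        ok = is_break_even_achievable(0.00002, 2.0, tokens_per_sec)
        print(f"{tokens_per_sec} tok/s: break-even ~ {n:.0f} tokens/hour, achievable: {ok}")
```

With these assumed prices the break-even volume is about 100,000 tokens per hour. A GPU running at 50 tokens per second (180,000 tokens per hour) can reach it, while one running at 20 tokens per second (72,000 tokens per hour) cannot, so at that throughput the API is cheaper for any workload the GPU could serve.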