Cheapest LLM for a chatbot
For an always-on chatbot, cost is dominated by the system prompt and conversation history replayed on every turn — so input price matters most. These are the cheapest generally-available models for a typical chat workload.
Cheapest models for chatbots
Monthly cost for a busy chatbot handling ~10M input and ~3M output tokens a month. Sorted cheapest first.
| # | Model | Provider | Context | Input $/M | Output $/M | Monthly cost |
|---|---|---|---|---|---|---|
| 1 | Llama 3.1 8B Instruct | Meta | 128K | $0.02 | $0.03 | $0.29 |
| 2 | Ministral 3 3B | Mistral | — | $0.04 | $0.04 | $0.52 |
| 3 | Amazon Nova Micro | Amazon | 128K | $0.035 | $0.14 | $0.77 |
| 4 | Command R7B | Cohere | 128K | $0.037 | $0.15 | $0.82 |
| 5 | Amazon Nova Lite | Amazon | 300K | $0.06 | $0.24 | $1.32 |
| 6 | Qwen-Flash | Alibaba | 1M | $0.05 | $0.40 | $1.70 |
| 7 | Llama 4 Scout (17B-16E Instruct) | Meta | 10M | $0.10 | $0.30 | $1.90 |
| 8 | Ministral 3 8B | Mistral | 256K | $0.15 | $0.15 | $1.95 |
| 9 | Llama 3.3 70B Instruct | Meta | 128K | $0.10 | $0.32 | $1.96 |
| 10 | Qwen3.5-Flash | Alibaba | 1M | $0.10 | $0.40 | $2.20 |
| 11 | Ministral 3 14B | Mistral | — | $0.20 | $0.20 | $2.60 |
| 12 | Llama 4 Maverick (17B-128E Instruct) | Meta | 1M | $0.15 | $0.60 | $3.30 |
Estimate only; excludes prompt caching, batch discounts and free tiers. Different volumes change the ranking, so run your own numbers. Prices verified against official docs; catalog updated 2026-06-28.
Chat traffic skews input-heavy (~3:1), because the system prompt and prior turns are re-sent each message while replies stay short. We weight a 10M-in / 3M-out monthly workload accordingly. A small, fast model is usually enough; reach for a flagship only when answer quality demands it.
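To see why replayed context pushes the bill toward input tokens, here is a minimal sketch. The conversation shape is hypothetical (a 500-token system prompt, 50-token user messages, 150-token replies), it assumes no prompt caching, and `conversation_tokens` is an illustrative helper, not part of any provider SDK.

```python
def conversation_tokens(turns, system_tokens=500, user_tokens=50, reply_tokens=150):
    """Return (input_tokens, output_tokens) billed over one conversation.

    Assumes every turn re-sends the system prompt plus all earlier
    user messages and replies, with no prompt caching.
    All default sizes are illustrative assumptions.
    """
    total_in = 0
    total_out = 0
    history = 0  # tokens of prior user messages and replies
    for _ in range(turns):
        # Input for this turn: system prompt + everything said so far + new message
        total_in += system_tokens + history + user_tokens
        total_out += reply_tokens
        # The new message and its reply become history for the next turn
        history += user_tokens + reply_tokens
    return total_in, total_out


for turns in (1, 5):
    tokens_in, tokens_out = conversation_tokens(turns)
    print(f"{turns} turn(s): {tokens_in} in / {tokens_out} out")
```

Under these assumed sizes, input tokens grow faster than output tokens as the conversation gets longer. The actual ratio depends on your prompt sizes and conversation lengths, so measure real traffic before settling on a weighting.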
Cheapest LLM for chatbots
What is the cheapest LLM for chatbots?
Llama 3.1 8B Instruct (Meta) is the cheapest generally-available model we track for chatbots, at $0.02 per 1M input tokens and $0.03 per 1M output tokens — about $0.29/month for a busy chatbot handling ~10M input and ~3M output tokens a month. Ministral 3 3B is the next cheapest at $0.52/month.
How is "cheapest for chatbots" calculated?
We price a representative monthly workload — a busy chatbot handling ~10M input and ~3M output tokens a month — against every generally-available model, then rank by total cost. All prices are USD per 1M tokens, sourced from official provider documentation.
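The calculation can be sketched in a few lines of Python. The prices below are copied from a handful of rows in the table above; `monthly_cost` and `rank` are illustrative helper names, and the alternative 3M-in / 10M-out workload is an assumption chosen only to show how volumes can reorder the list.

```python
# (input $/M, output $/M), copied from the ranking table
PRICES = {
    "Llama 3.1 8B Instruct": (0.02, 0.03),
    "Ministral 3 3B": (0.04, 0.04),
    "Amazon Nova Micro": (0.035, 0.14),
    "Llama 4 Scout": (0.10, 0.30),
    "Ministral 3 8B": (0.15, 0.15),
}


def monthly_cost(input_price, output_price, input_millions=10, output_millions=3):
    """Monthly USD cost: millions of tokens times the per-million price."""
    return round(input_millions * input_price + output_millions * output_price, 2)


def rank(prices, input_millions=10, output_millions=3):
    """Return (model, monthly cost) pairs sorted cheapest first."""
    costs = {
        name: monthly_cost(p_in, p_out, input_millions, output_millions)
        for name, (p_in, p_out) in prices.items()
    }
    return sorted(costs.items(), key=lambda item: item[1])


# Default workload from this page: 10M input, 3M output tokens a month
for name, cost in rank(PRICES):
    print(f"{name}: ${cost:.2f}")

# Hypothetical output-heavy workload: 3M input, 10M output
for name, cost in rank(PRICES, input_millions=3, output_millions=10):
    print(f"{name}: ${cost:.2f}")
```

At 10M in / 3M out, Llama 4 Scout ($1.90) edges out Ministral 3 8B ($1.95). Flip the workload to 3M in / 10M out and Ministral 3 8B stays at $1.95 while Scout rises to $3.30, which is why the ranking should be recomputed for your own volumes.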
Is the cheapest model always the right choice for chatbots?
No. Price is one axis; quality, latency, rate limits and reliability matter too. Use this ranking to shortlist, then test the top candidates on your own chatbot workload before committing. Cost is easy to measure; fit is not.