Cheapest LLM for coding

Coding agents read a lot of code and write a fair amount back, so both input and output count — and they need a large context window to hold the files in play. These models clear a 200K-token floor, ranked cheapest first.

The cheapest pick: Amazon Nova Lite
$11.40/mo for a coding agent burning ~90M input and ~25M output tokens a month · $0.06 in / $0.24 out per 1M · Amazon

The ranking

Cheapest models for coding

Monthly cost for a coding agent burning ~90M input and ~25M output tokens a month. Sorted cheapest first.

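| # | Model | Provider | Context | Input $/M | Output $/M | Monthly cost |
|---|-------|----------|---------|-----------|------------|--------------|
| 1 | Amazon Nova Lite | Amazon | 300K | $0.06 | $0.24 | $11.40 ◎ |
| 2 | Qwen-Flash | Alibaba | 1M | $0.05 | $0.40 | $14.50 |
| 3 | Llama 4 Scout (17B-16E Instruct) | Meta | 10M | $0.10 | $0.30 | $16.50 |
| 4 | Ministral 3 8B | Mistral | 256K | $0.15 | $0.15 | $17.25 |
| 5 | Qwen3.5-Flash | Alibaba | 1M | $0.10 | $0.40 | $19.00 |
| 6 | Llama 4 Maverick (17B-128E Instruct) | Meta | 1M | $0.15 | $0.60 | $28.50 |
| 7 | Mistral Small 4 | Mistral | 256K | $0.15 | $0.60 | $28.50 |
| 8 | GPT-5.4 nano | OpenAI | 400K | $0.20 | $1.25 | $49.25 |
| 9 | Gemini 3.1 Flash-Lite | Google | 1M | $0.25 | $1.50 | $60.00 |
| 10 | Qwen3.6-Flash | Alibaba | 1M | $0.25 | $1.50 | $60.00 |
| 11 | Qwen-Plus (Qwen3-series) | Alibaba | 1M | $0.40 | $1.20 | $66.00 |
| 12 | Qwen3.7-Plus | Alibaba | 1M | $0.40 | $1.60 | $76.00 |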

Estimate only; excludes prompt caching, batch discounts and free tiers. Different volumes change the ranking, so run your own numbers. Prices verified against official docs · catalog updated 2026-06-28.
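
To see how volume changes the winner, here is a minimal Python sketch that re-prices three rows from the table at different input/output mixes. The `PRICES` dictionary, the `cheapest` helper and the two alternative mixes are illustrative assumptions, not part of the guide's own tooling.

```python
# Prices in USD per 1M tokens (input, output), copied from three table rows.
PRICES = {
    "Amazon Nova Lite": (0.06, 0.24),
    "Qwen-Flash": (0.05, 0.40),
    "Ministral 3 8B": (0.15, 0.15),
}


def cheapest(input_mtok, output_mtok):
    """Return (model, monthly cost) for the lowest-cost model.

    Volumes are in millions of tokens per month.
    """
    costs = {
        name: in_price * input_mtok + out_price * output_mtok
        for name, (in_price, out_price) in PRICES.items()
    }
    best = min(costs, key=costs.get)
    return best, round(costs[best], 2)


print(cheapest(90, 25))   # the guide's default mix
print(cheapest(200, 10))  # hypothetical read-heavy agent
print(cheapest(20, 60))   # hypothetical generation-heavy agent
```

Under these assumed mixes, Amazon Nova Lite wins the default workload, Qwen-Flash wins the read-heavy one because of its lower input price, and Ministral 3 8B wins the generation-heavy one because of its lower output price.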

Methodology

Coding workloads carry whole files and diffs into context and generate substantial output, so we weight a 90M-in / 25M-out monthly mix and require ≥200K context to fit a real working set. Cheapest is not always best here — verify the model can actually pass your tests before committing.
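
The arithmetic behind each monthly figure can be sketched in a few lines of Python. The function name `monthly_cost` and its parameter names are assumptions made for illustration; the prices and the 90M / 25M mix come from the table.

```python
def monthly_cost(input_price, output_price, input_mtok=90.0, output_mtok=25.0):
    """Monthly cost in USD.

    Prices are USD per 1M tokens; volumes are millions of tokens per month.
    Caching, batch discounts and free tiers are ignored, as in the table.
    """
    return input_price * input_mtok + output_price * output_mtok


# Amazon Nova Lite: 90 * $0.06 + 25 * $0.24 = $5.40 + $6.00
nova_lite = monthly_cost(0.06, 0.24)
print(f"Amazon Nova Lite: ${nova_lite:.2f}/mo")
```

Passing your own `input_mtok` and `output_mtok` values gives an estimate for a workload that differs from the default mix.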

FAQ

What is the cheapest LLM for coding?

Amazon Nova Lite (Amazon) is the cheapest generally available model we track for coding, at $0.06 per 1M input tokens and $0.24 per 1M output tokens. That works out to about $11.40/month for a coding agent burning ~90M input and ~25M output tokens a month. Qwen-Flash is the next cheapest at $14.50/month.

How is "cheapest for coding" calculated?

We price a representative monthly workload (a coding agent burning ~90M input and ~25M output tokens a month) against every generally available model, then rank by total cost. Only models with at least a 200K-token context window are included. All prices are USD per 1M tokens, sourced from official provider documentation.
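
The filter-then-rank procedure described here might look like the following Python sketch. The `Model` dataclass, the `rank` function and the small `CATALOG` are assumptions for illustration; three rows are taken from the table, and one row is a hypothetical model that exists only to show the context filter excluding something.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    context_tokens: int
    input_price: float   # USD per 1M input tokens
    output_price: float  # USD per 1M output tokens


CATALOG = [
    Model("Qwen-Flash", 1_000_000, 0.05, 0.40),
    Model("Amazon Nova Lite", 300_000, 0.06, 0.24),
    Model("GPT-5.4 nano", 400_000, 0.20, 1.25),
    # Hypothetical row, not in the table: cheap but below the context floor.
    Model("hypothetical-128k-model", 128_000, 0.01, 0.01),
]


def rank(catalog, input_mtok=90, output_mtok=25, min_context=200_000):
    """Drop models under the context floor, then sort by monthly cost."""
    eligible = [m for m in catalog if m.context_tokens >= min_context]
    priced = [
        (m.name, round(m.input_price * input_mtok + m.output_price * output_mtok, 2))
        for m in eligible
    ]
    return sorted(priced, key=lambda row: row[1])


for position, (name, cost) in enumerate(rank(CATALOG), start=1):
    print(f"{position}. {name}: ${cost:.2f}/mo")
```

The hypothetical 128K model would be cheapest on price alone, but it never reaches the ranking because the context filter runs before the sort.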

Is the cheapest model always the right choice for coding?

No. Price is one axis; quality, latency, rate limits and reliability matter too. Use this ranking to shortlist, then test the top candidates on your own coding workload before committing. Cost is easy to measure — fit is not.