Large language models are the heart of the AI revolution. But how many are there, really? Who builds them? What do they cost? And which model is actually the best?
The honest answer:
It has gotten messy. In 2026, a new top-tier model shows up roughly every month, prices swing by a factor of 600, and the single most important metric of the past few years, the parameter count, is something the big labs no longer disclose at all.
In this article, I sort through the numbers. Every value comes from our centrally maintained LLM database, the same one that powers tools like the API cost calculator, and reflects the state of June 2026.
- Our database tracks 93 LLMs from 16 providers, 48 of them proprietary and 45 openly available.
- For coding, GPT-5.5 and Claude Opus 4.8 lead at around 88.6% SWE-bench. Open-weights models like DeepSeek-V4-Pro trail by just 8 percentage points.
- Prices range from $0.05 (GPT-5 nano) to $30 (GPT-5.5 Pro) per 1M input tokens. Frontier labs no longer disclose parameter counts.
1. How Many Large Language Models Are There in 2026?
Our database currently tracks 93 large language models from 16 different providers, from GPT-2 back in 2019 to the latest flagships in June 2026. This is deliberately a curated selection of the most important models, not a claim to completeness.
For context:
According to the Stanford AI Index 2026, US labs alone shipped around 50 notable models in 2025, Chinese providers about 30. More than 90% of all significant frontier models now come from industry rather than academic research. The market has professionalized and concentrated.
2. The Biggest LLM Providers by Model Count
A simple indicator of how active a lab is: the number of models it maintains. The chart below shows how many of the models we track belong to each provider:
OpenAI leads with 18 models, followed by Anthropic and Google with 13 each. That number only measures how deeply a lab maintains its lineup, though, not actual usage. Real market share looks different: in AI chatbot web traffic, ChatGPT dominates, while Gemini and Claude follow behind.
3. Parameters and Architecture: The End of Size Disclosures
For years, the parameter count was the most important metric for a model. GPT-3 had 175 billion, GPT-4 an estimated 1.76 trillion. Then the labs stopped reporting the number.
Today the rule is:
For every current frontier model from OpenAI, Anthropic, Google, and xAI, the parameter count is officially unknown. Model size has become a trade secret. Concrete, confirmed numbers only exist for open-weights models, and those are huge:
The architecture is the striking part. Almost all large models today use a Mixture-of-Experts (MoE) design, in which only a fraction of the parameters is active for each token. DeepSeek-V4-Pro has 1.6 trillion parameters but activates only 49 billion per token, around 3%. That cuts the compute needed per token and makes giant models affordable to run. In total, 22 of the tracked models are built as MoE.
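To make the "around 3%" figure concrete, here is a minimal sketch of the arithmetic. The total and active parameter counts are copied from the parameter table in this article; the dictionary and function names are only illustrative.

```python
# (total parameters, active parameters per token), both in billions,
# copied from the parameter table in this article.
MOE_MODELS = {
    "DeepSeek-V4-Pro": (1600, 49),
    "Kimi K2.6": (1000, 32),
    "Mistral Large 3": (675, 41),
    "Llama 4 Maverick": (400, 17),
}


def active_share(total_b: float, active_b: float) -> float:
    """Return the share of parameters used per token, in percent."""
    return 100 * active_b / total_b


for name, (total_b, active_b) in MOE_MODELS.items():
    print(f"{name}: {active_share(total_b, active_b):.1f}% of parameters active per token")
```

For the MoE models listed here, the active share stays in the low single digits, which is why total parameter count alone says little about what a model costs to run.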
You can filter and search the full parameter database by provider, size, and type below. For most current frontier models, the parameter column deliberately reads "unknown":
| Model | Developer | Parameters |
|---|---|---|
| GPT-5.5 | OpenAI | Unknown |
| GPT-5.5 Pro | OpenAI | Unknown |
| GPT-5.5 Instant | OpenAI | Unknown |
| GPT-5.4 | OpenAI | Unknown |
| GPT-5.3-Codex | OpenAI | Unknown |
| GPT-5.2 | OpenAI | Unknown |
| GPT-5 | OpenAI | Unknown |
| GPT-5 pro | OpenAI | Unknown |
| GPT-5 mini | OpenAI | Unknown |
| GPT-3.5 Turbo | OpenAI | Unknown |
| o3 | OpenAI | Unknown |
| o4-mini | OpenAI | Unknown |
| o1 | OpenAI | Unknown |
| Claude Fable 5 | Anthropic | Unknown |
| Claude Mythos 5 | Anthropic | Unknown |
| Claude Opus 4.8 | Anthropic | Unknown |
| Claude Opus 4.7 | Anthropic | Unknown |
| Claude Opus 4.6 | Anthropic | Unknown |
| Claude Sonnet 4.6 | Anthropic | Unknown |
| Claude Opus 4.5 | Anthropic | Unknown |
| Claude Sonnet 4.5 | Anthropic | Unknown |
| Claude Sonnet 4 | Anthropic | Unknown |
| Gemini 3.5 Flash (MoE) | Google | Unknown |
| Gemini 3.1 Pro (MoE) | Google | Unknown |
| Gemini 3 Pro (MoE) | Google | Unknown |
| Gemini 2.0 Flash (MoE) | Google | Unknown |
| Gemini 1.5 Pro (MoE) | Google | Unknown |
| Grok 4 | xAI | Unknown |
| Grok 3 | xAI | Unknown |
| Grok 2 | xAI | Unknown |
| Claude 3 Opus | Anthropic | 2T* |
| Llama 4 Behemoth (MoE, 288B active) | Meta | 2T |
| GPT-4 (MoE, 220B active) | OpenAI | 1.76T* |
| DeepSeek-V4-Pro (MoE, 49B active) | DeepSeek | 1.6T |
| Kimi K2.6 (MoE, 32B active) | Moonshot AI | 1T |
| Qwen 3.6 Max-Preview (MoE) | Alibaba | 1T* |
| Yi-Large (MoE) | 01.AI | 1T |
| DeepSeek-V3.2 (MoE, 37B active) | DeepSeek | 685B |
| Mistral Large 3 (MoE, 41B active) | Mistral AI | 675B |
| DeepSeek-V3 (MoE, 37B active) | DeepSeek | 671B |
| DeepSeek-R1 (MoE, 37B active) | DeepSeek | 671B |
| PaLM | Google | 540B |
| Megatron-Turing NLG | NVIDIA | 530B |
| Llama 3.1 405B | Meta | 405B |
| Llama 4 Maverick (MoE, 17B active) | Meta | 400B |
| Nemotron-4 340B | NVIDIA | 340B |
| PaLM 2 | Google | 340B* |
| Grok 1 (MoE, 86B active) | xAI | 314B |
| DeepSeek-V2 (MoE, 21B active) | DeepSeek | 236B |
| GPT-4o | OpenAI | 200B* |
| Falcon 180B | TII | 180B |
| Mixtral 8x22B (MoE, 44B active) | Mistral AI | 176B |
| BLOOM | BigScience | 176B |
| GPT-3 | OpenAI | 175B |
| Claude 3.5 Sonnet | Anthropic | 175B* |
| OPT-175B | Meta | 175B |
| LaMDA | Google | 137B |
| DBRX (MoE, 36B active) | Databricks | 132B |
| Mistral Large 2 | Mistral AI | 123B |
| Command A | Cohere | 111B |
| Llama 4 Scout (MoE, 17B active) | Meta | 109B |
| Command R+ | Cohere | 104B |
| Qwen 2.5 72B | Alibaba | 72B |
| Claude 3 Sonnet | Anthropic | 70B* |
| Llama 3.3 70B | Meta | 70B |
| Llama 3.1 70B | Meta | 70B |
| Llama 3 70B | Meta | 70B |
| Llama 2 70B | Meta | 70B |
| Mixtral 8x7B (MoE, 14B active) | Mistral AI | 56B |
| Falcon 40B | TII | 40B |
| Yi-34B | 01.AI | 34B |
| Qwen 2.5 32B | Alibaba | 32B |
| Command R | Cohere | 32B |
| Gemma 2 27B | Google | 27B |
| Claude 3 Haiku | Anthropic | 20B* |
| Qwen 2.5 14B | Alibaba | 14B |
| Phi-4 | Microsoft | 14B |
| Gemma 2 9B | Google | 9B |
| GPT-4o mini | OpenAI | 8B* |
| Llama 3.1 8B | Meta | 8B |
| Llama 3 8B | Meta | 8B |
| Ministral 8B | Mistral AI | 8B |
| Mistral 7B | Mistral AI | 7B |
| Qwen 2.5 7B | Alibaba | 7B |
| Phi-4 Multimodal | Microsoft | 5.6B |
| Phi-4 mini | Microsoft | 3.8B |
| Phi-3 mini | Microsoft | 3.8B |
| Gemini Nano 2 | Google | 3.3B |
| Ministral 3B | Mistral AI | 3B |
| Gemma 2 2B | Google | 2B |
| Gemini Nano 1 | Google | 1.8B |
| GPT-2 | OpenAI | 1.5B |
| Qwen 2.5 0.5B | Alibaba | 0.5B |
Parameter sizes of popular Large Language Models (as of May 2026)
4. Context Windows: From 200,000 to 10 Million Tokens
The context window determines how much text a model can process at once. Window sizes have grown by orders of magnitude over the past two years. The overview below covers more than 140 current models, sortable and filterable by provider:
| Model | Developer | Context Window |
|---|---|---|
| | Meta | 10M |
| | Alibaba | 10M |
| | Google | 2M |
| | Google | 2M |
| | xAI | 2M |
| | xAI | 2M |
| | Meta | 1M |
| | Google | 1M |
| | Google | 1M |
| | Google | 1M |
| | Google | 1M |
| | Google | 1M |
| | Google | 1M |
| | Google | 1M |
| | Google | 1M |
| | Google | 1M |
| | Anthropic | 1M |
| | Anthropic | 1M |
| | Anthropic | 1M |
| | Anthropic | 1M |
| | Anthropic | 1M |
| | Anthropic | 1M |
| | Anthropic | 1M |
| | Anthropic | 1M |
| | OpenAI | 1M |
| | OpenAI | 1M |
| | OpenAI | 1M |
| | OpenAI | 1M |
| | OpenAI | 1M |
| | DeepSeek | 1M |
| | Alibaba | 1M |
| | Alibaba | 1M |
| | Amazon | 1M |
| | Amazon | 1M |
| | Amazon | 1M |
| | MiniMax | 1M |
| | OpenAI | 400K |
| | OpenAI | 400K |
| | OpenAI | 400K |
| | OpenAI | 400K |
| | OpenAI | 400K |
| | OpenAI | 400K |
| | OpenAI | 400K |
| | OpenAI | 400K |
| | Amazon | 300K |
| | Amazon | 300K |
| | Moonshot AI | 262.14K |
| | Alibaba | 262.14K |
| | Alibaba | 262.14K |
| | xAI | 256K |
| | xAI | 256K |
| | Mistral | 256K |
| | Mistral | 256K |
| | Alibaba | 256K |
| | Cohere | 256K |
| | Cohere | 256K |
| | AI21 Labs | 256K |
| | AI21 Labs | 256K |
| | AI21 Labs | 256K |
| | MiniMax | 245.76K |
| | Anthropic | 200K |
| | Anthropic | 200K |
| | Anthropic | 200K |
| | Anthropic | 200K |
| | Anthropic | 200K |
| | Anthropic | 200K |
| | Anthropic | 200K |
| | Anthropic | 200K |
| | Anthropic | 200K |
| | Anthropic | 200K |
| | Anthropic | 200K |
| | OpenAI | 200K |
| | OpenAI | 200K |
| | OpenAI | 200K |
| | OpenAI | 200K |
| | 01.AI | 200K |
| | 01.AI | 200K |
| | xAI | 131.07K |
| | Meta | 128K |
| | Meta | 128K |
| | Meta | 128K |
| | Meta | 128K |
| | Meta | 128K |
| | Meta | 128K |
| | Meta | 128K |
| | Meta | 128K |
| | Google | 128K |
| | Google | 128K |
| | Google | 128K |
| | xAI | 128K |
| | OpenAI | 128K |
| | OpenAI | 128K |
| | OpenAI | 128K |
| | OpenAI | 128K |
| | OpenAI | 128K |
| | DeepSeek | 128K |
| | DeepSeek | 128K |
| | DeepSeek | 128K |
| | DeepSeek | 128K |
| | DeepSeek | 128K |
| | DeepSeek | 128K |
| | DeepSeek | 128K |
| | DeepSeek | 128K |
| | DeepSeek | 128K |
| | DeepSeek | 128K |
| | Mistral | 128K |
| | Mistral | 128K |
| | Mistral | 128K |
| | Mistral | 128K |
| | Mistral | 128K |
| | Alibaba | 128K |
| | Alibaba | 128K |
| | Alibaba | 128K |
| | Alibaba | 128K |
| | Alibaba | 128K |
| | Alibaba | 128K |
| | Alibaba | 128K |
| | Alibaba | 128K |
| | Alibaba | 128K |
| | Alibaba | 128K |
| | Alibaba | 128K |
| | Alibaba | 128K |
| | Cohere | 128K |
| | Cohere | 128K |
| | Amazon | 128K |
| | Microsoft | 128K |
| | Microsoft | 128K |
| | Microsoft | 128K |
| | Microsoft | 128K |
| | Microsoft | 128K |
| | Microsoft | 128K |
| | 01.AI | 128K |
| | 01.AI | 128K |
| | Nvidia | 128K |
| | Nvidia | 128K |
| | Nvidia | 128K |
| | Reka | 128K |
| | Reka | 128K |
| | Reka | 128K |
| | Zhipu AI | 128K |
| | Zhipu AI | 128K |
| | Baidu | 128K |
| | Mistral | 65.54K |
| | Microsoft | 64K |
| | Mistral | 32.77K |
| | Mistral | 32.77K |
| | Alibaba | 32.77K |
| | Alibaba | 32.77K |
| | Alibaba | 32.77K |
| | Microsoft | 32.77K |
| | Databricks | 32.77K |
| | Google | 32K |
| | 01.AI | 32K |
| | Microsoft | 16.38K |
| | 01.AI | 16K |
| | Google | 8.19K |
| | Google | 8.19K |
| | OpenAI | 8.19K |
| | AI21 Labs | 8.19K |
| | Zhipu AI | 8.19K |
| | Baidu | 8K |
| | Cohere | 4.1K |
| | Nvidia | 4.1K |
| | Stability AI | 4.1K |
| | Stability AI | 4.1K |
Context window sizes of current AI language models (as of May 2026)
At the top are Llama 4 Scout and Qwen-Long with 10 million tokens each. That's roughly 30 Harry Potter books in a single prompt. The current all-rounders like GPT-5.5, Claude Opus 4.8, and Gemini 3.1 Pro sit at 1 million tokens, which is more than enough for most use cases. For more on the individual model families, see our overviews of the Claude models and Gemini models.
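Whether a given document fits into one of these windows can be estimated before sending it. The sketch below uses the window sizes named above and assumes roughly 4 characters per token, a common rule of thumb for English text that is not taken from our database. Real tokenizers differ by model and language, so treat the result as an estimate only.

```python
# Context window sizes in tokens, as named in this article.
CONTEXT_WINDOWS = {
    "Llama 4 Scout": 10_000_000,
    "GPT-5.5": 1_000_000,
    "Claude Opus 4.8": 1_000_000,
    "Gemini 3.1 Pro": 1_000_000,
}

# Assumption: about 4 characters per token (rough heuristic for English).
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Estimate the token count of a text, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits(text: str, model: str, reserve_for_output: int = 0) -> bool:
    """Check whether the text plus a reserved output budget fits the window."""
    return estimate_tokens(text) + reserve_for_output <= CONTEXT_WINDOWS[model]
```

The `reserve_for_output` parameter is there because the model's answer usually has to fit into the same window as the prompt.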
5. What Does an LLM Cost? Prices per 1 Million Tokens
API prices are worlds apart. The cheapest model with API access is GPT-5 nano at $0.05 per 1M input tokens. The most expensive is GPT-5.5 Pro at $30 per 1M input tokens, a 600x difference.
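As a rough illustration of what that spread means in practice, the following sketch computes the input-side cost of a workload from the two prices quoted above. The workload size of 50 million input tokens is a made-up example, and output tokens, which are billed separately, are left out.

```python
# Input prices in USD per 1M tokens, as quoted in this article.
INPUT_PRICE_PER_1M = {
    "GPT-5 nano": 0.05,
    "GPT-5.5 Pro": 30.00,
}


def input_cost(model: str, input_tokens: int) -> float:
    """Return the input-token cost in USD for the given model."""
    return INPUT_PRICE_PER_1M[model] * input_tokens / 1_000_000


# Ratio between the most expensive and the cheapest input price (roughly 600).
price_spread = max(INPUT_PRICE_PER_1M.values()) / min(INPUT_PRICE_PER_1M.values())

# Hypothetical workload: 50 million input tokens per month.
MONTHLY_INPUT_TOKENS = 50_000_000
for model in INPUT_PRICE_PER_1M:
    print(f"{model}: ${input_cost(model, MONTHLY_INPUT_TOKENS):,.2f} per month (input only)")
```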
More interesting than the raw price is the ratio of price to performance. The chart below plots input price against coding performance (SWE-bench Verified). Models toward the bottom right are ideal: strong and cheap.
The quiet star of this chart is DeepSeek-V4-Pro. At 80.6% on SWE-bench for an input price of just $0.435, it sits right on the efficiency frontier: no other model is both stronger and cheaper. So if you don't strictly need the last few percentage points of coding performance, the open models offer an extremely good price-performance ratio. For a detailed cost estimate of your specific usage, see the API cost calculator.
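The idea of an efficiency frontier can be sketched in a few lines. A model is on the frontier if no other model is at least as cheap and at least as strong while being strictly better on one of the two. In the example data, only the DeepSeek-V4-Pro values come from this article; the three `example-*` entries are hypothetical placeholders, not real models or database values.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    input_price: float  # USD per 1M input tokens
    swe_bench: float    # SWE-bench Verified score in percent


def dominates(a: Model, b: Model) -> bool:
    """True if a is no more expensive and no weaker than b, and strictly better on one axis."""
    no_worse = a.input_price <= b.input_price and a.swe_bench >= b.swe_bench
    strictly_better = a.input_price < b.input_price or a.swe_bench > b.swe_bench
    return no_worse and strictly_better


def efficiency_frontier(models: list[Model]) -> list[Model]:
    """Keep only the models that no other model dominates."""
    return [m for m in models if not any(dominates(other, m) for other in models)]


MODELS = [
    Model("DeepSeek-V4-Pro", 0.435, 80.6),  # values from this article
    Model("example-frontier", 5.00, 88.0),  # hypothetical
    Model("example-midrange", 1.00, 75.0),  # hypothetical, pricier and weaker than DeepSeek-V4-Pro
    Model("example-budget", 0.10, 60.0),    # hypothetical
]

frontier_names = [m.name for m in efficiency_frontier(MODELS)]
print(frontier_names)
```

Note that the frontier usually contains several models: a cheaper but weaker model and a stronger but pricier one can both sit on it, because neither beats the other on both axes.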
6. LLM Performance Head to Head
To make the strengths and weaknesses of the top models visible at a glance, the radar below compares five representative frontier models across four dimensions: reasoning, coding, context window, and price efficiency. Each axis is scaled relative to the five models so even small leads become visible. The real values appear in the tooltip.
The pattern is clear. Claude Opus 4.8 and GPT-5.5 dominate on raw coding performance but are expensive. Gemini 3.5 Flash flips that: it is nearly on par on reasoning and trails only on coding, yet it offers the best price efficiency in the field. In the end, every AI project comes down to this one trade-off: maximum quality versus maximum economy.
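Scaling each axis relative to the compared models can be done with min-max normalization, sketched below. This is one plausible way to do it and an assumption on my part, not a description of the chart's actual implementation. The model names and values in the usage example are hypothetical.

```python
def scale_axis(values: dict[str, float], higher_is_better: bool = True) -> dict[str, float]:
    """Min-max scale one radar axis to the range 0..1 across the compared models.

    For axes where lower raw values are better (such as price),
    the scale is inverted so that 1.0 is always the best model.
    """
    lowest, highest = min(values.values()), max(values.values())
    if highest == lowest:
        # All models are tied on this axis; give every model the full score.
        return {name: 1.0 for name in values}
    scaled = {name: (v - lowest) / (highest - lowest) for name, v in values.items()}
    if not higher_is_better:
        scaled = {name: 1.0 - s for name, s in scaled.items()}
    return scaled


# Hypothetical values, for illustration only.
coding_scores = {"model-a": 88.0, "model-b": 86.0, "model-c": 84.0}
input_prices = {"model-a": 10.0, "model-b": 2.0, "model-c": 6.0}

coding_axis = scale_axis(coding_scores)
price_axis = scale_axis(input_prices, higher_is_better=False)
```

This also shows why the real values belong in the tooltip: a four-point gap in the raw scores stretches across the entire axis after scaling.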
7. Open Source vs. Proprietary
One of the most important developments of 2026 is the catch-up of open models. Of the 93 tracked models, 48 are proprietary and 45 are openly available, 40 of them open-weights and 5 fully open-source.
But at the very top:
According to the Stanford AI Index 2026, the best closed model led the best open-weights model by 3.3 percentage points in early 2026. In August 2024, the gap had been only 0.5 percentage points. So at the top it has not been shrinking but widening again, with six of the top-ten models in the Chatbot Arena now closed once more. Our data shows the same lead on coding: DeepSeek-V4-Pro (80.6% SWE-bench) and Kimi K2.6 (80.2%) trail the closed leader GPT-5.5 (88.7%) by about 8 percentage points. For an overview of the best free models, see our article on open-source LLMs.
8. Knowledge Cutoff: How Current Are the Models?
Every model has a knowledge cutoff, after which it has learned nothing more about the world. Right now the freshest cutoff in our database is October 2025:
Between the knowledge cutoff and the release date there are usually six to eight months in which the model is trained and tested. For current events, the models therefore almost always need a web search. Raw model knowledge is always a few months old.
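The lag is simple calendar arithmetic. The sketch below applies the six-to-eight-month rule of thumb to the October 2025 cutoff mentioned above. It illustrates the typical window only and makes no claim about the release date of any specific model; the helper names are illustrative.

```python
from datetime import date


def add_months(d: date, months: int) -> date:
    """Return the first day of the month that lies `months` after d."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end, ignoring the day of month."""
    return (end.year - start.year) * 12 + (end.month - start.month)


cutoff = date(2025, 10, 1)                # freshest cutoff named in the article
earliest_release = add_months(cutoff, 6)  # lower end of the typical lag
latest_release = add_months(cutoff, 8)    # upper end of the typical lag

print(f"Typical release window: {earliest_release:%B %Y} to {latest_release:%B %Y}")
```

The same `months_between` helper can be used the other way around, to check how stale a model's raw knowledge is on any given day.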
9. Release Pace: The Cadence of the Labs
How fast the market moves shows in the release timeline. What happened quarterly in 2024 comes almost monthly in 2026:
December 2025 was especially dense, when Google, OpenAI, and Mistral all shipped new flagships in the same month. So was April 2026, which brought GPT-5.5, Claude Opus 4.7, DeepSeek-V4-Pro, Kimi K2.6, and Qwen 3.6 Max-Preview, five top models at once. If you want to keep up here, don't cling too tightly to individual version numbers.
10. Model Status: Active, Deprecated, Legacy
Not every model ever released is still usable. Across the three big providers Anthropic, Google, and OpenAI, we track the lifecycle of 77 models. Here is how they split across the individual statuses:
Just over half of the models are still active, and nearly a third are already deprecated. And lifecycles are getting shorter. A good example is Gemini 3 Pro, deprecated only about three months after its release because Gemini 3.1 Pro was already standing by as a successor. Anyone building production systems on a model has to keep an active eye on these deprecations.
11. Market Position and Conclusion
The LLM market of 2026 has grown up. Instead of one dominant model, there's a tight leading pack of OpenAI, Anthropic, and Google, closely chased by open models from China, led by DeepSeek and Moonshot.
Bottom line:
Performance at the top is remarkably close, and the competition is shifting to price, context length, and specialization. For most applications in 2026, it matters less which model is the absolute best and more which one is right for the specific purpose and budget. If you want to dig deeper into individual providers, you'll find the details in our statistics on OpenAI, Anthropic, Google Gemini, Grok, and DeepSeek.






