Fine-tuning means adapting a pre-trained LLM (large language model) to a specific task or domain by further training it on a specialized dataset.
During this process, the model's weights are updated so that it better captures the specifics of the task or domain.
Fine-tuning thus allows you to optimize the broad but superficial knowledge base of LLMs for specific use cases.
Did that just go over your head?
No problem, here's a simpler explanation:
Think of an LLM as a new employee in your company who has a lot of general knowledge but little understanding of internal processes and communication within your organization.
Through fine-tuning, you feed your employee the necessary specialized knowledge so they can better fulfill their role in the company.
1. Benefits of Fine-Tuning
Fine-tuning improves on "few-shot learning" by training on many more examples than would fit in a prompt.
This means you no longer need to provide as many examples in your prompts to get the desired output. Additionally, you don't need to give the LLM as many details about its task, such as the writing style to use, the target audience, or the output length. This can save a lot of time.
Furthermore, fine-tuning can help an LLM respond with lower latency and consume fewer tokens. Fine-tuning can therefore also reduce costs for API usage or computing power.
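To make this concrete, here is a small sketch using the OpenAI Python SDK (the model names, fine-tuned model ID, and example content are placeholders, not recommendations): before fine-tuning, every request has to carry the instructions plus few-shot examples; afterwards, the prompt can shrink to just the actual question.

```python
from openai import OpenAI

client = OpenAI()

# Before fine-tuning: every request carries instructions plus few-shot examples.
few_shot_messages = [
    {"role": "system", "content": "You answer support questions in a friendly, concise tone, max. 2 sentences."},
    {"role": "user", "content": "Example question 1 ..."},
    {"role": "assistant", "content": "Example answer 1 ..."},
    {"role": "user", "content": "Example question 2 ..."},
    {"role": "assistant", "content": "Example answer 2 ..."},
    {"role": "user", "content": "How do I reset my password?"},
]
response = client.chat.completions.create(model="gpt-4o-mini", messages=few_shot_messages)

# After fine-tuning: style and format live in the model's weights,
# so the prompt only needs the actual question (fewer tokens per request).
response = client.chat.completions.create(
    model="ft:gpt-4o-mini:my-org::abc123",  # placeholder fine-tuned model ID
    messages=[{"role": "user", "content": "How do I reset my password?"}],
)
```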
2. What Steps Are Required for Fine-Tuning?
Fine-tuning sounds complicated. But it's actually a relatively simple, though very time-consuming, process.
The key steps in fine-tuning are:
- Prepare and upload training data (by far the most labor-intensive step for you)
- Train a new fine-tuned model
- Evaluate results and return to step 1 if needed
- Use your fine-tuned model
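With OpenAI models, these steps map to a handful of API calls. Here is a minimal sketch of the workflow using the OpenAI Python SDK; the file name and base model are placeholders, and in practice you would poll the job or use the dashboard rather than a single status check.

```python
from openai import OpenAI

client = OpenAI()

# Step 1: upload the prepared training data (JSONL, see section 2.1).
training_file = client.files.create(
    file=open("training_data.jsonl", "rb"),
    purpose="fine-tune",
)

# Step 2: start a fine-tuning job on a base model.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18",
)

# Step 3: check the job until it has finished.
job = client.fine_tuning.jobs.retrieve(job.id)
print(job.status)  # e.g. "running" or "succeeded"

# Step 4: once the job has succeeded, call the fine-tuned model like any other model.
response = client.chat.completions.create(
    model=job.fine_tuned_model,
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)
```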
2.1 How Do You Prepare Training Data?
The most important and time-consuming step in fine-tuning is preparing the training data.
For this, you need to create a diverse set of example conversations that resemble the conversations the model will encounter in production.
Each example in the dataset must have a specific format. When fine-tuning OpenAI models, for example, the training data must be in the same format as the Chat Completions API.
To achieve the best results with fine-tuning, you should primarily train an LLM with cases where it doesn't give the desired responses and provide your preferred answers in the training data.
Here's an example from OpenAI where a chatbot named "Marv" is fine-tuned to give sarcastic responses:
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What is the capital of France?"}, {"role": "assistant", "content": "Paris, as if everyone doesn't know that already."}]}
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "Who wrote 'Romeo and Juliet'?"}, {"role": "assistant", "content": "Oh, just some guy named William Shakespeare. Ever heard of him?"}]}
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "How far is the Moon from Earth?"}, {"role": "assistant", "content": "About 384,400 kilometers. Give or take a few, as if that really matters."}]}3. Which AI Models Can Be Fine-Tuned?
3. Which AI Models Can Be Fine-Tuned?
Fine-tuning is available for many different AI models, from proprietary APIs to open-source models. Here's an overview of currently available models (as of January 2026):
| Provider | Model | Parameters | Platform | Method |
|---|---|---|---|---|
| Alibaba / Qwen | Qwen3 | 0.6B–235B | Open Source | Full, LoRA |
| Alibaba / Qwen | Qwen2.5-Max | MoE | Alibaba Cloud | SFT |
| Alibaba / Qwen | Qwen2.5 | 0.5B–72B | Open Source | Full, LoRA |
| Amazon | Amazon Nova 2 Lite | – | Amazon Bedrock | Reinforcement FT |
| Amazon | Amazon Titan Text | – | Amazon Bedrock | SFT |
| Anthropic | Claude 3 Haiku | – | Amazon Bedrock | SFT |
| Cohere | Command R (08-2024) | 32B | Cohere API | SFT, LoRA |
| Cohere | Command R+ (08-2024) | 104B | Cohere API | SFT, LoRA |
| DeepSeek | DeepSeek R1 Distill | 1.5B–70B | Open Source | LoRA, QLoRA |
| DeepSeek | DeepSeek V3 | 671B (37B MoE) | Open Source | QAT |
| Google | Gemini 2.5 Pro | – | Vertex AI | SFT |
| Google | Gemini 2.5 Flash | – | Vertex AI | SFT |
| Google | Gemini 2.5 Flash-Lite | – | Vertex AI | SFT |
| Google | Gemini 2.0 Flash | – | Vertex AI | SFT |
| Google | Gemma 3 | 1B–27B | Open Source | Full, LoRA |
| Meta | Llama 3.3 | 70B | Open Source | Full, LoRA |
| Meta | Llama 3.2 | 1B–90B | Open Source, Amazon Bedrock | Full, LoRA |
| Meta | Llama 3.1 | 8B–405B | Open Source | Full, LoRA |
| Mistral | Mistral Large 3 | 123B | Mistral API, Open Source | SFT, LoRA |
| Mistral | Mistral Nemo | 12B | Mistral API, Open Source | SFT, LoRA |
| Mistral | Codestral | – | Mistral API | SFT |
| Mistral | Mistral Small | – | Mistral API | SFT |
| Mistral | Mistral 7B | 7B | Open Source | Full, LoRA |
| OpenAI | GPT-4.1 | – | OpenAI API | SFT, DPO |
| OpenAI | GPT-4.1 mini | – | OpenAI API | SFT, DPO |
| OpenAI | GPT-4.1 nano | – | OpenAI API | SFT, DPO |
| OpenAI | GPT-4o (2024-08-06) | – | OpenAI API | SFT |
| OpenAI | GPT-4o mini | – | OpenAI API | SFT |
| OpenAI | GPT-3.5 Turbo | – | OpenAI API | SFT |
3.1 Explanation of Fine-Tuning Methods
- SFT (Supervised Fine-Tuning): Classic supervised fine-tuning with input-output pairs
- DPO (Direct Preference Optimization): Training with preference data (which answer is better)
- Full Fine-Tuning: All model weights are adjusted
- LoRA (Low-Rank Adaptation): Efficient method that only trains small adapter layers
- QLoRA: LoRA with quantized base model (requires less VRAM)
- QAT (Quantization-Aware Training): Training that accounts for later quantization
- Reinforcement FT: Fine-tuning with reinforcement learning from human feedback
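To illustrate the LoRA method from the table, here is a minimal sketch using Hugging Face's transformers and peft libraries; the base model and hyperparameters are example choices, not recommendations:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Example open-source base model; any causal LM from the table above could be used.
base_model_name = "meta-llama/Llama-3.2-1B"
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
model = AutoModelForCausalLM.from_pretrained(base_model_name)

# LoRA: freeze the base weights and train only small low-rank adapter matrices.
lora_config = LoraConfig(
    r=8,                                   # rank of the adapter matrices
    lora_alpha=16,                         # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of the weights is trainable
```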
Note: As of 2025, Google AI Studio no longer supports fine-tuning. Gemini models must be fine-tuned via Vertex AI.
Tip: You can also further fine-tune an already fine-tuned model. This is useful when you receive additional data and don't want to repeat the previous training steps.
4. When Should You Use Fine-Tuning?
Fine-tuning is a great method for getting better output from an LLM and is especially useful when it's easier to "show than to explain."
The catch, however:
As already explained, fine-tuning is very time-consuming.
It therefore always makes sense to first check whether you can get better results with other methods, and to resort to fine-tuning only once you've exhausted them.
These include:
- Prompt Engineering (i.e., formulating prompts, such as adding a role, precisely defining the answer format, etc.)
- Prompt Chaining (breaking complex tasks into multiple prompts)
- Function Calling (e.g., calling external interfaces or databases)
A major advantage of these methods is that you get feedback much faster and more easily.
For example, if you add a role to your prompt, you can immediately compare the output with what the same prompt produces without the role.
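A sketch of such a quick comparison with the OpenAI Python SDK (the model and prompts are placeholders):

```python
from openai import OpenAI

client = OpenAI()
question = "Explain fine-tuning in two sentences."

# Variant A: plain prompt without a role.
plain = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": question}],
)

# Variant B: the same prompt with a role added via the system message.
with_role = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a patient teacher explaining AI topics to beginners."},
        {"role": "user", "content": question},
    ],
)

# Both outputs are available within seconds and can be compared side by side.
print(plain.choices[0].message.content)
print(with_role.choices[0].message.content)
```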
With fine-tuning, you often have to spend days or even weeks preparing your training data. Then you have to wait for the model to be fine-tuned and can only test what the fine-tuning actually achieved after these two steps.
5. Approaches to Fine-Tuning
There are various ways to fine-tune LLMs:
| Approach | Description | Analogy |
|---|---|---|
| Full Fine-Tuning | Retraining the entire model, requires a lot of data and resources | Completely training a new employee |
| Parameter Efficient Fine-Tuning (PEFT) | Adding new efficient adapters without changing the model structure | Further training an employee |
| Distillation | Training a smaller specialized model that replicates the decisions of the large model | Having an experienced employee train a new employee |
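For distillation, one simple recipe is to let the large model answer representative prompts and use those answers as training data for the smaller model. Here is a rough sketch (the prompts, model names, and file name are placeholders):

```python
import json
from openai import OpenAI

client = OpenAI()

# Placeholder prompts; in practice these would cover your real use case.
prompts = ["What is the capital of France?", "Who wrote 'Romeo and Juliet'?"]

# Let the large "teacher" model answer the prompts.
with open("distillation_data.jsonl", "w", encoding="utf-8") as f:
    for prompt in prompts:
        teacher_answer = client.chat.completions.create(
            model="gpt-4.1",  # large teacher model
            messages=[{"role": "user", "content": prompt}],
        ).choices[0].message.content

        # Store the teacher's answer in the fine-tuning format from section 2.1.
        example = {"messages": [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": teacher_answer},
        ]}
        f.write(json.dumps(example, ensure_ascii=False) + "\n")

# The resulting file can then be used to fine-tune a smaller "student" model
# (e.g. a mini variant) with the workflow shown in section 2.
```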
