
What is fine-tuning?

Fine-tuning is a technique commonly used in deep learning, in applications such as LLMs and image classification, to improve a model's performance on a more specific task at a lower training cost.

Fine-tuning is a transfer learning approach in which a pre-trained model, such as a foundation model, is retrained on new data to boost its performance on a desired task. The retraining can be applied to all of the model's parameters or only to a subset. Training time and computation can typically be heavily reduced, while still improving performance, by training only a subset of the parameters or by adding an additional layer and training only that layer. The real benefit of fine-tuning is that you can build on a very powerful and expensive model while paying only a fraction of its initial training cost.
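The "train only an added layer" variant can be sketched in a few lines. In this toy NumPy example, a fixed random projection stands in for the frozen pre-trained backbone (a real workflow would load actual pre-trained weights), and only a new classification head is trained with gradient descent; all names and data here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pre-trained backbone: a fixed (frozen) random projection.
# Purely illustrative -- a real workflow would load actual pre-trained weights.
W_frozen = rng.normal(size=(16, 8))

def features(x):
    """Frozen feature extractor: linear map plus ReLU, never updated."""
    return np.maximum(x @ W_frozen, 0.0)

# The new task head is the only part we train during fine-tuning.
w_head = np.zeros(8)
b_head = 0.0

# Toy binary-classification data for the downstream task.
X = rng.normal(size=(200, 16))
y = (X[:, 0] > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(X, y):
    p = np.clip(sigmoid(features(X) @ w_head + b_head), 1e-9, 1 - 1e-9)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p)).mean()

initial_loss = log_loss(X, y)
lr = 0.1
for _ in range(300):
    h = features(X)                      # backbone output; no gradient needed
    p = sigmoid(h @ w_head + b_head)
    grad = p - y                         # d(loss)/d(logits)
    w_head -= lr * h.T @ grad / len(X)   # update only the head
    b_head -= lr * grad.mean()
final_loss = log_loss(X, y)
```

Because gradients are only needed for the small head, each training step touches a tiny fraction of the parameters a full retraining would update.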

For certain architectures, such as convolutional neural networks (CNNs), the first layers often encode low-level features while later layers encode more complex concepts. It is therefore possible to reduce the compute required by training only the later layers (see the hot dog recognition example).
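In PyTorch, freezing the early layers of a CNN amounts to turning off gradients for their parameters and handing only the remaining parameters to the optimizer. The tiny model below is a stand-in, not a real pre-trained network, and the `Linear` layer assumes 8×8 input images; both are assumptions for illustration:

```python
import torch
import torch.nn as nn

# Tiny stand-in CNN; a real workflow would load e.g. a pre-trained ResNet.
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),   # early layer: low-level features
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),  # later layer: higher-level features
    nn.Flatten(),
    nn.Linear(16 * 8 * 8, 2),                   # new task head (assumes 8x8 inputs)
)

# Freeze the first convolutional layer; train only the later layers.
for param in model[0].parameters():
    param.requires_grad = False

# Pass only the still-trainable parameters to the optimizer.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
```

With the first layer frozen, backpropagation skips its weight updates entirely, which is where the compute savings come from.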

Fine-tuning has also been successfully applied to LLMs to align them with human preferences using reinforcement learning from human feedback (RLHF). An LLM is initially trained on a massive dataset to produce a pre-trained model; although powerful, at this stage it is usually not very useful to a general audience. Fine-tuning of the pre-trained model is then applied to improve factors such as helpfulness, safety, and factuality, among others. This technique has been used to improve the performance of current frontier LLMs such as ChatGPT, Claude, and Llama.
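One core ingredient of RLHF is the reward model, which is trained on human preference pairs so that preferred replies score higher than rejected ones (the Bradley-Terry objective). The sketch below uses a linear reward over made-up feature vectors as a stand-in for a real model; the data-generating `true_w` direction is a hypothetical construct for the toy example:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy preference data: feature vectors for (chosen, rejected) reply pairs,
# generated from a hypothetical "true" preference direction.
true_w = rng.normal(size=6)
chosen = rng.normal(size=(100, 6)) + 0.5 * true_w
rejected = rng.normal(size=(100, 6)) - 0.5 * true_w

w = np.zeros(6)  # reward model parameters (linear for simplicity)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Bradley-Terry objective: maximize log sigmoid(r(chosen) - r(rejected)).
lr = 0.05
for _ in range(200):
    diff = chosen - rejected
    margin = diff @ w                     # reward gap for each pair
    grad = -((1 - sigmoid(margin))[:, None] * diff).mean(axis=0)
    w -= lr * grad

# Fraction of pairs where the learned reward prefers the chosen reply.
pref_acc = ((chosen - rejected) @ w > 0).mean()
```

In full RLHF, this learned reward then drives a reinforcement learning stage (e.g. PPO) that fine-tunes the LLM itself; that stage is omitted here.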

Common pitfalls of fine-tuning

Though fine-tuning can be a powerful technique, there are a few common pitfalls. First, if the fine-tuning dataset or the fine-tuning training regimen is of poor quality, fine-tuning can lead to overfitting: the pre-trained model may lose its generality, which in most cases is a highly desirable trait.
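A standard guard against overfitting during fine-tuning (not specific to this article) is early stopping: hold out a validation set and stop when validation loss stops improving. A minimal sketch on toy data, with an assumed patience threshold of 20 steps:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy fine-tuning data, split into train and validation sets.
X = rng.normal(size=(120, 10))
y = (X @ rng.normal(size=10) > 0).astype(float)
X_tr, y_tr, X_va, y_va = X[:80], y[:80], X[80:], y[80:]

w = np.zeros(10)

def log_loss(Xs, ys, w):
    p = np.clip(1.0 / (1.0 + np.exp(-(Xs @ w))), 1e-9, 1 - 1e-9)
    return -(ys * np.log(p) + (1 - ys) * np.log(1 - p)).mean()

best_w, best_val, patience = w.copy(), np.inf, 0
for step in range(500):
    p = 1.0 / (1.0 + np.exp(-(X_tr @ w)))
    w -= 0.1 * X_tr.T @ (p - y_tr) / len(X_tr)   # gradient step on train set
    val = log_loss(X_va, y_va, w)
    if val < best_val - 1e-5:
        best_w, best_val, patience = w.copy(), val, 0
    else:
        patience += 1
        if patience >= 20:   # stop once validation loss stops improving
            break
```

Keeping the weights from the best validation step (`best_w`) rather than the final step is what preserves the generality that further fine-tuning would erode.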

Fine-tuning also depends heavily not only on the quality and capability of the pre-trained model, but on how well the pre-trained model's original training objective aligns with the target task. Using a powerful pre-trained model therefore does not guarantee that the fine-tuned model will perform well on the target task.




© AISafety.info, 2022—2025
