AI
Fine-tuning
Teaching a general model specialist skills
Overview
Fine-tuning adapts a pre-trained model to a specific task or domain by continuing training on a smaller, curated dataset. Rather than training from scratch, which can cost millions of dollars in compute and take months, fine-tuning leverages the general knowledge already encoded in the model's weights and redirects it toward the target task. This is how generic base models become customer-support bots, medical assistants, or code assistants.
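The curated dataset for supervised fine-tuning is usually a set of prompt–response pairs. A minimal sketch, assuming a chat-style JSONL layout similar to what hosted fine-tuning APIs accept (the field contents here are illustrative, and the exact schema depends on the provider or training framework):

```python
import json

# One supervised fine-tuning example: a conversation the model should imitate.
# The final "assistant" message is the target completion the model is trained
# to reproduce given the preceding context.
example = {
    "messages": [
        {"role": "system", "content": "You are a concise support assistant."},
        {"role": "user", "content": "How do I reset my password?"},
        {"role": "assistant", "content": "Open Settings > Security and choose 'Reset password'."},
    ]
}

def to_jsonl(examples):
    """Training sets are typically stored one JSON object per line (JSONL)."""
    return "\n".join(json.dumps(e) for e in examples)

print(to_jsonl([example]))
```

A few hundred to a few thousand such examples, all demonstrating the desired behaviour, is a typical starting point for SFT.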
Key Concepts
- Supervised fine-tuning (SFT): trains the model on curated prompt–response pairs that demonstrate desired behaviour
- Full fine-tuning: all model weights are updated; expensive but maximum adaptation
- LoRA (Low-Rank Adaptation): freezes the base weights and injects small trainable low-rank matrices into selected layers, updating under 1% of parameters while preserving base model performance
- QLoRA: combines LoRA with 4-bit quantisation of the frozen base weights, enabling fine-tuning of a 65B-parameter model on a single 48 GB GPU
- PEFT (Parameter-Efficient Fine-Tuning): umbrella term covering LoRA, prefix tuning, adapters, and prompt tuning
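The LoRA idea above can be sketched numerically. A minimal, dependency-free illustration (pure Python, deliberately not a training loop): the frozen weight W is never modified; instead a low-rank product B·A, scaled by alpha/r, is added on top, and B starts at zero so the adapted model initially behaves exactly like the base model.

```python
# Sketch of the LoRA update rule. A dense layer computes y = W x with W of
# shape (d_out, d_in). LoRA freezes W and trains B (d_out, r) and A (r, d_in)
# with rank r << min(d_out, d_in); the effective weight is W + (alpha/r) * B A.

def matmul(X, Y):
    """Naive matrix multiply for nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def lora_effective_weight(W, A, B, alpha, r):
    BA = matmul(B, A)
    scale = alpha / r
    return [[w + scale * d for w, d in zip(wr, dr)] for wr, dr in zip(W, BA)]

d_out, d_in, r = 8, 8, 2
W = [[0.1] * d_in for _ in range(d_out)]   # frozen pre-trained weight
A = [[0.01] * d_in for _ in range(r)]      # trainable, small random init in practice
B = [[0.0] * r for _ in range(d_out)]      # trainable, zero init

# With B = 0 the update vanishes: fine-tuning starts at the base model.
assert lora_effective_weight(W, A, B, alpha=16, r=r) == W

# Parameter savings: full fine-tuning updates d_out * d_in weights per layer,
# LoRA updates only r * (d_in + d_out). The gap widens as layers grow.
full_params, lora_params = d_out * d_in, r * (d_in + d_out)
print(full_params, lora_params)
```

For the toy 8x8 layer the saving is only 2x, but for a real 4096x4096 projection with r = 8 the ratio is over 250x, which is why LoRA adapters typically touch well under 1% of a model's parameters.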
Key Facts
- LoRA (Hu et al., 2022) made it practical to fine-tune billion-parameter models on a single GPU
- Catastrophic forgetting is a key risk: aggressive fine-tuning can overwrite the model's general capabilities
- Instruction tuning on as few as 1,000 high-quality examples can dramatically improve task alignment
- OpenAI's GPT-3.5-turbo fine-tuning API allows organisations to build domain-specific assistants without touching model internals
- Domain-specific fine-tuned models often outperform larger general models on narrow tasks at a fraction of the inference cost