AI
Fine-tuning
Teaching a general model specialist skills
Overview
Fine-tuning adapts a pre-trained model to a specific task or domain by continuing training on a smaller, curated dataset. Rather than training from scratch, which can cost millions of dollars in compute and take months, fine-tuning leverages the general knowledge already encoded in the model's weights and redirects it toward the target task. This is how generic base models become customer-support bots, medical assistants, or code assistants.
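The curated dataset for supervised fine-tuning is usually a set of prompt–response pairs. A minimal sketch, assuming a chat-style JSONL layout similar to what hosted fine-tuning APIs accept (the field contents here are illustrative, and the exact schema depends on the provider or training framework):

```python
import json

# One supervised fine-tuning example: a conversation the model should imitate.
# The final "assistant" message is the target completion the model is trained
# to reproduce given the preceding context.
example = {
    "messages": [
        {"role": "system", "content": "You are a concise support assistant."},
        {"role": "user", "content": "How do I reset my password?"},
        {"role": "assistant", "content": "Open Settings > Security and choose 'Reset password'."},
    ]
}

def to_jsonl(examples):
    """Training sets are typically stored one JSON object per line (JSONL)."""
    return "\n".join(json.dumps(e) for e in examples)

print(to_jsonl([example]))
```

A few hundred to a few thousand such examples, all demonstrating the desired behaviour, is a typical starting point for SFT.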
Key Concepts
- Supervised fine-tuning (SFT): trains the model on curated prompt–response pairs that demonstrate desired behaviour
- Full fine-tuning: all model weights are updated; expensive but maximum adaptation
- LoRA (Low-Rank Adaptation): freezes the base weights and injects small trainable low-rank matrices into selected layers, updating under 1% of parameters while preserving base model performance
- QLoRA: combines LoRA with 4-bit quantisation of the frozen base weights, enabling fine-tuning of a 65B-parameter model on a single 48 GB GPU
- PEFT (Parameter-Efficient Fine-Tuning): umbrella term covering LoRA, prefix tuning, adapters, and prompt tuning
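The LoRA idea above can be sketched numerically. A minimal, dependency-free illustration (pure Python, deliberately not a training loop): the frozen weight W is never modified; instead a low-rank product B·A, scaled by alpha/r, is added on top, and B starts at zero so the adapted model initially behaves exactly like the base model.

```python
# Sketch of the LoRA update rule. A dense layer computes y = W x with W of
# shape (d_out, d_in). LoRA freezes W and trains B (d_out, r) and A (r, d_in)
# with rank r << min(d_out, d_in); the effective weight is W + (alpha/r) * B A.

def matmul(X, Y):
    """Naive matrix multiply for nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def lora_effective_weight(W, A, B, alpha, r):
    BA = matmul(B, A)
    scale = alpha / r
    return [[w + scale * d for w, d in zip(wr, dr)] for wr, dr in zip(W, BA)]

d_out, d_in, r = 8, 8, 2
W = [[0.1] * d_in for _ in range(d_out)]   # frozen pre-trained weight
A = [[0.01] * d_in for _ in range(r)]      # trainable, small random init in practice
B = [[0.0] * r for _ in range(d_out)]      # trainable, zero init

# With B = 0 the update vanishes: fine-tuning starts at the base model.
assert lora_effective_weight(W, A, B, alpha=16, r=r) == W

# Parameter savings: full fine-tuning updates d_out * d_in weights per layer,
# LoRA updates only r * (d_in + d_out). The gap widens as layers grow.
full_params, lora_params = d_out * d_in, r * (d_in + d_out)
print(full_params, lora_params)
```

For the toy 8x8 layer the saving is only 2x, but for a real 4096x4096 projection with r = 8 the ratio is over 250x, which is why LoRA adapters typically touch well under 1% of a model's parameters.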
Key Facts
- LoRA (Hu et al., 2022) made it practical to fine-tune billion-parameter models on a single GPU
- Catastrophic forgetting is a key risk: aggressive fine-tuning can overwrite the model's general capabilities
- Instruction tuning on as few as 1,000 high-quality examples can dramatically improve task alignment
- OpenAI's GPT-3.5-turbo fine-tuning API allows organisations to build domain-specific assistants without touching model internals
- Domain-specific fine-tuned models often outperform larger general models on narrow tasks at a fraction of the inference cost