
Fine-tuning

Teaching a general model specialist skills

lora · sft · adaptation

Overview

Fine-tuning adapts a pre-trained model to a specific task or domain by continuing training on a smaller, curated dataset. Rather than training from scratch—which can cost tens of millions of dollars and months of compute—fine-tuning leverages the general knowledge already encoded in the model's weights and redirects it. This is how generic base models become customer-support bots, medical advisors, or code assistants.

Key Concepts

  • Supervised fine-tuning (SFT): trains the model on curated prompt–response pairs that demonstrate desired behaviour
  • Full fine-tuning: all model weights are updated; expensive but maximum adaptation
  • LoRA (Low-Rank Adaptation): injects small trainable matrices into each layer, updating <1% of parameters while preserving base model performance
  • QLoRA: combines LoRA with 4-bit quantisation of the frozen base weights, enabling fine-tuning of 65B-parameter models on a single 48 GB GPU
  • PEFT (Parameter-Efficient Fine-Tuning): umbrella term covering LoRA, prefix tuning, adapters, and prompt tuning
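The LoRA idea above can be sketched in a few lines of NumPy: the frozen base weight W is augmented with a low-rank product B·A, so only A and B are trained. The dimensions, rank, and scaling factor below are illustrative choices, not values from any particular model.

```python
import numpy as np

d_in, d_out, r = 4096, 4096, 8   # hypothetical layer size and LoRA rank
alpha = 16                       # LoRA scaling hyperparameter

rng = np.random.default_rng(0)
W = rng.normal(size=(d_out, d_in))       # frozen pre-trained weight
A = rng.normal(size=(r, d_in)) * 0.01    # trainable down-projection
B = np.zeros((d_out, r))                 # trainable up-projection, zero-initialised

def forward(x):
    # Base path plus scaled low-rank path; with B = 0 the output
    # is identical to the base model's, so training starts from it.
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

# Trainable parameters vs. full fine-tuning:
lora_params = A.size + B.size   # r * (d_in + d_out) = 65_536
full_params = W.size            # d_in * d_out      = 16_777_216
print(f"trainable fraction: {lora_params / full_params:.4%}")  # 0.3906%
```

Because B starts at zero, the adapted model reproduces the base model exactly at step 0, which is why LoRA preserves base performance while updating well under 1% of the parameters.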

Key Facts

  • LoRA (Hu et al., 2022) made it practical to fine-tune billion-parameter models on a single GPU
  • Catastrophic forgetting is a key risk: aggressive fine-tuning can overwrite the model's general capabilities
  • Instruction tuning on as few as 1,000 high-quality examples can dramatically improve task alignment
  • OpenAI's GPT-3.5-turbo fine-tuning API allows organisations to build domain-specific assistants without touching model internals
  • Domain-specific fine-tuned models often outperform larger general models on narrow tasks at a fraction of the inference cost
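For hosted fine-tuning like the OpenAI API mentioned above, the training data is typically uploaded as JSONL, one chat-formatted example per line. A minimal sketch of preparing prompt–response pairs in that shape (the support-bot examples and system prompt are made up for illustration):

```python
import json

examples = [
    {"prompt": "How do I reset my password?",
     "response": "Go to Settings > Security and choose 'Reset password'."},
    {"prompt": "What is your refund policy?",
     "response": "Purchases can be refunded within 30 days of delivery."},
]

lines = []
for ex in examples:
    # Each training example is one chat: system context, user prompt,
    # and the assistant response the model should learn to produce.
    record = {"messages": [
        {"role": "system", "content": "You are a helpful support assistant."},
        {"role": "user", "content": ex["prompt"]},
        {"role": "assistant", "content": ex["response"]},
    ]}
    lines.append(json.dumps(record))

jsonl = "\n".join(lines)  # one JSON object per line, ready for upload
```

A few hundred to a thousand such curated pairs is often enough for noticeable alignment gains, consistent with the instruction-tuning fact above.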