A machine learning technique in which a model trained on one task is adapted to a different but related task. Transfer learning is the foundation of modern multimodal AI, enabling capable models to be built without massive task-specific training datasets.
Transfer learning takes a model that has been pretrained on a large dataset (like ImageNet for vision or BookCorpus for text) and adapts it for a new task. The pretrained model has learned general features (edges, textures, syntax, semantics) that transfer well to related tasks. Adaptation typically involves replacing the final classification layer and fine-tuning part or all of the network on task-specific data.
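A minimal sketch of this adaptation step, assuming PyTorch and torchvision are available; the ResNet-18 backbone and the 10-class target task are illustrative choices, not a prescription.

```python
import torch.nn as nn
from torchvision import models

# Load a model pretrained on ImageNet; its convolutional layers already
# encode general visual features (edges, textures, shapes).
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

# Replace the final classification layer with a new head sized for the
# target task (hypothetical 10-class problem).
model.fc = nn.Linear(model.fc.in_features, 10)
```

From here, part or all of the network is fine-tuned on the task-specific data, depending on how much data is available and how similar the task is to the pretraining domain.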
Common strategies include feature extraction (freeze pretrained weights, train only the new head), full fine-tuning (update all weights), and gradual unfreezing (progressively unfreeze layers from top to bottom). Learning rates for pretrained layers are typically 10-100x smaller than for new layers. Pretrained models from model hubs (Hugging Face, timm) provide ready-to-use starting points for virtually any vision, language, or multimodal task.
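The sketch below illustrates two of these strategies for the model above, again assuming PyTorch; the learning rates and class count are illustrative values, with the pretrained layers given a 100x smaller rate than the new head.

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 10)  # hypothetical 10-class head

# Strategy 1: feature extraction -- freeze pretrained weights, train only the new head.
for param in model.parameters():
    param.requires_grad = False
for param in model.fc.parameters():
    param.requires_grad = True
head_optimizer = torch.optim.AdamW(model.fc.parameters(), lr=1e-3)

# Strategy 2: full fine-tuning -- update all weights, but give the pretrained
# layers a much smaller learning rate than the freshly initialized head.
pretrained_params = [p for name, p in model.named_parameters() if not name.startswith("fc")]
full_optimizer = torch.optim.AdamW([
    {"params": pretrained_params, "lr": 1e-5},
    {"params": model.fc.parameters(), "lr": 1e-3},
])
```

Gradual unfreezing follows the same pattern: start with only the head trainable, then progressively set `requires_grad = True` on deeper layer groups as training proceeds.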