SPLADE - Learned sparse retrieval model using term expansion
SPLADE (SParse Lexical AnD Expansion) is a learned sparse retrieval model that generates sparse representations of queries and documents by predicting term importance weights across the entire vocabulary. Unlike traditional keyword matching, SPLADE learns to expand queries and documents with semantically related terms, combining the interpretability of sparse retrieval with the semantic understanding of neural models.
How It Works
SPLADE passes text through a BERT-based encoder and uses the masked language model (MLM) head to predict a weight for every token in the vocabulary. These weights represent the importance of each term for that query or document. The resulting sparse vector contains non-zero weights for terms that appear in the text (with learned importance) plus terms the model predicts are semantically relevant (expansion). Retrieval uses an inverted index where query and document sparse vectors are matched via dot product.
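The pooling step above can be sketched in a few lines of pure Python. This is a toy illustration with hand-picked logits, not a real BERT model: each input token gets one score per vocabulary term, the scores pass through log-saturation, and the per-token results are collapsed into a single sparse vector by max pooling. Terms whose logits never go positive receive weight zero and vanish, which is where the sparsity comes from.

```python
import math

def splade_pooling(logits, vocab):
    """Collapse per-token MLM logits into one sparse term-weight vector.

    logits: one row per input token, one score per vocabulary term
    (toy numbers here, not real model outputs).
    Weight for term j = max over tokens of log(1 + ReLU(logit_ij)).
    """
    weights = {}
    for row in logits:
        for term, logit in zip(vocab, row):
            w = math.log1p(max(logit, 0.0))  # log-saturation
            if w > 0.0 and w > weights.get(term, 0.0):  # max pooling
                weights[term] = w
    return weights

# Toy example: 2 input tokens scored against a 4-term vocabulary.
vocab = ["cat", "feline", "dog", "pet"]
logits = [
    [3.0, 1.2, -0.5, 0.8],   # token 1: activates "feline", "pet" (expansion)
    [-1.0, 0.4, -2.0, 0.3],  # token 2: weaker activations, masked by max pooling
]
vec = splade_pooling(logits, vocab)
# "dog" never has a positive logit, so it is absent from the sparse vector
```

Note that "feline" and "pet" get non-zero weight even if they never occur in the input text; that is the expansion behavior the section describes.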
Technical Details
SPLADE uses log-saturation, log(1 + ReLU(logit)), over the MLM logits, followed by max pooling across input token positions (from SPLADE v2 onward), to produce non-negative, saturated term weights. A FLOPS regularizer controls sparsity by penalizing a smooth proxy for the expected number of floating-point operations during retrieval. SPLADE can be used with standard inverted indices (Lucene, Anserini) by treating quantized term weights as term frequencies. SPLADE v2 and SPLADE++ introduced distillation from cross-encoders and improved regularization. The model is trained end-to-end with contrastive or distillation losses on query-document pairs.
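The FLOPS regularizer mentioned above has a simple closed form: for each vocabulary term, take the mean activation across a training batch and sum the squares. A minimal sketch, using dict-based sparse vectors and made-up batch values (the function name `flops_loss` is ours, not from the SPLADE codebase):

```python
def flops_loss(batch_vectors, vocab):
    """FLOPS regularizer: sum over vocabulary terms of the squared mean
    activation across the batch. Terms that fire for many examples get
    penalized most, which bounds expected posting-list work at query time.
    """
    n = len(batch_vectors)
    loss = 0.0
    for term in vocab:
        mean_w = sum(vec.get(term, 0.0) for vec in batch_vectors) / n
        loss += mean_w ** 2
    return loss

# Toy batch of two sparse vectors over a 3-term vocabulary.
batch = [{"cat": 1.4, "pet": 0.6}, {"dog": 1.1, "pet": 0.5}]
vocab = ["cat", "dog", "pet"]
# cat: (1.4/2)^2 = 0.49, dog: (1.1/2)^2 = 0.3025, pet: (1.1/2)^2 = 0.3025
reg = flops_loss(batch, vocab)  # 1.095
```

The squared-mean form (rather than a plain L1 penalty) is what makes the penalty concentrate on frequently activated terms: a term used in every document costs quadratically more than the same total mass spread across rare terms.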
Best Practices
Use SPLADE as a drop-in replacement for BM25 in existing inverted-index infrastructure
Tune the FLOPS regularization coefficient to balance effectiveness and efficiency for your latency budget
Combine SPLADE with dense retrieval in a hybrid approach for the strongest retrieval quality
Pre-encode documents offline and store sparse vectors for fast query-time matching
Fine-tune on domain-specific data for vocabulary-heavy domains (legal, medical, scientific)
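Two of the practices above, pre-encoding documents offline and serving them from an inverted index, can be sketched together. This is a minimal in-memory stand-in for Lucene-style infrastructure (the helper names `build_index` and `search` are ours): document sparse vectors are assumed to be encoded ahead of time, and query-time scoring is a dot product accumulated over posting lists.

```python
from collections import defaultdict

def build_index(doc_vectors):
    """Inverted index: term -> list of (doc_id, weight).
    Document sparse vectors are assumed to be pre-encoded offline."""
    index = defaultdict(list)
    for doc_id, vec in doc_vectors.items():
        for term, weight in vec.items():
            index[term].append((doc_id, weight))
    return index

def search(index, query_vec):
    """Score = dot product of query and document sparse vectors,
    accumulated one posting list at a time; only terms the query
    activates are ever touched."""
    scores = defaultdict(float)
    for term, qw in query_vec.items():
        for doc_id, dw in index.get(term, []):
            scores[doc_id] += qw * dw
    return sorted(scores.items(), key=lambda kv: -kv[1])

# Offline: encode documents once and index the sparse vectors (toy weights).
docs = {
    "d1": {"cat": 1.4, "feline": 0.8},
    "d2": {"dog": 1.2, "pet": 0.6},
}
index = build_index(docs)

# Online: encode the query, then match against the index.
results = search(index, {"feline": 1.0, "pet": 0.5})
# d1 scores 0.8, d2 scores 0.3
```

Because document encoding happens once at index time, query latency depends only on how many terms the query vector activates, which is exactly what the FLOPS regularization coefficient trades off.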
Common Pitfalls
Setting sparsity regularization too high, producing overly sparse vectors that miss relevant terms
Setting sparsity regularization too low, creating dense-like vectors that are slow with inverted indices
Not quantizing term weights for storage, leading to unnecessarily large index sizes
Assuming SPLADE fully replaces dense retrieval instead of treating them as complementary approaches
Ignoring the computational cost of encoding at index time, which is heavier than BM25 preprocessing
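The quantization pitfall above is cheap to avoid: scale the float weights to small integers ("impact scores") that fit a standard index's term-frequency field. A minimal sketch, where `scale=100` is a common engineering choice rather than a SPLADE-mandated value:

```python
def quantize(vec, scale=100):
    """Quantize float term weights to small integers so they can be
    stored in an inverted index's term-frequency field. Terms whose
    weight rounds to zero are dropped, further shrinking the index."""
    quantized = {}
    for term, weight in vec.items():
        q = round(weight * scale)
        if q > 0:
            quantized[term] = q
    return quantized

q = quantize({"cat": 1.386, "pet": 0.004})
# "cat" -> 139; "pet" rounds to 0 and is dropped
```

The dropped near-zero terms are usually noise, but if recall matters more than index size, lower the threshold by raising `scale` and verify the effect on a dev set rather than assuming it is free.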
Advanced Tips
Use SPLADE for explainable retrieval by inspecting which expanded terms contributed to the match
Combine SPLADE scores with dense retrieval scores (hybrid search) using learned or tuned interpolation weights
Apply document-side-only expansion (efficient SPLADE) to reduce query encoding overhead
Use distillation from cross-encoder teachers to train stronger SPLADE models
Consider SPLADE for multilingual retrieval with mBERT-based backbones
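The hybrid-search tip above often comes down to a per-document linear interpolation of the two score lists. A minimal sketch with made-up scores; `alpha=0.6` is a placeholder, and in practice the weight is tuned or learned on a dev set, typically after normalizing both score distributions to a comparable range:

```python
def hybrid_scores(sparse_scores, dense_scores, alpha=0.5):
    """Interpolate SPLADE and dense retrieval scores per document.
    Documents retrieved by only one system get 0.0 from the other,
    so both candidate sets contribute to the fused ranking."""
    doc_ids = set(sparse_scores) | set(dense_scores)
    return {
        doc_id: alpha * sparse_scores.get(doc_id, 0.0)
                + (1 - alpha) * dense_scores.get(doc_id, 0.0)
        for doc_id in doc_ids
    }

fused = hybrid_scores(
    {"d1": 0.8, "d2": 0.3},   # SPLADE scores (toy values)
    {"d2": 0.9, "d3": 0.5},   # dense scores (toy values)
    alpha=0.6,
)
# d1: 0.48, d2: 0.54, d3: 0.20 -> d2 wins only because both systems agree
```

If the raw score scales differ too much for linear interpolation to behave, rank-based fusion (e.g. reciprocal rank fusion) is a common scale-free alternative.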