What You’ll Build
A search retriever that starts with manually tuned weights and progressively learns the optimal blend of features from user behavior. By the end, your retriever adapts per-user (or per-segment) without manual tuning.
Prerequisites: A namespace with at least two collections producing different embedding types (e.g., text + multimodal). See Semantic Search or Video Understanding to set those up first.
1. Start with Static Weights
Begin with weighted fusion. This gives you a deterministic baseline to measure against later.
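To make the baseline concrete, here is a minimal sketch of what a weighted blend computes. It is not the platform's implementation: the function name, the feature names, and the assumption that per-feature scores are already normalized to a comparable range are all illustrative.

```python
from collections import defaultdict


def weighted_fusion(results_by_feature, weights):
    """Blend per-feature scores into one ranking using fixed weights.

    results_by_feature: {feature: {doc_id: score}} (scores assumed comparable)
    weights:            {feature: weight} (hypothetical, hand-tuned values)
    """
    total = sum(weights.values())
    fused = defaultdict(float)
    for feature, scored_docs in results_by_feature.items():
        # Normalize so the effective weights sum to 1.
        w = weights.get(feature, 0.0) / total
        for doc_id, score in scored_docs.items():
            fused[doc_id] += w * score
    # Highest fused score first.
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)


results = {
    "text": {"doc_a": 0.9, "doc_b": 0.4},
    "image": {"doc_b": 0.8, "doc_c": 0.7},
}
ranking = weighted_fusion(results, {"text": 0.7, "image": 0.3})
```

Because nothing here is random, the same inputs always produce the same ranking, which is what makes this a usable baseline for later comparison.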
2. Instrument Your Application with Interaction Signals
Before switching to learned fusion, you need to emit signals. Add interaction tracking wherever users engage with search results.

| Domain | Primary Signals | Why |
|---|---|---|
| E-commerce | purchase, add_to_cart, click | Conversion is the strongest relevance indicator |
| Media / Video | long_view, click, share | Watch completion > click for engagement |
| Enterprise Search | positive_feedback, click, bookmark | Clicks may be obligatory; explicit feedback is clearer |
| Content Matching | click, positive_feedback, skip | Editor accept/reject on matched content |
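
One way to keep signal names consistent across an application is to build every event through a single helper. The sketch below is an assumption-heavy illustration: the `Interaction` fields, the domain keys, and `build_interaction` are hypothetical names, not the platform's schema. Only the signal names are taken from the table above.

```python
from dataclasses import dataclass, asdict
import time


@dataclass
class Interaction:
    # All field names are hypothetical; map them to your real tracking schema.
    user_id: str
    document_id: str
    interaction_type: str  # e.g. "click", "purchase", "long_view"
    position: int          # rank at which the result was shown
    timestamp: float


# Primary signals per domain, copied from the table above.
PRIMARY_SIGNALS = {
    "ecommerce": ["purchase", "add_to_cart", "click"],
    "media": ["long_view", "click", "share"],
    "enterprise_search": ["positive_feedback", "click", "bookmark"],
    "content_matching": ["click", "positive_feedback", "skip"],
}


def build_interaction(domain, user_id, document_id, interaction_type,
                      position, timestamp=None):
    """Validate the signal name for the domain and return a plain dict."""
    if interaction_type not in PRIMARY_SIGNALS[domain]:
        raise ValueError(f"{interaction_type!r} is not a primary signal for {domain!r}")
    event = Interaction(
        user_id=user_id,
        document_id=document_id,
        interaction_type=interaction_type,
        position=position,
        timestamp=time.time() if timestamp is None else timestamp,
    )
    return asdict(event)


event = build_interaction("ecommerce", "user_456", "doc_1", "purchase", 0, timestamp=1.0)
```

Rejecting signals outside the domain's primary set is a choice made here to catch typos early; loosen it if you emit secondary signals as well.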
3. Switch to Learned Fusion
Once interactions are flowing, update the retriever to use learned fusion. You can do this at any time, even with zero interactions (it falls back to uniform weights).
No weights field is needed: the system samples weights from Beta distributions on every query.
4. Understand Cold Start Behavior
With learned fusion enabled, the system handles sparse data automatically through hierarchical fallback:

| User Interactions | What Happens | Effective Behavior |
|---|---|---|
| 0 | Beta(1,1) = uniform prior for all features | Equivalent to RRF |
| < min_interactions (default 5) | Falls back to demographic or global weights | Shared weights from segment or all users |
| >= min_interactions | Personal weights from this user’s history | Per-user individually tuned weights |
The threshold is set by the min_interactions parameter (default 5). Below that, the system falls back up the hierarchy: personal → demographic → global → uniform prior.
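The fallback order can be sketched as a small decision function. This is an illustration only: `resolve_context_level` is a hypothetical name, the `"uniform_prior"` label is invented for the last rung, and applying the same `min_interactions` threshold at every level is an assumption the text above does not confirm.

```python
def resolve_context_level(personal_n, demographic_n, global_n, min_interactions=5):
    """Pick the most specific level that has enough interactions.

    Assumption: the same threshold applies at each level of the hierarchy.
    """
    if personal_n >= min_interactions:
        return "personal"
    if demographic_n >= min_interactions:
        return "demographic"
    if global_n >= min_interactions:
        return "global"
    # Nothing learned yet: Beta(1,1) for every feature.
    return "uniform_prior"


levels = [
    resolve_context_level(7, 100, 1000),  # established user
    resolve_context_level(2, 100, 1000),  # new user in a known segment
    resolve_context_level(0, 0, 1000),    # anonymous traffic
    resolve_context_level(0, 0, 0),       # brand-new deployment
]
```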
If you pass user_id in your search requests, the system tracks personal-level weights automatically. Without user_id, you still get global-level learning: the system learns which features are better overall, just not per-user.
5. Execute Searches with User Context
Pass user_id on every search request so the bandit can build per-user weight profiles. For a request with user_id set to user_456, the system:
- Looks up user_456’s interaction history in ClickHouse
- Computes Beta(α, β) per feature: α = 1 + clicks, β = 1 + (impressions - clicks)
- Samples a weight from each Beta distribution
- Normalizes weights to sum to 1
- Executes each feature search and fuses results using the sampled weights
If user_456 has consistently clicked text-matched results over image-matched ones, the text feature’s Beta distribution is peaked higher, so sampled weights skew toward text.
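The sampling steps (compute Beta parameters, sample, normalize) can be sketched with the standard library alone. The click and impression counts below are made up, and `sample_fusion_weights` is a hypothetical helper, not the service's code. The α and β formulas are the ones stated in the list.

```python
import random


def sample_fusion_weights(stats, rng=random):
    """Thompson-style weight sampling.

    stats: {feature: (clicks, impressions)}
    Returns {feature: weight} with weights summing to 1.
    """
    samples = {}
    for feature, (clicks, impressions) in stats.items():
        alpha = 1 + clicks
        beta = 1 + (impressions - clicks)
        # One draw from Beta(alpha, beta) per feature, per query.
        samples[feature] = rng.betavariate(alpha, beta)
    total = sum(samples.values())
    return {feature: s / total for feature, s in samples.items()}


rng = random.Random(42)  # seeded only so the example is repeatable
stats = {"text": (40, 50), "image": (5, 50)}  # illustrative counts
weights = sample_fusion_weights(stats, rng)
```

Each call draws fresh samples, so two identical queries can receive slightly different weights. That randomness is the exploration mechanism, and it shrinks as the Beta distributions narrow with more data.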
6. Monitor Convergence
Check whether the learned weights are stabilizing using the analytics endpoint. Things to look for:
- Weights stabilizing: variance decreasing over time means the system is converging
- Feature dominance: if one feature’s weight approaches 1.0, the other features may not be contributing value
- Per-segment differences: different user segments learning different weights validates that personalization is working
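If you want to sanity-check convergence offline, one option is to track the variance of each feature's Beta posterior, which has the closed form αβ / ((α+β)²(α+β+1)). The sketch below uses that as a proxy for the weight variance reported by the analytics endpoint. This is an assumption, and the helper names and the 0.95 dominance threshold are hypothetical.

```python
def beta_variance(clicks, impressions):
    """Variance of Beta(1 + clicks, 1 + impressions - clicks)."""
    alpha = 1 + clicks
    beta = 1 + (impressions - clicks)
    n = alpha + beta
    return (alpha * beta) / (n * n * (n + 1))


def is_converging(history, tolerance=0.0):
    """history: (clicks, impressions) snapshots in time order.

    True if the posterior variance never increases between snapshots.
    """
    variances = [beta_variance(c, i) for c, i in history]
    return all(later <= earlier + tolerance
               for earlier, later in zip(variances, variances[1:]))


def dominant_feature(weights, threshold=0.95):
    """Return the feature whose weight is at or above threshold, else None."""
    feature, weight = max(weights.items(), key=lambda kv: kv[1])
    return feature if weight >= threshold else None


snapshots = [(0, 0), (4, 10), (40, 100), (400, 1000)]  # illustrative counts
converging = is_converging(snapshots)
```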
7. Measure Improvement
Use evaluations to compare learned fusion against your static baseline. Create an evaluation with the same queries and judge whether learned fusion produces better-ranked results. The key metrics to track:

| Metric | What It Tells You |
|---|---|
| CTR at position 1-3 | Are top results more clickable? |
| Mean Reciprocal Rank | Is the first relevant result appearing earlier? |
| Interaction rate | Are users engaging more overall? |
| Weight variance | Is the system still exploring or has it converged? |
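
Two of these metrics are easy to compute yourself from logged rankings and clicks. The sketch below uses the standard definitions of Mean Reciprocal Rank and click-through rate over the top k positions. The input shapes and sample data are assumptions for illustration.

```python
def mean_reciprocal_rank(ranked_lists, relevant_sets):
    """Average of 1/rank of the first relevant result per query (0 if none)."""
    if not ranked_lists:
        return 0.0
    total = 0.0
    for ranking, relevant in zip(ranked_lists, relevant_sets):
        for rank, doc_id in enumerate(ranking, start=1):
            if doc_id in relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)


def ctr_at_k(impressions, k=3):
    """impressions: (position, clicked) pairs with 1-based positions."""
    top = [clicked for position, clicked in impressions if position <= k]
    return sum(top) / len(top) if top else 0.0


relevant = [{"b"}, {"c"}]
static_mrr = mean_reciprocal_rank([["a", "b", "c"], ["b", "a", "c"]], relevant)
learned_mrr = mean_reciprocal_rank([["b", "a", "c"], ["c", "b", "a"]], relevant)
top3_ctr = ctr_at_k([(1, True), (2, False), (3, True), (4, True)], k=3)
```

Running both retrievers over the same query set and comparing the two MRR values gives a like-for-like view of whether learned fusion moves relevant results up.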
When to Use Each Strategy
| Starting Point | Recommendation |
|---|---|
| No interaction data, launching today | Start with rrf — strong default, no tuning needed |
| Domain expert knows feature importance | Start with weighted — encode expert knowledge as initial weights |
| Have 100+ interactions flowing | Switch to learned — let the data decide |
| Multiple user segments with different needs | learned with user_id — per-segment and per-user personalization |
| Need deterministic, reproducible results | Stay with weighted — learned fusion is stochastic by design |
8. Monitor Personalization
Once learned fusion is running, check per-user weights to verify personalization is working. Users with enough interactions report context_level: "personal". New users fall back to "demographic" or "global" until they cross the min_interactions threshold.
9. Configure Reward Signals
Not all interactions are equal. Customize the reward_map to weight different signals.
negative_feedback actively pushes the weight away from the feature that produced the disliked result. See the Reward Signals reference for all 17 signal types and their defaults.
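To see how graded rewards could shape a feature's distribution, consider the sketch below. Everything in it is an assumption: the reward values are illustrative and not the documented defaults, and the update rule (positive rewards add to α, negative rewards add their magnitude to β) is one plausible reading of how a negative signal pushes weight away, not a confirmed description of the system.

```python
# Illustrative values only; see the Reward Signals reference for real defaults.
REWARD_MAP = {
    "purchase": 1.0,
    "add_to_cart": 0.6,
    "click": 0.3,
    "negative_feedback": -0.5,
}


def apply_reward(alpha, beta, signal, reward_map=REWARD_MAP):
    """Update Beta parameters for the feature that produced the result.

    Assumed rule: positive reward raises alpha, negative reward raises beta.
    Unknown signals leave the parameters unchanged.
    """
    reward = reward_map.get(signal, 0.0)
    if reward >= 0:
        return alpha + reward, beta
    return alpha, beta + abs(reward)


def posterior_mean(alpha, beta):
    return alpha / (alpha + beta)


after_purchase = apply_reward(1.0, 1.0, "purchase")
after_dislike = apply_reward(1.0, 1.0, "negative_feedback")
```

Under this rule a purchase moves the posterior mean up from 0.5 and a dislike moves it down, so stronger signals shift the expected weight faster than weak ones.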
10. Safe Rollout
Before rolling learned fusion to all traffic, use the built-in rollout controls:
- Shadow mode: compute learned weights but serve static results. Compare offline.
- Traffic splitting: route a percentage of users to learned fusion (rollout_pct: 10).
- Kill switch: instantly disable learned fusion if something goes wrong.
- Per-user opt-out: exclude internal test accounts or specific users.
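The controls above are built in, so you should not need to implement them. Still, it can help to see how percentage-based traffic splitting is commonly done. The sketch below hashes the user ID into one of 100 stable buckets. `in_rollout` and its parameters are hypothetical, and the platform's actual bucketing may differ.

```python
import hashlib


def in_rollout(user_id, rollout_pct, opt_out=frozenset(), kill_switch=False):
    """Decide whether a user is served learned fusion.

    A stable hash keeps each user in the same bucket across requests.
    """
    if kill_switch or user_id in opt_out:
        return False
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # 0..99
    return bucket < rollout_pct


users = [f"user_{i}" for i in range(10_000)]
enabled = sum(in_rollout(u, 10) for u in users)
```

Hashing rather than random assignment means a user does not flip between learned and static results from one request to the next, which keeps the two cohorts comparable.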
11. Evaluate Learned vs Static
Generate an evaluation dataset from real interactions and compare learned fusion against your static baseline.
Next Steps
Auto-Tune Concept
Full overview of the auto-tune system — how it works, when to use it, and configuration options.
Learned Fusion Deep Dive
Thompson Sampling internals, session adaptation, temporal decay, weight clamping, and exploration decay.
Reward Signals
All 17 interaction types, reward weighting, negative signals, and position bias correction.
Rollout Guide
Shadow mode, traffic splitting, kill switch, per-user opt-out, and preference reset.
Fusion Strategies
Compare all 5 fusion strategies: RRF, DBSF, Weighted, Max, and Learned.
Evaluations
Set up benchmarks to measure whether learned fusion is improving result quality.

