Interaction Signals
Capture implicit user behavior (clicks, views, dwell time, purchases) to feed into retrieval optimization. interaction_type is always a JSON array holding the co-occurring signals of one action, e.g. ["click", "long_view"], and the document is identified by feature_id. Track position for every signal; it is critical for correcting position bias.
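As a rough illustration of the shape described above, the sketch below assembles one interaction record. Only feature_id, interaction_type, position, and duration_ms come from this page; the helper name build_interaction, the placement of duration_ms at the top level, and the 0-based position convention are assumptions made for the example, not the documented API.

```python
import json
from typing import Optional


def build_interaction(
    feature_id: str,
    interaction_types: list[str],
    position: int,
    duration_ms: Optional[int] = None,
) -> dict:
    """Assemble one interaction record (hypothetical helper, not an SDK call)."""
    if not interaction_types:
        raise ValueError("interaction_type must be a non-empty array")
    if position < 0:
        raise ValueError("position must be >= 0 (0-based here by assumption)")

    payload = {
        "feature_id": feature_id,
        # Always an array, even when only one signal fired.
        "interaction_type": list(interaction_types),
        # Rank at which the result was shown; needed for position-bias correction.
        "position": position,
    }
    if duration_ms is not None:
        # Assumed top-level field; the page only says to pass duration_ms.
        payload["duration_ms"] = duration_ms
    return payload


record = build_interaction("feat_123", ["click", "long_view"], position=2, duration_ms=8500)
print(json.dumps(record, indent=2))
```

Keeping the co-occurring signals in one array means a click followed by a long view is recorded as a single action at a single position, rather than as two separate events.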
Signal Strength
| Signal | Weight | When to track |
|---|---|---|
| click | Medium | User clicked a result |
| long_view | High | Sustained engagement (pass duration_ms) |
| add_to_cart | High | Intent / funnel step |
| purchase | Highest | User completed a goal action |
| negative_feedback | Penalty | User disliked / hid the result |
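To make the relationship between signal weight and position concrete, here is a minimal sketch that turns one action into a position-debiased reward. The numeric weights are invented stand-ins for the qualitative levels in the table, and the correction shown is generic inverse propensity weighting with a power-law examination model; neither is confirmed to be what the platform does internally.

```python
# Hypothetical numeric stand-ins for the qualitative weights in the table.
SIGNAL_WEIGHTS = {
    "click": 0.3,              # Medium
    "long_view": 0.7,          # High
    "add_to_cart": 0.7,        # High
    "purchase": 1.0,           # Highest
    "negative_feedback": -0.5, # Penalty
}


def examination_propensity(position: int, eta: float = 1.0) -> float:
    """Assumed probability that a user even looks at a 0-based position."""
    return 1.0 / (position + 1) ** eta


def debiased_reward(
    interaction_types: list[str],
    position: int,
    eta: float = 1.0,
    max_ipw: float = 10.0,
) -> float:
    """Strongest positive signal plus any penalties, scaled by clipped 1/propensity."""
    weights = [SIGNAL_WEIGHTS[t] for t in interaction_types]  # KeyError on unknown types
    positives = [w for w in weights if w > 0]
    penalties = [w for w in weights if w < 0]
    base = (max(positives) if positives else 0.0) + sum(penalties)

    # Results shown lower in the list are seen less often, so an engagement
    # there counts for more. Clipping keeps deep positions from dominating.
    ipw = min(1.0 / examination_propensity(position, eta), max_ipw)
    return base * ipw


print(debiased_reward(["click", "long_view"], position=0))
print(debiased_reward(["click"], position=3))
```

Taking the strongest positive signal, rather than summing them, avoids double-counting a click that is already implied by the long view that followed it.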
Auto-Tune (Learned Fusion)
Auto-Tune automatically adapts fusion weights per user based on their interaction history. Instead of manually choosing weights, the system uses Thompson Sampling to learn the optimal blend of features for each user.
- How It Works: concept page covering Thompson Sampling, context levels, and reward signals
- Reward Signals: configure which interactions drive learning and how much
- Rollout Guide: traffic splitting, shadow mode, kill switch, per-user opt-out
Fusion Strategies
When a retriever has multiple search stages, fusion strategies determine how scores combine into the final ranking.
| Strategy | How it works | Best for |
|---|---|---|
| RRF (Reciprocal Rank Fusion) | Combines ranks, not scores. 1/(k + rank) | Default — works well with no tuning |
| DBSF (Distribution-Based Score Fusion) | Normalizes score distributions then averages | When scores have different scales |
| Weighted | Manual weights per stage | When you know which stage matters more |
| Max | Takes the highest score across stages | When any match is sufficient |
| Learned | Auto-tunes weights from interaction signals | When you have 500+ interactions |
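The first two rows of the table can be sketched in a few lines. The RRF function follows the 1/(k + rank) formula given above with 1-based ranks and the conventional k = 60. The DBSF function is one common interpretation (min-max scaling against mean ± 3 standard deviations, then averaging across stages); the function names, the treatment of documents missing from a stage, and the exact normalization bounds are assumptions rather than the platform's confirmed behavior.

```python
from collections import defaultdict
from statistics import mean, pstdev


def rrf(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Reciprocal Rank Fusion: sum 1/(k + rank) over stages, ranks start at 1."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score descending, then by id so ties are deterministic.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def dbsf(stage_scores: list[dict[str, float]]) -> list[tuple[str, float]]:
    """Distribution-based fusion: scale each stage to mean +/- 3 sigma, then average."""
    totals: dict[str, float] = defaultdict(float)
    for scores in stage_scores:
        if not scores:
            continue
        mu = mean(scores.values())
        sigma = pstdev(scores.values())
        lo, hi = mu - 3 * sigma, mu + 3 * sigma
        for doc_id, s in scores.items():
            # A stage with identical scores carries no ranking information.
            totals[doc_id] += 0.5 if hi == lo else (s - lo) / (hi - lo)
    n = len(stage_scores)
    # Documents absent from a stage contribute 0 for that stage (assumption).
    return sorted(((d, t / n) for d, t in totals.items()), key=lambda kv: (-kv[1], kv[0]))


print(rrf([["a", "b", "c"], ["b", "a", "d"]]))
print(dbsf([{"a": 0.9, "b": 0.1}, {"a": 90.0, "b": 10.0}]))
```

RRF ignores raw scores entirely, which is why it needs no tuning; DBSF keeps score magnitudes but first puts each stage on a comparable scale, so a stage scoring in the range 0 to 100 does not drown out one scoring 0 to 1.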
Set fusion inside the feature_search stage parameters, alongside searches and final_top_k. For "fusion": "learned", also add a learning_config (see the Auto-Tune example above). Learned fusion uses Thompson Sampling to shift weight toward stages whose results users engage with; with zero interactions it behaves like rrf and moves toward learned weights as signals accumulate.
Evaluations
Measure retriever quality against ground truth datasets with standard IR metrics.
Analytics
Monitor retriever performance in production:
- Stage latency breakdown: identify which stages are slow
- Cache hit rates: verify caching is effective
- Score distributions: detect relevance drift
- Query patterns: understand what users search for
The Feedback Loop
- Users search via retrievers
- Interaction signals capture what they engage with
- Learned fusion adjusts stage weights automatically
- Annotations provide explicit ground truth for edge cases
- Evaluations measure improvement quantitatively
- The cycle repeats — retrieval improves with usage
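To illustrate what "measure improvement quantitatively" can look like in this loop, the sketch below computes two standard IR metrics, MRR and nDCG@k, for a ranking before and after a change. The function names and the toy data are made up for the example; the evaluation tooling itself may report different or additional metrics.

```python
import math


def reciprocal_rank(ranked: list[str], relevant: set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none appears."""
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mrr(runs: list[tuple[list[str], set[str]]]) -> float:
    """Mean reciprocal rank over (ranking, relevant set) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


def ndcg_at_k(ranked: list[str], gains: dict[str, float], k: int) -> float:
    """DCG of the top-k results divided by the DCG of the ideal ordering."""
    dcg = sum(
        gains.get(doc_id, 0.0) / math.log2(i + 1)
        for i, doc_id in enumerate(ranked[:k], start=1)
    )
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(i + 1) for i, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0


# Toy ground truth: "a" is the only relevant document for this query.
gains = {"a": 1.0}
before = ["x", "a", "y"]  # relevant result at rank 2
after = ["a", "x", "y"]   # relevant result at rank 1

print("nDCG@3 before:", ndcg_at_k(before, gains, 3))
print("nDCG@3 after: ", ndcg_at_k(after, gains, 3))
print("MRR before:", mrr([(before, {"a"})]), "after:", mrr([(after, {"a"})]))
```

Running the same metric on the same ground truth set before and after each weight change is what turns the loop from "it feels better" into a measurable trend.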

