This tutorial walks through the full lifecycle: static weights → interaction capture → learned fusion → convergence monitoring. You don’t need interaction data to start — the system gracefully degrades to uniform weights with zero signal.

What You’ll Build

A search retriever that starts with manually tuned weights and progressively learns the optimal blend of features from user behavior. By the end, your retriever adapts per-user (or per-segment) without manual tuning.

Prerequisites: A namespace with at least two collections producing different embedding types (e.g., text + multimodal). See Semantic Search or Video Understanding to set those up first.

1. Start with Static Weights

Begin with weighted fusion. This gives you a deterministic baseline to measure against later.
curl -X POST "$MP_API_URL/v1/retrievers" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE" \
  -H "Content-Type: application/json" \
  -d '{
    "retriever_name": "product-search",
    "stages": [
      {
        "stage_name": "feature_search",
        "stage_type": "filter",
        "config": {
          "stage_id": "feature_search",
          "parameters": {
            "searches": [
              {
                "feature_uri": "mixpeek://text_extractor@v1/multilingual_e5_large_instruct_v1",
                "query": "{{INPUT.query}}",
                "top_k": 100,
                "weight": 0.6
              },
              {
                "feature_uri": "mixpeek://multimodal_extractor@v1/vertex_multimodal_embedding",
                "query": "{{INPUT.query}}",
                "top_k": 100,
                "weight": 0.4
              }
            ],
            "fusion": "weighted",
            "final_top_k": 25
          }
        }
      }
    ]
  }'
Run searches against this retriever and record the result quality. These static-weight results are your baseline.
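To make the baseline concrete, here is a minimal Python sketch of how weighted fusion of two result lists could work. The exact scoring formula is not given in this tutorial, so min-max normalization followed by a weighted sum is an assumption, and weighted_fuse and the sample scores are hypothetical illustrations rather than part of the API.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Scale scores into [0, 1]; a constant list maps every item to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def weighted_fuse(
    results: List[Dict[str, float]], weights: List[float], final_top_k: int
) -> List[Tuple[str, float]]:
    """Combine per-feature score maps into one ranking (ASSUMED formula)."""
    fused: Dict[str, float] = {}
    for scores, weight in zip(results, weights):
        for doc, s in min_max_normalize(scores).items():
            fused[doc] = fused.get(doc, 0.0) + weight * s
    # Sort by fused score descending, breaking ties by document id.
    ranked = sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:final_top_k]


# Made-up scores for the two feature searches (text = 0.6, multimodal = 0.4).
text_hits = {"doc_a": 0.92, "doc_b": 0.80, "doc_c": 0.55}
image_hits = {"doc_b": 0.70, "doc_c": 0.65, "doc_d": 0.30}
ranking = weighted_fuse([text_hits, image_hits], weights=[0.6, 0.4], final_top_k=3)
```

Because the weights are fixed, the same inputs always produce the same ranking, which is what makes this a usable baseline.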

2. Instrument Your Application with Interaction Signals

Before switching to learned fusion, you need to emit signals. Add interaction tracking wherever users engage with search results.
document.querySelectorAll('.search-result').forEach((el, index) => {
  el.addEventListener('click', () => {
    fetch(`${MP_API_URL}/v1/retrievers/interactions`, {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${MP_API_KEY}`,
        'X-Namespace': MP_NAMESPACE,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({
        feature_id: el.dataset.documentId,
        interaction_type: ['click'],
        position: index,
        metadata: { query: currentQuery },
        user_id: userId,
        session_id: sessionId
      })
    });
  });
});
Shortcut: create_interaction_from_result() — pass the full execute response and a position, and the SDK extracts feature_id, execution_id, retriever_id, and feature_uri automatically:
# Assumes `client` is an initialized SDK client and `retriever_id` is already set
results = client.retrievers.execute(retriever_id, inputs={"query": "earbuds", "user_id": "user_456"})
# User clicked the first result
client.retrievers.create_interaction_from_result(results, position=0, user_id="user_456")
# User purchased the third result
client.retrievers.create_interaction_from_result(results, position=2, interaction_type=["purchase"], user_id="user_456")
Always include position. It is recorded for analytics and evaluation — helping you understand which result positions drive engagement. Position is not currently used in reward computation, but is required for accurate NDCG and other rank-aware metrics.
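To illustrate why position matters for rank-aware metrics, the following Python sketch computes NDCG from gains recorded at each served position. The helper names and the gain values (1.0 for a click, 3.0 for a purchase) are illustrative assumptions, not the platform's evaluation code.

```python
import math
from typing import Sequence


def dcg(gains: Sequence[float]) -> float:
    """Discounted cumulative gain; gains[i] is the gain at 0-based position i."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))


def ndcg_at_k(gains: Sequence[float], k: int) -> float:
    """DCG of the served order divided by the DCG of the ideal order."""
    ideal = dcg(sorted(gains, reverse=True)[:k])
    return dcg(gains[:k]) / ideal if ideal > 0 else 0.0


# Gains by served position: a click at position 0 and a purchase at position 2.
served = [1.0, 0.0, 3.0, 0.0, 0.0]
score = ndcg_at_k(served, k=5)
```

Without the recorded position, the discount term cannot be applied and the metric collapses to an unranked count of interactions.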
Which signals to capture depends on your domain:
| Domain | Primary Signals | Why |
| --- | --- | --- |
| E-commerce | purchase, add_to_cart, click | Conversion is the strongest relevance indicator |
| Media / Video | long_view, click, share | Watch completion > click for engagement |
| Enterprise Search | positive_feedback, click, bookmark | Clicks may be obligatory; explicit feedback is clearer |
| Content Matching | click, positive_feedback, skip | Editor accept/reject on matched content |
See the Signal Strength Matrix for the full list of 17 signal types and how they’re weighted.

3. Switch to Learned Fusion

Once interactions are flowing, update the retriever to use learned fusion. You can do this at any time — even with zero interactions (it falls back to uniform weights).
curl -X PATCH "$MP_API_URL/v1/retrievers/$RETRIEVER_ID" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE" \
  -H "Content-Type: application/json" \
  -d '{
    "stages": [
      {
        "stage_name": "feature_search",
        "stage_type": "filter",
        "config": {
          "stage_id": "feature_search",
          "parameters": {
            "searches": [
              {
                "feature_uri": "mixpeek://text_extractor@v1/multilingual_e5_large_instruct_v1",
                "query": "{{INPUT.query}}",
                "top_k": 100
              },
              {
                "feature_uri": "mixpeek://multimodal_extractor@v1/vertex_multimodal_embedding",
                "query": "{{INPUT.query}}",
                "top_k": 100
              }
            ],
            "fusion": "learned",
            "final_top_k": 25
          }
        }
      }
    ]
  }'
No weight fields are needed: the system samples weights from Beta distributions on every query.

4. Understand Cold Start Behavior

With learned fusion enabled, the system handles sparse data automatically through hierarchical fallback:
| User Interactions | What Happens | Effective Behavior |
| --- | --- | --- |
| 0 | Beta(1,1) = uniform prior for all features | Equivalent to RRF |
| < min_interactions (default 5) | Falls back to demographic or global weights | Shared weights from segment or all users |
| >= min_interactions | Personal weights from this user’s history | Per-user individually tuned weights |
The threshold for trusting personal weights is the min_interactions parameter (default 5). Below that, the system falls back up the hierarchy: personal → demographic → global → uniform prior.
If you pass user_id in your search requests, the system tracks personal-level weights automatically. Without user_id, you still get global-level learning — the system learns which features are better overall, just not per-user.
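The fallback order described above (personal, then demographic, then global, then uniform prior) can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the data shapes, the resolve_context helper, counting clicks as "interactions", and applying the same threshold at every level are all hypothetical choices, not documented behavior.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class FeatureStats:
    """Interaction counts for one feature at one context level."""
    clicks: int
    impressions: int


Stats = Dict[str, FeatureStats]  # feature name -> counts


def total_interactions(stats: Optional[Stats]) -> int:
    """ASSUMPTION: the threshold is compared against total clicks."""
    return sum(s.clicks for s in stats.values()) if stats else 0


def resolve_context(
    personal: Optional[Stats],
    demographic: Optional[Stats],
    global_: Optional[Stats],
    features: Tuple[str, ...],
    min_interactions: int = 5,
) -> Tuple[str, Stats]:
    """Return the first level with enough data, else zero counts (Beta(1,1) prior)."""
    levels = (("personal", personal), ("demographic", demographic), ("global", global_))
    for level, stats in levels:
        if stats is not None and total_interactions(stats) >= min_interactions:
            return level, stats
    return "uniform_prior", {f: FeatureStats(clicks=0, impressions=0) for f in features}


features = ("text", "image")
sparse_user = {"text": FeatureStats(2, 10), "image": FeatureStats(1, 10)}
everyone = {"text": FeatureStats(40, 100), "image": FeatureStats(20, 100)}
level, stats = resolve_context(sparse_user, None, everyone, features)
```

Here the user has only three clicks, so the lookup skips the personal level, finds no demographic data, and settles on the global counts.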

5. Execute Searches with User Context

Pass user_id on every search request so the bandit can build per-user weight profiles:
curl -X POST "$MP_API_URL/v1/retrievers/$RETRIEVER_ID/execute" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "query": "wireless noise canceling earbuds",
      "user_id": "user_456"
    }
  }'
Behind the scenes, the Thompson Sampler:
  1. Looks up user_456’s interaction history in ClickHouse
  2. Computes Beta(α, β) per feature: α = 1 + clicks, β = 1 + (impressions - clicks)
  3. Samples a weight from each Beta distribution
  4. Normalizes weights to sum to 1
  5. Executes each feature search and fuses results using the sampled weights
If user_456 has consistently clicked text-matched results over image-matched ones, the text feature’s Beta distribution concentrates near higher values, so sampled weights skew toward text.
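Steps 2 through 4 above can be sketched in a few lines of Python using the standard library's Beta sampler. This is a minimal illustration of the stated formulas (α = 1 + clicks, β = 1 + impressions − clicks), not the service's implementation; sample_fusion_weights and the click counts are hypothetical.

```python
import random
from typing import Dict, Tuple


def sample_fusion_weights(
    stats: Dict[str, Tuple[int, int]], rng: random.Random
) -> Dict[str, float]:
    """stats maps feature name -> (clicks, impressions); returns weights summing to 1."""
    samples: Dict[str, float] = {}
    for feature, (clicks, impressions) in stats.items():
        alpha = 1 + clicks
        beta = 1 + (impressions - clicks)
        samples[feature] = rng.betavariate(alpha, beta)  # one draw per feature
    total = sum(samples.values())
    if total == 0:  # practically unreachable, but avoids dividing by zero
        return {feature: 1.0 / len(samples) for feature in samples}
    return {feature: s / total for feature, s in samples.items()}


# Made-up history: 30 clicks on text-matched results, 5 on image-matched ones.
rng = random.Random(42)  # seeded only so the example is repeatable
weights = sample_fusion_weights({"text": (30, 40), "image": (5, 40)}, rng)
```

Each call draws fresh samples, so two identical queries can receive slightly different weights. That randomness is the exploration that lets under-used features keep getting occasional traffic.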

6. Monitor Convergence

Check whether the learned weights are stabilizing using the analytics endpoint:
curl "$MP_API_URL/v1/analytics/retrievers/$RETRIEVER_ID/signals?signal_type=learned_weights&hours=168" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE"
What to look for:
  • Weights stabilizing — variance decreasing over time means the system is converging
  • Feature dominance — if one feature’s weight approaches 1.0, the other features may not be contributing value
  • Per-segment differences — different user segments learning different weights validates that personalization is working
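One way to turn "variance decreasing over time" into a check is to compare the variance of early and recent weight samples. The Python sketch below assumes you have already extracted a chronological list of one feature's learned weight from the analytics response; the response shape is not shown in this tutorial, and the window size and tolerance are arbitrary illustrative values.

```python
from statistics import pvariance
from typing import List


def rolling_variance(weights: List[float], window: int) -> List[float]:
    """Population variance of each consecutive, non-overlapping window."""
    return [
        pvariance(weights[i:i + window])
        for i in range(0, len(weights) - window + 1, window)
    ]


def is_converging(weights: List[float], window: int = 4, tolerance: float = 1e-3) -> bool:
    """True when the latest window is below tolerance and no noisier than the first."""
    variances = rolling_variance(weights, window)
    if len(variances) < 2:
        return False  # not enough history to judge
    return variances[-1] <= tolerance and variances[-1] <= variances[0]


# Made-up history of the text feature's weight, oldest first.
text_weight_history = [0.35, 0.70, 0.45, 0.62, 0.58, 0.61, 0.60, 0.59]
converging = is_converging(text_weight_history)
```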

7. Measure Improvement

Use evaluations to compare learned fusion against your static baseline. Create an evaluation with the same queries and judge whether learned fusion produces better-ranked results. The key metrics to track:
| Metric | What It Tells You |
| --- | --- |
| CTR at position 1-3 | Are top results more clickable? |
| Mean Reciprocal Rank | Is the first relevant result appearing earlier? |
| Interaction rate | Are users engaging more overall? |
| Weight variance | Is the system still exploring or has it converged? |
Recommended rollout: Run learned fusion on 10% of traffic alongside your static baseline. Compare metrics over 1-2 weeks. If learned fusion wins or ties, ramp to 100%.
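Two of the metrics above can be computed directly from logged click positions. The Python sketch below is illustrative only: it treats the first clicked result as the first relevant result (an assumption), and the position lists are invented for the example, not measured results.

```python
from typing import List, Optional

# Each entry: 0-based position of the first click for one search, or None if no click.
Positions = List[Optional[int]]


def mean_reciprocal_rank(first_clicks: Positions) -> float:
    """Average of 1 / (position + 1); searches without a click contribute 0."""
    if not first_clicks:
        return 0.0
    total = sum(1.0 / (p + 1) for p in first_clicks if p is not None)
    return total / len(first_clicks)


def ctr_at_top_k(first_clicks: Positions, k: int = 3) -> float:
    """Share of searches whose first click landed in the top k positions."""
    if not first_clicks:
        return 0.0
    hits = sum(1 for p in first_clicks if p is not None and p < k)
    return hits / len(first_clicks)


# Invented logs for the two arms of a traffic split.
static_arm: Positions = [0, 4, None, 2, 1]
learned_arm: Positions = [0, 1, None, 0, 1]
comparison = {
    "mrr": (mean_reciprocal_rank(static_arm), mean_reciprocal_rank(learned_arm)),
    "ctr@3": (ctr_at_top_k(static_arm), ctr_at_top_k(learned_arm)),
}
```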

When to Use Each Strategy

| Starting Point | Recommendation |
| --- | --- |
| No interaction data, launching today | Start with rrf: strong default, no tuning needed |
| Domain expert knows feature importance | Start with weighted: encode expert knowledge as initial weights |
| Have 100+ interactions flowing | Switch to learned: let the data decide |
| Multiple user segments with different needs | learned with user_id: per-segment and per-user personalization |
| Need deterministic, reproducible results | Stay with weighted: learned fusion is stochastic by design |

8. Monitor Personalization

Once learned fusion is running, check per-user weights to verify personalization is working:
curl "$MP_API_URL/v1/retrievers/$RETRIEVER_ID/learned-fusion/weights/user_123" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE"
A user with enough interactions shows context_level: "personal". New users fall back to "demographic" or "global" until they cross the min_interactions threshold.

9. Configure Reward Signals

Not all interactions are equal. Customize the reward_map to weight different signals:
{
  "learning_config": {
    "context_features": ["INPUT.user_id"],
    "reward_map": {
      "click": 1.0,
      "purchase": 3.0,
      "add_to_cart": 2.0,
      "positive_feedback": 2.0,
      "negative_feedback": -2.0,
      "skip": -1.0
    }
  }
}
Negative values act as penalties — negative_feedback actively pushes the weight away from the feature that produced the disliked result. See the Reward Signals reference for all 17 signal types and their defaults.
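To show how signed rewards could shape a feature's Beta distribution, here is a small Python sketch. The update rule is an assumption made for illustration (positive rewards add to α, penalties add their magnitude to β); this tutorial does not specify how the service applies the reward_map, and apply_rewards is a hypothetical helper.

```python
from typing import Dict, List, Tuple

REWARD_MAP: Dict[str, float] = {
    "click": 1.0,
    "purchase": 3.0,
    "add_to_cart": 2.0,
    "positive_feedback": 2.0,
    "negative_feedback": -2.0,
    "skip": -1.0,
}


def apply_rewards(
    alpha: float, beta: float, interaction_types: List[str], reward_map: Dict[str, float]
) -> Tuple[float, float]:
    """ASSUMED rule: positive rewards raise alpha, penalties raise beta."""
    for interaction_type in interaction_types:
        reward = reward_map.get(interaction_type, 0.0)  # unknown signals are ignored here
        if reward >= 0:
            alpha += reward
        else:
            beta += -reward
    return alpha, beta


# Start from the Beta(1, 1) prior and apply three interactions on one feature.
alpha, beta = apply_rewards(1.0, 1.0, ["click", "purchase", "negative_feedback"], REWARD_MAP)
expected_weight = alpha / (alpha + beta)  # mean of Beta(alpha, beta)
```

Under this rule a purchase moves the distribution three times as far as a click, and a single negative_feedback cancels two clicks' worth of evidence.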

10. Safe Rollout

Before rolling learned fusion to all traffic, use the built-in rollout controls:
  1. Shadow mode — compute learned weights but serve static results. Compare offline.
  2. Traffic splitting — route a percentage of users to learned fusion (rollout_pct: 10).
  3. Kill switch — instantly disable learned fusion if something goes wrong.
  4. Per-user opt-out — exclude internal test accounts or specific users.
# Enable shadow mode (logs learned weights, serves static results)
curl -X PATCH "$MP_API_URL/v1/retrievers/$RETRIEVER_ID" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE" \
  -H "Content-Type: application/json" \
  -d '{"stages": [{"stage_name": "feature_search", "stage_type": "filter", "config": {"stage_id": "feature_search", "parameters": {"fusion": "learned", "learning_config": {"shadow_mode": true, "rollout_pct": 10.0}}}}]}'
See the full Rollout Guide for step-by-step rollout instructions.
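Percentage-based traffic splitting is commonly implemented by hashing the user ID into a stable bucket, so a given user always lands in the same arm. The Python sketch below shows that general technique; it is not a description of how rollout_pct is implemented by the service, and in_rollout and the salt value are hypothetical.

```python
import hashlib


def in_rollout(user_id: str, rollout_pct: float, salt: str = "learned-fusion") -> bool:
    """Deterministically map a user to a bucket in [0, 100) and compare to rollout_pct."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).digest()
    # Use the first 8 bytes as an integer, then reduce to 0.00 .. 99.99.
    bucket = (int.from_bytes(digest[:8], "big") % 10_000) / 100.0
    return bucket < rollout_pct


# The same user gets the same answer on every request.
serves_learned = in_rollout("user_456", rollout_pct=10.0)
```

Changing the salt reshuffles the assignment, which is useful when you want a fresh 10% sample for a new experiment.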

11. Evaluate Learned vs Static

Generate an evaluation dataset from real interactions and compare learned fusion against your static baseline:
# Generate eval dataset from interaction history
curl -X POST "$MP_API_URL/v1/retrievers/$RETRIEVER_ID/evaluations/generate-from-interactions" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE" \
  -H "Content-Type: application/json" \
  -d '{"min_interactions": 5}'

# Run evaluation — compare NDCG, MRR, Precision
curl -X POST "$MP_API_URL/v1/retrievers/$RETRIEVER_ID/evaluations" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE" \
  -H "Content-Type: application/json" \
  -d '{"dataset_id": "'"$DATASET_ID"'", "limit": 20}'
Run evaluations on a regular cadence (e.g., weekly or after every 1,000 interactions). If learned fusion regresses vs. static, the kill switch gives you an instant rollback path.
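The cadence and regression check described above can be expressed as two small helpers. This Python sketch is illustrative: the metric names, the tolerance, and the sample values are assumptions, and the numbers are invented rather than real evaluation results.

```python
from typing import Dict, List


def evaluation_due(days_since_last: int, interactions_since_last: int) -> bool:
    """Weekly or after every 1,000 interactions, whichever comes first."""
    return days_since_last >= 7 or interactions_since_last >= 1000


def regressed_metrics(
    static: Dict[str, float], learned: Dict[str, float], tolerance: float = 0.01
) -> List[str]:
    """Names of metrics where learned fusion trails static by more than tolerance."""
    return sorted(
        name for name in static
        if name in learned and learned[name] < static[name] - tolerance
    )


# Invented scores for the two configurations on the same evaluation dataset.
static_scores = {"ndcg": 0.62, "mrr": 0.55, "precision": 0.40}
learned_scores = {"ndcg": 0.66, "mrr": 0.50, "precision": 0.395}
regressions = regressed_metrics(static_scores, learned_scores)
```

A non-empty result is a prompt to investigate before ramping traffic further; whether a single regressed metric justifies using the kill switch is a judgment call for your team.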

Next Steps

Auto-Tune Concept

Full overview of the auto-tune system — how it works, when to use it, and configuration options.

Learned Fusion Deep Dive

Thompson Sampling internals, session adaptation, temporal decay, weight clamping, and exploration decay.

Reward Signals

All 17 interaction types, reward weighting, negative signals, and position bias correction.

Rollout Guide

Shadow mode, traffic splitting, kill switch, per-user opt-out, and preference reset.

Fusion Strategies

Compare all 5 fusion strategies: RRF, DBSF, Weighted, Max, and Learned.

Evaluations

Set up benchmarks to measure whether learned fusion is improving result quality.