The degree to which search results satisfy user information needs, encompassing both the ranking quality and the appropriateness of returned items. Search relevance is the ultimate quality metric for multimodal retrieval systems.
Search relevance is assessed by comparing search results against ground-truth judgments of what constitutes a good result for each query. Human annotators rate each result on a relevance scale, and ranking metrics aggregate those judgments into quality scores. The relevance optimization loop involves evaluating current performance, identifying failure cases, making targeted improvements, and re-evaluating to confirm the gains.
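As a rough illustration of that loop, the sketch below scores a ranked result list against graded human judgments and surfaces the worst-scoring queries as failure cases to inspect. The judgment data, the baseline_search placeholder, and the simple per-query score are hypothetical stand-ins for a real system, not a specific API.

```python
from typing import Callable

# Hypothetical judgment set: query -> {doc_id: graded relevance judgment (0-3)}.
judgments = {
    "red running shoes": {"doc_17": 3, "doc_42": 2, "doc_05": 0},
    "waterproof hiking boots": {"doc_88": 3, "doc_42": 1},
}

def baseline_search(query: str) -> list[str]:
    # Placeholder standing in for the retrieval system under evaluation.
    return ["doc_05", "doc_42", "doc_17", "doc_88"]

def evaluate(search: Callable[[str], list[str]], k: int = 10) -> dict[str, float]:
    """Score each judged query by how much attainable relevance the top-k results capture."""
    scores = {}
    for query, judged in judgments.items():
        results = search(query)[:k]
        gained = sum(judged.get(doc, 0) for doc in results)
        attainable = sum(sorted(judged.values(), reverse=True)[:k])
        scores[query] = gained / attainable if attainable else 0.0
    return scores

# The optimization loop: evaluate, inspect the worst queries, improve, re-evaluate.
before = evaluate(baseline_search)
failure_cases = sorted(before, key=before.get)[:20]  # lowest-scoring queries to analyze
# ... adjust ranking features, fix indexing, or retrain for those cases ...
after = evaluate(baseline_search)  # re-run (with the improved system) to confirm gains
print(before, failure_cases, after)
```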
Key metrics include NDCG (Normalized Discounted Cumulative Gain) for graded relevance, MAP (Mean Average Precision) for binary relevance, MRR (Mean Reciprocal Rank) for the rank of the first relevant result, and precision/recall at various cutoffs. Online metrics include click-through rate, session success rate, and abandonment rate. Evaluation requires relevance judgment datasets created through human annotation or inferred from user behavior.
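These offline metrics can be computed directly from a ranked result list and a set of judgments. The sketch below gives minimal, illustrative implementations of NDCG@k, reciprocal rank (averaged over queries to yield MRR), and precision@k; the document ids and relevance grades are invented for the example.

```python
import math

def dcg(gains: list[float]) -> float:
    # Discounted cumulative gain with the standard log2 position discount.
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

def ndcg_at_k(ranked: list[str], judged: dict[str, int], k: int = 10) -> float:
    """NDCG@k: graded relevance, discounted by rank, normalized by the ideal ordering."""
    gains = [judged.get(doc, 0) for doc in ranked[:k]]
    ideal = sorted(judged.values(), reverse=True)[:k]
    return dcg(gains) / dcg(ideal) if dcg(ideal) > 0 else 0.0

def reciprocal_rank(ranked: list[str], relevant: set[str]) -> float:
    """RR: 1 / rank of the first relevant result; MRR averages this over queries."""
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0

def precision_at_k(ranked: list[str], relevant: set[str], k: int = 10) -> float:
    """Precision@k: fraction of the top-k results that are relevant."""
    top = ranked[:k]
    return sum(doc in relevant for doc in top) / k if top else 0.0

# Example: one query with graded judgments (0-3) and a ranked result list.
judged = {"doc_17": 3, "doc_42": 2, "doc_05": 0}
ranked = ["doc_42", "doc_17", "doc_05", "doc_99"]
relevant = {doc for doc, grade in judged.items() if grade > 0}
print(ndcg_at_k(ranked, judged, k=3))
print(reciprocal_rank(ranked, relevant))
print(precision_at_k(ranked, relevant, k=3))
```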