Get evaluation results
Retrieve evaluation results with all calculated metrics
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
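As a minimal sketch of the header format described above, the snippet below builds (but does not send) a GET request carrying a Bearer token. The base URL, the `/evaluations/{evaluation_id}` path, and the function name `build_results_request` are assumptions for illustration only; this page does not state the actual endpoint path.

```python
import urllib.request


def build_results_request(base_url: str, evaluation_id: str, token: str) -> urllib.request.Request:
    """Build a GET request with a Bearer Authorization header.

    The URL layout is hypothetical; substitute the real endpoint path.
    """
    url = f"{base_url}/evaluations/{evaluation_id}"  # assumed path, not from the docs
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},  # form: Bearer <token>
        method="GET",
    )


# Example with placeholder values; nothing is sent over the network here.
req = build_results_request("https://api.example.com", "eval_123", "YOUR_TOKEN")
```

Passing the returned object to `urllib.request.urlopen` would perform the call; that step is left out because the real host and path are not given here.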
Response
Successful Response
Complete evaluation record with results.
Unique evaluation identifier
ID of retriever being evaluated
ID of dataset used for evaluation
Name of dataset
Evaluation configuration
Current status. One of: pending, in_progress, completed, failed
When evaluation was created
Last update timestamp
Namespace ID
Internal organization ID
Number of queries evaluated
When evaluation completed
Aggregated metrics across all queries
Metrics broken down by K value (keys are string K values like '5', '10', '20')
Total queries in the dataset for this run (= evaluated_queries + skipped_queries).
Number of queries that produced metrics. May be < total_queries when some queries were skipped (skip-and-continue on empty/failing input).
Number of queries skipped during evaluation (empty query_input or a per-query execution failure). Skipped queries do not fail the whole evaluation.
Error message if the evaluation failed
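The following sketch shows one way a client might interpret a returned record: checking whether the status is final, verifying the query-count relationship described above, and ordering the per-K metrics numerically (their keys are strings, so a plain string sort would put '10' before '5'). The field names `total_queries`, `evaluated_queries`, and `skipped_queries` come from the descriptions above; `status`, `metrics_by_k`, the metric name `recall`, and all sample values are assumptions made up for illustration.

```python
from typing import Any, Dict, List, Tuple

# Statuses after which the record should no longer change (per the status list above).
TERMINAL_STATUSES = {"completed", "failed"}


def is_finished(record: Dict[str, Any]) -> bool:
    """True once the evaluation has completed or failed ("status" is an assumed field name)."""
    return record["status"] in TERMINAL_STATUSES


def counts_are_consistent(record: Dict[str, Any]) -> bool:
    """Check total_queries == evaluated_queries + skipped_queries."""
    return record["total_queries"] == record["evaluated_queries"] + record["skipped_queries"]


def metrics_sorted_by_k(record: Dict[str, Any]) -> List[Tuple[int, Dict[str, float]]]:
    """Return (K, metrics) pairs ordered by numeric K ("metrics_by_k" is an assumed field name)."""
    by_k = record.get("metrics_by_k") or {}
    return sorted(((int(k), v) for k, v in by_k.items()), key=lambda item: item[0])


# Made-up record for illustration; real responses contain more fields.
sample = {
    "status": "completed",
    "total_queries": 100,
    "evaluated_queries": 97,
    "skipped_queries": 3,
    "metrics_by_k": {"10": {"recall": 0.8}, "5": {"recall": 0.6}, "20": {"recall": 0.9}},
}

if is_finished(sample) and counts_are_consistent(sample):
    for k, metrics in metrics_sorted_by_k(sample):
        print(k, metrics)
```

Converting the keys with `int` before sorting is the design choice worth noting; a client that reads metrics while the status is still pending or in_progress may find them missing, which is why `metrics_sorted_by_k` tolerates an absent field.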

