This tutorial builds a computer vision pipeline that gets smarter with use. You’ll deploy YOLO as a custom extractor, review detections with annotations, export corrections as training data, and close the loop by uploading improved weights — all through Mixpeek primitives.
A closed-loop object detection system that compounds accuracy over time:
1. **Detect**: Deploy YOLO as a custom extractor. Every image ingested produces bounding boxes, class labels, and detection embeddings.
2. **Review**: Surface low-confidence detections for human review. Annotate each detection as confirmed, corrected, false positive, or missed.
3. **Fine-Tune**: Export annotations as YOLO-format training data. Fine-tune externally and upload improved weights as a new extractor version.
4. **Compound**: Taxonomies auto-classify future detections against your curated ground truth. Clusters discover new categories. Retroactive reapplication improves old data.
Prerequisites: A Mixpeek namespace with an API key. Familiarity with custom extractors and the model registry helps but isn’t required.
Package a YOLO-based detector as a custom extractor. The extractor reads images, runs inference, and outputs detection features — bounding boxes, class labels, and confidence scores.
manifest.py — extractor metadata and feature definitions
Use the exact key names: feature_type, feature_name, embedding_dim, distance_metric. Using name/type/dimensions/distance will silently create zero vector indexes.
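A minimal sketch of what such a feature definition might look like in manifest.py; only the four key names above are confirmed, while the surrounding structure and values are assumptions to adapt to your extractor:

```python
# manifest.py (sketch): only feature_type / feature_name / embedding_dim /
# distance_metric are confirmed key names; the rest is an assumed layout.
MANIFEST = {
    "name": "yolo_detector",    # assumed field; matches the feature URI prefix
    "version": "1.0.0",         # assumed field; matches the feature URI version
    "features": [
        {
            "feature_type": "dense_vector",         # value is an assumption
            "feature_name": "detection_embedding",
            "embedding_dim": 512,                   # matches the 512-dim crop embedding below
            "distance_metric": "cosine",            # value is an assumption
        },
    ],
}
```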
pipeline.py — YOLO inference with LazyModelMixin
```python
import numpy as np
import pandas as pd
from engine.models.lazy import LazyModelMixin
from engine.inference.services import BaseBatchInferenceService
from engine.io import parallel_io


class YOLODetector(LazyModelMixin, BaseBatchInferenceService):
    model_id = "ultralytics/yolov8m"
    model_source = "huggingface"

    def _instantiate_model(self, cached_data):
        from ultralytics import YOLO

        model = YOLO("yolov8m.pt")
        model.to(self._detect_device())
        return model, None

    def _process_batch(self, batch):
        model, _ = self.get_model()
        images = parallel_io(batch["data"].tolist())
        results = model(images, conf=0.25)

        all_detections = []
        all_embeddings = []
        for result in results:
            detections = []
            for box in result.boxes:
                detections.append({
                    "class": result.names[int(box.cls)],
                    "confidence": float(box.conf),
                    "bbox": {
                        "x": float(box.xywh[0][0]),
                        "y": float(box.xywh[0][1]),
                        "w": float(box.xywh[0][2]),
                        "h": float(box.xywh[0][3]),
                    },
                })
            all_detections.append(detections)

            if detections:
                best = max(detections, key=lambda d: d["confidence"])
                crop = result.orig_img[
                    int(best["bbox"]["y"] - best["bbox"]["h"] / 2):int(best["bbox"]["y"] + best["bbox"]["h"] / 2),
                    int(best["bbox"]["x"] - best["bbox"]["w"] / 2):int(best["bbox"]["x"] + best["bbox"]["w"] / 2),
                ]
                embedding = self._embed_crop(crop)
            else:
                embedding = np.zeros(512).tolist()
            all_embeddings.append(embedding)

        batch["detections"] = all_detections
        batch["detection_embedding"] = all_embeddings
        return batch

    def _embed_crop(self, crop):
        # Replace with CLIP or similar for production
        return np.random.randn(512).astype(np.float32).tolist()


def build_steps(extractor_request=None, base_steps=None, **kwargs):
    steps = list(base_steps or [])
    steps.append(YOLODetector())
    return {"steps": steps, "prepare": lambda ds: ds}


def extract(extractor_request=None, base_steps=None, **kwargs):
    result = build_steps(
        extractor_request=extractor_request,
        base_steps=base_steps,
        **kwargs,
    )

    class PipelineResult:
        def __init__(self, steps, prepare):
            self.steps = steps
            self.prepare = prepare

    return PipelineResult(result["steps"], result["prepare"])
```
Your extractor is now available at feature URI mixpeek://yolo_detector@1.0.0/detection_embedding. This URI is the stable contract — retrievers, taxonomies, and clusters all reference it, so you can swap model versions without breaking downstream consumers.
Create a retriever that surfaces detections for human review. Filter by confidence to focus reviewers on borderline cases where the model is least sure.
Focus reviewers on uncertainty. Annotating high-confidence correct detections adds little value. Filtering for confidence < 0.7 routes reviewers to the cases where YOLO is least sure — exactly the training signal you need for the next fine-tune.
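As a rough sketch (the method and parameter names below are assumptions, not the documented SDK surface; only the collection, feature URI, and the < 0.7 filter come from this tutorial):

```python
# Sketch: a review-queue retriever over the detection collection.
# mp.retrievers.create and its parameters are assumed names, not the documented API.
review_queue = mp.retrievers.create(
    name="low-confidence-review",
    collection_id="col_detected_objects",
    feature_uri="mixpeek://yolo_detector@1.0.0/detection_embedding",
    filters={"detections.confidence": {"lt": 0.7}},              # borderline detections only
    sort=[{"field": "detections.confidence", "order": "asc"}],   # least certain first
)
```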
Reviewers examine each detection and record their decision. The payload carries the corrected bounding boxes and class labels — this is what becomes training data.
Each operation is independent — a failure in one does not roll back the others. The response includes per-operation results so you can retry individual failures.
Always include retriever_id and execution_id when annotating retriever results. This provenance link lets you measure which retrievers produce the most approved vs. rejected results — critical for evaluating retriever quality over time.
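A sketch of one such review submission; the bulk method name and the response fields are assumptions, while the label values and payload keys mirror what the export step below reads:

```python
# Sketch: submit reviewer decisions in bulk. The method name and response shape
# are assumptions; label values and payload keys match the export script below.
results = mp.annotations.bulk(
    collection_id="col_detected_objects",
    annotations=[
        {
            "document_id": "doc_abc123",            # hypothetical document id
            "label": "corrected",
            "retriever_id": "ret_low_conf_review",  # provenance: which retriever surfaced it
            "execution_id": "exec_20240601_001",    # provenance: which execution surfaced it
            "payload": {
                "predicted_class": "truck",
                "true_class": "bus",
                "bbox": {"x": 640, "y": 360, "w": 220, "h": 180},  # pixel coordinates
                "image_width": 1920,
                "image_height": 1080,
            },
        },
    ],
)

# Operations are independent, so inspect per-operation results and retry only the failures.
failed = [op for op in results.operations if not op.success]  # field names are assumptions
```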
| Metric | Threshold | What it means |
| --- | --- | --- |
| … | … | Confidence threshold too low, or hard negatives needed |
| Missed rate | > 5% | Model is missing objects — add missed annotations as positive training data |
Track stats over time, not just cumulatively. A model at 90% confirmed overall might be at 60% confirmed on last week’s data if the deployment context changed (new camera angle, different lighting, seasonal changes).
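A quick way to slice the confirmed rate by week from the annotation list (this assumes each annotation exposes a created_at ISO timestamp alongside its label):

```python
from collections import defaultdict
from datetime import datetime

# Weekly confirmed rate; assumes annotations carry `created_at` (ISO-8601 string) and `label`.
by_week = defaultdict(lambda: {"confirmed": 0, "total": 0})
for ann in mp.annotations.list(collection_id="col_detected_objects").items:
    year, week, _ = datetime.fromisoformat(ann.created_at).isocalendar()
    key = f"{year}-W{week:02d}"
    by_week[key]["total"] += 1
    by_week[key]["confirmed"] += ann.label == "confirmed"

for key, stats in sorted(by_week.items()):
    print(f"{key}: {stats['confirmed'] / stats['total']:.0%} confirmed ({stats['total']} annotations)")
```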
Query your annotations and convert them to YOLO format. Every corrected bounding box and confirmed detection becomes a labeled training sample.
```python
import os

confirmed = mp.annotations.list(
    collection_id="col_detected_objects",
    label="confirmed",
)
corrected = mp.annotations.list(
    collection_id="col_detected_objects",
    label="corrected",
)

os.makedirs("dataset/labels", exist_ok=True)
class_map = {}
class_counter = 0

for ann in confirmed.items + corrected.items:
    payload = ann.payload
    true_class = payload.get("true_class", payload.get("predicted_class"))
    bbox = payload.get("bbox", {})
    img_w = payload.get("image_width", 1920)
    img_h = payload.get("image_height", 1080)

    if not bbox or not true_class:
        continue

    if true_class not in class_map:
        class_map[true_class] = class_counter
        class_counter += 1

    # Convert to YOLO format: class x_center y_center width height (normalized)
    x_center = bbox["x"] / img_w
    y_center = bbox["y"] / img_h
    width = bbox["w"] / img_w
    height = bbox["h"] / img_h

    label_file = f"dataset/labels/{ann.document_id}.txt"
    with open(label_file, "a") as f:
        f.write(f"{class_map[true_class]} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}\n")

with open("dataset/classes.txt", "w") as f:
    for name, idx in sorted(class_map.items(), key=lambda x: x[1]):
        f.write(f"{name}\n")

print(f"Exported {len(confirmed.items) + len(corrected.items)} annotations across {len(class_map)} classes")
```
The YOLO format expects one .txt file per image with lines of class x_center y_center width height, all values normalized to [0, 1]. The export script handles this conversion from Mixpeek’s pixel-coordinate annotation payloads.
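The fine-tune itself runs outside Mixpeek. With the Ultralytics trainer it looks roughly like this; the data YAML and hyperparameters are placeholders to adapt to your dataset:

```python
from ultralytics import YOLO

# Start from the same base checkpoint the v1 extractor uses.
model = YOLO("yolov8m.pt")
model.train(
    data="dataset/data.yaml",  # assumed YAML pointing at your image/label dirs and the classes.txt names
    epochs=50,                 # placeholder hyperparameters
    imgsz=640,
    name="yolo_v2_finetuned",  # weights land in runs/detect/yolo_v2_finetuned/weights/
)
```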
Upload fine-tuned weights to the Model Registry as a namespace model. This separates model weights from extractor code, so you can iterate on weights without repackaging the extractor.
```bash
# Package weights
tar -czvf yolo_v2_weights.tar.gz ./runs/detect/yolo_v2_finetuned/weights/

# Upload to registry
curl -X POST "$MP_API_URL/v1/namespaces/$NS_ID/models" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -F "file=@yolo_v2_weights.tar.gz" \
  -F "name=yolo-detector" \
  -F "version=2.0.0" \
  -F "model_format=pytorch" \
  -F "task_type=detection" \
  -F "num_gpus=1"

# Deploy to Ray object store
curl -X POST "$MP_API_URL/v1/namespaces/$NS_ID/models/yolo-detector_2_0_0/deploy" \
  -H "Authorization: Bearer $MP_API_KEY"
```
Then reference the deployed weights in your extractor via load_namespace_model("yolo-detector_2_0_0").
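A rough sketch of the v2 pipeline change, assuming load_namespace_model resolves the deployed archive to a local directory containing the fine-tuned weights:

```python
# Sketch: v2 of the detector loads registry weights instead of the static checkpoint.
from engine.models.registry import load_namespace_model  # import path is an assumption


class YOLODetectorV2(YOLODetector):
    def _instantiate_model(self, cached_data):
        from ultralytics import YOLO

        weights_dir = load_namespace_model("yolo-detector_2_0_0")  # assumed to return a local path
        model = YOLO(f"{weights_dir}/best.pt")  # weight filename inside the archive is an assumption
        model.to(self._detect_device())
        return model, None
```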
The new version gets its own feature URI — mixpeek://yolo_detector@2.0.0/detection_embedding — so you can run both versions side by side and compare results before switching production traffic.
Once you have enough confirmed annotations, promote them to a reference collection. Then create a taxonomy that auto-classifies future detections by matching against your curated ground truth.
Retroactive reapplication is a first-class operation, not a data migration. When your reference improves — more annotations, better coverage, new categories — old data automatically re-benefits.
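As a sketch of the flow (method and parameter names below are assumptions; the concepts of a reference collection, a taxonomy over the detection-embedding URI, and retroactive application come from this tutorial):

```python
# Sketch only: mp.collections / mp.taxonomies method names and parameters are assumptions.

# 1. Promote confirmed annotations into a curated reference collection.
reference = mp.collections.create(name="col_ground_truth_objects")

# 2. Create a taxonomy that classifies new detections by matching against
#    the reference collection via the extractor's feature URI.
taxonomy = mp.taxonomies.create(
    name="object-classes",
    reference_collection_id=reference.collection_id,
    feature_uri="mixpeek://yolo_detector@2.0.0/detection_embedding",
)

# 3. Reapply retroactively so previously ingested detections pick up the improved ground truth.
mp.taxonomies.apply(
    taxonomy_id=taxonomy.taxonomy_id,
    collection_id="col_detected_objects",
    mode="retroactive",
)
```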
YOLO might detect “unknown” objects that don’t fit existing classes. Use clustering to group similar unknowns and discover categories you haven’t labeled yet.
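For example (again with assumed method names), a cluster run over the embeddings of low-confidence detections might look like:

```python
# Sketch only: mp.clusters.create and its parameters are assumed names.
clusters = mp.clusters.create(
    collection_id="col_detected_objects",
    feature_uri="mixpeek://yolo_detector@2.0.0/detection_embedding",
    filters={"detections.confidence": {"lt": 0.5}},  # focus on the model's blind spots
)

# Review the resulting groups: coherent, stable clusters are candidates for
# new taxonomy nodes and new training classes in the next fine-tune.
```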
Each Mixpeek primitive contributes to a system that gets better with use:
- **Custom Extractor**: Runs YOLO, produces detections with stable feature URIs. Versioned — v1 and v2 coexist.
- **Annotations**: Captures human corrections — the highest-quality training signal. Bulk API for review queues.
- **Model Registry**: Stores fine-tuned weights. Upload, deploy, version — without repackaging the extractor.
- **Taxonomies**: Auto-classifies detections against curated ground truth. Retroactive mode backfills old data.
- **Clusters**: Discovers object categories you haven't labeled yet. Promote stable clusters to taxonomy nodes.
- **Webhooks**: Triggers downstream actions on every annotation event. No polling required.
The key insight is that these primitives compose. Annotations curate the edges where the model was wrong. Those curated edges become training data and reference collection entries. The reference collection powers taxonomy auto-classification. Clusters discover what you haven’t labeled yet. And every improvement backfills via retroactive taxonomy application — old data re-benefits from every new correction.