
This tutorial builds a computer vision pipeline that gets smarter with use. You’ll deploy YOLO as a custom extractor, review detections with annotations, export corrections as training data, and close the loop by uploading improved weights — all through Mixpeek primitives.
[Figure] Self-improving CV pipeline: YOLO extractor → human review → fine-tune → redeploy, with taxonomies and clusters feeding back into the loop

What You’ll Build

A closed-loop object detection system that compounds accuracy over time:
1. Detect: Deploy YOLO as a custom extractor. Every image ingested produces bounding boxes, class labels, and detection embeddings.
2. Review: Surface low-confidence detections for human review. Annotate each detection as confirmed, corrected, false positive, or missed.
3. Fine-Tune: Export annotations as YOLO-format training data. Fine-tune externally and upload improved weights as a new extractor version.
4. Compound: Taxonomies auto-classify future detections against your curated ground truth. Clusters discover new categories. Retroactive reapplication improves old data.
Prerequisites: A Mixpeek namespace with an API key. Familiarity with custom extractors and the model registry helps but isn’t required.
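The shell examples below assume a few environment variables are already exported. A quick sketch with placeholder values (the base URL is an assumption; substitute your own credentials and IDs):

export MP_API_URL="https://api.mixpeek.com"  # assumed base URL
export MP_API_KEY="sk_..."                   # your namespace API key
export MP_NAMESPACE="ns_..."                 # namespace name for X-Namespace headers
export NS_ID="ns_..."                        # namespace ID used in URL paths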

1. Deploy YOLO as a Custom Extractor

Package a YOLO-based detector as a custom extractor. The extractor reads images, runs inference, and outputs detection features — bounding boxes, class labels, and confidence scores.
feature_extractor_name = "yolo_detector"
version = "1.0.0"
description = "YOLOv8 object detection with bounding boxes and class embeddings"

dependencies = ["ultralytics==8.2.0", "torch>=2.0"]

features = [
    {
        "feature_type": "json",
        "feature_name": "detections",
    },
    {
        "feature_type": "embedding",
        "feature_name": "detection_embedding",
        "embedding_dim": 512,
        "distance_metric": "cosine",
    },
]

output_schema = {
    "detections": {
        "type": "array",
        "items": {
            "type": "object",
            "properties": {
                "class": {"type": "string"},
                "confidence": {"type": "number"},
                "bbox": {
                    "type": "object",
                    "properties": {
                        "x": {"type": "number"},
                        "y": {"type": "number"},
                        "w": {"type": "number"},
                        "h": {"type": "number"},
                    },
                },
            },
        },
    },
    "detection_embedding": {
        "type": "array",
        "items": {"type": "number"},
        "description": "512-dim CLIP embedding of the highest-confidence crop",
    },
}

input_mappings = {"image": "image"}
tier = 1
tier_label = "OBJECT_DETECTION"
compute_profile = {"resource_type": "gpu"}
Use the exact key names feature_type, feature_name, embedding_dim, and distance_metric. If you write name/type/dimensions/distance instead, the config is accepted silently and zero vector indexes are created.
import numpy as np
from engine.models.lazy import LazyModelMixin
from engine.inference.services import BaseBatchInferenceService
from engine.io import parallel_io


class YOLODetector(LazyModelMixin, BaseBatchInferenceService):
    model_id = "ultralytics/yolov8m"
    model_source = "huggingface"

    def _instantiate_model(self, cached_data):
        from ultralytics import YOLO
        model = YOLO("yolov8m.pt")
        model.to(self._detect_device())
        return model, None

    def _process_batch(self, batch):
        model, _ = self.get_model()

        images = parallel_io(batch["data"].tolist())

        results = model(images, conf=0.25)

        all_detections = []
        all_embeddings = []

        for result in results:
            detections = []
            for box in result.boxes:
                detections.append({
                    "class": result.names[int(box.cls)],
                    "confidence": float(box.conf),
                    "bbox": {
                        "x": float(box.xywh[0][0]),
                        "y": float(box.xywh[0][1]),
                        "w": float(box.xywh[0][2]),
                        "h": float(box.xywh[0][3]),
                    },
                })
            all_detections.append(detections)

            if detections:
                # Crop the highest-confidence detection. bbox is center-format
                # (x, y, w, h); convert to corners and clamp to the image bounds
                # so edge detections don't produce empty or inverted slices.
                best = max(detections, key=lambda d: d["confidence"])
                bb = best["bbox"]
                img_h, img_w = result.orig_img.shape[:2]
                y1 = max(0, int(bb["y"] - bb["h"] / 2))
                y2 = min(img_h, int(bb["y"] + bb["h"] / 2))
                x1 = max(0, int(bb["x"] - bb["w"] / 2))
                x2 = min(img_w, int(bb["x"] + bb["w"] / 2))
                crop = result.orig_img[y1:y2, x1:x2]
                embedding = self._embed_crop(crop)
            else:
                # No detections: emit a zero vector so the column stays aligned
                embedding = np.zeros(512).tolist()
            all_embeddings.append(embedding)

        batch["detections"] = all_detections
        batch["detection_embedding"] = all_embeddings
        return batch

    def _embed_crop(self, crop):
        # Replace with CLIP or similar for production
        return np.random.randn(512).astype(np.float32).tolist()


def build_steps(extractor_request=None, base_steps=None, **kwargs):
    steps = list(base_steps or [])
    steps.append(YOLODetector())
    return {"steps": steps, "prepare": lambda ds: ds}


def extract(extractor_request=None, base_steps=None, **kwargs):
    result = build_steps(
        extractor_request=extractor_request,
        base_steps=base_steps, **kwargs
    )
    class PipelineResult:
        def __init__(self, steps, prepare):
            self.steps = steps
            self.prepare = prepare
    return PipelineResult(result["steps"], result["prepare"])
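In production you would replace the random placeholder in _embed_crop with a real image encoder. A minimal sketch using Hugging Face CLIP ViT-B/32, whose image features happen to be 512-dimensional, matching embedding_dim above; the model choice and preprocessing are assumptions, not part of the extractor contract:

import numpy as np
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

_clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
_clip_processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def embed_crop(crop: np.ndarray) -> list:
    """Embed a BGR crop (as sliced from result.orig_img) with CLIP."""
    image = Image.fromarray(crop[:, :, ::-1].copy())  # BGR -> RGB
    inputs = _clip_processor(images=image, return_tensors="pt")
    with torch.no_grad():
        features = _clip.get_image_features(**inputs)
    # Unit-normalize so cosine distance behaves as expected downstream
    features = features / features.norm(dim=-1, keepdim=True)
    return features[0].tolist()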

Upload and Deploy

# Package
zip -r yolo_detector.zip yolo_detector/

# Upload
UPLOAD=$(curl -s -X POST "$MP_API_URL/v1/namespaces/$NS_ID/extractors/uploads" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name": "yolo_detector", "version": "1.0.0", "file_size_bytes": 50000}')

UPLOAD_ID=$(echo "$UPLOAD" | jq -r '.upload_id')
PRESIGNED_URL=$(echo "$UPLOAD" | jq -r '.presigned_url')

curl -s -X PUT "$PRESIGNED_URL" \
  -H "Content-Type: application/zip" \
  --data-binary @yolo_detector.zip

curl -s -X POST "$MP_API_URL/v1/namespaces/$NS_ID/extractors/uploads/$UPLOAD_ID/confirm" \
  -H "Authorization: Bearer $MP_API_KEY"

# Deploy
curl -s -X POST "$MP_API_URL/v1/namespaces/$NS_ID/extractors/yolo_detector_1_0_0/deploy?deployment_type=batch_only" \
  -H "Authorization: Bearer $MP_API_KEY"
Your extractor is now available at feature URI mixpeek://yolo_detector@1.0.0/detection_embedding. This URI is the stable contract — retrievers, taxonomies, and clusters all reference it, so you can swap model versions without breaking downstream consumers.

2. Create a Collection and Ingest Images

Bind the YOLO extractor to a bucket so every uploaded image gets processed automatically.
# Create bucket
curl -s -X POST "$MP_API_URL/v1/namespaces/$NS_ID/buckets" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "security_footage",
    "bucket_schema": {
      "properties": {
        "image": {"type": "image", "required": true},
        "camera_id": {"type": "text"},
        "timestamp": {"type": "text"}
      }
    }
  }'

# Create collection with YOLO extractor
curl -s -X POST "$MP_API_URL/v1/namespaces/$NS_ID/collections" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "detected_objects",
    "feature_extractors": [{
      "feature_extractor_name": "yolo_detector",
      "version": "1.0.0"
    }],
    "source": {
      "type": "bucket",
      "bucket_ids": ["bkt_security_footage"]
    }
  }'

# Upload images
curl -s -X POST "$MP_API_URL/v1/namespaces/$NS_ID/buckets/$BUCKET_ID/objects" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "metadata": {"camera_id": "cam_lobby_01", "timestamp": "2026-05-03T14:30:00Z"},
    "blobs": [{
      "property": "image",
      "type": "image",
      "data": {"url": "s3://my-bucket/footage/frame_001.jpg"}
    }]
  }'
Trigger batch processing to run YOLO across all uploaded images:
curl -s -X POST "$MP_API_URL/v1/namespaces/$NS_ID/buckets/$BUCKET_ID/batches/trigger" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{\"collection_ids\": [\"$COLLECTION_ID\"]}"

3. Build a Retriever for Detection Review

Create a retriever that surfaces detections for human review. Filter by confidence to focus reviewers on borderline cases where the model is least sure.
curl -s -X POST "$MP_API_URL/v1/namespaces/$NS_ID/retrievers" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "detection_review",
    "stages": [{
      "stage_name": "low_confidence",
      "stage_type": "filter",
      "config": {
        "stage_id": "feature_search",
        "parameters": {
          "searches": [{
            "feature_uri": "mixpeek://yolo_detector@1.0.0/detection_embedding",
            "query": "{{INPUT.query}}"
          }],
          "filters": {
            "AND": [{
              "field": "detections.0.confidence",
              "operator": "lt",
              "value": 0.7
            }]
          }
        }
      }
    }]
  }'
Focus reviewers on uncertainty. Annotating high-confidence correct detections adds little value. Filtering for confidence < 0.7 routes reviewers to the cases where YOLO is least sure — exactly the training signal you need for the next fine-tune.
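The same cut can also be applied client-side when assembling a review queue. A minimal sketch over documents shaped like the output_schema from step 1 (the documents list is a stand-in for whatever your retriever returns):

REVIEW_THRESHOLD = 0.7

def needs_review(document: dict) -> bool:
    """True when any detection in the document falls below the review threshold."""
    return any(
        d["confidence"] < REVIEW_THRESHOLD
        for d in document.get("detections", [])
    )

documents = [
    {"detections": [{"class": "car", "confidence": 0.55, "bbox": {}}]},
    {"detections": [{"class": "person", "confidence": 0.93, "bbox": {}}]},
]
review_queue = [doc for doc in documents if needs_review(doc)]  # first doc only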

4. Annotate Detections

Reviewers examine each detection and record their decision. The payload carries the corrected bounding boxes and class labels — this is what becomes training data.

Label Vocabulary

Before annotating, establish a consistent label vocabulary. The stats endpoint groups by exact string match, so consistency matters.

- confirmed: Detection is correct as-is. Becomes a positive training sample that reinforces the model.
- corrected: Bounding box or class was adjusted. Highest-value sample — teaches the model its mistakes.
- false_positive: No real object at this location. Becomes a hard negative that reduces false alarms.
- missed: Object exists but wasn't detected. Added as new ground truth for the next training run.
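Because grouping is by exact string, it helps to pin the vocabulary in code. A tiny sketch (the constant and function names are ours, not part of any SDK):

# The stats endpoint groups by literal label value, so guard against typos.
VALID_LABELS = {"confirmed", "corrected", "false_positive", "missed"}

def check_label(label: str) -> str:
    if label not in VALID_LABELS:
        raise ValueError(f"Unknown label {label!r}; expected one of {sorted(VALID_LABELS)}")
    return label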

Recording Decisions

# Correct detection — class was wrong
curl -X POST "$MP_API_URL/v1/annotations" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE" \
  -H "Content-Type: application/json" \
  -d '{
    "document_id": "doc_frame_001_det_3",
    "collection_id": "col_detected_objects",
    "label": "corrected",
    "confidence": 1.0,
    "reasoning": "Model predicted car, actual object is delivery van.",
    "payload": {
      "predicted_class": "car",
      "true_class": "delivery_van",
      "bbox": {"x": 340, "y": 220, "w": 180, "h": 120},
      "image_width": 1920,
      "image_height": 1080
    },
    "retriever_id": "ret_detection_review",
    "execution_id": "exec_review_batch_01"
  }'

# Confirm correct detection
curl -X POST "$MP_API_URL/v1/annotations" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE" \
  -H "Content-Type: application/json" \
  -d '{
    "document_id": "doc_frame_001_det_1",
    "collection_id": "col_detected_objects",
    "label": "confirmed",
    "confidence": 1.0,
    "payload": {
      "predicted_class": "person",
      "true_class": "person",
      "bbox": {"x": 640, "y": 300, "w": 90, "h": 200}
    },
    "retriever_id": "ret_detection_review",
    "execution_id": "exec_review_batch_01"
  }'

# Reject false positive
curl -X POST "$MP_API_URL/v1/annotations" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE" \
  -H "Content-Type: application/json" \
  -d '{
    "document_id": "doc_frame_001_det_5",
    "collection_id": "col_detected_objects",
    "label": "false_positive",
    "reasoning": "Shadow on wall, not an actual object.",
    "retriever_id": "ret_detection_review",
    "execution_id": "exec_review_batch_01"
  }'
Always include retriever_id and execution_id when annotating retriever results. This provenance link lets you measure which retrievers produce the most approved vs. rejected results — critical for evaluating retriever quality over time.

5. Track Model Quality with Stats

Monitor how your model is performing across review cycles. A rising corrected or false_positive rate signals the model needs retraining.
curl "$MP_API_URL/v1/annotations/stats?collection_id=col_detected_objects" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE"

Interpreting Stats for Retraining Decisions

| Metric | Threshold | What it signals |
| --- | --- | --- |
| Confirmed rate | > 80% | Healthy: model is performing well |
| Corrected rate | > 15% | Class confusion — retrain with corrected examples |
| False positive rate | > 10% | Confidence threshold too low, or hard negatives needed |
| Missed rate | > 5% | Model is missing objects — add missed annotations as positive training data |
Track stats over time, not just cumulatively. A model at 90% confirmed overall might be at 60% confirmed on last week’s data if the deployment context changed (new camera angle, different lighting, seasonal changes).
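These thresholds translate directly into a retrain gate. A sketch that assumes the stats response exposes per-label counts; adapt the parsing to the actual payload shape:

def should_retrain(stats: dict) -> bool:
    """Apply the thresholds from the table above to per-label annotation counts."""
    counts = {row["label"]: row["count"] for row in stats.get("labels", [])}
    total = sum(counts.values()) or 1  # avoid division by zero
    return (
        counts.get("corrected", 0) / total > 0.15
        or counts.get("false_positive", 0) / total > 0.10
        or counts.get("missed", 0) / total > 0.05
    )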

6. Export Annotations as YOLO Training Data

Query your annotations and convert them to YOLO format. Every corrected bounding box and confirmed detection becomes a labeled training sample.
import os

# Assumes `mp` is an initialized Mixpeek SDK client scoped to your namespace
# (client construction omitted).
confirmed = mp.annotations.list(
    collection_id="col_detected_objects",
    label="confirmed",
)
corrected = mp.annotations.list(
    collection_id="col_detected_objects",
    label="corrected",
)

os.makedirs("dataset/labels", exist_ok=True)

class_map = {}
class_counter = 0
exported = 0

for ann in confirmed.items + corrected.items:
    payload = ann.payload
    true_class = payload.get("true_class", payload.get("predicted_class"))
    bbox = payload.get("bbox", {})
    img_w = payload.get("image_width", 1920)
    img_h = payload.get("image_height", 1080)

    if not bbox or not true_class:
        continue

    if true_class not in class_map:
        class_map[true_class] = class_counter
        class_counter += 1

    # Convert to YOLO format: class x_center y_center width height (normalized)
    x_center = bbox["x"] / img_w
    y_center = bbox["y"] / img_h
    width = bbox["w"] / img_w
    height = bbox["h"] / img_h

    # Append: one image can carry several annotations. Clear dataset/labels
    # before re-running to avoid duplicate lines.
    label_file = f"dataset/labels/{ann.document_id}.txt"
    with open(label_file, "a") as f:
        f.write(f"{class_map[true_class]} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}\n")
    exported += 1

with open("dataset/classes.txt", "w") as f:
    for name, _ in sorted(class_map.items(), key=lambda x: x[1]):
        f.write(f"{name}\n")

print(f"Exported {exported} annotations across {len(class_map)} classes")
The YOLO format expects one .txt file per image with lines of class x_center y_center width height, all values normalized to [0, 1]. The export script handles this conversion from Mixpeek’s pixel-coordinate annotation payloads.
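The train call in the next step also expects a dataset config at dataset/data.yaml. A minimal sketch that derives one from the exported classes.txt; the train/val image paths are assumptions, so point them at wherever your frames actually live:

# Generate the Ultralytics data.yaml from the exported class list.
with open("dataset/classes.txt") as f:
    names = [line.strip() for line in f if line.strip()]

with open("dataset/data.yaml", "w") as f:
    f.write("path: dataset\n")
    f.write("train: images/train\n")  # assumed layout
    f.write("val: images/val\n")      # assumed layout
    f.write(f"nc: {len(names)}\n")
    f.write(f"names: {names}\n")      # Python list repr is valid YAML flow style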

7. Fine-Tune and Redeploy

Fine-tune YOLO externally with your exported annotations, then upload the improved weights as a new extractor version.
from ultralytics import YOLO

model = YOLO("yolov8m.pt")
model.train(
    data="dataset/data.yaml",
    epochs=50,
    imgsz=640,
    batch=16,
    name="yolo_v2_finetuned",
)

model.export(format="torchscript")
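Publishing the fine-tuned weights reuses the upload flow from step 1 with a bumped version. A sketch, assuming the v2 archive mirrors the v1 package layout with the new weights inside:

# Package the extractor with the fine-tuned weights and bump the version
zip -r yolo_detector_v2.zip yolo_detector/

UPLOAD=$(curl -s -X POST "$MP_API_URL/v1/namespaces/$NS_ID/extractors/uploads" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name": "yolo_detector", "version": "2.0.0", "file_size_bytes": 50000}')

UPLOAD_ID=$(echo "$UPLOAD" | jq -r '.upload_id')
PRESIGNED_URL=$(echo "$UPLOAD" | jq -r '.presigned_url')

curl -s -X PUT "$PRESIGNED_URL" \
  -H "Content-Type: application/zip" \
  --data-binary @yolo_detector_v2.zip

curl -s -X POST "$MP_API_URL/v1/namespaces/$NS_ID/extractors/uploads/$UPLOAD_ID/confirm" \
  -H "Authorization: Bearer $MP_API_KEY"

curl -s -X POST "$MP_API_URL/v1/namespaces/$NS_ID/extractors/yolo_detector_2_0_0/deploy?deployment_type=batch_only" \
  -H "Authorization: Bearer $MP_API_KEY"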
The new version gets its own feature URI — mixpeek://yolo_detector@2.0.0/detection_embedding — so you can run both versions side by side and compare results before switching production traffic.

8. Auto-Classify Detections with Taxonomies

Once you have enough confirmed annotations, promote them to a reference collection. Then create a taxonomy that auto-classifies future detections by matching against your curated ground truth.
1. Create the taxonomy

curl -s -X POST "$MP_API_URL/v1/namespaces/$NS_ID/taxonomies" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"taxonomy_name\": \"object-catalog\",
    \"taxonomy_type\": \"flat\",
    \"retriever_id\": \"ret_object_catalog_search\",
    \"collection_id\": \"col_verified_objects\",
    \"input_mappings\": [{
      \"source\": \"payload.detection_embedding\",
      \"target\": \"query\"
    }],
    \"enrichment_fields\": [
      {\"source\": \"payload.true_class\", \"target\": \"verified_class\"},
      {\"source\": \"payload.category\", \"target\": \"object_category\"}
    ],
    \"threshold\": 0.75,
    \"execution_mode\": \"materialize\"
  }"
2. Apply to your detection collection

Every new image auto-classifies at ingestion time:
{
  "taxonomy_applications": [
    {
      "taxonomy_id": "tax_object_catalog",
      "execution_mode": "materialize"
    }
  ]
}
3. Backfill existing data when the reference improves

When annotations accumulate and your reference collection gets better, trigger retroactive mode to reclassify all existing detections:
curl -s -X POST "$MP_API_URL/v1/namespaces/$NS_ID/taxonomies/tax_object_catalog/apply" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"execution_mode": "retroactive", "collection_id": "col_detected_objects"}'
Retroactive reapplication is a first-class operation, not a data migration. When your reference improves — more annotations, better coverage, new categories — old data automatically re-benefits.

9. Discover New Categories with Clusters

YOLO might detect “unknown” objects that don’t fit existing classes. Use clustering to group similar unknowns and discover categories you haven’t labeled yet.
curl -s -X POST "$MP_API_URL/v1/namespaces/$NS_ID/clusters" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"cluster_name\": \"unknown-objects\",
    \"collection_id\": \"$COLLECTION_ID\",
    \"feature_uri\": \"mixpeek://yolo_detector@2.0.0/detection_embedding\",
    \"algorithm\": {\"name\": \"hdbscan\", \"params\": {\"min_cluster_size\": 5}},
    \"llm_labeling\": {
      \"enabled\": true,
      \"input_mappings\": [{
        \"source\": \"payload\",
        \"fields\": [\"detections\"]
      }]
    },
    \"dimension_reduction\": {\"method\": \"umap\", \"n_components\": 2}
  }"
Clusters reveal groups like “delivery trucks,” “bicycles,” or “strollers” — objects the base YOLO model might lump together or miss entirely.
1. Review cluster labels

The LLM-generated name gives you a starting point. Review the cluster members to confirm the grouping makes sense.
2. Promote to taxonomy node

The cluster becomes a reference for auto-classification. Future detections matching this cluster are auto-labeled.
3. Add to training data

Confirmed cluster members become training samples for the next YOLO fine-tune — new classes discovered from your own data.

10. Automate the Loop with Webhooks

Wire up webhooks so the pipeline runs without manual intervention. Each annotation event can trigger downstream processing.
curl -X POST "$MP_API_URL/v1/webhooks" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "X-Namespace: $MP_NAMESPACE" \
  -H "Content-Type: application/json" \
  -d '{
    "webhook_name": "detection-review-events",
    "url": "https://your-app.com/webhooks/detections",
    "events": [
      "annotation.created",
      "annotation.updated",
      "batch.completed"
    ]
  }'

Automation Patterns

| Event | Trigger | Action |
| --- | --- | --- |
| annotation.created | Label is confirmed or corrected | Add to reference collection, append to training dataset |
| annotation.created | Label is false_positive | Log as hard negative for next training run |
| Annotation count | Crosses threshold (e.g., 500 new corrections) | Trigger fine-tuning job, export training data |
| batch.completed | New extractor version finishes processing | Run evaluation comparing v1 vs. v2 detection quality |
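On the receiving side, a handler routes each event to one of these actions. A minimal FastAPI sketch; the event payload shape and both helpers are assumptions to adapt to what your webhook actually delivers:

from fastapi import FastAPI, Request

app = FastAPI()

def queue_for_training(annotation: dict) -> None:
    print("queue for training:", annotation.get("document_id"))  # stub

def log_hard_negative(annotation: dict) -> None:
    print("hard negative:", annotation.get("document_id"))  # stub

@app.post("/webhooks/detections")
async def handle_event(request: Request):
    event = await request.json()  # payload shape is an assumption
    if event.get("event") == "annotation.created":
        annotation = event.get("data", {})
        label = annotation.get("label")
        if label in ("confirmed", "corrected"):
            queue_for_training(annotation)
        elif label == "false_positive":
            log_hard_negative(annotation)
    return {"ok": True}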

The Compounding Flywheel

Each Mixpeek primitive contributes to a system that gets better with use:

- Custom Extractor: Runs YOLO, produces detections with stable feature URIs. Versioned — v1 and v2 coexist.
- Annotations: Captures human corrections — the highest-quality training signal. Bulk API for review queues.
- Model Registry: Stores fine-tuned weights. Upload, deploy, version — without repackaging the extractor.
- Taxonomies: Auto-classifies detections against curated ground truth. Retroactive mode backfills old data.
- Clusters: Discovers object categories you haven't labeled yet. Promote stable clusters to taxonomy nodes.
- Webhooks: Triggers downstream actions on every annotation event. No polling required.
The key insight is that these primitives compose. Annotations curate the edges where the model was wrong. Those curated edges become training data and reference collection entries. The reference collection powers taxonomy auto-classification. Clusters discover what you haven’t labeled yet. And every improvement backfills via retroactive taxonomy application — old data re-benefits from every new correction.

Next Steps

- Custom Extractors: Full guide to packaging and deploying custom feature extractors.
- Model Registry: Upload fine-tuned weights, manage versions, and deploy to the inference cluster.
- Taxonomies: Build flat and hierarchical classification systems with retroactive reapplication.
- Clusters: Discover structure in your data with 8 algorithms and LLM-powered labeling.