# Reverse Media Search

> Find visually similar content using images or videos as your search query

<Tip>Reverse media search uses the warehouse's Decompose layer to extract visual features, then queries them in the Reassemble layer. See [Multi-Stage Retrieval](/retrieval/multi-stage-deep-dive) for composing pipelines.</Tip>

## Input Methods

| Method                 | Example                                    | Speed   |
| ---------------------- | ------------------------------------------ | ------- |
| Pre-computed embedding | `{"embedding": [0.1, 0.2, ...]}`           | Fastest |
| Image URL              | `{"url": "https://example.com/img.jpg"}`   | Fast    |
| Video URL              | `{"url": "s3://bucket/video.mp4"}`         | Medium  |
| Base64                 | `{"base64": "data:image/jpeg;base64,..."}` | Fast    |
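
As a rough illustration of how the four input forms differ, the sketch below inspects a query payload shaped like the examples in the table and reports which method it uses. `detect_input_method` is a hypothetical client-side helper, not part of any Mixpeek SDK, and telling video from image URLs by file extension is an assumption made only for this example.

```python
from typing import Any

# Assumption: a URL is treated as video when its path ends with one of these.
VIDEO_EXTENSIONS = (".mp4", ".mov", ".webm", ".mkv")


def detect_input_method(query: dict[str, Any]) -> str:
    """Return the input method a query payload uses (hypothetical helper)."""
    if "embedding" in query:
        return "embedding"
    if "base64" in query:
        return "base64"
    url = query.get("url", "")
    if url:
        # Ignore any query string before checking the extension.
        path = url.split("?", 1)[0].lower()
        return "video_url" if path.endswith(VIDEO_EXTENSIONS) else "image_url"
    raise ValueError("query must contain 'embedding', 'url', or 'base64'")
```

A check like this can be useful for logging or for routing slow video queries to a background job before they reach the API.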

## 1. Create a Bucket

```bash theme={null}
POST /v1/buckets
{
  "bucket_name": "visual-assets",
  "bucket_schema": {
    "properties": {
      "asset_url": { "type": "url", "required": true },
      "brand": { "type": "text" },
      "campaign_id": { "type": "text" }
    }
  }
}
```

## 2. Create a Collection

**For images:**

```bash theme={null}
POST /v1/collections
{
  "collection_name": "product-images",
  "source": { "type": "bucket", "bucket_ids": ["bkt_visual_assets"] },
  "feature_extractor": {
    "feature_extractor_name": "image_extractor",
    "version": "v1",
    "input_mappings": { "image_url": "asset_url" },
    "parameters": { "model": "clip-vit-large-patch14" },
    "field_passthrough": [
      { "source_path": "brand" },
      { "source_path": "campaign_id" }
    ]
  }
}
```

**For videos:**

```bash theme={null}
POST /v1/collections
{
  "collection_name": "video-segments",
  "source": { "type": "bucket", "bucket_ids": ["bkt_visual_assets"] },
  "feature_extractor": {
    "feature_extractor_name": "multimodal_extractor",
    "version": "v1",
    "input_mappings": { "video": "asset_url" },
    "parameters": {
      "scene_detection_threshold": 0.3,
      "extract_keyframes": true
    }
  }
}
```

## 3. Ingest Assets

```bash theme={null}
POST /v1/buckets/{bucket_id}/objects
{
  "key_prefix": "/products/shoes",
  "blobs": [
    { "property": "asset_url", "type": "image", "data": "s3://my-bucket/products/sneaker-001.jpg" }
  ],
  "metadata": {
    "brand": "Nike",
    "campaign_id": "fall-2025"
  }
}
```

## 4. Process

Create a batch from the ingested object IDs, then submit it for processing:

```bash theme={null}
POST /v1/buckets/{bucket_id}/batches
{ "object_ids": ["obj_001", "obj_002"] }

POST /v1/buckets/{bucket_id}/batches/{batch_id}/submit
```
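
For large uploads it may be convenient to split object IDs across several batches. The sketch below only builds request bodies in the shape shown above; `batch_payloads` is a hypothetical helper, and the default `batch_size` of 100 is an arbitrary assumption, not a documented limit.

```python
from typing import Iterator


def batch_payloads(object_ids: list[str], batch_size: int = 100) -> Iterator[dict]:
    """Yield one create-batch request body per chunk of object IDs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(object_ids), batch_size):
        # Each body matches the {"object_ids": [...]} shape used above.
        yield {"object_ids": object_ids[start:start + batch_size]}
```

Each yielded body would be sent to `POST /v1/buckets/{bucket_id}/batches`, followed by a submit call for the returned batch.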

## 5. Create a Retriever

```bash theme={null}
POST /v1/retrievers
{
  "retriever_name": "reverse-image-search",
  "collection_identifiers": ["col_product_images"],
  "input_schema": {
    "query_image": { "type": "image", "required": true },
    "min_similarity": { "type": "number", "default": 0.7 }
  },
  "stages": [
    {
      "stage_name": "visual_search",
      "stage_type": "filter",
      "config": {
        "stage_id": "feature_search",
        "parameters": {
          "searches": [
            {
              "feature_uri": "mixpeek://image_extractor@v1/google_siglip_base_v1",
              "query": {
                "input_mode": "content",
                "value": "{{INPUT.query_image}}"
              },
              "top_k": 50
            }
          ]
        }
      }
    },
    {
      "stage_name": "filter",
      "stage_type": "reduce",
      "config": {
        "stage_id": "score_threshold",
        "parameters": {
          "min_score": "{{INPUT.min_similarity}}"
        }
      }
    }
  ]
}
```

## 6. Search

**With image URL:**

```bash theme={null}
POST /v1/retrievers/{retriever_id}/execute
{
  "inputs": {
    "query_image": "https://example.com/reference.jpg",
    "min_similarity": 0.75
  }
}
```

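**With pre-computed embedding (faster):**

This form assumes the retriever's `input_schema` declares an `embedding` input; the retriever from step 5 declares only `query_image` and `min_similarity`.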

```bash theme={null}
POST /v1/retrievers/{retriever_id}/execute
{
  "inputs": {
    "embedding": [0.1, 0.2, ...]
  }
}
```

**With base64:**

```python theme={null}
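import base64

import requests

retriever_id = "ret_reverse_image_search"  # replace with your retriever ID

# Read the local image and encode it as base64 text
with open("image.jpg", "rb") as f:
    image_data = base64.b64encode(f.read()).decode()

# Send the image as a data URI; add your authentication headers as required
response = requests.post(
    f"https://api.mixpeek.com/v1/retrievers/{retriever_id}/execute",
    json={
        "inputs": {
            "query_image": f"data:image/jpeg;base64,{image_data}"
        }
    },
)
response.raise_for_status()  # fail loudly on HTTP errors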
```
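
The example above hardcodes `image/jpeg` in the data URI. A minimal sketch of deriving the MIME type from the filename instead, using only the standard library, is shown here; `to_data_uri` is a hypothetical helper name, and rejecting non-image types is a choice made for this example.

```python
import base64
import mimetypes


def to_data_uri(data: bytes, filename: str) -> str:
    """Encode raw image bytes as a data URI, inferring the MIME type."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"cannot infer an image MIME type from {filename!r}")
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

The returned string can be passed as the `query_image` input in place of the hand-built f-string.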

## Cross-Modal Search (Image → Video)

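Search videos using a reference image by pointing a `feature_search` stage at the video collection's multimodal embedding. The block below is a single stage definition that goes inside a retriever's `stages` array: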

```bash theme={null}
{
  "stage_name": "cross_modal_search",
  "stage_type": "filter",
  "config": {
    "stage_id": "feature_search",
    "parameters": {
      "searches": [
        {
          "feature_uri": "mixpeek://multimodal_extractor@v1/vertex_multimodal_embedding",
          "query": {
            "input_mode": "content",
            "value": "{{INPUT.query_image}}"
          },
          "top_k": 50
        }
      ]
    }
  }
}
```

## Multi-Collection Search

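Search images and videos together by listing both collections and merging the per-collection results with reciprocal rank fusion (`"fusion": "rrf"`):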

```bash theme={null}
{
  "retriever_name": "visual-federated-search",
  "collection_identifiers": ["col_images", "col_videos"],
  "stages": [
    {
      "stage_name": "federated_search",
      "stage_type": "filter",
      "config": {
        "stage_id": "feature_search",
        "parameters": {
          "searches": [
            {
              "feature_uri": "mixpeek://image_extractor@v1/google_siglip_base_v1",
              "collections": ["col_images"],
              "weight": 0.5,
              "top_k": 25
            },
            {
              "feature_uri": "mixpeek://multimodal_extractor@v1/vertex_multimodal_embedding",
              "collections": ["col_videos"],
              "weight": 0.5,
              "top_k": 25
            }
          ],
          "fusion": "rrf"
        }
      }
    }
  ]
}
```
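
To make the `rrf` fusion setting more concrete, here is a minimal sketch of weighted reciprocal rank fusion in its commonly published form, where each result contributes `weight / (k + rank)` to its document's score. This illustrates the general technique, not Mixpeek's internal implementation; the function name and the conventional constant `k = 60` are assumptions.

```python
from collections import defaultdict


def weighted_rrf(
    ranked_lists: list[list[str]],
    weights: list[float],
    k: int = 60,
) -> list[tuple[str, float]]:
    """Fuse ranked ID lists into one ranking using weighted RRF."""
    scores: dict[str, float] = defaultdict(float)
    for ranking, weight in zip(ranked_lists, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            # Higher-ranked items (small rank) contribute more to the score.
            scores[doc_id] += weight / (k + rank)
    # Sort by fused score, best first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Because RRF uses only rank positions, it can combine results from different embedding models whose raw similarity scores are not directly comparable.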

## Similarity Thresholds

| Score      | Meaning        |
| ---------- | -------------- |
| 0.95+      | Near-duplicate |
| 0.85-0.94  | Very similar   |
| 0.70-0.84  | Related        |
| Below 0.70 | Weak match     |
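
The table can be applied to search results with a small lookup. The sketch below is a hypothetical client-side helper that uses the boundaries from the table as inclusive lower bounds.

```python
def describe_similarity(score: float) -> str:
    """Map a similarity score to the label from the thresholds table."""
    if score >= 0.95:
        return "Near-duplicate"
    if score >= 0.85:
        return "Very similar"
    if score >= 0.70:
        return "Related"
    return "Weak match"
```

Treat these cutoffs as starting points; suitable values depend on the embedding model and the content being compared.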

## Classify with Taxonomies

Auto-tag assets by matching against a reference collection of known brands or product types:

```bash theme={null}
POST /v1/taxonomies
{
  "taxonomy_name": "brand-classifier",
  "taxonomy_type": "flat",
  "retriever_id": "ret_reverse_image_search",
  "input_mappings": {
    "query_image": "mixpeek://image_extractor@v1/google_siglip_base_v1"
  },
  "source_collection": {
    "collection_id": "col_product_images",
    "enrichment_fields": [
      { "field_path": "metadata.brand", "merge_mode": "enrich" }
    ]
  }
}
```

New assets automatically get `metadata.brand` enriched when they visually match a known reference. See [Taxonomies](/enrichment/taxonomies) for hierarchical taxonomies.
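
Conceptually, this kind of enrichment resembles a nearest-reference lookup: compare a new asset's embedding with each labeled reference and copy the label of the best match if it is similar enough. The sketch below illustrates that idea with cosine similarity in plain Python. It is not how Mixpeek implements taxonomies; `classify_brand`, the in-memory reference list, and the 0.7 default threshold are all assumptions for illustration.

```python
import math
from typing import Optional


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify_brand(
    query: list[float],
    references: list[tuple[str, list[float]]],
    min_score: float = 0.7,
) -> Optional[str]:
    """Return the brand of the most similar reference, or None if none qualifies."""
    best_brand, best_score = None, min_score
    for brand, vector in references:
        score = cosine(query, vector)
        if score >= best_score:
            best_brand, best_score = brand, score
    return best_brand
```

With references for two brands, a query embedding close to the first one returns that brand, while a query that is not close enough to either returns `None` and would be left untagged.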

## Discover Clusters

Find visual themes across your asset library:

```bash theme={null}
POST /v1/clusters
{
  "cluster_name": "visual-themes",
  "collection_ids": ["col_product_images"],
  "cluster_type": "vector",
  "vector_config": {
    "feature_uris": ["mixpeek://image_extractor@v1/google_siglip_base_v1"],
    "clustering_method": "hdbscan",
    "algorithm_params": { "min_cluster_size": 10 }
  },
  "llm_labeling": {
    "provider": "openai",
    "model_name": "gpt-4o-mini"
  },
  "dimension_reduction": {
    "method": "umap",
    "components": 2
  }
}
```

Clusters reveal groupings like "product close-ups", "lifestyle shots", and "packaging" without predefined categories. Promote stable clusters to taxonomy nodes. See [Clusters](/enrichment/clusters) for all algorithms.

## Set Up Alerts

Get notified when new assets closely match existing ones, for example to detect counterfeits or duplicates:

```bash theme={null}
POST /v1/alerts
{
  "alert_name": "duplicate-detection",
  "collection_id": "col_product_images",
  "condition": { "field": "taxonomy.detected_brand", "operator": "exists" },
  "notification": { "type": "webhook", "url": "https://example.com/webhook" }
}
```

## Set Up Webhooks

Track batch processing for large asset uploads:

```bash theme={null}
POST /v1/webhooks
{
  "webhook_name": "asset-processing",
  "url": "https://example.com/webhook",
  "events": ["batch.completed", "batch.failed"]
}
```
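
On the receiving side, a handler needs to branch on the event type. The sketch below shows one way to do that for the two events registered above. The payload field names `event` and `batch_id` are assumptions made for illustration; check the actual webhook payload before relying on them.

```python
import json


def handle_webhook(body: bytes) -> str:
    """Summarize a webhook delivery (payload shape is assumed, not documented)."""
    payload = json.loads(body)
    # Assumption: the event name is in "event" and the batch ID in "batch_id".
    event = payload.get("event")
    batch_id = payload.get("batch_id", "unknown")
    if event == "batch.completed":
        return f"batch {batch_id} completed"
    if event == "batch.failed":
        return f"batch {batch_id} failed"
    # Any event this handler did not subscribe to is ignored.
    return "ignored"
```

In a real service this function would be called from the HTTP endpoint registered as the webhook `url`, with the raw request body as its argument.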
