Image Search Capabilities
From visual similarity to natural language queries, build any image search experience with a single API.
Visual Similarity Search
Find visually similar images by analyzing content, color, composition, and semantic meaning.
- Content-based image retrieval using deep embeddings
- Configurable similarity thresholds and ranking
- Works across different resolutions and aspect ratios
Reverse Image Search
Upload an image and find exact matches, near-duplicates, or derivative versions across your entire corpus.
- Detect duplicates and near-duplicates at scale
- Find cropped, resized, or modified versions
- Source attribution and provenance tracking
Object & Scene Recognition
Detect and classify objects, scenes, and actions within images for intelligent categorization.
- Identify thousands of object categories
- Scene classification and context understanding
- Action and activity recognition in images
Text-to-Image Search
Describe what you are looking for in natural language and retrieve matching images instantly.
- Natural language queries mapped to visual content
- Semantic understanding beyond keyword matching
- Multi-language query support via CLIP models
How Image Search Works
From raw images to instant retrieval in five steps. Mixpeek handles the entire pipeline so you can focus on building your application.
Upload Images
Ingest images from object storage (S3, GCS, Azure Blob), URLs, or direct upload via the API.
Feature Extraction
Extract visual features using CLIP, SigLIP, or custom models. Generate dense vector embeddings that capture semantic meaning.
Vector Embedding
Transform extracted features into high-dimensional vectors optimized for similarity search.
Indexing
Index embeddings in Qdrant for sub-millisecond approximate nearest neighbor retrieval at any scale.
Retrieval
Query with images, text, or both. Combine semantic search with metadata filtering for precise results.
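Conceptually, the five steps above reduce to embed → index → scan. The toy sketch below shows that flow in plain Python with a brute-force in-memory index; the filenames and 3-dimensional vectors are illustrative placeholders rather than real CLIP embeddings, and a production system replaces the linear scan with an approximate nearest neighbor index such as Qdrant's.

```python
import math

# Stand-in for steps 2-3: an extractor like CLIP would produce these
# embeddings; here each image is pre-assigned a toy 3-dimensional vector.
index = {
    "sunset.jpg":  [0.9, 0.1, 0.0],
    "beach.jpg":   [0.8, 0.3, 0.1],
    "invoice.png": [0.0, 0.1, 0.9],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def search(query_vec, top_k=2):
    # Stand-in for step 5: an exact scan over every vector. Real systems
    # use ANN structures (e.g. HNSW) so this stays fast at scale.
    scored = [(cosine(query_vec, vec), doc) for doc, vec in index.items()]
    scored.sort(reverse=True)
    return scored[:top_k]

results = search([0.85, 0.2, 0.05])
```

The same shape holds whether the query vector comes from an image or from a text encoder that shares the embedding space, which is what makes text-to-image search possible.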
Search Types
Multiple ways to query your image corpus, each optimized for different use cases and input modalities.
Image-to-Image
Find visually similar images by providing a reference image as the query input.
Text-to-Image
Search your image corpus using natural language descriptions of what you need.
Reverse Image Search
Upload an image to find its source, duplicates, or derivative versions in your collection.
Object Detection Search
Find images containing specific objects, regardless of position, scale, or background.
Combined Search
Combine image queries with text descriptions and metadata filters for multi-signal retrieval.
Batch Processing
Process and index millions of images with distributed pipelines on Ray GPU clusters.
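The combined-search idea above — narrow by metadata first, rank by visual similarity second — can be illustrated with a toy two-stage function. Everything here (the corpus, vectors, and field names) is made up for the example; it is not the Mixpeek retriever API.

```python
import math

# Toy corpus: each entry pairs a placeholder embedding with metadata.
corpus = [
    {"id": "a", "vec": [1.0, 0.0], "meta": {"category": "product"}},
    {"id": "b", "vec": [0.9, 0.4], "meta": {"category": "product"}},
    {"id": "c", "vec": [1.0, 0.1], "meta": {"category": "lifestyle"}},
]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def combined_search(query_vec, where, top_k=5):
    # Stage 1: the metadata filter prunes the candidate set.
    candidates = [d for d in corpus
                  if all(d["meta"].get(k) == v for k, v in where.items())]
    # Stage 2: vector similarity ranks whatever survives the filter.
    candidates.sort(key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return [d["id"] for d in candidates[:top_k]]

hits = combined_search([1.0, 0.05], {"category": "product"})
```

Filtering before ranking is the usual design choice: it keeps the similarity computation confined to documents that can actually appear in the results.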
Industry Applications
Image search powers critical workflows across industries, from retail product discovery to industrial quality assurance.
E-commerce Product Discovery
Let shoppers search by photo, find similar products, and discover visual alternatives.
Digital Asset Management
Organize, deduplicate, and search large image libraries with visual intelligence.
Content Moderation
Detect duplicate uploads, flag inappropriate content, and enforce brand guidelines at scale.
Manufacturing Quality Control
Identify visual defects, compare against reference images, and automate inspection workflows.
Any Model, Any Architecture
Use Mixpeek's built-in image extractors or bring your own models. With 50+ extractors and a flexible plugin system, you have full control over how images are processed and embedded.

- CLIP, SigLIP, and ResNet extractors included out of the box
- Deploy custom PyTorch or ONNX models via the plugin system
- Chain multiple extractors for multi-feature embeddings
- Fine-tune models on your domain data for higher accuracy
- GPU-accelerated inference on Ray clusters
Simple to Integrate
A few lines of code to search your image corpus. Use the Python SDK, JavaScript SDK, or REST API directly.
from mixpeek import Mixpeek

client = Mixpeek(api_key="YOUR_API_KEY")

# Search by image URL
results = client.retrievers.search(
    retriever_id="img-search-retriever",
    queries=[
        {
            "type": "image",
            "value": "https://example.com/query-image.jpg",
            "embedding_model": "mixpeek/clip-base"
        }
    ],
    filters={
        "AND": [
            {"key": "category", "value": "product", "operator": "eq"}
        ]
    },
    top_k=20
)

for result in results:
    print(f"Score: {result.score}, ID: {result.document_id}")

Frequently Asked Questions
What is an image search API?
An image search API allows developers to build applications that search, compare, and retrieve images programmatically. Instead of relying on filenames or manual tags, it uses computer vision and vector embeddings to understand visual content, enabling searches by image similarity, natural language descriptions, or a combination of both.
How does visual similarity search work?
Visual similarity search works by converting images into high-dimensional vector embeddings using deep learning models like CLIP or SigLIP. These embeddings capture semantic features such as objects, colors, textures, and composition. When you query with an image, the system finds vectors closest to it in the embedding space using approximate nearest neighbor search, returning the most visually similar results.
What is reverse image search?
Reverse image search takes an image as input and finds matching or similar images in a database. Unlike text-based search, it analyzes the visual content directly to find exact duplicates, near-duplicates (cropped, resized, or filtered versions), and semantically similar images. Common use cases include source attribution, duplicate detection, and copyright enforcement.
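One common way to catch the near-duplicates described above is perceptual hashing. The sketch below implements a difference hash (dHash), a standard technique: each bit records whether brightness rises between adjacent pixels, so a resized or re-encoded copy yields a hash within a small Hamming distance of the original. The grids stand in for images already decoded and downscaled to grayscale; a real pipeline does that resizing from the actual files, and this is an illustration rather than Mixpeek's internal method.

```python
def dhash(pixels):
    # pixels: rows of grayscale values, already downscaled. Each bit
    # records whether brightness increases from one pixel to the next.
    bits = []
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits.append(1 if left < right else 0)
    return bits

def hamming(a, b):
    # Number of differing bits; small distance suggests a near-duplicate.
    return sum(x != y for x, y in zip(a, b))

# Two tiny grayscale grids: the second is the first with brightness
# noise, as a re-encoded or lightly edited copy would show.
original = [[10, 20, 30, 40, 50, 60, 70, 80, 90],
            [90, 80, 70, 60, 50, 40, 30, 20, 10]]
near_dup = [[12, 21, 19, 41, 52, 61, 72, 81, 92],
            [91, 82, 69, 62, 49, 42, 29, 22, 11]]

distance = hamming(dhash(original), dhash(near_dup))
```

A typical policy treats distances below a small threshold (often a handful of bits on a 64-bit hash) as duplicates, while semantic similarity search handles the looser "visually related" cases.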
What image formats does Mixpeek support?
Mixpeek supports all major image formats including JPEG, PNG, WebP, TIFF, BMP, and GIF (first frame). Images are automatically preprocessed and normalized during ingestion, so you do not need to convert formats before uploading. The API also handles varying resolutions and aspect ratios without manual resizing.
Can I use custom embedding models for image search?
Yes. Mixpeek supports bringing your own models via the plugin system. You can deploy custom PyTorch or ONNX models alongside built-in extractors like CLIP, SigLIP, and ResNet. Custom models run on the same GPU infrastructure and integrate directly into the feature extraction pipeline with no additional setup.
How does image search scale to millions of images?
Mixpeek uses distributed processing on Ray GPU clusters for feature extraction and Qdrant for vector indexing. Qdrant provides sub-millisecond approximate nearest neighbor search even with billions of vectors. Ingestion pipelines automatically scale horizontally, and storage tiering moves cold data to S3 while keeping hot vectors in memory for fast retrieval.
What is the difference between image search and image recognition?
Image recognition identifies and classifies objects, scenes, or attributes within a single image (e.g., 'this image contains a cat'). Image search uses those recognized features to find similar or matching images across a corpus. Mixpeek combines both: recognition happens during feature extraction, and the resulting embeddings power search and retrieval.
Can I combine image search with text and metadata filters?
Yes. Mixpeek retrievers support multi-stage pipelines that combine vector similarity search with metadata filtering. You can search by image similarity and then filter results by date, category, tags, or any custom metadata field. You can also combine image and text queries in a single search for more precise results.
