Mixed to Embeddings Converter
Generate unified vector embeddings from mixed-modality inputs: text, images, audio, and video combined. This enables cross-modal search, where any modality can query any other modality in a single vector space.
How It Works
Provide one or more inputs of any modality.
Each input is processed through its modality-specific encoder.
Modality embeddings are projected into a shared vector space.
A fused embedding is produced that represents the combined input.
The unified embedding enables cross-modal similarity search.
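The projection and fusion steps above can be illustrated with a small, self-contained sketch. It assumes each modality encoder has already produced a vector in the shared space, and it fuses them with a normalized weighted average. The helper names `l2_normalize` and `fuse_weighted_average` are hypothetical and are not part of the Mixpeek API; the service's actual fusion logic may differ.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so that no modality dominates by magnitude."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def fuse_weighted_average(embeddings, weights):
    """Fuse per-modality vectors (hypothetical helper, not a Mixpeek function).

    embeddings: dict mapping modality name -> vector in the shared space
    weights:    dict mapping modality name -> non-negative weight
    """
    if not embeddings:
        raise ValueError("at least one embedding is required")
    dims = {len(vec) for vec in embeddings.values()}
    if len(dims) != 1:
        raise ValueError("all embeddings must share one dimension")
    total = sum(weights[modality] for modality in embeddings)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    fused = [0.0] * dims.pop()
    for modality, vec in embeddings.items():
        share = weights[modality] / total  # rescale weights to sum to 1
        for i, x in enumerate(l2_normalize(vec)):
            fused[i] += share * x
    # Normalize again so the fused vector is ready for cosine comparisons.
    return l2_normalize(fused)


# Toy 3-dimensional vectors standing in for real encoder outputs.
text_vec = [1.0, 0.0, 0.0]
image_vec = [0.0, 1.0, 0.0]
fused = fuse_weighted_average(
    {"text": text_vec, "image": image_vec},
    {"text": 0.4, "image": 0.6},
)
```

With the image weighted more heavily than the text, the fused vector leans toward the image direction while still carrying the text component.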
Code Examples
from mixpeek import Mixpeek

client = Mixpeek(api_key="YOUR_API_KEY")

result = client.convert(
    sources=[
        {"type": "text", "content": "A red sports car on a mountain road"},
        {"type": "image", "url": "https://example.com/car.jpg"},
    ],
    from_format="multimodal",
    to_format="embeddings",
    options={
        "model": "clip-vit-l-14",
        "fusion_strategy": "weighted_average",
        "weights": {"text": 0.4, "image": 0.6},
    },
)

print(f"Fused embedding dim: {len(result.embedding)}")
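Once a fused embedding is available, cross-modal search comes down to comparing it against stored vectors. The following is a minimal sketch of that comparison using cosine similarity over an in-memory dictionary. The item names, the toy vectors, and the helpers `cosine_similarity` and `rank_by_similarity` are all illustrative assumptions; a production setup would more likely use a vector database than a Python dict.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, index):
    """Return (item_id, score) pairs sorted from most to least similar."""
    scored = [(item_id, cosine_similarity(query, vec)) for item_id, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical stored embeddings for items of different modalities,
# all assumed to live in the same shared vector space.
index = {
    "car.jpg": [0.9, 0.1, 0.0],
    "engine.wav": [0.2, 0.9, 0.1],
    "recipe.txt": [0.0, 0.1, 0.9],
}

# Stand-in for a fused query embedding such as result.embedding.
query = [0.8, 0.3, 0.0]

ranking = rank_by_similarity(query, index)
for item_id, score in ranking:
    print(f"{item_id}: {score:.3f}")
```

Because every item shares one vector space, the same ranking function works regardless of whether the stored item was originally an image, an audio clip, or a text document.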
Related Converters
Video to Embeddings
Generate dense vector embeddings for video content using multimodal models. Embeddings capture visual, audio, and temporal features, enabling semantic search and similarity matching across video collections.
Image to Embeddings
Convert images into dense vector representations using state-of-the-art vision models. Embeddings capture semantic visual features and can be used for similarity search, clustering, and cross-modal retrieval.
Audio to Embeddings
Convert audio files into dense vector embeddings that capture spoken content, tone, and acoustic features. Use embeddings for audio search, speaker verification, and content-based recommendation.
Text to Embeddings
Convert text strings, paragraphs, or documents into dense vector embeddings using state-of-the-art language models. Supports batching, chunking, and multiple model options for optimal retrieval performance.
Ready to convert mixed-modality inputs to embeddings?
Start using the Mixpeek Multimodal to Embeddings converter in minutes. Sign up for a free API key and follow the documentation to get started.
