The type or form in which data represents information, such as text, image, audio, video, or tabular data.
Different modalities represent information in different ways: text as sequences of characters, images as pixel matrices, audio as waveforms, and so on.
Each modality has its own data structures, processing pipelines, and storage requirements. Modern systems often need to handle multiple modalities simultaneously and enable cross-modal operations, such as searching images with a text query.
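As a rough illustration of how these representations differ, the following minimal sketch builds toy examples of three modalities using only the Python standard library. The variable names, the 2x3 grayscale image, and the 440 Hz tone sampled at 8000 Hz are all assumptions chosen for illustration; real systems typically use array libraries and encoded file formats instead of plain lists.

```python
import math

# Text: a sequence of characters, shown here as Unicode code points.
text = "cat"
text_codes = [ord(ch) for ch in text]

# Image: a matrix of pixel intensities (2 rows x 3 columns, grayscale 0-255).
image = [
    [0, 128, 255],
    [255, 128, 0],
]
image_shape = (len(image), len(image[0]))

# Audio: a waveform, i.e. amplitude samples taken at regular time steps.
# Assumed values: a 440 Hz sine tone sampled at 8000 Hz, first 8 samples only.
sample_rate = 8000
frequency = 440.0
waveform = [
    math.sin(2 * math.pi * frequency * n / sample_rate) for n in range(8)
]

# Each modality has a different natural "shape", which is one reason each
# needs its own processing pipeline and storage layout.
summary = {
    "text": {"structure": "1-D sequence", "length": len(text_codes)},
    "image": {"structure": "2-D matrix", "shape": image_shape},
    "audio": {"structure": "1-D time series", "samples": len(waveform)},
}

for modality, info in summary.items():
    print(modality, info)
```

The same piece of information (for example, the concept "cat") could appear in any of these forms, which is why cross-modal operations need a way to relate such differently shaped data.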