Choose the plan that fits your needs. Scale as you grow with our flexible pricing options.
Get started with basic features for personal or small projects.
A flexible plan that scales with your usage. Pay only for what you use.
Custom solutions for large-scale enterprise needs with volume discounts.
Each extractor is billed based on what it processes. Costs are measured in credits (1 credit = $0.001).
Extractors are grouped by complexity tier. Higher-tier extractors involve more compute-intensive ML models.
Video, image, and text embedding via Vertex AI (1408D unified space)
Face detection (SCRFD) and recognition (ArcFace 512D embeddings)
Playwright crawling with LLM-based content extraction
PDF layout understanding with optional VLM correction
Video decomposition into scenes, OCR, and transcription
YouTube caption extraction and transcript embedding
Text embedding via E5 (1024D)
Image embedding via CLIP/SigLIP
Text sentiment classification
Storage only, no ML processing
1 credit = $0.001
Credits are deducted from your balance as extractors process data.
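As a quick illustration of the credit arithmetic, the conversion can be sketched as follows. Only the $0.001 rate comes from this page; the names `CREDIT_PRICE_USD`, `credits_to_usd`, and `usd_to_credits` are hypothetical helpers, not part of any Mixpeek SDK.

```python
from decimal import Decimal

# Rate stated on this page: 1 credit = $0.001.
CREDIT_PRICE_USD = Decimal("0.001")


def credits_to_usd(credits: int) -> Decimal:
    """Convert a credit count to US dollars (hypothetical helper)."""
    return credits * CREDIT_PRICE_USD


def usd_to_credits(usd: Decimal) -> int:
    """Convert a dollar amount to whole credits, dropping any remainder."""
    return int(usd / CREDIT_PRICE_USD)


print(credits_to_usd(1000))          # dollars for 1,000 credits
print(usd_to_credits(Decimal("5")))  # credits bought with $5
```

`Decimal` is used instead of `float` so that small per-credit amounts add up without binary rounding drift.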
Pay per unit processed
Each extractor charges based on its input type: minutes of video, images, pages, tokens, etc.
Composable pricing
Chain multiple extractors in a pipeline. You pay only for the extractors you use.
See what it takes to build multimodal processing infrastructure on your own.
| Component | Mixpeek | DIY on |
|---|---|---|
| Video/Image Processing | Included | Lambda + MediaConvert + Rekognition |
| Embedding Generation | Included | SageMaker + Bedrock |
| Vector Search | Included | OpenSearch |
| Storage | $2/GB | S3 + data transfer |
| Pipeline Orchestration | Included | Step Functions + EventBridge |
| Time to Production | Minutes | Months of engineering |
| Ongoing Maintenance | Managed | Dedicated team required |
Each feature extractor charges credits based on what it processes: minutes of video, number of images, text tokens, document pages, and so on. 1 credit = $0.001. Credits are deducted from your account balance as extractors run. You can monitor usage in real time from the dashboard.
Extractors vary in computational complexity. Simple extractors like text embedding use lightweight models and cost as little as 1 credit per 1K tokens. Premium extractors like the multimodal extractor run GPU-intensive models for video segmentation, scene detection, and multimodal embedding, and cost 300 credits per minute of video.
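To make the two quoted rates concrete, here is a minimal sketch of the arithmetic. It assumes billing is linear in the units processed and is not rounded per request, which this page does not confirm. The function names are hypothetical, and the text rate uses the "as little as" figure, so actual text-embedding costs may be higher.

```python
from decimal import Decimal

CREDIT_PRICE_USD = Decimal("0.001")              # 1 credit = $0.001 (from this page)
TEXT_CREDITS_PER_1K_TOKENS = Decimal("1")        # lowest quoted text-embedding rate
MULTIMODAL_CREDITS_PER_MINUTE = Decimal("300")   # quoted multimodal video rate


def text_embedding_credits(tokens: int) -> Decimal:
    """Credits for embedding `tokens` tokens, assuming linear, unrounded billing."""
    return Decimal(tokens) / Decimal(1000) * TEXT_CREDITS_PER_1K_TOKENS


def multimodal_credits(video_minutes: Decimal) -> Decimal:
    """Credits for processing `video_minutes` of video with the multimodal extractor."""
    return video_minutes * MULTIMODAL_CREDITS_PER_MINUTE


def to_usd(credits: Decimal) -> Decimal:
    """Convert credits to US dollars."""
    return credits * CREDIT_PRICE_USD


# 50,000 tokens of text versus a 10-minute video
print(to_usd(text_embedding_credits(50_000)))
print(to_usd(multimodal_credits(Decimal("10"))))
```

Under these assumptions, 50,000 tokens come to 50 credits ($0.05), while a 10-minute video comes to 3,000 credits ($3.00).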
Our usage-based plan charges a $49/month base fee plus your actual usage: the credits consumed by your extractors and the storage you use. Your costs scale with what you process and store.
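A rough monthly estimate can be sketched from the figures on this page. The $49 base fee and the $0.001 credit price are stated here. The $2/GB storage rate is taken from the comparison table, and treating it as a per-month charge is an assumption. `estimate_monthly_bill` is a hypothetical helper, not an official calculator.

```python
from decimal import Decimal

BASE_FEE_USD = Decimal("49")         # monthly base fee (from this page)
CREDIT_PRICE_USD = Decimal("0.001")  # 1 credit = $0.001 (from this page)
STORAGE_USD_PER_GB = Decimal("2")    # ASSUMPTION: the $2/GB rate is billed per month


def estimate_monthly_bill(credits_used: int, storage_gb: Decimal) -> Decimal:
    """Estimate one month's bill: base fee + extractor credits + storage."""
    usage_usd = credits_used * CREDIT_PRICE_USD
    storage_usd = storage_gb * STORAGE_USD_PER_GB
    return BASE_FEE_USD + usage_usd + storage_usd


# Example: 30,000 credits of extraction and 5 GB stored
print(estimate_monthly_bill(30_000, Decimal("5")))
```

With 30,000 credits ($30) and 5 GB of storage ($10 under the assumption above), the estimate is $89 for the month.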
No, our usage-based plan is billed monthly with no long-term commitments. You can upgrade, downgrade, or cancel at any time.
There are no hard limits on the usage-based plan. You'll simply be billed for your actual usage at the end of each billing cycle.
Yes, you can compose multiple extractors in a single collection pipeline. Each extractor is billed independently at its own rate. For example, if you run the multimodal extractor and the face identity extractor on the same video, you pay for each separately.
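Independent billing means a pipeline's cost is the sum of each extractor's own charge. The sketch below illustrates that for one video. The 300 credits/minute multimodal rate comes from this page, but the face identity rate and its per-minute unit are placeholders invented for illustration, since this page does not list them. The `ExtractorCharge` class and `pipeline_cost_usd` function are hypothetical as well.

```python
from dataclasses import dataclass
from decimal import Decimal

CREDIT_PRICE_USD = Decimal("0.001")  # 1 credit = $0.001 (from this page)


@dataclass(frozen=True)
class ExtractorCharge:
    """One extractor's charge: its rate times the units it processed."""
    name: str
    credits_per_unit: Decimal
    units: Decimal

    @property
    def credits(self) -> Decimal:
        return self.credits_per_unit * self.units


def pipeline_cost_usd(charges: list[ExtractorCharge]) -> Decimal:
    """Sum each extractor's credits independently, then convert to dollars."""
    total_credits = sum((c.credits for c in charges), Decimal("0"))
    return total_credits * CREDIT_PRICE_USD


video_minutes = Decimal("10")
charges = [
    # Rate quoted on this page.
    ExtractorCharge("multimodal", Decimal("300"), video_minutes),
    # PLACEHOLDER rate and unit: the real face identity pricing is not given here.
    ExtractorCharge("face_identity", Decimal("50"), video_minutes),
]
print(pipeline_cost_usd(charges))
```

Swapping in the real rate for the second extractor is the only change needed; the summation logic stays the same however many extractors are chained.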
Yes, we offer a 10% discount for annual payments on the usage-based plan. Contact our sales team for more information.