
AWS Bedrock

Use Amazon Titan or Cohere embeddings via Bedrock with Mixpeek Vector Store (MVS) for enterprise vector search

Generate embeddings inside your AWS account via Bedrock: Amazon Titan, Cohere Embed, or any supported model. IAM controls access, VPC endpoints keep traffic private, CloudTrail logs every call. MVS adds vector storage, indexing, and retrieval on top.


Measurable impact from day one

What teams see after connecting AWS Bedrock to Mixpeek

100%

AWS-native security

IAM roles, VPC endpoints, and CloudTrail audit logs: embeddings never leave your AWS account's security perimeter

SOC2/HIPAA

Compliance ready

Bedrock's compliance certifications plus MVS's data isolation give regulated industries a production-ready embedding pipeline

Embedding providers

Amazon Titan, Cohere Embed, and more: switch models through Bedrock without changing infrastructure or pipeline code

No data leaves your perimeter

Bedrock runs in your AWS account and MVS stores vectors under your encryption keys, so source data and embeddings stay inside your perimeter

<1 hr

Enterprise setup

IAM role, Bedrock model access, MVS collection, retriever config: production search infrastructure in under an hour

Full

Audit trail

CloudTrail logs every embedding call, Mixpeek lineage tracks every vector: end-to-end auditability for compliance reviews

The Problem

Enterprises in regulated industries need embedding generation that meets strict security and compliance requirements: IAM-based access control, VPC-private networking, audit trails, and data residency guarantees. Public embedding APIs send data outside the corporate network, which fails compliance reviews for healthcare (HIPAA), financial services (SOC2), and government (FedRAMP) workloads. Even when enterprises can use Bedrock for embedding generation, they still need a vector storage and retrieval layer that matches the same security posture, and most vector databases require separate infrastructure, separate access controls, and separate compliance certifications.

The Solution

AWS Bedrock generates embeddings inside your AWS account: IAM roles control access, VPC endpoints keep traffic private, and CloudTrail logs every invocation. Mixpeek Vector Store provides the retrieval layer: store vectors in MVS collections, configure retrievers with feature search and metadata filtering, and query through a single API. The embedding-to-search pipeline stays within your compliance perimeter. Bedrock supports multiple embedding providers (Amazon Titan, Cohere Embed) so you can select models based on your domain requirements without changing infrastructure.

Pipeline Architecture


Bedrock Embedding

Titan / Cohere

Call AWS Bedrock's invoke_model endpoint with Amazon Titan Embeddings V2 or Cohere Embed. IAM roles control access, VPC endpoints keep traffic private.
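
Titan and Cohere take different request bodies and return different response shapes through the same invoke_model call. The sketch below shows one way to handle both, assuming Titan Text Embeddings V2 and Cohere Embed v3 on Bedrock. The helper names and the choice of `cohere.embed-multilingual-v3` are illustrative and not part of the AWS or Mixpeek SDKs.

```python
import json

TITAN_V2 = "amazon.titan-embed-text-v2:0"
COHERE_MULTILINGUAL = "cohere.embed-multilingual-v3"  # illustrative model choice


def build_request(model_id, text, for_query=False):
    """Return the JSON body that invoke_model expects for the given model."""
    if model_id.startswith("amazon.titan-embed-text-v2"):
        # Titan V2 accepts an optional output size (256, 512, or 1024).
        return json.dumps({"inputText": text, "dimensions": 1024, "normalize": True})
    if model_id.startswith("cohere.embed"):
        # Cohere distinguishes indexed documents from search queries.
        input_type = "search_query" if for_query else "search_document"
        return json.dumps({"texts": [text], "input_type": input_type})
    raise ValueError(f"unsupported embedding model: {model_id}")


def parse_response(model_id, payload):
    """Extract a single embedding vector from a decoded invoke_model response."""
    if model_id.startswith("amazon.titan-embed-text-v2"):
        return payload["embedding"]
    if model_id.startswith("cohere.embed"):
        # Cohere returns one vector per input text; we sent exactly one.
        return payload["embeddings"][0]
    raise ValueError(f"unsupported embedding model: {model_id}")
```

With a `bedrock-runtime` client, the decoded payload would come from `json.loads(bedrock.invoke_model(modelId=model_id, body=build_request(model_id, text))["body"].read())`. Keeping the model-specific logic in two small functions means a model switch touches only the `model_id` value.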

Vector Upsert

MVS Collection

Upsert the embedding vector and metadata to a Mixpeek Vector Store collection. Include source document ID, AWS account context, and custom metadata for lineage.
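
As a rough illustration of the lineage metadata described above, the sketch below assembles one record in the `id` / `values` / `metadata` shape used in the Quick Start. The metadata field names (`source_document_id`, `aws_account_id`, `embedding_model`, `embedded_at`) are assumptions chosen for this example, not a schema that MVS requires.

```python
from datetime import datetime, timezone


def build_record(doc_id, vector, account_id, model_id, extra=None):
    """Assemble one upsert record with lineage metadata attached."""
    # Start from caller-supplied metadata, then set the lineage fields last
    # so custom keys cannot overwrite them.
    metadata = dict(extra or {})
    metadata.update({
        "source_document_id": doc_id,      # hypothetical field name
        "aws_account_id": account_id,      # hypothetical field name
        "embedding_model": model_id,       # hypothetical field name
        "embedded_at": datetime.now(timezone.utc).isoformat(),
    })
    return {"id": doc_id, "values": list(vector), "metadata": metadata}
```

The account ID can be read at runtime with `boto3.client("sts").get_caller_identity()["Account"]`, which avoids hard-coding it in pipeline code.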

Enterprise Access Control

IAM + Metadata Filters

Bedrock access is controlled by IAM policies. MVS metadata filters enforce data partitioning and access control at the retrieval layer: row-level security for vectors.
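
For the IAM half of this step, a least-privilege policy might allow `bedrock:InvokeModel` only on the embedding models a pipeline actually uses. The sketch below builds such a policy document. The function name and statement ID are illustrative, and the policy should be reviewed against your own account's requirements before use.

```python
import json


def embedding_invoke_policy(region, model_ids):
    """Build an IAM policy document allowing InvokeModel on specific models only."""
    # Foundation-model ARNs have an empty account ID segment.
    resources = [
        f"arn:aws:bedrock:{region}::foundation-model/{model_id}"
        for model_id in model_ids
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "InvokeEmbeddingModelsOnly",  # illustrative statement ID
                "Effect": "Allow",
                "Action": "bedrock:InvokeModel",
                "Resource": resources,
            }
        ],
    }


policy = embedding_invoke_policy("us-east-1", ["amazon.titan-embed-text-v2:0"])
print(json.dumps(policy, indent=2))
```

Attaching this policy to the role that runs the indexing job means the job can generate embeddings but cannot invoke other Bedrock models.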

Bulk Indexing

AWS Batch / Step Functions

For large-scale indexing, orchestrate Bedrock embedding calls through AWS Batch or Step Functions. Bulk upsert results to MVS with parallel workers.
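
The parallel-worker pattern can be sketched in a single process, as a stand-in for what one AWS Batch job or Step Functions task might run. Here `embed_fn` and `upsert_fn` are placeholders you would supply (for example a Bedrock call and an MVS upsert). The worker count and batch size are arbitrary starting points, not tuned recommendations.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def bulk_index(docs, embed_fn, upsert_fn, workers=8, batch_size=100):
    """Embed docs in parallel and upsert them batch by batch.

    docs: list of dicts with "id", "text", and optional "metadata".
    Returns the number of records passed to upsert_fn.
    """
    def to_record(doc):
        return {
            "id": doc["id"],
            "values": embed_fn(doc["text"]),
            "metadata": doc.get("metadata", {}),
        }

    indexed = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for batch in chunked(docs, batch_size):
            # pool.map runs embed calls concurrently and preserves input order.
            records = list(pool.map(to_record, batch))
            upsert_fn(records)
            indexed += len(records)
    return indexed
```

Threads suit this workload because each embedding call is network-bound. In practice the worker count is limited by your Bedrock throughput quota, so retries with backoff are worth adding before running this at scale.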

Retriever Configuration

Feature Search + Filters

Configure Mixpeek retrievers with feature search stages, metadata filters for access control, and optional reranking. A single API endpoint serves all query patterns.

Audit Trail

CloudTrail + Lineage

Every Bedrock invocation is logged in CloudTrail. Every MVS upsert and query is tracked in Mixpeek's lineage system. End-to-end auditability from embedding to search result.
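
On the CloudTrail side, recent invocations can be pulled for a compliance review with the `lookup_events` API. This sketch assumes Bedrock runtime calls are visible in your account's CloudTrail event history. The function name and the 24-hour default window are illustrative.

```python
from datetime import datetime, timedelta, timezone


def recent_invocations(cloudtrail, hours=24, event_name="InvokeModel"):
    """Return (event time, username, event id) tuples for recent events.

    cloudtrail: a boto3 CloudTrail client, e.g. boto3.client("cloudtrail").
    """
    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)
    paginator = cloudtrail.get_paginator("lookup_events")
    pages = paginator.paginate(
        LookupAttributes=[{"AttributeKey": "EventName", "AttributeValue": event_name}],
        StartTime=start,
        EndTime=end,
    )
    rows = []
    for page in pages:
        for event in page["Events"]:
            # Username may be absent for some event sources.
            rows.append((event["EventTime"], event.get("Username"), event["EventId"]))
    return rows
```

Calling `recent_invocations(boto3.client("cloudtrail", region_name="us-east-1"))` returns one tuple per logged call, which can then be matched against Mixpeek's lineage records by timestamp.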

AWS Bedrock Integration Deep Dive

Use the AWS SDK (boto3) to call Bedrock's invoke_model endpoint with your chosen embedding model: Amazon Titan Embeddings V2 for general-purpose use, Cohere Embed for multilingual workloads, or any other Bedrock-supported embedding model. The response contains a vector array that you upsert directly to a Mixpeek Vector Store collection along with metadata. IAM policies control which principals can invoke which models, and VPC endpoints route Bedrock traffic through your private network. For bulk indexing, run Bedrock embedding calls through AWS Batch or Step Functions, upserting results to MVS in parallel.

Configure Mixpeek retrievers with feature search stages that query the MVS collection, add metadata filters for access control or data partitioning, and expose the retriever through Mixpeek's search API. At query time, embed the user's query with the same Bedrock model used for indexing, search the MVS index, and return ranked results. Every step is auditable through CloudTrail and Mixpeek's lineage tracking.

Quick Start

aws_bedrock_mvs.py

import json

import boto3
from mixpeek import Mixpeek

MODEL_ID = "amazon.titan-embed-text-v2:0"
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")


def embed(text):
    """Return the Titan embedding vector for a piece of text."""
    response = bedrock.invoke_model(
        modelId=MODEL_ID,
        contentType="application/json",
        accept="application/json",
        body=json.dumps({"inputText": text}),
    )
    return json.loads(response["body"].read())["embedding"]


# 1. Generate embedding with AWS Bedrock
vector = embed("HIPAA compliance audit report")

# 2. Upsert to Mixpeek Vector Store
client = Mixpeek(api_key="YOUR_API_KEY")
client.vector_store.upsert(
    namespace="compliance-docs",
    vectors=[{
        "id": "doc_audit_2026",
        "values": vector,
        "metadata": {"dept": "legal", "classification": "internal"}
    }]
)

# 3. Embed the query with the same model, then search with metadata filters
query_vector = embed("latest HIPAA audit findings")
results = client.vector_store.search(
    namespace="compliance-docs",
    vector=query_vector,
    top_k=10,
    filters={"classification": "internal"}
)

See the full API reference in the Vector Store docs.