Embedding dimensions are locked at namespace creation. You can’t swap the embedding model in place, because the vector index dimensionality is fixed and vectors from different models aren’t comparable. To move to a new model, re-extract your data into a target namespace using the new extractor config, validate it, then cut over.
Use this path whenever you change the embedding model, for example when adopting a new frontier text model or switching text_extractor to a higher-dimensional model. For routine upgrades within the same model family, the model registry hot-swaps the default for new collections and leaves existing ones untouched.
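To make the dimensionality constraint concrete, here is a minimal Python sketch (not part of the platform API): a similarity function that refuses to compare vectors of different lengths. The 384 and 768 dimensions are illustrative assumptions, not the sizes of any particular model.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity that refuses vectors of different dimensionality."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


old_vec = [0.1] * 384  # hypothetical embedding from the old model
new_vec = [0.1] * 768  # hypothetical embedding from the new model

try:
    cosine_similarity(old_vec, new_vec)
except ValueError as err:
    print(err)  # dimension mismatch: 384 vs 768
```

Matching dimensions would not make the comparison meaningful either: two models place the same text at unrelated points, which is why the data has to be re-extracted rather than converted.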
How it works
A migration of type re_extract reads the objects in your source namespace, re-runs extraction with a new feature_extractors config (the new model), and writes the results into a target namespace. Your source stays live and untouched until you choose to cut over.
| Migration type | What it does |
|---|---|
| re_extract | Re-run extraction with new extractor/model config (use this to change embedding models) |
| copy | Copy resources as-is to another namespace (no re-extraction) |
| extend | Add new features to existing documents without a full re-extract |
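Read as a decision rule, the table reduces to a few lines. This is only a sketch: choose_migration_type is a hypothetical helper written for illustration, not an API function.

```python
def choose_migration_type(model_changed: bool, adding_features: bool) -> str:
    """Pick a migration type from the table above (hypothetical helper)."""
    if model_changed:
        # A new embedding model means every vector must be regenerated.
        return "re_extract"
    if adding_features:
        # New features on existing documents, no full re-extract.
        return "extend"
    # Nothing about extraction changes: move the resources as-is.
    return "copy"


print(choose_migration_type(model_changed=True, adding_features=False))
```

A model change takes priority over everything else, because copied embeddings from the old model would not be comparable with new ones.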
1. Validate first (dry run)
Always validate the migration config before committing compute. Validation checks the source, target, and extractor config without extracting anything.
curl -sS -X POST "$MP_API_URL/v1/namespaces/migrations/validate" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "config": {
      "migration_type": "re_extract",
      "source_namespace_id": "ns_prod",
      "target_namespace_name": "prod-v2",
      "feature_extractors": [
        { "feature_extractor_name": "text_extractor", "version": "v1",
          "params": { "embedding_model": "<new-model-id>" } }
      ]
    }
  }'
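If you drive migrations from a script rather than the shell, the same request can be built with Python's standard library. This is a sketch under assumptions: build_validate_request is a hypothetical helper, and the base URL and API key below are placeholders. Only the path, headers, and body shape come from the curl command above.

```python
import json
import urllib.request


def build_validate_request(api_url: str, api_key: str, config: dict) -> urllib.request.Request:
    """Build (but do not send) the validation request shown in the curl example."""
    body = json.dumps({"config": config}).encode("utf-8")
    return urllib.request.Request(
        url=f"{api_url.rstrip('/')}/v1/namespaces/migrations/validate",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


migration_config = {
    "migration_type": "re_extract",
    "source_namespace_id": "ns_prod",
    "target_namespace_name": "prod-v2",
    "feature_extractors": [
        {
            "feature_extractor_name": "text_extractor",
            "version": "v1",
            "params": {"embedding_model": "<new-model-id>"},
        }
    ],
}

# Placeholder URL and key; substitute your own before sending.
request = build_validate_request("https://api.example.com", "YOUR_API_KEY", migration_config)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```

Keeping the config in one dict means the exact object you validate is the one you later submit.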
2. Create the migration
curl -sS -X POST "$MP_API_URL/v1/namespaces/migrations/" \
  -H "Authorization: Bearer $MP_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "config": {
      "migration_type": "re_extract",
      "source_namespace_id": "ns_prod",
      "target_namespace_name": "prod-v2",
      "feature_extractors": [
        { "feature_extractor_name": "text_extractor", "version": "v1",
          "params": { "embedding_model": "<new-model-id>" } }
      ],
      "dry_run": false,
      "webhook_url": "https://example.com/migration-status"
    },
    "start_immediately": false
  }'
The response includes a migration_id.
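Steps 1 and 2 send nearly the same config, so one way to keep them from drifting apart is to derive the create payload from the config you already validated. The sketch below assumes the response is a JSON object with migration_id as a top-level key; the text above only says the response includes it. Both helper names and the sample response are made up for illustration.

```python
import json
from typing import Optional


def build_create_payload(
    validated_config: dict,
    webhook_url: Optional[str] = None,
    start_immediately: bool = False,
) -> dict:
    """Wrap an already-validated config in the body expected by the create call."""
    config = dict(validated_config)  # shallow copy, the caller's dict is not modified
    config["dry_run"] = False
    if webhook_url is not None:
        config["webhook_url"] = webhook_url
    return {"config": config, "start_immediately": start_immediately}


def extract_migration_id(response_body: str) -> str:
    """Pull migration_id out of the response (assumed to be a top-level key)."""
    data = json.loads(response_body)
    migration_id = data.get("migration_id")
    if not migration_id:
        raise KeyError("response has no migration_id")
    return migration_id


payload = build_create_payload(
    {"migration_type": "re_extract", "source_namespace_id": "ns_prod"},
    webhook_url="https://example.com/migration-status",
)
sample_response = '{"migration_id": "mig_example"}'  # made-up response body
print(extract_migration_id(sample_response))
```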
| Config field | Purpose |
|---|---|
| migration_type | re_extract to change the model |
| source_namespace_id | Namespace to migrate from |
| target_namespace_name / target_namespace_id | Where the re-extracted data lands |
| feature_extractors | New extractor config (the new embedding model) |
| filters | Optionally migrate a subset (by collection, date, etc.) |
| batch_options | Tune batch size / parallelism |
| dry_run | Validate only, don’t execute |
| webhook_url | Get progress callbacks |
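A cheap local pre-check can catch misspelled field names before any request is sent. The sketch below takes its field names from the table above and its migration types from the earlier one. Which fields are required is an assumption on my part, and the table may not be exhaustive, so treat the output as warnings. It does not replace the validate endpoint.

```python
KNOWN_FIELDS = {
    "migration_type", "source_namespace_id", "target_namespace_name",
    "target_namespace_id", "feature_extractors", "filters",
    "batch_options", "dry_run", "webhook_url",
}
MIGRATION_TYPES = {"re_extract", "copy", "extend"}
TARGET_FIELDS = {"target_namespace_name", "target_namespace_id"}


def check_config(config: dict) -> list[str]:
    """Return a list of likely problems; an empty list means none were found."""
    problems = [f"unknown field: {key}" for key in sorted(set(config) - KNOWN_FIELDS)]
    migration_type = config.get("migration_type")
    if migration_type not in MIGRATION_TYPES:
        problems.append(f"unexpected migration_type: {migration_type!r}")
    # Assumption: a source and at least one target field are required.
    if "source_namespace_id" not in config:
        problems.append("missing source_namespace_id")
    if not (TARGET_FIELDS & set(config)):
        problems.append("missing target_namespace_name or target_namespace_id")
    if migration_type == "re_extract" and not config.get("feature_extractors"):
        problems.append("re_extract needs a feature_extractors list")
    return problems


draft = {
    "migration_type": "re_extract",
    "source_namespace_id": "ns_prod",
    "target_namespace_name": "prod-v2",
    "webhook": "https://example.com/migration-status",  # typo for webhook_url
}
for problem in check_config(draft):
    print(problem)
```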
3. Start and monitor
# Start the migration (only needed if it was created with "start_immediately": false).
# Replace {migration_id} with the id returned in step 2.
curl -sS -X POST "$MP_API_URL/v1/namespaces/migrations/{migration_id}/start" \
  -H "Authorization: Bearer $MP_API_KEY"
# Poll status
curl -sS "$MP_API_URL/v1/namespaces/migrations/{migration_id}" \
  -H "Authorization: Bearer $MP_API_KEY"
Cancel a running migration with POST /v1/namespaces/migrations/{migration_id}/cancel.
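Polling by hand gets tedious for long migrations, so a small loop helps. This section does not document the status field or its values, so the sketch leaves both to the caller: fetch_status is any function you write that calls the status endpoint and returns a status string, and terminal_statuses is whatever set of final values your API reports. The names and the "running"/"done" strings here are hypothetical.

```python
import time
from typing import Callable


def wait_for_migration(
    fetch_status: Callable[[], str],
    terminal_statuses: set[str],
    poll_seconds: float = 10.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until fetch_status() returns a terminal status, or time out."""
    deadline = clock() + timeout_seconds
    while True:
        status = fetch_status()
        if status in terminal_statuses:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"migration still {status!r} after {timeout_seconds}s")
        sleep(poll_seconds)


# Stand-in for real HTTP calls: three canned statuses, no actual waiting.
canned = iter(["running", "running", "done"])
final = wait_for_migration(lambda: next(canned), {"done"}, sleep=lambda _: None)
print(final)  # done
```

If you set webhook_url when creating the migration, progress callbacks can replace most of this polling.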
4. Cut over
Re-extraction re-pays GPU/extraction cost for every document, so validate on a subset first (use filters) and confirm quality before migrating everything. Once the target namespace is populated and validated:
- Re-run your evaluations against the target namespace to confirm relevance is at least as good.
- Point your application’s X-Namespace (and retrievers) at the target namespace.
- Retire the source namespace when you’re confident.
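The "at least as good" check in the first bullet can be made mechanical. The sketch below compares per-metric evaluation scores for the two namespaces and assumes higher is better for every metric. The metric names and numbers are made up, and ready_to_cut_over is a hypothetical helper, not part of the API.

```python
def ready_to_cut_over(
    source_scores: dict[str, float],
    target_scores: dict[str, float],
    tolerance: float = 0.0,
) -> tuple[bool, list[str]]:
    """Return (ok, regressions); ok is True when no metric got worse."""
    regressions = []
    for metric, source_value in sorted(source_scores.items()):
        target_value = target_scores.get(metric)
        if target_value is None:
            regressions.append(f"{metric}: not measured on target")
        elif target_value < source_value - tolerance:
            regressions.append(f"{metric}: {source_value:.3f} -> {target_value:.3f}")
    return (not regressions, regressions)


# Made-up scores from running the same evaluation set against both namespaces.
source = {"ndcg@10": 0.71, "recall@50": 0.88}
target = {"ndcg@10": 0.74, "recall@50": 0.86}

ok, regressions = ready_to_cut_over(source, target)
print(ok, regressions)
```

With these numbers the check fails on recall@50, which is the signal to investigate before repointing X-Namespace. The tolerance parameter lets you accept small dips that fall within evaluation noise.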
Vectors from different embedding models are not comparable: you cannot mix old and new vectors in the same index, and you cannot copy embeddings across models (only re_extract regenerates them). Plan for the full re-extraction cost.
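To plan for that cost, a filtered subset run gives you a measured figure to extrapolate from. The sketch assumes cost scales linearly with document count, which is a simplification because documents vary in size. The numbers are invented and the unit is whatever you use to track extraction spend.

```python
def estimate_full_cost(subset_docs: int, subset_cost: float, total_docs: int) -> float:
    """Linearly extrapolate the cost of a subset run to the full namespace."""
    if subset_docs <= 0:
        raise ValueError("subset_docs must be positive")
    if total_docs < subset_docs:
        raise ValueError("total_docs cannot be smaller than the subset")
    return subset_cost * total_docs / subset_docs


# Invented figures: a 1,000-document subset cost 4.0 units; the namespace holds 250,000.
print(estimate_full_cost(subset_docs=1_000, subset_cost=4.0, total_docs=250_000))  # 1000.0
```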