
Overview

Branching an AI data environment used to mean one of two things: re-processing your entire corpus (slow, expensive) or experimenting directly in production (dangerous). Mixpeek solves this with a clone-based branching model that operates at every layer of the pipeline — namespace, collection, retriever, and taxonomy — so you can create fully isolated environments instantly and promote changes deliberately. This matters most when:
  • You want to test a new retrieval pipeline on production data without affecting live traffic
  • You need a staging namespace that mirrors production for QA without re-ingesting everything
  • You’re evaluating a new embedding model and want side-by-side comparison on the same corpus
  • A taxonomy schema is about to change and you need a safe place to validate it first

Branching Primitives

Mixpeek resources are immutable by design — you can’t change a collection’s feature extractor or a retriever’s pipeline stages via PATCH. This preserves execution history, dependent results, and audit trails. The branching mechanism is a first-class clone operation available on every resource type.

Namespace Clone (full environment branch)

The most powerful primitive. A namespace clone deep-copies the entire environment: collections (including Qdrant vectors), retrievers, buckets, and optionally taxonomies — remapping all internal IDs so nothing points back to production.
from mixpeek import Mixpeek

client = Mixpeek(api_key="your-api-key")

task = client.namespaces.clone(
    namespace_identifier="ns_prod",
    body={
        "namespace_name": "listings_staging",
        "include_resources": {
            "collections": True,   # copies Qdrant vectors — no reprocessing
            "retrievers": True,    # remaps all collection refs to staging copies
            "taxonomies": False    # optional: include taxonomy configs
        }
    }
)

# Clone runs async — fetch the task to check its status
result = client.tasks.get(task.task_id)
# result.mapping contains { old_id: new_id } for every resource
Namespace clone copies Qdrant vectors directly — your data is not re-processed. The clone is isolated: changes to collections, retrievers, or documents in staging have zero effect on production.
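Because the clone runs as an async task, production scripts usually poll with a timeout rather than calling `tasks.get` once. A minimal, generic poll loop; the `.status` attribute and the `"DONE"`/`"FAILED"` values are assumptions for illustration, not the SDK's confirmed schema:

```python
import time

def wait_for_task(fetch_task, poll_interval=2.0, timeout=300.0):
    """Poll fetch_task() until the task reports a terminal status.

    fetch_task: zero-arg callable returning an object with a `.status`
    attribute (assumed values: "PENDING", "DONE", "FAILED").
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        task = fetch_task()
        if task.status == "DONE":
            return task
        if task.status == "FAILED":
            raise RuntimeError(f"clone task failed: {task!r}")
        time.sleep(poll_interval)
    raise TimeoutError("clone task did not complete before timeout")

# usage sketch:
# result = wait_for_task(lambda: client.tasks.get(task.task_id))
```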
What gets copied:

| Resource | What's cloned | Notes |
| --- | --- | --- |
| Collections | Metadata + Qdrant vectors | Vectors copied, not recomputed |
| Retrievers | Full pipeline stage config | All collection refs remapped to staging |
| Buckets | Metadata only | S3 objects are not duplicated |
| Taxonomies | Config only (optional) | Retriever refs remapped |
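The `{ old_id: new_id }` mapping returned by the clone task is what lets downstream scripts follow the branch: any production IDs baked into your own application config can be rewritten mechanically. A small illustrative helper (plain Python, not part of the SDK):

```python
def remap_ids(config, mapping):
    """Recursively replace any ID found in `mapping` with its
    cloned counterpart. Handles nested dicts and lists."""
    if isinstance(config, dict):
        return {k: remap_ids(v, mapping) for k, v in config.items()}
    if isinstance(config, list):
        return [remap_ids(v, mapping) for v in config]
    return mapping.get(config, config) if isinstance(config, str) else config

# hypothetical application config pointing at production resources
app_config = {
    "retriever_id": "ret_content_search",
    "collections": ["col_content_v1"],
}
staging = remap_ids(app_config, {
    "ret_content_search": "ret_content_search_copy",
    "col_content_v1": "col_content_v1_copy",
})
# staging now references only the cloned resources
```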

Collection Clone (swap extractor or source)

Clone a single collection and optionally override its feature extractor or source. This is the entry point for embedding model experimentation — run two extractors on the same corpus and compare retrieval quality.
# Clone with a different embedding model
new_col = client.collections.clone(
    collection_identifier="col_properties",
    body={
        "collection_name": "properties_siglip_v2",
        "feature_extractor": {
            "feature_extractor_name": "image_extractor",
            "version": "v2",
            "parameters": { "model": "google_siglip_base_v1" }
        }
    }
)

# Trigger reprocessing — required when changing the extractor
client.collections.trigger(new_col.collection_id)
When you clone a collection without changing the feature extractor, vectors are reused. When you change the extractor, you must trigger reprocessing — vectors are model-specific and cannot be ported across embedding spaces.
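A clone script can make that rule explicit with a simple guard: reprocess if and only if the extractor config changed. A sketch (the model identifiers below are illustrative, not a catalog of supported models):

```python
def needs_reprocessing(old_extractor: dict, new_extractor: dict) -> bool:
    """Vectors can only be reused when the extractor config is identical:
    any change to name, version, or parameters (e.g. the model) puts the
    clone in a different embedding space."""
    return old_extractor != new_extractor

clip_v1 = {"feature_extractor_name": "image_extractor", "version": "v1",
           "parameters": {"model": "clip_base"}}            # illustrative
siglip_v2 = {"feature_extractor_name": "image_extractor", "version": "v2",
             "parameters": {"model": "google_siglip_base_v1"}}

needs_reprocessing(clip_v1, clip_v1)    # False: same extractor, vectors reused
needs_reprocessing(clip_v1, siglip_v2)  # True: must trigger reprocessing
```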

Retriever Clone (pipeline experiment)

Because retriever stages are immutable, the safe way to test a new ranking strategy, add a rerank stage, or adjust fusion weights is to clone the retriever with stage overrides.
# Clone retriever and add an MMR rerank stage
new_ret = client.retrievers.clone(
    retriever_identifier="ret_ad_relevance",
    body={
        "retriever_name": "ad_relevance_mmr_v2",
        "stages": [
            {
                "stage_name": "search",
                "stage_type": "filter",
                "config": {
                    "stage_id": "feature_search",
                    "parameters": {
                        "feature_uri": "mixpeek://text_extractor@v1/e5_large",
                        "query": "{{INPUT.query}}",
                        "limit": 50
                    }
                }
            },
            {
                "stage_name": "diversify",
                "stage_type": "sort",
                "config": {
                    "stage_id": "mmr",
                    "parameters": { "lambda": 0.7, "limit": 20 }
                }
            }
        ]
    }
)
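The mmr stage above trades relevance against diversity via its lambda parameter. For intuition, here is a minimal re-implementation of maximal marginal relevance over plain scoring functions; this is illustrative only, not Mixpeek's internal stage:

```python
def mmr(candidates, relevance, similarity, lam=0.7, limit=20):
    """Greedy MMR: at each step pick the candidate maximizing
    lam * relevance(c) - (1 - lam) * max similarity to already-selected."""
    selected = []
    pool = list(candidates)
    while pool and len(selected) < limit:
        def score(c):
            penalty = max((similarity(c, s) for s in selected), default=0.0)
            return lam * relevance(c) - (1 - lam) * penalty
        best = max(pool, key=score)
        selected.append(best)
        pool.remove(best)
    return selected
```

With lam close to 1.0 the stage behaves like a plain relevance sort; lowering it increasingly penalizes results that resemble ones already picked.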

Taxonomy Clone (schema version branch)

Taxonomies are immutable in their core config (retriever_id, input_mappings, enrichment_fields). Clone to branch a schema — swap the backing retriever, adjust the hierarchy, or test a new classification model.
# Branch a taxonomy to test a new classification retriever
new_tax = client.taxonomies.clone(
    taxonomy_identifier="tax_icd_codes",
    body={
        "taxonomy_name": "icd_codes_llama_v2",
        "retriever_id": "ret_llama_classifier"  # only changed field
    }
)

Common Patterns

Pattern 1: Staging environment for a production namespace

The most common use case: a full mirror of production where QA teams can validate changes before go-live.
prod namespace (ns_prod)
├── col_content_v1       (CLIP embeddings, 2M docs)
├── ret_content_search   (semantic + rerank pipeline)
└── tax_iab_v3           (IAB 3.0 taxonomy)

→ clone → staging namespace (ns_staging)
├── col_content_v1_copy  (vectors copied, no reprocessing)
├── ret_content_search_copy  (points to staging collection)
└── [taxonomies excluded]
After cloning, engineers can modify the staging retriever pipeline, run evaluations, and only promote to prod once quality gates pass.

Pattern 2: Embedding model A/B test

Run two embedding models on the same corpus, then compare retrieval quality with Mixpeek’s Evaluations before committing.
# Collection A — existing model (no reprocessing needed)
col_a = "col_listings_clip"   # already exists

# Collection B — new model (clone + reprocess)
col_b = client.collections.clone("col_listings_clip", {
    "collection_name": "col_listings_siglip",
    "feature_extractor": {
        "feature_extractor_name": "image_extractor",
        "version": "v1",
        "parameters": { "model": "google_siglip_base_v1" }
    }
})
client.collections.trigger(col_b.collection_id)

# Retriever A — current model
ret_a = "ret_property_search"

# Retriever B — points to new collection
ret_b = client.retrievers.clone("ret_property_search", {
    "retriever_name": "ret_property_search_siglip",
    "collection_identifiers": [col_b.collection_id]
})

# Run evaluations side by side
eval_a = client.retrievers.run_evaluation("ret_property_search", dataset_id="eval_ds_001")
eval_b = client.retrievers.run_evaluation(ret_b.retriever_id, dataset_id="eval_ds_001")
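The two runs only pay off if they feed a decision rule. A sketch of a promotion gate over the returned metrics; the `precision_at_5` metric name and flat dict shape are assumptions for illustration, not the Evaluations API's confirmed schema:

```python
def passes_gate(baseline: dict, candidate: dict,
                min_lift: float = 0.02, metric: str = "precision_at_5") -> bool:
    """Promote only if the candidate beats the baseline by at least
    min_lift (absolute) on the chosen metric."""
    return candidate[metric] >= baseline[metric] + min_lift

baseline = {"precision_at_5": 0.71}   # hypothetical numbers
candidate = {"precision_at_5": 0.76}
passes_gate(baseline, candidate)      # True: +0.05 lift clears the 0.02 gate
```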

Pattern 3: Retriever pipeline experiment

Test a new retrieval stage (reranker, MMR, query expansion) without touching the live retriever.
# Current: bare semantic search
# Experiment: add query expansion + rerank

exp_retriever = client.retrievers.clone("ret_content_search", {
    "retriever_name": "ret_content_search_exp_rerank",
    "stages": [
        { "stage_id": "query_expand", ... },   # new
        { "stage_id": "feature_search", ... },
        { "stage_id": "rerank", ... }           # new
    ]
})

# Shadow-test: run both retrievers on the same queries, compare metrics

Pattern 4: Taxonomy version checkpoint

Before migrating an IAB or ICD taxonomy schema, snapshot the current version so you can validate in parallel.
# Snapshot before migration
checkpoint = client.taxonomies.clone("tax_iab_content", {
    "taxonomy_name": "tax_iab_content_v3_snapshot"
})

# Apply to existing docs in the new taxonomy to validate
client.taxonomies.apply(
    taxonomy_identifier=checkpoint.taxonomy_id,
    collection_id="col_content_sample_100"
)

Vertical Examples

Problem: Your IAB 2.2 taxonomy needs to be upgraded to IAB 3.0. You can’t migrate production mid-campaign without validating classification quality first.
Solution:
  1. Clone the production namespace → ns_adtech_staging
  2. Clone tax_iab_v2 with the new retriever trained on IAB 3.0 → tax_iab_v3_candidate
  3. Apply tax_iab_v3_candidate to a sample of staging documents
  4. Validate label quality against ground truth
  5. Promote: update production taxonomy config once quality gates pass
No live campaigns are affected. Staging vectors are reused from production (no reprocessing).
Problem: You’re switching the ICD-10 classification retriever from GPT-4o to a fine-tuned clinical model. Patient record classification in production cannot be interrupted.
Solution:
  1. Clone tax_icd10_prod with the new retriever → tax_icd10_clinical_v2
  2. Apply to a test collection of de-identified records
  3. Compare classification accuracy against the production taxonomy’s output on the same records
  4. Only swap production once the new model meets or exceeds current accuracy
Both taxonomies run in parallel. There’s no downtime and no disruption to the production classification pipeline.
Problem: Editorial wants to test a diversity-aware ranking algorithm (MMR) on the content search endpoint before rolling out to all users.
Solution:
  1. Clone ret_content_search → ret_content_search_mmr
  2. Override stages to add an MMR sort step after semantic search
  3. Route 10% of internal QA traffic to the experimental retriever (traffic splitting handled in your application layer)
  4. Compare CTR, dwell time, and result diversity metrics
  5. Promote by cloning the staging retriever config back to production
The collection (and all its vectors) is shared between both retrievers — no extra storage cost.
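Step 3 leaves traffic splitting to the application layer. A deterministic hash-based split keeps each user pinned to the same retriever across requests; the bucket count and retriever names below are illustrative:

```python
import hashlib

def pick_retriever(user_id: str, experiment_pct: int = 10,
                   control: str = "ret_content_search",
                   experiment: str = "ret_content_search_mmr") -> str:
    """Stable 0-99 bucket derived from the user ID; the lowest
    experiment_pct buckets are routed to the experimental retriever."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return experiment if bucket < experiment_pct else control

# the same user always lands on the same retriever
pick_retriever("qa_user_42") == pick_retriever("qa_user_42")  # True
```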
Problem: A new image embedding model shows better performance on architectural/interior photos. You want to validate before migrating 5M property images.
Solution:
  1. Clone col_property_images with the new extractor → col_property_images_v2
  2. Trigger reprocessing (required for model changes — vectors are model-specific)
  3. Create ret_property_search_v2 pointing to the new collection
  4. Run offline evaluation: compare top-5 retrieval precision on a labeled query set
  5. If metrics improve, migrate production: update ret_property_search to point to the new collection
The old collection stays active during migration as a fallback. Rollback is instant — just repoint the retriever.

Promotion Workflow

Branching is only useful if you have a clear path back to production. The recommended flow:
dev branch  →  staging clone  →  eval gate  →  production
     ↑                                              |
     └──────────── rollback (repoint retriever) ───┘
Promoting a retriever experiment to production:
  1. Run Evaluations on the experimental retriever
  2. If metrics pass, delete the old production retriever (after confirming no dependent published pages)
  3. Rename the experimental retriever to the production name via PATCH (name is mutable)
  4. Or: update your application to point to the new retriever ID directly
Rolling back is always safe — because the old retriever and collection still exist unchanged, you can revert by updating the retriever ID in your application config.

Best Practices

  • One namespace per environment (dev, staging, prod). Clone from prod to create staging rather than maintaining them separately — this guarantees staging always reflects current production data and config.
  • Clone, don’t modify. Resist the urge to patch production resources for “quick experiments.” A clone takes seconds and preserves the ability to roll back.
  • Retriever clones are free — they share the underlying collection (and all its vectors). You only pay for additional Qdrant storage when collection vectors diverge.
  • Trigger reprocessing only when the extractor changes. Cloning a collection with the same extractor reuses existing vectors — no GPU time consumed.
  • Use evaluations before promoting. The Evaluations API lets you run offline quality checks on any retriever before it touches production traffic.
  • Name branches consistently. A naming convention like {resource}_staging, {resource}_exp_{date}, or {resource}_v{n} makes it easy to identify which resources are active experiments vs. production.
  • Clean up stale branches. Delete experimental collections and retrievers after promotion or abandonment. Qdrant vectors from branched collections consume storage until deleted.
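A naming convention like the one suggested above is easiest to keep when names are generated rather than typed. A tiny illustrative helper:

```python
from datetime import date

def branch_name(resource: str, kind: str, n: int = None, when: date = None) -> str:
    """Build {resource}_staging, {resource}_exp_{date}, or {resource}_v{n}."""
    if kind == "staging":
        return f"{resource}_staging"
    if kind == "exp":
        return f"{resource}_exp_{(when or date.today()).isoformat()}"
    if kind == "version":
        return f"{resource}_v{n}"
    raise ValueError(f"unknown branch kind: {kind}")

branch_name("ret_content_search", "staging")  # ret_content_search_staging
branch_name("col_listings", "version", n=2)   # col_listings_v2
```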