For full configuration details, parameters, and advanced options, see the Taxonomies reference.
Taxonomies
Auto-classify documents by matching them against reference collections. There are two types:
- Flat: match each document against a single reference collection. When similarity exceeds the threshold, enrichment fields (SKU, category, label) are attached.
- Hierarchical: parent/child nodes with inheritance. Documents traverse levels of refinement (brand → category → subcategory) using different features at each level.

When to Run
| Mode | Runs | Use case |
|---|---|---|
| on_demand | At query time as a retriever stage | Dynamic classification, A/B testing |
| materialize | After extraction, persists to collection | Stable labels, fast queries |
| retroactive | Reapplies when taxonomy updates | Backfill when reference data improves |
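
As a rough illustration of the flat matching described above, the sketch below compares a document embedding against each entry in a reference collection and copies the enrichment fields from the best match when its similarity exceeds the threshold. It is plain Python, not the Mixpeek API: the function names, the dictionary shapes, the use of cosine similarity, the `taxonomy_score` field, and the default threshold of 0.8 are all assumptions made for illustration.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify_flat(doc, reference, threshold=0.8,
                  enrichment_fields=("sku", "category", "label")):
    """Return a copy of `doc`, enriched from the best reference match.

    Hypothetical helper: fields are attached only when the best similarity
    strictly exceeds `threshold`; otherwise the copy is returned unchanged.
    """
    best, best_score = None, threshold
    for entry in reference:
        score = cosine(doc["embedding"], entry["embedding"])
        if score > best_score:
            best, best_score = entry, score

    enriched = dict(doc)
    if best is not None:
        for name in enrichment_fields:
            if name in best:
                enriched[name] = best[name]
        enriched["taxonomy_score"] = best_score  # assumed field name
    return enriched


# Toy reference collection with two entries (made-up data).
reference = [
    {"embedding": [1.0, 0.0], "sku": "SKU-1", "category": "shoes", "label": "sneaker"},
    {"embedding": [0.0, 1.0], "sku": "SKU-2", "category": "bags", "label": "tote"},
]

matched = classify_flat({"document_id": "doc-1", "embedding": [0.9, 0.1]}, reference)
unmatched = classify_flat({"document_id": "doc-2", "embedding": [0.7, 0.7]}, reference)
```

In this toy data, `doc-1` sits close to the first reference entry and picks up its fields, while `doc-2` is equally far from both entries and stays unlabeled. The run mode only changes when this kind of matching happens (at query time, after extraction, or on taxonomy update), not the matching itself.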
Retriever Enrichments
Attach a retriever pipeline to a collection so it runs on every new document. The retriever executes, and selected result fields are written back to the document.

Annotations
Explicit human decisions with full provenance: the ground-truth layer for compliance, review workflows, and improving retrieval quality over time.

What Each Annotation Captures
| Field | Purpose |
|---|---|
| document_id, collection_id | What was reviewed |
| retriever_id, execution_id, stage_name | How it was surfaced |
| label, confidence, reasoning | The decision |
| payload | Structured workflow-specific data (SKU, action, notes) |
| actor_id, actor_type | Who decided (human or model) |
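
One way to picture the fields in the table is as a single record. The sketch below models an annotation as a Python dataclass using the field names listed above. Which fields are required, the types, the 0 to 1 range for `confidence`, and the validation rules are assumptions for illustration, not the platform's actual schema.

```python
from dataclasses import dataclass, field, asdict
from typing import Any, Dict, Optional


@dataclass
class Annotation:
    # What was reviewed
    document_id: str
    collection_id: str
    # The decision
    label: str
    # Who decided
    actor_id: str
    actor_type: str  # "human" or "model"
    confidence: Optional[float] = None  # assumed to lie in [0, 1]
    reasoning: Optional[str] = None
    # How it was surfaced (assumed optional here)
    retriever_id: Optional[str] = None
    execution_id: Optional[str] = None
    stage_name: Optional[str] = None
    # Structured workflow-specific data (SKU, action, notes)
    payload: Dict[str, Any] = field(default_factory=dict)

    def __post_init__(self):
        if self.actor_type not in ("human", "model"):
            raise ValueError("actor_type must be 'human' or 'model'")
        if self.confidence is not None and not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")


# Made-up example values.
ann = Annotation(
    document_id="doc-1",
    collection_id="col-1",
    label="approved",
    actor_id="reviewer-7",
    actor_type="human",
    confidence=0.95,
    payload={"sku": "SKU-1", "action": "accept"},
)
record = asdict(ann)  # plain dict, e.g. for serializing to JSON
```

Keeping provenance fields (`retriever_id`, `execution_id`, `stage_name`) on the same record as the decision is what lets a reviewer later trace how a document was surfaced.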
Bulk Operations
Process review queues at scale with the bulk API.

The Feedback Loop
Annotations feed directly into the platform’s learning cycle:
- Annotations provide explicit ground truth for edge cases
- Learned fusion uses annotations to auto-tune retriever stage weights
- Approved annotations can be piped into reference collections, expanding your taxonomy’s coverage
- Retroactive taxonomy application reclassifies existing documents when annotations improve the reference set
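
The third step in the list, piping approved annotations into a reference collection, can be sketched in plain Python as follows. This is not the Mixpeek API: the `"approved"` label value, the `min_confidence` cutoff, the `source_document_id` field, and the in-memory dictionaries standing in for collections are all hypothetical.

```python
def promote_approved(annotations, documents, reference, min_confidence=0.9):
    """Append reference entries for approved, high-confidence annotations.

    Hypothetical helper. `annotations` is a list of dicts, `documents` maps
    document_id -> document dict, and `reference` is a list mutated in place.
    Returns the number of entries added.
    """
    # Track documents already promoted so reruns do not create duplicates.
    known = {e["source_document_id"] for e in reference if "source_document_id" in e}
    added = 0
    for ann in annotations:
        if ann.get("label") != "approved":
            continue
        if (ann.get("confidence") or 0.0) < min_confidence:
            continue
        doc_id = ann["document_id"]
        if doc_id in known or doc_id not in documents:
            continue
        # The reference entry reuses the document's embedding and takes its
        # enrichment fields from the annotation payload.
        entry = {
            "source_document_id": doc_id,
            "embedding": documents[doc_id]["embedding"],
        }
        entry.update(ann.get("payload", {}))
        reference.append(entry)
        known.add(doc_id)
        added += 1
    return added


# Made-up review results.
annotations = [
    {"document_id": "doc-1", "label": "approved", "confidence": 0.97, "payload": {"sku": "SKU-9"}},
    {"document_id": "doc-2", "label": "rejected", "confidence": 0.99, "payload": {}},
    {"document_id": "doc-3", "label": "approved", "confidence": 0.50, "payload": {"sku": "SKU-3"}},
]
documents = {
    "doc-1": {"embedding": [0.2, 0.8]},
    "doc-2": {"embedding": [0.5, 0.5]},
    "doc-3": {"embedding": [0.9, 0.1]},
}
reference = []

first_run = promote_approved(annotations, documents, reference)
second_run = promote_approved(annotations, documents, reference)
```

Only `doc-1` is promoted here: `doc-2` was rejected and `doc-3` falls below the assumed confidence cutoff. The second call adds nothing because promoted documents are tracked, which keeps the step safe to rerun before a retroactive taxonomy application.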
Choosing an Approach
| Goal | Use |
|---|---|
| Auto-label with a reference catalog | Flat taxonomy (materialize mode) |
| Hierarchical classification (brand → category → SKU) | Hierarchical taxonomy |
| Auto-classify via LLM at ingest | Retriever enrichment with llm_enrich stage |
| Cross-collection joins (enrich from another dataset) | Retriever enrichment with document_enrich stage |
| Human review with audit trail | Annotations |
| Backfill when labels improve | Retroactive taxonomy application |

