How are nested JSON objects handled?

Nested objects are flattened using dot-notation (e.g., 'address.city') before serialization. You can control the max nesting depth and exclude specific paths.
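As a rough illustration of dot-notation flattening, here is a minimal sketch in plain Python. The function name `flatten` and its `max_depth` and `exclude` arguments are hypothetical stand-ins for the depth and path controls described above. They are not the converter's actual option names, and the real implementation may treat arrays and depth limits differently.

```python
from typing import Any


def flatten(obj: dict[str, Any], max_depth: int = 5,
            exclude: frozenset[str] = frozenset(),
            _prefix: str = "", _depth: int = 1) -> dict[str, Any]:
    """Flatten nested dicts into dot-notation keys (illustrative only)."""
    flat: dict[str, Any] = {}
    for key, value in obj.items():
        path = f"{_prefix}.{key}" if _prefix else key
        if path in exclude:
            continue  # skip an excluded path and everything beneath it
        if isinstance(value, dict) and _depth < max_depth:
            # Recurse one level deeper, carrying the dotted prefix along.
            flat.update(flatten(value, max_depth, exclude, path, _depth + 1))
        else:
            # Leaf value, or a dict below max_depth that is kept unflattened.
            flat[path] = value
    return flat


record = {"name": "Ada", "address": {"city": "London", "geo": {"lat": 51.5}}}
flat_record = flatten(record, exclude=frozenset({"address.geo"}))
print(flat_record)  # {'name': 'Ada', 'address.city': 'London'}
```

In this sketch, a dict that sits below the depth limit is kept whole rather than dropped. That is one possible policy, chosen here only for simplicity.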

Can I select specific fields to embed?

Yes. Use the `fields` parameter to specify which keys to include. Only selected fields will be serialized and embedded, reducing noise from irrelevant data.
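To make the effect of field selection concrete, the sketch below keeps only the requested keys and joins them into one text string. The helper `serialize_record` and the "key: value" line format are assumptions made for illustration. The page does not specify which serialization format the converter uses.

```python
def serialize_record(record: dict, fields: list[str]) -> str:
    """Keep only the requested fields and join them as 'key: value' lines."""
    # Fields missing from the record are skipped rather than raising an error.
    parts = [f"{name}: {record[name]}" for name in fields if name in record]
    return "\n".join(parts)


product = {
    "product_id": "p-1",
    "title": "Desk lamp",
    "description": "Warm LED lamp",
    "sku_internal": "X9",  # irrelevant to search, so it is left out
}
text = serialize_record(product, ["title", "description", "category"])
print(text)
```

Here `sku_internal` never reaches the text that would be embedded, which is the noise reduction the `fields` parameter is meant to provide.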

What about JSONL (newline-delimited) files?

JSONL and NDJSON files are fully supported. Each line is treated as a separate record. This is the recommended format for large datasets as it enables streaming processing.
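The streaming benefit comes from parsing one line at a time instead of loading the whole file. The following is a minimal sketch using only the Python standard library. The `iter_jsonl` helper is hypothetical and is not part of the Mixpeek SDK.

```python
import io
import json
from typing import Iterator, TextIO


def iter_jsonl(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of a JSONL/NDJSON stream."""
    for line in stream:
        line = line.strip()
        if line:  # tolerate blank lines between records
            yield json.loads(line)


# An in-memory stream stands in for a large file opened with open(path).
sample = io.StringIO('{"id": 1, "title": "a"}\n\n{"id": 2, "title": "b"}\n')
ids = [rec["id"] for rec in iter_jsonl(sample)]
print(ids)  # [1, 2]
```

Because the generator holds only one record in memory at a time, the same loop works for files far larger than available RAM.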

data

JSON to Embeddings Converter

Convert JSON objects and arrays into semantic vector embeddings. Supports nested structures, field selection, and configurable serialization strategies for optimal embedding quality.

Max file size: 500 MB

Estimated: 1-5 sec per 1000 records

3 input formats

How It Works

Upload a JSON file or provide raw JSON in the request body.

Fields are selected and serialized into text representations.

Text representations are chunked if they exceed model context length.

Each record is embedded using the selected text embedding model.

Embeddings are returned alongside source record identifiers.
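The chunking step above can be sketched as follows. This example splits on whitespace-separated words as a simple stand-in for model tokens. The `chunk_text` helper, its `max_words` and `overlap` parameters, and the use of overlapping windows are all assumptions, since the page does not describe the converter's actual chunking strategy.

```python
def chunk_text(text: str, max_words: int = 200, overlap: int = 20) -> list[str]:
    """Split text into overlapping word windows (a proxy for token limits)."""
    if max_words <= overlap:
        raise ValueError("max_words must be greater than overlap")
    words = text.split()
    if len(words) <= max_words:
        return [text]  # short records are embedded as a single chunk
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


long_text = " ".join(str(i) for i in range(10))
chunks = chunk_text(long_text, max_words=4, overlap=1)
print(chunks)  # ['0 1 2 3', '3 4 5 6', '6 7 8 9']
```

A production pipeline would more likely count tokens with the embedding model's own tokenizer, but the windowing logic stays the same.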

Code Examples

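from mixpeek import Mixpeek

# Replace the placeholder with your own API key.
client = Mixpeek(api_key="YOUR_API_KEY")

# Convert a remote JSON file into embeddings.
result = client.convert(
    source="https://example.com/products.json",
    from_format="json",
    to_format="embeddings",
    options={
        "model": "e5-large-instruct",  # text embedding model
        "fields": ["title", "description", "category"],  # keys to serialize and embed
        "id_field": "product_id"  # key used as the source record identifier
    }
)

# Each returned record pairs a source identifier with its vector.
for record in result.embeddings:
    print(f"{record.id}: dim={len(record.vector)}")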

Use Cases

Embed product catalogs for semantic search

Create vector indexes from API response data

Build recommendation systems from structured metadata

Enable natural-language queries over JSON datasets

Supported Input Formats

JSON

JSONL

NDJSON

Quick Info

Category: data

Max File Size: 500 MB

Est. Time: 1-5 sec per 1000 records

Extractor: structured-data-descriptor

Try This Conversion

Get started with the Mixpeek API and convert your first file in minutes.

Frequently Asked Questions

Related Converters

CSV to Embeddings

Convert CSV files into vector embeddings by selecting and combining columns into text representations. Supports header mapping, custom delimiters, and batch processing for large datasets.

Text to Embeddings

Convert text strings, paragraphs, or documents into dense vector embeddings using state-of-the-art language models. Supports batching, chunking, and multiple model options for optimal retrieval performance.

Multimodal to Embeddings

Generate unified vector embeddings from mixed-modality inputs -- text, images, audio, and video combined. Enables cross-modal search where any modality can query any other modality in a single vector space.

Ready to convert JSON to embeddings?

Start using the Mixpeek JSON to Embeddings converter in minutes. Sign up for a free API key and follow the documentation to get started.
