
    HTML to JSON Converter

    Extracts structured data from web pages using a combination of CSS/XPath selectors and LLM-based extraction. Captures product details, article metadata, contact information, and custom schemas from any website.

    Max file size: 50 MB
    Estimated: 2-8 sec per page
    2 input formats

    How It Works

    1. Provide a URL or upload an HTML file.
    2. Existing structured data (JSON-LD, microdata, RDFa) is extracted first.
    3. An LLM analyzes the page to extract additional structured fields.
    4. Results are merged and validated against your target schema.
    5. Clean JSON output is returned with confidence scores.
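
    To make step 2 concrete, here is a minimal sketch of pulling embedded JSON-LD out of a page using only the Python standard library. It illustrates the general technique and is not Mixpeek's implementation; the names JsonLdParser, extract_json_ld, and SAMPLE_HTML are hypothetical, and microdata and RDFa are not covered.

```python
import json
from html.parser import HTMLParser


class JsonLdParser(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        # The type attribute may be absent or valueless, so fall back to "".
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script bodies can arrive in several chunks; buffer them all.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip malformed JSON-LD blocks instead of failing.


def extract_json_ld(html):
    """Return a list of JSON-LD objects found in the given HTML string."""
    parser = JsonLdParser()
    parser.feed(html)
    parser.close()
    return parser.items


# Hypothetical sample page used only for illustration.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@type": "Product", "name": "Example Widget",
 "offers": {"price": "19.99", "priceCurrency": "USD"}}
</script>
</head><body><h1>Example Widget</h1></body></html>
"""

items = extract_json_ld(SAMPLE_HTML)
print(items[0]["name"])
```

    Fields recovered this way are already machine-readable, which is presumably why they are collected before the LLM pass fills in whatever the page does not declare explicitly.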

    Code Examples

    from mixpeek import Mixpeek

    # Authenticate with your API key
    client = Mixpeek(api_key="YOUR_API_KEY")

    # Convert a product page into JSON that matches the target schema
    result = client.convert(
        source="https://example.com/product-page",
        from_format="html",
        to_format="structured-data",
        options={
            # Each field name maps to the type its value should have
            "target_schema": {
                "product_name": "string",
                "price": "number",
                "currency": "string",
                "rating": "number",
                "reviews_count": "integer"
            }
        }
    )

    print(result.data)
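
    If you want to double-check returned data on your side, the sketch below validates a plain dict against the same target schema. It assumes the extracted data is a flat dict keyed by the schema's field names, which this page does not confirm. TYPE_CHECKS, validate_against_schema, and the sample record are hypothetical helpers for illustration, not part of the Mixpeek client.

```python
# Type names taken from the target_schema example; the checks are assumptions.
TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    # bool is a subclass of int in Python, so exclude it explicitly.
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
}


def validate_against_schema(data, schema):
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for field, type_name in schema.items():
        if field not in data:
            errors.append(f"{field}: missing")
        elif not TYPE_CHECKS[type_name](data[field]):
            actual = type(data[field]).__name__
            errors.append(f"{field}: expected {type_name}, got {actual}")
    return errors


target_schema = {
    "product_name": "string",
    "price": "number",
    "currency": "string",
    "rating": "number",
    "reviews_count": "integer",
}

# Hypothetical record standing in for extracted data; rating is a string here.
sample = {
    "product_name": "Example Widget",
    "price": 19.99,
    "currency": "USD",
    "rating": "4.5",
    "reviews_count": 128,
}

errors = validate_against_schema(sample, target_schema)
print(errors)
```

    Returning a list of problems, instead of raising on the first one, lets a caller log every mismatched field for a page in a single pass.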

    Use Cases

    Scrape product details from e-commerce pages
    Extract article metadata from news sites
    Capture business listings from directory pages
    Build structured datasets from web sources

    Supported Input Formats

    HTML
    XHTML

    Quick Info

    Category: data
    Max File Size: 50 MB
    Est. Time: 2-8 sec per page
    Extractor: web-scraper

    Try This Conversion

    Get started with the Mixpeek API and convert your first file in minutes.

    Ready to convert HTML to JSON?

    Start using the Mixpeek HTML to Structured Data converter in minutes. Sign up for a free API key and follow the documentation to get started.