    document

    PDF to JSON Converter

    Extract structured key-value pairs, tables, and form fields from PDF documents. Uses layout analysis and LLM extraction to produce clean JSON output, even from complex forms and invoices.

    Max file size: 200 MB
    Estimated: 2-15 sec per page
    1 input format
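As a rough illustration of the per-page estimate above, the sketch below turns a page count into a best-case and worst-case processing time. It assumes the time scales linearly with page count, which the page does not state; the function name `estimate_seconds` is hypothetical and not part of the Mixpeek SDK.

```python
def estimate_seconds(pages, low=2, high=15):
    """Return (best_case, worst_case) seconds for a document.

    Assumption: time grows linearly with page count, using the
    2-15 sec per page range quoted above.
    """
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return pages * low, pages * high


best, worst = estimate_seconds(20)
print(f"20 pages: between {best} and {worst} seconds")
```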

    How It Works

    1. Upload a PDF or provide a URL.
    2. Layout analysis identifies form fields, tables, and key-value regions.
    3. An LLM extracts values and maps them to a structured schema.
    4. Tables are converted to row/column JSON arrays.
    5. The complete structured output is returned as JSON.
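To make step 4 concrete, here is a minimal sketch of how a detected table could be represented as row/column JSON arrays. The output shape (`columns`, `rows`, `records`) and the helper name `table_to_json` are assumptions for illustration; the actual Mixpeek response format is not documented on this page.

```python
import json


def table_to_json(header, rows):
    """Convert a header and raw rows into JSON-friendly row/column arrays."""
    width = len(header)
    # Trim long rows and pad short ones with None so every row
    # has exactly one cell per column.
    normalized = [list(r[:width]) + [None] * (width - len(r)) for r in rows]
    return {
        "columns": list(header),
        "rows": normalized,
        # Same data keyed by column name, convenient for database inserts.
        "records": [dict(zip(header, r)) for r in normalized],
    }


table = table_to_json(
    ["description", "amount"],
    [["Widget", 19.99], ["Shipping"]],
)
print(json.dumps(table, indent=2))
```

Padding short rows with `None` keeps the arrays rectangular, so downstream code can index by column without checking row lengths.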

    Code Examples

    from mixpeek import Mixpeek

    client = Mixpeek(api_key="YOUR_API_KEY")

    # Convert a PDF invoice into JSON that follows the target schema.
    result = client.convert(
        source="https://example.com/invoice.pdf",
        from_format="pdf",
        to_format="structured-data",
        options={
            "target_schema": {
                "vendor_name": "string",
                "invoice_date": "date",
                "total_amount": "number",
                "line_items": [{"description": "string", "amount": "number"}]
            }
        }
    )
    print(result.data)
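Because an LLM performs the extraction, it may be worth checking the returned data against the schema you asked for before storing it. The sketch below is one way to do that in plain Python. It assumes the result is a plain dict shaped like `target_schema` and that dates come back as ISO 8601 strings; neither is confirmed by this page, and `matches` is a hypothetical helper, not part of the Mixpeek SDK.

```python
from datetime import date


def matches(value, spec):
    """Return True if value conforms to a target_schema-style spec."""
    if isinstance(spec, dict):
        # Every key in the spec must be present and valid.
        return isinstance(value, dict) and all(
            key in value and matches(value[key], sub) for key, sub in spec.items()
        )
    if isinstance(spec, list):
        # A one-element list describes the shape of each item.
        return isinstance(value, list) and all(matches(v, spec[0]) for v in value)
    if spec == "string":
        return isinstance(value, str)
    if spec == "number":
        # bool is a subclass of int, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if spec == "date":
        # Assumption: dates are ISO 8601 strings such as "2024-03-01".
        if not isinstance(value, str):
            return False
        try:
            date.fromisoformat(value)
        except ValueError:
            return False
        return True
    return False


schema = {
    "vendor_name": "string",
    "invoice_date": "date",
    "total_amount": "number",
    "line_items": [{"description": "string", "amount": "number"}],
}
good = {
    "vendor_name": "Acme Corp",
    "invoice_date": "2024-03-01",
    "total_amount": 129.5,
    "line_items": [{"description": "Widget", "amount": 129.5}],
}
bad = dict(good, total_amount="129.50")  # amount returned as a string

print(matches(good, schema), matches(bad, schema))
```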

    Use Cases

    Extract invoice line items and totals automatically
    Parse insurance claims and medical forms
    Digitize government and tax forms into databases
    Convert product spec sheets into structured catalogs

    Supported Input Formats

    PDF

    Quick Info

    Category: document
    Max File Size: 200 MB
    Est. Time: 2-15 sec per page

    Try This Conversion

    Get started with the Mixpeek API and convert your first file in minutes.

    Ready to convert PDF to JSON?

    Start using the Mixpeek PDF to Structured Data converter in minutes. Sign up for a free API key and follow the documentation to get started.