Document to Knowledge Graph Converter
Transform documents into structured knowledge graphs by extracting entities, relationships, and concepts. Produces nodes and edges suitable for graph databases, enabling complex queries, reasoning, and visualization over document content.
How It Works
Upload a document or provide a URL to the Mixpeek API.
Text is extracted and segmented into paragraphs and sections.
Named entity recognition identifies people, organizations, locations, concepts, and domain-specific terms.
Relationship extraction identifies connections between entities (e.g., 'works at', 'located in', 'causes').
A knowledge graph is returned as nodes and edges in JSON-LD, RDF, or a custom graph format.
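The nodes-and-edges shape produced by the last step can be pictured with a small, self-contained sketch. This is only an illustration: the Node and Edge classes, their field names, the to_graph_json helper, and the sample entities are all hypothetical and are not taken from the Mixpeek API or its actual response schema.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Node:
    # Hypothetical fields: an identifier, an entity type, and a display label.
    id: str
    type: str
    label: str


@dataclass
class Edge:
    # Hypothetical fields: source and target node ids, the relation name,
    # and a confidence score between 0 and 1.
    source: str
    relation: str
    target: str
    confidence: float


def to_graph_json(nodes, edges):
    """Serialize nodes and edges into one JSON document."""
    payload = {
        "nodes": [asdict(n) for n in nodes],
        "edges": [asdict(e) for e in edges],
    }
    return json.dumps(payload, indent=2)


# Made-up entities that mirror the relation examples in the steps above.
nodes = [
    Node("n1", "Person", "Jane Doe"),
    Node("n2", "Organization", "Example Corp"),
    Node("n3", "Location", "Springfield"),
]
edges = [
    Edge("n1", "works at", "n2", 0.92),
    Edge("n2", "located in", "n3", 0.81),
]

print(to_graph_json(nodes, edges))
```

Keeping nodes and edges as two flat lists makes the result easy to load into most graph databases, which typically ingest vertices and relationships separately.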
Code Examples
from mixpeek import Mixpeek

client = Mixpeek(api_key="YOUR_API_KEY")

result = client.convert(
    source="https://example.com/research-paper.pdf",
    from_format="document",
    to_format="knowledge-graph",
    options={
        "entity_types": ["Person", "Organization", "Concept", "Method"],
        "output_format": "nodes_edges",
        "min_confidence": 0.6,
        "include_context": True,
    },
)

print(f"Nodes: {len(result.nodes)}, Edges: {len(result.edges)}")

for node in result.nodes[:5]:
    print(f" [{node.type}] {node.label}")

for edge in result.edges[:5]:
    print(f" {edge.source} --{edge.relation}--> {edge.target}")
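Once a graph has been returned, a common next step is to index its edges for traversal. The sketch below works on plain dictionaries rather than the SDK's result objects. It assumes each edge carries "source", "relation", "target", and "confidence" keys, which is a hypothetical shape chosen for illustration. The build_adjacency and neighbors helpers and the sample data are likewise made up for this example.

```python
from collections import defaultdict


def build_adjacency(edges, min_confidence=0.6):
    """Group outgoing (relation, target) pairs by source node.

    Edges whose confidence is below min_confidence are skipped.
    The dictionary keys used here are an assumed, illustrative shape.
    """
    adjacency = defaultdict(list)
    for edge in edges:
        if edge.get("confidence", 0.0) >= min_confidence:
            adjacency[edge["source"]].append((edge["relation"], edge["target"]))
    return dict(adjacency)


def neighbors(adjacency, node_id, relation=None):
    """Return targets reachable from node_id, optionally for one relation."""
    return [
        target
        for rel, target in adjacency.get(node_id, [])
        if relation is None or rel == relation
    ]


# Made-up edges; the last one falls below the confidence threshold.
sample_edges = [
    {"source": "jane_doe", "relation": "works at", "target": "example_corp", "confidence": 0.92},
    {"source": "example_corp", "relation": "located in", "target": "springfield", "confidence": 0.81},
    {"source": "jane_doe", "relation": "located in", "target": "springfield", "confidence": 0.40},
]

adjacency = build_adjacency(sample_edges)
print(neighbors(adjacency, "jane_doe"))
print(neighbors(adjacency, "example_corp", relation="located in"))
```

An adjacency map like this is enough for simple one-hop lookups. For multi-hop queries or reasoning over large graphs, loading the nodes and edges into a graph database is usually the better fit.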
Try This Conversion
Get started with the Mixpeek API and convert your first file in minutes.
Related Converters
PDF to Text
Extract clean, structured text from PDF documents including scanned pages, multi-column layouts, headers/footers, and tables. Combines traditional parsing with OCR and layout analysis for maximum accuracy.
PDF to Structured Data
Extract structured key-value pairs, tables, and form fields from PDF documents. Uses layout analysis and LLM extraction to produce clean JSON output, even from complex forms and invoices.
HTML to Structured Data
Extract structured data from web pages using a combination of CSS/XPath selectors and LLM-based extraction. Captures product details, article metadata, contact information, and custom schemas from any website.
Text to Embeddings
Convert text strings, paragraphs, or documents into dense vector embeddings using state-of-the-art language models. Supports batching, chunking, and multiple model options for optimal retrieval performance.
PDF to JSON
Convert PDF documents into clean, structured JSON output. Extracts text, tables, form fields, metadata, and document structure into a machine-readable JSON format suitable for API ingestion, database storage, and programmatic processing.
Ready to convert documents to knowledge graphs?
Start using the Mixpeek Document to Knowledge Graph converter in minutes. Sign up for a free API key and follow the documentation to get started.
