    Knowledge Graph Intelligence

    Ontologies

    Turn any detection (a face, logo, or keyword) into a web of connected insights across video, image, audio, and text. Find relationships your current search can't see.

    What You'll Get

    10×

    Discover 10× more

    Connect related content automatically

    +40%

    Boost Relevance 40%

    Rank by relationships, not keywords

    4

    Link 4 Modalities

    Unify video, image, audio, and docs

    Best for: Modeling entity relationships

    Not for simple categorization (use Taxonomies) or grouping (use Clusters)

    How Ontologies Supercharge Search

    Ontologies model and traverse relationships between entities in your multimodal content. Unlike taxonomies, which only classify content, ontologies capture how entities relate to each other, so a query can follow connections, dependencies, and associations from one entity to the next.

    Behind the Scenes: Cross-Modal Reasoning

    Entities detected from video, images, and audio connected through relationships

    Face

    LeBron James

    from video

    playsFor
    Team

    LA Lakers

    from text

    sponsoredBy
    Logo

    Nike

    from image

    💡 Cross-Modal Query: "Find Nike ads featuring Lakers players"

    Traverse: Face (video) → Team (text) → Sponsor (image)

    Multimodal Entity Extraction
    // From video frame
    {
    "modality": "video",
    "entity_type": "face",
    "entity_id": "lebron_james"
    }
    // From audio transcript
    {
    "modality": "audio",
    "entity_type": "team",
    "entity_id": "la_lakers"
    }
    // From image
    {
    "modality": "image",
    "entity_type": "logo",
    "entity_id": "nike"
    }
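The three records above share one shape, so detections from different modalities can be merged into graph nodes keyed by `entity_id`. The sketch below is a minimal in-memory illustration of that idea, not Mixpeek's implementation. `build_nodes` is a hypothetical helper, and the fourth detection is an invented record added only to show the same entity appearing in a second modality.

```python
from collections import defaultdict

# Detection records shaped like the examples above. The fourth record is a
# hypothetical addition showing one entity seen in a second modality.
detections = [
    {"modality": "video", "entity_type": "face", "entity_id": "lebron_james"},
    {"modality": "audio", "entity_type": "team", "entity_id": "la_lakers"},
    {"modality": "image", "entity_type": "logo", "entity_id": "nike"},
    {"modality": "image", "entity_type": "face", "entity_id": "lebron_james"},
]


def build_nodes(records):
    """Merge detections into one node per entity_id, tracking its modalities."""
    nodes = defaultdict(lambda: {"entity_types": set(), "modalities": set()})
    for record in records:
        node = nodes[record["entity_id"]]
        node["entity_types"].add(record["entity_type"])
        node["modalities"].add(record["modality"])
    return dict(nodes)


nodes = build_nodes(detections)
for entity_id, node in sorted(nodes.items()):
    print(entity_id, sorted(node["modalities"]))
```

Keying nodes by `entity_id` rather than by detection is what lets a relationship attached to one modality be reached from another.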
    Cross-Modal Query
    POST /ontologies/expand
    {
    "entity": "lebron_james",
    "source_modality": "video",
    "relations": ["playsFor", "sponsoredBy"],
    "return_modalities": ["image", "audio"],
    "max_hops": 2
    }
    // Returns: Nike logos (images)
    // + Lakers mentions (audio)
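To make the request fields concrete, here is a rough sketch of what a hop-limited expansion could look like over an in-memory edge list. It is an assumption about the behavior, not the service's actual algorithm: the `expand` function, the `EDGES` list, and the `MODALITY` map are hypothetical, and only the parameter names (`relations`, `return_modalities`, `max_hops`) come from the request body above.

```python
from collections import deque

# Hypothetical in-memory edges: (subject, relation, object). Not Mixpeek data.
EDGES = [
    ("lebron_james", "playsFor", "la_lakers"),
    ("la_lakers", "sponsoredBy", "nike"),
    ("nike", "appearsIn", "ad_clip_001"),
]

# Hypothetical modality in which each entity was detected.
MODALITY = {
    "lebron_james": "video",
    "la_lakers": "audio",
    "nike": "image",
    "ad_clip_001": "video",
}


def expand(entity, relations, return_modalities, max_hops):
    """Breadth-first traversal limited to the given relations and hop count."""
    allowed = set(relations)
    wanted = set(return_modalities)
    seen = {entity}
    queue = deque([(entity, 0)])
    results = []
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand past the hop limit
        for subject, relation, obj in EDGES:
            if subject != node or relation not in allowed or obj in seen:
                continue
            seen.add(obj)
            queue.append((obj, hops + 1))
            if MODALITY.get(obj) in wanted:
                results.append(
                    {"entity_id": obj, "modality": MODALITY[obj], "hops": hops + 1}
                )
    return results


hits = expand("lebron_james", ["playsFor", "sponsoredBy"], ["image", "audio"], max_hops=2)
for hit in hits:
    print(hit)
```

With `max_hops=2` the traversal reaches the team after one hop and the sponsor after two, and it stops before following `appearsIn`, which is not in the allowed relations.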

    Real-World Application: Multimodal Sports Content

    Connect entities across video frames, audio transcripts, and image logos

    1

    Detect face in video

    Face recognition identifies "LeBron James" at 00:14:23

    🎥 Video

    2

    Traverse relationships automatically

    • 🖼️ Nike logos in images
    • 🎧 "Lakers" in audio mentions
    • 📄 Crypto.com Arena in docs
    • 🎥 Anthony Davis in videos
    ✨

    Result: Rich Multimodal Context

    One face detection triggers discovery across all modalities: images, audio, documents, and videos.

    What You Can Achieve

    Real outcomes from implementing ontologies in your knowledge infrastructure

    Intelligent Discovery

    Find related content across all modalities through relationship chains.

    Multimodal Discovery:

    "LeBron James" (face in video)

    → Discovers 1,200+ related assets:

    • 🎥 840 video clips
    • 🖼️ 215 product images
    • 🎧 98 audio mentions
    • 📄 47 documents

    Deeper Insights

    Surface patterns across video demos, image catalogs, and document specs.

    Cross-Modal Analysis:

    "Recalled product X" (image)

    → Finds risk across modalities:

    • 🎥 31 demo videos with same part
    • 🖼️ 12 catalog images
    • 📄 4 spec documents

    Better Recommendations

    Recommend related content from any modality based on entity relationships.

    Cross-Modal Recommendations:

    Viewing product video → Recommends:

    • 🖼️ Images of compatible accessories
    • 🎧 Audio reviews from experts
    • 📄 User manuals & guides

    Cross-Modal CTR: +2.5x
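One way to picture relationship-based recommendations is to rank the assets linked to the current item and keep only those from other modalities. The sketch below assumes each relationship carries a confidence score. The `recommend` function, the relation names, and every identifier in `EDGES` are hypothetical and chosen only to mirror the example above.

```python
# Hypothetical relationship edges from the item being viewed to related assets.
# Each tuple: (source_id, relation, target_id, target_modality, confidence).
EDGES = [
    ("product_video_1", "compatibleWith", "accessory_image_7", "image", 0.91),
    ("product_video_1", "reviewedIn", "expert_review_3", "audio", 0.84),
    ("product_video_1", "describedBy", "user_manual_2", "document", 0.77),
    ("product_video_1", "similarTo", "product_video_9", "video", 0.95),
]


def recommend(item_id, item_modality, edges, limit=3):
    """Rank related assets from other modalities by relationship confidence."""
    candidates = [
        (confidence, target_id, modality)
        for source_id, _relation, target_id, modality, confidence in edges
        if source_id == item_id and modality != item_modality
    ]
    candidates.sort(reverse=True)  # highest confidence first
    return [(target_id, modality) for _conf, target_id, modality in candidates[:limit]]


recommendations = recommend("product_video_1", "video", EDGES)
for target_id, modality in recommendations:
    print(modality, target_id)
```

The same-modality video is dropped even though it has the highest confidence, because the goal here is to surface content the viewer would not find through single-modality search.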

    Ready to Try the Ontology API?

    Start building cross-modal relationships in your content. Connect entities across video, images, audio, and documents.

    Multimodal Questions Ontologies Can Answer

    Go beyond single-modality search with cross-modal relationship reasoning

    ❌

    Without Ontologies

    "Find videos of this person"

    🎥 Single-modality search

    ✓

    With Ontologies

    "Find all content with brands this person endorses"

    🎥 Video → 🖼️ Images → 📄 Documents

    ❌

    Without Ontologies

    "Search for this product logo"

    🖼️ Image-only results

    ✓

    With Ontologies

    "Find videos where people mention this product"

    🖼️ Logo → 🎧 Audio mentions → 🎥 Video

    ❌

    Without Ontologies

    "Show documents about this topic"

    📄 Text-only results

    ✓

    With Ontologies

    "Show videos, images, and audio of experts on this topic"

    📄 Topic → 👤 Experts → 🎥🖼️🎧 All modalities

    Cross-Modal Ontology Triple (JSON)
    {
    "subject_id": "entity:lebron_james",
    "subject_type": "person",
    "subject_source": "video_frame_detection", // 🎥
    "subject_modality": "video",
    "relation": "endorses",
    "object_id": "entity:nike",
    "object_type": "brand",
    "object_source": "logo_detection", // 🖼️
    "object_modality": "image",
    "confidence": 0.92,
    "evidence": [
    { "type": "visual", "modality": "video" },
    { "type": "audio", "modality": "audio" },
    { "type": "contract", "modality": "document" }
    ]
    }

    Relationships span modalities: Faces from video → Logos from images → Mentions from audio
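A triple like the one above maps naturally onto a small typed record. The sketch below loads it into a dataclass and applies an acceptance rule based on confidence and on how many modalities back the relationship. The field names come from the JSON example. The `Triple` class, `from_record`, `is_trusted`, and both thresholds are hypothetical choices for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    subject_id: str
    subject_modality: str
    relation: str
    object_id: str
    object_modality: str
    confidence: float
    evidence_modalities: frozenset

    @property
    def is_cross_modal(self):
        # True when subject and object were detected in different modalities.
        return self.subject_modality != self.object_modality


def from_record(record):
    """Build a Triple from a dict with the field names used in the example."""
    return Triple(
        subject_id=record["subject_id"],
        subject_modality=record["subject_modality"],
        relation=record["relation"],
        object_id=record["object_id"],
        object_modality=record["object_modality"],
        confidence=record["confidence"],
        evidence_modalities=frozenset(e["modality"] for e in record.get("evidence", [])),
    )


def is_trusted(triple, min_confidence=0.8, min_evidence_modalities=2):
    """Hypothetical rule: confident and supported by several modalities."""
    return (
        triple.confidence >= min_confidence
        and len(triple.evidence_modalities) >= min_evidence_modalities
    )


record = {
    "subject_id": "entity:lebron_james",
    "subject_modality": "video",
    "relation": "endorses",
    "object_id": "entity:nike",
    "object_modality": "image",
    "confidence": 0.92,
    "evidence": [
        {"type": "visual", "modality": "video"},
        {"type": "audio", "modality": "audio"},
        {"type": "contract", "modality": "document"},
    ],
}

triple = from_record(record)
print(triple.is_cross_modal, is_trusted(triple))
```

Requiring evidence from more than one modality is one possible way to keep a single noisy detection from creating a relationship on its own.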

    Use Cases

    Discover how organizations use ontologies to build intelligent systems.

    Sports & Entertainment

    Connect faces in videos, logos in images, and team names in audio to power rich content discovery.

    Cross-Modal Relationships:

    🎥 Face (video) → playsFor → 🎧 Team (audio)

    🎧 Team (audio) → sponsoredBy → 🖼️ Logo (image)

    🖼️ Logo (image) → appearsIn → 🎥 Video scenes

    Query: Face in video → Find all sponsor logos in images + team mentions in audio
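The chain of relationships above can be read as an ordered path: start at a face, follow `playsFor`, then `sponsoredBy`, then `appearsIn`. A minimal sketch of that path-following step is shown below. The `follow_path` helper, the adjacency map, and the scene identifiers are hypothetical, and only the relation names are taken from the text.

```python
# Hypothetical adjacency map: entity -> relation -> list of connected entities.
GRAPH = {
    "lebron_james": {"playsFor": ["la_lakers"]},
    "la_lakers": {"sponsoredBy": ["nike"]},
    "nike": {"appearsIn": ["scene_014", "scene_102"]},
}


def follow_path(start, relation_path, graph):
    """Follow an ordered chain of relations and return the entities at the end."""
    frontier = [start]
    for relation in relation_path:
        next_frontier = []
        for entity in frontier:
            for neighbor in graph.get(entity, {}).get(relation, []):
                if neighbor not in next_frontier:
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return frontier


sponsors = follow_path("lebron_james", ["playsFor", "sponsoredBy"], GRAPH)
scenes = follow_path("lebron_james", ["playsFor", "sponsoredBy", "appearsIn"], GRAPH)
print(sponsors)
print(scenes)
```

Unlike a free expansion, a fixed relation path returns only entities reached in exactly that order, which suits questions phrased as "sponsors of this player's team".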

    E-commerce & Retail

    Link product images, video reviews, and document specifications to discover complete product relationships.

    Cross-Modal Relationships:

    🖼️ Product (image) → reviewedIn → 🎥 Video

    🎥 Video → mentions → 🎧 Audio review

    📄 Spec (doc) → describes → 🖼️ Product (image)

    Query: Product image → Find video reviews + audio mentions + spec documents

    Media & Publishing

    Connect author faces in videos, voice in podcasts, and bylines in documents for comprehensive content discovery.

    Cross-Modal Relationships:

    🎥 Author (video) → discusses → 🎧 Podcast

    📄 Article (doc) → references → 🖼️ Infographic

    🎧 Interview (audio) → mentions → 📄 Research paper

    Query: Author face in video → Find their podcasts + articles + cited images

    Enterprise Knowledge

    Connect employee faces in security footage, voices in meetings, and names in documents across your organization.

    Cross-Modal Relationships:

    🎥 Face (security) → worksIn → 📄 Department (doc)

    🎧 Voice (meeting) → expertIn → 📄 Skill (doc)

    📄 Project (doc) → features → 🎥 Demo video

    Query: Face in video → Find their presentations + documents + meeting recordings

    Choose the Right Approach

    Ontologies, Taxonomies, and Clusters serve different purposes

    Ontologies

    Model entity relationships. Traverse connections between people, brands, locations across modalities.

    e.g., "Player β†’ Team β†’ Sponsor"

    Taxonomies

    Classify content into predefined categories. Organize using established systems.

    e.g., IAB 3.0, product types

    Clusters

    Automatically group similar content. Discover patterns without predefined structure.

    e.g., "Similar scenes"

    💡 Use them together: Taxonomies classify, Ontologies connect, and Clusters group, making your multimodal data searchable and intelligent.
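As a rough illustration of using the three together, the sketch below filters a handful of assets by a taxonomy category, a cluster assignment, and a linked entity at the same time. Everything in it is an assumption made for the example: the `search` function, the asset records, and the field names `category`, `cluster`, and `entities` are hypothetical and do not describe a Mixpeek schema.

```python
# Hypothetical asset annotations. Field names are illustrative only.
ASSETS = [
    {"id": "clip_1", "category": "sports", "cluster": 4, "entities": {"lebron_james", "nike"}},
    {"id": "img_2", "category": "sports", "cluster": 4, "entities": {"nike"}},
    {"id": "doc_3", "category": "finance", "cluster": 9, "entities": {"nike"}},
    {"id": "clip_4", "category": "sports", "cluster": 7, "entities": {"la_lakers"}},
]


def search(assets, category=None, cluster=None, entity=None):
    """Keep assets that match every filter that was supplied."""
    results = []
    for asset in assets:
        if category is not None and asset["category"] != category:
            continue  # taxonomy filter: predefined category
        if cluster is not None and asset["cluster"] != cluster:
            continue  # cluster filter: automatically grouped similar content
        if entity is not None and entity not in asset["entities"]:
            continue  # ontology filter: asset is linked to this entity
        results.append(asset["id"])
    return results


matches = search(ASSETS, category="sports", entity="nike")
print(matches)
```

Each filter narrows the result set along a different axis, which is why the three approaches complement rather than replace one another.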

    Cross-Modal Relationship Intelligence

    Connect entities across video frames, audio transcripts, images, and documents

    🎥

    Video

    Extract faces, objects, scenes from frames

    🖼️

    Images

    Detect logos, products, landmarks

    🎧

    Audio

    Extract speakers, topics from transcripts

    📄

    Documents

    Parse entities, metadata from text

    The Power: Entities extracted from any modality can be connected through ontological relationships. A face detected in a video can link to products in images, mentions in audio, and references in documents, all in a single query.

    "After enabling ontologies, content recall improved 8× without additional tagging."

    Media company with 500K+ multimodal assets

    Ready to Get Started?

    Start building intelligent knowledge graphs with Mixpeek ontologies today.