    HF · Face Detection · MIT

    facenet-pytorch

    by timesformer

    Deep face recognition with triplet loss embeddings

    23M params
    Identifiers
    Model ID: timesformer/facenet-pytorch
    Feature URI: mixpeek://face_identity@v1/timesformer_facenet_v1

    Overview

    FaceNet maps face images to a compact 128-dimensional embedding space where distances directly correspond to face similarity. Trained using triplet loss, it achieves 99.63% accuracy on the Labeled Faces in the Wild benchmark.
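To make the triplet loss objective concrete, here is a minimal sketch of the loss for a single (anchor, positive, negative) triple. The function names and the default margin are illustrative assumptions, not part of the Mixpeek SDK or of this model's training code.

```typescript
// Squared Euclidean distance between two equal-length embeddings.
function squaredDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("embedding length mismatch");
  }
  return a.reduce((sum, ai, i) => sum + (ai - b[i]!) ** 2, 0);
}

// Triplet loss for one triple: max(0, d(a, p) - d(a, n) + margin).
// The loss is zero once the negative is farther from the anchor than
// the positive by at least `margin` (an assumed, illustrative default).
function tripletLoss(
  anchor: number[],
  positive: number[],
  negative: number[],
  margin: number = 0.2
): number {
  const positiveDistance = squaredDistance(anchor, positive);
  const negativeDistance = squaredDistance(anchor, negative);
  return Math.max(0, positiveDistance - negativeDistance + margin);
}
```

Minimizing this quantity over many triples pulls embeddings of the same identity together and pushes different identities apart, which is why distance in the embedding space can stand in for face similarity.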

    On Mixpeek, FaceNet provides face embedding extraction for identity-based search: find all appearances of a person across your video and image library.

    Architecture

    An InceptionResnetV1 backbone fine-tuned with triplet loss on the VGGFace2 dataset. It produces 512-dim or 128-dim face embeddings. Input faces are detected and aligned with MTCNN before embedding.

    Mixpeek SDK Integration

    import { Mixpeek } from "mixpeek";

    const mx = new Mixpeek({ apiKey: "API_KEY" });

    await mx.collections.ingest({
      collection_id: "my-collection",
      source: { url: "https://example.com/video.mp4" },
      feature_extractors: [{
        name: "face_detection",
        version: "v1",
        params: {
          model_id: "timesformer/facenet-pytorch"
        }
      }]
    });

    Capabilities

    • Face verification (same/different person)
    • Face identification across large galleries
    • 128-dim or 512-dim face embeddings
    • Built-in MTCNN face alignment
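The verification and identification capabilities above both reduce to distance comparisons between embeddings. The sketch below assumes the embeddings have already been extracted. The helper names and the distance threshold are hypothetical and would need tuning on your own data; none of this is Mixpeek SDK API.

```typescript
type Embedding = number[];

// Euclidean distance between two equal-length embeddings.
function euclideanDistance(a: Embedding, b: Embedding): number {
  if (a.length !== b.length) {
    throw new Error("embedding length mismatch");
  }
  return Math.sqrt(a.reduce((sum, ai, i) => sum + (ai - b[i]!) ** 2, 0));
}

// Verification: two faces are treated as the same person when their
// embeddings are closer than `threshold` (an assumed, tunable value).
function isSamePerson(a: Embedding, b: Embedding, threshold: number): boolean {
  return euclideanDistance(a, b) <= threshold;
}

// Identification: return the ID of the nearest gallery entry, or null
// when even the nearest entry is farther away than `threshold`.
function identify(
  query: Embedding,
  gallery: Map<string, Embedding>,
  threshold: number
): string | null {
  let bestId: string | null = null;
  let bestDistance = Infinity;
  gallery.forEach((embedding, id) => {
    const distance = euclideanDistance(query, embedding);
    if (distance < bestDistance) {
      bestDistance = distance;
      bestId = id;
    }
  });
  return bestDistance <= threshold ? bestId : null;
}
```

The linear scan in `identify` is only meant to show the logic. A large gallery would normally sit behind a vector index rather than being compared one entry at a time.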

    Use Cases on Mixpeek

    • Cast tracking in film/TV: find all scenes with a specific actor
    • Customer recognition in retail analytics
    • Duplicate face detection across content libraries

    Benchmarks

    Dataset   Metric     Score    Source
    LFW       Accuracy   99.63%   Schroff et al., 2015, Table 4
    CFP-FP    Accuracy   98.12%   FaceNet PyTorch model card

    Performance

    Input Size: 160×160 px
    Embedding Dim: 512
    GPU Latency: ~2 ms / face (A100)
    CPU Latency: ~15 ms / face
    GPU Throughput: ~500 faces/sec (A100)
    GPU Memory: ~0.4 GB
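As a rough planning aid, the throughput figure above can be turned into a back-of-the-envelope estimate of embedding time for a video. The function and its parameters (sampling rate, faces per frame) are illustrative assumptions. The estimate covers embedding extraction only and ignores video decoding and face detection.

```typescript
// Estimate GPU seconds spent on face embedding for one video.
// `facesPerSecond` defaults to the ~500 faces/sec A100 figure quoted above.
function estimateEmbeddingSeconds(
  videoSeconds: number,
  sampledFramesPerSecond: number,
  averageFacesPerFrame: number,
  facesPerSecond: number = 500
): number {
  if (facesPerSecond <= 0) {
    throw new Error("facesPerSecond must be positive");
  }
  const totalFaces = videoSeconds * sampledFramesPerSecond * averageFacesPerFrame;
  return totalFaces / facesPerSecond;
}

// Example: a one-hour video sampled at 1 frame/sec with about 3 faces per frame.
const estimatedSeconds = estimateEmbeddingSeconds(3600, 1, 3);
console.log(`Estimated embedding time: ${estimatedSeconds.toFixed(1)} s`);
```

Because the quoted latency and throughput are approximate, treat the result as an order-of-magnitude guide rather than a guarantee.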

    Specification

    Framework: HF
    Organization: timesformer
    Feature: Face Detection
    Output: face embedding
    Modalities: video, image
    Retriever: Face Filter
    Parameters: 23M
    License: MIT

    Research Paper

    FaceNet: A Unified Embedding for Face Recognition and Clustering

    arxiv.org

    Build a pipeline with facenet-pytorch

    Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.
