
    Semantic Crons: Replace LLM Polling with Vector-Based Alerts

    Instead of polling with an LLM on a cron schedule, Retriever Alerts evaluate semantic conditions at ingestion time. Vector math instead of inference calls. Event-driven instead of scheduled. Three API calls to set up.


    Harrison Chase tweeted something that caught our attention: "semantic crons." The idea is simple—instead of scheduling jobs on fixed intervals, you trigger them when some fuzzy, hard-to-quantify event happens in your data. Think "when a video looks like a prior safety incident" or "when a new product photo is suspiciously close to a competitor's."

    The replies immediately called out the problem. One person asked Harrison directly: if you're hitting an LLM every heartbeat to check whether a natural-language condition is true, doesn't that get expensive fast? And yeah, it does. Running hundreds of LLM-powered checks on a cron schedule is a cost nightmare.

    Turns out we already built this. We call them Retriever Alerts—and they sidestep the polling problem entirely. Instead of checking conditions on a timer, alerts evaluate at ingestion time. New data comes in, the system runs a retriever (our word for a multi-stage search pipeline) against it, and if something matches, a webhook fires. No cron. No idle LLM calls. You only pay for work that matters.

    How It Fits Together

    If you've used Mixpeek before, alerts are just a thin layer connecting two things you already have: retrievers and webhooks. Here's the flow every time a new object gets ingested:

    Object ingested into collection
    → Feature extraction (embeddings · faces · OCR · audio)
    → Taxonomy enrichment (classification labels applied)
    → Cluster assignment
    → Alert evaluation (retriever runs vector search against the new document)
    → If results: webhook fires (matches · scores · metadata → your endpoint)

    The ordering matters. Alerts run last, after taxonomy and cluster enrichment. That means your alert conditions can reference classification labels, cluster IDs, and any other enriched fields—not just raw embeddings. You can express something like "alert me when a video is classified as unsafe AND it falls into the anomaly cluster" and it just works.
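    As a rough sketch, here's what that ordering means for an alert condition. The field names and enrichment logic below are made up for illustration—only the phase ordering mirrors the real pipeline:

```python
# Illustrative sketch of the ingestion ordering. Field names and enrichment
# values are invented; only the phase ordering mirrors the real pipeline.

def ingest(doc: dict) -> dict:
    doc["embedding"] = [0.1, 0.2, 0.3]   # 1. feature extraction
    doc["labels"] = ["unsafe"]           # 2. taxonomy enrichment
    doc["cluster_id"] = "anomaly"        # 3. cluster assignment
    # 4. alert evaluation runs last, so its condition can reference
    #    the enriched fields, not just the raw embedding:
    doc["alert_fired"] = ("unsafe" in doc["labels"]
                          and doc["cluster_id"] == "anomaly")
    return doc

print(ingest({})["alert_fired"])
```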

    Three API Calls to Set It Up

    The whole thing is three steps. Let's walk through a real-time anomaly detection scenario: flagging newly uploaded videos that look like known safety incidents.

    1. Define the Semantic Condition (Retriever)

    The retriever is where your "fuzzy condition" lives. It defines what to search for: which collection to compare against, what similarity threshold counts as a match, how many results to return. All the vector search logic is here.

    from mixpeek import Mixpeek
    
    client = Mixpeek(api_key="YOUR_API_KEY")
    
    retriever = client.retrievers.create(
        namespace="ns_production",
        name="safety-incident-matcher",
        stages=[
            {
                "type": "feature_search",
                "config": {
                    "collection_ids": ["col_known_incidents"],
                    "vector_index_name": "multimodal_extractor_v1_embedding",
                    "query": "{{INPUT.query_embedding}}",
                    "min_score": 0.85,
                    "top_k": 5
                }
            },
            {
                "type": "attribute_filter",
                "config": {
                    "filters": {
                        "and": [
                            {"field": "_internal.metadata.modality", "operator": "eq", "value": "video"}
                        ]
                    }
                }
            }
        ]
    )
    

    That min_score: 0.85 is doing the heavy lifting. It's the semantic threshold—the retriever only returns results when the similarity search finds something genuinely close to your known incidents. No LLM needed to decide whether something "looks dangerous."
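    To make that threshold concrete, here's a toy, self-contained example of what a 0.85 cosine-similarity cutoff means. The 2-D vectors are invented for illustration—real embeddings have hundreds of dimensions—but the math is the same:

```python
# Toy illustration of a cosine-similarity threshold. The 2-D vectors are
# made up; real embedding vectors are much higher-dimensional.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

incident  = [0.8, 0.6]     # embedding of a known incident
upload    = [0.78, 0.63]   # new upload, nearly parallel → very similar
unrelated = [-0.6, 0.8]    # orthogonal → semantically unrelated

print(cosine_similarity(incident, upload) >= 0.85)     # clears the threshold
print(cosine_similarity(incident, unrelated) >= 0.85)  # filtered out
```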

    2. Create the Alert

    The alert itself is just notification config. It points at a retriever and says "when this returns results, ping these endpoints." Clean separation—the alert has no idea what embeddings are or how similarity search works.

    alert = client.alerts.create(
        namespace="ns_production",
        name="Safety Incident Detector",
        description="Flags new videos that match known safety incidents",
        retriever_id=retriever.retriever_id,
        notification_config={
            "channels": [
                {
                    "channel_type": "webhook",
                    "config": {
                        "url": "https://hooks.yourcompany.com/safety-alerts",
                        "headers": {"X-Alert-Secret": "your-shared-secret"}
                    }
                }
            ],
            "include_matches": True,
            "include_scores": True
        }
    )
    

    3. Wire It to Your Collection

    Last step: attach the alert to the collection where new content lands. The input_mappings tell the system which document field to feed into the retriever as the query embedding.

    client.collections.update(
        collection_id="col_incoming_videos",
        namespace="ns_production",
        alert_applications=[
            {
                "alert_id": alert.alert_id,
                "execution_mode": "on_ingest",
                "input_mappings": [
                    {
                        "input_key": "query_embedding",
                        "source": {
                            "source_type": "document_field",
                            "path": "multimodal_extractor_v1_embedding"
                        }
                    }
                ],
                "execution_phase": "alert",
                "priority": 100
            }
        ]
    )
    

    That's it. From now on, every video that gets processed through this collection automatically runs through the safety similarity search. If there's a match, you get a webhook.

    What the Webhook Payload Looks Like

    When a match fires, here's what hits your endpoint:

    {
        "alert_id": "alt_safety_001",
        "alert_name": "Safety Incident Detector",
        "collection_id": "col_incoming_videos",
        "namespace_id": "ns_production",
        "triggered_at": "2026-02-12T14:30:00Z",
        "source_documents": [
            {
                "document_id": "doc_new_upload_92f3",
                "asset_id": "asset_video_8271"
            }
        ],
        "match_count": 2,
        "matches": [
            {
                "document_id": "doc_known_incident_17",
                "score": 0.92,
                "metadata": {"incident_type": "forklift_collision", "severity": "high"}
            },
            {
                "document_id": "doc_known_incident_42",
                "score": 0.87,
                "metadata": {"incident_type": "ppe_violation", "severity": "medium"}
            }
        ],
        "retriever_id": "ret_safety_search"
    }

    You get the source document, every match with its similarity score, and whatever metadata was on the matched documents. Enough context to route, triage, or escalate without making another API call.

    Here's a quick Flask handler that verifies the shared secret and forwards alerts to Slack:

    from flask import Flask, request, jsonify, abort
    import requests as http
    
    app = Flask(__name__)
    SLACK_WEBHOOK = "https://hooks.slack.com/services/T00/B00/xxx"
    ALERT_SECRET = "your-shared-secret"  # must match the X-Alert-Secret header on the alert
    
    @app.route("/safety-alerts", methods=["POST"])
    def handle_alert():
        # Reject requests that don't carry the shared secret configured on the alert
        if request.headers.get("X-Alert-Secret") != ALERT_SECRET:
            abort(401)
    
        payload = request.json
        matches = payload.get("matches") or []
        if not matches:  # alerts only fire on matches, but don't crash on a malformed retry
            return jsonify({"status": "ignored"}), 200
        top = matches[0]
    
        http.post(SLACK_WEBHOOK, json={"text": (
            f":rotating_light: *{payload['alert_name']}*\n"
            f"Matched {payload['match_count']} known incident(s)\n"
            f"Closest: `{top['metadata']['incident_type']}` "
            f"(score: {top['score']:.2f})\n"
            f"Doc: `{payload['source_documents'][0]['document_id']}`"
        )})
    
        return jsonify({"status": "ok"}), 200
    

    Why This Works Better Than LLM Polling

    Back to Harrison's thread. The core tension with semantic crons is cost: you want fuzzy, meaning-based triggers, but running an LLM against every condition on every tick is expensive and slow. Here's why the retriever alert approach avoids that:

    It's event-driven, not scheduled. Alerts only fire when new data actually arrives. No ticking clock, no wasted compute during quiet periods. This is the same event-driven architecture pattern that makes webhooks preferable to polling—applied to AI monitoring instead of CRUD notifications.

    The "fuzzy matching" is vector math. A similarity search across an embedding index runs in single-digit milliseconds. Compare that to an LLM call that takes 1-3 seconds and costs real money per invocation. You get the semantic matching without the inference cost.

    Structured filters come free. Because alerts run after taxonomy enrichment, you can combine vector similarity with hard attribute filters. "Similar to known incidents and classified as high-severity" is two filter stages in a retriever—no LLM needed to interpret the condition.

    Conditions and notifications are decoupled. Want to tighten the similarity threshold? Edit the retriever. Want to add a Slack channel? Edit the alert. They don't know about each other. This matters when you're managing dozens of alert conditions across different teams.
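    To put rough numbers on the cost point: a minute-by-minute LLM poll pays for every tick, while ingest-time evaluation does nothing during quiet periods. Every figure below is an assumption for illustration—plug in your own pricing and traffic:

```python
# Back-of-envelope only: all prices, intervals, and volumes are assumptions.
llm_cost_per_call = 0.002                    # assumed $/call for one LLM condition check
cron_interval_s = 60                         # poll every minute
checks_per_day = 86_400 // cron_interval_s   # 1,440 checks/day, even with zero new data

new_items_per_day = 200                      # event-driven: work scales with ingestion
llm_daily = checks_per_day * llm_cost_per_call

print(f"LLM polling:   {checks_per_day} checks, ~${llm_daily:.2f}/day per condition")
print(f"Ingest alerts: {new_items_per_day} vector searches, no per-call inference spend")
```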

    Other Things You Can Do With This

    Content Moderation at Ingest

    If you're running a platform with user-generated content, you probably have a corpus of previously removed posts. Attach an alert that runs a similarity search against that corpus every time a new upload is processed. Anything that's semantically close to removed content gets flagged for review before it goes live.

    alert = client.alerts.create(
        namespace="ns_ugc_platform",
        name="Content Policy Violation Detector",
        retriever_id="ret_removed_content_matcher",
        notification_config={
            "channels": [
                {"channel_type": "webhook", "config": {"url": "https://mod.internal/review-queue"}},
                {"channel_type": "slack", "channel_id": "sl_trust_safety"}
            ],
            "include_matches": True
        }
    )
    

    This works for video moderation, image moderation, or text—the retriever handles any modality that Mixpeek extracts embeddings for.

    Visual Competitive Intel

    Index your competitor's product catalog into a collection. Then set up an alert on your own product ingestion pipeline. Every time your team uploads a new product image, the system automatically checks whether it's visually similar to anything in the competitor collection. Good for catching accidental design overlap or tracking how close your products are trending to theirs.

    alert = client.alerts.create(
        namespace="ns_catalog",
        name="Competitor Visual Similarity Alert",
        retriever_id="ret_competitor_image_match",
        notification_config={
            "channels": [{"channel_type": "webhook", "config": {"url": "https://hooks.yourco.com/product-intel"}}],
            "include_matches": True,
            "include_scores": True
        }
    )
    

    Combining Vector Search with Taxonomy Labels

    This is where it gets interesting. Since alerts run after enrichment, you can build retrievers that mix similarity search with classification outputs. For example, a content moderation retriever that only fires when the document is both visually similar to flagged content and the taxonomy classified it as explicit:

    retriever = client.retrievers.create(
        namespace="ns_ugc_platform",
        name="explicit-content-similarity-check",
        stages=[
            {
                "type": "feature_search",
                "config": {
                    "collection_ids": ["col_flagged_content"],
                    "vector_index_name": "multimodal_extractor_v1_embedding",
                    "query": "{{INPUT.query_embedding}}",
                    "min_score": 0.80,
                    "top_k": 3
                }
            },
            {
                "type": "attribute_filter",
                "config": {
                    "filters": {
                        "and": [
                            {"field": "taxonomy_content_policy_label", "operator": "eq", "value": "explicit"}
                        ]
                    }
                }
            }
        ]
    )
    

    Two signals, zero LLM calls at alert time. The taxonomy label was already computed during enrichment, and the vector search is just math.

    Checking What Fired

    Every execution gets logged, whether or not it triggered. Useful for tuning thresholds or debugging why an alert didn't fire when you expected it to.

    executions = client.alerts.list_executions(
        alert_id="alt_safety_001",
        namespace="ns_production",
        limit=20
    )
    
    for ex in executions:
        print(f"{ex.executed_at} | triggered={ex.triggered} | matches={ex.match_count}")

    The same query works over raw HTTP:

    curl -sS "https://api.mixpeek.com/v1/alerts/alt_safety_001/executions?limit=10" \
      -H "Authorization: Bearer $MP_API_KEY" \
      -H "X-Namespace: ns_production"
    

    See the full execution history API for filtering and pagination options.
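    One practical use for those logs is threshold tuning. The helper below works on plain dicts shaped like the fields printed above (`triggered`, `match_count`)—the record shape is assumed from that output, not a documented schema:

```python
# Sketch of threshold tuning from execution logs. The records here are toy
# dicts mirroring the fields printed above; fetch real ones via list_executions.
def trigger_rate(executions):
    """Fraction of evaluations that actually fired."""
    if not executions:
        return 0.0
    return sum(1 for ex in executions if ex["triggered"]) / len(executions)

recent = [
    {"triggered": True,  "match_count": 2},
    {"triggered": False, "match_count": 0},
    {"triggered": False, "match_count": 0},
    {"triggered": True,  "match_count": 1},
]

# Firing on half of all ingests may mean min_score is set too low.
print(f"trigger rate: {trigger_rate(recent):.0%}")
```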

    Try It

    The whole setup is three API calls:

    1. Create a retriever with your semantic condition
    2. Create an alert pointing at that retriever
    3. Attach it to a collection with input mappings

    If you're already doing cron-based checks or scheduled LLM evaluations against your data, this replaces all of that with something cheaper and faster. And if you've been thinking about building semantic triggers but haven't started because the LLM cost math didn't work—now it does.

    Alerts API reference · Webhooks guide · Retrievers overview