Web Scraper Guide
Summary
Learn how to use Mixpeek's Web Scraper to recursively crawl websites and extract multimodal content with automatic embeddings. This guide demonstrates crawling documentation sites, extracting code snippets and images, and making everything searchable with semantic embeddings.
About this video
What you'll learn:
⚡ Recursive website crawling with depth control
⚡ Extracting text, code blocks, and images
⚡ Multimodal embeddings (E5-Large, Jina Code, SigLIP)
⚡ JavaScript rendering for SPAs
⚡ URL filtering and structured extraction
⚡ Building searchable knowledge bases from docs
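
To make "recursive crawling with depth control" and "URL filtering" concrete, here is a minimal, generic sketch in plain Python. It is not Mixpeek's API or implementation. The names crawl, fetch, max_depth, and include_prefix are hypothetical and chosen only for illustration, and fetch is assumed to be any callable that returns a page's HTML as a string (or None on failure).

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_depth=2, include_prefix=None):
    """Breadth-first link discovery with a depth limit and a URL filter.

    fetch: hypothetical callable, url -> HTML string (None on failure).
    max_depth: maximum number of link hops from start_url.
    include_prefix: if set, only URLs starting with it are kept.
    Returns a dict mapping each discovered URL to its depth.
    """
    host = urlparse(start_url).netloc
    seen = {start_url: 0}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        depth = seen[url]
        if depth >= max_depth:
            continue  # depth control: recorded, but not fetched or expanded
        html = fetch(url)
        if html is None:
            continue
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop #fragments to avoid duplicates.
            link, _fragment = urldefrag(urljoin(url, href))
            if urlparse(link).netloc != host:
                continue  # stay on the starting host
            if include_prefix and not link.startswith(include_prefix):
                continue  # URL filtering
            if link not in seen:
                seen[link] = depth + 1
                queue.append(link)
    return seen
```

In this sketch, passing a fetch function backed by an HTTP client (or a headless browser, for JavaScript-rendered SPAs) turns it into a live crawler, while passing a dictionary lookup makes it easy to try offline. A breadth-first queue is used so that each URL is recorded at its shortest distance from the start page.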
