    Ethan Steininger
    3 min read

    Mixpeek & FLUX for Multimodal RAG

    Building a Comprehensive Image Indexing, Retrieval, and Generation Pipeline Using Mixpeek and Replicate's FLUX

    FLUX is taking the world by storm as a state-of-the-art (SOTA) image generation model. I've seen some phenomenal examples of images generated with FLUX, but none that are generated dynamically from a collection of existing images. Therein lies the opportunity for multimodal RAG.

    Let's build a pipeline that combines indexed images with prompts to generate relevant images using AI.

    Overview

    The pipeline will consist of the following steps:

    1. Image Indexing: Index images by their URL.
    2. Image Retrieval: Retrieve images using text-based queries or other images.
    3. Image Generation: Generate new images based on text prompts.
    4. Integrated Workflow: Combine all steps into a unified system that can dynamically generate, index, and search for images.

    Here’s how these components work together:

    graph TD
        A[Image URL or Generated Image] --> B[Mixpeek Indexing]
        B --> C[Text/Image-based Search]
        D[Text Prompt] --> E[FLUX Image Generation]
        C --> F[Results Displayed to User]
        E --> B
        E --> F

    Step 1: Image Indexing with Mixpeek

    The first step involves indexing images using Mixpeek’s API. This allows us to create a searchable database of images that can later be queried by text descriptions or similar images.

    Code Example: Indexing an Image

    import requests
    
    mixpeek_api_key = "your_mixpeek_api_key"
    collection_id = "shimmer"
    
    def index_image(url, collection_id):
        headers = {
            'Authorization': f'Bearer {mixpeek_api_key}',
            'Content-Type': 'application/json'
        }
        data = {
            "url": url,
            "collection_id": collection_id
        }
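        # The 30-second timeout is an arbitrary choice so a stalled request cannot hang the script.
        response = requests.post('https://api.mixpeek.com/index/url', headers=headers, json=data, timeout=30)
        response.raise_for_status()  # raise on HTTP errors instead of parsing an error body
        return response.json()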
    
    image_url = "https://replicate.delivery/yhqm/Od36elqD9uX3byUfJHfAoi4nYaSv77HfG4Rih8LZjbzDXP5MB/out-0.webp"
    index_response = index_image(image_url, collection_id)
    print("Indexing Response:", index_response)
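
    A collection usually holds more than one image. The sketch below shows one way to index a batch of URLs while recording failures instead of stopping at the first one. The helper name `index_many` and its `index_fn` parameter are illustrative and not part of Mixpeek's API; `index_fn` is assumed to be any callable that takes a URL and returns the parsed response.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def index_many(
    urls: Iterable[str],
    index_fn: Callable[[str], dict],
) -> Tuple[Dict[str, dict], List[Tuple[str, str]]]:
    """Index each URL once; return (responses by URL, failures as (url, error))."""
    indexed: Dict[str, dict] = {}
    failed: List[Tuple[str, str]] = []
    # dict.fromkeys drops duplicate URLs while keeping first-seen order
    for url in dict.fromkeys(urls):
        try:
            indexed[url] = index_fn(url)
        except Exception as exc:
            # Record the failure and keep going so one bad URL does not stop the batch
            failed.append((url, str(exc)))
    return indexed, failed
```

    With the function above, this could be called as `index_many(image_urls, lambda u: index_image(u, collection_id))`, where `image_urls` is your own list of URLs.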
    

    Mixpeek also generates a caption for the indexed image.


    Step 2: Image Retrieval Using Mixpeek

    Once the images are indexed, you can retrieve them either through text-based queries or by using another image as a search query.

    Text-Based Search Example

    def search_images_by_text(query, collection_id):
        headers = {
            'Authorization': f'Bearer {mixpeek_api_key}',
            'Content-Type': 'application/json'
        }
        data = {
            "modality": "image",
            "input": query,
            "filters": {
                "$or": [{"collection_id": collection_id}]
            }
        }
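        # The 30-second timeout is an arbitrary choice so a stalled request cannot hang the script.
        response = requests.post('https://api.mixpeek.com/search/text', headers=headers, json=data, timeout=30)
        response.raise_for_status()  # raise on HTTP errors instead of parsing an error body
        return response.json()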
    
    text_query = "woman skateboarding on the street"
    search_results = search_images_by_text(text_query, collection_id)
    print("Search Results:", search_results)
    

    Image-Based Search Example

    def search_images_by_image(query_url, collection_id):
        headers = {
            'Authorization': f'Bearer {mixpeek_api_key}',
            'Content-Type': 'application/json'
        }
        data = {
            "url": query_url,
            "filters": {
                "$or": [{"collection_id": collection_id}]
            }
        }
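        # The 30-second timeout is an arbitrary choice so a stalled request cannot hang the script.
        response = requests.post('https://api.mixpeek.com/search/url', headers=headers, json=data, timeout=30)
        response.raise_for_status()  # raise on HTTP errors instead of parsing an error body
        return response.json()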
    
    query_image_url = "https://replicate.delivery/yhqm/Od36elqD9uX3byUfJHfAoi4nYaSv77HfG4Rih8LZjbzDXP5MB/out-0.webp"
    image_search_results = search_images_by_image(query_image_url, collection_id)
    print("Image Search Results:", image_search_results)
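
    The two search functions differ only in their endpoint and payload, so a single entry point can pick between them. The sketch below builds the request without sending it. It assumes that any query parsing as an http(s) URL is an image query and everything else is free text; `build_search_request` and `MIXPEEK_BASE_URL` are illustrative names, while the endpoints and payload keys are copied from the two examples above.

```python
from typing import Any, Dict, Tuple
from urllib.parse import urlparse

MIXPEEK_BASE_URL = "https://api.mixpeek.com"  # base of the endpoints used in this post


def build_search_request(query: str, collection_id: str) -> Tuple[str, Dict[str, Any]]:
    """Return (endpoint, payload) for either a text query or an image-URL query."""
    filters = {"$or": [{"collection_id": collection_id}]}
    parsed = urlparse(query)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        # Looks like a URL: same payload as search_images_by_image
        return f"{MIXPEEK_BASE_URL}/search/url", {"url": query, "filters": filters}
    # Anything else is treated as free text: same payload as search_images_by_text
    payload = {"modality": "image", "input": query, "filters": filters}
    return f"{MIXPEEK_BASE_URL}/search/text", payload
```

    The returned pair can be passed to `requests.post(endpoint, headers=headers, json=payload)` with the same headers as before.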
    

    Step 3: Image Generation with Replicate's FLUX

    Next, we use Replicate’s FLUX model to generate new images from text prompts. These images can then be indexed in Mixpeek or used directly.

    Code Example: Generating an Image

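    import replicate
    
    # The replicate client reads its API token from the REPLICATE_API_TOKEN environment variable.
    
    def generate_image(prompt):
        output = replicate.run(
            "black-forest-labs/flux-dev",
            input={
                "prompt": prompt,
                "guidance": 3.5,
                "aspect_ratio": "1:1",
                "output_format": "webp",
                "output_quality": 80
            }
        )
        # flux-dev can return a list of outputs rather than a single value.
        # Take the first one and convert it to a plain URL string so it can
        # be sent as JSON to Mixpeek later.
        first = output[0] if isinstance(output, (list, tuple)) else output
        return str(first)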
    
    image_prompt = "women's street skateboarding final in Paris Olympics 2024"
    generated_image_url = generate_image(image_prompt)
    print("Generated Image URL:", generated_image_url)
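
    To make generation depend on what is already indexed, retrieved context can be folded into the prompt before calling FLUX. The sketch below is one possible way to do that, not something taken from Mixpeek or Replicate. It assumes the captions have already been pulled out of the search results as plain strings (the response shape is not shown in this post), and `augment_prompt`, `max_captions`, and the "Visual context" wording are all illustrative choices.

```python
from typing import Iterable, List


def augment_prompt(base_prompt: str, captions: Iterable[str], max_captions: int = 3) -> str:
    """Append up to max_captions distinct, non-empty captions to the prompt."""
    kept: List[str] = []
    for caption in captions:
        if len(kept) >= max_captions:
            break
        text = " ".join(caption.split())  # collapse stray whitespace
        if text and text not in kept:
            kept.append(text)
    if not kept:
        # Nothing retrieved: fall back to the original prompt unchanged
        return base_prompt
    return f"{base_prompt}. Visual context: " + "; ".join(kept)
```

    The result is an ordinary string, so it can be passed to the image generation call in place of the original prompt.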
    

    Step 4: Integrating the Pipeline

    The final step integrates the entire process. We first generate a new image, index it using Mixpeek, and then use that image to search for similar images in our indexed collection.

    Code Example: Integrated Pipeline

    # Step 1: Generate a new image
    generated_image_url = generate_image("women's street skateboarding final in Paris Olympics 2024")
    
    # Step 2: Index the generated image
    index_response = index_image(generated_image_url, collection_id)
    print("Generated and Indexed Image:", index_response)
    
    # Step 3: Search for similar images using the generated image
    similar_images = search_images_by_image(generated_image_url, collection_id)
    print("Similar Images Found:", similar_images)
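
    The same three steps can be wrapped in one function so the workflow is reusable and can be tried with stand-in functions before any API calls are made. This is a minimal sketch: `run_pipeline` is an illustrative name, and it assumes `generate` returns a single URL string and that `index` and `search` each take that URL and return the parsed response.

```python
from typing import Callable


def run_pipeline(
    prompt: str,
    generate: Callable[[str], str],
    index: Callable[[str], dict],
    search: Callable[[str], dict],
) -> dict:
    """Generate an image, index it, then search with it; return all three results."""
    image_url = generate(prompt)
    if not isinstance(image_url, str) or not image_url:
        # Fail early: the indexing and search payloads expect a plain URL string
        raise ValueError("generate() must return a non-empty URL string")
    index_response = index(image_url)   # add the new image to the collection
    similar_images = search(image_url)  # then look for neighbours of that image
    return {
        "image_url": image_url,
        "index_response": index_response,
        "similar_images": similar_images,
    }
```

    With the functions from this post, `index` and `search` could be `lambda u: index_image(u, collection_id)` and `lambda u: search_images_by_image(u, collection_id)`.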
    

    Full code: https://github.com/mixpeek/use-cases/blob/master/multimodal-rag/flux-replicate.py

    Conclusion

    This pipeline combines image indexing, retrieval, and generation in a single workflow. By pairing Mixpeek’s multimodal search with Replicate’s hosted FLUX model, developers can build automated systems for managing and creating visual content.

    Ethan Steininger

    August 21, 2024 · 3 min read