
    Set Up and Run OpenAI's CLIP on SageMaker for Inference

    How to deploy and run OpenAI's CLIP model on Amazon SageMaker for efficient real-time and offline inference.


    This tutorial walks you through deploying OpenAI's Contrastive Language–Image Pre-training (CLIP) model for inference on Amazon SageMaker. The primary goal is to show you how to create an endpoint for real-time inference and how to use SageMaker's Batch Transform feature for offline inference.

    For consistency and to make things more interesting, we'll use a theme of identifying and classifying images of different types of animals throughout this tutorial.

    💡 Mixpeek simplifies all of this with a package that handles ML model deployment, hosting, versioning, tuning, and inference at scale, all without your data leaving your AWS account.

    Prerequisites

    • An AWS account
    • Familiarity with Python, AWS, and machine learning concepts
    • A copy of the trained CLIP model artifacts (we'll upload them to S3 in Step 3)

    Step 1: Setting Up Your Environment

    First, log in to your AWS account and go to the SageMaker console. In your desired region, create a new SageMaker notebook instance (e.g., 'clip-notebook'). Once the instance is ready, open Jupyter and create a new Python 3 notebook.

    In this notebook, let's start by importing the necessary libraries:

    import sagemaker
    from sagemaker import get_execution_role
    

    Step 2: Define S3 Bucket and Roles

    Next, we need to define our S3 bucket and the IAM role:

    sagemaker_session = sagemaker.Session()
    
    # Get a SageMaker-compatible role used by this Notebook Instance.
    role = get_execution_role()
    
    bucket = sagemaker_session.default_bucket()
    prefix = 'clip-model'
    

    Step 3: Upload the Model to S3

    You'll need to upload your trained CLIP model to an S3 bucket. Here's how:

    model_location = sagemaker_session.upload_data(
        'model_path', 
        bucket=bucket, 
        key_prefix=prefix
    )
    

    Remember to replace 'model_path' with the path to your model file.
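
    Note that SageMaker expects model artifacts packaged as a gzipped tarball (model.tar.gz). If your CLIP weights aren't archived yet, here's a minimal sketch using Python's tarfile module; the 'clip_model/' directory is a hypothetical local folder holding your weights and any inference code:

    import tarfile
    
    # Bundle the model files into the model.tar.gz format SageMaker expects.
    # 'clip_model/' is a hypothetical local directory.
    with tarfile.open('model.tar.gz', 'w:gz') as tar:
        tar.add('clip_model/', arcname='.')

    You would then pass 'model.tar.gz' as the model_path in the upload step above.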

    Step 3.5: Verify the Model in S3 with boto3

    In this step, we'll use boto3 to check if our model was successfully uploaded to our S3 bucket. First, let's import the library and initialize our S3 client:

    import boto3
    
    s3 = boto3.client('s3')
    

    Next, let's list all objects in our S3 bucket:

    # List only the objects under our model prefix
    response = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
    
    for content in response.get('Contents', []):
        print(content['Key'])
    

    You should see the path to your uploaded model in the printed output.

    Step 4: Create a Model

    Once the model is uploaded to S3, you can create a SageMaker model. To do this, you need a Docker container that contains the necessary libraries and dependencies to run CLIP. If you don't have this Docker image yet, you would need to create one. For the purpose of this tutorial, let's assume you have a Docker image named 'clip-docker-image' in your Elastic Container Registry (ECR).
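
    Alternatively, if you'd rather not maintain a custom image, SageMaker publishes prebuilt Hugging Face inference containers that can serve CLIP. Here's a hedged sketch of looking one up; the framework version pins below are assumptions, so check which combinations your region supports:

    from sagemaker import image_uris
    
    # Look up a prebuilt Hugging Face PyTorch inference image.
    # The version pins are assumptions -- adjust to what's available.
    hf_image_uri = image_uris.retrieve(
        framework='huggingface',
        region=sagemaker_session.boto_region_name,
        version='4.26.0',                       # transformers version
        base_framework_version='pytorch1.13.1',
        py_version='py39',
        image_scope='inference',
        instance_type='ml.m5.large'
    )

    If you go this route, pass hf_image_uri as the image_uri in the Model below instead of your custom image.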

    from sagemaker.model import Model
    from sagemaker.predictor import Predictor
    
    clip_model = Model(
        model_data=model_location,
        image_uri='clip-docker-image',  # use the full ECR URI of your image
        role=role,
        predictor_cls=Predictor  # so deploy() returns a Predictor we can call
    )

    Step 5: Deploy the Model for Real-Time Inference

    With the model in place, you can now deploy it to a SageMaker endpoint named 'clip-endpoint'. The deploy() call creates the endpoint configuration and the endpoint for you, and we attach JSON serializers so the predictor sends and receives JSON payloads:

    from sagemaker.serializers import JSONSerializer
    from sagemaker.deserializers import JSONDeserializer
    
    clip_predictor = clip_model.deploy(
        initial_instance_count=1,
        instance_type='ml.m5.large',
        endpoint_name='clip-endpoint',
        serializer=JSONSerializer(),      # encode requests as JSON
        deserializer=JSONDeserializer()   # decode JSON responses
    )
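
    Deployment takes a few minutes. If you want to confirm the endpoint is live before sending traffic, a quick check with boto3:

    import boto3
    
    sm_client = boto3.client('sagemaker')
    status = sm_client.describe_endpoint(EndpointName='clip-endpoint')['EndpointStatus']
    print(status)  # 'InService' once the endpoint is ready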

    Step 6: Create a Predictor from an Existing Deployment

    If an endpoint is already running, say from a previous deployment, you can attach a predictor to it directly instead of redeploying:

    from sagemaker.predictor import Predictor
    from sagemaker.serializers import JSONSerializer
    from sagemaker.deserializers import JSONDeserializer
    
    endpoint_name = "huggingface-pytorch-inference-2023-03-18-13-33-18-657"  # existing endpoint
    clip_predictor = Predictor(
        endpoint_name,
        serializer=JSONSerializer(),
        deserializer=JSONDeserializer()
    )
    

    You can now use this clip_predictor for inference, just like the predictor returned by deploy() in Step 5.

    Making Inferences

    Now you can use the predictor to make real-time inferences:

    import requests
    from PIL import Image
    import numpy as np
    
    url = "http://images.cocodataset.org/val2017/000000039769.jpg"
    
    # Download a sample image (this COCO photo happens to contain two cats)
    image = Image.open(requests.get(url, stream=True).raw)
    image_array = np.array(image)
    
    data = {
      "inputs": "a photo of a cat",         # text prompt to score against the image
      "pixel_values": image_array.tolist()
    }
    
    # The predictor's JSONSerializer handles encoding, so pass the dict directly
    response = clip_predictor.predict(data)
    print(response)
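
    When you're done with real-time inference, delete the endpoint so you aren't billed for idle instances:

    clip_predictor.delete_endpoint()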
    

    Step 7: Offline Inference with Batch Transform

    For offline inference, you can use SageMaker's Batch Transform feature. First, let's define a transformer:

    clip_transformer = clip_model.transformer(
        instance_count=1,
        instance_type='ml.m5.large',
        strategy='SingleRecord',
        assemble_with='Line',
        output_path='s3://{}/{}/output'.format(bucket, prefix)
    )

    Then, start a transform job:

    clip_transformer.transform(
        data='s3://{}/{}/input'.format(bucket, prefix),
        content_type='application/x-image',
        split_type='None'
    )
    clip_transformer.wait()

    In this case, the input data is a collection of animal images stored in an S3 bucket.
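
    If you haven't staged those images yet, you can upload a local folder to the input prefix used above; the 'animal_images' directory here is hypothetical:

    input_location = sagemaker_session.upload_data(
        'animal_images',  # hypothetical local folder of animal images
        bucket=bucket,
        key_prefix='{}/input'.format(prefix)
    )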

    After the transform job is completed, the predictions are stored in the S3 bucket specified in the output_path.
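
    To inspect the results, you can list and download the output objects with boto3 (Batch Transform writes one '.out' file per input object):

    # Reuse the S3 client from Step 3.5 to print each prediction file
    response = s3.list_objects_v2(Bucket=bucket, Prefix='{}/output'.format(prefix))
    for obj in response.get('Contents', []):
        body = s3.get_object(Bucket=bucket, Key=obj['Key'])['Body'].read()
        print(obj['Key'], body.decode('utf-8'))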

    Sample inference response:

    {
      "predictions": [
        {
          "label": "cat",
          "probability": 0.002
        },
        {
          "label": "dog",
          "probability": 0.98
        },
        {
          "label": "horse",
          "probability": 0.001
        },
        {
          "label": "rabbit",
          "probability": 0.017
        }
      ]
    }
    

    Step Forever: Overwhelmed?

    If not, consider what comes next: versioning, maintenance, continuous improvement, and doing it all at scale. At Mixpeek, we're focused on abstracting all of this into one-click model hosting that never leaves your AWS account.
