Search for Keywords in your Raw Files

You have terabytes of non-text files in S3, GCP, or Azure. Put them to use with Mixpeek's managed, full-text file search API.

It's a datalake for your files.

How Does it Work?

Upload your Files

POST your file's URL to the /upload endpoint, where we extract the relevant text and place it in a search index.

  • Depending on the file type, various machine learning models are run on the file to extract relevant text, which is then stored in an encrypted database.

    The file itself is never saved on our servers. You provide the file's URL to the /upload API, and we include that URL in the /search response.

  • .pdf, .doc, .jpg, .jpeg, .gif, .mp4, .avi, .mov, .mp3, .wav, .m4a, .html, .xml

  • All of them! That's the beauty of Mixpeek: your files are stored in different locations and formats, and we parse and index them all, then give you a single API to query them.
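The upload step described above can be sketched in Python. This is only an illustration under stated assumptions: the page names the /upload endpoint and says you pass a file URL, but the host (`api.example.com`), the `Authorization` header, and the `url` body field are hypothetical placeholders, not the documented API.

```python
import json
import urllib.request

# Hypothetical values: the page names only the /upload endpoint,
# not the host, the auth scheme, or the request body fields.
BASE_URL = "https://api.example.com"
API_KEY = "YOUR_API_KEY"


def build_upload_request(file_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST /upload request for one file URL."""
    body = json.dumps({"url": file_url}).encode("utf-8")  # assumed field name
    return urllib.request.Request(
        url=f"{BASE_URL}/upload",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",  # assumed auth scheme
        },
    )


upload_req = build_upload_request(
    "https://my-bucket.s3.amazonaws.com/contracts/waiver.pdf"
)
print(upload_req.get_method(), upload_req.full_url)
# To actually send it: urllib.request.urlopen(upload_req)
```

Only the URL travels in the request, which matches the statement above that the file itself is never stored on the service.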

Search Your Files

GET your files with the /search endpoint, where we return every relevant piece of text and its corresponding filepath.
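A search call might look roughly like the sketch below. The /search endpoint comes from this page; the host, the `q` query parameter, the `Authorization` header, and the response fields (`results`, `file_url`, `text`) are all assumptions made for illustration.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical host
API_KEY = "YOUR_API_KEY"              # hypothetical placeholder


def build_search_request(query: str) -> urllib.request.Request:
    """Build (but do not send) a GET /search request for a keyword query."""
    params = urllib.parse.urlencode({"q": query})  # "q" is an assumed name
    return urllib.request.Request(
        url=f"{BASE_URL}/search?{params}",
        method="GET",
        headers={"Authorization": f"Bearer {API_KEY}"},  # assumed auth scheme
    )


def parse_search_response(raw: bytes) -> list:
    """Return (file_url, matched_text) pairs from an assumed JSON shape."""
    payload = json.loads(raw)
    return [(hit["file_url"], hit["text"]) for hit in payload["results"]]


search_req = build_search_request("vaccination record")
print(search_req.full_url)

# A made-up response body, used only to exercise the parser.
sample = json.dumps(
    {
        "results": [
            {
                "file_url": "https://my-bucket.s3.amazonaws.com/hr/card.jpg",
                "text": "vaccination record, 2021",
            }
        ]
    }
).encode("utf-8")
for file_url, text in parse_search_response(sample):
    print(file_url, "->", text)
```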

  • Your file's text is placed in a full-text search index built on Lucene, one of the most widely used open-source text search libraries.

  • All traffic in flight is encrypted using TLS and the data itself is encrypted at rest once it reaches our database.

  • We're still in beta, so just sign up and we'll talk.
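To give a feel for what a full-text index does with extracted text, here is a toy inverted index in Python. It is not Lucene and not Mixpeek's implementation; real engines add text analysis, relevance scoring, and on-disk storage. The class name `TinyIndex` and the sample file URLs are made up for this sketch.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list:
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyIndex:
    """Toy inverted index: maps each term to the set of files containing it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, file_url: str, extracted_text: str) -> None:
        for term in tokenize(extracted_text):
            self.postings[term].add(file_url)

    def search(self, query: str) -> set:
        """Return the files that contain every term in the query."""
        terms = tokenize(query)
        if not terms:
            return set()
        # Copy the first posting set so the index itself is never mutated.
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result


index = TinyIndex()
index.add("s3://hr/waiver.pdf", "Insurance waiver signed in 2021")
index.add("s3://edu/lecture.mp4", "Lecture on insurance markets")
print(sorted(index.search("insurance")))
print(sorted(index.search("insurance waiver")))
```

Because the index stores only terms and file locations, a query never needs to touch the original files.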

Integration Heaven

Upload files from any cloud filestore, then search across all of them with one API call

Use Cases

You build software for a specific user persona and they expect their uploads to become searchable.

Human Resources Software

Search for keywords in Waivers & Agreements

Locate specific Employee Files

Identify out-of-date Vaccination Records

Education Software

Search for keywords in Lecture Videos, by timestamp

Organize your Handwritten Notes

Locate terms in your Audio Notes

Real Estate Software

Search for Blueprint Images

Organize On-Site Photographs

Locate labels in Video Walkthroughs

Healthcare & Insurance

Ensure Insurance Documents are up to date

Organize and structure Diagnostic Images

Keep your patients' Medical Waivers up to date

Articles

Step-By-Step S3 Integration

Add search to your S3 bucket’s non-text files

Frequently Asked Questions

If we haven't answered your question below, send us an email: info@mixpeek.com

  • What is the technology behind mixpeek?

    It depends on which file type you are uploading. For example, we use PyTorch models for images, Tesseract for text within images, etc. All the extracted text is then put into a Lucene index, where it instantly becomes searchable.

  • All connections between the API server and the database are encrypted with TLS. Once the data enters the database, the hard drives are encrypted using a managed key service. We issue 50-character API keys so that your data is accessible only with your key.

  • We have a 99.999% uptime Service Level Agreement with our customers, which we achieve by running redundant app servers (for the APIs) as well as redundant database servers.

  • Not at this time, unfortunately.
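To put the 99.999% figure from the availability answer above in perspective, the arithmetic below converts an availability target into allowed downtime per year, and shows why redundancy helps. The 99.9% per-server figure and the assumption that servers fail independently are illustrative only; neither comes from this page.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def allowed_downtime_minutes(availability: float) -> float:
    """Minutes of downtime per year permitted by an availability target."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def combined_availability(single: float, replicas: int) -> float:
    """Availability of N redundant replicas, assuming independent failures."""
    return 1.0 - (1.0 - single) ** replicas


# A 99.999% target leaves a little over five minutes of downtime per year.
print(round(allowed_downtime_minutes(0.99999), 2))  # 5.26

# Two hypothetical 99.9% servers: both must be down at once for an outage.
print(round(combined_availability(0.999, 2), 6))  # 0.999999
```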