Similarity search using Open AI (POC)

This optional module is part of release 24.4 and is still in the proof-of-concept (POC) stage.

Overview

This document provides technical information on the new module “Similarity search using Open AI (POC)”, which includes the plugin “Similarity Search”. The module generates embeddings with Open AI during the ingest process, stores them in the record metadata, and offers an API method to retrieve the objects similar to an existing object. It is the premium version of similarity search; the basic version is Similarity search with perceptual hashes (POC).

Description

Similarity search consists of requesting, for an existing object, the objects that are most similar to it. See https://mediahaven.atlassian.net/wiki/spaces/CS/pages/4586110979 for details.
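As an illustration of the idea, the sketch below ranks objects by how close their embedding vectors are to the embedding of a given object. It assumes cosine similarity as the measure, which this page does not confirm, and it uses made-up three-dimensional vectors and hypothetical record ids instead of real 1536-dimensional embeddings. The function names are illustrative only and are not part of MediaHaven.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, treat it as dissimilar
    return dot / (norm_a * norm_b)


def most_similar(query_id, embeddings, limit=3):
    """Rank all other records by similarity to the record `query_id`."""
    query = embeddings[query_id]
    scored = [
        (other_id, cosine_similarity(query, vector))
        for other_id, vector in embeddings.items()
        if other_id != query_id  # an object is not its own neighbour
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Hypothetical toy data: record id -> embedding vector.
embeddings = {
    "rec-a": [1.0, 0.0, 0.0],
    "rec-b": [0.9, 0.1, 0.0],
    "rec-c": [0.0, 1.0, 0.0],
}
ranking = most_similar("rec-a", embeddings, limit=2)
```

Here `ranking` lists `rec-b` before `rec-c`, because the vector of `rec-b` points in nearly the same direction as that of `rec-a`.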

Activation

This feature is activated automatically for the tertiary organisation on integration environments.

The module requires a configured Open AI API key to operate correctly.

  1. Create the following field definitions:

    1. A MapField named Dynamic.EmbeddingsPoc

    2. A VectorField named Dynamic.EmbeddingsPoc.OpenAi with dimensions = 1536 and index = true

    3. Publish both field definitions

  2. Enable the module SIMILARITY_SEARCH_OPEN_AI_POC for the customer’s organisation. See the REST documentation for information on how to do that.

  3. Obtain the Open AI API key for this environment. The development Open AI API key is stored in LastPass.

  4. Set the property Secret of the plugin OPEN_AI_EMBEDDINGS_POC to the Open AI API key obtained above. See the Postman request “Update secret for plugin”.

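The field names in step 1 suggest that the embedding sits in a nested location under Dynamic. The sketch below is only an assumption about how such a value might look when written out as nested data; it is not the actual MediaHaven storage format, and the helper name `embedding_metadata` is hypothetical. Only the field names and the dimension count of 1536 come from the steps above.

```python
EMBEDDING_DIMENSIONS = 1536  # matches the VectorField definition in step 1


def embedding_metadata(vector):
    """Wrap an embedding in the nesting implied by Dynamic.EmbeddingsPoc.OpenAi.

    Assumption: dotted field names map to nested dictionaries.
    """
    if len(vector) != EMBEDDING_DIMENSIONS:
        raise ValueError(
            f"expected {EMBEDDING_DIMENSIONS} dimensions, got {len(vector)}"
        )
    return {"Dynamic": {"EmbeddingsPoc": {"OpenAi": list(vector)}}}


# Placeholder vector; a real embedding would come from the ingest process.
metadata = embedding_metadata([0.0] * EMBEDDING_DIMENSIONS)
```

A check like this makes a dimension mismatch visible early, since the VectorField is defined with a fixed number of dimensions.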

Embedding

The embedding generated using Open AI is specifically crafted to include both:

  • The metadata of the object

  • The data of the object, extracted from its preview image: PathToPreview if that file is in JPG format, otherwise PathToKeyframe
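The choice of image described above can be sketched as a small function. This is a minimal illustration, not the module's actual code: it assumes the record is available as a dictionary keyed by PathToPreview and PathToKeyframe, and that "JPG format" can be recognised by the file extension. The function name `select_embedding_image` and the example paths are hypothetical.

```python
def select_embedding_image(record):
    """Return the image path to use as the object's data for the embedding.

    Assumption: JPG previews are detected by a .jpg or .jpeg extension.
    """
    preview = record.get("PathToPreview")
    if preview and preview.lower().endswith((".jpg", ".jpeg")):
        return preview
    # Fall back to the keyframe when the preview is missing or not a JPG.
    return record.get("PathToKeyframe")


# Hypothetical records for illustration.
image_record = {"PathToPreview": "previews/a.jpg", "PathToKeyframe": "keyframes/a.jpg"}
video_record = {"PathToPreview": "previews/b.mp4", "PathToKeyframe": "keyframes/b.jpg"}

image_choice = select_embedding_image(image_record)
video_choice = select_embedding_image(video_record)
```

With these inputs, `image_choice` is the preview path and `video_choice` is the keyframe path.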

Caveats

  • Similarity search only works for objects ingested after the activation has been fully completed

Front-end

The MediaHaven front end will offer a context menu option for an object to show similar objects.

API

A new API method, GET records/:recordId/similar, returns the objects similar to the given record. See the REST documentation for further information.
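A call to this method might look like the sketch below. Only the path records/:recordId/similar comes from this page. The base URL, the bearer-token authentication, the JSON response format and both function names are assumptions made for illustration, so consult the REST documentation for the actual details.

```python
import json
import urllib.request
from urllib.parse import quote


def similar_records_url(base_url, record_id):
    """Build the URL for GET records/:recordId/similar."""
    return f"{base_url.rstrip('/')}/records/{quote(record_id, safe='')}/similar"


def fetch_similar_records(base_url, record_id, token):
    """Request the similar objects for one record.

    Assumptions: bearer-token authentication and a JSON response body.
    """
    request = urllib.request.Request(
        similar_records_url(base_url, record_id),
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Placeholder base URL and record id, for illustration only.
url = similar_records_url("https://example.invalid/rest/", "abc123")
```

Keeping the URL construction separate from the network call makes it possible to check the request path without contacting a server.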

 
