Similarity search with perceptual hashes (POC)

This optional module is part of release 24.4 and is still in the proof-of-concept (POC) stage.

Overview

This document provides technical information on the new module "Similarity search with perceptual hashes (POC)", which includes the plugin "Similarity Search". During ingest, the module generates an embedding for each object using a perceptual hash, stores the hash in the record metadata, and offers an API method that retrieves objects similar to an existing object. This module is the basic version of similarity search; "Similarity search using Open AI (POC)" offers the premium version.
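To illustrate what a perceptual hash is, here is a minimal Python sketch of one common variant, the average hash. This page does not state which hashing algorithm the module uses, so the algorithm and the function names below are assumptions chosen for illustration. The only detail taken from this page is the vector length of 64.

```python
from typing import List, Sequence


def average_hash(pixels: Sequence[Sequence[int]]) -> List[int]:
    """Return a 64-element bit vector for an 8x8 grid of grayscale values."""
    flat = [value for row in pixels for value in row]
    if len(flat) != 64:
        raise ValueError("expected an 8x8 grid (64 grayscale values)")
    mean = sum(flat) / len(flat)
    # One bit per cell: 1 if the cell is brighter than the grid average.
    return [1 if value > mean else 0 for value in flat]


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two equal-length bit vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical 8x8 thumbnails: a gradient, a brighter copy, and its inverse.
gradient = [[row * 8 + col for col in range(8)] for row in range(8)]
brighter = [[value + 10 for value in row] for row in gradient]
inverted = [[63 - value for value in row] for row in gradient]

print(hamming_distance(average_hash(gradient), average_hash(brighter)))  # 0
print(hamming_distance(average_hash(gradient), average_hash(inverted)))  # 64
```

A uniform brightness change leaves the hash untouched, while inverting the image flips every bit. A real image would first have to be converted to grayscale and downscaled to 8x8, which this sketch leaves out.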

Description

Similarity search takes an existing object and returns the objects most similar to it. See https://mediahaven.atlassian.net/wiki/spaces/CS/pages/4586110979 for details.
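Conceptually, the lookup is a nearest-neighbour ranking over the stored hash vectors. The Python sketch below shows the idea with a brute-force scan. The Hamming distance metric, the record identifiers, and the function names are assumptions made for this example; the module's actual metric and index are not described on this page.

```python
from typing import Dict, List, Sequence, Tuple


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two equal-length bit vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def most_similar(
    query_id: str, hashes: Dict[str, List[int]], limit: int = 3
) -> List[Tuple[str, int]]:
    """Rank all other records by distance to the query record's hash."""
    query = hashes[query_id]
    scored = [
        (other_id, hamming_distance(query, vector))
        for other_id, vector in hashes.items()
        if other_id != query_id  # an object is not its own match
    ]
    # Smallest distance first; ties are broken by record id for stable output.
    scored.sort(key=lambda pair: (pair[1], pair[0]))
    return scored[:limit]


# Hypothetical 64-dimensional hashes keyed by made-up record ids.
base = [0] * 64
near = [1, 1] + [0] * 62  # differs from base in 2 positions
far = [1] * 64            # differs from base in all 64 positions
hashes = {"rec-a": base, "rec-b": near, "rec-c": far}

print(most_similar("rec-a", hashes))  # [('rec-b', 2), ('rec-c', 64)]
```

A linear scan like this is only practical for small collections, which is presumably why the vector field is created with index = true.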

Activation

On integration environments, this feature is activated automatically for the secondary organisation. Activation consists of the following steps:

  1. Create the following field definitions:

    1. A MapField named Dynamic.EmbeddingsPoc

    2. A VectorField named Dynamic.EmbeddingsPoc.PerceptualHash with dimensions = 64 and index = true

    3. Publish both field definitions

  2. Enable the module SIMILARITY_SEARCH_PERCEPTUAL_HASHES_POC for the customer’s organisation. See the REST documentation for information on how to do that.

  3. Ingest new objects.

Caveats

  • Similarity search only works for objects ingested after the activation has been fully completed.

Front-end

The MediaHaven front end will offer a context menu option on an object to show similar objects.

API

A new API method, GET records/:recordId/similar, returns the objects similar to a given record. See the REST documentation for further information.
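As a rough illustration, the request could be built in Python as shown below. Only the path records/:recordId/similar comes from this page. The base URL, the record id, and the bearer-token authentication are placeholders assumed for the example, and the response format is not covered here; check the REST documentation for the actual details.

```python
from urllib.parse import quote
from urllib.request import Request


def build_similar_request(base_url: str, record_id: str, token: str) -> Request:
    """Build (but do not send) a GET request for records/:recordId/similar."""
    # Percent-encode the record id so it is safe to use as a path segment.
    path = f"records/{quote(record_id, safe='')}/similar"
    url = f"{base_url.rstrip('/')}/{path}"
    headers = {
        "Accept": "application/json",
        # Assumption: bearer-token authentication; the real scheme may differ.
        "Authorization": f"Bearer {token}",
    }
    return Request(url, headers=headers, method="GET")


# Placeholder values, not real endpoints or credentials.
request = build_similar_request("https://example.invalid/api", "abc123", "TOKEN")
print(request.get_method(), request.full_url)
# GET https://example.invalid/api/records/abc123/similar
```

Passing the request to urllib.request.urlopen would send it. That step is left out here because it depends on the environment and credentials.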

 
