I am new here, nice to meet you!
I am looking for advice on using OpenSearch with my own data, which differs from the sample data provided after setup. Imagine this data structure:
text, feature vector, file
text: a single sentence
feature vector: NumPy vector (computed with sense2vec, the explosion/sense2vec project on GitHub: contextually-keyed word vectors)
file: link to the file that contains the text
example: “Hello I am a sentence about OpenSearch.”, [0.23455, 0.644, 0.0, 0.3446, … , 0.1395], “/path/to/file.pdf”
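To make the structure concrete, here is a minimal sketch of how I imagine one record and its index definition could look as plain Python dictionaries. It assumes the OpenSearch k-NN plugin, where a vector field is declared with type knn_vector and a fixed dimension. The field names (text, feature_vector, file), the helper names, and the 4-dimensional toy vector are my own placeholders, not something taken from the sample data.

```python
def build_index_body(dimension):
    """Index definition with one k-NN vector field (field names are assumptions)."""
    return {
        # The k-NN plugin has to be enabled per index.
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                "text": {"type": "text"},
                "feature_vector": {"type": "knn_vector", "dimension": dimension},
                "file": {"type": "keyword"},
            }
        },
    }


def build_document(text, vector, file_path):
    """One record; the vector is converted to plain floats so it serializes to JSON."""
    return {
        "text": text,
        "feature_vector": [float(x) for x in vector],
        "file": file_path,
    }


# Toy example with a made-up 4-dimensional vector (real vectors are longer).
index_body = build_index_body(dimension=4)
doc = build_document(
    "Hello I am a sentence about OpenSearch.",
    [0.25, 0.5, 0.0, 0.75],
    "/path/to/file.pdf",
)
```

A NumPy array can be passed as the vector argument as well, since the helper only iterates over it and converts each entry to a float.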
The workflow I would like to implement looks like this:
1. search for a string (can be anything, might not be in the database)
2. receive a list of the most similar sentences (based on feature vector distance)
3. open the associated file of the closest result
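In code, I picture these steps roughly as in the sketch below. It assumes the search string has already been turned into a vector in the same way as the stored vectors (that step is not shown), that the vector field is called feature_vector, and that the response has the usual hits.hits[]._source layout of a search response. The helper names are hypothetical, and the fake response only stands in for what a real cluster would return.

```python
def build_knn_query(query_vector, k=5):
    """Search body for a k-NN query on the assumed 'feature_vector' field."""
    return {
        "size": k,
        "query": {
            "knn": {
                "feature_vector": {
                    "vector": [float(x) for x in query_vector],
                    "k": k,
                }
            }
        },
    }


def closest_file(response):
    """Return the 'file' value of the top hit, or None if nothing matched."""
    hits = response.get("hits", {}).get("hits", [])
    if not hits:
        return None
    return hits[0]["_source"]["file"]


# Step 1: the query vector would come from embedding the search string.
query = build_knn_query([0.25, 0.5, 0.0, 0.75], k=3)

# Step 2: a hand-written stand-in for the search response (not real output).
fake_response = {
    "hits": {
        "hits": [
            {"_score": 0.9, "_source": {"text": "Hello I am a sentence about OpenSearch.",
                                        "file": "/path/to/file.pdf"}},
        ]
    }
}

# Step 3: pick the file that belongs to the closest sentence.
path = closest_file(fake_response)
```

With a real cluster, the query body would be sent to the index's search endpoint and the returned path opened with whatever viewer fits the file type.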
Do you have a recommendation for how I could go about this? Is OpenSearch the right tool for it? I am very early in my research into OpenSearch, so please forgive me if I am missing something obvious.
Thank you in advance for any tips and leads.