I was able to connect to the cluster. I think I had a VPN issue.
I wanted to try to reproduce the test cases for k=1. With the following query, I was able to get a few random documents from the index:
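A sketch of such a query, using function_score with random_score to shuffle results (the exact body I ran is not shown above, and size here is illustrative):

```json
{
  "size": 5,
  "query": {
    "function_score": {
      "query": { "match_all": {} },
      "random_score": {}
    }
  }
}
```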
I took the vectors from 5 of the returned results (document IDs: my-7557114, ss-204754389_9303, my-2173151, my-8457237, kv-1842763) and ran the following query for each:
{
  "size" : 1,
  ...
}
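The full body was a k-NN query of roughly this shape (my_vector_field is a placeholder for the actual field name, and the vector values are omitted):

```json
{
  "size": 1,
  "query": {
    "knn": {
      "my_vector_field": {
        "vector": [ ... ],
        "k": 1
      }
    }
  }
}
```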
Each query returned the corresponding document. While this is only a small sample, it suggests that recall is much higher than 0.0011. If possible, could you provide document IDs for the Elasticsearch documents whose queries do not return their associated doc ID?
Additionally, looking at the knn-index, it has 29 segments. Each of these segments corresponds to one HNSW graph. During search, Elasticsearch will run the k-NN search over each segment. Each segment will produce its top k results with a score of 1/(1 + distance from vector to query). Then, Elasticsearch will take the top size scores from all of the segment results. So, searching over many smaller graphs and then aggregating the results may improve recall (at the expense of latency) compared to searching a single large graph. Consequently, taking the NMSLIB results as ground truth is not correct. It may be better to check out the ann-benchmarks datasets: they contain a set of queries and the ground-truth nearest neighbors for each.
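To make the aggregation concrete, here is a small sketch of the per-segment top-k plus global top-size merge described above (the names and data structures are mine, not the plugin's; each segment is modeled as a list of precomputed (doc_id, distance) pairs):

```python
import heapq

def score(distance):
    """Distance-to-score conversion: 1 / (1 + distance)."""
    return 1.0 / (1.0 + distance)

def top_k_for_segment(segment_hits, k):
    """segment_hits: list of (doc_id, distance) pairs standing in
    for one HNSW graph's k-NN result."""
    return heapq.nlargest(k, ((score(d), doc) for doc, d in segment_hits))

def merge_segments(per_segment_hits, k, size):
    """Take the top k from each segment, then the global top `size`."""
    merged = []
    for hits in per_segment_hits:
        merged.extend(top_k_for_segment(hits, k))
    return heapq.nlargest(size, merged)
```

With k=1 and size=2 over two segments, each segment contributes its single best hit and the two are ranked by score, which is the behavior described above.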
One more thing: in the code you used to calculate recall, do you account for floating-point division in the return of intersection_score (i.e., with integer division, len(lst3)/len(lst1) will always yield 0 or 1)? This could reduce the score as well.
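For instance, if a query returns 2 of 4 true neighbors, integer division gives 0 instead of 0.5. A sketch of the fix, reusing the lst1/lst3 names from your snippet (the surrounding function is assumed):

```python
def intersection_score(expected_ids, returned_ids):
    """Fraction of the expected nearest neighbors that were returned."""
    lst1 = list(expected_ids)
    lst3 = [doc for doc in returned_ids if doc in lst1]
    # With integer division (Python 2, or // in Python 3),
    # len(lst3) / len(lst1) collapses every partial match to 0.
    # Forcing float division preserves fractional recall:
    return float(len(lst3)) / len(lst1)
```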