Getting latency and timeouts for initial knn search queries even after using warmup api

Hi Everyone,
I am seeing high latency and timeouts on the first few knn search queries against my Elasticsearch index after indexing. This happens even after warming up the index with the warmup API described in the Open Distro for Elasticsearch documentation, which returns success for all shards in the response. Has anyone experienced something similar?
Thanks in advance.

Hi @mohit62

Could you paste the output of /_opendistro/_knn/stats?pretty?

Also, what dimension and space type are you using?
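
In case it helps, here is a minimal Python sketch for calling the warmup and stats endpoints. The endpoint paths are the Open Distro k-NN ones mentioned in this thread; the base URL, the index names, and the lack of authentication are assumptions for illustration, so adjust them for your cluster (a cluster with the security plugin enabled would need HTTPS and credentials).

```python
import json
import urllib.request

# Assumption: a local, unsecured cluster. Change this for your environment.
BASE_URL = "http://localhost:9200"


def warmup_url(indices, base_url=BASE_URL):
    """Build the k-NN warmup URL for a list of index names."""
    return f"{base_url}/_opendistro/_knn/warmup/{','.join(indices)}?pretty"


def stats_url(base_url=BASE_URL):
    """Build the k-NN stats URL."""
    return f"{base_url}/_opendistro/_knn/stats?pretty"


def get_json(url, timeout=60):
    """Send a GET request and decode the JSON response body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def main():
    # Hypothetical index names; replace with your own.
    print(get_json(warmup_url(["index-1", "index-2"])))
    print(get_json(stats_url()))
```

Calling main() warms up the listed indices first and then prints the stats, so the two responses can be compared side by side.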

Jack

Hi Jack, below is the response from the knn stats API:
{
  "_nodes" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "cluster_name" : "cluster-abc",
  "circuit_breaker_triggered" : false,
  "nodes" : {
    "1" : {
      "miss_count" : 0,
      "graph_memory_usage_percentage" : 0.0,
      "graph_query_requests" : 0,
      "graph_memory_usage" : 0,
      "cache_capacity_reached" : false,
      "graph_index_requests" : 0,
      "load_exception_count" : 0,
      "load_success_count" : 0,
      "eviction_count" : 0,
      "indices_in_cache" : { },
      "script_query_errors" : 0,
      "script_compilations" : 0,
      "script_query_requests" : 0,
      "graph_query_errors" : 0,
      "hit_count" : 0,
      "graph_index_errors" : 0,
      "knn_query_requests" : 0,
      "total_load_time" : 0,
      "script_compilation_errors" : 0
    },
    "2" : {
      "miss_count" : 0,
      "graph_memory_usage_percentage" : 0.0,
      "graph_query_requests" : 0,
      "graph_memory_usage" : 0,
      "cache_capacity_reached" : false,
      "graph_index_requests" : 0,
      "load_exception_count" : 0,
      "load_success_count" : 0,
      "eviction_count" : 0,
      "indices_in_cache" : { },
      "script_query_errors" : 0,
      "script_compilations" : 0,
      "script_query_requests" : 0,
      "graph_query_errors" : 0,
      "hit_count" : 0,
      "graph_index_errors" : 0,
      "knn_query_requests" : 0,
      "total_load_time" : 0,
      "script_compilation_errors" : 0
    },
    "3" : {
      "miss_count" : 0,
      "graph_memory_usage_percentage" : 0.0,
      "graph_query_requests" : 0,
      "graph_memory_usage" : 0,
      "cache_capacity_reached" : false,
      "graph_index_requests" : 0,
      "load_exception_count" : 0,
      "load_success_count" : 0,
      "eviction_count" : 0,
      "indices_in_cache" : { },
      "script_query_errors" : 0,
      "script_compilations" : 0,
      "script_query_requests" : 0,
      "graph_query_errors" : 0,
      "hit_count" : 0,
      "graph_index_errors" : 0,
      "knn_query_requests" : 0,
      "total_load_time" : 0,
      "script_compilation_errors" : 0
    },
    "4" : {
      "miss_count" : 90,
      "graph_memory_usage_percentage" : 37.60101,
      "graph_query_requests" : 1325,
      "graph_memory_usage" : 1749962,
      "cache_capacity_reached" : false,
      "graph_index_requests" : 117,
      "load_exception_count" : 0,
      "load_success_count" : 89,
      "eviction_count" : 0,
      "indices_in_cache" : {
        "index-1" : {
          "graph_memory_usage_percentage" : 4.53506,
          "graph_memory_usage" : 211063,
          "graph_count" : 10
        },
        "index-2" : {
          "graph_memory_usage_percentage" : 0.005951832,
          "graph_memory_usage" : 277,
          "graph_count" : 4
        },
        "index-3" : {
          "graph_memory_usage_percentage" : 18.508886,
          "graph_memory_usage" : 861409,
          "graph_count" : 10
        },
        "index-4" : {
          "graph_memory_usage_percentage" : 8.5947034E-4,
          "graph_memory_usage" : 40,
          "graph_count" : 1
        },
        "index-5" : {
          "graph_memory_usage_percentage" : 14.550252,
          "graph_memory_usage" : 677173,
          "graph_count" : 9
        }
      },
      "script_query_errors" : 0,
      "script_compilations" : 0,
      "script_query_requests" : 0,
      "graph_query_errors" : 0,
      "hit_count" : 1235,
      "graph_index_errors" : 0,
      "knn_query_requests" : 221,
      "total_load_time" : 13741747470,
      "script_compilation_errors" : 0
    },
    "5" : {
      "miss_count" : 93,
      "graph_memory_usage_percentage" : 35.72612,
      "graph_query_requests" : 1338,
      "graph_memory_usage" : 1662704,
      "cache_capacity_reached" : false,
      "graph_index_requests" : 111,
      "load_exception_count" : 0,
      "load_success_count" : 92,
      "eviction_count" : 0,
      "indices_in_cache" : {
        "index-1" : {
          "graph_memory_usage" : 211063,
          "graph_memory_usage_percentage" : 4.53506,
          "graph_count" : 10
        },
        "index-2" : {
          "graph_memory_usage" : 277,
          "graph_memory_usage_percentage" : 0.005951832,
          "graph_count" : 4
        },
        "index-3" : {
          "graph_memory_usage" : 40,
          "graph_memory_usage_percentage" : 8.5947034E-4,
          "graph_count" : 1
        },
        "index-4" : {
          "graph_memory_usage" : 677170,
          "graph_memory_usage_percentage" : 14.550189,
          "graph_count" : 8
        },
        "index-5" : {
          "graph_memory_usage" : 774154,
          "graph_memory_usage_percentage" : 16.63406,
          "graph_count" : 9
        }
      },
      "script_query_errors" : 0,
      "script_compilations" : 0,
      "script_query_requests" : 0,
      "graph_query_errors" : 0,
      "hit_count" : 1245,
      "graph_index_errors" : 0,
      "knn_query_requests" : 242,
      "total_load_time" : 13417690680,
      "script_compilation_errors" : 0
    }
  }
}
Each index has two fields that use knn term vectors, each with a dimension of around 10000, and the space type is cosinesimil.
Thanks,
Mohit
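
For a sense of scale at this dimension, here is a rough Python sketch of the memory rule of thumb from the Open Distro k-NN documentation, 1.1 * (4 * dimension + 8 * M) bytes per vector. The value M = 16 is an assumption (the plugin default), since the actual setting has not been shared in this thread, and the function name is made up for illustration.

```python
def estimate_graph_memory_bytes(num_vectors, dimension, m=16):
    """Rule-of-thumb native memory estimate for HNSW graphs, in bytes.

    Uses 1.1 * (4 * dimension + 8 * M) bytes per vector.
    m=16 is an assumed default, not a value confirmed in this thread.
    """
    return 1.1 * (4 * dimension + 8 * m) * num_vectors


# Per-vector estimate for a single field with dimension 10000.
per_vector = estimate_graph_memory_bytes(1, 10000)
print(f"{per_vector / 1024:.1f} KiB per vector")
```

With two vector fields per document, the per-document estimate would be roughly twice the per-vector figure, so the document count matters a lot for how long graphs take to load.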

@mohit62 Thanks, it looks like the graphs are loaded into memory. A few follow-up questions:

  1. What ODFE version are you using?
  2. Could you provide the index mapping?
  3. Could you provide a sample query you are using?
  4. How many documents are there in total?
  5. What are the efSearch, efConstruction, and M algorithm parameters set to?
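
As a side note, a quick way to read the stats output above is to summarize it per node. The sketch below is only illustrative: the function name is made up, the sample is trimmed from the posted stats (node 1 and node 4), and it assumes total_load_time is reported in nanoseconds.

```python
def summarize_knn_stats(stats):
    """Summarize per-node cache state from a k-NN stats response (a dict)."""
    summary = {}
    for node_id, node in stats.get("nodes", {}).items():
        cached = node.get("indices_in_cache", {})
        hits = node.get("hit_count", 0)
        misses = node.get("miss_count", 0)
        lookups = hits + misses
        loads = node.get("load_success_count", 0)
        summary[node_id] = {
            "indices_in_cache": sorted(cached),
            "graph_count": sum(i.get("graph_count", 0) for i in cached.values()),
            # None means the node has served no graph lookups yet.
            "hit_ratio": hits / lookups if lookups else None,
            # Assumption: total_load_time is in nanoseconds.
            "avg_load_seconds": node.get("total_load_time", 0) / loads / 1e9 if loads else None,
        }
    return summary


# Trimmed sample taken from the stats posted in this thread.
sample = {
    "nodes": {
        "1": {"miss_count": 0, "hit_count": 0, "load_success_count": 0,
              "total_load_time": 0, "indices_in_cache": {}},
        "4": {"miss_count": 90, "hit_count": 1235, "load_success_count": 89,
              "total_load_time": 13741747470,
              "indices_in_cache": {
                  "index-1": {"graph_count": 10}, "index-2": {"graph_count": 4},
                  "index-3": {"graph_count": 10}, "index-4": {"graph_count": 1},
                  "index-5": {"graph_count": 9}}},
    }
}

for node_id, info in summarize_knn_stats(sample).items():
    print(node_id, info)
```

On the posted numbers, node 4 shows a hit ratio of about 0.93 and an average graph load time of roughly 0.15 s, while nodes 1 to 3 hold no graphs at all.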