Elasticsearch Hybrid Query - No Results

I’m trying to run a hybrid search over two fields: a full-text field and a knn_vector (word embeddings) field. Over 10,000 Wikipedia documents are indexed on an ES stack, each with both fields populated (see “content” and “embeddings” in the mapping).

Note that the knn_vector field is defined inside a nested object.

This is the current mapping of the items indexed:

mapping = {
    "settings": {
        "index": {
            "knn": True,
            "knn.space_type": "cosinesimil"
        }
    },
    "mappings": {
        "dynamic": "strict",
        "properties": {
            "elasticId": {"type": "text"},
            "owners": {"type": "text"},
            "type": {"type": "keyword"},
            "accessLink": {"type": "keyword"},
            "content": {"type": "text"},
            "embeddings": {
                "type": "nested",
                "properties": {
                    "vector": {
                        "type": "knn_vector",
                        "dimension": VECTOR_DIM
                    }
                }
            }
        }
    }
}

My goal is to compare the query scores on both fields to understand whether one is more efficient than the other (full text vs. knn_vector), and how Elasticsearch decides which documents to return based on the score from each field.

I understand I could simply split the queries (two separate queries), but ideally, we might want to use a hybrid search of this type in production.

This is the current query that searches on both full text and the knn_vectors:

def MakeHybridSearch(query):
    query_vector = convert_to_embeddings(query)
    result = elastic.search({
        "explain": True,
        "profile": True,
        "size": 2,
        "query": {
            "function_score": {
                "functions": [
                    {
                        "filter": {
                            "match": {
                                "text": {
                                    "query": query,
                                    "boost": "5"
                                }
                            }
                        },
                        "weight": 2
                    },
                    {
                        "filter": {
                            "script": {
                                "source": "knn_score",
                                "params": {
                                    "field": "doc_vector",
                                    "vector": query_vector,
                                    "space_type": "l2"
                                }
                            }
                        },
                        "weight": 4
                    }
                ],
                "max_boost": 5,
                "score_mode": "replace",
                "boost_mode": "multiply",
                "min_score": 5
            }
        }
    }, index='files_en', size=1000)

The problem is that no query returns any hits.
Result:

{
    "took": 3,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 0,
            "relation": "eq"
        },
        "max_score": null,
        "hits": []
    },

Even when the query does return a response, it returns hits with a score of 0.

Is there an error in the query structure? Could the problem be on the mapping side? If not, is there a better way of doing this?

Thank you for your help!

@vincentD apologies for the late response. Some observations from my side:

  1. Since your knn_vector is nested, the field should be “embeddings.vector” instead of “doc_vector”:

{
    "script_score": {
        "script": {
            "source": "knn_score",
            "lang": "knn",
            "params": {
                "field": "embeddings.vector",
                "vector": [2.0, 3.0, 5.0, 6.0],
                "space_type": "l2"
            }
        }
    }
}

  2. Since min_score is 5, the calculated l2 score may well be below min_score. I would set min_score to zero to check whether you get any results at all.
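
To illustrate the min_score point, here is a rough sketch. It assumes the plugin converts an l2 distance into a score as 1 / (1 + squared distance), which is how the k-NN plugin documentation describes it; please check this against your version. The helper l2_knn_score and the sample vectors are made up for illustration, while the weight of 4 and min_score of 5 come from your query.

```python
def l2_knn_score(query_vector, doc_vector):
    """Assumed l2 scoring: 1 / (1 + squared Euclidean distance)."""
    squared_distance = sum((q - d) ** 2 for q, d in zip(query_vector, doc_vector))
    return 1.0 / (1.0 + squared_distance)


MIN_SCORE = 5   # min_score from the query in the question
KNN_WEIGHT = 4  # weight on the knn function in the question

# Best case: the document vector is identical to the query vector.
identical = l2_knn_score([2.0, 3.0, 5.0, 6.0], [2.0, 3.0, 5.0, 6.0])
# A close neighbour that differs by 1.0 in a single dimension.
nearby = l2_knn_score([2.0, 3.0, 5.0, 6.0], [2.0, 3.0, 5.0, 7.0])

print(identical)                            # 1.0
print(nearby)                               # 0.5
print(KNN_WEIGHT * identical >= MIN_SCORE)  # False
```

Under that assumption the l2 score can never exceed 1, so even an exact match with a weight of 4 stays below a min_score of 5.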

Also, if you only want to use custom scoring (as in your example), you can omit "index.knn": true. The benefit of this approach is faster indexing and lower memory usage, but you lose the ability to run standard k-NN queries on the index.
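
As a sketch of what such a script-scoring-only index body could look like: build_script_scoring_index is a hypothetical helper, 300 is only a placeholder dimension, and just the two searched fields from your mapping are shown (with "dynamic": "strict" you would need to keep the other fields as well).

```python
def build_script_scoring_index(vector_dim):
    """Index body for script-only scoring: no "knn" entries under settings."""
    return {
        "mappings": {
            "dynamic": "strict",
            "properties": {
                "content": {"type": "text"},
                "embeddings": {
                    "type": "nested",
                    "properties": {
                        # The field type stays knn_vector; only the index-level
                        # knn settings are left out.
                        "vector": {"type": "knn_vector", "dimension": vector_dim}
                    },
                },
            },
        }
    }


index_body = build_script_scoring_index(vector_dim=300)  # 300 is a placeholder
print("settings" in index_body)  # False
```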
Please let us know if you still have issues after making the above changes.
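
For reference, here is a minimal sketch of how the two changes could look together in the request body. It has not been run against a cluster and makes a few assumptions beyond the points above: build_hybrid_query is a hypothetical helper, the full-text match runs on the "content" field from your mapping (the query in the question matches on "text"), the match is moved into the function_score "query" clause, and score_mode and boost_mode are both set to "sum" because "replace" is a boost_mode value rather than a score_mode value. The script parameter names follow the snippet above and may differ between plugin versions.

```python
def build_hybrid_query(query_text, query_vector, min_score=0):
    """Build a function_score body: full-text match plus a knn_score function."""
    return {
        "size": 2,
        "query": {
            "function_score": {
                # Assumption: full-text search on "content", as in the mapping.
                "query": {"match": {"content": {"query": query_text}}},
                "functions": [
                    {
                        "script_score": {
                            "script": {
                                "source": "knn_score",
                                "lang": "knn",
                                "params": {
                                    # Nested field path instead of "doc_vector".
                                    "field": "embeddings.vector",
                                    "vector": query_vector,
                                    "space_type": "l2",
                                },
                            }
                        },
                        "weight": 4,
                    }
                ],
                # Assumption: add the text score and the weighted knn score.
                "score_mode": "sum",
                "boost_mode": "sum",
                # Start at 0 to confirm that hits come back, then raise it.
                "min_score": min_score,
            }
        },
    }


body = build_hybrid_query("wikipedia", [2.0, 3.0, 5.0, 6.0])
print(body["query"]["function_score"]["min_score"])  # 0
```

The returned dict can then be passed as the search body, for example elastic.search(body=body, index='files_en').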