Benchmark OpenSearch with ESRally

Hi,
Is it possible to benchmark my OpenSearch instance with ESRally? If not, how would you do it?

rickster

Hi,

Yes, it is possible. Please see "Upgrade from Elasticsearch OSS to OpenSearch" in the OpenSearch documentation for compatibility notes. You will need to add the following setting to your opensearch.yml:

compatibility.override_main_response_version: true

Beyond that, please note that some tracks (aka datasets) use mapping data types which are not supported by OpenSearch (e.g. see "Text type family" in the Elasticsearch Guide [master]), so you may not be able to run races across all tracks (but most of them just work). Hope it helps, thank you.

Thank you for your help. I will have a look at that.

Okay, after a bit of research I'm still confused. Is it right that ESRally is more for benchmarking throughput and latency? I'm asking because my cluster has high CPU usage during the night, probably because it is running some delete by queries.
Currently I store documents in indices by resource. That's the reason why I have to use a delete by query to remove documents which have reached their retention period.
So I want to try switching to a time-based index strategy with an index per day or week, where I can then delete the whole index instead.
Therefore I want to measure the CPU and memory usage of these two setups and compare them. What's the best way to do this?

That is correct (afaik): esrally is intended to benchmark indexing throughput and search/aggregation latencies. It seems like in your case you may need to outline your own goals: as far as I understand, you are focusing on server-side resource utilisation only. Not sure if benchmarking is what you need in this case (judging by the context you have provided).

Logically, deleting a whole index is a significantly more lightweight operation than running a delete by query. If you could spin up a cluster with this new indexing strategy side by side, you could compare the server-side resource consumption between both deployments.
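
To make the difference concrete, here is a rough sketch of the requests each strategy would issue for retention cleanup. It only builds the request descriptions and does not send anything. The index prefix "logs", the field name "@timestamp" and the helper names are assumptions made up for illustration, so adjust them to your own mapping.

```python
from datetime import date, timedelta


def daily_index_name(prefix: str, day: date) -> str:
    """Name of the daily index for `day`, e.g. logs-2022.03.01 (naming scheme is an assumption)."""
    return f"{prefix}-{day:%Y.%m.%d}"


def delete_by_query_request(index: str, timestamp_field: str, retention_days: int):
    """Current setup: one delete by query per resource index, matching every expired document."""
    body = {"query": {"range": {timestamp_field: {"lt": f"now-{retention_days}d/d"}}}}
    return ("POST", f"/{index}/_delete_by_query", body)


def expired_index_requests(prefix: str, existing_days, today: date, retention_days: int):
    """Time-based setup: one DELETE per daily index that is older than the retention window."""
    cutoff = today - timedelta(days=retention_days)
    return [
        ("DELETE", f"/{daily_index_name(prefix, d)}", None)
        for d in sorted(existing_days)
        if d < cutoff
    ]


if __name__ == "__main__":
    today = date(2022, 3, 10)
    days = [today - timedelta(days=n) for n in range(10)]
    print(delete_by_query_request("resource-a", "@timestamp", 7))
    for request in expired_index_requests("logs", days, today, 7):
        print(request)
```

The delete by query has to find and mark every matching document, while the index delete has no query body at all and simply drops the index.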

Not sure if it is very helpful.

Thanks for your quick response. You judged right on the context :wink:
I will try to restructure my data to a time-based index strategy on a separate cluster and compare the resource usage of the two using Prometheus and Grafana.
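
For the comparison itself, something like the following could work as a starting point. It is only a sketch: it assumes node_exporter is running on the cluster hosts (hence the node_cpu_seconds_total metric), the Prometheus addresses are placeholders, and the helper names are made up. It builds a URL for the Prometheus query_range HTTP API and computes the relative change between two lists of samples that have already been fetched.

```python
from statistics import mean
from urllib.parse import urlencode

# Assumed PromQL: overall CPU utilisation derived from node_exporter's idle counter.
CPU_UTILISATION = '1 - avg(rate(node_cpu_seconds_total{mode="idle"}[5m]))'


def query_range_url(base_url: str, promql: str, start: int, end: int, step: str = "60s") -> str:
    """URL for Prometheus' /api/v1/query_range endpoint; start and end are Unix timestamps."""
    params = urlencode({"query": promql, "start": start, "end": end, "step": step})
    return f"{base_url}/api/v1/query_range?{params}"


def relative_change(baseline_samples, candidate_samples) -> float:
    """Relative change of the candidate's mean versus the baseline's mean (-0.5 means 50% lower)."""
    base = mean(baseline_samples)
    return (mean(candidate_samples) - base) / base


if __name__ == "__main__":
    # Placeholder addresses for the two Prometheus instances (or one instance with two label sets).
    for name, host in [("current", "http://prometheus-old:9090"), ("time-based", "http://prometheus-new:9090")]:
        print(name, query_range_url(host, CPU_UTILISATION, 1646870400, 1646899200))
```

Querying the same nightly time window on both clusters keeps the comparison fair, since that is when the delete by queries run.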