I want to benchmark my OpenSearch cluster with esrally. For that I need indices of different sizes (e.g. 100 MB, 5 GB, 10 GB, etc.). I can use some data from my production cluster, which I can import via snapshot.
Is there a way in esrally to define how large an index should be after my data has been indexed? I wasn't able to find anything about that in the documentation.
I think the problem is that esrally can't determine how much space the documents will take up once they are analyzed and stored in OpenSearch.
A possible solution would be to create differently sized files from my snapshot index, each containing the data for one of the indices created by esrally. But that approach would consume a lot of storage on my machine.
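To illustrate what I mean: assuming the snapshot data can be exported as a newline-delimited JSON corpus (the format esrally tracks typically consume), a minimal sketch of the slicing idea could look like this. The function name `slice_corpus` and the target sizes are just placeholders; the sizes are approximate raw-file sizes, not the on-disk index sizes after analysis.

```python
import os

def slice_corpus(src_path, targets, out_dir="."):
    """Write size-limited prefix slices of a newline-delimited JSON corpus.

    targets: dict mapping output filename -> approximate size in bytes.
    Each slice is a prefix of the source file, cut at a document
    boundary, so every slice remains valid NDJSON. The source is read
    only once; all slices are written in the same pass.
    """
    os.makedirs(out_dir, exist_ok=True)
    # For each target: [open file handle, byte limit, bytes written so far]
    sinks = {name: [open(os.path.join(out_dir, name), "wb"), limit, 0]
             for name, limit in targets.items()}
    try:
        with open(src_path, "rb") as src:
            for line in src:
                any_open = False
                for state in sinks.values():
                    f, limit, written = state
                    if written < limit:
                        f.write(line)
                        state[2] = written + len(line)
                        any_open = True
                if not any_open:  # all targets reached their size
                    break
    finally:
        for f, _, _ in sinks.values():
            f.close()

# Hypothetical usage with a corpus exported from the snapshot:
# slice_corpus("corpus.json", {"corpus-100mb.json": 100 * 1024**2,
#                              "corpus-5gb.json": 5 * 1024**3})
```

Since every smaller file is a prefix of the larger ones, this still duplicates data up to roughly the sum of all targets, which is exactly the storage problem I'd like to avoid.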
Do you have any other ideas?