Locate reindex bottleneck

sezuan2 · September 23, 2021, 3:16pm

Hello,

I need some help locating the reason for slow reindexing speed. I’m reindexing one index from a 3-node cluster to another 3-node cluster in the same datacenter. The relevant changes that should affect the speed are disabling the replicas and the refresh on the destination index:

      "index.number_of_replicas" : "0",
      "index.refresh_interval" : "-1",
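
For readers who want to apply the two settings above, here is a minimal Python sketch that builds the request bodies for the index settings API (`PUT /<index>/_settings`). The helper names are hypothetical. The restore values (1 replica, a `1s` refresh interval) are assumptions, not something stated in this thread.

```python
import json


def bulk_load_settings():
    """Settings body matching the two values quoted in the post."""
    return {"index": {"number_of_replicas": "0", "refresh_interval": "-1"}}


def restore_settings(replicas=1, refresh_interval="1s"):
    """Settings body to put back after the reindex (assumed values)."""
    return {
        "index": {
            "number_of_replicas": str(replicas),
            "refresh_interval": refresh_interval,
        }
    }


if __name__ == "__main__":
    # Each body would be sent with PUT /<destination-index>/_settings.
    print(json.dumps(bulk_load_settings(), indent=2))
    print(json.dumps(restore_settings(), indent=2))
```
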

Neither the CPU, nor the disk I/O, nor the network is even remotely saturated. I get a pretty constant indexing rate of about 1240 documents/s.
The index itself is a bit special since it is, for reasons, heavily oversharded. There are 960 primary shards + 1 replica.

How can I identify the bottleneck?

best regards,
Matthias

searchymcsearchface · September 24, 2021, 1:48pm

Anything unusual about this index itself - e.g. large docs? stored fields?

sezuan2 · September 29, 2021, 10:47am

That’s a good point! The doc size varies a lot, from about 1 KB to tens of MB.

My next approach was to split the documents into groups by size, like 0 to 10000 bytes, 10000 to 15000 bytes… This also allows running the reindexing in parallel and setting a proper batch size for each group. That was necessary because reindexing everything at once was often interrupted when the 100 MB buffer was exceeded.
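
The size-bucket idea above can be sketched roughly as follows, in Python. This is not the poster's actual script. It assumes each document carries its size in a numeric field, here called `doc_size` (a hypothetical name), and that the reindex runs from a remote cluster, where `source.size` sets the batch size. The 50% headroom and the cap of 1000 documents per batch are arbitrary illustration values.

```python
import json

BUFFER_LIMIT_BYTES = 100 * 1024 * 1024  # the 100 MB buffer mentioned in the post


def batch_size_for(max_doc_bytes, limit=BUFFER_LIMIT_BYTES, headroom=0.5, cap=1000):
    """Largest batch whose worst-case payload fits in a fraction of the buffer."""
    budget = int(limit * headroom)
    return max(1, min(cap, budget // max_doc_bytes))


def reindex_bodies(boundaries, source_index, dest_index, remote_host,
                   size_field="doc_size"):
    """Build one _reindex request body per [low, high) size bucket."""
    bodies = []
    for low, high in zip(boundaries, boundaries[1:]):
        bodies.append({
            "source": {
                "remote": {"host": remote_host},
                "index": source_index,
                # Batch size derived from the largest document in the bucket.
                "size": batch_size_for(high),
                "query": {"range": {size_field: {"gte": low, "lt": high}}},
            },
            "dest": {"index": dest_index},
        })
    return bodies


if __name__ == "__main__":
    # Hypothetical bucket boundaries in bytes, index names, and host.
    buckets = [0, 10_000, 15_000, 1_000_000, 50 * 1024 * 1024]
    for body in reindex_bodies(buckets, "old-index", "new-index",
                               "http://old-cluster:9200"):
        print(json.dumps(body))
```

Each body could be posted to `_reindex` as its own task, so the buckets run in parallel. Buckets of small documents get large batches, while buckets of multi-megabyte documents get only a few documents per batch.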

With this approach I’ve achieved a reindexing rate of about 6K/s, which sounds reasonable.

searchymcsearchface · September 30, 2021, 1:37pm

Yeah - OpenSearch is geared more toward constant ingestion of documents than toward large batch loads like the one you originally described. It seems like you are doing a good job now. There are lots of optimization strategies available, but you often need to tune them to your specific document quirks.
