Hi,
In my index, I need to update a specific field.
I update by document ID, and each update changes only a single field in the document.
Example:
POST fileinstances/_update/updatedfileinstance
{
  "doc": {
    "lastUpdated": "2021-05-31 11:20:00"
  }
}
I try to minimize updates because, as I understand it, every update creates a new version of the document and marks the previous version as deleted.
This means that, over time, my index will keep growing.
As I read here, automatic merging can reduce the index size and optimize it:
Merging reduces the number of segments in each shard by merging some of them together, and also frees up the space used by deleted documents. Merging normally happens automatically, but sometimes it is useful to trigger a merge manually.
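For context, this is how I would expect to check how many deleted documents the index is carrying and how its segments look. It is only a sketch: I am assuming the standard _cat APIs and the index name fileinstances from my example above.

```json
# Document count, deleted-document count and store size for the index
GET _cat/indices/fileinstances?v&h=index,docs.count,docs.deleted,store.size

# Per-segment view (size and deleted docs per segment), assuming the same index name
GET _cat/segments/fileinstances?v
```

If docs.deleted drops without a manual force merge, I would take that as a sign that background merging is still running.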
My questions:
- What is the schedule of the automatic merge? Where can I see it, and how do I control it?
- I read somewhere that automatic merging is not performed for indices larger than 5 GB. Is that correct? If so, can I increase the threshold to 50 GB?
- If I keep updating documents, does that prevent the automatic merge from running in the background?
- I have an ISM policy that rolls over the index once it reaches 50 GB. How do I force-merge the old index once it becomes read-only? Is the following policy correct?
PUT _opendistro/_ism/policies/fileinstances_policy
{
  "policy": {
    "description": "fileinstances rollover policy.",
    "default_state": "rollover",
    "states": [
      {
        "name": "rollover",
        "actions": [
          {
            "rollover": {
              "min_size": "50gb"
            }
          },
          {
            "force_merge": {
              "max_num_segments": 1
            }
          }
        ],
        "transitions": []
      }
    ]
  }
}
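An alternative I am considering is to move force_merge into its own state, so that it only starts after the rollover action has completed. This is a sketch, not something I have verified: the state name "merge" is my own choice, and I do not know whether splitting the states is actually required.

```json
# Sketch only: same rollover condition, with force_merge in a separate state.
# The state name "merge" is an assumption, not taken from any documentation.
PUT _opendistro/_ism/policies/fileinstances_policy
{
  "policy": {
    "description": "fileinstances rollover policy, then force merge.",
    "default_state": "rollover",
    "states": [
      {
        "name": "rollover",
        "actions": [
          { "rollover": { "min_size": "50gb" } }
        ],
        "transitions": [
          { "state_name": "merge" }
        ]
      },
      {
        "name": "merge",
        "actions": [
          { "force_merge": { "max_num_segments": 1 } }
        ],
        "transitions": []
      }
    ]
  }
}

# Manual equivalent on a rolled-over index.
# "fileinstances-000001" is a hypothetical name for the old index.
POST fileinstances-000001/_forcemerge?max_num_segments=1
```

With an empty transition condition, my understanding is that the index moves to the "merge" state as soon as the rollover action finishes.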
Thank you,
Ori.