Metric collection frequency

#1

Hi,

We’d like to add some Open Distro for ES metrics to the existing Elasticsearch monitoring in Sematext Cloud.

The docs at https://opendistro.github.io/for-elasticsearch-docs/docs/pa/api/ say: “Performance Analyzer updates its data every five seconds. If you create a custom client, we recommend using that same interval for calls to the API.”

Could you comment on what would happen to the accuracy of the collected metrics if they were collected every 10 seconds instead?

Thanks,
Otis

#2

Hi Otis, thank you for your interest in Performance Analyzer. As you noted, all metrics are computed every 5s. For the purposes of your question, the component exposes four types of metrics:

  • Averaged and normalized over 5s: calling every 10s only loses information, because you skip every other sample. In other words, you implicitly assume that the sample you collected represents the last 10s instead of the last 5s.
  • Sampled every 5s: same as above.
  • Cumulative: these are monotonically increasing, so you won’t lose any information by querying every 10s.
  • Un-normalized: these metrics are not normalized by the 5s interval, so if you sample every 10s you would multiply by two to get the value for 10s. You would also be making the same assumption as in the first two classes, namely that the one sample you collected represents the whole 10s.

The un-normalized metrics are:

Indexing_ThrottleTime
Cache_Query_Hit
Cache_Query_Miss
Cache_FieldData_Eviction
Cache_Request_Hit
Cache_Request_Miss
Cache_Request_Eviction
Refresh_Event
Refresh_Time
Flush_Event
Flush_Time
Merge_Event
Merge_Time
ShardEvents
ShardBulkDocs
GC_Collection_Event
GC_Collection_Time
ThreadPool_RejectedReqs
HTTP_RequestDocs
HTTP_TotalRequests
CB_TrippedEvents
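
To make the handling concrete, here is a minimal Python sketch of how a collector polling every 10s might treat a sample. The function names and the poll_interval_s parameter are made up for illustration and are not part of Performance Analyzer; only the metric names come from the list above.

```python
# Un-normalized metric names, copied from the list above.
UNNORMALIZED = frozenset("""
Indexing_ThrottleTime Cache_Query_Hit Cache_Query_Miss
Cache_FieldData_Eviction Cache_Request_Hit Cache_Request_Miss
Cache_Request_Eviction Refresh_Event Refresh_Time Flush_Event Flush_Time
Merge_Event Merge_Time ShardEvents ShardBulkDocs GC_Collection_Event
GC_Collection_Time ThreadPool_RejectedReqs HTTP_RequestDocs
HTTP_TotalRequests CB_TrippedEvents
""".split())

PA_INTERVAL_S = 5  # Performance Analyzer's update interval


def scale_to_poll_window(metric, value, poll_interval_s=10):
    """Estimate one 5s sample's value over the whole polling window.

    Hypothetical helper: un-normalized metrics are scaled by
    poll_interval_s / 5 (a factor of two for 10s polling); averaged and
    sampled metrics are returned unchanged. In both cases this assumes
    the single sample we saw is representative of the whole window.
    """
    if metric in UNNORMALIZED:
        return value * poll_interval_s / PA_INTERVAL_S
    return value


def cumulative_delta(previous, current):
    """Change in a cumulative (monotonically increasing) metric between
    two polls. No information is lost here, whatever the poll interval."""
    return current - previous
```

For example, a Flush_Event sample of 3 would be reported as 6 for the 10s window, while an averaged metric is passed through as-is.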