Hi,
I have a rolling time-series forecast that is computed 3x per day, and I'm wondering how I should structure my data in the ES index so that anomaly detection can find errors.
Currently, each forecast has an 'observation datetime' (the time the rolling forecast was computed), which is associated with a series of (datetime, value) points representing the forecasted timeseries itself.
i.e. one rolling forecast would be:
observation_datetime = ‘2020-12-26 10:00’,
forecast_timeseries = {
2020-12-26 10:30, 2.34,
2020-12-26 11:00, 4.34,
2020-12-26 11:30, 4.44,
2020-12-26 12:00, 4.44,
2020-12-26 12:30, 4.44,
2020-12-26 13:00, 4.44,
2020-12-26 13:30, 2.34,
2020-12-26 14:00, 4.34,
2020-12-26 14:30, 4.44,
continuing for 5 years
}
and a second forecast later the same day would be:
observation_datetime = ‘2020-12-26 11:00’,
forecast_timeseries = {
2020-12-26 11:30, 4.44,
2020-12-26 12:00, 4.44,
2020-12-26 12:30, 4.44,
2020-12-26 13:00, 4.44,
2020-12-26 13:30, 2.34,
2020-12-26 14:00, 4.34,
2020-12-26 14:30, 4.44,
continuing for 5 years
}
etc.
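To make the layout concrete, here is a minimal Python sketch of how I could flatten one forecast into one document per point before indexing. The helper `flatten_forecast` and the field names `forecast_datetime`, `lead_minutes` and `value` are hypothetical (only `observation_datetime` matches my current data), and I don't know whether this is the shape the anomaly detection actually needs.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"  # timestamp format used in the examples above


def flatten_forecast(observation_datetime, forecast_timeseries):
    """Turn one rolling forecast into one flat document per forecast point.

    Field names other than observation_datetime are hypothetical.
    """
    observed = datetime.strptime(observation_datetime, FMT)
    docs = []
    for forecast_datetime, value in forecast_timeseries:
        target = datetime.strptime(forecast_datetime, FMT)
        docs.append({
            "observation_datetime": observed.isoformat(),
            "forecast_datetime": target.isoformat(),
            # how far ahead of the computation time this point lies
            "lead_minutes": int((target - observed).total_seconds() // 60),
            "value": value,
        })
    return docs


# First three points of the 10:00 forecast shown above
docs = flatten_forecast(
    "2020-12-26 10:00",
    [
        ("2020-12-26 10:30", 2.34),
        ("2020-12-26 11:00", 4.34),
        ("2020-12-26 11:30", 4.44),
    ],
)
```

With this shape, every document carries both the time the forecast was made and the time it refers to, so either one could in principle serve as the timestamp field.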
I would like to use the anomaly detection feature to compare the forecast timeseries to each other and detect whether there are any errors, e.g. data gaps, outliers in values, etc.
Any suggestions on how I should structure the ES index for my rolling forecast data so that I can take advantage of the OpenDistro anomaly detection feature for the dataset above?
Thx