Anomaly detection with term aggregations

I’ve been doing some research on the Anomaly Detection plugin and comparing the results with a separate analysis performed in Python. The use-case included searching for security anomalies, such as the number of 401 and 403 statuses per IP/user. For aggregating feature data I used single-value aggregations such as value_count and cardinality. The results are quite satisfactory.

I’ve got a couple more use cases I would like to try out but I’m not sure if Anomaly Detection supports such functionality. For example, I would like to perform term aggregations on features and search for anomalies within the count of terms. Specifically, I would like to provide the following input to the model:

[ { "country": "US", "doc_count": 1000 }, { "country": "CA", "doc_count": 700, }, { "country": "FR", "doc_count": 2, } ]

The expected behavior here would be showing an anomaly for France which is not a country visitors usually come from.

Is this functionality currently available with the plugin? If not, is it planned for implementation?

This is anomaly detection based on cardinality which is already on our plan. Paste duplicate Github question here Term aggregation in custom query for features · Issue #88 · opendistro-for-elasticsearch/anomaly-detection · GitHub

That’s great to hear. Thank you for the response Yaliang