Tasks with "source" : "opendistro-im" stuck in queue, one is executing, rest waiting

Hi,

I am very new to opendistro.
We have it deployed in kubernetes (1.13.2), and since a few days it stops processing the logs after a while (it works for a few days after deployment and stops).

I have checked cluster state and indexes in the kibana dev tools and all is reported green.
Also GET _cluster/allocation/explain returns no unallocated stuff.

The problem is - a lot of rejected writes and a lot of tasks in pending tasks (with wait times going into many hours) all with "source" : "opendistro-im"

GET /_cat/thread_pool?v=true&h=id,name,active,rejected,completed:

Yd1JOURVRaGqRY6PuwfoMA ad-batch-task-threadpool                    0        0         0
Yd1JOURVRaGqRY6PuwfoMA ad-threadpool                               0        0         0
Yd1JOURVRaGqRY6PuwfoMA analyze                                     0        0         0
Yd1JOURVRaGqRY6PuwfoMA fetch_shard_started                         0        0         0
Yd1JOURVRaGqRY6PuwfoMA fetch_shard_store                           0        0         0
Yd1JOURVRaGqRY6PuwfoMA flush                                       0        0         0
Yd1JOURVRaGqRY6PuwfoMA force_merge                                 0        0         0
Yd1JOURVRaGqRY6PuwfoMA generic                                     0        0     55735
Yd1JOURVRaGqRY6PuwfoMA get                                         0        0         0
Yd1JOURVRaGqRY6PuwfoMA listener                                    0        0         0
Yd1JOURVRaGqRY6PuwfoMA management                                  1        0     80405
Yd1JOURVRaGqRY6PuwfoMA open_distro_job_scheduler                   0        0         0
Yd1JOURVRaGqRY6PuwfoMA opendistro_asynchronous_search_generic      0        0       618
Yd1JOURVRaGqRY6PuwfoMA refresh                                     0        0         0
Yd1JOURVRaGqRY6PuwfoMA repository_azure                            0        0         0
Yd1JOURVRaGqRY6PuwfoMA search                                      0        0         0
Yd1JOURVRaGqRY6PuwfoMA search_throttled                            0        0         0
Yd1JOURVRaGqRY6PuwfoMA snapshot                                    0        0         0
Yd1JOURVRaGqRY6PuwfoMA sql-worker                                  0        0         0
Yd1JOURVRaGqRY6PuwfoMA system_read                                 0        0         0
Yd1JOURVRaGqRY6PuwfoMA system_write                                0        0         0
Yd1JOURVRaGqRY6PuwfoMA warmer                                      0        0         0
Yd1JOURVRaGqRY6PuwfoMA write                                       0        0         0
L9UXmY5vSku7r9caSeWzBA ad-batch-task-threadpool                    0        0         0
L9UXmY5vSku7r9caSeWzBA ad-threadpool                               0        0         0
L9UXmY5vSku7r9caSeWzBA analyze                                     0        0         0
L9UXmY5vSku7r9caSeWzBA fetch_shard_started                         0        0      1127
L9UXmY5vSku7r9caSeWzBA fetch_shard_store                           0        0      1983
L9UXmY5vSku7r9caSeWzBA flush                                       0        0      3073
L9UXmY5vSku7r9caSeWzBA force_merge                                 0        0         0
L9UXmY5vSku7r9caSeWzBA generic                                     0        0   7033791
L9UXmY5vSku7r9caSeWzBA get                                         0        0      4008
L9UXmY5vSku7r9caSeWzBA listener                                    0        0         0
L9UXmY5vSku7r9caSeWzBA management                                  1        0   5929836
L9UXmY5vSku7r9caSeWzBA open_distro_job_scheduler                   0        0      4614
L9UXmY5vSku7r9caSeWzBA opendistro_asynchronous_search_generic      0        0       822
L9UXmY5vSku7r9caSeWzBA refresh                                     0        0  35359191
L9UXmY5vSku7r9caSeWzBA repository_azure                            0        0         0
L9UXmY5vSku7r9caSeWzBA search                                      0        0     67743
L9UXmY5vSku7r9caSeWzBA search_throttled                            0        0         0
L9UXmY5vSku7r9caSeWzBA snapshot                                    0        0         0
L9UXmY5vSku7r9caSeWzBA sql-worker                                  0        0         0
L9UXmY5vSku7r9caSeWzBA system_read                                 0        0       117
L9UXmY5vSku7r9caSeWzBA system_write                                0        0        95
L9UXmY5vSku7r9caSeWzBA warmer                                      0        0      1704
L9UXmY5vSku7r9caSeWzBA write                                       0        0   2195576
OPUWg9MRQjG-RbYZYqu6qA ad-batch-task-threadpool                    0        0         0
OPUWg9MRQjG-RbYZYqu6qA ad-threadpool                               0        0         0
OPUWg9MRQjG-RbYZYqu6qA analyze                                     0        0         0
OPUWg9MRQjG-RbYZYqu6qA fetch_shard_started                         0        0      1127
OPUWg9MRQjG-RbYZYqu6qA fetch_shard_store                           0        0      2105
OPUWg9MRQjG-RbYZYqu6qA flush                                       0        0      2008
OPUWg9MRQjG-RbYZYqu6qA force_merge                                 0        0         0
OPUWg9MRQjG-RbYZYqu6qA generic                                     0        0   7607326
OPUWg9MRQjG-RbYZYqu6qA get                                         0        0      4128
OPUWg9MRQjG-RbYZYqu6qA listener                                    0        0         0
OPUWg9MRQjG-RbYZYqu6qA management                                  1        0   5368835
OPUWg9MRQjG-RbYZYqu6qA open_distro_job_scheduler                   0        0         0
OPUWg9MRQjG-RbYZYqu6qA opendistro_asynchronous_search_generic      0        0       855
OPUWg9MRQjG-RbYZYqu6qA refresh                                     2        0  32284254
OPUWg9MRQjG-RbYZYqu6qA repository_azure                            0        0         0
OPUWg9MRQjG-RbYZYqu6qA search                                      0        0     54209
OPUWg9MRQjG-RbYZYqu6qA search_throttled                            0        0         0
OPUWg9MRQjG-RbYZYqu6qA snapshot                                    0        0         0
OPUWg9MRQjG-RbYZYqu6qA sql-worker                                  0        0         0
OPUWg9MRQjG-RbYZYqu6qA system_read                                 0        0       747
OPUWg9MRQjG-RbYZYqu6qA system_write                                0        0      4539
OPUWg9MRQjG-RbYZYqu6qA warmer                                      0        0      2680
OPUWg9MRQjG-RbYZYqu6qA write                                       4        0   1583153
DVlyTw4rQg25wfR1Acixkw ad-batch-task-threadpool                    0        0         0
DVlyTw4rQg25wfR1Acixkw ad-threadpool                               0        0         0
DVlyTw4rQg25wfR1Acixkw analyze                                     0        0         0
DVlyTw4rQg25wfR1Acixkw fetch_shard_started                         0        0         0
DVlyTw4rQg25wfR1Acixkw fetch_shard_store                           0        0         0
DVlyTw4rQg25wfR1Acixkw flush                                       0        0         0
DVlyTw4rQg25wfR1Acixkw force_merge                                 0        0         0
DVlyTw4rQg25wfR1Acixkw generic                                     0        0     62326
DVlyTw4rQg25wfR1Acixkw get                                         0        0         0
DVlyTw4rQg25wfR1Acixkw listener                                    0        0         0
DVlyTw4rQg25wfR1Acixkw management                                  1        0     81941
DVlyTw4rQg25wfR1Acixkw open_distro_job_scheduler                   0        0         0
DVlyTw4rQg25wfR1Acixkw opendistro_asynchronous_search_generic      0        0       629
DVlyTw4rQg25wfR1Acixkw refresh                                     0        0         0
DVlyTw4rQg25wfR1Acixkw repository_azure                            0        0         0
DVlyTw4rQg25wfR1Acixkw search                                      0        0      7999
DVlyTw4rQg25wfR1Acixkw search_throttled                            0        0         0
DVlyTw4rQg25wfR1Acixkw snapshot                                    0        0         0
DVlyTw4rQg25wfR1Acixkw sql-worker                                  0        0         0
DVlyTw4rQg25wfR1Acixkw system_read                                 0        0       196
DVlyTw4rQg25wfR1Acixkw system_write                                0        0         0
DVlyTw4rQg25wfR1Acixkw warmer                                      0        0         0
DVlyTw4rQg25wfR1Acixkw write                                       0        0         6
tQBXjgZiQLehN8sptb11zw ad-batch-task-threadpool                    0        0         0
tQBXjgZiQLehN8sptb11zw ad-threadpool                               0        0         0
tQBXjgZiQLehN8sptb11zw analyze                                     0        0         0
tQBXjgZiQLehN8sptb11zw fetch_shard_started                         0        0         0
tQBXjgZiQLehN8sptb11zw fetch_shard_store                           0        0       177
tQBXjgZiQLehN8sptb11zw flush                                       0        0      1948
tQBXjgZiQLehN8sptb11zw force_merge                                 0        0         0
tQBXjgZiQLehN8sptb11zw generic                                     0        0   3004467
tQBXjgZiQLehN8sptb11zw get                                         0        0      3221
tQBXjgZiQLehN8sptb11zw listener                                    0        0         0
tQBXjgZiQLehN8sptb11zw management                                  1        0   5418273
tQBXjgZiQLehN8sptb11zw open_distro_job_scheduler                   0        0      6269
tQBXjgZiQLehN8sptb11zw opendistro_asynchronous_search_generic      0        0       789
tQBXjgZiQLehN8sptb11zw refresh                                     0        0  35025823
tQBXjgZiQLehN8sptb11zw repository_azure                            0        0         0
tQBXjgZiQLehN8sptb11zw search                                      0        0     47624
tQBXjgZiQLehN8sptb11zw search_throttled                            0        0         0
tQBXjgZiQLehN8sptb11zw snapshot                                    0        0         0
tQBXjgZiQLehN8sptb11zw sql-worker                                  0        0         0
tQBXjgZiQLehN8sptb11zw system_read                                 0        0       446
tQBXjgZiQLehN8sptb11zw system_write                                0        0       385
tQBXjgZiQLehN8sptb11zw warmer                                      0        0      3409
tQBXjgZiQLehN8sptb11zw write                                       4     1581   1700624
_0Ow2ObvQDK2L1-P7wplIQ ad-batch-task-threadpool                    0        0         0
_0Ow2ObvQDK2L1-P7wplIQ ad-threadpool                               0        0         0
_0Ow2ObvQDK2L1-P7wplIQ analyze                                     0        0         0
_0Ow2ObvQDK2L1-P7wplIQ fetch_shard_started                         0        0         0
_0Ow2ObvQDK2L1-P7wplIQ fetch_shard_store                           0        0         0
_0Ow2ObvQDK2L1-P7wplIQ flush                                       0        0         0
_0Ow2ObvQDK2L1-P7wplIQ force_merge                                 0        0         0
_0Ow2ObvQDK2L1-P7wplIQ generic                                     0        0     61864
_0Ow2ObvQDK2L1-P7wplIQ get                                         0        0         0
_0Ow2ObvQDK2L1-P7wplIQ listener                                    0        0         0
_0Ow2ObvQDK2L1-P7wplIQ management                                  1        0     81908
_0Ow2ObvQDK2L1-P7wplIQ open_distro_job_scheduler                   0        0         0
_0Ow2ObvQDK2L1-P7wplIQ opendistro_asynchronous_search_generic      0        0       630
_0Ow2ObvQDK2L1-P7wplIQ refresh                                     0        0         0
_0Ow2ObvQDK2L1-P7wplIQ repository_azure                            0        0         0
_0Ow2ObvQDK2L1-P7wplIQ search                                      0        0    137980
_0Ow2ObvQDK2L1-P7wplIQ search_throttled                            0        0         0
_0Ow2ObvQDK2L1-P7wplIQ snapshot                                    0        0         0
_0Ow2ObvQDK2L1-P7wplIQ sql-worker                                  0        0         0
_0Ow2ObvQDK2L1-P7wplIQ system_read                                 0        0       193
_0Ow2ObvQDK2L1-P7wplIQ system_write                                0        0         0
_0Ow2ObvQDK2L1-P7wplIQ warmer                                      0        0         0
_0Ow2ObvQDK2L1-P7wplIQ write                                       0        0        10
AEJZBJVDQXGJf4XBWXlzpw ad-batch-task-threadpool                    0        0         0
AEJZBJVDQXGJf4XBWXlzpw ad-threadpool                               0        0         0
AEJZBJVDQXGJf4XBWXlzpw analyze                                     0        0         0
AEJZBJVDQXGJf4XBWXlzpw fetch_shard_started                         0        0         0
AEJZBJVDQXGJf4XBWXlzpw fetch_shard_store                           0        0         0
AEJZBJVDQXGJf4XBWXlzpw flush                                       0        0         0
AEJZBJVDQXGJf4XBWXlzpw force_merge                                 0        0         0
AEJZBJVDQXGJf4XBWXlzpw generic                                   128        0    324175
AEJZBJVDQXGJf4XBWXlzpw get                                         0        0         0
AEJZBJVDQXGJf4XBWXlzpw listener                                    0        0         0
AEJZBJVDQXGJf4XBWXlzpw management                                  1        0     87532
AEJZBJVDQXGJf4XBWXlzpw open_distro_job_scheduler                   0        0         0
AEJZBJVDQXGJf4XBWXlzpw opendistro_asynchronous_search_generic      0        0      1237
AEJZBJVDQXGJf4XBWXlzpw refresh                                     0        0         0
AEJZBJVDQXGJf4XBWXlzpw repository_azure                            0        0         0
AEJZBJVDQXGJf4XBWXlzpw search                                      0        0       332
AEJZBJVDQXGJf4XBWXlzpw search_throttled                            0        0         0
AEJZBJVDQXGJf4XBWXlzpw snapshot                                    0        0         0
AEJZBJVDQXGJf4XBWXlzpw sql-worker                                  0        0         0
AEJZBJVDQXGJf4XBWXlzpw system_read                                 0        0         0
AEJZBJVDQXGJf4XBWXlzpw system_write                                0        0         0
AEJZBJVDQXGJf4XBWXlzpw warmer                                      0        0         0
AEJZBJVDQXGJf4XBWXlzpw write                                       0        0         0
dMXQM2xgQ3mzK7kuNnarCw ad-batch-task-threadpool                    0        0         0
dMXQM2xgQ3mzK7kuNnarCw ad-threadpool                               0        0         0
dMXQM2xgQ3mzK7kuNnarCw analyze                                     0        0         0
dMXQM2xgQ3mzK7kuNnarCw fetch_shard_started                         0        0         0
dMXQM2xgQ3mzK7kuNnarCw fetch_shard_store                           0        0         0
dMXQM2xgQ3mzK7kuNnarCw flush                                       0        0         0
dMXQM2xgQ3mzK7kuNnarCw force_merge                                 0        0         0
dMXQM2xgQ3mzK7kuNnarCw generic                                     0        0     51323
dMXQM2xgQ3mzK7kuNnarCw get                                         0        0         0
dMXQM2xgQ3mzK7kuNnarCw listener                                    0        0         0
dMXQM2xgQ3mzK7kuNnarCw management                                  1        0     78307
dMXQM2xgQ3mzK7kuNnarCw open_distro_job_scheduler                   0        0         0
dMXQM2xgQ3mzK7kuNnarCw opendistro_asynchronous_search_generic      0        0       602
dMXQM2xgQ3mzK7kuNnarCw refresh                                     0        0         0
dMXQM2xgQ3mzK7kuNnarCw repository_azure                            0        0         0
dMXQM2xgQ3mzK7kuNnarCw search                                      0        0         0
dMXQM2xgQ3mzK7kuNnarCw search_throttled                            0        0         0
dMXQM2xgQ3mzK7kuNnarCw snapshot                                    0        0         0
dMXQM2xgQ3mzK7kuNnarCw sql-worker                                  0        0         0
dMXQM2xgQ3mzK7kuNnarCw system_read                                 0        0         0
dMXQM2xgQ3mzK7kuNnarCw system_write                                0        0         0
dMXQM2xgQ3mzK7kuNnarCw warmer                                      0        0         0
dMXQM2xgQ3mzK7kuNnarCw write                                       0        0         0

Here is a fragment of pending tasks GET /_cluster/pending_tasks:

{
  "tasks" : [
    {
      "insert_order" : 21163,
      "priority" : "NORMAL",
      "source" : "opendistro-im",
      "executing" : true,
      "time_in_queue_millis" : 29827891,
      "time_in_queue" : "8.2h"
    },
    {
      "insert_order" : 21164,
      "priority" : "NORMAL",
      "source" : "opendistro-im",
      "executing" : false,
      "time_in_queue_millis" : 29827891,
      "time_in_queue" : "8.2h"
    },
    {
      "insert_order" : 21165,
      "priority" : "NORMAL",
      "source" : "opendistro-im",
      "executing" : false,
      "time_in_queue_millis" : 29827891,
      "time_in_queue" : "8.2h"
    },
    {
      "insert_order" : 21167,
      "priority" : "NORMAL",
      "source" : "opendistro-im",
      "executing" : false,
      "time_in_queue_millis" : 29825193,
      "time_in_queue" : "8.2h"
    },
    {
      "insert_order" : 21166,
      "priority" : "NORMAL",
      "source" : "opendistro-im",
      "executing" : false,
      "time_in_queue_millis" : 29825193,
      "time_in_queue" : "8.2h"
    },

I have looked into data and master nodes logs and there is no single ERROR/WARNING log there before or after the time that the logs stopped.

If you have any ideas on what to investigate - please let me know.