No user found for indices:data/read/opendistro/replication/changes

Hi,
When attempting to start cross-cluster replication on an index, I’m getting the error below written to the follower logs. However, I get a 200 response when making the start replication request.

[ERROR] [c.a.e.r.t.s.ShardReplicationTask] [elasticsearch-replication-node2] [follower-01][0] Task failed due to ElasticsearchSecurityException[No user found for indices:data/read/opendistro/replication/changes]

The steps I’ve taken to set up replication are as follows:

  1. Installed elasticsearch, opendistro security and cross cluster replication plugins on 2 nodes.
  2. Set opendistro_security.unsupported.inject_user.enabled: true in elasticsearch.yml
  3. Set opendistro_security.nodes_dn_dynamic_config_enabled: true in elasticsearch.yml
  4. Set up cluster.remote.leader_cluster_seeds in elasticsearch.yml
  5. Set the permissions as described in the handbook in roles.yml
  6. Created an index on the leader node called ‘leader-01’.
  7. Added a document to the index.

(The handbook does not mention the below steps, but I was getting exceptions without performing them.)

  8. Created a snapshot repository called ‘opendistro-remote-repo-leader-cluster’ on the follower node.
  9. Created a snapshot called ‘opendistro-remote-snapshot’ for the index ‘leader-01’.
  10. Started the replication by making a request to:
PUT https://localhost:9200/_opendistro/_replication/follower-01/_start
{
  "remote_cluster": "leader_cluster",
  "remote_index": "leader-01"
}
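
For anyone who wants to reproduce this from a script, here is a minimal sketch of the same request in Python, using only the standard library. The function names and the certificate file paths are hypothetical; the URL and body are the ones shown above.

```python
import json
import ssl
import urllib.request


def build_start_request(base_url, follower_index, remote_cluster, remote_index):
    """Build the PUT request shown above (nothing is sent yet)."""
    body = json.dumps(
        {"remote_cluster": remote_cluster, "remote_index": remote_index}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_opendistro/_replication/{follower_index}/_start",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def send_with_admin_cert(request, cert_file, key_file, ca_file):
    """Send the request, authenticating with a client certificate.

    cert_file, key_file and ca_file are placeholder paths (assumptions).
    Returns the HTTP status code and the raw response text.
    """
    context = ssl.create_default_context(cafile=ca_file)
    context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    with urllib.request.urlopen(request, context=context) as response:
        return response.status, response.read().decode("utf-8")


request = build_start_request(
    "https://localhost:9200", "follower-01", "leader_cluster", "leader-01"
)
print(request.get_method(), request.full_url)
```

Calling send_with_admin_cert(request, "admin.pem", "admin-key.pem", "root-ca.pem") would send it. As noted above, a 200 here does not rule out the shard task failing afterwards, so the follower logs still need checking.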

I have seen similar issues reported when the admin certificate is the same as the node certificate; however, I am using different certificates, and can successfully run the securityadmin.sh tool to reload the config.

When sending the start request, I am using the admin certificate to authenticate, and can confirm that this user has a role assigned to them that includes the indices:data/read/opendistro/replication/changes permission by accessing the /_opendistro/_security/api/account endpoint.

Note: For simplicity I have deployed both leader and follower roles to each node.

Response from /account:

{
  "user_name": "CN=Administrator",
  ...
  "roles": [
    ...
    "replication_backup_follower",
    "replication_backup_leader"
    ... 
  ]
}

Response from /roles:

{
  "replication_backup_leader": {
    "cluster_permissions": [
      "AS_DESCRIBED_IN_HANDBOOK"
    ],
    "index_permissions": [
      {
        "index_patterns": [
          "*"
        ],
        "allowed_actions": [
          ...
          "indices:data/read/opendistro/replication/changes"
        ]
      }
    ]
  },
  "replication_backup_follower": {
    "cluster_permissions": [
      "AS_DESCRIBED_IN_HANDBOOK"
    ],
    "index_permissions": [
      {
        "index_patterns": [
          "*"
        ],
        "allowed_actions": [
           "AS_DESCRIBED_IN_HANDBOOK"
        ]
      }
    ]
  }
}
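
To double-check the role configuration without reading the JSON by eye, something like the following sketch could be run against the /roles response. It is a simplified check, not the security plugin's real evaluation: it assumes plain shell-style wildcards for index patterns and actions and does not expand action groups. The function name is hypothetical, and the sample data is the response above with the elided parts left as placeholders.

```python
from fnmatch import fnmatchcase


def roles_granting_action(roles, action, index_name):
    """Return the names of roles whose index_permissions cover the action.

    Simplified: treats index_patterns and allowed_actions as shell-style
    wildcards and ignores action groups (an assumption, not plugin logic).
    """
    granting = []
    for role_name, role in roles.items():
        for permission in role.get("index_permissions", []):
            index_ok = any(
                fnmatchcase(index_name, pattern)
                for pattern in permission.get("index_patterns", [])
            )
            action_ok = any(
                fnmatchcase(action, allowed)
                for allowed in permission.get("allowed_actions", [])
            )
            if index_ok and action_ok:
                granting.append(role_name)
                break  # one matching permission block is enough for this role
    return granting


# The /roles response from this post; placeholders are kept as-is.
roles_response = {
    "replication_backup_leader": {
        "index_permissions": [
            {
                "index_patterns": ["*"],
                "allowed_actions": [
                    "indices:data/read/opendistro/replication/changes"
                ],
            }
        ]
    },
    "replication_backup_follower": {
        "index_permissions": [
            {"index_patterns": ["*"], "allowed_actions": ["AS_DESCRIBED_IN_HANDBOOK"]}
        ]
    },
}

granting = roles_granting_action(
    roles_response,
    "indices:data/read/opendistro/replication/changes",
    "leader-01",
)
print(granting)
```

This only confirms what the role definitions say. It does not show which user, if any, the replication task is running as when it hits the leader.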

@bartlettjh - You don’t have to perform steps 8 and 9, as the replication plugin takes care of this. Could you please share the cluster state response after you hit the above error?

I found that the plugin was not taking care of it, and I was getting exceptions telling me that no repository or snapshot existed. I was thinking that this might be related to the reason I’m getting the other issue, though.

Today when testing again I get IndexNotFoundException[no such index [leader-01]] so I’m thinking that there might be an issue reading anything from the other cluster.
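
One way to test that suspicion would be to look at the output of GET /_remote/info on the follower and check the entry for the remote cluster alias. Below is a rough sketch of such a check; the function name and the sample response are hypothetical, and it only relies on the "connected" and "num_nodes_connected" fields of that API's response.

```python
def describe_remote(remote_info, alias):
    """Summarise the state of one remote cluster from a /_remote/info response."""
    entry = remote_info.get(alias)
    if entry is None:
        return f"'{alias}' is not registered as a remote cluster"
    if not entry.get("connected", False):
        return f"'{alias}' is registered but not connected"
    nodes = entry.get("num_nodes_connected", 0)
    return f"'{alias}' is connected to {nodes} node(s)"


# Hypothetical sample response, for illustration only (not from my cluster).
sample_info = {
    "leader_cluster": {
        "seeds": ["127.0.0.1:9300"],
        "connected": False,
        "num_nodes_connected": 0,
    }
}

print(describe_remote(sample_info, "leader_cluster"))
print(describe_remote(sample_info, "other_cluster"))
```

If the alias is missing or not connected, that would point at the remote cluster settings rather than at the security roles.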

The cluster status after trying today was green for the follower and leader cluster.

I attempted to run the plugin using the sample project provided, and it was successful, but on my manual deployment it still doesn’t work. When following the instructions verbatim on my deployment (not the example), I get RepositoryMissingException: [opendistro-remote-repo-leader-cluster] missing when trying to start replication before step 8.

The only difference that I can think of between the two deployments is that we use LDAP as our authentication method.