Rolling Upgrade for `opendistro_security.disabled=false`

Hi.

I’m having trouble enabling opendistro_security on my ES cluster.

Here’s my situation.

I’ve been running a six-node Elasticsearch cluster on Docker with the option opendistro_security.disabled=true.

Now, I need to enable opendistro_security.
Because the cluster is in production, I need to make the change with a rolling upgrade.

I changed one of the six nodes to use opendistro_security.disabled=false and restarted it.
However, the node fails to rejoin the existing cluster and logs:

[2021-12-10T01:24:07,426][WARN ][o.e.c.c.ClusterFormationFailureHelper] [my_cluster_elasticsearch-data-01] master not discovered yet: have discovered [{my_cluster-data-01}{...}{...}{...}{...:9300}{dir}]; discovery will continue using [....] from hosts providers and [] from last-known cluster state; node term 25, last-accepted version 31251 in term 25

When I run curl localhost:9200 inside the container, it says:

[$elasticsearch] curl localhost:9200
Open Distro Security not initialized.
[2021-12-10T07:51:14,808][ERROR][c.a.o.s.a.BackendRegistry] [my_cluster_elasticsearch-data-01] Not yet initialized (you may need to run securityadmin)

AFAIK, securityadmin.sh only needs to be executed on a single node, and it applies the change to the whole cluster, so I had planned to run it after the whole cluster had finished restarting.
But I ran the script on this single node anyway, as the message suggests.
This time I got:

[$elasticsearch] . securityadmin.sh -cd ../securityconfig/ -icl -nhnv -cacert ../../../config/root-ca.pem -cert ../../../config/kirk.pem -key ../../../config/kirk-key.pem

Cannot retrieve cluster state due to: null. This is not an error, will keep on trying ...
  Root cause: MasterNotDiscoveredException[null] (org.elasticsearch.discovery.MasterNotDiscoveredException/org.elasticsearch.discovery.MasterNotDiscoveredException)
   * Try running securityadmin.sh with -icl (but no -cl) and -nhnv (If that works you need to check your clustername as well as hostnames in your TLS certificates)
   * Make sure that your keystore or PEM certificate is a client certificate (not a node certificate) and configured properly in elasticsearch.yml
   * If this is not working, try running securityadmin.sh with --diagnose and see diagnose trace log file)
   * Add --accept-red-cluster to allow securityadmin to operate on a red cluster.

Okay, I ran it again with the --diagnose option, and this is the log:

  PendingClusterTasksRequest:
MasterNotDiscoveredException[null]
  at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$2.onTimeout(TransportMasterNodeAction.java:230)
  at 
  ...
IndicesStatsRequest:
ClusterBlockException[blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];]
	at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedException(ClusterBlocks.java:190)
	at org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.checkGlobalBlock(TransportIndicesStatsAction.java:70)
	at org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.checkGlobalBlock(TransportIndicesStatsAction.java:48)
	at 
    ...

I’ve successfully upgraded another cluster that uses opendistro_security with a rolling upgrade,
but no luck this time.

Are there any mistakes I have not noticed?
Thanks in advance.

@ginseng_h What version of OpenSearch / ODFE are you running? Can you provide the elasticsearch/opensearch.yml (redacting any sensitive details)?

Thanks for the reply.

I’m using ODFE version 1.13.2.

Here’s my elasticsearch.yml:

# elasticsearch.yml

opendistro_security.ssl.transport.pemcert_filepath: esnode.pem
opendistro_security.ssl.transport.pemkey_filepath: esnode-key.pem
opendistro_security.ssl.transport.pemtrustedcas_filepath: root-ca.pem

opendistro_security.ssl.http.enabled: false
opendistro_security.ssl.http.pemcert_filepath: esnode.pem
opendistro_security.ssl.http.pemkey_filepath: esnode-key.pem
opendistro_security.ssl.http.pemtrustedcas_filepath: root-ca.pem
opendistro_security.ssl.transport.enforce_hostname_verification: false
opendistro_security.allow_default_init_securityindex: true
opendistro_security.allow_unsafe_democertificates: true

opendistro_security.nodes_dn:
  - "CN=*.example.com, OU=SSL, O=Test, L=Test, C=DE"
  - "CN=node.other.com, OU=SSL, O=Test, L=Test, C=DE"

opendistro_security.authcz.admin_dn:
  - "CN=kirk,OU=client,O=client,L=test,C=DE"

opendistro_security.enable_snapshot_restore_privilege: true
opendistro_security.check_snapshot_restore_write_privileges: true
opendistro_security.restapi.roles_enabled: ["all_access", "security_rest_api_access"]

opendistro_security.audit.type: internal_elasticsearch
opendistro_security.audit.config.index: "'auditlog-'YYYY.MM"

I’m using the bundled certificates that were provided when the cluster was installed.

@ginseng_h Enabling security via a rolling update is not currently possible: TLS on the transport layer is mandatory, so the restarted node would not be able to communicate with the rest of the cluster.
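
To illustrate the transport-layer point: a node restarted with the security plugin enabled opens its transport connections with a TLS handshake, while nodes still running with opendistro_security.disabled=true expect the plain Elasticsearch transport framing. The Python sketch below is only a toy model of that mismatch. The function name classify_transport_bytes and the sample byte strings are made up for illustration and are not part of Elasticsearch or Open Distro.

```python
def classify_transport_bytes(first_bytes: bytes) -> str:
    """Guess which protocol a peer is speaking from the first bytes it sends.

    Hypothetical helper for illustration only; it is not an Elasticsearch API.
    """
    # A TLS record carrying a handshake starts with content type 0x16,
    # followed by the major protocol version byte 0x03.
    if len(first_bytes) >= 2 and first_bytes[0] == 0x16 and first_bytes[1] == 0x03:
        return "tls"
    # Unencrypted Elasticsearch transport messages start with the marker bytes "ES".
    if first_bytes[:2] == b"ES":
        return "plaintext-transport"
    return "unknown"


# Sample inputs (made up): the start of a TLS handshake record and the start
# of a plaintext transport message with an arbitrary 4-byte length field.
tls_hello_prefix = bytes([0x16, 0x03, 0x01, 0x00, 0x2F])
plain_prefix = b"ES" + (42).to_bytes(4, "big")

for label, data in [("security enabled", tls_hello_prefix),
                    ("security disabled", plain_prefix)]:
    print(label, "->", classify_transport_bytes(data))
```

The two kinds of node start the conversation with different bytes, so neither side can parse what the other sends. That fits the "master not discovered yet" message on the restarted node.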

If opendistro_security was already enabled and only an upgrade is necessary, this is an entirely different situation and is indeed possible, which is probably what you are referring to regarding the successful upgrade. Please confirm.

@Anthony thanks for the reply. It really helped me a lot.