Security plugin performance

govule · April 21, 2020, 9:32pm

I have a strange issue that I’m struggling to explain, and I’m hoping someone can help.

I’ve just built a few OpenDistro clusters to replace a legacy Elasticsearch cluster. The old cluster had a dozen dedicated coordinating nodes, and with the new clusters I started off with a similar number of coordinating nodes, split across the clusters. As I built up the load, some of the coordinating nodes (which receive bulk write requests from a few thousand forwarders) started to max out their CPU while others stayed relatively idle.

The hot_threads API is telling me that these nodes are spending the majority of their time on the following:

java.base@13.0.1/java.util.Collections$UnmodifiableMap$UnmodifiableEntrySet$UnmodifiableEntrySetSpliterator.forEachRemaining(Collections.java:1601)
   java.base@13.0.1/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
   java.base@13.0.1/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
   java.base@13.0.1/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
   java.base@13.0.1/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
   java.base@13.0.1/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578)
   com.amazon.opendistroforelasticsearch.security.resolver.IndexResolverReplacer.resolveIndexPatterns(IndexResolverReplacer.java:236)
   com.amazon.opendistroforelasticsearch.security.resolver.IndexResolverReplacer.access$300(IndexResolverReplacer.java:110)
   com.amazon.opendistroforelasticsearch.security.resolver.IndexResolverReplacer$2.provide(IndexResolverReplacer.java:331)
   com.amazon.opendistroforelasticsearch.security.resolver.IndexResolverReplacer.getOrReplaceAllIndices(IndexResolverReplacer.java:775)
   com.amazon.opendistroforelasticsearch.security.resolver.IndexResolverReplacer.getOrReplaceAllIndices(IndexResolverReplacer.java:668)
   com.amazon.opendistroforelasticsearch.security.resolver.IndexResolverReplacer.resolveRequest(IndexResolverReplacer.java:326)
   com.amazon.opendistroforelasticsearch.security.privileges.PrivilegesEvaluator.evaluate(PrivilegesEvaluator.java:186)
   com.amazon.opendistroforelasticsearch.security.filter.OpenDistroSecurityFilter.apply0(OpenDistroSecurityFilter.java:252)
   com.amazon.opendistroforelasticsearch.security.filter.OpenDistroSecurityFilter.apply(OpenDistroSecurityFilter.java:119)
   app//org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:151)

I have scanned through the code and suspect that the loop performed by the following line scales poorly when there is a reasonable number of indices and/or aliases. Each cluster has ~1.5k aliases (each pointing to the head of an index series).

https://github.com/opendistro-for-elasticsearch/security/blob/b9c67da1011ef778efa4b5a6afd9e58cd7ad4676/src/main/java/com/amazon/opendistroforelasticsearch/security/resolver/IndexResolverReplacer.java#L235


      
            log.trace(Arrays.toString(requestedPatterns0) + " is an LOCAL EMPTY request");
        }
        return new Resolved.Builder().addOriginalRequested(Arrays.asList(requestedPatterns0)).addRemoteIndices(remoteIndices).build();
    }

    else {

        ClusterState state = clusterService.state();

        final SortedMap<String, AliasOrIndex> lookup = state.metaData().getAliasAndIndexLookup();
        final Set<String> aliases = lookup.entrySet().stream().filter(e -> e.getValue().isAlias()).map(e -> e.getKey())
                .collect(Collectors.toSet());

        matchingAliases = new HashSet<>(localRequestedPatterns.size() * 10);
        matchingIndices = new HashSet<>(localRequestedPatterns.size() * 10);
        matchingAllIndices = new HashSet<>(localRequestedPatterns.size() * 10);

        //fill matchingAliases
        for (String localRequestedPattern : localRequestedPatterns) {
            final String requestedPattern = resolver.resolveDateMathExpression(localRequestedPattern);
            final List<String> _aliases = WildcardMatcher.getMatchAny(requestedPattern, aliases);
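
To illustrate what I think is going on, here is a minimal standalone sketch. It is not the plugin's code. It assumes a plain SortedMap<String, Boolean> as a stand-in for the alias/index lookup (true meaning "this entry is an alias"), and the class and method names (AliasScanSketch, buildLookup, collectAliases, collectAliasesCached) are made up for the example. The first method mirrors the shape of the stream above, a full scan of the lookup on every request. The second is a hypothetical alternative that only rescans when a version number changes; I have not checked whether the plugin could use the cluster state version this way.

```java
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class AliasScanSketch {

    // Stand-in for the alias/index lookup: value true = alias, false = concrete index.
    static SortedMap<String, Boolean> buildLookup(int series, int indicesPerSeries) {
        SortedMap<String, Boolean> lookup = new TreeMap<>();
        for (int s = 0; s < series; s++) {
            lookup.put("logs-" + s, true); // one alias per index series
            for (int i = 0; i < indicesPerSeries; i++) {
                lookup.put("logs-" + s + "-" + String.format("%06d", i), false);
            }
        }
        return lookup;
    }

    // Same shape as the line in question: every call walks the whole lookup.
    static Set<String> collectAliases(SortedMap<String, Boolean> lookup) {
        return lookup.entrySet().stream()
                .filter(e -> e.getValue())
                .map(e -> e.getKey())
                .collect(Collectors.toSet());
    }

    // Hypothetical alternative: remember the alias set for a given version
    // and only rescan when the version changes.
    private static long cachedVersion = -1;
    private static Set<String> cachedAliases = Set.of();

    static synchronized Set<String> collectAliasesCached(long version, SortedMap<String, Boolean> lookup) {
        if (version != cachedVersion) {
            cachedAliases = collectAliases(lookup);
            cachedVersion = version;
        }
        return cachedAliases;
    }

    public static void main(String[] args) {
        // Roughly my situation: ~1.5k aliases, each heading an index series.
        // The 10 indices per series is an arbitrary number for the example.
        SortedMap<String, Boolean> lookup = buildLookup(1500, 10);
        int requests = 2000;
        long checksum = 0;

        long t0 = System.nanoTime();
        for (int r = 0; r < requests; r++) {
            checksum += collectAliases(lookup).size();
        }
        long t1 = System.nanoTime();
        for (int r = 0; r < requests; r++) {
            checksum += collectAliasesCached(1L, lookup).size();
        }
        long t2 = System.nanoTime();

        System.out.printf("scan per request: %d ms%n", (t1 - t0) / 1_000_000);
        System.out.printf("cached by version: %d ms%n", (t2 - t1) / 1_000_000);
        System.out.println("checksum: " + checksum);
    }
}
```

The timings are only indicative (no JIT warm-up, no real cluster state), but the point is that the per-request cost of the scan grows with the total number of indices and aliases, not with the size of the request.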

What confuses me, however, is why the CPU load is so unevenly distributed across the coordinating nodes. They all report the above as the hot path, yet as I said some of them are maxed out on CPU while others are very far from it. Can anyone explain this? I could probably live with it if the load were split evenly across the nodes, but this makes the clusters difficult to scale.
