Convergence of opensearch & opendistro plugins

are there any plans on your side to merge certain opendistro plugins into the opensearch code base? either as plugins but maintained within the same code base (as elastic did with x-pack) or even fully merged.

if you were to do that: what would be the impact on other plugins which are providing similar functionality (e.g. odfe-security vs. search guard)? would your integrated functionality be exclusive or would the APIs continue to exist and they could still hook in and replace your functionality?

what about packaging? will you ship opensearch automatically with all opendistro plugins or will you ship a version without them and one with (some of) them?

These are super good questions.

I think the idea is to retain the plugins as distinct software but under the same overall (github) project. That means that a plugin, Alerting as an example, would just be software that uses the same APIs to integrate as any other plugin. So changes in Alerting wouldn’t affect OpenSearch core and someone else can create a plugin with competing functionality with no disadvantage from a code perspective.

I’ve heard no plans of any APIs being taken away - actually quite the opposite. Stability in these APIs is seen as super important to making it a good place for people to build extensions.

The security plugin itself is a bit of a more nuanced case. It’s tricky because other plugins have to be built with the security model in mind, so the security plugin becomes a dependency. Then there is the question of whether it’s a good idea to have security-free software available - it kind of sets users up for failure. Right now, no integration has been done between OpenSearch and the security plugin. I’d love to know folks’ opinions on security specifically - is it something that should be closely integrated or kept outside?

As far as packaging goes, the idea is that the plugins will be packed together as an artifact with OpenSearch and OpenSearch Dashboards. So, you can get OpenSearch and/or OpenSearch Dashboards with the plugins in one download. If you don’t want a plugin, it’s a configuration option to turn it off. You’re welcome to build OpenSearch or OpenSearch Dashboards w/o plugins as the build process for each is independent.

Hi kyle,

I agree with you that we should keep Open Distro as a separate project.

As far as packaging is concerned, it should offer both options:

Option 1: OpenSearch and OpenSearch Dashboards without plugins
Option 2: OpenSearch and OpenSearch Dashboards with plugins

I believe there are valid use cases where security is not needed. If it is integrated, it may slow down performance (e.g. by encrypting node-to-node communication when that’s not needed) rather than do any good.

Plus, I don’t see the dependency that you mention between core and security. I guess using, for example, OpenSearch core with ReadonlyREST as the security plugin should be possible. Of course, some adaptations may be needed on the ReadonlyREST side, but they shouldn’t be too hard.

Keeping the same packaging as happens now with OpenDistro sounds good to me.

To be clear here - it would be one ‘project’ all housed under the opensearch-project github organization but distinct software. Use of OpenSearch doesn’t predicate the plugins and vice-versa.

And the name “Open Distro” is going away eventually.

(Thanks for the feedback on packaging!)

Sure - I agree that there are cases where you might not need security. How do you feel about default-on / opt-out security?

Let me explain the dependency a tad more clearly. It’s not about dependency between OpenSearch and the security plugin but other plugins depending on the security plugin. You need to design your plugin to work with the security model otherwise the plugin won’t work. Other folks writing plugins would be awesome - but requiring those folks to design a plugin to work with and without security seems like a steep requirement.

I agree with having security by default: it is standard for production environments and in the software world (DBMS, apps, …).
I’m quite extremist, but I would like many of the opendistro plugins to be put in the “modules” dir of opensearch, so that the plugins dir can stay empty and be used as a mountpoint for custom plugins in docker.

I would also like the Prometheus exporter to be provided as a default module, with opt-out as the default.

I suggest doing a poll on which functionalities/plugins to include in the vanilla distro.

sorry for the delay in replies here, i had been doing some thinking & having discussions on this.

i’m now driven to reply again because i gather from the implicit statements in this github issue that you want to ship “opensearch” by default with plugins and only as a side-product without plugins?

i see various aspects here:

  • i’ve learnt the hard way that you should only ship/install what you really need - even if you’re not using it, having it comes with some costs
    • costs might include runtime performance (even if the module is not enabled, it is still loaded by default, giving some overhead)
    • costs might be upgrade costs (some default configs which changed and cause you issues even though you had no intention of using the things)
    • it comes with security risks - if there’s an (exploitable) security issue in a module which you’re not using but which is still installed you might be affected even though you never wanted it
  • depending on whether plugins are managed in the same repo or outside and depending on how they are versioned (semver for the plugins or semver for the whole bundle) packaging them together becomes harder to do

for security i agree with you that all installations should come with security - though it is some additional effort to manage certificates, etc.; but that’s a mindset shift which has to happen in general (and is / has been happening for a few years now) that you need to do security right and should do it from the beginning. so i’d be fine with the security plugin being packaged in all cases (and even being part of opensearch itself, possibly not even built as a plugin but integrated as a 1st class citizen).
what is important here is that you either continue to allow 3rd party security plugins or, if not, check what this means for the existing 3rd party security plugin vendors (namely floragunn/Search Guard and ReadonlyREST (i’ve heard for the first time about the latter only recently and haven’t seen yet that they’re interested in supporting Opensearch, but who knows)).

These are great points, ralph. I especially agree about shipping a security-enabled configuration by default.

A nice thing about plugins is that it’s pretty easy to remove, add, or replace them. One of the questions I’m asking myself is, if people will want to run some set of these plugins in their clusters, what’s the most user-friendly way to let them choose which ones to enable? Is it offering different download options? Having people download each component a la carte? Bundling common plugins together into a default distribution so that people can pick and choose once they’ve downloaded it?

Making this a set of download options has some appeal - you only download what you plan to use - but could be more complicated than an option where users download one thing and configure it.

@dblock stated on the github issue that there would be no downloadable package for “pure” opensearch w/o any plugins, that it’d only be published as a maven artifact.
i am strongly opposed to this:

  • as pointed out above, shipping unused plugins can cause all sorts of issues
  • if the plugin-free package is only available as a bunch of .jars via maven then it’d be a major hassle to build a package together with this (you’d basically have to re-invent the wheel of what is already being done for opensearch)

i doubt that it should be a lot of effort to have a plugin-free (maybe including the security plugin, see discussion above) version of opensearch and ship that. it should be possible to package that and then just add the plugins to the package and ship that as a separate one. it definitely is less trouble to do this once, centrally, rather than everyone who doesn’t need the other plugins having to do this on their own.

in that context i then don’t care much whether one is called opensearch and the other opensearch-full or whether they’re called opensearch-min and opensearch.

downloading a package and then removing things sounds wrong.
i expect to either be able to download a package and install additional plugins as needed or to have multiple downloads to pick the best match and then still install additional plugins on top if needed.

i’m kinda wondering if we couldn’t improve the plugin installation process a bit. already now we’re using elasticsearch-plugin install with a file:/// path (fun fact: i had an RTFM moment when i ran it with ./someplugin.zip and failed to notice that it needs a file:/// URL with an absolute path to work). maybe it would be possible to use a central plugin repository (i wouldn’t invent something new here, probably using github packages, maven central or similar would work) and then just be able to write opensearch-plugin install someplugin:1.2.3 and have it try to fetch the plugin from that default location.

add the options to install multiple plugins at once (à la opensearch-plugin install someplugin:1.2.3 otherplugin:3.2.1), install “latest compatible version” (i.e. just don’t specify a version number: opensearch-plugin install someplugin) and things should be very easy to use.
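
to make the proposed syntax concrete, here is a minimal sketch in python of how such arguments could be parsed. the name:version format is only the idea from this post, not something opensearch-plugin supports today, and PluginSpec, parse_plugin_spec and parse_install_args are hypothetical names made up for illustration.

```python
from typing import List, NamedTuple, Optional


class PluginSpec(NamedTuple):
    """One requested plugin (hypothetical structure, for illustration only)."""
    name: str
    version: Optional[str]  # None stands for "latest compatible version"


def parse_plugin_spec(spec: str) -> PluginSpec:
    """Parse 'name' or 'name:version' as proposed in this post."""
    name, sep, version = spec.partition(":")
    # Reject an empty name (":1.2.3") and a dangling colon ("someplugin:").
    if not name or (sep and not version):
        raise ValueError(f"invalid plugin spec: {spec!r}")
    return PluginSpec(name, version or None)


def parse_install_args(args: List[str]) -> List[PluginSpec]:
    """Parse several specs at once, e.g. the arguments after 'install'."""
    return [parse_plugin_spec(arg) for arg in args]
```

parse_install_args(["someplugin:1.2.3", "otherplugin"]) would yield one pinned spec and one with version None, which a resolver could then map to the latest compatible release. this sketch ignores URLs and file:/// paths, which would need to be detected before splitting on the colon.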

then the download package could even contain all default opensearch plugins - but not installed but instead just as pre-delivered versions for those which need to install in places w/o internet access (though you could also say that this is a corner-case and they can just download the plugins as normal).

but i’d definitely not see these as mandatory features for a 1.0.0 release but instead would just see them as nice-to-have (but obviously the vision would have an impact on packaging considerations already now).
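
as an aside on the file:/// gotcha mentioned above: a small wrapper can turn a relative path into the absolute file URL before handing it to the install command. this is only a sketch using python’s standard pathlib; to_file_uri is a hypothetical helper name, and the plugin zip does not have to exist for the conversion to work.

```python
from pathlib import Path


def to_file_uri(plugin_zip: str) -> str:
    """Turn a possibly relative path into an absolute file:// URI."""
    # resolve() makes the path absolute; as_uri() requires an absolute path.
    return Path(plugin_zip).resolve().as_uri()


if __name__ == "__main__":
    # Prints the URI for ./someplugin.zip relative to the current directory.
    print(to_file_uri("./someplugin.zip"))
```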

Great points Ralph. It’s not a lot of effort to provide a tgz / zip download of a plugin-free distribution. The main question was: why would it be needed if we can make installing/removing plugins easier through CLI or API sugar (and you touched on some of that while I was typing this)? Since the project is not there yet, I don’t think the bundle download / manual plugin removal is a bad approach, no matter how wrong it feels at this point. Alternatively, a lite archive is also an achievable stopgap, so I don’t want it to sound like these decisions are edicts.

The technical reason to make this simpler is related to DistributionDownloader and the concept of “build flavors” that are removed in OpenSearch. DistributionDownloader was one of the heavily used ways a third party plugin would pull upstream dependencies (using the old oss flavor label). It downloads the archive from the vendor’s download page, extracts the jar files, and adds them as dependencies to the plugin. This is crazy overhead when a third party plugin should simply be able to add dependencies in build.gradle and rely on maven (central, snapshot, or local).

This raised the question of why even provide different flavors of archived files (lite, and X00 plugins). It seems clearer / cleaner to just provide the full bundle, let users pick and choose, and then open an issue/PR to add CLI sugar around -plugin install/remove. Hopefully that helps provide some perspective to think of this from a different angle.

Bundling the plugins together with OpenSearch is essentially deciding that the official version is the one containing the official plugins in the project. I would prefer that the naming for the artifact that is built with all the plugins be called opensearch-full and the one without the plugins be called opensearch - as suggested by @ralph.
Also, should we even be changing the behavior of installation and usage at such an early stage? If companies are using Elasticsearch and wish to start using OpenSearch I would not want them to have to change their scripts to start removing plugins because they got more than what they asked for. Currently, the future users of this project are the ones still using Elasticsearch, and watching this project carefully to see where it goes.
(Edit: the last paragraph is not an issue if we go for multiple distributions.)

Agreed on this one. What we do with OpenTelemetry Collector is there is a stripped down distribution and then we have a distribution with all of the extra added listeners, processors, and exporters that comes from the opentelemetry-contrib repo. I would suggest there is an OpenSearch repo and an OpenSearch-contrib repo which contains any added plugins or other artifacts that the community contributed to surround the engine.

Related to all of this, today ODFE does publish plugins as .zip packages that can be installed directly, e.g.

bin/elasticsearch-plugin install https://d3g5vo6xdbdb9a.cloudfront.net/downloads/elasticsearch-plugins/opendistro-anomaly-detection/opendistro-anomaly-detection-1.13.0.0.zip

It’s documented, and I am sure many people rely on this.

thanks for this information, that’s very interesting (i’ve admittedly never tried to use ODFE so far). if it’s this simple to install the whole plugin bundle i don’t see why the default packaging should ship with all of them. this makes it a 1-liner for anyone who wants the whole bundle, a 0-liner for anyone who doesn’t want any plugins and a dedicated setup for anyone who wants a specific set of plugins (be they OpenSearch plugins or 3rd party plugins). this seems much simpler than asking people to uninstall plugins again.
just looking at the list of ODFE plugins i can’t imagine that most people have a need to run all of them on the same cluster at the same time?

If we’re going to use the forks at VMware, we’re going to need separate installable artifacts for both OpenSearch and OpenSearch Dashboards without having to go through an extra step of removing the plugins.

Here’s the issue we’re facing. We have dozens of separate products across our many business units who have embedded Elasticsearch into products that we ship to customers, and another half dozen with Kibana. If the installation of these forks makes their work painful or more time consuming, I am not going to be able to convince our many product teams to embrace these forks.

@dawnfoster how do these many products consume Elasticsearch today? A specific example would be super helpful.

@dblock, @dawnfoster asked me to chime in. The VMware open source project Antrea (GitHub: vmware-tanzu/antrea, Kubernetes networking based on Open vSwitch) uses ELK to collect and visualize network flow data. In a security product we have customer data collection and visualization across a cluster, in two flavors - a self-hosted ELK stack and a vendor-managed service. In a third product offering, fluentd is used to collect logging and application performance metrics, which are then accessed via ELK. Yet another use is to auto-fill fields on some user and admin dashboards. While Elasticsearch is used in all of them, Kibana is not.

@Malini thanks! cool stuff. What artifacts (.zip/.tar.gz/jars?) do you consume and from where (maven, etc.), and what do you do to these, if anything (e.g. add/remove a plugin)? Links to build scripts in the repo would be amazing. Thx.