Term Vectors, stemmers, tokenizers, stop words etc

orestis · October 6, 2021, 6:42pm

Last year I had a pilot project to use ElasticSearch to power a full-text search project. I was using 7.7 at the time, on AWS ElasticSearch service.

I’ve created custom analysers as described in the Elasticsearch Guide (Language analyzers), I’ve used filters like html_strip, and I was planning to use the Term Vectors API (Term vectors API, Elasticsearch Guide) …
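For context, the kind of setup described above can be sketched as plain request bodies. This is only an illustration based on the Elasticsearch 7.10 documentation: the index name `articles`, the field name `body`, the analyzer name `html_english`, and the helper functions are all hypothetical, and whether OpenSearch accepts these bodies unchanged is an assumption that follows from it being a 7.10 fork. The code only builds JSON and does not contact a cluster.

```python
import json


def build_index_settings():
    """Body for index creation (PUT /<index>) with a custom analyzer.

    Assumption: mirrors the "language analyzers" pattern from the
    Elasticsearch 7.10 guide; all names here are made up for illustration.
    """
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Stop-word filter using the built-in English list.
                    "english_stop": {"type": "stop", "stopwords": "_english_"},
                    # Stemmer token filter for English.
                    "english_stemmer": {"type": "stemmer", "language": "english"},
                },
                "analyzer": {
                    "html_english": {
                        "type": "custom",
                        # html_strip is a character filter: it runs before tokenizing.
                        "char_filter": ["html_strip"],
                        "tokenizer": "standard",
                        "filter": ["lowercase", "english_stop", "english_stemmer"],
                    }
                },
            }
        },
        "mappings": {
            "properties": {
                "body": {
                    "type": "text",
                    "analyzer": "html_english",
                    # Store term vectors at index time so they need not be recomputed.
                    "term_vector": "with_positions_offsets",
                }
            }
        },
    }


def build_termvectors_request(index, doc_id, field):
    """Describe a Term Vectors API call: GET /<index>/_termvectors/<id>."""
    return {
        "method": "GET",
        "path": f"/{index}/_termvectors/{doc_id}",
        "body": {
            "fields": [field],
            "positions": True,
            "offsets": True,
            "term_statistics": True,
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_index_settings(), indent=2))
    print(json.dumps(build_termvectors_request("articles", "1", "body"), indent=2))
```

Sending these bodies with any HTTP client to a test cluster is a quick way to check whether a given feature is present, independent of what the documentation currently covers.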

Now that I’m revisiting the project, I can’t find references to any of these in the OpenSearch documentation, though if it is a 7.10 fork those features should be there. Is this an oversight in the documentation or are there differences between the projects?

In general I’m a bit worried, as all the docs behind OpenSearch are focused on log ingestion and there aren’t many examples of the text analysis capabilities. Knowing where things stand would help me a lot in deciding whether to base this project on OpenSearch or go with an Elastic Cloud license.

Thanks!

kris · October 15, 2021, 6:52pm

Hello @orestis - welcome to the community. As you mentioned, yes, it is derived from 7.10.2. However, we did not fork the documentation at the time. The team is working diligently on building the necessary content for the documentation, and we track that in the open on the GitHub repository. Here is a direct link to the backlog. I hope this helps.

orestis · October 15, 2021, 7:31pm

Thanks for clarifying. I thought it might be a documentation issue. I guess until the docs are rewritten (they weren’t under the same license? huh) I can use the existing ElasticSearch 7.10.2 docs for some things.
