Add Arrow (Flight) Endpoint
Apache Arrow (https://arrow.apache.org/) is a popular in-memory columnar data format. It is to memory what Parquet and ORC are to disk: a standard columnar storage format.
Arrow standardizes the in-memory columnar data representation across data processing engines (Spark, Drill, Impala, etc.).
This reduces communication and serialization overhead and increases the amount of data-management code that engines can share.
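
To make the columnar idea concrete, here is a minimal sketch using only the Python standard library. It does not use Arrow itself; the field names (`id`, `price`) and values are made up for illustration. It only shows why one contiguous typed buffer per field is cheap to scan and to hand over as raw bytes.

```python
from array import array

# Row-oriented layout: one object per record (illustrative data only).
rows = [
    {"id": 1, "price": 9.5},
    {"id": 2, "price": 12.0},
    {"id": 3, "price": 7.25},
]

# Column-oriented layout: one contiguous, typed buffer per field.
columns = {
    "id": array("q", (r["id"] for r in rows)),        # 64-bit signed ints
    "price": array("d", (r["price"] for r in rows)),  # 64-bit floats
}

# Scanning a single field touches one buffer instead of every record.
total_price = sum(columns["price"])

# A column can be shipped as raw bytes with no per-record serialization,
# then rebuilt on the receiving side from the same bytes.
price_bytes = columns["price"].tobytes()
restored = array("d")
restored.frombytes(price_bytes)
```

Arrow defines this kind of layout (plus metadata such as schemas and null bitmaps) as a cross-language standard, so engines can exchange buffers without converting between their own formats.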
Arrow Flight is a new general-purpose client-server framework that simplifies high-performance transport of large datasets over network interfaces.
One of the biggest features that sets Flight apart from other data transport frameworks is parallel transfers, which allow data to be streamed to or from a cluster of servers simultaneously.
Supporting this in OpenSearch would bring large benefits:
- A standard in-memory columnar data format that can be transported across nodes
- Interoperability with standard big data tools and formats
- Data transfer that can outperform ODBC or JDBC libraries by up to tenfold
- Better hash join support for cross-index joins
- Horizontal scalability through parallel and partitioned data access
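
The parallel, partitioned access pattern can be sketched as follows. This is a toy model using only the Python standard library, not the Flight API: the node names, `list_endpoints`, and `fetch_partition` are hypothetical stand-ins. In Flight, a client first asks for a list of endpoints for a dataset and then opens one stream per endpoint, which is the shape imitated here.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical cluster: each "endpoint" stands in for a node holding one
# partition of the result, stored column by column.
PARTITIONS = {
    "node-a": {"id": [1, 2], "price": [9.5, 12.0]},
    "node-b": {"id": [3], "price": [7.25]},
    "node-c": {"id": [4, 5], "price": [3.0, 8.0]},
}

def list_endpoints():
    """Stand-in for asking the coordinator where the partitions live."""
    return sorted(PARTITIONS)

def fetch_partition(endpoint):
    """Stand-in for streaming one partition from one node."""
    return PARTITIONS[endpoint]

def fetch_all():
    endpoints = list_endpoints()
    # One worker per endpoint, so partitions are fetched concurrently.
    with ThreadPoolExecutor(max_workers=len(endpoints)) as pool:
        # pool.map returns results in the order of `endpoints`.
        parts = list(pool.map(fetch_partition, endpoints))
    # Concatenate the partitions column by column.
    merged = {"id": [], "price": []}
    for part in parts:
        for name, values in part.items():
            merged[name].extend(values)
    return merged

result = fetch_all()
```

Because each partition comes from a different node, adding nodes adds transfer bandwidth, which is where the horizontal scalability in the list above would come from.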