Apache Pinot

Source: Wikipedia, the free encyclopedia.
Original author(s): Kishore Gopalakrishna, Xiang Fu
Developer(s): Apache Pinot
Stable release: 1.0.0 / 19 September 2023
License: Apache License 2.0
Website: pinot.apache.org

Apache Pinot is an open-source, distributed OLAP datastore. The name Pinot comes from the Pinot grape vines, whose grapes are pressed into liquid that is used to produce a variety of different wines. The founders of the database chose the name as a metaphor for analyzing vast quantities of data from a variety of different file formats or streaming data sources.[9]

Pinot was first created at LinkedIn.

Pinot is used in production by technology companies such as Factual.

History

Pinot was started as an internal project at LinkedIn in 2013 to power a variety of user-facing and business-facing products. The first analytics product at LinkedIn to use Pinot was a redesign of the feature that lets members of the social networking site see, in real time, who has viewed their profile. The project was open-sourced in June 2015 under the Apache License 2.0 and was donated to the Apache Software Foundation by LinkedIn in June 2019.[9][8]

Architecture

[Figure: Architecture diagram of Apache Pinot]

Pinot uses Apache Helix for cluster management. Helix is a generic cluster management framework for managing partitions and replicas in a distributed system. It is embedded as an agent within the different Pinot components and uses Apache ZooKeeper both for coordination and for maintaining the overall cluster state and health. All Pinot servers and brokers are managed by Helix.

Query management

Queries are received by brokers, which check each request against the segment-to-server routing table and scatter it between real-time and offline servers.
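The scatter step described above can be illustrated with a small sketch. This is not Pinot's broker code: the routing table contents, the segment and server names, the sample rows, and the functions `plan_scatter`, `server_execute`, and `broker_query` are all hypothetical, and the "servers" are simulated in-process. It only shows the general idea of grouping segments by hosting server, collecting one partial result per server, and merging the partials.

```python
from collections import defaultdict

# Hypothetical segment-to-server routing table (all names are made up).
ROUTING_TABLE = {
    "events_offline_0": "offline-server-1",
    "events_offline_1": "offline-server-1",
    "events_offline_2": "offline-server-2",
    "events_realtime_0": "realtime-server-1",
}

# Hypothetical segment contents: rows of (country, clicks).
SEGMENT_DATA = {
    "events_offline_0": [("US", 3), ("DE", 2)],
    "events_offline_1": [("US", 1)],
    "events_offline_2": [("FR", 4)],
    "events_realtime_0": [("US", 5), ("DE", 6)],
}


def plan_scatter(routing_table, segments):
    """Group the requested segments by the server that hosts them."""
    plan = defaultdict(list)
    for segment in segments:
        plan[routing_table[segment]].append(segment)
    return dict(plan)


def server_execute(segments):
    """Simulate one server computing a partial SUM(clicks) over its segments."""
    return sum(clicks for seg in segments for _, clicks in SEGMENT_DATA[seg])


def broker_query(routing_table):
    """Scatter the request to each server, then merge the partial sums."""
    plan = plan_scatter(routing_table, list(routing_table))
    partials = {server: server_execute(segs) for server, segs in plan.items()}
    return sum(partials.values()), partials


total, partials = broker_query(ROUTING_TABLE)
print(total, partials)
```

In this sketch each server is contacted once, however many of its segments the query touches, which is the reason for grouping segments by server before sending requests.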

Cluster management

Pinot uses Apache Helix for cluster management. Helix is a framework for managing replicated, partitioned resources in a distributed system, and it uses Apache ZooKeeper to store cluster state and metadata.

Features

Pinot shares similar features with comparable OLAP datastores. It supports several indexing technologies, including Bitmap Index, Inverted Index, Star-Tree Index, and Range Index, which are what primarily differentiate Pinot from other OLAP datastores.
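To make the idea of a bitmap-backed inverted index concrete, here is a toy sketch of the general technique. It is not Pinot's implementation or on-disk format: the column data and the helpers `build_inverted_index` and `rows_from_bitmap` are hypothetical, and plain Python integers stand in for compressed bitmaps. The index maps each distinct column value to a bitmap of the row ids that contain it, so a filter on two columns reduces to a bitwise AND.

```python
def build_inverted_index(column):
    """Map each distinct value to a bitmap (a Python int) of matching row ids."""
    index = {}
    for row_id, value in enumerate(column):
        index[value] = index.get(value, 0) | (1 << row_id)
    return index


def rows_from_bitmap(bitmap):
    """Decode a bitmap back into a sorted list of row ids."""
    rows = []
    row_id = 0
    while bitmap:
        if bitmap & 1:
            rows.append(row_id)
        bitmap >>= 1
        row_id += 1
    return rows


# Hypothetical columns of a six-row table.
country = ["US", "DE", "US", "FR", "DE", "US"]
device = ["ios", "ios", "android", "ios", "android", "ios"]

country_index = build_inverted_index(country)
device_index = build_inverted_index(device)

# A filter such as country = 'US' AND device = 'ios' becomes a bitwise AND,
# so no row has to be scanned to evaluate the predicate.
matches = rows_from_bitmap(country_index["US"] & device_index["ios"])
print(matches)  # [0, 5]
```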

Pinot supports near real-time ingestion from streaming data sources. Like data warehousing solutions, Pinot supports a SQL-like query language that supports selection, aggregation, filtering, group by, order by, and distinct queries on data.
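The query shapes listed above can be illustrated with a short, self-contained sketch. Because a running Pinot cluster cannot be assumed here, Python's built-in `sqlite3` module stands in for the query engine; Pinot's own SQL dialect may differ in detail. The table `profile_views`, its columns, and its rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for a Pinot table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE profile_views ("
    "viewer_country TEXT, viewed_member INTEGER, views INTEGER)"
)
conn.executemany(
    "INSERT INTO profile_views VALUES (?, ?, ?)",
    [("US", 1, 3), ("US", 2, 1), ("DE", 1, 2), ("FR", 3, 4), ("DE", 1, 5)],
)

# Selection, filtering, aggregation, group by, and order by in one query.
aggregated = conn.execute(
    """
    SELECT viewer_country, SUM(views) AS total_views
    FROM profile_views
    WHERE views >= 2
    GROUP BY viewer_country
    ORDER BY total_views DESC
    """
).fetchall()

# A distinct query over a single column.
distinct_countries = conn.execute(
    "SELECT DISTINCT viewer_country FROM profile_views ORDER BY viewer_country"
).fetchall()

print(aggregated)
print(distinct_countries)
conn.close()
```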

References

  1. S2CID 12327343.
  2. .
  3. .
  4. .
  5. "The Apache Software Foundation Announces Apache® Pinot™ as a Top-Level Project". blogs.apache.org. 2 August 2021.
  6. .
  7. .
  8. Pawar, Neha. "Pinot Joins Apache Incubator". LinkedIn Engineering. 1 April 2019. Archived from the original on 2 April 2019.
  9. Gopalakrishna, Kishore. "Open Sourcing Pinot: Scaling the Wall of Real-Time Analytics". engineering.linkedin.com. LinkedIn. Archived from the original on 10 September 2015. Retrieved 3 September 2020.
  10. Yegulalp, Serdar (11 June 2015). "LinkedIn fills another SQL-on-Hadoop niche". InfoWorld.
  11. S2CID 232478317.
  12. .
  13. .
