Apache HBase

Source: Wikipedia, the free encyclopedia.

Initial release: 28 March 2008
Stable release:
  2.4.x: 2.4.14 / 29 August 2022[1]
  2.5.x: 2.5.3 / 5 February 2023[1]
Preview release: 3.0.0-alpha-3 / 27 June 2022[1]
License: Apache License 2.0
Website: hbase.apache.org

HBase is an open-source, non-relational, distributed database modeled after Google's Bigtable and written in Java. It provides a fault-tolerant way of storing large quantities of sparse data (small amounts of information caught within a large collection of empty or unimportant data, such as finding the 50 largest items in a group of 2 billion records, or finding the non-zero items representing less than 0.1% of a huge collection).
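
To make the idea of sparse storage concrete, the following is a minimal sketch in plain Java, not HBase code. It assumes a nested sorted map (row key, then column name) in which only cells that were actually written take up space. The class name `SparseTableSketch`, its methods, and the sample keys are hypothetical and chosen only for illustration.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Toy sparse table: only written cells are stored. Illustrative only, not HBase code. */
final class SparseTableSketch {
    // row key -> (column name -> value); both levels are kept sorted by key
    private final NavigableMap<String, NavigableMap<String, String>> rows = new TreeMap<>();

    /** Stores one cell; cells that are never written are never materialized. */
    void put(String row, String column, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>()).put(column, value);
    }

    /** Returns the cell value, or null when the cell was never written. */
    String get(String row, String column) {
        NavigableMap<String, String> columns = rows.get(row);
        return columns == null ? null : columns.get(column);
    }

    /** Counts the cells that are actually stored across all rows. */
    int storedCells() {
        int count = 0;
        for (Map<String, String> columns : rows.values()) {
            count += columns.size();
        }
        return count;
    }

    public static void main(String[] args) {
        SparseTableSketch table = new SparseTableSketch();
        table.put("user#0001", "info:email", "a@example.org");
        table.put("user#0002", "info:phone", "555-0100");
        System.out.println(table.get("user#0001", "info:email")); // a@example.org
        System.out.println(table.get("user#0001", "info:phone")); // null
        System.out.println(table.storedCells());                  // 2
    }
}
```

In this sketch two rows share no columns, yet only two cells are held in memory; a dense row-and-column layout would have to reserve space for the empty cells as well.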

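HBase features compression, in-memory operation, and Bloom filters on a per-column basis, as outlined in the original Bigtable paper.[2] Tables in HBase can be accessed through the Java API as well as through REST, Avro, or Thrift gateway APIs. HBase is a wide-column store and has been widely adopted because of its lineage with Hadoop and HDFS. HBase runs on top of HDFS and is well-suited for fast read and write operations on large datasets with high throughput and low input/output latency.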
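
HBase's feature set includes Bloom filters, which let a reader skip storage files that cannot contain a requested key. The following is a minimal sketch of the general technique in plain Java. It assumes a fixed bit-array size and two simple hash functions combined by double hashing, all chosen only for illustration. The class name `BloomFilterSketch` is hypothetical, and this is not HBase's implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.BitSet;

/** Minimal Bloom filter sketch; sizing and hashing are illustrative, not HBase's. */
final class BloomFilterSketch {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    BloomFilterSketch(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    /** 32-bit FNV-1a hash, used here as the second base hash. */
    private static int fnv1a(byte[] data) {
        int h = 0x811c9dc5;
        for (byte b : data) {
            h ^= (b & 0xff);
            h *= 0x01000193;
        }
        return h;
    }

    /** Derives the i-th bit position from two base hashes (double hashing). */
    private int index(byte[] key, int i) {
        int h1 = Arrays.hashCode(key);
        int h2 = fnv1a(key);
        return Math.floorMod(h1 + i * h2, size);
    }

    /** Records a key by setting all of its bit positions. */
    void add(String key) {
        byte[] k = key.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(k, i));
        }
    }

    /** False means the key was definitely never added; true means it may have been. */
    boolean mightContain(String key) {
        byte[] k = key.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(k, i))) {
                return false;
            }
        }
        return true;
    }
}
```

A caller would build one filter per file of keys and consult `mightContain` before opening the file. A `false` answer is always safe to trust, while a `true` answer can occasionally be a false positive, so the file must still be checked.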

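HBase is not a direct replacement for a classic SQL database; however, the Apache Phoenix project provides a SQL layer for HBase, and the Apache Trafodion project provides a SQL query engine with distributed ACID transaction protection across multiple statements, tables and rows that use HBase as a storage engine.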

HBase serves several data-driven websites,[3] although Facebook's Messaging Platform migrated from HBase to MyRocks in 2018.[4][5] Unlike relational and traditional databases, HBase does not support SQL scripting; instead, the equivalent logic is written in Java, much as a MapReduce application would be.
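
As a rough illustration of writing a query as code rather than SQL, the sketch below uses plain Java, not the HBase client API. It assumes a sorted in-memory map as a stand-in for a table and expresses a range-plus-filter query as a loop over a key range. The class name `ScanSketch`, the `scan` method, and the sample rows are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical stand-in for a table; shows a query written as code instead of SQL. */
final class ScanSketch {
    // Roughly: SELECT key FROM rows
    //          WHERE key >= start AND key < stop AND value LIKE '%needle%'
    static List<String> scan(NavigableMap<String, String> rows,
                             String start, String stop, String needle) {
        List<String> hits = new ArrayList<>();
        // subMap visits only the requested key range (start inclusive, stop exclusive)
        for (Map.Entry<String, String> e : rows.subMap(start, true, stop, false).entrySet()) {
            if (e.getValue().contains(needle)) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> rows = new TreeMap<>();
        rows.put("msg#001", "hello world");
        rows.put("msg#002", "goodbye");
        rows.put("msg#003", "hello again");
        rows.put("msg#104", "hello from outside the range");
        System.out.println(scan(rows, "msg#000", "msg#100", "hello")); // [msg#001, msg#003]
    }
}
```

The key range plays the role of a `WHERE` clause on the primary key, and the value check inside the loop plays the role of an additional predicate.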

In the parlance of Eric Brewer's CAP theorem, HBase is a CP type system.

History

Apache HBase began as a project by the company Powerset, out of a need to process massive amounts of data for natural-language search. It has been a top-level Apache project since 2010.

Facebook elected to implement its new messaging platform using HBase in November 2010, but migrated away from HBase in 2018.[4]

The 2.4.x series is the current stable release line; it supersedes earlier release lines.

Use cases & production deployments

Enterprises that use HBase

The following is a list of notable enterprises that have used or are using HBase:

References

  1. ^ a b c "Apache HBase – Apache HBase Downloads". Retrieved 27 September 2022.
  2. ^ Chang, et al. (2006). Bigtable: A Distributed Storage System for Structured Data
  3. ^ "Apache HBase – Powered By Apache HBase". hbase.apache.org. Retrieved 8 April 2018.
  4. ^ a b "Migrating Messenger storage to optimize performance". www.facebook.com. 26 June 2018. Retrieved 5 July 2018.
  5. ^ "Facebook: Why our 'next-gen' comms ditched MySQL". Retrieved 17 December 2010.
  6. ^ HBaseCon (2 August 2016). "Apache HBase at Airbnb". slideshare.net. Retrieved 8 April 2018.
  7. ^ "Near Real Time Search Indexing". 4 January 2018.
  8. ^ "Is data locality always out of the box in Hadoop?". 10 March 2018.
  9. ^ "Why Imgur Dropped MySQL in Favor of HBase - DZone Database". dzone.com. Retrieved 8 April 2018.
  10. ^ "Tech Tuesday: Imgur Notifications: From MySQL to HBase - The Imgur Blog". blog.imgur.com. Retrieved 8 April 2018.
  11. ^ Doyung Yoon. "S2Graph : A Large-Scale Graph Database with HBase".
  12. ^ Cheolsoo Park and Ashwin Shankar. "Netflix: Integrating Spark at Petabyte Scale".
  13. ^ Pinterest Engineering (30 March 2018). "Improving HBase backup efficiency at Pinterest". Medium. Retrieved 14 April 2020.
  14. ^ "Hbase at Salesforce.com".
  15. ^ Josh Baer. "How Apache Drives Spotify's Music Recommendations".
  16. ^ "Tuenti Group Chat: Simple, yet complex". Archived from the original on 24 November 2012. Retrieved 29 September 2015.
  17. ^ "Tuenti Asyncthrift". GitHub. 6 November 2013.
