Difference between Hive and HBase?
- Hive is a data warehousing package built on top of Hadoop. It is mainly used for data analysis and is aimed at users already comfortable with Structured Query Language (SQL).
- Its query language, Hive Query Language (HQL), is similar to SQL.
- Hive manages and queries structured data and abstracts away the complexity of Hadoop. Its main limitations are:
- Not a full database.
- Not a real-time processing system.
- Not SQL-92 compliant.
- Does not provide row-level inserts, updates, or deletes on ordinary (non-transactional) tables.
- Does not support transactions on ordinary tables, and has limited subquery support.
- Query optimization is still evolving.
- HBase is a column-oriented database management system that runs on top of the Hadoop Distributed File System (HDFS).
- It is well suited for sparse data sets, which are common in many Big Data use cases.
- It is an open-source, distributed database developed by the Apache Software Foundation.
- It is modeled on Google's Bigtable and is written primarily in Java.
- It can store massive amounts of data, from terabytes to petabytes.
- It is built for low-latency operations and is used extensively for read and write operations.
- It stores large amounts of data in the form of tables.
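
To make the column-oriented, sparse storage idea above more concrete, here is a minimal sketch in Python. It is a toy in-memory model, not the HBase client API: the `SparseTable` class, its method names, and the sample row keys and values are all hypothetical and exist only for illustration. It assumes cells are addressed as `family:qualifier`, and it shows that a row stores only the cells that were actually written.

```python
from collections import defaultdict


class SparseTable:
    """Toy in-memory model of a column-family table (NOT the HBase API)."""

    def __init__(self, families):
        # Column families are fixed when the table is created.
        self.families = set(families)
        # row key -> {"family:qualifier": value}; unwritten cells take no space.
        self.rows = defaultdict(dict)

    def put(self, row_key, column, value):
        """Write one cell; qualifiers inside a family need no declaration."""
        family, _, qualifier = column.partition(":")
        if family not in self.families or not qualifier:
            raise KeyError(f"unknown family or missing qualifier: {column}")
        self.rows[row_key][column] = value

    def get(self, row_key, column=None):
        """Read a whole row, or a single cell (None if it was never written)."""
        row = self.rows.get(row_key, {})
        return dict(row) if column is None else row.get(column)

    def scan(self, start, stop):
        """Yield rows whose keys fall in [start, stop), in sorted key order."""
        for key in sorted(self.rows):
            if start <= key < stop:
                yield key, dict(self.rows[key])


# Hypothetical sample data: the two rows carry different columns.
table = SparseTable(families=["info", "stats"])
table.put("user#001", "info:name", "Asha")
table.put("user#001", "info:email", "asha@example.com")
table.put("user#002", "info:name", "Ben")
table.put("user#002", "stats:logins", 7)
```

In this sketch, `table.get("user#002", "info:email")` returns `None` because that cell was never written, which is the sense in which sparse data costs nothing to store. A real HBase table additionally keeps versions and timestamps per cell, which this model leaves out.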
|Hive|HBase|
|---|---|
|Hive is a query engine.|HBase is a data store, particularly for unstructured or sparse data.|
|Mainly used for batch processing.|Extensively used for transactional processing.|
|Not a real-time processing system.|Supports real-time processing.|
|Only for analytical queries.|Supports real-time querying.|
|Runs on top of Hadoop.|Runs on top of HDFS (Hadoop Distributed File System).|
|Apache Hive is not a database.|HBase is a NoSQL database.|
|Has a schema model.|Does not enforce a fixed schema; only column families are defined up front.|
|Made for high-latency operations.|Made for low-latency operations.|
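
The batch-versus-real-time contrast in the table can be sketched with plain Python. This is only an analogy under stated assumptions, not Hive or HBase code: the `orders` records and the function names `revenue_by_country` and `lookup_order` are hypothetical. The first function reads every record to build an aggregate, which is the shape of a Hive-style analytical query. The second fetches one record by its key, which is the shape of an HBase-style lookup.

```python
from collections import Counter

# Hypothetical sample data: (order_id, country, amount).
orders = [
    ("o-1001", "IN", 250),
    ("o-1002", "US", 120),
    ("o-1003", "IN", 75),
    ("o-1004", "DE", 310),
]


def revenue_by_country(records):
    """Batch-style analytics: touches every record, like a GROUP BY query."""
    totals = Counter()
    for _order_id, country, amount in records:
        totals[country] += amount
    return dict(totals)


# Key-indexed view of the same data, standing in for a row-key lookup table.
orders_by_id = {
    order_id: (country, amount) for order_id, country, amount in orders
}


def lookup_order(order_id):
    """Point lookup: fetches one record by key without scanning the rest."""
    return orders_by_id.get(order_id)


report = revenue_by_country(orders)   # full pass over the data
single = lookup_order("o-1003")       # single keyed read
```

The aggregate costs time proportional to the whole data set, so it suits scheduled reports. The keyed read costs roughly the same however large the data set grows, so it suits serving individual requests.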