Hive

  • Hive is a component of the Hortonworks Data Platform (HDP).
  • Hive provides a SQL interface to data stored in HDP.
  • Hive has three main functions:
    • Data summarization
    • Query
    • Analysis
  • It supports queries expressed in a language called HiveQL, which automatically translates SQL-like queries into MapReduce jobs executed on Hadoop.
  • It also supports data serialization and deserialization (SerDe) and allows flexible schema design. Table schemas are kept in a system catalog called the Hive Metastore.
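
To make the translation into MapReduce more concrete, here is a minimal Python sketch of how a query such as `SELECT dept, COUNT(*) FROM employees GROUP BY dept` could be broken into map, shuffle, and reduce phases. It is an illustration only, not Hive's actual execution engine; the `employees` table, its columns, and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows of an `employees` table: (name, dept)
EMPLOYEES = [
    ("alice", "sales"),
    ("bob", "engineering"),
    ("carol", "sales"),
    ("dave", "engineering"),
    ("erin", "support"),
]


def map_phase(rows):
    """Emit one (key, 1) pair per row, keyed by the GROUP BY column."""
    for _name, dept in rows:
        yield dept, 1


def shuffle_phase(pairs):
    """Group emitted values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Aggregate each key's values; summing the 1s plays the role of COUNT(*)."""
    return {key: sum(values) for key, values in grouped.items()}


def count_by_dept(rows):
    """Chain the three phases for the hypothetical GROUP BY query."""
    return reduce_phase(shuffle_phase(map_phase(rows)))


if __name__ == "__main__":
    print(count_by_dept(EMPLOYEES))
```

In a real cluster the map and reduce phases run in parallel across many nodes and the shuffle moves data over the network; the sketch keeps everything in one process to show only the shape of the computation.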

Architecture of Hive:

[Figure: Hive architecture diagram]

Features of Hive:

  • Support for different storage types such as plain text, RCFile, ORC, HBase, and others.
  • Built-in functions to manipulate dates, strings, and other data, including functions used in data mining.
  • Hive supports extending the built-in set with user-defined functions (UDFs) to handle use cases that the built-in functions do not cover.
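
UDFs proper are typically written in Java, but Hive can also stream rows through an external script with the `TRANSFORM ... USING` clause, which is another way to add custom logic. The sketch below shows what such a script might look like in Python. It assumes a hypothetical two-column, tab-separated input (`name` and `signup_date` in `YYYY-MM-DD` form) and emits the name together with the day of the week; the column names and the script name `weekday.py` are illustrative assumptions.

```python
import sys
from datetime import datetime


def transform_line(line):
    """Turn 'name<TAB>YYYY-MM-DD' into 'name<TAB>weekday name'."""
    name, signup_date = line.rstrip("\n").split("\t")
    weekday = datetime.strptime(signup_date, "%Y-%m-%d").strftime("%A")
    return f"{name}\t{weekday}"


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Hive streams rows to the script on stdin as tab-separated text
    # and reads the transformed rows back from stdout.
    for line in stdin:
        if line.strip():
            stdout.write(transform_line(line) + "\n")


if __name__ == "__main__":
    main()
```

From HiveQL, a script like this might be registered with `ADD FILE weekday.py;` and then called as `SELECT TRANSFORM(name, signup_date) USING 'python weekday.py' AS (name, weekday) FROM users;`, where `users` is again a hypothetical table.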

Limitations of Hive:

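  • Hive is designed for overwriting or appending data. Row-level updates and deletes are not supported on ordinary tables; they require ACID transactional tables, introduced in Hive 0.14.
  • Subquery support is limited: subqueries in the FROM clause are supported, but subqueries in the WHERE clause (IN, NOT IN, EXISTS, NOT EXISTS) are available only from Hive 0.13 onward.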
