Difference between Hive and Pig
|Hive is used by data analysts.||Pig is generally used by researchers and programmers.|
|Hive is used for structured data.||Pig is used for semi-structured data.|
|Hive has a declarative, SQL-like language (HiveQL).||Pig has a procedural data flow language (Pig Latin).|
|Hive provides a Thrift-based server: clients send queries directly to the Hive server, which executes them.||This feature is not available in Pig.|
|Hive directly leverages SQL expertise and can therefore be learned easily.||Pig Latin resembles SQL in places but differs from it to a great extent, so it takes some time and effort to master.|
|Early Hive releases did not support Avro; newer releases handle it through the Avro SerDe.||Pig supports Avro.|
|Hive operates on the server side of a cluster.||Pig operates on the client side of a cluster.|
|Hive is mainly used for creating reports.||Pig is mainly used for programming.|
|Hive is helpful for ETL.||Pig is a great ETL (Extract, Transform and Load) tool for big data because of its powerful transformation and processing capabilities.|
|Hive uses a variation of the SQL DDL language: tables are defined beforehand and the schema details are stored in a local database.||Pig has no dedicated metadata database; schemas and data types are defined in the script itself.|
|Hive has a provision for partitions, so it can process a subset of the data, for example by date or in alphabetical order.||Pig has no notion of partitions, though a similar effect can be achieved through filters.|
|Hive does not include this feature.||Pig shows users sample data for each scenario and each step through its "Illustrate" function.|
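
The contrast between a declarative language and a procedural data flow language can be sketched in plain Python. This is only an illustration under stated assumptions, not Hive or Pig code: `sqlite3` stands in for a SQL engine such as HiveQL, ordinary Python steps stand in for a Pig Latin script, and the `VISITS` sample data and both function names are hypothetical.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (visitor, page, seconds_on_page)
VISITS = [
    ("alice", "home", 30),
    ("bob", "home", 5),
    ("alice", "docs", 120),
    ("carol", "docs", 45),
    ("bob", "docs", 8),
]


def declarative_totals(rows, min_seconds):
    """Declarative style (stand-in for HiveQL): state the result you want."""
    conn = sqlite3.connect(":memory:")
    # The table schema is defined up front, before any data is loaded.
    conn.execute("CREATE TABLE visits (visitor TEXT, page TEXT, seconds INTEGER)")
    conn.executemany("INSERT INTO visits VALUES (?, ?, ?)", rows)
    result = conn.execute(
        "SELECT page, SUM(seconds) FROM visits "
        "WHERE seconds >= ? GROUP BY page ORDER BY page",
        (min_seconds,),
    ).fetchall()
    conn.close()
    return result


def dataflow_totals(rows, min_seconds):
    """Data flow style (stand-in for Pig Latin): name each pipeline step."""
    # Step 1, comparable to LOAD ... AS: field names are given in the script.
    loaded = [{"visitor": v, "page": p, "seconds": s} for v, p, s in rows]
    # Step 2, comparable to FILTER ... BY.
    filtered = [r for r in loaded if r["seconds"] >= min_seconds]
    # Step 3, comparable to GROUP ... BY.
    grouped = defaultdict(list)
    for record in filtered:
        grouped[record["page"]].append(record)
    # Step 4, comparable to FOREACH ... GENERATE with SUM.
    totals = [(page, sum(r["seconds"] for r in recs)) for page, recs in grouped.items()]
    return sorted(totals)


if __name__ == "__main__":
    print(declarative_totals(VISITS, 10))
    print(dataflow_totals(VISITS, 10))
```

Both functions return the same totals per page. The first describes the result and leaves the execution plan to the engine, while the second spells out each intermediate relation, which is roughly how a Pig Latin script reads.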