Difference between Impala and Apache Hive
|Impala performs runtime code generation for “big loops” using LLVM.||Apache Hive generates query expressions at compile time.|
|Hadoop 2.6.0||Hadoop 2.7.3|
|Runtime Filtering Optimization Enabled||All queries run through LLAP|
|Parquet format with snappy compression||ORCFile format with zlib compression|
|Impala avoids startup overhead because its daemon processes are started at boot time and are always ready to process a query.||Every Hive query incurs a “cold start” overhead.|
|Impala is meant for interactive computing.||Apache Hive might not be ideal for interactive computing.|
|Impala is more like an MPP database.||Hive is batch-based, running on Hadoop MapReduce.|
|Impala does not support complex types.||Hive supports complex types.|
|Impala does not support fault tolerance.||Apache Hive is fault tolerant.|
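
The startup-overhead row in the table above can be illustrated with a toy cost model. This is only a minimal sketch, not a benchmark: the function name `total_latency` and all timing values are hypothetical and chosen purely for illustration. It assumes that an always-running daemon pays its startup cost once at boot, outside the query path, while a cold-start engine pays it again on every query.

```python
def total_latency(num_queries, startup_s, exec_s, warm_daemon):
    """Return total wall-clock seconds spent answering num_queries.

    startup_s   -- hypothetical per-launch startup cost in seconds
    exec_s      -- hypothetical execution time of one query in seconds
    warm_daemon -- True if the engine is already running when queries arrive
    """
    if warm_daemon:
        # Startup was paid at boot time, so only execution time counts.
        return num_queries * exec_s
    # Cold start: every query pays the startup cost before executing.
    return num_queries * (startup_s + exec_s)


# Illustrative (made-up) numbers: 10 short queries, 5 s startup, 1 s execution.
warm = total_latency(10, startup_s=5.0, exec_s=1.0, warm_daemon=True)
cold = total_latency(10, startup_s=5.0, exec_s=1.0, warm_daemon=False)
print(f"warm daemon: {warm} s, cold start: {cold} s")
```

Under these assumed numbers the startup cost dominates short queries, which is why the cold-start penalty matters most for interactive workloads and much less for long-running batch jobs.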