Difference between Impala and Apache Hive

Impala | Apache Hive
Impala performs runtime code generation for "big loops" using LLVM. | Apache Hive generates query expressions at compile time.
Hadoop 2.6.0 | Hadoop 2.7.3
Runtime filtering optimization enabled | All queries run through LLAP
Parquet format with Snappy compression | ORC file format with zlib compression
Impala avoids startup overhead because its daemon processes are started at boot time and are always ready to process a query. | Every Hive query suffers from a "cold start".
Impala is meant for interactive computing. | Apache Hive might not be ideal for interactive computing.
Impala is more like an MPP database. | Hive is batch oriented and built on Hadoop MapReduce.
Impala's support for complex types is limited (ARRAY, MAP, and STRUCT in Parquet tables, from Impala 2.3 onward). | Hive supports complex types.
Impala is not fault tolerant: if a node fails while a query is running, the query has to be rerun. | Apache Hive is fault tolerant.
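
The runtime code generation point in the comparison can be illustrated with a toy sketch. It is not Impala's implementation: Impala emits native code through LLVM, whereas this example uses Python's built-in compile() as a stand-in. The expression tree, the column names (price, qty) and the helper names (interpret, to_source, generate) are all hypothetical and chosen only for illustration. The idea shown is that walking an expression tree for every row repeats the same dispatch work, while generating a specialized function once lets the per-row loop call it directly.

```python
# Toy illustration only: Python's compile() stands in for an LLVM code generator.

# Hypothetical expression tree for the predicate: (price * qty) > 100
EXPR = (">", ("*", ("col", "price"), ("col", "qty")), ("const", 100))

OPS = {"*": lambda a, b: a * b, ">": lambda a, b: a > b}


def interpret(node, row):
    """Evaluate the tree by walking it again for every row (no code generation)."""
    kind = node[0]
    if kind == "col":
        return row[node[1]]
    if kind == "const":
        return node[1]
    return OPS[kind](interpret(node[1], row), interpret(node[2], row))


def to_source(node):
    """Turn the tree into Python source text for a single expression."""
    kind = node[0]
    if kind == "col":
        return f"row[{node[1]!r}]"
    if kind == "const":
        return repr(node[1])
    return f"({to_source(node[1])} {kind} {to_source(node[2])})"


def generate(node):
    """Build a specialized function once, when the query runs."""
    source = f"lambda row: {to_source(node)}"
    return eval(compile(source, "<generated>", "eval"))


rows = [
    {"price": 20, "qty": 3},
    {"price": 30, "qty": 4},
    {"price": 5, "qty": 50},
]

predicate = generate(EXPR)  # one-time generation cost
interpreted = [r for r in rows if interpret(EXPR, r)]  # tree walk per row
generated = [r for r in rows if predicate(r)]  # direct call per row
```

Both paths select the same rows. The generated version pays a one-time cost up front, which is why this approach suits long loops over many rows rather than very small inputs.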
