What is the difference between Apache Hive and Impala?
Apache Hive vs Impala
| Apache Hive | Impala |
|---|---|
| Hive generates query expressions at compile time and does not generate runtime code for “big loops”. | Impala generates runtime code for “big loops” using LLVM. |
| Hadoop 2.7.3 | Hadoop 2.6.0 |
| All queries run through LLAP | Runtime Filtering Optimization Enabled |
| ORCFile format with zlib compression | Parquet format with snappy compression |
| Every Hive query incurs a “cold start” delay. | Impala avoids startup overhead because its daemon processes are started at boot time and are always ready to process a query. |
| Apache Hive might not be ideal for interactive computing | Impala is meant for interactive computing. |
| Hive is batch-oriented and built on Hadoop MapReduce. | Impala works more like an MPP (massively parallel processing) database. |
| Hive supports complex types. | Impala does not support complex types. |
| Apache Hive is fault tolerant. | Impala is not fault tolerant. |
| Hive is the more universal, versatile, and pluggable tool. | Impala is used for its brute processing power, which delivers very fast analytic results. |
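
The “cold start” row in the table can be illustrated with a toy latency model. This is only a sketch: the function name `total_latency` and every number in it are hypothetical, chosen for illustration, and are not measurements of Hive or Impala. It shows why a startup cost paid on every query hurts short interactive queries far more than a startup cost paid once by long-running daemons.

```python
# Toy latency model. All names and numbers are hypothetical, not benchmarks.

def total_latency(num_queries: int, startup_s: float, exec_s: float,
                  startup_per_query: bool) -> float:
    """Return total wall-clock seconds for a series of queries.

    startup_per_query=True models a "cold start" paid by every query.
    startup_per_query=False models daemons whose startup is paid once.
    """
    startups = num_queries if startup_per_query else 1
    return startups * startup_s + num_queries * exec_s


# Hypothetical workload: 20 short interactive queries.
cold = total_latency(20, startup_s=10.0, exec_s=2.0, startup_per_query=True)
warm = total_latency(20, startup_s=10.0, exec_s=2.0, startup_per_query=False)
print(cold, warm)  # prints: 240.0 50.0
```

With these assumed figures the per-query startup dominates the total. For long batch jobs, where execution time is much larger than startup time, the gap would shrink accordingly.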