What is the difference between Impala and Apache Hive?
Difference between Impala and Apache Hive
| Impala | Apache Hive |
|---|---|
| Impala generates code at runtime for “big loops” using LLVM. | Apache Hive generates query expressions at compile time. |
| Hadoop 2.6.0 | Hadoop 2.7.3 |
| Runtime Filtering Optimization Enabled | All queries run through LLAP |
| Parquet format with snappy compression | ORCFile format with zlib compression |
| Impala avoids startup overhead because its daemon processes are started at boot time and are always ready to process a query. | Every Hive query incurs a “cold start” overhead. |
| Impala is meant for interactive computing. | Apache Hive might not be ideal for interactive computing. |
| Impala behaves more like an MPP database. | Hive is a batch-oriented engine built on Hadoop MapReduce. |
| Impala does not support complex types. | Hive supports complex types. |
| Impala does not support fault tolerance. | Apache Hive is fault tolerant. |
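
The startup-overhead row can be made concrete with a toy model. The sketch below is only an illustration, not a benchmark: the function `total_latency` and all the numbers in it are hypothetical, chosen to show how a per-query startup cost adds up compared with daemons that are already running when a query arrives.

```python
def total_latency(num_queries, exec_seconds, startup_seconds, startup_per_query):
    """Return the total query latency in seconds for a run of identical queries.

    This is a toy model with hypothetical inputs, not a measurement of
    Impala or Hive.
    """
    if startup_per_query:
        # "Cold start" model: every query pays the startup cost again.
        return num_queries * (startup_seconds + exec_seconds)
    # Long-running daemon model: processes were started at boot time,
    # so no startup cost is charged to any individual query.
    return num_queries * exec_seconds


# Hypothetical workload: 20 short interactive queries.
cold = total_latency(20, exec_seconds=2.0, startup_seconds=10.0, startup_per_query=True)
warm = total_latency(20, exec_seconds=2.0, startup_seconds=10.0, startup_per_query=False)

print(f"per-query startup: {cold:.0f} s, always-on daemons: {warm:.0f} s")
```

Under these assumed numbers the startup cost dominates short queries, which is why the table describes the always-on daemon design as better suited to interactive work. For long batch jobs the same fixed startup cost would be a much smaller share of the total.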

