Best tool for processing web streaming data (Hadoop, Pig, or Hive):

  • Pig is easy to program.
  • It makes parallel execution of simple, “embarrassingly parallel” data analysis tasks trivial.
  • Complex tasks made up of multiple interrelated data transformations are explicitly encoded as data flow sequences, making them easy to write, understand, and maintain.
  • Because of the way tasks are encoded, the system can optimize their execution automatically, allowing the user to focus on semantics rather than efficiency.
  • It is extensible: users can create their own functions for special-purpose processing.
  • Pig and Hive can be used together for efficient processing.
  • Pig is the better tool for parsing and ETL-style jobs, and it supports UDFs better than Hive does.
  • You can build your own framework on Pig that calls HiveQL and MapReduce jobs for additional functionality.
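
To make the points about user-defined functions and ETL-style parsing concrete, here is a minimal sketch of a Python UDF that parses one line of a web access log. It is an illustration, not a definitive implementation: the function name `parse_hit`, the regular expression, and the Common Log Format input layout are all assumptions made for this example. Pig's Python UDF support normally supplies an `outputSchema` decorator; the fallback below exists only so the sketch also runs outside Pig, and how the file is registered with Pig depends on your setup.

```python
import re

try:
    # Normally provided by Pig's Python UDF support.
    from pig_util import outputSchema
except ImportError:
    # Fallback (assumption for this sketch) so the file runs outside Pig.
    def outputSchema(schema):
        def wrap(func):
            func.outputSchema = schema
            return func
        return wrap

# Assumed input layout: Common Log Format, for example
# 127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326
LOG_PATTERN = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) (\S+)'
)


@outputSchema("hit:tuple(ip:chararray, method:chararray, path:chararray, status:int)")
def parse_hit(line):
    """Return (ip, method, path, status) for one log line, or None if it does not match."""
    if line is None:
        return None
    match = LOG_PATTERN.match(line)
    if match is None:
        # Malformed lines yield None so a later step can filter them out.
        return None
    ip, _timestamp, method, path, status, _size = match.groups()
    return (ip, method, path, int(status))
```

Returning None for malformed lines keeps the parsing step tolerant of bad input, which matters for web logs; the rows that failed to parse can then be dropped or counted in a separate step of the data flow.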
