Apache Hadoop:

  • It is a Big Data framework which uses Hadoop Distributed File System to Store the data and MapReduce framework to process that data. Java is used as native language to write MapReduce programs.
  • Apache Hbase and Cassandra are both NoSQL databases.
  • It does not maintain the strict ACID transactions and require better data modelling to be used effectual.
  • Their presentation can be calculated by benchmarking the database using a tool called YCSB(Yahoo Cloud Serving Benchmark).
  • Factors examine in benchmarking can be read, write ,read-modify-write latency in executing queries.
Hbase:
  • It is a Master -Slave NoSQL database dependent upon Hadoop cluster .
  • It is good for Heavy reads and less Write applications
apache hadoop ecosystem

Apache Cassandra:

  • It is a master less and shared-nothing, ring based architecture which does not depend upon Hadoop framework.
  • It is good for Write heavy and less read applications.

Apache Hive:

  • It is a collective processing framework to process the data using a language called Hive Query Language(HQL).
  • HQL is a sql wrapper on top of HDFS which protect writing Mapreduce programs in Java.
  • Instead one can use SQL like language to do their daily tasks.
  • Apache Hive is mainly used for ETL and data warehousing feature in Hadoop.
  • The process of hive is to create tables, joins, unions, aggregates etc. It develop visualization reports can be done by Tableau.

Categorized in:

cassandra

Tagged in:

, , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,

Share Article:

Leave a Reply

Ads Blocker Image Powered by Code Help Pro

Ads Blocker Detected!!!

We have detected that you are using extensions to block ads. Please support us by disabling these ads blocker.

Powered By
100% Free SEO Tools - Tool Kits PRO