Why do we need Data Locality in Hadoop ?

July 12, 2021 2 Min Read

Data Locality in Hadoop

Datasets in HDFS are stored as blocks on the DataNodes of the Hadoop cluster.
During the execution of a MapReduce job, each Mapper processes one Input Split, which typically corresponds to one block.
If the data does not reside on the node where the Mapper is running, it must be copied over the network from the DataNode that holds it to the node running the Mapper.

Datasets in HDFS - Data Locality in Hadoop

Now, if a MapReduce job has more than 100 Mappers and each Mapper tries to copy its data from another DataNode in the cluster at the same time, the result is serious network congestion, which degrades the performance of the overall system.
Hence, running the computation close to the data is an effective and inexpensive solution, technically termed data locality in Hadoop. It helps increase the overall throughput of the system.
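
To put a rough number on the congestion argument, here is a minimal back-of-the-envelope sketch. It assumes each Mapper reads exactly one block and that the block size is 128 MB (the default in Hadoop 2 and later); the function name `network_traffic_mb` and its parameters are hypothetical and not part of any Hadoop API.

```python
# Assumption: 128 MB blocks (the default dfs.blocksize in Hadoop 2 and later).
BLOCK_SIZE_MB = 128


def network_traffic_mb(num_mappers, local_mappers, block_size_mb=BLOCK_SIZE_MB):
    """Return the MB that must cross the network when each mapper reads one block.

    Mappers running on a node that holds their block read from local disk and
    contribute nothing; every other mapper pulls one full block remotely.
    """
    if not 0 <= local_mappers <= num_mappers:
        raise ValueError("local_mappers must be between 0 and num_mappers")
    return (num_mappers - local_mappers) * block_size_mb


if __name__ == "__main__":
    # Compare no locality, half locality, and full locality for 100 mappers.
    for local in (0, 50, 100):
        traffic = network_traffic_mb(100, local)
        print(f"{local} local mappers -> {traffic} MB over the network")
```

Under these assumptions, 100 Mappers with no locality move 12,800 MB across the network, while the same job with full locality moves none.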

Types of data locality

Data Local
- In this type, the data and the mapper reside on the same node. This is the closest proximity between data and computation and the most preferred scenario.

Rack Local
- In this scenario, the mapper and the data reside on the same rack but on different DataNodes.

Different Rack
- In this scenario, the mapper and the data reside on different racks. This is the least preferred scenario, because the data has to travel between racks.
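
The three types above can be sketched as a small classification function. This is only an illustration under stated assumptions, not Hadoop's actual scheduler code: the names `locality_level` and `pick_node`, the node and rack names, and the `rack_of` mapping are all hypothetical.

```python
# Preference order: lower number means closer to the data.
PREFERENCE = {"data local": 0, "rack local": 1, "different rack": 2}


def locality_level(mapper_node, replica_nodes, rack_of):
    """Classify where a mapper runs relative to the replicas of its block.

    mapper_node   -- name of the node running the mapper
    replica_nodes -- set of node names that hold a replica of the block
    rack_of       -- dict mapping every node name to its rack name
    """
    if mapper_node in replica_nodes:
        return "data local"
    mapper_rack = rack_of[mapper_node]
    if any(rack_of[node] == mapper_rack for node in replica_nodes):
        return "rack local"
    return "different rack"


def pick_node(free_nodes, replica_nodes, rack_of):
    """Among the free nodes, return the one with the best locality level."""
    return min(
        free_nodes,
        key=lambda node: PREFERENCE[locality_level(node, replica_nodes, rack_of)],
    )


if __name__ == "__main__":
    # Hypothetical four-node cluster spread over three racks.
    rack_of = {"node1": "rackA", "node2": "rackA", "node3": "rackB", "node4": "rackC"}
    replicas = {"node1", "node3"}
    for node in sorted(rack_of):
        print(node, "->", locality_level(node, replicas, rack_of))
    print("best free node:", pick_node(["node4", "node2"], replicas, rack_of))
```

With this sample topology, a mapper on `node1` or `node3` is data local, one on `node2` is rack local, and one on `node4` is on a different rack, so `pick_node` prefers `node2` over `node4` when neither replica holder is free.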
