<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>how will you optimize hive performance - Wikitechy</title>
	<atom:link href="https://www.wikitechy.com/interview-questions/tag/how-will-you-optimize-hive-performance/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.wikitechy.com/interview-questions/tag/how-will-you-optimize-hive-performance/</link>
	<description>Interview Questions</description>
	<lastBuildDate>Mon, 13 Sep 2021 05:32:03 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>

<image>
	<url>https://www.wikitechy.com/interview-questions/wp-content/uploads/2025/10/cropped-wikitechy-icon-32x32.png</url>
	<title>how will you optimize hive performance - Wikitechy</title>
	<link>https://www.wikitechy.com/interview-questions/tag/how-will-you-optimize-hive-performance/</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>What is the difference between &#8216;select * from table&#8217; and &#8216;select column from table&#8217; in Hive ?</title>
		<link>https://www.wikitechy.com/interview-questions/hive/what-is-the-difference-between-select-from-table-and-select-column-from-table-in-hive/</link>
					<comments>https://www.wikitechy.com/interview-questions/hive/what-is-the-difference-between-select-from-table-and-select-column-from-table-in-hive/#respond</comments>
		
		<dc:creator><![CDATA[Editor]]></dc:creator>
		<pubDate>Tue, 13 Jul 2021 22:18:52 +0000</pubDate>
				<category><![CDATA[Hive]]></category>
		<category><![CDATA[Accenture interview questions and answers]]></category>
		<category><![CDATA[Altimetrik India Pvt Ltd interview questions and answers]]></category>
		<category><![CDATA[ANI Technologies Pvt Ltd interview questions and answers]]></category>
		<category><![CDATA[Capgemini interview questions and answers]]></category>
		<category><![CDATA[CASTING NETWORKS INDIA PVT LIMITED interview questions and answers]]></category>
		<category><![CDATA[CGI Group Inc interview questions and answers]]></category>
		<category><![CDATA[Collabera Technologies interview questions and answers]]></category>
		<category><![CDATA[Dell International Services India Pvt Ltd interview questions and answers]]></category>
		<category><![CDATA[explain in hive]]></category>
		<category><![CDATA[explain the difference between sql and apache hive.]]></category>
		<category><![CDATA[Flipkart interview questions and answers]]></category>
		<category><![CDATA[Genpact interview questions and answers]]></category>
		<category><![CDATA[hive interview questions]]></category>
		<category><![CDATA[hive interview questions and answers]]></category>
		<category><![CDATA[hive query based interview questions]]></category>
		<category><![CDATA[hive query without mapreduce]]></category>
		<category><![CDATA[hive questions]]></category>
		<category><![CDATA[hive scenario based interview questions]]></category>
		<category><![CDATA[how will you optimize hive performance]]></category>
		<category><![CDATA[IBM interview questions and answers]]></category>
		<category><![CDATA[Impetus Technologies interview questions and answers]]></category>
		<category><![CDATA[Indiabulls Technology Solutions Ltd interview questions and answers]]></category>
		<category><![CDATA[Mindtree interview questions and answers]]></category>
		<category><![CDATA[NetApp interview questions and answers]]></category>
		<category><![CDATA[Prokarma Softech Pvt Ltd interview questions and answers]]></category>
		<category><![CDATA[R Systems interview questions and answers]]></category>
		<category><![CDATA[Reliance Industries Ltd interview questions and answers]]></category>
		<category><![CDATA[Synechron Te interview questions and answers]]></category>
		<category><![CDATA[Tata Consultancy Service interview questions and answers]]></category>
		<category><![CDATA[Tech Mahindra interview questions and answers]]></category>
		<category><![CDATA[Trigent Software interview questions and answers]]></category>
		<category><![CDATA[UnitedHealth Group interview questions and answers]]></category>
		<category><![CDATA[Virtusa Consulting Services Pvt Ltd interview questions and answers]]></category>
		<category><![CDATA[Wells Fargo interview questions and answers]]></category>
		<category><![CDATA[Wipro Infotech interview questions and answers]]></category>
		<category><![CDATA[Wipro interview questions and answers]]></category>
		<category><![CDATA[Yash Technologies interview questions and answers]]></category>
		<category><![CDATA[Yodlee Infotech Pvt Ltd interview questions and answers]]></category>
		<guid isPermaLink="false">https://www.wikitechy.com/interview-questions/?p=602</guid>

					<description><![CDATA[Answer : A table in Hive is stored as a directory in HDFS...]]></description>
										<content:encoded><![CDATA[<div class="TextHeading">
<div class="hddn">
<h2 id="difference-between-select-from-table-and-select-column-from-table-in-hive" class="color-green" style="text-align: justify;">Difference between &#8216;select * from table&#8217; and &#8216;select column from table&#8217; in hive</h2>
</div>
</div>
<div class="Content" style="text-align: justify;">
<div class="hddn">
<ul>
<li>A table in Hive is stored as a directory in HDFS.</li>
<li>With &#8216;select * from table&#8217;, the Hive query processor simply goes to that directory, which holds one or more files, and reads them according to the table schema.</li>
<li>This is reasonable only when the data is very small, for example less than a gigabyte.</li>
<li>On real clusters the table may hold terabytes of data, so running &#8216;select * from table&#8217; and displaying the result will take a long time.</li>
<li>Any further data processing you do in Hive is carried out through a sequence of MapReduce programs that read the table data stored on the Hadoop Distributed File System (HDFS).</li>
<li>Hive is a query processing engine based on MapReduce.</li>
<li>Tables often have a large number of columns representing different values. To perform &#8216;select column from table&#8217;, the MapReduce program scans all rows and extracts that column.</li>
</ul>
</div>
</div>
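<div class="Content" style="text-align: justify;">
<div class="hddn">
<p>As a rough sketch of the two cases, assume a hypothetical table named <code>employees</code> with a <code>name</code> column (neither appears in the answer above). Whether a simple query is served by a direct file read or by a MapReduce job is governed by the <code>hive.fetch.task.conversion</code> property, whose default value differs between Hive versions, so treat the settings below as an illustration and not as a recommendation.</p>
<pre><code>-- Hypothetical table and column names, for illustration only.

-- Plain scan: Hive can stream the files in the table's HDFS directory
-- through a fetch task, so no MapReduce job has to be launched.
SELECT * FROM employees LIMIT 10;

-- With fetch conversion switched off, even a single-column projection
-- is compiled into a job that scans every row to extract the column.
SET hive.fetch.task.conversion=none;
SELECT name FROM employees;

-- "more" lets simple projections, filters and LIMIT queries use a
-- fetch task as well.
SET hive.fetch.task.conversion=more;
SELECT name FROM employees LIMIT 10;
</code></pre>
</div>
</div>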
<div class="ImageContent">
<div class="hddn" style="text-align: justify;"><a href="https://cdn.wikitechy.com/interview-questions/hive/select-from-table-and-select-column-from-table-in-hive.png"><img decoding="async" class="aligncenter size-medium" src="https://cdn.wikitechy.com/interview-questions/hive/select-from-table-and-select-column-from-table-in-hive.png" alt="select-from-table-and-select-column-from-table-in-hive" width="288" height="149" /></a></div>
</div>
]]></content:encoded>
					
					<wfw:commentRss>https://www.wikitechy.com/interview-questions/hive/what-is-the-difference-between-select-from-table-and-select-column-from-table-in-hive/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>What is the difference between Hive and HBase ?</title>
		<link>https://www.wikitechy.com/interview-questions/hive/what-is-the-difference-between-hive-and-hbase/</link>
					<comments>https://www.wikitechy.com/interview-questions/hive/what-is-the-difference-between-hive-and-hbase/#respond</comments>
		
		<dc:creator><![CDATA[Editor]]></dc:creator>
		<pubDate>Tue, 13 Jul 2021 22:18:28 +0000</pubDate>
				<category><![CDATA[Hive]]></category>
		<category><![CDATA[Accenture interview questions and answers]]></category>
		<category><![CDATA[Altimetrik India Pvt Ltd interview questions and answers]]></category>
		<category><![CDATA[ANI Technologies Pvt Ltd interview questions and answers]]></category>
		<category><![CDATA[Capgemini interview questions and answers]]></category>
		<category><![CDATA[CASTING NETWORKS INDIA PVT LIMITED interview questions and answers]]></category>
		<category><![CDATA[CGI Group Inc interview questions and answers]]></category>
		<category><![CDATA[Collabera Technologies interview questions and answers]]></category>
		<category><![CDATA[Dell International Services India Pvt Ltd interview questions and answers]]></category>
		<category><![CDATA[difference between hbase and hdfs]]></category>
		<category><![CDATA[difference between hive and hdfs]]></category>
		<category><![CDATA[Flipkart interview questions and answers]]></category>
		<category><![CDATA[Genpact interview questions and answers]]></category>
		<category><![CDATA[hive interview questions]]></category>
		<category><![CDATA[hive query based interview questions]]></category>
		<category><![CDATA[hive scenario based interview questions]]></category>
		<category><![CDATA[how to use hbase with hadoop]]></category>
		<category><![CDATA[how will you optimize hive performance]]></category>
		<category><![CDATA[IBM interview questions and answers]]></category>
		<category><![CDATA[Impetus Technologies interview questions and answers]]></category>
		<category><![CDATA[Indiabulls Technology Solutions Ltd interview questions and answers]]></category>
		<category><![CDATA[Mindtree interview questions and answers]]></category>
		<category><![CDATA[NetApp interview questions and answers]]></category>
		<category><![CDATA[Prokarma Softech Pvt Ltd interview questions and answers]]></category>
		<category><![CDATA[R Systems interview questions and answers]]></category>
		<category><![CDATA[Reliance Industries Ltd interview questions and answers]]></category>
		<category><![CDATA[Synechron Te interview questions and answers]]></category>
		<category><![CDATA[Tata Consultancy Service interview questions and answers]]></category>
		<category><![CDATA[Tech Mahindra interview questions and answers]]></category>
		<category><![CDATA[Trigent Software interview questions and answers]]></category>
		<category><![CDATA[UnitedHealth Group interview questions and answers]]></category>
		<category><![CDATA[Virtusa Consulting Services Pvt Ltd interview questions and answers]]></category>
		<category><![CDATA[Wells Fargo interview questions and answers]]></category>
		<category><![CDATA[when to use hbase]]></category>
		<category><![CDATA[Wipro Infotech interview questions and answers]]></category>
		<category><![CDATA[Wipro interview questions and answers]]></category>
		<category><![CDATA[Yash Technologies interview questions and answers]]></category>
		<category><![CDATA[Yodlee Infotech Pvt Ltd interview questions and answers]]></category>
		<guid isPermaLink="false">https://www.wikitechy.com/interview-questions/?p=599</guid>

					<description><![CDATA[Answer : Hive is query engine...]]></description>
										<content:encoded><![CDATA[<div class="TextHeading">
<div class="hddn">
<h2 id="difference-between-hive-and-hbase" class="color-green">Difference between Hive and HBase</h2>
</div>
</div>
<div class="ImageContent">
<div class="hddn"><img fetchpriority="high" decoding="async" class="alignnone size-medium aligncenter" src="https://cdn.wikitechy.com/interview-questions/hive/integration-of-hive-and-hbase.png" alt="integration of hive and hbase" width="588" height="386" /></div>
</div>
<table class="table-bordered table-striped table table-responsive">
<tbody>
<tr>
<th>Hive</th>
<th>HBase</th>
</tr>
<tr>
<td class="text-leftalign" align="justify">Hive is a query engine.</td>
<td class="text-leftalign" align="justify">HBase is a data store, particularly suited to<br />
unstructured data.</td>
</tr>
<tr>
<td class="text-leftalign" align="justify">Apache Hive is mainly used for<br />
batch processing, i.e. OLAP.</td>
<td class="text-leftalign" align="justify">HBase is widely used for transactional<br />
processing, i.e. OLTP, where individual reads and writes<br />
must return with low latency.</td>
</tr>
<tr>
<td class="text-leftalign" align="justify">Operations in Hive are<br />
translated into MapReduce jobs.</td>
<td class="text-leftalign" align="justify">Operations in HBase run<br />
in real time on the database.</td>
</tr>
<tr>
<td class="text-leftalign" align="justify">For big data applications that require complex<br />
and fine-grained processing, Hadoop MapReduce<br />
is the best choice.</td>
<td class="text-leftalign" align="justify">HBase should be used when the data model<br />
schema is sparse.</td>
</tr>
<tr>
<td class="text-leftalign" align="justify">Hive is used for data warehousing requirements<br />
where programmers do not want to<br />
write complex MapReduce code.</td>
<td class="text-leftalign" align="justify">HBase is an ideal big data solution if the<br />
application requires random read or random<br />
write operations or both.</td>
</tr>
<tr>
<td class="text-leftalign" align="justify">Hive supports UPDATE statements only on<br />
transactional (ACID) tables, from Hive 0.14 onward.</td>
<td class="text-leftalign" align="justify">HBase has no SQL-like query language; data is accessed through<br />
its shell commands or client API, which need to be learned.</td>
</tr>
<tr>
<td class="text-leftalign" align="justify"><b>Hive</b> does not provide interactive<br />
querying; it only runs batch processes on Hadoop.</td>
<td class="text-leftalign" align="justify">Apache <b>HBase</b> is a NoSQL key/value store that<br />
runs on top of HDFS.</td>
</tr>
<tr>
<td class="text-leftalign" align="justify">Hive queries have<br />
high latency.</td>
<td class="text-leftalign" align="justify">HBase does not have analytical capabilities.</td>
</tr>
<tr>
<td class="text-leftalign" align="justify">Hive is suited to analytical queries.</td>
<td class="text-leftalign" align="justify">HBase is suited to real-time querying.</td>
</tr>
<tr>
<td class="text-leftalign" align="justify">Hive is used for analytical querying of<br />
data collected over a period of time. Hive<br />
should not be used for real-time querying.</td>
<td class="text-leftalign" align="justify">HBase is well suited to real-time workloads; for example,<br />
Facebook has used it for messaging and real-time analytics.<br />
They may even be using it to count Facebook likes.</td>
</tr>
</tbody>
</table>
]]></content:encoded>
					
					<wfw:commentRss>https://www.wikitechy.com/interview-questions/hive/what-is-the-difference-between-hive-and-hbase/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>What is the Difference between apache hive and impala ?</title>
		<link>https://www.wikitechy.com/interview-questions/hive/what-is-the-difference-between-apache-hive-and-impala/</link>
					<comments>https://www.wikitechy.com/interview-questions/hive/what-is-the-difference-between-apache-hive-and-impala/#respond</comments>
		
		<dc:creator><![CDATA[Editor]]></dc:creator>
		<pubDate>Tue, 13 Jul 2021 22:18:19 +0000</pubDate>
				<category><![CDATA[Hive]]></category>
		<category><![CDATA[Accenture interview questions and answers]]></category>
		<category><![CDATA[Altimetrik India Pvt Ltd interview questions and answers]]></category>
		<category><![CDATA[ANI Technologies Pvt Ltd interview questions and answers]]></category>
		<category><![CDATA[Capgemini interview questions and answers]]></category>
		<category><![CDATA[CASTING NETWORKS INDIA PVT LIMITED interview questions and answers]]></category>
		<category><![CDATA[CGI Group Inc interview questions and answers]]></category>
		<category><![CDATA[Collabera Technologies interview questions and answers]]></category>
		<category><![CDATA[Dell International Services India Pvt Ltd interview questions and answers]]></category>
		<category><![CDATA[explain the difference between sql and apache hive.]]></category>
		<category><![CDATA[Flipkart interview questions and answers]]></category>
		<category><![CDATA[Genpact interview questions and answers]]></category>
		<category><![CDATA[hive query based interview questions]]></category>
		<category><![CDATA[hive scenario based interview questions]]></category>
		<category><![CDATA[hive vs impala vs spark]]></category>
		<category><![CDATA[how will you optimize hive performance]]></category>
		<category><![CDATA[IBM interview questions and answers]]></category>
		<category><![CDATA[impala vs hive performance]]></category>
		<category><![CDATA[impala vs hive vs pig]]></category>
		<category><![CDATA[Impetus Technologies interview questions and answers]]></category>
		<category><![CDATA[Indiabulls Technology Solutions Ltd interview questions and answers]]></category>
		<category><![CDATA[Mindtree interview questions and answers]]></category>
		<category><![CDATA[NetApp interview questions and answers]]></category>
		<category><![CDATA[pig interview questions]]></category>
		<category><![CDATA[Prokarma Softech Pvt Ltd interview questions and answers]]></category>
		<category><![CDATA[R Systems interview questions and answers]]></category>
		<category><![CDATA[Reliance Industries Ltd interview questions and answers]]></category>
		<category><![CDATA[sqoop interview questions]]></category>
		<category><![CDATA[Synechron Te interview questions and answers]]></category>
		<category><![CDATA[Tata Consultancy Service interview questions and answers]]></category>
		<category><![CDATA[Tech Mahindra interview questions and answers]]></category>
		<category><![CDATA[Trigent Software interview questions and answers]]></category>
		<category><![CDATA[UnitedHealth Group interview questions and answers]]></category>
		<category><![CDATA[Virtusa Consulting Services Pvt Ltd interview questions and answers]]></category>
		<category><![CDATA[Wells Fargo interview questions and answers]]></category>
		<category><![CDATA[what is difference between hive and impala ?]]></category>
		<category><![CDATA[which version of hadoop introduced yarn]]></category>
		<category><![CDATA[why impala is faster than hive]]></category>
		<category><![CDATA[Wipro Infotech interview questions and answers]]></category>
		<category><![CDATA[Wipro interview questions and answers]]></category>
		<category><![CDATA[Yash Technologies interview questions and answers]]></category>
		<category><![CDATA[Yodlee Infotech Pvt Ltd interview questions and answers]]></category>
		<guid isPermaLink="false">https://www.wikitechy.com/interview-questions/?p=598</guid>

					<description><![CDATA[Answer : Hive generates query expressions at compile time...]]></description>
										<content:encoded><![CDATA[<h2 id="apache-hive-vs-impala" class="color-green">Apache Hive vs Impala</h2>
<table class="table-bordered table-striped table table-responsive">
<tbody>
<tr>
<th>Apache Hive</th>
<th>Impala</th>
</tr>
<tr>
<td class="text-leftalign">Hive generates query expressions at compile<br />
time; Hive is batch based, running on Hadoop MapReduce.</td>
<td class="text-leftalign">Impala does not support complex types<br />
or fault tolerance.</td>
</tr>
<tr>
<td class="text-leftalign">Apache Hive does not generate runtime code<br />
for “big loops” using LLVM.</td>
<td class="text-leftalign">Impala generates runtime code<br />
for “big loops” using LLVM.</td>
</tr>
<tr>
<td class="text-leftalign">Hadoop 2.7.3</td>
<td class="text-leftalign">Hadoop 2.6.0</td>
</tr>
<tr>
<td class="text-leftalign">All queries run through LLAP</td>
<td class="text-leftalign">Runtime Filtering Optimization Enabled</td>
</tr>
<tr>
<td class="text-leftalign">ORCFile format with zlib compression</td>
<td class="text-leftalign">Parquet format with snappy compression</td>
</tr>
<tr>
<td class="text-leftalign">Every Hive query suffers from a “cold start” problem.</td>
<td class="text-leftalign">Impala avoids startup overhead because its daemon<br />
processes are started at boot time and are<br />
always ready to process a query.</td>
</tr>
<tr>
<td class="text-leftalign">Apache <b>Hive</b> might not be ideal for interactive computing</td>
<td class="text-leftalign">Impala is meant for interactive computing.</td>
</tr>
<tr>
<td class="text-leftalign"><b>Hive</b> is batch based Hadoop MapReduce.</td>
<td class="text-leftalign">Impala is more like an MPP database.</td>
</tr>
<tr>
<td class="text-leftalign"><b>Hive</b> supports complex types.</td>
<td class="text-leftalign"><b>Impala</b> does not support complex types.</td>
</tr>
<tr>
<td class="text-leftalign">Apache Hive is fault tolerant.</td>
<td class="text-leftalign">Impala does not support fault tolerance.</td>
</tr>
<tr>
<td class="text-leftalign">Hive is the more universal, versatile and pluggable of the two.</td>
<td class="text-leftalign">Impala is used to unleash brute processing power<br />
and give lightning-fast analytic results.</td>
</tr>
</tbody>
</table>
]]></content:encoded>
					
					<wfw:commentRss>https://www.wikitechy.com/interview-questions/hive/what-is-the-difference-between-apache-hive-and-impala/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>What is a tool for tuning hive queries ?</title>
		<link>https://www.wikitechy.com/interview-questions/hive/what-is-a-tool-for-tuning-hive-queries/</link>
					<comments>https://www.wikitechy.com/interview-questions/hive/what-is-a-tool-for-tuning-hive-queries/#respond</comments>
		
		<dc:creator><![CDATA[Editor]]></dc:creator>
		<pubDate>Tue, 13 Jul 2021 21:50:34 +0000</pubDate>
				<category><![CDATA[Hive]]></category>
		<category><![CDATA[Accenture interview questions and answers]]></category>
		<category><![CDATA[Altimetrik India Pvt Ltd interview questions and answers]]></category>
		<category><![CDATA[ANI Technologies Pvt Ltd interview questions and answers]]></category>
		<category><![CDATA[Capgemini interview questions and answers]]></category>
		<category><![CDATA[CASTING NETWORKS INDIA PVT LIMITED interview questions and answers]]></category>
		<category><![CDATA[CGI Group Inc interview questions and answers]]></category>
		<category><![CDATA[Collabera Technologies interview questions and answers]]></category>
		<category><![CDATA[cost based query optimization in hive]]></category>
		<category><![CDATA[Dell International Services India Pvt Ltd interview questions and answers]]></category>
		<category><![CDATA[Flipkart interview questions and answers]]></category>
		<category><![CDATA[Genpact interview questions and answers]]></category>
		<category><![CDATA[hive performance tuning hortonworks]]></category>
		<category><![CDATA[hive performance tuning techniques]]></category>
		<category><![CDATA[hive query based interview questions]]></category>
		<category><![CDATA[hive query optimization parameters]]></category>
		<category><![CDATA[hive query optimization techniques]]></category>
		<category><![CDATA[hive scenario based interview questions]]></category>
		<category><![CDATA[how will you optimize hive performance]]></category>
		<category><![CDATA[IBM interview questions and answers]]></category>
		<category><![CDATA[Impetus Technologies interview questions and answers]]></category>
		<category><![CDATA[Indiabulls Technology Solutions Ltd interview questions and answers]]></category>
		<category><![CDATA[Mindtree interview questions and answers]]></category>
		<category><![CDATA[NetApp interview questions and answers]]></category>
		<category><![CDATA[pig interview questions]]></category>
		<category><![CDATA[Prokarma Softech Pvt Ltd interview questions and answers]]></category>
		<category><![CDATA[R Systems interview questions and answers]]></category>
		<category><![CDATA[Reliance Industries Ltd interview questions and answers]]></category>
		<category><![CDATA[Synechron Te interview questions and answers]]></category>
		<category><![CDATA[Tata Consultancy Service interview questions and answers]]></category>
		<category><![CDATA[Tech Mahindra interview questions and answers]]></category>
		<category><![CDATA[Trigent Software interview questions and answers]]></category>
		<category><![CDATA[UnitedHealth Group interview questions and answers]]></category>
		<category><![CDATA[Virtusa Consulting Services Pvt Ltd interview questions and answers]]></category>
		<category><![CDATA[Wells Fargo interview questions and answers]]></category>
		<category><![CDATA[will the reducer work or not if you use “limit 1” in any hiveql query ?]]></category>
		<category><![CDATA[Wipro Infotech interview questions and answers]]></category>
		<category><![CDATA[Wipro interview questions and answers]]></category>
		<category><![CDATA[Yash Technologies interview questions and answers]]></category>
		<category><![CDATA[Yodlee Infotech Pvt Ltd interview questions and answers]]></category>
		<guid isPermaLink="false">https://www.wikitechy.com/interview-questions/?p=576</guid>

					<description><![CDATA[Answer : By applying compression at various phases (i.e. on intermediate data and on final output), we achieve a performance improvement in Hive queries.]]></description>
										<content:encoded><![CDATA[<div class="TextHeading">
<div class="hddn">
<h2 id="tool-for-tuning-hive-queries" class="color-green" style="text-align: justify;">Tool for tuning hive queries</h2>
</div>
</div>
<div class="ImageContent" style="text-align: justify;">
<div class="hddn"><img decoding="async" class="alignnone size-medium aligncenter" src="https://cdn.wikitechy.com/interview-questions/hive/what-is-a-tool-for-tuning-hive-queries.png" alt="tool for tuning hive queries" width="732" height="538" /></div>
</div>
<div class="TextHeading" style="text-align: justify;">
<div class="hddn">
<h2 id="1-enable-compression-in-hive" class="color-green">1. Enable Compression in Hive</h2>
</div>
</div>
<div class="Content" style="text-align: justify;">
<div class="hddn">
<ul>
<li>By applying compression at various phases (i.e. on intermediate data and on final output), we achieve a performance improvement in Hive queries.</li>
</ul>
</div>
</div>
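<div class="Content" style="text-align: justify;">
<div class="hddn">
<p>A minimal sketch of what enabling compression at both phases might look like in a Hive session. The two <code>hive.exec.compress.*</code> properties are standard Hive settings; the choice of Snappy as the codec is an assumption made for illustration, and the right codec depends on the cluster and file format.</p>
<pre><code>-- Compress the data passed between the stages of a query.
SET hive.exec.compress.intermediate=true;

-- Compress the final output written by the query.
SET hive.exec.compress.output=true;

-- Assumed codec choice (illustrative); requires Snappy on the cluster.
SET mapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.SnappyCodec;
SET mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec;
</code></pre>
</div>
</div>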
<div class="text-center row" style="text-align: justify;"></div>
<div class="TextHeading" style="text-align: justify;">
<div class="hddn">
<h2 id="2-optimize-joins" class="color-green">2. Optimize Joins</h2>
</div>
</div>
<p style="text-align: justify;">We can improve the performance of joins by enabling Auto Convert Map Joins and enabling optimization of skew joins.</p>
<div class="Content" style="text-align: justify;">
<div class="hddn">
<ol>
<li>Auto Map Join</li>
<li>Skew Joins</li>
<li>Enable Bucketed Map Joins</li>
</ol>
</div>
</div>
<div class="Content" style="text-align: justify;">
<div class="hddn">
<h2 id="auto-map-join">Auto Map Join:</h2>
<ul>
<li style="list-style-type: none;">
<ul>
<li>Auto Map Join is a useful feature when joining a big table with a small table.</li>
<li>If we enable this feature, the small table is saved in the local cache on each node and then joined with the big table in the Map phase.</li>
<li>Enabling Auto Map Join provides two advantages.</li>
<li>First, loading the small table into the cache saves read time on each data node.</li>
<li>Second, it avoids skew joins in the Hive query, since the join operation has already been done in the Map phase for each block of data.</li>
</ul>
</li>
</ul>
</div>
</div>
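<div class="Content" style="text-align: justify;">
<div class="hddn">
<p>The Auto Map Join behaviour can be sketched as follows. The table names <code>orders</code> (large) and <code>countries</code> (small) and their columns are hypothetical, and the size threshold shown is simply the commonly documented default, not a tuned value.</p>
<pre><code>-- Let Hive convert a common join into a map join automatically
-- when one side is small enough.
SET hive.auto.convert.join=true;

-- Size in bytes below which a table counts as "small"
-- (about 25 MB; shown here only as an example value).
SET hive.mapjoin.smalltable.filesize=25000000;

-- Hypothetical tables: the small countries table is cached on each
-- node and joined with orders in the Map phase.
SELECT o.order_id, c.country_name
FROM orders o
JOIN countries c ON o.country_code = c.country_code;
</code></pre>
</div>
</div>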
<div class="Content" style="text-align: justify;">
<div class="hddn">
<h2 id="skew-joins">Skew joins:</h2>
<ul>
<li style="list-style-type: none;">
<ul>
<li>We enable skew join optimization by setting the hive.optimize.skewjoin property to true, either with the SET command in the Hive shell or in the hive-site.xml file.</li>
</ul>
</li>
</ul>
</div>
</div>
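<div class="Content" style="text-align: justify;">
<div class="hddn">
<p>In the Hive shell, the skew join setting might be applied as in the sketch below. The threshold value is the commonly documented default and is given only as an example; a suitable value depends on how skewed the join keys actually are.</p>
<pre><code>-- Turn on skew join optimization for the current session.
SET hive.optimize.skewjoin=true;

-- A join key with more rows than this is treated as skewed
-- (example value, not a recommendation).
SET hive.skewjoin.key=100000;
</code></pre>
</div>
</div>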
<div class="Content" style="text-align: justify;">
<div class="hddn">
<h2 id="enable-bucketed-map-joins">Enable Bucketed Map Joins</h2>
<ul>
<li style="list-style-type: none;">
<ul>
<li>When the tables used in a join are bucketed on the join column, a bucketed map join can be used to improve performance.</li>
</ul>
</li>
</ul>
</div>
</div>
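<div class="Content" style="text-align: justify;">
<div class="hddn">
<p>A hedged sketch of a bucketed map join setup. The tables, columns and bucket counts are hypothetical; the point illustrated is that both tables are bucketed on the join column and that one bucket count is a multiple of the other, which is what allows Hive to join matching buckets in the Map phase.</p>
<pre><code>-- Allow Hive to use a bucketed map join where the tables qualify.
SET hive.optimize.bucketmapjoin=true;

-- Hypothetical tables, both bucketed on the join column customer_id.
CREATE TABLE orders_bucketed (order_id BIGINT, customer_id BIGINT, amount DOUBLE)
CLUSTERED BY (customer_id) INTO 32 BUCKETS;

CREATE TABLE customers_bucketed (customer_id BIGINT, customer_name STRING)
CLUSTERED BY (customer_id) INTO 16 BUCKETS;

-- Only the matching buckets of the two tables need to be joined.
SELECT c.customer_name, o.amount
FROM orders_bucketed o
JOIN customers_bucketed c ON o.customer_id = c.customer_id;
</code></pre>
</div>
</div>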
<div class="TextHeading" style="text-align: justify;">
<div class="hddn">
<h2 id="3-enable-parallel-execution" class="color-green">3. Enable Parallel Execution</h2>
</div>
</div>
<div class="Content" style="text-align: justify;">
<div class="hddn">
<ul>
<li>Hive converts a query into one or more stages, such as a MapReduce stage, a sampling stage, a merge stage and a limit stage.</li>
<li>By default, Hive executes these stages one at a time.</li>
<li>A particular job may consist of some stages that are not dependent on each other and could be executed in parallel, possibly allowing the overall job to complete more quickly.</li>
</ul>
</div>
</div>
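<div class="Content" style="text-align: justify;">
<div class="hddn">
<p>Parallel execution of independent stages is switched on with a session setting, sketched below. The thread count shown is the commonly documented default and is only an example; a higher value uses more cluster resources at once.</p>
<pre><code>-- Run stages that do not depend on each other in parallel.
SET hive.exec.parallel=true;

-- Maximum number of stages executed at the same time (example value).
SET hive.exec.parallel.thread.number=8;
</code></pre>
</div>
</div>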
<div class="TextHeading" style="text-align: justify;">
<div class="hddn">
<h2 id="4-single-reduce-for-multi-group-by" class="color-green">4. Single Reduce for Multi Group BY</h2>
</div>
</div>
<div class="Content" style="text-align: justify;">
<div class="hddn">
<ul>
<li>With this setting a single reducer handles multiple operations: it combines multiple GROUP BY operations in a query into a single MapReduce job.</li>
</ul>
</div>
</div>
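<div class="Content" style="text-align: justify;">
<div class="hddn">
<p>As an illustration, the sketch below uses a multi-insert query whose GROUP BY clauses share the same key, which is the case this optimization targets. The source table <code>sales</code> and the two target tables are hypothetical and are assumed to exist already.</p>
<pre><code>-- Plan GROUP BY operations with a common key as a single MapReduce job.
SET hive.multigroupby.singlereducer=true;

-- Hypothetical tables: both aggregations group by region,
-- so they can share one job instead of running two.
FROM sales
INSERT OVERWRITE TABLE sales_total_by_region
  SELECT region, SUM(amount) GROUP BY region
INSERT OVERWRITE TABLE sales_count_by_region
  SELECT region, COUNT(order_id) GROUP BY region;
</code></pre>
</div>
</div>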
<div class="TextHeading" style="text-align: justify;">
<div class="hddn">
<h2 id="5-enable-vectorization" class="color-green">5. Enable Vectorization</h2>
</div>
</div>
<div class="Content" style="text-align: justify;">
<div class="hddn">
<ul>
<li>Vectorization was first introduced in the Hive 0.13 release.</li>
<li>It improves operations such as scans, aggregations, filters and joins by processing them in batches of 1024 rows at a time instead of one row at a time.</li>
</ul>
</div>
</div>
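<div class="Content" style="text-align: justify;">
<div class="hddn">
<p>Vectorized execution is enabled per session, as in the sketch below. The table <code>page_views</code> is hypothetical. It is stored as ORC here because early Hive releases applied vectorization only to ORC data; whether other formats qualify depends on the Hive version in use.</p>
<pre><code>-- Process rows in batches on the map side.
SET hive.vectorized.execution.enabled=true;

-- Also vectorize the reduce side where the execution engine supports it.
SET hive.vectorized.execution.reduce.enabled=true;

-- Hypothetical ORC table used to benefit from vectorized scans.
CREATE TABLE page_views (user_id BIGINT, url STRING, view_time TIMESTAMP)
STORED AS ORC;

SELECT url, COUNT(user_id) FROM page_views GROUP BY url;
</code></pre>
</div>
</div>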
<div class="TextHeading" style="text-align: justify;">
<div class="hddn">
<h2 id="6-enable-cost-based-optimization" class="color-green">6. Enable Cost Based Optimization</h2>
</div>
</div>
<div class="Content">
<div class="hddn">
<ul>
<li style="text-align: justify;">Cost-based optimization (CBO) chooses an execution plan based on the estimated cost of the query, which can lead to different decisions: how to order joins, which type of join to perform, and the degree of parallelism.</li>
</ul>
</div>
</div>
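<div class="Content" style="text-align: justify;">
<div class="hddn">
<p>A minimal sketch of turning on cost-based optimization. The optimizer needs table and column statistics to estimate cost, so the example also gathers them for a hypothetical table named <code>orders</code>; the table name is an assumption made for illustration.</p>
<pre><code>-- Enable the cost-based optimizer.
SET hive.cbo.enable=true;

-- Let Hive use stored statistics when planning and answering queries.
SET hive.compute.query.using.stats=true;
SET hive.stats.fetch.column.stats=true;

-- Collect table-level and column-level statistics (hypothetical table).
ANALYZE TABLE orders COMPUTE STATISTICS;
ANALYZE TABLE orders COMPUTE STATISTICS FOR COLUMNS;
</code></pre>
</div>
</div>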
]]></content:encoded>
					
					<wfw:commentRss>https://www.wikitechy.com/interview-questions/hive/what-is-a-tool-for-tuning-hive-queries/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>
