Best Practices for Indexing HDFS Data into Solr Using Hive

Assuming the data lives in a partitioned Hive table, the right approach depends on how the data is typically updated, its volume, and your architecture. Common options:
  • Run a MapReduce job that indexes the data using SolrJ.
  • Build Lucene indexes in a MapReduce job and copy them to the appropriate shards.
  • Use the HBase Indexer to populate Solr.
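In the SolrJ/MapReduce approach above, each task typically turns a batch of Hive rows into Solr documents and posts them to Solr's /update handler. A minimal Python sketch of that row-to-document step (the column names are hypothetical, and a real job would send the payload with SolrJ or an HTTP client rather than just building it):

```python
import json

# Hypothetical column list for a partitioned Hive table of events;
# these names are illustrative, not from the original post.
HIVE_COLUMNS = ["event_id", "event_time", "user_id", "message"]

def rows_to_solr_docs(rows):
    """Convert Hive result rows (tuples) into Solr document dicts."""
    return [dict(zip(HIVE_COLUMNS, row)) for row in rows]

def solr_update_payload(rows):
    """Serialize documents as the JSON body for Solr's /update handler."""
    return json.dumps(rows_to_solr_docs(rows))

rows = [
    ("e1", "2023-01-01T00:00:00Z", "u42", "login"),
    ("e2", "2023-01-01T00:01:00Z", "u43", "logout"),
]
payload = solr_update_payload(rows)
docs = json.loads(payload)
print(len(docs))           # 2
print(docs[0]["user_id"])  # u42
```

In a MapReduce job, the serialized payload would be POSTed per batch to keep memory bounded, rather than accumulating all rows first.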

Size the Index Properly:

  • Understanding what to index typically requires deep business-domain expertise about the data.
  • That understanding yields a better indexing plan and more accurate search results.
  • Not all data needs to be indexed, but as an organization receives new data, it must be classified until it is understood what value it brings to the business.
  • This implies that data may need to be re-indexed later, so it is good practice to store the raw data somewhere low-cost, often HDFS or cloud object storage.
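The points above amount to a simple pattern: keep the full raw record in low-cost storage, and index only a curated whitelist of fields that the business has decided are worth searching. A small sketch (the field names and the whitelist are illustrative assumptions):

```python
# Only the fields the business has classified as search-worthy get indexed;
# the full raw record stays in HDFS/object storage for later re-indexing.
# INDEXED_FIELDS and all field names below are hypothetical.
INDEXED_FIELDS = {"doc_id", "title", "body"}

def to_index_doc(raw_record):
    """Project a raw record down to the whitelisted, indexable fields."""
    return {k: v for k, v in raw_record.items() if k in INDEXED_FIELDS}

raw = {
    "doc_id": "d1",
    "title": "Quarterly report",
    "body": "Revenue was flat.",
    "ingest_host": "node-07",  # operational detail, never searched
    "raw_bytes": 10482,        # kept only in raw storage
}
print(sorted(to_index_doc(raw)))  # ['body', 'doc_id', 'title']
```

If the business later decides another field matters, you extend the whitelist and re-index from the raw copy, which is exactly why keeping that copy cheap and complete pays off.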
