pig tutorial - apache pig tutorial - Apache Pig - Top() - pig latin - apache pig - pig hadoop

The TOP() function of Pig Latin returns the top N tuples of a bag.
It takes three inputs: the number of tuples you need, the index of the column whose values are compared (0 denotes the first column), and the relation (bag) to select from.
It returns a bag containing the N tuples with the highest values in that column.

Syntax

grunt> TOP(topN,column,relation)

Example

Ensure we have a file named wikitechy_emp_details.txt in the HDFS directory /pig_data/, with the following content.

wikitechy_emp_details.txt

111,Anu,22,newyork 
112,Bastin,23,Kolkata 
113,Cimen,23,Tokyo 
114,Darathy,25,London 
115,Enba,23,Bhuwaneshwar 
116,Favin,22,Chennai 
117,Robert,22,newyork 
118,Syam,23,Kolkata 
119,Mary,25,Tokyo 
120,Vincent,25,London 
121,Preethi,25,Bhuwaneshwar 
122,Antony,22,Chennai

Load this file into Pig under the relation name emp_data as given below.

grunt> emp_data = LOAD 'hdfs://localhost:9000/pig_data/wikitechy_emp_details.txt' USING PigStorage(',')
   as (id:int, name:chararray, age:int, city:chararray);

Group the relation emp_data by age, and store it in the relation emp_group.

grunt> emp_group = Group emp_data BY age;

Now verify the relation emp_group using the Dump operator as given below.

grunt> Dump emp_group;
(22,{(122,Antony,22,Chennai),(117,Robert,22,newyork),(116,Favin,22,Chennai),(111,Anu,22,newyork)})
(23,{(118,Syam,23,Kolkata),(115,Enba,23,Bhuwaneshwar),(113,Cimen,23,Tokyo),(112,Bastin,23,Kolkata)})
(25,{(121,Preethi,25,Bhuwaneshwar),(120,Vincent,25,London),(119,Mary,25,Tokyo),(114,Darathy,25,London)})
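
To see why the bag passed to TOP() in the next step is called emp_data, it can help to inspect the schema of the grouped relation. This is a minimal sketch, assuming the emp_data and emp_group relations defined above. The exact formatting of the DESCRIBE output may vary between Pig versions, so it is not reproduced here.

```pig
-- Print the schema of the grouped relation.
-- Expect two fields: `group` (the age used as the grouping key)
-- and a bag named emp_data holding the original tuples for that age.
DESCRIBE emp_group;
```

The bag keeps the name of the relation that was grouped, which is why emp_data can be used inside a nested FOREACH over emp_group.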

Now, you can get the two records with the highest id in each group as given below.

grunt> data_top = FOREACH emp_group { 
   top = TOP(2, 0, emp_data); 
   GENERATE top; 
}

In this example we retrieve, for each group, the 2 tuples with the greatest id.
Because the comparison is based on id, we pass the index of the id column (0, since columns are counted from 0) as the second parameter of the TOP() function.
The third parameter, emp_data, is the bag of tuples inside each group.
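
Bags in Pig are unordered collections, so the order of tuples inside the bag returned by TOP() should not be relied on. If sorted output matters, one alternative is a nested ORDER followed by LIMIT. This is a sketch under the assumption that emp_group is defined as above. The aliases sorted, top2, and data_top_sorted are illustrative names, not part of the original example.

```pig
-- Alternative to TOP(): sort each group's bag by id, highest first,
-- then keep only the first two tuples of the sorted bag.
data_top_sorted = FOREACH emp_group {
   sorted = ORDER emp_data BY id DESC;   -- order tuples within the group
   top2 = LIMIT sorted 2;                -- keep the two highest ids
   GENERATE group AS age, top2;          -- keep the grouping key as well
}
```

This variant selects the same tuples as TOP(2, 0, emp_data) while making the intended order explicit.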

Verification

You can verify the contents of the data_top relation using the Dump operator as given below.

grunt> Dump data_top;
({(117,Robert,22,newyork),(122,Antony,22,Chennai)})
({(115,Enba,23,Bhuwaneshwar),(118,Syam,23,Kolkata)})
({(120,Vincent,25,London),(121,Preethi,25,Bhuwaneshwar)})
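
The result above contains only the bags, so the age each bag belongs to is no longer visible as a separate field. A possible variation, assuming the same emp_group relation, keeps the grouping key and flattens each bag into separate rows. The aliases data_top_flat and age_group are hypothetical names chosen for illustration.

```pig
-- Keep the grouping key and turn each bag of top tuples into individual rows.
data_top_flat = FOREACH emp_group {
   top = TOP(2, 0, emp_data);                 -- two tuples with the highest id
   GENERATE group AS age_group, FLATTEN(top); -- one output row per selected tuple
}

-- Inspect the flattened rows.
DUMP data_top_flat;
```

With the sample data this should yield six rows, two per age group, each starting with the age that the tuple was grouped under.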
