Apache Pig TOBAG()



What is TOBAG() in Apache Pig?

  • The TOBAG() function of Pig Latin converts each of one or more expressions into its own single-field tuple.
  • These tuples are then placed together in one bag, so each input record yields one bag.

Syntax

TOBAG(expression [, expression ...])
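
The syntax above can be illustrated with a minimal sketch. The file name emp.txt, the relation name emp and the output aliases are hypothetical and chosen only for illustration; the sketch assumes a comma-separated input with an id, a name and a city on each line.

```pig
-- Hypothetical input file: one "id,name,city" record per line
emp = LOAD 'emp.txt' USING PigStorage(',')
      AS (id:int, name:chararray, city:chararray);

-- One expression: each record yields a bag holding a single one-field tuple
single_bag = FOREACH emp GENERATE TOBAG(name);

-- Several expressions: one tuple per expression, all placed in the same bag
pair_bag = FOREACH emp GENERATE id, TOBAG(name, city) AS name_city;

-- An expression need not be a plain field; here UPPER(name) is evaluated first
upper_bag = FOREACH emp GENERATE TOBAG(UPPER(name), city);
```

Keeping the arguments of one TOBAG() call to the same type (here chararray) is a design choice in this sketch, since it keeps the schema of the resulting bag easy to work with in later statements.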

Example

  • Assume we have a file named wikitechy_emp_details.txt in the HDFS directory /pig_data/, with the following content.

wikitechy_emp_details.txt 

111,Anu,22,newyork
112,Bastin,23,Kolkata
113,Martin,23,Tokyo 
114,Sam,25,London 
115,David,23,Bhuwaneshwar 
116,Vincent,22,Chennai
  • You have loaded this file into Pig with the relation name wikitechy_emp_data as given below.
grunt> wikitechy_emp_data = LOAD 'hdfs://localhost:9000/pig_data/wikitechy_emp_details.txt' USING PigStorage(',')
   as (id:int, name:chararray, age:int, city:chararray);
  • Now convert the id, name, age and city of each employee (record) into individual tuples and place them in a bag, as given below.
grunt> tobag = FOREACH wikitechy_emp_data GENERATE TOBAG(id, name, age, city);

Verification

  • You can verify the contents of the tobag relation using the Dump operator as given below.
grunt> DUMP tobag;
  
({(111),(Anu),(22),(newyork)}) 
({(112),(Bastin),(23),(Kolkata)}) 
({(113),(Martin),(23),(Tokyo)}) 
({(114),(Sam),(25),(London)}) 
({(115),(David),(23),(Bhuwaneshwar)}) 
({(116),(Vincent),(22),(Chennai)})
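
A bag built with TOBAG() is often passed to FLATTEN to turn several columns into several rows. The following is a minimal sketch of that pattern, assuming the wikitechy_emp_data relation loaded above; the alias emp_values and the field name field_value are hypothetical. Only the two chararray fields are used so that every tuple in the bag has the same type.

```pig
-- Assumes wikitechy_emp_data was loaded as shown in the example above.
-- TOBAG builds the bag {(name),(city)} for each record;
-- FLATTEN then emits one output row per tuple in that bag.
emp_values = FOREACH wikitechy_emp_data
             GENERATE id, FLATTEN(TOBAG(name, city)) AS (field_value);

DUMP emp_values;
-- The rows for employee 111 should resemble:
-- (111,Anu)
-- (111,newyork)
```

Each employee therefore contributes two rows to emp_values, one for the name and one for the city, with the id repeated on both.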
