Apache Pig BagToString() Function

What is the BagToString() Function in Apache Pig?

  • The BagToString() function concatenates the elements of a bag into a single string.
  • An optional delimiter can be placed between the values as they are concatenated.
  • Because it builds one string from many elements, BagToString() is similar to SQL's GROUP_CONCAT function.
  • Keep in mind that bags can be of arbitrary size, while strings in Java cannot: a very large bag will either exhaust the available memory or exceed the maximum number of characters in a string.
  • Bags are unordered, so the order of the values in the resulting string is not guaranteed unless the bag is explicitly sorted with the ORDER BY operator.


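Syntax

BagToString(vals:bag [, delimiter:chararray])

  • vals is the bag whose elements are concatenated.
  • delimiter is the optional chararray placed between the elements; it defaults to an underscore ('_').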
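
As a minimal sketch of the optional delimiter argument, the script below groups a relation and joins one of its columns with a comma. The file name scores.txt, the relation names, and the field names are assumptions made purely for illustration; they are not part of this tutorial's data.

```pig
-- Hypothetical input: one "name,score" pair per line.
scores = LOAD 'scores.txt' USING PigStorage(',') AS (name:chararray, score:int);

-- Collect every row of the relation into a single bag.
all_scores = GROUP scores ALL;

-- Project the score column out of the bag and join the values with ','.
-- Leaving out the second argument would use the default delimiter '_'.
score_list = FOREACH all_scores GENERATE BagToString(scores.score, ',');

DUMP score_list;
```

Projecting a single column (scores.score) before calling BagToString() keeps the other fields out of the resulting string.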



  • We load the file wikitechy_dateofbirth.txt into Pig as the relation dob, as shown below:
grunt> dob = LOAD 'hdfs://localhost:9000/pig_data/wikitechy_dateofbirth.txt' USING PigStorage(',')
   as (day:int, month:int, year:int);

Converting Bag to String

  • Using the BagToString() function, we can convert the data from a bag to a string.
  • BagToString() operates on a bag, so we first group the dob relation; the group operation produces a bag that contains all the tuples of the relation.
  • We group the relation dob with the Group All operator and store the result in the relation group_dob, as shown below:
grunt> group_dob = Group dob All;
  • We can verify the contents of group_dob with the Dump operator:
grunt> Dump group_dob; 
  • The output shows a single bag that holds all the dates of birth as tuples.
  • Now we convert that bag to a string by using the BagToString() function:
grunt> dob_string = foreach group_dob Generate BagToString(dob);


grunt> Dump dob_string;
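
Because the grouped bag is unordered, the order of the dates in the string above is not guaranteed. One way to make it deterministic is a nested ORDER BY inside FOREACH, sketched below on the assumption that dob and group_dob are defined as in the previous steps. The names dob_sorted_string and ordered, and the '-' delimiter, are introduced here only for illustration.

```pig
dob_sorted_string = FOREACH group_dob {
    -- Sort the tuples inside the bag before concatenating them.
    ordered = ORDER dob BY year, month, day;
    -- Join the fields with '-' instead of the default '_'.
    GENERATE BagToString(ordered, '-');
};

DUMP dob_sorted_string;
```

The sort runs inside each group, so it affects only the order in which BagToString() reads the tuples and leaves the dob relation itself unchanged.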


