[Solved-1 Solutions] Sort related bag in Pig?



Bag

  • A bag is a collection of tuples.
  • To understand a bag, we first need to understand tuples and fields.

Field

  • A field is a piece of data.

Tuple

  • A tuple is an ordered set of fields.
  • For example, assume we have product data, with each product's name and price stored in a file, and we load it into Pig using the script below. Each loaded record, such as (TV,100), is a tuple with two fields.

Pig Script:

A = LOAD 'product' USING PigStorage() AS (name:chararray, price:int);
DUMP A;
(TV,100)
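
The script above yields only flat tuples. As a minimal sketch of how a bag arises, the snippet below groups those tuples; the alias G is an assumption introduced for illustration and is not part of the original script.

```pig
-- Sketch only: the same LOAD as above, repeated so the snippet is self-contained.
A = LOAD 'product' USING PigStorage() AS (name:chararray, price:int);

-- GROUP ... ALL collects every tuple of A into a single bag.
-- G is a hypothetical alias used only in this example.
G = GROUP A ALL;

-- DESCRIBE prints the schema of G; its second field is a bag
-- holding the (name, price) tuples of A.
DESCRIBE G;
```

Each tuple of G then contains a bag as one of its fields, which is the same shape as the relation discussed in the problem.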

Problem:

Suppose a Pig script has generated the following relation:

A: {x: chararray,B: {(y: chararray,z: int)}}

We need to sort the tuples in each bag B by z in descending order, but the script below fails with this error:

Syntax error, unexpected symbol at or near z

output = foreach A{
    sorted = order B by z DSC;
    generate x,sorted;
}

Solution 1:

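  • Use DESC instead of DSC. DSC is not a Pig keyword; ORDER BY accepts only ASC or DESC.
  • With that fix, the nested ORDER inside FOREACH sorts the bag in each tuple in descending order of z:
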
output = foreach A{
    sorted = order B by z DESC;
    generate x,sorted;
}
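
To try the fix end to end, the relation A can be rebuilt from flat data. The following is a sketch under stated assumptions: the input file 'items' and the aliases raw, grouped and result are hypothetical, and the input is assumed to be tab-separated with fields x, y and z. The final alias is named result rather than output because output is listed among Pig's reserved keywords.

```pig
-- Hypothetical input: tab-separated records with fields x, y, z.
raw = LOAD 'items' USING PigStorage() AS (x:chararray, y:chararray, z:int);

-- Group by x, then keep only y and z inside the bag so that the
-- schema has the same shape as A in the problem statement.
grouped = GROUP raw BY x;
A = FOREACH grouped GENERATE group AS x, raw.(y, z) AS B;

-- Nested FOREACH: sort each bag B by z, largest value first.
result = FOREACH A {
    sorted = ORDER B BY z DESC;
    GENERATE x, sorted;
}

DUMP result;
```

Running DESCRIBE A before the nested FOREACH is a quick way to confirm that the bag and its fields are named as the ORDER statement expects.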
