[Solved-1 Solution] Pig Order By Query ?



What is Order By ?

  • The ORDER BY operator is used to display the contents of a relation in a sorted order based on one or more fields.

Syntax

  • Given below is the syntax of the ORDER BY operator.
grunt> Relation_name2 = ORDER Relation_name1 BY field_name (ASC|DESC);
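
  • As a minimal sketch of the syntax in use, assuming a hypothetical comma-separated file student_data.txt with fields id, name and age (none of these names come from the original question), a relation can be sorted on more than one field like this:

```pig
-- Hypothetical input file and schema, for illustration only
students = LOAD 'student_data.txt' USING PigStorage(',')
           AS (id:int, name:chararray, age:int);
-- Oldest first; ties on age are broken alphabetically by name
by_age = ORDER students BY age DESC, name ASC;
DUMP by_age;
```

  • Declaring age as int in the schema lets Pig compare the values as numbers; if no direction is given, ORDER BY sorts ascending.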

Problem:

grunt> dump jn;

(k1,k4,10)
(k1,k5,15)
(k2,k4,9)
(k3,k4,16)

grunt> jn = group jn by $1;
grunt> dump jn;
  • After grouping, we want the following output:
(k4,{(k3,k4,16),(k1,k4,10)})
(k5,{(k1,k5,15)})
  • Basically, within each group we want to sort the tuples on the numeric third field (10, 9, 16 for k4) in descending order and keep only the top 2.

How can we do it?

Solution 1:

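  • We can use a nested FOREACH with ORDER BY and LIMIT, as in the code below:
-- Schema is an assumption: the field names are illustrative; the third field is typed int so it sorts by value
A = LOAD 'data' AS (f1:chararray, f2:chararray, n:int);
jn = GROUP A BY $1;
B = FOREACH jn {
  -- DESC puts the largest values first, so LIMIT 2 keeps the top 2
  sorted = ORDER A BY $2 DESC;
  lim = LIMIT sorted 2;
  -- emit the group key along with the limited bag to match the desired output
  GENERATE group, lim;
};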
