[Solved-2 Solutions] How to access an array element in Pig?



What is an array?

  • An array is a data structure that contains a group of elements. Typically these elements are all of the same data type, such as an integer or string.
  • Arrays are commonly used in computer programs to organize data so that a related set of values can be easily sorted or searched.
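
Pig itself has no dedicated array type; its closest built-in complex types are the tuple (an ordered set of fields) and the bag (a collection of tuples). As a minimal sketch, assuming a hypothetical tab-delimited file 'input_arrays' whose second column holds a comma-separated list, a tuple field can be addressed by zero-based position with the $n notation:

```pig
-- Hypothetical input (an assumption, not from the original question):
-- one record per line, e.g. 1<TAB>10,20,30
A = LOAD 'input_arrays' USING PigStorage('\t') AS (id:int, vals:chararray);
-- STRSPLIT splits the string on the regex ',' and returns a tuple
B = FOREACH A GENERATE id, STRSPLIT(vals, ',') AS parts;
-- Positions are zero-based, so parts.$1 refers to the second element
C = FOREACH B GENERATE id, parts.$1 AS second_elem;
DUMP C;
```

Bags, unlike tuples, are unordered, so picking an element by rank from a bag requires an explicit ORDER first, which is what the solutions that follow do.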

Problem:

How to access an array element in Pig?

Solution 1:

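  • Pig Latin is a procedural data-flow scripting language rather than a declarative relational one like SQL, so it is well suited to working on groups with operators nested inside a FOREACH block.
  • An element at a given position within each group can be reached by combining nested ORDER and LIMIT operators inside a FOREACH statement. The example below returns, for each id, the v2 value of the row with the second-highest v1.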

Example:

A = LOAD 'input' USING PigStorage(',') AS (id:int, v1:float, v2:float);
B = GROUP A BY id; -- isolate all rows for the same id
C = FOREACH B { -- here comes the scripting bit
    elems = ORDER A BY v1 DESC; -- sort rows belonging to the id
    two = LIMIT elems 2; -- select top 2
    two_invers = ORDER two BY v1 ASC; -- sort in opposite order to bubble second value to the top
    second = LIMIT two_invers 1;
    GENERATE FLATTEN(group) as id, FLATTEN(second.v2);
};
DUMP C;

Solution 2:

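The following script sorts all rows by id and, within each id, by v1 in descending order. It does not isolate a single element, but the elements of each id can then be read off by position in the output; for example, the second row for an id holds its second-highest v1.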

A = LOAD 'input' USING PigStorage(',') AS (id:int, v1:int, v2:int);
B = ORDER A BY id ASC, v1 DESC;
C = FOREACH B GENERATE id, v2;
DUMP C;
