[Solved-1 Solution] Percentile calculation in Pig Latin ?



Problem:

How can we calculate percentiles in Pig Latin?

Solution 1:

  • We can use the StreamingQuantile UDF from the Apache DataFu library.

The script below computes the 0th, 50th, and 100th percentiles (the minimum, median, and maximum) of value for each item.

Input

item1,234
item1,324
item1,769
item2,23
item2,23
item2,45

Pig Script

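-- Make the DataFu UDFs available to the script.
register datafu-1.2.0.jar;
-- Request the 0.0, 0.5 and 1.0 quantiles (minimum, median, maximum).
define Quantile datafu.pig.stats.StreamingQuantile('0.0','0.5','1.0');
-- Load the comma-separated input as (item, value) pairs.
data = load 'data' using PigStorage(',') as (item:chararray, value:int);
-- Group by item and compute the quantiles of value within each group.
quantiles = FOREACH (GROUP data by item) GENERATE group, Quantile(data.value);
dump quantiles;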

Output

(item1,(234.0,324.0,769.0))
(item2,(23.0,23.0,45.0))
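
The same pattern extends to other percentiles by changing the quantile arguments. The script below is a minimal sketch, assuming the same datafu-1.2.0.jar and the same input file as above; the alias names ApproxPercentiles and ExactPercentiles and the chosen quantile values are illustrative only. StreamingQuantile returns approximate quantiles and does not need sorted input, while DataFu's Quantile UDF returns exact quantiles but expects the bag to be sorted first. No expected output is shown here, since the approximate values can depend on the data.

```pig
register datafu-1.2.0.jar;

-- Approximate percentiles: the input bag does not need to be sorted.
define ApproxPercentiles datafu.pig.stats.StreamingQuantile('0.25','0.5','0.9','0.95');
-- Exact percentiles: the input bag must be sorted before it is passed in.
define ExactPercentiles datafu.pig.stats.Quantile('0.25','0.5','0.9','0.95');

data = load 'data' using PigStorage(',') as (item:chararray, value:int);

-- 25th, 50th, 90th and 95th percentiles per item (approximate).
approx = FOREACH (GROUP data by item) GENERATE group, ApproxPercentiles(data.value);

-- Same percentiles per item (exact), sorting each group inside a nested FOREACH.
exact = FOREACH (GROUP data by item) {
    sorted = ORDER data BY value;
    GENERATE group, ExactPercentiles(sorted.value);
};

dump approx;
dump exact;
```

The streaming version is usually the better fit for large groups because it avoids sorting each bag. The exact version is reasonable when groups are small enough to sort.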
