Apache Pig SUM() Function




What is the SUM() Function in Apache Pig?

  • The SUM() function in Apache Pig computes the total of the numeric values in a single-column bag.
  • SUM() ignores NULL values while computing the total.
  • SUM() requires a preceding GROUP ALL statement for global sums and a GROUP BY statement for group sums.
  • SUM() works on numeric fields (int, long, float, double); a bytearray field is cast to double.
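
The NULL handling described above can be sketched with a small script. The file name pages_with_gaps.txt and the relation names below are hypothetical and are not part of this tutorial's data; the sketch assumes that PigStorage loads an empty field as NULL.

```pig
-- Hypothetical input file 'pages_with_gaps.txt' (an assumption for illustration):
--   1,250
--   2,
--   3,170
-- The empty second field on row 2 is expected to load as NULL.
pages = LOAD 'pages_with_gaps.txt' USING PigStorage(',')
        AS (id:int, daily_typing_pages:int);

-- A global sum needs GROUP ALL first, so that all rows land in one bag.
pages_group = GROUP pages ALL;

-- SUM() skips the NULL value instead of turning the whole total into NULL.
pages_sum = FOREACH pages_group GENERATE SUM(pages.daily_typing_pages);

DUMP pages_sum;
```

Under these assumptions the row with the missing value does not contribute, so the total should be 250 + 170 = 420.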

Syntax

grunt> SUM(expression)

Example

wikitechy_employee.txt

1,Joseph,2007-01-24,250  
2,Ramesh,2007-05-27,220  
3,John,2007-05-06,170  
3,John,2007-04-06,100 
4,Phil,2007-04-06,220 
5,Sarah,2007-06-06,300
5,Sarah,2007-02-06,350
  • We load the file into Pig as the relation employee_data, as shown below:
grunt> employee_data = LOAD 'hdfs://localhost:9000/pig_data/wikitechy_employee.txt' USING PigStorage(',')
   as (id:int, name:chararray, workdate:chararray, daily_typing_pages:int);

Calculating the Sum of All Daily Typing Pages

  • We group the relation employee_data with the GROUP ALL operator and store the result in the relation employee_group, as shown below:
grunt> employee_group = Group employee_data all;
  • This produces a relation in which every record sits in a single bag, ready for calculating the sum of all daily typing pages:
grunt> Dump employee_group;
(all,{(5,Sarah,2007-02-06,350),
(5,Sarah,2007-06-06,300),
(4,Phil,2007-04-06,220),
(3,John,2007-04-06,100),
(3,John,2007-05-06,170),
(2,Ramesh,2007-05-27,220),
(1,Joseph,2007-01-24,250)})
  • Now we calculate the global sum of the pages typed daily. The parenthesized pair in the Generate clause also emits the bags of names and page counts alongside the total.
grunt> employee_workpages_sum = foreach employee_group Generate (employee_data.name, employee_data.daily_typing_pages), SUM(employee_data.daily_typing_pages);

Verification

grunt> Dump employee_workpages_sum;

Output

(({(Sarah),(Sarah),(Phil),(John),(John),(Ramesh),(Joseph)},
{(350),(300),(220),(100),(170),(220),(250)}),1610)
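
The example above covers the global sum only. The group sums mentioned earlier can be sketched along the same lines. This is a minimal sketch that assumes the employee_data relation has been loaded as shown in this example; the relation names employee_by_name, employee_pages_by_name and employee_total are hypothetical.

```pig
-- Assumes employee_data was loaded as shown earlier in this example.

-- Group sum: one bag of records per employee name.
employee_by_name = GROUP employee_data BY name;

-- 'group' holds the grouping key; SUM() totals the pages inside each bag.
employee_pages_by_name = FOREACH employee_by_name
    GENERATE group AS name,
             SUM(employee_data.daily_typing_pages) AS total_pages;

DUMP employee_pages_by_name;

-- Global sum without the extra bags of names and page counts.
employee_group = GROUP employee_data ALL;
employee_total = FOREACH employee_group
    GENERATE SUM(employee_data.daily_typing_pages) AS total_pages;

DUMP employee_total;
```

With the sample file, the per-name totals should be 250 for Joseph, 220 for Ramesh, 270 for John, 220 for Phil and 650 for Sarah, and the global total should again be 1610. The order of the grouped tuples in the output is not guaranteed.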
