pig tutorial - apache pig tutorial - Apache Pig AVG() Function - pig latin - apache pig - pig hadoop



What is AVG() Function in Apache Pig ?

  • The AVG() function which is used in Apache Pig returns the average value of the numeric columns which is done in a bag.
  • The AVG() function is done to compute the average of the numerical values which is done within a bag.
  • The AVG() function will ignore the NULL values while we are calculating the average values
  • The AVG() Function computes the numeric values within a single-column bag.
  • The AVG() function will require a preceding GROUP ALL statement for global averages and the GROUP BY statement for the group averages. which is given in the average function
 Apache Pig AVG()

Learn Apache Pig - Apache Pig tutorial - Apache Pig AVG() - Apache Pig examples - Apache Pig programs



 Apache Pig AVG() Function

Learn Apache Pig - Apache Pig tutorial - Apache Pig AVG() Function - Apache Pig examples - Apache Pig programs

Syntax

grunt> AVG(expression) 

Example

wikitechy_employee_details.txt

001,Suresh,Reddy,21,9848022337,Hyderabad,89
002,fathima,Battacharya,22,9848022338,Kolkata,78
003,Ramesh,Khanna,22,9848022339,Delhi,90 
004,Preethi,Agarwal,21,9848022330,Pune,93 
005,Deepthi,Mohanthy,23,9848022336,Bhuwaneshwar,75 
006,Sruti,Mishra,23,9848022335,Chennai,87 
007,Kama,Nayak,24,9848022334,trivendram,83 
008,Vanitha,Nambiayar,24,9848022333,Chennai,72 
  • And hence we have loaded this file into Pig which is done with the relation name wikitechy_employee_details as shown below.
grunt> wikitechy_employee_details = LOAD 'hdfs://localhost:9000/pig_data/wikitechy_employee_details.txt' USING PigStorage(',')
   as (id:int, firstname:chararray, lastname:chararray, age:int, phone:chararray, city:chararray, gpa:int);

Calculating the Average GPA

  • Let us group the relation wikitechy_employee_details by using the Group All operator, and store the result in the relation which is called employee_group_all which is given below:
grunt> employee_group_all = Group wikitechy_employee_details All;
  • This statement which is given above will produce a relation for employee_group_all as shown below.
grunt> Dump employee_group_all;
   
(all,{(8,Vanitha,Nambiayar,24,9848022333,Chennai,72),
(7,Kama,Nayak,24,9848022334,trivendram,83),
(6,Sruti,Mishra,23,9848022335,Chennai,87),
(5,Deepthi,Mohanthy,23,9848022336,Bhuwaneshwar,75),
(4,Preethi,Agarwal,21,9848022330,Pune,93),
(3 ,Ramesh,Khanna,22,9848022339,Delhi,90),
(2,fathima,Battacharya,22,9848022338,Ko lkata,78),
(1,Suresh,Reddy,21,9848022337,Hyderabad,89)})
  • We will now calculate the global average GPA of all the employees by using the AVG() function which is given below:
grunt> employee_gpa_avg = foreach employee_group_all  Generate (wikitechy_employee_details.firstname,wikitechy_employee_details.gpa), AVG(wikitechy_employee_details.gpa); 

Verification:

grunt> Dump employee_gpa_avg; 

Output:

(({(Vanitha),(Kama),(Sruti),(Deepthi),(Preethi),(Ramesh),(fathima),(Suresh) }, 
  {   (72)   ,  (83) ,   (87)  ,   (75)  ,   (93)  ,  (90)  ,   (78)   ,  (89)  }),83.375) 

Related Searches to Apache Pig AVG() Function