Apache Pig - IsEmpty() Function



What is the IsEmpty() function in Apache Pig?

  • The IsEmpty() function in Apache Pig checks whether a bag or map is empty.
  • It is most often used in a FILTER statement to keep or discard tuples.
  • It returns a Boolean value: true if the bag or map contains no elements, and false otherwise.
  • It gives meaningful results only for bag and map expressions.
  • The argument is an expression, most often a single field name.

Syntax

IsEmpty(expression)

Example

Assume that we have two files, wikitechy_employee_sales.txt and wikitechy_employee_bonus.txt, in the HDFS directory /pig_data/ with the following contents:

wikitechy_employee_sales.txt

1,Robin,22,25000,sales 
2,BOB,23,30000,sales 
3,Maya,23,25000,sales 
4,Sara,25,40000,sales 
5,David,23,45000,sales 
6,Maggy,22,35000,sales

wikitechy_employee_bonus.txt

1,Robin,22,25000,sales 
2,Jaya,23,20000,admin 
3,Maya,23,25000,sales 
4,Alia,25,50000,admin 
5,David,23,45000,sales 
6,Omar,30,30000,admin
  • We load these files into Pig as the relations employee_sales and employee_bonus.

employee_sales

grunt> employee_sales = LOAD 'hdfs://localhost:9000/pig_data/wikitechy_employee_sales.txt' USING PigStorage(',')
as (sno:int, name:chararray, age:int, salary:int, dept:chararray);

employee_bonus

grunt> employee_bonus = LOAD 'hdfs://localhost:9000/pig_data/wikitechy_employee_bonus.txt' USING PigStorage(',')
as (sno:int, name:chararray, age:int, salary:int, dept:chararray);

Next, group both relations by age using the COGROUP operator.

grunt> cogroup_data = COGROUP employee_sales by age, employee_bonus by age;

Verify the relation cogroup_data using the DUMP operator, as shown below.

grunt> Dump cogroup_data;
(22,{(6,Maggy,22,35000,sales),(1,Robin,22,25000,sales)},{(1,Robin,22,25000,sales)})
(23,{(5,David,23,45000,sales),(3,Maya,23,25000,sales),(2,BOB,23,30000,sales)},{(5,David,23,45000,sales),(3,Maya,23,25000,sales),(2,Jaya,23,20000,admin)})
(25,{(4,Sara,25,40000,sales)},{(4,Alia,25,50000,admin)})
(30,{},{(6,Omar,30,30000,admin)})
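
As an optional check of which bags are empty, the size of each bag can be listed with the built-in COUNT function. This is only a minimal sketch: the relation name bag_sizes and the field names sales_count and bonus_count are illustrative and are not part of the original example.

```pig
-- Count the tuples in each bag of every age group.
-- An empty bag is expected to give a count of 0.
grunt> bag_sizes = FOREACH cogroup_data GENERATE group AS age, COUNT(employee_sales) AS sales_count, COUNT(employee_bonus) AS bonus_count;
grunt> Dump bag_sizes;
```

For the age 30 group, sales_count is expected to be 0, which corresponds to the empty bag {} in the output above.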

Now use the IsEmpty() function to list the groups whose employee_sales bag is empty.

grunt> isempty_data = filter cogroup_data by IsEmpty(employee_sales);

Verification

grunt> Dump isempty_data;   
(30,{},{(6,Omar,30,30000,admin)})
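
The opposite filter is often just as useful. As a minimal sketch, negating the test with NOT keeps only the age groups that have at least one tuple from employee_sales. The relation name notempty_data is an assumption chosen for illustration.

```pig
-- Keep only the groups whose employee_sales bag contains at least one tuple.
grunt> notempty_data = filter cogroup_data by NOT IsEmpty(employee_sales);
grunt> Dump notempty_data;
```

With the sample data above, this should return the groups for ages 22, 23 and 25 and drop the group for age 30.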
