Apache Pig - SUBTRACT() Function

What is the SUBTRACT() Function in Apache Pig?

    • The SUBTRACT() function in Apache Pig subtracts one bag from another.
    • It takes two bags as inputs and returns a bag containing the tuples of the first bag that are not present in the second bag.
    • SUBTRACT() operates on bags, not on numbers; to compute the difference between two numeric values, use the arithmetic - operator instead.
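
    The bag difference described above can be modelled outside Pig. The following is a minimal Python sketch, not Pig code: bags are represented as lists of tuples, the function name subtract and the sample rows are hypothetical, and the handling of duplicate tuples is simplified and may differ from Pig's.

```python
def subtract(bag1, bag2):
    """Return the tuples of bag1 that do not appear in bag2."""
    exclude = set(bag2)  # tuples are hashable, so membership tests are cheap
    return [t for t in bag1 if t not in exclude]

# Hypothetical sample tuples of the form (sno, name); not taken from the tutorial's files.
sales = [(1, "Anu"), (2, "Bala")]
bonus = [(2, "Bala"), (3, "Chitra")]

print(subtract(sales, bonus))  # [(1, 'Anu')]
print(subtract(bonus, sales))  # [(3, 'Chitra')]
```

    The order of the arguments matters: swapping them yields a different result, because only tuples of the first bag can ever be returned.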


      grunt> SUBTRACT(expression, expression)


        Assume that we have two files, wikitechy_employee_sales.txt and wikitechy_employee_bonus.txt, in the HDFS directory /pig_data/.


        We load these files into Pig as the relations employee_sales and employee_bonus, respectively.


          grunt> employee_sales = LOAD 'hdfs://localhost:9000/pig_data/wikitechy_employee_sales.txt' USING PigStorage(',')
             as (sno:int, name:chararray, age:int, salary:int, dept:chararray);


            grunt> employee_bonus = LOAD 'hdfs://localhost:9000/pig_data/wikitechy_employee_bonus.txt' USING PigStorage(',')
               as (sno:int, name:chararray, age:int, salary:int, dept:chararray);	
            • Group the tuples of the relations employee_sales and employee_bonus by the key sno, using the COGROUP operator:
            grunt> cogroup_data = COGROUP employee_sales by sno, employee_bonus by sno;
            • Verify the relation cogroup_data using the DUMP operator:
            grunt> Dump cogroup_data;
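
            To make the shape of cogroup_data concrete, here is a rough Python sketch of what grouping two relations by sno produces: one row per key, holding the matching tuples from each relation in separate bags. The helper cogroup and the sample rows are hypothetical, and rows are shortened to (sno, name, dept) for brevity.

```python
from collections import defaultdict

def cogroup(rel_a, rel_b, key=lambda t: t[0]):
    """Group two relations by key into rows of (key, bag_from_a, bag_from_b)."""
    groups = defaultdict(lambda: ([], []))
    for t in rel_a:
        groups[key(t)][0].append(t)
    for t in rel_b:
        groups[key(t)][1].append(t)
    return [(k, a, b) for k, (a, b) in sorted(groups.items())]

# Hypothetical rows (sno, name, dept); not the contents of the tutorial's files.
employee_sales = [(1, "Anu", "sales"), (2, "Bala", "sales")]
employee_bonus = [(2, "Bala", "sales"), (3, "Chitra", "admin")]

cogroup_data = cogroup(employee_sales, employee_bonus)
for row in cogroup_data:
    print(row)
# (1, [(1, 'Anu', 'sales')], [])
# (2, [(2, 'Bala', 'sales')], [(2, 'Bala', 'sales')])
# (3, [], [(3, 'Chitra', 'admin')])
```

            A key that occurs in only one relation still gets a row; the bag for the other relation is simply empty.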

            Subtracting One Relation from the Other

              • Now subtract the tuples of the employee_bonus relation from the employee_sales relation:
              grunt> sub_data = FOREACH cogroup_data GENERATE SUBTRACT(employee_sales, employee_bonus);


                grunt> Dump sub_data;
                • In the same way, subtract the tuples of the employee_sales relation from the employee_bonus relation:
                grunt> sub_data = FOREACH cogroup_data GENERATE SUBTRACT(employee_bonus, employee_sales);
                • Verify the contents of the sub_data relation using the Dump operator:
                grunt> Dump sub_data;
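
                The two FOREACH statements apply SUBTRACT() once per group, so each result holds one bag per value of sno. A small Python sketch of that behaviour follows, under stated assumptions: cogroup_data is written out by hand with hypothetical rows of the form (sno, sales_bag, bonus_bag), and subtract is an illustrative stand-in for Pig's function rather than its actual implementation.

```python
def subtract(bag1, bag2):
    """Keep the tuples of bag1 that do not appear in bag2."""
    exclude = set(bag2)
    return [t for t in bag1 if t not in exclude]

# Hypothetical cogroup_data, written by hand as (sno, sales_bag, bonus_bag).
cogroup_data = [
    (1, [(1, "Anu", "sales")], []),
    (2, [(2, "Bala", "sales")], [(2, "Bala", "sales")]),
    (3, [], [(3, "Chitra", "admin")]),
]

# Mirrors SUBTRACT(employee_sales, employee_bonus), evaluated per group.
sales_minus_bonus = [subtract(sales, bonus) for _, sales, bonus in cogroup_data]
# Mirrors SUBTRACT(employee_bonus, employee_sales), evaluated per group.
bonus_minus_sales = [subtract(bonus, sales) for _, sales, bonus in cogroup_data]

print(sales_minus_bonus)  # [[(1, 'Anu', 'sales')], [], []]
print(bonus_minus_sales)  # [[], [], [(3, 'Chitra', 'admin')]]
```

                In this sketch an empty list stands for a group with nothing left after the subtraction, for example when the same tuple appears in both bags.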
