Apache Pig - Union Operator



What is the UNION Operator in Apache Pig?

    • The UNION operator of Pig Latin is used to merge the contents of two or more relations.
    • To perform a UNION operation on two relations, their columns and domains (data types) should be identical; otherwise the merged relation may not have a usable schema.
  • The UNION instruction:
    • Combines multiple relations into a single relation

    Syntax

      The syntax of the UNION operator is:

      grunt> Relation_name3 = UNION Relation_name1, Relation_name2;
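The same syntax extends to more than two relations. The lines below are a minimal sketch: Relation_name4 and Relation_name5 are placeholder names introduced here for illustration, in the style of the syntax above. The ONSCHEMA form matches fields by name rather than by position and assumes that every input relation has a declared schema.

```pig
-- Merge three relations in one statement (all names are placeholders).
Relation_name4 = UNION Relation_name1, Relation_name2, Relation_name3;

-- ONSCHEMA aligns fields by name instead of by position;
-- it requires each input relation to have a schema.
Relation_name5 = UNION ONSCHEMA Relation_name1, Relation_name2;
```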
      

      Example

        Assume that two files, wikitechy_employee_data1.txt and wikitechy_employee_data2.txt, exist in the /pig_data/ directory of HDFS, with the contents shown below.

        wikitechy_employee_data1.txt

          111,Anu,Shankar,23,9876543210,Chennai
          112,Barvathi,Nambiayar,24,9876543211,Chennai
          113,Kajal,Nayak,24,9876543212,Trivendram
          114,Preethi,Antony,21,9876543213,Pune
          115,Raj,Gopal,21,9876543214,Hyderabad
          116,Yashika,Kannan,22,9876543215,Delhi
          

          wikitechy_employee_data2.txt

            117,siddu,Narayanan,22,9876543216,Kolkata
            118,Timple,Mohanthy,23,9876543217,Bhuwaneshwar
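Before loading, it can help to confirm that both files are actually in HDFS. A minimal sketch using Pig's built-in fs command, assuming the same /pig_data/ directory and NameNode address (hdfs://localhost:9000) used elsewhere in this example:

```pig
-- List the directory to check that both input files are present.
fs -ls hdfs://localhost:9000/pig_data/

-- Print one file to check its contents and the comma delimiter.
fs -cat hdfs://localhost:9000/pig_data/wikitechy_employee_data2.txt
```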
            

            Load these two files into Pig as the relations employee1 and employee2, as shown below.

            grunt> employee1 = LOAD 'hdfs://localhost:9000/pig_data/wikitechy_employee_data1.txt' USING PigStorage(',')
               as (id:int, firstname:chararray, lastname:chararray, age:int, phone:chararray, city:chararray);

            grunt> employee2 = LOAD 'hdfs://localhost:9000/pig_data/wikitechy_employee_data2.txt' USING PigStorage(',')
               as (id:int, firstname:chararray, lastname:chararray, age:int, phone:chararray, city:chararray);
            

            Now merge the contents of these two relations using the UNION operator as given below.

            grunt> employee = UNION employee1, employee2;
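UNION keeps duplicate tuples and does not guarantee the order of the result. If either property matters, the merged relation can be post-processed. The following is a minimal sketch; employee_unique and employee_sorted are names introduced here for illustration, and sorting by id assumes the first field was loaded as id.

```pig
-- Remove tuples that appear in both input relations.
employee_unique = DISTINCT employee;

-- Impose an explicit order, since UNION itself does not preserve one.
employee_sorted = ORDER employee_unique BY id ASC;

DUMP employee_sorted;
```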
            

            Verification

              Now verify the relation employee using the DUMP operator as given below.

              grunt> Dump employee; 
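To check the structure of the merged relation rather than its contents, DESCRIBE can be used as well. This is a small sketch of that check; it prints the schema that employee inherited from the two input relations.

```pig
-- Show the field names and types of the merged relation.
DESCRIBE employee;
```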
              

              Output

                This produces the following output, displaying the contents of the relation employee.

                (111,Anu,Shankar,23,9876543210,Chennai)
                (112,Barvathi,Nambiayar,24,9876543211,Chennai)
                (113,Kajal,Nayak,24,9876543212,Trivendram)
                (114,Preethi,Antony,21,9876543213,Pune)
                (115,Raj,Gopal,21,9876543214,Hyderabad)
                (116,Yashika,Kannan,22,9876543215,Delhi)
                (117,siddu,Narayanan,22,9876543216,Kolkata)
                (118,Timple,Mohanthy,23,9876543217,Bhuwaneshwar)
                
