pig tutorial - apache pig tutorial - Apache Pig Cross Operator - pig latin - apache pig - pig hadoop

What is CROSS operator?

The CROSS operator computes the cross-product of two or more relations. This chapter explains with example how to use the cross operator in Pig Latin.

CROSS instruction:

Cartesian product of two or more relations

learn apache pig - apache pig tutorial - pig tutorial - apache pig examples - big data - apache pig script - apache pig program - apache pig download - apache pig example - apache pig cross join operation

Syntax:

The syntax of the CROSS operator is

grunt> Relation3_name = CROSS Relation1_name, Relation2_name;

Example 1:

Assume that we have two files namely customers.txt and orders.txt in the /pig_data/ directory of HDFS as shown below.

customers.txt

1,Ramesh,32,Ahmedabad,2000.00
2,Khilan,25,Delhi,1500.00
3,kaushik,23,Kota,2000.00
4,Chaitali,25,Mumbai,6500.00
5,Hardik,27,Bhopal,8500.00
6,Komal,22,MP,4500.00
7,Muffy,24,Indore,10000.00

orders.txt

102,2009-10-08 00:00:00,3,3000
100,2009-10-08 00:00:00,3,1500
101,2009-11-20 00:00:00,2,1560
103,2008-05-20 00:00:00,4,2060

And we have loaded these two files into Pig with the relations customers and orders as shown below.

grunt> customers = LOAD 'hdfs://localhost:9000/pig_data/customers.txt' USING PigStorage(',')
   as (id:int, name:chararray, age:int, address:chararray, salary:int);
  
grunt> orders = LOAD 'hdfs://localhost:9000/pig_data/orders.txt' USING PigStorage(',')
   as (oid:int, date:chararray, customer_id:int, amount:int);

Now get the cross-product of these two relations using the cross operator on these two relations as shown below.

grunt> cross_data = CROSS customers, orders;

Verification:

Verify the relation cross_data using the DUMP operator as shown below.

grunt> Dump cross_data;

Output:

It will produce the following output, displaying the contents of the relation cross_data.

(7,Muffy,24,Indore,10000,103,2008-05-20 00:00:00,4,2060) 
(7,Muffy,24,Indore,10000,101,2009-11-20 00:00:00,2,1560) 
(7,Muffy,24,Indore,10000,100,2009-10-08 00:00:00,3,1500) 
(7,Muffy,24,Indore,10000,102,2009-10-08 00:00:00,3,3000) 
(6,Komal,22,MP,4500,103,2008-05-20 00:00:00,4,2060) 
(6,Komal,22,MP,4500,101,2009-11-20 00:00:00,2,1560) 
(6,Komal,22,MP,4500,100,2009-10-08 00:00:00,3,1500) 
(6,Komal,22,MP,4500,102,2009-10-08 00:00:00,3,3000) 
(5,Hardik,27,Bhopal,8500,103,2008-05-20 00:00:00,4,2060) 
(5,Hardik,27,Bhopal,8500,101,2009-11-20 00:00:00,2,1560) 
(5,Hardik,27,Bhopal,8500,100,2009-10-08 00:00:00,3,1500) 
(5,Hardik,27,Bhopal,8500,102,2009-10-08 00:00:00,3,3000) 
(4,Chaitali,25,Mumbai,6500,103,2008-05-20 00:00:00,4,2060) 
(4,Chaitali,25,Mumbai,6500,101,2009-20 00:00:00,4,2060) 
(2,Khilan,25,Delhi,1500,101,2009-11-20 00:00:00,2,1560) 
(2,Khilan,25,Delhi,1500,100,2009-10-08 00:00:00,3,1500) 
(2,Khilan,25,Delhi,1500,102,2009-10-08 00:00:00,3,3000) 
(1,Ramesh,32,Ahmedabad,2000,103,2008-05-20 00:00:00,4,2060) 
(1,Ramesh,32,Ahmedabad,2000,101,2009-11-20 00:00:00,2,1560) 
(1,Ramesh,32,Ahmedabad,2000,100,2009-10-08 00:00:00,3,1500) 
(1,Ramesh,32,Ahmedabad,2000,102,2009-10-08 00:00:00,3,3000)-11-20 00:00:00,2,1560) 
(4,Chaitali,25,Mumbai,6500,100,2009-10-08 00:00:00,3,1500) 
(4,Chaitali,25,Mumbai,6500,102,2009-10-08 00:00:00,3,3000) 
(3,kaushik,23,Kota,2000,103,2008-05-20 00:00:00,4,2060) 
(3,kaushik,23,Kota,2000,101,2009-11-20 00:00:00,2,1560) 
(3,kaushik,23,Kota,2000,100,2009-10-08 00:00:00,3,1500) 
(3,kaushik,23,Kota,2000,102,2009-10-08 00:00:00,3,3000) 
(2,Khilan,25,Delhi,1500,103,2008-05-20 00:00:00,4,2060) 
(2,Khilan,25,Delhi,1500,101,2009-11-20 00:00:00,2,1560) 
(2,Khilan,25,Delhi,1500,100,2009-10-08 00:00:00,3,1500)
(2,Khilan,25,Delhi,1500,102,2009-10-08 00:00:00,3,3000) 
(1,Ramesh,32,Ahmedabad,2000,103,2008-05-20 00:00:00,4,2060) 
(1,Ramesh,32,Ahmedabad,2000,101,2009-11-20 00:00:00,2,1560) 
(1,Ramesh,32,Ahmedabad,2000,100,2009-10-08 00:00:00,3,1500) 
(1,Ramesh,32,Ahmedabad,2000,102,2009-10-08 00:00:00,3,3000)

Example 2:

Step 1 - Change the directory to /usr/local/pig/bin

$ cd /usr/local/pig/bin

Step 2 - Enter into grunt shell in MapReduce mode.

 $ ./pig -x mapreduce

Step 3 - Create a customers.txt file.

customers.txt

Step 4 - Add these following lines to customers.txt file.

1,Ramesh,32,Ahmedabad,2000.00
2,Khilan,25,Delhi,1500.00
3,kaushik,23,Kota,2000.00
4,Chaitali,25,Mumbai,6500.00
5,Hardik,27,Bhopal,8500.00
6,Komal,22,MP,4500.00
7,Muffy,24,Indore,10000.00

Step 5 - Create a orders.txt file.

orders.txt

Step 6 - Add these following lines to orders.txt file.

102,2009-10-08 00:00:00,3,3000
100,2009-10-08 00:00:00,3,1500
101,2009-11-20 00:00:00,2,1560
103,2008-05-20 00:00:00,4,2060

Step 7 - Copy customers.txt and orders.txt from local file system to HDFS. In my case, the customers.txt and orders.txt file are stored in /home/hduser/Desktop/PIG/ directory.

$ hdfs dfs -copyFromLocal /home/hduser/Desktop/PIG/customers.txt /user/hduser/pig/
$ hdfs dfs -copyFromLocal /home/hduser/Desktop/PIG/orders.txt /user/hduser/pig/

Step 8 - Load customers data.

customers = LOAD 'hdfs://localhost:9000/user/hduser/pig/customers.txt' USING
PigStorage(',')as (id:int, name:chararray, age:int, address:chararray,
salary:int);

Step 9 - Load orders data.

orders = LOAD 'hdfs://localhost:9000/user/hduser/pig/orders.txt' USING
PigStorage(',')as (oid:int, date:chararray, customer_id:int, amount:int);

Step 10 - Cross data.

cross_data = CROSS customers, orders;
Dump cross_data;

pig tutorial - apache pig tutorial - Apache Pig Cross Operator - pig latin - apache pig - pig hadoop

What is CROSS operator?

Syntax:

Example 1:

customers.txt

orders.txt

Verification:

Output:

Example 2:

Related Searches to Apache Pig Cross Operator

Wikitechy

Workshop

Join our Community

Other Languages

pig tutorial - apache pig tutorial - Apache Pig Cross Operator - pig latin - apache pig - pig hadoop

What is CROSS operator?

Syntax:

Example 1:

customers.txt

orders.txt

Verification:

Output:

Example 2:

Related Searches to Apache Pig Cross Operator

Summer Offline Internship

Summer Online Internship

Internship in Chennai

Programming / Technology Internship in Chennai

Wikitechy

Workshop

Join our Community

Other Languages