How to add a column to an already existing table in Apache Pig?

Problem:

How to add a column to an already existing table in Apache Pig?

Solution 1:

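Pig relations are immutable, so a column is added by generating a new relation with FOREACH ... GENERATE. For example, the RANDOM() function can be used to add a column of random integers: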


A = ...
B = FOREACH A GENERATE (int)(RANDOM() * 100.0) AS rnd, [other fields...];
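
The pattern above can be sketched with a concrete relation. This is only an illustration: the file name input.txt and the fields id and name are assumptions, not part of the original question. It also uses * to carry over every existing field, so they do not have to be listed one by one.

```pig
-- Hypothetical input: a comma-separated file with two fields
-- (the path and the schema are assumptions made for this sketch).
A = LOAD 'input.txt' USING PigStorage(',') AS (id:int, name:chararray);

-- Build a new relation containing every field of A plus one extra column.
-- RANDOM() returns a double in [0.0, 1.0), so the cast yields an int from 0 to 99.
B = FOREACH A GENERATE *, (int)(RANDOM() * 100.0) AS rnd;

-- Print the schema of B to confirm that the new column is present.
DESCRIBE B;
```

Here the new column is placed last. To put it first, as in the statement above, list it before the other fields in the GENERATE clause.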

Random Function

  • The RANDOM() function returns a pseudo-random number (type double) greater than or equal to 0.0 and less than 1.0. It is used inside an expression, for example in a FOREACH ... GENERATE statement.
Syntax: RANDOM()


  • Assume that there is a file named math.txt in the HDFS directory /pig_data/.
  • This file contains integer and floating-point values.
  • We have loaded this file into Pig as a relation named math_data, as shown below.
grunt> math_data = LOAD 'hdfs://localhost:9000/pig_data/math.txt' USING PigStorage(',')
   as (data:float);
  • Let us now add a column of random values to each record of math_data using the RANDOM() function, as shown below.
grunt> random_data = foreach math_data generate (data), RANDOM();
  • The above statement stores the result in the relation named random_data. Verify the contents of the relation using the Dump operator as shown below.
grunt> Dump random_data;
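
As a variation on the statements above, the new column can be given an explicit name and type so that later statements can refer to it by name. This is a minimal sketch: the alias rnd and the output directory random_out are assumptions chosen for illustration, and math_data is the relation loaded earlier.

```pig
-- Name the added column so it can be referenced by name later on.
-- "rnd" is a hypothetical alias chosen for this sketch.
random_data = FOREACH math_data GENERATE data, RANDOM() AS rnd:double;

-- Check that the schema now has two fields: data and rnd.
DESCRIBE random_data;

-- Optionally write the widened relation back to HDFS.
-- The output directory is an assumption and must not already exist.
STORE random_data INTO 'hdfs://localhost:9000/pig_data/random_out' USING PigStorage(',');
```

Because RANDOM() is evaluated once per record, each row gets its own value, and the values differ from run to run.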
