Apache Pig tutorial - Apache Pig RANDOM()



What is the RANDOM() function in Apache Pig?

  • The RANDOM() function takes no arguments and returns a pseudo-random number of type double that is greater than or equal to 0.0 and less than 1.0.
  • RAND_MAX is a constant of the C standard library (guaranteed to be at least 32767) and has no counterpart in Pig; RANDOM() always returns a value in the range given above.

Syntax

grunt> RANDOM()
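
RANDOM() is an expression rather than a standalone statement, so in practice it appears inside an operator such as FOREACH ... GENERATE or FILTER. The lines below are a minimal sketch of both uses; input_data is a hypothetical relation assumed to be loaded already, and with_rand, sampled and rnd are illustrative names.

```pig
-- input_data is a hypothetical relation that has already been loaded
grunt> with_rand = FOREACH input_data GENERATE *, RANDOM() AS rnd;  -- appends one new double to every row
grunt> sampled = FILTER input_data BY RANDOM() < 0.5;  -- keeps roughly half of the rows
```

Naming the generated field (here rnd) makes it easier to refer to in later statements.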

Example

  • Ensure that a file named wikitechy_math.txt exists in the HDFS directory /pig_data/. This file contains integer and floating-point values, one per line, as given below.

wikitechy_math.txt

5 
16 
9 
2.5 
5.9 
3.1 
  • Load this file into Pig as a relation named math_data, as given below.
grunt> math_data = LOAD 'hdfs://localhost:9000/pig_data/wikitechy_math.txt' USING PigStorage(',')
   as (data:float);
  • Now generate a random value for each row of math_data using the RANDOM() function, as given below. The random value does not depend on the contents of the row; RANDOM() simply produces a new number for every record.
grunt> random_data = foreach math_data generate (data), RANDOM();

Verification

  • Verify the contents of the relation using the Dump operator as given below.
grunt> Dump random_data;

Output

  • The Dump statement prints the contents of the relation named random_data. Because the values are pseudo-random, the second field will generally differ from run to run.
(5.0,0.6842057767279982) 
(16.0,0.9725172591786139) 
(9.0,0.4159326414649489) 
(2.5,0.30962777780713147) 
(5.9,0.705213727551145) 
(3.1,0.24247708413861724)
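
When a range other than 0.0 to 1.0 is needed, the result of RANDOM() can be scaled. The following is a sketch rather than part of the original example; it assumes the math_data relation loaded above, and the names scaled_data and rand_int are illustrative.

```pig
-- Scale [0.0, 1.0) to an integer from 0 to 9; the (int) cast drops the fractional part
grunt> scaled_data = FOREACH math_data GENERATE data, (int)(RANDOM() * 10) AS rand_int;
grunt> Dump scaled_data;
```

Each row then carries the original value together with an integer between 0 and 9, which again changes between runs.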
