Apache Pig - REPLACE()



What is REPLACE()?

  • The REPLACE() function replaces every part of a given string that matches a string or regular expression with a new string.

Syntax:

Given below is the syntax of the REPLACE() function. This function accepts three parameters:

  • string − The string in which the replacement is made. To replace text within a relation, pass the name of the column that holds the string.
  • regExp − The string or regular expression to be replaced.
  • newChar − The new text that is substituted for each match.

grunt> REPLACE(string, 'regExp', 'newChar');
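
Because the second parameter is treated as a regular expression, it can match patterns as well as fixed text. The following is a minimal sketch of that behaviour. The file name names.txt, the relation emp, and the aliases masked_name and clean_name are assumptions made for illustration and do not come from the example in this tutorial.

```pig
-- Hypothetical input: a comma-separated file with an id and a name per line.
emp = LOAD 'names.txt' USING PigStorage(',') AS (id:int, name:chararray);

-- [aeiou] is a character class, so every lowercase vowel is replaced with '*'.
masked = FOREACH emp GENERATE id, REPLACE(name, '[aeiou]', '*') AS masked_name;

-- Regex metacharacters such as '.' must be escaped with a double backslash
-- to be matched literally; here every literal dot is removed.
no_dots = FOREACH emp GENERATE id, REPLACE(name, '\\.', '') AS clean_name;
```

With this sketch, a name such as Robin would become R*b*n in masked. An unescaped '.' would match any character, so the escaping matters whenever the text to replace contains punctuation.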

Example:

Assume that there is a file named wikitechy_emp.txt in the HDFS directory /pig_data/, as shown below. This file contains employee details such as id, name, age, and city.

wikitechy_emp.txt

001,Robin,22,newyork
002,BOB,23,Kolkata
003,Maya,23,Tokyo
004,Sara,25,London
005,David,23,Bhuwaneshwar
006,Maggy,22,Chennai
007,Robert,22,newyork
008,Syam,23,Kolkata
009,Mary,25,Tokyo
010,Saran,25,London
011,Stacy,25,Bhuwaneshwar
012,Kelly,22,Chennai

We have loaded this file into Pig as a relation named wikitechy_emp_data, as shown below.

grunt> wikitechy_emp_data = LOAD 'hdfs://localhost:9000/pig_data/wikitechy_emp.txt' USING PigStorage(',')
   AS (id:int, name:chararray, age:int, city:chararray);

The following is an example of the REPLACE() function. In this example, we replace the city name Bhuwaneshwar with the shorter form Bhuw.

grunt> replace_data = FOREACH wikitechy_emp_data GENERATE (id,city),REPLACE(city,'Bhuwaneshwar','Bhuw');

The above statement replaces the string 'Bhuwaneshwar' with 'Bhuw' in the column named city in the wikitechy_emp_data relation and returns the result. This result is stored in the relation named replace_data. Verify the content of the relation replace_data using the Dump operator as shown below.

grunt> Dump replace_data;

((1,newyork),newyork)
((2,Kolkata),Kolkata)
((3,Tokyo),Tokyo)
((4,London),London)
((5,Bhuwaneshwar),Bhuw)
((6,Chennai),Chennai)
((7,newyork),newyork)
((8,Kolkata),Kolkata)
((9,Tokyo),Tokyo)
((10,London),London)
((11,Bhuwaneshwar),Bhuw)
((12,Chennai),Chennai)
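
The parentheses around (id,city) in the FOREACH statement above build a tuple, which is why each output row is nested. If flat, named fields are preferred, the same replacement could be written as in the sketch below. The relation name replace_flat and the alias short_city are illustrative assumptions, not part of the original example.

```pig
-- Reuses the wikitechy_emp_data relation loaded earlier.
-- Listing id and city without parentheses keeps them as top-level fields,
-- and AS gives the REPLACE() result a field name.
replace_flat = FOREACH wikitechy_emp_data
               GENERATE id, city, REPLACE(city, 'Bhuwaneshwar', 'Bhuw') AS short_city;

-- Print the schema to confirm the field names and types.
DESCRIBE replace_flat;

DUMP replace_flat;
```

Naming the result with AS makes it possible to refer to short_city in later statements instead of using a positional reference such as $2.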
