Apache Pig TOKENIZE() Function




What is the TOKENIZE() function in Apache Pig?

  • The TOKENIZE() function in Apache Pig splits a string held in a single tuple and returns a bag that contains the output of the split operation.
  • By default, the input string is broken into tokens at the word separators: space, double quote ("), comma (,), parentheses, and star (*). An optional second argument specifies a different field delimiter.
  • Each substring found between separators becomes one token, and each token is placed in its own tuple inside the returned bag.
  • If the input string contains no separator, the bag holds a single token, which contains the whole input string.

Syntax

TOKENIZE(expression [, 'field_delimiter'])
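
As a minimal sketch of the two-argument form, the statements below split a hyphen-separated string using an explicit field delimiter. The file name dates.txt, the relation names dates and date_parts, and the field name d are hypothetical and are used only for illustration.

```pig
-- Hypothetical input file: one chararray value per line, e.g. 2024-01-15
dates = LOAD 'dates.txt' AS (d:chararray);

-- The second argument replaces the default word separators with '-'
date_parts = FOREACH dates GENERATE TOKENIZE(d, '-');

DUMP date_parts;
```

For an input line such as 2024-01-15, the resulting bag would be expected to hold the three tokens 2024, 01 and 15, each in its own tuple.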

Example

wikitechy_student_details.txt

111,Suresh Reddy,21,Hyderabad
112,Arvin Battacharya,22,Kolkata
113,Ramesh Khanna,22,Delhi
114,Preethi Agarwal,21,Pune
115,Sruthi Mohanthy,23,Bhuwaneshwar
116,Vanitha Mishra,23,Chennai
117,Kamala Nayak,24,trivendram
118,Bhargavi Nambiayar,24,Chennai

The file is loaded into Pig as the relation wikitechy_student_details, as shown below:

grunt> wikitechy_student_details = LOAD 'hdfs://localhost:9000/pig_data/wikitechy_student_details.txt' USING PigStorage(',')
as (id:int, name:chararray, age:int,  city:chararray);

Tokenizing a String

We can use the TOKENIZE() function to split a string into tokens. Here, each student's name is split into its individual words.

grunt> student_name_tokenize = FOREACH wikitechy_student_details GENERATE TOKENIZE(name);

Verification

grunt> DUMP student_name_tokenize;

Output

({(Suresh),(Reddy)})
({(Arvin),(Battacharya)})
({(Ramesh),(Khanna)})
({(Preethi),(Agarwal)})
({(Sruthi),(Mohanthy)})
({(Vanitha),(Mishra)})
({(Kamala),(Nayak)})
({(Bhargavi),(Nambiayar)})
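
Each row of the output above is a bag of tokens. When one token per row is more convenient (for example, before grouping or counting words), the bag can be un-nested with FLATTEN. The following is a sketch only, assuming the wikitechy_student_details relation has been loaded as shown earlier; the relation name student_name_words and the alias word are hypothetical.

```pig
-- Assumes wikitechy_student_details is already loaded with the schema
-- (id:int, name:chararray, age:int, city:chararray).
-- FLATTEN un-nests the bag returned by TOKENIZE, producing one row per token
-- and repeating the id for every token of the same name.
student_name_words = FOREACH wikitechy_student_details
                     GENERATE id, FLATTEN(TOKENIZE(name)) AS word;

DUMP student_name_words;
```

With the sample data, the first record would be expected to produce two rows, (111,Suresh) and (111,Reddy).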
