How to ignore " (double quotes) while loading a file in Pig?



Problem:

How to ignore " (double quotes) while loading a file in Pig?

Solution 1:

REPLACE function

  • This function replaces every part of a given string that matches a regular expression with a new string.

grunt> REPLACE(string, 'regExp', 'newChar');
  • We can use the REPLACE function to strip the quotes from each field after loading:
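-- PigStorage(',') only splits on commas, so quoted values keep their quotes
file1 = load 'your.csv' using PigStorage(',');
-- strip the quotes with REPLACE, then cast the cleaned strings to int explicitly
data = foreach file1 generate (chararray)$0 as f1, (chararray)$1 as f2, (int)REPLACE($2, '\\"', '') as f3, (int)REPLACE($3, '\\"', '') as f4;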
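
Pig evaluates the regExp argument as a Java regular expression, so the pattern can be checked outside Pig before running the job. The following is a minimal Java sketch of that check, not part of the Pig script; the class name QuoteStripSketch and the method stripQuotes are hypothetical names used only for illustration.

```java
class QuoteStripSketch {
    // Hypothetical helper: applies the same pattern as REPLACE($2, '\\"', '')
    // to a single field value.
    static String stripQuotes(String field) {
        // The regex \" matches a literal double quote; each match is replaced
        // with the empty string.
        return field.replaceAll("\\\"", "");
    }

    public static void main(String[] args) {
        String raw = "\"42\"";                   // field as read from the file: "42"
        String cleaned = stripQuotes(raw);
        System.out.println(cleaned);                       // prints 42
        System.out.println(Integer.parseInt(cleaned) + 1); // prints 43
    }
}
```

Once the quotes are gone the value parses as an integer, which is what the int fields f3 and f4 in the Pig statement depend on.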

We can also use a regular expression with REGEX_EXTRACT to keep only the digits of each quoted field:

file1 = load 'your.csv' using PigStorage(',');
data = foreach file1 generate $0, $1, REGEX_EXTRACT($2, '([0-9]+)', 1), REGEX_EXTRACT($3, '([0-9]+)', 1);
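
To see what the pattern '([0-9]+)' with group index 1 picks out of a quoted value, here is a minimal Java sketch using java.util.regex. It assumes that REGEX_EXTRACT searches for the first match anywhere in the string and yields null when nothing matches; the class name DigitExtractSketch and the method firstDigits are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DigitExtractSketch {
    // Same pattern as in the Pig statement: one capture group of digits.
    private static final Pattern DIGITS = Pattern.compile("([0-9]+)");

    // Hypothetical helper: returns capture group 1 of the first match,
    // or null when the field contains no digits.
    static String firstDigits(String field) {
        Matcher m = DIGITS.matcher(field);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstDigits("\"123\""));  // prints 123
        System.out.println(firstDigits("\"12.5\"")); // prints 12
        System.out.println(firstDigits("\"-7\""));   // prints 7
    }
}
```

The pattern keeps only the first run of digits, so a sign or a decimal part is dropped. For columns that can hold such values, stripping the quotes with REPLACE is the safer choice.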

Solution 2:

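Load the file with the CSVExcelStorage loader from Piggybank, which handles fields enclosed in double quotes:

register piggybank.jar; -- adjust the path to the Piggybank jar of your installation
file1 = load 'your.csv' using org.apache.pig.piggybank.storage.CSVExcelStorage();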
