[Solved-2 Solutions] Formatting a date in a Pig GENERATE statement



Problem

In Pig, I have a statement that adds the current date to the generated values:

Data = FOREACH Input GENERATE (CurrentTime()),FLATTEN(group), COUNT(guid) as Cnt;

The output gives me the date in ISO 8601 format, e.g. 2013-05-25T09:01:38.914-04:00.

How can I format it as just "YYYY-MM-DD"?

Solution 1:

  • We have several options. Convert it with Pig functions:

Example

A = load ...
B = foreach A {
  currTime = CurrentTime();
  year = (chararray)GetYear(currTime);
  month = (chararray)GetMonth(currTime);
  day = (chararray)GetDay(currTime);
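  generate CONCAT(CONCAT(CONCAT(year, '-'), CONCAT(month, '-')),day) as myDate;
};
-- Note: GetMonth and GetDay return ints, so month and day are not zero-padded (e.g. 2013-5-25).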
  • OR pass the date to the script as a parameter:
pig -f script.pig -param CURR_DATE=`date +%Y-%m-%d`
  • OR declare it in script:
%declare CURR_DATE `date +%Y-%m-%d`;
  • Then refer to the variable as '$CURR_DATE' in the script.
  • We may also create a modified CurrentTime UDF in which we convert the DateTime object to the appropriate format with the Joda-Time library.
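As a minimal sketch of the parameter approach, the script below stamps each record with the substituted date string. The file name events.tsv and the fields guid, category and run_date are hypothetical and not part of the original question. It assumes the script is started with -param CURR_DATE=... as shown above, or that the %declare line has been added at the top.

```pig
-- script.pig: a sketch using hypothetical input and field names.
-- The date parameter is substituted as plain text before the script runs,
-- so it must sit inside single quotes to become a chararray constant.
events  = LOAD 'events.tsv' AS (guid:chararray, category:chararray);
stamped = FOREACH events GENERATE '$CURR_DATE' AS run_date, guid, category;
DUMP stamped;
```

Because the date is computed once by the shell, every record gets the same value, and it is already a chararray in the requested format.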

Solution 2:

  • If we are using Pig 0.12, we can use the built-in ToString function:
  • ToString(CurrentTime(),'yyyy-MM-dd')
  • We can pass any datetime value instead of CurrentTime().
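Applied to a grouped count like the one in the question, this could look like the sketch below. The input file events.tsv and the names events, grouped, category and run_date are assumptions made for illustration, since the original does not show how Input was built.

```pig
-- Hypothetical input: one guid and one category per line, tab-separated.
events  = LOAD 'events.tsv' AS (guid:chararray, category:chararray);
grouped = GROUP events BY category;

-- ToString formats the datetime returned by CurrentTime() with a
-- Joda-Time style pattern, giving a chararray such as 2013-05-25.
Data = FOREACH grouped GENERATE
           ToString(CurrentTime(), 'yyyy-MM-dd') AS run_date,
           group AS category,
           COUNT(events) AS Cnt;
DUMP Data;
```

Unlike the CONCAT approach, the pattern 'yyyy-MM-dd' zero-pads the month and day.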
