[Solved-1 Solution] Filtering null values with Pig?



Problem :

How do we filter out null values with Pig?

Solution 1:

  • If all of the fields are of numeric types, we can use the following:
filtered = FILTER data BY $0*$1*$2 is not null;
  • In Pig, if any term in an arithmetic expression is null, the result is null, so the product is non-null only when every field is non-null.
  • We can also write a UDF that takes an arbitrary number of arguments and returns the number of nulls among them:
filtered = FILTER data BY NUMBER_OF_NULLS($0, $1, $2) == 0;
  • where NUMBER_OF_NULLS is defined elsewhere, e.g.
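import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Counts how many of the arguments passed to the UDF are null.
public class NUMBER_OF_NULLS extends EvalFunc<Integer> {
    @Override
    public Integer exec(Tuple input) throws IOException {
        if (input == null) { return 0; }

        int c = 0;
        for (int i = 0; i < input.size(); i++) {
            // Tuple.get can throw ExecException, a subclass of IOException
            if (input.get(i) == null) c++;
        }
        return c;
    }
}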
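
The two filters in this solution should keep the same rows, and the reasoning can be checked outside Pig. The block below is only a plain-Java sketch, not Pig's implementation: the class name NullFilterSketch and its methods are hypothetical, a List<Integer> stands in for a Pig tuple of numeric fields, and multiply simply models the rule that arithmetic on a null operand yields null.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java model of the two filters; no Pig classes are used (assumption).
public class NullFilterSketch {

    // Models the rule that arithmetic on a null operand yields null.
    public static Integer multiply(Integer a, Integer b) {
        return (a == null || b == null) ? null : a * b;
    }

    // Same counting loop as the NUMBER_OF_NULLS UDF, over a List instead of a Tuple.
    public static int numberOfNulls(List<Integer> fields) {
        if (fields == null) { return 0; }
        int c = 0;
        for (Integer f : fields) {
            if (f == null) c++;
        }
        return c;
    }

    // Keeps a row only if the product of all of its fields is not null.
    public static boolean keepByProduct(List<Integer> fields) {
        Integer product = 1;
        for (Integer f : fields) {
            product = multiply(product, f);
        }
        return product != null;
    }

    public static void main(String[] args) {
        List<Integer> complete = Arrays.asList(2, 3, 4);
        List<Integer> withNull = Arrays.asList(2, null, 4);
        // A row passes either filter only when it has no null fields.
        System.out.println(keepByProduct(complete) + " " + numberOfNulls(complete));
        System.out.println(keepByProduct(withNull) + " " + numberOfNulls(withNull));
    }
}
```

In this model a row with no nulls passes both checks, while a single null field makes the product null and the null count non-zero, so both filters drop it.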
