[Solved-2 Solutions] Apache Pig permissions issue



Problem:

  • You are trying to get Apache Pig up and running on your Hadoop cluster and run into a permissions problem. Pig itself launches and connects to the cluster just fine: from within the Pig shell, you can ls through and around HDFS directories.
  • However, when you actually try to load data and run Pig commands, you get permissions-related errors:
grunt> A = load 'all_annotated.txt' USING PigStorage() AS (id:long, text:chararray, lang:chararray);
grunt> DUMP A;
2011-08-24 18:11:40,961 [main] ERROR org.apache.pig.tools.grunt.Grunt - You don't have permission to perform the operation. Error from the server: org.apache.hadoop.security.AccessControlException: Permission denied: user=steven, access=WRITE, inode="":hadoop:supergroup:r-xr-xr-x
2011-08-24 18:11:40,977 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias A
Details at logfile: /Users/steven/Desktop/Hacking/hadoop/pig/pig-0.9.0/pig_1314230681326.log
grunt> 
  • In this case, all_annotated.txt is a file in your HDFS home directory that you created and most definitely have permissions to; the same problem occurs no matter what file you try to load. However, the file does not seem to be the problem: as the error itself indicates, Pig is trying to write somewhere.

Any ideas as to what might be going on?

Solution 1:

This is probably caused by the pig.temp.dir setting. It defaults to /tmp on HDFS, and Pig writes temporary results there. If you don't have write permission on /tmp, Pig will complain. You can try to override the setting with -Dpig.temp.dir.
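As a rough sketch of that override, the snippet below points pig.temp.dir at a directory the user owns. The path /user/steven/pig_tmp is only an assumption (based on the user name in the error message and the usual HDFS home-directory layout), and the PIG_TMP and PIG_CMD variable names are illustrative.

```shell
# Hypothetical scratch directory under the user's HDFS home; adjust for your cluster.
PIG_TMP="/user/steven/pig_tmp"

# Create the directory on HDFS, but only if the hadoop CLI is on the PATH.
if command -v hadoop >/dev/null 2>&1; then
  hadoop fs -mkdir "$PIG_TMP"
fi

# Build the Pig invocation with pig.temp.dir pointing at that directory.
PIG_CMD="pig -Dpig.temp.dir=$PIG_TMP"
echo "Start Pig with: $PIG_CMD"
```

Running the printed command should open the grunt shell with temporary results going to the chosen directory instead of /tmp, provided the user can write there.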

Solution 2:

The problem might be that hadoop.tmp.dir is a directory on your local filesystem, not HDFS. Try setting that property to a local directory that you have write access to. The same error can occur when running regular MapReduce jobs in Hadoop.
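One way to prepare such a directory is sketched below. The location $HOME/hadoop-tmp and the LOCAL_TMP variable are assumptions for illustration; any local path owned by the user running Pig should work.

```shell
# Hypothetical local scratch directory; any path you own would do.
LOCAL_TMP="${HOME}/hadoop-tmp"

# Create it if it is missing, then confirm the current user can write to it.
mkdir -p "$LOCAL_TMP"
if [ -w "$LOCAL_TMP" ]; then
  echo "writable: $LOCAL_TMP"
else
  echo "not writable: $LOCAL_TMP" >&2
fi
```

Once the directory checks out, hadoop.tmp.dir can be set to that path, for example in core-site.xml.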

