What is the Grunt Shell in Apache Pig?
The Grunt shell: an interactive shell for writing and executing Pig Latin and for accessing HDFS
- Grunt is the interactive command shell of Apache Pig.
- The Grunt shell is mainly used to write Pig Latin scripts. In addition, we can invoke shell and file system commands from it using sh and fs.
- The Grunt shell also provides a number of useful shell and utility commands.
- fs: invokes any FsShell command from within a Pig script or the Grunt shell. For example:
- fs -mkdir /tmp
- fs -copyFromLocal file-x file-y
- fs -ls file-y
- sh: invokes any sh shell command from within a Pig script or the Grunt shell.
- exec: runs a Pig script in batch mode.
- exec [-param param_name = param_value] [-param_file file_name] [script]
- Use the exec command to run a Pig script with no interaction between the script and the Grunt shell (batch mode).
- Aliases defined in the script are not available to the shell, and aliases defined in the shell are not available to the script.
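As a rough illustration of the exec syntax above, the sketch below passes a parameter to a script in batch mode. The script name myscript.pig, the input 'student', and the parameter name out are hypothetical placeholders, not names from this tutorial.

```pig
-- myscript.pig (hypothetical contents)
-- '$out' is replaced by the value passed with -param
a = LOAD 'student' AS (name, age, gpa);
b = ORDER a BY name;
STORE b INTO '$out';

-- From the Grunt shell: run the script in batch mode,
-- substituting myoutput for $out
grunt> exec -param out=myoutput myscript.pig
```

After the script finishes, the aliases a and b are not defined in the shell, but the files stored under myoutput remain visible.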
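- run: runs a Pig script in interactive mode.
- run [-param param_name = param_value] [-param_file file_name] script
- Use the run command to run a Pig script that can interact with the Grunt shell (interactive mode): the script can use aliases defined in the shell, and the shell can use aliases defined in the script.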
- We can invoke shell commands from the Grunt shell with two commands: sh and fs.
- The sh command invokes any shell command from the Grunt shell.
- The sh command can run only real programs. Commands that are part of the shell environment itself, such as cd, cannot be executed with it.
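A minimal sketch of the sh command, assuming the Grunt shell was started from a local directory that contains some files. The output depends on that directory, so none is shown here.

```pig
-- ls is a real program, so sh can run it;
-- this lists the local (not HDFS) working directory
grunt> sh ls

-- cd is part of the shell environment rather than a program,
-- so a call like the following is not expected to work
grunt> sh cd /tmp
```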
- The fs command invokes any FsShell command from the Grunt shell.
- The fs command extends the set of supported file system commands and the capabilities of existing commands. For example, fs -ls lists the contents of an HDFS directory:
- grunt> fs -ls
- Found 3 items
- drwxrwxrwx - Hadoop supergroup 0 2015-09-08 14:13 Hbase
- drwxr-xr-x - Hadoop supergroup 0 2015-09-09 14:52 seqgen_data
- drwxr-xr-x - Hadoop supergroup 0 2015-09-08 11:30 twitter_data
- The Grunt shell provides a set of utility commands: clear, help, history, quit, set, exec, kill, and run. They are used to control Pig from the Grunt shell.
- The clear command clears the screen of the Grunt shell.
- The help command prints a list of Pig commands and Pig properties.
- The history command displays the statements that have been executed since the Grunt shell was invoked.
- For example, if we have executed three statements since opening the Grunt shell, the history command lists those three statements.
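The history behaviour described above can be sketched as the following session. The three statements and the input name 'student' are hypothetical; any statements entered in the shell would do.

```pig
-- Three statements entered since the Grunt shell was opened
grunt> a = LOAD 'student' AS (name, age, gpa);
grunt> b = ORDER a BY name;
grunt> c = LIMIT b 10;

-- List the statements entered so far, with line numbers
grunt> history

-- Same list, with the line numbers omitted
grunt> history -n
```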
- The set command shows or assigns values to the keys used in Pig.
- We can set values for the following keys by using the set command:
|Key||Description and values|
|default_parallel||Sets the number of reducers for MapReduce jobs; pass any whole number as the value.|
|debug||Turns the debugging feature in Pig on or off; pass on or off as the value.|
|job.name||Sets the name of the job; pass a string as the value.|
|job.priority||Sets the priority of the job; pass one of the following values: very_low, low, normal, high, very_high.|
|stream.skippath||For streaming, sets the path from which data is not to be transferred; pass the desired path as a string.|
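A short sketch of how some of the keys in the table might be set from the Grunt shell. The specific values (10 reducers, the job name 'my job', the priority high) are arbitrary examples.

```pig
-- Use 10 reducers for the MapReduce jobs that follow
grunt> set default_parallel 10

-- Turn debugging on
grunt> set debug 'on'

-- Give the job a readable name
grunt> set job.name 'my job'

-- Raise the priority of the job
grunt> set job.priority high
```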
- The quit command exits the Grunt shell.
- The exec command executes a Pig script from the Grunt shell.
- For example, a script saved as sample_script.pig can be executed with the exec command.
- The kill command kills a MapReduce job from the Grunt shell; it takes the ID of the job to kill.
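For illustration, a kill call might look like the following. The job ID job_0001 is a made-up placeholder; in practice it would be the ID that Hadoop reports for the running job.

```pig
-- Kill the MapReduce job with the given (hypothetical) ID
grunt> kill job_0001
```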
- The run command runs a Pig script from the Grunt shell.
- For example, assume that we have a script file called sample_script.pig in the local file system.
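The original contents of sample_script.pig are not shown here, so the sketch below uses hypothetical contents to illustrate how run lets a script and the shell share aliases. The alias names and the input 'student' are assumptions.

```pig
-- sample_script.pig (hypothetical contents)
-- Uses alias a, which the script itself never defines
b = ORDER a BY name;
c = LIMIT b 10;

-- Grunt session: define a in the shell, then run the script
grunt> a = LOAD 'student' AS (name, age, gpa);
grunt> run sample_script.pig

-- Alias c, defined inside the script, is now usable from the shell
grunt> d = LIMIT c 3;
grunt> DUMP d;
```

With exec instead of run, the script would not see the alias a defined in the shell, and c would not be available to the shell afterwards.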