What is PigStorage() in Apache Pig?
- The PigStorage() function loads and stores data as structured text files.
- It takes one parameter: the delimiter that separates the fields of each tuple.
- If no delimiter is given, it defaults to the tab character (‘\t’).
- Suppose a file named wikitechy_employee_data.txt is stored in the HDFS directory /data/, with the values of each record separated by commas.
- We can load this file with a LOAD statement that names PigStorage() in its USING clause.
- Because the values of each record are separated by commas, the comma (‘,’) is passed to PigStorage() as the delimiter.
- In the same way, we can use the PigStorage() function in a STORE statement to write the data to an HDFS directory.
- The STORE statement writes the data into the given directory, and you can verify the result in two steps.
- First, list the files in the directory named pig_output using the ls command.
- The listing shows that two files were created by the STORE statement.
- Then, use the cat command to print the contents of the file named part-m-00000.
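
The load, store, and verify steps above can be sketched as a single Grunt shell session. This is a minimal sketch, not the tutorial's original listing. The relation name employee and the field names and types in the AS clause are hypothetical, since the contents of the file are not shown here. The relative path pig_output is assumed to resolve against the current user's HDFS home directory.

```pig
-- Load the comma-delimited file from HDFS.
-- The schema in the AS clause is a hypothetical example; adjust it to the real file.
employee = LOAD '/data/wikitechy_employee_data.txt'
           USING PigStorage(',')
           AS (id:int, name:chararray, city:chararray);

-- Store the relation back to HDFS, again using a comma as the delimiter.
-- The target directory must not already exist, or the job fails.
STORE employee INTO 'pig_output' USING PigStorage(',');

-- Verify the result from the Grunt shell using the HDFS file system commands.
fs -ls pig_output
fs -cat pig_output/part-m-00000
```

Calling PigStorage() with no argument in either statement would read or write tab-delimited text instead. The part file name can differ (for example part-r-00000) when the job includes a reduce phase.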