Apache Pig - TextLoader()



What is TextLoader() in Apache Pig ?

  • The Pig Latin function TextLoader() is a Load function which is used to load unstructured data in UTF-8 format.
  • Each resulting tuple contains a single field with one line of input text.
  • TextLoader also supports compression, but that support is currently limited.
  • TextLoader cannot be used to store data.

Syntax

TextLoader()

Example

  • Assume that there is a file named wikitechy_employee_data.txt in the HDFS directory /pig_data/, with the contents given below.
111,Anu,Shankar,23,9876543210,Chennai
112,Barvathi,Nambiayar,24,9876543211,Chennai
113,Kajal,Nayak,24,9876543212,Trivendram
114,Preethi,Antony,21,9876543213,Pune
115,Raj,Gopal,21,9876543214,Hyderabad
116,Yashika,Kannan,22,9876543215,Delhi
117,siddu,Narayanan,22,9876543216,Kolkata
118,Timple,Mohanthy,23,9876543217,Bhuwaneshwar
  • You can load the above file using the TextLoader() function.
grunt> details = LOAD 'hdfs://localhost:9000/pig_data/wikitechy_employee_data.txt' USING TextLoader();
  • Now verify the loaded data using the Dump operator.
grunt> dump details;
(111,Anu,Shankar,23,9876543210,Chennai)
(112,Barvathi,Nambiayar,24,9876543211,Chennai)
(113,Kajal,Nayak,24,9876543212,Trivendram)
(114,Preethi,Antony,21,9876543213,Pune)
(115,Raj,Gopal,21,9876543214,Hyderabad)
(116,Yashika,Kannan,22,9876543215,Delhi)
(117,siddu,Narayanan,22,9876543216,Kolkata)
(118,Timple,Mohanthy,23,9876543217,Bhuwaneshwar)
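
  • Because TextLoader() places each whole line into a single field, the comma-separated values are not yet separate columns. The following is a minimal sketch of one way to split them, using the built-in STRSPLIT and FLATTEN. The field name line, the column names (id, firstname, and so on), and the relation names employees and chennai are assumptions made for illustration; they are not defined by TextLoader() or by the sample file.

```pig
-- Load each input line as one chararray field (the name `line` is an assumption).
details = LOAD 'hdfs://localhost:9000/pig_data/wikitechy_employee_data.txt'
          USING TextLoader() AS (line:chararray);

-- Split every line on commas into at most 6 parts, then flatten the
-- resulting tuple into columns. The column names are hypothetical labels.
employees = FOREACH details GENERATE
    FLATTEN(STRSPLIT(line, ',', 6))
    AS (id:chararray, firstname:chararray, lastname:chararray,
        age:chararray, phone:chararray, city:chararray);

-- Example use: keep only the rows whose city is Chennai.
chennai = FILTER employees BY city == 'Chennai';
DUMP chennai;
```

  • For input that is already cleanly delimited, a delimiter-aware loader such as PigStorage(',') may be a simpler choice; TextLoader() is most useful when a line has to be inspected or cleaned as raw text first.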
