pig tutorial - apache pig tutorial - Where to contribute Apache Pig UDF ? - pig latin - apache pig - pig hadoop

In addition to its built-in functions, Apache Pig provides extensive support for User Defined Functions (UDFs).
With UDFs, we can define our own functions and use them in Pig scripts. UDF support is provided in six programming languages: Java, Jython, Python, JavaScript, Ruby and Groovy.
Java has complete UDF support, while the remaining languages have limited support. In Java, you can write UDFs for every part of the processing, including data load/store, column transformation, and aggregation.
Since Apache Pig itself is written in Java, UDFs written in Java generally run more efficiently than those written in the other languages.
Apache Pig also has a Java repository for UDFs named Piggybank. Through Piggybank, we can use Java UDFs written by other users and contribute our own.

Using Functions

This section shows how to use Piggybank functions in a Pig script. Note that Piggybank supports only Java functions at this time.
In brief, you either DEFINE a function to give it a short name, or call it by its full package name, as shown below.
The functions are currently distributed in source form. Users must check out the code and build the package themselves. No binary distributions or nightly builds are available at this time.

To build a jar file that contains all available user defined functions (UDFs), follow these steps:

Create a directory for the Pig source code: `mkdir pig`

cd into that directory: `cd pig`

Check out the Pig source code: `svn checkout http://svn.apache.org/repos/asf/pig/trunk/`

cd into the checked-out source tree (svn names it `trunk` by default): `cd trunk`

Build the project: `ant`

cd into the piggybank dir: `cd contrib/piggybank/java`

Build the piggybank: `ant`

You should now see a piggybank.jar file in that directory.
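
The build steps above can be collected into one shell function. This is only a sketch under a few assumptions: `svn` and `ant` are installed, the checkout is placed in a `trunk` subdirectory, and `RUN` is a hypothetical dry-run switch introduced here (it is not part of Pig or Ant). By default the function only prints the commands; setting `RUN` to an empty value executes them.

```shell
#!/bin/sh
# Sketch of the Piggybank build steps as a single function.
# SVN_URL is the repository URL given in the text.
# RUN is a hypothetical dry-run switch: when unset, every command is only
# printed with echo; when set to an empty string, the commands really run.
SVN_URL="http://svn.apache.org/repos/asf/pig/trunk/"

build_piggybank() {
    run="${RUN-echo}"
    $run mkdir -p pig &&                    # -p so the function can be re-run
    $run cd pig &&
    $run svn checkout "$SVN_URL" trunk &&   # explicit target directory: trunk
    $run cd trunk &&
    $run ant &&                             # build Pig itself first
    $run cd contrib/piggybank/java &&
    $run ant &&                             # then build piggybank.jar
    $run ls -l piggybank.jar                # confirm the jar was produced
}

# Dry run (prints the commands):   build_piggybank
# Real build:                      RUN= ; build_piggybank
```

In a real run the `cd` commands execute in the calling shell, so the shell ends up in the contrib/piggybank/java directory next to the jar. The chain stops at the first failing step because the commands are joined with `&&`.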

Make sure your classpath includes the Hadoop jars as well. The following worked for me on the Cloudera CDH2 / Hadoop AMIs:

pig_version=0.4.99.0+10   ; pig_dir=/usr/lib/pig ;
hadoop_version=0.20.1+152 ; hadoop_dir=/usr/lib/hadoop ;
export CLASSPATH=$CLASSPATH:${hadoop_dir}/hadoop-${hadoop_version}-core.jar:${hadoop_dir}/hadoop-${hadoop_version}-tools.jar:${hadoop_dir}/hadoop-${hadoop_version}-ant.jar:${hadoop_dir}/lib/commons-logging-1.0.4.jar:${pig_dir}/pig-${pig_version}-core.jar
export PIG_CONF_DIR=/path/to/mapred-site/and/core-site/pointing/to/your/cluster
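
Because the version strings and directories above differ between installations, it can help to check that every classpath entry really exists before running Pig. The helper below is a minimal sketch; the function name `missing_classpath_entries` is made up for this example and is not part of Pig or Hadoop.

```shell
#!/bin/sh
# Print every entry of a colon-separated classpath that does not exist on disk.
# The body runs in a subshell, so the IFS and globbing changes stay local.
missing_classpath_entries() (
    set -f      # check entries literally, without wildcard expansion
    IFS=:       # split the argument on colons
    for entry in $1; do
        if [ -n "$entry" ] && [ ! -e "$entry" ]; then
            printf '%s\n' "$entry"
        fi
    done
)

# Usage: missing_classpath_entries "$CLASSPATH"
```

If the call prints nothing, every entry was found. Any path it prints usually points to a version number or install directory that needs adjusting in the export lines.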

To obtain a javadoc description of the functions, run `ant javadoc` from the trunk/contrib/piggybank/java directory.

The documentation is generated in the trunk/contrib/piggybank/java/build/javadoc directory.

The top level packages correspond to the function type and currently are:

org.apache.pig.piggybank.comparison - for custom comparators used by the ORDER operator
org.apache.pig.piggybank.evaluation - for eval functions such as aggregates and column transformations
org.apache.pig.piggybank.filtering - for functions used in the FILTER operator
org.apache.pig.piggybank.grouping - for grouping functions
org.apache.pig.piggybank.storage - for load/store functions

(The exact package of the function can be seen in the javadocs or by navigating the source tree.)
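
Another way to find the exact package of a function is to list the classes inside the built jar. The sketch below assumes the JDK `jar` tool is on the PATH; the helper name `jar_entries_to_classes` is invented for this example. It converts the file paths printed by `jar tf` into fully qualified class names.

```shell
#!/bin/sh
# Read `jar tf` output on stdin and print fully qualified class names.
# Non-class entries and nested classes (names containing `$`) are skipped.
jar_entries_to_classes() {
    grep '\.class$' | grep -v '\$' | sed 's/\.class$//' | tr '/' '.'
}

# Usage: jar tf piggybank.jar | jar_entries_to_classes | grep '\.evaluation\.'
```

The printed names can be used directly in a Pig script, either in a function call or in a DEFINE statement.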

For example, to use the UPPER function by its full package name:

REGISTER /public/share/pig/contrib/piggybank/java/piggybank.jar ;
TweetsInaug  = FILTER Tweets BY org.apache.pig.piggybank.evaluation.string.UPPER(text) MATCHES '.*(INAUG|OBAMA|BIDEN|CHENEY|BUSH).*' ;
STORE TweetsInaug INTO 'meta/inaug/tweets_inaug' ;
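
The example above calls the function by its full package name. The other option, DEFINE, gives the function a short alias. The sketch below writes such a script to a file from the shell. The alias `PB_UPPER`, the helper name `write_define_example`, and the output file name are made up for illustration; the jar path and the `Tweets` relation are copied from the example above, and `Tweets` is assumed to have been loaded earlier in the script.

```shell
#!/bin/sh
# Write a small Pig script that aliases the Piggybank UPPER function with DEFINE.
# $1 is the path of the Pig script to create (hypothetical, chosen by the caller).
write_define_example() {
    cat > "$1" <<'EOF'
REGISTER /public/share/pig/contrib/piggybank/java/piggybank.jar ;
DEFINE PB_UPPER org.apache.pig.piggybank.evaluation.string.UPPER() ;
TweetsInaug = FILTER Tweets BY PB_UPPER(text) MATCHES '.*(INAUG|OBAMA|BIDEN|CHENEY|BUSH).*' ;
STORE TweetsInaug INTO 'meta/inaug/tweets_inaug' ;
EOF
}

# Usage: write_define_example upper_alias.pig
```

After the DEFINE line, `PB_UPPER(text)` behaves like the fully qualified call, which keeps longer scripts readable when the same function is used many times.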
