What is the purpose of the @outputSchema decorator in a Python UDF in Apache Pig?

July 12, 2021

Specifying the UDF output schema

A UDF has an input and an output. The outputSchema decorator lets you declare the output format of a Python UDF in several ways.

Sample Code:

# the original udf
# it returns a single chararray (that's PigLatin for String)
@outputSchema('word:chararray')
def hi_world():
    return "hello world"
    
# this one returns a Python tuple. Pig recognises the first element
# of the tuple as a chararray like before, and the second one as a
# long (a 64-bit integer)
@outputSchema("word:chararray,number:long")
def hi_everyone():
    return "hi there", 15

# we can use outputSchema to define nested schemas too; here is a bag of tuples
@outputSchema('some_bag:bag{t:(field_1:chararray, field_2:int)}')
def bag_udf():
    return [
        ('hi',1000),
        ('there',2000),
        ('bill',0)
    ]

# and here is a map
@outputSchema('something_nice:map[]')
def my_map_maker():
    # every entry needs a key: value pair ("e": "f", not "e", "f")
    return {"a": "b", "c": "d", "e": "f"}
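
The sample above relies on Pig's runtime to supply `outputSchema`, so it may not import in a plain Python session. For local unit tests, a minimal stand-in decorator can be sketched as follows. This is an assumption for illustration only, not Pig's own implementation, and the `output_schema` attribute name is hypothetical.

```python
def outputSchema(schema):
    """Stand-in for Pig's decorator: record the schema string on the function."""
    def decorate(func):
        # Hypothetical attribute name, used only so tests can inspect the schema.
        func.output_schema = schema
        return func
    return decorate


@outputSchema("word:chararray,number:long")
def hi_everyone():
    return "hi there", 15


# The UDF body runs unchanged; the schema is just metadata attached to it.
print(hi_everyone.output_schema)
print(hi_everyone())
```

With a stand-in like this, the function logic can be tested on its own, while the real decorator is still what Pig uses when the script is registered.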

The outputSchema decorator declares that a function returns one basic type or a combination of basic types. Those types include:

chararray: like a string
bytearray: a bunch of bytes in a row. Like a string but not as human friendly
long: long integer
int: normal integer
double: floating point number
datetime
boolean
If no schema is specified, Pig assumes that the UDF outputs a bytearray.
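
To make the type list concrete, the sketch below pairs each Pig type with a rough Python 3 counterpart and checks a return value against a flat schema string. `PIG_TO_PYTHON` and `matches_schema` are hypothetical helpers for local testing, not part of Pig. The mapping is an assumption: Pig performs its own conversion at runtime, and the Python-side types may differ under Jython.

```python
import datetime

# Rough Python 3 counterparts of the Pig types listed above (an assumption
# for local checks only; Pig does its own conversion at runtime).
PIG_TO_PYTHON = {
    "chararray": str,
    "bytearray": (bytes, bytearray),
    "long": int,
    "int": int,
    "double": float,
    "datetime": datetime.datetime,
    "boolean": bool,
}


def matches_schema(value, schema):
    """Check a value against a flat schema such as 'word:chararray,number:long'."""
    # Split "name:type" pairs; nested bag/map schemas are not handled here.
    fields = [part.strip().split(":") for part in schema.split(",")]
    values = value if isinstance(value, tuple) else (value,)
    if len(values) != len(fields):
        return False
    return all(
        isinstance(v, PIG_TO_PYTHON[pig_type])
        for v, (_name, pig_type) in zip(values, fields)
    )


print(matches_schema("hello world", "word:chararray"))
print(matches_schema(("hi there", 15), "word:chararray,number:long"))
```

A check like this only catches obvious mismatches, such as returning a string where the schema declares a long. It does not replace running the UDF inside Pig.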
