Apache Hive - User Defined Functions, User Defined Types, and User Defined Data Formats - Hive Tutorial




Hive User Defined Functions :

  • User Defined Function (UDF)
    ------- One-to-one row mapping
    ------- concat('foo', 'bar')
  • User Defined Aggregate Function (UDAF)
    ------- Many-to-one row mapping
    ------- sum(num_ads)
  • User Defined Table Function (UDTF)
    ------- One-to-many row mapping
    ------- explode([1, 2, 3])
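A minimal HiveQL sketch showing one of each kind, run against a purely illustrative table ads(campaign STRING, num_ads INT, tags ARRAY<STRING>):

            -- UDF: one output row per input row
            SELECT concat('foo', 'bar') FROM ads;

            -- UDAF: many input rows aggregated into one output row
            SELECT sum(num_ads) FROM ads;

            -- UDTF: one input row expanded into many output rows
            SELECT explode(tags) AS tag FROM ads;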

Hive User Defined Data Formats :

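As a minimal sketch of what a user defined data format looks like at the table level, a table can name its own InputFormat and OutputFormat classes; the com.example classes below are hypothetical placeholders, not classes that ship with Hive:

            -- MyInputFormat / MyOutputFormat are hypothetical placeholder classes
            CREATE TABLE custom_format_table (id INT, payload STRING)
            STORED AS
              INPUTFORMAT 'com.example.hive.MyInputFormat'
              OUTPUTFORMAT 'com.example.hive.MyOutputFormat';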

TABLE GENERATING FUNCTION EXAMPLE

  • Consider a table with a single column, x, which contains arrays of strings.

  •          CREATE TABLE arrays (x ARRAY<STRING>)
             ROW FORMAT DELIMITED
             FIELDS TERMINATED BY '\001'
             COLLECTION ITEMS TERMINATED BY '\002';

  • The example file, example.txt, has the following contents, where ^B is a representation of the Control-B character to make it suitable for printing:

  •          a^Bb
             c^Bd^Be
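The file can be loaded with a statement along these lines, assuming example.txt sits in the local working directory:

            hive> LOAD DATA LOCAL INPATH 'example.txt' OVERWRITE INTO TABLE arrays;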


  • After running this LOAD DATA command, the following query confirms that the data was loaded correctly:

  •          hive> SELECT * FROM arrays;
             [ "a", "b"]
             ["c", "d", "e"]


  • Next, we can use the explode UDTF to transform this table. This function emits a row for each entry in the array, so in this case the type of the output column y is STRING.
  • The result is that the table is flattened into five rows.

  •          hive> SELECT explode(x) AS y FROM arrays;
             a
             b
             c
             d
             e
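One restriction not shown above: Hive does not allow a UDTF such as explode to be combined with other expressions in the same SELECT list. The usual workaround is LATERAL VIEW, sketched here against the same arrays table:

            hive> SELECT x, y FROM arrays LATERAL VIEW explode(x) exploded AS y;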
  • EXAMPLE OF A UDF – a simple UDF to strip characters: Strip.

  •         package com.hadoopbook.hive;

            import org.apache.commons.lang.StringUtils;
            import org.apache.hadoop.hive.ql.exec.UDF;
            import org.apache.hadoop.io.Text;

            public class Strip extends UDF {
                private Text result = new Text();

                // Strip leading and trailing whitespace.
                public Text evaluate(Text str) {
                    if (str == null) {
                        return null;
                    }
                    result.set(StringUtils.strip(str.toString()));
                    return result;
                }

                // Strip any of the supplied characters from both ends of the string.
                public Text evaluate(Text str, String stripChars) {
                    if (str == null) {
                        return null;
                    }
                    result.set(StringUtils.strip(str.toString(), stripChars));
                    return result;
                }
            }

  • The evaluate() method is not defined by an interface, so it may take an arbitrary number of arguments, of arbitrary types, and it may return a value of arbitrary type.
  • To use the UDF in Hive, we need to package the compiled Java class in a JAR file and register the file with Hive.

  •         ADD JAR /path/to/hive/hive-example.jar;
  • We also need to create an alias for the Java classname.

  •         CREATE TEMPORARY FUNCTION strip AS 'com.hadoopbook.hive.Strip';


  • The UDF is now ready to be used, just like a built-in function:

  •         hive> SELECT strip(' bee ') FROM dummy;
            bee
            hive> SELECT strip('banana', 'ab') FROM dummy;
            nan

    IMPORTING DATA SETS

  • The easiest way to import a dataset from a relational database into Hive is to export the table to a CSV file. Once that is done, create a matching table in Hive:

  •         hive> CREATE TABLE shooting (
                    arhivesource string, text string, to_user_id string, from_user string,
                    id string, from_user_id string, iso_language_code string, source string,
                    profile_image_url string, geo_type string, geo_coordinates_0 double,
                    geo_coordinates_1 double, created_at string, time int, month int,
                    day int, year int)
                  ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

  • Next, load the data from the local file system:

  •         hive> LOAD DATA LOCAL INPATH '/dlrlhive/shooting/shooting.csv' INTO TABLE shooting;
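A quick sanity check, assuming the load above completed:

            hive> SELECT COUNT(*) FROM shooting;
            hive> SELECT * FROM shooting LIMIT 5;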

    Hive User Defined Types :


    Hive Reading Rich Data :

    This can be achieved using a SerDe (serializer/deserializer).

    • It is easy to write a SerDe for legacy data stored in your own format
      ---------------- You can write your own SerDe (XML, JSON, ...; a JSON example is sketched after this list)
    • Existing SerDe families
      ---------------- Thrift DDL-based SerDe
      ---------------- Delimited text-based SerDe
      ---------------- Dynamic SerDe to read data with delimited maps, lists, and primitive types
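As a minimal sketch of reusing an existing SerDe, assuming the JsonSerDe from Hive's HCatalog module is available on the classpath, a table over JSON text files can be declared like this:

            CREATE TABLE json_events (uid STRING, url STRING, ts BIGINT)
            ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
            STORED AS TEXTFILE;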

    Hive Interoperability – External Tables :

  • Data already in HDFS
  • Use external tables to associate Hive metadata with existing HDFS data, as sketched below
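A minimal sketch, assuming tab-delimited text files already sit under a hypothetical HDFS directory /data/logs; dropping an external table removes only the metadata, never the underlying files:

            CREATE EXTERNAL TABLE logs (host STRING, request STRING, bytes BIGINT)
            ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
            LOCATION '/data/logs';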

    Hive - Insert into Files, Tables and Local Files :

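A minimal sketch of the three INSERT targets named in the heading; the page_views table and the output paths are placeholders:

            -- into another Hive table
            INSERT OVERWRITE TABLE page_views_copy
            SELECT * FROM page_views;

            -- into an HDFS directory
            INSERT OVERWRITE DIRECTORY '/output/page_views'
            SELECT * FROM page_views;

            -- into a directory on the local file system
            INSERT OVERWRITE LOCAL DIRECTORY '/tmp/page_views'
            SELECT * FROM page_views;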

    Hive - Extensibility - Custom Map/Reduce Scripts - User defined map reduce scripts :

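As a minimal sketch of plugging a user defined script into a query with TRANSFORM, assuming a hypothetical my_reducer.py that reads tab-separated lines on stdin and writes tab-separated lines to stdout:

            ADD FILE /path/to/my_reducer.py;

            SELECT TRANSFORM (user_id, page)
            USING 'python my_reducer.py'
            AS (user_id, page_count)
            FROM page_views;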

