Apache Hive - Hive Sort By vs Order By



  • Hive's SORT BY and ORDER BY clauses both return data in sorted order. The main differences between them are given below.

Sort by

  • Hive uses the columns in SORT BY to sort the rows before feeding the rows to a reducer.
  • The sort order will be dependent on the column types. If the column is of numeric type, then the sort order is also in numeric order.
  • If the column is of string type, then the sort order will be lexicographical order.
hive> SELECT E.EMP_ID FROM Employee E SORT BY E.EMP_ID;
  • May use multiple reducers for final output.
  • Only guarantees ordering of rows within a reducer.
  • May give partially ordered result.

Ordering : Data is ordered within each of the N reducers, but the reducers can hold overlapping ranges of data.

  • Outcome : N or more sorted files with overlapping ranges.
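The per-reducer behaviour described above can be observed by fixing the reducer count and writing the result to a directory. This is a minimal sketch, assuming the MapReduce execution engine; the reducer count of 3 and the output path are illustrative choices, and the Employee table and EMP_ID column are the ones used in the example above.

```sql
-- Assumption: Hive running on MapReduce. Older releases use the
-- property name mapred.reduce.tasks instead.
SET mapreduce.job.reduces=3;

-- Assumption: /tmp/employee_sort_by is a hypothetical output path.
INSERT OVERWRITE DIRECTORY '/tmp/employee_sort_by'
SELECT E.EMP_ID
FROM Employee E
SORT BY E.EMP_ID;

-- Each reducer typically writes its own output file. Every file is
-- sorted internally, but the ranges of EMP_ID in different files
-- may overlap, so concatenating them does not give a total order.
```

Inspecting the files in the output directory should show several individually sorted files rather than one globally sorted result.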


Order by

  • This is similar to ORDER BY in standard SQL.
  • In Hive, ORDER BY guarantees total ordering of the data, but to do so all rows must pass through a single reducer, which is normally unacceptable for large outputs. In strict mode, Hive therefore makes it compulsory to use LIMIT with ORDER BY so that the reducer doesn't get overburdened.
hive> SELECT E.EMP_ID FROM Employee E ORDER BY E.EMP_ID;
    
  • Uses a single reducer to guarantee total order in the output.
  • LIMIT can be used to reduce sort time.

Ordering : Totally ordered data.

  • Outcome : A single, fully ordered output.
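The strict-mode rule mentioned above can be sketched as follows. This assumes a release where strict mode is switched on through hive.mapred.mode; newer releases replace that switch with finer-grained hive.strict.checks.* properties, so check the configuration reference for your version. The LIMIT value of 100 is an arbitrary illustration.

```sql
-- Assumption: hive.mapred.mode controls strict mode on this release.
SET hive.mapred.mode=strict;

-- In strict mode, ORDER BY must be paired with LIMIT, which bounds
-- the work done by the single reducer.
SELECT E.EMP_ID
FROM Employee E
ORDER BY E.EMP_ID
LIMIT 100;
```

With strict mode enabled, the same query without the LIMIT clause is expected to be rejected at compile time instead of being run.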

