What is Shark?

  • Shark is a tool developed for people from a database background, giving them access to Spark MLlib capabilities through a Hive-like SQL interface.
  • Shark helps data users run Hive on Spark, offering compatibility with the Hive metastore, queries, and data.
  • As in Hive, Shark queries are written in a SQL-like language called HiveQL, which Shark translates into Spark Directed Acyclic Graphs (DAGs) that are executed on the Hadoop cluster.
  • More complex queries are supported through User Defined Functions (UDFs), which can be written in Java and referenced from a HiveQL query.
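
To make the UDF point concrete, here is a minimal sketch of what the Java side of such a function might look like. The class name NormalizeCity and its behavior are hypothetical, chosen only for illustration. A real Hive-style UDF would typically extend org.apache.hadoop.hive.ql.exec.UDF and be packaged in a JAR; that dependency is left out here as a simplifying assumption so the class compiles on its own.

```java
import java.util.Locale;

// Sketch of a Hive-style UDF. Assumption: a real UDF would extend
// org.apache.hadoop.hive.ql.exec.UDF; the base class is omitted here
// so this example has no external dependencies.
final class NormalizeCity {

    // Classic Hive UDFs expose their logic through a method named evaluate().
    // Null input yields null, mirroring SQL NULL semantics.
    public String evaluate(String raw) {
        if (raw == null) {
            return null;
        }
        String trimmed = raw.trim();
        if (trimmed.isEmpty()) {
            return trimmed;
        }
        // Upper-case the first character and lower-case the rest.
        return trimmed.substring(0, 1).toUpperCase(Locale.ROOT)
                + trimmed.substring(1).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        NormalizeCity udf = new NormalizeCity();
        System.out.println(udf.evaluate("  sAN jose "));
    }
}
```

Once packaged, a function like this would usually be made visible to queries with HiveQL statements such as ADD JAR and CREATE TEMPORARY FUNCTION, and then called like any built-in function inside a SELECT. The exact JAR path and function name depend on your deployment.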