What should I know?
To follow this Hive query tutorial, you need a basic knowledge of SQL and Hadoop. Knowledge of other databases is an additional help.
Hive Course Syllabus
Introduction
Advanced Stuff
Introduction to Hive
Hive evolved as a data warehousing solution built on top of the Hadoop MapReduce framework. The size of the data sets being collected and analyzed in the industry for business intelligence is growing, which is making traditional data warehousing solutions more expensive. Hadoop, with its MapReduce framework, is being used as an alternative for analyzing very large data sets. Although Hadoop has proved useful for working on huge data sets, its MapReduce framework is very low level and requires programmers to write custom programs that are hard to maintain and reuse. Hive comes to the rescue of programmers here.
Hive comes with a command-line shell interface that can be used to create tables and execute queries. The Hive query language is similar to SQL and supports subqueries. With the Hive query language, it is possible to perform joins across Hive tables, which are executed as MapReduce jobs. It supports simple SQL-like functions such as CONCAT, SUBSTR and ROUND, and aggregation functions such as SUM, COUNT and MAX. It also supports the GROUP BY and SORT BY clauses. It is also possible to write user-defined functions for use in Hive queries.
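To make the "low level" point concrete, here is a minimal sketch in plain Python of the map, shuffle and reduce steps a programmer would have to spell out by hand for a simple grouped sum. It only imitates the MapReduce model in memory and does not use Hadoop. The sample rows, the table name employees and the column names dept and salary are assumptions made up for illustration. In Hive, the same result would be expressed as a single query along the lines of SELECT dept, SUM(salary) FROM employees GROUP BY dept;

```python
from collections import defaultdict

# Hypothetical sample rows (department, salary); not taken from the tutorial.
EMPLOYEES = [
    ("sales", 3000),
    ("engineering", 5000),
    ("sales", 2500),
    ("engineering", 4500),
    ("support", 2000),
]


def map_phase(rows):
    """Emit one (key, value) pair per input row."""
    for dept, salary in rows:
        yield dept, salary


def shuffle_phase(pairs):
    """Group all values that share a key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Collapse each key's list of values into one aggregate (here, a sum)."""
    return {key: sum(values) for key, values in grouped.items()}


def total_salary_by_dept(rows):
    """In-memory stand-in for: SELECT dept, SUM(salary) ... GROUP BY dept."""
    return reduce_phase(shuffle_phase(map_phase(rows)))


if __name__ == "__main__":
    for dept, total in sorted(total_salary_by_dept(EMPLOYEES).items()):
        print(dept, total)
```

Even for this small aggregation, three separate functions are needed, and a real MapReduce job would add job configuration and input/output handling on top. This is the kind of custom code that a Hive query is meant to replace.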
Hive vs MapReduce
Before choosing one of these two options, we must look at some of their features. The following factors are taken into consideration when choosing between Hive and MapReduce:
Type of Data
Amount of Data
Complexity of Code