The best way to learn Hive and the rest of the Hadoop puzzle pieces is to get one of the Hadoop distributions available nowadays, such as Hortonworks, Cloudera or MapR.
In this article I will use the Hortonworks Hadoop platform. It is easy to use, and Hortonworks has some great free online documentation that will get you on your way to learning Hadoop and all its components.
So, what is Hive?
Apache Hive is a data warehouse infrastructure that runs on top of Apache Hadoop and provides ad-hoc querying and analysis of large data sets. It provides a mechanism to project structure onto the data in Hadoop and to query that data using a SQL-like language called HiveQL (HQL).
We will assume you have downloaded and started the Hortonworks Sandbox on your local machine; if not, follow this link to see how it is done.
To start using Hive, you just need to run the hive command in the sandbox terminal. This opens the interactive Hive prompt, where the statements in the rest of this article are entered.
HiveQL is very similar to MySQL's SQL dialect. Note that I said similar, not exactly the same!
So let us go over some examples of HiveQL that are similar to MySQL SQL.
Metadata Commands
Create a database
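A minimal sketch of creating a database. The name demo_db is a made-up example, not something the sandbox ships with, and IF NOT EXISTS is only there so the statement can be re-run without an error.

```sql
-- demo_db is a hypothetical database name used for illustration
CREATE DATABASE IF NOT EXISTS demo_db;
```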
Show databases
Use a database
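Listing the databases works the same way as in MySQL. On a fresh install you should at least see the built-in default database; what else shows up depends on your sandbox.

```sql
-- List every database known to the Hive metastore
SHOW DATABASES;
```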
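Switching to a database, again as in MySQL. The name demo_db is a hypothetical example; replace it with a database that exists on your system.

```sql
-- Make demo_db (hypothetical name) the current database for later statements
USE demo_db;
```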
Create table in Hive
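A possible table definition, as a sketch. The employees table and its columns are invented for this example, and the ROW FORMAT clause assumes the data will be loaded from comma-separated text files, which may not match your data.

```sql
-- employees and its columns are hypothetical, chosen only for illustration
CREATE TABLE IF NOT EXISTS employees (
  id         INT,
  name       STRING,
  department STRING,
  salary     DOUBLE
)
-- Assumption: the underlying files are plain text with comma-separated fields
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
```

Note that Hive uses STRING where MySQL would typically use VARCHAR, one of the small differences between the two dialects.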
We will go over data types in a future tutorial.
Describe Table
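Describing a table lists its columns and their types. The table name employees is a hypothetical example.

```sql
-- Show column names and data types for the (hypothetical) employees table
DESCRIBE employees;
```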
DESCRIBE also accepts the FORMATTED and EXTENDED options, which will be covered in future tutorials.
List Tables in a Hive Database
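Listing tables, either in the current database or in a named one. The name demo_db is a hypothetical example.

```sql
-- Tables in the current database
SHOW TABLES;

-- Tables in a specific database (demo_db is a hypothetical name)
SHOW TABLES IN demo_db;
```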
Data Retrieval/Select Commands
Select Statement using * or column name
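Two simple selects, assuming a hypothetical employees table with name and salary columns. The LIMIT is optional, but handy on large tables.

```sql
-- All columns, first 10 rows only
SELECT * FROM employees LIMIT 10;

-- Only the columns you need (column names are hypothetical)
SELECT name, salary FROM employees;
```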
Select using predicates:
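Filtering rows with a WHERE clause, assuming the same hypothetical employees table. The threshold and the department value are arbitrary example values.

```sql
-- Rows matching a numeric and a string condition (values are made up)
SELECT name, salary
FROM employees
WHERE salary > 50000
  AND department = 'Sales';
```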
Select using sort
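Sorting with ORDER BY, sketched against the hypothetical employees table.

```sql
-- Highest salaries first (table and column names are hypothetical)
SELECT name, salary
FROM employees
ORDER BY salary DESC;
```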
Note that Hive creates a job on the cluster for this operation, so it takes longer to return than the simple selects.
Count/Aggregate, Group and order data in HiveQL
Run a count(*) in HiveQL
These are just a few of the most commonly used HiveQL commands that have close equivalents in MySQL's SQL.
Done with this tutorial? Jump into Hive Data Types to learn more.
For the full HiveQL manual, see the following link.
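A sketch of counting, grouping and ordering, assuming the hypothetical employees table with a department column. The alias headcount is also just an example name.

```sql
-- Total number of rows in the table
SELECT COUNT(*) FROM employees;

-- Rows per department, largest department first
SELECT department, COUNT(*) AS headcount
FROM employees
GROUP BY department
ORDER BY headcount DESC;
```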