Friday, January 13, 2017

HIVE 2.1.1 INSTALLATION IN HADOOP 2.7.3 IN UBUNTU 16

Hello Friends,


Welcome to the blog, where I am going to take you through the installation procedure for Hive 2.1.1 on Hadoop 2.7.3 in Ubuntu 16.

The recent release of Hive is quite different from the previous ones, and why shouldn't it be? We will go through its working mechanism some other time; for now, let's get on with the installation.

HIVE VERSION  - WE ARE USING - HIVE-2.1.1

STEP 1:- Download the HIVE 2.1.1 version:-


You will find the Hive 2.1.1 release at the link below; download the apache-hive-2.1.1-bin.tar.gz file onto your desktop:-
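Alternatively, the release can be fetched directly from the terminal via the Apache archive (assuming the archive still hosts this exact URL):

wget https://archive.apache.org/dist/hive/hive-2.1.1/apache-hive-2.1.1-bin.tar.gz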


STEP 2:- Copy and place the tar file in the /usr/local folder:- 

(You can also place it in the hadoop folder to avoid ownership problems)

Open the Terminal and type the following command:

sudo cp <Apache hive file> /usr/local/

sudo cp apache-hive-2.1.1-bin.tar.gz /usr/local/ 

(This will copy the Hive tar file to the /usr/local folder)

STEP 3:- Untar the Hive tar file:-


Give the following command to untar the tar file in the same folder:

sudo tar -xzvf <Apache Hive tar file>

sudo tar -xzvf apache-hive-2.1.1-bin.tar.gz
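To quickly confirm the extraction worked, you can list the new folder; it should print the directory name without errors:

ls -d /usr/local/apache-hive-2.1.1-bin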

STEP 4:- Create a soft link to the extracted folder:-


Type the following in the terminal to create the soft link:

sudo ln -s <Apache Hive Folder> <New folder>

sudo ln -s apache-hive-2.1.1-bin/ hive

(This will create a link named hive that points to the apache-hive-2.1.1-bin folder, so everything in it is reachable through the hive path)
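A quick way to verify the link is to list it; the arrow in the output shows where it points:

ls -l /usr/local/hive

(Expected output is something like: hive -> apache-hive-2.1.1-bin/)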

STEP 5:- Change ownership of both folders:- 

(Skip this step if you placed it under the hadoop folder)

Change the ownership of both folders to the user with the following command:-

sudo chown -R <User:User> <Folder name>

sudo chown -R hadoop:hadoop hive
sudo chown -R hadoop:hadoop apache-hive-2.1.1-bin/
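To verify the new ownership (assuming your user is hadoop, as above), you can check the extracted folder:

ls -ld /usr/local/apache-hive-2.1.1-bin

(The line should show hadoop hadoop as the owner and group)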

STEP 6:- Edit the .bashrc file to include HIVE_HOME & PATH:


Open the .bashrc file with the following command:

nano ~/.bashrc

Then edit and add the following at the end:-

export HIVE_HOME=/usr/local/hive
export PATH=$PATH:$HIVE_HOME/bin

(press Ctrl+X, then Y, then Enter to save and exit)

Then you need to source .bashrc for the changes to take effect, with the following command:

source ~/.bashrc
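To verify the environment is picked up, these two checks should print the Hive home path and the Hive version respectively (hive --version assumes your Hadoop environment, e.g. HADOOP_HOME, is already configured from your Hadoop install):

echo $HIVE_HOME
hive --version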

STEP 7:-  Start Hadoop daemons:


Hive works on top of Hadoop, so the Hadoop daemons must be up and running. Start them with the following commands:-

start-dfs.sh      (To start storage daemons)
start-yarn.sh    (To start processing daemons)

start-all.sh        (Deprecated command to start all daemons)

Type the jps command to confirm all daemons are up and running.
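On a single-node setup, the jps output should look roughly like the below (the process IDs are just examples and will differ on your machine):

jps
2945 NameNode
3120 DataNode
3334 SecondaryNameNode
3510 ResourceManager
3689 NodeManager
3912 Jps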

STEP 8:- Create the temporary directory and warehouse in HDFS with proper permissions:


Hive uses a temporary directory and warehouse, so we have to create them on HDFS.

Give the following commands to create the temporary directory & warehouse:

hdfs dfs -mkdir -p /user/hive/warehouse
hdfs dfs -mkdir -p /tmp/hive

Now set the permissions with the following commands:-

hdfs dfs -chmod 777 /user/hive/warehouse
hdfs dfs -chmod 777 /tmp/
hdfs dfs -chmod 777 /tmp/hive
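You can verify the directories and their permissions like this; the hive folders should show drwxrwxrwx after the chmod 777 above:

hdfs dfs -ls /user/hive
hdfs dfs -ls /tmp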

STEP 9:- Delete the obsolete log4j-slf4j-impl jar provided by Hive, as we will use the one from Hadoop:-  


Since having multiple SLF4J binding jars on the classpath causes errors, we have to remove Hive's copy from its lib directory with the following commands.

Go to the hive lib folder:-

cd /usr/local/hive/lib

rm log4j-slf4j-impl-2.4.1.jar
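If your Hive build ships a slightly different version of this jar, you can locate and remove it with a wildcard instead (a safe sketch, run from the lib folder as above):

ls log4j-slf4j-impl-*.jar
rm log4j-slf4j-impl-*.jar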


STEP 10:- Initialize the database to be used with Hive:- 



Hive 2.1.1 installation is a bit tricky here: in versions before 2.x the default database was Derby and it was initialized automatically after installation, but here we have to do the same manually with the following steps:-

1. Before you run Hive for the first time, remove any previous metastore information:

Go to the hive bin directory and run the command below:
mv metastore_db metastore_db.tmp

2. Now run the schematool command:

Open a new terminal and give the following command:-

schematool -initSchema -dbType derby

(If successful, it will report that the schema tool completed; otherwise an error such as "schematool: command not found" will be shown.) 
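For reference, a successful run ends with output roughly like the below (details may vary slightly by build):

Starting metastore schema initialization to 2.1.0
Initialization script hive-schema-2.1.0.derby.sql
Initialization script completed
schemaTool completed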

STEP 11:- Start hive and enjoy:


To start hive give the following command:-

hive
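Once the hive> prompt appears, a quick sanity check is to list the databases and create a throwaway table (the demo table name here is just an example):

show databases;
create table demo (id int, name string);
show tables;
drop table demo;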



NOTE: If you change directories, the metastore_db created above won't be found! With the embedded Derby metastore, the metastore_db folder is created in whichever directory you run Hive from for the first time.
So if that happens, delete that metastore_db directory and the derby.log file and repeat Step 10 to create a new schema, as shown in the sketch below.
Be aware, this will remove your schema completely!
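A minimal sketch of that cleanup, run from the directory where you started Hive (assuming the embedded Derby setup from Step 10):

rm -rf metastore_db derby.log
schematool -initSchema -dbType derby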

Hope you all understood the procedures... 

Please do notify me of any corrections...
Kindly leave a comment for any queries/clarification...

ALL THE BEST..
