CloudFront: HOW TO CONFIGURE HBASE IN PSEUDO DISTRIBUTED MODE ON A SINGLE LINUX BOX

If you have successfully configured Hadoop on a single machine in pseudo-distributed mode and are looking for some help with running HBase on top of it, you may find this writeup useful. Please let me know if you face any issues.

Since you are able to use Hadoop, I am assuming you have all the pieces in place, so we'll start directly with the HBase configuration. Please follow the steps shown below:

1 - Download the HBase release from one of the mirrors using the link shown below. Then unpack it at some convenient location (I'll call this location HBASE_HOME from now on) -
http://apache.techartifact.com/mirror/hbase/
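
As a rough sketch of step 1, assuming you have already downloaded a release tarball by hand, the unpacking can be scripted. The helper name unpack_hbase and both of its arguments are made up for illustration, and the sketch assumes the tarball holds a single top-level folder:

```shell
# Hypothetical helper: unpack an HBase release tarball into a target
# directory and print the path of the unpacked top-level folder, which
# is what this post calls HBASE_HOME.
# Arguments: $1 = path to the tarball, $2 = directory to unpack into.
unpack_hbase() {
  local tarball="$1" dest="$2" top
  mkdir -p "$dest" || return 1
  tar -xzf "$tarball" -C "$dest" || return 1
  # Assumption: the first entry in the archive names the top-level folder.
  top="$(tar -tzf "$tarball" | head -n 1 | cut -d/ -f1)"
  printf '%s/%s\n' "$dest" "$top"
}
```

You could then capture the result with something like HBASE_HOME="$(unpack_hbase /path/to/your/tarball "$HOME")" and use that variable in the later steps.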

2 - Go to the conf directory inside the unpacked HBASE_HOME and make these changes:

- In the hbase-env.sh file, modify these lines as shown:
export JAVA_HOME=/usr/lib/jvm/java-6-sun
export HBASE_REGIONSERVERS=/PATH_TO_YOUR_HBASE_FOLDER/conf/regionservers
export HBASE_MANAGES_ZK=true

- In the hbase-site.xml file, add these properties inside the <configuration> element:
<property>
  <name>hbase.rootdir</name>
  <value>SAME VALUE AS YOUR fs.default.name IN core-site.xml/hbase</value>
</property>
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>localhost</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
</property>
<property>
  <name>hbase.zookeeper.property.dataDir</name>
  <value>/home/mohammad/hbase/zookeeper</value>
</property>
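
The hbase.rootdir value above is your fs.default.name with /hbase appended. As a minimal sketch, assuming core-site.xml keeps its <name> and <value> elements on separate lines as in the snippets here, that value can be read out with a small helper. The function name hbase_rootdir_from is made up for illustration, and this is a line-based match, not a real XML parser:

```shell
# Hypothetical helper: read fs.default.name from a Hadoop core-site.xml
# and print the matching hbase.rootdir value (the same URI plus /hbase).
# Argument: $1 = path to core-site.xml.
hbase_rootdir_from() {
  local core_site="$1" fs_uri
  # Find the fs.default.name property, then take the next <value> element.
  fs_uri="$(awk '
    /<name>fs\.default\.name<\/name>/ { found = 1 }
    found && /<value>/ {
      sub(/.*<value>/, ""); sub(/<\/value>.*/, ""); print; exit
    }' "$core_site")"
  # Fail if the property was not found.
  [ -n "$fs_uri" ] || return 1
  # Drop a trailing slash so the result never contains "//hbase".
  printf '%s/hbase\n' "${fs_uri%/}"
}
```

Run it as hbase_rootdir_from "$HADOOP_HOME/conf/core-site.xml" (adjust the path to wherever your core-site.xml lives) and paste the printed value into the hbase.rootdir property.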

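3 - Now copy the hadoop-core-*.jar from your HADOOP_HOME and commons-collections-3.2.1.jar from the HADOOP_HOME/lib folder into your HBASE_HOME/lib folder, so that HBase talks to HDFS using the same Hadoop version your cluster is running. If HBASE_HOME/lib already contains a different hadoop-core jar, remove it so that only one version ends up on the classpath.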
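
Step 3 can be sketched as a small shell function. The name copy_hadoop_jars is hypothetical, and you pass in your own HADOOP_HOME and HBASE_HOME; it only copies the two jars named above and reports any that it cannot find:

```shell
# Hypothetical helper for step 3: copy the Hadoop core jar and the
# commons-collections jar into HBASE_HOME/lib.
# Arguments: $1 = HADOOP_HOME, $2 = HBASE_HOME (both supplied by you).
copy_hadoop_jars() {
  local hadoop_home="$1" hbase_home="$2" jar missing=0
  for jar in "$hadoop_home"/hadoop-core-*.jar \
             "$hadoop_home"/lib/commons-collections-3.2.1.jar; do
    # An unmatched glob stays as literal text, so test for a real file.
    if [ -f "$jar" ]; then
      cp "$jar" "$hbase_home/lib/" || return 1
    else
      echo "missing: $jar" >&2
      missing=1
    fi
  done
  # Succeed only if every expected jar was found and copied.
  [ "$missing" -eq 0 ]
}
```

For example: copy_hadoop_jars "$HADOOP_HOME" "$HBASE_HOME". A non-zero exit status means at least one jar was not where the function expected it.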

4 - In your /etc/hosts file, change the address 127.0.1.1 to 127.0.0.1. Since this file is protected, you need to open it with root privileges. Use the command shown below to do that:
mohammad@ubuntu:~$ sudo gedit /etc/hosts
This will open the /etc/hosts file with root privileges. Modify it and save it.
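
If you prefer not to edit the file by hand, the same change can be sketched with sed. The function name fix_hosts_loopback is made up for illustration; it only prints the corrected contents and never touches the file it reads, so you can review the result before replacing anything:

```shell
# Hypothetical helper for step 4: print the contents of a hosts file with
# a leading 127.0.1.1 address changed to 127.0.0.1. It writes to standard
# output and does not modify the file it reads.
# Argument: $1 = path to the hosts file.
fix_hosts_loopback() {
  sed 's/^127\.0\.1\.1\([[:space:]]\)/127.0.0.1\1/' "$1"
}
```

One way to use it: run fix_hosts_loopback /etc/hosts > /tmp/hosts.new, check /tmp/hosts.new, and then copy it over /etc/hosts with sudo cp once you are happy with it.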

5 - Now go to the terminal, change the directory to your HBASE_HOME, and issue this command to start the HBase processes (your Hadoop processes must already be running):

mohammad@ubuntu:~/hbase-0.90.4$ bin/start-hbase.sh

If everything went well, you will see something like this on your terminal:

localhost: starting zookeeper, logging to /home/mohammad/hbase/logs/hbase-mohammad-zookeeper-ubuntu.out
starting master, logging to /home/mohammad/hbase/logs/hbase-mohammad-master-ubuntu.out
localhost: starting regionserver, logging to /home/mohammad/hbase/logs/hbase-mohammad-regionserver-ubuntu.out
mohammad@ubuntu:~/hbase-0.90.4$

NOTE: To verify that your HBase processes are running, run this in your terminal:

mohammad@ubuntu:~/hbase-0.90.4$ jps

This lists the Java processes currently running on the machine. If HBase is running fine, you will see something like this:

mohammad@ubuntu:~/hbase-0.90.4$ jps
4585 SecondaryNameNode
4055 NameNode
5674 HRegionServer
5439 HMaster
5388 HQuorumPeer
4939 TaskTracker
4326 DataNode
5852 Jps
4686 JobTracker
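
Rather than scanning the listing above by eye, you can check for the three HBase daemons with a small function. This is only a sketch: the name check_hbase_daemons is hypothetical, and it assumes jps prints one "PID Name" pair per line as shown above:

```shell
# Hypothetical helper: read jps output on standard input and report
# whether the three HBase daemons from the listing above are present.
check_hbase_daemons() {
  local listing name missing=0
  listing="$(cat)"
  for name in HMaster HRegionServer HQuorumPeer; do
    # Match the process name as the last field of a jps line.
    if printf '%s\n' "$listing" | grep -q "[[:space:]]$name\$"; then
      echo "ok: $name"
    else
      echo "missing: $name"
      missing=1
    fi
  done
  # Exit status is zero only when all three daemons were found.
  [ "$missing" -eq 0 ]
}
```

Use it as jps | check_hbase_daemons. The Hadoop daemons (NameNode, DataNode and so on) are deliberately not checked here, since they belong to your existing Hadoop setup.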

HBase also provides a web UI which you can visit to see if everything is OK. Point your web browser to this link:

http://localhost:60010

This will show you the HBase master's web page.

6 - Now go to the HBase shell and get yourself familiar with a few HBase commands. To do this, issue the following command:

mohammad@ubuntu:~/hbase-0.90.4$ bin/hbase shell

This command will take you to the HBase shell. You should now see this on your terminal screen:

mohammad@ubuntu:~/hbase-0.90.4$ bin/hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.90.4, r1150278, Sun Jul 24 15:53:29 PDT 2011

hbase(main):001:0>

EXAMPLE COMMANDS :

1 - List the existing tables :

hbase(main):001:0> list

2 - To create a new table called table1 having a column family cf :

hbase(main):001:0> create 'table1', 'cf'
0 row(s) in 1.9500 seconds

hbase(main):002:0> list
TABLE
demo
table1
test3
3 row(s) in 0.0310 seconds

hbase(main):003:0>
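
Once table1 exists, a natural next step is to write a cell and read it back. The sketch below prints a few standard HBase shell commands (put, get, scan) that you can either type at the prompt one by one or pipe into the shell. The function name hbase_demo_commands, the row key row1, the qualifier cf:a and the value value1 are all made up for illustration, and piping assumes that the shell in your HBase version reads commands from standard input:

```shell
# Hypothetical helper: print a short HBase shell session that writes one
# cell to table1 (created above with column family cf), reads it back,
# and scans the table. The row key, qualifier, and value are made up.
hbase_demo_commands() {
  cat <<'EOF'
put 'table1', 'row1', 'cf:a', 'value1'
get 'table1', 'row1'
scan 'table1'
exit
EOF
}
```

From your HBASE_HOME you could run hbase_demo_commands | bin/hbase shell. If that does not work with your version, just type the first three lines at the hbase(main) prompt.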

** For complete HBase documentation, please visit the HBase home page at http://hbase.apache.org/

7 comments:

Andlib Saif, July 1, 2012 at 12:55 PM
really a nice post..helped me a lot in configuring hbase on my machine

Unknown, January 4, 2013 at 2:23 PM
Hbase is the Hadoop database. Think of it as a distributed, scalable, big data store.

Unknown, March 29, 2013 at 6:31 PM
Tariq, you haven't mentioned command to list all tables

Unknown, March 29, 2013 at 6:33 PM
and Thanks this is very helpful...:)

Tariq, March 29, 2013 at 7:28 PM
You are always welcome Shiv..and thanks a lot for the pointer.

Unknown, April 1, 2013 at 9:56 PM
Hi Tariq,
Thanks a lot :)
Really, this post is very helpful.

Tariq, April 1, 2013 at 10:49 PM
you are always welcome saket :)

Thursday, June 21, 2012
