Wednesday, September 11, 2013

How to run Hive queries through the Hive Web Interface

One of the things I really like about Hadoop and its related projects is the WebUI they provide. It makes life a lot easier: just point your web browser to the appropriate URL and quickly perform the desired action, be it browsing through HDFS files or glancing over HBase tables. Otherwise you need to go to the shell and issue the associated commands one by one for each action [I know I'm a bit lazy ;)].

Hive is no exception and provides a WebUI called the Hive Web Interface, or HWI for short. Somehow I feel it is less documented and less talked about than the HDFS and HBase WebUIs, but that doesn't make it any less useful. In fact, I personally find it quite helpful. With it you can browse your DB schema, see your sessions, query your tables and so on. You can also see the System and User variables such as the Java runtime, your OS architecture, your PATH etc.

OK, enough brand building. Let's get started and see how to use HWI. The process is quite simple. First, a couple of things on configuration. The following are the properties you might have to modify as per your requirements :

  • hive.hwi.listen.host : The host address the Hive Web Interface will listen on.
  • hive.hwi.listen.port : The port the Hive Web Interface will listen on.
  • hive.hwi.war.file : This is the WAR file with the jsp content for Hive Web Interface.

The values for these properties are entirely your choice. I'll go ahead with the defaults.
You will probably want to set up HiveDerbyServerMode as well if you wish to allow multiple sessions at the same time.

Note : Make these changes in the hive-site.xml file inside your $HIVE_HOME/conf/ directory. Create it if you don't have it already. Please don't change anything in the default configuration file (hive-default.xml.template). This is important.
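
As a rough illustration, a minimal hive-site.xml that sets the HWI properties (hive.hwi.listen.host for the host, plus the port and WAR file) could look like the sketch below. The values are assumptions that simply mirror what the Hive 0.10.0 startup log in this post reports (listening on 0.0.0.0, port 9999, and the hive-hwi-0.10.0.war file under lib/). As far as I know the WAR path is resolved relative to $HIVE_HOME, so adjust the file name to the version you have installed.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Address HWI binds to; 0.0.0.0 means all interfaces (assumed default) -->
  <property>
    <name>hive.hwi.listen.host</name>
    <value>0.0.0.0</value>
  </property>
  <!-- Port HWI listens on; 9999 matches the startup log in this post -->
  <property>
    <name>hive.hwi.listen.port</name>
    <value>9999</value>
  </property>
  <!-- WAR file with the JSP content; assumed relative to $HIVE_HOME, version must match your install -->
  <property>
    <name>hive.hwi.war.file</name>
    <value>lib/hive-hwi-0.10.0.war</value>
  </property>
</configuration>
```

If you are happy with the defaults, as I am here, you can leave these properties out entirely.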

Now start HWI using the following command :
bin/hive --service hwi 

If everything goes fine you will see something like this on your terminal :
hive-0.10.0 miqbal1$ bin/hive --service hwi
13/09/11 00:21:46 INFO hwi.HWIServer: HWI is starting up
13/09/11 00:21:46 INFO mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
13/09/11 00:21:46 INFO mortbay.log: jetty-6.1.26
13/09/11 00:21:46 INFO mortbay.log: Extract /Users/miqbal1/hadoop-eco/hive-0.10.0/lib/hive-hwi-0.10.0.war to /var/folders/n3/d0ghj1ln2zl0kpd8zkz4zf04mdm1y2/T/Jetty_0_0_0_0_9999_hive.hwi.0.10.0.war__hwi__ae9cmk/webapp

13/09/11 00:21:46 INFO mortbay.log: Started SocketConnector@

You are good to go now. Point your web browser to HWI, for example http://localhost:9999/hwi/index.jsp in my case, since I'm working on my local machine with all the default configuration parameters. Use the hostname and port as per your setup. This will take you to the HWI front page.

You can click on Home if you wish to read a bit more about HWI, or on Authorize to authorize a user. If you want to browse through your DB schema, click on Browse Schema under the DATABASE section. Click on Diagnostics if you want to have a look at the various System and User variables on your box. All this is merely a matter of one click, so we will move on to the main part, querying Hive tables. Follow the steps below in order to do that :

  • Click on Create Session under SESSIONS section, enter some session name and hit Submit.

  • This will take you to the Manage Session screen. This is where all the action takes place. Scroll down to the Session Details section and enter a file name, say /Users/tariq/res.txt, in the Result File box. This is the file where the result of your query will be stored. If you expect the result to be very large you can just enter /dev/null there. Remember that the result file is local to the web server. Similarly, enter an error file if you wish.
  • Now come down to the Query box and write the query you want to execute.
  • Choose Yes or No for Silent Mode as per your wish. Select Yes for Start Query and hit Submit.

You should now be able to see the file /Users/tariq/res.txt containing the result of your query. You can also view the result by clicking on the View File option, which appears next to the Result File box once your query has completed successfully.

That is it. Hope it helps. Do let me know in case of any issue.
