Thursday, October 25, 2012


HDInsight is Microsoft’s 100% Apache-compatible Hadoop distribution, supported by Microsoft. Available both on Windows Server and as a Windows Azure service, HDInsight lets organizations gain new insights from previously untouched unstructured data, while connecting to the most widely used Business Intelligence (BI) tools on the planet. In this post we'll jump directly into the hands-on part. If you want to know more about HDInsight, you can visit my other post here.

NOTE : OS used - Windows 7

So let's get started.

First of all, go to the Microsoft Big Data page and click on the Download HDInsight Server link (shown in the blue ellipse). You will see something like this :

Once you click the link it will guide you to the Download Center. Now, go to the Instructions heading and click on Microsoft Web Platform Installer.

This will automatically download and install everything that is required.

Once the installation is over, open the Microsoft Web Platform Installer and go to the Search Box in the top right corner of its UI. Type Hadoop in there. This will show you the Microsoft HDInsight for Windows Server Community Technology Preview entry. Select it and click on Install. And you are done.

NOTE : It may take some time to install all the necessary components depending upon your connection speed.

Once the HDInsight installation completes successfully, you will find the Hadoop Command Line icon on your desktop. You will also find a brand new directory named Hadoop inside your C drive. This indicates that everything went OK and you are good to go.


It's time now to test HDInsight.

Step1. Go to the C:\Hadoop\hadoop-1.1.0-SNAPSHOT\bin directory :
c:\>cd Hadoop\hadoop-1.1.0-SNAPSHOT\bin

Step2. Now, start the daemons using start_daemons.cmd :

It will show you something like this on your terminal :

This means that your Hadoop processes have been started successfully and you are all set.
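If you'd like to script Step1 and Step2 instead of typing them, here is a minimal Python sketch. It assumes the install path shown above and simply calls start_daemons.cmd through cmd /c. The function name start_daemons is made up for illustration, and the script only prints a message if the file isn't where it expects it.

```python
import subprocess
from pathlib import Path, PureWindowsPath

# Directory and script name are taken from Step1 and Step2 above.
BIN_DIR = PureWindowsPath(r"C:\Hadoop\hadoop-1.1.0-SNAPSHOT\bin")
SCRIPT = BIN_DIR / "start_daemons.cmd"


def start_daemons():
    """Run start_daemons.cmd from its own directory (hypothetical helper)."""
    if not Path(str(SCRIPT)).is_file():
        # Nothing to run, e.g. a different install path or a non-Windows machine.
        print("not found:", SCRIPT)
        return None
    # cmd /c runs the batch file and hands back its exit code.
    result = subprocess.run(["cmd", "/c", str(SCRIPT)], cwd=str(BIN_DIR))
    return result.returncode


if __name__ == "__main__":
    start_daemons()
```

If your Hadoop version directory has a different name, change BIN_DIR accordingly.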

Let us use a few of the Hadoop commands to get ourselves familiar with Hadoop.

1. List all the directories, sub-directories and files present in HDFS. We do it using fs -ls :

2. Create a new directory inside HDFS. We use fs -mkdir to do that :
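The two commands above can also be driven from a script. Here is a minimal Python sketch, assuming the hadoop launcher is on your PATH and the daemons are already running. The helper names hadoop_fs and run, and the path /user/demo, are made up for illustration and are not part of HDInsight.

```python
import shutil
import subprocess


def hadoop_fs(*args):
    """Build the argument list for a `hadoop fs` command (hypothetical helper)."""
    return ["hadoop", "fs", *args]


def run(cmd):
    """Run the command if the hadoop launcher is found; otherwise just print it."""
    launcher = shutil.which(cmd[0])
    if launcher is None:
        print("hadoop not found on PATH, would run:", " ".join(cmd))
        return None
    # check=False: return the exit code instead of raising on failure.
    return subprocess.run([launcher, *cmd[1:]], check=False).returncode


if __name__ == "__main__":
    run(hadoop_fs("-ls", "/"))              # same as typing: hadoop fs -ls /
    run(hadoop_fs("-mkdir", "/user/demo"))  # /user/demo is an example path
    run(hadoop_fs("-ls", "/user"))          # list again to see the new directory
```

Typing the same commands in the Hadoop Command Line window works just as well; the script is only handy when you want to repeat them.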

You should be familiar with the Hadoop shell by now. But I would suggest that you go to the official Hadoop page and try more commands in order to get a good grip. HAPPY HADOOPING..!!
