1- Github-YCSB page : https://github.com/brianfrankcooper/YCSB
2- The paper from ACM Symposium on Cloud Computing, "Benchmarking Cloud Serving Systems with YCSB" : http://research.yahoo.com/files/ycsb.pdf
So, let us get started...
Step1- Clone the YCSB git repository :
apache@hadoop:~$ git clone http://github.com/brianfrankcooper/YCSB.git
This will create a directory called YCSB inside your current directory. (It might take some time depending on the speed of your internet connection, so be patient.)
Step2- Go inside the newly created YCSB directory and then into its hbase directory. You will find an XML file named pom.xml there. Open this pom.xml file and edit it so that it looks like this :
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>com.yahoo.ycsb</groupId>
    <artifactId>root</artifactId>
    <version>0.1.4</version>
  </parent>
  <artifactId>hbase-binding</artifactId>
  <name>HBase DB Binding</name>
  <dependencies>
    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase</artifactId>
      <!--<version>${hbase.version}</version>-->
      <version>0.94.4</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-core</artifactId>
      <!--<version>1.0.0</version>-->
      <version>1.0.4</version>
    </dependency>
    <dependency>
      <groupId>com.yahoo.ycsb</groupId>
      <artifactId>core</artifactId>
      <version>${project.version}</version>
    </dependency>
  </dependencies>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-assembly-plugin</artifactId>
        <version>${maven.assembly.version}</version>
        <configuration>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
          <appendAssemblyId>false</appendAssemblyId>
        </configuration>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>single</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>
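Pay attention to the <version> lines of the hbase and hadoop-core dependencies, where the original values are commented out and replaced (these were the lines highlighted in red in the original post). These are the changes that you have to make in order to build YCSB without any problem for your specific version of HBase.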
NOTE : As of this writing I am using hadoop-1.0.4 and hbase-0.94.4, so I have specified these versions in the file shown above. You have to specify the versions that you are going to use.
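If you are not sure which versions you have installed, the jar file names in your installation usually carry them. Here is a minimal sketch, not something that ships with YCSB: jar_version is a hypothetical helper, and it assumes the jars are named like hbase-0.94.4.jar and hadoop-core-1.0.4.jar and that you pass the directory that actually holds them.

```shell
# Hypothetical helper (not part of YCSB or HBase).
# Print the version embedded in a jar file name such as hbase-0.94.4.jar.
# $1 = directory to search, $2 = artifact prefix (e.g. "hbase" or "hadoop-core").
jar_version() {
  local dir="$1" prefix="$2" jar name
  for jar in "$dir"/"$prefix"-[0-9]*.jar; do
    [ -e "$jar" ] || continue            # the glob matched nothing
    name="${jar##*/}"                    # strip the directory part
    case "$name" in
      *-tests.jar|*-sources.jar) continue ;;   # skip auxiliary jars
    esac
    name="${name#"$prefix"-}"            # strip "prefix-"
    echo "${name%.jar}"                  # strip ".jar", leaving the version
    return 0
  done
  return 1                               # no matching jar found
}

# Example calls (the directories are assumptions; adjust to your installation):
# jar_version "$HBASE_HOME" hbase
# jar_version "$HBASE_HOME/lib" hadoop-core
```

Whatever the helper prints is the value that would go into the corresponding <version> element of the pom.xml.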
Step3- Now, go back to your terminal and move inside the YCSB directory :
apache@hadoop:~$ cd YCSB
Step4- It's time to do the build now :
apache@hadoop:~/YCSB$ mvn clean package
This will start the build process, and Maven prints progress information as it goes. If everything goes fine, you will see something like this on your terminal :
NOTE: If multiple descriptors or descriptor-formats are provided for this project, the value of this file will be non-deterministic!
[WARNING] Replacing pre-existing project main-artifact file: /hadoop/projects/YCSB/voldemort/target/archive-tmp/voldemort-binding-0.1.4.jar
with assembly file: /hadoop/projects/YCSB/voldemort/target/voldemort-binding-0.1.4.jar
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building YCSB Release Distribution Builder 0.1.4
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- maven-clean-plugin:2.3:clean (default-clean) @ ycsb ---
[INFO]
[INFO] --- maven-checkstyle-plugin:2.6:checkstyle (validate) @ ycsb ---
[INFO]
[INFO] --- maven-assembly-plugin:2.2.1:single (default) @ ycsb ---
[INFO] Reading assembly descriptor: src/main/assembly/distribution.xml
[INFO] Processing sources for module project: com.yahoo.ycsb:core:jar:0.1.4
[INFO] Processing sources for module project: com.yahoo.ycsb:cassandra-binding:jar:0.1.4
[INFO] Processing sources for module project: com.yahoo.ycsb:hbase-binding:jar:0.1.4
[INFO] Processing sources for module project: com.yahoo.ycsb:hypertable-binding:jar:0.1.4
[INFO] Processing sources for module project: com.yahoo.ycsb:dynamodb-binding:jar:0.1.4
[INFO] Processing sources for module project: com.yahoo.ycsb:elasticsearch-binding:jar:0.1.4
[INFO] Processing sources for module project: com.yahoo.ycsb:infinispan-binding:jar:0.1.4
[INFO] Processing sources for module project: com.yahoo.ycsb:jdbc-binding:jar:0.1.4
[INFO] Processing sources for module project: com.yahoo.ycsb:mapkeeper-binding:jar:0.1.4
[INFO] Processing sources for module project: com.yahoo.ycsb:mongodb-binding:jar:0.1.4
[INFO] Processing sources for module project: com.yahoo.ycsb:orientdb-binding:jar:0.1.4
[INFO] Processing sources for module project: com.yahoo.ycsb:redis-binding:jar:0.1.4
[INFO] Processing sources for module project: com.yahoo.ycsb:voldemort-binding:jar:0.1.4
[INFO] Processing sources for module project: com.yahoo.ycsb:ycsb:pom:0.1.4
[INFO] Building tar : /hadoop/projects/YCSB/distribution/target/ycsb-0.1.4.tar.gz
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] YCSB Root ......................................... SUCCESS [1.940s]
[INFO] Core YCSB ......................................... SUCCESS [23.149s]
[INFO] Cassandra DB Binding .............................. SUCCESS [7.421s]
[INFO] HBase DB Binding .................................. SUCCESS [15.638s]
[INFO] Hypertable DB Binding ............................. SUCCESS [2.805s]
[INFO] DynamoDB DB Binding ............................... SUCCESS [3.451s]
[INFO] ElasticSearch Binding ............................. SUCCESS [8.123s]
[INFO] Infinispan DB Binding ............................. SUCCESS [2:27.468s]
[INFO] JDBC DB Binding ................................... SUCCESS [18.235s]
[INFO] Mapkeeper DB Binding .............................. SUCCESS [10.011s]
[INFO] Mongo DB Binding .................................. SUCCESS [4.874s]
[INFO] OrientDB Binding .................................. SUCCESS [19.702s]
[INFO] Redis DB Binding .................................. SUCCESS [3.960s]
[INFO] Voldemort DB Binding .............................. SUCCESS [14.181s]
[INFO] YCSB Release Distribution Builder ................. SUCCESS [7.076s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 4:48.305s
[INFO] Finished at: Mon Feb 04 01:13:00 IST 2013
[INFO] Final Memory: 107M/737M
[INFO] ------------------------------------------------------------------------
This shows that the build has been completed successfully and you are all set to go.
Step5- Step4 will create a directory named target inside your YCSB/distribution/ directory. You will find the YCSB tar file there, ycsb-0.1.4.tar.gz in my case. Copy this file to a location of your choice and extract it. This will give you the ycsb-0.1.4 directory, which contains everything you need to run the benchmark.
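The copy-and-extract part can be done in one go from the terminal. This is only a sketch: extract_ycsb is a hypothetical helper name, and both paths in the example call are assumptions that depend on where you cloned YCSB and where you want the benchmark to live.

```shell
# Hypothetical helper: extract a YCSB distribution tarball into a directory.
# $1 = path to the tarball, $2 = directory to extract into (created if missing).
extract_ycsb() {
  local tarball="$1" dest="$2"
  [ -f "$tarball" ] || { echo "no such file: $tarball" >&2; return 1; }
  mkdir -p "$dest" && tar -xzf "$tarball" -C "$dest"
}

# Example call (both paths are assumptions; adjust them to your setup):
# extract_ycsb ~/YCSB/distribution/target/ycsb-0.1.4.tar.gz ~/benchmarks
```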
Step6- Move inside the ycsb-0.1.4 directory, where you will find a directory called hbase-binding. Go inside hbase-binding and open the lib directory located there. Copy the following jars from your HBASE_HOME/lib into this lib directory :
1-slf4j-api-*.jar
2-slf4j-log4j12-*.jar
3-zookeeper-*.jar
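If you prefer to do the copy from the terminal, something along these lines should work. It is a sketch under assumptions: copy_binding_jars is a hypothetical helper, HBASE_HOME is assumed to point at your HBase installation, and the second argument is wherever your extracted hbase-binding directory lives.

```shell
# Hypothetical helper: copy the three jars listed above from HBase's lib
# directory into the lib directory of YCSB's hbase-binding.
# $1 = HBase home directory, $2 = path to the hbase-binding directory.
copy_binding_jars() {
  local hbase_home="$1" binding_dir="$2" pattern jar
  mkdir -p "$binding_dir/lib"
  for pattern in 'slf4j-api-*.jar' 'slf4j-log4j12-*.jar' 'zookeeper-*.jar'; do
    # $pattern is left unquoted on purpose so the shell expands the wildcard.
    for jar in "$hbase_home"/lib/$pattern; do
      [ -e "$jar" ] || { echo "missing in $hbase_home/lib: $pattern" >&2; return 1; }
      cp "$jar" "$binding_dir/lib/"
    done
  done
}

# Example call (paths are assumptions; adjust them to your setup):
# copy_binding_jars "$HBASE_HOME" ~/benchmarks/ycsb-0.1.4/hbase-binding
```

The helper stops with an error as soon as one of the three patterns matches nothing, so a missing jar shows up here rather than later as a class-loading error during the test.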
Step7- You will find another directory named conf inside hbase-binding. It contains an XML file named hbase-site.xml. Replace this hbase-site.xml file with the hbase-site.xml present in your HBASE_HOME/conf directory.
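One more thing to take care of before the load test: as a reader points out in the comments, the table has to exist first. YCSB writes to a table named usertable by default, and the column family has to match the columnfamily value passed on the ycsb command line (f1 in this post). The snippet below is a sketch: create_table_stmt is a hypothetical helper, and it assumes the hbase launcher script is on your PATH and that HBase is already running.

```shell
# Hypothetical helper: print the HBase shell statement that creates the table.
# $1 = table name (defaults to YCSB's default, "usertable"),
# $2 = column family (defaults to "f1", the value used in this post).
create_table_stmt() {
  printf "create '%s', '%s'\n" "${1:-usertable}" "${2:-f1}"
}

# Feed the statement to the HBase shell if it is available; otherwise just
# print it so it can be typed into 'hbase shell' by hand.
if command -v hbase >/dev/null 2>&1; then
  create_table_stmt usertable f1 | hbase shell
else
  echo "hbase not found on PATH; run this inside 'hbase shell':"
  create_table_stmt usertable f1
fi
```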
Step8- You are all set for testing your HBase now. Start the Hadoop and HBase processes and go inside ycsb-0.1.4. Now, issue the following command to load test your HBase deployment :
apache@hadoop:/ycsb-0.1.4$ bin/ycsb load hbase -P workloads/workloada -p columnfamily=f1 -p recordcount=1000000 -p threadcount=4 -s | tee -a workloada.dat
This will start the load test, and after some time it will give you the result summary. Do not be overwhelmed by the amount of information displayed on your terminal after this operation. For our convenience we have piped the ycsb command into the Linux tee command, which writes the entire output both to the terminal and to the file workloada.dat. You will find this file inside your ycsb-0.1.4 directory; it contains the same content as your terminal. You can extract useful insights from this file (or from your terminal), such as :
The overall runtime in milliseconds
Throughput, i.e. operations per second
Number of operations
Average latency, and so on
Here are some of the lines from my terminal :
[OVERALL], RunTime(ms), 73258.0
[OVERALL], Throughput(ops/sec), 13650.386305932458
[UPDATE], Operations, 4
[UPDATE], AverageLatency(us), 530564.25
[UPDATE], MinLatency(us), 65895
[UPDATE], MaxLatency(us), 1642179
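Since every summary line has the same shape, a section in square brackets, a metric name and a value separated by commas, it is easy to pull a single number out of workloada.dat. The following is a minimal sketch; ycsb_metric is a hypothetical helper, and it assumes the lines look exactly like the ones shown above.

```shell
# Hypothetical helper: print the value of one metric from YCSB summary output.
# $1 = output file, $2 = section without brackets (e.g. OVERALL),
# $3 = metric name exactly as printed (e.g. "Throughput(ops/sec)").
ycsb_metric() {
  awk -F', ' -v sec="[$2]" -v name="$3" \
    '$1 == sec && $2 == name { print $3 }' "$1"
}

# Example calls (assumes workloada.dat is in the current directory):
# ycsb_metric workloada.dat OVERALL 'Throughput(ops/sec)'
# ycsb_metric workloada.dat OVERALL 'RunTime(ms)'
```

Keep in mind that tee -a appends to the file, so if you have run the command several times the helper prints one value per run.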
I hope you found this post helpful. Stay connected for more :)
It is a great instruction, with all the details step by step. I started to set up YCSB yesterday and was looking for an instruction like this. I have to say this is my lucky day, to find this one which was posted just one day earlier. Thank you!
You are always welcome. I would like to hear from you whether it really worked for you or not.
May I ask a dumb question here? I get the following exception. HBase is running standalone with 1 server. Many thanks.
I ran step 8 with a little modification (recordcount=1000 and threadcount=4), but got an error. Here is the log:
-----------------------------
YCSB Client 0.1
Command line: -db com.yahoo.ycsb.db.HBaseClient -P workloads/workloada -p columnfamily=f1 -p recordcount=1000 -p threadcount=1 -s -load
com.yahoo.ycsb.DBException: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1000 actions: DoNotRetryIOException: 1000 times, servers with issues: localhost:60020,
at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:111)
at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
at com.yahoo.ycsb.ClientThread.run(Client.java:307)
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1000 actions: DoNotRetryIOException: 1000 times, servers with issues: localhost:60020,
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1591)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1367)
at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:945)
at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:106)
... 2 more
[OVERALL], RunTime(ms), 878.0
[OVERALL], Throughput(ops/sec), 1138.9521640091116
[INSERT], Operations, 1000
[INSERT], AverageLatency(us), 652.439
[INSERT], MinLatency(us), 51
[INSERT], MaxLatency(us), 489779
[INSERT], 95thPercentileLatency(ms), 0
[INSERT], 99thPercentileLatency(ms), 0
[INSERT], Return=0, 1000
[INSERT], 0, 995
[INSERT], 1, 1
[INSERT], 2, 1
[INSERT], 3, 1
[INSERT], 4, 1
[INSERT], 5, 0
....
....
....
[INSERT], 997, 0
[INSERT], 998, 0
[INSERT], 999, 0
[INSERT], >1000, 0
--------------------------------------------
add this property to your hbase-site.xml file and restart your hbase :
<property>
  <name>zookeeper.session.timeout</name>
  <value>1800000</value>
  <description>Session Time out.</description>
</property>
and see if it works for you.
Mohammad,
Many thanks. I made the change, but it doesn't work; the output is pretty much the same.
I dug a bit more. It seems something is wrong with my HBase settings. I put 'http://localhost:60020/' in the browser and got the following
----------------------
org.apache.hadoop.ipc.RPC$VersionMismatch: Server IPC version 3 cannot communicate with client version 47
----------------------
btw, localhost:60010 looks fine as the browser will show the master cluster
here is the hbase_site.xml setting for the port:
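<property>
  <name>hbase.master.port</name>
  <value>60000</value>
</property>
<property>
  <name>hbase.master.info.port</name>
  <value>60010</value>
</property>
<property>
  <name>hbase.regionserver.port</name>
  <value>60020</value>
</property>
<property>
  <name>hbase.regionserver.info.port</name>
  <value>60030</value>
</property>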
could you please show me your hbase-site.xml. dontariq@gmail.com is my email address.
Cool post.
hello, there. It is great! But I just wondered, can you also write such instructions for voldemort? I mean 'How to configure YCSB to beat voldemort'.
thanks!
We need to create the usertable also before executing the ycsb command.