Now, here is a treat for all you Hadoop and Ubuntu lovers. Last month, Canonical, the organization behind the Ubuntu operating system, partnered with MapR, one of the Hadoop heavyweights, to make Hadoop available as an integrated part of Ubuntu through its repositories. The two companies announced that MapR's M3 Edition for Apache Hadoop will be packaged and made available for download as an integrated part of the Ubuntu operating system. Canonical and MapR are also developing a Juju Charm that OpenStack and other customers can use to deploy MapR easily into their environments.
The free MapR M3 Edition includes HBase, Pig, Hive, Mahout, Cascading, Sqoop, Flume and other Hadoop-related components for unlimited production use. MapR M3 will be bundled with Ubuntu 12.04 LTS and 12.10 via the Ubuntu Partner Archive. MapR also announced that the source code for the component packages of the MapR Distribution for Apache Hadoop is now publicly available on GitHub.
MapR is the only distribution that enables Linux applications and commands to access data directly in the cluster via the NFS interface that is available with all MapR Editions. The MapR M5 and M7 Editions for Apache Hadoop, which provide enterprise-grade features for HBase and Hadoop such as mirroring, snapshots, NFS HA and data placement control, will also be certified for Ubuntu.
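To make the NFS point concrete, here is a minimal sketch of what "direct access" looks like from an ordinary program. It assumes the cluster has already been mounted over NFS at some local directory; the function name `summarize_directory` and the example path in the comment are hypothetical and not taken from MapR's documentation. The code uses only plain file operations, with no Hadoop client library involved.

```python
from pathlib import Path


def summarize_directory(mount_root):
    """Return {relative file path: line count} for every file under mount_root.

    mount_root is assumed to be a directory where the cluster's file system
    has been mounted over NFS. Nothing here is MapR-specific: the function
    works on any directory, which is exactly the point of an NFS interface.
    """
    root = Path(mount_root)
    summary = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Ordinary file I/O; undecodable bytes are replaced, not fatal.
            with path.open("r", encoding="utf-8", errors="replace") as handle:
                summary[path.relative_to(root).as_posix()] = sum(1 for _ in handle)
    return summary


# Example call (hypothetical mount point, adjust to your own setup):
# print(summarize_directory("/mnt/cluster/logs"))
```

The same idea applies to everyday commands such as `ls`, `grep` or `tail`: once the cluster is mounted, they see its data as regular files.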
With Hadoop integrated natively into Ubuntu, it is a lot easier to install it and get going. No more unnecessary downloads and wacky configuration steps. And the best part is the NFS interface available with MapR's distribution, which enables other Linux commands and applications to access the cluster data directly. Starting April 25, 2013, the Ubuntu/MapR package will be available on the official website through the Ubuntu Partner Archive for the 12.04 LTS and 12.10 releases of Ubuntu.
For more information, see the white paper "Ubuntu and Hadoop: the perfect match".