Installing Hadoop on Ubuntu

Hadoop is developed in Java and aims to run mainly on Linux.
In this tutorial I will demonstrate how to install and run Hadoop on Ubuntu 12.04. The setup will run Hadoop as a single-node cluster.

As a prerequisite, Java JDK 6 needs to be installed:
$ sudo apt-get install openjdk-6-jdk

I – Activate SSH without password

The Hadoop master node remotely controls its sub-nodes using SSH.
In a single-node cluster, the master and the sub-nodes run on the same machine, but Hadoop is not aware of that.
It still communicates with them over SSH in exactly the same way.

Install the SSH server:
$ sudo apt-get install openssh-server

Create an SSH key without a passphrase:
$ cd ~
$ ssh-keygen -t rsa -P ""

Set the key as a trusted key for remote login:
$ cat .ssh/ >> .ssh/authorized_keys

Try to connect to localhost and accept the connection (mandatory):
$ ssh localhost
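
If the login still asks for a password, a common cause is file permissions: with its default StrictModes setting, sshd may ignore authorized_keys when the directory or the file is writable by other users. The sketch below is only an illustration; the function names secure_ssh_dir and check_passwordless_ssh are my own and not part of Hadoop or OpenSSH.

```shell
# Tighten permissions on an SSH directory so sshd accepts the key.
# Usage: secure_ssh_dir [dir]   (dir defaults to ~/.ssh)
secure_ssh_dir() {
    dir="${1:-$HOME/.ssh}"
    chmod 700 "$dir" || return 1
    # authorized_keys may not exist yet; only chmod it when present.
    if [ -f "$dir/authorized_keys" ]; then
        chmod 600 "$dir/authorized_keys" || return 1
    fi
}

# Return 0 only if a key-based login works without any prompt.
# BatchMode=yes makes ssh fail instead of asking for a password.
# Usage: check_passwordless_ssh [host]   (host defaults to localhost)
check_passwordless_ssh() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "${1:-localhost}" true
}
```

Because BatchMode also refuses to ask about an unknown host key, check_passwordless_ssh only succeeds after the interactive ssh localhost above has been accepted once.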

II – Install Hadoop

Download Hadoop:
$ wget

Extract the archive and rename the extracted directory to hadoop:
$ tar xvf hadoop-1.2.1-bin.tar.gz
$ mv hadoop-1.2.1 hadoop

Update the $PATH to use Hadoop from the command line:
$ vim.tiny ~/.bashrc

Add at the end of the file:

export HADOOP_HOME=~/hadoop
export PATH=$PATH:$HADOOP_HOME/bin

Close and re-open your console so the $PATH gets updated.
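
If you prefer to script this step instead of editing the file by hand, here is a minimal sketch. The helper append_once is a hypothetical name of mine; it only adds a line when that exact line is not already in the file, so running it twice does not duplicate the exports. The PATH line in the example is my assumption about what should end up in ~/.bashrc.

```shell
# Append a line to a file only if that exact line is not already there.
# Usage: append_once FILE LINE
append_once() {
    file="$1"
    line="$2"
    touch "$file"
    # -q quiet, -x whole-line match, -F fixed string (no regex)
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example (commented out so that sourcing this snippet changes nothing):
# append_once ~/.bashrc 'export HADOOP_HOME=~/hadoop'
# append_once ~/.bashrc 'export PATH=$PATH:$HADOOP_HOME/bin'
```

The single quotes in the example matter: they keep $PATH and $HADOOP_HOME unexpanded, so they are resolved each time a new shell reads the file.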

III – Configure Hadoop for single-node

Locate the path where Java JDK 6 is installed.
On 64-bit Ubuntu 12.04 it is in /usr/lib/jvm/java-6-openjdk-amd64/

$ vim.tiny ~/hadoop/conf/

Find the line:
# export JAVA_HOME=/usr/lib/j2sdk1.5-sun

and replace it with the following (note that the leading # is removed to uncomment the line):
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64/
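
If your JDK lives somewhere else, the path can usually be derived from the javac binary, since /usr/bin/javac on Ubuntu is a chain of symlinks ending inside the JDK directory. This is a rough sketch under that assumption; java_home_from_binary and detect_java_home are names I made up for illustration.

```shell
# Strip the trailing /bin/<name> from a resolved binary path.
# /usr/lib/jvm/X/bin/javac  ->  /usr/lib/jvm/X
java_home_from_binary() {
    bin_dir=$(dirname "$1")      # .../bin
    dirname "$bin_dir"           # JDK root
}

# Follow the symlinks behind javac and print the JDK root.
# Returns 1 (and prints nothing) when javac is not on the PATH.
detect_java_home() {
    javac_path=$(command -v javac) || return 1
    java_home_from_binary "$(readlink -f "$javac_path")"
}

# Usage:
# detect_java_home
```

The printed directory is the value to use for JAVA_HOME.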

$ vim.tiny ~/hadoop/conf/mapred-site.xml


$ vim.tiny ~/hadoop/conf/hdfs-site.xml


$ vim.tiny ~/hadoop/conf/core-site.xml
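
The contents of these three files are not reproduced here, so the following is only a sketch of what a typical Hadoop 1.x single-node configuration might look like. The property names (fs.default.name, hadoop.tmp.dir, dfs.replication, mapred.job.tracker) are standard Hadoop 1.x properties, but the values are my assumptions: ports 9000 and 9001 are merely conventional, a replication factor of 1 fits a single machine, and the data directory points at the hadoop-hdfs directory used in this tutorial. The function name write_single_node_conf is hypothetical.

```shell
# Write minimal single-node config files for Hadoop 1.x.
# ASSUMPTIONS: ports 9000/9001, replication 1 and the data directory are
# illustrative values, not taken from the original listings.
# Usage: write_single_node_conf [conf_dir] [data_dir]
write_single_node_conf() {
    conf_dir="${1:-$HOME/hadoop/conf}"
    data_dir="${2:-$HOME/hadoop-hdfs}"

    # core-site.xml: where HDFS listens and where Hadoop keeps its data
    cat > "$conf_dir/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>$data_dir</value>
  </property>
</configuration>
EOF

    # hdfs-site.xml: one DataNode, so keep a single copy of each block
    cat > "$conf_dir/hdfs-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF

    # mapred-site.xml: address of the JobTracker
    cat > "$conf_dir/mapred-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
EOF
}

# Usage (overwrites the three files in ~/hadoop/conf):
# write_single_node_conf
```

Treat this as a starting point and compare it against the documentation of your Hadoop version before relying on it.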



Hadoop will store its data under the hadoop-hdfs directory.
Create the directory:
$ mkdir ~/hadoop-hdfs

Finally, format HDFS:
$ hadoop namenode -format

IV – Start the cluster

Run the command

This will start up the NameNode, DataNode, JobTracker and TaskTracker on your machine.
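
To check that the four daemons are really up, the JDK's jps tool lists the running Java processes as "<pid> <ClassName>". The sketch below reads that output and reports which daemons are missing; missing_daemons is a helper name of my own, and it assumes the daemons show up under exactly these class names.

```shell
# Read `jps` output on stdin and print each expected daemon that is missing.
# Returns 0 when all four are present, 1 otherwise.
missing_daemons() {
    jps_output=$(cat)
    status=0
    for daemon in NameNode DataNode JobTracker TaskTracker; do
        # Compare the second field exactly, so that SecondaryNameNode
        # is not mistaken for NameNode.
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
            status=1
        fi
    done
    return $status
}

# Usage once the cluster is started:
# jps | missing_daemons
```

If a name is printed, the log files of that daemon are usually the first place to look.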

The JobTracker UI runs at http://localhost:50030

The TaskTracker UI runs at http://localhost:50060
