Install Apache Hadoop on Ubuntu 15.10 Wily Linux
Hi! This Tutorial shows you Step-by-Step How to Install and Get Started with vanilla Apache Hadoop/MapReduce in Pseudo-Distributed mode on Ubuntu 15.10 Wily Werewolf GNU/Linux Desktop.
Hadoop is a distributed master-slave system that consists of the Hadoop Distributed File System (HDFS) for storage and MapReduce for computational capabilities.
The Guide Describes a System-Wide Setup with Root Privileges, but you Can Easily Convert the Procedure to a Local One.
Apache Hadoop on Ubuntu 15.10 Wily Requires an Oracle JDK 8+ Installation on the System.
-
Download Latest Apache Hadoop Stable Release:
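If you prefer working from the Shell, the tarball can also be fetched directly from the Apache archive; the release number below is only an example, so pick the current stable one from the download page:
wget -P ~/Downloads https://archive.apache.org/dist/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz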
-
Double-Click on Archive and Extract Into /tmp Directory
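As a Shell alternative to double-clicking, the archive can be unpacked into /tmp with tar; adjust the file name to the release you actually downloaded:
tar -xzf ~/Downloads/hadoop-2.7.1.tar.gz -C /tmp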
-
Open a Shell Terminal emulator window
Ctrl+Alt+t
(Press “Enter” to Execute Commands.) In case of First Use, see: Terminal QuickStart Guide.
-
Relocate Apache Hadoop Directory
sudo su
If you Get “User is Not in Sudoers file” then see: How to Enable sudo
mv /tmp/hadoop* /usr/local/
ln -s /usr/local/hadoop* /usr/local/hadoop
mkdir /usr/local/hadoop/tmp
sudo chown -R root:root /usr/local/hadoop*
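A quick check that the symlink now points at the extracted release:
ls -l /usr/local/hadoop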
-
How to Install Required Java JDK 8+ on Ubuntu
-
Set JAVA_HOME in Hadoop Env File.
sudo su
If you Get “User is Not in Sudoers file” then see: How to Enable sudo
mkdir -p /usr/local/hadoop/conf
nano /usr/local/hadoop/conf/hadoop-env.sh
Append:
export JAVA_HOME=/usr/lib/jvm/<oracleJdkVersion>
Ctrl+x to Save & Exit from nano Editor :)
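If you are unsure which directory name to use in place of <oracleJdkVersion>, listing the installed JVMs usually reveals it:
ls /usr/lib/jvm/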
-
Hadoop Configuration for Pseudo-Distributed mode
nano /usr/local/hadoop/conf/core-site.xml
Append:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:8020</value>
  </property>
</configuration>
Next:
nano /usr/local/hadoop/conf/hdfs-site.xml
Append:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <!-- specify this so that running 'hdfs namenode -format' formats the right dir -->
    <name>dfs.name.dir</name>
    <value>/usr/local/hadoop/cache/hadoop/dfs/name</value>
  </property>
</configuration>
Last:
nano /usr/local/hadoop/conf/mapred-site.xml
Append:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:8021</value>
  </property>
</configuration>
-
Set Up Local Path & Environment.
exit
cd
nano .bashrc
Insert:
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export JAVA_HOME=/usr/lib/jvm/<oracleJdkVersion>
Then Load the New Setup:
source $HOME/.bashrc
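To verify that the Shell now finds the Hadoop binaries, this should print the installed release:
hadoop version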
-
Set Up the Needed Local SSH Connection
sudo systemctl start ssh
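If the ssh service is not found, the OpenSSH server is most likely not installed yet; it can be added with:
sudo apt-get install openssh-server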
Generate SSH Keys to Access:
ssh-keygen -b 2048 -t rsa
echo "$(cat ~/.ssh/id_rsa.pub)" > ~/.ssh/authorized_keys
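If the passwordless login gets refused later on, it is usually just a matter of tightening the permissions on the new file:
chmod 600 ~/.ssh/authorized_keys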
Testing Connection:
ssh 127.0.0.1
-
Formatting HDFS
hdfs namenode -format
-
Starting Up Hadoop
start-all.sh
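To confirm that the daemons actually came up, the jps tool shipped with the JDK should list Java processes such as NameNode and DataNode:
jps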
-
Apache Hadoop Quick Start Guide
Hadoop MapReduce Quick Start