How to Install Apache Hadoop on Gentoo 2012/2013 Linux Easy Guide

March 3, 2014 | By the gnu linux evangelist.

Install Hadoop 2.X for Gentoo GNU/Linux

Hi! This tutorial shows you, step by step, how to install and get started with vanilla Apache Hadoop/MapReduce in pseudo-distributed mode on a Gentoo 2012/2013 Linux x86/amd64 desktop.

Hadoop is an open source framework for writing and running distributed applications that process big data (large amounts of data).

Apache Hadoop Key Features:

  • Accessible: Hadoop runs on large clusters of commodity machines or on cloud computing services such as Amazon’s Elastic Compute Cloud (EC2).
  • Robust: Because it is intended to run on commodity hardware, Hadoop is architected with the assumption of frequent hardware malfunctions. It can gracefully handle most such failures.
  • Scalable: Hadoop scales linearly to handle larger data by adding more nodes to the cluster.
  • Simple: Hadoop allows users to quickly write efficient parallel code.

This guide describes a system-wide setup with root privileges, but you can easily adapt the procedure to a local, per-user one.

The contents are deliberately limited to the essential instructions and commands.


  1. Download Latest Apache Hadoop Stable Release:

    Apache Hadoop Binary tar.gz
  2. Right-Click on Archive > Open with Ark

    Then Extract Into /tmp.
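
    If Ark is not available, the same extraction can be done from a terminal. The following is a minimal sketch only: the helper name extract_hadoop is hypothetical, and the archive location is whatever you chose when downloading.

```shell
# extract_hadoop ARCHIVE [DEST]
# Hypothetical helper: unpack a Hadoop tar.gz into DEST (default: /tmp).
extract_hadoop() {
    archive="$1"
    dest="${2:-/tmp}"
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi
    # -x extract, -z gunzip, -f archive file, -C target directory
    mkdir -p "$dest" && tar -xzf "$archive" -C "$dest"
}
```

    For example, with the downloaded file name in place of the placeholder:

    extract_hadoop /path/to/hadoop-archive.tar.gz /tmp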

  3. Open Terminal Window
    (Press “Enter” to Execute Commands)


    If needed, first see: Terminal QuickStart Guide.

  4. Relocate Apache Hadoop Directory

    sudo su

    If you get “User is Not in Sudoers file”, see: How to Enable sudo

    mv /tmp/hadoop* /usr/local/
    ln -s /usr/local/hadoop* /usr/local/hadoop
    mkdir /usr/local/hadoop/tmp
    sudo chown -R root:root /usr/local/hadoop*
  5. How to Install Custom Oracle Java JDK on Gentoo:

    Install Oracle JDK for Gentoo

  • Set JAVA_HOME in Hadoop Env File

    nano /usr/local/hadoop/conf/


    Add the Line:

    export JAVA_HOME=/usr/lib/jvm/<oracleJdkVersion>

    Ctrl+x to Save & Exit :)
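
    Before continuing, it may help to confirm that the JAVA_HOME value really points at a JDK. A minimal sketch, assuming only that a usable JDK has an executable bin/java under its root directory; check_java_home is a hypothetical helper name.

```shell
# check_java_home JDK_DIR
# Hypothetical helper: succeed only if JDK_DIR/bin/java exists and is executable.
check_java_home() {
    jdk_dir="$1"
    if [ -x "$jdk_dir/bin/java" ]; then
        echo "ok: $jdk_dir/bin/java"
    else
        echo "no executable bin/java under: $jdk_dir" >&2
        return 1
    fi
}
```

    Call it with the same directory you wrote into the env file; a non-zero exit status means the path is probably wrong.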

  • Eclipse Hadoop 2.X Integration with Free Plugin.

    Hadoop 2.X Eclipse Plugin SetUp
  • Configuration for Pseudo-Distributed mode

    nano /usr/local/hadoop/conf/core-site.xml

    The Content Should Look Like:

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>


    nano /usr/local/hadoop/conf/hdfs-site.xml

    The Content Should Look Like:

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <!-- specify this so that running 'hdfs namenode -format'
    formats the right dir -->


    nano /usr/local/hadoop/conf/mapred-site.xml

    The Content Should Look Like:

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
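
    The listings above show only the XML headers of the three files. As a rough sketch, the following writes a typical pseudo-distributed configuration. The property names (fs.defaultFS, hadoop.tmp.dir, dfs.replication, mapreduce.framework.name) are standard Hadoop 2.X settings, but the values (port 9000, the tmp directory created earlier, replication 1, YARN) and the helper name write_pseudo_conf are assumptions, not the original guide's content. YARN's own yarn-site.xml is not covered here.

```shell
# write_pseudo_conf CONF_DIR
# Hypothetical helper: writes minimal single-node versions of core-site.xml,
# hdfs-site.xml and mapred-site.xml into CONF_DIR, overwriting existing files.
# All values below are assumptions, not taken from the original guide.
write_pseudo_conf() {
    conf_dir="$1"
    mkdir -p "$conf_dir" || return 1

    # Default filesystem URI and the base directory for Hadoop's working files.
    cat > "$conf_dir/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value>
  </property>
</configuration>
EOF

    # A single machine can hold only one copy of each block.
    cat > "$conf_dir/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF

    # Run MapReduce jobs on YARN.
    cat > "$conf_dir/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF
}
```

    Run as root, it could be pointed at the directory used in this guide. Back up any files already there first, since they are overwritten:

    write_pseudo_conf /usr/local/hadoop/conf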

  • SetUp Path & Environment

    su <myuser>
    nano .bashrc


    Append the Line:

    export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

    Set JAVA_HOME to match the installed Oracle Java JDK 6+ version…

    Then Load New Setup:

    source $HOME/.bashrc
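
    The PATH line above relies on HADOOP_HOME being defined. A sketch of what the .bashrc additions might look like, assuming HADOOP_HOME is the /usr/local/hadoop symlink created earlier; the guard against adding the same entries twice is an addition of this sketch, not part of the original guide.

```shell
# Assumption: Hadoop lives at the /usr/local/hadoop symlink created earlier.
export HADOOP_HOME="${HADOOP_HOME:-/usr/local/hadoop}"

# Append Hadoop's bin and sbin to PATH only if they are not already there,
# so that sourcing .bashrc repeatedly does not keep growing PATH.
case ":$PATH:" in
    *":$HADOOP_HOME/bin:"*) ;;
    *) export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin" ;;
esac
```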
  • SetUp Needed Local SSH Connection

    sudo systemctl start sshd

    On a Gentoo System Running OpenRC Instead of systemd:

    sudo rc-service sshd start

    Generate SSH Keys to Access:

    ssh-keygen -b 2048 -t rsa
    echo "$(cat ~/.ssh/" > ~/.ssh/authorized_keys

    Testing Connection:
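
    A minimal sketch of the whole SSH step, assuming the key pair uses ssh-keygen's default RSA file names (id_rsa and id_rsa.pub) and an empty passphrase; setup_local_ssh_key is a hypothetical helper. Unlike the echo command above, it appends to authorized_keys instead of overwriting it.

```shell
# setup_local_ssh_key [SSH_DIR]
# Hypothetical helper: create a passphrase-less RSA key pair if none exists
# and authorize its public key for logins to this account.
setup_local_ssh_key() {
    ssh_dir="${1:-$HOME/.ssh}"
    mkdir -p "$ssh_dir" || return 1
    chmod 700 "$ssh_dir" || return 1

    # Assumption: default RSA key file name, id_rsa.
    if [ ! -f "$ssh_dir/id_rsa" ]; then
        ssh-keygen -q -b 2048 -t rsa -N "" -f "$ssh_dir/id_rsa" || return 1
    fi

    # Add the public key only if that exact line is not already present.
    key="$(cat "$ssh_dir/id_rsa.pub")" || return 1
    if ! grep -qxF -e "$key" "$ssh_dir/authorized_keys" 2>/dev/null; then
        printf '%s\n' "$key" >> "$ssh_dir/authorized_keys" || return 1
    fi
    chmod 600 "$ssh_dir/authorized_keys"
}
```

    Run it for the current user, then test the connection; the usual check is a login to localhost that does not ask for a password:

    setup_local_ssh_key
    ssh localhost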

  • Formatting HDFS

    hdfs namenode -format


  • Starting Up Hadoop Database
  • Apache Hadoop Database Quick Start Guide
