[DEPRECATED] CentOS 6: Install Hadoop from Apache Bigtop

WARNING

This guide is a work in progress and does not currently result in a fully working Hadoop installation. Please see "CentOS 6: Install Single-node Hadoop from Cloudera CDH" instead.

Overview

Guide for setting up a single-node Hadoop on CentOS using the Apache Bigtop repo.

Versions

  • CentOS 6.3
  • Oracle Java JDK 1.6
  • Apache Bigtop 0.5.0
  • Hadoop 2.0.2-alpha

Prerequisites

Install

1. Download the yum repo file:

2. Install
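The two install steps can be sketched as below. This is a rough outline, not the original listing: the repo URL is not given in this guide, so BIGTOP_REPO_URL is a placeholder you must fill in, and the 'hadoop*' package glob is an assumption. The script only prints the commands unless you set DRY_RUN=0.

```shell
#!/bin/sh
# Placeholder: the guide does not state the repo URL, so supply it yourself.
BIGTOP_REPO_URL="${BIGTOP_REPO_URL:-REPLACE_WITH_BIGTOP_REPO_URL}"

# Print the command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

install_hadoop() {
    # Step 1: drop the Bigtop repo definition where yum will find it.
    run sudo wget -O /etc/yum.repos.d/bigtop.repo "$BIGTOP_REPO_URL"
    # Step 2: install the Hadoop packages (the glob is an assumption).
    run sudo yum install -y 'hadoop*'
}

install_hadoop
```

Review the printed commands first, then rerun with DRY_RUN=0 and a real BIGTOP_REPO_URL.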

Configure

Separate where the namenode and datanode store their files

1. Edit /etc/hadoop/conf/hdfs-site.xml and change the following properties to the listing below:

  • dfs.namenode.name.dir
  • dfs.namenode.checkpoint.dir
  • dfs.datanode.data.dir

Note: this step is not part of the official Apache Bigtop instructions, but it was required to avoid errors when running a datanode on the same machine as the namenode.
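As a minimal sketch, the snippet below prints one <property> element for each of the three properties, giving each its own directory. The datanode path matches the one mentioned in the formatting note in step 2; the namenode and checkpoint paths are assumptions chosen to sit alongside it, and HDFS_BASE and the function names are made up for illustration.

```shell
#!/bin/sh
# Base directory is an assumption, picked to match /var/lib/hadoop-hdfs/datanode.
HDFS_BASE="${HDFS_BASE:-/var/lib/hadoop-hdfs}"

# Print a single Hadoop <property> element for the given name and value.
hdfs_property() {
    printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
}

# Emit the three properties, each pointing at a separate directory.
hdfs_site_fragment() {
    hdfs_property dfs.namenode.name.dir       "file://${HDFS_BASE}/namenode"
    hdfs_property dfs.namenode.checkpoint.dir "file://${HDFS_BASE}/checkpoint"
    hdfs_property dfs.datanode.data.dir       "file://${HDFS_BASE}/datanode"
}

hdfs_site_fragment
```

Paste the printed elements inside the <configuration> element of /etc/hadoop/conf/hdfs-site.xml, replacing any existing entries with the same names.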

2. Format the namenode
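A sketch of the format step, assuming the packages created an hdfs system user that owns the namenode directory. It prints the command unless DRY_RUN=0 is set, since formatting wipes any existing namenode metadata.

```shell
#!/bin/sh
# Print the command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

format_namenode() {
    # Run the format as the hdfs user (assumed to exist) so the new
    # metadata files end up with the right ownership.
    run sudo -u hdfs hdfs namenode -format
}

format_namenode
```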

Output:

Note: formatting the datanode is not required. However, if you have a previous install, you may have to remove /var/lib/hadoop-hdfs/datanode to clear locks.

3. Start the Hadoop namenode and datanode

TODO: figure out why hadoop-hdfs-zkfc doesn’t start

4. Start services on boot
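Steps 3 and 4 can be sketched together. The service names hadoop-hdfs-namenode and hadoop-hdfs-datanode are assumptions, following the same naming pattern as hadoop-hdfs-zkfc above; check /etc/init.d for what your packages actually installed. Commands are printed only, unless DRY_RUN=0.

```shell
#!/bin/sh
# Assumed init script names; adjust to match /etc/init.d on your machine.
SERVICES="${SERVICES:-hadoop-hdfs-namenode hadoop-hdfs-datanode}"

# Print the command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

start_and_enable() {
    for svc in $SERVICES; do
        # Step 3: start the daemon now.
        run sudo service "$svc" start
        # Step 4: have it start automatically at boot.
        run sudo chkconfig "$svc" on
    done
}

start_and_enable
```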

5. Optional: Create a home directory on HDFS
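One way to do this, assuming the home directory lives under /user and that the hdfs superuser has to create it. HDFS_USER_NAME is a made-up variable that defaults to the current login. Commands are printed only, unless DRY_RUN=0.

```shell
#!/bin/sh
# Hypothetical variable: the user who should own the new home directory.
HDFS_USER_NAME="${HDFS_USER_NAME:-$(id -un)}"

# Print the command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

create_hdfs_home() {
    # Create /user/<name> (and /user if needed) as the hdfs superuser.
    run sudo -u hdfs hadoop fs -mkdir -p "/user/$HDFS_USER_NAME"
    # Hand ownership to the user so they can write to it.
    run sudo -u hdfs hadoop fs -chown "$HDFS_USER_NAME" "/user/$HDFS_USER_NAME"
}

create_hdfs_home
```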

6. Edit /etc/profile.d/hadoop.sh

7. Load into session
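The original contents of hadoop.sh are not reproduced here, so the following is only a guess at what steps 6 and 7 might look like. The HADOOP_MAPRED_HOME variable and its path are assumptions, and the file is written to a scratch location (PROFILE_FILE, a made-up variable) rather than straight into /etc/profile.d.

```shell
#!/bin/sh
# Write to a scratch file first; copy it to /etc/profile.d/hadoop.sh
# as root once the contents look right.
PROFILE_FILE="${PROFILE_FILE:-${TMPDIR:-/tmp}/hadoop.sh}"

# Step 6: the variable and path below are assumptions, not the
# original contents of the file.
cat > "$PROFILE_FILE" <<'EOF'
export HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce
EOF

# Step 7: load the settings into the current shell session.
. "$PROFILE_FILE"
echo "HADOOP_MAPRED_HOME=$HADOOP_MAPRED_HOME"
```

New login shells pick up files in /etc/profile.d automatically; sourcing the file by hand is only needed for the session you are already in.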

Test

1. Download the examples (they are missing from 2.0.2-alpha for some reason)

2. Get a directory listing from HDFS

3. Run one of the examples

TODO: while the cluster appears to be working, this example hangs. :[

4. Navigate browser to http://<hostname>:50070

   [Screenshot: Hadoop NameNode localhost:8020]

5. Click on “Live Nodes”

   [Screenshot: Hadoop NameNode localhost:8020]

Sources

3 thoughts on “[DEPRECATED] CentOS 6: Install Hadoop from Apache Bigtop”

  1. Mark Hentov

    You say, “change the following properties.” I say, “to what?” I keep thinking I’m going to see a list of data node hostnames or IP addresses for the secondary NN and the job tracker in one of these how-tos but I never do.

    1. Lance Post author

      Hey Mark! Not sure what your question is exactly, but the code listing below the property names shows what the properties should be set to. I’ll admit it’s not the clearest. If you have suggestions on how to present these types of file edits more clearly, I’m all ears. Also, this is a single-node testing configuration, so the namenode, datanode, etc. are all on the same box.

  2. Mark Hentov

    The stock Bigtop packages gave me an hdfs-site.xml file with the values you posted here. Probably this was not the case when you wrote this. Never mind.
