The best way to learn Hadoop is to get your hands dirty writing and running real Hadoop programs. To do that, we first need a Hadoop installation on a local development box.
Steps to install Hadoop:
1. Download and install Oracle VirtualBox
https://www.virtualbox.org/wiki/Downloads
2. Download the Hortonworks Sandbox virtual appliance for VirtualBox and import it into VirtualBox
http://hortonworks.com/products/hortonworks-sandbox/#install
*Tip: If VirtualBox reports an error when you start the appliance, check that hardware virtualization is enabled in the machine's BIOS settings. Also make sure you downloaded the installer that matches your system configuration (32-bit vs 64-bit).
You can access your Hadoop installation through the browser-based interface (Hue) at http://localhost:8888/
You can also SSH to the Linux virtual machine using the credentials root/hadoop (user root, password hadoop).
You can SFTP to the Linux box using its IP address and port 2222.
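As a rough sketch of the SSH and SFTP access described above, the helper functions below wrap the standard OpenSSH clients. The address 127.0.0.1 and the use of port 2222 for SSH as well as SFTP are assumptions (SFTP runs over SSH, and the sandbox is assumed to forward that port to the host). The function names are made up for illustration. Adjust the values to match your VirtualBox network settings.

```shell
# Assumed connection details; change them to match your VirtualBox port forwarding.
SANDBOX_HOST=127.0.0.1
SANDBOX_PORT=2222
SANDBOX_USER=root

# Hypothetical helper: open an SSH session, or run a single remote command
# when arguments are given.
sandbox_ssh() {
    ssh -p "$SANDBOX_PORT" "$SANDBOX_USER@$SANDBOX_HOST" "$@"
}

# Hypothetical helper: open an SFTP session.
# Note that sftp takes the port as uppercase -P, unlike ssh.
sandbox_sftp() {
    sftp -P "$SANDBOX_PORT" "$SANDBOX_USER@$SANDBOX_HOST"
}
```

With these defined, `sandbox_ssh` opens an interactive shell (you will be prompted for the password), and `sandbox_ssh hadoop version` runs a single command on the sandbox and returns.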
Log in to the SSH terminal and run the command: hadoop version
This prints the version information of the Hadoop installation.
[root@sandbox ~]# hadoop version
Hadoop 2.2.0.2.0.6.0-76
Subversion git@github.com:hortonworks/hadoop.git -r 8656b1cfad13b03b29e98cad042626205e7a1c86
Compiled by jenkins on 2013-10-18T00:19Z
Compiled with protoc 2.5.0
From source with checksum d23ee1d271c6ac5bd27de664146be2
This command was run using /usr/lib/hadoop/hadoop-common-2.2.0.2.0.6.0-76.jar
[root@sandbox ~]#
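If you only need the version string, for example in a setup script, the output above can be filtered down to its first line. The following is a minimal sketch that assumes the first line always has the form "Hadoop <version>", as in the transcript. The function name is hypothetical.

```shell
# Hypothetical helper: read `hadoop version` output on stdin and
# print only the version string.
hadoop_version_string() {
    # The first line is assumed to look like "Hadoop <version>";
    # print its second field and stop reading.
    awk 'NR == 1 && $1 == "Hadoop" { print $2; exit }'
}

# Usage on the sandbox:
#   hadoop version | hadoop_version_string
```

If the first line does not start with "Hadoop", the function prints nothing, which a calling script can treat as a failed check.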