
Hadoop Installation Guide: Step by Step

Step 1

Download the latest VMware from the given link:

Link: https://my.vmware.com/web/vmware/downloads

Step 2

Once the VMware download is complete, run the downloaded VMware .exe file to install VMware, then click the Next button.

Step 3

It will ask you to enter your email ID; enter it and click the OK button.

Step 4

Once the VMware installation is complete, it will look like the given screenshot.

Step 5

Now download the Ubuntu 14.04 VMware image (ubuntu1404t.zip) from the given link.

Link: http://www.traffictool.net/vmware/ubuntu1404t.html

Step 6

Once Ubuntu is downloaded, unzip ubuntu1404t.zip by right-clicking it and selecting the extract option.

Step 7

Now open VMware and click on Open a Virtual Machine (to add the unzipped Ubuntu image).

Step 8

Now select the unzipped Ubuntu image in VMware and click on Open.

Step 9

Once Ubuntu is added, VMware will look like the given screenshot.

Step 10

Now increase the virtual machine's RAM from 3 GB to 6 GB, depending on your machine's RAM size. Click on Edit Virtual Machine Settings.

Step 11

Increase the RAM size according to your machine's RAM (4 GB is specified here) and click on OK.

Step 12

Now click on Play Virtual Machine to start Ubuntu.


Step 13

Once Ubuntu starts, it looks like the given screenshot.

Step 14

Now click on the terminal icon to open a console window.

Step 15

Once the terminal opens, it looks like the given screenshot.

Step 16 (Optional)

You can also change the console text and background colours. Right-click on the console, click on Profiles, and then click on Profile Preferences.

Now go to the Colours tab and select the text and background colours you want (here the background colour is black and the text colour is green).

Step 17

Now add the Java PPA repository by using the given command (it will ask you to press Enter).

$ sudo add-apt-repository ppa:webupd8team/java

Step 18

Now update the package list using the given command. Once you enter the command it will ask for a password; the password for this Ubuntu image is "password".

$ sudo apt-get update

It takes a while to update all the packages.

Step 19

Now run the Java 7 installer by using the command.

$ sudo apt-get install oracle-java7-installer

Step 20

It will ask you to agree to the license agreement (press Enter).

Step 21

Now select "Yes" and press Enter.

Step 22

Check the installed Java version using the command.

$ java -version

Step 23

Now install the OpenSSH server by using the command.

$ sudo apt-get install openssh-server

Step 24

Now generate an SSH key pair by using the command, pressing Enter as shown in the screenshot.

$ ssh-keygen -t rsa -P ""

Step 25

Now enable passwordless SSH access by appending the public key to the authorized keys file.

$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
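A note not in the original guide: if the SSH test in the next step still prompts for a password, the usual cause is that sshd rejects key files whose permissions are too open. A minimal fix (the mkdir/touch lines make it safe to run even if the files already exist) looks like:

```shell
# Tighten SSH file permissions so sshd accepts key-based login.
# sshd ignores authorized_keys when .ssh or the key file is
# writable by anyone other than the owner.
mkdir -p "$HOME/.ssh"
touch "$HOME/.ssh/authorized_keys"
chmod 700 "$HOME/.ssh"
chmod 600 "$HOME/.ssh/authorized_keys"
```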

Step 26

Now test SSH access by using the command (it will ask you to type "yes" the first time).

$ ssh localhost

Step 27

Now disable IPv6 by opening /etc/sysctl.conf and changing its disable flags from 0 to 1.

$ sudo gedit /etc/sysctl.conf
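The screenshot showing the exact lines did not survive extraction. As a hedged reconstruction (the standard flags used in common single-node Hadoop guides, matching the "0 to 1" wording above, not necessarily the author's exact text), the lines appended to the end of /etc/sysctl.conf are:

```
# disable ipv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
```

After saving the file, apply the settings with `sudo sysctl -p` (or reboot).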

Step 28

Now download Hadoop 2.6.0 by using the command.

$ wget https://archive.apache.org/dist/hadoop/core/hadoop-2.6.0/hadoop-2.6.0.tar.gz

Step 29

Once the download is complete, extract the downloaded Hadoop archive by using the command.

$ tar -xzvf hadoop-2.6.0.tar.gz

Step 30

You can list all files and folders by using the command.

$ ls

You can see the downloaded Hadoop archive and the extracted hadoop-2.6.0 folder, marked in red in the screenshot.

Step 31

Now open the .bashrc file to configure the Hadoop path by using the command.

$ sudo gedit .bashrc

Step 32

Once the .bashrc file opens, add the lines given below at the end and save the file as shown in the screenshot (the paths assume Java 7 from the PPA above and Hadoop extracted to /home/user/hadoop-2.6.0; adjust them if your locations differ).

# -- HADOOP ENVIRONMENT VARIABLES START -- #
export JAVA_HOME=/usr/lib/jvm/java-7-oracle
export HADOOP_HOME=/home/user/hadoop-2.6.0
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
# -- HADOOP ENVIRONMENT VARIABLES END -- #

After saving, reload the file so the new variables take effect in the current terminal:

$ source .bashrc

Step 33

Now you can check both the Hadoop and Java paths by using these commands.

$ echo $HADOOP_HOME

$ echo $JAVA_HOME

Step 34

Now configure the Hadoop environment. Go to the hadoop-2.6.0/etc/hadoop directory to configure hadoop-env.sh by using the commands.

$ cd hadoop-2.6.0

$ cd etc

$ cd hadoop

Step 35

Now open the Hadoop environment file by using the command.

$ gedit hadoop-env.sh

Step 36

Now modify the JAVA_HOME line in hadoop-env.sh as given in the screenshot and save it.

Change the line to: export JAVA_HOME=/usr/lib/jvm/java-7-oracle

Step 37

Now open the core-site.xml file to configure the property by using the command.

$ gedit core-site.xml

Step 38

Now add the property given below inside the <configuration> tags, as shown in the screenshot. (fs.default.name is the older name for fs.defaultFS; both work in Hadoop 2.x.)

<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>

Step 39

Now configure yarn-site.xml by using command.

$ gedit yarn-site.xml


Step 40

Now add the properties given below inside the <configuration> tags, as shown in the screenshot.

<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>


Step 41

Now configure mapred-site.xml. It does not exist yet, so create it from its template by running the given command.

$ sudo cp mapred-site.xml.template mapred-site.xml

Step 42

Now open mapred-site.xml for configuration by using the command, as shown in the screenshot.

$ sudo gedit mapred-site.xml

Step 43

Now add these lines inside the <configuration> tags.

<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>


Step 44

Now create two directories for Hadoop storage by using the commands (open a new terminal, so you are in your home directory, and create the two directories as given in the screenshot).

$ sudo mkdir -p hadoop_store/hdfs/namenode

$ sudo mkdir -p hadoop_store/hdfs/datanode

Step 45

Now open hdfs-site.xml to configure it by using the command.

$ gedit hdfs-site.xml

Step 46

Now add the given lines inside the <configuration> tags of hdfs-site.xml, and check against the screenshot.

<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/user/hadoop_store/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/user/hadoop_store/hdfs/datanode</value>
</property>


Step 47

Now change the folder permissions by using the commands.

$ sudo chown -R user:user /home/user/hadoop-2.6.0/etc/hadoop

$ sudo chown -R user:user /home/user/hadoop_store

Step 48

Now format the NameNode by using the command (the screenshot runs it from the bin folder, but since that folder is on the PATH from .bashrc, the command can be run from any directory).

$ hdfs namenode -format
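If the format succeeds, the log output should end with a line like the following (the path will match the dfs.namenode.name.dir value configured above; timestamps and exact wording may vary by Hadoop version):

```
INFO common.Storage: Storage directory /home/user/hadoop_store/hdfs/namenode has been successfully formatted.
```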


Step 50

Now start all Hadoop services by using the command.

$ start-all.sh


Step 52

Now, to check all running Hadoop services, run the command.

$ jps
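On this single-node setup, jps should list the five Hadoop daemons plus Jps itself. The process IDs below are illustrative only; yours will differ:

```
2817 NameNode
2936 DataNode
3128 SecondaryNameNode
3275 ResourceManager
3398 NodeManager
3692 Jps
```

If any daemon is missing, check its log file under $HADOOP_HOME/logs.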

Step 53

To check the HDFS web UI, open the given URL in a browser; see also the screenshot.

http://localhost:50070/dfshealth.html#tab-overview

 

 

 

 
