With Java installed, we can now install Hadoop.
Create User and Group
1. Create Hadoop Group
sudo addgroup hadoop
2. Create the Hadoop user hduser and set a password
sudo adduser --ingroup hadoop hduser
3. Give hduser sudo rights
sudo adduser hduser sudo
4. Log out, then log back in as hduser
5. Set up passwordless SSH (Hadoop's start scripts use ssh to reach localhost)
ssh-keygen -t rsa -P ""
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh localhost
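The last command should log you in without asking for a password (answer "yes" to the host-key prompt the first time). If it still asks for a password, the usual cause is over-permissive file modes on the key files; a small fix to try, using the paths created above:
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys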
Download and install Hadoop
1. Download hadoop-1.2.1.tar.gz
2. sudo tar -zxvf hadoop-1.2.1.tar.gz -C /usr/local
3. cd /usr/local
sudo mv hadoop-1.2.1/ hadoop
4. sudo chown -R hduser:hadoop hadoop
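As a quick check that the archive was unpacked and re-owned correctly, list the install directory (the exact listing is only indicative):
ls -ld /usr/local/hadoop        # owner/group should be hduser hadoop
ls /usr/local/hadoop/bin        # should contain hadoop, start-all.sh, stop-all.sh, ...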
Configure the User Environment
1. cd ~
nano .bashrc
Append the following lines to the end of the file:
export JAVA_HOME=/opt/jdk1.8.0
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$JAVA_HOME/bin:$HADOOP_INSTALL/bin:$PATH
2. sudo reboot
3. Log in as hduser and check that Hadoop is on the PATH:
$ hadoop version
Hadoop 1.2.1
Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152
Compiled by mattf on Mon Jul 22 15:23:09 PDT 2013
From source with checksum 6923c86528809c4e7e6f493b6b413a9a
This command was run using /usr/local/hadoop/hadoop-core-1.2.1.jar
4. nano /usr/local/hadoop/conf/hadoop-env.sh
export JAVA_HOME=/opt/jdk1.8.0
export HADOOP_HEAPSIZE=272
5. nano /usr/local/hadoop/conf/core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/fs/hadoop/tmp</value>
    <description>Sets the operating directory for Hadoop data.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
    <description>The name of the default file system. A URI whose
      scheme and authority determine the FileSystem implementation.
      The URI's scheme determines the config property (fs.SCHEME.impl)
      naming the FileSystem implementation class. The URI's authority
      is used to determine the host, port, etc. for a filesystem.
    </description>
  </property>
</configuration>
6. nano /usr/local/hadoop/conf/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:54311</value>
    <description>The host and port that the MapReduce job tracker runs
      at. If "local", then jobs are run in-process as a single map
      and reduce task.
    </description>
  </property>
</configuration>
7. nano /usr/local/hadoop/conf/hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>Default block replication.
      The actual number of replications can be specified when the file
      is created. The default is used if replication is not specified
      at create time.
    </description>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
    <description>Disables HDFS permission checking; acceptable for a
      single-user test setup.
    </description>
  </property>
</configuration>
8. Create the working directory configured as hadoop.tmp.dir above
sudo mkdir -p /fs/hadoop/tmp
sudo chown hduser:hadoop /fs/hadoop/tmp
sudo chmod 750 /fs/hadoop/tmp/
9. Format the HDFS filesystem (a quick check of the result follows these steps):
/usr/local/hadoop/bin/hadoop namenode -format
10. Comment out ("mark") the "-server" option in /usr/local/hadoop/bin/hadoop if your JVM does not support it
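If the format in step 9 succeeded, the NameNode metadata now lives under the hadoop.tmp.dir set in step 5. A quick sanity check (the file names follow the usual Hadoop 1.x layout, so treat them as indicative):
ls /fs/hadoop/tmp/dfs/name/current
# typically lists VERSION, fsimage, edits and fstime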
Start Hadoop and Run First Job
1. cd /usr/local/hadoop
./bin/start-all.sh
2. jps
You should see output similar to the following (the PIDs will differ):
6828 Jps
6434 DataNode
6334 NameNode
6727 TaskTracker
6617 JobTracker
6545 SecondaryNameNode
3. Download some plain-text books from Project Gutenberg and save them under ~pi/books
4. ./bin/hadoop dfs -copyFromLocal ~pi/books /fs/hduser/books
5. ./bin/hadoop jar hadoop*examples*.jar wordcount /fs/hduser/books /fs/hduser/books-output
13/08/25 20:00:21 INFO mapred.JobClient: Running job: job_201308251952_0001
13/08/25 20:00:22 INFO mapred.JobClient: map 0% reduce 0%
13/08/25 20:02:06 INFO mapred.JobClient: map 46% reduce 0%
13/08/25 20:02:11 INFO mapred.JobClient: map 66% reduce 0%
13/08/25 20:03:19 INFO mapred.JobClient: map 100% reduce 0%
13/08/25 20:03:32 INFO mapred.JobClient: map 100% reduce 77%
13/08/25 20:03:35 INFO mapred.JobClient: map 100% reduce 100%
13/08/25 20:03:59 INFO mapred.JobClient: Job complete: job_201308251952_0001
13/08/25 20:03:59 INFO mapred.JobClient: Counters: 29
13/08/25 20:03:59 INFO mapred.JobClient: Job Counters
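Once the job reports completion, the word counts are stored in HDFS under the output path given on the command line. A sketch of how to inspect and retrieve them (the part-file name may differ depending on the output format the example uses):
./bin/hadoop dfs -ls /fs/hduser/books-output
./bin/hadoop dfs -cat /fs/hduser/books-output/part-r-00000 | head
./bin/hadoop dfs -copyToLocal /fs/hduser/books-output ~/books-output
When you are done, ./bin/stop-all.sh shuts all the daemons down again.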
Congratulations!