Hadoop 2.7: Configuration, Deployment, and Testing

1. Environment preparation:

Install the CentOS 6.5 operating system.

Download Hadoop 2.7:

wget http://124.205.69.132/files/224400000162626A/mirrors.hust.edu.cn/apache/hadoop/common/stable/hadoop-2.7.1.tar.gz


Download JDK 1.8 (8u60):

wget -O jdk-8u60-linux-x64.tar.gz 'http://download.oracle.com/otn-pub/java/jdk/8u60-b27/jdk-8u60-linux-x64.tar.gz?AuthParam=1443446776_174368b9ab1a6a92468aba5cd4d092d0'

2. Edit /etc/hosts and configure SSH trust:

Add the following entries to /etc/hosts:

192.168.1.61 host61

192.168.1.62 host62

192.168.1.63 host63
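
Because the same entries are needed on all three machines, it can help to add them idempotently so that repeating the step does not create duplicate lines. The following is a minimal sketch; the function name add_host_entry and its optional file argument are illustrative and not part of the original steps.

```shell
#!/bin/sh
# add_host_entry IP NAME [FILE]
# Appends "IP NAME" to FILE (default /etc/hosts) unless that exact line
# is already present. Function name and FILE argument are assumptions
# made for this example.
add_host_entry() {
    ip=$1
    name=$2
    file=${3:-/etc/hosts}
    # -x matches the whole line, -F treats the pattern as a fixed string
    if ! grep -qxF "$ip $name" "$file" 2>/dev/null; then
        echo "$ip $name" >> "$file"
    fi
}

# Usage (as root, on each node):
#   add_host_entry 192.168.1.61 host61
#   add_host_entry 192.168.1.62 host62
#   add_host_entry 192.168.1.63 host63
```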

Set up passwordless SSH trust between all of the servers.
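
The original does not list the commands for this step. One common way to do it, sketched here under the assumption that OpenSSH's ssh-keygen and ssh-copy-id are installed, is to create a key pair without a passphrase and copy the public key to every node. Run it as whichever account will later launch the start scripts. The DRY_RUN switch and the function names are illustrative; by default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch of passwordless SSH setup between the three nodes.
# Assumptions: OpenSSH client tools are installed; DRY_RUN, run and
# setup_ssh_trust are names invented for this example.
DRY_RUN=${DRY_RUN:-1}
HOSTS="host61 host62 host63"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

setup_ssh_trust() {
    # Create an RSA key pair with an empty passphrase if none exists yet.
    if [ ! -f "$HOME/.ssh/id_rsa" ]; then
        run ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa"
    fi
    # Install the public key on every node, including the local one.
    for h in $HOSTS; do
        run ssh-copy-id "$h"
    done
}

# Usage:
#   setup_ssh_trust              # print the commands only
#   DRY_RUN=0; setup_ssh_trust   # actually run them (prompts for passwords)
```

Repeat this on each node so that every server can reach the others without a password.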

3. Add a user, extract the archives, and set environment variables:

useradd hadoop

passwd hadoop

tar -zxvf hadoop-2.7.1.tar.gz

mv hadoop-2.7.1 /usr/local

ln -s /usr/local/hadoop-2.7.1 /usr/local/hadoop

chown -R hadoop:hadoop /usr/local/hadoop-2.7.1

tar -zxvf jdk-8u60-linux-x64.tar.gz

mv jdk1.8.0_60 /usr/local

ln -s /usr/local/jdk1.8.0_60 /usr/local/jdk

chown -R root:root /usr/local/jdk1.8.0_60


echo 'export JAVA_HOME=/usr/local/jdk' >>/etc/profile

echo 'export PATH=/usr/local/jdk/bin:$PATH' >/etc/profile.d/java.sh

4. Edit the Hadoop configuration files:

1) Edit hadoop-env.sh:

cd /usr/local/hadoop/etc/hadoop

sed -i 's%^#*export JAVA_HOME=${JAVA_HOME}%export JAVA_HOME=/usr/local/jdk%g' hadoop-env.sh
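
Since a sed substitution silently does nothing when its pattern does not match, it is worth confirming the result. The sketch below wraps the edit in a function that rewrites the JAVA_HOME export whether or not the line is commented out, then prints the resulting line. The function name set_java_home is illustrative, the pattern is deliberately broader than the one above (it replaces any existing value), and GNU sed is assumed for the -i option.

```shell
#!/bin/sh
# set_java_home FILE JAVA_HOME_DIR
# Rewrites the "export JAVA_HOME=..." line in a hadoop-env.sh style file,
# commented out or not, and prints the resulting line for checking.
# Assumes GNU sed (-i) and a directory path that contains no '%' character.
set_java_home() {
    file=$1
    dir=$2
    sed -i "s%^#*[[:space:]]*export JAVA_HOME=.*%export JAVA_HOME=$dir%" "$file"
    grep '^export JAVA_HOME=' "$file"
}

# Usage:
#   set_java_home /usr/local/hadoop/etc/hadoop/hadoop-env.sh /usr/local/jdk
```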

2) Edit core-site.xml and add the following at the end (in place of the empty <configuration> element):

<configuration>  

<property>

<name>fs.default.name</name>

<value>hdfs://host61:9000/</value>

</property>

<property>

<name>hadoop.tmp.dir</name>

<value>/home/hadoop/temp</value>

</property>

</configuration>  

3) Edit hdfs-site.xml:

<configuration>    

<property>    

<name>dfs.replication</name>    

<value>3</value>    

</property>    

</configuration>

4) Edit mapred-site.xml:

<configuration>    

<property>    

<name>mapred.job.tracker</name>    

<value>host61:9001</value>    

</property>    

</configuration>
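
Note that mapred.job.tracker belongs to the older MRv1 JobTracker. In Hadoop 2.x, jobs are sent to YARN only when mapreduce.framework.name is set to yarn; its default is local, which is consistent with the job_local IDs and LocalJobRunner messages in the test output of step 11. The following is a hedged sketch of the extra settings, not part of the original setup: the function name write_yarn_conf is illustrative, and host61 as the ResourceManager host is assumed from the jps output in step 9. It overwrites both files in the target directory, so try it on a scratch directory first.

```shell
#!/bin/sh
# write_yarn_conf DIR
# Writes a mapred-site.xml and a yarn-site.xml into DIR so that MapReduce
# jobs are submitted to YARN instead of the local runner.
# Assumptions: host61 runs the ResourceManager; existing files in DIR
# with these names are overwritten.
write_yarn_conf() {
    dir=$1
    cat > "$dir/mapred-site.xml" <<'EOF'
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF
    cat > "$dir/yarn-site.xml" <<'EOF'
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>host61</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
EOF
}

# Usage:
#   write_yarn_conf /usr/local/hadoop/etc/hadoop
```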

5) Configure masters:

host61

6) Configure slaves:

host62

host63

5. Configure host62 and host63 in the same way.


6. Format the distributed file system:

/usr/local/hadoop/bin/hdfs namenode -format

7. Replace the Hadoop native libraries:

mv /usr/local/hadoop/lib/native /usr/local/hadoop/lib/native_old

Copy the lib/native directory from a locally compiled Hadoop build into /usr/local/hadoop/lib/.

8. Start Hadoop:

1)/usr/local/hadoop/sbin/start-dfs.sh

2)/usr/local/hadoop/sbin/start-yarn.sh


9. Check the running daemons:

[root@host61 sbin]# jps

4532 ResourceManager

4197 NameNode

4793 Jps

4364 SecondaryNameNode

[root@host62 ~]# jps

32052 DataNode

32133 NodeManager

32265 Jps

[root@host63 local]# jps

6802 NodeManager

6963 Jps

6717 DataNode
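
The same check can be scripted so that a missing daemon is reported explicitly. This is a minimal sketch: the function name check_daemons is illustrative, and it only compares process names in the jps output against the list you pass in.

```shell
#!/bin/sh
# check_daemons EXPECTED...
# Reads jps output on stdin and prints "missing: NAME" for every expected
# daemon that is not listed. Returns 0 if all are present, 1 otherwise.
check_daemons() {
    out=$(cat)
    missing=0
    for d in "$@"; do
        # jps prints "<pid> <name>"; compare the second column exactly
        if ! echo "$out" | awk -v n="$d" '$2 == n { found = 1 } END { exit !found }'; then
            echo "missing: $d"
            missing=1
        fi
    done
    return $missing
}

# Usage:
#   jps | check_daemons NameNode SecondaryNameNode ResourceManager   # on host61
#   jps | check_daemons DataNode NodeManager                         # on host62 and host63
```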

10. Inspect Hadoop through its web interfaces:

NameNode information:

http://192.168.1.61:50070/

SecondaryNameNode information:

http://192.168.1.61:50090/

DataNode information:

http://192.168.1.62:50075/

11. Test:

echo "this is the first file" >/tmp/mytest1.txt

echo "this is the second file" >/tmp/mytest2.txt

cd /usr/local/hadoop/bin;

[hadoop@host61 bin]$ ./hadoop fs -mkdir /in

[hadoop@host61 bin]$ ./hadoop fs -put /tmp/mytest*.txt /in

[hadoop@host61 bin]$ ./hadoop fs -ls /in

Found 2 items

-rw-r--r--   3 hadoop supergroup         23 2015-10-02 18:45 /in/mytest1.txt

-rw-r--r--   3 hadoop supergroup         24 2015-10-02 18:45 /in/mytest2.txt

[hadoop@host61 hadoop]$ ./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar  wordcount /in /out

15/10/02 18:53:30 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id

15/10/02 18:53:30 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=

15/10/02 18:53:34 INFO input.FileInputFormat: Total input paths to process : 2

15/10/02 18:53:35 INFO mapreduce.JobSubmitter: number of splits:2

15/10/02 18:53:38 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1954603964_0001

15/10/02 18:53:40 INFO mapreduce.Job: The url to track the job: http://localhost:8080/

15/10/02 18:53:40 INFO mapreduce.Job: Running job: job_local1954603964_0001

15/10/02 18:53:40 INFO mapred.LocalJobRunner: OutputCommitter set in config null

15/10/02 18:53:40 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1

15/10/02 18:53:40 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter

15/10/02 18:53:41 INFO mapred.LocalJobRunner: Waiting for map tasks

15/10/02 18:53:41 INFO mapred.LocalJobRunner: Starting task: attempt_local1954603964_0001_m_000000_0

15/10/02 18:53:41 INFO mapreduce.Job: Job job_local1954603964_0001 running in uber mode : false

15/10/02 18:53:41 INFO mapreduce.Job:  map 0% reduce 0%

15/10/02 18:53:41 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1

15/10/02 18:53:41 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]

15/10/02 18:53:41 INFO mapred.MapTask: Processing split: hdfs://host61:9000/in/mytest2.txt:0+24

15/10/02 18:53:51 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)

15/10/02 18:53:51 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100

15/10/02 18:53:51 INFO mapred.MapTask: soft limit at 83886080

15/10/02 18:53:51 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600

15/10/02 18:53:51 INFO mapred.MapTask: kvstart = 26214396; length = 6553600

15/10/02 18:53:51 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer

15/10/02 18:53:52 INFO mapred.LocalJobRunner: 

15/10/02 18:53:52 INFO mapred.MapTask: Starting flush of map output

15/10/02 18:53:52 INFO mapred.MapTask: Spilling map output

15/10/02 18:53:52 INFO mapred.MapTask: bufstart = 0; bufend = 44; bufvoid = 104857600

15/10/02 18:53:52 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214380(104857520); length = 17/6553600

15/10/02 18:53:52 INFO mapred.MapTask: Finished spill 0

15/10/02 18:53:52 INFO mapred.Task: Task:attempt_local1954603964_0001_m_000000_0 is done. And is in the process of committing

15/10/02 18:53:53 INFO mapred.LocalJobRunner: map

15/10/02 18:53:53 INFO mapred.Task: Task 'attempt_local1954603964_0001_m_000000_0' done.

15/10/02 18:53:53 INFO mapred.LocalJobRunner: Finishing task: attempt_local1954603964_0001_m_000000_0

15/10/02 18:53:53 INFO mapred.LocalJobRunner: Starting task: attempt_local1954603964_0001_m_000001_0

15/10/02 18:53:53 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1

15/10/02 18:53:53 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]

15/10/02 18:53:53 INFO mapred.MapTask: Processing split: hdfs://host61:9000/in/mytest1.txt:0+23

15/10/02 18:53:53 INFO mapreduce.Job:  map 100% reduce 0%

15/10/02 18:53:53 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)

15/10/02 18:53:53 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100

15/10/02 18:53:53 INFO mapred.MapTask: soft limit at 83886080

15/10/02 18:53:53 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600

15/10/02 18:53:53 INFO mapred.MapTask: kvstart = 26214396; length = 6553600

15/10/02 18:53:53 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer

15/10/02 18:53:54 INFO mapred.LocalJobRunner: 

15/10/02 18:53:54 INFO mapred.MapTask: Starting flush of map output

15/10/02 18:53:54 INFO mapred.MapTask: Spilling map output

15/10/02 18:53:54 INFO mapred.MapTask: bufstart = 0; bufend = 43; bufvoid = 104857600

15/10/02 18:53:54 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214380(104857520); length = 17/6553600

15/10/02 18:53:54 INFO mapred.MapTask: Finished spill 0

15/10/02 18:53:54 INFO mapred.Task: Task:attempt_local1954603964_0001_m_000001_0 is done. And is in the process of committing

15/10/02 18:53:54 INFO mapreduce.Job:  map 50% reduce 0%

15/10/02 18:53:54 INFO mapred.LocalJobRunner: map

15/10/02 18:53:54 INFO mapred.Task: Task 'attempt_local1954603964_0001_m_000001_0' done.

15/10/02 18:53:54 INFO mapred.LocalJobRunner: Finishing task: attempt_local1954603964_0001_m_000001_0

15/10/02 18:53:54 INFO mapred.LocalJobRunner: map task executor complete.

15/10/02 18:53:54 INFO mapred.LocalJobRunner: Waiting for reduce tasks

15/10/02 18:53:54 INFO mapred.LocalJobRunner: Starting task: attempt_local1954603964_0001_r_000000_0

15/10/02 18:53:54 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1

15/10/02 18:53:54 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]

15/10/02 18:53:54 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@5205a129

15/10/02 18:53:55 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=363285696, maxSingleShuffleLimit=90821424, mergeThreshold=239768576, ioSortFactor=10, memToMemMergeOutputsThreshold=10

15/10/02 18:53:55 INFO reduce.EventFetcher: attempt_local1954603964_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events

15/10/02 18:53:55 INFO mapreduce.Job:  map 100% reduce 0%

15/10/02 18:53:56 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1954603964_0001_m_000001_0 decomp: 55 len: 59 to MEMORY

15/10/02 18:53:56 INFO reduce.InMemoryMapOutput: Read 55 bytes from map-output for attempt_local1954603964_0001_m_000001_0

15/10/02 18:53:56 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 55, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->55

15/10/02 18:53:56 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1954603964_0001_m_000000_0 decomp: 56 len: 60 to MEMORY

15/10/02 18:53:56 INFO reduce.InMemoryMapOutput: Read 56 bytes from map-output for attempt_local1954603964_0001_m_000000_0

15/10/02 18:53:56 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 56, inMemoryMapOutputs.size() -> 2, commitMemory -> 55, usedMemory ->111

15/10/02 18:53:56 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning

15/10/02 18:53:56 INFO mapred.LocalJobRunner: 2 / 2 copied.

15/10/02 18:53:56 INFO reduce.MergeManagerImpl: finalMerge called with 2 in-memory map-outputs and 0 on-disk map-outputs

15/10/02 18:53:57 INFO mapred.Merger: Merging 2 sorted segments

15/10/02 18:53:57 INFO mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 97 bytes

15/10/02 18:53:57 INFO reduce.MergeManagerImpl: Merged 2 segments, 111 bytes to disk to satisfy reduce memory limit

15/10/02 18:53:57 INFO reduce.MergeManagerImpl: Merging 1 files, 113 bytes from disk

15/10/02 18:53:57 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce

15/10/02 18:53:57 INFO mapred.Merger: Merging 1 sorted segments

15/10/02 18:53:57 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 102 bytes

15/10/02 18:53:57 INFO mapred.LocalJobRunner: 2 / 2 copied.

15/10/02 18:53:57 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords

15/10/02 18:53:59 INFO mapred.Task: Task:attempt_local1954603964_0001_r_000000_0 is done. And is in the process of committing

15/10/02 18:53:59 INFO mapred.LocalJobRunner: 2 / 2 copied.

15/10/02 18:53:59 INFO mapred.Task: Task attempt_local1954603964_0001_r_000000_0 is allowed to commit now

15/10/02 18:53:59 INFO output.FileOutputCommitter: Saved output of task 'attempt_local1954603964_0001_r_000000_0' to hdfs://host61:9000/out/_temporary/0/task_local1954603964_0001_r_000000

15/10/02 18:53:59 INFO mapred.LocalJobRunner: reduce > reduce

15/10/02 18:53:59 INFO mapred.Task: Task 'attempt_local1954603964_0001_r_000000_0' done.

15/10/02 18:53:59 INFO mapred.LocalJobRunner: Finishing task: attempt_local1954603964_0001_r_000000_0

15/10/02 18:53:59 INFO mapred.LocalJobRunner: reduce task executor complete.

15/10/02 18:53:59 INFO mapreduce.Job:  map 100% reduce 100%

15/10/02 18:53:59 INFO mapreduce.Job: Job job_local1954603964_0001 completed successfully

15/10/02 18:54:00 INFO mapreduce.Job: Counters: 35

File System Counters

FILE: Number of bytes read=821850

FILE: Number of bytes written=1655956

FILE: Number of read operations=0

FILE: Number of large read operations=0

FILE: Number of write operations=0

HDFS: Number of bytes read=118

HDFS: Number of bytes written=42

HDFS: Number of read operations=22

HDFS: Number of large read operations=0

HDFS: Number of write operations=5

Map-Reduce Framework

Map input records=2

Map output records=10

Map output bytes=87

Map output materialized bytes=119

Input split bytes=196

Combine input records=10

Combine output records=10

Reduce input groups=6

Reduce shuffle bytes=119

Reduce input records=10

Reduce output records=6

Spilled Records=20

Shuffled Maps =2

Failed Shuffles=0

Merged Map outputs=2

GC time elapsed (ms)=352

Total committed heap usage (bytes)=457912320

Shuffle Errors

BAD_ID=0

CONNECTION=0

IO_ERROR=0

WRONG_LENGTH=0

WRONG_MAP=0

WRONG_REDUCE=0

File Input Format Counters 

Bytes Read=47

File Output Format Counters 

Bytes Written=42

[hadoop@host61 hadoop]$ 

[hadoop@host61 hadoop]$ ./bin/hadoop fs -ls /out

Found 2 items

-rw-r--r--   3 hadoop supergroup          0 2015-10-02 18:53 /out/_SUCCESS

-rw-r--r--   3 hadoop supergroup         42 2015-10-02 18:53 /out/part-r-00000

[hadoop@host61 hadoop]$ ./bin/hadoop fs -cat /out/_SUCCESS

[hadoop@host61 hadoop]$ ./bin/hadoop fs -cat /out/part-r-00000

file 2

first 1

is 2

second 1

the 2

this 2

[hadoop@host61 hadoop]$ 
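
To cross-check the job result, the same counts can be computed locally from the two input files with standard tools. This is a small sketch; the function name local_wordcount is illustrative. It prints each word and its count separated by a tab, sorted by word in byte order, which is the layout the wordcount example writes to part-r-00000.

```shell
#!/bin/sh
# local_wordcount FILE...
# Counts whitespace-separated words in the given files and prints
# "word<TAB>count", sorted by word in byte order.
local_wordcount() {
    cat "$@" \
        | tr -s '[:space:]' '\n' \
        | grep -v '^$' \
        | LC_ALL=C sort \
        | uniq -c \
        | awk '{ print $2 "\t" $1 }'
}

# Usage:
#   local_wordcount /tmp/mytest*.txt > /tmp/expected.txt
#   ./bin/hadoop fs -cat /out/part-r-00000 > /tmp/actual.txt
#   diff /tmp/expected.txt /tmp/actual.txt && echo "results match"
```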

12. The Hadoop configuration and deployment is now complete.
