Big Data: Hadoop Cluster Installation (Part 1)


Cluster plan:

namenode 192.168.72.131 hadoop1

datanode1 192.168.72.132 hadoop2

datanode2 192.168.72.133 hadoop3

 

Download address for the relevant installation packages:

http://mirrors.hust.edu.cn/apache/

 

OS installation: on my desktop PC, VMware was installed locally and three virtual machines were built, each with a 50 GB disk, 1 GB of RAM, and 1 CPU core, running 64-bit CentOS 6.5. The OS installation itself is straightforward and is not repeated here.

After the OS is installed, the default virtual network needs to be removed; otherwise there will be communication problems:

#virsh net-destroy default

#virsh net-undefine default

 

If Hadoop is installed on physical machines, there are requirements for disk planning:

1. The OS disks on all servers use RAID 1.

2. For the data disks on DataNode nodes, configure each disk in JBOD mode; if JBOD is not supported, create a single partition on each disk, format it, and mount it directly (see the sketch below).
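A minimal sketch of preparing one such data disk, assuming a hypothetical device /dev/sdb and mount point /data1 (adjust to the actual hardware):

# parted -s /dev/sdb mklabel gpt mkpart primary ext4 0% 100%
# mkfs.ext4 /dev/sdb1
# mkdir -p /data1
# mount /dev/sdb1 /data1
# echo "/dev/sdb1 /data1 ext4 defaults,noatime 0 0" >> /etc/fstab

The mounted directory would then be listed in dfs.datanode.data.dir (configured later in hdfs-site.xml).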

I. Hadoop installation and deployment

1. Create the hadoop user

Installation, operations, and monitoring use the hadoop user.

As root, run the following commands to add the hadoop user:

useradd hadoop

passwd hadoop    (enter a password)

Development users can use other newly created accounts.

2. Configure /etc/hosts

[root@hadoop1 hadoop]# vi /etc/hosts

#127.0.0.1   localhost hadoop1 localhost4 localhost4.localdomain4

#::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.72.131 hadoop1

192.168.72.132 hadoop2

192.168.72.133 hadoop3

 

[root@hadoop2 hadoop]# vi /etc/hosts

#127.0.0.1   localhost hadoop2 localhost4 localhost4.localdomain4

#::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.72.131 hadoop1

192.168.72.132 hadoop2

192.168.72.133 hadoop3

 

[root@hadoop3 .ssh]# vi /etc/hosts

#127.0.0.1   localhost hadoop3

#::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.72.131 hadoop1

192.168.72.132 hadoop2

192.168.72.133 hadoop3

The two lines above beginning with 127.0.0.1 and ::1 must be commented out; otherwise running hadoop fs -ls on the datanode nodes later will fail, and the DataNode and NodeManager processes will go down (a quick resolution check is sketched after the error example below).

[hadoop@hadoop2 ~]$ hadoop fs -ls /data

ls: Call From hadoop2/127.0.0.1 to hadoop1:9000 failed on connection exception: java.net.ConnectException: Connection refused;
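A quick sanity check, as a sketch using the hostnames from the cluster plan above, confirms that each name resolves to its LAN address rather than to 127.0.0.1; given the /etc/hosts entries shown earlier, the expected output is:

[hadoop@hadoop2 ~]$ getent hosts hadoop1 hadoop2 hadoop3
192.168.72.131  hadoop1
192.168.72.132  hadoop2
192.168.72.133  hadoop3
[hadoop@hadoop2 ~]$ hostname -i     # should print this node's 192.168.72.x address, not 127.0.0.1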

 

3. SSH configuration (as the hadoop user)

Configure it on the first node, then copy the keys to the other nodes.

Change into /home/hadoop/.ssh:

 

[hadoop@hadoop1 .ssh]$ ssh-keygen -t rsa

Generating public/private rsa key pair.

Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):

/home/hadoop/.ssh/id_rsa already exists.

Overwrite (y/n)? y

Enter passphrase (empty for no passphrase):

Enter same passphrase again:

Your identification has been saved in /home/hadoop/.ssh/id_rsa.

Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.

The key fingerprint is:

65:b3:65:10:e3:64:45:21:08:9f:69:b7:3e:c6:a6:31 hadoop@hadoop1

The key's randomart image is:

+--[ RSA 2048]----+

|      .. .B++.   |

|       ..* +     |

|        = * o    |

|       . + *     |

|        S o      |

|         o       |

|        E *      |

|         * .     |

|        .        |

+-----------------+

[hadoop@hadoop1 .ssh]$

[hadoop@hadoop1 .ssh]$ ls -lt

total 16

-rw-------. 1 hadoop hadoop 1675 Jul 11 18:47 id_rsa

-rw-r--r--. 1 hadoop hadoop  396 Jul 11 18:47 id_rsa.pub

-rw-r--r--. 1 hadoop hadoop 1570 Jul 11 18:12 known_hosts

-rw-rw-r--. 1 hadoop hadoop 1230 Jul  8 09:32 authorized_keys

 

[hadoop@hadoop1 .ssh]$ cp id_rsa.pub authorized_keys

[hadoop@hadoop1 .ssh]$ ls -lt

total 16

-rw-rw-r--. 1 hadoop hadoop  396 Jul 11 18:51 authorized_keys

-rw-------. 1 hadoop hadoop 1675 Jul 11 18:47 id_rsa

-rw-r--r--. 1 hadoop hadoop  396 Jul 11 18:47 id_rsa.pub

-rw-r--r--. 1 hadoop hadoop 1570 Jul 11 18:12 known_hosts

[hadoop@hadoop1 .ssh]$ chmod 600 authorized_keys

[hadoop@hadoop1 .ssh]$ ls -lt

total 16

-rw-------. 1 hadoop hadoop  396 Jul 11 18:51 authorized_keys

-rw-------. 1 hadoop hadoop 1675 Jul 11 18:47 id_rsa

-rw-r--r--. 1 hadoop hadoop  396 Jul 11 18:47 id_rsa.pub

-rw-r--r--. 1 hadoop hadoop 1570 Jul 11 18:12 known_hosts

[hadoop@hadoop1 .ssh]$ scp * hadoop2:~/.ssh/

hadoop@hadoop2's password:

authorized_keys                          100%  396     0.4KB/s   00:00
id_rsa                                   100% 1675     1.6KB/s   00:00
id_rsa.pub                               100%  396     0.4KB/s   00:00
known_hosts                              100% 1570     1.5KB/s   00:00
[hadoop@hadoop1 .ssh]$ scp * hadoop3:~/.ssh/
hadoop@hadoop3's password:
authorized_keys                          100%  396     0.4KB/s   00:00
id_rsa                                   100% 1675     1.6KB/s   00:00
id_rsa.pub                               100%  396     0.4KB/s   00:00
known_hosts                              100% 1570     1.5KB/s   00:00

[hadoop@hadoop1 .ssh]$

[hadoop@hadoop1 .ssh]$

[hadoop@hadoop1 .ssh]$ ssh hadoop2

Last login: Mon Jul 11 18:12:08 2016 from hadoop1

[hadoop@hadoop2 ~]$ exit

logout

Connection to hadoop2 closed.

[hadoop@hadoop1 .ssh]$ ssh hadoop3

Last login: Mon Jul 11 18:12:16 2016 from hadoop1

[hadoop@hadoop3 ~]$ ssh hadoop2

Last login: Mon Jul 11 18:54:06 2016 from hadoop1

[hadoop@hadoop2 ~]$ exit

 

4. JDK installation

Download the JDK archive jdk-7u79-linux-x64.tar.gz from the Internet.

[root@hadoop1 .ssh]# tar -xzvf jdk-7u79-linux-x64.tar.gz -C /opt/

Configure environment variables (on every node):

[root@hadoop1 hadoop]# vi /etc/profile

 

# JAVA_HOME

 

export JAVA_HOME=/opt/jdk1.7.0_79/

# CLASSPATH

export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

 

#PATH

 

export PATH=$PATH:$JAVA_HOME/bin

 

Apply the configuration:

[root@hadoop1 hadoop]#source /etc/profile

 

Verify that the JDK is installed successfully:

 

[root@hadoop1 bin]# java -version

java version "1.7.0_45"

OpenJDK Runtime Environment (rhel-2.4.3.3.el6-x86_64 u45-b15)

OpenJDK 64-Bit Server VM (build 24.45-b08, mixed mode)

 

It still shows the old version; fix it as follows:

Using which java and which javac you can see:

 

[root@localhost ~]# which java

/usr/bin/java

[root@localhost ~]# which javac

/usr/bin/javac

In short, symlink (ln -s) these two files to the java and javac under the new JDK; the commands are:

[root@hadoop1 bin]#rm -rf /usr/bin/java

[root@hadoop1 bin]#rm -rf /usr/bin/javac

[root@hadoop1 bin]# ln -s  $JAVA_HOME/bin/javac /usr/bin/javac

[root@hadoop1 bin]# ln -s  $JAVA_HOME/bin/java /usr/bin/java

[root@hadoop1 bin]# java -version

java version "1.7.0_79"

Java(TM) SE Runtime Environment (build 1.7.0_79-b15)

Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed mode)

5. SELinux configuration

Configure SELinux. The following command immediately switches SELinux to permissive mode:

setenforce 0

Modify the configuration file to change the default state to disabled.

Edit /etc/selinux/config:

SELINUX=disabled
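As a sketch, the same change can be made non-interactively, assuming the stock layout of /etc/selinux/config:

# sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config
# grep '^SELINUX=' /etc/selinux/config     # should now show SELINUX=disabled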

6. Configure the firewall

# Check whether the firewall is set to start at boot
[root@hadoop1 ~]#  chkconfig iptables --list

Stop the firewall (takes effect immediately, but does not survive a reboot):
[root@hadoop1 ~]#  /etc/init.d/iptables stop

Disable the firewall at boot (takes effect after a reboot):
[root@hadoop1 ~]# chkconfig iptables off

 

 

7. Time synchronization

The clocks of the Hadoop cluster nodes should not drift apart by more than about 30 seconds; otherwise running jobs may hit errors.

If you use HBase, the time-synchronization requirement is very strict.

Set the time zone:

Check your system's time zone; if it is not UTC+8 (China Standard Time), change it:

#date -R

Copy the appropriate time-zone file over the system time-zone file:

#cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime

Synchronize the time: ntpdate -u 202.120.2.101

---- NTP server at Shanghai Jiao Tong University

Synchronize the time on a schedule (one way to register this crontab entry is sketched below):

  * * * * * /usr/sbin/ntpdate 210.72.145.44 > /dev/null 2>&1
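One way to add that entry to root's crontab without opening an editor, as a sketch (the NTP server is the one used in the line above):

# (crontab -l 2>/dev/null; echo "* * * * * /usr/sbin/ntpdate 210.72.145.44 > /dev/null 2>&1") | crontab -
# crontab -l      # verify that the entry is present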

8. Increase the maximum number of processes and open file handles

Perform these steps as root.

## The default per-user limits on processes and open files are too low for big-data workloads that open many files; raise them, otherwise errors will occur.

## Only the hadoop and root users are configured here; if there are other development users, add them as well.

Edit /etc/security/limits.conf and add:

hadoop   soft  nofile  131072

hadoop   hard  nofile  131072

Edit /etc/security/limits.d/90-nproc.conf and add:

hadoop    soft    nproc     unlimited

root       soft    nproc     unlimited

hadoop    hard    nproc     unlimited

 

Prepare and configure the directories as follows (a command sketch follows this list):

On every node, create the program directory /home/hadoop/hadoop-2.6.4 to hold the Hadoop program files.

On every node, create the data directory /home/hadoop/hdfs to hold the cluster data.

On the master node, create /home/hadoop/hdfs/name to hold the file-system metadata.

On every slave node, create /home/hadoop/hdfs/data to hold the actual data blocks.

The log directory on all nodes is /home/hadoop/logs.

The temporary directory on all nodes is /home/hadoop/tmp.
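A minimal sketch of creating this layout as the hadoop user (run the name directory command only on hadoop1 and the data directory command only on hadoop2 and hadoop3; the program directory appears when the archive is extracted in step 9):

[hadoop@hadoop1 ~]$ mkdir -p /home/hadoop/hdfs /home/hadoop/logs /home/hadoop/tmp
[hadoop@hadoop1 ~]$ mkdir -p /home/hadoop/hdfs/name     # master only
[hadoop@hadoop2 ~]$ mkdir -p /home/hadoop/hdfs /home/hadoop/logs /home/hadoop/tmp
[hadoop@hadoop2 ~]$ mkdir -p /home/hadoop/hdfs/data     # each slave only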

9. Install the Hadoop software

Download the Hadoop archive hadoop-2.6.4.tar.gz from the Internet. The prebuilt native libraries in it target 32-bit systems only; on a 64-bit system you need to build the package yourself, as described in section 14.

Move the archive to node hadoop1, then copy it from node 1 to the other nodes:

[hadoop@hadoop1 ~]$ scp hadoop-2.6.4.tar.gz hadoop@hadoop2:/home/hadoop/

[hadoop@hadoop1 ~]$ scp hadoop-2.6.4.tar.gz hadoop@hadoop3:/home/hadoop/

Unpack:

[hadoop@hadoop1 ~]$ tar zxvf hadoop-2.6.4.tar.gz

The next two steps can be skipped; once node 1 is fully installed and configured, the whole hadoop directory can simply be copied to the other nodes.

[hadoop@hadoop2 ~]$ tar zxvf hadoop-2.6.4.tar.gz

[hadoop@hadoop3~]$ tar zxvf hadoop-2.6.4.tar.gz

Configure environment variables:

[root@hadoop1 hadoop]# vi /etc/profile

export HADOOP_HOME=/home/hadoop/hadoop-2.6.4

export PATH=$PATH:$HADOOP_HOME/bin

Apply the configuration:

[root@hadoop1 hadoop]#source /etc/profile

 

10. Edit the configuration files

Seven files need to be modified in total:

$HADOOP_HOME/etc/hadoop/hadoop-env.sh

$HADOOP_HOME/etc/hadoop/yarn-env.sh

$HADOOP_HOME/etc/hadoop/core-site.xml

$HADOOP_HOME/etc/hadoop/hdfs-site.xml

$HADOOP_HOME/etc/hadoop/mapred-site.xml

$HADOOP_HOME/etc/hadoop/yarn-site.xml

$HADOOP_HOME/etc/hadoop/slaves

 

Configure hadoop-env.sh

(JDK path; adjust to your environment):

export JAVA_HOME=/opt/jdk1.7.0_79

Add the following line:

export HADOOP_PREFIX=/home/hadoop/hadoop-2.6.4

 

Configure yarn-env.sh

(JDK path; adjust to your environment):

export JAVA_HOME=/opt/jdk1.7.0_79

 

Configure core-site.xml

<configuration>

<property>

<name>fs.defaultFS</name>

<value>hdfs://hadoop1:9000</value>

</property>

<property>

<name>hadoop.tmp.dir</name>

<value>/home/hadoop/tmp</value>  ---- the corresponding tmp directory must be created

</property>

</configuration>

 

 

 

Configure hdfs-site.xml

<configuration>

<property>

<name>dfs.datanode.ipc.address</name>

<value>0.0.0.0:50020</value>

</property>

<property>

<name>dfs.datanode.http.address</name>

<value>0.0.0.0:50075</value>

</property>

<property>

<name>dfs.replication</name>

<value>2</value>

</property>

<property>

<name>hadoop.tmp.dir</name>

<value>/home/hadoop/tmp</value>  ---- the corresponding tmp directory must be created

</property>

<property>

  <name>dfs.namenode.servicerpc-address</name>

  <value>hadoop1:53310</value>

</property>

<property>

<name>dfs.secondary.http.address</name>

<value>hadoop2:50090</value>  --- configures the SecondaryNameNode

</property>

</configuration>

dfs.replication is the number of data replicas; it generally should not exceed the number of DataNodes.

---------------------------------------------------------

<property>

<name>dfs.namenode.name.dir</name>

<value>file:///home/hadoop/hdfs/name</value>

<description>Local filesystem path where the NameNode persistently stores the namespace and edit logs</description>

</property>

<property>

<name>dfs.datanode.data.dir</name>

<value>file:///home/hadoop/hdfs/data</value>

<description>Comma-separated list of local directories where the DataNode stores block files</description>

</property>

 

 

--------------------------- If these are not configured, the corresponding default path is /home/hadoop/tmp/dfs/name

Configure mapred-site.xml

<!-- ## Set the computation framework to YARN -->

 

<property>

<name>mapreduce.framework.name</name>

<value>yarn</value>

</property>

 

Configure yarn-site.xml

<!-- ## Auxiliary service that runs on the NodeManager; it must be set to mapreduce_shuffle for MapReduce programs to run -->

 

<property>

<name>yarn.nodemanager.aux-services</name>

<value>mapreduce_shuffle</value>

</property>

<property>

<name>yarn.resourcemanager.hostname</name>

<value>192.168.72.131</value>

</property>

---- yarn.resourcemanager.hostname must be configured, and the master (NameNode) host IP it points to must be correct; otherwise communication between the master and the worker nodes is affected and the nodemanager service fails.

 

Configure slaves

Enter the names of the slave nodes.

Edit the file with vi slaves and enter:

hadoop2

hadoop3

 

For the complete list of core-site.xml parameters, see

http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-common/core-default.xml

For the complete list of hdfs-site.xml parameters, see

http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml

For the complete list of mapred-site.xml parameters, see

http://hadoop.apache.org/docs/r2.6.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml

For the complete list of yarn-site.xml parameters, see

http://hadoop.apache.org/docs/r2.6.0/hadoop-yarn/hadoop-yarn-common/yarn-default.xml

 

masterhadoop1)上的hadoop目录复制到hadoop2,hadoop3

需要在节点2和节点3建立对应的tmp目录

[hadoop@hadoop2 ~]mkdir tmp

[hadoop@hadoop3~]mkdir tmp

 

[hadoop@hadoop1 bin]$cd /home/hadoop

 

[hadoop@hadoop1 ~]$ scp -r hadoop-2.6.4 hadoop@hadoop2:/home/hadoop/

 

[hadoop@hadoop1 ~]$ scp -r hadoop-2.6.4 hadoop@hadoop3:/home/hadoop/

 

11. Start the services

Format HDFS:

[hadoop@hadoop1 bin]$ ./hdfs namenode -format

///////////////////////////////////////////

Initialization failed for Block pool <registering> (Datanode Uuid unassigned) service to hadoop1/192.168.72.131:9000

The datanode's clusterID and the namenode's clusterID do not match.

Solution:

Following the path in the log, cd /home/hadoop/tmp/dfs;

there you can see the two folders, data and name.

Copy the clusterID from the VERSION file under name/current into the VERSION file under data/current, overwriting the original clusterID,

so that the two match (see the sketch below).
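A sketch of that fix, assuming the default layout under /home/hadoop/tmp/dfs used in this guide (if the name and data directories live on different hosts, carry the value over by hand or with scp):

[hadoop@hadoop1 ~]$ cd /home/hadoop/tmp/dfs
[hadoop@hadoop1 dfs]$ NN_CID=$(grep '^clusterID=' name/current/VERSION | cut -d= -f2)
[hadoop@hadoop1 dfs]$ sed -i "s/^clusterID=.*/clusterID=${NN_CID}/" data/current/VERSION
[hadoop@hadoop1 dfs]$ grep '^clusterID=' name/current/VERSION data/current/VERSION    # the two values should now match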

Then restart; after startup, run jps to check the processes:

20131 SecondaryNameNode
20449 NodeManager
19776 NameNode
21123 Jps
19918 DataNode
20305 ResourceManager

/////////////////////////////////////////////////////////////////////////////////////////

Cause of the problem: after DFS was formatted the first time, Hadoop was started and used; later the format command (hdfs namenode -format) was run again. Re-formatting regenerates the namenode's clusterID, while the datanode's clusterID stays unchanged.

Start DFS:

$HADOOP_HOME/sbin/start-dfs.sh

 

After startup, run jps to check the processes; you should see the following:

[hadoop@hadoop1 sbin]$ jps

10133 NameNode

10417 Jps

10315 SecondaryNameNode

 

Start YARN:

[hadoop@hadoop1 sbin]$ ./start-yarn.sh

[hadoop@hadoop1 sbin]$ jps

10133 NameNode

10734 Jps

10315 SecondaryNameNode

10466 ResourceManager

 

On the other nodes:

[hadoop@hadoop2 name]$ jps

4760 DataNode

4850 NodeManager

4974 Jps

 

[hadoop@hadoop3 hadoop]$ jps

5247 Jps

5132 NodeManager

5042 DataNode

 

Access the web UIs:

http://192.168.72.131:50070

http://192.168.72.131:8088/cluster

http://192.168.72.132:8042

 

Start the JobHistory Server:

[hadoop@hadoop1 sbin]$ ./mr-jobhistory-daemon.sh start historyserver

To stop the JobHistory Server, run: mr-jobhistory-daemon.sh stop historyserver
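By default the history server listens on 0.0.0.0:10020 (RPC) and 0.0.0.0:19888 (web UI). If you prefer to pin it to the master host, a sketch of optional mapred-site.xml additions (not part of the configuration used above) is:

<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoop1:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoop1:19888</value>
</property>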

 

Hadoop service start/stop scripts:

start-all.sh  Starts all Hadoop daemons: NameNode, Secondary NameNode, DataNode, JobTracker, TaskTracker.

stop-all.sh  Stops all Hadoop daemons: NameNode, Secondary NameNode, DataNode, JobTracker, TaskTracker.

start-dfs.sh  Starts the Hadoop HDFS daemons: NameNode, SecondaryNameNode, DataNode.

stop-dfs.sh  Stops the Hadoop HDFS daemons: NameNode, SecondaryNameNode, DataNode.

hadoop-daemons.sh start namenode  Starts only the NameNode daemon.

hadoop-daemons.sh stop namenode  Stops only the NameNode daemon.

hadoop-daemons.sh start datanode  Starts only the DataNode daemon (run it on the node to be started).

hadoop-daemons.sh stop datanode  Stops only the DataNode daemon.

hadoop-daemons.sh start secondarynamenode  Starts only the SecondaryNameNode daemon.

hadoop-daemons.sh stop secondarynamenode  Stops only the SecondaryNameNode daemon.

start-mapred.sh  Starts the Hadoop MapReduce daemons: JobTracker and TaskTracker.

stop-mapred.sh  Stops the Hadoop MapReduce daemons: JobTracker and TaskTracker.

hadoop-daemons.sh start jobtracker  Starts only the JobTracker daemon.

hadoop-daemons.sh stop jobtracker  Stops only the JobTracker daemon.

hadoop-daemons.sh start tasktracker  Starts only the TaskTracker daemon.

hadoop-daemons.sh stop tasktracker  Stops only the TaskTracker daemon.

The MapReduce daemons must also be started in order:

1) start the JobTracker daemon;

2) start the TaskTracker daemon.

To start the NodeManager (YARN) on a single node:

./yarn-daemon.sh start nodemanager

 

Check the DataNode block scanner report:

http://192.168.72.132:50075/blockScannerReport

Check the HDFS block status from the NameNode:

hadoop fsck /

 

NameNode activity monitoring: dump the NameNode's metadata to a file under the log directory:

hadoop dfsadmin -metasave word.txt

12. Cluster verification

You can verify the cluster with the WordCount example that ships with Hadoop. First create a few data directories in HDFS:

 

[hadoop@hadoop1 tmp]$ hadoop fs -mkdir -p /data/wordcount

[hadoop@hadoop1 tmp]$ hadoop fs -ls /data/

[hadoop@hadoop1 tmp]$ hadoop fs -mkdir -p /output/

Upload local files into HDFS:

[hadoop@hadoop1 hadoop]$ hadoop fs -put /home/hadoop/hadoop-2.6.4/etc/hadoop/*.xml /data/wordcount/

 

The following error occurred during execution:

java.io.IOException: File /data/wordcount/capacity-scheduler.xml._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1).  There are 0 datanode(s) running and no node(s) are excluded in this operation.

Check the logs:

WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to server: hadoop1/192.168.72.131:9000

This indicates a problem connecting to that port.

Check:

[root@hadoop1 etc]# netstat -an | grep 9000

tcp        0      0 192.168.72.131:9000         0.0.0.0:*                   LISTEN      

tcp        0      0 192.168.72.131:40809        192.168.72.131:9000         TIME_WAIT  

Connections to the port are not getting through; it may be a firewall problem, so stop the firewall:

[root@hadoop1 etc]# /etc/init.d/iptables stop

iptables: Setting chains to policy ACCEPT: filter          [  OK  ]

iptables: Flushing firewall rules:                         [  OK  ]

iptables: Unloading modules:                               [  OK  ]

 

[root@hadoop1 etc]# netstat -an | grep 9000

tcp        0      0 192.168.72.131:9000         0.0.0.0:*                   LISTEN      

tcp        0      0 192.168.72.131:9000         192.168.72.133:51281        ESTABLISHED

tcp        0      0 192.168.72.131:40812        192.168.72.131:9000         TIME_WAIT   

tcp        0      0 192.168.72.131:9000         192.168.72.132:59126        ESTABLISHED

 

The connections are now OK.

 

List the uploaded files:

[hadoop@hadoop1 hadoop]$ hadoop fs -ls /data/wordcount

Found 9 items

-rw-r--r--   2 hadoop supergroup       4436 2016-07-14 18:34 /data/wordcount/capacity-scheduler.xml

-rw-r--r--   2 hadoop supergroup        981 2016-07-14 18:34 /data/wordcount/core-site.xml

-rw-r--r--   2 hadoop supergroup       9683 2016-07-14 18:34 /data/wordcount/hadoop-policy.xml

-rw-r--r--   2 hadoop supergroup       1329 2016-07-14 18:34 /data/wordcount/hdfs-site.xml

-rw-r--r--   2 hadoop supergroup        620 2016-07-14 18:34 /data/wordcount/httpfs-site.xml

-rw-r--r--   2 hadoop supergroup       3523 2016-07-14 18:34 /data/wordcount/kms-acls.xml

-rw-r--r--   2 hadoop supergroup       5511 2016-07-14 18:34 /data/wordcount/kms-site.xml

-rw-r--r--   2 hadoop supergroup        196 2016-07-14 18:34 /data/wordcount/mapred-site.xml

-rw-r--r--   2 hadoop supergroup        746 2016-07-14 18:34 /data/wordcount/yarn-site.xml

 

Next, run the WordCount example with the following command:

[hadoop@hadoop1 mapreduce]$ hadoop jar /home/hadoop/hadoop-2.6.4/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.4.jar wordcount /data/wordcount /output/wordcount

 

View the results:

[hadoop@hadoop1 ~]$ hadoop fs -cat /output/wordcount/part-r-00000 |head

"*"     18

"AS     8

"License");     8

"alice,bob      18

"kerberos".   1

"simple"      1

'HTTP/' 1

'none'  1

'random'        1

'sasl'  1

cat: Unable to write to output stream.
(This last message is harmless: head exits after printing ten lines and closes the pipe, so cat can no longer write to it.)

 

The job and node status can be viewed at http://192.168.72.131:8088/.

 

 

////////////////////////////////////

How to compile Java files for the Hadoop platform:

Copy the HADemo.java file to the Linux environment, add HADOOP_HOME/bin to the environment, start the cluster, and change into the directory that contains HADemo.java.

Note: the files in the lib directory below are taken from the HADOOP_HOME/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib directory; this is done only to shorten the command line.

1. Compile the Java file

# mkdir class
# javac -classpath .:lib/hadoop-common-2.2.0.jar:lib/hadoop-annotations-2.2.0.jar -d class HADemo.java

2. Build the jar

# jar -cvf hademo.jar -C class/ .
added manifest
adding: com/(in = 0) (out= 0)(stored 0%)
adding: com/wan/(in = 0) (out= 0)(stored 0%)
adding: com/wan/demo/(in = 0) (out= 0)(stored 0%)
adding: com/wan/demo/HADemo.class(in = 844) (out= 520)(deflated 38%)

3. Test run

#hadoop jar hademo.jar com.wan.demo.HADemo /test

Check: # hadoop fs -ls /

////////////////////////////////////////////////////////

13. Errors and how to handle them:

[hadoop@hadoop1 sbin]$ ./start-dfs.sh

16/07/12 02:13:52 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

16/07/12 02:13:53 WARN hdfs.DFSUtil: Namenode for null remains unresolved for ID null.  Check your hdfs-site.xml file to ensure namenodes are configured properly.

Starting namenodes on [master]

master: ssh: Could not resolve hostname master: Name or service not known

hadoop3: Error: JAVA_HOME is not set and could not be found.

hadoop2: Error: JAVA_HOME is not set and could not be found.

Starting secondary namenodes [0.0.0.0]

The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.

RSA key fingerprint is 3e:b6:3f:b7:77:32:86:4d:dc:2c:28:c3:a0:91:84:61.

Are you sure you want to continue connecting (yes/no)? no

------------

16/07/12 02:13:52 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

 

The errors above appear because the package was downloaded directly from the Internet; the official site does not currently provide a 64-bit build, so you need to download the source and compile it yourself. The build process is as follows:

---------------------------------------------------------------------------------------------------------------

14. Building the Hadoop package:

The Hadoop native libraries provided on the Apache site are 32-bit; if the Linux server is 64-bit, this causes problems.

 

When Hadoop commands are run on a 64-bit server, the following warning is reported:

 

WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

 

Software required to build hadoop-2.6.4:

 

jdk 1.7

gcc 4.4.5

maven 3.3.3

protobuf 2.5.0

cmake 2.8.12.2

ant 1.9.6

findbugs (optional)

 

Install the JDK:

[root@centos ~]# tar zxvf jdk-7u79-linux-x64.tar.gz -C /opt/

Configure environment variables by editing /etc/profile:

 

export JAVA_HOME=/opt/jdk1.7.0_79/

# CLASSPATH

export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

 

#PATH

 

export PATH=$PATH:$JAVA_HOME/bin

 

 

Verify the version:

[root@centos ~]#java -version

Install apache-maven-3.3.9-bin.tar.gz:

Download address: http://maven.apache.org/download.cgi

[root@centos ~]# tar zxvf apache-maven-3.3.9-bin.tar.gz

 

Configure environment variables:

vi /etc/profile

export  MAVEN_HOME=/root/apache-maven-3.3.9

export PATH=$PATH:$JAVA_HOME/bin:$MAVEN_HOME/bin

 

Install protobuf

Download: http://pan.baidu.com/s/1pJlZubT

Install:

[root@centos ~]# tar zxvf protobuf-2.5.0.tar.gz -C /opt/

 

Extracting with tar -zxf protobuf-2.5.0.tar.gz yields the protobuf-2.5.0 source code.

[root@centos opt]# cd protobuf-2.5.0/    # enter the directory

 

[root@centos protobuf-2.5.0]#  ./configure   

 [root@centos protobuf-2.5.0]# make

[root@centos protobuf-2.5.0]#  make install

 

Finally, run [root@centos protobuf-2.5.0]# protoc --version;

if it prints libprotoc 2.5.0, the installation succeeded.

 

Install cmake

[root@centos ~]# rpm -ivh cmake-2.6.4-5.el6.x86_64.rpm

[root@centos ~]# cmake -version

cmake version 2.6-patch 4

 

Install ant

# tar zxvf apache-ant-1.9.7-bin.tar.gz -C /opt/

vi /etc/profile

export ANT_HOME=/opt/apache-ant-1.9.7

PATH=$PATH:$ANT_HOME/bin

 

 

Install the other required packages:

[root@centos ~]# rpm -ivh autoconf-2.63-5.1.el6.noarch.rpm

[root@centos ~]# rpm -ivh automake-1.11.1-4.el6.noarch.rpm

Build hadoop-2.6.4:

The virtual machine must be able to reach the Internet. NAT mode is used here; the VM's IP address is set statically, with the gateway configured as 192.168.72.2.

Edit the VM's DNS configuration file /etc/resolv.conf:

nameserver 192.168.72.2

· From the Apache site, download the hadoop-2.6.4 source package hadoop-2.6.4-src.tar.gz.

· Extract the source package: tar -zxvf hadoop-2.6.4-src.tar.gz

· Enter the extracted hadoop-2.6.4-src directory: cd /root/hadoop-2.6.4-src

· Run mvn package -DskipTests -Pdist,native -Dtar to build.

· The build downloads many dependencies and takes quite a while. When every Hadoop module has built successfully, that is, when a series of SUCCESS lines appears, the build is done.

· The finished package hadoop-2.6.4.tar.gz can be found under /root/hadoop-2.6.4-src/hadoop-dist/target.

Steps:

[root@centos ~]# tar zxvf hadoop-2.6.4-src.tar.gz

[root@centos ~]# cd hadoop-2.6.4-src

[root@centos hadoop-2.6.4-src]#mvn package -DskipTests -Pdist,native -Dtar

 

[INFO] Apache Hadoop Tools ................................ SUCCESS [  0.041 s]

[INFO] Apache Hadoop Distribution ......................... SUCCESS [ 40.865 s]

[INFO] ------------------------------------------------------------------------

[INFO] BUILD SUCCESS

[INFO] ------------------------------------------------------------------------

[INFO] Total time: 03:43 h

[INFO] Finished at: 2016-07-13T19:13:56+08:00

[INFO] Final Memory: 100M/237M

