
desehawk's space https://www.aboutyun.com/?29


Monitoring a Hadoop Cluster with Ganglia

Popularity 1 · 1919 views · 2015-4-27 00:13

Environment:

Server name   Role              IP
Client        monitoring node   10.0.2.59
nn01          monitored         10.0.2.220
jt01          monitored         10.0.2.219
dn01          monitored         10.0.2.216

Hadoop cluster (nn01, jt01, dn01); monitoring node: Client
Covers both Hadoop 1.2.1 and Hadoop 2.5.2.

Deploying the monitoring node:

Download the packages:

EPEL repository:
http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm

ganglia 3.6:
http://ftp.jaist.ac.jp/pub/sourceforge/g/ga/ganglia/ganglia%20monitoring%20core/3.6.1/ganglia-3.6.1.tar.gz

ganglia-web 3.6:
http://ftp.jaist.ac.jp/pub/sourceforge/g/ga/ganglia/ganglia-web/3.6.2/ganglia-web-3.6.2.tar.gz

Install the dependencies:
[root@Client src]# yum -y install httpd-devel automake autoconf libtool ncurses-devel libxslt groff pcre pcre-devel pkgconfig rrdtool* apr-devel apr-util check-devel cairo-devel pango-devel libxml2-devel rpm-build glib2-devel dbus-devel freetype-devel fontconfig-devel gcc-c++ expat-devel python-devel libXrender-devel libconfuse*

Build from source:

Install libconfuse:
wget http://savannah.nongnu.org/download/confuse/confuse-2.7.tar.gz
tar -zxvf confuse-2.7.tar.gz
cd confuse-2.7
./configure CFLAGS=-fPIC --disable-nls
make && make install

Install ganglia:
tar zxf ganglia-3.6.1.tar.gz
cd ganglia-3.6.1
./configure --prefix=/usr/local/ganglia --with-gmetad --with-librrd --sysconfdir=/etc/ganglia
make
make install

Register gmond and gmetad as system services:
cp gmond/gmond.init /etc/rc.d/init.d/gmond 
cp gmetad/gmetad.init /etc/rc.d/init.d/gmetad 
chkconfig --add gmond && chkconfig gmond on
chkconfig --add gmetad && chkconfig gmetad on

Configure gmetad and set up the Ganglia web frontend:
mkdir -p /var/lib/ganglia/rrds

tar zxf ganglia-web-3.6.2.tar.gz
cd ganglia-web-3.6.2
make install
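
ganglia-web's `make install` copies the PHP frontend into Apache's document root and chowns it to the web user, driven by variables at the top of its Makefile. If your paths or Apache user differ, edit them before installing — the names below are assumptions based on the ganglia-web 3.6 tarball, so verify against your copy of the Makefile:

```
# defaults near the top of ganglia-web's Makefile
# (variable names assumed from the 3.6 tarball; check your copy)
GDESTDIR = /var/www/html/ganglia
APACHE_USER = apache
```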

Link the commands into the PATH and fix ownership:

ln -s /usr/local/ganglia/bin/* /usr/bin/
ln -s /usr/local/ganglia/sbin/* /usr/sbin/
chown -R apache:apache /var/lib/ganglia

Basic Ganglia configuration

Generate the default gmond configuration file:
gmond -t |tee /etc/ganglia/gmond.conf

Edit the Ganglia configuration files:
vim gmetad.conf
# "Hadoop cluster" is the cluster name; point it at the monitoring node
data_source "Hadoop cluster" 10.0.2.59

vim gmond.conf
cluster {
# must match the name in gmetad.conf
name = "Hadoop cluster"
}
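
The cluster name above sits alongside gmond's transport channels in the same file. For reference, a sketch of the default channel sections that `gmond -t` generates — these are the stock 3.x defaults, and the multicast address is the 239.2.11.71 that appears later in the Hadoop metrics settings; verify against your generated file:

```
udp_send_channel {
  mcast_join = 239.2.11.71
  port = 8649
  ttl = 1
}

udp_recv_channel {
  mcast_join = 239.2.11.71
  port = 8649
  bind = 239.2.11.71
}

tcp_accept_channel {
  port = 8649
}
```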

Start Ganglia and visit its web page:
/etc/init.d/gmond restart
/etc/init.d/gmetad restart
/etc/init.d/httpd restart
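
Once the daemons are up, gmond answers TCP connections on its tcp_accept_channel (port 8649 by default) with an XML dump of every metric it knows about, which is a quick way to confirm metrics are flowing before checking the web page. A small Python sketch, assuming the default port and this article's monitoring-node IP:

```python
import socket
import xml.etree.ElementTree as ET

def fetch_gmond_xml(host="10.0.2.59", port=8649):
    """Read the full XML report gmond sends on connect (assumes the
    default tcp_accept_channel; host/port are this article's values)."""
    chunks = []
    with socket.create_connection((host, port), timeout=5) as s:
        while True:
            data = s.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode()

def parse_ganglia_xml(xml_text):
    """Extract {host: {metric_name: value}} from gmond's XML dump."""
    root = ET.fromstring(xml_text)
    result = {}
    for host in root.iter("HOST"):
        metrics = {}
        for metric in host.iter("METRIC"):
            metrics[metric.get("NAME")] = metric.get("VAL")
        result[host.get("NAME")] = metrics
    return result

# usage (run on a machine that can reach the monitoring node):
#   metrics = parse_ganglia_xml(fetch_gmond_xml("10.0.2.59"))
#   print(metrics.keys())   # should list nn01, jt01, dn01 once they report
```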

Copy /etc/ganglia/, /usr/local/ganglia/ and /etc/init.d/gmond from the monitoring node to the corresponding locations on each monitored node, e.g.:
scp -r /etc/ganglia/ nn01:/etc/

Configure Hadoop metrics on the monitored nodes:
vim /usr/local/hadoop/conf/hadoop-metrics2.properties
# for Ganglia 3.1 support
# GangliaSink31 speaks the Ganglia 3.1+ wire protocol, which 3.6 still uses
*.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
*.sink.ganglia.period=10
# default for supportsparse is false
*.sink.ganglia.supportsparse=true
*.sink.ganglia.slope=jvm.metrics.gcCount=zero,jvm.metrics.memHeapUsedM=both
*.sink.ganglia.dmax=jvm.metrics.threadsBlocked=70,jvm.metrics.memHeapUsedM=40
namenode.sink.ganglia.servers=239.2.11.71:8649
datanode.sink.ganglia.servers=239.2.11.71:8649
jobtracker.sink.ganglia.servers=239.2.11.71:8649
tasktracker.sink.ganglia.servers=239.2.11.71:8649
maptask.sink.ganglia.servers=239.2.11.71:8649
reducetask.sink.ganglia.servers=239.2.11.71:8649
Note: 239.2.11.71 is Ganglia's default multicast address; it does not need to be changed to the gmetad server's address.
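
The multicast default works only where the network allows multicast; on networks that block it (common in cloud environments), a unicast setup is an alternative. A sketch, assuming every gmond is pointed at the monitoring node from the environment table (10.0.2.59):

```
/* gmond.conf on every node: replace the mcast channels with unicast */
udp_send_channel {
  host = 10.0.2.59   /* the gmetad/monitoring node */
  port = 8649
}

/* only needed on the node that gmetad polls */
udp_recv_channel {
  port = 8649
}
```

The `*.sink.ganglia.servers` entries in hadoop-metrics2.properties would then point at `10.0.2.59:8649` instead of the multicast address.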

To monitor HBase as well, find the same file under the HBase directory and change it in the same way. After the changes, distribute the configuration file to ${HADOOP_HOME}/conf on every datanode and restart the Hadoop cluster.

Monitoring multiple clusters:
Edit the file:
[root@Client ganglia]# vim gmetad.conf 
data_source "Hadoop cluster01" 10.0.2.59
data_source "Hadoop cluster02" 10.0.2.54

Configuration on Hadoop cluster01 node servers:
vim gmond.conf
cluster {
name = "Hadoop cluster01"
owner = "unspecified"
latlong = "unspecified"
url = "unspecified"
}

Configuration on Hadoop cluster02 node servers:
vim gmond.conf
cluster {
name = "Hadoop cluster02"
owner = "unspecified"
latlong = "unspecified"
url = "unspecified"
}
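
Distinct cluster names alone are not enough if both clusters share the default multicast group and port: their gmonds would see each other's traffic. A common fix is to give each cluster its own port (or its own multicast address) and tell gmetad which port to poll. A sketch, with port choices that are illustrative, not from the original article:

```
# gmetad.conf on the monitoring node
data_source "Hadoop cluster01" 10.0.2.59:8649
data_source "Hadoop cluster02" 10.0.2.54:8650

# gmond.conf on cluster02 nodes: move all channels to 8650
udp_send_channel { mcast_join = 239.2.11.71 port = 8650 ttl = 1 }
udp_recv_channel { mcast_join = 239.2.11.71 port = 8650 bind = 239.2.11.71 }
tcp_accept_channel { port = 8650 }
```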

Hadoop 2.5.2:
Both the hadoop-metrics.properties and hadoop-metrics2.properties configuration files need to be edited.

[root@hnn01 hadoop]# grep -v "^#" hadoop-metrics.properties | grep -v "^$"
dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
dfs.period=10
dfs.servers=hnn01:8649
mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
mapred.period=10
mapred.servers=hnn01:8649
jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
jvm.period=10
jvm.servers=hnn01:8649
rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
rpc.period=10
rpc.servers=hnn01:8649
ugi.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
ugi.period=10
ugi.servers=hnn01:8649

[root@hnn01 hadoop]# grep -v "^#" hadoop-metrics2.properties | grep -v "^$"
*.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink
*.period=10
*.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
*.sink.ganglia.period=10
*.sink.ganglia.slope=jvm.metrics.gcCount=zero,jvm.metrics.memHeapUsedM=both
*.sink.ganglia.dmax=jvm.metrics.threadsBlocked=70,jvm.metrics.memHeapUsedM=40
namenode.sink.ganglia.servers=hnn01:8649
resourcemanager.sink.ganglia.servers=hnn01:8649
datanode.sink.ganglia.servers=hnn01:8649
nodemanager.sink.ganglia.servers=hnn01:8649
maptask.sink.ganglia.servers=hnn01:8649
reducetask.sink.ganglia.servers=hnn01:8649

Restart the Hadoop cluster for the configuration to take effect.
