Big Data: HBase Installation (Part 3)

Posted 2016-11-15 10:22

The overall installation environment is the same as in Big Data: Hadoop Cluster Installation (Part 1).

1. Installation:

Open the official documentation page: http://hbase.apache.org/book.html#java

Find the HBase release that matches your Hadoop version and download it; here we chose HBase 1.2.2.

Package download address:

http://mirrors.hust.edu.cn/apache/hbase/1.2.2/

Download the package and upload it to /home/hadoop on node 1:

hbase-1.2.2-bin.tar.gz 
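For instance, the package can be fetched directly on node 1 with wget (assuming the mirror above still serves this version):

[hadoop@hadoop1 ~]$ wget http://mirrors.hust.edu.cn/apache/hbase/1.2.2/hbase-1.2.2-bin.tar.gz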

[hadoop@hadoop1 ~]$ tar zxvf hbase-1.2.2-bin.tar.gz 

Go into the hbase lib directory and check the version of the bundled hadoop jars:

[hadoop@hadoop1 lib]$ find -name 'hadoop*jar'

./hadoop-mapreduce-client-core-2.5.1.jar

./hadoop-yarn-server-common-2.5.1.jar

./hadoop-mapreduce-client-app-2.5.1.jar

./hadoop-yarn-common-2.5.1.jar

./hadoop-yarn-client-2.5.1.jar

./hadoop-auth-2.5.1.jar

./hadoop-mapreduce-client-jobclient-2.5.1.jar

./hadoop-mapreduce-client-common-2.5.1.jar

./hadoop-hdfs-2.5.1.jar

./hadoop-yarn-api-2.5.1.jar

./hadoop-common-2.5.1.jar

./hadoop-annotations-2.5.1.jar

./hadoop-mapreduce-client-shuffle-2.5.1.jar

./hadoop-client-2.5.1.jar

 

[hadoop@hadoop1 bin]$ hadoop version

Hadoop 2.6.4

Subversion Unknown -r Unknown

Compiled by root on 2016-07-13T09:54Z

Compiled with protoc 2.5.0

From source with checksum 8dee2286ecdbbbc930a6c87b65cbc010

This command was run using /home/hadoop/hadoop-2.6.4/share/hadoop/common/hadoop-common-2.6.4.jar

 

The bundled jars (2.5.1) do not match the Hadoop cluster's version (2.6.4), so the jars under hbase/lib must be replaced with the ones from the Hadoop installation. This must be done on every node.

Write a script to perform the replacement, as shown below:

[hadoop@hadoop1 lib]$ vim f.sh

#!/bin/bash
# List the bundled hadoop jars, map their version from 2.5.1 to 2.6.4,
# and strip the leading ./ so f.log holds bare file names.
find -name "hadoop*jar" | sed 's/2.5.1/2.6.4/g' | sed 's/\.\///g' > f.log

# Remove the old 2.5.1 jars.
rm ./hadoop*jar

# Copy each corresponding 2.6.4 jar from the Hadoop installation into hbase/lib.
cat ./f.log | while read Line
do
    find /home/hadoop/hadoop-2.6.4/share/hadoop -name "$Line" | xargs -i cp {} ./
done

rm ./f.log

 

[hadoop@hadoop1 lib]$ chmod u+x f.sh

[hadoop@hadoop1 lib]$ ./f.sh

[hadoop@hadoop1 lib]$ find -name 'hadoop*jar'

OK, the jar replacement succeeded.

Alternatively, perform the replacement directly:

Delete the old jars:

[hadoop@hadoop1 lib]$ rm ./hadoop*jar

Copy the jars from the Hadoop installation into hbase/lib:

[hadoop@hadoop1 lib]$ find /home/hadoop/hadoop-2.6.4/share/hadoop -name 'hadoop*jar' | xargs -i cp {} ./

 

#######################

hbase/lib also contains an slf4j-log4j12-XXX.jar. On a machine where Hadoop is installed, Hadoop's copy of this jar is already on the classpath, which causes a conflict, so delete the HBase copy:

[hadoop@hadoop1 lib]$ rm `find -name 'slf4j-log4j12-*jar'`

 

The step above is problematic: rather than just deleting it, the slf4j-log4j12-XXX.jar from the Hadoop installation should replace the one in hbase/lib:

######################################

From the Hadoop lib directory (/home/hadoop/hadoop-2.6.4/share/hadoop/common/lib):

[hadoop@hadoop1 lib]$ cp slf4j-* /home/hadoop/hbase-1.2.2/lib/

[hadoop@hadoop1 lib]$ scp slf4j-* hadoop@hadoop2:/home/hadoop/hbase-1.2.2/lib/

slf4j-api-1.7.5.jar          100%   25KB  25.5KB/s   00:00
slf4j-log4j12-1.7.5.jar      100% 8869     8.7KB/s   00:00

[hadoop@hadoop1 lib]$ scp slf4j-* hadoop@hadoop3:/home/hadoop/hbase-1.2.2/lib/

slf4j-api-1.7.5.jar          100%   25KB  25.5KB/s   00:00
slf4j-log4j12-1.7.5.jar      100% 8869     8.7KB/s   00:00

[hadoop@hadoop1 lib]$

2. Modify the configuration:

[hadoop@hadoop1 conf]$ vi hbase-env.sh

export JAVA_HOME=/opt/jdk1.7.0_79

export HBASE_CLASSPATH=/home/hadoop/hadoop-2.6.4/etc/hadoop

# path to the Hadoop configuration files

 

# export HBASE_MANAGES_ZK=true was left unset, so HBase's bundled ZooKeeper is not used.

This cluster uses a separately installed ZooKeeper, so the option must be explicitly set to false:

export HBASE_MANAGES_ZK=false

 

[hadoop@hadoop1 conf]$ vi hbase-site.xml

<property>
  <name>hbase.master</name>
  <value>192.168.72.131:6000</value>
</property>
<property>
  <name>hbase.master.maxclockskew</name>
  <value>180000</value>
</property>
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://192.168.72.131:9000/hbase</value>
</property>
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>
<property>
  <name>hbase.tmp.dir</name>
  <value>/home/hadoop/hbase-1.2.2/tmp</value>
</property>
<property>
  <name>hbase.master.info.port</name>
  <value>60010</value>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>hadoop1,hadoop2,hadoop3</value>
</property>
<property>
  <name>hbase.zookeeper.property.dataDir</name>
  <value>/home/hadoop/zookeeper-3.4.8/tmp/data</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>

 

Here, hbase.master specifies the server and port that HMaster runs on. hbase.master.maxclockskew guards against regionserver startup failures caused by clock differences between HBase nodes; the default is 30000. hbase.rootdir specifies HBase's storage directory. hbase.cluster.distributed puts the cluster into distributed mode. hbase.zookeeper.quorum lists the hostnames of the ZooKeeper nodes; the number of entries must be odd. hbase.zookeeper.property.dataDir sets ZooKeeper's data directory, which defaults to /tmp. dfs.replication sets the number of data replicas; it needs to be lowered when the cluster has fewer than 3 nodes, so it is set to 2 here.
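Since hbase.master.maxclockskew only tolerates skew rather than fixing it, it may be worth confirming up front that the nodes' clocks roughly agree. A minimal check, assuming passwordless ssh between the nodes as set up for Hadoop:

[hadoop@hadoop1 ~]$ for h in hadoop1 hadoop2 hadoop3; do ssh $h date; done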

 

[hadoop@hadoop1 conf]$ vi regionservers

hadoop2

hadoop3

 

Configure environment variables in /etc/profile:

export HBASE_HOME=/home/hadoop/hbase-1.2.2

export PATH=$PATH:$HBASE_HOME/bin
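After editing, reload the profile so the new variables take effect in the current shell (new logins pick them up automatically):

[hadoop@hadoop1 ~]$ source /etc/profile
[hadoop@hadoop1 ~]$ which hbase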

 

Distribute hbase to the other nodes:

[hadoop@hadoop1 ~]$ scp -r hbase-1.2.2 hadoop@hadoop2:/home/hadoop/

[hadoop@hadoop1 ~]$ scp -r hbase-1.2.2 hadoop@hadoop3:/home/hadoop/

3. Start HBase

[hadoop@hadoop1 bin]$ ./start-hbase.sh

 

To start the service on an individual node separately (here, the regionserver daemon on node 3):

[hadoop@hadoop3 bin]$ ./hbase-daemon.sh start regionserver
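A quick way to confirm the daemons are up is jps: the master node should list an HMaster process, and each regionserver node an HRegionServer process:

[hadoop@hadoop1 bin]$ jps
[hadoop@hadoop3 bin]$ jps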

 

An error at startup:

Caused by: java.lang.NoClassDefFoundError: com/amazonaws/auth/AWSCredentialsProvider

The aws jar package aws-java-sdk-1.7.4.1 is missing.

Download it from the internet and copy it into the hbase lib directory.
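For example, assuming the jar was downloaded to /home/hadoop on node 1 (the exact file name may differ by download source):

[hadoop@hadoop1 ~]$ cp aws-java-sdk-1.7.4.1.jar /home/hadoop/hbase-1.2.2/lib/
[hadoop@hadoop1 ~]$ scp aws-java-sdk-1.7.4.1.jar hadoop@hadoop2:/home/hadoop/hbase-1.2.2/lib/
[hadoop@hadoop1 ~]$ scp aws-java-sdk-1.7.4.1.jar hadoop@hadoop3:/home/hadoop/hbase-1.2.2/lib/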

 

Another error at startup:

Caused by: java.lang.ClassNotFoundException: org.htrace.Trace

This is caused by a missing htrace jar; copy the one from Hadoop over:

[hadoop@hadoop1 lib]$ cp htrace-core-3.0.4.jar /home/hadoop/hbase-1.2.2/lib/

[hadoop@hadoop1 lib]$ scp htrace-core-3.0.4.jar hadoop@hadoop2:/home/hadoop/hbase-1.2.2/lib/

htrace-core-3.0.4.jar        100%   30KB  30.5KB/s   00:00

[hadoop@hadoop1 lib]$ scp htrace-core-3.0.4.jar hadoop@hadoop3:/home/hadoop/hbase-1.2.2/lib/

At startup, node 3 reported an error and its service would not start:

/hbase/WALs/hadoop3,16020,1477533677583-splitting is non empty': Directory is not empty

Fix:

Delete the corresponding path on HDFS (note that this discards any unreplayed write-ahead-log data):

[hadoop@hadoop3 current]$ hadoop fs -rm -r /hbase/WALs
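After clearing the directory, restart the regionserver on node 3 with the same hbase-daemon.sh command shown earlier:

[hadoop@hadoop3 bin]$ ./hbase-daemon.sh start regionserver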

 

Testing:

Check the database status from the web front end:

 

http://192.168.72.131:60010/master-status

 

Inspect the hbase database:

[hadoop@hadoop1 bin]$ hbase shell

View the help:

hbase(main):001:0> help

Check the status:

hbase(main):002:0> status

List tables:

hbase(main):012:0> list

 

Create a table:

hbase(main):013:0> create  'cf','name','sex','edu'

View the table schema:

hbase(main):016:0> desc "cf"

 

When creating the table, HBase reported that the table 'cf' already existed. This is because the same table had been created during a previous installation; everything related to HBase on HDFS had already been deleted, so the likely cause was stale state left in ZooKeeper. Log in to the HMaster node and run zkCli.sh -server 192.168.72.131:2181

Then execute: [zk: 192.168.72.131:2181(CONNECTED) 1] ls /hbase/table

[hbase:meta, hbase:namespace, cf]

ZooKeeper shows the table still exists, so delete cf:

[zk: 192.168.72.131:2181(CONNECTED) 3] rmr /hbase/table/cf

[zk: 192.168.72.131:2181(CONNECTED) 4] ls /hbase/table

[hbase:meta, hbase:namespace]

It no longer exists.

Re-enter the hbase shell and create now succeeds.

Insert data:

Syntax: put <table>,<rowkey>,<family:column>,<value>,<timestamp>

hbase(main):024:0> put 'cf','one','name','chenfeng'
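A few more hypothetical puts against the same row, filling in the other column families (the values are made up for illustration):

hbase(main):025:0> put 'cf','one','sex','male'
hbase(main):026:0> put 'cf','one','edu','master'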

 

Query:

Syntax: get <table>,<rowkey>,[<family:column>,....]
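For example, continuing with the row inserted above (prompt numbers are illustrative):

hbase(main):027:0> get 'cf','one'
hbase(main):028:0> get 'cf','one','name'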

 

Query the data of one column (here against a table named word):

scan 'word',COLUMNS => 'f1'

 

Check data consistency with hbase hbck:
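Run it without arguments to check everything, or pass a table name to limit the check:

[hadoop@hadoop1 bin]$ hbase hbck
[hadoop@hadoop1 bin]$ hbase hbck word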

 

hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=info:userid,HBASE_ROW_KEY,info:netid test2 /application/logAnalyse/test/test3.dat

 

hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=num,aa,HBASE_ROW_KEY,num word /data/input/word.txt

 

hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns='HBASE_ROW_KEY,f1,f2' word /home/hadoop/word.txt

Use the second column as the row key:

hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=f1,HBASE_ROW_KEY word /home/hadoop/word.txt

Query the data of one column:

scan 'word',COLUMNS => 'f3'

Query data at a given timestamp:

scan 'word',TIMESTAMP=>1477548320766

hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=f1,HBASE_ROW_KEY,f3 word /home/hadoop/word.txt
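For reference, a sketch of what /home/hadoop/word.txt might look like for the command above, with made-up tab-separated fields; the second field becomes the row key, while the first and third go to f1 and f3:

aaa	one	ccc
ddd	two	fff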

 

The job reported the following messages:

2016-09-26 11:15:56,285 INFO  [main] client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:18032

2016-09-26 11:15:56,739 INFO  [main] Configuration.deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum

2016-09-26 11:15:58,098 INFO  [main] ipc.Client: Retrying connect to server: localhost/127.0.0.1:18032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

2016-09-26 11:15:59,100 INFO  [main] ipc.Client: Retrying connect to server: localhost/127.0.0.1:18032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

2016-09-26 11:16:00,102 INFO  [main] ipc.Client: Retrying connect to server: localhost/127.0.0.1:18032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

2016-09-26 11:16:01,103 INFO  [main] ipc.Client: Retrying connect to server: localhost/127.0.0.1:18032. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

The final check showed that the ResourceManager address and port configured in Hadoop's yarn-site.xml did not match the ResourceManager address and port HBase was calling: HBase was using the default values of yarn-site.xml, not the manually configured ones. Copying Hadoop's yarn-site.xml into HBase's configuration directory and restarting HBase solved the problem.
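A sketch of the copy, using the paths from this install; distributing the file to the other nodes as well keeps the configurations consistent:

[hadoop@hadoop1 ~]$ cp /home/hadoop/hadoop-2.6.4/etc/hadoop/yarn-site.xml /home/hadoop/hbase-1.2.2/conf/
[hadoop@hadoop1 ~]$ scp /home/hadoop/hadoop-2.6.4/etc/hadoop/yarn-site.xml hadoop@hadoop2:/home/hadoop/hbase-1.2.2/conf/
[hadoop@hadoop1 ~]$ scp /home/hadoop/hadoop-2.6.4/etc/hadoop/yarn-site.xml hadoop@hadoop3:/home/hadoop/hbase-1.2.2/conf/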

 

Import a file into HBase:

Create the table in hbase:

hbase(main):002:0> create 'hbase_test','info'

0 row(s) in 1.3120 seconds

FTP the file to the hadoop machine, then load it into HDFS:

hadoop fs -mkdir -p /data/input

[hadoop@hadoop1 bin]$ cd /home/hadoop/

[hadoop@hadoop1 ~]$ hadoop fs -put hs_alt_chr2.fa /data/input/

[hadoop@hadoop1 conf]$ hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=HBASE_ROW_KEY,info -Dimporttsv.separator=, hbase_test /data/input/hs_alt_chr2.fa
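Afterwards, a quick scan can confirm the rows landed; LIMIT keeps the output manageable for a large file:

hbase(main):003:0> scan 'hbase_test', {LIMIT => 5}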

