HDFS 3.1.2 HA Distributed Installation and Deployment

1. Environment: CentOS 7, hadoop-3.1.2, zookeeper-3.4.14, 3 nodes (192.168.56.60, 192.168.56.62, 192.168.56.64). Role assignment:

centos60: NameNode, ZooKeeper, DataNode, JournalNode
centos62: NameNode, ZooKeeper, DataNode, JournalNode
centos64: ZooKeeper, DataNode, JournalNode

2. HA: high availability. In this post, HA for HDFS means NameNode high availability (supported since Hadoop 2.x); DataNodes are inherently HA already. Only the HA-specific steps are recorded here; for the basic distributed install, see my other post on HDFS distributed installation (step by step), which this guide assumes you have already followed.

3. How HA works, in brief:

The reason a ZooKeeper ensemble can keep the NameNode service highly available is this: the Hadoop cluster runs two NameNodes, and each NameNode host runs a ZKFC (ZKFailoverController) that monitors its local NameNode and keeps a heartbeat session with ZooKeeper saying "I am alive and can serve". At any given time only one NameNode is in the Active state and the other is Standby. As soon as ZooKeeper stops receiving the Active NameNode's heartbeat, failover switches to the Standby NameNode and promotes it to Active, so the cluster always has a usable NameNode, which is exactly the goal of NameNode high availability.

To keep the Standby node's state synchronized with the Active node, both nodes communicate with a group of separate daemons called JournalNodes (JNs). When the Active node performs any namespace modification, it durably logs a record of the modification to a majority of these JNs. The Standby node reads the edits from the JNs and constantly watches them for changes to the edit log; as it sees new edits, it applies them to its own namespace. In the event of a failover, the Standby ensures that it has read all of the edit log content from the JournalNodes before promoting itself to the Active state, which guarantees the namespace state is fully synchronized before a failover occurs.
To provide fast failover, the Standby node must also have up-to-date information about the location of blocks in the cluster. To achieve this, the DataNodes are configured with the locations of all NameNodes and send block location information and heartbeats to all of them.

There must be at least 3 JournalNode daemons, because edit log modifications have to be written to a majority of the JNs; this lets the system tolerate the failure of a single machine. With N JournalNodes running, the system can tolerate at most (N-1)/2 failures and keep operating normally (for example, 3 JNs tolerate 1 failure, 5 JNs tolerate 2).
In an HA cluster the Standby NameNode also performs checkpoints of the namespace state, so there is no need to run a Secondary NameNode in an HA cluster.

A new HA feature in Hadoop 3.x: multiple Standby NameNodes are supported (2.x supported only one). So why am I not deploying a NameNode on centos64? Because 3 JournalNodes tolerate only one JournalNode failure, this 3-server cluster can only survive one machine going down; a third NameNode would add nothing, since losing two machines is not tolerated anyway. With enough servers you can deploy 3 NameNodes (official docs: the minimum number of HA NameNodes is 2, but you may configure more; more than 5 is not recommended because of communication overhead, and 3 NameNodes is the recommendation). For example, with 5 servers you could run 5 JournalNodes and 3 NameNodes, and then any 2 servers could go down. A hedged config sketch for the extra NameNode entries is shown below.
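For reference, a sketch of the extra hdfs-site.xml entries a third NameNode would need (centos66 is a hypothetical host, not part of this deployment; the property names follow the pattern shown later in section 5.2):

  <!-- hypothetical: three NameNode IDs instead of two -->
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>centos60,centos62,centos66</value>
  </property>
  <!-- hypothetical: addresses for the extra NameNode -->
  <property>
    <name>dfs.namenode.rpc-address.mycluster.centos66</name>
    <value>centos66:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.centos66</name>
    <value>centos66:50070</value>
  </property>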

4. Stopping the running non-HA cluster (I assume you deployed it by following the post referenced in section 2; the official docs describe converting a running non-HA cluster to HA in place, but I wanted to deploy it myself from scratch first, so I did not use that path).

4.1 Stop the cluster:

[root@centos60 hadoop]# jps
1682025 NameNode
1682520 DataNode
1262406 Jps
1683780 SecondaryNameNode
[root@centos60 hadoop]# stop-dfs.sh 
[root@centos60 hadoop]# jps
1266880 Jps

4.2 Delete the old cluster's persisted files (metadata and data directories):

[root@centos60 hadoop]# ls
datanode  namenode  tmp
[root@centos60 hadoop]# rm -rf *

[root@centos62 hadoop]# ls
datanode
[root@centos62 hadoop]# rm -rf *

[root@centos64 hadoop]# ls
datanode
[root@centos64 hadoop]# rm -rf *

5. Configuration changes for HA

5.1 Changes to core-site.xml:

[root@centos60 hadoop]# cd /usr/local/hadoop-3.1.2/
[root@centos60 hadoop-3.1.2]# ls
bin  etc  include  lib  libexec  LICENSE.txt  logs  NOTICE.txt  README.txt  sbin  share
[root@centos60 hadoop-3.1.2]# cd etc/hadoop/
[root@centos60 hadoop]# ls
capacity-scheduler.xml  hadoop-metrics2.properties        httpfs-signature.secret  log4j.properties            shellprofile.d                 yarn-env.sh
configuration.xsl       hadoop-policy.xml                 httpfs-site.xml          mapred-env.cmd              ssl-client.xml.example         yarnservice-log4j.properties
container-executor.cfg  hadoop-user-functions.sh.example  kms-acls.xml             mapred-env.sh               ssl-server.xml.example         yarn-site.xml
core-site.xml           hdfs-site.xml                     kms-env.sh               mapred-queues.xml.template  user_ec_policies.xml.template
hadoop-env.cmd          httpfs-env.sh                     kms-log4j.properties     mapred-site.xml             workers
hadoop-env.sh           httpfs-log4j.properties           kms-site.xml             root                        yarn-env.cmd
[root@centos60 hadoop]# vi core-site.xml
<configuration>
        <!-- nameservice ID: under HA, clients connect to the nameservice mycluster instead of a single NameNode address -->
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://mycluster</value>
        </property>
        <!-- Base temp directory: default data directory for NameNode, DataNode, JournalNode, etc.; each of these can also be overridden separately below -->
        <property>
                <name>hadoop.tmp.dir</name>
                <value>/hadoop/tmp</value>
        </property>
        <!-- ZooKeeper quorum addresses -->
        <property>
                <name>ha.zookeeper.quorum</name>
                <value>centos60:2181,centos62:2181,centos64:2181</value>
        </property>
</configuration>

5.2 Changes to hdfs-site.xml:

<configuration>
  <!-- The HDFS nameservice; must match the one in core-site.xml -->
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>

  <!-- The two NameNodes under mycluster -->
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>centos60,centos62</value>
  </property>

  <!-- RPC address; RPC is used to communicate with clients and DataNodes -->
  <property>
    <name>dfs.namenode.rpc-address.mycluster.centos60</name>
    <value>centos60:9000</value>
  </property>

  <!-- Since Hadoop 3 the default HTTP port is 9870; set to 50070 here for compatibility with the previous setup -->
  <property>
    <name>dfs.namenode.http-address.mycluster.centos60</name>
    <value>centos60:50070</value>
  </property>

  <property>
    <name>dfs.namenode.rpc-address.mycluster.centos62</name>
    <value>centos62:9000</value>
  </property>

  <property>
    <name>dfs.namenode.http-address.mycluster.centos62</name>
    <value>centos62:50070</value>
  </property>

  <!-- URI of the JNs where the NameNodes read and write edits -->
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://centos60:8485;centos62:8485;centos64:8485/mycluster</value>
  </property>

  <!-- Local directory where the JournalNode stores its data -->
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/hadoop/journalnode</value>
  </property>

  <!-- Enable automatic failover when the Active NameNode fails -->
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>

  <!-- Java class used by HDFS clients to contact the Active NameNode -->
  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>

  <!-- Fencing method used to isolate the old Active NameNode during failover; if ssh runs on the default port 22, just use sshfence -->
  <!-- If ssh is not on port 22, use sshfence(hadoop:22022), where 22022 is the custom ssh port -->
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>

  <!-- sshfence requires passwordless ssh; path to the ssh private key -->
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
  </property>

  <!-- NameNode data directory -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/hadoop/namenode</value>
  </property>

  <!-- DataNode data directory -->
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/hadoop/datanode</value>
  </property>

  <!-- Number of block replicas -->
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>

</configuration>

5.3 scp the updated configuration files to the other 2 nodes:

[root@centos60 hadoop]# scp core-site.xml root@centos62:/usr/local/hadoop-3.1.2/etc/hadoop/core-site.xml 
core-site.xml                                                                                                                               100% 1280     1.4MB/s   00:00    
[root@centos60 hadoop]# scp core-site.xml root@centos64:/usr/local/hadoop-3.1.2/etc/hadoop/core-site.xml 
core-site.xml                                                                                                                               100% 1280   289.1KB/s   00:00    
[root@centos60 hadoop]# scp hdfs-site.xml root@centos62:/usr/local/hadoop-3.1.2/etc/hadoop/hdfs-site.xml 
hdfs-site.xml                                                                                                                               100% 3393     3.6MB/s   00:00    
[root@centos60 hadoop]# scp hdfs-site.xml root@centos64:/usr/local/hadoop-3.1.2/etc/hadoop/hdfs-site.xml 
hdfs-site.xml                                                                                                                               100% 3393     1.3MB/s   00:00    
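As a quick sanity check after copying (my own habit, not part of the original steps), you can have Hadoop print the effective values on each node:

# should print: mycluster
hdfs getconf -confKey dfs.nameservices
# should list: centos60 centos62
hdfs getconf -namenodes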

6. Downloading and installing ZooKeeper:

Download: http://mirror.bit.edu.cn/apache/zookeeper/zookeeper-3.4.14/zookeeper-3.4.14.tar.gz (the official 3.5 package had problems at the time)

Extract and install:

[root@centos60 tmp]# tar zxvf zookeeper-3.4.14.tar.gz -C /usr/local/
[root@centos60 tmp]# cd /usr/local/zookeeper-3.4.14/
[root@centos60 zookeeper-3.4.14]# ls
bin        dist-maven       lib          pom.xml               src                       zookeeper-3.4.14.jar.md5   zookeeper-contrib  zookeeper-jute
build.xml  ivysettings.xml  LICENSE.txt  README.md             zookeeper-3.4.14.jar      zookeeper-3.4.14.jar.sha1  zookeeper-docs     zookeeper-recipes
conf       ivy.xml          NOTICE.txt   README_packaging.txt  zookeeper-3.4.14.jar.asc  zookeeper-client           zookeeper-it       zookeeper-server
[root@centos60 zookeeper-3.4.14]# cd conf/
[root@centos60 conf]# ls
configuration.xsl  log4j.properties  zoo_sample.cfg
[root@centos60 conf]# cp zoo_sample.cfg  zoo.cfg
[root@centos60 conf]# vi zoo.cfg 
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial 
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just 
# example sakes.
dataDir=/usr/local/zookeeper-3.4.14/data
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the 
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1

server.1=centos60:2888:3888
server.2=centos62:2888:3888
server.3=centos64:2888:3888

Two changes were made to zoo.cfg: dataDir is pointed at /usr/local/zookeeper-3.4.14/data, and the three server.N entries shown above are appended at the end of the file.

Create the myid file:

[root@centos60 conf]# cd ..
[root@centos60 zookeeper-3.4.14]# mkdir data
[root@centos60 zookeeper-3.4.14]# cd data/
[root@centos60 data]# touch myid
[root@centos60 data]# echo 1 >> myid 
[root@centos60 data]# cat myid 
1

scp the ZooKeeper directory to the other 2 nodes and change the myid value on each:

[root@centos60 local]# scp -r zookeeper-3.4.14 root@centos62:/usr/local/zookeeper-3.4.14/
[root@centos60 local]# scp -r zookeeper-3.4.14 root@centos64:/usr/local/zookeeper-3.4.14/

[root@centos62 ~]# cd /usr/local/zookeeper-3.4.14/data
[root@centos62 data]# vi myid 
2

[root@centos64 ~]# cd /usr/local/zookeeper-3.4.14/data/
[root@centos64 data]# vi myid
3
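Optionally, a small loop (a sketch, assuming passwordless ssh for root is already set up from the base install) to confirm each node has the expected myid:

for h in centos60 centos62 centos64; do
  echo -n "$h: "
  ssh root@$h cat /usr/local/zookeeper-3.4.14/data/myid
done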

Start ZooKeeper; this must be run on all 3 nodes:

[root@centos60 bin]# pwd
/usr/local/zookeeper-3.4.14/bin
[root@centos60 bin]# ./zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper-3.4.14/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@centos60 bin]# jps
172550 QuorumPeerMain
172838 Jps

Error encountered:

[root@centos60 bin]# ./zkServer.sh  status
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper-3.4.14/bin/../conf/zoo.cfg
Error contacting service. It is probably not running.
[root@centos60 bin]# cat zookeeper.out 
2019-07-29 20:23:42,049 [myid:1] - WARN  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@584] - Cannot open channel to 3 at election address centos64/192.168.56.64:3888
java.net.ConnectException: Connection refused (Connection refused)
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:558)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:610)
        at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:838)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:958)

The cause was that the firewall (firewalld/iptables) had not been stopped; stop it (do this on every node):

[root@centos60 bin]# systemctl stop firewalld.service
[root@centos60 bin]# sudo service iptables stop
Redirecting to /bin/systemctl stop iptables.service
[root@centos60 bin]# ./zkServer.sh  status
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper-3.4.14/bin/../conf/zoo.cfg
Mode: follower

Normally one node will be the leader and the other 2 will be followers.
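To check all three nodes at once (again assuming passwordless ssh), something like:

for h in centos60 centos62 centos64; do
  echo "== $h =="
  ssh root@$h /usr/local/zookeeper-3.4.14/bin/zkServer.sh status
done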

7. Starting HDFS. The first startup is a bit involved; after that, a plain start-dfs.sh/start-all.sh is all you need.
7.1 Initialize the required state in ZooKeeper. Run the following command on one of the NameNode nodes (it creates a znode in ZooKeeper in which the automatic failover system stores its data):

[root@centos60 bin]# hdfs zkfc -formatZK
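To confirm the znode was created, you can list the HA root znode (the default parent is /hadoop-ha) from the ZooKeeper CLI on any node; a minimal sketch:

/usr/local/zookeeper-3.4.14/bin/zkCli.sh -server centos60:2181
# then, inside the interactive CLI:
ls /hadoop-ha          # should show [mycluster]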

7.2 On every JournalNode host, start the JournalNode with:

[root@centos60 bin]# hdfs --daemon start journalnode
[root@centos60 bin]# jps
420150 Jps
367814 QuorumPeerMain
420125 JournalNode
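The same command must be run on centos62 and centos64. If you prefer to start them from centos60 over ssh, a sketch (assuming JAVA_HOME is set in hadoop-env.sh so the non-interactive shell can start the daemon):

for h in centos62 centos64; do
  ssh root@$h /usr/local/hadoop-3.1.2/bin/hdfs --daemon start journalnode
done
# each JournalNode should now be listening on port 8485
ss -lnt | grep 8485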

7.3 On the first NameNode node, format the namenode (this also initializes the journalnode directories):

[root@centos60 bin]# hdfs namenode -format

Important! Then copy the generated namenode directory to the second NameNode node (my deployment kept failing for a long time before I discovered this was the cause):

[root@centos60 hadoop]# pwd
/hadoop
[root@centos60 hadoop]# ls
journalnode  namenode
[root@centos60 hadoop]# scp -r namenode/ root@centos62:/hadoop/
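An alternative to copying the directory by hand is hdfs namenode -bootstrapStandby on centos62, the same command used in 7.5 below; note it pulls the metadata over RPC, so the first NameNode must already be running when you use it. A hedged sketch:

# on centos62, with the NameNode on centos60 already started
hdfs namenode -bootstrapStandby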

7.4 On one NameNode node run start-dfs.sh (or start-all.sh, which also starts the NodeManagers and the ResourceManager):

[root@centos60 bin]# start-dfs.sh

Error:

Starting journal nodes [centos60 centos62 centos64]
ERROR: Attempting to operate on hdfs journalnode as root
ERROR: but there is no HDFS_JOURNALNODE_USER defined. Aborting operation.
Starting ZK Failover Controllers on NN hosts [centos60 centos62]
ERROR: Attempting to operate on hdfs zkfc as root
ERROR: but there is no HDFS_ZKFC_USER defined. Aborting operation.

Fix:

[root@centos60 sbin]# cd /usr/local/hadoop-3.1.2/sbin/
[root@centos60 sbin]# pwd
/usr/local/hadoop-3.1.2/sbin
[root@centos60 sbin]# vi start-dfs.sh 
[root@centos60 sbin]# vi stop-dfs.sh

Add the following settings at the top of start-dfs.sh and stop-dfs.sh:

HDFS_DATANODE_USER=root
HDFS_DATANODE_SECURE_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_JOURNALNODE_USER=root
HDFS_ZKFC_USER=root
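Equivalently (a hedged alternative, not what I did here), these variables can be exported once in etc/hadoop/hadoop-env.sh so that every start/stop script picks them up:

# appended to /usr/local/hadoop-3.1.2/etc/hadoop/hadoop-env.sh (sketch)
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_JOURNALNODE_USER=root
export HDFS_ZKFC_USER=root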

Then scp the modified scripts to the other 2 nodes and restart the cluster; this time it starts successfully:

[root@centos60 sbin]# stop-dfs.sh
[root@centos60 hadoop]# start-dfs.sh
[root@centos60 hadoop]# jps
21062 JournalNode
202677 DataNode
205173 DFSZKFailoverController
206346 Jps
13196 QuorumPeerMain
201277 NameNode

[root@centos62 hadoop]# jps
5777 JournalNode
17763 DFSZKFailoverController
2007 QuorumPeerMain
17544 DataNode
17358 NameNode
17822 Jps

[root@centos64 hadoop]# jps
32128 Jps
12039 QuorumPeerMain
25385 DataNode
13596 JournalNode

7.5 If instead you are converting a running cluster from non-HA to HA mode, the steps are:

# On the standby namenode node, format it and copy the primary's metadata
$HADOOP_HOME/bin/hdfs namenode -bootstrapStandby

# On the primary node, initialize the edits data in the JournalNodes
$HADOOP_HOME/bin/hdfs namenode -initializeSharedEdits

# Then start the namenode on the standby node
$HADOOP_HOME/bin/hdfs --daemon start namenode

8. Verifying HA

8.1 Check the NameNode states; one should be active and the other standby:

[root@centos60 namenode]# hdfs haadmin -getAllServiceState
centos60:9000                                      active    
centos62:9000                                      standby
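You can also query a single NameNode by its ID (the IDs come from dfs.ha.namenodes.mycluster):

hdfs haadmin -getServiceState centos60
hdfs haadmin -getServiceState centos62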

8.2 Simulate a failure (kill the Active NameNode process):

[root@centos60 hadoop]# jps
262694 Jps
21062 JournalNode
202677 DataNode
205173 DFSZKFailoverController
13196 QuorumPeerMain
201277 NameNode
[root@centos60 hadoop]# kill -9  201277

8.3 Wait a moment and check the node states again:

[root@centos60 hadoop]# hdfs haadmin -getAllServiceState
2019-07-30 21:06:29,298 INFO ipc.Client: Retrying connect to server: centos60/192.168.56.60:9000. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)
centos60:9000                                      Failed to connect: Call From centos60/192.168.56.60 to centos60:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
centos62:9000                                      active
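To bring centos60 back as the new Standby (a follow-up step, not part of the original test), restart its NameNode and check the states again:

# on centos60
hdfs --daemon start namenode
hdfs haadmin -getAllServiceState    # centos60 should now report standby, centos62 active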

Extras:

Create a directory:
hdfs dfs -mkdir  /input
Put a file into the directory:
hdfs dfs -put /file.txt  /input
List the files in the directory:
hdfs dfs -ls  /input

HA admin commands:
hdfs haadmin -help

If the web UI at 192.168.56.60:50070 cannot be reached, stop the firewall: systemctl stop firewalld.service

Fix for a standby namenode that fails to start: https://blog.csdn.net/yzh_1346983557/article/details/97812820

References: https://blog.csdn.net/shshheyi/article/details/84893371

https://blog.csdn.net/zhanglong_4444/article/details/87699369

https://blog.csdn.net/hliq5399/article/details/78193113