The goal of this exercise is to build an HDFS cluster that uses both HA (high availability) and Federation.
The actual hardware is a single server; to simulate a distributed cluster, KVM is used to provision ten virtual machines:
10.0.5.31 zk1
10.0.5.32 zk2
10.0.5.33 zk3
10.0.5.41 namenode1
10.0.5.42 namenode2
10.0.5.43 namenode3
10.0.5.44 namenode4
10.0.5.51 datanode1
10.0.5.52 datanode2
10.0.5.53 datanode3
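These names must resolve on every VM, for example by appending the list above to /etc/hosts on each machine (assuming no DNS server is available); a quick check from any node:

ping -c 1 namenode1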
Java environment variables:
export JAVA_HOME=/root/java
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
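These lines typically go into /root/.bashrc or /etc/profile. Note also that the sshfence fencing method configured below in hdfs-site.xml requires passwordless root SSH between the paired NameNodes; a minimal sketch (the key path matches dfs.ha.fencing.ssh.private-key-files later):

ssh-keygen -t rsa -f /root/.ssh/id_rsa -N ''
ssh-copy-id root@namenode2    # repeat for every other node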
The ten machines are essentially identical VMs, so the configuration below is shared by all of them unless noted otherwise.
Create the data directories on each machine (only a single DataNode data directory is configured here):
/root/hadoop/data/nn_data/
/root/hadoop/data/dn_data1/
/root/hadoop/data/journaldata/
/root/hadoop/data/zkdata/
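A one-liner to create them (run on every VM, even though not every role uses all four directories):

mkdir -p /root/hadoop/data/{nn_data,dn_data1,journaldata,zkdata}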
zoo.cfg (on the three ZooKeeper nodes):
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/root/hadoop/data/zkdata
clientPort=2181
server.1=zk1:2888:3888
server.2=zk2:2888:3888
server.3=zk3:2888:3888
On each ZooKeeper node, create a myid file under /root/hadoop/data/zkdata whose content matches that node's server.N number in zoo.cfg.
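For example, on zk1 (use 2 and 3 on zk2 and zk3, matching their server.N lines):

echo 1 > /root/hadoop/data/zkdata/myid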
Start ZooKeeper on all three nodes:
./zkServer.sh start
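Once all three have started, each node should report its role (one leader, two followers):

./zkServer.sh status

With ZooKeeper running, move on to the Hadoop configuration. core-site.xml points the default filesystem at a ViewFS mount table, so clients see both federated namespaces under a single root: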
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>viewfs://nsX</value>
  </property>
  <property>
    <name>fs.viewfs.mounttable.nsX.link./c1</name>
    <value>hdfs://cluster1/tmp</value>
  </property>
  <property>
    <name>fs.viewfs.mounttable.nsX.link./c2</name>
    <value>hdfs://cluster2/tmp2</value>
  </property>
</configuration>
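With this mount table, client paths /c1 and /c2 map to /tmp on cluster1 and /tmp2 on cluster2 respectively. Once the cluster is up and the target directories exist, something like the following should work (test.txt is a hypothetical local file):

bin/hdfs dfs -ls /
bin/hdfs dfs -put test.txt /c1/

hdfs-site.xml, shared by all nodes except where noted in the comment below: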
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/root/hadoop/data/nn_data/</value>
  </property>
  <property>
    <name>dfs.nameservices</name>
    <value>cluster1,cluster2</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.cluster1</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.cluster1.nn1</name>
    <value>namenode1:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.cluster1.nn2</name>
    <value>namenode2:8020</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.cluster1.nn1</name>
    <value>namenode1:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.cluster1.nn2</name>
    <value>namenode2:50070</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.cluster2</name>
    <value>nn3,nn4</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.cluster2.nn3</name>
    <value>namenode3:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.cluster2.nn4</name>
    <value>namenode4:8020</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.cluster2.nn3</name>
    <value>namenode3:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.cluster2.nn4</name>
    <value>namenode4:50070</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/root/hadoop/data/dn_data1/</value>
  </property>
  <property>
    <!-- This value differs for each nameservice! -->
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://zk1:8485;zk2:8485;zk3:8485/cluster1</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/root/hadoop/data/journaldata/</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.cluster1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.cluster2</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence(root:22)</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>zk1:2181,zk2:2181,zk3:2181</value>
  </property>
  <property>
    <name>fs.trash.interval</name>
    <value>360</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
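As the comment says, dfs.namenode.shared.edits.dir is the one property that cannot be shared verbatim: on the cluster2 NameNodes (namenode3 and namenode4), the journal ID at the end of the URI must be cluster2 instead:

  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://zk1:8485;zk2:8485;zk3:8485/cluster2</value>
  </property>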
Start the JournalNodes (on zk1, zk2, and zk3, matching the qjournal URI above):
sbin/hadoop-daemon.sh start journalnode
With the JournalNodes up, format the cluster. On the primary NameNode of cluster1:
./bin/hdfs namenode -format
./bin/hdfs namenode -initializeSharedEdits -force
./sbin/hadoop-daemon.sh start namenode
On the standby NameNode:
bin/hdfs namenode -bootstrapStandby
sbin/hadoop-daemon.sh start namenode
Next, format the HA state in ZooKeeper:
bin/hdfs zkfc -formatZK
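This creates the ZKFC's parent znode in ZooKeeper; you can verify it with the ZooKeeper CLI (assuming the default parent znode /hadoop-ha):

./zkCli.sh -server zk1:2181 ls /hadoop-ha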
Then start a ZKFC on every NameNode:
./hadoop/hadoop-2.0.0-cdh4.4.0/sbin/hadoop-daemon.sh start zkfc
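Once the ZKFCs are up, exactly one NameNode per nameservice should become active; one way to check (the -ns flag selects the nameservice):

bin/hdfs haadmin -ns cluster1 -getServiceState nn1
bin/hdfs haadmin -ns cluster1 -getServiceState nn2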
When bringing up the second HA pair (cluster2), the only difference is that namenode -format must be given the cluster ID of the first nameservice, so that both nameservices join the same federated cluster. For example:
bin/hdfs namenode -format -clusterid CID-5c5d754c-20f6-43b6-bf16-e2239e93dbb7
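One way to find the first nameservice's cluster ID is to read it from the VERSION file in the first NameNode's metadata directory:

grep clusterID /root/hadoop/data/nn_data/current/VERSION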
Finally, start the DataNodes (on datanode1 through datanode3; each DataNode serves blocks for both nameservices):
./hadoop/hadoop-2.0.0-cdh4.4.0/sbin/hadoop-daemon.sh start datanode
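As a sanity check, each NameNode's web UI (e.g. http://namenode1:50070) should list the DataNodes, or you can query a nameservice directly with dfsadmin (the -fs generic option is needed because fs.defaultFS is a viewfs URI):

bin/hdfs dfsadmin -fs hdfs://cluster1 -report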