CDH 5.16.1 Cluster: Truly Offline Enterprise Deployment

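Video: https://www.bilibili.com/video/av52167219
PS: We recommend watching parts 1-2 of the course video first, then deploying by following the video or this document.
If you run into problems, contact @若泽数据J哥 promptly.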


I. Preparation

1. The offline deployment has three main parts:

a. Offline MySQL deployment
b. Offline CM deployment
c. Offline Parcel repository deployment

2. Plan:

Node       MySQL   Offline Parcel repo   CM service processes            Big data components
hadoop001  MySQL   Parcel                Activity Monitor                NN RM DN NM
hadoop002                                Alert Publisher, Event Server   DN NM
hadoop003                                Host Monitor, Service Monitor   DN NM

3. Download sources:

  • CM
    cloudera-manager-centos7-cm5.16.1_x86_64.tar.gz
  • Parcel
    CDH-5.16.1-1.cdh5.16.1.p0.3-el7.parcel
    CDH-5.16.1-1.cdh5.16.1.p0.3-el7.parcel.sha1
    manifest.json
  • JDK
    https://www.oracle.com/technetwork/java/javase/downloads/java-archive-javase8-2177648.html
    Download jdk-8u202-linux-x64.tar.gz
  • MySQL
    https://dev.mysql.com/downloads/mysql/5.7.html#downloads
    Download mysql-5.7.26-el7-x86_64.tar.gz
  • MySQL JDBC jar
    mysql-connector-java-5.1.47.jar
    After downloading, rename the file to drop the version number:
    mv mysql-connector-java-5.1.47.jar mysql-connector-java.jar

Have Baidu Cloud ready, then download the installation packages:

Link: https://pan.baidu.com/s/10s-NaFLfztKuWImZTiBMjA Password: viqp

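II. Cluster Node Initialization

1. Buy 3 pay-as-you-go virtual machines in the Alibaba Cloud Shanghai region

CentOS 7.2 operating system; 2 cores and 8 GB of memory is the minimum configuration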

2. Configure the hosts file on your own laptop or desktop

  • macOS: /etc/hosts
  • Windows: C:\windows\system32\drivers\etc\hosts

Public addresses:

106.15.234.222 hadoop001
106.15.235.200 hadoop002
106.15.234.239 hadoop003

3. Set up the hosts file on all nodes

Private (internal network) addresses:

echo "172.19.7.96 hadoop001">> /etc/hosts
echo "172.19.7.98 hadoop002">> /etc/hosts
echo "172.19.7.97 hadoop003">> /etc/hosts
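Running the echo commands above a second time appends duplicate entries. A minimal sketch of a re-runnable variant follows; add_host_entry is a hypothetical helper name, not part of CDH or CentOS, and it assumes each hosts line has the form "IP NAME [ALIASES]".

```shell
#!/bin/sh
# add_host_entry FILE IP NAME
# Hypothetical helper: append "IP NAME" to FILE only if NAME is not
# already mapped on a non-comment line.
add_host_entry() {
    hosts_file=$1
    ip=$2
    name=$3
    # Look for NAME as a whole field after the address column.
    if awk -v n="$name" '
        $0 !~ /^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$hosts_file" 2>/dev/null; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
}

# Usage on each node (as root), with the private addresses from this guide:
#   add_host_entry /etc/hosts 172.19.7.96 hadoop001
#   add_host_entry /etc/hosts 172.19.7.98 hadoop002
#   add_host_entry /etc/hosts 172.19.7.97 hadoop003
```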

4. Stop the firewall on all nodes and flush its rules

systemctl stop firewalld
systemctl disable firewalld
iptables -F

5. Disable SELinux on all nodes

vi /etc/selinux/config
Change SELINUX=enforcing to SELINUX=disabled
A reboot is required for the change to take effect
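The same edit can be made without opening an editor, which helps when repeating it on every node. This is a sketch only: set_selinux_disabled is a hypothetical helper name, and it assumes GNU sed (as shipped with CentOS 7) for the -i option.

```shell
#!/bin/sh
# set_selinux_disabled FILE
# Hypothetical helper: rewrite the active SELINUX= line to "disabled",
# whatever its current value. SELINUXTYPE= does not match the pattern
# and is left untouched.
set_selinux_disabled() {
    config_file=$1
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$config_file"
}

# Usage (as root):
#   set_selinux_disabled /etc/selinux/config
# Until the reboot, `setenforce 0` switches the running system to permissive mode.
```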

6. Set the same time zone on all nodes and synchronize their clocks

6.1. Time zone

[root@hadoop001 ~]# date
Sat May 11 10:07:53 CST 2019
[root@hadoop001 ~]# timedatectl
      Local time: Sat 2019-05-11 10:10:31 CST
  Universal time: Sat 2019-05-11 02:10:31 UTC
        RTC time: Sat 2019-05-11 10:10:29
       Time zone: Asia/Shanghai (CST, +0800)
     NTP enabled: yes
NTP synchronized: yes
 RTC in local TZ: yes
      DST active: n/a
Read the command's built-in help. Learning to do this is essential, and it makes a Baidu search unnecessary:
[root@hadoop001 ~]# timedatectl --help
timedatectl [OPTIONS...] COMMAND ...
Query or change system time and date settings.
  -h --help                Show this help message
     --version             Show package version
     --no-pager            Do not pipe output into a pager
     --no-ask-password     Do not prompt for password
  -H --host=[USER@]HOST    Operate on remote host
  -M --machine=CONTAINER   Operate on local container
     --adjust-system-clock Adjust system clock when changing local RTC mode
Commands:
  status                   Show current time settings
  set-time TIME            Set system time
  set-timezone ZONE        Set system time zone
  list-timezones           Show known time zones
  set-local-rtc BOOL       Control whether RTC is in local time
  set-ntp BOOL             Control whether NTP is enabled
List the available time zones:
[root@hadoop001 ~]# timedatectl list-timezones
Africa/Abidjan
Africa/Accra
Africa/Addis_Ababa
Africa/Algiers
Africa/Asmara
Africa/Bamako
Set the Asia/Shanghai time zone on all nodes:
[root@hadoop001 ~]# timedatectl set-timezone Asia/Shanghai
[root@hadoop002 ~]# timedatectl set-timezone Asia/Shanghai
[root@hadoop003 ~]# timedatectl set-timezone Asia/Shanghai
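Many steps in this guide are repeated on all three nodes. A small loop can save typing; the sketch below assumes passwordless SSH as root from the machine you run it on, which this guide does not set up. run_on_all and DRY_RUN are hypothetical names. With DRY_RUN=1 the helper only prints the commands it would run.

```shell
#!/bin/sh
# Node names as used throughout this guide.
NODES="hadoop001 hadoop002 hadoop003"

# run_on_all COMMAND [ARGS...]
# Hypothetical helper: run the same command on every node over SSH.
# Assumes root SSH access to each node is already configured.
run_on_all() {
    for node in $NODES; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Preview only: show the command instead of executing it.
            printf 'ssh root@%s %s\n' "$node" "$*"
        else
            ssh "root@$node" "$@"
        fi
    done
}

# Preview, then run for real:
#   DRY_RUN=1 run_on_all timedatectl set-timezone Asia/Shanghai
#   run_on_all timedatectl set-timezone Asia/Shanghai
```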

6.2. Time

Install ntp on all nodes:
[root@hadoop001 ~]# yum install -y ntp
Use hadoop001 as the NTP master node:
[root@hadoop001 ~]# vi /etc/ntp.conf
Time servers:
server 0.asia.pool.ntp.org
server 1.asia.pool.ntp.org
server 2.asia.pool.ntp.org
server 3.asia.pool.ntp.org
When the external time sources are unavailable, fall back to the local hardware clock:
server 127.127.1.0 iburst local clock
Restrict which subnet may synchronize time from this node:
restrict 172.19.7.0 mask 255.255.255.0 nomodify notrap
Start ntpd and check its status:
[root@hadoop001 ~]# systemctl start ntpd
[root@hadoop001 ~]# systemctl status ntpd
● ntpd.service - Network Time Service
   Loaded: loaded (/usr/lib/systemd/system/ntpd.service; enabled; vendor preset: disabled)
   Active: active (running) since Sat 2019-05-11 10:15:00 CST; 11min ago
 Main PID: 18518 (ntpd)
   CGroup: /system.slice/ntpd.service
           └─18518 /usr/sbin/ntpd -u ntp:ntp -g
May 11 10:15:00 hadoop001 systemd[1]: Starting Network Time Service...
May 11 10:15:00 hadoop001 ntpd[18518]: proto: precision = 0.088 usec
May 11 10:15:00 hadoop001 ntpd[18518]: 0.0.0.0 c01d 0d kern kernel time sync enabled
May 11 10:15:00 hadoop001 systemd[1]: Started Network Time Service.
Verify:
[root@hadoop001 ~]# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 LOCAL(0)        .LOCL.          10 l  726   64    0    0.000    0.000   0.000
On the other (slave) nodes, stop and disable the ntpd service, then sync once from hadoop001:
[root@hadoop002 ~]# systemctl stop ntpd
[root@hadoop002 ~]# systemctl disable ntpd
Removed symlink /etc/systemd/system/multi-user.target.wants/ntpd.service.
[root@hadoop002 ~]# /usr/sbin/ntpdate hadoop001
11 May 10:29:22 ntpdate[9370]: adjust time server 172.19.7.96 offset 0.000867 sec
Sync with hadoop001 every day at midnight:
[root@hadoop002 ~]# crontab -e
00 00 * * * /usr/sbin/ntpdate hadoop001
Repeat on hadoop003:
[root@hadoop003 ~]# systemctl stop ntpd
[root@hadoop003 ~]# systemctl disable ntpd
Removed symlink /etc/systemd/system/multi-user.target.wants/ntpd.service.
[root@hadoop003 ~]# /usr/sbin/ntpdate hadoop001
11 May 10:29:22 ntpdate[9370]: adjust time server 172.19.7.96 offset 0.000867 sec
# Sync with hadoop001 every day at midnight
[root@hadoop003 ~]# crontab -e
00 00 * * * /usr/sbin/ntpdate hadoop001
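crontab -e is interactive. If you prefer to install the entry from a script, one possible approach is to merge the line into the existing crontab text and feed the result back through crontab -. The sketch below is illustrative: merge_cron_line is a hypothetical helper, and it only transforms text, so it is safe to run repeatedly.

```shell
#!/bin/sh
# The cron entry used in this guide: sync with hadoop001 daily at midnight.
CRON_LINE='00 00 * * * /usr/sbin/ntpdate hadoop001'

# merge_cron_line CURRENT_CRONTAB_TEXT
# Hypothetical helper: print the crontab text with CRON_LINE appended,
# unless an identical line is already present.
merge_cron_line() {
    current=$1
    if printf '%s\n' "$current" | grep -Fqx "$CRON_LINE"; then
        printf '%s\n' "$current"
    elif [ -n "$current" ]; then
        printf '%s\n%s\n' "$current" "$CRON_LINE"
    else
        printf '%s\n' "$CRON_LINE"
    fi
}

# Usage on hadoop002 and hadoop003 (as root):
#   merge_cron_line "$(crontab -l 2>/dev/null)" | crontab -
```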

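7. Deploy the JDK on the cluster

mkdir /usr/java
# Archive name as given in the download list; adjust if your JDK version differs
tar -xzvf jdk-8u202-linux-x64.tar.gz -C /usr/java/
# Be sure to fix the owner and group of the extracted directory
chown -R root:root /usr/java/jdk1.8.0_202
echo "export JAVA_HOME=/usr/java/jdk1.8.0_202" >> /etc/profile
# Single quotes keep ${JAVA_HOME} and ${PATH} literal until /etc/profile is sourced
echo 'export PATH=${JAVA_HOME}/bin:${PATH}' >> /etc/profile
source /etc/profile
which java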
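When writing export lines from a script, quoting decides whether ${JAVA_HOME} and ${PATH} are expanded while writing or later, at login. The sketch below writes both lines to a separate profile snippet with the variables kept literal. write_java_profile is a hypothetical helper, and using /etc/profile.d/java.sh instead of /etc/profile is an assumption of this example, not something the guide prescribes.

```shell
#!/bin/sh
# write_java_profile FILE JDK_DIR
# Hypothetical helper: write a profile snippet that sets JAVA_HOME and
# prepends its bin directory to PATH.
write_java_profile() {
    profile_file=$1
    jdk_dir=$2
    {
        printf 'export JAVA_HOME=%s\n' "$jdk_dir"
        # Single quotes: the variables are written literally and are
        # only expanded when the snippet is sourced.
        printf '%s\n' 'export PATH=${JAVA_HOME}/bin:${PATH}'
    } > "$profile_file"
}

# Usage (as root), passing the directory your JDK archive extracted to:
#   write_java_profile /etc/profile.d/java.sh /usr/java/<extracted-jdk-dir>
#   source /etc/profile && which java
```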

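8. Deploy MySQL 5.7 offline on hadoop001 (if this feels too difficult, look up an RPM-based installation yourself; this deployment document is the one our company uses in production)

  • Document: https://github.com/Hackeruncle/MySQL
  • Video: https://pan.baidu.com/s/1jdM8WeIg8syU0evL1-tDOQ Password: whic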

9. Create the CDH metadata database and user, and the amon service database and user

create database cmf DEFAULT CHARACTER SET utf8;
create database amon DEFAULT CHARACTER SET utf8;
grant all on cmf.* TO 'cmf'@'%' IDENTIFIED BY 'Ruozedata123456!';
grant all on amon.* TO 'amon'@'%' IDENTIFIED BY 'Ruozedata123456!';
flush privileges;
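The two database/user pairs above follow one pattern: the database and its user share a name. As a sketch, the statements can be generated from that name so the password is typed only once. print_service_db_sql is a hypothetical helper that only prints SQL and executes nothing; it assumes the password contains no single quote.

```shell
#!/bin/sh
# print_service_db_sql NAME PASSWORD
# Hypothetical helper: print the create/grant statements for one
# database whose user has the same name, in the form used in this guide.
print_service_db_sql() {
    db=$1
    password=$2
    printf 'create database %s DEFAULT CHARACTER SET utf8;\n' "$db"
    # %% prints a literal % (the MySQL "any host" wildcard).
    printf "grant all on %s.* TO '%s'@'%%' IDENTIFIED BY '%s';\n" "$db" "$db" "$password"
}

# Usage: review the output first, then pipe it to the mysql client:
#   { print_service_db_sql cmf 'Ruozedata123456!'
#     print_service_db_sql amon 'Ruozedata123456!'
#     echo 'flush privileges;'; } | mysql -uroot -p
```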

10. Deploy the MySQL JDBC jar on hadoop001

mkdir -p /usr/share/java/
cp mysql-connector-java.jar /usr/share/java/

III. CDH Deployment

1. Deploy the CM server and agents offline

1.1. On all nodes, create the directory and extract the tarball
mkdir /opt/cloudera-manager
tar -zxvf cloudera-manager-centos7-cm5.16.1_x86_64.tar.gz -C /opt/cloudera-manager/
1.2. On all nodes, edit the agent configuration to point at the server node, hadoop001
sed -i "s/server_host=localhost/server_host=hadoop001/g" /opt/cloudera-manager/cm-5.16.1/etc/cloudera-scm-agent/config.ini
1.3. On the master node, edit the server configuration:
vi /opt/cloudera-manager/cm-5.16.1/etc/cloudera-scm-server/db.properties
com.cloudera.cmf.db.type=mysql
com.cloudera.cmf.db.host=hadoop001
com.cloudera.cmf.db.name=cmf
com.cloudera.cmf.db.user=cmf
com.cloudera.cmf.db.password=Ruozedata123456!
com.cloudera.cmf.db.setupType=EXTERNAL
1.4. On all nodes, create the user
useradd --system --home=/opt/cloudera-manager/cm-5.16.1/run/cloudera-scm-server/ --no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm
1.5. Change the owner and group of the directory
chown -R cloudera-scm:cloudera-scm /opt/cloudera-manager
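The sed command in step 1.2 only matches when the current value is exactly localhost, and it reports nothing if no line matched. A slightly more defensive sketch replaces whatever value is present and then checks the result. set_agent_server_host is a hypothetical helper; it assumes GNU sed and a config.ini with a server_host= line at the start of a line.

```shell
#!/bin/sh
# set_agent_server_host FILE HOST
# Hypothetical helper: set server_host in the agent's config.ini and
# return non-zero if the key is not present afterwards.
set_agent_server_host() {
    config_file=$1
    server=$2
    sed -i "s/^server_host=.*/server_host=$server/" "$config_file"
    # Verify, so a wrong path or a missing key does not pass silently.
    grep -qx "server_host=$server" "$config_file"
}

# Usage on every node:
#   set_agent_server_host /opt/cloudera-manager/cm-5.16.1/etc/cloudera-scm-agent/config.ini hadoop001
```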

2. Deploy the offline parcel repository on hadoop001

2.1. Deploy the offline parcel repository
$ mkdir -p /opt/cloudera/parcel-repo
$ ll
total 3081664
-rw-r--r-- 1 root root 2127506677 May  9 18:04 CDH-5.16.1-1.cdh5.16.1.p0.3-el7.parcel
-rw-r--r-- 1 root root         41 May  9 18:03 CDH-5.16.1-1.cdh5.16.1.p0.3-el7.parcel.sha1
-rw-r--r-- 1 root root  841524318 May  9 18:03 cloudera-manager-centos7-cm5.16.1_x86_64.tar.gz
-rw-r--r-- 1 root root  185515842 Aug 10  2017 jdk-8u144-linux-x64.tar.gz
-rw-r--r-- 1 root root      66538 May  9 18:03 manifest.json
-rw-r--r-- 1 root root     989495 May 25  2017 mysql-connector-java.jar
$ cp CDH-5.16.1-1.cdh5.16.1.p0.3-el7.parcel /opt/cloudera/parcel-repo/
# Important: rename the file while copying so that it ends in .sha rather than .sha1 (drop the trailing 1); otherwise CM treats the parcel as not fully downloaded during deployment and keeps downloading it
$ cp CDH-5.16.1-1.cdh5.16.1.p0.3-el7.parcel.sha1 /opt/cloudera/parcel-repo/CDH-5.16.1-1.cdh5.16.1.p0.3-el7.parcel.sha
$ cp manifest.json /opt/cloudera/parcel-repo/
2.2. Change the owner and group of the directory
$ chown -R cloudera-scm:cloudera-scm /opt/cloudera/
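Before starting the server, it can be worth checking that the copied parcel really matches its checksum file, since a truncated 2 GB copy is easy to miss. A minimal sketch, assuming the .sha file holds just the 40-character SHA-1 hex digest (consistent with the 41-byte file in the listing above) and that sha1sum is installed; verify_parcel_sha is a hypothetical helper name.

```shell
#!/bin/sh
# verify_parcel_sha PARCEL SHA_FILE
# Hypothetical helper: succeed only if the SHA-1 of PARCEL equals the
# digest stored in SHA_FILE.
verify_parcel_sha() {
    parcel=$1
    sha_file=$2
    # Strip whitespace and line endings around the stored digest.
    expected=$(tr -d ' \n\r' < "$sha_file")
    actual=$(sha1sum "$parcel" | awk '{ print $1 }')
    [ "$expected" = "$actual" ]
}

# Usage on hadoop001:
#   cd /opt/cloudera/parcel-repo
#   verify_parcel_sha CDH-5.16.1-1.cdh5.16.1.p0.3-el7.parcel \
#                     CDH-5.16.1-1.cdh5.16.1.p0.3-el7.parcel.sha && echo OK
```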

3. On all nodes, create the software installation directory and set its owner and group

mkdir -p /opt/cloudera/parcels
chown -R cloudera-scm:cloudera-scm /opt/cloudera/

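4. Start the server on hadoop001

4.1. Start the server
/opt/cloudera-manager/cm-5.16.1/etc/init.d/cloudera-scm-server start
4.2. In the Alibaba Cloud web console, open port 7180 in the firewall for hadoop001
4.3. Wait about 1 minute, then open http://hadoop001:7180 (username/password: admin/admin)
4.4. If the page does not open, read the server log and troubleshoot carefully based on the errors it reports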
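Instead of waiting a fixed minute, you can poll until the web UI answers. The sketch below is a generic retry loop; wait_until is a hypothetical helper, and the curl probe and the log path in the usage note are assumptions (the log path assumes the tarball layout used in this guide).

```shell
#!/bin/sh
# wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: run COMMAND until it succeeds, at most ATTEMPTS
# times, sleeping DELAY_SECONDS after each failure.
# Returns 0 on success and 1 if every attempt failed.
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Usage: probe the CM web port for up to 30 x 5 seconds, then show the
# end of the server log if it never came up (log path is an assumption):
#   wait_until 30 5 curl -sf -o /dev/null http://hadoop001:7180/ ||
#       tail -n 50 /opt/cloudera-manager/cm-5.16.1/log/cloudera-scm-server/cloudera-scm-server.log
```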

5. Start the agent on all nodes

/opt/cloudera-manager/cm-5.16.1/etc/init.d/cloudera-scm-agent start

6. From here on, everything is done in the web UI

http://hadoop001:7180/
Username/password: admin/admin

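7. Welcome to Cloudera Manager: End User License Terms and Conditions. Check the box to accept.

8. Welcome to Cloudera Manager: Which edition do you want to deploy? Choose the free Cloudera Express edition.

9. Thank you for choosing Cloudera Manager and CDH

10. Specify hosts for your CDH cluster installation. Select [Currently Managed Hosts] and check all of them.

11. Select Repository

12. Cluster Installation: Installing Selected Parcels

If the local offline parcel repository is configured correctly, the "Download" phase completes almost instantly; how long the remaining phases take depends on the number of nodes and the internal network.

13. Inspect hosts for correctness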

13.1. It is recommended to set /proc/sys/vm/swappiness to at most 10.
The swappiness value controls how aggressively the operating system swaps memory;
swappiness=0: use physical memory as fully as possible before turning to swap space;
swappiness=100: use the swap partition aggressively and move data from memory to swap promptly;
On a mixed-use server, disabling swap completely is not recommended; try lowering swappiness instead.
Temporary change:
sysctl vm.swappiness=10
Permanent change:
cat << EOF >> /etc/sysctl.conf
# Adjust swappiness value
vm.swappiness=10
EOF
13.2. Transparent huge page compaction is enabled and may cause significant performance problems; disabling it is recommended.
Temporary change:
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo never > /sys/kernel/mm/transparent_hugepage/enabled
Permanent change:
cat << EOF >> /etc/rc.d/rc.local
# Disable transparent_hugepage
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo never > /sys/kernel/mm/transparent_hugepage/enabled
EOF
# On CentOS 7.x, the file "/etc/rc.d/rc.local" must be given execute permission
chmod +x /etc/rc.d/rc.local
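After applying both changes, and again after a reboot, you can confirm them on each node before re-running the host inspector. A sketch under stated assumptions: the kernel marks the active transparent hugepage mode with square brackets (for example "always madvise [never]"), and thp_mode and check_host_tuning are hypothetical helper names.

```shell
#!/bin/sh
# thp_mode FILE
# Hypothetical helper: print the active mode, i.e. the word in [brackets].
thp_mode() {
    sed -n 's/.*\[\(.*\)\].*/\1/p' "$1"
}

# check_host_tuning [THP_DIR] [SWAPPINESS_FILE]
# Hypothetical helper: return 0 only if both THP settings are "never"
# and swappiness is at most 10. The arguments default to the real paths.
check_host_tuning() {
    thp_dir=${1:-/sys/kernel/mm/transparent_hugepage}
    swappiness_file=${2:-/proc/sys/vm/swappiness}
    status=0
    for f in enabled defrag; do
        if [ "$(thp_mode "$thp_dir/$f")" != "never" ]; then
            echo "transparent_hugepage/$f is not set to never"
            status=1
        fi
    done
    if [ "$(cat "$swappiness_file")" -gt 10 ]; then
        echo "vm.swappiness is above 10"
        status=1
    fi
    return "$status"
}

# Usage on each node:
#   check_host_tuning && echo "host tuning OK"
```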

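14. Custom Services: choose to deploy the ZooKeeper, HDFS, and YARN services

15. Customize Role Assignments

16. Database Setup

17. Review Changes; the defaults are fine

18. First Run

19. Congratulations!

20. Home page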


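Full CDH course catalog; to purchase, add us on WeChat (ruoze_star)

0. Introduction to and use of the QingCloud environment
1. Preparation
    How to get started with big data
    How to run a big data platform well
    Linux machines, software versions and installation (recorded)
2. Introduction
    Introduction to Cloudera, CM, and CDH
    Choosing a CDH version
    The CDH installation methods explained
3. Install&UnInstall
    Cluster node planning and environment preparation (NTP, JDK, etc.)
    Compiling and installing MySQL, plus common commands
    Recommended: CDH offline installation (lessons learned, full analysis)
    Walkthrough of the forced-uninstall script
4. CDH Management
    CDH architecture analysis
    In-depth look at CDH configuration files
    Common CM commands
    Correct start and stop order for a CDH cluster
    CDH Tsquery Language
    Routine CDH administration (monitoring/alerting/configuration/resources/logs/security)
5. Maintenance Experiment
    HDFS HA configuration and common hadoop/hdfs commands
    YARN HA configuration and common yarn commands
    HA configuration for other CDH components
    Dynamically adding and removing CDH services (hive/spark/hbase)
    Dynamically adding and removing CDH hosts
    Dynamically adding, removing, and migrating DataNode processes
    CDH upgrade (5.10.0-->5.12.0)
6. Resource Management
    Linux Cgroups
    Static resource pools
    Dynamic resource pools
    Multi-tenant case study
7. Performance Tuning
    Memory/CPU/Network/Disk and cluster planning
    Linux parameters
    HDFS parameters
    MapReduce and YARN parameters
    Other service parameters
8. Cases Share
    A study of the alternatives command in CDH 4 & 5
    CDH 5.8.2 installation: Hash verification failed
    Notes on a pitfall when configuring HDFS HA on CDH 4.8.6
    Changing cluster IPs on CDH 5.0
    CDH active NameNode exit (GC) and a bonus share
9. Kerberos
    Introduction to Kerberos
    Kerberos architecture
    How Kerberos works
    Kerberos installation and deployment
    Enabling Kerberos on CDH
    Developing with Kerberos (real code)
10. Summary
    Summary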

Join us if you have a dream.

若泽数据 (Ruozedata) official website
Tencent Classroom: search for 若泽数据
Bilibili: search for 若泽数据