Jusene's Blog

corosync v2 + pacemaker + pcs + crmsh: Building a Highly Available Web Server

2017/05/20

corosync

Two versions of corosync are maintained upstream. An earlier post covered deploying corosync v1; this one walks through a corosync v2 deployment.

Plan

node1 10.211.55.39 centos7
node2 10.211.55.43 centos7
corosync 2.4.0
pacemaker 1.1.15
crmsh 3.0.0
pcs 0.9.152

Prerequisites

  1. Time synchronization between the nodes
  2. hosts entries so each node can resolve the other's IP
  3. Passwordless SSH mutual trust (a minimal sketch of all three steps follows)
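
A minimal sketch of these three steps, run from node1 as root; the IPs come from the plan above, and chrony as the time source is an assumption:

    # 1. time synchronization on both nodes
    ~]# yum -y install chrony; systemctl start chronyd; systemctl enable chronyd
    ~]# ssh 10.211.55.43 'yum -y install chrony; systemctl start chronyd; systemctl enable chronyd'
    # 2. resolve the peer node names
    ~]# echo '10.211.55.39 node1' >> /etc/hosts
    ~]# echo '10.211.55.43 node2' >> /etc/hosts
    ~]# scp /etc/hosts node2:/etc/hosts
    # 3. passwordless SSH trust (ansible also relies on it below)
    ~]# ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
    ~]# ssh-copy-id node1; ssh-copy-id node2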

Install pcs and Deploy the Cluster

  • node1
    ~]# cat /etc/ansible/hosts
    [ha]
    node1
    node2
    ~]# ansible ha -m yum -a "name=corosync,pacemaker,pcs"
    node2 | SUCCESS => {
        "changed": false,
        "msg": "",
        "rc": 0,
        "results": [
            "corosync-2.4.0-4.el7.x86_64 providing corosync is already installed",
            "pacemaker-1.1.15-11.el7_3.4.x86_64 providing pacemaker is already installed",
            "pcs-0.9.152-10.el7.centos.3.x86_64 providing pcs is already installed"
        ]
    }
    node1 | SUCCESS => {
        "changed": false,
        "msg": "",
        "rc": 0,
        "results": [
            "corosync-2.4.0-4.el7.x86_64 providing corosync is already installed",
            "pacemaker-1.1.15-11.el7_3.4.x86_64 providing pacemaker is already installed",
            "pcs-0.9.152-10.el7.centos.3.x86_64 providing pcs is already installed"
        ]
    }
    ~]# ansible ha -m service -a 'name=pcsd state=started enabled=yes'
    ~]# ansible ha -a 'systemctl status pcsd'
    node2 | SUCCESS | rc=0 >>
    ● pcsd.service - PCS GUI and remote configuration interface
       Loaded: loaded (/usr/lib/systemd/system/pcsd.service; enabled; vendor preset: disabled)
       Active: active (running) since Sat 2017-05-20 10:06:18 EDT; 1min 34s ago
     Main PID: 31312 (pcsd)
       CGroup: /system.slice/pcsd.service
               └─31312 /usr/bin/ruby /usr/lib/pcsd/pcsd > /dev/null &

    May 20 10:06:18 node2 systemd[1]: Starting PCS GUI and remote configuration interface...
    May 20 10:06:18 node2 systemd[1]: Started PCS GUI and remote configuration interface.

    node1 | SUCCESS | rc=0 >>
    ● pcsd.service - PCS GUI and remote configuration interface
       Loaded: loaded (/usr/lib/systemd/system/pcsd.service; enabled; vendor preset: disabled)
       Active: active (running) since Sat 2017-05-20 10:06:18 EDT; 1min 34s ago
     Main PID: 7884 (pcsd)
       CGroup: /system.slice/pcsd.service
               └─7884 /usr/bin/ruby /usr/lib/pcsd/pcsd > /dev/null &

    May 20 10:06:18 node1 systemd[1]: Starting PCS GUI and remote configuration interface...
    May 20 10:06:18 node1 systemd[1]: Started PCS GUI and remote configuration interface.
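    # if firewalld is active, the high-availability service (pcsd on
    # tcp/2224, corosync on udp/5404-5405) must be allowed; it is assumed
    # disabled in this lab, otherwise:
    ~]# ansible ha -m shell -a 'firewall-cmd --permanent --add-service=high-availability; firewall-cmd --reload'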
    # set a password for the hacluster user (pcs authenticates with it)
    ~]# ansible ha -m shell -a 'echo "jusene" | passwd --stdin hacluster'
    node2 | SUCCESS | rc=0 >>
    Changing password for user hacluster.
    passwd: all authentication tokens updated successfully.

    node1 | SUCCESS | rc=0 >>
    Changing password for user hacluster.
    passwd: all authentication tokens updated successfully.
    60
    61
    #认证节点身份
    62
    ~]# pcs cluster auth node1 node2
    63
    Username: hacluster
    64
    Password: 
    65
    node1: Authorized
    66
    node2: Authorized
    67
    # set up the cluster: give it a name and list its two member nodes
    ~]# pcs cluster setup --name jusene-cluster node1 node2
    Destroying cluster on nodes: node1, node2...
    node1: Stopping Cluster (pacemaker)...
    node2: Stopping Cluster (pacemaker)...
    node2: Successfully destroyed cluster
    node1: Successfully destroyed cluster

    Sending cluster config files to the nodes...
    node1: Succeeded
    node2: Succeeded

    Synchronizing pcsd certificates on nodes node1, node2...
    node1: Success
    node2: Success

    Restarting pcsd on the nodes in order to reload the certificates...
    node1: Success
    node2: Success
    # inspect the corosync v2 config generated by pcs
    ~]# vim /etc/corosync/corosync.conf
    totem {                            # cluster communication settings
        version: 2                     # config version
        secauth: off                   # authentication/encryption
        cluster_name: jusene-cluster   # cluster name
        transport: udpu                # transport protocol (unicast UDP)
    }

    nodelist {                         # all nodes in the cluster
        node {
            ring0_addr: node1
            nodeid: 1
        }

        node {
            ring0_addr: node2
            nodeid: 2
        }
    }

    quorum {                            # quorum voting
        provider: corosync_votequorum   # voting system
        two_node: 1                     # two-node cluster mode
    }

    logging {
        to_logfile: yes
        logfile: /var/log/cluster/corosync.log
        to_syslog: no
    }
    # start the cluster on all nodes
    ~]# pcs cluster start --all
    node2: Starting Cluster...
    node1: Starting Cluster...
    # check that each node's ring came up cleanly
    ~]# ansible ha -a "corosync-cfgtool -s"
    node2 | SUCCESS | rc=0 >>
    Printing ring status.
    Local node ID 2
    RING ID 0
            id      = 10.211.55.43
            status  = ring 0 active with no faults

    node1 | SUCCESS | rc=0 >>
    Printing ring status.
    Local node ID 1
    RING ID 0
            id      = 10.211.55.39
            status  = ring 0 active with no faults

    # view cluster membership
    ~]# corosync-cmapctl | grep members
    runtime.totem.pg.mrp.srp.members.1.config_version (u64) = 0
    runtime.totem.pg.mrp.srp.members.1.ip (str) = r(0) ip(10.211.55.39) 
    runtime.totem.pg.mrp.srp.members.1.join_count (u32) = 1
    runtime.totem.pg.mrp.srp.members.1.status (str) = joined
    runtime.totem.pg.mrp.srp.members.2.config_version (u64) = 0
    runtime.totem.pg.mrp.srp.members.2.ip (str) = r(0) ip(10.211.55.43) 
    runtime.totem.pg.mrp.srp.members.2.join_count (u32) = 1
    runtime.totem.pg.mrp.srp.members.2.status (str) = joined
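    # note: with two_node: 1 set, votequorum implicitly enables
    # wait_for_all, so the surviving node keeps quorum when its peer
    # fails, but both nodes must be seen once before quorum is first
    # granted; the effective flags can be checked with:
    ~]# corosync-quorumtool -s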
    ~]# pcs status
    Cluster name: jusene-cluster
    WARNING: no stonith devices and stonith-enabled is not false
    Stack: corosync
    Current DC: node2 (version 1.1.15-11.el7_3.4-e174ec8) - partition with quorum
    Last updated: Sat May 20 10:27:25 2017          Last change: Sat May 20 10:23:15 2017 by hacluster via crmd on node2

    2 nodes and 0 resources configured

    Online: [ node1 node2 ]

    No resources


    Daemon Status:
      corosync: active/disabled
      pacemaker: active/disabled
      pcsd: active/enabled
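    # corosync and pacemaker show active/disabled above: the stack was
    # started by hand and will not come back after a reboot; to autostart
    # it on both nodes (optional):
    ~]# pcs cluster enable --all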

Configuring the Cluster with crmsh

VIP: 10.211.55.24

  • node1
    ~]# ansible ha -m yum -a 'name=httpd'
    node2 | SUCCESS => {
        "changed": false,
        "msg": "",
        "rc": 0,
        "results": [
            "httpd-2.4.6-45.el7.centos.4.x86_64 providing httpd is already installed"
        ]
    }
    node1 | SUCCESS => {
        "changed": false,
        "msg": "",
        "rc": 0,
        "results": [
            "httpd-2.4.6-45.el7.centos.4.x86_64 providing httpd is already installed"
        ]
    }
    ~]# echo '<h1>node1</h1>' > /var/www/html/index.html; ssh node2 "echo '<h1>node2</h1>' > /var/www/html/index.html"
    ~]# ansible ha -m service -a "name=httpd state=started enabled=no"
    ~]# curl 10.211.55.39
    <h1>node1</h1>
    ~]# curl 10.211.55.43
    <h1>node2</h1>
    ~]# ansible ha -m service -a "name=httpd state=stopped"
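    # httpd was started only to verify the test pages and is stopped
    # again: pacemaker will drive it through the systemd resource class,
    # so the unit must stay disabled outside the cluster's control; the
    # systemd agents visible to crmsh can be listed as a sanity check:
    ~]# crm ra list systemd | grep httpd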

crmsh Configuration

~]# crm
crm(live)# status
Stack: corosync
Current DC: node2 (version 1.1.15-11.el7_3.4-e174ec8) - partition with quorum
Last updated: Sat May 20 10:39:55 2017          Last change: Sat May 20 10:23:15 2017 by hacluster via crmd on node2

2 nodes and 0 resources configured

Online: [ node1 node2 ]

No resources
crm(live)# configure
crm(live)configure# primitive webip ocf:heartbeat:IPaddr params ip=10.211.55.24 op monitor interval=20s timeout=20s
crm(live)configure# property no-quorum-policy=ignore
crm(live)configure# property stonith-enabled=false
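# stonith-enabled=false: no fence devices exist in this lab, and pacemaker
# will not start resources while stonith is enabled but unconfigured (the
# WARNING in pcs status above); no-quorum-policy=ignore keeps resources
# running on the surviving node if quorum is ever lost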
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# cd ..
crm(live)# status
Stack: corosync
Current DC: node2 (version 1.1.15-11.el7_3.4-e174ec8) - partition with quorum
Last updated: Sat May 20 10:51:15 2017          Last change: Sat May 20 10:51:03 2017 by root via cibadmin on node2

2 nodes and 1 resource configured

Online: [ node1 node2 ]

Full list of resources:

 webip  (ocf::heartbeat:IPaddr):        Started node1
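# the VIP should now be up on node1; it can be confirmed outside crmsh
# with iproute2:
~]# ip addr show | grep 10.211.55.24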
crm(live)# configure
crm(live)configure# primitive webservice systemd:httpd op start timeout=100s op stop timeout=100s
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# cd ..
crm(live)# status
Stack: corosync
Current DC: node2 (version 1.1.15-11.el7_3.4-e174ec8) - partition with quorum
Last updated: Sat May 20 10:57:03 2017          Last change: Sat May 20 10:56:54 2017 by root via cibadmin on node2

2 nodes and 2 resources configured

Online: [ node1 node2 ]

Full list of resources:

 webip  (ocf::heartbeat:IPaddr):        Started node1
 webservice     (systemd:httpd):        Started node2
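# with no constraints, pacemaker balances resources across the nodes, so
# the VIP and httpd landed on different machines; the group below keeps
# them on the same node and starts them in order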
crm(live)# configure
crm(live)configure# group web webip webservice
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# cd ..
crm(live)# status
Stack: corosync
Current DC: node2 (version 1.1.15-11.el7_3.4-e174ec8) - partition with quorum
Last updated: Sat May 20 11:00:24 2017          Last change: Sat May 20 11:00:20 2017 by root via cibadmin on node2

2 nodes and 2 resources configured

Online: [ node1 node2 ]

Full list of resources:

 Resource Group: web
     webip      (ocf::heartbeat:IPaddr):        Started node1
     webservice (systemd:httpd):        Stopped
# webservice shows Stopped while it is being relocated; a second status a
# few seconds later shows it started on node1 alongside the VIP
crm(live)# status
Stack: corosync
Current DC: node2 (version 1.1.15-11.el7_3.4-e174ec8) - partition with quorum
Last updated: Sat May 20 11:00:30 2017          Last change: Sat May 20 11:00:20 2017 by root via cibadmin on node2

2 nodes and 2 resources configured

Online: [ node1 node2 ]

Full list of resources:

 Resource Group: web
     webip      (ocf::heartbeat:IPaddr):        Started node1
     webservice (systemd:httpd):        Started node1
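# note: a group implies both colocation and ordering; the same result can
# be had with explicit constraints instead of a group (an equivalent
# sketch, not applied here; the constraint ids are arbitrary):
#   crm(live)configure# colocation webservice_with_webip inf: webservice webip
#   crm(live)configure# order webip_before_webservice Mandatory: webip webservice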

~]# curl 10.211.55.24
<h1>node1</h1>
# failover test: put node1, which currently runs the group, into standby
~]# crm node standby
~]# curl 10.211.55.24
<h1>node2</h1>
~]# crm status
Stack: corosync
Current DC: node2 (version 1.1.15-11.el7_3.4-e174ec8) - partition with quorum
Last updated: Sat May 20 11:05:57 2017          Last change: Sat May 20 11:03:41 2017 by root via crm_attribute on node2

2 nodes and 2 resources configured

Node node1: standby
Online: [ node2 ]

Full list of resources:

 Resource Group: web
     webip      (ocf::heartbeat:IPaddr):        Started node2
     webservice (systemd:httpd):        Started node2
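
To finish the failover test, node1 can be taken out of standby; whether the group then moves back depends on resource-stickiness (0 by default, so pacemaker is free to relocate it):

~]# crm node online
~]# crm status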