Tag Archives: storage

Five Reasons to Switch to Flash Storage

By now you have heard your peers raving about flash storage. But perhaps you have not made the switch from your enterprise HDD storage solution yet because of nagging questions about the cost of flash storage or its technical capabilities. Here is a quick look at five compelling reasons to switch your enterprise storage from HDD to flash.


Highly available NFS server: Setup Corosync & Pacemaker

Introduction

This post is the continuation of the series on setting up a highly available NFS server; check the first post for the iSCSI storage setup: http://opentodo.net/2015/06/high-available-nfs-server-setup-iscsi-multipath/
In this post I’ll explain how to set up the NFS cluster and the failover between the two servers configured in the first post, using Corosync as the cluster engine and Pacemaker as the cluster resource manager.

Corosync

Corosync is an open source cluster engine that exchanges messages between the servers of the cluster to check their health status and, when one of the servers goes down, informs the other components of the cluster so the failover process can start.

Pacemaker

Pacemaker is an open source high availability resource manager. Its task is to keep the configuration of all the resources of the cluster and the relations between servers and resources. For example, if we need to set up a VIP (virtual IP), mount a filesystem or start a service on the active node of the cluster, Pacemaker will bring up all the resources assigned to that server in the order we specify in the configuration, to ensure all the services start correctly.

Resource Agents

They are simply scripts that manage different services. These scripts are based on the OCF standard (http://opencf.org/home.html). The system already ships with a set of scripts that will be enough for most typical cluster setups, but it is of course possible to develop new ones depending on your needs and requirements.
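Once the packages from the next section are installed, you can list the resource agents available on your system and inspect the parameters each one accepts with crmsh (a quick sketch, using agents that appear later in this post):

# crm ra list ocf heartbeat
# crm ra info ocf:heartbeat:IPaddr2
# crm ra info ocf:heartbeat:Filesystem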

[Figure: Pacemaker/Corosync cluster stack]

So after this small introduction about the cluster components, let’s get started with the configuration:

Corosync configuration

– Install package dependencies:


# aptitude install corosync pacemaker

– Generate a private key to ensure the authenticity and privacy of the messages sent between the nodes of the cluster:


# corosync-keygen -l

NOTE: This command will generate the private key at /etc/corosync/authkey; copy the key file to the other server of the cluster.
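For example, the key can be copied over with scp and locked down to root (a quick sketch, assuming root SSH access between the nodes and the hostnames used later in this post):

# scp /etc/corosync/authkey nfs2-srv:/etc/corosync/authkey
# ssh nfs2-srv chown root:root /etc/corosync/authkey
# ssh nfs2-srv chmod 400 /etc/corosync/authkey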

– Edit /etc/corosync/corosync.conf:


# Please read the openais.conf.5 manual page

totem {
version: 2

# How long before declaring a token lost (ms)
token: 3000

# How many token retransmits before forming a new configuration
token_retransmits_before_loss_const: 10

# How long to wait for join messages in the membership protocol (ms)
join: 60

# How long to wait for consensus to be achieved before starting a new round of membership configuration (ms)
consensus: 3600

# Turn off the virtual synchrony filter
vsftype: none

# Number of messages that may be sent by one processor on receipt of the token
max_messages: 20

# Limit generated nodeids to 31-bits (positive signed integers)
clear_node_high_bit: yes

# Enable encryption
secauth: on

# How many threads to use for encryption/decryption
threads: 0

# This specifies the mode of redundant ring, which may be none, active, or passive.
rrp_mode: active

interface {
# The following values need to be set based on your environment
ringnumber: 0
bindnetaddr: 10.55.71.0
mcastaddr: 226.94.1.1
mcastport: 5405
}
}

nodelist {
node {
ring0_addr: nfs1-srv
nodeid: 1
}
node {
ring0_addr: nfs2-srv
nodeid: 2
}
}

amf {
mode: disabled
}

quorum {
# Quorum for the Pacemaker Cluster Resource Manager
provider: corosync_votequorum
expected_votes: 1
}

service {
# Load the Pacemaker Cluster Resource Manager
ver: 0
name: pacemaker
}

aisexec {
user: root
group: root
}

logging {
fileline: off
to_stderr: yes
to_logfile: no
to_syslog: yes
syslog_facility: daemon
debug: off
timestamp: on
logger_subsys {
subsys: AMF
debug: off
tags: enter|leave|trace1|trace2|trace3|trace4|trace6
}
}
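With the same configuration in place on both nodes, restart Corosync and Pacemaker and verify that both nodes have joined the cluster membership (a sketch; depending on the package version you may also need to set START=yes in /etc/default/corosync first):

# service corosync restart
# service pacemaker restart
# corosync-cfgtool -s
# crm status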

 

Pacemaker configuration

– Disable the quorum policy, since we need to deploy a 2-node configuration:

# crm configure property no-quorum-policy=ignore

– Set up the VIP resource of the cluster:


# crm configure primitive p_ip_nfs ocf:heartbeat:IPaddr2 params ip="10.55.71.21" cidr_netmask="24" nic="eth0" op monitor interval="30s"

– Set up the init script resource for the NFS server:

# crm configure primitive p_lsb_nfsserver lsb:nfs-kernel-server op monitor interval="30s"

NOTE: The nfs-kernel-server init script will be managed by the cluster, so disable it from being started at boot time using the update-rc.d utility:

# update-rc.d -f nfs-kernel-server remove

– Configure a resource group with the nfs service and the VIP:

# crm configure group g_nfs p_lsb_nfsserver p_ip_nfs meta target-role="Started"

– Configure the mount point for the NFS export:

# crm configure primitive p_fs_nfs ocf:heartbeat:Filesystem params device="/dev/mapper/nfs-part1" directory="/mnt/nfs" fstype="ext3" op start interval="0" timeout="120" op monitor interval="60" timeout="60" OCF_CHECK_LEVEL="20" op stop interval="0" timeout="240"
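This resource assumes the ext3 filesystem already exists on the multipath partition prepared in the first post; if it does not, it can be created once, from a single node only, before adding the resource (a sketch using the partition device from part one):

# mkfs.ext3 /dev/mapper/nfs-part1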

– Configure the initialization order of the resources:

# crm configure order o_fs_before_nfs inf: p_fs_nfs g_nfs:start

– Configure a colocation constraint so that these resources are always started together on the same server:

# crm configure colocation c_nfs_on_fs inf: p_lsb_nfsserver p_fs_nfs

– Prevent healthy resources from being moved around the cluster by configuring a resource stickiness:

# crm configure rsc_defaults resource-stickiness=200
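Note that the cluster manages the NFS service, the VIP and the filesystem mount, but not the exports themselves: those live in /etc/exports, which must be kept identical on both nodes. A minimal example, assuming the clients sit on the same 10.55.71.0/24 network as the VIP (the export options here are just an illustration):

# /etc/exports -- keep this file identical on nfs1-srv and nfs2-srv
/mnt/nfs 10.55.71.0/24(rw,sync,no_subtree_check)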

 

Check cluster status

– Check the status of the resources of the cluster:


# crm status
Last updated: Wed Jun 3 21:44:29 2015
Last change: Wed Jun 3 16:56:15 2015 via crm_resource on nfs1-srv
Stack: corosync
Current DC: nfs1-srv (1) - partition with quorum
Version: 1.1.10-42f2063
2 Nodes configured
3 Resources configured

Online: [ nfs1-srv nfs2-srv ]

Resource Group: g_nfs
p_lsb_nfsserver (lsb:nfs-kernel-server): Started nfs2-srv
p_ip_nfs (ocf::heartbeat:IPaddr2): Started nfs2-srv
p_fs_nfs (ocf::heartbeat:Filesystem): Started nfs2-srv

 

Cluster failover

– If the resources are running on nfs2-srv and we want to fail over to nfs1-srv:

# crm resource move g_nfs nfs1-srv

– Remove all constraints created by the move command:

# crm resource unmove g_nfs
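Another way to exercise the failover, instead of moving the resource group explicitly, is to put the active node into standby and bring it back online afterwards (a sketch; the resources will move to the remaining node and, with the stickiness configured above, stay there):

# crm node standby nfs2-srv
# crm node online nfs2-srv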

 

Resulting configuration


# crm configure show
node $id="1" nfs1-srv
node $id="2" nfs2-srv
primitive p_fs_nfs ocf:heartbeat:Filesystem \
    params device="/dev/mapper/nfs-part1" directory="/mnt/nfs" fstype="ext3" options="_netdev" \
    op start interval="0" timeout="120" \
    op monitor interval="60" timeout="60" OCF_CHECK_LEVEL="20" \
    op stop interval="0" timeout="240"
primitive p_ip_nfs ocf:heartbeat:IPaddr2 \
    params ip="10.55.71.21" cidr_netmask="24" nic="eth0" \
    op monitor interval="30s"
primitive p_lsb_nfsserver lsb:nfs-kernel-server \
    op monitor interval="30s"
group g_nfs p_lsb_nfsserver p_ip_nfs \
    meta target-role="Started"
colocation c_nfs_on_fs inf: p_lsb_nfsserver p_fs_nfs
order o_fs_before_nfs inf: p_fs_nfs g_nfs:start
property $id="cib-bootstrap-options" \
    dc-version="1.1.10-42f2063" \
    cluster-infrastructure="corosync" \
    no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
    resource-stickiness="200"

 

References

https://wiki.ubuntu.com/ClusterStack/Natty
http://clusterlabs.org/quickstart-ubuntu.html
http://clusterlabs.org/doc/

This post is the second part of the Highly available NFS server series; you can find the first part here: http://opentodo.net/2015/06/high-available-nfs-server-setup-iscsi-multipath/


Highly available NFS server: Setup iSCSI & multipath

Introduction

In this series of posts I’ll explain how to set up a highly available and redundant NFS cluster using iSCSI with DM-Multipath, with Corosync & Pacemaker managing the cluster and its associated resources. The objective of this scenario is to create redundant, fault-tolerant NFS storage with automatic failover, to ensure maximum availability of the NFS exports.

For this environment I’ve used two servers running Ubuntu 14.04.2 LTS, each with two NICs: one to provide the NFS service to the clients and another to connect to the iSCSI SAN network. On the iSCSI SAN storage device I’ve already set up two physical adapters, with two network interfaces per adapter, for redundant network access and two physical paths to the storage system. Both NFS servers will have the LUN device attached, each using a different InitiatorName, and will have device mapper multipathing (DM-Multipath) configured, which aggregates multiple I/O paths between the server nodes and the storage array into a single device. These I/O paths are physical SAN connections that can include separate cables, switches, and controllers, so it is essentially as if the NFS servers had a single block device.

[Figure: iSCSI multipath topology]

The cluster software used is Corosync, with Pacemaker as the resource manager. Pacemaker is responsible for assigning a VIP (virtual IP address), mounting the filesystem from the block device and starting the NFS service with the specific exports for the clients on the active node of the cluster. In case of failure of the active node, the resources will be migrated to the passive node and the services will continue to operate as if nothing had happened.

This post specifically covers the configuration of the iSCSI initiator on both NFS servers and the configuration of device mapper multipathing, so let’s get started with the setup!

iSCSI initiator configuration

– Install dependencies:

# aptitude install multipath-tools open-iscsi

Server 1

– Edit configuration file /etc/iscsi/initiatorname.iscsi:

InitiatorName=iqn.1647-03.com.cisco:01.vdsk-nfs1

Server 2

– Edit configuration file /etc/iscsi/initiatorname.iscsi:

InitiatorName=iqn.1647-03.com.cisco:01.vdsk-nfs2

NOTE: The initiator identifiers on the two servers are different, but they are associated with the same LUN device.
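After editing the InitiatorName on each server, restart the iSCSI services so the new identifier is used for the discovery and login steps below (a sketch; the service name may vary slightly between releases):

# service open-iscsi restart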

– Run a discovery of the iSCSI targets:

# iscsiadm -m discovery -t sendtargets -p 10.54.61.35
# iscsiadm -m discovery -t sendtargets -p 10.54.61.36
# iscsiadm -m discovery -t sendtargets -p 10.54.61.37
# iscsiadm -m discovery -t sendtargets -p 10.54.61.38

– Connect and log in to the iSCSI targets:

# iscsiadm -m node -T iqn.2054-02.com.hp:storage.msa2012i.0390d423d2.a -p 10.54.61.35 --login
# iscsiadm -m node -T iqn.2054-02.com.hp:storage.msa2012i.0390d423d2.a -p 10.54.61.36 --login
# iscsiadm -m node -T iqn.2054-02.com.hp:storage.msa2012i.0390d423d2.b -p 10.54.61.37 --login
# iscsiadm -m node -T iqn.2054-02.com.hp:storage.msa2012i.0390d423d2.b -p 10.54.61.38 --login
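If you want these sessions to be re-established automatically at boot, you can mark the discovered node records for automatic startup (a sketch; this updates the records created by the discovery above, and should be run on both servers):

# iscsiadm -m node -T iqn.2054-02.com.hp:storage.msa2012i.0390d423d2.a --op update -n node.startup -v automatic
# iscsiadm -m node -T iqn.2054-02.com.hp:storage.msa2012i.0390d423d2.b --op update -n node.startup -v automatic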

– Check the sessions established with the iSCSI SAN device:

# iscsiadm -m node
10.54.61.35:3260,1 iqn.2054-02.com.hp:storage.msa2012i.0390d423d2.a
10.54.61.36:3260,2 iqn.2054-02.com.hp:storage.msa2012i.0390d423d2.a
10.54.61.37:3260,1 iqn.2054-02.com.hp:storage.msa2012i.0390d423d2.b
10.54.61.38:3260,2 iqn.2054-02.com.hp:storage.msa2012i.0390d423d2.b

– At this point the block devices should be available on both servers like locally attached devices; you can check this simply by running fdisk:

# fdisk -l

Disk /dev/sdb: 1000.0 GB, 1000000716800 bytes
255 heads, 63 sectors/track, 121576 cylinders, total 1953126400 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
 
   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1              63  1953118439   976559188+  83  Linux
 
Disk /dev/sdc: 1000.0 GB, 1000000716800 bytes
255 heads, 63 sectors/track, 121576 cylinders, total 1953126400 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
 
   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1              63  1953118439   976559188+  83  Linux

In my case /dev/sda is the local disk of the server, and /dev/sdb and /dev/sdc correspond to the iSCSI block devices (one device for each adapter). Now we need to set up device mapper multipathing for these two devices, /dev/sdb and /dev/sdc, so that if one of the adapters fails the LUN device will keep working in our system and multipath will switch the underlying disk used for our block device.

Multipath configuration

– First we need to retrieve the unique SCSI identifier (WWID) to use in the multipath configuration, by running the following command against one of the iSCSI devices:

# /lib/udev/scsi_id --whitelisted --device=/dev/sdb
3600c0ff000d823e5ed6a0a4b01000000

– Create the multipath configuration file /etc/multipath.conf with the following content:

##
## This is a template multipath-tools configuration file
## Uncomment the lines relevant to your environment
##
defaults {
       user_friendly_names yes
       polling_interval        3
       selector                "round-robin 0"
       path_grouping_policy    multibus
       path_checker            directio
       failback                immediate
       no_path_retry           fail
}
blacklist {
        devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
        devnode "^hd[a-z][[0-9]*]"
}
 
multipaths{
        multipath {
	        # id retrieved with the utility /lib/udev/scsi_id
                wwid                    3600c0ff000d823e5ed6a0a4b01000000
                alias                   nfs
        }
}

– Restart multipath-tools service:

# service multipath-tools restart

– Check again the disks available in the system:

# fdisk -l

Disk /dev/sdb: 1000.0 GB, 1000000716800 bytes
255 heads, 63 sectors/track, 121576 cylinders, total 1953126400 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
 
   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1              63  1953118439   976559188+  83  Linux
 
Disk /dev/sdc: 1000.0 GB, 1000000716800 bytes
255 heads, 63 sectors/track, 121576 cylinders, total 1953126400 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
 
   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1              63  1953118439   976559188+  83  Linux
 
Disk /dev/mapper/nfs: 1000.0 GB, 1000000716800 bytes
255 heads, 63 sectors/track, 121576 cylinders, total 1953126400 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
 
          Device Boot      Start         End      Blocks   Id  System
/dev/mapper/nfs1              63  1953118439   976559188+  83  Linux
 
Disk /dev/mapper/nfs-part1: 1000.0 GB, 999996609024 bytes
255 heads, 63 sectors/track, 121575 cylinders, total 1953118377 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

As you can see, we now have a new block device, /dev/mapper/nfs, using the alias set up in the multipath configuration file. The partition I’ve created and formatted with a filesystem is the block device /dev/mapper/nfs-part1, so you can mount it on your system with the mount utility.
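For example, to mount it manually and verify the filesystem before handing it over to the cluster in the next post (a sketch using the same mount point that Pacemaker will manage later; unmount it again afterwards so the cluster can take control):

# mkdir -p /mnt/nfs
# mount /dev/mapper/nfs-part1 /mnt/nfs
# df -h /mnt/nfs
# umount /mnt/nfs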

– You can check the health of the multipath block device and verify that both paths are operational by running the following command:

# multipath -ll
nfs (3600c0ff000d823e5ed6a0a4b01000000) dm-3 HP,MSA2012i
size=931G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 6:0:0:0 sdb 8:16 active ready running
  `- 5:0:0:0 sdc 8:32 active ready running
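To verify that multipath really tolerates the loss of one path, you can log out of one of the iSCSI portals, check that the device keeps working over the remaining path, and then log back in (a sketch using one of the portals configured above):

# iscsiadm -m node -T iqn.2054-02.com.hp:storage.msa2012i.0390d423d2.a -p 10.54.61.35 --logout
# multipath -ll
# iscsiadm -m node -T iqn.2054-02.com.hp:storage.msa2012i.0390d423d2.a -p 10.54.61.35 --login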

References

https://help.ubuntu.com/14.04/serverguide/device-mapper-multipathing.html

http://linux.dell.com/files/whitepapers/iSCSI_Multipathing_in_Ubuntu_Server.pdf
