David Ramsden

High Availability iSCSI Target Using Linux

Software

  • Linux-HA - Linux clustering software.
  • DRBD - Distributed Replicated Block Device. Allows you to RAID1 partitions over IP.
  • iscsitarget - Linux implementation of an iSCSI target.

Configuration

This guide is based on the following:

  • Two nodes (Ubuntu 9.10 AMD64)
  • Each node has 3x NICs (2x bonded on network and 1x for DRBD data).
  • Nodes:
    • san01-n1 (“node1”) / 172.16.254.101 / bond0 [slaves: eth0, eth1]
      • DRBD sync network: node1-drbd / 10.10.10.101 / eth2
    • san01-n2 (“node2”) / 172.16.254.102 / bond0 [slaves: eth0, eth1]
      • DRBD sync network: node2-drbd / 10.10.10.102 / eth2
  • Cluster IP address: 172.16.254.100

Note: Unless explicitly stated (i.e. commands prefixed with [node1] or [node2]), commands and configurations should be completed on both nodes.

Install Ubuntu/Debian. Use LVM and create one Volume Group (vg01). Create a Logical Volume for the OS (mount point /) and a Logical Volume for swap. Leave the rest of the space unallocated.

Install package ifenslave and configure /etc/network/interfaces:
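A sketch of /etc/network/interfaces for node 1 (node 2 is identical apart from the addresses); the gateway and bonding mode are assumptions to adapt to your network:

```
# node1 - /etc/network/interfaces
auto lo
iface lo inet loopback

# bonded LAN interface
auto bond0
iface bond0 inet static
    address 172.16.254.101
    netmask 255.255.255.0
    gateway 172.16.254.1        # assumption: your default gateway
    bond-slaves eth0 eth1
    bond-mode active-backup     # assumption: use a mode your switches support
    bond-miimon 100

# dedicated DRBD sync interface
auto eth2
iface eth2 inet static
    address 10.10.10.101
    netmask 255.255.255.0
```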

Create DRBD meta data Logical Volume on Volume Group vg01:
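For example (the LV name and size are placeholders; with external meta data DRBD needs roughly 128MB per resource, so 256MB comfortably covers two resources):

```
lvcreate -L 256M -n drbd-meta vg01
```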

Create a Logical Volume on Volume Group vg01 to hold the iscsitarget configuration files:
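For example (name and size are placeholders; the config files are tiny):

```
lvcreate -L 128M -n iscsi-config vg01
```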

Create a Logical Volume to become a test LUN later on:
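For example (name and size are placeholders):

```
lvcreate -L 10G -n lun01 vg01
```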

Edit /etc/hosts (removing the loopback entry for the host):
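Something along these lines; the Ubuntu-style 127.0.1.1 entry for the host's own name is removed so heartbeat and DRBD resolve the node names to the real addresses:

```
127.0.0.1       localhost

172.16.254.101  san01-n1 node1
172.16.254.102  san01-n2 node2
10.10.10.101    node1-drbd
10.10.10.102    node2-drbd
```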

Install packages drbd8-utils and heartbeat.

Change permissions and group ownership on some DRBD binaries for use with heartbeat:
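These are the permissions the DRBD documentation recommends for Heartbeat R1-style clusters, so the cluster can run drbdsetup and drbdmeta without being root:

```
chgrp haclient /sbin/drbdsetup
chmod o-x /sbin/drbdsetup
chmod u+s /sbin/drbdsetup

chgrp haclient /sbin/drbdmeta
chmod o-x /sbin/drbdmeta
chmod u+s /sbin/drbdmeta
```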

Edit /etc/drbd.conf and define two resources:

  1. The DRBD device that will contain iscsitarget configuration files.
  2. The DRBD device that will become the test LUN.
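A sketch of /etc/drbd.conf, assuming the LV names used earlier; the resource names, ports and syncer rate are placeholders, and the hostnames in the "on" sections must match the output of uname -n:

```
global { usage-count no; }
common {
    protocol C;
    syncer { rate 33M; }        # assumption: tune to your sync link
}

# resource 1: iscsitarget configuration files
resource iscsi-config {
    device    /dev/drbd0;
    disk      /dev/vg01/iscsi-config;
    meta-disk /dev/vg01/drbd-meta[0];
    on san01-n1 { address 10.10.10.101:7788; }
    on san01-n2 { address 10.10.10.102:7788; }
}

# resource 2: the test LUN
resource lun01 {
    device    /dev/drbd1;
    disk      /dev/vg01/lun01;
    meta-disk /dev/vg01/drbd-meta[1];
    on san01-n1 { address 10.10.10.101:7789; }
    on san01-n2 { address 10.10.10.102:7789; }
}
```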

Reboot nodes. Test connectivity (both networks) between nodes.

Initialise DRBD meta data discs for the DRBD resources. This needs to be done on both nodes:
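Assuming the two resources are named iscsi-config and lun01 (placeholder names):

```
drbdadm create-md iscsi-config
drbdadm create-md lun01
```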

Restart DRBD service.

Decide which node will act as the primary for the DRBD device that will contain the iSCSI configuration files (/dev/drbd0) and initiate the first full sync between the nodes. Run the following on the primary:
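With a resource name of iscsi-config (a placeholder), this would be:

```
[node1] # drbdadm -- --overwrite-data-of-peer primary iscsi-config
```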

Check the status of the initial sync:

[node1] # cat /proc/drbd

You can wait until the initial sync completes but it's not a requirement. Create a filesystem on /dev/drbd0 (iSCSI configs) and mount it:
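For example:

```
[node1] # mkfs.ext3 /dev/drbd0
[node1] # mkdir /srv/iscsi-config
[node1] # mount /dev/drbd0 /srv/iscsi-config
```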

Create the /srv/iscsi-config mount point on node 2.

Ensure replication is working as expected. On the primary node:
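Create a test file, then hand the resource over (iscsi-config is a placeholder resource name):

```
[node1] # echo "replication test" > /srv/iscsi-config/test.txt
[node1] # umount /srv/iscsi-config
[node1] # drbdadm secondary iscsi-config
```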

On node 2:
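Promote node 2, mount the device and check the file arrived:

```
[node2] # drbdadm primary iscsi-config
[node2] # mount /dev/drbd0 /srv/iscsi-config
[node2] # cat /srv/iscsi-config/test.txt
```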

Test replication the other way by deleting the file:
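Still on node 2, delete the file and demote the resource again:

```
[node2] # rm /srv/iscsi-config/test.txt
[node2] # umount /srv/iscsi-config
[node2] # drbdadm secondary iscsi-config
```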

Make node 1 the primary and mount /srv/iscsi-config (/dev/drbd0) and ensure the file has gone:
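For example:

```
[node1] # drbdadm primary iscsi-config
[node1] # mount /dev/drbd0 /srv/iscsi-config
[node1] # ls /srv/iscsi-config
```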

Decide which node will act as the primary for the DRBD device that contains the test LUN (/dev/drbd1) and initiate the first full sync between the nodes. Run the following on the primary:
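With a resource name of lun01 (a placeholder):

```
[node1] # drbdadm -- --overwrite-data-of-peer primary lun01
```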

Install the iscsitarget package. By default, iscsitarget (ietd) will not start. Edit /etc/default/iscsitarget and set ISCSITARGET_ENABLE to true.
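/etc/default/iscsitarget should end up containing:

```
ISCSITARGET_ENABLE=true
```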

Heartbeat will be used to control the iscsitarget service so remove it from init:
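On Ubuntu/Debian:

```
update-rc.d -f iscsitarget remove
```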

Relocate iscsitarget config to DRBD device. Make sure that node 1 is the primary and that /srv/iscsi-config is mounted:
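One way to do this is to move the config onto the replicated device and symlink it back into place; node 2 keeps only a symlink pointing at the same path:

```
[node1] # mv /etc/ietd.conf /srv/iscsi-config/
[node1] # ln -s /srv/iscsi-config/ietd.conf /etc/ietd.conf

[node2] # rm /etc/ietd.conf
[node2] # ln -s /srv/iscsi-config/ietd.conf /etc/ietd.conf
```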

Create iscsitarget config on node 1. Example:
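A minimal /etc/ietd.conf exporting the test LUN from /dev/drbd1 (the IQN is a placeholder; pick your own):

```
Target iqn.2010-01.local.san01:lun01
    Lun 0 Path=/dev/drbd1,Type=blockio
```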

Configure heartbeat to control virtual IP address of cluster and to failover iscsitarget when a node fails. The following should be completed on node 1:

/etc/ha.d/ha.cf:
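A sketch, assuming heartbeat should talk over both the LAN and the DRBD link; the timer values are reasonable defaults, not requirements:

```
logfacility local0
keepalive 2
warntime 10
deadtime 30
initdead 120
bcast bond0 eth2
node san01-n1 san01-n2
auto_failback off
```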

/etc/ha.d/authkeys:
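For example (replace the key with your own shared secret):

```
auth 1
1 sha1 SomeSharedSecret
```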

/etc/ha.d/haresources:
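A sketch, assuming the DRBD resources are named iscsi-config and lun01. portblock holds TCP 3260 closed while the target moves, so initiators see a retryable stall rather than a connection reset. This is all one logical line:

```
san01-n1 \
    drbddisk::iscsi-config \
    Filesystem::/dev/drbd0::/srv/iscsi-config::ext3 \
    drbddisk::lun01 \
    portblock::tcp::3260::block \
    IPaddr2::172.16.254.100/24/bond0 \
    iscsitarget \
    portblock::tcp::3260::unblock
```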

Set the permissions on /etc/ha.d/authkeys to 600 (heartbeat refuses to start otherwise): chmod 600 /etc/ha.d/authkeys

Copy ha.cf, authkeys and haresources to node 2:
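For example, over the DRBD link:

```
[node1] # scp /etc/ha.d/ha.cf /etc/ha.d/authkeys /etc/ha.d/haresources root@node2-drbd:/etc/ha.d/
```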

Note: At the time of writing, the portblock resource agent script (/etc/ha.d/resource.d/portblock) is broken. Ubuntu bug #489719 has been filed, along with Debian bug #538987. Apply the following patch to both nodes:

Finally, reboot both nodes and test failover. The best way to do this is to connect the test LUN to a server, copy a movie onto it and play it. Fail one of the nodes, either by pulling the power or via "/etc/init.d/heartbeat stop". The movie will freeze for a few seconds but should then resume. Also tail /var/log/syslog on both nodes to watch the failover happen.