VMware ESX Server 2.1
ESX Server 2.1 includes multipathing support to maintain a constant connection between the server machine and the storage device in case of the failure of a host bus adapter (HBA), switch, storage controller (or storage processor; abbreviated as SP in the following diagram), or a Fibre Channel cable. Unlike in previous versions of ESX Server, multipathing support in this version does not require specific failover drivers.

In the preceding diagram, there are multiple, redundant paths from each server to the storage device. For example, if HBA1 or the link between HBA1 and the Fibre Channel (FC) switch breaks, HBA2 takes over and provides the connection between the server and the switch. This process is called HBA failover.

Similarly, if SP1 or the link between SP1 and the switch breaks, SP2 takes over and provides the connection between the switch and the storage device. This process is called SP failover.

VMware ESX Server 2.1 provides both HBA and SP failover with its multipathing feature. (SP failover may not be supported by all disk arrays.) For information on supported SAN hardware, download the VMware ESX Server SAN Compatibility List from the VMware Web site at www.vmware.com/support/.

ESX Server allows you to configure and manage multipath access to storage devices through both the Management Interface and the Service Console. The sections below describe how to manage multipathing in the Service Console with the vmkmultipath command. For instructions on configuring multipathing with the Management Interface, see Viewing Failover Paths Connections.

You can view your current multipathing configuration with the vmkmultipath -q command. The -q option displays the state of all or selected paths recognized by ESX Server. The report displayed by vmkmultipath shows the current multipathing policy for a disk and the connection state and mode for each path to the disk.
The report identifies disks by their canonical name. The canonical name for a disk is the first path ESX Server finds to the disk. Because ESX Server begins its scans at the first controller and the lowest device number, the first path (and thus the canonical name of the disk) is the path with the lowest controller number and device number. For example, if the paths to a disk are vmhba0:0:2, vmhba1:0:2, vmhba0:1:2, and vmhba1:1:2, then the canonical name of the disk is vmhba0:0:2.

To see a report for all disks, enter:

    # vmkmultipath -q

Below is a typical report displayed for a configuration of ESX Server managing a SAN:
    Disk and multipath information follows:

In this system configuration, the disk vmhba0:0:2 has a "fixed" policy. ESX Server recognizes six paths to the disk. The list of paths indicates the different physical routes by which the disk can be accessed.

The second column of the report shows the status of each path to the disk, which is on, off, or dead.

The third column of the report shows the mode of each path.
Note: Reports returned by vmkmultipath list paths to both physical disks and storage controllers. In the example above, the "disks" listed as having no space available are actually storage processors.

You can display the multipathing status for a single disk by specifying it in the query command. For example, to display the report for disk vmhba0:0:6, enter:

    # vmkmultipath -q vmhba0:0:6

You can specify the default policy for the multipathing feature. There are two policies, fixed and mru.
Note: You can select a different policy for each disk.

You can use the vmkmultipath command to disable and enable paths, set the active path, and set the preferred path, as illustrated in the following examples. You configure paths by setting path modes with the -s option.

Use the -e option to enable paths with vmkmultipath. In this example, you are enabling the path from controller vmhba1:0:1 to disk vmhba0:0:1.

    # vmkmultipath -s vmhba0:0:1 -e vmhba1:0:1

Use the -d option to disable paths with vmkmultipath. In this example, you are disabling the path from controller vmhba1:0:1 to disk vmhba0:0:1.

    # vmkmultipath -s vmhba0:0:1 -d vmhba1:0:1

Use the -r option to specify the preferred path to a disk. In this example, you are setting as preferred the path from controller vmhba1:0:1 to disk vmhba0:0:1.

    # vmkmultipath -s vmhba0:0:1 -r vmhba1:0:1

Note: ESX Server ignores the preferred path when the multipathing policy is set to mru.

Your multipathing settings are saved when ESX Server shuts down normally. However, we suggest that you run the following command as root to ensure that your settings are saved in case of an abnormal shutdown.

    # /usr/sbin/vmkmultipath -S

After you run this command, your multipathing settings are restored automatically when you restart your system.

When a cable is pulled, I/O freezes for approximately 30 to 60 seconds, until the SAN driver determines that the link is down and failover occurs. During that time, virtual machines whose virtual disks are installed on a SAN may appear unresponsive, and any operations on the /vmfs directory may appear to hang. After the failover occurs, I/O should resume normally.

Although ESX Server's failover feature provides high availability and guards against connection loss to SAN devices, all connections to SAN devices can still be lost in disastrous events that include multiple breakages.
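The canonical-name rule and the -s path options described earlier follow a regular pattern, which the following Python sketch illustrates. It is not part of ESX Server: the helper names parse_path, canonical_name, and path_command are hypothetical, and the sketch assumes that the scan order matches numeric ordering of the vmhba<controller>:<target>:<LUN> fields. It only builds the argument list and does not run vmkmultipath.

```python
import re

# Assumed path-name format, taken from the examples in this document:
# vmhba<controller>:<target>:<lun>
_PATH_RE = re.compile(r"^vmhba(\d+):(\d+):(\d+)$")


def parse_path(path):
    """Split a name such as 'vmhba1:0:2' into (controller, target, lun) integers."""
    match = _PATH_RE.match(path)
    if match is None:
        raise ValueError(f"not a vmhbaC:T:L path name: {path!r}")
    return tuple(int(part) for part in match.groups())


def canonical_name(paths):
    """Pick the path with the lowest controller number, then the lowest device numbers.

    Assumption: this numeric ordering models "the first path ESX Server finds".
    """
    return min(paths, key=parse_path)


def path_command(disk_paths, option, path):
    """Build the argument list for 'vmkmultipath -s <disk> <option> <path>'.

    Only the options shown in this document are accepted:
    -e (enable), -d (disable), -r (set preferred).
    """
    if option not in ("-e", "-d", "-r"):
        raise ValueError(f"unsupported option: {option!r}")
    parse_path(path)  # reject malformed path names early
    return ["vmkmultipath", "-s", canonical_name(disk_paths), option, path]


paths = ["vmhba1:0:2", "vmhba0:1:2", "vmhba0:0:2", "vmhba1:1:2"]
print(canonical_name(paths))
print(" ".join(path_command(paths, "-d", "vmhba1:0:2")))
```

Comparing the fields as integers rather than as strings keeps a controller such as vmhba10 from sorting ahead of vmhba2.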
If none of the connections to the storage device is working, the virtual machines begin to encounter I/O errors on their virtual SCSI disks. Also, operations in the /vmfs directory may eventually fail after reporting an "I/O error".

For QLogic cards, you may want to adjust the PortDownRetryCount value in the QLogic BIOS. This value determines how quickly a failover occurs when a link goes down. If the PortDownRetryCount value is <n>, then a failover typically takes a little longer than <n> multiplied by 2 seconds. A typical recommended value for <n> is 15, so in this case, failover takes a little longer than 30 seconds. For more information on changing the PortDownRetryCount value, refer to your QLogic documentation.

For the Windows 2000 and Windows Server 2003 guest operating systems, you may want to increase the standard disk TimeOutValue so that Windows is not extensively disrupted during failover.
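The PortDownRetryCount rule of thumb above is simple arithmetic, sketched here in Python. The function names are hypothetical, and comparing the result against a guest disk timeout is an assumption about how the two values relate, not a documented VMware procedure. Because failover takes "a little longer" than the computed figure, treat the result as a lower bound.

```python
def estimated_failover_seconds(port_down_retry_count):
    """Lower bound on failover time: PortDownRetryCount multiplied by 2 seconds."""
    if port_down_retry_count < 0:
        raise ValueError("PortDownRetryCount cannot be negative")
    return port_down_retry_count * 2


def guest_timeout_covers_failover(timeout_seconds, port_down_retry_count,
                                  margin_seconds=0):
    """Return True if a guest disk timeout exceeds the estimated failover time.

    margin_seconds is a hypothetical allowance for the "little longer"
    that an actual failover takes beyond the estimate.
    """
    estimate = estimated_failover_seconds(port_down_retry_count)
    return timeout_seconds > estimate + margin_seconds


# With the typical recommended value of 15, the estimate is 30 seconds.
print(estimated_failover_seconds(15))
print(guest_timeout_covers_failover(60, 15))
```

A guest timeout equal to the estimate is reported as insufficient, since the actual failover is expected to run somewhat past it.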
