VMware

VMware Site Recovery Manager Release Notes

Site Recovery Manager 1.0 Update 1 | 12/18/08 | Build 128004


VMware Site Recovery Manager 1.0 Update 1 provides defect fixes and adds new features to Site Recovery Manager 1.0.

These release notes contain the following sections:

Features

Site Recovery Manager (SRM) is a disaster recovery workflow-automation product. SRM installs into VMware VirtualCenter as a plug-in and leverages array replication between a protected site and a recovery site. SRM simplifies disaster recovery, increases reliability, and reduces costs. SRM provides the following features.

Provides Simple Setup and Configuration
  • Uses VirtualCenter to protect virtual machines.
  • Automates discovery of the virtual machines that are replicated by your underlying Fibre Channel or iSCSI replication systems and are available to be configured for recovery.
  • Makes grouping of virtual machines and assigning recovery policies based on business requirements easy.
  • Automates the management of replicated virtual machines into the VirtualCenter Server at the recovery site.
  • Enables quality of service guarantees for recovered virtual machines.
  • Manages network connectivity and configuration of recovered virtual machines.

Press One Button to Start the Recovery Process

  • Shuts down non-essential virtual machines at the recovery site.
  • Automatically stops replication and promotes the read-only replica at the recovery site to read-write.
  • Rescans the VMware ESX Servers at the recovery site.
  • Registers the replicated virtual machines.
  • Completes power-up of replicated virtual machines in the order specified during creation of the disaster recovery plan.

Testing Automation for Increased Reliability

  • Allows testing to be done at any time.
  • Creates a test environment that is isolated from the production environment so testing requires no down time for production systems.
  • Provides a report of test results.
  • Resets everything in preparation for a disaster.

Supports Bi-Directional Protection Between Two Sites

  • Reduces the need for costly hardware to sit idle at the recovery site.
  • Reduces maintenance costs and power consumption with less hardware at the recovery site.

New in This Release

This release of Site Recovery Manager introduces several new features:

  • New Permission Required to Run a Recovery Plan
    SRM now distinguishes between permission to test a recovery plan and permission to run a recovery plan. After an SRM server is updated to this release, existing users of that server who had permission to run a recovery plan no longer have that permission. You must grant Run permission to these users after the update is complete. Until you do, no user can run a recovery plan. (Permission to test a recovery plan is unaffected by the update.)
  • Full Support for RDM devices
    SRM now provides full support for virtual machines that use raw disk mapping (RDM) devices. This enables support of several new configurations, including Microsoft Cluster Server. (Virtual machine templates cannot use RDM devices.)
  • Batch IP Property Customization
    This release of SRM includes a tool that allows you to specify IP properties (network settings) for any or all of the virtual machines in a recovery plan by editing a comma-separated-value (csv) file that the tool generates.
  • Limits Checking and Enforcement
    A single SRM server can support up to 500 protected virtual machines and 150 protection groups. This release of SRM prevents you from exceeding those limits when you create a new protection group. If a configuration created in an earlier release of SRM exceeds these limits, SRM displays a warning, but allows the configuration to operate.
  • Improved Support for Virtual Machines that Span Multiple Datastores.
    This release provides improved support for virtual machines whose disks reside on multiple datastores.
  • Single Action to Reconfigure Protection for Multiple Virtual Machines
    This release introduces a Configure All button that applies existing inventory mappings to all virtual machines that have a status of Not Configured.
  • Simplified Log Collection
    This release introduces new utilities that retrieve log and configuration files from the server and collect them in a compressed (zipped) folder on your desktop.
  • Improved Acceptance of Non-ASCII Characters
    Non-ASCII characters are now allowed in many fields during installation and operation.
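
The limits described under "Limits Checking and Enforcement" above can be expressed as a simple pre-check. The following is a minimal sketch, not SRM code: the function name and the idea of counting objects yourself before creating a protection group are assumptions made for illustration; only the two numeric limits come from these notes.

```python
from typing import List

MAX_PROTECTED_VMS = 500      # per-server limit stated in these release notes
MAX_PROTECTION_GROUPS = 150  # per-server limit stated in these release notes


def check_limits(protected_vms: int, protection_groups: int) -> List[str]:
    """Return a list of limit violations; an empty list means within limits.

    Hypothetical helper: the counts are supplied by the caller.
    """
    problems = []
    if protected_vms > MAX_PROTECTED_VMS:
        problems.append(
            f"{protected_vms} protected virtual machines exceeds the limit of {MAX_PROTECTED_VMS}"
        )
    if protection_groups > MAX_PROTECTION_GROUPS:
        problems.append(
            f"{protection_groups} protection groups exceeds the limit of {MAX_PROTECTION_GROUPS}"
        )
    return problems
```

A configuration exactly at the limits (500 virtual machines, 150 protection groups) passes this check; anything above either number is reported.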
    Compatibility

    SRM 1.0 Update 1 servers are not compatible with SRM 1.0 clients. You must use an SRM 1.0 Update 1 client plug-in to connect to an SRM 1.0 Update 1 server. Connections from an SRM 1.0 client plug-in to an SRM 1.0 Update 1 server may appear to succeed in some cases, but will produce unpredictable results. If the SRM 1.0 Update 1 server has any protection groups defined, the connection will fail.

    You cannot pair sites that are running different versions of SRM. If you update one member of a site pair to SRM 1.0 Update 1, you must update the other member of the pair before attempting any SRM operations.

    SRM requires VMware Infrastructure 3 Foundation, Standard, or Enterprise. The Site Recovery Manager Compatibility Matrixes guide lists all compatible versions of VMware Infrastructure components, and also lists supported databases and guest operating systems. Compatible storage arrays and SRAs are listed in Site Recovery Manager Storage Partners.

    Installation Notes for This Release

    Before you install this release:

    • Read the Getting Started with Site Recovery Manager for a high-level architectural overview and workflow.
    • Read the Administration Guide for Site Recovery Manager for step-by-step guidance on installing and configuring SRM. This guide contains information about all requirements and procedures to set up SRM and also addresses minimum requirements and scaling limits.
    • Read the Site Recovery Manager Evaluation Guide for a conceptual overview as well as step-by-step workflows describing planning for using SRM, setting up protected and recovery sites, testing failover, the failover and failback process, alarms and status monitoring, and a discussion of roles and privileges.

    Known Issues

    The following are known issues and configuration notes with Site Recovery Manager:

    Database

    • Mixed mode SQL Server Authentication
      When configuring a database connection to a SQL Server database that is not on the same host as the SRM Server, select mixed mode rather than Windows authentication.

    Installation

    • VirtualCenter Database Must Not be Overwritten if VirtualCenter is Updated
      SRM is a VirtualCenter extension. If you update the VirtualCenter installation that SRM extends, you must not overwrite the VirtualCenter database during the update. Doing so removes information that SRM has stored in that database and invalidates the current installation of SRM.
    • Update Servers First
      To avoid various problems with the SRM plug-in when updating SRM, update the SRM servers before you update the plug-in.
    • Before Updating, Uninstall SRM 1.0 Plug-In
      Before you can update the SRM plug-in in a VI Client to version 1.0 Update 1, you must use the Windows "Add and remove Software" tool to uninstall the SRM 1.0 plug-in from that client host.
    • Recovering Overwritten Versions of vmware-dr.xml and Other Configuration Files
      An update of SRM overwrites vmware-dr.xml and other configuration files, including certGenUtil.xml and extension.xml. If you have made any changes to these files, you can recover them from the backup files created by the update (for example, vmware-dr.xml.BAK).
    • Length and Character Set Requirements for Passwords.
      SRM passwords cannot be more than 31 characters long and must consist entirely of ASCII characters.
    • SRM Service Does Not Start After Reinstallation in a Different Directory.
      If you uninstall SRM and then reinstall it in a different directory on the same host but re-use the database connection created by the previous installation, the SRM service fails to start.
    • Non ASCII Characters are Not Supported in Some Fields
      SRM supports entry of non-ASCII characters in most fields during installation. If you enter a non-ASCII character into a field that does not support it, the installer warns you and requires you to re-type the entry in an acceptable character set.
    • Enabling and Disabling the SRM Plug-in
      The VI Client fails to display the SRM user interface if the SRM plug-in is disabled and then enabled.
      Workaround: Close and reopen the VI Client after you enable the SRM plug-in.
    • SRM Server Installation Fails and Reports the Error: "Failed to Register Extension"
      During the SRM Server installation, the installation fails and reports the error message: "failed to register extension." SRM reports this error if VirtualCenter Server has license issues. For example, if the VirtualCenter Server isn't licensed, or it lost connection to its license server, registration of the extension fails during installation.
    • Installation fails if DSN has trailing space
      During SRM installation, if you specify a DSN that has a trailing space character (for example, "SRM DB "), the installation fails.
    • A Non-Specific Error Message Displays if the SRM Server is Down During SRM Plug-in Installation
      If the SRM Server is down or unreachable when you try to install the SRM plug-in in the VI Client, the VI Client displays the message "The remote server returned an error: (503) Server Unavailable."
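
Two of the installation issues above (the password length and character-set requirement, and the DSN trailing-space failure) can be caught before running the installer. The following is a minimal sketch of such a pre-flight check; the function names are hypothetical and are not part of SRM or its installer.

```python
from typing import List

MAX_PASSWORD_LENGTH = 31  # limit stated in these release notes


def password_problems(password: str) -> List[str]:
    """Return the reasons a password would be rejected (empty list if none)."""
    problems = []
    if len(password) > MAX_PASSWORD_LENGTH:
        problems.append("password is longer than 31 characters")
    if not password.isascii():
        problems.append("password contains non-ASCII characters")
    return problems


def dsn_has_trailing_space(dsn: str) -> bool:
    """True if the DSN ends with a space, for example "SRM DB "."""
    return dsn.endswith(" ")
```

For example, `dsn_has_trailing_space("SRM DB ")` returns `True`, which flags the DSN from the example above before the installation fails.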

    Roles and Permissions

    • Recovery Plan Administrator Must Have Read Permission for All Recovery Plans
      A user who has administrator permission for any recovery plan must be granted read permission for all recovery plans. Assigning read permission for all recovery plans enables the user to access hidden metadata that must be read when an administrator role accesses a specific recovery plan.
    • SRM Role Assignments and VirtualCenter
      When you assign a role to an SRM inventory object such as a protection group or recovery plan, that role assignment is not visible in the VirtualCenter Administration Roles pane. You can see it by viewing the properties of the SRM object.

    SRM Service Failure

    • SRM Service Fails to Start if SRA is Corrupted or Not Found
      The SRM service will fail to start if an SRA it has been configured to use is uninstalled, becomes corrupted, or is reinstalled in a different directory.
    • SRM Service Fails to Start if VirtualCenter is not Running
      The SRM service will not start unless the VirtualCenter Server on which it depends is running.
      Workaround: Ensure that VirtualCenter is running before trying to start SRM.

    VI Client and SRM Plug-in

    • Display Refresh Issues When Using Multiple Virtual Infrastructure Clients
      If you are using more than one Virtual Infrastructure Client to manage SRM, changes initiated by one client may not be reflected in the displays of the other clients.
    • Certificate Warnings when Connecting to SRM
      The SRM plug-in may report a certificate problem warning about a host name mismatch when you connect to a local or remote SRM server. Unless there are other problems with the certificate, you can accept it for this connection.
    • VI Client Does Not Display Current Information if the SRM Service Fails
      If the SRM service fails and then reconnects to the SRM Server, the VI Client does not display current information for Site Recovery.
      Workaround: Restart the SRM Service and then restart the VI Client.
    • VI Client Must Be Restarted if it Loses Connection with SRM
      Site connection is not updated if the local SRM server loses connectivity with the remote SRM server.
      Workaround: Restart the VI Client after the recovery SRM Server restarts.
    • Unauthorized operations can sometimes be selected
      Some operations for which the user does not have privileges appear to be available in the user interface and can be selected. If they are selected, the operation fails due to an authorization failure.
    • SRM Plug-in is Still Present After the VI Client is Uninstalled
      The SRM plug-in is not uninstalled when the VI Client is uninstalled. After reinstalling the VI Client, the SRM plug-in is still present.
      Workaround: Using the VI Client, uninstall the SRM plug-in before uninstalling the VI Client.

    Site Pairing

    • Invalid ESX Server Certificate Causes Errors During Customization
      Server certificates created by the default ESX installation may appear invalid to SRM and cause errors indicating problems with the server certificate to be logged during customization.
      Workaround: If you cannot install an acceptable certificate on the ESX host, you can disable certificate checking by setting the value of the <disableNFCServerCertificateChecks> parameter in vmware-dr.xml to true. This forces all ESX server certificates to be accepted, and therefore creates a security risk that could potentially compromise the user name and password for any ESX server involved in customization.
    • SRM Reports Error Messages When Breaking Site Connection
      After attempting to break the protected and recovery site connection, SRM reports the errors: "Unable to break the connection with remote site because it is currently used by other users" and "The request refers to an object that no longer exists or has never existed." These errors appear if the recovery user permissions are changed to "No Access" when the VI Client is connected to the protected site.
      Workaround: Do not change user permissions to "No Access" from the recovery site while the protected-site VI Client is connected to the remote site as this user. If you receive these errors, restart the protected site's SRM service and the VI Client.
    • Accepting Thumbprints for Secondary Servers During Site Pairing Reports "Incompatible Authentication Method" Error
      During site pairing, SRM suggests accepting thumbprints for the secondary server. Thumbprint certificate validation during pairing is not a valid option if SRM and VirtualCenter authentication is using trusted certificates.
    • VI Client Displays "Loading..." in the SRM Tab if the SRM Server is Unavailable
      If the SRM Server is not installed or available, the "Connect To VMware SRM Server" button displays and the SRM tab displays "Loading..." for the status of each SRM component.
      Workaround: Start the SRM service if it is not running.
    • Configure Array Managers Display is Not Refreshed After Connecting the Protected and Recovery Sites
      After reconnecting the protected and recovery sites, the Configure Array Managers summary information in the VI Client is not refreshed and the information is out of sync.
      Workaround: Restart the VI Client, then launch Site Recovery Manager.
    • Protected Site Shows "Unable to Connect" After Successful Connection
      After successful connection between protected and recovery sites, the protected site reports "Unable to Connect" and eventually reports the error: "Low Resources on Pair..."
      Workaround:
      1. Restart the SRM Service.
      2. Close the VI Client for the recovery site.
      3. Break the connection and configure connection from the protected Summary page.
      4. Start the VI Client and log in to the recovery site.
      5. Select Site Recovery and configure the connection from the remote site.
    • During Site Connection, an SSL Exception error reports: "The host certificate chain is not complete"
      When trying to connect protected and recovery sites, an SSL Exception error reports: "The host certificate chain is not complete." This error occurs if the certificate on the protected site is changed before pairing with the recovery site.
      Workaround: Restart the SRM service on the protected site before pairing with the recovery site.
    • Error Message Displays While Breaking Recovery Site Connection
      Breaking the connection from the protected site to the recovery site displays the error: "Object reference not set to an instance of an object" after the sites are disconnected.
      Workaround: Acknowledge the error message.
    • Cannot Break Connection After the VI Client Process is Terminated Abnormally
      You cannot break the connection with the recovery site from the protected site if the vpxClient.exe process is not running. The error message: "Unable to break the connection with the remote site because it is currently used by other users" is reported.
      Workaround: Restart both SRM Servers, then break the connection between the recovery site and the protected site.
    • Inventory Mappings Information is Incorrect
      After breaking and reconnecting site pairing, the VI Client at the protection site might not display correct information in Inventory Mappings.
      Workaround: Refresh the Inventory Mappings to display the actual mappings. Click the Refresh button from the Inventory Mappings tab.
    • Pairing Site to Itself Doesn't Fail in the Correct Step
      If you select Site Recovery > Configure and enter the local VirtualCenter Server IP address, SRM continues to the next connection step and asks for user credentials. The connection should fail when the local VirtualCenter Server's IP address is entered.
      Workaround: Do not attempt to pair with the local VirtualCenter Server.
    • When Pairing Sites, Use Trusted Certificates
      When you pair sites and the certificates of the recovery-site VirtualCenter Server and SRM Server are not trusted by the protection-site SRM server, yellow warning triangles, rather than green check boxes, appear to the left of the Certificate Validation steps. The triangles warn that the certificates did not pass the validation requirements: each certificate must be signed by a trusted Certificate Authority (CA) and have a DNS value matching the address of the server. They also indicate that, during pairing, the user chose to accept the certificates based on their SHA-1 thumbprints. It is a serious security violation to accept certificates based on their thumbprints without verifying that the thumbprints are correct.
      Workaround: Ensure that both VirtualCenter Servers and both SRM Servers use trusted certificates.

    Protection Group

    • VM Name Column Must be Populated When Using Batch IP Customization Tool
      If you use the batch IP customization tool to customize IP properties, you must copy the VM Name (column 2 of the row for Adapter ID 0) into column 2 of each row that you add for a virtual machine.
    • Protected Virtual Machine Converted to Template Loses Protection
      If you convert a protected virtual machine to a template, the protection on that virtual machine becomes invalid and must be reconfigured. Otherwise subsequent recoveries of that VM will fail.
      Workaround: Remove protection from the virtual machine at the protected site, and then reprotect it.
    • No Support for Customization of Debian and Ubuntu Guests
      Linux guests based on the Debian and Ubuntu distributions (and related ones) cannot be customized. Placeholder virtual machines for these guests are recovered running the configuration that they have at the protected site.
    • Customization Specification Manager Does Not Reflect Changes Made by Batch IP Customization Tool
      If you use the batch IP customization tool to customize IP properties in a recovery group, the Customization Specification Manager window does not reflect those changes even after you refresh the display.
      Workaround: Close and re-open the Customization Specification Manager window.
    • VI Client Inventory Reports the Error: "The request refers to an object that no longer exists or has never existed"
      After removing a protection group, the VI Client Inventory view on the recovery site is not refreshed. Attempting to select an object from the Inventory reports the error: "The request refers to an object that no longer exists or has never existed."
      Workaround: Restart the VI Client.
    • Protection Groups Display is Not Refreshed After Connecting the Protected and Recovery Sites
      After reconnecting the protected and recovery sites, the Protection Groups display in the VI Client is not refreshed and the information is out of sync.
      Workaround: Restart the VI Client.
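
The batch IP customization requirement above (copy the VM Name from the Adapter ID 0 row into column 2 of every row you add) is easy to miss when editing the CSV file by hand. The following is a minimal sketch that fills in blank VM Name cells. The column layout (VM ID in column 1, VM Name in column 2, Adapter ID in column 3) and the sample header names are assumptions made for illustration; verify them against the header row of the file that the tool generates.

```python
import csv
import io

# Assumed 0-based column positions; only "VM Name is column 2" comes from the notes.
VM_ID_COL = 0
VM_NAME_COL = 1
ADAPTER_ID_COL = 2


def fill_vm_names(csv_text: str) -> str:
    """Copy each VM's name from its Adapter ID 0 row into rows where it is blank."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ""
    header, body = rows[0], rows[1:]

    # First pass: record the VM Name found on each Adapter ID 0 row.
    names = {}
    for row in body:
        if len(row) <= ADAPTER_ID_COL:
            continue  # skip short or empty rows
        if row[ADAPTER_ID_COL].strip() == "0" and row[VM_NAME_COL].strip():
            names[row[VM_ID_COL]] = row[VM_NAME_COL]

    # Second pass: fill blank VM Name cells for rows that share the same VM ID.
    for row in body:
        if len(row) <= ADAPTER_ID_COL:
            continue
        if not row[VM_NAME_COL].strip() and row[VM_ID_COL] in names:
            row[VM_NAME_COL] = names[row[VM_ID_COL]]

    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows([header] + body)
    return out.getvalue()
```

Run the function on a copy of the generated file and review the result before passing it back to the tool.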

    Recovery Plan

    • Curly Braces Not Allowed in Recovery Plan Name
      You cannot use the { or } characters ("curly braces") in the name of a recovery plan.
    • Inaccurate Description of Normal and Low Priority Virtual Machine Startup in Administrator's Guide
      When a recovery plan includes virtual machines hosted on more than one ESX host, virtual machines that have a recovery priority of normal or low are started in parallel. Because they are started sequentially on each ESX host, the amount of parallelism is determined by the number of ESX hosts.
    • Problems Customizing Certain Linux Guest Configurations During Recovery
      Linux guests that are not running an ext2, ext3, or ReiserFS file system may experience customization failures when recovered.
    • Error reported when running recovery plans simultaneously
      Certain array managers do not support simultaneous execution of recovery plans and report an error when such recoveries are attempted.
    • SRM Reports the Error: "Cannot execute scripts" When Customizing Windows Virtual Machines During Recovery
      During test recovery or recovery, when Windows guests are customized, occasionally the virtual machines attempt to shut down gracefully and SRM reports the error "Cannot execute scripts." This results in a hard shut down after customization is complete and the virtual machine remains powered off regardless of its recovery plan priority.
      Workaround: Manually power on the Windows virtual machines that report this error.
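
The startup behavior described above for normal and low priority virtual machines (sequential on each ESX host, parallel across hosts) can be illustrated with a simplified model. The following sketch assumes idealized lockstep "waves", where wave i contains the i-th virtual machine queued on each host. Real startups are not synchronized this way; the function and the host and virtual machine names are hypothetical.

```python
from itertools import zip_longest
from typing import Dict, List


def startup_waves(vms_by_host: Dict[str, List[str]]) -> List[List[str]]:
    """Group VMs into idealized waves: one VM per host starts in each wave."""
    waves = []
    for wave in zip_longest(*vms_by_host.values()):
        # zip_longest pads hosts that have run out of VMs with None.
        waves.append([vm for vm in wave if vm is not None])
    return waves


waves = startup_waves({"esx1": ["a", "b", "c"], "esx2": ["d"]})
print(waves)
```

In this model no wave is ever wider than the number of ESX hosts, which is the point the note makes: adding hosts at the recovery site, not changing priorities, is what increases startup parallelism.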

    Test Recovery

    • Failure to Power Down Virtual Machine at Protected Site Causes Spurious Report of Recovery Failure
      If a recovery plan includes a step that powers down one or more virtual machines at the protected site and does not receive confirmation that the requested power-down completed, the recovery plan is reported as failed even though all other steps may have succeeded.
    • A Stop Button Appears After Starting a Recovery Plan Test
      Occasionally, after you start a recovery test for the first time, a "Stop" button appears with the message: "Stop Recovery. Are you sure you want to stop this recovery plan? This process may take several minutes."
      Workaround: Click "No." The test proceeds and completes successfully.
    • Recovery Plan Test Status is "Running Test" After the Test is Canceled
      Canceling a recovery plan test from the task list cancels the recovery plan test, but the VI Client displays the status as "Running Test" under Recovery Plans.
    • Converting a Template During a Test Leaves the Virtual Machine Unprotected
      If you test a recovery plan containing a virtual machine template, and during the test you convert the template to a virtual machine and then power it on, the test cleanup steps do not unregister the virtual machine correctly and its protection is lost.
      Workaround: To restore protection, manually power off and unregister the placeholder virtual machine and then reconfigure protection.

    Miscellaneous Issues

    • Refresh Inventory Mappings Can Make Display Unresponsive at Sites That Support Large Numbers of ESX Hosts
      When you are connected to a site that supports more than 7 ESX hosts and refresh inventory mappings, the display becomes unresponsive for up to ten minutes.
      Workaround: A patch that corrects this problem is available on the SRM Download Site.
    • Some Arrays May Present Too Many iSCSI Targets
      When using the ESX software iSCSI stack, SRM can manage up to 23 iSCSI targets per host. Arrays that present each LUN as a separate iSCSI target may exceed this limit.
    • Some Arrays Might Require a Second Rescan.
      Some storage arrays might require a second rescan to discover LUNs. HP arrays have been identified as having this requirement. To enable the additional rescan, edit the vmware-dr.xml file at both the protected and recovery sites to add a <hostRescanRepeatCnt> element within the <SanProvider> element. Set the value of <hostRescanRepeatCnt> to 2, as shown in the following example:
      <SanProvider>
          .
          .
          .
          <hostRescanRepeatCnt>2</hostRescanRepeatCnt>
      </SanProvider>
    • Incorrect Step in Specify a Nonreplicated Datastore for Swapfiles Procedure.
      The first line of step 3 of the "Specify a Nonreplicated Datastore for Swapfiles" procedure in Appendix D of the Administration Guide should read "For each host in the cluster:"
    • Long Timeouts for Misconfigured or Corrupted Virtual Machine.
      If a recovered virtual machine does not power on within the specified timeout period, either because it has been improperly configured or has become corrupted during data replication, the recovery plan will wait considerably longer for that virtual machine to time out than the interval specified by "Change Network Settings" in the recovery plan. This type of abnormally long timeout typically occurs only when applying a customization specification to the virtual machine.
      Workaround: During a test recovery, verify that the virtual machine image is not corrupted (will boot successfully) and has VMware Tools installed before customizing it.
    • SRM is Not Compatible With DPM (Distributed Power Management)
      SRM recovery plans cannot power on a host that is in standby mode. If a recovery plan specifies that a host at the recovery site exit standby mode, the host will remain in standby mode, and the virtual machines assigned to that host will not start.
    • Log Collector Does Not Support non-ASCII Encodings
      The log collector does not support use of non-ASCII encodings when writing log files.
    • Japanese Characters in SRM Log Files Use Shift-JIS Encoding
      To read these log files, use a browser, viewer, or editor that can interpret Shift-JIS.
    • Cannot Specify RDM Devices for Templates
      You cannot specify an RDM device in a virtual machine template.
    • Problems When a LUN in a Consistency Group is Not Part of a Datastore Group
      If a consistency group contains a LUN that is not used as a datastore or as an RDM device, SRM may not be able to recover that consistency group.
      Workaround: Add a virtual machine without an OS that has the LUN mapped as an RDM device.
    • VirtualCenter 2.5 Simultaneous Boot Limit
      VirtualCenter 2.5 does not allow you to boot more than 16 virtual machines simultaneously.
    • "Unexpected MethodFault" error when using VC 2.5 Update 1
      When you are using SRM in conjunction with VirtualCenter 2.5 Update 1, attempts to connect to the recovery site sometimes fail and log an error message of the form "DR: Unexpected MethodFault".
      Workaround: Upgrade to VirtualCenter 2.5 Update 2 or later, or restart the VirtualCenter Server at the recovery site.
    • SRM is Incompatible with DRS Clusters That Mix ESX 3.5 and ESX 3.0.x Hosts
      SRM does not support DRS clusters that mix ESX 3.5 and ESX 3.0.x hosts. In such clusters, virtual machines fail and report errors during customization, and resource pool configuration fails.
      Workaround: Create DRS clusters using ESX hosts of the same version.
    • SRM Alarms Appear in the VI Client After SRM is Uninstalled
      SRM Alarm Status (if any) is kept after SRM is uninstalled. If the VirtualCenter Server is not reinstalled and you install SRM again, the previous SRM Alarm Status is applied.
    • The srm-config Tool Exits and Reports the Error: "Error [2]: OSERROR [0x80090016] Failed to open crypto key container for certificate"
      During the SRM certificate replacement process, a Windows API can fail with the error message: "Failed to open crypto key container for certificate." This is caused by one of the following:
      • A missing operating system internal file in the following folder:
        C:\Documents and Settings\All Users\Application Data\Microsoft\Crypto\RSA\MachineKeys
      • Incorrect permissions of one of the following folders:
        C:\Documents and Settings\All Users\Application Data\Microsoft\Crypto
        C:\Documents and Settings\All Users\Application Data\Microsoft\Crypto\RSA\S-1-5-18
        C:\Documents and Settings\All Users\Application Data\Microsoft\Crypto\RSA\MachineKeys

      Workaround: Run the command again or fix the permissions.
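
The second-rescan setting described earlier in this list (adding <hostRescanRepeatCnt> within <SanProvider> in vmware-dr.xml) can also be applied with a script. The following is a minimal sketch using the Python standard library. It assumes only that the file contains a <SanProvider> element somewhere in the tree; the <Config> root in the usage example is a placeholder, not a statement about the real file layout. ElementTree discards comments and the XML declaration when it rewrites a document, so work on a backup copy and apply the change at both the protected and recovery sites.

```python
import xml.etree.ElementTree as ET


def set_rescan_count(xml_text: str, count: int = 2) -> str:
    """Add or update <hostRescanRepeatCnt> inside every <SanProvider> element."""
    root = ET.fromstring(xml_text)
    providers = list(root.iter("SanProvider"))
    if not providers:
        raise ValueError("no <SanProvider> element found")
    for provider in providers:
        element = provider.find("hostRescanRepeatCnt")
        if element is None:
            # Not present yet: append it as the last child of <SanProvider>.
            element = ET.SubElement(provider, "hostRescanRepeatCnt")
        element.text = str(count)
    return ET.tostring(root, encoding="unicode")


# Placeholder document; the real vmware-dr.xml contains many other settings.
sample = "<Config><SanProvider><other>1</other></SanProvider></Config>"
print(set_rescan_count(sample))
```

Running the function a second time updates the existing element rather than adding a duplicate.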

    Configuration Notes

    • VMware Tools Required When Recovering on ESX 3.0.2
      If a protected VM that does not have VMware Tools installed is recovered to an ESX 3.0.2 host, the recovery will not complete. The recovered VM will be registered on the secondary site, but no pre/post power-on steps or customization will be performed on the recovered VM. No customer data will be lost.
    • RDM Descriptors Must be Placed on a Replicated Datastore
      In order to protect virtual machines that use raw disk mapping (RDM) devices, ensure that the RDM descriptor files reside on replicated datastores. Placing the RDM descriptor files in the same datastore as the .vmx file that refers to them is highly recommended.