VMware

VMware vSphere Big Data Extensions 1.0 Release Notes

vSphere Big Data Extensions 1.0 | 22 September 2013 | Build 1315424

Check these release notes for additions and updates.

What's in the Release Notes

These release notes apply to vSphere Big Data Extensions 1.0 and cover the following topics:

vSphere Big Data Extensions 1.0 Features

Big Data Extensions enables the rapid deployment of a Hadoop cluster on a VMware vSphere virtual platform. This release provides the following features.

  • Support for Major Hadoop Distributions. Big Data Extensions includes support for Apache Hadoop, Cloudera, Greenplum, Hortonworks, MapR, and Pivotal. HBase, Pig, and Hive are also supported. The Big Data Extensions virtual appliance includes Apache Hadoop 1.2. Customers can easily upload distributions of their choice and configure Big Data Extensions to deploy their preferred distributions.
  • Quickly Deploy, Manage, and Scale Hadoop Clusters. Big Data Extensions enables the rapid deployment of Hadoop clusters on VMware vSphere. You can quickly deploy, manage, and scale Hadoop nodes using the virtual machine as a simple and elegant container. Big Data Extensions provides a simple deployment toolkit that can be accessed through VMware vCenter Server to deploy a highly available Hadoop cluster in minutes using the Big Data Extensions user interface.
  • Graphical User Interface Simplifies Management Tasks. The Big Data Extensions plug-in, a graphical user interface integrated with vSphere Web Client, lets you easily perform common Hadoop infrastructure and cluster management administrative tasks.
  • Elastic Scaling Lets You Optimize Cluster Performance and Resource Utilization. Elasticity-enabled clusters start and stop virtual machines automatically and dynamically to optimize resource consumption. Elasticity is ideal in a mixed workload environment to ensure that high priority jobs are assigned sufficient resources. Elasticity adjusts the number of active compute virtual machines based on configuration settings you specify.

Installation Notes for This Release

Read the vSphere Big Data Extensions documentation for step-by-step instructions on installing and configuring Big Data Extensions.

If you installed the Beta edition of Big Data Extensions, you cannot upgrade to the released version. Instead, you must create a new Big Data Extensions environment and install the new version of the software.

Notes for the Product Guides

The following information is not currently addressed by the product guides.

  • Do not use Big Data Extensions in conjunction with vSphere Storage DRS
    Big Data Extensions places virtual machines on hosts according to available resources, Hadoop best practices, and user-defined placement policies prior to creating virtual machines. For this reason, you should not deploy Big Data Extensions on vSphere environments in combination with Storage DRS. Storage DRS continuously balances storage space usage and storage I/O load to meet application service levels in specific environments. If used with Big Data Extensions, it will disrupt the placement policies of your Big Data cluster virtual machines.

Known Issues

Big Data Extensions 1.0 has the following known issues. If you encounter an issue that is not in this known issues list, search the VMware Knowledge Base, or let us know by contacting VMware Technical Support.

  • When you create an HBase cluster using a specification file, omit the --type parameter
    When you create an HBase cluster using the cluster create command with a specification file (using the --specFile parameter), do not use the --type HBase parameter. Doing so creates a default HBase cluster, ignoring any configuration you may have defined in the specification file.

    Workaround: Omit the --type HBase parameter when creating an HBase cluster with a specification file. For example:

    cluster create --name myHBase --distro hw --specFile /opt/serengeti/mySpecFile.txt

    For more information on the cluster create command, refer to the VMware vSphere Big Data Extensions Command-Line Interface Guide.
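
    The workaround can be sketched as a small helper that assembles the command line and leaves out --type whenever a specification file is supplied. This is only an illustration: the function name build_cluster_create and its arguments are hypothetical, and it uses only the options mentioned in these notes (--name, --distro, --specFile, --type).

```python
def build_cluster_create(name, distro=None, spec_file=None, cluster_type=None):
    """Assemble a 'cluster create' command line (hypothetical helper).

    When a specification file is given, --type is deliberately omitted,
    because combining the two causes the specification file to be ignored.
    Values containing spaces are not handled in this sketch.
    """
    args = ["cluster", "create", "--name", name]
    if distro:
        args += ["--distro", distro]
    if spec_file:
        # The specification file defines the cluster, so drop any --type value.
        args += ["--specFile", spec_file]
    elif cluster_type:
        args += ["--type", cluster_type]
    return " ".join(args)


command = build_cluster_create(
    "myHBase",
    distro="hw",
    spec_file="/opt/serengeti/mySpecFile.txt",
    cluster_type="HBase",  # ignored because a specification file is present
)
print(command)
```

    With the values above, the helper produces the same command shown in the workaround example.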

  • Specifying an odd number of cores per socket causes cluster provisioning or CPU resizing to fail
    If you create Hadoop Template virtual machines with multiple cores per socket, the number of vCPUs you specify for a virtual machine must be a multiple of the number of cores per socket. For example, if the virtual machine uses two cores per socket, the vCPU count must be an even number, such as 4, 8, or 12. If you specify a vCPU count that is not a multiple of the cores per socket (an odd number in this example), cluster provisioning or CPU resizing fails.
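
    A quick way to check a planned vCPU setting before provisioning or resizing is sketched below. The function names are hypothetical and are not part of Big Data Extensions; the check simply tests whether the vCPU count is a multiple of the cores per socket.

```python
def is_valid_vcpu_count(vcpus, cores_per_socket):
    """Return True if vcpus is a positive multiple of cores_per_socket."""
    if vcpus <= 0 or cores_per_socket <= 0:
        return False
    return vcpus % cores_per_socket == 0


def round_up_vcpus(vcpus, cores_per_socket):
    """Return the smallest valid vCPU count that is >= vcpus."""
    sockets = -(-vcpus // cores_per_socket)  # ceiling division
    return max(sockets, 1) * cores_per_socket


# Two cores per socket: even counts pass, odd counts do not.
for requested in (4, 5, 8, 11):
    ok = is_valid_vcpu_count(requested, 2)
    print(requested, ok, round_up_vcpus(requested, 2))
```

    With two cores per socket, 4 and 8 pass the check, while 5 and 11 fail and round up to 6 and 12.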

  • Create New Big Data Cluster dialog box fails in Internet Explorer 9 if you click the New Big Data Cluster (+) button several times in quick succession
    When using Internet Explorer 9, if you click the New Big Data Cluster (+) button several times in quick succession, several Create New Big Data Cluster dialog boxes appear.

    Workaround: Choose one of the following workarounds to prevent this issue:

    • When using Internet Explorer 9, do not click the New Big Data Cluster (+) button several times in quick succession. Click the button once, and wait for the Create New Big Data Cluster dialog box to appear.

    • Use a Web browser other than Internet Explorer 9 that is supported by the VMware vSphere Web Client.

  • Big Data Extensions user interface shows Big Data clusters as running when they are powered-off
    If you manually shut down virtual machines that are part of the Big Data Extensions environment using vCenter, the Big Data Extensions plug-in is unable to report the change in operational status from running to powered-off.

    Workaround: Use only the Big Data Extensions plug-in interface in the vSphere Web Client, or the Serengeti Command-Line Interface Client, to monitor and manage your Big Data Extensions environment. Performing management operations in vCenter Server may cause the Big Data Extensions management tools to become unsynchronized and unable to accurately report the operational status of your Big Data Extensions environment.

  • Migrating virtual machines in vCenter Server may disrupt the virtual machine placement policy
    Big Data Extensions places virtual machines based on available resources, Hadoop best practices, and user-defined placement policies. For this reason, DRS is disabled on all the virtual machines created within the Big Data Extensions environment. While this prevents virtual machines from being automatically migrated by vSphere, it does not prevent you from inadvertently moving virtual machines using the vCenter Server user interface. Doing so may break the placement policy defined by Big Data Extensions. For example, it may disrupt the number of instances per host and group associations.

    Workaround: If you need to migrate Big Data Extensions virtual machines, carefully plan the migration to ensure the placement policy is not disrupted during migration.

  • Temporarily powering off hosts will cause Big Data clusters to fail during cluster creation
    When creating Big Data clusters, Big Data Extensions calculates virtual machine placement according to available resources, Hadoop best practices, and user-defined placement policies prior to creating the virtual machines. If some hosts are powered off or set to standby when the placement calculation runs, either manually or automatically by VMware Distributed Power Management (VMware DPM), Big Data Extensions does not consider those hosts as available resources for the Big Data cluster.

    If a host is powered off or set to standby after Big Data Extensions calculates virtual machine placement, but before it creates the virtual machines, the cluster will fail to create until you power on those hosts.

    Workaround: The following workarounds can help you both prevent and recover from this issue.

    • Disable VMware DPM on those vSphere clusters where you deploy and run Big Data Extensions.

    • Put hosts in maintenance mode before you power them off.

    • If a Big Data cluster fails to create due to its assigned hosts being temporarily unavailable, resume the cluster creation after you power on the hosts.
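
    The timing problem described in this issue can be illustrated with a small model. This is not how Big Data Extensions is implemented; the Host class, the power state values, and both functions are assumptions made only for illustration. Placement considers only powered-on hosts, and a host that changes state afterwards leaves the plan pointing at an unavailable host.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    name: str
    power_state: str  # hypothetical values: "on", "off", "standby"


def available_hosts(hosts):
    """Hosts a placement calculation would consider: powered-on hosts only."""
    return [h for h in hosts if h.power_state == "on"]


def unavailable_planned_hosts(planned_names, hosts):
    """Names of planned hosts that are no longer powered on."""
    state = {h.name: h.power_state for h in hosts}
    return sorted(n for n in planned_names if state.get(n) != "on")


# At placement time, host-3 is in standby and is excluded from the plan.
at_placement = [Host("host-1", "on"), Host("host-2", "on"), Host("host-3", "standby")]
planned = [h.name for h in available_hosts(at_placement)]

# Before the virtual machines are created, host-2 is powered off.
at_creation = [Host("host-1", "on"), Host("host-2", "off"), Host("host-3", "standby")]
blocked = unavailable_planned_hosts(planned, at_creation)
print(planned, blocked)
```

    In this model, cluster creation would be blocked by host-2 until that host is powered on again, which corresponds to the last workaround above.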