VMware

VMware vSphere Big Data Extensions 1.1 Release Notes

Last updated on: 3 October 2014

vSphere Big Data Extensions 1.1 | 3 October 2014 | Build 1474321

Check these release notes for additions and updates.

What's in the Release Notes

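These release notes apply to vSphere Big Data Extensions 1.1 and cover the following topics:

  • What's New in vSphere Big Data Extensions 1.1
  • Installation Notes for This Release
  • Notes for the Product Guides
  • Resolved Issues
  • Known Issues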

What's New in vSphere Big Data Extensions 1.1

Big Data Extensions enables the rapid deployment of a Hadoop cluster on a VMware vSphere virtual platform. This release provides the following new features and enhancements.

  • Support for the Intel Distribution for Apache Hadoop* Software. Big Data Extensions supports the Intel Distribution for Apache Hadoop* 2.5.1. Big Data Extensions users may deploy Intel Distribution clusters, including HBase, Pig, and Hive.

  • Multi-Network Support. You can create separate networks for management, MapReduce, and HDFS traffic, and isolate your storage system on its own network. Separating these categories of network traffic can improve the performance of your Hadoop deployment. You can also expand the IP range available to clusters deployed by Big Data Extensions when using static IP networks.

  • Configurable Minimum and Maximum Number of Elastic Compute Nodes. When using automatic elasticity you can specify a range of compute nodes within which your cluster can operate. Big Data Extensions shrinks and expands the Hadoop cluster within the range of compute nodes you specify. You can also force the cluster to operate exactly at a fixed number of active compute nodes by specifying the same number of minimum and maximum compute nodes.

  • Schedule Fixed Elastic Scaling for a Hadoop Cluster. You can enable fixed, elastic scaling according to a preconfigured schedule. Scheduling fixed, elastic scaling for your Hadoop cluster provides greater control than variable, elastic scaling while still improving efficiency, allowing explicit changes in the number of active compute nodes during periods of predictable usage.

    For example, in an IT environment with typical workday hours, there is likely a reduced load on a VMware resource pool after the office staff goes home. You can configure scheduled fixed, elastic scaling to specify a greater number of compute nodes from 8 PM to 4 AM, when you know that the workload would otherwise be very light.

  • User-Defined Password. You can specify passwords for the Hadoop nodes you create within Big Data Extensions instead of having to use a randomly generated password.

  • Manage vCenter Single Sign-On Using the Big Data Extensions Plug-In. Big Data Extensions uses vCenter Single Sign-On to authenticate between vSphere services. You can use the Big Data Extensions plug-in, a graphical user interface integrated with vSphere Web Client, to register or re-register with the vCenter Single Sign-On server, or update the Certificate Authority (CA) certificates of the vCenter Single Sign-On server.

  • Script to Collect Log Files for Troubleshooting. You can run the serengeti-support.sh script to collect log files from the Serengeti server or from a cluster's nodes to help you and the VMware support team with troubleshooting. Big Data Extensions collects the log files and saves them in a single tarball in the Serengeti Management Server directory from which the script was run.

  • Big Data Extensions Upgrade. You can upgrade from Big Data Extensions 1.0 to the current version and preserve all the data within the clusters under Big Data Extensions management. All of your existing clusters will be available to be managed by Big Data Extensions once the upgrade to Big Data Extensions 1.1 is complete.
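
The two elasticity features above (a minimum and maximum number of compute nodes, and a fixed schedule such as 8 PM to 4 AM) can be pictured with a minimal Python sketch. It is an illustration only and is not how Big Data Extensions implements elasticity; the names `ElasticRange`, `in_window`, and `scheduled_nodes` are hypothetical, as is the assumption that a schedule is a single daily window that may wrap past midnight.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class ElasticRange:
    """Hypothetical holder for the minimum and maximum compute node counts."""
    min_nodes: int
    max_nodes: int

    def __post_init__(self):
        if self.min_nodes < 0 or self.min_nodes > self.max_nodes:
            raise ValueError("need 0 <= min_nodes <= max_nodes")

    def clamp(self, desired: int) -> int:
        """Keep a desired node count inside the configured range."""
        return max(self.min_nodes, min(self.max_nodes, desired))


def in_window(now: time, start: time, end: time) -> bool:
    """True if now is in [start, end); the window may wrap past midnight."""
    if start <= end:
        return start <= now < end
    # Wrapping window, for example 20:00 to 04:00.
    return now >= start or now < end


def scheduled_nodes(now: time, start: time, end: time,
                    inside: int, outside: int) -> int:
    """Pick the fixed node count for the current time (assumed single window)."""
    return inside if in_window(now, start, end) else outside
```

Setting the minimum equal to the maximum, as in `ElasticRange(4, 4)`, makes `clamp` return 4 for any input, which mirrors the fixed-size behavior described above. With a window from `time(20, 0)` to `time(4, 0)`, `scheduled_nodes` returns the larger count at 11 PM and the smaller count at noon.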


Installation Notes for This Release

Read the vSphere Big Data Extensions documentation for step-by-step instructions on installing and configuring Big Data Extensions.

If you installed the Beta edition of Big Data Extensions, you cannot upgrade to the released version. Instead, you must create a new Big Data Extensions environment and install the new version of the software.


Notes for the Product Guides

The following information is not currently addressed by the product guides.

  • Do not use Big Data Extensions in conjunction with vSphere Storage DRS
    Before it creates virtual machines, Big Data Extensions decides where to place them according to available resources, Hadoop best practices, and user-defined placement policies. For this reason, you should not deploy Big Data Extensions on vSphere environments in combination with Storage DRS. Storage DRS continuously balances storage space usage and storage I/O load to meet application service levels in specific environments. If used with Big Data Extensions, it will disrupt the placement policies of your Big Data cluster virtual machines.


Resolved Issues

The following issues have been resolved in Big Data Extensions 1.1.
  • A critical security vulnerability in the Bash shell, referred to as Shellshock, has been identified.

    Exploitation of this issue might lead to remote code execution. The Common Vulnerabilities and Exposures project (cve.mitre.org) has assigned the following names to this issue:

    • CVE-2014-6271
    • CVE-2014-7169
    • CVE-2014-7186
    • CVE-2014-7187
    • CVE-2014-6277
    • CVE-2014-6278

    Big Data Extensions might use the Bash shell that is part of the Linux operating system. If the operating system has a vulnerable version of Bash, the Bash security vulnerability might be exploited through Big Data Extensions.

    If you are running Big Data Extensions 1.x, your environment is vulnerable to the Bash shell security issue. To remediate this issue, you must upgrade to Big Data Extensions 2.0 and install and apply the BDE 2.0 Patch 1. To learn more about Shellshock security issues, and how to download and install the patch, see VMware Knowledge Base article #2091050.
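
As a rough aid for checking whether a Linux system's Bash is affected, the following Python sketch runs the widely circulated environment-variable probe for CVE-2014-6271. It is not a VMware tool, it covers only the first CVE in the list above, and a negative result does not show that the other CVEs are fixed; the function names and the marker string are assumptions made for this illustration. Remediation is still the upgrade and patch described in the Knowledge Base article.

```python
import os
import shutil
import subprocess

# A Bash affected by CVE-2014-6271 runs the trailing command while it
# imports this value as a function definition from the environment.
PROBE_VALUE = "() { :;}; echo vulnerable"
MARKER = "vulnerable"


def output_shows_vulnerable(stdout: str) -> bool:
    """True if the injected command ran, that is, its marker line appears."""
    return MARKER in stdout.splitlines()


def probe_bash(bash_path: str = "bash"):
    """Run the probe. Returns True or False, or None if Bash is not found."""
    resolved = shutil.which(bash_path)
    if resolved is None:
        return None
    env = dict(os.environ, x=PROBE_VALUE)
    result = subprocess.run(
        [resolved, "-c", "echo probe finished"],
        env=env,
        capture_output=True,
        text=True,
        check=False,
    )
    return output_shows_vulnerable(result.stdout)
```

Calling `probe_bash()` on the machine you want to check returns `True` only when the marker line is printed by the injected command.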

Known Issues

Big Data Extensions 1.1 has the following known issues. If you encounter an issue that is not in this list, search the VMware Knowledge Base or contact VMware Technical Support.

  • When you create an HBase cluster using a specification file, omit the --type parameter
    When you create an HBase cluster using the cluster create command with a specification file (using the --specFile parameter), do not use the --type HBase parameter. Doing so creates a default HBase cluster, ignoring any configuration you may have defined in the specification file.

    Workaround: Omit the --type HBase parameter when creating an HBase cluster with a specification file. For example:

    cluster create --name myHBase --distro hw --specFile /opt/serengeti/mySpecFile.txt

    For more information on the cluster create command, refer to the VMware vSphere Big Data Extensions Command-Line Interface Guide.

  • Specifying a number of vCPUs that is not a multiple of the cores per socket causes cluster provisioning or CPU resizing to fail
    If you create Hadoop Template virtual machines with multiple cores per socket, the number of vCPUs you specify for the virtual machine must be a multiple of the number of cores per socket. For example, if the virtual machine uses two cores per socket, the vCPU setting must be an even number, such as 4, 8, or 12. If you specify a number that is not a multiple of the cores per socket (an odd number in this example), cluster provisioning or CPU resizing fails.

  • Create New Big Data Cluster dialog box fails in Internet Explorer 9 if you click the New Big Data Cluster (+) button several times in quick succession
    When using Internet Explorer 9, if you click the New Big Data Cluster (+) button several times in quick succession, several Create New Big Data Cluster dialog boxes appear.

    Workaround: Choose one of the following workarounds to prevent this issue:

    • When using Internet Explorer 9, do not click the New Big Data Cluster (+) button several times in quick succession. Click the button once, and wait for the Create New Big Data Cluster dialog box to appear.

    • Use a Web browser other than Internet Explorer 9 that is supported by the VMware vSphere Web Client.

  • Big Data Extensions user interface shows Big Data clusters as running when they are powered off
    If you use vCenter to manually shut down virtual machines that are part of the Big Data Extensions environment, the Big Data Extensions plug-in is unable to report the change in operational status from running to powered off.

    Workaround: Use only the Big Data Extensions plug-in interface in the vSphere Web Client, or the Serengeti Command-Line Interface Client, to monitor and manage your Big Data Extensions environment. Performing management operations in vCenter Server may cause the Big Data Extensions management tools to become unsynchronized and unable to accurately report the operational status of your Big Data Extensions environment.

  • Migrating virtual machines in vCenter Server may disrupt the virtual machine placement policy
    Big Data Extensions places virtual machines based on available resources, Hadoop best practices, and the user-defined placement policies that you specify. For this reason, DRS is disabled on all the virtual machines created within the Big Data Extensions environment. While this prevents vSphere from automatically migrating virtual machines, it does not prevent you from inadvertently moving them using the vCenter Server user interface. Doing so may break the placement policy defined by Big Data Extensions. For example, it may disrupt the number of instances per host and group associations.

    Workaround: If you need to migrate Big Data Extensions virtual machines, carefully plan the migration to ensure the placement policy is not disrupted during migration.

  • Temporarily powering off hosts will cause Big Data clusters to fail during cluster creation
    When creating Big Data clusters, Big Data Extensions calculates virtual machine placement according to available resources, Hadoop best practices, and user-defined placement policies before it creates the virtual machines. If some hosts are powered off or set to standby during the placement calculation, either manually or automatically by VMware Distributed Power Management (VMware DPM), Big Data Extensions does not consider those hosts as available resources for the Big Data cluster.

    If a host is powered off or set to standby after Big Data Extensions calculates virtual machine placement, but before it creates the virtual machines, cluster creation fails until you power on that host.

    Workaround: The following workarounds can help you both prevent and recover from this issue.

    • Disable VMware DPM on those vSphere clusters where you deploy and run Big Data Extensions.

    • Put hosts in maintenance mode before you power them off.

    • If a Big Data cluster fails to create because its assigned hosts are temporarily unavailable, resume the cluster creation after you power on the hosts.
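
Two of the known issues above come down to input rules that can be checked before a command is run: omit --type when a specification file is given, and choose a vCPU count that is a multiple of the cores per socket. The Python sketch below illustrates both checks. The helper names are hypothetical and are not part of Big Data Extensions; the command builder uses only the --name, --distro, --type, and --specFile parameters that appear in these notes, and it applies the --type rule to every cluster type as a conservative assumption, although the notes describe it only for HBase.

```python
from typing import Optional


def build_cluster_create(name: str,
                         distro: Optional[str] = None,
                         spec_file: Optional[str] = None,
                         cluster_type: Optional[str] = None) -> str:
    """Assemble a cluster create command line (hypothetical helper).

    Refuses --type together with --specFile, because the specification
    file's configuration would otherwise be ignored.
    """
    if spec_file and cluster_type:
        raise ValueError("omit --type when --specFile is given")
    parts = ["cluster", "create", "--name", name]
    if distro:
        parts += ["--distro", distro]
    if cluster_type:
        parts += ["--type", cluster_type]
    if spec_file:
        parts += ["--specFile", spec_file]
    return " ".join(parts)


def vcpus_are_valid(vcpus: int, cores_per_socket: int) -> bool:
    """True if the vCPU count is a multiple of the cores per socket."""
    if vcpus < 1 or cores_per_socket < 1:
        raise ValueError("vcpus and cores_per_socket must be positive")
    return vcpus % cores_per_socket == 0


def round_up_vcpus(vcpus: int, cores_per_socket: int) -> int:
    """Smallest valid vCPU count that is at least the requested count."""
    if vcpus_are_valid(vcpus, cores_per_socket):
        return vcpus
    return (vcpus // cores_per_socket + 1) * cores_per_socket
```

For instance, `build_cluster_create("myHBase", distro="hw", spec_file="/opt/serengeti/mySpecFile.txt")` reproduces the workaround command shown earlier, and `round_up_vcpus(5, 2)` suggests 6 vCPUs for a template with two cores per socket.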