Agility with Performance

Virtualizing Hadoop using VMware vSphere® brings new levels of agility to help deploy, run and manage Hadoop clusters while maintaining system performance on par with physical deployments. Through the VMware vSphere® Web Client, enterprises can flexibly deploy resources to adapt to changing business requirements at the click of a button.

Elastic Scaling

Dynamically scale Hadoop clusters by separating data and compute to allow for elastic scaling while keeping data safe and persistent. By placing compute and data in separate virtual machines, administrators can commission / decommission stateless compute nodes to adapt to fast changing business requirements while keeping data persistent. The ability to intelligently scale allows enterprises to increase resource utilization and flexibly adapt to bursty workloads.

Reliability and Security

Leverage an accepted enterprise-wide HA solution from vSphere for Hadoop workloads while keeping your data safe and secure through VM-level isolation. Virtualizing your Hadoop cluster using vSphere allows for enterprises to achieve resource, version, and security isolation to give enterprises complete control of data, solve resource contention issues, and provide privacy between users.

Mixed Workload Capabilities

Eliminate the need to buy dedicated hardware for Hadoop clusters. By virtualizing Hadoop using vSphere, enterprises can repurpose underutilized hardware resources to concurrently run different types of workloads alongside Hadoop on a single physical host. This unlocks unused resources and increases system utilization to achieve maximum hardware efficiency.

Technical Details

Mixed Workloads / Hadoop-as-a-Service

Separating data and compute in different virtual machines allows for true multi-tenancy, mixed workload capability, and paves the way to enabling an internal Hadoop-as-a-Service.

VM-based Isolation

Creating multiple virtual machines on a single physical host allows for resource, version, and security isolation to give enterprises a new level of control and agility when managing data across the enterprise.

vSphere HA / FT for Hadoop

Leverage vSphere to enable broad HA and FT support across multiple single points of failure in a Hadoop system, beyond the NameNode and JobTracker nodes.

Lifecycle Automation

Automate resource optimizations using VMware vSphere® Big Data Extensions™ to allocate compute and storage resources to adjust on the fly to balance different workload requirements.

Optimize Hadoop on Virtual Platform

Leverage Hadoop Virtual Extensions (HVE) to make Hadoop systems topology aware. HVE, which is part of Apache Hadoop 1.2.0 release, preserves data locality across Hadoop clusters to ensure comparable performance to native deployment.