What are some specific requirements for developing a virtualization platform benchmark?
Besides the need to capture key performance characteristics of virtual systems, an appropriate virtual platform benchmark must employ realistic, diverse workloads running on multiple operating systems on multiple hosts. Further, there is a need to define a single, easy-to-understand metric while ensuring that the benchmark is representative of various end-user environments. The benchmark specification needs to be platform neutral and must also provide a methodical way to measure scalability so that the same benchmark can be used for small platforms as well as larger platforms from different hardware vendors.
Why did VMware develop VMmark 2.x?
VMware realized the need for a next-generation virtualization benchmark to compare different virtualization platforms, which consist of multiple hosts, diverse multi-tier workloads and infrastructure operations. VMmark 2.x was created as a standardized way to compare these virtualization platforms.
What is a VMmark tile?
A VMmark tile is a group of eight virtual machines concurrently executing a collection of diverse workloads. Each of these workloads represents a common application workload found in today's data centers. Included in each tile are a mail server, a web 2.0 database and web system, an e-commerce back-end database and front-end web layer, and an idle machine.
Each virtual machine in a tile is tuned to use only a fraction of the system's total resources. As a tile, the aggregate of all workloads utilizes less than the full capacity of modern servers. The saturation of a system's resources and accurate measurement of server performance with VMmark 2.x therefore requires the simultaneous execution of multiple tiles.
Each workload within a VMmark 2.x tile is constrained to execute at less than full utilization of its virtual machine. The performance of each workload can vary to a degree with the speed and capabilities of the underlying system. For example, disk-centric workloads might respond to the addition of a fast disk array with a more favorable score. These variations can capture system improvements that don't warrant the addition of another tile. The workload throttling will force the use of additional tiles for large jumps in system performance. When the number of tiles is increased, workloads in existing tiles might have lower performance. If the system has not been overcommitted, the aggregate score, including the new tile, should increase. The result is a flexible benchmark metric that provides a relative measure of the number of workloads that can be supported by a particular system as well as the cumulative performance level of all the virtual machines.
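The relationship between tile count and aggregate score can be illustrated with a minimal sketch. It assumes, purely for illustration, that a tile's score is the geometric mean of its normalized workload scores and that the aggregate is the sum over tiles; this FAQ does not state the actual formula, and the function names and numbers below are hypothetical.

```python
from math import prod


def tile_score(normalized_workload_scores):
    """Geometric mean of one tile's normalized workload scores.

    Assumption: the FAQ does not specify how a tile is scored; the
    geometric mean is used here only as a plausible illustration.
    """
    n = len(normalized_workload_scores)
    return prod(normalized_workload_scores) ** (1.0 / n)


def aggregate_score(tiles):
    """Sum of per-tile scores (assumed aggregation across tiles)."""
    return sum(tile_score(t) for t in tiles)


# Hypothetical normalized workload scores (1.0 = reference performance).
one_tile = [[1.00, 1.00, 1.00]]

# Adding a tile slightly lowers each existing workload's performance,
# but the system is not overcommitted, so the aggregate still rises.
two_tiles = [[0.95, 0.95, 0.95], [0.95, 0.95, 0.95]]

# On an overcommitted system, per-tile performance collapses and the
# extra tile no longer raises the aggregate.
overcommitted = [[0.45, 0.45, 0.45], [0.45, 0.45, 0.45]]

if __name__ == "__main__":
    for name, tiles in [("one tile", one_tile),
                        ("two tiles", two_tiles),
                        ("overcommitted", overcommitted)]:
        print(f"{name}: {aggregate_score(tiles):.2f}")
```

Under these assumptions, the second tile pays off only while the per-tile loss stays small, which matches the behavior described above.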
Who uses VMmark 2.x?
VMmark 2.x was developed as a tool for hardware vendors, system integrators, and customers to evaluate the performance of their systems. Many customers will not run the benchmark themselves, but rather rely on published VMmark 2.x scores from their hardware vendors to make purchasing and configuration decisions for their virtualization infrastructure.
What are the use cases for VMmark 2.x?
The main use case for VMmark 2.x is to compare the performance of different hardware platforms and configurations. Organizations implementing or evaluating virtualization platforms use VMmark 2.x to compare the performance and scalability of different virtualization platforms, to make appropriate hardware choices, and to measure platform performance on an ongoing basis.
Note that VMmark 2.x is neither a capacity planning tool nor a sizing tool. It does not provide deployment guidelines for specific applications. Rather, VMmark 2.x is meant to be representative of a general-purpose virtualization environment. The virtual machine configurations and the software stacks inside the virtual machines are fixed as part of the benchmark specification. Recommendations derived from VMmark 2.x results will capture many common cases; however, specialized scenarios will likely require individual measurement.
What are the benefits of VMmark 2.x?
With VMmark 2.x, organizations now have a robust and reliable benchmark that captures the key performance characteristics of virtual platforms, is representative of real world environments running multiple workloads, is hardware platform neutral, and provides a methodical way to measure scalability so that the same benchmark can be used across different vendor platforms.
How do I compare VMmark 2.x scores across different virtualization platforms?
A higher VMmark 2.x score implies that a virtualization platform is capable of sustaining greater throughput in a mixed workload consolidation environment, while experiencing data center operations in the background. A larger number of VMmark 2.x tiles used to generate the benchmark result means that the platform supported more virtual machines across the multiple hosts during the benchmark run. Typically, a higher benchmark score requires a higher number of tiles.
If two different virtualization platforms achieve similar VMmark 2.x scores with a different number of tiles, the score with the lower tile count is generally preferred. The higher tile count could be an indication that the underlying hardware resources were not properly balanced. Studying the individual workload metrics is suggested in these cases.
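The comparison rule above can be sketched as a small helper. The `Result` type, the `preferred` function, and the 2% threshold for "similar" scores are all assumptions introduced for illustration; the FAQ does not define how close two scores must be to count as similar.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """A published benchmark result (hypothetical structure)."""
    platform: str
    score: float
    tiles: int


def preferred(a, b, tolerance=0.02):
    """Pick the preferred of two results.

    If the scores are within `tolerance` (relative, an assumed 2%) of
    each other, prefer the result with the lower tile count; otherwise
    prefer the higher score.
    """
    if abs(a.score - b.score) <= tolerance * max(a.score, b.score):
        return a if a.tiles <= b.tiles else b
    return a if a.score > b.score else b


# Hypothetical results: similar scores, different tile counts.
platform_a = Result("Platform A", score=10.0, tiles=10)
platform_b = Result("Platform B", score=10.1, tiles=12)
# Hypothetical result with a clearly higher score.
platform_c = Result("Platform C", score=12.0, tiles=12)

if __name__ == "__main__":
    print(preferred(platform_a, platform_b).platform)
    print(preferred(platform_a, platform_c).platform)
```

A helper like this only narrows the choice. When the rule favors the lower tile count, the individual workload metrics are still worth inspecting, as suggested above.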
How is VMmark version 2.x different from VMmark version 1.x?
VMmark 1.x was designed as a single-system consolidation benchmark consisting of six isolated single-tier workloads. VMmark 2.x was designed as a multi-host benchmark reflecting typical, modern-day usage of virtualized infrastructure. VMmark 2.x consists of two single-tier application workloads, two multi-tier application workloads, and four infrastructure-level workloads.
Are VMmark 1.x results comparable to VMmark 2.x results?
No. The workloads and load levels in VMmark 2.x have changed significantly from VMmark 1.x in order to take better advantage of today's larger and more powerful server hardware. As a result, VMmark 2.x benchmark scores are not comparable to VMmark 1.x benchmark scores.
Are VMmark 2.0 results comparable to VMmark 2.1 and VMmark 2.5 results?
Yes. VMmark 2.1 added support for client systems running certain versions of Windows Server 2008 (in addition to Windows Server 2003, which was supported in VMmark 2.0) as well as support for virtualized clients (subject to certain conditions). VMmark 2.5 added support for optional power measurement.
How is VMmark 2.5 different from previous 2.x versions?
In addition to the optional power measurement functionality in VMmark 2.5, the benchmark has been enhanced in a variety of ways, including increased parallelism at the client level, additional client support, added support for vSphere 5.1 and vSphere 5.5, support for short runs (for testing purposes), and enhanced messaging.
Do all VMmark 2.5 benchmark results include power measurements?
No. VMmark 2.5 provides three test types:
- Performance only (no power measurement)
- Performance with server power
- Performance with server and storage power
Why aren't all VMmark 2.5 results shown on the same web page?
Benchmarkers may choose to optimize a test configuration for a particular aspect of measurement. For instance, if running with a power measurement, the benchmarker may choose to optimize for power over server performance. Representing all server performance results (both with and without power measurements) on the same results page could be misleading. In order to ensure consistent comparability of results, separate results pages are used.
How do I obtain support for VMmark?
There are a number of sources for VMmark support:
- Refer to the VMware VMmark Benchmarking Guide, particularly the "Troubleshooting" section.
- Visit the VMmark community forum.
- Contact the VMmark team (an email address is provided in the VMware VMmark Benchmarking Guide in the "Submit the Benchmark Results for Review" section).