VMmark 2.x will reach end-of-life status on September 26, 2017.
VMmark 2.x results will remain available indefinitely. Please note that VMmark 2.x and VMmark 3.x results are not comparable.
VMmark is a free tool used to measure the performance and scalability of virtualization platforms.
What is the VMmark Benchmark?
VMmark is a free tool used by hardware vendors, virtualization software vendors, and others to measure the performance, scalability, and power consumption of virtualization platforms.
The VMmark benchmark:
- Allows accurate and reliable benchmarking of virtual data center performance and power consumption.
- Allows comparison of the performance and power consumption of different virtualization platforms.
- Can be used to determine the performance effects of changes in hardware, software, or configuration within the virtualization environment.
Why a Virtualization Platform Benchmark?
Server consolidation typically collects several diverse workloads onto a virtualization platform — a collection of physical servers accessing shared storage and network resources. Traditional single-workload performance and scalability benchmarks for non-virtualized environments were developed with neither virtual machines nor server consolidation in mind. Even previous single-server virtualization benchmarks have not fully captured the complexities of today's virtualized data centers.
VMmark 2.x: A True Multi-Server Virtualization Platform Benchmark
VMmark 1.x pioneered single-server virtualization benchmarking with its unique tile-based multi-application design. VMmark 2.x took the next step, filling the need for a multi-server virtualization platform benchmark by incorporating a variety of common platform-level workloads such as live migration of virtual machines, cloning and deploying of virtual machines, and automatic virtual machine load balancing across the data center.
VMmark 2.5: Server and Storage Power Measurement

In addition to the performance measurement in previous versions of VMmark, VMmark 2.5 adds the option to measure both the absolute power consumption and the power efficiency of the servers and storage used in the benchmark tests. This allows purchasing decisions to include not just absolute performance, but also performance per kilowatt, an increasingly significant factor in the total cost of ownership of computing resources.
Note: Though VMmark uses the SPEC® PTDaemon, VMmark results are not SPEC metrics and cannot in any manner be compared to SPEC metrics.
How Does VMmark Work?
The VMmark benchmark combines commonly virtualized applications into predefined bundles called "tiles." The VMmark 2.x score is determined by the number of tiles a virtualization platform can run, the cumulative performance of those tiles, and the performance of a variety of platform-level workloads.
A Peer Reviewed Benchmark with More than 200 Published Results
Before being published, VMmark results must be submitted to a review panel composed of a variety of companies that have published VMmark benchmark results, ensuring the fairness and integrity of the benchmark.
Since VMmark's inception in 2007, more than 200 results have been published on the VMmark website, and VMmark has become the standard by which the performance of virtualization platforms is evaluated.
FEATURES
Application-Centric Benchmarking of Real-World Workloads
VMmark uses workloads representative of the applications commonly found in the data center, such as email servers, databases, and so on. VMware has worked closely with its partners to design and implement the benchmark across various software and hardware platforms and has gathered extensive customer feedback to understand how these applications are typically used in virtualized environments. In this way VMmark measures performance using well-understood, existing workloads with which customers are already familiar.
Unique Tile-Based Implementation
The unit of work for a virtualized data center can be usefully defined as a collection of virtual machines executing a set of diverse workloads. The VMmark benchmark refers to this unit of work as a "tile." The virtual machines that make up a VMmark tile, driven by a client system associated with that tile, perform a variety of tasks, both internally and by interacting with the client system and other virtual machines in the tile.
In VMmark 2.x, the total number of tiles that multiple systems in the data center can accommodate, while administrative operations are performed in the background, provides a coarse measure of that data center's consolidation capacity. The performance of the workloads within those tiles provides a fine measure of the data center's overall performance and, combined with the performance of the administrative operations, is used to calculate a VMmark benchmark score.
Multi-Server Virtualized Data Center Benchmarking
The rapid pace of innovation has quickly transformed typical server usage by enabling easier virtualization of bursty and heavy workloads, dynamic virtual machine relocation (vMotion), dynamic datastore relocation (storage vMotion), and automation of many provisioning and administrative tasks across large-scale multi-host environments. In this paradigm, a significant proportion of the stresses on the CPU, network, disk and memory subsystems can be generated by the underlying infrastructure operations. Load balancing across multiple hosts can also greatly affect application performance. Any relevant benchmarking methodology must still focus on user-centric application performance while accounting for the effects of this infrastructure activity on overall platform performance. VMmark 2.x generates a realistic measure of platform performance by incorporating a variety of platform-level workloads such as virtual machine migration, clone and deploy, and storage migration operations, in addition to traditional application-level workloads.
High-Precision Scoring Methodology
During a VMmark benchmark run, which lasts at least three hours, individual performance metrics are collected every 60 seconds. Each of these metrics represents the performance of an individual application or infrastructure workload.
The application workload metrics for each tile are computed and aggregated into a score for that tile by normalizing the different performance metrics, such as MB/second or database commits/second, with respect to a reference system. A geometric mean of the normalized scores is then computed as the final score for the tile. Finally, the resulting per-tile scores are summed to create the application workload portion of the final metric.
A similar calculation is used to create the infrastructure workload portion of the final metric except that, unlike the application workloads, the infrastructure workloads are not scaled explicitly by the user. Consequently, the infrastructure workloads are compiled as a single group and no multi-tile sums are required.
The final benchmark score is computed as a weighted average, with 80% of the weight given to the application workload component and 20% to the infrastructure workload component. These weights were chosen to reflect the relative contributions of application and infrastructure workloads to overall resource demands.
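The calculation described above can be sketched in a few lines of Python. This is only an illustration of the steps as stated on this page, not the VMmark harness code: the function names, the metric names, and the reference values are all hypothetical, and the actual reference system and any rounding rules are defined in the VMware VMmark Benchmarking Guide.

```python
import math

# Weights stated in the text: 80% application, 20% infrastructure.
APP_WEIGHT = 0.8
INFRA_WEIGHT = 0.2


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    values = list(values)
    if not values:
        raise ValueError("at least one value is required")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def normalized(measured, reference):
    """Divide each measured metric by the reference system's value."""
    return [measured[name] / reference[name] for name in measured]


def tile_score(tile_metrics, app_reference):
    """Score for one tile: geometric mean of its normalized metrics."""
    return geometric_mean(normalized(tile_metrics, app_reference))


def benchmark_score(tiles, infra_metrics, app_reference, infra_reference):
    """Combine per-tile sums and the infrastructure group into one score."""
    # Application portion: per-tile scores are summed.
    app_component = sum(tile_score(t, app_reference) for t in tiles)
    # Infrastructure portion: one group, no multi-tile sum.
    infra_component = geometric_mean(normalized(infra_metrics, infra_reference))
    return APP_WEIGHT * app_component + INFRA_WEIGHT * infra_component


# Hypothetical metric names and values, for illustration only.
app_reference = {"mail_actions_per_min": 100.0, "db_commits_per_sec": 50.0}
infra_reference = {"migrations_per_hour": 10.0, "deploys_per_hour": 4.0}

tiles = [dict(app_reference), dict(app_reference)]  # two tiles at reference speed
infra = dict(infra_reference)                       # infrastructure at reference speed

score = benchmark_score(tiles, infra, app_reference, infra_reference)
```

Because the per-tile scores are summed, a platform that sustains more tiles at the same per-tile performance earns a proportionally larger application component, while the infrastructure component does not grow with the tile count.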
In order for the resultant benchmark score to be considered compliant, the benchmark run must also meet a number of conditions, including minimum quality-of-service requirements.
In addition to the overall benchmark score, a VMmark 2.x full disclosure report also includes the raw and normalized results for each underlying workload and complete details of the virtualization platform configuration. In some cases, studying the workload metrics along with the platform configuration can provide insight into system performance and scaling.
Power Measurement
Power and cooling expenses are a substantial — and increasing — part of the cost of running a data center. Additionally, environmental considerations are a growing factor in data center design and selection. To address these issues, VMmark 2.5 adds optional power measurement to the performance measurements provided by previous VMmark versions. VMmark 2.5 benchmark results can be any of three types:
- Performance only (no power measurement)
- Performance with server power
- Performance with server and storage power
VMmark results with power measurement allow purchasers to see not just absolute performance, but also absolute power consumption and performance per kilowatt. This makes it possible to consider both capital expenses and operating expenses in the selection of new data center components.
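As a rough illustration of what a performance-per-kilowatt figure expresses, the Python sketch below divides a benchmark score by the average power drawn during the run. It assumes power is available as a list of readings in watts and that a simple average is appropriate; both are assumptions made for this example, and the function names are hypothetical. The measurement procedure actually required for a publishable result is defined in the VMmark Run and Reporting Rules.

```python
def average_power_kw(samples_watts):
    """Average of power readings (in watts), converted to kilowatts."""
    if not samples_watts:
        raise ValueError("at least one power sample is required")
    return sum(samples_watts) / len(samples_watts) / 1000.0


def performance_per_kilowatt(score, samples_watts):
    """Benchmark score divided by average power in kilowatts."""
    return score / average_power_kw(samples_watts)


# Hypothetical readings: combined server and storage power, in watts.
readings = [500.0, 700.0]
ppkw = performance_per_kilowatt(6.0, readings)
```

Two platforms with the same score can then be compared on efficiency: the one with the lower average power draw has the higher performance per kilowatt.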
GETTING STARTED
VMmark 2.x System Requirements
This section provides an abbreviated outline of the VMmark 2.x system requirements. For a detailed list, see the most recent version of the VMware VMmark Benchmarking Guide (on the VMmark download page, available after accepting the VMmark EULA).
SYSTEM REQUIREMENTS
VMmark 2.x Software Requirements
Software for VMmark Workloads
Operating systems:
- Microsoft Windows Server 2008 Enterprise Edition, 64-bit
- Microsoft Windows Server 2003 SP2 Enterprise Edition, 32-bit
- SUSE Linux Enterprise Server 11, 64-bit
Applications:
- Microsoft Exchange 2007
- Olio (web and DB) and its required components
- DVD Store 2 (web and DB) and its required components
Software for VMmark Client Systems
Each VMmark 2.x tile requires a client machine. One of these client machines must run Microsoft Windows Server 2008 Enterprise Edition (32-bit or 64-bit); the others can run Microsoft Windows Server 2003 Enterprise Edition (32-bit) or Microsoft Windows Server 2008 Enterprise Edition (32-bit or 64-bit).
Client machines run the following applications:
- VMmark 2.x harness
- STAF framework and STAX execution engine
- Microsoft Exchange Load Generator
- Microsoft Outlook 2007 (standalone or included in Microsoft Office 2007)
- Microsoft Exchange 2007 management tools
- Cygwin
- A Java JDK
- Rain Workload Toolkit
Software Licensing Considerations
Though VMmark is a free tool, some of the third-party software required to run VMmark requires licensing. The following list summarizes which software is free, which is available in no-cost evaluation versions, and which requires paid licenses.
Free Software
You can download the following free software packages:
- Cygwin environment
- STAF/STAX software
- Olio
- DVD Store 2
- Java
- VMware VMmark Harness
Evaluation Software
You can download evaluation versions of the following software packages:
- SLES 11 Linux (64-bit)
- tc Server
- Microsoft Outlook 2007 (standalone or included in Microsoft Office 2007)
Purchased Software
You must purchase licenses for the following software packages if you do not already own them:
- Microsoft Windows Server (three copies per tile; see the VMware VMmark Benchmarking Guide for the specific version requirements)
- Microsoft Exchange Server 2007 Enterprise Edition (available as part of the Microsoft Developer Network (MSDN) Universal subscription)
Getting Started With VMmark 2.x
To get started with VMmark, follow these steps:
- Download the latest VMmark package
The VMmark package contains the VMmark 2.x harness, the configuration files, and much of the software needed to run VMmark.
- Download the VMware VMmark Benchmarking Guide
- Download the template virtual machines
The download page contains links to pre-built template virtual machines for the four types of Linux virtual machines used in the benchmark.
- Extract the VMmark package
Extract the contents of the VMmark package to C:\ on a native Windows Server 2008 client system.
- Refer to the VMware VMmark Benchmarking Guide
Follow the instructions in the Benchmarking Guide to set up and run the benchmark.
- Carefully read the VMmark Run and Reporting Rules
The VMmark Run and Reporting Rules document (found on the download page or in the docs directory of the VMmark package) outlines the requirements for producing a publishable VMmark result. To be published, or otherwise publicly disclosed, a VMmark result must adhere to the latest version of the Run and Reporting Rules.