The Console
VMware's Executive Blog

Fri, 22 Sep 2006
Power and cooling savings with VMware Infrastructure
Posted by Bogomil Balkansky

The datacenter has typically been a quiet and, let's face it, boring place: rows upon rows of identical computers and the droning sound of air conditioning. It is not exactly where journalists look for spectacular drama. And you want to keep it that way: if things are working as they should, no one needs to know what is going on in the datacenter, or that such a place even exists. So why has the datacenter come out of anonymity to fill the pages of the New York Times and the Wall Street Journal? And why are huge billboards along highways and at airports touting concern about computer energy consumption and cost?

The truth is that rising energy prices have hit the datacenter as hard as they have hit Detroit. Modern servers have increasingly become the computing equivalent of gas-guzzling SUVs. While a typical server 10 years ago consumed 100W of power, the average server today consumes four times as much. Servers use about 30% of their peak electricity consumption while sitting idle, which is often more than 80% of the time. Imagine your SUV going through gallon after gallon of gas while sitting in the garage. To make things even worse, server density has doubled at the same time, from 7 servers per rack to 14 servers per rack. The overall power density of the datacenter is increasing by 15% per year.

All the electricity consumed by servers is transformed into heat, so to prevent datacenters from turning into hot houses, cooling equipment consumes additional electricity equal to about 125% of what the servers themselves draw. IDC calculates that the total power and cooling bill for servers in the US stands at a whopping $14 billion a year, and if the current trends persist, the bill is going to rise to $50 billion by the end of the decade. The growth of datacenter energy spending far outpaces the rate at which IT budgets grow, dangerously crowding out other vital IT initiatives and projects.
Not that virtualization is a panacea for every IT woe, but it can definitely help overturn that dire forecast. One of the mainstay use cases of virtualization, server consolidation and containment, allows customers to “squeeze” multiple workloads onto the same server. Needing fewer physical servers has a flow-through effect: VMware customers need less space in the datacenter, and less electricity and cooling. We estimate conservatively that for every workload moved from a physical to a virtual environment, customers can save about $290 a year in electricity costs and about $360 a year in cooling costs. More importantly, these savings accrue year after year. For example, VMware customer Provident Bank reports cutting power consumption by 13,000 watts.

Besides the effect on the company's bottom line, there is something to be said about the environmental impact of virtualization. The $650 per virtualized workload represents 8,000 kWh of electricity saved. With more than 1 million workloads running in VMware virtual machines, the aggregate power savings are about 8 billion kWh, which is more than the heating, ventilation, and cooling electricity consumed in New England in a year. With results like that, your datacenter won’t mind some attention from the press.

posted at: 15:36
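The savings figures quoted in the post above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the constants come straight from the post, while the function names are made up here, and the implied price per kWh and the cooling-to-electricity ratio are derived values that the post itself does not state.

```python
# Per-workload figures quoted in the post.
ELECTRICITY_SAVINGS_USD = 290   # per workload per year
COOLING_SAVINGS_USD = 360       # per workload per year
ENERGY_SAVED_KWH = 8_000        # per workload
# The post says "more than 1 million" workloads; a round 1 million is an
# assumption used here as a lower bound.
WORKLOADS = 1_000_000


def savings_per_workload_usd() -> int:
    """Combined electricity and cooling savings for one workload."""
    return ELECTRICITY_SAVINGS_USD + COOLING_SAVINGS_USD


def aggregate_energy_kwh(workloads: int = WORKLOADS) -> int:
    """Total energy saved across all virtualized workloads."""
    return workloads * ENERGY_SAVED_KWH


def implied_price_per_kwh() -> float:
    """Blended price implied by the quoted dollars and kWh (derived, not quoted)."""
    return savings_per_workload_usd() / ENERGY_SAVED_KWH


def cooling_to_electricity_ratio() -> float:
    """Cooling savings relative to electricity savings (derived, not quoted)."""
    return COOLING_SAVINGS_USD / ELECTRICITY_SAVINGS_USD


print(savings_per_workload_usd())
print(aggregate_energy_kwh())
print(implied_price_per_kwh())
print(cooling_to_electricity_ratio())
```

The per-workload total matches the $650 in the post, and the aggregate matches the 8 billion kWh figure. The cooling-to-electricity ratio comes out close to the roughly 125% cooling overhead mentioned earlier, which suggests the dollar figures were built on that same assumption.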
Wed, 13 Sep 2006
On Benchmarking Virtual Infrastructure
Posted by the Engineering Performance Group
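There are a number of unique challenges in creating sound and meaningful benchmarks for virtualized systems: capturing the key performance characteristics of virtual systems; ensuring that the benchmark is representative of end user environments; making the benchmark specification platform neutral; defining a single, easy-to-understand metric; and providing a methodical way to measure scalability.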
Let's take a look at these in turn.
However, in a virtual environment, the typical usage of a machine is different from what is common on physical machines. One of the key benefits of virtualization is the ability to run multiple virtual machines on the same physical machine to increase the utilization of server resources. Multiple virtual machines running different operating systems and different applications with diverse resource requirements can all be running on the same machine. Moreover, these applications are typically not bottlenecked on any one resource, and have different (often conflicting) response time requirements. Running single-application benchmarks one at a time and then aggregating their metrics might appear to be an easy solution, but that approach doesn’t work: overall performance can be negatively impacted by competing resource demands among the workloads or positively impacted by optimizations such as transparent page sharing. A good virtualization benchmark must therefore include multiple virtual machines running simultaneously.
Make the benchmark specification platform neutral. Care must be taken to ensure that the benchmark specification does not depend on any platform specifics. We’d like to be able to use the benchmark to answer common customer questions such as “What’s the benefit of dual-core (or quad-core) over single-core processors?”, “How does my storage hardware affect the performance of my overall system?” or “What’s the performance difference between hosted virtualization products (like VMware Server) and bare-metal virtualization products (like VMware ESX Server)?”. Another aim of this benchmark is to drive improvement in future platforms, and we would not be able to accomplish this if the benchmark were tied to any specific platform.
Once the metric for each component workload is defined, the next question is how to aggregate them. The aggregation must be done carefully, as the units of the underlying workloads can vary widely and we don’t want a single workload unfairly influencing the final metric. The aggregation should also be meaningful with regard to making the benchmark representative of what end users really run. In addition, the metric for a new benchmark must be easy to reason about, make sense to end users, be easy to compute, and reflect underlying platform differences.
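To make the aggregation concern concrete, here is a minimal sketch of one common way to combine scores that have different units: normalize each workload against a reference platform, then take the geometric mean of the resulting ratios. This is an illustration only, not a description of how our benchmark computes its metric. The workload names, the numbers, and the assumption that higher scores are better are all hypothetical.

```python
import math


def normalize(scores: dict, reference: dict) -> dict:
    """Divide each raw score by the reference platform's score for the same
    workload, producing unitless ratios (assumes higher is better)."""
    return {name: scores[name] / reference[name] for name in reference}


def geometric_mean(values) -> float:
    """Geometric mean of positive numbers, computed via logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


def aggregate(scores: dict, reference: dict) -> float:
    """Single aggregate score relative to the reference platform."""
    return geometric_mean(normalize(scores, reference).values())


# Hypothetical raw scores; note the very different units and magnitudes.
reference = {
    "web_requests_per_sec": 1_000.0,
    "db_transactions_per_min": 50_000.0,
    "mail_actions_per_min": 200.0,
}
candidate = {
    "web_requests_per_sec": 2_000.0,      # twice the reference
    "db_transactions_per_min": 50_000.0,  # same as the reference
    "mail_actions_per_min": 100.0,        # half the reference
}

print(aggregate(candidate, reference))
```

Because each score is first turned into a ratio, a workload measured in tens of thousands of transactions carries no more weight than one measured in hundreds of actions. With the geometric mean, doubling any single workload scales the aggregate by the same factor regardless of which workload it is, which is one way to keep a single workload from unfairly dominating the result.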
In this article we’ve discussed some of the main design challenges for a virtualization benchmark. Turning the design into an easy-to-use benchmark kit brings up additional practical considerations. These include issues such as timing in virtual machines, orchestrating the startup of multiple virtual machines running simultaneously, and determining the right measurement window in the face of bursty workloads. Any approach to creating a benchmark for virtualization must address all these challenges. Creating a benchmark is easy. Creating a credible benchmark that provides a meaningful metric, measures both workload overhead and scalability, is representative of end user environments, cannot be easily defeated, and is broadly applicable is a hard problem!
posted at: 00:00
As virtualization becomes commonplace in the industry, there is increasing
interest in measuring the performance of virtualized platforms. Plenty of
benchmarks exist to measure the performance of physical systems, but they fail
to capture essential aspects of virtual infrastructure performance. We need a
common workload and methodology for virtualized systems so that benchmark
results can be compared across different platforms.
Capture the key performance characteristics of virtual systems. Users compare platforms based on their specific needs. For example, a user running a web server will compare the number of web requests that can be served by each platform, while a DBA will be more interested in the number of database transactions or simultaneous database connections. In the non-virtualized world, users typically bind a single application to a single machine, and benchmarks have been developed to provide metrics for important application categories. For web servers and databases, users could look at results for the industry-standard SPECweb2005 and TPC-C benchmarks, respectively. Such benchmarks run the application to a point where some resource (usually CPU) is saturated and record the performance metric while respecting quality of service measures such as response time. Users not only compare platforms using the metrics provided by these benchmarks, but over time they also build up expertise that allows them to relate their particular environment to published benchmark scores.
Ensure that the benchmark is representative of end user environments. Which workloads should run within the virtual machines in the benchmark? This is a difficult question, as users run a wide range of guest operating systems (e.g. Windows, Linux or Solaris), virtual hardware configurations (32-bit or 64-bit, with 1, 2 or 4 virtual CPUs) and applications in virtual machines. Any workload included in the benchmark must be representative of end user applications, especially in terms of resource usage. For many virtual environments, CPU utilization of the workloads is an important factor in the overall performance of the system, but so are memory, storage and network I/O. Any benchmark that aims to measure the performance of virtual environments is incomplete if it does not address these resources. A virtualization benchmark must take into account existing customer use cases and future trends in hardware and software.
Define a single, easy-to-understand metric. Any good benchmark will have a single, simple metric so that it’s easy to compare different platforms. Secondary metrics can be used to give additional information, but users will base their comparisons on the primary metric. For a virtualization benchmark, should latency or throughput be the primary metric? Or should the load on all the virtual machines be held constant and CPU utilization used as the metric, so that the systems able to handle the load with the lowest CPU usage are deemed best? Should CPU utilization be capped at some limit, or should the workload be allowed to saturate the server? How are quality of service constraints factored in? Considerable experimentation is required to determine the best choices for the benchmark design and hence the validity of the benchmark metrics.
Provide a methodical way to measure scalability. One of the key benefits of virtualization is being able to consolidate workloads onto machines in a scalable manner. It’s important that the benchmark be able to run on a small two-CPU system as well as on the large multicore, multisocket system of tomorrow, and provide a meaningful measure of the relative work that can be performed on the two systems. Besides CPU, platform differences in storage and networking hardware can also affect the scalability of the system and need to be captured by the benchmark.
The benefits of solving this hard problem are great. Having an
industry-standard way of comparing virtualized solutions will allow users to make more
informed decisions regarding the entire stack of virtualization technology.
Such a standard can also drive improvements in future hardware and software,
again benefiting the industry. For these reasons, VMware is committed to
solving this problem. For a while now we've been working on just such a
benchmark. We’ve been talking to many of our customers and partners and doing
lots of experiments to develop a sound design and methodology.
We're referring to this benchmark as VMmark and we plan to present it at