Friday, June 19, 2009

Why Virtualize? A Primer for HPC Guys

Some folks who primarily know High-Performance Computing (HPC) think virtualization is a rather silly thing to do. Why add another layer of gorp on top of a perfectly good CPU? Here's a typical question, one I happened to get in email:

I'm perplexed by the stuff going on at the app layer -

first came the chip + programs

then came - chip+OS+applications

then came - Chip+Hypervisor+OS+applications

So for a single unit of compute, the capability keeps decreasing while extra layers are added over and over again. How does this help?

I mean, virtualization came along for consolidation, thus reducing the price of the H/W footprint... now it's being associated with something else?

In answering that question, I have two comments:

First Comment:

Though this really doesn't matter to the root question, you're missing a large bunch of layers. After your second item should come

chip+OS+middleware+applications

Where middleware expands to many things: Messaging, databases, transaction managers, Java Virtual Machines, .NET framework, etc. How you order the many layers within middleware, well, that can be argued forever; but in any given instance they obviously do have an order (often with bypasses).

So there are many more layers than you were considering.

How does this help in general? The usual way infrastructure software helps: It lets many people avoid writing everything they need from scratch. But that's not what you were really asking; your question was why you would need virtualization in the first place. See below.

Second Comment:

What hypervisors do (really, virtual machines; hypervisors are one implementation of that notion) is more than consolidation. Consolidation is, to be sure, the killer app of virtualization; it's what put virtualization on the map.

But hypervisors, in particular, do something else: They turn a whole system configuration into a bag of bits, a software abstraction decoupled from the hardware on which it runs. A whole system, ready to run, becomes a file. You can store it, copy it, send it somewhere else, publish it, and so on.

For example, you can (there's a small code sketch of a couple of these right after the list):

  • Stop it for a while (like hibernating: a snapshot of the whole running system, not just the disk contents).
  • Restart on the same machine, for example after hardware maintenance.
  • Restart it on a different machine (e.g., VMware's VMotion; others offer the same capability under different names).
  • Copy it to deploy additional instances. This is a core technology of cloud computing, the one that enables "elasticity." (That, and applications structured so this can work.)
  • By adding yet another layer, run it on a system with a different architecture from the original.
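
To make the bag-of-bits point concrete, here's a minimal sketch using the libvirt Python bindings on a Linux/KVM host. The guest name and file paths are made up for illustration; the point is that suspending, copying, and resuming a whole running system amount to a couple of API calls plus ordinary file operations.

```python
# A rough sketch, not production code: assumes a Linux host running KVM,
# the libvirt Python bindings installed, and an already-defined guest
# named "hpc-guest" (a made-up name for illustration).
import shutil
import libvirt

conn = libvirt.open("qemu:///system")     # connect to the local hypervisor
dom = conn.lookupByName("hpc-guest")      # find the running guest

# "Stop it for a while": save the entire running system state (memory,
# device state) to an ordinary file; the guest stops running here.
dom.save("/var/tmp/hpc-guest.sav")

# It's now just a bag of bits: copy it, archive it, ship it somewhere.
# (The guest's disk image is a separate file; a full clone needs that too.)
shutil.copy("/var/tmp/hpc-guest.sav", "/var/tmp/hpc-guest-copy.sav")

# "Restart on the same machine": pick up exactly where the guest left off.
conn.restore("/var/tmp/hpc-guest.sav")

conn.close()
```

Live migration (the VMotion-style case) and cross-architecture emulation take more machinery than that, of course, but the underlying idea is the same: the running system is data that software can move around.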

Most of these things have their primary value in commercial computing. The classic HPC app is a batch job: Start it, run it, it's done. Commercial computing's focus nowadays tends to be: Start it, run it, run it, run it, keep it running even though nodes go down, keep it running through power outages, earthquakes, terrorist strikes, … Think web sites, or, before them, transaction systems to which bank ATMs connect. Not that commercial batch doesn't still exist, nor is it less important; payrolls are still met, although there's less physical check printing now. But continuously operating application systems have been a focus of commercial computing for quite a while now.

Of course, some HPC apps are more continuous, too. I'm thinking of analyzing continuous streams of data, e.g., environmental or astronomical data that is collected around the clock. Things like SETI@home or Folding@home could use it, too, running beside interactive use in a separate virtual machine, continually, unhampered by your kids following dubious links and picking up viruses and trojans. But those are in the minority, so far. Grossly enormous-scale HPC, with tens or hundreds of thousands of nodes, will soon have to think in these terms as the run time of jobs exceeds the mean time between failures (MTBF) of the system, but there it's an annoyance, a problem, not an intentional positive aspect of the application the way it is for commercial computing.

Grid application deployment could do it à la clouds, but except for hibernate-style checkpoint/restart, I don't see that happening any time soon. Grids effectively have a kind of virtualization already, at a higher level (much as Microsoft does its cloud virtualization in Azure, above and assuming .NET).

The traditional performance cost of virtualization is anathema to HPC, too. But that cost is trending toward zero. With appropriate hardware support (Intel VT, AMD-V, similar features from others) it has gone away for processor and memory. It's still there for IO, but it can be fixed; IBM zSeries native IO has had essentially no virtualization overhead for years. The rest of the world will have to wait for PCIe to finalize its IO virtualization story and for IO vendors to implement it in their devices; that will come about in a patchy manner, I'd predict, with high-end communication devices (like InfiniBand adapters) leading the way.
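
If you're curious whether a given x86 box has that hardware support, the kernel exposes the relevant CPU feature flags: vmx means Intel VT-x, svm means AMD-V. Here's a trivial, Linux-only sketch of the check:

```python
# Linux-only sketch: look for the hardware-virtualization CPU flags in
# /proc/cpuinfo. "vmx" = Intel VT-x, "svm" = AMD-V. Note that the flag can
# be present even when the feature is disabled in the firmware/BIOS.
def has_hw_virt() -> bool:
    with open("/proc/cpuinfo") as f:
        flags = f.read().split()
    return "vmx" in flags or "svm" in flags

if __name__ == "__main__":
    print("Hardware virtualization flag present:", has_hw_virt())
```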

So that's what virtualization gets you: isolation (for consolidation) and abstraction from hardware into a manipulable software object (for lots else). Also, security, which I didn't get into here.

Not much for most HPC today, as I read it, but a lot of value for commercial.