Archive for October, 2007

Virtualized Computing Blades? Storage Blades? Talk About Running With Scissors

Know What Virtualization Is, But What Is Next? – Chapter 04

Talk about running with scissors? Running with scissors is easy to do, and it is even easier to end up cut, or worse. The computing industry does the same thing all the time: locking onto one fact and letting it trump all others, or even exclude reality. What does that visual have to do with virtualization, or more to the point, blade technology? Blade technology addresses many of the concerns of current and future datacenters. For example:

  • Cable/Connection Management
  • Power Consumption
  • Heat Generation
  • Total Rack Space
  • Technical Assistance Confusion
  • Remote Control/Console
  • Configuration Management (Profiles/Personality)

Given a bit of thought, I am sure you can see how blade design addresses the above concerns. Even more important, blades are cool, sexy (in a Geek way). Yes, they actually generated significant heat in past and current vendor-specific blade generations, but newer lower-wattage CPUs and better designs are addressing this. Another issue with early blade design was a lack of redundancy in connectivity, power, etc. However, blades reduce space and the complexity of maintenance, are hardware homogeneous, and when using boot from SAN, failover is relatively painless; in fact, with some vendors these virtual connections are virtually transparent, no pun intended. Adding in the emerging technology of direct-current power backbones? Should that technology ever prove itself, power consumption may improve as well, and some of the heat loading might be offset. But direct current is a shoot-the-moon issue, as I see it now.

Furthermore, I leave it to the reader to find out which vendors do what in reference to specific details, but they are all moving toward making virtualization easier on blades in some way. From my personal experience, HP is leading the way, IBM is slower but moving, and Dell, well, Dell has missed the boat so far in comparison.

The real news is storage blades, and to a lesser degree half-height chassis design, which provides some smaller-site scaling. Imagine a small site that has, say, four (4) computing blades, one being a failover blade, and three (3) storage blades that virtualization sees as shared storage? Talk about easy to do! Did someone say running with scissors? Virtual connection management, such as what HP does, combined with transplanting the blade profiles or personalities, makes it even less of a pain, maybe even painless, in that the local technical support needs to know absolutely nothing about the setup; if a blade fails, just swap it, just like swapping a single physical disk in a RAID 5 set.
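To make the blade-swap idea concrete, here is a minimal sketch of profile-based failover: with boot from SAN, the blade's "personality" (boot target, addresses) lives in a profile, not in the hardware, so recovery is just reassigning the profile to a spare bay. All names and fields below are illustrative placeholders, not any vendor's actual API.

```python
# Hedged sketch: a blade "personality" is data attached to a bay, so
# failover means moving that data to the spare bay, nothing more.
# Profile names, bay numbers, and LUN names are hypothetical.

profiles = {
    "app-server-01": {"bay": 1, "boot_lun": "san-lun-17"},
    "app-server-02": {"bay": 2, "boot_lun": "san-lun-18"},
    "app-server-03": {"bay": 3, "boot_lun": "san-lun-19"},
}
spare_bays = [4]  # the dedicated failover blade

def fail_over(profile_name):
    """Reassign a failed blade's profile to the spare bay."""
    bay = spare_bays.pop(0)
    profiles[profile_name]["bay"] = bay
    return bay

print(fail_over("app-server-02"))  # the personality now boots from bay 4
```

The point of the sketch is that local hands never touch configuration: the profile follows the bay assignment, just as a RAID 5 set rebuilds onto a replacement disk.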

What blades do not do well is avoid costs up front; setup of blade infrastructure is not cheap. However, I do not see this as a negative, since a proper virtualization strategy is to pre-provision. This is the part where those of us running with scissors get cut, and of course management does all the bleeding. I can hear the snorts, coughs, moans, and in a few cases the sound of blood rushing to the brain to stage a massive headache. I have yet to see anyone in management faint, but I remain hopeful to see that as well at some point. Is it not funny how management has so much trouble with words that begin with M, for example, management (of time/resources), money, maintenance, monitoring, etc.? So in response, and to avoid the potential impact to management of all those words that start with M, does blade technology really save any money? That is a good question. The only real answer is… economies of scale, and your respective implementation strategy. If virtualization is the dominant strategy for your organization, not looking at blade technology is a true opportunity cost.

October 31st, 2007

No More Low Hanging Fruit, Really?

A Strategy for Controlling Non-Virtualization Candidate Selection
Virtualization, Fine, Well Sort Of? – Chapter 04

There are three philosophical conflicts within the virtualization community. One you are already familiar with from when I was rather comical, in my view, about how virtualization is addictive to management. This article will now discuss the two counter measures to this addiction of declaring anything and everything a candidate for virtualization.

The first counter measure is to establish virtual instances for everything based on factual data. This is not a bad thing in concept; however, it is expensive, in that every questionable virtual instance must be validated via empirical results. This type of proof-of-concept oriented candidate selection is like stepping into the mud before you look to see if there is mud to step in. Developers and clients alike actually hate this methodology. Forcing developers to develop on virtual instances, the clients to certify on virtual instances, and only then going to production on virtual instances is safe, sound, and consistent, but it makes everyone follow the rules all the time. Yikes! By chance or fate, if the number of virtualized instances that have to be backed out to traditional hardware is small, then maybe this strategy works. Another significant factor that really hurts this back-out scenario is when physical-to-virtual (P2V) efforts are done to retire very old hardware, so backing out from virtualization can mean purchasing new hardware. Ouch!

The second counter measure, which is difficult to communicate to management, is to drive co-hosting, or container-oriented concepts, in parallel to typical hypervisor-based virtualization. Now, now, for those thinking, wait, containers are virtualization: yes, but containers really sit in between hypervisor virtualization and true co-hosting of application frameworks. For example, Citrix is a variant container model that promotes heterogeneous application co-hosting. Enlightened use of Citrix can achieve some very interesting and significant results. However, moving far beyond Citrix, and of course even Windows System Resource Manager, which at this point is a feature-weak container framework, re-evaluate the true power of Microsoft SQL Server, Oracle, and Microsoft IIS. Yes, there are others, but I use these few as examples only. These applications are true frameworks, which support instances! Using application frameworks effectively, you can far outperform hypervisor-based virtualization. Wow! True, this methodology requires very experienced engineering and operational teams, in truth, skilled integrators. Of course, co-hosting capable application frameworks do have a dark side, which they share with their operating-system-based container heritage siblings: one bad operating system or framework-wide issue, an update or patch gone wrong, can be catastrophic. This is the one risk that hypervisor models avoid, of course, since the operating system and any encapsulated application framework in such an instance could survive with scaled, cautious deployment of updates.
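The density argument above can be sketched with simple arithmetic: every VM pays a per-guest operating system tax that co-hosted framework instances do not. All figures below are hypothetical placeholders, not benchmarks.

```python
# Hedged sketch of why framework co-hosting can out-consolidate a
# hypervisor on the same host: RAM is the only resource modeled, and
# every number here is a made-up illustration.

HOST_RAM_GB = 64
APP_RAM_GB = 3          # working set of one application/database instance
GUEST_OS_RAM_GB = 1.5   # per-VM guest operating system overhead
HYPERVISOR_RAM_GB = 2   # hypervisor's own footprint
HOST_OS_RAM_GB = 4      # the single shared OS in the co-hosting model

# Hypervisor model: each instance carries its app plus a guest OS.
vm_density = int((HOST_RAM_GB - HYPERVISOR_RAM_GB) // (APP_RAM_GB + GUEST_OS_RAM_GB))

# Co-hosting model: one OS, then pure application instances.
cohost_density = int((HOST_RAM_GB - HOST_OS_RAM_GB) // APP_RAM_GB)

print(vm_density, cohost_density)  # co-hosting fits more instances
```

Of course, as the paragraph above notes, all of those co-hosted instances share one OS, so one bad patch takes out every one of them; the arithmetic ignores that risk entirely.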

So how does this discussion apply to the title? It should jump out at you! The true strategy for non-virtualization is to be very good at virtualization candidate selection and implementation. Bad candidates for virtualization of any type, hypervisor or container oriented, are obvious: beyond hardware disqualification, there is performance disqualification against existing empirical data. Management has a very hard time yelling about objective results; doing so is a waste of time, thus a waste of money. Cha-ching!

October 17th, 2007

Long Distance Disaster Recovery?

Know What Virtualization Is, But What Is Next? – Chapter 04

To be honest, I have been avoiding this subject. Long distance disaster recovery is one of the weak points in virtualization, for a number of technical, and a few financial, reasons. For example, the per-instance archival/restore solutions available today, and long distance recovery solutions in general, share several technical problems often cited:

  • Do not scale well (this is true of every single-instance archival/restore solution available today)
  • Are vendor specific
  • Require extensive bandwidth
  • Have limited vertical scale (imaging issues)
  • Have limited horizontal scale (DASD allocation issues)
  • Are hard to manage, monitor, and control
  • Are inflexible once implemented

Likewise, the financial reasons often cited in reference to long distance recovery:

  • Infrastructure that is underutilized/idle
  • Network bandwidth required over distance

As with any solution, or should I say situation? The world is changing, and vendors are running after issues, cough, dollar signs. VMworld 2007 was no exception in this regard; the hot topics for the super-sessions, meaning attendance was in the multiple 100s per session, were centered around disaster recovery, site management, and to a lesser degree image scaling. Unfortunately, the concept of total scale is still not quite enterprise level for my taste; for example, every single solution for disaster recovery presented would have issues beyond 250 host servers, more than 1500 virtual instances, or 10 or more actual allocated terabytes of DASD per site. Wait, wait, some are saying already, that it is only the pretty big corporations that are doing that! Well, in point of fact, everyone doing virtualization is looking at more, not less, virtualization, so realistic scaling for disaster recovery solutions over long distance should be looking to support 1000 or more virtual hosts, and at least 10,000 or more virtual instances across an enterprise, and distances should not be in the 10s or even 100s of miles, but in the 1000s of miles. Yes, 1000s of miles. Think of real disasters! A disaster recovery site should, and could, be 1000 miles away, even on a different tectonic plate if possible. In other words, how far can a hurricane travel over land? Not far, but the flooding and storm damage often reaches 100s of miles inland. Archival/restore options per virtual instance just do not scale, thus storage-array based methods will dominate the virtualization industry, and this is not a predictive comment, but a fact. All the applicable vendors know this, never mind the fact that we, as clients of virtualization, have been yelling about this for the last 2 to 3 years.
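A quick back-of-the-envelope calculation shows why per-instance replication at the scale discussed above runs straight into the bandwidth wall. The daily change rate and replication window below are hypothetical assumptions, not measured figures.

```python
# Hedged sketch: sustained WAN bandwidth needed to ship one day's
# changed data to a distant recovery site within a given window.
# The 2 GiB/day change rate per instance is a made-up assumption.

def required_mbps(instances, daily_change_gib_each, window_hours):
    """Sustained Mbit/s to replicate a day's changes within the window."""
    total_bits = instances * daily_change_gib_each * 8 * 1024**3
    return total_bits / (window_hours * 3600) / 1e6

# The enterprise target from the discussion: 10,000 virtual instances.
# Even a modest change rate demands roughly 2 Gbit/s, around the clock.
print(round(required_mbps(10_000, 2.0, 24)))
```

This is exactly why the per-instance archival/restore approach loses to storage-array methods: arrays can snapshot and ship only changed blocks, while instance-level tools tend to move far more data than actually changed.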

But I digress. For long distance recovery, as a concept, to explode in a positive sense, regardless of where the recovery site is situated, a few things need to happen beyond the scaling discussion, including the following:

  • Standardized use and implementation of image scaling
  • Standardized use of thin-disk methods for DASD allocation
  • Standardized use of storage-array level snapping, cloning, etc.
  • Convince management that no matter what is done, bandwidth will be required, maybe even a dedicated storage area network, cough, cough

I am not going to explain the bandwidth issue; it is obvious. Very long distance disaster recovery models need bandwidth, and of course no one wants to hear that, but it is true. Storage area networking is needed to implement a number of emerging virtualization technologies; welcome to emerging virtualization life. Moving on… standardized use of anything is good from a practical perspective, since it eliminates one of our big issues, vendor specific implementations. We want storage-array snapshots to be compatible, just like SCSI is compatible today, right? Or better, if we can get it. We want to have NetApp in one site migrate storage-array snapshots to EMC, for example. Don't laugh, it will happen, but the storage array vendors don't like the idea. This cross-vendor model would also address migrations to newer platforms and different models of implementation. Image scaling, which is not here yet to any realistic degree, is the idea that DASD that is read-only in concept gets leveraged. For example, 90-plus percent of the operating system footprint in a virtual instance is static, so I should be able to use it over and over across instances, with only the DASD that actually changes isolated per instance. Dang, does this sound like a container model? It should! Combine image scaling with thin-disking methods, where the operating system thinks it has 100GB for data, but the actual partitioning on the storage array is only what is needed plus a growth-factor offset, and only unique data counts as DASD growth. For example, if a given instance is only using 20GB, it really only has a 20-plus GB footprint on the storage array. This reduces the cost factors and allows for better utilization of DASD resources, which should make the accounting geeks happy. Did I really just say accounting geek?
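The thin-disk and image-scaling arithmetic just described can be sketched in a few lines. The 25% growth offset, the per-instance usage, and the shared-image size are all hypothetical placeholders chosen only to mirror the 100GB-provisioned, 20GB-used example above.

```python
# Hedged sketch of thin-disk plus image-scaling savings. All sizes
# are made-up illustrations, not vendor figures.

def allocated_gb(instances, used_gb, growth_factor=0.25, shared_image_gb=0.0):
    """GB actually consumed on the array.

    Thin provisioning stores only used data plus a growth offset;
    with image scaling, the read-only OS image is stored once,
    not once per instance.
    """
    per_instance = used_gb * (1 + growth_factor)
    return shared_image_gb + instances * per_instance

# Thick provisioning: 100 instances at 100 GB each, used or not.
thick = 100 * 100

# Thin provisioning: each instance really uses only 20 GB.
thin = allocated_gb(instances=100, used_gb=20)

# Thin plus image scaling: a 12 GB static OS image stored once,
# with only 8 GB of unique data per instance.
thin_shared = allocated_gb(100, used_gb=8, shared_image_gb=12)

print(thick, thin, thin_shared)  # each step shrinks the array footprint
```

The stacking of the two techniques is the point: thin-disking alone recovers the empty space, while image scaling on top of it deduplicates the static operating system footprint as well.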

For those keeping score about long distance disaster recovery, not the geek name calling, the last issue is monitoring, management, and control. Well, that is the real kicker: without universal standardization of storage-array models across vendors, at least all significant vendors, such a tool or methodology is lacking. Well… Actually… That is changing, but it is in the embryonic stage at best. VMware has a new toy, VMware Site Recovery Manager (SRM), and other virtualization vendors will follow soon. It aims to solve the key issue that plagues our topic of discussion, the lack of monitoring, control, and management. However, since VMware SRM does not employ thin-disking or image scaling as yet, the opportunity for someone else to snake this market niche away from VMware is obvious, no?

October 10th, 2007

Engineering By Fact? Select A Type Of Virtualization?

Virtualization?  What The Heck Is That? – Chapter 04

Virtualization is not a simple concept. In this series introducing virtualization, we have discussed what the various types of virtualization are, how to identify what you don't know so you know what to look for in a virtualization solution, and even how total cost of ownership can be impacted by virtualization. Thus, virtualization is not a simple concept. However, the day has arrived: now the time has come to pick a virtualization strategy. Now is the time to implement. Most important, now is the time to not fall in love with it, implement it, and then realize it is not quite what you expected. Never commit the one true crime: ignoring results. Virtualization is what it is, nothing more, and nothing less.

Yes, money must be spent, time spent, and results must be evaluated. Virtualization is a commitment, no doubt. Provided the initial scale was sufficient to produce valid results, say, a few host servers, although for some even one server is enough, ask: just how many virtual instances can you run per host? How many hosts will you need per site? How will you back up the instances sufficient to your archive needs and disaster recovery needs? What can be virtualized? What can not be? Is the horizontal and vertical scale of virtualization appropriate for your situation? Watch out, this is all results driven, or it had better be, and what do you do when the results cost real money?

This is why the concept of engineering by fact must be the cornerstone of your defense. Yes, defense; your management is already addicted. They are just itching to scale up and out, with only one problem: they want more virtual instances per host, they want more bang for the buck. In fact, they do not want to honor the results. Just about everyone at the front end of virtualization scaling will say, come on, this is just not rational, and yet just about everyone on the back end of virtualization is saying, oh wow, you bet it happens. For the sake of discussion, say a given company proved that, given a specific set of hardware, it could host ten (10) virtual instances per virtual host, taking into account network IO, disk IO, memory loading, and CPU loading; in fact, after 18 months, the results proved that with proper tuning of the virtual hosts, twelve (12) virtual instances makes factual sense. Ok, engineering by fact: 10 to 12 instances, based on rock solid data. End-users are happy, the engineering team feels good, and the operations team feels wonderful because they do not feel like they are sitting under the sword of Damocles.
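The engineering-by-fact reasoning above, network IO, disk IO, memory, and CPU each setting a ceiling, boils down to taking the minimum across resources. Every number in this sketch is a hypothetical placeholder, chosen only to mirror the 10-then-12 story.

```python
# Hedged sketch: instances-per-host is bounded by the scarcest
# resource, with a safety headroom on each. All figures are made up.

def max_instances(host_capacity, per_instance_load, headroom=0.2):
    """Supportable instances per host, keeping headroom on every resource."""
    usable = {r: cap * (1 - headroom) for r, cap in host_capacity.items()}
    return min(int(usable[r] // per_instance_load[r]) for r in per_instance_load)

host = {"cpu_ghz": 16.0, "ram_gb": 64.0, "disk_iops": 6000, "net_mbps": 2000}
instance = {"cpu_ghz": 1.2, "ram_gb": 4.0, "disk_iops": 400, "net_mbps": 120}

# Initial conservative result, then the result after 18 months of
# tuning justifies shrinking the headroom on factual grounds.
print(max_instances(host, instance))        # conservative ceiling
print(max_instances(host, instance, 0.05))  # ceiling after tuning
```

Note what the model says management's 25-per-host demand would require: either hardware with more of the binding resource, or abandoning the headroom that keeps end-users happy. Wishing does not change the minimum.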

What happens? Senior management states they must have twenty-five (25) virtual instances per host, or virtualization is not profitable! The engineering team states that they did engineering by fact, evaluated real data both from the lab and from actual production results for trending load analysis, and even had positive end-user experience as long as no more than 12 virtual instances are used per host, and then asks, where in the world did 25 instances per host come from? Management says, because we want it. Want to guess how many times this same senior management will ask why 25 instances were not possible? Not doing engineering by fact is just plain insane.

October 3rd, 2007

