Forrester: Service Level Shortcomings Are Pervasive & Costly
Stephen Swoyer over at Enterprise Systems Journal reports on a new Forrester Research study which finds serious deficiencies in IT service delivery. The study finds that IT only meets Service Level Agreements (SLAs) about 75% of the time. Moreover, business users often find the SLAs woefully deficient — both poorly defined and, even when written and enforced, too limited to serve more demanding business needs.
Anecdotally I am finding the same situation among IT organizations around the world. Many organizations fail to define service levels in business terms, and they often have poor dialog with business stakeholders to help them understand risks and outcomes. They do not even agree on how to measure the outcomes, and frequently there is no measurement. It all sounds disturbingly familiar.
I remember one recent case when a major insurance company went into production with infrastructure to support their entire core application portfolio (underwriting, claims, etc.), but the IT service team failed to provide any disaster recovery. (Yes, literally zero DR.) One data center fire or explosion would have put the company out of business, probably disrupted that entire country's economy, and sent a bunch of company executives to prison. The CEO, to his credit, found out about the gap and immediately ordered expensive remediation. Of course that remediation meant the whole IT project had a negative return on investment, but that's a story for another day.
So what do Forrester's findings have to do with mainframes? Everything, and in at least two ways:
1. Mainframe technologies and disciplines encourage the measurement and management of well-defined service levels to meet or exceed business goals. That remains a serious set of advantages for the platform. If Forrester is correct, that set of advantages is growing over time as business users become more demanding and as IT becomes ever more essential to myriad business processes.
2. If you have a mainframe, you have a challenge if you are not delivering these advantages to your customers. I still run into shops that schedule outages to IPL (reboot) every week to "fix" a 30-year-old (and long ago defeated) memory leak. Come on! Do you think your customers appreciate that? Are you still taking online systems offline in order to run batch? It's 2008. Why? If you don't have Parallel Sysplex, why not? If you do have Parallel Sysplex, is it only for pricing purposes (a "ShamPlex")? Why not actually use what you paid for? Why are you still taking all of DB2 out of service for a version upgrade? Why are you taking the whole Sysplex out of service when you upgrade a frame or microcode? In other words, why are you forcing unnecessary outages on your users? Have you asked them lately how much they'd prefer to avoid outages?
Forrester suggests that the bulk of improving SLAs involves sitting down and talking in business terms with business users. Shocking, I know. :-) But are you?
I see a lot of RFIs and RFPs in my work. I am not fond of them, but I particularly dislike the ones that have little or no appreciation for service level requirements. If you're putting "99.99%" in your RFP and think you're done, you're in serious trouble. Most vendors (including IBM often enough) think you are describing a limit for unplanned outages. Are you, and is that what your business users expect? Then is it OK to have a planned outage every week for 6 hours for backup? (Salespeople being what they are, what do you think their answer is?) And that's just the start. For example, in disaster recovery — you did remember that, right? — do your users want full production capacity? Most vendors won't include that. (And don't you still need to develop and test at some point after a disaster? That may be the most critical time to develop and test. Where's that capacity?) What are the RTOs and RPOs for each business service? (You do have RTOs and RPOs well defined, right?) What sort of failures do you want to protect against? (Have you done a business risk analysis of some kind?) And, assuming you do a good job articulating all your requirements, how will you prove that the vendor is proposing a solution that will actually meet or exceed your standards?
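To see why a bare "99.99%" settles so little, it helps to do the arithmetic. The following is only a rough sketch: it assumes a 365-day year and 52 backup windows per year, and the function and variable names are mine, purely for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years (assumption)


def downtime_budget_hours(availability_pct: float) -> float:
    """Hours of outage per year permitted by an availability target."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)


# Outage budget implied by "99.99%" if it covers ALL outages
budget = downtime_budget_hours(99.99)

# The weekly 6-hour backup window, assuming 52 windows per year
planned = 52 * 6

print(f"99.99% budget: {budget * 60:.1f} minutes/year")
print(f"Planned backup windows: {planned} hours/year")
print(f"Planned outage alone is about {planned / budget:.0f}x the budget")
```

If the vendor reads "99.99%" as unplanned outages only, the planned windows never count against that budget at all, which is exactly the question to settle with your business users before the RFP goes out.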
I think it's also a good idea to understand the costs and implications of incremental SLA improvements. For example, if today you buy something that requires 6 hours of outage every week plus 24 hours of outage to upgrade certain infrastructure elements (such as the database version), how expensive would it be (and would it even be possible) to close those service level gaps? Business needs change, after all: let's call that service level scalability. Can you think of any platform that can manage to multiple service levels, and that can provide easy and comparatively low cost service level upgrades as business needs change?
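One way to frame the "service level gap" is to work out what availability the outage schedule in that example actually delivers, and what each incremental fix would buy. This is a hedged sketch, not a measurement: the one-upgrade-per-year frequency is my assumption (the example only gives the 24-hour duration), and all names are illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years (assumption)


def availability_pct(outage_hours_per_year: float) -> float:
    """Availability (in percent) given total outage hours per year."""
    return 100 * (1 - outage_hours_per_year / HOURS_PER_YEAR)


weekly_outage_hours = 6    # from the example above
upgrade_outage_hours = 24  # from the example above
upgrades_per_year = 1      # assumption: one infrastructure upgrade per year

# Total planned outage under the schedule as purchased
outage_today = 52 * weekly_outage_hours + upgrades_per_year * upgrade_outage_hours

# Hypothetical improvement: the weekly window is eliminated, the upgrade outage remains
outage_no_weekly = upgrades_per_year * upgrade_outage_hours

print(f"As purchased:          {availability_pct(outage_today):.2f}%")
print(f"Without weekly window: {availability_pct(outage_no_weekly):.2f}%")
```

Each step up that ladder has a price tag, and on some platforms a step may not be available at any price. That is the question to ask before you buy.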
I still vividly remember one conversation between a Web development team and a mainframe operations team. (Why they weren't talking with one another before our meeting is a mystery.) The Web team said, "We cannot take the Web application down every weekend, so we need to rehost the application from the mainframe." The mainframe operations team replied, "So, you want us not to IPL that LPAR each weekend?" Web team: "Uh, I think so, but what's an IPL?" Ops team: "An IPL is a reboot. OK, done. We won't IPL that LPAR this weekend or any future weekend. We'll send you a new SLA for your approval." Web team: "Why didn't you do that before?" Ops team: "Because you didn't tell us [in precisely the technical way we expected]." Web team: "It's just that easy? Do you need to do anything?" Ops team: "We weren't IPLing most weekends anyway — we just had that possible planned outage listed in our previous SLA with you. No, we don't need to do anything except send you a new agreement. We'll keep your application up and running." Web team: "Uh, thanks."
Yes, I agree with Forrester: IT organizations need to do much better in these areas. What do you think?
by Timothy Sipples | July 6, 2008 in Economics