Business Continuity


Bitcoin Needs a Mainframe

The MTGox exchange and Instawallet, both of which deal in Bitcoins, are suffering security-related outages. The whole currency declined in value as a result, and the attacks may be an attempt to manipulate that value.

by Timothy Sipples April 5, 2013 in Business Continuity, Security

Unplanned 3-Hour Global Mainframe Outage Scheduled

IBM is notifying all its mainframe customers that due to a once-in-a-lifetime "rear endian horological issue" of critical importance, all mainframes, worldwide, must be completely shut down and powered off for 3 hours on April 1, 2013. Any 3 hours will do, according to IBM's press release, but the downtime must occur on April 1, 2013, based on Mainframe Standard Time (MST). Moreover, the outage must be Sysplex-wide in order to correct the horological issue properly. All forms of GDPS, SRDF, and other cross-site recovery procedures must not be initiated. Failure to shut down on April 1 will result in random transposition of date fields so that, for example, April 2, 2013, which might be represented as 02/04/2013, could instead be rendered as 3102/40/20. Such improper date rendering could result in catastrophic business losses, such as bank account holders getting paid 0.1% in annual interest on their accounts.

There are only two exceptions listed in IBM's urgent "red alert." The first exception applies to customers that do not use the so-called "Western calendar" and that only process dates using other calendar systems, such as the Maya calendar. They can postpone powering off their mainframes until 5 minutes before their IBM hardware service agreements expire. The other exception is Cyprus's Laiki Bank, which the red alert describes as a "Permanent Mercy Outage (PMO)."

This worldwide mainframe outage is unprecedented, and it goes without saying that it will significantly disrupt global society. The world's financial systems, public safety including national security, Taco Bell's new SuperMax Burrito, and many other facets of our everyday lives depend on IBM's ordinarily incredibly reliable workhorses. That said, we humans always try to look on the bright side. The BBC interviewed Abigail Smythe, a customer of a large U.K. bank, who says she is looking forward to 21 hours of continuous service. Meanwhile, South Korean broadcaster KBS, which was on the air for 10 minutes today until anonymous hackers compromised its servers again to transmit footage of Kim Jong Un eating sushi and playing with an iMac, mentioned that IBM's red alert will not affect its operations. Visa and MasterCard announced that they've programmed all retail terminals to approve all charges during the 3-hour outage. Credit card industry spokesperson Stuart Umpton notes that "We don't want to do anything that will interrupt our cardholders' spending beyond their means, so we'll just approve everything and clean up any messes starting on April 2."

[Photo: Mainframe users in Venice, California, express horror over the forced worldwide outage, warning the public using this helpful sign. Photo taken by Anthony Citrano (Creative Commons License).]

by Timothy Sipples April 1, 2013 in Business Continuity, Events

RBS NatWest May Need a Mainframe Operator

June was an awful month for the British banking industry: revelations that Barclays and other banks had been fiddling with the LIBOR survey, cheating borrowers out of billions; several banks offering restitution to small business customers who bought poorly explained rate hedging contracts written primarily for the banks' own benefit; and a massive breakdown in payment processing at NatWest and Ulster Bank (both owned by RBS). Some bank customers couldn't access their money for days due to the failures.

RBS hasn't said anything about the root cause of the problem that affected so many customers — and which has cost the bank a substantial sum. However, The Register has published two reports (here and here) which explain the causes. The reports are thinly sourced, so caveat emptor. If the reports are correct, RBS applied an upgrade to their CA-7 scheduling software. The upgrade did not go well, but fortunately they had a safety net: they could back out the upgrade. Unfortunately the operator who backed out the upgrade also erased the roster of scheduled jobs, and it took days to recover from that operator error.

The Register goes on to note that, as recently as February, RBS advertised for one or more India-based CA-7 operators, and it pointedly asked RBS whether the operator(s) who botched the backout are based in India. RBS has declined to comment, but I'm sure that information will be revealed in due course, since British banking regulators will conduct a full investigation.

Let me make an editorial comment here. Some of the most talented IT staff in the world are located in India. That said, RBS apparently wasn't interested in hiring the most talented staff. RBS's management was evidently interested in hiring the cheapest. I don't think a CA-7 operator, particularly one with the awesome responsibility of delivering banking service levels, ought to be paid £11,000 or less per annum.

I'm curious to know the truth, and hopefully the truth will be revealed and/or confirmed. In the meantime, remember that "you get what you pay for" — or what you don't pay for.

by Timothy Sipples July 2, 2012 in Business Continuity, Economics, Financial

Sweden's Tieto Needs a Mainframe

Several government departments in Sweden were down hard for days.

by Timothy Sipples January 13, 2012 in Business Continuity

RIM Needed a Mainframe

Has anybody else noticed that IT service delivery is bad and getting worse? Hardly a day goes by without a big security or availability failure.

For example, Research in Motion (RIM), the maker of the BlackBerry, certainly needs — needed — a mainframe. The company is already in trouble, rapidly losing market share, and now it can't even keep its messaging service running. If businesspeople cannot rely on RIM to keep their BlackBerries in service, then they'll stop buying BlackBerries. RIM might go out of business as a result of this disaster. How much does bankruptcy cost?

R.I.P., R.I.M.

by Timothy Sipples October 19, 2011 in Business Continuity, Current Affairs

Come On, Irene: Tips for Disaster Preparation

The most important news in the world is that a Category 1 or maybe 2 hurricane will come close to the centers of the universe. That would be Washington, D.C., and New York City, of course, which have the biggest populations of English-speaking journalists. I remember when a Category 4 didn't get so much TV coverage, and I'm a young intern compared to Bob Neidig.

Hurricanes happen. Even if the TV networks get way too excited about them, they deserve respect. IBM reminds us about preparing for disasters in a timely press release.

By the way, if you (or your service provider) have got a couple of mainframes spread across a couple of different sites, and you are reasonably competent in using them at least to support your critical business processes, congratulations. "Irene" should blow but not bite. And if you've got some hurricane stories to share, please post them in the comments.

by Timothy Sipples August 26, 2011 in Business Continuity, Current Affairs

Amazon and Microsoft Need Mainframes

According to Wikipedia, lightning occurs somewhere in the world about 44 times per second on average, and a quarter of those flashes strike the ground (roughly 11 ground strikes every second). Meteorologically, lightning is extremely common. All the more reason to wonder why both Amazon's and Microsoft's customers suffered hours-long outages due to a lightning strike.

I found Amazon's advice to its customers particularly galling: "For those looking for what you can do to recover more quickly, we recommend re-launching your instance in another Availability Zone." Translation: You handle your own disaster recovery (if you can), because obviously we can't.

These are certainly not these companies' first outages, and it's rational to assume they won't be the last.

Fortunately Amazon and Microsoft customers have an alternative. They can follow these simple steps:

  1. Find at least two data centers, physically separated — your own, or someone else's. (Scores of IT service companies, not only IBM, operate mainframe-based clouds. They used to be called "service bureaus.")
  2. Put a mainframe at each site — your own, or share someone else's.
  3. Use any of several common, cross-site disaster recovery features available with mainframes, notably IBM's GDPS. Choose whichever flavor meets your particular RTO and RPO requirements (see the sketch after this list).
  4. Hire competent IT staff, and pay them reasonably.
  5. Put your applications and information systems on these mainframes, at least for your most critical business services, end-to-end.
  6. Stop wasting money with Amazon and/or Microsoft.
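
For readers new to the jargon in step 3: the RTO (recovery time objective) is how long you can afford to be down, and the RPO (recovery point objective) is how much recent data you can afford to lose. Below is a minimal Python sketch of that check. It is purely illustrative, with invented names and numbers; it is not a GDPS interface.

    # Illustrative only: check whether a disaster-recovery drill met the
    # business's RTO/RPO targets. All names and figures are hypothetical.
    from dataclasses import dataclass

    @dataclass
    class DrillResult:
        failover_seconds: float          # time until service resumed (drives RTO)
        replication_lag_seconds: float   # data not yet at the second site (drives RPO)

    def meets_targets(drill: DrillResult, rto_s: float, rpo_s: float) -> bool:
        # Both conditions must hold: back in service fast enough,
        # and little enough data at risk.
        return (drill.failover_seconds <= rto_s
                and drill.replication_lag_seconds <= rpo_s)

    # Synchronous replication between nearby sites loses no committed data;
    # asynchronous replication over long distances trades some lag for distance.
    metro_sync = DrillResult(failover_seconds=120, replication_lag_seconds=0)
    long_async = DrillResult(failover_seconds=300, replication_lag_seconds=45)

    print(meets_targets(metro_sync, rto_s=600, rpo_s=0))  # True
    print(meets_targets(long_async, rto_s=600, rpo_s=0))  # False: 45s of data at risk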

by Timothy Sipples August 8, 2011 in Business Continuity, Cloud Computing

Goldman Sachs Needs a Mainframe

"To our Valued Clients, Due to unexpected trade processing delays, we are experiencing custody reporting delays. We apologize in advance for this inconvenience...."

(Blank) Needs a Mainframe is an ongoing series of posts to The Mainframe Blog, offered as a public educational service to our readers. Mankind always strives for perfection, but unfortunately perfection is not yet obtainable. However, when you need IT service delivery that's as close as possible to perfection, you need a mainframe, or maybe a couple — and you must use them, end-to-end, for the business services that must be delivered as perfectly as possible. Anything else simply isn't the best — and could cost you and your clients billions in the midst of a global financial crisis. Which would be...inconvenient.

by Timothy Sipples August 5, 2011 in Business Continuity, Economics

Mainframe Reports that Customs Clearing Agency Needs a Mainframe

I'm expecting an important letter from China, and it's being shipped via one of the world's biggest express shipping companies. Unfortunately it's now two days late (and counting). I have been following the letter's progress online, and here's an actual record from the system:

Shanghai, China | 29/06/2011 0:33 | Clearing agency computer system breakdown; temporarily unable to transmit shipment info

That reminds me of another true story. IBM mainframes offer a standard "call home" feature, and most customers use it. The system itself can contact IBM and report any urgent issues that might impact service. One day a mainframe automatically contacted IBM and (paraphrasing) reported that "It's getting warm in here. I'll keep running, but if it gets much hotter then I'll reduce CPU speed and give priority to the most important services and users. Before I do that, it'd be nice if a human investigated." An IBM technician immediately telephoned the customer and asked that they check their data center. Sure enough, a major server vendor's (not IBM's) blade server had caught fire and was burning out of control. (If there was a fire suppression system it didn't activate.) The mainframe literally saved the data center.
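
The "call home" behavior in that story amounts to a simple monitoring loop: watch an environmental sensor, degrade gracefully if necessary, and ask for a human before things become an emergency. Here is a rough Python sketch; the thresholds, the simulated sensor, and the call_home() function are all invented for illustration and bear no relation to IBM's actual implementation.

    # Hypothetical sketch of a "call home" thermal monitor, loosely modeled
    # on the story above. Everything here is invented for illustration.
    import random
    import time

    WARN_C = 35.0      # ask a human to investigate
    THROTTLE_C = 45.0  # reduce CPU speed and shed low-priority work

    def read_inlet_temp_celsius() -> float:
        # Stand-in for a real hardware sensor; returns a simulated reading.
        return random.uniform(25.0, 50.0)

    def call_home(message: str) -> None:
        # Stand-in for the machine contacting the vendor's service desk.
        print(f"CALL HOME: {message}")

    def check_once() -> None:
        temp = read_inlet_temp_celsius()
        if temp >= THROTTLE_C:
            call_home(f"Inlet {temp:.1f}C: reducing CPU speed and giving "
                      "priority to the most important services and users.")
        elif temp >= WARN_C:
            call_home(f"Inlet {temp:.1f}C: still running, but it'd be nice "
                      "if a human investigated.")

    for _ in range(5):  # a real monitor would loop forever
        check_once()
        time.sleep(1)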

by Timothy Sipples June 30, 2011 in Business Continuity

Travelodge Needs a Mainframe (Updated)

The hotel chain faces fines of up to £500,000.

UPDATE #1: The Arizona Department of Public Safety needs more safety: a mainframe.

UPDATE #2: Starbucks Singapore still needs a mainframe. It's Monday in Singapore, and Starbucks Singapore cards aren't working at Starbucks Singapore shops again. That's just ridiculously, embarrassingly bad. It's the payment equivalent of going to Starbucks and discovering they've run out of coffee. Memo to Starbucks Singapore: Call NCS, First Data, and/or IBM, ask them for a mainframe-hosted payment card solution, and fix this problem already. Sincerely, your caffeine-addicted (and now dwindling) customer base.

Meanwhile, over at The Coffee Bean shops in Singapore, there's a buy one get one free promotion if you use your Visa payWave card. Visa has a mainframe and uses it. Last I heard they've had a total of a very few seconds of outage (planned and unplanned combined) over the past decade plus — if you merely swiped (or waved) twice, you wouldn't have noticed.
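
To put that Visa claim in perspective, here is the back-of-the-envelope arithmetic in Python. The ten-second total is an assumption for illustration; the claim above says only "a very few seconds."

    # Availability implied by ~10 seconds of total downtime over a decade.
    # The downtime figure is assumed for illustration.
    SECONDS_PER_YEAR = 365.25 * 24 * 3600

    downtime_seconds = 10.0
    years = 10

    availability = 1 - downtime_seconds / (years * SECONDS_PER_YEAR)
    print(f"{availability:.7%}")  # ~99.9999968%, well past "five nines"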

UPDATE #3: To give you a better idea how serious this problem is, the Starbucks Singapore shop near my office has a corporate-issued (and professionally made) "Our cards aren't working" sign posted atop the counter. In other words, Starbucks cards are so unreliable the corporate office had to issue signs to its shops, similar in quality and appearance to its menu boards.

My dear Starbucks friends: do you make your coffee with portable electric tea kettles? Pick the right tool for the job. That would be a mainframe for payments. Mainframes work, and you can buy or rent one. In the meantime, if you need some help you can find me over at The Coffee Bean.

by Timothy Sipples June 24, 2011 in Business Continuity, Security

About "(Blank) Needs a Mainframe"

By now Mainframe Blog readers have seen several "...needs a mainframe" posts. We try to set some trends here, and that's the whole point of these posts.

The central premise (if you'll pardon the pun) of mainframe computing is quality. Sure, you can add, subtract, multiply, divide, and branch on a PC or an iPod. Lots of computers are Turing-complete, including mainframes. But if you have a business or government to run, and if at least some of your business processes are important, then, quite simply, you need a mainframe — and you need to use it. Otherwise, it's going to be much harder to deliver the security, reliability, and other qualities real people increasingly demand.

The information technology industry solved these quality problems long ago, and the solution relied on the highest quality infrastructure (i.e., mainframes) combined with centrally focused, highly disciplined operations and change management (i.e., mainframe-related development and operations), end-to-end. We know that formula works. Yet far too many businesses, governments, and their IT organizations have lost the plot, implementing obscenely complex, Rube Goldberg-esque application architectures to fulfill even the most common and critical business functions. Such architectures are costly, fragile, and vulnerable.

Unfortunately, as we've seen over just the past few weeks, quality is deteriorating. Major businesses are crashing and burning, hard, with security and availability crises causing major disruptions. Public "cloud computing" isn't going very far unless quality improves dramatically and quickly. Only the fit will survive: the organizations that have or adopt mainframes and actually use them for their critical business processes, end-to-end. It's really that simple: "Fit for Quality."

One technology company that distinguishes itself on quality is the world's largest technology company: Apple. [Embedded here: a 30-second video example from 1995.]

Apple is a remarkable company. Apple has mastered the "it just works" segment of the consumer technology market. As technology (and life) gets ever more complicated, and as the value of time increases, more and more people value technology like Apple's. The same is true in the world's data centers. Businesses and governments want solutions that deliver secure, reliable service. Those qualities are becoming more important every day. And I think IBM ought to press home its advantages and repeat this simple phrase:

Get a mainframe... and use it!

by Timothy Sipples June 14, 2011 in Business Continuity, Security

U.S. Airways (Now United Airlines) Needs a Mainframe

U.S. Airways, which is merging with United Airlines, had to stop flying on Friday when key applications, notably its boarding system, became unavailable. The airline claims that a power outage near its (only?) data center in Phoenix caused the failure, which in turn caused chaos at U.S. Airways counters and boarding gates despite clear skies and good weather.

If U.S. Airways had a pair of IBM mainframes — one in their primary data center, one in a second data center — if they configured them in a remote cluster (using an appropriate flavor of Geographically Dispersed Parallel Sysplex), and if they actually used those mainframes to support their most critical business processes, end-to-end, then it's extremely unlikely they would have had this problem — and certainly not for hours. That particular infrastructure formula should be familiar. Was U.S. Airways following that formula? If not, why not?
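
The value of the second site is straightforward probability. If each site is independently unavailable with probability p, both are down together with probability p squared. The 0.1% figure in the sketch below is an assumed illustration, not U.S. Airways data, and real sites must avoid shared failure modes (such as a common power feed) for the independence assumption to hold.

    # Why two sites beat one: independent failures multiply.
    # The 0.1% per-site figure is assumed for illustration.
    p_site_down = 0.001             # one site: ~8.8 hours of downtime per year
    p_both_down = p_site_down ** 2  # ~32 seconds per year, if truly independent

    HOURS_PER_YEAR = 365.25 * 24
    print(f"one site:  {p_site_down * HOURS_PER_YEAR:.1f} hours/year down")
    print(f"two sites: {p_both_down * HOURS_PER_YEAR * 3600:.0f} seconds/year down")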

UPDATE #1: The International Monetary Fund needs a mainframe.

UPDATE #2: The United States Senate needs a mainframe.

by Timothy Sipples June 12, 2011 in Business Continuity, Security

Sony Needs a Mainframe (Update: Starbucks Singapore, Too)

Sony's Playstation Network, Sony Entertainment Japan, Sony Music Greece, and Sony Ericsson Canada have all been hacked.

UPDATE #1: Skype... er, Microsoft... needs a mainframe, too.

UPDATE #2: Starbucks needs a mainframe, at least in Singapore. I tried to use my Starbucks card to pay for my coffee this morning, but the barista informed me that "the servers are down in Singapore." So Starbucks cards don't work reliably at Starbucks.

UPDATE #3: Sony still needs a mainframe. Sony Pictures has also been hacked. Meanwhile, Starbucks Singapore still needs a mainframe, too. Starbucks Singapore is accepting its own cards once again after days offline. But Starbucks won't accept Singapore-issued cards outside Singapore, not even at the Starbucks locations in Singapore Changi Airport. Anybody know why I should allocate precious wallet space to a Starbucks Singapore card?

by Timothy Sipples May 25, 2011 in Business Continuity, Security

Maybe It's Time for More Mainframe Solutions


Sony reports a huge data breach involving its PlayStation Network. At this writing, Sony has not been able to bring services back online, leaving millions of gamers (and Sony's coffers) poorer.

South Korea's NH Bank also went offline. Preliminary signs point to a sophisticated employee-mounted attack in that case, one which wiped out both primary and disaster recovery resources concurrently. Nobody is sure which employee(s) were involved, though.

I hope we can all learn from these experiences and others, which unfortunately seem to be growing in frequency and severity.

UPDATE: South Korean investigators now think that North Korean experts were behind the devastating attack on Nonghyup Bank, which wiped out many of the bank's credit card records and disabled the bank's core services for several days. Meanwhile, the Korea Internet Security Agency (KISA) reports that 82.7% of South Korean companies do not have any plan for recovery in the event of a disaster or attack. That includes numerous large South Korean businesses. Having no DR plan at all, not even a substandard one, would be unthinkable in many countries — and hopefully it is now unthinkable in Korea, too. (Photo: Sony executives bow. See the full story at The Australian.)

by Timothy Sipples April 26, 2011 in Business Continuity, Current Affairs, Games, Security



The postings on this site are our own and don’t necessarily represent the positions, strategies or opinions of our employers.
© Copyright 2005 the respective authors of the Mainframe Weblog.