Systems Technology

Sun Exits High-End CPU Design

The New York Times reports that Sun has canceled its chronically delayed next generation SPARC CPU, codenamed "Rock":

http://bits.blogs.nytimes.com/2009/06/15/sun-is-said-to-cancel-big-chip-project

The enterprise-class server processor chip business is extremely competitive and capital-intensive. The tough global economy is forcing some tough business decisions. I expect that IBM, in particular, will disproportionately benefit from this decision and should gain additional market share with its POWER and System z products.

But when I first heard this news I did not think about the market implications, the effect on IBM, what happens to Intel, etc. I immediately thought about the employment implications of this announcement. This cancellation is a major blow to the relatively small processor engineering community, and too many talented engineers are presumably losing their jobs. Whatever one might say about Sun now, the company has employed some of the best engineers for many years. That part of the story needs to be reported, too.

UPDATE: IBM reacted quickly to the Itanium (HP) and SPARC (Sun) processor roadmap failures and broken promises, improving its offerings to encourage more migrations to IBM System z and POWER servers. Here's one example: http://www.ibm.com/systems/migratetoibm/systems/z. And another: http://www.ibm.com/press/us/en/pressrelease/27613.wss.

UPDATE 2: Back at the end of April, Marketwatch reported (from a Japanese newspaper source) that Fujitsu is scrapping plans to develop next generation semiconductor technology on its own. The story refers to semiconductor fabrication but also (disturbingly) says that Fujitsu is cutting "other technologies that were in the development pipeline." Which ones? In contrast, IBM has state-of-the-art semiconductor factories and leads semiconductor design (and process) research and development.

by Timothy Sipples June 16, 2009 in Systems Technology

Paving the On Ramp to the Super Highway Mainframe!

If you're a friend of the mainframe, you know that it's the "super highway" of computing. But a lot of people believe it has a dirt road for an on-ramp! IBM Rational has been paving that road with a number of offerings that modernize your current mainframe programming environment and also simplify new workload deployments. There are a couple of webcasts targeted toward financial customers in the month of February.

First is on Feb 12 - Overcoming IT Challenges and meeting business objectives in the banking industry.  Here's a link to register for that webcast.
 http://www.ibm.com/software/info/feb12

Then on Feb 25th and 26th, there will be a webcast entitled: Faster payback for the banking industry using application development solutions from IBM.
These are broadcast at different times to meet worldwide needs. Register for those webcasts at:  http://www.ibm.com/software/info/feb25 

And for your expeditionary planning, Rational will be hosting its annual Software Development Conference in Orlando, FL, USA from May 31-June 4, 2009.
Top 10 reasons not to miss this year’s IBM Rational Software Conference!
    1.    Over 400 sessions in 18 tracks
    2.    Interact with over 4,000 of your industry peers 
    3.    Keynotes with industry-leading experts
    4.    IBM Expo featuring key IBM Business Partners and the IBM Solution Center
    5.    IBM Rational Labs (see possible future capabilities in our products!)
    6.    3 and 5 hour Technical Workshops 
    7.    Free Certification testing
    8.    Executive Summit 2009
    9.    Interactive Birds-of-a-Feather Sessions
    10. Unlimited networking opportunities
And just as important in this cost-conscious environment, register today with discount code SZ09 and get $100 off the registration fee.
Register at http://www.ibm.com/rational/rsdc

by JimPorell February 11, 2009 in Systems Technology

Dynamic Infrastructure + Mainframe = Smart Planet

The electronic age has changed the world remarkably in a single generation. Communication is now instantaneous across the globe, making the world smaller and flatter. And now, with information at our fingertips and so many electronic devices, the world is getting smarter as well. IBM recently introduced its initiatives for a Smart Planet. Today, IBM brought more offerings toward that objective through the announcement of its Dynamic Infrastructure.

 

The Dynamic Infrastructure is intended to make the best use of the computing resources available to solve any type of problem that we can foresee. Its objective is to improve service toward satisfying a problem, while reducing costs and managing risks. Problems today can be much more complex than they were in the past. They are no longer single threaded transactions that might have been quickly solved. Many problems today might be a series of transactions, across a workflow with dependencies on pre-existing information that might have been created just moments ago. This form of workflow requires rapid response and coordination. To that end, the role of a mainframe has evolved to be an important element of this dynamic infrastructure, in collaboration with other systems to meet business problem objectives.

 

Businesses have learned the hard way that, yes indeed, they can throw technology at a problem. Hundreds of individual servers can be strung together to create a complex workflow from which a series of problems can be solved. But in many cases, these processors are underutilized and require massive amounts of power. And because of the sheer redundancy necessary to meet the business problem’s goals, information and data are replicated across the infrastructure, which adds security risk and makes that risk more complex to manage.

 

The goals of the Dynamic Infrastructure are to improve the service of the overall heterogeneous computing infrastructure and to manage this community toward common goals. In addition, costs for computing and risks must be reduced at the same time. This is where the ever evolving mainframe can be of tremendous value. The mainframe’s heritage includes meeting strict service level agreements for business resilience, security, utilization, storage management and business process integration. The goal of IBM’s System z is to extend these strengths into other computing architectures when they are used in collaboration with the mainframe. As a result, a heterogeneous computing infrastructure, made up of IBM’s Power, Modular and storage systems in conjunction with System z, will be enabled to tackle business problems in creative ways. The use of management software and middleware from IBM's Tivoli division, as well as the experience of IBM’s Global Technology Services personnel, will further strengthen a business’s ability to improve service, reduce costs and manage risks.

 

And examples of Dynamic Infrastructure are already evident where organizations see tremendous value. From managing traffic more efficiently to providing a more resilient financial trading environment to meet customer demands, businesses are seeing operational savings that are yielding benefits to their consumers as well. In essence, it’s yielding a Smarter Planet for all of us to benefit.

by JimPorell February 9, 2009 in Systems Technology

A Tale of Two Mainframe customers – one growing and one leaving the mainframe

This is the tale of two mainframe customers. One customer has achieved a period of tremendous growth in their business, processing transactions on the mainframe, while reducing expenses and becoming more resilient. The other business chose to get off the mainframe at a significant cost and in all likelihood, spends more today than they would have on the mainframe. What’s interesting is that at one time, they shared the same system infrastructure. And Clerity, a consulting firm, would like you to believe that the non-mainframe customer got tremendous value in the move. Here are their stories.

In any basic computer architecture class, a student will learn that the fewer the number of data moves, the better for performance. Now, in an era of regulatory compliance and privacy considerations, that becomes even more true, because each instance of data must now be auditable and recoverable, which implies additional costs for each extra instance of data.

This becomes important when considering an outsourced computing environment. It appears that SIAC never got this level of education, while its customer, DTCC seems to have excelled in this computer architecture class. Even funnier is that Clerity has decided that SIAC is a model customer…that doesn’t bode too well for their other consulting arrangements.

 

So what really happened?

DTCC’s trading business was growing tremendously. But let’s have them tell you, in their own words:

 

Sometimes "insourcing" pays off more than outsourcing. Until last year, two DTCC subsidiaries outsourced all their infrastructure support activities to the Securities Industry Automation Corporation (SIAC). Now, following the completion of a multi-year initiative, DTCC has cut costs and bolstered business continuity by insourcing the activities previously performed by SIAC into DTCC’s infrastructure. The two subsidiaries are National Securities Clearing Corporation (NSCC) and Fixed Income Clearing Corporation (FICC).  ……

On top of strengthening the industry's business continuity and infrastructure, the project is yielding financial benefits, enabling DTCC to cut the industry’s overall operating expenses. In 2006, by leveraging DTCC's processing capabilities, insourcing has reduced DTCC's annual operating expenses an estimated $42 million, said William Aimetti, DTCC’s chief operating officer. This was one factor that enabled DTCC to lower its fees in 2006.

 

And going back to another DTCC newsletter, they explained that they got a 167% performance improvement, without a line of code change, because they reduced the number of data moves and connections necessary to process a transaction:

To keep ahead of transaction volumes that have been rising sharply over the past several years, DTCC has significantly increased the capacity of its mainframe database for equity processing. The system, called Trade Repository Processing (TRP), can now process at least 160 million sides per day. This 167% increase is nearly triple the previous capacity of 60 million sides.

What’s more, the TRP can handle the additional volume within the same time frames, thanks to changes that make the system perform more efficiently. In addition, for current volumes, the upgrade allows DTCC to deliver certain participant reports, such as the Consolidated Trade Summary, up to 45 minutes earlier.
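The capacity figures quoted above are easy to sanity-check. The short sketch below only redoes the arithmetic from the newsletter numbers (60 million and 160 million sides per day); the function name `percent_increase` is an illustrative choice, not anything DTCC published.

```python
def percent_increase(old: float, new: float) -> float:
    """Return the relative increase from old to new, in percent."""
    if old <= 0:
        raise ValueError("old must be positive")
    return (new - old) / old * 100.0


# Figures taken from the DTCC newsletter quote above.
old_capacity = 60_000_000    # sides per day before the upgrade
new_capacity = 160_000_000   # sides per day after the upgrade

increase = percent_increase(old_capacity, new_capacity)
ratio = new_capacity / old_capacity

print(f"increase: {increase:.0f}%")   # rounds to the quoted 167%
print(f"ratio: {ratio:.2f}x")         # about 2.67x, i.e. "nearly triple"
```

So the "167% increase" and "nearly triple" statements describe the same change: the new capacity is about 2.67 times the old one.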


DTCC was able to do all this without modifying any of their customers’ applications. While DTCC was hosted on the SIAC systems, they found that there were extra network hops and copies of data deployed. They also were heavily dependent on SIAC to make changes to the infrastructure on a regular basis. Because both of the SIAC systems were located in the New York City area, DTCC was also afraid that a single catastrophe would take out the redundant systems and affect their availability. These were the fundamental concerns that led DTCC to move out on its own.

Prior to this decision, SIAC signed a multiyear agreement for software, systems and services with IBM. This agreement included discounted pricing based on capacity growth projections that SIAC provided as their objectives. By sharing the mainframe infrastructure with DTCC, SIAC dramatically reduced their own operational overhead, which was predominantly associated with batch processing and account reconciliation in the evening, while DTCC used the same processing infrastructure for trade transactions during the day. Each had applications that overlapped the other's, though.

 

When DTCC pulled their two applications from the SIAC system, SIAC was left with a lot of free daytime capacity, but still a reasonably busy system in the evening. But SIAC was now completely responsible for the costs of this system. The growth potential that they had promised to IBM would no longer be possible and as such, the discounts they were offered were no longer relevant.  This underutilized mainframe was now quite a bit more expensive to SIAC than it had been when it was sharing the expense with DTCC. That’s a fact…nothing sweet about that.

Now SIAC could actually have downsized their mainframe and reduced their costs. Instead, they chose to “down size” to a distributed environment. In doing so, they also needed to solve federally mandated business resilience requirements, something they had ignored on their previous mainframe, and build another data center.

So using SIAC’s own words:

 

In 2006, when the Shared Data Center team, the technology arm of the New York Stock Exchange (NYSE), evaluated its internal infrastructure in light of competitive market factors, changing regulations, and anticipated future growth, the decision was made to replatform its 1,660 MIPS mainframe workload onto IBM System p Model 595 servers running AIX and UniKix rehosting software from Clerity.

"Quite simply, we can now transact more business per hour at a lower rate," said Francis Feldman, Vice President of the Shared Data Center. "Open systems servers and middleware technology have greatly evolved over the past decade. The combination of UniKix on System p servers gave us the reliability and flexibility we required at a competitive price point to quickly enhance our market position."

 

So let’s parse this a little bit. Notice the timeframe: 2006….it’s the same time that DTCC moved off the SIAC system. The 1660 MIPS is the combined processing power for both DTCC and SIAC. The reality is that SIAC NEVER used all that capacity themselves, even though they owned the system. So while factually true, it wasn’t real. “Changing regulations” refers to the need to develop a second site outside the New York metropolitan area. After converting the applications, SIAC had to create automation and recovery scripts and deploy the new servers at an alternative location. I am assuming that they licensed the software for those systems as well. And this required changes for all of SIAC’s customers, too. That cost certainly isn’t factored into the migration costs for SIAC. Finally, I wonder what SIAC’s costs were when they shared the infrastructure….is the new solution more or less expensive than that environment? Unfortunately, we’ll never know because the new environment includes a new data center….but I could imagine it was less.

So, an alternative plan, at presumably far less expense, effort and time, with little or no application changes, would have been to downsize the mainframe to the size appropriate to SIAC’s new capacity requirements. To meet the resiliency objectives, SIAC could have installed a mainframe in another geographic location using IBM’s Capacity Backup pricing which would not have charged for software usage except during a disaster, while still allowing for regular disaster preparedness testing with no additional licensing costs.  Many of IBM’s customers take advantage of this model for availability processing.

And has availability improved? Click here for a list of outages that SIAC experienced in 2008. All types, but not associated with the mainframe. Perhaps SIAC changed their reporting structure in 2008, but I can’t find many outages listed in a search on 2007, though there is an awful lot of information dating back many years on their site.

So how has DTCC evolved? They have reduced their fees annually to their clearing customers to pass on savings that they’ve achieved in their own processing models. And like many businesses, they are taking their traditional fixed format “mainframe” data and making it available via Portals in XML and spreadsheet formats. They are using the best of both the mainframe and the distributed world and, in doing so, maintaining or improving their costs per transaction while meeting and exceeding their service level goals.

As for SIAC, I don't know, but they don't spend nearly the amount of time bragging about their infrastructure as DTCC does. That should give you a clue right there!

by JimPorell January 6, 2009 in Current Affairs, Economics, History, Systems Technology

HP attacking the mainframe? Like a car vs. a truck

Well, HP is at it again. They are making more generalities about IBM’s venerable mainframe to scare customers off that platform. Check their facts and sources, though, and you’ll find that something’s rotten in Palo Alto. Their comparisons are just not realistic. In this note, I’ll be giving you some consolidation efforts that IBM has seen with its customers.

 

Before we get into that, though, let’s do a quick comparison benchmark to establish a baseline. Let’s compare a four passenger Mini Cooper car to a two passenger Freightliner truck cab. Benchmark 1: which is cheaper to commute to work in? Pretty obvious, but I’ll vote for the car. Especially given gas price vs. diesel now…the car is the “green” solution. Benchmark 2: We want to move the contents of our house. Most people would say the truck, but they’d be wrong. We need to accessorize and add a trailer to each vehicle. Now the Mini happens to put the tailpipe right in the middle of the car on many of their models. Why? You’d have to be a moron to put a trailer on their car. As for the truck, with a large enclosed trailer, you can put all kinds of materials in it. In fact, you might even put a couple of the Minis inside. So we’ve just proven that with the right benchmark, either solution is appropriate. But benchmarks aren’t reality either. Most people will move their family in the car and outsource to a shipping company to move the contents of their house. So continuing that analogy, there is no one computer that will solve all of a business’ problems; neither a mainframe nor a PC server will do the job by itself. It’s all about collaboration and using the best servers for the right jobs.

 

So let’s get back to HP’s claims. I’m a little confused by the Robert Frances Group (RFG) claims right now. In the HP quoted report, they say you use less electricity and floor space with a PC server than you do with a mainframe. I’ve never seen a mainframe that only ran a single workload. Most of them will have transaction processing, batch, interactive, query and decision support running all at the same time. It’s true that you can take one workload off of a mainframe and run it on a PC server and then compare that PC server to a mainframe. The data might actually be real, but as information, it is “incredible”. A single PC server may be smaller than a mainframe and use less electricity (The Car). But no single PC server is going to be comparable to a mainframe running multiple workloads. In fact, RFG published a paper in which they said a mainframe will use 3% of the electricity of a comparable PC server cluster attempting to accomplish the same workload. It will also use a fraction of the floor space (The Truck). But don’t believe me….here’s exactly what they said:

 

RFG believes mainframe computing platforms have many of the characteristics that will ameliorate, if not eliminate, the current challenges data center managers face with power and cooling. First, mainframe power consumption and heat characteristics are, for many companies, the most efficient servers in the data center. This is true in an absolute sense, where the energy per square foot is lower than any data center system measured by our clients. More significantly, this is massively true in a relative sense, when comparing power used per transaction. On a total workload throughput basis, mainframe system power consumption is almost negligible when compared with distributed systems on a power per transaction basis. As power and cooling costs continue to rise, IT executives should reevaluate mainframe computers total cost and overall value in reducing data center operations costs.

Quote used with permission of Robert Frances Group.

 

So who are you going to believe? RFG or RFG? Well, in the HP cited paper, RFG just republished the results of a report done by HP. So don’t throw RFG under the bus. Just understand that it’s HP’s low quality and misleading information at work, once again.

 

As for the Alinean update, it’s a single workload in each example. And in them, they talk about the SAP application server. But what about the database server? Typically, if the application server is on z, the database server is in DB2 for z/OS. Did that move too? The labor costs for System z appear to be much higher than the norm for a business. The report discusses the price of an older mainframe and again, some incredible Software license charges. But what if SAP was added to a newer mainframe? How would that have compared in this report? What if it was added to an existing, newer mainframe, what would the incremental charges be as compared to net new computing servers?

 

HP mentions the BART system avoiding 50% of their paycheck errors. Wow…that sounds like a big number. They went to PeopleSoft, from what I guess was a homegrown application that was running on a mainframe…at least that’s what HP wants you to believe. So it sounds like the BART people are better at running trains than they are at writing programs? I doubt it. That wouldn’t be fair to the hard working people at BART. But remember, if there are two paycheck errors a month and it goes down to one paycheck error a month, that’s a 50% reduction as well (The Car). So sometimes the big numbers quoted are really just a meaningless indicator to scare you into thinking something else. How many errors a month was BART really seeing? I don’t know and neither do you based on HP's comments.
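To make the point about percentages concrete, here is a small sketch that computes both the absolute and the relative reduction for two error counts. The scenarios are purely hypothetical; BART's actual error counts are not given anywhere in HP's material, which is exactly the problem.

```python
def reduction(before: int, after: int) -> tuple[int, float]:
    """Return (absolute reduction, relative reduction in percent)."""
    if before <= 0:
        raise ValueError("before must be positive")
    absolute = before - after
    relative = absolute / before * 100.0
    return absolute, relative


# Hypothetical monthly paycheck-error counts (before, after).
# Neither pair comes from BART or HP; they only illustrate the arithmetic.
scenarios = {
    "two errors down to one": (2, 1),
    "2,000 errors down to 1,000": (2000, 1000),
}

for label, (before, after) in scenarios.items():
    absolute, relative = reduction(before, after)
    print(f"{label}: {absolute} fewer errors, {relative:.0f}% reduction")
```

Both scenarios report the same 50% reduction, yet one removes a single error and the other removes a thousand. Without the baseline, the percentage alone says very little.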

 

So let’s talk about something I do know about….consolidations of servers are occurring and System z has been a great place to do that. Nationwide and DGTI are two examples.

IBM has published a paper on SAP consolidation capabilities on System z. The HP press release described a customer that had mainframes and Windows servers. By eliminating the mainframe, they had a common skill set based on Windows. But how real is a customer with a single computing infrastructure? Maybe for relatively small customers, but not with larger ones. RENFE is the Spanish national rail agency. Prior to its reorganization into the two new operating companies, RENFE was composed of 18 separate business units, each with its own intranet system running various line of business applications. These included human resources systems, helpdesk applications and various internal communication portals. To drive better integration across the business and improve process efficiency, RENFE made a strategic decision to create a single information portal for all employees and that was based on System z.

 

IBM is eating its own cooking by consolidating many of its thousands of application and database servers onto System z. But that’s not the whole story either. They are also consolidating some onto System p and some onto System x. In each case, IBM is looking at underutilized standalone servers, the baseline for the PC server marketplace, and leveraging virtualization technologies to get a large reduction in physical server images. IBM is putting the right workload in the right place that makes sense for the business environment. (The Trucks).

 

We see constant examples of taking hundreds of underutilized standalone PC servers and consolidating through virtualization down to tens of more highly utilized PC or RISC servers or individual mainframe servers. In each case, the customers are saving substantially on labor, environmental and capital costs. HP will tell you that hundreds to tens is good enough.

IBM mainframes, though, can get that down to single digits in many cases.

 

Look at HP’s Brazilian Navy example. A lot of folks may perceive that a mainframe could never go on a battleship, aircraft carrier, early warning aircraft or other military location. Well, those folks would be wrong. Today’s modern mainframe, the System z, going as far back as the zSeries z800 processor, meets or exceeds the electrical, floor space, ambient temperature, humidity, air pressure and vibration specifications necessary to satisfy the locations in which those servers may be deployed. See page 12 to view a subset of these specifications. In addition, it provides operational redundancy built into the hardware architecture and operating systems that exceeds the availability requirements necessary to satisfy those particular business needs. And with its open programming models, including Java, J2EE, C/C++, in addition to the venerable COBOL and PL/I capabilities, it provides a hosting environment to capture those programming needs.

 

In fact, development belongs on the desktop. The most creativity and tooling is possible in that desktop and you can reboot the system at will to test your applications. IBM’s Rational Developer for System z (RDz) and Rational Team Concert suites provide an Integrated Development Environment that can leverage the simplicity of the open programming environment through its Eclipse.org tool base, but easily apply those skills and knowledge to mainframe application deployment. You want mainframe development skills? You have them in your hands already. Get the tools and put those people to work.

 

One of the principles of the mainframe has always been that the operating system, middleware and hardware are responsible for data locking, security, system resilience, storage management and capacity management. This enables multiple workloads to operate as individual processes and maintain the integrity of the system and the data. On other platforms, it’s typically the application that is responsible for many of these characteristics. In order to achieve these qualities of service, additional products must be acquired and additional code may have to be written by application developers. The point of all this is that a business might actually reduce the amount of code necessary to achieve their business objectives if it was targeted for deployment on System z, and reduce their operational risk at the same time. To summarize this point, it can be the same code from distributed systems in a mainframe operational container and deliver superior operational performance. Same code, different container with superior operations model.

 

So this started by pointing out inaccuracies in the HP press release. How can a business use that information? Well, maybe to buy an individual compute server, that information may be helpful (The Car). But looking at an enterprise that needs to satisfy multiple business needs, it doesn’t appear too helpful at all (The Truck). They use Apples to compare to Oranges. Customers continue to grow their compute power on IBM mainframes. New problems are being solved in creative ways, leveraging the best of the mainframe in collaboration with other systems. Like RENFE, get on board the IBM mainframe.

 

 

by JimPorell November 12, 2008 in Economics, Innovation, People, Systems Technology

Response to Jeff Savit Blog

As part of the announcement of z10, IBM made some marketing claims about the large number of distributed Intel servers that could be consolidated with z/VM on a z10. The example cited used Sun rack optimized servers with Intel Architecture CPUs. Sun blogger Jeff Savit objected strenuously to the claims, mainly because of the low utilization assumed on the Sun machines that the claims compared to. You can read it here:

http://blogs.sun.com/jsavit/entry/no_there_isn_t_a

I responded, he responded. Then I was out of pocket for a while and did not respond soon enough, and his blog cut off replies on that thread. I am putting my latest response here. Thanks to the Mainframe blog for providing the venue to do so. My latest responses to Jeff are in blue italics.

Posted by Joe Temple on June 24, 2008 at 11:28 AM EDT #

This format is very difficult for parry and riposte, but let's try. I would like to use different colors, but I can't (AFAIK) put in HTML markup to permit that. So: Joe's stuff verbatim within brackets, and each of his sections starts with a quote of a sentence of mine (which I identify, within quotes) for context. Each stanza identified by name and employer (this is Jeff speaking):

Joe(IBM): [[[Jeff, your post is rather long and rather than build a point by point discussion too long for a single comment I will put up several comments. Starting with the moral of the story: There are several: • quoting Jeff: "Use open, standard benchmarks, such as those from SPEC and TPC."

Better to use your own. They have not been hyper-tuned and specifically designed for. They have a better chance of representing reality. But be careful not to measure wall clock time on “hello world” or laptops will beat servers every time.]]]

Jeff(Sun): In a perfect world, every customer would have the opportunity to test their applications on a wide variety of hardware platforms to see how they perform. But they don't, and they rely on open standard benchmarks to give them some information about how the platforms would perform. Or, they do have applications they could benchmark, but they're non-portable, or run solely on a single CPU (making all non-uniprocessor results worthless), or otherwise have poor scalability or any of a hundred other problems. Imagine comparing IBM processors based on the speed of somebody writing to tape with a blocksize of 80 bytes! Even if they get a useful result, the next customer doesn't benefit at all and has to start from scratch. It's not trivial to make good benchmarks that aren't flawed in some way. That's why the benchmark organizations exist - to provide benchmarks that characterize performance and give a level playing field for all vendors. IBM, Sun, and others are active in them - our employers must think they have value. Obviously there is "benchmarketing" and misuse of benchmarks. THAT is what I'm railing against. Hence, my following bullet that says "read and understand". But frankly, benchmarks such as Specweb/specwebssl/Specjvm, the SPEC fileserver benchmarks, and benchmarks like TPC.org's TPC-E provide representative characterization of system performance (with sad exceptions like TPC-C, which is broken and obsolete, but IBM still uses for POWER).

Joe(IBM), latest response: The characterization of TPC-C as "old and broken" may have something to do with Sun's inability to keep up on that benchmark. One of the characteristics of TPC-C that none of the other benchmarks has is that it has at least some "non local" accesses in the transactions. Sun's problem with this is that such accesses defeat the strong NUMA characteristic of their large machines. One of the results of this is that all machines scale worse on TPC-C than on the benchmarks Jeff cites. Since Sun is very dependent on scaling a large number of engines to get large machine capacity close to IBM's machines, they are highly susceptible to this. The effect is exacerbated by NUMA (non uniform memory access). That is, a flat SMP structure will mitigate this.

The mainframe community's problem with TPC-C is that the non-local traffic is all balanced and a low percentage of the load. As a result TPC-C still runs best on a machine with a hard affinity switch set and does not drive enough cache coherence traffic to defeat NUMA structures. When workload runs this way it does not gain any advantage from z's schedulers or shared cache or flat design. Think of TPC-C as a fence. There is workload on Sun's side and there is workload on the mainframe side of TPC-C. All the Industry Standard Benchmarks sit on Sun's side and scale more linearly than TPC-C. For workloads that are large enough to need scale and that run on the Sun side of the TPC-C fence, IBM sells System p and System x. When you consolidate disparate loads, the Industry Standard benchmarks do not represent the load, and with enough "mixing" the composite workload will eventually move to the mainframe side of the TPC-C fence. See Neil Gunther's Guerrilla Capacity Planning for a discussion of contention and coherence traffic and their effect on scale. Particularly read chapter 8 to get an idea about how the benchmarks lead to overestimation of scaling capability.

Jeff(Sun): A lot of people have worked very hard to make them be as good as they are. IBM uses these benchmarks all the time - with the notable exception of System z.

Joe(IBM), latest response: System z is designed to run workloads with non uniform memory access patterns, randomly variable loads, and much more serialization and cache migration than occurs in the standard benchmarks, and where strong affinity hurts rather than enhances throughput. It is the only machine designed that way (large shared L3 and only 4 way NUMA on 64 processors).

Also, the standard benchmarks are generally used for "benchmarketing". As a result the hard work involved is not purely driven by the noble effort by technical folks that Jeff portrays, but rather by practical business needs, including the need to show throughput and scale in the best possible light.

Jeff(Sun): That's the point, isn't it. It works in a monopoly priced marketplace where it doesn't have to compete on price/performance, as it does with its x86 and POWER products. Where else are you going to run CICS, IMS, and JES2?

Joe(IBM), latest response: There are alternatives to System z on all workloads; it is a matter of migration costs v benefits of moving. Many applications have moved off CICS and IMS to UNIX and Windows over the years. Sun has whole marketing programs to encourage migration. In fact a large fraction of UNIX/Windows loads do work that was once done on mainframes. As a result the mainframe must compete. Similar costs are incurred moving work from any UNIX (Solaris, HP-UX, AIX, Linux) to z/OS, or moving from UNIX to Windows. The other part of the barrier is the difference in machine structure. This barrier is workload dependent. Usually, when considering two platforms for a given piece of work, one of the machine structures will be a better fit. When moving work in the direction favored by the machine structure difference, the case can be made to pay for the migration. This is what all vendors do. Greg Pfister (In Search of Clusters) suggests that there are three basic categories of work: Parallel Hell, Parallel Nirvana, and Parallel Purgatory. I would suggest that there are three types of machines optimized for these environments (Blades in Nirvana, Large UNIX machines in Purgatory, and Mainframes in Hell). To the extent that workload is in parallel hell, the barrier to movement off the mainframe will be quite high. Similarly, attempts to run purgatory or nirvana loads on the mainframe will run into price and scaling issues.
IBM asserts that consolidation of disparate workloads using virtualization will drive the composite workload toward parallel hell, where the mainframe has advantages due to its design features, mature hypervisors and machine structure.
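The contention and coherence effects mentioned above can be made concrete with the scalability model from Gunther's book, usually called the Universal Scalability Law: C(N) = N / (1 + sigma(N - 1) + kappa N(N - 1)), where sigma models contention (serialization) and kappa models coherence traffic. The sketch below is only illustrative. The two parameter sets are hypothetical values chosen to contrast a benchmark-like load with a mixed, consolidated load; they are not measurements of any Sun or IBM system.

```python
import math


def usl_capacity(n, sigma, kappa):
    """Relative capacity of n processors under the Universal Scalability Law.

    sigma: contention (serialization) coefficient.
    kappa: coherence (cache-to-cache traffic) coefficient.
    A single processor always has capacity 1.0.
    """
    return n / (1.0 + sigma * (n - 1) + kappa * n * (n - 1))


def usl_peak(sigma, kappa):
    """Processor count at which capacity peaks: sqrt((1 - sigma) / kappa).

    With no coherence penalty (kappa == 0) capacity never turns down.
    """
    if kappa <= 0:
        return math.inf
    return math.sqrt((1.0 - sigma) / kappa)


if __name__ == "__main__":
    # Hypothetical coefficients, for illustration only.
    profiles = {
        "benchmark-like load": (0.02, 0.0001),
        "mixed/consolidated load": (0.05, 0.002),
    }
    for name, (sigma, kappa) in profiles.items():
        print(name)
        for n in (1, 8, 16, 32, 64):
            print(f"  N={n:3d}  relative capacity={usl_capacity(n, sigma, kappa):6.2f}")
        print(f"  capacity peaks near N={usl_peak(sigma, kappa):.1f}")
```

Under these assumed coefficients, the load with more coherence traffic stops gaining capacity at a much smaller processor count. That is the shape of the argument being made here, though real coefficients have to be fitted from measurements of the workload in question.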

To the second observation about wall clock time on trivial applications: yes, obviously.

Joe(IBM): [[[quoting Jeff: •"Read and understand what they measure, instead of just accepting them uncritically."
Yes, particularly understand that the industry standard benchmarks run with low enough variability and low thread interaction that it makes sense to turn on a hard affinity scheduler. Your workload probably does not work this way.]]] 

Jeff(Sun): I'm not sure what's intended by that. Are you claiming that benchmarks should be run against systems without fully loading them to see what they can achieve at max loads? Hmm. Anyway, see below my comments about low variability and low thread count - which applies nicely to IBM's LSPR.

Joe(IBM): I guess I am claiming that the industry benchmarks basically represent parallel nirvana and parallel purgatory. I am asserting that mixing workloads under a single OS or virtualizing servers within an SMP drives platforms toward parallel hell. The near linear scaling of the industry standard loads on machines optimized for them will not be achieved on mixed and virtualized workloads. In part this is because sharing the hardware across multiple applications will lead to more cache reloads and migrations than occur in the benchmarks.

I see Jeff's reference to LSPR as a red herring for two reasons. First, while LSPR has not been applied across the industry, the values it contains have been used to do capacity planning rather than marketing. The loads for which this planning is done are usually a combination of virtualized images, each either running mixed and workload-managed under z/OS or under VM and zLinux. This could not be done successfully if the scalability were as idealized as in the industry standard benchmarks. Second, I do not suggest that LSPR is the answer, but rather that the current benchmarks do not sufficiently represent the workloads in question (mixed/virtualized) for Jeff to make the claim that z does not scale, as he did elsewhere in the blog entry. Basically, to draw his conclusion he compares the LSPR scaling ratios to industry benchmark results on UNIX SMPs. This is not a good comparison.

Joe(IBM): [[[quoting Jeff: •"Get the price-tag associated with the system used to run the benchmark." Better to understand your total costs including admin, power, cooling, floorspace, outages, licensing, etc.]]]

Jeff(Sun): That's what I meant.

Joe(IBM): Great, because the hardware price difference that Sun usually talks about is only a small percentage of total cost. The share of total cost represented by hardware price shrinks every year.

Joe(IBM): [[[quoting Jeff: • "Relate benchmarks to reality. Nobody buys computers to run Dhrystone." Only performance engineers run benchmarks for a living.]]]

Jeff(Sun): Sounds like a dog's life, eh? OTOH, they don't have users...

Joe(IBM): [[[quoting Jeff: •"Don't permit games like "assume the other guy's system is barely loaded while ours is maxed out". That distorts price/performance dishonestly." Understand what your utilization story is by measuring it. Don’t permit games in which hypertuned benchmarks with little or no load variability and low thread interaction represent your virtualized or consolidated workload. Understand the differences in utilization saturation design points in your IT infrastructure and what drives them.]]]

Jeff(Sun): Your comment has nothing to do with what I'm describing. What I'm talking about is the dishonest attempt to make expensive products look competitive by proposing that they be run at 90% utilization, while the opposition is stipulated to be at 10%, and claim magic technology (like WLM, which z/Linux can't use) to permit higher utilization and claim better cost per unit of work on your own kit. That's nothing more than a trick to make mainframes look only 1/9th as expensive as they are. Imagine comparing EPA mileage between two cars by spilling 90% of the gas out of the competitor's tank before starting. As far as "no load variability and low thread interaction", I suggest you take a good look at IBM's LSPR. See http://www-03.ibm.com/servers/eserver/zseries/lspr/lsprwork.html which describes long running batch jobs (NO thread interaction at all) on systems run 100% busy (NO load variability). The IMS, CICS (mostly a single address space, remember), and WAS workloads in LSPR should not be assumed to be different in this regard either. This doesn't make LSPR evil: it is not - it's very useful for comparisons within the same platform family. But consider SPECjAppserver, which has interactions between web container, JSP/servlet, EJB container, database, JMS messaging layer, and transaction management - many in different thread and process contexts. I suggest you reconsider your characterization about thread interaction. Complaints about thread interaction and variability of load are misplaced and misleading.

Joe(IBM): The comparison of zLinux/VM at high utilization with a highly distributed solution at low utilization is valid, and well founded on both data and system theory. You could make similar comparisons of consolidated virtualized UNIX vs. distributed UNIX, or VMware vs. distributed Intel. Any cross comparison of virtualized vs. distributed servers will be leveraged mainly by utilization rather than by raw performance as measured by benchmarks. Thus the comparison Jeff complains about as dishonest does in fact represent what happens when consolidating existing servers using virtualization.

My second point is that, in making comparisons between consolidated mixed-workload solutions, industry benchmarks are not representative of the relative capacity or the saturation design point of each of the systems in question. There is no current benchmark to use for these comparisons. This includes LSPR, Sun's M-values, rPerfs, as well as the industry benchmarks. None of them works. Each vendor asserts leverage for consolidation based on their own empirical results, or perceived strengths in terms of machine design. I am saying that the scaling of these types of workloads is less linear than the industry benchmark results and that some of the things z leverages to do LSPR well will apply in this environment as well.

Joe(IBM): [[[quoting Jeff: •"Don't compare the brand-new machine to the competitor's 2 year old machine" Understand what the vintage of your machine population is. When you embark on a consolidation or virtualization project compare alternative consolidated solutions, but understand that the relative capacity of mixed workload solutions is not represented by any of the existing industry standard benchmarks.]]]

Jeff(Sun): We're talking at mixed purposes. What I mean is that one vendor's 2008 product tends to look a lot better than the competition's 2002 box, making invidious comparisons easy. Moore's Law has marched on.

Joe(IBM): The truth is that when you do a consolidation you usually deal with a range of servers, some of which are 4 or 5 years old. A 2 year old vintage is probably fairly representative. In any case, Moore's Law does not improve the utilization of distributed boxes unless you consolidate work in the process of upgrading. Unless a consolidation is done, utilization will drop when you replace old servers with new servers. For the consolidation to occur within a single application, the application has to span multiple old servers in capacity. Server farms are full of applications which do not use a single modern engine efficiently, let alone a full multicore server.

Jeff's main argument is with the utilization comparison. The utilization of distributed servers, including HP's, Sun's and IBM's, is very often quite low. It is possible to consolidate a lot of low-utilized servers on a larger machine. The mainframe has a long term lead in the ability to do this, which includes hardware design characteristics (Cache/Memory Nest), specific scheduling capability in hypervisors (PR/SM and VM), and hardware features (SIE). How many two year old, low-utilized servers running disparate work can an M9000 consolidate?
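The utilization arithmetic both sides are arguing over can be written down in a few lines. This is a minimal sketch under strong simplifying assumptions: capacity is treated as a single number, useful work is capacity times average utilization, and no scaling loss is applied when loads are stacked (which the discussion above argues is optimistic). The function names and the prices and capacities in the usage example are hypothetical, not figures from either vendor.

```python
def cost_per_unit_of_work(price, capacity, utilization):
    """Price divided by the useful work actually delivered.

    utilization is the average busy fraction, in the range (0, 1].
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return price / (capacity * utilization)


def consolidation_ratio(target_capacity, target_utilization,
                        server_capacity, server_utilization):
    """How many lightly used servers' worth of useful work fits on a target
    machine run at target_utilization, ignoring any scaling loss."""
    useful_target = target_capacity * target_utilization
    useful_per_server = server_capacity * server_utilization
    return useful_target / useful_per_server


if __name__ == "__main__":
    # Same price and capacity, different utilization: the "1/9th" effect.
    busy = cost_per_unit_of_work(price=100.0, capacity=1.0, utilization=0.9)
    idle = cost_per_unit_of_work(price=100.0, capacity=1.0, utilization=0.1)
    print(f"cost per unit of work at 10% vs 90%: {idle / busy:.1f}x")

    # Hypothetical target with 10x the capacity of one distributed server.
    ratio = consolidation_ratio(10.0, 0.9, 1.0, 0.1)
    print(f"servers' worth of work consolidated: {ratio:.0f}")
```

The sketch shows why the dispute turns on the utilization assumption rather than on raw benchmark performance: the ratio of the two utilizations multiplies straight through into cost per unit of work. Whether 90% and 10% are the right inputs is exactly what has to be measured locally.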

Joe(IBM): [[[quoting Jeff: • "Insist that your vendors provide open benchmarks and not just make stuff up."
Get underneath benchmarketing and really understand what vendor data is telling you. Relate benchmark results to design characteristics. Characterize your workloads. (Greg Pfister's In Search of Clusters and Neil Gunther's Guerrilla Capacity Planning suggest taxonomies for doing so.) Understand how fundamental design attributes are featured or masked by benchmark loads. Understand that ultimately standard benchmarks are “made up” loads that scale well. Learn to derate claims appropriately, by knowing your own situation. (Neil Gunther's Guerrilla Capacity Planning suggests a method for doing so.)]]]

Jeff(Sun): This is not the "making stuff up" that I was referring to. I was referring to misuse of benchmarks in the z10 announcement, which IBM was required to redact from the announcement web page and the blogs that linked to it. I'm not arguing against synthetic benchmarks that honestly try to mimic reality, I'm arguing against attempts to game the system that I discussed in my "Ten Percent Solution" blog entry.

Joe(IBM): I have explained the comparison made for the z10 announcement above. Jeff objects to the utilization comparison, which is legitimate. In fact, when servers are running at low utilization most of them are doing nothing most of the time. That is the central argument for virtualization, which is generally accepted in the industry.

I am also pointing out that industry standard benchmarks are not created in a purely noble attempt to uncover the truth about capacity. In fact they are generally defined in a way that supports the distributed processing, scale-out, client-server camp of solution design, which is why they scale so well. Think about it. On the industry standard committees each vendor has a vote. System z represents 1/4 of IBM's vote. Do you think there will ever be an industry standard benchmark which represents loads that do well on its machine structure? The benchmarks and their machines have evolved together. They can represent loads from single application codes that are cluster or NUMA conscious. What happens to all of those optimizations when workloads are stacked and the data doesn't remain in cache or must migrate from cache to cache?

The point is that the relevance and validity of either side of this argument is highly workload dependent. The local situation will govern most cases. Neither an industry benchmark result nor a single consolidation scenario is more valid than the other.

Joe(IBM): [[[quoting Jeff: • "Be suspicious!" Be aware of your own biases. Most marketing hype is preaching to the choir. Do not trust “near linear scaling” claims. Measure your situation. Don’t accept the assertion that the lowest hardware price leads to the lowest cost solution. Pay attention to your costs, and don’t mask business priorities with flat service levels. Be aware of your chargeback policies and their effects. Work to adjust when those effects distort true value and costs.]]]

Jeff(Sun): With this I cannot disagree. That's exactly what I have been discussing in my blog entries: unsubstantiated claims of "near linear scaling" to permit 1,500 servers to be consolidated onto a single z (well, the trick here is to stipulate that 1,250 of the 1,500 do no work!)

Joe(IBM): By definition, servers running at low utilization are doing nothing most of the time.

Jeff(Sun): ...or to ignore service levels (see my "Don't keep your users hostage" entry).

Joe(IBM): Actually, virtualization of servers on shared hardware can improve service levels by improving latency of interconnects.

Jeff(Sun): I'll also add "beware of the 'sunk cost fallacy'": you shouldn't throw more money into using a too-expensive product that has excess capacity because you've already sunk costs there.

Joe(IBM): Actually, adding workload to an existing large server can be the most efficient thing to do in terms of power, cooling, floorspace, people, deployment, and time to market, even if the price of the processor hardware is higher. These efficiencies and the need for them are locally driven. In general there may or may not be a "sunk cost fallacy". In fact you should also be aware of the "hardware price bargain fallacy". Finally, Sun itself recognized System z and z/VM as "the premier virtualization platform" when Sun and IBM jointly announced support of OpenSolaris on IBM hardware.

by Joe Temple July 28, 2008 in Systems Technology

Today's Potpourri

1. Japan Airlines (JAL) becomes the latest customer to adopt z/TPF. z/TPF is IBM's extremely high performance transaction processing system, ideally suited for industries such as travel and transportation and financial services. JAL values z/TPF's 64-bit architecture, familiar Linux-based development tools, and sub-capacity pricing aligned with their business volumes. The full press release, in its original Japanese, can be found here. JAL is the largest airline in Asia and a member of the oneworld alliance.

There are signs Japan's traditionally ultra-cautious enterprise IT market is transforming as many Japanese companies become much more savvy, exploiting new technologies to help their businesses. For example, IBM has already sold new System z10 mainframes in Japan.

2. So what's the price for IBM's C/C++ compiler for z/OS, an IBM-MAIN forum poster asks. As little as $6 per month is the answer. I paid more for lunch today, and it wasn't nearly as good.

3. Slashdot picked up the New York Times story that Kevin refers to. Fortunately most of the Slashdot commenters know what they're talking about when it comes to mainframes, although a few still have strange misconceptions.

4. Blogger Arthur Cole waxes less sanguine than most about where the mainframe is headed. What do you think? Stephen Swoyer has a much different take.

5. IBM's relationship with ACI Worldwide is deeper and broader than ever. The two companies have an aggressive partnership to help financial services customers move electronic payment and ATM applications such as BASE24-eps to System z. Now IBM is taking over management of ACI's internal IT needs.

6. Interesting article about Marist College and their 700-odd Linux servers running on a single IBM System z9 mainframe. The article touches on the convenience of virtual firewall protections which Marist has implemented. Some of the servers support internal Marist administrative needs while most of them are available to students for classwork and other projects. All the servers live in harmony, and the students cannot change their own grades or tuition bills, for example.

by Timothy Sipples March 27, 2008 in Economics, Innovation, Systems Technology

IBM Announces the System z10 for the Next Generation Data Center

As Tuesday, February 26, 2008, begins in each timezone, IBM announces the new System z10 for all the world's next generation data centers. Here, for example, is the Japanese language press release:

http://www.ibm.com/jp/press/2008/02/2601.html

We mainframe bloggers will try to keep everyone informed as we hear more from IBM, so check back for updates. I suspect we'll also have some interesting thoughts and perspectives to offer.

I see that Wikipedia already has a short article. That's fast!

UPDATE #1: Released at 12:01 a.m. New York time, here's the English version of the press release along with a short video and some pictures:

http://www.marketwire.com/mw/release.do?id=825324

UPDATE #2: And here's a picture of the new greenest of green machines:
Z10

UPDATE #3: IBM has posted a large number of official announcement letters. Here are the links to the PDFs. Let's read together, shall we?

  • 108-154: IBM System z10 Enterprise Class -- The forward thinking mainframe for the 21st century

  • 108-155: IBM System Storage DS8000 series (Machine types 2421, 2422, 2423, and 2424) delivers Extended Distance FICON for IBM System z environments

  • 108-156: IBM System Storage DS8000 series (Machine type 2107) delivers Extended Distance FICON for IBM System z environments

  • 208-038: IBM Systems Director Active Energy Manager for Linux on System z, V3.1 is designed to enable optimization of energy consumption for heterogeneous IBM systems

  • 208-039: IBM ISPF Productivity Tool V5.10 enhancements deliver increased efficiency for ISPF users

  • 208-041: IBM DB2 for z/OS Value Unit Edition offers one-time-charge price metric for net new workloads on IBM System z

  • 208-042: Preview: z/OS V1.10 -- Raising the bar and redefining scalability, performance, availability, and economics

  • 308-001: IBM GDPS V3.5: Enterprise-wide infrastructure availability

UPDATE #4: Yowza! The System z10 EC announcement is 391 pages!

UPDATE #5: OK, not so much reading. Most of those 391 pages are model conversion tables. So I could skim the announcement quickly for the golden nuggets.

One of the most surprising facts is that the System z10 is available immediately. (Maybe you can send a truck to Poughkeepsie to pick one up today?) The new HiperDispatch feature looks very interesting. There's a bit of a caution in the announcement suggesting that workloads could vary more than usual in how they perform when moving up to the System z10. That makes sense, because there's an awful lot of new technology packed into this model that's way beyond the typical model upgrade. Going from a single core 1.7 GHz processor to a quad-core 4.4 GHz processor design is a big leap.

4.4 GHz per core! (I've got to start getting used to saying "cores" now.) And up to 1.5 TB of memory per machine. Many more capacity model choices to make the costs smoother. Uniprocessor performance increased up to 70% for z/OS mixed workloads -- quite a jump. I also like how IBM is fencing off the HSA. It's 16 GB, but you never see it, and you don't have to pay for it. The Capacity for Planned Events (CPE) looks like a great idea. You can get up to 3 days of capacity for activities like relocating data centers. There's a nice statement of direction concerning z/VM and LPARs. You'll be able to manage all processor types and all operating systems within a single z/VM LPAR. (At present you have to fence resources.) New and improved OSA networking. InfiniBand coupling for Parallel Sysplex, raising the local distance limit up to 150 meters. (Sort of a mini-GDPS distance! Definitely nice for campuses.) There's something about some new time of day capabilities in the base configuration which looks great. The processors now support 1 MB page sizes, and there's both Assembler and C/C++ support for them.

This is a major leap. Still reading.

by Timothy Sipples February 25, 2008 in Systems Technology

IBM CFO: "Next Generation Mainframe in February"

From CFO Mark Loughridge's prepared remarks discussing IBM's 4th quarter 2007 earnings:

...This marks the tenth quarter of a long and successful technology cycle for System z. In 2008 we’ll move to our next generation mainframe, with announcement and availability in late February. This next generation System z has 50 percent more capacity than the current z9, enables unprecedented levels of workload consolidation and extends mainframe’s leadership in energy efficiency, security and resiliency.

by Timothy Sipples January 25, 2008 in Systems Technology

Today's Potpourri

1. If you were lucky enough to visit Palo Alto, California, last week, Charles Webb delivered a presentation to the HOT CHIPS 19 technical conference at Stanford University. The title: "The Next-Generation Mainframe Microprocessor."

2. GUIDE SHARE Europe sponsors their first annual z/VSE, z/VM, and Linux on z conference at IBM's Boeblingen, Germany, labs, from October 15 to 17, 2007. [I need to go to Germany in October, boss....]

3. The IBM Software Group continues to acquire interesting capabilities for its fast moving and growing business. Of particular note to mainframers: Princeton Softech and DataMirror.

4. So how do you save power and minimize cooling in your data center? Stanford has a new study. The study's authors examined the issue carefully and arrived at the answer: get a server which has the best input/output performance. "Since the CPU is usually the highest power component, these results suggest that building a system with more I/O to complement the available processing capacity should provide better energy efficiencies." As one blogger noted, "Ah, the irony! 40 years after the minicomputer we're back to a batch mainframe I/O-centric architecture. All things old are new again."

Both batch and online, actually. The physics never changed.

5. IBM ships the new Integrated Removable Media Manager for System z next month. Why should you care? At long last IBM integrates tape management across multiple operating systems, including z/OS. You can now manage tape media across the enterprise from policies defined on your mainframe, the "hub of the universe."

About freakin' time, since so many businesses are struggling with data retention and archiving policies for regulatory compliance, privacy protection, etc.

by Timothy Sipples August 27, 2007 in Systems Technology



The postings on this site are our own and don’t necessarily represent the positions, strategies or opinions of our employers.
© Copyright 2005 the respective authors of the Mainframe Weblog.