CHAPTER 1 - A Better Way to Optimize
For many people, Oracle performance is a very difficult problem. Since 1990, I've worked with thousands of professionals engaged in performance improvement projects for their Oracle systems. Oracle performance improvement projects appear to progress through standard stages over time. I think the names of those stages are stored in a vault somewhere beneath Geneva. If I remember correctly, the stages are:
Misery and famine
For some reason, my colleagues and I are rarely invited to participate in a project until the "misery and famine" stage. Here is what performance improvement projects often look like by the time we arrive. Do they sound like situations you've seen before?
Technical experts disagree over root causes
The severity of a performance problem is proportional to the number of people who show up at meetings to talk about it. It's a particularly bad sign when several different companies' "best experts" show up in the same meeting. In dozens of meetings throughout my career, I've seen the "best experts" from various consulting companies, computer and storage subsystem manufacturers, software vendors, and network providers convene to dismantle a performance problem. In exactly 100% of these meetings I've attended, these groups have argued incessantly over the identity of a performance problem's root cause. For weeks. How can dedicated, smart, well-trained, and well-intentioned professionals all look at the same system and render different opinions, often even contradictory opinions, on what's causing a performance problem? Apparently, Oracle system performance is a very difficult problem.
Experts claim excellent progress, while users see no improvement
Many of my students grin with memories when I tell stories of consultants who announce proudly that they have increased some statistic markedly (maybe they increased some hit ratio or reduced some extent count or some such) only to be confronted with the indignity that the users can't tell that anything is any better at all. The usual result of such an experience is a long report from the consultant explaining as politely as possible that, although the users aren't clever enough to tell, the system is eminently better off as a result of the attached invoice.
The story is funny unless, of course, you're either the owner of a company who's paying for all this wasted time, or the consultant who won't get paid because he didn't actually accomplish anything meaningful. Maybe this story seems funny because most of us at some time or another have been that consultant. How is it possible to so obviously improve such important system metrics as hit ratios, average latencies, and wait times, yet have users who can't even perceive the beneficial results of our effort? Apparently, Oracle system performance is a very difficult problem.
Hardware upgrades either don't help, or they slow the system further
Since first picking up Neil Gunther's The Practical Performance Analyst in 1998 [Gunther (1998)], I have presented to various audiences the possibility of one particularly counterintuitive phenomenon. "Do you realize that a hardware upgrade can actually degrade the performance of an important application?" Every audience to which I've ever presented this question and the facts pertaining to it has had a virtually identical reaction. Most of the audience smiles in disbelief while I describe how this can happen, and one or two audience members come to the podium afterward to rejoice in finally figuring out what had happened several months after their horrible "upgrade gone wrong."
Hardware upgrades may not often cause noticeable new performance problems, but they can. Very often, hardware upgrades result in no noticeable difference, except of course for the quite noticeable amount of cash that flows out the door in return for no perceptible benefit. That a hardware upgrade can result in no improvement is somewhat disturbing. The idea that a hardware upgrade can actually result in a performance degradation, on its face, is utterly incomprehensible. How is it possible that a hardware upgrade might not only not improve performance, but that it might actually harm it? Apparently, Oracle system performance is a very difficult problem.
The number one system resource consumer is waste
Almost without exception, my colleagues and I find that 50% or more of every system's workload is waste. We define "waste" very carefully as any system workload that could have been avoided with no loss of function to the business. How can completely unnecessary workload be the number one resource consumer? Apparently, Oracle system performance is a very difficult problem.
These are smart people. How could their projects be so messed up? Apparently, Oracle system optimization is very difficult. How else can you explain why so many projects at so many companies that don't talk to each other end up in horrible predicaments that are so similar?