UKOUG 2007: Monday, Monday
His main theme was that projects tend to fail because decision-makers aren't realistic in their risk assessments. So projects are planned on the basis of everything going right. Staff are told to work extra hours and weekends without any recognition that tired people make mistakes, and fixing mistakes costs time. Or additional people are recruited which just increases the number of communication channels, to the point that the overhead of keeping everybody in the loop becomes excessive.
Martin advises us to install disaster tolerant hardware, because it gives us more interesting problems to solve. Of course we shouldn't really switch to clustering technology just for the sake of it. But if we think we are eventually going to need RAC we should move to it now. That way we'll have learned to live with it and smoothed all the wrinkles before the technology has become critical to the system.
There were some entertaining war stories. One concerned a failed powerpack in a cluster. A sysadmin noticed and helpfully substituted the powerpack from another machine. When he connected the first node without a hitch but the second node promptly fried that power pack too. So he called an electrician. In order to get at the problem the electrician had to climb a ladder. The ladder slipped and the flailing electrician grabbed at the nearest thing to hand, the power rail, which collapsed and took out the leccy for the entire server room. We can't plan for such things, we can merely acknowledge that such stuff will happen.
The solutions are the obvious ones: realistic planning, smaller units of delivery, delivering something on a regular basis. One neat idea for solving the communication problem came from somebody who works for Unilever. They use Jabber to post small messages to a central message board, so everybody can see what everybody else is doing in real time. At last a business use for Twitter.
An Industrial Engineer's Approach to DBMS
Problems with the AV set-up seem to have become a theme in the sessions I've chaired. Cisco's Robyn Sands turned up with a Mac but without the requisite dongle which would allow her to plug it into the hall's projector. So she ended up having to drive her presentation from a PDF on a loaned Windows machine. She handled the transition to an unfamiliar OS but it was an unlucky start to her session.
Industrial engineering concerns itself with "design, improvement and installation of integrated systems of people, material, equipment and energy", which is a pretty good definition of the role of a DBA too. Industrial engineers focus on efficiency, value and methodology; they are the accountants of the engineering world. The application of IE methods to DBMS has the aim of producing a consistent application. For instance, every database in an organisation should be consistent: same installed components, same init parameters, same file locations, with transportable tablespaces and reusable code/scripts/methods. This results in safer and more flexible systems. The installation process is a flowchart; in effect the database instance is deployed as an appliance.
Another IE practice is value analysis. This says that a cost reduction adds value to a system just as much as adding a new feature. Which brings us to Statistical Process Control . Every process displays variability: there is controlled variability and uncontrolled variability. We need to use logging to track the elapsed time of our processes, and measure the degree of variability. Benchmarking is crucial because we need to define the normal system before we can spot abnormality. Abnormal variability falls into three categories:
- special case;
- excess variation.
Robyn described a project she had worked on which focused on systems reliability. The goal was to reduce or eliminate recurring issues with a view to reducing outages - and henceout-of-hours calls - to increase the amount of uninterrupted sleep for DBAs and developers. A worthy end. The problem is simply that a DBA or developer woken at three in the morning will apply the quickest possible fix to resolve the outage but there was no budget to fix the underlying problem when they got back into the office. Usually old code is the culprit. There's lots of kruft and multiple dependencies, which make the programs brittle. The project worked to identify the underlying causes of outages and fix them. The metric they used to monitor the project's success was the number of out-of-hours calls: over the course of the year these fell by orders of magnitude.
Robyn finished her presentation with some maxims:
- Rules of thumbs are not heuristics.
- Discipline and consistency lead to agility.
- Reduce variation to improve performance.
- No general model applies to all systems.
- Understand what the business wants.
- Model and benchmark your system accurately.
- Understand the capabilities of you system.
The licensing round table
This event was billed as "Oracle's Right To Reply". Unfortunately there wasn't an Oracle representative present and even when one was rustled up they could only take away our observations to pass them on. This confirmed an observation from Rocela's Jason Pepper that Oracle employees are expressly forbidden from discussing licencing unless they are an account manager. This can lead to situations where advice from Support or Consulting leads to customers having exposure to increased licences.
The issues aired were the usual suspects. Why isn't partitioning part of the Enterprise Edition licence? Why aren't the management packs available for Standard Edition? Why isn't there a single, easily locatable document explaining pricing policy? How can we have sensible negotiations when the account managers keep changing? There was one area which was new to me. There is a recent initiative, the Customer Optimization Team, whose task is to rationalise a customer's licences. Somebody asked the pertinent question: what is the team's motivation - to get the best value for customer or to sell additional licences for all the things which the customer is using without adequate licences? Perhaps we'll get answers. I shall watch my inbox with bated breath.
This was a bonus session I hadn't been meaning to attend but it was worthwhile. Philip Marshall from Joraph presented his research into the effects of compression, because these are not well documented in the manual. Compression works by storing the distinct values of the compressed columns and then linking to each instance of that value, which obviously imposes a small overhead per row. So the space saved on storing the compressed column is dependent on both the length of the column and the number of instances of those values. The overhead means that compressing an index with small columns which have high variability could result in very small savings or even a larger index.
Also we need to remember that the apparent space saving could be due to the act of rebuilding the index rather than compressing it. This matters because (as we all should know) the space savings from rebuilding can be quickly lost once the index is subjected to DML. Furthermore there is a cost associated with uncompressing an index when we query its table. This is can be quite expensive. The good news is that the CPU cost of uncompressing the columns is incurred by the index read only: so it is usually only a small slice of the whole query. Still it's a cost we should avoid paying if we aren't actually getting a compensating saving on space. Also compression does not result in more index blocks being cached. More blocks will be read in a single sweep, but the unused blocks will be quickly discarded.
I thought this presentation was a classic example of the Formica Table approach. A focused - dare I say compressed? - look at a feature which probably most of us have contemplated using at some time without really understanding the implications. It was the kind of presentation which might just as easily have been a white paper (I will certainly be downloading the presentation to get the two matrices Philip included) but there is already so much to read on the net that a paper would have just got lost.
11g for DBAs
This was the first of a tie-less Tom Kyte's two selections from the 11g chocolate box. I think the 11g features have been sufficiently rehearsed over the last few months that I have decided to skip the details. So here is just a list of the new features Tom thinks DBAs most need to pay the attention to.
- Encrypted tablespaces
- Active Dataguard
- Real Application Testing
- Enhancements to Data Pump (EXP and IMP are now deprecated)
- Virtual columns
- Enhancements to partitioning
- Finer grained dependency tracking
- the xml version of the alert log
- invisible indexes
One notable thing in Tom's presentation occurred when he was demonstrating the new INTERVAL operation for partitioning. The new partitions had a year of 2020, but the dates were supposed to be from this year. It turns out Tom had been tinkering with his demo code and had removed an explicit date conversation without checking the default date format. It's nice to know even the demi-gods fall prone to such things ;)