Tuesday, December 04, 2007

UKOUG 2007: Monday, Monday

The first session I chaired was Martin Widlake talking on "Avoiding Avoidable Disasters and Surviving Survivable Ones". He took as his sermon the text "What could possibly go wrong?" Well, for starters, there could be a problem with the USB drive so that the wireless slide controller doesn't work. Martin disappeared to try to find a replacement, which left me with the possibility of having to vamp for an hour if he didn't come back. Fortunately he did return, albeit without a replacement gadget, so his presentation was more static than useful usual [see Comments - Ed]. Happily the ideas were lively. I was particularly taken with the Formica Table. This would be a forum where "not bad" DBAs answered questions which fitted 95% of all scenarios; sort of an Oak Table For The Rest Of Us.

His main theme was that projects tend to fail because decision-makers aren't realistic in their risk assessments. So projects are planned on the basis of everything going right. Staff are told to work extra hours and weekends without any recognition that tired people make mistakes, and fixing mistakes costs time. Or additional people are recruited which just increases the number of communication channels, to the point that the overhead of keeping everybody in the loop becomes excessive.

Martin advises us to install disaster-tolerant hardware, because it gives us more interesting problems to solve. Of course we shouldn't really switch to clustering technology just for the sake of it. But if we think we are eventually going to need RAC we should move to it now. That way we'll have learned to live with it and smoothed out all the wrinkles before the technology becomes critical to the system.

There were some entertaining war stories. One concerned a failed power pack in a cluster. A sysadmin noticed and helpfully substituted the power pack from another machine. The first node connected without a hitch, but the second node promptly fried that power pack too. So he called an electrician. In order to get at the problem the electrician had to climb a ladder. The ladder slipped and the flailing electrician grabbed at the nearest thing to hand, the power rail, which collapsed and took out the leccy for the entire server room. We can't plan for such things; we can merely acknowledge that such stuff will happen.

The solutions are the obvious ones: realistic planning, smaller units of delivery, delivering something on a regular basis. One neat idea for solving the communication problem came from somebody who works for Unilever. They use Jabber to post small messages to a central message board, so everybody can see what everybody else is doing in real time. At last a business use for Twitter.

An Industrial Engineer's Approach to DBMS


Problems with the AV set-up seem to have become a theme in the sessions I've chaired. Cisco's Robyn Sands turned up with a Mac but without the requisite dongle which would allow her to plug it into the hall's projector. So she ended up having to drive her presentation from a PDF on a loaned Windows machine. She handled the transition to an unfamiliar OS but it was an unlucky start to her session.

Industrial engineering concerns itself with "design, improvement and installation of integrated systems of people, material, equipment and energy", which is a pretty good definition of the role of a DBA too. Industrial engineers focus on efficiency, value and methodology; they are the accountants of the engineering world. The aim of applying IE methods to database management is consistency. For instance, every database in an organisation should look the same: same installed components, same init parameters, same file locations, with transportable tablespaces and reusable code/scripts/methods. This results in safer and more flexible systems. The installation process is a flowchart; in effect the database instance is deployed as an appliance.
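The "every database should look the same" idea lends itself to automation. As a toy illustration (a hypothetical sketch, not any tool Robyn mentioned, and nothing Oracle-specific), a script can diff init parameters across databases and report any drift from the standard build:

```python
# Illustrative sketch: given per-database dumps of init parameters,
# report any parameter whose value differs between environments.
# Database names and parameters below are made up for the example.

def find_drift(baselines):
    """baselines: dict of {db_name: {parameter: value}}.
    Returns {parameter: {db_name: value}} for parameters that differ."""
    all_params = set()
    for params in baselines.values():
        all_params.update(params)
    drift = {}
    for param in sorted(all_params):
        values = {db: params.get(param) for db, params in baselines.items()}
        if len(set(values.values())) > 1:   # more than one distinct value
            drift[param] = values
    return drift

dbs = {
    "PROD": {"sga_target": "8G", "open_cursors": "500"},
    "TEST": {"sga_target": "2G", "open_cursors": "500"},
}
print(find_drift(dbs))   # only sga_target differs
```

Run regularly against all databases, a report like this turns "are our environments consistent?" from a matter of faith into a checklist item.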

Another IE practice is value analysis. This says that a cost reduction adds value to a system just as much as adding a new feature. Which brings us to Statistical Process Control. Every process displays variability: there is controlled variability and uncontrolled variability. We need to use logging to track the elapsed time of our processes, and measure the degree of variability. Benchmarking is crucial because we need to define the normal system before we can spot abnormality. Abnormal variability falls into three categories:
  • special case;
  • trend;
  • excess variation.
Once we have explained the special cases we can file and forget them. Trends and excess variation both have to be investigated and fixed. The aim is achieving a consistent level of service rather than extreme performance. If you can accurately predict the Mean Response Time then you understand your system well.
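The mechanics of this are straightforward. Here is an illustrative Python sketch of classic control-chart logic (my own paraphrase, not anything from Robyn's toolkit): derive mean and 3-sigma limits from a baseline of elapsed times, then flag special causes (points outside the limits) and simple run-up trends.

```python
import statistics

def control_limits(baseline):
    """Mean and 3-sigma control limits from baseline elapsed times (seconds)."""
    mean = statistics.mean(baseline)
    sd = statistics.stdev(baseline)
    return mean - 3 * sd, mean, mean + 3 * sd

def classify(samples, lo, hi, trend_len=4):
    """Flag points outside the control limits, plus simple run-up trends."""
    flags = []
    for i, x in enumerate(samples):
        if x < lo or x > hi:
            flags.append((i, "special cause"))
        elif i >= trend_len - 1 and all(
            samples[j] < samples[j + 1]
            for j in range(i - trend_len + 1, i)
        ):
            flags.append((i, "trend"))
    return flags

# Baseline: a batch job that normally takes about ten seconds.
baseline = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.0, 9.7]
lo, mean, hi = control_limits(baseline)
print(classify([10.0, 10.1, 10.2, 10.3, 25.0], lo, hi))
# → [(3, 'trend'), (4, 'special cause')]
```

The point of the exercise is exactly what the talk described: a steady upward creep gets investigated before it becomes an outage, and a genuine outlier is distinguished from ordinary noise.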

Robyn described a project she had worked on which focused on systems reliability. The goal was to reduce or eliminate recurring issues with a view to reducing outages - and hence out-of-hours calls - to increase the amount of uninterrupted sleep for DBAs and developers. A worthy end. The problem is simply that a DBA or developer woken at three in the morning will apply the quickest possible fix to resolve the outage, and there is no budget to fix the underlying problem when they get back into the office. Usually old code is the culprit: there's lots of cruft and multiple dependencies, which make the programs brittle. The project worked to identify the underlying causes of outages and fix them. The metric they used to monitor the project's success was the number of out-of-hours calls: over the course of the year these fell by orders of magnitude.

Robyn finished her presentation with some maxims:
  • Rules of thumb are not heuristics.
  • Discipline and consistency lead to agility.
  • Reduce variation to improve performance.
  • No general model applies to all systems.
  • Understand what the business wants.
  • Model and benchmark your system accurately.
  • Understand the capabilities of your system.

The licensing round table


This event was billed as "Oracle's Right To Reply". Unfortunately there wasn't an Oracle representative present, and even when one was rustled up they could only take away our observations to pass them on. This confirmed an observation from Rocela's Jason Pepper that Oracle employees are expressly forbidden from discussing licensing unless they are an account manager. This can lead to situations where advice from Support or Consulting leaves customers exposed to increased licence costs.

The issues aired were the usual suspects. Why isn't partitioning part of the Enterprise Edition licence? Why aren't the management packs available for Standard Edition? Why isn't there a single, easily locatable document explaining pricing policy? How can we have sensible negotiations when the account managers keep changing? There was one area which was new to me. There is a recent initiative, the Customer Optimization Team, whose task is to rationalise a customer's licences. Somebody asked the pertinent question: what is the team's motivation - to get the best value for the customer, or to sell additional licences for all the things which the customer is using without adequate licences? Perhaps we'll get answers. I shall watch my inbox with bated breath.

Index compression


This was a bonus session I hadn't been meaning to attend, but it was worthwhile. Philip Marshall from Joraph presented his research into the effects of compression, because these are not well documented in the manual. Compression works by storing the distinct values of the compressed columns and then linking to each instance of that value, which obviously imposes a small overhead per row. So the space saved on storing the compressed column depends on both the length of the column and the number of instances of each value. The overhead means that compressing an index on small columns with high variability could yield very small savings or even a larger index.
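That trade-off can be illustrated with back-of-envelope arithmetic. The byte counts below are purely illustrative (my own toy model, not Oracle's actual on-disk format): each distinct value is stored once, and every row keeps a small reference to it instead of the full value.

```python
# Toy model of prefix compression: distinct values stored once,
# plus a small per-row reference. All sizes are illustrative.

def uncompressed_bytes(rows, col_len):
    """Leaf storage for the column with no compression."""
    return rows * col_len

def compressed_bytes(rows, distinct, col_len, ref_overhead=4):
    """Approximate leaf storage for the compressed column."""
    return distinct * col_len + rows * ref_overhead

# Long, highly repetitive column: a big win.
print(uncompressed_bytes(1_000_000, 30))        # 30,000,000
print(compressed_bytes(1_000_000, 50, 30))      #  4,001,500

# Short, nearly unique column: compression makes it *larger*.
print(uncompressed_bytes(1_000_000, 3))         #  3,000,000
print(compressed_bytes(1_000_000, 900_000, 3))  #  6,700,000
```

However crude the model, it captures Philip's conclusion: the win comes from long, repetitive keys, and a short, nearly unique key can come out worse than it went in.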

Also we need to remember that an apparent space saving could be due to the act of rebuilding the index rather than compressing it. This matters because (as we all should know) the space savings from rebuilding can be quickly lost once the index is subjected to DML. Furthermore there is a cost associated with uncompressing an index when we query its table, and this can be quite expensive. The good news is that the CPU cost of uncompressing the columns is incurred by the index read only, so it is usually only a small slice of the whole query. Still, it's a cost we should avoid paying if we aren't actually getting a compensating saving in space. Also, compression does not result in more index blocks being cached: more blocks will be read in a single sweep, but the unused blocks will be quickly discarded.

I thought this presentation was a classic example of the Formica Table approach. A focused - dare I say compressed? - look at a feature which probably most of us have contemplated using at some time without really understanding the implications. It was the kind of presentation which might just as easily have been a white paper (I will certainly be downloading the presentation to get the two matrices Philip included) but there is already so much to read on the net that a paper would have just got lost.

11g for DBAs


This was the first of tie-less Tom Kyte's two selections from the 11g chocolate box. The 11g features have been sufficiently rehearsed over the last few months that I've decided to skip the details. So here is just a list of the new features Tom thinks DBAs most need to pay attention to.
  • Encrypted tablespaces
  • Active Dataguard
  • Real Application Testing
  • Enhancements to Data Pump (EXP and IMP are now deprecated)
  • Virtual columns
  • Enhancements to partitioning
  • Finer grained dependency tracking
  • The XML version of the alert log
  • Invisible indexes
Of that list encrypted tablespaces, Active Dataguard, Real Application Testing and partitioning are (or require) chargeable extras. In the earlier round table Ronan kept reminding us that we must distinguish between licensing and pricing: we have to accept that Oracle has 47% of the database market so lots of CTOs and CFOs must think it offers value for money. Which is a fair point, but it is hard to see a compelling reason why a Standard Edition DBA would choose to upgrade. Actually the enhancements for developers are very attractive, but alas we don't carry as much sway.

One notable moment in Tom's presentation came when he was demonstrating the new INTERVAL operation for partitioning. The new partitions had a year of 2020, but the dates were supposed to be from this year. It turns out Tom had been tinkering with his demo code and had removed an explicit date conversion without checking the default date format. It's nice to know even the demi-gods are prone to such things ;)
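Tom's slip generalises: implicit conversions pick up whatever default format the session happens to have. A tiny Python analogy (nothing Oracle-specific, just the same class of bug) shows the same string parsing to wildly different dates under different assumed formats:

```python
from datetime import datetime

# The same string yields different dates depending on the format
# the parser assumes -- which is why dropping an explicit conversion
# and relying on a session default is risky.
s = "07-12-04"
as_ymd = datetime.strptime(s, "%y-%m-%d")   # 4 December 2007
as_dmy = datetime.strptime(s, "%d-%m-%y")   # 7 December 2004
print(as_ymd.date(), as_dmy.date())
```

The moral is the usual one: always state the format explicitly when converting strings to dates.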


5 Comments:

Anonymous Martin W said...

"so his presentation was more static than useful"

Oh dear. I must remember to run around more to ensure my talks have value :-)
One thing I learned from the USB slide control woes is how much such a small change like that can throw you! Must. Practice. More.

7 December 2007 04:47:00 GMT-8  
Blogger APC said...

>> "so his presentation was more static than useful"


Ooops. I of course meant "usual". Abject apologies.

Cheers, APC

7 December 2007 05:49:00 GMT-8  
Blogger Alex Gorbachev said...

At last a business use for Twitter.

Indeed, but this is about a year old post and the tool I described is even older -- Twitter, Pythian Style. IT should really employ Twitter style more often.

9 December 2007 14:50:00 GMT-8  
Blogger Alex Gorbachev said...

There was one area which was new to me. There is a recent initiative, the Customer Optimization Team, whose task is to rationalise a customer's licences. Somebody asked the pertinent question: what is the team's motivation - to get the best value for customer or to sell additional licences for all the things which the customer is using without adequate licences?

Exactly. Oracle can't be a prosecutor and an advocate at the same time so third-party could be the right answer? Paul Vallee recently posted on this topic. Some smart chaps realized this is a big problem and an opportunity for good business. ;-)

9 December 2007 15:00:00 GMT-8  
Blogger Robyn said...

Hello Andrew ...

Thank you for covering my presentation - Your notes are fantastic and will help me as I work to refine the topic and some of the subtopics. I'm still trying to figure out the right level of IE info to include and which topics have the most potential to be useful to the community. If you have additional feedback or comments, I'd love to hear them.

It was lovely to meet you at the conference ... Robyn

18 December 2007 07:02:00 GMT-8  
