Thursday, October 27, 2005

Agile Databases and other paradoxes

So Niall Litchfield’s aside about Sandra Momali’s taste for Agile development has kicked off a bit of a debate. I thought I’d throw in my tuppenceworth, as I find XP a very seductive idea, even though I haven’t had much opportunity to use more than a couple of its techniques. I think it is easy for discussion about Agile development to descend into straightforward bashing of Java heads and developer-vs-DBA scrapping. Which can be fun for the participants but are not particularly enlightening for spectators. Besides there’s more to Agile than Java. In fact, Bruce Tate (author of Better, Faster, Lighter Java) seems ready to dump it in favour of Ruby because Java is insufficiently agile, EJB3.0 notwithstanding.

I hope we can all agree that the problems that Agile seeks to address are valid ones. Applications are all too frequently delivered which are bug ridden, late, not what the customer wanted, not what the customer now needs, and sometimes all of the above simultaneously. Let’s disregard considerations of whether there are other approaches that can also solve these problems and take the Agile approach as a given. The question that needs to be answered is, "Can Agile techniques sensibly and profitably be applied to database development?"1

From the Agile side of the fence the answer is mostly a resounding raspberry. Agile practioners tend to regard regard databases as persistence repositories of last resort. They have a number of strategies to avoid writing SQL. Not just by deferral, using Mock Objects in the development stage, but by making deliberate architectural decisions to not use databases. These range from serialization to files through use of in-memory object stores like Prevayler to hands-off Object-Relational mapping tools like Hibernate. Last year I was talking to a UKOUG member who no longer attended the SIG I help run. He explained that his project was currently being migrated to a new version using Hibernate, Spring, Velocity, etc. I said, that sounds exciting, could someone do a presentation at the SIG? Back came the answer: No. Being Java developers they literally could not see the point in talking to a database user group. The whole point about their approach to application development was to avoid thinking about the database at all.

Some of this resistance may be down to sheer pigheadedness, but (at the risk of seeming to patronise Ron Jeffries) many of these people are actually very bright and aware people. They just don’t think databases can be Agile. Why is this? Let’s look at some of the Agile practices that are made harder by working with an RDBMS.

Incremental Design


Agile starts with rather sparsely-defined requirements and uses techniques such as Test Driven Design to flesh them out. This is not building a prototype, this is working, production quality code. By building real code the Agile practioner can show his user a working application and get meaningful feedback that allows him to refine the requirements. Hence together they build the system that the user actually wants. The anti-pattern for this is Big Design Up Front. Here we spend ages drawing a mesh of boxes called an Entity Relationship Diagram. We show it to a bemused user who nods guardedly. We turn it into a data model but the user still isn’t as excited as we are. Finally the database gets built and then the front end gets built on it and whereupon the user wails, "But that’s not what I meant at all!" The question is, does the fault lie in the database or the process? Are databases inherently BDUF?

Obviously, we can just build one table at a time. But is that enough? Probably not. We still need to normalise our database2. Consequently, any attempt to build our database incrementally means constantly refactoring of our existing tables, to eliminate duplication. Can we do this? Yes we can! Of course it requires that we hide all our tables from public gaze behind an API - PL/SQL packages, views or types, which some people will just hate. The larger question is Tom Kyte's point about the importance of designing for performance right from the start. Does that imply BDUF? Yes, at least some. The other problem is that refactoring an object is a lot simpler than splitting one table filled with twenty million rows into two tables. Our database design can be reasonably plastic in development but once we go into production inertia kicks in. I think here is the killer for Agile practices: database refactoring is hard. Scott Ambler seems to be ploughing a lonely furrow in this area but his thoughts on The Process of Database Refactoring make interesting reading.

User stories


Agile practices tend to re-inforce each other and that’s the case with User Stories and Incremental Design. User stories describe how the customer’s business works. These stories will not mention data storage, except by implication. Customers don’t specify a database, they specify an application, that is, a front end. Customers cannot accept or validate databases. Even if we gave them TOAD and they wrote SQL queries that would not prove the database met any of their requirements, because their requirements are specified in terms of the front end. And without continuous user feedback we cannot be sure that we are coding in the right direction.

Kent Graziano has been working on building a datawarehouse using Agile techniques. He has presented on Agile Methods and DataWarehousing several times this year. When I saw him talk at Open World he said he had got some grief from Agilistas because the ETL section of datawarehouses doesn’t have users. So there cannot be any user stories. But as Kent points out, just because the business user (the customer) doesn’t know or care about the ETL process doesn’t mean that ETL has no users. On Kent's project they put the BI report writer in the user role. Which is a nimble, indeed agile, extension of the practice which doesn’t, as far as I can see, break any tenet.

Common Ownership of Code


When programming an agile practioner drives a path through the code base. If she finds an existing part of the application that blocks her implementation she can change, extend or fix that code. She can do this safely because the existing code has unit tests and the existing application has integration tests. More importantly, she can do this new work because the existing code base is in Java and she knows Java. But if this piece in the code base was in PL/SQL she would be lost: she doesn’t know PL/SQL. Even worse, suppose her change requires a new table? Then she would have to go to the DBA to find out about schemas, tablespaces, etc. This is too slow for the Agile way of working.

Teams of specialists are not Agile. We know this is true. How often have we been trying to progress a project only to be told, “We can’t do that, Gary’s on leave today.” To be fair this does not just apply to databases. I think Agile practices militates against heterogeneous programming environments and would be equally suspicious of (say) extensive use of shell programming. So, unless pretty much the whole team is competent in PL/SQL, using stored procedures just does not fit in an Agile project.

What this boils down to is the matter of where we should put the business logic. Unfortunately this frequently descends into a religious war. Either you believe it’s obvious that business rules belong in the database or it’s obvious to you that they belong in the middle tier. So let’s put it another way: in a client/server application where does the business logic belong? I think a large part of the middle tier proponents would then grudgingly accept the merits of the database as the repository of business logic. In most sites today we probably have heterogenous applications written in a variety of styles, techniques and languages, including many client/server applications. Our best hope of consolidating business rules is to use the database. But, this is an emotional debate and not one I expect either side to cede.

Object Oriented Programming


The alert amongst you will have noticed how often the term object appears in that list of persistence techniques: this is key. Agile programming languages are object-oriented languages. I was recently embroiled in a debate on the Test Driven Design list when I was told this surprising fact. Surprisng because I thought for several years I had been doing Test Driven Design by using Steven Feuerstein’s utPLSQL to build database applications. Apparently there’s more to Test First than merely writing our tests first.

Obviously an RDBMS is not object-oriented. A non-OO language like PL/SQL lacks certain key features supposedly necessary to the full panoply of Agile practices - abstraction, interfaces, reflection. Does this matter? I remain unconvinced. XP originated in the Smalltalk community and pretty much everybody you come across on Agile/Test Driven arenas seems to be an OO programmer. However, this is a self-fulfilling prophecy. Come out as a relational database developer and you get labelled as a handyman with only a hammer in your toolbox.

Use of Tools


Agile practioners make full use of utilities to improve their productivity - smart IDEs, code completion, continuous integration, automated unit testing, test coverage estimnators, refactoring browsers, etc. Compared to this panoply of riches PL/SQL looks very Spartan. There is the aforementioned utPLSQL but not much else. Of course, this is a vicious circle. PL/SQL lacks the tools to support Agile practices so people who want to be Agile don’t use PL/SQL so nobody builds the tools for PL/SQL.

Database development tools tend to focus on browsing the data dictionary and running SQL statements. There is very little around that genuinely supports the developer in building better PL/SQL or helps the DBA build better databases. This is a situation that is only going to get worse, as Oracle Designer slowly fades into the sunset. Paul Dorsey’s BRIM project (AKA The Tool) may be one to watch. Interestingly there is very little open source stuff for PL/SQL developers, compared to the wealth of freely downloadable utilities for Java heads and Pythonistas. I think this illustrates the lack of people who are competent in both an application (i.e. a UI building) language and PL/SQL.

Still, the Agile Manifesto says we have come to value individuals and interactions over processes and tools, so that’s all right then.

Instaneous Feedback


There is an old rule of thumb from the HCI arena that states that a user will regard as “too slow” any task that takes longer than ten seconds; if it takes longer than a minute they will start on another task and come back to the first task later. In Test Driven circles, running a suite of unit tests is such a task. The JUnit list regularly features complaints about how long it takes to run unit tests that require population of a database. This interferes with flow. Hence the popularity of Mock Objects. Hence the drive to isolate the database elements to as small a subset of the application as possible.

I think the database has to cop to this one. If your sole experience of database population is writing tedious XML files for use in DbUnit then you will naturally have a jaundiced view of RDBMS systems. But even people who know how to write SQL must admit that deleting and inserting records takes longer than instantiating some objects (once the JVM has been warmed up!). That’s the price we happily pay for rigour.

Pair Programming


Well, database people are notoriously misanthropic. Especially DBAs. We can still talk to each other.

Conclusion


The Agile argument is databases are slow to build and inherently hard to understand. This is true. But it’s true because Agile is primarily a methodology targeted at building user-facing applications. Its adherents tend to work from the user interface downwards. Of course it’s crucial that we build an application that meets the user’s needs. But the user’s needs extend beyond the functional requirements of the application screens.

That doesn’t mean we cannot apply Agile techniques to database development projects. Certainly I regard Test First to be an invaluable technique in building stored procedures; I’m always nevervous when I’m not coding to a utPLSQL test case. My experience of working on DSDM projects tells me that incremental development - in particular placing working code in front of users early and often - is a very good idea. Kent’s work proves that Agile techniques can be profitably applied even to a juggernaut like a datawarehouse. The pain comes when OO-fixated bean-mongers clash with constraint obsessed databasers. Perhaps if they drank less coffee they wouldn’t be so hyped up all the time.


1. By database development I mean building data structures (i.e. tables) as well as programs (PL/SQL).

2. I learnt yesterday that Codd’s Twelve Rules have been deprecated because "Codd formulated them as a quick and dirty way to counter all the nonsense that was floating at that early time, they are not orthogonal or systematic, and our understanding of RM has progressed considerably since then."

8 comments:

Eddie Awad said...

Hi, I just want to let you know that none of the links in this post works.

Noons said...

That's OK, Eddie: it's called Agile web writing: first we write the contents with the customer, then we worry about fixing the glaring bugs later on. It's all working production code, you know? What else would you expect from something that involves that blithering ignorant Scott Ambler?

It would be fun if it wasn't sad...

APC said...

>> Hi, I just want to let you know that none of the links in this post works.

Thanks for that Eddie. I've been experimenting with an RTF editor and it looks like it's default setting is smart quotes instead of ASCII 34. Serves me right for not using TextPad like I usually do.

APC said...

>> it's called Agile web writing: first we write the contents with the customer

As I said, I always get nervous when not writing a program with a proper test harness. In a Test First environment this would not have happened.

>> What else would you expect from something that involves that blithering ignorant Scott Ambler?

...and people say the internet has substituted childish insults for informed debate!

APC said...

In view of Nuno's comments above, his post on Eddie's blog about respect would be quite funny.

If it wasn't so sad. Am I starting a flame war? I hope not.

Anonymous said...

Cracking post Andrew. You've hit a number of nails on the head there.

Although I think that some of the people you describe are just as guilty of bias as the database developers they deride.

The truth of the matter is that to be truly agile your team has to know and understand a broad range of technologies. To say that you're agile because you know both Struts and Hibernate is disngenuous to say the least.

In my opinion you can never know too many technologies - although you're never going to be an expert in all of them. Without wanting to fan the flames I think Scott Ambler has it pretty right when he talks about agile developers having to be generalising specialists [1].

A database is an important part of many systems and to dismiss it as just a persistent object store is doing both your system and your customers a disservice.

[1] http://www.agilemodeling.com/essays/generalizingSpecialists.htm

Anonymous said...

I like a lot of this. "Plastic" is a good word for an Agile thing. Database refactoring is hard because we try to pretend that it is like refactoring code. We have to change how we think about databases before we can take them Agile... and we definitely can take them Agile.

I've got a series of articles on the topic of database agility. You might consider reading them in addition to this post and the other recommended materials.

Anonymous said...

情趣用品,情趣用品,情趣用品,情趣用品,情趣用品,情趣,情趣,情趣,情趣,情人歡愉用品,情趣用品,AIO交友愛情館,情人歡愉用品,美女視訊,情色交友,視訊交友,辣妹視訊,美女交友,嘟嘟成人網,按摩棒,震動按摩棒,微調按摩棒,情趣按摩棒,逼真按摩棒,G點,跳蛋,跳蛋,跳蛋,性感內衣,飛機杯,充氣娃娃,情趣娃娃,角色扮演,性感睡衣,SM,潤滑液,威而柔,香水,精油,芳香精油,自慰,自慰套,性感吊帶襪,情趣用品加盟,情人節禮物,情人節,吊帶襪,成人網站,AIO交友愛情館,情色,情色貼圖,情色文學,情色交友,色情聊天室,色情小說,七夕情人節,色情,A片,A片下載,免費A片,免費A片下載,情色電影,色情網站,辣妹視訊,視訊聊天室,情色視訊,免費視訊聊天,視訊聊天,美女視訊,視訊美女,美女交友,美女,情色交友,成人交友,自拍,本土自拍,情人視訊網,視訊交友90739,生日禮物,情色論壇,正妹牆,正妹,成人網站,A片,免費A片,A片下載,免費A片下載,AV女優,成人影片,色情A片,成人論壇,情趣,免費成人影片,成人電影,成人影城,愛情公寓,色情影片,保險套,舊情人,微風成人,成人,成人遊戲,成人光碟,色情遊戲,跳蛋,按摩棒,一夜情,男同志聊天室,肛交,口交,性交,援交,免費視訊交友,視訊交友,一葉情貼圖片區,性愛,視訊,嘟嘟成人網

愛情公寓,情色,舊情人,情色貼圖,情色文學,情色交友,色情聊天室,色情小說,一葉情貼圖片區,情色小說,色情,色情遊戲,情色視訊,情色電影,aio交友愛情館,色情a片,一夜情,辣妹視訊,視訊聊天室,免費視訊聊天,免費視訊,視訊,視訊美女,美女視訊,視訊交友,視訊聊天,免費視訊聊天室,情人視訊網,影音視訊聊天室,視訊交友90739,成人影片,成人交友,美女交友,微風成人,嘟嘟成人網,成人貼圖,成人電影,A片