Thursday, July 30, 2020

Minimal declaration of foreign key columns

Here is the full declaration of an inline foreign key constraint (referencing a primary key column on a table called PARENT):
, parent_id number(12,0) constraint chd_par_fk references parent(parent_id)
But what is the fewest number of words required to implement the same constraint? Two. This does exactly the same thing:
, parent_id references parent
The neat thing about this minimalist declaration is the child column inherits the datatype of the referenced primary key column. Here's what it looks like (with an odd primary key declaration, just to prove the point):
 
SQL> create table parent1
  2  (parent_id number(15,3) primary key)
  3  /

Table PARENT1 created.

SQL> create table child1
  2  ( id        number(12,0) primary key
  3   ,parent_id references parent1) 
  4  /

Table CHILD1 created.

SQL> desc child1
Name      Null?    Type         
--------- -------- ------------ 
ID        NOT NULL NUMBER(12)   
PARENT_ID          NUMBER(15,3) 
SQL> 
If we want to specify a name for the foreign key, we need to include the constraint keyword:
 
SQL> create table parent2
  2  (parent_id number(15,3) constraint par1_pk primary key)
  3  /

Table PARENT2 created.

SQL> create table child2
  2  ( id        number(12,0) constraint chd2_pk primary key
  3   ,parent_id              constraint chd2_par2_fk references parent2) 
  4  /

Table CHILD2 created.

SQL> desc child2
Name      Null?    Type         
--------- -------- ------------ 
ID        NOT NULL NUMBER(12)   
PARENT_ID          NUMBER(15,3) 
SQL> 
This minimal declaration always references the parent table's primary key. Suppose we want to reference a unique key rather than the primary key. (I would regard this as a data model smell, but sometimes we need to do it.) To make this work we merely need to reference the unique key column explicitly:
SQL> create table parent3
  2  ( parent_id  number(15,3)          constraint par3_pk primary key
  3   ,parent_ref varchar2(16) not null constraint par3_uk unique
  4  )
  5  /

Table PARENT3 created.

SQL> create table child3
  2  ( id         number(12,0) constraint chd3_pk primary key
  3   ,parent_ref              constraint chd3_par3_fk references parent3(parent_ref)) 
  4  /

Table CHILD3 created.

SQL> desc child3
Name       Null?    Type         
---------- -------- ------------ 
ID         NOT NULL NUMBER(12)   
PARENT_REF          VARCHAR2(16) 
SQL> 
Hmmm, neat. What if we have a compound primary key? Well, that's another data model smell, but it still works. Because we're constraining multiple columns we need to use a table-level constraint, and so the syntax becomes more verbose; we need to include the magic words foreign key:
SQL> create table parent4
  2  ( parent_id  number(15,3)   
  3   ,parent_ref varchar2(16) 
  4   ,constraint par4_pk primary key (parent_id, parent_ref)
  5  )
  6  /

Table PARENT4 created.

SQL> create table child4
  2  ( id number(12,0) constraint chd4_pk primary key
  3   ,parent_id
  4   ,parent_ref
  5   ,constraint chd4_par4_fk foreign key (parent_id, parent_ref) references parent4) 
  6  /

Table CHILD4 created.

SQL> desc child4
Name       Null?    Type         
---------- -------- ------------ 
ID         NOT NULL NUMBER(12)   
PARENT_ID           NUMBER(15,3) 
PARENT_REF          VARCHAR2(16) 
SQL> 
Okay, but supposing we change the declaration of the parent column, does Oracle ripple the change to the child table?
 
SQL> alter table parent4 modify parent_ref varchar2(24);

Table PARENT4 altered.

SQL> desc child4
Name       Null?    Type         
---------- -------- ------------ 
ID         NOT NULL NUMBER(12)   
PARENT_ID           NUMBER(15,3) 
PARENT_REF          VARCHAR2(16) 
SQL> 
Nope. And rightly so. This minimal syntax is a convenience when we're creating a table, but there's no object-style inheritance mechanism.
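If we want the child column to stay in step, we have to ripple the change ourselves with a second ALTER (a manual fix-up; nothing here is automatic):

alter table child4 modify parent_ref varchar2(24);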

Generally I prefer a verbose declaration over minimalism, because clarity trumps concision. I appreciate the rigour of enforcing the same datatype on both ends of a foreign key constraint. However, I hope that in most cases our CREATE TABLE statements have been generated from a data modelling tool. So I think this syntactical brevity is a neat thing to know about, but of limited practical use.

Monday, September 03, 2018

UKOUG London Development and Middleware event - free!

The Oracle development landscape is an extremely broad and complicated one these days. It covers such a wide range of tools, technologies and practices it is hard to keep up.

The UKOUG is presenting a day of sessions which can bring you up to speed. It's a joint initiative between the Development and Middleware SIGs - a composite if you will - at the Oracle City Office on Thursday 6th September. This event is free. If you are a UKOUG member, attending won't count against your allotment of SIG delegates; if you're not a UKOUG member there's no charge, so come along and get a taste of what the UKOUG has to offer.

The day covers a broad spectrum. Martin Beeby is a popular speaker; his talk covers how Oracle is embracing new cool technologies such as Blockchain, Docker and chatbots. There are talks from Oracle ACE Director Simon Haslam on mobile applications and Oracle ACE Director Mark Simpson on real-life uses for AI. There are also sessions on API design, building bots and JavaScript frameworks.

Even last year these things might have been considered cutting edge, certainly in the enterprise realm. But most organisations of whatever size are at least thinking about or running Proof of Concept projects in AI or blockchain. Some already have these technologies active in Production. These things will affect everybody working in IT, and probably sooner rather than later. It's always good to know what's coming.

Check out the full agenda here.
Register here.

Oh, and did I mention it's free? Treat yourself to a day out from the present and get a glimpse of the future.

Thursday, May 31, 2018

The Single Responsibility principle

The Single Responsibility principle is the foundation of modular programming, and is probably the most important principle in the SOLID set. Many of the other principles flow from it.

It is quite simple: a program unit should do only one thing. A procedure should implement a single task; a package should gather together procedures which solve a set of related tasks. Consider the Oracle library package UTL_FILE. Its responsibility is quite clear: it is for working with external files. It implements all the operations necessary to work with OS files: opening them, closing them, reading and writing, etc. It defines a bespoke suite of exceptions too.
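To see that clear responsibility in action, here is a minimal sketch of writing a line to a file; DATA_DIR is an assumed directory object, but the calls are the standard UTL_FILE interface:

declare
  f utl_file.file_type;
begin
  f := utl_file.fopen('DATA_DIR', 'greeting.txt', 'w');  -- open for write
  utl_file.put_line(f, 'Hello, file!');                  -- write one line
  utl_file.fclose(f);                                    -- release the handle
end;
/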

Each procedure in the package has a clear responsibility too. For instance, fclose() closes a single referenced file whereas fclose_all() closes all open files. Now, the package designers could have implemented that functionality as a single procedure, with different behaviours depending on whether the file parameter was populated or unpopulated. This might seem a simpler implementation, because it would be one fewer procedure. But the interface would actually become more complicated: essentially we would have a flag parameter, which means we would need to know a little bit more about the internal processing of fclose(). It would have made the package just a little bit harder to work with without saving any actual code.
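To see the difference, compare the actual interface with that hypothetical merged design (the second declaration below is my invention, not Oracle's):

-- Actual UTL_FILE: two procedures, each with a single job
procedure fclose (file in out file_type);
procedure fclose_all;

-- Hypothetical merger: a flag parameter steers the behaviour, so callers
-- must understand the internal special case to use it safely
procedure fclose (file in out file_type, close_all in boolean default false);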

Of course, it's pretty easy to define the Single Responsibility of a low-level feature like file handling. We might think there are some superficial similarities with displaying information to the screen, but it's fairly obvious that these are unrelated and so we need two packages, UTL_FILE and DBMS_OUTPUT. When it comes to our own code, especially higher-level packages, it can be harder to define the boundaries. At the broadest level we can define domains - SALES, HR, etc. But we need more than one package per domain: how do we decide the responsibilities of individual packages?

Robert C Martin defines the Single Responsibility principle as: "A class should have only one reason to change." Reasons for change can be many and various. In database applications dependence on tables is a primary one. So procedures which work on a common set of tables may well belong together. But there are at least two sets of privileges for data: reading and manipulating. So it's likely we will need a package which gathers together reporting-type queries which can be granted to read-only users and a package which executes DML statements which can be granted to more privileged users. Maybe our domain requires special processing, such as handling sensitive data; procedures for implementing that business logic will belong in separate packages.
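As a sketch of that split (all the names here are hypothetical), we might publish one package to the read-only audience and another to the writers:

create or replace package sales_reporting as
  -- read-only queries, safe to grant to reporting users
  function monthly_totals (p_month in date) return sys_refcursor;
end sales_reporting;
/

create or replace package sales_maintenance as
  -- DML, granted only to more privileged users
  procedure record_sale (p_customer_id in number, p_amount in number);
end sales_maintenance;
/

grant execute on sales_reporting   to sales_readonly;
grant execute on sales_maintenance to sales_clerk;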

Single responsibility becomes a matrix, with dependencies along one axis and audience of users along the other.

The advantages of Single Responsibility should be obvious. It allows us to define a cohesive package, collecting together all the related functionality, which makes it easy for others to reuse it. It also allows us to define private routines in a package body, which reduces the amount of code we have to maintain while giving us a mechanism for preventing other developers from using them. Restricting the features to a single responsibility means unrelated functions are not coupled together. This gives a better granularity for granting the least privileges necessary to users of our code.

Part of the Designing PL/SQL Programs series

Sunday, April 15, 2018

UKOUG Northern Technology Summit 2018

The UKOUG has run something called the Northern Server Day for several years. Northern because they were held in a northern part of England (but south of Scotland) and Server because the focus was the database server. Over the last couple of years the day has had several streams, covering Database, High Availability and Engineered Systems. So primarily a day for DBAs and their ilk.

This year the event has expanded to let in the developers. Yay!

The Northern Technology Summit 2018 is effectively a mini-conference: in total there are five streams - Database, RAC Cloud Infrastructure & Availability, Systems, APEX and Development. But for registration it counts as a SIG. So it's free for UKOUG members to attend. What astonishingly good value!¹ And it doesn't affect your entitlement to attend the annual conference in December.

The Development stream

The Development stream covers a broad range of topics. Application development in 2018 is a tangled hedge with new technologies like Cloud, AI and NoSQL expanding the ecosystem but not displacing the traditional database and practices. The Development stream presents a mix of sessions from the new-fangled and Old Skool ends of the spectrum.

  • The New Frontier: Mobile and AI Powered Conversational Apps. Oracle are doing interesting work with AI and Grant Ronald is king of the chatbots. This is an opportunity to find out what modern day apps can do.
  • Building a Real-Time Streaming Platform with Oracle, Apache Kafka, and KSQL No single technology is a good fit for all enterprise problems. Robin Moffat of Confluent will explain how we can use Apache Kafka to handle event-based data processing.
  • Modernising Oracle Forms Applications with Oracle Jet Oracle Forms was - still is - a highly-productive tool for building OLTP front-ends. There are countless organisations still running Forms applications. But perhaps the UX looks a little jaded in 2018. So here's Mark Waite from Griffiths Waite to show how we can use Oracle's JET JavaScript library to write new UIs without having to re-code the whole Forms application.
  • 18(ish) Things Developers Will Love about Oracle Database 18c Oracle's jump to year-based release numbers doesn't make life easier for presenters: ten things about 10g was hard enough. But game for a challenge, Oracle's Chris Saxon attempts to squeeze as many new features as possible into his talk.
  • Modernize Your Development Experience With Oracle Cloud Cloud isn't just something for the sysadmins, there's a cloud for developers too. Sai Janakiram Penumuru from DXC Technology will explain how Oracle Developer Cloud might revolutionise your development practices.
  • Designing for Deployment As Joel Spolsky says, shipping is a feature. But it's a feature which is hard to retrofit. In this talk I will discuss some design principles which make it easier to build, deploy and ship database applications.

Everything else

So I hope the Development stream offers a day of varied and useful ideas. There are things you might be able to use right now or in the next couple of months, and things which might shape what you'll be doing next year. But it doesn't matter if not everything floats your boat. The cool thing about the day is that delegates can attend any of the streams.²

So you can listen to Nigel Bayliss talking about Optimisation in the Database stream, Vipul Sharma talking about DevOps in the Availability stream, Anthony talking about Kubernetes in the Systems stream and John Scott talking about using Docker with Oracle in the APEX stream. There are sessions on infrastructure as code, upgrading Oracle 12cR1 to 12cR2, GDPR (the new EU data protection law), the APEX Interactive Grid, Apache Impala, and Cloud, lots of Cloud. Oh my!

The full agenda is here (pdf).

Register now

So if you're working with Oracle technology and you want to attend this is what you need to know:
  • Date: 16th May 2018
  • Location: Park Plaza Hotel, Leeds
  • Cost: Free for UKOUG members. There is a fee for non-members but frankly you might as well buy a bronze membership package and get a whole year's worth of access to UKOUG events (including the annual conference). It's a bargain.

1. The exact number of SIG passes depends on the membership package you have
2. The registration process requires you to pick a stream but that is just for administrative purposes. It's not a lock-in.

Sunday, December 31, 2017

Data Access Layer vs Table APIs

One of the underlying benefits of PL/SQL APIs is the enabling of data governance. Table owners can shield their tables behind a layer of PL/SQL. Other users have no access to the tables directly but only through stored procedures. This confers many benefits:
  • Calling programs code against a programmatic interface. This frees the table owner to change the table's structure whenever it's necessary without affecting its consumers.
  • Likewise the calling programs get access to the data they need without having to know the details of the table structure, such as technical keys.
  • The table owner can use code to enforce complicated business rules when data is changed.
  • The table owner can enforce sophisticated data access policies (especially for applications using Standard Edition without DBMS_RLS).
So naturally the question arises, is this the same as Table APIs?

Table APIs used to be a popular approach to encapsulating tables. The typical Table API comprised two packages per table; one package provided methods for inserting, updating and deleting records, and the other package provided query methods. The big attraction of Table APIs was that they could be 100% generated from the data dictionary - both Oracle Designer and Steven Feuerstein's QNXO library provided TAPI generators. And they felt like good practice because, y'know, access to the tables was shielded by a PL/SQL layer.

But there are several problems with Table APIs.

The first is that they entrench row-by-agonising-row processing. Table APIs have their roots in early versions of Oracle, so the DML methods only worked with a single record. Even after Oracle 8 introduced PL/SQL collection types, TAPI code in the wild tended to be RBAR: there seems to be something in the brain of the average programmer which predisposes them to prefer loops executing procedural code rather than set operations.
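The pattern looks something like this (the staging table and TAPI names here are hypothetical): a loop issuing one INSERT per row where a single set-based statement would do:

begin
  for r in (select id, name from stg_customers) loop
    customers_tapi.ins(p_id => r.id, p_name => r.name);  -- one execution per row
  end loop;
end;
/

-- the set operation the TAPI discourages:
insert into customers (id, name)
select id, name from stg_customers;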

The second is that they prevent SQL joins. Individual records have to be selected from one table to provide keys for looking up records in a second table. Quite often this leads to loops within loops. So-called PL/SQL joins prevent the optimizer from choosing good access paths when handling larger amounts of data.
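A sketch of such a PL/SQL join (again with hypothetical tables and TAPI); each iteration fires its own single-row query, and the optimizer never gets to see the join:

declare
  v_name customers.name%type;
begin
  for o in (select order_id, customer_id from orders) loop
    v_name := customers_tapi.get_name(o.customer_id);  -- one lookup per order
    dbms_output.put_line(o.order_id || ': ' || v_name);
  end loop;
end;
/

-- the join the TAPI rules out:
select o.order_id, c.name
from   orders o
join   customers c on c.customer_id = o.customer_id;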

The third issue is that it is pretty hard to generate methods for all conceivable access paths. Consequently the generated packages had a few standard access paths (primary key, indexed columns) and provided a dynamic SQL method which accepted a free-text WHERE clause. Besides opening the package to SQL injection this also broke the Law of Demeter: in order to pass a dynamic WHERE clause the calling program needed to know the structure of the underlying table, which defeats the whole objective of encapsulation.

Which leads on to the fourth, more philosophical problem with Table APIs: there is minimal abstraction. Each package is generated so it fits very closely to the structure of the Table. If the table structure changes we have to regenerate the TAPI packages: the fact that this can be done automatically is scant recompense for the tight coupling between the Table and the API.

So although Table APIs could be mistaken for good practice, in actuality they provide no real benefit. The interface is 1:1 with the table structure so it has no advantage over granting privileges on the table. Combined with the impact of RBAR processing and PL/SQL joins on performance, the net effect of Table APIs is disastrous.

We cannot generate good Data Access APIs: we need to write them. This is because the APIs should be built around business functions rather than tables. The API packages granted to other users should comprise procedures for executing transactions. A Unit Of Work is likely to touch more than one table. These have to be written by domain experts who understand the data model and the business rules.
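Such a procedure might look like this sketch (the ORDERS/ORDER_LINES model is hypothetical): the caller invokes a business action and never sees the tables:

create or replace procedure place_order
  ( p_customer_id in orders.customer_id%type
  , p_product_id  in order_lines.product_id%type
  , p_quantity    in order_lines.quantity%type )
is
  v_order_id orders.order_id%type;
begin
  -- a single Unit Of Work spanning two tables
  insert into orders (order_id, customer_id, order_date)
  values (orders_seq.nextval, p_customer_id, sysdate)
  returning order_id into v_order_id;

  insert into order_lines (order_id, line_no, product_id, quantity)
  values (v_order_id, 1, p_product_id, p_quantity);
end place_order;
/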

Part of the Designing PL/SQL Programs series

Friday, December 29, 2017

On hitting 100K on StackOverflow

100,000 is just another number. It's one more than 99,999. And yet, and yet. We live in a decimal culture. We love to see those zeroes roll up. Order of magnitude baby! It's the excitement of being a child, going on a journey in the family car when the odometer reads 99994, knowing you'll see 100000. Of course everybody got distracted by the journey and next time you look at the dial it reads 100002.

Earlier this year my StackOverflow reputation passed 100,000. Like the car journey I missed the actual moment. My rep had been 99,986 when I last checked the previous evening and 100,011 the next day. Hey ho.

Reputation is a big deal on StackOverflow because it is the prime measure of contribution. As a Q&A site (not a forum - that confuses a lot of people) it needs content: it needs good questions and good answers. Reputation points are the reward for good posts. In this context good is determined democratically: people vote up good questions and good answers, and - crucially - vote down poor ones. Votes are the main way of gaining reputation points: +5 for an upvoted question, +10 for an upvoted answer and +15 for an accepted answer. There are other ways of gaining - and losing - rep, but posting is the main one.
"Reputation is a rough measurement of how much the community trusts you; it is earned by convincing your peers that you know what you’re talking about." Meta Stack Exchange FAQ


So is reputation just a way of keeping score? Nope: it is gamification but there is more to it than that. Reputation means points, and what do points make? Prizes? No - privileges. StackOverflow is largely a self-policing community. There are full-on (elected) moderators but most moderation is actually carried out by regular SO users with sufficient rep. Somebody has asked an unclear question: once you have 50 rep you can post a comment asking for clarification. Got a user who doesn't know how to turn off the CAPSLOCK key? With 2000 rep you can just edit their post and apply sentence case. And so on.

Hmmm, so StackOverflow rewards its keenest contributors by allowing them to do chores around the site. Yes, and it works. One of the big problems with forums is other users. Not griefers as such, but there are a lot of low-level irritations: users who don't know how to search the site, or how to format their posts, or who just generally fail to understand etiquette. Granting increasing moderation privileges at reputation milestones allows committed users to smooth away some of those irritations.

But still, getting to 100,000 took eight years and almost 3000 answers. Was it worth it? Of course. It's nice to give back to the community. We are here to help: upvotes and accepted answers provide nice feedback that we've succeeded. Downvotes also provide a necessary corrective (even if it is annoying when some rando dings you on an answer from five years back without leaving a comment). And while there are no prizes, when you get to 100,000 you do get swag. A big box of swag:



Here is the box with a standard reference pear so you can see just how big it is.


Inside there is - a pen ....


Some stickers ....


A StackOverflow T-shirt (I have negotiated with my better half to keep this one) ...


And an over-sized coffee mug...


One more thing. There are also badges. Badges are nudges to encourage desirable behaviour such as editing posts, voting in moderator elections, reviewing posts, offering bounties, being awesome. Because let's face it, badges are cool. More badges = more flair. And who doesn't want more flair? Got flair? Heck yeah!

profile for APC at Stack Overflow, Q&A for professional and enthusiast programmers

Wednesday, May 31, 2017

Avoiding Coincidental Cohesion

Coincidental Cohesion is bad for our code base, so obviously we want to avoid writing utilities packages. Fortunately it is mostly quite easy to do so, but it requires vigilance on our part. Utilities packages are rarely planned. More often we are writing a piece of business functionality when we find ourselves in need of some low-level functionality. It doesn't fit in the application package we're working on, and perhaps we suspect it might be more generally useful, so we need somewhere to put it.


The important thing is to recognise and resist the temptation of the Utilities package. The name itself (and similarly vague synonyms like helper or utils) should be a red flag. When we find ourselves about to type create or replace package utilities we need to stop and think: what would be a better name for this package? Consider whether there are related functions we might end up needing. Suppose we're about to write a function to convert a date into a Unix epoch string. It doesn't take much imagination to think we might need a similar function to convert a Unix timestamp into a date. We don't need to write that function now, but let's start a package dedicated to Time functions instead of a miscellaneous utils package.
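Something like this sketch (TIME_UTILS is an invented name, and it assumes the dates are already in UTC):

create or replace package time_utils as
  function date_to_epoch (p_date in date) return number;
  function epoch_to_date (p_epoch in number) return date;
end time_utils;
/

create or replace package body time_utils as
  function date_to_epoch (p_date in date) return number is
  begin
    return round((p_date - date '1970-01-01') * 86400);  -- days to seconds
  end date_to_epoch;

  function epoch_to_date (p_epoch in number) return date is
  begin
    return date '1970-01-01' + (p_epoch / 86400);        -- seconds to days
  end epoch_to_date;
end time_utils;
/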


Looking closely at the programs which comprise the DBMS_UTILITY package, it is obviously unfair to describe them as a random selection. In fact there are seven or eight groups of related procedures.


DB Info

  • INSTANCE_RECORD Record Type
  • DBLINK_ARRAY Table Type
  • INSTANCE_TABLE Table Type
  • ACTIVE_INSTANCES Procedure
  • CURRENT_INSTANCE Function
  • DATA_BLOCK_ADDRESS_BLOCK Function
  • DATA_BLOCK_ADDRESS_FILE Function
  • DB_VERSION Procedure
  • GET_ENDIANNESS Function
  • GET_PARAMETER_VALUE Function
  • IS_CLUSTER_DATABASE Function
  • MAKE_DATA_BLOCK_ADDRESS Function
  • PORT_STRING Function
Runtime Messages
  • FORMAT_CALL_STACK Function
  • FORMAT_ERROR_BACKTRACE Function
  • FORMAT_ERROR_STACK Function
Object Management
  • COMMA_TO_TABLE Procedures
  • COMPILE_SCHEMA Procedure
  • CREATE_ALTER_TYPE_ERROR_TABLE Procedure
  • INVALIDATE Procedure
  • TABLE_TO_COMMA Procedures
  • VALIDATE Procedure
Object Info (Object Management?)
  • INDEX_TABLE_TYPE Table Type
  • LNAME_ARRAY Table Type
  • NAME_ARRAY Table Type
  • NUMBER_ARRAY Table Type
  • UNCL_ARRAY Table Type
  • CANONICALIZE Procedure
  • GET_DEPENDENCY Procedure
  • NAME_RESOLVE Procedure
  • NAME_TOKENIZE Procedure
Session Info
  • OLD_CURRENT_SCHEMA Function
  • OLD_CURRENT_USER Function
SQL Manipulation
  • EXPAND_SQL_TEXT Procedure
  • GET_SQL_HASH Function
  • SQLID_TO_SQLHASH Function
Statistics (deprecated)
  • ANALYZE_DATABASE Procedure
  • ANALYZE_PART_OBJECT Procedure
  • ANALYZE_SCHEMA Procedure
Time
  • GET_CPU_TIME Function
  • GET_TIME Function
  • GET_TZ_TRANSITIONS Procedure
Unclassified
  • WAIT_ON_PENDING_DML Function
  • EXEC_DDL_STATEMENT Procedure
  • GET_HASH_VALUE Function
  • IS_BIT_SET Function


We can see an alternative PL/SQL code suite, with several highly cohesive packages. But there will be some procedures which are genuinely unrelated to anything else. The four procedures in the Unclassified section above are examples. But writing a miscellaneous utils package for these programs is still wrong. There are better options.

  1. Find a home. It's worth considering whether we already have a package which would fit the new function. Perhaps WAIT_ON_PENDING_DML() should have gone in DBMS_TRANSACTION; perhaps IS_BIT_SET() properly belongs in UTL_RAW.
  2. A package of their own. Why not? It may seem extravagant to have a package with a single procedure but consider DBMS_DG with its lone procedure INITIATE_FS_FAILOVER(). The package delivers the usual architectural benefits plus it provides a natural home for related procedures we might discover a need for in the future.
  3. Standalone procedure. Again, why not? We are so conditioned to think of a PL/SQL program as a package that we forget it can be just a Procedure or Function. Some programs are suited to standalone implementation.


So avoiding the Utilities package requires vigilance. Code reviews can help here. Preventing the Utilities package becoming entrenched is crucial: once we have a number of packages dependent on a Utilities package it is pretty hard to get rid of it. And once it becomes a fixture in the code base developers will consider it more acceptable to add procedures to it.


Part of the Designing PL/SQL Programs series

Utilities - the Coincidental Cohesion anti-pattern

One way to understand the importance of cohesion is to examine an example of a non-cohesive package, one exhibiting a random level of cohesion. The poster child for Coincidental Cohesion is the utility or helper package. Most applications will have one or more of these, and Oracle's PL/SQL library is no exception. DBMS_UTILITY has 37 distinct procedures and functions (i.e. not counting overloaded signatures) in 11gR2 and 38 in 12cR1 (and R2). Does DBMS_UTILITY deliver any of the benefits the PL/SQL Reference says packages deliver?

Easier Application Design?

One of the characteristics of utilities packages is that they aren't designed in advance. They are the place where functionality ends up because there is no apparently better place for it. Utilities occur when we are working on some other piece of application code; we discover a gap in the available functionality, such as hashing a string. When this happens we generally need the functionality now: there's little benefit to deferring the implementation until later. So we write a GET_HASH_VALUE() function, stick it in our utilities package and proceed with the task at hand.

The benefit of this approach is that we keep our focus on the main job, delivering business functionality. The problem is, we never go back and re-evaluate the utilities. Indeed, now there is business functionality which depends on them: refactoring utilities introduces risk. Thus the size of the utilities package slowly increases, one tactical implementation at a time.

Hidden Implementation Details?

Another characteristic of utility functions is that they tend not to share concrete implementations. Often a utilities package beyond a certain size will have groups of procedures with related functionality. It seems probable that DBMS_UTILITY.ANALYZE_DATABASE(), DBMS_UTILITY.ANALYZE_PART_OBJECT() and DBMS_UTILITY.ANALYZE_SCHEMA() share some code. So there are benefits to co-locating them in the same package. But it is unlikely that CANONICALIZE(), CREATE_ALTER_TYPE_ERROR_TABLE() and GET_CPU_TIME() have much code in common.

Added Functionality?

Utility functions are rarely part of a specific business process. They are usually called on a one-off basis rather than being chained together. So there is no state to be maintained across different function calls.

Better Performance?

For the same reason there is no performance benefit from a utilities package. Quite the opposite. When there is no relationship between the functions we cannot make predictions about usage. We are not likely to call EXPAND_SQL_TEXT() right after calling PORT_STRING(). So there is no benefit in loading the former into memory when we call the latter. In fact the performance of EXPAND_SQL_TEXT() is impaired because we have to load the whole DBMS_UTILITY package into the shared pool, plus it uses up a larger chunk of memory until it gets aged out. Although to be fair, in these days of abundant RAM, some unused code in the library cache need not be our greatest concern. But whichever way we bounce it, it's not a boon.

Grants?

Privileges on utility packages are a neutral concern. Often utilities won't be used outside the owning schema. In cases where we do need to make them more widely available, we're probably granting access on some procedures that the grantee will never use.

Modularity?

From an architectural perspective, modularity is the prime benefit of cohesion. A well-designed library should be frictionless and painless to navigate. The problem with random assemblages like DBMS_UTILITY is that it's not obvious what functions it may contain. Sometimes we write a piece of code we didn't need to.

The costs of utility packages

Perhaps your PL/SQL code base has a procedure like this:
create or replace procedure run_ddl
  ( p_stmt in varchar2)
is
  -- an autonomous transaction must commit before returning, so any
  -- arbitrary DML passed in here fails with ORA-06519
  pragma autonomous_transaction;
  v_cursor number := dbms_sql.open_cursor;
  n pls_integer;
begin
  dbms_sql.parse(v_cursor, p_stmt, dbms_sql.native);  -- DDL executes on parse
  n := dbms_sql.execute(v_cursor);
  dbms_sql.close_cursor(v_cursor);
exception
  when others then
    -- tidy up the cursor before re-raising
    if dbms_sql.is_open(v_cursor) then
      dbms_sql.close_cursor(v_cursor);
    end if;
    raise;
end run_ddl;
/

It is a nice piece of code for executing DDL statements. The autonomous_transaction pragma prevents the execution of arbitrary DML statements (by throwing ORA-06519), so it's quite safe. The only problem is, it re-implements DBMS_UTILITY.EXEC_DDL_STATEMENT().
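The built-in does the same job in a single call (DEMO_T is just an illustrative table name):

begin
  dbms_utility.exec_ddl_statement('create table demo_t (id number)');
end;
/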


Code duplication like this is a common side effect of utility packages. Discovery is hard because their program units are clumped together accidentally. Nobody sets out to deliberately re-write DBMS_UTILITY.EXEC_DDL_STATEMENT(); it happens because not enough people know to look in that package before they start coding a helper function. Redundant code is a nasty cost of Coincidental Cohesion. Besides the initial wasted effort of writing an unnecessary program, there are the ongoing costs of maintaining it and testing it, and the risk of introducing bugs or security holes. Plus each additional duplicated program makes our code base a little harder to navigate.


Fortunately there are tactics for avoiding or dealing with this. Find out more.


Part of the Designing PL/SQL Programs series