Wednesday, May 31, 2017

Utilities - the Coincidental Cohesion anti-pattern

One way to understand the importance of cohesion is to examine an example of a non-cohesive package, one exhibiting a random level of cohesion. The poster child for Coincidental Cohesion is the utility or helper package. Most applications will have one or more of these, and Oracle's PL/SQL library is no exception. DBMS_UTILITY has 37 distinct procedures and functions (i.e. not counting overloaded signatures) in 11gR2 and 38 in 12cR1 (and R2). Does DBMS_UTILITY deliver any of the benefits the PL/SQL Reference says packages deliver?

Easier Application Design?

One of the characteristics of utilities packages is that they aren't designed in advance. They are the place where functionality ends up because there is no apparently better place for it. Utilities occur when we are working on some other piece of application code; we discover a gap in the available functionality such as hashing a string. When this happens we generally need the functionality now: there's little benefit to deferring the implementation until later. So we write a GET_HASH_VALUE() function,x stick it in our utilities package and proceed with the task at hand.

The benefit of this approach is we keep our focus on the main job, delivering business functionality. The problem is, we never go back and re-evaluate the utilities. Indeed, now there is business functionality which depends on them: refactoring utilities introduces risk. Thus the size of the utilities package slowing increases, one tactical implementation at a time.

Hidden Implementation Details?

Another characteristic of utility functions is that they tend not to share concrete implementations. Often a utilities package beyond a certain size will have groups of procedures with related functionality. It seems probable that DBMS_UTILITY.ANALYZE_DATABASE(), DBMS_UTILITY.ANALYZE_PART_OBJECT() and DBMS_UTILITY.ANALYZE_SCHEMA() share some code. So there are benefits to co-locating them in the same package. But it is unlikely that CANONICALIZE() , CREATE_ALTER_TYPE_ERROR_TABLE() and GET_CPU_TIME() have much code in common.

Added Functionality?

Utility functions are rarely part of a specific business process. They are usually called on a one-off basis rather than being chained together. So there is no state to be maintained across different function calls.

Better Performance?

For the same reason there is no performance benefit from a utilities package. Quite the opposite. When there is no relationship between the functions we cannot make predictions about usage. We are not likely to call EXPAND_SQL_TEXT() right after calling PORT_STRING(). So there is no benefit in loading the former into memory when we call the latter. In fact the performance of EXPAND_SQL_TEXT() is impaired because we have to load the whole DBMS_UTILITY package into the shared pool, plus it uses up a larger chunk of memory until it gets aged out. Although to be fair, in these days of abundant RAM, some unused code in the library cache need not be our greatest concern. But whichever way we bounce it, it's not a boon.

Grants?

Privileges on utility packages is a neutral concern. Often utilities won't be used outside the owning schema. In cases where we do need to make them more widely available we're probably granting access on some procedures that the grantee will never use.

Modularity?

From an architectural perspective, modularity is the prime benefit of cohesion. A well-designed library should be frictionless and painless to navigate. The problem with random assemblages like DBMS_UTILITY is that it's not obvious what functions it may contain. Sometimes we write a piece of code we didn't need to.

The costs of utility packages

Perhaps your PL/SQL code base has a procedure like this:
create or replace procedure run_ddl
  ( p_stmt in varchar2)
is
  pragma autonomous_transaction;
  v_cursor number := dbms_sql.open_cursor;
  n pls_integer;
begin
  dbms_sql.parse(v_cursor, p_stmt, dbms_sql.native);
  n := dbms_sql.execute(v_cursor);
  dbms_sql.close_cursor(v_cursor);
exception
  when others then
    if dbms_sql.is_open(v_cursor) then
      dbms_sql.close_cursor(v_cursor);
    end if;
    raise;
end run_ddl;
/

It is a nice piece of code for executing DDL statements. The autonomous_transaction pragma prevents the execution of arbitrary DML statements (by throwing ORA-06519), so it's quite safe. The only problem is, it re-implements DBMS_UTILITY.EXEC_DDL_STATEMENT().


Code duplication like this is a common side effect of utility packages. Discovery is hard because their program units are clumped together accidentally. Nobody sets out to deliberately re-write DBMS_UTILITY.EXEC_DDL_STATEMENT(), it happens because not enough people know to look in that package before they start coding a helper function. Redundant code is a nasty cost of Coincidental Cohesion. Besides the initial wasted effort of writing an unnecessary program there are the incurred costs of maintaining it, testing it, the risk of introducing bugs or security holes. Plus each additional duplicated program makes our code base a little harder to navigate.


Fortunately there are tactics for avoiding or dealing with this. Find out more.


Part of the Designing PL/SQL Programs series

1 comment:

Unknown said...

very interesting article I feel very enthusiastic while reading and the information
provided in this article is so useful for me. Content in this article guides in clarifying some of my doubts.For additional information please visit our website.

Oracle Fusion Financial Training