Sunday, December 31, 2017

Data Access Layer vs Table APIs

One of the underlying benefits of PL/SQL APIs is the enabling of data governance. Table owners can shield their tables behind a layer of PL/SQL. Other users have no access to the tables directly but only through stored procedures. This confers many benefits:
  • Calling programs code against a programmatic interface. This frees the table owner to change the table's structure whenever it's necessary without affecting its consumers.
  • Likewise the calling programs get access to the data they need without having to know the details of the table structure, such as technical keys.
  • The table owner can use code to enforce complicated business rules when data is changed.
  • The table owner can enforce sophisticated data access policies (especially for applications using Standard Edition without DBMS_RLS).
So naturally the question arises, is this the same as Table APIs?

Table APIs used to be a popular approach to encapsulating tables. The typical Table API comprised two packages per table; one package provided methods for inserting, updating and deleting records, and the other package provided query methods. The big attraction of Table APIs was that they could be 100% generated from the data dictionary - both Oracle Designer and Steven Feuerstein's QNXO library provided TAPI generators. And they felt like good practice because, y'know, access to the tables was shielded by a PL/SQL layer.

But there are several problems with Table APIs.

The first is that they entrench row-by-agonising-row processing. Table APIs have their roots in early versions of Oracle so the DML methods only worked with a single record. Even after Oracle 8 introduced PL/SQL collection types TAPI code in the wild tended to be RBAR: there seems to something in the brain of the average programmer which predisposes them to prefer loops executing procedural code rather than set operations.

The second is that they prevent SQL joins. Individual records have to be selected from one table to provide keys for looking up records in a second table. Quite often this leads to loops within loops. So-called PL/SQL joins prevent the optimizer from choosing good access paths when handling larger amounts of data.

The third issue is that it is pretty hard to generate methods for all conceivable access paths. Consequently the generated packages had a few standard access paths (primary key, indexed columns) and provided an dynamic SQL method which accepted a free text WHERE clause. Besides opening the package to SQL injection this also broke the Law of Demeter: in order to pass a dynamic WHERE clause the calling program needed to know the structure of the underlying table, which defeats the whole objective of encapsulation.

Which leads on to the fourth, more philosophical problem with Table APIs: there is minimal abstraction. Each package is generated so it fits very closely to the structure of the Table. If the table structure changes we have to regenerate the TAPI packages: the fact that this can be done automatically is scant recompense for the tight coupling between the Table and the API.

So although Table APIs could be mistaken for good practice in actuality they provide no real benefit. The interface is 1:1 with the table structure so it has no advantage over granting privileges on the table. Combined with the impact of RBAR processing and PL/SQL joins on performance and the net effect of Table APIs is disastrous.

We cannot generate good Data Access APIs: we need to write them. This is because the APIs should be built around business functions rather than tables. The API packages granted to other users should comprise procedures for executing transactions. A Unit Of Work is likely to touch more than one table. These have to be written by domain experts who understand the data model and the business rules.

Part of the Designing PL/SQL Programs series


Tim... said...

I agree. Mark Farnham calls this "Logical Units of Work", which I like the sound of. I've never likes table APIs. I've always fallen in the Tom Kyte camp of APIs for business functions. What's more, this style of APIs will typically work really well when they are exposed as RESTful web services using ORDS, which solves two problems. You have a great API layer for Oracle-centric apps, and a great RESTful web service API for non-Oracle-centric apps. Boom! :)



Jeffrey Kemp said...

Also in the "agreed" camp, with all the issues raised in this article.

The TAPIs I generate in my applications do solve most of these problems, however:

1. They only encapsulate all DML (inserts, updates, deletes, merges) - queries are always done via direct queries or (preferably) views.

2. The generated code is only a starting point. I add more methods to handle specific requirements, such as bulk inserts, updates, deletes, etc.

3. There is a tight coupling between a TAPI and its table, this is intentional. Application code, however, usually calls a Transaction API (XAPI) which translates a requirement into one or more calls to TAPIs.

My preferred method has been accused of not being a "true" TAPI and I don't really care :)

Just my 2c.

Sean said...


I'd add that there is also a reduction in unintended consequences. Direct DB processing generated from/by an app (Hibernate, anyone?), either natively or through a table API, opens a greater possibility for code with unreliable performance profiles. Breaking into logical units of work prevents (or at least reduces) the possibility that whatever trickles down to the database engine is Bad.

Moreover, functional units in the database are more likely to be reused across applications to reproduce rule-based functionality. I may have many clients that want to update or report inventory, potentially written in different languages, from receiving systems, BOM/assembly, public facing commerce, internal customer service, shipping, inventory management, etc. I have to define the same rules individually in each client with a table API, and update them individually when they change. A functional/logical API distills that to one place—the database—and prevents the back-and-forth we often see in transactional systems. It protects data from inconsistencies in the application of changes in client software and reduces effort, with fewer people required to update a single PL/SQL package than the upstream alternative that necessitates multiple client teams writing Java, .Net, C#, etc. to simultaneously coordinate code releases on multiple systems.