Friday, September 18, 2020

The use and misuse of %TYPE and %ROWTYPE attributes in PL/SQL APIs

PL/SQL provides two attributes which allow us to declare a data structure with its datatype derived from a database table or a previously declared variable.

We can use %type attribute for defining a constant, a variable, a collection element, record field or PL/SQL program parameters. While we can reference a previously declared variable, the most common use case is to tie the declaration to a table column. The following snippet declares a variable with the same datatype and characteristics (length, scale, precision) as the SAL column of the EMP table.

l_salary emp.sal%type;
We can use the %rowtype attribute to declare a record variable which matches the projection of a database table or view, or a cursor variable. The following snippet declares a variable with the same projection as the preceeding cursor.
cursor get_emp_dets is
  select emp.empno
         , emp.ename
         , emp.sal
         , dept.dname
  from   emp
  inner join dept on dept,deptno = emp.deptno;        
l_emp_dets get_emp_dets%rowtype;
Using these attributes is considered good practice. PL/SQL development standards will often mandate their use. They deliver these benefits:
  1. self-documenting code: if we see a variable with a definition which references emp.sal%type we can be reasonably confident this variable will be used to store data from the SALARY column of the EMP table.
  2. datatype conformance: if we change the scale or precision of the the SALARY column of the EMP table all variables which use the %type attribute will pick up the change automatically. If we add a new column to the EMP table, all variables defined with the %rowtype attribute will be able to handle that column without us needing to change those programs.
That last point comes with an amber warning: the automatic conformance only works when the %rowtype variable is populated by SELECT * FROM queries. If we are using an explicit projection with named columns then we have now broken our code and we need to fix it. More generally, this silent propagation of changes to our data structures means we need to pay more attention to impact analysis. Is it right that we can just change a column's datatype or amend a table's projection without changing the code which depends on them? Maybe it's okay, maybe not. By shielding us from the immediate impact of broken code, using these attributes also withholds the necessity to revisit our programs: so we have to remember to do it.

Overall I think the benefits listed above outweigh the risks, and I think we should always use these attributes whenever it is appropriate, for the definition of local variables and constants. However, complications arise if we use them to declare PL/SQL program parameters, specifically for procedures in package specs and standalone program units. It's not so bad if we're writing an internal API but it becomes a proper headache when we are dealing with a public API, one which will be called by programs owned by another user, one whose developers are in another team or outside our organisation, or even using Java, dotNet or whatever. So why is the use of these attributes so bad for those people?

  1. obfuscated code: these attributes are only self-documenting when we have a familiarity with the underlying schema, or have easy access to it. This will frequently not be the case for developers in other teams (or outside the organisation) who need to call our API. They may be able to guess at the datatype of SALARY or HIREDATE, but they really shouldn't have to. And, of course, a reference to emp%rowtype is about as unhelpful as it could be. Particularly when we consider ...
  2. loss of encapsulation: one purpose of an API is to shield consumers of our application from the gnarly details of its implementation. However, the use of %type and %rowtype is actually exposing those details. Furthermore, a calling program cannot define their own variables using these attributes unless we grant them SELECT on the tables. Otherwise the declaration will hurl PLS-00201. This is particularly problematic for handling %rowtype, because we need to define a record variable which matches the row structure.
  3. breaking the contract: an interface is an agreement between the provider and the calling program. The API defines input criteria and in return guarantees outcomes. It forms a contract, which allows the consumer to write code against stable definitions. Automatically propagating changes in the underlying data structures to parameter definitions creates unstable dependencies. It is not simply that the use of %type and %rowtype attributes will cause the interface to change automatically, the issue is that there is no mechanism for signalling the change to an API's consumers. Interfaces demand stable dependencies: we must manage any changes to our schema in a way which ideally allows the consumers to continue to use the interface without needing to change their code, but at the very least tells them that the interface has changed.

Defining parameters for public APIs

The simplest solution is to use PL/SQL datatypes in procedural signatures. These seem straightforward. Anybody can look at this function and understand that input parameter is numeric and the returned value is a string.
function get_dept_manager (p_deptno in number) return varchar2;
So clear but not safe. How long is the returned string? The calling program needs to know, so it can define an appropriately sized variable to receive it. Likewise, in this call, how long is can a message be?
procedure log_message (p_text in varchar2);
Notoriously we cannot specify length, scale or precision for PL/SQL parameters. But the calling code and the called code will write values to concretely defined types. The interface needs to communicate those definitions. Fortunately PL/SQL offers a solution: subtypes. Here we have a substype which explicitly defines the datatype to be used for passing messages:
subtype st_message_text is varchar2(256);

procedure log_message (p_text in st_message_text);
Now the calling program knows the maximum permitted length of a message and can trim its value accordingly. (Incidentally, the parameter is still not constrained in the called program so we can pass a larger value to the log_message() procedure: the declared length is only enforced when we assign the parameter to something concrete such as a local variable.)

We can replace %rowtype definitions with explicit RECORD defintions. So a function which retrieves the employee records for a department will look something like this:

subtype st_deptno is number(2,0);

type r_emp is record(
  empno          number(4,0),
  ename          varchar2(10),
  job            varchar2(9),
  mgr            number(4,0),
  hiredate       date
  sal            number(7,2),
  comm           number(7,2),
  deptno         st_deptno
);
  
type t_emp is table of r_emp;  

function get_dept_employees (p_deptno in st_deptno) return t_emp;
We do this for all our public functions.
subtype st_manager_name is varchar2(30);

function get_dept_manager (p_deptno in st_deptno) return st_manager_name;
Now the API clearly documents the datatypes which calling programs need to pass and which they will receive as output. Crucially, this approach offers stability: the datatype of a parameter cannot be changed invisibly, as any change must be implemented in a new version of the publicly available package specification. Inevitably this imposes a brake on our ability to change the API but we ought not to be changing public APIs frequently. Any such change should arise from either new knowledge about the requirements or a bug in the data model. Wherever possible we should try to handle bugs internally within the schema. But if we have to alter the signature of a procedure we need to communicate the change to our consumers as far ahead of time as possible. Ideally we should shield them from the need to change their code at all. One way to achieve that is Edition-Based Redefinition. Other ways would be to deploy the change with overloaded procedures or even using a different procedure name, and deprecate the old procedure. Occasionally we might have no choice but to apply the change and break the API: sometimes with public interfaces the best we can do is try to annoy the fewest number of people.

Transitioning from a private to a public interface

There is a difference between internal and public packages. When we have procedures which are intended for internal usage (i.e. only called by other programs in the same schema) we can define their parameters with %type and %rowtype attributes. We have access and - it is to be hoped! - familiarity with the schema's objects, so the datatype anchoring supports safer coding. But what happens when we have a package which we wrote as an internal package but now we need to expose its functionality to a wider audience? Should we re-write the spec to use subtypes instead?

No. The correct thing to do is to write a wrapper package which acts as a facade over the internal one, and grant EXECUTE privileges on the wrapper. The wrapper package will obviously have the requisite subtype definitions in the spec, and procedures declared with those subtypes. The package body will likely consist of nothing more than those procedures, which simply call their equivalents in the internal package. There may be some affordances for translating data structures, such as populating a table %rowtype variable from the public record type, but those will usually be necessary only for the purposes of documentation (this publicly defined subtype maps to this internally defined table column). There is an obvious overhead to writing another package, especially one which is really just a pass-through to the real functionality, but there are clear benefits which justify the overhead:

  • Stability. Not re-writing an existing package is always a good thing. Even if we are mechanically just replacing one set of datatype definitions with a different set which have the same characteristics we are still changing the core system, and that's a chunk of regression testing we've just added to the task.
  • Least privilege escalation. Even if the internal package has been written with a firm eye on the SOLID principles, the chances are it contains more functionality than we need to expose to other consumers. Writing a wrapper package gives us the opportunity to grant access to only the required procedures.
  • Composition. It is also likely that the internal package doesn't have the exact procedure the other team needs. Perhaps there are actually two procedures they need to call, or there's one procedure but it has some confusing internal flags in its signature. Instead of violating the Law of Demeter we can define one simple procedure in the wrapper package spec and handle the internal complexity in the body.
  • Future proofing. Writing a wrapper package gives us an affordance where we can handle subsequent changes in the internal data model or functionality without affecting other consumers. By definition a violation of YAGNI, but as it's not the main reason why we're doing this I'm allowing this as a benefit.

Design is always a trade off

The use of these attributes is an example of the nuances which Coding Standards often lack. In many situations their use is good practice, and we should employ them in those cases. But we also need to know when their use is a bad practice, and why, so we can do something better instead.

Part of the Designing PL/SQL Programs series

Wednesday, September 09, 2020

Ten character classes your project team needs

A dungeon-crawling party requires a good mix of character classes to be successful. If everyone is a wizard there's nobody who can fight off the orc warband. Similarly, a software development team needs a range of character traits and aspects to successfully deliver working software which meets the project's goals. Here's my take.

Scavenger

The Scavenger understands the importance of not re-inventing the wheel. To this end they acquire an encyclopaedic understanding of our languages' built-in libraries, the existing features of our system and other systems in the wider organisation, and open-source libraries.

Unless given a precise list of the project's wants a Scavenger will become a Mutant Renegade, scouring the post-atomic wasteland for useless relics, which are broken or undocumented or both.

Aspect: "Here's one we made earlier"
Traits: Focused laziness, Unfocused research, GitHub

Rat King

The Rat King understands that software development is a communal task. Consequently they work to forge a collection of disparate individuals into a team. Their remit includes facilitating meetings and arranging after-work socials. Despite their fearsome appearance and collectivist instincts the Rat King is extremely sensitive to what each person brings to the party, and strives to ensure that introverts and teetotallers are included without feeling pressurized.

Unless met with a smidgeon of friendly scepticism a Rat King will become a Facebook.

Aspect: "We must hang together or we will surely hang separately"
Traits: Teamwork, Communications, Contacts

Paladin

The Paladin is the defender of the project but is also committed to the ideal of a project which is worth defending. They ensure everybody follows best practice, adheres to coding standards and observes the agile ceremonies.

Unless there's a Rogue to balance them a Paladin will become a Grand Inquisitor (although maybe without the thumbscrews).

Aspect: "Just do it right"
Traits: Rigour, Weird inner light

Rogue

The Rogue is pragmatic where the Paladin is dogmatic. They have a swashbuckling approach to getting things done. They understand the concept of technical debt, they just tip the trade-off toward delivering stuff over following the rules. Very fond of observing that there's no such thing as "best practice".

Unless kept in line a Rogue will become a Cowboy. Yee-hah!

Aspect: "Let's do the show right here"
Traits: Resourcefulness, Acute bullshit detector, Cynefin

Mad Scientist

The Mad Scientist has a deep technical understanding of software development, both practice and theory. They are obsessed with innovative and extremely clever solutions to business problems.

Unless your business problem actually requires an extremely clever solution a Mad Scientist will become an Evil Supervillain, who will derail the project (but, to be fair, will not destroy the entire planet. Probably).

Aspect: "My monster lives!"
Traits: Single-mindedness, Visionary

Lab Assistant

The Lab Assistant is vital to delivering the work of a Mad Scientist. They document APIs on wiki pages, they write build scripts and unit tests, they productionize the PoC code. In short, they undertake all the tedious essential tasks which would distract a Mad Scientist from their creation.

A Lab Assistant to an Evil Supervillain is still a Lab Assistant, but the wiki pages are half-complete, the unit tests don't run and the code isn't fit to be checked into source control.

Aspect: "Here is the brain you wanted"
Traits: Flexibility, Service to the higher cause

Major-domo

The Major-domo helps the project run smoothly by taking care of all the little things everybody else forgets. They clean the whiteboard before a meeting starts, they bring Sharpies and Post-It notes to the retrospective, they write Jiras for the stories we just agreed we needed and they circulate minutes after decision-making meetings.

Unless other people occasionally do some of these tasks a Major-domo will become a Resentful Skivvy.

Aspect: "I'll add that to my To-Do list"
Traits: Well-stocked stationary cupboard, Scrivener

Court Jester

The Court Jester says out loud the things everybody else is thinking. They aren't afraid to appear ridiculous in order to make a point. Their role is to speak truth to power.

If they go too far a Jester becomes an Angry Ranter, ignored and shunned by everybody.

Aspect: "The true fool stays silent in the face of foolishness"
Traits: Humour, Insight, Lack of inhibition

Bounty Hunter

The Bounty Hunter lives for finding and fixing bugs. They are never happier when writing test cases to reproduce a bug or stepping through lines of code in debug mode. They understand that fixing production code is more important than delivering a new feature.

Unless kept on a tight leash a Bounty Hunter will become a Mindless Delver or a Tinkerer.

Aspect: “To defeat the bug, we must understand the bug”
Traits: Sense of purpose, Perseverance, Debugging

Druid

The Druid has an understanding of ecosystem beyond our project's bounded context. They know what the business seeks to achieve and how our project furthers those goals. They also know about other projects in the organisation, and work to ensure our project integrates with them harmoniously.

Unless given a clear sense of our project's priorities and direction a Druid is still a Druid, just servicing the needs of other projects.

Aspect: "Listen to the trees, dude"
Traits: Awareness, Empathy, Balance

Multi-faceted characters

Obviously these aren't main character classes. A project team comprises base classes such as Developers, Testers, Analysts, Architects, heck maybe even a Project Manager. What I list here are ancillary classes, which modify a base class. An Architect, a Developer, an Analyst or a Project Manager can benefit from having a touch of the Druid about them. Any Developer should spend some time being a Bounty Hunter or Scavenger. Different circumstances demand different class behaviours. When there's a major outage in Production we need Rogues to fix it, not Paladins muttering about process and sign-off. But after Production is back it is the Paladins who make sure the problem and its resolution are properly documented, and appropriate preventative measures put in place. Most people on the team will flow through several of these class behaviours, even over the course a single sprint.

When we're forming a new team to deliver some piece of software we focus on the hard skills. the main character classes. We need this many Developers, this many Analysts, a UX expert, an SEO specialist, and so forth. These are the easy things to define. But the success of the project will in large part depend on the soft skills and temperaments of the individuals in the team. This is a lot harder to measure. It's why personality tests like Myers-Briggs and Insights exist: some people think they're hokum but they provide a framework for assessing the make-up of a team in an age when we're uncomfortable casting horoscopes or taking auguries from the liver of a freshly-slaughtered goose. Using RPG character classes as metaphors for desirable behaviours has the advantage of jokiness. There is a categorical absence of pyschological research underpinning this article. Also it doesn't require us to obtain live waterfowl.

One last thing. The next time you find yourself at a retrospective with no marker pens and nothing to write on, look around for a Major-domo. And if you can't spot one why not appoint yourself to the role?

Epilogue

That final paragraph makes me sad for the times when retrospectives happened in a room with other people, with a whiteboard covered in post-It notes. Let's hope we can do them like that again.