Coding Naked

I know the title sounds like clickbait, but it’s how I’ve felt for a long time.  Why? Because of CASE tools.

CASE stands for Computer-Aided Software Engineering. Here, I’m referring to data modeling tools like ErWin and ER/Studio.  It disheartens me that the industry as a whole has gotten away from using these tools. In this world of rushing software to market, we’ve forgotten that the data-driven marketplace starts with the word ‘data‘.

A brief history of CASE

I started in IT in the 90s when things moved a little slower and we actually took time to model databases.  [Editor: I somehow managed to resist inserting a gif of an old man and a cane.]  So, I got used to using CASE tools, and by the year 2000 I was quite adept at using the different flavors of these tools.  And I’m going to tell you, I miss them greatly.

The word domain has had a lot of play in IT, because it’s a nice generic container for many things. In the 1990s, a domain was a universal concept when working with a CASE tool. In SQL, we have another word for domain: column.  Old timers like Joe Celko get really irate when you call a domain a column, because they know what it’s for.  But we’ll get to that in a minute. First, let’s talk about the difference between a domain and a column.

Column vs Domain

A column is a single property in a table, which holds the same type of data throughout all rows.  Think of an Excel spreadsheet that stores a list of names.  That’s how data is represented to us, even in SQL Server Management Studio (and practically every other query tool out there).  We see the data as these little on-the-fly spreadsheets.  But, that would be a very myopic view of what a domain is.

In data modeling, a domain is implemented as a column. But, the domain itself is an organizational structure for column definitions.  A single domain can apply to individual columns in dozens of tables, if you like. Let’s take an example.

Let’s say you’re writing a SQL application, and you sit down to create a table for it. In that table, you need a FirstName column.  After some thought, you give that column a datatype of VARCHAR(50).  You then get pulled away, and don’t get back to the code for another week or so. When you return to the code, you pick up where you left off and start creating another table.  The second table also needs a FirstName column. As it’s been weeks, you forget that the first table has a FirstName column with VARCHAR (50), and you’re in a different mindset today, so you give this new FirstName column VARCHAR(75).

This sort of thing happens all the time, especially in projects with multiple developers. See how this app is just begging for bugs?  These bugs can be very difficult to track down.

Developing with CASE and domains

This is where domains and CASE tools come in.  In a CASE tool, you can define FirstName as a domain.  That domain a single entity where you define the datatype, length, nullability, and often even more properties.  Then, when you’re creating tables in your CASE tool, you associate each column with a domain, instead of assigning datatypes.

With a defined domain, every column you associate with that domain automatically gets consistent properties.  Furthermore, if you have a change of heart and decide to change FirstName from varchar(50) to nvarchar(100), you change it at the domain level. Then all the columns that use it pick up the changes.  (I’ve always liked how meta it is that you’re normalizing the data model process itself.)

You can see why the old-timers insist on calling it a domain: because it is.  It’s not a single instance of a FirstName…it’s the domain of FirstName, which every FirstName column will take its properties from.

Fill your data applications with domains, and not columns.  Now that you’ve got a domain for FirstName, you can create domains for every other ‘column’ in your different data projects.  Even better, you can reuse these domains in your different projects, so FirstName is the same in every application you write!  This is what we call a data dictionary.  It’s the right way to do things because it forces you to think about the right data type for the job, and then it enforces it throughout your entire data enterprise.

What’s more, publish your data dictionary! Anyone writing writing stored procedures and other code against your databases should use that data dictionary to determine what datatypes they need for parameters, variables, temporary tables, and the like.  With any luck, they would look up those values in the database anyway, so you may as well expose them all in a single, searchable place.

Coding naked, without CASE

All of this is to show now naked I feel when I code these days.  You see, I don’t have access to a CASE tool any more, because they’re so expensive.  The companies that sell them think an awful lot of the software, while the companies I’ve worked for don’t. So, I haven’t been able to talk anyone into footing the bill in a long time.

The industry as a whole doesn’t see the value in not rushing software out the door, so nobody takes time to actually model data anymore.  [Old mane cane gif, resisted again. -Editor] So, the tools should be cheaper because demand isn’t very high.

All I know is that my projects have suffered as a result.  I’ve had silly bugs and inconsistencies that are pointless, because I was in a different frame of mind when designing different tables, and didn’t realize I was creating my columns with different properties.  Until I can solve my CASE tool problem, it’s like coding naked because I’m exposed to bugs, data anomalies, performance issues due to implicit conversion issues, and probably more.

Let’s get back to a time where we can slow things down and start doing things right.  And vendors, stop treating CASE tools like a precious commodity.  Price them to own and you may see an increase in sales.

"Old man yells at cloud."
Close enough. -Ed.

Comments are closed.