Using descriptive data types
gabr just posted about using descriptive variable names to show units: a “_kb_s” suffix if the variable is in kb/s, for example, so you can easily spot places where you’re assigning a kb/s measurement into a bits/s variable.
We’ve done one better: make descriptive data types, and lean on the compiler.
Basically, the idea is this: if you find yourself making a certain kind of mistake, then do something so the compiler will check for you. Or, better yet, make it so that you can’t even type in code that has that mistake.
In our app, we have a grid that does some heavy lifting. It’s basically a miniature spreadsheet, with heavy integration into our app’s stored data. So we do a lot of work with coordinates: rows and columns.
Now, to most of us, the words naturally go in that order: “rows and columns”. But the grid control we use tends to put the parameters in the other order: “AColumn, ARow: Integer”. This impedance mismatch led to a few subtle bugs over the years.
Another source of subtle bugs was the fact that the data we were displaying was logically a two-dimensional, zero-based array. But in the grid, that data was all in one-based coordinates, because the row and column header cells took up index 0 in the grid. So we were forever chasing bugs where we forgot to add one or subtract one.
So finally, a couple of years ago, we got fed up and wrote TRowCol. I think this was shortly after we upgraded from Delphi 6 to Delphi 2005, because that’s when the compiler gave us records with methods.
TRowCol’s name makes it clear what order the parameters go in: row first, then column. That was the first big win. The pair that first put TRowCol into the code base did not immediately update every place in the code that ever used rows and columns, but it wasn’t long before TRowCol dominated the field, as other pairs spread its usage.
The second big win was that, by putting methods and properties onto TRowCol, we could fix that thing of forgetting to add or subtract one. TRowCol stores zero-based “driver coordinates” internally, but it can present itself as either driver or GUI coordinates:
RowCol := TRowCol.FromGui(ARow, ACol);
InsertRow(RowCol.DriverRow);
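For readers who want to see the shape of such a type, here is a minimal sketch of what a record like this might look like. It is not our actual code: only the names TRowCol, FromGui, GuiRow, GuiCol and DriverRow come from this post, while FromDriver, DriverCol, the private fields and the HeaderCells constant are made up for illustration. It assumes the only difference between the two coordinate systems is the one header cell at index 0.

```delphi
program RowColSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  // Assumption: the grid reserves index 0 for its header row and column.
  HeaderCells = 1;

type
  // Hypothetical sketch; the real TRowCol is not shown in the post.
  TRowCol = record
  private
    FDriverRow: Integer; // zero-based, header cells excluded
    FDriverCol: Integer;
  public
    // Row always comes first, matching the type's name.
    class function FromGui(AGuiRow, AGuiCol: Integer): TRowCol; static;
    class function FromDriver(ADriverRow, ADriverCol: Integer): TRowCol; static;
    function GuiRow: Integer;
    function GuiCol: Integer;
    property DriverRow: Integer read FDriverRow;
    property DriverCol: Integer read FDriverCol;
  end;

class function TRowCol.FromGui(AGuiRow, AGuiCol: Integer): TRowCol;
begin
  // The only place that subtracts one.
  Result.FDriverRow := AGuiRow - HeaderCells;
  Result.FDriverCol := AGuiCol - HeaderCells;
end;

class function TRowCol.FromDriver(ADriverRow, ADriverCol: Integer): TRowCol;
begin
  Result.FDriverRow := ADriverRow;
  Result.FDriverCol := ADriverCol;
end;

function TRowCol.GuiRow: Integer;
begin
  // The only place that adds one.
  Result := FDriverRow + HeaderCells;
end;

function TRowCol.GuiCol: Integer;
begin
  Result := FDriverCol + HeaderCells;
end;

var
  RowCol: TRowCol;
begin
  RowCol := TRowCol.FromGui(3, 1);
  WriteLn(RowCol.DriverRow, ', ', RowCol.DriverCol); // prints "2, 0"
  WriteLn(RowCol.GuiRow, ', ', RowCol.GuiCol);       // prints "3, 1"
end.
```

Because callers can only get a row or column out by naming the coordinate system they want, the add-one and subtract-one arithmetic lives in exactly two small groups of lines.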
This became an even bigger win when it became clear that the grid control we were using couldn’t really cope with hidden rows and columns. It had support for hiding, but it was half-assed at best. So we gave up on its hiding logic, and wrote our own coordinate mapper to map between “real indexes” and “visible indexes”.
The great thing was, now that nearly everything was using TRowCol, there was a single inflection point. Only one piece of code knew how to convert between driver and GUI coordinates, and that was TRowCol itself. So we started passing the mapper object into the TRowCol.FromGui, TRowCol.GuiRow, and TRowCol.GuiCol methods. It went in quite smoothly for such a fundamental change — and we found one or two as-yet-undiscovered bugs while we were doing it!
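To give a rough idea of how that can fit together, here is a sketch under loud assumptions. TCoordinateMapper, THiddenFlags, VisibleToReal and RealToVisible are invented names, and the hidden-flag arrays are just one simple way to model “real” versus “visible” indexes; our real mapper is not shown in this post and may work quite differently.

```delphi
program MapperSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  HeaderCells = 1; // assumption: index 0 in the grid is the header

type
  THiddenFlags = array of Boolean;

  // Hypothetical mapper: one "hidden" flag per real index on each axis.
  // Indexes beyond the end of a flag array count as visible.
  TCoordinateMapper = class
  public
    HiddenRows: THiddenFlags;
    HiddenCols: THiddenFlags;
  end;

  TRowCol = record
  private
    FDriverRow, FDriverCol: Integer; // real, zero-based indexes
  public
    class function FromGui(AGuiRow, AGuiCol: Integer;
      AMapper: TCoordinateMapper): TRowCol; static;
    function GuiRow(AMapper: TCoordinateMapper): Integer;
    function GuiCol(AMapper: TCoordinateMapper): Integer;
    property DriverRow: Integer read FDriverRow;
    property DriverCol: Integer read FDriverCol;
  end;

// Real index of the AVisible-th visible entry.
function VisibleToReal(const AHidden: THiddenFlags; AVisible: Integer): Integer;
var
  Remaining: Integer;
begin
  Result := 0;
  Remaining := AVisible;
  while True do
  begin
    if (Result > High(AHidden)) or not AHidden[Result] then
    begin
      if Remaining <= 0 then
        Exit;
      Dec(Remaining);
    end;
    Inc(Result);
  end;
end;

// Visible index of a real index: count the visible entries before it.
function RealToVisible(const AHidden: THiddenFlags; AReal: Integer): Integer;
var
  I: Integer;
begin
  Result := 0;
  for I := 0 to AReal - 1 do
    if (I > High(AHidden)) or not AHidden[I] then
      Inc(Result);
end;

class function TRowCol.FromGui(AGuiRow, AGuiCol: Integer;
  AMapper: TCoordinateMapper): TRowCol;
begin
  Result.FDriverRow := VisibleToReal(AMapper.HiddenRows, AGuiRow - HeaderCells);
  Result.FDriverCol := VisibleToReal(AMapper.HiddenCols, AGuiCol - HeaderCells);
end;

function TRowCol.GuiRow(AMapper: TCoordinateMapper): Integer;
begin
  Result := RealToVisible(AMapper.HiddenRows, FDriverRow) + HeaderCells;
end;

function TRowCol.GuiCol(AMapper: TCoordinateMapper): Integer;
begin
  Result := RealToVisible(AMapper.HiddenCols, FDriverCol) + HeaderCells;
end;

var
  Mapper: TCoordinateMapper;
  RowCol: TRowCol;
begin
  Mapper := TCoordinateMapper.Create;
  try
    SetLength(Mapper.HiddenRows, 3);
    Mapper.HiddenRows[1] := True;            // hide real row 1
    RowCol := TRowCol.FromGui(2, 1, Mapper); // second visible row, first column
    WriteLn(RowCol.DriverRow);               // prints 2 (real row 1 is skipped)
    WriteLn(RowCol.GuiRow(Mapper));          // prints 2 (round trip)
  finally
    Mapper.Free;
  end;
end.
```

The point of the sketch is the signatures: once the mapper is a parameter of FromGui, GuiRow and GuiCol, the compiler flags every call site that still assumes the old one-to-one mapping.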
Since then, we’ve put in a few other records-with-methods for fundamental concepts in our code. For example, we now have TVersion, which unifies all the different ways we used to represent program and data versions, and can convert itself from and to any of the different formats we used to have — no more dozens of idiosyncratic conversion calls that we used to have to chain together in strange ways.
We do not yet have one to wrap up the concept of “a month and a year”, which we have at least four different ways of representing. But we’ll get there, I’m sure.
The idea is: if you have to worry about the units your data is expressed in, and making sure you convert it from one form to another when you need to, make the compiler help you. Naming your variables to show what units they’re in is a good idea, but it requires visual inspection. So does TRowCol, but only at the endpoints, never in the middle.
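The same trick applies to the kb/s example from the top of this post. The following is only a sketch: TBitRate, its methods and SetLimit are hypothetical names, and it assumes “kb” means 1000 bits.

```delphi
program BitRateSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  BitsPerKb = 1000; // assumption: "kb" means 1000 bits

type
  // Hypothetical type: stores bits/s internally, like TRowCol stores
  // driver coordinates, and converts only at the endpoints.
  TBitRate = record
  private
    FBitsPerSec: Int64;
  public
    class function FromBitsPerSec(AValue: Int64): TBitRate; static;
    class function FromKbPerSec(AValue: Int64): TBitRate; static;
    function BitsPerSec: Int64;
    function KbPerSec: Double;
  end;

class function TBitRate.FromBitsPerSec(AValue: Int64): TBitRate;
begin
  Result.FBitsPerSec := AValue;
end;

class function TBitRate.FromKbPerSec(AValue: Int64): TBitRate;
begin
  Result.FBitsPerSec := AValue * BitsPerKb;
end;

function TBitRate.BitsPerSec: Int64;
begin
  Result := FBitsPerSec;
end;

function TBitRate.KbPerSec: Double;
begin
  Result := FBitsPerSec / BitsPerKb;
end;

// Code in the middle takes a TBitRate and never sees a bare number.
procedure SetLimit(ALimit: TBitRate);
begin
  WriteLn('limit: ', ALimit.BitsPerSec, ' bits/s');
end;

begin
  SetLimit(TBitRate.FromKbPerSec(64)); // prints "limit: 64000 bits/s"
  // SetLimit(64); // does not compile: an Integer is not a TBitRate
end.
```

You still have to look at the FromKbPerSec call and confirm that 64 really is in kb/s, but that is the only spot where a human has to check the units.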
And who knows? You may find more operations that belong on this new type. We certainly have.
Note: If you use records with methods, beware the dreaded (and sporadic) compiler Internal Errors. These mean (loosely translated): “The people who write Delphi do not actually use records with methods themselves, so they never see the compiler bugs.” Here’s a hard-won hint: break up long expressions using temporary variables.
April 20th, 2007 at 8:22 pm
Caveat: You know I’m a shameless dynamic language advocate.
I will be the first person to say that most systems do not have enough types. Programmers today still rely too heavily on primitive types when more robust, functional objects could accomplish the task much better. To that end, I agree with the sentiment you’ve expressed.
But, that said. The idea that we should do this sort of thing because then we can let the compiler catch our mistakes just rubs me the wrong way.
There is not one kind of error that a compiler will catch that a unit test would not. Furthermore, there are entire classes of errors that unit tests will catch that a compiler can’t possibly detect. This is why statically typed languages are at a disadvantage when it comes to speed of development. They add cumbersome requirements to the code without adding truly discernable advantages in the long run.
Just keep in mind how often you find yourself saying "Man, I wish I had this feature from Ruby in Delphi," or "Dude, the syntax for Ruby-style blocks in C# is clunky." I know how often I say it, and how often I hear it.
Just remember, a vast majority of "new" programming ideas that are coming out for statically typed languages today have been in dynamic languages like Smalltalk and Lisp for decades. And all the rest are easily implemented. The converse isn’t always true.
April 21st, 2007 at 4:36 am
Hmm. You’re definitely right that the big win is in *making* the custom type, and using it instead of the primitive. (As we found out when we added more operations to TRowCol.)
But whenever I’ve tried to make a custom type in Ruby, and introduce it into code that was already using a primitive, I’ve missed places, even when I had unit tests. Generally it’s because I miss some of the glue. I test classes individually, and when the interface of one changes, I forget to update the tests of all its playmates to show the new contract (and the tests don’t tell me about the problem, because they’re testing each class in isolation). Usually it’s not until I run through a particular code path, much later, that I find that I missed something.
I’ll be the first to admit that I’m not that skilled yet at writing unit tests in a dynamic language (and feel free to blog about how to do this sort of thing effectively, or point me to some articles — I’d love to learn how to do it). But given where I am now, I prefer having a statically-typed compiler to lean on for changes like this. Granted that the compiler gets in the way for a lot of other things, but this specific thing is the very thing statically-typed compilers are good at.
April 24th, 2007 at 8:39 pm
Poor man’s Find References