DRY...ER!

"Don't Repeat Yourself! Ever! Really!"

I have what might be regarded as an extreme allergy to repetition. Once, in a job interview, I used my desire to "end repetition in our lifetime" (hereafter "ERIOL") sorta like a campaign slogan. (My interviewer pointed out the irony of using that phrase multiple times within the interview. Sadly, the techniques I will discuss here are not available in everyday speech. However, lawyers use the "hereafter" trick above all the time in contracts to define shorthands to be used throughout.)

So I take DRY coding very seriously. It really just derives from laziness, as I don't like having to look for the same thing over and over again and change it in multiple places. And I like to make the computer do the work to make sure that things stay consistent when in flux.

So here are a few key techniques that I use over and over again to ERIOL:

Using constants in place of literals

Picked this habit up from C/C++, using #define to prevent the scattering of "magic numbers" throughout the code. My take is that all non-trivial literal values (i.e. pretty much any value whose reason for being what it is and where it is is not immediately obvious) are best represented by a constant, especially if that value is likely to be used more than once. In practice, I don't do it much when there is no impending repetition, unless it makes the code much more sensible. But even if I'm just using the value twice in the same block, I will almost always take the extra time to make a constant.

Note: I'm currently doing a lot of work in Perl, where I don't have explicit constants (at least without putting some extra work in) so I settle for naming conventions, i.e. using all caps for "variables" whose values I don't expect to ever change.

So, for an extremely contrived example (in Perl):

sub test {
  my ($first_name, $last_name) = @_;
  my ($JOHN, $JANE, $JERRY) = ('John', 'Jane', 'Jerry');
  my $found = 0;
  if ($last_name eq 'Doe') {
    foreach my $test ($JOHN, $JANE, $JERRY) {
      if ($test eq $first_name) {
        $found = 1;
        my @others = $test eq $JOHN ? ($JANE, $JERRY)
                   : $test eq $JANE ? ($JOHN, $JERRY)
                   :                  ($JOHM, $JANE);
        print "Hello $test, where are ", join(' and ', @others), "?";
        last;
      }
    }
  }
  if (!$found) {
    print "I don't know you!"
  }
}

I know I could have coded it much more efficiently, but I intentionally coded it this way as a demonstration. What is particularly helpful about this is that it illustrates the power of DRYER coding in making the computer do your error-checking for you. Note there is one place where I wrote "$JOHM" instead of "$JOHN". If I had just written the literal 'Johm', the computer would be perfectly happy with my typo. However, because I mistyped an identifier, the compiler will catch the error. (You Perl-ers are using strict, aren't you?)
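
For the curious, the "extra work" for real constants can be fairly small. What follows is only a minimal sketch, not a rewrite of the example above: it assumes the core constant pragma is acceptable in your codebase, and the names KNOWN_DOES and others_for are made up for illustration. As with the all-caps variables, a mistyped name such as JOHM in the list below should be rejected at compile time under strict.

```perl
use strict;
use warnings;

# Core pragma: each name becomes a compile-time constant (note: no sigil).
use constant {
    JOHN  => 'John',
    JANE  => 'Jane',
    JERRY => 'Jerry',
};

# A list constant built from the scalar ones (hypothetical name).
use constant KNOWN_DOES => (JOHN, JANE, JERRY);

# Hypothetical helper: every known first name except the one passed in.
sub others_for {
    my ($first_name) = @_;
    return grep { $_ ne $first_name } (KNOWN_DOES);
}

print join(' and ', others_for(JOHN)), "\n";    # prints "Jane and Jerry"
```

One trade-off of this route: constants declared this way do not interpolate inside double-quoted strings the way $JOHN does, which is one reason the naming convention can still be the more convenient choice.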

Using the template method pattern

The template method pattern is a way of specifying an overall algorithm in one place, and then calling out to a group of other subroutines/methods (often collected in an object) to act out the details relevant to the specific case. Take for example the following code:

if ($foo eq 'bar') {
  print "Step on up to the bar!\n";
  print "Welcome worthy friend\n";
  print "What can I get you?\n";
  $foo .= 'e';
} elsif ($foo eq 'baz') {
  print "Welcome worthy friend\n";
  print "You has the baz!\n";
  print "What can I get you?\n";
} else {
  print "What can I get you?\n";
  print "Now go away!\n";
}

Obviously there are a few repeated bits. One way to reduce repetition would be to change literals to constants, and then reuse. Works in this case because the individual statements are pretty trivial. But for more complicated steps, there are better ways. Here's one:

if ($foo eq 'bar') {
  print "Step on up to the bar!\n";
}
if ($foo eq 'bar' || $foo eq 'baz') {
  print "Welcome worthy friend\n";
}
if ($foo eq 'baz') {
  print "You has the baz!\n";
}
print "What can I get you?\n";
if ($foo ne 'bar' && $foo ne 'baz') {
  print "Now go away!\n";
}
if ($foo eq 'bar') {
  $foo .= 'e';
}

Now each print statement occurs only once. But we are branching a bunch more times. We could cut out the repetitions of "$foo eq 'bar'" and "$foo eq 'baz'" by saving the results to variables, but we're still stuck with all these nutty branches. Most painful of all is the "if (A) {...} if (A || B) {...} if (B) {...}". It really feels like there should be a better way. Fortunately, there is, in using the template method pattern. The example below will not use actual classes to accomplish the groupings of subroutines, partly because I'm too lazy to throw all that out, and partly to demonstrate that the pattern can apply even outside of the object-oriented world:

# A little bit of setup, including choosing the collection of methods to use:
my $welcome = sub {print "Welcome worthy friend\n"};
my ($first_print, $second_print, $third_print, $fifth_print, $assignment) =
  $foo eq 'bar' ? (sub {print "Step on up to the bar!\n"}, $welcome, undef, undef, sub {$foo .= 'e'}) :
  $foo eq 'baz' ? (undef, $welcome, sub {print "You has the baz!\n"}, undef, undef) :
  (undef, undef, undef, sub {print "Now go away!\n"}, undef);
my $maybe_run = sub {my $sub = shift; $sub->() if $sub};

# Here begins the actual skeleton:
$maybe_run->($first_print);
$maybe_run->($second_print);
$maybe_run->($third_print);
print "What can I get you?\n"; # The skeleton doesn't need to delegate everything to a method
$maybe_run->($fifth_print);
$maybe_run->($assignment);

I have the methods stored in individual variables, whereas in an object-oriented approach you might have methods on an instance, but the principle is the same. Now it may be a little harder to see what gets done in any single case. But if we want to ERIOL, we must live with such obscurities. Plus, information hiding has its benefits, and now the skeleton lays out what the overarching approach is in a clear manner. (If you care about the specifics of any given case, best to write a test to make sure it performs as expected.) Code that outgrows its original purpose can often end up looking like a magnified version of one or both of the first two examples as more cases need to be dealt with, and someone may later come along and say "WTF is this code trying to accomplish and how do I change it to handle my new case?" It turns into a big game of Jenga, and aggressive test coverage may prevent it from toppling over, but it will take far too much mental effort to make the next incremental change. A good refactor using this pattern will do wonders for your outlook on life.
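
For comparison, here is roughly what the object-oriented flavor might look like. This is a sketch under my own assumptions, not the one true way: the package names (Greeter and friends), the hook names, and the greeter_for helper are all invented for illustration. The base class owns the skeleton and supplies do-nothing hooks; each subclass overrides only the steps it cares about.

```perl
use strict;
use warnings;

package Greeter;    # hypothetical base class that owns the skeleton

sub new {
    my ($class, %args) = @_;
    return bless { foo => $args{foo} }, $class;
}

# Hooks default to doing nothing; subclasses override what they need.
sub before_welcome { }
sub welcome        { }
sub after_welcome  { }
sub farewell       { }
sub finish         { }

# The template method: the overall algorithm, stated exactly once.
sub greet {
    my ($self) = @_;
    $self->before_welcome;
    $self->welcome;
    $self->after_welcome;
    print "What can I get you?\n";    # not every step needs a hook
    $self->farewell;
    $self->finish;
    return $self->{foo};
}

package Greeter::Regular;    # shared by 'bar' and 'baz', so the welcome lives in one place
our @ISA = ('Greeter');
sub welcome { print "Welcome worthy friend\n" }

package Greeter::Bar;
our @ISA = ('Greeter::Regular');
sub before_welcome { print "Step on up to the bar!\n" }
sub finish { my ($self) = @_; $self->{foo} .= 'e' }

package Greeter::Baz;
our @ISA = ('Greeter::Regular');
sub after_welcome { print "You has the baz!\n" }

package Greeter::Other;
our @ISA = ('Greeter');
sub farewell { print "Now go away!\n" }

package main;

# The only branching left: picking which collection of methods to use.
sub greeter_for {
    my ($foo) = @_;
    my $class = $foo eq 'bar' ? 'Greeter::Bar'
              : $foo eq 'baz' ? 'Greeter::Baz'
              :                 'Greeter::Other';
    return $class->new(foo => $foo);
}

my $foo = greeter_for('bar')->greet;    # $foo is now 'bare'
```

Adding a new case then means adding one small package and one line in greeter_for, without touching the skeleton in greet.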

There are some other techniques that are useful that I will but allude to at this point:

  • Leveraging production code to make tests more realistic and less duplicative
  • Code generation for when the code to be used actually requires redundancy

If there is enough interest in any of those, I may just follow up with a post on them, but the two that I covered here are the ones that I find most widely applicable day to day.

Comments

maybe_run

Any reason why not:

sub maybe_run { $_[0] && $_[0](); }

Shorter, no need to shift() (irrelevant probably unless you're hammering the heck out of it, but, why not).

I appreciate the thought, and didn't feel great about the shift(), but I was shooting less for optimization than for readability, including to people who may not be too familiar with Perl. Also, I try to avoid using @_ where possible to avoid side effects. Moreover, in the spirit of the post, the index 0 is in this case a magic literal which is repeated, so it would be a setback in my quest to ERIOL.