Tuesday, March 1, 2011

Label Considered Harmful

Code is straightforward and logical, but because it's written by human beings, it's as vulnerable to superstition as anything else. One of the most detrimental superstitions is the fear of early return. Convoluted nested ifs and pointless temporary flags are created just to avoid early returns. Early returns are feared because they are believed to be the same as gotos, but more awful code has been written to avoid early returns than was ever written with the goto statement. Although the return statement is superficially like a goto, it was never the goto statement that was the real problem with goto programs. If we look at why the rule forbidding goto was made, we'll see the real culprit.
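To make the contrast concrete, here is a minimal sketch of the same check written both ways. The function names and the digit-parsing task are hypothetical, made up only for illustration; the point is the shape of the code, not the task.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical task: return the value of a one-digit decimal string,
   or -1 if the input is not exactly one digit. */

/* Flag-and-nesting style, written to keep a single exit point. */
int digit_value_single_exit(const char *s)
{
    int result = -1;
    bool ok = true;   /* temporary flag that exists only to avoid returning */

    if (s == NULL) {
        ok = false;
    }
    if (ok) {
        if (s[0] >= '0' && s[0] <= '9') {
            if (s[1] == '\0') {
                result = s[0] - '0';
            }
        }
    }
    return result;
}

/* Early-return style: each guard rules out one case and leaves. */
int digit_value_early_return(const char *s)
{
    if (s == NULL) return -1;
    if (s[0] < '0' || s[0] > '9') return -1;
    if (s[1] != '\0') return -1;
    return s[0] - '0';
}
```

Both functions give the same answers. In the second one, every line past a guard can assume the guard's condition was false, which is the kind of reasoning the rest of this post relies on.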

Superstition begins when a rule's reason is forgotten and the rule is blindly obeyed. Most programmers today know that goto is wrong, but they have never worked on a goto program, so they don't know why goto is forbidden. The answer has to do with readability and information hiding. Each statement in a program changes the state of the program. How easy it is to understand the state change depends on which statement is executed. Let's look at some examples.

b = 10;
c = a - b;
if (a <= 25) return;

After these statements execute, we know that a is greater than 25, b is ten, and c is ten less than a, so c is greater than 15. Notice that the return statement actually increased our knowledge about program state, by eliminating every value of a that is 25 or less. If we replace the return statement with a goto statement, we can reason similarly about program state.

b = 10;
c = a - b;
if (a <= 25) goto label1;

The goto statement is not quite as meaningful as return. A return means we're done with whatever the current method is trying to do, so if the current method has a meaningful name, we understand what the return means. A goto has different meanings depending on where it goes. Is it going back in the program to try to get a better value of a? Is it going to a special routine to calculate something different? Is it skipping over the next chunk of code because it doesn't have a good value for a? But even with this ambiguity in the goto statement, we can still reason fairly well about program state.
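One way to check this reasoning is to write it down as assertions right after the goto. The sketch below wraps the three statements in a hypothetical function, after_goto; the -1 sentinel and the body of label1 are assumptions added only so the snippet compiles and runs.

```c
#include <assert.h>

/* Hypothetical wrapper around the three statements from the example.
   Assumes a is nowhere near INT_MIN, so a - b cannot overflow. */
int after_goto(int a)
{
    int b = 10;
    int c = a - b;
    if (a <= 25) goto label1;

    /* Everything we worked out about the state past the goto. */
    assert(a > 25);
    assert(b == 10);
    assert(c == a - 10);
    assert(c > 15);
    return c;

label1:
    return -1;   /* sentinel chosen for this sketch */
}
```

None of the assertions can fire, whatever value of a is passed in. The line after the goto is reached only when the condition was false, exactly as with the return version.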

Now look at the label statement, the statement that gets executed after a goto.

  b = 10;
  c = a - b;
label2:

After label2 is executed, what do we know about program state? Nothing. The statement 'goto label2' could be anywhere in the program and before that statement a, b and c could be set to anything. We would have to find every instance of 'goto label2' in the program and read the code around it before we could have any idea what the program state is. This could be exceedingly hard in languages where labels are numbers and can be computed. This is what makes goto programming so difficult to understand and debug. It was never the goto statement itself that caused these problems; instead it was the passive label that received the goto.
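Here is a small sketch of the problem in C. The function value_at_label and its take_shortcut parameter are hypothetical. C only lets a goto reach a label inside the same function, so this is a tame version of what the older languages allowed, but the effect on reasoning is the same.

```c
/* Hypothetical sketch: one label, reached both by falling through
   and by a goto from elsewhere in the function. */
int value_at_label(int a, int take_shortcut)
{
    int b = 0;
    int c = 0;

    if (take_shortcut) {
        b = 99;          /* state set up somewhere else entirely */
        c = -1;
        goto label2;     /* jumps over the two assignments below */
    }

    b = 10;
    c = a - b;

label2:
    /* Reading only the two assignments above the label, we would expect
       c == a - 10 here. That holds only when no goto got us here. */
    return c;
}
```

Calling value_at_label(30, 0) returns 20, as the two assignments above the label suggest. Calling value_at_label(30, 1) returns -1, and nothing near the label hints at that.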

And here's where the human mind gets into trouble. Because people naturally pay more attention to an action like goto than to a passive statement like label, all the negative press is directed toward goto. But the real problem with unstructured programming is having spots in the program that can be reached from anywhere else in the program.
