Categories
C Correctness Cplusplus Programming

“No-else-after-return” considered harmful

(Update below)

At the time of writing this, the Mozilla Coding Style guidelines have this recommendation under “General C/C++ Practices”:

Don’t put an else right after a return. Delete the else, it’s unnecessary and increases indentation level.

I can appreciate this in some circumstances, such as when checking for an error case early in a long function:

void f()
{
    if (errorcase())
        return;

    a();
    b();
    c();
    d();
}

But as a blanket guideline I think it’s misguided because it promotes an impure style of programming.  And by “impure” here I mean the opposite of “pure” in this sense, i.e. “free from side-effects and other such error-prone junk”.  Let’s consider a simple example:  the max() function.  Here’s a way to write it that is quite sensible, in my opinion:

int max(int a, int b)
{
    if (a > b)
        return a;
    else
        return b;
}

But this violates the no-else-after-return guideline.  The Mozilla way to write this is like so:

int max(int a, int b)
{
    if (a > b)
        return a;
    return b;
}

(I’m not exaggerating, see this comment and this comment from Mozilla’s VP of Engineering and CTO respectively, about an example very similar to this.)

Urk.  That’s horrible.  Any time you’re using an ‘if’ without a corresponding ‘else’, you’re relying on an impure feature — you must be doing something with a side-effect in the ‘then’ branch, or some kind of impure control flow.  Mozilla’s coding guidelines insist that we write max(), a pure function, in an impure way.  Also, it’s just harder to read because the two return statements have different indentation levels despite the fact that they are closely related.

In this case there’s a way around the problem:

int max(int a, int b)
{
    return (a > b ? a : b);
}

Even though C’s if-then-else expression syntax is awful, it’s a better way to do it.

In general, statements suck, expressions rule.  Even in a language like C/C++ you can do something approaching functional programming by avoiding statements where possible.  In a tiny example like max() the difference may seem trivial, but in bigger examples the difference becomes more important.  Consider the following code from Nanojit’s register allocator which picks a register by looking at the free registers, the registers that are allowed to be used, and possibly the preferred registers.  You don’t have to understand it closely, but it helps to know that RegisterMask is just a bitfield with one bit per register.

Register Assembler::registerAlloc(RegisterMask allow)
{
    RegAlloc &regs = _allocator;
    RegisterMask prefer = SavedRegs & allow;
    RegisterMask free = regs.free & allow;

    RegisterMask set = prefer;
    if (set == 0) set = allow;

    if (free)
    {
        // at least one is free
        set &= free;

        // ok we have at least 1 free register so let's try to pick
        // the best one given the profile of the instruction
        if (!set)                                                             
        {
            // desired register class is not free so pick first of any class
            set = free;
        }
        NanoAssert((set & allow) != 0);
        Register r = nRegisterAllocFromSet(set);
        regs.used |= rmask(r);
        return r;
    }

    // ... more code here ...
}

I find this code incredibly hard to understand.  There are three ‘ifs’ that lack ‘elses’, so it relies heavily on state updates to do its work.  In fact, in order to understand it, I ended up rewriting it in a more pure style, by using a sequence of small refactorings:

Register Assembler::registerAlloc(RegisterMask allow)
{
    RegAlloc &regs = _allocator;
    RegisterMask allowedAndFree = allow & regs.free;

    if (allowedAndFree)
    {
        // At least one usable register is free -- no need to steal.  
        // Pick a preferred one if possible.
        RegisterMask preferredAndFree = allowedAndFree & SavedRegs;
        RegisterMask set = ( preferredAndFree ? preferredAndFree : allowedAndFree );
        Register r = nRegisterAllocFromSet(set);
        regs.used |= rmask(r);
        return r;
    }
    // ... more code here ...
}

There’s now only a single ‘if’ lacking an ‘else’, and it turns out even that can be removed by putting the “… more code here …” part into an else branch, and then the ‘return r;’ can be moved to the end of the function.

The code is now comprehensible — it picks a register that is free and allowed, and preferred as well if possible.

Part of the art of programming involves knowing when to avoid using a language feature.  Everybody knows that ‘goto’ is occasionally highly useful, but should be avoided whenever something less dangerous (a for-loop, a while-loop, a ‘break’ statement) can be used.  And everybody knows it because everybody is taught it.  But programmers aren’t taught to avoid unnecessary side-effects… at least, I wasn’t, certainly not via examples such as those I’ve given above.  I’ve only come to realise these things as I’ve switched several times between using C/C++ and some pure functional languages over the years, and come to appreciate how purity helps make programs more readable and less error-prone.

To come full circle, I’d like to see the no-else-after-return guideline removed from the Mozilla style guide, or at the very least, be qualified in a way that allows the cases I’ve argued for.  I’d also love to see a guideline (with examples) added such as “prefer expressions to statements”.

Update:

(The arguments had the wrong names in the initial post, I’ve fixed them now.)

All three versions of max() are pure, because it’s possible to combine impure pieces of code (such as an else-less if) into a pure piece of code.  So I’ll augment my recommendation: pure code is preferable, and furthermore, given that C/C++ makes it so easy to write impure code, it’s preferable that pure code be obviously pure. I argue that the 3rd version of max() with the ?: operator is the most obviously pure, because a ?: is pure if its operands are pure.  In comparison, it’s harder to tell if an if-then-else is pure (although this case is still pretty easy thanks to the matching return statements).  And an if-without-else almost never results in pure code;  the max() example above is a rare exception.

I also omitted a fourth formulation of max:

int max(int a, int b)
{
    int m;
    if (a > b)
        m = a;
    else
        m = b;
    return m;
}

For this very simple case, such a formulation is overkill.  But if the ‘then’ and ‘else’ cases are more complicated it makes more sense.  (Furthermore, much of this discussion should be thought of in more complex cases, as these kinds of small decision can make a bigger difference then.)  This version also doesn’t violate the no-else-after-return rule.  So perhaps that rule isn’t as bad as I thought, if only because it’s easier to work around than I thought… but I still think rewriting the 1st version of max() into the 2nd version is a bad idea.

19 replies on ““No-else-after-return” considered harmful”

Your cleanup of registerAlloc is certainly a great improvement, but it’s not directly related to else-after-return. “max” doesn’t prove your point either, since ?: is fine syntax there.

Personally I like being able to read a function and see various cases being peeled off and handled with early returns. If they don’t use early returns, storing the result in a local variable instead, then I have to read to the bottom of the function to check if anything else happens before the value is returned.

So I think early returns are worthwhile, and I’m not yet convinced there are situations where allowing ‘else’ after an early return is necessary for optimal readability.

Your cleanup of registerAlloc is certainly a great improvement, but it’s not directly related to else-after-return.

True… I segued into a broader illustration of the benefits of purity. The relevance of that was that no-else-before-return pushes in the impure description.

“max” doesn’t prove your point either, since ?: is fine syntax there.

But reviewers aren’t saying “use ?:, please”, they’re saying “remove the else, please”. Maybe a guideline “use ?: where possible” would help. That would be a step in the “prefer expressions to statements” direction. (Obviously you’d need to qualify it further, as using ?: isn’t great if the operands themselves have side-effects.)

Really, “prefer pure code to impure code” is what I’m after. And no-else-after-return violates that.

You’re not programming in Haskell, here. All C is impure. All expressions are statements.

But that’s beside the point. You’re on the wrong track. One should prefer total functions, and total functions are easier to write in C if you follow the “mozilla” approach. And that approach tries not to hide returns.

Your example implementations of max are total functions, but as code grows in complexity, it gets harder to use if/else structures with nested returns as a means to ensure totality.

The problem, of course, is that as control flow graphs grow more complicated, a return can become overlooked, making it hard to know how the program finishes execution. Since most functions in a complicated project are themselves complicated, a rule like this makes perfect sense.

When I program in C, I use techniques like this to keep track of how a routine returns. This helps me keep functions total.

When I program in Haskell, my means to achieve totality and purity are much greater, but in C, we must use design patterns like this mozilla guideline to help us along.

All C is impure. All expressions are statements.

I totally disagree with that, but moving on…

One should prefer total functions, and total functions are easier to write in C if you follow the “mozilla” approach. And that approach tries not to hide returns.

Your example implementations of max are total functions, but as code grows in complexity, it gets harder to use if/else structures with nested returns as a means to ensure totality.

I agree that total functions are a good thing. If you’re nesting returns inside conditionals in non-trivial ways then you’re asking for trouble; relying on a catch-all return at the end of a function is a terrible idea.

I agree that there can be times when else-after-return looks desirable, such as your max() example. Though IMO this is generally only true when the code is short and trivial, so it’s hard for me to get worked up about either form.

I think the rule has clear benefits for more complex code, so I wouldn’t want to see it done away with entirely.

I don’t like early returns. If I can’t use the conditional assignment operator ?: I use a local variable whenever possible with the return statement at the very end of the body, which makes my life a lot easier because most of the time I need to use heavy exception handling with a lot of logging and having the return value at hand simplifies the process. It is true you have to read the hole body to find out what happens but, and I’m talking only about what I do, only in a few cases (like UI coding) method bodies get long enought to be a problem, and that includes up to 80% of validation and error handling code, which is usually more obscure and harder to read. On my side of things, any function body with enought lines to overflow my screen is probably wrong.

That is from the human point of view. Unless your compiler really sucks there is no significant difference in compiled code if you use the “else”, only a two-byte jmp opcode in IA32 before the else block wich doesn’t interfere with the CPU cache prefetching, and only if there is any code after the else block. In the a > b case discussed in the post the compiled code is absolutely identical but if I recall correctly most C compiler implementations have more room for optimization when the ?: operator is used, which don’t apply in the simple case of a > b but is a good thing to know if the conditions are a bit more complex.

In terms of coding I’m used to think in terms of how the code will run instead of how the code will look, so in this case there is no difference for me between using the else or not because there is no difference for the compiler. So it reduces to the form which need less keystrokes and that means using the conditional assignment.

I don’t normally comment on philosophical programming opinions, but I’ll bite here.

I agree with Mozilla style guidelines on this one. The IF without the ELSE is what I have typically called a “guard statement”, as in it “guards” against a particular condition.

Guards are usually short and early. I’m usually against returning early from a method, as I’ve debugged way too many junior programmers code who returned at arbitrary points in the midst of execution. Personally, I usually have a “result” variable as early in the method as I can and then set it accordingly. But there are cases where returning-early/guards are useful and add readability to your code.

I think it’s important to add whitespace after a guard – it separates it out, makes it almost like a “trap, which your second code example lacks.

I’ll be the first to admit – I’m a nobody, my opinion counts for pretty much nothing, and you could probably school me in software design, but I thought I would toss my 2 cents in.

i would write the max() method without the else.
it’s shorter and at least for me, it’s as clear as the other one, maybe even clearer (less to parse).

I’ve come to the realization in recent times that I should write code the way I’d explain what the function does. So if I’d say “If a is greater than b, return a, otherwise return b.” I’d encode it with an else, if I’d say “Check if a is greater than b and if so, return it. Then return b.” I’d code it without an else.
This way, I feel the readability of my code really improves, especially when later trying to understand why the code does what it does r when trying to explain the code to someone else.

I’ve worked on somebody’s code that particularly liked the (awful) nested if-then-else style… Even worse: the error case was managed in the ‘else’ part, so to see if the error case was managed I had to scroll down.

Even before that bad experience, I prefered early return, because it means “ok, this case has been processed, you don’t need to bother anymore”, and lets you go on without that weight on your shoulders.

Now I use this 99% of the cases (and use G_LIKELY sometimes when there may be a performance issue):

if G_UNLIKELY (error == TRUE)
{
/* process that error */
return error_code;
}

/* more code here… */

liberforce, I said in the post that I think early returns are fine for error cases like you mention. It’s for other, non-error cases (eg. in max()) that I’m concerned about.

Benjamin, the reason we say “otherwise” there is that conversation is line writing entire programs on a single line. 🙂 We have the luxury of newlines when programming, so the “otherwise” in that case is no longer necessary. (Of course when you explain it to someone else you may have to translate slightly to add an “otherwise”, but then again, you’re hardly going to say “switch on the value type” if you’re explaining a piece of code, so you have to do some of that already.)

Benjamin Otte is right on the spot. There is a good reason behind using reserved words having an actual meaning in some spoken language. Even a skilled programmer at some point has to stop and think about the linear steps required to perform an action before actually writing the code and having language constructs with resemblance to spoken language helps writing better code and also reading someone else’s. That is probably why Pascal is so easy to learn for pepole new to programming. So I don’t think the else makes the code any harder to read and understand, specially when is has no damaging impact in the compiled output.

I’ve just sent a survey to some senior CS students and I expect to have the results by Friday night. This is definitely a funny thing to ask to other CS students and I might do just that. 😉

Comments are closed.