The Cost of If-Else
The if-else
keyword is not free. So, how expensive is it? Today’s article finds out.
For lots of background on why if-else
and other so-called “branch” instructions can be costly, see the Wikipedia article on branch prediction. In short, the CPU is trying to predict which branch of your code to take. When you write an if-else
, there are two sets of code that it might execute.
Now to find out how expensive the branch is. Let’s look at two loops that do the same thing: compute the sum of a series of numbers. It’s pretty useless in the real world, but it serves to show the cost of branching pretty well. Here’s the first loop that doesn’t us branching:
private function noBranch(reps:int): int { var sum:int; for (var i:int; i < reps; ++i) { var cur:int = i & 1; sum += cur; } return sum; }
At each iteration, the low bit of the loop value i
is retrieved and added to the sum. That’s it. It’s very straightforward. Now, let’s throw in an if-else
:
private function branch(reps:int): int { var sum:int; for (var i:int; i < reps; ++i) { var cur:int = i & 1; if (cur > 0) { sum += cur; } else { sum += cur; } } return sum; }
This loop also takes the low bit of i
and adds it to sum
, but there’s a pointless if-else
thrown in so we’ll have something to measure. Notice that both the true (if
) and false (else
) cases do the exact same thing as each other and as the previous function.
Now that we’ve set this up, here’s a little test app that calls each function with a high reps
value.
package { import flash.display.*; import flash.utils.*; import flash.text.*; public class BranchCost extends Sprite { private var logger:TextField = new TextField(); private function row(...cols): void { logger.appendText(cols.join(",") + "\n"); } public function BranchCost() { stage.align = StageAlign.TOP_LEFT; stage.scaleMode = StageScaleMode.NO_SCALE; logger.autoSize = TextFieldAutoSize.LEFT; addChild(logger); init(); } private function init(): void { const REPS:int = 1000000000; var i:int; var j:int; var beforeTime:int; var afterTime:int; var sum:int; row("Operation", "Time"); beforeTime = getTimer(); sum = noBranch(REPS); afterTime = getTimer(); row("No Branch", (afterTime-beforeTime)); beforeTime = getTimer(); sum = branch(REPS); afterTime = getTimer(); row("Branch", (afterTime-beforeTime)); } private function noBranch(reps:int): int { var sum:int; for (var i:int; i < reps; ++i) { var cur:int = i & 1; sum += cur; } return sum; } private function branch(reps:int): int { var sum:int; for (var i:int; i < reps; ++i) { var cur:int = i & 1; if (cur > 0) { sum += cur; } else { sum += cur; } } return sum; } } }
I tested this app using the following environment:
- Release version of Flash Player 13.0.0.206
- 2.3 Ghz Intel Core i7-3615QM
- Mac OS X 10.9.2
- Google Chrome 34.0.1847.131
- ASC 2.0.0 build 354130 (
-debug=false -verbose-stacktraces=false -inline -optimize=true
)
And got these results:
Operation | Time |
---|---|
No Branch | 2080 |
Branch | 2321 |
The version with the branch (if-else
) is taking about 12% longer to perform the same task. While this isn’t a real-world case, there are plenty of examples where you should consider removing a branching statement. For example, you might think you have an optimization by not adding zero to a sum:
if (val != 0) { sum += val; }
However, it’s likely that the if-else
is taking longer than just adding zero. Since adding zero has no effect, why not just do it and save some time?
Try to keep branch instructions in mind while writing performance-critical code. They’re often necessary, but never free. You just may realize a nice speedup if you can find a way around using them.
Spot a bug? Have a question or suggestion? Post a comment!
#1 by henke37 on May 5th, 2014 ·
A smart compiler would merge the two blocks due to them being identical and then remove the condition expression due to it lacking side effects. The actionscript compiler is dumb.
#2 by Tronster on May 5th, 2014 ·
henke37, you are right that some compilers could detect the redundancy and strip the if/else; collapsing the two code blocks into one. But as what is being tested is the cost of if/else, it’s good that the compiler does not partake in that optimization; otherwise another strategy would need to be employed.
In the bulk of real world situations (whether AS3, C++, C#, etc…) an if/else statement isn’t setup as a contrived computation and will incur a cost as it cannot be optimized away by a compiler. I love the fact the time was taken to perform this test, showing the impact of a fundamental programming concepts within an Actionscript environment.
#3 by Shams on May 13th, 2014 ·
I’ve seen many statistical analysis series of your blog about AS 3.0 optimizations and command’s cost. Keep going Jackson! Your posts are really interesting.
Thanks
#4 by Joihn on July 23rd, 2014 ·
I think it makes sense. The compiler spends time on that extra instruction IF… Without branching we only had a simple sum (1 operation =~ 2100ms). With branching there is an extra instruction (2 instructions, 1 IF and 1 sum =~ 2300ms) which means +12% processing time.
But there is also something interesting here: Operation ADD takes 2100 ms, while the IF only takes 200 ms (~10 times faster), the 12%.
Anyways, I don’t agree with the final conclusion. You said “if (val != 0) { add_sum }” is more efficient without the IF. Without the IF, the add instruction is always executed (1 instruction =~ 2100 ms). With the IF we will have on the WORST case 2 instructions (IF + sum =~2300 ms) and in the best case there will be only 1 instruction (IF instuction =~200ms) and we will be saving the ADD operation 2100ms……….
NoBranching = 2100 ms always. Branching = 200 ms (Best Case), 2300 ms (Worst Case). OF course, the programmer should take a decission depending on how many times the If condition will be true and he will be on the Best Case. For example, in case that the condition is true 50% of the time, we will have an average total processing time of 200*0.5 + 2300*0.5 = 1250ms avg time, which is a lot better than 2100 without the IF branching. In case the IF condition were true let’s say just 10% of times: 200*0.1 + 2300*0.9 = 20 + 2070 = 2090 ms (which is still better than removing the IF!!!!!). Of course, if the condition would be true in a very small percentage like 1%-2%, then I would say yes to remove the IF of this example.
#5 by jackson on July 23rd, 2014 ·
The total loop time for branching was 200 ms longer than the total loop time for not branching. But the loop consists of more than just an add and a branch. It also has the
for
loop itself. If you run an emptyfor
loop, you’ll see that it takes most of the time. I just re-ran the test on the same machine and got 1862 ms for an empty loop.Even with these three parts identified, the loop time is not a simple sum of its parts due to branch prediction. We can’t simply add “
for
+ add + branch” to get a total loop time. Even if we could, the no-branch (2080 ms) minus the empty loop (1862 ms) is 218 ms. The branch (2321 ms) minus the no-branch (2080 ms) is 241. That would mean that the cost of the add is 218 ms and the cost of the branch is 241, so branching would still be slower.#6 by Joihn on July 25th, 2014 ·
Great. Is nice to learn new stuff. I found your site interesting and useful. Btw, it would be cool to have that example also tested:
val = // random value, or something like i % 5
if (val != 0)
{
sum += val;
}
vs.
val = // random value, or somthing like i % 5
sum += val;