Conditionals Performance
Now that the Flash Player 10.1 testing is through I can return to a comment asking about the performance difference between if-else chains and the ternary (? :) operator. Further, I’ll discuss switch statements to see if there is any difference in performance for these commonly-used methods of flow control.
All AS3 programmers make heavy use of if-else chains, and many also use switch statements and the ternary (? :) operator. Given their essential nature, it’s important to know the performance differences, if any, between them. Since none of these constructs appears to exist at the bytecode level, it would seem that all of them should compile down to the same bytecode built from conditional jumps/branches. Consider the following very simple functions designed to be easily read in bytecode:
private function ifElse6(val:int): void
{
    if (val == 0) {
        func0();
    } else if (val == 1) {
        func1();
    } else if (val == 2) {
        func2();
    } else if (val == 3) {
        func3();
    } else {
        func4();
    }
}

private function ternary6(val:int): void
{
    val == 0 ? func0() : val == 1 ? func1() : val == 2 ? func2() : val == 3 ? func3() : func4();
}

private function switch6(val:int): void
{
    switch (val)
    {
        case 0: func0(); break;
        case 1: func1(); break;
        case 2: func2(); break;
        case 3: func3(); break;
        default: func4();
    }
}
Each of them is simply calling a function based on the value of val, be it 0, 1, 2, 3, or something else. The switch version is arguably the cleanest and the ternary arguably the hardest to read since it’s usually a bad idea to have such a deeply-nested ternary statement. The if-else version is somewhere in the middle as it is straightforward but verbose. Let’s start analyzing the generated bytecode with this if-else version:
function private::ifElse6(int):void /* disp_id 0*/
{
// local_count=2 max_scope=1 max_stack=2 code_len=67
0 getlocal0
1 pushscope
2 getlocal1
3 pushbyte 0
5 ifne L1
9 getlocal0
10 callpropvoid private::func0 (0)
13 jump L2
L1:
17 getlocal1
18 pushbyte 1
20 ifne L3
24 getlocal0
25 callpropvoid private::func1 (0)
28 jump L2
L3:
32 getlocal1
33 pushbyte 2
35 ifne L4
39 getlocal0
40 callpropvoid private::func2 (0)
43 jump L2
L4:
47 getlocal1
48 pushbyte 3
50 ifne L5
54 getlocal0
55 callpropvoid private::func3 (0)
58 jump L2
L5:
62 getlocal0
63 callpropvoid private::func4 (0)
L2:
66 returnvoid
}
This bytecode is nearly as straightforward as the AS3 code it was compiled from. It’s simply a sequence of conditional jumps that skip over blocks of code (i.e., the function calls) where val doesn’t pass the equality test, followed by a jump to the end of the function once val does match.
Since the ternary version should theoretically just be syntax sugar, let’s see how MXMLC compiles it:
function private::ternary6(int):void /* disp_id 0*/
{
// local_count=2 max_scope=1 max_stack=2 code_len=71
0 getlocal0
1 pushscope
2 getlocal1
3 pushbyte 0
5 equals
6 iffalse L1
10 getlocal0
11 callpropvoid private::func0 (0)
14 jump L2
L1:
18 getlocal1
19 pushbyte 1
21 equals
22 iffalse L3
26 getlocal0
27 callpropvoid private::func1 (0)
30 jump L2
L3:
34 getlocal1
35 pushbyte 2
37 equals
38 iffalse L4
42 getlocal0
43 callpropvoid private::func2 (0)
46 jump L2
L4:
50 getlocal1
51 pushbyte 3
53 equals
54 iffalse L5
58 getlocal0
59 callpropvoid private::func3 (0)
62 jump L2
L5:
66 getlocal0
67 callpropvoid private::func4 (0)
L2:
70 returnvoid
}
This version is very similar to the if-else version, but unfortunately involves more stack access as it keeps using equals then iffalse rather than directly using ifne. This would be like writing in AS3 if ((val == 3) == true) rather than the much more common if (val == 3). They both have the same effect, but one pointlessly uses more instructions.
The switch statement is quite fancy in AS3 compared to, for example, C/C++ or Java as it can work on non-integer types and, unlike C#, supports falling through even when there is code in a case. Let’s see how this translates into bytecode:
function private::switch6(int):void /* disp_id 0*/
{
// local_count=3 max_scope=1 max_stack=2 code_len=140
0 getlocal0
1 pushscope
2 jump L1
L2:
6 label
7 getlocal0
8 callpropvoid private::func0 (0)
11 jump L3
L4:
15 label
16 getlocal0
17 callpropvoid private::func1 (0)
20 jump L3
L5:
24 label
25 getlocal0
26 callpropvoid private::func2 (0)
29 jump L3
L6:
33 label
34 getlocal0
35 callpropvoid private::func3 (0)
38 jump L3
L7:
42 label
43 getlocal0
44 callpropvoid private::func4 (0)
47 jump L3
L1:
51 getlocal1
52 setlocal2
53 pushbyte 0
55 getlocal2
56 ifstrictne L8
60 pushbyte 0
62 jump L9
L8:
66 pushbyte 1
68 getlocal2
69 ifstrictne L10
73 pushbyte 1
75 jump L9
L10:
79 pushbyte 2
81 getlocal2
82 ifstrictne L11
86 pushbyte 2
88 jump L9
L11:
92 pushbyte 3
94 getlocal2
95 ifstrictne L12
99 pushbyte 3
101 jump L9
L12:
105 jump L13
109 pushbyte 4
111 jump L9
L13:
115 pushbyte 4
L9:
117 kill 2
119 lookupswitch default:L7 maxcase:4 L2 L4 L5 L6 L7
L3:
139 returnvoid
}
Well, this version sure is different! Even though it has the same effect as the other two versions, MXMLC produces bytecode that’s about twice as long (code_len=140 versus 67 and 71) and makes use of some special instructions like lookupswitch. It starts off by jumping past all of the case blocks and then uses yet another approach to jump/branch logic compared to the if-else and ternary versions. Here, similar to the if-else version, a single compare-and-branch instruction (ifstrictne) is used rather than the separate boolean test that the ternary version used. Still, it’s using ifstrictne instead of the plain ifne, which is analogous to using the AS3 !== operator instead of !=. This isn’t necessary since the values being compared are simply of int type, but we’ll have to wait and see if it contributes to any performance degradation. Regardless, this jumping/branching doesn’t actually jump into the case blocks. Instead, it pushes a case index that serves as the argument to the lookupswitch instruction, which does the actual work of deciding which case block should be executed.
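To make that two-stage structure easier to see, here is a rough AS3 sketch of what the switch bytecode is doing. It is only an illustration under assumptions: the SwitchSketch class, the switch6Equivalent function, and the jumpTable array are hypothetical names, and an Array of function references merely stands in for the branch table that lookupswitch uses. It is not code that MXMLC generates.

```actionscript
package
{
    import flash.display.Sprite;

    // Hypothetical demo class, not part of the original test
    public class SwitchSketch extends Sprite
    {
        public function SwitchSketch()
        {
            // 99 is included to exercise the default case
            for each (var val:int in [0, 1, 2, 3, 4, 99])
            {
                switch6Equivalent(val);
            }
        }

        private function switch6Equivalent(val:int): void
        {
            // Stage 1: the ifstrictne chain maps val to a case index
            var index:int;
            if (val === 0) {
                index = 0;
            } else if (val === 1) {
                index = 1;
            } else if (val === 2) {
                index = 2;
            } else if (val === 3) {
                index = 3;
            } else {
                index = 4; // default
            }

            // Stage 2: lookupswitch branches through a table using that index.
            // An Array of method references stands in for the table here.
            var jumpTable:Array = [func0, func1, func2, func3, func4];
            jumpTable[index]();
        }

        private function func0(): void { trace("func0"); }
        private function func1(): void { trace("func1"); }
        private function func2(): void { trace("func2"); }
        private function func3(): void { trace("func3"); }
        private function func4(): void { trace("func4"); }
    }
}
```

The point of the sketch is that the comparison chain still runs in full before the table lookup happens, so the table adds work on top of the chain rather than replacing it. The Array is rebuilt on every call purely for readability, so this sketch says nothing about performance.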
So how do all of the above differences manifest themselves in actual performance? Well, let’s look at a quick performance test:
package
{
    import flash.text.*;
    import flash.utils.*;
    import flash.display.*;

    public class ConditionalsTest extends Sprite
    {
        public function ConditionalsTest()
        {
            stage.scaleMode = StageScaleMode.NO_SCALE;
            stage.align = StageAlign.TOP_LEFT;

            var logger:TextField = new TextField();
            logger.autoSize = TextFieldAutoSize.LEFT;
            addChild(logger);
            function log(msg:*): void { logger.appendText(msg+"\n"); }

            var beforeTime:int;
            var afterTime:int;
            var i:int;
            const ITERATIONS:int = 50000000;

            // Time each conditional for a hit on the first, second, third,
            // fourth, and default cases
            for each (var val:int in [0,1,2,3,4])
            {
                log(val);

                beforeTime = getTimer();
                for (i = 0; i < ITERATIONS; ++i)
                {
                    if (val == 0) {
                        func0();
                    } else if (val == 1) {
                        func1();
                    } else if (val == 2) {
                        func2();
                    } else if (val == 3) {
                        func3();
                    } else {
                        func4();
                    }
                }
                afterTime = getTimer();
                log("\tIf-else: " + (afterTime-beforeTime));

                beforeTime = getTimer();
                for (i = 0; i < ITERATIONS; ++i)
                {
                    val == 0 ? func0() : val == 1 ? func1() : val == 2 ? func2() : val == 3 ? func3() : func4();
                }
                afterTime = getTimer();
                log("\tTernary: " + (afterTime-beforeTime));

                beforeTime = getTimer();
                for (i = 0; i < ITERATIONS; ++i)
                {
                    switch (val)
                    {
                        case 0: func0(); break;
                        case 1: func1(); break;
                        case 2: func2(); break;
                        case 3: func3(); break;
                        default: func4();
                    }
                }
                afterTime = getTimer();
                log("\tSwitch: " + (afterTime-beforeTime));
            }
        }

        private function func0(): void{}
        private function func1(): void{}
        private function func2(): void{}
        private function func3(): void{}
        private function func4(): void{}
    }
}
This test is designed to show the differences between hitting on the first attempt (val == 0), the second, third, fourth, and default cases. Here are the results:
| Environment | Conditional | val=0 (ms) | val=1 (ms) | val=2 (ms) | val=3 (ms) | val=4 (ms) |
|---|---|---|---|---|---|---|
| 3.0 GHz Intel Core 2 Duo, Windows XP | If-Else | 361 | 400 | 453 | 457 | 455 |
| 3.0 GHz Intel Core 2 Duo, Windows XP | Ternary | 360 | 405 | 437 | 457 | 456 |
| 3.0 GHz Intel Core 2 Duo, Windows XP | Switch | 443 | 468 | 504 | 526 | 502 |
| 2.0 GHz Intel Core 2 Duo, Mac OS X | If-Else | 702 | 737 | 750 | 892 | 858 |
| 2.0 GHz Intel Core 2 Duo, Mac OS X | Ternary | 702 | 737 | 753 | 892 | 859 |
| 2.0 GHz Intel Core 2 Duo, Mac OS X | Switch | 725 | 802 | 874 | 902 | 927 |
Here are some observations:
- Aside from overall speed, there seem to be no performance differences between the operating systems involved.
- We do see a general slowdown as val increased and it took more comparisons to find its match, regardless of the type of conditional used. It would have been nice if switch could have improved this, but this wasn’t expected from the bytecode above.
- The if-else and ternary versions are nearly identical from a performance standpoint. It seems as though the boolean test we saw in the bytecode doesn’t have much of any impact, especially on Mac OS X.
- Using switch is about 15% slower on Windows XP and roughly 7% slower on Mac OS X when the five cases in the table are totaled. Beware of switches in performance-critical code!
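One way to check the switch overhead against the table is to total the five timings for if-else and for switch in each environment and compare the sums. The sketch below does just that. The SlowdownSketch class and slowdownPercent helper are hypothetical names introduced for illustration, and totaling the cases is only one of several reasonable ways to summarize the numbers (per-case ratios vary more widely).

```actionscript
package
{
    import flash.display.Sprite;

    // Hypothetical helper class for checking the percentages in the table
    public class SlowdownSketch extends Sprite
    {
        public function SlowdownSketch()
        {
            // Times in milliseconds copied from the results table (val = 0..4)
            var winIfElse:Array = [361, 400, 453, 457, 455];
            var winSwitch:Array = [443, 468, 504, 526, 502];
            var macIfElse:Array = [702, 737, 750, 892, 858];
            var macSwitch:Array = [725, 802, 874, 902, 927];

            // Traces "Windows XP: 14.9%"
            trace("Windows XP: " + slowdownPercent(winIfElse, winSwitch).toFixed(1) + "%");
            // Traces "Mac OS X: 7.4%"
            trace("Mac OS X: " + slowdownPercent(macIfElse, macSwitch).toFixed(1) + "%");
        }

        // Percentage by which the candidate's total time exceeds the baseline's
        private function slowdownPercent(baseline:Array, candidate:Array): Number
        {
            var baseTotal:Number = 0;
            var candTotal:Number = 0;
            for (var i:int = 0; i < baseline.length; ++i)
            {
                baseTotal += baseline[i];
                candTotal += candidate[i];
            }
            return (candTotal / baseTotal - 1) * 100;
        }
    }
}
```

The totals are 2126 ms versus 2443 ms on Windows XP and 3939 ms versus 4230 ms on Mac OS X, which is where the two percentages come from.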
#1 by whitered on August 16th, 2010 ·
such a deeply-nested ternary statement can be pretty readable if written in this way:
val == 0 ? func0() :
val == 1 ? func1() :
val == 2 ? func2() :
val == 3 ? func3() :
func4();
considering that this is the only statement that returns a value, using it isn’t such a bad idea
#2 by jackson on August 16th, 2010 ·
Good point. I should also point out that readability is always subjective. To me, the if-else is still much more readable.
#3 by Alama on September 19th, 2010 ·
Lol; for me, it’s better switch readable, and the ternary syntax proposed by Whitered is very similar! Good idea for best performances switch ! ;-)
#4 by skyboy on October 20th, 2010 ·
Aside from taking an extra 100ms, the switch statement is actually faster between hits (20-40 ms instead of 40-60 ms).
I think this test should be redone with 20, 30 or more options to see which is faster in the long run, maybe even a test against Dictionary, Object or Class to see which is faster if all you’re doing is retrieving a value, or even running a function(s) in the case of a class, using getters.
#5 by jackson on October 21st, 2010 ·
I’m not sure I understand what you mean by “the switch statement is actually faster between hits”. I could certainly increase the number of cases beyond five, but it looks like there’s already a good demonstration that more cases means slower performance. Still, it may be interesting to see a graph of that performance as the number of cases increases. Perhaps I’ll do a followup article…
#6 by craig on August 10th, 2013 ·
very helpful. thanks for this!