Should You Bother Giving Variables a Type?
Many modern strongly-typed languages have introduced a way to skip writing out a variable’s type. In C#, you can use `var` instead of the actual type. In C++, you use `auto`. AS3 has a similar feature with its “untyped” type: `*`. In those other languages, `var` and `auto` are syntax sugar that the compiler replaces with the actual type. Will the AS3 compiler and/or Flash Player do the same for us? Today’s article finds out if it’s safe to skip the type and just use `*`.
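To make the "syntax sugar" claim concrete on the C++ side, here is a minimal sketch. The function names are hypothetical and exist only for illustration. The `static_assert` passes at compile time, which shows that the `auto` variable is a plain `int` with no runtime typing left behind.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper for illustration: counts up to `reps` using an
// iterator declared with `auto`.
int count_with_auto(int reps) {
    auto i = 0;  // deduced as int from the literal 0, at compile time
    static_assert(std::is_same<decltype(i), int>::value,
                  "auto was replaced by int during compilation");
    while (i < reps) {
        ++i;  // an ordinary int increment
    }
    return i;
}

// Runtime sanity check for the helper above.
void check_count_with_auto() {
    assert(count_with_auto(1000) == 1000);
}
```

Whether AS3's `*` gets the same treatment is a separate question, which the rest of the article tests.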
Let’s start with a trivial for loop:
    private function typed(): void {
        const REPS:int = 1000000;
        for (var i:int = 0; i < REPS; ++i) {
        }
    }
I compiled this using ASC 2.0.0 build 354130 (`-debug=false -verbose-stacktraces=false -inline -optimize=true`) and then used its `swfdump` tool to get this bytecode: (my annotations added)
    private function typed():void {
        // derivedName  typed
        // method_info  2
        // max_stack    2
        // max_regs     3
        // scope_depth  0
        // max_scope    1
        // code_length  29
        bb0 succs=[bb2]
        0   getlocal0       // get "this"
        1   pushscope       // push "this" to scope
        2   pushbyte 0      // push 0
        3   setlocal1       // set 0 to REPS (unnecessarily...)
        4   pushint         // push 1000000 (not shown for some reason)
        5   setlocal1       // set 1000000 to REPS
        6   pushbyte 0      // push 0
        7   setlocal2       // set 0 to i
        8   pushbyte 0      // push 0
        9   setlocal2       // set 0 to i (redundantly...)
        10  jump bb2        // go to bb2 block (for loop check)
        bb1 succs=[bb2]
        11  label           // start of loop body
        12  inclocal_i 2    // ++i
        bb2 succs=[bb1,bb3]
        13  getlocal2       // get i
        14  pushint         // push 1000000 (instead of REPS...)
        15  iflt bb1        // if i < 1000000, go to bb1 block (loop body)
        bb3 succs=[]
        16  returnvoid      // after loop just return void
    }
Now let’s try using the untyped (`*`) type for the loop iterator `i`:
    private function untyped(): void {
        const REPS:int = 1000000;
        for (var i:* = 0; i < REPS; ++i) {
        }
    }
And here’s its bytecode: (also annotated by me)
    private function untyped():void {
        // derivedName  untyped
        // method_info  3
        // max_stack    2
        // max_regs     3
        // scope_depth  0
        // max_scope    1
        // code_length  30
        bb0 succs=[bb2]
        0   getlocal0       // get "this"
        1   pushscope       // push "this" to scope
        2   pushbyte 0      // push 0
        3   setlocal1       // set 0 to REPS (unnecessarily...)
        4   pushint         // push 1000000 (not shown for some reason)
        5   setlocal1       // set 1000000 to REPS
        6   pushundefined   // push undefined (unnecessarily...)
        7   setlocal2       // set undefined to i
        8   pushbyte 0      // push 0
        9   setlocal2       // set 0 to i
        10  jump bb2        // go to bb2 block (for loop check)
        bb1 succs=[bb2]
        11  label           // start of loop body
        12  getlocal2       // get i
        13  increment       // increment i
        14  coerce_a        // coerce incremented value to an "atom"
        15  setlocal2       // set incremented, atomized value to i
        bb2 succs=[bb1,bb3]
        16  getlocal2       // get i
        17  pushint         // push 1000000 (not shown for some reason...)
        18  iflt bb1        // if i < 1000000, go to bb1 block (loop body)
        bb3 succs=[]
        19  returnvoid      // after loop just return void
    }
For a three-character change, quite a lot of changes have occurred in the bytecode. Setting `i` to `undefined` at first isn’t a very big deal since it’s immediately overwritten by the `0` value it is initialized to, but the process of incrementing it is quite different. Here’s the `int` version:
12 inclocal_i 2 // ++i
What could be faster than a single, specialized instruction? Probably not the untyped version:
    12  getlocal2       // get i
    13  increment       // increment i
    14  coerce_a        // coerce incremented value to an "atom"
    15  setlocal2       // set incremented, atomized value to i
That’s four times the instructions and none of them are specialized to integers.
Clearly, the compiler is not analyzing the code to figure out that `i` is really just an `int`, even though that’s how it’s initialized and that’s how this very simple function uses it.
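As a rough analogy for why the generic increment takes more steps, the sketch below models an untyped slot in C++ as a tag plus storage. This is an assumption-laden illustration, not how Flash Player actually represents values: the `Untyped` struct, the `Kind` enum, and every function name here are hypothetical. The typed loop increments an `int` directly, while the untyped loop has to read the slot, dispatch on its tag, re-box the result, and store it back.

```cpp
#include <cassert>

// Hypothetical model of an untyped value: a tag plus storage for each kind.
enum class Kind { Int, Number };

struct Untyped {
    Kind kind;
    union {
        int i;
        double d;
    };
};

Untyped make_int(int x) { Untyped v; v.kind = Kind::Int; v.i = x; return v; }
Untyped make_number(double x) { Untyped v; v.kind = Kind::Number; v.d = x; return v; }

// Generic read: the tag must be checked before the payload can be used.
double as_number(const Untyped& v) {
    return v.kind == Kind::Int ? static_cast<double>(v.i) : v.d;
}

// Generic increment: inspect the tag, compute, and re-box the result.
Untyped increment(const Untyped& v) {
    return v.kind == Kind::Int ? make_int(v.i + 1) : make_number(v.d + 1.0);
}

// Typed loop: one specialized operation per iteration.
int count_typed(int reps) {
    int i = 0;
    while (i < reps) {
        ++i;
    }
    return i;
}

// Untyped loop: get, increment, re-box, and set on every iteration.
int count_untyped(int reps) {
    Untyped i = make_int(0);
    while (as_number(i) < reps) {
        i = increment(i);
    }
    return static_cast<int>(as_number(i));
}
```

Both loops produce the same count. The difference is only in how much work each iteration does, which mirrors the gap between `inclocal_i` and the four-instruction sequence above.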
So, what kind of performance damage is this causing? The following performance test is designed to find out.
    package {
        import flash.display.*;
        import flash.utils.*;
        import flash.text.*;

        public class VariableTyping extends Sprite {
            private var logger:TextField = new TextField();

            private function row(...cols): void {
                logger.appendText(cols.join(",") + "\n");
            }

            public function VariableTyping() {
                stage.align = StageAlign.TOP_LEFT;
                stage.scaleMode = StageScaleMode.NO_SCALE;

                logger.autoSize = TextFieldAutoSize.LEFT;
                addChild(logger);

                var beforeTime:int;
                var afterTime:int;

                row("Operation", "Time");

                beforeTime = getTimer();
                typed();
                afterTime = getTimer();
                row("int", (afterTime-beforeTime));

                beforeTime = getTimer();
                untyped();
                afterTime = getTimer();
                row("Untyped", (afterTime-beforeTime));
            }

            private function typed(): void {
                const REPS:int = 1000000;
                for (var i:int = 0; i < REPS; ++i) {
                }
            }

            private function untyped(): void {
                const REPS:int = 1000000;
                for (var i:* = 0; i < REPS; ++i) {
                }
            }
        }
    }
I tested this app using the following environment:
- Release version of Flash Player 14.0.0.125
- 2.3 GHz Intel Core i7-3615QM
- Mac OS X 10.9.2
- Google Chrome 35.0.1916.153
- ASC 2.0.0 build 354130 (`-debug=false -verbose-stacktraces=false -inline -optimize=true`)
And got these results:
| Operation | Time (ms) |
|---|---|
| int | 2 |
| Untyped | 20 |
Plain and simple, that’s a 10x slowdown caused only by omitting a type. Those extra, generalized increment instructions are way slower than the one specialized instruction the compiler generates for you when you assign the variable a type.
In conclusion, AS3 has a feature that resembles C#’s `var` and C++’s `auto`, but the compiler does not replace it with the actual type, and the performance sacrifice it asks of you is huge. It’s best avoided in any performance-critical code.
Spot a bug? Have a question or suggestion? Post a comment!
#1 by Clark on July 14th, 2014 ·
Great article as always thanks!
Results:
Release Flash Player 14.0.0.125
3.4GHZ i5-4670k
Windows 7 Pro 64
Chrome 35.0.1916.153 m
Operation,Time
int,2
Untyped,12
#2 by jackson on July 14th, 2014 ·
Glad to see your results are within the normal range of variance depending on CPU, platform, etc. 6x is still plenty enough slowdown to deter anyone interested in performance from using `*`. Thanks for posting your results.
#3 by Thomas H. on July 14th, 2014 ·
Great article! A small thing: from 2 ms to 20 ms is not a 20x slowdown, “only” 10x.
#4 by jackson on July 14th, 2014 ·
Thanks for spotting the typo. I’ve updated the article with a fix.
#5 by henke37 on July 14th, 2014 ·
It is not like in C++ at all: you lose the type checking. It is more like a union+enum where you need to call an accessor function each time that you want to read or write the value. Which, unsurprisingly, is slower than a direct memory access.
#6 by jackson on July 14th, 2014 ·
Good point about losing the type checking. I didn’t mention that in the article. Instead, I wanted to show that the `*` “type” is the same in that you don’t need to specify a type and the code still works, but is not the same behind the scenes, resulting in a huge slowdown. The lack of type checking is one more reason not to use `*` as a direct replacement like you could in languages like C++ and C#.
#7 by gavin on July 15th, 2014 ·
as c++ 11 auto has been changed?