Typecasting: Part 3
Today’s article is a followup to an article (Cast Speed, itself a followup to Two Types of Casts) from September that continues to gather comments. Sharp-eyed reader fastas3 brought up a good point that warranted further investigation. So today we’ll take yet another look at typecasting in AS3 to try to unravel some of its strange mysteries.
fastas3’s comment concerned an anomaly: when testing cast performance with a class defined in AS3 rather than a native class, the function call-style cast (`MyClass(obj)`) runs a good deal faster. After confirming this by simply replacing all instances of `BitmapData` with `MyClass` (an empty class I had made), I was puzzled, so I started to dig into the problem. Upon inspecting the bytecode, I saw this for the “cast succeeds, function call-style” test’s loop body:
    134 findpropstrict flash.display::BitmapData
    136 getscopeobject 1
    138 getslot 7
    140 callpropvoid flash.display::BitmapData (1)
`getslot` is a telltale sign of an activation object at work. I knew right away that this was caused by my local `log` function, so I moved it out to the class level as you’ve seen in many of my more recent articles:
    private var __logger:TextField = new TextField();
    private function log(msg:*): void { __logger.appendText(msg+"\n"); }

    public function CastSpeed() {
        __logger.autoSize = TextFieldAutoSize.LEFT;
        addChild(__logger);
        // ...
    }
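For contrast, here is a minimal sketch of the nested-function layout that triggers an activation object. It is a simplified illustration rather than the original test code: the class name `LocalLogExample`, the `Sprite` cast, and the loop count are placeholders chosen for the example.

```actionscript
package {
    import flash.display.Sprite;
    import flash.text.TextField;
    import flash.text.TextFieldAutoSize;

    // Hypothetical example class, not part of the original test
    public class LocalLogExample extends Sprite {
        public function LocalLogExample() {
            var logger:TextField = new TextField();
            logger.autoSize = TextFieldAutoSize.LEFT;
            addChild(logger);

            // A nested function closes over the constructor's locals, so
            // the compiler stores those locals in an activation object.
            function log(msg:*): void {
                logger.appendText(msg + "\n");
            }

            var obj:Object = new Sprite();
            for (var i:int = 0; i < 10; ++i) {
                // With an activation object present, locals such as obj are
                // expected to be read through getscopeobject/getslot rather
                // than a plain getlocal.
                Sprite(obj);
            }
            log("done");
        }
    }
}
```

Dumping the bytecode for a class like this should show the same `getscopeobject`/`getslot` pairs in the loop body, assuming the compiler treats it the way it treated the test above.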
Running the test after this change yielded no change in the performance output, so I took another look at the bytecode. The activation objects were still there! I looked around for more local functions and saw none. Everything looked perfectly normal, except for the `try/catch` block necessary for testing the function call-style’s failure to cast. Could this be the cause of the try/catch slowdown? I decided to investigate by splitting each test out into its own function and stringing them together with `ENTER_FRAME` listeners. Immediately, this brought down the times of the three tests that weren’t using a `try/catch` block by a factor of three! To confirm that the activation objects were gone, I inspected the bytecode and saw this:
    51 findpropstrict flash.display::BitmapData
    53 getlocal3
    54 callpropvoid flash.display::BitmapData (1)
What a breath of fresh air to have all of those `getscopeobject` and `getslot` instructions replaced by a simple (and fast!) `getlocal`. But the work wasn’t yet done, as I still needed to compare casting with AS3 classes (e.g. the empty `MyClass`) with native classes (e.g. `Point`). Since I needed to test success and failure, I used `MyClass` and `MyOtherClass` (both empty) for AS3 classes and `Point` and `Rectangle` (both small and without side effects) for native classes. Here’s how the performance test came out:
    package {
        import flash.display.*;
        import flash.events.*;
        import flash.geom.*;
        import flash.text.*;
        import flash.utils.*;

        public class CastSpeed extends Sprite {
            private var __logger:TextField = new TextField();
            private function log(msg:*): void { __logger.appendText(msg+"\n"); }

            public function CastSpeed() {
                __logger.autoSize = TextFieldAutoSize.LEFT;
                addChild(__logger);

                addEventListener(Event.ENTER_FRAME, testAS3SuccessFunctionCall);
            }

            private function testAS3SuccessFunctionCall(ev:Event): void {
                removeEventListener(Event.ENTER_FRAME, testAS3SuccessFunctionCall);
                const ITERATIONS:int = 5000000;
                var mc:MyClass = new MyClass();
                log("AS3");
                log("\tCast succeeds:");
                var beforeTime:int = getTimer();
                for (var i:int = 0; i < ITERATIONS; ++i) {
                    MyClass(mc);
                }
                var afterTime:int = getTimer();
                log("\t\tFunction call style: " + (afterTime-beforeTime));
                addEventListener(Event.ENTER_FRAME, testAS3SuccessAs);
            }

            private function testAS3SuccessAs(ev:Event): void {
                removeEventListener(Event.ENTER_FRAME, testAS3SuccessAs);
                const ITERATIONS:int = 5000000;
                var mc:MyClass = new MyClass();
                var beforeTime:int = getTimer();
                for (var i:int = 0; i < ITERATIONS; ++i) {
                    mc as MyClass;
                }
                var afterTime:int = getTimer();
                log("\t\tAs keyword: " + (afterTime-beforeTime));
                addEventListener(Event.ENTER_FRAME, testAS3FailFunctionCall);
            }

            private function testAS3FailFunctionCall(ev:Event): void {
                removeEventListener(Event.ENTER_FRAME, testAS3FailFunctionCall);
                const ITERATIONS:int = 5000000;
                var moc:MyOtherClass = new MyOtherClass();
                log("\tCast fails:");
                var beforeTime:int = getTimer();
                for (var i:int = 0; i < ITERATIONS; ++i) {
                    try {
                        MyClass(moc);
                    } catch (err:TypeError) {
                    }
                }
                var afterTime:int = getTimer();
                log("\t\tFunction call style: " + (afterTime-beforeTime));
                addEventListener(Event.ENTER_FRAME, testAS3FailAs);
            }

            private function testAS3FailAs(ev:Event): void {
                removeEventListener(Event.ENTER_FRAME, testAS3FailAs);
                const ITERATIONS:int = 5000000;
                var moc:MyOtherClass = new MyOtherClass();
                var beforeTime:int = getTimer();
                for (var i:int = 0; i < ITERATIONS; ++i) {
                    moc as MyClass;
                }
                var afterTime:int = getTimer();
                log("\t\tAs keyword: " + (afterTime-beforeTime));
                addEventListener(Event.ENTER_FRAME, testNativeSuccessFunctionCall);
            }

            private function testNativeSuccessFunctionCall(ev:Event): void {
                removeEventListener(Event.ENTER_FRAME, testNativeSuccessFunctionCall);
                const ITERATIONS:int = 5000000;
                var pt:Point = new Point();
                log("Native");
                log("\tCast succeeds:");
                var beforeTime:int = getTimer();
                for (var i:int = 0; i < ITERATIONS; ++i) {
                    Point(pt);
                }
                var afterTime:int = getTimer();
                log("\t\tFunction call style: " + (afterTime-beforeTime));
                addEventListener(Event.ENTER_FRAME, testNativeSuccessAs);
            }

            private function testNativeSuccessAs(ev:Event): void {
                removeEventListener(Event.ENTER_FRAME, testNativeSuccessAs);
                const ITERATIONS:int = 5000000;
                var pt:Point = new Point();
                var beforeTime:int = getTimer();
                for (var i:int = 0; i < ITERATIONS; ++i) {
                    pt as Point;
                }
                var afterTime:int = getTimer();
                log("\t\tAs keyword: " + (afterTime-beforeTime));
                addEventListener(Event.ENTER_FRAME, testNativeFailFunctionCall);
            }

            private function testNativeFailFunctionCall(ev:Event): void {
                removeEventListener(Event.ENTER_FRAME, testNativeFailFunctionCall);
                const ITERATIONS:int = 5000000;
                var rc:Rectangle = new Rectangle();
                log("\tCast fails:");
                var beforeTime:int = getTimer();
                for (var i:int = 0; i < ITERATIONS; ++i) {
                    try {
                        Point(rc);
                    } catch (err:TypeError) {
                    }
                }
                var afterTime:int = getTimer();
                log("\t\tFunction call style: " + (afterTime-beforeTime));
                addEventListener(Event.ENTER_FRAME, testNativeFailAs);
            }

            private function testNativeFailAs(ev:Event): void {
                removeEventListener(Event.ENTER_FRAME, testNativeFailAs);
                const ITERATIONS:int = 5000000;
                var rc:Rectangle = new Rectangle();
                var beforeTime:int = getTimer();
                for (var i:int = 0; i < ITERATIONS; ++i) {
                    rc as Point;
                }
                var afterTime:int = getTimer();
                log("\t\tAs keyword: " + (afterTime-beforeTime));
            }
        }
    }
    class MyClass{}
    class MyOtherClass{}
Compiling this and running it via the release Flash Player plugin on a 2.4 GHz Intel Core i5 with Mac OS X 10.6, I got these results:
| Cast style (time in ms) | AS3 class, success | AS3 class, failure | Native class, success | Native class, failure |
|---|---|---|---|---|
| Function call | 11 | 11005 | 155 | 11193 |
| `as` keyword | 12 | 11 | 12 | 12 |
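To put these totals in per-cast terms, divide each time by the 5,000,000 iterations. This is rough arithmetic: it takes the `getTimer` values as milliseconds and does not subtract loop overhead, which is included in every figure.

```latex
\begin{aligned}
\text{function call, AS3 success:} \quad & \frac{11\ \text{ms}}{5{,}000{,}000} = 2.2\ \text{ns per cast} \\
\text{function call, native success:} \quad & \frac{155\ \text{ms}}{5{,}000{,}000} = 31\ \text{ns per cast} \\
\text{function call, AS3 failure:} \quad & \frac{11005\ \text{ms}}{5{,}000{,}000} \approx 2.2\ \mu\text{s per cast} \\
\text{as keyword, slowest case:} \quad & \frac{12\ \text{ms}}{5{,}000{,}000} = 2.4\ \text{ns per cast}
\end{aligned}
```

So a failed function call-style cast, including its `try/catch`, costs roughly a thousand times what a successful one does (11005 / 11 ≈ 1000), and a successful native cast costs about fourteen times what a successful AS3 cast does (155 / 11 ≈ 14). The 11 ms and 12 ms results are too close to tell apart at millisecond timer resolution.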
Even after removing the activation objects caused by the local `log` function and the `try/catch` block, we still see fastas3’s point: successful function call-style casts run faster on AS3 classes than on native classes. The bytecode behind both casts looks exactly the same, too:
    48 findpropstrict private::MyClass
    50 getlocal3
    51 callpropvoid private::MyClass (1)

    48 findpropstrict flash.geom::Point
    50 getlocal3
    51 callpropvoid flash.geom::Point (1)
Since the bytecode is identical, the performance difference must come from something in the Player itself. What that is, I don’t know. So for now, mark this as another victory for the `as` keyword. Only in the best-case scenario (successful casting of an AS3 class) does function call-style casting keep pace with the `as` keyword. For your performance-critical code there is a clear choice: use the `as` keyword!
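In practice, switching to `as` also changes how failure is reported: a failed `as` cast evaluates to `null` instead of throwing a `TypeError`, so the `try/catch` becomes a null check. Here is a minimal sketch, where the class `AsCastExample` and the helper `describe` are hypothetical names made up for illustration:

```actionscript
package {
    import flash.display.Sprite;
    import flash.geom.Point;
    import flash.geom.Rectangle;

    // Hypothetical example class, not part of the performance test
    public class AsCastExample extends Sprite {
        public function AsCastExample() {
            trace(describe(new Point(1, 2)));  // cast succeeds
            trace(describe(new Rectangle()));  // cast fails, no exception
        }

        // Hypothetical helper: describes obj without a try/catch block
        private function describe(obj:Object): String {
            // A failed "as" cast yields null rather than a TypeError
            var pt:Point = obj as Point;
            if (pt == null) {
                return "not a Point";
            }
            return "Point at " + pt.x + "," + pt.y;
        }
    }
}
```

One caveat with this pattern: a `null` result cannot distinguish an object of the wrong type from an input that was already `null`, so check the input separately if that difference matters.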
#1 by Henke37 on January 17th, 2011 ·
Now we just need to compare the conversion functions too. The stuff in the default package that is. And since it’s related, the toString method and the various ways of getting it called.
#2 by jackson on January 17th, 2011 ·
I’m never going to finish this series, am I? ;-)
You’re right, of course. I did one on implicit type conversion with the built-in types (e.g. `int`, `Number`), but I don’t think I have one on the `int(val)`-style converters. Oh well, stay tuned for part four! :)
#3 by fastas3 on January 17th, 2011 ·
I’m really impressed by your ability to break things down into pieces and investigate the problem. This is only one of many oddities like this in FP I think. And because there are no clear specifications of FP internals, we should investigate things like you to become better developers.
#4 by as3isolib on January 17th, 2011 ·
I did a quick code refactor in my as3isolib.v2.core such that casting was using the if..as casting methods and I got a slight bump in FPS performance of about 1-2 FPS. Not a huge gain but rendering every frame for thousands of objects, it makes a difference over the long haul.
Thanks again Jackson and folks. This is a superbly awesome series.
#5 by jackson on January 17th, 2011 ·
You’re very welcome! I’m always happy to hear when an article makes a real-world difference. :-D
#6 by skyboy on January 19th, 2011 ·
Perhaps it’s something to do with the package look-up? Splitting the class out into something like `com.jackson.MyClass` could get a performance penalty as with the native classes.
#7 by jackson on January 19th, 2011 ·
I actually tried that, but omitted the results from the article as there was no performance difference. It may have to do with application domains, security, native classes, or some other difference. I wish I knew what though. :-/ For now the knowledge that it’s slower with native classes will have to suffice.
#8 by Keyston on January 23rd, 2011 ·
While this test does show some difference, would it be better to cast a non-native AS3 class that is as close as possible in complexity to a native class, maybe testing against the Flex framework and its deeply inherited classes, i.e. List->ListBase->Group->GroupBase->UIComponent? My thinking is that since casting needs to marshal all properties (signature checking), a non-native class that is empty will cast far faster than a non-native class that has the complexity of, say, MovieClip or, in the Flex sense, List/Group.
#9 by jackson on January 23rd, 2011 ·
Good point- I haven’t tested the effect of subclassing (once or many times) or complexity of object on the performance. I don’t think either will have much (if any) bearing on the performance though, as there isn’t any type conversion going on, just a change of one type to another. Still, I may include this in a followup article. If I do, I’ll link back to your comment.
#10 by skyboy on January 23rd, 2011 ·
In which case you might try type-casting a DenseMap: It certainly matches some of the native classes for complexity, and the only one I can think of off the top of my head.