Flash Player 10.2 Performance: Part 1
From a performance perspective, lots changed in Flash Player 10.1 (see parts 1, 2, 3, 4, 5, and 6). Flash Player 10.2 was officially released last week, so it’s time to update this site’s many performance tests to the new player. This time around I’ll be updating more performance tests per part of this series, so hopefully everything will be updated a lot quicker than last time. Read on for the updates!
All tests in this performance update use the same environment:
- Flex SDK (MXMLC) 126.96.36.19976, compiling in release mode (no debugging or verbose stack traces)
- Release version of Flash Player 10.1.102.64 or 10.2.152.26
- 2.8 GHz Intel Xeon W3530
- Windows 7
Flash Player 10.1 Performance:
|Linked List (recycling nodes)||31||15|
Flash Player 10.2 Performance:
|Linked List (recycling nodes)||22||27|
Direct allocation seems to have taken a performance hit, which is a real shame because it happens all the time. On the plus side, we seem to be able to make up for it by using free lists. The recycling technique, though, is now antiquated.
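For anyone who hasn’t seen a free list before, here is a minimal sketch of the idea. It is written in TypeScript rather than ActionScript 3 so that it can be run outside Flash Player, and the names `ListNode`, `NodePool`, `acquire`, and `release` are hypothetical; they are not taken from the test code behind the numbers above. The pool hands back a previously released node when it has one and only falls back to `new` when it is empty.

```typescript
// Hypothetical node type; the node class used by the real test is not shown here.
class ListNode {
    value: number = 0;
    next: ListNode | null = null;
}

// A free list: released nodes are chained together and handed out again.
class NodePool {
    private head: ListNode | null = null;
    allocations = 0; // how many times the pool had to call `new`

    acquire(value: number): ListNode {
        let node = this.head;
        if (node !== null) {
            // Reuse a released node instead of allocating a new one.
            this.head = node.next;
        } else {
            node = new ListNode();
            this.allocations++;
        }
        node.value = value;
        node.next = null;
        return node;
    }

    release(node: ListNode): void {
        // Push the node onto the free list for later reuse.
        node.next = this.head;
        this.head = node;
    }
}

const pool = new NodePool();
const firstNode = pool.acquire(1);
pool.release(firstNode);
const secondNode = pool.acquire(2); // comes from the free list, not from `new`
```

The trade-off is the usual one: the pool avoids allocation cost at the price of keeping released nodes alive.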
Namespaces As Function Pointers
|Player||Explicit Namespace||Namespace Variable||No Namespace|
|Flash Player 10.1||109||7046||100|
|Flash Player 10.2||103||6191||103|
There’s not much change here, but using namespace variables seems about 15% faster.
|Player||Assign Swap||XOR Swap|
|Flash Player 10.1||233||302|
|Flash Player 10.2||233||302|
No change here: XOR swap is still slower and less readable.
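As a reminder of what is being compared, here is a small sketch of the two swaps. It is in TypeScript rather than ActionScript 3 so it can be run anywhere, and the function names and the two-element array are my own illustration, not the original test harness.

```typescript
// Assign swap: one temporary variable, works for any value.
function assignSwap(pair: [number, number]): void {
    const temp = pair[0];
    pair[0] = pair[1];
    pair[1] = temp;
}

// XOR swap: no temporary, but only correct for values that fit in a 32-bit integer.
function xorSwap(pair: [number, number]): void {
    pair[0] ^= pair[1];
    pair[1] ^= pair[0];
    pair[0] ^= pair[1];
}

const assigned: [number, number] = [3, 5];
assignSwap(assigned);

const xored: [number, number] = [3, 5];
xorSwap(xored);
```

Both calls leave the pair reversed; the XOR version just takes three bitwise operations to get there.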
Runnables as Function Pointers
|Flash Player 10.1||163||50||57|
|Flash Player 10.2||204||54||54|
Method call speed is pretty much unchanged, but calls through Function objects are now 25% slower. This is a real bummer since they are the basis of most callback and signal/event systems (except TurboSignals, which uses runnables).
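To make the distinction concrete, here is a rough sketch of the two call styles. It is TypeScript, not ActionScript 3, and the `Runnable` interface, `Counter` class, and helper names are assumptions made for illustration; I am taking "runnable" to mean an object with a single known method that the caller invokes directly instead of going through a function value.

```typescript
// Hypothetical runnable interface: one method, called directly on the object.
interface Runnable {
    run(): void;
}

class Counter implements Runnable {
    count = 0;
    run(): void {
        this.count++;
    }
}

// Callback style: the caller holds a function value and invokes it.
function callViaFunction(callback: () => void, times: number): void {
    for (let i = 0; i < times; i++) {
        callback();
    }
}

// Runnable style: the caller holds an object and invokes its method.
function callViaRunnable(target: Runnable, times: number): void {
    for (let i = 0; i < times; i++) {
        target.run();
    }
}

const viaFunction = new Counter();
callViaFunction(() => viaFunction.run(), 1000);

const viaRunnable = new Counter();
callViaRunnable(viaRunnable, 1000);
```

Both loops do the same work; the only difference is whether the call goes through a function value or a method on a typed object.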
Flash Player 10.1 Performance:
|Dictionary (strong keys)||511||4504||287|
|Dictionary (weak keys)||579||4579||270|
|BMD w/ alpha getPixel32||n/a||n/a||183|
|BMD w/o alpha getPixel32||n/a||n/a||167|
|BMD w/ alpha getPixel||n/a||n/a||182|
|BMD w/o alpha getPixel||n/a||n/a||171|
Flash Player 10.2 Performance:
|Dictionary (strong keys)||568||4850||261|
|Dictionary (weak keys)||568||4901||261|
|BMD w/ alpha getPixel32||n/a||n/a||185|
|BMD w/o alpha getPixel32||n/a||n/a||168|
|BMD w/ alpha getPixel||n/a||n/a||184|
|BMD w/o alpha getPixel||n/a||n/a||172|
There are a lot of figures here and they vary a little from test to test, but overall not much has changed. One notable exception is that for-in loops are slower across the board by about 10%.
|Flash Player 10.1||735||704|
|Flash Player 10.2||770||735|
With or without the try/catch, both versions are about 5% slower in 10.2.
|Player||XML Class||String Class|
|Flash Player 10.1||57||1|
|Flash Player 10.2||48||1|
XML is now about 19% faster, but still nearly 50x slower than just using a String.
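In case it is not obvious where percentages like these come from, here is the arithmetic as a small TypeScript sketch. It assumes the table values are elapsed times where lower is better; the function names are mine. Note that "percent faster" (ratio of old time to new time) and "percent less time" are two different figures for the same pair of numbers.

```typescript
// Speedup as a ratio of times: how much more work fits in the same time.
function percentFaster(oldTime: number, newTime: number): number {
    return (oldTime / newTime - 1) * 100;
}

// Reduction in elapsed time: a smaller number for the same data.
function percentLessTime(oldTime: number, newTime: number): number {
    return (1 - newTime / oldTime) * 100;
}

// XML class: 57 in Flash Player 10.1, 48 in Flash Player 10.2.
const xmlPercentFaster = Math.round(percentFaster(57, 48));
const xmlPercentLessTime = Math.round(percentLessTime(57, 48));

// XML class versus String class in Flash Player 10.2: 48 versus 1.
const xmlVersusString = 48 / 1;
```

With these inputs, `xmlPercentFaster` is 19, `xmlPercentLessTime` is 16, and `xmlVersusString` is 48.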
Shape vs. Sprite
|Player||Shape FPS||Sprite FPS||Shape Memory||Sprite Memory|
|Flash Player 10.1||60||60||34524||50908|
|Flash Player 10.2||60||60||35776||55228|
There’s no change in the performance as it was already capped at 60 FPS. As for memory, Shape is using about the same amount and Sprite is using about 8% more.
|Player||Plain||Local||Var||Method||Static||Override||super||Interface Direct||Interface via Interface||Interface via Class|
|Flash Player 10.1||259||215||216||54||62||52||57||54||118||54|
|Flash Player 10.2||321||257||271||53||60||56||57||55||51||53|
As we saw in the runnables test above, Function objects (plain, local, var) are slower in 10.2. On the plus side, calling an interface method via an interface object no longer carries a 2x performance slowdown and is now just as fast as an interface method call directly or via a class instance. This is a big win for anyone who uses a lot of interfaces!
Simple Regular Expressions
|Flash Player 10.1||3||3||97||95|
|Flash Player 10.2||3||3||94||91|
There may be a very slight boost to regular expression speed here, but it may also just be statistical variance.
Beware of Getters and Setters
|Flash Player 10.1||183||18||32||78|
|Flash Player 10.2||182||26||27||70|
These are strange results! The non-getter field access (Point.x) got slower by 44% and the getter field access (MyPoint.x) got faster by 11%. The 4.3x performance boost for using variables instead of getters/setters is now narrowed to only 2.7x, which is a shame as it is now harder to improve field access performance.
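For reference, the two kinds of access being compared look roughly like this. The sketch is TypeScript rather than ActionScript 3, and `PlainPoint`, `AccessorPoint`, and `sumX` are hypothetical stand-ins for the Point and MyPoint classes in the real test, which are not reproduced here.

```typescript
// Plain public field, comparable to reading Point.x directly.
class PlainPoint {
    x = 0;
}

// The same data behind a getter/setter pair, comparable to MyPoint.x.
class AccessorPoint {
    private _x = 0;
    get x(): number {
        return this._x;
    }
    set x(value: number) {
        this._x = value;
    }
}

// Both classes are read with identical syntax at the call site,
// so the hidden function call is invisible to the caller.
function sumX(point: { x: number }, iterations: number): number {
    let sum = 0;
    for (let i = 0; i < iterations; i++) {
        sum += point.x;
    }
    return sum;
}

const plainPoint = new PlainPoint();
plainPoint.x = 2;

const accessorPoint = new AccessorPoint();
accessorPoint.x = 2;
```

That identical call-site syntax is exactly why the cost is easy to miss: `point.x` may or may not be a function call.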
Var Args Is Slow
|Player||Pre-Allocated Array||Dynamically-Allocated Array||Var Args|
|Flash Player 10.1||16||109||109|
|Flash Player 10.2||12||170||7|
Wow, var args has been amazingly optimized in Flash Player 10.2! It’s now even faster than a pre-allocated Array, meaning it’s probably not even using an Array behind the scenes anymore. This is great news for any fan of var args!
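Here is a rough sketch of the three call shapes in the table. It is TypeScript, where rest parameters always materialize an array, so it only illustrates how the calls differ and says nothing about what Flash Player does internally. All of the function names are hypothetical, and the three-argument signature is my guess at a representative case.

```typescript
// Shared worker: sums whatever array it is given.
function sumValues(values: number[]): number {
    let total = 0;
    for (let i = 0; i < values.length; i++) {
        total += values[i];
    }
    return total;
}

// Var args: the runtime collects the arguments on every call.
function sumVarArgs(...values: number[]): number {
    return sumValues(values);
}

// Pre-allocated: one array is created once and overwritten before each call.
const scratch: number[] = [0, 0, 0];
function sumPreallocated(a: number, b: number, c: number): number {
    scratch[0] = a;
    scratch[1] = b;
    scratch[2] = c;
    return sumValues(scratch);
}

// Dynamically allocated: a fresh array is built on every call.
function sumDynamic(a: number, b: number, c: number): number {
    return sumValues([a, b, c]);
}
```

All three return the same result; what differs is how much allocation happens per call.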
Since this article has been superseded by this followup article, I won’t be updating Faster isNaN() anymore.
Inlining Math Functions
|Function||Player 10.1||Player 10.2|
|abs||10 inline, 15 Math||9 inline, 14 Math|
|ceil||14 inline, 18 Math||13 inline, 17 Math|
|floor||13 inline, 18 Math||13 inline, 18 Math|
|max||258 inline, 46 Math||61 inline, 46 Math|
|min||249 inline, 46 Math||60 inline, 47 Math|
|max2||14 inline, 18 Math||14 inline, 21 Math|
|min2||14 inline, 16 Math||14 inline, 20 Math|
The only real change here is the big speedup for the inlined versions of max and min. They’re still slower than the regular Math versions, so there’s not much point to using them.
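By "inline" I mean writing the operation out by hand instead of calling the Math function. A minimal TypeScript sketch of the idea follows; the function names are hypothetical, and these are not the exact expressions used in the test.

```typescript
// Library versions.
function maxViaMath(a: number, b: number): number {
    return Math.max(a, b);
}
function absViaMath(x: number): number {
    return Math.abs(x);
}

// Hand-written equivalents for ordinary numbers.
function maxInline(a: number, b: number): number {
    return a > b ? a : b;
}
function absInline(x: number): number {
    return x < 0 ? -x : x;
}

// The two max versions are not identical for NaN: Math.max returns NaN if
// either argument is NaN, while the conditional returns b whenever the
// comparison is false.
const mathWithNaN = maxViaMath(NaN, 1);
const inlineWithNaN = maxInline(NaN, 1);
```

So even where inlining wins on speed, it is worth checking that the edge cases you care about still behave the same.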
|Class||Player 10.1||Player 10.2|
|Array||47 hit, 222 miss||44 hit, 247 miss|
|Vector Dynamic||42 hit, 7090 miss||42 hit, 6470 miss|
|Vector Fixed||41 hit, 7106 miss||43 hit, 6455 miss|
|Object||150 hit, 182 miss||137 hit, 206 miss|
|Dictionary Strong||141 hit, 242 miss||148 hit, 276 miss|
|Dictionary Weak||141 hit, 249 miss||146 hit, 278 miss|
|BitmapData no alpha, getPixel||93 hit, 78 miss||96 hit, 75 miss|
|BitmapData no alpha, getPixel32||94 hit, 94 miss||93 hit, 90 miss|
|BitmapData alpha, getPixel||93 hit, 76 miss||92 hit, 74 miss|
|BitmapData alpha, getPixel32||85 hit, 67 miss||90 hit, 73 miss|
|ByteArray||57 hit, 50 miss||53 hit, 49 miss|
The performance penalty (due to the Error that gets thrown) for missing on a Vector has been reduced by about 10%. Otherwise, nothing much has changed.
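To show why a miss that throws is so much more expensive than one that does not, here is a small TypeScript sketch. `ThrowingVector` is a hypothetical stand-in for a container that throws on an out-of-range read, and the two helper functions are my own illustration of handling a miss by catching versus avoiding it with a bounds check. None of this is the original test code.

```typescript
// Hypothetical stand-in for a container that throws on an out-of-range read.
class ThrowingVector {
    private items: number[];

    constructor(items: number[]) {
        this.items = items.slice();
    }

    length(): number {
        return this.items.length;
    }

    at(index: number): number {
        if (index < 0 || index >= this.items.length) {
            throw new RangeError("index " + index + " is out of range");
        }
        return this.items[index];
    }
}

// Miss handled by catching the error after it has been thrown.
function readCatching(vector: ThrowingVector, index: number, fallback: number): number {
    try {
        return vector.at(index);
    } catch (error) {
        return fallback;
    }
}

// Miss avoided by checking the bounds first, so nothing is thrown.
function readChecking(vector: ThrowingVector, index: number, fallback: number): number {
    if (index < 0 || index >= vector.length()) {
        return fallback;
    }
    return vector.at(index);
}

const sampleVector = new ThrowingVector([10, 20, 30]);
```

Both helpers return the same values for hits and misses; only the first one pays for constructing and unwinding an error on every miss.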
More To Come
I’ll reserve any general conclusions until the series has concluded, but for now the performance is quite mixed. Stay tuned for part two!