Linked Lists: Part I
I’ve written before about linked lists when I covered free lists. There were massive speedups to be had there, but that article mostly covered the performance costs of allocation and deallocation. Today is part one of a series that more generally covers linked lists.
I’ve written a LinkedList class in AS3 with an API that (mostly) matches the Array API. The goal is to have a drop-in replacement for Array for times where a linked list will yield better performance. I recommend the Wikipedia article for those new to the subject and for those wanting to learn more about the various trade-offs between arrays and linked lists. For those skipping it, let’s dive right into my linked list implementation.
For starters, the implementation is pretty straightforward and simple. Anyone who has taken a class on data structures will feel right at home here. This implementation is also quite incomplete as it does not implement a good deal of the Array API it hopes to duplicate. The one exception to this API duplication is that I cannot implement the multiple constructors (size and values) that Array has due to AS3 prohibiting function overloading, which includes constructor functions. Lastly, the implementation currently includes only a LinkedListNode class and a LinkedList class. Here they are:
LinkedListNode
package { public class LinkedListNode { public var next:LinkedListNode; public var prev:LinkedListNode; public var data:*; public function LinkedListNode(data:*=undefined) { this.data = data; } } }
LinkedList
package { public class LinkedList { public var head:LinkedListNode; public var tail:LinkedListNode; public function LinkedList(numElements:int=0) { for (var i:int = 0; i < numElements; ++i) { push(undefined); } } public static function fromValues(...values): LinkedList { var ret:LinkedList = new LinkedList(); for each (var val:* in values) { ret.push(val); } return ret; } public function concat(...args): LinkedList { // TODO implement return null; } public function every(callback:Function, thisObject:*=null): void { // TODO implement } public function filter(callback:Function, thisObject:*=null): void { // TODO implement } public function forEach(callback:Function, thisObject:*=null): void { // TODO implement } public function indexOf(searchElement:*, fromIndex:int=0): int { // TODO implement return -1; } public function join(sep:*=","): String { var ret:String = ""; for (var curNode:LinkedListNode = this.head; curNode; curNode = curNode.next) { ret += curNode.data + sep; } return ret.substr(0, ret.length-sep.length); } public function lastIndexOf(searchElement:*, fromIndex:int=0x7fffffff): int { // TODO implement return -1; } public function map(callback:Function, thisObject:*=null): LinkedList { // TODO implement return null; } public function pop(): * { if (this.tail) { var ret:* = this.tail.data; this.tail = this.tail.prev; return ret; } else { return undefined; } } public function push(...args): void { for each (var arg:* in args) { var newNode:LinkedListNode = new LinkedListNode(arg); if (this.tail) { this.tail.next = newNode; this.tail.next.prev = this.tail; this.tail = this.tail.next; } else { this.head = this.tail = newNode; } } } public function reverse(): LinkedList { var front:LinkedListNode = this.head; var back:LinkedListNode = this.tail; var temp:*; while (front != back && back.prev != front) { temp = front.data; front.data = back.data; back.data = temp; front = front.next; back = back.prev; } return this; } public function shift(): * { if (this.head) { var ret:* = this.head.data; this.head = this.head.next; return ret; } else { return undefined; } } public function slice(startIndex:int=0, endIndex:int=16777215): LinkedList { var ret:LinkedList = new LinkedList(); var cur:LinkedListNode = this.head; var i:int; for (i = 0; i < startIndex && cur; ++i) { cur = cur.next; } for (; i < endIndex && cur; ++i) { ret.push(cur.data); cur = cur.next; } return ret; } public function some(callback:Function, thisObject:*=null): Boolean { // TODO implement return false; } public function sort(...args): LinkedList { // TODO implement return null; } public function sortOn(fieldName:Object, options:Object=null): LinkedList { // TODO implement return null; } public function splice(startIndex:int, deleteCount:int, ...values): LinkedList { var ret:LinkedList = new LinkedList(); var cur:LinkedListNode = this.head; var endIndex:int = startIndex + deleteCount; var i:int; for (i = 0; i < startIndex && cur; ++i) { cur = cur.next; } for (; i < endIndex && cur; ++i) { ret.push(cur.data); cur.next = cur.next.next; cur.next.prev = cur; cur = cur.next; } return ret; } public function toString(): String { return join(); } public function toLocaleString(): String { return join(); } public function unshift(...args): void { for each (var arg:* in args) { var newNode:LinkedListNode = new LinkedListNode(arg); if (this.head) { this.head.prev = newNode; this.head.prev.next = this.head; this.head = this.head.prev; } else { this.head = this.tail = newNode; } } } } }
As noted above, there are many omissions in the API. I also haven’t done much in the way of performance tuning. For example, I could replace a lot of calls to push and join with more specialized code. But it’s cleaner this way and serves as a better way to get started. Remember: “premature optimization is the root of all evil” (Donald Knuth). Nevertheless, I’ve written a performance tester:
package { import flash.display.*; import flash.events.*; import flash.text.*; import flash.utils.*; /** * An app to test linked list performance * @author Jackson Dunstan */ public class LinkedListPerformance extends Sprite { public function LinkedListPerformance() { stage.scaleMode = StageScaleMode.NO_SCALE; stage.align = StageAlign.TOP_LEFT; // Setup logger var logger:TextField = new TextField(); logger.autoSize = TextFieldAutoSize.LEFT; addChild(logger); function log(msg:*): void { logger.appendText(msg + "\n"); } const VALUE:int = 33; const SIZE:int = 100000; const PUSH_POP_ELEMENTS:int = 1000000; const UNSHIFT_SHIFT_ELEMENTS:int = 500; const JOINS:int = 2; const REVERSES:int = 100; const SLICES:int = 1000; const SLICE_START:int = 10; const SLICE_END:int = 100; const SPLICES:int = 1000; const SPLICE_START:int = 10; var array:Array; var list:LinkedList; var beforeTime:int; var curNode:LinkedListNode; // Populate the array array = []; for (var i:int = 0; i < SIZE; ++i) { array[i] = VALUE; } // Populate the list list = new LinkedList(); for (i = 0; i < SIZE; ++i) { list.push(VALUE); } log("traverse: (" + SIZE + " elements)"); beforeTime = getTimer(); for (i = 0; i < SIZE; ++i) { array[i]; } log("\tarray: " + (getTimer()-beforeTime)); beforeTime = getTimer(); for (curNode = list.head; curNode; curNode = curNode.next) { curNode.data; } log("\tlinked list: " + (getTimer()-beforeTime)); log("push: (" + PUSH_POP_ELEMENTS + " elements)"); beforeTime = getTimer(); for (i = 0; i < PUSH_POP_ELEMENTS; ++i) { array.push(VALUE); } log("\tarray: " + (getTimer()-beforeTime)); beforeTime = getTimer(); for (i = 0; i < PUSH_POP_ELEMENTS; ++i) { list.push(VALUE); } log("\tlinked list: " + (getTimer()-beforeTime)); log("pop: (" + PUSH_POP_ELEMENTS + " elements)"); beforeTime = getTimer(); for (i = 0; i < PUSH_POP_ELEMENTS; ++i) { array.pop(); } log("\tarray: " + (getTimer()-beforeTime)); beforeTime = getTimer(); for (i = 0; i < PUSH_POP_ELEMENTS; ++i) { list.pop(); } log("\tlinked list: " + (getTimer()-beforeTime)); log("unshift: (" + UNSHIFT_SHIFT_ELEMENTS + " elements)"); beforeTime = getTimer(); for (i = 0; i < UNSHIFT_SHIFT_ELEMENTS; ++i) { array.unshift(VALUE); } log("\tarray: " + (getTimer()-beforeTime)); beforeTime = getTimer(); for (i = 0; i < UNSHIFT_SHIFT_ELEMENTS; ++i) { list.unshift(VALUE); } log("\tlinked list: " + (getTimer()-beforeTime)); log("shift: (" + UNSHIFT_SHIFT_ELEMENTS + " elements)"); beforeTime = getTimer(); for (i = 0; i < UNSHIFT_SHIFT_ELEMENTS; ++i) { array.shift(); } log("\tarray: " + (getTimer()-beforeTime)); beforeTime = getTimer(); for (i = 0; i < UNSHIFT_SHIFT_ELEMENTS; ++i) { list.shift(); } log("\tlinked list: " + (getTimer()-beforeTime)); log("join: (" + JOINS + " operations)"); beforeTime = getTimer(); for (i = 0; i < JOINS; ++i) { array.join(); } log("\tarray: " + (getTimer()-beforeTime)); beforeTime = getTimer(); for (i = 0; i < JOINS; ++i) { list.join(); } log("\tlinked list: " + (getTimer()-beforeTime)); log("reverse: (" + REVERSES + " operations)"); beforeTime = getTimer(); for (i = 0; i < REVERSES; ++i) { array.reverse(); } log("\tarray: " + (getTimer()-beforeTime)); beforeTime = getTimer(); for (i = 0; i < REVERSES; ++i) { list.reverse(); } log("\tlinked list: " + (getTimer()-beforeTime)); log("slice: (" + SLICES+ " operations)"); beforeTime = getTimer(); for (i = 0; i < SLICES; ++i) { array.slice(SLICE_START, SLICE_END); } log("\tarray: " + (getTimer()-beforeTime)); beforeTime = getTimer(); for (i = 0; i < SLICES; ++i) { list.slice(SLICE_START, SLICE_END); } log("\tlinked list: " + (getTimer()-beforeTime)); log("splice: (" + SPLICES+ " operations)"); beforeTime = getTimer(); for (i = 0; i < SPLICES; ++i) { array.splice(SPLICE_START, 5, VALUE, VALUE, VALUE, VALUE, VALUE); } log("\tarray: " + (getTimer()-beforeTime)); beforeTime = getTimer(); for (i = 0; i < SPLICES; ++i) { list.splice(SPLICE_START, 5, VALUE, VALUE, VALUE, VALUE, VALUE); } log("\tlinked list: " + (getTimer()-beforeTime)); } } }
It’s good to know the performance, even if you haven’t optimized. Then you’ll know what kinds of gains you’ve made when you do the optimization. So here’s a starting point:
Function | 3.0 Ghz Intel Core 2 Duo, 4 GB RAM, Windows XP | Winner |
---|---|---|
Traverse | 1 array/1 list | — |
Push | 57 array/1084 list | Array |
Pop | 12 array/24 list | Array |
Unshift | 26 array/0 list | List |
Shift | 29 array/0 list | List |
Join | 281 array/4279 list | Array |
Reverse | 12 array/77 list | Array |
Slice | 12 array/318 list | Array |
Splice | 1 array/7 list | Array |
Looks like Array is the clear winner in most cases. The big exceptions are operations on the front: shift and unshift. So if you have those kinds of operations to do, a list may be faster even at this point. Remember to take all of this with a grain of sand, as mentioned above.
Next time I’ll be filling in more of the functions toward the goal of completing the API. So stay tuned!
#1 by Alan on December 19th, 2009 ·
Why did you choose to go with node object vs. a node interface – my preferred approach.
Is it a performance issue? Interfaces are inherently slower and accessors are slow as well, but I find the flexibility of the interface much more portable and reusable.
Another downside to a node object is object creation. This implementation would require creating your value about, creating your node ‘container’ and then nesting your data inside of the node – doubling the amount of objects you have.
#2 by jackson on December 19th, 2009 ·
There are two issues here:
Performance – As you point out, calling methods on interfaces is slow. I also showed this in my article on function performance. I assume you mean some kind of getter for the node data here. My approach requires no function call, getter or otherwise, but instead just accesses an untyped variable (*). This is consistent with the Array API in that it simply holds anything you want. Even if you did have a getter for the data, it too would need to be untyped so as to support any type and be portable and reusable, as you point out.
Number of objects – An interface doesn’t get around the creation of objects. You have to have an object at each node so that you can hold the next and previous pointers. The only way around this would be to require that the only data allowed to be stored in the list would have these pointers built into it. This is possible using an interface, but gets really restrictive when you consider that you have to create node wrapper classes a lot (eg. class StringNode, class IntNode, class UintNode, …). Still, instances of these node wrappers are also objects. So I’m not sure how you can save on the number of objects. Could you show an example?
PS: I think I’ve found your article about AS3 linked lists. I’d like to read a follow-up to it showing the complete linked list class and some usage examples. Do you have one? I’d like to learn more about your strategy. :)
#3 by jonathan on December 19th, 2009 ·
Have you try final keyword for your node. it’s increase performance when you browse a linked list (see http://www.quasimondo.com/archives/000692.php).
#4 by jackson on December 19th, 2009 ·
I tried adding the final keyword to LinkedListNode and to LinkedList, but it didn’t make any difference to the performance. This is in keeping with my findings for final functions. That article was light on details though. Can you point me to a benchmark showing more about how to get a speedup out of final? I’d be quite interested in any approach that can speed up this code. Thanks.
#5 by Kronoshifter on October 3rd, 2010 ·
I agree with Alan. Using a node object isn’t as good as having a node interface. You wouldn’t have to write all those wrapper classes. Rather, the node interface could just be implemented by a previously existing class, with its data property pointing to itself. As an example, in a little game I’ve been working on I have an Enemy class. Now instead of writing a node to contain my Enemy object, all I need to do is have my Enemy object implement the node interface and voila! My Enemy object IS the node.
#6 by jackson on October 3rd, 2010 ·
That is an interesting third approach. Many people would complain that your
Enemy
class shouldn’t know that it is part of a node from a design standpoint. However, it does eliminate an extra object per node, so it could mean a performance gain. Perhaps I’ll do an update to this series to explore the performance benefits…