GWT Collections emulation library performance pitfall

Sunday, September 26, 2010
Update : check also this question on StackOverflow for a discussion on GWT performance practices and this very nice post summing things up in regards to GWT performance tuning

Intro

Lately we've been using GWT (Google Web Toolkit) at Rento for a hot new project and at some point we came across a performance bottleneck on a specific function. This function had to be called from every 200ms to every 1s and it was taking from 20ms in Chrome and Safari (totally acceptable) to 1200ms in IE7 & IE8 (orders of magnitude beyond unacceptable!).

At first, when we saw these numbers we started pulling our hair out (once again) with IE's amazing performance, but then, what the heck, this graph says it all and we were ready to pay the price of supporting IE once again. So we just had to optimize this code somehow. With a first glimpse the bottleneck was a for loop with about 1000 iterations. Makes sense...

CS basics

Computation time is one of the two main criteria when measuring algorithm complexity (the other being space, aka memory). Definition :

It represents the amount of time (or number of computation steps) that a "normal" physical computer would take to solve a certain computational problem using a certain algorithm.

When you want to reduce the computation time for an algorithm (that contains iterations, as is usually the case) you have the following options :

  • Reduce the number of iterations
  • Reduce the computation time inside every iteration

Step 1: Reducing the number of iterations

For the given problem that the algorithm was trying to solve the number of iterations could not be easily*  reduced (where easily means implementing an optimal algorithm in a small amount of time and without bloating the code base). We did some best-case-scenario optimizations but most of the time we didn't actually hit it and additionally the worst one was still unacceptable. There were one or two possible solutions though in case part two (see below) wouldn't work.

Step 2: Reducing the computation time inside the loop

Inside each iteration (among other stuff) a normalization of a string was taking place. This computation looked pretty innocent since it was taking usually less than one ms (we couldn't get an exact value since it was too small) each time it was called. The problem though with repetition is that this phenomenally small amount of time adds up on big iterations and thus by simply caching the result of the normalization and re-using it on future calls, saved us about 500ms on IE (Chrome, Safari and Firefox didn't have a problem anyway). Not bad at all! But still we had about 700ms to cut!

Not enough! Digging deeper

At this point we started scratching our heads and thinking that maybe we should go back to step 1 and actually implement the aforementioned complex algorithms but the gut feeling was "no, there's something else wrong here". After all what was happening inside each iteration step was the normalization (which was out of the game now) and 2-3 checks (conditions, regular expressions and so forth) that shouldn't be that complex anyway.

What a better place to start than the profiler on the "javascript vm" that the code was actually running : Internet Explorer 8. Thank God, IE8 comes with the best profiler out there (yes, I do prefer it among all the others like firebug's and Chrome's). You can see the call tree and since IE's JavaScript engine is the slowest one you can really nail the problems :D

So, back on the profiler and now that the normalization was out we could see more clearly that something was probably wrong with the iterations themselves. A "getIterator" (or something like that) was taking about 90% of the time. Could it be the Set we were using on the Java side? Good hypothesis, now we needed proof!

Benchmarking GWT Collections emulation

After spending some time setting up a dummy stupid-simple GWT project for benchmarking the emulation library for the Collections java api we had our results:

Results for initializing various collections with 1000 elements inside :

TestIE 7ChromeFF 3.6
Array initialization0 ms0 ms0 ms
ArrayList initialization with initial capacity0 ms0 ms3 ms
ArrayList initialization without initial capacity0 ms0 ms3 ms
ArrayList initialization from Arrays$ArrayList16 ms0 ms0 ms
ArrayList initialization from ArrayList0 ms0 ms0 ms
ArrayList initialization from class LinkedList0 ms0 ms3 ms
ArrayList initialization from HashSet47 ms0 ms9 ms
ArrayList initialization from LinkedHashSet16 ms0 ms4 ms
ArrayList initialization from TreeSet62 ms0 ms7 ms
LinkedList initialization from Arrays$ArrayList31 ms0 ms3 ms
LinkedList initialization from ArrayList31 ms0 ms2 ms
LinkedList initialization from LinkedList16 ms0 ms2 ms
LinkedList initialization from HashSet62 ms0 ms8 ms
LinkedList initialization from LinkedHashSet32 ms0 ms4 ms
LinkedList initialization from TreeSet78 ms2 ms7 ms
HashSet initialization with initial capacity16 ms0 ms4 ms
HashSet initialization without initial capacity15 ms1 ms3 ms
HashSet initialization from Arrays$ArrayList32 ms1 ms4 ms
HashSet initialization from ArrayList31 ms0 ms5 ms
HashSet initialization from LinkedList16 ms0 ms4 ms
HashSet initialization from HashSet110 ms2 ms10 ms
HashSet initialization from LinkedHashSet47 ms1 ms7 ms
HashSet initialization from TreeSet78 ms2 ms9 ms
LinkedHashSet initialization from Arrays$ArrayList63 ms2 ms11 ms
LinkedHashSet initialization from ArrayList78 ms1 ms11 ms
LinkedHashSet initialization from LinkedList62 ms2 ms11 ms
LinkedHashSet initialization from HashSet109 ms2 ms16 ms
LinkedHashSet initialization from LinkedHashSet79 ms2 ms12 ms
LinkedHashSet initialization from TreeSet109 ms3 ms15 ms
TreeSet initialization from Arrays$ArrayList563 ms51 ms103 ms
TreeSet initialization from ArrayList547 ms35 ms128 ms
TreeSet initialization from LinkedList593 ms32 ms118 ms
TreeSet initialization from HashSet609 ms44 ms94 ms
TreeSet initialization from LinkedHashSet672 ms32 ms742 ms
TreeSet initialization from TreeSet672 ms37 ms85 ms







TestIE 7ChromeFF 3.6
Native array iteration0 ms0 ms0 ms
ArrayList iteration16 ms0 ms2 ms
LinkedList iteration16 ms0 ms1 ms
HashSet iteration78 ms1 ms6 ms
LinkedHashSet iteration31 ms0 ms3 ms
TreeSet iteration62 ms2 ms7 ms
HashMap iteration around *keys*63 ms0 ms6 ms
HashMap iteration around *values*78 ms1 ms6 ms
LinkedHashMap iteration around *keys*15 ms0 ms3 ms
LinkedHashMap iteration around *values*47 ms1 ms5 ms

Bullseye! Using anything else other than native arrays (which translate to native JS arrays on the compiled JS source) has a tremendous overhead on IE!

Step 3: Reducing the *iterator* time!

With the results of the benchmarks at hand we refactored the code to use and iterate around old-plain-c-like arrays. Not nice in the Java world (for example you can not append things on the fly, you can not declare them final since you don't know exactly the size of them, etc, etc) but the result was worth it on speed terms : from 1200ms on IE it is now taking about 20-100ms (depending on the case). Yes, we can now move on with our lives...

Conclusions

I was rather disappointed with GWT at this point. You are supposed to be using it instead of pure JavaScript so it can be optimized on each browser and so you can use the wealth of tools in the Java ecosystem. In this particular case we had none of the two, quite the opposite I must say. From now on and until the GWT team finds a better way to implement the Collections API we should should think twice before writing List in our declarations...

5 comments:

  • blaze

    Hi,

    Preaty cool block. I have same problem in developing mine application and I moved to using JSArrays in GWT(the wraper for native array) and I got some tunning but not this much as you are explaning!
    What do you mean by plain-c-like arrays...normal java array or native js array?
    Also i have a question related to the number of iteration you are doing per cycle? and dose this cycles affect the layout(making changes to the dom) (because as I saw from the tests i did, the layouting in IE was always taking 2-3 times more time then Chrome or FF , and i cancel my try to optimize the logic, because I came a thinking that the IE js-angin is just to slow :) and there is nothing I can do about...)

  • Nikos Dimitrakopoulos

    Ok, clarifications :

    1. "plain-c-like" arrays / Arrays : Java arrays basically
    2. Native js array : They translation of Java arrays to native JS code (not the JSArray class found in GWT)

    Regarding 2 : I haven't tried using JSArray on the Java side cause it's totally un-testable code (since it depends on native JavaScript implementations). As a last optimization step it may made sense to try it.

    3. The iteration had nothing to do layout - I'm pretty sure that when rendering is required things on IE are even worse :/

  • graffic

    This remind me a sentence in the pamediakopes developer book. All generalizations are wrong, including this one.

    Or from the author of Node.js, and talking about more code related things: be careful with abstractions, you might have to use them (and you might feel trapped by them at some point).

    http://www.youtube.com/watch?v=F6k8lTrAE2g

  • Thomas Broyer

    There's been an attempt in building optimized collections (not following the Java Collections API), but it looks like it has been deleted since then: http://code.google.com/p/google-web-toolkit/source/browse/branches/2.1/bikeshed/src/com/google/gwt/collections/

  • Post a Comment