Dyn is crazy for dashboards. We wake up looking at them and always have ideas on how to make them better. We use several platforms – Munin, Graphite, and Nagios to name a few – but recently we have started building our own data processing tools that allow us to build real-time dashboards.
Here’s what we did and how we did it.
I selected a few tasks that a typical dashboard performs and multiplied the amount of work by 50 to come up with a test workload:
- Stuff an array with a million random positive integers (to see how dynamic memory allocation may affect interpreters)
- Calculate the variance on the resulting dataset (to check performance of math operations)
- Sort the array (because this step is important in descriptive statistics)
- Find 10,000 quantiles (to observe data access speed)
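The four steps above can be sketched in a few lines. This is a minimal illustration in plain Python, not the code used for the tests: the function name `run_workload`, the value range, the use of population variance, and the nearest-rank way of picking quantiles are all assumptions made for the example.

```python
import random
import time

MAX_VALUE = 2**31 - 1  # assumed upper bound for the random positive integers


def run_workload(n=1_000_000, q=10_000, seed=None):
    """Run the four steps once; return (timings in seconds, variance, quantiles)."""
    rng = random.Random(seed)
    timings = {}

    # Step 1: fill an array with n random positive integers.
    t0 = time.perf_counter()
    data = [rng.randint(1, MAX_VALUE) for _ in range(n)]
    timings["initialization"] = time.perf_counter() - t0

    # Step 2: variance (population variance is an assumption here).
    t0 = time.perf_counter()
    mean = sum(data) / n
    variance = sum((x - mean) ** 2 for x in data) / n
    timings["variance"] = time.perf_counter() - t0

    # Step 3: sort the array in place.
    t0 = time.perf_counter()
    data.sort()
    timings["sorting"] = time.perf_counter() - t0

    # Step 4: read q evenly spaced quantiles from the sorted array.
    t0 = time.perf_counter()
    quantiles = [data[k * (n - 1) // q] for k in range(1, q + 1)]
    timings["quantiles"] = time.perf_counter() - t0

    timings["total"] = sum(timings.values())
    return timings, variance, quantiles


if __name__ == "__main__":
    timings, _, _ = run_workload()
    for step, seconds in timings.items():
        print(f"{step}: {seconds:.2f} s")
```

Timing each step separately makes it possible to see which part of the workload dominates for a given interpreter.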
I wanted to compare the tools typically used to build dashboards. This was my lineup:
I added these browsers to compete with server-side tools:
All tests (except the IE9 one) were done on an ancient Intel Core Duo 6400 running Fedora Core 16 (Linux 3.2.3 x86_64 SMP). The IE9 test ran on an Intel i5 CPU with 64-bit Windows 7 Professional. The Google TV was a Sony Internet TV with firmware 3.2.
I added a few other contestants that aren’t the first choice for web applications but are well-known data processing tools:
- R 2.14 because R is designed exactly for this kind of data crunching
- Java (1.6.0 openjdk)
- C (code compiled by gcc 4.6.2 with the -O3 flag)
- Python 2.7 with numpy library
During testing, I found that the standard sort routines in Firefox and C performed poorly for our use case (sorting integers). I added a custom sort test for the web browsers and a faster quicksort test for C to the mix.
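To give an idea of what a hand-written sort for integers can look like, here is a generic in-place quicksort. It is only a sketch: the custom sort used in the tests is not shown in this post, so the function name `quicksort_ints`, the middle-element pivot, and the explicit stack are assumptions, and the browser version would of course be JavaScript rather than Python.

```python
def quicksort_ints(values):
    """Sort a list of integers in place using an iterative quicksort."""
    # An explicit stack of (low, high) ranges avoids deep recursion.
    stack = [(0, len(values) - 1)]
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue  # ranges of zero or one element are already sorted

        pivot = values[(lo + hi) // 2]  # middle element as pivot
        i, j = lo, hi
        while i <= j:
            # Skip elements that are already on the correct side.
            while values[i] < pivot:
                i += 1
            while values[j] > pivot:
                j -= 1
            if i <= j:
                values[i], values[j] = values[j], values[i]
                i += 1
                j -= 1

        # Everything in lo..j is <= pivot, everything in i..hi is >= pivot.
        stack.append((lo, j))
        stack.append((i, hi))
```

Because it compares integers directly, a routine like this avoids the overhead of a general-purpose comparison callback, which is one plausible reason a custom sort can beat a standard one for this use case.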
Results and Summary
For each contestant, I ran the challenge 10 times. I aggregated the mean values into the table below and added the median absolute deviation (MAD) to show how consistent the 10 runs were. All values are times in seconds.
| Contestant | Initialization | Calculating Variance | Sorting | Calculating Quantiles | Total Run Time | MAD |
|---|---|---|---|---|---|---|
| C (custom quickSort) | 0.01 | 0 | 0.04 | 0 | 0.06 | 0.005 |
| Python 2.7 + numpy | 0.03 | 0 | 0.09 | 0.02 | 0.13 | 0.003 |
| C (standard qsort) | 0.01 | 0 | 0.16 | 0 | 0.17 | 0.000 |
| Chrome 16 (standard sort) | 0.1 | 0.09 | 0.23 | 0 | 0.42 | 0.018 |
| Chrome 16 (custom sort) | 0.11 | 0.08 | 0.65 | 0 | 0.85 | 0.005 |
| Firefox (custom sort) | 0.04 | 0.23 | 1.03 | 0 | 1.31 | 0.016 |
| Chrome 11 (Sony TV, standard sort) | 0.31 | 0.52 | 0.73 | 0.01 | 1.56 | 0.029 |
| IE9 (other CPU, standard sort) | 0.25 | 0.75 | 0.75 | 0.01 | 1.76 | 0.050 |
| Firefox (standard sort) | 0.04 | 0.26 | 3.2 | 0 | 3.51 | 0.025 |
| IE9 (other CPU, custom sort) | 0.25 | 0.66 | 3.05 | 0.01 | 3.97 | 0.062 |
| Chrome 11 (Sony TV, custom sort) | 0.3 | 0.51 | 4.42 | 0.01 | 5.24 | 0.083 |
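For readers who have not met the MAD before, here is how the mean and the MAD of a set of run times can be computed. This is a sketch using Python's standard library. It assumes the MAD is taken over the total time of each run and is measured around the median, without any scaling constant. The helper name `summarize_runs` is made up for this example.

```python
import statistics


def summarize_runs(run_times):
    """Return (mean, MAD) for a list of run times in seconds."""
    mean = statistics.mean(run_times)
    center = statistics.median(run_times)
    # MAD: the median of the absolute distances from the median.
    mad = statistics.median([abs(t - center) for t in run_times])
    return mean, mad
```

Unlike the standard deviation, the MAD barely moves when a single run is an outlier, which makes it a handy consistency check for a small number of runs. For example, `summarize_runs([1.0, 2.0, 3.0, 4.0, 100.0])` gives a mean of 22.0 but a MAD of only 1.0.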
The results confirm that using the client’s CPU to calculate the content shown on a dashboard is a good way to take load off the servers. It should not slow clients down either: the test workload was 50 times a typical one, so with the faster sort variant for each browser, a typical calculation finishes in less than a tenth of a second. Browsers are clearly good competition for server-side interpreters, especially in our target case, where the workload is different every time a dashboard is displayed.
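The arithmetic behind that estimate is simple: divide each total run time by the factor of 50 used to inflate the workload. The snippet below does this for the faster sort variant of each browser, using the totals from the results table. The names `WORKLOAD_MULTIPLIER` and `per_dashboard_seconds` are made up for this illustration.

```python
WORKLOAD_MULTIPLIER = 50  # the test workload is 50 times a typical one

# Total run times in seconds, taken from the results table
# (faster sort variant for each browser).
best_total_seconds = {
    "Chrome 16 (standard sort)": 0.42,
    "Firefox (custom sort)": 1.31,
    "Chrome 11 (Sony TV, standard sort)": 1.56,
    "IE9 (other CPU, standard sort)": 1.76,
}


def per_dashboard_seconds(total, multiplier=WORKLOAD_MULTIPLIER):
    """Estimate the time for one typical dashboard calculation."""
    return total / multiplier


if __name__ == "__main__":
    for name, total in best_total_seconds.items():
        estimate_ms = per_dashboard_seconds(total) * 1000
        print(f"{name}: about {estimate_ms:.0f} ms per dashboard")
```

This assumes run time scales linearly with the amount of work, which is only approximately true for sorting.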
If you want to check out the tests and the code we used, you can get them from GitHub. You can use them to run similar evaluations of your own or to improve the code we published.
Other observations I made as a result of this testing:
- R is an amazing tool. Use it when you’re thinking about numbers!
- NumPy is a very smart library. I was not aware of it when I started, but I plan to use it more often now. I cannot recommend it to everyone yet because I don’t know much about its thread safety and platform independence, but I plan to investigate further. Please comment below if you know more!
- I wonder whether Firefox developers can make their sort function smarter.