How We Benchmark Flint OS
With the recent news that Google is retiring the Octane benchmark, now seems like the perfect time to share some thoughts on the state of benchmarking and how we decide what to optimise for.
When optimising Flint OS, we have always aimed for the best overall experience rather than just maxing out a particular benchmark. That means looking at many aspects of a user’s interaction with the OS, such as startup time, page loading, video playback, memory-intensive situations and more.
As the V8 team has discussed in many posts over the past year or so, they have been developing a benchmarking process that more accurately represents the real-world performance of a user’s typical browsing experience. This fantastic picture below, from @tverwaes at the BlinkOn 6 conference, sums up why Octane is no longer relevant to real-world usage. As we can see, the distribution of time in Octane is quite different from that in real-world web applications, so optimising for Octane beyond its current level is unlikely to yield any significant improvement in the real world.
In optimising for the Pi, every CPU cycle is precious, and we tried many kernel tweaks, HZ values, storage formats and compression options. We noticed early on that a higher Octane score doesn’t necessarily mean better page loading times; in fact, we saw quite the opposite. Internally we have builds approaching a score of 3,000 in Octane, but page loading and system stability suffered. Eventually we settled on a compromise of around 2,600 to 2,700.
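To make that trade-off concrete, here is a minimal Python sketch of choosing between builds on page loading and stability first, with Octane used only as a tie-breaker. It is an illustration rather than our actual tooling: the BuildResult type, the pick_build helper, the build names and the page-load and crash figures are all hypothetical placeholders, and only the Octane scores echo the figures mentioned above.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class BuildResult:
    """One candidate build and its test results (hypothetical structure)."""
    name: str
    octane: float        # Octane score, higher is better
    page_load_ms: list   # repeated page-load timings in ms, lower is better
    crashes: int         # stability failures seen during the test run


def pick_build(results, max_crashes=0):
    """Return the stable build with the lowest median page-load time.

    Octane is only used to break ties between equally fast builds.
    """
    stable = [r for r in results if r.crashes <= max_crashes]
    if not stable:
        raise ValueError("no build met the stability threshold")
    return min(stable, key=lambda r: (median(r.page_load_ms), -r.octane))


# Placeholder numbers for illustration only, not real measurements.
results = [
    BuildResult("octane-tuned", 3000, [4200, 4500, 4350], crashes=2),
    BuildResult("balanced", 2650, [3900, 4000, 3950], crashes=0),
]

chosen = pick_build(results)
print(chosen.name)
```

Using the median of several runs keeps one slow outlier from deciding the result, and filtering on stability before comparing speed means a fast but crash-prone build can never win.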
Going forward, we will continue to measure boot time, run memory stress tests and check video playback. We will also be dropping the Octane benchmark completely in favour of Speedometer and JetStream. Hopefully you’ve found this post an interesting read and it has given you some insight into the complex world of OS performance tuning!