Abstract:
In order to evaluate software performance and find regressions, many
developers use automated performance tests. However, the test results often
contain a certain amount of noise that is not caused by actual performance
changes in the programs but rather by external factors such as operating
system decisions or unexpected non-determinism inside the programs
themselves. This makes the test results hard to interpret, since results
that differ from previous ones cannot easily be attributed to either
genuine changes or noise.
In this thesis, we use Mozilla Firefox as an example to investigate the
causes of this performance variance, develop ways to reduce the noise, and
present a statistical technique that makes identifying genuine performance
changes more reliable.
Our results show that a significant amount of the noise is caused by memory
randomization and other external factors, that there is variance in Firefox
internals that does not appear to be correlated with the variance in test
results, and that our proposed statistical forecasting technique detects
genuine performance changes more reliably than the method currently in use
by Mozilla.