Improving Defect Prediction Models by Combining Classifiers Predicting Different Defects
Abstract
Background: The software industry spends a lot of money on finding and fixing defects.
It utilises software defect prediction models to identify code that is likely to be defective.
Prediction models have, however, reached a performance bottleneck. Any improvements to
prediction models would likely yield fewer defects, reducing costs for companies.
Aim: In this dissertation I demonstrate that different families of classifiers find distinct
subsets of defects. I show how this finding can be utilised to design ensemble models which
outperform other state-of-the-art software defect prediction models.
Method: This dissertation is supported by published work. In the first paper I explore
the quality of data which is a prerequisite for building reliable software defect prediction
models. The second and third papers explore the ability of different software defect
prediction models to find distinct subsets of defects. The fourth paper explores how
software defect prediction models can be improved by combining classifiers that predict
different defective components into ensembles. An additional, unpublished work presents
a visual technique for analysing the predictions made by individual classifiers and
discusses some possible constraints on classifiers used in software defect prediction.
Result: Software defect prediction models created by classifiers of different families predict
distinct subsets of defects. Ensembles composed of classifiers belonging to different families
outperform other ensemble and standalone models. Only a few highly diverse and accurate
base models are needed to compose an effective ensemble. Such an ensemble consistently
gains more correctly predicted defects than it adds incorrect predictions.
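The idea of classifiers finding distinct subsets of defects can be pictured with simple set operations. The sketch below is illustrative only: the classifier names, module identifiers, and predictions are invented toy data and are not results from the dissertation.

```python
# Toy data (hypothetical): modules that are truly defective, and the modules
# that each classifier family flags as defective.
actual_defects = {"m1", "m2", "m3", "m4", "m5"}
flagged = {
    "naive_bayes": {"m1", "m2", "m6"},
    "random_forest": {"m1", "m3"},
    "svm": {"m1", "m4", "m7"},
}


def unique_true_positives(flagged, actual):
    """Map each classifier to the real defects that only it detects."""
    # True positives per classifier: flagged modules that really are defective.
    hits = {name: preds & actual for name, preds in flagged.items()}
    unique = {}
    for name, found in hits.items():
        # Union of the true positives of every other classifier.
        others = set().union(*(h for n, h in hits.items() if n != name))
        unique[name] = found - others
    return unique


unique = unique_true_positives(flagged, actual_defects)
# Defects detected by at least one classifier.
found_by_any = set().union(*flagged.values()) & actual_defects
```

In this toy case every classifier contributes one defect that no other classifier finds, so the union covers more defects than any single model does.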
Conclusion: Ensembles should not use majority-voting techniques to combine decisions
of classifiers in software defect prediction as this will miss correct predictions of classifiers
which uniquely identify defects. Some classifiers could be less successful for software defect
prediction due to complex decision boundaries of defect data. Stacking-based ensembles
can outperform other ensemble and standalone techniques. I propose new possible avenues
of research that could further improve the modelling of ensembles in software defect
prediction. Researchers should explicitly assess data quality before running experiments
in order to establish reliable results.
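A minimal sketch of why majority voting can discard a defect that only one classifier identifies. The votes are hypothetical toy data, and every module listed is assumed to be truly defective.

```python
def majority_vote(votes):
    """Return True when more than half of the classifiers vote 'defective'."""
    return sum(votes) > len(votes) / 2


# Hypothetical votes from three classifiers per module (True = predicted defective).
# Assumption: all three modules really are defective.
votes = {
    "m1": (True, True, True),    # found by every classifier
    "m2": (True, False, False),  # found by the first classifier only
    "m3": (False, True, False),  # found by the second classifier only
}

# Modules that at least one classifier flagged but the majority vote rejects.
missed = sorted(m for m, v in votes.items() if any(v) and not majority_vote(v))
```

Here `missed` holds the two modules that a single classifier identified correctly and the vote overruled. A combiner that learns how much to trust each base classifier, as stacking does, is not forced to drop such predictions.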
Publication date
2018-10-02
Published version
https://doi.org/10.18745/th.23943
Other links
http://hdl.handle.net/2299/23943