Factors Affecting the Performance of Trainable Models for Software Defect Prediction
Abstract
Context. Reports suggest that defects in code cost the US in excess of $50 billion per year to put right.
Defect Prediction is an important part of Software Engineering. It allows developers to prioritise the
code that needs to be inspected when trying to reduce the number of defects in code. A small change in
the number of defects found will have a significant impact on the cost of producing software.
Aims. The aim of this dissertation is to investigate the factors which affect the performance of
defect prediction models. Identifying the causes of variation in the way that variables are computed
should help to improve the precision of defect prediction models and hence improve the cost effectiveness
of defect prediction.
Methods. This dissertation is by published work. The first three papers examine variation in the
independent variables (code metrics) and the dependent variable (number/location of defects). The
fourth and fifth papers investigate the effect that different learners and datasets have on the predictive
performance of defect prediction models. The final paper investigates the reported use of different
machine learning approaches in studies published between 2000 and 2010.
Results. The first and second papers show that independent variables are sensitive to the measurement
protocol used; this suggests that the way data is collected affects the performance of defect
prediction. The third paper shows that dependent variable data may be untrustworthy as there is no
reliable method for labelling a unit of code as defective or not. The fourth and fifth papers show that the
dataset and learner used when producing defect prediction models have an effect on the performance of
the models. The final paper shows that the approaches used by researchers to build defect prediction
models vary, with good practices being ignored in many papers.
Conclusions. The measurement protocols for independent and dependent variables used for defect
prediction need to be clearly described so that results can be compared like with like. It is possible
that one research group reports higher predictive performance than another because of the way it
calculated the metrics rather than because of the method used to build the model that predicts
defect-prone modules. The machine learning approaches used by researchers need
to be clearly reported in order to be able to improve the quality of defect prediction studies and allow a
larger corpus of reliable results to be gathered.
Publication date: 2013-06-13
Published version: https://doi.org/10.18745/th.10978