Show simple item record

dc.contributor.author	Shepperd, Martin
dc.contributor.author	Bowes, David
dc.contributor.author	Hall, Tracy
dc.date.accessioned	2014-12-10T14:47:31Z
dc.date.available	2014-12-10T14:47:31Z
dc.date.issued	2014-06-03
dc.identifier.citation	Shepperd, M., Bowes, D. & Hall, T. 2014, 'Researcher bias: The use of machine learning in software defect prediction', IEEE Transactions on Software Engineering, vol. 40, no. 6, 6824804, pp. 603-616. https://doi.org/10.1109/TSE.2014.2322358
dc.identifier.issn	0098-5589
dc.identifier.uri	http://hdl.handle.net/2299/14911
dc.description.abstract	Background. The ability to predict defect-prone software components would be valuable. Consequently, there have been many empirical studies to evaluate the performance of different techniques endeavouring to accomplish this effectively. However, no one technique dominates and so designing a reliable defect prediction model remains problematic. Objective. We seek to make sense of the many conflicting experimental results and understand which factors have the largest effect on predictive performance. Method. We conduct a meta-analysis of all relevant, high-quality primary studies of defect prediction to determine what factors influence predictive performance. This is based on 42 primary studies that satisfy our inclusion criteria and collectively report 600 sets of empirical prediction results. By reverse engineering a common response variable we build a random effects ANOVA model to examine the relative contribution of four model building factors (classifier, data set, input metrics and researcher group) to model prediction performance. Results. Surprisingly, we find that the choice of classifier has little impact upon performance (1.3 percent) and, in contrast, the major (31 percent) explanatory factor is the researcher group. It matters more who does the work than what is done. Conclusion. To overcome this high level of researcher bias, defect prediction researchers should (i) conduct blind analysis, (ii) improve reporting protocols and (iii) conduct more intergroup studies in order to alleviate expertise issues. Lastly, research is required to determine whether this bias is prevalent in other application domains.	en
dc.format.extent	14
dc.format.extent	543659
dc.language.iso	eng
dc.relation.ispartof	IEEE Transactions on Software Engineering
dc.subject	meta-analysis
dc.subject	researcher bias
dc.subject	Software defect prediction
dc.subject	Software
dc.title	Researcher bias: The use of machine learning in software defect prediction	en
dc.contributor.institution	School of Computer Science
dc.contributor.institution	Science & Technology Research Institute
dc.contributor.institution	Centre for Computer Science and Informatics Research
dc.description.status	Peer reviewed
rioxxterms.versionofrecord	10.1109/TSE.2014.2322358
rioxxterms.type	Journal Article/Review
herts.preservation.rarelyaccessed	true
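
The Method described in the abstract pools 600 reported results, reverse engineers a common response variable and fits a random effects ANOVA model over four crossed factors (classifier, data set, input metrics, researcher group). The sketch below illustrates that kind of variance decomposition using statsmodels' MixedLM with variance components; it is a minimal illustration under assumptions, not the authors' actual analysis. The synthetic data, the column names and the use of MCC as the response are stand-ins chosen here for demonstration.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 600  # the meta-analysis pools 600 reported prediction results

# Hypothetical stand-ins for the four coded factors.
df = pd.DataFrame({
    "classifier": rng.choice([f"clf{i}" for i in range(8)], n),
    "dataset": rng.choice([f"ds{i}" for i in range(20)], n),
    "input_metrics": rng.choice(["static", "process", "combined"], n),
    "researcher_group": rng.choice([f"grp{i}" for i in range(15)], n),
})

# Synthetic response: a large researcher-group effect and a small
# classifier effect, mimicking the paper's headline finding.
grp_eff = {g: rng.normal(0, 0.15) for g in df["researcher_group"].unique()}
clf_eff = {c: rng.normal(0, 0.03) for c in df["classifier"].unique()}
df["mcc"] = (0.3
             + df["researcher_group"].map(grp_eff)
             + df["classifier"].map(clf_eff)
             + rng.normal(0, 0.1, n))

# Crossed random effects via variance components: a single dummy group
# spans all rows, and each factor contributes one variance component.
df["unit"] = 1
vc = {
    "classifier": "0 + C(classifier)",
    "dataset": "0 + C(dataset)",
    "metrics": "0 + C(input_metrics)",
    "group": "0 + C(researcher_group)",
}
res = smf.mixedlm("mcc ~ 1", df, groups="unit", re_formula="0",
                  vc_formula=vc).fit()

# Each variance component divided by the total (components plus residual)
# approximates that factor's share of performance variance.
total = res.vcomp.sum() + res.scale
for name, v in zip(res.model.exog_vc.names, res.vcomp):
    print(f"{name}: {v / total:.1%}")

On the paper's real data this style of decomposition is what yields its headline percentages: roughly 31 percent of variance attributed to researcher group against 1.3 percent for choice of classifier.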


Files in this item

Files	Size	Format	View

There are no files associated with this item.

This item appears in the following Collection(s)
