Show simple item record

dc.contributor.author: Green, P. D.
dc.contributor.author: Lane, P.C.R.
dc.contributor.author: Rainer, A.
dc.contributor.author: Scholz, S.
dc.date.accessioned: 2009-05-12T07:40:31Z
dc.date.available: 2009-05-12T07:40:31Z
dc.date.issued: 2009
dc.identifier.citation: In: Procs of the 6th Machine Learning and Data Mining in Pattern Recognition International Conference, MLDM 2009 [en]
dc.identifier.other: 904507 [en]
dc.identifier.uri: http://hdl.handle.net/2299/3358
dc.description.abstract: We apply machine-learning techniques to help automate the process of mining the version history of software projects. Analysis of version histories is important in the study of software evolution. One of the associated problems is tracing program elements which have changed or moved as the result of file restructuring. As an initial application, we have developed classifiers to identify one such type of file change, 'split files'. Our process involves extracting features through syntactic analysis of the original source code, and then training and evaluating classifiers against a set of data assessed by visual inspection. We analysed 266K files from 84 open-source projects, filtering out a set of candidate files for which our classifiers achieve either 89% overall accuracy, or a false positive rate of 5%. [en]
dc.format.extent: 183417 bytes
dc.format.mimetype: application/pdf
dc.language.iso: en [en]
dc.publisher: IbaI (Institute of Computer Vision & Applied Computer Sciences) [en]
dc.subject: machine learning [en]
dc.subject: data mining [en]
dc.subject: classification [en]
dc.subject: software-source code [en]
dc.title: Building Classifiers to Identify Split Files. [en]
dc.type: Conference paper [en]
herts.preservation.rarelyaccessed: true

