dc.contributor.author | Green, P. D. | |
dc.contributor.author | Lane, P.C.R. | |
dc.contributor.author | Rainer, A. | |
dc.contributor.author | Scholz, S. | |
dc.date.accessioned | 2013-01-15T12:59:05Z | |
dc.date.available | 2013-01-15T12:59:05Z | |
dc.date.issued | 2010 | |
dc.identifier.citation | Green, P D, Lane, P C R, Rainer, A & Scholz, S 2010, Selecting Features in Origin Analysis. In Research and Development in Intelligent Systems XXVII, Incorporating Applications and Innovations in Intelligent Systems XVIII: Proceedings of AI-2010, The Thirtieth SGAI International Conference on Innovative Techniques and Applications of Artificial Intelligence. Springer Nature Link, pp. 379-392. | |
dc.identifier.isbn | 978-0-85729-129-5 | |
dc.identifier.isbn | 978-0-85729-130-1 | |
dc.identifier.other | dspace: 2299/4913 | |
dc.identifier.uri | http://hdl.handle.net/2299/9655 | |
dc.description | Original paper can be found at: http://www.springer.com/computer/ai/book/978-0-85729-129-5 Copyright Springer | |
dc.description.abstract | When applying a machine-learning approach to develop classifiers in a new domain, an important question is what measurements to take and how they will be used to construct informative features. This paper develops a novel set of machine-learning classifiers for the domain of classifying files taken from software projects; the target classifications are based on origin analysis. Our approach adapts the output of four copy-analysis tools, generating a number of different measurements. By combining the measures and the files on which they operate, a large set of features is generated in a semi-automatic manner. Standard attribute selection and classifier training techniques then yield a pool of high-quality classifiers (accuracy in the range of 90%) and information on the most relevant features. | en |
dc.format.extent | 225694 | |
dc.language.iso | eng | |
dc.publisher | Springer Nature Link | |
dc.relation.ispartof | Research and Development in Intelligent Systems XXVII, Incorporating Applications and Innovations in Intelligent Systems XVIII | |
dc.subject | data mining | |
dc.subject | feature construction | |
dc.subject | origin analysis | |
dc.subject | machine learning | |
dc.title | Selecting Features in Origin Analysis | en |
dc.contributor.institution | School of Computer Science | |
dc.contributor.institution | Science & Technology Research Institute | |
dc.contributor.institution | Department of Computer Science | |
dc.contributor.institution | School of Physics, Engineering & Computer Science | |
dc.contributor.institution | Centre for Computer Science and Informatics Research | |
rioxxterms.type | Other | |
herts.preservation.rarelyaccessed | true | |