Selecting Features in Origin Analysis

Green, P. D., Lane, P.C.R., Rainer, A. and Scholz, S. (2010) Selecting Features in Origin Analysis. In: UNSPECIFIED.

When applying a machine-learning approach to develop classifiers in a new domain, an important question is what measurements to take and how to use them to construct informative features. This paper develops a novel set of machine-learning classifiers for files taken from software projects; the target classifications are based on origin analysis. Our approach adapts the output of four copy-analysis tools, generating a number of different measurements. Combining the measures with the files on which they operate generates a large set of features in a semi-automatic manner. Standard attribute selection and classifier training techniques then yield a pool of high-quality classifiers (accuracy in the range of 90%) and information on the most relevant features.
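
The feature-generation and attribute-selection steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the four tools, the measurements, or the file groupings, so `TOOLS`, `MEASURES`, `FILE_ROLES`, and both function names are hypothetical placeholders, and the ranking score (gap between per-class means) is a simple stand-in for whatever standard attribute selection technique the paper used.

```python
from itertools import product

# Hypothetical placeholders: the abstract does not list the actual
# tools, measurements, or file groupings.
TOOLS = ["tool_a", "tool_b", "tool_c", "tool_d"]
MEASURES = ["max_similarity", "mean_similarity"]
FILE_ROLES = ["same_project", "other_project"]


def feature_names(tools, measures, roles):
    """Build one feature name per (tool, measure, file role) combination."""
    return [f"{t}.{m}.{r}" for t, m, r in product(tools, measures, roles)]


def rank_features(rows, labels, names):
    """Rank features by the gap between their per-class mean values.

    A simple filter-style attribute selection, assumed here for
    illustration only. rows[i][j] is the value of feature names[j]
    for file i; labels[i] is that file's class.
    """
    scores = {}
    for j, name in enumerate(names):
        by_class = {}
        for row, label in zip(rows, labels):
            by_class.setdefault(label, []).append(row[j])
        means = [sum(values) / len(values) for values in by_class.values()]
        scores[name] = max(means) - min(means)
    # Highest-scoring (most class-separating) features come first.
    return sorted(names, key=lambda n: scores[n], reverse=True)
```

Under these assumptions, `feature_names(TOOLS, MEASURES, FILE_ROLES)` enumerates every combination, and `rank_features` can then be used to keep only the top-ranked features before training a classifier.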

Full text not available from this repository.
