Selecting Features in Origin Analysis

Green, P. D., Lane, P.C.R., Rainer, A. and Scholz, S. (2010) Selecting Features in Origin Analysis. In: UNSPECIFIED.

When applying a machine-learning approach to develop classifiers in a new domain, an important question is what measurements to take and how to use them to construct informative features. This paper develops a novel set of machine-learning classifiers for classifying files taken from software projects; the target classifications are based on origin analysis. Our approach adapts the output of four copy-analysis tools, generating a number of different measurements. By combining the measures and the files on which they operate, a large set of features is generated in a semi-automatic manner. Standard attribute selection and classifier training techniques then yield a pool of high-quality classifiers (accuracy in the range of 90%), along with information on the most relevant features.
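The feature-generation and attribute-selection steps described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the abstract does not name the four tools, the measures, or the file groupings, so `TOOLS`, `MEASURES` and `FILE_SETS` are hypothetical placeholders. Information gain with a fixed threshold stands in for "standard attribute selection" and is not necessarily the method the authors used.

```python
from itertools import product
from math import log2

# Hypothetical placeholders: the abstract does not name the tools,
# the measures, or the sets of files on which they operate.
TOOLS = ["tool_a", "tool_b", "tool_c", "tool_d"]
MEASURES = ["max_similarity", "match_count"]
FILE_SETS = ["same_project", "other_projects"]


def feature_names(tools, measures, file_sets):
    """Cross every tool, measure and file set to name one feature each."""
    return [f"{t}.{m}.{s}" for t, m, s in product(tools, measures, file_sets)]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return -sum((c / total) * log2(c / total) for c in counts.values())


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting the labels on value <= threshold."""
    low = [l for v, l in zip(values, labels) if v <= threshold]
    high = [l for v, l in zip(values, labels) if v > threshold]
    remainder = sum(
        len(part) / len(labels) * entropy(part) for part in (low, high) if part
    )
    return entropy(labels) - remainder


def rank_features(rows, labels, names, threshold=0.5):
    """Return feature names sorted by information gain, best first."""
    gains = []
    for i, name in enumerate(names):
        column = [row[i] for row in rows]
        gains.append((information_gain(column, labels, threshold), name))
    return [name for gain, name in sorted(gains, key=lambda g: -g[0])]
```

In this sketch the feature set grows multiplicatively with the number of tools, measures and file sets, which is why a selection step is needed before training. The highest-ranked names would then be passed to whatever classifier is being trained.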

Full text not available from this repository.
