Selecting Features in Origin Analysis

Green, P. D., Lane, P.C.R., Rainer, A. and Scholz, S. (2010) Selecting Features in Origin Analysis. In: UNSPECIFIED.

When applying a machine-learning approach to develop classifiers in a new domain, an important question is what measurements to take and how they will be used to construct informative features. This paper develops a novel set of machine-learning classifiers for files taken from software projects; the target classifications are based on origin analysis. Our approach adapts the output of four copy-analysis tools, generating a number of different measurements. By combining the measures and the files on which they operate, a large set of features is generated in a semi-automatic manner. Standard attribute selection and classifier training techniques then yield a pool of high-quality classifiers (accuracy in the range of 90%) and information on the most relevant features.
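
The feature-generation and selection steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the measure names, the file roles, and the difference-of-class-means ranking are all assumptions chosen for illustration, since the abstract names neither the four copy-analysis tools nor the attribute selection method used.

```python
from itertools import product

# Hypothetical measurement names and file roles; the abstract does not list them.
MEASURES = ["similarity", "shared_lines"]
FILE_ROLES = ["previous_version", "other_project"]


def feature_names(measures, roles):
    """Cross every measure with every file role: one feature per pair."""
    return [f"{m}@{r}" for m, r in product(measures, roles)]


def build_row(raw):
    """Turn a {(measure, role): value} mapping into a fixed-order feature row.

    Pairs with no measurement default to 0.0 (an assumption of this sketch).
    """
    return [raw.get((m, r), 0.0) for m, r in product(MEASURES, FILE_ROLES)]


def rank_features(rows, labels, names):
    """Rank features for a two-class problem, most discriminating first.

    The score is the absolute difference between the per-class means, a
    simple stand-in for a standard attribute selection filter.
    """
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    scored = []
    for j, name in enumerate(names):
        means = []
        for c in classes:
            values = [row[j] for row, y in zip(rows, labels) if y == c]
            means.append(sum(values) / len(values))
        scored.append((abs(means[0] - means[1]), name))
    return [name for _, name in sorted(scored, reverse=True)]
```

The top-ranked features would then be passed to whatever classifier training procedure is in use; the ranking itself also indicates which measure and file combinations carry the most information.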

Full text not available from this repository.
