Using pre and post-processing methods to improve binding site predictions

Sun, Yi, Castellano, C.G., Robinson, M., Adams, R.G., Rust, A.G. and Davey, N. (2009) Using pre and post-processing methods to improve binding site predictions. Pattern Recognition, 42 (9). pp. 1949-1958. ISSN 0031-3203

Currently the best algorithms for transcription factor binding site prediction within sequences of regulatory DNA are severely limited in accuracy. In this paper, we integrate 12 original binding site prediction algorithms, and use a 'window' of consecutive predictions in order to contextualise the neighbouring results. We combine either random selection or Tomek links under-sampling with SMOTE over-sampling techniques. In addition, we investigate the behaviour of four feature selection filtering methods: Bi-Normal Separation, Correlation Coefficients, F-Score and a cross entropy based algorithm. Finally, we remove some of the final predicted binding sites on the basis of their biological plausibility. The results show that we can generate a new prediction that significantly improves on the performance of any one of the individual algorithms.
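
The windowing step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `window_features`, the `half_width` parameter and the zero padding at the sequence ends are all assumptions made for illustration, since the abstract does not state the window size or the edge handling. Each sequence position starts with one prediction per algorithm (12 in the paper, 2 in the toy data below), and the sketch concatenates the predictions of neighbouring positions into one longer feature vector.

```python
from typing import List, Sequence


def window_features(
    predictions: Sequence[Sequence[int]],
    half_width: int = 3,   # assumed value; the abstract gives no window size
    pad: int = 0,          # assumed edge handling: pad with "no binding site"
) -> List[List[int]]:
    """Concatenate each position's predictions with those of its neighbours.

    predictions[i][j] is the output of algorithm j at sequence position i.
    Every returned row has (2 * half_width + 1) * n_algorithms entries.
    """
    n_algorithms = len(predictions[0]) if predictions else 0
    padding = [pad] * n_algorithms
    rows: List[List[int]] = []
    for centre in range(len(predictions)):
        row: List[int] = []
        for pos in range(centre - half_width, centre + half_width + 1):
            if 0 <= pos < len(predictions):
                row.extend(predictions[pos])
            else:
                # Position falls outside the sequence: use the padding vector.
                row.extend(padding)
        rows.append(row)
    return rows


# Toy data: 4 positions, 2 algorithms (hypothetical values).
toy = [[1, 0], [0, 0], [1, 1], [0, 1]]
windowed = window_features(toy, half_width=1)
```

With `half_width=1`, each position is described by its own predictions plus those of its two immediate neighbours, so a classifier trained on these rows sees local context rather than a single position in isolation.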

