Using sampling methods to improve binding site predictions
Te Boekhorst, R.
Currently the best algorithms for transcription factor binding site prediction are severely limited in accuracy. In previous work we combine random selection under-sampling into SMOTE over-sampling technique, working with several classification algorithms from machine learning field to integrate binding site predictions. In this paper, we improve the classification result with the aid of Tomek links as an either undersampling or cleaning technique.