Optimising a hierarchical neural clusterer applied to large gene sequence data sets
Kaye, Paul H.
Evolutionary algorithms have previously been used to optimise the performance of neural network models. This paper takes a hybrid approach, permanently attaching a Genetic Algorithm (GA) to a hierarchical clusterer in order to investigate which parameter values produce specific tree-shaped representations of gene sequence data. It addresses the particular problem that the size of the data set makes direct use of a GA too time-consuming. We show that, when the GA investigation is run on a data set nearly two orders of magnitude smaller, the results can be usefully transferred to the real, much larger data sets. The data sets in question are gene sequences, and the aim of the analysis is to cluster short sub-sequences that could represent binding sites regulating the expression of genes.
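As an illustration only, the small-sample strategy described above can be sketched in Python. Everything in the sketch is an assumption rather than the method of the paper: the stand-in clusterer `count_clusters` is a one-dimensional gap-splitting rule, not the hierarchical neural clusterer; the single tuned parameter `threshold`, the target of three clusters, the synthetic data, and the GA operators (truncation selection, arithmetic crossover, Gaussian mutation) are all hypothetical choices made to keep the example short.

```python
import random


def count_clusters(values, threshold):
    """Stand-in clusterer (assumption): split sorted 1-D values at gaps > threshold."""
    ordered = sorted(values)
    return 1 + sum(1 for a, b in zip(ordered, ordered[1:]) if b - a > threshold)


def fitness(threshold, sample, target_clusters):
    """Higher is better: negative distance from the desired number of clusters."""
    return -abs(count_clusters(sample, threshold) - target_clusters)


def evolve_threshold(sample, target_clusters, rng, pop_size=20, generations=30):
    """Minimal GA: truncation selection, arithmetic crossover, Gaussian mutation."""
    population = [rng.uniform(0.0, 1.0) for _ in range(pop_size)]
    for _ in range(generations):
        # Rank candidates by fitness on the SMALL sample only.
        population.sort(key=lambda t: fitness(t, sample, target_clusters), reverse=True)
        parents = population[: pop_size // 2]  # keep the better half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = (a + b) / 2 + rng.gauss(0.0, 0.05)  # crossover plus mutation
            children.append(min(max(child, 0.0), 1.0))  # clamp to [0, 1]
        population = parents + children
    return max(population, key=lambda t: fitness(t, sample, target_clusters))


rng = random.Random(0)

# Synthetic stand-in for the large data set (assumption): three separated groups.
centres = [0.1, 0.5, 0.9]
full_data = [c + rng.uniform(-0.05, 0.05) for c in centres for _ in range(3000)]

# Tune on a sample roughly two orders of magnitude smaller than the full set.
sample = rng.sample(full_data, 100)
best = evolve_threshold(sample, target_clusters=3, rng=rng)

# Apply the tuned parameter to both the sample and the full data set.
print(count_clusters(sample, best), count_clusters(full_data, best))
```

The final line prints the cluster counts obtained with the tuned threshold on the sample and on the full synthetic set. Whether parameter values tuned in this way carry over to real, much larger data is the question the paper investigates; the toy data here cannot answer it.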