Effectiveness of orthogonal instantaneous and transitional feature parameters for speaker verification
Abstract
The effectiveness of orthogonal instantaneous and transitional feature parameters of speech is investigated for text-dependent speaker verification. Instantaneous spectral features are represented by cepstral coefficients obtained through a linear prediction analysis of speech. Transitional spectral information is characterised using differential cepstral coefficients. Sets of orthogonal parameters are obtained by applying an eigenvector analysis to the instantaneous and transitional feature coefficients. The experimental work is based on a subset of the BT Millar speech database, consisting of repetitions of the isolated digit utterances 1 to 9 and zero spoken by twenty male speakers. The investigation includes an examination of the relative speaker discrimination abilities of the two types of orthogonal feature parameters. It is shown experimentally that the equal error rate in speaker verification can be reduced significantly by forming a spectral distance based on a combination of orthogonal instantaneous and transitional feature parameters. It is further demonstrated that, when the input utterance consists of a sequence of five digits, an equal error rate of less than 0.5% can be achieved.
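As a rough illustration of the processing chain described above, the following sketch derives differential (transitional) cepstra from a sequence of cepstral frames, obtains orthogonal parameters through an eigenvector analysis of the feature covariance matrix, and forms a combined distance. The abstract does not specify the regression window, the distance measure, or the weighting between the two feature types, so the function names, the five-frame window, the mean squared difference, and the `weight` parameter are all illustrative assumptions rather than the authors' method. The frames are also assumed to be already time-aligned.

```python
import numpy as np


def delta_cepstra(cep, half_window=2):
    """Transitional features: regression slope of each cepstral
    coefficient over 2*half_window + 1 frames (window length is an assumption).

    cep has shape (num_frames, num_coefficients).
    """
    num_frames = cep.shape[0]
    offsets = np.arange(-half_window, half_window + 1)
    # Repeat the edge frames so every frame has a full window.
    padded = np.pad(cep, ((half_window, half_window), (0, 0)), mode="edge")
    weighted = sum(
        k * padded[half_window + k : half_window + k + num_frames]
        for k in offsets
    )
    return weighted / np.sum(offsets ** 2)


def orthogonal_basis(features):
    """Eigenvector analysis of the feature covariance matrix.

    Returns eigenvalues (largest first) and the matching eigenvectors
    as columns.
    """
    cov = np.cov(features, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh: covariance is symmetric
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]


def combined_distance(test_inst, ref_inst, test_trans, ref_trans, weight=0.5):
    """Weighted sum of instantaneous and transitional distances.

    Assumes time-aligned frames of equal length; the mean squared
    difference and the weight are placeholders, not values from the paper.
    """
    d_inst = np.mean((test_inst - ref_inst) ** 2)
    d_trans = np.mean((test_trans - ref_trans) ** 2)
    return weight * d_inst + (1.0 - weight) * d_trans
```

Projecting a feature matrix onto the returned eigenvectors (`features @ eigvecs`) yields parameters that are mutually uncorrelated over the data used to estimate the covariance, which is the sense of "orthogonal" assumed in this sketch.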