Speaker or voice recognition is the identification of a person from characteristics of their voice (voice biometrics). It is also called voice identification. There is a difference between speaker recognition (recognizing who is speaking) and speech recognition (recognizing what is being said).
For voice identification with a cooperative user, the individual speaks either a fixed phrase (such as a password) or a prompted phrase, which is recorded into the system. If the individual is unaware of the collection, or is unwilling or unable to cooperate (for example, because they are deceased), two or more existing recordings are submitted for comparison instead. This presents a more difficult challenge.
The sampled speech waveforms are then analyzed by the audio forensic expert, who compares the frequency content of the speech along with characteristics such as the quality, duration, intensity dynamics, and pitch of the signal.
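As a rough illustration of the kinds of measurements mentioned above, the following Python sketch estimates duration, intensity, frequency content, and pitch from a sampled waveform. It is a minimal sketch and not the procedure a forensic examiner follows: the function name `describe_voice_sample`, the use of a spectral centroid as a stand-in for "frequency content", and the 60 to 400 Hz pitch search range are all assumptions made for the example.

```python
import numpy as np

def describe_voice_sample(signal, sample_rate, fmin=60.0, fmax=400.0):
    """Return rough duration, intensity, spectral, and pitch estimates.

    Hypothetical helper for illustration; fmin and fmax are an assumed
    search range for the fundamental frequency of speech.
    """
    signal = np.asarray(signal, dtype=float)
    duration_s = len(signal) / sample_rate

    # Intensity: root-mean-square amplitude over the whole sample.
    rms = float(np.sqrt(np.mean(signal ** 2)))

    # Frequency content: magnitude spectrum and its centroid.
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    centroid_hz = float(np.sum(freqs * spectrum) / np.sum(spectrum))

    # Pitch: lag of the strongest autocorrelation peak inside the
    # assumed range, converted back to a frequency.
    centered = signal - signal.mean()
    autocorr = np.correlate(centered, centered, mode="full")[len(centered) - 1:]
    lag_min = int(sample_rate / fmax)
    lag_max = int(sample_rate / fmin)
    best_lag = lag_min + int(np.argmax(autocorr[lag_min:lag_max + 1]))
    pitch_hz = sample_rate / best_lag

    return {
        "duration_s": duration_s,
        "rms": rms,
        "centroid_hz": centroid_hz,
        "pitch_hz": pitch_hz,
    }

# Usage with a synthetic 120 Hz tone standing in for a voiced sound.
sample_rate = 8000
t = np.arange(sample_rate // 2) / sample_rate
tone = 0.5 * np.sin(2 * np.pi * 120.0 * t)
features = describe_voice_sample(tone, sample_rate)
```

On the synthetic tone, the pitch estimate should land close to 120 Hz, limited by the whole-sample lag resolution. Real speech changes from moment to moment, so a practical analysis would compute such measurements over short frames and not over the whole recording at once.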
While examining speech waveforms is the driving force behind voice identification, higher-level characteristics also assist greatly in recognizing a speaker. These include rhythm, speed, modulation, and intonation, which are shaped by personality type and parental influence, as well as semantics, idiolect, pronunciation, and idiosyncrasies, which are related to birthplace, socio-economic status, and education level.
Susceptibility to the transmission channel, microphone variability, and noise is an additional factor. Challenges may arise, for example, when a sample recorded over a clean landline must be verified against one recorded over a noisy cellular phone. This final piece of the puzzle is where 21 years of experience make the difference in voice biometrics.
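To make the channel problem described above concrete, the sketch below passes a signal through a simulated narrowband telephone channel and optionally adds noise. It is only an illustration under stated assumptions: `simulate_channel` is a hypothetical helper, the 300 to 3400 Hz pass band is the traditional telephone voice band, the filter is an idealized brick-wall mask applied in the frequency domain, and the noise is plain white Gaussian noise, which is far simpler than the distortion of a real cellular connection.

```python
import numpy as np

def simulate_channel(signal, sample_rate, low_hz=300.0, high_hz=3400.0,
                     snr_db=None, seed=0):
    """Band-limit a signal and optionally add white noise at a target SNR.

    Hypothetical helper for illustration only; real channels also add
    codec artifacts, dropouts, and non-flat frequency responses.
    """
    signal = np.asarray(signal, dtype=float)

    # Idealized band-pass: zero every frequency bin outside the pass band.
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    spectrum[(freqs < low_hz) | (freqs > high_hz)] = 0.0
    filtered = np.fft.irfft(spectrum, n=len(signal))

    if snr_db is not None:
        # Scale the noise so that signal power / noise power matches snr_db.
        rng = np.random.default_rng(seed)
        signal_power = np.mean(filtered ** 2)
        noise_power = signal_power / (10.0 ** (snr_db / 10.0))
        filtered = filtered + rng.normal(0.0, np.sqrt(noise_power), len(filtered))

    return filtered

# Usage: a 120 Hz fundamental plus a 1200 Hz component, sent through a
# clean channel and a noisy one.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
voice_like = np.sin(2 * np.pi * 120.0 * t) + 0.5 * np.sin(2 * np.pi * 1200.0 * t)
clean_line = simulate_channel(voice_like, sample_rate)
noisy_line = simulate_channel(voice_like, sample_rate, snr_db=10.0)
```

In this example the 120 Hz fundamental falls below the pass band and is removed entirely, so any measurement that relies on it will differ between the original and the channel-limited version of the same voice. That is one reason why recordings made over different channels are harder to compare.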
Upon complete examination of all the evidence gathered, we arrive at one of five possible declarations: positive identification, probable identification, positive elimination, possible elimination, or inconclusive.