Data Science and Engineering: Formant estimation in Matlab/Octave

Tuesday, February 26, 2013

Formant estimation in Matlab/Octave

Formants
In speech processing, the resonant frequencies of a speaker's vocal tract are called formants. These frequencies vary with the speaker's age and gender, and can even be used for more complex tasks such as speaker identification.

Typically the first two to four of these resonant frequencies are of interest, and they can be identified in a typical frequency plot of a speech signal, as shown below:

The frequencies at the peaks F1, F2, ..., FN of the red envelope are the formants. The smaller peaks beneath the red line are called partials.

Methods for Estimating Formants

The two most widely used methods for formant estimation are based on LPC coefficients and on cepstral analysis. The basic idea behind both is to obtain a smoothed version of the frequency response and then locate its peaks. A good reference for LPC and formant estimation can be found in this paper. Two good resources I have used to understand and implement cepstral methods are this one and this one.
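
To make the LPC idea concrete, here is a minimal Octave sketch. It is not the code from the paper or from LPCFormants.m. It builds a synthetic frame with two known resonances, fits an all-pole model with the autocorrelation method, and reads the formant frequencies from the angles of the model's poles, which is a common shortcut for locating the peaks of the smoothed envelope. Every name and value in it (fs, N, p, f_true, rho) is an assumption chosen for illustration.

```octave
% Minimal sketch of LPC-based formant estimation on one windowed frame.
% All names and values are illustrative assumptions.
fs = 8000;                    % sampling rate in Hz
N  = 240;                     % frame length: 30 ms at 8 kHz
p  = 4;                       % LPC order: two poles per expected formant

% Synthetic "vowel": a 100 Hz impulse train through two known resonances.
f_true = [700 1800];          % resonance frequencies in Hz
rho    = 0.97;                % pole radius (closer to 1 = sharper peak)
a_true = 1;
for f = f_true
  a_true = conv(a_true, [1, -2*rho*cos(2*pi*f/fs), rho^2]);
end
e = zeros(800, 1);
e(1:80:end) = 1;              % one impulse every 80 samples = 100 Hz pitch
s = filter(1, a_true, e);
x = s(end-N+1:end);           % take a steady-state frame

% Hamming window, written out so no extra package is needed.
w  = 0.54 - 0.46*cos(2*pi*(0:N-1)'/(N-1));
xw = x .* w;

% Autocorrelation at lags 0..p.
r = zeros(p+1, 1);
for k = 0:p
  r(k+1) = sum(xw(1:N-k) .* xw(1+k:N));
end

% Solve the normal equations R*alpha = r for the predictor coefficients,
% then form A(z) = 1 - sum(alpha_k * z^-k).
alpha = toeplitz(r(1:p)) \ r(2:p+1);
a = [1; -alpha];

% Poles of 1/A(z); keep one of each complex-conjugate pair.
rts = roots(a);
rts = rts(imag(rts) > 0);

% Pole angle -> frequency in Hz.
formants = sort(angle(rts) * fs / (2*pi));
disp(formants');
```

On this synthetic frame the two estimates should land close to the 700 Hz and 1800 Hz resonances that were built in. Real speech usually needs a higher order (a common rule of thumb is about fs/1000 + 2) and some filtering of the candidate poles, for example by bandwidth.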

Octave Code

In this section I provide an implementation of the LPC method explained in the paper mentioned in the previous section. You can access the file LPCFormants.m from my dropbox.
A few words on this code: the function assumes that you have already windowed the signal, and it accepts the sampling rate as a parameter. If you are not familiar with windows, take a look at Short Time Analysis of Speech in any of the numerous web resources, like here or here.
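
For readers new to windowing, the following sketch shows one common way to cut a signal into short overlapping frames and apply a Hamming window to each. The 25 ms frame length, 10 ms hop, and the sine-wave placeholder signal are assumptions picked for illustration, not values taken from LPCFormants.m.

```octave
% Minimal sketch of short-time framing with a Hamming window.
% All names and values are illustrative assumptions.
fs = 16000;                              % sampling rate in Hz
x  = sin(2*pi*220*(0:fs-1)'/fs);         % placeholder: 1 s of a 220 Hz tone

frame_len = round(0.025*fs);             % 25 ms frames
hop       = round(0.010*fs);             % 10 ms step between frames

% Hamming window, written out so no extra package is needed.
w = 0.54 - 0.46*cos(2*pi*(0:frame_len-1)'/(frame_len-1));

% Number of complete frames that fit in the signal.
n_frames = floor((length(x) - frame_len)/hop) + 1;

% One windowed frame per column.
frames = zeros(frame_len, n_frames);
for i = 1:n_frames
  start = (i-1)*hop + 1;
  frames(:, i) = x(start:start+frame_len-1) .* w;
end

disp(size(frames));
```

Each column of frames is then a windowed segment that can be analyzed on its own, together with the sampling rate fs.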