A 94 dB, 1000 Hz calibration tone produced by a Bruel & Kjaer 4620 calibrator was recorded on the same tape as the speech, with the amplification varied by a known amount (see Tables 1 and 2).
Date | Speaker | DAT Rec. Level | Input Gain (Speech) | Input Gain (Tone) | Output Gain (Speech) | Output Gain (Tone) |
---|---|---|---|---|---|---|
6/11/1998 | LMTJ | 6 | 20 | 10 | 20 | 10 |
25/1/1999 | LMTJ | Not registered | 20 | 10 | 20 | 20 |
22/6/1999 | ACC | 6 | 20 | 10 | 20 | 10 |
19/11/1999 | CFGA | 3.5 | 20 | 10 | 30 | 10 |
19/11/1999 | ISSS | 5 | 20 | 10 | 20 | 10 |

Date | Speaker | Language | DAT Rec. Level | Input Gain (Speech) | Input Gain (Tone) | Output Gain (Speech) | Output Gain (Tone) |
---|---|---|---|---|---|---|---|
17/11/2000 | RS | English | 4 | 20 | 10 | 20 | 10 |
17/11/2000 | RS | Portuguese | 4 | 20 | 10 | 20 | 10 |
17/11/2000 | PS | English | 4 | 20 | 10 | 20 | 10 |
17/11/2000 | PS | Portuguese | 4 | 20 | 10 | 20 | 10 |
To obtain an absolute spectral amplitude we will start by calculating a factor A1 which, when added to the internal arbitrary amplitude of the recorded calibration tone, makes the sum equal to the known amplitude of the calibration tone:
A1 = 94.1 - 20log(Yarb(1000)) (dB)
where Yarb(1000) is the arbitrary internal amplitude of the Fourier transform of the calibration tone at 1 kHz. We will also have to calculate a second factor, A2, equal to the difference in amplification between the tone and the speech:
A2 = Gcal - Gsp (dB)
where Gcal is the gain applied when the calibration signal was recorded, and Gsp is the gain applied when the speech signal was recorded. Therefore the absolute spectral amplitude of the speech signal, Xabs, is obtained from its arbitrary internal amplitude Xarb(1000) by
Xabs = 20log(Xarb(1000)) + A1 + A2 (dB)
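The three formulas above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names and all numeric inputs are hypothetical and are not taken from the recordings, and the gains are assumed to be expressed in dB.

```python
import math

# Known level of the calibration tone, as used in the A1 formula above.
CAL_LEVEL_DB = 94.1


def calibration_offset_db(y_arb_1k: float) -> float:
    """A1: offset that maps the tone's arbitrary amplitude to its known level."""
    return CAL_LEVEL_DB - 20.0 * math.log10(y_arb_1k)


def gain_difference_db(g_cal_db: float, g_sp_db: float) -> float:
    """A2: gain used for the calibration tone minus gain used for the speech."""
    return g_cal_db - g_sp_db


def absolute_amplitude_db(x_arb: float, a1_db: float, a2_db: float) -> float:
    """Xabs: arbitrary amplitude in dB, shifted by both correction factors."""
    return 20.0 * math.log10(x_arb) + a1_db + a2_db


# Hypothetical inputs, chosen only to exercise the formulas.
y_arb = 0.05    # arbitrary FFT amplitude of the recorded tone at 1 kHz
x_arb = 0.02    # arbitrary FFT amplitude of the speech at 1 kHz
g_cal_db = 20.0  # assumed total gain while recording the tone (dB)
g_sp_db = 40.0   # assumed total gain while recording the speech (dB)

a1_db = calibration_offset_db(y_arb)
a2_db = gain_difference_db(g_cal_db, g_sp_db)
x_abs_db = absolute_amplitude_db(x_arb, a1_db, a2_db)
print(f"A1 = {a1_db:.2f} dB, A2 = {a2_db:.2f} dB, Xabs = {x_abs_db:.2f} dB")
```

A quick sanity check on the sketch: if the speech had the same arbitrary amplitude as the tone and was recorded with the same gain, Xabs would come out equal to the known level of the tone.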
The spectra shown in the thesis by Jesus (2001) do not present an absolute amplitude. We are currently working on a method that uses the calibration signal to calculate an absolute spectral amplitude referred to a 1 Hz interval, which will allow comparison regardless of window length and averaging technique.
The energy of the speech signal can be computed either from the waveform x(t) or from its Fourier transform X(f) (Parseval's theorem):
E=∫|x(t)|²dt=∫|X(f)|²df
If we increase the number of points in x(t) (i.e. the size of the window), the value of the integral (the area under the non-negative integrand) cannot decrease, and for a sustained signal it grows with the window. Therefore, the window length used to calculate the power spectra affects the overall amplitude: all else being equal, the larger the window, the higher the overall amplitude.
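The effect of window length can be illustrated with a discrete analogue of the energy relation, in which the time-domain sum of |x[n]|² equals the sum of |X[k]|² divided by the number of points N. The sketch below is a minimal illustration and not the analysis code used for the corpora: the test signal (a unit-amplitude sinusoid with a period of 16 samples), the window sizes, and the naive DFT are all assumptions made for the example.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform (adequate for short windows)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def energy_time(x):
    """Energy computed from the samples: sum of |x[n]|^2."""
    return sum(v * v for v in x)


def energy_freq(x):
    """Energy computed from the spectrum: (1/N) * sum of |X[k]|^2."""
    return sum(abs(c) ** 2 for c in dft(x)) / len(x)


def tone(n, period=16):
    """n samples of a unit-amplitude sinusoid (hypothetical test signal)."""
    return [math.sin(2 * math.pi * t / period) for t in range(n)]


short_win = tone(64)    # assumed short analysis window
long_win = tone(128)    # same signal, window twice as long

e_short = energy_time(short_win)
e_long = energy_time(long_win)

# Level change caused purely by the longer window, in dB.
gain_db = 10.0 * math.log10(e_long / e_short)

print(f"time-domain energy: {e_short:.3f} (64 pts), {e_long:.3f} (128 pts)")
print(f"freq-domain energy: {energy_freq(short_win):.3f} (64 pts)")
print(f"level increase from doubling the window: {gain_db:.2f} dB")
```

For this stationary test signal, doubling the window doubles the energy, so the level rises by about 3 dB even though the signal itself has not changed. This is why spectra are only directly comparable when they are computed with the same window size.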
We used the same window size to calculate the power spectra of ambient noise, sustained fricatives, fricatives in nonsense words and real words. We used a larger number of windows to calculate the averaged power spectrum of a longer segment of signal (ambient noise and sustained fricatives). This allowed us to compare spectral amplitudes of Corpus 1a, 1b, 2, 3 and 4, for a given recording session.