Auditory filterbank


First, we describe the wavelet transform and its inverse, which are used to design the auditory filterbank.

If $\psi\in L^2({\bf {R}})$ satisfies the ``admissibility'' condition:

$\begin{displaymath}D_{\psi}:=\int_{-\infty}^{\infty} \frac{\vert\hat{\psi}(\omega)\vert^2}{\vert\omega\vert}d\omega < \infty, \end{displaymath}$

(9)

where $\hat{\psi}$ is the Fourier transform of $\psi$, then $\psi$ is called a ``basic wavelet''. Relative to every basic wavelet $\psi$, the integral wavelet transform on $L^2({\bf {R}})$ is defined by

$\begin{displaymath}\tilde{f}(a,b)=\frac{1}{\sqrt{\vert a\vert}}\int_{-\infty}^\infty f(t)\overline{\psi\left(\frac{t-b}{a}\right)}dt, \end{displaymath}$

(10)

where a is the ``scale parameter'', b is the ``shift parameter'', and $a,b \in {\bf {R}}$ with $a\not=0$. If, in addition, $\psi\in L^1({\bf {R}})$, then $\hat{\psi}$ is a continuous function, so that finiteness of $D_{\psi}$ in Eq. (9) implies $\hat{\psi}(0)=0$, or equivalently, $\int_{-\infty}^{\infty} \psi(t)dt=0$.
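As a rough numerical illustration of Eq. (10) and of the zero-mean property, the following sketch approximates the integral wavelet transform by a Riemann sum. The function names, the time grid, the test signal, and the choice of a Mexican-hat wavelet are assumptions made only for illustration; they are not the wavelet or the implementation used in this paper.

```python
import numpy as np

def integral_wavelet_transform(f, t, psi, a, b):
    """Approximate Eq. (10) at one (a, b) by a Riemann sum on the uniform grid t."""
    dt = t[1] - t[0]
    return np.sum(f * np.conj(psi((t - b) / a))) * dt / np.sqrt(abs(a))

def mexican_hat(t):
    # Illustrative zero-mean wavelet; an assumption, not the wavelet of the text.
    return (1.0 - t**2) * np.exp(-t**2 / 2.0)

t = np.linspace(-20.0, 20.0, 40001)   # uniform time grid (step 0.001)
dt = t[1] - t[0]

# Zero-mean property: the integral of psi over t should be close to 0.
wavelet_mean = np.sum(mexican_hat(t)) * dt

# A constant signal should therefore give a coefficient close to 0.
const_coeff = integral_wavelet_transform(
    np.ones_like(t), t, mexican_hat, a=2.0, b=1.0)

# A windowed cosine gives a nonzero coefficient at a matching scale.
f = np.cos(2.0 * np.pi * 0.5 * t) * np.exp(-t**2 / 50.0)
coeff = integral_wavelet_transform(f, t, mexican_hat, a=0.5, b=0.0)
```

Evaluating the function on a grid of (a, b) pairs would give a sampled version of the transform; the direct sum is chosen here for clarity, not efficiency.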

If $\psi(t)$ is a basic wavelet, then the inverse wavelet transform exists for all t as follows:

$\begin{displaymath}f(t)=\frac{1}{D_\psi} \int_{-\infty}^\infty \int_{-\infty}^\infty \tilde{f}(a,b)\frac{1}{\sqrt{\vert a\vert}}\psi\left(\frac{t-b}{a}\right)\frac{dadb}{a^2}. \end{displaymath}$

(11)

Moreover, if we let $\psi(t)$ be a complex basic wavelet, then the integral wavelet transform can be represented by

$\begin{displaymath}\tilde{f}(a,b)=\vert\tilde{f}(a,b)\vert e^{j \arg(\tilde{f}(a,b))}, \end{displaymath}$

(12)

where $\vert\tilde{f}(a,b)\vert$ is the amplitude spectrum and $\arg(\tilde{f}(a,b))$ is the phase spectrum.
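The decomposition in Eq. (12) can be sketched numerically as follows. The coefficient values are hypothetical and serve only to show that the amplitude and phase spectra together recover the complex coefficients.

```python
import numpy as np

# Hypothetical complex wavelet coefficients (illustrative values only).
coeffs = np.array([1.0 + 1.0j, -2.0j, -3.0 + 0.0j])

amplitude = np.abs(coeffs)    # |f~(a,b)|, the amplitude spectrum
phase = np.angle(coeffs)      # arg(f~(a,b)), the phase spectrum in radians

# Eq. (12): the polar form reproduces the original coefficients.
reconstructed = amplitude * np.exp(1j * phase)
```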

Second, to construct an auditory filterbank, we use the gammatone filter as an analyzing wavelet. The gammatone filter is an auditory filter designed by Patterson et al. [Patterson et al. 1994] that simulates the response of the basilar membrane. The impulse response of the gammatone filter is given by

$\begin{displaymath}gt(t)=At^{N-1}e^{-2\pi b_f t} \cos(2\pi f_0 t), \qquad t \geq 0, \end{displaymath}$

(13)

where $At^{N-1}e^{-2\pi b_f t}$ is the amplitude term, which has the form of the Gamma distribution, f₀ is the center frequency, N is the filter order, and $b_f$ is the bandwidth parameter. In addition, the amplitude characteristics of the gammatone filter are represented approximately by

$\begin{displaymath}GT(f)\approx \left[1+\frac{j(f-f_0)}{b_f}\right]^{-N},\qquad 0 < f < \infty, \end{displaymath}$

(14)

where GT(f) is the Fourier transform of gt(t). The characteristics of the gammatone filter are shown in the first figure below. To determine phase information, we extend the impulse response of the gammatone filter to a complex-valued basic wavelet. This basic wavelet is represented by

$\begin{displaymath}\psi(t)=At^{N-1}e^{j2\pi f_0 t-2\pi b_f t}, \end{displaymath}$

(15)

using the Hilbert transform. This analyzing wavelet satisfies the admissibility condition approximately, because $GT(0) \approx 0$ .
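Eqs. (13) to (15) can be sketched as follows. The parameter values are taken from the figure caption (f₀=600 Hz, N=4, b_f=22.99); treating b_f as a value in Hz, setting A=1, and truncating the impulse response at 0.2 s are assumptions of this sketch. It also checks numerically that the response at f=0 is small relative to the response at f₀, which is the sense in which the admissibility condition holds approximately.

```python
import numpy as np

F0 = 600.0     # center frequency in Hz (figure caption)
N = 4          # filter order (figure caption)
BF = 22.99     # bandwidth parameter b_f (figure caption; assumed to be in Hz)
FS = 20000.0   # sampling frequency in Hz

def gammatone(t, amp=1.0):
    """Eq. (13): real impulse response, defined for t >= 0."""
    return (amp * t**(N - 1) * np.exp(-2.0 * np.pi * BF * t)
            * np.cos(2.0 * np.pi * F0 * t))

def gammatone_wavelet(t, amp=1.0):
    """Eq. (15): complex basic wavelet whose real part is Eq. (13)."""
    return amp * t**(N - 1) * np.exp((2j * np.pi * F0 - 2.0 * np.pi * BF) * t)

def gt_approx(f):
    """Eq. (14): approximate frequency characteristics."""
    return (1.0 + 1j * (f - F0) / BF) ** (-N)

t = np.arange(0.0, 0.2, 1.0 / FS)   # truncation at 0.2 s is an assumption
psi = gammatone_wavelet(t)

# Relative gain at f = 0 predicted by Eq. (14); admissibility needs this near 0.
dc_gain = abs(gt_approx(0.0))

# The same ratio estimated from the sampled wavelet: |sum at 0 Hz| / |sum at f0|.
numeric_dc_gain = (abs(np.sum(psi))
                   / abs(np.sum(psi * np.exp(-2j * np.pi * F0 * t))))
```

With these parameters the relative gain at f=0 is on the order of $10^{-6}$, so $\psi$ has nearly zero mean even though the condition is not met exactly.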

**Figure:** Impulse response and amplitude characteristics of the gammatone filter (f₀=600 Hz, N=4, b_f=22.99).

**Figure:** Frequency characteristics of the wavelet filterbank.

Finally, the auditory filterbank is designed with a center frequency f₀ of 600 Hz, a passband from 60 Hz to 6000 Hz, and K=128 filters. It is implemented on a computer using a discrete wavelet transform under the following conditions: sampling frequency f_s=20 kHz, scale parameter $a=\alpha^p$ with $-{K}/{2} \leq p \leq {K}/{2}$ and $\alpha=10^{2/K}$, and shift parameter b=q/f_s, where $p,q\in {\bf {Z}}$. Frequency characteristics of the wavelet filterbank are shown in the second figure above.
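The scale grid described above can be sketched as follows. The sketch assumes that the channel with scale a is centered at f₀/a, which follows from the time scaling $\psi((t-b)/a)$ of the wavelet in Eq. (15), and it includes both endpoints of the stated range of p, which gives K+1 scale values. Whether both endpoints are used in the actual implementation is not stated in the text.

```python
import numpy as np

K = 128        # number of filters
F0 = 600.0     # center frequency of the basic wavelet in Hz
FS = 20000.0   # sampling frequency in Hz

alpha = 10.0 ** (2.0 / K)             # ratio between adjacent scales
half = K // 2
p = np.arange(-half, half + 1)        # -K/2 <= p <= K/2 (both ends included)
a = alpha ** p                        # scale parameters, from 0.1 to 10

# Assumption: scaling time by a moves the center frequency from f0 to f0 / a.
center_freqs = F0 / a

# Shift parameter b = q / f_s for the first few integer q (one sample apart).
q = np.arange(5)
b = q / FS
```

With these values the center frequencies are spaced logarithmically and cover the stated passband of 60 Hz to 6000 Hz, i.e. two decades with K/2 steps per decade.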


Masashi Unoki
2000-10-26