Next: Calculation of the four Up: Auditory segregation model Previous: Formulation of the problem

Auditory filterbank

First, we describe the wavelet transform and the inverse wavelet transform, which are used to design the auditory filterbank.

If $\psi\in L^2({\bf {R}})$ satisfies the ``admissibility'' condition:

 \begin{displaymath}D_{\psi}:=\int_{-\infty}^{\infty} \frac{\vert\hat{\psi}(\omega)\vert^2}{\vert\omega\vert}d\omega < \infty,
\end{displaymath} (9)

where $\hat{\psi}$ is the Fourier transform of $\psi$, then $\psi$ is called a ``basic wavelet''. Relative to every basic wavelet $\psi$, the integral wavelet transform on $L^2({\bf {R}})$ is defined by

 \begin{displaymath}\tilde{f}(a,b)=\frac{1}{\sqrt{\vert a\vert}}\int_{-\infty}^\infty f(t)\overline{\psi\left(\frac{t-b}{a}\right)}dt,
\end{displaymath} (10)

where $a$ is the ``scale parameter'', $b$ is the ``shift parameter'', and $a,b \in {\bf {R}}$ with $a\not=0$. If, in addition, $\psi\in L^1({\bf {R}})$, then $\hat{\psi}$ is a continuous function, so that finiteness of $D_{\psi}$ in Eq. (9) implies $\hat{\psi}(0)=0$, or equivalently, $\int_{-\infty}^{\infty} \psi(t)dt=0$.
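The step from finiteness of $D_{\psi}$ to $\hat{\psi}(0)=0$ can be spelled out as follows. This is only a sketch: it assumes that $\hat{\psi}$ is continuous and uses the Fourier-transform convention $\hat{\psi}(\omega)=\int_{-\infty}^{\infty}\psi(t)e^{-j\omega t}dt$, which the text does not state explicitly. The constants $c$ and $\delta$ are auxiliary symbols introduced only for this argument.

```latex
\begin{displaymath}
\hat{\psi}(0)=\int_{-\infty}^{\infty}\psi(t)e^{-j\cdot 0\cdot t}dt=\int_{-\infty}^{\infty}\psi(t)dt .
\end{displaymath}
If $\hat{\psi}(0)\neq 0$, continuity gives $\vert\hat{\psi}(\omega)\vert^2\geq c>0$
for all $\vert\omega\vert<\delta$, and then
\begin{displaymath}
D_{\psi}\geq\int_{-\delta}^{\delta}\frac{c}{\vert\omega\vert}d\omega=\infty ,
\end{displaymath}
which contradicts the admissibility condition. Hence $\hat{\psi}(0)=0$.
```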

If $\psi(t)$ is a basic wavelet, then the inverse wavelet transform exists for all $t$ as follows:

 \begin{displaymath}f(t)=\frac{1}{D_\psi} \int_{-\infty}^\infty \int_{-\infty}^\infty \tilde{f}(a,b)\frac{1}{\sqrt{\vert a\vert}}\psi\left(\frac{t-b}{a}\right)\frac{dadb}{a^2}.
\end{displaymath} (11)

Moreover, if we let $\psi(t)$ be a complex basic wavelet, then the integral wavelet transform can be represented by

\begin{displaymath}\tilde{f}(a,b)=\vert\tilde{f}(a,b)\vert e^{j \arg(\tilde{f}(a,b))},
\end{displaymath} (12)

where $\vert\tilde{f}(a,b)\vert$ is the amplitude spectrum and $\arg(\tilde{f}(a,b))$ is the phase spectrum.
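As a rough numerical illustration of Eqs. (10) and (12), the following sketch approximates the integral wavelet transform at a single point $(a,b)$ by a midpoint Riemann sum and splits the result into amplitude and phase. The function names, the integration limits, and the Morlet-like test wavelet `morlet_like` are hypothetical choices made for this example only. They are not taken from the text, and the test wavelet satisfies the admissibility condition only approximately.

```python
import cmath
import math


def wavelet_transform(f, psi, a, b, t_min, t_max, n_steps=20000):
    """Approximate Eq. (10) at one (a, b) with a midpoint Riemann sum."""
    if a == 0:
        raise ValueError("scale parameter a must be nonzero")
    dt = (t_max - t_min) / n_steps
    total = 0j
    for k in range(n_steps):
        t = t_min + (k + 0.5) * dt
        # f(t) times the complex conjugate of the dilated, shifted wavelet
        total += f(t) * psi((t - b) / a).conjugate()
    return total * dt / math.sqrt(abs(a))


def amplitude_and_phase(value):
    """Split a transform value into |f~(a,b)| and arg f~(a,b), as in Eq. (12)."""
    return abs(value), cmath.phase(value)


def morlet_like(t):
    """Hypothetical complex test wavelet: Gaussian envelope times a 1 Hz carrier."""
    return math.exp(-t * t / 2.0) * cmath.exp(2j * math.pi * t)


def tone(t):
    """Test signal: a 1 Hz cosine."""
    return math.cos(2.0 * math.pi * t)


value = wavelet_transform(tone, morlet_like, a=1.0, b=0.25, t_min=-10.0, t_max=10.0)
amplitude, phase = amplitude_and_phase(value)
```

For this test signal the amplitude comes out close to $\sqrt{2\pi}/2 \approx 1.2533$ and the phase close to $2\pi b = \pi/2$, so the phase spectrum tracks the shift parameter while the amplitude spectrum does not depend on it.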

Second, to construct the auditory filterbank, we use the gammatone filter as the analyzing wavelet. The gammatone filter is an auditory filter designed by Patterson [Patterson et al. 1994] that simulates the response of the basilar membrane. The impulse response of the gammatone filter is given by

\begin{displaymath}gt(t)=At^{N-1}e^{-2\pi b_f t} \cos(2\pi f_0 t), \qquad t \geq 0,
\end{displaymath} (13)

where $At^{N-1}e^{-2\pi b_f t}$ is the amplitude term represented by the Gamma distribution and $f_0$ is the center frequency. In addition, the amplitude characteristics of the gammatone filter are represented approximately by

\begin{displaymath}GT(f)\approx \left[1+\frac{j(f-f_0)}{b_f}\right]^{-N},\qquad 0 < f < \infty,
\end{displaymath} (14)

where $GT(f)$ is the Fourier transform of $gt(t)$. The characteristics of the gammatone filter are shown in Fig. [*]. To obtain phase information, we extend the impulse response of the gammatone filter to a complex-valued basic wavelet. This basic wavelet is represented by

 \begin{displaymath}\psi(t)=At^{N-1}e^{j2\pi f_0 t-2\pi b_f t},
\end{displaymath} (15)

using the Hilbert transform. This analyzing wavelet satisfies the admissibility condition approximately, because $GT(0) \approx 0$.
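The relations among Eqs. (13), (14), and (15) can be checked numerically with a short sketch. The function names and the choice $A=1$ are assumptions made for illustration. The default parameters are the ones quoted in the figure caption for the gammatone filter, and the wavelet is taken to be zero for $t<0$, as in Eq. (13).

```python
import cmath
import math


def gammatone(t, f0=600.0, bf=22.99, n=4, amp=1.0):
    """Real impulse response gt(t) of Eq. (13); zero for t < 0."""
    if t < 0:
        return 0.0
    envelope = amp * t ** (n - 1) * math.exp(-2.0 * math.pi * bf * t)
    return envelope * math.cos(2.0 * math.pi * f0 * t)


def gammatone_wavelet(t, f0=600.0, bf=22.99, n=4, amp=1.0):
    """Complex wavelet psi(t) of Eq. (15); assumed zero for t < 0."""
    if t < 0:
        return 0j
    return amp * t ** (n - 1) * cmath.exp((2j * math.pi * f0 - 2.0 * math.pi * bf) * t)


def gammatone_gain(f, f0=600.0, bf=22.99, n=4):
    """Approximate amplitude characteristic |GT(f)| from Eq. (14)."""
    return abs(1.0 + 1j * (f - f0) / bf) ** (-n)


# The real part of the complex wavelet reproduces the gammatone impulse response.
real_part_matches = abs(gammatone_wavelet(0.002).real - gammatone(0.002)) < 1e-15

# Gain at the center frequency and at f = 0 (relevant to admissibility).
gain_at_center = gammatone_gain(600.0)
gain_at_zero = gammatone_gain(0.0)
```

With these parameters, `gain_at_center` is 1 and `gain_at_zero` is on the order of $2\times10^{-6}$, which is the sense in which $GT(0) \approx 0$ holds only approximately.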
  
Figure: Impulse response and amplitude characteristics of the gammatone filter ($f_0=600$ Hz, $N=4$, $b_f=22.99$).
\begin{figure}
\epsfile{file=FIGURE/gammatone.eps,width=0.47\textwidth}
\end{figure}


  
Figure: Frequency characteristics of the wavelet filterbank.
\begin{figure}
\epsfile{file=FIGURE/FBproperty.eps,width=0.47\textwidth}
\end{figure}

Finally, the auditory filterbank is designed with a center frequency $f_0$ of 600 Hz, a passband from 60 Hz to 6000 Hz, and $K=128$ filters. This auditory filterbank is implemented on a computer using a discrete wavelet transform under the following conditions: sampling frequency $f_s=20$ kHz, scale parameter $a=\alpha^p$ with $-{K}/{2} \leq p \leq {K}/{2}$ and $\alpha=10^{2/K}$, and shift parameter $b=q/f_s$, where $p,q\in {\bf {Z}}$. Frequency characteristics of the wavelet filterbank are shown in Fig. [*].
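The discretization above can be sketched as follows. The function names are hypothetical, and the channel center frequency is assumed to be $f_0/a$, which follows from dilating the wavelet of Eq. (15) by $a$ but is not written out in the text. Note that the inclusive range of $p$ yields $K+1$ scale values; how these are mapped onto $K$ filters is not stated in the text, so the sketch simply returns all of them.

```python
def filterbank_scales(k=128):
    """Scale parameters a = alpha**p for -K/2 <= p <= K/2, with alpha = 10**(2/K)."""
    alpha = 10.0 ** (2.0 / k)
    half = k // 2
    return [alpha ** p for p in range(-half, half + 1)]


def center_frequencies(f0=600.0, k=128):
    """Channel center frequencies f0 / a (assumed relation, see lead-in)."""
    return [f0 / a for a in filterbank_scales(k)]


def shift(q, fs=20000.0):
    """Shift parameter b = q / fs in seconds for integer sample index q."""
    return q / fs


scales = filterbank_scales()
frequencies = center_frequencies()
```

Because $\alpha^{K/2}=10$, the scales run from $0.1$ to $10$, so the center frequencies run from 6000 Hz down to 60 Hz, with 600 Hz at $p=0$. This matches the stated passband.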


Masashi Unoki
2000-10-26