
Separation block


  
\begin{figure}\center
\epsfile{file=FIGURE/Comp.eps,width=0.9\linewidth}
\caption{Signal processing of a separation block.}
\end{figure}

The separation block determines $A_k(t)$, $B_k(t)$, $\theta_{1k}(t)$, and $\theta_{2k}(t)$ from $S_k(t)$ and $\phi_k(t)$ using constraints (ii) and (iv) in the determined concurrent time-frequency region. The improvement to the auditory sound segregation model made in this paper is to reconsider the constraints on the continuity of $\theta_{1k}(t)$ as well as those on the continuity of $A_k(t)$ and $F_0(t)$. Constraint (ii) is implemented such that $C_{k,R}(t)$ and $D_{k,R}(t)$ are linear ($R=1$) polynomials, in order to reduce the computational cost of estimating them. Under this assumption, $A_k(t)$ and $\theta_{1k}(t)$, which are allowed to change over time within the region, are constrained to be second-order polynomials ($A_k(t)=\int C_{k,1}(t)dt+C_{k,0}'$ and $\theta_{1k}(t)=\int D_{k,1}(t)dt+D_{k,0}'$). Then, substituting $dA_k(t)/dt=C_{k,R}(t)$ into Eq. ([*]), we obtain a linear differential equation for the input phase difference $\theta_k(t)=\theta_{2k}(t)-\theta_{1k}(t)$. Solving this equation gives the general solution

 \begin{displaymath}\theta_k(t)=\arctan\left(\frac{S_k(t)\sin(\phi_k(t)-\theta_{1k}(t))}{S_k(t)\cos(\phi_k(t)-\theta_{1k}(t))+C_k(t)}\right),
\end{displaymath} (32)

where $C_k(t)=-\int C_{k,R}(t)dt-C_{k,0}=-A_k(t)$ [Unoki and Akagi 1999].
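
As an illustration of how Eq. (32) might be evaluated at a single time instant, the following is a minimal sketch, not the authors' implementation. The function name `input_phase_difference` and its argument names are hypothetical. The sketch also assumes that `math.atan2` is an acceptable quadrant-aware substitute for the $\arctan$ of the ratio; the two agree whenever the denominator is positive.

```python
import math


def input_phase_difference(s_k, phi_k, theta_1k, a_k):
    """Evaluate Eq. (32) at one time instant, using C_k(t) = -A_k(t).

    s_k      : instantaneous amplitude S_k(t) of the channel output
    phi_k    : instantaneous output phase phi_k(t)
    theta_1k : instantaneous input phase theta_1k(t)
    a_k      : instantaneous amplitude A_k(t)
    """
    c_k = -a_k
    numerator = s_k * math.sin(phi_k - theta_1k)
    denominator = s_k * math.cos(phi_k - theta_1k) + c_k
    # Assumption: atan2 replaces arctan(numerator / denominator) so that the
    # result stays defined when the denominator is zero or negative.
    return math.atan2(numerator, denominator)
```

When $A_k(t)=0$, so that $C_k(t)=0$, the expression reduces to $\theta_k(t)=\phi_k(t)-\theta_{1k}(t)$ (up to phase wrapping), which gives a quick sanity check on the formula.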

The signal flow of the separation block is shown in Fig. [*]. In the segment $T_h-T_{h-1}$, which is determined by $E_{0,R}(t)=0$, the terms $A_k(t)$, $B_k(t)$, $\theta_{1k}(t)$, and $\theta_{2k}(t)$ are determined by the following steps. First, the estimation regions, $\hat{C}_{k,0}(t)-P_k(t) \leq C_{k,1}(t) \leq \hat{C}_{k,0}(t)+P_k(t)$ and $\hat{D}_{k,0}(t)-Q_k(t) \leq D_{k,1}(t) \leq \hat{D}_{k,0}(t)+Q_k(t)$, are determined using the Kalman filter, where $\hat{C}_{k,0}(t)$ and $\hat{D}_{k,0}(t)$ are the estimated values and $P_k(t)$ and $Q_k(t)$ are the estimated errors (see Appendix A). Next, candidates for $C_{k,1}(t)$ at any $D_{k,1}(t)$ are selected using spline interpolation within the estimated error region (see Appendix B). Then, $\hat{C}_{k,1}(t)$ is determined by

 \begin{displaymath}\hat{C}_{k,1}=\mathop{\arg\max}_{\hat{C}_{k,0}-P_k\leq C_{k,1}\leq \hat{C}_{k,0}+P_k} \frac{\langle\hat{A}_k,\hat{\hat{A}}_k\rangle}{\vert\vert\hat{A}_k\vert\vert \vert\vert\hat{\hat{A}}_k\vert\vert},
\end{displaymath} (33)

where $\hat{A}_k(t)$ is obtained by the spline interpolation and $\hat{\hat{A}}_k(t)$ is determined across channels so as to satisfy constraint (iii), as follows:

\begin{displaymath}\hat{\hat{A}}_k(t)=\frac{1}{N_{F_0}} \sum_{\ell\in {\bf{L}},\ell \not=k} \frac{\hat{A}_{\ell}(t)}{\Vert\hat{A}_\ell(t)\Vert},
\end{displaymath} (34)

where ${\bf{L}}$ is the set of $\ell$ that satisfy Eq. ([*]). Finally, $\hat{D}_{k,1}(t)$ is determined by

 \begin{displaymath}\hat{D}_{k,1}=\mathop{\arg\max}_{\hat{D}_{k,0}-Q_k\leq D_{k,1}\leq \hat{D}_{k,0}+Q_k} \frac{\langle\hat{A}_k,\hat{\hat{A}}_k\rangle}{\vert\vert\hat{A}_k\vert\vert \vert\vert\hat{\hat{A}}_k\vert\vert}.
\end{displaymath} (35)

Since $\theta_{1k}(t)$ and $\theta_k(t)$ are determined from $\hat{D}_{k,1}(t)$ and $\hat{C}_{k,1}(t)$, the terms $A_k(t)$, $B_k(t)$, and $\theta_{2k}(t)$ can be determined from Eq. ([*]), Eq. ([*]), and $\theta_{2k}(t)=\theta_k(t)+\theta_{1k}(t)$, respectively.
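
The selection steps above can be sketched in code as a bounded search. This is a rough illustration under several assumptions, not the procedure of Appendices A and B. The score is assumed to be the cosine similarity between $\hat{A}_k$ and $\hat{\hat{A}}_k$, which matches the product of norms in the denominators of Eqs. (33) and (35). The search is assumed to be a uniform grid over the interval given by the Kalman estimate and its error. $N_{F_0}$ is assumed to equal the number of channels in ${\bf{L}}$ other than $k$. The callable `envelope_for` is a hypothetical stand-in for the spline interpolation step, and all function names are illustrative.

```python
import math


def norm(x):
    """Euclidean norm of a sampled envelope."""
    return math.sqrt(sum(v * v for v in x))


def cosine_similarity(x, y):
    """Inner product of x and y divided by the product of their norms."""
    return sum(a * b for a, b in zip(x, y)) / (norm(x) * norm(y))


def reference_envelope(other_envelopes):
    """Sketch of Eq. (34): mean of unit-normalized envelopes from the
    channels in L other than k. Assumes N_F0 == len(other_envelopes)."""
    n = len(other_envelopes)
    ref = [0.0] * len(other_envelopes[0])
    for env in other_envelopes:
        scale = norm(env)
        for i, v in enumerate(env):
            ref[i] += v / (scale * n)
    return ref


def select_candidate(center, half_width, envelope_for, reference, n_grid=21):
    """Grid search over [center - half_width, center + half_width].

    center       : Kalman estimate (e.g. C-hat_{k,0})
    half_width   : estimated error (e.g. P_k)
    envelope_for : hypothetical callable mapping a candidate value to a
                   sampled envelope A-hat_k (stands in for Appendix B)
    reference    : sampled envelope A-hat-hat_k from reference_envelope
    """
    best_value, best_score = None, -math.inf
    for j in range(n_grid):
        cand = center - half_width + 2.0 * half_width * j / (n_grid - 1)
        score = cosine_similarity(envelope_for(cand), reference)
        if score > best_score:
            best_value, best_score = cand, score
    return best_value, best_score
```

Under these assumptions the same `select_candidate` routine could be reused for $\hat{D}_{k,1}$ by passing $\hat{D}_{k,0}$ and $Q_k$ as the interval center and half-width. Because the score is scale-invariant, the choice of normalization constant in `reference_envelope` does not affect which candidate is selected.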


Masashi Unoki
2000-10-26