In this simulation, f1(t) was a vowel /a/ synthesized by the log magnitude approximation (LMA) [Imai et al. 1977, Imai 1978], as shown in Fig. 15, whose fundamental frequency fluctuated around its average with a jitter of 5 Hz (from 123 to 128 Hz), and f2(t) was bandpassed pink noise with a bandwidth of about 6 kHz. NF0 was set to 40. Five types of f(t) were used as simulation stimuli, with SNRs ranging from 0 to 20 dB in 5-dB steps.
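The construction of the five stimuli can be illustrated with a minimal sketch, assuming that f(t) = f1(t) + f2(t) and that the SNR is defined as the ratio of the average powers of the two components. The harmonic tone and white noise below are simple stand-ins for the LMA-synthesized vowel and the bandpassed pink noise, and the sampling rate, duration, F0 value, and function names are hypothetical choices made only for illustration.

```python
import numpy as np

def measured_snr_db(speech, noise):
    """SNR in dB, taken here as the ratio of average powers (an assumption)."""
    return 10.0 * np.log10(np.mean(speech ** 2) / np.mean(noise ** 2))

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech + scaled noise has the requested SNR."""
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    gain = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise

fs = 20000                              # hypothetical sampling rate in Hz
t = np.arange(int(0.5 * fs)) / fs       # hypothetical 0.5-s duration
f0 = 125.0                              # hypothetical F0 inside the 123-128 Hz range

# Stand-in for the synthesized vowel: ten harmonics with 1/k amplitudes.
speech = sum(np.sin(2.0 * np.pi * f0 * k * t) / k for k in range(1, 11))

# Stand-in for the bandpassed pink noise: white Gaussian noise.
rng = np.random.default_rng(0)
noise = rng.standard_normal(t.size)

# Five stimuli with SNRs from 0 to 20 dB in 5-dB steps.
snrs_db = [0, 5, 10, 15, 20]
stimuli = {snr: mix_at_snr(speech, noise, snr) for snr in snrs_db}
```

Subtracting `speech` from any entry of `stimuli` recovers the scaled noise, so `measured_snr_db` can be used to confirm that each mixture has the intended SNR.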
For example, when the SNR of f(t) was 10 dB, as shown in Fig. 16, the proposed method could segregate Ak(t) with high accuracy and could extract the signal shown in Fig. 17 from f(t). The precision for Ak(t) in this case is shown in Fig. 18 (top panel), and the average SDs of the extracted signal and of f(t) over the five simulations are shown in Fig. 18 (bottom panel). Comparing the proposed method with condition 3, it was possible to improve the precision by about 4.8 dB and the spectrum distortion by about 7.3 dB as noise reduction. Comparing the amplitude spectrum of the original signal f1(t) with that of the extracted signal or of f(t) shows that the proposed method clearly reduced the noise component in the observed amplitude spectrum, as shown in Fig. 19. The amplitude spectra in this figure are shown for a single frame at the midpoint of the signal duration. Hence, the proposed model could also extract, with high accuracy, the amplitude information of speech f1(t) from noisy speech f(t) in which speech and noise existed in the same frequency region. This method can therefore be applied in cases where a speech signal is to be extracted from noisy speech.
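As a rough illustration of how a spectrum distortion in dB might be computed for a single frame, the sketch below uses a common log-spectral distance: the root-mean-square difference between two amplitude spectra expressed in dB. This definition is an assumption and may differ from the SD used for Fig. 18; the function name, frame length, and test signals are hypothetical.

```python
import numpy as np

def log_spectral_distortion(reference, estimate, eps=1e-12):
    """RMS difference in dB between the amplitude spectra of two frames.

    Assumed definition; `eps` only guards against log10(0).
    """
    ref_db = 20.0 * np.log10(np.abs(np.fft.rfft(reference)) + eps)
    est_db = 20.0 * np.log10(np.abs(np.fft.rfft(estimate)) + eps)
    return np.sqrt(np.mean((ref_db - est_db) ** 2))

rng = np.random.default_rng(1)
frame = rng.standard_normal(512)                 # stand-in for one frame of f1(t)
noisy = frame + rng.standard_normal(512)         # stand-in for the same frame of f(t)
cleaner = frame + 0.1 * rng.standard_normal(512) # stand-in for the extracted signal

sd_noisy = log_spectral_distortion(frame, noisy)
sd_cleaner = log_spectral_distortion(frame, cleaner)
# Under this definition, the noise reduction would be reported as the difference.
reduction_db = sd_noisy - sd_cleaner
```

With such a measure, a reduction of the kind reported above would correspond to the SD of the extracted signal lying several dB below the SD of the noisy input f(t), both measured against f1(t).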