Mr. HO, Tuan Vu, Human Life Design Area, received the Student Best Paper Award at NCSP2018.
Mr. HO, Tuan Vu (a first-year doctoral student in the Akagi Lab. of the Human Life Design Area) received the Student Best Paper Award at the International Workshop on Nonlinear Circuits, Communications and Signal Processing 2018 (NCSP2018).
NCSP is an international workshop on nonlinear circuits, communications and signal processing, organized annually by the Research Institute of Signal Processing (RISP). This year, NCSP'18 was held in Honolulu, USA, from March 4th to 7th, marking the 15th time the workshop has been held. The Student Best Paper Award is given to students whose papers the Technical Program Committee recognizes as particularly outstanding among those published by student presenters (first authors) at NCSP 2018.
Reference: http://www.risp.jp/NCSP18/
■Date Awarded
March 7, 2018
■Title
Non-parallel Dictionary-based Voice Conversion using Variational Autoencoder with Modulation Spectrum-constrained Training
■Article
In this paper, we present a non-parallel voice conversion (VC) approach that does not require parallel data or linguistic labeling for the training process. Dictionary-based voice conversion is a class of methods that aim to decompose speech into separate factors for manipulation. Non-negative matrix factorization (NMF) is the most common method for decomposing an input spectrum into a weighted linear combination of a set of dictionaries (bases). However, the requirement for parallel training data in this method causes several problems: 1) limited practical usability when parallel data are not available, and 2) additional errors from the alignment process, which degrade output speech quality. To alleviate these problems, this paper presents a dictionary-based VC approach that incorporates a Variational Autoencoder (VAE) to decompose the input speech spectrum into a speaker dictionary and weights without parallel training data. According to the evaluation results, the proposed method achieved better speech naturalness while retaining the same speaker similarity as NMF-based VC, even though unaligned data were used.
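As a rough illustration of the dictionary-based idea mentioned in the abstract, the sketch below factorizes a non-negative matrix V into a dictionary W and weights H using the standard multiplicative-update form of NMF. It is not the awarded paper's VAE-based method. The function name `nmf`, the matrix sizes, and the random toy data standing in for a magnitude spectrogram are all assumptions made for this example.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factorize non-negative V (bins x frames) as W @ H.

    W holds the dictionary (one basis spectrum per column) and
    H holds the per-frame weights. Uses multiplicative updates that
    minimize the squared (Frobenius) reconstruction error.
    """
    rng = np.random.default_rng(seed)
    n_bins, n_frames = V.shape
    W = rng.random((n_bins, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Update weights with the dictionary held fixed.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        # Update dictionary with the weights held fixed.
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy data (assumption): a random low-rank non-negative matrix
# used in place of a real magnitude spectrogram.
rng = np.random.default_rng(1)
V = rng.random((64, 8)) @ rng.random((8, 100))

W, H = nmf(V, rank=8)
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_error:.4f}")
```

In NMF-based voice conversion, a dictionary of this kind is typically built from time-aligned (parallel) source and target utterances, which is the requirement the abstract says the proposed method removes.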
■Comment
I would like to express my appreciation to the NCSP'18 committee board and the Research Institute of Signal Processing Japan for recognizing me with the "NCSP'18 Student Paper Award". I am truly honored to receive it. I would like to express my sincere gratitude to Professor Masato Akagi, my supervisor at Japan Advanced Institute of Science and Technology, as none of this work would have been possible without his kind support. I also want to extend my appreciation to all the lab members at the Acoustic Information Science Laboratory for all the help and support they provided me during this study. This award is an important milestone that will encourage me to keep working harder in the future.
April 25, 2018