We propose a novel phrase break prediction method that combines implicit features extracted from a pre-trained large language model, BERT, with linguistic features. In conventional BiLSTM-based methods, word representations and/or sentence representations are used as independent components. The proposed method takes account of both representations to extract latent semantics that previous methods cannot capture. Objective evaluation results show that the proposed method obtains an absolute improvement of 3.2 points in F1 score over conventional methods using linguistic features. Perceptual listening test results verify that a TTS system applying the proposed method achieved a mean opinion score of 4.39 in prosody naturalness.
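To make the reported metric concrete, the following is a minimal sketch of how an F1 score over phrase-break labels, and an absolute gain in points, could be computed. It assumes a binary label per word boundary (1 = break, 0 = no break) and F1 measured on the break class; the abstract does not state the exact labeling or averaging scheme. The function name `break_f1` and the toy label sequences are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def break_f1(reference: Sequence[int], predicted: Sequence[int]) -> float:
    """F1 score for the 'break' class (label 1) over aligned word boundaries."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r == 1 and p == 1)  # correctly predicted breaks
    fp = sum(1 for r, p in pairs if r == 0 and p == 1)  # spurious breaks
    fn = sum(1 for r, p in pairs if r == 1 and p == 0)  # missed breaks
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels invented for illustration only (assumption, not paper data).
reference = [0, 1, 0, 0, 1, 0, 1, 0]
baseline = [0, 1, 0, 1, 0, 0, 1, 0]   # hypothetical baseline output
proposed = [0, 1, 0, 0, 1, 0, 1, 1]   # hypothetical proposed-method output

f1_baseline = break_f1(reference, baseline)
f1_proposed = break_f1(reference, proposed)

# "Absolute improvement in points" is the difference of the two scores
# expressed on a 0-100 scale, not a relative percentage change.
gain_points = 100 * (f1_proposed - f1_baseline)
print(f"baseline F1 = {f1_baseline:.3f}, proposed F1 = {f1_proposed:.3f}, "
      f"gain = {gain_points:.1f} points")
```

Under this reading, a 3.2-point absolute improvement means the two F1 scores differ by 0.032 on a 0-1 scale.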

Author(s) : Kosuke Futamata, Byeongseon Park, Ryuichi Yamamoto, Kentaro Tachibana

Keywords : method - methods - proposed - representations - phrase
