
Publication Information


Title
Japanese:
English: Co-speech Gesture Generation with Variational Auto Encoder
Authors
Japanese: 賈 宸一, 篠田 浩一.
English: Shinichi Ka, Koichi Shinoda.
Language: English
Journal / Book Title
Japanese:
English: Lecture Notes in Computer Science on MultiMedia Modeling (MMM 2024)
Volume, Issue, Pages: vol. 14556
Publication Date: January 28, 2024
Publisher
Japanese:
English: Springer, Cham
Conference Name
Japanese:
English: Multimedia Modeling (MMM) 2024
Venue
Japanese:
English: Amsterdam
Official Link: https://mmm2024.org/index.html
 
DOI https://doi.org/10.1007/978-3-031-53311-2_12
Abstract: The research field of generating natural gestures from speech input is called co-speech gesture generation. Co-speech gesture generation methods should satisfy two requirements: fidelity and diversity. Several previous studies have used deterministic methods that establish a one-to-one mapping between speech and motion to achieve fidelity to speech, but the variety of gestures they produce is limited. Other methods generate gestures probabilistically to make them diverse, but they often lack fidelity to the speech. To overcome these limitations, we propose Speaker-aware Audio2Gesture (SA2G), an extension of the previously proposed A2G, which uses a variational autoencoder (VAE) with randomized speaker-aware features as input. By using ST-GCNs as encoders and controlling the variance used for randomization, it can generate gestures that are faithful to the speech content while also exhibiting large variety. In our evaluation on TED datasets, it improves the fidelity of the generated gestures over the baseline by 85.4, while increasing Multimodality by 9.0×10^(-3).
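The abstract describes generating gestures from a VAE whose latent code comes from randomized speaker-aware features, with the variance of the randomization kept under control. The following PyTorch sketch is only a rough illustration of that idea, not the authors' SA2G implementation: the plain MLP encoders (standing in for the paper's ST-GCN encoders), all layer sizes, and the var_scale parameter are assumptions for illustration.

```python
# Illustrative sketch (not the authors' code): a VAE-style generator that
# combines a deterministic speech code with a randomized speaker-aware code,
# where var_scale (a hypothetical knob) controls the randomization variance.
import torch
import torch.nn as nn

class SpeakerAwareVAE(nn.Module):
    def __init__(self, audio_dim=128, speaker_dim=32, latent_dim=64, pose_dim=27):
        super().__init__()
        # Speech encoder; the paper uses ST-GCN encoders, an MLP stands in here.
        self.audio_enc = nn.Sequential(nn.Linear(audio_dim, 128), nn.ReLU(),
                                       nn.Linear(128, latent_dim))
        # Speaker-aware branch: predicts mean and log-variance for randomization.
        self.spk_mu = nn.Linear(speaker_dim, latent_dim)
        self.spk_logvar = nn.Linear(speaker_dim, latent_dim)
        # Decoder maps the combined latent code to a pose vector.
        self.decoder = nn.Sequential(nn.Linear(2 * latent_dim, 128), nn.ReLU(),
                                     nn.Linear(128, pose_dim))

    def forward(self, audio_feat, speaker_feat, var_scale=1.0):
        # Deterministic content code from speech preserves fidelity to the audio.
        content = self.audio_enc(audio_feat)
        # Randomized speaker-aware code provides diversity; the reparameterization
        # trick with a scaled standard deviation controls how strong it is.
        mu, logvar = self.spk_mu(speaker_feat), self.spk_logvar(speaker_feat)
        std = torch.exp(0.5 * logvar) * var_scale
        z = mu + std * torch.randn_like(std)
        return self.decoder(torch.cat([content, z], dim=-1)), mu, logvar

# Usage: one forward pass on dummy audio and speaker features.
model = SpeakerAwareVAE()
pose, mu, logvar = model(torch.randn(8, 128), torch.randn(8, 32), var_scale=1.0)
print(pose.shape)  # torch.Size([8, 27])
```

In this reading, fidelity comes from the deterministic speech branch and diversity from sampling the speaker-aware code, with the variance scale trading one off against the other; the actual architecture and training losses are detailed in the paper.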
