Publication Information


Title
Japanese: 
English: Attentive Statistics Pooling for Deep Speaker Embedding
Authors
Japanese: 岡部 浩司, 越仲 孝文, 篠田 浩一.
English: Koji Okabe, Takafumi Koshinaka, Koichi Shinoda.
Language English
Journal/Book Title
Japanese: 
English: Proc. Interspeech 2018
Volume, Issue, Pages         pp. 2252-2256
Publication Date September 4, 2018
Publisher
Japanese: 
English: ISCA
Conference Name
Japanese: 
English: Interspeech 2018
Venue
Japanese: ハイデラバード
English: Hyderabad
Files
Official Link https://www.isca-speech.org/archive/Interspeech_2018/pdfs/0993.pdf
 
DOI https://doi.org/10.21437/Interspeech.2018-993
Abstract This paper proposes attentive statistics pooling for deep speaker embedding in text-independent speaker verification. In conventional speaker embedding, frame-level features are averaged over all the frames of a single utterance to form an utterance-level feature. Our method utilizes an attention mechanism to give different weights to different frames and generates not only weighted means but also weighted standard deviations. In this way, it can capture long-term variations in speaker characteristics more effectively. An evaluation on the NIST SRE 2012 and the VoxCeleb data sets shows that it reduces equal error rates (EERs) from the conventional method by 7.5% and 8.1%, respectively.
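
The pooling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `attentive_stats_pooling` is hypothetical, and the per-frame attention scores are taken as given inputs, since the abstract does not say how they are computed (in practice they would presumably come from a small learned network). The sketch assumes the scores are normalized with a softmax over frames, then uses the resulting weights to form a weighted mean and a weighted standard deviation for each feature dimension.

```python
import math


def attentive_stats_pooling(frames, scores):
    """Pool frame-level features into one utterance-level vector.

    frames: list of T frame vectors, each a list of D floats.
    scores: list of T unnormalized attention scores, one per frame
            (assumed to be supplied by some scoring model).
    Returns a list of 2*D floats: D weighted means, then D weighted
    standard deviations.
    """
    if not frames or len(frames) != len(scores):
        raise ValueError("need at least one frame and one score per frame")

    # Softmax over frames, shifted by the max score for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    dim = len(frames[0])
    # Weighted mean per feature dimension.
    mean = [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]
    # Weighted second moment per feature dimension.
    second = [
        sum(w * f[d] * f[d] for w, f in zip(weights, frames)) for d in range(dim)
    ]
    # Weighted variance is E_w[x^2] - (E_w[x])^2, clamped at 0 against round-off.
    std = [math.sqrt(max(m2 - m * m, 0.0)) for m2, m in zip(second, mean)]
    return mean + std


# Equal scores give equal weights, i.e. an unweighted mean and standard deviation.
pooled = attentive_stats_pooling([[1.0, 2.0], [3.0, 6.0]], [0.0, 0.0])
```

With equal scores the weights are uniform, so the output reduces to the plain per-dimension mean and standard deviation; unequal scores shift both statistics toward the frames that receive more weight.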
