This paper addresses multi-channel speaker separation using a deep delay-and-subtraction beamformer.
A deep neural network (DNN) is first applied to estimate the time delays between the speakers and the microphones, and then each speaker's speech is recovered from the mixed signals with a delay-and-subtraction algorithm. We evaluated our method
on simulated data generated from the WSJCAM0 database. The proposed method achieved high-precision source localization
and an approximately 62% relative improvement in word error rate (WER) over a delay-and-sum (DS) beamformer.
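To make the separation step concrete, the sketch below illustrates a two-microphone delay-and-subtraction (null-steering) operation, assuming the DNN has already supplied the interfering speaker's inter-microphone delay. The function names and the FFT-based fractional delay are illustrative choices, not the paper's implementation.

```python
import numpy as np

def fractional_delay(x, delay_s, fs):
    """Delay signal x by delay_s seconds (possibly a fraction of a sample)
    using an FFT-domain phase shift."""
    n = len(x)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    X = np.fft.rfft(x)
    return np.fft.irfft(X * np.exp(-2j * np.pi * freqs * delay_s), n)

def delay_and_subtract(x1, x2, tau_interferer, fs):
    """Steer a spatial null toward the interfering speaker: align its
    component in the two microphone signals and subtract, so the coherent
    interferer cancels while the target speaker remains (filtered)."""
    return x1 - fractional_delay(x2, tau_interferer, fs)

# Usage sketch: cancel an interferer whose signal reaches microphone 1
# 0.25 ms later than microphone 2 (delay value is hypothetical).
fs = 16000
x1 = np.random.randn(fs)   # stand-in for the microphone 1 recording
x2 = np.random.randn(fs)   # stand-in for the microphone 2 recording
y = delay_and_subtract(x1, x2, tau_interferer=0.25e-3, fs=fs)
```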