Improving Noise Robustness for Spoken Content Retrieval using Semi-supervised ASR and N-best Transcripts for BERT-based Ranking Models

01/15/2023
by Yasufumi Moriya, et al.

BERT-based re-ranking and dense retrieval (DR) systems have been shown to improve search effectiveness for spoken content retrieval (SCR). However, both methods can still lose effectiveness when they use ASR transcripts rather than accurate manual transcripts. We find that a known-item search task on the How2 dataset of spoken instruction videos shows a reduction in mean reciprocal rank (MRR) scores of 10-14%. To address this disparity, we investigate the use of semi-supervised ASR transcripts and N-best ASR transcripts to mitigate ASR errors for spoken search using BERT-based ranking. Semi-supervised ASR transcripts brought 2-5.5% MRR improvements over standard ASR transcripts, and our N-best early fusion methods for BERT DR systems improved MRR by 3-4%. Combining semi-supervised ASR transcripts with N-best early fusion for BERT DR reduced the MRR gap in search effectiveness between manual and ASR transcripts by more than 50%.
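
For readers unfamiliar with the metric, the following is a minimal sketch of how MRR is typically computed for a known-item search task, where each query has exactly one target item. The function name, the dictionary layout, and the toy data are assumptions made for illustration; they are not taken from the paper or its evaluation code.

```python
def mean_reciprocal_rank(rankings, targets):
    """Average of 1/rank of the known item over all queries.

    rankings: dict mapping query id -> list of item ids, best first (hypothetical layout)
    targets:  dict mapping query id -> the single known item for that query
    A query whose target is missing from its ranked list contributes 0.
    """
    if not rankings:
        raise ValueError("rankings must contain at least one query")
    total = 0.0
    for query_id, ranked_items in rankings.items():
        target = targets[query_id]
        if target in ranked_items:
            # list.index is 0-based, ranks are 1-based
            total += 1.0 / (ranked_items.index(target) + 1)
    return total / len(rankings)


# Toy data: the target is ranked 1st, 2nd, and not retrieved at all.
rankings = {
    "q1": ["a", "b", "c"],
    "q2": ["b", "a", "c"],
    "q3": ["x", "y", "z"],
}
targets = {"q1": "a", "q2": "a", "q3": "w"}
mrr = mean_reciprocal_rank(rankings, targets)
```

Under this definition, comparing the MRR of a system run on manual transcripts with the MRR of the same system run on ASR transcripts gives the kind of gap the abstract reports.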
