Paper
19 February 2024
Automatic speech recognition based on attention-enhanced blockformer
Wei Liu, Tianyu Zhan, Chunsheng Xu
Proceedings Volume 13063, Fourth International Conference on Computer Vision and Data Mining (ICCVDM 2023); 130630Q (2024) https://doi.org/10.1117/12.3021478
Event: Fourth International Conference on Computer Vision and Data Mining (ICCVDM 2023), 2023, Changchun, China
Abstract
The Blockformer speech recognition model was recently proposed as a state-of-the-art (SOTA) model on the Aishell-1 Chinese speech dataset. It achieves a significant improvement in character error rate (CER) over its baseline, Conformer. The key improvement of Blockformer is the addition of a Squeeze-and-Excitation (SE) block on top of Conformer, which enables better use of the information contained in each Conformer block. In studying Blockformer, we identified room to improve its block-information extraction method. To this end, we use an attention mechanism to enhance the SE block's efficacy in squeezing block information, and we adjust the model's structure in the attention inference mode to align it more closely with the training structure. Under the four inference modes, namely attention, attention rescoring, CTC greedy search, and CTC prefix beam search, the CER reaches 4.67%, 4.43%, 4.75%, and 4.75%, respectively. All of these results match or improve on those of Blockformer.
© (2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Wei Liu, Tianyu Zhan, and Chunsheng Xu "Automatic speech recognition based on attention-enhanced blockformer", Proc. SPIE 13063, Fourth International Conference on Computer Vision and Data Mining (ICCVDM 2023), 130630Q (19 February 2024); https://doi.org/10.1117/12.3021478
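To make the fused-block idea concrete, below is a minimal PyTorch sketch of an attention-enhanced SE fusion as we read it from the abstract: the output of every Conformer block is squeezed by attention pooling over time (in place of the plain average pooling of a standard SE squeeze), and an SE-style excitation produces per-block weights for the final sum. All names (AttentionEnhancedSE, pool_query, reduction) and shapes here are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class AttentionEnhancedSE(nn.Module):
    """Hypothetical sketch: fuse the outputs of N Conformer blocks with an
    SE block whose squeeze step is replaced by attention pooling."""

    def __init__(self, num_blocks: int, d_model: int, reduction: int = 4):
        super().__init__()
        # Learned query that attention-pools each block output over time,
        # standing in for the average pooling of the original SE squeeze.
        self.pool_query = nn.Parameter(torch.randn(1, 1, d_model))
        self.attn = nn.MultiheadAttention(d_model, num_heads=1, batch_first=True)
        # Standard SE excitation: a bottleneck MLP yielding per-block weights.
        hidden = num_blocks * d_model // reduction
        self.excite = nn.Sequential(
            nn.Linear(num_blocks * d_model, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_blocks),
            nn.Sigmoid(),
        )

    def forward(self, block_outputs):
        # block_outputs: list of N tensors, each (batch, time, d_model)
        pooled = []
        for h in block_outputs:
            q = self.pool_query.expand(h.size(0), -1, -1)
            # Attention-based squeeze: one query attends over the time axis.
            s, _ = self.attn(q, h, h)            # (batch, 1, d_model)
            pooled.append(s.squeeze(1))          # (batch, d_model)
        z = torch.cat(pooled, dim=-1)            # (batch, N * d_model)
        w = self.excite(z)                       # (batch, N) block weights
        stacked = torch.stack(block_outputs, 1)  # (batch, N, time, d_model)
        # Weighted sum of block outputs -> fused (batch, time, d_model)
        return (w[:, :, None, None] * stacked).sum(dim=1)
```

For instance, with 12 Conformer blocks of width 256, this module maps a list of twelve (batch, time, 256) tensors to one fused (batch, time, 256) representation; using a learned query for the squeeze lets the block weighting focus on informative frames rather than averaging all frames uniformly.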