Research on key technologies of speech recognition based on deep learning
11 October 2023
Jing Luo
Proceedings Volume 12800, Sixth International Conference on Computer Information Science and Application Technology (CISAT 2023); 128002Z (2023) https://doi.org/10.1117/12.3004054
Event: 6th International Conference on Computer Information Science and Application Technology (CISAT 2023), 2023, Hangzhou, China
Abstract
Research on speech recognition shows that accuracy in practical application scenarios is still limited, because speech is degraded by the user's unclear pronunciation, environmental noise, and other factors. This paper proposes a key deep-learning-based speech recognition technique for resource browsing in campus enrollment. First, the computational advantages of deep learning and the problems that arise during audio acquisition are analyzed. An FFT algorithm is then proposed for noise reduction and enhancement of speech, and an LSTM algorithm for enhancement and recognition of speech. The experimental results show that the combined FFT+LSTM algorithm achieves high speech recognition accuracy, reduces recognition response time, and offers better flexibility and practicability.
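The abstract does not say how the FFT is used for noise reduction. As an illustration only, the following minimal sketch assumes a simple spectral-thresholding scheme: transform a frame with the FFT, zero the frequency bins whose magnitude falls below a fraction of the strongest bin, and transform back. The function names (fft, ifft, denoise_frame, mse), the threshold_ratio parameter, and the synthetic test signal are hypothetical and are not taken from the paper. The LSTM recognition stage is not shown.

```python
import cmath
import math
import random


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def ifft(spectrum):
    """Inverse FFT via the identity ifft(X) = conj(fft(conj(X))) / n."""
    n = len(spectrum)
    transformed = fft([v.conjugate() for v in spectrum])
    return [v.conjugate() / n for v in transformed]


def denoise_frame(frame, threshold_ratio=0.2):
    """Assumed scheme: zero every bin whose magnitude is below
    threshold_ratio times the largest magnitude in the frame."""
    spectrum = fft(frame)
    peak = max(abs(v) for v in spectrum)
    cleaned = [v if abs(v) >= threshold_ratio * peak else 0j for v in spectrum]
    return [v.real for v in ifft(cleaned)]


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


# Synthetic stand-in for a speech frame: one sine tone plus Gaussian noise.
n = 256
clean = [math.sin(2 * math.pi * 8 * t / n) for t in range(n)]
rng = random.Random(0)  # fixed seed so the demo is repeatable
noisy = [s + rng.gauss(0.0, 0.2) for s in clean]
restored = denoise_frame(noisy)
```

Comparing mse(restored, clean) with mse(noisy, clean) shows how much of the added noise the thresholding removes on this toy signal. The sketch uses only the standard library so that it is self-contained; a practical implementation would more likely use numpy.fft.rfft on short overlapping windowed frames and estimate the noise floor from non-speech segments.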
(2023) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Jing Luo "Research on key technologies of speech recognition based on deep learning", Proc. SPIE 12800, Sixth International Conference on Computer Information Science and Application Technology (CISAT 2023), 128002Z (11 October 2023); https://doi.org/10.1117/12.3004054
KEYWORDS
Speech recognition
Denoising
Fourier transforms
Detection and tracking algorithms
Data modeling
Deep learning
Education and training