Paper
7 December 2023 Optimization of dedicated domain text classification based on data augmentation using BERT generation pre-trained model
Yuzhe Xiang, Can Cui, Xinjie Zhu, Zhi Wang
Proceedings Volume 12941, International Conference on Algorithms, High Performance Computing, and Artificial Intelligence (AHPCAI 2023); 129411I (2023) https://doi.org/10.1117/12.3011550
Event: Third International Conference on Algorithms, High Performance Computing, and Artificial Intelligence (AHPCAI 2023), 2023, Yinchuan, China
Abstract
Because data collection and annotation in dedicated domains are costly, artificial intelligence models are often trained on insufficient data and cannot reach optimal performance. To address this problem, a text generation data augmentation method based on the BERT model is proposed in this paper to expand the small amount of available annotated data. Experiments demonstrate the effectiveness of this augmentation method: in a text classification experiment, it improves performance over training on the source data alone by 2.9%.
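The abstract does not detail the generation procedure, but a common way to use a BERT-style pre-trained model for text data augmentation is masked-token replacement: mask random tokens in an annotated sentence and let the language model fill them back in, yielding label-preserving paraphrases. The sketch below illustrates that idea; the `predict_mask` stub stands in for a real BERT fill-mask model (e.g. a `transformers` fill-mask pipeline) and is a hypothetical placeholder, not the authors' method.

```python
import random

# HYPOTHETICAL stand-in for a BERT fill-mask model: given tokens and a
# masked position, return candidate replacement tokens. In practice this
# would query a pre-trained masked language model; stubbed so the sketch runs.
def predict_mask(tokens, mask_idx):
    candidates = {
        "good": ["great", "fine"],
        "model": ["system", "network"],
        "data": ["samples", "records"],
    }
    return candidates.get(tokens[mask_idx], [tokens[mask_idx]])

def augment(sentence, mask_prob=0.3, seed=0):
    """Create one augmented variant: mask random tokens with probability
    mask_prob and replace each with a language-model prediction."""
    rng = random.Random(seed)
    tokens = sentence.split()
    out = []
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            out.append(rng.choice(predict_mask(tokens, i)))
        else:
            out.append(tok)
    return " ".join(out)

original = "good model needs data"
# Different seeds give different augmented variants of the same labeled example.
augmented = [augment(original, seed=s) for s in range(4)]
```

Each augmented sentence keeps the original's length and label while varying surface wording, which is how such methods enlarge a small annotated training set.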
(2023) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Yuzhe Xiang, Can Cui, Xinjie Zhu, and Zhi Wang "Optimization of dedicated domain text classification based on data augmentation using BERT generation pre-trained model", Proc. SPIE 12941, International Conference on Algorithms, High Performance Computing, and Artificial Intelligence (AHPCAI 2023), 129411I (7 December 2023); https://doi.org/10.1117/12.3011550
KEYWORDS: Data modeling, Education and training, Transformers, Mathematical optimization, Semantics, Statistical modeling, Machine learning