Text Embedding Augmentation Based on Retraining With Pseudo-Labeled Adversarial Embedding

Pre-trained language models (LMs) have been shown to achieve outstanding performance on various natural language processing tasks. However, these models contain a very large number of parameters in order to handle large-scale text corpora during pre-training, and they therefore risk overfitting when fine-tuned on small task-oriented datasets. In this paper, we propose a text embedding augmentation method to prevent such overfitting. The proposed method augments a text embedding by generating an adversarial embedding that is not identical to the original input embedding but retains its characteristics, using PGD-based (projected gradient descent) adversarial training on the input text data.
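For illustration, here is a minimal PyTorch sketch of PGD-based adversarial embedding generation. The function name, the assumption that `model` maps embeddings directly to classification logits, and the hyperparameters `eps`, `alpha`, and `steps` are all illustrative choices, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def pgd_adversarial_embedding(model, embeddings, labels, eps=0.01, alpha=0.003, steps=3):
    """Sketch: perturb an input embedding with PGD so it differs from the
    original while staying within an L_inf ball of radius eps around it."""
    adv = embeddings.clone().detach()  # start from the original embedding
    for _ in range(steps):
        adv.requires_grad_(True)
        loss = F.cross_entropy(model(adv), labels)
        grad, = torch.autograd.grad(loss, adv)
        with torch.no_grad():
            adv = adv + alpha * grad.sign()                   # ascend the loss
            delta = torch.clamp(adv - embeddings, -eps, eps)  # project back into the ball
            adv = embeddings + delta
        adv = adv.detach()
    return adv
```

The projection step is what keeps the adversarial embedding close enough to preserve the characteristics of the original input embedding.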

A pseudo-label identical to the label of the input text is then assigned to the adversarial embedding, and retraining is conducted by using the adversarial embedding and its pseudo-label as an input embedding-label pair for a separate LM. Experimental results on several text classification benchmark datasets demonstrate that the proposed method effectively prevents the overfitting that commonly occurs when adapting a large-scale pre-trained LM to a specific task.
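A sketch of one retraining step under the same assumptions: the adversarial embedding (generated with the `pgd_adversarial_embedding` helper above) is paired with a pseudo-label equal to the original label and used to train a separate LM. Here `separate_lm` and `source_lm` are hypothetical classifiers operating directly on embeddings.

```python
import torch.nn.functional as F

def retraining_step(separate_lm, source_lm, embeddings, labels, optimizer):
    """Sketch: retrain a separate LM on (adversarial embedding, pseudo-label) pairs."""
    # Generate adversarial embeddings against the source (pre-trained) LM.
    adv = pgd_adversarial_embedding(source_lm, embeddings, labels)
    pseudo_labels = labels.clone()  # pseudo-label is identical to the original label

    loss = F.cross_entropy(separate_lm(adv), pseudo_labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Whether the separate LM is also trained on the original embedding-label pairs is not specified in the abstract; the sketch above uses only the adversarial pairs it describes.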
