Deep One-Class Fine-Tuning for Imbalanced Short Text Classification in Transfer Learning
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
The abundance of user-generated online content poses significant challenges for handling big data. One challenge is analyzing short posts on social media, with tasks ranging from sentiment identification to abusive content detection. Despite recent advances in pre-trained language models and transfer learning for textual data analysis, classification performance is still hindered by imbalanced data, where anomalous data makes up only a small portion of the dataset. To address this, we propose Deep One-Class Fine-Tuning (DOCFT), a versatile method for fine-tuning transfer learning-based textual classifiers. DOCFT uses a one-class SVM-style hyperplane to encapsulate anomalous data. The approach involves a two-step fine-tuning process and an alternating optimization method based on a custom OC-SVM loss function and quantile regression. In evaluations on four different hate-speech datasets, our method achieves significant performance improvements.
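As a rough illustration of the alternating scheme named in the abstract, the sketch below applies the standard one-class SVM objective to a fixed feature matrix. It is not the authors' implementation. The function names `oc_svm_loss` and `alternating_fit`, the linear scorer, the use of a plain NumPy array in place of language-model embeddings of the target class, and all hyperparameter values are assumptions made for illustration. The only fact it relies on is that, for fixed weights, the offset `rho` minimizing the standard OC-SVM objective is the `nu`-quantile of the scores.

```python
import numpy as np


def oc_svm_loss(scores, w, rho, nu):
    """Standard one-class SVM objective for a linear scorer (illustrative)."""
    hinge = np.maximum(0.0, rho - scores)  # penalty for scores below the offset
    return 0.5 * np.dot(w, w) + hinge.mean() / nu - rho


def alternating_fit(features, nu=0.1, lr=0.01, epochs=50, seed=0):
    """Alternate between a gradient step on w and a quantile update of rho.

    `features` is assumed to hold fixed embeddings of the target class;
    in a fine-tuning setting the encoder would be updated as well.
    """
    rng = np.random.default_rng(seed)
    n, d = features.shape
    w = rng.normal(scale=0.1, size=d)
    rho = 0.0
    for _ in range(epochs):
        scores = features @ w
        # Step A: hold rho fixed, take a subgradient step on w.
        violating = (scores < rho).astype(float)
        grad_w = w - (violating[:, None] * features).sum(axis=0) / (nu * n)
        w = w - lr * grad_w
        # Step B: hold w fixed, set rho to the nu-quantile of the scores.
        rho = np.quantile(features @ w, nu)
    return w, rho
```

Under these assumptions, roughly a fraction `nu` of the training scores falls below `rho` after fitting, which is the sense in which the quantile step controls how tightly the hyperplane encloses the target class.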
Open Access Status
This publication is not available as open access