Article Info

Leveraging Transfer Learning and Label Optimization for Enhanced Traditional Chinese Medicine NER Performance

Saidah Saad, Huang Zikun
dx.doi.org/10.17576/apjitm-2024-1301-04

Abstract

Named Entity Recognition (NER) is a crucial component in many domains, including the medical and financial fields, because it identifies text fragments in unstructured text that belong to predefined categories. Over time, NER algorithms have evolved from dictionary-based approaches to machine learning and deep learning techniques. Transfer learning, a more recent deep learning method, has shown impressive results in NER tasks. However, transfer learning models still face challenges, such as limited entity labels and the impact of noisy datasets. To address these challenges, this research aims to optimise the application of deep learning models to NER and achieve better results. The research first applied the BERT+CRF model to the WanChuang dataset, obtaining an F1-measure of 89.1%. This established the feasibility of using transfer learning models for NER on Chinese medical data and served as the baseline for comparison in the project. To address label-related issues in the baseline model, a scheme was proposed to tune the learning rate of the CRF layer, which raised the F1-measure to 91.0%. Additionally, to mitigate the impact of noisy training data, a 10-fold retraining scheme was introduced to optimise the training set. Retraining the model on the optimised training set achieved the best F1-measure, 92.7%. The experiments demonstrated that the transfer learning model enhances entity extraction, while the optimised CRF layer effectively captures the internal relationships between entity tags, thus improving overall performance. This research contributes to advancing NER techniques and their application in various domains.
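The abstract does not spell out how the 10-fold retraining scheme optimises the training set. One plausible reading, offered here only as an assumption, is a cross-validation pass in which each fold is labelled by a model trained on the other nine folds, and examples whose predicted label disagrees with the given label are flagged for review. The minimal Python sketch below illustrates that idea with a toy majority-vote tagger standing in for BERT+CRF; the names `k_fold_indices`, `flag_suspect_examples`, and `majority_label_trainer` are hypothetical and do not come from the paper.

```python
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str]  # (token, label): a toy stand-in for a tagged sentence
Predictor = Callable[[str], str]


def k_fold_indices(n: int, k: int) -> List[List[int]]:
    """Split range(n) into k contiguous folds whose sizes differ by at most one."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def flag_suspect_examples(
    data: List[Example],
    train_fn: Callable[[List[Example]], Predictor],
    k: int = 10,
) -> List[int]:
    """Return indices whose label disagrees with a model trained on the other folds.

    Assumption: this is one possible interpretation of "10-fold retraining",
    not the procedure confirmed by the paper.
    """
    suspects = []
    for held_out in k_fold_indices(len(data), k):
        held = set(held_out)
        train = [ex for i, ex in enumerate(data) if i not in held]
        predict = train_fn(train)
        for i in held_out:
            token, label = data[i]
            if predict(token) != label:
                suspects.append(i)
    return suspects


def majority_label_trainer(train: List[Example]) -> Predictor:
    """Toy model: predict the most frequent label seen for a token, else 'O'."""
    counts: Dict[str, Dict[str, int]] = {}
    for token, label in train:
        per_token = counts.setdefault(token, {})
        per_token[label] = per_token.get(label, 0) + 1

    def predict(token: str) -> str:
        seen = counts.get(token)
        return max(seen, key=seen.get) if seen else "O"

    return predict


if __name__ == "__main__":
    # Nine consistent labels, one deliberately noisy label at index 9.
    data = [("ginseng", "HERB")] * 9 + [("ginseng", "O")] + [("the", "O")] * 10
    print(flag_suspect_examples(data, majority_label_trainer, k=10))
```

In a real pipeline the flagged examples would be corrected or removed before the final retraining run, and `train_fn` would wrap the actual tagger rather than a frequency table.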

Keywords

Named Entity Recognition, Traditional Chinese medicine, transfer learning, BERT, CRF

Area

Knowledge Technology