Learning With Annotation of Various Degrees

Abstract

In this paper, we study a new problem in sequence labeling. Specifically, we consider training data annotated to various degrees: fully labeled, unlabeled, and partially labeled sequences. Learning from fully labeled or unlabeled sequences corresponds to the standard settings of traditional supervised and unsupervised learning, while the proposed partial labeling specifies the subject to which an element does not belong. Partially labeled data are less informative than fully labeled data but cheaper to obtain, especially when the task requires a lot of domain knowledge. To address this practical challenge, we propose a novel deep Conditional Random Field (CRF) model that is trained end to end and handles fully labeled, unlabeled, and partially labeled sequences within a unified framework. To the best of our knowledge, this is among the first works to use partially labeled instances for sequence labeling, and the proposed algorithm unifies deep learning and CRF in an end-to-end framework. Extensive experiments show that our method achieves state-of-the-art performance on two sequence labeling tasks across several popular data sets.
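
The abstract does not spell out the training objective, so the following is only a minimal sketch of one common way to treat all three annotation degrees uniformly in a linear-chain CRF, not the paper's actual model. It assumes that each position carries a set of labels that are still possible: a single label for a fully labeled element, every label for an unlabeled one, and every label except the excluded ones for a partially labeled one. The function names, the `allowed` representation, and the plain-list score tables are all hypothetical, and the deep network that would produce the emission scores is omitted.

```python
import math

NEG_INF = float("-inf")

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))

def constrained_log_partition(emissions, transitions, allowed):
    """Forward algorithm restricted to label paths permitted by `allowed`.

    emissions[t][k]   : score of label k at position t (assumed given)
    transitions[j][k] : score of moving from label j to label k
    allowed[t]        : set of labels still possible at position t
    """
    num_labels = len(transitions)
    alpha = [emissions[0][k] if k in allowed[0] else NEG_INF
             for k in range(num_labels)]
    for t in range(1, len(emissions)):
        alpha = [
            emissions[t][k] + logsumexp(
                [alpha[j] + transitions[j][k] for j in range(num_labels)])
            if k in allowed[t] else NEG_INF
            for k in range(num_labels)
        ]
    return logsumexp(alpha)

def marginal_log_likelihood(emissions, transitions, allowed):
    """Log-probability mass of all label paths consistent with `allowed`."""
    num_labels = len(transitions)
    everything = [set(range(num_labels))] * len(emissions)
    return (constrained_log_partition(emissions, transitions, allowed)
            - constrained_log_partition(emissions, transitions, everything))

# Toy example: 3 positions, 2 labels, all scores zero (uniform model).
emissions = [[0.0, 0.0] for _ in range(3)]
transitions = [[0.0, 0.0], [0.0, 0.0]]
fully_labeled = [{0}, {1}, {0}]            # one known label per position
unlabeled = [{0, 1}, {0, 1}, {0, 1}]       # nothing known
partially_labeled = [{0}, {0, 1}, {0, 1}]  # position 0 is known not to be label 1
```

Under this assumed formulation, a "does not belong to" annotation simply removes the excluded labels from `allowed[t]`. A fully unlabeled sequence gives a log-likelihood of zero here, so it contributes no training signal in this sketch; how the paper itself exploits unlabeled sequences is not stated in the abstract and is not reproduced here.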

Publication
IEEE Transactions on Neural Networks and Learning Systems (TNNLS 2019)
Hao Zhang
Staff Algorithm Engineer