Dataset Construction and Opinion Holder Detection Using Pre-trained Models

  • Al-Mahmud Kyushu Institute of Technology
  • Kazutaka Shimada Kyushu Institute of Technology
Keywords: Opinion holder, opinion holder detection, BERT, DistilBERT, CSE, CRF, logistic regression.


With the growing prevalence of the Internet, more and more people and entities express opinions on online platforms such as Facebook, Twitter, and Amazon. Because tracking online opinion trends manually is becoming infeasible, an automatic approach to detecting opinion holders is essential for identifying specific concerns about a particular topic, product, or problem. Opinion holder detection comprises two steps: determining whether an opinion holder is present in the text, and identifying the opinion holder. The present study examines both steps. We first treat the presence step as a binary classification problem with two classes, INSIDE and OUTSIDE. We then treat the identification of opinion holders as a sequence labeling task and prepare an appropriate English-language dataset. We employ three pre-trained models for opinion holder detection: BERT, DistilBERT, and contextual string embedding (CSE). For the binary classification task, we apply a logistic regression model on the top layers of the BERT and DistilBERT models and compare the models’ performance in terms of F1 score and accuracy. Experimental results show that DistilBERT performed better, with an F1 score of 0.901 and an accuracy of 0.924. For the opinion holder identification task, we use both feature-based and fine-tuning-based architectures, and we further combine CSE and a conditional random field (CRF) with BERT and DistilBERT. For the feature-based architecture, we use five models: CSE+CRF, BERT+CRF, (BERT&CSE)+CRF, DistilBERT+CRF, and (DistilBERT&CSE)+CRF. For the fine-tuning-based architecture, we use six models: BERT, BERT+CRF, (BERT&CSE)+CRF, DistilBERT, DistilBERT+CRF, and (DistilBERT&CSE)+CRF. All models are evaluated in terms of F1 score and processing time. The experimental results indicate that the feature-based and fine-tuning-based (DistilBERT&CSE)+CRF models both achieved the best performance, with an F1 score of 0.9453. However, feature-based CSE+CRF had the lowest processing time, 49 s, while yielding an F1 score comparable to that of the best-performing models.
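To make the two steps concrete, the following is a minimal, model-free sketch of how a sequence-labeling output could be turned into opinion holder spans and an INSIDE/OUTSIDE decision, and how F1 score and accuracy could be computed for the binary task. It is not the authors' implementation: the tag names `B-HOLDER`, `I-HOLDER`, and `O`, the helper names, the reading of INSIDE as "a holder appears in the text", and the example sentence are all assumptions made for illustration. F1 is computed for the INSIDE class only, which may differ from the averaging used in the study.

```python
from typing import List, Tuple


def extract_holders(tokens: List[str], tags: List[str]) -> List[str]:
    """Join contiguous B-/I-HOLDER tokens into opinion holder strings.

    The BIO tag names are hypothetical; the abstract does not state the scheme.
    """
    holders: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B-HOLDER":
            if current:  # close a span that is still open
                holders.append(" ".join(current))
            current = [token]
        elif tag == "I-HOLDER" and current:
            current.append(token)
        else:
            if current:
                holders.append(" ".join(current))
            current = []
    if current:  # span that runs to the end of the sentence
        holders.append(" ".join(current))
    return holders


def presence_label(tags: List[str]) -> str:
    """Assumed reading: INSIDE if any holder span is tagged, else OUTSIDE."""
    return "INSIDE" if "B-HOLDER" in tags else "OUTSIDE"


def f1_and_accuracy(gold: List[str], pred: List[str],
                    positive: str = "INSIDE") -> Tuple[float, float]:
    """F1 for the positive class and overall accuracy of binary predictions."""
    pairs = list(zip(gold, pred))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = sum(g == p for g, p in pairs) / len(pairs)
    return f1, accuracy


# Invented example sentence with hypothetical gold tags.
tokens = ["The", "prime", "minister", "said", "the", "plan", "is", "risky", "."]
tags = ["B-HOLDER", "I-HOLDER", "I-HOLDER", "O", "O", "O", "O", "O", "O"]

holders = extract_holders(tokens, tags)
label = presence_label(tags)

# Toy binary predictions for four sentences.
gold = ["INSIDE", "INSIDE", "OUTSIDE", "OUTSIDE"]
pred = ["INSIDE", "OUTSIDE", "OUTSIDE", "OUTSIDE"]
f1, acc = f1_and_accuracy(gold, pred)
```

In this sketch a tagger (for example, one of the CRF-based models listed above) would supply `tags`; the helpers only post-process and score its output.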



Technical Papers