Stance Classification of Health Posts on Social Media with FastText Embedding and Deep Learning

Ernest Lim; Esther Irawati Setiawan; Joan Santoso

doi:10.52985/insyst.v1i2.86

Authors

Ernest Lim, Institut Sains dan Teknologi Terpadu Surabaya
Esther Irawati Setiawan, Institut Sains dan Teknologi Terpadu Surabaya, https://orcid.org/0000-0002-7163-3556
Joan Santoso, Institut Sains dan Teknologi Terpadu Surabaya

Keywords:

Indonesian language, deep learning, fastText, social media, stance classification

Abstract

Misinformation is an increasingly common phenomenon on social media, and Facebook, one of the largest social media platforms in Indonesia, is no exception. Several studies have examined techniques for identifying and classifying stance on Indonesian social media. However, their use of Word2Vec as the word embedding is limited in its ability to recognize new words. This limitation motivates the use of fastText embeddings in this study. Using a deep learning approach, the study focuses on model performance in classifying the stance of one Facebook health post title toward another post title. The stance is one of for (agree), observing (neutral), or against (disagree). The dataset consists of 3,500 post titles: 500 claim sentences, each with six stance sentences. The fastText-based model in this study achieves a macro F1 score of 64%.
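The advantage of fastText over Word2Vec for new words can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the general fastText idea of composing a word vector from character n-grams (lengths 3 to 6, with `<` and `>` boundary markers). The function names, the vector dimension, the pseudo-random stand-in for trained n-gram embeddings, and the example words `vaksin` and `vaksinasi` are all hypothetical choices made for illustration.

```python
import random

def char_ngrams(word, n_min=3, n_max=6):
    """Character n-grams of a word wrapped in '<' and '>' boundary markers."""
    wrapped = f"<{word}>"
    return [
        wrapped[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(wrapped) - n + 1)
    ]

def ngram_vector(gram, dim=8):
    """Hypothetical stand-in for a trained n-gram embedding lookup."""
    rng = random.Random(gram)  # seeded by the n-gram, so the vector is reproducible
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]

def word_vector(word, dim=8):
    """Average the n-gram vectors; also works for words unseen during training."""
    vectors = [ngram_vector(g, dim) for g in char_ngrams(word)]
    return [sum(column) / len(vectors) for column in zip(*vectors)]

# A word absent from the vocabulary still shares n-grams with related known words,
# so it receives a vector built partly from the same components.
shared = set(char_ngrams("vaksin")) & set(char_ngrams("vaksinasi"))
print(sorted(shared))
print(word_vector("vaksinasi"))
```

A lookup-table embedding such as Word2Vec has no entry for a word it never saw, whereas the composition above always yields a vector as long as the word produces at least one n-gram.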
