Date of Award:

5-2021

Document Type:

Thesis

Degree Name:

Master of Science (MS)

Department:

Computer Science

Committee Chair(s)

Shuhan Yuan

Committee

Shuhan Yuan

Committee

Xiaojun Qi

Committee

Dean Mathias

Abstract

Online social networks provide people with convenient platforms to communicate and share life moments. However, because these platforms allow users to remain anonymous, cases of online hate speech are increasing. Hate speech is defined by the Cambridge Dictionary as “public speech that expresses hate or encourages violence towards a person or group based on something such as race, religion, sex, or sexual orientation”. Online hate speech has serious negative effects on legitimate users, including mental or emotional stress, reputational damage, and fear for one’s safety. To protect legitimate online users, automatic hate speech detection techniques are deployed on various social media platforms. However, most existing hate speech detection models require a large amount of labeled data for training. In this thesis, we focus on achieving hate speech detection without using many labeled samples. In particular, we consider three scenarios of hate speech detection and propose three corresponding approaches. (i) When we have only limited labeled data for one social media platform, we fine-tune a pre-trained language model to conduct hate speech detection on that platform. (ii) When we have data from several social media platforms, each with only a small amount of labeled data, we develop a multitask learning model to detect hate speech on several platforms in parallel. (iii) When we aim to conduct hate speech detection on a new social media platform for which we have no labeled data, we use domain adaptation to transfer knowledge from other related social media platforms to the new platform. Empirical studies show that our proposed approaches can achieve good performance on hate speech detection in a low resource setting.

Checksum

04454e1c4c78ccf1405ea30ba7fdb283

Recommended Citation

Li, Peiyu, "Achieving Hate Speech Detection in a Low Resource Setting" (2021). All Graduate Theses and Dissertations, Spring 1920 to Summer 2023. 8097.
https://digitalcommons.usu.edu/etd/8097

Included in

Computer Sciences Commons

Copyright for this work is retained by the student. If you have any questions regarding the inclusion of this work in the Digital Commons, please email us at DigitalCommons@usu.edu.

DOI

https://doi.org/10.26076/1ed8-eb08
