Machine Learning Techniques, Features, Datasets, and Algorithm Performance Parameters for Sentiment Analysis: A Systematic Review

Date
2022
Authors
Ondara, Bernard
Waithaka, Stephen
Kandiri, John
Publisher
COAS
Abstract
This paper reviews studies on current machine learning techniques for sentiment analysis, with the primary aim of identifying the most suitable combinations of techniques, datasets, data features, and algorithm performance parameters used in most applications. To this end, we performed a systematic review of 24 articles, published between 2013 and 2020, covering machine learning techniques for sentiment analysis. The review shows that Support Vector Machine and Naïve Bayes are the most popular machine learning techniques, word stems and n-grams are the most extensively applied features, and Twitter data is the predominant dataset. The review further reveals that the performance of machine learning algorithms depends on many factors, including the dataset, the extracted features, and the size of the data used. Accuracy is the most commonly used algorithm performance metric. These findings offer researchers and businesses useful guidance for selecting suitable techniques, features, and datasets for sentiment analysis in business applications such as brand reputation monitoring.
Description
Article
Keywords
sentiment analysis, machine learning technique, machine learning algorithm, sentiment classification technique, sentiment classification algorithm
Citation
Ondara, B., Waithaka, S., Kandiri, J., & Muchemi, L. (2022). Machine Learning Techniques, Features, Datasets, and Algorithm Performance Parameters for Sentiment Analysis: A Systematic Review. Open Journal for Information Technology, 5(1), 1.