Machine Learning Techniques, Features, Datasets, and Algorithm Performance Parameters for Sentiment Analysis: A Systematic Review
Date
2022
Authors
Ondara, Bernard
Waithaka, Stephen
Kandiri, John
Publisher
COAS
Abstract
The purpose of this paper is to review various studies on current machine learning techniques
used in sentiment analysis with the primary focus on finding the most suitable combinations of
the techniques, datasets, data features, and algorithm performance parameters used in most
applications. To accomplish this, we performed a systematic review of 24 articles published
between 2013 and 2020 covering machine learning techniques for sentiment analysis. The review
shows that Support Vector Machine and Naïve Bayes are the most popular machine learning
techniques, word stems and n-grams are the most extensively applied features, and Twitter
data is the predominant dataset. This review further revealed that machine
learning algorithms' performance depends on many factors, including the dataset, extracted
features, and size of data used. Accuracy is the most commonly used algorithm performance
metric. These findings offer important information for researchers and businesses
selecting suitable techniques, features, and datasets for sentiment analysis in business
applications such as brand reputation monitoring.
Description
Article
Keywords
sentiment analysis, machine learning technique, machine learning algorithm, sentiment classification technique, sentiment classification algorithm
Citation
Ondara, B., Waithaka, S., Kandiri, J., & Muchemi, L. (2022). Machine Learning Techniques, Features, Datasets, and Algorithm Performance Parameters for Sentiment Analysis: A Systematic Review. Open Journal for Information Technology, 5(1), 1.