Year of Graduation

2022

Document Type

Thesis

Major

Mathematics & Statistics

Directing Professor

Dr. Julie M. Clark

Abstract

Sentiment Analysis, an up-and-coming subfield of Natural Language Processing (NLP), holds largely untapped potential for driving better business decision-making. In this paper, we employ state-of-the-art sentiment analysis tools to compare the performance of traditional classification algorithms – namely Support Vector Machines (SVMs), bagging, boosting, random forest, and decision tree classifiers – on insurance-related textual data. We demonstrate that algorithms such as bagging and boosting, which were designed to enhance the performance of simpler algorithms such as decision tree classifiers, offer only marginal improvements in classification accuracy and certain other performance metrics on our data. Moreover, the improved accuracy comes at the cost of slightly higher runtimes. Insurance companies could apply these findings to choose suitable algorithms and gain a more nuanced understanding of the needs of their insureds.

Index Terms— Sentiment Analysis, Textual Analysis, Machine Learning, Natural Language Processing (NLP), Opinion Mining (OM)
