Forecasting Public Opinions from Twitter Data using Regression and Time Series Methods

Authors

  • Arun Kodirekka, A. Srinagesh

Abstract

Forecasting public opinion is useful in business decision making. In the 21st century, most data is available online: users read product reviews and share their own opinions publicly. These reviews are free-form, unstructured text, and sentiment analysis is used to extract opinions from them. Forecasting is the process of assessing future values, and machine learning supports it through two families of models: regression and time series. The work in this paper proceeds in two steps. The first step extracts daily sentiment scores, in numerical form, from tweets. The second step applies regression models and time series models to those daily sentiment scores. The regression algorithms used are linear regression, random forest regression, Poisson regression, and support vector regression (SVR) with linear, polynomial, and radial kernels. The regression models are evaluated with mean squared error (MSE) and R-squared. Linear regression and SVR with the linear and radial kernels attain an R-squared of more than 97%. Finally, two time series methods are applied: the exponentially weighted moving average (EWMA) and exponential smoothing (Esmoothing). The EWMA is tuned through its smoothing parameter, and exponential smoothing is tuned by choosing the parameter that minimizes the root mean squared error (RMSE); the tuned models give good forecasting results.
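The parameter tuning described for exponential smoothing can be illustrated with a minimal sketch. It is not the authors' implementation: the daily scores below are made-up values, the function names are hypothetical, and both the initialisation (first forecast equals the first observation) and the grid of candidate smoothing parameters are assumptions chosen for illustration.

```python
import math


def ses_one_step_forecasts(series, alpha):
    """One-step-ahead forecasts from simple exponential smoothing.

    forecasts[t] is the prediction for series[t] using only series[:t].
    Assumption: the first forecast is set to the first observation.
    """
    level = series[0]
    forecasts = [level]
    for value in series[:-1]:
        # Blend the newest observation with the previous smoothed level.
        level = alpha * value + (1 - alpha) * level
        forecasts.append(level)
    return forecasts


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def best_alpha(series, candidates):
    """Return (alpha, rmse) for the candidate with the lowest one-step RMSE.

    The first point is skipped because its forecast equals the
    observation by construction.
    """
    scored = [
        (rmse(series[1:], ses_one_step_forecasts(series, a)[1:]), a)
        for a in candidates
    ]
    error, alpha = min(scored)
    return alpha, error


# Hypothetical daily sentiment scores (illustrative values, not from the paper).
daily_scores = [0.12, 0.18, 0.15, 0.22, 0.30, 0.27, 0.35, 0.33, 0.40, 0.38]
candidates = [i / 10 for i in range(1, 10)]  # 0.1, 0.2, ..., 0.9

alpha, error = best_alpha(daily_scores, candidates)
print(f"selected alpha={alpha}, one-step RMSE={error:.4f}")
```

A grid search keeps the sketch dependency-free; a numerical optimiser over the smoothing parameter would serve the same purpose.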

Published

2020-04-30

Section

Articles