Abstract:
Sentiment analysis is an application of machine learning that uses natural language processing techniques to classify the emotions expressed in public opinion as positive, negative, or neutral. The main objective of this study is to introduce a sentiment analysis mechanism that uses social media data within a distributed big data environment for stock market prediction. The proposed methodology is validated on the Colombo stock market for the period from January 2010 to 2020, based on Twitter feed content and stock market movements. Using past tweets, the study examines the effectiveness of several machine learning techniques, namely K-Nearest Neighbour, Decision Tree, Support Vector Machine, the Grey Exponential Smoothing model, and Multinomial Naïve Bayes, in predicting stock market indices. Each algorithm was trained on 80% of the data and tested on the remaining 20%. The key finding of this research is that the Grey Exponential Smoothing model and Naïve Bayes perform well in sentiment classification. Furthermore, the results confirm that public sentiment strongly influences market fluctuations, and that economic factors such as monetary policy, changes of government, unexpected pandemics, interest rates, public confidence and expectations, and confidence in economic growth have a significant effect on the stock market.
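The 80/20 evaluation protocol and one of the classifiers named above can be illustrated with a minimal sketch. It is not the study's implementation: the toy tweets, the whitespace tokenizer, the Laplace smoothing, the fixed random seed, and the names `train_test_split`, `MultinomialNB`, and `TWEETS` are all assumptions made for illustration, and the distributed big data setting is not modelled.

```python
import math
import random
from collections import Counter, defaultdict


def train_test_split(rows, train_fraction=0.8, seed=0):
    """Shuffle the rows and split them into training and test portions."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


class MultinomialNB:
    """Multinomial Naive Bayes over word counts with Laplace smoothing."""

    def fit(self, rows):
        self.class_counts = Counter(label for _, label in rows)
        self.word_counts = defaultdict(Counter)
        for text, label in rows:
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        self.total = len(rows)
        return self

    def predict(self, text):
        # Words never seen in training are ignored.
        words = [w for w in text.lower().split() if w in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each word.
            score = math.log(n / self.total)
            score += sum(math.log((counts[w] + 1) / denom) for w in words)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy tweets for illustration only; not data from the study.
TWEETS = [
    ("market rally strong gains", "positive"),
    ("investors optimistic about growth", "positive"),
    ("profits rise strong outlook", "positive"),
    ("shares surge on good news", "positive"),
    ("market crash heavy losses", "negative"),
    ("investors fear recession losses", "negative"),
    ("shares plunge on bad news", "negative"),
    ("profits fall weak outlook", "negative"),
    ("market closed for holiday", "neutral"),
    ("index unchanged at close", "neutral"),
]

train, test = train_test_split(TWEETS)  # 80% training, 20% testing
model = MultinomialNB().fit(train)
accuracy = sum(model.predict(text) == label for text, label in test) / len(test)
print(f"train={len(train)} test={len(test)} accuracy={accuracy:.2f}")
```

With only ten toy examples the reported accuracy carries no meaning; the sketch shows only the shape of the train-then-test procedure.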