https://papers.academic-conferences.org/index.php/icair/article/view/4294
Stock market prediction is a long-standing challenge in finance and investment: accurately predicting stock price movements is crucial for making informed decisions and maximizing investment returns. Traditional models rely mainly on historical prices. We identify a research gap in integrating financial news into such models, a direction that has emerged as promising for improving predictive accuracy. This research addresses that gap with a multimodal approach that combines companies’ news articles with their historical stock data to predict future stock movements. The objective is to compare the performance of a Graph Neural Network (GNN) model with that of an LSTM baseline. In the proposed methodology, an LSTM model embeds the historical data for each company and a language model embeds the news articles. These embeddings serve as nodes in a graph whose edges represent the relationships between them. Using GraphSAGE, a GNN message aggregation technique, the model is designed to capture interactions and dependencies among news articles, companies, and industries and to use this information to predict future stock movements. Two target variables are explored: a binary classification of whether the stock price will increase or decrease, and a second target that considers the significance of the increase. The methodology was evaluated on two datasets, the US equities dataset and the Bloomberg dataset. On both datasets, the GNN model outperformed the baseline LSTM model. It achieved an accuracy of 53% on the first target, a statistically significant 1% improvement over the baseline, and a 4% precision gain on the second target, which supports the effectiveness of exploiting financial news with graph-based models. Furthermore, we observed that increasing the number of news samples led to improved accuracy.
We also find that headlines contain a stronger predictive signal than full articles, which is consistent with evidence that headlines disproportionately shape readers’ judgments and market reactions.
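
To make the message aggregation step concrete, the following is a minimal sketch of a single GraphSAGE layer with mean aggregation, written in plain Python. It is an illustration under stated assumptions, not the implementation used in this research: the node names (`company_A`, `news_1`, `industry_X`), the two-dimensional embeddings, and the identity weight matrices are all hypothetical, and the sketch assumes that the LSTM and language model embeddings have already been projected to a common dimension.

```python
import math


def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def sage_mean_layer(features, neighbors, w_self, w_neigh):
    """One GraphSAGE layer with mean aggregation.

    features:  dict mapping node -> embedding (list of floats, same length)
    neighbors: dict mapping node -> list of neighbouring nodes
    w_self, w_neigh: weight matrices applied to the node's own embedding
        and to the mean of its neighbours' embeddings, respectively
        (equivalent to one matrix applied to their concatenation).
    """
    out = {}
    for node, h in features.items():
        nbrs = neighbors.get(node, [])
        if nbrs:
            # Mean of the neighbours' embeddings, dimension by dimension.
            agg = [sum(features[n][k] for n in nbrs) / len(nbrs)
                   for k in range(len(h))]
        else:
            agg = [0.0] * len(h)
        z = [a + b for a, b in zip(matvec(w_self, h), matvec(w_neigh, agg))]
        z = [max(0.0, v) for v in z]  # ReLU non-linearity
        norm = math.sqrt(sum(v * v for v in z))
        # L2-normalise the updated embedding, as in the GraphSAGE formulation.
        out[node] = [v / norm for v in z] if norm > 0 else z
    return out


# Hypothetical toy graph: one company linked to one news article and one industry.
features = {
    "company_A": [1.0, 0.0],   # stands in for an LSTM embedding of price history
    "news_1": [0.0, 1.0],      # stands in for a language model embedding
    "industry_X": [1.0, 1.0],
}
neighbors = {
    "company_A": ["news_1", "industry_X"],
    "news_1": ["company_A"],
    "industry_X": ["company_A"],
}
identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder for learned weights

updated = sage_mean_layer(features, neighbors, identity, identity)
```

After one layer, the updated embedding of `company_A` mixes its own price-history embedding with the average of its news and industry neighbours. In a trained model the weight matrices would be learned, and a classifier on top of the company embeddings would produce the movement prediction.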