A Data-Driven Approach for Portfolio Optimization Using Machine Learning and Deep Learning Algorithms

Authors
1 School of Industrial Engineering, Iran University of Science and Technology (IUST)
2 Khorasan Razavi Agricultural and Natural Resources Research and Education Center
Abstract
In today's complex and dynamic financial markets, portfolio optimization is a significant challenge: investors must decide which stocks to buy, at what time, and in what quantities. This research proposes a novel approach to portfolio optimization that feeds price predictions from traditional machine learning and deep learning algorithms into a mean-variance model, addressing these three questions. The study compares the performance of several machine learning and deep learning algorithms in forecasting stock prices on the Tehran Stock Exchange. The dataset comprises the closing prices of nine major symbols from the Tehran Stock Exchange over a 1000-day period. The findings suggest that traditional machine learning models, particularly linear regression, outperform deep learning models in predicting prices. The mean-variance optimization then selects stocks and allocates capital among them, seeking higher expected return at lower risk. The approach is intended as a practical tool for portfolio managers and risk analysts who want to improve risk management and investment portfolio performance.
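To make the mean-variance step concrete, the following is a minimal sketch, not the authors' implementation. It assumes a fully invested portfolio with short positions allowed and a hypothetical risk-aversion parameter `risk_aversion`; the function name `mean_variance_weights` and the synthetic return data are illustrative only. In the setting described above, the expected returns would come from the forecasting model's predicted prices rather than from a historical mean.

```python
import numpy as np


def mean_variance_weights(expected_returns, cov, risk_aversion=1.0):
    """Closed-form weights maximizing w'mu - (risk_aversion / 2) * w'Sigma w
    subject to sum(w) == 1. Short positions are allowed (an assumption)."""
    mu = np.asarray(expected_returns, dtype=float)
    sigma = np.asarray(cov, dtype=float)
    ones = np.ones_like(mu)

    # Solve Sigma x = mu and Sigma y = 1 instead of inverting Sigma explicitly.
    inv_mu = np.linalg.solve(sigma, mu)
    inv_ones = np.linalg.solve(sigma, ones)

    # Lagrange multiplier that enforces the budget constraint sum(w) == 1.
    lam = (ones @ inv_mu - risk_aversion) / (ones @ inv_ones)
    return (inv_mu - lam * inv_ones) / risk_aversion


# Synthetic daily returns for three hypothetical assets (not data from the study).
rng = np.random.default_rng(0)
returns = rng.normal(0.001, 0.02, size=(250, 3))

mu_hat = returns.mean(axis=0)            # stand-in for model-predicted returns
cov_hat = np.cov(returns, rowvar=False)  # sample covariance of the returns

weights = mean_variance_weights(mu_hat, cov_hat, risk_aversion=5.0)
equal_case = mean_variance_weights([0.01, 0.01], np.eye(2))
```

A larger `risk_aversion` pulls the allocation toward the minimum-variance portfolio. A long-only variant would need a constrained solver instead of this closed form.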
Keywords
