How machine learning is used in portfolio management?
Active Portfolio Management using Machine Learning
Modern Portfolio Theory was first introduced by Harry Markowitz in 1952. The groundbreaking investment theory demonstrated that the performance of an individual stock is not as important as the performance of an entire portfolio. His article “Portfolio Selection”, published in The Journal of Finance, taught investors that risk, and not the best price, should be the cornerstone of a portfolio. Once an investor’s risk tolerance has been established, building a portfolio around it becomes more straightforward.
The goal of portfolio management is to choose the best strategy to maximize returns on investment, ensure portfolio flexibility, or optimize risk. Attaining such objectives involves the optimal allocation of investment funds, optimization of risk factors, consideration of financial ratios (such as the Sharpe ratio), and much more.
Because machine learning models can identify novel investment opportunities while improving computational efficiency and reducing overhead time, they have become an integral part of the financial modeling process. Their capacity to extract information from a wide range of large textual and numerical datasets with minimal human supervision gives them an advantage over traditional methods.
Such models help portfolio managers with trade execution, data parsing, idea generation and pattern recognition, alpha factor design, asset allocation, position sizing, and strategy testing, and they help reduce human biases.
The goal of this project is to identify an efficient machine learning architecture that replicates the stock basket’s trend as closely as possible. The selected ML model should have strong prediction capabilities and adapt accurately to directional changes in prices, so that it can be used to predict expected stock returns. Additionally, past data can be used to calculate standard deviations and gauge risk.
These stocks will then be passed through various portfolio optimization techniques. To identify the appropriate optimization technique, in-depth research on various techniques will be carried out. The result is an optimized portfolio with a weight for each stock.
Fig 1: High level view of Machine Learning in Portfolio Construction 
Literature Review & Model Implementation
1) Mean–variance portfolio optimization using machine learning-based stock price prediction (Wei Chen, Haoyu Zhang, Mukesh Kumar Mehlawat, Lifen Jia) 
The paper presents a hybrid model that combines machine learning for stock prediction with the mean–variance (MV) model for portfolio selection. The model follows two stages:
Stock Prediction: Combines eXtreme Gradient Boosting (XGBoost) with an improved firefly algorithm (IFA). XGBoost is an improved gradient-boosted decision tree method composed of multiple decision trees, which makes it a suitable predictor for financial markets. The accuracy of prediction is measured using mean absolute percentage error (MAPE), mean squared error (MSE), mean absolute error (MAE), and root mean squared error (RMSE).
Portfolio Selection: Stocks with higher potential returns are selected according to each stock’s predicted prices, and the MV model is employed to allocate the investment proportions of the portfolio. The portfolio is evaluated on its annualized mean return, annualized standard deviation, annualized Sharpe ratio, and annualized Sortino ratio (a variation of the Sharpe ratio that uses the standard deviation of only the negative portfolio returns instead of the total standard deviation).
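To make the difference between the two ratios concrete, here is a minimal sketch in plain Python. It is not the paper's code: the 252-day annualization factor, the zero risk-free rate and target, the downside-deviation convention, and the sample returns are all assumptions chosen for illustration.

```python
import math
from statistics import mean, stdev

TRADING_DAYS = 252  # assumed annualization factor for daily data


def annualized_sharpe(daily_returns, daily_risk_free=0.0):
    """Annualized Sharpe ratio: mean excess return over total volatility."""
    excess = [r - daily_risk_free for r in daily_returns]
    return mean(excess) / stdev(excess) * math.sqrt(TRADING_DAYS)


def annualized_sortino(daily_returns, daily_target=0.0):
    """Annualized Sortino ratio: mean excess return over downside deviation."""
    excess = [r - daily_target for r in daily_returns]
    # Downside deviation: root mean square of the negative excess returns only,
    # averaged over all observations (one common convention, assumed here).
    downside_sq = [min(e, 0.0) ** 2 for e in excess]
    downside_dev = math.sqrt(sum(downside_sq) / len(excess))
    return mean(excess) / downside_dev * math.sqrt(TRADING_DAYS)


returns = [0.01, -0.02, 0.015, 0.003, -0.005, 0.012]  # made-up daily returns
print(annualized_sharpe(returns), annualized_sortino(returns))
```

Because only losses enter the Sortino denominator, a portfolio whose volatility comes mostly from upside moves scores higher on Sortino than on Sharpe. The sketch does not guard against a series with no negative returns, where the downside deviation is zero.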
The paper concludes that the ML-based approach to portfolio management outperforms the benchmarks in accuracy, potency, and efficiency.
We predicted the stock price of three companies using ARIMA, LSTM-RNN, and Transformer architectures. The accuracy of prediction is measured using root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and Profit & Loss (PNL) metrics.
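The three error metrics can be sketched in a few lines of plain Python. The function names and sample prices below are illustrative assumptions, not taken from the project files. PNL is left out because its exact definition depends on a trading rule that is not specified here.

```python
import math


def rmse(actual, predicted):
    """Root mean square error: penalizes large misses more heavily."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error, in the same units as the price."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (undefined if an actual value is 0)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


actual = [100.0, 102.0, 101.0, 105.0]     # made-up closing prices
predicted = [101.0, 100.0, 102.0, 104.0]  # made-up model output
print(rmse(actual, predicted), mae(actual, predicted), mape(actual, predicted))
```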
ARIMA: AutoRegressive Integrated Moving Average is a time series forecasting model that uses autocorrelation and differencing to model non-stationary data and predict future values. Autocorrelation measures the dependency of a signal on a delayed copy of itself, as a function of the delay. ARIMA models are capable of capturing temporal structures in time-series data. This model uses three parameters –
- p: number of lag observations
- d: degree of differencing
- q: size of the moving average window
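As an illustration of what these parameters mean, the following dependency-free sketch fits an ARIMA(2, 1, 0)-style model: one round of differencing (d = 1), two autoregressive lags (p = 2), and no moving average term (q = 0). It estimates the lag coefficients by ordinary least squares without a constant, which is a simplification; a library such as statsmodels fits ARIMA by maximum likelihood. The function names and prices are hypothetical.

```python
def difference(series):
    """First-order differencing (d = 1): price changes instead of price levels."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar2(x):
    """Least-squares fit of x[t] = phi1 * x[t-1] + phi2 * x[t-2] (p = 2, no constant)."""
    idx = range(2, len(x))
    s11 = sum(x[t - 1] * x[t - 1] for t in idx)
    s22 = sum(x[t - 2] * x[t - 2] for t in idx)
    s12 = sum(x[t - 1] * x[t - 2] for t in idx)
    b1 = sum(x[t] * x[t - 1] for t in idx)
    b2 = sum(x[t] * x[t - 2] for t in idx)
    det = s11 * s22 - s12 * s12  # 2x2 normal equations solved by Cramer's rule
    return (b1 * s22 - b2 * s12) / det, (b2 * s11 - b1 * s12) / det


def forecast_arima_210(prices, steps):
    """Forecast `steps` future prices from the fitted AR(2) model on the differences."""
    diffs = difference(prices)
    phi1, phi2 = fit_ar2(diffs)
    last, previous = diffs[-1], diffs[-2]
    level = prices[-1]
    forecasts = []
    for _ in range(steps):
        next_diff = phi1 * last + phi2 * previous
        level += next_diff  # undo the differencing to get back to a price
        forecasts.append(level)
        last, previous = next_diff, last
    return forecasts


prices = [100, 101, 103, 102, 104, 107, 106, 108, 111, 110]  # made-up closes
print(forecast_arima_210(prices, steps=3))
```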
LSTM-RNN: Long Short-Term Memory based Recurrent Neural Network. An RNN is a generalization of a feedforward neural network that has an internal memory. As its name suggests, it is recurrent in nature, i.e., it performs the same function for every input while the output for the current input depends on the previous computation. Once the output is generated, it is copied and fed back into the RNN. Decisions are therefore based on the current input and the output learnt at the previous step. LSTM is a modified version of the RNN that makes it easier to retain past data in memory.
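The recurrence described above can be sketched with a single-unit cell in plain Python. The weights here are fixed, made-up numbers rather than learned values, and the gates that distinguish an LSTM from a plain RNN are deliberately left out to keep the feedback loop visible.

```python
import math


def rnn_forward(inputs, w_x=0.5, w_h=0.8, bias=0.0):
    """Run a one-unit recurrent cell over a sequence and return every hidden state."""
    hidden = 0.0
    states = []
    for x in inputs:
        # The same function is applied at every step; the previous hidden
        # state is fed back in alongside the current input.
        hidden = math.tanh(w_x * x + w_h * hidden + bias)
        states.append(hidden)
    return states


# The second input is 0, yet the second state is non-zero:
# the cell still "remembers" the first input.
print(rnn_forward([1.0, 0.0]))
```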
Transformer: a neural network architecture that uses a self-attention mechanism, allowing the model to focus on the relevant parts of the time series to improve prediction quality. It consists of Single-head Attention and Multi-head Attention. For each input position, the self-attention mechanism scores every position in the input sequence for relevance and adds the resulting weighted combination to the output sequence. These computations can be parallelized, which accelerates the learning process.
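A minimal sketch of single-head scaled dot-product attention in plain Python is shown below. In a real Transformer the queries, keys, and values are learned linear projections of the input; this sketch takes them as given, and the example vectors are made up.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention over a sequence of vectors."""
    d_k = len(keys[0])
    output = []
    for q in queries:
        # Relevance of every position to the current one
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Output is the relevance-weighted average of the value vectors
        output.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return output


sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # made-up 3-step, 2-feature input
print(self_attention(sequence, sequence, sequence))
```

Each output row depends only on its own query and the shared keys and values, so all rows can be computed independently. That independence is what makes the mechanism easy to parallelize.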
We used the daily OHLCV (Open, High, Low, Close, Volume) data of three companies – Dr Reddy’s Laboratories Ltd, Wipro Limited, and Reliance Steel & Aluminum Co – over a time frame of 252 days, from 16 Sept 2020 to 15 Sept 2021. The data was extracted from the NSE website and stored locally. The data set is split in a 7:3 ratio: 70% is used as training data and the remaining 30% for testing. For this project, we ignore the top and bottom 5% of values, as they may be outliers.
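The split and the outlier rule could look roughly like the sketch below. The text does not say which values the 5% rule applies to or how the cut-offs are computed, so the simple index-based percentile and the helper names here are assumptions. The split is chronological (no shuffling), which is the usual choice for time series.

```python
def chronological_split(rows, train_pct=70):
    """Split rows in time order: the first train_pct percent train, the rest test."""
    cut = len(rows) * train_pct // 100
    return rows[:cut], rows[cut:]


def trim_outliers(values, lower_pct=5, upper_pct=95):
    """Keep values between the lower and upper percentile cut-offs (inclusive)."""
    ordered = sorted(values)
    low = ordered[(len(ordered) - 1) * lower_pct // 100]
    high = ordered[(len(ordered) - 1) * upper_pct // 100]
    return [v for v in values if low <= v <= high]


days = list(range(252))  # stand-in for 252 daily rows
train, test = chronological_split(days)
print(len(train), len(test))
```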
c) Predicting Stocks
ARIMA: We first scaled our features to reduce the computational load. We imported the ARIMA class from statsmodels.tsa.arima.model to train our model. The ARIMA model uses the first 180 rows of the dataset as training data, with parameters p=2, d=1, q=0. The fitted model is then used to predict the close price for the remaining rows of the dataset (row 180 onwards).
Fig 3: ARIMA Implementation (File Name: arima.py)
LSTM-RNN: We build a neural network regressor for continuous value prediction using LSTM. After initializing the model, we add the first LSTM layer followed by a Dropout layer. The LSTM layer has 250 units to capture the trend in the data. Since further LSTM layers are stacked on top of it, return_sequences is set to True.
The input_shape corresponds to the number of time steps and indicators. For the Dropout layer, a value of 0.2 means that each of the 250 units is dropped with probability 0.2 during training, so roughly 20% of them are ignored at each step. This is repeated for layers 2, 3, and 4. Finally, a one-dimensional output layer is created, since we predict only one price at a time. We then compile the model with the “Adam” optimizer and mean squared error as the loss, and fit it to the training data.
Fig 4: LSTM-RNN Implementation (File Name: rnn.py)
Transformer: To encode the notion of time into the model, a time encoder has been incorporated in the code (class Time2Vector). The time encoder combines two ideas: a periodic component (a sine function, as in the original Time2Vec formulation) and a non-periodic component (a linear function). The Time2Vector class has two methods: build (initializes the weight matrices) and call (calculates the periodic and linear time features). We then initialize the three Transformer encoder building blocks: Single Head Attention, Multi Head Attention, and the Transformer Encoder layer.
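The idea behind the time encoder can be sketched without any deep learning framework. The function below is a hypothetical, scalar version of a Time2Vec-style encoding with fixed weights. In the actual model the weights are learned, and the sine used for the periodic part follows the published Time2Vec formulation, which is an assumption about the project's code.

```python
import math


def time2vec(t, w_linear, b_linear, w_periodic, b_periodic):
    """Encode a scalar time value as [linear feature, periodic features...]."""
    linear = w_linear * t + b_linear  # non-periodic part: captures trend
    # Periodic part: one sine feature per (frequency, phase) pair
    periodic = [math.sin(w * t + b) for w, b in zip(w_periodic, b_periodic)]
    return [linear] + periodic


# Made-up weights: one trend feature and two seasonal features
print(time2vec(3.0, w_linear=0.1, b_linear=0.0,
               w_periodic=[1.0, 0.5], b_periodic=[0.0, 0.25]))
```

The linear feature keeps growing with time, while each periodic feature repeats with period 2π divided by its frequency, which lets the model represent both trend and recurring patterns.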
(Note: The code for these layers is not reproduced in this report; please refer to the attached Python files.) After these initializations, the training process begins, followed by testing. The parameters used are – Batch Size: 12, Sequence Length: 3, and Size of model input: 5.
Fig 5: Transformer Implementation (File Name: transformer.py)
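To show how a sequence length of 3, an input size of 5, and a batch size of 12 fit together, here is a small sketch that turns daily OHLCV rows into model inputs. The helper names, the synthetic rows, and the choice of the next day's close as the target are assumptions for illustration, not a description of transformer.py.

```python
def make_windows(rows, seq_len=3, target_col=3):
    """Turn per-day feature rows into (window, next-day target) pairs."""
    samples, targets = [], []
    for start in range(len(rows) - seq_len):
        samples.append(rows[start:start + seq_len])         # seq_len days x 5 features
        targets.append(rows[start + seq_len][target_col])   # next day's close (assumed)
    return samples, targets


def batches(items, batch_size=12):
    """Yield consecutive chunks of at most batch_size items."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]


# Synthetic rows in Open, High, Low, Close, Volume order
ohlcv = [[d, d + 1.0, d - 1.0, d + 0.5, 1000.0] for d in range(30)]
windows, next_closes = make_windows(ohlcv)
print(len(windows), len(windows[0]), len(windows[0][0]))
print([len(chunk) for chunk in batches(windows)])
```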
The stock_prediction.py script imports all these models and returns the final outputs for the stock ticker symbol entered.
2) Alternative Approaches to Mean–variance Optimization (Maria Debora Braga)
This paper illustrates two examples of risk-based asset allocation strategies: the global minimum-variance strategy and the optimal risk parity strategy.
Global Minimum-Variance Strategy – the asset manager recommends a portfolio that is already familiar from the efficient frontier: the portfolio at its leftmost end, which, given that location, has the smallest attainable ex ante standard deviation. It is not necessary to implement full Mean-Variance Optimization to identify the global minimum-variance portfolio (GMVP). A simplified optimization suffices, with the formula for portfolio variance as the objective function to be minimized, subject to the traditional long-only and budget constraints. In standard notation, with w the vector of portfolio weights and Σ the covariance matrix of asset returns, the optimization problem to be solved to identify the GMVP can be written formally as follows:
$$\min_{w} \; w^{\top} \Sigma w \quad \text{subject to} \quad \sum_{i=1}^{N} w_i = 1, \quad w_i \ge 0 \;\; (i = 1, \dots, N)$$
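The GMVP problem described above (minimize portfolio variance under the budget and long-only constraints) can be solved numerically in several ways. The sketch below uses projected gradient descent in plain Python. This is one possible solver chosen for illustration, not the method used in the paper, and the covariance matrix, step size, and iteration count are made-up assumptions.

```python
def project_to_simplex(v):
    """Euclidean projection onto {w : w_i >= 0, sum(w) = 1}."""
    ordered = sorted(v, reverse=True)
    running, theta = 0.0, 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        candidate = (running - 1.0) / k
        if value - candidate > 0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]


def portfolio_variance(weights, cov):
    """Portfolio variance w' * cov * w."""
    n = len(weights)
    return sum(weights[i] * cov[i][j] * weights[j] for i in range(n) for j in range(n))


def global_min_variance(cov, step=1.0, iterations=5000):
    """Long-only, fully invested minimum-variance weights.

    The step size must be small relative to the scale of cov; 1.0 is
    adequate for return covariances well below 1, as assumed here.
    """
    n = len(cov)
    weights = [1.0 / n] * n  # start from equal weights
    for _ in range(iterations):
        grad = [2.0 * sum(cov[i][j] * weights[j] for j in range(n)) for i in range(n)]
        weights = project_to_simplex([w - step * g for w, g in zip(weights, grad)])
    return weights


cov = [[0.04, 0.006, 0.0],   # made-up annualized covariance matrix
       [0.006, 0.09, 0.012],
       [0.0, 0.012, 0.16]]
gmv = global_min_variance(cov)
print(gmv, portfolio_variance(gmv, cov))
```

Note that no expected returns appear anywhere in the calculation: the strategy needs only the covariance matrix.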
Optimal Risk Parity Strategy – aims to overcome the problem of risk concentration. Its alternative name, the equally weighted risk contribution strategy (or portfolio), points to the criterion used to structure the portfolio: each asset class is given a weight such that the amount of risk it contributes to overall portfolio risk equals the amount contributed by any other asset class in the portfolio. This condition can be written formally as equality among component risks, where each component risk is defined as the product of the asset class’s weight in the portfolio and its marginal contribution to risk.
The identification of optimal asset class weights for a risk parity portfolio involves solving an optimization problem which is different from mean-variance optimization. More precisely, it includes a new objective function to be minimized together with the traditional constraints on portfolio weights.
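One known way to compute equal-risk-contribution weights is cyclical coordinate descent on a log-barrier formulation, sketched below in plain Python. This is a swapped-in algorithm chosen for illustration; the objective function used in the paper is not reproduced here and may differ. The covariance matrix is made up.

```python
import math


def risk_parity_weights(cov, sweeps=500):
    """Equal-risk-contribution weights via cyclical coordinate descent.

    Minimizes 0.5 * y' cov y - budget * sum(log(y_i)) over y > 0, then
    rescales y to sum to 1. At the optimum y_i * (cov y)_i is the same
    for every asset, which is the risk parity condition.
    """
    n = len(cov)
    y = [1.0] * n
    budget = 1.0 / n  # any positive constant works; normalization removes it
    for _ in range(sweeps):
        for i in range(n):
            cross = sum(cov[i][j] * y[j] for j in range(n) if j != i)
            # Positive root of cov[i][i] * y_i**2 + cross * y_i - budget = 0
            y[i] = (-cross + math.sqrt(cross * cross + 4.0 * cov[i][i] * budget)) / (2.0 * cov[i][i])
    total = sum(y)
    return [value / total for value in y]


def risk_contributions(weights, cov):
    """Component risks: weight times marginal contribution to portfolio volatility."""
    n = len(weights)
    marginal = [sum(cov[i][j] * weights[j] for j in range(n)) for i in range(n)]
    vol = math.sqrt(sum(weights[i] * marginal[i] for i in range(n)))
    return [weights[i] * marginal[i] / vol for i in range(n)]


cov = [[0.04, 0.006, 0.0],   # made-up annualized covariance matrix
       [0.006, 0.09, 0.012],
       [0.0, 0.012, 0.16]]
rp = risk_parity_weights(cov)
print(rp, risk_contributions(rp, cov))
```

The component risks sum to the portfolio volatility, so equal contributions mean each asset accounts for one third of total risk in this three-asset example.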
Mean-Variance Analysis – the process of weighing risk, expressed as variance, against the expected return. Investors weigh the amount of risk they are willing to take on in exchange for different levels of reward. Mean-variance analysis allows investors to find the highest reward at a specified level of risk or the least risk at a stated level of return. Mean-variance analysis is one part of modern portfolio theory, which assumes that investors will make rational decisions about investments if they have complete information. A major assumption here is that investors seek low risk and high reward. The two main components of mean-variance analysis are as follows:
• Variance – Represents how spread out a security’s returns are around their mean
• Expected Return – The probability-weighted estimate of the return on an investment in the security
If two different securities have the same expected return, but one has lower variance, the one with lower variance is the better pick. Similarly, if two different securities have approximately the same variance, the one with the higher return is the better pick.
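The two components and the selection rule above can be written out in a few lines of plain Python. The scenario returns, probabilities, and function names are made-up illustrations.

```python
def expected_return(outcomes, probabilities):
    """Probability-weighted average of the possible returns."""
    return sum(p * r for r, p in zip(outcomes, probabilities))


def variance(outcomes, probabilities):
    """Probability-weighted squared deviation from the expected return."""
    mu = expected_return(outcomes, probabilities)
    return sum(p * (r - mu) ** 2 for r, p in zip(outcomes, probabilities))


def better_pick(a, b):
    """Compare two (expected_return, variance) pairs.

    Returns "a" or "b" if one is at least as good on both measures and
    strictly better on one, or None if neither dominates.
    """
    (ret_a, var_a), (ret_b, var_b) = a, b
    if ret_a >= ret_b and var_a <= var_b and (ret_a > ret_b or var_a < var_b):
        return "a"
    if ret_b >= ret_a and var_b <= var_a and (ret_b > ret_a or var_b < var_a):
        return "b"
    return None


outcomes = [0.10, 0.02, -0.05]   # returns in a good, normal, and bad scenario
probabilities = [0.3, 0.5, 0.2]
print(expected_return(outcomes, probabilities), variance(outcomes, probabilities))
print(better_pick((0.08, 0.04), (0.08, 0.09)))
```

When one security has both the higher return and the higher variance, neither dominates, and the choice depends on the investor's risk tolerance.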
Results & Conclusion
Comparing the LSTM-RNN and the Transformer against the ARIMA benchmark, we conclude that the LSTM tracks the original price trend more closely, but it does not react to directional changes as well as the Transformer. The Transformer, in turn, scores less well on closeness to the original stock trend.
Greater accuracy in a stock prediction model generally comes at the cost of a sharp increase in the complexity of its prediction operations. The key to optimizing a stock forecasting system is balancing accuracy and computational expense.
Fig 6: RDY Price Prediction
Fig 7: WIT Price Prediction
The rationale behind the use of ML is not just about novel architectures that revolutionize forecasting and time-series prediction, but also about evaluating each use case in terms of its own domain. For this specific use case, error metrics are certainly an evaluation factor, but it is the PNL that gives deeper insight into how well the model predicts the changes that matter and how it performs in the bigger picture.
Table 1: RDY Performance Matrix (Net Starting Value 10,00,000)
Table 2: WIT Performance Matrix (Net Starting Value 10,00,000)
Table 3: RS Performance Matrix (Net Starting Value 10,00,000)
Identifying the correct ML architecture is crucial in Active Portfolio Management, because the predicted expected stock returns form the basis of stock selection for portfolio optimization.
Also, from our research, we conclude that risk-based optimization techniques are superior to the traditional mean-variance technique. When risk-based strategies are considered, the portfolio selection phase changes, because each strategy yields a single risky portfolio proposal. Therefore, an investor does not have to choose an optimal point on the efficient frontier according to their risk tolerance/aversion and investment objective.
The decision-making process is therefore analogous to the Capital Asset Pricing Model. In CAPM, the same risky portfolio is offered to all the investors. A point is then selected on the Capital Market line. This is the line, in the risk-return space, that originates from the risk-free rate and is tangent to the efficient frontier. The tangency portfolio is also the portfolio with the maximum Sharpe ratio.
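The relationship between the tangency portfolio and the Capital Market Line can be sketched as follows. The frontier points and the risk-free rate are made-up numbers, and picking the tangency portfolio from a short list of candidates is a simplification of finding it on a continuous frontier.

```python
def sharpe(expected_return, volatility, risk_free):
    """Excess return per unit of volatility."""
    return (expected_return - risk_free) / volatility


def tangency_portfolio(candidates, risk_free):
    """Pick the (expected_return, volatility) pair with the highest Sharpe ratio."""
    return max(candidates, key=lambda c: sharpe(c[0], c[1], risk_free))


def capital_market_line(volatility, tangency, risk_free):
    """Expected return on the CML for a chosen level of volatility."""
    slope = sharpe(tangency[0], tangency[1], risk_free)
    return risk_free + slope * volatility


frontier = [(0.06, 0.08), (0.09, 0.12), (0.12, 0.20)]  # (return, volatility) pairs
risk_free = 0.03
best = tangency_portfolio(frontier, risk_free)
print(best, capital_market_line(0.06, best, risk_free))
```

Every investor holds the same risky portfolio here and only chooses how far along the line to move, by mixing that portfolio with the risk-free asset.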
Written by Varun Chandra Gupta
References & Citations
 “Portfolio Selection”, Wiley Online Library [Online]
Available: https://onlinelibrary.wiley.com/doi/full/10.1111/j.1540-6261.1952.tb01525.x [Accessed Nov. 03, 2022]
 “Portfolio Management”, Investopedia
Available: https://www.investopedia.com/articles/07/portfolio-history.asp [Accessed Nov. 03, 2022]
 “Portfolio Management”, groww.in. [Online].
Available: https://groww.in/p/portfolio-management [Accessed Oct. 07, 2022]
 Derek Snow, “Machine Learning in Asset Management—Part 1: Portfolio Construction—Trading Strategies” in The Journal of Financial Data Science Winter 2020, [Online Document], Available: https://jfds.pm-research.com/content/2/1/10 [Accessed Nov. 03, 2022]
 Wei Chen, Haoyu Zhang, Mukesh Kumar Mehlawat and Lifen Jia, “Mean–variance portfolio optimization using machine learning-based stock price prediction”, ScienceDirect, [Online Document], Available:
https://reader.elsevier.com/reader/sd/pii/S1568494620308814?token=072C0F96416E333E05C4C88C0F3EF956F66CB2B591F5865147EF13701FFC6CEB96D70EA6ADA889440EE8F47A7D90D1E6&originRegion=us-east-1&originCreation=20221007063440 [Accessed Oct. 07, 2022]
 “ARIMA Model”, ProjectPro [Online]
Available: https://www.projectpro.io/article/how-to-build-arima-model-inpython/544 [Accessed Nov. 03, 2022]
 “ARIMA Model”, TowardsDataScience [Online]
Available: https://towardsdatascience.com/time-series-forecasting-predicting-stock-prices-using-an-arima-model-2e3b3080bd70 [Accessed Nov. 03, 2022]
 “LSTM for Stock Price Prediction”, TowardsDataScience [Online]
Available: https://towardsdatascience.com/lstm-for-google-stock-price-prediction-e35f5cc84165 [Accessed Nov. 03, 2022]
 “Stock Predictions with state-of-the-art Transformers and Time Embeddings”, TowardsDataScience [Online]
Available: https://towardsdatascience.com/stock-predictions-with-state-of-the-art-transformer-and-time-embeddings-3a4485237de6 [Accessed Nov. 03, 2022]
 Maria Debora Braga, “Alternative Approaches to Traditional Mean-Variance Optimisation”, SpringerLink, [Online Document], Available: https://link.springer.com/chapter/10.1007/978-3-319-32796-9_6 [Accessed Nov. 03, 2022]
 Michael Pinelis, David Ruppert, “Machine learning portfolio allocation”, ScienceDirect, [Online Document], Available:
https://reader.elsevier.com/reader/sd/pii/S2405918821000155?token=E142E1AA4431A9B55ECF1C2ABA5CC7E52CA69D5E1C728796D64FEC02BCFB0E3ABEEFAA56E0AF1E431BAE0CE7C041CE5E&originRegion=us-east-1&originCreation=20221007052506 [Accessed Oct. 07, 2022]
 Jörn Sass, Anna-Katharina Thös, “Risk reduction and portfolio optimization using clustering methods”, ScienceDirect, [Online Document], Available: https://reader.elsevier.com/reader/sd/pii/S2452306221001416?token=5E952856364B9133BB345373B3373F4A2A5D62EAC2886F34758D28024D01C053253082DAC3408677BB6E425ECB264C6F&originRegion=us-east-1&originCreation=20221007063710 [Accessed Oct. 07, 2022]