Q-Learning & Black-Scholes : QLBS Model


The QLBS model is a discrete-time option hedging and pricing model based on Dynamic Programming (DP) and Reinforcement Learning (RL).

It combines the famous Q-Learning method for RL with the Black-Scholes(-Merton) model’s idea of reducing the problem of option pricing and hedging to the problem of optimal rebalancing of a dynamic replicating portfolio for the option, which is made of a stock and cash.
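To make the replicating-portfolio idea concrete, here is a minimal Python sketch of how the value of a stock-plus-cash portfolio can be rolled back from maturity along a single price path. It assumes the standard self-financing recursion; the function name, the sample path, and the parameter values are illustrative choices, not taken from the paper.

```python
import math

def replicating_portfolio_values(prices, hedges, payoff, r, dt):
    """Roll the portfolio value back from maturity along one price path.

    prices : S_0, ..., S_T (length T + 1)
    hedges : a_0, ..., a_{T-1}, stock holdings chosen at each step (length T)
    payoff : terminal option payoff as a function of S_T
    r, dt  : risk-free rate and time-step length (assumed constant)
    """
    growth = math.exp(r * dt)
    values = [0.0] * len(prices)
    values[-1] = payoff(prices[-1])  # terminal condition: Pi_T = payoff(S_T)
    for t in range(len(prices) - 2, -1, -1):
        # Stock move in excess of risk-free growth over one step.
        excess_move = prices[t + 1] - growth * prices[t]
        # Self-financing rebalancing: Pi_t = exp(-r dt) * (Pi_{t+1} - a_t * excess_move)
        values[t] = (values[t + 1] - hedges[t] * excess_move) / growth
    return values

# Hypothetical price path with three rebalancing steps.
path = [100.0, 104.0, 99.0, 107.0]

# No hedging: the value is just the discounted terminal payoff of a call struck at 100.
unhedged = replicating_portfolio_values(
    path, [0.0, 0.0, 0.0], lambda s: max(s - 100.0, 0.0), r=0.03, dt=0.25
)

# Holding one share against a payoff of S_T replicates the stock itself.
fully_hedged = replicating_portfolio_values(
    path, [1.0, 1.0, 1.0], lambda s: s, r=0.03, dt=0.25
)
```

In the second call the portfolio value equals the stock price at every step, which is a quick sanity check on the recursion. Choosing the hedges so that the portfolio tracks an actual option payoff as closely as possible is the optimization problem the model addresses.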

Here we expand on several NuQLear (Numerical Q-Learning) topics with the QLBS model.

Did Finance Oversleep a Century of Development in Physics?

Firstly, we investigate the performance of Fitted Q Iteration for an RL (data-driven) solution to the model.
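For readers unfamiliar with Fitted Q Iteration, the sketch below shows the generic batch algorithm on a toy five-state chain. It is not the QLBS implementation: the environment, the constants, and the tabular regressor (a least-squares fit with one-hot state-action features, which reduces to a per-cell mean) are all assumptions made for illustration.

```python
from collections import defaultdict

# Toy chain (hypothetical): states 0..4, action 0 = left, 1 = right.
# Reaching state 4 ends the episode with reward 1.
N_STATES, N_ACTIONS, GAMMA = 5, 2, 0.9

def step(s, a):
    s_next = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
    done = s_next == N_STATES - 1
    return s_next, (1.0 if done else 0.0), done

def fit_tabular(samples):
    """Regression step: per-(state, action) mean of the targets."""
    sums, counts = defaultdict(float), defaultdict(int)
    for key, target in samples:
        sums[key] += target
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}

def fitted_q_iteration(batch, n_iter=50):
    """Repeatedly regress Bellman targets built from a fixed batch of transitions."""
    q = {}
    for _ in range(n_iter):
        samples = []
        for s, a, r, s_next, done in batch:
            # Bootstrapped target: reward plus discounted best next-state value.
            q_next = 0.0 if done else max(
                q.get((s_next, b), 0.0) for b in range(N_ACTIONS)
            )
            samples.append(((s, a), r + GAMMA * q_next))
        q = fit_tabular(samples)
    return q

# Fixed batch: one recorded transition per non-terminal (state, action) pair.
batch = []
for s in range(N_STATES - 1):
    for a in range(N_ACTIONS):
        s_next, r, done = step(s, a)
        batch.append((s, a, r, s_next, done))

q = fitted_q_iteration(batch)
greedy_policy = [max(range(N_ACTIONS), key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
```

The key point is that the algorithm never queries the environment during learning; it only reuses the recorded transitions. That is what makes it a data-driven counterpart to a model-based DP solution.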

Secondly, we benchmark it against a DP (model-based) solution and against the Black-Scholes-Merton (BSM) model.
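As a reference point for such a benchmark, the closed-form BSM price of a European option can be computed in a few lines. This is a sketch of the textbook formula (no dividends, constant rate and volatility); the function names and the sample parameters are illustrative and say nothing about the setup used in the paper.

```python
import math

def norm_cdf(x):
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bsm_price(spot, strike, rate, vol, maturity, kind="call"):
    """Black-Scholes-Merton price of a European call or put."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    discounted_strike = strike * math.exp(-rate * maturity)
    if kind == "call":
        return spot * norm_cdf(d1) - discounted_strike * norm_cdf(d2)
    if kind == "put":
        return discounted_strike * norm_cdf(-d2) - spot * norm_cdf(-d1)
    raise ValueError("kind must be 'call' or 'put'")

# Hypothetical at-the-money example: one year to maturity, 5% rate, 20% volatility.
call = bsm_price(100.0, 100.0, 0.05, 0.20, 1.0, kind="call")
put = bsm_price(100.0, 100.0, 0.05, 0.20, 1.0, kind="put")
```

A discrete-time hedging model would be expected to approach this continuous-time price as the rebalancing interval shrinks, which is what makes BSM a natural benchmark.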

Thirdly, we develop an Inverse Reinforcement Learning (IRL) setting for the model, one where we only observe prices and actions (re-hedges) taken by a trader, but not rewards.

Dr. Igor Halperin on Reinforcement Learning & IRL For Investing & The Dangers of Deep Learning

Fourthly, we outline how the QLBS model can be used for pricing portfolios of options rather than a single option in isolation, thus providing its own data-driven and model-independent solution to the (in)famous volatility smile problem of the Black-Scholes model.

Read The Full Paper

About Igor Halperin, Machine Learning Mind

Igor Halperin is an AI Research Associate at Fidelity Investments. His research focuses on using methods of reinforcement learning, information theory, and physics for financial problems such as portfolio optimization, dynamic risk management, and inference of sequential decision-making processes of financial agents. Igor has extensive industrial and academic experience in statistical and financial modeling, in particular in the areas of option pricing, credit portfolio risk modeling, portfolio optimization, and operational risk modeling. Prior to joining Fidelity, Igor worked as a Research Professor of Financial Machine Learning at NYU Tandon School of Engineering.

Before that, Igor was an Executive Director of Quantitative Research at JPMorgan and a quantitative researcher at Bloomberg LP. He has published numerous articles in finance and physics journals and is a frequent speaker at financial conferences. Igor co-authored the books “Machine Learning in Finance: From Theory to Practice” (Springer 2020) and “Credit Risk Frontiers” (Bloomberg LP, 2012). He has a Ph.D. in theoretical high-energy physics from Tel Aviv University and an M.Sc. in nuclear physics from St. Petersburg State Technical University.

Inverse Reinforcement Learning With NYU Professor