Dynamic Replication and Hedging: A Reinforcement Learning Approach

The authors of this article address the problem of how to optimally hedge an options book in a practical setting, where trading decisions are discrete and trading costs can be nonlinear and difficult to model.
Using reinforcement learning (RL), a well-established machine learning technique, the authors propose a model that is flexible, accurate, and very promising for real-world applications.
A key strength of the RL approach is that it makes no assumptions about the form of the trading costs. RL learns the minimum-variance hedge subject to whatever transaction cost function one provides.
All it needs is a good simulator, in which transaction costs and option prices are simulated accurately.
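As a rough illustration of what such a simulator might involve, the following Python sketch pairs a price-path generator with a pluggable transaction cost function. It is not the authors' simulator: the driftless geometric Brownian motion dynamics, the function names, and the cost parameters (`tick`, `multiplier`, `impact`) are all assumptions chosen for illustration.

```python
import math
import random
from typing import Callable, List


def simulate_gbm_path(s0: float, sigma: float, n_steps: int, dt: float,
                      rng: random.Random) -> List[float]:
    """Simulate one driftless geometric Brownian motion price path.

    Assumption: GBM is used only as a stand-in for a realistic price simulator.
    """
    path = [s0]
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)
        growth = math.exp(-0.5 * sigma ** 2 * dt + sigma * math.sqrt(dt) * z)
        path.append(path[-1] * growth)
    return path


def nonlinear_cost(n_shares: float, tick: float = 0.1,
                   multiplier: float = 5.0, impact: float = 0.01) -> float:
    """Hypothetical nonlinear cost: a spread-like term plus a quadratic impact term."""
    n = abs(n_shares)
    return multiplier * tick * (n + impact * n * n)


def total_trading_cost(holdings: List[float],
                       cost_fn: Callable[[float], float]) -> float:
    """Sum the cost of each rebalancing trade, starting from a flat position."""
    total = 0.0
    previous = 0.0
    for target in holdings:
        total += cost_fn(target - previous)
        previous = target
    return total
```

Because the cost model enters only through `cost_fn`, any other function of the trade size can be swapped in without touching the rest of the code, which is the kind of flexibility the text attributes to the RL approach.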
The problem of replicating and hedging an option position is fundamental in finance.
Since the publication of the seminal work of Black and Scholes (1973) and Merton (1973) on option pricing and dynamic hedging (jointly referred to as BSM), a substantial number of articles have addressed the problem of optimal replication and hedging.
The core idea of BSM is that, in a complete and frictionless market, there is a continuously rebalanced dynamic trading strategy in the stock and the riskless security that perfectly replicates the option.
In practice, however, continuous trading of arbitrarily small amounts of stock is infinitely costly.
Instead, the portfolio replicating the option is adjusted at discrete times to minimize trading costs.
Consequently, perfect replication is impossible and an optimal hedging strategy will depend on the desired trade-off between replication error and trading costs. In other words, the hedging strategy chosen by an agent depends on their risk aversion.
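The trade-off between replication error and trading costs can be made concrete with a small Monte Carlo experiment. The Python sketch below hedges a short European call with the BSM delta, rebalancing only every few steps, and reports the standard deviation of the hedging error alongside the average trading cost. It is a simplified illustration, not the method of the article: zero interest rates, driftless geometric Brownian motion, a proportional cost of `cost_per_share`, and all function names and parameter values are assumptions.

```python
import math
import random
import statistics
from typing import Tuple


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(s: float, k: float, sigma: float, tau: float) -> Tuple[float, float]:
    """BSM call price and delta, assuming zero interest rates and tau > 0."""
    d1 = (math.log(s / k) + 0.5 * sigma ** 2 * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - k * norm_cdf(d2), norm_cdf(d1)


def hedge_experiment(rebalance_every: int, n_paths: int = 500, n_steps: int = 50,
                     s0: float = 100.0, k: float = 100.0, sigma: float = 0.2,
                     maturity: float = 0.25, cost_per_share: float = 0.05,
                     seed: int = 0) -> Tuple[float, float]:
    """Return (std of hedging error before costs, mean trading cost)."""
    rng = random.Random(seed)  # same seed gives the same paths for every frequency
    dt = maturity / n_steps
    premium, _ = bs_call(s0, k, sigma, maturity)
    errors, costs = [], []
    for _ in range(n_paths):
        s, holding, pnl, cost = s0, 0.0, premium, 0.0
        for i in range(n_steps):
            if i % rebalance_every == 0:
                # Rebalance the stock position to the current BSM delta.
                _, target = bs_call(s, k, sigma, (n_steps - i) * dt)
                cost += cost_per_share * abs(target - holding)
                holding = target
            z = rng.gauss(0.0, 1.0)
            s_next = s * math.exp(-0.5 * sigma ** 2 * dt + sigma * math.sqrt(dt) * z)
            pnl += holding * (s_next - s)  # gain or loss on the hedge position
            s = s_next
        errors.append(pnl - max(s - k, 0.0))  # premium + hedge gains - option payoff
        costs.append(cost)
    return statistics.pstdev(errors), statistics.mean(costs)


if __name__ == "__main__":
    for every in (1, 5, 10):
        std_error, mean_cost = hedge_experiment(every)
        print(f"rebalance every {every:2d} steps: "
              f"error std = {std_error:.3f}, mean cost = {mean_cost:.3f}")
```

Under these assumptions, rebalancing more often should shrink the dispersion of the hedging error while increasing the cost paid, so no single rebalancing frequency is best for every agent. Which point on that frontier is preferred depends on how heavily the agent penalizes variance relative to cost.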