Inverse Reinforcement Learning With NYU Professor & Fidelity’s Dr. Igor Halperin
Inverse Reinforcement Learning for Marketing. Learning customer preferences from observed behaviour is an important topic in the marketing literature. Structural models typically treat forward-looking customers or firms as utility-maximizing agents whose utility is estimated using methods of Stochastic Optimal Control.
We suggest an alternative approach to studying dynamic consumer demand, based on Inverse Reinforcement Learning (IRL).
We develop a version of Maximum Entropy IRL that leads to a highly tractable model formulation, amounting to low-dimensional convex optimization in the search for optimal model parameters. Using simulations of consumer demand, we also show that observational noise for identical customers can easily be confused with apparent consumer heterogeneity.
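As a rough illustration of why a Maximum Entropy IRL formulation with rewards that are linear in features reduces to low-dimensional convex optimization, here is a minimal sketch in Python. The feature map, action set, and simulated consumer data below are illustrative assumptions, not the model from the paper; the point is only that the Boltzmann-policy negative log-likelihood is convex in the reward weights and can be minimized with an off-the-shelf optimizer.

```python
# Minimal sketch of static MaxEnt IRL as low-dimensional convex optimization.
# All names (phi, actions, demos) are hypothetical, for illustration only.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Hypothetical discrete demand levels and a 3-dimensional feature map phi(s, a).
actions = np.arange(4)

def phi(state, action):
    return np.array([action, action * state, (action - state) ** 2])

# Simulate "observed" consumer choices from a Boltzmann policy with known weights,
# so we can check that the convex fit recovers them.
true_theta = np.array([1.0, 0.5, -0.8])
states = rng.uniform(0.0, 1.0, size=500)

def sample_action(state, theta):
    logits = np.array([theta @ phi(state, a) for a in actions])
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return rng.choice(actions, p=p)

demos = [(s, sample_action(s, true_theta)) for s in states]

def nll(theta):
    """Average negative log-likelihood of a MaxEnt (Boltzmann) policy with
    reward theta @ phi(s, a); convex in theta for linear-in-feature rewards."""
    total = 0.0
    for s, a in demos:
        logits = np.array([theta @ phi(s, b) for b in actions])
        log_z = logits.max() + np.log(np.exp(logits - logits.max()).sum())
        total -= logits[a] - log_z
    return total / len(demos)

theta_hat = minimize(nll, np.zeros(3), method="L-BFGS-B").x
print("recovered reward weights:", theta_hat)
```

Because the objective is convex in the reward weights, any standard gradient-based optimizer finds the global optimum; the paper's dynamic, forward-looking setting is richer than this static toy, but the tractability argument is of the same kind.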
Full Paper: https://ssrn.com/abstract=3087057 or http://dx.doi.org/10.2139/ssrn.3087057
We present a simple model of a non-equilibrium self-organizing market where asset prices are partially driven by the investment decisions of a bounded-rational agent. The agent acts in a stochastic market environment driven by various exogenous “alpha” signals, the agent’s own actions (via market impact), and noise. Unlike traditional agent-based models, our agent aggregates all traders in the market, rather than being a representative agent, and can therefore be identified with a bounded-rational component of the market itself, providing a particular implementation of an Invisible Hand market mechanism. In such a setting, market dynamics are modeled as a fictitious self-play of this bounded-rational market-agent in its adversarial stochastic environment.
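For readers who want a concrete picture, the following is a toy simulation of the kind of environment described above: a price path driven by an exogenous “alpha” signal, the aggregate agent’s own trades via a linear market-impact term, and noise. The update rule, trading rule, and coefficients are illustrative assumptions, not the paper’s specification.

```python
# Toy sketch of a price partially driven by a signal, the aggregate agent's
# own market impact, and noise. Everything here is an illustrative assumption.
import numpy as np

rng = np.random.default_rng(1)
T = 250
price = np.empty(T)
price[0] = 100.0
position = 0.0
impact, noise_vol, signal_strength = 0.05, 0.5, 0.2

for t in range(1, T):
    alpha = signal_strength * np.sin(2.0 * np.pi * t / 50.0)   # exogenous "alpha" signal
    trade = 0.5 * alpha - 0.01 * position                      # toy rule for the aggregate agent's demand
    position += trade
    # price change = signal contribution + the agent's own market impact + noise
    price[t] = price[t - 1] + alpha + impact * trade + noise_vol * rng.standard_normal()

print("last prices:", np.round(price[-5:], 2))
```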
Since the rewards obtained by such a self-playing market agent are not observable from market data, we formulate and solve a simple model of such market dynamics based on a neuroscience-inspired Bounded Rational Information Theoretic Inverse Reinforcement Learning (BRIT-IRL). This results in effective asset price dynamics with a non-linear mean reversion, which in our model is generated dynamically rather than postulated. We argue that our model operates in a similar way to the Black-Litterman model. In particular, it represents, in a simple modeling framework, market views of common predictive signals, market impacts, and implied optimal dynamic portfolio allocations, and can be used to assess the value of private signals. Moreover, it allows one to quantify a “market-implied” optimal investment strategy, along with a measure of market rationality. Our approach is numerically light, and we implemented it using standard off-the-shelf software such as TensorFlow.
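To illustrate what “effective asset price dynamics with a non-linear mean reversion” can look like, here is a small Euler-Maruyama simulation of a price dislocation whose drift contains a cubic term, so the pull back toward the reference level strengthens the further the price strays. The functional form and coefficients are illustrative assumptions rather than the dynamics derived in the paper.

```python
# Euler-Maruyama sketch of non-linear mean reversion; parameters are illustrative.
import numpy as np

rng = np.random.default_rng(2)
dt, n_steps = 1.0 / 252, 2520
kappa, g, sigma = 2.0, 8.0, 0.3    # linear and cubic reversion strengths, volatility

x = np.empty(n_steps)
x[0] = 0.5                          # initial dislocation from the reference price level
for t in range(1, n_steps):
    # drift = -(kappa * x + g * x**3): reversion gets stronger away from the reference level
    drift = -(kappa * x[t - 1] + g * x[t - 1] ** 3)
    x[t] = x[t - 1] + drift * dt + sigma * np.sqrt(dt) * rng.standard_normal()

print("mean dislocation:", round(x.mean(), 4), "std:", round(x.std(), 4))
```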