
Is Backtesting Accurate? Challenges of Backtesting in Deep Learning (Stock Market)

Artificial Intelligence & Machine Learning

Deep learning deals with algorithms inspired by the biological structure and function of the brain, with the goal of giving machines a form of intelligence.

Its “grandfather,” artificial intelligence, is the broader effort to make machines behave intelligently. Machines have no intelligence of their own; people design them so that they can act with some independence. Its “father,” machine learning, is the practice of having a machine learn its behavior from data without explicit programming. Here, explicit programming means that a programmer writes out every rule the system follows. In machine learning, the system infers its rules from examples instead.

An example of machine learning is a system that predicts the price of a wine by learning from historical data on wine quality and other related variables. The system is not encoded with a comprehensive list of all possible rules. Instead, it learns from the patterns it identifies in historical training data. The “son” is deep learning. Traditional machine learning struggles with some tasks that are easy for humans. For example, it performs poorly on images, audio, and other unstructured data types.

One idea for addressing this weakness is to mimic the biological processes of the human brain.

Human brain bisected in the sagittal plane, showing the white matter of the corpus callosum

The brain is a system of billions of connected neurons that can learn new things, and people can build artificial systems on the same principle. In other words, deep learning is a field within machine learning and artificial intelligence that deals with algorithms inspired by the human brain and that gives machines intelligence without explicit programming. Everyday examples of deep learning include virtual assistants like Siri, autonomous driving features in Tesla vehicles, and recommendations on Netflix. Deep learning has now moved into virtually all industry verticals, such as healthcare with cancer detection, aviation with fleet optimization, and banking and financial services with fraud detection.

Brain of human embryo at 4.5 weeks. Showing interior of forebrain
Henry Vandyke Carter and one more author – Henry Gray (1918), Anatomy of the Human Body. Bartleby.com: Gray’s Anatomy

Backtesting of Deep Learning in the Stock Market

In financial analysis using stock market data, backtesting is a vital tool for achieving explainability and real-world adoption. It refers to using historical data to assess a model’s viability retrospectively, and it rests on the intuitive notion that a strategy that worked well in the past will likely work well in the future. In machine learning, backtesting usually means splitting the data into training and testing sets during modeling. The goal of this process is to determine accuracy and evaluate performance. When modeling the financial market, however, performance is measured by the model’s profitability or by the volatility of its returns.
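
The split and the two performance measures described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function names, the 75% training fraction, and the price series are all hypothetical, and volatility is assumed here to mean the sample standard deviation of per-period returns.

```python
import math


def chronological_split(series, train_fraction=0.75):
    """Split a time-ordered series without shuffling, so the test set is strictly later."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def simple_returns(prices):
    """Per-period simple returns computed from consecutive prices."""
    return [later / earlier - 1.0 for earlier, later in zip(prices, prices[1:])]


def total_return(returns):
    """Compounded return over all periods (a basic profitability measure)."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0


def volatility(returns):
    """Sample standard deviation of per-period returns."""
    n = len(returns)
    if n < 2:
        return 0.0
    mean = sum(returns) / n
    return math.sqrt(sum((r - mean) ** 2 for r in returns) / (n - 1))


# Hypothetical closing prices, oldest first.
prices = [100.0, 102.0, 101.0, 105.0, 107.0, 106.0,
          110.0, 108.0, 112.0, 115.0, 113.0, 118.0]

train_prices, test_prices = chronological_split(prices)
test_returns = simple_returns(test_prices)
print("test total return:", total_return(test_returns))
print("test volatility:", volatility(test_returns))
```

Keeping the split chronological matters for market data: a shuffled split would let the model train on prices that come after the ones it is tested on.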

Using machine learning for decision-making raises concerns about statistical bias and the lack of due process. One such concern is selection bias, which can affect research results in finance. In the context of deep learning in the stock market, backtesting involves building models that simulate trading strategies on historical data. It assesses each model’s performance and helps discard unsuitable models or strategies, which guards against selection bias. To backtest properly, we must test on unbiased and sufficient historical data drawn from different samples.
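
To make the idea of simulating a trading strategy on historical data concrete, here is a minimal Python sketch. The strategy (hold the asset only when the latest price is above its trailing average) is a stand-in chosen for brevity, not a strategy from this article, and the function names, window length, and prices are hypothetical. The key assumption is that each decision uses only prices available at that time.

```python
def trailing_average(values, window):
    """Mean of the last `window` values."""
    return sum(values[-window:]) / window


def backtest_trend_rule(prices, window=3):
    """Simulate a simple trend rule and return the equity curve, starting at 1.0.

    The position held from period t to t+1 is decided from prices up to and
    including t only, so no future information leaks into the decision.
    """
    equity = 1.0
    curve = [equity]
    for t in range(window - 1, len(prices) - 1):
        history = prices[: t + 1]
        in_market = history[-1] > trailing_average(history, window)
        if in_market:
            equity *= prices[t + 1] / prices[t]
        curve.append(equity)
    return curve


# Hypothetical price paths, oldest first.
rising = [100.0, 101.0, 102.0, 103.0, 104.0]
falling = [104.0, 103.0, 102.0, 101.0, 100.0]

print(backtest_trend_rule(rising))
print(backtest_trend_rule(falling))
```

Running the same rule on several different historical samples, rather than on one favorable period, is one way to follow the advice above about testing on data from different samples.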

Challenges of Backtesting Deep Learning in the Stock Market

The first challenge of backtesting in deep learning is the availability of historical market data. Stock market analysis depends on consistently updated historical data, but such data is not readily available. Paywalls often restrict access and complicate its use in academic research. Although providers such as Wharton Research Data Services work with academic institutions to offer access to some of this data, the subscription level determines the degree of access. The data remains inaccessible to many institutions, which leaves two options: rely on inconsistent publicly available market data or pay the premium.

The value of $1,000 invested in LTCM, in the Dow Jones Industrial Average, and monthly in U.S. Treasuries at constant maturity.
JayHenry – Own work

The second challenge is access to supplementary data. Closely related to the previous issue, supplementary data can be used to improve performance on modeling tasks involving financial data. It can include fundamental data, such as firms’ annual reports, and alternative data, such as financial interests appearing in news articles. It is essential to distinguish these kinds of data because their sources usually differ from those that provide market data. There are many kinds of potential supplementary data, and some work remains before such data is readily available.


The third challenge is the long-term investment horizon. Several of the studies reviewed consider a relatively short investment horizon, from a few days to a few months. Many investments in the stock market, however, sit in portfolios that span decades, which makes buying and holding growth investments attractive. Growth investing targets young public companies that are expected to grow significantly and deliver above-average returns. Identifying a growth investment opportunity at an early stage would have produced larger-than-average returns.

Such patterns could be discovered by using supplementary data. By modeling similar historical growth investments as part of an investment strategy, it might become possible to identify newer investments that can produce considerable returns over the long term.

For instance, in a generative approach, the inverse probability, that of the data given the label, is estimated and combined with the prior probability using Bayes’ rule:

$$p(\text{label} \mid \boldsymbol{x}, \boldsymbol{\theta}) = \frac{p(\boldsymbol{x} \mid \text{label}, \boldsymbol{\theta})\, p(\text{label} \mid \boldsymbol{\theta})}{\sum_{L \in \text{all labels}} p(\boldsymbol{x} \mid L, \boldsymbol{\theta})\, p(L \mid \boldsymbol{\theta})}.$$
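
As a small numeric illustration of Bayes’ rule, the following Python sketch computes the posterior over two hypothetical labels, "up" and "down", for a single observation. The prior and likelihood values are invented for the example, the function name is an assumption, and the parameters θ are left implicit in the numbers.

```python
def posterior(likelihoods, priors):
    """Return p(label | x) from p(x | label) and p(label) using Bayes' rule."""
    # Denominator: sum over all labels of p(x | L) * p(L).
    evidence = sum(likelihoods[label] * priors[label] for label in priors)
    if evidence == 0.0:
        raise ValueError("observation has zero probability under every label")
    return {
        label: likelihoods[label] * priors[label] / evidence
        for label in priors
    }


# Hypothetical values: both labels equally likely a priori, but the observed
# features x are three times as likely under "up" as under "down".
priors = {"up": 0.5, "down": 0.5}
likelihoods = {"up": 0.3, "down": 0.1}

result = posterior(likelihoods, priors)
print(result)
```

With these numbers the posterior is 0.15 / 0.20 = 0.75 for "up" and 0.05 / 0.20 = 0.25 for "down".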

The last challenge is the lack of a deep learning framework for finance. Many popular machine learning and deep learning frameworks have improved the state of the art. Although these frameworks frequently appear in academic and industrial research, the implementations built on them are generally specific to financial considerations, and we observed no real attempts to extend the existing frameworks with improvements from these specialized works. Stock market machine learning problems involve learning incrementally from time-series data.

Some machine learning frameworks dedicated to incremental learning research have tools for concept drift and prequential evaluation built in. The absence of such frameworks for financial machine learning means that individual research teams must implement their ideas on their own, without integrating them into an open-source framework.
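
Prequential evaluation, also called test-then-train, scores a model on each new observation before the model is allowed to learn from it. The Python sketch below shows the loop in its simplest form. The running-mean model, the class and function names, and the data stream are hypothetical stand-ins, and the sketch is not taken from any particular framework.

```python
class RunningMeanModel:
    """Toy incremental model that predicts the mean of the values seen so far."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def predict(self):
        return self.mean

    def update(self, value):
        self.count += 1
        self.mean += (value - self.mean) / self.count


def prequential_mae(stream, model):
    """Mean absolute error where each value is predicted before it is learned."""
    total_error = 0.0
    seen = 0
    for value in stream:
        total_error += abs(value - model.predict())  # test first
        model.update(value)                          # then train
        seen += 1
    return total_error / seen if seen else 0.0


# Hypothetical stream of daily returns, oldest first.
stream = [0.01, -0.02, 0.015, 0.0, 0.005]
print(prequential_mae(stream, RunningMeanModel()))
```

Because every prediction is made before the corresponding update, the error reflects how the model would have performed as the data arrived, which suits time-series problems where the underlying concept may drift.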

Furthermore, an accessible framework focused on deep learning research using financial data would enable the promotion of such ideas and allow research in this area to conform to established industry practices more closely. It would also allow researchers to provide specific implementations to improve the state-of-the-art.

Charles H. Martin, PhD, CEO of Calculation Consulting and one of our favorite deep learning minds, told us:

“A unique challenge with financial research is that unlike other industry problems, like recommender systems, search relevance and/or ad click prediction, in finance, past performance does not reflect future outcomes. 

For this reason, you must be especially careful not to overfit the historical data.  

To that end, you may try applying the open-source weightwatcher tool to detect if your layers show tell-tale signatures of overfitting.”  

We encourage you to check out Dr. Martin’s site.


Written by Ruizhe Li
