NLP For Stock Market Prediction
One firm, ProntoNLP, believes it has harnessed Natural Language Processing (NLP) for alpha generation. The firm has shared its research with us below:
Testing the Basic ProntoNLP Extracted Events
There is a longstanding literature on the efficacy of an earnings-surprise-based signal: returns drift in the direction of the earnings surprise for at least one quarter. Stocks with the most positive earnings surprises enjoy positive abnormal returns in the quarter after earnings are announced.
These findings led to the use of the earnings surprise as a signal in stock selection. However, in spite of the importance of earnings as a summary measure, management provides many other layers of information during earnings conference calls. ProntoNLP extracted many “events” from the S&P transcripts of earnings conference calls, such as acquisitions, margins, backlog and buybacks. Some of these events cover the past, but many have ramifications for future performance. The purpose of this write-up is to examine a signal constructed from the ProntoNLP basic model. The software also lets users write their own event extractions, which could improve on the basic extractions offered by ProntoNLP.
Methodology:
We begin with the event extractions from the S&P earnings conference call transcripts. Coverage becomes good in the third quarter of 2009 and reaches over 6,000 earnings conference calls in the last quarters of 2021, some of which are held by foreign companies. ProntoNLP extracts its events from each transcript and assigns each transcript a score equal to the difference between the sums of positive and negative events, scaled by the total sum of extracted events. This measure can be thought of as the sentiment of the conference call, based on events that are important to management and analysts.
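The score described above can be sketched as follows. This is only an illustration of the stated formula, not ProntoNLP's implementation; the function name and the choice to treat a call with no extracted events as neutral are assumptions.

```python
def transcript_sentiment(n_positive: int, n_negative: int, n_total: int) -> float:
    """Net event sentiment: (positive - negative) / total extracted events.

    n_total may exceed n_positive + n_negative if some events are neutral.
    """
    if n_total <= 0:
        # Assumption: a transcript with no extracted events is scored as neutral.
        return 0.0
    return (n_positive - n_negative) / n_total


# Example: 12 positive and 4 negative events out of 20 extracted events.
score = transcript_sentiment(12, 4, 20)
```

By construction the score lies between -1 (all events negative) and +1 (all events positive).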
We construct a tone change measure as the current sentiment minus the average sentiment in the prior four quarters. We then match the conference call dates to the preliminary earnings announcement dates from Compustat, and require the two dates to be within one day of each other. This ensures that we have actual earnings conference calls, so that we can compare the extracted tone change signal to the earnings surprise signal.
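A minimal sketch of the tone change calculation for one company follows. The function name is hypothetical, and requiring all four prior quarters to be present is an assumption; the write-up does not say how shorter histories are handled.

```python
from statistics import mean
from typing import List, Optional


def tone_change(sentiments: List[float]) -> List[Optional[float]]:
    """Current sentiment minus the mean of the prior four quarters.

    `sentiments` holds one company's quarterly scores, oldest first.
    """
    changes: List[Optional[float]] = []
    for i, current in enumerate(sentiments):
        if i < 4:
            # Assumption: no signal until four prior quarters are available.
            changes.append(None)
        else:
            changes.append(current - mean(sentiments[i - 4:i]))
    return changes


history = [0.1, 0.2, 0.3, 0.4, 0.5]
signal = tone_change(history)  # only the fifth quarter has a value
```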
We then match the companies identified by the conference call transcripts with returns from the CRSP database. There were roughly 1,000 companies in late 2009, climbing steadily to over 2,000 companies in mid-2015 and ending with over 2,300 companies in late 2020. In addition, we extract the earnings surprise from IBES using both the IBES actual and the mean forecast of quarterly earnings by all analysts in the 90-day period before the earnings announcement (where only the most recent forecast by each analyst is retained). The standardized earnings surprise (SUE) is the actual earnings minus the mean forecasted earnings, scaled by the standard deviation of the analyst forecasts during the 90-day period. Requiring analyst forecast data reduces the sample somewhat: it still starts at around 1,000 companies, builds up to around 2,000 in 2017 and ends with over 2,100 companies in 2021.
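The SUE definition above can be illustrated with a short sketch. The function name is hypothetical, and two choices are assumptions not stated in the write-up: the sample (n-1) standard deviation is used, and no SUE is returned when fewer than two forecasts exist or when the forecasts do not disagree.

```python
from statistics import mean, stdev
from typing import Optional, Sequence


def standardized_surprise(actual_eps: float,
                          forecasts: Sequence[float]) -> Optional[float]:
    """SUE = (actual - mean forecast) / std. dev. of analyst forecasts.

    `forecasts` should hold each analyst's most recent forecast from the
    90-day window before the announcement.
    """
    if len(forecasts) < 2:
        return None  # assumption: dispersion is undefined for one forecast
    dispersion = stdev(forecasts)  # assumption: sample standard deviation
    if dispersion == 0:
        return None  # assumption: skip companies with identical forecasts
    return (actual_eps - mean(forecasts)) / dispersion


sue = standardized_surprise(1.10, [0.90, 1.00, 1.10])
```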
For each month-end beginning in December 2009, we identify the most recent earnings announcement for each company that had an earnings conference call and for which the earnings surprise signal was available. We rank the two signals (earnings surprise and conference call tone change) separately into deciles, where decile zero has the most negative earnings surprises (conference call tone changes) and decile nine the most positive. We assign each company its decile rank, scale it, and subtract 0.5 to obtain a variable in the range [-0.5, +0.5]. We then examine each signal's performance in the following month.
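The ranking step might look like the sketch below. Deciles 0 through 9 minus 0.5 would not land in [-0.5, +0.5], so the sketch assumes the decile rank is first divided by 9; that scaling, the function name, and the arbitrary handling of ties are all assumptions.

```python
from typing import List, Sequence


def scaled_decile_ranks(values: Sequence[float]) -> List[float]:
    """Map each value to its decile (0..9), then to decile / 9 - 0.5."""
    n = len(values)
    # Indices of the observations sorted from most negative to most positive.
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    for position, idx in enumerate(order):
        decile = min(position * 10 // n, 9)
        # Assumption: divide by 9 so decile 0 maps to -0.5 and decile 9 to +0.5.
        ranks[idx] = decile / 9 - 0.5
    return ranks


signal = [0.3, -1.2, 0.8, 2.5, -0.4, 0.0, 1.1, -2.0, 0.6, 1.9]
ranks = scaled_decile_ranks(signal)
```

With this scaling, a regression slope on the rank can be read as the return spread between the top and bottom deciles.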
We use two primary methodologies to test the signals. The first uses Fama and MacBeth monthly cross-sectional regressions, where the dependent variable is the abnormal return in the following month and the independent variable is the scaled signal rank. The coefficient on the signal is equivalent to the return on a hedge portfolio that takes long positions in the most positive signal decile and short positions in the most negative, except that it uses all companies, not only those in the extreme deciles. We assess the strength of the signal by examining the average coefficient of the 145 monthly cross-sectional regressions and the standard deviation of those coefficients. The abnormal return is the buy-and-hold return on a security minus the value-weighted buy-and-hold return on a portfolio of similar companies in terms of size (market value of equity, 3 groups), Book/Market (3 groups) and 11-month momentum (t-12 through t-1, 3 groups).
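The Fama and MacBeth procedure described above can be sketched in a few lines: one cross-sectional regression per month, then the time-series mean of the slopes and a t-statistic from their standard deviation. The function names and input layout are assumptions for illustration, and the t-statistic uses the plain standard error with no autocorrelation adjustment.

```python
from math import sqrt
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def ols_slope(x: Sequence[float], y: Sequence[float]) -> float:
    """Slope of a univariate OLS regression of y on x with an intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def fama_macbeth(months: List[Tuple[Sequence[float], Sequence[float]]]
                 ) -> Tuple[float, float]:
    """Average monthly slope and its t-statistic.

    Each element of `months` is (scaled signal ranks, next-month abnormal
    returns) for the cross-section of companies in that month.
    """
    slopes = [ols_slope(ranks, returns) for ranks, returns in months]
    avg = mean(slopes)
    t_stat = avg / (stdev(slopes) / sqrt(len(slopes)))
    return avg, t_stat


ranks = [-0.5, 0.0, 0.5]
toy_months = [
    (ranks, [-0.005, 0.000, 0.005]),  # slope 0.01
    (ranks, [-0.010, 0.000, 0.010]),  # slope 0.02
    (ranks, [-0.015, 0.000, 0.015]),  # slope 0.03
]
avg_slope, t_stat = fama_macbeth(toy_months)
```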
We also use a Fama-French factor analysis to assess whether the signal earns returns after controlling for the Fama-French (F-F) five factors. We first calculate the average return on the top decile in each of the 145 months of the sample period. We then run a regression of the monthly return minus the risk-free rate on the five F-F factors. A significantly positive intercept indicates abnormal performance beyond the five risk factors.
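The factor regression can be sketched with NumPy as an ordinary least squares fit whose intercept is the abnormal return. The function name is hypothetical, the data below are simulated purely to exercise the code, and the classical OLS standard error is an assumption; the write-up does not say which standard errors were used.

```python
import numpy as np


def factor_alpha(excess_returns, factors):
    """Regress portfolio excess returns on factor returns.

    Returns (alpha, betas, t-statistic of alpha) using classical OLS
    standard errors.
    """
    y = np.asarray(excess_returns, dtype=float)
    f = np.asarray(factors, dtype=float)
    X = np.column_stack([np.ones(len(y)), f])  # intercept column first
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    t_alpha = coef[0] / np.sqrt(cov[0, 0])
    return coef[0], coef[1:], t_alpha


# Simulated example: 145 months, five factors, a true alpha of 27 bps.
rng = np.random.default_rng(0)
sim_factors = rng.normal(0.0, 0.03, size=(145, 5))
true_betas = np.array([1.1, 0.2, 0.1, 0.0, 0.0])
sim_excess = 0.0027 + sim_factors @ true_betas + rng.normal(0.0, 0.001, 145)
alpha, betas, t_alpha = factor_alpha(sim_excess, sim_factors)
```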
Results:
We first evaluate the correlation between the earnings surprise signal and the conference call tone change signal. A high correlation would indicate that most of the information in the conference call can be captured by the earnings surprise signal. We find that the actual correlation between the two signals is about 11-12%. This correlation is low enough to indicate that conference calls capture additional information beyond that captured in earnings (see footnote 1).
The top decile of earnings surprises, i.e., the most positive surprises, earned 17 bps (the intercept in the first row) beyond the F-F five factors, which was significant at the 5% level. It had a beta of 1.09, indicating higher systematic risk, which is understandable given that these companies had extreme earnings surprises, possibly because of higher operating risk. It also showed positive tilts towards smaller and value companies, but no significant tilt to the profitability or investment factors. The most negative earnings surprises earned 23 bps above the F-F factors, significant at the 10% level. The positive return on the short portfolio is the opposite of what one would expect, but is consistent with our observations earlier.
In contrast, we find that the portfolio of the most positive conference call tone changes earned 27 bps after controlling for the F-F factors, significant at the 2% level. It also had a higher beta of 1.13, and positive tilts to smaller and value companies. The most negative tone changes had a negative return of 9 bps after controlling for the F-F factors, although this return was not significantly different from zero.
Main Takeaways:
1. The earnings surprise signal has not worked well in our sample during the period of the study, although the long portion of the signal (the most positive earnings surprises) showed positive and significant abnormal returns.
2. The conference call tone change signal earned significant abnormal returns, but mostly for the long positions (the most positive tone changes). The returns on the short positions were not materially different from zero.
3. The correlation of the conference call tone change and the earnings surprise signals is low at 11-12%, indicating that the conference calls provide additional information beyond earnings.
Performance Graphs
Figure 1 – Comparison of Return of our Long signal (top 10%) vs. Earnings Surprise and the Market (S&P 500) – Risk Free Return
• MKT_RF is the market return (Russell 1000) – Risk Free Return
• LongSignal = Our NLP based signal
• longsueaf1signal = The Earnings Surprise based return
Abnormal return (FF 3 factors) of top 10% (long) and bottom 10% (short). Buy and hold return on the security minus buy and hold cap-weighted return on similar stocks (size, B/M and momentum). Monthly Rebalancing.
Value and Growth versions of the basic signal
We identified the main events that characterize value companies and those that characterize growth companies. For the value tilted signal, events that are highly associated with value companies received a weight of 3 and all other events a weight of 1. For the growth tilted signal, events that are highly associated with growth companies received a weight of 3 and all other events a weight of 1.
As can be seen, the value tilted signal (green) performs better than the basic signal (orange) and much better than the growth tilted signal (red).
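One way the weighting could enter the score is sketched below. It assumes the weights are applied inside the same net-sentiment formula used for the basic signal, which the write-up does not state. The function name is hypothetical, and treating "buybacks" as value-associated is an illustrative placeholder, not ProntoNLP's actual classification.

```python
from typing import Iterable, Set, Tuple


def tilted_sentiment(events: Iterable[Tuple[str, int]],
                     tilted_types: Set[str],
                     tilt_weight: float = 3.0) -> float:
    """Weighted net sentiment of one transcript.

    `events` holds (event_type, polarity) pairs with polarity +1, -1 or 0.
    Events whose type is in `tilted_types` get `tilt_weight`; others get 1.
    """
    weighted_net = 0.0
    weighted_total = 0.0
    for event_type, polarity in events:
        weight = tilt_weight if event_type in tilted_types else 1.0
        weighted_net += weight * polarity
        weighted_total += weight
    # Assumption: a transcript with no events is scored as neutral.
    return weighted_net / weighted_total if weighted_total else 0.0


call_events = [("buybacks", 1), ("margins", -1), ("backlog", 1)]
basic = tilted_sentiment(call_events, tilted_types=set())
value_tilted = tilted_sentiment(call_events, tilted_types={"buybacks"})
```

In this toy call the positive buyback event counts three times in the value tilted version, which raises the score relative to the basic signal.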
1 We also tested another signal that is based on earnings estimate revisions. We summed up the number of Up, Down and Zero revisions in the 3-month period prior to the month of portfolio construction. We then calculated a diffusion signal of (Up-Down)/(Up+Down+Zero). Lastly, we assigned a rank of +0.5 to diffusion measures above 0.5, -0.5 to those below -0.5, and zero otherwise. The earnings revision signal was also only 11-12% correlated with the conference call signal.
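The revision signal in this footnote can be written out directly. The function name is hypothetical, and returning zero when there are no revisions at all is an assumption.

```python
def revision_signal(up: int, down: int, zero: int) -> float:
    """Three-level rank from the diffusion (Up - Down) / (Up + Down + Zero)."""
    total = up + down + zero
    if total == 0:
        return 0.0  # assumption: no revisions means a neutral signal
    diffusion = (up - down) / total
    if diffusion > 0.5:
        return 0.5
    if diffusion < -0.5:
        return -0.5
    return 0.0


mostly_up = revision_signal(up=8, down=1, zero=1)    # diffusion 0.7
mixed = revision_signal(up=3, down=2, zero=5)        # diffusion 0.1
mostly_down = revision_signal(up=1, down=8, zero=1)  # diffusion -0.7
```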