Does Option Volume Predict Stock Direction?

Trading and Investing

The Options Market Beat 94% of Participants in the M6 Financial Forecasting Contest

As of today I have permission to make public my mischievous entry in the year-long world-wide stock and ETF forecasting contest … the sixth major M-competition in a storied history stretching back to 1982.

My entry seems to have turned the contest from a “model versus model” battle into a “market versus model” one. You can call this hijacking if you want. I think some of you who know me well will have suspected it.

Rebellion Research’s 2022 Book Of The Year ‘Microprediction: Building an Open AI Network’

Here’s the repo containing code I used and hopefully in a small way this transparency furthers the aims of the organizers. Actually, the contest is not over but I doubt my main finding will change too much by the end. As you can see, my “microprediction” entry is currently 10th overall.

The “market” benchmark in 10th position overall (163 competitors)

I won’t re-describe the contest itself in this note; for that you will have to read the contest guidelines. But in short, participants provided a monthly-rebalanced portfolio and also monthly quintile probabilities judged by Brier score. The universe comprised fifty stocks and fifty ETFs.
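For readers who have not met the scoring idea, here is a minimal sketch of a multi-class Brier score applied to quintile probabilities. It is a generic textbook version, not the organizers' scoring code (the exact rule is in the contest guidelines), and the function name and toy inputs are made up for illustration.

```python
import numpy as np

def brier_score(probs, outcomes):
    """Mean multi-class Brier score (lower is better).

    probs:    (n_assets, 5) forecast probabilities, one row per asset
    outcomes: (n_assets,) realized quintile index for each asset, 0..4
    """
    probs = np.asarray(probs, dtype=float)
    # One-hot encode the realized quintile of each asset.
    onehot = np.eye(probs.shape[1])[np.asarray(outcomes)]
    # Squared error summed over the five classes, averaged over assets.
    return float(np.mean(np.sum((probs - onehot) ** 2, axis=1)))

# Illustrative only: the uniform forecast (0.2 for every quintile) on three assets.
uniform = np.full((3, 5), 0.2)
uniform_score = brier_score(uniform, [0, 2, 4])  # roughly 0.8 whatever the outcomes
```

Under this definition the uniform forecast scores about 0.8 regardless of what happens, which makes it a natural floor for any entry to beat.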

A Market Benchmark

There were two primary limitations I saw in this contest:

  1. The contest only ran for a year
  2. Rebalancing was to occur only once a month

Since fund managers are typically judged on the time-scale of decades (or should be — see Donoho, Crenian and Scalan JoPM article) it may seem odd to some to run a one year investment contest using Sharpe ratio. But what would be the alternative? The choices by the organizers are pragmatic.

I think we can safely assume, though, that the non-linear prize structure impacts contest behavior. I am as greedy as the next guy, but I decided to exercise restraint on this one occasion and not optimize for cash. As I don’t yet know what those on the podium chose to do, I can’t comment on whether they swung for the fences or not … but I’m pretty sure they would have had to do so in order to claim the big prize.

My entry is, as noted, morally a benchmark. I had decided it would simply use the option market for the all important volatilities, not a prediction model of any kind.

Before the contest I was reasonably sure this benchmark would be a stronger one than that provided by organizer Spyros Makridakis, who input the 1/n portfolio and the uniform distribution for all rank probabilities … but I was not sure how mine would fare against prediction teams from around the world.

Of course, I needed some method of constructing a portfolio as well. Since the organizers’ benchmark was long-only I decided on the same, and I went with my own creation: Schur portfolios. To be honest I’d have to check the commits to see exactly what I did each month. However, the last entry used a variety of this method, as can be seen from the code.

More on Schur in a moment.

The Hypothesis

My pre-contest hypothesis, which I hope is rather obvious from my entry and thus does not require the melodramatic opening of a letter sent to myself, was that the majority of quants and data scientists entering the contest would do a worse job of estimating probabilities than the market itself.

By the market I refer to the shortest dated options market.

I suppose there may be some others that might help infer market dynamics (betting markets, prediction markets etc). I had given some consideration to also using some very short term volatilities like this one from the microprediction platform, but never got around to it and thought that the mis-match in time-scales would spoil the clean thesis, if not the results.


Now as you can see from the code for my entry, I did exercise a tiny bit of license in the use of the volatilities. I would not really wish to defend this in court but here it is:

Here stocks are listed before ETFs and the interpretation is that I shied slightly away from investing in the former versus the latter … call it gut instinct or call it an attempt to ever-so-slightly unwind the implicit shrinkage that occurs later in the portfolio construction.

Another possibly dubious decision was a “cowardice shrinkage”, as we might term it, towards the 1/n benchmark.

Whether this was a good idea or just a way of reducing my chance of real prize money to near zero I do not claim to know. Maybe I just wanted to beat Spyros! I doubt it changes the big picture here.
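A minimal sketch of what shrinkage towards the 1/n benchmark can look like, assuming a plain convex combination of the two portfolios. The function name, the example weights and the shrinkage value of 0.25 are illustrative assumptions, not values taken from the contest code.

```python
import numpy as np

def shrink_to_equal_weight(weights, shrinkage):
    """Blend a long-only portfolio with the 1/n portfolio.

    shrinkage = 0 returns the (normalized) input portfolio,
    shrinkage = 1 returns the equal-weight portfolio.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                          # normalize so weights sum to one
    equal = np.full_like(w, 1.0 / len(w))    # the 1/n benchmark
    return (1.0 - shrinkage) * w + shrinkage * equal

# Hypothetical four-asset portfolio pulled a quarter of the way towards 1/n.
shrunk = shrink_to_equal_weight([0.5, 0.3, 0.2, 0.0], shrinkage=0.25)
```

Because both ingredients are long-only and sum to one, so does the blend, and every asset ends up with at least shrinkage/n of the portfolio.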


Similarly, the estimation for correlations is up for grabs although, I tend to think, not overly controversial either. I used my Python precise package (documented here) and in particular, I used an exponentially weighted partial-moments method of predicting covariance with a memory of 100 time-steps and a decay of 0.01.

You are supposed to infer my choice from the import statement in the code

It is entirely possible that I made this choice just so that Fred Viole wouldn’t have to challenge me on not using partial moments. In seriousness, they are rather nifty. Beyond that there is nothing mysterious in the covariance estimation (see here) except that as noted, I kept only the correlation — preferring to use the options markets to set the relative sizes of variances.
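The idea of keeping only the correlation and letting the options market set the variances can be sketched as follows. Note the assumptions: a plain exponentially weighted covariance stands in for the partial-moments estimator from the precise package (whose API is not reproduced here), the returns are random placeholders, and the implied volatilities are invented numbers.

```python
import numpy as np

def ewma_covariance(returns, decay=0.01):
    """Exponentially weighted covariance; the most recent row gets the most weight.

    This is a stand-in, NOT the partial-moments method mentioned above.
    """
    x = np.asarray(returns, dtype=float)
    n_obs = x.shape[0]
    weights = (1.0 - decay) ** np.arange(n_obs - 1, -1, -1)
    weights /= weights.sum()
    centered = x - weights @ x               # subtract the weighted mean
    return (centered * weights[:, None]).T @ centered

def correlation_from_covariance(cov):
    """Discard the variances, keep the correlation structure."""
    std = np.sqrt(np.diag(cov))
    return cov / np.outer(std, std)

def covariance_from_implied_vols(corr, implied_vols):
    """Rebuild a covariance matrix with market-implied volatilities on the diagonal."""
    vols = np.asarray(implied_vols, dtype=float)
    return corr * np.outer(vols, vols)

# Placeholder data: 100 time-steps of returns for three hypothetical assets.
rng = np.random.default_rng(0)
returns = rng.normal(size=(100, 3))
corr = correlation_from_covariance(ewma_covariance(returns, decay=0.01))
cov = covariance_from_implied_vols(corr, [0.20, 0.35, 0.15])  # made-up implied vols
```

The resulting matrix has historical co-movement off the diagonal and the market's view of each asset's volatility on it, which is the division of labour described above.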

Rank probabilities

Once correlations and volatilities were determined, rank probabilities follow as busywork — i.e. Monte Carlo.
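For concreteness, the busywork might look something like the sketch below. It assumes zero-mean Gaussian returns, which is an assumption made here for illustration and not necessarily what the contest code did; the function name, simulation count and toy covariance are likewise hypothetical.

```python
import numpy as np

def quintile_probabilities(cov, n_sims=20000, seed=0):
    """Monte Carlo estimate of each asset's probability of landing in each quintile.

    Returns an (n_assets, 5) array; column 0 is the worst-performing quintile.
    Assumes zero-mean multivariate normal returns (an illustrative choice).
    """
    cov = np.asarray(cov, dtype=float)
    n_assets = cov.shape[0]
    rng = np.random.default_rng(seed)
    sims = rng.multivariate_normal(np.zeros(n_assets), cov, size=n_sims)
    # Rank assets within each simulated scenario: 0 = lowest return.
    ranks = sims.argsort(axis=1).argsort(axis=1)
    # Map ranks to quintiles 0..4 (equal-sized when n_assets is a multiple of 5).
    quintiles = (ranks * 5) // n_assets
    probs = np.zeros((n_assets, 5))
    for q in range(5):
        probs[:, q] = (quintiles == q).mean(axis=0)
    return probs

# Toy universe: ten uncorrelated assets with increasing volatility.
vols = np.linspace(0.1, 1.0, 10)
rank_probs = quintile_probabilities(np.diag(vols ** 2))
```

In the toy example the high-volatility assets pile probability into the top and bottom quintiles while the quiet ones cluster in the middle, which is why the volatilities matter so much for this side of the contest.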

Although in passing I will mention that at this juncture I did temporarily go down a very interesting rabbit-hole. I think I may have discovered the world’s first fast algorithm for producing all rank probabilities for five variables whose distributions are known.

I even wrote up a paper on this but, unfortunately, the organizers changed the rules of M6 so my hard work was irrelevant (or perhaps I just misinterpreted the original intent prior to their rewording — this was lucky as I never would have made the discovery otherwise). If you are an editor of a journal whose readers might be interested in an amazing advance in order statistics, then do get in contact with me.


As noted I utilized my brain-child: Schur Complementary portfolios.

Schur portfolios nest Hierarchical Risk Parity but they are more strongly motivated mathematically as a top-down device. Actually they are too logical, at least for one leading academic journal — judged “too mathematical” by the editor.

I assume some of my readers are more mathematically inclined and would not like their understanding of portfolio theory and the connections between hierarchical and optimization-based approaches to be curtailed by a deliberate eschewing of undergraduate linear algebra. You can read about it here.

I’m almost certain I could have used just about anything semi-reasonable to construct diversified long-only portfolios, and my justifications for Schur lie elsewhere. But in the interest of completeness I’ve included the code and I note a few minor matters of “art” such as the ad-hoc covariance adjustment:

Ad-hoc covariance adjustment

There were a couple of other ad-hoc twiddles in the code. For example I ever-so-slightly penalized stocks with upcoming earnings announcements.


I said above I did this for science but a skeptic will say my behavior betrays me. I got off to a rather bad start and it was pretty clear I was not going to win any money with such a conservative approach. That little dream died and after about six months I will admit that the absence of cash incentive led me to be a little less religious about my monthly rebalancing and entry submission.

By “less religious” I mean I forgot every single time. As I check the logs now (for example) I am ashamed to see that my last entry was the end of June. So my portfolios and probabilities just rode along for six months or more. You’ll have to forgive me as I maintain a contest platform which requires no manual uploading from participants, and anything I can’t completely automate falls to the wayside eventually.

My laziness did accidentally create a test for another hypothesis: is increasingly stale market volatility still a better guide than most data scientists? The answer would appear to be yes.


I documented the fortunes of myself and also Marco Gorelli on this page. I’m not sure what Marco did (we will surely find out soon) although he did mention publicly that he was using my precise package. Here are my prediction percentiles for each of the five stages of the contest:

Dumb luck? Probably not. As you can see, I think it is abundantly clear that the market-implied benchmark I created consistently outperformed a huge majority of contestants.

I’m less sure, pending further analysis on luck versus skill which I may never have time to get to, what to make of my performance in the investment side of the event. However I did have a positive return and outperformed the 1/n benchmark which, at the time of writing, had a negative return over the year.

Like Hierarchical Risk Parity, my portfolios represent a very cautious use of correlation information.

I am guessing, but of course do not know, that my entry would beat the 1/n benchmark quite consistently if this competition were to run indefinitely.

I would like to thank JB Kurland for producing a wonderful interactive showing the rank progression of all competitors, including myself, over the course of the year. It is interesting to look at the other competitors to see who might have been taking big chances in order to finish high. Here are my progressive rankings:

And here are the rankings of the 8th place finisher:

Hmmm … had I not revealed my method I’d be willing to wager a healthy sum on a year-long rematch! But that’s not true of everyone in the top ten. I suspect some beat me fair and square. Well done to you.

Tentative Conclusion — Stop Using Models When You Can Use a Market Instead

The option market, so interpreted, outperformed almost all participants in the M6 Financial Forecasting contest even though <snark> it was handicapped by a portfolio construction method rejected by a fine journal </snark>. So, while the contest was ostensibly an exercise to compare models, it actually adds to the extensive literature on the superiority of markets over models.

I believed before the contest that markets should be used in place of models whenever it is possible to do so. I believe that after the contest. Yet most employers of data scientists might not have thought carefully about this social science “law”, one that is far more reliable than most findings about people, information or prediction.

I hope they read my book or at least the tldr.

For those of you who are upset by this M6 revelation, I’m sorry. Perhaps once the winning methodologies are revealed you will be able to convince yourself that you would have selected those methodologies, or people. But I shall remain skeptical.

If you have a magical way of predicting in advance which data scientist will be the “right one” … good for you. If you don’t, then I strongly suggest you initiate a market with a few lines of code, in addition to whatever else you do, or at least hold their feet to the fire by publishing model residuals and receiving effortless ongoing performance analysis.

The odds of a data scientist locating the correct exogenous data also seem quite small. How many entrants in the M6 contest “found” the most relevant data in this case — even though it is economically obvious?

For M6 competitors who wish to prove me wrong, or prove me right, you can join the ongoing higher frequency version of the M6 Competition — a collection of thousands of prediction tasks where skill is paramount. You can read the docs or background information site or just join the slack.

In that way, you’ll be increasing the probability that “data science” as we know it today can be replaced by something decidedly less mediocre in other contexts too, not just time-series prediction.

You may disagree. But do me a favour and find your disagreement with the long form of this argument, not my short one here.

The book rated 5 stars by EVERY SINGLE AMAZON REVIEWER (except some doofus who admitted in the review they didn’t understand anything in the book but felt compelled to rate it anyway).
