
If we want to get alpha returns, we need to understand some aspect of a market or asset better than anyone else. And for quants, that means good systematic models.
Alpha research
Alpha research is the attempt to understand the factors that drive investment returns, so that we can improve those returns.
So we actually need two things: a model and a strategy.
A model tries to explain current prices using information we could have known in the past.
A strategy combines that model with other knowledge and spits out proposed trades that we think will result in good returns.
A Model
A simple example of a model is a mathematical function that predicts what the price of Tesla shares will be in two years, using only data we know today. That data may include company details and information about the markets and the world.
In practice, the model often gets more complex. We may find that we get fuzzy predictions for Tesla itself, but much better predictions if we take some broad or narrow market conditions as given. Then we might develop a pretty good model for a new virtual asset, say the price of Tesla relative to the price of a basket of alternative energy stocks. We could call this virtual asset “hedged Tesla”, since when we “buy” it, we are hedging Tesla against changes in the alt-energy sector.
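As a rough sketch of the “hedged Tesla” idea, the snippet below builds a relative price series by dividing a stock's price by the price of an equal-weighted basket. Every number here is made up, the tickers `AAA` and `BBB` are placeholders, and the equal weighting and the function names are assumptions chosen for illustration rather than a prescribed method.

```python
def basket_price(prices_by_ticker):
    """Equal-weighted basket: average the member prices on each date."""
    series = list(prices_by_ticker.values())
    n_dates = len(series[0])
    return [sum(s[t] for s in series) / len(series) for t in range(n_dates)]


def relative_price(asset_prices, hedge_prices):
    """Price of the asset measured in units of the hedge basket."""
    return [a / h for a, h in zip(asset_prices, hedge_prices)]


# Hypothetical prices on three consecutive days, for illustration only.
tesla = [200.0, 220.0, 210.0]
alt_energy = {
    "AAA": [50.0, 55.0, 50.0],
    "BBB": [150.0, 165.0, 150.0],
}

basket = basket_price(alt_energy)
hedged_tesla = relative_price(tesla, basket)
```

In this toy data, Tesla rises 10% on the second day, but so does the basket, so the hedged series does not move. On the third day both fall, the basket by more, and the hedged series rises. That is the sector-wide move being stripped out.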
As an aside, this modeling improvement is behind the presence of the word “hedge” in the phrase “hedge fund”. The original hedge funds invested exactly in ways that tried to remove the effects of broad market moves. Today, of course, a hedge fund is a legal structure with tax and reporting implications — and it permits all kinds of strategies.
More below on modeling…
A Strategy
A simple example of a strategy is the following. Assume we have a model for US large cap stock prices. We buy all the undervalued stocks, and we “short” all the overvalued stocks. But a strategy has to get a little more complex for technical and practical reasons.
One technical reason arises because we like to think we can estimate the precision of the predictions we make with our models. Given that, we probably want to make bigger bets on more precise predictions and smaller bets on fuzzier predictions.
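One possible way to turn this into numbers, offered only as a sketch: size each bet in proportion to its predicted return divided by the variance of the prediction, then scale so that the absolute weights sum to 1. The sizing rule, the function name `size_bets`, and all inputs below are assumptions for illustration.

```python
def size_bets(expected_returns, prediction_stdevs):
    """Weight each bet by expected return / prediction variance,
    scaled so the absolute weights sum to 1 (100% gross exposure)."""
    raw = {
        name: expected_returns[name] / prediction_stdevs[name] ** 2
        for name in expected_returns
    }
    gross = sum(abs(w) for w in raw.values())
    return {name: w / gross for name, w in raw.items()}


# Hypothetical predictions: equal-sized expected returns, different precision.
expected = {"precise": 0.02, "fuzzy": 0.02, "short": -0.02}
stdev = {"precise": 0.05, "fuzzy": 0.10, "short": 0.10}

weights = size_bets(expected, stdev)
```

With these inputs the precise prediction gets about four times the weight of the fuzzy one, because its standard deviation is half as large. The overvalued stock gets a negative (short) weight.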
Two more practical reasons that lead to more complex strategies are capital limits and trading costs. Capital limits are related to the precision/fuzziness input above. Strategies become complex when we consider how to allocate the limited capital we have among the various stocks the model covers. Trading costs are the money we lose in the act of buying, selling, or shorting a stock. One cost is brokerage fees, but for big players a much bigger trading cost is market slippage. Slippage is the sad fact that when we buy a ton of shares, we drive up the price we ourselves pay. Likewise, when we sell a ton of shares, we drive down the price we receive. Because of both of these costs, our simple strategy might need to become more complex so that it limits the number and size of trades and avoids racking up excessive trading costs.
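To make the fees-versus-slippage comparison concrete, here is a toy calculation. It assumes a linear impact model in which the average fill price rises in proportion to the order's share of daily volume. That model, the coefficient `impact_coeff`, the per-share fee, and all the numbers are illustrative assumptions, not estimates of real market behavior.

```python
def buy_cost(shares, quoted_price, daily_volume,
             impact_coeff=0.1, fee_per_share=0.005):
    """Cost breakdown for a buy order under a toy linear-impact model.

    Assumption: the average fill price exceeds the quote by
    impact_coeff * (shares / daily_volume), as a fraction of the quote.
    """
    participation = shares / daily_volume
    avg_fill_price = quoted_price * (1 + impact_coeff * participation)
    slippage = (avg_fill_price - quoted_price) * shares
    fees = fee_per_share * shares
    return {
        "slippage": slippage,
        "fees": fees,
        "total": avg_fill_price * shares + fees,
    }


# Hypothetical orders in a stock quoted at $100 that trades 1M shares a day.
small = buy_cost(shares=1_000, quoted_price=100.0, daily_volume=1_000_000)
large = buy_cost(shares=100_000, quoted_price=100.0, daily_volume=1_000_000)
```

Under these assumptions the small order loses roughly $10 to slippage and $5 to fees, while the large order loses roughly $100,000 to slippage and $500 to fees. Fees grow in line with order size, while slippage in this model grows with its square, which is why it dominates for big players.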
Developing a strategy often involves a mathematical tool called an optimizer, which helps us find the ideal strategy given a set of models, constraints, and well-defined objectives.
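The sketch below shows the shape of such an optimization problem, with a deliberately crude brute-force grid search standing in for a real optimizer. The objective (expected return minus a risk penalty minus a trading-cost penalty), the capital constraint, the treatment of the assets as uncorrelated, and every parameter value are assumptions made for illustration.

```python
from itertools import product


def objective(weights, expected, variances, current, risk_aversion, cost_rate):
    """Expected return minus a risk penalty minus a trading-cost penalty.

    Assumption: assets are treated as uncorrelated, so risk is just the
    sum of weight^2 * variance.
    """
    ret = sum(w * m for w, m in zip(weights, expected))
    risk = 0.5 * risk_aversion * sum(w * w * v for w, v in zip(weights, variances))
    cost = cost_rate * sum(abs(w - c) for w, c in zip(weights, current))
    return ret - risk - cost


def grid_optimize(expected, variances, current, risk_aversion=4.0,
                  cost_rate=0.001, capital_limit=1.0, step=0.05):
    """Brute-force search over a grid of long/short weights.

    Constraint: gross exposure (sum of absolute weights) <= capital_limit.
    """
    n_steps = int(round(capital_limit / step))
    grid = [i * step for i in range(-n_steps, n_steps + 1)]
    best_weights, best_value = None, float("-inf")
    for weights in product(grid, repeat=len(expected)):
        if sum(abs(w) for w in weights) > capital_limit + 1e-9:
            continue  # violates the capital limit
        value = objective(weights, expected, variances, current,
                          risk_aversion, cost_rate)
        if value > best_value:
            best_weights, best_value = weights, value
    return best_weights, best_value


# Hypothetical model outputs for three stocks, starting from no positions.
expected = [0.05, 0.02, -0.03]
variances = [0.04, 0.01, 0.02]
current = [0.0, 0.0, 0.0]

best_weights, best_value = grid_optimize(expected, variances, current)
```

A grid search scales terribly with the number of assets and is only here to keep the example dependency-free. Real work would typically use a proper convex or quadratic programming solver, but the ingredients are the same: model outputs, constraints, and an objective.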
A little more on models
When we develop a financial model, we are working in a sub-field of social science, and we find our models are imprecise. It’s not like physics, where we can predict to 17 digits.
When we see this imprecision, we naturally want to fix it. And we might assume that, since we are modeling human behavior, the only way to be more precise is to have more complex models. The truth is… sometimes. Complex models create many other issues. So at least look at the data first. Plot your target variable (Y) against many features and see whether there are any hints that added complexity will improve things.
In fact, we feel, on a rough empirical basis, that four things tend to enhance the PnL (profit and loss) that you can get from a model. They are listed below roughly in order of importance. These four enhancements usually add more PnL than switching to more complex models, though this depends a lot on the particular thing you are trying to predict.
Data: Get more data. If possible, get data that is unique to us: data that is either not available to, or not being used by, other market participants. We think this drives about 80% of PnL.
Checks: Do better data checks. Poorly prepared and poorly vetted data tend to produce bad models.
Markets: Take into account relevant market details. This is a whole category of work.
Inference: Are we doing good science? Can we do some careful causal inference?
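As one small illustration of the “Checks” item in the list above, the sketch below scans a daily price series for three common problems: missing values, non-positive prices, and implausibly large day-over-day moves. The function name, the 50% threshold, and the sample series are all assumptions for illustration. Real vetting would go much further.

```python
def check_prices(prices, max_daily_move=0.5):
    """Return a list of (index, problem) pairs for a daily price series."""
    problems = []
    previous = None  # last price that passed the basic checks
    for i, price in enumerate(prices):
        if price is None:
            problems.append((i, "missing"))
            continue
        if price <= 0:
            problems.append((i, "non-positive"))
            continue
        if previous is not None and abs(price / previous - 1) > max_daily_move:
            problems.append((i, "suspicious jump"))
        previous = price
    return problems


# Hypothetical series with a gap, a bad tick, and what could be an
# unadjusted stock split on the last day.
series = [100.0, 101.0, None, 102.0, -1.0, 103.0, 10.4]
issues = check_prices(series)
```

Here the check flags positions 2, 4, and 6. Whether a flagged jump is an error or a real event is a judgment call that depends on the data source, which is part of why this step takes real work.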
After these four, try to creatively approach data visualization and data summarization to look for hints. Take a look at feature engineering. In some cases, these two will hint at better modeling techniques.