Hi guys, hi from Andrea Unger.
Today I would like to talk a bit about historical data.
A question I often get is: how far back should historical data go to test a market effectively?
And: what time frame, meaning what resolution, should the data have to get a clear picture of the market?
So this depends on the instruments you intend to trade.
Let’s start, for example, with index futures: the DAX future, the mini S&P, whatever.
Here the point is that you should go back far enough to cover different market scenarios.
If your data starts from, for example, 2012, you won’t get a clear picture, because after that date, and also for some years before, markets had more or less a common uptrend behavior. So you will develop on markets with a strong directional bias, and this can obviously produce strategies that might not be effective when things change.
We know what happened in index futures in 2008, so it is recommended to cover at least that strong downtrend year as well, to have a clear picture of trend and volatility in these markets.
2001 and 2002 also showed a strong downtrend, so it could be good to have a look at what happened then. But in those years there was a smaller algorithmic presence in the market, and the dynamics of the moves were a bit different. This does not mean that what worked then is not working today, but referring to those moves is not as important as referring to 2008, which gives you an idea of what happens when markets really fall in a violent way.
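To make the idea of a "strong directional bias" concrete, here is a minimal sketch in Python. The function name `directional_bias` and the two price lists are hypothetical, invented only for illustration; they are not real index data. It simply compares the buy-and-hold return and the share of up periods for a sample with and without a crash.

```python
def directional_bias(closes):
    """Return (buy_and_hold_return, share_of_up_periods) for a list of closes."""
    if len(closes) < 2:
        raise ValueError("need at least two closes")
    # Period-to-period price changes.
    changes = [later - earlier for earlier, later in zip(closes, closes[1:])]
    up_periods = sum(1 for change in changes if change > 0)
    total_return = closes[-1] / closes[0] - 1.0
    return total_return, up_periods / len(changes)


# Hypothetical closes, invented for illustration only (NOT real market data).
uptrend_only = [100, 112, 125, 131, 150]
with_crash = [100, 110, 65, 80, 95]

print(directional_bias(uptrend_only))
print(directional_bias(with_crash))
```

A sample where almost every period is up tells you little about how a strategy behaves in a violent fall, which is the reason for wanting a year like 2008 in the data.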
So 2006 could be a good compromise as a starting point on index futures.
As for the time frame, if you can get down to five-minute bars, that is good.
Smaller than that, one minute or tick data, I really don’t think is necessary, and tick data is by the way quite expensive.
It’s not something we really need, and if you plan to develop on tick data then you would probably also need a technical infrastructure to trade at that resolution, and normally retail traders don’t need and do not have that kind of hardware, connectivity and so on.
So actually five minutes is enough to clearly define what the moves look like and therefore what kind of action to take.
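If your data comes at a finer resolution, such as one-minute bars, it can be rolled up to five minutes. The following is a minimal sketch in plain Python, under some assumptions: each bar is a `(timestamp, open, high, low, close, volume)` tuple, bars are sorted by time, each bar is stamped with its start time, and the bar length divides 60. The function name `to_five_minute_bars` and the sample prices are hypothetical.

```python
from datetime import datetime, timedelta


def to_five_minute_bars(bars, minutes=5):
    """Aggregate (timestamp, open, high, low, close, volume) tuples into N-minute bars.

    Assumes bars are sorted by time, stamped with their start time,
    and that `minutes` divides 60.
    """
    out = []
    for ts, o, h, l, c, v in bars:
        # Floor the timestamp to the start of its N-minute bucket.
        bucket = ts.replace(minute=ts.minute - ts.minute % minutes,
                            second=0, microsecond=0)
        if out and out[-1][0] == bucket:
            # Same bucket: keep the first open, extend high/low, take the latest close.
            _, b_open, b_high, b_low, _, b_vol = out[-1]
            out[-1] = (bucket, b_open, max(b_high, h), min(b_low, l), c, b_vol + v)
        else:
            out.append((bucket, o, h, l, c, v))
    return out


# Seven hypothetical one-minute bars starting at 09:30 (invented prices).
start = datetime(2020, 1, 2, 9, 30)
one_minute = [(start + timedelta(minutes=i), 100 + i, 101 + i, 99 + i, 100.5 + i, 10)
              for i in range(7)]

for bar in to_five_minute_bars(one_minute):
    print(bar)
```

Here the first five one-minute bars fall into the 09:30 bar and the last two into the 09:35 bar, which stays incomplete until more data arrives.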
Going to commodities, crude oil, gold, soybeans or whatever, I would say that data starting from 2010 is good enough. There is data before that, 2008 and 2009, but the electronic markets on these futures were newly born, and the initial moves were not as clear as we need to develop effectively what we want to trade today.
So in those cases, I believe that data going back to 2010 is absolutely satisfactory for a good database.
On currency futures or on Forex, it would not be bad to go back to 2003 at least. Starting from 2008 or so you can also get something, but going back to 2003 is not bad, because I noticed that from time to time some dynamics seen in some periods of the past come back and show up again on the markets.
So if you can go that far back, it’s not bad.
If you refer to Forex instead of currency futures, be careful about the source of the data, because, you know, it’s an OTC market, so everybody has their own data, let’s say, and it’s important to develop on the data you will most probably be trading on once you go live.
So this is more or less the picture.
On grains, I had forgotten: be careful, because the sessions changed from time to time over the years, so it is important to set the sessions up correctly when you backtest.
On Bonds, I don’t think many people are that interested in Bonds, but what was said for indexes might be valid for Bonds as well.
This is more or less the picture.
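The start years mentioned above can be collected in a small lookup and used to check a data file before you start developing on it. This is only a sketch: the names `SUGGESTED_START_YEAR` and `history_is_long_enough` are hypothetical, and the 2006 value for bonds is an assumption based on the remark that what holds for indexes might hold for bonds too.

```python
from datetime import date

# Suggested earliest year of data per asset class, taken from the talk.
SUGGESTED_START_YEAR = {
    "index_futures": 2006,  # covers the 2008 downtrend
    "bonds": 2006,          # assumption: same as index futures
    "commodities": 2010,    # electronic markets were young before that
    "currencies": 2003,     # currency futures or Forex
}


def history_is_long_enough(asset_class, first_bar_date):
    """Return True if the data starts in or before the suggested start year."""
    return first_bar_date.year <= SUGGESTED_START_YEAR[asset_class]


print(history_is_long_enough("index_futures", date(2012, 1, 3)))
print(history_is_long_enough("commodities", date(2010, 1, 4)))
print(history_is_long_enough("currencies", date(2008, 1, 2)))
```

These are rules of thumb rather than hard limits, so treat a `False` result as a warning that the sample may be missing important market scenarios.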
Obviously, the higher the quality of the data you gather, the higher the quality of the software you use to backtest on that data must be.
I mean, if you get tick data for thousands of dollars and then you use a small hand calculator to develop, obviously you don’t have a good balance.
So the higher the quality of the data you focus on, the higher the quality of the software you use to test your strategy must obviously be.
This is something you have to keep in mind when you try to structure your backtesting environment.
That’s it, I hope it was useful.
In any case, ask your questions here below in the comments, I will be glad to answer, and see you next time.
Ciao from Andrea Unger