I guess it's time to tell the story online?<p>When I graduated college, I spent 3 months programming with my econ friend, trying to build exactly this. I started off by creating a system to paper trade stocks retroactively. You go back in time and pretend it's January 1st, 1982, have an algorithm look at the stocks as of that day, then move forward a day at a time, letting it trade through the past 40 years to see how it does.<p>We tried linear models, SVMs, neural networks, RNNs, ensembles, genetic algorithms, anything with stock data, news sentiment data, classic quant structures, and everything in between. Basically, 3 solid months of coding before I started working.<p>Anyway, I found out a lot of stuff the hard way, because I didn't have an econ degree.<p>First off, if you try enough methods, you end up p-hacking or hill climbing on the past anyway, and the results are no good.<p>Second off, clean historical data is hard to get. It may or may not account for splits or other things, so you may inadvertently supply information from the future when playing back from the past. It's hard to get this right.<p>Third off, many of the models we used were almost always competitive in the 80s (even a linear regression), but in the aughts or 2010s they stopped being competitive. We figured computer-based trading at hedge funds was becoming more competitive.<p>Fourth, simple models tended to work better. For instance, we might train a model on data from the 70s-80s, then, starting in the 80s, do online (continuous) training as we moved the model forward in time. There's just not enough data. You can train on all historical stocks, or all stocks or related data streams in the industry up to that point, but I think we probably didn't have enough data, and the market is competitive.<p>Fifth, I wish I had read A Random Walk Down Wall Street earlier, or all of Taleb's stuff. 
These are books that have a deep mistrust of quants.<p>Sixth, I think to be competitive, you need to have money in the game, many heuristics, and industry experience. Big firms have all of this plus the equipment, but it's hard to get in as an individual.<p>Seventh, I put several hundred hours into this project and learned a bunch about machine learning and economics. I loved the experience in every way, and I'd encourage you to try it. I'm probably a n00b here, but I hope some of my notes can help you.
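<p>For anyone who wants to try the paper trading setup, here is a minimal sketch of a walk-forward loop with online refitting. It is not the system from the story, just an illustration under assumptions: the prices are synthetic, the function name `walk_forward_backtest` and its `lookback` and `warmup` parameters are made up for this example, the model is a plain least-squares fit on lagged returns, and the trading rule is simply "long if the predicted return is positive, otherwise flat". The one thing it tries to get right is the point about not supplying information from the future: each day's model is fit only on samples that were already observable that day.

```python
import numpy as np


def walk_forward_backtest(prices, lookback=5, warmup=250):
    """Paper trade one asset day by day, refitting on past data only.

    Hypothetical helper for illustration; returns the strategy's daily returns.
    """
    prices = np.asarray(prices, dtype=float)
    # returns[i] is the simple return from day i to day i + 1.
    returns = np.diff(prices) / prices[:-1]

    # Sample i: features are the `lookback` returns just before the target,
    # target is the next return. Features never include the target itself.
    n_samples = len(returns) - lookback
    X = np.array([returns[i:i + lookback] for i in range(n_samples)])
    X = np.column_stack([np.ones(n_samples), X])  # intercept column
    y = returns[lookback:]

    strategy_returns = []
    for t in range(warmup, n_samples):
        # Fit only on samples strictly before t, so nothing from the
        # future leaks into the decision made for sample t.
        coef, *_ = np.linalg.lstsq(X[:t], y[:t], rcond=None)
        prediction = X[t] @ coef
        position = 1.0 if prediction > 0 else 0.0  # long or flat (assumed rule)
        strategy_returns.append(position * y[t])
    return np.array(strategy_returns)


# Synthetic random-walk prices, purely for demonstration (not market data).
rng = np.random.default_rng(0)
demo_prices = 100 * np.exp(np.cumsum(rng.normal(0.0, 0.01, 1000)))
demo_returns = walk_forward_backtest(demo_prices)
print("days traded:", len(demo_returns))
print("compounded strategy return:", np.prod(1 + demo_returns) - 1)
```

A quick sanity check for this kind of loop is to change the prices late in the series and confirm the early trades come out identical. If they don't, future data is leaking in somewhere. Note this sketch ignores transaction costs, splits, and survivorship, so any profit it shows on real data should be treated with suspicion.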