More challenging projects every programmer should try

757 points by pavehawk2007 over 4 years ago

36 comments

henning over 4 years ago

As someone who has played with writing trading bots but never traded them with real money, some advice: if your results seem too good to be true, they probably are. Your trading bot may be doing unrealistic things, or its results may not be reliable, if any of the following are true:

- You are trading in a market with low liquidity or one that is controlled by a small number of market participants. I'm not an expert, but I think this would apply more to markets like penny stocks and less to big markets like forex for major currency pairs.
- You are not taking transaction costs into account, or not doing so properly.
- Your bot makes a low number of trades, making the results close or equivalent to lucky coin flips.
- Your bot is simply making trades that cannot be executed, or may be doing simulated trades of something that is not actually tradable. This applies to a large number of research papers that assume you can just buy and trade the S&P 500 itself. You can trade ETFs that are tied to an index, but an index is not a tradable instrument in and of itself. Once you realize this, a lot of papers seem very weird.
- You are not modelling other aspects of the trading process realistically, such as assuming the bot has infinite funds to trade, allowing it to take unlimited losses and continue trading when in reality you'd be hit with a margin call and your trading would be stopped.
- Your code is committing any number of data snooping errors, where the bot is asked to trade at time A (say the open of a trading session) but has access to future data (say the closing price of that day, future data that would not actually exist in a live environment).
- Depending on what you believe about how market conditions change over time, your bot may have worked in the past but would not work if used today, i.e., the market may have adapted to whatever edge your bot may have discovered.

There are probably lots more pitfalls I don't even know about, since I'm not an actual trader.

I'm not discouraging anyone from playing around or trying things, of course. I think it's great fun, which is why I do it.

Here's the good news: if you realize you don't actually have an edge and avoid risking your hard-earned money, you come out ahead of almost all people who ever trade.
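
To make the data snooping and transaction cost pitfalls concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the bars are made-up numbers, and `backtest`, `leaky_signal`, and `honest_signal` are hypothetical names, not part of any trading library.

```python
# Hypothetical daily bars as (open, close); the prices are invented.
BARS = [(100.0, 102.0), (102.0, 101.0), (101.0, 104.0),
        (104.0, 103.0), (103.0, 106.0)]


def backtest(bars, signal, cost_per_trade=0.0):
    """Buy one unit at the open and sell at the close whenever the signal fires.

    signal(history, today) returns True to trade; cost_per_trade is charged
    once per round trip. Returns (total profit, number of trades).
    """
    pnl = 0.0
    trades = 0
    for i, (open_, close) in enumerate(bars):
        if signal(bars[:i], bars[i]):
            pnl += close - open_ - cost_per_trade
            trades += 1
    return pnl, trades


def leaky_signal(history, today):
    # Deliberate bug: reads today's close, which is unknown at the open.
    open_, close = today
    return close > open_


def honest_signal(history, today):
    # Uses completed bars only: trade if yesterday closed above its open.
    if not history:
        return False
    prev_open, prev_close = history[-1]
    return prev_close > prev_open


print(backtest(BARS, leaky_signal))         # looks great, but cannot be traded
print(backtest(BARS, honest_signal))        # same data, no peeking
print(backtest(BARS, honest_signal, 0.5))   # and with a cost per trade
```

On this toy data the leaky rule wins every trade while the honest rule loses money, and costs make it worse. With so few trades neither number means much, which is the coin-flip point from the list above.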

simias over 4 years ago

I would add "build a toy regex engine" to the list.

A couple of years ago I implemented a toy regex engine from scratch (building NFAs, then turning them into DFAs). I thought it was an enlightening experience because it showed me that the core principles behind regular languages are fairly simple, although you could spend years optimizing and improving your implementation. How do you deal with Unicode? How do you modify your implementation to know how many characters you can skip if you don't have a match, in order to avoid testing every single position in a file?

It demystified the concept of a regex engine for me, while at the same time making me realize how impressive the advanced, ultra-optimized engines we use and take for granted are.
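
For readers who have not seen the NFA-to-DFA step, here is a rough sketch of the subset construction. It is not the commenter's implementation: the NFA for `a(b|c)*` is written by hand instead of being built by a regex parser, and the names `NFA`, `eps_closure`, `to_dfa`, and `matches` are made up for this example.

```python
from collections import deque

EPS = None  # label used for epsilon (empty) transitions

# Hand-built NFA for the regex a(b|c)* ; the state numbers are arbitrary.
NFA = {
    0: {"a": {1}},
    1: {EPS: {2, 4}},            # after "a": enter the loop or finish
    2: {"b": {3}, "c": {3}},
    3: {EPS: {2, 4}},            # go around the loop again or finish
    4: {},                       # accepting state
}
NFA_START, NFA_ACCEPT = 0, 4


def eps_closure(states):
    """All states reachable from `states` using only epsilon transitions."""
    stack, seen = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in NFA[s].get(EPS, ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return frozenset(seen)


def to_dfa():
    """Subset construction: each DFA state is a frozenset of NFA states."""
    start = eps_closure({NFA_START})
    table, queue = {}, deque([start])
    while queue:
        current = queue.popleft()
        if current in table:
            continue
        table[current] = {}
        symbols = {c for s in current for c in NFA[s] if c is not EPS}
        for c in symbols:
            moved = {t for s in current for t in NFA[s].get(c, ())}
            target = eps_closure(moved)
            table[current][c] = target
            queue.append(target)
    return start, table


def matches(text):
    """Run the DFA over the whole string: one table lookup per character."""
    state, table = to_dfa()
    for ch in text:
        state = table[state].get(ch)
        if state is None:        # no transition: the DFA rejects
            return False
    return NFA_ACCEPT in state


print(matches("abcb"), matches("ba"))
```

A real engine would build the DFA once and reuse it; this sketch rebuilds it on every call to stay short.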

anonytrary over 4 years ago

This article is aimed towards students. It's great advice for students who are in college, know very little, and want to improve their CS skills.

It's poor advice for someone who already has a STEM degree and wants to build something useful and profitable. If you already know how these things work, your time is better spent on the "edge of the circle": <a href="http://matt.might.net/articles/phd-school-in-pictures/" rel="nofollow">http://matt.might.net/articles/phd-school-in-pictures/</a> which applies to businesses and startups as well.

If you're in the latter group -- you've already got the skills to build real shit. Don't waste your time on homework problems. Find a problem you have and build a solution for it. Don't listen to people who tell you to work on homework problems that have already been solved; it's a complete waste of your time if you already know the fundamentals.

As for stock trading bots -- if you don't have a mathematics degree or equivalent (e.g. having incredible math skills), don't even bother. You won't be profitable, and you will learn nothing useful in the process, because you will approach the problem as a naive CS student would. Smarter people than you have made trading bots and have failed miserably. Without having an extremely strong foundation in mathematics, your trading bot will amount to nothing more than a futile exercise in gluing APIs together.

fergie over 4 years ago

I would strongly recommend building something that you yourself think is cool, and not feeling that you have to conform to what other people tell you to do.

jameskilton over 4 years ago

I would also recommend, if someone is interested in games, to do Tetris. It's a simple concept that is trickier than expected once you have to figure out the details of how it all comes together.

xamuel over 4 years ago

Here's an open-ended programming project which, in a certain formal sense, spans the entire range of all difficulty levels: write an "intuitive ordinal notation" for as large an ordinal number as you can.

What is an "intuitive ordinal notation"? Definition: The set of intuitive ordinal notations is the smallest set P of computer programs with the following property. For every computer program p, if, when p is run, all of p's outputs are elements of P, then p is in P.

So "End.", the program which immediately ends with no outputs, is vacuously in P (all of its outputs are in P, because it has no outputs). It notates the ordinal 0. Likewise, "Print(`End.')" is in P, because its sole output, "End.", is in P; it notates the ordinal 1. Likewise, "Print(`Print(`End.')')" is in P, notating the ordinal 2. And so on.

The above can be short-circuited: "Let X=`End.'; While(True){Print(X); X=`Print(\`'+X+`\')'}". This program outputs "End.", "Print(`End.')", "Print(`Print(`End.')')", and so on forever, all of which are in P, so this program itself is in P. It notates omega, the smallest infinite ordinal.

Here's a library of examples in Python, currently going up to a notation for the ordinal omega^omega: <a href="https://github.com/semitrivial/IONs" rel="nofollow">https://github.com/semitrivial/IONs</a>
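
As a small illustration (this is not taken from the linked library), the program texts above can be generated mechanically. The sketch below treats the toy language's programs as plain strings and ignores quote escaping; `finite_notation` and `omega_outputs` are names invented here.

```python
from itertools import islice


def finite_notation(n):
    """Program text notating the finite ordinal n in the toy language above."""
    program = "End."
    for _ in range(n):
        # Wrapping a notation in Print(...) notates the next ordinal.
        program = "Print(`" + program + "')"
    return program


def omega_outputs():
    """Yield what the looping program prints: notations for 0, 1, 2, ... forever."""
    x = "End."
    while True:
        yield x
        x = "Print(`" + x + "')"


for text in islice(omega_outputs(), 3):
    print(text)
```

The generator never terminates on its own, mirroring the `While(True)` loop; `islice` just takes the first few outputs.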

avl999 over 4 years ago

Building a distributed key-value store is a fun project and lets you learn tons of real-world stuff. It's a great excuse to survey and grok the design decisions required to build a distributed system, and it will truly help you understand why NoSQL DBs scale more easily than relational ones and the kind of tradeoffs they make to achieve that.

jackschultz over 4 years ago

Instead of a stock trading bot, go for daily fantasy sports contests. It can cover pretty much all parts of programming.

Web scraping to gather data, databases for storing it, ML for analyzing, front and backend web dev to show the daily information and adjust.

And instead of having to deal with trading regulations, contests can be really small and easy to enter. There are daily contests for 5 cents an entry, and you can enter 150 optimized lineups from an uploaded CSV for $7.50 a day. You can really learn a ton.

ashleyn over 4 years ago

Write a toy compiler for a BASIC-like language; you'll learn what your languages are actually doing.

bob1029 over 4 years ago

The database project is quite the rabbit hole if you start chasing performance. I have learned some amazing things about just how fast a 3-4 GHz CPU core actually is from this journey.

timvisee over 4 years ago

Fantastic list!

By the way, adventofcode.com is currently ongoing. Though the challenges are easy compared to the projects in this list, I highly recommend it. It covers problems you might face in big projects. With these small puzzles it's easy to experiment. It prepares you for bigger things.

userbinator over 4 years ago

IMHO a text-based browser isn't exactly in the "challenging" category, as it basically amounts to stripping all the HTML tags out and doing some very simple transformations (like replacing 's with newlines.) Then again, one of the things I've been working on intermittently for the past few years is a graphical (CSS2+) browser, which is definitely in the challenging category. There are some other public efforts too:

<a href="https://github.com/lexborisov/Modest" rel="nofollow">https://github.com/lexborisov/Modest</a>

<a href="https://github.com/litehtml" rel="nofollow">https://github.com/litehtml</a>

<a href="https://github.com/ArthurHub/HTML-Renderer" rel="nofollow">https://github.com/ArthurHub/HTML-Renderer</a>

Along the same lines, some other challenging projects I recommend are to write decoders/renderers for existing formats like MP3, MP4, PDF, etc.
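
The "strip the tags and do a few simple transformations" idea can be sketched with Python's standard `html.parser`. Which tags become newlines and which are skipped entirely are assumptions made for this example, and `TextOnly` and `render` are hypothetical names.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect text content, turning a few block-level tags into newlines."""

    BREAKING = {"br", "p", "div", "li", "h1", "h2", "h3"}  # assumed set
    SKIPPED = {"script", "style"}                           # contents dropped

    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1
        elif tag in self.BREAKING:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def render(html):
    """Return the visible text of `html`, one non-empty line per block."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts)
    # Collapse runs of whitespace inside each line and drop empty lines.
    lines = (" ".join(line.split()) for line in text.splitlines())
    return "\n".join(line for line in lines if line)


print(render("<h1>Title</h1><p>Hello <b>world</b><br>bye</p><script>x()</script>"))
```

Links, forms, tables, and character encodings are where a real text browser starts to take more work than this.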

SoSoRoCoCo over 4 years ago

I wrote a raytracer in 1996, and then a year later used Intel's VTune to speed it up. Just removing unused "return" statements gave me a 3x speed increase. Apparently Borland C/C++ wasn't very smart back then.

A fun project I did after that was writing an AI frame language to do goal-stack problem solving, specifically with path finding. I connected it to the ray tracer and made movies of spheres having wars. (I used an unlicensed DivX encoder to stitch together thousands of GIFs.)

djeiasbsbo over 4 years ago

For some simpler projects, I can only recommend doing some digital signal processing. For example, an audio signal is just a list of values, so you can do things like:

- Count the number of zero crossings
- Find out where they are
- Create any shape of wave by adding together multiple sine waves
- Hard clip the signal
- Stretch a signal and interpolate it with new samples
- Invert and revert a signal

For level 2, you can start processing "live":

- Create a sine synthesizer
- Create a small ring buffer of samples
- Find out how to output that audio (system audio, soundcard)
- Add MIDI support
- Add polyphony support

DSP gets hard once it has to be in real time and the latency has to be minimal. It's great exercise to mess around with it.
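
Several of the first-level exercises fit in a few lines each when the signal really is just a list of floats. This is a rough sketch with made-up function names and an assumed default sample rate of 8000 Hz, not a reference implementation.

```python
import math


def sine(freq_hz, n_samples, sample_rate=8000, amplitude=1.0):
    """One sine wave as a plain list of floats."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]


def mix(*signals):
    """Add signals sample by sample (building a wave out of sines)."""
    return [sum(samples) for samples in zip(*signals)]


def zero_crossings(signal):
    """Indices i where the sign differs between sample i-1 and sample i."""
    return [i for i in range(1, len(signal))
            if (signal[i - 1] < 0) != (signal[i] < 0)]


def hard_clip(signal, limit):
    """Clamp every sample into the range [-limit, limit]."""
    return [max(-limit, min(limit, s)) for s in signal]


def invert(signal):
    """Flip the polarity of every sample."""
    return [-s for s in signal]


def stretch(signal, factor):
    """Resample to `factor` times the length using linear interpolation."""
    out = []
    for i in range(int(len(signal) * factor)):
        pos = i / factor                      # position in the input signal
        lo = int(pos)
        hi = min(lo + 1, len(signal) - 1)     # hold the last sample at the end
        frac = pos - lo
        out.append(signal[lo] * (1 - frac) + signal[hi] * frac)
    return out


# A rough square-ish wave from the first three odd harmonics of 100 Hz.
wave = mix(sine(100, 800), sine(300, 800, amplitude=1 / 3),
           sine(500, 800, amplitude=1 / 5))
print(len(zero_crossings(hard_clip(wave, 0.5))))
```

The "live" level is a different kind of problem: the same arithmetic has to run inside an audio callback with a fixed time budget.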

0xbadcafebee over 4 years ago

> it is really simple to create the basic "database". You can start by using the dictionary data structure that comes with whatever programming language you're using and slap a web API on top of it.

Better yet: do it in C. There's no "dictionary" object type, so you have to make it yourself. You'll soon learn a whole bunch of fallacies about how those "dictionaries" actually work. After you've spent a good deal of time doing that, you can switch to authentication/authorization, logging, storage, tracing, API management, resource quotas, and a raft of distributed computing issues.

I recommend basing it on Consul; it has a better general model than etcd.
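
The comment suggests C. To give a feel for what "make it yourself" involves, here is the core of the idea sketched in Python without using the built-in `dict`. It is one possible design (open addressing with linear probing, no deletion); the class name `HashMap` and its methods are invented for this example.

```python
class HashMap:
    """Toy open-addressing hash map with linear probing (no deletion)."""

    def __init__(self, capacity=8):
        self.slots = [None] * capacity   # each slot is None or (key, value)
        self.count = 0

    def _probe(self, key):
        """Index of the slot holding `key`, or of the first empty slot."""
        i = hash(key) % len(self.slots)
        while self.slots[i] is not None and self.slots[i][0] != key:
            i = (i + 1) % len(self.slots)   # collision: try the next slot
        return i

    def put(self, key, value):
        # Grow before the table is more than half full so probes stay short
        # and an empty slot always exists for _probe to stop at.
        if (self.count + 1) * 2 > len(self.slots):
            self._resize(len(self.slots) * 2)
        i = self._probe(key)
        if self.slots[i] is None:
            self.count += 1
        self.slots[i] = (key, value)

    def get(self, key, default=None):
        slot = self.slots[self._probe(key)]
        return default if slot is None else slot[1]

    def _resize(self, capacity):
        # Every entry must be re-inserted: its slot depends on the capacity.
        old = [s for s in self.slots if s is not None]
        self.slots = [None] * capacity
        for key, value in old:
            self.slots[self._probe(key)] = (key, value)


m = HashMap()
for k in range(100):
    m.put(f"key{k}", k)
print(m.get("key42"), m.get("missing", -1))
```

In C the same structure also forces decisions this sketch hides: the hash function, key ownership, and memory management on resize. Deletion is left out because with linear probing it needs tombstones or re-insertion.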

Waterluvian over 4 years ago

Writing a Game Boy emulator has been the most fulfilling and interesting programming project in my life.

I love, most of all, how modular the project is. I can do an hour here or there and make meaningful progress.

I'm really eager to discover other very large programming projects that break down into sensible bites so well.

rdescartes over 4 years ago

I would recommend choosing a long enough period (e.g. 3 months) to contribute to an open source project you are using, especially if you are not familiar with that domain. I learnt a lot about modern compilers by contributing to rust-analyzer.

tracyhenry over 4 years ago

A great compilation: <a href="https://github.com/danistefanovic/build-your-own-x" rel="nofollow">https://github.com/danistefanovic/build-your-own-x</a>

rex64 over 4 years ago

I recently went through the process of creating a ray tracer project from zero for learning purposes. It was a humbling and eye-opening experience. I've written an article[0] to explain my process in detail if you're interested.

[0] <a href="https://alessandrocuzzocrea.com/how-i-made-a-ray-tracer/" rel="nofollow">https://alessandrocuzzocrea.com/how-i-made-a-ray-tracer/</a>

eatonphil over 4 years ago

I'd also recommend writing an emulator for real or fake (e.g. CHIP-8) hardware. It seems complicated but the core loop gets pretty simple. It ends up giving you a much better view of both assembly and pointer semantics (useful for better understanding C).
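
To show how small the core loop can be, here is a sketch of fetch, decode, and execute for a handful of CHIP-8 opcodes. It covers only five instructions, leaves out display, timers, keypad, and stack, and the class and attribute names are invented for this example.

```python
class Chip8:
    """Tiny subset of a CHIP-8 interpreter: registers, memory, five opcodes."""

    def __init__(self, program):
        self.memory = bytearray(4096)
        self.v = [0] * 16            # 8-bit registers V0..VF
        self.i = 0                   # index register
        self.pc = 0x200              # programs are loaded at 0x200
        self.memory[0x200:0x200 + len(program)] = program

    def step(self):
        # Fetch: every opcode is two bytes, big-endian.
        opcode = (self.memory[self.pc] << 8) | self.memory[self.pc + 1]
        self.pc += 2

        # Decode the fields the instructions below need.
        kind = opcode >> 12
        x = (opcode >> 8) & 0xF
        nn = opcode & 0xFF
        nnn = opcode & 0xFFF

        # Execute.
        if kind == 0x1:              # 1NNN: jump to address NNN
            self.pc = nnn
        elif kind == 0x3:            # 3XNN: skip next opcode if VX == NN
            if self.v[x] == nn:
                self.pc += 2
        elif kind == 0x6:            # 6XNN: VX = NN
            self.v[x] = nn
        elif kind == 0x7:            # 7XNN: VX += NN, wrapping at 8 bits
            self.v[x] = (self.v[x] + nn) & 0xFF
        elif kind == 0xA:            # ANNN: I = NNN
            self.i = nnn
        else:
            raise NotImplementedError(hex(opcode))


# Hand-assembled test program: count V0 from 5 up to 8, then set I.
PROGRAM = bytes([
    0x60, 0x05,   # 0x200: V0 = 5
    0x70, 0x01,   # 0x202: V0 += 1
    0x30, 0x08,   # 0x204: skip the jump once V0 == 8
    0x12, 0x02,   # 0x206: jump back to 0x202
    0xA1, 0x23,   # 0x208: I = 0x123
])

cpu = Chip8(PROGRAM)
for _ in range(10):
    cpu.step()
print(cpu.v[0], hex(cpu.i))
```

The pointer-like behaviour mentioned above shows up in `pc` and `i`: both are just integers used as indices into `memory`.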

mellosouls over 4 years ago

Related discussion from the original suggestions a year ago fwiw:

<a href="https://news.ycombinator.com/item?id=21790779" rel="nofollow">https://news.ycombinator.com/item?id=21790779</a>

forgotmypw17 over 4 years ago

I would add a basic feature-complete website which works on every mainstream browser starting with Mosaic.

It's much easier than it may seem, architecting it is interesting, and there is a lot of "last 10%" stuff which keeps it fun as long as you keep going.

In the demystifying area, it demystified HTML and JS history for me, forced me to work with a minimal toolkit, and taught me how to build "modern" JS features in ways which will not break browsers which don't know how to do them or have them disabled.

acutesoftware over 4 years ago

Writing even an extremely simple game without using a game engine or dedicated game library is quite an eye opener.

Make a small 2D platform game, and it covers so many areas (and it is a lot of fun!).

xwdv over 4 years ago

Being able to do challenging projects is a cold comfort when by far the most challenging project I’ve ever faced was trying to build something people would pay for.

arendtio over 4 years ago

What I am really missing is some kind of real-time AI. A decade ago, I coded a bot for a first-person shooter with RTS elements and learnt so much from it (while having a lot of fun).

It starts with basic things like waypoint systems vs. area awareness systems plus the relevant routing algorithms like A*, but goes on to organizing a group of players and finding good strategies. And all of that with a limited time budget and a changing environment around you. Last but not least, you want to emulate human behavior, which is probably the hardest part, as it includes changing your behavior according to your situation (don't run straight against a wall for 10 seconds) but also taking into account human weaknesses, e.g. humans can't aim perfectly.

Granted, what I did covers a huge field of challenges, but even with a 2D engine I think you can learn a lot from the experience.

fspear over 4 years ago

I would add an emulator to the list.

I've always struggled to figure out how these are built from scratch.

A long time ago I wanted to code a Neo Geo emulator but gave up before I even started; I didn't have a clue where to begin.

I am amazed at anyone that can code an emulator from scratch.

the_cat_kittles over 4 years ago

another one related to stock trading, but perhaps more interesting: build a simulator for a sport. both baseball and darts lend themselves to markov models, and are simple enough to simulate in some detail. with darts, you can get very close to as accurate as possible. baseball has more weird complications because of the rules. but it's fun to do, and to compare to old games to see how well your model does.

kunalpowar1203 over 4 years ago

Thanks for following up with this list after your previous one. I spent a great deal of my time (including some office time) on writing a Chip-8 emulator thanks to your previous list :D

whatever_dude over 4 years ago

My own favorite is an equation parser.

Before attempting to do so I thought it was implemented as a simple seek over the string, maybe a bunch of regex stuff. I guess it can be done that way, at the cost of growing complexity; but the proper solution (with a stack, etc) is so elegant (making it easy to add functions, operators, parentheses, variables, etc) that it really makes one appreciate the value of good, thoughtful engineering.
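
The comment does not say which stack-based algorithm it means. One common choice is a two-stack, shunting-yard style evaluator, sketched below under that assumption. It handles only numbers, `+ - * /`, and parentheses, assumes well-formed input, and has no unary minus; `OPS`, `tokenize`, and `evaluate` are names invented here.

```python
import operator
import re

# Operator table: precedence and the function to apply. Adding an operator
# means adding one row here.
OPS = {
    "+": (1, operator.add),
    "-": (1, operator.sub),
    "*": (2, operator.mul),
    "/": (2, operator.truediv),
}
TOKEN = re.compile(r"\s*(\d+\.?\d*|[-+*/()])")


def tokenize(text):
    """Split the input into number, operator, and parenthesis tokens."""
    text = text.rstrip()
    pos, tokens = 0, []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def evaluate(text):
    values, ops = [], []      # operand stack and operator stack

    def apply_top():
        op = ops.pop()
        right, left = values.pop(), values.pop()
        values.append(OPS[op][1](left, right))

    for tok in tokenize(text):
        if tok in OPS:
            # Left-associative: first apply pending operators of equal or
            # higher precedence, then push the new one.
            while ops and ops[-1] != "(" and OPS[ops[-1]][0] >= OPS[tok][0]:
                apply_top()
            ops.append(tok)
        elif tok == "(":
            ops.append(tok)
        elif tok == ")":
            while ops[-1] != "(":
                apply_top()
            ops.pop()         # discard the matching "("
        else:
            values.append(float(tok))
    while ops:
        apply_top()
    return values[0]


print(evaluate("(1 + 2) * 3 - 8 / 4"))
```

Precedence and associativity live in one table and one loop condition, which is what makes the approach easy to extend compared with ad hoc string scanning.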

mraza007 over 4 years ago

Interesting projects. I might try ray tracing in Python, as I’ve also been exploring CGI a lot lately.

Has anyone tried CGI? If so, how has your experience been so far?

trustfundbaby over 4 years ago

What do folks think about implementing a web crawler that you can send to a website to index every internal URL on the site? I remember sitting down to write one 100 years ago now, and finding it to be much trickier than I thought it would be.
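
Some of the trickiness shows up even in a toy version: relative links, `#fragment` duplicates, external hosts, and cycles. The sketch below crawls a made-up in-memory site instead of the network; `SITE`, `LinkCollector`, and `crawl` are hypothetical names, and `fetch` is whatever function you supply to return a page's HTML.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def crawl(start_url, fetch):
    """Breadth-first crawl; `fetch(url)` returns HTML text or None."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.hrefs:
            # Resolve relative links and drop #fragments so the same page
            # is not queued twice under different spellings.
            absolute = urldefrag(urljoin(url, href)).url
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return sorted(seen)


# Made-up site used in place of real HTTP requests.
SITE = {
    "http://example.test/": '<a href="/about">About</a> <a href="blog/">Blog</a>',
    "http://example.test/about": '<a href="/">Home</a> <a href="http://other.test/">Ext</a>',
    "http://example.test/blog/": '<a href="post1#comments">P1</a> <a href="../about">About</a>',
    "http://example.test/blog/post1": '<a href="/blog/">Back</a>',
}

for page in crawl("http://example.test/", SITE.get):
    print(page)
```

A real crawler adds the parts that make it hard: redirects, robots.txt, rate limiting, query-string variants of the same page, and links generated by JavaScript.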

cghendrix over 4 years ago

These look fun!

mjgs over 4 years ago

Awesome couple of articles.

person_of_color over 4 years ago

How about designing your own virtual memory system?

4778468d over 4 years ago

>> automate testing on historical data over long periods of time

I want to try this. Where can you get access to historical pricing data that includes pricing changes during the day, not just end-of-day prices?

known over 4 years ago

<a href="https://en.wikipedia.org/wiki/List_of_lists_of_lists" rel="nofollow">https://en.wikipedia.org/wiki/List_of_lists_of_lists</a> FTW