I implemented something very similar in C for a different exchange, using shared memory and a custom multi-producer/consumer ring buffer. It was NEAT.<p>I loved how the code turned out and how performant it was, and I would love to talk to people who have been through the same.