When people simulating real-time physics, it's almost always a bad idea to use semaphores like that. Instead, all pieces of the systems are normally simulated in lockstep. If the system has sufficient computations, a simulation step can be parallelized to use multiple CPU cores, but they are synchronized at every timestep of the simulation. Otherwise it will inevitably fail, there are way too many reasons for unfairness in the scheduler: other processes, hyperthreading, NUMA, unequal cooling/throttling, and more.