There are a couple of companies in the business of using 3D simulation to train autonomous cars, such as Waymo, Cruise, Nvidia, and Applied Intuition. I don't quite understand their product, though.<p>1. Are the object detectors trained in the simulation also applied to real-world data, or is only the decision-making part transferred to the real vehicle (e.g., "it's safe to turn left here") while detectors trained on real-world images of cars, people, etc. are used?<p>2. Tangentially, I thought that detectors trained on computer-generated images were generally not very applicable to real-world images; e.g., training on a bunch of images of 3D-modeled humans won't work well when testing on pictures of real humans. Is this not true?
I work on full-3D simulations (using a game engine) for an autonomous car company. I can't speak for every AV company, but in my experience, simulators are used far more for testing than training. The appeal of using a 3D game engine for simulation is that you can create synthetic inputs to the car's perception system. Without this ability, you're stuck either replaying recorded data, or spoofing out perception and only testing planning/controls and down. Those two approaches are actually extremely powerful, so the vast majority of AV simulation testing is not done in full 3D.<p>There are some situations where 3D simulation is useful, though. First, it allows you to run your AV software in its entirety (i.e., not spoofing perception), making for a very complete integration test. A 3D sim can capture complex, interesting occlusions that other sims cannot. Another fairly common use case is experimenting with new sensor setups before they're added to the car.<p>As for training, it's mostly research at this point. I think there's promise in using synthetic data to supplement real-world training data for perception systems.<p>There are a number of companies trying to market simulation 'platforms' to AV makers. I think there's the potential for one of these products to gain traction -- but it's a difficult sell. AVs are enormously complicated; a third-party product would need to both beat in-house sims and support a lot of very specific (and likely proprietary) AV features.
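To make "spoofing out perception" concrete, here's a minimal sketch of the idea: the simulator hands ground-truth objects straight to the planner instead of rendering sensor data and running real detectors. Every name here (Obstacle, perception_from_sim, plan) is a hypothetical illustration, not from any actual AV codebase:

```python
from dataclasses import dataclass

@dataclass
class Obstacle:
    # Position along the road (meters) and lateral offset; toy model only.
    x: float
    y: float
    speed: float

def perception_from_sim(ground_truth):
    """Spoofed perception: pass simulator ground truth through unchanged.

    A full-3D sim would instead render images/point clouds and run the
    real detectors to produce this obstacle list.
    """
    return list(ground_truth)

def plan(obstacles, ego_x):
    """Toy planner: brake if any obstacle is within 10 m ahead of ego."""
    return "brake" if any(0 < o.x - ego_x < 10 for o in obstacles) else "cruise"

# Planning/controls can be exercised without any rendering at all:
scene = [Obstacle(x=15.0, y=0.0, speed=0.0)]
print(plan(perception_from_sim(scene), ego_x=8.0))  # obstacle is 7 m ahead
```

The point is that everything downstream of perception gets tested at full fidelity, which is why this style of test covers so much ground without a 3D engine.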
Simulation-to-real training is an active area of research, and I'm not aware of it being used anywhere in production for critical systems. I have not seen anything better than the "Learning Dexterity"[1] paper that was published last year.<p>[1]: <a href="https://openai.com/blog/learning-dexterity/" rel="nofollow">https://openai.com/blog/learning-dexterity/</a>
Taking just the images subset of this, but it should apply to other types of data as well:<p>It doesn't have to be computer-generated images. It can also be computer-altered images (think rotation by n degrees, blurring, cropping, etc.), which should work pretty well, in part because real-world images are sometimes rotated, blurred, cropped, etc.
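A toy sketch of that kind of augmentation, using only the standard library on a tiny grayscale "image" (a list of pixel rows). A real pipeline would use a library like torchvision or albumentations; these hand-rolled functions are just to show the idea:

```python
def rotate90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def crop(img, top, left, height, width):
    """Crop a height x width window starting at (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]

def box_blur(img):
    """3x3 box blur with clamped edges -- a crude stand-in for real blurring."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                    for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            row.append(sum(vals) // 9)
        out.append(row)
    return out

def augment(img):
    """Yield altered copies of one image to enlarge a training set."""
    return [rotate90(img),
            crop(img, 0, 0, len(img) - 1, len(img[0]) - 1),
            box_blur(img)]

image = [[0, 50, 100],
         [50, 100, 150],
         [100, 150, 200]]
for variant in augment(image):
    print(variant)
```

Each altered copy keeps the original label, so one labeled real-world image becomes several training examples that better match the rotated/blurred/cropped images the detector will actually see.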