Okay, so it's perhaps easier than directly using the CUDA (etc.) C toolchains, but why not compare to Python + Numba? It has been available with GPU support for quite a while, and it likewise avoids direct exposure to the underlying C toolchains, provides interactive compilation, can be used with a nice REPL (or a Jupyter Notebook), etc.