There was a talk at the 2011 LLVM dev meeting (cached slides here: <a href="http://webcache.googleusercontent.com/search?q=cache:http://llvm.org/devmtg/2011-11/Hines_AndroidRenderscript.pdf" rel="nofollow">http://webcache.googleusercontent.com/search?q=cache:http://...</a>; llvm.org is down today) about Renderscript's design philosophy and LLVM-based compilers.<p>In short, it's not an accident or incompetence that aspects of current desktop GPU execution models (e.g., thread blocks, scratchpad shared memory) are not exposed in Renderscript. It's a conscious decision to make sure you can get decent performance not only on those GPUs, but also on ARMv5-v8 CPUs (with and without SIMD instructions), x86, DSPs, etc. Getting good performance on these platforms from a language that does expose these constructs (e.g., CUDA) is still an open research problem (see MCUDA <a href="http://impact.crhc.illinois.edu/mcuda.aspx" rel="nofollow">http://impact.crhc.illinois.edu/mcuda.aspx</a> and friends).<p>Renderscript aims for decent performance on a huge variety of platforms, but even if its designers only cared about mobile GPUs, the major contenders (Imagination, ARM, Samsung, Qualcomm, NVIDIA) have wildly different architectures, and a language that is close to the metal on one would present a huge impedance mismatch on the others. Note that things are sufficiently different from desktop GPU design that we're just now seeing SoCs come out that support OpenCL (in hardware; driver support seems to be lagging), and you can't run CUDA on Tegra 4.
It does look like mobile GPU vendors are about to start offering OpenCL support. For example, ARM submitted OpenCL 1.1 Full Profile conformance test results for the Mali-T604 last year (<a href="http://blogs.arm.com/multimedia/775-opencl-with-arm-mali-gpu-computingwith-no-compromises/" rel="nofollow">http://blogs.arm.com/multimedia/775-opencl-with-arm-mali-gpu...</a>), and Imagination Technologies showed mobile OpenCL demos last year at CES (<a href="http://www.youtube.com/watch?v=sDrz-w1jzEU" rel="nofollow">http://www.youtube.com/watch?v=sDrz-w1jzEU</a>).<p>It's easy to see why OpenCL hasn't rolled out fully on mobile GPUs yet: writing and debugging a full OpenCL software stack is very expensive and time-consuming, and there's still not that much real programmer demand for OpenCL on mobile.<p>As for Renderscript, it's always sounded like a bit of "not invented here" syndrome on Google's part -- we've already got CUDA and OpenCL, and RS doesn't really bring much new to the table. They've already deprecated the 3D graphics part of Renderscript in Android 4.1, so perhaps they'll do the same to Renderscript Compute soon.
I suspect that as soon as Apple exposes OpenCL in any way on iOS, Android will shortly follow. Likewise, if Mozilla exposes WebCL in Firefox, Chrome will shortly follow. What I don't expect is for them to take the lead in doing so.<p>Say what you want about OpenCL/CUDA, but what other language smoothly subsumes SIMD, multi-threading, and multi-core awareness? I expected it to be available on smartphones by now. What's taking so long?
If someone with that level of experience can find so many flaws so quickly, why aren't people with that level of domain knowledge brought in when the API is originally being developed? Or, if they are, why isn't there released documentation on why the API isn't as good as they wish it could be?
I think that Renderscript is not meant as a replacement for native C++ code. Rather, it's a platform-independent and easy way to give a programmer more performance (beyond Java).
I guess that if you need real performance or more control, you'll have to go the NDK route anyway. But if you just want to write another Instagram clone, then Renderscript is the way to go.