I'm no graphics programmer. I'm mostly interested in the compute (OpenCL, CUDA) side of things. I don't own an NVidia GPU, so my experience mostly relates to AMD.<p>I'm certainly interested in AMD's Linux "ROCm" push. I really think the programming model there is relatively easy to understand, but there are major flaws in the documentation and implementation.<p>For example, OpenCL 1.2 on ROCm 2.0 isn't stable enough to run Blender Cycles. Yes, you can render the default cube, but very slowly. On a real scene, Blender Cycles on ROCm OpenCL can take 500+ seconds to compile its kernels, and the actual execution seems to hang or crash (infinite loop or segfault, depending on the scene) on anything close to typical geometry.<p>Note that Blender's OpenCL code is explicitly written for AMD's older OpenCL (the AMDGPU-PRO OpenCL implementation). Blender has a separate CUDA branch for NVidia cards. So ROCm OpenCL is, at the very least, performance-incompatible with AMDGPU-PRO OpenCL. The Blender OpenCL code probably has to be rewritten just to work on ROCm OpenCL (i.e., not loop forever), and maybe rewritten further to become efficient again.<p>--------<p>AMD's hardware is fine (not as power-efficient as NVidia's, but performance is fine, in theory). But the driver / software stack is clearly immature. Even though ROCm has hit a 2.0 release, these sorts of issues still exist.<p>AMDGPU-PRO with OpenCL 1.2 is workable, but feels old and cranky. (OpenCL 1.2 was specified in 2011 and is missing key features: its atomics model is incompatible with C11/C++11, and it's missing SVM and kernel-side enqueue, etc.)<p>AMDGPU-PRO OpenCL 2.0 is theoretically supported, but is still unstable in my experience. ROCm OpenCL (both 1.2 and 2.0) is still under development and doesn't seem to be ready for prime time yet. (At least, if Blender 2.79 or 2.80 Cycles is any indication.)<p>AMD HCC seems usable, but there aren't many programs using it.
AMD HIP is an interesting idea, but I haven't used it.<p>I know NVidia has its own driver and software issues. But CUDA code written 5 years ago will still have similar performance and behavior if run on today's cards, on today's software stack. I'm not sure the same is true for OpenCL code on AMD (between AMDGPU-PRO OpenCL 1.2 and ROCm OpenCL 1.2).<p>----------<p>Long story short: the only mature AMD OpenCL compute platform seems to be OpenCL 1.2 on AMDGPU-PRO. Fortunately, it also seems like AMDGPU-PRO will keep working for the foreseeable future, but AMD really needs to clarify its platform story to attract developers (e.g., prioritize testing of ROCm OpenCL to ensure performance-compatibility with existing OpenCL 1.2 code written for AMDGPU-PRO).