I have often wondered whether you could put the needed database indexes and other associated data onto a GPU, have the GPU optimize and run the query, and then just return to the database server the list of blocks to read to get the data. The indexes would still be synced to disk, of course, but they would be run from the GPU.
Radix sort is my favorite sorting algorithm. Here, I made a little radix sort video with sound: <a href="http://rasterburn.org/~sgt/stuff3/radixsort.avi" rel="nofollow">http://rasterburn.org/~sgt/stuff3/radixsort.avi</a>
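For anyone who hasn't seen it, here is a minimal CPU-side sketch of a least-significant-digit radix sort in Python. It assumes non-negative integer keys, and the names `radix_sort` and `bits_per_pass` are made up for illustration. It is not the code behind the video or any GPU implementation, but the three steps per pass (histogram, prefix sum, stable scatter) are roughly what GPU versions parallelize.

```python
def radix_sort(keys, bits_per_pass=8):
    """LSD radix sort for non-negative integers; returns a new sorted list."""
    if not keys:
        return []
    radix = 1 << bits_per_pass          # number of buckets per pass
    mask = radix - 1                    # extracts one digit from a key
    max_key = max(keys)
    out = list(keys)
    shift = 0
    # One pass per digit, starting from the least significant one.
    while (max_key >> shift) > 0:
        # 1. Histogram: count how many keys have each digit value.
        counts = [0] * radix
        for k in out:
            counts[(k >> shift) & mask] += 1
        # 2. Exclusive prefix sum: counts[d] becomes the first output
        #    position for keys whose current digit is d.
        total = 0
        for d in range(radix):
            counts[d], total = total, total + counts[d]
        # 3. Stable scatter: keys with equal digits keep their relative
        #    order, which is what makes the multi-pass scheme correct.
        nxt = [0] * len(out)
        for k in out:
            d = (k >> shift) & mask
            nxt[counts[d]] = k
            counts[d] += 1
        out = nxt
        shift += bits_per_pass
    return out
```

Calling `radix_sort([170, 45, 75, 90, 802, 24, 2, 66])` gives the same result as Python's built-in `sorted` on that list.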
There's an ongoing thread in the CUDA forums about it: <a href="http://forums.nvidia.com/index.php?showtopic=175238" rel="nofollow">http://forums.nvidia.com/index.php?showtopic=175238</a>