Welcome to yet another fun thread!
I played around with CUDA a while back, but I wasn't all that hyped on it. The syntax was shockingly similar to programming with shaders, so I dropped it, figuring it was a tool for non-graphics programmers to make use of the GPU. I recently decided to give GPGPU another try with OpenCL. I couldn't find any decent demos out there, so I was losing interest pretty fast. I decided to go ahead and implement my own demo to see what all the fuss was about.
The first problem was learning the API. It's not like other popular APIs where you just type a single word into Google and you are bathed in helpful websites. Since it's a relatively new and underrated tech, you actually have to do your homework. Luckily I found a site that gives a decent intro to the OpenCL API. It's not perfect: it's missing some important cleanup routines and doesn't cover performance issues. I had to read through the OpenCL spec to learn about those (note: not fun).
So now that I had a wrapper for OpenCL, I went ahead with my first demo. I was losing patience fast, so I built something I knew was computationally insane and also easy to implement. Ahem... the Universe
With 32768 stars, the demo runs at about 10 fps on my ATI Radeon HD 5750, which works out to roughly 193 GFLOPS. Simulating the same scene on all 6 cores of my AMD Thuban takes 63 seconds per frame.
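A quick sanity check on those numbers, sketched in C. Two assumptions here that the post doesn't state: that every star pair is evaluated each frame, and that one star-star interaction costs about 18 floating-point operations (picked because it reproduces the quoted figure). Under those assumptions, 32768 x 32768 x 18 x 10 fps comes out at about 193 GFLOPS, and 63 seconds per frame against 10 frames per second is roughly a 630x gap. The function names are made up for illustration.

```c
#include <stddef.h>

/* Estimated sustained GFLOPS for an all-pairs simulation.
   flops_per_pair is an ASSUMED cost of one star-star interaction. */
double allpairs_gflops(size_t n_stars, double flops_per_pair, double fps)
{
    double pairs_per_frame = (double)n_stars * (double)n_stars;
    return pairs_per_frame * flops_per_pair * fps / 1e9;
}

/* How many times faster the fast run is, given the slow run's
   seconds per frame and the fast run's frames per second. */
double speedup(double slow_seconds_per_frame, double fast_fps)
{
    return slow_seconds_per_frame * fast_fps;
}
```

Calling `allpairs_gflops(32768, 18.0, 10.0)` gives about 193.27, and `speedup(63.0, 10.0)` gives 630.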
Overall I'm pleased with the experience. Runtime kernel compilation makes developing and debugging kernels a pleasure. Once you get to know the API, the rest pretty much follows. From what I know, PhysX is a good example of GPU compute that some games out there already take advantage of, but I look forward to seeing some neat uses such as real-time ocean (water) dynamics and real-time smoke effects (see the Blender 3D smoke emitter). I would not be surprised if you could make a great game out of either of those two topics alone.