This paper isn't new - it appeared in GPU Gems 3, which came out in 2007 - but I came across it again today and thought it was worth re-sharing.
The basic problem they're trying to solve is getting cubemap reflections to match an arbitrary BRDF. When you're rendering a point light you can directly evaluate any BRDF equation you like - it's just a matter of writing the code - but it's different when you have a cubemap, since then light is coming in from many directions, not just one. To be physically correct, you need to integrate the incoming light against the BRDF over all directions, which in effect means convolving the cubemap with the BRDF.
Currently, people often use a cubemap that's pre-blurred according to a Phong distribution (e.g. using CubeMapGen), with a different specular power (roughness) in each mip level. Then at runtime you can sample it with trilinear filtering, setting the mip level based on your material's specular power. But strictly speaking, this pre-blur approach is only correct for Phong and not for any other BRDF. We often just use the Phong cubemap anyway, as it's a pretty good approximation. But we can do better.
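To make the "set the mip level from the specular power" step concrete, here's a minimal sketch. It assumes a made-up convention in which mip 0 is blurred with `max_power` and each following mip multiplies the power by `power_drop`. Those names and default values are hypothetical and not taken from CubeMapGen or any particular engine, so you'd need to match them to however your cubemap was actually generated.

```python
import math

def mip_for_specular_power(power, max_power=2048.0, power_drop=0.25, num_mips=9):
    """Fractional mip level whose pre-blur matches `power`.

    Assumed (hypothetical) convention: mip m was blurred with a Phong
    power of max_power * power_drop**m.
    """
    # Clamp to the range of powers the mip chain can represent.
    power = min(max(power, 1.0), max_power)
    # Solve max_power * power_drop**mip == power for mip.
    mip = math.log(power / max_power) / math.log(power_drop)
    # Never index past the last mip; trilinear filtering handles the fraction.
    return min(mip, num_mips - 1)
```

The fractional result is what you'd pass as the explicit LOD when sampling the cubemap, so trilinear filtering blends between the two nearest pre-blurred levels.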
In this paper, the authors show how to approximate the true specular reflection for any BRDF, using Monte Carlo importance sampling in the pixel shader. Basically, you take some random samples from the cubemap, generated from a distribution that matches the BRDF shape, and average them together. It takes a lot of samples to make it look good, though. To reduce the number of samples and improve performance, you can use a mipmapped cubemap, and the authors show how to calculate the right mip level for each sample so that the samples blend together nicely and don't produce visible noise. They assumed a cubemap with standard mipmaps, but it should work even better with a pre-blurred Phong cubemap, with some tweaks to the formulas.
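Here's a rough CPU-side sketch of the idea in Python, just to show the moving parts. It assumes a Phong lobe around the reflection vector as the BRDF shape, and it picks each sample's mip from the ratio between the solid angle the sample "covers" (1 / (N * pdf)) and the average solid angle of a texel. That's a commonly used way to write the mip selection, not necessarily the exact expression from the paper. `lookup` is a hypothetical stand-in for a trilinear cubemap fetch at an explicit LOD, and the cosine term and horizon clipping are ignored to keep it short.

```python
import math
import random

def sample_phong_lobe(power, u1, u2):
    """Direction (lobe axis = +Z) drawn from a cos^power lobe, plus its pdf."""
    cos_theta = u1 ** (1.0 / (power + 1.0))
    sin_theta = math.sqrt(max(0.0, 1.0 - cos_theta * cos_theta))
    phi = 2.0 * math.pi * u2
    local_dir = (sin_theta * math.cos(phi), sin_theta * math.sin(phi), cos_theta)
    pdf = (power + 1.0) / (2.0 * math.pi) * cos_theta ** power
    return local_dir, pdf

def sample_mip_level(pdf, num_samples, face_size):
    """Mip level at which one texel covers roughly one sample's solid angle."""
    omega_s = 1.0 / (num_samples * max(pdf, 1e-12))           # per-sample solid angle
    omega_p = 4.0 * math.pi / (6.0 * face_size * face_size)   # average texel solid angle
    return max(0.5 * math.log2(omega_s / omega_p), 0.0)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def to_world(local_dir, axis):
    """Rotate a +Z-relative direction so that +Z lines up with `axis` (unit length)."""
    up = (0.0, 0.0, 1.0) if abs(axis[2]) < 0.999 else (1.0, 0.0, 0.0)
    tangent = normalize(cross(up, axis))
    bitangent = cross(axis, tangent)
    return tuple(local_dir[0] * tangent[i] + local_dir[1] * bitangent[i]
                 + local_dir[2] * axis[i] for i in range(3))

def estimate_specular(lookup, reflect_dir, power, num_samples, face_size, rng=random):
    """Average `num_samples` filtered lookups distributed like the Phong lobe.

    `lookup(direction, lod)` is a hypothetical cubemap fetch supplied by the caller.
    """
    axis = normalize(reflect_dir)
    total = 0.0
    for _ in range(num_samples):
        # 1 - random() lies in (0, 1], which avoids a zero pdf.
        local_dir, pdf = sample_phong_lobe(power, 1.0 - rng.random(), rng.random())
        lod = sample_mip_level(pdf, num_samples, face_size)
        total += lookup(to_world(local_dir, axis), lod)
    return total / num_samples
```

Note how the mip level drops as either the sample count or the pdf goes up: dense samples near the lobe center read sharp texels, while the sparse samples out in the tail read blurrier ones, which is what hides the noise. With a pre-blurred Phong cubemap, `sample_mip_level` is the part that would need different math.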
I think it would also be interesting to think about explicit sample placement for small numbers of samples. Rather than just placing samples randomly - or even using a low-discrepancy sequence - if you're going to take fewer than 10 samples, it should be better to use some kind of Poisson-disk-like pattern in angular space. Maybe the sample pattern can even be optimized for a specific BRDF, to get good results with even fewer samples.
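As one possible reading of that idea, here's a small sketch that precomputes a well-spaced pattern using best-candidate sampling: each new sample is the candidate, drawn from the Phong lobe, that is farthest in angle from the samples already chosen. This is my own illustration, not something from the paper. The function names and the `candidates_per_sample` parameter are made up, and because it favors spacing over the exact lobe distribution, the resulting samples only approximately follow the pdf.

```python
import math
import random

def phong_lobe_direction(power, u1, u2):
    """Direction (lobe axis = +Z) drawn from a cos^power lobe."""
    cos_theta = u1 ** (1.0 / (power + 1.0))
    sin_theta = math.sqrt(max(0.0, 1.0 - cos_theta * cos_theta))
    phi = 2.0 * math.pi * u2
    return (sin_theta * math.cos(phi), sin_theta * math.sin(phi), cos_theta)

def angle_between(a, b):
    """Angle in radians between two unit vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return math.acos(max(-1.0, min(1.0, dot)))

def best_candidate_pattern(power, num_samples, candidates_per_sample=32, seed=1):
    """Build a fixed, Poisson-disk-like set of directions inside the lobe."""
    rng = random.Random(seed)
    pattern = []
    for _ in range(num_samples):
        best, best_dist = None, -1.0
        for _ in range(candidates_per_sample):
            cand = phong_lobe_direction(power, 1.0 - rng.random(), rng.random())
            # Distance to the nearest already-chosen sample (infinite if none yet).
            dist = min((angle_between(cand, p) for p in pattern), default=math.inf)
            if dist > best_dist:
                best, best_dist = cand, dist
        pattern.append(best)
    return pattern
```

A pattern like this would be generated offline and baked into the shader as constants, perhaps with a per-pixel rotation about the lobe axis to break up banding.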
I have the feeling that in 2007 this technique was too heavyweight to really be useful for games. But now that it's 2013, perhaps its time has come! Newer GPUs and the next-gen consoles should have enough horsepower to do a few cubemap samples per pixel.