I think you can try following method.
First, pregenerate array of evenly distributed rays in full unit sphere, defRay[N].xyz.
Second, generate random rotation matrix M. For that you must generate true random point on 4D sphere (quaternion), it might be tricky but quite doable.
Third, spawn rays with directions dir*= normalize(N + M * defRay*), where N** is unit surface normal.
That way you have rays distributed in hemisphere with density proportional to cos(N\\^dir) which is ideal for diffuse surfaces. True random rotation on every ray hit ensures mathematic correctness and reduces banding.