I don't know anything about the Linux drivers, sorry.. everything I've read there says NVIDIA is better..
About the fragment programs:
Actually, that was one of the MAIN issues with 3DMark03: NVIDIA cheated by not using their 128-bit capability, dropping back to 16-bit floats in many cases and running in 64-bit mode!!
NVIDIA's fragment programs are ****ing slow in 128-bit mode, while they are quite fast in 64-bit mode. ATI's fragment programs are exactly what DX9 requires, namely 96-bit precision, and they run in that mode as fast as possible. (Those bit counts are per RGBA pixel: 4 x fp32 = 128 bit, 4 x fp16 = 64 bit, 4 x fp24 = 96 bit.)
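To make the two modes concrete: on NV3x the precision is chosen per register and per instruction in NV_fragment_program assembly. A minimal sketch from memory of the spec syntax (treat the exact program text as illustrative, not tested):

/* NV3x picks precision per instruction: Rn registers and *R opcodes are
 * fp32 (4 x 32 = "128 bit" per pixel), Hn registers and *H opcodes are
 * fp16 (4 x 16 = "64 bit" per pixel). */
static const char fp32_prog[] =
    "!!FP1.0\n"
    "MULR R0, f[TEX0], f[COL0];\n"   /* full precision: slow on NV3x */
    "MOVR o[COLR], R0;\n"
    "END\n";

static const char fp16_prog[] =
    "!!FP1.0\n"
    "MULH H0, f[TEX0], f[COL0];\n"   /* half precision: much faster */
    "MOVH o[COLR], H0;\n"
    "END\n";

Both get loaded the same way, e.g. glLoadProgramNV(GL_FRAGMENT_PROGRAM_NV, id, len, prog); the speed difference comes purely from which register/opcode precision the program uses.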
Because NVIDIA does not support a real 96-bit mode, they got very bad results in 3DMark and started to cheat all over the place. Their hardware is just way off any standard.
And I prefer hardware which has precision issues of 0.00152587890625% in floating-point calculations BUT CAN DO FLOATING POINT ABOUT EVERYTHING AND EVERYWHERE, over hardware that can do 32 bits per float but cannot store them in regular textures, cannot use them in the regular fixed-function pipeline, etc.
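For reference, that oddly specific percentage is (I assume) just the step size of ATI's fp24 format: 1 sign bit, 7 exponent bits, 16 mantissa bits, so with the implicit leading 1 two adjacent values differ by one part in 2^16. A two-line check in C:

#include <stdio.h>

int main(void)
{
    double step = 1.0 / 65536.0;          /* 2^-16, one fp24 mantissa step */
    printf("%.14f%%\n", step * 100.0);    /* prints 0.00152587890625% */
    return 0;
}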
If you want to use floating point on NV hardware, you currently have to rely entirely on NV extensions.
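Roughly what that means in code: the NV path needs the rectangle target and NV-specific formats, while the ATI path is just a normal texture (token names from the NV_float_buffer / NV_texture_rectangle and ATI_texture_float specs as I remember them, so double-check the enums):

#include <GL/gl.h>
#include <GL/glext.h>   /* vendor extension tokens */

/* NV3x: float textures only exist via NV extensions and only on the
 * rectangle target -- no mipmaps, no repeat wrap, unnormalized coords. */
void upload_float_texture_nv(GLuint tex, int w, int h, const float *data)
{
    glBindTexture(GL_TEXTURE_RECTANGLE_NV, tex);
    glTexImage2D(GL_TEXTURE_RECTANGLE_NV, 0, GL_FLOAT_RGBA32_NV,
                 w, h, 0, GL_RGBA, GL_FLOAT, data);
}

/* R300: a float texture is just a regular 2D texture with a float
 * internal format. */
void upload_float_texture_ati(GLuint tex, int w, int h, const float *data)
{
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA_FLOAT32_ATI,
                 w, h, 0, GL_RGBA, GL_FLOAT, data);
}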
And while the NV hardware has the full floating-point precision, it does NOT meet the full floating-point requirements!! Their floats don't actually follow the IEEE standard in any way! So you still can't really use them for scientific calculations; the results are not exactly defined.
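To show what "exactly defined" means: IEEE 754 pins down all of the following bit-for-bit on any compliant CPU, and as far as I know NV3x shader arithmetic guarantees none of it:

#include <stdio.h>

int main(void)
{
    volatile float zero = 0.0f;   /* volatile keeps the compiler from folding */

    printf("1/0  = %f\n", 1.0f / zero);   /* +inf, exactly defined */
    printf("0/0  = %f\n", zero / zero);   /* NaN, exactly defined */
    printf("tiny = %g\n", 1.4e-45f);      /* denormals: gradual underflow */

    /* every +,-,*,/ must return the correctly rounded result, so this sum
     * is reproducible bit-for-bit across IEEE machines: */
    printf("sum  = %.9g\n", 0.1f + 0.2f);
    return 0;
}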