I'm about 95% sure that it's the call to glDisableClientState that is causing my problems. Let me give you a little background to explain why.
I'm implementing a library for haptic rendering (force feedback) of large-scale virtual environments. The library supports a number of 3D primitives which can be rendered in immediate mode. I got OK performance using all the usual tricks (culling, spatial partitioning, etc.), but of course I wasn't really making full use of the GPU to render large numbers of triangles efficiently.
I then implemented a vertex buffer class which takes an STL vector of triangles sorted according to their material property. I set up the vertex buffer object as per my example code (the code fragment I posted is actually from the render method of my vertex buffer class). Each of my objects in the environment tessellates itself into the STL vector, the vertex buffer class is constructed from this vector, and then its render method is called within the main rendering loop of the application.
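For reference, the setup side of the class boils down to something like this (the member names like m_vboId are just illustrative, I'm assuming a plain xyz-per-vertex triangle struct, and I'm calling the core GL 1.5 buffer functions rather than the ARB-suffixed ones):

    // Hypothetical layout: each triangle is just three xyz vertices.
    struct Vertex   { float x, y, z; };
    struct Triangle { Vertex v[3]; };

    void VertexBuffer::upload(const std::vector<Triangle>& tris)
    {
        glGenBuffers(1, &m_vboId);
        glBindBuffer(GL_ARRAY_BUFFER, m_vboId);
        glBufferData(GL_ARRAY_BUFFER,
                     tris.size() * sizeof(Triangle),
                     &tris[0],                // std::vector storage is contiguous
                     GL_STATIC_DRAW);
        glBindBuffer(GL_ARRAY_BUFFER, 0);
        m_triangleCount = tris.size();
    }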
In the render method for the vertex buffer I step through all the different material types contained in my vertex buffer and render all the triangles of a given type, switching the material properties as required. At first I thought the bottleneck might have been caused by switching material properties, but I eliminated that by rendering my whole model with a single material.
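The render method is roughly the following (again, names are only illustrative; m_ranges is meant to hold one (first vertex, vertex count, material) entry per material, recorded when the sorted triangle vector was uploaded):

    void VertexBuffer::render()
    {
        glBindBuffer(GL_ARRAY_BUFFER, m_vboId);
        glEnableClientState(GL_VERTEX_ARRAY);
        glVertexPointer(3, GL_FLOAT, 0, 0);   // positions start at offset 0 in the VBO

        for (size_t i = 0; i < m_ranges.size(); ++i)
        {
            applyMaterial(m_ranges[i].material);   // a few glMaterialfv calls
            glDrawArrays(GL_TRIANGLES, m_ranges[i].first, m_ranges[i].count);
        }

        glDisableClientState(GL_VERTEX_ARRAY);    // the call I suspect
        glBindBuffer(GL_ARRAY_BUFFER, 0);
    }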
I gradually made the vertex buffer simpler and simpler by commenting out various bits, but even with just GL_VERTEX_ARRAY enabled (and then disabled after glDrawArrays), I still get the same problem.
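In other words, the stripped-down path that still shows the slowdown is essentially just this:

    glBindBuffer(GL_ARRAY_BUFFER, m_vboId);
    glEnableClientState(GL_VERTEX_ARRAY);
    glVertexPointer(3, GL_FLOAT, 0, 0);
    glDrawArrays(GL_TRIANGLES, 0, m_triangleCount * 3);
    glDisableClientState(GL_VERTEX_ARRAY);   // <-- suspected culprit
    glBindBuffer(GL_ARRAY_BUFFER, 0);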
I am running on a Linux box and I'm beginning to wonder if there might be a bug in the NVIDIA driver for Linux. My next plan of attack is to get a simple C or C++ program that uses vertex buffer objects to render at least 10,000 triangles and try running it on my machine to see if it exhibits the same problem.
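Something along these lines is what I have in mind for the standalone test; it assumes the driver exports the GL 1.5 buffer entry points directly (hence the GL_GLEXT_PROTOTYPES define; otherwise they'd have to be loaded via GLEW or glXGetProcAddress):

    // g++ vbo_test.cpp -o vbo_test -lGL -lGLU -lglut
    #define GL_GLEXT_PROTOTYPES 1
    #include <GL/glut.h>
    #include <cstdlib>
    #include <vector>

    static GLuint vbo = 0;
    static const int kTriangles = 10000;

    static void display()
    {
        glClear(GL_COLOR_BUFFER_BIT);

        glBindBuffer(GL_ARRAY_BUFFER, vbo);
        glEnableClientState(GL_VERTEX_ARRAY);
        glVertexPointer(3, GL_FLOAT, 0, 0);
        glDrawArrays(GL_TRIANGLES, 0, kTriangles * 3);
        glDisableClientState(GL_VERTEX_ARRAY);   // the suspect call
        glBindBuffer(GL_ARRAY_BUFFER, 0);

        glutSwapBuffers();
    }

    int main(int argc, char** argv)
    {
        glutInit(&argc, argv);
        glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGB);
        glutCreateWindow("VBO test");

        // 10,000 random triangles inside the default [-1,1] view volume.
        std::vector<float> verts(kTriangles * 9);
        for (size_t i = 0; i < verts.size(); ++i)
            verts[i] = 2.0f * rand() / RAND_MAX - 1.0f;

        glGenBuffers(1, &vbo);
        glBindBuffer(GL_ARRAY_BUFFER, vbo);
        glBufferData(GL_ARRAY_BUFFER, verts.size() * sizeof(float),
                     &verts[0], GL_STATIC_DRAW);
        glBindBuffer(GL_ARRAY_BUFFER, 0);

        glutDisplayFunc(display);
        glutIdleFunc(glutPostRedisplay);   // redraw continuously so the frame rate is visible
        glutMainLoop();
        return 0;
    }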