I'm using some dynamic vertex buffers, but only where needed.
This is (likely) not specific to my computer: my partner has the same problem on his computer with my engine.
I'm not sure what D3DPOOL is for. After some googling it seems to be a Direct3D9 thing, and I'm working with Direct3D10.
Yes, sometimes I have a lot of draw calls, and this is definitely an area that needs improving. But it is not the reason my engine runs five times slower than the unoptimised engine, since that engine also makes a lot of draw calls with smaller buffers.
For example, the unoptimised engine can draw about 1 million rects at 60+ fps, while my engine can draw only about 2000 rects at 60+ fps. The performance in 2D seems to be far worse still. However, for 2D I sort with my own algorithm rather than relying on the depth buffer, to allow partial transparency in 2D.
In 3D, my renderer is about 5 times slower than the unoptimised engine, but in 2D it is about 500 times slower.
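For illustration, here is a minimal sketch of the general idea behind sorting 2D geometry for partial transparency. It is not my engine's actual code: Sprite, FartherFirst, SortBackToFront and CountBatches are hypothetical names, and I'm assuming that a larger depth value means further from the viewer. It sorts back to front, then counts how many draw calls would remain if consecutive sprites sharing a texture were drawn together.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2D sprite: only the fields relevant to draw ordering.
struct Sprite {
    float depth;    // assumption: larger value = further from the viewer
    int textureId;  // sprites sharing a texture could share a draw call
};

// Ordering used for alpha blending: further sprites come first.
bool FartherFirst(const Sprite& a, const Sprite& b) {
    return a.depth > b.depth;
}

// Back-to-front sort. stable_sort keeps the submission order of sprites
// at equal depth, so overlapping sprites do not flicker between frames.
void SortBackToFront(std::vector<Sprite>& sprites) {
    std::stable_sort(sprites.begin(), sprites.end(), FartherFirst);
}

// Number of draw calls needed if each run of consecutive sprites with the
// same texture is merged into one call. Expects back-to-front input.
std::size_t CountBatches(const std::vector<Sprite>& sorted) {
    assert(std::is_sorted(sorted.begin(), sorted.end(), FartherFirst));
    std::size_t batches = 0;
    for (std::size_t i = 0; i < sorted.size(); ++i) {
        if (i == 0 || sorted[i].textureId != sorted[i - 1].textureId) {
            ++batches;
        }
    }
    return batches;
}
```

The point of counting batches is that the sort order for transparency also limits how far draw calls can be merged: sprites can only be grouped when they end up next to each other after sorting.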
I'm not updating a lot of shader constants. There is one thing, though: when compiling effects, I set this flag:
DWORD dwFXFlags = 0;
dwFXFlags |= D3D10_EFFECT_COMPILE_ALLOW_SLOW_OPS; //needed for setting samplerFilter
As the comment says, this is needed for setting the sampler filter. In my HLSL effects, the sampler state contains:
Filter = g_SamplerFilter;
AddressU = WRAP;
AddressV = WRAP;
MaxAnisotropy = 8;
If I don't do this, then I can't change the sampler filter at runtime (for example, in the menu the user could select a texture filtering mode). However, this doesn't seem to have a big impact on performance: I have tried without the flag as well, and I can't see any noticeable difference.
EDIT: another thing comes to mind.
My mesh class has the methods Activate() and Deactivate(). These determine whether a mesh that has already been added to the renderer is drawn or temporarily skipped. I remember my partner saying that he tried computing which rects are visible and which aren't, and activating/deactivating the meshes accordingly, which seemed to give quite a performance increase. This only helped when zooming in, though.
This seemed weird, because I thought that DirectX would determine which geometry is (not) visible through view frustum clipping and the rasterization stage. Is that not correct?
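A minimal sketch of that kind of CPU-side visibility test, assuming axis-aligned rects and a rectangular view area. Rect, Overlaps and CollectVisible are made-up names for illustration, not my engine's API, and I don't know exactly how his version was written.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical axis-aligned rect: (x, y) is one corner, w and h the extents.
struct Rect {
    float x, y, w, h;
};

// True if the two rects share any area (rects that only touch do not count).
bool Overlaps(const Rect& a, const Rect& b) {
    return a.x < b.x + b.w && b.x < a.x + a.w &&
           a.y < b.y + b.h && b.y < a.y + a.h;
}

// Returns the indices of all rects that intersect the view rect.
std::vector<std::size_t> CollectVisible(const std::vector<Rect>& rects,
                                        const Rect& view) {
    assert(view.w >= 0.0f && view.h >= 0.0f);
    std::vector<std::size_t> visible;
    for (std::size_t i = 0; i < rects.size(); ++i) {
        if (Overlaps(rects[i], view)) {
            visible.push_back(i);
        }
    }
    return visible;
}
```

The meshes at the returned indices would get Activate() and all the others Deactivate(). When zoomed in, the view rect is small, so most rects fail the test, which would fit the observation that it only helped in that case.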