xcrypt at February 27th, 2012 18:49 — #1
In a book I'm reading I learned that we use constant buffers in DirectX to minimise the number of API calls needed for state changes, which leads to better performance.
However, the book then uses them like this:
The C++ application code typically needs to communicate with the effect; in particular, the C++ application usually needs to update variables in the constant buffers. For example, suppose that we had the following constant buffer defined in an effect file:
Through the ID3D10Effect interface, we can obtain pointers to the variables in the constant buffer:
fxWVPVar = mFX->GetVariableByName("gWVP")->AsMatrix();
fxColorVar = mFX->GetVariableByName("gColor")->AsVector();
fxSizeVar = mFX->GetVariableByName("gSize")->AsScalar();
fxIndexVar = mFX->GetVariableByName("gIndex")->AsScalar();
fxOptionOnVar = mFX->GetVariableByName("gOptionOn")->AsScalar();
The ID3D10Effect::GetVariableByName method returns a pointer of type ID3D10EffectVariable. This is a generic effect variable type; to obtain a pointer to the specialized type (e.g., matrix, vector, scalar), you must use the appropriate As***** method (e.g., AsMatrix, AsVector, AsScalar).
Once we have pointers to the variables, we can update them through the C++ interface. Here are some examples:
fxWVPVar->SetMatrix( (float*)&M ); // assume M is of type D3DXMATRIX
fxColorVar->SetFloatVector( (float*)&v ); // assume v is of type
fxSizeVar->SetFloat( 5.0f );
fxIndexVar->SetInt( 77 );
fxOptionOnVar->SetBool( true );
But now they make a separate call for each variable. Doesn't this defeat the purpose of constant buffers?
If it doesn't, why not?
If it does, how do I set the values correctly? (I assume this has something to do with "ID3D10EffectConstantBuffer::SetConstantBuffer()", but it seems to be poorly documented.)
reedbeta at February 28th, 2012 00:54 — #2
I think it's less to do with the number of C++ function calls and more to do with the amount of stuff in the GPU command buffer. On older architectures, each vertex or fragment parameter you set resulted in a separate command in the buffer, and (I guess) that could be costly in some cases. With hardware constant buffers, ultimately you're just assembling your data in a struct, then using one command to pass a pointer to that struct to the GPU.
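You can do all this yourself: make a struct in C++ that exactly matches the layout of the constant buffer as declared in the shader code (including HLSL's 16-byte packing rules), and populate it with your data. Create an ID3D10Buffer object for each constant buffer (with D3D10_USAGE_DYNAMIC and CPU write access), map it with D3D10_MAP_WRITE_DISCARD, and copy your struct into it each frame (or whenever it needs to be updated). Then point the GPU at that buffer object before a draw call: ID3D10EffectConstantBuffer::SetConstantBuffer() if you are still going through the effect system, or ID3D10Device::VSSetConstantBuffers()/PSSetConstantBuffers() if you are not.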
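A minimal sketch of the "do it yourself" route, written in portable C++ so it compiles without the DirectX SDK. The names PerObjectCB and uploadConstants are hypothetical, and the member types are assumptions inferred from the As* accessors in the question (matrix, vector, scalar), since the book's cbuffer declaration isn't quoted in this thread. The byte vector stands in for the pointer you would get back from ID3D10Buffer::Map.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical CPU-side mirror of the constant buffer. HLSL packs constants
// into 16-byte registers; int and bool scalars each occupy 4 bytes.
struct PerObjectCB {
    float        wvp[16];   // gWVP, assumed float4x4 (64 bytes)
    float        color[4];  // gColor, assumed float4 (16 bytes)
    float        size;      // gSize, assumed float
    std::int32_t index;     // gIndex, assumed int
    std::int32_t optionOn;  // gOptionOn, assumed bool (4 bytes in a cbuffer)
    std::int32_t pad;       // explicit padding up to a multiple of 16 bytes
};

// Compile-time checks that the C++ layout matches the assumed HLSL layout.
static_assert(sizeof(PerObjectCB) == 96, "size must be a multiple of 16");
static_assert(offsetof(PerObjectCB, color) == 64, "gColor offset");
static_assert(offsetof(PerObjectCB, size) == 80, "gSize offset");
static_assert(offsetof(PerObjectCB, index) == 84, "gIndex offset");
static_assert(offsetof(PerObjectCB, optionOn) == 88, "gOptionOn offset");

// Stand-in for the Map / memcpy / Unmap sequence: with real D3D10 the
// destination would be the pointer returned by
// buffer->Map(D3D10_MAP_WRITE_DISCARD, 0, &ptr), followed by buffer->Unmap().
inline std::size_t uploadConstants(const PerObjectCB& src,
                                   std::vector<unsigned char>& mapped) {
    mapped.resize(sizeof(PerObjectCB));
    std::memcpy(mapped.data(), &src, sizeof(PerObjectCB));
    return mapped.size();  // bytes written in this single copy
}
```

The point of the static_asserts is that a mismatch between the C++ struct and the shader's layout fails at compile time rather than showing up as garbage constants on the GPU.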
Most of these mechanics are done for you by the effect system, though. When you use the ID3D10EffectBlahVariable objects, it's internally keeping track of the constant buffers, and when you do a SetFloat(), SetInt(), etc., probably all it's doing is storing the value into a struct member, so those would be pretty lightweight calls. At some point, presumably when you do a draw call, it caches off a copy of the current constant buffer and points the GPU to that. At least that's my guess for what it's doing under the hood.
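To make that guess concrete, here is a small sketch of the pattern: setters only write into a CPU-side shadow copy, and a single upload happens before the draw. This is not the effect framework's actual implementation; ShadowConstantBuffer and all of its members are hypothetical names, and the "GPU copy" is just another byte vector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical model of a cbuffer as an effect framework might track it.
class ShadowConstantBuffer {
public:
    explicit ShadowConstantBuffer(std::size_t bytes)
        : cpuCopy_(bytes, 0), gpuCopy_(bytes, 0) {}

    // Analogous in spirit to SetFloat()/SetInt(): a plain memory write.
    void setFloat(std::size_t offset, float v) { write(offset, &v, sizeof v); }
    void setInt(std::size_t offset, int v) { write(offset, &v, sizeof v); }

    // Called once before a draw: at most one upload, and only if dirty.
    void flushBeforeDraw() {
        if (!dirty_) return;
        gpuCopy_ = cpuCopy_;  // stands in for the single Map/copy/Unmap
        ++uploads_;
        dirty_ = false;
    }

    int uploads() const { return uploads_; }

    // Read a float back from the simulated GPU copy.
    float gpuFloat(std::size_t offset) const {
        float v = 0.0f;
        std::memcpy(&v, gpuCopy_.data() + offset, sizeof v);
        return v;
    }

private:
    void write(std::size_t offset, const void* src, std::size_t n) {
        assert(offset + n <= cpuCopy_.size());
        std::memcpy(cpuCopy_.data() + offset, src, n);
        dirty_ = true;
    }

    std::vector<unsigned char> cpuCopy_;
    std::vector<unsigned char> gpuCopy_;
    bool dirty_ = false;
    int uploads_ = 0;
};
```

Under this model, five Set*() calls followed by one draw cost five cheap memory writes and one buffer upload, which is why separate setter calls don't by themselves defeat the purpose of constant buffers.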
Doing it yourself might be a bit faster, but setting constants isn't likely to be a bottleneck for a typical app, at least I wouldn't think so. So it's probably not worth worrying about.
jarkkol at March 4th, 2012 19:26 — #3
To use constant buffers efficiently, you need to organize your data according to its update frequency and cache the cbuffers. Essentially, you want to minimize the number of cbuffer updates in order to minimize the amount of data that needs to be transferred from CPU to GPU. In your example you have the combined world-view-projection matrix (gWVP) in a per-object constant buffer, but this isn't a good idea, because every object's cbuffer then has to be updated each frame the camera changes. It's better to hold only the local->world matrix in the per-object cbuffer, so that the cbuffer needs to be updated only when the object moves. Then keep the world->projection matrix in a separate cbuffer and do the combination in the shader instead: rather than multiplying the matrices there, transform the vector to world space with the per-object matrix first, and then from world to projection space with the per-view matrix.
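As a quick sanity check of that split, the sketch below shows on the CPU that transforming a point by the per-object matrix and then by the per-view matrix gives the same result as transforming it by the combined matrix. It uses the row-vector convention that D3DX uses; Vec4, Mat4 and the helper functions are hypothetical names for this illustration, and the "viewProj" in the test is just a uniform scale, not a real camera matrix.

```cpp
#include <cassert>
#include <cmath>

struct Vec4 { float v[4]; };
struct Mat4 { float m[4][4]; };

// Row vector times matrix: r = p * a.
inline Vec4 mul(const Vec4& p, const Mat4& a) {
    Vec4 r{};
    for (int c = 0; c < 4; ++c)
        for (int k = 0; k < 4; ++k)
            r.v[c] += p.v[k] * a.m[k][c];
    return r;
}

// Matrix product: r = a * b.
inline Mat4 mul(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}

// Translation matrix for row vectors (offset stored in the last row).
inline Mat4 translation(float x, float y, float z) {
    Mat4 t{};
    for (int i = 0; i < 4; ++i) t.m[i][i] = 1.0f;
    t.m[3][0] = x; t.m[3][1] = y; t.m[3][2] = z;
    return t;
}

// Uniform scale, used here only as a placeholder second transform.
inline Mat4 scaling(float s) {
    Mat4 r{};
    r.m[0][0] = r.m[1][1] = r.m[2][2] = s;
    r.m[3][3] = 1.0f;
    return r;
}
```

The two-step form costs one extra vector-matrix multiply per vertex in the shader, in exchange for not having to rewrite every object's cbuffer when the camera moves.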
In practice I use 3 different types of cbuffer in my DX11 code: per-view, per-object and per-material. This minimizes the need to update cbuffers: the per-view cbuffer needs an update only once per view, the per-object cbuffer only when the object moves (only a small percentage of objects is expected to move per frame), and the per-material cbuffer only when material properties change (almost never).
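A rough way to see the difference is to count cbuffer updates per frame under each scheme. The sketch below is only a bookkeeping model with hypothetical names (SceneObject, UploadCounts, countSplitUploads, countCombinedWvpUploads); it does not touch any graphics API and assumes a single view for the combined-WVP comparison.

```cpp
#include <cassert>
#include <vector>

struct SceneObject {
    bool movedThisFrame;
};

struct UploadCounts {
    int perView = 0;
    int perObject = 0;
    int perMaterial = 0;
};

// Three-way split: per-view cbuffers are written once per view, per-object
// cbuffers only for objects that moved, per-material ones only on edits.
inline UploadCounts countSplitUploads(const std::vector<SceneObject>& objects,
                                      int views, int materialsChanged) {
    UploadCounts c;
    c.perView = views;
    c.perMaterial = materialsChanged;
    for (const SceneObject& o : objects)
        if (o.movedThisFrame) ++c.perObject;
    return c;
}

// Combined WVP in the per-object cbuffer: a camera change invalidates every
// object's matrix, so every per-object cbuffer must be rewritten.
inline int countCombinedWvpUploads(const std::vector<SceneObject>& objects,
                                   bool cameraMoved) {
    int n = 0;
    for (const SceneObject& o : objects)
        if (cameraMoved || o.movedThisFrame) ++n;
    return n;
}
```

With 100 objects of which 2 moved, and a moving camera, this model gives 1 per-view plus 2 per-object updates for the split layout, against 100 per-object updates when the combined matrix lives in the per-object cbuffer.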