It appears that both of you only have one main update routine. I read a while ago that it is best to split up the update into two phases. One phase for updating the game physics/states and another update for the rendering.
That depends on what exactly you split up, and where.
It is indeed a good idea - considering the next generation of dual-core processors - to split up rendering operations and game state code at some point, so that you use both cores.
In detail, most DX / OpenGL API calls eat up cycles like jellybears, so they feel quite comfortable on a different processor. Also, note that during the actual rendering operations you usually need only a little information from your entities (such as meshes, textures, transforms), so this process is easy to parallelize.
Finally, splitting up makes sense if you are running fixed step physics, and don't want rendering to interfere there.
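To illustrate the fixed step case, here is a minimal sketch of the usual accumulator approach. FixedStepper and advance() are names I made up for this post, and the step size is whatever your physics wants; take it as one way to do it, not the only one.

```cpp
#include <cassert>

// Hypothetical helper: runs physics in fixed steps, no matter how long
// the rendered frame took.
struct FixedStepper {
    double step;               // fixed physics step in seconds
    double accumulator = 0.0;  // frame time not yet consumed by physics

    explicit FixedStepper(double stepSeconds) : step(stepSeconds) {
        assert(stepSeconds > 0.0);
    }

    // Feed the elapsed frame time; calls update(step) once per whole step
    // that fits. Returns how many physics steps were run.
    template <typename UpdateFn>
    int advance(double frameSeconds, UpdateFn update) {
        accumulator += frameSeconds;
        int steps = 0;
        while (accumulator >= step) {
            update(step);
            accumulator -= step;
            ++steps;
        }
        return steps;
    }
};
```

A slow frame simply runs several physics steps, a fast one may run none, and the leftover time is carried over to the next frame.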
For most simpler applications, e.g. with a variable time step and lower CPU / graphics load, it is usually easier to just run a single game loop that does rendering and logic in a fixed sequence.
For splitting up to work, however, it is not important to have a separate rendering state and game state (which easily becomes an utter mess, I tried this one too). It is far more important to properly design your renderer so it is easy to parallelize.
This especially means having just a few points where your logic needs to wait on or test the renderer's mutex object. Adding a mutex to every single one of your objects' transforms will most likely kill your performance.
In my main loop, for example, there is just a single non-blocking synchronization request (the RenderStart() call). Then the logic thread commits all rendering commands to the renderer, where they are queued up in an internal buffer. After that, a call to RenderFinish() restarts the rendering thread. This thread also has a main loop, but it acts as a simple worker thread and does not do any updating (i.e. it just handles all commands queued between RenderStart() and RenderFinish() by translating them into API calls, and then sits waiting).
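Roughly, the setup could look like the sketch below. This is not my actual code, just a stripped-down illustration using the standard C++ thread library. Only the names RenderStart() and RenderFinish() come from the description above; Submit(), WaitIdle(), the Command type and the two-buffer swap are assumptions made for the example, and a real renderer would queue compact command structs rather than std::function objects.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

class Renderer {
public:
    using Command = std::function<void()>;  // stand-in for a real API call

    Renderer() : worker_([this] { Run(); }) {}

    ~Renderer() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            quit_ = true;
        }
        wake_.notify_all();
        worker_.join();
    }

    // Non-blocking: false means the worker is still busy with the last
    // batch, so the logic thread should skip committing this time.
    bool RenderStart() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (busy_) return false;
        recording_.clear();
        return true;
    }

    // Logic thread only, between RenderStart() and RenderFinish(): no lock.
    void Submit(Command command) { recording_.push_back(std::move(command)); }

    // Hands the recorded batch to the worker and wakes it up.
    void RenderFinish() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(!busy_ && "RenderFinish() without a successful RenderStart()");
            executing_.swap(recording_);
            busy_ = true;
        }
        wake_.notify_all();
    }

    // Hypothetical helper (handy for shutdown or tests): block until idle.
    void WaitIdle() {
        std::unique_lock<std::mutex> lock(mutex_);
        wake_.wait(lock, [this] { return !busy_; });
    }

private:
    void Run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            wake_.wait(lock, [this] { return busy_ || quit_; });
            if (!busy_) return;  // quit requested and nothing left to draw
            lock.unlock();
            for (auto& command : executing_) command();  // "API calls"
            executing_.clear();
            lock.lock();
            busy_ = false;
            wake_.notify_all();
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::vector<Command> recording_;  // filled by the logic thread
    std::vector<Command> executing_;  // drained by the render thread
    bool busy_ = false;
    bool quit_ = false;
    std::thread worker_;  // declared last: members above must exist first
};
```

The point is that the logic thread takes the lock only twice per frame (once in RenderStart(), once in RenderFinish()), and never once per object.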
One might perhaps consider adding extrapolation to the renderer (for which you'd again need some kind of coherent renderer-side state, but you could keep the command setup described above and provide additional information for extrapolation, i.e. speed and acceleration). Yet there will hardly ever be a situation where your renderer is faster than your logic system (and thus has time to extrapolate).
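If you did want to try it, the extra data per command would be small. A minimal sketch, assuming plain constant-acceleration motion; Vec3, MotionSnapshot and Extrapolate() are made-up names for illustration:

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

// Hypothetical data the logic thread would attach to a render command.
struct MotionSnapshot {
    Vec3 position;      // position at the time the command was committed
    Vec3 velocity;      // units per second
    Vec3 acceleration;  // units per second squared
};

// Constant-acceleration extrapolation: p + v*dt + 0.5*a*dt^2,
// where dt is the time elapsed since the snapshot was taken.
inline Vec3 Extrapolate(const MotionSnapshot& s, float dt) {
    assert(dt >= 0.0f);  // forward in time only
    const float h = 0.5f * dt * dt;
    return { s.position.x + s.velocity.x * dt + s.acceleration.x * h,
             s.position.y + s.velocity.y * dt + s.acceleration.y * h,
             s.position.z + s.velocity.z * dt + s.acceleration.z * h };
}
```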
Another idea for splitting up would be to separate updates of graphics objects from updates to their physical state (which, however, is another, unrelated problem). They did this in Battlefield 1942, for example. There, they update the "skeleton" of soldiers far away less often than that of nearby enemies. This leads to a few glitches, but improves the framerate. However, I guess this is one thing that can be done in the logic thread whenever it is time to commit rendering operations. For example, before committing an object, check whether it should get its geometry updated. It is also certainly not the point where you'd want to go with yet another thread - simply consider how ugly it would be to have to sync three threads (renderer, logic, geo update) trying to access your objects' transforms.
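As a sketch of that check in the logic thread: the distance thresholds, the Soldier struct and both function names below are invented for the example (I have no idea what numbers the Battlefield people actually use).

```cpp
#include <cassert>

// Hypothetical policy: frames between skeleton updates, by camera distance.
// The thresholds are arbitrary example values.
inline int SkeletonUpdateInterval(float distance) {
    assert(distance >= 0.0f);
    if (distance < 20.0f) return 1;  // nearby: every frame
    if (distance < 60.0f) return 2;  // mid range: every other frame
    return 4;                        // far away: every fourth frame
}

struct Soldier {
    float distanceToCamera = 0.0f;
    int framesSinceSkeletonUpdate = 0;
    int skeletonUpdates = 0;  // stand-in counter for the real posing work
};

// Called by the logic thread right before committing the object
// to the renderer; no extra thread involved.
inline void PrepareForCommit(Soldier& s) {
    ++s.framesSinceSkeletonUpdate;
    if (s.framesSinceSkeletonUpdate >= SkeletonUpdateInterval(s.distanceToCamera)) {
        ++s.skeletonUpdates;  // real code would re-pose the skeleton here
        s.framesSinceSkeletonUpdate = 0;
    }
}
```

Over eight frames, a soldier at distance 5 gets eight skeleton updates, while one at distance 100 gets only two.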
I hope this helps,
and sorry for the long text