Finally I got some time to give a better answer.
Grid traversal gets highly inefficient the further your ray goes (though this can mostly be solved by an SVO or similar).
Anyway, instanced ray tracing isn't that much faster than the standard uninstanced kind. I'd say it's even slower than uninstanced rendering. Imagine a scene where we have a 1M-triangle mesh. Let's place this mesh at 5 positions in the world.
How a simple standard ray tracer works:
- (possible to precompute) Create an acceleration structure (e.g. a Kd-tree) for the whole scene (built with e.g. SAH, to get realtime performance); the leaves contain triangles
- Traverse each ray through the Kd-tree
How a simple instanced ray tracer works:
- (possible to precompute) Create an acceleration structure for our mesh (Kd-tree with SAH); the leaves contain triangles
- (possible to precompute) Create an acceleration structure for the scene, where each leaf contains another Kd-tree (the object's one)
- Traverse the ray through the scene hierarchy
- If a leaf is hit, traverse the ray through the object hierarchy (with the ray transformed to local object space)
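The two approaches above can be sketched roughly like this. It's only a toy, under loud assumptions: instances are pure translations (so the "transform to local object space" is just a subtraction), and both Kd-trees are replaced by brute-force loops to keep it short. All the names (hit_mesh, trace_instanced, trace_flat, flatten) are made up for the illustration.

```python
def sub(a, b): return (a[0] - b[0], a[1] - b[1], a[2] - b[2])
def add(a, b): return (a[0] + b[0], a[1] + b[1], a[2] + b[2])
def dot(a, b): return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def hit_triangle(orig, dirn, tri, eps=1e-9):
    """Moller-Trumbore test: distance along the ray, or None on a miss."""
    v0, v1, v2 = tri
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(dirn, e2)
    det = dot(e1, p)
    if abs(det) < eps:          # ray parallel to the triangle plane
        return None
    inv = 1.0 / det
    s = sub(orig, v0)
    u = dot(s, p) * inv
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(dirn, q) * inv
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    return t if t > eps else None

def hit_mesh(orig, dirn, mesh):
    """Stand-in for a Kd-tree traversal: brute force over the triangles."""
    best = None
    for tri in mesh:
        t = hit_triangle(orig, dirn, tri)
        if t is not None and (best is None or t < best):
            best = t
    return best

def trace_instanced(orig, dirn, mesh, offsets):
    """Instanced: one shared mesh, the ray is moved into each object's space."""
    best = None
    for off in offsets:         # stand-in for the scene-level hierarchy
        local_orig = sub(orig, off)   # inverse of a pure translation
        t = hit_mesh(local_orig, dirn, mesh)
        if t is not None and (best is None or t < best):
            best = t
    return best

def flatten(mesh, offsets):
    """Uninstanced: copy the mesh into world space once per placement."""
    return [tuple(add(v, off) for v in tri) for off in offsets for tri in mesh]

def trace_flat(orig, dirn, mesh, offsets):
    """Standard: a single structure over all the copied triangles."""
    return hit_mesh(orig, dirn, flatten(mesh, offsets))
```

Both functions should report the same nearest hit for the same scene. The difference is that flatten stores one triangle copy per placement, while trace_instanced keeps a single mesh and pays a per-instance ray transform instead.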
Point 1 (building the acceleration structure) will be a lot slower in the standard ray tracer: for 5 meshes of 1M tris each and an ideal O(N log N) SAH Kd-tree builder, we create the Kd-tree at least about 5.6 times slower.
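For the curious, here is where the 5.6 figure comes from. This is just the idealized N log N model with constant factors dropped, and it ignores the (tiny) cost of building the scene-level tree over 5 instances; build_cost is a made-up helper name.

```python
import math

def build_cost(n):
    """Idealized O(N log N) build cost, constant factors dropped."""
    return n * math.log2(n)

mesh_tris = 1_000_000
copies = 5

flat = build_cost(mesh_tris * copies)   # one tree over all 5M triangles
instanced = build_cost(mesh_tris)       # one tree over the single shared mesh
ratio = flat / instanced

print(round(ratio, 2))
```

The ratio comes out a bit above 5 because the log factor grows too: 5 * log(5M) / log(1M), which is roughly 5.6.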
In the important point 2 (traversal), though, the standard ray tracer will most likely beat the instanced one (assuming we have a good Kd-tree layout in memory, so we won't lose that much on cache misses).
Note that in the 1st case (uninstanced) our memory demands will be more than 5 times bigger! So this is what instancing saves. Performance, not so much!
We would need to benchmark it, though.