As the subject line asks, what's the general process by which an engine takes the frustum coordinates (actually, this assumes the data has already been transformed into the [-1, 1] normalized-device-coordinate box) and gets all the way to pixels on the screen?
The word for this process is "rasterization" and it's a complex topic. The good news is that the hardware handles all this for you so you don't really need to know the details of the process to write a 3D app. However if you'd like to understand it, look for some software rasterization articles. There are a variety of ways to go about it.
Gotcha. Well, lemme try this...
I know that you have to do the viewport transform:
X = (X + 1) * Viewport.Width * 0.5 + Viewport.TopLeftX
Y = (1 - Y) * Viewport.Height * 0.5 + Viewport.TopLeftY
Z = Viewport.MinDepth + Z * (Viewport.MaxDepth - Viewport.MinDepth)
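For concreteness, the three formulas above can be written as a small function. This is a minimal sketch assuming a D3D-style viewport; the parameter names are illustrative, not any real API's.

```python
def viewport_transform(x, y, z, vp_x, vp_y, vp_w, vp_h, min_d, max_d):
    """Map NDC coords (x, y in [-1, 1], z in [0, 1]) to screen space."""
    sx = (x + 1) * vp_w * 0.5 + vp_x
    sy = (1 - y) * vp_h * 0.5 + vp_y  # flip Y: NDC +Y is up, screen +Y is down
    sz = min_d + z * (max_d - min_d)
    return sx, sy, sz
```

So for an 800x600 viewport at the origin, NDC (0, 0) lands at the screen center (400, 300), and NDC (-1, 1) lands at the top-left corner (0, 0).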
After that, I'm basically looking to understand the logic of the actual pixel-painting, if you will, so I can implement shader and lighting designs.
I know a lot of things like texture mapping and lighting and shading (obviously) enter here; I'm simply looking for an overall summation, whether or not any math is included, of the process so I can understand the conceptual logic of the "painting part".
I'm not sure exactly what you're asking for...after the viewport transform, you have the vertices of the triangle in screen space; the graphics card determines which pixels it covers and then runs a pixel shader to calculate the color for each one of those pixels. Conceptually, that's all there is to it. What you do in the pixel shader is up to you. Are you looking for a tutorial on how to write shaders, basic lighting/shading models and suchlike?
In terms of the actual implementation of rasterization, of course determining which pixels a triangle covers isn't a trivial thing to do fast, so there are a bunch of technical details and optimizations. It sounds like that's not what you're interested in learning about right now, though, unless I've misunderstood.
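If you're curious anyway, one common conceptual approach (a toy illustration, not literally how hardware does it) is the edge-function test: evaluate three signed-area functions at each pixel center, and the pixel is inside the triangle when all three agree in sign.

```python
def edge(ax, ay, bx, by, px, py):
    # Twice the signed area of triangle (a, b, p); the sign tells
    # which side of edge a->b the point p lies on.
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)

def covered_pixels(v0, v1, v2, width, height):
    """Brute-force scan: return pixel coords whose centers lie inside
    the triangle v0-v1-v2 (consistent winding assumed)."""
    pixels = []
    for y in range(height):
        for x in range(width):
            px, py = x + 0.5, y + 0.5  # sample at the pixel center
            w0 = edge(v1[0], v1[1], v2[0], v2[1], px, py)
            w1 = edge(v2[0], v2[1], v0[0], v0[1], px, py)
            w2 = edge(v0[0], v0[1], v1[0], v1[1], px, py)
            if w0 >= 0 and w1 >= 0 and w2 >= 0:  # inside all three edges
                pixels.append((x, y))
    return pixels
```

Real rasterizers replace the brute-force scan with incremental evaluation, tile-based traversal, and careful fill rules, but the core "which side of each edge" idea is the same.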
Well, basically, I'm curious about the step-by-step process, math and logic and all, that takes the triangles in screen space and gets to the final result. I suppose a flaw might reside in my understanding: I thought that in order to write a shader, one must code the actual process of determining which pixels the triangle covers. That must be where I'm confused. And, lol, I suppose I therefore don't rightly know exactly how to ask the question. I did assume that when one writes shaders for lighting/shading/art-style/etc. stuff, that entailed doing the whole screen-space-triangle-info-to-pixel-coverage business yourself, something which seems rather complex to me.
The GPU entirely handles the problem of determining which pixels a given triangle covers, so no, you don't have to do that yourself. (You can theoretically do it all yourself, but I wouldn't advise it unless you have a very good reason!) A pixel shader is just a short program, written in a shading language like Cg or HLSL, that the GPU executes once for each pixel in the triangle. It can do things like access attributes passed down from the vertex shader (UV coordinates, normal vectors, etc.), which are automatically interpolated across the area of the triangle, and it can get texture samples. You'll write the code to combine these data to generate the final color of that pixel, but the GPU calls into your code, so you don't need to implement (nor can you control) the rasterization process itself.
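To make that concrete, here's roughly what the body of a pixel shader might compute, written in plain Python rather than HLSL/Cg: simple Lambert (N dot L) diffuse shading from an interpolated normal. The function names are illustrative only.

```python
def normalize(v):
    # Scale a 3-vector to unit length.
    length = sum(c * c for c in v) ** 0.5
    return tuple(c / length for c in v)

def lambert_pixel(normal, light_dir, albedo):
    """normal: interpolated surface normal at this pixel;
    light_dir: direction toward the light; albedo: base color (r, g, b)."""
    n, l = normalize(normal), normalize(light_dir)
    ndotl = max(0.0, sum(a * b for a, b in zip(n, l)))  # clamp backfacing light
    return tuple(c * ndotl for c in albedo)
```

The GPU hands you the interpolated `normal` (and UVs, etc.) per pixel; all you write is the combine step.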
Actually, rasterization is one of the last things still done by fixed-function hardware on the GPU. You can write it yourself in OpenCL, but it isn't advised (it will most likely be a lot slower than the hardware rasterization process).
I see... I realize I was confused about the scope of shaders, then. I guess that's still where I'm confused. I gather that, basically, each triangle has a color/texture image attached to it, plus additional color and/or lighting values/info as well; I thought that when one writes shaders to, say, generate random noise functions or custom lighting, one would have to calculate the current skew of the triangle in order to apply effects appropriately on a pixel-by-pixel basis. I think what I don't get is how one can write a shader to manipulate each pixel without knowing the skew of the triangle, since some shader purposes obviously deal with skew-affected issues (such as noise generation; if the triangle is skewed a bit away from the camera, the noise would need to be generated a bit more densely to make it look right, things like that). Probably an ignorant-sounding Q here, but this is just an area of the engine process I'm seeking to get an exact hold on. As well, how are things like (hope I get the following right) octrees handled, therefore? And voxels? (I'm trying to remember what those are/do; I just know they're related here somehow.)
The "skew" is implicit in the interpolation of attributes across the triangle. For example, you assign UVs (texture coordinates) to each vertex, and the GPU automatically generates intermediate UVs for all the pixels within the triangle, in a perspective-correct fashion. The pixel shader you write can receive these UVs and use them to sample a texture, which makes the texture appear glued to the triangle and will transform and foreshorten appropriately as you move the camera about and look at the triangle from different angles.
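A hedged sketch of how that perspective-correct interpolation works under the hood: the rasterizer effectively interpolates attr/w and 1/w linearly across the screen-space triangle (w being the clip-space depth term), then divides at each pixel. The function name and parameter layout here are illustrative.

```python
def perspective_correct_uv(bary, uvs, ws):
    """bary: screen-space barycentric weights at the pixel (sum to 1);
    uvs: per-vertex (u, v) pairs; ws: per-vertex clip-space w values."""
    # Interpolate 1/w and uv/w linearly, then recover uv by dividing.
    inv_w = sum(b / w for b, w in zip(bary, ws))
    u = sum(b * uv[0] / w for b, uv, w in zip(bary, uvs, ws)) / inv_w
    v = sum(b * uv[1] / w for b, uv, w in zip(bary, uvs, ws)) / inv_w
    return u, v
```

When all three w values are equal (a triangle facing the camera head-on), this reduces to plain linear interpolation; when they differ, the UVs bunch up toward the far vertices, which is exactly the foreshortening "skew" you were asking about.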
Voxels are a completely different rendering process that is not based on triangles. GPUs do not support voxels directly, though voxel rendering can be done on the GPU, typically by drawing a single triangle that covers the whole screen and using a pixel shader to render the voxels yourself. Anyway, that's a whole other topic.
I see. So some of the GPU work actually occurs before shader functioning, in fact feeding interpolated triangle data TO shaders?
Also, I've always been curious: why are triangles the standard modus operandi, versus meshes being composed of any ol' coordinates? I assume it has something to do with their amenability to trigonometry-type math, but I'm indeed very curious. I know that there are other methods out there, be they voxels, octrees, or square-polygon methodologies. I also hear that one of the major companies and their engine (forgot who and which) are looking at full octree implementation in their next engine iteration.
Yes, the GPU is a combination of hard-wired and programmable stages. Shaders are the programmable ones; rasterization and interpolation are among the hard-wired ones.
Triangles are mathematically one of the simplest shapes you can use to make a 3D world out of, and everything else can be reduced to triangles or approximated by triangles. It's a lot simpler and faster for graphics APIs and hardware to handle one kind of primitive than many, so they picked triangles as the one true primitive. (Actually, GPUs support line and point primitives as well, but these are rarely used in production rendering. Lines are used for wireframe views and other kinds of debugging/developer views.)
By the way, octrees are not a primitive that can be used to make 3D worlds; they are just a container - a data organization system that you can put other things into, to make certain kinds of spatial operations faster (such as finding the nearest object to a given point). Octrees can be used to organize objects in your game world, triangles, voxels, or anything else that lives in 3D space. Octrees are an example of a class of data structures called space-partitioning structures, of which other examples are bounding box trees, sphere trees, BSP trees, and kd-trees.
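To illustrate the "octrees are just containers" point, here's a minimal point-octree sketch. The capacity threshold and names are made up for illustration, and it omits the usual depth guard against degenerate input (many identical points would recurse forever).

```python
class Octree:
    MAX_POINTS = 4  # split a node once it holds more than this

    def __init__(self, center, half_size):
        self.center, self.half = center, half_size
        self.points = []
        self.children = None  # becomes a list of 8 child Octrees on split

    def _child_index(self, p):
        # 3-bit index: which side of the center the point lies on per axis.
        cx, cy, cz = self.center
        return (p[0] >= cx) | ((p[1] >= cy) << 1) | ((p[2] >= cz) << 2)

    def _split(self):
        h = self.half / 2
        cx, cy, cz = self.center
        self.children = [
            Octree((cx + (h if i & 1 else -h),
                    cy + (h if i & 2 else -h),
                    cz + (h if i & 4 else -h)), h)
            for i in range(8)
        ]
        pts, self.points = self.points, []
        for q in pts:  # push stored points down into the children
            self.children[self._child_index(q)].insert(q)

    def insert(self, p):
        if self.children is not None:
            self.children[self._child_index(p)].insert(p)
            return
        self.points.append(p)
        if len(self.points) > self.MAX_POINTS:
            self._split()
```

The payload here is points, but it could just as easily be triangles, voxel bricks, or game objects; the octree itself only cares about where things are.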
Thanks Reedbeta, that's great stuff. That clarifies a lot. If I have any other Q's here I'll put 'em forward...right now, that gives me the stuff I'm looking for.
Btw, in regards to the various methods of space-partitioning structures you mentioned: is there a favored, or generally preferred, method from amongst those you mentioned? I'm just curious as to what is usually used in gaming, or, also, if there's a method that is growing in popularity right now. I also assume, if I can do so, that those methods can be somewhat customized in their functioning to fashion your own style of space-partitioning...?
Different methods are useful for different cases - that's why different methods exist. Which method works best depends a lot on the details of your scene and what you want to use the space-partitioning for.
Also, consider whether the space partitioning can be done prior to the game running: it's no good having a rendering scheme that can render millions of polygons per second if the space partitioning takes a week to run per frame.
I have a software raytracer I use a lot. When I wrote it, I used a bounding volume hierarchy (BVH) of axis-aligned bounding boxes, split using the surface area heuristic (SAH) for near-optimal splits.
The BVH creation is ten times slower than the renderer, but the combined effect is a raytracer that is roughly a hundred times faster than brute force.
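For anyone curious, the SAH boils down to a cost estimate like this: the chance a random ray hits a child box is proportional to its surface area, so you pick the split minimizing expected intersection work. A sketch, with illustrative constants c_trav and c_isect:

```python
def aabb_surface_area(lo, hi):
    # Surface area of an axis-aligned box given min/max corners.
    dx, dy, dz = (hi[i] - lo[i] for i in range(3))
    return 2 * (dx * dy + dy * dz + dz * dx)

def sah_cost(parent_sa, left_sa, n_left, right_sa, n_right,
             c_trav=1.0, c_isect=1.0):
    """Estimated cost of a candidate split: one traversal step plus,
    for each child, (probability of hitting it) * (triangles to test)."""
    return c_trav + c_isect * (left_sa / parent_sa * n_left +
                               right_sa / parent_sa * n_right)
```

A builder evaluates this cost for many candidate split planes and keeps the cheapest, which is why construction can cost more than rendering, as noted above.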