- Why do we need homogeneous coordinates?
You need them for affine transformations and projection. With an ordinary 3x3 matrix you can only apply linear transformations such as rotation, scaling and shearing. The point at (0, 0, 0) will always remain at that position. A transformation matrix is essentially just a linear system of equations. A vector (x, y, z) transformed by a 3x3 matrix M is essentially just x*M.xAxis + y*M.yAxis + z*M.zAxis. It's obvious that this can't produce a translation: if all the components of the vector are zero, the result is zero as well. So you need an extra term: x*M.xAxis + y*M.yAxis + z*M.zAxis + translation. By using homogeneous coordinates, you add a w-component to the vector (which usually is an implied 1) in order to be able to include translation in the matrix: x*M.xAxis + y*M.yAxis + z*M.zAxis + w*M.translation. To get to the actual 3D point in space, you should divide the resulting vector by its w-component. If you make sure that w is always 1, you can skip that divide. Therefore common calculations only involve 3D vectors with an implied w=1 and 3x4 or 4x3 matrices with an implied last row/column of (0, 0, 0, 1).
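To make the formula concrete, here is a minimal Python sketch. It assumes the same row-vector convention as the formula above, with the matrix stored as four rows (xAxis, yAxis, zAxis, translation). The names `transform` and `M` are made up for illustration and don't come from any particular library.

```python
def transform(v, m):
    """Multiply row vector v = (x, y, z, w) by a 4x4 matrix m given as 4 rows."""
    x, y, z, w = v
    x_axis, y_axis, z_axis, translation = m
    # x*M.xAxis + y*M.yAxis + z*M.zAxis + w*M.translation, component by component
    return tuple(
        x * x_axis[i] + y * y_axis[i] + z * z_axis[i] + w * translation[i]
        for i in range(4)
    )

# Identity rotation/scale part plus a translation of (5, 0, -2).
M = [
    (1.0, 0.0, 0.0, 0.0),   # xAxis
    (0.0, 1.0, 0.0, 0.0),   # yAxis
    (0.0, 0.0, 1.0, 0.0),   # zAxis
    (5.0, 0.0, -2.0, 1.0),  # translation
]

origin = (0.0, 0.0, 0.0, 1.0)
point = (1.0, 2.0, 3.0, 1.0)

print(transform(origin, M))  # (5.0, 0.0, -2.0, 1.0): the origin has moved
print(transform(point, M))   # (6.0, 2.0, 1.0, 1.0)
```

Because the last column of M is (0, 0, 0, 1), the output w stays 1 and no divide is needed.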
For perspective projections, you need to be able to divide. For a point in view space, the projected point on a 2D screen can be calculated by dividing x and y by z. Division is not a linear operation, so it can't be expressed as a matrix multiplication on its own. However, using the properties of homogeneous coordinates, the division by z can be achieved simply by copying 'z' to 'w' using a 4x4 transformation matrix and then dividing by w afterwards.
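A stripped-down sketch of that trick in Python, using the same row-vector convention as before. The matrix `P` is a deliberately simplified assumption: it only copies z into w and ignores field of view, aspect ratio and near/far planes, which a real projection matrix would also encode. `transform` and `project` are illustrative names.

```python
def transform(v, m):
    """Multiply row vector v = (x, y, z, w) by a 4x4 matrix m given as 4 rows."""
    return tuple(sum(v[r] * m[r][c] for r in range(4)) for c in range(4))

# Leaves x, y and z untouched and copies the input z into the output w.
P = [
    (1.0, 0.0, 0.0, 0.0),
    (0.0, 1.0, 0.0, 0.0),
    (0.0, 0.0, 1.0, 1.0),  # z also feeds the output w
    (0.0, 0.0, 0.0, 0.0),  # the input w contributes nothing
]

def project(view_point):
    """Project a view-space point onto the 2D screen plane."""
    x, y, z, w = transform(view_point, P)
    # The homogeneous divide: since w == z, this is x/z and y/z.
    return (x / w, y / w)

view_point = (2.0, 4.0, 8.0, 1.0)
print(transform(view_point, P))  # (2.0, 4.0, 8.0, 8.0)
print(project(view_point))       # (0.25, 0.5)
```

The matrix multiply itself stays linear; the only division happens once, at the very end.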
Note that while mathematically homogeneous coordinates are very well-defined, in computer graphics we merely use them as a convenience to be able to specify the calculations that we need. Most computer graphics applications will never use homogeneous coordinates to their full extent. Before the perspective transformation, 'w' will always equal 1, and the only place where an actual division by w takes place is when projecting the points onto a 2D surface before rendering. No one cares about the property that (1, 2, 3, 1) and (2, 4, 6, 2) are essentially the same point. In fact, having w=2 will break most code, since it simply assumes that w=1, and it may not even be possible to explicitly store a w component.
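A small Python illustration of that last point. The helper names `to_cartesian` and `drop_w` are hypothetical; `drop_w` stands in for typical code that assumes w=1 and simply ignores it.

```python
def to_cartesian(v):
    """Mathematically correct: divide by w to get the 3D point."""
    x, y, z, w = v
    return (x / w, y / w, z / w)

def drop_w(v):
    """What code that assumes w == 1 effectively does: ignore w."""
    return v[:3]

a = (1.0, 2.0, 3.0, 1.0)
b = (2.0, 4.0, 6.0, 2.0)

print(to_cartesian(a) == to_cartesian(b))  # True: the same point in space
print(drop_w(a) == drop_w(b))              # False: w=2 breaks the shortcut
```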