So if I am using a column-major matrix, I only need to change the indices to make the code snippet work, right?
I also noticed that there are other algorithms for decomposing a matrix (SVD, the QR algorithm). When should I use those?
To clarify, I'm assuming that your matrix is in "SRT" form - Scale, Rotate, Translate, meaning the transformations from local to world space are done in that order. First you scale the local space - usually uniformly, but if nonuniform scaling is applied it is always along the XYZ axes of local space (this is what Smile meant by a "diagonal scale matrix"). Then you rotate the space to orient it to its parent and finally translate it to its correct position. When a matrix is constructed in SRT form, you can extract the original XYZ scale factors by looking at the length of each of the first three rows (for row-vector convention) or the first three columns (for column-vector convention) of the matrix. The remaining row or column holds the translation and is not part of the scale.
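Here is a rough sketch of that extraction in plain Python, just to make the row/column distinction concrete. The function names (`extract_scale`, `rotation_z`, and so on) and the nested-list 4x4 layout are my own choices for illustration, not anything from the snippet discussed earlier.

```python
import math

def extract_scale(m, row_vectors=True):
    """Return (sx, sy, sz) from a 4x4 matrix assumed to be in SRT form."""
    if row_vectors:
        # Row-vector convention (v' = v * M): the basis axes are the first three rows.
        axes = [m[i][:3] for i in range(3)]
    else:
        # Column-vector convention (v' = M * v): the basis axes are the first three columns.
        axes = [[m[r][i] for r in range(3)] for i in range(3)]
    return tuple(math.sqrt(sum(c * c for c in axis)) for axis in axes)

def mat_mul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale_matrix(sx, sy, sz):
    return [[sx, 0.0, 0.0, 0.0],
            [0.0, sy, 0.0, 0.0],
            [0.0, 0.0, sz, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def rotation_z(angle):
    """Rotation about Z, written for the row-vector convention."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0, 0.0],
            [-s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def translation(tx, ty, tz):
    """Translation for the row-vector convention (offset lives in the last row)."""
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [tx, ty, tz, 1.0]]

# Scale first, then rotate, then translate: v' = v * S * R * T.
srt = mat_mul(mat_mul(scale_matrix(2.0, 3.0, 4.0), rotation_z(0.7)),
              translation(10.0, -5.0, 1.0))

# The same transform in column-vector convention is simply the transpose.
srt_col = [list(row) for row in zip(*srt)]

scale_from_rows = extract_scale(srt, row_vectors=True)
scale_from_cols = extract_scale(srt_col, row_vectors=False)
```

Both results come out as approximately (2.0, 3.0, 4.0), up to floating-point error. Note that this only recovers the magnitudes; a negative scale (mirroring) would need an extra sign check, which I've left out here.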
This will also work when multiple transformations are concatenated in a local->parent->parent->...->world hierarchy - as long as nonuniform scaling is applied only at the bottom-most level. (Uniform scaling is fine at any level.) If rotation is used prior to non-uniform scaling, or (worse) if multiple non-uniform scalings are applied with a rotation between them, then the result is no longer in SRT form and you have to use a more sophisticated algorithm like one of the ones you mentioned.
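If you want to check whether a given matrix is still in SRT form, one option (my suggestion, not something from the earlier snippet) is to test whether the three axis rows of the upper-left 3x3 part are still mutually perpendicular. Scale followed by rotation keeps them perpendicular; rotation followed by nonuniform scale generally does not. A minimal sketch, assuming row-vector convention and an absolute tolerance that you would want to tune for your own data:

```python
import math

def rows_are_orthogonal(a, tol=1e-9):
    """True if the rows of the 3x3 matrix `a` are mutually perpendicular."""
    for i in range(3):
        for j in range(i + 1, 3):
            dot = sum(a[i][k] * a[j][k] for k in range(3))
            if abs(dot) > tol:
                return False
    return True

def mat_mul3(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

# 45-degree rotation about Z (row-vector convention) and a nonuniform scale.
c = s = math.sqrt(0.5)
R = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]
S = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

scale_then_rotate = mat_mul3(S, R)  # SRT order: v * S * R
rotate_then_scale = mat_mul3(R, S)  # rotation applied before the scale

is_srt = rows_are_orthogonal(scale_then_rotate)      # axes stay perpendicular
is_sheared = not rows_are_orthogonal(rotate_then_scale)  # axes are skewed
```

When the check fails, the row lengths no longer correspond to the original scale factors, and that is the case where a decomposition such as SVD or QR becomes necessary. For column-vector convention you would run the same test on the columns instead.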
In short, stick with uniform scaling and you won't have trouble. Non-uniform scaling brings in a whole bunch of extra problems.