[SOLVED & COMMITED]sceGumLookAt(vfpu) is broken

Zettablade · Post by **Zettablade** » Sun Mar 04, 2007 4:44 pm

Just got the most recent update from the pspsdk's svn today. Apparently, there were some changes, and now sceGumLookAt from gum vfpu is working improperly. It severely warps the view. Just thought I'd point it out.
Sooo, if anyone would like to point out a viable solution, that would be awsome :)

Zettablade · Post by **Zettablade** » Sun Mar 04, 2007 11:18 pm

so, I've done a temp "fix", involving going back to the previous revisions code for sceGumLookAt _vfpu. A lot has changed between the new "broken" and the old stable code. I'll post the differences. I hope this issue is addressed in the next revision, as sceGumLookAt is a very useful function, and sceGum_vfpu, which may not be as fast as possible, is still faster then sceGum.

old, stable (vfpu)

Code: Select all

#ifdef F_sceGumLookAt_vfpu
void sceGumLookAt&#40;ScePspFVector3* eye, ScePspFVector3* center, ScePspFVector3* up&#41;
&#123;
	ScePspFMatrix4* t = GUM_ALIGNED_MATRIX&#40;&#41;;
	gumLoadIdentity&#40;t&#41;;
	gumLookAt&#40;t,eye,center,up&#41;;
	
	pspvfpu_use_matrices&#40;gum_vfpucontext, VMAT3, VMAT0 | VMAT1&#41;;

	__asm__ volatile &#40;
			"lv.q C000.q, 0 + %0\n"
			"lv.q C010.q, 16 + %0\n"
			"lv.q C020.q, 32 + %0\n"
			"lv.q C030.q, 48 + %0\n"
			"vmmul.q M100, M300, M000\n"
			"vmmov.q M300, M100\n"
	&#58; &#58; "m"&#40;*t&#41; &#41;;

	gum_current_matrix_update = 1;
&#125;
#endif

new, "broken" (vfpu)

Code: Select all

#ifdef F_sceGumLookAt_vfpu
void sceGumLookAt&#40;ScePspFVector3* eye, ScePspFVector3* center, ScePspFVector3* up&#41;
&#123;
	pspvfpu_use_matrices&#40;gum_vfpucontext, VMAT3, VMAT0 | VMAT1&#41;;

    __asm__ volatile &#40;
        "vmidt.q M100\n"
        // load eye, center, up vectors
        "ulv.q    C000, %0\n"
        "ulv.q    C010, %1\n"
        "ulv.q    C020, %2\n"
 
        // forward = center - eye
        "vsub.t  R102, C010, C000\n"
 
         // normalize forward
        "vdot.t  S033, R102, R102\n"
        "vrcp.s  S033, S033\n"
        "vscl.t  R102, R102, S033\n"
 
         // side = forward cross up
        "vcrsp.t R100, R102, C020\n"
        "vdot.t  S033, R100, R100\n"
        "vrcp.s  S003, S033\n"
        "vscl.t  R100, R100, S033\n"
 
         // lup = side cross forward
        "vcrsp.t R101, R100, R102\n"
        "vneg.t  R102, R102\n"
 
        "vmidt.q M200\n"
        "vneg.t  C230, C000\n"
	"vmmul.q M300, M100, M200\n"
    &#58; &#58; "m"&#40;*eye&#41;, "m"&#40;*center&#41;, "m"&#40;*up&#41;&#41;;

	gum_current_matrix_update = 1;
&#125;
#endif

current, stable (c implementation)

Code: Select all

#ifdef F_gumLookAt
void gumLookAt&#40;ScePspFMatrix4* m, ScePspFVector3* eye, ScePspFVector3* center, ScePspFVector3* up&#41;
&#123;
	ScePspFVector3 forward, side, lup,ieye;
	ScePspFMatrix4 t;

	forward.x = center->x - eye->x;
	forward.y = center->y - eye->y;
	forward.z = center->z - eye->z;

	gumNormalize&#40;&forward&#41;;

	gumCrossProduct&#40;&side,&forward,up&#41;;
	gumNormalize&#40;&side&#41;;

	gumCrossProduct&#40;&lup,&side,&forward&#41;;
	gumLoadIdentity&#40;&t&#41;;

	t.x.x = side.x;
	t.y.x = side.y;
	t.z.x = side.z;

	t.x.y = lup.x;
	t.y.y = lup.y;
	t.z.y = lup.z;

	t.x.z = -forward.x;
	t.y.z = -forward.y;
	t.z.z = -forward.z;

	ieye.x = -eye->x; ieye.y = -eye->y; ieye.z = -eye->z;
	gumMultMatrix&#40;m,m,&t&#41;;
	gumTranslate&#40;m,&ieye&#41;;
&#125;
#endif

#ifdef F_sceGumLookAt
void sceGumLookAt&#40;ScePspFVector3* eye, ScePspFVector3* center, ScePspFVector3* up&#41;
&#123;
	gumLookAt&#40;gum_current_matrix,eye,center,up&#41;;
	gum_current_matrix_update = 1;
&#125;
#endif

hlide · Post by **hlide** » Sun Mar 04, 2007 11:28 pm

Zettablade wrote: new, "broken" (vfpu)

Code: Select all

#ifdef F_sceGumLookAt_vfpu
void sceGumLookAt&#40;ScePspFVector3* eye, ScePspFVector3* center, ScePspFVector3* up&#41;
&#123;
	pspvfpu_use_matrices&#40;gum_vfpucontext, VMAT3, VMAT0 | VMAT1&#41;;

    __asm__ volatile &#40;
        "vmidt.q M100\n"
        // load eye, center, up vectors
        "ulv.q    C000, %0\n"
        "ulv.q    C010, %1\n"
        "ulv.q    C020, %2\n"
 
        // forward = center - eye
        "vsub.t  R102, C010, C000\n"
 
         // normalize forward
        "vdot.t  S033, R102, R102\n"
        "vrcp.s  S033, S033\n"
        "vscl.t  R102, R102, S033\n"
 
         // side = forward cross up
        "vcrsp.t R100, R102, C020\n"
        "vdot.t  S033, R100, R100\n"
        ---------->>> "vrcp.s  S003, S033\n" <<<-----------
        ---------->>> "vscl.t  R100, R100, S033\n" <<<-----------
 
         // lup = side cross forward
        "vcrsp.t R101, R100, R102\n"
        "vneg.t  R102, R102\n"
 
        "vmidt.q M200\n"
        "vneg.t  C230, C000\n"
	"vmmul.q M300, M100, M200\n"
    &#58; &#58; "m"&#40;*eye&#41;, "m"&#40;*center&#41;, "m"&#40;*up&#41;&#41;;

	gum_current_matrix_update = 1;
&#125;
#endif

the line "vscl.t R100, R100, S033\n" should be "vscl.t R100, R100, S003\n"

or

the "vrcp.s S003, S033\n" should be "vrcp.s S033, S033\n", which I think to be the correct one.

chp · Post by **chp** » Sun Mar 04, 2007 11:56 pm

I'll take a closer look at this asap. I got the code from MrMr[iCE] and I tested the changed sceGumPerspective() so that it worked, I'll go through the math on the other changes more thoroughly.

Zettablade · Post by **Zettablade** » Mon Mar 05, 2007 12:00 am

chp wrote:I'll take a closer look at this asap. I got the code from MrMr[iCE] and I tested the changed sceGumPerspective() so that it worked, I'll go through the math on the other changes more thoroughly.

Awsome, thx, and keep up the good work.

Raphael · Post by **Raphael** » Mon Mar 05, 2007 1:44 am

MrMrICE forgot that normalization involves the reciprocal square root of the vector length :) So exchange vrcp.s with vrsq.s and it should work.
Also, after the crossproduct of side and forward to fix the up vector, shouldn't that be normalized again afterwards? IIRC the cross of two unit vectors needs not necessarily be a unit vector again. Could be wrong though.

Regards.

EDIT:

Code: Select all

#ifdef F_sceGumLookAt_vfpu
void sceGumLookAt&#40;ScePspFVector3* eye, ScePspFVector3* center, ScePspFVector3* up&#41;
&#123;
   pspvfpu_use_matrices&#40;gum_vfpucontext, VMAT3, VMAT0 | VMAT1&#41;;

    __asm__ volatile &#40;
        "vmidt.q M100\n"
        // load eye, center, up vectors
        "ulv.q    C000, %0\n"
        "ulv.q    C010, %1\n"
        "ulv.q    C020, %2\n"
 
        // forward = center - eye
        "vsub.t  R102, C010, C000\n"
 
         // normalize forward
        "vdot.t  S033, R102, R102\n"
        "vrsq.s  S033, S033\n"
        "vscl.t  R102, R102, S033\n"
 
         // side = forward cross up
        "vcrsp.t R100, R102, C020\n"
        "vdot.t  S033, R100, R100\n"
        "vrsq.s  S033, S033\n"
        "vscl.t  R100, R100, S033\n"
 
         // lup = side cross forward
        "vcrsp.t R101, R100, R102\n"

        "vneg.t  R102, R102\n"
 
        "vmidt.q M200\n"
        "vneg.t  C230, C000\n"
   "vmmul.q M300, M100, M200\n"
    &#58; &#58; "m"&#40;*eye&#41;, "m"&#40;*center&#41;, "m"&#40;*up&#41;&#41;;

   gum_current_matrix_update = 1;
&#125;
#endif

EDIT2: I was wrong, the renormalization of the up vector is NOT necessary. I removed that code.

Zettablade · Post by **Zettablade** » Mon Mar 05, 2007 1:54 am

Raphael wrote:

Code: Select all

#ifdef F_sceGumLookAt_vfpu
void sceGumLookAt&#40;ScePspFVector3* eye, ScePspFVector3* center, ScePspFVector3* up&#41;
&#123;
   pspvfpu_use_matrices&#40;gum_vfpucontext, VMAT3, VMAT0 | VMAT1&#41;;

    __asm__ volatile &#40;
        "vmidt.q M100\n"
        // load eye, center, up vectors
        "ulv.q    C000, %0\n"
        "ulv.q    C010, %1\n"
        "ulv.q    C020, %2\n"
 
        // forward = center - eye
        "vsub.t  R102, C010, C000\n"
 
         // normalize forward
        "vdot.t  S033, R102, R102\n"
        "vrsq.s  S033, S033\n"
        "vscl.t  R102, R102, S033\n"
 
         // side = forward cross up
        "vcrsp.t R100, R102, C020\n"
        "vdot.t  S033, R100, R100\n"
        "vrsq.s  S033, S033\n"
        "vscl.t  R100, R100, S033\n"
 
         // lup = side cross forward
        "vcrsp.t R101, R100, R102\n"
        // not sure if following three lines really needed&#58;
        "vdot.t S033, R101, R101\n"
        "vrsq.s S003, S033\n"
        "vscl.t R101, R101, S033\n"

        "vneg.t  R102, R102\n"
 
        "vmidt.q M200\n"
        "vneg.t  C230, C000\n"
   "vmmul.q M300, M100, M200\n"
    &#58; &#58; "m"&#40;*eye&#41;, "m"&#40;*center&#41;, "m"&#40;*up&#41;&#41;;

   gum_current_matrix_update = 1;
&#125;
#endif

Tested an it works perfectly! But here's the modfied code. This should go in the svn...

Code: Select all

#ifdef F_sceGumLookAt_vfpu
void sceGumLookAt&#40;ScePspFVector3* eye, ScePspFVector3* center, ScePspFVector3* up&#41;
&#123;
	pspvfpu_use_matrices&#40;gum_vfpucontext, VMAT3, VMAT0 | VMAT1&#41;;

	__asm__ volatile &#40;
		"vmidt.q M100\n"
		// load eye, center, up vectors
		"ulv.q    C000, %0\n"
		"ulv.q    C010, %1\n"
		"ulv.q    C020, %2\n"

		// forward = center - eye
		"vsub.t  R102, C010, C000\n"

		// normalize forward
		"vdot.t  S033, R102, R102\n"
		"vrsq.s  S033, S033\n"
		"vscl.t  R102, R102, S033\n"

		// side = forward cross up
		"vcrsp.t R100, R102, C020\n"
		"vdot.t  S033, R100, R100\n"
		"vrsq.s  S033, S033\n"
		"vscl.t  R100, R100, S033\n"

		// lup = side cross forward
		"vcrsp.t R101, R100, R102\n"

		"vneg.t  R102, R102\n"

		"vmidt.q M200\n"
		"vneg.t  C230, C000\n"
		"vmmul.q M300, M100, M200\n"
	&#58; &#58; "m"&#40;*eye&#41;, "m"&#40;*center&#41;, "m"&#40;*up&#41;&#41;;

	gum_current_matrix_update = 1;
&#125;
#endif

There you go! Put that in the svn!

hlide · Post by **hlide** » Mon Mar 05, 2007 2:02 am

Raphael wrote:

Code: Select all

         // lup = side cross forward
        "vcrsp.t R101, R100, R102\n"
        // not sure if following three lines really needed&#58;
        "vdot.t S033, R101, R101\n"
        "vrsq.s S003, S033\n"
        "vscl.t R101, R101, S033\n"

in the C version, this normalization is not present so you should be right. By the way, S003 meseems to be still wrong (should be S033 if this normalization makes sense).

the result of a cross-product of two normalized vectors should be normalized.

chp · Post by **chp** » Mon Mar 05, 2007 2:06 am

If you use two normalised vectors to do the cross product, you will get a normalised vector as a result. I don't have the proof readily around, but that's the way it is. :) I'll change the usage vrcp.s into vrsq.s like you did (excluding the last normalisation) and commit.

Zettablade · Post by **Zettablade** » Mon Mar 05, 2007 2:09 am

chp wrote:If you use two normalised vectors to do the cross product, you will get a normalised vector as a result. I don't have the proof readily around, but that's the way it is. :) I'll change the usage vrcp.s into vrsq.s like you did (excluding the last normalisation) and commit.

Glad this is over with now. It was worth staying up all night!

chp · Post by **chp** » Mon Mar 05, 2007 2:15 am

At closer examination the code for sceGumLookAt() is not properly written to be used in the matrix stack. I'll take a closer look at it soon, but for now the older version is back in.

Raphael · Post by **Raphael** » Mon Mar 05, 2007 2:31 am

Yeah, I did the proof myself to get sure :) (wiki didn't spit out a fact about it).
Should be ok though, because behaviour is same as stable vfpu version (writing resulting matrix to M300 and setting gum_update_current_matrix).

hlide · Post by **hlide** » Mon Mar 05, 2007 2:34 am

chp wrote:If you use two normalised vectors to do the cross product, you will get a normalised vector as a result. I don't have the proof readily around, but that's the way it is. :)

in theory, yes. If you just try to normalized several vectors initially then cross-product the results in a loop without subsequent normalizations, you may get a degradation in the long run due to the limitation of float precision. This is why when computing with quaternions, you need to normalize vectors as soon as possible to avoid degradation in the long run.

But here, it shouldn't be a problem.

chp · Post by **chp** » Mon Mar 05, 2007 8:16 am

Indeed, float-precision is always a problem, but normalizing the vector after a single cross-product is a slight overkill. If only all math could be in perfect precision. ;)

Raphael · Post by **Raphael** » Mon Mar 05, 2007 9:46 am

chp, I just noticed what you meant with wrong application of the matrices. The original version was supposed to multiplicate the new lookAt matrix into the current M300 matrix. Should just need an additional vmmul.q, no?

Code: Select all

    "vmmul.q M200, M300, M200\n" 
    "vmmul.q M300, M100, M200\n"

BTW: chp, is there a specific reason as to why sceGuLookAt/Perspective/Ortho multiplicate the result into the current matrix on stack, rather than replace it? I think OpenGL does that too, but I just can't seem to find any use in it (especially since we have three status matrices on PSP). I know they aren't time critical functions so it shouldn't matter, but I was just wondering.

hlide · Post by **hlide** » Mon Mar 05, 2007 4:14 pm

Raphael wrote:chp, I just noticed what you meant with wrong application of the matrices. The original version was supposed to multiplicate the new lookAt matrix into the current M300 matrix. Should just need an additional vmmul.q, no?
Code: Select all
    "vmmul.q M200, M300, M200\n" 
    "vmmul.q M300, M100, M200\n"
BTW: chp, is there a specific reason as to why sceGuLookAt/Perspective/Ortho multiplicate the result into the current matrix on stack, rather than replace it? I think OpenGL does that too, but I just can't seem to find any use in it (especially since we have three status matrices on PSP). I know they aren't time critical functions so it shouldn't matter, but I was just wondering.

huh, i'm not sure "vmmul.q *M200*, M300, *M200*\n" would work because if i'm not wrong MD must be different from MS and MT (you cannot do a matrix product inplace).

And by the way, if you look at sceGumLookAt more thoroughly (which calls gumLookAt), it happens that a matrix identity always loaded in M300 before so it is like you replace it at the end indeed, so I think you dont need to have two multiplications here.

chp · Post by **chp** » Mon Mar 05, 2007 6:29 pm

It is not unthinkable that someone would like to do more than one projection-step, and I don't want to be the one telling them "no, because we didn't want to allow that flexibility". :)

And hlide, if you look more closely at the sceGumLookAt()-code in pspgum_vfpu.c, it loads the resulting matrix into registers and multiplies with the current one. gumLoadIdentity() does not load into vfpu-registers, it's writing directly to memory (sceGumLoadIdentity() would touch the registers).

The approach I'll look at rewriting it will make it use only one vmmul.q anyway, it's all about ordering and the vfpu is flexible.

Raphael · Post by **Raphael** » Mon Mar 05, 2007 7:59 pm

That's what I thought :) thanks for the answer