freeze psp with asm inline function

gff_cruner · Post by **gff_cruner** » Mon May 01, 2006 11:29 am

Hi everyone.
I am new to PSP programming and to inline assembly coding, and I am new to this forum too :).
I am trying to code a simple little 3D engine. I was writing a simple function to multiply two matrices, but I have some problems with the inline assembly code.
This is my code:

Code: Select all

void multMatrix(float *matrix1, float *matrix2)
{
	__asm__ volatile(
		"lv.q c000,  0 + %0\n"
		"lv.q c010, 16 + %0\n"
		"lv.q c020, 32 + %0\n"
		"lv.q c030, 48 + %0\n"
		"lv.q c100,  0 + %1\n"
		"lv.q c110, 16 + %1\n"
		"lv.q c120, 32 + %1\n"
		"lv.q c130, 48 + %1\n"
		"vmmul.q m200, m000, m100\n"
		"sv.q c200,  0 + %1\n"
		"sv.q c210, 16 + %1\n"
		"sv.q c220, 32 + %1\n"
		"sv.q c230, 48 + %1\n"
		::"m" (matrix1[0]),
		  "m" (matrix2[0]));
}

I don't know what's wrong, but all I get with this code is a frozen PSP.
Maybe my problem is with shifting the registers. I don't know.

Can someone help me fix this?

Edited:
I have discovered that whenever I use lv or sv to load and store data, the PSP freezes. But when I use mtv/mfv it works.
More generally, does anybody know how to save/load a 4x4 matrix to/from the VFPU using simple VFPU instructions (without GUM)?

Thanks

starman2049 · Post by **starman2049** » Thu May 04, 2006 8:15 am

lv.q and sv.q require data to be quadword aligned, meaning the starting address of the arguments needs to be a multiple of 16 bytes.
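
A minimal sketch of how that requirement could be checked in plain C before calling an lv.q/sv.q routine. The helper names `is_aligned16` and `check_matrix_args` are hypothetical and not part of the PSPSDK; the check only looks at the low four bits of the address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if p is a multiple of 16 bytes, else 0. */
int is_aligned16(const void *p)
{
	return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical debug guard: aborts (in builds without NDEBUG) when either
 * matrix pointer is not quadword aligned. */
void check_matrix_args(const float *matrix1, const float *matrix2)
{
	assert(is_aligned16(matrix1));
	assert(is_aligned16(matrix2));
}
```

Calling `check_matrix_args(matrix1, matrix2)` at the top of a function like `multMatrix` would turn a silent freeze into an assertion failure in a debug build, assuming the build has assertions enabled.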

gff_cruner · Post by **gff_cruner** » Thu May 04, 2006 9:05 am

starman2049 wrote: lv.q and sv.q require data to be quadword aligned, meaning the starting address of the arguments needs to be a multiple of 16 bytes.

Thanks starman2049. I didn't realize that. I'll test the code after aligning the bytes that make up the matrix that way. But does this mean that I have to realign the data pointed to by matrix1 and matrix2 before the asm code? So it is not enough to shift the registers using '+'? (Sorry for this kind of question, I am really a n00b at asm coding.)

Thanks a lot for the reply! ;)

starman2049 · Post by **starman2049** » Thu May 04, 2006 9:53 am

You have to make sure that your matrices have 16-byte alignment, meaning that the starting address of the [0][0] element is a multiple of 16 bytes for each matrix.

I define a float matrix as:

Code: Select all

typedef float	FMATRIX[4][4] __attribute__((aligned(16)));

I have sw and hw versions of some of these functions because it is not always possible to make sure a matrix (or more commonly a vector) is aligned to 16 bytes, particularly when working on complex streams such as particle effects, where you might pack arrays so they are exactly a cache line (64 bytes) for cache performance, but then data might not be aligned inside the array for best VFPU/lq/sq usage. Sometimes it's a tradeoff...
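
A rough sketch of what the software side of such a sw/hw pair might look like, together with the alignment test that would decide between the two. The names `mat4_mul_sw` and `mat4_all_aligned16` are hypothetical, the matrices are assumed to be 16 contiguous floats in row-major order, and the VFPU version is left out because it only builds for the PSP.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical plain-C fallback: out = a * b for 4x4 row-major matrices.
 * The result is built in a temporary, so out may alias a or b. */
void mat4_mul_sw(float *out, const float *a, const float *b)
{
	float tmp[16];

	assert(out != 0 && a != 0 && b != 0);
	for (int r = 0; r < 4; r++) {
		for (int c = 0; c < 4; c++) {
			float sum = 0.0f;
			for (int k = 0; k < 4; k++)
				sum += a[r * 4 + k] * b[k * 4 + c];
			tmp[r * 4 + c] = sum;
		}
	}
	memcpy(out, tmp, sizeof tmp);
}

/* Hypothetical test: returns 1 only if all three pointers are 16-byte
 * aligned, i.e. when an lv.q/sv.q based version would be safe to call. */
int mat4_all_aligned16(const void *a, const void *b, const void *out)
{
	return (((uintptr_t)a | (uintptr_t)b | (uintptr_t)out) & 15u) == 0;
}
```

A caller would check `mat4_all_aligned16` first and fall back to `mat4_mul_sw` when it returns 0. Whether the extra branch pays off depends on how often the data really is unaligned.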

chp · Post by **chp** » Thu May 04, 2006 9:52 pm

You can also use the unaligned versions if you cannot guarantee that something is aligned to a qword boundary (like when you declare a matrix on the stack, since GCC doesn't align stack variables); ulv.q/usv.q exist for this purpose. They are actually macro-instructions for lvl/lvr and svl/svr, but you don't want to spend your time computing offsets every time you use them, so I recommend the macro versions. :) And instead of reinventing the wheel you can look at the code in pspgum_vfpu.c; it should give good ideas on how to work with the VFPU.
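
For comparison, here is a different technique from ulv.q/usv.q that works in plain C: staging the data through a 16-byte aligned scratch buffer with `memcpy`, so that an aligned-only routine can then be used on the copy. This is only a sketch. `AlignedMat4`, `load_unaligned` and `store_unaligned` are hypothetical names, and the extra copy has a cost that the unaligned instructions avoid.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scratch type: 16 floats forced onto a 16-byte boundary. */
typedef struct {
	float m[16];
} __attribute__((aligned(16))) AlignedMat4;

/* Copy 16 floats from a possibly unaligned source into aligned storage. */
void load_unaligned(AlignedMat4 *dst, const void *src)
{
	assert(((uintptr_t)dst & 15u) == 0);
	memcpy(dst->m, src, sizeof dst->m);
}

/* Copy the aligned scratch matrix back to a possibly unaligned destination. */
void store_unaligned(void *dst, const AlignedMat4 *src)
{
	memcpy(dst, src->m, sizeof src->m);
}
```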

gff_cruner · Post by **gff_cruner** » Mon May 08, 2006 5:13 am

First of all, thank you both. Your replies were very helpful for me ;)

My first try was to align the bytes of the matrix using "__attribute__((aligned(16)))", as starman2049 said, but the function returned strange results (I don't know why).
I followed the advice from chp as well, and after some tests and some reading of pspvfpu.h, psptypes.h and pspgum_vfpu.h, I realized that the vfputypes.h includes declarations of matrices built as structs:

Code: Select all

typedef struct ScePspIMatrix4 {
	ScePspIVector4 	x;
	ScePspIVector4 	y;
	ScePspIVector4 	z;
	ScePspIVector4 	w;
} ScePspIMatrix4 __attribute__((aligned(16)));

...which also use the alignment that starman mentioned.

I used this struct (where the vectors are also declared as structs) with this function (thanks to pspgum_vfpu.h code):

Code: Select all

void aceMultMatrix(ScePspFMatrix4 *m0, ScePspFMatrix4 *m1)
{
	pspvfpu_use_matrices(pspvfpu_initcontext(), 0, VMAT0 | VMAT1 |VMAT2);

	/* Operand %0 is *m1 (read and written); operand %1 is *m0 (read only). */
	__asm__ volatile (
		/* unaligned loads of *m1 into matrix 0 */
		"ulv.q r000, 0+%0\n"
		"ulv.q r001, 16+%0\n"
		"ulv.q r002, 32+%0\n"
		"ulv.q r003, 48+%0\n"

		/* unaligned loads of *m0 into matrix 1 */
		"ulv.q r100, 0+%1\n"
		"ulv.q r101, 16+%1\n"
		"ulv.q r102, 32+%1\n"
		"ulv.q r103, 48+%1\n"

		"vmmul.q   m200, m000, m100\n"

		/* unaligned stores of the result (matrix 2) back into *m1 */
		"usv.q r200, 0+%0\n"
		"usv.q r201, 16+%0\n"
		"usv.q r202, 32+%0\n"
		"usv.q r203, 48+%0\n"
		:"+m"(*m1): "m"(*m0));
}

And finally, IT WORKED ;)

Anyway, it's quite strange using these matrices as structs, and it's hard to code this way when your matrices are declared as arrays, so I'm looking for a way to do it with arrays. I'd be very thankful if somebody could give me some kind of hint about this.

Rekai · Post by **Rekai** » Fri Sep 29, 2006 5:49 am

Well, I have not actually coded it, but you should be able to overload the = operator to look something like this:

Code: Select all

ScePspFMatrix4 operator=(float** m){
   ScePspFMatrix4 ret={{m[0][0],m[0][1],m[0][2],m[0][3],},
          {m[1][0],m[1][1],m[1][2],m[1][3],},
          {m[2][0],m[2][1],m[2][2],m[2][3],},
          {m[3][0],m[3][1],m[3][2],m[3][3],},};
   return ret;
}

That should do it, and if it doesn't, try making a macro to switch from one type to the other.
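
One caveat on the idea above: C++ only allows `operator=` as a member function, and a `float[4][4]` array does not convert to `float**`, so a pair of ordinary conversion functions is probably the simpler route. Below is a minimal sketch in C. `Vec4f` and `Mat4f` are stand-in types assumed to mirror the field layout of the PSPSDK's ScePspFVector4 and ScePspFMatrix4, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the PSPSDK types (assumed layout: four floats x, y, z, w
 * per vector, four vectors x, y, z, w per matrix, 16-byte aligned). */
typedef struct { float x, y, z, w; } Vec4f;
typedef struct { Vec4f x, y, z, w; } __attribute__((aligned(16))) Mat4f;

/* Build an aligned struct matrix from 16 contiguous floats (row-major). */
Mat4f mat4f_from_array(const float *m)
{
	assert(m != NULL);
	Mat4f r = {
		{ m[0],  m[1],  m[2],  m[3]  },
		{ m[4],  m[5],  m[6],  m[7]  },
		{ m[8],  m[9],  m[10], m[11] },
		{ m[12], m[13], m[14], m[15] }
	};
	return r;
}

/* Write a struct matrix back into 16 contiguous floats (row-major). */
void mat4f_to_array(float *m, const Mat4f *src)
{
	const Vec4f *rows[4] = { &src->x, &src->y, &src->z, &src->w };

	assert(m != NULL);
	for (int r = 0; r < 4; r++) {
		m[r * 4 + 0] = rows[r]->x;
		m[r * 4 + 1] = rows[r]->y;
		m[r * 4 + 2] = rows[r]->z;
		m[r * 4 + 3] = rows[r]->w;
	}
}
```

With a matrix declared as `float a[4][4]`, the call would be `mat4f_from_array(&a[0][0])`. The source array does not need to be aligned, since the elements are copied one float at a time into the aligned struct.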

gff_cruner · Post by **gff_cruner** » Fri Oct 06, 2006 1:40 am

Thanks Rekai, I will use it.