freeze psp with asm inline function

gff_cruner
Posts: 11
Joined: Mon May 01, 2006 11:13 am
Location: La Coruna - Spain

Post by gff_cruner »

Hi everyone.
I am new to PSP programming and to inline assembly, and I am new to this forum too :).
I am trying to code a simple little 3D engine. I was writing a simple function to multiply two matrices, but I have some problems with the inline assembly code.
This is my code:

Code: Select all

void multMatrix(float *matrix1, float *matrix2)
{
	__asm__ volatile(
		"lv.q c000,  0 + %0\n"
		"lv.q c010, 16 + %0\n"
		"lv.q c020, 32 + %0\n"
		"lv.q c030, 48 + %0\n"
		"lv.q c100,  0 + %1\n"
		"lv.q c110, 16 + %1\n"
		"lv.q c120, 32 + %1\n"
		"lv.q c130, 48 + %1\n"
		"vmmul.q m200, m000, m100\n"
		"sv.q c200,  0 + %1\n"
		"sv.q c210, 16 + %1\n"
		"sv.q c220, 32 + %1\n"
		"sv.q c230, 48 + %1\n"
		::"m" (matrix1[0]),
		  "m" (matrix2[0]));
}
I don't know what's wrong, but all I get from this code is a frozen PSP.
Maybe my problem is the way I offset the addresses. I don't know.

Can someone help me fix this?

Edited:
I have discovered that whenever I use lv or sv to load and store data, the PSP freezes. But when I use mtv/mfv it works.
More generally, does anybody know how to save/load a 4x4 matrix to/from the VFPU using simple VFPU instructions (without GUM)?

Thanks
starman2049
Posts: 75
Joined: Mon Sep 19, 2005 5:41 am

Post by starman2049 »

lv.q and sv.q require data to be quadword aligned, meaning the starting address of each argument needs to be a multiple of 16 bytes.
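
For reference, a minimal sketch of how the alignment rule above can be checked at run time in plain C before an lv.q/sv.q sequence is reached. The names `is_aligned16` and `check_matrix_args` are hypothetical helpers, not part of any PSP SDK header.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 when p sits on a 16-byte (quadword) boundary, 0 otherwise. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical debug guard: fail loudly in a debug build instead of
   freezing when a misaligned pointer is about to be used with lv.q/sv.q. */
void check_matrix_args(const float *matrix1, const float *matrix2)
{
    assert(is_aligned16(matrix1));
    assert(is_aligned16(matrix2));
}
```

Calling `check_matrix_args(matrix1, matrix2)` at the top of a function like multMatrix would turn the silent freeze into an assertion failure that points at the caller's data.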
gff_cruner
Posts: 11
Joined: Mon May 01, 2006 11:13 am
Location: La Coruna - Spain

Post by gff_cruner »

starman2049 wrote: lv.q and sv.q require data to be quadword aligned, meaning the starting address of each argument needs to be a multiple of 16 bytes.
Thanks starman2049. I didn't realize that. I'll test the code after aligning the matrix data that way. But does this mean that I have to realign the data pointed to by matrix1 and matrix2 before the asm code? So it is not enough to offset the addresses using '+'? (Sorry for this kind of question, I am a real n00b at asm coding)

Thanks a lot for the reply! ;)
starman2049
Posts: 75
Joined: Mon Sep 19, 2005 5:41 am

Post by starman2049 »

You have to make sure that your matrices have 16-byte alignment, meaning that the starting address of the [0][0] element is a multiple of 16 bytes for each matrix.

I define a float matrix as:

Code: Select all

typedef float	FMATRIX[4][4] __attribute__((aligned(16)));
I have sw and hw versions of some of these functions because it is not always possible to make sure a matrix (or, more commonly, a vector) is aligned to 16 bytes. This comes up particularly when working on complex streams such as particle effects, where you might pack arrays so they are exactly a cache line (64 bytes) for cache performance, but then the data might not be aligned inside the array for best VFPU/lq/sq usage. Sometimes it's a tradeoff...
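
A minimal sketch of what a "sw version" could look like, assuming matrices stored as 16 floats in row-major order. The name `fmatrix_mul_sw` is made up for illustration, and the operand order is the ordinary mathematical product; it is not claimed to match the operand order of vmmul.q.

```c
#include <assert.h>

/* Plain C fallback: out = a * b for 4x4 matrices stored as 16 floats in
   row-major order. There is no alignment requirement, so it can be used
   on data that cannot be guaranteed to sit on a 16-byte boundary. */
void fmatrix_mul_sw(float *out, const float *a, const float *b)
{
    /* The result is written while the inputs are still being read,
       so out must not be the same matrix as a or b. */
    assert(out != a && out != b);

    for (int row = 0; row < 4; row++) {
        for (int col = 0; col < 4; col++) {
            float sum = 0.0f;
            for (int k = 0; k < 4; k++)
                sum += a[row * 4 + k] * b[k * 4 + col];
            out[row * 4 + col] = sum;
        }
    }
}
```

A caller could test the pointers for 16-byte alignment and pick this path only when the check fails, keeping the VFPU path for aligned data.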
chp
Posts: 313
Joined: Wed Jun 23, 2004 7:16 am

Post by chp »

You can also use the unaligned versions if you cannot guarantee that something is aligned to a qword boundary (like when you declare a matrix on the stack, since GCC doesn't align stack variables); ulv.q/usv.q exist for this purpose. They are actually macro-instructions for lvl/lvr and svl/svr, but you don't want to spend your time computing offsets every time you use them, so I recommend the macro versions. :) And instead of reinventing the wheel you can look at the code in pspgum_vfpu.c; it should give you good ideas on how to work with the VFPU.
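
Another option, sketched here in plain C as an assumption rather than anything taken from pspgum_vfpu.c, is to copy the unaligned data into 16-byte-aligned scratch storage first and run the aligned loads on the copy. `AlignedMat4` and `stage_matrix` are hypothetical names; given the remark above about stack variables, the scratch buffer is best kept in static or global storage.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scratch type: 16 floats forced onto a 16-byte boundary. */
typedef struct {
    float m[16];
} __attribute__((aligned(16))) AlignedMat4;

/* Copy a 4x4 float matrix from any float pointer into aligned scratch
   storage, so aligned loads can be used on the copy afterwards. */
void stage_matrix(AlignedMat4 *dst, const float *src)
{
    /* Catch a scratch buffer that did not end up aligned after all. */
    assert(((uintptr_t)dst & 15u) == 0);
    memcpy(dst->m, src, sizeof dst->m);
}
```

The copy costs 64 bytes of traffic per matrix, so it only pays off when the staged matrix is reused several times; for one-off loads ulv.q/usv.q are simpler.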
GE Dominator
gff_cruner
Posts: 11
Joined: Mon May 01, 2006 11:13 am
Location: La Coruna - Spain

Post by gff_cruner »

First of all, thank you both. Your replies were very helpful for me ;)

My first try was to align the bytes of the matrix using "__attribute__((aligned(16)))", as starman2049 said, but the function returned strange results (I don't know why).
I followed the advice from chp as well, and after some tests and some reading of pspvfpu.h, psptypes.h and pspgum_vfpu.h, I realized that vfputypes.h includes declarations of matrices built as structs:

Code: Select all

typedef struct ScePspIMatrix4 {
	ScePspIVector4 	x;
	ScePspIVector4 	y;
	ScePspIVector4 	z;
	ScePspIVector4 	w;
} ScePspIMatrix4 __attribute__((aligned(16)));
...which also use the alignment that starman mentioned.

I used this struct (where the vectors are also declared as structs) with this function (thanks to pspgum_vfpu.h code):

Code: Select all

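void aceMultMatrix(ScePspFMatrix4 *m0, ScePspFMatrix4 *m1)
{
	/* Create the VFPU context once and reuse it; a fresh context
	   on every call would never be released. */
	static struct pspvfpu_context *ctx = NULL;
	if (ctx == NULL)
		ctx = pspvfpu_initcontext();
	pspvfpu_use_matrices(ctx, 0, VMAT0 | VMAT1 | VMAT2);

	__asm__ volatile (
		/* %0 is *m1 (read and written), %1 is *m0 (read only) */
		"ulv.q r000, 0+%0\n"
		"ulv.q r001, 16+%0\n"
		"ulv.q r002, 32+%0\n"
		"ulv.q r003, 48+%0\n"

		"ulv.q r100, 0+%1\n"
		"ulv.q r101, 16+%1\n"
		"ulv.q r102, 32+%1\n"
		"ulv.q r103, 48+%1\n"

		"vmmul.q   m200, m000, m100\n"

		/* the product is stored back into *m1 */
		"usv.q r200, 0+%0\n"
		"usv.q r201, 16+%0\n"
		"usv.q r202, 32+%0\n"
		"usv.q r203, 48+%0\n"
		:"+m"(*m1): "m"(*m0));
}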
And finally, IT WORKED ;)

Anyway, it's quite strange to use these matrices as structs, and it's hard to code this way when your matrices are declared as arrays, so I'm looking for a way to do it with arrays. I'll be very thankful if somebody gives me a hint about this.
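
One way to bridge the two representations, sketched under the assumption that the matrix struct is nothing more than four vectors of four floats each, is a straight memcpy between a float[16] array and the struct. `Vec4f` and `Mat4f` below are hypothetical stand-ins for ScePspFVector4 and ScePspFMatrix4 so the example compiles without the PSP headers.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical mirrors of the SDK types: a vector of four floats and a
   matrix of four such vectors, aligned to 16 bytes like the typedef above. */
typedef struct { float x, y, z, w; } Vec4f;
typedef struct { Vec4f x, y, z, w; } Mat4f __attribute__((aligned(16)));

/* Copy 16 consecutive floats into the struct form. */
void mat4f_from_array(Mat4f *dst, const float src[16])
{
    /* The copy is only valid if the struct has no padding. */
    assert(sizeof(Mat4f) == 16 * sizeof(float));
    memcpy(dst, src, sizeof *dst);
}

/* Copy the struct form back into 16 consecutive floats. */
void mat4f_to_array(float dst[16], const Mat4f *src)
{
    assert(sizeof(Mat4f) == 16 * sizeof(float));
    memcpy(dst, src, sizeof *src);
}
```

With helpers like these the rest of the engine can keep its arrays and only convert at the boundary of the VFPU function.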
Rekai
Posts: 5
Joined: Tue Sep 12, 2006 6:14 am

Post by Rekai »

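Well, I have not actually coded it, but you should be able to write a small conversion function (operator= itself can only be overloaded as a class member in C++, so a free function is the simpler route) that looks something like this:

Code: Select all

ScePspFMatrix4 toFMatrix4(const float m[4][4])
{
   /* Copy the array row by row into the struct's x, y, z, w vectors. */
   ScePspFMatrix4 ret = {
      { m[0][0], m[0][1], m[0][2], m[0][3] },
      { m[1][0], m[1][1], m[1][2], m[1][3] },
      { m[2][0], m[2][1], m[2][2], m[2][3] },
      { m[3][0], m[3][1], m[3][2], m[3][3] },
   };
   return ret;
}

That should do, and if it doesn't, try making a macro to switch from one type to the other.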
gff_cruner
Posts: 11
Joined: Mon May 01, 2006 11:13 am
Location: La Coruna - Spain

Post by gff_cruner »

Thanks Rekai, I will use it.