Optimizing the GU (how to get best performances)
I looked into how to get the best performance out of the GU.
Here is what I found:
Textures:
-use indexed textures
-use swizzled textures
-load textures into VRAM
Vertex arrays:
-use vertices of 8 to 12 bytes
-EDIT: DON'T use vertex indices
-use triangle strips
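To put a number on the triangle strip advice: a strip reuses the last two vertices of the previous triangle, so n connected triangles need n + 2 vertices instead of 3 * n. A minimal sketch in plain C (the function names are made up for illustration and are not part of the PSP SDK):

```c
#include <stddef.h>

/* Vertices needed to draw n independent triangles (a triangle list). */
size_t list_vertex_count(size_t n_triangles)
{
    return 3 * n_triangles;
}

/* Vertices needed to draw n connected triangles as one strip.
   Each triangle after the first adds a single new vertex. */
size_t strip_vertex_count(size_t n_triangles)
{
    return n_triangles == 0 ? 0 : n_triangles + 2;
}
```

For 100 connected triangles that is 102 vertices instead of 300, so roughly a third of the vertex data, as long as the mesh can actually be cut into long strips.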
OK, now I want to know what the best vertex format is. I have:
-texture coordinates
-normals
-vertex positions
I need 16 bits for the vertex positions, I don't need more than 8 bits for the normals, and 8-bit texture coordinates should be enough.
That gives: 3*2 + 3*1 + 2*1 = 11 bytes
11 is between 8 and 12, but vertex data must be aligned to 4 bytes, which means I can't use 11-byte vertices.
I think the best thing to do is to use 16 bits for each: 2*3 + 2*3 + 2*2 = 16 bytes
But 16 isn't between 8 and 12.
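The arithmetic above can be checked with plain C structs. This is only a sketch: the struct names are made up, the member order (texture coordinates, then normal, then position) follows the vertex order documented for sceGuDrawArray in pspgu.h, and it assumes the hardware pads a vertex the way a C compiler pads a struct, which has not been verified on a PSP here.

```c
#include <stddef.h>
#include <stdint.h>

/* 8-bit texture coordinates, 8-bit normal, 16-bit position:
   2*1 + 3*1 + 3*2 = 11 bytes of actual data. */
typedef struct {
    int8_t  u, v;
    int8_t  nx, ny, nz;
    int16_t x, y, z;   /* needs 2-byte alignment, so one pad byte precedes it */
} Vertex8_8_16;

/* 16 bits for every component: 2*2 + 3*2 + 3*2 = 16 bytes. */
typedef struct {
    int16_t u, v;
    int16_t nx, ny, nz;
    int16_t x, y, z;
} Vertex16;

/* 8 bits for every component: 2*1 + 3*1 + 3*1 = 8 bytes. */
typedef struct {
    int8_t u, v;
    int8_t nx, ny, nz;
    int8_t x, y, z;
} Vertex8;

/* Round size up to the next multiple of align (align must be a power of two). */
size_t align_up(size_t size, size_t align)
{
    return (size + align - 1) & ~(align - 1);
}
```

On a typical compiler sizeof(Vertex8_8_16) comes out as 12, not 11, because one padding byte is inserted before x. If the padding assumption holds, the mixed format lands at the top of the 8 to 12 byte range instead of being unusable. align_up(11, 4) gives 12 as well.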
I read that I should use double-buffered display lists, but I can't find out how to do this. Can anybody help me, please?
And if you know any optimization I haven't listed, please tell me.
Thanks for replying.
Last edited by skwi on Mon Aug 14, 2006 1:19 am, edited 1 time in total.
Doesn't anybody care?
I found this: http://forums.ps2dev.org/viewtopic.php?t=4703
and this: http://forums.ps2dev.org/viewtopic.php?t=4476
Well, in my experience I could also supply just an 8-bit vertex position structure, which is 3 bytes per vertex and therefore isn't aligned to 4 bytes, and it still worked well. So maybe just try your 11-byte structure and see if it works. If it does, it should already be at best speed, so there's nothing to worry about.
<Don't push the river, it flows.>
http://wordpress.fx-world.org - my devblog
http://wiki.fx-world.org - VFPU documentation wiki
Alexander Berl
Thanks for your answers.
Raphael, sometimes 8 bits can work, and it is 4-byte aligned: 1*3 + 1*3 + 1*2 = 8. But with some big, detailed objects, 256 values isn't enough. I'll run a test, but I think there is no huge difference, according to this:
8 bytes: 13.67mv/s (13.67mv/s)
10 bytes: 13.67mv/s (13.67mv/s)
12 bytes: 13.67mv/s (13.67mv/s)
16 bytes: 13.62mv/s (13.67mv/s)
(from here: http://forums.ps2dev.org/viewtopic.php?t=4703)
weak, is the texture stripped when I use triangle strips?
And do you know anything about double-buffered display lists?
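To see how small the difference in those figures is, here is a quick calculation in plain C (the helper names are made up; the inputs are the numbers quoted above, in million vertices per second):

```c
/* Relative slowdown, in percent, going from rate `fast` to rate `slow`. */
double slowdown_percent(double fast, double slow)
{
    return (fast - slow) / fast * 100.0;
}

/* Vertices available per frame at a given rate (million vertices per second)
   and a given frame rate. */
double vertices_per_frame(double mverts_per_sec, double fps)
{
    return mverts_per_sec * 1e6 / fps;
}
```

slowdown_percent(13.67, 13.62) is about 0.37 %, and vertices_per_frame(13.67, 60.0) is roughly 227,800, so going from 12 to 16 bytes per vertex costs well under 1 % in that benchmark.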
But I don't use any of this.
This is my main loop:
while(running())
{
    sceGuStart(GU_DIRECT,list1);
    if(objetest.etat==0)
    {
        if(chargeobj(&objetest)==0)//load the object
            break;//quit
    }
    gettimeofday(&curr_time,0);
    fcount++;
    if(curr_time.tv_sec!=lastime)//every second
    {
        fps=fcount;
        fcount=0;
    }
    lastime= curr_time.tv_sec;
    pspDebugScreenSetXY(0, 1);
    pspDebugScreenPrintf("test nb x : %x\nfps : %d",objetest.texvram,fps);
    pspDebugScreenSetXY(0, 3);
    pspDebugScreenPrintf("near : %f\n",near);
    sceCtrlReadBufferPositive(&pad, 1);
    if (pad.Buttons & PSP_CTRL_UP){
        x+=-sin(yrot)*.1;
        z+=cos(yrot)*.1;
    }
    if (pad.Buttons & PSP_CTRL_DOWN){
        x-=-sin(yrot)*.1;
        z-=cos(yrot)*.1;
    }
    if (pad.Buttons & PSP_CTRL_LEFT){
    }
    if (pad.Buttons & PSP_CTRL_RIGHT){
    }
    if (pad.Buttons & PSP_CTRL_LTRIGGER){
        near-=.1;
    }
    if (pad.Buttons & PSP_CTRL_RTRIGGER){
        near+=.1;
    }
    if (pad.Buttons & PSP_CTRL_SQUARE){
        xrot+=.1;
    }
    if (pad.Buttons & PSP_CTRL_TRIANGLE){
        y-=.1;
    }
    if (pad.Buttons & PSP_CTRL_CROSS){
        y+=.1;
    }
    if (pad.Buttons & PSP_CTRL_CIRCLE){
        xrot-=.1;
    }
    if (pad.Lx<100){//left
        yrot-=((100.0f-pad.Lx)/1000.0f);
    }
    if (pad.Lx>160){//right
        yrot+=((pad.Lx-160.0f)/1000.0f);
    }
    if (pad.Ly<100){//up
        x+=(-sin(yrot)*(100.0f-pad.Ly)/500.0f);
        z+=(cos(yrot)*(100.0f-pad.Ly)/500.0f);
    }
    if (pad.Ly>160){//down
        x-=(-sin(yrot)*(pad.Ly-160.0f)/500.0f);
        z-=(cos(yrot)*(pad.Ly-160.0f)/500.0f);
    }
    // clear screen
    sceGuClearColor(0xff554433);
    sceGuClearDepth(0);
    sceGuClear(GU_COLOR_BUFFER_BIT|GU_DEPTH_BUFFER_BIT);
    //setup one light
    ScePspFVector3 pos = { 0, 2, 0 };
    sceGuLight(0,GU_DIRECTIONAL,GU_DIFFUSE_AND_SPECULAR,&pos);
    sceGuLightColor(0,GU_DIFFUSE,0x88888888);
    sceGuLightColor(0,GU_SPECULAR,0xFFFFFFFF);
    sceGuLightAtt(0,0.0f,1.0f,0.0f);
    sceGuSpecular(16.0f);
    sceGuAmbient(0x00444444);
    sceGumMatrixMode(GU_PROJECTION);
    sceGumLoadIdentity();
    sceGumPerspective(45,16.0f/9.0f,near,1000.0f);
    sceGumMatrixMode(GU_VIEW);
    sceGumLoadIdentity();
    {
        ScePspFVector3 pos = { x, y, z };
        ScePspFVector3 rot = { xrot, yrot,zrot };
        sceGumRotateXYZ(&rot);
        sceGumTranslate(&pos);
    }
    sceGumMatrixMode(GU_MODEL);
    dessineobj(&objetest);
    sceGuFinish();
    sceGuSync(0,0);
    //sceDisplayWaitVblankStart();
    sceGuSwapBuffers();
}
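On the double-buffered display list question: the loop above fills a single list (list1) and then blocks in sceGuSync(0,0) until the GU has finished it. The general idea of double buffering is to keep two lists, so the CPU can fill one while the GU is still reading the other. The sketch below only models that bookkeeping in plain C. gpu_submit and gpu_wait are hypothetical stand-ins, not pspgu functions, and which GU calls would take their place has not been checked here.

```c
#include <stddef.h>
#include <stdint.h>

#define LIST_WORDS 256

typedef struct {
    uint32_t words[2][LIST_WORDS]; /* two command buffers */
    size_t   used[2];              /* words written into each buffer */
    int      build;                /* buffer the CPU is filling */
    int      in_flight;            /* buffer the GPU is reading, -1 if none */
    int      waits;                /* number of waits so far (for the demo only) */
} DoubleList;

/* Hypothetical stand-in: block until the GPU has finished buffer idx. */
static void gpu_wait(DoubleList *dl, int idx)
{
    (void)idx;
    dl->waits++;
    dl->in_flight = -1;
}

/* Hypothetical stand-in: hand buffer idx to the GPU and return at once. */
static void gpu_submit(DoubleList *dl, int idx)
{
    dl->in_flight = idx;
}

void dl_init(DoubleList *dl)
{
    dl->used[0] = dl->used[1] = 0;
    dl->build = 0;
    dl->in_flight = -1;
    dl->waits = 0;
}

/* Append one command word to the buffer being built; returns 0 if it is full. */
int dl_emit(DoubleList *dl, uint32_t word)
{
    if (dl->used[dl->build] >= LIST_WORDS)
        return 0;
    dl->words[dl->build][dl->used[dl->build]++] = word;
    return 1;
}

/* Finish the frame: wait for the previous list, submit this one, then
   switch to the other buffer, which is idle at that point. */
void dl_end_frame(DoubleList *dl)
{
    if (dl->in_flight >= 0)
        gpu_wait(dl, dl->in_flight);
    gpu_submit(dl, dl->build);
    dl->build ^= 1;
    dl->used[dl->build] = 0;
}
```

With this arrangement the wait for frame N happens at the end of frame N+1, so the CPU work for the next frame overlaps with the GU drawing the current one. The cost is a second list buffer in memory.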