VFPU Rendering engine (for fragment processing)
For those interested in seeing some C++ and VFPU code, here is a per-pixel (fragment) processing rendering engine for the PSP. (I used PSPSDK, Eclipse & Cygwin as my environment.)
Basically it has a scanline renderer that supports different kinds of interpolation on a per-pixel basis. It is not aimed at game development as such, but more at exploring different kinds of rendering/shading on the PSP.
Included in the engine:
- a simple scenegraph
- OBJ loader
- scanline renderer (this is where most of the bugs seem to be.. :) )
- Phong, Gouraud and Quadratic shaders implemented
- VFPU used for the rendering pipeline transforms, occlusion culling, and lighting calculations
- Simple multithreading for UI
- UI (rotate & zoom)
Current Issues:
- the scanline algorithm produces ugly flickering.. maybe because of writing directly to memory... (maybe someone has an idea about this?)
- the scanline renderer does not draw the rightmost 32 pixels...hmmm.. maybe related to the above.
- And I believe the scanline renderer could be faster as well!!!
- The normals are not drawn correctly at the moment...
- The code could be cleaned up more, I guess
Todo (I'm not sure if I have the time anymore though...):
- Fix the above bugs
- File menu for loading objects
- textures, cubemaps, hdr, blooming, simple modelling capabilities, etc, etc... :)
Usage:
0. copy the eboot AND the data folder + contents to your PSP
1. start the program and press X when the model has been loaded
2. Use the analogue stick to rotate, L & R to zoom, and [] & /\ to change the active renderer.
Download:
*UPDATED 2007-07-08 (fixed phong shading as discussed below)
*UPDATED 2007-07-10 (optimized vfpu stuff as discussed below)
http://www.etukan.net/psp/rtEngine.zip (1.75MB)
The thanks go to:
ps2dev.org, hlide (VFPU), tyranid (PSPLINK!), torakak (Eclipse), skistovel (PSPGU), rapso (finding the Phong bug)
an example with a model of 5000 polys and 3842 vertices
Last edited by onne on Tue Jul 10, 2007 9:52 am, edited 2 times in total.
help for building custom shaders
I got a question about creating custom shaders in the DX or GLSL style, using vertex and pixel/fragment shading programs... So I thought others might be interested as well.
-----------------------------
As you know, there is no real shading language on the PSP, so we have to do things ourselves. That was the exact purpose of the engine I wrote... So,
* in the Gouraud shader:
-- the normals are only used on a per-vertex basis
-- the color computed by the "(vertex) shader program" (and the pixel coordinates, of course) is interpolated per fragment.
* in the Phong shader:
-- the normals are interpolated on a per-fragment basis
-- the color is calculated per fragment (using the "(fragment) shader program")
* in the Quadratic shader the formulation is more complex:
-- the normals are interpolated once, at the edge midpoints
-- the color is calculated per fragment using a quadratic formulation
In other words, in the Gouraud case the shader program ( CalculateLightVFPU() ) is called per vertex, whereas in the Phong case it is called per fragment.
Following that pattern one can implement any shader... although the cost of interpolation is quite high, as you can see. But for smaller models (<1000) the results can both look very nice and run in real time.
And of course it is possible to use shaders on only some objects instead of all of them!
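To make the per-vertex vs per-fragment difference concrete, here is a minimal plain C++ sketch of one scanline span. It is not the engine's code: DiffuseShader is a hypothetical stand-in for CalculateLightVFPU() (a bare N.L term with a fixed light), and the span ends are assumed to already carry the normals that the edge interpolation would produce.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

Vec3 lerp(const Vec3& a, const Vec3& b, float t) {
    return Vec3{ a.x + (b.x - a.x) * t, a.y + (b.y - a.y) * t, a.z + (b.z - a.z) * t };
}

Vec3 normalized(const Vec3& v) {
    const float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    assert(len > 0.0f);
    return Vec3{ v.x / len, v.y / len, v.z / len };
}

// Hypothetical "shader program": diffuse term N.L with a fixed, unit-length
// light direction. It counts how often it is called.
struct DiffuseShader {
    Vec3 light{ 0.0f, 0.0f, 1.0f };
    int calls = 0;
    float shade(const Vec3& n) {
        ++calls;
        const float d = n.x * light.x + n.y * light.y + n.z * light.z;
        return d > 0.0f ? d : 0.0f;
    }
};

// Gouraud: run the shader at the two span ends, then interpolate the color.
std::vector<float> gouraudSpan(DiffuseShader& s, const Vec3& n0, const Vec3& n1,
                               std::size_t width) {
    assert(width >= 2);
    const float c0 = s.shade(n0);
    const float c1 = s.shade(n1);
    std::vector<float> out(width);
    for (std::size_t i = 0; i < width; ++i) {
        const float t = static_cast<float>(i) / static_cast<float>(width - 1);
        out[i] = c0 + (c1 - c0) * t;
    }
    return out;
}

// Phong: interpolate the normal per fragment, renormalize it, then run the shader.
std::vector<float> phongSpan(DiffuseShader& s, const Vec3& n0, const Vec3& n1,
                             std::size_t width) {
    assert(width >= 2);
    std::vector<float> out(width);
    for (std::size_t i = 0; i < width; ++i) {
        const float t = static_cast<float>(i) / static_cast<float>(width - 1);
        out[i] = s.shade(normalized(lerp(n0, n1, t)));
    }
    return out;
}
```

For a span of N pixels the Gouraud path calls the shader twice and the Phong path calls it N times, which is where the interpolation cost mentioned above comes from.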
thanks rapso... Yes, maybe the normalization is in a bit of a weird place, but it is there. In the code the light vector is static and normalized, and so is the view vector...
So the only thing that needs normalization is the normals. They are normalized when the appropriate transform is applied in ApplyNormalMatrix().
Although, as you said, in theory the normals should also be normalized when interpolating them in Phong shading... hmm. As the interpolation is linear between normals of length 1... I'm not sure how much the additional normalization would improve the accuracy... (at least the performance decrease is significant). I will see to it soon.. :)
Code:
//...
// pow2
"vmul.t R600, R500, R500\n"
"vmul.t R601, R501, R501\n"
"vmul.t R602, R502, R502\n"
//sum
"vadd.s S600, S600, S610\n"
"vadd.s S600, S600, S620\n"
"vadd.s S601, S601, S611\n"
"vadd.s S601, S601, S621\n"
"vadd.s S602, S602, S612\n"
"vadd.s S602, S602, S622\n"
// 1/sqrt
"vrsq.s S600, S600\n"
"vrsq.s S601, S601\n"
"vrsq.s S602, S602\n"
// normalize to length 1
"vscl.t R500, R500, S600\n"
"vscl.t R501, R501, S601\n"
"vscl.t R502, R502, S602\n"
//...
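For readers who do not read VFPU assembly, here is a scalar C++ reading of what the listing above appears to compute: square the components, sum them, take the reciprocal square root, and scale. It assumes R500..R502 hold three normals (one per row); the function name is made up for illustration and this is not the engine's ApplyNormalMatrix().

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Normalizes `count` 3-component normals in place.
void normalizeNormals(float normals[][3], std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        float* n = normals[i];
        // "pow2" and "sum" steps: x*x + y*y + z*z
        const float sq = n[0] * n[0] + n[1] * n[1] + n[2] * n[2];
        assert(sq > 0.0f);  // a zero-length normal cannot be normalized
        // "1/sqrt" step
        const float inv = 1.0f / std::sqrt(sq);
        // "normalize to length 1" step: scale the whole vector
        n[0] *= inv;
        n[1] *= inv;
        n[2] *= inv;
    }
}
```

Calling it with count = 3 on a float[3][3] mirrors the three rows handled in the asm.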
yes, I've seen the per-vertex normalization, but I'm talking about the interpolated normal (see "//here" in the code below).
the interpolated normal is not unit length anymore, so you need to normalize it first before doing any calculation.
Code:
unsigned long CMyRenderer::CalculateLightVFPU( float* aNormal, const TColorRGBA& aMaterialColor )
{
// printf("initcolor: [%f, %f, %f]\n", aMaterialColor.iR, aMaterialColor.iG, aMaterialColor.iB );
unsigned long pspColor(0);
// float r,g,b,a;
__asm__ volatile (
//Load Normal & color
"ulv.q R600, 0x0(%1)\n" //N
"ulv.q R601, 0x0(%2)\n" //L
"ulv.q R602, 0x0(%3)\n" //V
//here
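A small plain C++ sketch of the point being made: linearly interpolating two unit normals gives a vector shorter than 1, and a specular-style power term amplifies that loss. The names and the shininess value of 16 are illustrative assumptions, not taken from the engine.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

Vec3 lerp(const Vec3& a, const Vec3& b, float t) {
    return Vec3{ a.x + (b.x - a.x) * t, a.y + (b.y - a.y) * t, a.z + (b.z - a.z) * t };
}

float lengthOf(const Vec3& v) {
    return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
}

Vec3 normalized(const Vec3& v) {
    const float len = lengthOf(v);
    assert(len > 0.0f);
    return Vec3{ v.x / len, v.y / len, v.z / len };
}

// Specular-style term: (N.H)^shininess, clamped at zero.
float highlight(const Vec3& n, const Vec3& h, float shininess) {
    const float d = n.x * h.x + n.y * h.y + n.z * h.z;
    return d > 0.0f ? std::pow(d, shininess) : 0.0f;
}
```

Halfway between (1,0,0) and (0,1,0) the interpolated normal has length of about 0.707. With a half vector pointing exactly along it, the un-normalized normal gives a highlight of roughly 0.004 at shininess 16, while the renormalized one gives roughly 1.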
yep, I see it... I'm just thinking how much it improves the accuracy. :)
(..I guess it should help a lot, as the intensity of many highlights decreases when interpolating...)
Actually the phong and gouraud both use the same CalculateLightVFPU() function, so only the phong needs the normalization...
I will check how much it improves the quality and update the archive, thanks for the notice!
-----------------
UPDATED the zip...
yes the quality is better and phong shading works as it should (I added a separate lighting function CalculateLightVFPUPhong() ). thanks.
some operations can be reduced through parallelization:
Code:
"vadd.s S600, S600, S610\n"
"vadd.s S600, S600, S620\n"
"vadd.s S601, S601, S611\n"
"vadd.s S601, S601, S621\n"
"vadd.s S602, S602, S612\n"
"vadd.s S602, S602, S622\n"
vadd.s has a 3-cycle penalty here because of RAW
==>
Code:
"vadd.s S600, S600, S610\n"
"vadd.s S601, S601, S611\n"
"vadd.s S602, S602, S612\n"
"vadd.s S600, S600, S620\n"
"vadd.s S601, S601, S621\n"
"vadd.s S602, S602, S622\n"
==> no RAW now, so it should be 1 cycle per vadd.s
==> but you can still reduce the 6 vadd.s to 2 vadd.t!
Code:
"vadd.t C600, C600, C610\n"
"vadd.t C600, C600, C620\n"
another one:
Code:
"vrsq.s S600, S600\n"
"vrsq.s S601, S601\n"
"vrsq.s S602, S602\n"
==>
Code:
"vrsq.t C600, C600\n"
ah... I forgot to group those. Thanks! Although I did not know about RAW... interesting.
It seems I should upload it to some svn... updating the zip is starting to get to me. heh.
What still bothers me are the artifacts in the scanline. I'm not quite sure where those come from.
Oh, and hlide.. I have this part in the code:
Code:
// Would be better off using these:
//Convert PSP colors to floats
"vuc2i.s R600, S633\n"
"vi2f.q R600, R600, 24\n"
I guess the "vuc2i" command you suggested is not in the current sdk yet, or is it? (I did not update recently)
onne wrote: I did not know about RAW... interesting.
RAW: Read After Write
vfpu operations take several cycles to complete, but when there is no dependency between an instruction and the next one, the next one can start one cycle after the previous one.
here vadd.s needs 3 cycles (latency), but if you make sure that the target register of the first instruction is not used as a source register in the next two instructions, the first instruction appears to execute in one cycle, because the next two instructions can each start one cycle after the previous one (pitch).
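The latency/pitch idea can be played with in a toy model. The sketch below is plain C++ and only an illustration under simple assumptions: in-order issue, one instruction per cycle (pitch 1), and a result that becomes readable `latency` cycles after issue. It does not claim to model the real VFPU pipeline, and the type and function names are made up.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct Instr {
    std::string dst;               // register written
    std::vector<std::string> src;  // registers read
};

// Returns the cycle at which the last result becomes available.
int cyclesToFinish(const std::vector<Instr>& program, int latency) {
    assert(latency >= 1);
    std::map<std::string, int> readyAt;  // register -> cycle its value is ready
    int nextIssue = 0;
    int finish = 0;
    for (const Instr& in : program) {
        int issue = nextIssue;
        // RAW hazard: wait until every source written earlier is ready.
        for (const std::string& s : in.src) {
            const auto it = readyAt.find(s);
            if (it != readyAt.end()) {
                issue = std::max(issue, it->second);
            }
        }
        readyAt[in.dst] = issue + latency;
        finish = std::max(finish, issue + latency);
        nextIssue = issue + 1;  // pitch of one cycle
    }
    return finish;
}
```

Fed with the six vadd.s from earlier in the thread and a latency of 3, this toy model gives 14 cycles for the original order and 8 cycles for the reordered one. The absolute numbers only reflect the assumptions above, but the ratio shows why separating dependent instructions helps.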
onne wrote: Oh, and hlide.. I have this part in the code:
Code:
// Would be better off using these:
//Convert PSP colors to floats
"vuc2i.s R600, S633\n"
"vi2f.q R600, R600, 24\n"
I guess the "vuc2i" command you suggested is not in the current sdk yet, or is it? (I did not update recently)
i'm not quite sure, but i remember Tyranid committing one of my changes, and perhaps this instruction was added then. If not, you can find a post where i give the opcode in hex (.word 0x?????). MrMr[Ice] pointed out some misuse of those instructions in one post; you should find an answer there.
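Until the instruction question is settled, the "convert PSP colors to floats" step can be written in plain C++ as a reference to compare against. This is only a scalar sketch: it assumes the usual 8888 packing with red in the lowest byte and alpha in the highest, it does not model the exact scaling that vuc2i/vi2f perform, and the names are made up.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

struct ColorF { float r, g, b, a; };

// Unpacks a 32-bit color (assumed layout: A in bits 24..31, B, G, R in bits 0..7)
// into four floats in the range [0, 1].
ColorF unpackColor8888(std::uint32_t c) {
    const float k = 1.0f / 255.0f;
    return ColorF{ static_cast<float>(c & 0xFFu) * k,
                   static_cast<float>((c >> 8) & 0xFFu) * k,
                   static_cast<float>((c >> 16) & 0xFFu) * k,
                   static_cast<float>((c >> 24) & 0xFFu) * k };
}

// Converts one float channel back to a byte, clamping to [0, 1] and rounding.
std::uint32_t channelToByte(float v) {
    const float clamped = v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
    return static_cast<std::uint32_t>(clamped * 255.0f + 0.5f);
}

// Packs four floats back into the same assumed 8888 layout.
std::uint32_t packColor8888(const ColorF& c) {
    return channelToByte(c.r) | (channelToByte(c.g) << 8) |
           (channelToByte(c.b) << 16) | (channelToByte(c.a) << 24);
}
```

Comparing the VFPU path against something like this on a few known colors is a cheap way to catch a wrong shift amount or channel order.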