Putting textures in VRAM
Since this should be much faster than keeping them in normal memory, I'm wondering how best to access this memory. Is there an SDK function call to allocate the memory, or should I calculate the amount used for display buffers, depth buffers etc. and then use the rest manually by calculating the offset from the start of VRAM?
Thoughts, ideas?
Use sceGuCopyImage() to upload your images to VRAM. For maximum speed, pre-swizzle them before you upload (the uploading cannot do this like it could on the PS2). There's no allocation scheme, so you'd have to deal with that yourself, but it shouldn't be that hard... Just do a few static allocations for framebuffer & co, then a more dynamic approach for uploading textures.
Textures are swizzled in blocks of 16 bytes by 8 lines, independent of format.
EDIT: you can find a better explanation on how to swizzle here.
GE Dominator
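To make the "16 bytes by 8 lines" layout concrete, here is a minimal CPU-side sketch of pre-swizzling. It only assumes what the post states (blocks of 16 bytes by 8 lines, independent of pixel format) and that blocks are stored left to right, top to bottom. The function name `swizzle_blocks` is made up for this example and is not part of the SDK.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Reorder a linear image into blocks of 16 bytes x 8 lines.
 * width_bytes is the row size in bytes (pixels * bytes per pixel) and is
 * assumed to be a multiple of 16; height is assumed to be a multiple of 8.
 * dst and src must not overlap. Hypothetical helper, not an SDK call.
 */
void swizzle_blocks(uint8_t *dst, const uint8_t *src,
                    size_t width_bytes, size_t height)
{
    size_t blocks_x = width_bytes / 16;
    size_t blocks_y = height / 8;

    for (size_t by = 0; by < blocks_y; ++by) {
        for (size_t bx = 0; bx < blocks_x; ++bx) {
            /* Top-left byte of this block in the linear source image. */
            const uint8_t *block = src + by * 8 * width_bytes + bx * 16;

            /* Emit the block's 8 rows back to back: 8 * 16 = 128 bytes. */
            for (size_t row = 0; row < 8; ++row) {
                memcpy(dst, block + row * width_bytes, 16);
                dst += 16;
            }
        }
    }
}
```

For a 32-bit texture a block covers 4x8 pixels, for a 16-bit texture 8x8 pixels, which is why the block size is given in bytes rather than pixels.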
Hi,
I'm looking to do the same thing... but I don't know much about the framebuffer (addresses/formats/allocation schemes).
Can you point me at any info about the base VRAM pointer, how much I need to leave for the frame buffers/z-buffer, the allocation sizes/formats, and how to set my frame buffer pointers to known addresses?
Thanks!
Also, am I correct in assuming swizzled textures can also work from system mem?
And one more thing:
Is swapping textures in and out of VRAM frequently particularly slow?
I.e., when a texture is used once by one model, for instance, is it worth copying that texture to VRAM before rendering?
Or would the best approach be to figure out which textures will be used the heaviest every frame, leave them VRAM resident, and render everything else from system mem?
On PS2 you can use the PATH3 DMA to upload the next model's texture while the previous model is rendering, so when the next model is ready to draw, its texture has already been uploaded and is ready for immediate use... which makes it all nice and asynchronous.
Obviously the PS2 couldn't use textures from system memory though :P
How do you go about this on PSP? And how much slower is rendering with textures in system memory?
[quote="turkeyman"]
I'm looking to do the same thing... But i dont know much about the framebuffer.. (addresses/formats/allocation schemes)
Can you point me at any info about the base vram pointer, how much i need to leave for the frame buffers/zbuffer and allocation sizes/formats, and how to set my frame buffer pointers to known addresses?
[/quote]
Maybe you want to take a look in the pspgl vidmem code, that's basically doing all you need. Also check the EGL functions calling this code to see how it is used: http://svn.ps2dev.org/filedetails.php?r ... rev=0&sc=0
Holger
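For the "more dynamic approach" to texture memory, a very small bump allocator over VRAM offsets may already be enough. The sketch below is not the pspgl vidmem code; it is a simplified stand-in. The `VramArena` type, the function names and the 16-byte alignment are all assumptions made for this example.

```c
#include <stddef.h>

/* Returned when the arena cannot satisfy a request. */
#define VRAM_ALLOC_FAILED ((size_t)-1)

/* Hypothetical arena over a range of VRAM offsets [start, end). */
typedef struct {
    size_t start; /* first offset available for textures */
    size_t next;  /* next free offset                    */
    size_t end;   /* one past the last usable offset     */
} VramArena;

void vram_arena_init(VramArena *a, size_t start, size_t end)
{
    a->start = start;
    a->next = start;
    a->end = end;
}

/* Hand out 'bytes' bytes, aligned to 16 bytes (alignment is an assumption). */
size_t vram_arena_alloc(VramArena *a, size_t bytes)
{
    size_t aligned = (a->next + 15) & ~(size_t)15;

    if (aligned > a->end || bytes > a->end - aligned)
        return VRAM_ALLOC_FAILED;

    a->next = aligned + bytes;
    return aligned;
}

/* Drop every texture at once, e.g. on a level change. */
void vram_arena_reset(VramArena *a)
{
    a->next = a->start;
}
```

The returned offsets are relative to the start of VRAM; add them to the pointer from sceGeEdramGetAddr() to get an address you can upload to. A bump allocator cannot free individual textures, so it suits "load a set, use it, throw it all away" patterns rather than per-texture eviction.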
[quote="turkeyman"]
Hi,
I'm looking to do the same thing... But i dont know much about the framebuffer.. (addresses/formats/allocation schemes)
Can you point me at any info about the base vram pointer, how much i need to leave for the frame buffers/zbuffer and allocation sizes/formats, and how to set my frame buffer pointers to known addresses?
[/quote]
The VRAM base pointer is acquired using sceGeEdramGetAddr(). Since the framebuffer is 512 pixels wide (of which you render to 480), you multiply that by 272 and by the pixel size (2 for 16-bit, 4 for 32-bit) to get its size in bytes. The Z-buffer seems to always be 16-bit, so it's 512*272*2 for that one. Look at sceGuDispBuffer() and sceGuDepthBuffer() to set your pointers explicitly.
[quote="turkeyman"]
Also, am i correct in assuming swizzled textures can also work from system mem?
[/quote]
Yes. Swizzling is not tied to VRAM; you enable swizzling when you set the texture.
[quote="turkeyman"]
And one more thing,
Is swapping textures in and out of vram frequently particularly slow?
ie, when a texture is used once by one model for instance, is it worth copying that texture to vram before rendering...
Or would the best approach be to figure out which textures will be used the heaviest every frame, and just leave them vram resident and render everything else from system mem?
On PS2 you can use the path3 DMA to upload the next model's texture while the previous model is rendering, therefore when the next model is ready to draw, its texture has already been uploaded and ready for immediate use... which makes it all nice and asynchronous...
Obviously the PS2 couldnt use textures from system memory though :P
How do you go about this on PSP? .. how much slower is rendering with textures in system memory?
[/quote]
I have not verified this, but sceGuCopyImage() might work in parallel with rendering, so you might (with emphasis on 'might', because this is just theory) be able to upload your next texture while rendering with the previous one.
With textures in system RAM you get 50MB/s throughput, and with VRAM you get 500MB/s. Fast enough for you? :) Copying a texture from main RAM to VRAM can be done at 150MB/s using sceGuCopyImage(), so it's almost always a win to upload your texture before rendering. It all depends on how much system RAM access you will do. If, for example, you render an entire scene using a 64x32 16-bit texture, there will hardly be any difference, because it all fits in the texture page-cache: after the first triangle, the rest will not make any system-memory requests.
GE Dominator
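The buffer arithmetic above can be turned into a static layout. This is a sketch under stated assumptions: two colour buffers (draw and display) of the same format, one 16-bit depth buffer, all packed from offset 0, and 2 MB of VRAM (the 0x00200000 figure quoted further down the thread). `VramLayout` and `vram_layout` are names invented for this example.

```c
#include <stddef.h>

enum {
    BUF_WIDTH  = 512, /* allocated line width in pixels (480 visible) */
    SCR_HEIGHT = 272
};

/* Assumed total VRAM size: 2 MB. */
#define VRAM_SIZE ((size_t)0x00200000)

/* All offsets are relative to the VRAM base pointer. */
typedef struct {
    size_t draw_offset;    /* back buffer                       */
    size_t disp_offset;    /* front buffer                      */
    size_t depth_offset;   /* 16-bit depth buffer               */
    size_t texture_offset; /* first byte left over for textures */
    size_t texture_bytes;  /* bytes left over for textures      */
} VramLayout;

VramLayout vram_layout(size_t bytes_per_pixel)
{
    size_t frame_bytes = (size_t)BUF_WIDTH * SCR_HEIGHT * bytes_per_pixel;
    size_t depth_bytes = (size_t)BUF_WIDTH * SCR_HEIGHT * 2;
    VramLayout l;

    l.draw_offset    = 0;
    l.disp_offset    = frame_bytes;
    l.depth_offset   = 2 * frame_bytes;
    l.texture_offset = l.depth_offset + depth_bytes;
    l.texture_bytes  = VRAM_SIZE - l.texture_offset;
    return l;
}
```

With 32-bit colour this gives offsets 0, 0x88000 and 0x110000 for draw, display and depth, leaving 704512 bytes from 0x154000 onwards for textures. With 16-bit colour the texture area starts at 0xCC000 and is 1261568 bytes. These offsets are what you would hand to sceGuDrawBuffer(), sceGuDispBuffer() and sceGuDepthBuffer().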
awesome, thanks heaps for the info guys!
I was just reading through the headers; the comments are a little confusing, and I have a couple more questions :P
You made no mention of allocation for the second frame buffer in your last post. I'll just assume that where you described the framebuffer, you meant to multiply that by 2 for the front and back buffer...
These 2 functions:
sceGuDispBuffer
sceGuDrawBuffer
I *guess* that one sets the front buffer, and the other sets the back buffer/render target?
The reason I'm not confident in my guess is that the two functions take very different parameters...
sceGuDrawBuffer only takes a pixel format, which I suppose would imply the render target is the native res of the screen? Nah, that doesn't really make sense... what if you want to render to a smaller viewport, or to a dynamic texture?
sceGuDispBuffer takes the width and height, but no pixel format?
That doesn't really make sense. Do the 2 functions need to be used together to set the render target? If that's the case, why does it need to take 2 different VRAM pointers? And how do you set the front buffer? Or is the front buffer always at VRAM offset 0, with the back buffer copied into the front buffer instead of the buffers simply being swapped?
The thing I was wondering is whether it's possible to allocate the back buffer and front buffer in different formats, for example the front buffer at 565 and the back buffer at 8888 (allowing a destination alpha while rendering, but not wasting the extra memory on the front buffer as well).
Another thing:
You mention that the depth buffer appears to 'always' be 16-bit.
sceGuDepthBuffer doesn't take any useful parameters at all: no dimensions (I'll assume it treats it the same as the current back buffer), and the format would be a 16-bit depth value as you said... Where does the stencil buffer live? Is it a subset of the z-buffer? Any less than 16-bit z precision would get pretty nasty pretty quickly, I imagine...
Anyway, I just thought I'd mention the immediate confusion when approaching these matters :)
I'm sure reading through the samples will resolve this confusion, but if anyone who maintains the SDK would like to make the function comments a little more well defined, that would be tops! :P
Thanks again guys!
just another thought on the topic,
Would it be possible to make use of the VRAM that is wasted by the framebuffers for storing textures in any way?
I.e., the frame buffer width is 480 pixels, but we allocate 512 across. This leaves 32*bytesperpixel bytes at the end of each line. Unfortunately, this really isn't enough for any useful sized texture... but if there were a way to set the texture line stride or anything like that (the number of bytes to skip to the next texture line), you could store a bunch of 32xX or 64xX textures in the excess framebuffer space. Obviously the hardware knows about line strides (because the framebuffer uses one itself); whether that tech is available to the texture unit or not is another question...
You could also make use of the space to store 16-colour palette data :P (does this benefit significantly from being stored in VRAM?)
Another question that would break the idea entirely: does the buffer clearing function clear the entire buffer, or just the 480x272 region?
sceGuDispBuffer() sets the display buffer. The inputs it takes are width & height of the display buffer, a pointer to the front-buffer and the buffer width (512 for a 480 display buffer to align up to the nearest power of 2).
sceGuDrawBuffer() sets the pixel format, back-buffer pointer and the buffer-width for the draw-buffer.
sceGuDepthBuffer() sets the pointer to the depth-buffer and the buffer-width.
sceGuDrawBufferList() can be used for render-targets: set the pixel-format, the buffer-pointer & width, then set up sceGuOffset() / sceGuViewport() to render to the proper area. sceGuScissor() might help here as well. Unlike sceGuDrawBuffer(), sceGuDrawBufferList() does not cache its information, so it won't propagate to the display buffer when you swap the buffers. It does however require that you are aware of which buffer you are currently rendering to.
I haven't had time to review the stencil-buffer at all (been way too busy at work), but I think it's inside the alpha-channel. One fact supports this, and it's that when you do a sceGuClear(), you push the stencil-clear value into the high bits of the clear-color. This means you cannot use destination-alpha and stencil at the same time, but that's a small trade-off.
Yes, you can probably store a texture in the 32x272 unused area. I have not tried this, but it might be possible. Actually, looking at the code and what we know of how the rendering works, you might be able to get rid of that area and use it for the full display. This will have to be tested first though, it might just be me being a bit tired. :) It would involve playing with the buffer-width a bit, something I've just taken for granted so far.
GE Dominator
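If the stencil really does live in the alpha bits, as guessed above, the value written by a clear on a 32-bit (8888) buffer could be composed as in the sketch below. The bit layout (stencil in the top 8 bits) is an assumption drawn from that guess, the helper name `clear_value_8888` is invented for this example, and the 16-bit pixel formats are not covered.

```c
#include <stdint.h>

/*
 * Combine a clear colour and a stencil clear value for an 8888 buffer,
 * ASSUMING the stencil occupies the alpha byte (bits 24..31).
 * Whatever alpha the caller passed in 'color' is discarded.
 */
uint32_t clear_value_8888(uint32_t color, uint8_t stencil)
{
    return (color & 0x00FFFFFFu) | ((uint32_t)stencil << 24);
}
```

This also shows the trade-off mentioned above: the alpha byte of the clear colour is overwritten by the stencil value, so the two cannot be used independently.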
[quote="turkeyman"]
ie, frame buffer width is 480 pixels, but we allocate 512 across.. this leaves 32*bytesperpixel bytes at the end of each line, unfortunately, this really isnt enough for any useful sized texture... but, if there were a way to set the texture line stride or anything like that (number of bytes to skip to the next texture line) you could store a bunch of 32xX or 64xX textures in the excess framebuffer space... obviously the hardware knows about line strides (because the framebuffer uses one itself) .. whether that tech is available to the texture unit or not is another question...
[/quote]
When you set the texture pointer, the 2nd register containing the MSBs of the pointer (TBWx in commands.txt) also contains a "width", which I take to be the stride (since the logical texture size is encoded elsewhere). I had been planning on trying to use the strip at the edge of the screen for small textures like glyphs.
I have not tested this yet, however.
[quote="turkeyman"]
another question that would break the idea entirely, does the buffer clearing function clear the entire buffer, or just the 480x272 region?
[/quote]
There is no specific buffer clearing function; it's just drawing normal primitives (typically quads/sprites) with various blending/testing states disabled. What you want to clear is up to you (not sure if it's affected by the scissor rectangle or not).
J
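To picture where such a strip texture would live, here is a small sketch of the addressing. It assumes the "width" field really is a stride in pixels (untested, as said above), so a texture in the strip would use the frame buffer's 512-pixel stride and start 480 pixels into the first line. The helper name is made up for this example.

```c
#include <stddef.h>

enum {
    SCR_WIDTH   = 480,                   /* visible pixels per line   */
    BUF_WIDTH   = 512,                   /* allocated pixels per line */
    STRIP_WIDTH = BUF_WIDTH - SCR_WIDTH  /* 32 unused pixels per line */
};

/*
 * Byte offset, relative to the start of the frame buffer, of texel (x, y)
 * inside the unused strip. x must be below STRIP_WIDTH, y below 272.
 */
size_t strip_texel_offset(size_t x, size_t y, size_t bytes_per_pixel)
{
    return (y * BUF_WIDTH + SCR_WIDTH + x) * bytes_per_pixel;
}
```

The texture base would then be the frame buffer address plus strip_texel_offset(0, 0, bpp), with a buffer width of 512. Swizzling would presumably have to stay off for such a texture, since its rows are not contiguous.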
Last edited by jsgf on Mon Aug 22, 2005 6:34 pm, edited 1 time in total.
[quote="turkeyman"]
just another thought on the topic,
would it be possible to make use of vram that is wasted by the framebuffers for storing textures in any way?
ie, frame buffer width is 480 pixels, but we allocate 512 across.. this leaves 32*bytesperpixel bytes at the end of each line, unfortunately, this really isnt enough for any useful sized texture... but, if there were a way to set the texture line stride or anything like that (number of bytes to skip to the next texture line) you could store a bunch of 32xX or 64xX textures in the excess framebuffer space... obviously the hardware knows about line strides (because the framebuffer uses one its self) .. weather that tech is available to the texture unit or not is another question...
you could also make use of the space to store 16 colour palette data :P (does this benefit significantly from being stored in vram?)
another question that would break the idea entirely, does the buffer clearing function clear the entire buffer, or just the 480x272 region?
[/quote]
How about putting the front-buffer in main RAM and leaving only the back-buffer and the Z-buffer in VRAM? Cough... cough... ;)
Putting palettized textures in VRAM
Does anyone have any idea how this works with a palettized texture? Specifically, I mean how to copy both the texture and the palette to VRAM using sceGuCopyImage? Right now, I'm copying the texture to VRAM before usage, but keeping the palette (CLUT) in main RAM because I don't know how to put it in VRAM with anything other than memcpy.
Re: Putting palettized textures in VRAM
[quote="Loothor"]
Does anyone have any idea how this works with a palettized texture? Specifically, I mean how to copy both the texture and the palette to VRAM using sceGuCopyImage? Right now, I'm copying the texture to VRAM before usage, but keeping the palette (CLUT) in main RAM because I don't know how to put it in VRAM with anything other than memcpy.
[/quote]
using sceGuClutLoad to upload your color
lookup tables is preferred ..... making sure
you are setting the color mode as needed;
sceGuClutMode should be used ;)
10011011 00101010 11010111 10001001 10111010
[quote="Kristof"]
Not sure textures are faster in VRAM
Make a test ....
[/quote]
texture reads from user memory
(base 0x08800000, size 0x01800000)
have a bandwidth of 50MB/s
texture reads from GE memory or VRAM
(base 0x04000000, size 0x00200000)
have a bandwidth of 500MB/s :)
if you have a texture in user memory
it is possible to load that texture to VRAM
at a bandwidth of 150MB/s
that makes VRAM 10x as fast as normal
user memory; i think that is fast enough for any purpose ;)
10011011 00101010 11010111 10001001 10111010
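The three figures quoted in this thread (50, 500 and 150 MB/s) are enough for a rough break-even estimate. The sketch below is only a back-of-the-envelope model: it takes 1 MB as 10^6 bytes, assumes the upload is not overlapped with rendering, and ignores the texture cache, so it overstates the cost of small textures that fit in the cache. The function names are made up for this example.

```c
/* Throughput figures quoted in the thread, in MB/s (1 MB = 10^6 bytes). */
#define SYSRAM_READ_MBPS 50.0
#define VRAM_READ_MBPS   500.0
#define UPLOAD_MBPS      150.0

/* Time in microseconds to read 'bytes' of texture data 'reads' times
 * straight from system RAM. bytes / (MB/s) gives microseconds. */
double sysram_time_us(double bytes, double reads)
{
    return reads * bytes / SYSRAM_READ_MBPS;
}

/* Time in microseconds to upload once, then read 'reads' times from VRAM. */
double vram_time_us(double bytes, double reads)
{
    return bytes / UPLOAD_MBPS + reads * bytes / VRAM_READ_MBPS;
}

/* Number of full reads of the texture at which both paths cost the same. */
double break_even_reads(void)
{
    return (1.0 / UPLOAD_MBPS)
         / (1.0 / SYSRAM_READ_MBPS - 1.0 / VRAM_READ_MBPS);
}
```

Under this model the break-even point is about 0.37 reads, so uploading pays off as soon as the GE fetches a bit more than a third of the texture's bytes, which fits the "almost always a win" remark earlier in the thread.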