Slow PNG encode? Memory stick issue?
Hi,
I am new to homebrewing on the PSP. I have written code that takes a snapshot of the current display and saves it as a PNG.
I have compiled libpng 1.2.8 and zlib 1.2.3 without any modification to the source code.
When I take a snapshot, it takes approx. 10 seconds to finish the job. I thought the PSP was faster than this.
I believe access to the memory stick is slowing things down. I did some debugging by tracing to a file each time a scan line gets processed by libpng. The log function is simple:
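void Log(const char *s)
{
    /* Reopen in append mode on every call so each message reaches the file. */
    FILE *fp = fopen("trace.txt", "a");
    if (fp == NULL)
        return; /* the log file could not be opened, so skip this message */
    fputs(s, fp);
    fclose(fp);
}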
The encoding process takes 2 and a half minutes when the logging is turned on!!
Is memory stick really that slow? Any suggestions how to speed up the encoding process?
Thanx
It is quite slow, but not usually that slow. What seems to hurt the speed the most is doing lots of small individual writes - especially if you close the file in between.
Tracing by closing and reopening the file is painfully slow - unfortunately it seems to be necessary if you want to be sure the file is flushed, in case of a crash.
To speed up your app, you could try writing the PNG file data to memory first, rather than to file. Assuming that you're using libpng, this is fairly easy to do: there's a function called png_set_write_fn() that lets you register your own implementation of the writing function, which can write to memory rather than to disk.
You might also look into the async file I/O functions - e.g. sceIoWriteAsync, as these let your program continue execution before the I/O is complete.
Finally, you could try doing stuff like disabling interrupts, or copying the screen to a buffer before encoding it, to try to reduce any visual glitches from screen updates before you finish encoding.
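The write-to-memory suggestion above can be sketched in plain C. This is only a minimal sketch under assumptions, not libpng's own code: MemSink, mem_sink_append() and mem_sink_free() are hypothetical names made up for illustration. With libpng, a small callback registered through png_set_write_fn() would fetch the sink with png_get_io_ptr() and forward each chunk to mem_sink_append(); the finished buffer could then go to the memory stick in one large write.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer that collects encoder output in RAM. */
typedef struct {
    unsigned char *data;   /* heap storage, NULL until the first append */
    size_t size;           /* bytes currently used */
    size_t capacity;       /* bytes currently allocated */
} MemSink;

/* Append len bytes to the sink. Returns 0 on success, -1 if out of memory. */
int mem_sink_append(MemSink *sink, const void *bytes, size_t len)
{
    assert(sink != NULL);
    if (len == 0)
        return 0;
    if (len > sink->capacity - sink->size) {
        /* Grow geometrically so that many small appends stay cheap. */
        size_t new_cap = sink->capacity ? sink->capacity : 4096;
        unsigned char *p;
        while (new_cap - sink->size < len)
            new_cap *= 2;
        p = realloc(sink->data, new_cap);
        if (p == NULL)
            return -1;
        sink->data = p;
        sink->capacity = new_cap;
    }
    memcpy(sink->data + sink->size, bytes, len);
    sink->size += len;
    return 0;
}

/* Release the storage and reset the sink to its empty state. */
void mem_sink_free(MemSink *sink)
{
    assert(sink != NULL);
    free(sink->data);
    sink->data = NULL;
    sink->size = 0;
    sink->capacity = 0;
}
```

A caller would start from a zero-initialised MemSink, let the encoder append to it, write sink.data (sink.size bytes) out in a single call, and then call mem_sink_free().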
Got a v2.0-v2.80 firmware PSP? Download the eLoader here to run homebrew on it!
The PSP Homebrew Database needs you!
Fanjita,
Thanks for your tips.
Too bad logging like this is so slow, it makes logging useless, because I want to know what was the last thing that was executed before crash, not what was the last thing that was unbuffered from stream. :(
I'll try writing the PNG file to memory first and then write it to the stream in one go.
Bulb
- nullp01nter
- Posts: 26
- Joined: Wed Jan 04, 2006 7:40 am
- Location: Saxony/Germany
@bulb:
You may want to try PSPLINK to have proper logging support. You either have to set up a WiFi network for this or get (or build) a SIO to PC converter. You can then use printf() to print logging statements to stdout and they will appear on your serial console or telnet. Have a look here: http://forums.ps2dev.org/viewtopic.php?t=3834
Thoralt
Last edited by nullp01nter on Fri Jan 20, 2006 9:04 am, edited 1 time in total.
And psplink has an inbuilt screenshot command (admittedly in .bmp only, but that is what graphics converters are for :P).
At any rate, it will be because you are writing in small chunks: the kernel's IO functions do almost zero buffering. It takes maybe a second or so to write out an uncompressed bitmap (nearly 400k) from psplink, which builds it in memory first and fires off the entire lot to the memory stick.
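If holding the whole file in memory is not an option, the same "few large writes instead of many small ones" idea can be approximated with a fixed-size staging buffer. The following is a rough sketch only: BlockWriter, RawWriteFn, counting_write() and the 64 KB CHUNK_SIZE are all assumptions made for illustration, and on the PSP the raw write function would presumably be a thin wrapper around sceIoWrite().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK_SIZE 65536   /* assumed block size; tune it by measurement */

/* Low-level write function: returns 0 on success, non-zero on error. */
typedef int (*RawWriteFn)(void *ctx, const void *data, size_t len);

/* Large struct: declare it static or on the heap, not on a small stack. */
typedef struct {
    RawWriteFn write;               /* where full blocks are sent */
    void *ctx;                      /* opaque argument passed to write */
    size_t used;                    /* bytes currently staged in buf */
    unsigned char buf[CHUNK_SIZE];
} BlockWriter;

void block_writer_init(BlockWriter *w, RawWriteFn write, void *ctx)
{
    assert(w != NULL && write != NULL);
    w->write = write;
    w->ctx = ctx;
    w->used = 0;
}

/* Send whatever is staged. Returns 0 on success, -1 on error. */
int block_writer_flush(BlockWriter *w)
{
    if (w->used > 0) {
        if (w->write(w->ctx, w->buf, w->used) != 0)
            return -1;
        w->used = 0;
    }
    return 0;
}

/* Stage len bytes, flushing each time the buffer fills up. */
int block_writer_put(BlockWriter *w, const void *data, size_t len)
{
    const unsigned char *src = data;
    while (len > 0) {
        size_t room = CHUNK_SIZE - w->used;
        size_t n = len < room ? len : room;
        memcpy(w->buf + w->used, src, n);
        w->used += n;
        src += n;
        len -= n;
        if (w->used == CHUNK_SIZE && block_writer_flush(w) != 0)
            return -1;
    }
    return 0;
}

/* Demo sink for experiments: counts calls and bytes instead of doing I/O. */
typedef struct {
    size_t calls;
    size_t bytes;
} WriteStats;

int counting_write(void *ctx, const void *data, size_t len)
{
    WriteStats *stats = ctx;
    (void)data;
    stats->calls++;
    stats->bytes += len;
    return 0;
}
```

With the counting sink, 100000 one-byte puts followed by a final flush reach the sink as 2 calls (one full 65536-byte block plus the 34464-byte remainder). The caller must remember the final block_writer_flush() before closing the file.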
Nullpointer and Tyranid,
Thanks for your tips. I have read the discussion about PSPLink and I think the stuff is cool. I will probably switch to wifi logging (mainly because I lack electronics knowledge to build a special serial cable and because I already have a router :).
Anyway, I thought access to memory stick would outperform WIFI or serial access.
As I said, first I have to encode the PNG in memory and then write the whole chunk to the memory stick. If file access is really such a burden on the system, then writing compressed data is the better thing to do (PNG vs BMP).
I've now done a simple test that just writes 100 Kbytes to the memory stick. I wrote it:
a. in one single block and
b. one byte at a time (using fwrite() 100000 times)
Both methods took nearly the same time. The problem is that they take about 40 seconds to complete! This is 2.5 Kb/sec.
Is this rate OK or is there something wrong with my memory stick or (hopefully not) with my PSP?
If this is the normal rate, then I guess I had better move my file I/O to a different thread (assuming that the main thread can run while the worker thread is blocked on an I/O operation). Currently I only use one thread. Could this be my sole problem?
Thanx
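For anyone repeating this measurement, the arithmetic behind the quoted rate is simple enough to put in a helper. This is an illustrative sketch only: kb_per_second() and seconds_needed() are hypothetical names, and 1 Kb is taken as 1000 bytes because that matches the figure in the post (100000 bytes in about 40 seconds gives 2.5 Kb/sec).

```c
#include <assert.h>
#include <stddef.h>

/* Throughput in Kb/sec, with 1 Kb = 1000 bytes as used in the post. */
double kb_per_second(size_t bytes, double seconds)
{
    assert(seconds > 0.0);
    return (double)bytes / 1000.0 / seconds;
}

/* Time in seconds needed to move `bytes` at a given rate in Kb/sec. */
double seconds_needed(size_t bytes, double kb_per_sec)
{
    assert(kb_per_sec > 0.0);
    return (double)bytes / 1000.0 / kb_per_sec;
}
```

Feeding in the numbers from the test, kb_per_second(100000, 40.0) gives 2.5, and seconds_needed(100000, 2.5) gives back the 40 seconds. How the elapsed time is measured (stopwatch, timer API) is left to the caller.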
Everyone,
I was looking through some example code and no one uses another thread for file IO, so this must not be an issue.
I experimented with another memory stick, which proved faster: doing 100 Kb writes in 7 seconds, instead of 40 seconds. Still way too slow. Interesting note: both memory sticks are from Sony and have 1 Gb capacity.
I used the slower memory stick for the further tests.
Original code giving 2.5 Kb/s rate:
u8 buffer[100000];
FILE *f = fopen("test.bin", "wb");
if (f != NULL) { /* fopen returns NULL on failure */
    fwrite(buffer, 100000, 1, f);
    fclose(f);
}
Modified code giving 150 Kb/s rate:
u8 buffer[100000];
int f = sceIoOpen("test.bin", PSP_O_WRONLY | PSP_O_CREAT, 0777);
if (f >= 0) { /* sceIoOpen returns a negative value on failure */
    sceIoWrite(f, buffer, 100000);
    sceIoClose(f);
}
It seems that using Sony's file IO functions directly gives much better performance than the C runtime library. This seems strange, because the CRT is just a wrapper around Sony's file IO. It looks like a bug in the CRT. Sadly I can't check the sources, because I am using devkitpro R6, which comes precompiled.
The newly released MSTest v1.0 gives a similar transfer rate; with various buffer sizes it was possible to squeeze it to a maximum of 170 Kb/s.
I am still not completely satisfied though, because a USB copy to the memory stick gives me around 2500 Kb/s. However, I have run out of ideas on how to get a better transfer rate.
I would really like to hear your experience with write transfer rate.
Don't forget that the USB transfer is buffered asynchronously in RAM - if you're doing one single file write, you will get large transfer speeds.
You'll notice the true speed if you try writing to multiple files, it seems to wait for the async completion before moving between files.
Fanjita,
Are you saying that IO Async equivalents are synthetically faster? That Async function first copies everything in RAM, returns immediately and in the background flushes that buffer?
Sadly I can't test the Async functions, because my wife is currently hooked on Lumines. :)
Though I have some doubts. The USB transfer could be reporting fake transfer rates, but when copying very large files (100 Mb and more) I don't see a drop in transfer rate, and there should be one, because the PSP has no memory to hold such a large buffer.
Ok, I have played with the async IO functions and they do not work any faster. Unless I call sceIoWaitAsync(), the data is not written to the memory stick at all. When it is written, it takes the same amount of time as the equivalent sync function. The funny thing is that the file is nowhere to be found, even though the memory stick was accessed and I called sceIoWaitAsync() before sceIoCloseAsync().
So, I still wonder how come the USB transfer rate is 2500 Kb/sec, but my code and any other program only achieves 170 Kb/sec. USB is more than 14 times faster! I know the memory stick can't be as slow as 170 Kb/sec.
I can easily get 15MB/s on mine using read(). Wasn't interested in writing when I ran this test. benchmark.txt
I can't afford to buy Sandisk just to try things out. :)
Anyway, I don't have an issue with having Sony's slower memory stick, the issues here are:
1. Why is CRT IO up to 70 times slower than Sony's IO? Seems like a bug in CRT code.
2. Why are Sony's IO routines 14 times slower than a USB transfer? Is this a deliberate action by Sony? Developers would not need fast transfer rates for writing and reading saves, so did Sony choose to put some waits in these routines?
What I would really like to see is someone comparing the USB transfer rate with the transfer rate of their own code.
weltall,
No buffering: if I transfer something like a 100 Mb file and then stop the USB connection, the file is completely transferred to the PSP. If I were to write a 100 Mb file using sceIoWrite() I would have to wait for an eternity. If the transfer is 14 times slower, you can see the difference!
TyRaNiD,
I have just run your (great) PSPLink and experimented with USB copy that your (again great:) application provides. I've got 1700 - 1800 Kb/sec transfer rate, which is slower than usual USB copy (2500 Kb/sec), but still way faster than my sceIoWrite(). What's the catch here? Is sceIoWrite() performing faster if the buffer is in specific memory location? Do you do anything special when copying files via USB?
BTW, if I read from PSP, the USB transfer rate from your program is max 5800 Kb/sec, which is exactly the same as via normal USB transfer.
Alright!!! This did the trick!
Now I've got a 1900 Kb/sec transfer rate when using sceIoWrite() and a 165 Kb/sec transfer rate when using fwrite().
Can anyone else check if memory alignment makes a difference? It could be that you need a slow memory stick to spot the difference. :)
Thanx for the tip jonny!
Normally you use this if you declare your data in your code, e.g.:
Code:
unsigned char __attribute__((aligned(16))) ucData[256];
This reserves a chunk of memory (statically) which is aligned at 16 bytes.
Thoralt
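To check whether a buffer really is aligned, and to get an aligned buffer from the heap as well, something like the following could be used. It is a hedged sketch rather than a tested PSP recipe: IO_ALIGN = 64 is an assumption (this thread mentions both 16 and 64), and aligned_malloc() / aligned_free() are hypothetical portable stand-ins built on plain malloc(); on a toolchain that provides memalign(), that call plays the same role.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define IO_ALIGN 64   /* assumed alignment; the thread mentions 16 and 64 */

/* Statically reserved, aligned buffer (GCC attribute, as in the post above). */
static unsigned char __attribute__((aligned(IO_ALIGN))) io_buffer[100000];

unsigned char *static_io_buffer(void)
{
    return io_buffer;
}

/* Non-zero if p sits on a multiple of alignment. */
int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}

/* Over-allocate with malloc(), round the address up, and remember the raw
   pointer just below the aligned block so that aligned_free() can find it.
   alignment must be a power of two and at least sizeof(void *). */
void *aligned_malloc(size_t size, size_t alignment)
{
    void *raw;
    uintptr_t start;

    assert(alignment >= sizeof(void *));
    assert((alignment & (alignment - 1)) == 0);

    raw = malloc(size + alignment - 1 + sizeof(void *));
    if (raw == NULL)
        return NULL;
    start = ((uintptr_t)raw + sizeof(void *) + alignment - 1)
            & ~(uintptr_t)(alignment - 1);
    ((void **)start)[-1] = raw;
    return (void *)start;
}

/* Free a block obtained from aligned_malloc(); NULL is allowed. */
void aligned_free(void *p)
{
    if (p != NULL)
        free(((void **)p)[-1]);
}
```

A quick experiment would be to run the same write benchmark once with a buffer for which is_aligned() is true and once with that pointer plus one byte, and compare the rates.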
bulb -> OK, I have looked at libpng, and as you said, only two functions use fread / fwrite:
- pngwio.c :
- pngrio.c :
OK, so I have to try to modify them in order to use sceIoWrite() / sceIoRead().
Plus, the png_FILE_p structure.
Plus, do I also have to modify malloc in libpng?
- pngmem.c :
The line: ret = malloc((size_t)size);
has to be changed to memalign(64, (size_t)size), doesn't it?
- pngwio.c :
Code:
#if !defined(PNG_NO_STDIO)
/* This is the function that does the actual writing of data. If you are
not writing to a standard C stream, you should create a replacement
write_data function and use it at run time with png_set_write_fn(), rather
than changing the library. */
#ifndef USE_FAR_KEYWORD
void PNGAPI
png_default_write_data(png_structp png_ptr, png_bytep data, png_size_t length)
{
png_uint_32 check;
#if defined(_WIN32_WCE)
if ( !WriteFile((HANDLE)(png_ptr->io_ptr), data, length, &check, NULL) )
check = 0;
#else
check = fwrite(data, 1, length, (png_FILE_p)(png_ptr->io_ptr));
#endif
if (check != length)
png_error(png_ptr, "Write Error");
}
#else
/* this is the model-independent version. Since the standard I/O library
can't handle far buffers in the medium and small models, we have to copy
the data.
*/
#define NEAR_BUF_SIZE 1024
#define MIN(a,b) (a <= b ? a : b)
void PNGAPI
png_default_write_data(png_structp png_ptr, png_bytep data, png_size_t length)
{
png_uint_32 check;
png_byte *near_data; /* Needs to be "png_byte *" instead of "png_bytep" */
png_FILE_p io_ptr;
/* Check if data really is near. If so, use usual code. */
near_data = (png_byte *)CVT_PTR_NOCHECK(data);
io_ptr = (png_FILE_p)CVT_PTR(png_ptr->io_ptr);
if ((png_bytep)near_data == data)
{
#if defined(_WIN32_WCE)
if ( !WriteFile(io_ptr, near_data, length, &check, NULL) )
check = 0;
#else
check = fwrite(near_data, 1, length, io_ptr);
#endif
}
else
{
png_byte buf[NEAR_BUF_SIZE];
png_size_t written, remaining, err;
check = 0;
remaining = length;
do
{
written = MIN(NEAR_BUF_SIZE, remaining);
png_memcpy(buf, data, written); /* copy far buffer to near buffer */
#if defined(_WIN32_WCE)
if ( !WriteFile(io_ptr, buf, written, &err, NULL) )
err = 0;
#else
err = fwrite(buf, 1, written, io_ptr);
#endif
if (err != written)
break;
else
check += err;
data += written;
remaining -= written;
}
while (remaining != 0);
}
if (check != length)
png_error(png_ptr, "Write Error");
}
#endif
#endif
- pngrio.c :
Code:
#if !defined(PNG_NO_STDIO)
/* This is the function that does the actual reading of data. If you are
not reading from a standard C stream, you should create a replacement
read_data function and use it at run time with png_set_read_fn(), rather
than changing the library. */
#ifndef USE_FAR_KEYWORD
void PNGAPI
png_default_read_data(png_structp png_ptr, png_bytep data, png_size_t length)
{
png_size_t check;
/* fread() returns 0 on error, so it is OK to store this in a png_size_t
* instead of an int, which is what fread() actually returns.
*/
#if defined(_WIN32_WCE)
if ( !ReadFile((HANDLE)(png_ptr->io_ptr), data, length, &check, NULL) )
check = 0;
#else
check = (png_size_t)fread(data, (png_size_t)1, length,
(png_FILE_p)png_ptr->io_ptr);
#endif
if (check != length)
png_error(png_ptr, "Read Error");
}
#else
/* this is the model-independent version. Since the standard I/O library
can't handle far buffers in the medium and small models, we have to copy
the data.
*/
#define NEAR_BUF_SIZE 1024
#define MIN(a,b) (a <= b ? a : b)
static void /* PRIVATE */
png_default_read_data(png_structp png_ptr, png_bytep data, png_size_t length)
{
int check;
png_byte *n_data;
png_FILE_p io_ptr;
/* Check if data really is near. If so, use usual code. */
n_data = (png_byte *)CVT_PTR_NOCHECK(data);
io_ptr = (png_FILE_p)CVT_PTR(png_ptr->io_ptr);
if ((png_bytep)n_data == data)
{
#if defined(_WIN32_WCE)
if ( !ReadFile((HANDLE)(png_ptr->io_ptr), data, length, &check, NULL) )
check = 0;
#else
check = fread(n_data, 1, length, io_ptr);
#endif
}
else
{
png_byte buf[NEAR_BUF_SIZE];
png_size_t read, remaining, err;
check = 0;
remaining = length;
do
{
read = MIN(NEAR_BUF_SIZE, remaining);
#if defined(_WIN32_WCE)
if ( !ReadFile((HANDLE)(io_ptr), buf, read, &err, NULL) )
err = 0;
#else
err = fread(buf, (png_size_t)1, read, io_ptr);
#endif
png_memcpy(data, buf, read); /* copy far buffer to near buffer */
if(err != read)
break;
else
check += err;
data += read;
remaining -= read;
}
while (remaining != 0);
}
if ((png_uint_32)check != (png_uint_32)length)
png_error(png_ptr, "read Error");
}
#endif
#endif
- pngmem.c :
Code:
/* Allocate memory. For reasonable files, size should never exceed
64K. However, zlib may allocate more then 64K if you don't tell
it not to. See zconf.h and png.h for more information. zlib does
need to allocate exactly 64K, so whatever you call here must
have the ability to do that. */
png_voidp PNGAPI
png_malloc(png_structp png_ptr, png_uint_32 size)
{
png_voidp ret;
#ifdef PNG_USER_MEM_SUPPORTED
if (png_ptr == NULL || size == 0)
return (NULL);
if(png_ptr->malloc_fn != NULL)
ret = ((png_voidp)(*(png_ptr->malloc_fn))(png_ptr, (png_size_t)size));
else
ret = (png_malloc_default(png_ptr, size));
if (ret == NULL && (png_ptr->flags&PNG_FLAG_MALLOC_NULL_MEM_OK) == 0)
png_error(png_ptr, "Out of Memory!");
return (ret);
}
png_voidp PNGAPI
png_malloc_default(png_structp png_ptr, png_uint_32 size)
{
png_voidp ret;
#endif /* PNG_USER_MEM_SUPPORTED */
if (png_ptr == NULL || size == 0)
return (NULL);
#ifdef PNG_MAX_MALLOC_64K
if (size > (png_uint_32)65536L)
{
#ifndef PNG_USER_MEM_SUPPORTED
if(png_ptr->flags&PNG_FLAG_MALLOC_NULL_MEM_OK) == 0)
png_error(png_ptr, "Cannot Allocate > 64K");
else
#endif
return NULL;
}
#endif
/* Check for overflow */
#if defined(__TURBOC__) && !defined(__FLAT__)
if (size != (unsigned long)size)
ret = NULL;
else
ret = farmalloc(size);
#else
# if defined(_MSC_VER) && defined(MAXSEG_64K)
if (size != (unsigned long)size)
ret = NULL;
else
ret = halloc(size, 1);
# else
if (size != (size_t)size)
ret = NULL;
else
ret = malloc((size_t)size);
# endif
#endif
#ifndef PNG_USER_MEM_SUPPORTED
if (ret == NULL && (png_ptr->flags&PNG_FLAG_MALLOC_NULL_MEM_OK) == 0)
png_error(png_ptr, "Out of Memory");
#endif
return (ret);
}
jsgf -> I looked for setvbuf in pspsdk and found the following in pspsdk/src/lib/LIB.status, so I suppose that this function is not yet implemented, is it?
Stdio:
-----
stdin, stdout, stderr, ok. Maybe some specific ps2 function to switch stderr
to SIO could be an idea.
Also, should have buffering...
remove - missing
rename - missing
tmp* - missing
fclose - ok
fflush - ok (memory card)
fcloseall - ok
fopen - ok
freopen - missing
fdopen - ok
setbuf - missing
setvbuf - missing
Thanks for your help. I'm not at my personal computer, so I can't test at the moment.