Problems with reads in no-wait RPC calls

Posted: Sun Nov 04, 2007 1:06 am
by ffgriever
I've run into a weird problem with calling RPCs in no-wait mode. The concept of the code in question is very simple:

1. An RPC function is called to send the filled buffer from the IOP side to the EE side. If the buffer is not yet filled, it waits until it is (semaphore). If the required data lies outside the buffered area, it invalidates the buffer and reads only the required data into it. Misses happen very rarely, because the stream usually lies in one file and is read continuously. It's always called synchronously (WAIT = 0, 'wait' mode), always waits for the DMA transfer to finish before returning, and always checks that the DMA transfer was properly queued.
2. An RPC function is called to fill the buffer with data (based on current buffer pos + buffer size). It's called asynchronously (WAIT = 1, 'no-wait' mode) while the previous buffer is processed on the EE side.

It's not the most efficient solution, but for the purpose it was done, it is more than enough.
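
The hit/miss bookkeeping from the two steps above could look roughly like the following. This is only a host-side sketch, not the poster's code: `struct ra_buffer`, `ra_hit`, and `ra_next_fill_pos` are hypothetical names invented for illustration, and the real IOP module may track its buffer differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping for the read-ahead buffer. */
struct ra_buffer {
    size_t file_pos;  /* file offset of the first buffered byte */
    size_t length;    /* number of valid bytes; 0 means nothing is valid */
};

/* Returns 1 if [pos, pos + len) lies entirely inside the buffered area,
 * 0 otherwise (a miss, so the buffer would have to be refilled). */
static int ra_hit(const struct ra_buffer *b, size_t pos, size_t len)
{
    assert(b != NULL);
    if (b->length == 0 || len > b->length)
        return 0;
    return pos >= b->file_pos && pos - b->file_pos <= b->length - len;
}

/* File offset where the next asynchronous fill would start:
 * current buffer position plus buffer size, as in step 2. */
static size_t ra_next_fill_pos(const struct ra_buffer *b)
{
    assert(b != NULL);
    return b->file_pos + b->length;
}
```

With a 28224-byte buffer starting at file offset 28224, a 16-byte request at offset 28224 is a hit, a request at offset 0 is a miss, and the next fill would start at offset 56448.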

If the buffer fill function is called synchronously, everything is OK. The problems begin when I call it asynchronously. I'm sure the semaphores work as they should (see further below; I also implemented my own simple two-state semaphores). The problem is that if the asynchronously called function reads more than 1024 bytes at once, then somehow the buffer is not sent properly by the first function (it seems the read command reads wrong data, although the file is properly opened, lseek is done properly, and read returns the proper number of bytes). It doesn't happen often, but it always seems to be related to the functions being called too close to each other (very rarely, processing on the EE side finishes much quicker than usual, and read/send buffer is called shortly afterwards). The problem persists even if I read into a separate buffer and make no additional changes (so the first function always misses the buffer, which in fact makes all the real reads synchronous).

If I read the entire size in 1024-byte chunks, it works OK. Anything more results in problems from time to time. The semaphores work fine (send buffer is postponed as it should be if something like waitthread is done in buffer fill; everything works fine if I do it the other way around). The results are better if I place some code between the first (synchronous) and second (asynchronous) function calls (it can even be nopdelay();), but that doesn't solve the problem (it's just less frequent). The same goes for placing such a delay after the second function call (then the get buffer call never happens before the buffer is fully filled).
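
The 1024-byte chunked read mentioned above can be sketched as below. This is a minimal illustration using standard C `fread` on a host machine, not the IOP `read()` call from the original module; `read_in_chunks` and `CHUNK_SIZE` are names made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define CHUNK_SIZE 1024  /* the chunk size that worked in the post (hypothetical name) */

/* Fill `dst` with up to `total` bytes from `fp`, never requesting more than
 * CHUNK_SIZE bytes per call. Returns the number of bytes actually read. */
static size_t read_in_chunks(FILE *fp, unsigned char *dst, size_t total)
{
    size_t done = 0;

    assert(fp != NULL && dst != NULL);
    while (done < total) {
        size_t want = total - done;
        size_t got;

        if (want > CHUNK_SIZE)
            want = CHUNK_SIZE;
        got = fread(dst + done, 1, want, fp);
        done += got;
        if (got < want)  /* short read: end of file or error */
            break;
    }
    return done;
}
```

As the rest of the thread shows, chunking only hid the symptom here; the underlying cause turned out to be the shared RPC client descriptor.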

Memory is properly aligned, I checked static/volatile qualifiers, tried reading/writing to uncached space, and did many, many other things... but I wasn't able to get this to work in no-wait mode. That doesn't mean I didn't solve it (which is why I'm sure about all the code).

I solved it with a simple reorganization. I make the second RPC call synchronously, then on the IOP side I check whether the thread for the fill buffer function has been created. If it hasn't, I create it... if it has, I just wake it up, returning immediately from the RPC server thread. Just as in this code.

Code:

void readAheadThread()
{
	while (1)
	{
		temp_readbufahead_RPC();
		SleepThread();
	}
	ExitDeleteThread();
}

/* part of the RPC server thread */
[...]
case TEMP_READBUFFAHEAD:
	if (readAheadThreadId == 0)
	{
		struct _iop_thread param;

		param.attr      = 0x02000000;
		param.thread    = (void*)readAheadThread;
		param.priority  = 39;
		param.stacksize = 0x800;
		param.option    = 0;

		readAheadThreadId = CreateThread(&param);

		if (readAheadThreadId > 0)
		{
			StartThread(readAheadThreadId,0);
		}
	} else
	{
		WakeupThread(readAheadThreadId);
	}
	return NULL;
[...]
(It returns a null pointer, but at this time there is a 0-byte-long receive buffer on the EE side, so it's not a mistake... of course, I tried sending something useful here for tests.)

It works almost the same as calling it in no-wait mode. It is in fact more efficient than the no-wait solution (not by much, but still, so it's a better solution anyway). Now everything works fine, so problem solved, but it bothers me a little bit. What could be the reason that reading more than 1024 bytes in an asynchronously called function leads to problems?

BTW, I'm using host: (ps2link); the same happens using fakehost (redirecting to hdd). I didn't try it with anything else.

Posted: Sun Nov 04, 2007 7:58 am
by Mega Man
You say that you wait on EE until IOP has filled the buffer using a semaphore. How did you share the semaphore between IOP and EE?

Your code didn't show how you are calling the RPC functions. I can't say anything without seeing it.

Did you wait until endfunc is called before reading data on the EE?

Did you flush the data cache on the IOP side?

Your description sounds like this: you are synchronously reading data from the IOP while asynchronously overwriting it. You write the word "buffer", but I don't know how your buffer is organized, or whether you always mean the same buffer or buffer address. And I don't know what you mean by "invalidate". Did you invalidate the cache on the EE or the IOP? You never mention whether you do something on the IOP or on the EE.

Posted: Sun Nov 04, 2007 11:03 am
by ffgriever
Sorry, thought everything was clear... guess not (not really my best day I would say... but was I really THAT unclear?).
Mega Man wrote: You say that you wait on EE until IOP has filled the buffer using a semaphore. How did you share the semaphore between IOP and EE?
I didn't. I'm waiting on the IOP side... but since the function that copies the "buffer" from the IOP to the EE is called synchronously on the EE (it enters a wait state until the IOP-side function returns), it also affects the EE (but such a lock happens very rarely, because the stream is read continuously).
Mega Man wrote: Your code didn't show how you are calling the RPC functions. I can't say anything without seeing it.
I was trying to use it as follows:

Code:

SifCallRpc(&tmp_cd0, TMP_TRANSFER_BUFF, 0, tmp_rpcReBuff, 16, &tmp_ret, 16, 0, 0);
SifCallRpc(&tmp_cd0, TEMP_READBUFFAHEAD, 1, 0,0, 0,0, 0, 0);
After the changes I made to make it work (still asynchronous, but with a new thread created... so the RPC is called synchronously but executed in a new thread on the IOP... making it all asynchronous):

Code:

SifCallRpc(&tmp_cd0, TMP_TRANSFER_BUFF, 0, tmp_rpcReBuff, 16, &tmp_ret, 16, 0, 0);
SifCallRpc(&tmp_cd0, TEMP_READBUFFAHEAD, 0, 0,0, 0,0, 0, 0);
So I just made the RPC call synchronous, while it executes asynchronously on the IOP (it just wakes up a thread and returns immediately).
Mega Man wrote: Did you wait until endfunc is called before reading data on the EE?
I was waiting until all transfers are done and the function returns; isn't that enough? The EE side works in one thread in this case, interrupts are enabled, and the transfer call is made in wait mode, so I didn't see the point of specifying endfunc (maybe I'm wrong on this). I'm not sending it through the RPC receive buffer but through a separate DMA transfer (always waiting until it's done with dmastat, as I wrote).
Mega Man wrote: Did you flush the data cache on the IOP side?
I even tried being super cautious (for testing) with this one, flushing (not just invalidating, because that would be wrong and lead to loss of data) the data cache on both the IOP and the EE before and after every read (data is flushed automatically on the EE after the function ends, unless the "2" flag is specified, or at least that's what I've seen somewhere). I also tried writing into uncached space (I even ran the EE application in the 0x20000000 segment, to be sure it wasn't a caching problem).
Mega Man wrote: Your description sounds like this: you are synchronously reading data from the IOP while asynchronously overwriting it.
But never while the data is being read (copied from the IOP to the EE), due to the simple fact that the copying function waits until the read (of the file, writing to memory) is done, and the writing (buffer fill) function is never invoked while the other is in progress (it can't happen in the current code, but I've even added extra checks for safety).
Mega Man wrote: You write the word "buffer", but I don't know how your buffer is organized
It's an array of unsigned chars... nothing fancy (static size of 28224 bytes currently, no dynamic allocation). Sometimes it's cast to other types.
Mega Man wrote: and if you always mean the same buffer or buffer address.
Always the same buffer at the same address (but as I wrote above... it also happens when I read into another buffer, without reading its contents at all, without changing anything in memory that could affect the app, making all the real reads/transfers synchronous!! I just read some phony data into some phony buffer! I don't transfer this "test" buffer to the EE, only the real one, which is read synchronously).
Mega Man wrote: And I don't know what you mean by "invalidate". Did you invalidate the cache on the EE or the IOP?
I didn't mean invalidating the data or instruction cache. I just meant that I treat the buffer as containing invalid data (I need data from a different place than what is currently in the buffer) and read the proper (necessary) data into it (such a case is very rare).
Mega Man wrote: You never mention whether you do something on the IOP or on the EE.
Everything concerning reading is done on the IOP side. The EE side is only two RPC calls. Then the contents obtained on the IOP side and transferred to the EE side are processed on the EE side.

Bah, never mind. It works as it should (with just small changes to the code, which is why I'd say there was something wrong in the way I called the RPCs on the EE or in the code handling them), just curious (I haven't run into something like this before).

Posted: Mon Nov 05, 2007 10:39 am
by Mega Man
ffgriever wrote:
Your code didn't show how you are calling the RPC functions. I can't say anything without seeing it.
I was trying to use it as follows:

Code:

SifCallRpc(&tmp_cd0, TMP_TRANSFER_BUFF, 0, tmp_rpcReBuff, 16, &tmp_ret, 16, 0, 0);
SifCallRpc(&tmp_cd0, TEMP_READBUFFAHEAD, 1, 0,0, 0,0, 0, 0);
After the changes I made to make it work (still asynchronous, but with a new thread created... so the RPC is called synchronously but executed in a new thread on the IOP... making it all asynchronous):

Code:

SifCallRpc(&tmp_cd0, TMP_TRANSFER_BUFF, 0, tmp_rpcReBuff, 16, &tmp_ret, 16, 0, 0);
SifCallRpc(&tmp_cd0, TEMP_READBUFFAHEAD, 0, 0,0, 0,0, 0, 0);
So I just made the RPC call synchronous, while it executes asynchronously on the IOP (it just wakes up a thread and returns immediately).
Did you wait until endfunc is called before reading data on the EE?
I was waiting until all transfers are done and the function returns; isn't that enough? The EE side works in one thread in this case, interrupts are enabled, and the transfer call is made in wait mode, so I didn't see the point of specifying endfunc (maybe I'm wrong on this). I'm not sending it through the RPC receive buffer but through a separate DMA transfer (always waiting until it's done with dmastat, as I wrote).
You specify SIF_RPC_M_NOWAIT as the mode. The function SifCallRpc() then returns immediately; nothing has been done yet. The 8th parameter is a function which is called when the RPC call is complete.
I've never used this, but I've seen it in PS2 Linux. I believe the function is called in interrupt context, so it must be short and should only signal a semaphore.
As I understand the usage of the function SifSetDma() in loadfile.c: the function only checks whether the DMA packet has been transferred to IOP memory. It doesn't check that the IOP has processed the DMA packet. So your data need not be ready when SifSetDma() says so.
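
The end-function idea described in this post can be illustrated with a small host-side mock. Nothing below is the real SIF RPC API: `struct mock_call`, `mock_call_nowait`, `mock_complete`, and `mark_done` are hypothetical stand-ins that only show the shape of the pattern, namely that a no-wait call returns before any work is done and that the callback should do nothing more than record completion.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*end_func_t)(void *end_param);  /* hypothetical callback type */

/* Hypothetical stand-in for one call that may be in flight. */
struct mock_call {
    end_func_t end_func;
    void *end_param;
    int in_flight;
};

/* Start a "no-wait" call: returns at once, nothing has completed yet. */
static void mock_call_nowait(struct mock_call *c, end_func_t f, void *param)
{
    assert(c != NULL && !c->in_flight);
    c->end_func = f;
    c->end_param = param;
    c->in_flight = 1;
}

/* Simulates the moment the remote side finishes; on real hardware this
 * would presumably run in interrupt context. */
static void mock_complete(struct mock_call *c)
{
    assert(c != NULL && c->in_flight);
    c->in_flight = 0;
    if (c->end_func != NULL)
        c->end_func(c->end_param);
}

/* Keep the callback tiny: only record that the result is now safe to use. */
static void mark_done(void *end_param)
{
    *(volatile int *)end_param = 1;
}
```

In a real EE program the flag would more likely be a semaphore signalled from the callback, as the post suggests; the plain `int` here just keeps the sketch self-contained.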

Posted: Tue Nov 06, 2007 5:24 am
by EEUG
@ffgriever: ..may I insert my 5 cents :)?
I think you can't use things like:

Code:

SifCallRpc(&tmp_cd0, TMP_TRANSFER_BUFF, 0, tmp_rpcReBuff, 16, &tmp_ret, 16, 0, 0); 
SifCallRpc(&tmp_cd0, TEMP_READBUFFAHEAD, 1, 0,0, 0,0, 0, 0); 
if one of the calls is asynchronous. However, you can try to use different RPC descriptors, like:

Code:

SifCallRpc(&tmp_cd0, TMP_TRANSFER_BUFF, 0, tmp_rpcReBuff, 16, &tmp_ret, 16, 0, 0); 
SifCallRpc(&tmp_cd1, TEMP_READBUFFAHEAD, 1, 0,0, 0,0, 0, 0); 
...and more efficient code would be:

Code:

void readAheadThread() 
{ 
   while (1) 
   { 
      SleepThread(); 
      temp_readbufahead_RPC(); 
   } 
   ExitDeleteThread(); 
}
...so you create the thread at startup and there's no need for the

Code:

if (readAheadThreadId == 0)
statement on each read request.
Small note: 'ExitDeleteThread' does nothing on the IOP side, so you should delete the thread from outside...

Posted: Tue Nov 06, 2007 7:10 pm
by ffgriever
Thank you guys... both
EEUG wrote: I think you can't use things like:

Code:

SifCallRpc(&tmp_cd0, TMP_TRANSFER_BUFF, 0, tmp_rpcReBuff, 16, &tmp_ret, 16, 0, 0); 
SifCallRpc(&tmp_cd0, TEMP_READBUFFAHEAD, 1, 0,0, 0,0, 0, 0); 
if one of the calls is asynchronous. However, you can try to use different RPC descriptors, like:

Code:

SifCallRpc(&tmp_cd0, TMP_TRANSFER_BUFF, 0, tmp_rpcReBuff, 16, &tmp_ret, 16, 0, 0); 
SifCallRpc(&tmp_cd1, TEMP_READBUFFAHEAD, 1, 0,0, 0,0, 0, 0); 
I knew there was something wrong with the way I was using/calling RPC. That's it. After this simple fix, everything works fine in the previous form. It seems the same client cannot be used for two calls at the same time (and it was being used that way when the asynchronously called function was returning while the other was called, which is why the problem occurred very rarely... the length of data to read was chosen very carefully to let the EE process everything in this time, with possibly no stalls; in fact, reading usually ends just before the EE finishes processing).
EEUG wrote: ...and more efficient code would be:[...]
You're right, of course. This had an even bigger impact than anyone could expect... but that's because I'm running into cache problems right now (sometimes very simple changes, like adding variables or changing sizes, give such weird efficiency results that no one would expect them... I've got to make use of prefetch).
EEUG wrote: Small note: 'ExitDeleteThread' does nothing on IOP side, so, you should delete the thread from outside...
I didn't know that. It's not a problem in this case, as this loop never ends... but it will help in others.
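
The conclusion of the thread, one client descriptor per call that can be in flight at the same time, can be pictured with the following host-side sketch. It does not model the real SIF RPC internals: `struct mock_client`, `mock_begin`, and `mock_finish` are hypothetical, and the "busy" check is an assumption added purely to make the conflict visible.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor: holds the state of at most one call in flight. */
struct mock_client {
    int pending_func;  /* function number of the call in flight, 0 if idle */
};

/* Begin a call; returns 0 on success, -1 if the descriptor is still busy. */
static int mock_begin(struct mock_client *cd, int func)
{
    assert(cd != NULL && func != 0);
    if (cd->pending_func != 0)
        return -1;  /* reusing a busy descriptor would clobber its state */
    cd->pending_func = func;
    return 0;
}

/* Mark the call on this descriptor as finished. */
static void mock_finish(struct mock_client *cd)
{
    assert(cd != NULL);
    cd->pending_func = 0;
}
```

With a single descriptor, a transfer request issued while the read-ahead is still pending is refused in this mock; with a second descriptor (the `tmp_cd1` idea from the thread) both can be outstanding at once.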