Thread creation problem

hectic128
Posts: 12
Joined: Sat Nov 17, 2007 4:41 am

Post by hectic128 »

I've been having some problems creating threads. It seems that sometimes when a thread is created, the OS will immediately kill it (before it even gets to the thread's entry point, as far as I can tell).

I can reproduce the problem with some pretty straightforward test code, which is included below. Basically, what I'm doing is:

In the main context:

1. Set test value to 0.
2. Create and start thread 'n'
3. Wait for thread 'n' to end.
4. Check that test value was set to 1.
5. Repeat...

In thread context:

1. Set test value to 1.
2. Exit and delete thread.

What eventually happens is that the check in main (#4) fails. Doing a 'thinfo' from PSPLINK gives the following output, which seems to indicate that the OS killed the thread:

UID: 0x04E1F74B - Name: thread29

Attr: 0x800000FF - Status: 32/KILLED - Entry: 89004d8
Stack: 9ff7c00 - StackSize 0x00000400 - GP: 0x08917B80
InitPri: 17 - CurrPri: 17 - WaitType 0
WaitId: 0x00000000 - WakeupCount: 0 - ExitStatus: 0x800201A2
RunClocks: 48155 - IntrPrempt: 1 - ThreadPrempt: 0
ReleaseCount: 0, StackFree: 640


When this happens is fairly unpredictable - sometimes it will die after just a few threads, sometimes it will make it a few hundred. I haven't been able to discern much of a pattern. I've added delays in various places (in case it was caused by some sort of race condition) but this didn't seem to have any effect.

Considering I have zero visibility into the PSP OS (and relatively little experience with it) I'm kinda stuck at this point, so I'm looking for help... Am I doing something incredibly stupid? Has anyone else run into a similar problem?

Thanks!

Test code:

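    #include <stdio.h>
    #include <pspkernel.h>
    #include <pspdebug.h>

    /* Set to 1 by the worker thread, checked by main. */
    int washere;

    /* Thread entry point: record that we ran, then exit and delete this thread. */
    int func(SceSize argc, void *argv)
    {
        printf("func\n");
        washere = 1;
        sceKernelExitDeleteThread(0);
        return 0;
    }

    int main()
    {
        int i;

        pspDebugScreenInit();
        SetupCallbacks();   /* usual exit-callback boilerplate, not shown */

        printf("Starting test\n");

        for (i = 0; i < 1000; i++)
        {
            char threadName[32];

            printf("%d\n", i);

            washere = 0;

            /* No trailing newline in the thread name. */
            sprintf(threadName, "thread%d", i);

            int thid = sceKernelCreateThread(threadName,
                                             func,
                                             0x11,   /* priority */
                                             1024,   /* stack size in bytes */
                                             0,
                                             NULL);

            sceKernelStartThread(thid, 0, 0);

            sceKernelWaitThreadEnd(thid, NULL);

            if (washere != 1)
            {
                printf("Failure!\n");
                return 0;
            }
        }

        printf("Done\n");
        return 0;
    }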
DoctorRockit
Posts: 5
Joined: Sun Dec 10, 2006 3:12 am

Post by DoctorRockit »

I haven't really looked at your threading code, but one thing you might want to try is declaring the global variable washere as volatile, to force the compiler to read it from memory in your final check.
hectic128
Posts: 12
Joined: Sat Nov 17, 2007 4:41 am

Post by hectic128 »

Unfortunately this didn't help.

Good thought though...
J.F.
Posts: 2906
Joined: Sun Feb 22, 2004 11:41 am

Post by J.F. »

Maybe the thread wasn't created in the first place. You don't check for that in the above code. You should try printing thid as part of the failure message to see what it was.
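
A minimal sketch of that kind of check, assuming the usual PSP SDK convention that a negative thread ID or return code signals an error. `describe_result` is a hypothetical helper, not an SDK function, and it is written as portable C so it can be tried off-device:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of the PSP SDK).
 * Assumption: SDK calls report failure as a negative 32-bit value.
 * Writes a printable message into buf and returns 1 on error, 0 otherwise.
 */
static int describe_result(const char *what, int32_t rc, char *buf, size_t len)
{
    assert(what != NULL && buf != NULL && len > 0);

    if (rc < 0) {
        /* Print the raw code in hex so it can be looked up later. */
        snprintf(buf, len, "%s failed: 0x%08X", what, (unsigned int)(uint32_t)rc);
        return 1;
    }
    snprintf(buf, len, "%s ok: 0x%08X", what, (unsigned int)(uint32_t)rc);
    return 0;
}
```

In the test loop this would be called on thid right after sceKernelCreateThread, and again on the return value of sceKernelStartThread, printing buf before bailing out.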
Jim
Posts: 476
Joined: Sat Jul 02, 2005 10:06 pm
Location: Sydney

Post by Jim »

It's also interesting that it quits at 30 threads, which sounds like a hard-coded limit. Are you sure that you're deleting the threads that have exited so the thread IDs can be re-used?

Jim
hectic128
Posts: 12
Joined: Sat Nov 17, 2007 4:41 am

Post by hectic128 »

I verified that threads were actually getting created.

When it originally stopped at 30 threads, I thought the same thing, that I was running into some kind of limit (possibly because I wasn't deleting threads). However, the point at which it stops is somewhat random, and often goes above 30.

I have figured something out: the problem goes away if I increase the stack size to 4K. I had originally set the stack size to 1K, because I figured it didn't really matter, as I wasn't doing a whole lot in the thread. I am calling printf, which could have a pretty deep call stack, but even then, it still surprises me.

Assuming that the problem was that I was blowing away a thread's stack, a lesson learned (for me at least) is that the thread info display in psplink won't necessarily show this. That is, if you look at the thread that was killed, it showed that only around 300 bytes of the stack was being used...
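
One way this could happen, as a rough model rather than a description of the real kernel: assume the free-stack figure is obtained by counting untouched fill bytes upward from the low end of the stack. If a callee reserves a large buffer and only writes near its start, the writes can land below the stack while the bytes in between stay untouched. The fill value, the 384 bytes of frame usage (1024 minus the reported 640) and all names below are assumptions for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FILL 0xFF                 /* assumed fill byte for unused stack */
enum { BELOW = 1024, STACK_SIZE = 1024 };

/* Count untouched fill bytes from the low end of the stack (watermark scan). */
static size_t stack_free(const unsigned char *stack, size_t size)
{
    size_t n = 0;
    while (n < size && stack[n] == FILL)
        n++;
    return n;
}

/*
 * Simulated memory: [BELOW bytes owned by someone else][STACK_SIZE bytes of stack].
 * The stack grows downward from the top of the arena.
 * Returns the "free" figure a watermark scan would report and sets
 * *overflowed when the stack pointer has dropped below the stack base.
 */
static size_t simulate(size_t frames_used, size_t buffer_size,
                       size_t buffer_written, int *overflowed)
{
    unsigned char arena[BELOW + STACK_SIZE];
    size_t sp = sizeof arena;

    assert(overflowed != NULL);
    assert(frames_used + buffer_size <= sp && buffer_written <= buffer_size);
    memset(arena, FILL, sizeof arena);

    sp -= frames_used;                      /* ordinary frames, fully written */
    memset(arena + sp, 0, frames_used);

    sp -= buffer_size;                      /* big local buffer is reserved... */
    memset(arena + sp, 0, buffer_written);  /* ...but only its start is written */

    *overflowed = sp < BELOW;
    return stack_free(arena + BELOW, STACK_SIZE);
}
```

With 384 bytes of frames and a 1024-byte buffer of which 8 bytes are written, this model reports 640 bytes free even though the stack pointer has gone 384 bytes past the bottom of the stack.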
pspZorba
Posts: 156
Joined: Sat Sep 22, 2007 11:45 am
Location: NY

Post by pspZorba »

Well, your code works for me if I increase the stack size.
--pspZorba--
NO to K1.5 !
Jim
Posts: 476
Joined: Sat Jul 02, 2005 10:06 pm
Location: Sydney

Post by Jim »

In newlib, the first thing printf does is create a 1 KB array on the stack.
Jim
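
As a back-of-the-envelope sketch of what that implies for sizing a thread's stack: add the thread's own frames, the largest buffer a library call is known to reserve, and some margin, then round up. The function names, the margin and the 1 KB rounding step are hypothetical choices, not SDK rules, and the 1 KB figure is only a lower bound on what printf really needs:

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of step (step must be non-zero). */
static size_t round_up(size_t n, size_t step)
{
    assert(step != 0);
    return (n + step - 1) / step * step;
}

/*
 * Hypothetical sizing helper: all three inputs are estimates in bytes.
 *   own_frames   - what the thread's own call chain uses
 *   library_peak - largest known on-stack buffer in library calls
 *   margin       - slack for whatever was not accounted for
 * The 1024-byte rounding step is an arbitrary convention for this sketch.
 */
static size_t stack_budget(size_t own_frames, size_t library_peak, size_t margin)
{
    return round_up(own_frames + library_peak + margin, 1024);
}
```

For example, 384 bytes of frames plus a 1024-byte buffer plus 512 bytes of margin comes to 1920 bytes, which this rounds up to 2048. Treat that as a starting point only; the size reported to work in this thread was 4K.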
pspZorba
Posts: 156
Joined: Sat Sep 22, 2007 11:45 am
Location: NY

Post by pspZorba »

Yes, the problem probably comes from something like that.
Only two threads are running simultaneously (so not many resources are in use at once), the exit-and-delete is correct, and it works properly if you increase the stack size.

Even the unsynchronized access to "washere" should not be a problem, since the two threads can't access it at the same time.

Moreover, if I remove the printf, it works properly even with only a 512-byte stack.
J.F.
Posts: 2906
Joined: Sun Feb 22, 2004 11:41 am

Post by J.F. »

pspZorba wrote:
Yes, the problem probably comes from something like that.
Only two threads are running simultaneously (so not many resources are in use at once), the exit-and-delete is correct, and it works properly if you increase the stack size.

Even the unsynchronized access to "washere" should not be a problem, since the two threads can't access it at the same time.

Moreover, if I remove the printf, it works properly even with only a 512-byte stack.

Stack overflows... the bane of programmers everywhere. Of course, if it weren't for SOs, we probably wouldn't have homebrew on some consoles. :)
TyRaNiD
Posts: 907
Joined: Sun Jan 18, 2004 12:23 am

Post by TyRaNiD »

Unfortunately psplink can only report what the kernel knows about. If you lash up an SIO cable and enable serial kprintf, you will actually get the kernel's "I'm fucked" output, which gives a clearer indication. You can also try doing thctx @thread, which might print out the final registers, as the context switch is what triggers the overflow handler (although at this point the thread could have walked over god knows how much memory :P).