Thread-local storage?
Is there any support for __thread variables, or some other TLS mechanism? I notice the current gcc doesn't support the __thread keyword, but is there any hope of implementing it?
There actually seem to be some TLS functions in the thread manager, but they only seem to be available in kernel mode and I'm not sure how to use them. I would guess that with some work you could simulate TLS, perhaps by wrapping the normal thread functions so that they allocate a block of memory for TLS and do whatever is necessary to get gcc to plug into that. I might take a look and see what we are supposed to do to get gcc to use it.
Just a quick update: I've been looking through newlib's reent code and it seems to support reentrancy pretty much throughout. However, in our version all we actually do is return _impure_ptr (remember that :P), which seems to be shared across the whole system :( So I thought I would do a little digging into some of the games I have to see if Sony make use of that, as they no doubt use newlib, judging by the symbols from Puzzle Bobble.
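To make the difference concrete, here is a small sketch in C of the two lookup strategies: the shared one described above, where every caller gets the same structure back, and a per-thread one. The names `fake_reent`, `thread_ctx`, `current_thread`, `switch_to` and both `getreent_*` functions are hypothetical stand-ins, not newlib's real definitions, and the "threads" are plain structs switched by hand rather than real threads.

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-in for newlib's reent structure. */
struct fake_reent {
    int err;            /* plays the role of errno */
};

/* Hypothetical thread record: each thread owns one reent structure. */
struct thread_ctx {
    struct fake_reent reent;
};

/* One structure shared by the whole program, like _impure_ptr. */
static struct fake_reent global_reent;
static struct fake_reent *impure_ptr = &global_reent;

/* Stand-in for "the thread that is running right now". */
static struct thread_ctx *current_thread;

/* Shared lookup: every thread gets the same structure back. */
struct fake_reent *getreent_shared(void)
{
    return impure_ptr;
}

/* Per-thread lookup: falls back to the global one if no thread is set. */
struct fake_reent *getreent_per_thread(void)
{
    return current_thread ? &current_thread->reent : impure_ptr;
}

/* Simulated context switch, done by hand for the purpose of the sketch. */
void switch_to(struct thread_ctx *t)
{
    current_thread = t;
}
```

With the shared lookup, a value written by one thread is overwritten by the next thread that writes; with the per-thread lookup each thread sees its own copy again after switching back.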
What I found was that Sony seem to allocate a pretty large structure on the stack in _main, which they then zero out, rather like REENT_INIT does. This pointer is then written to 4($k0). When I dumped the memory there, I found what looks like a block of stack which is preallocated before we even do anything (at the least, k0 is not set earlier in _main). k0 changes on a per-thread basis (it is set to 0 in kernel threads though, perhaps understandably). However, it doesn't seem to ever get touched by any other code, even from looking through bits of the kernel, so maybe this is a simple TLS mechanism at work :)
Digging even further (i.e. breaking on thread start), it seems we lose 320 bytes of stack in user mode threads before we even have a chance to get our grubby mitts on it: 256 bytes seem to be allocated to $k0 and $fp takes the other 64 bytes. In $fp we have the instruction (syscall #0x2070) which is also the target of the default $ra, so that is obviously the thread return; the rest seems not to get used. If you have psplink and you fancy looking at this then just use the thctx command to dump the context of a thread and look at what k0 is set to in each case.
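As a rough model of those observations, the sketch below writes the sizes down as constants and mimics the store to 4($k0) on a byte array. It is only an illustration: `k0_block`, `k0_store_tls` and `k0_load_tls` are made-up names, the pointer is modelled as a 32-bit value (as on the PSP), and nothing here says what the rest of the 256-byte block is used for.

```c
#include <stdint.h>
#include <string.h>

enum {
    K0_BLOCK_SIZE   = 256, /* bytes observed behind $k0 */
    FP_BLOCK_SIZE   = 64,  /* bytes observed behind $fp */
    TLS_SLOT_OFFSET = 4    /* the store seen in _main targets 4($k0) */
};

/* Total stack consumed before user code gets to run. */
enum { RESERVED_STACK = K0_BLOCK_SIZE + FP_BLOCK_SIZE };

/* Hypothetical model of the per-thread block that $k0 points at. */
struct k0_block {
    unsigned char bytes[K0_BLOCK_SIZE];
};

/* Equivalent of "sw value, 4($k0)" with a 32-bit pointer value. */
void k0_store_tls(struct k0_block *k0, uint32_t value)
{
    memcpy(k0->bytes + TLS_SLOT_OFFSET, &value, sizeof value);
}

/* Equivalent of "lw result, 4($k0)". */
uint32_t k0_load_tls(const struct k0_block *k0)
{
    uint32_t value;
    memcpy(&value, k0->bytes + TLS_SLOT_OFFSET, sizeof value);
    return value;
}
```

Because every thread has its own block, the same load at offset 4 returns a different pointer depending on which thread is running, which is all a simple TLS scheme needs.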
The upshot of this is that if we could port the TLS gcc/ld code across to the MIPS target (a challenge in itself), I reckon we could use the same mechanism for our own code to provide thread-local storage. I am going to look to see if we can use the existing stuff in newlib to give us a properly reentrant newlib, which would be nice (although it would most probably mean wrapping the create thread call so that we could preinitialise the TLS data); I don't see why it couldn't work.
Guess I will need to look at gcc/ld next ;)
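The idea of wrapping the create thread call could look roughly like the sketch below. It is not PSPSDK code: `spawn_fn` is a hypothetical signature for whatever really creates the thread, `tls_data` stands in for a reent structure, and the per-thread slot is modelled as a plain global, so the trampoline has to save and restore it by hand. `spawn_inline` and `worker` exist only to exercise the wrapper without real threads.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-thread data, standing in for a newlib reent structure. */
struct tls_data {
    int err;
    char scratch[32];
};

typedef int (*thread_entry)(void *arg);

/* Hypothetical signature for the underlying "create and start a thread" call. */
typedef int (*spawn_fn)(thread_entry entry, void *arg);

/* Stand-in for the per-thread slot (4($k0) in the observations above). */
static struct tls_data *tls_slot;

struct start_info {
    thread_entry entry;
    void *arg;
};

struct tls_data *tls_get(void)
{
    return tls_slot;
}

/* Runs first in the new thread: set up TLS, then call the real entry point. */
static int trampoline(void *p)
{
    struct start_info info = *(struct start_info *)p;
    struct tls_data local;              /* lives on the new thread's stack */
    struct tls_data *saved = tls_slot;
    int rc;

    free(p);
    memset(&local, 0, sizeof local);    /* zeroed before first use */
    tls_slot = &local;
    rc = info.entry(info.arg);
    tls_slot = saved;                   /* needed only because the slot is a global here */
    return rc;
}

/* Wrapper around thread creation; returns -1 if allocation fails.
 * If spawn itself fails without running the trampoline, info leaks;
 * a real version would need to handle that case. */
int create_thread_with_tls(spawn_fn spawn, thread_entry entry, void *arg)
{
    struct start_info *info = malloc(sizeof *info);
    if (info == NULL)
        return -1;
    info->entry = entry;
    info->arg = arg;
    return spawn(trampoline, info);
}

/* Fake spawn for demonstration: runs the "thread" synchronously. */
int spawn_inline(thread_entry entry, void *arg)
{
    return entry(arg);
}

/* Example entry point: expects zeroed TLS, then writes to it. */
int worker(void *arg)
{
    struct tls_data *t = tls_get();
    if (t == NULL || t->err != 0)
        return -1;
    t->err = *(int *)arg;
    return t->err;
}
```

Calling `create_thread_with_tls(spawn_inline, worker, &value)` runs `worker` with a freshly zeroed block, and user code never has to know the trampoline exists.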
Great! Reserving a register to point to the TLS storage is pretty common; on x86 it's %gs, but everything else with enough registers reserves a normal integer register for that purpose. Since the MIPS ABI defines k0 and k1 as being for kernel use (i.e. volatile in user mode; typically they're used for the TLB reload trap handler, which we don't need to worry about), it makes sense that k0 would be reused for the TLS block pointer. (It seems that SGI/MIPS really dropped the ball when they last redefined the MIPS ABI, and didn't define a standard register for TLS usage, even though it was about the same time the other architecture ABIs were being extended for TLS. This means that Sony may have made something up from scratch.)
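In C terms, the reserved-register scheme boils down to "base pointer plus constant offset". The sketch below shows the idea with a global pointer playing the part of the register; `tls_block`, `tls_base`, `set_tls_base`, `TLS_VAR` and `bump_counter` are all hypothetical names for illustration, and a real toolchain would emit the offset arithmetic itself rather than go through a macro.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of one thread's TLS block; offsets are fixed at build time. */
struct tls_block {
    int      err;
    uint32_t counter;
};

/* Stand-in for the reserved register ($k0 in this discussion, %gs on x86). */
static unsigned char *tls_base;

/* What a context switch would do: point the "register" at another block. */
void set_tls_base(struct tls_block *block)
{
    tls_base = (unsigned char *)block;
}

/* Access is base + constant offset, roughly what generated code would do. */
#define TLS_VAR(type, field) \
    (*(type *)(tls_base + offsetof(struct tls_block, field)))

/* Example of a function using a thread-local variable through the base. */
uint32_t bump_counter(void)
{
    return ++TLS_VAR(uint32_t, counter);
}
```

The code in `bump_counter` is identical for every thread; only the base changes, so each thread increments its own counter.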
http://people.redhat.com/drepper/tls.pdf describes how the TLS stuff works in general, though a lot of it is about how to make it work with dynamic libraries. But all the stuff which relates to static linking should be relevant, and there's plenty of documentation about how other architectures have implemented TLS; it should be useful as a template to either work out what Sony have done, or design something new.
http://www.linux-mips.org/wiki/NPTL is the linux-mips design of TLS for MIPS, for use with NPTL. I don't know whether this patch http://sourceware.org/ml/binutils/2005-02/msg00607.html or something like it has made it into binutils yet, or if so, whether it will work in Sony's little playground. I suspect not, because they've decided there are no available registers, and so they use some other kernel-based mechanism for getting a thread-specific pointer. Using k0 is much simpler.
TyRaNiD wrote: Digging even further (i.e. breaking on thread start) seems we lose 320 bytes of stack in user mode threads before we even have a chance to get our grubby mitts on it, 256 bytes seems to be allocated to $k0 and $fp takes the other 64 bytes. In $fp we have the instruction (syscall #0x2070) which is also the target of the default $ra so that is obviously the thread return, the rest seemed to not get used.
Is it also 64-byte aligned? If so, it's probably just using its own cache line to prevent I and D cache aliasing...
TyRaNiD wrote: If you have psplink and you fancy looking at this then just use the thctx command to dump the context of a thread and look what k0 is set to in each case.
I haven't managed to get psplink running anything yet; I can get it started, but the PSP just crashes when I actually try to run anything...