I am trying to replace the sqrt() function using the codegen.h approach and have written the following, but it has a bug. I am new to inline assembler and can't get the following to work (it just returns the value I pass it).
Actually, since you're trying to use the VFPU, I wonder if you can even use the general registers for this... Something like this should work: (note that VFPU is supported in the current binutils, so no need for codegen)
Can't get anywhere with this. I'm on the September '05 toolchain and SDK, so the VFPU stuff is not built in and I had to stick with the codegen.h approach. Unfortunately there is no mtv/mfv, and I spent some time trying to add those opcodes to little avail.
Then I decided to update my toolchain and grabbed the latest "famous" version from oopo.net. I wrestled with that for a bit, only to find that the VFPU support appears to be only in the SVN version so far.
I've never been able to checkout stuff from svn.ps2dev.net - it keeps saying that the hostname is invalid. I tried punching TCP/3690 open on my router but can't get to it.
I tried the beta toolchain from oopo.net, but it failed during the install, so I went back to the latest "famous" version.
Is there a userid/pw that is needed for svn.pspdev.org?
No userid or password is required for svn.ps2dev.org. Please make sure you have the correct URL. In your last message you refer to it as svn.ps2dev.net and svn.pspdev.org, which are both wrong.
If you have a restrictive firewall you might not be able to access the server. Let me know if you have any more problems. I believe someone did create a mirror of the Subversion repository that can be accessed through HTTP. You should be able to find it on the forums via search.
Well, I still can't get into svn, but I was able to get my sqrt(), sin(), cos(), etc. over to the VFPU and got a very nice speed-up, so thank you to everyone who helped with this!!
These are probably done in newlib or elsewhere already, but if you have the latest toolchain you should be able to implement your own math functions that are MUCH faster. Note that you have to OR PSP_THREAD_ATTR_VFPU into your PSP_MAIN_THREAD_ATTR() definition in main.c (i.e. add "| PSP_THREAD_ATTR_VFPU").
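For anyone unsure where that flag goes, here is a minimal sketch of the top of a main.c, assuming a user-mode app built against PSPSDK. The module name "vfpu_math" and the version numbers are placeholders, not anything from this thread:

```c
#include <pspkernel.h>

/* Module name and version are placeholders; use your own. */
PSP_MODULE_INFO("vfpu_math", 0, 1, 1);

/* OR the VFPU flag into the main thread attributes so the kernel
   enables the VFPU for this thread and saves/restores its context.
   Without it, VFPU instructions in the main thread typically raise
   an exception. */
PSP_MAIN_THREAD_ATTR(PSP_THREAD_ATTR_USER | PSP_THREAD_ATTR_VFPU);
```

Any extra threads you create that touch the VFPU need the same attribute passed when they are created.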
I do however think most of the time is spent loading the matrix, so you should perhaps change your code to multiply more than one vertex per call (for example, by passing an array of vertices). The extra overhead when you only need to multiply a single vertex is minimal compared to reloading the matrix for every vertex of a large batch. And don't forget that you can let the GE do all of this work if you just intend to render it. :)
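To make the batching idea concrete, here is a rough sketch in plain C (not VFPU assembler). The Vec4/Mat4 types and the transform_vertices() name are made up for illustration. The point is only the shape of the interface: the matrix is set up once per call and then reused for every vertex, which is where a VFPU version would load its matrix registers a single time.

```c
#include <stddef.h>

/* Hypothetical types for illustration; not from PSPSDK. */
typedef struct { float x, y, z, w; } Vec4;
typedef struct { float m[4][4]; } Mat4;   /* row-major: m[row][col] */

/* Transform `count` vertices by one matrix in a single call.
   The matrix is fetched once, outside the loop; a VFPU version
   would load it into matrix registers here and keep it there. */
void transform_vertices(const Mat4 *mat, const Vec4 *in, Vec4 *out, size_t count)
{
    const float (*m)[4] = mat->m;
    size_t i;

    for (i = 0; i < count; ++i) {
        const Vec4 v = in[i];   /* local copy, so in == out is safe */
        out[i].x = m[0][0]*v.x + m[0][1]*v.y + m[0][2]*v.z + m[0][3]*v.w;
        out[i].y = m[1][0]*v.x + m[1][1]*v.y + m[1][2]*v.z + m[1][3]*v.w;
        out[i].z = m[2][0]*v.x + m[2][1]*v.y + m[2][2]*v.z + m[2][3]*v.w;
        out[i].w = m[3][0]*v.x + m[3][1]*v.y + m[3][2]*v.z + m[3][3]*v.w;
    }
}
```

A single vertex is then just a call with count set to 1, so you only need the one routine.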
Here's a much faster atan2 routine. This isn't in VFPU format (yet), but I thought I would post it for now and update later. This is about 10 times faster than the default atan2 routine. PLEASE NOTE: this is a low-order approximation and should only be used when you need precision to a few digits. I use this 50 to 100 times per frame in some levels, so it was a big boost for me (8 fps!)
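For readers who want to see what a low-order atan2 approximation can look like, here is a sketch of one well-known first-order form. This is not necessarily the routine from the post above; the name fast_atan2 and the constants are my own choices, and this particular variant is only good to within several hundredths of a radian, so check the error against your own needs.

```c
/* First-order atan2 approximation (illustrative sketch only).
   Maps the ratio r = (x - |y|) / (x + |y|) linearly onto the
   angle, then fixes up the sign for the lower half-plane. */
float fast_atan2(float y, float x)
{
    const float PI_4   = 0.78539816f;   /* pi/4   */
    const float PI_3_4 = 2.35619449f;   /* 3*pi/4 */
    /* tiny offset keeps the divisions below away from 0/0 */
    float abs_y = (y < 0.0f ? -y : y) + 1e-10f;
    float r, angle;

    if (x >= 0.0f) {
        r = (x - abs_y) / (x + abs_y);
        angle = PI_4 - PI_4 * r;        /* right half-plane */
    } else {
        r = (x + abs_y) / (abs_y - x);
        angle = PI_3_4 - PI_4 * r;      /* left half-plane */
    }
    return (y < 0.0f) ? -angle : angle;
}
```

The result is in radians in the range -pi to pi, same as the standard atan2. It is exact on the axes and diagonals and drifts in between.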
Since the VFPU has an asin() in silicon and there is a known identity between atan and asin this could be done other ways, and could be done to higher order.
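For reference, the identity in question, written out as I understand it (standard trig, nothing PSP-specific):

$$
\arctan(t) = \arcsin\!\left(\frac{t}{\sqrt{1 + t^{2}}}\right)
$$

and so, for the right half-plane ($x > 0$),

$$
\operatorname{atan2}(y, x) = \arcsin\!\left(\frac{y}{\sqrt{x^{2} + y^{2}}}\right).
$$

For $x < 0$ the result has to be reflected: $\pi - \arcsin(\cdot)$ when $y \ge 0$ and $-\pi - \arcsin(\cdot)$ when $y < 0$. So an asin-based version costs one square root, one divide and one asin, plus a quadrant fix-up.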
I'm personally hoping chp has a fancy matrix approach for this one :)