go through it. Since I found the results rather informative, I decided to share them here. Maybe some discussion will come of this: if you have questions about the results, ask, and if you have other information, share it.
Everything was benched with the PSP at its default speed, i.e. 222 MHz, so the ops/µs figures will increase by about 50% (333/222 = 1.5) when the PSP is set to 333 MHz. Tick counts won't change though (tested), so they are reliable. The results also include any latencies induced, so interleaving costly ops with other, independent ops might decrease the real tick cost somewhat.
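To make the relation between the two columns explicit, here is a minimal sketch of how ticks/op translates into ops/µs at a given clock. The function name ops_per_us is just something I made up for illustration, and it assumes the op issues back to back with nothing else in between:

```c
#include <assert.h>

/* Throughput in ops per microsecond for an op that costs ticks_per_op
   cycles on a CPU clocked at clock_mhz (1 MHz = 1 cycle per microsecond).
   Hypothetical helper, not part of any SDK. */
static double ops_per_us(double clock_mhz, double ticks_per_op)
{
    assert(ticks_per_op > 0.0);
    return clock_mhz / ticks_per_op;
}
```

For a 4-tick op this gives 55.5 ops/µs at 222 MHz and 83.25 ops/µs at 333 MHz, i.e. a factor of 333/222 = 1.5 between the two clock settings while the tick count stays the same.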
UPDATE: I added the ulv.q and usv.q ops for unaligned loads/stores.
UPDATE: Added exec cost/latencies for interleaving VFPU with MIPS code. The latency measures how many ticks' worth of MIPS code you can 'hide' after the VFPU op.
OP            ops/µs  ticks/op  exec/latency
vadd.q        ~220    1         1/0
vsub.q        ~220    1         1/0
vdot.q        ~220    1         1/0
vmul.q        ~220    1         1/0
vhdp.q        ~220    1         1/0
vdiv.q        ~4      56        14/42
vmmul.q       ~14     16        1/15
vmin.q        ~220    1         1/0
vmax.q        ~220    1         1/0
vabs.q        ~220    1         1/0
vneg.q        ~220    1         1/0
vidt.q        ~77     3         1/2
vzero.q       ~77     3         1/2
vone.q        ~77     3         1/2
vrcp.q        ~56     4         1/3
vrsq.q        ~56     4         1/3
vsin.q        ~56     4         1/3
vcos.q        ~56     4         1/3
vexp2.q       ~56     4         1/3
vlog2.q       ~56     4         1/3
vsqrt.q       ~56     4         1/3
vasin.q       ~56     4         1/3
vnrcp.q       ~56     4         1/3
vnsin.q       ~56     4         1/3
vrexp2.q      ~56     4         1/3
vi2uc.q       ~220    1         1/0
vi2s.q        ~220    1         1/0
vsgn.q        ~220    1         1/0
vcst.q        ~220    1         1/0
vf2in.q       ~220    1         1/0
vi2f.q        ~220    1         1/0
vhtfm4.q      ~56     4         1/3
vtfm4.q       ~56     4         1/3
vmidt.q       ~19     12        1/11
vmzero.q      ~19     12        1/11
lv.q(cache)   ~219    1         1/0
lv.q(mem)     ~4      68
ulv.q(cache)  ~109    2         2/0
ulv.q(mem)    ~4      68
sv.q(cache)   ~32     7         5/2
sv.q(mem)     ~2      111
usv.q(cache)  ~16     14        10/4
usv.q(mem)    ~2      111
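One way to read the exec/latency column: the exec part is always paid, and the latency part can overlap with independent MIPS code that follows the op. Below is a rough sketch of that reading. It is only my assumed model (the function interleaved_ticks is hypothetical, and real pipeline behaviour may differ), not something taken from the measurements themselves:

```c
#include <assert.h>

/* Estimated ticks for one VFPU op followed by independent MIPS work.
   exec:       ticks the op itself occupies (first number in exec/latency)
   latency:    ticks until its result is ready (second number)
   mips_ticks: ticks of independent MIPS code placed right after the op
   Assumed model: the MIPS work overlaps the latency window, so only
   the part that exceeds the latency adds to the total. */
static unsigned interleaved_ticks(unsigned exec, unsigned latency,
                                  unsigned mips_ticks)
{
    unsigned tail;

    assert(exec > 0); /* every op in the table occupies at least one tick */
    tail = mips_ticks > latency ? mips_ticks : latency;
    return exec + tail;
}
```

With the vdiv.q numbers (14/42), interleaved_ticks(14, 42, 0) gives the plain 56 ticks, interleaved_ticks(14, 42, 30) still gives 56 (the 30 ticks of MIPS code come for free), and interleaved_ticks(14, 42, 50) gives 64, since only 42 of the 50 MIPS ticks fit into the latency window.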
If I find time, I may also bench the triple, pair and single ops for comparison. A comparison with the MIPS counterpart ops might also be useful (esp. for vdiv and vmmul, where it's not clear whether the VFPU is really faster).
NOTE: If I missed something important, please LMK. I'm basing these results on my current knowledge of op tick costs and latencies, which might not be 100% correct, so these results come with no warranty :P