SNES9X is frustrating me.
-
- Posts: 35
- Joined: Wed May 04, 2005 4:48 pm
I'm just confused about how something with 333 MHz of processing power can only run an SNES game at an average of 20 fps. Can someone enlighten me as to whether there is a reason for this, or whether I am doing something terribly wrong?
Not with a bang, but a whisper.
This is the way the world ends.
Several reasons. The SNES has lots of very custom hardware that runs in parallel. Moreover, when you soft-emulate a platform (that is, without dynamic recompilation), you need up to 10 times the power of the original machine. Add that there's probably no optimisation yet and that the toolkit isn't finished, and there is a lot left to optimise and tweak to make it run smoothly.
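To make the "up to 10 times" point concrete, here is a minimal sketch of an interpretive (non-recompiling) CPU core. It is a toy machine with three made-up opcodes, not the 65816 and not snes9x's code; the names `ToyCpu`, `toy_run` and the `OP_*` values are all hypothetical. Each emulated instruction costs a fetch, a bounds check, a decode branch and some bookkeeping on the host, and that is before any video or sound hardware is emulated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy instruction set, invented for this example (NOT the 65816). */
enum { OP_HALT = 0x00, OP_LOAD_IMM = 0x01, OP_ADD_IMM = 0x02 };

typedef struct {
    uint8_t  acc;     /* accumulator */
    uint16_t pc;      /* program counter */
    uint32_t cycles;  /* emulated cycles, needed to stay in sync with other chips */
} ToyCpu;

/* Classic fetch/decode/execute loop: several host operations per guest instruction. */
void toy_run(ToyCpu *cpu, const uint8_t *mem, size_t mem_size)
{
    for (;;) {
        assert(cpu->pc < mem_size);
        uint8_t opcode = mem[cpu->pc++];            /* fetch */
        switch (opcode) {                           /* decode */
        case OP_LOAD_IMM:                           /* execute */
            assert(cpu->pc < mem_size);
            cpu->acc = mem[cpu->pc++];
            cpu->cycles += 2;
            break;
        case OP_ADD_IMM:
            assert(cpu->pc < mem_size);
            cpu->acc = (uint8_t)(cpu->acc + mem[cpu->pc++]);
            cpu->cycles += 2;
            break;
        case OP_HALT:
        default:
            return;
        }
    }
}
```

Running the program `{ OP_LOAD_IMM, 5, OP_ADD_IMM, 7, OP_HALT }` through `toy_run` leaves 12 in the accumulator after 4 emulated cycles. A dynamic recompiler would instead translate such a sequence into native code once and skip the per-instruction decode.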
pixel: A mischievous magical spirit associated with screen displays. The computer industry has frequently borrowed from mythology. Witness the sprites in computer graphics, the demons in artificial intelligence and the trolls in the marketing department.
-
- Posts: 35
- Joined: Wed May 04, 2005 4:48 pm
I'm wondering if snes9x is rendering 32-bit color depth frames all the time instead of a lower depth (like 8-bit?). I noticed that the Doom source for PSP uses 8-bit color depth in its rendering code :) (correct me if I'm wrong).
Transparency is what really needs optimising, turn that off and get a massive increase in performance.
-
- Posts: 1
- Joined: Sat Jun 18, 2005 11:53 am
- Location: Vancouver BC, Canada
Hi guys,
I am not sure this thread needs to turn into a SNES emu development thread :-)
I bet it's the wrong place to do so. Any admin, please correct me if I am wrong.
Anyway, if people are interested in tuning this stuff, either they join the project, we tell them which parts of the project to work on (there are a lot of people and a lot of tasks), and they do their homework (read the SNES specs; know the SNES CPU, GPU and APU; understand PSP MIPS assembly, and so on), or they just wait and say nothing.
If the point is to fill the thread with empty theory, just use your friend Google and get the information as every member of our project should (read the snes9x source code and everything that implies).
Progress comes from self-involvement, not from relying on others to get the answers to problems.
Anyway, motivated people, keep up the good work.
See you.
Right now there's barely any hardware being used.
This emulator is so very early and everyone expects full speed off the bat.
Be patient.
Let's track emulators in PSP Dev:
Mirakichi releases the initial GB emulator, only having 1x mode and being slow
a 2x mode is enabled in it, but that cuts the speed down
the source is released and others begin adding their own modes, including a 1.5x filtered mode that used a different Bitblt and ran faster
repeat on and on, with many more people finding ways to speed up the display method.
Move on to ruka's NesterJ. This emulator will quite possibly be key for others as far as speeding up goes, as ruka is the first author to use ASM in a PSP program. He already noted that simply switching to ASM increased speed by about 20%, and that was using it for only one function: drawing graphics.
I'm glad 1.5 can use homebrew now; I hope many 1.5 devs jump aboard and help out.
But for God's sake give up on your false "full speed SNES emulation tomorrow!!!!11" dreams right now.
People are working on it and they all have the same reachable goal.
Just give them space and time while you enjoy your Christmas present
333 MHz is a challenge if you go fully software.
No, the versions that are out there at the moment do not use the hardware;
please look around and logically deduce that nobody knows how to use the 3D hardware on the PSP yet.
Now even if we could do it (hardware acceleration) -> fine.
But how about:
- mode 7 (you can rotate a polygon in hardware, but not a bank of tiles...)
- windowing.
- mosaic mode.
As I said in a previous post, people don't look at the code, don't know the architecture of the SNES, know nothing about assembly or JIT, and say:
how can this be that slow?
or
with this and that, it should be possible...
Yes, it IS possible, but it requires a good amount of creativity, energy and technical knowledge to make it happen.
Look at Snes9x for GP32: you will see that the tuning had a really strong impact on performance. But this has a cost in terms of commitment and free time; ask the people who worked on it.
(Greets to yoyoFr here; you were great, I am ashamed I could not do more.)
Snes9x on PC took several years to bear the fruit we see today. For the PSP, it has been only several months so far and we already have fully functional emulators. That is good.
Consider the amount of computation for graphics processing: 128 simultaneous sprites, each up to 64*64 in size, is 512k pixels in total, each processed (transparent or not) at 60 fps. That needs about 30M integer operations per second.
With mode 7, the background is 256*224 pixels, each multiplied by a 2*2 matrix (6 floating point operations), also processed at 60 fps; that is about 20M flops.
Other graphic effects such as semi-transparency and pixel priority each need on the order of 10M operations per second. Considering the overhead and the 65816 emulation, a fully software method probably still could not reach full speed at 333 MHz (please correct me if I'm wrong). To get full speed, I am afraid we have to wait for the day we know how to correctly use the PSP's graphics hardware.
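The arithmetic behind those two figures can be written out as a small sketch. The sprite count, sizes, frame rate and the "6 operations per mode 7 pixel" are the numbers from the post above; "one operation per sprite pixel" is what that estimate implies. The function names are made up for this example, and the result is a rough worst-case count, not a measurement.

```c
#include <assert.h>
#include <stdint.h>

/* Operations per second for a layer: pixels touched per frame,
   times the assumed cost per pixel, times the frame rate. */
uint64_t ops_per_second(uint32_t pixels_per_frame, uint32_t ops_per_pixel, uint32_t fps)
{
    assert(ops_per_pixel > 0 && fps > 0);
    return (uint64_t)pixels_per_frame * ops_per_pixel * fps;
}

/* 128 sprites of 64x64 pixels, 1 operation per pixel, 60 fps. */
uint64_t sprite_ops_per_second(void)
{
    return ops_per_second(128u * 64u * 64u, 1u, 60u);   /* 31,457,280 */
}

/* 256x224 background, 6 operations per pixel (2x2 matrix), 60 fps. */
uint64_t mode7_ops_per_second(void)
{
    return ops_per_second(256u * 224u, 6u, 60u);        /* 20,643,840 */
}
```

So "30M" and "20M" are the rounded values of 31,457,280 and 20,643,840 under those assumptions.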
-
- Posts: 16
- Joined: Wed Jun 01, 2005 8:55 am
konfig>
konfig wrote: background size is 256*224 in pixel, each multiplied by a 2*2 matrix(6 floating point operations), also processed at 60 fps, that is about 20m flops.
It seems you really know what you are talking about, right?
Try to do something smarter than a matrix computation per pixel.
Floating point operations? Why not a Cray computer to draw a line...
I can do what you are talking about with 3 ADD, 2 SHIFT and 1 ARRAY INDEX access per pixel inside the inner loop, using 32-BIT INTEGERS; otherwise software like Duke Nukem would never have worked on a 33 MHz 486 PC. On ARM this would be 3 to 4 instructions per pixel (ADD and SHIFT can be combined).
Still, your assumptions are also wrong, because we need to convert the texture coordinate into a TILE coordinate (tiles have H and V flip flags).
It's not a simple texture mapping, so it probably costs some AND masks per pixel (and, in the worst case, maybe a switch to handle the flip cases).
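A minimal sketch of that texture-to-tile conversion, to show where the AND masks come from. Everything about the data layout here is an assumption for illustration: 8x8 tiles stored unpacked at one byte per pixel (real SNES tiles are bitplane-packed), a 16-bit map entry with the tile number in the low 10 bits, bit 14 as the H flip flag and bit 15 as the V flip flag. The name `fetch_tiled_pixel` is made up. The flip is done with an XOR rather than a switch.

```c
#include <assert.h>
#include <stdint.h>

#define TILE_SIZE       8u
#define TILE_INDEX_MASK 0x03FFu   /* assumed: tile number in bits 0-9 */
#define FLIP_H          0x4000u   /* assumed: bit 14 = horizontal flip */
#define FLIP_V          0x8000u   /* assumed: bit 15 = vertical flip */

/* Look up the pixel at layer coordinate (x, y) in a tiled background.
   map:   one 16-bit entry per tile, map_width_tiles entries per row
   tiles: unpacked 8x8 tiles, 64 bytes each (hypothetical storage) */
uint8_t fetch_tiled_pixel(const uint16_t *map, unsigned map_width_tiles,
                          const uint8_t *tiles, unsigned x, unsigned y)
{
    assert(map_width_tiles > 0);
    uint16_t entry = map[(y / TILE_SIZE) * map_width_tiles + (x / TILE_SIZE)];
    unsigned px = x & (TILE_SIZE - 1u);   /* position inside the tile */
    unsigned py = y & (TILE_SIZE - 1u);
    if (entry & FLIP_H) px ^= (TILE_SIZE - 1u);   /* 0..7 becomes 7..0 */
    if (entry & FLIP_V) py ^= (TILE_SIZE - 1u);
    return tiles[(entry & TILE_INDEX_MASK) * (TILE_SIZE * TILE_SIZE)
                 + py * TILE_SIZE + px];
}
```

Compared with a flat texture lookup, each pixel pays an extra map read, two masks and up to two conditional XORs, which is the extra cost the post is pointing at.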
konfig wrote: Snes9x on pc took several years to achieve the fruit today. While for psp, it has been only several months till now and we already have fully functional emulators. It is good.
Making is one thing, porting is another. It took us a few hours to port the PC code to the PSP. Yes, it took years to reach a good level in emulators, because detecting issues in hardware simulation is a very difficult task.
The guys who worked on Zsnes and Snes9x are just incredible.
We must really thank them for their hard work (especially given snes9x's philosophy of having no assembly in core functions, for portability; I believe it's smarter than reinventing the wheel each time).
Having functional emulators on a PSP is not a miracle. It's just a matter of modifying the VRAM access, malloc, sound access and keypad code, and compiling again, if the source is 100% C.
I am glad many people are doing so, but it is not a technical challenge.
The challenge is to get the PSP to run most of the emulators at full speed.
That is a real challenge.
We have enough RAM to start deploying techniques like JIT, to use graphics acceleration when possible, to write more complex and tuned algorithms, and to use precalculation.
We can also rewrite in assembly what cannot be optimized using new data structures or algorithms.
laxer3a wrote: Its seems you really know what you are talking about right ?
Try to do something smarter than a matrix computation per pixel.
Floating point operation ? and why not a Cray computer to draw a line...
I can do what you are talking about with 3 ADD, 2 SHIFT , 1 ARRAY INDEX access per pixel inside the inner loop, else software like duke-nukem would have never worked on 486 PC at 33 Mhz. And using 32 BIT INTEGERS. On ARM this would be 3 to 4 instruction per pixel.(ADD and SHIFT can be combined together)
I only read the hardware document and did not read the source code.
But I am interested in these optimization methods.
3 ADD, 2 SHIFT, 1 ARRAY INDEX access per pixel: it seems you precompute the values, store them in a table, and use integers instead of floating point (the closest value; a bigger table gives more precise values). Trading space for time.
This helps me understand how a SNES emulator can run smoothly on a 1xx MHz x86. Thanks.
> konfig
I precompute nothing...
What you do is compute the screen -> texture coordinate for the first pixel at the beginning of each scanline (this could be optimized too, actually), and then use the inverse matrix to find the next pixel inside the texture each time you move one pixel to the right on the screen.
As going from one pixel to the next horizontally on the screen is the vector [1 0], multiplying it by the inverse matrix gives you a constant translation vector inside the texture for each successive pixel on the screen...
To avoid floating point, you can use integers and fixed point calculation
(and then tolerate a given loss of precision, which is still enough for graphics rendering).
(Google is your friend... I gave you a lot of keywords here; do your homework: "texture mapping", "fixed point", etc.)
So your inner loop looks like this:
Code:
/* Parentheses are required: in C, + binds tighter than >> and <<, and & binds looser than all three. */
*screenpixelcolor++ = Texture[((textureX >> BIT_PREC) + (textureY << MYSHIFT)) & MASK];
textureX += vectorX;
textureY += vectorY;
This is a really basic computation for 2D texture mapping (or 3D per scanline with no Z correction), and the texture width must be 2^n, else you have a MUL+SHIFT instead of a SHIFT+AND.
Of course this is "abstract" code; in the SNES emulation you would need to do the texture-to-tile coordinate conversion, handle the flags, handle going outside the texture (BG layer), transparency, etc.
Still, you can optimize to get as close as possible to this ideal inner loop.
By the way, remember that the SNES is 8/16-bit; the main CPU ran at about 3.58 MHz if I am correct. Believe me, floating point was not even a word that the hardware engineers at Nintendo could imagine implementing at that time for this cost and clock speed.
(We sent men to the moon with less computing power than you have in your watch.)
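For anyone who wants to try the inner loop described above, here is one way it could be fleshed out into a compilable function. This is a sketch under stated assumptions, not snes9x code: the texture is square with a power-of-two side (`1 << tex_shift`), stored one byte per pixel, and coordinates carry 8 fractional bits. The names `draw_scanline`, `FRAC_BITS`, `u`, `v`, `du` and `dv` are made up; `du`/`dv` play the role of `vectorX`/`vectorY`, i.e. the screen-to-texture matrix applied to the vector [1 0]. It masks each coordinate separately, so it spends one more shift and mask per pixel than the abstract loop.

```c
#include <assert.h>
#include <stdint.h>

#define FRAC_BITS 8   /* fixed point: 8 fractional bits (assumed precision) */

/* Fill one scanline by stepping through the texture with a constant
   fixed-point step. Unsigned arithmetic is used so that negative steps
   (stored in two's complement) wrap around the texture cleanly. */
void draw_scanline(uint8_t *dst, int width,
                   const uint8_t *texture, unsigned tex_shift,
                   uint32_t u, uint32_t v,     /* start coordinate in the texture */
                   uint32_t du, uint32_t dv)   /* step per screen pixel */
{
    assert(width >= 0 && tex_shift > 0 && tex_shift <= 12);
    const uint32_t mask = (1u << tex_shift) - 1u;
    for (int x = 0; x < width; x++) {
        uint32_t tx = (u >> FRAC_BITS) & mask;      /* integer part, wrapped */
        uint32_t ty = (v >> FRAC_BITS) & mask;
        *dst++ = texture[(ty << tex_shift) + tx];   /* one array access */
        u += du;                                    /* two adds per pixel */
        v += dv;
    }
}
```

With a 4x4 texture whose bytes equal their own index, a start of (0, 0) and a step of (1.0, 0.5), i.e. `du = 256` and `dv = 128`, six pixels come out as 0, 1, 6, 7, 8, 9: the fifth pixel wraps horizontally because of the mask.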
-
- Posts: 3
- Joined: Wed Jun 22, 2005 4:35 am
First post here, but here goes:
I was the original developer of SNEeSe (a long, long time ago). I went through a year of hell understanding how the SNES works, so hopefully that at least proves I know the SNES.
Mode 7 itself is actually very easy in hardware. The awkward part, as far as hardware goes, is what everyone perceives as mode 7 (i.e. F-Zero et al.). These games use DMA to change SNES hardware registers so that the matrix is updated on a per-scanline basis, which produces that lovely perspective effect. It's these HDMA effects that are the biggest problem for hardware rasterisation.
FYI, floating point was available on the Sinclair Spectrum, a machine which existed a long time before the SNES. I'm sure the hardware engineers knew about it, but what they did was far more intelligent.
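To illustrate why the per-scanline matrix updates are awkward for a hardware rasteriser, here is a rough software sketch in which every scanline carries its own matrix. It is a simplification under assumptions, not how SNEeSe or snes9x does it: `Mode7Line` is a hypothetical snapshot of the matrix as it stands on one line, the real mode 7 centre and scroll registers are folded into a single origin, values use 8 fractional bits, and the texture is a flat power-of-two square at one byte per pixel.

```c
#include <assert.h>
#include <stdint.h>

#define FRAC_BITS 8   /* assumed fixed-point precision */

/* Hypothetical per-scanline state, as it would stand after the HDMA
   writes for that line: u = a*x + b*y + x0, v = c*x + d*y + y0. */
typedef struct {
    uint32_t a, b, c, d;   /* 2x2 matrix, fixed point */
    uint32_t x0, y0;       /* texture origin, fixed point */
} Mode7Line;

void render_mode7_frame(uint8_t *dst, int width, int height,
                        const uint8_t *texture, unsigned tex_shift,
                        const Mode7Line *per_line)
{
    assert(width >= 0 && height >= 0 && tex_shift > 0 && tex_shift <= 12);
    const uint32_t mask = (1u << tex_shift) - 1u;
    for (int y = 0; y < height; y++) {
        const Mode7Line *p = &per_line[y];       /* matrix may differ on every line */
        uint32_t u = p->x0 + p->b * (uint32_t)y; /* start of this scanline */
        uint32_t v = p->y0 + p->d * (uint32_t)y;
        for (int x = 0; x < width; x++) {
            uint32_t tx = (u >> FRAC_BITS) & mask;
            uint32_t ty = (v >> FRAC_BITS) & mask;
            *dst++ = texture[(ty << tex_shift) + tx];
            u += p->a;
            v += p->c;
        }
    }
}
```

Because the matrix can change on any line, the layer cannot in general be drawn as one rotated quad; a hardware path would have to split the frame into strips wherever the parameters change, in the worst case one strip per scanline.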
To the developers of the SNES9x emu for PSP. I'd be willing to lend you guys a hand in making it better. I'm an Xbox/Xbox 360/PC developer but I do have a PSP dev kit sitting on the desk next to mine at work and access to all documentation and whatnot ... and I could probably get the PSP dev guy to help out with any major technical issues ...
Just PM me if interested.
Thanks
If I could be any fictional character, I'd be God
- Anonymous