ASM in C program? Include it externally? Some info plz?

sg57
Posts: 144
Joined: Fri Oct 14, 2005 2:26 pm

Post by sg57 »

Just for fun, I want to learn the basics of ASM and have gotten down variable manipulations and thought I'd make a small calculator application to test my knowledge on it.

I don't know enough to make an entire PSP application out of ASM (don't know how to include, start main, program, loop, etc.) so I was just going to use something like:

Code: Select all

while(1) {

     extern "ASM" {
          mul $product,$factor,$factor; 
          // multiply the factor, return into product register
     }
}
I don't know 'all' the functions such as exponents, but I'm learning.

Tell me if this will work or not? Thanks!

(oh and any links to good ASM teaching tutorials would be awesome ;) Raphael, you have any?)
Raphael
Posts: 646
Joined: Tue Jan 17, 2006 4:54 pm
Location: Germany

Post by Raphael »

If you want to know if this works, why not try to compile? The compiler will tell you if it understands what you want it to do (I'd say definitely NO if I were him though...)
Well, if it compiles, I wouldn't run that though... you know... infinite loop and such...

Also I'm not sure you're ready for assembly yet; it's somewhat more complex than C or C++, and seeing how you're still lacking there...
but well, maybe reading the first few chapters of the following wouldn't hurt for you either: http://chortle.ccsu.edu/AssemblyTutoria ... tents.html
<Don't push the river, it flows.>
http://wordpress.fx-world.org - my devblog
http://wiki.fx-world.org - VFPU documentation wiki

Alexander Berl
Kojima
Posts: 275
Joined: Mon Jun 26, 2006 3:49 am

Post by Kojima »

I'm not sure if this applies to C as well (I use C++ exclusively) but this is how you include asm within a C++ program.

Code: Select all


#include <stdio.h>

void AddTest()
{
int num1=5;
int num2=10;
int out=0;
int to = 5;
float r17,r16;
r16=0;
r17=0;
// Reference version in C: add num1*num2 to the total, "to" times.
for(int i=0;i<to;i++)
{
	float res = num1*num2;
	r16 = res;
	r17 = r16+r17;
}
printf("C Res:%f",r17);

num1 =5;
num2 = 10;
out = 0;
to = 5;
// Same loop in MIPS asm: $5 is the counter, $17 the running total.
// mult puts the product in HI/LO, which mflo then reads.
// The hard-coded registers and hi/lo are listed as clobbers so gcc
// does not keep anything of its own in them.
asm __volatile__ ("\n\
ori $5,$0,0\n\
ori $17,$0,0\n\
loop:\n\
mult %1,%2\n\
mflo $16\n\
addu $19,$16,$17\n\
move $17,$19\n\
addiu $5,$5,1\n\
bne $5,%3,loop\n\
nop\n\
move %0,$17\n\
":"=r"(out):"r"(num1),"r"(num2),"r"(to)
:"$5","$16","$17","$19","hi","lo");

	printf("Asm Result:%d \n",out);
}

siberianstar
Posts: 70
Joined: Thu Jun 22, 2006 9:24 pm

Post by siberianstar »

You never have to write an entire application in assembly,
use inline assembly:

asm (
"instruction 1\n"
"instruction 2\n"
: gcc output constraints : gcc input constraints );

for example

float output, op1, op2;
asm (
"mul.s %0, %1, %2\n"
: "=f" (output) : "f" (op1), "f" (op2) );

note that allegrex doesn't have double registers/instructions
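
For anyone who wants something to paste and compile, here is a minimal sketch of the same mul.s example wrapped in a function. The name mul_single is made up for illustration, and the #if guard with its plain C fallback is an assumption added so the file also builds on a non-MIPS PC; it is not something the thread says the PSP toolchain needs.

```c
/* Multiply two single-precision floats.
 * mul_single is a hypothetical name used only for this sketch. */
float mul_single(float op1, float op2)
{
    float output;
#if defined(__mips__) && defined(__mips_hard_float)
    /* %0 is the output operand, %1 and %2 are the two inputs.
     * "=f" and "f" ask gcc to put each value in an FPU register. */
    __asm__ ("mul.s %0, %1, %2\n"
             : "=f" (output)
             : "f" (op1), "f" (op2));
#else
    /* Fallback so the sketch also compiles and runs on a non-MIPS host. */
    output = op1 * op2;
#endif
    return output;
}
```

Because the operands are passed through constraints rather than hard-coded register names, gcc picks the registers itself and nothing needs to be listed as clobbered.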
sg57
Posts: 144
Joined: Fri Oct 14, 2005 2:26 pm

Post by sg57 »

That was what I was aiming for really... I just want to know it in case I want/need to use it somewhere where I'm stuck.

Is there any real place to use ASM rather than C/C++, as in a more efficient place?
TyRaNiD
Posts: 907
Joined: Sun Jan 18, 2004 12:23 am

Post by TyRaNiD »

For the most part, other than utilising instructions the C compiler doesn't normally use, or ones it just cannot use, like the VFPU's, it is rarely necessary to use straight asm.

Take for example byte swapping, such as doing htonl in the network code: you can use the wsbw instruction (or the __builtin_allegrex_wsbw gcc builtin), which gives you endian swapping in one instruction rather than doing all the shifts.

The only other reason to use asm tends to be for size coding, but most people don't care about doing that :)
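
To make "all the shifts" concrete, here is a minimal sketch of a 32-bit byte swap in plain C. The name swap32_shifts is invented for this example; on the PSP the wsbw instruction or the builtin named above would replace the whole function body, but since the builtin's exact prototype is not given in this thread, the sketch does not call it.

```c
#include <stdint.h>

/* Portable 32-bit byte swap using masks and shifts.
 * swap32_shifts is a hypothetical name for this sketch. */
uint32_t swap32_shifts(uint32_t x)
{
    return ((x & 0x000000FFu) << 24) |  /* byte 0 moves to byte 3 */
           ((x & 0x0000FF00u) <<  8) |  /* byte 1 moves to byte 2 */
           ((x & 0x00FF0000u) >>  8) |  /* byte 2 moves to byte 1 */
           ((x & 0xFF000000u) >> 24);   /* byte 3 moves to byte 0 */
}
```

Four masks, four shifts and three ORs is what the single instruction saves, which is why this is one of the few places where reaching below C pays off.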
sg57
Posts: 144
Joined: Fri Oct 14, 2005 2:26 pm

Post by sg57 »

So, there's no real need to learn all of ASM, but only in those areas you specified? (I don't care about sizing either)
Kojima
Posts: 275
Joined: Mon Jun 26, 2006 3:49 am

Post by Kojima »

Hmm, what about speed? Surely a maths intensive, logic intensive loop can be done at a faster pace in asm if you know what you're doing? (I don't :) )
dsn
Posts: 47
Joined: Wed Nov 09, 2005 11:48 am
Location: Indianapolis, Indiana, USA

Post by dsn »

Modern compilers are better than most programmers at optimizing for speed. There's some debate on this of course--and a number of counter-examples--but the level of skill required to write faster assembly than a compiler is beyond the reach of most programmers. Learning assembly can give you a better understanding of how computer systems work, but the days of using it to write entire programs are long gone.
sg57
Posts: 144
Joined: Fri Oct 14, 2005 2:26 pm

Post by sg57 »

I can see how. Assembly is pretty low-level, right?

C is a pretty medium-level language, while Python and Lua are way up there in the high classes.

I still can't believe that is true <astonished> How long ago were developers sitting there, looking at a bunch of symbols and numbers, a few real strings here and there, then all of a sudden, someone decided to just create a new language to replicate (?) ASM but more 'readable' if you will...

Seems like I need a history lesson in computer programming <sobbing at lack of knowledge>

Anyone know of some good 'history' lessons out there? I'd love to learn more about this so-called 'the days of writing whole programs in ASM are long gone'. Just seems a little creepy how fast this society's technology is growing...

(I guess that hover board coming out in 2016 should come out a tad sooner eh? :))
Djakku
Posts: 45
Joined: Mon Jan 30, 2006 2:41 am

Post by Djakku »

AFAIK, asm is platform dependent; that means the program you write has to be understood by the processor, and porting an ASM app is a real pain. On the other hand it is faster..
Most of the demos in the PSP scene were coded in assembly. I've always thought that ASM was indeed harder to learn but more efficient and stable than C, but since everyone now recommends C, I'm confused. Either way I'm not a coder (yet.. :) ) so I could be wrong..
Good luck with your app dude :)
groepaz
Posts: 305
Joined: Thu Sep 01, 2005 7:44 am

Post by groepaz »

i doubt most of the demos are written in asm :=P
skistovel
Posts: 14
Joined: Mon Jul 17, 2006 11:46 pm

Post by skistovel »

Assembly still has a place, but it is usually inline in a C/C++ program or in the demo scene!

Usually, after you finish debugging and are near a release version, you can start optimizing your code by replacing often-used functions with assembly. Of course you do that after you have checked how your compiler converts them to assembly and whether it took a "longer" approach.
On the other hand, coding right in the first place avoids such problems, for example by not breaking an algorithm into 3 parts when it can be done in one line. The problem is that there are different optimizations you can perform, but they vary with the compiler you use and the hardware architecture you are coding for.

My advice? Stick to C/C++, master it and then start looking for "extra" optimizations!
TyRaNiD
Posts: 907
Joined: Sun Jan 18, 2004 12:23 am

Post by TyRaNiD »

Just to prove a point that ASM coding isn't totally dead, I've uploaded my mini fire demo to the main ps2dev.org page. Have fun working out my weirdo asm code :)
Saotome
Posts: 182
Joined: Sat Apr 03, 2004 3:45 am

Post by Saotome »

Very nice. Good example of how to make eboots as small as possible, I like that :)

Not a good solution for 2.00+ firmwares, because of the hardcoded syscalls, but cool anyway ;)
infj
TyRaNiD
Posts: 907
Joined: Sun Jan 18, 2004 12:23 am

Post by TyRaNiD »

True, but I think the technique would work for up to 2.5 as I think 2.6 was where they started randomising syscall numbers, could be wrong mind.

If people really wanted it I could add proper syscalls to the proceedings, not a hard task, just would add 200 bytes or so to the proceeding which for what I was originally writing the code for seems a waste ;)
Fanjita
Posts: 217
Joined: Wed Sep 28, 2005 9:31 am

Post by Fanjita »

TyRaNiD wrote:True, but I think the technique would work for up to 2.5 as I think 2.6 was where they started randomising syscall numbers, could be wrong mind.
For the record, they started at 2.5, although the effect was less obvious - perhaps the size of the random constant was a little smaller or something.
Got a v2.0-v2.80 firmware PSP? Download the eLoader here to run homebrew on it!
The PSP Homebrew Database needs you!
bradskins
Posts: 25
Joined: Tue Dec 20, 2005 5:54 pm

Post by bradskins »

what are the thoughts on solving this randomization thing? Or is psp assembly officially dead at 2.5?

EDIT: nm I have a working solution
...
hlide
Posts: 739
Joined: Sun Sep 10, 2006 2:31 am

Post by hlide »

TyRaNiD wrote:For the most part, other than utilising instructions the C compiler doesn't normally use, or ones it just cannot use, like the VFPU's, it is rarely necessary to use straight asm.

Take for example byte swapping, such as doing htonl in the network code: you can use the wsbw instruction (or the __builtin_allegrex_wsbw gcc builtin), which gives you endian swapping in one instruction rather than doing all the shifts.

The only other reason to use asm tends to be for size coding, but most people don't care about doing that :)
1) Do you know where I could find a complete list of these __builtin_allegrex_* ? I'm on Windows and don't have anywhere to install Linux (a real one or a VMware one). It is quite a nightmare to find the source of psp-gcc.

2) Do these intrinsics include VFPU ones? I suppose not, since it looks as if GCC doesn't clobber VFPU registers.

3) Where can I find details of the asm constraints gcc supports for allegrex?

Yes, I know I ask a lot, but up to now I have been unable to find those answers on the internet :(((
hlide
Posts: 739
Joined: Sun Sep 10, 2006 2:31 am

Post by hlide »

siberianstar wrote: note that allegrex doesn't have double registers/instructions
Are you speaking about them ?

Table 1-10 Extensions to the ISA: Load and Store Instructions
  • OpCode Description
    LD Load Doubleword
    LDL Load Doubleword Left
    LDR Load Doubleword Right
    LL Load Linked
    LLD Load Linked Doubleword
    LWU Load Word Unsigned
    SC Store Conditional
    SCD Store Conditional Doubleword
    SD Store Doubleword
    SDL Store Doubleword Left
    SDR Store Doubleword Right
    SYNC Sync
Table 1-11 Extensions to the ISA: Arithmetic Instructions (ALU Immediate)
  • OpCode Description
    DADDI Doubleword Add Immediate
    DADDIU Doubleword Add Immediate Unsigned
Table 1-12 Extensions to the ISA: Multiply and Divide Instructions
  • OpCode Description
    DMULT Doubleword Multiply
    DMULTU Doubleword Multiply Unsigned
    DDIV Doubleword Divide
    DDIVU Doubleword Divide Unsigned
etc.

By the way, does someone know if ll/sc are available on allegrex ?

EDIT:

I tried this one:


Code: Select all

void enter_critical_section(int *r1)
{
  __asm__ volatile (
    "loop1:\n"
    "ll $v1, (%0)\n"
    "blez $v1, loop1\n"
    //"nop\n" // gcc seems to add it itself
    "addi $v0, $v1, -1\n"
    "sc $v0, (%0)\n"
    "beq $v0, 0, loop1\n"
    //"nop\n" // gcc seems to add it itself
    : : "r"(r1) : "v0", "v1");
}
and added in cellshading.c (pspsdk sample for Gu):

Code: Select all


int cs = 1;

int main(int argc, char* argv[]) {
	/* Setup Homebutton Callbacks */
	setupCallbacks();

	enter_critical_section(&cs);

	/* Generate Torus */
...
After compiling and checking the assembly code to see that enter_critical_section contains the ll/sc pair and is called by main, I ran it successfully on my PSP.

I wonder if LD/LDL/LDR/LLD/LWU/SCD/SD/SDL/SDR/etc. work too.
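
For comparison, the ll/sc loop above amounts to "wait until the counter is positive, then decrement it atomically". Below is a rough portable sketch of that same logic using C11 atomics. The function name enter_critical_section_c11 is invented here, and it assumes a compiler that ships <stdatomic.h>; whether the PSP toolchain discussed in this thread does is not established here.

```c
#include <stdatomic.h>

/* Spin until *counter is positive, then decrement it atomically.
 * Hypothetical C11 counterpart of the ll/sc loop, for illustration only. */
void enter_critical_section_c11(atomic_int *counter)
{
    for (;;) {
        int seen = atomic_load(counter);      /* plays the role of ll */
        if (seen <= 0)
            continue;                         /* busy: retry, like blez back to loop1 */
        /* Stores seen - 1 only if *counter still equals seen,
         * which is the job sc does in the asm version. */
        if (atomic_compare_exchange_weak(counter, &seen, seen - 1))
            return;
    }
}
```

With a counter that starts at 1, as cs does in the sample, the first call returns at once and leaves the counter at 0, so a second caller would spin until something raises it again.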
cheriff
Regular
Posts: 258
Joined: Wed Jun 23, 2004 5:35 pm
Location: Sydney.au

Post by cheriff »

The difference is that all those instructions refer to doublewords, which are simply twice the length of a word (probably 64 bits, since the PSP is a 32-bit machine).
On the PS2 there are instructions dealing with quadwords - 128-bit quantities.

Without looking anything up, I'd hazard a guess that the add, multiply and so on instructions listed above are all integer operations on what would effectively be a long int. (Feel free to add a grain of salt to that one, however)

What neither of the machines support are double precision floats and associated arithmetic. So if you define two doubles and try to add them, the compiler will include all this extra code to deal with the floating point stuff. If you instead added two floats (ie single precision) then this can be done in a single instruction.
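
A small sketch of how this bites in practice, with made-up function names: a floating-point constant without the f suffix is a double in C, so it quietly pulls the whole expression into double precision. On hardware without double support, as described above, that means the extra software code; how much gets generated depends on the compiler and its settings.

```c
/* Stays in single precision: the f suffix makes 0.5f a float constant. */
float halve_single(float x)
{
    return x * 0.5f;
}

/* 0.5 is a double constant, so x is promoted to double, multiplied
 * as a double, and only then narrowed back to float. */
float halve_via_double(float x)
{
    return (float)(x * 0.5);
}
```

Both functions return the same value for ordinary inputs; the difference is only in the arithmetic the compiler has to emit to get there.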

EDIT: After thinking a bit and looking something up, I see they seem to apply to machines with 64 bits in each register (I was assuming you could maybe fuse r2 and r3 together to make 64 bits, like elsewhere). So I guess it all comes down to how wide each GPR is on the PSP. I really shouldn't be posting after 11pm ;)
Damn, I need a decent signature!
hlide
Posts: 739
Joined: Sun Sep 10, 2006 2:31 am

Post by hlide »

well, trying to compile

Code: Select all

asm ("ld $v0,0($a0)");
gives me, to my surprise, after disassembling the produced .o file

Code: Select all

lw v0,0(a0)
lw v1,4(a0)
instead of an expected instruction exception :/
adrahil
Posts: 274
Joined: Thu Mar 16, 2006 1:55 am

Post by adrahil »

Yeah, it's normal since it loads the doubleword into two registers, which each contain a word :)
v0 now contains the 4 lower bytes, and v1 the 4 upper (the PSP is little-endian, so the word at offset 0 is the low half)...
cheriff
Regular
Posts: 258
Joined: Wed Jun 23, 2004 5:35 pm
Location: Sydney.au

Post by cheriff »

cool, so it does split the 64 bits over two 32b registers, I wasn't sure if that was the case... My MIPS has unfortunately been slipping recently and I'm out of practice :(
hlide
Posts: 739
Joined: Sun Sep 10, 2006 2:31 am

Post by hlide »

adrahil wrote:Yeah, it's normal since it loads the doubleword into two registers, which each contain a word :)
v0 now contains the 4 lower bytes, and v1 the 4 upper (the PSP is little-endian, so the word at offset 0 is the low half)...
well, I thought in MIPS64, v0 is 64-bit wide and not 32-bit wide. So a "ld v0,(a0)" was not exactly equivalent to "lw v0,0(a0);lw v1,4(a0)" for me, because v1, being another 64-bit register, should not be modified. Just imagine my x86 assembler transforming "mov eax,[di]" into "mov ax,[di+0]; mov dx,[di+2]" because I want to generate a binary for an 8086: it would be weird and wrong!!! I have the same reaction here!

but if you tell me that in fact a 64-bit register in mips64 is spanning two registers (here, v0 and v1) the same way as "ax" is spanning "ah" and "al", well I can understand its transformation in mips32.
cheriff
Regular
Posts: 258
Joined: Wed Jun 23, 2004 5:35 pm
Location: Sydney.au

Post by cheriff »

hlide wrote: well, I thought in MIPS64, v0 is 64-bit wide and not 32-bit wide.
This is in fact true on a 64-bit platform. From some random Google search on MIPS64:
LD rt, offset(base) Load doubleword. The contents of the 64-bit doubleword at the memory location specified by the effective address are loaded as a signed value and stored in GPR rt.
However, on MIPS32:
ld Rdest, address Load Double-Word Load the 64-bit quantity at address into registers Rdest and Rdest + 1
That is the architectural definition of what the instruction does. Now, the PSP apparently doesn't support the LD opcode (or at least not as far as the assembler is aware), so the assembler instead treats it as a pseudo-op and generates equivalent instructions.

It is assumed that the programmer is aware that LD into v0 will also overwrite v1, as per the definition.
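
As a rough model of that pseudo-op in C: the type and function names below are invented for illustration, and the struct simply stands in for the register pair Rdest and Rdest + 1 (v0 and v1 in the example above).

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for a pair of 32-bit registers (hypothetical type). */
typedef struct {
    uint32_t rdest;       /* word at address + 0, e.g. v0 */
    uint32_t rdest_next;  /* word at address + 4, e.g. v1 */
} reg_pair;

/* Model of "ld Rdest, address" when it is expanded into two word loads. */
reg_pair load_doubleword_as_pair(const void *address)
{
    const unsigned char *p = address;
    reg_pair r;
    memcpy(&r.rdest, p, 4);           /* lw Rdest,   0(address) */
    memcpy(&r.rdest_next, p + 4, 4);  /* lw Rdest+1, 4(address) */
    return r;
}
```

Which half of the 64-bit value ends up in which register depends on byte order: on a little-endian machine the word at offset 0 is the low half.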

Hope this helps somewhat :)
hlide
Posts: 739
Joined: Sun Sep 10, 2006 2:31 am

Post by hlide »

a lot :)
Raphael
Posts: 646
Joined: Tue Jan 17, 2006 4:54 pm
Location: Germany

Post by Raphael »

The PSP is a MIPS32 architecture actually. That's why 64bit ops will be aliased to two 32bit instructions.