Bad code or gcc 3.2.2 compiler -O2 bug?

Posted: Sun May 09, 2004 11:19 am
by t0mb0la
When I compile the code (debug.cpp) shown below with ee-gcc (3.2.2) using option '-O2' the following output is produced:
uld be l
Followed by line two
Line three is the same
So line four has same format
ine one
If I remove the '-O2' option, the text is output as expected. Is there something wrong with the code, or is it my compiler?

Code:

#include <kernel.h>
#include <string.h>

//static char textarray[5][30]; //this has a different effect
char textarray[5][30];

//extern "C" { extern void _init(); } //this makes no difference
void textfill();
void textfunction();

int main (int argc, char **argv[])
{
// _init();
 textfunction();
}

void textfill()
{
 strcpy(textarray[0], "This should be line one");
 strcpy(textarray[1], "Followed by line two");
 strcpy(textarray[2], "Line three is the same");
 strcpy(textarray[3], "So line four has same format");
 strcpy(textarray[4], "Why should five not also?");
}

void textfunction()
{
 int i;

 textfill();
 for (i = 0; i <= 4; i++)
 {
    printf("%s\n", textarray[i]);
 }
}

Posted: Sun May 09, 2004 11:48 am
by mharris
I've seen this before, and my conclusion was that UDP was getting bolloxed when the packets were coming in too fast. With UDP (as opposed to TCP), there's no guarantee of the order in which they're sent or received. It's just a theory, and doing tcpdump shows that the output from printf is indeed arriving out of sequence, but they're all very close together in time.

Interesting that the -O2 flag makes a difference -- maybe the less-optimized code has enough delays between the individual calls to npmPuts() to make a difference, and your client pulls them out in the right order.

I'm assuming you're using ps2link, not naplink... I'm guessing the naplink drivers use their own packet encapsulation rather than IP, but I've never run naplink so I can't be sure. One would think this problem wouldn't exist on naplink because of this.

If you try using naplink, or scr_printf() or similar, does the problem go away in -O2 code?

Posted: Sun May 09, 2004 1:02 pm
by t0mb0la
I'm sure this isn't a UDP issue. I first came across it in a program that uses GSlib to display the text, and the lines are messed up in exactly the same way. Calling strlen on textarray[0] or textarray[4] (in this example) returns the length of what it 'sees'. I extended the loop body to printf("%s [%d]\n", textarray[i], strlen(textarray[i])); and the output is:
uld be l [8]
Followed by line two [20]
Line three is the same [22]
So line four has same format [28]
ine one [7]


I didn't use scr_printf in the example, because I'm building with ps2sdk which no longer has the init_scr/scr_printf functions. iirc, building this with ps2lib produced the same results. -O2 seems to be doing something strange with the first and last items in the array. The last line seems to be taking on the end of the first line.

Posted: Sun May 09, 2004 3:31 pm
by Guest
t0mb0la wrote:
int main (int argc, char **argv[])
There is one bug you should fix and then try again. I don't think
it should matter or cause this problem necessarily, but it's worth
trying.

char **argv[] is an incorrect declaration. It should be one of
the following:

char **argv or char *argv[]


Gorim

Posted: Sun May 09, 2004 4:17 pm
by ooPo
Hooray for munged data.

Notice how it skips 5 characters in line one? I wonder if that may be tied to your having 5 textarray lines? Or looping 5 times... *shrug*

Try printing out the lines before and after you do a strcpy. Try doing them in a different order. See if you can find out if it's failing during the strcpy or during the printf. Maybe use a memcpy instead? Try to output it to the screen somehow if you can, avoiding the network code.

The code compiles and runs fine on my pc, so the code should work fine. :)

Posted: Sun May 09, 2004 5:16 pm
by Guest
One more possibility, and this could very well be your problem:

The newest versions of gcc have a weird optimization where it will
automagically convert printf() calls to puts() calls. This hung me
up on compiling oopo's new beta toolchain.

Compiling your code with the -S flag and viewing asm output
confirms that your printf() is being converted to puts().

Ok, in theory, things should still work. But you can turn off that
annoying attempt at optimization with: -fno-builtin-printf and see
if it goes away.

I can't take credit for knowing this, pixel helped me out last night
when I ran into this problem. :)

Gorim

Posted: Sun May 09, 2004 10:34 pm
by mharris
t0mb0la: can you post the asm code that's generated for textfunction() w/ -O0 and -O2? That should be the ultimate test... Pass the -S option to gcc instead of -c, but you know that already...

I'd try myself, but I don't have the 3.2.2 gcc installed.

Posted: Mon May 10, 2004 5:01 am
by t0mb0la
Ok, here is the asm code generated without -O2 option for the textfill and textfunction:

Code:

$LC0:
	.ascii	"This should be line one\000"
	.align	3
$LC1:
	.ascii	"Followed by line two\000"
	.align	3
$LC2:
	.ascii	"Line three is the same\000"
	.align	3
$LC3:
	.ascii	"So line four has same format\000"
	.align	3
$LC4:
	.ascii	"Why should five not also?\000"
	.text
	.align	2
	.globl	_Z8textfillv
	.ent	_Z8textfillv
_Z8textfillv:
	.frame	$fp,32,$31		# vars= 0, regs= 4/0, args= 0, extra= 0
	.mask	0xc0000000,-16
	.fmask	0x00000000,0
	subu	$sp,$sp,32
	sd	$31,16($sp)
	sd	$fp,0($sp)
	move	$fp,$sp
	la	$4,textarray
	la	$5,$LC0
	jal	strcpy
	la	$4,textarray+30
	la	$5,$LC1
	jal	strcpy
	la	$4,textarray+60
	la	$5,$LC2
	jal	strcpy
	la	$4,textarray+90
	la	$5,$LC3
	jal	strcpy
	la	$4,textarray+120
	la	$5,$LC4
	jal	strcpy
	move	$sp,$fp
	ld	$31,16($sp)
	ld	$fp,0($sp)
	addu	$sp,$sp,32
	j	$31
	.end	_Z8textfillv
$Lfe2:
	.size	_Z8textfillv,$Lfe2-_Z8textfillv
	.rdata
	.align	3
$LC5:
	.ascii	"%s [%d]\n\000"
	.text
	.align	2
	.globl	_Z12textfunctionv
	.ent	_Z12textfunctionv
_Z12textfunctionv:
	.frame	$fp,48,$31		# vars= 16, regs= 4/0, args= 0, extra= 0
	.mask	0xc0000000,-16
	.fmask	0x00000000,0
	subu	$sp,$sp,48
	sd	$31,32($sp)
	sd	$fp,16($sp)
	move	$fp,$sp
	jal	_Z8textfillv
	sw	$0,0($fp)
$L4:
	lw	$2,0($fp)
	slt	$2,$2,5
	bne	$2,$0,$L7
	b	$L3
$L7:
	lw	$3,0($fp)
	li	$2,30			# 0x1e
	mult	$3,$3,$2
	la	$2,textarray
	addu	$2,$3,$2
	move	$4,$2
	jal	strlen
	move	$6,$2
	lw	$3,0($fp)
	li	$2,30			# 0x1e
	mult	$3,$3,$2
	la	$2,textarray
	addu	$2,$3,$2
	la	$4,$LC5
	move	$5,$2
	jal	printf
	lw	$2,0($fp)
	addu	$2,$2,1
	sw	$2,0($fp)
	b	$L4
$L3:
	move	$sp,$fp
	ld	$31,32($sp)
	ld	$fp,16($sp)
	addu	$sp,$sp,48
	j	$31
	.end	_Z12textfunctionv
$Lfe3:
	.size	_Z12textfunctionv,$Lfe3-_Z12textfunctionv
	.ident	"GCC: (GNU) 3.2.2"
And the same functions compiled with the -O2 option produce this, which seems to have attempted to inline strcpy:

Code:

$LC0:
	.ascii	"This should be line one\000"
	.align	3
$LC4:
	.ascii	"Why should five not also?\000"
	.align	3
$LC1:
	.ascii	"Followed by line two\000"
	.align	3
$LC2:
	.ascii	"Line three is the same\000"
	.align	3
$LC3:
	.ascii	"So line four has same format\000"
	.text
	.align	2
	.p2align 3,,7
	.globl	_Z8textfillv
	.ent	_Z8textfillv
_Z8textfillv:
	.frame	$sp,0,$31		# vars= 0, regs= 0/0, args= 0, extra= 0
	.mask	0x00000000,0
	.fmask	0x00000000,0
	lui	$3,%hi($LC0) # high
	lui	$4,%hi($LC4) # high
	addiu	$3,$3,%lo($LC0) # low
	lui	$2,%hi(textarray) # high
	ld	$8,16($3)
	addiu	$2,$2,%lo(textarray) # low
	lq $5,0($3)
	addiu	$4,$4,%lo($LC4) # low
	lui	$3,%hi($LC1) # high
	sd	$8,16($2)
	addiu	$3,$3,%lo($LC1) # low
	sq $5,0($2)
	lhu	$10,24($4)
	addu	$7,$2,120
	lq $6,0($4)
	ld	$9,16($4)
	ldl	$4,7($3)
	ldr	$4,0($3)
	ldl	$5,15($3)
	ldr	$5,8($3)
	lwl	$8,19($3)
	lwr	$8,16($3)
	lb	$11,20($3)
	sdl	$4,37($2)
	sdr	$4,30($2)
	sdl	$5,45($2)
	sdr	$5,38($2)
	swl	$8,49($2)
	swr	$8,46($2)
	sb	$11,50($2)
	lui	$4,%hi($LC2) # high
	addiu	$4,$4,%lo($LC2) # low
	ldl	$3,7($4)
	ldr	$3,0($4)
	ldl	$5,15($4)
	ldr	$5,8($4)
	lw	$8,16($4)
	lh	$11,20($4)
	sdl	$3,67($2)
	sdr	$3,60($2)
	sdl	$5,75($2)
	sdr	$5,68($2)
	sw	$8,76($2)
	sh	$11,80($2)
	lb	$3,22($4)
	sb	$3,82($2)
	lui	$3,%hi($LC3) # high
	addiu	$3,$3,%lo($LC3) # low
	ldl	$4,7($3)
	ldr	$4,0($3)
	ldl	$5,15($3)
	ldr	$5,8($3)
	ldl	$8,23($3)
	ldr	$8,16($3)
	lwl	$11,27($3)
	lwr	$11,24($3)
	sdl	$4,97($2)
	sdr	$4,90($2)
	sdl	$5,105($2)
	sdr	$5,98($2)
	sdl	$8,113($2)
	sdr	$8,106($2)
	swl	$11,117($2)
	swr	$11,114($2)
	lb	$4,28($3)
	sb	$4,118($2)
	sq $6,120($2)
	sh	$10,24($7)
	.set	noreorder
	.set	nomacro
	j	$31
	sd	$9,16($7)
	.set	macro
	.set	reorder

	.end	_Z8textfillv
$Lfe2:
	.size	_Z8textfillv,$Lfe2-_Z8textfillv
	.rdata
	.align	3
$LC5:
	.ascii	"%s [%d]\n\000"
	.text
	.align	2
	.p2align 3,,7
	.globl	_Z12textfunctionv
	.ent	_Z12textfunctionv
_Z12textfunctionv:
	.frame	$sp,64,$31		# vars= 0, regs= 8/0, args= 0, extra= 0
	.mask	0x80070000,-16
	.fmask	0x00000000,0
	subu	$sp,$sp,64
	sd	$18,32($sp)
	sd	$17,16($sp)
	sd	$16,0($sp)
	sd	$31,48($sp)
	.set	noreorder
	.set	nomacro
	jal	_Z8textfillv
	lui	$18,%hi($LC5) # high
	.set	macro
	.set	reorder

	lui	$2,%hi(textarray) # high
	addiu	$16,$2,%lo(textarray) # low
	addu	$17,$16,150
$L8:
	.set	noreorder
	.set	nomacro
	jal	strlen
	move	$4,$16
	.set	macro
	.set	reorder

	addiu	$4,$18,%lo($LC5) # low
	move	$5,$16
	move	$6,$2
	.set	noreorder
	.set	nomacro
	jal	printf
	addu	$16,$16,30
	.set	macro
	.set	reorder

	slt	$3,$16,$17
	.set	noreorder
	.set	nomacro
	bne	$3,$0,$L8
	ld	$31,48($sp)
	.set	macro
	.set	reorder

	ld	$18,32($sp)
	ld	$17,16($sp)
	ld	$16,0($sp)
	#nop
	.set	noreorder
	.set	nomacro
	j	$31
	addu	$sp,$sp,64
	.set	macro
	.set	reorder

	.end	_Z12textfunctionv
$Lfe3:
	.size	_Z12textfunctionv,$Lfe3-_Z12textfunctionv
	.ident	"GCC: (GNU) 3.2.2"

Posted: Mon May 10, 2004 6:05 am
by Guest
Are you compiling as C++? The asm function names look
like they have been mangled. Can you try compiling as C?

Furthermore, the section where it inline copies the first two
strings is definitely hosed as bad code.

I compiled your code earlier today to look at asm output, and
it looked a lot cleaner. I compiled it from my mac, having just
installed the new toolchain last night:

Code:

$LC0:
        .ascii  "This should be line one\000"
        .align  3
$LC1:
        .ascii  "Followed by line two\000"
        .align  3
$LC2:
        .ascii  "Line three is the same\000"
        .align  3
$LC3:
        .ascii  "So line four has same format\000"
        .align  3
$LC4:
        .ascii  "Why should five not also?\000"
        .text
        .align  2
        .p2align 3,,7
        .globl  textfill
        .ent    textfill
textfill:
        .frame  $sp,0,$31               # vars= 0, regs= 0/0, args= 0, extra= 0
        .mask   0x00000000,0
        .fmask  0x00000000,0
        lui     $4,%hi(textarray) # high
        lui     $2,%hi($LC0) # high
        addiu   $5,$4,%lo(textarray) # low
        addiu   $2,$2,%lo($LC0) # low
        ldl     $3,7($2)
        ldr     $3,0($2)
        ldl     $6,15($2)
        ldr     $6,8($2)
        ldl     $7,23($2)
        ldr     $7,16($2)
        sdl     $3,7($5)
        sdr     $3,0($5)
        sdl     $6,15($5)
        sdr     $6,8($5)
        sdl     $7,23($5)
        sdr     $7,16($5)
        lui     $3,%hi($LC1) # high
        addiu   $3,$3,%lo($LC1) # low
        ldl     $2,7($3)
        ldr     $2,0($3)
        ldl     $4,15($3)
        ldr     $4,8($3)
        lwl     $6,19($3)
        lwr     $6,16($3)
        lb      $7,20($3)
        sdl     $2,37($5)
        sdr     $2,30($5)
        sdl     $4,45($5)
        sdr     $4,38($5)
        swl     $6,49($5)
        swr     $6,46($5)
        sb      $7,50($5)
        lui     $2,%hi($LC2) # high
        addiu   $2,$2,%lo($LC2) # low
        ldl     $8,7($2)
        ldr     $8,0($2)
        ldl     $3,15($2)
        ldr     $3,8($2)
        lwl     $4,19($2)
        lwr     $4,16($2)
        lb      $6,20($2)
        sdl     $8,67($5)
        sdr     $8,60($5)
        sdl     $3,75($5)
        sdr     $3,68($5)
        swl     $4,79($5)
        swr     $4,76($5)
        sb      $6,80($5)
        lb      $8,21($2)
        lb      $3,22($2)
        sb      $8,81($5)
        sb      $3,82($5)
        lui     $3,%hi($LC3) # high
        addiu   $3,$3,%lo($LC3) # low
        ldl     $7,7($3)
        ldr     $7,0($3)
        ldl     $8,15($3)
        ldr     $8,8($3)
        ldl     $2,23($3)
        ldr     $2,16($3)
        lwl     $4,27($3)
        lwr     $4,24($3)
        sdl     $7,97($5)
        sdr     $7,90($5)
        sdl     $8,105($5)
        sdr     $8,98($5)
        sdl     $2,113($5)
        sdr     $2,106($5)
        swl     $4,117($5)
        swr     $4,114($5)
        lb      $7,28($3)
        sb      $7,118($5)
        lui     $2,%hi($LC4) # high
        addiu   $2,$2,%lo($LC4) # low
        ldl     $6,7($2)
        ldr     $6,0($2)
        ldl     $7,15($2)
        ldr     $7,8($2)
        ldl     $8,23($2)
        ldr     $8,16($2)
        lb      $3,24($2)
        sdl     $6,127($5)
        sdr     $6,120($5)
        sdl     $7,135($5)
        sdr     $7,128($5)
        sdl     $8,143($5)
        sdr     $8,136($5)
        sb      $3,144($5)
        lb      $6,25($2)
        sb      $6,145($5)
        j       $31
        .end    textfill
$Lfe2:
        .size   textfill,$Lfe2-textfill
        .align  2
        .p2align 3,,7
        .globl  textfunction
        .ent    textfunction
textfunction:
        .frame  $sp,48,$31              # vars= 0, regs= 6/0, args= 0, extra= 0
        .mask   0x80030000,-16
        .fmask  0x00000000,0
        subu    $sp,$sp,48
        sd      $17,16($sp)
        sd      $16,0($sp)
        sd      $31,32($sp)
        jal     textfill
        lui     $2,%hi(textarray) # high
        addiu   $16,$2,%lo(textarray) # low
        addu    $17,$16,150
        move    $4,$16
$L11:
        .set    noreorder
...

Posted: Mon May 10, 2004 6:06 am
by pixel
Try recompiling your sources with the options -O2 -fno-builtin to disable *all* the builtins (including strcpy), to see if it's the fault of the builtin strcpy here.

Posted: Mon May 10, 2004 6:25 am
by Guest
I didn't compile with -fno-builtin and the asm is much nicer :)

Anyhow, the section of his asm code that is really munged is
this:

Code:

_Z8textfillv:
   .frame   $sp,0,$31      # vars= 0, regs= 0/0, args= 0, extra= 0
   .mask   0x00000000,0
   .fmask   0x00000000,0
   lui   $3,%hi($LC0) # high
   lui   $4,%hi($LC4) # high
   addiu   $3,$3,%lo($LC0) # low
   lui   $2,%hi(textarray) # high
   ld   $8,16($3)
   addiu   $2,$2,%lo(textarray) # low
   lq $5,0($3)
   addiu   $4,$4,%lo($LC4) # low
   lui   $3,%hi($LC1) # high
   sd   $8,16($2)
   addiu   $3,$3,%lo($LC1) # low
   sq $5,0($2)
   lhu   $10,24($4)
   addu   $7,$2,120
   lq $6,0($4)
One key thing is that last lq. That data is loaded in the
beginning, but gets stored way further down at the end
(not shown). In fact, it's doing stuff in real funky orders
in this section of code.

Gorim

Posted: Mon May 10, 2004 7:35 am
by t0mb0la
Ok, compiling with -O2 -fno-builtin removes the problem. Does this mean my builtin strcpy is broken?

Posted: Mon May 10, 2004 9:13 am
by mrbrown
Actually, MrHTFord outlined this issue in GCC 3.2.2 (MIPS backend) a few months ago either here or on IRC. There's a bug when GCC picks alignment for block moves, I think it was where it would pick 128-bit alignment for data that was only 64-bit aligned. Lemme see if I can find it...

Hmm, I can't find it here, so we must've talked about it on IRC. Try tracking MrHTFord down, as he knows more details than me.

Posted: Mon May 10, 2004 11:16 am
by Guest
Ok, so why does my brand-spanking newly installed toolchain
not have this problem? (Installed from ooPo's beta toolchain
installer... and I have no other tools installed on this host, so
there can't be any side effects from accidental environmental
references to pre-existing stuff.)

It's pretty bad if the compiler can't be consistent about when
it exhibits this bad output.

FYI, I compiled with -O2 (you can tell from the inlined strcpy
in my output) and did not use -fno-builtin.

Gorim

Posted: Mon May 10, 2004 2:01 pm
by ooPo
It's possible he's using an earlier 3.2.2 patch. I've added stuff from mrhtford to the beta set... maybe it was fixed already?

At the very least, is there a way to override block move alignment and force it to 64-bit?

Posted: Wed Jun 02, 2004 9:48 am
by pixel
I finally found the patch again, and put it here:

http://www.nobis-crew.org/gcc-3.2.2-IOP ... 22.diff.gz

Now I'm going to do some tests with it.