Weird segmentation fault
Mark
2010-09-30 13:59:12 UTC
[Linux 2.6.18, GCC 4.1.2]

I have a really weird segmentation fault in a C program. I have a
line like this:

strcat(stringVariable, "_");

And it crashes with a segmentation fault. All the time. The same
program works fine on Solaris and AIX.

stringVariable is plenty big enough to accommodate an extra character.
(It contains a 7-character string and it is 1K in size.) I have run
it under the debugger to verify this. I have also run it under Purify,
which doesn't tell me anything useful.
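
For illustration, the two facts stated above (the buffer is terminated within its 1K, and one more character fits) can also be checked in the program itself rather than only in the debugger. This is a minimal sketch, not code from the original program: `append_checked` is a hypothetical helper name, and it assumes the caller knows the true capacity of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the original program): appends src to dst
 * only if dst is NUL-terminated within cap bytes and the result still fits.
 * Returns 0 on success, -1 if the append would be unsafe. */
static int append_checked(char *dst, size_t cap, const char *src)
{
    const char *end;
    size_t used;
    size_t need;

    assert(dst != NULL && src != NULL);

    /* Look for the terminator inside the buffer. If there is none,
     * strcat() would scan past the end of dst. */
    end = memchr(dst, '\0', cap);
    if (end == NULL)
        return -1;

    used = (size_t)(end - dst);   /* bytes before the terminator */
    need = strlen(src) + 1;       /* bytes to copy, including '\0' */
    if (need > cap - used)
        return -1;

    memcpy(dst + used, src, need);
    return 0;
}
```

If a call such as `append_checked(stringVariable, sizeof stringVariable, "_")` succeeds where the plain strcat() crashes, that would suggest the pointer or the memory around it is not what the source code implies at that point.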

I know that the compiler is old but it is not my decision to update
it.

Any ideas?
--
(\__/) M.
(='.'=) Due to the amount of spam posted via googlegroups and
(")_(") their inaction to the problem. I am blocking some articles
posted from there. If you wish your postings to be seen by
everyone you will need use a different method of posting.
Peter van Hooft
2010-10-01 04:34:43 UTC
Can you run your program in gdb and post the output of
print stringVariable
just before the strcat() ?
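
If attaching gdb is awkward, roughly the same information can be collected from inside the program just before the call. The following is only a sketch under that assumption; `struct buf_report` and `inspect_buffer` are illustrative names, not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical report type: a summary of what the buffer holds. */
struct buf_report {
    int    terminated;    /* 1 if a '\0' was found within cap bytes */
    size_t length;        /* bytes before the first '\0' (cap if none) */
    size_t trailing_nul;  /* consecutive '\0' bytes starting at the terminator */
};

/* Hypothetical diagnostic: inspects at most cap bytes of buf. */
static struct buf_report inspect_buffer(const char *buf, size_t cap)
{
    struct buf_report r = {0, 0, 0};
    const char *end;
    size_t i;

    assert(buf != NULL);

    r.length = cap;
    end = memchr(buf, '\0', cap);
    if (end == NULL)
        return r;                 /* unterminated: strcat() would overrun */

    r.terminated = 1;
    r.length = (size_t)(end - buf);

    /* Count the run of zero bytes that follows the string. */
    for (i = r.length; i < cap && buf[i] == '\0'; i++)
        r.trailing_nul++;
    return r;
}
```

Logging the three fields together with the buffer's address (for example with `%p`) right before the strcat() gives something to compare between the failing build and a working one.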

peter
Mark
2010-10-01 09:22:34 UTC
Post by Peter van Hooft
Can you run your program in gdb and post the output of
print stringVariable
just before the strcat() ?
I can't remember the exact text but it was as I expected, something
like this:

"DEFAULT", '\0' (repeated 200 times).

Anyway I have found a different machine to compile the program on and
it now works!
Kelsey Bjarnason
2010-10-04 01:44:26 UTC
Post by Mark
I can't remember the exact text but it was as I expected, something like
"DEFAULT", '\0' (repeated 200 times).
Anyway I have found a different machine to compile the program on and it
now works!
I would tend to think that if the behaviour varies, machine to machine,
OS to OS, on something as basic as a strcat, the issue is very likely not
with the machine or OS, but with undefined behaviour in the code - UB
permitting any sort of result, including seeming to work.
Mark
2010-10-04 09:31:09 UTC
On Sun, 3 Oct 2010 18:44:26 -0700, Kelsey Bjarnason
Post by Kelsey Bjarnason
I would tend to think that if the behaviour varies, machine to machine,
OS to OS, on something as basic as a strcat, the issue is very likely not
with the machine or OS, but with undefined behaviour in the code - UB
permitting any sort of result, including seeming to work.
I would normally agree with this, but I have extensively scrutinized and
tested the code. It works on all other platforms and, if the debugger
is not lying, there should be nothing wrong with this line of code.
Kelsey Bjarnason
2010-10-05 07:11:26 UTC
[snips]
Post by Mark
I would normally agree with this, but I have extensively scrutinized and
tested the code. It works on all other platforms and, if the debugger
is not lying, there should be nothing wrong with this line of code.
There may be nothing wrong with *that* line of code. That doesn't mean
there isn't UB happening elsewhere which is trashing the results in a
manner which simply happens to show up _here_ on a particular system. On
a different system, you may get completely different results, such as
seeming to work, or silently corrupting some value somewhere.
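
One way to test for this kind of corruption is to surround the buffer with known sentinel bytes and check them just before the line that crashes. The sketch below only illustrates the idea: `struct guarded_string`, `GUARD_BYTE` and the helper names are assumptions made up for the example, and the technique catches only overwrites that land next to the buffer, not a trashed pointer somewhere else.

```c
#include <assert.h>
#include <string.h>

#define GUARD_BYTE 0xA5   /* arbitrary sentinel value chosen for this sketch */
#define GUARD_LEN  16

/* Hypothetical wrapper: the real buffer sits between two guard regions. */
struct guarded_string {
    unsigned char before[GUARD_LEN];
    char          text[1024];
    unsigned char after[GUARD_LEN];
};

/* Fills both guard regions and stores the initial string. */
static void guarded_init(struct guarded_string *g, const char *initial)
{
    assert(g != NULL && initial != NULL);
    assert(strlen(initial) < sizeof g->text);

    memset(g->before, GUARD_BYTE, sizeof g->before);
    memset(g->after, GUARD_BYTE, sizeof g->after);
    memset(g->text, 0, sizeof g->text);
    strcpy(g->text, initial);
}

/* Returns 1 if both guard regions are intact, 0 if something overwrote them. */
static int guarded_intact(const struct guarded_string *g)
{
    size_t i;

    assert(g != NULL);
    for (i = 0; i < GUARD_LEN; i++) {
        if (g->before[i] != GUARD_BYTE || g->after[i] != GUARD_BYTE)
            return 0;
    }
    return 1;
}
```

Calling `guarded_intact()` at a few points between initialisation and the strcat() narrows down where the damage happens, if the damage is adjacent to the buffer at all.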
Mark
2010-10-05 08:27:12 UTC
On Tue, 5 Oct 2010 00:11:26 -0700, Kelsey Bjarnason
Post by Kelsey Bjarnason
There may be nothing wrong with *that* line of code. That doesn't mean
there isn't UB happening elsewhere which is trashing the results in a
manner which simply happens to show up _here_ on a particular system. On
a different system, you may get completely different results, such as
seeming to work, or silently corrupting some value somewhere.
Indeed, but this software has been intensively tested and run under
Purify etc. It is running successfully on many different platforms
right now.
The code has been manually scrutinised for all common problems,
including buffer overflows etc. Other developers have noticed similar
unexplained crashes on entirely different software. The only common
denominator is that they were compiled on the same machine. AFAIK all
the other developers have concluded that the problem lies in the
machine/compiler.
I am just trying to be very thorough in my investigations.

I agree that there may be a bug somewhere in my code but, if there is,
I can't find it.
Kelsey Bjarnason
2010-10-05 12:07:16 UTC
[snips]
Post by Mark
I agree that there may be a bug somewhere in my code but, if there is,
I can't find it.
Well, it is, indeed, possible there is an actual bug in the
implementation. That said, such bugs tend to be rare to begin with, and
exceedingly so in something as straightforward as a strcat.

You say the only common denominator between crashed apps is they were
compiled on the same machine. Is this actually the case? They have zero
overlapping code, apart from the implementation's library? And the flaw
shows up if they're compiled on this machine, but _run_ on other
machines? Or only if run on this one? How about compiled on other
machines, then run on this one - same flaw, or no?

If compiling elsewhere and running here fails on this machine (while
working elsewhere), this would suggest the fault lies not with the
implementation, but with the machine - flaky RAM, for example.

If compiling on this machine and running elsewhere causes failures, then
it's not the machine, it's the code or the implementation.
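
The two cases above can be written down as a small decision function, which makes it obvious which experiments still need to be run. This is only a sketch of the reasoning: the names are invented for the example, and letting the "compiled here, fails elsewhere" result take precedence when both tests fail is an assumption, not something stated above.

```c
#include <assert.h>

enum suspect {
    SUSPECT_UNKNOWN,                 /* neither experiment points anywhere */
    SUSPECT_MACHINE,                 /* e.g. flaky RAM on the suspect box */
    SUSPECT_CODE_OR_IMPLEMENTATION   /* the program, compiler or library */
};

/* built_here_fails_elsewhere: a binary compiled on the suspect machine
 *   also fails when run on other machines.
 * built_elsewhere_fails_here: a binary compiled on another machine, which
 *   works there, fails when run on the suspect machine.
 * Both arguments are 0 (no) or 1 (yes). */
static enum suspect triage(int built_here_fails_elsewhere,
                           int built_elsewhere_fails_here)
{
    assert(built_here_fails_elsewhere == 0 || built_here_fails_elsewhere == 1);
    assert(built_elsewhere_fails_here == 0 || built_elsewhere_fails_here == 1);

    /* Assumption: a failure that travels with the binary rules out the
     * hardware as the only cause, so it is checked first. */
    if (built_here_fails_elsewhere)
        return SUSPECT_CODE_OR_IMPLEMENTATION;
    if (built_elsewhere_fails_here)
        return SUSPECT_MACHINE;
    return SUSPECT_UNKNOWN;
}
```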

Just points to ponder.
