Program runs for hours then won't run at all

I wrote a C program that creates a web page using data I pull from a file, and I serve it using Apache 2.2 on FreeBSD 8.2-RELEASE. This program runs great for hours but, all of a sudden, it won't do the simplest of things. The program hangs while serving up pages that have worked for months and gives a 2GB core dump. If I use gdb on the core file, the area it points to as causing the failure doesn't make any sense because I haven't touched it in months and, in some cases, years. It acts like it's in a loop because it takes a while before I can enter anything in the console. In top, WCPU shows 18% to 40%. I can't kill it in top or with 'kill -s HUP pid'. It eventually dies on its own when the filesystem fills up.

messages shows it exited with signal 11 (core dumped).

This is on a box I used to have connected to the internet and did dev work but now it's only run through a router that connects to my laptop and another FreeBSD box (cause I'm in a hotel for a few weeks). It seems every other function works so I'm wanting to rule out hardware problems.

Yesterday, this program worked in the morning till about 10AM. Started failing and didn't work again until about 10PM last night. It worked this morning until 10AM and failed again. Don't go too much by the 12-hour timing but does that ring a bell with anyone?

I'm frustrated cause I don't know where to look or what to think of to solve this. I shut the system off for a few hours to see what happens then.
 
I'm beginning to think that but, unfortunately, I didn't install valgrind on this box and can't anymore.

It's working right now. Strange, strange, strange.
 
I wouldn't know what to show cause I don't know which module is causing the problem but there's so much going on you'd have trouble figuring it out.
 
This is a very common issue in programs where you manage your own memory allocation and freeing. Strange things happen at strange times, in places that are not responsible for the crash. This can happen when using pointers to memory that has been freed before (whether or not it has since been reused), or when minor buffer overflows happen (integer overflows, single-byte overflows). Most of the time the bug does not cause a crash right where it happens; it corrupts memory somewhere else in the program, and the crash comes later. These things happen very often if you are not very careful about how you manage your memory, and they are very difficult to debug. Programs like devel/cppcheck and the Valgrind you mentioned can pick up most of these bugs, but sometimes it is much more complicated than that.
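If Valgrind is not available, one cheap way to catch small overflows closer to where they happen is to put guard bytes after every allocation and check them when the block is freed. The following is only a minimal sketch of that idea: the names dbg_malloc, dbg_check and dbg_free, the guard length and the byte patterns are all made up for illustration and are not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_LEN  8     /* guard bytes placed after every block (assumed value) */
#define GUARD_BYTE 0xA5  /* pattern that a stray write will disturb */
#define POISON_BYTE 0xDD /* pattern written over a block just before it is freed */

/* Hypothetical header stored in front of the user data.
 * Note: the user pointer is only aligned for size_t, which is enough for
 * char buffers but not necessarily for every type. */
typedef struct {
    size_t size;         /* size the caller asked for */
} dbg_header;

/* Allocate size bytes followed by GUARD_LEN guard bytes. */
void *dbg_malloc(size_t size)
{
    dbg_header *h = malloc(sizeof *h + size + GUARD_LEN);
    if (h == NULL)
        return NULL;
    h->size = size;
    unsigned char *user = (unsigned char *)(h + 1);
    memset(user + size, GUARD_BYTE, GUARD_LEN);
    return user;
}

/* Return 1 if the guard bytes are intact, 0 if something overwrote them. */
int dbg_check(const void *ptr)
{
    assert(ptr != NULL);
    const dbg_header *h = (const dbg_header *)ptr - 1;
    const unsigned char *guard = (const unsigned char *)ptr + h->size;
    for (size_t i = 0; i < GUARD_LEN; i++) {
        if (guard[i] != GUARD_BYTE)
            return 0;
    }
    return 1;
}

/* Check the guard, poison the user data so stale pointers read garbage,
 * then release the block. Returns the result of the guard check. */
int dbg_free(void *ptr)
{
    if (ptr == NULL)
        return 1;
    int ok = dbg_check(ptr);
    dbg_header *h = (dbg_header *)ptr - 1;
    memset(ptr, POISON_BYTE, h->size);
    free(h);
    return ok;
}
```

Calling dbg_check at a few strategic points (or logging whenever dbg_free returns 0) narrows down which buffer was overrun, even when the crash itself shows up somewhere unrelated. It only detects writes just past the end of a block, so it is no replacement for a real memory checker.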

Also I would avoid writing CGI apps in C/C++ as it adds unnecessary complexity as it is very hard to debug such programs.
 
expl said:
This can happen when using pointers to memory that has been freed before
Correct.

expl said:
Also I would avoid writing CGI apps in C/C++ as it adds unnecessary complexity as it is very hard to debug such programs.
Only partially true. Debugging through CGI directly is indeed a PITA, but if you have another means of debugging and testing the program (which you should be doing offline anyway before taking it online), there's nothing wrong with using C/C++ for CGI. You just need proper testing procedures, which may or may not include simulating HTTP traffic. Think about it: the program itself should just do what it's supposed to do and not care about CGI. That's one abstraction level higher and should be easy to isolate in a properly designed program.
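To illustrate that separation, here is a rough sketch in which the page generator writes to any FILE * and a thin wrapper adds the CGI header. The function names render_page and serve_cgi and the page layout are hypothetical, and the sketch does no HTML escaping of its arguments.

```c
#include <stdio.h>

/* Hypothetical page generator: knows nothing about CGI or Apache.
 * Writes the page to any stream and returns 0 on success, -1 on error. */
int render_page(FILE *out, const char *title, const char *body)
{
    if (out == NULL || title == NULL || body == NULL)
        return -1;
    if (fprintf(out, "<html><head><title>%s</title></head>\n", title) < 0)
        return -1;
    if (fprintf(out, "<body>%s</body></html>\n", body) < 0)
        return -1;
    return 0;
}

/* Thin CGI layer: emit the header, then hand over to the generator.
 * Under Apache this would be called with stdout. */
int serve_cgi(FILE *out, const char *title, const char *body)
{
    if (out == NULL)
        return -1;
    if (fputs("Content-Type: text/html\r\n\r\n", out) == EOF)
        return -1;
    return render_page(out, title, body);
}
```

With this split, render_page can be run from a small test program against a temporary file or under gdb, with no web server involved, and only serve_cgi needs to be exercised through Apache.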

@OP: If you cannot disclose source code to us, you most likely will need to do some kind of memory management verification as this is the most likely culprit.

Fonz
 
expl said:
Also I would avoid writing CGI apps in C/C++ as it adds unnecessary complexity as it is very hard to debug such programs.
I'm thinking something much more nefarious. Applications like that are prone to exploitation.

@OP, are you sure it's not somebody that's trying to hack into your server?

I can imagine the input filtering not being up to par, in which case an attacker could easily crash the application, or worse.
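As a rough idea of what stricter input filtering could look like, the sketch below accepts a request parameter only if it is short and made of a small set of characters, and copies it only into a buffer that is known to be large enough. The names param_is_safe and copy_param, the 64-character limit and the allowed character set are assumptions for the example, not something taken from the OP's program.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_PARAM_LEN 64  /* assumed upper bound for one parameter */

/* Return 1 if s is non-empty, at most MAX_PARAM_LEN characters long and
 * consists only of letters, digits, '-' and '_'; otherwise return 0. */
int param_is_safe(const char *s)
{
    if (s == NULL)
        return 0;
    size_t i;
    for (i = 0; s[i] != '\0'; i++) {
        if (i >= MAX_PARAM_LEN)
            return 0;                       /* too long: stop scanning */
        unsigned char c = (unsigned char)s[i];
        if (!isalnum(c) && c != '-' && c != '_')
            return 0;                       /* character not on the whitelist */
    }
    return i > 0;
}

/* Copy src into dst only if it is safe and fits, including the
 * terminating NUL. Returns 0 on success; on failure returns -1 and
 * leaves dst as an empty string. */
int copy_param(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || dst_size == 0)
        return -1;
    dst[0] = '\0';
    if (!param_is_safe(src))
        return -1;
    size_t len = strlen(src);
    if (len >= dst_size)
        return -1;
    memcpy(dst, src, len + 1);
    return 0;
}
```

Rejecting anything unexpected before it reaches a fixed-size buffer closes off the easiest way for outside input to corrupt memory, whether the input is hostile or just malformed.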
 