Python: Any way to prevent a process from being killed?

I'm running a process for a project that seems to get SIGKILLed once it uses up a lot of resources. The problem is that every time I start it, it runs for approximately an hour to an hour and a half, then gets killed somewhere in that window and I have to start over. It's a Python script that needs to process a lot of data and compile it all into one large file. Is there any way to prevent this?
 
If the program has some internal support for running with less memory, look at limits / rctl / the vm.overcommit sysctl.
If not, try adding more swap and hope.
 
Look up the ID of the process and check how much physical and virtual memory it uses, how much I/O it is doing, and how much CPU it uses.
Maybe it's the OOM killer (out of memory?).
Try to fix not the process restart, but the reason it is being killed.
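One way to do the memory check above from inside the script itself (rather than watching top from outside) is the standard-library resource module. A minimal sketch, assuming a Unix-like system; note that ru_maxrss is in kilobytes on Linux/FreeBSD but bytes on macOS, and log_peak_memory is just an illustrative name:

```python
import resource
import sys

def log_peak_memory(label):
    """Print and return this process's peak resident set size so far."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss unit: kilobytes on Linux/FreeBSD, bytes on macOS
    print(f"{label}: peak RSS so far = {peak}", file=sys.stderr)
    return peak

junk = [0] * 1_000_000          # stand-in for a real processing step
log_peak_memory("after allocating")
```

Sprinkling calls like this between processing stages shows which stage drives memory growth, which tells you whether the OOM killer is a plausible culprit.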
 
 
I'll take a look at this link. I wonder why it didn't show up when I searched here and on Google.


I used top and watched it as it ran after the first time. CPU got up to about 80%, and this process used about 8 GB of memory (I have 16 GB in my system).
 
In the time leading up to the kill, how much swap space do you have and how much of it is used? (`pstat -s`)

Under FreeBSD you should have a lower chance of getting killed while swap-space usage is still reasonable, compared to Linux. In other words, adding swap space should help.
 
First, as cracauer already said: before you can fix it, you need to diagnose it. Do you know for sure that the problem is memory pressure? Read up on how the OOM killer works. You said you have 16 GB in the system and memory was at about 8 GB, which makes me think that memory pressure might not be the problem. What exactly are the error messages? If I remember right, the OOM killer leaves log entries in dmesg or /var/log/messages, so you should be able to see whether that's the problem.

Second, assuming that memory pressure actually is the problem, there are two ways to fix it. One is to create more memory: remove other processes that use memory, install more physical memory (or make your program distributed), or add more swap space. The last is the easiest, but be warned that a program that has to swap all the time will run amazingly slowly.
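Before reaching for more memory, it is worth checking whether the "compile everything into one large file" step needs to hold the whole dataset at once at all. A hedged sketch of the streaming alternative (the compile_files name and line-oriented inputs are my assumptions, not the OP's actual code):

```python
def compile_files(input_paths, output_path):
    """Append each input file to the output one at a time, so only
    a single line is ever held in memory, not the whole dataset."""
    with open(output_path, "w") as out:
        for path in input_paths:
            with open(path) as src:
                for line in src:    # iterating a file object streams it lazily
                    out.write(line)
```

Memory usage then stays flat regardless of how large the combined output grows, which sidesteps the OOM killer entirely if accumulation was the cause.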

Third, the other way to fix it is to make the program use less memory. You say the program is in Python. That's a wonderful programming language, but it can be quite inefficient in its use of CPU and particularly memory. Think about the data structures you're using. In particular, if your data is parsed into many objects which each contain small values (integers, floats, strings), and the objects are kept in structures such as lists and trees, memory usage can easily be 10x higher than in other programming systems. You may want to look at using NumPy and SciPy arrays or pandas Series/DataFrames instead, along with other space-efficient data structures.
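That overhead is easy to measure with nothing but the standard library; the array module packs numeric values into one contiguous buffer, much as a NumPy array does. A rough sketch (exact byte counts vary by Python build and platform):

```python
import sys
from array import array

n = 100_000
boxed = [float(i) for i in range(n)]   # a list of separate float objects
packed = array("d", range(n))          # one contiguous block of C doubles

# For the list, count the list itself plus every boxed float it points to.
boxed_bytes = sys.getsizeof(boxed) + sum(sys.getsizeof(x) for x in boxed)
# For the array, the payload is stored inline in the object.
packed_bytes = sys.getsizeof(packed)

print(boxed_bytes, packed_bytes)       # the packed form is several times smaller
```

On a typical CPython build the boxed list costs roughly 32 bytes per float (object header plus list pointer) versus 8 bytes per value in the packed array, which is exactly the kind of multiple the paragraph above describes.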
 