Summary: various ways of getting user CPU time get stuck at about 105.824 hours; is there a better way to get this info?
OS: FreeBSD 12.0-RELEASE-p3, but this behavior was observed on FreeBSD 11.1 and FreeBSD 11.2 also. amd64 architecture.
I have a program that generates a lot of permutations and checks whether each permutation satisfies some criteria that I want. (Example: arrange numbers on a many-sided die so that the total of the numbers at each vertex adds up to the same number.) This program tends to run a long time and essentially emulates a CPU infinite loop, except for outputting a solution every few minutes, days, years, or geologic eons. I want to get some idea of how long a particular run will take, so I put in code that uses CPU time and real time to estimate time to completion for a run. Currently it spits out status every 5 minutes. The process is single-threaded.
The process is run at idle priority on systems with 2 and 4 cores, with one core left free for light use and overnight maintenance jobs. It typically gets 99.9% user CPU averaged over a good portion of a day, even with Firefox or LibreOffice (at normal priority) sucking up CPU a few hours a day.
I have discovered, though, that user CPU time seems to quit advancing after 380967 seconds (105.824 hours or 4.409 days; this is sampled every 5 minutes of real time) and then just returns the same number over and over. Mostly I'm using getrusage() to get user CPU time, but clock() and clock_gettime(CLOCK_VIRTUAL) seem to get their information from the same underlying code. I spent a bit of time looking for 64-bit overflows in my time-formatting routines before realizing they weren't the problem. Is there a better clock to use with clock_gettime()? A per-thread clock?
Is there a better way to get this information? I don't *have* to have just user CPU; user + system + interrupt time would probably be indistinguishable. "top" seems to get the run time right. How portable is kinfo_getproc(), which "top" uses? How long has that interface been around? Will it stay around? How efficient is it compared to getrusage()? Does anyone know when *IT* will overflow?
The only type I can use for the estimated time for a run is "long double", because of its exponent range. However, just because a full run will take way longer than the age of the universe doesn't mean I won't get a few solutions in a few weeks.
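To make the estimate concrete, here is a rough sketch of a linear extrapolation in long double. This is an illustration under the assumption that work proceeds at a constant rate per CPU second; estimate_remaining() and its parameters are hypothetical names, not my actual code.

```c
#include <math.h>

/* Hypothetical helper: estimate remaining CPU seconds by linear
 * extrapolation.
 *   done        - permutations checked so far
 *   total       - permutations in the full run
 *   cpu_elapsed - CPU seconds consumed so far
 * Returns HUGE_VALL if no work has been done yet, 0 if the run is
 * already complete. */
long double estimate_remaining(long double done, long double total,
                               long double cpu_elapsed)
{
    if (done <= 0.0L)
        return HUGE_VALL;       /* no rate information yet */
    if (done >= total)
        return 0.0L;            /* nothing left to do */
    return cpu_elapsed * (total - done) / done;
}
```

For example, estimate_remaining(25.0L, 100.0L, 10.0L) gives 30 seconds. If the CPU time input freezes while the work count keeps growing, the estimate shrinks artificially, which is why the stuck clock matters here.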