Problems with the famous NVIDIA driver 195.22 on 8.0-RELEASE/amd64

The quality of the ATI driver (for supported GPUs) is better than that of the Intel drivers, but you still mentioned those :-)

Adam
 
yks said:
Upgraded to Xorg 7.5 and nvidia-driver 195.36.15. Hangs, crashes, all remain. Xorg.log has no information.
It always hangs when switching from X to a ttyvX, sometimes on exiting X, sometimes on starting applications, and sometimes with no activity at all. What the hell could this be? I have to use nouveau as a replacement, but it is quite slow...
My system is: Core-i5, GeForce GT 240, FreeBSD 8.0-RELEASE-p2 #0 amd64.
This is interesting. I'm running something very similar to you: Xeon 3450 (basically an i5-750), GeForce 210, FreeBSD 8.0-RELEASE, amd64, ASUS P7F-E motherboard. The NVIDIA driver is 195.22. I'm also running two of those graphics cards, outputting to 4 screens. Cooling should not be a problem as I have rigged up a fan to blow directly on the two cards.

I get crashes roughly once every 4 days, but it can be as little as 12 hours or as much as 6 days after the last reboot. Every screen is as it was, but I usually can't find the mouse cursor. Other than that everything looks fine, but I can't move the mouse or see any output from key presses. I have a cron job set to write the current date and time to a log, and this also stops at the last moment the machine was working. There is nothing in /var/log/messages to indicate anything is wrong when I reboot.
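For anyone wanting to copy the timestamp trick, here is a minimal sketch of what such a cron job could call. The function names and any log path you pass in are my own placeholders, not what I actually have in my crontab.

```shell
#!/bin/sh
# Hypothetical heartbeat helpers; names and paths are illustrative assumptions.

# Append one timestamp line to the log file given as $1.
heartbeat() {
    date '+%Y-%m-%d %H:%M:%S' >> "$1"
}

# Print the last timestamp in the log, i.e. the last moment
# the machine is known to have been alive before a hang.
last_heartbeat() {
    tail -n 1 "$1"
}
```

A crontab line running a script that calls heartbeat once a minute would give one-minute resolution. If you put the date command straight into the crontab instead, remember that % has to be escaped as \% there.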

Not sure whether it is nvidia or not, to be honest, but it's difficult to troubleshoot as I can only try something different and see if the machine dies 4 days later. I'm tearing my hair out trying to work out what the issue is. I run ZFS, but the crashes are not occurring during IO-intensive periods or when issuing zfs commands. I note that when I run X the HDD light flashes every 1-3 seconds, which it does not do when X is not running. I can't figure that one out either.

Edit: the latter IS nvidia related, and I'm getting the following in my ~/.xsession-errors:
Code:
Xlib:  extension "RANDR" missing on display ":0.0".
 
I think I've traced the source of the HDD flashing at least. Using
# find /home /var/tmp -mmin -1
I noted that ~/.xsession-errors, ~/.gconfd/ and ~/.gconfd/savedstate were changing all the time. IIRC, a mention of metacity in the .xsession-errors file made me remember that metacity wasn't supposed to be running, and by doing
# ps ax | grep metacity
and trying to kill it without success I realized that the PID was changing all the time, so something was trying to spawn it all the time without success (I think gnome-session, from memory).
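If you want to check for the same kind of respawn loop without eyeballing ps output, something like the following should do it. This is only a sketch: the function names are made up, and it assumes pgrep is available (it is in the FreeBSD base system).

```shell
#!/bin/sh
# Hypothetical helpers for spotting a process that keeps being restarted.

# Print the PIDs of processes named exactly $1, sorted, on one line.
# "|| true" keeps a "no match" result from counting as an error.
pids_of() {
    { pgrep -x "$1" || true; } | sort -n | tr '\n' ' '
}

# Sample the PID list $2 times, $3 seconds apart (defaults: 5 and 1).
# Returns 0 and reports as soon as the list changes, 1 if it never does.
watch_respawn() {
    name=$1
    samples=${2:-5}
    interval=${3:-1}
    prev=$(pids_of "$name")
    i=1
    while [ "$i" -lt "$samples" ]; do
        sleep "$interval"
        cur=$(pids_of "$name")
        if [ "$cur" != "$prev" ]; then
            echo "$name respawned: [$prev] -> [$cur]"
            return 0
        fi
        i=$((i + 1))
    done
    echo "$name stable: [$prev]"
    return 1
}
```

Running watch_respawn metacity 10 1 would sample for about ten seconds. A changing PID list only tells you that something is restarting the process, not what is doing it.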

Since metacity wasn't supposed to be running anyway (running compiz with emerald), and I couldn't find anyone else who had solved it despite other people having similar issues, I decided to do the following:
# cd /usr/local/bin
# mv metacity metacity.old
# vim metacity

Then changed the contents to an empty script:
Code:
#!/bin/sh

and set its permissions:
# chmod 700 metacity
Now I'm still getting the HDD light every second or so, but CPU usage is down to under 10% for all cores and I'm not getting the modifications to .xsession-errors etc. I'll upgrade the nvidia driver as well and see if that helps with the system stability. I am still getting notifications in .xsession-errors about extension "RANDR" missing, but this time only when I launch a new application.
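For reference, the metacity workaround can be made reversible with a couple of small helpers. This is a sketch under my own assumptions: the function names are invented, and it uses mode 755 rather than the 700 I used, so that non-root users can execute the stub as well.

```shell
#!/bin/sh
# Hypothetical helpers; names and the 755 mode are illustrative assumptions.

# Replace DIR/NAME with a do-nothing script, keeping the original as NAME.old.
stub_out() {
    dir=$1
    name=$2
    if [ -e "$dir/$name.old" ]; then
        echo "$dir/$name.old already exists" >&2
        return 1
    fi
    mv "$dir/$name" "$dir/$name.old" || return 1
    printf '#!/bin/sh\nexit 0\n' > "$dir/$name"
    chmod 755 "$dir/$name"
}

# Undo stub_out by moving the saved original back into place.
unstub() {
    dir=$1
    name=$2
    if [ ! -e "$dir/$name.old" ]; then
        echo "no $dir/$name.old to restore" >&2
        return 1
    fi
    mv -f "$dir/$name.old" "$dir/$name"
}
```

As root, stub_out /usr/local/bin metacity would correspond to what I did by hand. I'd expect a later package upgrade to overwrite the stub, so it may need reapplying.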
 
Well, in my case there didn't seem to be any system activity after hangs. In some cases, everything just stopped and the system didn't respond, but the HDD I/O seemed to stop as well, at least when I rebooted (reset) the PC, I didn't notice any files modified after the hang, even after waiting pretty long in vain hope that the system would 'get through'...
As to RANDR, I personally don't use it, but I don't think it can contribute to that kind of fault.
If you still face the hangs&crashes problem, maybe the nv driver could solve these, as it did for me. Of course, if you can do without 3D. (That compiz stuff...) Or, at least, consider giving it a try to determine the cause of the problem.
 
yks said:
Well, in my case there didn't seem to be any system activity after hangs. In some cases, everything just stopped and the system didn't respond, but the HDD I/O seemed to stop as well, at least when I rebooted (reset) the PC, I didn't notice any files modified after the hang, even after waiting pretty long in vain hope that the system would 'get through'...
As to RANDR, I personally don't use it, but I don't think it can contribute to that kind of fault.
If you still face the hangs&crashes problem, maybe the nv driver could solve these, as it did for me. Of course, if you can do without 3D. (That compiz stuff...) Or, at least, consider giving it a try to determine the cause of the problem.
Thanks for the help. So far it is 8 days of uptime and no hang. That's a record. Things I've changed since then:
  • Instead of using USB wireless (which was dropping out every couple of minutes and clogging up /var/log/messages), I now connect via ethernet to a WRAP box acting as a wireless bridge. I suspect the USB wireless may have been the main cause of the hanging.
  • The above-mentioned fix (i.e. renaming metacity).
  • Disabling virtually everything in my crontab that was zfs related. I've since done a couple of zpool scrubs to give that a bit of a prod.
I want to try getting rid of compiz anyway, since it is crashing out every so often and produces those messages. I think the video cards might draw less power too. I can do the same thing (grid) using other methods I believe. I certainly do want to try some other WMs. Thanks for your info.
 