My Computer hangs

Hi everyone.
dmesg:
Code:
FreeBSD 8.0-RELEASE-p4 #3: Wed Jul 14 10:11:43 WEST 2010
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) 64 X2 Dual Core Processor 5600+ (2900.13-MHz K8-class CPU)
  Origin = "AuthenticAMD"  Id = 0x60fb2  Stepping = 2
 Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x2001<SSE3,CX16>
  AMD Features=0xea500800<SYSCALL,NX,MMX+,FFXSR,RDTSCP,LM,3DNow!+,3DNow!>
  AMD Features2=0x11f<LAHF,CMP,SVM,ExtAPIC,CR8,Prefetch>
  TSC: P-state invariant
real memory  = 8589934592 (8192 MB)
avail memory = 7995944960 (7625 MB)
ACPI APIC Table: <050908 APIC1406>
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
ioapic0 <Version 2.1> irqs 0-23 on motherboard
kbd1 at kbdmux0
acpi0: <050908 RSDT1406> on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
acpi0: reservation of fee00000, 1000 (3) failed
acpi0: reservation of ffb80000, 80000 (3) failed
acpi0: reservation of fec10000, 20 (3) failed
acpi0: reservation of 0, a0000 (3) failed
acpi0: reservation of 100000, cfe00000 (3) failed
ACPI HPET table warning: Sequence is non-zero (2)
Timecounter "ACPI-safe" frequency 3579545 Hz quality 850
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
acpi_hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
acpi_hpet0: HPET never increments, disabling
device_attach: acpi_hpet0 attach returned 6
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> at device 1.0 on pci0
pci1: <ACPI PCI bus> on pcib1
vgapci0: <VGA-compatible display> port 0xc000-0xc0ff mem 0xd0000000-0xdfffffff,0xfe9f0000-0xfe9fffff,0xfe800000-0xfe8fffff irq 18 at device 5.0 on pci1
pcib2: <ACPI PCI-PCI bridge> at device 7.0 on pci0
pci2: <ACPI PCI bus> on pcib2
re0: <RealTek 8168/8168B/8168C/8168CP/8168D/8168DP/8111B/8111C/8111CP/8111DP PCIe Gigabit Ethernet> port 0xd800-0xd8ff mem 0xfeaff000-0xfeafffff,0xfdff0000-0xfdffffff irq 19 at device 0.0 on pci2
atapci0: <ATI IXP700/800 SATA300 controller> port 0xb000-0xb007,0xa000-0xa003,0x9000-0x9007,0x8000-0x8003,0x7000-0x700f mem 0xfe7ff800-0xfe7ffbff irq 22 at device 17.0 on pci0
atapci0: [ITHREAD]
atapci0: AHCI v1.10 controller with 6 3Gbps ports, PM supported
ata2: <ATA channel 0> on atapci0
ata2: port is not ready (timeout 0ms) tfd = 000001d0
ata2: software reset clear timeout
ata2: [ITHREAD]
ata3: <ATA channel 1> on atapci0
ata3: port is not ready (timeout 0ms) tfd = 000001d0
ata3: software reset clear timeout
ata3: [ITHREAD]
ata4: <ATA channel 2> on atapci0
ata4: port is not ready (timeout 0ms) tfd = 000001d0
ata4: software reset clear timeout
ata4: [ITHREAD]
ata5: <ATA channel 3> on atapci0
ata5: port is not ready (timeout 0ms) tfd = 000001d0
ata5: software reset clear timeout
ata5: [ITHREAD]
ata6: <ATA channel 4> on atapci0
ata6: [ITHREAD]
ata7: <ATA channel 5> on atapci0
ata7: [ITHREAD]
ohci0: <OHCI (generic) USB controller> mem 0xfe7fe000-0xfe7fefff irq 16 at device 18.0 on pci0
ohci0: [ITHREAD]
usbus0: <OHCI (generic) USB controller> on ohci0
ohci1: <OHCI (generic) USB controller> mem 0xfe7fd000-0xfe7fdfff irq 16 at device 18.1 on pci0
ohci1: [ITHREAD]
usbus1: <OHCI (generic) USB controller> on ohci1
ehci0: <EHCI (generic) USB 2.0 controller> mem 0xfe7ff000-0xfe7ff0ff irq 17 at device 18.2 on pci0
ehci0: [ITHREAD]
usbus2: EHCI version 1.0
usbus2: <EHCI (generic) USB 2.0 controller> on ehci0
ohci2: <OHCI (generic) USB controller> mem 0xfe7fc000-0xfe7fcfff irq 18 at device 19.0 on pci0
ohci2: [ITHREAD]
usbus3: <OHCI (generic) USB controller> on ohci2
ohci3: <OHCI (generic) USB controller> mem 0xfe7f7000-0xfe7f7fff irq 18 at device 19.1 on pci0
ohci3: [ITHREAD]
usbus4: <OHCI (generic) USB controller> on ohci3
ehci1: <EHCI (generic) USB 2.0 controller> mem 0xfe7f6800-0xfe7f68ff irq 19 at device 19.2 on pci0
ehci1: [ITHREAD]
usbus5: EHCI version 1.0
usbus5: <EHCI (generic) USB 2.0 controller> on ehci1
pci0: <serial bus, SMBus> at device 20.0 (no driver attached)
atapci1: <ATI IXP700/800 UDMA133 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xff00-0xff0f at device 20.1 on pci0
ata0: <ATA channel 0> on atapci1
ata0: [ITHREAD]
ata1: <ATA channel 1> on atapci1
ata1: [ITHREAD]
pci0: <multimedia, HDA> at device 20.2 (no driver attached)
isab0: <PCI-ISA bridge> at device 20.3 on pci0
isa0: <ISA bus> on isab0
pcib3: <ACPI PCI-PCI bridge> at device 20.4 on pci0
pci3: <ACPI PCI bus> on pcib3
rl0: <RealTek 8139 10/100BaseTX> port 0xe800-0xe8ff mem 0xfebffc00-0xfebffcff irq 20 at device 5.0 on pci3
miibus1: <MII bus> on rl0
rlphy0: <RealTek internal media interface> PHY 0 on miibus1
rlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
rl0: Ethernet address: 00:50:fc:29:94:96
rl0: [ITHREAD]
rl1: <RealTek 8139 10/100BaseTX> port 0xe400-0xe4ff mem 0xfebff800-0xfebff8ff irq 21 at device 6.0 on pci3
miibus2: <MII bus> on rl1
rlphy1: <RealTek internal media interface> PHY 0 on miibus2
rlphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
rl1: Ethernet address: 00:16:0a:14:92:52
rl1: [ITHREAD]
ohci4: <OHCI (generic) USB controller> mem 0xfe7f5000-0xfe7f5fff irq 18 at device 20.5 on pci0
ohci4: [ITHREAD]
usbus6: <OHCI (generic) USB controller> on ohci4
acpi_button0: <Power Button> on acpi0
acpi_tz0: <Thermal Zone> on acpi0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: [ITHREAD]
psm0: model IntelliMouse Explorer, device ID 4
atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
driver bug: Unable to set devclass (devname: (null))
cpu0: <ACPI CPU> on acpi0
acpi_throttle0: <ACPI CPU Throttling> on cpu0
powernow0: <PowerNow! K8> on cpu0
cpu1: <ACPI CPU> on acpi0
powernow1: <PowerNow! K8> on cpu1
acpi_hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 900
driver bug: Unable to set devclass (devname: (null))
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
ppc0: cannot reserve I/O port range
Timecounters tick every 1.000 msec
usbus0: 12Mbps Full Speed USB v1.0
usbus1: 12Mbps Full Speed USB v1.0
usbus2: 480Mbps High Speed USB v2.0
usbus3: 12Mbps Full Speed USB v1.0
usbus4: 12Mbps Full Speed USB v1.0
usbus5: 480Mbps High Speed USB v2.0
usbus6: 12Mbps Full Speed USB v1.0
ad2: 76351MB <SAMSUNG SP0822N WA100-31> at ata1-master UDMA100
ugen0.1: <ATI> at usbus0
uhub0: <ATI OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0
ugen1.1: <ATI> at usbus1
uhub1: <ATI OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus1
ugen2.1: <ATI> at usbus2
uhub2: <ATI EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus2
ugen3.1: <ATI> at usbus3
uhub3: <ATI OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus3
ugen4.1: <ATI> at usbus4
uhub4: <ATI OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus4
ugen5.1: <ATI> at usbus5
uhub5: <ATI EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus5
ugen6.1: <ATI> at usbus6
uhub6: <ATI OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus6
acd0: DVDR <TSSTcorpCD/DVDW SH-S182M/SB03> at ata1-slave UDMA33
ad4: 953869MB <SAMSUNG HD103UJ 1AA01112> at ata2-master SATA300
ad6: 238475MB <HDT722525DLA380 V44OA96A> at ata3-master SATA150
GEOM: ad2s1: geometry does not match label (255h,63s != 16h,63s).
ad8: 239372MB <Maxtor 6L250S0 BANC1E00> at ata4-master SATA150
ad10: 1430799MB <SAMSUNG HD154UI 1AG01118> at ata5-master SATA300
SMP: AP CPU #1 Launched!
uhub6: 2 ports with 2 removable, self powered
uhub0: 3 ports with 3 removable, self powered
uhub1: 3 ports with 3 removable, self powered
uhub3: 3 ports with 3 removable, self powered
uhub4: 3 ports with 3 removable, self powered
GEOM: ad4: the secondary GPT table is corrupt or invalid.
GEOM: ad4: using the primary only -- recovery suggested.
GEOM: ad6: the secondary GPT table is corrupt or invalid.
GEOM: ad6: using the primary only -- recovery suggested.
GEOM: ad8: the secondary GPT table is corrupt or invalid.
GEOM: ad8: using the primary only -- recovery suggested.
Root mount waiting for: usbus5 usbus2
Root mount waiting for: usbus5 usbus2
Root mount waiting for: usbus5 usbus2
uhub2: 6 ports with 6 removable, self powered
uhub5: 6 ports with 6 removable, self powered
usbd_set_config_index:523: could not read device status: USB_ERR_SHORT_XFER
ugen5.2: <LaCie> at usbus5
umass0: <MSC Bulk-Only Transfer> on usbus5
umass0:  SCSI over Bulk-Only; quirks = 0x0000
Root mount waiting for: usbus5
umass0:0:0:-1: Attached to scbus0
Trying to mount root from ufs:/dev/ad2s1a
da0 at umass-sim0 bus 0 target 0 lun 0
da0: <Hitachi HDT721010SLA360 > Fixed Direct Access SCSI-2 device 
da0: 40.000MB/s transfers
da0: 953869MB (1953525168 512 byte sectors: 255H 63S/T 121601C)
ZFS filesystem version 13
ZFS storage pool version 13
rl0: link state changed to UP
rl1: link state changed to UP
fuse4bsd: version 0.3.9-pre1, FUSE ABI  7.8

My problem is that from time to time the computer just hangs, with no message in the logs about what caused it.
Right after the installation the computer hung every night; that problem was SMTP related.
Now it can stay up for 7 days without a problem and then just hang at any time.
I've done a memtest, and that was OK.

Can anyone please help out?

My computer acts as DNS, DHCP, NTP, gateway, and NAS (ZFS), each in its own jail.

Thanks.
 
This doesn't look too good:
Code:
GEOM: ad4: the secondary GPT table is corrupt or invalid.
GEOM: ad4: using the primary only -- recovery suggested.
GEOM: ad6: the secondary GPT table is corrupt or invalid.
GEOM: ad6: using the primary only -- recovery suggested.
GEOM: ad8: the secondary GPT table is corrupt or invalid.
GEOM: ad8: using the primary only -- recovery suggested.
 
Have you put the following entries in /boot/loader.conf?

Code:
vm.kmem_size="12G" # This should be about 1.5 times your memory.
vfs.zfs.arc_max="3G" # The maximum amount of memory that you would use for zfs.
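
As a rough sketch of the 1.5x rule of thumb in the comment above, the value can be worked out from the "real memory" figure in the dmesg output. The function name kmem_size_for is made up for illustration, and rounding down to whole gigabytes is an assumption, not a requirement of the loader.

```sh
#!/bin/sh
# Hypothetical helper: print a vm.kmem_size value of 1.5 times the given
# physical memory (in bytes), rounded down to whole gigabytes.
kmem_size_for() {
    physmem_bytes=$1
    echo "$(( physmem_bytes * 3 / 2 / 1073741824 ))G"
}

# "real memory" from the dmesg output posted above (8192 MB)
kmem_size_for 8589934592   # prints 12G
```

On the machine itself the byte count could be read with `sysctl -n hw.physmem`, though that may report slightly less than the installed RAM, so the rounded result can come out lower.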
 
Thanks for your replies.

Code:
vm.kmem_size="12G" # This should be about 1.5 times your memory.
vfs.zfs.arc_max="3G" # The maximum amount of memory that you would use for zfs.

I've added these settings as suggested; now I'll just have to wait and see.

Code:
GEOM: ad4: the secondary GPT table is corrupt or invalid.
GEOM: ad4: using the primary only -- recovery suggested.
GEOM: ad6: the secondary GPT table is corrupt or invalid.
GEOM: ad6: using the primary only -- recovery suggested.
GEOM: ad8: the secondary GPT table is corrupt or invalid.
GEOM: ad8: using the primary only -- recovery suggested.

About this message: I've noticed it, but every time I issue a "zpool status" I get this:

Code:
zpool status
  pool: storage
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          ad4       ONLINE       0     0     0
          ad6       ONLINE       0     0     0
          ad8       ONLINE       0     0     0
          da0       ONLINE       0     0     0
          ad10      ONLINE       0     0     0

errors: No known data errors

So I have my doubts that this really is an issue.

Thanks.
 
Hi everyone,

It has been 9 days since I last rebooted the computer with these loader settings, and unfortunately the computer hung again, this time in the middle of the day.

This is my /var/log/messages.

Code:
Jul 25 05:32:00 gaia ntpd[1032]: kernel time sync status change 2001
Jul 25 08:56:54 gaia ntpd[1032]: kernel time sync status change 6001
Jul 25 10:22:16 gaia ntpd[1032]: kernel time sync status change 2001
Jul 25 15:55:49 gaia syslogd: kernel boot file is /boot/kernel/kernel

Are there any more tweaks I can make?
Should I try any other approach to solve this problem?

Thanks.
 
Sorry for being back so soon and with the same problem.

Here is my update:
This weekend I cloned the system disk and upgraded to 8.1.
The main motivation was ZFS v14; ZFS is the big reason I use FreeBSD.

But before updating I added something to loader.conf.

loader.conf:
Code:
vfs.zfs.zil_disable="1"
autoboot_delay="2"
vm.kmem_size_max="12G"
vm.kmem_size="12G" # This should be about 1.5 times your memory.
vfs.zfs.arc_max="3G" # The maximum amount of memory that you would use for zfs.
vfs.zfs.vdev.cache.size="1G"
#splash_pcx_load="YES"
#bitmap_load="YES"
#bitmap_name="/boot/splash.pcx"
#beastie_disable="YES"
#loader_logo="beastie"

Also, here is my rc.conf:
Code:
hostname="gaia.alsuki.ath.cx"

# Network cards
network_interfaces="rl0 rl1 lo0"
ifconfig_rl0="inet 192.168.1.10/24"
ifconfig_rl1="inet 10.170.14.9/24"
ifconfig_rl1_alias0="10.170.14.3/24"
ifconfig_rl1_alias1="10.170.14.2/24"
ifconfig_rl1_alias2="10.170.14.5/24"
ifconfig_rl1_alias3="10.170.14.6/24"
ifconfig_rl1_alias4="10.170.14.250/24"
ifconfig_rl1_alias5="10.170.14.251/24"
ifconfig_rl1_alias6="10.170.14.10/24"
ifconfig_rl1_alias7="10.170.14.14/24"
ifconfig_rl1_alias8="10.170.14.15/24"

defaultrouter="192.168.1.1"

# Firewall configuration
gateway_enable="YES"
pf_enable="YES"
pf_rules="/etc/pf.conf"
pf_flags=""
pflog_enable="YES"
pflog_logfile="/var/log/pflog"
pflog_flags=""

# Enable jails
ezjail_enable="YES"

# Enable ZFS
zfs_enable="YES"

# Enable NFS
rpcbind_enable="YES"
nfs_server_enable="YES"
nfs_flags="-t -u -n 32"
mountd_flags="-r"
rpc_lockd_enable="YES"
rpc_statd_enable="YES"

# Enable services
sshd_enable="YES"
usbd_enable="YES"
devd_enable="YES"
devfs_system_ruleset="devfrules_common"
ldconfig_paths="/usr/lib/compat /usr/local/lib /usr/local/lib/compat/pkg"

# Sendmail
sendmail_enable="YES"

# Enable the mouse
mouse_type="auto"
moused_port="/dev/psm0"
moused_enable="YES"

# FSCK improvements
fsck_y_enable="YES"
background_fsck="NO"

# Enable HAL / DBUS
dbus_enable="YES"
polkitd_enable="YES"
hal_enable="YES"

# Enable sound
snddetect_enable="YES"
mixer_enable="YES"

# Enable avahi_daemon
avahi_enable="YES"

# Enable swapmonitor
swapmonitor_enable="YES"

keymap="pt.iso.acc"

#syslogd_flags="-s -s"
#syslogd_flags="-a 10.170.14.9"

# NTP configuration
ntpdate_enable="YES"
ntpd_enable="YES"

# Enable NTFS support
fusefs_enable="YES"

# Enable GDM
#gdm_enable="YES"
#Dumpdev="AUTO"
#Dumpdir="/var/crash"

Because no log is created whenever the computer hangs or crashes, I'm thinking of debugging the kernel crash.

I've been reading about how to do this, but I can't find a step-by-step how-to.

Can anyone help me with this?

Thanks
 
You have this:

Code:
network_interfaces="rl0 rl1 lo0"             
ifconfig_rl0="inet 192.168.1.10/24"         
ifconfig_rl1="inet 10.170.14.9/24"
ifconfig_rl1_alias0="10.170.14.3/24"
ifconfig_rl1_alias1="10.170.14.2/24"
ifconfig_rl1_alias2="10.170.14.5/24"
ifconfig_rl1_alias3="10.170.14.6/24"
ifconfig_rl1_alias4="10.170.14.250/24"
ifconfig_rl1_alias5="10.170.14.251/24"
ifconfig_rl1_alias6="10.170.14.10/24"
ifconfig_rl1_alias7="10.170.14.14/24"
ifconfig_rl1_alias8="10.170.14.15/24"

I take it that these aliases are for your jails.
I use the jail auto alias create feature for my
155 jails, and they're all created without any suffix.
I would change your hard-coded list like this, removing the /24.

Code:
network_interfaces="rl0 rl1 lo0"             
ifconfig_rl0="inet 192.168.1.10/24"         
ifconfig_rl1="inet 10.170.14.9/24"
ifconfig_rl1_alias0="10.170.14.3"
ifconfig_rl1_alias1="10.170.14.2"
ifconfig_rl1_alias2="10.170.14.5"
ifconfig_rl1_alias3="10.170.14.6"
ifconfig_rl1_alias4="10.170.14.250"
ifconfig_rl1_alias5="10.170.14.251"
ifconfig_rl1_alias6="10.170.14.10"
ifconfig_rl1_alias7="10.170.14.14"
ifconfig_rl1_alias8="10.170.14.15"
The jail system has the option to auto-add the alias on jail start
and auto-remove the alias on jail stop, instead of hard-coding all the
aliases in your rc.conf.
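
For illustration, here is a minimal sketch of that option using the stock rc.d jail variables. The jail name "dns" is hypothetical; the interface and address are taken from the rc.conf posted above. The rootdir and hostname settings a real jail also needs are left out, and since ezjail keeps per-jail settings in its own files, where these lines belong may differ on this setup.

```sh
# /etc/rc.conf fragment (sketch). The jail name "dns" is hypothetical.
jail_enable="YES"
jail_list="dns"
jail_dns_interface="rl1"    # interface the alias is created on at jail start
jail_dns_ip="10.170.14.3"   # jail address; the alias is removed again at jail stop
```

With the interface set per jail like this, the matching ifconfig_rl1_aliasN line would no longer be needed for that address.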

This may or may not be the source of your problem.

There is a brand new port for creating and managing jails called "qjail"
that creates the auto alias for you.
See http://sourceforge.net/projects/qjail/
It's been submitted to the ports collection but has not been added yet, as the
backlog from the 8.1 freeze has to be processed first, but the package and the
port make files can be downloaded from sourceforge.net.
 
t1066 said:
Does it help if you comment out the following line

Code:
vfs.zfs.vdev.cache.size="1G"

in loader.conf?

Probably yes.

I believe that instead of crashing every night it will crash only once a week.

I'm also working on another theory: that my disk layout may be the culprit.
I have 5 disks:
4 SATA (2 x 1.5 Gbps and 2 x 3.0 Gbps)
1 USB
These 5 disks make up the zpool.
Yesterday I bought a new disk to change this. I'll be moving all the disks inside the box, connecting all of them through SATA, and moving to raidz2.

Thanks for all the help; I'll post any progress as I go along.
 