Segmentation fault while upgrading from 10.0-RELEASE to 10.1-RELEASE

Code:
# dmesg | grep -A 8 ^CPU

CPU: Intel(R) Core(TM)2 Quad CPU    Q6600  @ 2.40GHz (2392.56-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x6fb  Family = 0x6  Model = 0xf  Stepping = 11
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0xe3bd<SSE3,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM>
  AMD Features=0x20100800<SYSCALL,NX,LM>
  AMD Features2=0x1<LAHF>
  VT-x: HLT,PAUSE
  TSC: P-state invariant, performance statistics
real memory  = 3221225472 (3072 MB)
# cat /etc/nsswitch.conf
#
# nsswitch.conf(5) - name service switch configuration file
# $FreeBSD: release/10.0.0/etc/nsswitch.conf 224765 2011-08-10 20:52:02Z dougb $
#
#group: compat
group: files ldap
#group_compat: nis
hosts: files dns
networks: files
#passwd: compat
passwd: files ldap
#passwd_compat: nis
shells: files
services: compat
#services_compat: nis
protocols: files
rpc: files
Symptoms: after a reboot I could not log in. In single-user mode ls worked but ls -l would segfault. Since ls -l has to look up user and group names, I assumed it had something to do with /etc/nsswitch.conf. I used cat >nsswitch.conf to create a minimal /etc/nsswitch.conf:
Code:
passwd: files
group: files
After that ls -l worked again. I still could not log in. I commented out Kerberos in /etc/pam.d/system and in /etc/pam.d/other after which I could log in again.
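For anyone stuck at the same point, here is a rough sketch of the pam.d step. It assumes that "Kerberos" means the lines referencing pam_krb5, which is my guess at what was enabled; the function name is made up for illustration.

```shell
#!/bin/sh
# Sketch only: comment out active pam_krb5 lines in one PAM config file.
# Assumption: the Kerberos entries are the lines that mention pam_krb5.
# comment_out_pam_krb5 is a hypothetical helper, not a FreeBSD tool.
comment_out_pam_krb5() {
    pam_file="$1"
    [ -f "$pam_file" ] || return 1
    tmp="$pam_file.tmp.$$"
    # Prefix "#" to lines mentioning pam_krb5 unless they are already comments.
    sed '/^[[:space:]]*#/!s/.*pam_krb5.*/#&/' "$pam_file" > "$tmp" || return 1
    # Write back with cat so the original file keeps its owner and mode.
    cat "$tmp" > "$pam_file" && rm -f "$tmp"
}
```

It would be called once per file, e.g. comment_out_pam_krb5 /etc/pam.d/system and then again for /etc/pam.d/other.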

freebsd-update IDS showed that most files were not updated. Since it was a test system I decided to back up the configuration and install 10.1-RELEASE from scratch. I have not tried to run without /etc/nsswitch.conf.

Update: After rereading my post I realized that I did not mention that several commands still did work after my last change. vi worked after my change to /etc/nsswitch.conf. I could log in after my changes to the pam.d files but zfs still segfaulted.
 
This is my /etc/nsswitch.conf:
Code:
#
# nsswitch.conf(5) - name service switch configuration file
# $FreeBSD: releng/9.3/etc/nsswitch.conf 224765 2011-08-10 20:52:02Z dougb $
#
#
#group: compat
group: files nis winbind
group_compat: nis
hosts: files dns
networks: files
#passwd: compat
passwd: files nis winbind
passwd_compat: nis
shells: files
services: compat
services_compat: nis
protocols: files
rpc: files
 
Any news on this? I still have to upgrade two systems. Will mv /etc/nsswitch.conf /etc/nsswitch.conf.o before doing freebsd-update -r 10.1-RELEASE upgrade prevent it?
 
arnov, I am regularly scanning the IRC and the list but I have not heard about any reliable solution for freebsd-update yet. On the other hand, I have been successful with the buildworld process and can thus recommend going this way.

Regards,
Peter
 
I'd really like to know what's causing the segmentation fault. On my system, if I remove wins from the hosts line in /etc/nsswitch.conf, everything works normally. Apparently even when doing ls or vi some kind of host lookup is performed which mysteriously fails with the new kernel or parts of the userland.

I may have been premature in discounting the LDAP connection. Turns out I am also using LDAP via the installed samba4. After reinstalling databases/ldb, Samba works nicely except for the system-wide WINS host lookup.
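Since the reports so far involve ldap, winbind or wins entries, a quick way to see which databases in nsswitch.conf pull in sources from outside the base system might look like the sketch below. It treats files, db, dns, nis, compat and cache as the built-in sources, which is my reading of nsswitch.conf(5) and should be checked there; the function name is made up.

```shell
#!/bin/sh
# Sketch only: print "database: source" for every source that is not one of
# the sources assumed to be built in (files, db, dns, nis, compat, cache).
# list_external_nss_sources is a hypothetical helper name.
list_external_nss_sources() {
    conf="${1:-/etc/nsswitch.conf}"
    awk '
        { sub(/#.*/, "") }                  # drop comments, whole-line or trailing
        NF < 2 { next }                     # nothing left, or no sources listed
        {
            db = $1
            sub(/:$/, "", db)               # "hosts:" becomes "hosts"
            for (i = 2; i <= NF; i++) {
                src = $i
                if (src ~ /^\[/) continue   # skip criteria such as [notfound=return]
                if (src !~ /^(files|db|dns|nis|compat|cache)$/)
                    print db ": " src
            }
        }
    ' "$conf"
}
```

Run against the second nsswitch.conf posted in this thread, it should flag the winbind entries on the group and passwd lines. Whether those modules are what actually triggers the segfault is still unconfirmed.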
 
In my case I restored system libraries with a FreeBSD 10.1-RELEASE memory stick using a similar procedure.
  1. Boot from USB stick and exit to Live CD.
  2. Mount the damaged FreeBSD installation on /mnt (/, /usr, /var)
  3. Back up manually modified files from /mnt/etc to the USB stick.
  4. Code:
    # cd /usr/freebsd-dist && for file in base.txz lib32.txz kernel.txz src.txz; do tar --unlink -xvpJf "$file" -C /mnt; done
  5. Reboot into the restored system.
  6. Mount the USB stick and restore the backup to /etc.
 
Same problem here and I had some important data on this server. The data is on different zpools apart from the root pool. In this case, if I just reinstall the system, will I be able to import these pools?
 
Have you tried doing a buildworld on a clean machine or VM, then copying a tarball of /usr/src and /usr/obj to the segfaulting machine and running installworld there without building on it?
 
I ran into this today as well. Upgraded from 9.2 to 10.1-RELEASE.
Solved it by reextracting base.txz. Would really have been nice to know what went wrong.
 
Not sure if this is helpful but I successfully upgraded from stock 10.0-RELEASE (no added patches) to 10.1-RELEASE just this past Monday without problem.

FreeBSD mybox 10.1-RELEASE-p5 FreeBSD 10.1-RELEASE-p5 #0: Tue Jan 27 08:55:07 UTC 2015 root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64
 
Well, I need to upgrade from 10.0-RELEASE-p12 to 10.1-RELEASE and I don't want to run into the problems discussed in this thread. This needs to be done soon since 10.0-RELEASE reaches its end-of-life in two weeks.
Did anyone successfully upgrade from 10.0-RELEASE-p12 to 10.1-RELEASE?
(Would it be a good idea to roll back to 10.0-RELEASE before attempting the upgrade?)
 
This is a very odd problem.
I upgraded two servers from 9.3-RELEASE to 10.1-RELEASE earlier in July: no problem at all. Both use a modified nsswitch.conf because those systems are bound to an LDAP server.

I've got a third server, almost identical (all 3 are mail servers), bound to LDAP. I've tried to upgrade it from 9.3-RELEASE to 10.1-RELEASE this morning: epic failures.

Firstly, it took me ages to fetch upgrades, freebsd-update(8) was failing over and over on bad file checksums. I've had to change update servers, and it finally worked. First time I'm seeing this kind of problem.
Then I've experienced the same segfault anomalies you guys are reporting.

Thanks to my VMware snapshot, I rolled back, replaced my nsswitch.conf with a non-LDAP version, and went through the whole upgrade process again. It worked well.
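In case it helps someone repeat this, here is a rough sketch of the swap. The .ldap backup suffix and the function names are arbitrary choices of mine, and stripping the ldap token with sed is only one way to get a non-LDAP file; it would leave an empty source list on a line where ldap is the only source.

```shell
#!/bin/sh
# Sketch only: set the LDAP-enabled nsswitch.conf aside before upgrading and
# put it back afterwards. Names and the ".ldap" suffix are hypothetical.

stash_nsswitch() {
    etc_dir="${1:-/etc}"
    conf="$etc_dir/nsswitch.conf"
    if [ ! -f "$conf" ]; then return 1; fi
    # Refuse to clobber a backup left over from an earlier run.
    if [ -e "$conf.ldap" ]; then return 1; fi
    cp -p "$conf" "$conf.ldap" || return 1
    # Remove the "ldap" source wherever it follows whitespace on a line.
    sed -E 's/[[:space:]]+ldap([[:space:]]|$)/\1/g' "$conf.ldap" > "$conf"
}

restore_nsswitch() {
    etc_dir="${1:-/etc}"
    conf="$etc_dir/nsswitch.conf"
    if [ ! -f "$conf.ldap" ]; then return 1; fi
    mv "$conf.ldap" "$conf"
}

# Intended order on a real system (nothing below is run by this sketch):
#   stash_nsswitch
#   freebsd-update -r 10.1-RELEASE upgrade
#   freebsd-update install      # then reboot
#   freebsd-update install      # again after the reboot
#   restore_nsswitch            # once the LDAP NSS module has been reinstalled
```

Both functions take the directory as an optional argument, so the swap can be tried on a copy of /etc first.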

I really don't understand why this bug would occur on this third server, but not in the first 2. The only differences are:

- first 2 were upgraded the first week of July, the third was upgraded today (12th of August)
- first 2 were installed as 8.x-RELEASE, upgraded to 9.x-RELEASE a year or so ago, the third one was installed as 9.x-RELEASE from scratch.

I find this a little bit concerning...
 