Segmentation fault while upgrading from 10.0-RELEASE to 10.1-RELEASE

spiky

New Member

Reaction score: 2
Messages: 10

Hi,

While upgrading to 10.1-RELEASE, everything was working okay for the first two commands:
Code:
[root@beasty ~]# freebsd-update -r 10.1-RELEASE upgrade
...
[root@beasty ~]# freebsd-update install
...
After rebooting successfully, I ran freebsd-update install once again and here's what I got:
Code:
[root@beasty ~]# freebsd-update install
Installing updates...Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
...
Eventually, I had no choice but to press "ctrl-c" to abort the whole process, since it kept printing the same error message over and over. After that, almost every command resulted in a segfault. For now, I did a zfs rollback, but I would obviously like to upgrade to 10.1 someday. I don't know where to start with this.
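
The rollback mentioned above only helps if a snapshot from before the upgrade exists. A minimal sketch of that safety net follows. The dataset name zroot/ROOT/default and the snapshot label pre-10.1 are assumptions, not taken from this system, and by default the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: snapshot the root dataset before upgrading, roll back if it goes wrong.
# DATASET and LABEL are hypothetical; check `zfs list` for the real names.
DATASET="${DATASET:-zroot/ROOT/default}"
LABEL="${LABEL:-pre-10.1}"

# Print commands by default; set DRY_RUN=0 to really run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

snapshot_before_upgrade() {
    run zfs snapshot "${DATASET}@${LABEL}"
}

rollback_after_failure() {
    # -r also destroys any snapshots newer than the one being rolled back to
    run zfs rollback -r "${DATASET}@${LABEL}"
}

snapshot_before_upgrade
```

Calling rollback_after_failure prints (or, with DRY_RUN=0, runs) the matching rollback command.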
 

pvoigt

Member

Reaction score: 5
Messages: 82

That's exactly what I observed today when trying to upgrade from 10.0-RELEASE to 10.1-RELEASE. After the reboot I got the same segmentation faults, and I couldn't stop them with "ctrl-c" either. My system became completely unusable: no SSH and no serial console. A hastily attached USB keyboard and monitor revealed that even login dumped core. All I could do with my UFS root partition was to format it and restore from a fresh dump of my root file system that I luckily had available.

My system is up with 10.0-RELEASE again but I would like to upgrade. After my bad experience I would greatly appreciate any help on how to proceed.

My system is:
Code:
# uname -a
FreeBSD spock 10.0-RELEASE-p12 FreeBSD 10.0-RELEASE-p12 #0: Tue Nov  4 05:07:17 UTC 2014  root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64
BTW: I tested the upgrade procedure beforehand in a virtual machine without any issue. The test system, however, does not carry the same number of ports and services. It's just a minimal system.

Regards,
Peter
 

pvoigt

Member

Reaction score: 5
Messages: 82

Indeed, I am not using ZFS at all, just pure UFS and no RAID. Besides this I am using a GELI encrypted UFS home but it is not needed during the upgrade process.
 

wildtollwut

Member

Reaction score: 4
Messages: 60

I faced the same problem today when upgrading from 10.0-RELEASE to 10.1-RELEASE. I am using ZFS and a GELI encrypted root. Almost every command segfaults; ls and mount still work, however. Currently I'm desperately trying to recover the system.

Edit: Simple recovery failed, now the system is unbootable and even ls segfaults.

Any news regarding this?
 

arnov

New Member

Reaction score: 3
Messages: 15

I had the same problem. I use LDAP in /etc/nsswitch.conf and Kerberos for authentication. When I removed these from nsswitch.conf (I had to use cat >nsswitch.conf to do that, since no editor would work) and from the pam.d files, a lot of commands functioned again. ZFS still crashed. However, freebsd-update IDS showed that most files had not been upgraded. Unfortunately there is no freebsd-update reinstall or anything like that.
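
When no editor runs, the file can still be rewritten with redirection alone. The sketch below keeps a copy of the existing file and then writes a files-and-DNS-only configuration using only read and printf, which are builtins in most sh implementations. The helper name write_minimal_nsswitch and the exact set of lines are assumptions for illustration, modelled on the configurations posted in this thread, not an official default.

```shell
#!/bin/sh
# Sketch: replace nsswitch.conf with a files/DNS-only version without an editor.
# write_minimal_nsswitch is a hypothetical helper; the line set is an assumption.
write_minimal_nsswitch() {
    conf="${1:-/etc/nsswitch.conf}"

    # Keep a copy of the old file using only redirection and shell builtins.
    if [ -f "$conf" ]; then
        while IFS= read -r line || [ -n "$line" ]; do
            printf '%s\n' "$line"
        done < "$conf" > "$conf.orig"
    fi

    # Overwrite the file; no ldap, winbind, wins or mdns sources remain.
    {
        printf '%s\n' 'group: files'
        printf '%s\n' 'hosts: files dns'
        printf '%s\n' 'networks: files'
        printf '%s\n' 'passwd: files'
        printf '%s\n' 'shells: files'
        printf '%s\n' 'services: files'
        printf '%s\n' 'protocols: files'
        printf '%s\n' 'rpc: files'
    } > "$conf"
}
```

Run as root it would be called as write_minimal_nsswitch /etc/nsswitch.conf; the previous contents end up in /etc/nsswitch.conf.orig.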

Because it was easier to just back up the configuration and reinstall 10.1 from scratch, I did not look further. I am now reluctant to update my other systems from 9.3 to 10.1 before I know what went wrong. Does it have to do with LDAP and/or Kerberos?
 

wildtollwut

Member

Reaction score: 4
Messages: 60

My system seems to be running again (for now, at least). I booted from USB and replaced /bin, /lib and /libexec. Somehow freebsd-update must have corrupted at least one of these directories.

Another run of freebsd-update breaks it again. I have no idea what could be the cause. Currently I'm unable to update.

I'm fairly certain that this is not related to LDAP and Kerberos as I'm not running either of those.

Update: apparently, only /lib is corrupted. It suffices to replace the files in it by valid ones e.g. from a bootable ISO. However, applications in /usr like vim still segfault at termination. This only happens if /usr/local is mounted/present. I suspect that e.g. /usr/local/lib is also affected by the faulty update procedure.
 

pvoigt

Member

Reaction score: 5
Messages: 82

Some people reported similar errors on IRC, but I do not yet know exactly why freebsd-update fails. I can only suppose that it might be a combination of errors, such as incomplete mirrors and a bug in freebsd-update, but I do not know for sure. It is hard to find reliable information. I have been advised to build the base system and the kernel from source. This method is said to be more reliable than using freebsd-update.

Right now make buildworld and make buildkernel have just finished. Tomorrow I am going to do the rest of the upgrade process.

Regards,
Peter
 

talsamon

Daemon

Reaction score: 283
Messages: 1,835

http://blog.gmane.org/gmane.os.freebsd.stable

The problems with running freebsd-update on 10.1-RC3 (including using
freebsd-update to upgrade to -RC4 or -RELEASE) should now be fixed. The
problem was due to some files being missing from freebsd-update mirrors
and resulted from -RC3 going out at the same time as patches were being
built for a set of security advisories (SA-14:20 through 14:23).

I know this glitch has been discussed in a large number of places, and
I'm sure there are people who have been affected by this who are not
subscribed to -stable, so please relay this message to the appropriate
fora where you see this being discussed.

Thanks,
( Colin Percival | 16 Nov 02:08 2014)
 

pvoigt

Member

Reaction score: 5
Messages: 82

Thanks, talsamon, for pointing to the relevant portion of the above link. But I am still not sure if it fully applies because all people in this thread are upgrading from 10.0-RELEASE and not from 10.1-RC3. Or do I not understand you correctly?

Regards,
Peter
 

talsamon

Daemon

Reaction score: 283
Messages: 1,835

Yes, I think there are more problems than the one mentioned in this statement. But I found it and thought I should post it. (I haven't seen anyone else post a link to it in the other threads.)
 

dR3b

Member

Reaction score: 7
Messages: 35

I had the same problem! The upgrade from 9.3-RELEASE-p5 ends with
Code:
Segmentation fault (core dumped)
A further test with another VM (ESXi) and 10.0-RELEASE-p12 caused no problems.
 

wildtollwut

Member

Reaction score: 4
Messages: 60

My system is running again after I extracted base.txz from 10.1 to / (not overwriting /usr, /etc, /var and the like). Still, vim was segfaulting when closing it. I could trace this back to a faulty libtspi.so (e.g. used by gnutls). As long as it resides in /usr/local/lib it causes segfaults in various programs.
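
The selective extraction described above could be sketched roughly as follows. The helper name restore_from_base and the example paths are hypothetical, and the sketch assumes that the archive stores its members with a leading "./" (check with tar -tf first). On FreeBSD some files in /lib carry the schg flag and may need chflags noschg before they can be replaced.

```shell
#!/bin/sh
# Sketch: restore selected top-level directories from a base archive.
# restore_from_base is a hypothetical helper. It assumes archive members are
# stored with a leading "./" (e.g. ./lib/libc.so.7); verify with `tar -tf`.
restore_from_base() {
    archive="$1"   # e.g. a base.txz taken from 10.1-RELEASE media
    dest="$2"      # root of the damaged system, e.g. /mnt when booted from USB
    shift 2
    for dir in "$@"; do
        # -x extract, -p preserve permissions, -C change to destination first
        tar -xpf "$archive" -C "$dest" "./$dir" || return 1
    done
}
```

For example, restore_from_base /path/to/base.txz /mnt lib would replace only /mnt/lib and leave /mnt/usr, /mnt/etc and /mnt/var untouched.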

Edit: I just built libtspi.so i.e. security/trousers from ports and installed it. The same behavior :mad:

Another edit: I copied libtspi* from a working 10.1-RELEASE to my system while freebsd-update IDS was running. Just as the copying was finished I got lots of
Code:
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
...
from freebsd-update. This must be somehow related.
 

spiky

New Member

Reaction score: 2
Messages: 10

Has anybody tried the manual method (make buildworld)? As for me, I've decided to do a clean install and restore my configurations and data afterwards. Then I successfully upgraded my 10.0 jails to 10.1 using the manual method:

Code:
[root@beasty /usr/src]# make buildworld
...
[root@beasty /usr/src]# mergemaster -p -D /jails/cranky
...
[root@beasty /usr/src]# make -DBATCH_DELETE_OLD_FILES installworld delete-old delete-old-libs DESTDIR=/jails/cranky
...
[root@beasty /usr/src]# mergemaster -i -U -D /jails/cranky
 

pvoigt

Member

Reaction score: 5
Messages: 82

Yeah, I have been convinced by people on IRC to do make buildworld. I am a bit disappointed that there is no reliable information about why freebsd-update is failing. There should at least be some kind of official warning to hold off on using freebsd-update until the cause of the failure has been found. My picture of FreeBSD as rock-solid is somewhat shaken by my extremely bad experience with freebsd-update. I have never had such a harsh crash before, with a completely unresponsive system. At least not with a Unix system :)

At first I did not dare to attempt the make buildworld process due to my lack of experience. But the whole process was straightforward and went very smoothly.

I finally decided on make buildworld because I cannot afford a longer server downtime, and reinstalling more than 900 ports, including reconfiguring them from scratch, would have taken too much time. Using make buildworld, my effective server downtime was no longer than two times two minutes, i.e. the two reboots.
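
For readers who have not done a source upgrade before, the order of steps, including the two reboots mentioned above, can be sketched as a checklist. This roughly follows the sequence documented in the comments at the top of /usr/src/Makefile; KERNCONF=GENERIC and the mergemaster flags are assumptions, and the script only prints the steps without executing anything.

```shell
#!/bin/sh
# Sketch: print the source-upgrade steps in order (nothing is executed).
# KERNCONF=GENERIC is an assumption; the reboots are manual steps.
KERNCONF="${KERNCONF:-GENERIC}"

upgrade_steps() {
    cat <<EOF
cd /usr/src
make buildworld
make buildkernel KERNCONF=${KERNCONF}
make installkernel KERNCONF=${KERNCONF}
shutdown -r now    # first reboot, onto the new kernel
cd /usr/src
mergemaster -p
make installworld
mergemaster -iU
make delete-old
shutdown -r now    # second reboot, onto the new world
make delete-old-libs
EOF
}

upgrade_steps
```

/usr/src/UPDATING for the target release remains the authoritative reference for any release-specific steps.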

Though not really necessary I am currently rebuilding all ports. Compared to building and installing the new base system and the new kernel this is even more time consuming. This is not only regarding the pure build time but mainly because some of the ports are not building at all. At least one has had an open PR for several months. I am currently skipping them and will investigate the details later.

Regards,
Peter
 

spiky

New Member

Reaction score: 2
Messages: 10

I agree with you pvoigt. I'm also a bit disappointed with that.

By the way, since you are rebuilding your ports, have you looked at PKGNG? Using it in combination with the ports (only for packages that require custom compile options) is a very effective way of managing packages. I think the official word is to use only one method, but my experience so far using both is very good.
 

jb_fvwm2

Daemon

Reaction score: 187
Messages: 1,720

Checking the Makefile, freebsd-update appeared about 2006. I used /usr/src/UPDATING prior to that to do the buildworld cycle and kind of thought that the former would be used by those already experienced with the latter, seeing as how things can go awry with either. The latter would be a backup to the former.
 

dR3b

Member

Reaction score: 7
Messages: 35

Delete the following directories:
Code:
/boot/kernel.old
/boot/kernel.generic
After that freebsd-update is running without any errors.
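
A more cautious variant of this suggestion is to move the directories aside rather than delete them, so they can be put back if the upgrade still fails. The sketch below does that; the helper name set_aside_old_kernels and the ".bak" suffix are arbitrary choices for illustration, and the thread does not establish whether this step actually addresses the segfaults.

```shell
#!/bin/sh
# Sketch: move old kernel directories out of the way instead of deleting them.
# set_aside_old_kernels is a hypothetical helper; the ".bak" suffix is arbitrary.
set_aside_old_kernels() {
    boot="${1:-/boot}"
    for name in kernel.old kernel.generic; do
        # Only act if the directory exists and no earlier backup is in the way.
        if [ -d "$boot/$name" ] && [ ! -e "$boot/$name.bak" ]; then
            mv "$boot/$name" "$boot/$name.bak" || return 1
            echo "moved $boot/$name to $boot/$name.bak"
        fi
    done
}
```

Called without arguments it operates on /boot; the running kernel in /boot/kernel is never touched.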
 

mxms

New Member


Messages: 5

dR3b said:
Delete the following directories:
Code:
/boot/kernel.old
/boot/kernel.generic
After that freebsd-update is running without any errors.
It didn't work for me. I also can't run make buildworld, because it fails with a segmentation fault too.
 

gavin@

New Member
Developer


Messages: 8

For anybody seeing the "Segmentation fault (core dumped)" messages still, can you provide the output of dmesg | grep -A 8 ^CPU and also show the content of your /etc/nsswitch.conf file please?

Also, if you mv /etc/nsswitch.conf /etc/nsswitch.conf.o do things start working again without having to do anything else?
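
Before moving the file away, it can help to see which name-service sources in it come from outside the base system. The sketch below lists them per database. The helper name list_external_nss_sources is hypothetical, and the built-in set used here (files, db, dns, nis, compat, cache) is an assumption based on nsswitch.conf(5); adjust it if your release differs.

```shell
#!/bin/sh
# Sketch: list name-service sources in nsswitch.conf that are not built into
# the base system. The built-in set (files, db, dns, nis, compat, cache) is an
# assumption based on nsswitch.conf(5).
list_external_nss_sources() {
    conf="${1:-/etc/nsswitch.conf}"
    awk '
        /^[[:space:]]*#/ { next }             # skip comment lines
        /:/ {
            db = $1; sub(/:$/, "", db)        # database name without the colon
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^\[/) continue      # skip criteria such as [NOTFOUND=return]
                if ($i !~ /^(files|db|dns|nis|compat|cache)$/)
                    print db ": " $i
            }
        }
    ' "$conf"
}
```

Each output line names a database and one source, such as ldap, winbind, wins or mdns, that is loaded from a module outside the base system.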
 

mxms

New Member


Messages: 5

Code:
# dmesg | grep -A 8 ^CPU
CPU: Intel(R) Atom(TM) CPU D525   @ 1.80GHz (1800.11-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x106ca  Family = 0x6  Model = 0x1c  Stepping = 10
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x40e31d<SSE3,DTES64,MON,DS_CPL,TM2,SSSE3,CX16,xTPR,PDCM,MOVBE>
  AMD Features=0x20100800<SYSCALL,NX,LM>
  AMD Features2=0x1<LAHF>
  TSC: P-state invariant, performance statistics
real memory  = 4294967296 (4096 MB)
avail memory = 4093018112 (3903 MB)
# cat /etc/nsswitch.conf
#
# nsswitch.conf(5) - name service switch configuration file
# $FreeBSD: release/10.0.0/etc/nsswitch.conf 224765 2011-08-10 20:52:02Z dougb $
#
group: files winbind
group_compat: nis
hosts: files dns
networks: files
passwd: files winbind
passwd_compat: nis
shells: files
services: compat
services_compat: nis
protocols: files
rpc: files
Removing /etc/nsswitch.conf didn't work.
 

spiky

New Member

Reaction score: 2
Messages: 10

The following is from my clean install of 10.1, although I've restored nsswitch.conf as it was on 10.0.

Code:
[root@beasty ~]# cat /var/run/dmesg.boot | grep -A 8 ^CPU
CPU: Intel(R) Xeon(R) CPU E3-1230 V2 @ 3.30GHz (3292.60-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x306a9  Family = 0x6  Model = 0x3a  Stepping = 9
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x7fbae3ff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
  AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
  AMD Features2=0x1<LAHF>
  Structured Extended Features=0x281<FSGSBASE,SMEP,ERMS>
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
  TSC: P-state invariant, performance statistics
[root@beasty ~]#
[root@beasty ~]# cat /etc/nsswitch.conf
#
# nsswitch.conf(5) - name service switch configuration file
# $FreeBSD: release/10.0.0/etc/nsswitch.conf 224765 2011-08-10 20:52:02Z dougb $
#
passwd: files ldap
##passwd: files winbind
group: files ldap
##group: files winbind
#group: compat
#group_compat: nis
##hosts: files dns mdns
##hosts: files mdns4_minimal [NOTFOUND=return] dns
hosts: files mdns dns
networks: files
#passwd: compat
#passwd_compat: nis
shells: files
services: compat
services_compat: nis
protocols: files
rpc: files
[root@beasty ~]#
 

wildtollwut

Member

Reaction score: 4
Messages: 60

Very interesting: for me, removing /etc/nsswitch.conf works (even if /usr/local/lib/libtspi.so is present) and the system doesn't segfault anymore.
/etc/nsswitch.conf
Code:
group: compat
group_compat: nis
hosts: files wins dns
networks: files
passwd: compat
passwd_compat: nis
shells: files
services: compat
services_compat: nis
protocols: files
rpc: files
Code:
CPU: Intel(R) Celeron(R) CPU  N2820  @ 2.13GHz (2133.47-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x30673  Family = 0x6  Model = 0x37  Stepping = 3
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x41d8e3bf<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,MOVBE,POPCNT,TSCDLT,RDRAND>
  AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
  AMD Features2=0x101<LAHF,Prefetch>
  Structured Extended Features=0x2282<TSCADJ,SMEP,ERMS>
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
  TSC: P-state invariant, performance statistics
--
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s)
cpu0 (BSP): APIC ID:  0
cpu1 (AP): APIC ID:  2
Still, to get to this point, I had to replace /lib with a version from a 10.1-RELEASE image.
 