Solved Unable to buildkernel and other odd behavior

When I was upgrading from 10.1-RELEASE to 10.2-RELEASE through a normal binary update with freebsd-update -r 10.2-RELEASE, I got a few errors about no files existing:
Code:
/usr/sbin/freebsd-update: cannot open files/.gz: No such file or directory
I followed this solution, by updating the src, the kernel, and the world individually. I rebooted and updated my basejail to the same version, using ezjail.

Just to be safe, I always rebuild all the ports after minor upgrades, so I went into each jail and used portmaster to rebuild everything. No problems. Back in the main OS, I also rebuilt all the ports. However, some odd behavior happened. lang/python27 kept hanging at the step where it checks whether pthreads are enabled. This is the same issue documented here. In that thread, users were having trouble building Python inside their jails, and creating a new basejail fixed it. My problem is the opposite: Python won't build in the main OS, but it works fine in the jails.

This leads me to believe that I messed up the binary upgrade. Another thing I noticed in the main OS was that ntpd hung whenever I tried to start it: no CPU usage, nothing. In fact, I had to drop into single-user mode to disable it because the OS wouldn't even boot past that point. Using truss service ntpd onestart, I noticed it got stuck on a vfork(2) call. To me, this suggests something is quite wrong with the kernel.

I tried checking the checksums using freebsd-update IDS >> /root/cheksums.ids, but that ended up listing just about every file as a mismatch.

I've taken the suggestion from other threads to manually sync my source with svn and buildworld/buildkernel. So I've done that using svnlite co svn://svn.freebsd.org/base/release/10.2.0 /usr/src and following the guide from the handbook. Building the world works fine, but it hangs again with building the kernel. Specifically it stops here:

Code:
ctfconvert -L VERSION -g vers.o
linking kernel.debug
ctfmerge -L VERSION -g -o kernel.debug locore.o  cam.o cam_compat.o cam_periph.o cam_queue.o cam_sim.o cam_xpt.o  ata_all.o ata_xpt.o ata_pmp.o scsi_xpt.o scsi_all.o scsi_cd.o  scsi_ch.o ata_da.o scsi_da.o scsi_pass.o scsi_sa.o scsi_enc.o  scsi_enc_ses.o scsi_enc_safte.o smp_all.o freebsd32_capability.o  freebsd32_ioctl.o freebsd32_misc.o freebsd32_syscalls.o  freebsd32_sysent.o dsargs.o dscontrol.o dsfield.o dsinit.o  dsmethod.o dsmthdat.o dsobject.o dsopcode.o dsutils.o dswexec.o  dswload.o dswload2.o dswscope.o dswstate.o evevent.o evglock.o  evgpe.o evgpeblk.o evgpeinit.o evgpeutil.o evhandler.o evmisc.o  evregion.o evrgnini.o evsci.o evxface.o evxfevnt.o evxfgpe.o  evxfregn.o exconfig.o exconvrt.o excreate.o exdebug.o exdump.o  exfield.o exfldio.o exmisc.o exmutex.o exnames.o exoparg1.o  exoparg2.o exoparg3.o exoparg6.o exprep.o exregion.o exresnte.o  exresolv.o exresop.o exstore.o exstoren.o exstorob.o exsystem.o  exutils.o hwacpi.o hwesleep.o hwgpe.o hwpci.o hwregs.o hwsleep.o  hwtimer.o hwvalid.o hwxface.o hwxfsleep.o nsaccess.o nsalloc.o  nsarguments.o nsconvert.o nsdump.o nseval.o nsinit.o nsload.o  nsnames.o nsobject.o nsparse.o nspredef.o nsprepkg.o nsrepair.o  nsrepair2.o nssearch.o nsutils.o nswalk.o nsxfeval.o nsxfname.o  nsxfobj.o psargs.o psloop.o psobject.o psopcode.o psopinfo.o  psparse.o psscope.o pstree.o psutils.o pswalk.o psxface.o  rsaddr.o rscalc.o rscreate.o rsdump.o rsdumpinfo.o rsinfo.o  rsio.o rsirq.o rslist.o rsmemory.o rsmisc.o rsserial.o rsutils.o  rsxface.o tbdata.o tbfadt.o tbfind.o tbinstal.o tbprint.o  tbutils.o tbxface.o tbxfload.o tbxfroot.o utaddress.o utalloc.o  utbuffer.o utcache.o utcopy.o utdebug.o utdecode.o utdelete.o  uterror.o uteval.o utexcep.o utglobal.o uthex.o utids.o utinit.o  utlock.o utmath.o [...]

I've tried multiple times, but it always gets stuck on ctfmerge(1): no CPU usage, nothing happening. Is this normal? I've let it run for 20+ minutes with no progress.
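
For anyone following along, the checkout and build sequence described above can be sketched roughly as below. This is only a sketch: KERNCONF=GENERIC is an assumption (the post doesn't say which kernel config was used), and run is a hypothetical wrapper that prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch of the source checkout and build described above.
# Assumption: GENERIC kernel config; set KERNCONF to your own if it differs.
KERNCONF="${KERNCONF:-GENERIC}"

# Hypothetical helper: print the command by default, run it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

build_from_source() {
    # Check out the 10.2 release sources (same URL as in the post),
    # then build userland first and the kernel second.
    run svnlite co svn://svn.freebsd.org/base/release/10.2.0 /usr/src &&
        run make -C /usr/src buildworld &&
        run make -C /usr/src buildkernel "KERNCONF=$KERNCONF"
}

build_from_source
```

Run as-is, it just lists the three commands. The install steps (installkernel, reboot, installworld) are left out on purpose; they should follow the handbook.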

At this point, what are my options? How can I get a successful buildkernel? Or how can I get freebsd-update(8) to force a full reinstall? The basejail image update from ezjail-admin(8) worked fine, so how can I replicate that process on the main OS?
 
For further information, I booted into single user to buildkernel there. It again hangs on ctfmerge(1), but I noticed the process state is umtxn. When I stop buildkernel and go back to the shell, ps still shows the ctfmerge(1) process "running", stuck in the umtxn state.
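
A rough way to look at a process stuck like this is sketched below. ps(1) shows the state and wait channel, and procstat(1) with -kk prints the kernel stack of each thread, assuming the running kernel supports stack traces (GENERIC should). The inspect_pid function and the run wrapper are hypothetical names for illustration only; run prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: inspect a hung process by PID.

# Hypothetical helper: print the command by default, run it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

inspect_pid() {
    pid="$1"
    # State and wait channel (e.g. umtxn) for the process.
    run ps -o pid,state,wchan,command -p "$pid"
    # Kernel stack of each thread, to see where it is sleeping.
    run procstat -kk "$pid"
}
```

Usage would be something like DRY_RUN=0 inspect_pid 1234, where 1234 is the PID of the stuck ctfmerge(1) as reported by ps.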

Is there a way to use the live CD/DVD to reinstall the OS, without data loss?
 
Thanks sidetone, that's probably the direction I'm headed. Unfortunately, it's kind of difficult with a 6 TB ZFS pool, but hopefully I can find some spare HDDs. I'll wait a bit to see if anyone has other suggestions.
 
This freebsd-update -r 10.2-RELEASE is supposed to be freebsd-update -r 10.2-RELEASE upgrade. Another thing, did you run freebsd-update fetch install first?
You can still save that system by converting to -STABLE, see this thread.
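
For reference, the usual binary upgrade order looks roughly like the sketch below. The target release is taken from this thread; upgrade_pre_reboot, upgrade_post_reboot and run are hypothetical names used only for this example, and run prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch of a freebsd-update minor-release upgrade.
TARGET="${TARGET:-10.2-RELEASE}"   # release named in this thread

# Hypothetical helper: print the command by default, run it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

upgrade_pre_reboot() {
    # Bring the current release fully up to date first.
    run freebsd-update fetch install
    # Fetch the new release, then install the new kernel.
    run freebsd-update -r "$TARGET" upgrade &&
        run freebsd-update install
}

upgrade_post_reboot() {
    # After rebooting into the new kernel, install the new userland.
    run freebsd-update install
}

upgrade_pre_reboot
```

After the reboot, upgrade_post_reboot covers the second install pass. The first fetch install is not chained to the rest because it can report that there is nothing to install.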
 
I just mistyped above. I don't think it even does anything with just the -r flag. Running a fetch and install doesn't help because it thinks everything is fine and that no updates are available.

I tried to convert to -STABLE using the thread you linked to, but I'm still having the same problem, since it's using mutexes from my currently running kernel.

What do you think about building the kernel while booted into the live CD? I'm thinking about going into the live CD, mounting my ZFS pool and building the kernel/world to the ZFS pool. Then booting back to the installed OS and installing the world/kernel from there. Does this sound bad?

To do that, I would have to set an altroot for the ZFS pool. Are there any environment variables or something in make.conf I can set to tell it to use something like /mnt/altzfsroot/usr/obj instead of /usr/obj?
 
Why not install FreeBSD on a separate, small (under 80 GB) hard drive? Then leave your existing filesystem alone and mount it from the new install. This is what I often do. If you partitioned your drive, you may have more options. Be careful with your commands, because it is easy to delete the wrong directories.

To do that, I would have to set an altroot for the ZFS pool. Are there any environment variables or something in make.conf I can set to tell it to use something like /mnt/altzfsroot/usr/obj instead of /usr/obj?
I tried searching for this before, in make.conf(5) and src.conf(5), but couldn't find it.
 
I could install the OS to another hard drive and use that as the boot drive, but I'd rather not. I should be able to just reinstall the OS somehow.

I'm pretty sure you can set an environment variable of MAKEOBJDIRPREFIX to some other directory. At least according to this wiki entry, it should be fine. I'll try booting from the CD, mounting the ZFS pool, and buildworld/buildkernel tonight and report back for future Googlers.
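
A minimal sketch of that plan, assuming a pool named zroot (hypothetical, substitute your own) and the /mnt/altzfsroot altroot mentioned earlier. As far as I know, MAKEOBJDIRPREFIX has to be set in the environment rather than in make.conf or on the make command line, which is why env is used here. The run wrapper is a hypothetical helper that prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: build world and kernel from the live CD onto the imported pool.
POOL="${POOL:-zroot}"                   # assumption: your pool name will differ
ALTROOT="${ALTROOT:-/mnt/altzfsroot}"   # altroot path from the post above

# Hypothetical helper: print the command by default, run it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

build_on_altroot() {
    # Import the pool under the altroot; -f because it was last used by another system.
    run zpool import -f -R "$ALTROOT" "$POOL" &&
        # Point the object tree at the pool instead of the live CD's /usr/obj.
        run env "MAKEOBJDIRPREFIX=$ALTROOT/usr/obj" make -C "$ALTROOT/usr/src" buildworld &&
        run env "MAKEOBJDIRPREFIX=$ALTROOT/usr/obj" make -C "$ALTROOT/usr/src" buildkernel
}

build_on_altroot
```

One thing to check before relying on this: the object directory path includes the source path, so objects built from /mnt/altzfsroot/usr/src may not land where a later install run from /usr/src expects to find them.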
 