sshd frequent crashes (segfaults and fatal errors during key checks?)

Hey all, new poster here but I’ve been using FreeBSD on and off since the 4.x days.

I’m setting up a new VPS on Vultr running 15.0-RELEASE-p5 and am noticing some really strange behavior. Here’s an excerpt from /var/log/messages:

Apr 4 18:18:52 sshd-session[3581]: Connection closed by 27.122.56.141 port 58442 [preauth]
Apr 4 18:32:11 sshd-session[3693]: error: Fssh_kex_exchange_identification: read: Connection reset by peer
Apr 4 18:40:00 kernel: pid 2016 (sshd), jid 0, uid 0: exited on signal 11 (no core dump - sugid process denied by kern.sugid_coredump)

I enabled core dumping and waited a bit for it to recur. The backtrace looks like this:

* thread #1, name = 'sshd', stop reason = signal SIGSEGV
* frame #0: 0x00000489cdaae0b5 libcrypto.so.35`___lldb_unnamed_symbol9805 + 1493
frame #1: 0x00000489cdaac3a5 libcrypto.so.35`___lldb_unnamed_symbol9799 + 229
frame #2: 0x00000489cda942ed libcrypto.so.35`___lldb_unnamed_symbol9711 + 109
frame #3: 0x00000489cda8b419 libcrypto.so.35`___lldb_unnamed_symbol9686 + 3433
frame #4: 0x00000489cda89b1f libcrypto.so.35`BN_mod_exp_mont + 175
frame #5: 0x00000489cda87586 libcrypto.so.35`BN_BLINDING_create_param + 406
frame #6: 0x00000489cdd0786b libcrypto.so.35`RSA_setup_blinding + 155
frame #7: 0x00000489cdd0779a libcrypto.so.35`RSA_blinding_on + 74
frame #8: 0x00000489cb157768 libprivatessh.so.5`___lldb_unnamed_symbol2217 + 392
frame #9: 0x00000489cb12103d libprivatessh.so.5`Fssh_sshkey_private_deserialize + 365
frame #10: 0x00000489cb120ca0 libprivatessh.so.5`Fssh_sshkey_unshield_private + 576
frame #11: 0x00000489cb120dac libprivatessh.so.5`Fssh_sshkey_private_serialize_opt + 156
frame #12: 0x00000481a8f29066 sshd`___lldb_unnamed_symbol594 + 166
frame #13: 0x00000481a8f27f9a sshd`___lldb_unnamed_symbol593 + 3290
frame #14: 0x00000481a8f26365 sshd`___lldb_unnamed_symbol582 + 6645
frame #15: 0x00000489ce72537f libc.so.7`__libc_start1 + 303
frame #16: 0x00000481a8f248f1 sshd`___lldb_unnamed_symbol580 + 33


It seems to be related to RSA private keys, which is strange. I validated the key with ssh-keygen, and the public key extracted from the private key looked fine. I repeated this for the ECDSA and Ed25519 keys. All the keys are untouched from the state the system first booted with from Vultr.

I don’t use RSA identities, so I edited my sshd_config to have a single uncommented HostKey line:

HostKey /etc/ssh/ssh_host_ed25519_key

This prevents the outright segfault, but sshd is still throwing fatal errors like this:

sshd[13545]: fatal: pack_hostkeys: serialize hostkey private: string is too large

This does NOT trigger a core dump; sshd seems to simply shut itself down.

This is all pretty bizarre, and it happens frequently (within less than 2h of uptime). Presumably it’s triggered by some automated scan or exploit attempt, since the box is on the internet?

I have never seen anything like this, and couldn’t find anything obvious after a quick search. I observed it on the Vultr VPS pretty soon after spinning it up. I ran freebsd-update fetch install and there were no updates; pkg was similarly up to date with the p5 patch. I also tried rebuilding the machine a few times (Vultr’s OS reinstall command, which essentially provisions a new VM from a clean image).

Anyone seen this? Should I post elsewhere like an OpenSSH list?
 
The backtrace points to a crash in OpenSSL's RSA blinding setup (RSA_blinding_on -> RSA_setup_blinding -> BN_mod_exp_mont) triggered during SSH key deserialization. This is almost certainly a corrupted or malformed RSA host key rather than an OpenSSH bug.

First, try regenerating the host keys:

rm /etc/ssh/ssh_host_rsa_key*
service sshd restart

sshd will regenerate the RSA key on restart. If the segfaults stop, the old key was the problem. Keys can become corrupted during VPS provisioning, by disk issues, or if the VM was snapshotted/cloned while sshd was writing them.
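
If you'd rather regenerate by hand than rely on the rc script, something like this works. A sketch: the flags are standard ssh-keygen(1), and I'd dry-run it in a scratch directory before touching /etc/ssh.

```shell
# Dry-run in a scratch directory; point -f at /etc/ssh/ssh_host_rsa_key
# on the real host (as root) once you're happy with the result.
DIR=$(mktemp -d)
ssh-keygen -q -t rsa -b 3072 -N '' -f "$DIR/ssh_host_rsa_key"
ssh-keygen -lf "$DIR/ssh_host_rsa_key"              # does the key parse? print its fingerprint
ssh-keygen -yf "$DIR/ssh_host_rsa_key" > /dev/null  # can the public half be re-derived?
```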

If the crash persists after regenerating, try disabling RSA entirely to isolate whether it is RSA-specific:

In /etc/ssh/sshd_config add (a single leading '-' removes the listed algorithms from the default set):
HostKeyAlgorithms -ssh-rsa,rsa-sha2-256,rsa-sha2-512

Then restart sshd (running sshd -t first will catch config typos). Modern clients will negotiate Ed25519 or ECDSA instead. If the crashes stop, that confirms the issue is specific to the RSA code path in this OpenSSL build.

Also worth checking: which OpenSSL version is this running (openssl version)? FreeBSD 15.0 ships with OpenSSL 3.x, and there have been blinding-related fixes in recent patch levels. A freebsd-update fetch install might pull in a fixed version.

The connection resets from random IPs (27.122.56.x) are just brute-force bots and are unrelated to the segfault. Adding them to a blocklist via fail2ban or pf won't hurt, but it won't fix the crash.
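
If you do want pf to throttle the bots, a minimal fragment looks roughly like this. A sketch only: the table and rule names are my own, vtnet0 is the typical Vultr interface, and the rate numbers are arbitrary.

```
# /etc/pf.conf fragment: auto-ban hosts hammering port 22
table <bruteforce> persist
block in quick from <bruteforce>
pass in on vtnet0 proto tcp to port 22 \
    keep state (max-src-conn-rate 5/60, overload <bruteforce> flush global)
```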
 
I've already tried regenerating the RSA keys. To be honest, I'm having a hard time understanding how the keys could have become corrupted, given that I tested them with ssh-keygen, but I suppose it's possible?

I hadn't thought to try forbidding the algorithms; I'll give that a shot later. As I mentioned in my post, though, I'm already explicitly using only Ed25519.

I've also noted in the original post that I have indeed run freebsd-update fetch install. (At this point I have to wonder if this reply is some sort of LLM summary?) My OpenSSL version is as follows: OpenSSL 3.5.4 30 Sep 2025 (Library: OpenSSL 3.5.4 30 Sep 2025).

When searching the source tree, I noticed that FreeBSD 16 will remove automatic RSA key generation, and that Amazon cloud images already have it disabled. That's good news. I also noticed, rather buried, a mention of `sshd_rsa_enable`. It seems this mostly just prevents key generation, but perhaps it does something more than my config, which only specified one explicit HostKey. For now I've deleted the RSA key (both the original and the regenerated one).

Another point to note: after changing the HostKey settings, the core-dump-level crashes have stopped, but I'm still getting fatal errors every few hours that shut down the SSH daemon.
 
Good question, cracauer@. This is a VM on a public cloud (Vultr), so I'm not sure how accurate such testing would be (or how far I should take it). I guess you're thinking along the lines of memtest86 from within the VM?

The thought occurred to me, but it's also a bit curious that it fails in exactly the same way, on the same code path, repeatably. It doesn't seem to affect any other service on the system (I've since spun up WireGuard, Caddy, and Varnish Cache without incident). That doesn't disqualify the idea, though. Is that what you had in mind, or something else?

Here's another update after running overnight. It did eventually hit the same fatal error again. It looks like sshd_rsa_enable only controls whether the key is generated, not whether the algorithm is considered at all. To that end, I've tried explicitly setting HostKeyAlgorithms to a pruned set, built by removing the RSA algorithms from the default list in the manpage:

HostKeyAlgorithms ssh-ed25519-cert-v01@openssh.com,ecdsa-sha2-nistp256-cert-v01@openssh.com,ecdsa-sha2-nistp384-cert-v01@openssh.com,ecdsa-sha2-nistp521-cert-v01@openssh.com,sk-ssh-ed25519-cert-v01@openssh.com,sk-ecdsa-sha2-nistp256-cert-v01@openssh.com,ssh-ed25519,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521,sk-ssh-ed25519@openssh.com,sk-ecdsa-sha2-nistp256@openssh.com
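
Rather than hand-editing, I derived the pruned value mechanically. A sketch: the DEFAULT string below is the default list as I copied it from sshd_config(5); re-copy it from your own manpage before trusting the output.

```shell
# Drop every RSA entry from the default HostKeyAlgorithms list and
# re-join the survivors into a single comma-separated value.
DEFAULT='ssh-ed25519-cert-v01@openssh.com,ecdsa-sha2-nistp256-cert-v01@openssh.com,ecdsa-sha2-nistp384-cert-v01@openssh.com,ecdsa-sha2-nistp521-cert-v01@openssh.com,sk-ssh-ed25519-cert-v01@openssh.com,sk-ecdsa-sha2-nistp256-cert-v01@openssh.com,rsa-sha2-512-cert-v01@openssh.com,rsa-sha2-256-cert-v01@openssh.com,ssh-ed25519,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521,sk-ssh-ed25519@openssh.com,sk-ecdsa-sha2-nistp256@openssh.com,rsa-sha2-512,rsa-sha2-256'
printf '%s\n' "$DEFAULT" | tr ',' '\n' | grep -v rsa | paste -sd, -
```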
 
Observed the crash (the fatal-shutdown variant) again, so I've reduced HostKeyAlgorithms to just the Ed25519 entries. The HostKey directive does seem to have limited RSA, since restricting it stopped the core dumps. But it keeps hitting the same spot.

I also tried running memtester from ports for a while, and it came back clean. Really strange behavior.
 
Hmmm.

One thing is for sure: sshd shouldn't casually segfault. It has been reviewed for memory issues many times. Maybe a crypto library is assassinating it?

You can't run memtest86 on a cloud instance, but you could run prime95/mprime. But I agree, it's not likely hardware.
 
> One thing is for sure, sshd shouldn't casually segfault.

Yeah, exactly. The fact that disabling the RSA key stopped the segfaults is interesting on its own, and suggests a bug somewhere in the RSA code path.

The "clean" fatal path it hits now is less bad, but the fact that it is catching some apparent corruption at all is concerning in its own right.

What would be helpful to the devs here, cracauer@? Apparently I have something of a honeypot box for this bug, and I don't want to waste the opportunity :) Would the core file be useful? Should I enable some packet-level logging to capture the event? I'm a bit fuzzy on the specifics, but I'd be happy to gather data; anything up to and including a full packet capture is an option, since I'm not using the VPS for anything else.
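
In the meantime, one low-effort step is turning up sshd's logging so the next failure is better documented (DEBUG3 is the most verbose level per sshd_config(5), which also warns that DEBUG levels can violate user privacy):

```
# /etc/ssh/sshd_config: maximum verbosity while hunting the bug
LogLevel DEBUG3
```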
 
I am a bit torn between classifying this as an outlier and declaring an emergency because of a segfault in RSA.

Either way, you have collected enough data for a useful bug report. The advantage of a bug report is that you can harass-CC people in the FreeBSD organization who might actually know something about this, which you can't do on the forum.
 
Is any crypto offloading (cryptodev, QAT) involved? Some hypervisors are really bad at this...
 
This could well be the case.

It's happening with every Vultr FreeBSD 15 instance I have seen so far, and Vultr's orchestration provisions 15.0 instances to load cryptodev via /boot/loader.conf.
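
To rule cryptodev out, overriding the image's loader setting should be enough. A sketch; the variable follows the usual <module>_load convention, and kldunload cryptodev lets you test without a reboot:

```
# /boot/loader.conf: override the image's cryptodev_load="YES" so the
# module stays out on the next boot
cryptodev_load="NO"
```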
 
Heh... that's depressing, but I'm glad I'm not the only one, and also that this isn't a "stop the world, RSA emergency" per se. I guess I'll open a support ticket and point them at this thread to see whether it gives them enough to look into it.
 
I started experiencing this problem recently upon launching a new FreeBSD 14 instance on Vultr. In addition to seeing the same core dump and backtrace the OP shared, I noticed that sometimes sshd-session, or even sshd itself, would crash silently (as far as I can tell). This happens pretty consistently across five of their US-based locations where I have been testing.

I also launched a parallel set of instances built from the stock 14.4-RELEASE ISO. After almost 24 hours, sshd has died on four of the five launched from Vultr-supplied images and on zero of the five built from the ISO.

In addition to the /boot/loader.conf that Vultr ships (see attached), it turns out that the hypervisor is different, too, and that does not appear to be customer-selectable. The Vultr-supplied FreeBSD runs on KVM while the ISO-built FreeBSD runs on Hyper-V.

I have a ticket open with Vultr, to which the initial response included this comment:
> This is an issue that we have open internally at this time, and is being reviewed further.
 


This afternoon, I launched two identical instances on Vultr in the Chicago location (same parameters as before: "Shared CPU", "Cloud Compute", "vc2-1c-2gb") from the FreeBSD 13 x64 option that Vultr ships (which currently deploys as 13.5-RELEASE-p11). Both report hypervisor "KVMKVMKVM". The first I have not modified beyond changing the root password and configuring additional SSH authorized_keys. The second I upgraded to 14.4-RELEASE-p2 via freebsd-update(8).

By the time I had even finished launching the second host and tried to log in, sshd on the first had terminated. I restarted it before proceeding. By the time I had finished upgrading the second host, sshd had already failed again on the first.

Writing this about three hours later, the host deployed as 13 and upgraded to 14 has not yet failed. That is consistent with the other FreeBSD 14 hosts I have at Vultr (mostly in the "New Jersey" location) that were originally deployed as FreeBSD 10 or 12 and have been upgraded incrementally ever since.

The only other trick up my sleeve is to snapshot one of the still-working hosts that started life as 10 or 12 and launch a new instance from the snapshot. That will reveal whether the hypervisor changes and, after some time, whether the likelihood of an sshd failure is better or worse than on a freshly deployed instance.
 