I have an odd error that's a bit hard to explain, but I'll do my best. Please feel free to ask if additional info is needed.
We have a fully updated FreeBSD 11.2 server. On that server are 3 FreeBSD 10.4 jails. All 3 jails take care of accounts and authentication for other sites/services.
Jail 1 has apache and postgresql and people use it to manage their accounts.
Jail 2 has openldap 2.4 installed and is a master LDAP for jail 3.
Jail 3 has openldap 2.4 installed, is the main authentication server for other sites and services, and is an LDAP slave of jail 2.
The way it all works is jail 1 has a script that will sync any account changes from the postgresql server to the LDAP server on jail 2 which then syncs to the LDAP server on jail 3 using LDAP sync replication.
This has all worked fine for years until a few weeks ago, when I completely re-set up the jails using FreeBSD 11.2 because 10.4 is EOL. Everything has been set up the same: same configurations, same software versions, etc. The only difference between the old and new jails is the FreeBSD version. However, now whenever there is a large sync of user data, one or both of the LDAP servers will crash with the error "BDB0060 PANIC: fatal region error detected; run recovery". Initially, I thought it might be a corrupt account that was added, but every time it has crashed I've checked the last synced account and there's nothing wrong with it. It also crashes at a different point each time, so I've pretty much ruled that out as the issue.
I've tried everything I can think of but I can't figure out the problem, so I've now rolled back to the old jails and everything is running fine: no crashes when syncing and, as a matter of fact, the syncing is probably twice as fast. On the new jails it was taking about 45 minutes to sync 12,400 accounts, but on the old jails it only takes around 20 minutes.
The fact that the old jails work fine and the new ones don't is very odd because, as I mentioned before, everything is set up exactly the same between the old FreeBSD jails and the new ones, with the exception of the FreeBSD version. It's almost as though the LDAP replication can't write fast enough to keep up and is crashing, but I'm not sure if that makes sense or not.
I did not put this system together so I can't answer exactly as to why it's set up the way it is or any super technical details about it. But I have been managing the jails for a long time and have upgraded them from FreeBSD 9.x to 10.x with no issues and am only just now having this issue when going from FreeBSD 10.4 to 11.2.
Does that all make sense? Can anyone think of any reason why this might be happening? Any help would be very much appreciated.
Thank you!