Problem with acme.sh - cannot renew or issue Let's Encrypt certificates

Anybody having problems with acme.sh?


I have had acme.sh / Let's Encrypt running for a couple of years now without any issues, until now.
The last successful certificate renewal was August 1st on one server and August 9th on a second server. Now the renewal does not work.
Issuing a new certificate does not work either.


Both servers run:
FreeBSD 13.2,
acme.sh version 3.0.7 running standalone mode.
No webservers involved.


The error I am seeing is:
Code:
[Wed Nov 29 09:43:53 CET 2023] Please refer to https://curl.haxx.se/libcurl/c/libcurl-errors.html for error code: 7
[Wed Nov 29 09:43:53 CET 2023] Here is the curl dump log:
[Wed Nov 29 09:43:53 CET 2023] == Info:   Trying x.x.x.x:80...
== Info: Immediate connect fail for x.x.x.x: Connection refused
== Info: Failed to connect to myserver port 80 after 163 ms: Couldn't connect to server
== Info: Closing connection

Which should indicate that port 80 is blocked. Except the port is wide open, which I verified by running ssh through port 80. No connection issues whatsoever.

However, running tcpdump on port 80 on the servers while acme.sh is attempting a renewal, it does seem like the standalone server is not accepting connections.
The connection attempt from Let's Encrypt is simply reset:

Code:
10:38:10.746319 IP ec2-3-145-182-97.us-east-2.compute.amazonaws.com.54614 > myserver.http: Flags [S], seq 2716805116, win 62727, options [mss 1460,sackOK,TS val 1384473520 ecr 0,nop,wscale 7], length 0
10:38:10.746363 IP myserver.http > ec2-3-145-182-97.us-east-2.compute.amazonaws.com.54614: Flags [R.], seq 0, ack 2716805117, win 0, length 0
10:38:11.066744 IP outbound1h.letsencrypt.org.39051 > myserver.http: Flags [S], seq 1773033676, win 64240, options [mss 1436,sackOK,TS val 3355672768 ecr 0,nop,wscale 7], length 0
10:38:11.066791 IP myserver.http > outbound1h.letsencrypt.org.39051: Flags [R.], seq 0, ack 1773033677, win 0, length 0



I have the exact same situation on two different FreeBSD servers at very different network locations, but a Linux server with the same version of acme.sh does not have any issue at all.

Did I miss something important?
 
Which should indicate that port 80 is blocked.
"Blocked" is typically a firewall issue, but then you usually get a "connection timed out" (no response at all on the initial SYN). Connection refused means it received a RST in response to a SYN. This usually happens when the port is 'closed', i.e. there's no service listening. Your tcpdump also shows this. Your server responds with a RST.
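As a rough illustration of that difference, here is a minimal sh sketch that maps curl's exit status to the two cases. The function names and the 5 second connect timeout are my own assumptions; exit code 7 (failed to connect, which is what the log above shows) and 28 (operation timed out) are curl's documented exit codes.

```shell
# Map a curl exit status to a rough diagnosis (hypothetical helper).
classify_curl_exit() {
    case "$1" in
        0)  echo "connected" ;;
        7)  echo "connect-failed" ;;  # e.g. a RST came back: nothing listening, or a firewall that rejects
        28) echo "timed-out" ;;       # no answer at all: typically a firewall that drops
        *)  echo "other" ;;
    esac
}

# Probe http://<host>/ and print the diagnosis. $1 is the host name.
probe_http() {
    rc=0
    curl --silent --output /dev/null --connect-timeout 5 "http://$1/" || rc=$?
    classify_curl_exit "$rc"
}
```

Running probe_http against the server from an outside host should print connect-failed for a closed port and timed-out for a silently dropped one.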
 
It seems it can't attach to port 80 - is anything else already listening on that port?
Otherwise I'd check the firewall settings.


TBH I never used standalone or nginx-mode in production because they always were a bit wonky and error-prone for me...
I've been using nothing but dnsapi for several years now and the only hiccups were when letsencrypt switched to acme-v2 api (and I may have forgotten to update one or the other host...). Been using DNS-mode with cloudflare, digitalocean, vultr and now bunny.net and all 'just worked'™
 
you can reset instead of deny with ipfw which will fake a closed port
PF too.
Code:
     set block-policy
           The block-policy option sets the default behaviour for the packet
           block action:

           drop      Packet is silently dropped.
           return    A TCP RST is returned for blocked TCP packets, an ICMP
                     UNREACHABLE is returned for blocked UDP packets, and all
                     other packets are silently dropped.

But, as with IPFW, it requires specific configuration; the default action is to drop (as is the case with pretty much all firewalls). That's why I said "usually", not "always".
 
Thanks for all the good replies. Really appreciated.
During testing I disabled the firewall and confirmed, by running ssh over port 80, that the connection gets through.
Nothing is using port 80, confirmed with sockstat.
I also tried running sockstat every second to see if acme.sh starts listening at some point, but I did not see anything.
It looks like acme.sh is not listening on port 80 or something is preventing it. If this was a RHEL server I would be looking at SELinux.
 
wild guess: curl had all kinds of weird issues for me after/since the switch to OpenSSL3 as default. security/acme.sh can also be built against wget for its http(s) capabilities. maybe worth a try, even if only to verify if it's a bug/regression with current curl?
 
Nothing is using port 80, confirmed with sockstat.
And that's why you get a connection refused. Letsencrypt tries to connect to your webserver to fetch the response for its challenge. This is to verify you are indeed the owner of the website.

It looks like acme.sh is not listening on port 80 or something is preventing it.
It's not supposed to. It's your website that should be listening here. With the HTTP challenge/response Letsencrypt will try to fetch a file from the /.well-known/acme-challenge/ path on your website.
 
I know certbot has such a mode, didn't know acme.sh did too. I thought it was as barebones as it could possibly be.

I have an nginx running specifically for this. My HAProxy configuration simply sends all URL requests for /.well-known/acme-challenge/ to it. I need those certificates on HAProxy for SSL termination.
 
it's actually very capable with lots of popular dns plugins (for wildcard certs) (listens/does stuff via socat i think)
 
i would replace socat with a shell script which logs arguments and then invokes socat.bin (realsocat) with the same args
and then investigate that
or you can run sh -x acme.sh but that can be quite spammy and difficult to debug
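
A minimal sketch of such a wrapper, assuming the real binary was renamed to /usr/local/bin/socat.bin and the log goes to /tmp/socat-wrapper.log (both paths, the variable names and the function name are made up for illustration). The important detail is the quoted "$@": it hands every argument through unchanged, whereas an unquoted $* or $@ would split any argument containing spaces (the SYSTEM: address may well be one) and socat would see more arguments than acme.sh actually passed.

```shell
#!/bin/sh
# Hypothetical logging wrapper: record the argument list, then run the real socat.
log_and_run() {
    real_bin=${REAL_SOCAT:-/usr/local/bin/socat.bin}          # assumed location of the renamed binary
    log_file=${SOCAT_WRAPPER_LOG:-/tmp/socat-wrapper.log}    # assumed log location
    {
        printf '%s argc=%d\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$#"
        i=0
        for arg in "$@"; do
            i=$((i + 1))
            printf '  argv[%d]=%s\n' "$i" "$arg"
        done
    } >>"$log_file"
    # Quoted "$@" keeps each original argument intact, spaces included.
    "$real_bin" "$@"
}
```

Saved in place of socat, the file would end with a single line calling log_and_run "$@". The log then shows exactly how many arguments acme.sh passed and what they were.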
 
acme.sh can do pretty much everything certbot can - but as pure shell and hence without a ton of python dependencies or sudo and very easily extensible. Been using it for exactly those reasons as I don't have python or sudo (I'm using doas) installed anywhere unless absolutely necessary...

It uses either the curl (default) or wget libraries for http(s) handling (and iirc socat for the socket stuff), and I've seen other curl errors (usually tls-related) from it shortly after I switched my poudriere builds to OpenSSL3.0. Sometimes rebuilding it fixed the errors, but it seems I also have switched to wget at some time, as this is the current setting for the installed package on 2 servers I just checked.
 
It's not the connection to Letsencrypt that's failing, it's the connection from Letsencrypt. So wget/curl/fetch is not the issue here. As there's nothing running on port 80, I would guess the socat "server" isn't running and/or failing to start.
 
I think we are zooming in on something. Following covacat's advice I replaced socat with a shell script that calls the real binary (renamed to REALsocat) and got this error:
2023/11/29 13:10:16 REALsocat[94164] E exactly 2 addresses required (there are 15); use option "-h" for help

Also socat was recently updated to 1.8.0.0
Hm....



I also started to look into sko's advice and use dnsapi instead.
 
It's not the connection to Letsencrypt that's failing, it's the connection from Letsencrypt. So wget/curl/fetch is not the issue here. As there's nothing running on port 80, I would guess the socat "server" isn't running and/or failing to start.

I thought those are *client side* generated errors? But of course - socat is the part that handles the listening socket... TBH I never really looked at how the ACME-sausage is made behind the curtains as long as it worked...

Is acme run by root or by the acme user (that may not have rights to open sockets <1024)?

This is the socat command acme.sh is using for setting up a socket:
Code:
socat -4 TCP-LISTEN:80,crlf,reuseaddr,fork SYSTEM:
Can you try this as either root or the acme user and see if it proceeds and opens a socket (sockstat -l | grep socat)?
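
To watch for the listener while a renewal is running, a small helper like the following may be handy. It is only a sketch: the function name is hypothetical, and it assumes the usual sockstat column layout in which the sixth column of each row is the local address (for example *:80).

```shell
# Succeeds if sockstat-style output on stdin contains a row whose
# local address (6th column) ends in ":<port>". The header row is skipped.
has_listener_on_port() {
    awk -v port="$1" '
        NR > 1 && $6 ~ (":" port "$") { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# Possible use during a renewal attempt (poll once per second):
#   while :; do
#       sockstat -4 -l -P tcp | has_listener_on_port 80 && echo "port 80 is listening"
#       sleep 1
#   done
```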

you could also try adding ',bind=<public-facing-ip>' to the options before "SYSTEM:", this can be set via Le_Localaddress in the config if it turns out socat needs it, either due to some special interface config or different behaviour/regression in version 1.8.0.0. The systems I currently have at hand all have socat 1.7.4.4.
 
Code:
socat -4 TCP-LISTEN:80,crlf,reuseaddr,fork SYSTEM:
Does open a socket on port 80, and yes, this is all run as root.

Also, it has been working for a very long time now, so I wonder what has changed. curl is still using OpenSSL 1.1.1, acme.sh is the same version. socat has been updated and so has curl.
I will take a moment and consider my options. If I can make it work, I think I will prefer dnsapi. That will get rid of socat, curl, wget, standalone and whatnot, making it all much simpler and less vulnerable.
 
Do you see any incoming connection attempt *at all* during an attempted renewal?

If not: please double-check your DNS records...

A colleague came to me a few weeks ago with a similar issue where it turned out he moved the domain to a completely different registrar and failed to mention it. He also didn't move all (any) records to the new nameserver, so even standalone or nginx-mode failed...
 
TBH I never really looked at how the ACME-sausage is made behind the curtains as long as it worked...
The general gist of it is that ACME generates a challenge and it needs to check the response. It can do this through HTTP (call to /.well-known/acme-challenge/<some random file>) or by querying a DNS record. This 'proves' you have control of the common name in the certificate. It's to prevent people requesting certificates for domains they have no control over (like google.com for example).
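
To make the two lookups concrete, here is a small sketch that prints where the CA looks for each challenge type. The function names are placeholders I picked; the /.well-known/acme-challenge/ path and the _acme-challenge label are the ones defined by the ACME spec (RFC 8555).

```shell
# HTTP-01: the CA fetches this URL over plain HTTP on port 80.
# $1 = domain name, $2 = challenge token
acme_http01_url() {
    printf 'http://%s/.well-known/acme-challenge/%s\n' "$1" "$2"
}

# DNS-01: the CA queries a TXT record at this name.
# $1 = domain name
acme_dns01_name() {
    printf '_acme-challenge.%s\n' "$1"
}
```

During a renewal attempt, fetching the output of acme_http01_url with curl from an outside host, or looking up the output of acme_dns01_name with host -t TXT, shows roughly what the CA sees.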
 
SirDice The basic principle is clear - I meant more what's going on in terms of what is glued together on the client (or server) side to make it work, e.g. in the case of acme.sh socat and whatever handles the rest of the generation of the challenge and handing it over to the requesting LE-server (if it's not a webserver). The dns-mode IMHO is as simple and clear as it gets, so maybe that's why I like that the most (and it always 'just worked')

covacat sorry, I totally forgot/missed that log excerpt.
I'd try socat 1.7.4.4 from quarterly ports to make sure this isn't just a regression (or changed behavior).

I just tried the standalone mode on a freshly installed 13.2-RELEASE server and intentionally didn't allow port 80 in my pf.conf at first. The corresponding error message by acme.sh is:
Code:
[Wed Nov 29 15:07:19 CET 2023] new.server.tld :Verify error:38.242.204.130: Fetching http://new.server.tld/.well-known/acme-challenge/Ryf4QrnZTT2wwv5dgReV0awPuS9bC-Mg2Zukq68NhHQ: Timeout during connect (likely firewall problem)
[Wed Nov 29 15:07:19 CET 2023] Please add '--debug' or '--log' to check more details.
[Wed Nov 29 15:07:19 CET 2023] See: https://github.com/acmesh-official/acme.sh/wiki/How-to-debug-acme.sh

After allowing port 80 in pf.conf the process completes and I get the new cert...

So the error OP is seeing is most likely *not* a firewall issue (?) but with socat.
socat version on that host is 1.7.4.4 (from quarterly; I never use latest on servers...)


I also had a curl error at first (error code 60), but that was due to ca_root_nss not being installed (and curl somehow still doesn't want to use the certstore from the base system...)


edit: also tried with latest acme.sh (3.0.7) instead of the one from quarterly (3.0.6), which also works - so my bet is still on socat.

edit2:
installed socat from latest (1.8.0.0) and acme 3.0.6 as well as 3.0.7 can still successfully get a cert via standalone mode (used different subdomains for testing to always run the full validation process).
 
I really appreciate all the input here - thanks a lot.
Just did a rollback of socat and what do you know? Now it works!

So at least in my setup socat-1.7.4.4 works together with acme.sh, socat 1.8.0.0 does not.

curl is still version 8.4.0

Now I may spend some time figuring out what has changed.
 
Just confirmed this on the second FreeBSD server:

socat 1.8.0.0 installed
renew certificates - fails

downgrade socat to socat-1.7.4.4.pkg
renew certificates - works like a charm
install socat-1.8.0.0.pkg
renew certificates - works like a charm

So while we do not have an answer as to what the real issue was, I do have a workaround.

Now looking into using dnsapi....
 