OpenSSL almost 10x faster than LibreSSL?

Hi all,

I’m investigating some slow VPN speeds on my router, and I’m trying to make sense of what I’m seeing. Non-VPN’d traffic can hit >1gb/s through the router, so I know it’s not a throughput problem.

This got me investigating crypto performance, and on all my machines, I’ve found that LibreSSL from ports is significantly slower than /usr/bin/openssl:

OpenSSL
Code:
❯ /usr/bin/openssl speed -elapsed -evp aes-128-cbc
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-128-cbc for 3s on 16 size blocks: 39194832 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 64 size blocks: 14976402 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 256 size blocks: 4691478 aes-128-cbc's in 3.10s
Doing aes-128-cbc for 3s on 1024 size blocks: 1198333 aes-128-cbc's in 3.02s
Doing aes-128-cbc for 3s on 8192 size blocks: 152902 aes-128-cbc's in 3.03s
OpenSSL 1.0.2k-freebsd  26 Jan 2017
built on: date not available
options:bn(64,64) rc4(16x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: clang
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc     209039.10k   319496.58k   387230.10k   406911.67k   413220.02k
LibreSSL
Code:
❯ /usr/local/bin/openssl speed -elapsed -evp aes-128-cbc
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-128-cbc for 3s on 16 size blocks: 9712514 aes-128-cbc's in 3.09s
Doing aes-128-cbc for 3s on 64 size blocks: 2658097 aes-128-cbc's in 3.04s
Doing aes-128-cbc for 3s on 256 size blocks: 683993 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 1024 size blocks: 173575 aes-128-cbc's in 3.02s
Doing aes-128-cbc for 3s on 8192 size blocks: 21912 aes-128-cbc's in 3.03s
LibreSSL 2.6.4
built on: date not available
options:bn(64,64) rc4(ptr,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: information not available
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc      50230.38k    55977.20k    58367.40k    58939.95k    59217.52k
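For reference, the ratio between the two tables can be worked out directly from the figures quoted above. This is only a sketch using awk; `ratio` is a made-up helper name, and the inputs are copied from the 16-byte and 8192-byte columns.

```shell
# ratio A B: print A divided by B to two decimals (hypothetical helper).
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# OpenSSL vs LibreSSL throughput (1000s of bytes/s) from the tables above.
ratio 209039.10 50230.38   # 16-byte blocks
ratio 413220.02 59217.52   # 8192-byte blocks
```

With these figures the gap is roughly 4x on 16-byte blocks and roughly 7x on 8192-byte blocks.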
Right now OpenVPN is compiled against LibreSSL, and I’m not able to get more than 60mb/sec through it. Would it be worth it to recompile my system against OpenSSL and see if performance is better?

Does anyone know if LibreSSL uses hardware crypto offloading?
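
One thing that can be checked up front is whether the CPU advertises AES-NI at all. The sketch below assumes a FreeBSD box, where the boot messages in /var/run/dmesg.boot list CPU features; `aesni_status` is a hypothetical helper name. It only tells you what the CPU offers, not whether a given LibreSSL or OpenSSL build actually uses it.

```shell
# aesni_status: read CPU feature text on stdin and report whether the
# AESNI flag is listed (hypothetical helper, not part of any tool).
aesni_status() {
    if grep -qw 'AESNI'; then
        echo "AES-NI advertised"
    else
        echo "AES-NI not listed"
    fi
}

# Assumption: FreeBSD keeps the boot messages, including the CPU
# Features2 line, in /var/run/dmesg.boot.
if [ -r /var/run/dmesg.boot ]; then
    aesni_status < /var/run/dmesg.boot
fi
```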
 
Performance is better with OpenSSL. I can't point you to the article where I read this, but we confirmed it for ourselves a while back. Not 10x slower, but decidedly slower.
 
The infosec site has an article about this. They say that the memory sanitization in LibreSSL slows it down. The developers treated this area as a high priority during their code review, so they dropped the OpenSSL memory management altogether and replaced it with a sanitizing manager that is slower because it relies on slower OS facilities.

They say that LibreSSL explicitly zeros out memory using OpenBSD's explicit_bzero function, or a wrapper for it on other operating systems.
 
Thanks all, glad I'm at least not going crazy!

I just recompiled everything to use OpenSSL from base and I'm still stuck routing at only 60mb/sec through OpenVPN, so I guess the hunt for the slowdown continues.
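
To put the 60mb/sec figure next to the benchmark numbers from the first post, the `openssl speed` results (reported in 1000s of bytes per second) can be converted to megabits per second. A quick sketch; `to_mbit` is a made-up helper name, and the inputs are the 8192-byte figures quoted earlier.

```shell
# to_mbit K: convert an `openssl speed` figure (1000s of bytes/s)
# to megabits per second (hypothetical helper).
to_mbit() {
    awk -v k="$1" 'BEGIN { printf "%.2f\n", k * 1000 * 8 / 1000000 }'
}

to_mbit 59217.52    # LibreSSL, 8192-byte blocks
to_mbit 413220.02   # OpenSSL, 8192-byte blocks
```

Whichever unit the 60 is in, the OpenSSL figure sits far above it, which fits the observation that switching libraries did not help.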

Does anyone have numbers comparing IPSec tunnels to OpenVPN? Initial searches make it sound like IPSec is faster. Are there any tests I can do before sinking the time into setting it up?
 
At least on mips I've found mbed TLS to be faster than OpenSSL when using OpenVPN. mbed TLS also supports AES-NI so it might be worth a try.

IPSec is going to be faster than OpenVPN unless you do something wrong :)
 
Even the LibreSSL version is pretty fast, so it's easy to imagine that a different bottleneck is dragging the speed down from that point. BTW: explicitly zeroing memory is a good thing, and worth the speed cut.

diizzy: +1 on the mbed TLS. Since it was developed for the embedded world, it has much less code and is easier to configure; the same effort gets you further with it than it would with the others. I haven't done any performance tests though. Which mips board?
 