C OpenSSL questions

If I use SSL_read_ex on an open connection with a client, the program blocks until the client writes something. This means I sort of have to guess when there will be a request and use read only there. But sometimes I want to be able to tell whether the client has anything to ask before I send anything. SSL_peek blocks the same as SSL_read, and SSL_has_pending seems to give 0 every time, seeming to indicate that a read function has to already be deployed in order to know if there is any buffered data from the client to begin with. The underlying BIO object is nonblocking, but I don't think that matters, I think the blocking behaviour is particular to each function in this case.

Does this make any sense? Does anybody have experience with this? Did I write this all wrong, or is it expected behaviour and there is some other way to check if the client wants to write before blocking with SSL_read?

Google is not friendly to OpenSSL questions, always giving the same generic "how to check certificate chain" results, and my scouring of the library reference has yielded nothing on this yet.
 
The traditional way to perform an asynchronous read(2) on a socket is to use fcntl(2) with the F_SETFL command to set the O_NONBLOCK flag on the socket. With that flag set, a read(2) call that has nothing to read returns -1 with errno set to EAGAIN, which means call read(2) again "some time in the future" to see if there is data available. Otherwise read(2) returns an ssize_t holding the actual number of bytes read from the socket.
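A minimal sketch of the fcntl(2)/EAGAIN pattern described above, on a plain descriptor. The helper names `set_nonblocking` and `try_read` and their return codes are made up for illustration; only fcntl(2), read(2) and the errno values are standard.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: add O_NONBLOCK to fd while keeping its other
 * status flags. Returns 0 on success, -1 on failure (errno is set). */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: one non-blocking read attempt.
 * Returns >0 for bytes read, 0 on EOF, -1 if nothing is available yet
 * (EAGAIN/EWOULDBLOCK), -2 on any other error. */
ssize_t try_read(int fd, void *buf, size_t len)
{
    assert(buf != NULL);
    ssize_t n = read(fd, buf, len);
    if (n >= 0)
        return n;
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return -1;
    return -2;
}
```

Reading the current flags with F_GETFL first is a design choice: passing only O_NONBLOCK to F_SETFL would clear any other status flags already set on the descriptor.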

In the OpenSSL world - you can actually just "get" the socket descriptor from the OpenSSL library and use either poll(2) or older style select(2) to see if the socket descriptor is "ready-to-read". Use the OpenSSL SSL_get_fd() call to get the "raw" file descriptor so you can do this.

You can view the FreeBSD manual pages for (each) of the above FreeBSD system function calls with:

shell$ man 2 read
shell$ man 2 poll

Etc.. etc...

What you want to do is to write a poll() loop (or) a select() loop -- so your code can run inside a thread (pthread(3)?) or fork(2) process or even just your main() function -- and occasionally check for socket I/O. You can use the same poll(2) logic to check the file descriptor for "ready-to-write" and ERROR issues as well.
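As a rough sketch of the "check for socket I/O" step, here is a poll(2) wrapper plus a read that only happens when the descriptor is ready. The names `wait_readable` and `read_if_ready` and their return codes are hypothetical. With OpenSSL the descriptor would come from SSL_get_fd() as mentioned above, and a readable socket only means the TLS layer has bytes to process, not necessarily that application data is ready.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for fd to become readable.
 * Returns 1 if a read will not block (data, hang-up or socket error),
 * 0 on timeout, -1 on poll failure or an invalid descriptor. */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int rc;

    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc == -1 && errno == EINTR);

    if (rc <= 0)
        return rc;                 /* 0: timeout, -1: error */
    if (pfd.revents & POLLNVAL)
        return -1;                 /* fd is not open */
    return 1;
}

/* Hypothetical helper: poll first, then read once if something is there.
 * Returns bytes read (0 means EOF), -1 on error, -2 if nothing arrived
 * within timeout_ms. */
ssize_t read_if_ready(int fd, void *buf, size_t len, int timeout_ms)
{
    assert(buf != NULL);
    int ready = wait_readable(fd, timeout_ms);
    if (ready < 0)
        return -1;
    if (ready == 0)
        return -2;
    return read(fd, buf, len);
}
```

A timeout of 0 turns this into a pure "is there anything waiting?" check that never blocks.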

This logic can likely also be done with SSL_poll(3).

This same coding logic also works on Linux, Mac and Windows (unless you are in a WinSock world, then you will need to bend it slightly).

Hope that helps!
 
It helps a bunch! Thank you!

I had seen the SSL_poll() reference in the OpenSSL reference (which is word-for-word the same as the man pages) and was really hoping it wasn't that, as I was getting lost in the morass. You laying out the logic made it much clearer and it now feels feasible.

The OpenSSL specs constantly refer to the asynchronous mechanism you describe, but I believe I have set my BIO object to be non blocking, and I still can't get it not to block. I'm probably doing that part wrong. I create the listening BIO with BIO_set_nbio_accept(mybio, 1), and the "1" flag supposedly makes it non-blocking. But maybe when I pop it with BIO_pop and create the final client BIO object ( clientbio=BIO_pop(mybio)), it somehow loses that property. I'll have to look into that, it would be great if I could get it properly non-blocking instead of finding a work around.

This adventure did reveal a logical error in the preceding code, which once fixed will allow me to continue in my hacky fashion of "guessing" what and when the client will ask.

I will iron out the basic logic, then figure out how to make it non blocking (on the SSL level, using SSL_get_fd if I can't), and then implement threads or forks as a final step. I feel this way it has the best chance of being a robust system. I was dead set on threads, but somebody sold me on forks because then you aren't tied to a single physical cpu and the program becomes more scalable. We'll see about that, plenty of problems to solve in the interim.

Thanks again, very valuable help.
 
The OpenSSL specs constantly refer to the asynchronous mechanism you describe, but I believe I have set my BIO object to be non blocking, and I still can't get it not to block. I'm probably doing that part wrong. I create the listening BIO with BIO_set_nbio_accept(mybio, 1), and the "1" flag supposedly makes it non-blocking. But maybe when I pop it with BIO_pop and create the final client BIO object ( clientbio=BIO_pop(mybio)), it somehow loses that property. I'll have to look into that, it would be great if I could get it properly non-blocking instead of finding a work around.
I seem to remember (in the past) using the SSL_get_fd() call because I was facing a similar problem like you are describing (aka my OpenSSL file descriptor was not asynchronous/non-blocking or similar). I also felt like the OpenSSL library should "just support this" and tried for awhile to look for an "in OpenSSL library" solution to the problem - but ultimately I did not find a good solution "in library". There might be a better solution out there on a web page or similar?

I was dead set on threads, but somebody sold me on forks because then you aren't tied to a single physical cpu and the program becomes more scalable.

The FreeBSD scheduler will schedule (any) program thread that is "ready-to-run" on to (any) available CPU/hyper-thread on your machine. So (by default) threads are not tied to a specific CPU/hyper-thread. Because... if it didn't work that way it would make your entire process "single threaded" or "time shared" :cool: . Of course you will need to use mutex locks to keep 2x (or many) threads from stepping on each other while they are running simultaneously in your program. But.. that's the fun of programming with threads!
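To make the mutex point concrete, here is a small sketch with pthread(3): several threads increment one shared counter, and the lock keeps the increments from stepping on each other. The struct, the function names and the loop count are all invented for the example, and on some systems the program has to be linked with -lpthread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Shared state: a counter plus the mutex that guards it. */
struct counter {
    pthread_mutex_t lock;
    long value;
};

/* Thread body: bump the shared counter 100000 times under the lock. */
static void *add_many(void *arg)
{
    struct counter *c = arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&c->lock);
        c->value++;
        pthread_mutex_unlock(&c->lock);
    }
    return NULL;
}

/* Hypothetical demo: start up to 16 workers, wait for them all,
 * and return the final counter value (-1 if the mutex cannot be set up). */
long run_workers(int nthreads)
{
    pthread_t tid[16];
    struct counter c;
    int started = 0;

    assert(nthreads >= 0 && nthreads <= 16);
    c.value = 0;
    if (pthread_mutex_init(&c.lock, NULL) != 0)
        return -1;

    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tid[started], NULL, add_many, &c) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    pthread_mutex_destroy(&c.lock);
    return c.value;
}
```

Without the lock/unlock pair the final value would usually come out lower than expected, because `value++` is a read-modify-write that two threads can interleave.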

VERY early implementations of "thread libraries" were in fact "time shared" ! But that was a long time ago during the Clone Wars era -- during The Old Republic.

Threads (pthread(3), etc) use multiple CPUs and hyper-threading - BUT you can actually "lock" a running thread in your program to run on a very specific CPU hyper-thread using thread affinity -- which is (very cool !) for certain coding situations. This means that whenever the thread is "scheduled" to run on FreeBSD the thread will always run on the CPU/hyper-thread you scheduled it for. If you have like 24 CPUs with 2 hyper-threads per CPU or similar you can divide all the "work" you want to do among the available CPUs/hyper-threads. The FreeBSD way to lock a thread to a specific CPU/hyper-thread is to use affinity(3). Threads are very "light weight" meaning they don't take up a lot of resources when you create them and run them in your code - they are also called light weight processes (LWP). You can also schedule certain threads in your code to have a higher or lower thread priority -- which means they can be scheduled to run (relative to how BUSY your FreeBSD machine is) more or less frequently by the FreeBSD scheduler.

fork(2) is completely fine as well - but unless you apply an "affinity" to the forked process the forked process will be automatically "scheduled" by FreeBSD on to (any) available CPU/hyper-thread when the forked/child process is "ready to run". Because fork(2) used to be "heavy weight" when it was called (aka it copied the parent process's pointers, file descriptors, and even all of the parent process's "data" into the child process), forked processes are generally considered to be heavy weight. fork(2) has since been optimized so it is less resource intensive when it is called. On FreeBSD systems you can also consider using vfork(2) which is MUCH more optimized when your source code is going to immediately call one of the exec family of calls (execve(2), execvp(3), execle(3), etc).

If you want to "really" learn threads I recommend the O'Reilly Pthreads book -- which seems "old" now!? :-) But a quick check of Amazon shows 4+ stars.

For socket processing the books by Dr. Stevens are very good. Dr. Stevens passed away many years ago, but everyone still reads his books even today to learn socket programming.

And if you want to REALLY understand FreeBSD and the *BSD operating systems in general - pick up Marshall Kirk McKusick's book - The Design and Implementation of the 4.4 BSD Operating System. This book may seem dated - but "a large chunk" of the things you read in the book you will find in "today's" FreeBSD implementation (and the other BSDs - Ghost, NET2, etc).
 
Note that using select() or poll() on the underlying sockets behind SSL is a little tricky because SSL sometimes needs to write *as part* of a read, or read *as part* of a write, in order for its behind-the-scenes protocol work (handshakes, renegotiation and the like) to take place.

Basically if you poll waiting for "can I read yet?" you find it will never come until you write. You will see this manifest as

SSL_ERROR_WANT_WRITE and SSL_ERROR_WANT_READ returned from SSL_get_error(3).

I would focus on non-blocking sockets (with SSL) first before thinking about adding threads. The Beej Networking guide is a good reference, in particular the part on Blocking. I typically start with a busy loop server processing the sockets (whilst glowing read hot). I then retroactively fit in the select() to wait for an actionable socket. Possibly not the best for design but allows for an incremental approach of implementation.

(Then once you are serving ~3000 clients nicely and need to support magnitudes more, then chuck in thread pools and design a good verification test suite ;))
 
Sorry I was neglecting this thread a little. I need to learn not to post on off topic forums. Why must experience teach us nothing!

CShell your advice is worth its weight in gold as usual. I believe I will be acquiring the thread and socket books. The McKusick book I already have, can't remember which edition, and it is what sparked off this whole journey. The beauty is that as I learn things, I am able to place them into a wider architectural framework in my mind that McKusick lays out with extreme elegance.

If threads aren't locked into a specific CPU, which I now feel dumb for just "accepting" without checking because it sounded right, then to me a web server using threads is a no brainer. Every process cycle and every byte used counts!

I would focus on non-blocking sockets (with SSL) first before thinking about adding threads. The Beej Networking guide is a good reference, in particular the part on Blocking. I typically start with a busy loop server processing the sockets (whilst glowing read hot). I then retroactively fit in the select() to wait for an actionable socket. Possibly not the best for design but allows for an incremental approach of implementation.

This is actually how I was planning to implement. If I have non-blocking powers, and I design the flow well, then threads should be fairly easily, or at least fairly optimally and robustly, implemented later on.

---

For the moment, I did solve the blocking problem. CShell is right, it's not entirely possible to do it with only the OpenSSL library. This is how I solved it, including the SSL side of things as much as possible just to be sure:

The listening socket is created with BIO_set_nbio_accept(serverbio, 1), then I create a second bio with the same command, BIO_set_nbio_accept(clientbio, 1), and only then do I pop the server bio into the client bio: clientbio=BIO_pop(serverbio). The "1" flag on both bio objects supposedly signifies making them non-blocking, but maybe it only makes them non-blocking-friendly, because this alone does not create a non-blocking socket.

After, having created the ssl object with ssl=SSL_new(ctx) and transferred ownership of the bio object to it using SSL_set_bio(ssl, clientbio, clientbio), presumably having created a non-blocking-friendly ssl connection, but before accepting a connection with SSL_accept(ssl), I summon the local socket with localsocket=SSL_get_fd(ssl) and make it non-blocking with fcntl(localsocket, F_SETFL, O_NONBLOCK).

Then now everything is nice and non-blocking as it should be. Of course I had subconsciously programmed everything to work with blocking sockets, so now I'm segfaulting everywhere. But the heavy lifting is done, now to tidy up the workshop.

Impossible without help from you guys, thanks infinitely.
 
But the heavy lifting is done, now to tidy up the workshop.

Which, of course, was unwarrantedly optimistic.

I'm starting to see the problem here. Sometimes it's not even a problem with the SSL layer doing funky things, but simply the internet connection taking time to deliver a write. I can't just programmatically check whether the client wants to write anything after every time I write, because the answer may be "it's in the mail," and meanwhile my program just rolled right on.

---

You know what? I was right the first time. That whole lag theory is bs. If you listen after every server write, you will catch the client writes you need.

You just have to be thorough.

I guess that's a large part of where the SSL buffer voodoo comes in, foreseeing these very scenarios.

---

(I should add, for kpedersen's peace of mind, that I did do a fair bit of playing around with vanilla sockets before starting with SSL.)

---

No, no, there was lag. I just had it printing so much diagnostic information that it acted as a time delay.

I'm going back to the lab now, sorry to mumble out loud. Thanks for the help, will report with concrete results.

---

Well fruck it, (as in tire fruck), a small time delay after each write won't barely be noticeable per client, and it wouldn't give any additive lag over the system. I think that is what I am going to implement. Write, read (FOR X AMOUNT OF TIME), write, etc. The client will have that thread (when I turn it into threads) all to himself anyway, not the end of the world if there is some milliseconds lag at each exchange that he, or she, will never notice which will never spill over from the thread..

---

Only a few moments' thought will reveal several holes in that idea.

What I need is to start implementing ASYNC jobs.

---

The answer was so simple.

while(it refuses to write)
    try to read;

I don't know if it's optimal in the wider scheme of things, but it works.

I don't care about burning up CPU. The donkey is there for to haul the cart.
 
After, having created the ssl object with ssl=SSL_new(ctx) and transferred ownership of the bio object to it using SSL_set_bio(ssl, clientbio, clientbio), presumably having created a non-blocking-friendly ssl connection, but before accepting a connection with SSL_accept(ssl), I summon the local socket with localsocket=SSL_get_fd(ssl) and make it non-blocking with fcntl(localsocket, F_SETFL, O_NONBLOCK).

Interesting ! Nice work. Ya I can't remember if that's what I did a long time ago.. but it sure sounds/reads familiar!
 
My desire for simple answers was predictably misguided. It's just a new paradigm that I had to adapt my code to. But I think I have now done that.

Interesting ! Nice work. Ya I can't remember if that's what I did a long time ago.. but it sure sounds/reads familiar!

Yeah I think you just hold your nose, jump into the local system just as briefly as possible, then jump back to OpenSSL hoping you didn't interfere with its obscure machinations.

---

By the way, the problem I was having will be embarrassing to admit to anybody who has bothered about serving webpages without prebuilt server software, and probably most who haven't: I was getting favicon requests in unpredictable patterns. I didn't understand why I suddenly was getting writes from the client when I had programmed nothing to warrant them. It's now accounted for and I can now safely predict the pipeline like I had assumed you should be able to.

I don't think non-blocking eliminates the need to know what's going to happen, it just prevents the possibility of things suddenly clogging to a stop and also probably makes things a bit faster. It lets you multitask, it doesn't multitask for you, let's say.

About threads and polling: I have come to realize that what we have been discussing re: polling I am addressing with the OpenSSL pop mechanism. I am not at a level of expertise or need to design a multithreaded system that atomizes every transaction. What I am designing, and what I want and think will be enough for now, is to have as many cash registers open as possible for as many clients to be processed simultaneously as possible, but then each client to be served in the traditional step by step scan items - give price - take payment - bag items - etc predictable linear fashion. My plan is basically to just launch a new thread every time a connection is accepted and pop into it with the BIO_pop() function. Then have a bunch of thread-local (what was it you called those) variables to service each client in the traditional way with a full stack cash register process.

This will make my first foray into threads a lot simpler. I think it will also be pretty fast, frankly. And as things progress and my understanding grows, I can decide how much fancier I want to get with it. The idea, for example, of individual threads dealing with individual aspects of the "client checkout" or even non-client-specific processes. Sky's the limit, or maybe not, for now I'll consider it a success if I can serve each page in a unique thread.
 
All of these answers seemed a bit work-aroundy to me, and OpenSSL is too tight of a system for that to make sense. I have since found less janky solutions to some of my concerns, and here are some:

The first thing to understand is that there are two ways to work with SSL connections: using the SSL_ routines, and using BIO_f_ssl routines. The second one is, in theory, a wrapper around the first. However, I have found that this is a reasonable assessment: the SSL_ routines are good ready-to-go solutions to simple tasks. But if you want any kind of serious control over proceedings, you need the BIO routines. BIO is just the status quo for this library, and using it will embed you much further into the power and flexibility OpenSSL has to offer.

So, first: can you set an SSL connection to be nonblocking using purely OpenSSL? With SSL_ routines, the answer is no. You have to jump out, implement nonblocking locally, and jump back in. But, as it turns out, if you use the BIO_set_nbio_accept(*BIO, 1) routine mentioned earlier with a BIO_f_ssl paradigm, the answer is yes. It works out of the box and everything is nonblocking. This is just one example of how OpenSSL just understands BIOs better.

Second: Can you know if the client wants to write, without using a blocking connection, before attempting to read? Again, with SSL_, no. With BIO_, yes. Of course, with SSL_, you just initialize the connection and voila. But with BIO, you have to create a buffer bio, push it into a bio chain with your server bio (BIO_push(buffer_bio, bio_to_be_popped_from_accept_connection_bio)), and then you can check that buffer at any time (BIO_pending(popped_bio)), and it will tell you if the client wants to write. This is less straightforward to achieve than it might sound, but it exists and works.

It also gives you a lot more power on how you read. SSL_read[_ex]() works a lot like read(), so it will just give arbitrarily long strings with linebreaks and such included. This is usually best. But with a TLS connection, the workflow is much more malleable if you read a line at a time, which you can do with BIO_ routines (you can also still do BIO_read[_ex] which is equivalent to SSL_read[_ex]).

You can't tweak things as they happen the way you would locally, because there are just too many layers that the library is negotiating and working with, and too much is, by necessity, "hardcoded." So the ideal approach is to learn the dark BIO object arts, as they have a good amount of the functionality you would try to achieve with small tweaks as "hardcoded" functions.

Thank you all for allowing me to bore you.
 
Thank you all for allowing me to bore you.

LOL - Sorry if I am kind of quiet right now, the holidays are sneaking up and taking up a lot of my time. Just how many holiday parties can be planned in a short holiday time slot? We'll see ! :-)

I also upgraded one of my machines to FreeBSD 15, and I'm actually testing with that machine right now.

The first thing to understand is that there are two ways to work with SSL connections: using the SSL_ routines, and using BIO_f_ssl routines. The second one is, in theory, a wrapper around the first. However, I have found that this is a reasonable assessment: the SSL_ routines are good ready-to-go solutions to simple tasks. But if you want any kind of serious control over proceedings, you need the BIO routines. BIO is just the status quo for this library, and using it will embed you much further into the power and flexibility OpenSSL has to offer.

Ya that was always interesting - they did wrap the more traditional SSL_ routines with BIO -- and then left it to the reader to figure out which one worked better. I (think?) the idea was that BIO would be an easier way for the programmer to make sense of the (ton) of SSL_ calls. BIO obviously speeds up OpenSSL library adoption by programmers and makes creating their own OpenSSL product a lot easier to change going forward - versioning their product and so forth.

Second: Can you know if the client wants to write, without using a blocking connection, before attempting to read? Again, with SSL_, no. With BIO_, yes. Of course, with SSL_, you just initialize the connection and voila. But with BIO, you have to create a buffer bio, push it into a bio chain with your server bio (BIO_push(buffer_bio, bio_to_be_popped_from_accept_connection_bio)), and then you can check that buffer at any time (BIO_pending(popped_bio)), and it will tell you if the client wants to write. This is less straightforward to achieve than it might sound, but it exists and works.

I am an old school socket programmer so I am used to a lot of what you are relating with read(2) / write(2), etc. kpedersen made the right comment about "not getting in the way" of what SSL does behind the scenes on your socket and not interfering in the SSL handshakes, cipher synchronization, MACs, etc going on over the open socket.

That said you should still be able to "safely" use poll(2) to watch your descriptors for (READ, WRITE, ERROR) activity and then make a BIO call if you want to drive it that way.

It also gives you a lot more power on how you read. SSL_read[_ex]() works a lot like read(), so it will just give arbitrarily long strings with linebreaks and such included. This is usually best. But with a TLS connection, the workflow is much more malleable if you read a line at a time, which you can do with BIO_ routines (you can also still do BIO_read[_ex] which is equivalent to SSL_read[_ex]).

The SSL_read you are describing is called a BUFFERED read. BUFFERED reads are the most efficient reads you can do because you want to dequeue "as much data as possible" from your connection so that the (inbound/outbound) socket buffer is empty (or close to empty) and the other end is not held up. If your socket buffers "backup" then TCP will stall the sender until you catch up (and a UDP socket would start dropping inbound data). That's obviously not desired :-). But it is also OK to enter a "tight read(2) loop" and just keep dequeuing data until EOF from the read(2)/BIO is reached using multiple read calls. I generally create an unsigned char readBuffer[8192] (or similar) and just keep reading data and processing it until EOF.
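A minimal sketch of that tight read loop on a plain, blocking descriptor: read into an 8192-byte buffer until EOF and hand every chunk on, here by writing it to a second descriptor. `write_all` and `copy_until_eof` are hypothetical names. On a non-blocking descriptor EAGAIN would need its own handling, which this sketch simply treats as an error.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: write all len bytes, retrying on short writes. */
static int write_all(int fd, const unsigned char *p, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: copy everything from in_fd to out_fd until EOF.
 * Returns the total number of bytes copied, or -1 on error. */
long copy_until_eof(int in_fd, int out_fd)
{
    unsigned char readBuffer[8192];
    long total = 0;

    assert(in_fd >= 0 && out_fd >= 0);
    for (;;) {
        ssize_t n = read(in_fd, readBuffer, sizeof readBuffer);
        if (n == 0)
            return total;          /* EOF: the writer closed its end */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        /* Use the count from read(2); the data may contain NUL bytes. */
        if (write_all(out_fd, readBuffer, (size_t)n) != 0)
            return -1;
        total += n;
    }
}
```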

BUT you will have an easier time with this processing if you go to pthread(3)'s and just let each thread do it's own reading.

OF NOTE - You should check out how excellent FreeBSD 15 thread performance is (over Linux). That was a big surprise when reviewing the latest performance data from Phoronix.
 
Thanks again for your invaluable advice.

I (think?) the idea was that BIO would be an easier way for the programmer to make sense of the (ton) of SSL_ calls. BIO obviously speeds up OpenSSL library adoption by programmers and makes creating their own OpenSSL product a lot easier to change going forward - versioning their product and so forth.

I'm definitely missing something. The conclusion I had reached was the opposite. When I scroll through the docs, I find more BIO_ calls than SSL_ ones, and I also find more diverse functionality, and also harder to work with concepts. My feeling was that the library natively does things with a BIO object paradigm, and SSL_ calls are sort of shortcuts. What's even more interesting is that I have not been able to get them to work concurrently in certain cases (BIO_read() or BIO_get_line() and SSL_read_ex, for example). Maybe what I missed is that, being more direct, SSL_ calls actually allow you to do a lot of what you would want outside the library, and only use SSL_ calls as kind of final tunnels? While using BIO objects presumes you will stay within the library and thus make things more noob-friendly? Specially considering another thing you wrote:

That said you should still be able to "safely" use poll(2) to watch your descriptors for (READ, WRITE, ERROR) activity and then make a BIO call if you want to drive it that way.

So probably I just have a lot of learning still to do. It makes sense, now that I think about it, because using BIO_f_ssl, I can never get it to spit the local socket back at me, while with SSL_ calls, I can.

Decisionsss.

This is somewhat off topic. I beg your forgiveness in advance.

If you're writing something new from scratch, why not look at libtls? It was specifically designed to avoid the pitfalls and weirdness of Openssl. Note that using libtls does not force you to switch to Libressl. There's security/libretls which uses Openssl under the covers.

Reading this presently. I am invested enough into OpenSSL that a shift would hurt, but not so much that it wouldn't be worth it for a library that feels more like a hammer and less like a hammer robot.
 
This is somewhat off topic. I beg your forgiveness in advance.

If you're writing something new from scratch, why not look at libtls? It was specifically designed to avoid the pitfalls and weirdness of Openssl. Note that using libtls does not force you to switch to Libressl. There's security/libretls which uses Openssl under the covers.

So I am looking through this, as well as assorted man pages, and it looks very nice, but:

I am almost embarrassed to admit this, because when you can't find a solution anywhere it often means you are doing it very wrong, but there is one thing that I absolutely need and neither SSL_ calls nor (that I have found in my cursory searches) libtls do: read and write calls that can handle binary data, such as fread() as opposed to read(). I spent many hours trying to figure out how to circumvent ignored null bytes in SSL_read_ex, and finally the only way I found was using BIO_gets.

Am I really missing something big and obvious? Do people just not send very much binary data over TLS connections?

For now, until the situation clarifies, I will have to continue using OpenSSL.
 
Second: Can you know if the client wants to write, without using a blocking connection, before attempting to read? Again, with SSL_, no. With BIO_, yes. Of course, with SSL_, you just initialize the connection and voila. But with BIO, you have to create a buffer bio, push it into a bio chain with your server bio (BIO_push(buffer_bio, bio_to_be_popped_from_accept_connection_bio)), and then you can check that buffer at any time (BIO_pending(popped_bio)), and it will tell you if the client wants to write. This is less straightforward to achieve than it might sound, but it exists and works.

This is, of course, completely false. I just hadn't noticed it was. You still need to set the non-blocking attribute locally.

The problem is that it is impossible (at least to me) to find out exactly which socket the BIO_f_ssl is using.

What I concluded is this: that I have been having too much respect for the obscure machinations. I am just going to create the socket locally, and then use the set socket call for BIO objects. I can't see why it shouldn't work.

At this point, the only reason not to go back to SSL_ calls or even the libtls library is transmission of binary data. If anybody has ideas, they are welcome. I can't believe it should be such a complicated task as I am finding it to be. Transmission of binary data is commonplace over the internet with https connections.
 
What I concluded is this: that I have been having too much respect for the obscure machinations. I am just going to create the socket locally, and then use the set socket call for BIO objects. I can't see why it shouldn't work.

So "unfortunately" in programming you (can) indeed reach this conclusion - because (1x API) relies upon (another API) -- in this case BIO relying on SSL_ -- and yes, you (can't see) the gritty details going on in the baseline API because the details are hidden from you on purpose. The idea is that BIO is supposed to simplify your coding so you don't have to call (as many?) SSL_ routines. Sometimes a calling API works out fine and other times the calling API can be difficult to reconcile.

A lot of the time you will "eventually" find out later that the wrapper API was "right" -- but it just wasn't obvious why it was right when you first went to code it.

At this point, the only reason not to go back to SSL_ calls or even the libtls library is transmission of binary data. If anybody has ideas, they are welcome. I can't believe it should be such a complicated task as I am finding it to be. Transmission of binary data is commonplace over the internet with https connections.

Binary data works fine? You can (always) pass binary data through standard read(2)/write(2), etc calls. Where you might get into trouble is with calls that contain the word "gets()" in them. That (sometimes/mostly) means those function calls are reading data in "line format" where each line received over the socket is expected to be terminated with a carriage return and/or line feed ("\r\n"). Unix lines end in "\n", DOS/Windows lines end in "\r\n".

Remember - when processing multiple read(2) calls -- you have to (concatenate) all the data you read from (each) read call together to get the original binary data back again. For example if your client sends 32 bytes of data -- and your read(2) call reads with a buffer total size of (16 bytes) - then you would need to use 2x read(2) calls on the "reader" end of the socket to read 2x - 16 byte buffers from the socket -- and then append ALL OF the read data together to make a total of (32 bytes) -- which recreates the original 32 bytes sent from the client again.
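The 32-byte example can be sketched like this: each read(2) lands directly behind the bytes already collected, and the running count (not a string function) says how much data there is. `read_exact` and `CHUNK` are hypothetical names. Anything that treats the buffer as a C string (strlen(), printf("%s"), fputs(), strcpy()) stops at the first NUL byte, which is one common way for null bytes to appear "dropped"; whether that is what happened in this thread is only a guess. The same rule applies to the byte count that SSL_read_ex() stores in its readbytes argument.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

enum { CHUNK = 16 };   /* deliberately small, to mirror the 2 x 16 example */

/* Hypothetical helper: collect exactly `want` bytes from fd into out,
 * asking for at most CHUNK bytes per read(2) call. Returns the number of
 * bytes stored, which is less than `want` only on EOF or error. */
size_t read_exact(int fd, unsigned char *out, size_t want)
{
    size_t have = 0;

    assert(out != NULL);
    while (have < want) {
        size_t ask = want - have < CHUNK ? want - have : CHUNK;
        ssize_t n = read(fd, out + have, ask);   /* append after old data */
        if (n > 0)
            have += (size_t)n;
        else if (n == 0)
            break;                               /* EOF */
        else if (errno != EINTR)
            break;                               /* real error */
    }
    return have;
}
```

When writing the result to a file, the matching call would be fwrite(out, 1, have, fp) or write(2) with the same count, so that embedded zero bytes are written out like any other byte.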

Remember you can also translate your binary data into ASCII if you want to work with ASCII data -- either while testing or even in production. You can actually do this while "testing" your software so that you can make sure you are getting all of the right data and the sent/received data is in the right order. BASE64 encoding will translate any binary data you want to send into printable ASCII -- and later you can "decode" the BASE64 encoding back to pure binary bits again. This will GREATLY simplify your testing and makes it easier to spot bugs and issues you need to fix.
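For illustration, a small self-contained Base64 encoder that could be used to print received binary data while testing. `base64_encode` is a hypothetical helper written for this example; OpenSSL also ships its own encoder (EVP_EncodeBlock()), which would be the more natural choice in a program that already links the library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: Base64-encode len bytes from src into dst and
 * NUL-terminate the result. dst must have room for
 * 4 * ((len + 2) / 3) + 1 bytes. Returns the number of characters
 * written, not counting the terminator. */
size_t base64_encode(const unsigned char *src, size_t len, char *dst)
{
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    size_t o = 0;

    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < len; i += 3) {
        size_t left = len - i;

        /* Pack up to three input bytes into one 24-bit group. */
        unsigned long v = (unsigned long)src[i] << 16;
        if (left > 1)
            v |= (unsigned long)src[i + 1] << 8;
        if (left > 2)
            v |= src[i + 2];

        /* Emit four 6-bit symbols, padding a short final group with '='. */
        dst[o++] = tbl[(v >> 18) & 63];
        dst[o++] = tbl[(v >> 12) & 63];
        dst[o++] = left > 1 ? tbl[(v >> 6) & 63] : '=';
        dst[o++] = left > 2 ? tbl[v & 63] : '=';
    }
    dst[o] = '\0';
    return o;
}
```

Three zero bytes encode to "AAAA" and the two bytes of "hi" encode to "aGk=", so runs of null bytes stay visible in the printed output instead of vanishing.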

You might also want to get comfortable using GNU gdb(1) going forward. You can set "break points" in your executing code and print/display variables/ etc and see what is "really" going on. It will also help you (later) when you are debugging what's going on in threaded code.
 
Binary data works fine? You can (always) pass binary data through standard read(2)/write(2), etc calls.

It took my slow head a while, but between writing my last post and coming to write this one, I was able to piece together your advice with my experiences, and realized this was the way. I was intimidated by the SSL intermediary operations, but it turns out you really can do normal local operations on the socket that OpenSSL is using. So what I did was what I would have done locally: fopen() the socket, then fread() from it, which I already know always gives binary data the way I like it. And it worked.

I am still getting some dropped null bytes, but my guess is that this is happening at the edges of each call. It doesn't sound insurmountable. The way I was glueing it all together was just fwrite()ing it to a file. This had also been working with SSL_read_ex, but in that case all sections with strings of null bytes got compressed into a single null byte.

You might also want to get comfortable using GNU gdb(1) going forward. You can set "break points" in your executing code and print/display variables/ etc and see what is "really" going on. It will also help you (later) when you are debugging what's going on in threaded code.

I guess I really should. So far I just include random printf() calls to see on the terminal if such and such happened. Maybe that is very caveman-y. I will look into this.
 
I guess I really should. So far I just include random printf() calls to see on the terminal if such and such happened. Maybe that is very caveman-y. I will look into this.

LOL - That's not caveman-y -- WE ALL use printf()'s :-).

The issue is that once code gets complex you can use gdb(1) to take a look at the execution "stack" and see what's going on. You may or may not be able to put a printf() where you want it -- but you can usually set a "break point" in your code and inspect the stack.
 
I was intimidated by the SSL intermediary operations, but it turns out you really can do normal local operations on the socket that OpenSSL is using. So what I did was what I would have done locally: fdopen() the socket, then fread() from it, which I already know always gives binary data the way I like it. And it worked.

This is good! BUT -- if you really want to make sure the data you are sending and receiving is "encrypted" over the network, be sure to verify the client/server traffic going back and forth using wireshark/ethereal or tcpdump(1).

As a programmer - once you get into the middle of the OpenSSL API calls (aka with your own socket calls) you can accidentally open a security hole and that might not be what you want.
 
Looking around [a], [b], it seems there is more of a push to use the BIO API for the entire socket lifespan (i.e. BIO_new_connect) than I typically saw previously (i.e. creating and establishing the socket completely and then using SSL_set_fd). Ultimately there isn't too much difference in what it is doing down below.

However, just a note on sending via BIO and SSL: unless you create your own custom BIO, it doesn't expose the send(2) flags to you, specifically MSG_NOSIGNAL. This means that if there is a connection failure during an SSL_write (or, because of the encryption pump, an SSL_read), the SIGPIPE signal will be raised and, by default, your program will terminate. As a workaround you may want to ignore the signal globally (i.e. signal(SIGPIPE, SIG_IGN)).

In short, if you are getting weird crashes, try the above. This quirk catches us out regularly.
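
The workaround above can be sketched as follows. The helper names ignore_sigpipe and write_checked are made up for illustration, and the example uses plain write(2) on a descriptor rather than SSL_write, on the assumption that the underlying failure looks the same to the process:

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: ignore SIGPIPE process-wide so that a write to a
 * closed connection fails with EPIPE instead of killing the program.
 * Returns 0 on success, -1 on failure. Call it once, early in main(). */
int ignore_sigpipe(void)
{
    return signal(SIGPIPE, SIG_IGN) == SIG_ERR ? -1 : 0;
}

/* Hypothetical helper: write and report a dead peer as an ordinary error.
 * errno is preserved across the diagnostic so the caller can still test it. */
ssize_t write_checked(int fd, const void *buf, size_t len)
{
    ssize_t n;

    assert(buf != NULL);
    n = write(fd, buf, len);
    if (n < 0 && errno == EPIPE) {
        int saved = errno;
        fprintf(stderr, "peer has closed the connection\n");
        errno = saved;
    }
    return n;
}
```

With the signal ignored, the broken connection shows up as an error return that your code can handle, rather than as a silent termination.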
 
It's frustrating because, if you use the SSL_ calls, everything works beautifully, but you cannot receive binary. If you use direct manipulation of the local socket, then things don't ever fully work, and it's hard to tell why, but one knows it has to be some missing hook along the chain involved in the OpenSSL.

If you do all of this without the encryption, it is very easy.

Maybe I have to go back to the drawing board. My initial mentality was that it would be like working with sockets but with OpenSSL wrappers. Probably the truth is that it already becomes simply a different paradigm, and I have to design it from scratch with that in mind.

I already had a pretty developed code base for this little project that worked marvelously. One tiny clink with reading binary, and suddenly all of it is useless. It probably means I was doing it structurally wrong anyway.

Well then. Enough b****ing. Back to verk.

Thanks for the warnings, they will be incorporated.
 
For cringed out observers:

My initial assumption, that if there is only one read call in the SSL_ calls then it must transmit binary, was correct.

I had somehow coded the thing to do it all wrong. Human error.

It works.

Thanks to everyone here for their infinite patience, and the abundance of important information.

---

Just to state it clearly on the record: SSL_read_ex (one must assume that SSL_read also) faithfully transmits data byte-for-byte, including null bytes. If you are not finding this to be the case, there is an error in the logic of your code.
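
One common way such a logic error creeps in (offered as an assumption, not as a diagnosis of the code in this thread) is treating the received buffer as a C string, so that everything after the first null byte is ignored. The sketch below contrasts the two; save_bytes and save_as_string are hypothetical names:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write exactly `len` bytes to `out`, NUL bytes included.
 * `len` should be the count reported by the read call (for SSL_read_ex, the
 * value it stores through its size_t *readbytes argument). */
size_t save_bytes(FILE *out, const unsigned char *buf, size_t len)
{
    assert(out != NULL && buf != NULL);
    return fwrite(buf, 1, len, out);
}

/* For contrast: the string-based version stops at the first NUL byte,
 * because strlen() cannot see past it. `buf` must be NUL-terminated. */
size_t save_as_string(FILE *out, const unsigned char *buf)
{
    assert(out != NULL && buf != NULL);
    return fwrite(buf, 1, strlen((const char *)buf), out);
}
```

The same applies to printf("%s"), fputs(), strcpy() and friends: for binary data, carry the byte count around with the buffer and never derive the length from the contents.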

---

It might seem like a weird thing to look out for, and I don't find any forums anywhere where people talk about it. But I remember when I was experimenting with just local sockets, I had found that read() shaved off null bytes, while fread() did not. Maybe it was also a logical error, but the only thing I changed for it to work was using fread() in place of read() and adding a fdopen() and fclose(), so I think it's a thing. And I think I did run into some stack overflow post about it. So that's why this is where my mind went when I started getting corrupted data with OpenSSL.
 