C ASLR, PLT, esoteric stuff

Hullo there,
I’m writing a simple debugger and have found myself way out of my depth. A Google search on my questions only turns up dodgy black hat websites and obsolete tutorials.
I’d like to know:
1) With Address Space Layout Randomisation, obviously the traced executable and any libraries it uses could be loaded anywhere in memory. So how do I interpret DWARF info that says, for example, that “main” lives at 0x4F0? Do I add a constant to it? What library call will get me the true offset of the target program at runtime?
2) I have successfully managed to patch jmp instructions in the tracee to call libc functions indirectly via my own diagnostic functions. However I’ve hit a big problem. Because I’m patching the Procedure Linkage Table via an injected .so, the functions are only patched locally in the shared object. Calling from the tracee results in the real functions being called. I’d like to access the main program’s PLT. How can I calculate its location programmatically? I’d rather the debuggee doesn’t have to be recompiled or cooperate with this in any way, i.e. it shouldn’t need to know it’s being debugged.
And yes, I’m using Linux. I have FreeBSD installed in a virtual machine and intend to switch to that once the project has matured a bit. So Linuxy answers or better still a cross platform solution would be great.
Thanks in advance.
 
Well, FreeBSD doesn't have ASLR. But your 1) is more about PIE than ASLR. When a program is executed, the kernel chooses the address for each section (no random offset in FreeBSD), does the required parsing of the binary and then hands over to the ELF interpreter (INTERP), most likely ld-elf.so (Linux: ld.so). I suggest you take a look at the ld sources to see how the parsing is done. All of this is about FreeBSD, though the process is the same in general.
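To make the PIE point concrete, here is a minimal sketch in C, assuming a 64-bit ELF and assuming the lowest PT_LOAD segment has p_vaddr 0 (typical for a PIE, but worth checking in the program headers). The function names and the example base address are made up for illustration. For an ET_DYN file the addresses in the file (and in DWARF) are relative, so the runtime base is added; for an ET_EXEC file they are already absolute.

```c
#include <assert.h>
#include <elf.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 for ET_DYN (PIE or shared object), 0 for ET_EXEC,
 * -1 if the header is not ELF or is some other file type. */
static int elf_is_position_independent(const Elf64_Ehdr *eh)
{
    assert(eh != NULL);
    if (memcmp(eh->e_ident, ELFMAG, SELFMAG) != 0)
        return -1;
    if (eh->e_type == ET_DYN)
        return 1;
    if (eh->e_type == ET_EXEC)
        return 0;
    return -1;
}

/* Turns an address taken from the file (e.g. a DWARF low_pc) into a
 * runtime address. load_base is where the object was mapped; it is
 * only added when the file is position independent. */
static uint64_t runtime_address(const Elf64_Ehdr *eh,
                                uint64_t load_base, uint64_t file_addr)
{
    return elf_is_position_independent(eh) == 1
        ? load_base + file_addr
        : file_addr;
}
```

With a hypothetical base of 0x555555554000, a "main" that DWARF places at 0x4F0 would end up at 0x5555555544F0.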
 
Hi _martin,
(admin: I hope I'm okay accepting help, even though the rules say I shouldn't have received it?!)
Thanks for correcting my assumption that FreeBSD has ASLR. I guess I just assumed it did because of its general reputation for robustness. Thanks for the link. I prefer documentation to source generally though because it’s less prone to sudden unexpected changes. I even wrote my own DWARF parser because I found the DWARF standard documents much more readable than any source code I could find! I looked at the GDB sources in an attempt to shed light on my questions but they are labyrinthine - I had to give up.
Anyway, regarding question 1) above, I have found something like what I was looking for on Linux: the function dladdr1 has an argument called extra_info that you can use to obtain the difference between an address in the ELF file and a memory address. FreeBSD has dladdr which is similar - you can get a shared object’s base address at least. I think tinkering with the output of these functions could obtain me what I’m looking for on both platforms.
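A rough sketch of that idea in C. It uses dl_iterate_phdr instead of the dladdr1 call mentioned above, because dl_iterate_phdr exists in both glibc and FreeBSD's libc and reports the same file-to-memory difference directly as dlpi_addr. It is written against glibc's <link.h> (the ElfW macro may need adjusting elsewhere), the names find_bias_cb and load_bias_of are made up, and it only describes the process that calls it, so for a tracee it would have to run from the injected .so.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <link.h>
#include <stddef.h>
#include <stdint.h>

struct bias_query {
    uintptr_t addr;   /* runtime address we want to locate */
    uintptr_t bias;   /* load bias of the object containing it */
    int found;
};

/* Called once per loaded object; stops at the object whose PT_LOAD
 * segments cover the queried address. */
static int find_bias_cb(struct dl_phdr_info *info, size_t size, void *data)
{
    struct bias_query *q = data;
    (void)size;
    for (int i = 0; i < info->dlpi_phnum; i++) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
        if (ph->p_type != PT_LOAD)
            continue;
        uintptr_t start = (uintptr_t)info->dlpi_addr + ph->p_vaddr;
        if (q->addr >= start && q->addr < start + ph->p_memsz) {
            q->bias = (uintptr_t)info->dlpi_addr;
            q->found = 1;
            return 1;   /* non-zero ends the iteration */
        }
    }
    return 0;
}

/* Returns 1 and stores the bias if runtime_addr lies inside a loaded
 * object, 0 otherwise. File address = runtime address - bias. */
static int load_bias_of(const void *runtime_addr, uintptr_t *bias_out)
{
    assert(bias_out != NULL);
    struct bias_query q = { (uintptr_t)runtime_addr, 0, 0 };
    dl_iterate_phdr(find_bias_cb, &q);
    if (q.found)
        *bias_out = q.bias;
    return q.found;
}
```

Going the other way, a DWARF address plus the bias of the object it came from gives the address to put a breakpoint on.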
So I have a new question. This is a FreeBSD question I promise:
3) What is the FreeBSD equivalent of /proc/pid/maps? I need addresses to feed into the above-mentioned functions.
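For reference, this is roughly how one line of the Linux file can be pulled apart in C. It is a sketch that assumes the usual Linux layout "start-end perms offset dev inode path" and a 64-bit unsigned long; the struct and function names are made up, and it should not be assumed that a FreeBSD equivalent uses the same columns.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct map_entry {
    unsigned long start, end, offset;
    char perms[5];     /* e.g. "r-xp" */
    char path[256];    /* empty for anonymous mappings */
};

/* Parses one line of Linux /proc/<pid>/maps.
 * Returns 1 on success, 0 if the line does not match the layout. */
static int parse_maps_line(const char *line, struct map_entry *e)
{
    assert(line != NULL && e != NULL);
    e->path[0] = '\0';
    /* dev and inode are skipped; the path is optional */
    int n = sscanf(line, "%lx-%lx %4s %lx %*s %*s %255[^\n]",
                   &e->start, &e->end, e->perms, &e->offset, e->path);
    return n >= 4;
}

/* True if the mapping is executable, i.e. likely to hold code. */
static int is_executable_mapping(const struct map_entry *e)
{
    return strchr(e->perms, 'x') != NULL;
}
```

The mapping of the target's own file with offset 0 is normally the one whose start address serves as the base.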
 
Rigoletto is right, you'll get better answers about this topic on the official mailing lists.

Nevertheless to the question you asked: FreeBSD doesn't use proc by default, though you can mount it - see procfs(5). /proc/$$/map is what you're after.

I'm guessing you are doing some sort of emulation of the execution, because you stated in 2) that you don't want the process to know it's being debugged. It would make your task easier if you could trace the process (use the ptrace(2) syscall). You'd have control over the process and get at the addresses faster.

I'm not too happy with the answer I gave you above; I wanted to keep it short. The idea, though, was that when a process is executed it all starts in the kernel: the kernel sets up all the structures, allocates the memory addresses (mappings), does a little (required) parsing and hands execution over to the loader (userspace). The loader allocates other mappings as needed.
 
Hi _martin, Rigoletto,
Thanks for being so welcoming and taking time to answer.
I’ve fixed both issues mostly to my satisfaction now so don’t need to resort to the mailing list.
_martin I’m already throwing ptrace at the problem. I’m not actually emulating anything (I’d have to be mad to try to emulate an x86-64 CPU!) I’m writing a malloc debugger so am trying to intercept various memory-related calls. Actually I think valgrind emulates an entire CPU but that’s crazy territory. Intercepting calls will not catch all bugs (invalid reads and writes for example) but it can catch leaks and invalid frees. And I’m hoping the tracee will be able to run at near-normal speed. I should have something interesting to share quite soon.
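The bookkeeping behind "leaks and invalid frees" can be sketched like this in C. It assumes the interception layer reports every pointer returned by malloc and every pointer passed to free; the names alloc_table, on_malloc, on_free and leaked_bytes are hypothetical, and a small fixed-size array stands in for whatever structure a real tool would use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LIVE 1024

struct alloc_table {
    uintptr_t addr[MAX_LIVE];   /* addresses of live blocks */
    size_t    size[MAX_LIVE];   /* their requested sizes */
    size_t    live;             /* number of live blocks */
};

/* Record a block returned by malloc. Returns 0, or -1 if the table is full. */
static int on_malloc(struct alloc_table *t, uintptr_t p, size_t n)
{
    assert(t != NULL);
    if (p == 0)
        return 0;               /* failed malloc: nothing to track */
    if (t->live == MAX_LIVE)
        return -1;
    t->addr[t->live] = p;
    t->size[t->live] = n;
    t->live++;
    return 0;
}

/* Record a free. Returns 0 if valid, -1 for a double or invalid free. */
static int on_free(struct alloc_table *t, uintptr_t p)
{
    assert(t != NULL);
    if (p == 0)
        return 0;               /* free(NULL) is legal */
    for (size_t i = 0; i < t->live; i++) {
        if (t->addr[i] == p) {
            t->live--;          /* fill the hole with the last entry */
            t->addr[i] = t->addr[t->live];
            t->size[i] = t->size[t->live];
            return 0;
        }
    }
    return -1;
}

/* Whatever is still in the table when the tracee exits has leaked. */
static size_t leaked_bytes(const struct alloc_table *t)
{
    size_t total = 0;
    for (size_t i = 0; i < t->live; i++)
        total += t->size[i];
    return total;
}
```

Invalid reads and writes never pass through malloc or free, which is why this approach cannot see them.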
 
You're right, I should have read the flipping manual more carefully. ptrace with PT_VM_ENTRY returns all the mapped regions with minimum drama. It's my fault for assuming that ptrace is the same across different Unixes.
Now I wish I could figure out how to do the same in macOS...
 
Are you using a book for learning how to write a debugger?
Hullo helmet1080,
No, I'm not using a book - just referring to the man pages for ELF and ptrace. I learned x86 assembly language many years ago from this book:
I think it is hopelessly outdated now though.
 
Is it difficult to write a debugger for someone who knows only C programming? Should I learn assembly? What do you recommend to learn to write a toy debugger, just for development/programming practice?
 
Hi there,
You would need to know a bit of assembly language because a debugger needs to interact with the callstack which is largely hidden from the C programmer (though it helps to know it's there). I also had to study some assembly output of clang to determine how to patch functions correctly.
You'd need to study the man pages I mentioned already and also the DWARF standard at www.dwarfstd.org which is quite an intimidating document!
I'm not sure there is such a thing as a toy debugger because mine is about as simple as can be and yet it's still about 4,000 lines of code.
A toy compiler would be a better way to learn programming - much more fun, more rewarding and you can make it as simple as you like - if you want you could invent a language that just does multiply and add!
 
A friend of mine wrote a debugger as a master's degree project in computer science (called "informatics" in Europe in those days). Took him 1.5 years, in the 80s, when things were still easier, working mostly full-time. He was already a very experienced coder, with lots of experience in both low-level (assembly, instruction sets, machine architecture) and high-level computing.
 
That sounds a fun way to spend a year and a half. If I had crates of cola and Chinese food. And to get a master's for the pleasure of it OMG yes.
One thing I've learned about projects of that scale is that at some point you will no longer understand the complexity of your application (except via your intuition). The trick is to acknowledge this but plough on ahead anyway.
 