Trigger an interrupt when the value of a memory location is modified in FreeBSD

_martin · Aug 23, 2022

Does that mean you did run it in a debug environment but could not trigger the issue because the conditions (traffic) that cause it were missing?

Maybe one more angle to look at it: what if the arr[] in the code is the victim of a different issue? Is arr[] a global variable? Are there any other variables defined in the same scope as arr[]? Print their addresses to see which ones end up next to each other. Could the variable next to it be the problem? Such as an off-by-one in a string operation, etc.
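
A minimal sketch of that address check. The `struct state` layout and its member names are made up for illustration (the real structure is not shown in this thread); the idea is only to print where each neighbour starts and how far an overflow would have to run to reach a given element.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical layout: a small buffer sitting right before an array of pointers. */
struct state {
    char  name[8];
    void *arr[3];
    int   flags;
};

/* Print where each member lives so the neighbours of arr[] stand out. */
static void print_layout(const struct state *s)
{
    printf("name   @ %p (%zu bytes)\n", (const void *)s->name, sizeof s->name);
    for (size_t i = 0; i < sizeof s->arr / sizeof s->arr[0]; i++)
        printf("arr[%zu] @ %p\n", i, (const void *)&s->arr[i]);
    printf("flags  @ %p\n", (const void *)&s->flags);
}

/* Bytes from the start of `name` to arr[idx]: a write through `name` at
 * this offset (or beyond) lands in that element. */
static size_t bytes_from_name_to_arr(size_t idx)
{
    return offsetof(struct state, arr) + idx * sizeof(void *)
         - offsetof(struct state, name);
}
```

If only one element in the middle is hit, a linear overflow from a neighbour is less likely than a stray pointer or index, but the layout still tells you which writers are close enough to be suspects.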

_martin · Aug 23, 2022

Btw. you started to get the information from the handler yourself but for some reason you hashed it out. I did a small redo on your handler to demonstrate:

Code:

void handler(int sig_num, siginfo_t *sig, void *unused) {
        // 3rd argument is the saved context that is restored on return
        ucontext_t *uc = (ucontext_t *)unused;
        // mc_rip is an integer register value, so cast it for %p
        printf("returning to: %p\n", (void *)uc->uc_mcontext.mc_rip);

        // or let's be evil ..
        uc->uc_mcontext.mc_rip = 0xcafec0de;

        if (cnt == 10) {
                printf("let's just end this..\n");
                _exit(1);
        }

        cnt++;
        printf("signum: %d, count: %lu\n", sig_num, cnt);

        if (cnt == 5) {
                mprotect(ptr, size, PROT_READ | PROT_WRITE);
        }
}

Also if you read the siginfo(3) man page you'll see this:

In addition, the following signal-specific information is available:
..
..
SIGSEGV si_addr address of faulting memory reference

Which means the meaning of sig->si_addr changes depending on the signal.

In my handler above I'm modifying the saved rip to jump somewhere else. Normally one would use setjmp() and friends for cleaner code but this is just a demonstration.
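
For comparison, a sketch of the setjmp() style mentioned above, using sigsetjmp()/siglongjmp(). The function names (`fault_guard_init`, `try_write`, `demo_readonly_store`) are made up for this example, and it assumes a single-threaded caller; it is not the code from the original test program.

```c
#define _DEFAULT_SOURCE          /* assumption: exposes MAP_ANON on Linux/glibc */
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf fault_jmp;
static volatile sig_atomic_t fault_armed;

static void fault_handler(int sig)
{
    if (fault_armed)
        siglongjmp(fault_jmp, 1);   /* unwind out of the faulting store */
    signal(sig, SIG_DFL);           /* unrelated fault: re-fault and crash normally */
}

/* Install the handler for SIGSEGV and SIGBUS. Returns 0 on success. */
static int fault_guard_init(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = fault_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;
    return sigaction(SIGBUS, &sa, NULL);
}

/* Try *p = val. Returns 0 if the store went through, -1 if it faulted. */
static int try_write(volatile int *p, int val)
{
    if (sigsetjmp(fault_jmp, 1) != 0) {   /* 1: also restore the signal mask */
        fault_armed = 0;
        return -1;
    }
    fault_armed = 1;
    *p = val;
    fault_armed = 0;
    return 0;
}

/* Demo: a store into a freshly mapped read-only page is reported, not fatal. */
static int demo_readonly_store(void)
{
    int *page = mmap(NULL, (size_t)sysconf(_SC_PAGESIZE), PROT_READ,
                     MAP_PRIVATE | MAP_ANON, -1, 0);
    if (page == MAP_FAILED)
        return 1;                   /* could not set up the demo */
    return try_write(page, 42);
}
```

The trade-off against patching the saved rip: the jump target is a known point in your own code, but the faulting store is skipped rather than retried.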

bakul · Aug 23, 2022

breakpoint - related to code. You can always replace the instruction at a breakpoint with something that will cause a trap and allow a debugger to take control, and when you continue or single step, it will put back the original instruction & continue.

watchpoint - related to data. You can (indirectly via mprotect) change the protection of a page to cause a trap but this is not precise enough, as it covers the whole page. This is why Intel added 4 watchpoint address registers (DR0-DR3). The h/w must watch every data access and when it matches one of these addresses it does what the associated control bits say.

See /usr/include/x86/reg.h for the defn of __dbreg64 (or 32) & https://en.wikipedia.org/wiki/X86_debug_register
I think you will have to use the ptrace(2) syscall to interface with this. There may be some online tutorial that shows how to use it.
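
To make the "control bits" concrete, here is a small sketch that computes the DR7 value enabling one of the four watchpoints. The enum and function names are made up for this example (they are not the macros from reg.h); the bit positions follow the x86 debug register layout described on the Wikipedia page above. Loading the result into a process still has to go through the kernel, e.g. ptrace(2) from a separate tracing process, which is not shown here.

```c
#include <stdint.h>

/* Condition field (R/W): what kind of access triggers the watchpoint. */
enum dr_rw  { DR_RW_EXEC = 0, DR_RW_WRITE = 1, DR_RW_RDWR = 3 };
/* Length field (LEN): size of the watched location; note 8 bytes is encoded as 2. */
enum dr_len { DR_LEN_1 = 0, DR_LEN_2 = 1, DR_LEN_8 = 2, DR_LEN_4 = 3 };

/* DR7 bits that enable watchpoint `idx` (0..3, address goes into DR<idx>). */
static uint64_t dr7_bits(unsigned idx, enum dr_rw rw, enum dr_len len)
{
    uint64_t v = 0;
    v |= 1ULL << (idx * 2);                  /* L<idx>: local enable bit   */
    v |= (uint64_t)rw  << (16 + idx * 4);    /* R/W<idx>: access condition */
    v |= (uint64_t)len << (18 + idx * 4);    /* LEN<idx>: watched length   */
    return v;
}
```

For example, `dr7_bits(0, DR_RW_WRITE, DR_LEN_4)` describes "trap on any write to the 4 bytes at the address in DR0". The watched address has to be aligned to the chosen length.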

bakul · Aug 24, 2022

Avk said:
In production environment we are not allowed to run using gdb as correctly mentioned by reddy.

You will end up wasting a lot of time as you will have to implement a bunch of things that gdb already provides. Talk to your manager or the right muckymuck and point out the time and money cost of avoiding gdb. It is better to implement a privilege / permission system as well as do logging so that gdb is used only when absolutely required and under tight control.

_martin · Aug 24, 2022

Sometimes it makes sense to read the whole thread and then answer. All of this has already been said.

Avk · Aug 24, 2022

_martin said:
Does it mean you did run it in debug environment but you were not able to trigger the issue due to lack of the conditions (traffic) that could cause this ?

Maybe one more angle to look at it - what if the arr[] in the code is victim of the different issue? Is arr[] global variable? Are there any other variables defined in the same scope as arr[] is? Print addresses of those to see which ones end up next to each other. Could the variable next to it be a problem? Such as off-by-one in string operation,etc.

Once this issue happens, a core file is generated and the process restarts. In the local environment we are not able to generate a core file (and there is no process restart) even with traffic; that means we are not able to reproduce this locally. Maybe the kind of traffic we run for reproduction in the local environment doesn't match that of the customer environment.

Regarding arr[], it is part of a big structure and it is dynamically allocated. The parent variable is global, many structures are involved, and this array of pointers arr[] is part of one of them.

The issue always happens specifically for the arr[1] element, while arr[0] and arr[2] are intact. Hence there is little chance that some other array has overflowed into this one.

Bobi B. · Aug 25, 2022

Did you consider running a new thread that basically does

Code:

volatile uint8_t *ptr = ...;
while (true)
  if (!*ptr)
    force_core_dump();

One core will be fully utilised, but with some luck, after a couple of crashes you might be able to catch the culprit by analysing the core dump.
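
A slightly fuller sketch of the same idea with pthreads. `struct watcher` and `watcher_main` are names made up for this example, and `force_core_dump()` is taken to be abort(), which is an assumption. The `abort_on_hit` switch exists only so the loop can be tried out without killing the process.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

struct watcher {
    const volatile uint8_t *addr;   /* byte to poll */
    int abort_on_hit;               /* nonzero: abort() to get a core dump */
    atomic_int stop;                /* set to 1 to end the thread */
};

/* Thread body: spin until the watched byte reads as zero or stop is set. */
static void *watcher_main(void *arg)
{
    struct watcher *w = arg;
    while (!atomic_load(&w->stop)) {
        if (*w->addr == 0) {
            if (w->abort_on_hit)
                abort();            /* SIGABRT: core contains every thread's stack */
            return w;               /* report the hit to whoever joins */
        }
    }
    return NULL;
}
```

Start it with `pthread_create(&tid, NULL, watcher_main, &w)` once the watched location holds a valid non-zero value. The core is taken some time after the bad write, so the culprit thread may already have moved on; that is where the "some luck" comes in.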

_martin · Aug 25, 2022

Avk said:
Once this issue happen a core file is generated and process restarts.

Ok, understood. But I see you do have a short downtime of the service as the process has to be restarted. We don't know what the justification against gdb in your live environment is, but it seems you do have a case for it here.

To elaborate on my code above a bit further: the saved context (3rd param in handler) is the context the handler will use to sigreturn (i.e. return from the handler). There you can examine and possibly control the flow. But this does require you to know what the program is doing, i.e.:

Code:

./mprotect
Starting ....0x800a09700 : size = 16384
signum: 11, count: 1, returning to: 0x201cc1
something tried to write to ptr @ 0x201cc1
signum: 11, count: 2, returning to: 0x201cc1
something tried to write to ptr @ 0x201cc1
signum: 11, count: 3, returning to: 0x201cc1
something tried to write to ptr @ 0x201cc1
signum: 11, count: 4, returning to: 0x201cc1
something tried to write to ptr @ 0x201cc1
signum: 11, count: 5, returning to: 0x201cc1
something tried to write to ptr @ 0x201cc1
ptr: 0
All completed...

Let's see what that is with objdump -d mprotect:

Code:

  201c69:    48 8b 04 25 d8 3f 20     mov    rax,QWORD PTR ds:0x203fd8
  201c70:    00
  201c71:    c7 00 00 00 00 00        mov    DWORD PTR [rax],0x0

I've compiled the binary with debug symbols so I can easily verify that with readelf -Wa mprotect | grep 203fd8

Code:

    40: 0000000000203fd8     8 OBJECT  GLOBAL DEFAULT   24 ptr

and confirmed that it's the code:

Code:

        ptr[0] = 0;

From the asm output you can see the value 0 is used as an immediate, i.e. you won't find this value in the saved context.

If you are using threads you can have a race condition in the handler and you need some logic (a mutex) to control the mprotect. With this handler you should also take care of other possible causes of SIGSEGV. The simplest check would be to compare si_addr to ptr and exit if it doesn't match.

The handler I used in my code is pretty much the same, pasting for completeness:
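
One detail on that comparison: si_addr is the address of the faulting access itself, so `si_addr == ptr` only matches a write that starts at the first byte of the region. A range check is a bit more robust. The helper names below are made up for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Nonzero if `fault` lies inside [base, base + len). */
static int fault_in_range(const void *fault, const void *base, size_t len)
{
    uintptr_t f = (uintptr_t)fault, b = (uintptr_t)base;
    return f >= b && f - b < len;
}

/* Index of the array element that was hit, or -1 if the fault is outside the array. */
static long fault_element(const void *fault, const void *base,
                          size_t nelem, size_t elem_size)
{
    if (!fault_in_range(fault, base, nelem * elem_size))
        return -1;
    return (long)(((uintptr_t)fault - (uintptr_t)base) / elem_size);
}
```

In the handler this would be called as `fault_element(sig->si_addr, ptr, nelem, sizeof ptr[0])`; anything that returns -1 is some other SIGSEGV and should not be swallowed.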

Code:

void handler(int sig_num, siginfo_t *sig, void *unused) {
        cnt++;
        ucontext_t *uc = (ucontext_t *)unused;

        printf("signum: %d, count: %lu, returning to: 0x%lx\n", sig_num, cnt, uc->uc_mcontext.mc_rip);

        if (sig->si_addr == ptr) {
                printf("something tried to write to ptr @ 0x%lx\n", uc->uc_mcontext.mc_rip);
        }

        if (cnt == 5) {
                mprotect(ptr, size, PROT_READ | PROT_WRITE);
        }
}

I like this problem as an idea. I went through this thread again but didn't find the answer: why is it a problem to first just do printf debugging to see what is being written where? You know all the locations in the code where arr is used and written to. Just put a printf before each of them to print which function is writing what.

Avk · Aug 26, 2022

_martin said:
Ok, understood. But I see you do have short downtime of a service as process has to be restarted. We don't know what is the justification against gdb in your live environment but seems you do have case for it here.

This is an excellent post, though I have a few questions.
In your case it is returning to address 0x201cc1. ==> "signum: 11, count: 4, returning to: 0x201cc1"
Ideally it should point to the address of the instruction that causes the interrupt. But in your case it is address 0x201c71 that contains the instruction that does the 0 assignment.

201c71: c7 00 00 00 00 00 mov DWORD PTR [rax],0x0

In my case the return address points to exactly the position where I do the 0 assignment.

2 : signum -> 11, returning to address -> 0x401137, ptr[0] = 9
something tried to write to ptr @ 0x401137

401130: 48 8b 05 d9 05 20 00 mov 0x2005d9(%rip),%rax # 601710 <ptr>
401137: c7 00 00 00 00 00 movl $0x0,(%rax)
40113d: bf 81 12 40 00 mov $0x401281,%edi

The readelf output for my case is this:

$ readelf -Wa mprotect1 | grep 601710
13: 0000000000601710 8 OBJECT GLOBAL DEFAULT 23 ptr
85: 0000000000601710 8 OBJECT GLOBAL DEFAULT 23 ptr

What is the meaning of "23 ptr" above?
Thanks !!!

_martin · Aug 26, 2022

Avk said:
Ideally it should point to the address where it is causing the interrupt.

It's exactly that -- the instruction that caused the issue, which is why the interrupt occurred. It can't get any better than this, you have the exact address of the fault.

Addresses can differ as we're most likely using different compilers (and I have slightly different code). Your faulting address is logically the same one (assigning 0 to *ptr), it's just a different virtual address. As you're compiling the code you can choose pretty much whatever address you like (with some exceptions). If you use this Makefile

Code:

CFLAGS=-g -O0 -Wall -Wpedantic -Ttext 0xcafe000

mprotect:    mprotect.c
    clang $(CFLAGS) -o mprotect mprotect.c

clean:
    rm -f *.o mprotect

Then the faulting address will be somewhere around 0xcafe4a1.

23 is the Ndx column of the symbol table listing, i.e. the index of the section the symbol is defined in, and ptr is the symbol name. Not relevant to anything here, it's ELF specific.

Avk · Aug 28, 2022

Probably one last challenge with this approach is this:

When your signal handler returns (assuming it doesn't call exit or longjmp or something else that prevents it from actually returning), the code will continue at the point where the signal occurred, re-executing the same instruction. Since at this point the memory protection has not been changed, it will just throw the signal again, and you'll be back in your signal handler in an infinite loop.

That's the reason I used this statement in my signal handler. The signal is generated 5 times before I set read/write permission.

C:

        if (5 == cnt)
            mprotect(ptr, size, PROT_READ | PROT_WRITE);

Now the challenge is that in the field we can't block program execution by generating a signal whenever arr[] is accessed (write operation). If the signal is generated we need to unblock the write immediately, maybe like this:

C:

void handler(int sig_num, siginfo_t *sig, void *unused) {
        // inside signal handler
        ucontext_t *uc = (ucontext_t *)unused;
        cnt++;
        printf("\n\n%d : signum -> %d,  returning to address -> %p, ptr[0] = %d\n", cnt, sig_num, (void *)uc->uc_mcontext.mc_rip, ptr[0]);

        if (sig->si_addr == ptr) {
           printf("something tried to write to ptr @ 0x%lx\n", uc->uc_mcontext.mc_rip);
        }
        mprotect(ptr, size, PROT_READ | PROT_WRITE);
}

After the signal handler has executed and returned, control re-executes the same instruction (the write to arr[]) and no signal is generated this time. Once the write instruction has passed, somewhere I need to protect the memory again with mprotect(ptr, size, PROT_READ); so that the next time something tries to write to arr[], the signal handler executes again.

Maybe I should start a timer from the signal handler, so that after a few milliseconds the timer handler executes this code and enables the memory protection again?

C:

mprotect(ptr, size, PROT_READ);

I am just guessing this could be one approach; there may be a better way to achieve this.
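
For reference, one technique sometimes used instead of a timer is hardware single-stepping: the SIGSEGV handler unprotects the page and sets the x86 trap flag in the saved context, the faulting store then executes once, and the resulting SIGTRAP handler protects the page again. This is not something proposed in this thread; the sketch below assumes x86-64, a single-threaded writer, and a region length that is a multiple of the page size. All `watch_*` / `on_*` names are made up, and the non-FreeBSD register macros assume Linux/glibc.

```c
#define _GNU_SOURCE              /* assumption: needed for REG_RIP/REG_EFL on Linux/glibc */
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <ucontext.h>
#include <unistd.h>

#if !defined(__x86_64__)
#error "this sketch is x86-64 only"
#endif
#if defined(__FreeBSD__)
#define CTX_RIP(uc)    ((uc)->uc_mcontext.mc_rip)
#define CTX_RFLAGS(uc) ((uc)->uc_mcontext.mc_rflags)
#else
#define CTX_RIP(uc)    ((uc)->uc_mcontext.gregs[REG_RIP])
#define CTX_RFLAGS(uc) ((uc)->uc_mcontext.gregs[REG_EFL])
#endif
#define X86_TRAP_FLAG 0x100L     /* TF bit in RFLAGS */

static unsigned char *watch_base;
static size_t watch_len;
static volatile sig_atomic_t watch_hits;     /* number of writes observed */
static volatile uintptr_t watch_last_rip;    /* address of the last writing instruction */

/* Write fault: record it, open the page, and single-step the store. */
static void on_fault(int sig, siginfo_t *si, void *ctx)
{
    ucontext_t *uc = ctx;
    uintptr_t a = (uintptr_t)si->si_addr, b = (uintptr_t)watch_base;

    if (a < b || a - b >= watch_len) {       /* not ours: crash normally on retry */
        signal(sig, SIG_DFL);
        return;
    }
    watch_hits++;
    watch_last_rip = (uintptr_t)CTX_RIP(uc);
    mprotect(watch_base, watch_len, PROT_READ | PROT_WRITE);
    CTX_RFLAGS(uc) |= X86_TRAP_FLAG;
}

/* Trap after the single store has executed: stop stepping and re-arm. */
static void on_step(int sig, siginfo_t *si, void *ctx)
{
    ucontext_t *uc = ctx;
    (void)sig; (void)si;
    CTX_RFLAGS(uc) &= ~X86_TRAP_FLAG;
    mprotect(watch_base, watch_len, PROT_READ);
}

/* Map a read-only watched region and install the handlers. NULL on failure. */
static void *watch_create(size_t len)
{
    struct sigaction sa;
    void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE | MAP_ANON, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    watch_base = p;
    watch_len = len;
    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;
    sa.sa_sigaction = on_fault;
    sigaction(SIGSEGV, &sa, NULL);
    sigaction(SIGBUS, &sa, NULL);
    sa.sa_sigaction = on_step;
    sigaction(SIGTRAP, &sa, NULL);
    return p;
}
```

With threads there is a window between the two handlers in which the page is writable for everybody, so writes from other threads during that one instruction go unnoticed. Every watched write also costs two signal deliveries and two mprotect() calls.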

_martin · Aug 28, 2022

That timer approach would cause you headaches and be a source of serious performance degradation.

The safest thing that I can think of right now is what was actually suggested here by two people -- use a wrapper function around writes to the array. Something like:

Code:

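void wrapper_write(int *array, int pos, int val) {
    // the SIGSEGV handler has made the page writable by the time this store completes
    *(array + pos) = val;
    // re-arm the protection; array must be the page-aligned start of the region
    mprotect(array, size, PROT_READ);
}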

But if you go through the trouble of changing the code like this yet again, printf debugging will save you a lot of issues.
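
A minimal sketch of what such printf debugging could look like when combined with the wrapper idea. `arr_write_at` and `ARR_WRITE` are made-up names, and the element type is int only to match the wrapper above (the real arr[] holds pointers).

```c
#include <stddef.h>
#include <stdio.h>

/* Log who writes what, then do the write. */
static void arr_write_at(int *array, size_t pos, int val,
                         const char *file, int line, const char *func)
{
    fprintf(stderr, "%s:%d (%s): arr[%zu]: %d -> %d\n",
            file, line, func, pos, array[pos], val);
    array[pos] = val;
}

/* Call-site macro so file, line and function are filled in automatically. */
#define ARR_WRITE(array, pos, val) \
    arr_write_at((array), (pos), (val), __FILE__, __LINE__, __func__)
```

Only writes that go through the macro are logged. If arr[1] turns up zeroed with no matching log line, that by itself says the write came from somewhere that does not know it is touching arr[], e.g. a stray pointer.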

Avk · Sep 1, 2022

From the signal handler we can tell what the return address is using uc->uc_mcontext.mc_rip:

C:

void handler(int sig_num, siginfo_t *sig, void *unused) {
        // inside signal handler
        ucontext_t *uc = (ucontext_t *)unused;
        cnt++;
        printf("\n\n%d : signum -> %d,  returning to address -> %p, ptr[0] = %d\n", cnt, sig_num, (void *)uc->uc_mcontext.mc_rip, ptr[0]);

        if (sig->si_addr == ptr) {
           printf("something tried to write to ptr @ 0x%lx\n", uc->uc_mcontext.mc_rip);
        }
        mprotect(ptr, size, PROT_READ | PROT_WRITE);
}

If we know the return address, can we not fetch the whole instruction from that address and print it within the signal handler?

_martin · Sep 1, 2022

It's a PITA to decode x86 instructions. Their size is variable (up to 15B) and decoding is a challenge on its own.
You could though print the first 15B in the handler (starting at uc->uc_mcontext.mc_rip) and manually decide what to do with it.

Note you said that you have a problem with arr[1] only. So you could focus on this in the handler. Check if sig->si_addr == (ptr+1) and only then take deeper action. Or, once again, do the printf debugging first to see what is being modified when.
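
A sketch of such a byte dump that avoids printf() inside the handler and formats by hand instead. `hex_bytes` and `dump_insn` are made-up names. It assumes all 15 bytes starting at the given address are readable, which can fail if the instruction sits at the very end of a mapping.

```c
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

#define X86_MAX_INSN 15          /* longest possible x86 instruction */

/* Format n bytes as "c7 00 ..." into out; returns the string length. */
static size_t hex_bytes(const unsigned char *p, size_t n, char *out, size_t outsz)
{
    static const char digits[] = "0123456789abcdef";
    size_t o = 0;

    if (outsz == 0)
        return 0;
    for (size_t i = 0; i < n && o + 3 < outsz; i++) {
        if (i)
            out[o++] = ' ';
        out[o++] = digits[p[i] >> 4];
        out[o++] = digits[p[i] & 0xf];
    }
    out[o] = '\0';
    return o;
}

/* Write the 15 bytes at `rip` to stderr as one line of hex. */
static void dump_insn(uintptr_t rip)
{
    char line[X86_MAX_INSN * 3 + 1];
    size_t n = hex_bytes((const unsigned char *)rip, X86_MAX_INSN, line, sizeof line);

    line[n] = '\n';              /* replace the terminator for output */
    if (write(STDERR_FILENO, line, n + 1) < 0) {
        /* nothing useful to do about it inside a handler */
    }
}
```

In the handler this would be `dump_insn((uintptr_t)uc->uc_mcontext.mc_rip);`. The bytes can then be matched by hand against the objdump -d output, as done earlier in the thread.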

_martin · Sep 15, 2022

Avk: I wonder - were you able to find the bug?

Avk · Sep 16, 2022

_martin said:
Avk: I wonder - were you able to find the bug?

Not yet. This bug was kind of out of focus for the last few weeks ... but I need to actively work on it. Looks like many customers have started reporting it recently.

Avk · Sep 21, 2022

One quick question: what happens if we protect a memory region and then free it?
Something like this:
ptr = (int*)malloc(size);
mprotect(ptr, size, PROT_READ); // enable the protection

Now, instead of disabling the protection, what happens if we free the memory?
free(ptr);

Does it keep generating the interrupt whenever someone tries to access the memory from the same process?
It is possible that if we call malloc() again, the OS might hand out the same memory that was under protection.
In that case, will the memory protection remain in effect, or does freeing the memory disable it?

_martin · Sep 21, 2022

Avk said:
what happen once we protect a memory and then free the memory

It depends on whether the chunk was already written to before (and possibly on the system's jemalloc options in malloc.conf).

Generally SIGSEGV would occur as jemalloc (FreeBSD's heap allocator) is not able to write its metadata to the chunk.
Note though that mprotect() granularity is PAGE_SIZE (or even the whole region, according to mprotect(2)), while a heap chunk can be way smaller (and most likely will not start on a page boundary). It was assumed you're getting your memory region from mmap. You should not touch malloc chunks with mprotect.

Avk said:
Does it keep generating the interrupt once someone tries to access the memory from the same process ?

Yes, SIGSEGV is generated, i.e. your handler is called.

Avk said:
It is possible that if we call the malloc() again, OS might allocate the same memory which was under protection.

As mentioned above, it depends. No, if a write was done after malloc(). Yes, if you did malloc(), mprotect(), and free() without any write to ptr (and no other malloc() was done in between): the chunk would be returned to the allocator and a subsequent malloc() of a similar size would return the same pointer, now pointing to read-only memory. A subsequent write would cause SIGSEGV.
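
To illustrate the page-granularity point, a small sketch of getting a region that is safe to hand to mprotect(): page-aligned, a whole number of pages, and not shared with allocator metadata. The function names are made up for this example.

```c
#define _DEFAULT_SOURCE          /* assumption: exposes MAP_ANON on Linux/glibc */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round n up to a whole number of pages. */
static size_t round_up_to_page(size_t n)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    return (n + page - 1) / page * page;
}

/* Map at least `want` bytes; *got receives the real (page-rounded) length. */
static void *watchable_alloc(size_t want, size_t *got)
{
    size_t len = round_up_to_page(want);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANON, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    *got = len;
    return p;
}

/* Unmapping drops the pages together with whatever protection they had. */
static int watchable_free(void *p, size_t len)
{
    return munmap(p, len);
}
```

The cost is that everything else placed in those pages gets protected too, so the watched array ideally lives in pages of its own.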