C Undefined symbol mid run of program

but still can not wrap my head around how a memory overwrite can cause a function to not be found?
A memory overwrite can cause anything. The behavior is undefined. There was a joke footnote in an old C textbook or standards document which said something like "undefined means: it could reformat your disk". Honestly, if you had reported that after overwriting memory, there were purple elephants flying around your computer, I would have been only mildly surprised.

Now a little more serious: Think about how "dynamic linking" must work. When the compiler/linker prepares the executable for your program, they typically don't actually "link" a copy of the standard libraries (such as libc, which contains all the usual string and IO functions) into your executable file on disk. Doing that would be very wasteful; you would end up with lots of executables using disk space, each for storing copies of the same functions (such as open, write, printf, strcpy, ...). And at runtime, there would be lots of copies of all these functions in memory, even more wasteful. So instead what happens is that your "executable" on disk is actually not immediately executable. Instead it contains a little stub memory area, which says something like "If we need to call printf at runtime, please look in the dynamic libc library such as /lib/libc.so, make sure one copy of it is loaded somewhere in memory, record here where it is, and then call the printf function at that address".

Now user noahbar comes and overwrites that little bit of memory which tells the dynamic linker where printf really is. You see the problem?

Anecdote from many decades ago: In my mis-spent youth (I was actually in my 30s), I got paid to program in C++ on a Windows 3.1 machine, using the Waterloo memory extender (so we had a flat 32-bit address space, none of the 640k limitations of the original MS-DOS). During development, we discovered that sometimes after running our program, you couldn't print any more, and any attempt to use the "print" command from the command line would hang the computer. Strange, isn't it? It turns out to be completely logical: At physical address zero, MS-DOS keeps an interrupt vector table, some of which is used for printing. The first dozen bytes are not actually commonly used, so you can actually write and read memory using a NULL pointer, as long as you only do a little bit of it. If you do too much of it, then the first thing that suffers is the print command, and if you try to use hundreds of bytes, things blow up and the system crashes. But the scary thing is: You can have bugs in your code where you use NULL pointers, but your code continues working so-so. One of the best things about moving to Windows NT was that suddenly, the attempt to use a NULL pointer (even for reading) immediately crashed, so finding such bugs was much faster.
 
going to fix everything mentioned, but still can not wrap my head around how a memory overwrite can cause a function to not be found? i was able to fix it by taking alain's advice

I was about to ask for that. It would be great if you could post a complete program that reproduces the function-not-found problem.
 
i do have some memory leaks /invalid write/read of size 1 in my combineStr function, I did make it realloc based on the size of the string(s) so that should be solved. It is an http server so most of the time is spent idling waiting for a request to come in so it should not be too hard to track down bugs
 
Assuming you are using nul terminated strings, never forget +1 byte of space for the nul.

As for what is going wrong @ralphbsz is spot on.

Imagine that, in memory (stack or heap, it doesn't really matter), you have something equivalent to

```
char some_buffer[16];
int *some_pointer;
```

laid out consecutively like above.

Then if you do something like

```
sprintf(some_buffer, "value: %d", some_int);
// some code
*some_pointer = 42;
```

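This will be OK for values of some_int less than 100 million and greater than -10 million: "value: " takes 7 bytes and the terminating nul takes 1, which leaves 8 bytes for the number, including any minus sign. Outside that range the sprintf will overrun 'some_buffer' and start writing to 'some_pointer'.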

Then when you write to the memory pointed to by 'some_pointer' anything can happen. You can't modify code in memory, it is read only. But you can modify the tables that the PLT uses (the GOT), which have to be writeable for lazy binding to work.

A quick explanation of the PLT (procedure linkage table). When you use shared libraries, you don't know in advance where they will get loaded into memory. For instance, if exe1 links with libfoo.so then libfoo might get loaded at 0x40100000, but exe2 might link with libbar.so AND libfoo.so and because of that load libfoo at address 0x4010f000.

On the first call to a function in a shared library, the following gets done
  • jump to the function's PLT entry, which jumps through its slot in the GOT (global offset table)
  • the slot initially leads back to the symbol resolution function in the link loader, which finds the function and modifies the GOT
  • jump to the requested function
On subsequent calls, because the GOT was updated with the address of the desired function, there is just one extra jump.

It looks to me that you are corrupting memory in one of the regions used for GOT / PLT or similar.
 
So what is in characterArr.h? Presumably there is a characterArr.c as well, since you call addToArr().

As others have said the way you have used calloc() is fundamentally a memory leak unless you can guarantee you will free the pointer after the call. Instead of calling calloc(), pass in the address and length of a buffer to receive the output, and check that you do not write more than that length to the output buffer, to prevent buffer overflow.

Fix the rest of it, eliminate the buffer overflows and memory leaks, and you will likely find that your original problem disappears. Even if this doesn't fix your original problem, you need to fix the leaks and buffer overflows first, so that you have a sane test harness to debug.

For example, the function findBetween might look something like this. This is just a sketch of course. You would probably want to add some assertions about the validity of the arguments, for example.

```
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*------------------------------------------------------------------------------
findBetween: find a string bounded by two other strings to the left and to the
right, within a larger overall string.
out must have room for outmax characters plus the terminating nul.
*/
bool findBetween(char* left, char* right, char* full, char *out, int outmax)
{
    bool rc = false;
    char *start = strstr(full, left);
    if (NULL == start) {
        // left not found
        printf("left string \"%s\" not found\n", left);
    } else {
        start += strlen(left);
        char *end = strstr(start, right);
        if (NULL == end) {
            // right not found
            printf("right string \"%s\" not found\n", right);
        } else {
            int len = (int)(end - start);
            if (len == 0) {
                // zero length string found
                printf("no characters found between \"%s\" and \"%s\"\n", left, right);
            } else {
                if (len > outmax) {
                    printf("output buffer is too small!\n");
                } else {
                    strncpy(out, start, len);
                    out[len] = '\0';   // strncpy does not add the nul here
                    rc = true;
                }
            }
        }
    }
    return rc;
}

#define MAXOUTLEN 10

int main(int argc, char *argv[])
{
    char out[MAXOUTLEN+1] = {0};
    char left[] = "left";
    char right[] = "right";
    char *full[] = {
        "",
        "left",
        "right",
        "stuffleftmiddlerightmore",
        "stuffleft1234567890rightmore",
        "stuffleft1234567890Arightmore",
        "stuffleftmrightmore",
        "stuffleftrightmore",
        "stuffleftsomethingreallylongrightmore",
    };
    int nstrings = sizeof(full) / sizeof(char *);
    int i;

    for (i = 0; i < nstrings; i++) {
        printf("test string \"%s\"\n", full[i]);
        memset(out, 0, sizeof(out));
        if (!findBetween(left, right, full[i], out, MAXOUTLEN)) {
            printf("not found\n");
        } else {
            printf("found string \"%s\" between \"%s\" and \"%s\"\n", out, left, right);
        }
        printf("\n");
    }
    return 0;
}
```

```
$ cc -o test test.c
$ ./test
test string ""
left string "left" not found
not found

test string "left"
right string "right" not found
not found

test string "right"
left string "left" not found
not found

test string "stuffleftmiddlerightmore"
found string "middle" between "left" and "right"

test string "stuffleft1234567890rightmore"
found string "1234567890" between "left" and "right"

test string "stuffleft1234567890Arightmore"
output buffer is too small!
not found

test string "stuffleftmrightmore"
found string "m" between "left" and "right"

test string "stuffleftrightmore"
no characters found between "left" and "right"
not found

test string "stuffleftsomethingreallylongrightmore"
output buffer is too small!
not found
```
 
Of course a much nicer approach to writing this kind of code is to use a regular expression library, but that's probably not going to help you much at present.
 