Gpart jokes

I know it's not nice (I've written a lot of broken code myself), but I still can't stop laughing. Here's the reason; I hope writing it down here will help me come to my senses (because, as you know, uncontrollable solitary laughing can be a sign of madness):
Code:
	s1 = class_name;
	for (; *s1 != '\0'; s1++)
		*s1 = tolower(*s1);
	gclass_name = malloc(strlen(class_name) + 1);
	if (gclass_name == NULL)
		errx(EXIT_FAILURE, "No memory");
	s1 = gclass_name;
	s2 = class_name;
	for (; *s2 != '\0'; s2++)
		*s1++ = toupper(*s2);
	*s1 = '\0';
From the comment, it is supposed to turn class_name into all capital letters :)
But I think it has another problem as well, so I need some feedback, because I don't know what to do about it. In my test case class_name is part, so the problematic line translates into gclass_name = malloc(5). This single line allocates 4100008 (sic! Almost 4 MiB) bytes of memory! Can it be considered a bug, or is it intentional (I can't even imagine a gclass_name 4 MiB long)? Whom should I talk to about it? (I filed another bug against gpart, but no one has responded, even to dismiss it, so I guess that, following the info from the FreeBSD wiki, I should contact some developer directly, but I don't know exactly whom.) Thanks in advance for any advice.
 
Are you sure? As far as I can see the malloc(3) call is done only once because the preceding for-loop has only this body:

Code:
*s1 = tolower(*s1);

For clarity this is what it looks like when braces are added to show how the compiler interprets it:

Code:
for (; *s1 != '\0'; s1++) {
    *s1 = tolower(*s1);
}
gclass_name = malloc(strlen(class_name) + 1);
 
Yes, I'm sure. That's not the problem I have with it; I know that a single statement can go without braces. The problem is that it doesn't do what it's supposed to do. I hardcoded some other values for malloc, and it always allocated 4099999 bytes when the value was < 5. I don't know, maybe it's a problem with gdb? That's why I wanted to ask here before I take any further action.
 
malloc returns a pointer to the allocated buffer. The buffer is not initialized. The 4100008 seems to be a memory address.
 
I know, when I try to initialize it with some string, even empty, I get segmentation fault. It's not a memory address, currently I have:
gclass_name memory address is 0x801c17058, gclass_name[0] is P, gclass_name[4] is \0, gclass[x] can read chars from other strings placed somewhere else, gclass_name[4100007] is \0, gclass_name[4100008] prints Error accessing memory address 0x802000000: Bad address.

Little update: after rebooting (? I also placed (char *) before malloc, maybe this fixed it?) gclass_name prints PART OK, but the memory problem is still there (FreeBSD 10.1-RELEASE, amd64).

Another update. I was able to print gclass_name[400000] through printf also, so it's not a gdb problem I think. The question is, is it a bad usage of attributes somewhere, or a bug in compiler? Another funny thing, when I do malloc(9) it allocates less memory (4095936 bytes) than malloc(1)
 
My bad, I checked the address space it has access to, and it extends in the other direction as well (so it's actually 8388608 bytes). I allocated memory for a second variable of this type just after the first one, and it was placed 8 bytes after the first. Using negative indexes I could print the contents of the previous variable :) This probably caused the error in gclass_name I had at the beginning, because back then it printed something like P \0 A A R \0 T \0. I have some ideas for further tests, so I'm going back to have more fun with it :)
 
Keep in mind that a modern operating system has virtual memory and the smallest unit of memory that the memory management can give a process is one page. The page size on i386 and amd64 is 4096 bytes as far as I know. An allocation of 5 bytes is going to allocate at least one page unless there have been other allocations previously and the malloc(3) implementation can use the leftover space from the page allocated for the previous allocation. The memory allocator may also decide that it's more beneficial to allocate multiple pages in one go since it's likely that if a process does one allocation on the heap it's going to do many more after the first.

When you get memory from malloc(3) you shouldn't try to interpret its contents in any way and especially after the number of bytes you requested. If you ask for 5 bytes use only the 5 bytes from the location pointed by the pointer you were given, anything after the 5 bytes should be treated as off-limits. The best practice is to initialize the memory to zero yourself or use calloc(3) unless the code is inside a time critical section where every cycle counts.
 
Yes, I keep it in mind. I just don't understand why it needs 8 megabytes to allocate a 5-byte string :p A little lower in the address space there's enough free room for it, so why doesn't malloc use that? And how does it know what the 5 units it has to allocate are? Of course, I'm just starting my journey with FreeBSD and still don't know a lot of important details, and I haven't analyzed the whole code, so it may be perfectly sane behaviour; that's why I'm asking here. But still, this code doesn't act in a deterministic way, sometimes it returns garbage, and I'm trying to understand why, because without that it's impossible to learn anything.
 
I think I wrote it already? :p When I malloc gclass_name, it has access to 8 megabytes of memory (last valid address - first valid address +1), using it I can print all the values from there (like /lib/geom, the address to the library) using for loop for example. As for "direct" allocation it's probably 8 bytes, I try to allocate other variables, and they are allocated 8 bytes from each other, always in the same place when I use malloc(1) to malloc(8). When I use malloc(9) they jump 16 bytes from each other. In general gpart takes around 20 megabytes of memory.
 
You can't trust the first and last valid address when dealing with virtual memory. The memory map may have "holes" in it which are unmapped pages with no physical memory behind them. There is a tool designed to show you the memory map of a process and that is called procstat(1). For example this is the memory map for a shells/zsh process on my system:

Code:
firewall ~ % procstat -v 1240
  PID      START        END PRT  RES PRES REF SHD   FL TP PATH
1240  0x8048000  0x80e2000 r-x  147  159   8   4 CN-- vn /usr/local/bin/rzsh
1240  0x80e2000  0x80e5000 rw-    3    0   1   0 CN-- vn /usr/local/bin/rzsh
1240  0x80e5000  0x80eb000 rw-    6    6   1   0 C--- df
1240 0x280e2000 0x280f8000 r-x   22   23  34   0 CN-- vn /libexec/ld-elf.so.1
1240 0x280f8000 0x28100000 rw-    5    5   1   0 C--- sw
1240 0x28102000 0x2810a000 rw-    1    1   1   0 CN-- sw
1240 0x2810c000 0x28150000 r-x   67   69   8   4 CN-- vn /lib/libncursesw.so.8
1240 0x28150000 0x28153000 rw-    3    0   1   0 CN-- vn /lib/libncursesw.so.8
1240 0x28153000 0x28178000 r-x   19   19  12   6 CN-- vn /lib/libm.so.5
1240 0x28178000 0x28179000 rw-    1    0   1   0 CN-- vn /lib/libm.so.5
1240 0x28179000 0x282b8000 r-x  263  278  65  31 CN-- vn /lib/libc.so.7
1240 0x282b8000 0x282bf000 rw-    7    0   1   0 CN-- vn /lib/libc.so.7
1240 0x282bf000 0x282f4000 rw-    8    8   1   0 CN-- sw
1240 0x282f4000 0x282f8000 rw-    4    4   1   0 CN-- df
1240 0x282f8000 0x28330000 r-x   47   52   8   4 CN-- vn /usr/local/lib/zsh/5.0.7/zsh/zle.so
1240 0x28330000 0x28335000 rw-    5    0   1   0 CN-- vn /usr/local/lib/zsh/5.0.7/zsh/zle.so
1240 0x28335000 0x28354000 r-x   31   32   8   4 CN-- vn /usr/local/lib/zsh/5.0.7/zsh/complete.so
1240 0x28354000 0x28355000 rw-    1    0   1   0 CN-- vn /usr/local/lib/zsh/5.0.7/zsh/complete.so
1240 0x28355000 0x2835b000 r-x    6    6   8   4 CN-- vn /usr/local/lib/zsh/5.0.7/zsh/zutil.so
1240 0x2835b000 0x2835c000 rw-    1    0   1   0 CN-- vn /usr/local/lib/zsh/5.0.7/zsh/zutil.so
1240 0x2835c000 0x28372000 r--   22   40   4   0 ---- vn /usr/local/share/zsh/5.0.7/functions/Completion.zwc
1240 0x28372000 0x28379000 r-x    6    7   8   4 CN-- vn /usr/local/lib/zsh/5.0.7/zsh/parameter.so
1240 0x28379000 0x2837a000 rw-    1    0   1   0 CN-- vn /usr/local/lib/zsh/5.0.7/zsh/parameter.so
1240 0x2837a000 0x2839a000 r--   26   26   4   0 ---- vn /usr/local/share/zsh/5.0.7/functions/Completion/Base.zwc
1240 0x283af000 0x283cb000 r--    4    4   3   0 ---- vn /usr/local/share/zsh/5.0.7/functions/Completion/Zsh.zwc
1240 0x283cb000 0x283d8000 r-x   13   13   6   3 CN-- vn /usr/local/lib/zsh/5.0.7/zsh/computil.so
1240 0x283d8000 0x283d9000 rw-    1    0   1   0 CN-- vn /usr/local/lib/zsh/5.0.7/zsh/computil.so
1240 0x28400000 0x28c00000 rw-  184  184   1   0 C--- sw
1240 0x28c00000 0x28da2000 r--   22   22   3   0 ---- vn /usr/local/share/zsh/5.0.7/functions/Completion/Unix.zwc
1240 0xbfb5f000 0xbfb7f000 rwx   21   21   1   0 CN-D sw
1240 0xbfb7f000 0xbfb9f000 rwx   31   31   1   0 CN-- sw
1240 0xbfb9f000 0xbfbbf000 rwx   31   31   1   0 CN-- sw
1240 0xbfbbf000 0xbfbdf000 rwx   31   31   1   0 CN-- df
1240 0xbfbdf000 0xbfbff000 rwx   28   28   1   0 C--- sw
1240 0xbfbff000 0xbfc00000 r-x    1    1  37   0 ---- ph

Btw, this is nothing that is FreeBSD specific. All modern operating systems with virtual memory work the same way in general with some variation on the details.
 
I think I don't quite grasp the "hole" thing ;) procstat tells me that 0x801800000 to 0x802000000 are reserved, and this is my experience from accessing it through gclass_name. I wrote a function that replaces \0's with some char, and it works for this whole area, there are no holes. Of course it doesn't mean it must be in one piece in real memory, but it's not the problem (or maybe it is?). I started to analyze it, because gdb showed me garbage in gclass_name instead of PART, and my goal is to understand why. Probably it is just some problem with gdb (or my computer's memory), but to make sure I must rule out variables colliding, and for this I must understand why this area is so huge, if placed in some smaller place everything would be much easier to control and there would be smaller risk of colliding. For example, under some circumstances, it placed that "No memory" string right after gclass_name, quite differently than another similar variable, not respecting(?) malloc allocation.
 
You probably want to read about how jemalloc works (and why it works the way it does). Another, quite well-written, text about jemalloc can be found here (reading chapter 2 should do as an introduction).

malloc(3) (i.e. jemalloc) requests at least two "chunks" of memory from the kernel. On FreeBSD 9 and 10 the default chunk size is 4 MiB (2 ^ LG_CHUNK_DEFAULT bytes).

As described in the man page, the chunk size can be reduced by setting MALLOC_OPTIONS accordingly, down to 64 KiB with "6k":

Code:
% uname -r
9.3-STABLE
% MALLOC_OPTIONS="P6k" id 2>&1 | grep Chunk
Chunk size: 65536 (2^16)
%
 
I have 10.1-RELEASE and the second returns only Ambiguous output redirect. I looked, and I don't see any variables or files mentioned in malloc(3). Maybe this causes all my problems with it?
Nice, I just wanted to find some reliable OS to draw simple graphics in Inkscape, and now I'm descending the bottomless pit of (s)hell, where, it seems, I will be trapped for ever.
0x48 0x65 0x6C 0x70 0x21 0x3B 0x29
 
Oops, I just noticed that on 10.x the correct man page is jemalloc(3), MALLOC_OPTIONS is called MALLOC_CONF and "6k" should probably be "opt.lg_chunk:16".

Why don't you just tell us what the real problem is you're trying to solve (instead of engaging in what I perceive as borderline-trolling)?
 
It doesn't change anything, I don't have any MALLOC* variables.

I think I explained the most important part already? I try to write some patch for gpart. When I started to test gpart with gdb (without my patch, to be clear) p gclass_name printed not what it was supposed to print, but some garbage. It looked like *s1++ was moving not by one byte, but two. Also under some circumstances it looked like malloc doesn't allocate memory properly, the output of it wasn't predictable at all. Since I'm new to FreeBSD, I decided to write here, to get a broader picture of how things are/should be constructed, to decide if it is a problem with gpart itself, gdb or maybe something else. As for my style of writing, when I sit the whole day and analyze the code, some (non)sense of humour helps me keep sanity. If it's really trolling, I won't write here anymore, sorry.
 
It looked like *s1++ was moving not by one byte, but two.

Looked like? Do you have any gdb output so we can have a look at what you're actually doing (and how)?

If the address of s1 increases by two after incrementing s1 then *s1 is not of type "char". This is unlikely, except someone did something exceptionally funny (like "#define char short").

Further questions that come to mind are: Did you compile the system yourself? If so, did you add anything funny to /etc/make.conf? Or did you make any drastic changes to your system?
 
... As for my style of writing, when I sit the whole day and analyze the code, some (non)sense of humour helps me keep sanity. If it's really trolling, I won't write here anymore, sorry.

Well, people don't understand what exactly made you laugh, since the referenced code is perfectly OK, as can be verified by copying the following code into a file named setclass.c, compiling, and then executing it:
Code:
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <string.h>
#include <err.h>

char *class_name = NULL, *gclass_name = NULL;

int main(int argc, char *argv[])
{
   class_name = argv[1];
   printf("%s\n", class_name);

/* unchanged code snippet of /usr/src/sbin/geom/core/geom.c
static void
set_class_name(void)
{  BEGIN */
   char *s1, *s2;

   s1 = class_name;
   for (; *s1 != '\0'; s1++)
      *s1 = tolower(*s1);
   gclass_name = malloc(strlen(class_name) + 1);
   if (gclass_name == NULL)
      errx(EXIT_FAILURE, "No memory");
   s1 = gclass_name;
   s2 = class_name;
   for (; *s2 != '\0'; s2++)
      *s1++ = toupper(*s2);
   *s1 = '\0';
/* } END of unchanged code snippet */

   printf("%s\n", gclass_name);
   return 0;
}

Code:
clang setclass.c -o setclass
./setclass part

Result:
Code:
part
PART
What else?
 
@worldi Yes, "looked like", because I don't know if this is what really happened; probably not. If I had known the output would be like that I would have logged it somewhere, but I didn't expect it, and this is why it was so funny. Keep in mind that most of the time it works OK. What am I doing? Usually a fresh base and kernel install with src, setting up vt, creating /etc/make.conf to turn off optimizations, compiling /usr/src/sbin/geom with DEBUG_FLAGS=-g and then running # gdb gpart. At the beginning I compiled world to see where the source of gpart is (I couldn't find it in g***le) and how long it would take :) I create make.conf only to change O2 to O0, because without that gdb was acting even more funny :p The only drastic change to the system is the GNOME2 installation. As I said, I don't understand why the malloc region is 8 MiB rather than the default 4 MiB; maybe this creates my problems, the values are not defined anywhere and it confuses malloc? Gdb is also acting funny: after some time, even when idle, it refuses to exit. Actually it exits, but the gdb process is still active.

@obsignia That's exactly why I'm laughing: the code is OK, and yet sometimes it doesn't do what's expected. I know it should print PART, and most of the time it does. I don't know, probably my English is even worse than I thought, and it's hard to understand. Coding is fun, learning is / should be fun. FreeBSD is really fun to use, and the problem here is not so serious; in normal circumstances it will probably never occur, which is why I treat it so lightheartedly. When something unexpected happens while you do something fun, it's funny, isn't it? I wanted this thread to be fun as well, not some boring chore for those who decide to help me with it. But maybe my sense of humour is too strange :(
 