HOWTO: Debug AMD64 ASM in gdb

sgeos · Sep 20, 2015

There is not a lot of documentation on writing working AMD64 assembler for FreeBSD. For some reason I have been compelled to look into this topic. I figured sharing might be useful.

First, let us start with a C program that will do the same thing as the ASM program in almost the same way.

arg_echo.c

Code:

#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

#define BUFFER_SIZE 2048

char buffer[BUFFER_SIZE];

int main(int argc, char** argv)
{
  for (int arg = 0; arg < argc; arg++) {
    int length;
    for (length = 0; 0 != argv[arg][length]; length++) {
      buffer[length] = argv[arg][length];
    }
    buffer[length++] = '\n';
    syscall(SYS_write, STDOUT_FILENO, buffer, length);
  }
  syscall(SYS_exit, 0);
}

Build it with clang -g arg_echo.c -o arg_echo. Note the -g flag, which is necessary to generate dwarf(3) debugging information for gdb(1). Specifically, gdb needs dwarf2, not dwarf3 or dwarf4.

Load it in gdb(1) with gdb -tui arg_echo. Enter layout split or layout asm and you should see the assembler in the window. Pretty simple.

Next, it is time for a couple of assembler files. The file system.inc contains macro definitions that make arg_echo.s more readable.

system.inc

Code:

%define stdin  0
%define stdout 1
%define stderr 2

%define SYS_nosys 0
%define SYS_exit  1
%define SYS_fork  2
%define SYS_read  3
%define SYS_write 4

%macro system 1
  mov rax, %1
  syscall
%endmacro

%macro sys.exit 0
  system SYS_exit
%endmacro

%macro sys.fork 0
  system SYS_fork
%endmacro

%macro sys.read 0
  system SYS_read
%endmacro

%macro sys.write 0
  system SYS_write
%endmacro

arg_echo.s

Code:

%include 'system.inc'

%define BUFFER_SIZE 2048

section .bss
buffer  resb BUFFER_SIZE

section .text
align 4

global _start
_start:
  nop ; for gdb breakpoint
  mov rsp, rdi ; rdi contains the stack pointer to argc, argv[n]...
  pop rbx ; argc

  jmp is_last_arg
  proc_arg:
    pop rsi  ; argv[n]
    mov rdx, buffer
  copy_char:
    mov byte cl, [rsi] ; c = *argv[n]
    cmp cl, 0 ; if 0 != c
    je output
    mov byte [rdx], cl ; *buffer = c
    inc rsi ; argv[n]++
    inc rdx ; buffer++
    jmp copy_char
  output:
    mov byte [rdx], 0x0A ; append \n
    inc rdx ; buffer++
    ; write stdout, buffer, length
    mov rdi, stdout
    mov rsi, buffer
    sub rdx, buffer ; length
    sys.write
    dec rbx ; argc--
  is_last_arg:
    cmp rbx, 0 ; if 0 != argc
    jne proc_arg

  ; exit(0)
  xor rdi, rdi
  sys.exit
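
The pop sequence at the top of _start may be easier to follow as a C model. This is only a sketch under stated assumptions: model_stack, model_pop and echo_args are hypothetical names, and the array stands in for the memory rdi points to at entry (argc, then the argv pointers, then a NULL). Unlike the assembly, it collects all output in one buffer instead of calling write once per argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the initial stack: argc, argv[0..argc-1], NULL.
 * Each slot is 8 bytes wide on amd64, which is what pop consumes. */
typedef struct {
  const uintptr_t *sp; /* plays the role of rsp */
} model_stack;

/* Equivalent of `pop reg`: read the slot at sp, then advance by one slot. */
static uintptr_t model_pop(model_stack *s)
{
  return *s->sp++;
}

/* Walk the model stack the way _start does, appending "arg\n" to out for
 * each argument. Returns the number of bytes written to out. */
static size_t echo_args(const uintptr_t *initial_sp, char *out, size_t out_size)
{
  model_stack s = { initial_sp };
  size_t used = 0;
  uintptr_t argc = model_pop(&s);                  /* pop rbx */

  while (argc != 0) {                              /* cmp rbx, 0 / jne proc_arg */
    const char *arg = (const char *)model_pop(&s); /* pop rsi */
    size_t len = strlen(arg);
    assert(used + len + 1 <= out_size);            /* the assembly has no such check */
    memcpy(out + used, arg, len);                  /* the copy_char loop */
    used += len;
    out[used++] = '\n';                            /* mov byte [rdx], 0x0A */
    argc--;                                        /* dec rbx */
  }
  return used;
}
```

Calling echo_args on the four slots 2, "prog", "foo", NULL leaves "prog\nfoo\n" in out and returns 9.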

Everything up to this point has been copy paste. This is where the real how-to begins.

devel/nasm is a popular assembler, but it produces dwarf3 debugging information. devel/yasm can produce the dwarf2 debugging information gdb(1) needs, so we will install that instead with portmaster devel/yasm.

Build with the following commands. The ld(1) -s and -S flags strip symbol and debugging information, so do not use them.

 yasm -f elf -m amd64 -g dwarf2 arg_echo.s

ld -o arg_echo arg_echo.o

Alternatively, you can use the following commands to link with clang(1). You need the -nostdlib flag if your .s file contains a _start function. Without this flag linking will fail because there will be a conflict when clang(1) tries to add a standard _start function to wrap a probably nonexistent main() function.

 yasm -f elf -m amd64 -g dwarf2 arg_echo.s

clang -o arg_echo arg_echo.o -nostdlib

If you must use devel/nasm, you can get impaired debugging information (no source code) with the following commands.

 nasm -f elf64 -g -F stabs arg_echo.s

ld -o arg_echo arg_echo.o

Yet again, load it in gdb(1) with gdb -tui arg_echo. Enter layout split to see the source code and assembler. Surprisingly, layout asm may not work.

Astute readers will have noticed the nop at the top of _start. gdb(1) does not break on the very first instruction. You need to add a nop if you want to break before your program executes any state changing instructions. If breaking after the first instruction executes is not a big deal, leave out the nop.

You generally need to add a breakpoint before executing a program to be able to inspect it. To add a breakpoint, type something like b *0x4000b1. To add a breakpoint to the second instruction in _start, use b *&_start+1. This works because the nop is one byte long. To start execution use r, or r argv1 argv2 if you need command line parameters. Use s to step through the program one source line at a time. A source line is usually a single instruction here, but a macro such as sys.write expands to more than one, so use si when you need to step exactly one machine instruction. Use c to continue until the next breakpoint or until the program stops running.

To inspect the registers, use i r. You can inspect specific registers with something like i r rdi rsp. The output columns are register name, hexadecimal value, decimal value. Some registers display a hexadecimal value in the decimal column. To set a register, use something like set $rbx = 2. Reference.

To inspect memory use something like x 0x60010c, or x/5sc 0x60010c to display multiple values with a format and size. More useful yet, you can display memory at an address held in a register with x/5sc $rsi. Offsets can be used, as in x/5sc $rsi+0x10. If you want to view memory at something like a bss label, use x/5sc &buffer. Reference.

Set memory with something like set {char}0x600110=0x0A. A contiguous memory region can be set like set {int [3]}0x600110={1, 0x02, 'D'}. Strings can be written as contiguous bytes with set {char [7]}0x600110="foobar". Registers can be used as pointers to memory, as in set {char [7]}$rdx="foobar". A bss variable can be set with set {char [7]}&buffer = "foobar". Reference.
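
The {type}address form treats the memory at that address as an object of the given type, so the type decides how many bytes get overwritten. A rough C model of two of the commands above, assuming 4-byte little-endian int as on amd64; poke_string7 and poke_int3 are hypothetical helper names, not anything gdb provides:

```c
#include <stddef.h>
#include <string.h>

/* Model of `set {char [7]}addr = "foobar"`: six characters plus the
 * terminating NUL, seven bytes in total. Returns the byte count. */
static size_t poke_string7(unsigned char *addr)
{
  const char value[7] = "foobar";
  memcpy(addr, value, sizeof value);
  return sizeof value;
}

/* Model of `set {int [3]}addr = {1, 0x02, 'D'}`: three ints of 4 bytes
 * each on amd64, so 12 bytes are overwritten, not 3. */
static size_t poke_int3(unsigned char *addr)
{
  const int value[3] = { 1, 0x02, 'D' };
  memcpy(addr, value, sizeof value);
  return sizeof value;
}
```

If you only want to change three consecutive bytes, a char array type such as {char [3]} is probably what you want rather than {int [3]}.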

Use info files to get information on the loaded files, like the entry point address. You can view the stack frame with f. Quit with q.

For more information, you probably want to read the System V AMD64 ABI Reference, search for specific information, or consult the gdb(1) manual page. You may also be interested in Thread assembly-simple-hello-world.53274.
