
I set up an experiment to see whether this works:

int *p;
p[0] = 3;

My idea was that the compiler gives p a random value, and I can then treat it as an array.

But it turns out to cause a segmentation fault, and I don't understand the assembly code.

   0x0000000000401530 <+0>: push   %rbp
   0x0000000000401531 <+1>: mov    %rsp,%rbp
   0x0000000000401534 <+4>: sub    $0x30,%rsp
   0x0000000000401538 <+8>: mov    %ecx,0x10(%rbp)
   0x000000000040153b <+11>:    mov    %rdx,0x18(%rbp)
   0x000000000040153f <+15>:    callq  0x402170 <__main>
   0x0000000000401544 <+20>:    mov    -0x8(%rbp),%rax
=> 0x0000000000401548 <+24>:    movl   $0x3,(%rax)
   0x000000000040154e <+30>:    mov    $0x0,%eax
   0x0000000000401553 <+35>:    add    $0x30,%rsp
   0x0000000000401557 <+39>:    pop    %rbp
   0x0000000000401558 <+40>:    retq    

I searched on Google: mov is Intel style and movl is AT&T style. How come these two styles appear together?

At this line:

mov    -0x8(%rbp),%rax

It seems to move the value at address rbp-0x8 into register rax, right? Is "-0x8(%rbp)" the random value that goes into p?

I think %rax is not p, because on the next line the CPU gives $0x3 to %rax. It seems that %rax is the first memory location of the array.

How do I interpret this assembly code? Thank you.

Peter Cordes
Andy Lin
  • Thanks. Why is moving the 32-bit value 3 to register rax a segmentation fault? – Andy Lin May 25 '18 at 02:21
  • Because it tries to move an immediate value to some address that your program doesn't have access to. – iBug May 25 '18 at 02:23
  • @iBug: AT&T syntax only requires an operand-size suffix when it's ambiguous (like with an immediate source and memory destination). `mov $0, %eax` is 32-bit operand-size, implied by the EAX register. Plain `mov` doesn't imply 16-bit in any way. This disassembly is only using suffixes where it's ambiguous, unlike `objdump -d` which uses a suffix on every instruction. (@ Andy: the entire listing is pure AT&T syntax, always with the destination on the right). – Peter Cordes May 25 '18 at 02:46
  • @PeterCordes Do you mean that `movl $0x3,(%rax)` would otherwise mean the 64-bit MOV instruction if it wasn't spelled `movl`? – iBug May 25 '18 at 02:57
  • @iBug: no, it would be an error: put it into `bar.S` and assemble it with `gcc -c bar.S`, and you'll get this assembler error message `bar.S:1: Error: no instruction mnemonic suffix given and no register operands; can't size instruction`. i.e. the operand-size is ambiguous, so the assembler refused to assemble it. – Peter Cordes May 25 '18 at 03:12
  • @iBug `mov` is not necessarily a 16 bit operation! `movw` is the 16 bit `mov`, a `mov` without suffix moves as many bits as demanded by its operands. – fuz May 25 '18 at 07:16

1 Answer


I'm not an expert at assembly, but the code looks quite clear, so I'll attempt to explain it.


   0x0000000000401530 <+0>: push   %rbp
   0x0000000000401531 <+1>: mov    %rsp,%rbp

(Above) This is standard function-prologue code that the compiler generates when optimization is not turned on. It saves the caller's RBP on the stack, then copies RSP (the 64-bit stack pointer) into RBP to set up a frame pointer.

   0x0000000000401534 <+4>: sub    $0x30,%rsp

This reserves extra space on the stack for local variables.

   0x0000000000401538 <+8>: mov    %ecx,0x10(%rbp)
   0x000000000040153b <+11>:    mov    %rdx,0x18(%rbp)

This saves argc and argv onto the stack, because you made a debug build so all variables have to be in memory. (The space it's using is above the return address, where main's caller reserved space. This is called shadow space, and is a feature of the Windows x64 calling convention.)

   0x000000000040153f <+15>:    callq  0x402170 <__main>

This calls some kind of early init function. It might use argc and argv (still in registers), or it might not; we can't tell from the code.

   0x0000000000401544 <+20>:    mov    -0x8(%rbp),%rax

This loads uninitialized stack memory as the value of int *p. Automatic variables are placed on the stack. You read p without having written it first, so the compiler just reads whatever garbage or zeros were already in the stack slot it chose for int *p.

=> 0x0000000000401548 <+24>:    movl   $0x3,(%rax)

This line stores the immediate value 3, as a 32-bit dword, at the memory address held in rax.

The value of p is in rax, so this is your p[0] = 3;, dereferencing whatever garbage p holds. You crash because it doesn't happen to point to writeable memory. (Overwriting some random dword in memory would hardly be better, but at least your code wouldn't crash here, just maybe at some point later, if the garbage value of p happened to be a valid pointer.)

   0x000000000040154e <+30>:    mov    $0x0,%eax

This sets the register eax to zero, which also zeroes all of rax, because writing a 32-bit register zero-extends into the full 64-bit register. Windows x64 (like every standard calling convention) uses RAX for return values, so this is implementing the implicit return 0; at the bottom of main.

   0x0000000000401553 <+35>:    add    $0x30,%rsp
   0x0000000000401557 <+39>:    pop    %rbp
   0x0000000000401558 <+40>:    retq

This restores the stack pointer and frame pointer (RSP and RBP) to the values they had on function entry, then returns to the caller.

Peter Cordes
iBug
  • The instructions before the `call __main` aren't "preparing" for it, they're just saving `main`'s function args on the stack. (Which the x64 Windows calling convention passes in RCX=argc, RDX=argv). Actually it's just spilling them on function entry (into the shadow space above the return address) because it's a debug build; it doesn't reload them later. We can't tell whether `__main` actually takes any args or not, because `main`'s args are still in the arg-passing registers when it's called. We can't tell if `__main` looks at them or not. – Peter Cordes May 25 '18 at 02:52
  • I didn't want to take the time to write my own answer, so I made a substantial edit to yours. The explanations are now expert-approved. :) You were mostly close before, including getting the exact cause of the problem correct, but I split up the code before `call __main` into more separate blocks because they're unrelated. – Peter Cordes May 25 '18 at 03:10
  • @PeterCordes Thanks, expert at assembly. I think I've got a better understanding, too. – iBug May 25 '18 at 08:09
  • @iBug Hi. Could you please tell me which part of this question can be improved without editing the context? I want the downvote removed. – Andy Lin Mar 13 '19 at 10:36