Introspection
Description
Sometimes, you have to look back in order to understand the things that lie ahead.
A colleague of yours has sent you a program he developed and wants to challenge you.
Validate your access and give him the flag.
File: chall.
TL;DR
The program implements a well-known anti-disassembly technique that interleaves junk data bytes with code so that disassemblers such as IDA or Ghidra produce an inaccurate set of instructions. Removing these bytes lets us recover the legitimate control flow and understand how to solve the challenge.
Reverse engineering
The file we’re given is an x86_64 ELF executable:
# file chall
chall: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, for GNU/Linux 3.2.0, BuildID[sha1]=881d6fe5c40b07c2064e5a6efd5ad125d702826a, not stripped
The binary is not stripped, so let’s open it in IDA.
main function analysis
The main function is quite simple: it only saves the arguments passed to the program in a g_argv variable in the .bss segment and calls the check_password function.
Here’s the corresponding pseudocode:
check_password function analysis
At first glance, this function also seems pretty simple:
- read the first 4 bytes of the file referenced by g_argv[0] (here ./chall)
- the number of bytes actually read (which should be 4) is used as the number of rounds of an encoding loop composed of xor and left-shift operations
- this loop gives us a kind of 32-bit hash, which must be equal to 0x709E9614 to validate our password
Here’s the corresponding pseudocode:
Let’s fire up z3 and see what it takes to find a valid password and get the flag!
Finding a valid password with z3
Since the program has been statically linked and uses tons of libc functions, I would rather use z3 than angr and focus on the check_password function:
#!/usr/bin/env python3

from z3 import *

def get_password_hash(password):
    # result of fread.
    buf = b'\x7fELF'
    rounds = buf_len = len(buf)

    # encryption loop.
    hash = key = password
    for i in range(1, rounds):
        key = (key << 8) + buf[i % buf_len]
        hash ^= key

    return hash

s = Solver()
password = BitVec('password', 32)
s.add(get_password_hash(password) == 0x709e9614)

while s.check() == sat:
    m = s.model()
    m_pass = m[password].as_long()
    print(f'Password: {m_pass:#x}')
    s.add(password != m_pass)
Result:
# python3 solve.py
Password: 0xab44c45b
# ./chall 0xab44c45b
Valid password found!
Trusted data: ELF
Nice try :/
Well, that was not expected… If we take another look at the main function, we notice that the password_buf variable is global and is used by both the check_password function and the main function for different purposes.
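Before digging into the buffer, it is worth confirming that the hash arithmetic itself holds, so that the failure cannot be blamed on the solver. The following is a minimal sketch that replays the loop from the z3 script with plain integers and records every round. The names trace_hash and MASK32 are mine, and the 32-bit truncation is an assumption carried over from the BitVec('password', 32) declaration.

```python
MASK32 = 0xFFFFFFFF  # assumption: values are truncated to 32 bits, as in the z3 model


def trace_hash(password: int, buf: bytes = b"\x7fELF"):
    """Replay the encoding loop and return one (key, hash) pair per round."""
    rounds = buf_len = len(buf)
    result = key = password & MASK32
    steps = []
    for i in range(1, rounds):
        # shift the key left by one byte and append the next buffer byte
        key = ((key << 8) + buf[i % buf_len]) & MASK32
        result ^= key
        steps.append((key, result))
    return steps


if __name__ == "__main__":
    for key, result in trace_hash(0xAB44C45B):
        print(f"key={key:#010x} hash={result:#010x}")
    # key=0x44c45b45 hash=0xef809f1e
    # key=0xc45b454c hash=0x2bdbda52
    # key=0x5b454c46 hash=0x709e9614
```

The last round lands on 0x709e9614, so the password found by z3 does satisfy the check as modelled, and the surprise has to come from somewhere else.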
Let’s set a watchpoint to see how the buffer is used throughout the program and see if we can somehow prevent the value '\x7f' from being placed in the buffer.
Debugging
password_buf tracing with gef
According to the gdb manual, a watchpoint stops the execution of the program whenever the value of an expression changes. Let’s add one to trace the changes to the first byte of password_buf.
To get clean output in gdb and to automate our tracing as much as possible, we can use gdb-gef with a command file containing specific configurations to be applied to our debugging session:
# configure the output layout.
gef config context.layout "trace memory"
gef config context.clear_screen 0
# use software watchpoint.
set can-use-hw-watchpoints 0
# start the program (and load symbols).
start 0x1234
# set a watchpoint for the first buffer byte value.
memory watch &buf 0x1 byte
watch * (char *) &buf
# remove useless output.
commands
silent
continue
end
# run the program and quit.
continue
quit
Output:
Okay, there seem to be some operations on the first byte of the buffer. Let’s break them down:
- the program starts by copying the value of argv[1] into password_buf using the strncpy function
- then the program calls the fread function to read 4 bytes from its own file
Well, that’s what it’s supposed to do… Instead, the call to the fread function, which was supposed to call the _IO_sgetn function, is followed by recursive calls to a strange __do_global_init function:
vtable corruption
Reading some documentation, we learn that the functions associated with a FILE structure pointer are referenced in a jump table (or vtable) called _IO_file_jumps, which is itself referenced in a structure called _IO_FILE_plus that contains both the _IO_FILE structure (an alias of the FILE structure) and a pointer to the vtable (source). Since this table is common to several FILE structures (sources: here and here), it can be modified to hook a function or to hide code.
Let’s inspect our current vtable:
Looking closely at each vtable entry, we notice that the pointer associated with the xsgetn function seems to have been replaced to point to the __do_global_init function, which explains our previous backtraces.
According to the documentation (here), the xsgetn function is the one actually in charge of reading n bytes following the call to fread, and it returns the number of characters actually read. That should be interesting!
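To make the hooking idea concrete, here is a toy Python model. It is not glibc code: the ToyFile class, the TOY_FILE_JUMPS dictionary standing in for _IO_file_jumps and both reader functions are hypothetical names. They only illustrate that replacing one entry of a shared jump table redirects every stream that dispatches through it.

```python
def genuine_xsgetn(fp, n):
    """Stand-in for the real reader: return up to n bytes from the stream."""
    data = fp.data[fp.pos:fp.pos + n]
    fp.pos += len(data)
    return data


def hooked_xsgetn(fp, n):
    """Hypothetical hook: ignore the stream and return attacker-chosen bytes."""
    return b"HOOK"[:n]


# One table shared by every stream, playing the role of the vtable.
TOY_FILE_JUMPS = {"xsgetn": genuine_xsgetn}


class ToyFile:
    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0
        self.vtable = TOY_FILE_JUMPS  # a reference to the shared table, not a copy


def toy_read(fp, n):
    """Callers never name the reader directly; they go through the table."""
    return fp.vtable["xsgetn"](fp, n)


if __name__ == "__main__":
    first, second = ToyFile(b"\x7fELF..."), ToyFile(b"other data")
    print(toy_read(first, 4))                 # served by the genuine reader
    TOY_FILE_JUMPS["xsgetn"] = hooked_xsgetn  # overwrite a single table entry
    print(toy_read(first, 4), toy_read(second, 4))  # both streams now hit the hook
```

Nothing in the calling code changes, which is why a decompiled call to fread looks perfectly ordinary even though a different function ends up running.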
__do_global_init function analysis
When displaying this function in IDA, it seems obvious that there is a problem:
Even if we try to fix the function boundaries (using the p and e shortcuts), the IDA disassembly process doesn’t work properly and shows us both data and code:
There are two common disassembly techniques:
- flow-oriented disassembler: disassembles all bytes that are part of the execution flow
- linear disassembler: iterates over a block of code, disassembling one instruction at a time. The size of the last disassembled instruction is used to determine the offset of the next instruction.
Here, it appears that the challenge author took advantage of the fact that most disassemblers process the bytes immediately following a call instruction before processing the call target. This can produce conflicting code and trick the disassembler into producing inaccurate results when the program relies on information the disassembler doesn’t have access to, in our case, the return address pointer.
The call instruction pushes a return address pointer on the stack. When the function is analyzed, the disassembler prematurely terminates it because of the “rogue ret” instruction.
Let’s analyze each of these instructions:
- call .+5 (\xe8\x00\x00\x00\x00): this instruction calls the location immediately following itself. Its only effect is to push the address of the next instruction (the return address) on the stack before execution continues there.
- add dword ptr [rsp], 0x9 (\x83\x04\x24\x09): this instruction adds 9 to the return address pointer, which then points just past the 4-byte add, the 1-byte ret and the 4 bytes that follow
- ret (\xc3): this instruction pops the return address from the stack and jumps to it
- \x48\xd8\xfe\xca: these bytes following the ret instruction are not valid instructions and will never be executed, but they were analyzed and defined as data when the first call instruction was determined not to be part of any function due to the “rogue ret” instruction
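The arithmetic behind this stub can be checked outside IDA. The sketch below works on raw bytes only: it recognises the call/add/ret pattern described above, computes where execution really resumes, and overwrites everything before that point with nop bytes. The function names are mine, and the trailing \x31\xc0 (xor eax, eax) in the demo is a made-up placeholder for the first real instruction.

```python
CALL_NEXT = b"\xe8\x00\x00\x00\x00"  # call .+5
ADD_RSP_IMM8 = b"\x83\x04\x24"       # add dword ptr [rsp], imm8 (immediate follows)
RET = 0xC3
NOP = b"\x90"


def resume_offset(code: bytes, start: int = 0):
    """Return the offset where execution resumes after the stub, or None if absent."""
    add_at = start + len(CALL_NEXT)      # also the return address pushed by the call
    imm_at = add_at + len(ADD_RSP_IMM8)
    ret_at = imm_at + 1
    if code[start:add_at] != CALL_NEXT:
        return None
    if code[add_at:imm_at] != ADD_RSP_IMM8:
        return None
    if ret_at >= len(code) or code[ret_at] != RET:
        return None
    imm = code[imm_at]
    if imm >= 0x80:                      # imm8 is sign-extended by the CPU
        imm -= 0x100
    return add_at + imm


def nop_stub(code: bytes, start: int = 0) -> bytes:
    """Replace the stub and its junk bytes with nops, leaving the rest untouched."""
    target = resume_offset(code, start)
    if target is None or target <= start:
        return code
    return code[:start] + NOP * (target - start) + code[target:]


if __name__ == "__main__":
    # stub + junk bytes from the write-up, then a hypothetical real instruction
    blob = CALL_NEXT + b"\x83\x04\x24\x09\xc3" + b"\x48\xd8\xfe\xca" + b"\x31\xc0"
    print(resume_offset(blob))   # 14
    print(nop_stub(blob).hex())
```

The result of 14 is 5 (the pushed return address, right after the call) plus the immediate 9, which is exactly the length of the add, the ret and the four junk bytes.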
To fix the function flow, we can replace these instructions with nop instructions and reset the function boundaries to cover the actual function instructions. I wrote an IDA python script to automate this task:
import idc
import idaapi
import idautils

idc.ida_expr.compile_idc_text('static fix_disas_shortcut() { exec_python("fix_anti_disas()"); }')
idc.add_idc_hotkey('ctrl-<', 'fix_disas_shortcut')

def fix_anti_disas():
    """
    Basically search for the following anti-disassembly instruction set, nop it and fix function boundaries:
        call .+5
        add [rsp], val
        ret
        .db XX
        .db XX
        .db XX
    """
    # get function from cursor position.
    cursor_ea = idc.get_screen_ea()
    if idautils.ida_funcs.get_func(cursor_ea):
        # get current function boundaries.
        func_start_ea = idautils.ida_funcs.get_func(cursor_ea).start_ea
        func_end_ea = idautils.ida_funcs.get_func(cursor_ea).end_ea

        # loop through all instructions in the function.
        insn_ea = func_start_ea
        while insn_ea <= func_end_ea:
            print(f'Current ea: {insn_ea:#x}')

            # make sure we're working with an instruction (even if we overlap with an existing one).
            if not idc.create_insn(insn_ea):
                for ea in range(insn_ea, idc.next_head(insn_ea) + 1):
                    idc.ida_bytes.del_items(ea, 0, 1)
                idc.create_insn(insn_ea)

            # get mnemonic of the current instruction (e.g., call, add, jmp).
            insn_mnem = idautils.DecodeInstruction(insn_ea).get_canon_mnem()
            # get operands of the current instruction (e.g., registers, value, address).
            insn_ops = idautils.DecodeInstruction(insn_ea).ops

            if insn_mnem == 'call':
                print(f'Found a call at: {insn_ea:#x}')
                called_ea = insn_ops[0].addr

                # get the immediate next instruction address and operands.
                im_next_insn_ea = insn_ea + idc.get_item_size(insn_ea)
                im_next_insn_mnem = idautils.DecodeInstruction(im_next_insn_ea).get_canon_mnem()
                im_next_insn_op1 = idaapi.get_reg_name(idc.get_operand_value(im_next_insn_ea, 0), 8)
                im_next_insn_op2 = idc.get_operand_value(im_next_insn_ea, 1)

                if called_ea == im_next_insn_ea \
                        and im_next_insn_mnem == 'add' \
                        and im_next_insn_op1 == 'rsp':
                    print(f'Found anti-disas at {insn_ea:#x}')

                    # get the real instruction address.
                    im_next_insn_ea += im_next_insn_op2
                    print(f'Real next instruction {im_next_insn_ea:#x}')

                    # nop out intermediate bytes and convert them into code.
                    for ea in range(insn_ea, im_next_insn_ea):
                        idc.ida_bytes.patch_byte(ea, 0x90)
                        idc.create_insn(ea)
                        print(f'Nopped {ea:#x}')

                    insn_ea = im_next_insn_ea
                else:
                    insn_ea = insn_ea + idc.get_item_size(insn_ea)
            elif insn_mnem == 'retn':
                print(f'End at {insn_ea:#x}')
                break
            else:
                insn_ea = insn_ea + idc.get_item_size(insn_ea)

            # if necessary, change the function boundaries to cover the next instruction.
            if not idautils.ida_funcs.get_func(insn_ea):
                func_end_ea = insn_ea + idc.get_item_size(insn_ea)
                idaapi.set_func_end(func_start_ea, func_end_ea)
                print(f'New function boundaries: [{func_start_ea:#x} - {func_end_ea:#x}]')
    else:
        print(f'No function at {cursor_ea:#x}')
To apply the script to the __do_global_init function, we just move our cursor into the function and press ctrl-<.
Here is the simplified result (without nops):
And here is the corresponding pseudocode:
Except for the use of an undocumented calling convention and the combination of calls and jumps to the same code block, this function is pretty simple to understand:
- the password_buf (passed through the data char pointer) is xored with '\xa1'
- if the encoded password_buf contains the word 0x85acaba2 at offset 8 (which is “getf” xored with '\xa1'), we replace the password_buf content with the decoded flag format string (DGA{VT4bl3_h0ok_%x!})
- else, we replace the password_buf content with the ELF file magic bytes (\x7fELF)
It’s worth noting that the returned size is not the same in both cases:
- if the password_buf contains getf, we return a fake size of 0xa8, which gives us 0x2a at the end of fread (source)
- else, we return 0x10, which gives us 0x4 at the end of fread
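The conversion from a byte count to the value seen by the caller comes from how glibc’s fread computes its result: it returns the requested item count when every requested byte was read, and otherwise the number of bytes read divided by the element size. The sketch below is a rough model of that rule, not the library code. The element size of 4 is an assumption inferred from 0xa8 becoming 0x2a and 0x10 becoming 0x4, and the count values are hypothetical since the write-up does not show the call.

```python
def fread_result(bytes_read: int, size: int, count: int) -> int:
    """Model of fread's return value given what the underlying reader reported."""
    bytes_requested = size * count
    if bytes_requested == 0:
        return 0
    # full success returns count, a short (or bogus) read returns whole items only
    return count if bytes_read == bytes_requested else bytes_read // size


if __name__ == "__main__":
    SIZE = 4  # assumption: element size passed to fread
    for count in (1, 4):  # hypothetical item counts
        print(count, hex(fread_result(0xA8, SIZE, count)), hex(fread_result(0x10, SIZE, count)))
    # 1 0x2a 0x4
    # 4 0x2a 0x4
```

With an element size of 4, the two values come out the same whichever count is requested, so the hooked function fully controls the number of rounds used by the encoding loop.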
Actual challenge solving
We can reuse the previous script and change the buffer bytes and rounds count before entering the encryption loop:
# python3 solve.py
Password: 0x934cc553
# ./chall 934cc553getf # make sure we have getf at offset 8.
Valid password found!
Trusted data: DGA{VT4bl3_h0ok_934cc553!}
Congrats!
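As a cross-check that does not need z3, the same loop can be run with plain integers on the new buffer and rounds count. This is a sketch under the assumptions from the analysis above (buffer set to the flag format string, 0x2a rounds, 32-bit truncation), and the function names are mine. It also swaps the solver for a shortcut that is not part of the original script: on 32 bits, once there are at least 4 rounds, the hash is the password xored with its own left shifts by 8, 16 and 24 bits and with a constant that depends only on the buffer. Writing t for the target xored with the hash of 0, the password is then simply t ^ (t << 8).

```python
MASK32 = 0xFFFFFFFF
TARGET = 0x709E9614


def password_hash(password: int, buf: bytes, rounds: int) -> int:
    """Plain-integer version of the encoding loop, parametrised on buffer and rounds."""
    result = key = password & MASK32
    for i in range(1, rounds):
        key = ((key << 8) + buf[i % len(buf)]) & MASK32
        result ^= key
    return result


def recover_password(buf: bytes, rounds: int, target: int = TARGET) -> int:
    """Invert the hash directly (shortcut, valid for rounds >= 4 on 32 bits)."""
    if rounds < 4:
        raise ValueError("shortcut needs the password to be fully shifted out of the key")
    t = target ^ password_hash(0, buf, rounds)  # strip the buffer-only constant
    return (t ^ (t << 8)) & MASK32


if __name__ == "__main__":
    flag_fmt = b"DGA{VT4bl3_h0ok_%x!}"
    print(hex(recover_password(b"\x7fELF", 4)))    # 0xab44c45b
    print(hex(recover_password(flag_fmt, 0x2A)))   # 0x934cc553
    print(hex(password_hash(0x934CC553, flag_fmt, 0x2A)))  # 0x709e9614
```

Both passwords match the ones printed by the z3 script, and since the relation is invertible, each buffer admits exactly one valid 32-bit password.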
Flag
The final flag is: DGA{VT4bl3_h0ok_934cc553!}
Happy Hacking!