Strange Thing
Description
We received this file earlier. We don’t know who it came from, but someone wants to send us a message.
File: strange_thing.hex.
TL;DR
We received a compiled Arduino sketch file which is in Intel HEX format.
After some reverse engineering, we find that the program loops over a Morse code sequence.
We just have to extract and decode it to get the flag.
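To make the Intel HEX format a bit more concrete, here is a minimal sketch of a record parser. The function name parse_ihex and the sample records are illustrative assumptions and are not taken from strange_thing.hex; only data and end-of-file records are handled, which should be enough for an image that fits in 16-bit addresses.

```python
def parse_ihex(text):
    """Return a dict mapping byte address -> value from Intel HEX text."""
    memory = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if not line.startswith(':'):
            raise ValueError('record does not start with ":"')
        raw = bytearray.fromhex(line[1:])
        # The sum of every byte of a record, checksum included, is 0 modulo 256.
        if sum(raw) & 0xFF != 0:
            raise ValueError('bad checksum in record: ' + line)
        length = raw[0]
        address = (raw[1] << 8) | raw[2]
        record_type = raw[3]
        data = raw[4:4 + length]
        if record_type == 0x00:    # data record
            for offset, value in enumerate(data):
                memory[address + offset] = value
        elif record_type == 0x01:  # end-of-file record
            break
        # Other record types (extended addressing, start address) are ignored here.
    return memory


# Made-up sample: one 16-byte data record at 0x0100, then the end-of-file record.
sample = (":10010000214601360121470136007EFE09D2190140\n"
          ":00000001FF\n")
image = parse_ihex(sample)
print(len(image), hex(image[0x0100]))
```

In practice IDA loads the HEX file directly, so a parser like this is only useful for a quick look at the raw bytes.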
Reverse engineering
Processor selection
Having previously designed a challenge around an Arduino firmware (see NorzhCTF 2020 - Door Lock), I immediately suspected a compiled Arduino firmware, but let’s check with a quick Google search:
One of the top results suggests that the sketch was compiled for one of the most widely used microcontrollers on Arduino boards: the ATmega328 (AVR).
Based on this, we can set up IDA and create a new project with the Atmel AVR ATmega328P processor (see the avr_helper project):
Actual reverse-engineering
Disclaimer: The reverse engineering process is similar to the one in my previous writeup on the subject; I suggest you read it for more explanations: NorzhCTF 2020 - Door Lock.
The first thing to do when analyzing microcontroller firmware is to look at the reset handler, which is responsible for loading data into SRAM and starting the program.
On an Arduino, the first instruction executed is located at the very beginning of the read-only memory (i.e. ROM:0000), which is also where the reset and interrupt vectors start.
Here, it’s a jump to the __RESET routine, which is the actual reset handler:
As shown in the illustration above, the reset handler clears the SRAM and then calls the sub_159 routine. No data is loaded into SRAM here: the program relies only on constant values.
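The jump found at ROM:0000 can also be checked by hand from the raw bytes. The sketch below decodes the two-word AVR jmp/call encoding; the function name decode_avr_jmp and the sample bytes are illustrative assumptions, not values taken from strange_thing.hex.

```python
import struct


def decode_avr_jmp(code):
    """Decode a two-word AVR jmp/call; return (mnemonic, word_address) or None."""
    # AVR opcodes are stored as little-endian 16-bit words.
    first, second = struct.unpack_from('<HH', code, 0)
    opcode = first & 0xFE0E
    if opcode == 0x940C:
        mnemonic = 'jmp'
    elif opcode == 0x940E:
        mnemonic = 'call'
    else:
        return None
    # Bits 8..4 and bit 0 of the first word hold the upper six bits of the target;
    # the second word holds the lower sixteen bits.
    high = ((first & 0x01F0) >> 3) | (first & 0x0001)
    return mnemonic, (high << 16) | second


# Hypothetical first four bytes of a firmware image.
print(decode_avr_jmp(bytes.fromhex('0C945C00')))
```

The decoded operand is a word address, so the matching byte offset in the HEX image is twice that value.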
loop()
Based on the reset handler, we can expect a call to one of the default Arduino routines (i.e. setup or loop). Looking at the graph of the sub_159 routine, we get the following view:
That routine loops on itself: we’ve found the loop routine!
If we take a close look at the instructions of this routine, we notice that it makes successive calls to two different routines (ROM:00DD and ROM:0070).
With a bit of reverse engineering, we find that these are the digitalWrite and delay routines, respectively:
Being lazy, I used IDAPython to automatically add comments to these calls.
We immediately notice two things: the delay routine is called with four distinct argument values (i.e. 200, 400, 600 and 1200), and the state of the digital pin alternates between calls. We are probably looking at Morse code!
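The delay values are recovered from four byte-sized registers because Arduino’s delay takes an unsigned long, which avr-gcc passes in r22 (least significant byte) through r25. The sketch below shows that packing for the four observed durations; the helper names regs_to_u32 and u32_to_regs are hypothetical and only serve the illustration.

```python
def regs_to_u32(r22, r23, r24, r25):
    """Combine four 8-bit register values into one 32-bit value (r22 is the LSB)."""
    return r22 | (r23 << 8) | (r24 << 16) | (r25 << 24)


def u32_to_regs(value):
    """Split a 32-bit value into the bytes expected in r22, r23, r24 and r25."""
    return tuple((value >> shift) & 0xFF for shift in (0, 8, 16, 24))


for ms in (200, 400, 600, 1200):
    regs = u32_to_regs(ms)
    # Round trip: packing the bytes again gives back the original duration.
    assert regs_to_u32(*regs) == ms
    print(ms, ['0x%02X' % r for r in regs])
```

In the disassembly, each of these bytes would typically show up as the immediate operand of an ldi instruction placed before the call.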
Let’s reuse our IDAPython script, simulate the loop routine and dump the Morse code sequence:
import ida_funcs
import idautils
import idc


def to_long(reg_dict):
    """Combine four byte-sized register values into one integer (lowest register = LSB)."""
    keys = sorted(reg_dict.keys())
    long_val = 0
    long_val += reg_dict[keys[0]] << 0
    long_val += reg_dict[keys[1]] << 8
    long_val += reg_dict[keys[2]] << 16
    long_val += reg_dict[keys[3]] << 24
    return long_val


# Delay duration (ms) -> Morse symbol or separator.
morse_translate_table = {200: '.',
                         600: '-',
                         400: '/',
                         1200: '\n'}

start_ea = 0x01b9
func_ea = ida_funcs.get_func(start_ea).start_ea
watched_regs = {i: None for i in range(22, 26, 1)}  # r22..r25 carry the call arguments.
last_called_func = None
last_called_func_arg = None
message = ''

for insn_ea in idautils.FuncItems(func_ea):  # for all instructions in the function.
    if insn_ea >= start_ea:  # only emulate instructions from start_ea onwards.
        insn = idautils.DecodeInstruction(insn_ea)  # get the current instruction.
        insn_mnem = insn.get_canon_mnem()  # mnemonic (e.g., call, add, ldi).
        insn_ops = insn.Operands  # operands (e.g., registers, value, address).
        if insn_mnem == 'ldi':
            # load immediate value in reg register.
            reg = insn_ops[0].reg
            value = insn_ops[1].value
            if reg in watched_regs:
                watched_regs[reg] = value  # save register's loaded value.
        if insn_mnem == 'call':
            # call called_func address.
            called_func = ida_funcs.get_func_name(insn_ops[0].addr)
            if called_func == 'delay':
                called_func_arg = to_long(watched_regs)  # registers -> unsigned long.
                idc.set_cmt(insn_ea, 'delay(' + str(called_func_arg) + ');', 0)  # regular comment.
                # If the indicator is on, the duration tells us whether it is a '.' or a '-'.
                # Two consecutive pauses probably mark a letter or word separator.
                if ((last_called_func == 'digitalWrite' and last_called_func_arg == 'HIGH') or
                        (last_called_func == 'delay')):
                    message += morse_translate_table[called_func_arg]  # duration -> Morse symbol.
                # save called_func and called_func_arg.
                last_called_func = called_func
                last_called_func_arg = called_func_arg
            if called_func == 'digitalWrite':  # change indicator state.
                called_func_arg_enum = {0: 'LOW', 1: 'HIGH'}  # human readable indicator states.
                called_func_arg = called_func_arg_enum[watched_regs[24]]  # state -> readable value.
                idc.set_cmt(insn_ea, 'digitalWrite(LED_PIN, ' + called_func_arg + ');', 0)  # regular comment.
                # save called_func and called_func_arg.
                last_called_func = called_func
                last_called_func_arg = called_func_arg

print(message)
Download link: simulation.py.
Here’s the result:
Flag
The final flag is: If-you-did-not-script-you-must-be-100-years-old!
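For completeness, the last step (turning the dumped symbols into text) can be sketched as follows. This assumes that '/' separates letters and a newline separates words, which is what the 400 ms and 1200 ms pauses suggest; the table only covers letters and digits, the name decode_morse is hypothetical, and the sample input is made up rather than taken from the challenge.

```python
MORSE = {
    '.-': 'A', '-...': 'B', '-.-.': 'C', '-..': 'D', '.': 'E', '..-.': 'F',
    '--.': 'G', '....': 'H', '..': 'I', '.---': 'J', '-.-': 'K', '.-..': 'L',
    '--': 'M', '-.': 'N', '---': 'O', '.--.': 'P', '--.-': 'Q', '.-.': 'R',
    '...': 'S', '-': 'T', '..-': 'U', '...-': 'V', '.--': 'W', '-..-': 'X',
    '-.--': 'Y', '--..': 'Z',
    '-----': '0', '.----': '1', '..---': '2', '...--': '3', '....-': '4',
    '.....': '5', '-....': '6', '--...': '7', '---..': '8', '----.': '9',
}


def decode_morse(dump):
    """Decode a dump where '/' separates letters and a newline separates words."""
    words = []
    for line in dump.split('\n'):
        # Unknown codes (e.g. punctuation) are rendered as '?' in this sketch.
        letters = [MORSE.get(code, '?') for code in line.split('/') if code]
        if letters:
            words.append(''.join(letters))
    return ' '.join(words)


# Made-up sample input.
print(decode_morse('..../..\n-/...././.-./.\n'))
```

Punctuation codes would have to be added to the table to decode them as well.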
Happy Hacking!