GREYCTF Finals 2022 Write-ups

mcdulltii
11 min read · Jun 19, 2022

Introduction

I participated in this CTF competition along with beanbeah and ariana.

Final scoreboard

To our surprise, we got first place!

As you can see from our category breakdown, and the final challenge page below, we did not partake in the pwn category at all.

I contributed to the Reverse Engineering (RE) challenges, which I found to be quite doable. Admittedly I took a long time for each challenge because my team was watching Ricky Gervais: SuperNature.

The challenge author for these three RE challenges seems to disagree with me

MBA

In the distrib download, we’re given two files: One bash script and an ELF64 binary.

In the bash script, it reads in flag.txt and password.txt into the FLAG and PASS environment variables. We can emulate the remote server by having our own test flag and password files. The environment variables can also be persisted in the same terminal by using export FLAG=$(cat flag.txt) and export PASS=$(cat password.txt), such that the mba binary can be run by itself.

As I “specialize” in malware obfuscation methods, I knew what the challenge name meant at first glance. MBA, standing for Mixed Boolean-Arithmetic, is an obfuscation method that performs a semantics-preserving transformation from a simple expression to a representation that is hard to understand and analyze. Complex MBA expressions can be reduced again with semantic solvers or similar tools.
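To make the idea concrete, here is a minimal Python sketch of an MBA rewrite. It relies on the well-known identity x + y = (x | y) + (x & y) over 64-bit values. The function names and the expressions are illustrative assumptions and are not taken from the mba binary.

```python
import random

MASK = (1 << 64) - 1  # emulate 64-bit wrap-around arithmetic


def plain_add(x, y):
    """The simple expression: a 64-bit addition."""
    return (x + y) & MASK


def mba_add(x, y):
    """The same addition rewritten with x + y == (x | y) + (x & y)."""
    return ((x | y) + (x & y)) & MASK


def opaque_zero(x, y):
    """Looks input-dependent, but always evaluates to 0."""
    return ((x | y) + (x & y) - x - y) & MASK


if __name__ == "__main__":
    for _ in range(5):
        x, y = random.getrandbits(64), random.getrandbits(64)
        print(hex(x), hex(y), plain_add(x, y) == mba_add(x, y), opaque_zero(x, y))
```

An expression like opaque_zero is why a variable can look heavily computed in the decompilation and still hold a fixed value at runtime.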

There are a couple of MBA reducers on GitHub which I found while tackling this challenge, but there is actually no need to use them.

As you can see from the decompilation of the main function in the mba binary, certain variables and comparisons obviously display MBA obfuscation, such as lines 6-10, 19-23, 44, and 51-54.

These MBA obfuscated regions can be easily tackled by dynamic analysis.

For example, variable v25 in line 6 is initially 0 before the MBA calculations.

and… is still 0 after the calculations…

Well, you get the idea: we can step until the calculations have finished, then read off the “hardcoded” variable and comparison values.

Similarly, applying the same process to lines 19-23 shows that the return value of the hash function is compared with the value 0x3DD99B6C9D29C576.

Let’s look at the hash function then. The hash function initializes hash_value to the hardcoded value 0x5CA1AB1EF01DAB1E. A for loop iterates through our input, performs certain operations on hash_value, and returns it after the loop. We can just copy this function out to see what it actually does in each step.

In the first step, the last byte of hash_value has changed to 0x7f.

The initial last byte was 0x1e. Xor-ing 0x1e with 0x7f gives 0x61, the char a, which is the first character of our input!

The next step rolls the hash_value to the left by a byte, and the loop repeats with the next characters of our input.
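Putting the two steps together, the hash can be sketched in Python as follows. This is a reconstruction from the behaviour described above, not the decompiled code. The names mba_hash and rol64 are made up for illustration, and the sketch assumes a 64-bit state that is rolled left by 8 bits per character.

```python
MASK64 = (1 << 64) - 1
INITIAL_HASH = 0x5CA1AB1EF01DAB1E  # hardcoded starting hash_value


def rol64(value, bits):
    """Rotate a 64-bit value left by the given number of bits."""
    bits %= 64
    return ((value << bits) | (value >> (64 - bits))) & MASK64


def mba_hash(data):
    """xor each input byte into the last byte, then roll left by one byte."""
    hash_value = INITIAL_HASH
    for byte in data:
        hash_value ^= byte
        hash_value = rol64(hash_value, 8)
    return hash_value


if __name__ == "__main__":
    # First step only: xor-ing 'a' into the last byte 0x1e.
    print(hex((INITIAL_HASH ^ ord("a")) & 0xFF))  # prints 0x7f
    print(hex(mba_hash(b"a")))
```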

Now that we know how the hash function works, we can find out the initial value before the hash comparison from before by writing our own unhash function.

In the above script, the unhash function takes in a hex string without the 0x prefix, and applies the lambda function f to it. The function f splits the hex string into chunks of 2 characters and turns each chunk into an integer, essentially changing the hex string into a list of byte values. Then, by xor-ing with the initial hash_value, we can get our correct input.

This script returns ax0rm4nh, which we’ll need to roll once to the right. Why? As hash_value is 8 bytes long, the number of rolls can be taken modulo 8, so the eight left rolls for an 8-character input bring the state back to its original alignment. The first character, however, is xor-ed into the last byte before any roll happens, so it ends up in the last byte of the result, with the remaining characters in the bytes in front of it. As such, we need to reverse that leftover left roll by rolling once to the right.
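A minimal sketch of such an unhash step in Python is given below. It is not the original script: the names are illustrative, and it assumes the hash xors each character into the last byte and then rolls left by 8 bits, so it only works for inputs of exactly 8 characters.

```python
INITIAL_HASH = 0x5CA1AB1EF01DAB1E  # hardcoded starting hash_value
TARGET_HASH = 0x3DD99B6C9D29C576   # value the hash is compared against


def unhash(target_hex, initial=INITIAL_HASH):
    """Recover an 8-character input from a hex digest given without 0x."""
    # Split the hex string into chunks of 2 characters and parse each one.
    target_bytes = [int(target_hex[i:i + 2], 16)
                    for i in range(0, len(target_hex), 2)]
    initial_bytes = list(initial.to_bytes(8, "big"))
    # xor with the initial hash_value to strip it off again.
    rolled = "".join(chr(t ^ s) for t, s in zip(target_bytes, initial_bytes))
    # Undo the leftover left roll by rolling the string once to the right.
    return rolled[-1] + rolled[:-1]


if __name__ == "__main__":
    print(unhash(format(TARGET_HASH, "016x")))  # prints hax0rm4n
```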

When running the hash function with hax0rm4n, the hash is indeed 0x3DD99B6C9D29C576. The main function checks whether the PASS environment variable is hax0rm4n, then prompts for 2 inputs. Both inputs are retrieved by a getMessage function.

The getMessage function essentially splits our input string at the char :. The string after : is hashed and compared with the string before :. So our input format will be hash(string):string.

The string part of the first message is compared with the string administrator. The second message is hashed, then compared with the hash of the PASS variable, which we have already found. As such, our input shall be as follows:

hash(administrator):administrator
0x3DD99B6C9D29C576:hax0rm4n

The hash of the string administrator is calculated to be 0x6edf0d59b8ad0299.

By inputting 0x6edf0d59b8ad0299:administrator, then 0x3DD99B6C9D29C576:hax0rm4n into the remote server, we get our flag.
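For reference, both input lines can be rebuilt with a short Python sketch. The hash is the same reconstruction as described earlier (xor into the last byte, roll left by 8 bits), and make_message is a hypothetical helper name. The lowercase hex formatting is an assumption; the write-up shows both upper and lower case digits.

```python
MASK64 = (1 << 64) - 1
INITIAL_HASH = 0x5CA1AB1EF01DAB1E  # hardcoded starting hash_value


def mba_hash(data):
    """Reconstructed hash: xor a byte into the last byte, roll left by 8 bits."""
    h = INITIAL_HASH
    for byte in data:
        h ^= byte
        h = ((h << 8) | (h >> 56)) & MASK64
    return h


def make_message(text):
    """Format text as hash(text):text, the layout getMessage appears to expect."""
    return "0x{:x}:{}".format(mba_hash(text.encode()), text)


if __name__ == "__main__":
    print(make_message("administrator"))  # prints 0x6edf0d59b8ad0299:administrator
    print(make_message("hax0rm4n"))       # prints 0x3dd99b6c9d29c576:hax0rm4n
```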

Flag: grey{A_M4st3r_B1n4ry_An4Lyst_OOOO}

This is sus

In the distrib download, we’re given two files: One bash script and an ELF32 binary.

The bash script runs the sus binary, then echoes different strings depending on the return value of the binary. It looks like the return value should be zero to print the last string.

In the sus binary, the decompiled start function is as follows.

The __libc_csu_init function references two function tables, as shown below.

The function essentially retrieves the offset between the two function tables, and runs the functions between them.

Among the 3 function offsets, sub_8049740 prints an input prompt.

Before printing the prompt, the function calls __bsd_signal with the argument 11, which is SIGSEGV on Linux. When the binary is debugged, the debugger warns about a SIGSEGV signal, which can be inferred to be related to this call. It seems to be some sort of anti-debugger trick, but it doesn’t hinder debugging beyond the warning about the signal.

The function offset _dl_start_1 is passed into the signal function, so I assumed it would be run after the input prompt. It calls the sub_804A080 function below.

This function calls __libc_read, which reads 500 bytes from STDIN into the char array inpt. Then, this array is passed into sub_804A350.

As seen above, the range_256 array is initialized with the values 0 to 255, then shuffled using the characters of string s at index (j % s_len). String s is static, so the shuffled array is the same on every run and can be retrieved dynamically.

Next comes a do while loop, which swaps two bytes in the array on every cycle. The number in range_256 at index (*v10+v11) is xor-ed with each input character in turn, and the result is written to the array inpt_1.

After the loop, each character in the output array is rolled left by 4, and compared with a hardcoded byte array at byte_80E50A0.

Our input after all the operations will result in the hardcoded array at byte_80E50A0. To reverse these steps, each character in the byte_80E50A0 array is rolled right by 4, then xor-ed in the do while loop to get our initial input.

As we have the initial range_256 array, we can emulate the do while loop as above and retrieve each byte used for the xor.
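The shuffle and the swap-and-xor loop described above closely resemble RC4. The sketch below assumes, without confirmation from the binary, that the routine is standard RC4. The function name keystream and the demo key are illustrative only, and the real string s is not reproduced here.

```python
def keystream(key, length):
    """Generate `length` xor bytes the way standard RC4 would."""
    # Key scheduling: start from 0..255 and shuffle with key[i % len(key)].
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) & 0xFF
        state[i], state[j] = state[j], state[i]
    # Keystream loop: swap two entries per cycle and emit one xor byte.
    i = j = 0
    out = []
    for _ in range(length):
        i = (i + 1) & 0xFF
        j = (j + state[i]) & 0xFF
        state[i], state[j] = state[j], state[i]
        out.append(state[(state[i] + state[j]) & 0xFF])
    return out


if __name__ == "__main__":
    # Public RC4 test vector, used only to sanity-check the sketch.
    stream = keystream(b"Key", 9)
    print(bytes(p ^ k for p, k in zip(b"Plaintext", stream)).hex())
```

If the assumption holds, the xor bytes for this challenge would come from calling keystream with the bytes of string s and the length of the hardcoded array.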

Now that we have the xor bytes used, we are left with rolling right each character by 4, and the xor itself.

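# match_chars: bytes from byte_80E50A0; used_chars: xor bytes from the emulated loop
def ror(val, r_bits, max_bits):
    # Rotate val right by r_bits within a max_bits-wide word
    mask = (1 << max_bits) - 1
    r_bits %= max_bits
    return ((val & mask) >> r_bits) | ((val << (max_bits - r_bits)) & mask)
ror_chars = [ror(i, 4, 8) for i in match_chars]
print(bytearray(i ^ j for i, j in zip(ror_chars, used_chars)))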

This script will output our desired input, as below. And there’s the flag!

`b"Never gonna give you flag\nNever gonna let you down\nNever gonna run around and desert you\nNever gonna make you cry\nNever gonna say goodbye\nNever gonna tell a lie and hurt you\nflag{<3_N3v3r_G0nn4_G1v3_u_Up_<3}\nWe\'ve known each other for so long\nYour heart\'s"`

Flag: flag{<3_N3v3r_G0nn4_G1v3_u_Up_<3}

Untransparent 2

In the distrib download, we’re given an ELF64 binary.

Loading this file into IDA gives errors when graphing and decompiling, so we’ll continue using the disassembly view.

The main function consists of hundreds, if not thousands, of instruction blocks just like the one above. I’m assuming the author used C++ templating to generate these assembly blocks inline. The differences between blocks are minor, such as the address offset used in the first mov instruction, the integer used in subtraction in the second sub instruction, and the jump location in the fourth jz instruction.

By running through the binary dynamically, we can roughly understand the program flow. In each block, a value saved on the stack is loaded into a register, and a hardcoded value is subtracted from it. The result of the subtraction is then moved into another address on the stack, and the jump is taken only when the subtraction yields zero. If the result is not zero, execution continues to the next instruction block, which performs the same operations.

There are only certain blocks which have jump locations to “win” addresses, such as the one block below.

This instruction block subtracts the value at the offset compare_win by 1873574810, and jumps to loc_4E83A3 if the result is 0. The address of the jump prints a different string from normal, “The flag reveals itself”.

It then moves a hardcoded value into the stack at offset compare_win and jumps back to the start of all the instruction blocks.
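The overall shape resembles a state machine: one stack value selects which block runs next. The Python sketch below models that reading under stated assumptions. Only the constant 1873574810 comes from the binary; the other constants, the block labels, and the name run_dispatcher are hypothetical.

```python
WIN_STATE = 1873574810  # constant quoted above for the "win" block


def run_dispatcher(state, blocks, max_rounds=16):
    """Walk a chain of (constant, handler) blocks like the sub/jz chain."""
    trace = []
    for _ in range(max_rounds):
        for constant, handler in blocks:
            # Each block subtracts its constant and jumps only on zero.
            if state - constant == 0:
                label, state = handler()  # handler stores the next state
                trace.append(label)
                break  # jump back to the start of all the blocks
        else:
            break  # no block matched, so the chain is exhausted
    return trace


if __name__ == "__main__":
    # Hypothetical blocks; 111, 222 and 333 are placeholder states.
    blocks = [
        (111, lambda: ("setup", 222)),
        (WIN_STATE, lambda: ("The flag reveals itself", 333)),
        (222, lambda: ("check", WIN_STATE)),
    ]
    print(run_dispatcher(111, blocks))
```

The order of the blocks in memory does not matter in this model, which matches the observation that only the stored value decides where execution goes next.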

Stepping through the process manually is not feasible as the number of instruction blocks is far too high. However, by scrolling through the instruction blocks, some blocks are found to perform different operations.

This is one such block, which compares a value stored in the stack with the ASCII character 0. If the comparison returns true, the register cl is set and and-ed with 1, then moved back into the stack.

In IDA, by text searching for ASCII characters, we found that all the digits, letters (lower and upper case), !, @, {, and } are present.

By manually going through all the instruction blocks like the one above, which contain these character comparisons, and setting breakpoints on all of them, we can find out how our input relates to the flag checking algorithm, without having to reverse engineer the usage of the hardcoded values.
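The manual search could in principle be scripted. The Python sketch below scans a text listing for cmp instructions against the characters mentioned above and prints GDB-style breakpoint commands. The listing format, the addresses, and the names CMP_RE and find_char_compares are all made up for illustration; a real IDA export would need its own pattern.

```python
import re
import string

# Characters reported as present in the binary's comparisons.
CHARSET = set(string.ascii_letters + string.digits + "!@{}")

# Matches lines such as "401010 cmp al, 67h" in a simplified, made-up format.
CMP_RE = re.compile(
    r"^(?P<addr>[0-9A-Fa-f]+)\s+cmp\s+\w+,\s*(?P<imm>[0-9A-Fa-f]+)h$"
)


def find_char_compares(listing):
    """Return (address, character) pairs for cmp instructions against CHARSET."""
    hits = []
    for line in listing.splitlines():
        match = CMP_RE.match(line.strip())
        if not match:
            continue
        value = int(match.group("imm"), 16)
        if value < 0x80 and chr(value) in CHARSET:
            hits.append((int(match.group("addr"), 16), chr(value)))
    return hits


if __name__ == "__main__":
    sample = """
    401000 sub eax, 1234h
    401010 cmp al, 67h
    401020 cmp al, 30h
    401030 cmp al, 0Ah
    """
    for addr, char in find_char_compares(sample):
        print(f"break *{addr:#x}  # compares against {char!r}")
```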

Taking the output string “The flag reveals itself” as a hint, I used a random input when debugging the binary. After the input is read, certain byte operations are performed, a hardcoded value is stored on the stack at offset win_var, then execution jumps to the start of the instruction blocks. The flag should supposedly reveal itself in the program flow from here.

Continuing in the debugger shows that the binary builds a table on the stack storing all the ASCII characters mentioned. Afterwards, the next breakpoint hits an instruction block containing a character comparison with g.

And so, that’s the start of the flag check. The subsequent breakpoints hit will show the flag characters in sequence. i.e. character comparison with r, then e, then y, and so on.

Continuing until the program ends, the flag characters have indeed revealed themselves.

Flag: grey{y0u_aR3_a_Pr0fe551on4l}

Afterthoughts

If you were to look at our last solve time, 10:21:00AM, you would think that we hoarded that challenge flag until just before the competition ended at 10:30:00AM.

Well, we actually didn’t! We were stuck on finding the flag for more than 30 minutes after gaining admin access. It was really a clutch moment to find the correct command to retrieve the flag.

Before this challenge flag submission, we were actually in 3rd place. This last flag closed the gap between us and the top two teams, and let us scrape by into 1st place!

The first prize is $9000!!!

And, more flexing for first bloods and full solve on the RE category. (I’m not writing a writeup on oneliner)

On another note, despite the organizers admitting that they had a shortage of challenge creators and time, it was a fair decision to have an equal number of challenges in each category. The in-person event was also organized well, with adequate (free) food, snacks and drinks to last us the entire competition, given that our team had the least amount of sleep amongst the 10 teams.

A massive thank you to all the organizers, sponsors, and participants for making this event a huge success! (And my team members for hard carrying our team too $$$)
