Introduction #
This adventure into the wonderful world of reverse engineering is brought to you by the great challenges available on Hack the Box. I picked one of the easier challenges available on their website to test my slowly growing knowledge and understanding of how to use tools such as Ghidra, IDA and Binary Ninja. This particular sample was analysed statically using Ghidra. I’ve included a little spiel below about the program, just in case anyone was interested. Otherwise feel free to skip and roll right on through to the analysis portion of this article!
Analysis Environment #
I will be using an M1 Apple Silicon MacBook to perform analysis of this binary. I have built out a malware analysis VM environment on my machine which houses an assortment of macOS, *nix and Windows operating systems to perform both dynamic and static analysis safely (but more on this in a separate post!).
What is Ghidra? #
Ghidra is a wonderful open-source disassembler and decompiler that was developed by NSA’s Research Directorate to support their Cybersecurity mission. It’s an incredible tool with a lot of cool and honestly life-changing functions that display assembly code in an easy-to-read format.
This particular tool is one of the perfect starting points for people new to the field because the cost of entry is nothing (unlike Binary Ninja and IDA Pro). That’s right, you can use it for free! I would also recommend getting a copy of The Ghidra Book: The Definitive Guide, co-written by Kara Nance and the impressive Chris Eagle (that’s right, the same chap that wrote The IDA Pro Book!). It is a very comprehensive introduction (and really deep dive) into all the functionality offered by the platform, starting off with a wonderful introduction section that teaches you how to reverse engineer using the tool.
Ghidra is available for Linux, macOS, and Windows.
The Challenge #
The zip archive of this challenge can be found on the Hack The Box website. I have provided the hashes of the executable below for reference.
Hash | Value (filename: encrypt) |
---|---|
SHA-256 | 035335fc74d59ff91f295c0ab0b9fbb0ff99060e508f7035004ab53604bee293 |
SHA-1 | fa61d18fdd736213b11f7a51f9923f2bc7cde9e9 |
MD5 | 00ed50d5e50ab99e6a0c4911043b5dd3 |
Download: Hack The Box
Unzipping the archive revealed two key artefacts: encrypt and flag.enc. The initial theory was that the executable wrote an encoded file into the directory it was run in. Let’s get stuck into it and start understanding this!
Step 1: Action Plan #
Whenever I analyse a sample, the first step I generally take is to assess what information I have been provided with and come up with an “action plan” to guide my analysis. As someone who is easily pulled into rabbit holes, this approach often helps me stay on track when performing analysis.
For this sample, I came up with the following plan:
- What information/clues has the ZIP archive provided me with that could assist in understanding the function of this binary?
  - We have the executable titled encrypt and a file titled flag.enc (let’s remember this name)
- Figure out the format of the executable: what platform does it normally run on? What VMs should I use to mitigate any risk in executing this?
  - Running file at the command line: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter. Let’s run static analysis on macOS or *nix then.
- What am I trying to achieve in analysing this sample?
  - Understand the encryption algorithm through static analysis
  - If any additional questions come up, run the sample dynamically inside a VM
  - Write a decrypter for this sample to reveal the text (i.e. our flag)
With these broad questions in mind, let’s get started.
Step 2: Ghidra Project & Function Analysis #
We need a logical starting point for our analysis. Depending on your learning style, there will be multiple ways to approach this as your eyeballs are overwhelmed by the delightful spluttering of assembly and decompiled text of a new, unfamiliar binary. However, I would always recommend you follow through the default, auto analysis process provided by Ghidra until you are familiar enough with the program to make changes.
Open up Ghidra and start a new project. Ghidra should automatically assess that the binary is x86:LE:64:default:gcc, which immediately gives us some additional clues as to what we are dealing with (if you didn’t do any command line sleuthing beforehand).
Once your project is set up, double-click your imported file (in this case encrypt) and begin auto-analysis.
Figure 1: Auto-Analysis Prompt in Ghidra
Analysis Options #
By default, Ghidra switches the majority of auto-analysis options on. Following discussions with a friend, I now perform some additional checks before just hitting Analyze. Let’s double-check that the ASCII Strings “minimum string length” option is set to LEN_4 and (only because this program is small) turn on Decompiler Parameter ID for good measure.
Figure 2: Auto Analysis Options in Ghidra
Now hit apply and analyze.
Analysing Functions #
Now we are ready to look at this program. Ghidra will provide a breakdown of all the functions present in the opened binary. In this case, being quite a small and simple program, there are not too many functions present. Of these, main seems like a very logical place to start analysis.
Moving to the main function, Ghidra provides two views: the assembly listing and the decompiled C.
We will start our analysis looking at the Decompiled C code for ease of readability and then compare this with the assembly output.
undefined8 main(void)
{
  int iVar1;
  time_t tVar2;
  long in_FS_OFFSET;
  uint local_40;
  uint local_3c;
  long local_38;
  FILE *local_30;
  size_t local_28;
  void *local_20;
  FILE *local_18;
  long local_10;

  local_10 = *(long *)(in_FS_OFFSET + 0x28);
  local_30 = fopen("flag","rb");
  fseek(local_30,0,2);
  local_28 = ftell(local_30);
  fseek(local_30,0,0);
  local_20 = malloc(local_28);
  fread(local_20,local_28,1,local_30);
  fclose(local_30);
  tVar2 = time((time_t *)0x0);
  local_40 = (uint)tVar2;
  srand(local_40);
  for (local_38 = 0; local_38 < (long)local_28; local_38 = local_38 + 1) {
    iVar1 = rand();
    *(byte *)((long)local_20 + local_38) = *(byte *)((long)local_20 + local_38) ^ (byte)iVar1;
    local_3c = rand();
    local_3c = local_3c & 7;
    *(byte *)((long)local_20 + local_38) =
         *(byte *)((long)local_20 + local_38) << (sbyte)local_3c |
         *(byte *)((long)local_20 + local_38) >> 8 - (sbyte)local_3c;
  }
  local_18 = fopen("flag.enc","wb");
  fwrite(&local_40,1,4,local_18);
  fwrite(local_20,1,local_28,local_18);
  fclose(local_18);
  if (local_10 != *(long *)(in_FS_OFFSET + 0x28)) {
    /* WARNING: Subroutine does not return */
    __stack_chk_fail();
  }
  return 0;
}
Figure 3: The Main Function
At first glance, this code seems quite overwhelming. There are a lot of local variables that are doing… things? Resulting in another thing. Perhaps what is most interesting here though is the presence of a very particular file, flag.enc, surrounded by some even more interesting calls. A quick squiz of the accompanying assembly code reveals the presence of an XOR, which could have something to do with the encryption function of this binary. Well worth a look.
Step 3: Analysing Main #
Having identified main as a function of interest, this section of our analysis will focus on systematically working through the decompiled code to see whether we can get a good understanding of how the encryption process is being performed by this program.
local_10 = *(long *)(in_FS_OFFSET + 0x28);
local_30 = fopen("flag","rb");
fseek(local_30,0,2);
local_28 = ftell(local_30);
fseek(local_30,0,0);
local_20 = malloc(local_28);
fread(local_20,local_28,1,local_30);
fclose(local_30);
Figure 4: Before the Encryption, Start of Main Function
This segment of code looks to open a file called flag in read mode, work out its size, read its contents into allocated memory and then close the file. However, the default variable names used by Ghidra don’t really tell us that story clearly. Let’s clean it up so it’s more readable. To do this in Ghidra, simply click on the variable you want to rename and then press L on your keyboard. You’ll be presented with a prompt where you can input a new name that makes more sense to you in your analysis!
/* ignore first line for the moment, likely a stack canary */
stackCanary = *(long *)(in_FS_OFFSET + 0x28);
/* open flag file in read mode */
originalFile = fopen("flag","rb");
/* right click and set equate in ghidra to automatically tell you what seek values are */
/* i used this to designate 2 as SEEK_END */
/* check man page for fseek to determine the meaning of the arguments */
fseek(originalFile,0,SEEK_END);
/* go to the end of the file */
/* ftell called to determine the size of the file*/
/* what byte position is it in? */
/* store file size in variable fileSize */
fileSize = ftell(originalFile);
/* return to the beginning of the file again */
fseek(originalFile,0,SEEK_SET);
/* allocate fileSize bytes of memory and assign the buffer to variable fileSizeInMem */
fileSizeInMem = malloc(fileSize);
/* read the whole file into that buffer */
fread(fileSizeInMem,fileSize,1,originalFile);
/* close the flag file */
fclose(originalFile);
Figure 5: Translating the Variables in the Main Function using Ghidra
Figure 5 shows the results of walking through the code, line by line, to determine the purpose of each variable. We are left with a better understanding of what data is considered important and where and how it is being stored by the binary. We also comment the code for readability’s sake, both for when we return to our analysis in the future and to assist understanding in the community.
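The fseek/ftell/fread sequence walked through above is a common way to load a whole file into memory. As a rough sketch only, here is the same sequence in plain C with the error checks the decompiled code does not perform. The helper name read_whole_file and its parameters are my own invention for illustration and do not appear in the binary.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not in the binary): read an entire open file into a
 * freshly allocated buffer using the same fseek/ftell/fread sequence as main,
 * but with error checks added. Returns NULL on failure. */
unsigned char *read_whole_file(FILE *fp, size_t *out_len)
{
    if (fseek(fp, 0, SEEK_END) != 0)      /* jump to the end of the file */
        return NULL;
    long size = ftell(fp);                /* offset at the end == size in bytes */
    if (size < 0 || fseek(fp, 0, SEEK_SET) != 0)   /* rewind to the start */
        return NULL;

    /* allocate at least one byte so an empty file still yields a valid pointer */
    unsigned char *buf = malloc(size > 0 ? (size_t)size : 1);
    if (buf == NULL)
        return NULL;
    if (size > 0 && fread(buf, 1, (size_t)size, fp) != (size_t)size) {
        free(buf);
        return NULL;
    }
    *out_len = (size_t)size;
    return buf;
}
```

The caller owns the returned buffer and is expected to free it once done.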
We continue through the rest of the code in main until all of the variables make a little more sense to us.
undefined8 main(void)
{
  int randomNumberGen;
  time_t setTime;
  long in_FS_OFFSET;
  uint castTime;
  uint ranNumGen_2;
  long byteIncrement;
  FILE *originalFile;
  size_t fileSize;
  void *fileSizeInMem;
  FILE *encodedFile;
  long stackCanary;

  /* ignore first line for the moment, likely a stack canary */
  stackCanary = *(long *)(in_FS_OFFSET + 0x28);
  /* open flag file in read mode */
  originalFile = fopen("flag","rb");
  /* right click and set equate in ghidra to automatically tell you what seek values are */
  /* i used this to designate 2 as SEEK_END */
  /* check man page for fseek to determine the meaning of the arguments */
  /* go to the end of the file */
  fseek(originalFile,0,SEEK_END);
  /* ftell called to determine the size of the file */
  /* what byte position is it in? */
  /* store file size in variable fileSize */
  fileSize = ftell(originalFile);
  /* return to the beginning of the file again */
  fseek(originalFile,0,SEEK_SET);
  /* allocate fileSize bytes of memory and assign the buffer to variable fileSizeInMem */
  fileSizeInMem = malloc(fileSize);
  /* read the whole file into that buffer */
  fread(fileSizeInMem,fileSize,1,originalFile);
  /* close the flag file */
  fclose(originalFile);
  /* get time and place into variable setTime */
  setTime = time((time_t *)0x0);
  /* turn into an unsigned integer, meaning the value cannot be negative */
  castTime = (uint)setTime;
  /* seed rand() with castTime. If you know the time, you have the seed for this random number generator */
  srand(castTime);
  /* now this is a weird section, a little hard to read */
  for (byteIncrement = 0; byteIncrement < (long)fileSize; byteIncrement = byteIncrement + 1) {
    randomNumberGen = rand();
    *(byte *)((long)fileSizeInMem + byteIncrement) =
         *(byte *)((long)fileSizeInMem + byteIncrement) ^ (byte)randomNumberGen;
    ranNumGen_2 = rand();
    ranNumGen_2 = ranNumGen_2 & 7;
    *(byte *)((long)fileSizeInMem + byteIncrement) =
         *(byte *)((long)fileSizeInMem + byteIncrement) << (sbyte)ranNumGen_2 |
         *(byte *)((long)fileSizeInMem + byteIncrement) >> 8 - (sbyte)ranNumGen_2;
  }
  /* open the file called flag.enc with write privileges */
  /* the b here requests binary mode, for system compatibility */
  encodedFile = fopen("flag.enc","wb");
  /* write the 4 bytes of the castTime variable (the seed) into the file */
  fwrite(&castTime,1,4,encodedFile);
  /* write the contents of fileSizeInMem into the file, for a total byte size of the fileSize variable */
  fwrite(fileSizeInMem,1,fileSize,encodedFile);
  /* close the file */
  fclose(encodedFile);
  /* this is the function end */
  if (stackCanary != *(long *)(in_FS_OFFSET + 0x28)) {
    /* WARNING: Subroutine does not return */
    __stack_chk_fail();
  }
  return 0;
}
Figure 6: Mostly Deobfuscated Main Function of Encrypt Binary
Let’s summarise what information we can gather now we have translated the main function:
- A file called flag is read in by the program
- The file size of flag is recorded
- The current time is recorded and changed into an unsigned integer
- The variable castTime is used as the seed for srand (and therefore we have the seed key)
- Funky code containing some bitwise operations, bit shifting and number generation (we will dig into this deeper)
- Open the file flag.enc and place the contents of castTime into the file
- Write the contents of fileSizeInMem into the file
- Close the file
- Exit program
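To make the middle of that summary a little more concrete, the loop can be condensed into a small standalone function. This is only a sketch that assumes the decompiled output is accurate. The name encrypt_buffer and its parameters are hypothetical, and the seed is passed in rather than taken from time() so that the behaviour is repeatable.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Sketch of the per-byte transformation seen in the decompiled loop.
 * encrypt_buffer and its parameter names are illustrative, not from the binary. */
void encrypt_buffer(uint8_t *data, size_t len, unsigned int seed)
{
    srand(seed);                                     /* the binary seeds with time() */
    for (size_t i = 0; i < len; i++) {
        data[i] ^= (uint8_t)rand();                  /* first rand(): XOR with its low byte */
        unsigned int k = (unsigned int)rand() & 7;   /* second rand(): keep 3 bits, 0..7 */
        /* shift left by k, OR with shift right by (8 - k), as in the decompiler */
        data[i] = (uint8_t)((data[i] << k) | (data[i] >> (8 - k)));
    }
}
```

Because the only secret input is the seed, two runs with the same seed on the same C library produce identical output.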
From the above, the only component of the code that Ghidra struggled to make sense of was the following segment:
/* now this is a weird section, a little hard to read */
for (byteIncrement = 0; byteIncrement < (long)fileSize; byteIncrement = byteIncrement + 1) {
  randomNumberGen = rand();
  *(byte *)((long)fileSizeInMem + byteIncrement) =
       *(byte *)((long)fileSizeInMem + byteIncrement) ^ (byte)randomNumberGen;
  ranNumGen_2 = rand();
  ranNumGen_2 = ranNumGen_2 & 7;
  *(byte *)((long)fileSizeInMem + byteIncrement) =
       *(byte *)((long)fileSizeInMem + byteIncrement) << (sbyte)ranNumGen_2 |
       *(byte *)((long)fileSizeInMem + byteIncrement) >> 8 - (sbyte)ranNumGen_2;
}
Figure 7: Confusing Code Segment
But why is this significant? Well, judging by the presence of rand(), the left and right shifts, and the bitwise XOR, AND and OR operations, it is highly likely that this segment of code is our encryption component.
Here, we will need to reference the assembly instructions to see whether it provides any additional insights as to what series of operations are being performed here. We can achieve this through highlighting this segment of code in the decompiler, which will highlight the corresponding code in the assembly pane.
Figure 8: Equivalent Assembly Instructions
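Looking at the assembly, there are two key instructions here (as highlighted) that provide clues as to what is going on. The shift left, shift right and OR expression in the decompiler is Ghidra’s attempt to logically represent a Rotate on Left (ROL) in C, which has no rotate operator of its own. Together with the XOR, this is definitely our encryption algorithm.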
Figure 9: Depiction of Rotate on Left
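To see why the shift-and-OR expression amounts to a rotate, here is a minimal sketch of an 8-bit rotate left written the same way. The function name rotl8 is my own and does not come from the binary.

```c
#include <stdint.h>

/* Rotate an 8-bit value left by k bits. The bits pushed out of the top by the
 * left shift are brought back in at the bottom by the right shift of (8 - k). */
uint8_t rotl8(uint8_t value, unsigned int k)
{
    k &= 7;   /* same mask as the binary: the rotation amount is always 0..7 */
    return (uint8_t)((value << k) | (value >> (8 - k)));
}
```

For example, rotl8(0xB4, 3) takes the bits 10110100 and yields 10100101 (0xA5): the top three bits 101 wrap around to the bottom. No bits are lost, which is what makes the operation reversible.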
Let’s amend our summary then:
- A file called flag is read in by the program
- The file size of flag is recorded
- The current time is recorded and changed into an unsigned integer
- The variable castTime is used as the seed for srand (and therefore we have the seed key)
- For each byte of the file:
  - call rand(), store the result in randomNumberGen and XOR its low byte with the current byte
  - call rand() again, store the result in ranNumGen_2 and perform a bitwise AND with 7, leaving a value between 0 and 7
  - perform ROL on the byte by that many bits
  - place the result of the encryption back into the buffer fileSizeInMem
- Open the file flag.enc and place the contents of castTime into the file
- Write the contents of fileSizeInMem into the file
- Close the file
- Exit program
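One consequence of the summary above is that flag.enc starts with the four bytes of castTime, followed by the encrypted data. As a sketch, the seed could be rebuilt from those first four bytes as shown below. The helper name seed_from_header is hypothetical, and the byte order is an assumption based on the binary being a little-endian (LSB) x86-64 executable that writes the integer straight from memory.

```c
#include <stdint.h>

/* Hypothetical helper: rebuild the 32-bit seed from the first four bytes of
 * flag.enc. Assumes little-endian order, as written by an x86-64 binary. */
uint32_t seed_from_header(const uint8_t header[4])
{
    return (uint32_t)header[0]
         | ((uint32_t)header[1] << 8)
         | ((uint32_t)header[2] << 16)
         | ((uint32_t)header[3] << 24);
}
```

Assembling the value byte by byte keeps the helper independent of the endianness of the machine doing the analysis.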
As we explore writing a program to decrypt the data in flag.enc, our understanding of the above summary will likely change as we come to grips with the nuances of the decryption process.
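As a first, untested-against-the-real-flag sketch of what that reversal might look like, each step can be undone in the opposite order: rotate right by the same amount, then XOR with the same byte. The names rotr8 and decrypt_buffer are my own. The sketch assumes the seed is already known and that rand() comes from the same C library implementation that produced the file, since the rand() sequence is not portable between implementations.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Rotate an 8-bit value right by k bits: the inverse of a rotate left by k. */
static uint8_t rotr8(uint8_t value, unsigned int k)
{
    k &= 7;
    return (uint8_t)((value >> k) | (value << (8 - k)));
}

/* Hypothetical inverse of the encryption loop; names are illustrative only. */
void decrypt_buffer(uint8_t *data, size_t len, unsigned int seed)
{
    srand(seed);                                     /* replay the same rand() sequence */
    for (size_t i = 0; i < len; i++) {
        uint8_t key = (uint8_t)rand();               /* first value: the XOR byte */
        unsigned int k = (unsigned int)rand() & 7;   /* second value: rotation amount */
        /* undo the rotate first, then the XOR (reverse order of encryption) */
        data[i] = (uint8_t)(rotr8(data[i], k) ^ key);
    }
}
```

Note that rand() still has to be called in the original order (XOR value first, rotation second) even though the operations themselves are applied in reverse.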
Conclusion #
Post analysis of this file, we are left with a fairly good understanding of the encoding function performed by this binary. The entire main function has been changed into a much easier to read block of code, with one slightly funky section that was the result of Ghidra attempting to write out a ROL instruction somewhat logically (our major gotcha for this sample analysis). Figure 10 shows the results of our hard work.
undefined8 main(void)
{
  int randomNumberGen;
  time_t setTime;
  long in_FS_OFFSET;
  uint castTime;
  uint ranNumGen_2;
  long byteIncrement;
  FILE *originalFile;
  size_t fileSize;
  void *fileSizeInMem;
  FILE *encodedFile;
  long stackCanary;

  /* ignore first line for the moment, likely a stack canary */
  stackCanary = *(long *)(in_FS_OFFSET + 0x28);
  /* open flag file in read mode */
  originalFile = fopen("flag","rb");
  /* right click and set equate in ghidra to automatically tell you what seek values are */
  /* i used this to designate 2 as SEEK_END */
  /* check man page for fseek to determine the meaning of the arguments */
  /* go to the end of the file */
  fseek(originalFile,0,SEEK_END);
  /* ftell called to determine the size of the file */
  /* what byte position is it in? */
  /* store file size in variable fileSize */
  fileSize = ftell(originalFile);
  /* return to the beginning of the file again */
  fseek(originalFile,0,SEEK_SET);
  /* allocate fileSize bytes of memory and assign the buffer to variable fileSizeInMem */
  fileSizeInMem = malloc(fileSize);
  /* read the whole file into that buffer */
  fread(fileSizeInMem,fileSize,1,originalFile);
  /* close the flag file */
  fclose(originalFile);
  /* get time and place into variable setTime */
  setTime = time((time_t *)0x0);
  /* turn into an unsigned integer, meaning the value cannot be negative */
  castTime = (uint)setTime;
  /* seed rand() with castTime. If you know the time, you have the seed for this random number generator */
  srand(castTime);
  /* for each byte in the file, until the length of the file is reached, complete the following */
  for (byteIncrement = 0; byteIncrement < (long)fileSize; byteIncrement = byteIncrement + 1) {
    /* place the next rand() value into randomNumberGen and XOR its low byte with the current byte */
    randomNumberGen = rand();
    *(byte *)((long)fileSizeInMem + byteIncrement) =
         *(byte *)((long)fileSizeInMem + byteIncrement) ^ (byte)randomNumberGen;
    /* place the next rand() value into ranNumGen_2 */
    ranNumGen_2 = rand();
    /* perform bitwise AND between ranNumGen_2 and digit 7, leaving a value from 0 to 7 */
    ranNumGen_2 = ranNumGen_2 & 7;
    /* perform ROL on the byte by ranNumGen_2 bits */
    *(byte *)((long)fileSizeInMem + byteIncrement) =
         *(byte *)((long)fileSizeInMem + byteIncrement) << (sbyte)ranNumGen_2 |
         *(byte *)((long)fileSizeInMem + byteIncrement) >> 8 - (sbyte)ranNumGen_2;
  }
  /* open the file called flag.enc with write privileges */
  /* the b here requests binary mode, for system compatibility */
  encodedFile = fopen("flag.enc","wb");
  /* write the 4 bytes of the castTime variable (the seed) into the file */
  fwrite(&castTime,1,4,encodedFile);
  /* write the contents of fileSizeInMem into the file, for a total byte size of the fileSize variable */
  fwrite(fileSizeInMem,1,fileSize,encodedFile);
  /* close the file */
  fclose(encodedFile);
  /* this is the function end */
  if (stackCanary != *(long *)(in_FS_OFFSET + 0x28)) {
    /* WARNING: Subroutine does not return */
    __stack_chk_fail();
  }
  return 0;
}
Figure 10: Completely Deobfuscated Main Function of Encrypt Binary
Where to now? #
For now, we take a well-earned break and do a little happy dance to celebrate completing the first component of this challenge—understanding the encryption algorithm used by the program to encode the flag.
The next part in this series will move onto writing a simple program to reverse the encoding performed on this text, hopefully revealing our flag in plain text and solving the final component of this challenge.