
HTB Challenge: Simple Encryptor Part 1

Ghidra Reverse Engineering Cryptographic Algorithms
Author: polaryse
A Malware Analyst documenting their exploration of the wonderful world of malware.

Introduction

This adventure into the wonderful world of reverse engineering is brought to you by the great challenges available on Hack the Box. I picked one of the easier challenges available on their website to test my slowly growing knowledge and understanding of how to use tools such as Ghidra, IDA and Binary Ninja. This particular sample was analysed statically using Ghidra. I’ve included a little spiel below about the program, just in case anyone was interested. Otherwise feel free to skip and roll right on through to the analysis portion of this article!

Analysis Environment

I will be using an M1 Apple Silicon MacBook to perform analysis of this binary. I have built out a malware analysis VM environment on my machine which houses an assortment of macOS, *nix and Windows operating systems to perform both dynamic and static analysis safely (but more on this in a separate post!).

What is Ghidra?


Ghidra is a wonderful open-source reverse engineering framework, combining a disassembler and a decompiler, that was developed by the NSA’s Research Directorate to support its cybersecurity mission. It’s an incredible tool with a lot of cool and honestly life-changing functions, and it displays assembly code in an easy-to-read format.

This particular tool is one of the perfect starting points for people new to the field because the cost of entry is nothing (unlike Binary Ninja and IDA Pro). That’s right, you can use it for free! I would also recommend getting a copy of The Ghidra Book: The Definitive Guide by Chris Eagle and Kara Nance (that’s right, the same Chris Eagle who wrote The IDA Pro Book!). It is a very comprehensive introduction (and really deep dive) into all the functionality offered by the platform, starting off with a wonderful introduction section that teaches you how to reverse engineer using the tool.

Ghidra is available for Linux, macOS, and Windows.

The Challenge

The zip archive of this challenge can be found via the Hack The Box website. I have provided the hashes of the executable below for reference.

Filename: encrypt

Hash      Value
SHA-256   035335fc74d59ff91f295c0ab0b9fbb0ff99060e508f7035004ab53604bee293
SHA-1     fa61d18fdd736213b11f7a51f9923f2bc7cde9e9
MD5       00ed50d5e50ab99e6a0c4911043b5dd3

Download: Hack The Box

Unzipping the archive revealed two key artefacts: encrypt and flag.enc. The initial theory was that the executable wrote an encoded file into the directory it was run in. Let’s get stuck into it and start understanding this!

Step 1: Action Plan

Whenever I analyse a sample, the first step I generally take is to assess what information I have been provided with and come up with an “action plan” to guide my analysis. As someone who is easily pulled into rabbit holes, this approach often helps me stay on track when performing analysis.

For this sample, I came up with the following plan:

  1. What information/clues has the ZIP archive provided me with that could assist in understanding the function of this binary?
    • We have the executable titled encrypt and a file titled flag.enc (let’s remember this name)
  2. Figure out the format of the executable—what platform does it normally run on? What VMs should I use to mitigate any risk in executing this?
    • Running file at the command line: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter. Let’s run static analysis on macOS or *nix then.
  3. What am I trying to achieve in analysing this sample?
    • Understand the encryption algorithm through static analysis
    • If any questions remain, run the sample dynamically inside a VM
    • Write a decrypter for this sample to reveal the text (i.e. our flag)

With these broad questions in mind, let’s get started.

Step 2: Ghidra Project & Function Analysis

We need a logical starting point for our analysis. Depending on your learning style, there will be multiple ways to approach this as your eyeballs are overwhelmed by the delightful spluttering of assembly and decompiled text of a new, unfamiliar binary. However, I would always recommend you follow through the default, auto analysis process provided by Ghidra until you are familiar enough with the program to make changes.

Open up Ghidra and start a new project. Ghidra should automatically assess that the binary is x86:LE:64:default:gcc which immediately gives us some additional clues as to what we are dealing with (if you didn’t do any command line sleuthing beforehand).

Once your project is set up, double-click your imported file (in this case encrypt) and begin auto-analysis.

Figure 1: Auto-Analysis Prompt in Ghidra

Analysis Options

By default, Ghidra switches the majority of auto-analysis options on. A friend pointed out a couple of additional checks worth making before just hitting Analyze. Let’s double-check that the ASCII Strings “minimum string length” is set to LEN_4 and (only because this program is small) turn on Decompiler Parameter ID for good measure.

Figure 2: Auto Analysis Options in Ghidra

Now hit apply and analyze.

Analysing Functions
#

Now we are ready to look at this program. Ghidra will provide a breakdown of all the functions present in the opened binary. In this case, being quite a small and simple program, there are not too many functions present. Of these, main seems like a very logical place to start analysis.

Moving to the main function, Ghidra provides two views: the assembly listing and the decompiled C.

We will start our analysis looking at the Decompiled C code for ease of readability and then compare this with the assembly output.


undefined8 main(void)

{
  int iVar1;
  time_t tVar2;
  long in_FS_OFFSET;
  uint local_40;
  uint local_3c;
  long local_38;
  FILE *local_30;
  size_t local_28;
  void *local_20;
  FILE *local_18;
  long local_10;
  
  local_10 = *(long *)(in_FS_OFFSET + 0x28);
  local_30 = fopen("flag","rb");
  fseek(local_30,0,2);
  local_28 = ftell(local_30);
  fseek(local_30,0,0);
  local_20 = malloc(local_28);
  fread(local_20,local_28,1,local_30);
  fclose(local_30);
  tVar2 = time((time_t *)0x0);
  local_40 = (uint)tVar2;
  srand(local_40);
  for (local_38 = 0; local_38 < (long)local_28; local_38 = local_38 + 1) {
    iVar1 = rand();
    *(byte *)((long)local_20 + local_38) = *(byte *)((long)local_20 + local_38) ^ (byte)iVar1;
    local_3c = rand();
    local_3c = local_3c & 7;
    *(byte *)((long)local_20 + local_38) =
         *(byte *)((long)local_20 + local_38) << (sbyte)local_3c |
         *(byte *)((long)local_20 + local_38) >> 8 - (sbyte)local_3c;
  }
  local_18 = fopen("flag.enc","wb");
  fwrite(&local_40,1,4,local_18);
  fwrite(local_20,1,local_28,local_18);
  fclose(local_18);
  if (local_10 != *(long *)(in_FS_OFFSET + 0x28)) {
                    /* WARNING: Subroutine does not return */
    __stack_chk_fail();
  }
  return 0;
}

Figure 3: The Main Function

At first glance, this code seems quite overwhelming. There are a lot of local variables that are doing… things? Resulting in another thing. Perhaps what is most interesting here though is the presence of a very particular file flag.enc, surrounded by some even more interesting calls. A quick squiz of the accompanying assembly code reveals the presence of an XOR, which could have something to do with the encryption function of this binary. Well worth a look.

Step 3: Analysing Main

Having identified main as a function of interest, this section of our analysis will focus on systematically working through the decompiled code to see whether we can get a good understanding of how the encryption process is being performed by this program.

  local_10 = *(long *)(in_FS_OFFSET + 0x28);
  local_30 = fopen("flag","rb");
  fseek(local_30,0,2);
  local_28 = ftell(local_30);
  fseek(local_30,0,0);
  local_20 = malloc(local_28);
  fread(local_20,local_28,1,local_30);
  fclose(local_30);

Figure 4: Before the Encryption, Start of Main Function

This segment of code opens a file called flag in binary read mode, works out its size, allocates memory, reads the contents into that memory and then closes the file. However, the default names used by Ghidra don’t really tell us that story clearly. Let’s clean it up so it’s more readable. To do this in Ghidra, simply click on the variable you want to rename and then press L on your keyboard. You’ll be presented with a prompt where you can input a new name that makes more sense to you in your analysis!

  /* ignore first line for the moment, likely a stack canary */
  stackCanary = *(long *)(in_FS_OFFSET + 0x28);
  /* open flag file in read mode */
  originalFile = fopen("flag","rb");
  /* right click and set equate in ghidra to automatically tell you what seek values are */
  /* i used this to designate 2 as SEEK_END */
  /* check man page for fseek to determine the meaning of the arguments */
  fseek(originalFile,0,SEEK_END);
  /* go to the end of the file */
  /* ftell called to determine the size of the file*/
  /* what byte position is it in? */
  /* store file size in variable fileSize */
  fileSize = ftell(originalFile);
  /* return to the beginning of the file again */
  fseek(originalFile,0,SEEK_SET);
  /* allocate a buffer of fileSize bytes; fileSizeInMem points to that buffer */
  fileSizeInMem = malloc(fileSize);
  /* read the whole file (one item of fileSize bytes) into the buffer */
  fread(fileSizeInMem,fileSize,1,originalFile);
  /* close the flag file */
  fclose(originalFile);

Figure 5: Translating the Variables in the Main Function using Ghidra

Figure 5 shows the results of walking through the code, line by line, to determine the purpose of each variable. We are left with a better understanding of what data is considered important and where and how it is being stored by the binary. In addition to this, we comment the code for readability’s sake, both for when we return to our analysis in the future and to assist understanding in the community.
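For reference, the fseek/ftell/malloc/fread pattern above is a common C idiom for reading a whole file into memory. Below is a minimal sketch of the same idea with the error checks that the decompiled code does not show. read_whole_file is a hypothetical helper of my own, not something from the binary.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read an entire file into a heap buffer.
 * Returns NULL on any failure. On success *out_len holds the byte
 * count and the caller must free() the returned buffer. */
unsigned char *read_whole_file(const char *path, size_t *out_len)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    /* Seek to the end so ftell() reports the file size in bytes. */
    if (fseek(fp, 0, SEEK_END) != 0) { fclose(fp); return NULL; }
    long size = ftell(fp);
    if (size < 0) { fclose(fp); return NULL; }

    /* Rewind so the read starts at the first byte. */
    if (fseek(fp, 0, SEEK_SET) != 0) { fclose(fp); return NULL; }

    /* Allocate at least one byte so an empty file avoids malloc(0). */
    unsigned char *buf = malloc(size > 0 ? (size_t)size : 1);
    if (buf == NULL) { fclose(fp); return NULL; }

    /* Read byte by byte so the return value is the number of bytes read. */
    size_t got = fread(buf, 1, (size_t)size, fp);
    fclose(fp);
    if (got != (size_t)size) { free(buf); return NULL; }

    *out_len = got;
    return buf;
}
```

Calling read_whole_file("flag", &len) would mirror what main does with originalFile, fileSize and fileSizeInMem, except that it reports failure instead of carrying on with a NULL pointer.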

We continue through the rest of the code in main until all of the variables make a little more sense to us.

undefined8 main(void)

{
  int randomNumberGen;
  time_t setTime;
  long in_FS_OFFSET;
  uint castTime;
  uint ranNumGen_2;
  long byteIncrement;
  FILE *originalFile;
  size_t fileSize;
  void *fileSizeInMem;
  FILE *encodedFile;
  long stackCanary;
  
  /* ignore first line for the moment, likely a stack canary */
  stackCanary = *(long *)(in_FS_OFFSET + 0x28);
  /* open flag file in read mode */
  originalFile = fopen("flag","rb");
  /* right click and set equate in ghidra to automatically tell you what seek values are */
  /* i used this to designate 2 as SEEK_END */
  /* check man page for fseek to determine the meaning of the arguments */
  fseek(originalFile,0,SEEK_END);
  /* go to the end of the file */
  /* ftell called to determine the size of the file*/
  /* what byte position is it in? */
  /* store file size in variable fileSize */
  fileSize = ftell(originalFile);
  /* return to the beginning of the file again */
  fseek(originalFile,0,SEEK_SET);
  /* allocate a buffer of fileSize bytes; fileSizeInMem points to that buffer */
  fileSizeInMem = malloc(fileSize);
  /* read the whole file (one item of fileSize bytes) into the buffer */
  fread(fileSizeInMem,fileSize,1,originalFile);
  /* close the flag file */
  fclose(originalFile);
  
/* get time and place into variable setTime */
  setTime = time((time_t *)0x0);
/* cast to an unsigned 32-bit integer: the value cannot be negative and only the low 4 bytes of the time are kept */
  castTime = (uint)setTime;
/* seed rand() with castTime. If you have the same seed, you can reproduce the same sequence from this random number generator */
  srand(castTime);

/*now this is a weird section, a little hard to read*/
  for (byteIncrement = 0; byteIncrement < (long)fileSize; byteIncrement = byteIncrement + 1) {
    randomNumberGen = rand();
    *(byte *)((long)fileSizeInMem + byteIncrement) =
         *(byte *)((long)fileSizeInMem + byteIncrement) ^ (byte)randomNumberGen;
    ranNumGen_2 = rand();
    ranNumGen_2 = ranNumGen_2 & 7;
    *(byte *)((long)fileSizeInMem + byteIncrement) =
         *(byte *)((long)fileSizeInMem + byteIncrement) << (sbyte)ranNumGen_2 |
         *(byte *)((long)fileSizeInMem + byteIncrement) >> 8 - (sbyte)ranNumGen_2;
  }
  
/* open the file called flag.enc with write privileges */
/* the b here is for system compatibility */
  encodedFile = fopen("flag.enc","wb");
/* write the first 4 bytes of castTime variable into the file */
  fwrite(&castTime,1,4,encodedFile);
/* write the contents of fileSizeInMem into the file, for a total byte size of the fileSize variable */
  fwrite(fileSizeInMem,1,fileSize,encodedFile);
/* close the file*/
  fclose(encodedFile);
/* this is the function end */
  if (stackCanary != *(long *)(in_FS_OFFSET + 0x28)) {
                    /* WARNING: Subroutine does not return */
    __stack_chk_fail();
  }
  return 0;
}

Figure 6: Mostly Deobfuscated Main Function of Encrypt Binary

Let’s summarise what information we can gather now we have translated the main function:

  • a file called flag is read in by the program
  • the file size of flag is recorded
  • the current time is recorded and changed into an unsigned integer
  • the variable castTime is used as the seed for srand (and therefore we have the seed key)
  • Funky code containing some bitwise operations, bit shifting and number generation (we will dig into this deeper)
  • open the file flag.enc and place the contents of castTime into the file
  • Write the contents of fileSizeInMem into the file
  • Close the file
  • Exit program
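The reason the seed matters is that rand() is deterministic: the C standard says that calling srand with the same seed repeats the same sequence of values. Here is a minimal sketch to check that on your own machine. same_seed_repeats is a name I made up for illustration, and the 64-value cap is an arbitrary choice to keep the example small.

```c
#include <stdlib.h>

/* Hypothetical check: returns 1 if seeding twice with the same value
 * makes rand() produce the same first `count` values, 0 otherwise.
 * `count` is limited to 64 purely to keep the sketch small. */
int same_seed_repeats(unsigned int seed, int count)
{
    int first[64];

    if (count <= 0 || count > 64)
        return 0;

    /* First pass: record the sequence produced by this seed. */
    srand(seed);
    for (int i = 0; i < count; i++)
        first[i] = rand();

    /* Second pass: re-seed and compare value by value. */
    srand(seed);
    for (int i = 0; i < count; i++) {
        if (rand() != first[i])
            return 0;
    }
    return 1;
}
```

The actual numbers that rand() produces depend on the C library in use, so only the repeatability is guaranteed here, not any particular values.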

From the above, the only component of the code that Ghidra struggled to make sense of is the following segment:

/*now this is a weird section, a little hard to read*/
  for (byteIncrement = 0; byteIncrement < (long)fileSize; byteIncrement = byteIncrement + 1) {
    randomNumberGen = rand();
    *(byte *)((long)fileSizeInMem + byteIncrement) =
         *(byte *)((long)fileSizeInMem + byteIncrement) ^ (byte)randomNumberGen;
    ranNumGen_2 = rand();
    ranNumGen_2 = ranNumGen_2 & 7;
    *(byte *)((long)fileSizeInMem + byteIncrement) =
         *(byte *)((long)fileSizeInMem + byteIncrement) << (sbyte)ranNumGen_2 |
         *(byte *)((long)fileSizeInMem + byteIncrement) >> 8 - (sbyte)ranNumGen_2;
  }

Figure 7: Confusing Code Segment

But why is this significant? Well, judging by the presence of rand(), the left and right shifts, and the bitwise XOR, AND and OR operations, it is highly likely that this segment of code is our encryption component.

Here, we will need to reference the assembly instructions to see whether it provides any additional insights as to what series of operations are being performed here. We can achieve this through highlighting this segment of code in the decompiler, which will highlight the corresponding code in the assembly pane.

Figure 8: Equivalent Assembly Instructions

Looking at the assembly, there are two key instructions here (as highlighted) that provide clues as to what is going on. The shift-left, shift-right and OR expression in the decompiled code is Ghidra’s attempt to represent a Rotate Left (ROL) instruction in C, which has no rotate operator. This is definitely our encryption algorithm.

Figure 9: Depiction of Rotate Left (ROL)
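To make the rotate concrete, here is a minimal sketch in C of an 8-bit rotate left, written the same way the decompiler shows it (a left shift ORed with a right shift by 8 minus the count). The name rol8 is my own for illustration and does not appear in the binary.

```c
#include <stdint.h>

/* Hypothetical helper: rotate an 8-bit value left by `count` bits.
 * The count is masked to 0..7, matching the `& 7` in the decompiled loop. */
uint8_t rol8(uint8_t value, unsigned int count)
{
    count &= 7u;
    /* Bits shifted out on the left are brought back in on the right.
     * `value` is promoted to int, so shifting right by 8 when count is 0
     * is well defined and simply contributes zero. */
    return (uint8_t)((value << count) | (value >> (8u - count)));
}
```

For example, rol8(0x81, 1) gives 0x03: the top bit falls off the left-hand side and reappears as the lowest bit.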

Let’s amend our summary then:

  • a file called flag is read in by the program
  • the file size of flag is recorded
  • the current time is recorded and changed into an unsigned integer
  • the variable castTime is used as the seed for srand (and therefore we have the seed key)
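  • for each byte of the file, call rand() and XOR the byte with the low 8 bits of the result (randomNumberGen)
  • call rand() again and perform a bitwise AND between the result (ranNumGen_2) and 7, giving a rotation count from 0 to 7
  • perform ROL on the byte by that rotation count
  • store the result back in place in the buffer fileSizeInMem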
  • open the file flag.enc and place the contents of castTime into the file
  • Write the contents of fileSizeInMem into the file
  • Close the file
  • Exit program
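The per-byte steps in the summary above can be sketched as a small C function. To keep the example deterministic, the two rand() results are passed in as parameters rather than generated. encrypt_byte and its parameter names are hypothetical and are not taken from the binary.

```c
#include <stdint.h>

/* Hypothetical sketch of what the loop does to a single byte.
 * first_rand and second_rand stand in for the two rand() calls made
 * for each byte in the decompiled code. */
uint8_t encrypt_byte(uint8_t plain, int first_rand, int second_rand)
{
    /* Step 1: XOR with the low 8 bits of the first random value. */
    uint8_t x = (uint8_t)(plain ^ (uint8_t)first_rand);

    /* Step 2: keep the low 3 bits of the second value (0..7). */
    unsigned int count = (unsigned int)second_rand & 7u;

    /* Step 3: rotate the byte left by that many bits. */
    return (uint8_t)((x << count) | (x >> (8u - count)));
}
```

As a worked example, encrypt_byte(0x41, 0x1234, 3) first computes 0x41 ^ 0x34 = 0x75, then rotates 0x75 left by 3 bits to give 0xAB.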

As we explore writing a program to decrypt the data in flag.enc, our understanding of the above summary will likely change as we learn the nuances of the decryption process.

Conclusion

Post analysis of this file, we are left with a fairly good understanding of the encoding function performed by this binary. The entire main function has been changed into a much easier to read block of code, with one slightly funky section that was the result of Ghidra attempting to write out a ROL instruction somewhat logically (our major gotcha for this sample analysis). Figure 10 shows the results of our hard work.


undefined8 main(void)

{
  int randomNumberGen;
  time_t setTime;
  long in_FS_OFFSET;
  uint castTime;
  uint ranNumGen_2;
  long byteIncrement;
  FILE *originalFile;
  size_t fileSize;
  void *fileSizeInMem;
  FILE *encodedFile;
  long stackCanary;
  
  /* ignore first line for the moment, likely a stack canary */
  stackCanary = *(long *)(in_FS_OFFSET + 0x28);
  /* open flag file in read mode */
  originalFile = fopen("flag","rb");
  /* right click and set equate in ghidra to automatically tell you what seek values are */
  /* i used this to designate 2 as SEEK_END */
  /* check man page for fseek to determine the meaning of the arguments */
  fseek(originalFile,0,SEEK_END);
  /* go to the end of the file */
  /* ftell called to determine the size of the file*/
  /* what byte position is it in? */
  /* store file size in variable fileSize */
  fileSize = ftell(originalFile);
  /* return to the beginning of the file again */
  fseek(originalFile,0,SEEK_SET);
  /* allocate a buffer of fileSize bytes; fileSizeInMem points to that buffer */
  fileSizeInMem = malloc(fileSize);
  /* read the whole file (one item of fileSize bytes) into the buffer */
  fread(fileSizeInMem,fileSize,1,originalFile);
  /* close the flag file */
  fclose(originalFile);
  
/* get time and place into variable setTime */
  setTime = time((time_t *)0x0);
/* cast to an unsigned 32-bit integer: the value cannot be negative and only the low 4 bytes of the time are kept */
  castTime = (uint)setTime;
/* seed rand() with castTime. If you have the same seed, you can reproduce the same sequence from this random number generator */
  srand(castTime);
  
/* for each byte in the file, until the length of the file is reached, complete the following */
  for (byteIncrement = 0; byteIncrement < (long)fileSize; byteIncrement = byteIncrement + 1) {
/* take the next value from rand() and XOR its low byte with the current byte */
    randomNumberGen = rand();
    *(byte *)((long)fileSizeInMem + byteIncrement) =
         *(byte *)((long)fileSizeInMem + byteIncrement) ^ (byte)randomNumberGen;
/* take another value from rand() */
    ranNumGen_2 = rand();
/* bitwise AND with 7 keeps the low three bits, giving a rotation count of 0 to 7 */
    ranNumGen_2 = ranNumGen_2 & 7;
/* perform ROL on the byte by that rotation count */
    *(byte *)((long)fileSizeInMem + byteIncrement) =
         *(byte *)((long)fileSizeInMem + byteIncrement) << (sbyte)ranNumGen_2 |
         *(byte *)((long)fileSizeInMem + byteIncrement) >> 8 - (sbyte)ranNumGen_2;
  }
  
/* open the file called flag.enc with write privileges */
/* the b here is for system compatibility */
  encodedFile = fopen("flag.enc","wb");
/* write the first 4 bytes of castTime variable into the file */
  fwrite(&castTime,1,4,encodedFile);
/* write the contents of fileSizeInMem into the file, for a total byte size of the fileSize variable */
  fwrite(fileSizeInMem,1,fileSize,encodedFile);
/* close the file*/
  fclose(encodedFile);
/* this is the function end */
  if (stackCanary != *(long *)(in_FS_OFFSET + 0x28)) {
                    /* WARNING: Subroutine does not return */
    __stack_chk_fail();
  }
  return 0;
}

Figure 10: Completely Deobfuscated Main Function of Encrypt Binary
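The two fwrite calls at the end of main suggest a simple layout for flag.enc: the first 4 bytes hold castTime and everything after that is the transformed data. As a minimal sketch, assuming the seed was written by a little-endian machine (which matches an x86-64 binary), those 4 bytes could be turned back into a number like this. seed_from_header is a hypothetical helper name.

```c
#include <stdint.h>

/* Hypothetical helper: rebuild the 32-bit seed from the first four bytes
 * of flag.enc. Assumes little-endian byte order, i.e. the least
 * significant byte was written first. */
uint32_t seed_from_header(const uint8_t header[4])
{
    return  (uint32_t)header[0]
         | ((uint32_t)header[1] << 8)
         | ((uint32_t)header[2] << 16)
         | ((uint32_t)header[3] << 24);
}
```

Assembling the value byte by byte, rather than copying the bytes straight into an integer, keeps the result the same regardless of the byte order of the machine doing the analysis.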

Where to now?

For now, we take a well-earned break and do a little happy dance to celebrate completing the first component of this challenge—understanding the encryption algorithm used by the program to encode the flag.

The next part in this series will move onto writing a simple program to reverse the encoding performed on this text, hopefully revealing our flag in plain text and solving the final component of this challenge.