December 28, 2012

Labyrinth, Noise Elimination, Circuit Engineering... Review of the Most Interesting Tasks of PHDays CTF Quals

PHDays CTF Quals, information security competition, ended last week. 493 teams from 30 countries competed in information hacking and protection. All the tasks were divided into five categories from Reverse Engineering to the tasks typical of the real world (the details and results of the competition are available in our previous post). Each category included five tasks of different challenge levels (from 100 to 500 points).

The majority of the tasks were solved by the teams, some of them caused troubles, and some were left unsolved. Moreover, for a part of the tasks the teams used such solutions, which were not even considered by the organizers. This time we want to review the most interesting (in our opinion) and difficult tasks of PHDays CTF Quals.

Misc 400

An interactive service offered the participants to find a path in a 3D labyrinth (a cube (50х50) with multiple corridors inside).

Each time a team went through a labyrinth, another appeared, and thus 16 times in total. A hint was given in the middle of the task: "A point of view does matter". If viewing in one of the three projections, then a path in each of the labyrinths is a character of the answer.

Therefore, 16 labyrinths give us 16 characters of the flag: NOF3ARNO3XITHER3.
When the last labyrinth was solved, the service popped up a message with the following text: You win! How do you like the flag? ;) And closed the connection. Such an unexpected end of the task caused cognitive dissonance in many participants. :)

Follow the link to view the path projections.

Github code.

Bin300 (HashME) – Hash with Modular Exponent

A binary file of 754 bytes was provided.

The task was formulated as follows: "Find the valid password, and you will find the cherished flag".
 The file included the following strings: Bad pwd, hex, sys, hashlib, argv, isalnum, len, Exception, chr, pow, int, encode, md5, hexdigest, <module>.

Judging by the strings, it is easy to guess that the file contains Python byte-code, which is also proved by the GNU file python 2.7 byte-compiled.

There is a decompiler named uncompyle for Python 2.7.

Set up tuj, launch, and receive a decompiled text:
import sys, hashlib
(5, 1, 3, 6,) = (10018627425667944010192184374616954034932336288972070602267764174849233338727414964592990350312034463496546535924460513481267263055398790908691402854122123L, 7548218116432136940925610514648634474612691039131890951895054656437277296127635726026902728136306678987800886118938655787775411887815467753774352743068577L, 6192128262312421513644888506697421915171917575080330421897398651929773466194971539791158995262083381167771056580666419101167108372547406447696753234781064L, sys.argv[-1])
if not 6.isalnum() or len(6) > 10: raise Exception('Bad pwd')
0 = (chr(len(6)) + 6) * 32
2 = pow(1, int(0[:64].encode('hex'), 16), 5)
if 3 != 2: print hex(2)
else: print hashlib.md5(6).hexdigest()
It is evident, that something is wrong with the code — it cannot be compiled. The reason is very simple — variable names were substituted by simple numbers in the compiled file. It hardly prevents byte-code execution, but decompiling makes variable names begin with numbers, and the parser considers it as an error.

It is only needed to insert any letter before each simple number to fix it or to use the first hint:
Python 2.7, "pgxweh" <=> "513602"
Proper functioning code should look as follows:
import sys, hashlib
(p, g, x, w,) = (10018627425667944010192184374616954034932336288972070602267764174849233338727414964592990350312034463496546535924460513481267263055398790908691402854122123L,
7548218116432136940925610514648634474612691039131890951895054656437277296127635726026902728136306678987800886118938655787775411887815467753774352743068577L,
6192128262312421513644888506697421915171917575080330421897398651929773466194971539791158995262083381167771056580666419101167108372547406447696753234781064L, sys.argv[-1])
if not w.isalnum() or len(w) > 10: raise Exception('Bad pwd')
e = (chr(len(w)) + w) * 32
h = pow(g, int(e[:64].encode('hex'), 16), p)
if x != h: print hex(h)
else: print hashlib.md5(w).hexdigest()
The code makes it clear that if to specify the correct password as the last argument of the command string, a hexadecimal MD5 value of this password (flag) will be displayed. The password consists only of letters and numbers, its length is from 1 to 10 characters inclusive.

The password is converted to a Pascal string (length byte + data), which is repeated to obtain 64 bytes. These 64 bytes are interpreted as a long integer, which becomes exponent е. The password is deemed to be right if pow(g,e,p)==x

The values p, g, and x are known, p is a simple number, g is a multiplicative group generator, e is to be found. This is the discrete logarithm problem. 48 hours are not enough for the 512-bit p (as far as we know ;). However, there is a chance to brute force the password.

A possible character set is 0-9A-Za-z. That is 62 variants. The maximum length is 10, thus the total number of possible passwords is: 

62^1 + 62^2 + 62^3 + ... + 62^10 == 853058371866181866 ≈ 2^59.6

Calculation of a 512-bit modular exponent is carried out quite slowly, and it is hardly possible to brute force even 228 variants using only one computer for 48 hours.

However, it is not necessary to calculate a modular exponent for each password, and the second hint makes it clear:

g^(a+b) == g^a * g^b

Suppose we brute force passwords of 3 characters. Then the exponent will look as follows in a hexadecimal record:

е = 03 XX YY ZZ 03 XX YY ZZ 03 XX YY ZZ … 03 XX YY ZZ

where XX, YY, and ZZ are password characters.

Taking into account the hint, the exponent can be written as 4 numbers:

e0 = 03 00 00 00 03 00 00 00 03 00 00 00 … 03 00 00 00
e1 = 00 XX 00 00 00 XX 00 00 00 XX 00 00 … 00 XX 00 00
e2 = 00 00 YY 00 00 00 YY 00 00 00 YY 00 … 00 00 YY 00
e3 = 00 00 00 ZZ 00 00 00 ZZ 00 00 00 ZZ … 00 00 00 ZZ

And then
pow(g,e,p) == (pow(g,e0,p) * pow(g,e1,p) * pow(g,e2,p) * pow(g,e3,p)) % p

It is remarkable that for passwords with a fixed length the value is e0, and thus pow(g,e0,p) will be unchanged. And pow(g,e[i],p) can take only one of 62 possible values, which can be calculated only once. Changing only one password character, only one multiplier will be changed. Modular exponentiation can be brought to modular multiplication to increase speed by more than 500 times.

However, even when all this is done, it's still hardly possible to brute force 2^60 variants within 48 hours. This time the third hint can help:

Meet In The Middle

The thing is that modular multiplication is a reversible operation (at least in this case). Using the extended Euclidean algorithm, it is possible to calculate g-1==g’, algebraic supplement g modulo p, so that (g*g’)%p==1

Then if
(pow(g,e0,p) * pow(g,e1,p) * pow(g,e2,p) * pow(g,e3,p)) % p == x

then the following equation will be true:  

(pow(g,e2,p) * pow(g,e3,p)) % p == (x * pow(g’,e0,p) * pow(g’,e1,p)) % p

It allows us to "meet in the middle".

Mentally divide the password length into two parts as close to the middle as possible. Brute force all the variants of a shorter part and multiply x by each of the values pow(g’,e[i],p) consecutively. Save the results in a table.

Brute force all the variants of the other part and multiply 1 by each of the values pow(g,e[i],p) consecutively. You can multiply modulo p. Find the result in the table. In case of matching values, you only need to remember the value of the short password part, which generated this table element.

Due to the fact that the password length is not more than 10 characters, then a half of the password will not be longer than 5 characters and 62^5 == 916132832 ≈ 2^29.8. So the task can be solved by doing less than 2^31 modular multiplications, which is possible even if only one machine is used. Though to store 2^29.8 512-bit values, almost 55 GB of memory will be needed.

However, firstly, you can save less than 512 bit (40 bit should be enough so that the collision number would be close to zero). And secondly, the correct password contained only 9 characters, and already 2 GB of RAM are enough for 62^4 variants.

It can be guaranteed that, if correctly used, a single-threaded Python program installed on a computer, which CPU is Core-i5 3.1GHz, will find any password with length up to 8 characters inclusive approximately in 4 minutes and any 9-character password in an hour and a half.

Binary – 400 (BoobFs)

This task is very interesting, but no team managed to solve it.
Input data:
  1. BMP image (file system image).
  2. A software to create a file system image out of files.

File system image as a picture

The file system consists of a main header, one directory including a variable file amount and divided into several blocks of variable length. Each file is also divided into blocks of variable length. The blocks are encrypted by the RC4 algorithm on a user key. A directory contains a file name, its size, and the size and offset of the first file's block. Each block contains the offset and size of the next block. The file system grows from the end.
struct FS_HEADER
{
 DWORD signature;   // BOOB
 DWORD dirOffset;
 DWORD firstBlockSize;
};
 
struct DIR_BLOCK_HEADER
{
 BYTE  signature;  // D
 DWORD nextBlockOffset;
 DWORD nextBlockSize;
 DWORD numberOfFiles;
};
 
struct FILE_ENTRY
{
 CHAR  fileName[MAX_FNAME]; // MAX_FNAME = 20
 DWORD fileSizeInBytes;
 DWORD firstBlockOffset;
 DWORD firstBlockSize;
};
 
struct FILE_BLOCK_HEADER
{
 BYTE  signature;  // F
 DWORD nextBlockOffset;
 DWORD nextBlockSize;
};
 
An example of a file system structure (S is the number of blocks in a directory, Ni is the number of files in the i block, P is the total blocks number, equaling to N0 + … + NS)

When the file system is created, all the data is modified (Base64 modification) and recorded to a two-dimensional array by a special formula, and all the spare space is filled with pseudorandom numbers, and a BMP file is created on the basis of this array.

The function used for recording data to the array

The program is written in C++ using STL, OOP, and virtual functions. It makes this application analysis more complicated.

The task can be solved in several stages:
  1. Read the file data into a two-dimensional array (to eliminate redundancy of the BMP format).
  2. Work out the formula, which will be used to read the data from the two-dimensional array.
  3. Work out the method used for data conversion and invert the algorithm (modified Base64).
  4. Brute force the file system PIN.
  5. Work out the file system structure, write a program to bypass it, and extract all the files (16 in total).
  6. Compose a sentence from the file names, which will help to receive the flag.

Forensics500

The participants were provided with network traffic dump in a pcap file containing the flag.

Port 554/udp indicated that it was the RTP stream. And more likely an audio stream.


Task opened with Wireshark

Trying to listen to the audio, it becomes evident that music is played against some noise and sometimes more noisy fragments appear. In fact the stream consists of two streams. The RTP specification describes 1 bit used on the application level and defined by the profile. If this field is set up, then the packet data has a specific feature, by which the traffic can be divided into two streams. Now one of them plays the music, and noises and something like a voice can be heard from the other one. This is the final part of the task. It is necessary to understand noises' nature and eliminate them. We should say that the noise is XOR of the audio data with 0xCC byte.

There are several ways for further task solution. Here is one of them.

Less noisy parts are more likely silence and noise. Knowing the codec (see the dump), generate your own audio file with silence. Analyzing the stream and your file, it is possible to work out the nature and type of the noise.

XThe XOR key can be guessed. In this case the final part of the task related to the audio effect is very similar to analog radio tuning. The closer to the correct value, the purer the sound.

As a result, when the noise was eliminated, a woman voice spelled the flag in English.

During the competition the participants were provided with the following hints implying the main solution stages:

Listen... the strange noise
Simple noise over alphabet
a?x=b b?x=a x?????

PPP solved the task after the third hint, and some of the participants were very close to solution. Using their own methods for noise elimination, they achieved good results having mistaken by 1 or 2 characters.


Misc500

The participants were provided with a binary file for Electronics Workbench — active project, which can be started and debugged. 


Scheme appearance

The scheme is obfuscated and consists of two parts. The first part includes elements necessary to display the message PHD3 on the indicators. The second (independent) part is an aggregate of 32 binary/decimal/binary-decimal counters, conversion rate of which is a flag character. The hint “Follow the counters” helped to pay attention to the counters. The flag length in the MD5 format is 32 characters in the HEX format, conversion rates of the counters fall in the range [2;15]. A wire with pointers not connected to anything gave a hint to the order of the flag characters.

The team ufologists tried to submit MD5 differing in a couple of characters from the correct flag 10 minutes prior to the end of PHDays CTF Quals. However, unfortunately, they came short of time for the bug elimination.

No team could solve the task in the end.

Forensic400

The teams were provided with a 512x512-pixel image in the PNG format.  


Task for Forensic 400

Analyzing the image with any graphics editor, it can be noticed that the image is not of three colors — white and red colors are not homogeneous. The first hint (Not all white pixels are the same white :)) was about this fact. Blackening exposes pixels of almost white and almost red colors.


Blackening result

It is obvious that the pixels are orderly allocated. It is difficult to understand/remember/guess what this order is, that is why the second and the most important hint was made (a,b,c,d,e,f,g,h <-> a,e,c,g,b,f,d,h), which was an example of the bit-reversal permutation. To apply this permutation, it is necessary to ensure that the original sequence equals to the power of two. The image size complies with this condition.

Then it was necessary to convert in one's head write a utility, which would implement permutation described above, and receive the following image at the end:



Permutation result

Brackets are clearly seen. It means the previous step was correct! However, the text is superimposed. Removing an image part and applying the permutation, we receive a part of the flag:



Obtaining the flag part

Then we only need to find out which parts to remove and obtain other parts of the flag.




Obtaining the other flag parts

The task was solved by two teams. The first was Magic-Hat (RU), then Plaid Parliament of Pwning (US). It is only fair to say that bit-reversal permutation was described on the Russian-language resource habrahabr.ru a few months ago, so the results of the team PPP deserve respect!

We published a detailed review of the task BINARY 500, which wasn't solved by any team, in our blog a few days ago.

You can review the participants' reports to know how they coped with the tasks:

6 comments:

  1. This comment has been removed by a blog administrator.

    ReplyDelete
  2. This comment has been removed by a blog administrator.

    ReplyDelete
  3. This comment has been removed by a blog administrator.

    ReplyDelete
  4. This comment has been removed by a blog administrator.

    ReplyDelete
  5. This comment has been removed by a blog administrator.

    ReplyDelete
  6. This comment has been removed by a blog administrator.

    ReplyDelete