Tuesday, December 4, 2012

Windows 8 ASLR Internals

Authors: Artem Shishkin and Ilya Smith, Positive Research.

ASLR stands for Address Space Layout Randomization. It is a security mechanism that randomizes the virtual memory addresses of data structures that may be targeted by an attack. Because it is difficult to predict where a target structure is located in memory, an attacker has little chance of succeeding.

The ASLR implementation on Windows is closely tied to the image relocation mechanism. Relocation allows a PE file to be loaded at an address other than its fixed preferred image base. The relocation section of the PE file is the key structure in this process: it describes how to modify certain code and data elements of the executable so that it works correctly at another image base.
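The relocation step described above can be sketched in a few lines of C. This is a minimal illustration under our own assumptions, not the loader's code: the block layout and type values follow the PE/COFF specification, while the names `reloc_block_header` and `apply_relocations` are ours, and the sketch does not validate target offsets against the image size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One block of the .reloc section: a header followed by 16-bit entries
   (high 4 bits = relocation type, low 12 bits = offset within the page). */
typedef struct {
    uint32_t page_rva;    /* RVA of the 4 KB page the entries refer to */
    uint32_t block_size;  /* size of header plus entries, in bytes */
} reloc_block_header;

#define REL_BASED_ABSOLUTE 0   /* padding entry, skipped */
#define REL_BASED_HIGHLOW  3   /* 32-bit absolute address */
#define REL_BASED_DIR64    10  /* 64-bit absolute address */

/* Adds delta (new base minus preferred base) to every address listed
   in the relocation data. Hypothetical helper for illustration only. */
void apply_relocations(uint8_t *image, const uint8_t *reloc,
                       size_t reloc_size, int64_t delta)
{
    assert(image != NULL && reloc != NULL);
    size_t pos = 0;
    while (pos + sizeof(reloc_block_header) <= reloc_size) {
        reloc_block_header hdr;
        memcpy(&hdr, reloc + pos, sizeof hdr);
        if (hdr.block_size < sizeof hdr || hdr.block_size > reloc_size - pos)
            break;  /* malformed block: stop processing */

        size_t count = (hdr.block_size - sizeof hdr) / sizeof(uint16_t);
        for (size_t i = 0; i < count; i++) {
            uint16_t entry;
            memcpy(&entry, reloc + pos + sizeof hdr + i * sizeof entry,
                   sizeof entry);
            uint8_t *target = image + hdr.page_rva + (entry & 0x0FFF);

            if ((entry >> 12) == REL_BASED_HIGHLOW) {
                uint32_t value;
                memcpy(&value, target, sizeof value);
                value += (uint32_t)delta;
                memcpy(target, &value, sizeof value);
            } else if ((entry >> 12) == REL_BASED_DIR64) {
                uint64_t value;
                memcpy(&value, target, sizeof value);
                value += (uint64_t)delta;
                memcpy(target, &value, sizeof value);
            }
            /* REL_BASED_ABSOLUTE and unknown types are left untouched */
        }
        pos += hdr.block_size;
    }
}
```

For example, a pointer 0x140001000 stored in an image that is moved up by 0x10000 becomes 0x140011000 after the call.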

The key parts of ASLR are a random number generator subsystem and a couple of stub functions that modify the image base of the PE file being loaded.

Windows 8 ASLR relies on a random number generator, which is actually a Lagged Fibonacci Generator with parameters j=24 and k=55, seeded at Windows startup in the winload.exe module. Winload.exe gathers entropy at boot time from several sources: registry keys, the TPM, the current time, ACPI, and the new rdrand CPU instruction. The Windows kernel random number generator and its initialization are described in detail in [1].
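For readers unfamiliar with lagged Fibonacci generators, here is a minimal sketch of one with the lags j=24 and k=55. It is not the kernel's code: we assume the additive variant (x[n] = x[n-24] + x[n-55] mod 2^32) and a caller-supplied 55-word seed, and the names `lfg_state`, `lfg_seed` and `lfg_next` are ours.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LFG_J 24
#define LFG_K 55

typedef struct {
    uint32_t state[LFG_K];  /* the last 55 outputs, kept in a ring buffer */
    size_t   pos;           /* index of the oldest value, x[n-55] */
} lfg_state;

/* Copies 55 seed words into the generator (assumed seeding scheme). */
void lfg_seed(lfg_state *g, const uint32_t seed[LFG_K])
{
    assert(g != NULL && seed != NULL);
    for (size_t i = 0; i < LFG_K; i++)
        g->state[i] = seed[i];
    g->pos = 0;
}

/* Returns x[n] = x[n-24] + x[n-55]; the addition wraps modulo 2^32. */
uint32_t lfg_next(lfg_state *g)
{
    /* state[pos] is x[n-55]; x[n-24] lies 55 - 24 = 31 slots further on */
    size_t j_index = (g->pos + (LFG_K - LFG_J)) % LFG_K;
    uint32_t value = g->state[g->pos] + g->state[j_index];

    g->state[g->pos] = value;        /* x[n] overwrites x[n-55] */
    g->pos = (g->pos + 1) % LFG_K;   /* the next slot is now the oldest */
    return value;
}
```

The whole state is just the last 55 outputs, so the quality of the generator depends entirely on how well those words are seeded, which is why the boot-time entropy sources matter.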

We would like to add a short note about the new rdrand CPU instruction. The Ivy Bridge architecture of Intel processors introduced the Intel Secure Key technology for generating high-quality pseudo-random numbers. It consists of a hardware digital random number generator (DRNG) and a new instruction, rdrand, which is used to retrieve values from the DRNG programmatically.

As a hardware unit, the DRNG is a separate module on the processor chip. It operates asynchronously with the main processor cores at a frequency of 3 GHz. The DRNG uses thermal noise as an entropy source. It also has a built-in testing system that performs a series of tests to ensure high-quality output. If one of these tests fails, the DRNG refuses to generate random numbers at all.

The rdrand instruction is used to retrieve random numbers from the DRNG. The documentation states that the DRNG can, in theory, return zeros instead of a random number sequence when a health test fails or when the queue of generated random numbers is empty. However, we were unable to drain the DRNG in practice.

Intel Secure Key is a powerful random number generator that produces high-quality random numbers at very high speed. Unlike with other entropy sources, it is practically impossible to guess the initial state of an RNG seeded with the rdrand instruction.

The internal RNG interface function is ExGenRandom(). It also has an exported wrapper function, RtlRandomEx(). Windows 8 ASLR uses this function, whereas the previous version relied on the rdtsc instruction. The rdtsc instruction reads the CPU timestamp counter, which increases linearly and therefore cannot be considered a secure random number generator.

The core function of the ASLR mechanism is MiSelectImageBase. Its pseudocode on Windows 8 is as follows.

#define MI_64K_ALIGN(x) (((x) + 0x0F) >> 4)
#define MmHighsetUserAddress 0x7FFFFFEFFFF

typedef ULONG_PTR PIMAGE_BASE;

typedef enum _MI_MEMORY_HIGHLOW
{
    MiMemoryHigh    = 0,
    MiMemoryLow     = 1,
    MiMemoryHighLow = 2
} MI_MEMORY_HIGHLOW, *PMI_MEMORY_HIGHLOW;


MI_MEMORY_HIGHLOW MiSelectBitMapForImage(PSEGMENT pSeg)
{
    if (!(pSeg->SegmentFlags & FLAG_BINARY32))            // not a 32-bit (WOW64) binary
    {
        if (!(pSeg->ImageInformation->ImageFlags & FLAG_BASE_BELOW_4GB))
        {
            if (pSeg->BasedAddress > 0x100000000)
            {
                return MiMemoryHighLow;
            }
            else
            {
                return MiMemoryLow;
            }
        }
    }

    return MiMemoryHigh;
}

PIMAGE_BASE MiSelectImageBase(void* a1<rcx>, PSEGMENT pSeg)
{
    MI_MEMORY_HIGHLOW ImageBitmapType;
    ULONG ImageBias;
    RTL_BITMAP *pImageBitMap;
    ULONG_PTR ImageTopAddress;
    ULONG RelocationSizein64k;
    MI_SECTION_IMAGE_INFORMATION *pImageInformation;
    ULONG_PTR RelocDelta;
    PIMAGE_BASE Result = NULL;

    // rsi = rcx
    // rcx = rdx
    // rdi = rdx

    pImageInformation = pSeg->ImageInformation;
    ImageBitmapType = MiSelectBitMapForImage(pSeg);

    a1->off_40h = ImageBitmapType;

    if (ImageBitmapType == MiMemoryLow)
    {
        // 64-bit executable with image base below 4 GB
        ImageBias = MiImageBias64Low;
        pImageBitMap = MiImageBitMap64Low;
        ImageTopAddress = 0x78000000;
    }
    else
    {
        if (ImageBitmapType == MiMemoryHighLow)
        {
            // 64-bit executable with image base above 4 GB
            ImageBias = MiImageBias64High;
            pImageBitMap = MiImageBitMap64High;
            ImageTopAddress = 0x7FFFFFE0000;
        }
        else
        {
            // MiMemoryHigh 32-bit executable image
            ImageBias = MiImageBias;
            pImageBitMap = MiImageBitMap;
            ImageTopAddress = 0x78000000;
        }
    }

    // pSeg->ControlArea->BitMap ^= (pSeg->ControlArea->BitMap ^ (ImageBitmapType << 29)) & 0x60000000;
    // or bitfield form
    pSeg->ControlArea->BitMap = ImageBitmapType;

    RelocationSizein64k = MI_64K_ALIGN(pSeg->TotalNumberOfPtes);

    if (pSeg->ImageInformation->ImageCharacteristics & IMAGE_FILE_DLL)
    {
        ULONG StartBit = 0;
        ULONG GlobalRelocStartBit = 0;

        StartBit = RtlFindClearBits(pImageBitMap, RelocationSizein64k, ImageBias);
        if (StartBit != 0xFFFFFFFF)
        {
            StartBit = MiObtainRelocationBits(pImageBitMap, RelocationSizein64k, StartBit, 0);
            if (StartBit != 0xFFFFFFFF)
            {
                Result = ImageTopAddress - (((RelocationSizein64k) + StartBit) << 0x10);
                if (Result == (pSeg->BasedAddress - a1->SelectedBase))
                {
                    GlobalRelocStartBit = MiObtainRelocationBits(pImageBitMap, RelocationSizein64k, StartBit, 1);
                    StartBit = (GlobalRelocStartBit != 0xFFFFFFFF) ? GlobalRelocStartBit : StartBit;
                    Result = ImageTopAddress - ((RelocationSizein64k + StartBit) << 0x10);
                }

                a1->RelocStartBit = StartBit;
                a1->RelocationSizein64k = RelocationSizein64k;
                pSeg->ControlArea->ImageRelocationStartBit = StartBit;    
                pSeg->ControlArea->ImageRelocationSizeIn64k = RelocationSizein64k;

                return Result;
            }
        }
    }
    else
    {
        // EXE image
        if (a1->SelectedBase != NULL)
        {
            return pSeg->BasedAddress;
        }

        if (ImageBitmapType == MiMemoryHighLow)
        {
            a1->RelocStartBit = 0xFFFFFFFF;
            a1->RelocationSizein64k = (WORD)RelocationSizein64k;
            pSeg->ControlArea->ImageRelocationStartBit = 0xFFFFFFFF;
            pSeg->ControlArea->ImageRelocationSizeIn64k = (WORD)RelocationSizein64k;

            return ((ULONG_PTR)(ExGenRandom(1) % (0x20001 - RelocationSizein64k)) + 0x7F60000) << 16;
        }
    }

    ULONG RandomVal = ExGenRandom(1);
    RandomVal = (RandomVal % 0xFE + 1) << 0x10;

    RelocDelta = pSeg->BasedAddress - a1->SelectedBase;
    if (RelocDelta > MmHighsetUserAddress)
    {
        return 0;
    }

    if ((RelocationSizein64k << 0x10) >  MmHighsetUserAddress)
    {
        return 0;
    }

    if (RelocDelta + (RelocationSizein64k << 0x10) <= RelocDelta)
    {
        return 0;
    }

    if (RelocDelta + (RelocationSizein64k << 0x10) > MmHighsetUserAddress)
    {
        return 0;
    }

    if (a1->SelectedBase + RandomVal == 0)
    {
        Result = pSeg->BasedAddress;
    }
    else
    {
        if (RelocDelta > RandomVal)
        {
            Result = RelocDelta - RandomVal;
        }
        else
        {
            Result = RelocDelta + RandomVal;
            if (Result < RelocDelta)
            {
                return 0;
            }

            if (((RelocationSizein64k << 0x10) + RelocDelta + RandomVal)  > 0x7FFFFFDFFFF)
            {
                return 0;
            }

            if (((RelocationSizein64k << 0x10) + RelocDelta + RandomVal)  <  (RelocDelta + (RelocationSizein64k << 0x10)))
            {
                return 0;
            }
        }
    }

    //random_epilog
    a1->RelocStartBit = 0xFFFFFFFF;
    a1->RelocationSizein64k = RelocationSizein64k;
    pSeg->ControlArea->ImageRelocationStartBit = 0xFFFFFFFF;
    pSeg->ControlArea->ImageRelocationSizeIn64k = RelocationSizein64k;

    return Result;
}

As we can see, there are three different image bitmaps. The first one is for 32-bit executables, the second is for 64-bit executables with an image base below 4 GB, and the third is for 64-bit executables with an image base above 4 GB, which gives them a high-entropy virtual address.
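To make the entropy difference concrete, the two EXE formulas from the pseudocode above can be evaluated in isolation. The sketch below restates them with the random value passed in as a parameter standing in for ExGenRandom(1); the function names are ours.

```c
#include <assert.h>
#include <stdint.h>

/* High-entropy case (64-bit EXE based above 4 GB): picks one of
   0x20001 - size 64 KB slots starting at 0x7F600000000. */
uint64_t high_entropy_exe_base(uint32_t random_value, uint32_t size_in_64k)
{
    assert(size_in_64k > 0 && size_in_64k <= 0x20000);
    uint64_t slot = random_value % (0x20001u - size_in_64k);
    return (slot + 0x7F60000u) << 16;
}

/* Default case: a delta of 1..254 64 KB units, which is later added to
   or subtracted from the preferred image base. */
uint32_t low_entropy_exe_delta(uint32_t random_value)
{
    return (random_value % 0xFEu + 1u) << 16;
}
```

For a one-slot image the first formula yields bases from 0x7F600000000 to 0x7F7FFFF0000, that is 0x20000 possible positions or about 17 bits. The second formula yields only 254 distinct deltas, from 0x10000 to 0xFE0000, which is just under 8 bits.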

Executables are randomized by modifying the image base directly. For DLLs, ASLR is part of the relocation process, and the random component of the image base selection is ImageBias, a value that is initialized during system startup.

VOID MiInitializeRelocations()
{
    MiImageBias = ExGenRandom(1) % 256;
    MiImageBias64Low = ExGenRandom(1) % MiImageBitMap64Low.SizeOfBitMap;
    MiImageBias64High = ExGenRandom(1) % MiImageBitMap64High.SizeOfBitMap;

    return;
}
Image bitmaps represent the address space of the running user processes. Once an executable image is loaded, it has the same address in every process that references it. This is a natural choice for efficiency and memory usage, since executable images are shared using the copy-on-write mechanism.
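The role of the bitmap and the bias in DLL placement can be modelled with a toy allocator. This is a simplified sketch under our own assumptions, not a reimplementation of RtlFindClearBits or MiObtainRelocationBits: it uses one flag per 64 KB slot instead of packed bits, a hypothetical bitmap size of 256 slots, and our own names throughout. Only the address formula is taken from the pseudocode of MiSelectImageBase.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SLOT_COUNT 256          /* hypothetical bitmap size */
#define NO_SLOT    0xFFFFFFFFu

typedef struct {
    bool used[SLOT_COUNT];      /* one flag per 64 KB slot, for clarity */
} slot_bitmap;

/* Finds `count` consecutive free slots, scanning upwards from `hint`
   and, if that fails, once more from slot 0. */
uint32_t find_clear_run(const slot_bitmap *bm, uint32_t count, uint32_t hint)
{
    assert(bm != NULL && count > 0 && count <= SLOT_COUNT);
    for (uint32_t pass = 0; pass < 2; pass++) {
        uint32_t begin = (pass == 0) ? hint : 0;
        for (uint32_t start = begin; start + count <= SLOT_COUNT; start++) {
            uint32_t n = 0;
            while (n < count && !bm->used[start + n])
                n++;
            if (n == count)
                return start;
        }
    }
    return NO_SLOT;
}

/* Reserves slots for an image and returns its base address, counted
   down from top_address as in the DLL branch of MiSelectImageBase.
   Returns 0 when the bitmap has no room. */
uint64_t place_dll(slot_bitmap *bm, uint32_t size_in_64k, uint32_t bias,
                   uint64_t top_address)
{
    uint32_t start = find_clear_run(bm, size_in_64k, bias);
    if (start == NO_SLOT)
        return 0;
    for (uint32_t i = 0; i < size_in_64k; i++)
        bm->used[start + i] = true;   /* slots stay reserved system-wide */
    return top_address - ((uint64_t)(size_in_64k + start) << 16);
}
```

In this model the bias is the only random input: it is chosen once, every DLL is packed into the first free run at or after it, and the reserved slots then apply to every process that maps the image.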

ASLR on Windows 8 can now force images that are not ASLR-aware to be loaded at a random virtual address. The table below shows the loader’s behavior for different combinations of ASLR-relevant linker flags.


*Cannot be built with MSVS because the /DYNAMICBASE option also implies /FIXED:NO, which generates a relocation section in an executable.

We can see that the loader’s behavior changed in Windows 8: if a relocation section is available in the PE file, it will be loaded anyway. This also shows that ASLR and the relocation mechanism are closely interconnected.
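The linker flags discussed above end up as bits in the PE headers, and they can be decoded as in the sketch below. The constant values come from the PE/COFF specification; the structure, the function names and in particular the decision rule in `may_be_rebased` are our own reading of the paragraph above, not code taken from the loader.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values from the PE/COFF specification */
#define FILE_RELOCS_STRIPPED    0x0001  /* FileHeader.Characteristics (/FIXED) */
#define DLLCHAR_HIGH_ENTROPY_VA 0x0020  /* OptionalHeader.DllCharacteristics */
#define DLLCHAR_DYNAMIC_BASE    0x0040  /* OptionalHeader.DllCharacteristics */

typedef struct {
    bool has_relocations;  /* relocation information was not stripped */
    bool opted_in;         /* linked with /DYNAMICBASE */
    bool high_entropy;     /* linked with /HIGHENTROPYVA */
} aslr_flags;

aslr_flags read_aslr_flags(uint16_t file_characteristics,
                           uint16_t dll_characteristics)
{
    aslr_flags flags;
    flags.has_relocations = (file_characteristics & FILE_RELOCS_STRIPPED) == 0;
    flags.opted_in = (dll_characteristics & DLLCHAR_DYNAMIC_BASE) != 0;
    flags.high_entropy = (dll_characteristics & DLLCHAR_HIGH_ENTROPY_VA) != 0;
    return flags;
}

/* Assumed rule: an image can only be moved if it carries relocations,
   and it is moved when it opted in or when ASLR is forced for it. */
bool may_be_rebased(const aslr_flags *flags, bool force_aslr)
{
    assert(flags != NULL);
    return flags->has_relocations && (flags->opted_in || force_aslr);
}
```

Under this reading, an image built without /DYNAMICBASE is left at its preferred base by default but becomes a relocation candidate once ASLR is forced, while an image with stripped relocations cannot be moved in either case.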

In general, the new ASLR features in Windows 8 do not change the code logic much, which is why it is difficult to find any exploitable vulnerabilities in it. The entropy increase for randomizing various objects is in effect the substitution of a constant expression in the code. The code graphs also suggest that a code review has been done.

References:

[1] Chris Valasek, Tarjei Mandt. Windows 8 Heap Internals. 2012.
[2] Ken Johnson, Matt Miller. Exploit Mitigation Improvements in Windows 8. Slides, Black Hat USA 2012.
[3] Intel. Intel® Digital Random Number Generator (DRNG): Software Implementation Guide. Intel Corporation, 2012.
[4] Ollie Whitehouse. An Analysis of Address Space Layout Randomization on Windows Vista. Symantec Advanced Threat Research, 2007.
[5] Alexander Sotirov, Mark Dowd. Bypassing Browser Memory Protections. 2008.

14 comments:

  1. Hey guys,

    first I want to say thx for your great work and the really good article!

    After trying to reverse more of the ASLR mechanism in Windows 8 by myself, I have a few follow-up questions and I hope you can help me with them.

    First, I'm not quite sure where the 2nd argument of MiSelectImageBase (PSEGMENT pSeg) points within a PE file. I found out that this is the same parameter that is passed to the MiRelocateImage function as its 1st argument. I'm a little bit confused about the offsets used in the disassembly for pSeg and how the various parts of the randomized PE file are addressed. I hope you can help me with that. Thank you guys.

  2. Hey,

    thanks for your interest!

    pSeg represents a structure needed to create a part of a virtual address space. Once this part of the address space is mapped, the contents of the PE file are placed there starting right from the beginning of the file. This means that the whole PE file is a relocation unit, not the segments in it. The offsets of the pSeg structure are taken from the public symbols.

  3. Thank you for your fast reply. I'm still a little bit confused and maybe you can give me a little bit more information, please?

    The SEGMENT structure is not public, is it? At least I did not find any information about it. If I understand this right, the PSEGMENT pointer points to the image whose base address should be randomized through MiSelectImageBase? When you say pSeg points to the beginning of the PE file, does it point to the beginning of the MS-DOS stub or somewhere else? And what do you mean when you say that the whole PE file is a relocation unit, not the segments in it? I thought that when a PE file is mapped, only some parts are mapped into memory?

    To make my other problem with the offsets clearer: the SEGMENT structure seems to have some members, such as SegmentFlags or ImageInformation, whose counterparts I cannot find in the PE specification. Some flags, such as FLAG_BINARY32 or FLAG_BASE_BELOW_4GB, cannot be found in the PE specification either. That is why I don't know exactly what kind of information within the PE file is compared.

    For example the image base address of the PE file is accessed and compared to 0x100000000 through

    mov rax, 100000000h
    cmp [rcx+20h], rax

    and expressed in C-code:

    if (pSeg->BasedAddress > 0x100000000)

    so the base address of the PE file is found at an offset of 0x20 from pSeg. But according to the PE specification the ImageBase is located in the optional header windows-specific fields at an offset of 0x18, so I thought pSeg is not pointing to the beginning of the PE file that is to be randomized?

    Also when it is checked if the PE file is a 32-bit image, it is done through

    test byte ptr [rcx+0Eh], 40h

    and expressed as C-code:

    if (!(pSeg->SegmentFlags & FLAG_BINARY32))

    I can not even find the FLAG_BINARY32 flag in the PE specification and the offset of 0xE is pointing into the NumberOfSymbols field, isn't it?

    Maybe I completely mix things up and don't get the whole picture. That's why it would be really nice and helpful If you can explain the idea behind the SEGMENT structure in more detail and give additional information? I would be really thankful for that!

    1. Oh, got it now.

      The SEGMENT structure is not related to PE segments at all. It is an internal structure used by the virtual memory manager, so it is not correct to draw any conclusions about a correlation between the PE format specs and the SEGMENT structure. It is constructed and filled by the OS loader. As I said before, this structure is available through the Microsoft public debugging symbol server; for example, you can examine it in WinDbg by typing dt nt!_SEGMENT. Of course, this structure is not documented, but we can look at the names of its fields, which make some sense.

  4. Ah, ok. That makes sense! The truth is, I've tried to examine the structure in WinDbg (via dt nt!_SEGMENT) before, but WinDbg told me that the symbol was not found. So I thought it is not a public symbol. I must have been distracted somehow and made a stupid mistake when typing :S

    You say, that the SEGMENT structure is not related to the PE file, which should be randomized, at all? But I thought, some information from the PE file must be stored somehow inside the SEGMENT structure e.g. the base address, the MI_MEMORY_HIGHLOW value, the IMAGE_FILE_DLL flag and many more? Is this a wrong assumption by me then?

    Also you wrote, that the core function of the ASLR mechanism is implemented in the MiSelectImageBase function. But isn't this function only randomizing executables and DLLs for the user-mode memory? I thought the randomization of the process heaps, thread stacks (or better all bottom-up randomization) and also kernel ASLR (or better top-down randomization) is implemented somewhere else? Do you have any further information about that? Thank you for your patience btw ;)

    1. You’re welcome! =)

      Let’s say that the OS loader does all the work of parsing PE files. The SEGMENT structure is one of many control structures that are filled by the OS loader. Yes, it contains some information about the PE file, but only what is relevant to the SEGMENT structure. I guess there is no structure in the kernel that maps one-to-one to the PE header. In a nutshell, different fields and structures of the PE file are used by different internal kernel structures.

      You’re absolutely right! ASLR also affects heaps, stacks, TEBs etc. It is just that the article is more focused on image relocation. The ASLR parts used for randomizing other structures are located in the corresponding constructor functions for a given object, for example MiCreatePebOrTeb. I thought it was not worth mentioning them, because there is nothing interesting inside them. ASLR is just ExGenRandom(1) % in there.

  5. Hey guys,

    I have another question about the image bias value and please correct me if I wrote something wrong:

    Because each DLL should only be loaded once into physical memory, it has to be mapped at the same virtual address within the address space of each process that imports the DLL. Therefore the image base randomization for a DLL needs to be performed only once per boot. The randomization is based on the system-wide image bias value, which is calculated only once per boot. If I understood the DLL randomization the right way, a different image bias value is used depending on which kind of DLL has to be randomized:

    - 32-bit DLL image: MiImageBitMap
    - 64-bit DLL image (with image base below 4 GB): MiImageBitMap64Low
    - 64-bit DLL image (with image base above 4 GB): MiImageBitMap64High

    Do you know how and where these image bias values are calculated? In IDA I could only find out, that all three values are taken from an offset from the cs-register? Also it would be really nice to know how many bits (for randomization purpose) each value contains.

    Maybe you can help me with my questions...thx as always :)

    1. Sorry, but of course the names of the three different image bias values are:

      - 32-bit DLL image: MiImageBias
      - 64-bit DLL image (with image base below 4 GB): MiImageBias64Low
      - 64-bit DLL image (with image base above 4 GB): MiImageBias64High

    2. This info is just right in the article ;)

    3. Oh, of course! What a fail by myself! After crawling through a lot of references I just overlooked the given information in your article. Sorry for the unnecessary question :)

  6. Hey guys,

    I still don't get a small piece of the "puzzle" so maybe you can help again? What's about the first argument a1 for MiSelectImageBase, passed over through the rcx register? I don't get where the information within this structure comes from and what is stored in this structure. Especially the a1->SelectedBase value is often used for comparisons but what is stored within this variable? I think the pSeg->BasedAddress member contains the preferred load address of the PE image but what is the context with a1->SelectedBase?

    1. To tell the truth, we don’t know either, because it is some sort of internal structure. We were not able to match this structure to any of those available in the public symbols. So it is just arg1 with a pointer-to-something type. It happens all the time when reverse engineering, so you had better get used to that ;)

    2. Ok, good to know...but I still try to understand the randomization of an executable in detail. At the beginning of the code for relocating an executable it is checked, if the image is a 64-bit executable with an image base above 4 GB, through

      if (ImageBitmapType == MiMemoryHighLow)

      Then the image receives 17 bits of randomness. But what about the 64-bit images with an image base below 4 GB (ImageBitmapType == MiMemoryLow)? Do these images receive the same (lower) number of bits for randomization as 32-bit executables do? According to the code

      ULONG RandomVal = (ExGenRandom(1) % 0xFE + 1) << 0x10;

      both image types, 32-bit and 64-bit with an image base below 4 GB, just receive 8 bits of randomness...or am I missing something again? Especially the last part of the exe relocation gives me some trouble. According to

      if (a1->SelectedBase != NULL)
      {
      return pSeg->BasedAddress;
      }

      the last part of the exe relocation is only reached if

      a1->SelectedBase == NULL

      Otherwise pSeg->BasedAddress would be returned. But if a1->SelectedBase == NULL I don't get

      RelocDelta = pSeg->BasedAddress - a1->SelectedBase;

      because then RelocDelta must be the same as pSeg->BasedAddress? Actually I think this must be the case, because then the randomized exe image address would be the sum or the difference of the preferred base address and RandomVal, just like in Windows 7. But then I again don't get the meaning of a1->SelectedBase :(

    3. 32-bit and 64-bit images with an image base below 4 GB receive just 8 bits of randomness: yes, it’s like that.
      a1->SelectedBase follows from the fact that this image has already been loaded in some process. Remember there is a global bitmap of loaded modules in the system.
