Reflective Loading of PE File

Reflective loading is a powerful technique used by malware authors to evade detection and execute malicious code. By loading a Dynamic-Link Library (DLL) or Portable Executable (PE) file directly into memory, without relying on the operating system’s loader, reflective loading can bypass traditional security measures that are designed to prevent unauthorized code execution. According to the MITRE ATT&CK Framework, it identifies reflective loading as a commonly used technique by adversaries (MITRE,2022).

This post demonstrates would not provide the full code on how reflective loading works but partial ones on the more important concepts that I have been wanting to learn. To be able to create my own reflective loader, it is crucial to understand what relocation table is and how IAT (I am trying to load a PE file instead of DLL instead) should be fixed for Windows to understand how to run the file.

The first few steps includes:

Reading RAW PE file content to a memory buffer
Parsing the PE Header to get important loader information. For instance:
- Section RVA
- Section sizes
- Image Size
- EntryPoint
- etc
Create another buffer for loading based on information of PE header
Copy all the sections to the respective Virtual Addresses
Fix the relocation table
Fix the Import Address Table (IAT)

For this post, I will jump straight to point 5 and 6.

Relocation Table

To better understand this section, it is recommended to read the .reloc Section(Image Only) from windows documentation.

When binary file is loaded, the preferred image base address is not garaunteed to be available all the time. This is especially true for our case since we are loading this on top of another binary in the same memory address space. To fix this, the FixReloc function is responsible for fixing relocation errors in the shellcode that may occur during the loading process. For this to happen, the delta value is calculated to offset the current image base address from the preferred address.

Fixing the Reloc Table

A FixReloc function was created to fix relocation table. It begins by calculating the delta value as mentioned. Recall that during the loading of the PE file, we needed to create a heap with RWX permission. Heap address allocated is not fixed and its address is stored in pCustomLoaderInfo->ImageBase. The pCustomLoaderInfo is just a struct containing the relevant PE header data. This value is usually not the preferred image base. To correct this, the delta offset is calculated. The shellcode in these code snippets refers to the PE file that we are trying to load.

void FixReloc(PBYTE shellcode_load_address, PCUSTOM_LOADER_INFO pCustomLoaderInfo) {
    // Calculate the delta for offseting 
    DWORD delta = shellcode_load_address - pCustomLoaderInfo->ImageBase;
    // If delta is 0 then the image is already at the preferred image base address
    if (delta == 0) {
        return;
    }
    ...
    ...

Next in line, the function searches for the relocation section in the shellcode using virtual address specified in the loader configuration, pCustomLoaderInfo->VA_Reloc. Adding this Virtual Address to the shellcode_load_address would point us to PIMAGE_BASE_RELOCATION structure.

    // Find the reloc section
    PIMAGE_BASE_RELOCATION pRelocTable = (PIMAGE_BASE_RELOCATION)(pCustomLoaderInfo->VA_Reloc + shellcode_load_address);
    if (pRelocTable->SizeOfBlock == 0) {
        return TRUE; 
    }

Each base relocation block starts with the following structure:

typedef struct _IMAGE_BASE_RELOCATION {
    DWORD   VirtualAddress;
    DWORD   SizeOfBlock;
//  WORD    TypeOffset[1];
} IMAGE_BASE_RELOCATION;
typedef IMAGE_BASE_RELOCATION UNALIGNED * PIMAGE_BASE_RELOCATION;

The block size from the Base Relocation Block refers to an entry in the relocation table. To calculate the number of entries, obtain the size of the entire relocation table subtracting the bytes taken for the VirtualAddress and SizeOfBlock and divide by the size of WORD which is used in TypeOffset.

Next, theFixReloc function iterates through each entry in the relocation block, checking whether the type of relocation is IMAGE_REL_BASED_HIGHLOW or IMAGE_REL_BASED_DIR64 which are the most common type based on how the shellcode is compiled.

The Block Size field is then followed by any number of Type or Offset field entries. Each entry is a WORD (2 bytes) and has the following structure as seen in the same windows documentation:

If the types are IMAGE_REL_BASED_HIGHLOW or IMAGE_REL_BASED_DIR64, the the delta offset is added to the addresses from the entry.

void FixReloc(PBYTE shellcode_load_address, PCUSTOM_LOADER_INFO pCustomLoaderInfo) {
    // Calculate the delta for offseting 
    DWORD delta = shellcode_load_address - pCustomLoaderInfo->ImageBase;
    // If delta is 0 then the image is already at the preferred image base address
    if (delta == 0) {
        return;
    }

    // For each of the entry
    while (pRelocTable->VirtualAddress != 0) {
        
        DWORD entriesCount = (pRelocTable->SizeOfBlock - sizeof(IMAGE_BASE_RELOCATION)) / sizeof(WORD);
        PWORD pRelocationData = (PWORD)(pRelocTable + 1);

        for (int j = 0; j < entriesCount; j++) {
            WORD type = pRelocationData[j] >> 12;
            WORD offset = pRelocationData[j] & 0xfff;

            // Check relation if IMAGE_REL_BASED_HIGHLOW (3) or IMAGE_REL_BASED_DIR64
            if (type == IMAGE_REL_BASED_HIGHLOW || type == IMAGE_REL_BASED_DIR64) {
                // Adjust the values here
                // IMAGE_BASE_RELOCATION.VirtualAddress + offset + shellcode loaded address
                DWORD* address_to_relocate = shellcode_load_address + pRelocTable->VirtualAddress + offset; // The pRelocTable is really the RVA
                *address_to_relocate = *address_to_relocate + delta; // Notice that all values at address_to_relocate are actual addresses
            }
        }
        
        // Move to next relocatin block 
        pRelocTable = (PIMAGE_BASE_RELOCATION)((PBYTE)pRelocTable + pRelocTable->SizeOfBlock);
    }
}

One thing learnt during this study is that reloc entries contains offsets to jumpable addresses. Therefore it is crucial to fix these relocations should the preferred image base address is not available.

Import Address Table (IAT)

The Import Address Table (IAT) is a structure that contains addresses of functions imported by the shellcode. In order to properly execute the shellcode, the FixIAT function is created to fix the Import Address Table of the shellcode to point to the correct addresses of the imported functions.

To help understand the code to come, here is what the structures look like in the PE file. The IAT can contain multiple IMAGE_IMPORT_DESCRIPTOR.

typedef struct _IMAGE_IMPORT_DESCRIPTOR {
    union {
        DWORD   Characteristics;            // 0 for terminating null import descriptor
        DWORD   OriginalFirstThunk;         // RVA to original unbound IAT (PIMAGE_THUNK_DATA)
    } DUMMYUNIONNAME;
    DWORD   TimeDateStamp;                  // 0 if not bound,
                                            // -1 if bound, and real date\time stamp
                                            //     in IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT (new BIND)
                                            // O.W. date/time stamp of DLL bound to (Old BIND)

    DWORD   ForwarderChain;                 // -1 if no forwarders
    DWORD   Name;
    DWORD   FirstThunk;                     // RVA to IAT (if bound this IAT has actual addresses)
} IMAGE_IMPORT_DESCRIPTOR;
typedef IMAGE_IMPORT_DESCRIPTOR UNALIGNED *PIMAGE_IMPORT_DESCRIPTOR;

Note that it contains Relative Virtual Addresses to

Original First Thunk (imported function name table which is dubbed as Import Name Table throughout this study)
First Thunk which points to the ordinal table.
Name which is the name of the DLL.

The following shows the struct of the thunk data. Focus on both ordinal and Address of Data which a points to IMAGE_IMPORT_BY_NAME.

typedef struct _IMAGE_THUNK_DATA64 {
    union {
        ULONGLONG ForwarderString;  // PBYTE 
        ULONGLONG Function;         // PDWORD
        ULONGLONG Ordinal;
        ULONGLONG AddressOfData;    // PIMAGE_IMPORT_BY_NAME
    } u1;
} IMAGE_THUNK_DATA64;
typedef IMAGE_THUNK_DATA64 * PIMAGE_THUNK_DATA64;

To resolve all the functions, the function iterates through each Import Descriptor after every_IMAGE_IMPORT_BY_NAME is visited. Upon visitation, GetProcAddress resolves function name to function addresses.

The following presents the struct of IMAGE_IMPORT_BY_NAME:

typedef struct _IMAGE_IMPORT_BY_NAME {
    WORD    Hint;
    CHAR   Name[1];
} IMAGE_IMPORT_BY_NAME, *PIMAGE_IMPORT_BY_NAME;

Hint here is used for quicker resolution but is not necessary in our study.

In this study, the function name table is used because the IMAGE_ORDINAL_FLAG is not set. This probably happens because of how the shellcode is compiled.

The next figure shows how the data structure would look like in a PE file.

Fixing the IAT

The following snippet shows how the fixing of IAT is possible.

void FixIAT(BYTE* shellcode_load_address, PCUSTOM_LOADER_INFO pCustomLoaderInfo) {
    // Contains Virtual Address that points to Import Descriptor
    PIMAGE_IMPORT_DESCRIPTOR pImportDescriptor = (PIMAGE_IMPORT_DESCRIPTOR)(shellcode_load_address + pCustomLoaderInfo->VA_Import);
    
    while (pImportDescriptor->Name != 0) {
        // Import the modules that we need
        PBYTE dllname = (pImportDescriptor->Name + shellcode_load_address);
        printf("\n\nLoading from %s ", dllname);
        
        HMODULE dllModule = LoadLibraryA(dllname);
        if (!dllModule) {
            printf("[x] Error : Unable to load %s\n", dllname);
            return -1;
        }
        // Click into definition for PIMAGE_THUNK_DATA and you will find interesting helper functions
        PIMAGE_THUNK_DATA pIat = shellcode_load_address + pImportDescriptor->FirstThunk;
        PIMAGE_THUNK_DATA pInt = shellcode_load_address + pImportDescriptor->OriginalFirstThunk;

        //iterate through the IAT and INT
        while (pInt->u1.AddressOfData != 0) {
            if (pInt->u1.Ordinal & IMAGE_ORDINAL_FLAG) {
                // Search by ordinal
                WORD ordinal = IMAGE_ORDINAL(*(shellcode_load_address + pInt->u1.AddressOfData));

                //TODO Compile one that uses ordinals
                printf("\nHavent implemented this yet");
            }
            else {
                // By name and not ordinal
                IMAGE_IMPORT_BY_NAME* pImageImportByName = (IMAGE_IMPORT_BY_NAME*)(pIat->u1.AddressOfData + shellcode_load_address);
                BYTE* apiName = pImageImportByName->Name;
                WORD hint = pImageImportByName->Hint;
                FARPROC function_ptr = GetProcAddress(dllModule,(LPCSTR)apiName);
                // we take advantage of the fact that ordinal + function address location from IAT and function name index are the same each time.
                pIat->u1.Function = function_ptr;
                printf("\n\tHint : 0x%X\t\t - %s  " ,hint,apiName);
                int kk = 0;
            }
            pIat++;
            pInt++;
        }

        // Next Descriptor 
        pImportDescriptor++;
        // go to the next descriptor
    }
}

It should be noted however, that for real world malware, malware authors tend to indirectly resolve GetProcAddress since this is usually one of the first few places malware analysts would look at. Furthermore, they would try to remove as many dependencies as possible to make the shellcode work more independently. Nonetheless, this study is more of a proof of concept and thus, no attempt has been made to mimic that behaviour.

3.3.6. Execute Shellcode

With the relocations and function addresses fixed and resolved, the entry point can be calculated before jumping to it. Entry Point address can be calculated by adding the loaded shellcode’s base address + Virtual Address of entry point stored in cusom loader information. The permission of the page is then altered to PAGE_EXECUTE_READWRITE. After running the shellcode, the permission is then set back to its original value.

// Jump to entry point
    LPVOID address = ((BYTE*)shellcode_load_address + customLoaderInfo.VA_EntryPoint);
    void (*load_entry_point)() = address;
    VirtualProtect(shellcode_load_address, customLoaderInfo.ImageSize, PAGE_EXECUTE_READWRITE, &oldProtect);
    load_entry_point();
    VirtualProtect(shellcode_load_address, customLoaderInfo.ImageSize, oldProtect, &oldProtect);

Results

A test was done on a simple program and it worked!

Discussions

Memory forensics techniques can be employed to detect the use of reflective loading, even if the PE file does not exist on disk. Tools such as Volatility can be used to analyze the memory of the compromised system.Volatility’s malfind module or plugin can detect common patterns of reflective loader [1]. The plugin is capable of scanning for regions with PAGE_EXECUTE_READWRITE memory protection and check for the magic bytes MZ at the beginning of those regions (This is with the assumption that the header content are intact. In this study, since the headers are stripped, it is trickier to detect with malfind alone ).

References

https://www.ired.team/offensive-security/code-injection-process-injection/reflective-dll-injection
https://learn.microsoft.com/en-us/windows/win32/debug/pe-format
(Benjamin Sølberg n.d.), ReflectivePELoader, https://github.com/BenjaminSoelberg/ReflectivePELoader
MITRE. (2022, April 21). Reflective Code Loading. MITRE ATT&CK. https://attack.mitre.org/techniques/T1620/

Quick Study of Bring Your Own Vulnerable Driver (BYOVD)

Relocation Table and Import Address Table (IAT) in Reflectively Loaded PE File

Reflective Loading of PE File

Relocation Table

Fixing the Reloc Table

Import Address Table (IAT)

Fixing the IAT

3.3.6. Execute Shellcode

Results

Discussions

References

Quick Study of Bring Your Own Vulnerable Driver (BYOVD)

A Quick Look at BlackWood DLL Loader

A Quick Look at BlackWood DLL Loader

Quick Study of Bring Your Own Vulnerable Driver (BYOVD)

A quick Look at a Dropper and Downloader

Latest Posts

A Quick Look at BlackWood DLL Loader

Relocation Table and Import Address Table (IAT) in Reflectively Loaded PE File

Relocation Table and Import Address Table (IAT) in Reflectively Loaded PE File

Reflective Loading of PE File

Relocation Table

Fixing the Reloc Table

Import Address Table (IAT)

Fixing the IAT

3.3.6. Execute Shellcode

Results

Discussions

References

Quick Study of Bring Your Own Vulnerable Driver (BYOVD)

A Quick Look at BlackWood DLL Loader

You may also like

A Quick Look at BlackWood DLL Loader

Quick Study of Bring Your Own Vulnerable Driver (BYOVD)

A quick Look at a Dropper and Downloader

Latest Posts

A Quick Look at BlackWood DLL Loader

Relocation Table and Import Address Table (IAT) in Reflectively Loaded PE File

Explore Tags