Another way of Hooking -> PeterPan

03/20/2008 14:10 pengpong#1
Hi,
I wanted to share some ideas and I'm looking for ideas/comments.

Some of you might already have read about hooking code (Detours, etc.).
PeterPan tries to install the hook in a more generic and easier way.

The old approach
It works like this:
We take CreateFileA as an example. Looking at the disassembly you will see:
Code:
.text:77E48CA4                 mov     edi, edi
.text:77E48CA6                 push    ebp
.text:77E48CA7                 mov     ebp, esp
Each mov takes 2 bytes and the push takes 1, so these 3 instructions add up to 5 bytes.

Now, if we want to hook that routine, we have to get to our code; this is done with a jmp.
And looking at the jmp instruction we see that it has 5 bytes (0xE9 and 4 bytes for the relative jump address).
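
To make the byte layout concrete, here is a minimal sketch in portable C++ that only computes the 5 bytes and patches nothing. The helper name encodeJmp is made up for illustration, and 32-bit addresses are assumed.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: the 5 bytes of "jmp rel32" as they would sit at address `from`.
std::array<std::uint8_t, 5> encodeJmp(std::uint32_t from, std::uint32_t to) {
    // The displacement is counted from the end of the instruction (from + 5).
    // Unsigned arithmetic wraps modulo 2^32, so backward jumps work too.
    const std::uint32_t rel = to - (from + 5);
    std::array<std::uint8_t, 5> bytes{};
    bytes[0] = 0xE9;  // opcode of jmp rel32
    for (int i = 0; i < 4; ++i) {
        bytes[1 + i] = static_cast<std::uint8_t>(rel >> (8 * i));  // little-endian
    }
    return bytes;
}
```

For example, a jmp placed at 0x1000 that should land on 0x2000 gets the displacement 0x2000 - 0x1005 = 0xFFB.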

Now the "mov edi,edi" makes sense! It does nothing, but Microsoft introduced it to pad the preamble of a function from 3 to 5 bytes, so that functions can be patched (and hooked) more easily (yes, no joke...).

Ok, so to hook the function we need to do the following:

* Manually check the function and see how many bytes we need to save for later use.
* Copy those bytes and append a jump back to the original function (not to the function's beginning, but to the address right after our evil jmp).
* Put our jmp code at the beginning of the original function.
* In the section we jump to, do some stuff, and then jmp to our saved bytes.
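
The steps above can be sketched on plain byte buffers, without touching a live process. This is only a simulation under assumptions: installJmpHook, appendJmp and the addresses are hypothetical, addresses are 32-bit, and saveLen is expected to end on an instruction boundary.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Append "jmp rel32" (as located at address `at`) to `out`.
static void appendJmp(std::vector<std::uint8_t>& out, std::uint32_t at, std::uint32_t to) {
    const std::uint32_t rel = to - (at + 5);
    out.push_back(0xE9);
    for (int i = 0; i < 4; ++i) out.push_back(static_cast<std::uint8_t>(rel >> (8 * i)));
}

// Hypothetical simulation: `code` holds the bytes of the function at `funcAddr`.
// Returns the contents of SavedFunction (assumed to live at `savedAddr`),
// or an empty vector if saveLen can't hold a 5 byte jmp.
std::vector<std::uint8_t> installJmpHook(std::vector<std::uint8_t>& code, std::uint32_t funcAddr,
                                         std::size_t saveLen, std::uint32_t evilAddr,
                                         std::uint32_t savedAddr) {
    if (saveLen < 5 || saveLen > code.size()) return {};
    const std::uint32_t len32 = static_cast<std::uint32_t>(saveLen);

    // Save the first bytes and append the jump back to the byte after them.
    std::vector<std::uint8_t> saved(code.begin(),
                                    code.begin() + static_cast<std::ptrdiff_t>(saveLen));
    appendJmp(saved, savedAddr + len32, funcAddr + len32);

    // Overwrite the start of the function with "jmp ourEvilCode", nop out leftovers.
    std::vector<std::uint8_t> patch;
    appendJmp(patch, funcAddr, evilAddr);
    patch.resize(saveLen, 0x90);
    std::copy(patch.begin(), patch.end(), code.begin());
    return saved;
}
```

In a real process the write to the function would additionally need the page to be made writable first.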

Ok, let's do this with our example:
Code:
.text:77E48CA4                 mov     edi, edi
.text:77E48CA6                 push    ebp
.text:77E48CA7                 mov     ebp, esp
.text:77E48CA9                 push    [ebp+lpFileName]
.text:77E48CAC                 call    sub_77E48C56
.text:77E48CB1                 test    eax, eax
So, we copy 5 bytes + jmp back. This will look like this:

SavedFunction:
Code:
mov edi,edi
push ebp
mov ebp,esp
jmp 77E48CA9
Our original function will look like this:

Code:
.text:77E48CA4                 jmp ourEvilCode
.text:77E48CA9                 push    [ebp+lpFileName]
.text:77E48CAC                 call    sub_77E48C56
.text:77E48CB1                 test    eax, eax
And ourEvilCode can look like this:

Code:
bla
bla
bla
jmp SavedFunction

This works nicely, but what are the downsides?

You have to check every function for the number of bytes you need to save.
For the Windows API this is easy, since 99% have the 5-byte preamble...
But let's insert a nop (1 byte long) at the beginning:

Code:
nop
mov edi,edi
push ebp
mov ebp,esp
If you now copy 5 bytes, you cut the mov ebp,esp in half, and the jmp back lands on the 2nd byte of that mov.

This will give you a protection fault sooner or later.
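
The rule behind this can be written down in a few lines: keep adding whole instructions until at least 5 bytes are covered. A minimal sketch, assuming the instruction lengths are already known (bytesToSave is a made-up name):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: given the lengths of the first instructions of a function,
// return the smallest instruction boundary that leaves room for a 5 byte jmp,
// or 0 if the listed instructions don't cover that many bytes.
std::size_t bytesToSave(const std::vector<std::size_t>& lengths, std::size_t needed = 5) {
    std::size_t total = 0;
    for (std::size_t len : lengths) {
        total += len;                        // never split an instruction
        if (total >= needed) return total;   // first boundary at or past `needed`
    }
    return 0;
}
```

For the plain preamble the lengths are 2, 1, 2 and the result is 5. With the extra nop they are 1, 2, 1, 2 and the result is 6, so 6 bytes have to be saved and the jmp back has to target function start + 6.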

The second thing is, if you want to hook more functions, you have to create an ourEvilCode for every function you're hooking (because you need to jmp to a different SavedFunction, and probably want a different Payload for every function).


Here's my idea of PeterPan:
* We have a table that gives us the length of each opcode. With this we can analyze how many bytes we have to save and where to jmp back in the code.
* Instead of jmp ourEvilCode, we do a call ourEvilCode. This pushes the return address (the eip right after the call) onto the stack. With pop eax we get that address, so a lookup table can tell us which Payload to execute and which SavedFunction to call.
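
A minimal sketch of such a lookup table, keyed by the return address that the call leaves on the stack. HookTable and HookEntry are hypothetical names, the getPay/getBack bodies are only a guess at what such lookups could do, and 32-bit addresses are assumed.

```cpp
#include <cstdint>
#include <unordered_map>

// What we want to know for each hooked function.
struct HookEntry {
    std::uint32_t payload;        // address of the Payload, 0 = none
    std::uint32_t savedFunction;  // address of the saved bytes + jmp back
};

class HookTable {
public:
    // "call rel32" is 5 bytes long, so the return address it pushes
    // is always targetFunction + 5. That value is the key.
    void add(std::uint32_t targetFunction, std::uint32_t payload, std::uint32_t savedFunction) {
        entries_[targetFunction + 5] = HookEntry{payload, savedFunction};
    }

    // Payload for the hook that pushed `returnAddress`, or 0 if unknown.
    std::uint32_t getPay(std::uint32_t returnAddress) const {
        auto it = entries_.find(returnAddress);
        return it == entries_.end() ? 0 : it->second.payload;
    }

    // SavedFunction for the hook that pushed `returnAddress`, or 0 if unknown.
    std::uint32_t getBack(std::uint32_t returnAddress) const {
        auto it = entries_.find(returnAddress);
        return it == entries_.end() ? 0 : it->second.savedFunction;
    }

private:
    std::unordered_map<std::uint32_t, HookEntry> entries_;
};
```

The point of this design is that one shared entry stub can serve every hooked function, because the popped return address alone identifies the hook.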





Some parts of my code (I'm thinking about releasing the whole code, but I need to clean it up a little more first).
It's a proof of concept, so there's plenty of room for improvements and cleanups....


------------------ How it's used....
Code:
	HMODULE mylo=LoadLibrary("kernel32.dll");
	DWORD add=(DWORD)GetProcAddress(mylo,"CreateFileA");
	
	
	doHook(add,(DWORD)&payload);
--------------- Build the stuff we insert into targetFunction
Code:
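	unsigned char *jmpcode=(unsigned char*)malloc(50);

	memset(jmpcode,0x90,50);		// everything after the call stays a nop

	jmpcode[0]=0xE8;			// call XX XX XX XX
	
	DWORD reljmp;
	reljmp= (DWORD)&myLoad-(targetFunction+5);	// relative to the end of the 5 byte call
	
	memcpy(jmpcode+1,&reljmp,4);


	DWORD OLD;
	if(!VirtualProtectEx(GetCurrentProcess(),(void*)targetFunction,50,PAGE_EXECUTE_READWRITE,&OLD)) return -1;
	// copysize is calculated in the next snippet; all saved bytes get overwritten (5 for the call, the rest nops)
	memcpy((void*)targetFunction,jmpcode,copysize);
	free(jmpcode);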
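
The same patch bytes can be built and inspected without writing into a live function. A minimal sketch, assuming 32-bit addresses; buildCallPatch is a made-up helper and not part of the code above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: bytes that replace the first `copySize` bytes of the
// function at `targetFunction`: "call handler" followed by nop padding.
// Returns an empty vector if a 5 byte call doesn't fit.
std::vector<std::uint8_t> buildCallPatch(std::uint32_t targetFunction, std::uint32_t handler,
                                         std::size_t copySize) {
    if (copySize < 5) return {};
    std::vector<std::uint8_t> patch(copySize, 0x90);  // nop padding
    const std::uint32_t rel = handler - (targetFunction + 5);
    patch[0] = 0xE8;  // opcode of call rel32
    for (std::size_t i = 0; i < 4; ++i) {
        patch[1 + i] = static_cast<std::uint8_t>(rel >> (8 * i));  // little-endian
    }
    return patch;
}
```

The padding matters when more than 5 bytes were saved: without it, the tail of a half-overwritten instruction would be left behind the call.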
--------------- Calculate how many bytes we need to save
Code:
	int copysize=-1;

	for(int i=0;i<30;i++) {			// 30 bytes should be enough
		unsigned char *p=(unsigned char*)(targetFunction+i);

		int in=*p;			// opcode byte, 0..255
		
		if(step[in]!=-1) 		//step[in] = bytes following the opcode byte (the i++ counts the opcode itself)
		{
			i+=step[in];		// i is now the last byte of this instruction
		} else {
			return -1;		//Damn.... unknown opcode, let's fail :(
		}
		if(i>=4) { 			
			copysize=i+1; break;	//Ok, we have at least 5 bytes 
		}

	}
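
Since step[] itself isn't shown here, this is a tiny stand-in that only knows the handful of encodings mentioned in this thread. It is a sketch under assumptions: extraBytes and computeCopySize are made-up names, and unlike a plain 256 entry table it peeks at the ModRM byte for 0x8B, because the length of that opcode depends on it.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical miniature of step[]: number of bytes that follow the opcode
// byte, or -1 for anything this sketch doesn't know.
int extraBytes(const std::uint8_t* p, std::size_t remaining) {
    if (remaining == 0) return -1;
    switch (p[0]) {
        case 0x90: return 0;             // nop
        case 0x55: return 0;             // push ebp
        case 0xE8:                       // call rel32
        case 0xE9: return 4;             // jmp rel32
        case 0x8B:                       // mov r32, r/m32
            if (remaining < 2) return -1;
            // Only the register-to-register form (ModRM mod == 11) is handled.
            return (p[1] & 0xC0) == 0xC0 ? 1 : -1;
        default: return -1;
    }
}

// Walk whole instructions until at least `minBytes` are covered.
// Returns the number of bytes to save, or -1 on unknown or truncated code.
int computeCopySize(const std::uint8_t* code, std::size_t avail, std::size_t minBytes = 5) {
    std::size_t offset = 0;
    while (offset < minBytes) {
        const int extra = extraBytes(code + offset, avail - offset);
        if (extra < 0) return -1;
        offset += 1 + static_cast<std::size_t>(extra);
        if (offset > avail) return -1;   // instruction runs past the buffer
    }
    return static_cast<int>(offset);
}
```

On the CreateFileA preamble (8B FF 55 8B EC) this yields 5, and with a nop in front it yields 6.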
-------------------- this is the function we call from targetFunction
Code:
__declspec (naked) void myLoad() {
	__asm {		
			nop			// i like to have nops, easier to find code in disassembly :)
			
			pop eax			// return address pushed by our call; get it & save it
			mov meax,eax		// meax is a global, so this is not thread safe

			push eax		// do we have a Payload?
			call getPay
			add esp,4

			cmp eax,0
			je rock_on
			call eax		// Call it...
rock_on:
			mov eax,meax		// eax=Address of Caller
			push eax
			call getBack		// Get address of backjump
			add esp,4
			
			mov eax,eax		// just ignore this line :) 
			jmp eax			
	}
}
03/20/2008 15:19 mr.rattlz#2
What would happen if you had some position dependent code in the first 5 bytes, like a short jump ?
03/20/2008 16:17 pengpong#3
hehe yup, that gave me a short headache last night.
The Themida protected DLL i analyzed yesterday had exports that were like:
Code:
NOP
jmp somewhere
The solution:
The code that gets saved is analyzed again, and the jmp offsets are recalculated (currently only for 0xE8 and 0xE9)

Ugly Code:
Code:
//Handle address relocations...
	for(int i=0;i<copysize;i++) {
		char *p=(char*)(backtrump+i); //don't ask why it's called backtrump:D

		DWORD in=(DWORD)*p;
		in=in & 0xFF;				
		if(step[in]!=-1) 
		{

			if(in==0xE8 || in==0xE9) {				
				OutputDebugString("we have a relative address...");
				DWORD old;
				memcpy(&old,p+1,4);
				old-=relocationBase; //this is calculated somewhere else...  relocationBase=((DWORD)backtrump-targetFunction)+1;
				memcpy(p+1,&old,4);


			}
			
			i+=step[in];
		} else {			
			return -1;
		}

	}
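
The recalculation itself can be shown in isolation: the branch has to reach the same absolute target from its new location. A minimal sketch, assuming 32-bit addresses and an instruction that starts with 0xE8 or 0xE9; relocateBranch and the two little-endian helpers are hypothetical names.

```cpp
#include <cstdint>

static std::uint32_t readLe32(const std::uint8_t* p) {
    return static_cast<std::uint32_t>(p[0]) | (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) | (static_cast<std::uint32_t>(p[3]) << 24);
}

static void writeLe32(std::uint8_t* p, std::uint32_t v) {
    for (int i = 0; i < 4; ++i) p[i] = static_cast<std::uint8_t>(v >> (8 * i));
}

// Hypothetical helper: `insn` is an instruction that was copied from address
// `oldAddr` to address `newAddr`. If it is call/jmp rel32, rewrite the
// displacement so it still reaches the same target. Returns true if patched.
bool relocateBranch(std::uint8_t* insn, std::uint32_t oldAddr, std::uint32_t newAddr) {
    if (insn[0] != 0xE8 && insn[0] != 0xE9) return false;
    const std::uint32_t oldRel = readLe32(insn + 1);
    const std::uint32_t target = oldAddr + 5 + oldRel;  // absolute destination
    writeLe32(insn + 1, target - (newAddr + 5));        // same target, new origin
    return true;
}
```

This is the same as subtracting (newAddr - oldAddr) from the old displacement. Short jumps (0xEB and the 0x70..0x7F conditionals) only carry a 1 byte displacement, so they usually can't reach the old target from the copy and would have to be rewritten into their rel32 forms instead.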
03/20/2008 16:47 mr.rattlz#4
Quote:
Originally Posted by pengpong
hehe yup, that gave me a short headache last night.
The Themida protected DLL i analyzed yesterday had exports that were like:
Code:
NOP
jmp somewhere
The solution:
The code that gets saved is analyzed again, and the jmp offsets are recalculated (currently only for 0xE8 and 0xE9)
So you still have some way to go. Maybe you want to take a look at HDE (Hacker Disassembler Engine) by Veacheslav Patkov, which takes up an amazingly tiny amount of space.
03/20/2008 16:50 pengpong#5
wow ... the description of HDE sounds promising :) thx
03/20/2008 22:12 rEdoX#6
Sorry to badly disappoint you, but uall did this like 3 years ago.
03/21/2008 12:31 pengpong#7
It's about learning, not if it has been done before.
So, i will gladly take a look at uall to see how they solved some problems.