Windows shellcode is a pain to write. Most of the win32 shellcode papers show how to get the kernel32 base through the SEH chain, or by going through the PEB, which the TEB points to at fs:[30h]. I think the more code-efficient way is something like this, or at least I think it is.
mov eax, ebp
sub eax, esp; eax = number of bytes in the current stack frame
mov ecx, [esp+eax+4]; ecx = saved return address (same as [ebp+4]), hopefully somewhere in k32..search
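xor cx, cx; round ecx down to a 64 KB boundary, modules load on 64 KB boundaries
loopme:
cmp word ptr [ecx], 5A4Dh; 'MZ' read as a little-endian word ('M' + 'Z' would just be 0A7h)
jz foundMZ
sub ecx, 10000h; step down one 64 KB block
jmp loopme
foundMZ:
nop; ecx = module base once we find the MZ header.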
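I think that'll work on a typical VC++ app that builds the stack frame. The return address only points into kernel32 if kernel32 called the function directly, so you may have to look further up the stack. Just an idea.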
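
If you want to sanity-check the search logic without a debugger, here is a rough Python sketch of the same backward search for the MZ header. It is only a simulation: `memory` is a made-up flat buffer standing in for the address space, `find_module_base` and `build_fake_memory` are hypothetical helper names, and it assumes modules sit on 64 KB boundaries, so it steps down 64 KB at a time.

```python
import struct

GRANULARITY = 0x10000  # 64 KB; assumed alignment of loaded modules
MZ_WORD = 0x5A4D       # the bytes 'M','Z' read as a little-endian 16-bit word


def find_module_base(memory: bytes, addr: int) -> int:
    """Search downward from addr for the MZ word, one 64 KB block at a time.

    memory is a hypothetical flat buffer; offset 0 plays the role of address 0.
    Returns the offset of the header, or -1 if nothing is found.
    """
    base = addr & ~(GRANULARITY - 1)  # round down to a 64 KB boundary
    while base >= 0:
        (word,) = struct.unpack_from("<H", memory, base)
        if word == MZ_WORD:
            return base
        base -= GRANULARITY
    return -1


def build_fake_memory() -> bytes:
    """Four empty 64 KB blocks with a fake 'MZ' header at the start of the third."""
    mem = bytearray(4 * GRANULARITY)
    mem[2 * GRANULARITY:2 * GRANULARITY + 2] = b"MZ"
    return bytes(mem)


if __name__ == "__main__":
    mem = build_fake_memory()
    # Pretend the return address landed somewhere inside the fake module.
    print(hex(find_module_base(mem, 2 * GRANULARITY + 0x1234)))
```

Note the comparison is against the 16-bit word 5A4Dh. Adding the two characters together gives 0A7h, which will never match the header.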