- optimize translated cache chaining (DLL PLT-like system)
- 64 bit syscalls
- signals
- threads
- make it self runnable (use same trick as ld.so: include its own relocator and libc)
- improved 16 bit support
- fix FPU exceptions (in particular: gen_op_fpush not before mem load)