Commit df0f11a03b5bda2a16b8fd9530b1feeef93da8e5
1 parent 2d92f0b8
update
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@197 c046a42c-6fe2-441c-8c8c-71466251a162
Showing 4 changed files with 113 additions and 62 deletions
Changelog
| ... | ... | @@ -14,7 +14,11 @@ version 0.2: |
| 14 | 14 | - SHL instruction C flag fix. |
| 15 | 15 | - mmap emulation for host page size > 4KB |
| 16 | 16 | - self-modifying code support |
| 17 | - - better VM86 support (dosemu begins to work) | |
| 17 | + - better VM86 support (dosemu works on non trivial programs) | |
| 18 | + - precise exception support (EIP is computed correctly in most cases) | |
| 19 | + - more precise LDT/GDT/IDT emulation | |
| 20 | + - faster segment load in vm86 mode | |
| 21 | + - direct chaining of basic blocks (faster emulation) | |
| 18 | 22 | |
| 19 | 23 | version 0.1.6: |
| 20 | 24 | ... | ... |
TODO
| 1 | -- fix gcc 2.96 compile bug | |
| 2 | -- fix thread locks | |
| 3 | -- optimize translated cache chaining (DLL PLT-like system) | |
| 1 | + | |
| 2 | +- fix iret/lret/fpush not before mem load restarting | |
| 3 | +- fix all remaining thread lock issues (must put TBs in a specific invalid | |
| 4 | + state, find a solution for tb_flush()). | |
| 5 | +- handle fp87 state in signals | |
| 6 | +- add gcc 2.96 test configure (some gcc3 flags are needed) | |
| 7 | +- optimize FPU operations (evaluate x87 stack pointer statically) | |
| 8 | +- add IPC syscalls | |
| 9 | +- submit a patch to fix DOSEMU coopthreads | |
| 10 | + | |
| 11 | +lower priority: | |
| 12 | +-------------- | |
| 13 | +- handle rare page fault cases (in particular if page fault in helpers or | |
| 14 | + in syscall emulation code). | |
| 4 | 15 | - fix thread stack freeing (use kernel 2.5.x CLONE_CHILD_CLEARTID) |
| 5 | -- fix x86 stack allocation | |
| 6 | -- fix iret/lret restarting | |
| 7 | 16 | - more syscalls (in particular all 64 bit ones, IPCs, fix 64 bit |
| 8 | 17 | issues, fix 16 bit uid issues) |
| 9 | -- finish signal handing (fp87 state, more siginfo conversions) | |
| 10 | -- fix FPU exceptions (in particular: gen_op_fpush not before mem load) | |
| 11 | -- handle self-modifying code (track mmap and mark all pages containing | |
| 12 | - translated code as readonly. use a custom signal handler to flush | |
| 13 | - parts of the translation cache if write access to a readonly page | |
| 14 | - containing translated code). | |
| 15 | -- use gcc to compile to static code | |
| 18 | +- use page_unprotect_range in every suitable syscall to handle all | |
| 19 | + cases of self modifying code. | |
| 20 | +- use gcc as a backend to generate better code (easy to do by using | |
| 21 | + op-i386.c operations as local inline functions). | |
| 22 | +- add SSE2/MMX operations | |
| ... | ... |
VERSION
qemu-doc.texi
| ... | ... | @@ -10,11 +10,11 @@ |
| 10 | 10 | @chapter Introduction |
| 11 | 11 | |
| 12 | 12 | QEMU is an x86 processor emulator. Its purpose is to run x86 Linux |
| 13 | -processes on non-x86 Linux architectures such as PowerPC or ARM. By | |
| 14 | -using dynamic translation it achieves a reasonnable speed while being | |
| 15 | -easy to port on new host CPUs. Its main goal is to be able to launch the | |
| 16 | -@code{Wine} Windows API emulator (@url{http://www.winehq.org}) on | |
| 17 | -non-x86 CPUs. | |
| 13 | +processes on non-x86 Linux architectures such as PowerPC. By using | |
| 14 | +dynamic translation it achieves a reasonable speed while being easy to | |
| 15 | +port on new host CPUs. Its main goal is to be able to launch the | |
| 16 | +@code{Wine} Windows API emulator (@url{http://www.winehq.org}) or | |
| 17 | +@code{DOSEMU} (@url{http://www.dosemu.org}) on non-x86 CPUs. | |
| 18 | 18 | |
| 19 | 19 | QEMU features: |
| 20 | 20 | |
| ... | ... | @@ -22,21 +22,26 @@ QEMU features: |
| 22 | 22 | |
| 23 | 23 | @item User space only x86 emulator. |
| 24 | 24 | |
| 25 | -@item Currently ported on i386, PowerPC and S390. | |
| 25 | +@item Currently ported on i386, PowerPC. Work in progress for S390, Alpha and Sparc. | |
| 26 | 26 | |
| 27 | 27 | @item Using dynamic translation to native code for reasonable speed. |
| 28 | 28 | |
| 29 | 29 | @item The virtual x86 CPU supports 16 bit and 32 bit addressing with segmentation. |
| 30 | -User space LDT and GDT are emulated. VM86 mode is also supported | |
| 31 | -(experimental). | |
| 30 | +User space LDT and GDT are emulated. VM86 mode is also supported. | |
| 32 | 31 | |
| 33 | 32 | @item Generic Linux system call converter, including most ioctls. |
| 34 | 33 | |
| 35 | 34 | @item clone() emulation using native CPU clone() to use Linux scheduler for threads. |
| 36 | 35 | |
| 37 | -@item Accurate signal handling by remapping host signals to virtual x86 signals. | |
| 36 | +@item Accurate signal handling by remapping host signals to virtual x86 signals. | |
| 38 | 37 | |
| 39 | -@item QEMU can emulate itself on x86 (experimental). | |
| 38 | +@item Precise user space x86 exceptions. | |
| 39 | + | |
| 40 | +@item Self-modifying code support. | |
| 41 | + | |
| 42 | +@item Support of host page sizes bigger than 4KB. | |
| 43 | + | |
| 44 | +@item QEMU can emulate itself on x86. | |
| 40 | 45 | |
| 41 | 46 | @item The virtual x86 CPU is a library (@code{libqemu}) which can be used |
| 42 | 47 | in other projects. |
| ... | ... | @@ -46,19 +51,15 @@ It can be used to test other x86 virtual CPUs. |
| 46 | 51 | |
| 47 | 52 | @end itemize |
| 48 | 53 | |
| 49 | -Current QEMU Limitations: | |
| 54 | +Current QEMU limitations: | |
| 50 | 55 | |
| 51 | 56 | @itemize |
| 52 | 57 | |
| 53 | -@item Not all x86 exceptions are precise (yet). [Very few programs need that]. | |
| 54 | - | |
| 55 | -@item No support for self-modifying code (yet). [Very few programs need that, a notable exception is QEMU itself !]. | |
| 56 | - | |
| 57 | 58 | @item No SSE/MMX support (yet). |
| 58 | 59 | |
| 59 | 60 | @item No x86-64 support. |
| 60 | 61 | |
| 61 | -@item Some Linux syscalls are missing. | |
| 62 | +@item IPC syscalls are missing. | |
| 62 | 63 | |
| 63 | 64 | @item The x86 segment limits and access rights are not tested at every |
| 64 | 65 | memory access (and will never be to have good performances). |
| ... | ... | @@ -119,7 +120,7 @@ qemu /usr/local/qemu-i386/bin/qemu-i386 /usr/local/qemu-i386/bin/ls-i386 |
| 119 | 120 | |
| 120 | 121 | @end itemize |
| 121 | 122 | |
| 122 | -@section Wine launch (Currently only tested when emulating x86 on x86) | |
| 123 | +@section Wine launch | |
| 123 | 124 | |
| 124 | 125 | @itemize |
| 125 | 126 | |
| ... | ... | @@ -152,17 +153,24 @@ qemu /usr/local/qemu-i386/wine/bin/wine /usr/local/qemu-i386/wine/c/Program\ Fil |
| 152 | 153 | usage: qemu [-h] [-d] [-L path] [-s size] program [arguments...] |
| 153 | 154 | @end example |
| 154 | 155 | |
| 155 | -@table @samp | |
| 156 | +@table @option | |
| 156 | 157 | @item -h |
| 157 | 158 | Print the help |
| 158 | -@item -d | |
| 159 | -Activate log (logfile=/tmp/qemu.log) | |
| 160 | 159 | @item -L path |
| 161 | 160 | Set the x86 elf interpreter prefix (default=/usr/local/qemu-i386) |
| 162 | 161 | @item -s size |
| 163 | 162 | Set the x86 stack size in bytes (default=524288) |
| 164 | 163 | @end table |
| 165 | 164 | |
| 165 | +Debug options: | |
| 166 | + | |
| 167 | +@table @option | |
| 168 | +@item -d | |
| 169 | +Activate log (logfile=/tmp/qemu.log) | |
| 170 | +@item -p pagesize | |
| 171 | +Act as if the host page size was 'pagesize' bytes | |
| 172 | +@end table | |
| 173 | + | |
| 166 | 174 | @chapter QEMU Internals |
| 167 | 175 | |
| 168 | 176 | @section QEMU compared to other emulators |
| ... | ... | @@ -265,17 +273,59 @@ contains just a single basic block (a block of x86 instructions |
| 265 | 273 | terminated by a jump or by a virtual CPU state change which the |
| 266 | 274 | translator cannot deduce statically). |
| 267 | 275 | |
| 268 | -[Currently, the translated code is not patched if it jumps to another | |
| 269 | -translated code]. | |
| 276 | +@section Direct block chaining | |
| 277 | + | |
| 278 | +After each translated basic block is executed, QEMU uses the simulated | |
| 279 | +Program Counter (PC) and other CPU state information (such as the CS | |
| 280 | +segment base value) to find the next basic block. | |
| 281 | + | |
| 282 | +In order to accelerate the most common cases where the new simulated PC | |
| 283 | +is known, QEMU can patch a basic block so that it jumps directly to the | |
| 284 | +next one. | |
| 285 | + | |
| 286 | +The most portable code uses an indirect jump. An indirect jump makes it | |
| 287 | +easier to make the jump target modification atomic. On some | |
| 288 | +architectures (such as PowerPC), the @code{JUMP} opcode is directly | |
| 289 | +patched so that the block chaining has no overhead. | |
| 290 | + | |
| 291 | +@section Self-modifying code and translated code invalidation | |
| 292 | + | |
| 293 | +Self-modifying code is a special challenge in x86 emulation because no | |
| 294 | +instruction cache invalidation is signaled by the application when code | |
| 295 | +is modified. | |
| 296 | + | |
| 297 | +When translated code is generated for a basic block, the corresponding | |
| 298 | +host page is write protected if it is not already read-only (with the | |
| 299 | +system call @code{mprotect()}). Then, if a write access is done to the | |
| 300 | +page, Linux raises a SEGV signal. QEMU then invalidates all the | |
| 301 | +translated code in the page and enables write accesses to the page. | |
| 302 | + | |
| 303 | +Correct translated code invalidation is done efficiently by maintaining | |
| 304 | +a linked list of every translated block contained in a given page. Other | |
| 305 | +linked lists are also maintained to undo direct block chaining. | |
| 306 | + | |
| 307 | +Although the overhead of doing @code{mprotect()} calls is significant, | |
| 308 | +most MSDOS programs can be emulated at reasonable speed with QEMU and | |
| 309 | +DOSEMU. | |
| 310 | + | |
| 311 | +Note that QEMU also invalidates pages of translated code when it detects | |
| 312 | +that memory mappings are modified with @code{mmap()} or @code{munmap()}. | |
| 270 | 313 | |
| 271 | 314 | @section Exception support |
| 272 | 315 | |
| 273 | 316 | longjmp() is used when an exception such as division by zero is |
| 274 | -encountered. The host SIGSEGV and SIGBUS signal handlers are used to get | |
| 275 | -invalid memory accesses. | |
| 317 | +encountered. | |
| 276 | 318 | |
| 277 | -[Currently, the virtual CPU cannot retrieve the exact CPU state in some | |
| 278 | -exceptions, although it could except for the @code{EFLAGS} register]. | |
| 319 | +The host SIGSEGV and SIGBUS signal handlers are used to get invalid | |
| 320 | +memory accesses. The exact CPU state can be retrieved because all the | |
| 321 | +x86 registers are stored in fixed host registers. The simulated program | |
| 322 | +counter is found by retranslating the corresponding basic block and by | |
| 323 | +looking where the host program counter was at the exception point. | |
| 324 | + | |
| 325 | +The virtual CPU cannot retrieve the exact @code{EFLAGS} register because | |
| 326 | +in some cases it is not computed because of condition code | |
| 327 | +optimisations. It is not a big concern because the emulated code can | |
| 328 | +still be restarted in any case. | |
| 279 | 329 | |
| 280 | 330 | @section Linux system call translation |
| 281 | 331 | |
| ... | ... | @@ -284,6 +334,11 @@ the parameters of the system calls can be converted to fix the |
| 284 | 334 | endianness and 32/64 bit issues. The IOCTLs are converted with a generic |
| 285 | 335 | type description system (see @file{ioctls.h} and @file{thunk.c}). |
| 286 | 336 | |
| 337 | +QEMU supports host CPUs which have pages bigger than 4KB. It records all | |
| 338 | +the mappings the process does and tries to emulate the @code{mmap()} | |
| 339 | +system calls in cases where the host @code{mmap()} call would fail | |
| 340 | +because of bad page alignment. | |
| 341 | + | |
| 287 | 342 | @section Linux signals |
| 288 | 343 | |
| 289 | 344 | Normal and real-time signals are queued along with their information |
| ... | ... | @@ -312,6 +367,10 @@ thread. |
| 312 | 367 | The virtual x86 CPU atomic operations are emulated with a global lock so |
| 313 | 368 | that their semantic is preserved. |
| 314 | 369 | |
| 370 | +Note that currently there are still some locking issues in QEMU. In | |
| 371 | +particular, the translated cache flush is not protected yet against | |
| 372 | +reentrancy. | |
| 373 | + | |
| 315 | 374 | @section Self-virtualization |
| 316 | 375 | |
| 317 | 376 | QEMU was conceived so that ultimately it can emulate itself. Although |
| ... | ... | @@ -323,10 +382,6 @@ space conflicts. QEMU solves this problem by being an executable ELF |
| 323 | 382 | shared object as the ld-linux.so ELF interpreter. That way, it can be |
| 324 | 383 | relocated at load time. |
| 325 | 384 | |
| 326 | -Since self-modifying code is not supported yet, QEMU cannot emulate | |
| 327 | -itself in case of translation cache flush. This limitation will be | |
| 328 | -suppressed soon. | |
| 329 | - | |
| 330 | 385 | @section Bibliography |
| 331 | 386 | |
| 332 | 387 | @table @asis |
| ... | ... | @@ -379,19 +434,10 @@ program and a @code{diff} on the generated output. |
| 379 | 434 | The Linux system call @code{modify_ldt()} is used to create x86 selectors |
| 380 | 435 | to test some 16 bit addressing and 32 bit with segmentation cases. |
| 381 | 436 | |
| 382 | -@section @file{testsig} | |
| 383 | - | |
| 384 | -This program tests various signal cases, including SIGFPE, SIGSEGV and | |
| 385 | -SIGILL. | |
| 386 | - | |
| 387 | -@section @file{testclone} | |
| 437 | +The Linux system call @code{vm86()} is used to test vm86 emulation. | |
| 388 | 438 | |
| 389 | -Tests the @code{clone()} system call (basic test). | |
| 390 | - | |
| 391 | -@section @file{testthread} | |
| 392 | - | |
| 393 | -Tests the glibc threads (more complicated than @code{clone()} because signals | |
| 394 | -are also used). | |
| 439 | +Various exceptions are raised to test most of the x86 user space | |
| 440 | +exception reporting. | |
| 395 | 441 | |
| 396 | 442 | @section @file{sha1} |
| 397 | 443 | |
| ... | ... | @@ -399,9 +445,3 @@ It is a simple benchmark. Care must be taken to interpret the results |
| 399 | 445 | because it mostly tests the ability of the virtual CPU to optimize the |
| 400 | 446 | @code{rol} x86 instruction and the condition code computations. |
| 401 | 447 | |
| 402 | -@section @file{runcom} | |
| 403 | - | |
| 404 | -A very simple MSDOS emulator to test the Linux vm86() system call | |
| 405 | -emulation. The excellent 54 byte @file{pi_10.com} PI number calculator | |
| 406 | -can be launched with it. @file{pi_10.com} was written by Bertram | |
| 407 | -Felgenhauer (more information at @url{http://www.boo.net/~jasonp/pipage.html}). | |
| ... | ... |
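
The new "Self-modifying code and translated code invalidation" section in qemu-doc.texi describes write-protecting a page with `mprotect()` once code from it has been translated, then catching the resulting SEGV to invalidate the translations and re-enable writes. The following is a minimal sketch of that mechanism on a Linux host, not QEMU's actual code. The names `guest_page`, `invalidation_count`, `on_write_fault` and `demo_write_to_translated_page` are hypothetical, and a simple counter stands in for walking the per-page list of translated blocks.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical bookkeeping for a single guest page (illustration only). */
static unsigned char *guest_page;
static size_t guest_page_size;
static volatile sig_atomic_t invalidation_count;

/* SIGSEGV handler: a store hit the write-protected page. */
static void on_write_fault(int sig, siginfo_t *info, void *ucontext)
{
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)guest_page;

    (void)sig;
    (void)ucontext;
    if (addr < base || addr >= base + guest_page_size)
        _exit(1); /* fault outside the tracked page: a real crash */

    /* Stand-in for invalidating every translated block in the page. */
    invalidation_count++;

    /* Re-enable writes so the faulting store succeeds when it is retried.
       mprotect() is not on the POSIX async-signal-safe list; this sketch
       assumes Linux, where it is a plain system call. */
    if (mprotect(guest_page, guest_page_size, PROT_READ | PROT_WRITE) != 0)
        _exit(2);
}

/* Returns the number of invalidations triggered by one write, or -1 on error. */
int demo_write_to_translated_page(void)
{
    struct sigaction sa, old_sa;
    int result = -1;

    invalidation_count = 0;
    guest_page_size = (size_t)sysconf(_SC_PAGESIZE);
    guest_page = mmap(NULL, guest_page_size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (guest_page == MAP_FAILED)
        return -1;
    guest_page[0] = 0x90; /* pretend this byte is guest code */

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_write_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old_sa) != 0) {
        munmap(guest_page, guest_page_size);
        return -1;
    }

    /* "Translation" happened: write-protect the page holding the code. */
    if (mprotect(guest_page, guest_page_size, PROT_READ) == 0) {
        /* The guest modifies its own code: faults once, then is retried. */
        ((volatile unsigned char *)guest_page)[0] = 0xc3;
        if (guest_page[0] == 0xc3)
            result = (int)invalidation_count;
    }

    sigaction(SIGSEGV, &old_sa, NULL); /* restore the previous handler */
    munmap(guest_page, guest_page_size);
    return result;
}
```

Called from a `main()` on Linux, `demo_write_to_translated_page()` should return 1: the store faults once, the handler records the invalidation and unprotects the page, and the retried store then completes. A real emulator would also have to protect the page again the next time code from it is translated.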