Commit 998a050186aaab43ae0027f7aceba158ed03766b
1 parent 33256a25
Update (thanks to Edgar, Thiemo, malc, Paul, Laurent and Andrzej)
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5453 c046a42c-6fe2-441c-8c8c-71466251a162
Showing 1 changed file with 178 additions and 110 deletions
qemu-tech.texi
... | ... | @@ -33,11 +33,12 @@ |
33 | 33 | |
34 | 34 | @menu |
35 | 35 | * intro_features:: Features |
36 | -* intro_x86_emulation:: x86 emulation | |
36 | +* intro_x86_emulation:: x86 and x86-64 emulation | |
37 | 37 | * intro_arm_emulation:: ARM emulation |
38 | 38 | * intro_mips_emulation:: MIPS emulation |
39 | 39 | * intro_ppc_emulation:: PowerPC emulation |
40 | -* intro_sparc_emulation:: SPARC emulation | |
40 | +* intro_sparc_emulation:: Sparc32 and Sparc64 emulation | |
41 | +* intro_other_emulation:: Other CPU emulation | |
41 | 42 | @end menu |
42 | 43 | |
43 | 44 | @node intro_features |
... | ... | @@ -51,17 +52,17 @@ QEMU has two operating modes: |
51 | 52 | @itemize @minus |
52 | 53 | |
53 | 54 | @item |
54 | -Full system emulation. In this mode, QEMU emulates a full system | |
55 | -(usually a PC), including a processor and various peripherals. It can | |
56 | -be used to launch an different Operating System without rebooting the | |
57 | -PC or to debug system code. | |
55 | +Full system emulation. In this mode (full platform virtualization), | |
56 | +QEMU emulates a full system (usually a PC), including a processor and | |
57 | +various peripherals. It can be used to launch several different | |
58 | +Operating Systems at once without rebooting the host machine or to | |
59 | +debug system code. | |
58 | 60 | |
59 | 61 | @item |
60 | -User mode emulation (Linux host only). In this mode, QEMU can launch | |
61 | -Linux processes compiled for one CPU on another CPU. It can be used to | |
62 | -launch the Wine Windows API emulator (@url{http://www.winehq.org}) or | |
63 | -to ease cross-compilation and cross-debugging. | |
64 | - | |
62 | +User mode emulation. In this mode (application level virtualization), | |
63 | +QEMU can launch processes compiled for one CPU on another CPU, however | |
64 | +the Operating Systems must match. This can be used for example to ease | |
65 | +cross-compilation and cross-debugging. | |
65 | 66 | @end itemize |
66 | 67 | |
67 | 68 | As QEMU requires no host kernel driver to run, it is very safe and |
... | ... | @@ -75,7 +76,10 @@ QEMU generic features: |
75 | 76 | |
76 | 77 | @item Using dynamic translation to native code for reasonable speed. |
77 | 78 | |
78 | -@item Working on x86 and PowerPC hosts. Being tested on ARM, Sparc32, Alpha and S390. | |
79 | +@item | |
80 | +Working on x86, x86_64 and PowerPC32/64 hosts. Being tested on ARM, | |
81 | +HPPA, Sparc32 and Sparc64. Previous versions had some support for | |
82 | +Alpha and S390 hosts, but TCG (see below) doesn't support those yet. | |
79 | 83 | |
80 | 84 | @item Self-modifying code support. |
81 | 85 | |
... | ... | @@ -85,6 +89,10 @@ QEMU generic features: |
85 | 89 | in other projects (look at @file{qemu/tests/qruncom.c} to have an |
86 | 90 | example of user mode @code{libqemu} usage). |
87 | 91 | |
92 | +@item | |
93 | +Floating point library supporting both full software emulation and | |
94 | +native host FPU instructions. | |
95 | + | |
88 | 96 | @end itemize |
89 | 97 | |
90 | 98 | QEMU user mode emulation features: |
... | ... | @@ -96,20 +104,47 @@ QEMU user mode emulation features: |
96 | 104 | @item Accurate signal handling by remapping host signals to target signals. |
97 | 105 | @end itemize |
98 | 106 | |
107 | +Linux user emulator (Linux host only) can be used to launch the Wine | |
108 | +Windows API emulator (@url{http://www.winehq.org}). A Darwin user | |
109 | +emulator (Darwin hosts only) exists and a BSD user emulator for BSD | |
110 | +hosts is under development. It would also be possible to develop a | |
111 | +similar user emulator for Solaris. | |
112 | + | |
99 | 113 | QEMU full system emulation features: |
100 | 114 | @itemize |
101 | -@item QEMU can either use a full software MMU for maximum portability or use the host system call mmap() to simulate the target MMU. | |
115 | +@item | |
116 | +QEMU uses a full software MMU for maximum portability. | |
117 | + | |
118 | +@item | |
119 | +QEMU can optionally use an in-kernel accelerator, like kqemu and | |
120 | +kvm. The accelerators execute some of the guest code natively, while | |
121 | +continuing to emulate the rest of the machine. | |
122 | + | |
123 | +@item | |
124 | +Various hardware devices can be emulated and in some cases, host | |
125 | +devices (e.g. serial and parallel ports, USB, drives) can be used | |
126 | +transparently by the guest Operating System. Host device passthrough | |
127 | +can be used for talking to external physical peripherals (e.g. a | |
128 | +webcam, modem or tape drive). | |
129 | + | |
130 | +@item | |
131 | +Symmetric multiprocessing (SMP) even on a host with a single CPU. On a | |
132 | +SMP host system, QEMU can use only one CPU fully due to difficulty in | |
133 | +implementing atomic memory accesses efficiently. | |
134 | + | |
102 | 135 | @end itemize |
103 | 136 | |
104 | 137 | @node intro_x86_emulation |
105 | -@section x86 emulation | |
138 | +@section x86 and x86-64 emulation | |
106 | 139 | |
107 | 140 | QEMU x86 target features: |
108 | 141 | |
109 | 142 | @itemize |
110 | 143 | |
111 | 144 | @item The virtual x86 CPU supports 16 bit and 32 bit addressing with segmentation. |
112 | -LDT/GDT and IDT are emulated. VM86 mode is also supported to run DOSEMU. | |
145 | +LDT/GDT and IDT are emulated. VM86 mode is also supported to run | |
146 | +DOSEMU. There is some support for MMX/3DNow!, SSE, SSE2, SSE3, SSSE3, | |
147 | +and SSE4 as well as x86-64 SVM. | |
113 | 148 | |
114 | 149 | @item Support of host page sizes bigger than 4KB in user mode emulation. |
115 | 150 | |
... | ... | @@ -124,9 +159,7 @@ Current QEMU limitations: |
124 | 159 | |
125 | 160 | @itemize |
126 | 161 | |
127 | -@item No SSE/MMX support (yet). | |
128 | - | |
129 | -@item No x86-64 support. | |
162 | +@item Limited x86-64 support. | |
130 | 163 | |
131 | 164 | @item IPC syscalls are missing. |
132 | 165 | |
... | ... | @@ -134,10 +167,6 @@ Current QEMU limitations: |
134 | 167 | memory access (yet). Hopefully, very few OSes seem to rely on that for |
135 | 168 | normal use. |
136 | 169 | |
137 | -@item On non x86 host CPUs, @code{double}s are used instead of the non standard | |
138 | -10 byte @code{long double}s of x86 for floating point emulation to get | |
139 | -maximum performances. | |
140 | - | |
141 | 170 | @end itemize |
142 | 171 | |
143 | 172 | @node intro_arm_emulation |
... | ... | @@ -193,7 +222,7 @@ FPU and MMU. |
193 | 222 | @end itemize |
194 | 223 | |
195 | 224 | @node intro_sparc_emulation |
196 | -@section SPARC emulation | |
225 | +@section Sparc32 and Sparc64 emulation | |
197 | 226 | |
198 | 227 | @itemize |
199 | 228 | |
... | ... | @@ -216,8 +245,26 @@ Current QEMU limitations: |
216 | 245 | |
217 | 246 | @item Atomic instructions are not correctly implemented. |
218 | 247 | |
219 | -@item Sparc64 emulators are not usable for anything yet. | |
248 | +@item There are still some problems with Sparc64 emulators. | |
249 | + | |
250 | +@end itemize | |
251 | + | |
252 | +@node intro_other_emulation | |
253 | +@section Other CPU emulation | |
220 | 254 | |
255 | +In addition to the above, QEMU supports emulation of other CPUs with | |
256 | +varying levels of success. These are: | |
257 | + | |
258 | +@itemize | |
259 | + | |
260 | +@item | |
261 | +Alpha | |
262 | +@item | |
263 | +CRIS | |
264 | +@item | |
265 | +M68k | |
266 | +@item | |
267 | +SH4 | |
221 | 268 | @end itemize |
222 | 269 | |
223 | 270 | @node QEMU Internals |
... | ... | @@ -226,7 +273,6 @@ Current QEMU limitations: |
226 | 273 | @menu |
227 | 274 | * QEMU compared to other emulators:: |
228 | 275 | * Portable dynamic translation:: |
229 | -* Register allocation:: | |
230 | 276 | * Condition code optimisations:: |
231 | 277 | * CPU state optimisations:: |
232 | 278 | * Translation cache:: |
... | ... | @@ -234,6 +280,7 @@ Current QEMU limitations: |
234 | 280 | * Self-modifying code and translated code invalidation:: |
235 | 281 | * Exception support:: |
236 | 282 | * MMU emulation:: |
283 | +* Device emulation:: | |
237 | 284 | * Hardware interrupts:: |
238 | 285 | * User emulation specific details:: |
239 | 286 | * Bibliography:: |
... | ... | @@ -273,19 +320,23 @@ patches. However, user mode Linux requires heavy kernel patches while |
273 | 320 | QEMU accepts unpatched Linux kernels. The price to pay is that QEMU is |
274 | 321 | slower. |
275 | 322 | |
276 | -The new Plex86 [8] PC virtualizer is done in the same spirit as the | |
277 | -qemu-fast system emulator. It requires a patched Linux kernel to work | |
278 | -(you cannot launch the same kernel on your PC), but the patches are | |
279 | -really small. As it is a PC virtualizer (no emulation is done except | |
280 | -for some privileged instructions), it has the potential of being | |
281 | -faster than QEMU. The downside is that a complicated (and potentially | |
282 | -unsafe) host kernel patch is needed. | |
323 | +The Plex86 [8] PC virtualizer is done in the same spirit as the now | |
324 | +obsolete qemu-fast system emulator. It requires a patched Linux kernel | |
325 | +to work (you cannot launch the same kernel on your PC), but the | |
326 | +patches are really small. As it is a PC virtualizer (no emulation is | |
327 | +done except for some privileged instructions), it has the potential of | |
328 | +being faster than QEMU. The downside is that a complicated (and | |
329 | +potentially unsafe) host kernel patch is needed. | |
283 | 330 | |
284 | 331 | The commercial PC Virtualizers (VMWare [9], VirtualPC [10], TwoOStwo |
285 | 332 | [11]) are faster than QEMU, but they all need specific, proprietary |
286 | 333 | and potentially unsafe host drivers. Moreover, they are unable to |
287 | 334 | provide cycle exact simulation as an emulator can. |
288 | 335 | |
336 | +VirtualBox [12], Xen [13] and KVM [14] are based on QEMU. QEMU-SystemC | |
337 | +[15] uses QEMU to simulate a system where some hardware devices are | |
338 | +developed in SystemC. | |
339 | + | |
289 | 340 | @node Portable dynamic translation |
290 | 341 | @section Portable dynamic translation |
291 | 342 | |
... | ... | @@ -295,63 +346,51 @@ are very complicated and highly CPU dependent. QEMU uses some tricks |
295 | 346 | which make it relatively easily portable and simple while achieving good |
296 | 347 | performances. |
297 | 348 | |
298 | -The basic idea is to split every x86 instruction into fewer simpler | |
299 | -instructions. Each simple instruction is implemented by a piece of C | |
300 | -code (see @file{target-i386/op.c}). Then a compile time tool | |
301 | -(@file{dyngen}) takes the corresponding object file (@file{op.o}) | |
302 | -to generate a dynamic code generator which concatenates the simple | |
303 | -instructions to build a function (see @file{op.h:dyngen_code()}). | |
304 | - | |
305 | -In essence, the process is similar to [1], but more work is done at | |
306 | -compile time. | |
307 | - | |
308 | -A key idea to get optimal performances is that constant parameters can | |
309 | -be passed to the simple operations. For that purpose, dummy ELF | |
310 | -relocations are generated with gcc for each constant parameter. Then, | |
311 | -the tool (@file{dyngen}) can locate the relocations and generate the | |
312 | -appriopriate C code to resolve them when building the dynamic code. | |
313 | - | |
314 | -That way, QEMU is no more difficult to port than a dynamic linker. | |
315 | - | |
316 | -To go even faster, GCC static register variables are used to keep the | |
317 | -state of the virtual CPU. | |
318 | - | |
319 | -@node Register allocation | |
320 | -@section Register allocation | |
321 | - | |
322 | -Since QEMU uses fixed simple instructions, no efficient register | |
323 | -allocation can be done. However, because RISC CPUs have a lot of | |
324 | -register, most of the virtual CPU state can be put in registers without | |
325 | -doing complicated register allocation. | |
349 | +After the release of version 0.9.1, QEMU switched to a new method of | |
350 | +generating code, Tiny Code Generator or TCG. TCG relaxes the | |
351 | +dependency on the exact version of the compiler used. The basic idea | |
352 | +is to split every target instruction into a couple of RISC-like TCG | |
353 | +ops (see @code{target-i386/translate.c}). Some optimizations can be | |
354 | +performed at this stage, including liveness analysis and trivial | |
355 | +constant expression evaluation. TCG ops are then implemented in the | |
356 | +host CPU back end, also known as TCG target (see | |
357 | +@code{tcg/i386/tcg-target.c}). For more information, please take a | |
358 | +look at @code{tcg/README}. | |
326 | 359 | |
327 | 360 | @node Condition code optimisations |
328 | 361 | @section Condition code optimisations |
329 | 362 | |
330 | -Good CPU condition codes emulation (@code{EFLAGS} register on x86) is a | |
331 | -critical point to get good performances. QEMU uses lazy condition code | |
332 | -evaluation: instead of computing the condition codes after each x86 | |
333 | -instruction, it just stores one operand (called @code{CC_SRC}), the | |
334 | -result (called @code{CC_DST}) and the type of operation (called | |
335 | -@code{CC_OP}). | |
363 | +Lazy evaluation of CPU condition codes (@code{EFLAGS} register on x86) | |
364 | +is important for CPUs where every instruction sets the condition | |
365 | +codes. It tends to be less important on conventional RISC systems | |
366 | +where condition codes are only updated when explicitly requested. | |
367 | + | |
368 | +Instead of computing the condition codes after each x86 instruction, | |
369 | +QEMU just stores one operand (called @code{CC_SRC}), the result | |
370 | +(called @code{CC_DST}) and the type of operation (called | |
371 | +@code{CC_OP}). When the condition codes are needed, the condition | |
372 | +codes can be calculated using this information. In addition, an | |
373 | +optimized calculation can be performed for some instruction types like | |
374 | +conditional branches. | |
336 | 375 | |
337 | 376 | @code{CC_OP} is almost never explicitly set in the generated code |
338 | 377 | because it is known at translation time. |
339 | 378 | |
340 | -In order to increase performances, a backward pass is performed on the | |
341 | -generated simple instructions (see | |
342 | -@code{target-i386/translate.c:optimize_flags()}). When it can be proved that | |
343 | -the condition codes are not needed by the next instructions, no | |
344 | -condition codes are computed at all. | |
379 | +The lazy condition code evaluation is used on x86, m68k and cris. ARM | |
380 | +uses a simplified variant for the N and Z flags. | |
345 | 381 | |
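
The lazy condition code scheme described in the section above can be sketched in C as follows. This is a minimal illustration and not code from this commit or from QEMU: the names `lazy_cc`, `emu_add32`, `emu_sub32`, `cc_carry` and `cc_zero` are hypothetical, only two operation types are modelled, and only the carry and zero flags are derived.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical operation types; a real emulator has many more. */
enum cc_op { CC_OP_ADD32, CC_OP_SUB32 };

struct lazy_cc {
    enum cc_op op;  /* type of the last flag-setting operation (CC_OP) */
    uint32_t src;   /* one operand of that operation (CC_SRC) */
    uint32_t dst;   /* its result (CC_DST) */
};

/* Emulate a 32-bit ADD: record the inputs needed for flags, compute none. */
static uint32_t emu_add32(struct lazy_cc *cc, uint32_t a, uint32_t b)
{
    uint32_t res = a + b;
    cc->op = CC_OP_ADD32;
    cc->src = b;
    cc->dst = res;
    return res;
}

/* Emulate a 32-bit SUB in the same way. */
static uint32_t emu_sub32(struct lazy_cc *cc, uint32_t a, uint32_t b)
{
    uint32_t res = a - b;
    cc->op = CC_OP_SUB32;
    cc->src = b;
    cc->dst = res;
    return res;
}

/* Carry flag, computed only when something actually asks for it. */
static int cc_carry(const struct lazy_cc *cc)
{
    switch (cc->op) {
    case CC_OP_ADD32:
        /* The sum wrapped around iff the result is below an operand. */
        return cc->dst < cc->src;
    case CC_OP_SUB32:
        /* Recover the first operand a = dst + src; borrow iff a < src. */
        return (uint32_t)(cc->dst + cc->src) < cc->src;
    }
    assert(0 && "unknown cc_op");
    return 0;
}

/* Zero flag depends on the result alone. */
static int cc_zero(const struct lazy_cc *cc)
{
    return cc->dst == 0;
}
```

A run of arithmetic instructions therefore costs three stores each, and the flag arithmetic is paid only by the instruction that finally reads a flag, for example a conditional branch.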
346 | 382 | @node CPU state optimisations |
347 | 383 | @section CPU state optimisations |
348 | 384 | |
349 | -The x86 CPU has many internal states which change the way it evaluates | |
350 | -instructions. In order to achieve a good speed, the translation phase | |
351 | -considers that some state information of the virtual x86 CPU cannot | |
352 | -change in it. For example, if the SS, DS and ES segments have a zero | |
353 | -base, then the translator does not even generate an addition for the | |
354 | -segment base. | |
385 | +The target CPUs have many internal states which change the way it | |
386 | +evaluates instructions. In order to achieve a good speed, the | |
387 | +translation phase considers that some state information of the virtual | |
388 | +CPU cannot change in it. The state is recorded in the Translation | |
389 | +Block (TB). If the state changes (e.g. privilege level), a new TB will | |
390 | +be generated and the previous TB won't be used anymore until the state | |
391 | +matches the state recorded in the previous TB. For example, if the SS, | |
392 | +DS and ES segments have a zero base, then the translator does not even | |
393 | +generate an addition for the segment base. | |
355 | 394 | |
356 | 395 | [The FPU stack pointer register is not handled that way yet]. |
357 | 396 | |
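
The idea of recording CPU state in the Translation Block can be illustrated with a small lookup keyed on both the program counter and a state word. This is a sketch under assumptions, not QEMU's data structure: `struct tb`, `tb_find`, the pool size and the crude flush-when-full policy are all hypothetical, and the `id` field merely stands in for generated host code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TB_POOL_SIZE 64  /* hypothetical, tiny for illustration */

struct tb {
    uint32_t pc;     /* guest address the block was translated from */
    uint32_t flags;  /* CPU state assumed constant during translation */
    int id;          /* stand-in for the generated host code */
};

static struct tb tb_pool[TB_POOL_SIZE];
static size_t tb_count;
static int tb_translations;  /* how many times "translation" ran */

/* Return the block for (pc, flags), translating a new one on a miss. */
static struct tb *tb_find(uint32_t pc, uint32_t flags)
{
    struct tb *tb;
    size_t i;

    for (i = 0; i < tb_count; i++) {
        /* A block is reusable only if the recorded state matches too. */
        if (tb_pool[i].pc == pc && tb_pool[i].flags == flags)
            return &tb_pool[i];
    }

    if (tb_count == TB_POOL_SIZE)
        tb_count = 0;  /* crude policy: drop every block when full */

    tb = &tb_pool[tb_count++];
    assert(tb_count <= TB_POOL_SIZE);
    tb->pc = pc;
    tb->flags = flags;
    tb->id = ++tb_translations;
    return tb;
}
```

Calling `tb_find(0x1000, 0)` and then `tb_find(0x1000, 3)` yields two distinct blocks for the same guest address; a later `tb_find(0x1000, 0)` returns the first block again without retranslating.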
... | ... | @@ -388,28 +427,20 @@ instruction cache invalidation is signaled by the application when code |
388 | 427 | is modified. |
389 | 428 | |
390 | 429 | When translated code is generated for a basic block, the corresponding |
391 | -host page is write protected if it is not already read-only (with the | |
392 | -system call @code{mprotect()}). Then, if a write access is done to the | |
393 | -page, Linux raises a SEGV signal. QEMU then invalidates all the | |
394 | -translated code in the page and enables write accesses to the page. | |
430 | +host page is write protected if it is not already read-only. Then, if | |
431 | +a write access is done to the page, Linux raises a SEGV signal. QEMU | |
432 | +then invalidates all the translated code in the page and enables write | |
433 | +accesses to the page. | |
395 | 434 | |
396 | 435 | Correct translated code invalidation is done efficiently by maintaining |
397 | 436 | a linked list of every translated block contained in a given page. Other |
398 | 437 | linked lists are also maintained to undo direct block chaining. |
399 | 438 | |
400 | -Although the overhead of doing @code{mprotect()} calls is important, | |
401 | -most MSDOS programs can be emulated at reasonnable speed with QEMU and | |
402 | -DOSEMU. | |
403 | - | |
404 | -Note that QEMU also invalidates pages of translated code when it detects | |
405 | -that memory mappings are modified with @code{mmap()} or @code{munmap()}. | |
406 | - | |
407 | -When using a software MMU, the code invalidation is more efficient: if | |
408 | -a given code page is invalidated too often because of write accesses, | |
409 | -then a bitmap representing all the code inside the page is | |
410 | -built. Every store into that page checks the bitmap to see if the code | |
411 | -really needs to be invalidated. It avoids invalidating the code when | |
412 | -only data is modified in the page. | |
439 | +On RISC targets, correctly written software uses memory barriers and | |
440 | +cache flushes, so some of the protection above would not be | |
441 | +necessary. However, QEMU still requires that the generated code always | |
442 | +matches the target instructions in memory in order to handle | |
443 | +exceptions correctly. | |
413 | 444 | |
414 | 445 | @node Exception support |
415 | 446 | @section Exception support |
... | ... | @@ -418,10 +449,9 @@ longjmp() is used when an exception such as division by zero is |
418 | 449 | encountered. |
419 | 450 | |
420 | 451 | The host SIGSEGV and SIGBUS signal handlers are used to get invalid |
421 | -memory accesses. The exact CPU state can be retrieved because all the | |
422 | -x86 registers are stored in fixed host registers. The simulated program | |
423 | -counter is found by retranslating the corresponding basic block and by | |
424 | -looking where the host program counter was at the exception point. | |
452 | +memory accesses. The simulated program counter is found by | |
453 | +retranslating the corresponding basic block and by looking where the | |
454 | +host program counter was at the exception point. | |
425 | 455 | |
426 | 456 | The virtual CPU cannot retrieve the exact @code{EFLAGS} register because |
427 | 457 | in some cases it is not computed because of condition code |
... | ... | @@ -431,15 +461,10 @@ still be restarted in any cases. |
431 | 461 | @node MMU emulation |
432 | 462 | @section MMU emulation |
433 | 463 | |
434 | -For system emulation, QEMU uses the mmap() system call to emulate the | |
435 | -target CPU MMU. It works as long the emulated OS does not use an area | |
436 | -reserved by the host OS (such as the area above 0xc0000000 on x86 | |
437 | -Linux). | |
438 | - | |
439 | -In order to be able to launch any OS, QEMU also supports a soft | |
440 | -MMU. In that mode, the MMU virtual to physical address translation is | |
441 | -done at every memory access. QEMU uses an address translation cache to | |
442 | -speed up the translation. | |
464 | +For system emulation QEMU supports a soft MMU. In that mode, the MMU | |
465 | +virtual to physical address translation is done at every memory | |
466 | +access. QEMU uses an address translation cache to speed up the | |
467 | +translation. | |
443 | 468 | |
444 | 469 | In order to avoid flushing the translated code each time the MMU |
445 | 470 | mappings change, QEMU uses a physically indexed translation cache. It |
... | ... | @@ -448,6 +473,33 @@ means that each basic block is indexed with its physical address. |
448 | 473 | When MMU mappings change, only the chaining of the basic blocks is |
449 | 474 | reset (i.e. a basic block can no longer jump directly to another one). |
450 | 475 | |
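
The address translation cache mentioned in the MMU section can be pictured as a small direct-mapped table placed in front of a slow page table walk. The sketch below is an assumption-laden illustration, not QEMU's implementation: `translate`, `tlb_flush` and `page_walk` are hypothetical names, and `page_walk` uses a fixed arithmetic mapping in place of real guest page tables.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_BITS 12
#define PAGE_SIZE (1u << PAGE_BITS)
#define PAGE_MASK (~(PAGE_SIZE - 1u))
#define TLB_SIZE  256u  /* hypothetical number of cache entries */

struct tlb_entry {
    uint32_t vpage;  /* page-aligned guest virtual address */
    uint32_t ppage;  /* page-aligned guest physical address */
    int valid;
};

static struct tlb_entry tlb[TLB_SIZE];
static unsigned slow_walks;  /* counts page table walks, for illustration */

/* Stand-in for a guest page table walk (arbitrary fixed mapping). */
static uint32_t page_walk(uint32_t vpage)
{
    slow_walks++;
    return vpage + 0x100000u;
}

/* Translate a virtual address, consulting the cache first. */
static uint32_t translate(uint32_t vaddr)
{
    uint32_t vpage = vaddr & PAGE_MASK;
    struct tlb_entry *e = &tlb[(vaddr >> PAGE_BITS) % TLB_SIZE];

    if (!e->valid || e->vpage != vpage) {
        /* Miss: do the slow walk once and remember the result. */
        e->vpage = vpage;
        e->ppage = page_walk(vpage);
        e->valid = 1;
    }
    assert((e->ppage & ~PAGE_MASK) == 0);  /* entries stay page aligned */
    return e->ppage | (vaddr & ~PAGE_MASK);
}

/* Called when the guest changes its MMU mappings. */
static void tlb_flush(void)
{
    unsigned i;
    for (i = 0; i < TLB_SIZE; i++)
        tlb[i].valid = 0;
}
```

Two accesses to the same page cost one walk, and after `tlb_flush()` the next access walks again. Because translated code is indexed by physical address, flushing this cache does not by itself require discarding translated blocks.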
476 | +@node Device emulation | |
477 | +@section Device emulation | |
478 | + | |
479 | +Systems emulated by QEMU are organized by boards. At initialization | |
480 | +phase, each board instantiates a number of CPUs, devices, RAM and | |
481 | +ROM. Each device in turn can assign I/O ports or memory areas (for | |
482 | +MMIO) to its handlers. When the emulation starts, an access to the | |
483 | +ports or MMIO memory areas assigned to the device causes the | |
484 | +corresponding handler to be called. | |
485 | + | |
486 | +RAM and ROM are handled more optimally, only the offset to the host | |
487 | +memory needs to be added to the guest address. | |
488 | + | |
489 | +The video RAM of VGA and other display cards is special: it can be | |
490 | +read or written directly like RAM, but write accesses cause the memory | |
491 | +to be marked with VGA_DIRTY flag as well. | |
492 | + | |
493 | +QEMU supports some device classes like serial and parallel ports, USB, | |
494 | +drives and network devices, by providing APIs for easier connection to | |
495 | +the generic, higher level implementations. The API hides the | |
496 | +implementation details from the devices, like native device use or | |
497 | +advanced block device formats like QCOW. | |
498 | + | |
499 | +Usually the devices implement a reset method and register support for | |
500 | +saving and loading of the device state. The devices can also use | |
501 | +timers, especially together with the use of bottom halves (BHs). | |
502 | + | |
451 | 503 | @node Hardware interrupts |
452 | 504 | @section Hardware interrupts |
453 | 505 | |
... | ... | @@ -513,9 +565,9 @@ it is not very useful, it is an important test to show the power of the |
513 | 565 | emulator. |
514 | 566 | |
515 | 567 | Achieving self-virtualization is not easy because there may be address |
516 | -space conflicts. QEMU solves this problem by being an executable ELF | |
517 | -shared object as the ld-linux.so ELF interpreter. That way, it can be | |
518 | -relocated at load time. | |
568 | +space conflicts. QEMU user emulators solve this problem by being an | |
569 | +executable ELF shared object as the ld-linux.so ELF interpreter. That | |
570 | +way, it can be relocated at load time. | |
519 | 571 | |
520 | 572 | @node Bibliography |
521 | 573 | @section Bibliography |
... | ... | @@ -568,6 +620,22 @@ The VirtualPC PC virtualizer. |
568 | 620 | @url{http://www.twoostwo.org/}, |
569 | 621 | The TwoOStwo PC virtualizer. |
570 | 622 | |
623 | +@item [12] | |
624 | +@url{http://virtualbox.org/}, | |
625 | +The VirtualBox PC virtualizer. | |
626 | + | |
627 | +@item [13] | |
628 | +@url{http://www.xen.org/}, | |
629 | +The Xen hypervisor. | |
630 | + | |
631 | +@item [14] | |
632 | +@url{http://kvm.qumranet.com/kvmwiki/Front_Page}, | |
633 | +Kernel Based Virtual Machine (KVM). | |
634 | + | |
635 | +@item [15] | |
636 | +@url{http://www.greensocs.com/projects/QEMUSystemC}, | |
637 | +QEMU-SystemC, a hardware co-simulator. | |
638 | + | |
571 | 639 | @end table |
572 | 640 | |
573 | 641 | @node Regression Tests | ... | ... |