10 |
|
|
11 |
<!-- |
<!-- |
12 |
|
|
13 |
$Id: intro.html,v 1.90 2006/08/14 17:45:47 debug Exp $ |
$Id: intro.html,v 1.108 2007/04/12 16:57:22 debug Exp $ |
14 |
|
|
15 |
Copyright (C) 2003-2006 Anders Gavare. All rights reserved. |
Copyright (C) 2003-2007 Anders Gavare. All rights reserved. |
16 |
|
|
17 |
Redistribution and use in source and binary forms, with or without |
Redistribution and use in source and binary forms, with or without |
18 |
modification, are permitted provided that the following conditions are met: |
modification, are permitted provided that the following conditions are met: |
53 |
<li><a href="#run">How to run the emulator</a> |
<li><a href="#run">How to run the emulator</a> |
54 |
<li><a href="#cpus">Which processor architectures does GXemul emulate?</a> |
<li><a href="#cpus">Which processor architectures does GXemul emulate?</a> |
55 |
<li><a href="#hosts">Which host architectures are supported?</a> |
<li><a href="#hosts">Which host architectures are supported?</a> |
|
<li><a href="#translation">What kind of translation does GXemul use?</a> |
|
56 |
<li><a href="#accuracy">Emulation accuracy</a> |
<li><a href="#accuracy">Emulation accuracy</a> |
57 |
<li><a href="#emulmodes">Which machines does GXemul emulate?</a> |
<li><a href="#emulmodes">Which machines does GXemul emulate?</a> |
58 |
</ul> |
</ul> |
72 |
hardware components are emulated well enough to let unmodified operating |
hardware components are emulated well enough to let unmodified operating |
73 |
systems (e.g. NetBSD) run as if they were running on a real machine. |
systems (e.g. NetBSD) run as if they were running on a real machine. |
74 |
|
|
75 |
<p>Devices and processors (ARM, MIPS, PowerPC) are not simulated with 100% |
<p>Devices and processors are not simulated with 100% accuracy. They are |
76 |
accuracy. They are only ``faked'' well enough to allow guest operating |
only ``faked'' well enough to allow guest operating systems to run without |
77 |
systems run without complaining too much. Still, the emulator could be of |
complaining too much. Still, the emulator could be of interest for |
78 |
interest for academic research and experiments, such as when learning how |
academic research and experiments, such as when learning how to write |
79 |
to write operating system code. |
operating system code. |
80 |
|
|
81 |
<p>The emulator is written in C, does not depend on third-party libraries, |
<p>The emulator is written in C, does not depend on third-party libraries, |
82 |
and should compile and run on most 64-bit and 32-bit Unix-like systems. |
and should compile and run on most 64-bit and 32-bit Unix-like systems. |
94 |
|
|
95 |
<p>If you do not have a kernel as a separate file, but you have a bootable |
<p>If you do not have a kernel as a separate file, but you have a bootable |
96 |
disk image, then it is sometimes possible to boot directly from that |
disk image, then it is sometimes possible to boot directly from that |
97 |
image. (This works for example with DECstation emulation, or when booting |
image. (This works for example with DECstation emulation, Dreamcast |
98 |
from ISO9660 CDROM images.) |
emulation, or when booting from generic ISO9660 CDROM images if the |
99 |
|
kernel is included in the image as a plain file.) |
100 |
|
|
101 |
|
<p>Thanks to (in no specific order) Joachim Buss, Olivier Houchard, Juli |
102 |
|
Mallett, Juan Romero Pardines, Alec Voropay, Göran Weinholt, Alexander |
103 |
|
Yurchenko, and everyone else who has provided me with feedback. |
104 |
|
|
105 |
|
|
106 |
|
|
216 |
<h3>Which processor architectures does GXemul emulate?</h3> |
<h3>Which processor architectures does GXemul emulate?</h3> |
217 |
|
|
218 |
The architectures that are emulated well enough to let at least one |
The architectures that are emulated well enough to let at least one |
219 |
guest operating system run (per architecture) are ARM, MIPS, and |
guest operating system run (per architecture) are ARM, MIPS, PowerPC, |
220 |
PowerPC. |
and SuperH. |
|
|
|
|
|
|
|
|
|
221 |
|
|
222 |
|
<p>Please read the page about <a href="guestoses.html">guest operating |
223 |
|
systems</a> for more information about the machines and operating systems |
224 |
|
that can be considered "working" in the emulator. |
225 |
|
|
|
<p><br> |
|
|
<a name="hosts"></a> |
|
|
<h3>Which host architectures are supported?</h3> |
|
|
|
|
|
As of release 0.4.0 of GXemul, the old binary translation subsystem, which |
|
|
was used for emulation of MIPS processors on Alpha and i386 hosts, has |
|
|
been removed. The current dynamic translation subsystem should work on any |
|
|
host. |
|
226 |
|
|
227 |
|
|
228 |
|
|
229 |
|
|
230 |
|
|
231 |
<p><br> |
<p><br> |
232 |
<a name="translation"></a> |
<a name="hosts"></a> |
233 |
<h3>What kind of translation does GXemul use?</h3> |
<h3>Which host architectures are supported?</h3> |
|
|
|
|
<b>Static vs. dynamic:</b> |
|
|
|
|
|
<p>In order to support guest operating systems, which can overwrite old |
|
|
code pages in memory with new code, it is necessary to translate code |
|
|
dynamically. It is not possible to do a "one-pass" (static) translation. |
|
|
Self-modifying code and Just-in-Time compilers running inside |
|
|
the emulator are other things that would not work with a static |
|
|
translator. GXemul is a dynamic translator. However, it does not |
|
|
necessarily translate into native code, like many other emulators. |
|
|
|
|
|
<p><b>"Runnable" Intermediate Representation:</b> |
|
|
|
|
|
<p>Dynamic translators usually translate from the emulated architecture |
|
|
(e.g. MIPS) into a kind of <i>intermediate representation</i> (IR), and then |
|
|
to native code (e.g. AMD64 or x86 code). Since one of my main goals for |
|
|
GXemul is to keep everything as portable as possible, I have tried to make |
|
|
sure that the IR is something which can be executed regardless of whether |
|
|
the final step (translation from IR to native code) has been implemented |
|
|
or not. |
|
|
|
|
|
<p>The IR in GXemul consists of arrays of pointers to functions, and a few |
|
|
arguments which are passed along to those functions. The functions are |
|
|
implemented in either hand-written C or automatically generated C. |
|
|
In any case, this is all statically linked into the GXemul binary at link |
|
|
time. |
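<p>The arrangement described above can be sketched roughly as follows. This is only an illustration: the names <tt>instr_call</tt>, <tt>instr_add_imm</tt>, <tt>instr_halt</tt>, <tt>run</tt> and <tt>demo_run</tt> are hypothetical, and the arguments are register indices here for simplicity, not whatever the real implementation stores in them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; a sketch, not GXemul's actual structures. */
struct cpu;

struct instr_call {
    void (*f)(struct cpu *, struct instr_call *); /* handler for this slot */
    size_t arg[3];                                /* pre-decoded operands */
};

struct cpu {
    uint64_t reg[32];
    struct instr_call *next_ic; /* next slot to execute */
    int running;
};

/* Handler: reg[arg[1]] = reg[arg[0]] + arg[2] */
static void instr_add_imm(struct cpu *cpu, struct instr_call *ic)
{
    cpu->reg[ic->arg[1]] = cpu->reg[ic->arg[0]] + (uint64_t)ic->arg[2];
}

/* Handler that stops the loop (stands in for "end of page", exceptions...) */
static void instr_halt(struct cpu *cpu, struct instr_call *ic)
{
    (void)ic;
    cpu->running = 0;
}

/* Core loop: pick up the next slot, advance, call its function. */
static void run(struct cpu *cpu, struct instr_call *first)
{
    cpu->next_ic = first;
    cpu->running = 1;
    while (cpu->running) {
        struct instr_call *ic = cpu->next_ic++;
        ic->f(cpu, ic);
    }
}

/* Executes r1 = r0 + 5; r2 = r1 + 7; halt. Returns r2. */
static uint64_t demo_run(void)
{
    struct instr_call page[] = {
        { instr_add_imm, { 0, 1, 5 } },
        { instr_add_imm, { 1, 2, 7 } },
        { instr_halt,    { 0, 0, 0 } },
    };
    struct cpu cpu = { {0}, NULL, 0 };
    run(&cpu, page);
    return cpu.reg[2];
}
```

<p>Nothing in this loop is host-specific: every handler is an ordinary C function compiled ahead of time, and "translation" only means filling in the <tt>f</tt> and <tt>arg</tt> fields.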
|
|
|
|
|
<p>Here is a simplified diagram of how these arrays work. |
|
|
|
|
|
<p><center><img src="simplified_dyntrans.png"></center> |
|
|
|
|
|
<p>There is one instruction call slot for every possible program counter |
|
|
location. In the MIPS case, instruction words are 32 bits in length, |
|
|
and pages are (usually) 4 KB in size, resulting in 1024 instruction call |
|
|
slots. After the last of these instruction calls, there is an additional |
|
|
call to a special "end of page" function (which doesn't count as an executed |
|
|
instruction). This function switches to the first instruction |
|
|
on the next virtual page (which might cause exceptions, etc). |
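<p>The slot arithmetic for the MIPS case can be written out as a small sketch. The macro and function names are made up for this example; only the numbers (4 KB pages, 32-bit instruction words, one extra "end of page" slot) come from the text above.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE        4096u                    /* 4 KB pages */
#define INSTR_SIZE       4u                       /* 32-bit instruction words */
#define SLOTS_PER_PAGE   (PAGE_SIZE / INSTR_SIZE) /* 1024 instruction slots */
#define END_OF_PAGE_SLOT SLOTS_PER_PAGE           /* extra slot after the last */

/* Map a program counter to its instruction call slot within the page. */
static unsigned slot_for_pc(uint64_t pc)
{
    return (unsigned)((pc & (PAGE_SIZE - 1)) / INSTR_SIZE);
}

/* Falling through from the last real slot lands on the end-of-page slot. */
static unsigned next_slot(unsigned slot)
{
    return slot + 1;
}
```

<p>Each page's array therefore holds <tt>SLOTS_PER_PAGE + 1</tt> entries in this sketch, so that sequential execution off the end of a page always hits the special function.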
|
|
|
|
|
<p>The complexity of individual instructions varies. A simple example of |
|
|
what an instruction can look like is the MIPS <tt>addiu</tt> instruction: |
|
|
<pre> |
|
|
X(addiu) |
|
|
{ |
|
|
reg(ic->arg[1]) = (int32_t) |
|
|
((int32_t)reg(ic->arg[0]) + (int32_t)ic->arg[2]); |
|
|
} |
|
|
</pre> |
|
|
|
|
|
<p>It stores the result of a 32-bit addition of the register at arg[0] |
|
|
with the immediate value arg[2] (treating both as signed 32-bit |
|
|
integers) into register arg[1]. If the emulated CPU is a 64-bit CPU, |
|
|
then this will store a correctly sign-extended value into arg[1]. |
|
|
If it is a 32-bit CPU, then only the lowest 32 bits will be stored, |
|
|
and the high part ignored. <tt>X(addiu)</tt> is expanded to |
|
|
<tt>mips_instr_addiu</tt> in the 64-bit case, and <tt>mips32_instr_addiu</tt> |
|
|
in the 32-bit case. Both are compiled into the GXemul executable; no code |
|
|
is created during run-time. |
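<p>The difference between the two variants can be tried in isolation. The helper names <tt>addiu64</tt> and <tt>addiu32</tt> are hypothetical, and unlike the snippet above the addition is done on unsigned values so that wrap-around is well defined in C; the conversion back to <tt>int32_t</tt> assumes the usual two's complement behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit CPU: 32-bit add, then sign-extend the result to 64 bits. */
static uint64_t addiu64(uint64_t rs, int32_t imm)
{
    uint32_t sum = (uint32_t)rs + (uint32_t)imm; /* wraps modulo 2^32 */
    return (uint64_t)(int64_t)(int32_t)sum;      /* sign-extend */
}

/* 32-bit CPU: only the low 32 bits exist, so nothing is extended. */
static uint32_t addiu32(uint32_t rs, int32_t imm)
{
    return rs + (uint32_t)imm;
}
```

<p>For example, adding 1 to <tt>0x7fffffff</tt> yields <tt>0xffffffff80000000</tt> in the 64-bit variant and <tt>0x80000000</tt> in the 32-bit variant.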
|
|
|
|
|
<p>Here are examples of what the <tt>addiu</tt> instruction actually |
|
|
looks like when it is compiled, on various host architectures: |
|
|
|
|
|
<p><center><table border="0"> |
|
|
<tr><td><b>GCC 4.0.1 on Alpha:</b></td> |
|
|
<td width="35"></td><td></td> |
|
|
<tr> |
|
|
<td valign="top"> |
|
|
<pre>mips_instr_addiu: |
|
|
ldq t1,8(a1) |
|
|
ldq t2,24(a1) |
|
|
ldq t3,16(a1) |
|
|
ldq t0,0(t1) |
|
|
addl t0,t2,t0 |
|
|
stq t0,0(t3) |
|
|
ret</pre> |
|
|
</td> |
|
|
<td></td> |
|
|
<td valign="top"> |
|
|
<pre>mips32_instr_addiu: |
|
|
ldq t2,8(a1) |
|
|
ldq t0,24(a1) |
|
|
ldq t3,16(a1) |
|
|
ldl t1,0(t2) |
|
|
addq t0,t1,t0 |
|
|
stl t0,0(t3) |
|
|
ret</pre> |
|
|
</td> |
|
|
</tr> |
|
|
|
|
|
<tr><td><b><br>GCC 3.4.4 on AMD64:</b></td> |
|
|
<tr> |
|
|
<td valign="top"> |
|
|
<pre>mips_instr_addiu: |
|
|
mov 0x8(%rsi),%rdx |
|
|
mov 0x18(%rsi),%rax |
|
|
mov 0x10(%rsi),%rcx |
|
|
add (%rdx),%eax |
|
|
cltq |
|
|
mov %rax,(%rcx) |
|
|
retq</pre> |
|
|
</td> |
|
|
<td></td> |
|
|
<td valign="top"> |
|
|
<pre>mips32_instr_addiu: |
|
|
mov 0x8(%rsi),%rcx |
|
|
mov 0x10(%rsi),%rdx |
|
|
mov (%rcx),%eax |
|
|
add 0x18(%rsi),%eax |
|
|
mov %eax,(%rdx) |
|
|
retq</pre> |
|
|
</td> |
|
|
</tr> |
|
|
|
|
|
<tr><td><b><br>GCC 4.0.1 on i386:</b></td> |
|
|
<tr> |
|
|
<td valign="top"> |
|
|
<pre>mips_instr_addiu: |
|
|
mov 0x8(%esp),%eax |
|
|
mov 0x8(%eax),%ecx |
|
|
mov 0x4(%eax),%edx |
|
|
mov 0xc(%eax),%eax |
|
|
add (%edx),%eax |
|
|
mov %eax,(%ecx) |
|
|
cltd |
|
|
mov %edx,0x4(%ecx) |
|
|
ret</pre> |
|
|
</td> |
|
|
<td></td> |
|
|
<td valign="top"> |
|
|
<pre>mips32_instr_addiu: |
|
|
mov 0x8(%esp),%eax |
|
|
mov 0x8(%eax),%ecx |
|
|
mov 0x4(%eax),%edx |
|
|
mov 0xc(%eax),%eax |
|
|
add (%edx),%eax |
|
|
mov %eax,(%ecx) |
|
|
ret</pre> |
|
|
</td> |
|
|
</tr> |
|
|
</table></center> |
|
|
|
|
|
<p>On 64-bit hosts, there is not much difference, but on 32-bit hosts (and |
|
|
to some extent on AMD64), the difference is enough to make a separate 32-bit variant worthwhile. |
|
|
|
|
|
|
|
|
<p><b>Performance:</b> |
|
|
|
|
|
<p>The performance of using this kind of runnable IR is obviously lower |
|
|
than what can be achieved by emulators using native code generation, but |
|
|
can be significantly higher than using a naive fetch-decode-execute |
|
|
interpretation loop. In my opinion, using a runnable IR is an interesting |
|
|
compromise. |
|
|
|
|
|
<p>The overhead per emulated instruction is usually about 10 host |


instructions or fewer. This is very much dependent on your |
|
|
host architecture and what compiler and compiler switches you are using. |
|
|
Added to this instruction count is (of course) also the C code used to |
|
|
implement each specific instruction. |
|
|
|
|
|
<p><b>Instruction Combinations:</b> |
|
|
|
|
|
<p>Short, common instruction sequences can sometimes be replaced by a |
|
|
"compound" instruction. An example could be a compare instruction followed |
|
|
by a conditional branch instruction. The advantages of instruction |
|
|
combinations are that |
|
|
<ul> |
|
|
<li>the amortized overhead per instruction is slightly reduced, and |
|
|
<p> |
|
|
<li>the host's compiler can do a good job of optimizing the common |
|
|
instruction sequence. |
|
|
</ul> |
|
234 |
|
|
235 |
<p>The special cases where instruction combinations give the most gain |
GXemul should compile and run on any modern host architecture (64-bit or |
236 |
are in the cores of string/memory manipulation functions such as |
32-bit word-length). |
|
<tt>memset()</tt> or <tt>strlen()</tt>. The core loop can then (at least |
|
|
to some extent) be replaced by a native call to the equivalent function. |
|
|
|
|
|
<p>The implementations of compound instructions still keep track of the |
|
|
number of executed instructions, etc. When single-stepping, these |
|
|
translations are invalidated, and replaced by normal instruction calls |
|
|
(one per emulated instruction). |
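<p>As a rough illustration of the idea, here is a compare followed by a conditional branch, first as two separate handlers and then as one compound handler. Everything here is an assumption made for the example: the structure, the handler names, and a simplified instruction set without branch delay slots. The point is only that the compound version leaves the same register, slot and instruction count behind.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified CPU state; not GXemul's real structure. */
struct cpu {
    int32_t reg[8];
    unsigned slot;       /* index of the next instruction call slot */
    unsigned long count; /* number of emulated instructions executed */
};

/* Plain handlers: one emulated instruction each. */
static void instr_slt(struct cpu *c, int rd, int rs, int rt)
{
    c->reg[rd] = c->reg[rs] < c->reg[rt];
    c->slot++;
    c->count++;
}

static void instr_bnez(struct cpu *c, int rs, unsigned target)
{
    c->slot = (c->reg[rs] != 0) ? target : c->slot + 1;
    c->count++;
}

/* Compound handler: both instructions in one call, counted as two. */
static void instr_slt_bnez(struct cpu *c, int rd, int rs, int rt,
    unsigned target)
{
    int less = c->reg[rs] < c->reg[rt];
    c->reg[rd] = less;
    c->slot = less ? target : c->slot + 2;
    c->count += 2;
}

/* Returns 1 if both routes leave the CPU in the same state. */
static int same_result(int32_t a, int32_t b)
{
    struct cpu x = { {0, a, b}, 10, 0 }, y = x;
    instr_slt(&x, 3, 1, 2);
    instr_bnez(&x, 3, 40);
    instr_slt_bnez(&y, 3, 1, 2, 40);
    return x.reg[3] == y.reg[3] && x.slot == y.slot &&
        x.count == y.count && x.count == 2;
}
```

<p>The compound handler saves one indirect call, and the host compiler sees the comparison and the branch decision together.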
|
|
|
|
|
<p><b>Native Code Back-ends: (not in this release)</b> |
|
|
|
|
|
<p>In theory, it will be possible to implement native code generation |
|
|
(similar to what is used in high-performance emulators such as QEMU), |
|
|
as long as that generated code abides by the C ABI on the host, but |
|
|
for now I wanted to make sure that GXemul works without such native |
|
|
code back-ends. For this reason, as of release 0.4.0, GXemul is |
|
|
completely free of native code back-ends. |
|
237 |
|
|
238 |
|
<p>Note: The <a href="translation.html">dynamic translation</a> engine |
239 |
|
does <i>not</i> require backends for native code generation to be written |
240 |
|
for each individual host architecture; the intermediate representation |
241 |
|
that the dyntrans system uses can be executed on any host architecture. |
242 |
|
|
243 |
|
|
244 |
|
|
258 |
operating systems think that they are there, but for all practical |
operating systems think that they are there, but for all practical |
259 |
purposes, these caches are non-working. |
purposes, these caches are non-working. |
260 |
|
|
261 |
<p>The emulator is <i>not</i> timing-accurate. It can be run in a |
<p>The emulator is in general <i>not</i> timing-accurate, neither at the |
262 |
"deterministic" mode, <tt><b>-D</b></tt>. The meaning of deterministic is |
instruction level nor on any higher level. An attempt is made to let |
263 |
simply that running two emulations with the same settings will result in |
emulated clocks run at the same speed as the host (i.e. an emulated timer |
264 |
identical runs. Obviously, this requires that no user interaction is |
running at 100 Hz will interrupt around 100 times per real second), but |
265 |
taking place, and that clock speeds are fixed with the <tt><b>-I</b></tt> |
since the host speed may vary, e.g. because of other running processes, |
266 |
option. (Deterministic in this case does <i>not</i> mean that the |
there is no guarantee as to how many instructions will be executed in |
267 |
emulation will be identical to some actual real-world machine.) |
each of these 100 Hz cycles. |
268 |
|
|
269 |
<p>(Note that user interaction means <i>both</i> input to the emulated |
<p>If the host is very slow, the emulated clocks might even lag behind |
270 |
program/OS, and interaction with the emulator's debugger. Breaking into the |
the real-world clock. |
|
debugger and then continuing execution may affect when/how interrupts |
|
|
occur.) |
|
271 |
|
|
272 |
|
|
273 |
|
|
299 |
<a href="guestoses.html#declinux">Linux/DECstation</a>, |
<a href="guestoses.html#declinux">Linux/DECstation</a>, |
300 |
<a href="guestoses.html#sprite">Sprite</a>) |
<a href="guestoses.html#sprite">Sprite</a>) |
301 |
<li><b>Acer Pica-61</b> (<a href="guestoses.html#netbsdarcinstall">NetBSD/arc</a>) |
<li><b>Acer Pica-61</b> (<a href="guestoses.html#netbsdarcinstall">NetBSD/arc</a>) |
302 |
<li><b>NEC MobilePro 770, 780, 800, and 880</b> (<a href="guestoses.html#netbsdhpcmipsinstall">NetBSD/hpcmips</a>) |
<li><b>NEC MobilePro 770, 780, 800, 880</b> (<a href="guestoses.html#netbsdhpcmipsinstall">NetBSD/hpcmips</a>) |
303 |
<li><b>Cobalt</b> (<a href="guestoses.html#netbsdcobaltinstall">NetBSD/cobalt</a>) |
<li><b>Cobalt</b> (<a href="guestoses.html#netbsdcobaltinstall">NetBSD/cobalt</a>) |
304 |
<li><b>Malta</b> (<a href="guestoses.html#netbsdevbmipsinstall">NetBSD/evbmips</a>) |
<li><b>Malta</b> (<a href="guestoses.html#netbsdevbmipsinstall">NetBSD/evbmips</a>, Linux/Malta <font color="#0000e0">(<super>*1</super>)</font>) |
305 |
<li><b>Algorithmics P5064</b> (<a href="guestoses.html#netbsdalgorinstall">NetBSD/algor</a>) |
<li><b>Algorithmics P5064</b> (<a href="guestoses.html#netbsdalgorinstall">NetBSD/algor</a>) |
306 |
<li><b>SGI O2 (aka IP32)</b> <font color="#0000e0">(<super>*</super>)</font> |
<li><b>SGI O2 (aka IP32)</b> <font color="#0000e0">(<super>*2</super>)</font> |
307 |
(<a href="guestoses.html#netbsdsgimips">NetBSD/sgi</a>) |
(<a href="guestoses.html#netbsdsgimips">NetBSD/sgi</a>) |
308 |
</ul> |
</ul> |
309 |
<p> |
<p> |
310 |
<li><b><u>PowerPC</u></b> |
<li><b><u>PowerPC</u></b> |
311 |
<ul> |
<ul> |
312 |
<li><b>IBM 6050/6070 (PReP, PowerPC Reference Platform)</b> (<a href="guestoses.html#netbsdprepinstall">NetBSD/prep</a>) |
<li><b>IBM 6050/6070 (PReP, PowerPC Reference Platform)</b> (<a href="guestoses.html#netbsdprepinstall">NetBSD/prep</a>) |
313 |
|
<li><b>MacPPC (generic "G4" Macintosh)</b> (<a href="guestoses.html#netbsdmacppcinstall">NetBSD/macppc</a>) |
314 |
|
</ul> |
315 |
|
<p> |
316 |
|
<li><b><u>SuperH</u></b> |
317 |
|
<ul> |
318 |
|
<li><b>Sega Dreamcast</b> (<a href="dreamcast.html#netbsd_generic_md">NetBSD/dreamcast</a>, <a href="dreamcast.html#linux_live_cd">Linux/dreamcast</a>) |
319 |
</ul> |
</ul> |
320 |
</ul> |
</ul> |
321 |
|
|
322 |
<p><small><font color="#0000e0">(<super>*</super>)</font> = |
<p> |
323 |
Enough for root-on-nfs, but not for disk boot.)</small> |
<small><font color="#0000e0">(<super>*1</super>)</font> = |
324 |
|
Linux/Malta may be run as a guest OS, however I have not yet found any stable |
325 |
|
URL to pre-compiled Linux/Malta kernels. Thus, Linux/Malta emulation is not |
326 |
|
tested for every release of the emulator; sometimes it works, sometimes |
327 |
|
it doesn't.</small> |
328 |
|
|
329 |
|
<br><small><font color="#0000e0">(<super>*2</super>)</font> = |
330 |
|
SGI O2 emulation is enough for root-on-nfs, but not for disk boot.</small> |
331 |
|
|
332 |
|
|
333 |
<p>There is code in GXemul for emulation of many other machine types; the |
<p>There is code in GXemul for emulation of many other machine types; the |
334 |
degree to which these work ranges from almost being able to run a complete |
degree to which these work ranges from almost being able to run a complete |
344 |
<li>a console I/O device (putchar() and getchar()...) |
<li>a console I/O device (putchar() and getchar()...) |
345 |
<li>an inter-processor communication device, for SMP experiments |
<li>an inter-processor communication device, for SMP experiments |
346 |
<li>a very simple linear framebuffer device (for graphics output) |
<li>a very simple linear framebuffer device (for graphics output) |
347 |
<li>a simple SCSI disk controller |
<li>a simple disk controller |
348 |
<li>a simple ethernet controller |
<li>a simple ethernet controller |
349 |
|
<li>a real-time clock device |
350 |
</ul> |
</ul> |
351 |
|
|
352 |
<p>This mode is useful if you wish to run experimental code, but do not |
<p>This mode is useful if you wish to run experimental code, but do not |