Diff of /trunk/doc/technical.html

revision 6 by dpavlin, Mon Oct 8 16:18:11 2007 UTC revision 22 by dpavlin, Mon Oct 8 16:19:37 2007 UTC
# Line 1  Line 1 
1  <html>  <html><head><title>Gavare's eXperimental Emulator:&nbsp;&nbsp;&nbsp;Technical details</title>
2  <head><title>GXemul documentation: Technical details</title>  <meta name="robots" content="noarchive,nofollow,noindex"></head>
 </head>  
3  <body bgcolor="#f8f8f8" text="#000000" link="#4040f0" vlink="#404040" alink="#ff0000">  <body bgcolor="#f8f8f8" text="#000000" link="#4040f0" vlink="#404040" alink="#ff0000">
4  <table border=0 width=100% bgcolor="#d0d0d0"><tr>  <table border=0 width=100% bgcolor="#d0d0d0"><tr>
5  <td width=100% align=center valign=center><table border=0 width=100%><tr>  <td width=100% align=center valign=center><table border=0 width=100%><tr>
6  <td align="left" valign=center bgcolor="#d0efff"><font color="#6060e0" size="6">  <td align="left" valign=center bgcolor="#d0efff"><font color="#6060e0" size="6">
7  <b>GXemul documentation:</b></font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;  <b>Gavare's eXperimental Emulator:</b></font><br>
8  <font color="#000000" size="6"><b>Technical details</b>  <font color="#000000" size="6"><b>Technical details</b>
9  </font></td></tr></table></td></tr></table><p>  </font></td></tr></table></td></tr></table><p>
 <!-- The first 10 lines are cut away by the homepage updating script.  -->  
   
10    
11  <!--  <!--
12    
13  $Id: technical.html,v 1.50 2005/05/14 18:31:16 debug Exp $  $Id: technical.html,v 1.72 2006/02/18 15:18:15 debug Exp $
14    
15  Copyright (C) 2004-2005  Anders Gavare.  All rights reserved.  Copyright (C) 2004-2006  Anders Gavare.  All rights reserved.
16    
17  Redistribution and use in source and binary forms, with or without  Redistribution and use in source and binary forms, with or without
18  modification, are permitted provided that the following conditions are met:  modification, are permitted provided that the following conditions are met:
# Line 43  SUCH DAMAGE. Line 40  SUCH DAMAGE.
40  -->  -->
41    
42    
43    
44  <a href="./">Back to the index</a>  <a href="./">Back to the index</a>
45    
46  <p><br>  <p><br>
47  <h2>Technical details</h2>  <h2>Technical details</h2>
48    
49  <p>  <p>This page describes some of the internals of GXemul.
 This page describes some of the internals of GXemul.  
   
 <p>  
 <font color="#e00000"><b>NOTE: This page is probably not  
 very up-to-date by now.</b></font>  
50    
51  <p>  <p>
52  <ul>  <ul>
53    <li><a href="#overview">Overview</a>    <li><a href="#speed">Speed and emulation modes</a>
   <li><a href="#speed">Speed</a>  
54    <li><a href="#net">Networking</a>    <li><a href="#net">Networking</a>
55    <li><a href="#devices">Emulation of hardware devices</a>    <li><a href="#devices">Emulation of hardware devices</a>
   <li><a href="#regtest">Regression tests</a>  
56  </ul>  </ul>
57    
58    
59    
60    
 <p><br>  
 <a name="overview"></a>  
 <h3>Overview</h3>  
   
 In simple terms, GXemul is just a simple fetch-and-execute  
 loop; an instruction is fetched from memory, and executed.  
   
 <p>  
 In reality, a lot of things need to be handled. Before each instruction is  
 executed, the emulator checks to see if any interrupts are asserted which  
 are not masked away. If so, then an INT exception is generated. Exceptions  
 cause the program counter to be set to a specific value, and some of the  
 system coprocessor's registers to be set to values signifying what kind of  
 exception it was (an interrupt exception in this case).  
   
 <p>  
 Reading instructions from memory is done through a TLB, a translation  
 lookaside buffer. The TLB on MIPS is software controlled, which means that  
 the program running inside the emulator (for example an operating system  
 kernel) has to take care of manually updating the TLB. Some memory  
 addresses are translated into physical addresses directly, some are  
 translated into valid physical addresses via the TLB, and some memory  
 references are not valid. Invalid memory references cause exceptions.  
   
 <p>  
 After an instruction has been read from memory, the emulator checks which  
 opcode it contains and executes the instruction. Executing an instruction  
 usually involves reading some register and writing some register, or perhaps a  
 load from memory (or a store to memory). The program counter is increased  
 for every instruction.  
   
 <p>  
 Some memory references point to physical addresses which are not in the  
 normal RAM address space. They may point to hardware devices. If that is  
 the case, then loads and stores are converted into calls to a device  
 access function. The device access function is then responsible for  
 handling these reads and writes.  For example, a graphical framebuffer  
 device may put a pixel on the screen when a value is written to it, or a  
 serial controller device may output a character to stdout when written to.  
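
The loop described in the overview above can be sketched in C roughly as follows. This is a toy illustration under stated assumptions, not GXemul's actual code: the three-opcode instruction set, the names toy_cpu, toy_step and toy_run, the exception vector and the device address are all made up for the example, and the TLB is left out entirely. It only shows the order of the steps: check for an unmasked interrupt, fetch, decode, execute, advance the program counter, and route stores that fall outside RAM to a device function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy machine; none of these names come from GXemul. */
#define TOY_RAM_WORDS   64
#define TOY_DEV_ADDR    0x1000u   /* made-up memory-mapped device address */
#define TOY_EXC_VECTOR  32u       /* word index the PC is set to on an interrupt */

enum { OP_HALT = 0, OP_ADDI = 1, OP_STORE = 2 };

struct toy_cpu {
	uint32_t pc;              /* program counter, counted in words */
	uint32_t reg;             /* a single general-purpose register */
	uint32_t cause;           /* set to 1 when an interrupt exception is taken */
	int irq_asserted;
	int irq_masked;
	uint32_t ram[TOY_RAM_WORDS];
	uint32_t dev_last_write;  /* last value the "device" received */
};

/* Instruction encoding: opcode in the top 8 bits, operand in the low 24. */
static uint32_t toy_insn(uint32_t op, uint32_t operand)
{
	return (op << 24) | (operand & 0xffffffu);
}

/* Device access function: called instead of touching RAM. */
static void toy_device_write(struct toy_cpu *cpu, uint32_t value)
{
	cpu->dev_last_write = value;
}

/* Execute one instruction. Returns 0 when the CPU halts, 1 otherwise. */
static int toy_step(struct toy_cpu *cpu)
{
	uint32_t insn, op, operand;

	/* 1. Before each instruction: is an unmasked interrupt asserted? */
	if (cpu->irq_asserted && !cpu->irq_masked) {
		cpu->cause = 1;
		cpu->irq_asserted = 0;
		cpu->pc = TOY_EXC_VECTOR;
	}

	/* 2. Fetch and decode. */
	assert(cpu->pc < TOY_RAM_WORDS);
	insn = cpu->ram[cpu->pc];
	op = insn >> 24;
	operand = insn & 0xffffffu;

	/* 3. Execute. */
	switch (op) {
	case OP_ADDI:
		cpu->reg += operand;
		break;
	case OP_STORE:
		if (operand == TOY_DEV_ADDR)
			toy_device_write(cpu, cpu->reg);
		else if (operand < TOY_RAM_WORDS)
			cpu->ram[operand] = cpu->reg;
		/* Any other address is invalid; a real emulator would raise
		   an exception here, this toy silently ignores the store. */
		break;
	default:                  /* OP_HALT and unknown opcodes */
		return 0;
	}

	/* 4. Advance the program counter. */
	cpu->pc++;
	return 1;
}

/* Run until the CPU halts; returns the number of instructions executed. */
static unsigned toy_run(struct toy_cpu *cpu)
{
	unsigned n = 0;

	while (toy_step(cpu))
		n++;
	return n;
}
```

Since zero-filled RAM decodes as OP_HALT in this toy encoding, a zero-initialized struct toy_cpu with a few instructions written to the start of ram always terminates.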
   
   
61    
62    
63  <p><br>  <p><br>
64  <a name="speed"></a>  <a name="speed"></a>
65  <h3>Speed</h3>  <h3>Speed and emulation modes</h3>
   
 There are two modes in which the emulator can run, <b>a</b>) a straightforward  
 loop which fetches one instruction from emulated RAM and executes it  
 (described in the previous section), and <b>b</b>)  
 using dynamic binary translation.  
66    
67  <p>  So, how fast is GXemul? There is no short answer to this. There is
68  Mode <b>a</b> is very slow. On a 2.8 GHz Intel Xeon host the resulting  especially no answer to the question <b>What is the slowdown factor?</b>,
69  emulated machine is roughly equal to a 7 MHz R3000 (or a 3.5 MHz R4000).  because the host architecture and emulated architecture can usually not be
70  The actual performance varies a lot, maybe between 5 and 10 million  compared just like that.
71  instructions per second, depending on workload.  
72    <p>Performance depends on several factors, including (but not limited to)  
73  <p>  host architecture, host clock speed, which compiler and compiler flags
74  Mode <b>b</b> ("bintrans") is still to be considered experimental, but  were used to build the emulator, what the workload is, and so on. For
75  gives higher performance than mode <b>a</b>. It translates MIPS machine  example, if an emulated operating system tries to read a block from disk,
76  code into machine code that can be executed on the host machine  from its point of view the read was instantaneous (no waiting). So 1 MIPS
77  on-the-fly. The translation itself obviously takes some time, but this is  in an emulated OS might have taken more than one million instructions on a
78  usually made up for by the fact that the translated code chunks are  real machine.
79  executed multiple times.  
80  To run the emulator with binary translation enabled, just add  <p>Also, if the emulator says it has executed 1 million instructions, and
81  <tt><b>-b</b></tt> to the command line.  the CPU family in question was capable of scalar execution (i.e. one cycle
82    per instruction), it might still have taken more than 1 million cycles on
83    a real machine because of cache misses and similar micro-architectural
84    penalties that are not simulated by GXemul.
85    
86    <p>Because of these issues, it is in my opinion best to measure
87    performance as the actual (real-world) time it takes to perform a task
88    with the emulator. Typical examples would be "How long does it take to
89    install NetBSD?", or "How long does it take to compile XYZ inside NetBSD
90    in the emulator?".
91    
92    <p>So, how fast is it? :-)&nbsp;&nbsp;&nbsp;Answer: it varies.
93    
94    <p>The emulation technique used varies depending on which processor type
95    is being emulated. (One of my main goals with GXemul is to experiment with
96    different kinds of emulation, so these might change in the future.)
97    
98  <p>  <ul>
99  Only small pieces of MIPS machine code are translated, usually the size of    <li><b>MIPS:</b><br>
100  a function, or less. There is no "intermediate representation" code, so          There are two emulation modes. The most important one is an
101  all translations are done directly from MIPS to host machine code.          implementation of a <i>dynamic binary translator</i>.
102            (Compared to real binary translators, though, GXemul's bintrans
103  <p>          subsystem is very simple and does not perform very well.)
104  The default bintrans cache size is 16 MB, but you can change this by adding          This mode can be used on Alpha and i386 hosts. The other emulation
105  <tt>-DDEFAULT_BINTRANS_SIZE_IN_MB=<i>xx</i></tt> to your CFLAGS environment          mode is simple interpretation, where an instruction is read from
106  variable before running the configure script, or by using the          emulated memory, and interpreted one-at-a-time. (Slow, but it
107  <tt>bintrans_size()</tt> configuration file option when running the emulator.          works. It can be forcefully used by using the <tt>-B</tt> command
108            line option.)
109  <p>    <p>
110  By default, an emulated OS running under DECstation emulation which listens to    <li><b>All other modes:</b><br>
111  interrupts from the mc146818 clock will get interrupts that are close to the          These use a kind of dynamic translation system. This system does
112  host's clock. That is, if the emulated OS says it wants 100 interrupts per          not recompile anything into native code; it only uses tables of
113  second, it will get approximately 100 interrupts per real second.          pointers to functions written in (sometimes machine-generated) C
114            code. Speed is lower than what can be achieved using real binary
115  <p>          translation into native code, but higher than when traditional
116  There is however a <tt><b>-I</b></tt> option, which sets the number of          interpretation is used. With some tricks, it will hopefully still
117  emulated cycles per second to a fixed value. Let's say you wish to make the          give reasonable speed. The ARM and PowerPC
118  emulated OS think it is running on a 40 MHz DECstation, and not a 7 MHz one,          emulation modes use this kind of translation.
119  then you can add <tt><b>-I 40000000</b></tt> to the command line. This will not  </ul>
 make the emulation faster, of course. It might even make it seem slower; for  
 example, if NetBSD/pmax waits 2 seconds for SCSI devices to settle during  
 bootup, those 2 seconds will take 2*40000000 cycles (which will take more  
 time than 2*7000000).  
120    
 <p>  
 The <b><tt>-I</tt></b> option is also necessary if you want to run  
 deterministic experiments, if a mc146818 (or similar) device is present.  
121    
 <p>  
 Some emulators make claims such as "x times slowdown," but in the case of  
 GXemul, the host is often not a MIPS-based machine, and hence comparing  
 one MIPS instruction to a host instruction doesn't work. Performance depends on  
 a lot of factors, including (but not limited to) host architecture, host speed,  
 which compiler and compiler flags were used to build GXemul, what the  
 workload is, and so on. For example, if an emulated operating system tries  
 to read a block from disk, from its point of view the read was instantaneous  
 (no waiting). So 1 MIPS in an emulated OS might have taken more than one  
 million instructions on a real machine.  Because of this, imho it is best  
 to measure performance as the actual (real-world) time it takes to perform  
 a task with the emulator.  
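
The "tables of pointers to functions" technique mentioned for the non-MIPS modes above can be sketched as follows. This is a minimal sketch and not GXemul's implementation: the names mini_ic, mini_cpu, mini_translate and mini_run and the three toy opcodes are assumptions made for the example. The point it tries to show is that each instruction is decoded once into a slot holding a function pointer and its operand, so that running the code afterwards is only a sequence of indirect calls, with no native code generated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct mini_cpu;

/* One pre-decoded instruction slot: a handler plus its operand. */
struct mini_ic {
	void (*f)(struct mini_cpu *cpu, const struct mini_ic *ic);
	int32_t arg;
};

struct mini_cpu {
	int32_t acc;
	const struct mini_ic *next_ic;  /* plays the role of the program counter */
	int running;
};

/* Handlers written in plain C, one per toy opcode. */
static void instr_add(struct mini_cpu *cpu, const struct mini_ic *ic)
{
	cpu->acc += ic->arg;
}

static void instr_mul(struct mini_cpu *cpu, const struct mini_ic *ic)
{
	cpu->acc *= ic->arg;
}

static void instr_end(struct mini_cpu *cpu, const struct mini_ic *ic)
{
	(void)ic;
	cpu->running = 0;
}

enum { RAW_END = 0, RAW_ADD = 1, RAW_MUL = 2 };

/* "Translation": look up each raw opcode once and store the handler. */
static void mini_translate(const uint8_t *opcodes, const int32_t *args,
    size_t n, struct mini_ic *out)
{
	static void (*const table[])(struct mini_cpu *,
	    const struct mini_ic *) = { instr_end, instr_add, instr_mul };
	size_t i;

	for (i = 0; i < n; i++) {
		assert(opcodes[i] < sizeof(table) / sizeof(table[0]));
		out[i].f = table[opcodes[i]];
		out[i].arg = args[i];
	}
}

/* Execution: no decoding left, only calls through the stored pointers.
   The translated code must end with a RAW_END slot. */
static int32_t mini_run(struct mini_cpu *cpu, const struct mini_ic *code)
{
	cpu->next_ic = code;
	cpu->running = 1;
	while (cpu->running) {
		const struct mini_ic *ic = cpu->next_ic++;
		ic->f(cpu, ic);
	}
	return cpu->acc;
}
```

The benefit over plain interpretation comes from reuse: a loop body that executes many times pays the decoding cost in mini_translate only once.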
122    
123    
124    
# Line 186  a task with the emulator. Line 127  a task with the emulator.
127  <a name="net"></a>  <a name="net"></a>
128  <h3>Networking</h3>  <h3>Networking</h3>
129    
130  Running an entire operating system under emulation is very interesting in  <font color="#ff0000">NOTE/TODO: This section is very old and a bit
131  itself, but for several reasons, running a modern OS without access to  out of date.</font>
132  TCP/IP networking is a bit awkward. Hence, I feel the need to implement TCP/IP  
133  (networking) support in the emulator.  <p>Running an entire operating system under emulation is very interesting
134    in itself, but for several reasons, running a modern OS without access to
135    TCP/IP networking is a bit awkward. Hence, I feel the need to implement
136    TCP/IP (networking) support in the emulator.
137    
138  <p>  <p>
139  As far as I have understood it, there seem to be two different ways to go:  As far as I have understood it, there seem to be two different ways to go:
# Line 380  fragmentation issue mentioned above. Line 324  fragmentation issue mentioned above.
324    
325    
326    
327    
328    
329  <p><br>  <p><br>
330  <a name="devices"></a>  <a name="devices"></a>
331  <h3>Emulation of hardware devices</h3>  <h3>Emulation of hardware devices</h3>
332    
333  Each file in the device/ directory is responsible for one hardware device.  Each file called <tt>dev_*.c</tt> in the <tt>src/devices/</tt> directory is
334  These are used from src/machine.c, when initializing which hardware a  responsible for one hardware device. These are used from
335  particular machine model will be using, or when adding devices to a  <tt>src/machines/machine_*.c</tt>, when initializing which hardware a particular
336  machine using the <b>device()</b> command in configuration files.  machine model will be using, or when adding devices to a machine using the
337    <tt>device()</tt> command in configuration files.
 <p>  
 <font color="#ff0000">NOTE: 2005-02-26: I'm currently rewriting the  
 device registry subsystem.</font>  
338    
339  <p>  <p>(I'll be using the name "<tt>foo</tt>" as the name of the device in all
340  (I'll be using the name 'foo' as the name of the device in all these  these examples.  This is pseudo code; it might need some modification to
 examples.  This is pseudo code; it might need some modification to  
341  actually compile and run.)  actually compile and run.)
342    
343  <p>  <p>Each device should have the following:
 Each device should have the following:  
344    
345  <p>  <p>
346  <ul>  <ul>
347    <li>A devinit function in dev_foo.c. It would typically look    <li>A <tt>devinit</tt> function in <tt>src/devices/dev_foo.c</tt>. It
348          something like this:          would typically look something like this:
349  <pre>  <pre>
350          /*          DEVINIT(foo)
          *  devinit_foo():  
          */  
         int devinit_foo(struct devinit *devinit)  
351          {          {
352                  struct foo_data *d = malloc(sizeof(struct foo_data));                  struct foo_data *d = malloc(sizeof(struct foo_data));
353    
# Line 417  Each device should have the following: Line 355  Each device should have the following:
355                          fprintf(stderr, "out of memory\n");                          fprintf(stderr, "out of memory\n");
356                          exit(1);                          exit(1);
357                  }                  }
358                  memset(d, 0, sizeof(struct foon_data));                  memset(d, 0, sizeof(struct foo_data));
359    
360                  /*                  /*
361                   *  Set up stuff here, for example fill d with useful                   *  Set up stuff here, for example fill d with useful
# Line 429  Each device should have the following: Line 367  Each device should have the following:
367                    
368                  memory_device_register(devinit->machine->memory, devinit->name,                  memory_device_register(devinit->machine->memory, devinit->name,
369                      devinit->addr, DEV_FOO_LENGTH,                      devinit->addr, DEV_FOO_LENGTH,
370                      dev_foo_access, (void *)d, MEM_DEFAULT, NULL);                      dev_foo_access, (void *)d, DM_DEFAULT, NULL);
371                    
372                  /*  This should only be here if the device                  /*  This should only be here if the device
373                      has a tick function:  */                      has a tick function:  */
# Line 441  Each device should have the following: Line 379  Each device should have the following:
379          }                }      
380  </pre><br>  </pre><br>
381    
382    <li>At the top of dev_foo.c, the foo_data struct should be defined.          <p><tt>DEVINIT(foo)</tt> is defined as <tt>int devinit_foo(struct devinit *devinit)</tt>,
383            and the <tt>devinit</tt> argument contains everything that the device driver's
384            initialization function needs.
385    
386      <p>
387      <li>At the top of <tt>dev_foo.c</tt>, the <tt>foo_data</tt> struct
388            should be defined.
389  <pre>  <pre>
390          struct foo_data {          struct foo_data {
391                  int     irq_nr;                  int     irq_nr;
392                  /*  ...  */                  /*  ...  */
393          };          };
394  </pre><br>  </pre><br>
395            (There is an exception to this rule; ugly hacks which allow
396    <li>If foo has a tick function (that is, something that needs to be          code in <tt>src/machine.c</tt> to use some structures make it
397          run at regular intervals) then FOO_TICKSHIFT and a tick function          necessary to place the <tt>struct foo_data</tt> in
398          need to be defined as well:          <tt>src/include/devices.h</tt> instead of in <tt>dev_foo.c</tt>
399            itself. This is useful for example for interrupt controllers.)
400      <p>
401      <li>If <tt>foo</tt> has a tick function (that is, something that needs to be
402            run at regular intervals) then <tt>FOO_TICKSHIFT</tt> and a tick
403            function need to be defined as well:
404  <pre>  <pre>
405          #define FOO_TICKSHIFT           10          #define FOO_TICKSHIFT           14
406    
407          void dev_foo_tick(struct cpu *cpu, void *extra)          void dev_foo_tick(struct cpu *cpu, void *extra)
408          {          {
# Line 466  Each device should have the following: Line 415  Each device should have the following:
415          }          }
416  </pre><br>  </pre><br>
417    
418      <li>Does this device belong to a standard bus?
419            <ul>
420              <li>If this device should be detectable as a PCI device, then
421                    glue code should be added to
422                    <tt>src/devices/bus_pci.c</tt>.
423              <li>If this is a legacy ISA device which should be usable by
424                    any machine which has an ISA bus, then the device should
425                    be added to <tt>src/devices/bus_isa.c</tt>.
426            </ul>
427      <p>
428    <li>And last but not least, the device should have an access function.    <li>And last but not least, the device should have an access function.
429          The access function is called whenever there is a load or store          The access function is called whenever there is a load or store
430          to an address which is in the device's memory-mapped region.          to an address which is in the device's memory-mapped region. To
431  <pre>          simplify things a little, a macro <tt>DEVICE_ACCESS(x)</tt>
432          int dev_foo_access(struct cpu *cpu, struct memory *mem,          is expanded into<pre>
433            int dev_x_access(struct cpu *cpu, struct memory *mem,
434              uint64_t relative_addr, unsigned char *data, size_t len,              uint64_t relative_addr, unsigned char *data, size_t len,
435              int writeflag, void *extra)              int writeflag, void *extra)
436    </pre>  The access function can look like this:
437    <pre>
438            DEVICE_ACCESS(foo)
439          {          {
440                  struct foo_data *d = extra;                  struct foo_data *d = extra;
441                  uint64_t idata = 0, odata = 0;                  uint64_t idata = 0, odata = 0;
# Line 516  by the caller (in <tt>src/memory_rw.c</t Line 479  by the caller (in <tt>src/memory_rw.c</t
479    
480    
481    
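
Since the access function above is pseudo code and is cut short by the diff, here is a self-contained sketch of what a complete one might look like. Only the function signature and the first two local declarations are taken from the text. Everything else is an assumption made so that the example compiles on its own: the stub struct cpu and struct memory, the SKETCH_MEM_READ and SKETCH_MEM_WRITE values, the little-endian helpers sketch_get and sketch_put, the scratch register at offset 0, and the convention that returning 1 means success.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins so the sketch compiles alone; the real definitions live in
   the emulator's headers and are more elaborate. */
struct cpu { int dummy; };
struct memory { int dummy; };
#define SKETCH_MEM_READ   0   /* assumed values, for this sketch only */
#define SKETCH_MEM_WRITE  1

/* Same shape as the expansion quoted in the text. */
#define DEVICE_ACCESS(x) \
	int dev_ ## x ## _access(struct cpu *cpu, struct memory *mem, \
	    uint64_t relative_addr, unsigned char *data, size_t len, \
	    int writeflag, void *extra)

struct foo_data {
	int irq_nr;
	uint64_t scratch;     /* hypothetical register at offset 0 */
};

/* Hypothetical helper: assemble up to 8 bytes, little-endian. */
static uint64_t sketch_get(const unsigned char *data, size_t len)
{
	uint64_t v = 0;
	size_t i;

	for (i = 0; i < len && i < 8; i++)
		v |= (uint64_t)data[i] << (8 * i);
	return v;
}

/* Hypothetical helper: store up to 8 bytes, little-endian. */
static void sketch_put(unsigned char *data, size_t len, uint64_t v)
{
	size_t i;

	for (i = 0; i < len && i < 8; i++)
		data[i] = (unsigned char)(v >> (8 * i));
}

DEVICE_ACCESS(foo)
{
	struct foo_data *d = extra;
	uint64_t idata = 0, odata = 0;

	(void)cpu;
	(void)mem;

	/* On a write, first convert the raw bytes into a number. */
	if (writeflag == SKETCH_MEM_WRITE)
		idata = sketch_get(data, len);

	switch (relative_addr) {
	case 0:
		if (writeflag == SKETCH_MEM_WRITE)
			d->scratch = idata;
		else
			odata = d->scratch;
		break;
	default:
		/* Unimplemented register: reads return 0, writes are ignored. */
		break;
	}

	/* On a read, convert the result back into raw bytes. */
	if (writeflag == SKETCH_MEM_READ)
		sketch_put(data, len, odata);

	return 1;             /* assumed to mean "access handled" */
}
```

A caller would pass the per-device struct as extra, for example dev_foo_access(cpu, mem, 0, buf, 4, SKETCH_MEM_WRITE, d) to store four bytes into the scratch register.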
 <p><br>  
 <a name="regtest"></a>  
 <h3>Regression tests</h3>  
   
 In order to make sure that the emulator actually works like it is supposed  
 to, it must be tested. For this purpose, there is a simple regression  
 testing framework in the <tt>tests/</tt> directory.  
   
 <p>  
 <i>NOTE:  The regression testing framework is basically just a skeleton so far.  
 Regression tests are very good to have. However, the fact that complete  
 operating systems can run in the emulator indicates that the emulation is  
 probably not too incorrect. This makes it less of a priority to write  
 regression tests.</i>  
   
 <p>  
 To run all the regression tests, type <tt>make regtest</tt>. Each assembly  
 language file matching the pattern <tt>test_*.S</tt> will be compiled and  
 linked into a 64-bit MIPS ELF (using a gcc cross compiler), and run in the  
 emulator. If everything goes well, you should see something like this:  
   
 <pre>  
         $ make regtest  
         cd tests; make run_tests; cd ..  
         gcc33 -Wall -fomit-frame-pointer -fmove-all-movables -fpeephole -O2  
                 -mcpu=ev5 -I/usr/X11R6/include -lm -L/usr/X11R6/lib -lX11  do_tests.c  
                 -o do_tests  
         do_tests.c: In function `main':  
         do_tests.c:173: warning: unused variable `s'  
         /var/tmp//ccFOupvD.o: In function `do_tests':  
         /var/tmp//ccFOupvD.o(.text+0x3a8): warning: tmpnam() possibly used  
                 unsafely; consider using mkstemp()  
         mips64-unknown-elf-gcc -g -O3 -fno-builtin -fschedule-insns -mips64  
                 -mabi=64 test_common.c -c -o test_common.o  
         ./do_tests "mips64-unknown-elf-gcc -g -O3 -fno-builtin -fschedule-insns  
                 -mips64 -mabi=64" "mips64-unknown-elf-as -mabi=64 -mips64"  
                 "mips64-unknown-elf-ld -Ttext 0xa800000000030000 -e main  
                 --oformat=elf64-bigmips" "../gxemul"  
   
         Starting tests:  
           test_addu.S (-a)  
           test_addu.S (-a -b)  
           test_clo_clz.S (-a)  
           test_clo_clz.S (-a -b)  
           ..  
           test_unaligned.S (-a)  
           test_unaligned.S (-a -b)  
   
         Done. (12 tests done)  
             PASS:     12  
             FAIL:      0  
   
         ----------------  
   
           All tests OK  
   
         ----------------  
 </pre>  
   
 <p>  
 Each test writes output to stdout, and there is a <tt>test_*.good</tt> for  
 each <tt>.S</tt> file which contains the wanted output. If the actual  
 output matches the <tt>.good</tt> file, then the test passes, otherwise it  
 fails.  
   
 <p>  
 Read <tt>tests/README</tt> for more information.  
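
The pass/fail rule described above (a test passes when its output matches the corresponding .good file) can be sketched like this. It is only an illustration of the rule, not the code in do_tests.c: the struct and function names are hypothetical, and the outputs are held in memory as strings rather than being produced by running the emulator and reading files.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory description of one regression test. */
struct sketch_test {
	const char *name;    /* for example "test_addu.S" */
	const char *actual;  /* what the test wrote to stdout */
	const char *good;    /* contents of the matching .good file */
};

struct sketch_tally {
	int pass;
	int fail;
};

/* A test passes only if its output is byte-for-byte equal to the
   expected output; anything else counts as a failure. */
static struct sketch_tally sketch_run_tests(const struct sketch_test *tests,
    size_t n)
{
	struct sketch_tally t = { 0, 0 };
	size_t i;

	for (i = 0; i < n; i++) {
		assert(tests[i].actual != NULL && tests[i].good != NULL);
		if (strcmp(tests[i].actual, tests[i].good) == 0)
			t.pass++;
		else
			t.fail++;
	}
	return t;
}
```

The two counters correspond to the PASS and FAIL lines in the sample output shown earlier.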
   
   
   
482    
483  </body>  </body>
484  </html>  </html>

