GEZEL Manual

Copyright (c) 2010 Patrick Schaumont.

The GEZEL Manual contains the following sections.

Questions on the installation and use of GEZEL can be posted on SourceForge.

Sections included below:

  • Cosimulation Interfaces
  • gplatform
  • StrongARM-based Cosimulation
  • 8051-based Cosimulation
  • Picoblaze-based Cosimulation
  • To keep in mind

The GEZEL Cosimulation Tool

GEZEL designs can be cosimulated with instruction-set simulators. Such designs can include coprocessors that implement graphics, networking and/or cryptographic functions. The GEZEL cosimulation engine is called gplatform. It supports cosimulations with one or more ARM cores, 8051 microcontrollers, or picoblaze microcontrollers.

Cosimulation Interfaces

Cosimulating GEZEL with an instruction-set simulator requires, besides a GEZEL program, an executable program that can run on the instruction-set simulator. These executables are created using a compiler. When the compiler runs on a host machine (e.g. a Linux PC) that differs from the target execution environment (e.g. an ARM instruction-set simulator), a cross-compiler is required. In this discussion, the C programming language and a C cross-compiler will be used to create the executables.

The interactions between the GEZEL program and the executables running on the instruction-set simulators are captured in a cosimulation interface, which is an abstracted version of the real hardware/software interface. The cosimulation interfaces of GEZEL are cycle-true models of the real implementations.

There are various forms of cosimulation interfaces, depending on the I/O mechanisms provided by the core (instruction-set simulator). A commonly used type of interface is a memory-mapped interface, in which a set of addresses in the address space of the core is shared between the hardware and the software running on the core. There can also be specialized coprocessor- or I/O-interfaces, which are supported by dedicated instructions on the core. The main advantage of a memory-mapped interface is that it is almost core-independent. Therefore, C code and GEZEL code written for one type of processor can be ported to another processor with only minimal changes. The main advantage of the specialized interface on the other hand, is that it provides a dedicated, non-shared and usually high-bandwidth data channel between the core and the hardware.

A cycle-true cosimulation interface by itself provides only a mechanism to transfer data between a C program and GEZEL. This data transfer proceeds between two concurrent entities (a core and a hardware block). To prevent data values from getting lost when one party is unaware of the other's activities, synchronization is required. Such synchronization is provided by a synchronization protocol. A synchronization protocol defines a signalling sequence on one or more control signals, in addition to the data transfer channel between C and GEZEL. This signalling sequence ensures that the communicating parties achieve synchronization. Both the control signals and the data transfer channel can be implemented using the same kind of cosimulation interface. For example, you can use a memory-mapped interface for both of them.

gplatform

The command line of gplatform is as follows.

gplatform [-d] -c max_cycles gezel_file

The system configuration is fully contained within gezel_file. The -c flag sets an upper bound on the number of cycles to simulate. By default, the cosimulation runs until all instruction-set simulators have completed execution of their application program (these stopping conditions may vary from core to core).

StrongARM-based Cosimulation

There are three types of interfaces for the StrongARM ISS.
  • Memory-mapped interfaces define a memory-mapped address decoder and intercept memory reads or writes from the ARM software. This is by far the most common and popular method of a hardware-software cosimulation/codesign interface, due to its ease of use and its flexibility. The disadvantage of this interface is communication bandwidth. This type of interface is implemented using armsystemsource, armsystemsink, and armbuffer.

  • Special-function unit (SFU) interfaces define an IO port into the pipeline of the processor, and thus can be used to experiment with ASIP (application-specific instruction-set processor) concepts. The SFU interface was designed by the author of Simit-ARM [Wei Qin]. The special-function unit interfaces are triggered by special, reserved instructions. SFU interfaces have a much larger bandwidth into the StrongARM processor than memory-mapped interfaces. This type of interface is implemented using armsfu2x2, armsfu2x1, armsfu3x1.

  • Fast-Simplex-Link (FSL) interfaces define a dedicated coprocessor port on the ARM, which is emulated using memory reads and memory writes in the ARM software. The FSL interface is defined by the MicroBlaze processor by Xilinx, which also provides detailed documentation for this interface. The gplatform simulator implements an FSL-like interface that enables users to experiment without moving to VHDL or FPGA synthesis. Like the SFU interface, the bandwidth of an FSL interface is higher than that of a memory-mapped interface. This type of interface is implemented using armfslslave and armfslmaster.

Memory-Mapped Interface

Here is a small example of a hardware-software cosimulation of a synchronized data transfer. The design in GEZEL looks as follows.

 1 // Memory-mapped data transfer
 2  ipblock myarm {
 3   iptype "armsystem";
 4   ipparm "exec=listing13";
 5 }
 6 
 7 // Cosimulation interfaces
 8 ipblock b1(in data : ns(8)) {
 9   iptype "armsystemsink";
10   ipparm "core=myarm";
11   ipparm "address=0x80000000";
12 }
13 ipblock b2(out data : ns(8)) {
14   iptype "armsystemsource";
15   ipparm "core=myarm";
16   ipparm "address=0x80000004";
17 }
18 ipblock b3(out data : ns(32)) {
19   iptype "armsystemsource";
20   ipparm "core=myarm";
21   ipparm "address=0x80000008";
22 }
23 
24 // hardware receiver
25 dp D2(in req : ns(8); out ack : ns(8); in data : ns(32)) {
26   reg reqreg  : ns(8);
27   reg datareg : ns(32);
28   sfg sendack {
29    ack = 1;
30   }
31   sfg sendidle {
32    ack = 0;
33   }
34   sfg read {
35    reqreg  = req;
36    datareg = data;
37   }
38   sfg rcv {
39    $display("data received ", data, " cycle ", $cycle);    
40   }
41 }
42 fsm F2(D2) {
43   initial s0;
44   state   s1, s2;
45   @s0 (read, sendack) -> s1;
46   @s1 if (reqreg) then (read, rcv, sendidle) -> s2;
47                   else (read, sendack)       -> s1;
48   @s2 if (reqreg) then (read, sendidle)      -> s2;
49                   else (read, sendack)       -> s1;
50 }
51 
52 dp sysD2 {
53   sig r, a : ns(8);
54   sig d    : ns(32);
55   use myarm;
56   use D2(r,a,d);
57   use b1(a);
58   use b2(r);
59   use b3(d);
60 }
61 
62 // connect hardware to cosimulation interfaces
63 system S {
64   sysD2;
65 }

Lines 1-5 in Listing 12 include an ARM core in the simulation. It has type armsystem, which means it is a complete instruction-set simulator including its program memory. The application program that must be loaded into the program memory is given as a parameter to this library block, on line 4. In this case, we specify that the application program is stored in the executable listing13.

Lines 6-22 define three cosimulation interfaces between GEZEL and the ARM. These interfaces are unidirectional, memory-mapped interfaces. There are two types of memory-mapped interfaces:

  • armsystemsink blocks, such as in lines 8-12. These are channels from GEZEL to the ARM; they are a data sink for GEZEL. These blocks define an input port on the library block where data to be sent to the ARM is provided.
  • armsystemsource blocks, such as in lines 13-22. These are channels from the ARM to GEZEL; they are a data source for GEZEL. These blocks define an output port on the library block where data that is received from the ARM can be retrieved.

Both armsystemsink and armsystemsource define two parameters using the ipparm field. The first parameter is the name of the ARM core they belong to. The second parameter is the address value of the ARM memory location that is shared between the GEZEL hardware block and the ARM core.

Lines 24-50 define an example hardware module that accepts values from the software running on the ARM. The module executes a two-phase full-handshake protocol, which uses two control lines (an input req and an output ack). In this listing, ack idles at 1 (sfg sendack) and is driven to 0 (sfg sendidle) once a request has been seen. At the start of the protocol, the hardware module waits for the req control signal to become high (lines 46-47). Before driving req high, the software first sets the data value to a stable value. When the hardware sees req high, it reports the data and drives ack to 0. At that moment the second phase of the handshake protocol is entered, and an inverse but symmetric handshake sequence is executed: first the software drives req to zero, after which the GEZEL hardware model responds by driving ack back to 1 (lines 48-49). A software program that executes this handshake sequence on the ARM is shown next.

int main() {
   volatile unsigned char *reqp, *ackp;
   volatile unsigned int  *datap;
   int data = 0;
   int i;
 
   reqp  = (volatile unsigned char *) 0x80000004;
   ackp  = (volatile unsigned char *) 0x80000000;
   datap = (volatile unsigned int  *) 0x80000008;
 
   for (i=0; i<10; i++) {
     *datap = data;
     data++;
 
     *reqp = 1; 
     while (*ackp) { }
 
     *reqp  = 0;
     while (! *ackp) { }
   }
   return 0;
}

The memory-mapped hardware/software interfaces are included in lines 2-3 as pointers of the volatile type. Such pointers are treated with caution by a compiler optimizer. In particular, no assumption is made about the persistence of the memory location that is being pointed at by this pointer. The pointers are initialized in lines 7-9 with values corresponding to the memory addresses used in the GEZEL description.

In lines 11-20, a simple loop executes the software side of the two-phase full-handshake protocol. Lines 15 and 18 illustrate why the volatile declaration is important. Without volatile, an optimizing C compiler would conclude that the value written through reqp on line 15 is simply overwritten on line 18. In addition, the resulting value is loop-invariant, so the write can be hoisted outside of the loop body. The resulting optimized code would write the value 0 once to *reqp and never change it afterwards. By declaring reqp as a pointer to volatile data, we make the compiler refrain from such optimizations.

The cosimulation is now executed as follows. Start by compiling the ARM program using a cross-compiler. The -static flag creates a statically linked executable, a requirement for the ARM ISS.

/usr/local/arm/bin/arm-linux-gcc -static hshakedriver.c -o hshakedriver

Next run the cosimulation with gplatform:

> gplatform listing12.fdl
 armsystem: loading executable [listing13]
 armsystemsink: set address 2147483648
 data received 0 cycle 29365
 data received 1 cycle 29527
 data received 2 cycle 29563
 data received 3 cycle 29599
 data received 4 cycle 29635
 data received 5 cycle 29671
 data received 6 cycle 29707
 data received 7 cycle 29743
 data received 8 cycle 29779
 data received 9 cycle 29815
 Total Cycles: 32450

The simulation initializes and then prints a series of messages, which are generated by the GEZEL program. One round trip of the protocol takes 36 clock cycles, a rather high value because we are working with an unoptimized C program and an unoptimized handshake protocol.

Special-Function Unit Interface

Here is a brief example that shows how a custom instruction for the StrongARM can be created. Custom instructions are called from C through macros such as the following one. In this case, we map the op2x2 instruction (which the StrongARM does not have, of course) to the smullnv instruction. This instruction is not implemented by the processor, but it is accepted by the StrongARM compiler.

#define OP2x2_1(D1,D2,S1,S2) \
       asm volatile ("smullnv %0, %1, %2, %3": \
               "=&r"(D1),"=&r"(D2): \
               "r"(S1),"r"(S2));

We will describe the use of OP2x2_1 with a small example. Consider the following C program. It contains two calls to OP2x2_1, each of which maps to a custom instruction of the op2x2 type. This program simply defines the input arguments for op2x2, calls it, and prints the result.

#include <stdio.h>

/* OP2x2_1 is the macro defined above */

int main() { 
   int a,b,c,d; 
   
   a = 10; 
   b = 20; 
   OP2x2_1(c, d, a, b);
   printf("%d %d %d %d\n", a, b, c, d); 
 
   a = 50;
   b = 20;
   OP2x2_1(c, d, a, b);
   printf("%d %d %d %d\n", a, b, c, d); 
   
   return 0; 
}

Here is a corresponding GEZEL program that implements the special-function unit.

ipblock myarm {
   iptype "armsystem";
   ipparm "exec = sfudriver";
}
ipblock armsfu1(out d1, d2 : ns(32);
                in  q1, q2 : ns(32)) {
  iptype "armsfu2x2";
  ipparm "core = myarm";
  ipparm "device = 0";
}
  
dp addsub {
use myarm;
sig d1, d2, q1, q2 : ns(32);
use armsfu1(d1, d2, q1, q2);
always {
  q1 = (d1 + d2);
  q2 = (d1 - d2);
  $display("SFU 2x2 runs at ", $cycle, ": " , q1, " ", q2);
  }
}
 
system S {
  addsub;
}

The armsfu1 ipblock in the program defines the interface between StrongARM and GEZEL. This interface provides two outputs (d1 and d2) and two inputs (q1 and q2), as expected for an op2x2 block. The addsub datapath connects to this interface, and performs operations on the values provided by the armsfu1 interface.

The synchronization between StrongARM and the custom datapath in GEZEL is implicit; whenever the StrongARM executes an op2x2 instruction, it provides the input values to the armsfu1 interface, and gives GEZEL one clock cycle to process them. At the end of that clock cycle, the StrongARM takes whatever value is available at the interface back into the program. In this case, the custom datapath is nothing more than a simple add/subtract function.

To compile and simulate this program, run make followed by make sim.

 > make
   /usr/local/arm/bin/arm-linux-gcc -static sfudriver.c -o sfudriver
 > make sim
   /opt/gezel-2.1/bin/gplatform armsfu.fdl
   core myarm
   armsystem: loading executable [sfudriver]
   SFU 2x2 runs at 0: 0 0
   SFU 2x2 runs at 30791: 1e fffffff6
   10 20 30 -10
   SFU 2x2 runs at 47074: 46 1e
   50 20 70 30
   Total Cycles: 54062

Fast Simplex Link Interface

A Fast Simplex Link implements a point-to-point connection between a MicroBlaze processor (Xilinx) and a coprocessor. Several design features ensure high throughput between the MicroBlaze and the coprocessor:

  • An FSL is a dedicated, non-shared link, driven by a simple handshake protocol rather than a memory-bus read/write cycle.
  • The MicroBlaze processor has dedicated instructions to access the FSL.
  • An FSL can be buffered with a dedicated queue, which enables execution overlap of the MicroBlaze operation and the coprocessor.

We build a high-level simulation model of a copy coprocessor in GEZEL. The copy coprocessor transfers data from an FSL slave to an FSL master. The listing below shows the copy coprocessor modeled in GEZEL.

 1 ipblock arm1 {
 2   iptype "armsystem";
 3   ipparm "exec = fsldrive";
 4 }
 5
 6 ipblock fsl1(out data   : ns(32);
 7              out exists : ns(1);
 8              in  read   : ns(1)) {
 9   iptype "armfslslave";
10   ipparm "core=arm1";
11   ipparm "write=0x80000000";
12 }
13 
14 ipblock fsl2(in  data   : ns(32);
15              out full   : ns(1);
16              in  write  : ns(1)) {
17   iptype "armfslmaster";
18   ipparm "core=arm1";
19   ipparm "read=0x80000004";
20   ipparm "status=0x80000008";
21 }
22
23 dp gezelfslcopy(in  rdata   : ns(32);
24                 in  exists  : ns(1);
25                 out read    : ns(1);
26                 out wdata   : ns(32);
27                 in  full    : ns(1);
28                 out write   : ns(1)) {
29   reg rexists, rfull : ns(1);
30   reg rcopy : ns(32);
31   always {
32    rexists = exists;
33    rfull   = full;
34    wdata   = rcopy;
35   }
36   sfg dowrite   { write = 1; }
37   sfg dontwrite { write = 0; }
38   sfg doread    { read = 1; }
39   sfg dontread  { read = 0; }
40   sfg capture   { rcopy = rdata; 
41                   $display("captures data: ", rdata);
42                 }
43 }
44 fsm fsm_gezelfslcopy(gezelfslcopy) {
45   initial s0;
46   state s1, s2, s3;
47   @s0 if (rexists) then (capture , doread, dontwrite) -> s1;
48                       else (dontread, dontwrite)         -> s0;
49   @s1 if (rfull)   then (dontread, dontwrite)         -> s1;
50                    else (dowrite , dontread )         -> s0;
51 }
52
53 dp top {
54   sig rdata, wdata : ns(32);
55   sig write, read  : ns(1);
56   sig exists, full : ns(1);
57   use arm1;
58   use fsl1(rdata, exists, read);
59   use fsl2(wdata, full, write);
60   use gezelfslcopy(rdata, exists, read, wdata, full, write);
61 }
62
63 system S {
64   top;
65 }

We make use of an ARM instruction-set simulator since a cycle-accurate MicroBlaze ISS is currently not available in GEZEL. The FSLs are modeled by means of ipblock constructs (lines 6-21). An ARM does not have an FSL, and therefore these are emulated through a memory-mapped protocol. The FSL side of these ipblocks, however, implements the exact FSL protocol. In other words, any coprocessor that can be functionally verified using cosimulation with this setup will also work when attached to a MicroBlaze FSL. The memory-mapped protocol through which the ARM drives the FSL works as follows.

  • The FSL slave defines a write address. When the ARM writes to this address, that data will be transferred to the FSL slave interface.
  • The FSL master defines a read address. When the ARM reads from this address, the last data token provided from the FSL master will be returned. The FSL master also defines a status address. When the ARM reads from this address, the presence of a new token will be indicated. In other words, before accessing the read address, the ARM should test the value of the status address to ensure new data was written into the FSL master by the coprocessor.

The copy coprocessor, lines 23-51, is a simple FSMD that alternately drives the input FSL handshake and the output FSL handshake. The minimum latency through the coprocessor is two clock cycles: in the first clock cycle, data is copied from the FSL slave to the internal rcopy register. In the second clock cycle, data is transferred from the internal rcopy register to the FSL master.

A corresponding software driver routine that can run on the StrongARM and drive this coprocessor is shown in the following listing. We use initialized pointers to provide a convenient abstraction of memory-mapped interfaces. The ARM memory read/write instructions cause the corresponding FSL protocol to be executed in the GEZEL model. When we transfer this program to an actual MicroBlaze processor, we will need to replace these memory reads/writes with actual MicroBlaze FSL instructions.

#include <stdio.h>
  
int main() {
  volatile unsigned int *wchannel = (volatile unsigned int *) 0x80000000;
  volatile unsigned int *rchannel_data = (volatile unsigned int *) 0x80000004;
  volatile unsigned int *rchannel_status = (volatile unsigned int *) 0x80000008;
  int i;

  for (i=0; i<5; i++) {
    *wchannel = i;
    while (*rchannel_status != 1) ;
    printf("Received data %d\n", *rchannel_data);
  }
  return 0;
}

The C file and the GEZEL file can be cosimulated with the GEZEL-based cosimulator, gplatform. Sample simulation output is as follows.

> make
/usr/local/arm/bin/arm-linux-gcc -static -O3 fsldrive.c -o fsldrive
> make sim
gplatform fslcopy.fdl
core arm1
armsystem: loading executable [fsldrive]
Coprocessor instruction ignored 0xee303110!
Coprocessor instruction ignored 0xee203110!
captures data: 0
Received data 0
captures data: 1
Received data 1
captures data: 2
Received data 2
captures data: 3
Received data 3
captures data: 4
Received data 4
Total Cycles: 71446

8051-based Cosimulation

There are two kinds of cosimulation interfaces for the 8051 simulation model.

  • Port-mapped interfaces attach to port P0, P1, P2, or P3 of the 8051 processor. This type of interface is implemented using i8051systemsource and i8051systemsink.
  • Shared-memory interfaces define a shared-memory block attached to the xbus of the 8051 processor. Both GEZEL and the i8051 can read from/write to this memory. This type of interface is implemented using i8051buffer.

Consider the following small example of an 8051 cosimulation, a program that simply transfers data values from the 8051 microcontroller to the GEZEL simulation.

 1 dp hello_decoder(in   ins : ns(8);
 2                  in   din : ns(8)) {
 3   reg insreg : ns(8);
 4   reg dinreg : ns(8);
 5   sfg decode   { insreg = ins; 
 6                  dinreg = din; }
 7   sfg hello    { $display($cycle, " Hello! You gave me ", dinreg); } 
 8 }
 9 
10 fsm fhello_decoder(hello_decoder) {
11   initial s0;
12   state s1, s2;
13   @s0 (decode) -> s1;
14   @s1 if (insreg == 1) then (hello, decode) -> s2;
15                        else (decode)        -> s1;
16   @s2 if (insreg == 0) then (decode)        -> s1;
17                        else (decode)        -> s2;
18 }
19 
20 ipblock my8051 {
21   iptype "i8051system";
22   ipparm "exec=driver.ihx";
23   ipparm "verbose=1";
24 }
25 
26 ipblock my8051_ins(out data : ns(8)) {
27   iptype "i8051systemsource";
28   ipparm "core=my8051";
29   ipparm "port=P0";
30 }
31 
32 ipblock my8051_datain(out data : ns(8)) {
33   iptype "i8051systemsource";
34   ipparm "core=my8051";
35   ipparm "port=P1";
36 }
37 
38 dp sys {
39   sig ins, din : ns(8);
40 
41   use my8051;
42   use my8051_ins(ins);
43   use my8051_datain(din);
44   use hello_decoder(ins, din);
45 }
46 
47 system S {
48   sys;
49 }

The first part of the program, lines 1-18, is a one-way handshake that accepts data values and prints them. Of particular interest for this example are the hardware/software interfaces in lines 26-36. The cosimulation interfaces with an 8051 are not memory-mapped but rather port-mapped. The 8051 has four ports, labeled P0 to P3, which are mapped to its internal memory space but which are available as IO ports on the core. These ports are intended to attach peripherals, and in this case are used to attach a GEZEL coprocessor. To learn more about the 8051, refer to the UCR Dalton project (http://www.cs.ucr.edu/~dalton/i8051/) or the numerous other sources of 8051 information on the web. Here is a driver program in C for this coprocessor.

#include <8051.h>
enum {ins_idle, ins_hello};
void sayhello(char d) {
  P1 = d;
  P0 = ins_hello;
  P0 = ins_idle;
}
void terminate() {
  // special command to stop simulator
  P3 = 0x55;
}
void main() {
  sayhello(3);
  sayhello(2);
  sayhello(1);
  terminate();
}

The program transfers values to the GEZEL coprocessor using sayhello in lines 3-7. The include file on line 1 is specific to this 8051 processor. Unlike a standard C program, a C program on the 8051 never terminates, and there is no concept of a standard C library. Consequently, there are no printf functions and so on; these would be of little use within a microcontroller. The include file 8051.h contains several definitions, including those of ports P0 to P3. The special function terminate in lines 8 to 11 is used to stop the cosimulation. It writes the hex value 55 to port P3 (this is a specific convention for this simulator). The simulation proceeds as follows. First compile the 8051 program, using the Small Device C Compiler (sdcc).

> sdcc listing18.c

The compiler creates several intermediate files, as well as the compiled code in Intel Hex format, listing18.ihx. Next, run the gplatform simulator to execute the cosimulation.

> gplatform listing17.fdl
 i8051system: loading executable [listing18.ihx]
 0xFF    0x03 0xFF 0xFF
 0x01    0x03 0xFF 0xFF
 9612 Hello! You gave me 3/3
 0x00    0x03 0xFF 0xFF
 0x00    0x02 0xFF 0xFF
 0x01    0x02 0xFF 0xFF
 9753 Hello! You gave me 2/2
 0x00    0x02 0xFF 0xFF
 0x00    0x01 0xFF 0xFF
 0x01    0x01 0xFF 0xFF
 9894 Hello! You gave me 1/1
 0x00    0x01 0xFF 0xFF
 0x00    0x01 0xFF 0x55
 Total Cycles: 9987

The output of the simulation shows the $display output from GEZEL, in addition to a value-change trace of the 8051's ports (P0 to P3). The 8051 uses many clock cycles; one machine cycle takes 12 clock cycles. Typically, a single instruction can execute in one machine cycle.

Picoblaze-based Cosimulation

The picoblaze is instantiated as a single block in the simulation. For a description of the picoblaze microcontroller, please refer to the documentation of Xilinx. The model implemented in GEZEL is a cycle-true implementation based on the instruction-set simulator kpicosim by Mark Six. The encapsulation into a cycle-true interface was designed by Eric Simpson. Here is a small example of a GEZEL design that uses a picoblaze processor.

ipblock mypico (out port_id : ns(8);
                 out write_strobe : ns(1);
                 out read_strobe : ns(1);
                 out out_port : ns(8);
                 in  in_port : ns(8);
                 in  interrupt : ns(1);
                 out interrupt_ack : ns(1);
                 in  reset : ns(1);
                 in  clk : ns(1)) {
     iptype "picoblaze";
     ipparm "exec=SMALL.DEC";
     ipparm "verbose=0";
}
dp shw(in a    : ns(8); 
       in addr : ns(8); 
       in ws   : ns(1)) {
  reg k : ns(8);
  always {
    k = a;
    $display("* ", $cycle, " P->G: V = ", a, " addr = ", addr, " ws = ", ws);
  }
}
dp cnt(out a   : ns(8); 
       in addr : ns(8); 
       in rs   : ns(1)) {
  reg c : ns(8);
  always {
   c = c + 1;
   a = c;
  }
}
dp top {
  sig port_id, out_port, in_port : ns(8);
  sig write_strobe, read_strobe, interrupt, interrupt_ack : ns(1);
  sig reset, clk : ns(1);
  use mypico(port_id,      
             write_strobe, 
             read_strobe,  
             out_port,     
             in_port,      
             interrupt,    
             interrupt_ack,
             reset,
             clk);
  use cnt(in_port,  port_id, read_strobe);
  use shw(out_port, port_id, write_strobe);
  always {
    interrupt = 0;
    reset = 0;
    clk   = 0;
  } 
}  
system S {
  top;
}

The design attaches a free-running counter to the input data port of a picoblaze, and prints whatever is generated on the output data port of the picoblaze. Note that the reset and clk ports have no meaning for the GEZEL simulation - they are only there to create an exact pin-compatible copy of the picoblaze processor in the top-level netlist. The program running on the picoblaze is written in picoblaze assembly. Again, refer to the documentation by Xilinx for a description of available picoblaze instructions. Here is a small program that copies the input to the output, while adding 1.

         ENABLE INTERRUPT
 LOOP:   INPUT SA,25
         ADD SA,01
         OUTPUT SA,10
         JUMP LOOP

We can convert that program into a binary (hex) file with the picoblaze assembler. Next, the resulting program and architecture can be simulated by gplatform.

> make sim
 ../../../build/bin/gplatform -c 5000 pb.fdl
 picoblaze: executable [exec=SMALL.DEC]
 (RESET EVENT)
 *  0 P->G: V = 20 addr = b8 ws = 0
 *  1 P->G: V = 20 addr = b8 ws = 0
 *  2 P->G: V = 20 addr = 25 ws = 0
 *  3 P->G: V = 20 addr = 25 ws = 0
 *  4 P->G: V = 20 addr = 25 ws = 0
 *  5 P->G: V = 20 addr = 25 ws = 0
 *  6 P->G: V =  4 addr = 10 ws = 0
 *  7 P->G: V =  4 addr = 10 ws = 1
 *  8 P->G: V =  4 addr = 10 ws = 0
 *  9 P->G: V =  4 addr = 10 ws = 0
 * 10 P->G: V =  4 addr = 25 ws = 0
 * 11 P->G: V =  4 addr = 25 ws = 0
 * 12 P->G: V =  4 addr = 25 ws = 0
 * 13 P->G: V =  4 addr = 25 ws = 0
 * 14 P->G: V =  c addr = 10 ws = 0
 * 15 P->G: V =  c addr = 10 ws = 1
 * 16 P->G: V =  c addr = 10 ws = 0
 * 17 P->G: V =  c addr = 10 ws = 0

The first output instruction starts at cycle 6 (the write strobe goes high in the second cycle of the output instruction, showing ws = 1 at cycle 7). The data written out at that point is 4. Since the assembly program adds 1 to the value it reads, we conclude that the input value 3 was captured in cycle 3.

A picoblaze is particularly effective in coping with complex control situations. If you find yourself developing FSM after FSM, with no improvement in sight, it may be useful to reconsider your approach to control design, and try to use a picoblaze controller.

To Keep in Mind

In general, the speed of a good instruction-set simulator is far higher than that of the GEZEL kernel. This is because an ISS is developed with the architecture of the processor it must model in mind, which is not possible for the GEZEL kernel. Also, the GEZEL kernel uses scripted simulation, rather than compiled simulation.

As an optimization, the GEZEL simulator uses a strategy of sleep/awake modes. This mode switching is also important for cosimulation. Where possible, a user should develop the GEZEL hardware model in such a way that periods of idle or inactive operation also imply no datapath register changes and no state changes in the GEZEL controllers. This enables the GEZEL simulator to enter sleep mode while only the ISS keeps on running, which greatly improves the resulting simulation speed.
