SFL: Structured Function description Language Tutorial

SFL is an RTL (Register Transfer Level) hardware description language and serves as the input language of the PARTHENON system. Its simple clock model is designed for describing synchronous digital circuits with a single-phase clock. The main characteristics of SFL are as follows:

The following tutorial contains three simple examples that illustrate the above concepts. We will simulate them with the PARTHENON simulator SECONDS. Here is the full SFL syntax (Backus-Naur Form).

Full-adder

Let's have a look at the whole source code before going through it line by line. The design holds two modules: 'adder1' (a one-bit full-adder) is a submodule used in the top module 'adder4' (a bit-slice implementation).

Code file 'adder4.sfl' (with line numbering)

As in the C language, SFL distinguishes between module declaration and module definition. Before the 'adder1' circuit can be used in the top module 'adder4', it must be declared. This is done in lines 7..13. The definition of the 'adder1' behavior can be found in lines 55..67. This definition code could be located in a separate file. Furthermore, it is not necessary to declare the top module 'adder4'.

Comments are delimited as in /*this is a comment*/. An SFL module always starts with the facility declaration. The 'adder1' module (line 7) declares three one-bit data inputs 'a', 'b', 'cin' and two one-bit data outputs, the sum bit 's' and the carry-out bit 'cout'. The control terminal 'add' in line 10 is important. As mentioned in the introduction, the behavior of an object can be associated with a control signal. In that case the 'add' behavior is only "executed" while this control signal is active. Line 12 shows the argument binding of the 'add' behavior with the 'instr_arg' construct. This makes it very easy to access the object 'adder1' and its behavior 'add' from other modules (module instantiation).

Before we use our 'adder1' object in the top module 'adder4', let's take a look at the actual implementation. After the facility declaration, the 'instruct' keyword on line 60 starts the behavior definition related to the 'add' control signal. The 'par {..}' statement lists what has to be done in parallel during that clock period. The sum bit 's' is realized with xor operations, while the carry output bit 'cout' is formulated as a sum of products (SOP). Although this description holds some redundancy, we keep this straightforward "truth table" definition and let the underlying PARTHENON synthesizer do the optimization work. In real designs it is in any case more appropriate to use building blocks such as adders or multipliers from a library that is optimized for the target technology (CMOS, FPGA, etc.).
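The 'add' behavior described above can be cross-checked outside of SFL. The following is a minimal Python sketch, not SFL and not taken from 'adder4.sfl'. The function name 'adder1' merely mirrors the module name, and the particular sum-of-products form used for the carry is an assumption, since several equivalent forms exist.

```python
from itertools import product


def adder1(a, b, cin):
    """One-bit full adder model: returns (s, cout)."""
    s = a ^ b ^ cin                         # sum bit: xor of all three inputs
    cout = (a & b) | (a & cin) | (b & cin)  # carry: sum of products (assumed form)
    return s, cout


# Exhaustive check of all eight input combinations against integer addition.
for a, b, cin in product((0, 1), repeat=3):
    s, cout = adder1(a, b, cin)
    assert 2 * cout + s == a + b + cin
```

With only eight input combinations, the exhaustive loop plays the role of the "truth table" mentioned in the text.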

Now that the 'adder1' module is declared and defined, we can use it in the top module 'adder4'. Again we start with the facility declaration in line 19. Apart from inputs and outputs we have some internal facilities, i.e. four 1-bit adders ('ADDER0'.. 'ADDER3') and three selectors ('CARRY1'.. 'CARRY3') for the carry calculation. The 'adder4' object binds its behavior to the 'ADD' control signal. Lines 27..35 show how easily submodules can be connected in SFL. The most significant bit (MSB) of the sum vector 'S' is driven by the output of the 'ADDER3' object by simply "calling" its behavior 'add' with the corresponding arguments. The SFL operator '||' stands for concatenation in the order MSB..LSB. Thus the sum vector 'S=[S<3>.. S<0>]' can be assigned as in lines 27..30. In line 31 and the following we do not have to repeat the argument binding (i.e. 'COUT= ADDER3.add( A<3>, B<3>, CARRY3)' ) since a hardware object can only be connected once. Because all the assignments are within the 'par' statement, their order does not matter.

Now let's simulate the 'adder4.sfl' module with the PARTHENON simulator SECONDS, using the script file 'adder4.scr'. Here is the result of the following command.

 
%seconds < adder4.scr
...

--- simulation modules:
adder4                          (module          )
adder1                          (module          )

--- facilities of top module:
/                               (module          )
  A                               (term input      )
  ADD                             (instr input     )
  ADDER0                          (submodule       )
  ADDER1                          (submodule       )
  ADDER2                          (submodule       )
  ADDER3                          (submodule       )
  B                               (term input      )
  CARRY1                          (term internal   )
  CARRY2                          (term internal   )
  CARRY3                          (term internal   )
  CIN                             (term input      )
  COUT                            (term output     )
  S                               (term output     )

--- simulation start:

input pins | internal signals                              | output pins
A B CIN ADD| ADDER0.a ADDER0.b ADDER0.cin ADDER0.add CARRY1| S COUT
------------------------------------------------------------------------
1 1 0   1  | 1        1        0          1          1     | 2 0
1 6 1   1  | 1        0        1          1          1     | 8 0
7 2 0   1  | 1        0        0          1          0     | 9 0
9 8 0   1  | 1        0        0          1          0     | 1 1
1 1 0   0  | z        z        z          0          z     | z z
1 6 1   0  | z        z        z          0          z     | z z

After listing the simulated modules and the facilities of the top module 'adder4', the simulation starts. Our circuit does what we want. The control signals 'ADD' and 'ADDER0.add' are the interesting part: if they are zero, the outputs 'S' and 'COUT' are not driven, which the simulator shows as 'z'.
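The rows of the simulation above can be reproduced with a small Python model of the bit-slice structure. This is only a sketch under assumptions: the function names mirror the module names, 'None' stands in for the undriven value 'z', and the loop replaces the four explicit 'ADDER0'.. 'ADDER3' instances.

```python
def adder1(a, b, cin):
    """One-bit full adder model: returns (s, cout)."""
    return a ^ b ^ cin, (a & b) | (a & cin) | (b & cin)


def adder4(A, B, CIN, ADD=1):
    """4-bit ripple-carry model: returns (S, COUT), or (None, None) if ADD is inactive."""
    if not ADD:
        return None, None            # outputs are not driven ('z' in the listing)
    carry, S = CIN, 0
    for i in range(4):               # bit slices 0..3, carry ripples upwards
        s, carry = adder1((A >> i) & 1, (B >> i) & 1, carry)
        S |= s << i                  # place sum bit i; bit 3 ends up as the MSB
    return S, carry


# The six stimulus rows (A, B, CIN, ADD) of the simulation listing, in hex.
for row in [(0x1, 0x1, 0, 1), (0x1, 0x6, 1, 1), (0x7, 0x2, 0, 1),
            (0x9, 0x8, 0, 1), (0x1, 0x1, 0, 0), (0x1, 0x6, 1, 0)]:
    print(row, "->", adder4(*row))
```

The fourth row is the only one that overflows: 0x9 + 0x8 = 0x11, so 'S' is 1 and 'COUT' is 1.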

Let's go down to gates now: The following shows a schematic (adder4.jpg, adder4.ps) and the listing of the netlist (adder4.edif) after PARTHENON synthesis with the demonstration CMOS library:

%auto adder4 ps DEMO demo

Where are our control signals 'ADD' and 'add'? They have gone. This is actually not surprising, since the full-adder consists of only combinatorial circuitry. Because nothing has to be controlled, the PARTHENON synthesizer has optimized them away. Whether to introduce them in the first place is a question of style. If we omit them, however, module instantiation becomes as clumsy as in structural description languages. That is why we still use them even in combinatorial designs.

Timer

The second example shows the implementation of an 8-bit timer. Here is the interface specification:

 

Inputs  | SET      | sets the timer with the 'INIT<8>' value
        | INIT<8>  | 8-bit initialization value
        | RESET    | resets timer by deactivating the 'EXPIRE' signal
--------+----------+--------------------------------------------------
Outputs | EXPIRE   | asserted if the timer reaches zero
        | ENABLE   | active if 'COUNT<8>' is valid otherwise inactive
        | COUNT<8> | shows timer countdown
The 'SET' signal initializes the timer with the 'INIT' value and the countdown starts. While counting down, the 'COUNT' output shows the momentary value of the timer. The 'ENABLE' signal is '1' whenever 'COUNT' is valid, otherwise it is '0'. If the timer reaches zero, the 'EXPIRE' signal is activated. It stays active until 'RESET' is triggered. If 'RESET' becomes active during countdown, the timer stops and the 'COUNT' output is a don't care.
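To make this specification concrete, here is a cycle-based Python sketch of the interface. It is not the SFL implementation. The class name 'Timer8', the attribute names, the priority of 'RESET' over 'SET', and the exact cycle in which each output changes are all assumptions of this sketch. 'None' stands for a don't care.

```python
class Timer8:
    """Behavioral model of the timer interface (timing details are assumed)."""

    def __init__(self):
        self.remaining = None    # register content is unknown at power-on
        self.counting = False
        self.expired = False

    def clock(self, SET=0, RESET=0, INIT=0):
        """Simulate one clock period and return (EXPIRE, ENABLE, COUNT)."""
        # Outputs depend only on the current state (assumption).
        expire = int(self.expired)
        enable = int(self.counting)
        count = self.remaining if self.counting else None

        # Next state; RESET wins over SET (assumption). INIT=0 is not
        # treated specially in this sketch.
        if RESET:
            self.counting = False
            self.expired = False
        elif SET:
            self.remaining = INIT & 0xFF
            self.counting = True
            self.expired = False
        elif self.counting:
            self.remaining = (self.remaining - 1) & 0xFF
            if self.remaining == 0:
                self.counting = False
                self.expired = True
        return expire, enable, count


timer = Timer8()
print(timer.clock(SET=1, INIT=0x04))   # period in which SET is applied
for _ in range(6):
    print(timer.clock())               # countdown, then EXPIRE stays active
print(timer.clock(RESET=1))            # RESET is applied in this period
print(timer.clock())                   # EXPIRE is deactivated afterwards
```

In this model 'COUNT' is only returned while 'ENABLE' is 1, which mirrors the rule that 'COUNT' may take any value otherwise.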

The first step is the interface declaration of the top module:

declare timer8 {
  instrin  SET, RESET;
  input    INIT<8>;
  instrout EXPIRE, ENABLE;
  output   COUNT<8>;
  instr_arg SET(INIT);
}/*timer8*/

We have declared four control signals: 'SET' and 'RESET' as inputs, 'EXPIRE' and 'ENABLE' as outputs. The reason is that these signals are supposed to be driven at all times, i.e. they are always '0' or '1'. The input value 'INIT<8>', on the other hand, is a data terminal which only has to be driven while 'SET' is active. The same is true for the output 'COUNT<8>': if 'ENABLE' is zero, it can have any value. Don't cares are very important when it comes to logic synthesis. A high degree of freedom allows the optimizer to find better solutions (e.g. shorter critical paths, fewer gates) for a given specification. The 'SET' behavior takes an argument 'INIT', which is bound with the keyword 'instr_arg'.

After the interface declaration we can now start with the implementation of the timer. It consists of an 8-bit decrementor submodule 'dec8' and the top module 'timer8':

Code file 'timer8.sfl' (with line numbering)

After its declaration, the decrementor module 'dec8' is defined as a 'circuit' in lines 17..23. A circuit is a library module that is already synthesized, which makes it possible to reuse third-party designs. Thus we only have to declare its interface and a quasi behavior for simulating it. This is done in line 22 with the 'instruct' command. Since the SFL operator set contains no subtraction, we use addition of the "two's complement" instead. Here is an overview of the SFL operators.

 

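The results in the example columns assume a<4>=1011 and b<4>=1111.

priority | symbol | operation       | example code | result           | remarks
---------+--------+-----------------+--------------+------------------+---------------------------------------
high     | <n:m>  | extraction      | a<2:1>       | 01               |
         | <n>    | extraction      | a<2>         | 0                |
---------+--------+-----------------+--------------+------------------+---------------------------------------
middle   | ^      | not of all bits | ^a           | 0100             |
         | /|     | or of all bits  | /|a          | 1                |
         | /@     | xor of all bits | /@a          | 1                |
         | /&     | and of all bits | /&a          | 0                |
         | /      | decode          | /a           | 0000100000000000 | only circuit
         | \      | encode          | \a           | 011              | only circuit
         | #      | bit expansion   | 8#a          | 11111011         |
---------+--------+-----------------+--------------+------------------+---------------------------------------
low      | |      | or              | a|b          | 1111             |
         | @      | xor             | a@b          | 0100             |
         | &      | and             | a&b          | 1011             |
         | ||     | concatenation   | a||b         | 10111111         |
         | +      | addition        | a+b          | 11010            | only circuit
         | >>     | bit shift right | a>>0x2       | 0010             | only circuit
         | <<     | bit shift left  | a<<0x2       | 1100             | only circuit
         | ==     | comparison      | a==b         | 0                | only circuit, ok if rvalue is constant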

Operators only supported within circuit modules have an entry in the remark column. The comparison operator can be synthesized if the right value is a constant, i.e. a== 0xf8 works fine. In the future SFL will also allow multiplication and other operations for behavioral simulation within SECONDS.
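Most results in the operator table can be checked with plain integer arithmetic. The following Python sketch does this for a<4>=1011 and b<4>=1111. The helper names 'bits', 'sign_extend' and 'dec8' are hypothetical, the encode operator is left out because its exact definition is not given here, and 'dec8' only illustrates the idea of decrementing by adding the two's complement of one (it is not the library circuit).

```python
def bits(value, width):
    """Format an integer as a fixed-width binary string, MSB first."""
    return format(value & ((1 << width) - 1), f"0{width}b")


def sign_extend(value, width, new_width):
    """Model of 'n#a': replicate the MSB of a width-bit value up to new_width bits."""
    if (value >> (width - 1)) & 1:
        value |= ((1 << new_width) - 1) & ~((1 << width) - 1)
    return value


def dec8(value):
    """Decrement an 8-bit value by adding 0xff and dropping the carry."""
    return (value + 0xFF) & 0xFF


a, b, W = 0b1011, 0b1111, 4

assert bits(a >> 1, 2) == "01"                      # a<2:1>
assert bits(a >> 2, 1) == "0"                       # a<2>
assert bits(~a, W) == "0100"                        # ^a
assert int(a != 0) == 1                             # /|a
assert bin(a).count("1") % 2 == 1                   # /@a
assert int(a == 0b1111) == 0                        # /&a
assert bits(1 << a, 16) == "0000100000000000"       # /a  (decode)
assert bits(sign_extend(a, W, 8), 8) == "11111011"  # 8#a
assert bits(a | b, W) == "1111"                     # a|b
assert bits(a ^ b, W) == "0100"                     # a@b
assert bits(a & b, W) == "1011"                     # a&b
assert bits((a << W) | b, 8) == "10111111"          # a||b
assert bits(a + b, 5) == "11010"                    # a+b, result one bit wider
assert bits(a >> 2, W) == "0010"                    # a>>0x2
assert bits(a << 2, W) == "1100"                    # a<<0x2
assert int(a == b) == 0                             # a==b
assert all(dec8(v) == (v - 1) % 256 for v in range(256))
```

Note that the addition example yields five bits for two 4-bit operands, which is why the last column of that row is wider than the inputs.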

Let's come back to the code. The comments in the code of the timer8 module (line 28..) show the SFL syntax skeleton. It is divided into a "facility declaration F1..F4" (lines 33..48) and a "behavior definition part B1..B3" (lines 53..89). The facility declaration contains four subsections:

The behavior definition part can be divided into three subsections B1..B3.

Here is the result 'timer.sim' of the 'timer8.sfl' simulation with the SECONDS script 'timer8.scr':

Again the simulation modules and the facilities of the top module 'timer8' are listed first. The first column 'CLK' shows the clock period. At CLK=1 we set the timer with the initialization value 'INIT'=0x04. As defined in our code, the state machine 'MAIN' is in the 'DOWN' state and the task 'MAIN.RUN' is not running. At power-on, the content of the register 'REMAINED' is unknown. The output signals behave correctly: as specified, the control terminals 'EXPIRE' and 'ENABLE' are driven with '0', whereas 'COUNT' is not driven. From CLK=2..5 the timer is counting down in the 'DOWN' state, and the output 'COUNT' now shows the momentary value. In CLK=6 the count has reached zero and the 'EXPIRE' signal is asserted until the 'RESET' in CLK=9.

CPU

In the last example we build a simple 8-bit MISC (Minimum Instruction Set Computer!?). It is a register-memory architecture with only 16 instructions. We use circuits from the system library and show the concept of communicating state machines. First let's have a look at the full source code.

Code file 'cpu.sfl' (with line numbering)

In line 8 an SRAM memory 'r256_8.h' with 256 cells is included from the PARTHENON system library. As in the C language, the system library is searched if the object is in <brackets>. User-defined includes are marked with "double quotes". Apart from the memory, an 8-bit incrementor 'inc8.h' and a carry look-ahead adder 'cla8.h' are included.

The instruction set is encoded with the '%d' define preprocessor statements in lines 15..30. Instructions with a memory operand (LDAI, LDXI, LDXM, STXM, BC) have their MSB set to '1'. The rest of the encoding is done with the 4 LSBs.
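The encoding rule can be illustrated with a few lines of Python. This is a sketch, not part of 'cpu.sfl': the opcode values are the ones that appear in the example program of this tutorial, and the helper names are hypothetical.

```python
# Opcode values as they appear in the example program of this tutorial.
OPCODES = {"LDXI": 0x83, "LDAX": 0x01, "CLC": 0x08, "ADCX": 0x0B,
           "STAX": 0x02, "SEC": 0x07, "BC": 0x8D}


def has_memory_operand(opcode):
    """MSB set: the opcode byte is followed by a one-byte operand."""
    return (opcode >> 7) & 1 == 1


def instruction_length(opcode):
    """Number of bytes the instruction occupies in memory."""
    return 2 if has_memory_operand(opcode) else 1


def low_nibble(opcode):
    """The 4 LSBs that carry the rest of the encoding."""
    return opcode & 0x0F


for name, code in OPCODES.items():
    print(f"{name:4} 0x{code:02x} length={instruction_length(code)} "
          f"low=0x{low_nibble(code):x}")
```

Of the opcodes listed here, only LDXI and BC have the MSB set, which matches their two-byte entries in the program listing.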

Lines 35..42 declare the 'cpu' module. It has a simple memory interface with two data buses 'dti', 'dto', one address bus 'adrs' and two control signals for read and write operations. Furthermore, the 'cpu' is activated with the 'start' terminal.

The behavior of the 'top' module is defined in lines 47..55. Thanks to the object-oriented SFL syntax, connecting the 'ram' memory and the 'cpu' is a piece of cake. The whole behavior is packed into three 'instruct' statements.

The definition of the 'cpu' module starts with the I/O facilities declaration in line 60. The program counter 'pc' is an 8-bit register with reset 'reg_wr'. At power-on it will be initialized with '0x00', ready to fetch the first instruction. Register 'a' is an accumulator, 'x' the memory address register, 'c' a carry or condition register. The fetched instruction is stored in 'op1', while 'op2' holds the immediate memory operand. The memory data register 'md' is used for the register deferred addressing mode of the ALU instructions ADCX and ANDX.

In lines 81..86 two state machines and their corresponding tasks are declared. The 'if.ift' task controls the instruction fetch and the 'exec.ext' task takes care of proper execution.

With these constructs control can easily be transferred from state machine to state machine, and various kinds of parallelism can be utilized (e.g. leave behind, pipeline, etc.).

Now let's simulate our machine with the PARTHENON simulator SECONDS. The code file is 'cpu.sfl' and the corresponding simulator script is 'cpu.scr'. Here is the result 'cpu.sim' of the following little program which computes the sum '0x12+ 0x34= 0x46':

 
address | contents  | assembler | meaning
--------+-----------+-----------+-------------------
0x00    | 0x83 0xfd | LDXI 0xfd | x <- 0xfd
0x02    | 0x01      | LDAX      | a <- (0xfd)
0x03    | 0x08      | CLC       | c <- 0
0x04    | 0x83 0xfe | LDXI 0xfe | x <- 0xfe
0x06    | 0x0b      | ADCX      | a <- a+ (0xfe)+ c
0x07    | 0x83 0xff | LDXI 0xff | x <- 0xff
0x09    | 0x02      | STAX      | (x) <- a
0x0a    | 0x07      | SEC       | c <- 1
0x0b    | 0x8d 0x0b | BC 0x0b   | if (c) pc <- op2

This code is set in the simulator script 'cpu.scr' with 'memset /ram/cell X00 0X83 0Xfd..'. Data is set with 'memset /ram/cell Xfd 0X12 0X34 0X00'.
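The effect of this little program can be traced with an instruction-level Python sketch. It is not a model of 'cpu.sfl': it ignores the fetch and execute state machines and clock cycles, it only knows the seven opcodes used in the program, SEC is taken to set the carry flag, and ADCX does not update the carry because the listing does not say so. The function names 'load_example' and 'run' are hypothetical.

```python
def load_example():
    """Return a 256-cell memory image holding the example program and its data."""
    memory = [0] * 256
    program = [0x83, 0xFD, 0x01, 0x08, 0x83, 0xFE, 0x0B,
               0x83, 0xFF, 0x02, 0x07, 0x8D, 0x0B]
    memory[0:len(program)] = program
    memory[0xFD:0x100] = [0x12, 0x34, 0x00]   # operands and result cell
    return memory


def run(memory, max_steps=50):
    """Execute until the program branches to itself; returns (a, x, c, pc)."""
    a = x = c = pc = 0
    for _ in range(max_steps):
        op = memory[pc]
        two_bytes = bool(op & 0x80)            # MSB set: operand byte follows
        operand = memory[(pc + 1) & 0xFF] if two_bytes else None
        next_pc = (pc + (2 if two_bytes else 1)) & 0xFF
        if op == 0x83:                         # LDXI: x <- operand
            x = operand
        elif op == 0x01:                       # LDAX: a <- (x)
            a = memory[x]
        elif op == 0x08:                       # CLC: c <- 0
            c = 0
        elif op == 0x0B:                       # ADCX: a <- a + (x) + c
            a = (a + memory[x] + c) & 0xFF     # carry-out not modeled (assumption)
        elif op == 0x02:                       # STAX: (x) <- a
            memory[x] = a
        elif op == 0x07:                       # SEC: c <- 1 (assumed meaning)
            c = 1
        elif op == 0x8D:                       # BC: if (c) pc <- operand
            if c:
                if operand == pc:
                    break                      # branch to itself: endless loop
                next_pc = operand
        else:
            raise ValueError(f"opcode 0x{op:02x} is not part of this sketch")
        pc = next_pc
    return a, x, c, pc


memory = load_example()
a, x, c, pc = run(memory)
print(f"a=0x{a:02x} (0xff)=0x{memory[0xFF]:02x} pc=0x{pc:02x}")
```

The sketch stops as soon as BC branches to its own address, which corresponds to the endless loop at the end of the program.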

After listing the simulated modules and the facilities of the 'top' module, the simulation starts. The first column 'CLK' numbers the clock cycles. After triggering the 'start' control signal in CLK=2, the 'ift' task is activated for two clock cycles: first in the 'fetch1' state (instruction fetch of LDXI) and then in the 'fetch2' state (operand fetch 0xfd). In CLK=4 the task 'ift' finishes and the execution task 'ext' is activated (relay statement). The operand 0xfd is transferred to the memory address register 'x'. If we compare the values in the column 'ift' with the ones in 'ext' during the following clock cycles, we can see the ping-pong control between these two tasks. Only in CLK=14 are both tasks active, because the instruction ADCX is pipelined. While the 'ift' task fetches the next instruction LDXI, the 'ext' task adds the values 'a'= 0x12, 'md'= 0x34, 'c'= 0 and writes the result back into the accumulator register 'a'=0x46. The program ends up in an endless loop after 'CLK'=21.

Further examples

Please try also these code examples:

Conclusion

A quick overview of the SFL hardware description language was given by coding and simulating three examples. In order to compare SFL with other languages have a look at similar tutorials (LOLA, Verilog HDL, VHDL).

The main advantages of SFL are: