Lecture: A Simple Implementation Scheme

Reading materials for this lecture:

  1. Chapter 5.2, Building a Datapath, in the textbook.
  2. Chapter 5.3, A Simple Implementation Scheme, in the textbook.

Section 1 Create a Simple Datapath

A single-cycle processor: every instruction takes exactly one clock
cycle to execute.

Therefore, any device that a single instruction needs more than once
must be duplicated, because no device can be used twice within the
same cycle. However, a device needed by different instructions can be
shared among them.

Formats of the different instructions:

      1. R-format
      -------------------------------------------------------
      |   op   |   rs   |   rt   |   rd   | shamt  | funct  |
      -------------------------------------------------------
        6 bits   5 bits   5 bits   5 bits   5 bits   6 bits

      2. lw and sw
      -------------------------------------------------------
      |   op   |   rs   |   rt   |         address          |
      -------------------------------------------------------
        6 bits   5 bits   5 bits            16 bits

      3. beq
      -------------------------------------------------------
      |   op   |   rs   |   rt   |         address          |
      -------------------------------------------------------
        6 bits   5 bits   5 bits            16 bits
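
The field boundaries above can be written down as a small Verilog sketch. The module name instr_fields and its port names are hypothetical, chosen only for illustration. The bit positions follow the formats above, with op in the top 6 bits.

```verilog
// Hypothetical helper: splits a 32-bit instruction into its fields.
module instr_fields(instr, op, rs, rt, rd, shamt, funct, address);
    input  [31:0] instr;
    output [5:0]  op, funct;
    output [4:0]  rs, rt, rd, shamt;
    output [15:0] address;

    assign op      = instr[31:26];
    assign rs      = instr[25:21];
    assign rt      = instr[20:16];
    assign rd      = instr[15:11];  // meaningful for R-format only
    assign shamt   = instr[10:6];   // meaningful for R-format only
    assign funct   = instr[5:0];    // meaningful for R-format only
    assign address = instr[15:0];   // lw, sw, beq: overlaps rd, shamt, funct
endmodule
```

For example, add $t0, $s1, $s2 is encoded as 32'h02324020, which splits into op = 0, rs = 17, rt = 18, rd = 8, shamt = 0, funct = 6'b100000.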
 
 

What devices can be shared? See Fig. 5.17.
1. Instruction fetch part: PC, instruction memory, adder for (PC + 4)

2. Register File:

    write address:
        a. bits [15:11] for R-format instructions
        b. bits [20:16] for lw instruction

     write data:
         a. ALU result for R-format instructions
         b. data from Data Memory  for lw instruction

Two MUXes are needed to select between these sources. The first MUX
has 5-bit inputs (write address), and the second MUX has 32-bit
inputs (write data).

3. ALU

    The first input to the ALU always comes from Read data 1, the
     register specified by bits [25:21].

     The second input to the ALU:
         a. Read data 2, from the register specified by bits [20:16],
              for R-format instructions and beq.
         b. the sign-extended address field, bits [15:0], for lw or sw.

One MUX is needed to select the source of the second ALU input.

4. Data Memory
     sw instruction writes Data Memory, and lw reads Data Memory
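
The three MUXes described in items 2 and 3 might be sketched as follows. This is only an illustration: the module name shared_muxes and the data port names are hypothetical, and the select inputs borrow the control-signal names RegDst, ALUSrc, and MemtoReg defined in Section 2.

```verilog
// Hypothetical module: the three MUXes that let R-format, lw, sw, and beq
// share the Register File and the ALU.
module shared_muxes(instr, read_data2, alu_result, mem_data,
                    RegDst, ALUSrc, MemtoReg,
                    write_reg, alu_in2, write_data);
    input  [31:0] instr, read_data2, alu_result, mem_data;
    input         RegDst, ALUSrc, MemtoReg;
    output [4:0]  write_reg;
    output [31:0] alu_in2, write_data;
    wire   [31:0] sign_ext;

    // Sign-extend the 16-bit address field to 32 bits.
    assign sign_ext = {{16{instr[15]}}, instr[15:0]};

    // 5-bit MUX: write address is rd (R-format) or rt (lw).
    assign write_reg  = RegDst   ? instr[15:11] : instr[20:16];
    // 32-bit MUX: second ALU input is the offset (lw, sw) or Read data 2.
    assign alu_in2    = ALUSrc   ? sign_ext     : read_data2;
    // 32-bit MUX: write data is memory data (lw) or the ALU result.
    assign write_data = MemtoReg ? mem_data     : alu_result;
endmodule
```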
 
 

What devices must be duplicated?
      We need one adder to calculate (PC + 4), a second adder
       to calculate the branch target address, and one ALU for the
       operation of the instruction itself.
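
A minimal sketch of the two dedicated adders, assuming the usual MIPS definition of the branch target: (PC + 4) plus the sign-extended address field shifted left by 2. The module and port names are hypothetical.

```verilog
// Hypothetical module: the two adders that exist alongside the ALU.
module pc_adders(pc, instr, pc_plus4, branch_target);
    input  [31:0] pc, instr;
    output [31:0] pc_plus4, branch_target;
    wire   [31:0] sign_ext;

    assign sign_ext = {{16{instr[15]}}, instr[15:0]};

    // Adder 1: address of the next sequential instruction.
    assign pc_plus4 = pc + 32'd4;

    // Adder 2: branch target (assumed: word offset, so shift left by 2).
    assign branch_target = pc_plus4 + {sign_ext[29:0], 2'b00};
endmodule
```

Both sums are formed in the same cycle in which the ALU compares the two registers for beq, which is why the ALU cannot be reused for them.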

Fig. 5.11, Fig. 5.12, Fig. 5.13, and Fig. 5.17 show, step by step,
how these parts are put together to build a datapath.

Section 2 Control Signals

Control Unit:
       1. Main Control: generates the 9 bits of control signals, including ALUOp
       2. ALU Control: takes ALUOp from Main Control as input

The Control Unit is divided into two parts, Main Control and ALU
Control, with Main Control providing ALUOp as an input to ALU
Control. This hierarchical design keeps each part simple, and the
smaller control blocks can also be faster.

9-bit control signals:

1. RegDst:     1    select bits [15:11] as the write address (R-format)
               0    select bits [20:16] as the write address (lw)

2. RegWrite:   1    a write operation takes place in the Register File
               0    no write operation takes place in the Register File

3. ALUSrc:     1    sign-extended offset field as the second input to the ALU
               0    Read data 2 as the second input to the ALU

4. MemRead:    1    a read operation takes place in Data Memory
               0    no read operation takes place in Data Memory

5. MemWrite:   1    a write operation takes place in Data Memory
               0    no write operation takes place in Data Memory

6. MemtoReg:   1    select the data from Data Memory
               0    select the ALU result

7. PCSrc:      1    select the branch target address
               0    select (PC + 4)

8. ALUOp:      00   lw or sw
               01   beq
               10   R-format
               11   immediate operations
 

Section 3 ALU Control and the Main Control

Verilog for ALU Control, using nested case statements:

module alucontrol(Oper, opcode, funct, ALUOp);
    input  [1:0] ALUOp;
    input  [5:0] opcode, funct;
    output [2:0] Oper;
    reg    [2:0] Oper;

    always @(opcode or funct or ALUOp)
        case (ALUOp)
            2'b00: Oper = 3'b010;                  // addition for lw or sw
            2'b01: Oper = 3'b110;                  // subtraction for beq
            2'b10:
                begin                              // R-format: decode funct
                    case (funct)
                        6'b100000: Oper = 3'b010;  // add
                        6'b100010: Oper = 3'b110;  // sub
                        6'b100100: Oper = 3'b000;  // and
                        6'b100101: Oper = 3'b001;  // or
                        6'b101010: Oper = 3'b111;  // slt
                        default:   Oper = 3'bxxx;  // unsupported funct; avoids an inferred latch
                    endcase
                end
            2'b11:
                begin                              // immediate operations: decode opcode
                    case (opcode)
                        ...
                    endcase
                end
        endcase
endmodule
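
To show how the Oper codes are consumed, here is a sketch of an ALU that accepts them. It is not taken from the lecture: the module name alu, its ports, the signed comparison for slt, and the zero default are all assumptions.

```verilog
// Hypothetical ALU driven by the 3-bit Oper code from alucontrol.
module alu(Oper, a, b, result, Zero);
    input  [2:0]  Oper;
    input  [31:0] a, b;
    output [31:0] result;
    output        Zero;
    reg    [31:0] result;

    always @(Oper or a or b)
        case (Oper)
            3'b000:  result = a & b;                                   // and
            3'b001:  result = a | b;                                   // or
            3'b010:  result = a + b;                                   // add, lw, sw
            3'b110:  result = a - b;                                   // sub, beq
            3'b111:  result = ($signed(a) < $signed(b)) ? 32'd1 : 32'd0; // slt (assumed signed)
            default: result = 32'd0;                                   // assumed: unused codes give 0
        endcase

    // Zero is high when the result is 0, e.g. when beq compares equal registers.
    assign Zero = (result == 32'd0);
endmodule
```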
 

Verilog for Main Control: similar to ALU Control, it uses a case
statement on the input opcode.

Values of the control signals for the different instructions
(x = don't care):

                 R-format     lw     sw     beq
RegDst              1         0      x      x
RegWrite            1         1      0      0
ALUSrc              0         1      1      0
MemRead             0         1      0      0
MemWrite            0         0      1      0
MemtoReg            0         1      x      x
Branch              0         0      0      1

Notice that PCSrc is a derived signal: Main Control generates Branch,
and PCSrc = Branch & Zero, where Zero is the zero output of the ALU.
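
A possible Main Control following the table above is sketched here. The module name maincontrol, the internal bundle ctl, and the all-zero default are assumptions. The opcode values are the standard MIPS encodings (R-format 000000, lw 100011, sw 101011, beq 000100), and the ALUOp values come from Section 2.

```verilog
// Hypothetical Main Control: one case statement on the opcode.
module maincontrol(opcode, RegDst, RegWrite, ALUSrc, MemRead,
                   MemWrite, MemtoReg, Branch, ALUOp);
    input  [5:0] opcode;
    output       RegDst, RegWrite, ALUSrc, MemRead, MemWrite, MemtoReg, Branch;
    output [1:0] ALUOp;
    reg    [8:0] ctl;   // all 9 control bits, in the port order above

    assign {RegDst, RegWrite, ALUSrc, MemRead,
            MemWrite, MemtoReg, Branch, ALUOp} = ctl;

    always @(opcode)
        case (opcode)
            6'b000000: ctl = 9'b1_1_0_0_0_0_0_10;  // R-format
            6'b100011: ctl = 9'b0_1_1_1_0_1_0_00;  // lw
            6'b101011: ctl = 9'bx_0_1_0_1_x_0_00;  // sw
            6'b000100: ctl = 9'bx_0_0_0_0_x_1_01;  // beq
            default:   ctl = 9'b0_0_0_0_0_0_0_00;  // assumed: no writes, no branch
        endcase
endmodule
```

In the datapath, the PC MUX select would then be formed outside this module, for example with assign PCSrc = Branch & Zero;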