How to Design a Simple CPU (Using VHDL)

Project download: CPU.zip

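Preface

Is it really possible to design your own CPU? Isn't that a closely guarded industrial secret that no individual could reproduce?

Twenty years ago that may have been true, but not anymore. With VHDL and an FPGA, a CPU can be written as a program. The CPU, its memory, and its test cases can all be simulated on your PC and then programmed into an FPGA, giving you a working FPGA CPU.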

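Origin

While teaching Computer Organization at Kinmen Institute of Technology (金門技術學院), I developed a strong interest in designing a CPU of my own. Perhaps it is because I love programming: whenever I see something that can be built by writing code, I want to write one myself, and with FPGAs a CPU has effectively become a kind of software.

It may also have been a guilty conscience: a teacher of computer architecture who has never designed a CPU is not very convincing. So I spent three weeks repeatedly designing and revising my own CPU in Altera Quartus, but I could not get it to work.

In the end I looked online for tutorials on CPU design. There were many, but few explained both the code and the principles clearly. William D. Richard of Washington University, in his course Introduction To Digital Logic And Computer Design, explains the principles of CPU design very clearly and then implements a CPU in 200 lines of VHDL. He adds a memory to build a complete test computer, 315 lines of VHDL in total, documents the design details in his slides, and shows the simulation results. It is an impressive piece of work.

The rest of this article analyzes the design and operation of that CPU in depth. We will call it Richard-1 (理察一號).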

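Design of the Richard-1 CPU

Registers

Richard-1 is a 16-bit CPU with a simple accumulator-based architecture built around ACC. It has four registers (PC, iReg, IAR, ACC), two buses (the address bus aBus and the data bus dBus), and several control signals: the clock clk, the reset line reset, the memory enable m_en, and the memory read/write select m_rw. The architecture is shown in the figure below:

Richard_CPU_Architecture.JPG

Based on this architecture, the VHDL entity and the signal declarations of its architecture are as follows: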

entity cpu is port (
        clk, reset     : in  std_logic;
        m_en, m_rw    : out std_logic;        
    aBus        : out std_logic_vector(adrLength-1 downto 0);
    dBus        : inout std_logic_vector(wordSize-1 downto 0);
    -- these signals "exported" so they can be monitored in post-P&R simulation
    pcX, iarX    : out std_logic_vector(adrLength-1 downto 0);
    iregX, accX, aluX    : out std_logic_vector(wordSize-1 downto 0));
end cpu;

architecture cpuArch of cpu is
type state_type is (
    reset_state, fetch, halt, negate, mload, dload, iload,
    dstore, istore, branch, brZero, brPos, brNeg, add
);
signal state: state_type;

type tick_type is (t0, t1, t2, t3, t4, t5, t6, t7);
signal tick: tick_type;

signal pc:     std_logic_vector(adrLength-1 downto 0); -- program counter
signal iReg:     std_logic_vector(wordSize-1 downto 0); -- instruction register
signal iar:     std_logic_vector(adrLength-1 downto 0); -- indirect address register
signal acc:     std_logic_vector(wordSize-1 downto 0); -- accumulator
signal alu:     std_logic_vector(wordSize-1 downto 0); -- alu output

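Instruction Set

Designing a CPU starts with the instruction set. Richard-1 has 12 instructions, each of which corresponds to one state in the state_type declaration above (the two remaining states, reset_state and fetch, are not instructions). Every instruction is 16 bits long: the upper 4 bits usually hold the opcode and the lower 12 bits hold a memory address. The instruction is held in the instruction register in the following format:

  IR(15..12) : opcode
  IR(11..0)  : operand

The Richard-1 instructions are listed below: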

<csv>
Code, Instruction, Description, Operation
0000, halt, halt execution, (none)
0001, negate, negation, ACC := -ACC
1xxx, load, immediate load, if sign bit of xxx is 0 then ACC := 0xxx else ACC := fxxx
2xxx, dload, direct load, ACC := M[0xxx]
3xxx, iload, indirect load, ACC := M[M[0xxx]]
4xxx, dstore, direct store, M[0xxx] := ACC
5xxx, istore, indirect store, M[M[0xxx]] := ACC
6xxx, br, branch, PC := 0xxx
7xxx, brZero, branch if zero, if ACC = 0 then PC := 0xxx
8xxx, brPos, branch if positive, if ACC > 0 then PC := 0xxx
9xxx, brNeg, branch if negative, if ACC < 0 then PC := 0xxx
axxx, add, addition, ACC := ACC + M[0xxx]
</csv>

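In the table above, halt and negate take no operand, so they are encoded as the hexadecimal words 0000 and 0001. Every other instruction takes one operand: the upper 4 bits are the opcode and the lower 12 bits are the operand, which is a memory address except for load, where it is the immediate value itself. These instructions fall into four groups: loads (load, dload, iload), stores (dstore, istore), branches (br, brZero, brPos, brNeg), and a single arithmetic instruction, add. In the VHDL source, load and br correspond to the states mload and branch.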
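
As a quick check of the instruction format, the following Python sketch splits a 16-bit instruction word into its opcode and operand fields and applies the sign extension that the immediate load performs. The function names `split_fields` and `sign_extend_12` are illustrative and do not appear in the VHDL project.

```python
def split_fields(word):
    """Return (opcode, operand) for a 16-bit instruction word."""
    opcode = (word >> 12) & 0xF      # IR(15..12)
    operand = word & 0x0FFF          # IR(11..0)
    return opcode, operand


def sign_extend_12(operand):
    """Extend a 12-bit operand to 16 bits, as the immediate load does."""
    if operand & 0x800:              # bit 11 is the sign bit
        return 0xF000 | operand
    return operand


print(split_fields(0x2010))          # (2, 16): dload from address 0x010
print(hex(sign_extend_12(0xA0F)))    # 0xfa0f
print(hex(sign_extend_12(0x400)))    # 0x400
```

The two values passed to `sign_extend_12` are the operands of the immediate loads `1a0f` and `1400` used in the test program later in this article.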

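Arithmetic Logic Unit

The ALU is a single concurrent assignment: it outputs the two's complement of ACC in the negate state, ACC + dBus in the add state, and zero in every other state.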

    alu <=     (not acc) + x"0001"     when state = negate else
        acc + dbus         when state = add else
        (alu'range => '0');
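
The behaviour of this assignment can be modelled in a few lines of Python. This is only a behavioural sketch: the function name `alu` and the use of strings for the state are assumptions made for illustration, and 16-bit wraparound is modelled with a mask.

```python
MASK = 0xFFFF  # 16-bit word


def alu(state, acc, dbus):
    """Combinational ALU output for the current control state."""
    if state == "negate":
        return ((~acc) + 1) & MASK   # two's complement: invert, then add 1
    if state == "add":
        return (acc + dbus) & MASK   # carry out of bit 15 is discarded
    return 0                         # every other state yields all zeros


print(hex(alu("negate", 0x0001, 0)))     # 0xffff, i.e. -1
print(hex(alu("add", 0xFFFF, 0x0001)))   # 0x0
print(hex(alu("negate", 0x8000, 0)))     # 0x8000
```

The last line shows the usual two's complement corner case: the most negative value, 0x8000, negates to itself.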

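Control Unit

Instruction Timing

A Richard-1 instruction takes at most 8 time steps to execute. In the code, t0, t1, …, t7 indicate the current time step, and the nextTick function returns the step that follows the current one.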

    function nextTick(tick: tick_type) return tick_type is begin
        -- return next logical value for tick
        case tick is
        when t0 => return t1; when t1 => return t2; when t2 => return t3;
        when t3 => return t4; when t4 => return t5; when t5 => return t6;
        when t6 => return t7; when others => return t0;
        end case;
    end function nextTick;

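Decode Procedure

The decode procedure examines the instruction register to determine which instruction it holds and records the result in the state signal. Its code is shown below.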

    procedure decode is begin
        -- Instruction decoding.
        case iReg(15 downto 12) is
        when x"0" =>
            if iReg(11 downto 0) = x"000" then 
                state <= halt;
            elsif iReg(11 downto 0) = x"001" then 
                state <= negate;
            end if;
        when x"1" =>     state <= mload;
        when x"2" =>     state <= dload;
        when x"3" =>     state <= iload;
        when x"4" =>     state <= dstore;
        when x"5" =>     state <= istore;
        when x"6" =>     state <= branch;
        when x"7" =>     state <= brZero;    
        when x"8" =>     state <= brPos;
        when x"9" =>     state <= brNeg;
        when x"a" =>     state <= add;
        when others => state <= halt;
        end case;
    end procedure decode;
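
To make the mapping concrete, here is a Python sketch that mirrors the case statement. The names `OPCODE_TO_STATE` and `decode` and the string state names are illustrative. One detail worth noticing is modelled explicitly: when the opcode is 0 and the operand is neither 000 nor 001, the VHDL assigns nothing, so the state signal keeps its current value.

```python
OPCODE_TO_STATE = {
    0x1: "mload", 0x2: "dload", 0x3: "iload", 0x4: "dstore",
    0x5: "istore", 0x6: "branch", 0x7: "brZero", 0x8: "brPos",
    0x9: "brNeg", 0xA: "add",
}


def decode(ireg, state="fetch"):
    """Return the next state for instruction word `ireg`."""
    opcode, operand = (ireg >> 12) & 0xF, ireg & 0x0FFF
    if opcode == 0x0:
        if operand == 0x000:
            return "halt"
        if operand == 0x001:
            return "negate"
        return state                 # no assignment: state is unchanged
    # opcodes b..f take the "when others" branch and halt
    return OPCODE_TO_STATE.get(opcode, "halt")


for word in (0x0000, 0x0001, 0x1A0F, 0xA003, 0xB000, 0x0004):
    print(f"{word:04x} -> {decode(word)}")
```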

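Detailed Breakdown of Instruction Steps

To design a CPU's control unit, each instruction must be carefully broken down into a sequence of data-transfer steps. To improve performance, Richard-1 acts on both the rising and the falling edges of the clock, but this splits the control code across two processes and makes it hard to follow. To see the control logic clearly, the rising-edge and falling-edge actions have to be put back together. Below is the result of reorganizing the code by state, with the actions in each state numbered 1, 2, 3, … in execution order.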

reset_state
  state <= fetch; tick <= t0;

fetch
  -- rising edge
  if tick = t1 then iReg <= dBus; end if;        -- 2. get instruction from data bus into iReg
  if tick = t2 then                     -- 4. decode and PC++, go to next state
    decode; pc <= pc + '1'; tick <= t0;
  end if;
  -- falling edge
  if tick = t0 then m_en <= '1'; aBus <= pc; end if;     -- 1. put PC onto address bus, enable memory to read the instruction
  if tick = t2 then m_en <= '0'; aBus <= (aBus'range => '0'); end if; -- 3. clear address bus

halt -- halt execution
  -- rising edge
  tick <= t0; -- do nothing

negate --  ACC := -ACC
  -- rising edge
  acc <= alu;    wrapup;                -- 1. acc <= alu <= -acc;

-- load instructions
mload
  -- rising edge
  if iReg(11) = '0' then -- sign extension            
    acc <= x"0" & ireg(11 downto 0);             -- 1. acc <= 0xxx
  else
    acc <= x"f" & ireg(11 downto 0);            -- 1. acc <= fxxx
  end if;
  wrapup;

dload m -- ACC := M[0xxx]
  -- rising edge
  if tick = t1 then acc <= dBus; end if;        -- 2. load data bus into ACC
  if tick = t2 then wrapup; end if;            -- 4. end instruction

  -- falling edge
  if tick = t0 then                     -- 1. Put IR(11..0) onto address bus.
     m_en <= '1'; aBus <= x"0" & iReg(11 downto 0); 
  end if;
  if tick = t2 then                     -- 3. clear address bus 
     m_en <= '0'; aBus <= (aBus'range => '0'); 
  end if;

iload m    -- ACC := M[M[0xxx]]
  -- rising edge
  if tick = t1 then iar <= dBus; end if;        -- 2. load data bus into IAR
  if tick = t4 then acc <= dBus; end if;        -- 5. load data bus into ACC
  if tick = t5 then wrapup; end if;            -- 7. end instruction

  -- falling edge
  if tick = t0 then                     -- 1. Put IR(11..0) onto address bus.
    m_en <= '1'; aBus <= x"0" & iReg(11 downto 0); 
  end if;
  if tick = t2 then                     -- 3. Clear Address Bus.
    m_en <= '0'; aBus <= (aBus'range => '0'); 
  end if;        
  if tick = t3 then                     -- 4. Put IAR onto address
    m_en <= '1'; aBus <= iar; 
  end if;
  if tick = t5 then                     -- 6. clear address bus.
    m_en <= '0'; aBus <= (abus'range => '0'); 
  end if;

-- store instructions                  
dstore -- M[0xxx] := ACC
  -- rising edge
  if tick = t4 then wrapup; end if;            -- 5. end instruction

  -- falling edge
  if tick = t0 then                     -- 1. Put IR(11..0) onto address bus.
    m_en <= '1'; aBus <= x"0" & iReg(11 downto 0); 
  end if;
  if tick = t1 then                     -- 2. Put ACC onto data bus, start the write (m_rw = '0').
    m_rw <= '0'; dBus <= acc; 
  end if;
  if tick = t3 then                     -- 3. end the memory write (m_rw back to '1' = read)
    m_rw <= '1'; 
  end if;
  if tick = t4 then                      -- 4. clear address bus, disable data bus.
    m_en <= '0'; aBus <= (abus'range => '0'); dBus <= (dBus'range => 'Z');
  end if;

istore -- M[M[0xxx]] := ACC
  -- rising edge
  if tick = t1 then iar <= dBus; end if;        -- 2. load databus (M[0xxx]) into IAR
  if tick = t7 then wrapup; end if;            -- 8. end instruction.

  -- falling edge
  if tick = t0 then                     -- 1. Put IR(11..0) [0xxx] onto address bus.
     m_en <= '1'; aBus <= x"0" & iReg(11 downto 0); 
  end if;
  if tick = t2 then                     -- 3. clear address bus.
     m_en <= '0'; aBus <= (aBus'range => '0'); 
  end if;
  if tick = t3 then m_en <= '1'; aBus <= iar; end if;    -- 4. Put IAR onto address bus.
  if tick = t4 then m_rw <= '0'; dBus <= acc; end if;    -- 5. Put ACC onto data bus.
  if tick = t6 then m_rw <= '1'; end if;        -- 6. end the memory write (m_rw back to '1' = read).
  if tick = t7 then                      -- 7. clear address bus, disable data bus.
    m_en <= '0'; aBus <= (abus'range => '0'); dBus <= (dBus'range => 'Z');
  end if;

-- branch instructions
branch -- PC := 0xxx
  -- rising edge
  pc <= x"0" & iReg(11 downto 0);            -- 1. PC <= IR(11..0) 
  wrapup;

  -- falling edge

brZero -- if ACC = 0 then PC := 0xxx
  -- rising edge
  if acc = x"0000" then                    -- if (ACC=0) PC <= IR(11..0)
    pc <= x"0" & iReg(11 downto 0);    
  end if;
  wrapup;

  -- falling edge

brPos -- if ACC > 0 then PC := 0xxx
  -- rising edge
  if acc(15) = '0' and acc /= x"0000" then         -- if (ACC > 0) PC <= IR(11..0)
    pc <= x"0" & iReg(11 downto 0);
  end if;
  wrapup;

  -- falling edge

brNeg -- if ACC < 0 then PC := 0xxx            -- if (ACC < 0) PC <= IR(11..0)
  -- rising edge
  if acc(15) = '1' then pc <= x"0" & iReg(11 downto 0);    end if;
  wrapup;

  -- falling edge

add -- ACC := ACC + M[0xxx]
  -- rising edge
  if tick = t1 then acc <= alu; end if;            -- 2. ACC <= ALU <= ACC+data bus
  if tick = t2 then wrapup; end if;            -- 4. end instruction.

  -- falling edge
  if tick = t0 then                     -- 1. Put IR(11..0) onto address bus.
     m_en <= '1'; aBus <= x"0" & iReg(11 downto 0); 
  end if;
  if tick = t2 then                     -- 3. clear address bus.
     m_en <= '0'; aBus <= (aBus'range => '0'); 
  end if;
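
The breakdown above also gives a rough cycle count for each instruction. The sketch below is derived from the tick at which each state calls wrapup, assuming one rising clock edge per tick and ignoring the single reset_state cycle. The table `EXECUTE_CYCLES` and the function `total_cycles` are illustrative names, and the numbers are a reading of the listing, not figures published by the author.

```python
FETCH_CYCLES = 3   # fetch spends ticks t0, t1, t2 before decoding

# Rising edges spent in each execute state: wrapup at tick tN means N + 1 edges.
EXECUTE_CYCLES = {
    "negate": 1, "mload": 1,
    "branch": 1, "brZero": 1, "brPos": 1, "brNeg": 1,
    "dload": 3, "add": 3,
    "dstore": 5, "iload": 6, "istore": 8,
}


def total_cycles(states):
    """Clock cycles needed to fetch and execute the given sequence of states."""
    return sum(FETCH_CYCLES + EXECUTE_CYCLES[s] for s in states)


print(total_cycles(["add", "branch"]))   # 10
print(total_cycles(["istore"]))          # 11
```

Under these assumptions, the slowest instruction, istore, takes 11 clock cycles including its fetch, while a register-only instruction such as negate takes 4.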

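Memory

Richard-1 is a CPU, so strictly speaking its design need not include memory. Without memory, however, the CPU is hard to test thoroughly, so Richard added a memory module, RAM.VHD, to his VHDL code. On reset it is loaded with a short test program, which makes it possible to test the CPU completely in a VHDL simulation tool such as Altera Quartus II.

Functional Simulation

The program that Richard originally placed in RAM.VHD exercises every instruction once: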

                  -- basic instruction check
                ram(0)  <= x"1a0f"; -- immediate load
                ram(1)  <= x"2010"; -- direct load
                ram(2)  <= x"3030"; -- indirect load
                ram(3)  <= x"4034"; -- direct store
                ram(4)  <= x"0001"; -- negate
                ram(5)  <= x"2034"; -- direct load
                ram(6)  <= x"0001"; -- negate
                ram(7)  <= x"5032"; -- indirect store
                ram(8)  <= x"0001"; -- negate
                ram(9)  <= x"1fff"; -- immediate load
                ram(10) <= x"a008"; -- add
                ram(11) <= x"700d"; -- brZero
                ram(12) <= x"0000"; -- halt
                ram(13) <= x"1400"; -- immediate load
                ram(14) <= x"8010"; -- brPos
                ram(15) <= x"0000"; -- halt
                ram(16) <= x"0001"; -- negate
                ram(17) <= x"9013"; -- brNeg
                ram(18) <= x"0000"; -- halt
                ram(19) <= x"6015"; -- branch
                ram(20) <= x"0000"; -- halt
                ram(21) <= x"8014"; -- brPos
                ram(22) <= x"7014"; -- brZero
                ram(23) <= x"0001"; -- negate
                ram(24) <= x"9014"; -- brNeg
                ram(25) <= x"0000"; -- halt
                ram(48) <= x"0031"; -- pointer for iload
                ram(49) <= x"5af0"; -- target of iload
                ram(50) <= x"0033"; -- pointer for istore
                ram(51) <= x"0000"; -- target of istore
                ram(52) <= x"f5af"; -- target of dstore

For each instruction, the author summarizes the key points of the simulation in the following figures:

Richard_CPU_Signal_Timing1.JPG
Richard_CPU_Signal_Timing2.JPG

He then illustrates the simulation timing of the ADD instruction in detail.

Richard_CPU_Instruction_Add.JPG

Finally, he presents the actual functional simulation results.

Richard_CPU_Processor_Simulation1.JPG
Richard_CPU_Processor_Simulation2.JPG

However, I wanted a complete small program to test with, so I wrote the following program, add.asm:

000        dload    sum        -- 2003
001    loop:    add    sum        -- A003
002        br    loop        -- 6001
003    sum    word    4        -- 0004

I then replaced the author's test program in RAM.VHD with the VHDL equivalent of this program:

                ram(0)  <= x"2003"; -- dload     sum
                ram(1)  <= x"A003"; -- add         sum
                ram(2)  <= x"6001"; -- br         loop
                ram(3)  <= x"0004"; -- sum        word    4

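I then ran the simulation in Altera Quartus II; the result is shown below. Watch the accumulator output accX: its value goes 0, 4, 8, 12, … and keeps accumulating, so the program does execute correctly.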

Richard_CPU_Ccc_Add_Prog_Simulation.JPG
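
The same sequence can be cross-checked without a VHDL simulator. The following Python sketch is an instruction-level model written for this article, not part of the original project: it ignores ticks and bus signals and only applies the "Operation" column of the instruction table. The names `run`, `MASK` and `RAM_WORDS` are illustrative. The program is the four-word loop above, with the branch encoded as 6001 according to the instruction table.

```python
MASK = 0xFFFF
RAM_WORDS = 64          # ram.vhd keeps 2**6 words


def run(program, steps):
    """Execute up to `steps` instructions; return ACC after each one."""
    ram = list(program) + [0] * (RAM_WORDS - len(program))
    pc, acc, trace = 0, 0, []
    for _ in range(steps):
        ireg = ram[pc % RAM_WORDS]                   # fetch
        pc = (pc + 1) & MASK
        op, x = ireg >> 12, ireg & 0x0FFF            # decode
        cell = x % RAM_WORDS
        if ireg == 0x0000 or op > 0xA:               # halt (or undefined opcode)
            break
        if ireg == 0x0001:                           # negate
            acc = (-acc) & MASK
        elif op == 0x1:                              # load (sign-extended immediate)
            acc = (x | 0xF000) if x & 0x800 else x
        elif op == 0x2:                              # dload
            acc = ram[cell]
        elif op == 0x3:                              # iload
            acc = ram[ram[cell] % RAM_WORDS]
        elif op == 0x4:                              # dstore
            ram[cell] = acc
        elif op == 0x5:                              # istore
            ram[ram[cell] % RAM_WORDS] = acc
        elif op == 0x6:                              # br
            pc = x
        elif op == 0x7 and acc == 0:                 # brZero, taken
            pc = x
        elif op == 0x8 and 0 < acc < 0x8000:         # brPos, taken
            pc = x
        elif op == 0x9 and acc >= 0x8000:            # brNeg, taken
            pc = x
        elif op == 0xA:                              # add
            acc = (acc + ram[cell]) & MASK
        trace.append(acc)
    return trace


loop_program = [0x2003, 0xA003, 0x6001, 0x0004]
print(run(loop_program, 7))   # [4, 8, 8, 12, 12, 16, 16]
```

Each value appears twice because the branch leaves ACC unchanged. The distinct values 4, 8, 12, 16 match the accX sequence described above, after its initial 0 at reset.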

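Conclusion

With hardware description languages such as VHDL, hardware design is no longer an industrial secret. This article used the CPU designed by William D. Richard of Washington University to explain the principles of CPU design in detail. The complete source code is given in the appendix below; reading the VHDL carefully is a good way to become proficient in CPU design.

Appendix: Complete Source Code of the Richard-1 CPU

Shared constant declarations: commonConstant.vhd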

-- commonConstant.vhd --
package commonConstants is
    constant wordSize: integer := 16;
    constant adrLength: integer := 16;
end package commonConstants;

Central processing unit: cpu.vhd

-- cpu.vhd --
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;
use IEEE.std_logic_unsigned.all;
use work.commonConstants.all;

entity cpu is port (
    clk, reset     : in  std_logic;
    m_en, m_rw    : out std_logic;        
    aBus        : out std_logic_vector(adrLength-1 downto 0);
    dBus        : inout std_logic_vector(wordSize-1 downto 0);
    -- these signals "exported" so they can be monitored in post-P&R simulation
    pcX, iarX    : out std_logic_vector(adrLength-1 downto 0);
    iregX, accX, aluX    : out std_logic_vector(wordSize-1 downto 0));
end cpu;

architecture cpuArch of cpu is
type state_type is (
    reset_state, fetch, halt, negate, mload, dload, iload,
    dstore, istore, branch, brZero, brPos, brNeg, add
);
signal state: state_type;

type tick_type is (t0, t1, t2, t3, t4, t5, t6, t7);
signal tick: tick_type;

signal pc:     std_logic_vector(adrLength-1 downto 0); -- program counter
signal iReg:     std_logic_vector(wordSize-1 downto 0); -- instruction register
signal iar:     std_logic_vector(adrLength-1 downto 0); -- indirect address register
signal acc:     std_logic_vector(wordSize-1 downto 0); -- accumulator
signal alu:     std_logic_vector(wordSize-1 downto 0); -- alu output

begin
    alu <=     (not acc) + x"0001" when state = negate else
                acc + dbus when state = add else
                (alu'range => '0');
    pcX <= pc; iregX <= ireg; iarX <= iar; accX <= acc; aluX <= alu;

    process(clk) -- perform actions that occur on rising clock edges

    function nextTick(tick: tick_type) return tick_type is begin
        -- return next logical value for tick
        case tick is
        when t0 => return t1; when t1 => return t2; when t2 => return t3;
        when t3 => return t4; when t4 => return t5; when t5 => return t6;
        when t6 => return t7; when others => return t0;
        end case;
    end function nextTick;

    procedure decode is begin
        -- Instruction decoding.
        case iReg(15 downto 12) is
        when x"0" =>
            if iReg(11 downto 0) = x"000" then 
                state <= halt;
            elsif iReg(11 downto 0) = x"001" then 
                state <= negate;
            end if;
        when x"1" =>     state <= mload;
        when x"2" =>     state <= dload;
        when x"3" =>     state <= iload;
        when x"4" =>     state <= dstore;
        when x"5" =>     state <= istore;
        when x"6" =>     state <= branch;
        when x"7" =>     state <= brZero;    
        when x"8" =>     state <= brPos;
        when x"9" =>     state <= brNeg;
        when x"a" =>     state <= add;
        when others => state <= halt;
        end case;
    end procedure decode;

    procedure wrapup is begin
        -- Do this at end of every instruction
        state <= fetch; tick <= t0;
    end procedure wrapup;

    begin
          if clk'event and clk = '1' then
              if reset = '1' then 
                state <= reset_state; tick <= t0;
                pc <= (pc'range => '0'); iReg <= (iReg'range => '0');
                acc <= (acc'range => '0'); iar <= (iar'range => '0');
                else
                tick <= nextTick(tick) ; -- advance time by default
                case state is
                when reset_state => state <= fetch; tick <= t0;

                when fetch =>     if tick = t1 then iReg <= dBus; end if;
                                    if tick = t2 then 
                                        decode; pc <= pc + '1'; tick <= t0;
                                    end if;
                when halt => tick <= t0; -- do nothing

                when negate => acc <= alu;    wrapup;

                -- load instructions
                when mload => 
                    if iReg(11) = '0' then -- sign extension
                        acc <= x"0" & ireg(11 downto 0); 
                    else
                        acc <= x"f" & ireg(11 downto 0);
                    end if;
                    wrapup;

                when dload =>
                    if tick = t1 then acc <= dBus; end if;
                    if tick = t2 then wrapup; end if;

                when iload =>
                    if tick = t1 then iar <= dBus; end if;
                    if tick = t4 then acc <= dBus; end if;
                    if tick = t5 then wrapup; end if;

                -- store instructions                  
                when dstore =>
                    if tick = t4 then wrapup; end if;

                when istore =>
                    if tick = t1 then iar <= dBus; end if;
                    if tick = t7 then wrapup; end if;

                -- branch instructions
                when branch => 
                    pc <= x"0" & iReg(11 downto 0);
                    wrapup;
                when brZero => 
                    if acc = x"0000" then pc <= x"0" & iReg(11 downto 0);    end if;
                    wrapup;
                when brPos => 
                    if acc(15) = '0' and acc /= x"0000" then 
                        pc <= x"0" & iReg(11 downto 0);
                    end if;
                    wrapup;
                when brNeg => 
                    if acc(15) = '1' then pc <= x"0" & iReg(11 downto 0);    end if;
                    wrapup;

                -- arithmetic instructions
                when add =>
                    if tick = t1 then acc <= alu; end if;
                    if tick = t2 then wrapup; end if;

                when others => state <= halt;
                end case;
            end if;
          end if;
    end process;

    process(clk) begin -- perform actions that occur on falling clock edges
        if clk'event and clk ='0' then
            if reset = '1' then
                m_en <= '0'; m_rw <= '1';
                aBus <= (aBus'range => '0'); dBus <= (dBus'range => 'Z');
            else
                case state is

                when fetch =>
                    if tick = t0 then m_en <= '1'; aBus <= pc; end if;
                    if tick = t2 then m_en <= '0'; aBus <= (aBus'range => '0'); end if;

                  when dload =>
                    if tick = t0 then m_en <= '1'; aBus <= x"0" & iReg(11 downto 0); end if;
                    if tick = t2 then m_en <= '0'; aBus <= (aBus'range => '0'); end if;

                  when iload =>
                    if tick = t0 then m_en <= '1'; aBus <= x"0" & iReg(11 downto 0); end if;
                    if tick = t2 then m_en <= '0'; aBus <= (aBus'range => '0'); end if;
                    if tick = t3 then m_en <= '1'; aBus <= iar; end if;
                    if tick = t5 then m_en <= '0'; aBus <= (abus'range => '0'); end if;

                  when dstore =>
                    if tick = t0 then m_en <= '1'; aBus <= x"0" & iReg(11 downto 0); end if;
                    if tick = t1 then m_rw <= '0'; dBus <= acc; end if;
                    if tick = t3 then m_rw <= '1'; end if;
                    if tick = t4 then 
                        m_en <= '0'; aBus <= (abus'range => '0'); dBus <= (dBus'range => 'Z'); 
                    end if;

                when istore =>
                    if tick = t0 then m_en <= '1'; aBus <= x"0" & iReg(11 downto 0); end if;
                    if tick = t2 then m_en <= '0'; aBus <= (aBus'range => '0'); end if;
                    if tick = t3 then m_en <= '1'; aBus <= iar; end if;
                    if tick = t4 then m_rw <= '0'; dBus <= acc; end if;
                    if tick = t6 then m_rw <= '1'; end if;
                    if tick = t7 then 
                        m_en <= '0'; aBus <= (abus'range => '0'); dBus <= (dBus'range => 'Z'); 
                    end if;

                when add =>
                    if tick = t0 then m_en <= '1'; aBus <= x"0" & iReg(11 downto 0); end if;
                    if tick = t2 then m_en <= '0'; aBus <= (aBus'range => '0'); end if;

                when others => -- do nothing
                end case;
            end if;    
        end if;                    
    end process;
end cpuArch;

Test memory, including the test program: ram.vhd

-- ram.vhd --
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;
use work.commonConstants.all;

entity ram is port (
        reset, en, r_w: in STD_LOGIC;
        aBus: in STD_LOGIC_VECTOR(adrLength-1 downto 0);
        dBus: inout STD_LOGIC_VECTOR(wordSize-1 downto 0));
end ram;

architecture ramArch of ram is
constant resAdrLength: integer := 6; -- address length restricted within architecture
constant memSize: integer := 2**resAdrLength;
type ram_typ is array(0 to memSize-1) of STD_LOGIC_VECTOR(wordSize-1 downto 0);
signal ram: ram_typ;
begin
    process(reset, en, r_w, aBus, dBus) begin
          if reset = '1' then
                  -- basic instruction check
                ram(0)  <= x"1a0f"; -- immediate load
                ram(1)  <= x"2010"; -- direct load
                ram(2)  <= x"3030"; -- indirect load
                ram(3)  <= x"4034"; -- direct store
                ram(4)  <= x"0001"; -- negate
                ram(5)  <= x"2034"; -- direct load
                ram(6)  <= x"0001"; -- negate
                ram(7)  <= x"5032"; -- indirect store
                ram(8)  <= x"0001"; -- negate
                ram(9)  <= x"1fff"; -- immediate load
                ram(10) <= x"a008"; -- add
                ram(11) <= x"700d"; -- brZero
                ram(12) <= x"0000"; -- halt
                ram(13) <= x"1400"; -- immediate load
                ram(14) <= x"8010"; -- brPos
                ram(15) <= x"0000"; -- halt
                ram(16) <= x"0001"; -- negate
                ram(17) <= x"9013"; -- brNeg
                ram(18) <= x"0000"; -- halt
                ram(19) <= x"6015"; -- branch
                ram(20) <= x"0000"; -- halt
                ram(21) <= x"8014"; -- brPos
                ram(22) <= x"7014"; -- brZero
                ram(23) <= x"0001"; -- negate
                ram(24) <= x"9014"; -- brNeg
                ram(25) <= x"0000"; -- halt
                ram(48) <= x"0031"; -- pointer for iload
                ram(49) <= x"5af0"; -- target of iload
                ram(50) <= x"0033"; -- pointer for istore
                ram(51) <= x"0000"; -- target of istore
                ram(52) <= x"f5af"; -- target of dstore
        elsif en = '1' and r_w = '0' then
              ram(conv_integer(unsigned(aBus(resAdrLength-1 downto 0)))) <= dBus;
        end if;
    end process;
    dBus <= ram(conv_integer(unsigned(aBus(resAdrLength-1 downto 0))))
              when reset = '0' and en = '1' and r_w = '1' else
            (dbus'range => 'Z');
end ramArch;
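
A point that is easy to miss in ram.vhd is that only the low 6 bits of aBus select a cell, so the 64 words repeat throughout the 16-bit address space. The Python sketch below models the read and write behaviour under stated simplifications: the class name `Ram` and method `access` are illustrative, `None` stands in for the high-impedance value 'Z', the reset input is left out, and cells start at 0 rather than at the uninitialized value they have in VHDL.

```python
class Ram:
    """Behavioural model of ram.vhd: 2**6 words addressed by the low 6 bits."""
    RES_ADR_LENGTH = 6

    def __init__(self):
        self.cells = [0] * (2 ** self.RES_ADR_LENGTH)

    def access(self, en, r_w, a_bus, d_bus=None):
        """Return the word the RAM drives onto dBus, or None for 'Z'."""
        index = a_bus & (len(self.cells) - 1)   # aBus(resAdrLength-1 downto 0)
        if en == 1 and r_w == 0:                # write: store the word on dBus
            self.cells[index] = d_bus & 0xFFFF
            return None
        if en == 1 and r_w == 1:                # read: drive the stored word
            return self.cells[index]
        return None                             # disabled: bus left floating


ram = Ram()
ram.access(en=1, r_w=0, a_bus=0x0034, d_bus=0xF5AF)
print(hex(ram.access(en=1, r_w=1, a_bus=0x0034)))   # 0xf5af
print(hex(ram.access(en=1, r_w=1, a_bus=0x0074)))   # 0xf5af (0x74 & 0x3f == 0x34)
print(ram.access(en=0, r_w=1, a_bus=0x0034))        # None
```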

Top-level test module: top.vhd

-- top.vhd --
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;
use work.commonConstants.all;

entity top is port(
    clk, reset: in STD_LOGIC;
    mem_enX, mem_rwX : out std_logic;
    aBusX : out std_logic_vector(adrLength-1 downto 0);
    dBusX : out std_logic_vector(wordSize-1 downto 0);
    pcX, iarX : out std_logic_vector(adrLength-1 downto 0);
    iregX, accX, aluX : out std_logic_vector(wordSize-1 downto 0));
end top;

architecture topArch of top is

component ram port (
    reset, en, r_w: in STD_LOGIC;
    aBus: in STD_LOGIC_VECTOR(adrLength-1 downto 0);
    dBus: inout STD_LOGIC_VECTOR(wordSize-1 downto 0));
end component;

component cpu port (
    clk, reset:    in  STD_LOGIC;
    m_en, m_rw: out STD_LOGIC;        
    aBus:    out STD_LOGIC_VECTOR(adrLength-1 downto 0);
    dBus:    inout STD_LOGIC_VECTOR(wordSize-1 downto 0);
    pcX, iarX : out std_logic_vector(adrLength-1 downto 0);
    iregX, accX, aluX : out std_logic_vector(wordSize-1 downto 0));
end component;

signal mem_en, mem_rw: STD_LOGIC;
signal aBus, dBus: STD_LOGIC_VECTOR(15 downto 0);
signal pc, ireg, iar, acc, alu: std_logic_vector(15 downto 0);

begin
    ramC: ram port map(reset, mem_en, mem_rw, aBus, dBus);
    cpuC: cpu port map(clk, reset, mem_en, mem_rw, aBus, dBus,
            pc, iar, ireg, acc, alu);
    mem_enX <= mem_en; mem_rwX <= mem_rw;
    aBusX <= aBus; dBusX <= dBus;
    pcX <= pc; iregX <= ireg; iarX <= iar; accX <= acc; aluX <= alu;
end topArch;

Simple test program 1 (add the values in locations 20-2f and write the sum to location 10)

Address        Instruction                Comment
0000 (start)    1000 (ACC := 0000)        initialize sum
0001        4010 (M[0010] := ACC)
0002        1020 (ACC := 0020)        initialize pointer
0003        4011 (M[0011] := ACC)    
0004 (loop)    1030 (ACC := 0030)        if pointer = 030, quit
0005        0001 (ACC := -ACC)        
0006        a011 (ACC :=ACC+M[0011])
0007        700f (if 0 goto 000f)
0008        3011 (ACC := M[M[0011]])     sum = sum+*pointer
0009        a010 (ACC :=ACC+M[0010])
000a        4010 (M[0010] := ACC)
000b        1001 (ACC := 0001)        pointer = pointer + 1
000c        a011 (ACC :=ACC+M[0011])
000d        4011 (M[0011] := ACC)
000e        6004 (goto 0004)        goto loop
000f  (end)    0000 (halt)    halt
0010        Store sum here
0011        Pointer to next value

My test program: add.asm

000        dload    A        -- 2004
001        add    B        -- A005
002        dstore    C        -- 4006
003    done:    br    done        -- 6003
004    A    word    3        -- 0003
005    B    word    5        -- 0005
006    C    word    0        -- 0000
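
The hexadecimal words in the comments can be checked mechanically. The sketch below is a minimal two-pass assembler for this listing format, written for illustration only: the project itself contains no assembler, and the names `OPCODES`, `FIXED` and `assemble` as well as the tuple-based input format are assumptions.

```python
OPCODES = {"load": 0x1, "dload": 0x2, "iload": 0x3, "dstore": 0x4,
           "istore": 0x5, "br": 0x6, "brZero": 0x7, "brPos": 0x8,
           "brNeg": 0x9, "add": 0xA}
FIXED = {"halt": 0x0000, "negate": 0x0001}   # instructions without operand


def assemble(lines):
    """Translate (label, mnemonic, operand) tuples into 16-bit words."""
    # Pass 1: each line occupies one word, so a label's address is its index.
    symbols = {label: addr for addr, (label, _, _) in enumerate(lines) if label}
    # Pass 2: emit one word per line.
    words = []
    for _, mnemonic, operand in lines:
        if mnemonic == "word":                       # data word, decimal value
            words.append(int(operand) & 0xFFFF)
        elif mnemonic in FIXED:
            words.append(FIXED[mnemonic])
        else:                                        # opcode plus 12-bit operand
            target = symbols[operand] if operand in symbols else int(operand, 16)
            words.append((OPCODES[mnemonic] << 12) | (target & 0x0FFF))
    return words


add_asm = [
    (None, "dload", "A"),
    (None, "add", "B"),
    (None, "dstore", "C"),
    ("done", "br", "done"),
    ("A", "word", "3"),
    ("B", "word", "5"),
    ("C", "word", "0"),
]
print([f"{w:04X}" for w in assemble(add_asm)])
# ['2004', 'A005', '4006', '6003', '0003', '0005', '0000']
```

The output agrees with the machine code given in the comments of the listing above.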

My test program: sum.asm

000    loop:    dload    sum        -- 200A
001        add    i        -- A00B
002        dstore    sum        -- 400A
003        dload    i        -- 200B
004        add    one        -- A00C
005        dstore    i        -- 400B
006        negate            -- 0001
007        add    K10        -- A00D
008        brPos    loop        -- 8000
009    done:    br    done        -- 6009
00A    sum    word    0        -- 0000
00B    i    word    1        -- 0001
00C    one    word    1        -- 0001
00D    K10    word    10        -- 000A

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License