Project download: CPU.zip
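Preface
Is it possible to design your own CPU? Surely this is a closely guarded industrial secret, so how could anyone build one alone?
Is that really so? Perhaps it was 20 years ago, but no longer. With VHDL and an FPGA, we really can write a CPU as a program. The CPU, its memory, and its test cases can all be simulated on your PC and then programmed into an FPGA, which gives you an FPGA CPU.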
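Origin
When I was teaching Computer Organization at Kinmen Institute of Technology, I developed a strong interest in designing a CPU of my own. Perhaps it is because I love programming: whenever I see something that can be built by writing code, I want to write one myself, and with an FPGA the CPU has in effect become a kind of software.
It may also have been a guilty conscience: a teacher of computer architecture who has never designed a CPU is rather embarrassing. So I spent three weeks repeatedly designing and revising a CPU of my own in Altera's Quartus, but I still could not get it to work.
Finally, I decided to see whether anyone online teaches CPU design. I found plenty of material, but very little that explains both the code and the principles clearly. William D. Richard of Washington University, in his course Introduction To Digital Logic And Computer Design, explains the principles of CPU design very clearly, then implements a CPU in 200 lines of VHDL, and then adds memory to build a test computer, 315 lines of VHDL in total. His slides describe the design details thoroughly and give the simulation results, which is very impressive.
Below we analyze in depth how this CPU is designed and how it operates. We call this CPU Richard-1.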
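Design of the Richard-1 CPU
Registers
Richard-1 is a 16-bit CPU with a simple architecture centered on the accumulator ACC. It has four registers (PC, iReg, IAR, ACC), two buses (the address bus aBus and the data bus dBus), and several control lines: the clock clk, the reset line reset, the memory enable m_en, and the memory read/write line m_rw. The architecture is shown in the figure below:
Based on this architecture, the VHDL entity and the signal declarations of its architecture are as follows: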
entity cpu is port (
clk, reset : in std_logic;
m_en, m_rw : out std_logic;
aBus : out std_logic_vector(adrLength-1 downto 0);
dBus : inout std_logic_vector(wordSize-1 downto 0);
-- these signals "exported" so they can be monitored in post-P&R simulation
pcX, iarX : out std_logic_vector(adrLength-1 downto 0);
iregX, accX, aluX : out std_logic_vector(wordSize-1 downto 0));
end cpu;
architecture cpuArch of cpu is
type state_type is (
reset_state, fetch, halt, negate, mload, dload, iload,
dstore, istore, branch, brZero, brPos, brNeg, add
);
signal state: state_type;
type tick_type is (t0, t1, t2, t3, t4, t5, t6, t7);
signal tick: tick_type;
signal pc: std_logic_vector(adrLength-1 downto 0); -- program counter
signal iReg: std_logic_vector(wordSize-1 downto 0); -- instruction register
signal iar: std_logic_vector(adrLength-1 downto 0); -- indirect address register
signal acc: std_logic_vector(wordSize-1 downto 0); -- accumulator
signal alu: std_logic_vector(wordSize-1 downto 0); -- alu output
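Instruction set
To design a CPU, start with the instruction set. Richard-1 has 12 instructions, each corresponding to one state in the code above (the remaining two states, reset_state and fetch, are not instructions). Every instruction is 16 bits long: the first 4 bits usually hold the opcode and the last 12 bits hold a memory address. The instruction is held in the instruction register in the following format:
IR(15..12) : opcode
IR(11..0) : operand
The Richard-1 instructions are listed below: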
<csv>
Code, Instruction, Description, Operation
0000, halt, halt execution, (none)
0001, negate, negation, ACC := -ACC
1xxx, load, immediate load, if sign bit of xxx is 0 then ACC := 0xxx else ACC := fxxx
2xxx, dload, direct load, ACC := M[0xxx]
3xxx, iload, indirect load, ACC := M[M[0xxx]]
4xxx, dstore, direct store, M[0xxx] := ACC
5xxx, istore, indirect store, M[M[0xxx]] := ACC
6xxx, br, branch, PC := 0xxx
7xxx, brZero, branch if zero, if ACC = 0 then PC := 0xxx
8xxx, brPos, branch if positive, if ACC > 0 then PC := 0xxx
9xxx, brNeg, branch if negative, if ACC < 0 then PC := 0xxx
axxx, add, addition, ACC := ACC + M[0xxx]
</csv>
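In the table above, halt and negate take no operand (parameter), so they are encoded as hexadecimal 0000 and 0001. Every other instruction carries one operand: the first 4 bits are the opcode and the last 12 bits are the operand, which is a memory address for all of them except load, where it is an immediate value. These instructions fall into four groups: the first is loads (load, dload, iload), the second is stores (dstore, istore), the third is branches (br, brZero, brPos, brNeg), and the fourth contains a single instruction, addition (add).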
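The encoding described above can be sketched as a small software model. This is an illustrative Python sketch, not part of Richard's VHDL: the names `OPCODES`, `encode`, `split`, and `sign_extend_12` are hypothetical, and only the 4-bit opcode / 12-bit operand layout and the opcode values are taken from the instruction table.

```python
# Hypothetical helper names; the opcode values come from the instruction table.
OPCODES = {
    "load": 0x1, "dload": 0x2, "iload": 0x3, "dstore": 0x4, "istore": 0x5,
    "br": 0x6, "brZero": 0x7, "brPos": 0x8, "brNeg": 0x9, "add": 0xA,
}

def encode(mnemonic, operand=0):
    """Pack a mnemonic and a 12-bit operand into a 16-bit instruction word."""
    if mnemonic == "halt":
        return 0x0000
    if mnemonic == "negate":
        return 0x0001
    if not 0 <= operand <= 0xFFF:
        raise ValueError("operand must fit in 12 bits")
    return (OPCODES[mnemonic] << 12) | operand

def split(word):
    """Return (opcode, operand), i.e. (IR(15..12), IR(11..0))."""
    return (word >> 12) & 0xF, word & 0xFFF

def sign_extend_12(operand):
    """Immediate load: 0xxx if bit 11 is clear, fxxx if it is set."""
    return operand | 0xF000 if operand & 0x800 else operand

print(f"{encode('dload', 0x004):04X}")   # 2004
print(split(0x5032))                     # (5, 50), i.e. istore 0x032
print(f"{sign_extend_12(0xA0F):04X}")    # FA0F
```

Keeping halt and negate as special cases mirrors the table, where both share opcode 0 and are told apart by the low 12 bits.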
Arithmetic logic unit
alu <= (not acc) + x"0001" when state = negate else
acc + dbus when state = add else
(alu'range => '0');
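The concurrent assignment above gives the ALU three behaviours: a two's-complement negation of ACC in the negate state, ACC plus the data bus in the add state, and zero otherwise. A minimal Python model of that expression might look as follows; the function name `alu` and the use of strings for states are assumptions made for illustration.

```python
MASK = 0xFFFF  # results are truncated to the 16-bit word size

def alu(state, acc, dbus):
    """Software model of the ALU output for a given control state."""
    if state == "negate":
        return ((~acc) + 1) & MASK      # (not acc) + 1: two's complement
    if state == "add":
        return (acc + dbus) & MASK      # carry out of bit 15 is dropped
    return 0                            # (alu'range => '0') in every other state

print(f"{alu('negate', 0x0004, 0):04X}")     # FFFC
print(f"{alu('add', 0xFFFF, 0x0001):04X}")   # 0000
```

Note that negating 0x8000 yields 0x8000 again, as in any 16-bit two's-complement arithmetic.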
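Control unit
Timing of instruction steps
An instruction on Richard-1 takes at most 8 ticks. The code uses t0, t1, ..., t7 to indicate the current tick, and calling the nextTick function advances to the next one.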
function nextTick(tick: tick_type) return tick_type is begin
-- return next logical value for tick
case tick is
when t0 => return t1; when t1 => return t2; when t2 => return t3;
when t3 => return t4; when t4 => return t5; when t5 => return t6;
when t6 => return t7; when others => return t0;
end case;
end function nextTick;
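Decode procedure
The decode procedure inspects the instruction register to determine which instruction it holds and records the result in the state signal. Its code follows.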
procedure decode is begin
-- Instruction decoding.
case iReg(15 downto 12) is
when x"0" =>
if iReg(11 downto 0) = x"000" then
state <= halt;
elsif iReg(11 downto 0) = x"001" then
state <= negate;
end if;
when x"1" => state <= mload;
when x"2" => state <= dload;
when x"3" => state <= iload;
when x"4" => state <= dstore;
when x"5" => state <= istore;
when x"6" => state <= branch;
when x"7" => state <= brZero;
when x"8" => state <= brPos;
when x"9" => state <= brNeg;
when x"a" => state <= add;
when others => state <= halt;
end case;
end procedure decode;
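The decoding logic can also be read as a lookup from instruction word to next state. The Python sketch below models it under that reading; `STATE_BY_OPCODE` and `decode` are hypothetical names, and the `current_state` parameter is an assumption standing in for the VHDL signal, which keeps its old value when no assignment is made.

```python
STATE_BY_OPCODE = {
    0x1: "mload", 0x2: "dload", 0x3: "iload", 0x4: "dstore", 0x5: "istore",
    0x6: "branch", 0x7: "brZero", 0x8: "brPos", 0x9: "brNeg", 0xA: "add",
}

def decode(ireg, current_state="fetch"):
    """Return the state selected by the 16-bit instruction word ireg."""
    opcode, operand = (ireg >> 12) & 0xF, ireg & 0xFFF
    if opcode == 0x0:
        if operand == 0x000:
            return "halt"
        if operand == 0x001:
            return "negate"
        # The VHDL makes no assignment here, so the state is left unchanged.
        return current_state
    # Opcodes b..f fall into "when others" and halt the machine.
    return STATE_BY_OPCODE.get(opcode, "halt")

print(decode(0x1A0F), decode(0xA008), decode(0xB000))  # mload add halt
```

One consequence visible in the model: a word such as 0x0004 leaves the state at fetch, so it behaves like a no-op if the program counter ever reaches it.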
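Detailed breakdown of instruction steps
To design a CPU's control unit, every instruction must be carefully broken down into data-transfer steps. To improve performance, Richard-1 acts on both the rising and the falling edge of the clock, but this scatters the control code across two processes and makes it hard to follow. To understand the control logic, the rising-edge and falling-edge actions must be recombined. Below is the result of reorganizing the code in this way; within each state the actions are numbered 1, 2, 3, ... in execution order to help the reader.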
reset_state
state <= fetch; tick <= t0;
fetch
-- rising edge
if tick = t1 then iReg <= dBus; end if; -- 2. get instruction from data bus into iReg
if tick = t2 then -- 4. decode and PC++, go to next state
decode; pc <= pc + '1'; tick <= t0;
end if;
-- falling edge
if tick = t0 then m_en <= '1'; aBus <= pc; end if; -- 1. put PC onto address bus, enable instruction read
if tick = t2 then m_en <= '0'; aBus <= (aBus'range => '0'); end if; -- 3. clear address bus
halt -- halt execution
-- rising edge
tick <= t0; -- do nothing
negate -- ACC := -ACC
-- rising edge
acc <= alu; wrapup; -- 1. acc <= alu, where alu = -acc
-- load instructions
mload
-- rising edge
if iReg(11) = '0' then -- sign extension
acc <= x"0" & ireg(11 downto 0); -- 1. acc <= 0xxx
else
acc <= x"f" & ireg(11 downto 0); -- 1. acc <= fxxx
end if;
wrapup;
dload m -- ACC := M[0xxx]
-- rising edge
if tick = t1 then acc <= dBus; end if; -- 2. load data bus into ACC
if tick = t2 then wrapup; end if; -- 4. end instruction
-- falling edge
if tick = t0 then -- 1. Put IR(11..0) onto address bus.
m_en <= '1'; aBus <= x"0" & iReg(11 downto 0);
end if;
if tick = t2 then -- 3. clear address bus
m_en <= '0'; aBus <= (aBus'range => '0');
end if;
iload m -- ACC := M[M[0xxx]]
-- rising edge
if tick = t1 then iar <= dBus; end if; -- 2. load data bus into IAR
if tick = t4 then acc <= dBus; end if; -- 5. load data bus into ACC
if tick = t5 then wrapup; end if; -- 7. end instruction
-- falling edge
if tick = t0 then -- 1. Put IR(11..0) onto address bus.
m_en <= '1'; aBus <= x"0" & iReg(11 downto 0);
end if;
if tick = t2 then -- 3. Clear Address Bus.
m_en <= '0'; aBus <= (aBus'range => '0');
end if;
if tick = t3 then -- 4. Put IAR onto address
m_en <= '1'; aBus <= iar;
end if;
if tick = t5 then -- 6. clear address bus.
m_en <= '0'; aBus <= (abus'range => '0');
end if;
-- store instructions
dstore -- M[0xxx] := ACC
-- rising edge
if tick = t4 then wrapup; end if; -- 5. end instruction
-- falling edge
if tick = t0 then -- 1. Put IR(11..0) onto address bus.
m_en <= '1'; aBus <= x"0" & iReg(11 downto 0);
end if;
if tick = t1 then -- 2. Put ACC onto data bus and start the write (m_rw = '0').
m_rw <= '0'; dBus <= acc;
end if;
if tick = t3 then -- 3. End the write (m_rw back to '1', i.e. read).
m_rw <= '1';
end if;
if tick = t4 then -- 4. clear address bus, disable data bus.
m_en <= '0'; aBus <= (abus'range => '0'); dBus <= (dBus'range => 'Z');
end if;
istore -- M[M[0xxx]] := ACC
-- rising edge
if tick = t1 then iar <= dBus; end if; -- 2. load databus (M[0xxx]) into IAR
if tick = t7 then wrapup; end if; -- 8. end instruction.
-- falling edge
if tick = t0 then -- 1. Put IR(11..0) [0xxx] onto address bus.
m_en <= '1'; aBus <= x"0" & iReg(11 downto 0);
end if;
if tick = t2 then -- 3. clear address bus.
m_en <= '0'; aBus <= (aBus'range => '0');
end if;
if tick = t3 then m_en <= '1'; aBus <= iar; end if; -- 4. Put IAR onto address bus.
if tick = t4 then m_rw <= '0'; dBus <= acc; end if; -- 5. Put ACC onto data bus and start the write.
if tick = t6 then m_rw <= '1'; end if; -- 6. End the write (m_rw back to '1', i.e. read).
if tick = t7 then -- 7. clear address bus, disable data bus.
m_en <= '0'; aBus <= (abus'range => '0'); dBus <= (dBus'range => 'Z');
end if;
-- branch instructions
branch -- PC := 0xxx
-- rising edge
pc <= x"0" & iReg(11 downto 0); -- 1. PC <= IR(11..0)
wrapup;
-- falling edge
brZero -- if ACC = 0 then PC := 0xxx
-- rising edge
if acc = x"0000" then -- if (ACC=0) PC <= IR(11..0)
pc <= x"0" & iReg(11 downto 0);
end if;
wrapup;
-- falling edge
brPos -- if ACC > 0 then PC := 0xxx
-- rising edge
if acc(15) = '0' and acc /= x"0000" then -- if (ACC > 0) PC <= IR(11..0)
pc <= x"0" & iReg(11 downto 0);
end if;
wrapup;
-- falling edge
brNeg -- if ACC < 0 then PC := 0xxx -- if (ACC < 0) PC <= IR(11..0)
-- rising edge
if acc(15) = '1' then pc <= x"0" & iReg(11 downto 0); end if;
wrapup;
-- falling edge
add -- ACC := ACC + M[0xxx]
-- rising edge
if tick = t1 then acc <= alu; end if; -- 2. ACC <= ALU <= ACC+data bus
if tick = t2 then wrapup; end if; -- 4. end instruction.
-- falling edge
if tick = t0 then -- 1. Put IR(11..0) onto address bus.
m_en <= '1'; aBus <= x"0" & iReg(11 downto 0);
end if;
if tick = t2 then -- 3. clear address bus.
m_en <= '0'; aBus <= (aBus'range => '0');
end if;
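The step numbers in the listings above follow from how the two clocked processes interleave: for a given tick value, the falling-edge process runs first and the rising-edge process runs afterwards and advances the tick. The Python sketch below reproduces that ordering for dload and istore. It is only an illustration; the tables `DLOAD` and `ISTORE` are hand-transcribed from the listings, and `schedule` is a hypothetical helper that assumes each instruction starts just after a rising edge has set tick to t0.

```python
TICKS = ["t0", "t1", "t2", "t3", "t4", "t5", "t6", "t7"]

# (edge, tick) -> action, transcribed from the dload and istore listings.
DLOAD = {
    ("fall", "t0"): "m_en <= '1'; aBus <= 0xxx",
    ("rise", "t1"): "acc <= dBus",
    ("fall", "t2"): "m_en <= '0'; clear aBus",
    ("rise", "t2"): "wrapup",
}
ISTORE = {
    ("fall", "t0"): "m_en <= '1'; aBus <= 0xxx",
    ("rise", "t1"): "iar <= dBus",
    ("fall", "t2"): "m_en <= '0'; clear aBus",
    ("fall", "t3"): "m_en <= '1'; aBus <= iar",
    ("fall", "t4"): "m_rw <= '0'; dBus <= acc",
    ("fall", "t6"): "m_rw <= '1'",
    ("fall", "t7"): "m_en <= '0'; clear aBus; release dBus",
    ("rise", "t7"): "wrapup",
}

def schedule(actions):
    """List the actions in the order the two processes execute them."""
    order = []
    for tick in TICKS:
        # With tick fixed, the falling edge is seen before the rising edge.
        for edge in ("fall", "rise"):
            action = actions.get((edge, tick))
            if action is None:
                continue
            order.append((edge, tick, action))
            if action == "wrapup":      # wrapup returns the CPU to fetch
                return order
    return order

for number, (edge, tick, action) in enumerate(schedule(DLOAD), start=1):
    print(number, edge, tick, action)
```

Running `schedule(ISTORE)` gives eight steps, which matches the statement that an instruction needs at most eight ticks.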
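Memory
Although Richard-1 is a CPU and in principle need not include a memory design, a CPU is hard to test fully without memory. The author, Mr. Richard, therefore included a memory module in a file named RAM.VHD in his VHDL code, and it is loaded with a short program on reset for testing. This makes it possible to test the CPU completely in a VHDL simulator (for example, Altera Quartus II).
Functional simulation
The program that Mr. Richard originally placed in RAM.VHD exercises every instruction once: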
-- basic instruction check
ram(0) <= x"1a0f"; -- immediate load
ram(1) <= x"2010"; -- direct load
ram(2) <= x"3030"; -- indirect load
ram(3) <= x"4034"; -- direct store
ram(4) <= x"0001"; -- negate
ram(5) <= x"2034"; -- direct load
ram(6) <= x"0001"; -- negate
ram(7) <= x"5032"; -- indirect store
ram(8) <= x"0001"; -- negate
ram(9) <= x"1fff"; -- immediate load
ram(10) <= x"a008"; -- add
ram(11) <= x"700d"; -- brZero
ram(12) <= x"0000"; -- halt
ram(13) <= x"1400"; -- immediate load
ram(14) <= x"8010"; -- brPos
ram(15) <= x"0000"; -- halt
ram(16) <= x"0001"; -- negate
ram(17) <= x"9013"; -- brNeg
ram(18) <= x"0000"; -- halt
ram(19) <= x"6015"; -- branch
ram(20) <= x"0000"; -- halt
ram(21) <= x"8014"; -- brPos
ram(22) <= x"7014"; -- brZero
ram(23) <= x"0001"; -- negate
ram(24) <= x"9014"; -- brNeg
ram(25) <= x"0000"; -- halt
ram(48) <= x"0031"; -- pointer for iload
ram(49) <= x"5af0"; -- target of iload
ram(50) <= x"0033"; -- pointer for istore
ram(51) <= x"0000"; -- target of istore
ram(52) <= x"f5af"; -- target of dstore
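The author summarizes the key simulation points of each instruction in the following figure:
Next comes a detailed, illustrated walkthrough of the simulation timing of the ADD instruction.
Finally, the actual functional simulation result is shown.
However, I wanted a complete small program for testing, so I wrote the following program, add.asm: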
000 dload sum -- 2003
001 loop: add sum -- A003
002 br loop -- 6001
003 sum word 4 -- 0004
I then modified the author's test program into the VHDL equivalent of the program above for testing.
ram(0) <= x"2003"; -- dload sum
ram(1) <= x"A003"; -- add sum
ram(2) <= x"6001"; -- br loop
ram(3) <= x"0004"; -- sum word 4
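I then ran the simulation in Altera Quartus II, as shown in the figure below. Watch the accumulator value AccX closely: it goes 0, 4, 8, 12, ... and keeps accumulating, so the program does execute correctly.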
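The same accumulator sequence can be cross-checked without a VHDL simulator using an instruction-level model. The Python sketch below is such a model under stated assumptions: it ignores clock ticks and bus signals entirely, treats unwritten memory as zero, encodes `br loop` as 6001 (opcode 6 in the instruction table), and uses hypothetical names (`step`, `trace_acc`, `AMASK`).

```python
MASK = 0xFFFF   # 16-bit words
AMASK = 0x3F    # ram.vhd decodes only the low 6 address bits (64 words)

def signed(word):
    """Interpret a 16-bit word as a two's-complement integer."""
    return word - 0x10000 if word & 0x8000 else word

def step(mem, pc, acc):
    """Execute the instruction at pc; return (pc, acc, halted)."""
    ir = mem[pc & AMASK]
    op, x = ir >> 12, ir & 0xFFF
    pc = (pc + 1) & MASK                          # fetch increments PC first
    if ir == 0x0000 or op > 0xA:                  # halt, or undefined opcode
        return pc, acc, True
    if ir == 0x0001:                              # negate
        acc = -acc & MASK
    elif op == 0x1:                               # load (sign-extended immediate)
        acc = x | 0xF000 if x & 0x800 else x
    elif op == 0x2:                               # dload
        acc = mem[x & AMASK]
    elif op == 0x3:                               # iload
        acc = mem[mem[x & AMASK] & AMASK]
    elif op == 0x4:                               # dstore
        mem[x & AMASK] = acc
    elif op == 0x5:                               # istore
        mem[mem[x & AMASK] & AMASK] = acc
    elif op == 0x6:                               # br
        pc = x
    elif op == 0x7 and acc == 0:                  # brZero
        pc = x
    elif op == 0x8 and signed(acc) > 0:           # brPos
        pc = x
    elif op == 0x9 and signed(acc) < 0:           # brNeg
        pc = x
    elif op == 0xA:                               # add
        acc = (acc + mem[x & AMASK]) & MASK
    return pc, acc, False

def trace_acc(words, steps):
    """Run `steps` instructions and record each new value of ACC."""
    mem = list(words) + [0] * (64 - len(words))
    pc, acc, seen = 0, 0, [0]
    for _ in range(steps):
        pc, acc, halted = step(mem, pc, acc)
        if halted:
            break
        if acc != seen[-1]:
            seen.append(acc)
    return seen

print(trace_acc([0x2003, 0xA003, 0x6001, 0x0004], 8))  # [0, 4, 8, 12, 16, 20]
```

The model agrees with the waveform description: ACC starts at 0, becomes 4 after the dload, and grows by 4 on each pass through the loop.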
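Conclusion
With a digital hardware description language such as VHDL, hardware design is no longer an industrial secret. This article uses a CPU designed by William D. Richard of Washington University to explain the principles of CPU design in detail. The complete code is in the appendix below; read the VHDL carefully, and I believe you will become skilled at CPU design.
Appendix: complete source code of the Richard-1 CPU
Shared constant declarations: commonConstant.vhd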
-- commonConstant.vhd --
package commonConstants is
constant wordSize: integer := 16;
constant adrLength: integer := 16;
end package commonConstants;
Central processing unit: cpu.vhd
-- cpu.vhd --
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;
use IEEE.std_logic_unsigned.all;
use work.commonConstants.all;
entity cpu is port (
clk, reset : in std_logic;
m_en, m_rw : out std_logic;
aBus : out std_logic_vector(adrLength-1 downto 0);
dBus : inout std_logic_vector(wordSize-1 downto 0);
-- these signals "exported" so they can be monitored in post-P&R simulation
pcX, iarX : out std_logic_vector(adrLength-1 downto 0);
iregX, accX, aluX : out std_logic_vector(wordSize-1 downto 0));
end cpu;
architecture cpuArch of cpu is
type state_type is (
reset_state, fetch, halt, negate, mload, dload, iload,
dstore, istore, branch, brZero, brPos, brNeg, add
);
signal state: state_type;
type tick_type is (t0, t1, t2, t3, t4, t5, t6, t7);
signal tick: tick_type;
signal pc: std_logic_vector(adrLength-1 downto 0); -- program counter
signal iReg: std_logic_vector(wordSize-1 downto 0); -- instruction register
signal iar: std_logic_vector(adrLength-1 downto 0); -- indirect address register
signal acc: std_logic_vector(wordSize-1 downto 0); -- accumulator
signal alu: std_logic_vector(wordSize-1 downto 0); -- alu output
begin
alu <= (not acc) + x"0001" when state = negate else
acc + dbus when state = add else
(alu'range => '0');
pcX <= pc; iregX <= ireg; iarX <= iar; accX <= acc; aluX <= alu;
process(clk) -- perform actions that occur on rising clock edges
function nextTick(tick: tick_type) return tick_type is begin
-- return next logical value for tick
case tick is
when t0 => return t1; when t1 => return t2; when t2 => return t3;
when t3 => return t4; when t4 => return t5; when t5 => return t6;
when t6 => return t7; when others => return t0;
end case;
end function nextTick;
procedure decode is begin
-- Instruction decoding.
case iReg(15 downto 12) is
when x"0" =>
if iReg(11 downto 0) = x"000" then
state <= halt;
elsif iReg(11 downto 0) = x"001" then
state <= negate;
end if;
when x"1" => state <= mload;
when x"2" => state <= dload;
when x"3" => state <= iload;
when x"4" => state <= dstore;
when x"5" => state <= istore;
when x"6" => state <= branch;
when x"7" => state <= brZero;
when x"8" => state <= brPos;
when x"9" => state <= brNeg;
when x"a" => state <= add;
when others => state <= halt;
end case;
end procedure decode;
procedure wrapup is begin
-- Do this at end of every instruction
state <= fetch; tick <= t0;
end procedure wrapup;
begin
if clk'event and clk = '1' then
if reset = '1' then
state <= reset_state; tick <= t0;
pc <= (pc'range => '0'); iReg <= (iReg'range => '0');
acc <= (acc'range => '0'); iar <= (iar'range => '0');
else
tick <= nextTick(tick) ; -- advance time by default
case state is
when reset_state => state <= fetch; tick <= t0;
when fetch => if tick = t1 then iReg <= dBus; end if;
if tick = t2 then
decode; pc <= pc + '1'; tick <= t0;
end if;
when halt => tick <= t0; -- do nothing
when negate => acc <= alu; wrapup;
-- load instructions
when mload =>
if iReg(11) = '0' then -- sign extension
acc <= x"0" & ireg(11 downto 0);
else
acc <= x"f" & ireg(11 downto 0);
end if;
wrapup;
when dload =>
if tick = t1 then acc <= dBus; end if;
if tick = t2 then wrapup; end if;
when iload =>
if tick = t1 then iar <= dBus; end if;
if tick = t4 then acc <= dBus; end if;
if tick = t5 then wrapup; end if;
-- store instructions
when dstore =>
if tick = t4 then wrapup; end if;
when istore =>
if tick = t1 then iar <= dBus; end if;
if tick = t7 then wrapup; end if;
-- branch instructions
when branch =>
pc <= x"0" & iReg(11 downto 0);
wrapup;
when brZero =>
if acc = x"0000" then pc <= x"0" & iReg(11 downto 0); end if;
wrapup;
when brPos =>
if acc(15) = '0' and acc /= x"0000" then
pc <= x"0" & iReg(11 downto 0);
end if;
wrapup;
when brNeg =>
if acc(15) = '1' then pc <= x"0" & iReg(11 downto 0); end if;
wrapup;
-- arithmetic instructions
when add =>
if tick = t1 then acc <= alu; end if;
if tick = t2 then wrapup; end if;
when others => state <= halt;
end case;
end if;
end if;
end process;
process(clk) begin -- perform actions that occur on falling clock edges
if clk'event and clk ='0' then
if reset = '1' then
m_en <= '0'; m_rw <= '1';
aBus <= (aBus'range => '0'); dBus <= (dBus'range => 'Z');
else
case state is
when fetch =>
if tick = t0 then m_en <= '1'; aBus <= pc; end if;
if tick = t2 then m_en <= '0'; aBus <= (aBus'range => '0'); end if;
when dload =>
if tick = t0 then m_en <= '1'; aBus <= x"0" & iReg(11 downto 0); end if;
if tick = t2 then m_en <= '0'; aBus <= (aBus'range => '0'); end if;
when iload =>
if tick = t0 then m_en <= '1'; aBus <= x"0" & iReg(11 downto 0); end if;
if tick = t2 then m_en <= '0'; aBus <= (aBus'range => '0'); end if;
if tick = t3 then m_en <= '1'; aBus <= iar; end if;
if tick = t5 then m_en <= '0'; aBus <= (abus'range => '0'); end if;
when dstore =>
if tick = t0 then m_en <= '1'; aBus <= x"0" & iReg(11 downto 0); end if;
if tick = t1 then m_rw <= '0'; dBus <= acc; end if;
if tick = t3 then m_rw <= '1'; end if;
if tick = t4 then
m_en <= '0'; aBus <= (abus'range => '0'); dBus <= (dBus'range => 'Z');
end if;
when istore =>
if tick = t0 then m_en <= '1'; aBus <= x"0" & iReg(11 downto 0); end if;
if tick = t2 then m_en <= '0'; aBus <= (aBus'range => '0'); end if;
if tick = t3 then m_en <= '1'; aBus <= iar; end if;
if tick = t4 then m_rw <= '0'; dBus <= acc; end if;
if tick = t6 then m_rw <= '1'; end if;
if tick = t7 then
m_en <= '0'; aBus <= (abus'range => '0'); dBus <= (dBus'range => 'Z');
end if;
when add =>
if tick = t0 then m_en <= '1'; aBus <= x"0" & iReg(11 downto 0); end if;
if tick = t2 then m_en <= '0'; aBus <= (aBus'range => '0'); end if;
when others => -- do nothing
end case;
end if;
end if;
end process;
end cpuArch;
Test memory ram.vhd: contains the test program
-- ram.vhd --
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;
use work.commonConstants.all;
entity ram is port (
reset, en, r_w: in STD_LOGIC;
aBus: in STD_LOGIC_VECTOR(adrLength-1 downto 0);
dBus: inout STD_LOGIC_VECTOR(wordSize-1 downto 0));
end ram;
architecture ramArch of ram is
constant resAdrLength: integer := 6; -- address length restricted within architecture
constant memSize: integer := 2**resAdrLength;
type ram_typ is array(0 to memSize-1) of STD_LOGIC_VECTOR(wordSize-1 downto 0);
signal ram: ram_typ;
begin
process(reset, en, r_w, aBus, dBus) begin
if reset = '1' then
-- basic instruction check
ram(0) <= x"1a0f"; -- immediate load
ram(1) <= x"2010"; -- direct load
ram(2) <= x"3030"; -- indirect load
ram(3) <= x"4034"; -- direct store
ram(4) <= x"0001"; -- negate
ram(5) <= x"2034"; -- direct load
ram(6) <= x"0001"; -- negate
ram(7) <= x"5032"; -- indirect store
ram(8) <= x"0001"; -- negate
ram(9) <= x"1fff"; -- immediate load
ram(10) <= x"a008"; -- add
ram(11) <= x"700d"; -- brZero
ram(12) <= x"0000"; -- halt
ram(13) <= x"1400"; -- immediate load
ram(14) <= x"8010"; -- brPos
ram(15) <= x"0000"; -- halt
ram(16) <= x"0001"; -- negate
ram(17) <= x"9013"; -- brNeg
ram(18) <= x"0000"; -- halt
ram(19) <= x"6015"; -- branch
ram(20) <= x"0000"; -- halt
ram(21) <= x"8014"; -- brPos
ram(22) <= x"7014"; -- brZero
ram(23) <= x"0001"; -- negate
ram(24) <= x"9014"; -- brNeg
ram(25) <= x"0000"; -- halt
ram(48) <= x"0031"; -- pointer for iload
ram(49) <= x"5af0"; -- target of iload
ram(50) <= x"0033"; -- pointer for istore
ram(51) <= x"0000"; -- target of istore
ram(52) <= x"f5af"; -- target of dstore
elsif en = '1' and r_w = '0' then
ram(conv_integer(unsigned(aBus(resAdrLength-1 downto 0)))) <= dBus;
end if;
end process;
dBus <= ram(conv_integer(unsigned(aBus(resAdrLength-1 downto 0))))
when reset = '0' and en = '1' and r_w = '1' else
(dbus'range => 'Z');
end ramArch;
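One detail of ram.vhd is easy to miss: although aBus is 16 bits wide, only the low resAdrLength = 6 bits index the array, so the memory holds 64 words and higher addresses alias onto them. A small Python model of that behaviour is sketched below. The class name `Ram` and its methods are hypothetical, and unwritten cells are assumed to read as 0, whereas the VHDL signal would be uninitialized.

```python
class Ram:
    """Software model of the 64-word test memory."""

    RES_ADR_LENGTH = 6  # same value as resAdrLength in ram.vhd

    def __init__(self):
        self.size = 2 ** self.RES_ADR_LENGTH
        self.cells = [0] * self.size    # assumption: memory starts zeroed

    def _index(self, a_bus):
        # Keep only aBus(resAdrLength-1 downto 0).
        return a_bus & (self.size - 1)

    def write(self, a_bus, word):
        """Corresponds to en = '1' and r_w = '0'."""
        self.cells[self._index(a_bus)] = word & 0xFFFF

    def read(self, a_bus):
        """Corresponds to en = '1' and r_w = '1'."""
        return self.cells[self._index(a_bus)]

ram = Ram()
ram.write(0x0034, 0xF5AF)
print(f"{ram.read(0x0074):04X}")  # F5AF, because 0x74 and 0x34 share the low 6 bits
```

This is why the test programs keep all code and data within addresses 0x00 to 0x3f.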
Top-level test module top.vhd
-- top.vhd --
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;
use work.commonConstants.all;
entity top is port(
clk, reset: in STD_LOGIC;
mem_enX, mem_rwX : out std_logic;
aBusX : out std_logic_vector(adrLength-1 downto 0);
dBusX : out std_logic_vector(wordSize-1 downto 0);
pcX, iarX : out std_logic_vector(adrLength-1 downto 0);
iregX, accX, aluX : out std_logic_vector(wordSize-1 downto 0));
end top;
architecture topArch of top is
component ram port (
reset, en, r_w: in STD_LOGIC;
aBus: in STD_LOGIC_VECTOR(adrLength-1 downto 0);
dBus: inout STD_LOGIC_VECTOR(wordSize-1 downto 0));
end component;
component cpu port (
clk, reset: in STD_LOGIC;
m_en, m_rw: out STD_LOGIC;
aBus: out STD_LOGIC_VECTOR(adrLength-1 downto 0);
dBus: inout STD_LOGIC_VECTOR(wordSize-1 downto 0);
pcX, iarX : out std_logic_vector(adrLength-1 downto 0);
iregX, accX, aluX : out std_logic_vector(wordSize-1 downto 0));
end component;
signal mem_en, mem_rw: STD_LOGIC;
signal aBus, dBus: STD_LOGIC_VECTOR(15 downto 0);
signal pc, ireg, iar, acc, alu: std_logic_vector(15 downto 0);
begin
ramC: ram port map(reset, mem_en, mem_rw, aBus, dBus);
cpuC: cpu port map(clk, reset, mem_en, mem_rw, aBus, dBus,
pc, iar, ireg, acc, alu);
mem_enX <= mem_en; mem_rwX <= mem_rw;
aBusX <= aBus; dBusX <= dBus;
pcX <= pc; iregX <= ireg; iarX <= iar; accX <= acc; aluX <= alu;
end topArch;
Simple test program 1 (Add the values in locations 20-2f and write sum in 10).
Address Instruction Description
0000 (start) 1000 (ACC := 0000) initialize sum
0001 4010 (M[0010] := ACC)
0002 1020 (ACC := 0020) initialize pointer
0003 4011 (M[0011] := ACC)
0004 (loop) 1030 (ACC := 0030) if pointer = 030, quit
0005 0001 (ACC := -ACC)
0006 a011 (ACC :=ACC+M[0011])
0007 700f (if 0 goto 000f)
0008 3011 (ACC := M[M[0011]]) sum = sum+*pointer
0009 a010 (ACC :=ACC+M[0010])
000a 4010 (M[0010] := ACC)
000b 1001 (ACC := 0001) pointer = pointer + 1
000c a011 (ACC :=ACC+M[0011])
000d 4011 (M[0011] := ACC)
000e 6004 (goto 0004) goto loop
000f (end) 0000 (halt) halt
0010 Store sum here
0011 Pointer to next value
A test program I wrote: add.asm
000 dload A -- 2004
001 add B -- A005
002 dstore C -- 4006
003 done: br done -- 6003
004 A word 3 -- 0003
005 B word 5 -- 0005
006 C word 0 -- 0000
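The hexadecimal column of add.asm can be reproduced mechanically. The sketch below is a minimal two-pass assembler for this listing style, written in Python for illustration only: no such tool accompanies the original project, the name `assemble` is hypothetical, and the syntax rules (a code label ends with a colon, a data label precedes `word`, `word` values are decimal) are inferred from the listings.

```python
OPCODES = {
    "load": 0x1, "dload": 0x2, "iload": 0x3, "dstore": 0x4, "istore": 0x5,
    "br": 0x6, "brZero": 0x7, "brPos": 0x8, "brNeg": 0x9, "add": 0xA,
}

ADD_ASM = """
dload A
add B
dstore C
done: br done
A word 3
B word 5
C word 0
"""

def assemble(source):
    """Return the list of 16-bit words for the given assembly text."""
    rows, labels = [], {}
    # Pass 1: record the address of every label.
    for line in source.strip().splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0].endswith(":"):                  # code label, e.g. "done:"
            labels[tokens[0][:-1]] = len(rows)
            tokens = tokens[1:]
        elif len(tokens) > 1 and tokens[1] == "word":  # data label, e.g. "A word 3"
            labels[tokens[0]] = len(rows)
            tokens = tokens[1:]
        rows.append(tokens)
    # Pass 2: emit one word per row.
    words = []
    for tokens in rows:
        op = tokens[0]
        if op == "word":
            words.append(int(tokens[1]) & 0xFFFF)
        elif op == "halt":
            words.append(0x0000)
        elif op == "negate":
            words.append(0x0001)
        else:
            arg = tokens[1]
            value = labels[arg] if arg in labels else int(arg, 16)
            words.append((OPCODES[op] << 12) | (value & 0xFFF))
    return words

print([f"{w:04X}" for w in assemble(ADD_ASM)])
# ['2004', 'A005', '4006', '6003', '0003', '0005', '0000']
```

Two passes are used so that a forward reference such as `dload A`, whose label is defined later, can be resolved.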
A test program I wrote: sum.asm
000 loop: dload sum -- 200A
001 add i -- A00B
002 dstore sum -- 400A
003 dload i -- 200B
004 add one -- A00C
005 dstore i -- 400B
006 negate -- 0001
007 add N -- A00D
008 brPos loop -- 8000
009 done: br done -- 6009
00A sum word 0 -- 0000
00B i word 1 -- 0001
00C one word 1 -- 0001
00D N word 10 -- 000A
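To see what sum.asm computes, the program can be traced with a small instruction-level model. The Python sketch below implements only the six instructions this program uses and stops when it reaches the `done: br done` self-loop. It is an assumption-laden illustration: the names `SUM_ASM` and `run` are hypothetical, and `i` is initialized to 1 as the `word 1` directive states (starting from 0 gives the same total).

```python
MASK = 0xFFFF

SUM_ASM = [
    0x200A, 0xA00B, 0x400A,          # sum := sum + i
    0x200B, 0xA00C, 0x400B,          # i := i + one
    0x0001, 0xA00D, 0x8000, 0x6009,  # if N - i > 0 goto loop, else spin at done
    0x0000, 0x0001, 0x0001, 0x000A,  # sum, i, one, N
]

def run(words, max_steps=1000):
    """Execute until the program branches to itself; return the memory."""
    mem = list(words)
    pc = acc = 0
    for _ in range(max_steps):
        ir = mem[pc]
        op, x = ir >> 12, ir & 0xFFF
        if op == 0x6 and x == pc:        # "done: br done" never ends, so stop here
            return mem
        pc += 1
        if ir == 0x0001:                 # negate
            acc = -acc & MASK
        elif op == 0x2:                  # dload
            acc = mem[x]
        elif op == 0x4:                  # dstore
            mem[x] = acc
        elif op == 0x6:                  # br
            pc = x
        elif op == 0x8:                  # brPos: ACC > 0 as a signed 16-bit value
            if 0 < acc < 0x8000:
                pc = x
        elif op == 0xA:                  # add
            acc = (acc + mem[x]) & MASK
        else:
            raise ValueError(f"instruction {ir:04x} is outside this sketch")
    raise RuntimeError("program did not reach its final loop")

mem = run(SUM_ASM)
print(mem[0x00A], mem[0x00B])  # 45 10
```

Because the loop test is N - i > 0 after i has been incremented, the loop exits once i reaches 10, so the stored sum is 1 + 2 + ... + 9 = 45.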