Processor

Datapath

Execution

Fetch instruction from PC
Decode instruction
- Read source registers
Perform the operation with ALU
1. Arithmetic & logic: ALU
2. Memory-reference instructions: ALU for address calculation
3. Conditional branch instructions: ALU for comparison
Memory access
1. lw, sw
Write back the result
1. PC += 4 OR change PC to branch target address

PC is an edge-triggered flip-flop (falling edge), so it only updates at the end of the cycle
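
The PC update in the last step can be sketched as a small function. This is only an illustration: the names `sign_extend16` and `next_pc` are made up for this note, and it assumes the usual MIPS convention that the branch offset is a 16-bit word offset relative to PC + 4.

```python
def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def next_pc(pc: int, branch_taken: bool, offset16: int = 0) -> int:
    """Return the value the PC latches at the end of the current cycle."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    if branch_taken:
        # Assumed MIPS convention: offset counts words, relative to PC + 4.
        return (pc_plus_4 + (sign_extend16(offset16) << 2)) & 0xFFFFFFFF
    return pc_plus_4
```

For example, `next_pc(0x400000, False)` gives the sequential address, while `next_pc(0x400000, True, 3)` skips three more words past PC + 4.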

Operations

R-format
- Read 2 operands
- ALU performs arithmetic/logical operation
- Write register result
I-format load/store (e.g. lw $t0, 4($s0))
- Read base register operand ($s0)
- ALU adds the base address and the 16-bit sign-extended offset (4)
- Load: read memory and write the value to the register ($t0)
- Store: write the register value ($t0) to memory
I-format branch
- Read register operands
- ALU compares operands by subtraction, check zero output (branch or not)
- Calculate branch target address
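
The three instruction classes above can be sketched in a few lines. This is a simplified model, not a full simulator: registers and memory are plain dictionaries, and the helper names (`exec_r_add`, `exec_lw`, `exec_sw`, `beq_taken`) are hypothetical. It uses `add` as the R-format example.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def exec_r_add(regs, rd, rs, rt):
    """R-format: read 2 operands, ALU adds, write the register result."""
    regs[rd] = (regs[rs] + regs[rt]) & MASK32


def exec_lw(regs, mem, rt, offset, rs):
    """Load: ALU computes base + sign-extended offset, memory is read."""
    addr = (regs[rs] + sign_extend16(offset)) & MASK32
    regs[rt] = mem[addr]


def exec_sw(regs, mem, rt, offset, rs):
    """Store: same address calculation, register value goes to memory."""
    addr = (regs[rs] + sign_extend16(offset)) & MASK32
    mem[addr] = regs[rt]


def beq_taken(regs, rs, rt):
    """Branch: ALU subtracts the operands and the zero output is checked."""
    return ((regs[rs] - regs[rt]) & MASK32) == 0


# lw $t0, 4($s0) with $s0 = 0x1000 reads the word at address 0x1004.
regs = {"$s0": 0x1000, "$t0": 0, "$t1": 7, "$t2": 7}
mem = {0x1004: 42}
exec_lw(regs, mem, "$t0", 4, "$s0")
```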

Control

ALU Control

load/store: add
branch: subtract
r-type: depends on funct

ALU Control Input	Function
0000	and
0001	or
0010	add
0110	subtract
0111	set on less than
1100	nor
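
As a rough sketch, the table above can be modelled as a function from the 4-bit ALU control input to a 32-bit result plus a zero flag. The function name `alu` and the helper `to_signed` are made up for this note, and it assumes that set on less than compares signed values.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x: int) -> int:
    """Reinterpret a 32-bit pattern as a signed two's-complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def alu(control: int, a: int, b: int):
    """Return (result, zero) for the given 4-bit ALU control input."""
    if control == 0b0000:
        result = a & b
    elif control == 0b0001:
        result = a | b
    elif control == 0b0010:
        result = a + b
    elif control == 0b0110:
        result = a - b
    elif control == 0b0111:
        # Assumption: slt is a signed comparison.
        result = 1 if to_signed(a) < to_signed(b) else 0
    elif control == 0b1100:
        result = ~(a | b)
    else:
        raise ValueError(f"unsupported ALU control: {control:04b}")
    result &= MASK32
    return result, result == 0
```

The zero flag is what a branch checks after the subtraction: `alu(0b0110, 7, 7)` returns a zero result with the flag set.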

1-level decoding
- more input bits
- 6-bit opcode + 6-bit funct = 12 input bits
- 2^12 = 4096 input combinations
2-level decoding
- fewer input bits per level, simpler and faster logic
- 6-bit opcode → 2-bit ALUOp, then 2-bit ALUOp + 6-bit funct = 8 input bits
- 2^8 = 256 input combinations

opcode	ALUOp	Operation	funct	ALU function	ALU control
lw	00	load word		add	0010
sw	00	store word		add	0010
beq	01	branch equal		subtract	0110
R-type	10	add	100000	add	0010
		subtract	100010	subtract	0110
		and	100100	and	0000
		or	100101	or	0001
		set on less than	101010	set on less than	0111
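
The two-level decoding in this table can be sketched as two lookups: first the opcode picks a 2-bit ALUOp, then ALUOp (plus funct for R-type) picks the ALU control. To keep the sketch short, the first level is keyed by mnemonic rather than by the 6-bit opcode encoding, and the names `OPCODE_TO_ALUOP`, `FUNCT_TO_ALU` and `alu_control` are hypothetical.

```python
# Level 1: main control maps the instruction class to ALUOp.
OPCODE_TO_ALUOP = {"lw": 0b00, "sw": 0b00, "beq": 0b01, "R-type": 0b10}

# Level 2 (R-type only): funct field selects the ALU control input.
FUNCT_TO_ALU = {
    0b100000: 0b0010,  # add
    0b100010: 0b0110,  # subtract
    0b100100: 0b0000,  # and
    0b100101: 0b0001,  # or
    0b101010: 0b0111,  # set on less than
}


def alu_control(alu_op: int, funct: int = 0) -> int:
    """Return the 4-bit ALU control input for an ALUOp / funct pair."""
    if alu_op == 0b00:      # lw / sw: address calculation
        return 0b0010
    if alu_op == 0b01:      # beq: compare by subtraction
        return 0b0110
    if alu_op == 0b10:      # R-type: depends on funct
        return FUNCT_TO_ALU[funct]
    raise ValueError(f"unsupported ALUOp: {alu_op:02b}")
```

Only the R-type path looks at funct, which is why the second level needs 8 input bits instead of 12.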

ALUOp1	ALUOp0	F3	F2	F1	F0	Operation	Instruction
0	0					0010	lw (add)
	1					0110	beq (sub)
1		0	0	0	0	0010	add
1		0	0	1	0	0110	sub
1		0	1	0	0	0000	and
1		0	1	0	1	0001	or
1		1	0	1	0	0111	slt

e.g. $Operation_{0} = ALUOp_{1} \cdot \overline{F_{3}} \cdot F_{2} \cdot \overline{F_{1}} \cdot F_{0} + ALUOp_{1} \cdot F_{3} \cdot \overline{F_{2}} \cdot F_{1} \cdot \overline{F_{0}}$
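
One way to sanity-check this equation is to evaluate it for the R-type rows of the truth table and compare against bit 0 of the listed operation code. The sketch below does that; `operation0`, `R_TYPE_ROWS` and `equation_matches_table` are names invented for this note.

```python
from itertools import product


def operation0(alu_op1, f3, f2, f1, f0) -> int:
    """Bit 0 of the ALU control, from the sum-of-products equation."""
    or_term = alu_op1 and (not f3) and f2 and (not f1) and f0
    slt_term = alu_op1 and f3 and (not f2) and f1 and (not f0)
    return int(bool(or_term or slt_term))


# (F3, F2, F1, F0, ALU control) for the R-type rows (ALUOp1 = 1).
R_TYPE_ROWS = [
    (0, 0, 0, 0, 0b0010),  # add
    (0, 0, 1, 0, 0b0110),  # sub
    (0, 1, 0, 0, 0b0000),  # and
    (0, 1, 0, 1, 0b0001),  # or
    (1, 0, 1, 0, 0b0111),  # slt
]


def equation_matches_table() -> bool:
    """Check the equation against every row of the truth table."""
    r_type_ok = all(
        operation0(1, f3, f2, f1, f0) == (ctl & 1)
        for f3, f2, f1, f0, ctl in R_TYPE_ROWS
    )
    # lw (0010) and beq (0110) both have bit 0 clear, whatever funct is.
    non_r_ok = all(
        operation0(0, *bits) == 0 for bits in product((0, 1), repeat=4)
    )
    return r_type_ok and non_r_ok
```

Only the `or` and `slt` rows have bit 0 set, which is why the equation has exactly two product terms.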

Control Signals

Signal	Deasserted	Asserted
RegDst	write to rt	write to rd
RegWrite		register write
ALUSrc	second register file output	sign-extended immediate
PCSrc	PC + 4	branch target address
MemRead		memory read
MemWrite		memory write
MemtoReg	write back the ALU output	write back the memory output
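
Four of these signals are simply 2-to-1 multiplexer selects. A minimal sketch of those selections, with hypothetical function names, where a truthy signal means asserted:

```python
def select_write_register(reg_dst, rt, rd):
    """RegDst: deasserted writes to rt, asserted writes to rd."""
    return rd if reg_dst else rt


def select_alu_operand_b(alu_src, read_data2, sign_ext_imm):
    """ALUSrc: second register file output or sign-extended immediate."""
    return sign_ext_imm if alu_src else read_data2


def select_next_pc(pc_src, pc_plus_4, branch_target):
    """PCSrc: PC + 4 or the branch target address."""
    return branch_target if pc_src else pc_plus_4


def select_write_data(mem_to_reg, alu_result, mem_data):
    """MemtoReg: write back the ALU output or the memory output."""
    return mem_data if mem_to_reg else alu_result
```

RegWrite, MemRead and MemWrite are enables rather than selects, so they are not modelled here.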

Pipelining

Multiple tasks run simultaneously, each in a different stage
Tasks must be independent of each other
Does not help the latency of a single task
Helps the throughput of the whole workload
Potential speedup = number of pipeline stages
Pipeline rate is limited by the slowest pipeline stage
Unbalanced stage lengths reduce the speedup
Have to ensure no overlap
Pipelined clock cycle is limited by the slowest stage; single-cycle clock is limited by the sum of all stages
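
A small calculation makes the speedup claims concrete. This is a sketch under idealised assumptions (no hazards or stalls), and the stage latencies used in the example are illustrative numbers, not taken from these notes.

```python
def single_cycle_time(stage_latencies, n_instructions):
    """Single-cycle design: each instruction takes the sum of all stages."""
    return sum(stage_latencies) * n_instructions


def pipelined_time(stage_latencies, n_instructions):
    """Ideal pipeline: clock is set by the slowest stage.

    The first instruction needs one cycle per stage; every later
    instruction completes one cycle after the previous one.
    """
    cycle = max(stage_latencies)
    return (len(stage_latencies) + n_instructions - 1) * cycle


def speedup(stage_latencies, n_instructions):
    """Ratio of single-cycle time to ideal pipelined time."""
    return (single_cycle_time(stage_latencies, n_instructions)
            / pipelined_time(stage_latencies, n_instructions))


# Illustrative latencies in ps for a 5-stage pipeline (assumed values).
unbalanced = [200, 100, 200, 200, 100]
balanced = [200, 200, 200, 200, 200]
```

With the unbalanced latencies, the speedup for a long instruction stream approaches 800 / 200 = 4 rather than 5. With balanced stages it approaches the number of stages. For a single instruction the pipeline gives no gain at all.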

Structural Hazards

https://stackoverflow.com/a/77893282
