Unit 4: Processor Fundamentals | 现代数学启蒙

Test:

1. https://zvjqzp.form.top/f/FTfDgn (20 fill-in-the-blank questions)

Processor Fundamentals

1.1. Central Processing Unit (CPU) Architecture

Von Neumann model: Von Neumann realized data & programs are indistinguishable and can, therefore, use the same memory. Von Neumann's architecture uses a single processor. It follows a linear sequence of fetch–decode–execute operations for the set of instructions, i.e., the program. To do this, the processor uses registers.

Registers: the smallest and fastest units of storage in a microprocessor; data can be transferred quickly between registers.

General purpose registers: used to temporarily store data values that have been read from memory or produced by processing. Assembly language instructions can use them.

Special purpose registers: some are accessible by assembly language instructions. Each holds either data or a memory address, not both. Special purpose registers include:

Program Counter (PC): holds the address of the next instruction to be fetched
Memory Data Register (MDR): holds data value fetched from memory
Memory Address Register (MAR): Holds the address of the memory cell of the program which is to be accessed
Accumulator (ACC): holds all values that are processed by arithmetic & logical operations.
Index Register (IX): Stores a number used to change an address value
Current Instruction Register (CIR): Once program instruction is fetched, it is stored in CIR and allows the processor to decode & execute it
Status Register: holds flags that record the results of comparisons, so that later actions can be decided, as well as flags for intermediate and erroneous results of arithmetic (e.g. carry, overflow)

The Processor (CPU)

Arithmetic and Logic Unit (ALU): part of the processor that processes instructions which require some form of arithmetic or logical operation

Control Unit (CU): part of the CPU that fetches instructions from memory, decodes them & synchronizes operations before sending signals to the computer’s memory, ALU, and I/O devices to direct how to respond to instructions sent to the processor

Immediate Access Store (IAS): memory unit that the processor can directly access

System Clock: a timing device connected to a processor that synchronises all components.

Buses

A set of parallel wires that allows the transfer of data between components in a computer system.

Data bus: A bidirectional bus that carries data and instructions between the processor, memory, and I/O devices.

Address bus: A unidirectional bus that carries the address of the main memory location or input/output device about to be used, from the memory address register (MAR) in the processor to memory or the I/O device.

Control bus: A bidirectional bus that transmits control signals between the control unit and the other components, ensuring that access to and use of the data and address buses by components of the system does not lead to conflict.

Factors Affecting Computer System Performance

Clock Speed

Definition: The number of pulses the clock sends out in a given time interval, which determines the number of cycles the CPU executes in that interval.

Measurement: Usually measured in Gigahertz (GHz).

Impact: If the clock speed is increased, then the execution time for instructions decreases. Hence, more cycles per unit time, which increases performance. However, there is a limit on clock speed since the heat generated by higher clock speeds cannot be removed fast enough, which leads to overheating.
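As a rough illustration of the relationship above, the sketch below estimates how long a fixed number of clock cycles takes at two clock speeds. The function name and the figures are illustrative assumptions, and the model ignores everything except the clock.

```python
def execution_time_seconds(cycles: int, clock_hz: float) -> float:
    """Time taken to complete `cycles` clock cycles at `clock_hz` pulses per second."""
    return cycles / clock_hz

# The same 3 billion cycles at two hypothetical clock speeds.
slow = execution_time_seconds(3_000_000_000, 2e9)  # 2 GHz
fast = execution_time_seconds(3_000_000_000, 3e9)  # 3 GHz
print(slow, fast)  # 1.5 1.0
```

The higher clock speed completes the same work in less time, which is the performance gain described above.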

Bus Width

Definition: The number of lines in a bus, which determines the number of bits that can be transferred simultaneously.

Impact: Increasing the bus width increases the number of bits transferred simultaneously, which increases processing speed and performance.
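To make the effect of bus width concrete, this small sketch counts how many transfers are needed to move a block of bits over buses of different widths. The function name and the example sizes are assumptions chosen for illustration.

```python
import math

def transfers_needed(data_bits: int, bus_width: int) -> int:
    """Number of bus transfers needed to move `data_bits` over a bus `bus_width` lines wide."""
    return math.ceil(data_bits / bus_width)

print(transfers_needed(64, 16))  # 4
print(transfers_needed(64, 32))  # 2
```

Doubling the width halves the number of transfers in this example, which is why a wider bus can improve performance.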

Cache Memory

Usage: Commonly used instructions are stored in the cache memory area of the CPU.

Impact: If the cache memory size is increased, more commonly executed instructions can be stored, and the need for the CPU to wait for instructions to be loaded reduces. Hence, the CPU executes more cycles per unit of time, thus improving performance.

Number of Cores

Definition: Most CPU chips are multi-core, i.e. they have more than one core (each core is essentially a processor).

Impact: Each core can process different instructions at the same time, so several threads or programs can run in parallel, improving computer performance.

Ports

Definition: Hardware that provides a physical interface between the computer (the device containing the CPU) and a peripheral device.

Types:

Universal Serial Bus (USB): Can connect both input and output devices to the processor through a USB port.
High Definition Multimedia Interface (HDMI): Can only connect output devices (e.g., an LCD monitor) to the processor through an HDMI port. HDMI cables transmit high-bandwidth, high-resolution video and audio streams through HDMI ports.
Video Graphics Array (VGA): Can only connect output devices (e.g., second monitor/display) to the processor through a VGA port. VGA ports allow only the transmission of video streams but not audio components.

Fetch-Execute (F-E) cycle

Fetch stage

Program Counter (PC) holds the address of the next instruction to be fetched.

The address in the PC is copied to the Memory Address Register (MAR).

The PC is incremented.

The instruction at the address held in the MAR is loaded into the Memory Data Register (MDR).

The instruction in the MDR is copied to the Current Instruction Register (CIR).

Decode stage

The opcode and operand parts of the instruction are identified.

Execute stage

The instruction is executed: the control unit sends control signals to the components involved.

Register Transfer Notation (RTN)

MAR ← [PC]

PC ← [PC] + 1

MDR ← [[MAR]]

CIR ← [MDR]

Decode

Execute

Return to start

Square brackets indicate the value currently in that register.

Double square brackets indicate the CPU is getting the value stored at the address in the register.
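The register transfers above can be traced in a short simulation. This is a minimal sketch, not a model of any real processor: the mnemonics LDM, ADD, STO and END and the tuple encoding of instructions are assumptions made for the example. Note that instructions and data sit in the same memory, as in the Von Neumann model.

```python
def run(memory):
    """Run a program held in `memory` until END; return the final ACC value."""
    PC, ACC = 0, 0
    while True:
        # Fetch stage
        MAR = PC               # MAR <- [PC]
        PC = PC + 1            # PC  <- [PC] + 1
        MDR = memory[MAR]      # MDR <- [[MAR]]
        CIR = MDR              # CIR <- [MDR]
        # Decode stage: identify the opcode and operand parts
        opcode, operand = CIR
        # Execute stage
        if opcode == "LDM":    # load the operand itself into ACC
            ACC = operand
        elif opcode == "ADD":  # add the contents of the given address to ACC
            ACC = ACC + memory[operand]
        elif opcode == "STO":  # store ACC at the given address
            memory[operand] = ACC
        elif opcode == "END":
            return ACC

program = [
    ("LDM", 5),     # address 0
    ("ADD", 4),     # address 1
    ("STO", 5),     # address 2
    ("END", None),  # address 3
    10,             # address 4: data
    0,              # address 5: result is stored here
]
result = run(program)
print(result, program[5])  # 15 15
```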

Interrupts

Definition: A signal from a device or program seeking the processor’s attention.

Interrupt Service Routine (ISR):

Handles the interrupt by controlling the processor.

Different ISRs used for different sources of interrupt.

A typical sequence of actions when an interrupt occurs:

The processor checks the interrupt register for an interrupt at the end of the F-E cycle for the current instruction.

If the interrupt flag is set in the interrupt register, the source of the interrupt is identified.

If the interrupt is low priority, it is not serviced yet and the current program continues.

If the interrupt is high priority:

Lower-priority interrupts are disabled.

The contents of all registers of the running process are saved on the stack.

The PC is loaded with the start address of the ISR, and the ISR is executed.

Once the ISR is completed, the processor restores the registers’ contents from the stack, and the interrupted program continues its execution.

Interrupts are re-enabled and the processor returns to the start of the cycle.
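The sequence above can be sketched as a loop that checks for interrupts after each instruction. This is a simplified illustration under stated assumptions: instructions and ISRs are modelled as Python functions, the priority numbers and the `current_priority` threshold are hypothetical, and calling the ISR function stands in for loading the PC with the ISR address.

```python
def run_with_interrupts(program, pending, isr_table, current_priority=1):
    """Execute `program` (a list of functions), servicing interrupts between instructions."""
    registers = {"PC": 0, "ACC": 0}
    stack, log = [], []
    while registers["PC"] < len(program):
        instruction = program[registers["PC"]]
        registers["PC"] += 1
        instruction(registers)  # F-E cycle for one instruction
        # End of the cycle: look for interrupts with high enough priority
        waiting = [i for i in pending if i["priority"] > current_priority]
        if waiting:
            interrupt = max(waiting, key=lambda i: i["priority"])
            pending.remove(interrupt)
            stack.append(dict(registers))  # save register contents on the stack
            isr_table[interrupt["source"]](registers, log)  # run the matching ISR
            registers = stack.pop()  # restore registers and resume the program
    return registers, log

def add_one(regs):
    regs["ACC"] += 1

def keyboard_isr(regs, log):
    regs["ACC"] = 99  # the ISR may overwrite registers while it runs
    log.append("keyboard handled")

pending = [{"source": "keyboard", "priority": 5}, {"source": "timer", "priority": 0}]
final, log = run_with_interrupts([add_one, add_one, add_one], pending,
                                 {"keyboard": keyboard_isr})
print(final["ACC"], log)  # 3 ['keyboard handled']
```

Because the registers are restored from the stack, the interrupted program finishes with ACC equal to 3 even though the ISR overwrote ACC. The low-priority timer interrupt is left pending.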

1.2. Assembly Language

Assembly language: A low-level programming language with instructions made up of an opcode and an operand.

Machine code: Code written in binary that uses the processor’s basic machine operations.

Relationship between machine and assembly language: Every assembly language instruction (source code) translates into exactly one machine code instruction (object code).

Symbolic addressing:

Symbols used to represent operation codes.

Labels can be used for addresses.

Absolute addressing: A fixed address in memory.

Assembler:

Definition: Software that translates assembly language into machine code that the processor can execute.

The assembler replaces all mnemonics and labels with their respective binary values (the binary values of the mnemonics are predefined in the assembler software).

One pass assembler:

Converts mnemonic source code into machine code in a single pass through the program.

Cannot handle code that involves forward referencing.

Two pass assembler:

First pass: A symbol table is created, recording each symbolic address and label together with the specific address it refers to. Errors are not reported during this pass.

Second pass: Symbolic addresses, such as the targets of jump instructions, are looked up in the symbol table. The whole source code is translated into machine code. Errors are reported if any exist.
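The two passes can be sketched as follows. This is a toy illustration, not a real assembler: the `label: OPCODE operand` syntax, the mnemonics, and the output as (opcode, operand) pairs rather than binary are all assumptions made for the example.

```python
def assemble(lines):
    """Two-pass sketch: returns (symbol_table, list of (opcode, operand) pairs)."""
    # First pass: record the address of every label in the symbol table.
    symbol_table = {}
    for address, line in enumerate(lines):
        if ":" in line:
            label = line.split(":")[0].strip()
            symbol_table[label] = address
    # Second pass: replace symbolic operands with addresses and report errors.
    output = []
    for address, line in enumerate(lines):
        instruction = line.split(":")[-1].split()
        opcode = instruction[0]
        operand = instruction[1] if len(instruction) > 1 else None
        if operand in symbol_table:
            operand = symbol_table[operand]
        elif operand is not None and operand.isdigit():
            operand = int(operand)
        elif operand is not None:
            raise ValueError(f"line {address}: undefined label {operand!r}")
        output.append((opcode, operand))
    return symbol_table, output

source = [
    "JMP skip",      # forward reference: 'skip' is defined two lines later
    "ADD 20",
    "skip: STO 21",
    "END",
]
table, code = assemble(source)
print(table)  # {'skip': 2}
print(code)   # [('JMP', 2), ('ADD', 20), ('STO', 21), ('END', None)]
```

The forward reference to `skip` on the first line is resolved only because the first pass has already built the symbol table. A one-pass assembler would not yet know the address at that point.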

Grouping the Processor’s Instruction Set

(# denotes immediate addressing; B denotes a binary number, e.g. B01001010; & denotes a hexadecimal number, e.g. &4A)
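The notation above can be decoded mechanically. The following sketch, with a hypothetical function name, assumes that an operand with no B or & prefix is a denary number.

```python
def parse_operand(text: str):
    """Return (is_immediate, value) for operands such as '#B01001010', '#&4A' or '74'."""
    immediate = text.startswith("#")
    digits = text[1:] if immediate else text
    if digits.startswith("B"):
        value = int(digits[1:], 2)   # binary
    elif digits.startswith("&"):
        value = int(digits[1:], 16)  # hexadecimal
    else:
        value = int(digits)          # assumed denary
    return immediate, value

print(parse_operand("#B01001010"))  # (True, 74)
print(parse_operand("#&4A"))        # (True, 74)
print(parse_operand("74"))          # (False, 74)
```

B01001010 and &4A are the same value, 74 in denary, written in two different bases.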

Modes of Addressing

Direct addressing: the operand is the address of the data; the contents of that address are loaded into ACC.

Indirect addressing: the address to be used is held at the given address; the contents of this second address are loaded into ACC.

Indexed addressing: the address to be used is formed by adding the contents of the Index Register (IX) to the base address.

Relative addressing: the next instruction to be carried out is an offset number of locations away from the address of the current instruction held in the PC; this allows for relocatable code.

Conditional jump: a condition is checked and the jump is taken only if it is met (like using an IF statement).

Unconditional jump: no condition is checked; execution simply jumps to the instruction specified.
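The difference between the addressing modes is easiest to see by loading from the same operand under each mode. This is a minimal sketch: the `load` function, the mode names as strings, and the memory contents are assumptions made for illustration.

```python
def load(mode, operand, memory, IX=0):
    """Return the value that would be loaded into ACC under the given addressing mode."""
    if mode == "immediate":
        return operand                  # the operand is the value itself
    if mode == "direct":
        return memory[operand]          # contents of the given address
    if mode == "indirect":
        return memory[memory[operand]]  # the given address holds the address to use
    if mode == "indexed":
        return memory[operand + IX]     # base address plus the contents of IX
    raise ValueError(f"unknown mode: {mode}")

memory = {100: 102, 101: 7, 102: 55, 103: 9}
print(load("immediate", 100, memory))     # 100
print(load("direct", 100, memory))        # 102
print(load("indirect", 100, memory))      # 55
print(load("indexed", 100, memory, IX=3)) # 9
```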

1.3. Bit Manipulation

Binary numbers can be multiplied or divided by shifting:

Left shift (LSL #n): Bits are shifted to the left to multiply.

E.g. to multiply by four, all digits shift two places to the left.

Right shift (LSR #n): Bits are shifted to the right to divide.

E.g. to divide by four, all digits shift two places to the right.

Logical shift: zeros replace the vacated bit position.

Arithmetic shift: Used to carry out multiplication and division of signed integers represented by bits in the accumulator by ensuring that the sign-bit (usually the MSB) is the same after the shift.

Cyclic shift: the bit that is removed from one end by the shift is added to the other end.
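The three kinds of shift can be compared on an 8-bit value. The sketch below assumes an 8-bit register and two's complement for signed values; the function names are illustrative.

```python
BITS = 8
MASK = (1 << BITS) - 1  # 0b11111111, keeps results within 8 bits

def logical_shift_left(value, n):
    return (value << n) & MASK        # zeros fill the vacated right-hand bits

def logical_shift_right(value, n):
    return (value & MASK) >> n        # zeros fill the vacated left-hand bits

def arithmetic_shift_right(value, n):
    sign = value & (1 << (BITS - 1))  # isolate the sign bit (MSB)
    for _ in range(n):
        value = (value >> 1) | sign   # the sign bit is kept after each shift
    return value & MASK

def cyclic_shift_left(value, n):
    n %= BITS
    return ((value << n) | (value >> (BITS - n))) & MASK  # bits leaving the left re-enter on the right

print(format(logical_shift_left(0b00000101, 2), "08b"))      # 00010100 (5 * 4 = 20)
print(format(logical_shift_right(0b00010100, 2), "08b"))     # 00000101 (20 / 4 = 5)
print(format(arithmetic_shift_right(0b11111000, 2), "08b"))  # 11111110 (-8 / 4 = -2)
print(format(cyclic_shift_left(0b10000001, 1), "08b"))       # 00000011
```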

Bit Masking

Each bit can represent an individual flag. Therefore, by altering the bits, flags could be operated upon.

Bit manipulation operations:

Masking: an operation that defines which bits you want to keep and which bits you want to clear.

Masking to 1: The OR operation is used with a 1.

Masking to 0: The AND operation is used with a 0.

Matching: an operation that allows the accumulator to compare the value it contains to the given value in order to change the state of the status register.
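A short sketch of the operations above, using Python's bitwise operators on an 8-bit flag value. The function names and the example bit patterns are assumptions chosen for illustration.

```python
def mask_to_1(value: int, mask: int) -> int:
    """OR with a mask: every position holding 1 in the mask is forced to 1."""
    return value | mask

def mask_to_0(value: int, mask: int) -> int:
    """AND with a mask: every position holding 0 in the mask is forced to 0."""
    return value & mask

def matches(value: int, pattern: int) -> bool:
    """Matching: compare the whole value with a given pattern."""
    return value == pattern

flags = 0b00100110
print(format(mask_to_1(flags, 0b00000001), "08b"))  # 00100111
print(format(mask_to_0(flags, 0b11111011), "08b"))  # 00100010
print(matches(flags, 0b00100110))                   # True
```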

Practical applications of Bit Masking:

Setting an individual bit position:

To set a bit to 1, OR the content of the register with a mask pattern that has 1 in the position to set and 0 elsewhere.
To clear a bit to 0, AND the content of the register with a mask pattern that has 0 in the ‘mask out’ positions and 1 in the ‘retain’ positions. The mask can be read from a direct address.

Testing one or more bits:

Mask the content of the register with a mask pattern that has 0 in the ‘mask out’ positions and 1 in the ‘retain’ positions.
Compare the result with the match pattern by using the CMP command or by “checking the pattern”.

Checking the pattern:

Use the AND operation to mask bits and obtain the result.
Subtract the matching bit pattern from the result.
A zero result confirms that the patterns are the same; a non-zero result confirms that they are not.
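The “checking the pattern” steps can be sketched as a single function. The function name and the example register contents are assumptions made for illustration.

```python
def bits_match(register: int, mask: int, pattern: int) -> bool:
    """True if the bits of `register` selected by `mask` equal `pattern`."""
    result = register & mask      # AND keeps only the 'retain' positions
    return result - pattern == 0  # a zero difference means the patterns are the same

register = 0b10110100
print(bits_match(register, 0b11110000, 0b10110000))  # True
print(bits_match(register, 0b11110000, 0b10100000))  # False
```

Here the mask keeps the upper four bits of the register, so only those bits are compared with the pattern.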

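Processor Fundamentals

1.1. Central Processing Unit (CPU) Architecture

Von Neumann model: Von Neumann recognised that data and programs are indistinguishable and can therefore share the same memory. The Von Neumann architecture uses a single processor that carries out fetch, decode and execute operations in linear sequence on the set of instructions, i.e. the program. The processor uses registers to do this.

Registers: the smallest storage units in a microprocessor; they allow fast data transfer between registers.

General purpose registers: used to temporarily store data values read from memory or the results of processing. Assembly language instructions can use them.

Special purpose registers: some can be accessed by assembly language instructions. Each holds either data or a memory address, not both. Special purpose registers include:

Program Counter (PC): holds the address of the next instruction to be fetched.
Memory Data Register (MDR): holds the data value fetched from memory.
Memory Address Register (MAR): holds the address of the memory cell to be accessed.
Accumulator (ACC): holds all values processed by arithmetic and logical operations.
Index Register (IX): holds a number used to modify an address value.
Current Instruction Register (CIR): once a program instruction has been fetched, it is stored in the CIR so that the processor can decode and execute it.
Status Register: holds the results of comparisons, used to decide later actions, as well as intermediate and erroneous results of arithmetic operations.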

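The Processor (CPU)

Arithmetic and Logic Unit (ALU): the part of the processor that handles instructions requiring some form of arithmetic or logical operation.

Control Unit (CU): the part of the CPU that fetches instructions from memory, decodes them and synchronises operations, then sends signals to the computer’s memory, ALU and I/O devices to direct how they respond to the instructions.

Immediate Access Store (IAS): the memory unit that the processor can access directly.

System Clock: a timing device connected to the processor that synchronises all components.

Buses

A set of parallel wires that allows data to be transferred between components in a computer system.

Data bus: a bidirectional bus that carries data and instructions between the processor, memory and I/O devices.

Address bus: a unidirectional bus that carries the address of the main memory location or input/output device about to be used, from the memory address register (MAR) in the processor to memory or the I/O device.

Control bus: a bidirectional bus that carries control signals from the control unit, ensuring that components of the system do not conflict when using the data and address buses.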

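Factors Affecting Computer System Performance

Clock Speed

Definition: the number of pulses the clock sends out in a given time interval, which determines the number of cycles the CPU executes in that interval.

Measurement: usually measured in gigahertz (GHz).

Impact: if the clock speed is increased, the execution time for instructions decreases. More cycles are completed per unit time, which improves performance. However, clock speed is limited because the heat generated at higher clock speeds cannot be removed quickly enough, which leads to overheating.

Bus Width

Definition: the number of lines in a bus, which determines the number of bits that can be transferred simultaneously.

Impact: increasing the bus width increases the number of bits transferred simultaneously, which improves processing speed and performance.

Cache Memory

Usage: commonly used instructions are stored in the cache memory area of the CPU.

Impact: if the cache memory size is increased, more commonly used instructions can be stored, reducing the time the CPU waits for instructions to be loaded. The CPU therefore executes more cycles per unit time, which improves performance.

Number of Cores

Definition: most CPU chips are multi-core, i.e. they have more than one core (each core is essentially a processor).

Impact: each core can process different instructions at the same time, improving computer performance.

Ports

Definition: hardware that provides a physical connection between the computer (the device containing the CPU) and a peripheral device.

Types:

Universal Serial Bus (USB): can connect both input and output devices to the processor through a USB port.
High Definition Multimedia Interface (HDMI): can only connect output devices (e.g. an LCD monitor) to the processor through an HDMI port. HDMI cables transmit high-bandwidth, high-resolution video and audio streams through HDMI ports.
Video Graphics Array (VGA): can only connect output devices (e.g. a second monitor) to the processor through a VGA port. VGA ports transmit video streams only, not audio.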

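Fetch-Execute (F-E) cycle

Fetch stage

The Program Counter (PC) holds the address of the next instruction to be fetched.

The address in the PC is copied to the Memory Address Register (MAR).

The PC is incremented.

The instruction at the address held in the MAR is loaded into the Memory Data Register (MDR).

The instruction in the MDR is loaded into the Current Instruction Register (CIR).

Decode stage

The opcode and operand parts of the instruction are identified.

Execute stage

The instruction is executed by the control unit sending control signals.

Register Transfer Notation (RTN)

MAR ← [PC]

PC ← [PC] + 1

MDR ← [[MAR]]

CIR ← [MDR]

Decode

Execute

Return to start

Square brackets indicate the value currently in that register.

Double square brackets indicate that the CPU is getting the value stored at the address held in the register.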

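Interrupts

Definition: a signal from a device or program seeking the processor’s attention.

Interrupt Service Routine (ISR):

Handles the interrupt by controlling the processor.

Different ISRs are used for different sources of interrupt.

A typical sequence of actions when an interrupt occurs:

The processor checks the interrupt register for an interrupt at the end of the F-E cycle for the current instruction.

If the interrupt flag is set in the interrupt register, the source of the interrupt is identified.

If the interrupt is low priority, it is not serviced yet and the current program continues.

If the interrupt is high priority:

Lower-priority interrupts are disabled.

The contents of all registers of the running process are saved on the stack.

The PC is loaded with the start address of the ISR, and the ISR is executed.

Once the ISR is completed, the processor restores the registers’ contents from the stack, and the interrupted program continues its execution.

Interrupts are re-enabled and the processor returns to the start of the cycle.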

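1.2. Assembly Language

Assembly language: a low-level programming language whose instructions consist of an opcode and an operand.

Machine code: code written in binary that uses the processor’s basic machine operations.

Relationship between assembly language and machine code: every assembly language instruction (source code) corresponds to exactly one machine code instruction (object code).

Symbolic addressing:

Symbols are used to represent operation codes.

Labels can be used to represent addresses.

Absolute addressing: a fixed address in memory.

Assembler:

Definition: software that translates assembly language into machine code that the processor can execute.

The assembler replaces all mnemonics and labels with their respective binary values (the binary values of the mnemonics are predefined in the assembler software).

One pass assembler:

Converts mnemonic source code into machine code in a single pass through the program.

Cannot handle code that involves forward referencing.

Two pass assembler:

First pass: a symbol table is created, recording each symbolic address and label together with the specific address it refers to. Errors are not reported during this pass.

Second pass: symbolic addresses, such as the targets of jump instructions, are looked up in the symbol table. The whole source code is translated into machine code. Errors are reported if any exist.

Grouping the Processor’s Instruction Set

(See the original figure.)

(# denotes immediate addressing; B denotes a binary number, e.g. B01001010; & denotes a hexadecimal number, e.g. &4A)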

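Modes of Addressing

Direct addressing: the contents of the given address are loaded into the accumulator (ACC).

Indirect addressing: the address to be used is held at the given address; the contents of this second address are loaded into ACC.

Indexed addressing: the address to be used is formed by adding the contents of the Index Register (IX) to the base address.

Relative addressing: the next instruction to be carried out is an offset number of locations away from the current instruction; this allows for relocatable code.

Conditional jump: a condition is checked (similar to an IF statement).

Unconditional jump: no condition is needed; execution jumps directly to the instruction specified.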

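1.3. Bit Manipulation

Binary numbers can be multiplied or divided by shifting:

Left shift (LSL #n): bits are shifted to the left to multiply.

E.g. to multiply by 4, all digits shift two places to the left.

Right shift (LSR #n): bits are shifted to the right to divide.

E.g. to divide by 4, all digits shift two places to the right.

Logical shift: zeros replace the vacated bit positions.

Arithmetic shift: used to multiply and divide signed integers represented by bits in the accumulator, ensuring that the sign bit (usually the MSB) is the same after the shift.

Cyclic shift: the bit removed from one end by the shift is added at the other end.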

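Bit Masking

Supplementary reading:

A bitmask is a technique for selecting specific bits when operating on binary numbers. It is typically used for the following operations:

Set bits: the bitwise OR operation can set certain bits to 1. For example:

Given the number 0010, to set the lowest (rightmost) bit to 1, OR it with the mask 0001; the result is 0011.

Clear bits: the bitwise AND operation with a mask can clear certain bits to 0. For example:

Given the number 0111, to clear the lowest bit to 0, AND it with the mask 1110; the result is 0110.

Toggle bits: the bitwise XOR operation can flip the value of certain bits. For example:

Given the number 0101, to toggle the lowest bit, XOR it with the mask 0001; the result is 0100.

Applications of bitmasks:

Setting and clearing specific bits

Permission management systems (e.g. file system permissions)

Status flag management

This is an efficient and widely used bit manipulation technique, especially in embedded systems and low-level programming.

Each bit can represent an individual flag. Therefore, flags can be operated on by altering the bits.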

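Bit manipulation operations:

Masking: an operation that defines which bits to keep and which bits to clear.

Masking to 1: the OR operation is used with a 1.

Masking to 0: the AND operation is used with a 0.

Matching: an operation that lets the accumulator compare the value it contains with a given value in order to change the state of the status register.

Practical applications of bit masking:

Setting an individual bit position:

To set a bit to 1, OR the content of the register with a mask pattern that has 1 in the position to set and 0 elsewhere.
To clear a bit to 0, AND the content of the register with a mask pattern that has 0 in the ‘mask out’ positions and 1 in the ‘retain’ positions. The mask can be read from a direct address.

Testing one or more bits:

Mask the content of the register with a mask pattern that has 0 in the ‘mask out’ positions and 1 in the ‘retain’ positions.
Compare the result with the match pattern by using the CMP command or by “checking the pattern”.

Checking the pattern:

Use the AND operation to mask bits and obtain the result.
Subtract the matching bit pattern from the result.
A zero result confirms that the patterns are the same; a non-zero result confirms that they are not.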