1. Information Representation
1.1 Data Representation
Binary Magnitudes and Prefixes:
- Binary Magnitudes: Computers represent all data using only two digits, 0 and 1, so quantities of data are naturally measured in powers of 2.
- Binary Prefixes (kibi, mebi, gibi, etc.) are used to indicate powers of 2, which are more suitable for binary systems, whereas decimal prefixes (kilo, mega, giga, etc.) refer to powers of 10. For example:
- Kilo (k) in the decimal system = 10^3 = 1000.
- Kibi (Ki) in the binary system = 2^10 = 1024.
- Mebi (Mi) in the binary system = 2^20 = 1,048,576.
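The gap between decimal and binary prefixes can be checked with a few lines of Python. This is a minimal sketch; the dictionary names and the helper `bytes_to_units` are illustrative, not part of any standard library.

```python
# Decimal (SI) prefixes are powers of 10; binary (IEC) prefixes are powers of 2.
DECIMAL_PREFIXES = {"kilo": 10**3, "mega": 10**6, "giga": 10**9}
BINARY_PREFIXES = {"kibi": 2**10, "mebi": 2**20, "gibi": 2**30}

def bytes_to_units(n_bytes):
    """Return the size expressed in megabytes (10^6) and mebibytes (2^20)."""
    return n_bytes / DECIMAL_PREFIXES["mega"], n_bytes / BINARY_PREFIXES["mebi"]

mb, mib = bytes_to_units(5 * 2**20)  # the number of bytes in exactly 5 MiB
print(f"{mb:.3f} MB, {mib:.3f} MiB")  # 5.243 MB, 5.000 MiB
```

The same number of bytes reads as a larger figure in decimal units, which is why the two prefix families should not be mixed.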
Different Number Systems:
- Binary (Base-2): Numbers are expressed with 0 and 1. For example, 1010 in binary is 10 in decimal.
- Decimal (Base-10): The everyday number system.
- Hexadecimal (Base-16): Uses digits 0-9 and letters A-F (A=10, B=11, ..., F=15). Hexadecimal is often used in programming for a more compact representation of binary numbers.
- BCD (Binary Coded Decimal): Each decimal digit is represented by a fixed number of binary digits (usually 4 bits per digit). For example, the decimal number 59 would be written in BCD as 0101 1001.
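The same value can be shown in each of these systems with Python's built-in formatters. The helper `to_bcd` below is a hypothetical name written for this sketch; it assumes a non-negative integer.

```python
def to_bcd(n: int) -> str:
    """Encode a non-negative integer as BCD: one 4-bit group per decimal digit."""
    return " ".join(format(int(digit), "04b") for digit in str(n))

value = 59
print(bin(value))     # 0b111011   (pure binary)
print(hex(value))     # 0x3b       (hexadecimal)
print(to_bcd(value))  # 0101 1001  (BCD: 5 -> 0101, 9 -> 1001)
```

Note that the BCD form differs from the pure binary form: BCD encodes each decimal digit separately rather than the number as a whole.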
Binary Addition and Subtraction:
- Binary addition is similar to decimal addition, but it only involves 0 and 1. Carrying over occurs when the sum exceeds 1.
- Example: 1 + 1 = 10 (carry the 1).
- Binary subtraction follows similar rules, and you may need to borrow, similar to decimal subtraction.
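Column-by-column addition with a carry can be sketched as follows. The function name `add_binary` is illustrative, and the inputs are assumed to be strings containing only 0 and 1.

```python
def add_binary(a: str, b: str) -> str:
    """Add two binary strings column by column, right to left, with a carry."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)  # pad the shorter number with leading zeros
    carry = 0
    digits = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        total = int(bit_a) + int(bit_b) + carry
        digits.append(str(total % 2))  # digit written in this column
        carry = total // 2             # carry into the next column
    if carry:
        digits.append("1")
    return "".join(reversed(digits))

print(add_binary("1", "1"))        # 10
print(add_binary("1011", "0110"))  # 10001  (11 + 6 = 17)
```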
Character Data Representation:
- ASCII (American Standard Code for Information Interchange): A standard that represents text characters using 7 bits, allowing for 128 possible characters.
- Extended ASCII: Uses 8 bits to represent characters, allowing for 256 characters.
- Unicode: A character encoding standard that covers virtually every writing system, as well as symbols and emojis. Depending on the encoding used, a character can take up to 32 bits (4 bytes).
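The difference between these character sets shows up in the code point and in how many bytes a character needs. The sketch below uses UTF-8 and UTF-32 as example Unicode encodings, which is an assumption beyond the text above; `describe` is a hypothetical helper.

```python
def describe(ch: str) -> tuple:
    """Return (code point, bytes needed in UTF-8, bytes needed in UTF-32)."""
    return ord(ch), len(ch.encode("utf-8")), len(ch.encode("utf-32-be"))

# "A", e-acute, the euro sign, and an emoji, written as escapes for portability.
for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    code_point, utf8_len, utf32_len = describe(ch)
    in_ascii = code_point < 128  # 7-bit ASCII covers code points 0..127
    print(f"U+{code_point:04X}: ASCII={in_ascii}, UTF-8={utf8_len} bytes, UTF-32={utf32_len} bytes")
```

Only the first character fits in 7-bit ASCII; the emoji needs the full 4 bytes (32 bits) in both encodings.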
Overflow in Binary Arithmetic:
- Overflow happens when the result of an arithmetic operation exceeds the number of bits allocated for storage. For example, in an 8-bit system, adding 11111111 and 00000001 results in 100000000, but the overflow causes data loss as only the lower 8 bits can be stored.
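Python integers do not overflow on their own, so the 8-bit limit has to be simulated. A minimal sketch, assuming unsigned values and using a bit mask to keep only the lower 8 bits (the names `BITS`, `MASK`, and `add_8bit` are illustrative):

```python
BITS = 8
MASK = (1 << BITS) - 1  # 0b11111111, keeps only the lower 8 bits

def add_8bit(a: int, b: int):
    """Add two unsigned values, returning (stored result, overflow flag)."""
    full = a + b          # the true sum, which may need 9 bits
    stored = full & MASK  # what an 8-bit register would actually hold
    return stored, full > MASK

stored, overflowed = add_8bit(0b11111111, 0b00000001)
print(format(stored, "08b"), overflowed)  # 00000000 True
```

The ninth bit of 100000000 is discarded, so the stored result is 00000000.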
Conversion Between Number Bases:
- Converting between binary, decimal, and hexadecimal can be done using division and multiplication, or using lookup tables for direct conversion. For example:
- To convert from binary to decimal: multiply each bit by its place value and add the results. Example: 1010 (binary) = 1*2^3 + 0*2^2 + 1*2^1 + 0*2^0 = 10 (decimal).
- To convert from decimal to binary: divide the number by 2 repeatedly, record the remainders, and read them from last to first. Example: 10 (decimal) = 1010 (binary).
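Both conversion methods can be written out by hand. The function names below are illustrative, and the inputs are assumed to be valid (a string of 0s and 1s, and a non-negative integer).

```python
def binary_to_decimal(bits: str) -> int:
    """Sum each bit times its place value (a power of 2)."""
    total = 0
    for position, bit in enumerate(reversed(bits)):
        total += int(bit) * 2**position
    return total

def decimal_to_binary(n: int) -> str:
    """Divide by 2 repeatedly; the remainders, read last to first, are the bits."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        remainders.append(str(n % 2))
        n //= 2
    return "".join(reversed(remainders))

print(binary_to_decimal("1010"))  # 10
print(decimal_to_binary(10))      # 1010
print(format(0b1010, "X"))        # A  (hexadecimal, via the built-in formatter)
```

In practice the built-ins `int(text, 2)`, `bin()`, and `hex()` do the same job; the manual versions are here only to mirror the steps described above.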
1.2 Multimedia
Graphics:
- Bitmapped Images: These images are made up of individual pixels, and each pixel has a color value stored in binary form. Common formats are BMP, PNG, and JPEG.
- File size of a bitmap image is calculated by multiplying the width and height (in pixels) by the number of bits per pixel (color depth).
- Formula: File Size = Width * Height * Color Depth. This gives the uncompressed size in bits; divide by 8 for bytes.
- Vector Graphics: These images are made up of objects (like lines, curves, shapes) defined by mathematical formulas. They are resolution-independent and can be scaled without losing quality. Formats include SVG.
- Bitmap vs. Vector Graphics: Use bitmap images for complex, detailed images (photos) and vector images for simpler, scalable graphics (logos, icons).
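The bitmap file size formula can be applied directly. This sketch covers only the raw pixel data and assumes no compression and no file header; `bitmap_size_bytes` is a hypothetical helper name.

```python
def bitmap_size_bytes(width: int, height: int, color_depth_bits: int) -> float:
    """Raw pixel data size: width * height * colour depth, converted from bits to bytes."""
    return width * height * color_depth_bits / 8

size = bitmap_size_bytes(1920, 1080, 24)
print(size)          # 6220800.0 bytes
print(size / 2**20)  # about 5.93 MiB
```

Formats such as PNG and JPEG compress the pixel data, so real files are usually smaller than this figure.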
Sound:
- Sound is represented digitally by sampling an analogue signal at regular intervals and converting each sample into a digital value. This process is called digitization.
- Sampling Rate: The number of samples per second. A higher sampling rate means more accurate sound but results in larger file sizes. Common sampling rates are 44.1 kHz (CD quality) or 48 kHz.
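- Bit Depth: The number of bits used to represent each sample. Higher bit depth improves sound quality because each sample is stored more precisely, which reduces quantization noise.
- File Size: The size of a sound file depends on the sampling rate, bit depth, and length of the sound: File Size (in bits) = Sampling Rate * Bit Depth * Duration in seconds.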
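The dependence of sound file size on sampling rate, bit depth, and duration can be sketched as a simple product. The `channels` parameter (1 for mono, 2 for stereo) is an assumption added here, and the result covers uncompressed sample data only.

```python
def sound_size_bytes(sample_rate_hz: int, bit_depth: int, seconds: float, channels: int = 1) -> float:
    """Uncompressed audio size: rate * bit depth * duration * channels, in bytes."""
    return sample_rate_hz * bit_depth * seconds * channels / 8

# One minute of CD-quality stereo audio: 44.1 kHz, 16 bits per sample, 2 channels.
print(sound_size_bytes(44_100, 16, 60, channels=2))  # 10584000.0 bytes
```

Doubling the sampling rate or the bit depth doubles the size, which is the trade-off between quality and storage described above.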
1.3 Compression
Need for Compression:
- Compression is used to reduce the size of files for easier storage or transmission, especially for large files like images, sound, and videos.
Lossy vs. Lossless Compression:
- Lossy Compression: Reduces file size by removing some data permanently. It is used when some loss in quality is acceptable (e.g., JPEG for images, MP3 for sound).
- Lossless Compression: Reduces file size without losing any data. The original file can be perfectly reconstructed (e.g., PNG for images, ZIP for files).
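The difference between the two kinds of compression can be illustrated in a few lines. The lossless half uses Python's standard `zlib` module (an implementation of DEFLATE, the method commonly used inside ZIP files). The lossy half is only a toy: it drops the low 4 bits of each sample value, and is not how JPEG or MP3 actually work.

```python
import zlib

# Lossless: the decompressed data is identical to the original.
original = b"AAAABBBCCDAA" * 100
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)
print(len(original), len(compressed), restored == original)

# Lossy (toy example): discard the low 4 bits of each sample value.
samples = [200, 201, 203, 198]
lossy = [(s >> 4) << 4 for s in samples]
print(lossy)  # [192, 192, 192, 192]  the original values cannot be recovered
```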
Compression Methods:
- Run-Length Encoding (RLE): A simple compression technique that works by encoding repeated sequences of data as a single value and count. For example, the string AAAABBBCCDAA could be compressed as 4A3B2C1D2A.
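A minimal sketch of run-length encoding in the count-then-character form used in the example above. The function names are illustrative, and the input text is assumed to contain no digits, since digits would be confused with the counts.

```python
from itertools import groupby
import re

def rle_encode(text: str) -> str:
    """Replace each run of identical characters with '<count><char>'."""
    return "".join(f"{len(list(group))}{char}" for char, group in groupby(text))

def rle_decode(encoded: str) -> str:
    """Expand each '<count><char>' pair back into a run of characters."""
    pairs = re.findall(r"(\d+)(\D)", encoded)
    return "".join(char * int(count) for count, char in pairs)

encoded = rle_encode("AAAABBBCCDAA")
print(encoded)              # 4A3B2C1D2A
print(rle_decode(encoded))  # AAAABBBCCDAA
```

RLE is lossless, but it only saves space when the data contains long runs: a string with no repeats, such as ABCD, becomes 1A1B1C1D, which is longer than the input.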
Key Terms:
- Pixel: The smallest unit of a digital image.
- File Header: Contains metadata about the file, such as its size and format.
- Image Resolution: The number of pixels in an image, typically described by width x height (e.g., 1920x1080).
- Colour Depth/Bit Depth: The number of bits used to represent the color of each pixel.
- Sampling Rate: The frequency at which an analogue signal is sampled in digital audio.
- Sampling Resolution: The number of bits used to represent each sample in digital audio.
This section of the curriculum focuses on understanding how digital data is represented, manipulated, and compressed in various media formats, including text, images, and sound. It also emphasizes the importance of compression techniques in managing data efficiently while balancing quality and file size.
- Author: 现代数学启蒙
- Link: https://www.math1234567.com/informationRepresentation
- License: This article is licensed under CC BY-NC-SA 4.0. Please credit the source when reposting.