Basic Tensor Operations in PyTorch

Ghlerrix | 2025/4/2 | pytorch | PyTorch

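Types

| Data type | Alias | Description |
| --- | --- | --- |
| `torch.float32` | `torch.float` | 32-bit floating point (the default dtype); suitable for most computations |
| `torch.float64` | `torch.double` | 64-bit floating point; for high-precision scientific computing |
| `torch.float16` | `torch.half` | 16-bit half-precision floating point (5 exponent bits + 10 mantissa bits); saves GPU memory at the cost of precision |
| `torch.bfloat16` | - | 16-bit Brain floating point (8 exponent bits + 7 mantissa bits); keeps the exponent range of float32, so it suits values spanning a wide range |
| `torch.complex32` | `torch.chalf` | 32-bit complex (16-bit real part + 16-bit imaginary part) |
| `torch.complex64` | `torch.cfloat` | 64-bit complex (32-bit real part + 32-bit imaginary part) |
| `torch.complex128` | `torch.cdouble` | 128-bit complex (64-bit real part + 64-bit imaginary part) |
| `torch.uint8` | - | 8-bit unsigned integer (0 to 255); common for image data |
| `torch.int8` | - | 8-bit signed integer (-128 to 127) |
| `torch.int16` | `torch.short` | 16-bit signed integer |
| `torch.int32` | `torch.int` | 32-bit signed integer |
| `torch.int64` | `torch.long` | 64-bit signed integer (commonly used for indexing) |
| `torch.bool` | - | Boolean (True/False); nonzero values are treated as `True` |
| `torch.quint8` | - | Unsigned 8-bit quantized integer; used when deploying quantized models |
| `torch.qint8` | - | Signed 8-bit quantized integer; used for low-precision inference |
| `torch.qint32` | - | 32-bit quantized integer; used for accumulation in quantized computation |
| `torch.quint4x2` | - | 4-bit quantized integer (stored as an 8-bit signed integer); only supported by the `EmbeddingBag` operator |
| `torch.float8_e4m3fn` | - | 8-bit floating point (4 exponent bits + 3 mantissa bits); experimental, with limited operator support |
| `torch.float8_e5m2` | - | 8-bit floating point (5 exponent bits + 2 mantissa bits); experimental, with a larger range |
| `torch.uint16` | - | 16-bit unsigned integer (limited support; exists mainly for use with `torch.compile`) |
| `torch.uint32` | - | 32-bit unsigned integer (limited support; not recommended for everyday use) |
| `torch.uint64` | - | 64-bit unsigned integer (limited support; not recommended for everyday use) |
| `torch.FloatTensor` | `torch.float32` | Legacy tensor type: 32-bit floating-point tensor on the CPU |
| `torch.cuda.FloatTensor` | `torch.float32` | Legacy tensor type: 32-bit floating-point tensor on the GPU |
| `torch.DoubleTensor` | `torch.float64` | Legacy tensor type: 64-bit floating-point tensor on the CPU |
| `torch.cuda.DoubleTensor` | `torch.float64` | Legacy tensor type: 64-bit floating-point tensor on the GPU |
| `torch.HalfTensor` | `torch.float16` | Legacy tensor type: 16-bit half-precision tensor on the CPU |
| `torch.cuda.HalfTensor` | `torch.float16` | Legacy tensor type: 16-bit half-precision tensor on the GPU |

The last six rows are legacy tensor type names rather than dtypes; for them the Alias column gives the corresponding dtype.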

For more details, see the official documentation: https://pytorch.org/docs/stable/tensors.html#torch-tensor
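The aliases and bit layouts listed in the table can be inspected at runtime. The snippet below is a minimal sketch, assuming a reasonably recent PyTorch in which `torch.finfo` accepts `torch.bfloat16`; the variable names are only illustrative.

```python
import torch

# An alias and its full name refer to the same dtype
print(torch.float == torch.float32)   # True
print(torch.long == torch.int64)      # True

# float16 and bfloat16 are both 16 bits wide but split the bits differently
fp16 = torch.finfo(torch.float16)
bf16 = torch.finfo(torch.bfloat16)
print(fp16.bits, fp16.max, fp16.eps)  # narrower range, finer precision
print(bf16.bits, bf16.max, bf16.eps)  # float32-like range, coarser precision

# Convert an existing tensor to another dtype with .to()
t = torch.tensor([1.5, 2.5])          # float32 by default
t64 = t.to(torch.float64)
print(t.dtype, t64.dtype)
```

Comparing `max` and `eps` makes the trade-off concrete: bfloat16 gives up mantissa bits to keep the exponent range of float32.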

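Creation

Passing concrete data

import torch

# torch.Tensor is an alias for the default tensor type (torch.FloatTensor),
# so the integers below are stored as float32
a = torch.Tensor([1, 2, 3])
print(a)
print('shape:', a.shape)
print('dtype:', a.dtype)

# Output
# tensor([1., 2., 3.])
# shape: torch.Size([3])
# dtype: torch.float32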

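Passing dimensions

# Integer arguments are interpreted as a shape. The memory is not initialized,
# so the printed values are arbitrary and differ from run to run
a = torch.Tensor(1, 2, 3)
print(a)
print('shape:', a.shape)
print('dtype:', a.dtype)

# Output (the values are arbitrary)
# tensor([[[3.1757e-31, 4.5808e-41, 3.1757e-31],
#          [4.5808e-41, 1.8459e-14, 0.0000e+00]]])
# shape: torch.Size([1, 2, 3])
# dtype: torch.float32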
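Because `torch.Tensor` treats a list as data but bare integers as a shape, it is easy to misread. The following sketch contrasts it with `torch.tensor` and `torch.empty`, which state the intent explicitly; it assumes the default floating-point type has not been changed from float32.

```python
import torch

# torch.tensor infers the dtype from the data
a = torch.tensor([1, 2, 3])
print(a.dtype)        # torch.int64

# torch.Tensor uses the default floating-point type regardless of the input
b = torch.Tensor([1, 2, 3])
print(b.dtype)        # torch.float32

# torch.empty makes the "uninitialized memory of this shape" intent explicit
c = torch.empty(1, 2, 3)
print(c.shape)        # torch.Size([1, 2, 3])

# An explicit dtype avoids relying on inference
d = torch.tensor([1, 2, 3], dtype=torch.float32)
print(d)              # tensor([1., 2., 3.])
```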

Some special tensors

a = torch.ones(2, 3)
print(a)
print('shape:', a.shape)
print('dtype:', a.dtype)

# Output
#tensor([[1., 1., 1.],
#        [1., 1., 1.]])
#shape: torch.Size([2, 3])
#dtype: torch.float32

a = torch.zeros(2, 3)
print(a)
print('shape:', a.shape)
print('dtype:', a.dtype)

# Output
#tensor([[0., 0., 0.],
#        [0., 0., 0.]])
#shape: torch.Size([2, 3])
#dtype: torch.float32

a = torch.eye(3, 3)
print(a)
print('shape:', a.shape)
print('dtype:', a.dtype)

# Output
#tensor([[1., 0., 0.],
#        [0., 1., 0.],
#        [0., 0., 1.]])
#shape: torch.Size([3, 3])
#dtype: torch.float32

a = torch.arange(1, 10, 2)
print(a)
print('shape:', a.shape)
print('dtype:', a.dtype)

# Output
#tensor([1, 3, 5, 7, 9])
#shape: torch.Size([5])
#dtype: torch.int64

a = torch.linspace(1, 10, 5)
print(a)
print('shape:', a.shape)
print('dtype:', a.dtype)

# Output
#tensor([ 1.0000,  3.2500,  5.5000,  7.7500, 10.0000])
#shape: torch.Size([5])
#dtype: torch.float32
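The factory functions above differ in a few details that are easy to miss: `arange` excludes its end point while `linspace` includes it, and most factories accept a `dtype` argument. A short sketch, with variable names chosen only for illustration:

```python
import torch

# arange excludes the end point; linspace includes it
r = torch.arange(0, 10, 2)
lin = torch.linspace(0, 10, 6)
print(r.tolist())      # [0, 2, 4, 6, 8]
print(lin.tolist())    # six evenly spaced values from 0 to 10

# The dtype can be chosen at creation time
ones_int = torch.ones(2, 3, dtype=torch.int64)

# *_like factories copy the shape and dtype of an existing tensor
z = torch.zeros_like(ones_int)
print(z.shape, z.dtype)   # torch.Size([2, 3]) torch.int64

# full fills a tensor with an arbitrary constant
f = torch.full((2, 3), 7.0)
print(f)
```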

Random

# Uniformly distributed random numbers in [0, 1)
a = torch.rand(2, 3)
print(a)
print('shape:', a.shape)
print('dtype:', a.dtype)

# Output
#tensor([[0.7363, 0.5903, 0.4994],
#        [0.9168, 0.9666, 0.8203]])
#shape: torch.Size([2, 3])
#dtype: torch.float32


# Random numbers drawn from the standard normal distribution N(0, 1)
a = torch.randn(2, 3)
print(a)
print('shape:', a.shape)
print('dtype:', a.dtype)

# Output
#tensor([[ 0.6687,  0.6134,  1.7500],
#        [ 1.1684, -0.5670, -1.7086]])
#shape: torch.Size([2, 3])
#dtype: torch.float32


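# Normal distribution. mean and std may be tensors, in which case the output shape is determined by them (normally they have the same shape).
# When only one of them is a tensor, the other may be omitted: mean defaults to 0 and std defaults to 1.
# Here mean=1 and std=2 are plain numbers, so the output shape is given by the third argument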
a = torch.normal(1, 2, (2, 3))
print(a)
print('shape:', a.shape)
print('dtype:', a.dtype)

# Output
#tensor([[ 2.5269,  2.6123,  0.0952],
#        [ 1.7483, -1.4747,  3.7354]])
#shape: torch.Size([2, 3])
#dtype: torch.float32
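The outputs above change on every run. The sketch below shows one way to make random draws repeatable with `torch.manual_seed`, and also illustrates the tensor-valued form of `torch.normal` described in the comments; the seed value and tensor contents are arbitrary choices.

```python
import torch

# Seeding the global generator makes random draws repeatable
torch.manual_seed(0)
first = torch.rand(2, 3)
torch.manual_seed(0)
second = torch.rand(2, 3)
print(torch.equal(first, second))   # True

# Tensor-valued mean and std: one distribution per element,
# and the output takes its shape from them
mean = torch.tensor([0.0, 10.0, 100.0])
std = torch.tensor([1.0, 1.0, 1.0])
sample = torch.normal(mean, std)
print(sample.shape)                 # torch.Size([3])

# Integer draws from [low, high)
ints = torch.randint(0, 10, (2, 3))
print(ints.dtype)                   # torch.int64
```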

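Computation

Basic math operations

Addition

x = torch.tensor([1.0, 2.0, 3.0])
y = torch.tensor([3.0, 2.0, 1.0])

# Method 1: the `+` operator
z1 = x + y
print(z1)

# Output
#tensor([4., 4., 4.])

# Method 2: the `torch.add` function
z2 = torch.add(x, y)
print(z2)

# Output
#tensor([4., 4., 4.])

# Method 3: the in-place operation `add_`
x.add_(y)  # x itself is modified
print(x)

# Output
#tensor([4., 4., 4.])

Note that the third method modifies x directly, whereas the other two return a new tensor and leave x unchanged.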
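The difference between the in-place and out-of-place forms can be observed through the address of the underlying storage. This is a small sketch using `data_ptr()`; the name `alias` is only there to show that every reference to the tensor sees an in-place change.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0])
y = torch.tensor([3.0, 2.0, 1.0])
alias = x                      # a second name for the same tensor
ptr_before = x.data_ptr()      # address of the underlying storage

z = x + y                      # out-of-place: new tensor, x untouched
print(x)                       # tensor([1., 2., 3.])
print(z.data_ptr() == ptr_before)   # False

x.add_(y)                      # in-place: writes into x's own storage
print(x.data_ptr() == ptr_before)   # True
print(alias)                   # tensor([4., 4., 4.])
```

In-place operations save an allocation, but they also change the value seen by every other reference to the same tensor, so they should be used deliberately.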

Subtraction (same pattern as addition, so not repeated in detail)

# Method 1: the `-` operator
z1 = x - y

# Method 2: the `torch.sub` function
z2 = torch.sub(x, y)

# Method 3: the in-place operation `sub_`
x.sub_(y)

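Multiplication (element-wise)

# Start from the values left by the in-place addition example
x = torch.tensor([4.0, 4.0, 4.0])
y = torch.tensor([3.0, 2.0, 1.0])

# Method 1: the `*` operator
z1 = x * y
print(z1)

# Output
#tensor([12.,  8.,  4.])

# Method 2: the `torch.mul` function
z2 = torch.mul(x, y)
print(z2)

# Output
#tensor([12.,  8.,  4.])

# Method 3: the in-place operation `mul_`
x.mul_(y)
print(x)

# Output
#tensor([12.,  8.,  4.])

Note that this is element-wise multiplication: the numbers at corresponding positions are multiplied together.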
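To make the distinction concrete, the sketch below puts element-wise multiplication next to the dot product of the same two vectors. The values are chosen to match the example above.

```python
import torch

x = torch.tensor([4.0, 4.0, 4.0])
y = torch.tensor([3.0, 2.0, 1.0])

elementwise = x * y            # tensor([12.,  8.,  4.]), same shape as the inputs
dot = torch.dot(x, y)          # tensor(24.), a 0-dim tensor
via_matmul = x @ y             # for 1-D inputs, @ is also the dot product

print(elementwise, dot, via_matmul)
print(elementwise.sum() == dot)   # tensor(True)
```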

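Division (element-wise, same pattern as multiplication)

# Method 1: the `/` operator (true division)
z1 = x / y

# Method 2: the `torch.div` function
z2 = torch.div(x, y)

# Method 3: the in-place operation `div_`
x.div_(y)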
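With integer tensors, `/` performs true division and returns floats. When floor or truncating division is wanted, `torch.div` accepts a `rounding_mode` argument. A minimal sketch, assuming a PyTorch version that supports `rounding_mode`:

```python
import torch

a = torch.tensor([7, -7])
b = torch.tensor([2, 2])

true_div = a / b                                     # true division
floor_div = torch.div(a, b, rounding_mode='floor')   # rounds toward -infinity
trunc_div = torch.div(a, b, rounding_mode='trunc')   # rounds toward zero

print(true_div.tolist(), true_div.dtype)     # [3.5, -3.5] torch.float32
print(floor_div.tolist(), floor_div.dtype)   # [3, -4] torch.int64
print(trunc_div.tolist(), trunc_div.dtype)   # [3, -3] torch.int64
```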

Powers

# Method 1: the power operator `**`
z1 = x ** 2

# Method 2: `torch.pow`
z2 = torch.pow(x, 2)

Square root

# Method 1: `**`
z1 = x ** 0.5

# Method 2: `torch.sqrt`
z2 = torch.sqrt(x)

Logarithms

# Method 1: `torch.log` (natural logarithm)
z1 = torch.log(x)

# Method 2: `torch.log2` (base 2)
z2 = torch.log2(x)

# Method 3: `torch.log10` (base 10)
z3 = torch.log10(x)
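The relationships between these functions can be checked numerically. The sketch below uses `torch.allclose` because floating-point results agree only up to rounding error; the input values are arbitrary.

```python
import torch

x = torch.tensor([1.0, 4.0, 9.0])

# sqrt and ** 0.5 agree up to floating-point error
print(torch.allclose(torch.sqrt(x), x ** 0.5))        # True

# torch.log is the natural logarithm; exp undoes it
print(torch.allclose(torch.exp(torch.log(x)), x))     # True

# log2 differs from log only by a constant factor
two = torch.tensor(2.0)
print(torch.allclose(torch.log2(x), torch.log(x) / torch.log(two)))   # True

# Non-positive inputs do not raise; they produce -inf or nan
bad = torch.log(torch.tensor([0.0, -1.0]))
print(bad)                                            # tensor([-inf, nan])
```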

Matrix operations

Matrix multiplication

A = torch.tensor([[1, 2], [3, 4]])
B = torch.tensor([[5, 6], [7, 8]])

# Method 1: the `@` operator
C1 = A @ B

# Method 2: `torch.matmul`
C2 = torch.matmul(A, B)

# Method 3: `torch.mm` (2-D matrices only)
C3 = torch.mm(A, B)
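The example above does not print its result, so here is a sketch that shows it, contrasts it with element-wise `*`, and illustrates the "2-D only" restriction of `torch.mm`. The batch shapes and the `mm_failed` flag are illustrative.

```python
import torch

A = torch.tensor([[1, 2], [3, 4]])
B = torch.tensor([[5, 6], [7, 8]])

C = A @ B
print(C.tolist())          # [[19, 22], [43, 50]]
print((A * B).tolist())    # element-wise instead: [[5, 12], [21, 32]]

# matmul and @ also handle leading batch dimensions
batch_a = torch.randn(10, 2, 3)
batch_b = torch.randn(10, 3, 4)
print((batch_a @ batch_b).shape)    # torch.Size([10, 2, 4])

# torch.mm accepts only 2-D inputs and rejects the batched case
try:
    torch.mm(batch_a, batch_b)
    mm_failed = False
except RuntimeError:
    mm_failed = True
print(mm_failed)                    # True
```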

Transpose

# Method 1: `.T`
A_T1 = A.T

# Method 2: `torch.transpose`
A_T2 = torch.transpose(A, 0, 1)

Inverse

A = torch.tensor([[1.0, 2.0], [3.0, 4.0]])

# `torch.inverse` (an alias for `torch.linalg.inv`); requires a square, invertible, floating-point matrix
A_inv = torch.inverse(A)
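A quick way to sanity-check an inverse is to multiply it back and compare against the identity. The sketch below does that, and also shows `torch.linalg.solve` as an alternative when the real goal is solving a linear system; the right-hand side `b` is a made-up example.

```python
import torch

A = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
A_inv = torch.linalg.inv(A)
print(A_inv)                         # approximately [[-2.0, 1.0], [1.5, -0.5]]

# A matrix times its inverse is the identity, up to rounding error
identity = torch.eye(2)
print(torch.allclose(A @ A_inv, identity, atol=1e-5))   # True

# To solve A x = b, solving directly avoids forming the inverse
b = torch.tensor([[5.0], [6.0]])
x = torch.linalg.solve(A, b)
print(torch.allclose(A @ x, b, atol=1e-4))              # True
```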

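Dimension operations

Changing shape

x = torch.randn(2, 3, 4)
print(x.size())

# Output
#torch.Size([2, 3, 4])

# Method 1: `.view()` (returns a view sharing the same data; needs a compatible memory layout)
x1 = x.view(6, 4)
print(x1.size())

# Output
#torch.Size([6, 4])

# Method 2: `.reshape()` (returns a view when possible, otherwise a copy)
x2 = x.reshape(6, 4)
print(x2.size())

# Output
#torch.Size([6, 4])

# Method 3: `.resize_()` (in-place; a low-level method, `view` or `reshape` is usually preferable)
x.resize_(6, 4)
print(x.size())

# Output
#torch.Size([6, 4])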
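`view` and `reshape` give the same result on a freshly created tensor, but they behave differently once the memory is no longer contiguous. The following sketch shows the difference on a small transposed tensor; the `view_failed` flag is only there to record the outcome.

```python
import torch

t = torch.arange(6).view(2, 3)      # [[0, 1, 2], [3, 4, 5]]

# A view shares storage with its source
v = t.view(6)
v[0] = 100
print(t[0, 0])                      # tensor(100)

# Transposing yields a non-contiguous tensor
tt = t.t()
print(tt.is_contiguous())           # False

# view cannot reinterpret this non-contiguous memory as a flat tensor
try:
    tt.view(6)
    view_failed = False
except RuntimeError:
    view_failed = True
print(view_failed)                  # True

# reshape falls back to a copy when a view is impossible
flat = tt.reshape(6)
print(flat.tolist())                # [100, 3, 1, 4, 2, 5]

# -1 asks PyTorch to infer one dimension
print(t.reshape(-1, 2).shape)       # torch.Size([3, 2])
```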

Swapping dimensions

transpose (swap two dimensions)

Swaps the positions of two specified dimensions. It applies to tensors with two or more dimensions.

x = torch.rand(2, 3, 4)
print("Original shape:", x.shape)  # torch.Size([2, 3, 4])

# Swap dim 0 and dim 1
y = torch.transpose(x, 0, 1)
print("After swapping dims (0, 1):", y.shape)  # torch.Size([3, 2, 4])

# Swap dim 1 and dim 2
z = torch.transpose(x, 1, 2)
print("After swapping dims (1, 2):", z.shape)  # torch.Size([2, 4, 3])

permute (reorder all dimensions)

Rearranges all dimensions of a tensor into a specified order, which makes it more flexible than transpose.

x = torch.rand(2, 3, 4)
print("Original shape:", x.shape)  # torch.Size([2, 3, 4])

# Change the order of the dimensions
y = x.permute(1, 2, 0)
print("Result of permute(1, 2, 0):", y.shape)  # torch.Size([3, 4, 2])

z = x.permute(2, 1, 0)
print("Result of permute(2, 1, 0):", z.shape)  # torch.Size([4, 3, 2])
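A common use of `permute` is converting between image layouts. The sketch below assumes a hypothetical image stored as height x width x channel and moves the channel dimension first; it also shows that `permute` returns a view whose strides change while the data stays in place.

```python
import torch

# Hypothetical image in height x width x channel layout
hwc = torch.rand(4, 5, 3)

# Move channels first: (H, W, C) -> (C, H, W)
chw = hwc.permute(2, 0, 1)
print(chw.shape)                          # torch.Size([3, 4, 5])

# permute returns a view: same storage, different strides
print(chw.data_ptr() == hwc.data_ptr())   # True
print(hwc.stride(), chw.stride())         # (15, 3, 1) (1, 15, 3)
print(chw.is_contiguous())                # False

# .contiguous() copies the data into the new order when needed
chw_c = chw.contiguous()
print(chw_c.is_contiguous())              # True

# transpose is the two-dimension special case of permute
print(torch.equal(hwc.transpose(0, 2), hwc.permute(2, 1, 0)))   # True
```

Operations that require contiguous memory, such as `view`, may need the `.contiguous()` call after a `permute`.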

Adding or removing dimensions

unsqueeze (add a dimension)

Inserts a new dimension of size 1 at the specified dim.

x = torch.tensor([1, 2, 3])
print("Original shape:", x.shape)  # torch.Size([3])

# Insert a dimension at position 0
y = x.unsqueeze(0)
print("Inserted at dim=0:", y.shape)  # torch.Size([1, 3])

# Insert a dimension at position 1
y = x.unsqueeze(1)
print("Inserted at dim=1:", y.shape)  # torch.Size([3, 1])

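squeeze (remove dimensions)

Removes dimensions of size 1 from the shape and leaves all other dimensions untouched. If a dim is specified and its size is not 1, the tensor is returned unchanged.

x = torch.rand(1, 3, 1, 4)
print("Original shape:", x.shape)  # torch.Size([1, 3, 1, 4])

# Without a dim argument, remove every dimension of size 1
y = x.squeeze()
print("Shape after removing all size-1 dims:", y.shape)  # torch.Size([3, 4])

# Remove dim 0 only
y = x.squeeze(0)
print("Shape after removing dim 0:", y.shape)  # torch.Size([3, 1, 4])

# Remove dim 2 only
y = x.squeeze(2)
print("Shape after removing dim 2:", y.shape)  # torch.Size([1, 3, 4])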
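A few edge cases of `squeeze` and `unsqueeze` are worth seeing once. The sketch below covers squeezing a dimension that is not of size 1, negative indices, and the `None` indexing shorthand.

```python
import torch

x = torch.rand(1, 3, 1, 4)

# Squeezing a dimension whose size is not 1 leaves the tensor unchanged
print(x.squeeze(1).shape)          # torch.Size([1, 3, 1, 4])

# Negative indices count from the end
v = torch.tensor([1, 2, 3])
print(v.unsqueeze(-1).shape)       # torch.Size([3, 1])

# Indexing with None is equivalent to unsqueeze
print(v[None, :].shape)            # torch.Size([1, 3])
print(v[:, None].shape)            # torch.Size([3, 1])

# unsqueeze followed by squeeze on the same dim is a round trip
round_trip = v.unsqueeze(0).squeeze(0)
print(torch.equal(round_trip, v))  # True
```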

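Broadcasting

What is it?

Broadcasting is a mechanism that automatically expands tensor dimensions so that tensors of different shapes can take part in the same arithmetic operation, without manually replicating data (with expand or repeat). Because the expansion is implicit, no enlarged copy of the smaller tensor has to be stored, which reduces memory use and improves efficiency.

PyTorch follows NumPy's broadcasting rules:
- Compare the dimensions of the two tensors one by one, from right to left.
- Two dimensions are compatible if they are equal or if one of them is 1.
- If two dimensions differ and neither is 1, the tensors cannot be broadcast and an error is raised.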
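
Rules

1. Match from right to left
  When the dimensions of two tensors are aligned, matching starts from the rightmost dimension. For example:
```
A: (2, 3, 4)
B:     (3, 4)  # missing leading dimensions are treated as 1
```
  This is equivalent to:
```
A: (2, 3, 4)
B: (1, 3, 4)  # B implicitly gains a leading 1
```
  B is then broadcast to (2, 3, 4) and the computation can proceed.
2. Dimensions are equal, or one of them is 1
  If two tensors differ in some dimension but one of them has size 1 there, that dimension is broadcast to the larger size.
  Example:
```
A: (3, 1)
B: (1, 4)
```
  - The second dimension of A is 1, so it can be expanded to 4.
  - The first dimension of B is 1, so it can be expanded to 3.
  After broadcasting:
```
A: (3, 4)
B: (3, 4)
```
  Element-wise computation can then be carried out.
3. Dimensions that cannot be matched cause an error
  If some dimension does not match and neither size is 1, an error is raised:
```
A: (3, 2)
B: (4, 3)
```
  The second dimension of A is 2 but the second dimension of B is 3 (the first dimensions, 3 and 4, do not match either), so the tensors cannot be broadcast and an error is raised.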
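The three rules can be tried out directly. The sketch below reproduces each shape example from the list with real tensors; the tensor values are made up, and `torch.broadcast_shapes` is assumed to be available (it exists in recent PyTorch releases).

```python
import torch

# Rule 1: missing leading dimensions are treated as 1
a = torch.ones(2, 3, 4)
b = torch.ones(3, 4)
print((a + b).shape)                       # torch.Size([2, 3, 4])

# Rule 2: size-1 dimensions stretch to match
col = torch.tensor([[0], [10], [20]])      # shape (3, 1)
row = torch.tensor([[1, 2, 3, 4]])         # shape (1, 4)
total = col + row
print(total.tolist())
# [[1, 2, 3, 4], [11, 12, 13, 14], [21, 22, 23, 24]]

# Rule 3: mismatched sizes with neither equal to 1 raise an error
try:
    torch.ones(3, 2) + torch.ones(4, 3)
    broadcast_failed = False
except RuntimeError:
    broadcast_failed = True
print(broadcast_failed)                    # True

# The result shape can be computed without allocating any tensors
result_shape = torch.broadcast_shapes((3, 1), (1, 4))
print(result_shape)                        # torch.Size([3, 4])
```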
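
expand and repeat

Sometimes we want to control the broadcasting of a tensor by hand. Two methods are available:
- expand() creates a new view without copying data (only the shape changes)
- repeat() actually copies the data and increases memory use

expand changes only the shape and does not copy data
```
A = torch.tensor([[1], [2], [3]])  # (3,1)
B = A.expand(3, 4)  # (3,4) only the view is expanded, no extra memory is used
```
- Note: expand only changes the view of the tensor and does not copy data, so it is more efficient. Since several elements of the expanded tensor refer to the same memory location, clone it before writing to it in place.

repeat copies data
```
A = torch.tensor([[1], [2], [3]])  # (3,1)
B = A.repeat(1, 4)  # (3,4) the data is copied
```
repeat() really copies the data and therefore uses additional memory, so expand() is usually the better first choice.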
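The view-versus-copy difference shows up in the strides and storage addresses of the results. This is a small sketch using `stride()` and `data_ptr()`; the names `E` and `R` are illustrative.

```python
import torch

A = torch.tensor([[1], [2], [3]])     # shape (3, 1)

E = A.expand(3, 4)                    # view
R = A.repeat(1, 4)                    # copy

print(torch.equal(E, R))              # True: same values
print(E.stride())                     # (1, 0): stride 0 reuses one element per row
print(R.stride())                     # (4, 1): ordinary contiguous layout
print(E.data_ptr() == A.data_ptr())   # True: shares A's storage
print(R.data_ptr() == A.data_ptr())   # False: owns new storage

# Writing through A is visible in the view but not in the copy
A[0, 0] = 100
print(E[0].tolist())                  # [100, 100, 100, 100]
print(R[0].tolist())                  # [1, 1, 1, 1]

# -1 in expand keeps that dimension's current size
print(A.expand(-1, 2).shape)          # torch.Size([3, 2])
```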
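
When to use it
- When tensors have different shapes that satisfy the broadcasting rules, broadcasting avoids for loops and improves efficiency.
- For example, in deep learning it is commonly used in batch normalization and when computing per-channel means and standard deviations.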
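As an illustration of the per-channel case, the sketch below normalizes a hypothetical batch of shape (N, C, H, W) using broadcasting alone. It covers only the normalization step, not a full batch-normalization layer, and the `eps` value is an assumed small constant for numerical stability.

```python
import torch

torch.manual_seed(0)
x = torch.randn(8, 3, 16, 16)               # hypothetical batch: (N, C, H, W)

# keepdim=True leaves shape (1, C, 1, 1), which broadcasts against x
mean = x.mean(dim=(0, 2, 3), keepdim=True)
var = ((x - mean) ** 2).mean(dim=(0, 2, 3), keepdim=True)
eps = 1e-5                                  # assumed stability constant

normalized = (x - mean) / torch.sqrt(var + eps)
print(mean.shape)                           # torch.Size([1, 3, 1, 1])
print(normalized.shape)                     # torch.Size([8, 3, 16, 16])

# Each channel now has mean close to 0 and mean square close to 1
channel_mean = normalized.mean(dim=(0, 2, 3))
channel_mean_sq = (normalized ** 2).mean(dim=(0, 2, 3))
print(channel_mean, channel_mean_sq)
```

No loop over channels is needed: the (1, C, 1, 1) statistics are stretched across the batch and spatial dimensions automatically.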
