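Chapter 01: Python and the AI Development Environment

Chapter Overview

A craftsman who wants to do good work must first sharpen his tools. This chapter systematically covers the Python fundamentals and the environment setup needed for AI development. Whether you are new to Python or a developer with some programming experience, it will help you build a complete AI development toolchain.

Chapter goals:

Master the core Python libraries essential for AI development
Understand and use the PyTorch deep learning framework
Set up a complete AI development environment
Configure a GPU-accelerated environment
Get to know the commonly used AI libraries and their ecosystem

Study time: 6-8 hours Difficulty: ☆☆☆

Prerequisites:

Basic Python syntax
Basic command-line usage
Basic machine learning concepts (covered in Chapter 00)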

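Section 1: Core Python Libraries

1.1 NumPy - The Foundation of Numerical Computing

NumPy is the cornerstone of scientific computing in Python. It provides an efficient multidimensional array object and a rich set of mathematical functions.

Why NumPy is needed:

# Native Python list vs. NumPy array

import time
import numpy as np

# Python list
python_list = list(range(1000000))
start = time.time()
result = [x * 2 for x in python_list]
print(f"Python list time: {time.time() - start:.4f} s")
# Example output: Python list time: 0.0856 s

# NumPy array
numpy_array = np.arange(1000000)
start = time.time()
result = numpy_array * 2
print(f"NumPy array time: {time.time() - start:.4f} s")
# Example output: NumPy array time: 0.0021 s

# In this run NumPy is roughly 40x faster; exact timings depend on the machine.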
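A single `time.time()` measurement is noisy, so the comparison above is perhaps better treated as an order-of-magnitude estimate. The sketch below repeats the same comparison with the standard-library `timeit` module; the helper names `double_list` and `double_array` and the repeat counts are illustrative choices, not part of the original example.

```python
import timeit
import numpy as np

N = 1_000_000
python_list = list(range(N))
numpy_array = np.arange(N)

def double_list():
    # Pure-Python loop: one interpreter-level multiplication per element.
    return [x * 2 for x in python_list]

def double_array():
    # Vectorized: the loop runs inside NumPy's compiled code.
    return numpy_array * 2

# Both approaches must agree before their speed is compared.
assert double_list()[:5] == double_array()[:5].tolist()

# Take the best of 3 repeats, each averaging 3 calls, to reduce noise.
list_time = min(timeit.repeat(double_list, number=3, repeat=3)) / 3
array_time = min(timeit.repeat(double_array, number=3, repeat=3)) / 3

print(f"list:    {list_time * 1e3:.2f} ms per call")
print(f"array:   {array_time * 1e3:.2f} ms per call")
print(f"speedup: {list_time / array_time:.1f}x")
```

The printed speedup will differ between machines and NumPy builds, which is why no fixed number is shown here.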

Core NumPy concepts:

1. Creating arrays

import numpy as np

# Create from a list
arr1 = np.array([1, 2, 3, 4, 5])
print(arr1)  # [1 2 3 4 5]

# Create a multidimensional array
arr2 = np.array([[1, 2, 3],
                 [4, 5, 6]])
print(arr2.shape)  # (2, 3) - 2 rows, 3 columns

# Create special arrays
zeros = np.zeros((3, 4))      # 3×4 matrix of zeros
ones = np.ones((2, 3))         # 2×3 matrix of ones
identity = np.eye(4)           # 4×4 identity matrix
random = np.random.randn(3, 3) # 3×3 matrix sampled from the standard normal distribution

# Create sequences
arange = np.arange(0, 10, 2)   # [0, 2, 4, 6, 8]
linspace = np.linspace(0, 1, 5) # [0. 0.25 0.5 0.75 1.]

2. Array operations

# Shape operations
arr = np.arange(12)
print(arr.shape)  # (12,)

# Reshape
reshaped = arr.reshape(3, 4)
print(reshaped.shape)  # (3, 4)

# Transpose
transposed = reshaped.T
print(transposed.shape)  # (4, 3)

# Flatten
flattened = reshaped.flatten()
print(flattened.shape)  # (12,)

# Concatenation
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# Vertical stacking
v_concat = np.vstack((a, b))
# [[1 2]
#  [3 4]
#  [5 6]
#  [7 8]]

# Horizontal stacking
h_concat = np.hstack((a, b))
# [[1 2 5 6]
#  [3 4 7 8]]

3. Indexing and slicing

arr = np.arange(10)  # [0 1 2 3 4 5 6 7 8 9]

# Basic indexing
print(arr[0])     # 0
print(arr[-1])    # 9

# Slicing
print(arr[2:5])   # [2 3 4]
print(arr[::2])   # [0 2 4 6 8]
print(arr[::-1])  # [9 8 7 6 5 4 3 2 1 0] (reversed)

# Indexing a multidimensional array
arr2d = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])

print(arr2d[1, 2])    # 6 (row 2, column 3)
print(arr2d[:2, 1:])  # [[2 3]
                       #  [5 6]]

# Boolean indexing
arr = np.array([1, 2, 3, 4, 5])
mask = arr > 3
print(arr[mask])  # [4 5]

# Fancy indexing
indices = [0, 2, 4]
print(arr[indices])  # [1 3 5]

4. Mathematical operations

# Vectorized operations
a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

# Element-wise operations
print(a + b)   # [11 22 33 44]
print(a * b)   # [ 10  40  90 160]
print(a ** 2)  # [ 1  4  9 16]

# Matrix operations
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

# Matrix multiplication
C = A @ B  # or np.dot(A, B)
# [[19 22]
#  [43 50]]

# Broadcasting: the scalar 10 is stretched to match the array's shape
arr = np.array([[1, 2, 3],
                [4, 5, 6]])
print(arr + 10)
# [[11 12 13]
#  [14 15 16]]

# Statistical functions
data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print(f"Mean: {data.mean()}")      # 5.5
print(f"Std: {data.std()}")        # about 2.8723 (population standard deviation, ddof=0)
print(f"Max: {data.max()}")        # 10
print(f"Min: {data.min()}")        # 1
print(f"Sum: {data.sum()}")        # 55
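Broadcasting also works between arrays of different shapes, not only between an array and a scalar. As a minimal sketch under assumed data (the 4×3 matrix `X` below is made up for illustration), the following standardizes each column and normalizes each row; it shows why `keepdims=True` matters for row-wise statistics.

```python
import numpy as np

# Hypothetical data: 4 samples (rows) with 3 features (columns).
X = np.array([[1.0, 10.0, 100.0],
              [2.0, 20.0, 200.0],
              [3.0, 30.0, 300.0],
              [4.0, 40.0, 400.0]])

col_mean = X.mean(axis=0)   # shape (3,): one mean per column
col_std = X.std(axis=0)     # shape (3,): one std per column

# (4, 3) - (3,): the (3,) vector is broadcast across all 4 rows.
X_std = (X - col_mean) / col_std

# Row-wise statistics need keepdims=True so that (4, 1) lines up with (4, 3).
row_sum = X.sum(axis=1, keepdims=True)   # shape (4, 1)
X_row_normalized = X / row_sum

print(X_std.mean(axis=0))            # each column mean is numerically close to 0
print(X_row_normalized.sum(axis=1))  # each row sums to 1
```

Without `keepdims=True`, `row_sum` would have shape `(4,)`, which NumPy would try to align with the last axis of length 3 and reject.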

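5. NumPy in AI

# Example 1: implementing the softmax function
def softmax(x):
    """
    Softmax: converts a 1-D vector of arbitrary real numbers into a probability distribution
    """
    exp_x = np.exp(x - np.max(x))  # subtract the maximum to prevent overflow
    return exp_x / np.sum(exp_x)

logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)
print(f"Softmax output: {probs}")
# Softmax output: approximately [0.659 0.242 0.099]
print(f"Sum of probabilities: {probs.sum()}")
# Sum of probabilities: 1.0 (up to floating-point rounding)

# Example 2: vectorized Euclidean distance
def euclidean_distance(X, Y):
    """
    Compute the Euclidean distance from every point in X to every point in Y
    X: (n, d) - n vectors of dimension d
    Y: (m, d) - m vectors of dimension d
    Returns: (n, m) - distance matrix
    """
    # Broadcasting: (n, 1, d) - (1, m, d) -> (n, m, d); note the O(n*m*d) temporary array
    return np.sqrt(np.sum((X[:, np.newaxis, :] - Y[np.newaxis, :, :]) ** 2, axis=2))

X = np.random.randn(100, 50)  # 100 vectors of dimension 50
Y = np.random.randn(20, 50)   # 20 vectors of dimension 50
distances = euclidean_distance(X, Y)
print(f"Distance matrix shape: {distances.shape}")  # (100, 20)

# Example 3: forward pass of a simple neural network layer
def relu(x):
    """ReLU activation function"""
    return np.maximum(0, x)

# Simulate one network layer
X = np.random.randn(32, 784)  # 32 samples, 784 dimensions each (28×28 image)
W = np.random.randn(784, 128) / np.sqrt(784)  # scale by 1/sqrt(fan_in), a simplified Xavier-style initialization
b = np.zeros(128)

# Forward pass
z = X @ W + b          # linear transform
a = relu(z)            # activation
print(f"Input shape: {X.shape}")    # (32, 784)
print(f"Output shape: {a.shape}")   # (32, 128)
print(f"Activation rate: {(a > 0).mean():.2%}")  # about 50%, because z is symmetric around 0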
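The `softmax` in Example 1 normalizes over the whole array, so it is only correct for a single 1-D vector. For a batch of logits with shape `(batch, classes)`, one possible variant normalizes along one axis. This is a sketch; the name `softmax_batch` and the test values are illustrative assumptions.

```python
import numpy as np

def softmax_batch(logits, axis=-1):
    """Numerically stable softmax along `axis` (hypothetical helper name)."""
    # Subtract the per-row maximum; keepdims keeps the shape broadcastable.
    shifted = logits - np.max(logits, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)

batch_logits = np.array([[2.0, 1.0, 0.1],
                         [1000.0, 1001.0, 1002.0]])  # a naive exp() would overflow on this row
probs = softmax_batch(batch_logits)

print(probs)
print(probs.sum(axis=1))  # every row sums to 1
```

The second row shows why the maximum is subtracted: `np.exp(1000.0)` overflows to `inf`, while the shifted values `[-2, -1, 0]` are harmless.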

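1.2 Pandas - A Powerful Tool for Data Processing

Pandas is the core library for data analysis and processing. It provides the DataFrame data structure and a rich set of data manipulation functions.

1. DataFrame basics

import numpy as np
import pandas as pd

# Create a DataFrame from a dictionary
data = {
    'name': ['Alice', 'Bob', 'Charlie', 'David'],
    'age': [25, 30, 35, 28],
    'city': ['Beijing', 'Shanghai', 'Shenzhen', 'Hangzhou'],
    'salary': [15000, 20000, 18000, 22000]
}
df = pd.DataFrame(data)
print(df)
#       name  age      city  salary
# 0    Alice   25   Beijing   15000
# 1      Bob   30  Shanghai   20000
# 2  Charlie   35  Shenzhen   18000
# 3    David   28  Hangzhou   22000

# Basic information
df.info()             # prints the summary itself, so no print() is needed
print(df.describe())  # summary statistics
print(df.head(2))     # first 2 rows
print(df.tail(2))     # last 2 rows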

2. Selecting and filtering data

# Select columns
print(df['name'])           # single column
print(df[['name', 'age']])  # multiple columns

# Select rows
print(df.iloc[0])      # first row (position-based index)
print(df.loc[0])       # first row (label-based index)
print(df.iloc[0:2])    # first two rows

# Conditional filtering
high_salary = df[df['salary'] > 18000]
print(high_salary)
#     name  age      city  salary
# 1    Bob   30  Shanghai   20000
# 3  David   28  Hangzhou   22000

# Multiple conditions (each condition needs its own parentheses)
young_high_salary = df[(df['age'] < 30) & (df['salary'] > 20000)]
print(young_high_salary)

3. Processing data

# Add a new column
df['bonus'] = df['salary'] * 0.1
print(df)

# Modify a column
df['salary'] = df['salary'] * 1.05  # 5% raise

# Drop a column
df = df.drop('bonus', axis=1)

# Handle missing values
df_with_nan = df.copy()
df_with_nan.loc[0, 'salary'] = np.nan

print(df_with_nan.isnull().sum())  # count missing values per column
# Fill only the salary column with its mean, so other columns are not touched
df_filled = df_with_nan.fillna({'salary': df_with_nan['salary'].mean()})

# Sort
df_sorted = df.sort_values('salary', ascending=False)
print(df_sorted)

# Group and aggregate
city_stats = df.groupby('city')['salary'].agg(['mean', 'count'])
print(city_stats)

4. Reading and writing data

# Read CSV
# df = pd.read_csv('data.csv')

# Read Excel
# df = pd.read_excel('data.xlsx')

# Read JSON
# df = pd.read_json('data.json')

# Save CSV
# df.to_csv('output.csv', index=False)

# Save Excel
# df.to_excel('output.xlsx', index=False)

5. AI data preprocessing example

from sklearn.datasets import load_iris

# Load the Iris dataset
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['species'] = iris.target

print(df.head())
#    sepal length (cm)  sepal width (cm)  ...  petal width (cm)  species
# 0                5.1               3.5  ...               0.2        0
# 1                4.9               3.0  ...               0.2        0
# 2                4.7               3.2  ...               0.2        0

# Data exploration
print(df.groupby('species').mean())

# Feature standardization
# Note: fitting the scaler on the full table lets test rows influence the statistics;
# in real projects it is common to split first and fit the scaler on the training split only.
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
feature_cols = iris.feature_names
df[feature_cols] = scaler.fit_transform(df[feature_cols])

# Train/test split
from sklearn.model_selection import train_test_split
train_df, test_df = train_test_split(df, test_size=0.2, random_state=42)
print(f"Training set size: {len(train_df)}, test set size: {len(test_df)}")
# Training set size: 120, test set size: 30
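The example above fits the scaler on the whole table and splits afterwards. A commonly preferred ordering is to split first and compute the statistics from the training rows only, so that nothing about the test set leaks into preprocessing. The sketch below illustrates that ordering with plain NumPy on synthetic data; the array `X` is a made-up stand-in, not the Iris features.

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.normal(loc=5.0, scale=2.0, size=(150, 4))  # synthetic stand-in for a feature matrix

# Shuffle the row indices and hold out 20% as the test set.
indices = rng.permutation(len(X))
n_test = int(0.2 * len(X))
test_idx, train_idx = indices[:n_test], indices[n_test:]
X_train, X_test = X[train_idx], X[test_idx]

# Fit the statistics on the training rows only...
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)

# ...then apply the same transform to both splits.
X_train_scaled = (X_train - mean) / std
X_test_scaled = (X_test - mean) / std

print(X_train_scaled.shape, X_test_scaled.shape)
print(X_test_scaled.mean(axis=0))  # close to 0, but not exactly 0
```

With scikit-learn the same idea corresponds to calling `fit_transform` on the training split and `transform` on the test split.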

1.3 Matplotlib - Data Visualization

Matplotlib is one of the most widely used plotting libraries in Python, used to create static, animated, and interactive charts.

1. Basic plotting

import matplotlib.pyplot as plt
import numpy as np

# Line plot
x = np.linspace(0, 10, 100)
y = np.sin(x)

plt.figure(figsize=(10, 6))
plt.plot(x, y, label='sin(x)', linewidth=2)
plt.plot(x, np.cos(x), label='cos(x)', linewidth=2)
plt.xlabel('x')
plt.ylabel('y')
plt.title('Trigonometric functions')
plt.legend()
plt.grid(True)
plt.savefig('trig_functions.png')
# plt.show()

# Scatter plot
np.random.seed(42)
x = np.random.randn(100)
y = 2 * x + np.random.randn(100) * 0.5

plt.figure(figsize=(8, 6))
plt.scatter(x, y, alpha=0.6)
plt.xlabel('X')
plt.ylabel('Y')
plt.title('Scatter plot example')
# plt.show()

# Histogram
data = np.random.randn(1000)
plt.figure(figsize=(8, 6))
plt.hist(data, bins=30, edgecolor='black', alpha=0.7)
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.title('Histogram of a normal distribution')
# plt.show()

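2. Multiple subplots

fig, axes = plt.subplots(2, 2, figsize=(12, 10))

# Subplot 1: line plot (uses an ordered x so the line is meaningful)
x_line = np.linspace(0, 10, 100)
axes[0, 0].plot(x_line, np.sin(x_line))
axes[0, 0].set_title('Line plot')

# Subplot 2: scatter plot (reuses x and y from the scatter example above)
axes[0, 1].scatter(x, y)
axes[0, 1].set_title('Scatter plot')

# Subplot 3: bar chart
categories = ['A', 'B', 'C', 'D']
values = [23, 45, 56, 78]
axes[1, 0].bar(categories, values)
axes[1, 0].set_title('Bar chart')

# Subplot 4: pie chart
axes[1, 1].pie(values, labels=categories, autopct='%1.1f%%')
axes[1, 1].set_title('Pie chart')

plt.tight_layout()
# plt.show()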

3. AI visualization examples

# Example 1: visualizing neural network training (simulated loss curves)
epochs = 50
epochs_range = range(epochs)
train_loss = np.exp(-np.linspace(0, 3, epochs)) + np.random.randn(epochs) * 0.05
val_loss = np.exp(-np.linspace(0, 2.5, epochs)) + np.random.randn(epochs) * 0.08

plt.figure(figsize=(10, 6))
plt.plot(epochs_range, train_loss, label='Training loss', linewidth=2)
plt.plot(epochs_range, val_loss, label='Validation loss', linewidth=2)
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Model training progress')
plt.legend()
plt.grid(True, alpha=0.3)
# plt.show()

# Example 2: visualizing a confusion matrix
from sklearn.metrics import confusion_matrix
import seaborn as sns

# Simulated predictions: copy the labels, then overwrite 20 of them at random
y_true = np.random.randint(0, 3, 100)
y_pred = y_true.copy()
y_pred[np.random.choice(100, 20, replace=False)] = np.random.randint(0, 3, 20)

cm = confusion_matrix(y_true, y_pred)
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues')
plt.xlabel('Predicted label')
plt.ylabel('True label')
plt.title('Confusion matrix')
# plt.show()

# Example 3: visualizing a decision boundary
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=100, n_features=2, n_redundant=0,
                           n_informative=2, random_state=42)
clf = SVC(kernel='rbf')
clf.fit(X, y)

# Build a grid covering the feature space
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.02),
                     np.arange(y_min, y_max, 0.02))
Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)

plt.figure(figsize=(10, 8))
plt.contourf(xx, yy, Z, alpha=0.4)
plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors='black')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.title('SVM decision boundary')
# plt.show()
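To make the confusion matrix from Example 2 less of a black box, here is a minimal NumPy sketch of how such a matrix can be counted by hand. The function name `confusion_matrix_np` and the tiny label arrays are illustrative assumptions; the layout (rows are true labels, columns are predicted labels) follows the axis labels used in the heatmap above.

```python
import numpy as np

def confusion_matrix_np(y_true, y_pred, num_classes):
    """Count label pairs: rows are true labels, columns are predicted labels."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    # np.add.at accumulates correctly even when the same (row, col) pair repeats.
    np.add.at(cm, (y_true, y_pred), 1)
    return cm

y_true = np.array([0, 0, 1, 1, 2, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0, 2])

cm = confusion_matrix_np(y_true, y_pred, num_classes=3)
print(cm)

# Correct predictions sit on the diagonal.
accuracy = np.trace(cm) / cm.sum()
print(f"accuracy: {accuracy:.3f}")
```

A plain `cm[y_true, y_pred] += 1` would silently drop repeated pairs, which is why `np.add.at` is used here.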

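Section 2: Deep Learning Frameworks

2.1 PyTorch vs TensorFlow

Comparison table:

Feature	PyTorch	TensorFlow
Developer	Meta (Facebook)	Google
First released	2016	2015
Computation graph	Dynamic graph	Static graph in 1.x; eager (dynamic) execution by default in 2.x
Ease of use	More Pythonic	Somewhat more complex
Debugging	Easy	Harder
Deployment	Harder	Easy (TF Serving)
Community	Mainstream in academia	Mainstream in industry
Learning curve	Gentler	Steeper

Recommendations:

PyTorch: research, prototyping, learning
TensorFlow: production deployment, mobile, large-scale systems

This handbook mainly uses PyTorch because it is better suited to learning and understanding the principles of deep learning.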

2.2 PyTorch Basics

1. Tensors

import torch

# Create a tensor
t1 = torch.tensor([1, 2, 3])
print(t1)  # tensor([1, 2, 3])

# Convert from NumPy (the tensor shares memory with np_array)
import numpy as np
np_array = np.array([1, 2, 3])
t2 = torch.from_numpy(np_array)
print(t2)  # tensor([1, 2, 3])

# Create special tensors
zeros = torch.zeros(2, 3)      # 2×3 tensor of zeros
ones = torch.ones(3, 4)         # 3×4 tensor of ones
rand = torch.rand(2, 2)         # 2×2 uniform random tensor in [0, 1)
randn = torch.randn(3, 3)       # 3×3 standard normal tensor
eye = torch.eye(4)              # 4×4 identity matrix

# Tensor attributes
print(f"Shape: {rand.shape}")                 # torch.Size([2, 2])
print(f"Data type: {rand.dtype}")             # torch.float32
print(f"Device: {rand.device}")               # cpu
print(f"Number of elements: {rand.numel()}")  # 4

2. Tensor operations

# Basic operations
a = torch.tensor([1.0, 2.0, 3.0])
b = torch.tensor([4.0, 5.0, 6.0])

print(a + b)        # tensor([5., 7., 9.])
print(a * b)        # tensor([ 4., 10., 18.])
print(a ** 2)       # tensor([1., 4., 9.])
print(torch.dot(a, b))  # tensor(32.) - dot product

# Matrix operations
A = torch.randn(3, 4)
B = torch.randn(4, 5)
C = A @ B  # matrix multiplication
print(C.shape)  # torch.Size([3, 5])

# Reshape
x = torch.arange(12)
print(x.shape)  # torch.Size([12])
x = x.view(3, 4)  # or x.reshape(3, 4)
print(x.shape)  # torch.Size([3, 4])

# Concatenation
a = torch.ones(2, 3)
b = torch.zeros(2, 3)
c = torch.cat([a, b], dim=0)  # concatenate along rows (vertical)
print(c.shape)  # torch.Size([4, 3])

# Broadcasting
x = torch.ones(3, 1)
y = torch.ones(1, 4)
z = x + y
print(z.shape)  # torch.Size([3, 4])

3. Automatic differentiation

# Automatic gradient computation
x = torch.tensor(2.0, requires_grad=True)
y = x ** 2 + 3 * x + 1

# Backward pass
y.backward()
print(f"dy/dx = {x.grad}")  # 7.0 (2*2 + 3)

# Gradient with respect to multiple variables
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()
y.backward()
print(f"Gradient: {x.grad}")  # tensor([2., 4., 6.])

# Zero the gradient (important: gradients accumulate across backward() calls)
x.grad.zero_()

# Example: linear regression with gradient descent
torch.manual_seed(42)

# Generate data y = 3x + 2 + noise
X = torch.randn(100, 1)
y_true = 3 * X + 2 + torch.randn(100, 1) * 0.1

# Initialize parameters
w = torch.randn(1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)

# Training
learning_rate = 0.1
for epoch in range(100):
    # Forward pass
    y_pred = w * X + b

    # Compute the loss (mean squared error)
    loss = ((y_pred - y_true) ** 2).mean()

    # Backward pass
    loss.backward()

    # Update the parameters (no gradient tracking needed for the update itself)
    with torch.no_grad():
        w -= learning_rate * w.grad
        b -= learning_rate * b.grad

        # Zero the gradients
        w.grad.zero_()
        b.grad.zero_()

    if (epoch + 1) % 20 == 0:
        print(f"Epoch {epoch+1}, Loss: {loss.item():.4f}, w: {w.item():.2f}, b: {b.item():.2f}")

# Example output (exact values may differ between PyTorch versions and platforms):
# Epoch 20, Loss: 0.0123, w: 2.95, b: 1.98
# Epoch 40, Loss: 0.0101, w: 2.98, b: 2.00
# Epoch 60, Loss: 0.0100, w: 2.99, b: 2.01
# Epoch 80, Loss: 0.0100, w: 3.00, b: 2.01
# Epoch 100, Loss: 0.0100, w: 3.00, b: 2.01
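A quick way to build trust in autograd results is to compare them with a finite-difference approximation. The sketch below does this in plain Python for the same function `y = x^2 + 3x + 1` used in the first autograd example; the helper name `numerical_grad` and the step size `eps` are illustrative choices.

```python
def f(x):
    """Same function as in the autograd example: y = x^2 + 3x + 1."""
    return x ** 2 + 3 * x + 1

def numerical_grad(func, x, eps=1e-5):
    """Central-difference approximation of d(func)/dx at x."""
    return (func(x + eps) - func(x - eps)) / (2 * eps)

analytic = 2 * 2.0 + 3           # derivative 2x + 3 evaluated at x = 2
approx = numerical_grad(f, 2.0)  # should agree with the analytic value

print(f"analytic: {analytic}, numerical: {approx:.6f}")
```

PyTorch ships a more thorough version of this idea as `torch.autograd.gradcheck`, which is typically run on double-precision inputs.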

4. Neural network modules

import torch.nn as nn
import torch.nn.functional as F

# Option 1: use nn.Sequential
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 128),
    nn.ReLU(),
    nn.Linear(128, 10)
)

# Option 2: define a custom module
class MyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 256)
        self.fc2 = nn.Linear(256, 128)
        self.fc3 = nn.Linear(128, 10)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

model = MyNet()
print(model)

# Model parameters
total_params = sum(p.numel() for p in model.parameters())
print(f"Total parameters: {total_params:,}")
# Total parameters: 235,146

# Forward pass
x = torch.randn(32, 784)  # batch_size=32
output = model(x)
print(f"Output shape: {output.shape}")  # torch.Size([32, 10])
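The parameter count of 235,146 can be checked by hand: a fully connected layer with `in_features` inputs and `out_features` outputs holds `in_features * out_features` weights plus `out_features` biases. The following plain-Python sketch reproduces the total for the layer widths of `MyNet`; the helper name `linear_params` is an assumption made for illustration.

```python
def linear_params(in_features, out_features, bias=True):
    """Parameters in one fully connected layer: weights plus optional bias."""
    return in_features * out_features + (out_features if bias else 0)

layer_sizes = [784, 256, 128, 10]   # same widths as MyNet above
pairs = list(zip(layer_sizes, layer_sizes[1:]))
per_layer = [linear_params(i, o) for i, o in pairs]

for (i, o), n in zip(pairs, per_layer):
    print(f"Linear({i}, {o}): {n:,} parameters")
print(f"total: {sum(per_layer):,}")
```

Most of the parameters sit in the first layer, which is typical when a wide input such as a flattened image feeds a fully connected network.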

5. Complete training example

# Complete MNIST training example
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# 1. Prepare the data
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])

# Running this for real requires downloading the dataset
# train_dataset = datasets.MNIST('./data', train=True, download=True, transform=transform)
# test_dataset = datasets.MNIST('./data', train=False, transform=transform)
# train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
# test_loader = DataLoader(test_dataset, batch_size=1000, shuffle=False)

# 2. Define the model
class MNISTNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(28*28, 512)
        self.fc2 = nn.Linear(512, 256)
        self.fc3 = nn.Linear(256, 10)
        self.dropout = nn.Dropout(0.2)

    def forward(self, x):
        x = x.view(-1, 28*28)  # flatten
        x = F.relu(self.fc1(x))
        x = self.dropout(x)
        x = F.relu(self.fc2(x))
        x = self.dropout(x)
        x = self.fc3(x)
        # Return raw logits: nn.CrossEntropyLoss applies log_softmax internally,
        # so an extra F.log_softmax here would be redundant
        return x

model = MNISTNet()

# 3. Define the loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# 4. Training function
def train(model, device, train_loader, optimizer, epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)

        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()

        if batch_idx % 100 == 0:
            print(f'Epoch {epoch} [{batch_idx * len(data)}/{len(train_loader.dataset)}]'
                  f' Loss: {loss.item():.6f}')

# 5. Evaluation function
def test(model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += criterion(output, target).item()
            pred = output.argmax(dim=1)
            correct += pred.eq(target).sum().item()

    test_loss /= len(test_loader)  # average of the per-batch mean losses
    accuracy = 100. * correct / len(test_loader.dataset)
    print(f'\nTest: Average loss: {test_loss:.4f}, '
          f'Accuracy: {correct}/{len(test_loader.dataset)} ({accuracy:.2f}%)\n')

# 6. Training loop
# device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# model.to(device)
# for epoch in range(1, 11):
#     train(model, device, train_loader, optimizer, epoch)
#     test(model, device, test_loader)

# 7. Save and load the model
# torch.save(model.state_dict(), 'mnist_model.pth')
# model.load_state_dict(torch.load('mnist_model.pth'))

2.3 Common PyTorch Techniques

1. GPU acceleration

# Check whether a GPU is available
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"CUDA version: {torch.version.cuda}")
print(f"GPU count: {torch.cuda.device_count()}")
if torch.cuda.is_available():
    print(f"GPU name: {torch.cuda.get_device_name(0)}")

# Choose the device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

# Move a tensor to the GPU
x = torch.randn(1000, 1000)
x_gpu = x.to(device)
# or x_gpu = x.cuda()  (raises an error if no GPU is available)

# Move a model to the GPU
model = MyNet()
model.to(device)

# Computation on the GPU
a = torch.randn(1000, 1000, device=device)
b = torch.randn(1000, 1000, device=device)
c = a @ b  # runs on the GPU when device is "cuda"

# Move the result back to the CPU
c_cpu = c.cpu()
# or c_cpu = c.to('cpu')

2. Optimizing data loading

from torch.utils.data import Dataset, DataLoader

# Custom dataset
class CustomDataset(Dataset):
    def __init__(self, data, labels):
        self.data = data
        self.labels = labels

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.data[idx], self.labels[idx]

# Create a DataLoader
# Note: with num_workers > 0 on Windows or macOS, run this from a script
# guarded by `if __name__ == "__main__":`
dataset = CustomDataset(torch.randn(1000, 10), torch.randint(0, 2, (1000,)))
loader = DataLoader(
    dataset,
    batch_size=32,
    shuffle=True,
    num_workers=4,  # load data in multiple worker processes
    pin_memory=True  # page-locked host memory speeds up transfers to the GPU
)

# Usage
for batch_data, batch_labels in loader:
    # training code
    pass

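3. Learning rate scheduling

import torch.optim as optim

optimizer = optim.Adam(model.parameters(), lr=0.001)

# Step decay: multiply the learning rate by gamma every step_size epochs
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)

for epoch in range(50):
    # training code (calls optimizer.step() for every batch)
    # train(...)

    # Update the learning rate once per epoch, after the optimizer steps
    scheduler.step()

    # Inspect the current learning rate
    current_lr = optimizer.param_groups[0]['lr']
    print(f"Epoch {epoch}, LR: {current_lr}")

# Other schedulers
# ReduceLROnPlateau: adjusts automatically based on a metric; call scheduler.step(val_loss)
# scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min', patience=5)

# CosineAnnealingLR: cosine annealing
# scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=50)

# OneCycleLR: one-cycle policy; call scheduler.step() after every batch, not every epoch
# scheduler = optim.lr_scheduler.OneCycleLR(optimizer, max_lr=0.01, total_steps=1000)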
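The step-decay rule behind `StepLR` is simple enough to write down directly: the learning rate used during epoch `e` (counting from 0) is `base_lr * gamma ** (e // step_size)`. The plain-Python sketch below evaluates that formula for the settings used above; the function name `step_lr` is an illustrative assumption, and the function models the schedule rather than calling PyTorch.

```python
def step_lr(base_lr, epoch, step_size=10, gamma=0.1):
    """Learning rate in effect during `epoch` (0-based) under a step-decay schedule."""
    return base_lr * gamma ** (epoch // step_size)

# Epochs on either side of each decay boundary.
for epoch in (0, 9, 10, 19, 20, 49):
    print(f"epoch {epoch:2d}: lr = {step_lr(0.001, epoch):.1e}")
```

In the loop above, the learning rate is printed after `scheduler.step()`, so the value shown for epoch `e` is the one that will be used during epoch `e + 1`.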

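4. Saving and loading models

# Save the entire model (pickles the object; the MyNet class must be importable when loading)
torch.save(model, 'full_model.pth')
# Recent PyTorch releases default torch.load to weights_only=True, so loading a fully
# pickled model needs weights_only=False. Only do this for files you trust.
model = torch.load('full_model.pth', weights_only=False)

# Save only the parameters (recommended)
torch.save(model.state_dict(), 'model_weights.pth')
model = MyNet()
model.load_state_dict(torch.load('model_weights.pth'))

# Save a checkpoint (including the optimizer state)
# `epoch` and `loss` come from the training loop; use loss.item() if loss is a tensor
checkpoint = {
    'epoch': epoch,
    'model_state_dict': model.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
    'loss': loss,
}
torch.save(checkpoint, 'checkpoint.pth')

# Load a checkpoint
checkpoint = torch.load('checkpoint.pth')
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
epoch = checkpoint['epoch']
loss = checkpoint['loss']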

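Section 3: Setting Up the Development Environment

3.1 Managing Environments with Conda

Why use Conda:

Isolates the dependencies of different projects
Manages Python versions
Cross-platform support
Rich selection of scientific computing packages

Installing Miniconda:

# Linux (x86_64)
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh

# macOS
# Download the matching macOS installer from the same site and run it with bash

# Windows
# Download Miniconda3-latest-Windows-x86_64.exe
# Double-click to install

# Verify the installation
conda --version

Common Conda commands:

# Create an environment
conda create -n ai-env python=3.10

# Activate the environment
conda activate ai-env

# List environments
conda env list

# Install packages
conda install numpy pandas matplotlib
# Check the install selector on pytorch.org for the currently recommended command;
# newer PyTorch releases are distributed primarily via pip
conda install pytorch torchvision -c pytorch

# Export the environment
conda env export > environment.yml

# Create an environment from a file
conda env create -f environment.yml

# Remove an environment (deactivate it first)
conda deactivate
conda env remove -n ai-env

# Update conda
conda update conda

# Clean caches
conda clean --all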

Best practices for virtual environments:

# Create a separate environment for each project
conda create -n project1-env python=3.10
conda create -n project2-env python=3.9

# Manage dependencies with requirements.txt
pip freeze > requirements.txt
pip install -r requirements.txt

# Use environment.yml (recommended)
# Contents of environment.yml:
# name: ai-env
# channels:
#   - pytorch
#   - conda-forge
#   - defaults
# dependencies:
#   - python=3.10
#   - numpy
#   - pandas
#   - matplotlib
#   - pytorch
#   - torchvision
#   - pip:
#     - transformers
#     - langchain

3.2 CUDA and the GPU Environment

Checking the GPU:

# Linux
nvidia-smi

# Example output (the "CUDA Version" field is the highest version the driver supports,
# not necessarily the installed toolkit version):
# +-----------------------------------------------------------------------------+
# | NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0   |
# |-------------------------------+----------------------+----------------------+
# | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
# | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
# |===============================+======================+======================|
# |   0  NVIDIA A100-SXM4  Off  | 00000000:00:04.0 Off |                    0 |
# | N/A   34C    P0    43W / 400W |      0MiB / 40960MiB |      0%      Default |
# +-------------------------------+----------------------+----------------------+

Installing the CUDA Toolkit:

# Ubuntu
wget https://developer.download.nvidia.com/compute/cuda/12.0.0/local_installers/cuda_12.0.0_525.60.13_linux.run
sudo sh cuda_12.0.0_525.60.13_linux.run

# Add environment variables
export PATH=/usr/local/cuda-12.0/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-12.0/lib64:$LD_LIBRARY_PATH

# Verify the installation
nvcc --version

Installing cuDNN:

# Download cuDNN (requires an NVIDIA account)
# https://developer.nvidia.com/cudnn

# Extract and copy the files (.tar.xz is xz-compressed, so do not pass -z; tar detects the format)
tar -xvf cudnn-linux-x86_64-8.x.x.x_cudaX.Y-archive.tar.xz
sudo cp cudnn-*-archive/include/cudnn*.h /usr/local/cuda/include
sudo cp -P cudnn-*-archive/lib/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn*

Installing the GPU build of PyTorch:

# The prebuilt PyTorch packages bundle their own CUDA runtime, so a system-wide
# CUDA Toolkit is mainly needed for compiling custom CUDA code.

# CUDA 11.8
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia

# CUDA 12.1
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia

# Verify GPU support
python -c "import torch; print(torch.cuda.is_available())"
# Output: True (on a machine with a working NVIDIA driver and GPU)

Monitoring GPU usage:

# Live monitoring
watch -n 1 nvidia-smi

# List processes using the GPU
nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv

# Choose the visible GPUs
export CUDA_VISIBLE_DEVICES=0,1  # use only GPUs 0 and 1
export CUDA_VISIBLE_DEVICES=2    # use only GPU 2

# Set it in Python (must happen before CUDA is initialized, ideally before importing torch)
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

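3.3 Using Jupyter Notebook

Installing Jupyter:

conda install jupyter notebook
# or
pip install jupyter notebook

Starting Jupyter:

jupyter notebook
# The browser opens http://localhost:8888 automatically

Common Jupyter keyboard shortcuts:

Command mode (press Esc to enter):
  Enter: switch to edit mode
  A: insert a cell above
  B: insert a cell below
  D, D: delete the cell
  M: convert to a Markdown cell
  Y: convert to a code cell
  Shift+Enter: run the current cell and move to the next one
  Ctrl+Enter: run the current cell

Edit mode (press Enter to enter):
  Esc: switch to command mode
  Tab: code completion
  Shift+Tab: show documentation
  Ctrl+/: toggle comment

Jupyter magic commands:

# Measure run time
%time result = some_function()  # single run
%timeit some_function()         # average over multiple runs

# Inspect variables
%who      # list all variables
%whos     # detailed information

# Run an external script
%run script.py

# Display matplotlib figures inline
%matplotlib inline

# Automatically reload modules
%load_ext autoreload
%autoreload 2

# Check GPU usage
!nvidia-smi

# Shell commands
!ls
!pip install package_name

# Debugging
%debug  # enter the debugger after an exception

Jupyter extensions:

# Install extensions (these target the classic Notebook interface, version 6 and earlier)
conda install -c conda-forge jupyter_contrib_nbextensions
jupyter contrib nbextension install --user

# Commonly used extensions:
# - Table of Contents: table of contents
# - Variable Inspector: inspect variables
# - ExecuteTime: show execution time
# - Code prettify: code formatting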

3.4 Containerization with Docker

Why use Docker:

Consistent environments
Fast deployment
Resource isolation
Easy to share

Installing Docker:

# Ubuntu
sudo apt-get update
sudo apt-get install docker.io
sudo systemctl start docker
sudo systemctl enable docker

# Verify the installation
docker --version
sudo docker run hello-world

Pulling the official PyTorch image:

# CPU version (verify on Docker Hub that this tag exists; the CUDA runtime image
# below also runs on CPU-only hosts)
docker pull pytorch/pytorch:2.0.1-cpu

# GPU version
docker pull pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime

# Run a container
docker run -it --rm pytorch/pytorch:2.0.1-cpu python

Creating a custom Dockerfile:

# Dockerfile
FROM pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime

# Set the working directory
WORKDIR /workspace

# Install dependencies
COPY requirements.txt .
RUN pip install -r requirements.txt

# Copy the code
COPY . .

# Expose a port (if running Jupyter)
EXPOSE 8888

# Startup command
CMD ["python", "train.py"]

Building and running:

# Build the image
docker build -t my-ai-project .

# Run a container (--gpus requires the NVIDIA Container Toolkit on the host)
docker run -it --rm \
  --gpus all \
  -v $(pwd):/workspace \
  -p 8888:8888 \
  my-ai-project

# docker-compose.yml example
# version: '3.8'
# services:
#   ai-dev:
#     image: pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime
#     volumes:
#       - .:/workspace
#     ports:
#       - "8888:8888"
#     deploy:
#       resources:
#         reservations:
#           devices:
#             - driver: nvidia
#               count: 1
#               capabilities: [gpu]

# Use docker-compose (with Compose v2 the equivalent command is `docker compose up`)
docker-compose up

Section 4: Commonly Used AI Libraries

4.1 Scikit-learn - Classical Machine Learning

Main features:

Classification, regression, clustering
Data preprocessing
Model selection and evaluation
Dimensionality reduction

Typical example:

from sklearn import datasets, model_selection, preprocessing, metrics
from sklearn.ensemble import RandomForestClassifier

# 1. Load the data
iris = datasets.load_iris()
X, y = iris.data, iris.target

# 2. Split the data
X_train, X_test, y_train, y_test = model_selection.train_test_split(
    X, y, test_size=0.2, random_state=42
)

# 3. Preprocess: fit the scaler on the training set, then reuse it on the test set
scaler = preprocessing.StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# 4. Train the model
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train_scaled, y_train)

# 5. Predict and evaluate
y_pred = clf.predict(X_test_scaled)
accuracy = metrics.accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.4f}")  # high on Iris; the exact value depends on the library version

# Detailed report
print(metrics.classification_report(y_test, y_pred))

# Cross-validation
scores = model_selection.cross_val_score(clf, X, y, cv=5)
print(f"Cross-validation score: {scores.mean():.4f} (+/- {scores.std() * 2:.4f})")
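To show what `cv=5` does conceptually, the sketch below generates the index splits for plain k-fold cross-validation in pure Python. It is a simplification: for classifiers, `cross_val_score` uses stratified folds that preserve class proportions, which this sketch does not attempt. The function name `k_fold_indices` is an illustrative assumption.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) lists for k contiguous folds, without shuffling."""
    # Spread the remainder over the first folds so fold sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size

folds = list(k_fold_indices(n_samples=10, k=3))
for train_idx, val_idx in folds:
    print(f"train={train_idx} val={val_idx}")
```

Every sample lands in exactly one validation fold, so each of the k scores is computed on data the corresponding model never saw during training.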

4.2 Transformers - Pretrained Model Library

Hugging Face Transformers: one of the most popular libraries of pretrained models.

Installation:

pip install transformers

Basic usage:

from transformers import pipeline

# 1. Sentiment analysis (downloads a default model on first use)
classifier = pipeline("sentiment-analysis")
result = classifier("I love using Transformers!")
print(result)
# e.g. [{'label': 'POSITIVE', 'score': 0.9998}]

# 2. Text generation
generator = pipeline("text-generation", model="gpt2")
result = generator("Once upon a time", max_length=50)
print(result[0]['generated_text'])

# 3. Question answering
qa_pipeline = pipeline("question-answering")
context = "Transformers is a library by Hugging Face for NLP tasks."
question = "What is Transformers?"
result = qa_pipeline(question=question, context=context)
print(result['answer'])  # e.g. a library by Hugging Face for NLP tasks

# 4. Translation
translator = pipeline("translation_en_to_fr")
result = translator("Hello, how are you?")
print(result[0]['translation_text'])  # e.g. Bonjour, comment allez-vous?

# 5. Summarization
summarizer = pipeline("summarization")
text = """Long article text here..."""
summary = summarizer(text, max_length=130, min_length=30)
print(summary[0]['summary_text'])

Using a pretrained model:

from transformers import AutoTokenizer, AutoModel
import torch

# Load the tokenizer and the model
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Encode text
text = "Hello, Transformers!"
inputs = tokenizer(text, return_tensors="pt")
print(inputs)
# {'input_ids': tensor([[...]]), 'attention_mask': tensor([[...]])}

# Get the embeddings
with torch.no_grad():
    outputs = model(**inputs)
    embeddings = outputs.last_hidden_state

print(f"Embedding shape: {embeddings.shape}")
# Embedding shape: torch.Size([1, seq_len, 768]), where seq_len is the number of
# tokens including the special [CLS] and [SEP] tokens

Fine-tuning BERT:

from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

# Load the model (for classification)
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=2
)

# Prepare the data
# train_dataset = ...
# eval_dataset = ...

# Training arguments
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir="./logs",
)

# Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    # train_dataset=train_dataset,
    # eval_dataset=eval_dataset,
)

# Start training
# trainer.train()

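4.3 LangChain - LLM Application Framework

Core concepts:

Prompts: prompt templates
Models: interfaces to large models
Chains: chained calls
Agents: intelligent agents
Memory: memory systems

Installation:

pip install langchain openai

Basic usage:

# Note: these import paths match early LangChain releases; newer releases moved the
# OpenAI integrations into separate packages such as langchain-openai
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

# Initialize the LLM (requires an OpenAI API key)
# llm = OpenAI(temperature=0.7)

# Prompt template
template = """
You are a {role}.
Please answer the following question: {question}
"""

prompt = PromptTemplate(
    input_variables=["role", "question"],
    template=template,
)

# Create the chain
# chain = LLMChain(llm=llm, prompt=prompt)

# Run
# result = chain.run(role="Python expert", question="How should I learn deep learning?")
# print(result)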

RAG example:

# Note: these import paths match early LangChain releases; newer releases moved several
# of them into langchain-openai, langchain-community, and langchain-text-splitters
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.text_splitter import CharacterTextSplitter
from langchain.chains import RetrievalQA

# Prepare the documents
documents = [
    "PyTorch is a deep learning framework...",
    "TensorFlow is developed by Google...",
    "The Transformer architecture was proposed in 2017...",
]

# Split the text
text_splitter = CharacterTextSplitter(chunk_size=100, chunk_overlap=0)
texts = text_splitter.create_documents(documents)

# Create the vector store
# embeddings = OpenAIEmbeddings()
# vectorstore = FAISS.from_documents(texts, embeddings)

# Create the retrieval chain
# qa_chain = RetrievalQA.from_chain_type(
#     llm=llm,
#     chain_type="stuff",
#     retriever=vectorstore.as_retriever()
# )

# Ask a question
# question = "What is Transformers?"
# answer = qa_chain.run(question)
# print(answer)
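The retrieval step of RAG can be illustrated without any external service. The sketch below is a toy stand-in, not LangChain's implementation: it replaces the embedding model with bag-of-words counts and the vector store with a sorted list, and the names `embed`, `cosine`, and `retrieve` are hypothetical helpers introduced only for this illustration.

```python
import math
from collections import Counter

documents = [
    "PyTorch is a deep learning framework",
    "TensorFlow is developed by Google",
    "The Transformer architecture was proposed in 2017",
]

def embed(text):
    """Toy embedding: lowercase bag-of-words counts (stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)  # Counter returns 0 for missing tokens
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, docs, top_k=1):
    """Return the top_k documents ranked by similarity to the query."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:top_k]

question = "Who developed TensorFlow"
context = retrieve(question, documents)

# The retrieved text is placed into the prompt that would be sent to the LLM.
prompt = f"Context: {context[0]}\nQuestion: {question}"
print(prompt)
```

A real pipeline swaps `embed` for a learned embedding model and the sorted list for an approximate nearest-neighbor index such as FAISS, but the flow of embed, rank, and insert into the prompt stays the same.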

4.4 Other Important Libraries

1. OpenCV - Computer vision

import cv2
import numpy as np

# Read an image
# img = cv2.imread('image.jpg')

# Convert the color space (OpenCV loads images in BGR order)
# gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Image processing
# blurred = cv2.GaussianBlur(gray, (5, 5), 0)
# edges = cv2.Canny(blurred, 50, 150)

# Display the image
# cv2.imshow('Edges', edges)
# cv2.waitKey(0)
# cv2.destroyAllWindows()

2. NLTK - Natural language processing

import nltk
# nltk.download('punkt')      # newer NLTK releases may ask for 'punkt_tab' instead
# nltk.download('stopwords')

from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords

text = "Natural language processing with Python is fun!"
tokens = word_tokenize(text)
print(tokens)
# ['Natural', 'language', 'processing', 'with', 'Python', 'is', 'fun', '!']

# Remove stop words
stop_words = set(stopwords.words('english'))
filtered = [w for w in tokens if w.lower() not in stop_words]
print(filtered)
# ['Natural', 'language', 'processing', 'Python', 'fun', '!']

3. Pillow - Image processing

from PIL import Image, ImageFilter

# Open an image
# img = Image.open('image.jpg')

# Resize
# resized = img.resize((224, 224))

# Apply a filter
# blurred = img.filter(ImageFilter.BLUR)

# Save
# resized.save('resized.jpg')

4. Weights & Biases - Experiment tracking

import wandb

# Initialize
# wandb.init(project="my-project", config={"learning_rate": 0.001})

# Log metrics
# for epoch in range(10):
#     loss = train()
#     wandb.log({"loss": loss, "epoch": epoch})

# Save the model
# wandb.save('model.pth')

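Section 5: Development Tools and Best Practices

5.1 IDEs and Editors

1. VSCode

Recommended extensions:

Python
Pylance
Jupyter
autoDocstring
Python Indent
GitLens

Example configuration:

// settings.json
// Note: recent versions of the Python extension moved linting and formatting into
// dedicated extensions (such as Pylint and Black Formatter), so the python.linting.*
// and python.formatting.* keys may be ignored there.
{
    "python.linting.enabled": true,
    "python.linting.pylintEnabled": true,
    "python.formatting.provider": "black",
    "editor.formatOnSave": true,
    "python.testing.pytestEnabled": true
}

2. PyCharm

A professional Python IDE, well suited to large projects.

3. Jupyter Lab

The next-generation interface for Jupyter Notebook:

pip install jupyterlab
jupyter lab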

5.2 Coding Standards

PEP 8 style guide:

# Good example
def calculate_mean(numbers):
    """Compute the mean of a list of numbers"""
    if not numbers:
        return 0
    return sum(numbers) / len(numbers)

class NeuralNetwork:
    def __init__(self, input_size, hidden_size):
        self.input_size = input_size
        self.hidden_size = hidden_size

# Format with black
# pip install black
# black your_script.py

# Check with flake8
# pip install flake8
# flake8 your_script.py

Type hints:

from typing import List, Tuple, Optional

def process_data(
    data: List[float],
    threshold: float = 0.5
) -> Tuple[List[float], List[float]]:
    """
    Process the data and split it into two groups by a threshold

    Args:
        data: list of input values
        threshold: threshold used for grouping

    Returns:
        (values below the threshold, values at or above the threshold)
    """
    low = [x for x in data if x < threshold]
    high = [x for x in data if x >= threshold]
    return low, high

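5.3 Version Control

Basic Git commands:

# Initialize a repository
git init

# Add files
git add .
git commit -m "Initial commit"

# Create a branch and switch to it
git branch develop
git checkout develop
# or, as a single equivalent command
git checkout -b develop

# Merge a branch (the default branch may be named master instead of main)
git checkout main
git merge develop

# Remote repository
git remote add origin https://github.com/user/repo.git
git push -u origin main

# .gitignore example
# __pycache__/
# *.pyc
# .ipynb_checkpoints/
# data/
# models/
# *.pth
# .env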

5.4 Project Structure

Recommended project structure:

my-ai-project/
├── data/                  # data directory
│   ├── raw/              # raw data
│   ├── processed/        # processed data
│   └── external/         # external data
├── notebooks/            # Jupyter notebooks
│   ├── 01-exploration.ipynb
│   └── 02-modeling.ipynb
├── src/                  # source code
│   ├── __init__.py
│   ├── data/            # data processing
│   │   ├── __init__.py
│   │   └── dataset.py
│   ├── models/          # model definitions
│   │   ├── __init__.py
│   │   └── network.py
│   ├── training/        # training code
│   │   ├── __init__.py
│   │   └── train.py
│   └── utils/           # utility functions
│       ├── __init__.py
│       └── helpers.py
├── tests/               # test code
│   ├── test_data.py
│   └── test_models.py
├── configs/             # configuration files
│   └── config.yaml
├── scripts/             # scripts
│   └── download_data.sh
├── requirements.txt     # dependencies
├── environment.yml      # Conda environment
├── Dockerfile
├── README.md
└── .gitignore

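Section 6: Chapter Summary

6.1 Key Points

Core Python libraries:

NumPy: efficient numerical computing, vectorized operations
Pandas: data processing and analysis
Matplotlib: data visualization

The PyTorch framework:

Tensor operations
Automatic differentiation
Neural network modules
GPU acceleration

Development environment:

Conda virtual environment management
CUDA GPU configuration
Interactive development with Jupyter Notebook
Containerized deployment with Docker

Tool ecosystem:

Scikit-learn: classical machine learning
Transformers: pretrained models
LangChain: LLM application development
Specialized libraries such as OpenCV and NLTK

6.2 Skills Checklist

[ ] Comfortable with array operations in NumPy
[ ] Proficient with Pandas data processing techniques
[ ] Able to create various charts with Matplotlib
[ ] Understand PyTorch tensors and automatic differentiation
[ ] Able to build a simple neural network with PyTorch
[ ] Able to configure Conda virtual environments
[ ] Familiar with GPU environment configuration
[ ] Familiar with using Jupyter Notebook
[ ] Familiar with the basics of the Transformers library
[ ] Understand basic project organization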

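6.3 Practice Tasks

Task 1: NumPy practice

# Implement a simple K-Means clustering algorithm
# Requirements:
# 1. Use only NumPy
# 2. Support data of any dimensionality
# 3. Visualize the clustering result

Task 2: PyTorch practice

# Implement linear regression with PyTorch
# Requirements:
# 1. Implement gradient descent by hand
# 2. Re-implement it with nn.Module
# 3. Visualize the fitting process

Task 3: Environment setup

# Set up a complete AI development environment
# Requirements:
# 1. Create a Conda environment
# 2. Install PyTorch (GPU build)
# 3. Configure Jupyter
# 4. Verify that the GPU is available

Task 4: Mini project

# Comparison experiment with Scikit-learn and PyTorch
# Dataset: MNIST handwritten digits
# Requirements:
# 1. Implement a classical method with Scikit-learn (for example, SVM)
# 2. Implement a neural network with PyTorch
# 3. Compare and visualize the results

6.4 Learning Resources

Official documentation:

NumPy: https://numpy.org/doc/
PyTorch: https://pytorch.org/docs/
Transformers: https://huggingface.co/docs

Tutorials:

Official PyTorch tutorials
Fast.ai courses
Hugging Face Course

Practice platforms:

Kaggle competitions
Google Colab (free GPU)
Paperspace Gradient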

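6.5 Frequently Asked Questions

Q1: Should I choose PyTorch or TensorFlow?
A: PyTorch is recommended for learning because it is more Pythonic; TensorFlow is worth considering for production environments.

Q2: What if GPU memory is insufficient?
A:

Reduce the batch size
Use gradient accumulation
Use mixed-precision training
Use model parallelism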
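Gradient accumulation works because the gradient of a mean loss over a large batch equals the average of the gradients over equally sized micro-batches. The NumPy sketch below checks this for a linear model with mean squared error; the data, the helper name `mse_grad`, and the choice of 4 accumulation steps are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 5))   # one "large" batch of 32 samples
y = rng.normal(size=32)
w = rng.normal(size=5)

def mse_grad(X_part, y_part, w):
    """Gradient of the mean squared error for a linear model y_hat = X @ w."""
    residual = X_part @ w - y_part
    return 2.0 * X_part.T @ residual / len(y_part)

# Gradient computed on the full batch at once.
full_grad = mse_grad(X, y, w)

# Four micro-batches of 8 samples, each scaled by 1/accum_steps before summing.
accum_steps = 4
accum_grad = np.zeros_like(w)
for X_mb, y_mb in zip(np.split(X, accum_steps), np.split(y, accum_steps)):
    accum_grad += mse_grad(X_mb, y_mb, w) / accum_steps

print(np.allclose(full_grad, accum_grad))
```

In a PyTorch training loop the same idea usually means dividing each micro-batch loss by the number of accumulation steps, calling `backward()` every micro-batch, and calling `optimizer.step()` and `optimizer.zero_grad()` only once per group.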

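Q3: How do I resolve Conda environment conflicts?
A:

Delete and recreate the environment
Use mamba (a faster drop-in replacement for conda)
Pin version numbers

Q4: How can I speed up data loading?
A:

Use the DataLoader's num_workers option
Save the data after preprocessing
Use pin_memory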

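Next Chapter Preview

Congratulations on finishing the development environment setup! You now have a complete AI development toolchain.

The next chapter, "Mathematical Foundations - Linear Algebra and Calculus", covers:

Linear algebra: the mathematics behind vector, matrix, and tensor operations
Calculus: derivatives, gradients, and the chain rule as applied in neural networks
Probability: Bayes' theorem and probability distributions
Optimization theory: the mathematical basis of gradient descent and its variants
How PyTorch automatic differentiation works under the hood

This mathematical background will help you understand the core mechanisms of deep learning in depth!

Study log:

Reading time: ____ hours
Level of understanding:
Environment setup: □ Conda □ PyTorch □ GPU □ Jupyter
Practice completed: □ Task 1 □ Task 2 □ Task 3 □ Task 4
Notes organized: □ Done

Next study plan:

Time: ________
Chapter: Chapter 02, Mathematical Foundations - Linear Algebra and Calculus
Goal: understand the mathematical principles of deep learning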