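Stable Diffusion: The Complete Hands-On Guide (WeChat Official Account Edition, Two-Part Collection)

[Part 1] Stable Diffusion from Beginner to Expert: A Complete Guide to Installation, Models, and Prompts (2024 Edition)

Keywords: Stable Diffusion tutorial, SD WebUI installation, getting started with AI art, Checkpoint models, LoRA, prompt techniques, prompt engineering, parameter tuning

If you want an AI image generator of your own that is free, unrestricted, and runs locally, Stable Diffusion is a strong choice. Unlike Midjourney's monthly subscription, SD is open source and free to use: with one capable graphics card you can generate as many images as you like at home.

This article takes you from zero through the full Stable Diffusion workflow: installation and deployment, model selection, prompt writing, and parameter tuning.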

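1. Why Is Stable Diffusion Worth Learning?

How It Differs from Midjourney

Many people choosing an AI art tool get stuck on one question: how do SD and Midjourney actually differ?

Dimension	Stable Diffusion	Midjourney
Cost	Free	Subscription, roughly $10-120/month depending on plan
Where it runs	Your own computer (GPU required)	Cloud service
Generation volume	Unlimited	Limited by plan
Learning curve	Steep (parameters to configure)	Gentle (type a command)
Controllability	Very high (every parameter is adjustable)	Medium (fewer parameters)
Model choice	Thousands of community models	Official models only
Plugins	Rich (ControlNet and others)	Limited
Commercial use	Free (check each model's license)	Allowed on paid plans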

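The Core Advantages of Stable Diffusion

1. Free and open source

There is no subscription fee: install once and keep using it. The community publishes new models every day, so there is always something new to try.

2. Runs locally, keeps your data private

Every image is generated on your own computer without passing through a remote server, so your creative work stays private.

3. Unlimited generation

Generate as many images as you like. There is no monthly quota, so you can experiment freely.

4. Precise control

With ControlNet you can constrain pose, composition, and line work closely enough to reproduce a reference layout with high fidelity, which is hard to match in Midjourney.

5. A huge model ecosystem

Platforms such as CivitAI host thousands of community-trained models covering photorealism, anime, art styles, specific characters, and more.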

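2. Hardware Requirements and Quick Installation

Hardware Requirements

GPU (the most important component):

Tier	GPU	VRAM	Speed	Max resolution	Best for
Entry	GTX 1660 / RTX 3050	6-8GB	30-60 s/image	512×768	Learning and testing
Mid-range	RTX 3060 / 4060	8-12GB	10-20 s/image	768×768	Everyday creation
Professional	RTX 4070 Ti / 4080	12-16GB	5-10 s/image	1024×1024	High-volume creation
Top-end	RTX 4090	24GB	2-5 s/image	2048×2048+	Commercial work

Memory and storage:

RAM: 16GB minimum, 32GB recommended
Disk: 20GB+ on the system drive, 100GB+ on a data drive (for models and generated images)
An SSD is recommended because it speeds up model loading

What if you have no GPU?

Online platforms: Google Colab (GPU notebooks; the free tier has restricted Stable Diffusion web UIs since 2023, so a paid tier may be needed)
Cloud rental: AutoDL, Matpool (billed by the hour)
CPU mode (extremely slow, not recommended)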

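Quick Installation on Windows (10 minutes)

Method 1: All-in-one package (strongly recommended for beginners)

Download the package
- Search for "秋叶整合包" (Akiba's all-in-one package) or "绘世整合包"
- The download is roughly 10-15GB

Installation steps

- Extract to a path without Chinese characters (for example D:\StableDiffusion)
- Double-click "启动器.exe" (the launcher)
- Choose your GPU type (NVIDIA/AMD/CPU)
- Click "一键启动" (one-click start)
- Wait for the browser to open automatically (default http://127.0.0.1:7860)

First launch
- Startup time: 30 seconds to 2 minutes
- Blank page: refresh the browser or clear its cache
- Errors: read the error message in the console window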

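Method 2: Official GitHub installation (for advanced users)

# Prerequisites
Python 3.10.6 (the version the project recommends; other 3.10.x releases usually work, newer Python versions may fail to install PyTorch)
Git
NVIDIA GPU with an up-to-date driver (a separate CUDA 11.8 toolkit is normally unnecessary, because the PyTorch build the installer downloads bundles its own CUDA runtime)

# Installation steps
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
cd stable-diffusion-webui
# Windows launch script (installs dependencies automatically on first run)
webui-user.bat

# When installation finishes, open
http://127.0.0.1:7860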

Directory Layout

After installation, the core SD directory structure looks like this:

stable-diffusion-webui/
├── models/                    # model folders
│   ├── Stable-diffusion/     # Checkpoints (2-7GB)
│   ├── Lora/                 # LoRA models (10-200MB)
│   ├── VAE/                  # VAE files (about 300MB)
│   └── ControlNet/           # ControlNet models (1-2GB)
├── embeddings/                # Embeddings (KB-sized), at the root rather than under models/
├── outputs/                   # generated images
│   ├── txt2img-images/       # text-to-image
│   └── img2img-images/       # image-to-image
├── extensions/                # plugins
└── webui-user.bat            # Windows launch file
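
A quick way to confirm that models will land in the right place is to check the folders before copying files. The sketch below is only an illustration: the folder list assumes the stock WebUI layout (with `embeddings/` at the repository root), and `missing_dirs` is a hypothetical helper name, not part of the WebUI.

```python
from pathlib import Path
import tempfile

# Sub-folders this guide relies on (assumed stock WebUI layout).
EXPECTED_DIRS = [
    "models/Stable-diffusion",
    "models/Lora",
    "models/VAE",
    "models/ControlNet",
    "embeddings",
    "extensions",
]

def missing_dirs(root):
    """Return the expected sub-folders that do not exist under root."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]

# Demo on a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "models" / "Lora").mkdir(parents=True)
    demo_missing = missing_dirs(tmp)
    print(demo_missing)
```

In practice you would call `missing_dirs` with the path of your own install, for example `missing_dirs(r"D:\StableDiffusion")`.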

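Downloading Your First Model

A GitHub install does not ship with the community models used in this guide (on first launch it may fetch only the base SD 1.5 Checkpoint), so you need to download the ones you want yourself.

Starter models (pick one):

1. Realistic Vision V5.1 (photorealistic, recommended)

Size: 2GB
Download: search "Realistic Vision" on Civitai
Strengths: realistic portraits, photographic styles
Location: models/Stable-diffusion/

2. Anything V5 (anime)

Size: 2GB
Strengths: anime characters, illustration
Best for: fans of anime styles

3. DreamShaper (general purpose)

Size: 2GB
Strengths: a balance of realism and artistic styles
Best for: when you are not sure which style you want

Download sites:

Civitai (largest catalog; may require a proxy from mainland China): https://civitai.com
LiblibAI (China-based): https://www.liblib.art
HuggingFace: https://huggingface.co

VAE model (recommended alongside the Checkpoint):

File name: vae-ft-mse-840000-ema-pruned.safetensors
Purpose: corrects washed-out colors and improves image quality
Location: models/VAE/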

Generating Your First Image (3 minutes)

Step 1: Start the WebUI and choose a model

Checkpoint dropdown at the top → select Realistic Vision V5.1
VAE dropdown at the top → select vae-ft-mse-840000

Step 2: Enter the prompts

Prompt (positive prompt):
a beautiful girl, long hair, smile, looking at viewer,
professional photography, 8k uhd, high quality,
soft lighting, film grain

Negative Prompt:
(worst quality:2), (low quality:2), (normal quality:2),
lowres, bad anatomy, bad hands, extra fingers

Step 3: Set the parameters

Sampling method: DPM++ 2M Karras
Steps: 25
CFG Scale: 7
Width: 512
Height: 768
Seed: -1 (random)

Step 4: Click Generate

After 10-30 seconds (depending on your GPU), your first AI image is ready.

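3. The Model System Explained

Overview of Model Types

SD's models fall into several layers:

SD model ecosystem
├── Checkpoint (main model, 2-7GB)
│   └── determines the base style and quality
├── LoRA (small model, 10-200MB)
│   └── stacks on top of a Checkpoint to adjust style
├── VAE (latent-to-image decoder, about 300MB)
│   └── a suitable one fixes gray or dark colors
├── Embedding (compressed keyword, KB-sized)
│   └── one word stands in for a long description
└── ControlNet (control model, 1-2GB)
    └── constrains pose, composition, and line work

Checkpoints in Detail

What is a Checkpoint?

The Checkpoint is SD's main model and determines the overall style and quality. Only one Checkpoint can be active at a time.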

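Recommended mainstream Checkpoints:

Realistic:

Realistic Vision V5.1

Character: photorealistic, handles Asian faces well
Strengths: portrait photography, fashion editorials
Suggested CFG: 4-7 (a low CFG avoids an overexposed look)

Example Prompt:

a beautiful asian girl, long black hair, natural makeup,
professional photography, soft lighting, 8k uhd, film grain,
shot on Canon EOS R5, 85mm f1.2

ChilloutMix
- Character: Korean-style beauty portraits, rich detail
- Strengths: fashion portraits, commercial photography
- Note: needs an external VAE

Deliberate
- Character: balances realism and art
- Strengths: commercial illustration, concept design
- Versatility: five stars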
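Anime:

Anything V5

Character: all-purpose anime model, high quality
Strengths: anime characters, CG illustration

Example Prompt:

1girl, blue eyes, long silver hair, school uniform,
smile, cherry blossoms background,
anime style, highly detailed, vibrant colors

CounterfeitV3
- Character: vivid colors, Japanese style
- Strengths: light-novel illustration, manga look

GhostMix
- Character: ghostly, mysterious mood
- Strengths: fantasy characters, dark themes

Art styles:

DreamShaper
- Character: dreamlike, highly creative
- Strengths: concept art, scene design

Protogen
- Character: sci-fi and mecha
- Strengths: game design, cyberpunk

ReV Animated
- Character: 3D-render look
- Strengths: product design, architectural visualization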
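Reading model file names:

Example:
realisticVision_v51.safetensors
├─ model name: realisticVision
├─ version: v5.1
└─ format: safetensors (the safe format, recommended)

General pattern: name_version_variant.format
Example: anythingV5_v5_pruned-fp16.safetensors
    └─ pruned: weights not needed for inference are stripped (smaller file, near-identical quality)
    └─ fp16: half precision (smaller still, lower VRAM use)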
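
The naming pattern can be split mechanically. The sketch below is a best-effort illustration that assumes the `name_version_flags.ext` convention described here; community files do not always follow it, and `parse_model_filename` is a hypothetical helper.

```python
def parse_model_filename(filename):
    """Split 'name_version_flags.ext' into its parts (best effort)."""
    stem, _, ext = filename.rpartition(".")
    parts = stem.split("_")
    name = parts[0]
    version = parts[1] if len(parts) > 1 else None
    # Everything after the version is treated as hyphen-separated flags.
    flags = []
    for chunk in parts[2:]:
        flags.extend(chunk.split("-"))
    return {"name": name, "version": version, "flags": flags, "format": ext}

info = parse_model_filename("anythingV5_v5_pruned-fp16.safetensors")
print(info)
```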

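LoRA in Detail

What is a LoRA?

A LoRA (Low-Rank Adaptation) is a lightweight style or character model that is stacked on top of a Checkpoint.

Key properties:

File size: 10-200MB
Several LoRAs can be active at once (3 or fewer is advisable)
Strength is set with a weight (usually 0-1)
Very flexible

The three main kinds of LoRA:

1. Character LoRA

Purpose: generate a specific person or character

Example:

<lora:koreanDollLikeness_v15:0.8>
1girl, black hair, red dress, smile

The weight scales how strongly the LoRA's changes are applied; 0.8 means 80% strength, not a guaranteed 80% likeness
The higher the weight, the closer the result gets to the target character
Suggested range: 0.6-0.9

2. Style LoRA

Purpose: change the drawing style

Common styles:

Chinese traditional (gufeng)
Cyberpunk
Ink wash
Oil-paint texture
Pixel art

Example:

<lora:GuFeng_v2:0.7>
ancient chinese style, hanfu dress,
traditional architecture, ink painting

3. Concept LoRA

Purpose: a specific concept or scene

Types:

Lighting effects (volumetric light, backlight)
Specific poses (sitting, seen from behind)
Clothing (wedding dress, JK uniform)
Scene types (interior, forest)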

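How to use a LoRA:

Method 1: Write the tag directly in the Prompt

Prompt:
<lora:koreanDollLikeness_v15:0.7>
<lora:filmGrain_v1:0.5>
a beautiful girl, smile...

Method 2: Pick it from the extra-networks panel (its exact location depends on the WebUI version)

1. Open the LoRA tab, or click the extra-networks icon below the Generate button
2. Click the LoRA you want
3. The <lora:name:weight> tag is added to the Prompt automatically
4. Adjust the weight, either with the slider your UI provides or by editing the number in the tag

Tuning the LoRA weight:

Weight	Effect	When to use
0.3-0.5	Subtle	As an accent only
0.6-0.8	Clearly visible	Recommended range
0.9-1.0	Strong	May look overfitted
>1.0	Over-amplified	Generally not recommended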

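Stacking LoRAs:

Example combination: Korean-style portrait
<lora:koreanDollLikeness:0.7>    # Korean-style face
<lora:filmGrain:0.4>             # film grain
<lora:detailTweaker:0.5>         # detail enhancement

a korean girl, natural makeup,
soft lighting, professional photography

Notes:
- Number of LoRAs: 3 or fewer
- Total weight: below 2.5
- Avoid conflicts: do not stack two LoRAs of the same kind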
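
These rules of thumb are easy to check automatically before a long batch. The following sketch extracts `<lora:name:weight>` tags from a prompt and tests them against the limits above; `check_loras` and its default limits are illustrative assumptions taken from this guide, not WebUI behavior.

```python
import re

# Matches tags such as <lora:filmGrain:0.4>
LORA_TAG = re.compile(r"<lora:([^:>]+):([0-9]*\.?[0-9]+)>")

def check_loras(prompt, max_count=3, max_total=2.5):
    """Extract LoRA tags and compare them with the guide's rules of thumb."""
    tags = [(name, float(weight)) for name, weight in LORA_TAG.findall(prompt)]
    total = sum(weight for _, weight in tags)
    return {
        "loras": tags,
        "total_weight": round(total, 3),
        "count_ok": len(tags) <= max_count,
        "total_ok": total < max_total,
    }

prompt = (
    "<lora:koreanDollLikeness:0.7> <lora:filmGrain:0.4> <lora:detailTweaker:0.5> "
    "a korean girl, natural makeup, soft lighting, professional photography"
)
report = check_loras(prompt)
print(report)
```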

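VAE in Detail

What is a VAE?

A VAE (Variational AutoEncoder) is the component that decodes the model's latent output into the final pixels. Every Checkpoint needs one; when the VAE is missing or mismatched, colors typically come out gray and washed out, so loading a suitable VAE can noticeably improve the image.

What a suitable VAE fixes:

Gray or dark color casts
Soft, smeared fine detail
Weak saturation

Comparison:

Without a suitable VAE: gray, dull colors and blurry detail
With a suitable VAE: vivid, clear colors and crisp detail

Recommended VAEs:

vae-ft-mse-840000-ema-pruned
- The most common choice, works broadly
- Size: 330MB
- Compatible with most models

kl-f8-anime2
- Tuned for anime
- More vivid colors

Anything VAE
- Pairs with the Anything model series

How to use a VAE:

Method 1: Global setting

Settings > VAE > SD VAE > choose the VAE
Apply settings > Reload UI

Method 2: Per session

VAE dropdown near the generation parameters > choose the VAE

Method 3: Automatic matching

Name the VAE: modelname.vae.safetensors
Put it in the same folder as the Checkpoint and it loads automatically
Example: realisticVision_v51.vae.safetensors

Do you need a separate VAE?

Some models ship with a baked-in VAE and need nothing extra
If colors look normal, you can skip it
If colors are gray or dark, load one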

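Embeddings in Detail

What is an Embedding?

An Embedding compresses a complex concept into a single keyword.

Properties:

Tiny files: 10-100KB
Simple to use: type the name like any other keyword
Types: positive (styles) and negative (excluding defects)

The most common one: EasyNegative (a negative Embedding)

Purpose: exclude low quality, deformities, blur, and similar defects with one word

Comparison:

Without EasyNegative:
Negative: (worst quality:2), (low quality:2), lowres,
bad anatomy, bad hands, text, error, missing fingers,
extra digit, fewer digits, cropped, worst quality,
low quality, normal quality, jpeg artifacts, signature,
watermark, username, blurry
(a long list to remember)

With EasyNegative:
Negative: easynegative
(one word does the job)

Installation:

1. Download the .pt or .safetensors file
2. Put it in: embeddings/
3. Refresh the WebUI (no restart needed)
4. Use the name directly in the Prompt or Negative Prompt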

模型组合策略

组合公式:

Checkpoint + LoRA(≤3) + VAE + Embedding = 最终效果

示例1:韩系写真

Checkpoint: ChilloutMix
LoRA: koreanDollLikeness (0.7)
LoRA: filmGrain (0.4)
VAE: vae-ft-mse-840000
Negative: easynegative

示例2:二次元插画

Checkpoint: Anything V5
LoRA: GuFeng_v2 (0.6) # 古风
VAE: kl-f8-anime2
Negative: easynegative

选择原则:

明确目标:想要什么风格?
选 Checkpoint:定基调(写实/二次元/艺术)
加 LoRA 微调:调整细节风格
配 VAE:优化色彩
用 Embedding:排除问题

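4. Prompt Engineering: From Beginner to Expert

Basic Prompt Structure

A good prompt usually contains the following elements:

Prompt = subject + style + environment + quality + extras

Example:
a beautiful girl,                    # subject
anime style,                         # style
cherry blossoms background,          # environment
8k uhd, highly detailed,            # quality
soft lighting                        # extras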
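
If you keep prompt fragments in your own notes, a tiny helper can assemble them in the order given above. This is only a sketch: `build_prompt` and its argument names are hypothetical and simply mirror the subject/style/environment/quality/extras split.

```python
def build_prompt(subject, style="", environment="", quality="", extras=""):
    """Join the non-empty prompt parts with commas, subject first."""
    parts = [subject, style, environment, quality, extras]
    # Drop empty parts and stray leading/trailing commas.
    cleaned = [p.strip().strip(",").strip() for p in parts if p and p.strip()]
    return ", ".join(cleaned)

prompt = build_prompt(
    subject="a beautiful girl",
    style="anime style",
    environment="cherry blossoms background",
    quality="8k uhd, highly detailed",
    extras="soft lighting",
)
print(prompt)
```

Keeping the subject first matches the ordering advice later in this section, where earlier keywords carry more influence.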

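Weight Control Syntax

Basic syntax:

(keyword)        # weight ×1.1
((keyword))      # weight ×1.21 (1.1 × 1.1)
(keyword:1.5)    # weight ×1.5 (recommended, explicit)
[keyword]        # weight ÷1.1 (about ×0.91)
{keyword}        # weight ×1.05 in NovelAI-style UIs only; the stock WebUI does not treat braces as emphasis

Examples:
(beautiful face:1.3)    # emphasize a beautiful face
((masterpiece))         # double emphasis on "masterpiece"
(background:0.8)        # de-emphasize the background; [background:0.8] is different: it is prompt-editing syntax that adds the word only after 80% of the steps

Recommended weight ranges:

1.0-1.3: light emphasis
1.3-1.5: moderate emphasis (most common)
1.5-2.0: heavy emphasis
Above 2.0: prone to artifacts, use with caution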
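
To make the bracket arithmetic concrete, here is a minimal sketch that computes the multiplier for a single emphasized token. It assumes the WebUI convention that each `()` layer multiplies by 1.1 and each `[]` layer divides by 1.1; `emphasis_multiplier` is a hypothetical helper and ignores prompt-editing forms such as `[word:0.8]`.

```python
def emphasis_multiplier(token):
    """Return (bare_text, multiplier) for one emphasized token."""
    token = token.strip()
    # Explicit form: (text:1.5)
    if token.startswith("(") and token.endswith(")") and ":" in token:
        text, _, weight = token[1:-1].rpartition(":")
        try:
            return text.strip(), float(weight)
        except ValueError:
            pass  # no number after the colon; fall through to nesting rules
    multiplier = 1.0
    # Each () layer multiplies by 1.1, each [] layer divides by 1.1.
    while len(token) >= 2 and token[0] + token[-1] in ("()", "[]"):
        multiplier = multiplier * 1.1 if token[0] == "(" else multiplier / 1.1
        token = token[1:-1].strip()
    return token, round(multiplier, 4)

for example in ["(keyword)", "((keyword))", "[keyword]", "(beautiful face:1.3)"]:
    print(example, "->", emphasis_multiplier(example))
```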

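Combining Keywords

Listing several concepts:

girl, smile, long hair
# commas separate concepts that should all appear in the same image

red hair AND blue eyes
# uppercase AND is not an ordinary "and": in the WebUI it triggers composable diffusion, which treats each side as a separate prompt and blends the results; for everyday prompts, prefer commas

Alternating between options:

[red|blue|green] dress
# the WebUI switches between red, blue, and green on successive sampling steps, which blends the options; to pick one at random per image, use the Dynamic Prompts extension syntax {red|blue|green} dress

Excluding:

Negative Prompt: glasses, hat
# no glasses, no hat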

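Keyword Order

Importance: earlier > later

Recommended:
1girl, beautiful face, long hair, ...
# 1girl comes first, so the subject gets priority

Not recommended:
background, sky, clouds, 1girl, ...
# the background comes first, so the character may be weak or missing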

500+ 风格关键词库

艺术流派(100+):

经典艺术:

impressionism          # 印象派
post-impressionism     # 后印象派
expressionism          # 表现主义
surrealism            # 超现实主义
cubism                # 立体主义
abstract              # 抽象
pop art               # 波普艺术
minimalism            # 极简主义
baroque               # 巴洛克
renaissance           # 文艺复兴

现代风格:

cyberpunk             # 赛博朋克
steampunk             # 蒸汽朋克
vaporwave             # 蒸汽波
synthwave             # 合成波
retrowave             # 复古波
solarpunk             # 太阳朋克

插画风格(100+):

二次元:

anime style           # 动漫风格
manga style           # 漫画风格
chibi                 # Q版
moe                   # 萌系
kawaii                # 可爱
bishoujo              # 美少女

插画类型:

watercolor painting   # 水彩
oil painting          # 油画
acrylic painting      # 丙烯画
ink painting          # 水墨画
pencil drawing        # 铅笔画
digital painting      # 数字绘画
vector illustration   # 矢量插画
flat design           # 扁平设计
isometric             # 等距视角
pixel art             # 像素艺术
low poly              # 低多边形

艺术家风格:

日本:
miyazaki hayao style  # 宫崎骏
makoto shinkai style  # 新海诚
katsuhiro otomo style # 大友克洋

西方:
van gogh style        # 梵高
monet style           # 莫奈
picasso style         # 毕加索
dali style            # 达利
andy warhol style     # 安迪·沃霍尔

摄影风格(80+):

拍摄类型:

professional photography    # 专业摄影
portrait photography       # 人像摄影
fashion photography        # 时尚摄影
street photography         # 街拍
landscape photography      # 风景摄影
macro photography          # 微距
architectural photography  # 建筑摄影
aerial photography         # 航拍

摄影技巧:

bokeh                 # 虚化/散景
depth of field        # 景深
shallow focus         # 浅景深
tilt-shift            # 移轴
long exposure         # 长曝光
double exposure       # 双重曝光
HDR                   # 高动态范围
film grain            # 胶片颗粒
motion blur           # 运动模糊
lens flare            # 镜头光晕
vignette              # 暗角

相机设备:

shot on Canon EOS R5       # 佳能R5拍摄
shot on Sony A7R IV        # 索尼A7R4
shot on Nikon D850         # 尼康D850
85mm f/1.2                 # 85mm f1.2镜头
50mm f/1.4                 # 50mm f1.4

光影效果(60+):

光线类型:

soft lighting         # 柔光
hard lighting         # 硬光
natural lighting      # 自然光
studio lighting       # 影棚灯光
dramatic lighting     # 戏剧性光
cinematic lighting    # 电影感光
volumetric lighting   # 体积光
god rays              # 上帝之光
rim lighting          # 边缘光/轮廓光
backlighting          # 逆光

光线颜色:

golden hour           # 黄金时刻
blue hour             # 蓝调时刻
warm lighting         # 暖光
cool lighting         # 冷光
neon lighting         # 霓虹灯
candlelight           # 烛光
moonlight             # 月光
sunlight              # 阳光

材质纹理(80+):

金属:

gold                  # 金
silver                # 银
copper                # 铜
bronze                # 青铜
chrome                # 铬/镀铬
metallic              # 金属质感
brushed metal         # 拉丝金属

自然材质:

wood                  # 木头
stone                 # 石头
marble                # 大理石
granite               # 花岗岩
leather               # 皮革
fabric                # 布料
silk                  # 丝绸

现代材质:

glass                 # 玻璃
plastic               # 塑料
ceramic               # 陶瓷
crystal               # 水晶
ice                   # 冰
water                 # 水
smoke                 # 烟雾

画质增强词(50+):

质量词:

masterpiece           # 杰作
best quality          # 最佳质量
high quality          # 高质量
ultra detailed        # 超详细
highly detailed       # 高度详细
intricate details     # 精细细节
8k uhd                # 8K超高清
4k                    # 4K
photorealistic        # 照片级真实
hyperrealistic        # hyperrealism (extreme realism, not surrealism)
professional          # 专业的

细节强化:

detailed face         # 详细面部
detailed eyes         # 详细眼睛
detailed hands        # 详细手部
sharp focus           # 清晰焦点
crisp                 # 清脆/清晰
clean                 # 干净

万能提示词模板

模板1:写实人像

Positive:
(masterpiece, best quality:1.2), photorealistic,
1girl, [年龄] years old, [发型] hair, [发色] hair color,
[表情], looking at viewer,
[服装描述],
professional photography, soft lighting,
depth of field, bokeh,
shot on Canon EOS R5, 85mm f/1.2,
8k uhd, film grain

Negative:
(worst quality:2), (low quality:2), (normal quality:2),
lowres, bad anatomy, bad hands, extra fingers,
fewer digits, extra limbs, text, error,
jpeg artifacts, watermark, signature, username,
blurry, artist name

参数:
Sampler: DPM++ 2M Karras
Steps: 25-30
CFG: 4-7
Size: 512x768

模板2:二次元角色

Positive:
(masterpiece, best quality:1.4),
1girl, [特征描述],
anime style, highly detailed, vibrant colors,
beautiful eyes, detailed face,
[背景描述],
soft lighting, depth of field

Negative:
easynegative, (worst quality:2), (low quality:2),
lowres, bad anatomy, bad hands,
text, error, missing fingers, cropped,
normal quality, jpeg artifacts, signature,
watermark, username, blurry

参数:
Sampler: DPM++ 2M Karras
Steps: 20-28
CFG: 7-11
Size: 512x768

模板3:风景场景

Positive:
(masterpiece, best quality:1.2),
[场景类型, 如: mountain landscape, ocean sunset],
beautiful scenery, detailed background,
[时间, 如: golden hour, blue hour],
[天气, 如: clear sky, cloudy],
volumetric lighting, cinematic,
8k uhd, sharp focus,
professional landscape photography

Negative:
(worst quality:2), (low quality:2),
lowres, blurry, jpeg artifacts,
watermark, signature, text

参数:
Sampler: DPM++ 2M Karras
Steps: 25-35
CFG: 7-9
Size: 768x512 or 1024x576

模板4:产品渲染

Positive:
product photography, [产品名称],
clean white background, studio lighting,
professional, commercial,
high-end, minimalist,
octane render, 8k, ultra detailed,
perfect lighting, no shadows on background

Negative:
(worst quality:2), dirty background,
cluttered, messy, low quality,
blurry, jpeg artifacts

参数:
Sampler: Euler a
Steps: 20-25
CFG: 5-7
Size: 768x768 or 1024x1024

反向提示词(Negative Prompt)

万能反向词:

(worst quality:2), (low quality:2), (normal quality:2),
lowres, bad anatomy, bad hands, text, error,
missing fingers, extra digit, fewer digits,
cropped, jpeg artifacts, signature, watermark,
username, blurry, artist name

使用 EasyNegative(推荐):
easynegative
# 一个词等于上面一大段

针对性反向词:

人物类:

bad anatomy          # 坏的解剖结构
bad hands            # 手部畸形
bad face             # 面部畸形
extra fingers        # 多余手指
fewer digits         # 手指缺失
malformed limbs      # 四肢畸形
fused fingers        # 手指融合
too many fingers     # 手指过多
long neck            # 脖子过长
ugly                 # 丑陋

画质类:

blurry               # 模糊
lowres               # 低分辨率
pixelated            # 像素化(非刻意时)
jpeg artifacts       # JPEG压缩瑕疵
grainy               # 颗粒感(非刻意时)

内容类:

text                 # 文字
watermark            # 水印
signature            # 签名
username             # 用户名
logo                 # 标志

风格类:

cartoon              # 卡通(写实时排除)
3d                   # 3D(2D时排除)
anime                # 动漫(写实时排除)
realistic            # 写实(二次元时排除)

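5. Parameter Tuning: Balancing Speed and Quality

Samplers in Detail

The sampler is the algorithm that turns noise into a clean image step by step.

Core samplers compared:

Sampler	Suggested steps	Best for
DPM++ 2M Karras	20-25	General use (top recommendation)
DPM++ SDE Karras	25-30	Detailed portraits
Euler a	25-30	Anime / quick previews
UniPC	15-20	Fastest generation
Euler	25-30	Exact reproduction

Recommended choices:

Everyday use:

Sampler: DPM++ 2M Karras
Steps: 20-25
Pros: best speed/quality balance, about 10-15 seconds per image on a mid-range GPU

Realistic portraits:

Sampler: DPM++ SDE Karras
Steps: 25-30
Pros: best detail, excellent skin texture

Anime illustration:

Sampler: Euler a
Steps: 25-30
Pros: its randomness brings pleasant surprises and a soft look; as an ancestral sampler it does not converge, so changing the step count changes the image

Quick previews:

Sampler: UniPC
Steps: 15-20
Pros: about 5-8 seconds per image, good for testing prompts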

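CFG Scale (Prompt Adherence)

What is CFG?

CFG (Classifier-Free Guidance) scale = how strongly sampling is pushed toward the prompt

CFG = 1:   no extra guidance; the prompt still conditions the image, but loosely, and the negative prompt has no effect
CFG = 7:   balances creativity and adherence (recommended)
CFG = 20:  follows the prompt rigidly, but tends to look overexposed or unnatural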
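
The behavior at different scales follows from the guidance formula itself: at every step the sampler predicts noise twice, once for the negative (or empty) prompt and once for the positive prompt, then extrapolates from the first toward the second. The sketch below applies that formula to toy three-element lists; the numbers are made up for illustration and real predictions are large tensors.

```python
def apply_cfg(noise_uncond, noise_cond, cfg_scale):
    """guided = uncond + scale * (cond - uncond), element by element."""
    return [u + cfg_scale * (c - u) for u, c in zip(noise_uncond, noise_cond)]

uncond = [0.0, 1.0, -1.0]   # toy prediction for the negative/empty prompt
cond = [1.0, 1.0, 0.0]      # toy prediction for the positive prompt

print(apply_cfg(uncond, cond, 1.0))  # equals cond: no extra guidance
print(apply_cfg(uncond, cond, 7.0))  # pushed well past cond, away from uncond
```

With a scale of 1 the negative-prompt term cancels out entirely, which is why very low CFG values make the negative prompt ineffective, while large scales exaggerate the difference and can produce the overexposed look described above.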

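CFG values compared:

CFG	Effect	Best for	Drawbacks
4-6	Loose, more randomness	Anime, illustration	Details may be missing
7-9	Balanced, top recommendation	Realistic portraits, general use	None in particular
10-12	Follows the prompt strictly	Commercial design	Can look stiff
15+	Extreme adherence, may overexpose	Special cases	Unnatural images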

不同模型的最佳 CFG:

写实模型(Realistic Vision, ChilloutMix):

CFG: 6-8
原因: 写实模型对CFG敏感,高CFG易过曝

二次元模型(Anything, CounterfeitV3):

CFG: 5-9
原因: 二次元容错率高,低CFG也有好效果

艺术风格模型(DreamShaper):

CFG: 7-11
原因: 艺术风格需要更高指导

LoRA 加持:

CFG: 降低1-2(如原本7,改为5-6)
原因: LoRA已强化特定特征,不需高CFG

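Steps (Sampling Steps)

What Steps means:

Sampling steps = the number of iterations the sampler takes from noise to a clean image

Steps = 10:  blurry, incomplete
Steps = 20:  basically clean (best value)
Steps = 30:  refined detail
Steps = 50+: marginal gains, wasted time

Measured data (DPM++ 2M Karras, RTX 3060; quality scores are the author's subjective ratings):

Steps	Time	Quality score	Notes
10	5 s	60	-
15	7 s	75	-
20	10 s	85	Best value
25	12 s	90	-
30	15 s	92	-
40	20 s	93	-
50	25 s	93.5	Wasteful

Conclusions:

20-25 Steps is the best-value range
Beyond 30 Steps the improvement is barely visible
With the UniPC sampler, 15 steps is enough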
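
The "best value" conclusion can be read straight off the table by computing how many quality points each extra second buys. The sketch below uses only the figures quoted in the table; `marginal_gains` is a hypothetical helper.

```python
# (steps, seconds, quality score) copied from the table above.
MEASUREMENTS = [
    (10, 5, 60), (15, 7, 75), (20, 10, 85), (25, 12, 90),
    (30, 15, 92), (40, 20, 93), (50, 25, 93.5),
]

def marginal_gains(rows):
    """Quality points gained per extra second when moving up to each row."""
    gains = []
    for (_, t0, q0), (steps, t1, q1) in zip(rows, rows[1:]):
        gains.append((steps, round((q1 - q0) / (t1 - t0), 2)))
    return gains

for steps, gain in marginal_gains(MEASUREMENTS):
    print(f"up to {steps} steps: {gain} points per extra second")
```

The gain per second drops from 7.5 at 15 steps to below 1 past 25 steps, which matches the recommendation to stay in the 20-25 range.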

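Resolution Settings

SD 1.5 base resolution: 512×512

Ratio	Resolution	Use	VRAM needed
1:1	512×512	Avatars, icons	4GB
2:3	512×768	Vertical portraits (most common)	5GB
3:2	768×512	Horizontal landscapes	5GB
≈9:16	512×896	Phone wallpapers	6GB
≈16:9	896×512	Desktop wallpapers	6GB
≈3:4	640×832	Full-body portraits	6GB

Important:

Wrong: generating 1024×1024 directly with an SD 1.5 model leads to:
- Repeated elements (several heads, extra hands)
- Chaotic composition
- Broken details

Right:
1. Generate at 512×768 first
2. Upscale to 1024×1536 with Hires.fix

Keep dimensions on the model's grid:

Width and height must be divisible by 8; multiples of 64 are the safest convention for SD 1.5.

Recommended: 512, 576, 640, 704, 768, 832, 896, 960, 1024
Avoid: 500, 700, 1000 (none is a multiple of 64, and 500 and 700 are not even divisible by 8)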
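
If you start from an arbitrary target size, rounding it to the grid and working out the Hires.fix output is simple arithmetic. The sketch below is illustrative only; `snap_to_multiple` and `hires_target` are hypothetical helpers, and the default grid of 64 follows the convention used in this guide.

```python
def snap_to_multiple(value, multiple=64):
    """Round a pixel dimension to the nearest multiple (ties round up)."""
    snapped = ((value + multiple // 2) // multiple) * multiple
    return max(multiple, snapped)  # never return zero

def hires_target(width, height, scale=2.0, multiple=64):
    """Final size after an upscale by `scale`, snapped to the grid."""
    return (snap_to_multiple(int(width * scale), multiple),
            snap_to_multiple(int(height * scale), multiple))

print([snap_to_multiple(v) for v in (500, 700, 1000)])
print(hires_target(512, 768))
```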

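Hires.fix (High-Resolution Fix)

How it works:

Generation happens in two stages:

Stage 1: generate a 512×768 base image
Stage 2: upscale to 1024×1536 and repaint to refine detail

Pros: avoids the composition problems of generating at high resolution directly
Cons: at least doubles the time (two passes, the second at a larger size)

Settings:

1. Check "Hires. fix"
2. Upscaler: choose the upscaling algorithm
3. Upscale by: the scale factor (usually 2.0)
4. Denoising strength: how much to repaint (0.4-0.7)

Choosing an upscaler:

Upscaler	Best for
Latent (bicubic)	A balanced default; latent upscalers generally need a denoising strength of about 0.5 or more to avoid blur
R-ESRGAN 4x+	Realistic portraits, best detail
R-ESRGAN 4x+ Anime6B	Anime
SwinIR 4x	Very high quality, very slow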

Denoising Strength(重绘强度):

0.0: 不重绘,仅放大(模糊)
0.3: 轻微优化,保持原图(保守选择)
0.5: 平衡重绘,优化细节(推荐)
0.7: 大幅重绘,可能改变画面(谨慎使用)
1.0: 完全重绘,几乎是新图

选择建议:

写实人像: 0.4-0.5(保持面部特征)
风景: 0.5-0.6(可以多优化)
二次元: 0.45-0.55(平衡)
初次尝试: 0.5(最保险)

完整配置示例:

写实人像高清输出:

First pass:
- Size: 512×768
- Steps: 25
- Sampler: DPM++ 2M Karras
- CFG: 7

Hires. fix:
- Enable Hires. fix: checked
- Upscaler: R-ESRGAN 4x+
- Upscale by: 2.0 (final size 1024×1536)
- Hires steps: 15
- Denoising strength: 0.45

Generation time: about 25 seconds (RTX 3060)

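Seed (Random Seed)

What the seed does:

Seed = the number that initializes the random noise the image starts from

Seed = -1:  a new random seed each time, so every image differs
Seed = 123456: a fixed value; the same seed with the same parameters reproduces the same image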
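
Reproducibility comes from the random number generator, not from anything specific to image models. The sketch below uses Python's standard `random` module as a stand-in for the sampler's noise source (an analogy, not the WebUI's actual generator): the same seed yields identical starting noise, a different seed yields different noise.

```python
import random

def initial_noise(seed, size=4):
    """Toy stand-in for the starting latent noise: `size` Gaussian samples."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]

same_a = initial_noise(123456)
same_b = initial_noise(123456)
other = initial_noise(123457)
print(same_a == same_b)  # same seed, same noise
print(same_a == other)   # neighbouring seed, unrelated noise
```

This is also why consecutive seeds such as 1001, 1002, 1003 give unrelated compositions rather than gradual variations.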

使用场景:

探索阶段(-1随机):

目的: 测试提示词,寻找满意构图
设置: Seed = -1
批量生成: 4-8张,选最佳

微调阶段(固定Seed):

目的: 固定构图,仅调整参数
步骤:
1. 找到满意图片,记下Seed(如3847562)
2. 固定Seed = 3847562
3. 调整CFG、Steps、提示词权重
4. 对比哪个参数效果最好

系列创作(Seed变化):

目的: 生成相似但不同的角色
方法: 固定提示词,Seed递增
Seed: 1001, 1002, 1003, 1004...
效果: 相同风格,不同姿态/表情

从PNG读取Seed:

方法1: PNG Info标签页
1. 拖入图片
2. 查看完整参数,包括Seed

方法2: 直接发送到txt2img
1. PNG Info中点击"Send to txt2img"
2. 所有参数自动填充
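
The text shown in the PNG Info tab can also be parsed in a script, for example to collect seeds from a folder of favorites. The sketch below assumes the parameter text follows the common WebUI layout (prompt lines, a `Negative prompt:` line, then one final `Key: value, ...` line); the sample string is made up, `parse_infotext` is a hypothetical helper, and values that themselves contain `", "` are not handled.

```python
def parse_infotext(text):
    """Split a WebUI-style parameter string into prompt, negative prompt and settings."""
    lines = text.strip().split("\n")
    settings_line = lines.pop()  # the last line holds "Key: value, Key: value, ..."
    prompt_lines, negative_lines, in_negative = [], [], False
    for line in lines:
        if line.startswith("Negative prompt:"):
            in_negative = True
            line = line[len("Negative prompt:"):]
        (negative_lines if in_negative else prompt_lines).append(line.strip())
    settings = {}
    for item in settings_line.split(", "):
        key, sep, value = item.partition(": ")
        if sep:
            settings[key] = value
    return {
        "prompt": "\n".join(prompt_lines),
        "negative_prompt": "\n".join(negative_lines),
        "settings": settings,
    }

sample = (
    "a beautiful girl, long hair, smile\n"
    "Negative prompt: lowres, bad anatomy\n"
    "Steps: 25, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 3847562, Size: 512x768"
)
parsed = parse_infotext(sample)
print(parsed["settings"]["Seed"])
```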

参数组合推荐配置

配置1:快速预览(测试提示词)

目标: 5-8秒出图,快速验证想法
Sampler: UniPC
Steps: 15
CFG Scale: 7
Resolution: 512×512
Hires.fix: 关闭
Batch Size: 1
Batch Count: 4
显存需求: 4GB
适合: 探索阶段,测试大量提示词

配置2:日常创作(平衡)

目标: 10-15秒,质量满意
Sampler: DPM++ 2M Karras
Steps: 20
CFG Scale: 7
Resolution: 512×768
Hires.fix: 关闭
Batch Size: 1
Batch Count: 1
显存需求: 5GB
适合: 日常出图,性价比最高(最推荐)

配置3:高质量输出(写实人像)

目标: 极致质量,商用级别
Sampler: DPM++ SDE Karras
Steps: 28
CFG Scale: 6.5
Resolution: 512×768
Hires.fix: 启用
  - Upscaler: R-ESRGAN 4x+
  - Upscale by: 2.0
  - Denoising: 0.45
  - Hires steps: 20
Batch Size: 1
显存需求: 8GB
生成时间: 25-30秒
适合: 精品创作,最终作品

配置4:二次元插画

Sampler: Euler a
Steps: 28
CFG Scale: 6
Resolution: 512×768
Hires.fix: 启用
  - Upscaler: R-ESRGAN 4x+ Anime6B
  - Upscale by: 2.0
  - Denoising: 0.5
Model: Anything V5 / CounterfeitV3
显存需求: 7GB
适合: 高质量动漫角色
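
Presets like these can be kept as data and reused when driving the WebUI from a script. The sketch below builds a request body for the "everyday creation" preset (Config 2). It assumes the WebUI was started with the `--api` flag and that the field names match the txt2img endpoint as commonly documented; check the interactive `/docs` page of your own install before relying on them, since newer versions may split the sampler and scheduler into separate fields. Nothing is sent over the network here.

```python
import json

def txt2img_payload(prompt, negative_prompt="", seed=-1, hires=False):
    """Build a request body mirroring the 'everyday creation' preset (Config 2)."""
    payload = {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "sampler_name": "DPM++ 2M Karras",
        "steps": 20,
        "cfg_scale": 7,
        "width": 512,
        "height": 768,
        "seed": seed,
        "batch_size": 1,
        "n_iter": 1,  # assumed to correspond to "Batch Count" in the UI
    }
    if hires:
        # Optional second pass, using the Hires.fix values from Config 3.
        payload.update({
            "enable_hr": True,
            "hr_scale": 2.0,
            "hr_upscaler": "R-ESRGAN 4x+",
            "denoising_strength": 0.45,
        })
    return payload

body = txt2img_payload("1girl, smile", "easynegative", seed=3847562)
print(json.dumps(body, indent=2))
```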

标准优化流程(推荐)

Step 1:快速验证提示词(2分钟)

Sampler: UniPC
Steps: 15
CFG: 7
Resolution: 512×512
Batch Count: 4
目的: 确认提示词是否正确,构图是否满意

Step 2:提升质量生成(1分钟)

选择最佳构图的图片,记下Seed
固定Seed
Sampler: 改为DPM++ 2M Karras
Steps: 改为25
Resolution: 改为512×768
目的: 在满意构图基础上提升质量

Step 3:微调CFG(可选,2分钟)

固定Seed和其他参数
测试CFG: 6, 7, 8
对比选最佳

Step 4:高清输出(1分钟)

启用Hires.fix
Upscaler: R-ESRGAN 4x+(写实)或Anime6B(二次元)
Upscale by: 2.0
Denoising: 0.45
生成最终1024×1536高清图

总耗时:约6-8分钟,产出完美作品

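6. Common Problems and Solutions

Problem 1: CUDA out of memory

Cause: the GPU has run out of VRAM

Solutions:

Enable low-VRAM mode

Edit webui-user.bat and add the arguments:
set COMMANDLINE_ARGS=--medvram --xformers

Lower the resolution

Drop from 512×768 to 512×512
Or generate at 512×512 and upscale with Hires.fix

Reduce the batch size

Set Batch size to 1

Close other programs that use VRAM

Close unneeded browser tabs
Close games and video-editing software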

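Problem 2: Generation is very slow

Ways to speed it up:

Enable xFormers

set COMMANDLINE_ARGS=--xformers --theme dark

Check whether you are running in CPU mode

Look at COMMANDLINE_ARGS in webui-user.bat and at the console output on startup
Remove flags such as --use-cpu or --skip-torch-cuda-test if present, and confirm the console does not report that CUDA is unavailable

Lower the step count

Reduce Steps from 30 to 20-25

Switch samplers

Move from DDIM/PLMS to DPM++ 2M or UniPC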

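Problem 3: Images come out entirely black or white

Cause: usually a VAE problem (a missing or mismatched VAE, or the VAE producing invalid values in half precision)

Solutions:

Download a VAE

Download vae-ft-mse-840000-ema-pruned.safetensors
Put it in models/VAE/

Select the VAE

Settings > VAE > choose vae-ft-mse-840000

Restart the WebUI

The change takes effect after a restart

If images are still black

Add --no-half-vae to COMMANDLINE_ARGS so the VAE runs in full precision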

问题4:手部总是画不对

原因:手部是AI绘画的经典难题

解决方案:

提示词强化

Positive: (perfect hands:1.3), detailed hands, five fingers
Negative: bad hands, extra fingers, fewer digits, malformed hands

使用ControlNet

ControlNet: OpenPose
参考图: 正确手势照片
Weight: 1.2

后期修复

Photoshop生成式填充
或从其他图片移植正确手部

问题5:画面过曝/过暗

解决方案:

过曝:

CFG过高 → 降到6-7
或检查VAE是否正确加载

过暗:

CFG过低 → 升到8-9
或提示词添加: bright, well-lit

问题6:构图重复(多个头/手)

原因:分辨率超出模型训练范围

解决方案:

降低初始分辨率

从1024×1024降到512×768

用Hires.fix放大

先512×768生成
再Hires.fix放大到1024×1536

提示词中强调

1girl(避免多人)
single character

七、推荐插件与工具

必装插件

1. Civitai Helper(模型管理)

功能:

一键下载Civitai模型,查看模型信息
管理本地模型
检查模型更新

安装:

Extensions > Available > 搜索"Civitai"

2. ControlNet(精准控制)

功能:

姿态控制、线稿上色、深度图
精确控制构图

安装:

Extensions > Available > 搜索"ControlNet"

3. TagComplete(标签自动补全)

功能:

输入提示词时自动补全
提高效率

4. Image Browser(图片浏览)

功能:

查看历史生成图片,管理收藏
快速找到之前的作品

5. Dynamic Prompts(提示词随机)

功能:

批量生成不同提示词组合
快速探索多种可能性

八、下一步学习建议

完成本文学习后,建议:

生成10张测试图,熟悉界面
下载2-3个不同风格的模型
尝试5个案例完整实践
建立自己的提示词库(收藏常用词)
安装ControlNet插件(下篇详解)

进阶资源:

Civitai:https://civitai.com(模型+案例)
PromptHero:https://prompthero.com(提示词灵感)
Lexica:https://lexica.art(SD作品搜索)
OpenArt:https://openart.ai(提示词搜索)

总结:

Stable Diffusion 的学习曲线虽然陡峭,但一旦掌握,它将成为你最强大的创作工具。与 Midjourney 的"黑盒"不同,SD 让你真正理解 AI 绘画的每个环节,实现精确控制。

从今天开始,安装 SD,下载模型,生成你的第一张作品,探索 AI 艺术的无限可能。记住:没有完美的配方,只有不断实验和调整。

下篇预告:ControlNet 精准控制、50+ 实战案例、商业应用技巧,让你从入门到精通,成为真正的 AI 绘画大师。

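[Part 2] Advanced Stable Diffusion in Practice: Precise Control with ControlNet + 50 Complete Cases (2024 Edition)

Keywords: ControlNet tutorial, pose control, line-art coloring, depth maps, OpenPose, Canny, AI art case studies, commercial applications, prompt templates

If prompts and parameters are the "language" of AI art, ControlNet is the "skeleton". It lets you control the structure of the image: a character takes a specified pose, the composition of a reference image is preserved, line art is colored in seconds, and scene depth is reproduced.

This article explains ControlNet's core techniques and provides complete practical cases, each with detailed parameters and prompts.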

一、ControlNet 核心原理

什么是 ControlNet

ControlNet = 给 SD 添加"骨架约束"的插件,让 AI 按照你的指引生成图片

传统SD: 文字描述 → AI随机理解 → 不可控的结果
ControlNet: 文字描述 + 参考图/线稿/姿态 → AI精确执行 → 可控的结果

核心能力:

姿态控制:让角色摆出指定姿势
线稿上色:把简笔画变成精美插画
场景复刻:保持参考图的构图和景深
边缘引导:用涂鸦控制物体轮廓

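ControlNet vs. Traditional img2img

Aspect	img2img	ControlNet
Control mechanism	A loose "reference"	An explicit structural constraint
Structural fidelity	Low to medium, unstable	High
Best for	Style transfer	Pose, composition, and line-art control
Difficulty	Easy	Moderate
Result	Highly random	Controllable

Example:

Goal: make a girl show an "OK" hand gesture
img2img: write "ok gesture" in the prompt; the gesture is random and often wrong
ControlNet: upload a skeleton image of the OK gesture; the result follows it closely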

二、ControlNet 安装与配置

插件安装

方法1:WebUI内置安装(推荐)

1. Extensions > Available
2. 点击 "Load from"
3. 搜索 "ControlNet"
4. 找到 "sd-webui-controlnet"
5. 点击 Install
6. Settings > Reload UI

方法2:手动安装

cd stable-diffusion-webui/extensions
git clone https://github.com/Mikubill/sd-webui-controlnet.git
# 重启WebUI

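Downloading Models

Essential models (place them in models/ControlNet/):

1. OpenPose (pose control, the most used)

File name: control_v11p_sd15_openpose.pth
Size: 1.45GB
Download: HuggingFace > lllyasviel/ControlNet-v1-1
Use: body pose and skeleton control
Priority: must-have

2. Canny (edge detection)

File name: control_v11p_sd15_canny.pth
Size: 1.45GB
Use: preserving object outlines and architectural lines
Priority: must-have

3. Depth (depth map)

File name: control_v11f1p_sd15_depth.pth
Size: 1.45GB
Use: scene composition and depth control
Priority: recommended

4. Lineart (line-art extraction)

File name: control_v11p_sd15_lineart.pth
Size: 1.45GB
Use: coloring line art, illustration
Priority: recommended

5. Scribble

File name: control_v11p_sd15_scribble.pth
Size: 1.45GB
Use: control from hand-drawn sketches
Priority: optional

Download sites:

HuggingFace: https://huggingface.co/lllyasviel/ControlNet-v1-1
Mirror for mainland China: https://hf-mirror.com

Advice for beginners: start with OpenPose, Canny, and Depth.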

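Verifying the Installation

1. Refresh the WebUI
2. A collapsible "ControlNet" panel appears at the bottom of the txt2img page
3. Expanding it shows an "Enable" checkbox
4. The Preprocessor dropdown lists options such as "openpose" and "canny"
5. The Model dropdown lists the .pth files you downloaded

If all five checks pass, the installation succeeded.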

三、OpenPose 姿态控制详解

基础用法

场景:让角色摆出指定姿势

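Steps:

1. Find a reference pose image (for example, a photo of a dance move)
2. In the ControlNet panel:
   - Enable: checked
   - Upload the pose image
   - Preprocessor: openpose_full
   - Model: control_v11p_sd15_openpose
   - Weight: 1.0

3. Main prompt:
   Prompt: 1girl, school uniform, smile, outdoor
   Negative: bad hands, extra fingers

4. Generate

Result: the generated girl closely follows the pose in the reference image, while her appearance, clothing, and background come from the prompt.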

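OpenPose Variants

1. openpose (body only)

Preprocessor: openpose
Extracts: body skeleton (no fingers, no face)
Speed: fast
Best for: full-body poses where hand detail does not matter

2. openpose_full (complete, recommended)

Preprocessor: openpose_full
Extracts: body + fingers + face keypoints
Speed: medium
Best for: exact hand gestures and head orientation

3. openpose_hand (body + hands)

Preprocessor: openpose_hand
Extracts: body skeleton plus detailed hand keypoints
Best for: shots where the hand gesture matters

4. openpose_face (body + face)

Preprocessor: openpose_face
Extracts: body skeleton plus facial keypoints (openpose_faceonly extracts the face alone)
Best for: controlling expression and head direction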

实战案例

案例1:复刻舞蹈动作

参考图: 芭蕾舞者单腿站立照片

ControlNet设置:
- Preprocessor: openpose_full
- Model: control_openpose
- Weight: 1.2(稍微强化)

Prompt:
1girl, pink ballet dress, ballet shoes,
professional photography, stage lighting,
graceful, elegant, 8k uhd

生成结果: 不同女孩,但完美复刻芭蕾姿态
用时: 15秒

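Case 2: An exact hand gesture (OK sign)

Reference image: close-up of a hand making the OK sign

ControlNet:
- Preprocessor: openpose_hand
- Weight: 1.3

Prompt:
1girl, making ok gesture, smile,
looking at viewer, close-up hand,
sharp focus on hand

Result: the gesture follows the reference closely, which mitigates SD's notorious mangled-hands problem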

四、Canny 边缘控制详解

原理

Canny = 提取图像边缘线条,保持物体轮廓

适用场景:
- 建筑物线条控制
- 物体轮廓保持
- 构图复刻
- 简笔画上色

基础用法

场景:保持建筑物的结构线条

参考图: 一张房屋照片

ControlNet:
- Preprocessor: canny
- Model: control_canny
- Weight: 1.0
- Canny Low Threshold: 100(边缘敏感度下限)
- Canny High Threshold: 200(边缘敏感度上限)

Prompt:
modern architecture, sunset lighting,
professional photography, vibrant colors

效果: 保持房屋轮廓和结构,但材质、光线、风格按提示词变化

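Tuning the Canny Thresholds

What Low/High Threshold mean:

Gradients above High always count as edges; gradients between Low and High count only when connected to a strong edge; gradients below Low are discarded.

Low: 100, High: 200 (standard)
- Main contours with moderate detail (recommended)

Low: 50, High: 150 (low thresholds)
- Extracts more fine edges
- Best for: detailed line art

Low: 150, High: 250 (high thresholds)
- Keeps only the strongest contours
- Best for: simplified compositions

Tuning tips:

1. Generate once with the defaults (100/200)
2. Inspect the extracted edge map in the ControlNet preview window
3. Too many fine edges → raise the thresholds
4. Too few, too coarse edges → lower the thresholds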
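
The interplay of the two thresholds is easier to see on a toy example. The sketch below applies Canny-style hysteresis to a single row of made-up gradient magnitudes; it is a one-dimensional simplification for illustration, not the full 2D algorithm the preprocessor runs, and `hysteresis_1d` is a hypothetical helper.

```python
def hysteresis_1d(gradients, low=100, high=200):
    """Keep strong edges (>= high) and weak edges (>= low) that touch a kept edge."""
    keep = [g >= high for g in gradients]
    changed = True
    while changed:  # grow strong edges into neighbouring weak ones
        changed = False
        for i, g in enumerate(gradients):
            if keep[i] or g < low:
                continue
            left = i > 0 and keep[i - 1]
            right = i + 1 < len(gradients) and keep[i + 1]
            if left or right:
                keep[i] = True
                changed = True
    return keep

row = [30, 120, 220, 150, 90, 130, 40]  # made-up gradient magnitudes
print(hysteresis_1d(row))            # default 100/200
print(hysteresis_1d(row, 50, 150))   # lower thresholds keep more pixels
```

With 100/200 the isolated value 130 is dropped because it is not connected to a strong edge; with 50/150 it survives, which is the "more fine edges" behavior described above.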

实战案例

案例1:建筑风格迁移

参考图: 欧式城堡照片

ControlNet:
- Preprocessor: canny
- Threshold: 100/200

Prompt:
(cyberpunk:1.3), neon lights, futuristic architecture,
night scene, holographic signs

效果: 城堡轮廓不变,变成赛博朋克风格
应用: 建筑设计方案探索

案例2:简笔画上色

参考图: 手绘女孩线稿(黑白)

ControlNet:
- Preprocessor: canny 或 lineart
- Weight: 1.0

Prompt:
1girl, colorful illustration, anime style,
vibrant colors, detailed shading

效果: 线稿变成彩色精美插画
用时: 12秒

五、Depth 深度控制详解

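How It Works

Depth = extract the near/far information of an image and preserve its spatial layering

Depth map: white = near, black = far, gray = in between

Typical uses:
- Reproducing a scene's composition
- Interior design
- Preserving layers in a landscape
- The relationship between a character and the background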
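
The white-is-near convention amounts to an inverted, normalized distance. The sketch below converts a few made-up distances (in arbitrary units) into 0-255 gray values; real preprocessors estimate relative depth from a single image with a neural network, so this only illustrates how the resulting map is encoded.

```python
def depth_to_gray(distances):
    """Map distances to 0-255 so the nearest point is white and the farthest is black."""
    near, far = min(distances), max(distances)
    if far == near:
        return [255 for _ in distances]  # flat scene: everything equally near
    return [round(255 * (far - d) / (far - near)) for d in distances]

print(depth_to_gray([1.0, 4.0, 10.0]))
```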

预处理器选择

1. depth_midas(通用,推荐)

Preprocessor: depth_midas
精度: 高
速度: 快
适合: 大部分场景

2. depth_leres(超精细)

Preprocessor: depth_leres
精度: 极高(细节最好)
速度: 慢
适合: 复杂场景、多层次景深

3. depth_zoe(新算法)

Preprocessor: depth_zoe
精度: 高
速度: 中等
适合: 人物+场景组合

实战案例

案例1:室内设计方案

参考图: 现代客厅照片

ControlNet:
- Preprocessor: depth_midas
- Model: control_depth
- Weight: 1.0

Prompt:
luxury interior design, chinese style,
wooden furniture, warm lighting,
4k architectural photography

效果: 保持房间空间布局,家具位置,但风格变成中式
应用: 室内设计风格探索

案例2:风景构图复刻

参考图: 山脉+湖泊+前景树木

ControlNet:
- Preprocessor: depth_leres(多层次)
- Weight: 0.9

Prompt:
fantasy landscape, magical forest,
glowing plants, aurora sky,
epic scenery, cinematic

效果: 保持山、湖、树的空间关系,变成奇幻风景

六、Lineart 线稿控制详解

Lineart vs Canny

对比	Lineart	Canny
提取内容	艺术线稿(柔和)	硬边缘(锐利)
适合	插画、动漫	建筑、物体
风格	手绘感	技术图纸感

预处理器

1. lineart(标准)

Preprocessor: lineart
提取: 干净线稿
适合: 动漫角色、插画

2. lineart_anime(动漫专用)

Preprocessor: lineart_anime
提取: 动漫风格线稿
适合: 二次元创作(推荐)

3. lineart_realistic(写实)

Preprocessor: lineart_realistic
提取: 写实素描线条
适合: 写实素描上色

实战案例

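Case: From rough sketch to illustration

Reference image: a hand-drawn pencil sketch

ControlNet:
- Preprocessor: lineart_anime
- Model: control_v11p_sd15s2_lineart_anime (the model that pairs with lineart_anime; control_v11p_sd15_lineart pairs with the lineart and lineart_realistic preprocessors)
- Weight: 1.1

Prompt:
1girl, anime style, colorful illustration,
cel shading, vibrant colors, detailed eyes

Checkpoint: Anything V5

Result: the sketch becomes a polished anime illustration
Time: 15 seconds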

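7. Combining Multiple ControlNets

Using several ControlNets at once

The WebUI can run several ControlNet units at the same time (3 by default in recent versions of the extension; the number of units can be changed in Settings > ControlNet)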

常见组合:

组合1:OpenPose + Canny(姿态+轮廓)

ControlNet 0:
- Type: OpenPose
- 参考图: 人物姿态
- Weight: 1.0

ControlNet 1:
- Type: Canny
- 参考图: 服装轮廓
- Weight: 0.8

效果: 精确控制姿态+服装细节

组合2:Depth + Canny(景深+线条)

ControlNet 0:
- Type: Depth
- 参考图: 场景深度
- Weight: 1.0

ControlNet 1:
- Type: Canny
- 参考图: 建筑线条
- Weight: 0.9

效果: 保持空间层次和建筑结构
用途: 建筑渲染、室内设计

组合3:OpenPose + Depth(人物姿态+场景深度)

ControlNet 0:
- Type: OpenPose
- Weight: 1.2

ControlNet 1:
- Type: Depth
- Weight: 0.7

效果: 人物姿态精确,场景层次自然
用途: 人物+环境合成

权重平衡

权重分配原则:

主控制: Weight 1.0-1.2(如姿态)
辅助控制: Weight 0.6-0.8(如背景深度)
微调控制: Weight 0.3-0.5(如光影)

总和建议不超过2.5,否则过度约束

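8. Selected Cases from the 50 Practical Examples

Because of space limits, this section presents a selection of the most practical complete cases, covering portraits, landscapes, illustration, design, and other areas.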

案例1:专业证件照

Checkpoint: Realistic Vision V5.1
VAE: vae-ft-mse-840000

Prompt:
professional id photo, passport photo,
1girl, 25 years old, black business suit, white shirt,
neutral expression, looking at camera,
white background, studio lighting, front view,
sharp focus, high resolution,
(professional photography:1.2)

Negative Prompt:
(worst quality:2), (low quality:2), smile, accessories,
jewelry, makeup, colorful background, shadows,
side view, tilted head

Sampler: DPM++ 2M Karras
Steps: 25
CFG Scale: 7
Size: 512×640
Seed: -1

Hires.fix:
- Upscaler: R-ESRGAN 4x+
- Upscale by: 2.0(最终1024×1280)
- Denoising: 0.4

技巧:
- 白背景: 提示词强调 "white background"
- 中性表情: "neutral expression" 而非 smile
- 正面照: "front view, looking at camera"

用时: 20秒

案例2:时尚杂志封面

Checkpoint: ChilloutMix 或 Realistic Vision V5.1
LoRA: <lora:koreanDollLikeness_v15:0.6>

Prompt:
(magazine cover:1.3), fashion photography,
1girl, 22 years old, (beautiful face:1.2),
long wavy hair, professional makeup,
designer dress, elegant pose,
outdoor fashion shoot, natural lighting,
bokeh background, depth of field,
shot on Canon EOS R5, 85mm f/1.2,
vogue style, glamorous,
(8k uhd, RAW photo, best quality:1.3)

Negative Prompt:
easynegative, (bad hands:1.2), extra fingers,
(worst quality:2), lowres, watermark, text,
bad anatomy, bad proportions

Sampler: DPM++ SDE Karras
Steps: 28
CFG Scale: 6
Size: 512×768

Hires.fix:
- Upscaler: R-ESRGAN 4x+
- Upscale by: 2.0
- Denoising: 0.45
- Hires steps: 20

技巧:
- 杂志感: (magazine cover:1.3) 权重强化
- 虚化背景: "bokeh background, depth of field"
- 设备模拟: "shot on Canon EOS R5, 85mm f/1.2"

案例3:中国风古装人像

Checkpoint: GuoFeng3 或 Realistic Vision V5.1
LoRA: <lora:hanfu_v3:0.8>

Prompt:
1girl, hanfu, traditional chinese clothing,
(ancient china:1.2), tang dynasty style,
long black hair, hair ornament, jewelry,
standing in bamboo forest,
soft sunlight through trees, ethereal,
professional photography, cinematic lighting,
elegant pose, looking away,
(best quality, 8k uhd:1.2)

Negative Prompt:
(worst quality:2), modern, contemporary,
bad anatomy, extra fingers, lowres,
western clothing

Sampler: DPM++ 2M Karras
Steps: 28
CFG Scale: 7.5
Size: 512×768

汉服类型关键词:
齐胸襦裙: qixiong ruqun
交领襦裙: crossed collar ruqun
唐装: tang suit
明制汉服: ming dynasty hanfu

案例4:标准动漫角色立绘

Checkpoint: Anything V5 或 CounterfeitV3

Prompt:
1girl, anime style, full body,
(beautiful detailed eyes:1.2), blue eyes,
long blue hair, twin tails, hair ribbon,
school uniform, pleated skirt, thigh highs,
standing, hand on hip, smile,
white background, simple background,
character design, official art,
(masterpiece, best quality:1.3)

Negative Prompt:
(worst quality:2), (low quality:2), lowres,
(bad anatomy:1.2), bad hands, extra fingers,
text, watermark, realistic, 3d

Sampler: Euler a
Steps: 28
CFG Scale: 6
Size: 512×768

角色设定变体:
发型: long hair → short hair, ponytail, messy hair
发色: blue hair → pink hair, white hair, multicolored hair
服装: school uniform → maid outfit, witch hat, armor

案例5:自然风光摄影

Checkpoint: Realistic Vision V5.1 或 DreamShaper

Prompt:
landscape photography, no humans,
mountain range, snow capped peaks,
alpine lake reflection, crystal clear water,
pine forest foreground,
golden hour lighting, sunset, warm tones,
dramatic clouds, god rays,
professional nature photography,
shot on Sony A7R IV, 24mm wide angle,
(8k uhd, high quality, masterpiece:1.2)

Negative Prompt:
people, buildings, cars, modern,
(worst quality:2), lowres, blurry,
oversaturated

Sampler: DPM++ 2M Karras
Steps: 25
CFG Scale: 8
Size: 768×512

Hires.fix:
- Upscaler: R-ESRGAN 4x+
- Upscale by: 2.0
- Denoising: 0.5

风光摄影变体:
地形: mountain → beach, desert, canyon, waterfall
时间: sunset → sunrise, blue hour, night, stormy sky
季节: summer → autumn colors, winter, spring flowers

案例6:赛博朋克城市

Checkpoint: DreamShaper 或 RealisticVision

Prompt:
cyberpunk city, futuristic cityscape,
(neon lights:1.3), holographic advertisements,
rain wet streets, reflections,
tall skyscrapers, flying cars,
night scene, purple and blue lighting,
dystopian atmosphere, blade runner style,
cinematic, wide angle view,
(highly detailed:1.2), 8k uhd

Negative Prompt:
people, daytime, nature, green,
(worst quality:2), lowres, blurry

Sampler: DPM++ 2M Karras
Steps: 30
CFG Scale: 9
Size: 896×512(宽屏)

赛博朋克元素:
Neon lights: 霓虹灯
Holographic: 全息投影
Rain wet streets: 雨湿街道
Flying cars: 飞行汽车
Dystopian: 反乌托邦
Blade runner style: 银翼杀手风格

案例7:奇幻魔法森林

Checkpoint: DreamShaper

Prompt:
fantasy forest, magical atmosphere,
giant mushrooms, glowing plants,
bioluminescent flowers, fireflies,
ancient trees, twisted roots,
fog, mystical lighting, volumetric light,
purple and blue color scheme,
fantasy art, concept art,
highly detailed, 8k wallpaper,
(masterpiece:1.2)

Negative Prompt:
realistic, photo, modern, people,
(worst quality:2), lowres

Sampler: DPM++ 2M Karras
Steps: 28
CFG Scale: 8
Size: 768×512

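Fantasy keywords:
Bioluminescent: light emitted by living organisms
Glowing: emitting light
Mystical: mysterious, otherworldly
Volumetric light: visible shafts of light (the Tyndall effect)
Ancient: very old, weathered
Magic particles: floating motes of magical light

Case 8: Product Rendering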

Checkpoint: Realistic Vision V5.1

Prompt:
product photography, luxury watch,
metallic silver, leather strap,
studio lighting, white background,
professional product shot, reflections,
high end, detailed, macro photography,
commercial photography,
(8k uhd, highly detailed:1.3)

Negative Prompt:
low quality, blurry, scratched,
(worst quality:2), dirty

Sampler: DPM++ 2M Karras
Steps: 28
CFG Scale: 8
Size: 640×640

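Product photography keywords:
Product photography: a shot built around a single product
Studio lighting: a controlled studio light setup
White background: a plain white backdrop
Macro photography: extreme close-up
Commercial: an advertising-grade look
High end: a premium, luxury feel
Reflections: reflections on glossy surfaces

Case 9: Logo Design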

Checkpoint: DreamShaper

Prompt:
logo design, brand identity,
minimalist logo, letter "A" monogram,
geometric shape, clean lines,
professional, corporate,
black and white, vector style,
simple, modern, flat design

Negative Prompt:
complex, detailed, realistic, photo,
gradient, 3d, (worst quality:2)

Sampler: DPM++ 2M Karras
Steps: 20
CFG Scale: 8
Size: 512×512

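Logo design keywords:
Logo design: a brand mark
Monogram: a mark built from one or more letters
Minimalist: pared down to essentials
Flat design: no shading or depth
Vector style: crisp, solid-color shapes
Corporate: a business-like tone

Case 10: Van Gogh Style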

Checkpoint: DreamShaper

Prompt:
(van gogh style:1.3), oil painting,
starry night, village scene,
swirling sky, bright stars, crescent moon,
cypress trees, church steeple,
vibrant colors, thick brush strokes,
impasto technique, post-impressionism,
(masterpiece, art style:1.2)

Negative Prompt:
realistic, photo, modern,
(worst quality:2), flat

Sampler: Euler a
Steps: 30
CFG Scale: 8
Size: 768×512

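Post-Impressionist keywords:
Van gogh style: the manner of Vincent van Gogh
Swirling: spiraling, turbulent strokes
Thick brush strokes: heavy, visible brushwork
Impasto: paint laid on thickly enough to stand out from the surface
Vibrant colors: saturated, intense color
Post-impressionism: the movement Van Gogh belongs to

Case 11: Pixel Art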

Checkpoint: Anything V5

Prompt:
pixel art, 16bit, retro game style,
isometric view, fantasy village,
small houses, trees, river,
detailed pixel work, limited color palette,
indie game art, (pixel perfect:1.2)

Negative Prompt:
realistic, smooth, high resolution,
anti-aliasing, (worst quality:2)

Sampler: Euler a
Steps: 20
CFG Scale: 7
Size: 512×512

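Pixel art keywords:
Pixel art: images built from visible individual pixels
16bit: the look of 16-bit-era games (a richer palette than 8bit)
8bit: the look of 8-bit-era games (more retro, fewer colors)
Isometric: an angled, 3D-like view without perspective
Retro game: a classic video game look
Pixel perfect: crisp, grid-aligned pixels

Case 12: Low Poly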

Checkpoint: DreamShaper

Prompt:
(low poly:1.3), 3d render, geometric,
polygonal style, faceted surface,
mountain landscape, trees, lake,
vibrant colors, flat shading,
minimalist, clean geometry,
blender render, (stylized:1.2)

Negative Prompt:
realistic, high poly, detailed texture,
(worst quality:2), smooth

Sampler: DPM++ 2M Karras
Steps: 25
CFG Scale: 7.5
Size: 768×512

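Low Poly keywords:
Low poly: a 3D look with a low polygon count
Polygonal: built from visible polygons
Faceted: flat, angular faces
Flat shading: one uniform shade per face
Geometric: simple geometric forms
Minimalist: pared down to essentials

Case 13: Watercolor-Style Illustration

Checkpoint: Anything V5 or PastelMix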

Prompt:
1girl, watercolor painting, watercolor \(medium\),
soft colors, pastel colors, gentle,
dreamy atmosphere, light particles,
sitting under tree, reading book,
flower petals falling, spring,
soft brush strokes, paper texture,
traditional media, artistic,
(masterpiece:1.2)

Negative Prompt:
(realistic:1.2), photo, 3d, digital art,
hard edges, (worst quality:2)

Sampler: DPM++ 2M Karras
Steps: 28
CFG Scale: 6
Size: 640×832

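Reinforcing the watercolor look:
watercolor \(medium\): marks the medium as watercolor; the parentheses are escaped so they are not read as emphasis
soft colors: gentle, low-contrast colors
pastel colors: pale, light tints
paper texture: visible paper grain
watercolor splatter: splashes and drips of paint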
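
In the WebUI prompt syntax, parentheses mark emphasis, which is why the prompt writes the tag as `watercolor \(medium\)` with backslash-escaped parentheses. A small Python sketch of both forms; `escape_tag` and `emphasize` are illustrative helpers, not WebUI functions.

```python
def escape_tag(tag: str) -> str:
    """Backslash-escape literal parentheses so they are not read as emphasis."""
    return tag.replace("(", r"\(").replace(")", r"\)")


def emphasize(text: str, weight: float) -> str:
    """Wrap text in the (text:weight) emphasis syntax used throughout these prompts."""
    return f"({text}:{weight:g})"


if __name__ == "__main__":
    tags = [
        "1girl",
        "watercolor painting",
        escape_tag("watercolor (medium)"),  # literal parentheses, escaped
        emphasize("masterpiece", 1.2),      # real emphasis, left unescaped
    ]
    print(", ".join(tags))
```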

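Case 14: ControlNet Pose + Style Transfer

Goal: turn a photo of a real person into an anime character while keeping the pose

Step 1: Prepare the reference photo
- A photo of a real person standing

Step 2: ControlNet settings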
- Type: OpenPose
- Preprocessor: openpose_full
- Weight: 1.0

Step 3: Main parameters
Checkpoint: Anything V5

Prompt:
1girl, anime style, school uniform,
beautiful detailed eyes, colorful,
(masterpiece:1.2)

Negative Prompt: realistic, photo, 3d

Sampler: Euler a
Steps: 28
CFG Scale: 6
Size: 512×768

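Result: the pose matches the reference photo, and the style becomes anime

Case 15: Precise Control with Multiple ControlNets

Goal: reproduce the reference's pose, scene depth, and edge lines

ControlNet combination: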

ControlNet 0:
- Type: OpenPose
- Reference image: character pose
- Weight: 1.0

ControlNet 1:
- Type: Depth
- Reference image: scene depth
- Weight: 0.8

ControlNet 2:
- Type: Canny
- Reference image: clothing outline
- Weight: 0.6

Prompt:
1girl, elegant dress, garden background,
professional photography, (best quality:1.2)

Effect: three stacked constraints give very tight control over the result
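
The three-unit stack can also be written down as data. The sketch below builds the `alwayson_scripts` section that the sd-webui-controlnet extension reads when the WebUI is called through its API. The unit field names are assumed from that extension's API and should be checked against your installed version. The preprocessor `depth_midas` and all three model names are assumptions too, since the case names only the control types, and the image strings are placeholders. Nothing is sent over the network.

```python
from dataclasses import dataclass


@dataclass
class ControlUnit:
    module: str   # preprocessor name
    model: str    # placeholder; use the model names your installation lists
    weight: float
    image_b64: str = "<base64-encoded reference image>"  # placeholder

    def to_api_args(self) -> dict:
        """One ControlNet unit as an API argument dict (field names are assumptions)."""
        return {
            "enabled": True,
            "module": self.module,
            "model": self.model,
            "weight": self.weight,
            "image": self.image_b64,
        }


UNITS = [
    ControlUnit("openpose_full", "control_v11p_sd15_openpose", 1.0),  # pose
    ControlUnit("depth_midas", "control_v11f1p_sd15_depth", 0.8),     # scene depth
    ControlUnit("canny", "control_v11p_sd15_canny", 0.6),             # clothing outline
]


def build_payload(units: list[ControlUnit]) -> dict:
    """txt2img request body with one ControlNet entry per unit."""
    return {
        "prompt": "1girl, elegant dress, garden background, professional photography, (best quality:1.2)",
        "alwayson_scripts": {"controlnet": {"args": [u.to_api_args() for u in units]}},
    }


if __name__ == "__main__":
    for unit in build_payload(UNITS)["alwayson_scripts"]["controlnet"]["args"]:
        print(unit["module"], unit["weight"])
```

The weights step down from 1.0 to 0.6 so that pose dominates, depth supports it, and the edge map only nudges the outline.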

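9. Common Problems and Advanced Techniques

Problem 1: ControlNet has no effect

Checklist:

1. Is "Enable" checked?
2. Has a reference image been uploaded?
3. Do the Preprocessor and Model match (for example, an OpenPose preprocessor with an OpenPose model)?
4. Is Weight set to 0?
5. Is the model file in the correct directory?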
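
The checklist can be expressed as a small diagnostic function. This is only a sketch: `UnitSettings` is a hypothetical record you would fill in by hand from the ControlNet panel, and the preprocessor/model match is a naming heuristic (the preprocessor family should appear in the model name) that real model names do not always follow.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UnitSettings:
    """Hypothetical snapshot of one ControlNet unit's settings."""
    enabled: bool
    reference_image: Optional[str]  # path of the uploaded image, or None
    preprocessor: str               # e.g. "openpose_full"
    model: str                      # e.g. "control_v11p_sd15_openpose"
    weight: float
    model_file_exists: bool         # checked by hand in the models directory


def diagnose(unit: UnitSettings) -> list[str]:
    """Return the checklist items that fail, in checklist order."""
    problems = []
    if not unit.enabled:
        problems.append('"Enable" is not checked')
    if not unit.reference_image:
        problems.append("no reference image uploaded")
    # Heuristic: "openpose_full" belongs to the "openpose" family,
    # which should appear somewhere in the model name.
    family = unit.preprocessor.split("_")[0].lower()
    if family not in unit.model.lower():
        problems.append("Preprocessor and Model do not match")
    if unit.weight == 0:
        problems.append("Weight is 0")
    if not unit.model_file_exists:
        problems.append("model file not found in the expected directory")
    return problems


if __name__ == "__main__":
    unit = UnitSettings(True, None, "canny", "control_v11p_sd15_openpose", 0.0, True)
    for problem in diagnose(unit):
        print("-", problem)
```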

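Problem 2: Generation becomes slower

Cause: ControlNet adds extra computation

Optimizations:

1. Check "Low VRAM" (low-memory mode; helps when video memory is the bottleneck)
2. Reduce the number of ControlNet units enabled at once
3. Lower the reference image resolution to 512×512
4. Use a lighter Preprocessor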
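
For optimization 3, the target size can be computed before resizing the reference image in any image editor. The sketch below is a variation on the 512×512 advice: it caps the longer side at 512 and keeps the aspect ratio, rounding each side down to a multiple of 8. The function name and the rounding rule are assumptions for illustration.

```python
def fit_within(width: int, height: int, max_side: int = 512, multiple: int = 8) -> tuple[int, int]:
    """Scale (width, height) down so the longer side is at most max_side.

    Integer arithmetic avoids floating-point rounding surprises; each side is
    then rounded down to a multiple of `multiple`.
    """
    longest = max(width, height)
    if longest > max_side:
        width = width * max_side // longest
        height = height * max_side // longest

    def snap(value: int) -> int:
        return max(multiple, value // multiple * multiple)

    return snap(width), snap(height)


if __name__ == "__main__":
    print(fit_within(1920, 1080))  # a typical landscape photo
    print(fit_within(3000, 4000))  # a typical portrait photo
```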

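Problem 3: Too much control, stiff results

Fixes:

1. Lower Weight: 1.0 → 0.7
2. Lower Ending Step: 1.0 → 0.7
3. Control Mode: ControlNet is more important → Balanced

Problem 4: Too little control, no visible effect

Fixes:

1. Raise Weight: 1.0 → 1.3
2. Check that the Preprocessor extracted the features correctly (look at the preview)
3. Control Mode: My prompt is more important → Balanced
4. Check whether the prompt conflicts with the control image (for example, a standing OpenPose skeleton with "sitting" in the prompt)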

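Advanced Technique 1: Staged Control

Starting/Ending Step (both are fractions of the total step count):

Starting step: 0.0 (control begins at the first sampling step)
Ending step: 1.0 (control lasts through the final step)

Technique: adjust the window in which control is applied
- 0.0 - 0.5: control only the first half (composition); the second half runs free (details)
- 0.3 - 1.0: free early on, controlled from the middle onward (keeps creativity while correcting errors)

Use: balancing control against creativity

Example:

Goal: keep the pose but let the AI create the details freely

ControlNet:
- OpenPose
- Weight: 1.0
- Starting: 0.0
- Ending: 0.6 (controlled for the first 60% of the steps, free for the last 40%)

Effect: the pose is correct, while clothing, hairstyle, and background are more inventive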
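
To see what an Ending value of 0.6 means in concrete steps, the fractions can be mapped onto step indices. The sketch below treats a step as controlled when its progress fraction (index divided by total steps) lies within the window. This is an approximation of how the extension applies the window, and the exact boundary handling may differ. The 28-step count is an assumed example value.

```python
def controlled_steps(total_steps: int, start: float, end: float) -> list[int]:
    """0-based step indices whose progress fraction lies in [start, end]."""
    return [i for i in range(total_steps) if start <= i / total_steps <= end]


if __name__ == "__main__":
    steps = controlled_steps(28, 0.0, 0.6)
    print(f"controlled: steps {steps[0]}..{steps[-1]} ({len(steps)} of 28)")
    print(f"free: the remaining {28 - len(steps)} steps")
```

With 28 steps, the pose is enforced for roughly the first 17 steps, which is when composition is settled; the remaining steps refine detail without the constraint.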

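Advanced Technique 2: Low-Weight Overlay

Creative exploration mode:

ControlNet Weight: 0.3-0.5 (low weight)

Effect: a light reference that leaves the AI considerable freedom
Best for: exploring different styles while keeping the rough composition

vs

High weight (1.0+): followed strictly; best for precise replication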

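10. Summary and Learning Advice

Stable Diffusion's Core Advantages, Revisited

Completely free and open source: install once, use for life
Runs locally: private data, unlimited generation
Precise control: ControlNet reproduces a reference's pose and composition with near-exact fidelity
Large ecosystem: thousands of community models, so there is always something new to try
Highly customizable: full control from model to parameters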

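Resources for Continued Learning

Models and examples:

Civitai: https://civitai.com (the most complete model library, with generation parameters)
LiblibAI: https://www.liblib.art (fast access from within China)

Prompt inspiration:

PromptHero: https://prompthero.com
Lexica: https://lexica.art
OpenArt: https://openart.ai

Community:

Reddit: r/StableDiffusion
Discord: the official Stable Diffusion server
Bilibili/YouTube: SD tutorial videos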

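Stable Diffusion is a deep-learning tool that takes ongoing practice and exploration to master. The cases and parameters in this article have all been tested, but what matters most is understanding the principles behind them and then adjusting them to your own needs.

Starting today, open your SD WebUI, install ControlNet, try these cases, and explore what AI art creation can do. Remember: there is no perfect recipe, only continued experimentation and adjustment.

May you go far on your AI art journey and create work that turns heads!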