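AI in Practice (Language Intelligence)

Lecture 3: Prompts

Reference Materials

Core papers, surveys, and internal translations grouped by topic, to help you build your own reading map of prompt engineering.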

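Internal materials (the direct basis for this lecture)

File	Contents
`prompt-engineering-literature-review-zh.md`	Internal course survey (about 25K Chinese characters) covering the evolution of prompting paradigms, theoretical foundations, and automated optimization
`DSPy论文中文翻译_Khattab2023.md`	Chinese translation of the DSPy paper (arXiv:2310.03714)
`TextGrad论文中文翻译.md`	Chinese translation of the TextGrad paper (Nature, 2025)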

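Core original papers

Foundational paradigms

Year	Authors, venue	Title	Key points
2020	Brown et al., NeurIPS	GPT-3: Language Models are Few-Shot Learners	Established the in-context learning (ICL) paradigm and gave rise to prompt engineering
2022	Wei et al., ICLR	FLAN: Finetuned Language Models Are Zero-Shot Learners	Instruction tuning; a precursor of instruction-following models such as InstructGPT
2021	Zhao et al., ICML	Calibrate Before Use	Identifies three systematic biases in ICL (majority-label, recency, and common-token bias)
2022	Min et al., EMNLP	Rethinking the Role of Demonstrations	Random-label experiments: the format of demonstrations matters more than the correctness of their labels
2024	Sclar et al., ICLR	Quantifying Language Models' Sensitivity to Spurious Features	FormatSpread; accuracy differences of up to 76 points from prompt formatting alone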

Chain-of-Thought and extensions

Year	Authors, venue	Title	Key points
2022	Wei et al., NeurIPS	Chain-of-Thought Prompting Elicits Reasoning	Original CoT paper; set a new state of the art on GSM8K
2022	Kojima et al., NeurIPS	Large Language Models are Zero-Shot Reasoners	Zero-shot CoT, "Let's think step by step"
2023	Wang et al., ICLR	Self-Consistency Improves Chain-of-Thought Reasoning	Samples multiple reasoning paths and takes a majority vote
2023	Gao et al., ICML	PAL: Program-aided Language Models	Generates Python code as the reasoning steps
2023	Chen et al., TMLR	Program of Thoughts (PoT)	Concurrent with PAL; offloads computation to an interpreter
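The "sample multiple paths, then vote" idea behind Self-Consistency can be illustrated with a minimal sketch. It assumes the sampled reasoning chains are already available as strings (in practice they would come from sampling an LLM at non-zero temperature) and that the final answer is the last number in each chain. `extract_answer` and `self_consistency` are hypothetical helper names chosen for this illustration, not part of any library or of the paper's code.

```python
import re
from collections import Counter
from typing import List, Optional


def extract_answer(completion: str) -> Optional[str]:
    """Return the last number mentioned in a reasoning chain, or None."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
    return numbers[-1] if numbers else None


def self_consistency(completions: List[str]) -> Optional[str]:
    """Majority vote over the final answers of several sampled chains."""
    answers = [a for a in map(extract_answer, completions) if a is not None]
    if not answers:
        return None
    # most_common(1) returns the answer with the highest count.
    return Counter(answers).most_common(1)[0][0]


# Hand-written stand-ins for three sampled chains (assumption: no real LLM).
samples = [
    "3 boxes of 4 is 12, plus 2 is 14. The answer is 14",
    "4 + 4 + 4 = 12, then 12 + 2 = 14",
    "3 * 4 = 12, minus 2 is 10. The answer is 10",
]
majority = self_consistency(samples)
print(majority)
```

Two of the three chains end in 14, so the vote discards the chain with the arithmetic slip. The answer-extraction rule is the fragile part of this sketch; a real pipeline would usually ask the model for an explicitly delimited final answer.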

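Tree / Graph of Thoughts

Year	Authors, venue	Title	Key points
2023	Yao et al., NeurIPS	Tree of Thoughts: Deliberate Problem Solving	Evaluator plus BFS/DFS search; 74% success on Game of 24
2024	Besta et al., AAAI	Graph of Thoughts	Thoughts organized as an arbitrary graph, with aggregation and feedback loops
2023	Zhou et al., ICLR	Least-to-Most Prompting	Explicit problem decomposition; 16% → 99.7% on SCAN
2023	Khot et al., ICLR	Decomposed Prompting	Modular, recursive decomposition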

Agent / ReAct

Year	Authors, venue	Title	Key points
2023	Yao et al., ICLR	ReAct: Synergizing Reasoning and Acting	Thought-Action-Observation loop; a direct precursor of LangChain
2023	Shinn et al., NeurIPS	Reflexion	Verbal self-reflection; 91% pass@1 on HumanEval
2023	Schick et al., NeurIPS	Toolformer	The LM decides for itself which APIs to call
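The Thought-Action-Observation loop listed for ReAct can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the paper's implementation: the `Action: name[argument]` text format, the `finish` action, the `react_loop` function, the scripted stand-in for the model, and the `multiply` tool are all hypothetical choices made so the example runs without an LLM.

```python
import re
from typing import Callable, Dict, Optional

# Assumed action syntax for this sketch: "Action: tool_name[argument]".
ACTION_PATTERN = re.compile(r"Action: (\w+)\[(.*)\]")


def react_loop(
    llm: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    question: str,
    max_steps: int = 5,
) -> Optional[str]:
    """Alternate model steps and tool calls until the model emits finish[...]."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript)  # expected to contain a Thought and an Action
        transcript += step + "\n"
        match = ACTION_PATTERN.search(step)
        if match is None:
            return None  # no parseable action, give up
        name, arg = match.groups()
        if name == "finish":
            return arg
        tool = tools.get(name)
        observation = tool(arg) if tool else f"unknown tool: {name}"
        # Feed the tool result back so the next step can reason over it.
        transcript += f"Observation: {observation}\n"
    return None


def scripted_llm(transcript: str) -> str:
    """Stand-in for a real model: call a tool first, then finish."""
    if "Observation:" not in transcript:
        return "Thought: I need to compute the product.\nAction: multiply[6*7]"
    last_obs = transcript.strip().splitlines()[-1].split("Observation: ", 1)[1]
    return f"Thought: I have the result.\nAction: finish[{last_obs}]"


def multiply(expr: str) -> str:
    """Toy tool: multiply two integers written as 'a*b'."""
    left, right = expr.split("*")
    return str(int(left) * int(right))


answer = react_loop(scripted_llm, {"multiply": multiply}, "What is 6 times 7?")
print(answer)
```

The design point worth noticing is that the transcript is the only state: every observation is appended to the text the model sees on its next step, which is what lets reasoning and acting interleave.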

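DSPy and automated prompt optimization

Year	Authors, venue	Title	Key points
2024	Khattab et al., ICLR Spotlight	DSPy: Compiling Declarative LM Calls	Three layers of abstraction: Signature / Module / Teleprompter
2024	Opsahl-Ong et al., EMNLP	MIPRO: Optimizing Multi-Stage LM Programs	Bayesian optimization that jointly searches over instructions and demonstrations
2023	Zhou et al., ICLR	APE: Large Language Models as Human-Level Prompt Engineers	Uses an LLM as the prompt generator; finds a trigger phrase that outperforms "Let's think step by step"
2024	Yang et al., ICLR	OPRO: Large Language Models as Optimizers	Describes the optimization problem to the LLM in natural language
2024	Guo et al., ICLR	EvoPrompt	Evolutionary algorithms with the LLM as the mutation/crossover operator
2023	Pryzant et al., EMNLP	ProTeGi / APO	Natural-language "gradients" extracted from error examples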

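TextGrad and textual gradients

Year	Authors, venue	Title	Key points
2025	Yuksekgonul et al., Nature	TextGrad: Automatic "Differentiation" via Text	PyTorch-style API; backpropagation of textual gradients
2024	Suzgun and Kalai	Meta-Prompting	"Fresh eyes" principle; a single LLM acts as the conductor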

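Theory and mechanistic explanations

Year	Authors, venue	Title	Key points
2022	Xie et al., ICLR	An Explanation of In-Context Learning as Implicit Bayesian Inference	Views ICL as a Bayesian posterior update
2023	Von Oswald et al., ICML	Transformers Learn In-Context by Gradient Descent	A linear self-attention layer can implement one step of gradient descent (GD), linking ICL to GD
2022	Olsson et al., Anthropic	In-context Learning and Induction Heads	A key circuit in mechanistic interpretability
2022	Chan et al., NeurIPS	Data Distributional Properties Drive Emergent ICL	Zipfian distributions and burstiness as preconditions for ICL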
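To make the "one step of GD" reading concrete, here is a short worked derivation for the simplest case. It is a sketch under assumptions chosen for illustration (in-context linear regression with examples $(x_i, y_i)$, squared loss, learning rate $\eta$, and weights initialized at $W_0 = 0$), not a restatement of the paper's full construction.

```latex
\begin{aligned}
L(W) &= \frac{1}{2N} \sum_{i=1}^{N} \lVert W x_i - y_i \rVert^2 \\
\Delta W &= -\eta \, \nabla_W L(W) = -\frac{\eta}{N} \sum_{i=1}^{N} (W x_i - y_i)\, x_i^\top \\
\hat{y}_q &= (W_0 + \Delta W)\, x_q = \frac{\eta}{N} \sum_{i=1}^{N} y_i \,(x_i^\top x_q)
\qquad \text{for } W_0 = 0
\end{aligned}
```

The last line has the form of unnormalized linear attention: the query is $x_q$, the keys are $x_i$, and the values are $y_i$. In this setting, the prediction after one gradient step is therefore something a single softmax-free attention layer can compute.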

Recommended surveys

Authors	Title	Venue / notes
Liu et al.	Pre-train, Prompt, and Predict: A Systematic Survey	ACM Computing Surveys, 2023
Schulhoff et al.	The Prompt Report: A Systematic Survey	2024; covers 1,500+ papers
Sahoo et al.	A Systematic Survey of Prompt Engineering Techniques	2024
Besta et al.	Demystifying Chains, Trees, and Graphs of Thoughts	2024

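Recommended course reading list (ordered by time investment)

15 minutes: a quick overview

Chapters 1–2 of the internal course survey (paradigm evolution + downstream tasks)
§3 and §4 of the Chinese translation of the DSPy paper

1 hour: understand the CoT family

Wei et al. 2022, the original CoT paper (§1–3)
Kojima et al. 2022, Zero-shot CoT
Wang et al. 2023, Self-Consistency

3 hours: get a working grasp of DSPy + TextGrad

Full Chinese translation of the DSPy paper
Full Chinese translation of the TextGrad paper
Official DSPy tutorials: https://dspy.ai/tutorials/

A weekend: theoretical foundations

Chapter 3 of the internal course survey (the Bayesian and GD explanations of ICL)
Olsson et al. 2022, Induction Heads
Min et al. 2022, Rethinking Demonstrations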

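Tools and code repositories

Project	URL	Purpose
DSPy	https://github.com/stanfordnlp/dspy	Declarative LM pipeline framework
TextGrad	https://github.com/zou-group/textgrad	Textual-gradient optimization
LangChain	https://github.com/langchain-ai/langchain	Toolchains and agent orchestration
LlamaIndex	https://github.com/run-llama/llama_index	RAG and document question answering
Promptfoo	https://github.com/promptfoo/promptfoo	A/B testing and evaluation of prompts

Reading advice: start with a survey to build a mental map, then go deep on 2–3 representative papers in your application area (reasoning / RAG / agents / optimization). Diving straight into 100 prompt papers will leave you lost: in this field, noise far outweighs signal.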

