Chinese-English Glossary of Deep Learning Terms

This article was last updated on 2024-11-13; its content may be out of date.

The deep learning terms below are taken mainly from the glossary of the book Deep Learning.

Chinese	English	Abbreviation
深度学习	deep learning
机器学习	machine learning
机器学习模型	machine learning model
逻辑回归	logistic regression
回归	regression
人工智能	artificial intelligence
朴素贝叶斯	naive Bayes
表示	representation
表示学习	representation learning
自编码器	autoencoder
编码器	encoder
解码器	decoder
多层感知机	multilayer perceptron
人工神经网络	artificial neural network
神经网络	neural network
随机梯度下降	stochastic gradient descent	SGD
线性模型	linear model
线性回归	linear regression
整流线性单元	rectified linear unit	ReLU
分布式表示	distributed representation
非分布式表示	nondistributed representation
非分布式	nondistributed
隐藏单元	hidden unit
长短期记忆	long short-term memory	LSTM
深度信念网络	deep belief network	DBN
循环神经网络	recurrent neural network	RNN
循环	recurrence
强化学习	reinforcement learning
推断	inference
上溢	overflow
下溢	underflow
softmax 函数	softmax function
softmax	softmax
欠估计	underestimation
过估计	overestimation
病态条件	poor conditioning
目标函数	objective function
目标	objective
准则	criterion
代价函数	cost function
代价	cost
损失函数	loss function
PR 曲线	PR curve
F 值	F-score
损失	loss
误差函数	error function
梯度下降	gradient descent
导数	derivative
临界点	critical point
驻点	stationary point
局部极小点	local minimum
极小点	minimum
局部极小值	local minima
极小值	minima
全局极小值	global minima
局部极大值	local maxima
极大值	maxima
局部极大点	local maximum
鞍点	saddle point
全局最小点	global minimum
偏导数	partial derivative
梯度	gradient
样本	example
二阶导数	second derivative
曲率	curvature
凸优化	Convex optimization
非凸	nonconvex
数值优化	numerical optimization
约束优化	constrained optimization
可行	feasible
等式约束	equality constraint
不等式约束	inequality constraint
正则化	regularization
正则化项	regularizer
正则化	regularize
泛化	generalization
泛化	generalize
欠拟合	underfitting
过拟合	overfitting
偏差	bias
方差	variance
集成	ensemble
估计	estimator
权重衰减	weight decay
协方差	covariance
稀疏	sparse
特征选择	feature selection
特征提取器	feature extractor
最大后验	Maximum A Posteriori	MAP
池化	pooling
Dropout	Dropout
蒙特卡罗	Monte Carlo
提前终止	early stopping
卷积神经网络	convolutional neural network	CNN
小批量	minibatch
重要采样	Importance Sampling
变分自编码器	variational autoencoder	VAE
计算机视觉	Computer Vision	CV
语音识别	Speech Recognition
自然语言处理	Natural Language Processing	NLP
有向模型	Directed Model
原始采样	Ancestral Sampling
随机矩阵	Stochastic Matrix
平稳分布	Stationary Distribution
均衡分布	Equilibrium Distribution
索引	index of matrix
磨合	Burning-in
混合时间	Mixing Time
混合	Mixing
Gibbs 采样	Gibbs Sampling
吉布斯步数	Gibbs steps
Bagging	bootstrap aggregating
掩码	mask
批标准化	batch normalization
参数共享	parameter sharing
KL 散度	KL divergence
温度	temperature
临界温度	critical temperatures
并行回火	parallel tempering
自动语音识别	Automatic Speech Recognition	ASR
级联	coalesced
数据并行	data parallelism
模型并行	model parallelism
异步随机梯度下降	Asynchronous Stochastic Gradient Descent
参数服务器	parameter server
模型压缩	model compression
动态结构	dynamic structure
隐马尔可夫模型	Hidden Markov Model	HMM
高斯混合模型	Gaussian Mixture Model	GMM
转录	transcribe
主成分分析	principal components analysis	PCA
因子分析	factor analysis
独立成分分析	independent component analysis	ICA
稀疏编码	sparse coding
定点运算	fixed-point arithmetic
浮点运算	floating-point arithmetic
生成模型	generative model
生成式建模	generative modeling
数据集增强	dataset augmentation
白化	whitening
深度神经网络	deep neural network	DNN
端到端的	end-to-end
图模型	graphical model
有向图模型	directed graphical model
依赖	dependency
贝叶斯网络	Bayesian network
模型平均	model averaging
声明	statement
量子力学	quantum mechanics
亚原子	subatomic
逼真度	fidelity
信任度	degree of belief
频率派概率	frequentist probability
贝叶斯概率	Bayesian probability
似然	likelihood
随机变量	random variable
概率分布	probability distribution
联合概率分布	joint probability distribution
归一化的	normalized
均匀分布	uniform distribution
概率密度函数	probability density function	PDF
累积函数	cumulative function
边缘概率分布	marginal probability distribution
求和法则	sum rule
条件概率	conditional probability
干预查询	intervention query
因果模型	causal modeling
因果因子	causal factor
链式法则	chain rule
乘法法则	product rule
相互独立的	independent
条件独立的	conditionally independent
期望	expectation
期望值	expected value
特征	feature
准确率	accuracy
错误率	error rate
训练集	training set
解释因子	explanatory factor
潜在	underlying
潜在成因	underlying cause
测试集	test set
性能度量	performance measures
经验	experience
无监督	unsupervised
有监督	supervised
半监督	semi-supervised
监督学习	supervised learning
无监督学习	unsupervised learning
数据集	dataset
数据点	data point
标签	label
标注	labeled
未标注	unlabeled
目标	target
设计矩阵	design matrix
参数	parameter
权重	weight
均方误差	mean squared error	MSE
正规方程	normal equation
训练误差	training error
泛化误差	generalization error
测试误差	test error
假设空间	hypothesis space
容量	capacity
表示容量	representational capacity
有效容量	effective capacity
线性阈值单元	linear threshold units
非参数	non-parametric
最近邻回归	nearest neighbor regression
最近邻	nearest neighbor
验证集	validation set
基准	benchmark
基准	baseline
点估计	point estimator
估计量	estimator
统计量	statistics
无偏	unbiased
有偏	biased
异步	asynchronous
渐近无偏	asymptotically unbiased
标准误差	standard error
一致性	consistency
统计效率	statistical efficiency
有参情况	parametric case
贝叶斯统计	Bayesian statistics
先验概率分布	prior probability distribution
最大后验	maximum a posteriori
最大似然估计	maximum likelihood estimation
最大似然	maximum likelihood
核技巧	kernel trick
核函数	kernel function
高斯核	Gaussian kernel
核机器	kernel machine
核方法	kernel method
支持向量	support vector
支持向量机	support vector machine	SVM
音素	phoneme
声学	acoustic
语音	phonetic
专家混合体	mixture of experts
高斯混合体	Gaussian mixtures
选通器	gater
专家网络	expert network
注意力机制	attention mechanism
对抗样本	adversarial example
对抗	adversarial
对抗训练	adversarial training
切面距离	tangent distance
正切传播	tangent prop
正切传播	tangent propagation
双反向传播	double backprop
期望最大化	expectation maximization	EM
均值场	mean-field
变分推断	variational inference
二值稀疏编码	binary sparse coding
前馈网络	feedforward network
转移	transition
重构	reconstruction
生成随机网络	generative stochastic network
得分匹配	score matching
因子	factorial
分解的	factorized
均匀场	mean-field
概率 PCA	probabilistic PCA
随机梯度上升	Stochastic Gradient Ascent
团	clique
Dirac 分布	Dirac distribution
不动点方程	fixed point equation
变分法	calculus of variations
信念网络	belief network
马尔可夫随机场	Markov random field
马尔可夫网络	Markov network
对数线性模型	log-linear model
自由能	free energy
局部条件概率分布	local conditional probability distribution
条件概率分布	conditional probability distribution
玻尔兹曼分布	Boltzmann distribution
吉布斯分布	Gibbs distribution
能量函数	energy function
标准差	standard deviation
相关系数	correlation
标准正态分布	standard normal distribution
协方差矩阵	covariance matrix
Bernoulli 分布	Bernoulli distribution
Bernoulli 输出分布	Bernoulli output distribution
Multinoulli 分布	multinoulli distribution
Multinoulli 输出分布	multinoulli output distribution
范畴分布	categorical distribution
多项式分布	multinomial distribution
正态分布	normal distribution
高斯分布	Gaussian distribution
精度	precision
多维正态分布	multivariate normal distribution
精度矩阵	precision matrix
各向同性	isotropic
指数分布	exponential distribution
指示函数	indicator function
广义函数	generalized function
经验分布	empirical distribution
经验频率	empirical frequency
混合分布	mixture distribution
潜变量	latent variable
隐藏变量	hidden variable
先验概率	prior probability
后验概率	posterior probability
万能近似器	universal approximator
饱和	saturate
分对数	logit
正部函数	positive part function
负部函数	negative part function
贝叶斯规则	Bayes' rule
测度论	measure theory
零测度	measure zero
Jacobian 矩阵	Jacobian matrix
自信息	self-information
奈特	nats
比特	bit
香农	shannons
香农熵	Shannon entropy
微分熵	differential entropy
微分方程	differential equation
KL 散度	Kullback-Leibler (KL) divergence
交叉熵	cross-entropy
熵	entropy
分解	factorization
结构化概率模型	structured probabilistic model
回退	back-off
有向	directed
无向	undirected
无向图模型	undirected graphical model
成比例	proportional
描述	description
决策树	decision tree
因子图	factor graph
结构学习	structure learning
环状信念传播	loopy belief propagation
卷积网络	convolutional network
卷积网络	convolutional net
主对角线	main diagonal
转置	transpose
广播	broadcasting
矩阵乘积	matrix product
AdaGrad	AdaGrad
逐元素乘积	element-wise product
Hadamard 乘积	Hadamard product
团势能	clique potential
因子	factor
未归一化概率函数	unnormalized probability function
循环网络	recurrent network
梯度消失与爆炸问题	vanishing and exploding gradient problem
梯度消失	vanishing gradient
梯度爆炸	exploding gradient
计算图	computational graph
展开	unfolding
求逆	invert
时间步	time step
维数灾难	curse of dimensionality
平滑先验	smoothness prior
局部不变性先验	local constancy prior
局部核	local kernel
流形	manifold
流形正切分类器	manifold tangent classifier
流形学习	manifold learning
流形假设	manifold hypothesis
环	loop
弦	chord
弦图	chordal graph
三角形化图	triangulated graph
三角形化	triangulate
风险	risk
经验风险	empirical risk
经验风险最小化	empirical risk minimization
代理损失函数	surrogate loss function
批量	batch
确定性	deterministic
随机	stochastic
在线	online
流	stream
梯度截断	gradient clipping
幂方法	power method
前向传播	forward propagation
反向传播	backward propagation
展开图	unfolded graph
深度前馈网络	deep feedforward network
前馈神经网络	feedforward neural network
前向	feedforward
反馈	feedback
网络	network
深度	depth
输出层	output layer
隐藏层	hidden layer
宽度	width
单元	unit
激活函数	activation function
反向传播	back propagation	backprop
泛函	functional
平均绝对误差	mean absolute error
赢者通吃	winner-take-all
异方差	heteroscedastic
混合密度网络	mixture density network
梯度截断	clip gradient
绝对值整流	absolute value rectification
渗漏整流线性单元	Leaky ReLU
参数化整流线性单元	parametric ReLU	PReLU
maxout 单元	maxout unit
硬双曲正切函数	hard tanh
架构	architecture
操作	operation
符号	symbol
数值	numeric value
动态规划	dynamic programming
自动微分	automatic differentiation
并行分布式处理	Parallel Distributed Processing
稀疏激活	sparse activation
衰减	damping
学成	learned
信息传输	message passing
泛函导数	functional derivative
变分导数	variational derivative
额外误差	excess error
动量	momentum
混沌	chaos
稀疏初始化	sparse initialization
共轭方向	conjugate directions
共轭	conjugate
条件独立	conditionally independent
集成学习	ensemble learning
独立子空间分析	independent subspace analysis
慢特征分析	slow feature analysis	SFA
慢性原则	slowness principle
整流线性	rectified linear
整流网络	rectifier network
坐标下降	coordinate descent
坐标上升	coordinate ascent
预训练	pretraining
无监督预训练	unsupervised pretraining
逐层的	layer-wise
贪心算法	greedy algorithm
贪心	greedy
精调	fine-tuning
课程学习	curriculum learning
召回率	recall
覆盖	coverage
超参数优化	hyperparameter optimization
超参数	hyperparameter
网格搜索	grid search
有限差分	finite difference
中心差分	centered difference
储层计算	reservoir computing
谱半径	spectral radius
收缩	contractive
长期依赖	long-term dependency
跳跃连接	skip connection
门控 RNN	gated RNN
门控	gated
卷积	convolution
输入	input
输入分布	input distribution
输出	output
特征映射	feature map
翻转	flip
稀疏交互	sparse interactions
等变表示	equivariant representations
稀疏连接	sparse connectivity
稀疏权重	sparse weights
接受域	receptive field
绑定的权重	tied weights
等变	equivariance
探测级	detector stage
符号表示	symbolic representation
池化函数	pooling function
最大池化	max pooling
池	pool
不变	invariant
步幅	stride
降采样	downsampling
全	full
非共享卷积	unshared convolution
平铺卷积	tiled convolution
循环卷积网络	recurrent convolutional network
傅立叶变换	Fourier transform
可分离的	separable
初级视觉皮层	primary visual cortex
简单细胞	simple cell
复杂细胞	complex cell
象限对	quadrature pair
门控循环单元	gated recurrent unit	GRU
门控循环网络	gated recurrent net
遗忘门	forget gate
截断梯度	clipping the gradient
记忆网络	memory network
神经网络图灵机	neural Turing machine	NTM
精调	fine-tune
共因	common cause
编码	code
再循环	recirculation
欠完备	undercomplete
完全图	complete graph
欠定的	underdetermined
过完备	overcomplete
去噪	denoising
去噪	denoise
重构误差	reconstruction error
梯度场	gradient field
得分	score
切平面	tangent plane
最近邻图	nearest neighbor graph
嵌入	embedding
近似推断	approximate inference
信息检索	information retrieval
语义哈希	semantic hashing
降维	dimensionality reduction
对比散度	contrastive divergence
语言模型	language model
标记	token
一元语法	unigram
二元语法	bigram
三元语法	trigram
平滑	smoothing
级联	cascade
模型	model
层	layer
半监督学习	semi-supervised learning
监督模型	supervised model
词嵌入	word embedding
one-hot	one-hot
监督预训练	supervised pretraining
迁移学习	transfer learning
学习器	learner
多任务学习	multitask learning
领域自适应	domain adaptation
一次学习	one-shot learning
零次学习	zero-shot learning
零数据学习	zero-data learning
多模态学习	multimodal learning
生成式对抗网络	generative adversarial network	GAN
前馈分类器	feedforward classifier
线性分类器	linear classifier
正相	positive phase
负相	negative phase
随机最大似然	stochastic maximum likelihood
噪声对比估计	noise-contrastive estimation	NCE
噪声分布	noise distribution
噪声	noise
独立同分布	independent and identically distributed
专用集成电路	application-specific integrated circuit	ASIC
现场可编程门阵列	field-programmable gate array	FPGA
标量	scalar
向量	vector
矩阵	matrix
张量	tensor
点积	dot product
内积	inner product
方阵	square matrix
奇异的	singular
范数	norm
三角不等式	triangle inequality
欧几里得范数	Euclidean norm
最大范数	max norm
对角矩阵	diagonal matrix
对称	symmetric
单位向量	unit vector
单位范数	unit norm
正交	orthogonal
正交矩阵	orthogonal matrix
标准正交	orthonormal
特征分解	eigendecomposition
特征向量	eigenvector
特征值	eigenvalue
分解	decompose
正定	positive definite
负定	negative definite
半负定	negative semidefinite
半正定	positive semidefinite
奇异值分解	singular value decomposition	SVD
奇异值	singular value
奇异向量	singular vector
单位矩阵	identity matrix
矩阵逆	matrix inversion
原点	origin
线性组合	linear combination
列空间	column space
值域	range
线性相关	linear dependency
线性无关	linearly independent
列	column
行	row
同分布的	identically distributed
机器翻译	machine translation
推荐系统	recommender system
词袋	bag of words
协同过滤	collaborative filtering
探索	exploration
策略	policy
关系	relation
属性	attribute
词义消歧	word-sense disambiguation
误差度量	error metric
性能度量	performance metrics
共轭梯度	conjugate gradient
在线学习	online learning
逐层预训练	layer-wise pretraining
自回归网络	auto-regressive network
生成器网络	generator network
判别器网络	discriminator network
矩	moment
可见层	visible layer
无限	infinite
容差	tolerance
学习率	learning rate
轮数	epochs
轮	epoch
对数尺度	logarithmic scale
随机搜索	random search
分段	piecewise
汉明距离	Hamming distance
可见变量	visible variable
精确推断	exact inference
潜层	latent layer
知识图谱	knowledge graph