Training, validation, and test sets

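A common task in machine learning is to learn from data and build a model (a process called training) that can then make predictions on data encountered in the future.^[1] The data used to build the final model usually comes from several datasets; three are typically used at different stages of model building: the training set, the validation set, and the test set.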

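The model is first fit on the training set (also called the training dataset).^[2] In supervised learning, the training set is the collection of examples used to fit the parameters (for example, the weights of the connections between neurons in an artificial neural network).^[3] In practice, the training set usually consists of pairs of an input vector (or scalar) and an output vector (or scalar), where the output is called the target or label. During training, the current model produces a prediction for each example in the training set and compares it with the target. Based on the result of that comparison, the learning algorithm updates the model's parameters. Model fitting may include both feature selection and parameter estimation.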
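The predict, compare, update cycle described above can be sketched with a very small example. This is only an illustration under stated assumptions: the model is a one-variable linear function, the learning algorithm is per-sample gradient descent on squared error, and the data, the function name `fit_linear`, and the learning rate are all hypothetical choices that do not come from the cited sources.

```python
# Hypothetical toy training set: (input, target) pairs drawn from y = 2x + 1.
train_set = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]


def fit_linear(pairs, learning_rate=0.05, epochs=2000):
    """Fit y ~ w * x + b by per-sample gradient descent on squared error."""
    w, b = 0.0, 0.0  # parameters to be learned
    for _ in range(epochs):
        for x, target in pairs:
            prediction = w * x + b       # the current model predicts
            error = prediction - target  # prediction is compared with the target
            # The parameters are updated in the direction that reduces
            # the squared error 0.5 * error ** 2 for this example.
            w -= learning_rate * error * x
            b -= learning_rate * error
    return w, b


w, b = fit_linear(train_set)
print(f"w = {w:.4f}, b = {b:.4f}")
```

Because the toy targets are noise-free, the fitted parameters approach the values used to generate the data; with real data the fit would only be approximate.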

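Next, the fitted model is used to make predictions on a second dataset, the validation set (also called the validation dataset).^[2] While the model's hyperparameters (for example, the number of neurons in the hidden layers of a neural network^[3]) are being tuned, the validation set provides an unbiased evaluation of the model fit on the training set.^[4] The validation set can also be used for regularization by early stopping: training is halted when the error on the validation set rises, since this is a sign of overfitting to the training set.^[5] In practice, however, this simple procedure sometimes fails, because the validation error fluctuates during training. This has led to a number of rules that give a more reliable signal of when overfitting has begun.^[5]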
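To illustrate why a fluctuating validation error makes the naive rule unreliable, here is a minimal sketch of one common heuristic, a patience-based stopping rule. It is not necessarily one of the rules proposed in the cited reference; the function name `early_stopping_epoch`, the `patience` parameter, and the error values are hypothetical.

```python
def early_stopping_epoch(val_errors, patience=2):
    """Return (stop_epoch, best_epoch) for a patience-based stopping rule.

    Training stops once the validation error has failed to improve on
    its best value for `patience` consecutive epochs.
    """
    best_error = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, error in enumerate(val_errors):
        if error < best_error:
            best_error = error
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    # The rule never triggered: training ran through every epoch.
    return len(val_errors) - 1, best_epoch


# Hypothetical per-epoch validation errors with a temporary bump at epoch 2.
val_errors = [0.90, 0.70, 0.72, 0.60, 0.55, 0.57, 0.58, 0.61]

naive = early_stopping_epoch(val_errors, patience=1)
patient = early_stopping_epoch(val_errors, patience=2)
print("patience=1:", naive)
print("patience=2:", patient)
```

With `patience=1` (stop at the first rise) the temporary bump ends training before the lower errors at later epochs are reached, while `patience=2` rides out the bump. In either case the model that would be kept is the one from `best_epoch`, not the one from the epoch at which training stopped.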

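Finally, the test set (also called the test dataset) is used to provide an unbiased evaluation of the final model.^[4] If the test set is never used during training (for example, it is not used in cross-validation), it is also called a holdout set.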
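One simple way to obtain the three datasets is to shuffle the available examples once and cut them into disjoint parts. The sketch below assumes a 60/20/20 split and a fixed random seed; the proportions, the function name `three_way_split`, and its parameters are illustrative assumptions rather than a recommendation from the cited sources.

```python
import random


def three_way_split(samples, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle `samples` and split them into disjoint train/val/test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)
    test = shuffled[:n_test]               # held out until the very end
    val = shuffled[n_test:n_test + n_val]  # used for tuning hyperparameters
    train = shuffled[n_test + n_val:]      # used for fitting parameters
    return train, val, test


# Integers stand in for real examples here.
train, val, test = three_way_split(range(100))
print(len(train), len(val), len(test))
```

Keeping the test portion untouched until the final evaluation is what makes it a holdout set in the sense described above.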

References

[1] Ron Kohavi; Foster Provost. Glossary of terms. Machine Learning. 1998, 30: 271–274 [retrieved 2019-12-10]. (Archived from the original on 2019-11-11.)
[2] James, Gareth. An Introduction to Statistical Learning: with Applications in R. Springer. 2013: 176 [retrieved 2019-12-10]. ISBN 978-1461471370. (Archived from the original on 2019-06-23.)
[3] Ripley, Brian. Pattern Recognition and Neural Networks. Cambridge University Press. 1996: 354. ISBN 978-0521717700.
[4] Brownlee, Jason. What is the Difference Between Test and Validation Datasets?. 2017-07-13 [retrieved 2017-10-12]. (Archived from the original on 2019-12-10.)
[5] Prechelt, Lutz; Geneviève B. Orr. Early Stopping — But When?. In: Grégoire Montavon; Klaus-Robert Müller (eds.). Neural Networks: Tricks of the Trade. Lecture Notes in Computer Science. Springer Berlin Heidelberg. 2012-01-01: 53–67. ISBN 978-3-642-35289-8. doi:10.1007/978-3-642-35289-8_5.
