Applied ML is a highly iterative process

Hyperparameters:

Train/Dev/Test Set

Traditional (around 10,000 examples): the dataset can be split like this:

Train/Dev/Test Set = 60:20:20

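Big data: when the dataset reaches the millions, a dev set of only 10,000 examples is enough to tell which classifier is better, and a test set of 1,000 examples is enough to evaluate its performance.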

Train/Dev/Test Set = 98:1:1 or 99.5:0.25:0.25
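
The split ratios in these notes can be turned into concrete set sizes with a few lines of Python. This is only a minimal sketch: the helper names `split_sizes` and `split_indices` are made up for illustration, and rounding the dev/test sizes and giving the remainder to the training set is an assumption, not something the notes specify.

```python
import random

def split_sizes(n, ratios):
    """Return (train, dev, test) sizes for n examples and a ratio triple."""
    train_part, dev_part, test_part = ratios
    total = train_part + dev_part + test_part
    dev = round(n * dev_part / total)
    test = round(n * test_part / total)
    train = n - dev - test  # assumption: remainder goes to train so sizes sum to n
    return train, dev, test

def split_indices(n, ratios, seed=0):
    """Shuffle indices 0..n-1 and cut them into train/dev/test lists."""
    train, dev, _ = split_sizes(n, ratios)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # shuffle before cutting
    return idx[:train], idx[train:train + dev], idx[train + dev:]

print(split_sizes(10_000, (60, 20, 20)))            # (6000, 2000, 2000)
print(split_sizes(1_000_000, (98, 1, 1)))           # (980000, 10000, 10000)
print(split_sizes(1_000_000, (99.5, 0.25, 0.25)))   # (995000, 2500, 2500)
```

With a million examples, even a 1% or 0.25% slice still contains thousands of examples, which is why the dev and test fractions can shrink so much.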

Rule: make sure the dev and test sets come from the same distribution; progress on the ML algorithm will then be faster.
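
One simple way to follow this rule is to carve dev and test out of a single shuffled pool, so that both are random samples of the same data. The sketch below assumes such a pool is available; `make_dev_test` and the two-source toy data are hypothetical and only illustrate the idea.

```python
import random

def make_dev_test(pool, dev_fraction=0.5, seed=0):
    """Shuffle one pool of examples, then cut it into dev and test sets."""
    items = list(pool)
    random.Random(seed).shuffle(items)  # mix all sources before cutting
    cut = int(len(items) * dev_fraction)
    return items[:cut], items[cut:]

# Hypothetical pool built from two sources; shuffling first means
# neither set ends up containing only one source.
pool = [("source_a", i) for i in range(50)] + [("source_b", i) for i in range(50)]
dev, test = make_dev_test(pool)
print(len(dev), len(test))  # 50 50
```

Cutting the pool without shuffling would put all of `source_a` in dev and all of `source_b` in test, which is exactly the mismatch the rule warns against.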

If the training data comes from the web, the data distributions may be mismatched.

Test set: its purpose is to give an unbiased estimate of performance; if that estimate is not needed, it might be OK to have no test set.