Applied ML is a highly iterative process
Hyperparameters:
- layers
- hidden units
- learning rates
- activation functions
Train/Dev/Test Set
Traditional (around 10,000 examples): the dataset can be split like this:
Train/Dev/Test Set = 60:20:20
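Big data: when the dataset reaches the million-example scale, a dev set of only 10,000 examples is enough to tell which classifier is better, and a test set of 1,000 examples is enough to evaluate its performance.
Example splits: 98/1/1 or 99.5/0.25/0.25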
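The split ratios above can be turned into concrete set sizes with a small helper. This is only a sketch: the function name `split_indices`, the fixed seed, and the choice to split shuffled indices rather than the data itself are assumptions made for illustration, not part of the original notes.

```python
import random

def split_indices(n, ratios=(0.6, 0.2, 0.2), seed=0):
    """Shuffle indices 0..n-1 and cut them into train/dev/test by ratio.

    Hypothetical helper: `ratios` is (train, dev, test) and must sum to 1.
    """
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # seeded so the split is reproducible
    n_dev = round(n * ratios[1])
    n_test = round(n * ratios[2])
    n_train = n - n_dev - n_test      # train takes whatever is left
    train = idx[:n_train]
    dev = idx[n_train:n_train + n_dev]
    test = idx[n_train + n_dev:]
    return train, dev, test

# Traditional split on about 10,000 examples
print([len(s) for s in split_indices(10_000, (0.6, 0.2, 0.2))])
# -> [6000, 2000, 2000]

# Big-data split on 1,000,000 examples
print([len(s) for s in split_indices(1_000_000, (0.98, 0.01, 0.01))])
# -> [980000, 10000, 10000]
```

With a million examples, 1% is already 10,000 examples, which is why the dev and test fractions can shrink so much.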
On train/test set distribution mismatch
Rule: make sure the dev and test sets come from the same distribution; progress on the ML system will be faster.
If the training data comes from the web, there will be a data distribution mismatch problem.
Test set: its purpose is to give an unbiased estimate of performance. If that estimate is not needed, it might be OK to have no test set.
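One way to follow the same-distribution rule for dev and test is to draw both from a single shuffled pool of target-distribution data, while the web data goes only into training. The sketch below assumes two hypothetical lists, `web_examples` and `target_examples`; the function name `make_splits` and the 50/50 cut are illustrative choices.

```python
import random

def make_splits(web_examples, target_examples, seed=0):
    """Train on web data; draw dev and test from the same target pool.

    Hypothetical helper: because dev and test are two halves of one
    shuffled pool, they share a distribution by construction.
    """
    rng = random.Random(seed)
    target = list(target_examples)
    rng.shuffle(target)
    half = len(target) // 2
    dev, test = target[:half], target[half:]
    train = list(web_examples)
    rng.shuffle(train)
    return train, dev, test

# Toy data: each example is tagged with its source
web = [("web", i) for i in range(1000)]
app = [("app", i) for i in range(200)]
train, dev, test = make_splits(web, app)
print(len(train), len(dev), len(test))
```

The training set here still differs in distribution from dev/test; the rule in the notes only requires that dev and test match each other.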