MNIST database

From Wikipedia, the free encyclopedia
Sample images from the MNIST test dataset

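The MNIST database (Modified National Institute of Standards and Technology database [1]) is a large database of handwritten digits that is commonly used for training various image processing systems. [2] [3] It is also widely used for training and testing in the field of machine learning. [4] [5] It was created by "re-mixing" samples from NIST's original datasets. [6] The creators felt that, because NIST's training dataset was taken from United States Census Bureau employees while its testing dataset was taken from American high school students, the original split was not well suited to machine learning experiments. [7] In addition, the black-and-white images from NIST were normalized to fit into a 28x28 pixel bounding box and anti-aliased, which introduced grayscale levels.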
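As a rough illustration of how anti-aliasing introduces gray levels when a bilevel image is shrunk, the sketch below averages each block of pixels into one gray value. Block averaging, the function name `downsample_with_antialiasing` and the 0 to 255 output range are assumptions made for this example; the text does not specify the procedure that was actually used.

```python
def downsample_with_antialiasing(binary_image, factor):
    """Shrink a 0/1 image by an integer factor.

    Each factor-by-factor block is averaged, so blocks that are only
    partly inked become intermediate gray levels (0 to 255).
    """
    height = len(binary_image)
    width = len(binary_image[0])
    if height % factor or width % factor:
        raise ValueError("image dimensions must be divisible by factor")
    result = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            block_sum = sum(
                binary_image[y][x]
                for y in range(top, top + factor)
                for x in range(left, left + factor)
            )
            # The fraction of inked pixels in the block becomes the gray level.
            row.append(round(255 * block_sum / (factor * factor)))
        result.append(row)
    return result


# A 4x4 bilevel image with a diagonal edge, reduced to 2x2.
bilevel = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
gray = downsample_with_antialiasing(bilevel, 2)
print(gray)  # [[255, 64], [64, 0]]
```

The two blocks that straddle the edge come out as gray level 64 rather than pure black or white, which is the effect the paragraph describes.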

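The MNIST database contains 60,000 training images and 10,000 testing images. [8] Half of the training set and half of the test set were taken from NIST's training dataset, while the other half of each was taken from NIST's testing dataset. [9] The original creators of the database keep a list of some of the methods that have been tested on it. [7] In their original paper, they used a support-vector machine to reach an error rate of 0.8%. [10] In 2017, an extended dataset similar to MNIST, called EMNIST, was published; it contains 240,000 training images and 40,000 testing images of handwritten digits and characters. [11]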
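The composition described above can be worked out as simple arithmetic. The following is a minimal sketch; the function name `mnist_composition` and the source labels are made up for this illustration, and the sizes come straight from the paragraph.

```python
NIST_SOURCES = ("NIST training dataset", "NIST testing dataset")


def mnist_composition(train_size=60_000, test_size=10_000):
    """Return how many images each NIST source contributes to each MNIST split."""
    return {
        "train": {source: train_size // 2 for source in NIST_SOURCES},
        "test": {source: test_size // 2 for source in NIST_SOURCES},
    }


composition = mnist_composition()
for split, parts in composition.items():
    for source, count in parts.items():
        print(f"{split}: {count} images from the {source}")
```

Each NIST source therefore contributes 30,000 training images and 5,000 testing images.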

History

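The set of images in the MNIST database was created in 1998 as a combination of two of NIST's databases: Special Database 1 and Special Database 3. These consist of digits written by high school students and by employees of the United States Census Bureau, respectively. [7]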

Performance

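Some researchers have achieved "near-human performance" on the MNIST database using a committee of neural networks; in the same paper, the authors report performance twice that of humans on other recognition tasks. [12] The highest error rate listed on the original website of the database [7] is 12%, which was achieved using a simple linear classifier with no preprocessing. [10]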
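A committee combines the outputs of several independently trained networks into one decision. The sketch below averages per-class probabilities and picks the highest-scoring digit. Plain averaging is an assumption chosen for illustration and may differ from the combination rule in the cited paper; the three "members" and their scores are invented.

```python
def committee_predict(member_probabilities):
    """Average per-class probabilities across members and return (digit, averages)."""
    num_classes = len(member_probabilities[0])
    averaged = [
        sum(member[c] for member in member_probabilities) / len(member_probabilities)
        for c in range(num_classes)
    ]
    winner = max(range(num_classes), key=lambda c: averaged[c])
    return winner, averaged


def scores_for_digits(scores):
    """Build a 10-class probability list from a {digit: probability} mapping."""
    probabilities = [0.0] * 10
    for digit, probability in scores.items():
        probabilities[digit] = probability
    return probabilities


# Hypothetical outputs of three members for one ambiguous image.
members = [
    scores_for_digits({7: 0.6, 1: 0.4}),
    scores_for_digits({1: 0.7, 7: 0.3}),  # this member alone would say "1"
    scores_for_digits({7: 0.8, 1: 0.2}),
]
digit, averaged = committee_predict(members)
print(digit)  # 7
```

Here one member misreads the image as a 1, but the averaged scores still favor 7, which is the kind of error a committee is meant to absorb.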

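In 2004, researchers achieved an error rate of 0.42% on the database using a new classifier called LIRA, a neural classifier with three neuron layers based on Rosenblatt's perceptron principles. [13]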

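Some researchers have tested artificial intelligence systems on versions of the database to which random distortions have been applied. The systems in these cases are usually neural networks, and the distortions used tend to be either affine distortions or elastic distortions. [7] Sometimes these systems can be very successful; one such system achieved an error rate of 0.39% on the database. [14]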
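To make the idea of an affine distortion concrete, the sketch below rotates a 28x28 image about its centre and shifts it by a random amount, using inverse mapping with nearest-neighbour sampling. The function names, the angle range of 10 degrees and the shift range of 2 pixels are assumptions for this illustration and are not taken from the cited work; elastic distortions are not covered.

```python
import math
import random


def affine_distort(image, angle_degrees=0.0, shift_x=0.0, shift_y=0.0):
    """Rotate about the image centre, then translate (nearest-neighbour sampling)."""
    height, width = len(image), len(image[0])
    cy, cx = (height - 1) / 2, (width - 1) / 2
    cos_a = math.cos(math.radians(angle_degrees))
    sin_a = math.sin(math.radians(angle_degrees))
    output = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Inverse mapping: undo the shift, then undo the rotation,
            # to find which source pixel lands on (x, y).
            dx, dy = x - cx - shift_x, y - cy - shift_y
            src_x = round(cos_a * dx + sin_a * dy + cx)
            src_y = round(-sin_a * dx + cos_a * dy + cy)
            if 0 <= src_x < width and 0 <= src_y < height:
                output[y][x] = image[src_y][src_x]
    return output


def random_affine_distort(image, rng, max_angle=10.0, max_shift=2.0):
    """Apply a randomly drawn small rotation and shift (ranges are assumptions)."""
    return affine_distort(
        image,
        angle_degrees=rng.uniform(-max_angle, max_angle),
        shift_x=rng.uniform(-max_shift, max_shift),
        shift_y=rng.uniform(-max_shift, max_shift),
    )


# A toy "digit": one vertical stroke in column 14.
digit = [[255 if (x == 14 and 6 <= y <= 21) else 0 for x in range(28)] for y in range(28)]
shifted = affine_distort(digit, shift_x=2)  # stroke moves to column 16
distorted = random_affine_distort(digit, random.Random(0))
```

Applying a fresh random distortion each time an image is presented gives the classifier many slightly different versions of every training digit.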

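In 2011, researchers using a similar system of neural networks reported an error rate of 0.27%, improving on the previous best result. [15] In 2013, an approach based on regularization of neural networks using DropConnect was claimed to achieve an error rate of 0.21%. In 2016, the best performance of a single convolutional neural network was an error rate of 0.25%. [16] As of August 2018, the best performance of a single convolutional neural network trained on the MNIST training data without data augmentation was an error rate of 0.25%. [17] In addition, the Parallel Computing Center (Khmelnytskyi, Ukraine) obtained an ensemble of only 5 convolutional neural networks that performs on MNIST at an error rate of 0.21%. [18] [19] Some images in the testing dataset are barely readable and may prevent a test error rate of 0% from being reached. [20] In 2018, researchers from the Department of System and Information Engineering at the University of Virginia announced an error rate of 0.18% using a combination of three kinds of neural networks (fully connected, recurrent and convolutional). [21]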
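Because the MNIST test set has 10,000 images, each reported error rate corresponds to a small absolute number of misclassified digits. The sketch below does the conversion, assuming every rate was measured once on the full test set (some reported figures may instead be averages over several runs). The dictionary labels are shorthand invented for this example.

```python
TEST_SET_SIZE = 10_000  # size of the MNIST test set


def misclassified_count(error_rate_percent, test_size=TEST_SET_SIZE):
    """Convert an error rate in percent into a number of misclassified images."""
    return round(error_rate_percent / 100 * test_size)


reported_error_rates = {
    "linear classifier, no preprocessing": 12.0,
    "neural network committee (2011)": 0.27,
    "single CNN (2016)": 0.25,
    "ensemble of 5 CNNs": 0.21,
    "RMDL (2018)": 0.18,
}
for method, rate in reported_error_rates.items():
    print(f"{method}: {rate}% is about {misclassified_count(rate)} of {TEST_SET_SIZE} images")
```

Going from 0.21% to 0.18% thus means roughly three fewer errors on the whole test set, which is why a handful of barely readable test images can matter at this level.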

Classifiers

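The following table lists some of the machine learning methods that have been used on the dataset, together with their error rates, grouped by type of classifier: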

| Type | Classifier | Distortion | Preprocessing | Error rate (%) |
|---|---|---|---|---|
| Linear classifier | Pairwise linear classifier | None | Deskewing | 7.6 [10] |
| K-nearest neighbors (k-NN) | K-NN with non-linear deformation (P2DHMDM) | None | Shiftable edges | 0.52 [22] |
| Boosted stumps | Product of stumps on Haar features | None | Haar features | 0.87 [23] |
| Non-linear classifier | 40 PCA + quadratic classifier | None | None | 3.3 [10] |
| Random forest | Fast Unified Random Forests for Survival, Regression, and Classification (RF-SRC) [24] | None | Simple statistical pixel importance | 2.8 [25] |
| Support-vector machine (SVM) | Virtual SVM, deg-9 poly, 2-pixel jittered | None | Deskewing | 0.56 [26] |
| Deep neural network (DNN) | 2-layer 784-800-10 | None | None | 1.6 [27] |
| Deep neural network (DNN) | 2-layer 784-800-10 | Elastic distortions | None | 0.7 [27] |
| Deep neural network (DNN) | 6-layer 784-2500-2000-1500-1000-500-10 | Elastic distortions | None | 0.35 [28] |
| Convolutional neural network (CNN) | 6-layer 784-40-80-500-1000-2000-10 | None | Expansion of the training data | 0.31 [29] |
| Convolutional neural network (CNN) | 6-layer 784-50-100-500-1000-10-10 | None | Expansion of the training data | 0.27 [30] |
| Convolutional neural network (CNN) | 13-layer 64-128(5x)-256(3x)-512-2048-256-256-10 | None | None | 0.25 [16] |
| Convolutional neural network (CNN) | Committee of 35 CNNs, 1-20-P-40-P-150-10 | Elastic distortions | Width normalizations | 0.23 [12] |
| Convolutional neural network (CNN) | Committee of 5 CNNs, 6-layer 784-50-100-500-1000-10-10 | None | Expansion of the training data | 0.21 [18] [19] |
| Random Multimodel Deep Learning (RMDL) | 10 NN - 10 RNN - 10 CNN | None | None | 0.18 [21] |
| Convolutional neural network (CNN) | Committee of 20 CNNs with Squeeze-and-Excitation Networks [31] | None | Data augmentation | 0.17 [32] |
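Architecture strings such as 784-800-10 in the deep neural network rows list the number of units per layer, starting from the 784 input pixels (28x28). The sketch below counts the trainable parameters of a fully connected network written in that notation. It assumes one bias per unit, which the table does not state, and it does not apply to the CNN rows, where weights are shared across positions.

```python
def dense_parameter_count(architecture, include_biases=True):
    """Count the weights (and optionally biases) of a fully connected network.

    `architecture` is a string such as "784-800-10": input size,
    hidden layer sizes and output size, separated by hyphens.
    """
    sizes = [int(part) for part in architecture.split("-")]
    total = 0
    for fan_in, fan_out in zip(sizes, sizes[1:]):
        total += fan_in * fan_out  # one weight per connection
        if include_biases:
            total += fan_out  # assumed: one bias per unit
    return total


for arch in ("784-800-10", "784-2500-2000-1500-1000-500-10"):
    print(arch, dense_parameter_count(arch))
# 784-800-10 636010
# 784-2500-2000-1500-1000-500-10 11972510
```

Under these assumptions the 6-layer network has close to 12 million parameters, about 19 times as many as the 2-layer one.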

References

 

  1. "THE MNIST DATABASE of handwritten digits". Yann LeCun, Courant Institute, NYU; Corinna Cortes, Google Labs, New York; Christopher J.C. Burges, Microsoft Research, Redmond.
  2. "Support vector machines speed pattern recognition - Vision Systems Design". Vision Systems Design. Retrieved 17 August 2013.
  3. Gangaputra, Sachin. "Handwritten digit database". Retrieved 17 August 2013.
  4. Qiao, Yu (2007). "THE MNIST DATABASE of handwritten digits". Archived from the original on 11 February 2018. Retrieved 18 August 2013.
  5. Platt, John C. (1999). "Using analytic QP and sparseness to speed training of support vector machines" (PDF). Advances in Neural Information Processing Systems: 557–563. Archived from the original (PDF) on 4 March 2016. Retrieved 18 August 2013.
  6. Grother, Patrick J. "NIST Special Database 19 - Handprinted Forms and Characters Database" (PDF). National Institute of Standards and Technology.
  7. LeCun, Yann; Cortes, Corinna; Burges, Christopher J.C. "The MNIST Handwritten Digit Database". Yann LeCun's Website yann.lecun.com. Retrieved 30 April 2020.
  8. Kussul, Ernst; Baidyk, Tatiana (2004). "Improved method of handwritten digit recognition tested on MNIST database". Image and Vision Computing. 22 (12): 971–981. doi:10.1016/j.imavis.2004.03.008.
  9. Zhang, Bin; Srihari, Sargur N. (2004). "Fast k-Nearest Neighbor Classification Using Cluster-Based Trees" (PDF). IEEE Transactions on Pattern Analysis and Machine Intelligence. 26 (4): 525–528. doi:10.1109/TPAMI.2004.1265868. PMID 15382657. Archived from the original (PDF) on 25 July 2021. Retrieved 20 April 2020.
  10. LeCun, Yann; Léon Bottou; Yoshua Bengio; Patrick Haffner (1998). "Gradient-Based Learning Applied to Document Recognition" (PDF). Proceedings of the IEEE. 86 (11): 2278–2324. doi:10.1109/5.726791. Retrieved 18 August 2013.
  11. arXiv:[1].
  12. Cireșan, Dan; Ueli Meier; Jürgen Schmidhuber (2012). "Multi-column deep neural networks for image classification" (PDF). 2012 IEEE Conference on Computer Vision and Pattern Recognition. pp. 3642–3649. arXiv:1202.2745. CiteSeerX 10.1.1.300.3283. doi:10.1109/CVPR.2012.6248110. ISBN 978-1-4673-1228-8.
  13. Kussul, Ernst; Tatiana Baidyk (2004). "Improved method of handwritten digit recognition tested on MNIST database" (PDF). Image and Vision Computing. 22 (12): 971–981. doi:10.1016/j.imavis.2004.03.008. Archived from the original (PDF) on 21 September 2013. Retrieved 20 September 2013.
  14. Ranzato, Marc’Aurelio; Christopher Poultney; Sumit Chopra; Yann LeCun (2006). "Efficient Learning of Sparse Representations with an Energy-Based Model" (PDF). Advances in Neural Information Processing Systems. 19: 1137–1144. Retrieved 20 September 2013.
  15. Ciresan, Dan Claudiu; Ueli Meier; Luca Maria Gambardella; Jürgen Schmidhuber (2011). "Convolutional neural network committees for handwritten character classification" (PDF). 2011 International Conference on Document Analysis and Recognition (ICDAR). pp. 1135–1139. CiteSeerX 10.1.1.465.2138. doi:10.1109/ICDAR.2011.229. ISBN 978-1-4577-1350-7. Archived from the original (PDF) on 22 February 2016. Retrieved 20 September 2013.
  16. SimpleNet (2016). "Lets Keep it simple, Using simple architectures to outperform deeper and more complex architectures". arXiv:1608.06037. Retrieved 3 December 2020.
  17. SimpNet. "Towards Principled Design of Deep Convolutional Networks: Introducing SimpNet". Github. arXiv:1802.06205. Retrieved 3 December 2020.
  18. Romanuke, Vadim. "Parallel Computing Center (Khmelnytskyi, Ukraine) represents an ensemble of 5 convolutional neural networks which performs on MNIST at 0.21 percent error rate". Retrieved 24 November 2016.
  19. Romanuke, Vadim (2016). "Training data expansion and boosting of convolutional neural networks for reducing the MNIST dataset error rate". Research Bulletin of NTUU "Kyiv Polytechnic Institute". 6 (6): 29–34. doi:10.20535/1810-0546.2016.6.84115.
  20. MNIST classifier, GitHub. "Classify MNIST digits using Convolutional Neural Networks". Retrieved 3 August 2018.
  21. Kowsari, Kamran; Heidarysafa, Mojtaba; Brown, Donald E.; Meimandi, Kiana Jafari; Barnes, Laura E. (2018-05-03). "RMDL: Random Multimodel Deep Learning for Classification". Proceedings of the 2018 International Conference on Information System and Data Mining. arXiv:1805.01890. doi:10.1145/3206098.3206111.
  22. Keysers, Daniel; Thomas Deselaers; Christian Gollan; Hermann Ney (August 2007). "Deformation models for image recognition". IEEE Transactions on Pattern Analysis and Machine Intelligence. 29 (8): 1422–1435. CiteSeerX 10.1.1.106.3963. doi:10.1109/TPAMI.2007.1153. PMID 17568145. S2CID 2528485.
  23. Kégl, Balázs; Róbert Busa-Fekete (2009). "Boosting products of base classifiers" (PDF). Proceedings of the 26th Annual International Conference on Machine Learning: 497–504. Retrieved 27 August 2013.
  24. "RandomForestSRC: Fast Unified Random Forests for Survival, Regression, and Classification (RF-SRC)". 21 January 2020.
  25. "Mehrad Mahmoudian / MNIST with RandomForest". Archived from the original on 14 April 2021. Retrieved 3 June 2021.
  26. Decoste, Dennis; Schölkopf, Bernhard (2002). "Training Invariant Support Vector Machines". Machine Learning. 46 (1–3): 161–190. doi:10.1023/A:1012454411458. ISSN 0885-6125. OCLC 703649027. Retrieved 5 February 2021.
  27. Patrice Y. Simard; Dave Steinkraus; John C. Platt (2003). "Best Practices for Convolutional Neural Networks Applied to Visual Document Analysis". Proceedings of the Seventh International Conference on Document Analysis and Recognition. Vol. 1. Institute of Electrical and Electronics Engineers. p. 958. doi:10.1109/ICDAR.2003.1227801. ISBN 978-0-7695-1960-9. S2CID 4659176.
  28. Ciresan, Claudiu Dan; Ueli Meier; Luca Maria Gambardella; Juergen Schmidhuber (December 2010). "Deep Big Simple Neural Nets Excel on Handwritten Digit Recognition". Neural Computation. 22 (12): 3207–20. arXiv:1003.0358. doi:10.1162/NECO_a_00052. PMID 20858131. S2CID 1918673.
  29. Romanuke, Vadim. "The single convolutional neural network best performance in 18 epochs on the expanded training data at Parallel Computing Center, Khmelnytskyi, Ukraine". Retrieved 16 November 2016.
  30. Romanuke, Vadim. "Parallel Computing Center (Khmelnytskyi, Ukraine) gives a single convolutional neural network performing on MNIST at 0.27 percent error rate". Retrieved 24 November 2016.
  31. Hu, Jie; Shen, Li; Albanie, Samuel; Sun, Gang; Wu, Enhua (2019). "Squeeze-and-Excitation Networks". IEEE Transactions on Pattern Analysis and Machine Intelligence. 42 (8): 2011–2023. arXiv:1709.01507. doi:10.1109/TPAMI.2019.2913372. PMID 31034408. S2CID 140309863.
  32. "GitHub - Matuzas77/MNIST-0.17: MNIST classifier with average 0.17% error". 25 February 2020.
