NTU-Coursera Machine Learning: Machine Learning Foundations

Course Content

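This course is designed to run 8 weeks and is organized around 4 core questions, each explored over about 2 weeks. Each recording runs about 2 hours, with each hour devoted to one topic; each topic is divided into 4 to 5 short segments, and each segment contains one or more in-class exercises. In the second week of exploring each core question … Following the above, the course is planned as follows: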

When Can Machines Learn?

Why Can Machines Learn?

How Can Machines Learn?

  • Week 5:
    Lecture 9: Linear Regression
    Lecture 10: 'Soft' Classification
  • Week 6:
    Lecture 11: Multiclass Classification
    Lecture 12: Nonlinear Transformation

How Can Machines Learn Better?

  • Week 7:
    Lecture 13: Hazard of Overfitting
    Lecture 14: Regularization
  • Week 8:
    Lecture 15: Validation
    Lecture 16: Three Learning Principles

Further Reading

Prerequisites

  • Homework 0 (basic knowledge of probability and statistics, linear algebra, and differential calculus)

Reference Books

Classic Papers

  • F. Rosenblatt. The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65(6):386-408, 1958. (Lecture 2: origin of the Perceptron)

  • W. Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58(301):13–30, 1963. (Lecture 4: Hoeffding's Inequality)

  • Y. S. Abu-Mostafa, X. Song, A. Nicholson, M. Magdon-Ismail. The bin model, 1995. (Lecture 4: origin of the bin model)

  • V. Vapnik. The nature of statistical learning theory, 2nd edition, 2000. (Lectures 5 to 8: complete mathematical derivation and extensions of the VC dimension and VC bound)

  • Y. S. Abu-Mostafa. The Vapnik-Chervonenkis dimension: information versus complexity in learning. Neural Computation, 1(3):312-317, 1989. (Lecture 7: the concept and importance of the VC Dimension)

References

  • A. Sadilek, S. Brennan, H. Kautz, and V. Silenzio. nEmesis: Which restaurants should you avoid today? First AAAI Conference on Human Computation and Crowdsourcing, 2013. (Lecture 1: an ML application in food)

  • Y. S. Abu-Mostafa. Machines that think for themselves. Scientific American, 289(7):78-81, 2012. (Lecture 1: an ML application in clothing)

  • A. Tsanas, A. Xifara. Accurate quantitative estimation of energy performance of residential buildings using statistical machine learning tools. Energy and Buildings, 49:560-567, 2012. (Lecture 1: an ML application in housing)

  • J. Stallkamp, M. Schlipsing, J. Salmen, C. Igel. Introduction to the special issue on machine learning for traffic sign recognition. IEEE Transactions on Intelligent Transportation Systems, 13(4):1481-1483, 2012. (Lecture 1: an ML application in transportation)

  • R. Bell, J. Bennett, Y. Koren, and C. Volinsky. The million dollar programming prize. IEEE Spectrum, 46(5):29-33, 2009. (Lecture 1: the Netflix Prize competition)

  • S. I. Gallant. Perceptron-based learning algorithms. IEEE Transactions on Neural Networks, 1(2):179-191, 1990. (Lecture 2: origin of the pocket algorithm; note that the actual pocket algorithm is more complex than the version introduced in the course)

  • R. Xu, D. Wunsch II. Survey of clustering algorithms. IEEE Transactions on Neural Networks, 16(3):645-678, 2005. (Lecture 3: Clustering)

  • X. Zhu. Semi-supervised learning literature survey. University of Wisconsin Madison, 2008. (Lecture 3: Semi-supervised)

  • Z. Ghahramani. Unsupervised learning. In Advanced Lectures in Machine Learning (MLSS ’03), pages 72–112, 2004. (Lecture 3: Unsupervised)

  • L. Kaelbling, M. Littman, A. Moore. Reinforcement learning: A survey. Journal of Artificial Intelligence Research, 4:237-285. (Lecture 3: Reinforcement)

  • A. Blum. On-Line algorithms in machine learning. Carnegie Mellon University, 1998. (Lecture 3: Online)

  • B. Settles. Active learning literature survey. University of Wisconsin Madison, 2010. (Lecture 3: Active)

  • D. Wolpert. The lack of a priori distinctions between learning algorithms. Neural Computation, 8(7):1341-1390. (Lecture 4: the formal version of No Free Lunch)

  • T. M. Cover. Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition. IEEE Transactions on Electronic Computers, 14(3):326–334, 1965. (Lectures 5 to 6: Growth Function)

  • B. Zadrozny, J. Langford, N. Abe. Cost sensitive learning by cost-proportionate example weighting. IEEE International Conference on Data Mining, 2003. (Lecture 8: Weighted Classification)

  • G. Seber, A. Lee. Linear Regression Analysis, 2nd Edition, Wiley, 2003. (Lecture 9: Linear Regression analyzed from a statistical perspective; Lectures 12 to 13: Linear Regression after a Polynomial Transform)

  • D. C. Hoaglin, R. E. Welsch. The hat matrix in regression and ANOVA. American Statistician, 32:17–22, 1978. (Lecture 9: the Hat Matrix in Linear Regression)

  • D. W. Hosmer, Jr., S. Lemeshow, R. X. Sturdivant. Applied Logistic Regression, 3rd Edition, Wiley, 2013. (Lecture 10: Logistic Regression analyzed from a statistical perspective)

  • T. Zhang. Solving large scale linear prediction problems using stochastic gradient descent algorithms. International Conference on Machine Learning. (Lecture 11: theoretical analysis of Stochastic Gradient Descent applied to linear models)

  • R. Rifkin, A. Klautau. In Defense of One-Vs-All Classification. Journal of Machine Learning Research, 5:101-141, 2004. (Lecture 11: One-versus-all)

  • J. Fürnkranz. Round Robin Classification. Journal of Machine Learning Research, 2:721-747, 2002. (Lecture 11: One-versus-one)

  • L. Li, H.-T. Lin. Optimizing 0/1 loss for perceptrons by random coordinate descent. In Proceedings of the 2007 International Joint Conference on Neural Networks (IJCNN ’07), pages 749–754, 2007. (Lecture 11: a Perceptron Algorithm derived from an optimization perspective)

  • G.-X. Yuan, C.-H. Ho, C.-J. Lin. Recent advances of large-scale linear classification. Proceedings of IEEE, 2012. (Lecture 11: more advanced linear classification methods)

  • Y.-W. Chang, C.-J. Hsieh, K.-W. Chang, M. Ringgaard, C.-J. Lin. Training and testing low-degree polynomial data mappings via linear SVM. Journal of Machine Learning Research, 11:1471-1490, 2010. (Lecture 12: a method combining a polynomial transform with a linear classification model)

  • M. Magdon-Ismail, A. Nicholson, Y. S. Abu-Mostafa. Learning in the presence of noise. In Intelligent Signal Processing. IEEE Press, 2001. (Lecture 13: Noise and Learning)

  • A. Neumaier. Solving ill-conditioned and singular linear systems: A tutorial on regularization. SIAM Review, 40:636-666, 1998. (Lecture 14: Regularization)

  • T. Poggio, S. Smale. The mathematics of learning: Dealing with data. Notices of the American Mathematical Society, 50(5):537–544, 2003. (Lecture 14: Regularization)

  • P. Burman. A comparative study of ordinary cross-validation, v-fold cross-validation and the repeated learning-testing methods. Biometrika, 76(3):503–514, 1989. (Lecture 15: Cross Validation)

  • R. Kohavi. A study of cross-validation and bootstrap for accuracy estimation and model selection. In Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI ’95), volume 2, pages 1137–1143, 1995. (Lecture 15: Cross Validation)

  • A. Blumer, A. Ehrenfeucht, D. Haussler, and M. K. Warmuth. Occam’s razor. Information Processing Letters, 24(6):377–380, 1987. (Lecture 16: Occam’s Razor)



