Beginners Workshop Machine Learning
From:
2018-09-03
To:
2018-09-14
Exam:
2018-09-24
Organisation:
Seulki Yeom: yeom@tuberlin.de, Philipp Seegerer: philipp.seegerer@tuberlin.de, David Lassner: lassner@tuberlin.de
Language
English
Enrollment / Limited number of participants
If you intend to participate, please send an email to lassner@tuberlin.de with the title "Beginners Workshop Enrollment" and the following text:
 Name: Your name
 Matr.Nr: Your student ID (Matrikelnummer)
 Degree: The degree you are enrolled in and want to use this course for
 TU student: Yes/No (Are you enrolled as a regular student at TU Berlin?)
 Other student: If you are not a regular student, please state your status
 ML1: Yes/No (Did you take the course Machine Learning 1 at TU Berlin?)
 Other ML course: If you did not take ML1 at TU Berlin, please state whether you took an equivalent course
Participation spots are mostly assigned at random. Please keep in mind that auditing students and Nebenhörer can only participate if fewer regular TU students register than the maximum number of participants (http://www.studsek.tuberlin.de/menue/studierendenverwaltung/gast_und_nebenhoererschaft/parameter/en/).
Preliminary workshop lecture topics:
1. Clustering, mixtures, density estimation
 Density estimation: kernel density estimation, Parzen windows, parametric density estimation / maximum likelihood
 K-means clustering
 Gaussian mixture models, EM algorithm
 Curse of dimensionality
2. Manifold learning
 Locally linear embedding (LLE)
 Embeddings (RBF)
 Multidimensional scaling
 t-SNE
3. Bayesian Methods
 What is learning?
 Frequentist vs Bayes
 Bayes rule
 Naive Bayes
 Bayesian linear regression
 Bayesian/Akaike information criterion, Occam's razor
4. Classical and linear methods
 Matrix factorization
 Logistic regression
 Regularization, Lasso, Ridge regression
 Fisher's linear discriminant
 Gradient descent (placement still open; possibly under Neural Networks)
 Decision boundaries
5. Support Vector Machine
 Linear SVM
 Linear separability, margins
 Duality in optimization, KKT conditions
 SVM for regression
 Multiclass SVM
 Applications
6. Kernels
 Feature transformations
 Kernel trick
 Cross references to previous methods: ridge regression, PCA, SVM
 Nadaraya-Watson kernel regression
7. Neural Networks
 Rosenblatt's Perceptron
 Multilayer perceptron
 Motivation with logistic regression
 Backpropagation, (stochastic, mini-batch) gradient descent
 Convolutional NNs
 Well-known ConvNets (ImageNet winners): AlexNet, GoogLeNet, ResNet
 Recurrent NNs
 Applications
 Practical recommendations for training DNNs (following e.g. Bengio's 2012 paper), hyperparameters