Machine learning

Marta Vomlelová

Slides	Slides. The last slide, the list of topics, is also the list of topics for the exam.
Zoom recordings 2001	Zoom Recordings.
Consultations	ask by e-mail Marta.Vomlelova[at]mff.cuni.cz
Contact	E-mail Marta.Vomlelova[at]mff.cuni.cz
Key book	[ESLII] The Elements of Statistical Learning.
Czech version with English slides is on moodle.

1.	Introduction, linear regression, k-NN, expected prediction error, Curse of dimensionality	Section 2 in ESLII
2.	Linear regression, Ridge, Lasso regression, Undirected Graphical Models (first part)	Sections 17.1-17.3 and 17.4.4 in ESLII (later editions, such as the online one). Further sources: S. Hojsgaard, D. Edwards, S. Lauritzen: Graphical Models with R, Springer 2012.
3.	Undirected Graphical Models (second part), Gaussian Processes	C. E. Rasmussen & C. K. I. Williams: Gaussian Processes for Machine Learning, the MIT Press, 2006; Peter I. Frazier: A Tutorial on Bayesian Optimization, 2018
4.	Splines (Basis Expansion and Regularization)	Sections 5 and 6 ESLII
5.	Linear Models for Classification	Section 4 ESLII
6.	Model Assessment and Selection	Section 7 ESLII
7.	Decision Trees and Related Methods (MARS)	Section 9 ESLII
8.	Model Inference and Averaging	Sections 8,10,15,16 ESLII
9.	Clustering	Selected parts of Chapter 14 ESLII, mean shift clustering and Silhouette from scikit-learn
10.	Bayesian Learning, Other uses of the EM algorithm
11.	Association rules, Frequent itemsets (Apriori algorithm)
12.	Support Vector Machines (+ Independent Component Analysis)
13.	Inductive Logic Programming

k-NN (k-nearest neighbors, instance-based learning),
linear regression,
undirected graphical models,
Gaussian processes and Bayesian optimization,
logistic regression, LDA (linear discriminant analysis),
optimal separating hyperplane, SVM, kernel functions,
decision trees with pruning, entropy, information gain,
model assessment (overfitting, test/validation data, cross-validation, leave-one-out, bootstrap),
RSS = squared error loss, cross-entropy, 0-1 error loss,
Bayes optimal prediction, maximum a posteriori hypothesis, maximum likelihood hypothesis,
model averaging (bagging, stacking, boosting, random forest),
k-means clustering,
EM algorithm,
Apriori algorithm, association rules (market basket analysis)

Book: Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani: An Introduction to Statistical Learning with Applications in R.