1. Introduction
  2. Probability
  3. Generative models for discrete data
  4. Gaussian models
  5. Bayesian statistics
  6. Frequentist statistics
  7. Linear regression
  8. Logistic regression
  9. Generalized linear models and the exponential family
  10. Directed graphical models (Bayes nets)
  11. Mixture models and the EM algorithm
  12. Latent linear models
  13. Sparse linear models
  14. Kernels
  15. Gaussian processes
  16. Adaptive basis function models
  17. Markov and hidden Markov models
  18. State space models
  19. Undirected graphical models (Markov random fields)
  20. Exact inference for graphical models
  21. Variational inference
  22. More variational inference
  23. Monte Carlo inference
  24. Markov chain Monte Carlo (MCMC) inference
  25. Clustering
  26. Graphical model structure learning
  27. Latent variable models for discrete data
  28. Deep learning

 

  1. Introduction
    1. Machine Learning: what and why?
    2. Supervised learning
    3. Unsupervised learning
    4. Some basic concepts in machine learning
  2. Probability
    1. Introduction
    2. A brief review of probability theory
    3. Some common discrete distributions
    4. Some common continuous distributions
    5. Joint probability distributions
    6. Transformation of random variables
    7. Monte Carlo approximation
    8. Information theory
  3. Generative models for discrete data
    1. Introduction
    2. Bayesian concept learning
    3. The beta-binomial model
    4. The Dirichlet-multinomial model
    5. Naive Bayes classifier
  4. Gaussian models
    1. Introduction
    2. Gaussian discriminant analysis
    3. Inference in jointly Gaussian distributions
    4. Linear Gaussian systems
    5. Digression: The Wishart distribution
    6. Inferring the parameters of MVN
  5. Bayesian statistics
    1. Introduction
    2. Summarizing posterior distributions
    3. Bayesian model selection
    4. Priors
    5. Hierachical Bayes
    6. Empirical Bayes
    7. Baeysian decision theory
  6. Frequentist statistics
    1. Introduction
    2. Sampling distribution of an estimator
    3. Frequentist decision theory
    4. Desirable properties of estimators
    5. Empirical risk minimization
    6. Pathologies of frequentist statistics
  7. Linear regression
    1. Introduction
    2. Model specification
    3. Maximum likelihood estimation (least squares)
    4. Robust linear regression
    5. Ridge regression
    6. Bayesian linear regression
  8. Logistic regression
    1. Introductions
    2. Model specification
    3. Model fitting
    4. Bayesian logistic regression
    5. Online learning and stochasitc optimization
    6. Generative vs discriminative classifiers
  9. Generalized linear models and the exponential family
    1. Introduction
    2. The exponential families
    3. Generalized linear models (GLMs)
    4. Probit regression
    5. Multi-task learning
    6. Generalized linear mixed models
    7. Learning to rank
  10. Directed graphical models (Bayes nets)
    1. Introduction
    2. Exmples
    3. Inference
    4. Learning
    5. Conditional indepence properties of DGMs
  11. Mixture models and the EM algorithm
    1. Latent variable models
    2. Mixture models
    3. Parameter estimation for mixture models
    4. The EM algorithm
    5. Model selection for latent variable models
    6. Fitting models with missing data
  12. Latent linear models
    1. Factor analysis
    2. Principal components analysis (PCA)
    3. Choosing the umber of latent dimensions
    4. PCA for categorical data
    5. PCA for paired and multi-view data
    6. Independent Component Analysis (ICA)
  13. Sparse linear models
    1. Introduction
    2. Bayesian variable selection
    3. L1 regularization: basics
    4. L1 regularization: algorithms
    5. L1 regularization: extensions
    6. Non-convex regularizers
    7. Automatic relevance determination (ARD)/sparse Bayesian learning (SBL)
    8. Sparse coding
  14. Kernels
    1. Introduction
    2. Kernel functions
    3. Using kernels inside GLMs
    4. The kernel trick
    5. Support vector machines (SVMs)
    6. Comparison of discriminative kernel methods
    7. Kernels for building generative models
  15. Gaussian processes
    1. Introduction
    2. GPs for regression
    3. GPs meet GLMs
    4. Connection with other methods
    5. GP latent variable model
    6. Approximation methods for large datasets
  16. Adaptive basis function models
    1. Introduction
    2. Classification and regression trees (CART)
    3. Generalized additive models
    4. Boosting
    5. Feedforward neural networks (multilayer perceptrons)
    6. Ensemble learning
    7. Experimental comparison
    8. Interperting black-box models
  17. Markov and hidden Markov models
    1. Introduction
    2. Markov models
    3. Hidden Markov models
    4. Inference in HMMs
    5. Learning for HMMs
    6. Generalizations of HMMs
  18. State space models
    1. Introduction
    2. Application of SSMs
    3. Inference in LG-SSM
    4. Learning for LG-SSM
    5. Approximate online infernce for non-linear, non-Gaussian SSMs
    6. Hybrid discrete/continuous SSMs
  19. Undirected graphical models (Markov random fields)
    1. Introduction
    2. Conditional independence properties of UGMs
    3. Parameterization of MRFs
    4. Examples of MRFs
    5. Learning
    6. Conditional random fields (CRFs)
    7. Structural SVMs
  20. Exact inference for graphical models
    1. Introduction
    2. Belief propagation for trees
    3. The variable elimination algorithm
    4. The junction tree algorithm
    5. Computational intractability of exact inference in the worst case
  21. Variational inference
    1. Introduction
    2. Variational inference
    3. The mean field method
    4. Structured mean field
    5. Variational Bayes
    6. Variational Bayes EM
    7. Variational message passing and VIBES
    8. Local variational bounds
  22. More variational inference
    1. Introduction
    2. Loopy belief propagation: algorithmic issues
    3. Loopy belief propagationl: theoretical issues
    4. Extensions of belief propagation
    5. Expectation propagation
    6. MAP state estimation
  23. Monte Carlo inference
    1. Introduction
    2. Sampling from standard distributions
    3. Rejection sampling
    4. Importance sampling
    5. Particle filtering
    6. Rao-Blackwellised particle filtering (RBPF)
  24. Markov chain Monte Carlo (MCMC) inference
    1. Introduction
    2. Gibbs sampling
    3. Metroplis Hastings algorithm
    4. Speed and accuracy of MCMC
    5. Auxiliary variable MCMC
    6. Annealing methods
    7. Approximationg the marginal likelihood
  25. Clustering
    1. Introduction
    2. Dirichlet process mixture models
    3. Affinity propagation
    4. Spectral clustering
    5. Hierarchical clustering
    6. Clustering datapoints and features
  26. Graphical model structure learning
    1. Introduction
    2. Structure learning for knowledge discovery
    3. Learning tree structures
    4. Learning DAG structures
    5. Learning DAG structure with latent variables
    6. Learning causal DAGs
    7. Learning undirected Gaussian graphical models
    8. Learning undirected discrete graphical models
  27. Latent variable models for discrete data
    1. Introduction
    2. Distributed state LVMs for discrete data
    3. Latent Dirichlet allocation (LDA)
    4. Extensions of LDA
    5. LVMs for graph-sructured data
    6. LVMs for relational data
    7. Restricted Boltxmann machines (RBMs)
  28. Deep learning
    1. Introduction
    2. Deep generative models
    3. Deep neural networks
    4. Application of deep networks
    5. Discussion