'We are drowning in information and starving for knowledge.' – John Naisbitt
Machine learning is a subfield of artificial intelligence (AI) that provides systems with the ability to automatically learn and improve from experience without being explicitly programmed. It focuses on the development of computer programs that can access data and use it to learn for themselves. A machine learning algorithm learns by building a mathematical/statistical model from the data. This model can then be used for inference and decision making. Machine learning has become an integral part of many modern applications. Data science, in contrast, is a cross-topic discipline which combines computer science, math/statistics, and domain and business knowledge:
The lecture 'Artificial Intelligence and Machine Learning' is intended to provide in-depth knowledge of state-of-the-art machine learning algorithms and their applications. This README provides all the necessary information about the lecture. It is structured as follows:
- Lecture contents
- Exercises
- Exam
- Programming tasks (bonus points for the exam)
- Literature and recommended reading
- Bugs and errors
The following topics and algorithms will be covered in the lecture:
0. Math refresher
- Link: click here
- Content:
- Linear algebra
- Statistics
1. Introduction to machine learning
- Link: click here
- Content:
- Motivation and basic terminology
- Problem types in machine learning
- Key challenges in machine learning:
- Generalization
- Feature engineering
- Performance measurement
- Model selection
- Computation
- Applications
2. Optimization techniques
- Link: click here
- Content:
- Important concepts and definitions
- Gradients
- Hessian matrix
- Taylor expansion
- Convex sets and convex functions
- Unconstrained optimization
- Constrained optimization
- Karush-Kuhn-Tucker (KKT) conditions
- Lagrange function
- Lagrange duality
- Numeric optimization
- Gradient descent (with momentum; see the sketch below)
- Newton's method
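As a small illustration of the numeric optimization part, here is a minimal NumPy sketch of gradient descent with momentum on a toy quadratic objective. The matrix, step size, and momentum coefficient are made up for illustration.

```python
import numpy as np

# Toy convex objective: f(x) = 0.5 * x^T A x - b^T x, minimized at x* = A^{-1} b
A = np.array([[3.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, -2.0])

def grad(x):
    return A @ x - b  # gradient of the quadratic objective

x = np.zeros(2)       # starting point
v = np.zeros(2)       # velocity (momentum) term
eta, beta = 0.1, 0.9  # step size and momentum coefficient (illustrative values)

for _ in range(200):
    v = beta * v - eta * grad(x)  # accumulate an exponentially decaying velocity
    x = x + v                     # take the momentum step

print(x, np.linalg.solve(A, b))  # x should be close to the exact minimizer
```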
3. Bayesian decision theory
- Link: click here
- Content:
- Bayes optimal classifiers
- Error minimization vs. risk minimization
- Multinomial and Gaussian naive Bayes (see the sketch below)
- Probability density estimation and maximum likelihood estimation
- Generative and discriminative models
- Exponential family distributions
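The following minimal sketch shows the core idea of a Gaussian naive Bayes classifier: estimate per-class priors, means, and variances by maximum likelihood and predict via the largest log-posterior. The tiny data set is made up.

```python
import numpy as np

def fit_gaussian_nb(X, y):
    """Maximum likelihood estimates for a Gaussian naive Bayes model."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (
            len(Xc) / len(X),       # class prior p(y = c)
            Xc.mean(axis=0),        # per-feature means
            Xc.var(axis=0) + 1e-9,  # per-feature variances (small floor for stability)
        )
    return params

def predict(params, x):
    """Pick the class with the largest posterior (log scale for numerical stability)."""
    def log_post(c):
        prior, mu, var = params[c]
        return np.log(prior) - 0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var)
    return max(params, key=log_post)

# Tiny illustrative data set: two classes, two features
X = np.array([[1.0, 2.0], [1.2, 1.8], [4.0, 5.0], [4.2, 5.1]])
y = np.array([0, 0, 1, 1])
print(predict(fit_gaussian_nb(X, y), np.array([4.1, 4.9])))  # -> 1
```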
4. Non-parametric density estimation and expectation-maximization (EM)
- Link: click here
- Content:
- Histograms
- Kernel density estimation (see the sketch below)
- k-nearest neighbors
- Expectation-maximization (EM) algorithm for Gaussian mixture models
- BIC and AIC
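As a taste of non-parametric density estimation, here is a minimal sketch of a one-dimensional Gaussian kernel density estimator. The bandwidth h and the sample data are chosen arbitrarily for illustration.

```python
import numpy as np

def kde(x_query, samples, h):
    """Gaussian KDE: p(x) = 1 / (n * h) * sum_i K((x - x_i) / h)."""
    u = (x_query[:, None] - samples[None, :]) / h   # pairwise scaled distances
    k = np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)  # Gaussian kernel values
    return k.sum(axis=1) / (len(samples) * h)       # average over all samples

rng = np.random.default_rng(0)
samples = rng.normal(loc=0.0, scale=1.0, size=500)  # data from a standard normal
grid = np.linspace(-3, 3, 7)
print(kde(grid, samples, h=0.3))  # roughly matches the N(0, 1) density on the grid
```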
5. Probabilistic graphical models (PGMs)
- Link: click here
- Content:
- Bayesian networks
- Inference and sampling in graphical models
- Hidden Markov models (HMMs) and the Viterbi algorithm (see the sketch below)
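The Viterbi algorithm fits into a few lines of NumPy. Below is a minimal sketch in the log domain (to avoid numerical underflow); the toy HMM parameters are made up.

```python
import numpy as np

def viterbi(pi, A, B, obs):
    """Most likely hidden state sequence of an HMM (computed in the log domain).

    pi: initial state probabilities, A: transition matrix,
    B: emission matrix (states x symbols), obs: observed symbol indices.
    """
    n_states, T = len(pi), len(obs)
    delta = np.zeros((T, n_states))     # best log-probability ending in each state
    psi = np.zeros((T, n_states), int)  # backpointers to the best predecessor
    delta[0] = np.log(pi) + np.log(B[:, obs[0]])
    for t in range(1, T):
        scores = delta[t - 1][:, None] + np.log(A)  # all predecessor scores
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + np.log(B[:, obs[t]])
    path = [delta[-1].argmax()]
    for t in range(T - 1, 0, -1):       # follow the backpointers
        path.append(psi[t][path[-1]])
    return path[::-1]

# Toy two-state HMM with three observation symbols (all numbers made up)
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
print(viterbi(pi, A, B, obs=[0, 1, 2]))
```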
6. Linear regression
- Link: click here
- Content:
- Normal equations and gradient descent for linear regression (see the sketch below)
- Probabilistic regression
- Basis function regression (polynomial basis functions, radial basis functions)
- Regularization techniques
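For a first impression, here is a minimal sketch of ordinary least squares via the normal equations (X^T X) w = X^T y; the synthetic data is generated for illustration only.

```python
import numpy as np

# Illustrative data: y = 2x + 1 plus Gaussian noise
rng = np.random.default_rng(0)
x = rng.uniform(0, 5, size=50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.2, size=50)

X = np.column_stack([np.ones_like(x), x])  # design matrix with a bias column
w = np.linalg.solve(X.T @ X, X.T @ y)      # normal equations: (X^T X) w = X^T y
print(w)                                   # approximately [1, 2]
```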
7. Logistic regression
- Link: click here
- Content:
- Why you should not use linear regression for classification
- Derivation of the logistic regression model (see the sketch below)
- Logistic regression with basis functions
- Regularization techniques
- Dealing with multi-class problems:
- Softmax regression
- One-vs-one and one-vs-rest
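A minimal sketch of binary logistic regression trained with batch gradient descent is shown below; the data, learning rate, and iteration count are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy binary classification data: two well-separated Gaussian blobs
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
Xb = np.column_stack([np.ones(len(X)), X])  # add a bias column

w = np.zeros(Xb.shape[1])
for _ in range(1000):
    p = sigmoid(Xb @ w)                 # predicted class-1 probabilities
    w -= 0.1 * Xb.T @ (p - y) / len(y)  # gradient of the negative log-likelihood

print(((sigmoid(Xb @ w) > 0.5) == y).mean())  # training accuracy, close to 1.0
```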
8. Deep learning
- Link: click here
- Content:
- Biological motivation of deep learning
- Rosenblatt's perceptron
- Network architectures
- Multi-layer perceptrons (MLPs) and the backpropagation algorithm (see the sketch below)
- Extensions and improvements
- Activation functions
- Parameter initialization techniques
- Optimization algorithms for deep learning models (AdaGrad, RMSProp)
- Introduction to convolutional neural networks (CNNs)
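To make the backpropagation algorithm concrete, here is a minimal sketch of a single-hidden-layer MLP trained on the XOR problem. The architecture and hyperparameters are made up for illustration.

```python
import numpy as np

# XOR: not linearly separable, so a hidden layer is required
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)
y = np.array([[0], [1], [1], [0]], float)

W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)  # input -> hidden
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)  # hidden -> output

for _ in range(5000):
    # Forward pass
    h = np.tanh(X @ W1 + b1)
    out = 1 / (1 + np.exp(-(h @ W2 + b2)))  # sigmoid output
    # Backward pass (binary cross-entropy loss)
    d_out = out - y                      # dL/d(output pre-activation)
    d_h = (d_out @ W2.T) * (1 - h ** 2)  # chain rule through tanh
    # Gradient descent updates
    W2 -= 0.1 * h.T @ d_out; b2 -= 0.1 * d_out.sum(axis=0)
    W1 -= 0.1 * X.T @ d_h;   b1 -= 0.1 * d_h.sum(axis=0)

print(out.round(2).ravel())  # should be close to [0, 1, 1, 0]
```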
9. Evaluation of machine learning models
- Link: click here
- Content:
- Out-of-sample testing and cross validation
- Confusion matrices
- Evaluation metrics: Precision, recall, F1-score, ROC, accuracy, RMSE, MAE (see the sketch below)
- Model selection: Grid search, random search, early stopping
- Bias-variance decomposition
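Precision, recall, and F1-score can be computed directly from the confusion matrix counts, as in this minimal sketch (toy labels for illustration; it assumes at least one predicted and one actual positive, so no division by zero occurs).

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Precision, recall, and F1 computed from confusion matrix counts."""
    tp = np.sum((y_pred == 1) & (y_true == 1))  # true positives
    fp = np.sum((y_pred == 1) & (y_true == 0))  # false positives
    fn = np.sum((y_pred == 0) & (y_true == 1))  # false negatives
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

y_true = np.array([1, 1, 0, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])
print(binary_metrics(y_true, y_pred))  # (0.75, 0.75, 0.75)
```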
10. Decision trees and ensemble methods
- Link: click here
- Content:
- The ID3 algorithm (see the sketch below)
- Extensions and variants:
- Impurity measures
- Dealing with numeric attributes
- Regression trees
- Ensemble methods:
- Bagging
- Random forests
- ExtraTrees
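At the heart of ID3 is the information gain criterion. The minimal sketch below computes it from Shannon entropy; the labels and attribute values are made up.

```python
import numpy as np

def entropy(y):
    """Shannon entropy of a label vector (in bits)."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(y, attribute):
    """Entropy reduction achieved by splitting on a categorical attribute."""
    gain = entropy(y)
    for value in np.unique(attribute):
        subset = y[attribute == value]
        gain -= len(subset) / len(y) * entropy(subset)
    return gain

# Toy example: labels and one categorical attribute with values 'a' / 'b'
y = np.array([1, 1, 1, 0, 0, 0])
attribute = np.array(['a', 'a', 'a', 'b', 'b', 'b'])
print(information_gain(y, attribute))  # 1.0 bit: the split separates the classes perfectly
```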
11. Support vector machines (SVMs)
- Link: click here
- Content:
- Hard-margin SVMs (primal and dual formulation)
- The kernel concept (see the sketch below)
- Soft-margin SVMs
- Sequential minimal optimization (SMO)
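To illustrate the kernel concept, here is a minimal sketch that computes the Gram matrix of the RBF kernel without ever forming the (infinite-dimensional) feature map; the gamma value and data points are illustrative.

```python
import numpy as np

def rbf_kernel(X1, X2, gamma=1.0):
    """Gram matrix K[i, j] = exp(-gamma * ||x1_i - x2_j||^2) of the RBF kernel."""
    sq_dists = (
        np.sum(X1 ** 2, axis=1)[:, None]
        + np.sum(X2 ** 2, axis=1)[None, :]
        - 2 * X1 @ X2.T
    )  # squared Euclidean distances via the binomial expansion
    return np.exp(-gamma * sq_dists)

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
print(rbf_kernel(X, X))  # symmetric, with ones on the diagonal
```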
12. Clustering algorithms
- Link: click here
- Content:
- k-means (see the sketch below)
- Hierarchical clustering
- DBSCAN
- Mean-shift clustering
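A minimal sketch of Lloyd's algorithm for k-means is shown below. It assumes no cluster ever becomes empty (a robust implementation would re-seed empty clusters); the synthetic blobs are for illustration.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Lloyd's algorithm: alternate between assignment and centroid update."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]  # random initial centroids
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None] - centroids[None, :], axis=2)
        labels = dists.argmin(axis=1)  # assignment step
        # Update step (assumes every cluster keeps at least one point)
        centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels, centroids

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(5, 0.5, (50, 2))])
labels, centroids = kmeans(X, k=2)
print(centroids)  # close to (0, 0) and (5, 5)
```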
13. Dimensionality reduction: Principal component analysis (PCA)
- Link: click here
- Content:
- Why dimensionality reduction?
- PCA applications
- Maximum variance formulation of PCA
- Properties of covariance matrices
- The PCA algorithm (see the sketch below)
- Fisher's linear discriminant
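The PCA algorithm itself fits into a few lines: center the data, form the covariance matrix, and project onto its top eigenvectors. The minimal sketch below uses synthetic, nearly one-dimensional data for illustration.

```python
import numpy as np

def pca(X, n_components):
    """Project data onto the top principal components (maximum variance directions)."""
    Xc = X - X.mean(axis=0)                 # center the data
    cov = Xc.T @ Xc / (len(X) - 1)          # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh: the covariance matrix is symmetric
    order = np.argsort(eigvals)[::-1]       # sort eigenvalues in descending order
    W = eigvecs[:, order[:n_components]]    # top eigenvectors as projection matrix
    return Xc @ W, eigvals[order]

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 1)) @ np.array([[2.0, 1.0]]) + 0.1 * rng.normal(size=(100, 2))
Z, eigvals = pca(X, n_components=1)
print(eigvals)  # one dominant eigenvalue: the data is almost one-dimensional
```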
14. Introduction to reinforcement learning
- Link: click here
- Content:
- What is reinforcement learning?
- Key challenges in reinforcement learning
- Dynamic programming:
- Value iteration (see the sketch below)
- Policy iteration
- Q-learning
- SARSA
- Exploitation versus exploration
- Non-deterministic rewards and actions
- Temporal difference learning
- Deep reinforcement learning
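To make dynamic programming concrete, here is a minimal sketch of value iteration on a tiny deterministic MDP; the transition table, rewards, and discount factor are made up.

```python
import numpy as np

# P[s, a] is the successor state, R[s, a] the immediate reward, gamma the discount
P = np.array([[1, 0], [2, 0], [2, 2]])  # 3 states, 2 actions (state 2 is absorbing)
R = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 0.0]])
gamma = 0.9

V = np.zeros(3)
for _ in range(100):
    V = np.max(R + gamma * V[P], axis=1)  # Bellman optimality backup

policy = np.argmax(R + gamma * V[P], axis=1)  # greedy policy w.r.t. V
print(V, policy)  # V close to [0.9, 1.0, 0.0]
```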
15. Advanced regression techniques
- Link: click here
- Content:
- MLE and MAP regression
- Full Bayesian regression
- Kernel regression
- Gaussian process regression (see the sketch below)
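As a teaser for Gaussian process regression, the minimal sketch below computes the GP posterior mean with an RBF kernel; the kernel parameters and the noisy sine data are illustrative.

```python
import numpy as np

def gp_posterior_mean(X_train, y_train, X_test, gamma=10.0, noise=1e-2):
    """Posterior mean of a GP: mu = K_*^T (K + sigma^2 I)^{-1} y."""
    def k(A, B):  # RBF kernel Gram matrix
        d = np.sum(A ** 2, 1)[:, None] + np.sum(B ** 2, 1)[None, :] - 2 * A @ B.T
        return np.exp(-gamma * d)
    K = k(X_train, X_train) + noise * np.eye(len(X_train))  # add observation noise
    return k(X_test, X_train) @ np.linalg.solve(K, y_train)

# Noisy samples of sin(x) (illustrative data)
rng = np.random.default_rng(0)
X_train = rng.uniform(0, 2 * np.pi, size=(30, 1))
y_train = np.sin(X_train[:, 0]) + 0.1 * rng.normal(size=30)
X_test = np.array([[1.0], [3.0]])
print(gp_posterior_mean(X_train, y_train, X_test))  # roughly sin(1) and sin(3)
```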
Please refer to the official DHBW module catalogue for further details.
An exercise sheet is provided for (almost) every lecture unit. Most of the time, the exercises are a compilation of old exam questions. However, the sheets also include programming tasks and questions which would not be suitable for an exam due to time constraints. The programming tasks can be used to collect bonus points for the exam (see the description below).
The solutions will be provided via the Moodle forum after two weeks. It is highly recommended to solve the exercises on your own! Do not wait for the solutions to be uploaded.
Number | Title | Link |
---|---|---|
Sheet 1 | Numeric optimization techniques | Download |
Sheet 2 | Decision theory and parametric density estimation | Download |
Sheet 3 | Non-parametric density estimation, k-nearest neighbors, and EM | Download |
Sheet 4 | Linear regression | Download |
Sheet 5 | Logistic regression | Download |
Sheet 6 | Neural networks and deep learning | Download |
Sheet 7 | Evaluation of machine learning models | Download |
Sheet 8 | Decision trees and ensemble methods | Download |
Sheet 9 | Support vector machines | Download |
Sheet 10 | Clustering | Download |
Sheet 11 | Principal component analysis | Download |
The exam is going to take 120 minutes. The maximum attainable score will be 120 points, so you have one minute per point. Important: Keep your answers short and simple in order not to lose too much valuable time.
The exam questions will be given in German, but you may answer them in either English or German (you are also allowed to mix the languages if you like). Please do not translate domain specific technical terms in order to avoid confusion. Please answer all questions (except for multiple choice questions) on the concept paper which is handed out during the exam.
Exam preparation:
- You will not be asked for lengthy derivations. Instead, I want to test whether you understand the general concepts.
- Any content not discussed in the lecture will not be part of the exam. A list of relevant topics will be shared at the end of the lecture.
- The exam will contain a mix of multiple choice questions, short answer questions and calculations.
- Make sure you can answer the self-test questions provided for each topic. You can find those at the end of each slide deck. There won't be sample solutions for those questions!
- Solve the exercises and work through the solutions if necessary! The solutions will be uploaded after two weeks. Questions marked with an asterisk are old exam questions (or slightly modified).
- Some of the slides give you important hints (upper left corner):
- A slide marked with symbol (1) provides in-depth information which you do not have to know by heart. Think of it as additional material for the sake of completeness. However, do not ignore the content completely during the exam preparation. The content may still be relevant, but won't be a focus topic.
- Symbol (2) indicates very important content. Make sure you understand it!
- Have a look at the old exam to familiarize yourself with the exam format.
- The last lecture slot is reserved for exam preparation and additional questions.
Auxiliary material for the exam:
- Non-programmable pocket calculator
- Two-sided hand-written cheat sheet (you may note whatever you want). Hand-written means pen and paper (not on a tablet or computer!)
Almost every exercise sheet contains at least one programming task. You can collect bonus points for the exam by working on one of these tasks.
How it works:
- Form groups of 2 to 3 students.
- At the end of each lecture slot one group will be chosen to work on the next programming task. Each group will be given a separate programming task.
- The group solves the programming task in Python and presents the results in the next lecture slot.
- The usage of advanced machine learning libraries (sklearn, etc.) is strictly forbidden. The goal is to implement the algorithms in NumPy from scratch. Exception: Functions that are explicitly mentioned in the task description are allowed.
- The code has to be shared with me via e-mail.
- Please submit .py files (not Jupyter notebooks). Please zip the file before sending it!
- You don't have to prepare slides for the presentation.
Grading:
- A group can achieve 10 points per group member (e.g. 2 students => 20 points, 3 students => 30 points).
- The group will be given a total number of points.
- The group members can distribute these points between them.
- No member can achieve more than 15 points.
- You can still achieve 100 % in the exam, even if you do not participate in the bonus point tasks.
You do not need to buy any books for the lecture; most resources are available online.
Please find a curated list below:
Title | Author(s) | View online |
---|---|---|
Convex Optimization | Boyd/Vandenberghe (2004) | click here |
C4.5: Programs for Machine Learning | Quinlan (1993) | - |
Deep Learning | Goodfellow et al. (2016) | click here |
The Elements of Statistical Learning | Hastie et al. (2008) | click here |
Gaussian Processes for Machine Learning | Rasmussen/Williams (2006) | click here |
Machine Learning | Mitchell (1997) | click here |
Machine Learning: A Probabilistic Perspective | Murphy (2012) | click here |
Mathematics for Machine Learning | Deisenroth et al. (2019) | click here |
Pattern Recognition and Machine Learning | Bishop (2006) | click here |
Probabilistic Graphical Models | Koller/Friedman (2009) | click here |
Reinforcement Learning: An Introduction | Sutton/Barto (2014) | click here |
Speech and Language Processing | Jurafsky/Martin (2006) | click here |
The Matrix Cookbook | Petersen/Pedersen (2012) | click here |
YouTube resources:
- Machine learning lecture by Andrew Ng, Stanford University (new version)
- Machine learning lecture by Andrew Ng, Stanford University (old version)
- Support vector machines by Patrick Winston, MIT
- Linear algebra by Gilbert Strang, MIT
- Matrix methods in data analysis, signal processing, and machine learning by Gilbert Strang, MIT
- Gradient descent, how neural networks learn (3Blue1Brown)
- Hidden Markov models (ritvikmath)
- Viterbi algorithm (ritvikmath)
Interesting papers:
- Playing Atari with deep reinforcement learning (Mnih et al., 2013)
- Efficient estimation of word representations in vector space (Mikolov et al., 2013)
- Fast training of support vector machines using sequential minimal optimization (Platt, 1998)
- An implementation of the mean shift algorithm (Demirovic, 2019)
- Independent component analysis: A tutorial (Hyvärinen/Oja, 1999)
Miscellaneous:
- CS229 lecture notes (Andrew Ng, CS229 Stanford)
- The simplified SMO algorithm (Andrew Ng, CS229 Stanford)
- Momentum simulator
Help me improve the lecture: please feel free to file an issue if you spot any errors or problems. Thank you very much in advance! Please do not open issues for questions concerning the content; use the Moodle forum or send me an e-mail instead (daniel.wehner@sap.com).
© 2025 Daniel Wehner, M.Sc.