Artificial Intelligence and Machine Learning (Lecture) πŸ€–

'We are drowning in information and starving for knowledge.' – John Naisbitt

Machine learning and data science form a subfield of artificial intelligence (AI) that provides systems with the ability to learn and improve from experience automatically, without being explicitly programmed. Machine learning focuses on the development of computer programs which can access data and use it to learn for themselves. A machine learning algorithm learns by building a mathematical / statistical model from the data; this model can then be used for inference and decision making. Machine learning has become an integral part of many modern applications. In general, data science is a cross-disciplinary field which comprises computer science, math / statistics, as well as domain and business knowledge.
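
To make the 'build a model from data, then use it for inference' idea concrete, here is a minimal sketch: fitting a straight line to noisy data with NumPy via least squares. The data and parameter values are made up purely for illustration:

```python
import numpy as np

# Toy data: noisy samples of y = 2x + 1
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=50)
y = 2.0 * X + 1.0 + rng.normal(0, 0.5, size=50)

# 'Learning': estimate the model parameters from the data via least
# squares (design matrix with a bias column).
A = np.column_stack([X, np.ones_like(X)])
w, *_ = np.linalg.lstsq(A, y, rcond=None)

# 'Inference': use the learned model to predict unseen inputs.
x_new = np.array([3.5, 7.0])
y_pred = w[0] * x_new + w[1]
print(f"learned w = {w[0]:.2f}, b = {w[1]:.2f}, predictions = {y_pred}")
```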

The lecture 'Artificial Intelligence and Machine Learning' is intended to provide in-depth knowledge of state-of-the-art machine learning algorithms and their applications. This readme file provides you with all the necessary information about the lecture. It is structured as follows:

  1. πŸ“œ Lecture contents
  2. βœ’οΈ Exercises
  3. πŸ“ Exam
  4. 🐍 Programming tasks (bonus points for the exam)
  5. πŸ“š Literature and recommended reading
  6. 🐞 Bugs and errors

Lecture Contents πŸ“œ

The following topics and algorithms will be covered in the lecture:

0. Math refresher
  • Link: πŸ”— click here
  • Content:
    • Linear algebra
    • Statistics
1. Introduction to machine learning
  • Link: πŸ”— click here
  • Content:
    • Motivation and basic terminology
    • Problem types in machine learning
    • Key challenges in machine learning:
      • Generalization
      • Feature engineering
      • Performance measurement
      • Model selection
      • Computation
    • Applications
2. Optimization techniques
  • Link: πŸ”— click here
  • Content:
    • Important concepts and definitions
      • Gradients
      • Hessian matrix
      • Taylor expansion
      • Convex sets and convex functions
    • Unconstrained optimization
    • Constrained optimization
      • Karush-Kuhn-Tucker (KKT) conditions
      • Lagrange function
      • Lagrange duality
    • Numeric optimization
      • Gradient descent (with momentum) – see the code sketch below the topic list
      • Newton's method
3. Bayesian decision theory
  • Link: πŸ”— click here
  • Content:
    • Bayes optimal classifiers
    • Error minimization vs. risk minimization
    • Multinomial and Gaussian naive Bayes
    • Probability density estimation and maximum likelihood estimation
    • Generative and discriminative models
    • Exponential family distributions
4. Non-parametric density estimation and expectation-maximization (EM)
  • Link: πŸ”— click here
  • Content:
    • Histograms
    • Kernel density estimation
    • k-nearest neighbors
    • Expectation-maximization (EM) algorithm for Gaussian mixture models
    • BIC and AIC
5. Probabilistic graphical models (PGMs)
  • Link: πŸ”— click here
  • Content:
    • Bayesian networks
    • Inference and sampling in graphical models
    • Hidden Markov models (HMMs) and the Viterbi algorithm
6. Linear regression
  • Link: πŸ”— click here
  • Content:
    • Normal equations and gradient descent for linear regression
    • Probabilistic regression
    • Basis function regression (polynomial basis functions, radial basis functions)
    • Regularization techniques
7. Logistic regression
  • Link: πŸ”— click here
  • Content:
    • Why you should not use linear regression for classification
    • Derivation of the logistic regression model
    • Logistic regression with basis functions
    • Regularization techniques
    • Dealing with multi-class problems:
      • Softmax regression
      • One-vs-one and one-vs-rest
8. Deep learning
  • Link: πŸ”— click here
  • Content:
    • Biological motivation of deep learning
    • Rosenblatt's perceptron
    • Network architectures
    • Multi-layer perceptrons (MLPs) and the backpropagation algorithm
    • Extensions and improvements
      • Activation functions
      • Parameter initialization techniques
      • Optimization algorithms for deep learning models (AdaGrad, RMSProp)
    • Introduction to convolutional neural networks (CNNs)
9. Evaluation of machine learning models
  • Link: πŸ”— click here
  • Content:
    • Out-of-sample testing and cross-validation
    • Confusion matrices
    • Evaluation metrics: Precision, recall, F1-score, ROC, accuracy, RMSE, MAE
    • Model selection: Grid search, random search, early stopping
    • Bias-variance decomposition
10. Decision trees and ensemble methods
  • Link: πŸ”— click here
  • Content:
    • The ID3 algorithm
    • Extensions and variants:
      • Impurity measures
      • Dealing with numeric attributes
      • Regression trees
    • Ensemble methods:
      • Bagging
      • Random forests
      • ExtraTrees
11. Support vector machines (SVMs)
  • Link: πŸ”— click here
  • Content:
    • Hard-margin SVMs (primal and dual formulation)
    • The kernel concept
    • Soft-margin SVMs
    • Sequential minimal optimization (SMO)
12. Clustering algorithms
  • Link: πŸ”— click here
  • Content:
    • k-means
    • Hierarchical clustering
    • DBSCAN
    • Mean-shift clustering
13. Dimensionality reduction: Principal component analysis (PCA)
  • Link: πŸ”— click here
  • Content:
    • Why dimensionality reduction?
    • PCA applications
    • Maximum variance formulation of PCA
    • Properties of covariance matrices
    • The PCA algorithm
    • Fisher's linear discriminant
14. Introduction to reinforcement learning
  • Link: πŸ”— click here
  • Content:
    • What is reinforcement learning?
    • Key challenges in reinforcement learning
    • Dynamic programming:
      • Value iteration
      • Policy iteration
      • Q-learning
      • SARSA
    • Exploitation versus exploration
    • Non-deterministic rewards and actions
    • Temporal difference learning
    • Deep reinforcement learning
15. Advanced regression techniques
  • Link: πŸ”— click here
  • Content:
    • MLE and MAP regression
    • Full Bayesian regression
    • Kernel regression
    • Gaussian process regression
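
As a taste of the numeric optimization unit (topic 2 above), here is a minimal sketch of gradient descent with momentum on a simple quadratic. The objective function, learning rate, and momentum coefficient are illustrative choices, not values from the lecture slides:

```python
import numpy as np

def grad_f(x):
    # Gradient of f(x) = x_0^2 + 10 * x_1^2 (an illustrative quadratic).
    return np.array([2.0 * x[0], 20.0 * x[1]])

def gradient_descent_momentum(x0, lr=0.05, beta=0.9, n_steps=100):
    """Gradient descent with a momentum (velocity) term."""
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    for _ in range(n_steps):
        v = beta * v - lr * grad_f(x)  # accumulate a velocity
        x = x + v                      # take the momentum step
    return x

print(gradient_descent_momentum([5.0, 5.0]))  # approaches the minimum [0, 0]
```

The momentum term dampens the oscillations that plain gradient descent exhibits on such ill-conditioned objectives and speeds up progress along flat directions.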

Please refer to the official DHBW module catalogue for further details.

Exercises βœ’οΈ

An exercise sheet is provided for (almost) all lecture units. Most of the time, the exercises are a compilation of old exam questions. However, they also include programming tasks and questions which would not be suitable for an exam (due to time constraints). The programming tasks can be used to collect bonus points for the exam (see the description below).

The solutions will be provided via the Moodle forum after two weeks. It is highly recommended to solve the exercises on your own! Do not wait for the solutions to be uploaded.

| Number | Title | Link 🔗 |
| --- | --- | --- |
| Sheet 1 | Numeric optimization techniques | Download |
| Sheet 2 | Decision theory and parametric density estimation | Download |
| Sheet 3 | Non-parametric density estimation, k-nearest neighbors, and EM | Download |
| Sheet 4 | Linear regression | Download |
| Sheet 5 | Logistic regression | Download |
| Sheet 6 | Neural networks and deep learning | Download |
| Sheet 7 | Evaluation of machine learning models | Download |
| Sheet 8 | Decision trees and ensemble methods | Download |
| Sheet 9 | Support vector machines | Download |
| Sheet 10 | Clustering | Download |
| Sheet 11 | Principal component analysis | Download |

Exam πŸ“

The exam is going to take 120 minutes. The maximum attainable score will be 120 points, so you have one minute per point. Important: Keep your answers short and simple in order not to lose too much valuable time.

The exam questions will be given in German, but you may answer them in either English or German (you are also allowed to mix the languages if you like). Please do not translate domain specific technical terms in order to avoid confusion. Please answer all questions (except for multiple choice questions) on the concept paper which is handed out during the exam.

Exam preparation:

  • You will not be asked for lengthy derivations. Instead, I want to test whether you understand the general concepts.
  • Any content not discussed in the lecture will not be part of the exam. A list of relevant topics will be shared at the end of the lecture.
  • The exam will contain a mix of multiple choice questions, short answer questions and calculations.
  • Make sure you can answer the self-test questions provided for each topic. You can find those at the end of each slide deck. There won't be sample solutions for those questions!
  • Solve the exercises and work through the solutions if necessary! The solutions will be uploaded after two weeks. Questions marked with an asterisk are old exam questions (or slightly modified versions thereof).
  • Some of the slides give you important hints (upper left corner):
    • A slide marked with symbol (1) provides in-depth information which you do not have to know by heart. Think of it as additional material for the sake of completeness. However, do not ignore this content completely during your exam preparation: it may still be relevant, but it won't be a focus topic.
    • Symbol (2) indicates very important content. Make sure you understand it!
  • Have a look at the old exam to familiarize yourself with the exam format.
  • The last lecture slot is reserved for exam preparation and additional questions.

Auxiliary material for the exam:

  • Non-programmable pocket calculator
  • Two-sided hand-written cheat sheet (you may note down whatever you want). Hand-written means pen and paper (not on a tablet or computer!)

Programming Tasks (Bonus Points for the Exam) 🐍

Almost every exercise sheet contains at least one programming task. You can collect bonus points for the exam by working on one of these tasks.

How it works:

  • Form groups of 2 to 3 students.
  • At the end of each lecture slot one group will be chosen to work on the next programming task. Each group will be given a separate programming task.
  • The group solves the programming task in Python and presents the results in the next lecture slot.
  • The usage of advanced machine learning libraries (sklearn, etc.) is strictly forbidden. The goal is to implement the algorithms from scratch in NumPy (see the sketch below this list). Exception: Functions that are explicitly mentioned in the task description are allowed.
  • The code has to be shared with me via e-mail.
  • Please submit .py files (not Jupyter notebooks) and zip the files before sending them!
  • You don't have to prepare slides for the presentation.
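
To give you a feeling for the expected 'from scratch' style, here is a hypothetical example in the spirit of the tasks: a naive k-means implementation in plain NumPy. This is only an illustration, not one of the actual tasks (those are handed out per group):

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Naive k-means in plain NumPy (no sklearn), for illustration only."""
    rng = np.random.default_rng(seed)
    # Initialize the centroids with k randomly chosen data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned points.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # converged
        centroids = new_centroids
    return centroids, labels

# Example usage on random 2-D data:
X = np.random.default_rng(1).normal(size=(200, 2))
centroids, labels = kmeans(X, k=3)
```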

Grading:

  • A group can achieve 10 points per group member (e.g. 2 students => 20 points, 3 students => 30 points).
  • The group is awarded a total number of points, which the members may distribute among themselves.
  • No member can receive more than 15 points (e.g. in a group of 2 with 20 points in total, one member may take at most 15, leaving at least 5 for the other).
  • You can still achieve 100 % in the exam, even if you do not participate in the bonus point tasks.

Literature and Recommended Reading 📚

You do not need to buy any books for the lecture; most resources are available online.
Please find a curated list below:

| Title | Author(s) | View online 🔗 |
| --- | --- | --- |
| Convex Optimization | Boyd/Vandenberghe (2004) | click here |
| C4.5: Programs for Machine Learning | Quinlan (1993) | - |
| Deep Learning | Goodfellow et al. (2016) | click here |
| The Elements of Statistical Learning | Hastie et al. (2008) | click here |
| Gaussian Processes for Machine Learning | Rasmussen/Williams (2006) | click here |
| Machine Learning | Mitchell (1997) | click here |
| Machine Learning: A Probabilistic Perspective | Murphy (2012) | click here |
| Mathematics for Machine Learning | Deisenroth et al. (2019) | click here |
| Pattern Recognition and Machine Learning | Bishop (2006) | click here |
| Probabilistic Graphical Models | Koller/Friedman (2009) | click here |
| Reinforcement Learning: An Introduction | Sutton/Barto (2014) | click here |
| Speech and Language Processing | Jurafsky/Martin (2006) | click here |
| The Matrix Cookbook | Petersen/Pedersen (2012) | click here |

πŸ”— YouTube resources:

πŸ”— Interesting papers:

πŸ”— Miscellaneous:

Bugs and Errors 🐞

Help me improve the lecture. Please feel free to file an issue if you spot any errors or problems. Thank you very much in advance! Please do not open issues for questions concerning the content; use the Moodle forum or send me an e-mail instead (daniel.wehner@sap.com).

© 2025 Daniel Wehner, M.Sc.
