Trustworthy Artificial Intelligence (SS 2026)
Content
Artificial intelligence systems are increasingly used in domains where errors can have serious consequences. Ensuring that AI behaves reliably, safely, and fairly is therefore essential. This course covers the theoretical foundations and practical techniques required to design, evaluate, and deploy AI systems that can be trusted in real-world applications. The main topics of the course include:
- Monitoring of Machine Learning Models. We discuss monitoring strategies for neural network classifiers, focusing on distribution shift detection, contextual monitoring, and runtime frameworks for continuous assurance.
- Testing of Reinforcement Learning Systems. We explore systematic testing methodologies for reinforcement learning agents, including critical scenario generation, model-based testing strategies, and safety and performance evaluation in probabilistic environments.
- Shielded Reinforcement Learning. Shielded reinforcement learning is introduced as an approach to provide formal correctness guarantees for reinforcement learning systems by combining model-based formal reasoning with learning-based control.
- Explainable AI (XAI). We examine interpretability and explainability techniques for different types of machine learning models.
- Fair Machine Learning. We discuss common definitions of fairness criteria, techniques for bias detection and mitigation, and algorithms for fair decision-making.
- Neural Network Verification. We explore formal verification techniques to mathematically prove that a neural network satisfies specified safety, robustness, or correctness properties under all admissible inputs.
Material
| NR | Date | Topic | Speaker | Slides |
|---|---|---|---|---|
| 1 | 05.03.2026 | Introduction | Bettina Könighofer | TBA |
| 2 | 12.03.2026 | Monitoring of Machine Learning Models | Hazem Torfah | TBA |
| 3 | 19.03.2026 | Testing for Reinforcement Learning | Martin Tappler | |
| 4 | 26.03.2026 | Shielded Reinforcement Learning | Bettina Könighofer, Stefan Pranger | |
| 5 | 16.04.2026 | Explainable Sequential Decision Making | Sabine Rieder | |
| 6 | 23.04.2026 | Fairness in Machine Learning | Konstantin Kueffner | |
| 7 | 30.04.2026 | Neural Network Verification | Laura Nenzi | |
| 8 | 07.05.2026 | Watermarking of AI-generated Content | Verena, Matthias | |
| 9 | 21.05.2026 | Concept Bottleneck Models<br>Neural Network Verification | Lukas, Boris<br>Eymen, Yuliia | |
| 10 | 28.05.2026 | Hallucination Detection / Mitigation / Guardrails for LLMs<br>Interpretable Multi-Objective Reinforcement Learning | Anna-Sophie, Andreas<br>Lucrezia, Laurent | |
| 11 | 11.06.2026 | Explainability in LLMs<br>Norm-Compliant Reinforcement Learning | Daria, Kevin<br>Isabella, Liam | |
| 12 | 18.06.2026 | Learning Symbolic World Models with Automata Learning & Learning Data-driven World Models with Dreamer V3 | Benjamin, Matthias, Arttu | |
| 13 | 25.06.2026 | Adversarial Attacks on LLMs<br>Provable Safe Reinforcement Learning | Sam<br>Dominik, Markus | |
Administrative Information
In the first session, students will receive an introduction to the course and an overview of the available topics. They will then select a specific topic to explore in depth during the semester and present it to the group in the second half of the course. In the first six lectures, leading experts will provide introductions to their respective research fields (see Material).
Timeline
- By 19/03: Form groups of two and choose a topic.
- 16–27/03: First topic discussion – Narrowing the topic and selecting publications for the presentation.
- 20–30/04: Second topic discussion – Research discussion on the topic.
- 07/05–25/06: Student presentations
Grading
The course consists of the following graded partial assessments:
- Research Discussion 1 (10 pts)
- Research Discussion 2 (30 pts)
- Final Presentation (60 pts)
To pass the course, students must additionally:
- fulfill the attendance policy (absences only with a justified excuse)
- reach at least 15 pts in the research discussions
- reach at least 50 pts overall
| Points | Grade |
|---|---|
| ≥ 87.5 | (1) Excellent |
| ≥ 75.0 | (2) Good |
| ≥ 62.5 | (3) Satisfactory |
| ≥ 50.0 | (4) Sufficient |
| < 50.0 | (5) Insufficient |
Further details, including deadlines and regulations regarding the repetition of specific partial assessments, are provided in the introductory slides of the first lecture.
Attendance
Attendance is compulsory. Exceptions may be granted for justified reasons (e.g., illness) but must be communicated to the course instructor as soon as possible.
Usage of Generative AI
We follow the general TU Graz guidelines on the use of generative AI.
Lecture Dates
| Date | Begin | End | Location | Event | Type | Comment |
|---|---|---|---|---|---|---|
| 2026/03/19 | 12:00 | 14:00 | Seminarraum | Lecture | VU | fixed |
| 2026/03/26 | 12:00 | 14:00 | Seminarraum | Lecture | VU | fixed |
| 2026/04/16 | 12:00 | 14:00 | Seminarraum | Lecture | VU | fixed |
| 2026/04/23 | 12:00 | 14:00 | Seminarraum | Lecture | VU | fixed |
| 2026/04/30 | 12:00 | 14:00 | Seminarraum | Lecture | VU | fixed |
| 2026/05/07 | 12:00 | 14:00 | Seminarraum | Lecture | VU | fixed |
| 2026/05/21 | 12:00 | 14:00 | Seminarraum | Lecture | VU | fixed |
| 2026/05/28 | 12:00 | 14:00 | Seminarraum | Lecture | VU | fixed |
| 2026/06/11 | 12:00 | 14:00 | Seminarraum | Lecture | VU | fixed |
| 2026/06/18 | 12:00 | 14:00 | Seminarraum | Lecture | VU | fixed |
| 2026/06/25 | 12:00 | 14:00 | Seminarraum | Lecture | VU | fixed |
Lecturers
Bettina Könighofer
Assistant Professor