Trustworthy Artificial Intelligence (SS 2026)

Course Number 705007 | Summer Semester 2026

Content

Artificial intelligence systems are increasingly used in domains where errors can have serious consequences. Ensuring that AI behaves reliably, safely, and fairly is therefore essential. This course covers the theoretical foundations and practical techniques required to design, evaluate, and deploy AI systems that can be trusted in real-world applications. Main topics of the course include:
  • Monitoring of Machine Learning Models. We discuss monitoring strategies for neural network classifiers, focusing on distribution shift detection, contextual monitoring, and runtime frameworks for continuous assurance.
  • Testing of Reinforcement Learning Systems. We explore systematic testing methodologies for reinforcement learning agents, including critical scenario generation, model-based testing strategies, and safety and performance evaluation in probabilistic environments.
  • Shielded Reinforcement Learning. Shielded reinforcement learning is introduced as an approach to provide formal correctness guarantees for reinforcement learning systems by combining model-based formal reasoning with learning-based control.
  • Explainable AI (XAI). We examine interpretability and explainability techniques for different types of machine learning models.
  • Fair Machine Learning. We discuss common definitions of fairness criteria, techniques for bias detection and mitigation, and algorithms for fair decision-making.
  • Neural Network Verification. We explore formal verification techniques to mathematically prove that a neural network satisfies specified safety, robustness, or correctness properties under all admissible inputs.

Material

 
NR Date Topic Speaker Slides
1 05.03.2026 Introduction Bettina Könighofer TBA
2 12.03.2026 Monitoring of Machine Learning Models Hazem Torfah TBA
3 19.03.2026 Testing for Reinforcement Learning Martin Tappler
4 26.03.2026 Shielded Reinforcement Learning Bettina Könighofer, Stefan Pranger
5 16.04.2026 Explainable Sequential Decision Making Sabine Rieder
6 23.04.2026 Fairness in Machine Learning Konstantin Kueffner
7 30.04.2026 Neural Network Verification Laura Nenzi
8 07.05.2026 Watermarking of AI-generated Content Verena, Matthias
9 21.05.2026 Concept Bottleneck Models Lukas, Boris
             Neural Network Verification Eymen, Yuliia
10 28.05.2026 Hallucination Detection / Mitigation / Guardrails for LLMs Anna-Sophie, Andreas
              Interpretable Multi-Objective Reinforcement Learning Lucrezia, Laurent
11 11.06.2026 Explainability in LLMs Daria, Kevin
              Norm-Compliant Reinforcement Learning Isabella, Liam
12 18.06.2026 Learning Symbolic World Models with Automata Learning & Learning Data-driven World Models with Dreamer V3 Benjamin, Matthias, Arttu
13 25.06.2026 Adversarial Attacks on LLMs Sam
              Provable Safe Reinforcement Learning Dominik, Markus

Administrative Information

In the first session, students will receive an introduction to the course and an overview of the available topics. They will then select a specific topic to explore in depth during the semester and present it to the group in the second half of the course.

In the first six lectures, leading experts will provide introductions to their respective research fields (see Material).

Timeline

  • By 19/03: Form groups of two and choose a topic.
  • 16–27/03: First topic discussion – Narrowing the topic and selecting publications for the presentation.
  • 20–30/04: Second topic discussion – Research discussion on the topic.
  • 07/05–25/06: Student presentations 

Grading

The course consists of the following graded partial assessments:
  • Research Discussion 1 (10 pts)
  • Research Discussion 2 (30 pts)
  • Final Presentation (60 pts)
In order to obtain a passing grade, students must
  • fulfill the attendance policy (absence only with justified excuse),
  • reach at least 15 pts in the research discussions, and
  • reach at least 50 pts overall.
The grade is then determined by the following grading scheme:
Points Grade
≥ 87.5 (1) Excellent
≥ 75.0 (2) Good
≥ 62.5 (3) Satisfactory
≥ 50.0 (4) Sufficient
< 50.0 (5) Insufficient

Further details, including deadlines and regulations regarding the repetition of specific partial assessments, are provided in the introductory slides of the first lecture.

Attendance

Attendance is compulsory. Exceptions may be granted for justified reasons (e.g., illness) but must be communicated to the course instructor as soon as possible.

Usage of Generative AI

We follow the general TU Graz guidelines on the use of generative AI.

Lecture Dates

Date Begin End Location Event Type Comment
2026/03/19 12:00 14:00 Seminarraum Lecture VU fixed
2026/03/26 12:00 14:00 Seminarraum Lecture VU fixed
2026/04/16 12:00 14:00 Seminarraum Lecture VU fixed
2026/04/23 12:00 14:00 Seminarraum Lecture VU fixed
2026/04/30 12:00 14:00 Seminarraum Lecture VU fixed
2026/05/07 12:00 14:00 Seminarraum Lecture VU fixed
2026/05/21 12:00 14:00 Seminarraum Lecture VU fixed
2026/05/28 12:00 14:00 Seminarraum Lecture VU fixed
2026/06/11 12:00 14:00 Seminarraum Lecture VU fixed
2026/06/18 12:00 14:00 Seminarraum Lecture VU fixed
2026/06/25 12:00 14:00 Seminarraum Lecture VU fixed

Lecturers

Bettina Könighofer
Assistant Professor