Trustworthy Artificial Intelligence (SS 2026)

Course Number 705007 | Summer Semester 2026

Content

Artificial intelligence systems are increasingly used in domains where errors can have serious consequences. Ensuring that AI behaves reliably, safely, and fairly is therefore essential. This course covers the theoretical foundations and practical techniques required to design, evaluate, and deploy AI systems that can be trusted in real-world applications. Main topics of the course include:
  • Monitoring of Machine Learning Models. We discuss monitoring strategies for neural network classifiers, focusing on distribution shift detection, contextual monitoring, and runtime frameworks for continuous assurance.
  • Testing of Reinforcement Learning Systems. We explore systematic testing methodologies for reinforcement learning agents, including critical scenario generation, model-based testing strategies, and safety and performance evaluation in probabilistic environments.
  • Shielded Reinforcement Learning. Shielded reinforcement learning is introduced as an approach to provide formal correctness guarantees for reinforcement learning systems by combining model-based formal reasoning with learning-based control.
  • Explainable AI (XAI). We examine interpretability and explainability techniques for different types of machine learning models.
  • Fair Machine Learning. We discuss common definitions of fairness criteria, techniques for bias detection and mitigation, and algorithms for fair decision-making.
  • Neural Network Verification. We explore formal verification techniques to mathematically prove that a neural network satisfies specified safety, robustness, or correctness properties under all admissible inputs.

Material

 
NR Date Topic Speaker Slides
1 05.03.2026 Introduction Bettina Könighofer TBA
2 12.03.2026 Monitoring of Machine Learning Models Hazem Torfah TBA
3 19.03.2026 Testing for Reinforcement Learning Martin Tappler
4 26.03.2026 Shielded Reinforcement Learning Bettina Könighofer, Stefan Pranger
5 16.04.2026 Explainable Sequential Decision Making Sabine Rieder
6 23.04.2026 Fairness in Machine Learning Konstantin Kueffner
7 30.04.2026 Neural Network Verification Laura Nenzi
8 07.05.2026 Watermarking of AI-generated Content Verena, Matthias
9 21.05.2026 Concept Bottleneck Models Lukas, Boris
             Neural Network Verification Eymen, Yuliia
10 28.05.2026 Hallucination Detection / Mitigation / Guardrails for LLMs Anna-Sophie, Andreas
              Interpretable Multi-Objective Reinforcement Learning Lucrezia, Laurent
11 11.06.2026 Explainability in LLMs Daria, Kevin
              Norm-Compliant Reinforcement Learning Isabella, Liam
12 18.06.2026 Learning Symbolic World Models with Automata Learning & Learning Data-driven World Models with Dreamer V3 Benjamin, Matthias, Arttu
13 25.06.2026 Adversarial Attacks on LLMs Sam
              Provable Safe Reinforcement Learning Dominik, Markus

Administrative Information

In the first session, students will receive an introduction to the course and an overview of the available topics. They will then select a specific topic to explore in depth during the semester and present it to the group in the second half of the course.

In the first six lectures, leading experts will provide introductions to their respective research fields (see Material).

Timeline

  • By 19/03: Form groups of two and choose a topic.
  • 16–27/03: First topic discussion – Narrowing the topic and selecting publications for the presentation.
  • 20–30/04: Second topic discussion – Research discussion on the topic.
  • 07/05–25/06: Student presentations 

Grading

The course consists of the following graded partial assessments:
  • Research Discussion 1 (10 pts)
  • Research Discussion 2 (30 pts)
  • Final Presentation (60 pts)
In order to obtain a passing grade, students must
  • fulfill the attendance policy (absence only with justified excuse),
  • reach at least 15 pts in the research discussions, and
  • reach at least 50 pts overall.
The grade is then determined by the following grading scheme:
Points Grade
≥ 87.5 (1) Excellent
≥ 75.0 (2) Good
≥ 62.5 (3) Satisfactory
≥ 50.0 (4) Sufficient
< 50.0 (5) Insufficient

Further details, including deadlines and regulations regarding the repetition of specific partial assessments, are provided in the introductory slides of the first lecture.

Attendance

Attendance is compulsory. Exceptions may be granted for justified reasons (e.g., illness) but must be communicated to the course instructor as soon as possible.

Usage of Generative AI

We follow the general TU Graz guidelines on the use of generative AI.

Lecture Dates

Date Begin End Location Event Type Comment
2026/03/19 12:00 14:00 Seminarraum Lecture VU fixed
2026/03/26 12:00 14:00 Seminarraum Lecture VU fixed
2026/04/16 12:00 14:00 Seminarraum Lecture VU fixed
2026/04/23 12:00 14:00 Seminarraum Lecture VU fixed
2026/04/30 12:00 14:00 Seminarraum Lecture VU fixed
2026/05/07 12:00 14:00 Seminarraum Lecture VU fixed
2026/05/21 12:00 14:00 Seminarraum Lecture VU fixed
2026/05/28 12:00 14:00 Seminarraum Lecture VU fixed
2026/06/11 12:00 14:00 Seminarraum Lecture VU fixed
2026/06/18 12:00 14:00 Seminarraum Lecture VU fixed
2026/06/25 12:00 14:00 Seminarraum Lecture VU fixed

Lecturers

Bettina Könighofer
Assistant Professor