GenAI Cybersecurity for Energy Systems

Focus: Assessment of LLM capabilities for energy systems cybersecurity tasks, with an emphasis on the IEC 61850-based digital substation testbed of the KASTEL Security Lab Energy.

Modern energy systems built on operational technology (OT), such as IEC 61850 digital substations, are critical cyber-physical infrastructures whose failure can cause severe societal and economic damage. At the same time, recent Large Language Models (LLMs) show strong capabilities in code understanding, tool orchestration, and security analysis. However, their behavior in OT environments remains largely unexplored, and their capabilities are strongly dual-use: the same skills that support defenders can also assist attackers. RA 3 investigates how far LLMs can support or partially automate cybersecurity tasks for energy systems. The KASTEL Security Lab Energy serves as the central case-study platform, featuring multi-vendor IEC 61850 substations and remote terminal units (RTUs).

Topic 3.1 – Knowledge Benchmarking for OT Cybersecurity: Quantifying the cybersecurity knowledge of LLMs, with a particular emphasis on operational technology and energy systems. The goal is to systematically evaluate what LLMs actually “know” about cybersecurity in energy systems, including protocol semantics, common vulnerabilities, and defense concepts.
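One simple way to quantify such knowledge is to score model answers against a labeled question set and report accuracy per knowledge area. The following is a minimal sketch under stated assumptions, not the benchmark design of this topic: the multiple-choice format, the field names (`id`, `category`, `correct`), and the four placeholder items are all hypothetical; only the three category names are taken from the description above.

```python
from collections import defaultdict


def accuracy_by_category(items, answers):
    """Return the fraction of correctly answered items per category.

    items   -- list of dicts with keys 'id', 'category', 'correct' (hypothetical schema)
    answers -- dict mapping item id to the option chosen by the model;
               a missing id counts as a wrong answer
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for item in items:
        totals[item["category"]] += 1
        if answers.get(item["id"]) == item["correct"]:
            hits[item["category"]] += 1
    return {category: hits[category] / totals[category] for category in totals}


# Placeholder items: ids and answer keys are illustrative, not real benchmark data.
items = [
    {"id": "q1", "category": "protocol semantics", "correct": "B"},
    {"id": "q2", "category": "protocol semantics", "correct": "A"},
    {"id": "q3", "category": "common vulnerabilities", "correct": "C"},
    {"id": "q4", "category": "defense concepts", "correct": "D"},
]
# Hypothetical model output; q4 was left unanswered.
model_answers = {"q1": "B", "q2": "C", "q3": "C"}

scores = accuracy_by_category(items, model_answers)
for category, score in scores.items():
    print(f"{category}: {score:.2f}")
```

Reporting per-category scores rather than a single aggregate makes it visible where a model's knowledge is uneven, for example strong on general vulnerabilities but weak on protocol semantics.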

Topic 3.2 – Solving Capture the Flag Challenges: Evaluating LLMs on OT-specific CTF challenges in IEC 61850-based digital substation environments. The goal is to systematically measure how well LLM agents can perform tasks such as network reconnaissance, IEC 61850 file server interaction, and traffic analysis.

Topic 3.3 – Vulnerability Discovery and Exploit Generation for RTUs: LLM agents are equipped with static and dynamic analysis tools to inspect firmware, configuration, and protocol implementations under realistic constraints. The aim is to assess the extent to which LLMs can autonomously identify OT-relevant targets and synthesize proof-of-concept exploits.

Topic 3.4 – Automated Threat Modeling: Automating and quantifying threat modeling for energy systems. The aim is to compare LLM-based threat modeling strategies with human expert approaches across key phases: constructing system architecture representations, mapping system components to threat categories, and deriving attack paths and mitigation strategies.
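For the phase that maps system components to threat categories, one possible way to compare an LLM-generated mapping with an expert mapping is set-based precision and recall, treating the expert result as the reference. This is a rough sketch under assumptions, not the evaluation method of this topic: the representation as (component, threat category) pairs, the component names, and the STRIDE-style category labels are all illustrative choices that do not come from the description above.

```python
def compare_mappings(llm_pairs, expert_pairs):
    """Compare two component-to-threat mappings given as (component, category) pairs.

    The expert mapping is treated as the reference. Returns precision
    (share of LLM pairs also found by the expert) and recall (share of
    expert pairs recovered by the LLM). Empty inputs yield 0.0.
    """
    llm_set, expert_set = set(llm_pairs), set(expert_pairs)
    agreed = llm_set & expert_set
    precision = len(agreed) / len(llm_set) if llm_set else 0.0
    recall = len(agreed) / len(expert_set) if expert_set else 0.0
    return {"precision": precision, "recall": recall, "agreed": agreed}


# Illustrative data only: components and categories are hypothetical.
expert = {
    ("RTU", "spoofing"),
    ("RTU", "tampering"),
    ("file server", "information disclosure"),
    ("station bus", "denial of service"),
}
llm = {
    ("RTU", "spoofing"),
    ("file server", "information disclosure"),
    ("file server", "tampering"),
}

result = compare_mappings(llm, expert)
print(f"precision={result['precision']:.2f} recall={result['recall']:.2f}")
```

Exact pair matching is a deliberately strict criterion; a real comparison would likely also need to handle differing component granularity and threats the expert missed but the model identified correctly.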