Research into advanced persistent threat (APT) detection approaches requires data from APT attacks for development, testing, and evaluation. Publicly available datasets are used for these purposes, but their suitability is disputed in the literature. Criticism focuses on the quality of the data and on the attacks represented, which often do not correspond to real APT attacks or lack characteristics typical of them, such as persistence or lateral movement. For this reason, researchers often create their own datasets to develop and evaluate their APT detection approaches, but these datasets are rarely published. As a result, evaluations are difficult to compare directly because they are based on different data. This has serious implications for scientific rigour in the field of APT detection.
This project aims to solve this problem. APT detection approaches will be reimplemented within a unified framework so that they can be evaluated on consistent, high-quality data from real APT attacks. The data will be provided in a standardized representation that forms the common basis for data processing and detection across the individual approaches. The dataset, the infrastructure, and the reimplementations are planned for release as open source.
In this way, the project will create a foundation for comparing different approaches to APT attack detection, thereby contributing to greater scientific rigour in this field of research.