Dataset synthesis to improve deep learning

A common drawback of deep learning algorithms is their need for large, annotated datasets for training and evaluation. In the case of 2D images, manual annotation is often merely a burdensome and time-consuming task. The annotation of higher-dimensional data such as 3D or 3D+t images, however, introduces new difficulties. Visualization and annotation are largely limited to 2D views, since renderings of large 3D structures with many instances are of little practical use. The additional dimensions thus lead to a severe increase in the amount of data to be annotated. Furthermore, the spatial recognition of objects is more difficult, and generating consistent object outlines across layers is not feasible. Manual annotation of higher-dimensional data is therefore a nearly impossible task.

The main goal of the project is to develop and improve methods for the generation of synthetic image data whose labels are known by design. Important aspects of this research are the inclusion of rare object states and structures and the consideration of biophysical forces. In addition, methods for evaluating the quality and physical plausibility of the synthetic data are developed.
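To illustrate the core idea of labels being known by design, the following is a minimal sketch for a simple 2D case: random circular objects are rendered into an image, and the instance label mask is written at the same time, so the ground truth is exact by construction. The function name, parameters, and circular object model are illustrative assumptions, not part of the project's actual pipeline, which targets far more complex 3D/3D+t structures and biophysical constraints.

```python
import numpy as np

def synthesize_cells(shape=(128, 128), n_cells=5, radius_range=(6, 12), rng=None):
    """Render random circular 'cells' into an image and build the
    matching instance label mask in the same pass, so annotations
    are exact by construction (illustrative toy example)."""
    rng = np.random.default_rng(rng)
    image = np.zeros(shape, dtype=np.float32)
    labels = np.zeros(shape, dtype=np.int32)   # 0 = background
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    for instance_id in range(1, n_cells + 1):
        cy, cx = rng.integers(0, shape[0]), rng.integers(0, shape[1])
        r = rng.integers(*radius_range)
        mask = (yy - cy) ** 2 + (xx - cx) ** 2 <= r ** 2
        image[mask] = rng.uniform(0.5, 1.0)    # synthetic intensity
        labels[mask] = instance_id             # ground truth for free
    return image, labels

image, labels = synthesize_cells(rng=0)
```

Because image and label mask come from the same generative process, no manual annotation step is needed; the same principle carries over to higher-dimensional data, where manual labeling is infeasible.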