Data Life Cycle Lab Energy
Data Life Cycle
In big data applications, the complete life cycle of the data plays an important role. Every step is associated with special data management challenges.
Time series are encountered particularly often in the energy sector. Acquiring measurement data, such as the voltage values of an electrical data recorder (EDR), or simulation results often yields very large data volumes, in some cases at high data rates.
Data transmission is characterized not only by data rates, but also by high security requirements. Measurement data and other data describing energy systems are frequently confidential. They represent personal data that are subject to strict data security requirements.
Data administration itself, comprising distributed storage, high-performance read and write access, need-based deletion, and long-term archiving of data, requires a careful definition of high-quality, meaningful metadata as well as their intelligent administration and use.
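How metadata can drive decisions such as need-based deletion or long-term archiving may be sketched as follows. This is a minimal illustration under stated assumptions, not the group's implementation: the record type `DatasetMetadata`, its fields (`source`, `unit`, `created`, `retention_days`, `archive`), and the function `lifecycle_action` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class DatasetMetadata:
    """Hypothetical metadata record for one stored time series."""
    source: str          # e.g. the recording device
    unit: str            # physical unit of the values
    created: datetime    # acquisition time (timezone-aware)
    retention_days: int  # how long the data stay in fast storage
    archive: bool        # True: archive after retention; False: delete


def lifecycle_action(meta: DatasetMetadata, now: datetime) -> str:
    """Return 'keep', 'archive', or 'delete' based only on the metadata."""
    expired = now - meta.created > timedelta(days=meta.retention_days)
    if not expired:
        return "keep"
    return "archive" if meta.archive else "delete"


if __name__ == "__main__":
    meta = DatasetMetadata(
        source="EDR-1",  # hypothetical device name
        unit="V",
        created=datetime(2020, 1, 1, tzinfo=timezone.utc),
        retention_days=30,
        archive=True,
    )
    print(lifecycle_action(meta, datetime(2020, 3, 1, tzinfo=timezone.utc)))
```

The point of the sketch is that the decision needs no access to the measurement values themselves. It only works if the metadata are complete and reliable, which is why their careful definition matters.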
Various types of data analysis need high-performance access to measurement data and other energy system data. Hence, they largely determine the requirements that data administration has to meet. In addition, data analysis yields results that also have to be managed intelligently.
Last but not least, access to data for publication, for use in education, or for the planning of new projects is another important factor that shapes data management. This is where the data life cycle closes and a new one starts, because new projects will in turn give rise to a new cascade of data acquisition.
Generic Data Services
The Data Management in Energy Informatics Group conceives and develops generic data services (GDS) that meet the above requirements of big data applications. Where possible, the services are designed to be suitable for applications beyond energy research as well.
Factors supporting the generic character of data management software are a clearly defined metadata concept, the possibility of standardized identification of data objects, the use of service technologies for a distributed system, and a systematic object-oriented program development process.
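One way to picture standardized identification of data objects is to give every object a globally unique identifier that does not depend on where or how the object is stored. The sketch below uses UUIDs from Python's standard library for this purpose. This is an assumption made for illustration: the source does not state which identifier scheme GDS use, and the names `DataObject`, `new_object_id`, and `is_valid_object_id` are hypothetical.

```python
import uuid
from dataclasses import dataclass, field


def new_object_id() -> str:
    """Create a random (version 4) UUID, formatted as a URN string."""
    return uuid.uuid4().urn  # looks like 'urn:uuid:...'


@dataclass
class DataObject:
    """Hypothetical data object: payload, metadata, and a stable identifier."""
    payload: bytes
    metadata: dict
    object_id: str = field(default_factory=new_object_id)


def is_valid_object_id(text: str) -> bool:
    """Check that a string parses as a UUID (the URN prefix is accepted)."""
    try:
        uuid.UUID(text)
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    obj = DataObject(payload=b"\x00\x01", metadata={"unit": "V"})
    print(obj.object_id, is_valid_object_id(obj.object_id))
```

Because the identifier carries no storage location, the same object can be moved between storage systems or referenced from a service interface without renaming it.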
For data storage, GDS use various storage systems chosen to match the purpose and requirements at hand, such as SQL databases, document- and graph-oriented databases, column-oriented database solutions, and various file-based storage systems.
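Using several storage systems behind generic services usually implies a common interface that hides the concrete backend. The following is a minimal sketch of that idea, not the GDS architecture itself: the interface `StorageBackend`, the two toy backends, and the helper `roundtrip` are hypothetical, and real backends such as SQL or graph databases are replaced here by an in-memory dictionary and JSON files.

```python
import json
from pathlib import Path
from typing import Protocol


class StorageBackend(Protocol):
    """Hypothetical common interface that every backend has to offer."""

    def put(self, object_id: str, record: dict) -> None: ...

    def get(self, object_id: str) -> dict: ...


class InMemoryBackend:
    """Toy backend that keeps records in a dictionary."""

    def __init__(self) -> None:
        self._records: dict[str, dict] = {}

    def put(self, object_id: str, record: dict) -> None:
        self._records[object_id] = dict(record)

    def get(self, object_id: str) -> dict:
        return dict(self._records[object_id])


class JsonFileBackend:
    """Toy file-based backend that writes one JSON file per object."""

    def __init__(self, root: Path) -> None:
        self._root = root
        self._root.mkdir(parents=True, exist_ok=True)

    def _path(self, object_id: str) -> Path:
        return self._root / f"{object_id}.json"

    def put(self, object_id: str, record: dict) -> None:
        self._path(object_id).write_text(json.dumps(record), encoding="utf-8")

    def get(self, object_id: str) -> dict:
        return json.loads(self._path(object_id).read_text(encoding="utf-8"))


def roundtrip(backend: StorageBackend, object_id: str, record: dict) -> dict:
    """Store a record and read it back through the common interface only."""
    backend.put(object_id, record)
    return backend.get(object_id)
```

Code written against `StorageBackend` does not change when a different storage system is chosen for a given purpose. Only the backend class passed in differs.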