AP9 Research Data Management

Lead: Universität Duisburg-Essen, Dr. Rehwald

A major obstacle to the further development and deployment of THz technologies is the lack of a common data foundation. Making research data available is therefore the central task of AP9. Establishing a research data space is also a key prerequisite for developing and participating in larger follow-on projects at the BMBF (Future Clusters), DFG (SFB/TRR and Clusters of Excellence), and EU levels. To this end, measurement data must be documented transparently and made available free of charge to researchers, including those outside the network, as part of the Open Science strategy. At the same time, research data is always an economic factor and important for long-term funding. Dual strategies must therefore be developed, for example through blockchain technologies or patent portfolios, that enable free access for the scientific community on the one hand and industrial exploitation by the network partners on the other. This leads to three pillars of the terahertz data space to be implemented:


Technical Infrastructure: The data space requires a technical infrastructure for the structured storage of diverse data types that combines archiving, regulated data exchange, and flexible data access through programming interfaces (APIs). Although these requirements are essentially universal and not tied to a single research discipline, no ready-made solutions currently exist; they are the subject of both local developments and the National Research Data Infrastructure (NFDI). The necessary components of a research data management (RDM; German: FDM) infrastructure are currently being set up and provided at RUB and UDE. Integrating these components into a connected data space and opening it to the community will be carried out as part of terahertz.NRW and can build on prototype work of the SFB MARIE involving applications such as Dataverse and Nextcloud.


Data Structure and Ontology: The terahertz data space will cover material characterizations, algorithms, and component designs and link them semantically. This requires a suitable data structure and ontology with an appropriate metadata schema. For later community-wide use of the THz data space, embedding the metadata in engineering standards is essential; the schema will therefore be developed in cooperation with the NFDI consortium NFDI4Ing. Starting from the described demonstrators, data collections and data structures will be developed that precisely define and store Research Objects for THz research, following the example of RO-Crate [SOI/2022].
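As an illustration of what such a Research Object could look like, the following minimal sketch assembles an RO-Crate-style metadata document for a single THz measurement. The function name build_crate, the file names, the dataset title, the date, and the license are hypothetical placeholders and not part of the project plan; only the overall layout (a metadata descriptor, a root dataset, and file entities in a JSON-LD graph) follows the RO-Crate 1.1 convention.

```python
import json


def build_crate(name, description, date_published, license_url, files):
    """Return a minimal RO-Crate 1.1 style metadata document as a dict.

    `files` is a list of (path, label, media_type) tuples; all values used
    below are illustrative placeholders.
    """
    # One contextual entity per data file in the crate.
    file_entities = [
        {"@id": path, "@type": "File", "name": label, "encodingFormat": fmt}
        for path, label, fmt in files
    ]
    # Root dataset: describes the Research Object as a whole.
    root = {
        "@id": "./",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "datePublished": date_published,
        "license": {"@id": license_url},
        "hasPart": [{"@id": entity["@id"]} for entity in file_entities],
    }
    # Metadata descriptor: points from the metadata file to the root dataset.
    descriptor = {
        "@id": "ro-crate-metadata.json",
        "@type": "CreativeWork",
        "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
        "about": {"@id": "./"},
    }
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [descriptor, root, *file_entities],
    }


crate = build_crate(
    name="THz transmission measurement (example)",
    description="Hypothetical material characterization data set.",
    date_published="2024-01-01",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    files=[
        ("data/transmission.csv", "Raw transmission spectrum", "text/csv"),
        ("analysis/fit.py", "Evaluation script", "text/x-python"),
    ],
)
crate_json = json.dumps(crate, indent=2)
```

Writing crate_json to a file named ro-crate-metadata.json next to the listed files would yield a self-describing package. A THz-specific ontology would add domain terms (for example frequency range or sample material) on top of this generic skeleton.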


Data Policy and Intellectual Property: Especially in component and chip design, patent and usage rights impose clear constraints on reuse scenarios and on access control in the data space. A detailed data policy is therefore necessary, as are new solutions for smart data formats that allow components and their characterizations to be cataloged while keeping patent-relevant information inaccessible and protected.
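One conceivable shape for such a smart data format is a catalog entry that separates openly visible metadata from protected fields. The sketch below is purely illustrative: the class CatalogEntry, the function view, the partner flag, and all field names and values are assumptions made for this example and do not describe an agreed design.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """A component record split into open and protected metadata."""
    identifier: str
    public: dict
    restricted: dict = field(default_factory=dict)


def view(entry, is_partner):
    """Return the metadata a requester is allowed to see.

    Network partners receive all fields. Everyone else receives the public
    fields plus the names (not the values) of the protected fields, so the
    component remains discoverable in the catalog.
    """
    record = {"id": entry.identifier, **entry.public}
    if is_partner:
        record.update(entry.restricted)
    else:
        record["restricted_fields"] = sorted(entry.restricted)
    return record


# Hypothetical example entry.
entry = CatalogEntry(
    identifier="thz-mixer-001",
    public={"name": "Example THz mixer", "center_frequency_ghz": 300},
    restricted={"layout_file": "mixer_v3.gds", "process_notes": "internal"},
)
open_view = view(entry, is_partner=False)
partner_view = view(entry, is_partner=True)
```

In a real data space the partner flag would be replaced by the roles and licenses defined in the data policy, and the protected fields would be stored and encrypted separately rather than filtered at read time.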