Contributions to data lake caching, management, and prototypes
Research conducted by DESY (German Electron Synchrotron).
As an associated partner, DESY has actively contributed to the following key areas of the project:
- Data lake caching
- Data and workflow management
- Data lake prototypes and Quality of Service (QoS)
Throughout the project, we drew on our experience in operating large-scale computing and storage resources, integrating heterogeneous resources, handling large data volumes, and setting up monitoring infrastructure.
Our specific contributions are as follows.
In data lake caching, we played a key role in integrating heterogeneous resources into High Energy Physics (HEP) production environments, which improved the use of available resources.
For data and workflow management in the data lake, we:
- Collaborated with CERN on evaluating Rucio for data management.
- Worked with Johannes Gutenberg University Mainz to explore alternative approaches such as consistent hashing.
- Partnered with PUNCH4NFDI to develop and test workflow management prototypes.
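For readers unfamiliar with consistent hashing, the idea can be illustrated with a minimal Python sketch. It is a generic textbook hash ring, not the method explored by DESY and Mainz; the class name HashRing, the node names, the file paths, and the replicas parameter are all hypothetical and chosen only for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 64-bit integer position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Illustrative consistent-hash ring with virtual nodes (hypothetical helper)."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas  # virtual points per node, smooths the load
        self._ring = []           # sorted list of (position, node)
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        # Place several virtual points for the node around the ring.
        for i in range(self.replicas):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        # Drop all virtual points that belong to the node.
        self._ring = [(p, n) for p, n in self._ring if n != node]

    def lookup(self, key: str) -> str:
        # A key is served by the first point at or after its own position,
        # wrapping around to the start of the ring if necessary.
        if not self._ring:
            raise LookupError("ring is empty")
        idx = bisect.bisect_left(self._ring, (_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]
```

The property that makes this attractive for caches is that removing a node only reassigns the keys that node held; all other keys stay where they were, so the remaining caches are not invalidated.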
For data lake prototypes and QoS, we:
- Contributed to the development of data lake prototypes in collaboration with PUNCH4NFDI.
- Focused on enhancing the interconnection between data and computing resources to improve efficiency and performance.
Future work
In FIDIUM 1.5, we will focus on improving caching systems and optimizing data management across heterogeneous resources. Key areas of work will include:
- Caching Systems for Opportunistic Resources. We plan to implement dynamic data caches based on dCache to make more efficient use of opportunistic resources.
- Cache-Aware Data Management. We will compare the features of XCache and dCache and simplify their installation.
- Federated dCache Instances. We will establish federated dCache systems between DESY and the University of Wuppertal, as well as between DESY NAF and PhysNet Hamburg, and measure and optimize their performance.
- Federated Computing Cluster. We aim to create a federated computing cluster between DESY NAF and PhysNet Hamburg using COBalD/TARDIS and to assess its performance.
- AUDITOR Test Instance. Finally, we will deploy a test instance of AUDITOR at DESY to further enhance our capabilities.