Towards Energy Efficiency of HPC Data Centers: A Data-Driven Analytical Visualization Dashboard Prototype Approach

De Chiara D.;
2025-01-01

Abstract

Energy consumption in high-performance computing (HPC) data centers continues to rise despite the urgent need for greater efficiency. In this study, we develop a digital-twin-inspired approach to energy and thermal management in an HPC facility. We build a framework around a digital twin of the CRESCO7 supercomputer cluster at ENEA in Italy, integrating data-driven time series forecasting with an interactive analytical dashboard for resource prediction. We first review the literature on digital twins and modern time series modeling techniques. After ingesting and cleansing sensor and job scheduling datasets, we carry out descriptive, exploratory, and inferential analyses to understand key correlations and identify important features. These features are used to train machine learning models that produce accurate short- and medium-term forecasts of power and temperature. The models feed a simulated environment that provides real-time prediction metrics and a holistic “health score” for each node, all visualized in a dashboard built with Streamlit. The results indicate that a digital twin-based approach can help data center operators plan resources and maintenance efficiently, ultimately reducing the carbon footprint and improving energy efficiency. The framework combines digital-twin-inspired concepts with time series machine learning and interactive visualization for HPC energy planning. Key contributions include the novel integration of predictive models, based on the gradient-boosted tree model LightGBM, into a live virtual replica of the HPC cluster. Our findings underscore the potential of data-driven digital twins to support sustainable and intelligent management of HPC data centers.
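
The abstract does not say how the sensor series are prepared for the tree-based model. One common way to frame time series forecasting for a gradient-boosted model such as LightGBM is to turn the series into rows of lagged readings. The following is a minimal sketch of that reframing, not the authors' pipeline: the function name `make_lag_features`, the lag count, the horizon, and the power readings are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def make_lag_features(
    series: Sequence[float], n_lags: int, horizon: int = 1
) -> Tuple[List[List[float]], List[float]]:
    """Turn a univariate series into (X, y) rows for supervised learning.

    Each row of X holds the n_lags most recent readings (oldest first);
    the matching y is the reading `horizon` steps after the last lag.
    """
    if n_lags < 1 or horizon < 1:
        raise ValueError("n_lags and horizon must be positive")
    X: List[List[float]] = []
    y: List[float] = []
    # t indexes the last observed reading of each window.
    for t in range(n_lags - 1, len(series) - horizon):
        X.append(list(series[t - n_lags + 1 : t + 1]))
        y.append(series[t + horizon])
    return X, y


# Hypothetical node power readings in watts (illustrative values only).
power_w = [310.0, 325.0, 330.0, 342.0, 338.0, 351.0]
X, y = make_lag_features(power_w, n_lags=3, horizon=1)
```

Under these assumptions, `X` and `y` could then be converted to arrays and passed to a regressor such as `lightgbm.LGBMRegressor`. Raising `horizon` would give the medium-term variant of the same setup.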
Year: 2025
Keywords: datacenter optimization; energy efficiency; high-performance computing; machine learning; predictive modeling; thermal management

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12079/88171