International Open Symposium: Benchmarking and software sustainability at the exascale era
The emphasis on benchmarks will cover several points of view on benchmarking. This symposium aims to share the different visions of benchmarking methods and purposes, in order to propose a common definition of benchmarking and to foster collaborative work across the whole HPC+ community.
Many software development efforts are funded, but very few have a strategy for long-term sustainability. Software sustainability will be addressed along three main axes: Performance portability – Long-term sustainability – Environmental sustainability. During the second day, these topics will be illustrated through examples and experiences from academia, industry and startups. This symposium aims to share visions for software sustainability and to propose some first thoughts on how to improve it.
SCIENTIFIC COORDINATION
France Boillod-Cerneux | Raluca Hodoroaba | Christophe Calvin | Michele De Lorenzi | Dr. Tjerk P. Straatsma | Rio Yokota
DATE & PLACE
26-27 September 2023
9:00 - 17:30
Hôtel Mercure Paris 19 Philharmonie La Villette (4-star)
216 Avenue Jean Jaures,
75019 PARIS, France
PROGRAM
Day 1 (26 September): Benchmarking at the exascale era
09:00 | Welcome & Opening | Dr. France Boillod-Cerneux & Dr. Christophe Calvin
09:10 | Keynote: HPC System Management and Application Software Initiatives in India under National Supercomputing Mission | Dr. Sanjay A. Wandhekar, C-DAC
Perspective from the Systems Community
10:00 | Benchmarking and co-design: examples from the DEEP and EPI projects | Dr. Estela Suarez
10:20 | Benchmarks for System Procurements | Dr. Andreas Herten, FZJ
10:45 | Dr. Sadaf Alam, Bristol University
11:10 | Break
11:30 | Dr. Miwako Tsuji, RIKEN
11:45 | Open discussion
Perspective from the Applications Community
12:00 | GronOR, a massively parallel and GPU accelerated program for Non-Orthogonal Configuration Interaction | Dr. Coen de Graaf, University Rovira i Virgili
12:25 | Working lunch
14:00 | Benchmarking electronic structure codes on their way to exascale | Dr. Andrea Ferretti, CNR
14:25 | Improving our understanding of the Sun: the role of simulations and their validation for studying turbulent magnetized flows | Dr. Allan Sacha Brun, CEA
Perspective from the Vendor Community
14:50 | Sustainability through accelerated datacenters, parallel language standards, community benchmarks, and regular software testing at scale | Dr. Jack Wells, NVIDIA
15:15 | HPC and AI benchmark workflow and objectives | Dr. Ludovic Enault, ATOS
15:40 | Perspectives on HPC Benchmarking | Dr. David Lecomber & Dr. Conrad Hillairet, ARM
16:05 | Challenges of performance modelling at scale | Dr. Fabrice Dupros, Intel
16:30 | Break
17:00 | Towards benchmarking metadata: White paper | Dr. France Boillod-Cerneux & Dr. Christophe Calvin
17:40 | Conclusion and Closing | Dr. France Boillod-Cerneux & Dr. Christophe Calvin
18:00 | Adjourn Day 1
18:45 | Dinner offered by AIDAS to all Open Symposium participants
Day 2 (27 September): Software sustainability at the exascale era
09:00 | Introduction | Christophe Calvin & France Boillod-Cerneux
09:10 | Keynote: Sustainable research ecosystems through Open Science | Dr. Marta Teperek, FAIR Data Programme Leader at NWO
10:10 | Break
Performance Portability
10:30 | Performance portability and software sustainability with Kokkos | D. Lebrun-Grandié / ORNL & Christian Trott / SNL
11:00 | CExA project: Towards a Middleware to operate GPUs in Exascale context (Mission and Use Cases) | F. Baligand / CEA & E. Audit / CEA
11:30 | Preparing BigDFT for Aurora | Christoph Bauinger / Intel
12:00 | Working Lunch
Long-term sustainability
13:00 | The NumPEx program, a major French contribution to the Exascale European roadmap | Jérôme Bobin / CEA
13:30 | E4S | Todd Gamblin / LLNL
14:00 | Towards a Software Pillar of Open Science, from policy to implementation | Roberto Di Cosmo / INRIA
14:30 | Break
Environmental sustainability
14:50 | Software for sustainability – Green IT and Sustainable Computing | Woong Shin / ORNL
15:20 | Energy aware numerical simulation | Simon McIntosh-Smith / University of Bristol
15:50 | Finding the path to optimize the energy criterion of HPC/AI applications | Hervé Mathieu / DENERGIUM
16:20 | Adjourn Day 2
Video, 26 September: Benchmarking at the exascale era
Video, 27 September: Software sustainability at the exascale era
Speakers Day 1
Benchmarking and co-design: examples from the DEEP and EPI projects | Dr. Estela Suarez
Benchmarking brings insight into the behaviour of application codes and their kernels on HPC systems. This is useful both to identify bottlenecks and optimise performance on the application side, and to understand how specific hardware features can impact (positively or negatively) the performance of those applications. It is therefore natural to use benchmarking as a vehicle for “co-design”. Since this, too, is a term with various definitions, we would like to specify here that we refer to co-design in the sense of studying the interaction between application code, system software, and hardware components, in search of the modifications at each of those three levels that would bring the best overall performance and energy efficiency. In this talk we will present experiences with the use of benchmarking for co-design purposes, gathered within the EU-funded DEEP and EPI projects. While in the former we looked at the system level, in the latter the focus is on the processor and even the core level. We will describe the differences between the two and the challenges (both technical and organisational) that we encountered.
Prof. Dr. Estela Suarez is head of the department “Novel System Architecture Design” at the Jülich Supercomputing Centre, which she joined in 2010. Since 2022 she has also been Associate Professor of High Performance Computing at the University of Bonn. Her research focuses on HPC system architecture and co-design. As leader of the DEEP project series she has driven the development of the Modular Supercomputing Architecture, including hardware, software and application implementation and validation. Additionally, since 2018 she has led the co-design efforts within the European Processor Initiative. She holds a PhD in Physics from the University of Geneva (Switzerland) and a Master's degree in Astrophysics from the Complutense University of Madrid (Spain).
Benchmarks for System Procurements | Dr. Andreas Herten
HPC machines are drivers of scientific discovery and host applications from various domains. To represent these workloads in procurements of new machines, benchmarks are a key tool to assess quality and level of support. Only with tight integration of application evaluation into the planning and acquisition process can the highest usability of the future system be ensured. This talk presents highlights of JSC’s current set of benchmarks and of the methodology used in procurements.
Head of Novel System Architecture Design Division and Accelerating Devices Lab at Jülich Supercomputing Centre
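To give a flavour of what such procurement benchmarks measure at the lowest level, here is a minimal, generic sketch (not taken from JSC's actual benchmark suite) of a STREAM-triad-style kernel, the classic probe of sustained memory bandwidth:

```cpp
#include <chrono>
#include <cstdio>
#include <vector>

int main() {
  // Three arrays of ~512 MiB each, large enough to defeat caches.
  const size_t n = 1 << 26;
  std::vector<double> a(n, 1.0), b(n, 2.0), c(n, 0.0);
  const double scalar = 3.0;

  auto t0 = std::chrono::steady_clock::now();
  for (size_t i = 0; i < n; ++i) c[i] = a[i] + scalar * b[i];  // triad
  auto t1 = std::chrono::steady_clock::now();

  // Traffic estimate: two reads and one write per element.
  double seconds = std::chrono::duration<double>(t1 - t0).count();
  double gib = 3.0 * n * sizeof(double) / (1024.0 * 1024.0 * 1024.0);
  std::printf("triad: %.3f s, ~%.2f GiB/s\n", seconds, gib / seconds);
}
```

A real procurement benchmark would repeat the kernel, pin threads and memory, and report the best of several runs; the point here is only the shape of the measurement.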
Benchmarking electronic structure codes on their way to exascale | Dr. Andrea Ferretti
Materials are crucial to science and technology, and are connected to major societal challenges ranging from energy and environment to information, communication, and manufacturing.
Electronic structure methods have become key to materials simulations, allowing scientists to study and design new materials before running actual experiments.
The MaX Centre of Excellence – Materials design at the eXascale – is devoted to enabling materials modelling on exascale-class HPC machines. MaX's action focuses on popular open-source community codes in the electronic structure field (Quantum ESPRESSO, Yambo, Siesta, Fleur, BigDFT).
In this talk I will discuss the main strategies and targets considered during the benchmarking campaigns carried out within MaX, giving specific examples related to the MaX flagship codes. In particular, I will focus on benchmarking on GPU-accelerated machines and will stress the relevance of using different metrics (including, e.g., energy-to-solution besides time-to-solution).
CNR, Istituto Nanoscienze, IT; MaX CoE
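As a back-of-the-envelope illustration of the two metrics mentioned in the abstract above (the numbers and the helper function are invented for the example; this is not MaX's actual methodology), energy-to-solution can be obtained by integrating a sampled power trace over the run:

```cpp
#include <cstdio>
#include <vector>

// Integrate a sampled power trace (watts) over time (seconds) with the
// trapezoidal rule, yielding energy-to-solution in joules.
double energy_to_solution(const std::vector<double>& t,
                          const std::vector<double>& p) {
  double joules = 0.0;
  for (size_t i = 1; i < t.size(); ++i)
    joules += 0.5 * (p[i] + p[i - 1]) * (t[i] - t[i - 1]);
  return joules;
}

int main() {
  // Hypothetical power trace of one benchmark run.
  std::vector<double> t = {0.0, 10.0, 20.0, 30.0};       // seconds
  std::vector<double> p = {250.0, 400.0, 390.0, 260.0};  // watts
  std::printf("time-to-solution:   %.1f s\n", t.back() - t.front());
  std::printf("energy-to-solution: %.1f J\n", energy_to_solution(t, p));
}
```

Two machines with similar time-to-solution can differ sharply in energy-to-solution, which is exactly why tracking both metrics matters.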
Improving our understanding of the Sun: the role of simulations and their validation for studying turbulent magnetized flows | Dr. Allan Sacha Brun
The Sun is a self-gravitating, rotating, turbulent and magnetized sphere of plasma that provides the heat making life possible on our planet Earth. Being able to model and understand our star is therefore fundamental, all the more so as Earth bathes in its hot expanding atmosphere, which can be subject to intense magnetic storms with detrimental consequences for our technological society (loss of communications or GPS signal, loss of satellites, to cite only a few).
Various types of numerical models have been developed over the years to capture the nonlinear magnetohydrodynamics of this sphere of churning hot plasma. We will present a reader's digest of such simulation initiatives, show how systematic inter-comparisons and international benchmarks have allowed the scientific community to improve our understanding of the magnetized Sun, and explain how we are preparing the future with the Dyablo multi-platform adaptive mesh refinement (AMR) framework developed at CEA.
Astrophysicist, Director of Research at CEA Paris-Saclay
Sustainability through accelerated datacenters, parallel language standards, community benchmarks, and regular software testing at scale | Dr. Jack Wells
In this talk, I offer selected comments on the general topic of the sustainability of the scientific computing ecosystem, with special emphasis on the energy efficiency of accelerated datacenters, the productivity of standard parallel languages and libraries, the evolution of community benchmarks for high-performance computers, and the benefits of continuous testing of software stacks composed of HPC system software from multiple vendors. My goal in making these comments is to advance community discussion and collaboration.
Jack Wells is Scientific Program Manager at NVIDIA, where he engages thought leaders across the scientific computing ecosystem, focusing on aligning the NVIDIA computing platform with stakeholder goals and mission. Jack joined NVIDIA in 2021 following 23 years at Oak Ridge National Laboratory where he was part of the team that delivered three world-leading supercomputers, serving as Director of Science for the Oak Ridge Leadership Computing Facility, a Department of Energy, Office of Science national user facility. He has authored or co-authored more than 100 scientific papers and edited two books. His research interests span nanoscience, materials science and engineering, nuclear and atomic physics, computational science, applied mathematics, and data analytics. Jack has a PhD in Physics from Vanderbilt University. He is serving as President of OpenACC, a consortium dedicated to the research community’s advancement by expanding accelerated and parallel computing capabilities.
Perspectives on HPC Benchmarking | Dr. David Lecomber
In this talk we explore what benchmarking means for Arm’s HPC group, where it can make a difference, and its challenges and opportunities.
Challenges of performance modelling at scale | Dr. Fabrice Dupros
One of the main topics in High-Performance Computing is the characterization and prediction of the performance of HPC applications on future platforms.
In this talk, we will discuss the major challenges, with a focus on performance exploration at scale.
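As a minimal illustration of why extrapolating performance to scale is delicate (a textbook model, not the methodology of the talk), even the simplest analytic model, Amdahl's law, shows how a tiny serial fraction dominates at large processor counts:

```cpp
#include <cstdio>

// Amdahl's law: p is the parallelizable fraction of the work,
// N the number of processors; returns the predicted speedup.
double amdahl_speedup(double p, double N) {
  return 1.0 / ((1.0 - p) + p / N);
}

int main() {
  // Even a 99%-parallel code saturates near 100x speedup, no matter
  // how many processors an exascale machine provides.
  for (double N : {1e2, 1e4, 1e6})
    std::printf("N = %.0e  ->  speedup %.1f\n", N, amdahl_speedup(0.99, N));
}
```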
Speakers Day 2
Sustainable research ecosystems through Open Science | Dr. Marta Teperek
In an era of exponential data storage needs, ever-growing computational demands, and the necessity for increasingly sophisticated software to process vast amounts of data, it is only natural that significant attention is given to sustainable research software and research data. However, to address the challenges of the digital era comprehensively, we must adopt a more holistic approach and strive for sustainable research ecosystems. These ecosystems encompass not only research data and software but also all research outputs, processes, infrastructures, and research careers. In my talk I will argue that to achieve this vision, we should embrace open science as a transformative force.
However, putting open science into practice is not easy and requires the active involvement of researchers, institutions, and funding agencies. It is essential to establish incentives and recognise open science practices, provide the necessary infrastructure, and foster a culture of collaboration and appreciation. In my talk I will provide some examples of concrete initiatives which foster the implementation of open science and lead to greater sustainability of the scientific endeavour as a whole.
Marta Teperek is the Programme Leader for FAIR Data in the Open Science NL team of the Dutch Research Council (NWO). She is a researcher by training and obtained her PhD in epigenetics and developmental biology from the University of Cambridge.
After completing her PhD, she played a leading role in establishing the research data support team at Cambridge. Subsequently, she moved to the Netherlands to become the Data Stewardship Coordinator at TU Delft, where she successfully built and led a team of disciplinary data stewards. Prior to joining NWO, she served as the director of 4TU.ResearchData and held the position of head of Research Data Services at TU Delft Library.
Besides her professional roles, Marta is actively involved in various national and international organisations. For instance, she played a key role in setting up the NWO’s thematic Digital Competence Centre for Natural and Engineering Sciences and served as the Co-Chair of the FAIR in Practice Task Force within the European Open Science Cloud FAIR working group.
Performance portability and software sustainability with Kokkos | D. Lebrun-Grandié - Christian Trott
We have entered the exascale era of high-performance computing (HPC), and with it comes the challenge of writing software that can achieve high performance on a wide variety of heterogeneous architectures. The Kokkos Ecosystem is a performance portability solution which addresses that challenge through a single source C++ programming model that follows modern software engineering practices. Accompanied by a suite of libraries for fundamental computational algorithms as well as tools for profiling and debugging, Kokkos is a productive framework to write sustainable software.
Today, Kokkos arguably provides the leading vendor-independent programming model for HPC exascale applications. Developed and maintained by a multi-institutional team of HPC experts, Kokkos is relied on by scientists and engineers from well over 100 institutions to help them leverage modern HPC systems.
This talk will provide an overview of Kokkos and its benefits for performance portability and software sustainability. We will also present the Kokkos team’s vision for a community-driven future of the project.
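To make the single-source idea concrete, here is a minimal, illustrative Kokkos kernel (a generic dot product, not code from the Kokkos project or any application mentioned above); the same C++ source is compiled to CUDA, HIP, SYCL, OpenMP or serial back-ends, selected when Kokkos is built:

```cpp
#include <Kokkos_Core.hpp>
#include <cstdio>

int main(int argc, char* argv[]) {
  Kokkos::initialize(argc, argv);
  {
    const int n = 1 << 20;
    // Views allocate in the memory space of the chosen back-end.
    Kokkos::View<double*> x("x", n), y("y", n);
    Kokkos::deep_copy(x, 1.0);
    Kokkos::deep_copy(y, 2.0);

    // One source for all architectures: a parallel reduction.
    double dot = 0.0;
    Kokkos::parallel_reduce(
        "dot", n,
        KOKKOS_LAMBDA(const int i, double& sum) { sum += x(i) * y(i); },
        dot);
    std::printf("dot = %.1f\n", dot);  // expect 2 * n
  }
  Kokkos::finalize();
}
```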
Damien Lebrun-Grandié is a Computational Scientist at Oak Ridge National Laboratory. His research focuses on the development of algorithms and enabling technologies for the solution of large-scale complex engineering and scientific problems. He is a high-performance computing expert and a member of the ISO C++ Standards Committee. He co-leads the Kokkos performance portability library team. He holds an MEng in Applied Physics from Grenoble INP in France and an MSc in Physics from the Karlsruhe Institute of Technology in Germany. He completed his PhD in Nuclear Engineering at Texas A&M University.
Christian Trott is a high performance computing expert with extensive experience in designing and implementing software for modern HPC systems. He is a principal member of staff at Sandia National Laboratories, where he co-leads the Kokkos core team developing performance portability solutions for engineering and science applications. Christian is also the head of Sandia’s delegation to the ISO C++ standards committee, and a principal author of C++ standard features such as mdspan and linear algebra support. He also serves as adviser to numerous application teams, helping them redesign their codes using Kokkos and achieve performance portability for the next generation of supercomputers.
In the past, Christian contributed significantly to numerous scientific software projects such as Trilinos and LAMMPS. He earned a doctorate in theoretical physics from the University of Technology Ilmenau, with a focus on computational materials research.
CExA project: Towards a Middleware to operate GPUs in Exascale context (Mission and Use Cases) | F. Baligand - E. Audit
The CExA project aims to facilitate the CEA’s transition to Exascale computing and GPU-based calculations through an Exascale Computing software catalyst. This middleware leverages the computing power of Exaflop machines, ensures performance portability, and incorporates proven technologies such as the open-source Kokkos library. Diverse demonstrators from different CEA departments are showcased during the implementation process to demonstrate the software catalyst’s capabilities.
Fabien Baligand is a researcher in computer science at CEA – Technological Research Division. Fabien has worked in Cloud Computing in diverse organizations such as Microsoft, Thales and startups. He holds a PhD in the field of Service-Oriented Computing.
Edouard Audit studied physics at the Ecole Normale Supérieure de Paris-Saclay and received a PhD in theoretical physics from University Paris 7 in 1997. He worked at the Paris Observatory in the field of numerical cosmology and large-scale structure formation. He then joined CEA, where he has been developing massively parallel codes to study star formation, the interstellar medium and laser-generated plasmas. In 2010, he was appointed founding director of Maison de la Simulation (MdS), a joint laboratory between CEA, CNRS and the universities of Paris-Saclay and Versailles-St Quentin, whose mission is to foster the usage of HPC for scientific discovery. Edouard Audit is also a professor at INSTN, where he teaches computational astrophysics and parallel computing. He is the coordinator of the EU-funded Energy Oriented Centre of Excellence (EoCoE) and has more than 80 publications in peer-reviewed journals.
Preparing BigDFT for Aurora | Christoph Bauinger
To prepare the BigDFT code – a highly precise wavelet-based implementation of Kohn-Sham DFT – for the nascent Aurora system, we ported it to SYCL. More precisely, we developed a SYCL code for an efficient application of the Fock operator, a highly accurate computational method for electronic structure calculations. In this session, we give an overview of SYCL and present our approach to porting the CUDA code in BigDFT to SYCL. In addition, we introduce Intel oneAPI and the novel Intel Data Center GPU Max 1550, the device targeted by our approach. Finally, we demonstrate the performance of our SYCL implementation of the Fock operator application on the Intel Data Center GPU Max 1550 as well as on 4th Generation Intel Xeon processors.
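For readers unfamiliar with SYCL, the sketch below shows the general shape of such a port (a generic AXPY kernel using SYCL 2020 unified shared memory, not the actual BigDFT Fock operator code): the lambda passed to parallel_for plays the role of a CUDA kernel.

```cpp
#include <sycl/sycl.hpp>

int main() {
  const size_t n = 1 << 20;
  sycl::queue q;  // selects a default device, e.g. a GPU when available

  // Unified shared memory: accessible from both host and device.
  double* x = sycl::malloc_shared<double>(n, q);
  double* y = sycl::malloc_shared<double>(n, q);
  for (size_t i = 0; i < n; ++i) { x[i] = 1.0; y[i] = 2.0; }

  const double a = 2.0;
  // One work-item per element, analogous to a CUDA kernel launch.
  q.parallel_for(sycl::range<1>(n), [=](sycl::id<1> i) {
    y[i] = a * x[i] + y[i];
  }).wait();

  sycl::free(x, q);
  sycl::free(y, q);
}
```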
Christoph Bauinger is an Application Engineer at Intel, where he optimizes CFD, ML, DFT, and various other scientific workloads for Intel accelerators. Prior to joining Intel in September 2022, he was awarded his PhD in Applied Mathematics by the California Institute of Technology for developing a novel algorithm for the solution of scattering problems, the IFGF method.
The NumPEx program, a major French contribution to the Exascale European roadmap | Jérôme Bobin
The HPC, HPDA and AI ecosystem is undergoing a major technological and usage shift on its road to the Exascale, which mandates a radical re-design of the software stack and the application codes. In this context, the French Exascale program NumPEx, a 6-year national project, aims at designing and developing the software components that will equip future exascale machines. It will contribute to bridging the gap between cutting-edge software developments and application domains, preparing the major scientific and industrial application codes to fully exploit the capabilities of these machines. One of the major deliverables of NumPEx will be an Exascale-grade software stack. During this presentation, we will present the organisation of the NumPEx program, with a particular focus on software production and the long-term sustainability of what NumPEx produces.
Jérôme Bobin received his Ph.D. in Computer Science from Paris-Sud University (now Paris-Saclay University), France, in 2008. From 2008 to 2010, he was a postdoctoral fellow in the applied mathematics department at the California Institute of Technology, and then in the mathematics department at Stanford University. Since 2010, he has been a researcher at CEA/IRFU. From 2010 to 2021, he was a member of the CosmoStat laboratory, and its co-lead from 2014 to 2021. Since 2021, he has been research director in LILAS (Laboratoire d’Ingénierie Logicielle pour les Applications Scientifiques) at IRFU. From 2012 to 2014, he was the principal investigator of the ANR project MultID, in collaboration with SAGEM. From 2016 to 2021, he was the principal investigator of the LENA project, funded by the ERC (Starting Grant programme) and devoted to the development of machine learning solutions for signal processing, with applications in astrophysics. In 2021, he received, with his collaborators, the « Prix de la Recherche LNE » for his work on data analysis applied to nuclear physics. Since 2022, he has been co-director of NumPEx, a 6-year, 40 M€ project focusing on high-performance numerics for the Exascale.
His research interests are in signal and image processing, machine learning, statistics, massive data analysis, and their applications in physics, with a special emphasis on astrophysics and nuclear physics. From 2006 to 2015, he was a member of the ESA/Planck consortium, where he actively participated in the analysis and exploitation of the Planck data. He is a member of the ESA/LISA consortium and participates actively in the ESA/Athena space mission.
Towards a Software Pillar of Open Science, from policy to implementation | Roberto Di Cosmo
Open Science is a tidal wave that will have a deep and long lasting impact on the way we conduct, share and evaluate research. Awareness is rising about the fact that open source software has a key role to play, on par with open access to research articles and data, and that it comes with specific challenges. In this talk we will survey recent policy news, report on ongoing national and international efforts, and explore in depth the approach taken by the Software Heritage initiative to preserve, reference and share all the publicly available source code, providing immediately actionable means to support the Software Pillar of Open Science.
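One of those actionable means is the SWHID, an intrinsic identifier computed from the content itself, which lets a paper or dataset pin the exact source code it depends on. A sketch of the core syntax (the hash below is illustrative, not a reference to a real archived object):

```
# Core SWHID syntax: swh:<schema version>:<object type>:<hex digest>
# Object types: snp (snapshot), rel (release), rev (revision),
#               dir (directory), cnt (file content)
swh:1:dir:94a9ed024d3859793618152ea559a168bbcbb5e2
```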
An alumnus of the Scuola Normale Superiore di Pisa, with a PhD in Computer Science from the University of Pisa, Roberto Di Cosmo was an associate professor for almost a decade at the Ecole Normale Supérieure in Paris. In 1999, he became a full professor of Computer Science at University Paris Diderot, where he was head of doctoral studies for Computer Science from 2004 to 2009. President of the board of trustees and of the scientific advisory board of the IMDEA Software Institute, and chair of the Software chapter of the National Committee for Open Science in France, he is currently on leave at Inria.
His research activity spans theoretical computing, functional programming, parallel and distributed programming, the semantics of programming languages, type systems, rewriting and linear logic, and, more recently, the new scientific problems posed by the general adoption of Free Software, with a particular focus on static analysis of large software collections. He has published over 20 international journal articles and 50 international conference articles.
In 2008, he created and coordinated the European research project Mancoosi, which had a budget of €4.4M and brought together 10 partners to improve the quality of package-based open source software systems.
Following with great interest the evolution of our society under the impact of IT, he is a long-term Free Software advocate, contributing to its adoption since 1998 with the best-seller Hijacking the World, seminars, articles and software. In October 2007 he created the Free Software thematic group of the Systematic cluster, which has helped fund over 50 Open Source research and development collaborative projects with a consolidated budget of over €200M. From 2010 to 2018, he was director of IRILL, a research structure dedicated to Free and Open Source Software quality.
In 2015 he created, and now directs, Software Heritage, an initiative to build the universal archive of all publicly available source code, in partnership with UNESCO.
Finding the path to optimize the energy criterion of HPC/AI applications | Hervé Mathieu
We know that the digital sector must integrate energy criteria into its approach in order to limit its cost, limit its environmental impact and improve its image. This applies in particular to the HPC/AI field, which must embrace the Energy-to-Solution paradigm. This new optimization paradigm requires, first and foremost, simple and reliable acquisition of energy data, which is what DENERGIUM's solutions provide today. We will conclude with DENERGIUM's vision for energy optimization in the years to come.
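As one common, vendor-neutral illustration of such energy data acquisition (explicitly not a description of DENERGIUM's products), the Linux powercap/RAPL interface exposes a cumulative energy counter that can be read around a measured region; the sysfs path varies between machines and usually requires adjusted permissions:

```cpp
#include <chrono>
#include <fstream>
#include <iostream>

// Read the cumulative package energy counter in microjoules from the
// Linux powercap (RAPL) interface. Path and availability are
// machine-dependent; this is one common location.
long long read_energy_uj() {
  std::ifstream f("/sys/class/powercap/intel-rapl:0/energy_uj");
  long long uj = 0;
  f >> uj;
  return uj;
}

int main() {
  long long e0 = read_energy_uj();
  auto t0 = std::chrono::steady_clock::now();

  // ... run the HPC/AI region being measured here ...

  auto t1 = std::chrono::steady_clock::now();
  long long e1 = read_energy_uj();

  // Counter wrap-around is ignored for brevity.
  double joules = (e1 - e0) * 1e-6;
  double seconds = std::chrono::duration<double>(t1 - t0).count();
  std::cout << "energy: " << joules << " J over " << seconds << " s\n";
}
```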