Workshop Program 2023
Technical Program [Friday, May 19, 8:45 AM - 4:40 PM]
Workshop Opening [8:45 AM - 8:50 AM]
- Workshop Opening and Welcome (5 minutes)
Dalibor Klusáček
Keynote Lecture [8:50 AM - 10:00 AM]
- Architecture of the Slurm Scheduler
Morris Jette (SchedMD LLC, USA).
Abstract: Slurm is an open source, fault-tolerant, and highly scalable workload manager used on many of the world's supercomputers and computer clusters. As a cluster workload manager, Slurm has three key functions. First, it allocates exclusive and/or non-exclusive access to resources for some duration of time. Second, it provides a framework for starting, executing, and monitoring work on the allocated resources. Finally, it arbitrates contention for resources by managing queues of pending work and enforcing administrative policies. This keynote describes the current design and capabilities of Slurm.
Biography: Morris "Moe" Jette is a lead architect of the Slurm Workload Manager, which provides resource management on about 60% of the Top 500 systems. After a 30-year career at Lawrence Livermore National Laboratory focused primarily on system software and distributed computing, he co-founded SchedMD LLC in 2010 to further develop Slurm.
Coffee Break & Discussion [10:00 AM - 10:30 AM]
Technical Papers (session 1) [10:30 AM - 12:00 PM]
- Asynchronous Execution of Heterogeneous Tasks in ML-driven HPC Workflows.
Vincent R Pascuzzi, Ozgur Ozan Kilic, Matteo Turilli and Shantenu Jha
- Memory-Aware Latency Prediction Model for Concurrent Kernels in Partitionable GPUs: Simulations and Experiments.
Alessio Masola, Nicola Capodieci, Roberto Cavicchioli, Benjamin Rouxel and Ignacio Sanudo Olmedo
- Stragglers in Distributed Matrix Multiplication.
Roy Nissim and Oded Schwartz
Lunch Break (on your own) [12:00 PM - 1:30 PM]
Technical Papers (session 2) [1:30 PM - 3:00 PM]
- Optimization Metrics for the Evaluation of Batch Schedulers in HPC.
Robin Boëzennec, Fanny Dufossé and Guillaume Pallez
- An experimental analysis of regression-obtained HPC scheduling heuristics.
Lucas Rosa, Danilo Carastan-Santos and Alfredo Goldman
- An efficient approach based on graph neural networks for predicting wait time in job schedulers.
Tomoe Kishimoto and Tomoaki Nakamura
Coffee Break & Discussion [3:00 PM - 3:30 PM]
Technical Papers (session 3) [3:30 PM - 4:30 PM]
- Evaluating the Potential of Coscheduling on High-Performance Computing Systems.
Jason Hall, Arjun Lathi, David Lowenthal and Tapasya Patki
- Scale Optimal Allocation of Cloud Resources.
Luis de la Torre and Mahantesh Halappanavar
Workshop Closing [4:30 PM - 4:40 PM]
- Workshop Closing and Proceedings Preparation Information (10 minutes)
Dalibor Klusáček
JSSPP 1995-2024 – the Workshop on Job Scheduling Strategies for Parallel Processing. Contact email: jssppw@gmail.com