Scheduling Beyond CPUs

Release Time: 2019-05-13

Speaker:    Prof. Zhiling Lan

Time:        10:00-11:00, May 14

Location:    SIST 1A 502

Host:         Prof. Shu Yin

Abstract:

Massively parallel computing is undergoing significant changes. Emerging applications include both compute-intensive and data-intensive workloads. To meet the intense I/O demand from big data applications, burst buffers are deployed in production systems. Existing cluster schedulers, however, are mainly CPU-centric. The extreme heterogeneity of hardware devices, combined with workload changes, forces schedulers to consider multiple resources beyond CPUs (e.g., burst buffers) in decision making. In this talk, I will present a multi-resource scheduling scheme named BBSched that schedules user jobs based not only on their CPU requirements, but also on other schedulable resources such as burst buffers. BBSched formulates scheduling as a multi-objective optimization problem and solves it rapidly using a multi-objective genetic algorithm. The multiple solutions generated by BBSched enable system managers to explore potential tradeoffs among the resources and thereby obtain better utilization of all of them. Trace-driven simulations with real system workloads demonstrate that BBSched improves scheduling performance by up to 41% compared to existing methods, indicating that explicitly optimizing multiple resources beyond CPUs is essential for cluster scheduling.
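The multi-objective idea in the abstract can be illustrated with a small sketch. This is not BBSched itself: for clarity it enumerates every feasible job selection by brute force instead of using a genetic algorithm, and the job queue, the capacities, and the names feasible_selections and pareto_front are all hypothetical. It shows only how several non-dominated selections can expose a tradeoff between CPU and burst buffer utilization.

```python
from itertools import combinations

# Hypothetical queue: (job name, CPU nodes requested, burst buffer GB requested).
JOBS = [("A", 70, 10), ("B", 30, 20), ("C", 50, 90), ("D", 20, 0)]
TOTAL_NODES = 100   # assumed number of free CPU nodes
TOTAL_BB_GB = 100   # assumed free burst buffer capacity in GB


def feasible_selections(jobs, nodes, bb_gb):
    """Yield (names, cpu_util, bb_util) for every job subset that fits."""
    for r in range(1, len(jobs) + 1):
        for subset in combinations(jobs, r):
            cpu = sum(j[1] for j in subset)
            bb = sum(j[2] for j in subset)
            if cpu <= nodes and bb <= bb_gb:
                yield tuple(j[0] for j in subset), cpu / nodes, bb / bb_gb


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates.

    A candidate is dominated if another one is at least as good on both
    objectives and strictly better on at least one.
    """
    cands = list(candidates)
    front = []
    for names, cpu, bb in cands:
        dominated = any(
            c2 >= cpu and b2 >= bb and (c2 > cpu or b2 > bb)
            for _, c2, b2 in cands
        )
        if not dominated:
            front.append((names, cpu, bb))
    return front


front = pareto_front(feasible_selections(JOBS, TOTAL_NODES, TOTAL_BB_GB))
for names, cpu, bb in front:
    print(names, f"cpu={cpu:.0%}", f"bb={bb:.0%}")
```

With these made-up numbers, two selections survive: jobs A and B fill all CPU nodes but use little burst buffer, while jobs C and D use most of the burst buffer at lower CPU utilization. A system manager would pick between such points according to site policy. Brute-force enumeration grows exponentially with queue length, which is one reason a heuristic search such as a genetic algorithm is attractive for realistic queues.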

Bio:

Zhiling Lan received her PhD degree in Computer Engineering from Northwestern University in 2002. She then joined the faculty of Illinois Institute of Technology, where she is currently a Professor in the Department of Computer Science. She is also a guest research faculty member at Argonne National Laboratory. Her research interests are in high-performance computing, with particular emphasis on fault tolerance, power efficiency, resource management and job scheduling, and performance analysis and modeling. She has co-authored ~100 publications in these areas. She received Best Paper Awards at IPDPS 2010 and Cluster 2014. In 2015, she received the Dean's Excellence Award in Research/Scholarship (Senior Category) from the College of Science at the Illinois Institute of Technology. She has served on the Technical Program Committees (TPCs) of over 80 international conferences and workshops, and has served on the Editorial Board of IEEE Transactions on Parallel and Distributed Systems since 2014.

SIST-Seminar 18152