Learning Approaches to Dynamic Workflow Scheduling Based on Genetic Programming and Deep Reinforcement Learning
Dynamic workflow scheduling (DWS) in cloud computing is a critical yet challenging problem that involves assigning numerous workflow tasks to heterogeneous virtual machines under dynamic conditions to optimize cost or makespan. Its complexity stems from unpredictable workflow arrivals and patterns, heterogeneous cloud resources, and rapidly evolving workflow and resource states caused by real-time allocation decisions. Among existing approaches, Priority Dispatching Rules (PDRs) are widely adopted for their intuitiveness, real-time efficiency, and ease of implementation. However, manually designing effective PDRs is time-consuming, requires substantial domain expertise, and yields fixed rule structures that cannot adapt to changes in dynamic environments. To address these limitations, this thesis develops advanced Genetic Programming Hyper-Heuristic (GPHH) and Deep Reinforcement Learning (DRL) approaches to automatically generate effective, generalizable, and adaptive PDRs for two practically important DWS problems, i.e., Deadline-Constrained Dynamic Workflow Scheduling in Cloud and Makespan-Aware Dynamic Workflow Scheduling in Cloud. Tree-based PDRs evolved by GPHH have proven effective and interpretable in offline settings, avoiding the tedium of manual rule design, yet there is still room for improvement in effectiveness and generalization. Neural network-based PDRs learned by DRL can address the challenge of adapting to rapidly changing environments in online settings, which tree-based PDRs, being sensitive to structural adjustments, struggle to handle; however, the potential of neural network-based PDRs for online DWS remains unexplored in the literature. This thesis pursues three key objectives: (1) evolving effective tree-based PDRs in offline settings through GPHH powered by a novel adaptive mutation operator; (2) improving GPHH's generalization in offline settings through training instance sampling strategies; and (3) enabling adaptation of neural network-based PDRs in online settings via a novel offline-online DRL method.

First, we propose an adaptive mutation operator for GPHH with dual-tree representations that adaptively adjusts the selection probabilities of trees as well as terminals, thereby improving the quality of the evolved scheduling heuristics, i.e., GP tree-based PDRs. Extensive experiments and theoretical analysis demonstrate that the probability adjustment techniques in the proposed adaptive mutation operator enable GPHH to evolve higher-performing scheduling heuristics than baseline methods.
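To make the idea concrete, the following minimal Python sketch shows how a mutation operator for a dual-tree GP individual might maintain and update selection probabilities over trees and terminals. The tree names, terminal set, and additive update rule are illustrative assumptions, not the exact design proposed in the thesis.

```python
import random


class AdaptiveMutation:
    """Illustrative adaptive mutation for a dual-tree GP individual.

    Hypothetical sketch: the tree/terminal names and the additive
    probability-update rule are assumptions for illustration only.
    """

    def __init__(self, terminals, learning_rate=0.1):
        # One probability per tree (e.g., task-selection vs. VM-selection tree)
        # and one per terminal used to build mutated subtrees.
        self.tree_probs = {"task_tree": 0.5, "vm_tree": 0.5}
        self.terminal_probs = {t: 1.0 / len(terminals) for t in terminals}
        self.lr = learning_rate

    def _sample(self, probs):
        names, weights = zip(*probs.items())
        return random.choices(names, weights=weights, k=1)[0]

    def choose(self):
        """Pick which tree to mutate and which terminal to insert."""
        return self._sample(self.tree_probs), self._sample(self.terminal_probs)

    def update(self, tree, terminal, improved):
        """Reinforce choices that produced fitter offspring, then renormalise."""
        delta = self.lr if improved else -self.lr
        self.tree_probs[tree] = max(0.05, self.tree_probs[tree] + delta)
        self.terminal_probs[terminal] = max(0.01, self.terminal_probs[terminal] + delta)
        for probs in (self.tree_probs, self.terminal_probs):
            total = sum(probs.values())
            for key in probs:
                probs[key] /= total


# Usage: after each mutation, report whether the child's fitness improved.
op = AdaptiveMutation(terminals=["task_size", "vm_speed", "deadline_slack", "queue_wait"])
tree, term = op.choose()
op.update(tree, term, improved=True)
```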
Second, we conduct a comprehensive investigation of training instance sampling strategies to enhance the generalization ability of the GPHH algorithm. To facilitate this investigation, we also propose a novel GPHH framework that incorporates configurable sampling strategies. Through a systematic comparison of rotation, mini-batch, and hybrid sampling strategies, our study reveals that mini-batch and the proposed hybrid sampling strategies improve GPHH's generalization to unseen problem instances across various scenarios.
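The three families of sampling strategies can be illustrated as follows. This sketch assumes a generation-indexed sampler interface and a particular composition for the hybrid strategy; both are simplifications rather than the exact framework proposed in this thesis.

```python
import random


def rotation_sampler(instances):
    """Rotation: cycle through one training instance per generation."""
    def sample(generation):
        return [instances[generation % len(instances)]]
    return sample


def minibatch_sampler(instances, batch_size=3, seed=0):
    """Mini-batch: draw a fresh random subset of instances each generation."""
    rng = random.Random(seed)
    def sample(generation):
        return rng.sample(instances, batch_size)
    return sample


def hybrid_sampler(instances, batch_size=3, seed=0):
    """Hybrid (assumed form): one rotated instance plus a random mini-batch."""
    rotate = rotation_sampler(instances)
    batch = minibatch_sampler(instances, batch_size - 1, seed)
    def sample(generation):
        return rotate(generation) + batch(generation)
    return sample


# Usage: the GP loop evaluates every individual on sampler(generation).
instances = [f"workflow_instance_{i}" for i in range(10)]
sampler = hybrid_sampler(instances)
for gen in range(3):
    print(gen, sampler(gen))
```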
Third, we develop a novel DRL approach to learning neural network-based PDRs, featuring innovations in task-specific and system-oriented graph representations, neural network architectures for the actor and critic, and an offline-online learning framework. By integrating offline pre-training with online fine-tuning, the approach optimizes graph neural network parameters for real-time decision-making and continuous adaptation to future dynamic environments. Experimental results validate its adaptability to various offline and online scenarios, the effectiveness of its architecture design, and its superior performance in terms of adaptability, transferability, and extensibility.
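The offline-online learning framework can be sketched as an actor-critic loop that is first pre-trained on simulated or historical traces and then fine-tuned on transitions observed at deployment time. The sketch below uses plain MLPs, random placeholder batches, and an assumed lower online learning rate in place of the thesis' graph neural networks and real workflow data, so every component here is an illustrative assumption.

```python
import torch
import torch.nn as nn


class Actor(nn.Module):
    """Maps a scheduling state to a distribution over candidate VMs."""
    def __init__(self, state_dim, n_vms):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, n_vms))

    def forward(self, state):
        return torch.distributions.Categorical(logits=self.net(state))


class Critic(nn.Module):
    """Estimates the value of a scheduling state."""
    def __init__(self, state_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, state):
        return self.net(state).squeeze(-1)


def train_step(actor, critic, opt, states, actions, returns):
    """One actor-critic update, shared by the offline and online phases."""
    dist, values = actor(states), critic(states)
    advantage = returns - values.detach()
    loss = -(dist.log_prob(actions) * advantage).mean() + (returns - values).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()


state_dim, n_vms = 8, 4
actor, critic = Actor(state_dim, n_vms), Critic(state_dim)
opt = torch.optim.Adam(list(actor.parameters()) + list(critic.parameters()), lr=1e-3)

# Offline phase: pre-train on batches drawn from historical/simulated traces
# (random tensors stand in for real data here).
states = torch.randn(32, state_dim)
actions = torch.randint(0, n_vms, (32,))
returns = torch.randn(32)
train_step(actor, critic, opt, states, actions, returns)

# Online phase: continue updating on freshly observed transitions,
# with a gentler learning rate so deployment behaviour stays stable.
for group in opt.param_groups:
    group["lr"] = 1e-4
train_step(actor, critic, opt, states, actions, returns)
```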
Overall, the methodologies and findings presented in this thesis advance the state of the art in DWS under both offline and online settings and offer practical value for cloud computing environments. The developed approaches deliver significant improvements in scheduling effectiveness, generalization ability, and adaptation to dynamic scenarios, providing valuable insights for both academic research and industrial applications.