Distributed Data-intensive Web Service Composition
Service-oriented architecture (SOA) encourages the creation of modular applications that use Web services as reusable components. Data-intensive Web services have emerged to manipulate and manage the massive volumes of data generated by technological advances and their applications. Distributed Data-intensive Web Service Composition (DWSC) is a core part of SOA: it selects data-intensive Web services from diverse locations on the network and composes them to accomplish a complicated task. A fundamental challenge for service developers is that service compositions must fulfil functional requirements and optimise Quality of Service (QoS) simultaneously. The QoS of a distributed DWSC is affected not only by the QoS of its component services and by how the composition is generated, but also by the locations of services and the data transformation between services. However, existing works often neglect the impact of locations and data on service composition, and the distributed DWSC problem has not been sufficiently studied in the literature.
In this thesis, we first define the single-objective distributed DWSC problem, which includes communication (e.g. bandwidth), Web service (execution time) and data (data cost) attributes. To this end, we consider the bandwidth of communication links, obtained using the location information of services. Based on this problem formulation, we then address the distributed DWSC problem by developing approaches based on Evolutionary Computation (EC). These EC-based approaches are designed to incorporate domain knowledge in order to solve the distributed DWSC problem effectively. Afterwards, we study the multi-objective distributed DWSC problem to satisfy different QoS requirements. In particular, the QoS-constrained distributed DWSC problem and user preferences are considered. To find trade-off solutions for these problems, new Multi-objective Evolutionary Algorithms (MOEAs) are proposed based on the Non-dominated Sorting Genetic Algorithm II (NSGA-II).
Furthermore, a new problem formulation is proposed for the dynamic distributed DWSC (D2-DWSC) problem with bandwidth fluctuations, and an EC-based approach is developed to solve it. Finally, extensive empirical evaluations demonstrate the effectiveness of the proposed methods in finding composite services with good QoS.