Bookshelf
Dynamic OpenCL - Distributed Computing on Cloud Scale
Master's thesis by Florian Rösler, Hasso-Plattner-Institut at the University of Potsdam | April 20, 2017

Creating parallelized software in order to reduce execution times is one of the main challenges in computer science. Writing parallel programs requires programmers to acquire extensive low-level knowledge of the hardware specifications and of the standards of the corresponding programming models. When multi-core programs are extended to harness multiple machines of a cluster, another layer of indirection is added, which significantly increases code complexity.

The goal of this thesis is to create a parallel computation framework that allows workloads to be distributed among central processing units and graphics processing units within a cluster without deep knowledge of the underlying technologies. The framework is therefore expected to provide a high-level application programming interface that is usable by novice programmers. Additionally, it should be possible to adjust computational resources dynamically as cluster utilization changes.

Various parallelization methods, such as OpenMP, Message Passing Interface and OpenCL, are illustrated and discussed with regard to their ability to contribute to the framework. Because it makes it possible to produce portable code that follows a fixed programming model, OpenCL is chosen as the underlying base technology. In combination with dOpenCL, OpenCL programs can be executed on remote machines over the network without changes to the original code. The resulting parallelization potential of OpenCL in conjunction with dOpenCL is abstracted behind the main contribution of this thesis: Dynamic OpenCL. Dynamic OpenCL is written in Java and allows programmers to create distributed OpenCL programs in Java by employing Aparapi for code translation.
It contains mechanisms for cluster management and dynamic resource adjustment that enable its operation in various cluster setups, such as hybrid clusters composed of local and cloud resources. Depending on the respective cluster environment, scheduling algorithms can be adjusted in order to gain performance improvements.

An extensive evaluation shows that the framework can introduce significant performance benefits by distributing selected tasks, with almost linear speedups. Even in heavily network-bound cluster setups that incorporate cloud resources, Dynamic OpenCL is able to increase performance noticeably by providing a network-aware scheduling mechanism. Based on Dynamic OpenCL, a prototypical web server is built that allows users to submit computations through a user interface and to decrease execution times by booking additional cloud resources.

It is concluded that Dynamic OpenCL is able to provide speedups in various cluster environments, depending mainly on the nature of the submitted workloads and on the network bandwidth available to remote devices. With further improvements that aim at reducing the impact of limited network bandwidth, the speedup potential might be increased even further in the future.
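
The abstract does not describe how the network-aware scheduling mechanism works. As a rough illustration of the general idea only, the following Java sketch picks, for each task, the device with the lowest estimated time for input transfer plus computation. Everything in it is an assumption made for illustration: the `Device` class, the `estimateSeconds` and `pick` methods, the cost model, and the numbers are hypothetical and are not taken from Dynamic OpenCL, dOpenCL, or Aparapi.

```java
import java.util.List;

/** Illustrative only: a greedy, network-aware device choice. All names are hypothetical. */
class NetworkAwareSchedulerSketch {

    /** Hypothetical device model; not part of Dynamic OpenCL's API. */
    static final class Device {
        final String name;
        final double gflops;        // assumed sustained compute throughput
        final double bandwidthMBps; // link to the device; POSITIVE_INFINITY for local devices

        Device(String name, double gflops, double bandwidthMBps) {
            this.name = name;
            this.gflops = gflops;
            this.bandwidthMBps = bandwidthMBps;
        }
    }

    /** Estimated seconds to ship the input to the device and run the task there. */
    static double estimateSeconds(Device d, double inputMB, double workGflop) {
        double transfer = inputMB / d.bandwidthMBps; // evaluates to 0.0 for infinite bandwidth
        double compute = workGflop / d.gflops;
        return transfer + compute;
    }

    /** Returns the device with the lowest estimate, or null if the list is empty. */
    static Device pick(List<Device> devices, double inputMB, double workGflop) {
        Device best = null;
        double bestTime = Double.POSITIVE_INFINITY;
        for (Device d : devices) {
            double t = estimateSeconds(d, inputMB, workGflop);
            if (best == null || t < bestTime) {
                best = d;
                bestTime = t;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up cluster: a slower local GPU and a faster cloud GPU behind a 100 MB/s link.
        List<Device> cluster = List.of(
                new Device("local-gpu", 1000, Double.POSITIVE_INFINITY),
                new Device("cloud-gpu", 4000, 100));

        // Small input, heavy computation.
        System.out.println(pick(cluster, 10, 8000).name);
        // Large input, same computation.
        System.out.println(pick(cluster, 2000, 8000).name);
    }
}
```

Under this simplified model, a compute-heavy task with a small input favors the faster remote device, while the same computation with a large input stays local because the transfer time dominates. This mirrors the abstract's observation that achievable speedups depend on the nature of the workload and on the bandwidth available to remote devices.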