Bookshelf
Improved Data Transfer Efficiency for Scale-Out GPU Workloads using On-the-Fly I/O Link Compression
Master's thesis, Hasso-Plattner-Institut an der Universität Potsdam | 21 July 2020

While hardware accelerators such as GPUs are commonplace nowadays, it remains challenging to provide a programming model that allows domain experts to exploit their full potential. Previous work developed dOpenCL, which allows users to distribute OpenCL applications over a compute cluster, as well as CloudCL, which builds on top of dOpenCL, simplifies application development, and further eases cluster distribution and job scheduling. However, the relatively low bandwidth of the networks available on commodity cloud services, compared with the compute capability of modern GPU devices, often limits the scalability of these approaches. To achieve better scalability while maintaining an accessible programming model, this thesis integrates transparent I/O link compression into dOpenCL and CloudCL using the 842 compression algorithm, drawing on both the dedicated NX842 hardware compressor available on the POWER architecture and an optimized 842 implementation for OpenCL. The thesis presents a performance-oriented design for a compression system for OpenCL data transfers, develops an implementation of this design, and evaluates the resulting system, demonstrating that it can accelerate data transfers over high-performance networks across a range of tested set-ups and benchmarks.
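To give a rough idea of the approach, the following is a minimal C sketch of on-the-fly compression around an OpenCL buffer transfer. The compress_842/decompress_842 functions and the link_msg framing are hypothetical stand-ins, not the thesis's actual API: in the real system the 842 codec (NX842 hardware or the OpenCL software path) sits transparently inside the dOpenCL network layer rather than in application code.

```c
/* Sketch: 842-compress a host buffer before it crosses the I/O link,
 * restore it on the remote node, then complete the OpenCL write there.
 * compress_842()/decompress_842() are assumed codec entry points. */
#include <stdlib.h>
#include <string.h>
#include <CL/cl.h>

size_t compress_842(const void *src, size_t len, void *dst, size_t cap);
size_t decompress_842(const void *src, size_t len, void *dst, size_t cap);

typedef struct {
    unsigned char compressed;  /* 1 if the payload is 842-compressed      */
    size_t payload_len;        /* bytes actually carried over the link    */
    size_t orig_len;           /* size after decompression                */
    void *payload;
} link_msg;

/* Sender side: compress the host data before the forwarding layer ships
 * it to the remote node; fall back to raw bytes if 842 gains nothing. */
static link_msg pack_for_link(const void *host_ptr, size_t size)
{
    link_msg m = { 0, size, size, malloc(size) };
    size_t n = compress_842(host_ptr, size, m.payload, size);
    if (n > 0 && n < size) {
        m.compressed = 1;
        m.payload_len = n;
    } else {
        memcpy(m.payload, host_ptr, size);   /* incompressible data */
    }
    return m;
}

/* Receiver side: restore the original bytes and enqueue the write on the
 * device that actually executes the OpenCL command. */
static cl_int unpack_and_write(cl_command_queue q, cl_mem buf, link_msg m)
{
    void *plain = malloc(m.orig_len);
    if (m.compressed)
        decompress_842(m.payload, m.payload_len, plain, m.orig_len);
    else
        memcpy(plain, m.payload, m.orig_len);

    cl_int err = clEnqueueWriteBuffer(q, buf, CL_TRUE, 0, m.orig_len,
                                      plain, 0, NULL, NULL);
    free(plain);
    return err;
}
```

The fallback path matters in practice: when data is incompressible, sending it unmodified avoids paying compression latency for no bandwidth gain.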