Abstract
General-Purpose Graphics Processing Unit (GPGPU) computing has become a standard technique for high-performance scientific computing. Recent improvements in GPGPU hardware have outpaced improvements in the Peripheral Component Interconnect Express (PCIe) interconnect, which transfers data between the host and the device, by such a margin that limited PCIe bandwidth now bottlenecks application performance. In this thesis, two alternatives are explored: the Nvidia NVLink interconnect and Heterogeneous System Architectures with zero-copy algorithms. To evaluate these alternatives, the OpenCL Scalable HeterOgeneous Computing (SHOC) benchmark suite is used to measure the performance of each device across various scientific applications.