Abstract
One of the most prominent heterogeneous system architectures is the CPU-GPU architecture. Graphics processing units (GPUs) deliver high performance and throughput by exploiting thread-level parallelism across a large number of parallel cores. Although heterogeneous system architectures can provide high compute capability for a wide range of applications, they face challenges such as resource management, scheduling, and communication. Managing shared resources, especially memory shared among compute units, is a serious challenge: running thousands of threads and having multiple compute units access a shared memory increases conflicts, thrashing, and access time. To address this memory challenge, a large body of research focuses on the interaction among compute units and on managing memory accesses through prefetch and pre-eviction approaches. Prefetching is a promising approach to memory management. However, prior approaches prefetch or pre-evict memory addresses based on adjacent blocks, without taking the sharing pattern into account. This shortcoming is drawing increasing attention as applications with irregular access patterns and high memory demands continue to grow. In this thesis, we propose a sharing-aware prefetch approach that considers sharing, i.e., the likelihood that data will be accessed in the near future. The proposed mechanism examines memory requests, the effectiveness of prefetched data, and their effects on data sharing, and based on these observations it applies policies for prefetching and pre-evicting data in cache memories. Although the approach is general enough to be applied at different memory levels, we focus on the L1 data cache in GPUs, because incorrect prefetches and pre-evictions in the L1 data cache strongly affect the overall performance and energy consumption of GPUs. We evaluate the proposed idea using Accel-Sim. Compared to the state-of-the-art prefetching mechanism, our approach increases prefetch accuracy by 41% and improves performance and energy efficiency by 12% and 26%, respectively.