Solving systems of linear equations is one of the most CPU-intensive steps in engineering and simulation applications, and it can benefit greatly from the many processing cores and the vectorisation available on today’s parallel computers. Our objective is to evaluate the performance of one such solver, the conjugate gradient method, on a hybrid computing platform (multi-GPU/multi-CPU). We consider the preconditioned conjugate gradient (PCG) solver because it exhibits the main features of such problems. Indeed, the relative performance of CPUs and GPUs depends strongly on the subroutine: GPUs are, for instance, much more efficient at processing regular kernels such as matrix-vector multiplication than more irregular kernels such as matrix factorization. In this context, one solution is to rely on dynamic scheduling and resource allocation mechanisms such as those provided by StarPU. In this chapter, we evaluate the performance of the dynamic schedulers provided by StarPU and analyse the scalability of the PCG algorithm. We show how effectively the best combination of resources can be chosen to improve performance.
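To make the kernels mentioned above concrete, the following is a minimal, sequential sketch of a preconditioned conjugate gradient iteration in plain Python. It is an illustration only and not the chapter's implementation: the Jacobi (diagonal) preconditioner, the dense list-of-lists storage, and the names `matvec`, `dot` and `pcg_jacobi` are all assumptions made for this example, and no StarPU scheduling is involved. The matrix is assumed to be symmetric positive definite.

```python
def matvec(A, x):
    """Dense matrix-vector product (the kind of regular kernel that suits GPUs)."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def pcg_jacobi(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A.

    Assumption for this sketch: the preconditioner is M = diag(A) (Jacobi).
    Returns the approximate solution and the number of iterations performed.
    """
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]    # M^{-1} for Jacobi
    x = [0.0] * n                                    # initial guess x0 = 0
    r = list(b)                                      # r0 = b - A x0 = b
    z = [d * ri for d, ri in zip(inv_diag, r)]       # z0 = M^{-1} r0
    p = list(z)                                      # first search direction
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5 or 1.0                 # avoid dividing by zero

    for k in range(max_iter):
        # Stop when the relative residual norm is small enough.
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, k
        Ap = matvec(A, p)                            # dominant cost per iteration
        alpha = rz / dot(p, Ap)                      # step length
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]   # apply the preconditioner
        rz_new = dot(r, z)
        beta = rz_new / rz                           # direction update coefficient
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Small symmetric positive definite test system.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x, iterations = pcg_jacobi(A, b)
residual = [bi - axi for bi, axi in zip(b, matvec(A, x))]
```

Each iteration consists of one matrix-vector product, a few inner products and vector updates, and one preconditioner application. In a task-based setting, these are the units a runtime could plausibly assign to CPU or GPU workers; how the chapter actually partitions the work is not stated in this abstract.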
Citation
Kasmi, N., Zbakh, M., & Haouari, A. (2019). Performance analysis of preconditioned conjugate gradient solver on heterogeneous (multi-CPUs/multi-GPUs) architecture. In Lecture Notes in Networks and Systems (Vol. 49, pp. 318–336). Springer. https://doi.org/10.1007/978-3-319-97719-5_20