Torun T., TORUN F. Ş., MANGUOĞLU M., Aykanat C.

SIAM Journal on Scientific Computing, vol.44, no.2, 2022 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 44 Issue: 2
  • Publication Date: 2022
  • DOI: 10.1137/21m1411603
  • Journal Name: SIAM Journal on Scientific Computing
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, ABI/INFORM, Agricultural & Environmental Science Database, Applied Science & Technology Source, Compendex, Computer & Applied Sciences, INSPEC, MathSciNet, zbMATH, Civil Engineering Abstracts
  • Keywords: parallel Gauss-Seidel, distributed-memory, Spike algorithm, parallel sparse triangular solve, hypergraph partitioning, sparse matrix reordering
  • Ankara Yıldırım Beyazıt University Affiliated: Yes


Copyright © 2022 by SIAM.

Gauss-Seidel (GS) is a widely used iterative method for solving sparse linear systems of equations and is also known to be an effective smoother in algebraic multigrid methods. Parallelizing GS is challenging since solving the sparse lower triangular system in GS constitutes a sequential bottleneck at each iteration. We propose a distributed-memory parallel GS (dmpGS) by implementing a parallel sparse triangular solver (stSpike) based on the Spike algorithm. stSpike decouples the global triangular system into smaller systems that can be solved concurrently, and requires the solution of a much smaller reduced sparse lower triangular system, which remains a sequential bottleneck. To alleviate this bottleneck and to reduce the communication overhead of dmpGS, we propose a partitioning and reordering model consisting of two phases. The first phase is a novel hypergraph partitioning model whose partitioning objective simultaneously encodes minimizing the reduced system size and the communication volume. The second phase is an in-block row reordering method for decreasing the nonzero count of the reduced system. Extensive experiments on a dataset of 359 sparse linear systems verify the effectiveness of the proposed partitioning and reordering model in reducing the communication and sequential computational overheads. Parallel experiments on 12 large systems using up to 320 cores demonstrate that the proposed model significantly improves the scalability of dmpGS.
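The ideas in the abstract can be made concrete with a small sequential sketch. This is an illustrative assumption, not the paper's dmpGS/stSpike implementation: the function names (`spike_lower_solve`, `gauss_seidel`), the dense-matrix representation, and the block layout are all hypothetical. Each GS sweep solves a lower triangular system, and a Spike-style solve decouples that system into independent per-block solves plus a small reduced system restricted to the coupling (interface) unknowns:

```python
import numpy as np

def spike_lower_solve(L, b, block_starts):
    """Spike-style decoupled solve of the lower triangular system L x = b.

    Row blocks are given by block_starts, e.g. [0, 3, 6] splits a 9x9
    system into rows 0-2, 3-5, 6-8. Dense illustrative sketch: steps 1
    and 3 are independent per block (parallelizable); only the small
    reduced system in step 2 remains sequential.
    """
    n = len(b)
    bounds = list(block_starts) + [n]
    D = np.zeros_like(L)                      # block-diagonal part of L
    for s, e in zip(bounds[:-1], bounds[1:]):
        D[s:e, s:e] = L[s:e, s:e]
    C = L - D                                 # coupling blocks below the block diagonal
    # Step 1: per-block solves g = D^{-1} b and spikes V = D^{-1} C.
    g = np.linalg.solve(D, b)
    V = np.linalg.solve(D, C)
    # Interface unknowns: columns that actually couple into later blocks.
    J = np.flatnonzero(np.any(C != 0, axis=0))
    # Step 2: reduced lower triangular system (I + V[J,J]) x[J] = g[J].
    R = np.eye(len(J)) + V[np.ix_(J, J)]
    xJ = np.linalg.solve(R, g[J])
    # Step 3: retrieve the remaining unknowns from x = g - V x.
    x = g - V[:, J] @ xJ
    x[J] = xJ
    return x

def gauss_seidel(A, b, block_starts, tol=1e-10, max_iter=500):
    """GS sweep x_{k+1} = Lo^{-1} (b - U x_k) with Lo = tril(A),
    using the Spike-style solver for the triangular bottleneck."""
    Lo = np.tril(A)                           # diagonal + strictly lower part
    U = A - Lo                                # strictly upper part
    x = np.zeros_like(b, dtype=float)
    for _ in range(max_iter):
        x_new = spike_lower_solve(Lo, b - U @ x, block_starts)
        if np.linalg.norm(x_new - x, np.inf) < tol:
            return x_new
        x = x_new
    return x
```

For a strictly diagonally dominant `A` the sweep converges; the point of the paper's hypergraph partitioning phase is to choose the row blocks so that the interface set (and hence the reduced system) stays small.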