All Stories

  1. Generalizing Random Butterfly Transforms to Arbitrary Matrix Sizes
  2. Task-Based Polar Decomposition Using SLATE on Massively Parallel Systems with Hardware Accelerators
  3. GPU-based LU Factorization and Solve on Batches of Matrices with Band Structure
  4. Using Additive Modifications in LU Factorization Instead of Pivoting
  5. A Set of Batched Basic Linear Algebra Subprograms and LAPACK Routines
  6. Using Advanced Vector Extensions AVX-512 for MPI Reductions
  7. Extreme-Scale Task-Based Cholesky Factorization Toward Climate and Weather Prediction Applications
  8. Load-balancing Sparse Matrix Vector Product Kernels on GPUs
  9. Guest editors’ note: Special issue on clusters, clouds, and data for scientific computing
  10. Massively Parallel Automated Software Tuning
  11. PLASMA
  12. Big data and extreme-scale computing
  13. A look back on 30 years of the Gordon Bell Prize
  14. GPU-accelerated co-design of induced dimension reduction
  15. Exascale computing and big data