SC²S Colloquium - April 23, 2013
Date: April 23, 2013
Time: 3 pm, s.t.
Michele Martone: A Sparse BLAS implementation using the "Recursive Sparse Blocks" layout
Sparse matrix computations arise in many scientific computing problems, and operations such as Sparse Matrix-Vector Multiplication (SPMV) often dominate the total computation time. These computations are generally characterized by irregular memory accesses and a low compute-to-load/store ratio. Recent general-purpose CPUs (cache-based, shared-memory parallel) perform poorly on them because of the relatively high latency of cache/memory accesses and the decreasing ratio of memory bandwidth to compute throughput. Recent experiments suggest that by combining cache-blocking techniques (leading to a space-filling-curve block layout of the data) with coarse-grained shared-memory parallelism, it is possible to execute SPMV on large sparse matrices more than twice as fast as with Intel MKL's industry-standard, highly optimized CSR implementation. Additionally, in contrast to the CSR format, our format (RSB --- Recursive Sparse Blocks) enables a higher degree of parallelism for operations such as transposed or symmetric SPMV and sparse triangular solve. This talk will present the main ideas and results of using the RSB format in a Sparse BLAS (Basic Linear Algebra Subroutines) context, as implemented in the recently released "librsb" library.