SC²S Colloquium - March 28, 2013
Date: March 28, 2013
Time: 3 pm, s.t.
This work continues the chair's previous work on cache-oblivious algorithms. These algorithms are used for common matrix operations, specifically matrix multiplication and LU decomposition. A cache-oblivious algorithm is independent of the cache architecture, which offers increased performance and portability. This property is achieved by a new matrix storage scheme inspired by space-filling curves known as Peano curves. The implementation of the algorithm is called TifaMMy and is available as open source.
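To illustrate what "independent of the cache architecture" means in practice, here is a minimal sketch of a recursive matrix multiplication. It is not TifaMMy's Peano-curve scheme. It is a generic divide-and-conquer variant, and the names `recursive_matmul`, `multiply`, and `cutoff` are assumptions made for this example. The idea it shows is that repeatedly halving the largest dimension produces sub-problems that eventually fit into any cache level, without the code knowing a cache size.

```python
def recursive_matmul(A, B, C, i0, i1, j0, j1, k0, k1, cutoff=2):
    """Accumulate A[i0:i1, k0:k1] * B[k0:k1, j0:j1] into C[i0:i1, j0:j1].

    Hypothetical helper for illustration only; matrices are lists of rows.
    """
    di, dj, dk = i1 - i0, j1 - j0, k1 - k0
    if max(di, dj, dk) <= cutoff:
        # Base case: a small block, multiplied with plain loops.
        for i in range(i0, i1):
            for k in range(k0, k1):
                a = A[i][k]
                for j in range(j0, j1):
                    C[i][j] += a * B[k][j]
        return
    # Recursive case: halve the largest dimension. No cache size appears
    # anywhere, which is the cache-oblivious property.
    if di >= dj and di >= dk:
        mid = i0 + di // 2
        recursive_matmul(A, B, C, i0, mid, j0, j1, k0, k1, cutoff)
        recursive_matmul(A, B, C, mid, i1, j0, j1, k0, k1, cutoff)
    elif dj >= dk:
        mid = j0 + dj // 2
        recursive_matmul(A, B, C, i0, i1, j0, mid, k0, k1, cutoff)
        recursive_matmul(A, B, C, i0, i1, mid, j1, k0, k1, cutoff)
    else:
        mid = k0 + dk // 2
        recursive_matmul(A, B, C, i0, i1, j0, j1, k0, mid, cutoff)
        recursive_matmul(A, B, C, i0, i1, j0, j1, mid, k1, cutoff)


def multiply(A, B):
    """Return the product of A (n x m) and B (m x p) as a new list of rows."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(n)]
    recursive_matmul(A, B, C, 0, n, 0, p, 0, m)
    return C
```

For example, `multiply([[1, 2], [3, 4]], [[5, 6], [7, 8]])` returns `[[19, 22], [43, 50]]`. A Peano-curve layout goes one step further by also storing the matrix elements in the order in which the recursion visits them, which this sketch does not attempt.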
We will review the previous work and discuss our motivation for improving it. We will also address the importance of scalability for contemporary and future computers. The previous implementations were parallelized with OpenMP. This work intends to improve TifaMMy by using a new parallelization library, Threading Building Blocks (TBB).
Finally, we will test our implementation on a range of systems with different architectures and compare the results with the previous implementations and with other architecture-optimized matrix libraries. We will conclude with an analysis of the results and of the degree to which our targets were met.