SCCS Colloquium - May 20, 2020

Revision as of 14:44, 29 April 2020 by Makis
Date: May 20, 2020
Room: Online (password: SCCS)
Time: 15:00 - 16:00

Tianyi Ge: Python software suite of geometry-oblivious Fast Multipole Methods: its application in statistical plotting and high dimensional data classification

Guided Research project submission talk. Tianyi is advised by Severin Reiz.

This guided research focuses on optimizing the existing Python code of GOFMM and adding new functionality to the balanced tree data structure embedded in GOFMM. The original code relies heavily on nested loops, conditional checks, and index tracking, which makes it difficult to apply GOFMM to new applications. The revised version implements a set-oriented interface for setting up the GOFMM data structure; in particular, we formalize all sampling methods in terms of sets. As a result, the code is much shorter, easier to understand, and more efficient, since it reduces to basic mathematical operations on a fundamental data structure. Furthermore, we use the scipy.linalg.interpolative package to simplify the skeletonization analysis, such as the interpolative decomposition and QR factorization. The resulting code runs much faster and makes it easier to implement related applications.
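To illustrate the kind of skeletonization step that scipy.linalg.interpolative supports, the following is a minimal sketch, not the project's actual code. It assumes a Gaussian kernel block between two well-separated synthetic point sets; the helper gaussian_kernel, the point sets, the bandwidth, and the tolerance are all hypothetical choices made for illustration.

```python
import numpy as np
import scipy.linalg.interpolative as sli


def gaussian_kernel(X, Y, bandwidth=1.0):
    """Dense Gaussian kernel matrix between the rows of X and the rows of Y."""
    sq_dist = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth**2))


# Hypothetical data: two well-separated point clouds in 2D.
rng = np.random.default_rng(0)
targets = rng.uniform(0.0, 1.0, size=(200, 2))
sources = rng.uniform(3.0, 4.0, size=(150, 2))

# Off-diagonal kernel block; expected to be numerically low-rank.
K = gaussian_kernel(targets, sources, bandwidth=1.0)

# Interpolative decomposition to relative precision 1e-8.
# With a tolerance < 1, interp_decomp returns the rank k as well.
k, idx, proj = sli.interp_decomp(K, 1e-8, rand=False)

skeleton = idx[:k]                            # indices of the skeleton columns
K_skel = K[:, skeleton]                       # the selected columns of K
P = sli.reconstruct_interp_matrix(idx, proj)  # k x n interpolation matrix
K_approx = K_skel @ P                         # K is approximated by K_skel @ P

rel_err = np.linalg.norm(K - K_approx) / np.linalg.norm(K)
print(f"rank {k} of {K.shape[1]} columns, relative error {rel_err:.2e}")
```

The skeleton columns play the role of representative points for the block, so a matrix-vector product with K can be replaced by a product with the much smaller K_skel and P.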

With our new GOFMM interface, we devised two examples that demonstrate the usability of GOFMM in statistical plotting and image recognition. Our first example takes 2D position data and classifies points based on their density. By exploiting the relatively fast search and the data structure in GOFMM, this use case accelerates the underlying matrix-vector multiplication. Compared with a plot of the raw data, our plot, which uses Gaussian kernel density estimation (Gaussian KDE), displays a high level of accuracy. Our second example pipes over one thousand 8 × 8 images into GOFMM, using its tree-like data structure for fast search and low storage. We then apply Gaussian KDE to learn from the training dataset so that the updated model can accurately classify testing data. The results show that our model classifies with high accuracy: with 30 training images, it correctly classifies 43 of 44 test images.
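The classification step can be pictured as one kernel matrix product per batch of test points. The sketch below is only an illustration under stated assumptions: it uses a dense NumPy product where the project would use GOFMM's compressed matrix-vector multiplication, and it replaces the image data with synthetic 64-dimensional vectors (the size of a flattened 8 × 8 image). The function names, class count, and bandwidth are hypothetical.

```python
import numpy as np


def gaussian_kernel(X, Y, bandwidth):
    """Dense Gaussian kernel matrix between the rows of X and the rows of Y."""
    sq_dist = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth**2))


def kde_classify(train_X, train_y, test_X, n_classes, bandwidth):
    """Assign each test point to the class with the highest Gaussian KDE value."""
    K = gaussian_kernel(test_X, train_X, bandwidth)  # (n_test, n_train)
    one_hot = np.eye(n_classes)[train_y]             # (n_train, n_classes)
    class_sizes = one_hot.sum(axis=0)
    # Dense product; this is the step a fast kernel method would accelerate.
    densities = (K @ one_hot) / class_sizes          # per-class mean kernel value
    return np.argmax(densities, axis=1)


# Synthetic stand-in for flattened 8 x 8 images (assumption, not the real data).
rng = np.random.default_rng(0)
n_classes, dim = 3, 64
centers = rng.normal(scale=4.0, size=(n_classes, dim))
train_y = np.repeat(np.arange(n_classes), 10)   # 30 training samples
test_y = np.repeat(np.arange(n_classes), 15)    # 45 test samples
train_X = centers[train_y] + rng.normal(size=(train_y.size, dim))
test_X = centers[test_y] + rng.normal(size=(test_y.size, dim))

pred = kde_classify(train_X, train_y, test_X, n_classes, bandwidth=4.0)
accuracy = np.mean(pred == test_y)
print(f"accuracy on synthetic data: {accuracy:.3f}")
```

Writing the class densities as K times a one-hot label matrix makes the link to matrix-vector multiplication explicit: each class density is one product of the kernel matrix with an indicator vector.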

Keywords: GOFMM, Gaussian KDE, Image Classification