Advanced Graph-Based Techniques with Applications in Dimensionality Reduction 

Dr. Salimeh Yasaei Sekeh will present a talk on one aspect of her research on machine learning algorithms at noon this Friday. Anyone interested from the campus community is welcome to attend.
Title: Advanced Graph-Based Techniques with Applications in Dimensionality Reduction 
Time: November 8, Friday, at 12:00 PM
Location: Soderberg Lecture Hall, Jenness Hall, University of Maine
Abstract: Several machine learning algorithms have been proposed for classifying a big data sample into multiple classes. In a multi-class classification problem, a classifier is trained on the training instances and then applied to label new data. Training such a classifier requires complicated high-dimensional computations. Among machine learning techniques for big data, feature selection is a dimensionality reduction method that seeks a subset of the available features containing sufficient discriminative information for a given learning task. In feature selection methods, the dependency between a pair of multivariate random variables is used to decide which features are relevant and should be retained. In this talk, a new graph-based dependency criterion, inspired by the geometry of graphs and by information-theoretic measures, is proposed to estimate the relevance between multi-label variables. Furthermore, a novel feature selection algorithm based on this criterion is introduced. The proposed technique uses conditional graph-based dependency to measure feature relevance and to maintain a feature subset with relatively high classification performance. The advantages of the proposed approach are demonstrated in a series of simulations. The approach yields an efficient, fast, non-parametric implementation of dependency estimation and dimensionality reduction, with broad applications to modern real-world problems. Finally, the technique is applied to several real-world data sets to filter out redundant features, and the results show that it achieves higher prediction performance than the baselines.
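For readers unfamiliar with dependency-based feature selection, the general idea can be illustrated with a small sketch. This is not the graph-based criterion presented in the talk: as a stand-in, it uses a simple plug-in estimate of mutual information between each discrete feature and the class label, then ranks features by that score. The function names, feature names, and toy data are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Plug-in estimate of mutual information (in nats) between two
    discrete sequences of equal length. Assumption: this is a stand-in
    dependency measure, not the graph-based criterion from the talk."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log(c * n / (count_x[x] * count_y[y]))
    return mi


def rank_features(columns, labels):
    """Score each feature column by its dependency with the labels and
    return (names sorted by decreasing score, score dictionary)."""
    scores = {name: mutual_information(col, labels)
              for name, col in columns.items()}
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking, scores


# Hypothetical toy data: eight samples, binary labels, three binary features.
labels = [0, 0, 1, 1, 0, 1, 0, 1]
columns = {
    "informative": [0, 0, 1, 1, 0, 1, 0, 1],  # identical to the labels
    "noisy":       [0, 1, 1, 1, 0, 0, 0, 1],  # agrees with labels on 6 of 8
    "irrelevant":  [0, 1, 0, 1, 1, 0, 0, 1],  # empirically independent
}

ranking, scores = rank_features(columns, labels)
print(ranking)
```

A filter method of this kind would keep the top-ranked features and discard the rest. The approach described in the abstract differs in that it estimates dependency with a graph-based criterion and uses a conditional version of it to account for redundancy among features.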