Persistent homology is a fundamental development in the field of topological data analysis over the past two decades. Given a potentially complex object (e.g., a manifold, an image, a graph, or a time series), persistent homology can summarize and characterize it from different perspectives in a multiscale manner. It thus provides a general and powerful way for data summarization / feature vectorization. In the past few years, many interesting approaches have been developed either to provide kernels for persistence-based summaries or to vectorize them, so as to make persistence-based summaries more amenable to machine learning frameworks for downstream data analysis tasks.
In these approaches, the importance (weight) of different persistence features is often pre-set. In practice, however, the choice of the weight function should depend on the nature of the specific type of data one considers. In this talk, I will talk about our recent work on learning the "best" weight function (and thus metric for persistence-based summaries) from labelled data, as well as an application of this idea to the challenging graph classification task. I will show that a graph classification framework based on the learned kernel between persistence-based summaries can obtain similar or (sometimes significantly) better results than the best results from a range of previous graph classification frameworks on a collection of benchmark datasets. This is joint work with Qi Zhao.
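To make the notion of a weighted kernel between persistence-based summaries concrete, here is a minimal illustrative sketch (not the authors' actual method): a weighted Gaussian kernel between two persistence diagrams, where each point contributes according to a weight function. The particular weight used here (persistence raised to a power `t`) and the parameter names are assumptions for illustration only; in the work described above, the weight function itself is learned from labelled data.

```python
import math

def weight(point, t=1.0):
    # Illustrative weight function: the persistence (death - birth) of a
    # feature raised to a power t. In the learned-kernel setting, t (or the
    # whole weight function) would be fit to labelled data rather than fixed.
    birth, death = point
    return (death - birth) ** t

def weighted_kernel(D1, D2, sigma=1.0, t=1.0):
    # Weighted Gaussian kernel between two persistence diagrams, each given
    # as a list of (birth, death) pairs:
    #   k(D1, D2) = sum_{p in D1} sum_{q in D2} w(p) w(q) exp(-|p - q|^2 / (2 sigma^2))
    total = 0.0
    for (b1, d1) in D1:
        for (b2, d2) in D2:
            sq_dist = (b1 - b2) ** 2 + (d1 - d2) ** 2
            total += (weight((b1, d1), t) * weight((b2, d2), t)
                      * math.exp(-sq_dist / (2 * sigma ** 2)))
    return total
```

Down-weighting or up-weighting points by their persistence is one common design choice; learning this weighting from data, rather than fixing it a priori, is the idea the talk develops.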
Workshop III: Geometry of Big Data