This site is devoted to mathematics and its applications. Created and run by Peter Saveliev.

# Computational science training: 2011 projects

This page presents projects to be run by Peter Saveliev for the grant "REU Site: Computational Science Training at Marshall University for Undergraduates in the Mathematical and Physical Sciences" (Principal Investigator: Howard L. Richards). Over the summers of 2010–2012, the Departments of Mathematics, Physics, and Chemistry at Marshall University will jointly host twelve students for ten weeks of instruction and research in computational science.

The general goal is to do something new computationally while learning some math. So, in summer 2010 we used and modified some existing software. In the beginning I suggested 3 potential projects. These are the two that I ended up supervising:

*The topology of data* by Joseph Snyder, and *3D image analysis* by James Molchanoff.

In the order of increasing math background required...

## Image-to-image search

It is the exact opposite of the text-to-image search we are familiar with. Given an image, visual image search engines find images in a given collection that are similar, in some way, to the query image. So far, these engines exist mostly as experimental prototypes. Most of these demo programs work with small collections of images and, frequently, without an upload feature, which makes testing impossible. Meanwhile, when testing is possible, the results are questionable.

The approach is based on the methods related to the digital image analysis project: the distribution of the sizes of the objects in the image is compared to those of other images.
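The comparison of size distributions can be sketched as follows. This is only an illustration, not the method used by PxSearch: it assumes the image has already been reduced to a binary grid, treats 4-connected groups of pixels as objects, and compares normalized histograms by the sum of absolute differences. The function names, the bins, and the distance are all hypothetical choices.

```python
from collections import deque

def object_sizes(image):
    """Return the pixel counts of the 4-connected objects in a binary image."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] and not seen[r][c]:
                # Breadth-first search over one object.
                seen[r][c] = True
                queue = deque([(r, c)])
                size = 0
                while queue:
                    y, x = queue.popleft()
                    size += 1
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                sizes.append(size)
    return sizes

def size_histogram(sizes, bin_edges):
    """Normalized histogram; bin i holds sizes s with bin_edges[i] <= s < bin_edges[i+1]."""
    counts = [0] * (len(bin_edges) - 1)
    for s in sizes:
        for i in range(len(counts)):
            if bin_edges[i] <= s < bin_edges[i + 1]:
                counts[i] += 1
                break
    total = sum(counts)
    return [k / total for k in counts] if total else counts

def histogram_distance(h1, h2):
    """Sum of absolute differences between two histograms (assumed measure)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

# Usage: a small binary image with objects of sizes 3, 2, and 1.
image = [[1, 1, 0, 0],
         [1, 0, 0, 1],
         [0, 0, 0, 1],
         [1, 0, 0, 0]]
hist = size_histogram(object_sizes(image), bin_edges=[1, 2, 4, 8])
```

Two images would then be ranked as similar when the distance between their histograms is small. The choice of bins and the handling of very small objects (noise) are exactly the kinds of matching criteria listed among the possible projects.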

2011 project: Evaluating image-to-image search

*Possible projects*, software PxSearch (Windows):

- creating datasets for various, medium-size image collections;
- developing a comprehensive review of the literature on the subject;
- evaluating the quality of the matching;
- modifying the matching criteria (bins for the distributions, thresholds for noise, etc.);
- analyzing the topology of the datasets (project below).

Background:

- Prerequisite: Calculus 3: course
- Co-requisite: Image processing: course

## Modeling with discrete exterior calculus

Conway's Game of Life and other cellular automata produce fascinating pictures but, to the best of my knowledge, can't be used to model such a simple thing as a circular wave... We will be using discrete exterior calculus to model elementary ODEs and PDEs in dimension 2 with C++, MATLAB, and/or Excel.

2011 project: Modeling with discrete exterior calculus

*Possible projects:*

- flow from a vector field,
- wave equation,
- heat equation.
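As one illustration of the heat equation item, here is a minimal sketch in Python (chosen for brevity; the projects themselves name C++, MATLAB, and Excel). It assumes a square grid with unit spacing, so that the Hodge star can be taken to be the identity, and a no-flux boundary. Temperature is a 0-form on the vertices, its exterior derivative is a 1-form on the edges, and one explicit Euler step adds `dt` times the net flow into each vertex. All function names are hypothetical.

```python
def exterior_derivative(u):
    """d of a 0-form: differences of u along horizontal and vertical edges."""
    rows, cols = len(u), len(u[0])
    dx = [[u[r][c + 1] - u[r][c] for c in range(cols - 1)] for r in range(rows)]
    dy = [[u[r + 1][c] - u[r][c] for c in range(cols)] for r in range(rows - 1)]
    return dx, dy

def divergence(dx, dy, rows, cols):
    """Apply minus the transpose of d to a 1-form, giving a 0-form on vertices.

    Each edge value is added at the edge's start vertex and subtracted at
    its end vertex. Only interior edges exist, so nothing leaves the grid.
    """
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols - 1):
            out[r][c] += dx[r][c]
            out[r][c + 1] -= dx[r][c]
    for r in range(rows - 1):
        for c in range(cols):
            out[r][c] += dy[r][c]
            out[r + 1][c] -= dy[r][c]
    return out

def heat_step(u, dt):
    """One explicit Euler step of u_t = div(d u); assumes dt <= 0.25 for stability."""
    rows, cols = len(u), len(u[0])
    dx, dy = exterior_derivative(u)
    lap = divergence(dx, dy, rows, cols)
    return [[u[r][c] + dt * lap[r][c] for c in range(cols)] for r in range(rows)]

# Usage: a unit of heat placed at the center of a 5x5 grid spreads out.
u = [[0.0] * 5 for _ in range(5)]
u[2][2] = 1.0
for _ in range(10):
    u = heat_step(u, 0.2)
```

Because every edge contributes equal and opposite amounts to its two endpoints, the total amount of heat on the grid stays constant from step to step, which is a convenient sanity check for this kind of scheme.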

Background:

- Prerequisite: Linear algebra: course and Differential equations: course, some computer language
- Co-requisite: Differential forms: course

## Digital image analysis in 3D

Image analysis and computer vision are concerned with the extraction of meaningful information from digital images. Some of the most prominent applications are in cell analysis, medical image processing, and industrial machine vision. There exists an abundance of methods for solving various well-defined computer vision tasks, but these methods are very task-specific and can seldom be reused in a wide range of applications. Our long-term goal is to design a computer vision system “from first principles”. These principles will come initially from algebraic topology.

*Possible projects:*

- developing gray scale analysis of 3D images, software: CHomP (C++);
- application to stereo vision.

2010 project: 3D image analysis.

Background:

- Prerequisites: Linear algebra: course, some C++
- Co-requisite: Homology of cell complexes: course

## Topological data analysis

Suppose we have conducted 1000 experiments with a set of 100 various measurements in each. Then each experiment is a string of 100 numbers, or simply a vector of dimension 100. The result is a collection of 1000 disconnected points, called the point cloud, in a 100-dimensional vector space. It is impossible to visualize this data, as any representation that one can see is limited to dimension 3. Yet we still need to answer a few simple topological questions about the object behind the point cloud:

- Is it one piece or more?
- Is there a tunnel?
- Or a void?
- And what about possible 100-dimensional topological features?

Through clustering (and related approaches), statistics answers the first question. A common *topological* approach to the problem is as follows. For a point cloud in a Euclidean space, suppose we are given a threshold r so that any two points within distance r of each other are considered “close”. Then each pair of such points is connected by an edge. If three points are “close”, we add a face, etc. The result is a simplicial complex that approximates the manifold M behind the point cloud. More: Topological data analysis.
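The construction can be sketched in a few lines of Python. This is an illustration only and is unrelated to how jPlex is implemented. It assumes that three points count as “close” when they are pairwise within r (the Vietoris-Rips convention), it stops at dimension 2, and the function names are hypothetical. Counting the connected components of the resulting graph answers the first question in the list above.

```python
from itertools import combinations
from math import dist

def rips_complex(points, r):
    """Edges and triangles of the complex built from pairwise distances <= r."""
    n = len(points)
    edges = [(i, j) for i, j in combinations(range(n), 2)
             if dist(points[i], points[j]) <= r]
    edge_set = set(edges)
    # A triangle is added when all three of its edges are present.
    triangles = [t for t in combinations(range(n), 3)
                 if all(pair in edge_set for pair in combinations(t, 2))]
    return edges, triangles

def count_components(n, edges):
    """Number of connected pieces of the graph, via union-find."""
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in edges:
        parent[find(i)] = find(j)
    return len({find(i) for i in range(n)})

# Usage: two well-separated groups of points in the plane.
points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
edges, triangles = rips_complex(points, r=1.5)
pieces = count_components(len(points), edges)
```

The same code works unchanged for points in a 100-dimensional space, since `math.dist` accepts coordinates of any dimension. Detecting tunnels and voids requires computing homology of the complex, which this sketch does not attempt.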

*Possible projects*, software jPlex (Java):

- applying jPlex to various datasets,
- applying jPlex to the dataset from the image-to-image search project,
- local analysis and dimensionality reduction.

2010 project: The topology of data

Background:

- Prerequisites: Linear algebra: course, some Java
- Co-requisite: Homology of cell complexes: course