Signal (pre)processing of SELDI Data

As a guest researcher at the Centers for Disease Control and Prevention in Atlanta, GA since 2006, I have been interested in developing algorithms that can reliably preprocess large numbers (on the order of 10^3) of SELDI spectra. This turned out to be a nontrivial problem, and it became the basis of my PhD dissertation at the Georgia Institute of Technology.

Benchmarking Preprocessing Algorithms

We conducted extensive simulations to benchmark nine of the most popular approaches to preprocessing SELDI data. In this project, we established standard metrics for measuring algorithm performance, measured the performance of all nine algorithms, and examined in detail the performance of the algorithms for different subclasses of proteins (with respect to mass, abundance, and prevalence in a sample). All algorithms show room for improvement. We have published our results in Proteomics and have made available all of our Matlab and Perl code used in the study.

State-of-the-Art Automated Preprocessing of SELDI Spectra

We have developed a new SELDI preprocessing package, implemented in Matlab, that offers several advantages over its competitors.

To achieve these advances, we proposed a new way to model the physical processes inherent in SELDI, which also furthers our theoretical understanding of the technique. Our work has been submitted for publication in Bioinformatics and is currently under peer review.

SELDI Analysis of Cervical Mucus Samples with Application to Early Detection of Cervical Cancer

We plan to apply our new preprocessing methodology at the CDC to look for discriminating patterns in cervical mucus that are indicative of early-stage cervical cancer.