Sheehan Kahn

My research centers on exploiting the growing wealth of publicly available microarray data to enhance current medical studies. A microarray is a device biologists can use to measure the expression levels of all the genes in a tissue sample. Learning which genes are associated with different disease states can be extremely useful. Most studies using microarrays, however, are statistically underpowered: an average array has on the order of 50,000 features (gene measurements), but even the largest studies have at most 100-200 arrays. Fortunately, an increasing amount of publicly available data from previous studies can be exploited to create abstractions and ‘meta’ features, which can then be used to facilitate the analysis of new data. As an analogy, an oncologist specialized in breast cancer may not know all the details of prostate cancer, but should be able to generalize their knowledge of cancer to help a patient they suspect has prostate cancer.