Research Interests
The volume of high-throughput data from biological and clinical systems (from next-generation sequencing experiments to electronic health records) is increasing dramatically, making it possible to develop a quantitative understanding of these complex systems. Our lab is an interdisciplinary team interested in developing mathematical and computational tools to extract useful biological information from large data sets.

Our work focuses on three distinct topics:

  • Infectious diseases.
    Evolution is a dynamic process that shapes genomes. Our team at Columbia is developing algorithms and software to analyze genomic data, with a view to understanding the molecular biology, population genetics, phylogeny, and epidemiology of viruses.

  • Cancer.
    Next-generation sequencing technologies provide an extraordinary opportunity to identify somatic mutations that contribute to the development of tumors. We are developing methods to identify cancer-driving mutations in high-throughput sequencing datasets.

  • Electronic Health Records.
    Clinical databases constitute a rich and complex source of raw data. We are using statistical and computational methods to tease out clinically important patterns in these diverse datasets.

Because of this recent explosion in biological and medical data (a 2011 New York Times article referred to the phenomenon as a "deluge of data"), tackling these research problems often requires heavy-duty computation. To facilitate this, we have access to a high-performance computing cluster maintained by the Center for Computational Biology and Bioinformatics.

See a listing of our current publications here.

Read more about our research here.

Find information about the members of our lab here.

    Philosophy is written in this vast book, which lies continuously open before our eyes (I mean the universe). But it cannot be understood unless you have first learned to understand the language and recognize the characters in which it is written. It is written in the language of mathematics, and the characters are triangles, circles, and other geometrical figures. Without such means, it is impossible for us humans to understand a word of it, and to be without them is to wander around in vain through a dark labyrinth.