Yale University.  
Computer Science.  
     

Mark Gerstein
Albert L. Williams Professor of Biomedical Informatics, Molecular Biophysics & Biochemistry and Computer Science

A.B. 1989, Harvard University
Ph.D. 1993, Cambridge University
Joined Yale Faculty 1997

Personal Homepage


Professor Gerstein does research in the new field of bioinformatics, which applies quantitative approaches to problems in molecular biology and genomics. His research draws on a range of computational techniques, including systematic data mining and machine learning, visualization of high-dimensional data, biological database design, and molecular simulation.

Broadly, Professor Gerstein is interested in analyses of genome sequences, macromolecular structures, molecular networks, and functional-genomics datasets. He is particularly focused on the human genome and personal genome sequences in relation to three areas.

(1) He is interested in annotating the human genome sequence, especially in characterizing the vast expanse of non-coding sequence. This work involves the creation of automatic pipelines for identifying patterns and homologies in the genome sequence and processing large-scale next-generation sequencing data efficiently. He is also interested in studying the genomic variations between individuals, particularly in identifying and assembling large blocks of variant sequence.

(2) He seeks to determine the function of all the protein elements encoded by the genome. Here, the approach is to characterize function systematically through the use of molecular networks. This work involves extensive application of machine learning approaches such as Bayesian networks, decision trees, and clustering. Also important in this work is developing ontologies for biological functions and statistically reliable methods for predicting protein function based on sequence similarity, functional genomics data, and automated analysis of the literature.

(3) Finally, for the population of proteins that have known 3D structures, he studies how their function is carried out through motion and how motion can be predicted from packing geometry. This involves developing ways of aligning structures, clustering related ones into fold families, analyzing packing with Voronoi polyhedra, and simulating motions using molecular-mechanics potentials.

Representative Publications:

L.Y. Wang, A. Abyzov, J.O. Korbel, M. Snyder, M. Gerstein (2009). "MSB: a mean-shift-based approach for the analysis of structural variation in the genome," Genome Res 19: 106-17.

P.M. Kim, L.J. Lu, Y. Xia, M.B. Gerstein (2006). "Relating three-dimensional structures to protein networks provides evolutionary insights," Science 314: 1938-41.

H. Yu, M. Gerstein (2006). "Genomic analysis of the hierarchical structure of regulatory networks," Proc Natl Acad Sci U S A 103: 14724-31.

M. Gerstein, D. Zheng (2006). "The real life of pseudogenes," Sci Am 295: 48-55.
