Discovering Novel Subsystems Using Comparative Genomics

Citation

Luciana Ferrer, Alexander G. Shearer, Peter D. Karp, Discovering novel subsystems using comparative genomics, Bioinformatics, Volume 27, Issue 18, 15 September 2011, Pages 2478–2485, https://doi.org/10.1093/bioinformatics/btr428

Abstract

Motivation: Key problems for computational genomics include discovering novel pathways in genome data, and discovering functional interaction partners for genes to define new members of partially elucidated pathways.

Results: We propose a novel method for the discovery of subsystems from annotated genomes. For each gene pair, a score measuring the likelihood that the two genes belong to the same subsystem is computed using genome context methods. Genes are then grouped based on these scores, and the resulting groups are filtered to keep only high-confidence groups. Since the method is based on genome context analysis, it relies solely on structural annotation of the genomes. The method can be used to discover new pathways, find missing genes from a known pathway, find new protein complexes or other kinds of functional groups and assign function to genes. We tested the accuracy of our method in Escherichia coli K-12. In one configuration of the system, we find that 31.6% of the candidate groups generated by our method match a known pathway or protein complex closely, and that we rediscover 31.2% of all known pathways and protein complexes of at least 4 genes. We believe that a significant proportion of the candidates that do not match any known group in E. coli K-12 correspond to novel subsystems that may represent promising leads for future laboratory research. We discuss in-depth examples of these findings.
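
The score, group and filter steps described above can be illustrated with a minimal Python sketch. The abstract does not specify the grouping algorithm or the filtering criterion, so everything here is an assumption made for illustration: pairwise scores are taken as already computed, grouping is done by connected components over pairs above a hypothetical link_threshold, and filtering uses a hypothetical minimum group size and mean within-group score. The gene names and score values are invented.

```python
from collections import defaultdict
from itertools import combinations


def group_genes(pair_scores, link_threshold):
    """Group genes into connected components, linking pairs that score
    at or above link_threshold (an assumed stand-in for the paper's grouping)."""
    parent = {}

    def find(gene):
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]  # path halving
            gene = parent[gene]
        return gene

    for (a, b), score in pair_scores.items():
        if score >= link_threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for gene in list(parent):
        groups[find(gene)].add(gene)
    return list(groups.values())


def mean_internal_score(group, pair_scores):
    """Average score over all gene pairs in the group; unscored pairs count as 0."""
    pairs = list(combinations(sorted(group), 2))
    if not pairs:
        return 0.0
    total = sum(
        pair_scores.get((a, b), pair_scores.get((b, a), 0.0)) for a, b in pairs
    )
    return total / len(pairs)


def filter_groups(groups, pair_scores, min_size, min_mean_score):
    """Keep only groups that are large enough and internally well supported
    (both criteria are assumptions, not the paper's confidence filter)."""
    return [
        g
        for g in groups
        if len(g) >= min_size
        and mean_internal_score(g, pair_scores) >= min_mean_score
    ]


# Invented pairwise scores in [0, 1]; higher means more likely co-membership.
scores = {
    ("geneA", "geneB"): 0.9,
    ("geneB", "geneC"): 0.8,
    ("geneA", "geneC"): 0.7,
    ("geneD", "geneE"): 0.95,
    ("geneC", "geneD"): 0.1,
}

groups = group_genes(scores, link_threshold=0.6)
candidates = filter_groups(groups, scores, min_size=3, min_mean_score=0.5)
```

With these invented scores, the weak geneC/geneD link is ignored, so grouping yields two components; the two-gene component is then discarded by the size filter, leaving a single three-gene candidate. Connected components is only one of many possible grouping strategies and tends to merge groups that share a single strong link, which is one reason a separate filtering step is useful.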

Availability: Predicted subsystems are available at http://brg.ai.sri.com/pwy-discovery/journal.html.

Contact: lferrer@ai.sri.com

