Multiple Sequence Alignment Construction, Visualization, and Analysis Using Partial Order Graphs

Catherine Grasso, Cornell University

The traditional row-column multiple sequence alignment (RC-MSA) format represents each sequence as a row and each set of aligned residues as a column. This format conveniently shows the evolutionary changes in biological sequences that result from single residue substitutions, insertions, and deletions. It is easily viewed in a simple text editor or many existing graphical interfaces, and is quite adequate for studying the relationships between recently diverged sequences that are homologous over their complete length. However, recent genome and proteome projects have fueled interest in aligning and exploring the relationships between complex biological sequences whose multiple sequence alignments often contain both large internal insertions and long terminal extensions. While valuable tools for visualizing and analyzing multiple sequence alignments have been developed, the structure of such complex and large-scale changes is often difficult to see within the details of the traditional RC-MSA format.

Graph algorithms and visualization techniques provide a powerful approach to this problem. We recently described the Partial Order Multiple Sequence Alignment (PO-MSA) format, which represents a multiple sequence alignment as a directed acyclic graph, along with an algorithm for Partial Order multiple sequence Alignment (POA). Based on this approach, we have developed a tool for visualizing the overall structure of a complex multiple sequence alignment, the Partial Order multiple sequence Alignment Visualizer (POAVIZ).
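To make the graph representation concrete, the following is a minimal sketch of how an existing RC-MSA might be collapsed into a directed acyclic graph. It is an illustration under simplifying assumptions, not the authors' POA algorithm or the actual PO-MSA data structure: the function name `rc_msa_to_po_graph`, the `(column, residue)` node labels, and the gap character `-` are all hypothetical choices made for this example. Sequences that carry the same residue in the same column share a node, and each edge records which sequences pass through it.

```python
from collections import defaultdict


def rc_msa_to_po_graph(rows, gap="-"):
    """Collapse row-column alignment rows into a simple partial order graph.

    Hypothetical helper for illustration only.
    Returns (nodes, edges), where nodes is a set of (column, residue) pairs
    and edges maps (source_node, target_node) to the list of row indices
    whose sequences traverse that edge.
    """
    if len({len(row) for row in rows}) > 1:
        raise ValueError("all alignment rows must have the same length")

    nodes = set()
    edges = defaultdict(list)
    for seq_index, row in enumerate(rows):
        previous = None
        for column, residue in enumerate(row):
            if residue == gap:
                continue  # gaps contribute no node; the path skips this column
            node = (column, residue)
            nodes.add(node)  # identical residues in one column are fused
            if previous is not None:
                edges[(previous, node)].append(seq_index)
            previous = node
    return nodes, dict(edges)


# Three aligned sequences with one internal gap each in the first two rows.
rows = ["ACG-T", "AC-TT", "ACGAT"]
nodes, edges = rc_msa_to_po_graph(rows)

for (src, dst), seqs in sorted(edges.items()):
    print(f"{src} -> {dst}  sequences {seqs}")
```

Because every edge runs from a lower column to a higher one, the result is acyclic by construction. Regions where all sequences agree collapse into a single shared path, while insertions and deletions appear as branches that leave and rejoin it, which is the kind of large-scale structure that is hard to see in the row-column layout.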

Abstract Author(s): Catherine Grasso, Michael Quist, Kevin Ke, Christopher Lee