This section is to be added to the multiple sequence alignment article.
Phylogeny-aware methods
Most multiple sequence alignment methods try to minimize the number of insertions and deletions (gaps) and, as a consequence, produce compact alignments. This causes several problems if the sequences to be aligned contain non-homologous regions, or if gaps are informative in a phylogenetic analysis. These problems are common in newly produced sequences that are poorly annotated and may contain frame-shifts, wrong domains or non-homologous spliced exons.
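The preference for compact alignments can be illustrated with a toy example. The sketch below is not taken from any of the tools discussed here: the function name `sum_of_pairs`, the three short sequences and the scoring values (match +1, mismatch -1, gap -2) are assumptions chosen only for illustration. It scores two alignments of the same sequences, one that stacks two unrelated insertions in the same columns and one that keeps them in separate columns.

```python
from itertools import combinations


def sum_of_pairs(alignment, match=1, mismatch=-1, gap=-2):
    """Score an alignment by summing pairwise scores over every column.

    The scoring values are illustrative assumptions, not the scheme of
    any particular alignment program.
    """
    score = 0
    for column in zip(*alignment):
        for a, b in combinations(column, 2):
            if a == "-" and b == "-":
                continue  # two gaps facing each other are not penalized
            if a == "-" or b == "-":
                score += gap
            elif a == b:
                score += match
            else:
                score += mismatch
    return score


# Hypothetical sequences: the second and third each carry their own insertion.
compact = ["AC--GT",
           "ACTTGT",
           "ACGGGT"]
separate = ["AC----GT",
            "ACTT--GT",
            "AC--GGGT"]

print(sum_of_pairs(compact))   # 2
print(sum_of_pairs(separate))  # -4
```

Under these assumed scores the compact alignment wins, even though it places the unrelated insertions "TT" and "GG" in the same columns as if they were homologous.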
The first phylogeny-aware method was developed in 2005 by Löytynoja and Goldman.[1] The same authors released a software package called PRANK in 2008.[2] PRANK improves alignments when insertions are present. Nevertheless, it runs slowly compared to progressive and/or iterative methods, which have been developed over several years.
In 2012, two new phylogeny-aware tools appeared. One is PAGAN, developed by the same team as PRANK.[3] The other is ProGraphMSA, developed by Szalkowski.[4] The two tools were developed independently but share common features, notably the use of graph algorithms to improve the recognition of non-homologous regions, and code improvements that make them faster than PRANK.
References
- Loytynoja, A. (2005). "An algorithm for progressive multiple alignment of sequences with insertions". Proceedings of the National Academy of Sciences. 102 (30): 10557–10562. doi:10.1073/pnas.0409137102.
- Loytynoja, A.; Goldman, N. (2008). "Phylogeny-Aware Gap Placement Prevents Errors in Sequence Alignment and Evolutionary Analysis". Science. 320 (5883): 1632–1635. doi:10.1126/science.1158395. PMID 18566285.
- Loytynoja, A.; Vilella, A. J.; Goldman, N. (2012). "Accurate extension of multiple sequence alignments using a phylogeny-aware graph algorithm". Bioinformatics. 28 (13): 1684–1691. doi:10.1093/bioinformatics/bts198. PMC 3381962. PMID 22531217.
- Szalkowski, A. M. (2012). "Fast and robust multiple sequence alignment with phylogeny-aware gap placement". BMC Bioinformatics. 13: 129. doi:10.1186/1471-2105-13-129. PMC 3495709. PMID 22694311.