MergeAlign

About

MergeAlign is a program that constructs a consensus multiple sequence alignment from multiple independent alignments. Using dynamic programming, it efficiently combines the individual alignments to generate a consensus that is maximally representative of all constituent alignments.
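
The idea of combining alignments by dynamic programming can be sketched as follows. This is a simplified illustration, not MergeAlign's own code. It assumes each alignment is a list of equal-length gapped strings with the sequences in the same order, that each column becomes an edge in a graph, and that the consensus is the path with the largest summed edge weight. The function names (path_edges, merge, to_alignment) and the exact scoring rule are assumptions made for this example.

```python
from collections import Counter


def path_edges(alignment):
    """Turn an alignment into a list of edges, one per column.

    A node is a tuple counting the residues consumed so far in each
    sequence; a column is the edge from the node before it to the node
    after it. All-gap columns are skipped because they do not move.
    """
    node = (0,) * len(alignment)
    edges = []
    for column in zip(*alignment):
        nxt = tuple(n + (ch != "-") for n, ch in zip(node, column))
        if nxt != node:
            edges.append((node, nxt))
        node = nxt
    return edges


def merge(alignments):
    """Return (best_path, weights) over the merged column graph.

    weights counts how many input alignments contain each edge.
    Assumption of this sketch: the best path maximises summed weight.
    """
    weights = Counter(e for aln in alignments for e in path_edges(aln))
    incoming = {}
    for src, dst in weights:
        incoming.setdefault(dst, []).append(src)

    start = (0,) * len(alignments[0])
    end = path_edges(alignments[0])[-1][1]
    best = {start: (0, None)}  # node -> (score, predecessor)

    # Every edge increases the coordinate sum, so sorting nodes by that
    # sum visits each node after all of its predecessors.
    for node in sorted(incoming, key=sum):
        best[node] = max(
            (best[src][0] + weights[(src, node)], src)
            for src in incoming[node]
        )

    # Trace back from the end node to recover the chosen columns.
    path = []
    node = end
    while best[node][1] is not None:
        prev = best[node][1]
        path.append((prev, node))
        node = prev
    path.reverse()
    return path, weights


def to_alignment(path, sequences):
    """Render a path as gapped strings, given the ungapped sequences."""
    rows = [[] for _ in sequences]
    for src, dst in path:
        for row, seq, a, b in zip(rows, sequences, src, dst):
            row.append(seq[a] if b > a else "-")
    return ["".join(row) for row in rows]


if __name__ == "__main__":
    inputs = [["MK-V", "MKLV"], ["MK-V", "MKLV"], ["M-KV", "MKLV"]]
    path, _ = merge(inputs)
    ungapped = [s.replace("-", "") for s in inputs[0]]
    print(to_alignment(path, ungapped))
```

In this toy example two of the three inputs agree on where the gap goes, so the highest-weight path reproduces their placement.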

Combining multiple sequence alignments generated with different amino acid substitution matrices produces alignments that are more robust and more accurate than those generated with any single matrix. Phylogenetic trees inferred from these MergeAlign alignments have better topological support values, are better resolved and show increased consistency.

MergeAlign generates column support scores for each column in a multiple sequence alignment. When constituent alignments are generated using different models of amino acid substitution these support scores are related to alignment precision. MergeAlign can therefore be used to select accurately aligned data for downstream phylogenetic applications.
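
One way to picture a column support score is as the fraction of constituent alignments that contain exactly the same column. The sketch below takes that view and then keeps only well-supported columns. It is an illustration under stated assumptions: the score MergeAlign actually reports may be defined differently, and the function names (column_edges, column_support, filter_columns) and the threshold value are hypothetical.

```python
def column_edges(alignment):
    """Describe each column by the residue counts before and after it.

    alignment is a list of equal-length gapped strings. Two columns from
    different alignments are "the same" when they align the same residues
    at the same position in every sequence.
    """
    node = (0,) * len(alignment)
    edges = []
    for column in zip(*alignment):
        nxt = tuple(n + (ch != "-") for n, ch in zip(node, column))
        edges.append((node, nxt))
        node = nxt
    return edges


def column_support(consensus, alignments):
    """Fraction of input alignments that contain each consensus column."""
    edge_sets = [set(column_edges(aln)) for aln in alignments]
    return [
        sum(edge in edges for edges in edge_sets) / len(alignments)
        for edge in column_edges(consensus)
    ]


def filter_columns(consensus, scores, threshold):
    """Keep only the columns whose support is at least threshold."""
    keep = [i for i, score in enumerate(scores) if score >= threshold]
    return ["".join(seq[i] for i in keep) for seq in consensus]


if __name__ == "__main__":
    inputs = [["MK-V", "MKLV"], ["MK-V", "MKLV"], ["M-KV", "MKLV"]]
    consensus = ["MK-V", "MKLV"]
    scores = column_support(consensus, inputs)
    print(scores)
    print(filter_columns(consensus, scores, threshold=0.9))
```

Here the first and last columns are found in all three inputs, while the two middle columns are found in only two of them, so a strict threshold keeps just the outer columns.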

For more information see reference 1 below or read How MergeAlign works.

Align and Combine using MergeAlign

Enter a set of unaligned protein sequences in FASTA format and we will align them using MAFFT (reference 2) and 91 different amino acid substitution matrices (reference 1), then use MergeAlign to find the optimal consensus.

This may take a couple of minutes depending on alignment size and server load.

Run MergeAlign on existing alignments

If you already have multiple independent alignments of the same set of sequences then you can just use MergeAlign. Alignments can be protein, DNA or RNA, but must be in the FASTA or CLUSTAL format. Gaps must be represented by the character '-'.
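
Before uploading, it can help to check that your files really do align the same sequences. The following is a minimal sketch for FASTA input only, assuming that sequence names match across files and that gaps are written as '-'. The helper names (read_fasta, same_sequences) are hypothetical and are not part of MergeAlign.

```python
def read_fasta(text):
    """Parse FASTA text into a dict of {name: aligned sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def same_sequences(alignments):
    """Return True if all alignments hold the same ungapped sequences."""
    ungapped = [
        {name: seq.replace("-", "") for name, seq in aln.items()}
        for aln in alignments
    ]
    return all(other == ungapped[0] for other in ungapped[1:])


if __name__ == "__main__":
    first = read_fasta(">s1\nMK-V\n>s2\nMKLV\n")
    second = read_fasta(">s1\nM-KV\n>s2\nMKLV\n")
    print(same_sequences([first, second]))
```

The check removes gaps and compares the remaining residues name by name, so two files that differ only in gap placement count as alignments of the same sequences.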

Click Choose Files below to select your files. Hold Shift to select a range of files; hold Ctrl to select multiple separate files. Then click Upload to upload the constituent alignments. Your files are not stored on our server.

Select format of input files

Feedback

If you have any comments or questions please email either Peter Collingridge or Steven Kelly.

References

If you use MergeAlign please cite:

1. Collingridge PW & Kelly S (BMC Bioinformatics 2012, 13:117)
MergeAlign: improving multiple sequence alignment performance by dynamic reconstruction of consensus multiple sequence alignments.

If you aligned your sequences here, please also cite:

2. Katoh K, Kuma K, Toh H & Miyata T (Nucleic Acids Res. 2005, 33:511-518)
MAFFT version 5: improvement in accuracy of multiple sequence alignment.