A Characteristic-Based Framework for Multiple Sequence Aligners

Research output: Contribution to journal › Article › peer-review

8 Citations (Scopus)

Abstract

Multiple sequence alignment is a well-known bioinformatics problem that consists of aligning three or more biological sequences (protein or nucleic acid). A number of tools have been proposed in the literature for this problem, including progressive, consistency-based, and iterative methods, among others. These aligners often use a single default parameter configuration for all input sequences. However, the default configuration is not always the best choice: the alignment accuracy of a tool may improve substantially when it is run with a parameter configuration suited to the biological characteristics of the input sequences. In this paper, we propose a characteristic-based framework for multiple sequence aligners. The idea of the framework is, given an input set of unaligned sequences, to extract its characteristics and run the aligner with the best parameter configuration found for another set of unaligned sequences with similar characteristics. To test the framework, we used the well-known multiple sequence comparison by log-expectation (MUSCLE) v3.8 aligner with different benchmarks: benchmark alignments database (BAliBASE) v3.0, protein reference alignment benchmark (PREFAB) v4.0, and sequence alignment benchmark (SABmark) v1.65. The results show that the alignment accuracy and conservation of MUSCLE can be greatly improved with the proposed framework, especially in scenarios with a low percentage of identity. The characteristic-based framework for multiple sequence aligners is freely available for download at http://arco.unex.es/arl/fwk-msa/cbf-msa.zip
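
The lookup step described in the abstract can be pictured with a minimal Python sketch. It is not the authors' implementation: the abstract does not list the characteristics or the similarity measure the framework uses, so the features (sequence count, mean length, length spread), the Euclidean distance, the names `Characteristics`, `KNOWLEDGE_BASE`, `best_configuration`, and `muscle_command`, and the parameter values are all assumptions chosen for illustration. The only MUSCLE v3.8 options referenced are `-in`, `-out`, `-gapopen`, and `-maxiters`.

```python
from dataclasses import astuple, dataclass
from math import sqrt


@dataclass(frozen=True)
class Characteristics:
    """Hypothetical feature vector for a set of unaligned sequences."""
    num_sequences: int
    mean_length: float
    length_std: float


def extract_characteristics(sequences):
    """Summarise an unaligned sequence set with three simple features."""
    lengths = [len(s) for s in sequences]
    n = len(lengths)
    mean = sum(lengths) / n
    variance = sum((x - mean) ** 2 for x in lengths) / n
    return Characteristics(n, mean, sqrt(variance))


def distance(a, b):
    """Plain Euclidean distance between feature vectors (an assumption;
    features are not normalised in this sketch)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(astuple(a), astuple(b))))


def best_configuration(query, knowledge_base):
    """Return the configuration stored for the most similar known set."""
    _, config = min(knowledge_base, key=lambda entry: distance(query, entry[0]))
    return config


def muscle_command(in_path, out_path, config):
    """Build (but do not run) a MUSCLE v3.8 command line from a config dict
    that maps option names to values."""
    cmd = ["muscle", "-in", in_path, "-out", out_path]
    for flag, value in config.items():
        cmd += [f"-{flag}", str(value)]
    return cmd


# Illustrative knowledge base: characteristics of previously tuned sets and
# the configuration assumed to have worked best for each (values invented).
KNOWLEDGE_BASE = [
    (Characteristics(3, 4.0, 1.5), {"gapopen": -400.0}),
    (Characteristics(50, 300.0, 40.0), {"maxiters": 2}),
]

query = extract_characteristics(["ACDE", "ACDEFG", "AC"])
config = best_configuration(query, KNOWLEDGE_BASE)
print(muscle_command("input.fasta", "aligned.afa", config))
```

In this sketch the knowledge base is a plain list searched linearly, which is enough to show the idea of reusing the configuration of the nearest known sequence set. How the best configuration for each known set is found in the first place is outside its scope.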

Original language: English
Pages (from-to): 41-51
Number of pages: 10
Journal: IEEE Transactions on Cybernetics
Volume: 48
Issue number: 1
Publication status: Published - Jan 2018

Keywords

  • Characteristics-based
  • Multiple sequence alignment (MSA)
  • Particle swarm optimization (PSO)
