PatMaN: rapid alignment of short sequences to large databases
- PMID: 18467344
- PMCID: PMC2718670
- DOI: 10.1093/bioinformatics/btn223
Abstract
We present a tool suited for searching for many short nucleotide sequences in large databases, allowing for a predefined number of gaps and mismatches. The command-line-driven program implements a non-deterministic automata matching algorithm on a keyword tree of the search strings. Queries can be searched whether or not they contain ambiguity codes. Search time is short for perfect matches, and retrieval time rises exponentially with the number of edits allowed.
Availability: The C++ source code for PatMaN is distributed under the GNU General Public License and has been tested on the GNU/Linux operating system. It is available from http://bioinf.eva.mpg.de/patman.
Supplementary information: Supplementary data are available at Bioinformatics online.
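The matching idea described in the abstract can be illustrated with a minimal Python sketch. It is not PatMaN's C++ implementation: the function names (`build_trie`, `search`), the parameters `max_edits` and `max_gaps`, and the convention that gaps count toward the edit total are all assumptions made for illustration, and ambiguity codes are not handled. The sketch builds a keyword tree of the queries and scans the text once, keeping a set of simultaneously active states (tree node, edits used, gaps used), which is what makes the automaton non-deterministic.

```python
def build_trie(patterns):
    """Keyword tree stored as parallel lists indexed by node id (0 = root)."""
    children = [{}]  # children[node] maps a base to the child node id
    ends = [[]]      # ends[node] lists the patterns that end at this node
    for pat in patterns:
        node = 0
        for ch in pat:
            if ch not in children[node]:
                children.append({})
                ends.append([])
                children[node][ch] = len(children) - 1
            node = children[node][ch]
        ends[node].append(pat)
    return children, ends


def _skip_pattern_bases(states, children, max_edits, max_gaps):
    """Add states reached by skipping pattern bases (a gap in the text)."""
    seen = set(states)
    stack = list(states)
    while stack:
        node, edits, gaps = stack.pop()
        if edits < max_edits and gaps < max_gaps:
            for child in children[node].values():
                state = (child, edits + 1, gaps + 1)
                if state not in seen:
                    seen.add(state)
                    stack.append(state)
    return seen


def search(text, patterns, max_edits=0, max_gaps=0):
    """Return sorted (pattern, end_position, edits) hits; end is exclusive."""
    children, ends = build_trie(patterns)
    best = {}        # (pattern, end) -> fewest edits seen for that hit
    active = set()   # states alive after the previous text position
    for pos, base in enumerate(text):
        active.add((0, 0, 0))  # a new match may start at every position
        consumed = set()
        for node, edits, gaps in active:
            for ch, child in children[node].items():
                if ch == base:                      # exact match
                    consumed.add((child, edits, gaps))
                elif edits < max_edits:             # mismatch
                    consumed.add((child, edits + 1, gaps))
            # Extra base in the text (a gap in the pattern); not at the root.
            if node != 0 and edits < max_edits and gaps < max_gaps:
                consumed.add((node, edits + 1, gaps + 1))
        active = _skip_pattern_bases(consumed, children, max_edits, max_gaps)
        for node, edits, gaps in active:
            for pat in ends[node]:
                key = (pat, pos + 1)
                if edits < best.get(key, max_edits + 1):
                    best[key] = edits
    return sorted((pat, end, edits) for (pat, end), edits in best.items())


if __name__ == "__main__":
    print(search("TTACGTTT", ["ACGT"]))               # exact search
    print(search("TTACCTTT", ["ACGT"], max_edits=1))  # one mismatch allowed
```

With no edits allowed, at most one state per tree depth survives each text position, so the scan is fast. Each additional permitted edit multiplies the number of states that can stay alive, which is consistent with the abstract's remark that retrieval time rises exponentially with the number of edits.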
