Bioinformatics. 2008 Jul 1;24(13):1530-1.
doi: 10.1093/bioinformatics/btn223. Epub 2008 May 8.

PatMaN: rapid alignment of short sequences to large databases

Kay Prüfer et al. Bioinformatics.

Abstract

We present a tool suited to searching for many short nucleotide sequences in large databases, allowing for a predefined number of gaps and mismatches. The command-line program implements a non-deterministic automaton matching algorithm on a keyword tree of the search strings. Queries both with and without ambiguity codes can be searched. Search time is short for perfect matches, and retrieval time rises exponentially with the number of edits allowed.
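
To make the idea of matching many queries at once on a keyword tree more concrete, the following is a minimal Python sketch, not PatMaN's actual C++ implementation. It simulates a non-deterministic automaton by carrying a list of active (tree node, mismatches used) states across the text. It handles mismatches only; gaps and ambiguity codes are left out for brevity. The names `build_trie`, `search` and `max_mismatches` are hypothetical and chosen for illustration.

```python
def build_trie(patterns):
    """Build a keyword tree; each node records the patterns that end there."""
    root = {"children": {}, "ends": []}
    for p in patterns:
        node = root
        for ch in p:
            node = node["children"].setdefault(ch, {"children": {}, "ends": []})
        node["ends"].append(p)
    return root


def search(text, patterns, max_mismatches=0):
    """Return sorted (start, pattern, mismatches) hits of patterns in text.

    Assumption: only substitutions are counted as edits in this sketch.
    """
    root = build_trie(patterns)
    hits = set()
    active = []  # partial matches in progress: (node, mismatches used so far)
    for i, c in enumerate(text):
        active.append((root, 0))  # a new match may start at every position
        next_active = []
        for node, used in active:
            for ch, child in node["children"].items():
                cost = used + (ch != c)  # a differing base costs one mismatch
                if cost > max_mismatches:
                    continue  # edit budget exhausted, drop this branch
                next_active.append((child, cost))
                for p in child["ends"]:
                    hits.add((i - len(p) + 1, p, cost))
        active = next_active
    return sorted(hits)


exact = search("GATTACCC", ["CCC", "GA", "GT"], max_mismatches=0)
fuzzy = search("GATTACCC", ["CCC", "GA", "GT"], max_mismatches=1)
```

With `max_mismatches=0` only the branch that agrees with the text survives each step, so the active list stays small. Every additional permitted edit lets more branches of the tree remain active, which gives an intuition for why search time grows quickly with the number of edits allowed.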

Availability: The C++ source code for PatMaN is distributed under the GNU General Public License and has been tested on the GNU/Linux operating system. It is available from http://bioinf.eva.mpg.de/patman.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1.
Keyword tree with suffix links after adding the sequences ‘CCC’, ‘GA’ and ‘GT’. The keyword tree (represented as bold lines) encodes the probe sequence as a path leading from the root node on the left side to the leaves on the right side. Suffix links are shown as arrows, but have been omitted at leaf nodes for brevity.
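
Since the figure itself is not reproduced here, the structure the caption describes can be sketched in Python. This is an illustrative construction in the general style of a keyword tree with suffix (failure) links, under the assumption that each node is identified by the prefix it spells and that the root is the empty string. The function names are hypothetical and do not come from the PatMaN source.

```python
from collections import deque


def build_keyword_tree(patterns):
    """Map each node (named by its prefix) to its outgoing edges."""
    children = {"": {}}  # "" is the root node
    for p in patterns:
        for k in range(len(p)):
            parent, node = p[:k], p[:k + 1]
            children.setdefault(parent, {})[p[k]] = node
            children.setdefault(node, {})
    return children


def add_suffix_links(children):
    """Link each node to the longest proper suffix that is also a tree node."""
    link = {"": ""}
    queue = deque()
    for node in children[""].values():
        link[node] = ""  # depth-1 nodes always link back to the root
        queue.append(node)
    while queue:  # breadth-first, so shallower links are ready when needed
        node = queue.popleft()
        for ch, child in children[node].items():
            target = link[node]
            while target != "" and ch not in children[target]:
                target = link[target]
            link[child] = children[target].get(ch, "")
            queue.append(child)
    return link


tree = build_keyword_tree(["CCC", "GA", "GT"])
links = add_suffix_links(tree)
for node in sorted(links):
    print(repr(node), "->", repr(links[node]))
```

For the three sequences in the caption, the only suffix links that do not point to the root are `CC -> C` and `CCC -> CC`. The sketch also computes links for leaf nodes, which the figure omits.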

