Bioinformatics. Группа авторов

Читать онлайн книгу.

Bioinformatics - Группа авторов


Скачать книгу
or chemical similarities. Note that the matrix is a mirror image of itself with respect to the diagonal.

      The other major type of gap penalty used is a non-affine (or linear) gap penalty. Here, there is no cost for opening the gap; a simple, fixed mismatch penalty is assessed for each position in the gap. It is thought that affine penalties better represent the biology underlying the sequence alignments, as affine gap penalties take into account the fact that most conserved regions are ungapped and that a single mutational event could insert or delete many more than just one residue. In practice, use of the affine gap penalty better enables the detection of more distant homologs.

      The Algorithm

Program Query Database
BLASTN Nucleotide Nucleotide
BLASTP Protein Protein
BLASTX Nucleotide, six-frame translation Protein
TBLASTN Protein Nucleotide, six-frame translation
TBLASTX Nucleotide, six-frame translation Nucleotide, six-frame translation
Snapshot depicts the initiation of a BLAST search in which the search begins with query words of a given length being compared against a scoring matrix to determine additional three-letter words in the neighborhood of the original query word. Graph depicts the BLAST search extension in which the length of extension represents the number of characters that have been aligned in a pairwise sequence comparison and the cumulative score represents the sum of the position-by-position scores, as determined by the scoring matrix used for the search and T represents the neighborhood score threshold, S is the minimum score required to return a hit in the BLAST output, and X is the significance decay.

      As the extension continues, at some point, mismatches and gaps will begin to outweigh the exact matches and conservative substitutions, accruing negative scores from the scoring matrix. As soon as the curve begins to turn downward, BLAST measures whether the drop-off exceeds a threshold called X. If the curve decays more than is allowed by the value of X, the extension is terminated and the alignment is trimmed back to the length corresponding to the preceding maximum in the curve. The resulting alignment is called a high-scoring segment pair, or HSP. Given that the BLAST algorithm systematically marches


Скачать книгу