Bioinformatics. Group of authors

information about the genomic assembly in this region of rat chromosome 5 (specifically, at 5q31) can be obtained (cf. Chapter 4).

Snapshot depicts the results of the first round of a PSI-BLAST search.

Snapshot depicts the results of the second round of a PSI-BLAST search, in which new sequences are identified through the use of the position-specific scoring matrix.

Snapshot depicts the submission of a BLAT query.

Snapshot depicts the results of a BLAT query in which the highest-scoring hit is to a sequence on chromosome 5 of the rat genome with 98.1% sequence identity.

Program         Query        Database         Corresponding BLAST Program
FASTA           Nucleotide   Nucleotide       BLASTN
FASTA           Protein      Protein          BLASTP
FASTX/FASTY     DNA          Protein          BLASTX
TFASTX/TFASTY   Protein      Translated DNA   TBLASTN

      The Method

      In step 2, only the 10 best regions for a given pairwise alignment are considered for further analysis (Figure 3.20b). FASTA now tries to join together regions of similarity that are close to each other in the dotplot but that do not lie on the same diagonal, with the goal of extending the overall length of the alignment (Figure 3.20c). This means that insertions and deletions are now allowed, but a joining penalty is charged each time two diagonals are connected. The net score for any two diagonals that have been connected is the sum of the scores of the original diagonals, less the joining penalty. This new score is referred to as initn.
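
      The joining step can be illustrated with a minimal Python sketch. It is not FASTA's actual implementation: the Region type, the initn_score function, the keep parameter, and the fixed join_penalty are hypothetical names chosen for illustration, and the sketch ignores how close two regions must be before they may be joined. It keeps the best-scoring regions, chains regions that follow one another in both sequences, and charges the penalty once per join.

```python
from typing import NamedTuple, Sequence


class Region(NamedTuple):
    """An ungapped region of similarity lying on one diagonal (hypothetical type)."""
    q_start: int  # first query position covered
    q_end: int    # last query position covered (inclusive)
    t_start: int  # first target position covered
    t_end: int    # last target position covered (inclusive)
    score: int    # score of the region on its own


def initn_score(regions: Sequence[Region], join_penalty: int, keep: int = 10) -> int:
    """Best chain score: sum of region scores minus one penalty per join."""
    # Keep only the best-scoring regions, as step 2 describes.
    top = sorted(regions, key=lambda r: r.score, reverse=True)[:keep]
    # Order by position so every possible predecessor is visited first.
    ordered = sorted(top, key=lambda r: (r.q_start, r.t_start))
    best = []  # best[j] = best chain score ending with ordered[j]
    for j, rj in enumerate(ordered):
        best_j = rj.score
        for i in range(j):
            ri = ordered[i]
            # A join is allowed only if ri ends before rj starts in both sequences.
            if ri.q_end < rj.q_start and ri.t_end < rj.t_start:
                best_j = max(best_j, best[i] + rj.score - join_penalty)
        best.append(best_j)
    return max(best, default=0)


example = [
    Region(0, 9, 0, 9, 40),
    Region(15, 24, 18, 27, 30),
    Region(5, 12, 30, 37, 25),
]
print(initn_score(example, join_penalty=20))
```

      In this example the first two regions lie on different diagonals but are compatible in order, so joining them gives 40 + 30 - 20; the third region overlaps the first in the query and cannot extend that chain.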

      In step 3, FASTA ranks all of the resulting diagonals, and then further considers only the “best” diagonals in the list. For each of the best diagonals, FASTA uses a modification of the Smith–Waterman algorithm (1981) to come up with the optimal pairwise alignment between the two sequences being considered. A final, optimal score (opt) is calculated on this pairwise alignment.

Schematic illustrations of the FASTA search strategy. (a) Once FASTA determines words of length ktup common to the query sequence and the target sequence, it connects words that are close to each other, and these are represented by the diagonals. (b) After an initial round of scoring, the top ten diagonals are selected for further analysis. (c) The Smith-Waterman algorithm is applied to yield the optimal pairwise alignment
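
      Panel (a) depends on the observation that shared words of length ktup from the same ungapped region fall on the same diagonal of the dotplot. The Python sketch below shows that bookkeeping under simplifying assumptions: diagonal_hits is a hypothetical helper, a diagonal is identified by target offset minus query offset, and words are simply counted rather than scored or connected by proximity.

```python
from collections import defaultdict


def diagonal_hits(query: str, target: str, ktup: int = 2) -> dict:
    """Count words of length ktup shared by query and target, grouped by diagonal."""
    # Lookup table: each query word maps to the offsets where it occurs.
    positions = defaultdict(list)
    for i in range(len(query) - ktup + 1):
        positions[query[i:i + ktup]].append(i)

    hits = defaultdict(int)
    for j in range(len(target) - ktup + 1):
        for i in positions.get(target[j:j + ktup], ()):
            hits[j - i] += 1  # same offset difference means same diagonal
    return dict(hits)


print(diagonal_hits("ACGT", "TTACGT", ktup=2))
```

      Here the three query words AC, CG, and GT all reappear in the target shifted by two positions, so they accumulate on a single diagonal.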